relational database based: Topics by Science.gov

Sample records for relational database based

Examining database persistence of ISO/EN 13606 standardized electronic health record extracts: relational vs. NoSQL approaches.

PubMed

Sánchez-de-Madariaga, Ricardo; Muñoz, Adolfo; Lozano-Rubí, Raimundo; Serrano-Balazote, Pablo; Castro, Antonio L; Moreno, Oscar; Pascual, Mario

2017-08-18

The objective of this research is to compare the relational and non-relational (NoSQL) database systems approaches in order to store, recover, query and persist standardized medical information in the form of ISO/EN 13606 normalized Electronic Health Record XML extracts, both in isolation and concurrently. NoSQL database systems have recently attracted much attention, but few studies in the literature address their direct comparison with relational databases when applied to build the persistence layer of a standardized medical information system. One relational and two NoSQL databases (one document-based and one native XML database) of three different sizes have been created in order to evaluate and compare the response times (algorithmic complexity) of six different complexity growing queries, which have been performed on them. Similar appropriate results available in the literature have also been considered. Relational and non-relational NoSQL database systems show almost linear algorithmic complexity query execution. However, they show very different linear slopes, the former being much steeper than the two latter. Document-based NoSQL databases perform better in concurrency than in isolation, and also better than relational databases in concurrency. Non-relational NoSQL databases seem to be more appropriate than standard relational SQL databases when database size is extremely high (secondary use, research applications). Document-based NoSQL databases perform in general better than native XML NoSQL databases. EHR extracts visualization and edition are also document-based tasks more appropriate to NoSQL database systems. However, the appropriate database solution much depends on each particular situation and specific problem.
Automating Relational Database Design for Microcomputer Users.

ERIC Educational Resources Information Center

Pu, Hao-Che

1991-01-01

Discusses issues involved in automating the relational database design process for microcomputer users and presents a prototype of a microcomputer-based system (RA, Relation Assistant) that is based on expert systems technology and helps avoid database maintenance problems. Relational database design is explained and the importance of easy input…
Performance assessment of EMR systems based on post-relational database.

PubMed

Yu, Hai-Yan; Li, Jing-Song; Zhang, Xiao-Guang; Tian, Yu; Suzuki, Muneou; Araki, Kenji

2012-08-01

Post-relational databases provide high performance and are currently widely used in American hospitals. As few hospital information systems (HIS) in either China or Japan are based on post-relational databases, here we introduce a new-generation electronic medical records (EMR) system called Hygeia, which was developed with the post-relational database Caché and the latest platform Ensemble. Utilizing the benefits of a post-relational database, Hygeia is equipped with an "integration" feature that allows all the system users to access data-with a fast response time-anywhere and at anytime. Performance tests of databases in EMR systems were implemented in both China and Japan. First, a comparison test was conducted between a post-relational database, Caché, and a relational database, Oracle, embedded in the EMR systems of a medium-sized first-class hospital in China. Second, a user terminal test was done on the EMR system Izanami, which is based on the identical database Caché and operates efficiently at the Miyazaki University Hospital in Japan. The results proved that the post-relational database Caché works faster than the relational database Oracle and showed perfect performance in the real-time EMR system.
Short Fiction on Film: A Relational DataBase.

ERIC Educational Resources Information Center

May, Charles

Short Fiction on Film is a database that was created and will run on DataRelator, a relational database manager created by Bill Finzer for the California State Department of Education in 1986. DataRelator was designed for use in teaching students database management skills and to provide teachers with examples of how a database manager might be…
A searching and reporting system for relational databases using a graph-based metadata representation.

PubMed

Hewitt, Robin; Gobbi, Alberto; Lee, Man-Ling

2005-01-01

Relational databases are the current standard for storing and retrieving data in the pharmaceutical and biotech industries. However, retrieving data from a relational database requires specialized knowledge of the database schema and of the SQL query language. At Anadys, we have developed an easy-to-use system for searching and reporting data in a relational database to support our drug discovery project teams. This system is fast and flexible and allows users to access all data without having to write SQL queries. This paper presents the hierarchical, graph-based metadata representation and SQL-construction methods that, together, are the basis of this system's capabilities.
Constructing a Geology Ontology Using a Relational Database

NASA Astrophysics Data System (ADS)

Hou, W.; Yang, L.; Yin, S.; Ye, J.; Clarke, K.

2013-12-01

In geology community, the creation of a common geology ontology has become a useful means to solve problems of data integration, knowledge transformation and the interoperation of multi-source, heterogeneous and multiple scale geological data. Currently, human-computer interaction methods and relational database-based methods are the primary ontology construction methods. Some human-computer interaction methods such as the Geo-rule based method, the ontology life cycle method and the module design method have been proposed for applied geological ontologies. Essentially, the relational database-based method is a reverse engineering of abstracted semantic information from an existing database. The key is to construct rules for the transformation of database entities into the ontology. Relative to the human-computer interaction method, relational database-based methods can use existing resources and the stated semantic relationships among geological entities. However, two problems challenge the development and application. One is the transformation of multiple inheritances and nested relationships and their representation in an ontology. The other is that most of these methods do not measure the semantic retention of the transformation process. In this study, we focused on constructing a rule set to convert the semantics in a geological database into a geological ontology. According to the relational schema of a geological database, a conversion approach is presented to convert a geological spatial database to an OWL-based geological ontology, which is based on identifying semantics such as entities, relationships, inheritance relationships, nested relationships and cluster relationships. The semantic integrity of the transformation was verified using an inverse mapping process. In a geological ontology, an inheritance and union operations between superclass and subclass were used to present the nested relationship in a geochronology and the multiple inheritances relationship. Based on a Quaternary database of downtown of Foshan city, Guangdong Province, in Southern China, a geological ontology was constructed using the proposed method. To measure the maintenance of semantics in the conversation process and the results, an inverse mapping from the ontology to a relational database was tested based on a proposed conversation rule. The comparison of schema and entities and the reduction of tables between the inverse database and the original database illustrated that the proposed method retains the semantic information well during the conversation process. An application for abstracting sandstone information showed that semantic relationships among concepts in the geological database were successfully reorganized in the constructed ontology. Key words: geological ontology; geological spatial database; multiple inheritance; OWL Acknowledgement: This research is jointly funded by the Specialized Research Fund for the Doctoral Program of Higher Education of China (RFDP) (20100171120001), NSFC (41102207) and the Fundamental Research Funds for the Central Universities (12lgpy19).
Comparing the Performance of NoSQL Approaches for Managing Archetype-Based Electronic Health Record Data

PubMed Central

Freire, Sergio Miranda; Teodoro, Douglas; Wei-Kleiner, Fang; Sundvall, Erik; Karlsson, Daniel; Lambrix, Patrick

2016-01-01

This study provides an experimental performance evaluation on population-based queries of NoSQL databases storing archetype-based Electronic Health Record (EHR) data. There are few published studies regarding the performance of persistence mechanisms for systems that use multilevel modelling approaches, especially when the focus is on population-based queries. A healthcare dataset with 4.2 million records stored in a relational database (MySQL) was used to generate XML and JSON documents based on the openEHR reference model. Six datasets with different sizes were created from these documents and imported into three single machine XML databases (BaseX, eXistdb and Berkeley DB XML) and into a distributed NoSQL database system based on the MapReduce approach, Couchbase, deployed in different cluster configurations of 1, 2, 4, 8 and 12 machines. Population-based queries were submitted to those databases and to the original relational database. Database size and query response times are presented. The XML databases were considerably slower and required much more space than Couchbase. Overall, Couchbase had better response times than MySQL, especially for larger datasets. However, Couchbase requires indexing for each differently formulated query and the indexing time increases with the size of the datasets. The performances of the clusters with 2, 4, 8 and 12 nodes were not better than the single node cluster in relation to the query response time, but the indexing time was reduced proportionally to the number of nodes. The tested XML databases had acceptable performance for openEHR-based data in some querying use cases and small datasets, but were generally much slower than Couchbase. Couchbase also outperformed the response times of the relational database, but required more disk space and had a much longer indexing time. Systems like Couchbase are thus interesting research targets for scalable storage and querying of archetype-based EHR data when population-based use cases are of interest. PMID:26958859
Comparing the Performance of NoSQL Approaches for Managing Archetype-Based Electronic Health Record Data.

PubMed

Freire, Sergio Miranda; Teodoro, Douglas; Wei-Kleiner, Fang; Sundvall, Erik; Karlsson, Daniel; Lambrix, Patrick

2016-01-01

This study provides an experimental performance evaluation on population-based queries of NoSQL databases storing archetype-based Electronic Health Record (EHR) data. There are few published studies regarding the performance of persistence mechanisms for systems that use multilevel modelling approaches, especially when the focus is on population-based queries. A healthcare dataset with 4.2 million records stored in a relational database (MySQL) was used to generate XML and JSON documents based on the openEHR reference model. Six datasets with different sizes were created from these documents and imported into three single machine XML databases (BaseX, eXistdb and Berkeley DB XML) and into a distributed NoSQL database system based on the MapReduce approach, Couchbase, deployed in different cluster configurations of 1, 2, 4, 8 and 12 machines. Population-based queries were submitted to those databases and to the original relational database. Database size and query response times are presented. The XML databases were considerably slower and required much more space than Couchbase. Overall, Couchbase had better response times than MySQL, especially for larger datasets. However, Couchbase requires indexing for each differently formulated query and the indexing time increases with the size of the datasets. The performances of the clusters with 2, 4, 8 and 12 nodes were not better than the single node cluster in relation to the query response time, but the indexing time was reduced proportionally to the number of nodes. The tested XML databases had acceptable performance for openEHR-based data in some querying use cases and small datasets, but were generally much slower than Couchbase. Couchbase also outperformed the response times of the relational database, but required more disk space and had a much longer indexing time. Systems like Couchbase are thus interesting research targets for scalable storage and querying of archetype-based EHR data when population-based use cases are of interest.
Abstraction of the Relational Model from a Department of Veterans Affairs DHCP Database: Bridging Theory and Working Application

PubMed Central

Levy, C.; Beauchamp, C.

1996-01-01

This poster describes the methods used and working prototype that was developed from an abstraction of the relational model from the VA's hierarchical DHCP database. Overlaying the relational model on DHCP permits multiple user views of the physical data structure, enhances access to the database by providing a link to commercial (SQL based) software, and supports a conceptual managed care data model based on primary and longitudinal patient care. The goal of this work was to create a relational abstraction of the existing hierarchical database; to construct, using SQL data definition language, user views of the database which reflect the clinical conceptual view of DHCP, and to allow the user to work directly with the logical view of the data using GUI based commercial software of their choosing. The workstation is intended to serve as a platform from which a managed care information model could be implemented and evaluated.
EasyKSORD: A Platform of Keyword Search Over Relational Databases

NASA Astrophysics Data System (ADS)

Peng, Zhaohui; Li, Jing; Wang, Shan

Keyword Search Over Relational Databases (KSORD) enables casual users to use keyword queries (a set of keywords) to search relational databases just like searching the Web, without any knowledge of the database schema or any need of writing SQL queries. Based on our previous work, we design and implement a novel KSORD platform named EasyKSORD for users and system administrators to use and manage different KSORD systems in a novel and simple manner. EasyKSORD supports advanced queries, efficient data-graph-based search engines, multiform result presentations, and system logging and analysis. Through EasyKSORD, users can search relational databases easily and read search results conveniently, and system administrators can easily monitor and analyze the operations of KSORD and manage KSORD systems much better.
Using an image-extended relational database to support content-based image retrieval in a PACS.

PubMed

Traina, Caetano; Traina, Agma J M; Araújo, Myrian R B; Bueno, Josiane M; Chino, Fabio J T; Razente, Humberto; Azevedo-Marques, Paulo M

2005-12-01

This paper presents a new Picture Archiving and Communication System (PACS), called cbPACS, which has content-based image retrieval capabilities. The cbPACS answers range and k-nearest- neighbor similarity queries, employing a relational database manager extended to support images. The images are compared through their features, which are extracted by an image-processing module and stored in the extended relational database. The database extensions were developed aiming at efficiently answering similarity queries by taking advantage of specialized indexing methods. The main concept supporting the extensions is the definition, inside the relational manager, of distance functions based on features extracted from the images. An extension to the SQL language enables the construction of an interpreter that intercepts the extended commands and translates them to standard SQL, allowing any relational database server to be used. By now, the system implemented works on features based on color distribution of the images through normalized histograms as well as metric histograms. Metric histograms are invariant regarding scale, translation and rotation of images and also to brightness transformations. The cbPACS is prepared to integrate new image features, based on texture and shape of the main objects in the image.
Repetitive Bibliographical Information in Relational Databases.

ERIC Educational Resources Information Center

Brooks, Terrence A.

1988-01-01

Proposes a solution to the problem of loading repetitive bibliographic information in a microcomputer-based relational database management system. The alternative design described is based on a representational redundancy design and normalization theory. (12 references) (Author/CLB)
Why Save Your Course as a Relational Database?

ERIC Educational Resources Information Center

Hamilton, Gregory C.; Katz, David L.; Davis, James E.

2000-01-01

Describes a system that stores course materials for computer-based training programs in a relational database called Of Course! Outlines the basic structure of the databases; explains distinctions between Of Course! and other authoring languages; and describes how data is retrieved from the database and presented to the student. (Author/LRW)
“NaKnowBase”: A Nanomaterials Relational Database

EPA Science Inventory

NaKnowBase is an internal relational database populated with data from peer-reviewed ORD nanomaterials research publications. The database focuses on papers describing the actions of nanomaterials in environmental or biological media including their interactions, transformations...
A protein relational database and protein family knowledge bases to facilitate structure-based design analyses.

PubMed

Mobilio, Dominick; Walker, Gary; Brooijmans, Natasja; Nilakantan, Ramaswamy; Denny, R Aldrin; Dejoannis, Jason; Feyfant, Eric; Kowticwar, Rupesh K; Mankala, Jyoti; Palli, Satish; Punyamantula, Sairam; Tatipally, Maneesh; John, Reji K; Humblet, Christine

2010-08-01

The Protein Data Bank is the most comprehensive source of experimental macromolecular structures. It can, however, be difficult at times to locate relevant structures with the Protein Data Bank search interface. This is particularly true when searching for complexes containing specific interactions between protein and ligand atoms. Moreover, searching within a family of proteins can be tedious. For example, one cannot search for some conserved residue as residue numbers vary across structures. We describe herein three databases, Protein Relational Database, Kinase Knowledge Base, and Matrix Metalloproteinase Knowledge Base, containing protein structures from the Protein Data Bank. In Protein Relational Database, atom-atom distances between protein and ligand have been precalculated allowing for millisecond retrieval based on atom identity and distance constraints. Ring centroids, centroid-centroid and centroid-atom distances and angles have also been included permitting queries for pi-stacking interactions and other structural motifs involving rings. Other geometric features can be searched through the inclusion of residue pair and triplet distances. In Kinase Knowledge Base and Matrix Metalloproteinase Knowledge Base, the catalytic domains have been aligned into common residue numbering schemes. Thus, by searching across Protein Relational Database and Kinase Knowledge Base, one can easily retrieve structures wherein, for example, a ligand of interest is making contact with the gatekeeper residue.
“NaKnowBase”: A Nanomaterials Relational Database

EPA Science Inventory

NaKnowBase is a relational database populated with data from peer-reviewed ORD nanomaterials research publications. The database focuses on papers describing the actions of nanomaterials in environmental or biological media including their interactions, transformations and poten...
The Xeno-glycomics database (XDB): a relational database of qualitative and quantitative pig glycome repertoire.

PubMed

Park, Hae-Min; Park, Ju-Hyeong; Kim, Yoon-Woo; Kim, Kyoung-Jin; Jeong, Hee-Jin; Jang, Kyoung-Soon; Kim, Byung-Gee; Kim, Yun-Gon

2013-11-15

In recent years, the improvement of mass spectrometry-based glycomics techniques (i.e. highly sensitive, quantitative and high-throughput analytical tools) has enabled us to obtain a large dataset of glycans. Here we present a database named Xeno-glycomics database (XDB) that contains cell- or tissue-specific pig glycomes analyzed with mass spectrometry-based techniques, including a comprehensive pig glycan information on chemical structures, mass values, types and relative quantities. It was designed as a user-friendly web-based interface that allows users to query the database according to pig tissue/cell types or glycan masses. This database will contribute in providing qualitative and quantitative information on glycomes characterized from various pig cells/organs in xenotransplantation and might eventually provide new targets in the α1,3-galactosyltransferase gene-knock out pigs era. The database can be accessed on the web at http://bioinformatics.snu.ac.kr/xdb.
49 CFR 1572.107 - Other analyses.

Code of Federal Regulations, 2011 CFR

2011-10-01

... applicant poses a security threat based on a search of the following databases: (1) Interpol and other international databases, as appropriate. (2) Terrorist watchlists and related databases. (3) Any other databases...
49 CFR 1572.107 - Other analyses.

Code of Federal Regulations, 2010 CFR

2010-10-01

... applicant poses a security threat based on a search of the following databases: (1) Interpol and other international databases, as appropriate. (2) Terrorist watchlists and related databases. (3) Any other databases...
49 CFR 1572.107 - Other analyses.

Code of Federal Regulations, 2014 CFR

2014-10-01

... applicant poses a security threat based on a search of the following databases: (1) Interpol and other international databases, as appropriate. (2) Terrorist watchlists and related databases. (3) Any other databases...

49 CFR 1572.107 - Other analyses.

Code of Federal Regulations, 2012 CFR

2012-10-01

... applicant poses a security threat based on a search of the following databases: (1) Interpol and other international databases, as appropriate. (2) Terrorist watchlists and related databases. (3) Any other databases...
49 CFR 1572.107 - Other analyses.

Code of Federal Regulations, 2013 CFR

2013-10-01

... applicant poses a security threat based on a search of the following databases: (1) Interpol and other international databases, as appropriate. (2) Terrorist watchlists and related databases. (3) Any other databases...
Archetype relational mapping - a practical openEHR persistence solution.

PubMed

Wang, Li; Min, Lingtong; Wang, Rui; Lu, Xudong; Duan, Huilong

2015-11-05

One of the primary obstacles to the widespread adoption of openEHR methodology is the lack of practical persistence solutions for future-proof electronic health record (EHR) systems as described by the openEHR specifications. This paper presents an archetype relational mapping (ARM) persistence solution for the archetype-based EHR systems to support healthcare delivery in the clinical environment. First, the data requirements of the EHR systems are analysed and organized into archetype-friendly concepts. The Clinical Knowledge Manager (CKM) is queried for matching archetypes; when necessary, new archetypes are developed to reflect concepts that are not encompassed by existing archetypes. Next, a template is designed for each archetype to apply constraints related to the local EHR context. Finally, a set of rules is designed to map the archetypes to data tables and provide data persistence based on the relational database. A comparison study was conducted to investigate the differences among the conventional database of an EHR system from a tertiary Class A hospital in China, the generated ARM database, and the Node + Path database. Five data-retrieving tests were designed based on clinical workflow to retrieve exams and laboratory tests. Additionally, two patient-searching tests were designed to identify patients who satisfy certain criteria. The ARM database achieved better performance than the conventional database in three of the five data-retrieving tests, but was less efficient in the remaining two tests. The time difference of query executions conducted by the ARM database and the conventional database is less than 130 %. The ARM database was approximately 6-50 times more efficient than the conventional database in the patient-searching tests, while the Node + Path database requires far more time than the other two databases to execute both the data-retrieving and the patient-searching tests. The ARM approach is capable of generating relational databases using archetypes and templates for archetype-based EHR systems, thus successfully adapting to changes in data requirements. ARM performance is similar to that of conventionally-designed EHR systems, and can be applied in a practical clinical environment. System components such as ARM can greatly facilitate the adoption of openEHR architecture within EHR systems.
A Quantitative Analysis of the Extrinsic and Intrinsic Turnover Factors of Relational Database Support Professionals

ERIC Educational Resources Information Center

Takusi, Gabriel Samuto

2010-01-01

This quantitative analysis explored the intrinsic and extrinsic turnover factors of relational database support specialists. Two hundred and nine relational database support specialists were surveyed for this research. The research was conducted based on Hackman and Oldham's (1980) Job Diagnostic Survey. Regression analysis and a univariate ANOVA…
MicroUse: The Database on Microcomputer Applications in Libraries and Information Centers.

ERIC Educational Resources Information Center

Chen, Ching-chih; Wang, Xiaochu

1984-01-01

Describes MicroUse, a microcomputer-based database on microcomputer applications in libraries and information centers which was developed using relational database manager dBASE II. The description includes its system configuration, software utilized, the in-house-developed dBASE programs, multifile structure, basic functions, MicroUse records,…
A dynamic clinical dental relational database.

PubMed

Taylor, D; Naguib, R N G; Boulton, S

2004-09-01

The traditional approach to relational database design is based on the logical organization of data into a number of related normalized tables. One assumption is that the nature and structure of the data is known at the design stage. In the case of designing a relational database to store historical dental epidemiological data from individual clinical surveys, the structure of the data is not known until the data is presented for inclusion into the database. This paper addresses the issues concerned with the theoretical design of a clinical dynamic database capable of adapting the internal table structure to accommodate clinical survey data, and presents a prototype database application capable of processing, displaying, and querying the dental data.
ZeBase: an open-source relational database for zebrafish laboratories.

PubMed

Hensley, Monica R; Hassenplug, Eric; McPhail, Rodney; Leung, Yuk Fai

2012-03-01

Abstract ZeBase is an open-source relational database for zebrafish inventory. It is designed for the recording of genetic, breeding, and survival information of fish lines maintained in a single- or multi-laboratory environment. Users can easily access ZeBase through standard web-browsers anywhere on a network. Convenient search and reporting functions are available to facilitate routine inventory work; such functions can also be automated by simple scripting. Optional barcode generation and scanning are also built-in for easy access to the information related to any fish. Further information of the database and an example implementation can be found at http://zebase.bio.purdue.edu.
Database constraints applied to metabolic pathway reconstruction tools.

PubMed

Vilaplana, Jordi; Solsona, Francesc; Teixido, Ivan; Usié, Anabel; Karathia, Hiren; Alves, Rui; Mateo, Jordi

2014-01-01

Our group developed two biological applications, Biblio-MetReS and Homol-MetReS, accessing the same database of organisms with annotated genes. Biblio-MetReS is a data-mining application that facilitates the reconstruction of molecular networks based on automated text-mining analysis of published scientific literature. Homol-MetReS allows functional (re)annotation of proteomes, to properly identify both the individual proteins involved in the process(es) of interest and their function. It also enables the sets of proteins involved in the process(es) in different organisms to be compared directly. The efficiency of these biological applications is directly related to the design of the shared database. We classified and analyzed the different kinds of access to the database. Based on this study, we tried to adjust and tune the configurable parameters of the database server to reach the best performance of the communication data link to/from the database system. Different database technologies were analyzed. We started the study with a public relational SQL database, MySQL. Then, the same database was implemented by a MapReduce-based database named HBase. The results indicated that the standard configuration of MySQL gives an acceptable performance for low or medium size databases. Nevertheless, tuning database parameters can greatly improve the performance and lead to very competitive runtimes.
Integration of an Evidence Base into a Probabilistic Risk Assessment Model. The Integrated Medical Model Database: An Organized Evidence Base for Assessing In-Flight Crew Health Risk and System Design

NASA Technical Reports Server (NTRS)

Saile, Lynn; Lopez, Vilma; Bickham, Grandin; FreiredeCarvalho, Mary; Kerstman, Eric; Byrne, Vicky; Butler, Douglas; Myers, Jerry; Walton, Marlei

2011-01-01

This slide presentation reviews the Integrated Medical Model (IMM) database, which is an organized evidence base for assessing in-flight crew health risk. The database is a relational database accessible to many people. The database quantifies the model inputs by a ranking based on the highest value of the data as Level of Evidence (LOE) and the quality of evidence (QOE) score that provides an assessment of the evidence base for each medical condition. The IMM evidence base has already been able to provide invaluable information for designers, and for other uses.
Identifying work-related motor vehicle crashes in multiple databases.

PubMed

Thomas, Andrea M; Thygerson, Steven M; Merrill, Ray M; Cook, Lawrence J

2012-01-01

To compare and estimate the magnitude of work-related motor vehicle crashes in Utah using 2 probabilistically linked statewide databases. Data from 2006 and 2007 motor vehicle crash and hospital databases were joined through probabilistic linkage. Summary statistics and capture-recapture were used to describe occupants injured in work-related motor vehicle crashes and estimate the size of this population. There were 1597 occupants in the motor vehicle crash database and 1673 patients in the hospital database identified as being in a work-related motor vehicle crash. We identified 1443 occupants with at least one record from either the motor vehicle crash or hospital database indicating work-relatedness that linked to any record in the opposing database. We found that 38.7 percent of occupants injured in work-related motor vehicle crashes identified in the motor vehicle crash database did not have a primary payer code of workers' compensation in the hospital database and 40.0 percent of patients injured in work-related motor vehicle crashes identified in the hospital database did not meet our definition of a work-related motor vehicle crash in the motor vehicle crash database. Depending on how occupants injured in work-related motor crashes are identified, we estimate the population to be between 1852 and 8492 in Utah for the years 2006 and 2007. Research on single databases may lead to biased interpretations of work-related motor vehicle crashes. Combining 2 population based databases may still result in an underestimate of the magnitude of work-related motor vehicle crashes. Improved coding of work-related incidents is needed in current databases.
Definitions of database files and fields of the Personal Computer-Based Water Data Sources Directory

USGS Publications Warehouse

Green, J. Wayne

1991-01-01

This report describes the data-base files and fields of the personal computer-based Water Data Sources Directory (WDSD). The personal computer-based WDSD was derived from the U.S. Geological Survey (USGS) mainframe computer version. The mainframe version of the WDSD is a hierarchical data-base design. The personal computer-based WDSD is a relational data- base design. This report describes the data-base files and fields of the relational data-base design in dBASE IV (the use of brand names in this abstract is for identification purposes only and does not constitute endorsement by the U.S. Geological Survey) for the personal computer. The WDSD contains information on (1) the type of organization, (2) the major orientation of water-data activities conducted by each organization, (3) the names, addresses, and telephone numbers of offices within each organization from which water data may be obtained, (4) the types of data held by each organization and the geographic locations within which these data have been collected, (5) alternative sources of an organization's data, (6) the designation of liaison personnel in matters related to water-data acquisition and indexing, (7) the volume of water data indexed for the organization, and (8) information about other types of data and services available from the organization that are pertinent to water-resources activities.
Rapid storage and retrieval of genomic intervals from a relational database system using nested containment lists

PubMed Central

Wiley, Laura K.; Sivley, R. Michael; Bush, William S.

2013-01-01

Efficient storage and retrieval of genomic annotations based on range intervals is necessary, given the amount of data produced by next-generation sequencing studies. The indexing strategies of relational database systems (such as MySQL) greatly inhibit their use in genomic annotation tasks. This has led to the development of stand-alone applications that are dependent on flat-file libraries. In this work, we introduce MyNCList, an implementation of the NCList data structure within a MySQL database. MyNCList enables the storage, update and rapid retrieval of genomic annotations from the convenience of a relational database system. Range-based annotations of 1 million variants are retrieved in under a minute, making this approach feasible for whole-genome annotation tasks. Database URL: https://github.com/bushlab/mynclist PMID:23894185
Rapid storage and retrieval of genomic intervals from a relational database system using nested containment lists.

PubMed

Wiley, Laura K; Sivley, R Michael; Bush, William S

2013-01-01

Efficient storage and retrieval of genomic annotations based on range intervals is necessary, given the amount of data produced by next-generation sequencing studies. The indexing strategies of relational database systems (such as MySQL) greatly inhibit their use in genomic annotation tasks. This has led to the development of stand-alone applications that are dependent on flat-file libraries. In this work, we introduce MyNCList, an implementation of the NCList data structure within a MySQL database. MyNCList enables the storage, update and rapid retrieval of genomic annotations from the convenience of a relational database system. Range-based annotations of 1 million variants are retrieved in under a minute, making this approach feasible for whole-genome annotation tasks. Database URL: https://github.com/bushlab/mynclist.
Assessing animal welfare in sow herds using data on meat inspection, medication and mortality.

PubMed

Knage-Rasmussen, K M; Rousing, T; Sørensen, J T; Houe, H

2015-03-01

This paper aims to contribute to the development of a cost-effective alternative to expensive on-farm animal-based welfare assessment systems. The objective of the study was to design an animal welfare index based on central database information (DBWI), and to validate it against an animal welfare index based on-farm animal-based measurements (AWI). Data on 63 Danish sow herds with herd-sizes of 80 to 2500 sows and an average herd size of 501 were collected from three central databases containing: Meat inspection data collected at animal level in the abattoir, mortality data at herd level from the rendering plants of DAKA, and medicine records at both herd and animal group level (sow with piglets, weaners or finishers) from the central database Vetstat. Selected measurements taken from these central databases were used to construct the DBWI. The relative welfare impacts of both individual database measurements and the databases overall were assigned in consultation with a panel consisting of 12 experts. The experts were drawn from production advisory activities, animal science and in one case an animal welfare organization. The expert panel weighted each measurement on a scale from 1 (not-important) to 5 (very important). The experts also gave opinions on the relative weightings of measurements for each of the three databases by stating a relative weight of each database in the DBWI. On the basis of this, the aggregated DBWI was normalized. The aggregation of AWI was based on weighted summary of herd prevalence's of 20 clinical and behavioural measurements originating from a 1 day data collection. AWI did not show linear dependency of DBWI. This suggests that DBWI is not suited to replace an animal welfare index using on-farm animal-based measurements.
Search extension transforms Wiki into a relational system: a case for flavonoid metabolite database.

PubMed

Arita, Masanori; Suwa, Kazuhiro

2008-09-17

In computer science, database systems are based on the relational model founded by Edgar Codd in 1970. On the other hand, in the area of biology the word 'database' often refers to loosely formatted, very large text files. Although such bio-databases may describe conflicts or ambiguities (e.g. a protein pair do and do not interact, or unknown parameters) in a positive sense, the flexibility of the data format sacrifices a systematic query mechanism equivalent to the widely used SQL. To overcome this disadvantage, we propose embeddable string-search commands on a Wiki-based system and designed a half-formatted database. As proof of principle, a database of flavonoid with 6902 molecular structures from over 1687 plant species was implemented on MediaWiki, the background system of Wikipedia. Registered users can describe any information in an arbitrary format. Structured part is subject to text-string searches to realize relational operations. The system was written in PHP language as the extension of MediaWiki. All modifications are open-source and publicly available. This scheme benefits from both the free-formatted Wiki style and the concise and structured relational-database style. MediaWiki supports multi-user environments for document management, and the cost for database maintenance is alleviated.
Search extension transforms Wiki into a relational system: A case for flavonoid metabolite database

PubMed Central

Arita, Masanori; Suwa, Kazuhiro

2008-01-01

Background In computer science, database systems are based on the relational model founded by Edgar Codd in 1970. On the other hand, in the area of biology the word 'database' often refers to loosely formatted, very large text files. Although such bio-databases may describe conflicts or ambiguities (e.g. a protein pair do and do not interact, or unknown parameters) in a positive sense, the flexibility of the data format sacrifices a systematic query mechanism equivalent to the widely used SQL. Results To overcome this disadvantage, we propose embeddable string-search commands on a Wiki-based system and designed a half-formatted database. As proof of principle, a database of flavonoid with 6902 molecular structures from over 1687 plant species was implemented on MediaWiki, the background system of Wikipedia. Registered users can describe any information in an arbitrary format. Structured part is subject to text-string searches to realize relational operations. The system was written in PHP language as the extension of MediaWiki. All modifications are open-source and publicly available. Conclusion This scheme benefits from both the free-formatted Wiki style and the concise and structured relational-database style. MediaWiki supports multi-user environments for document management, and the cost for database maintenance is alleviated. PMID:18822113
Resources | Division of Cancer Prevention

Cancer.gov

Manual of Operations Version 3, 12/13/2012 (PDF, 162KB) Database Sources Consortium for Functional Glycomics databases Design Studies Related to the Development of Distributed, Web-based European Carbohydrate Databases (EUROCarbDB) |
Dynamic tables: an architecture for managing evolving, heterogeneous biomedical data in relational database management systems.

PubMed

Corwin, John; Silberschatz, Avi; Miller, Perry L; Marenco, Luis

2007-01-01

Data sparsity and schema evolution issues affecting clinical informatics and bioinformatics communities have led to the adoption of vertical or object-attribute-value-based database schemas to overcome limitations posed when using conventional relational database technology. This paper explores these issues and discusses why biomedical data are difficult to model using conventional relational techniques. The authors propose a solution to these obstacles based on a relational database engine using a sparse, column-store architecture. The authors provide benchmarks comparing the performance of queries and schema-modification operations using three different strategies: (1) the standard conventional relational design; (2) past approaches used by biomedical informatics researchers; and (3) their sparse, column-store architecture. The performance results show that their architecture is a promising technique for storing and processing many types of data that are not handled well by the other two semantic data models.
Relational Data Bases--Are You Ready?

ERIC Educational Resources Information Center

Marshall, Dorothy M.

1989-01-01

Migrating from a traditional to a relational database technology requires more than traditional project management techniques. An overview of what to consider before migrating to relational database technology is presented. Leadership, staffing, vendor support, hardware, software, and application development are discussed. (MLW)
Database Constraints Applied to Metabolic Pathway Reconstruction Tools

PubMed Central

Vilaplana, Jordi; Solsona, Francesc; Teixido, Ivan; Usié, Anabel; Karathia, Hiren; Alves, Rui; Mateo, Jordi

2014-01-01

Our group developed two biological applications, Biblio-MetReS and Homol-MetReS, accessing the same database of organisms with annotated genes. Biblio-MetReS is a data-mining application that facilitates the reconstruction of molecular networks based on automated text-mining analysis of published scientific literature. Homol-MetReS allows functional (re)annotation of proteomes, to properly identify both the individual proteins involved in the process(es) of interest and their function. It also enables the sets of proteins involved in the process(es) in different organisms to be compared directly. The efficiency of these biological applications is directly related to the design of the shared database. We classified and analyzed the different kinds of access to the database. Based on this study, we tried to adjust and tune the configurable parameters of the database server to reach the best performance of the communication data link to/from the database system. Different database technologies were analyzed. We started the study with a public relational SQL database, MySQL. Then, the same database was implemented by a MapReduce-based database named HBase. The results indicated that the standard configuration of MySQL gives an acceptable performance for low or medium size databases. Nevertheless, tuning database parameters can greatly improve the performance and lead to very competitive runtimes. PMID:25202745

Constructing a Graph Database for Semantic Literature-Based Discovery.

PubMed

Hristovski, Dimitar; Kastrin, Andrej; Dinevski, Dejan; Rindflesch, Thomas C

2015-01-01

Literature-based discovery (LBD) generates discoveries, or hypotheses, by combining what is already known in the literature. Potential discoveries have the form of relations between biomedical concepts; for example, a drug may be determined to treat a disease other than the one for which it was intended. LBD views the knowledge in a domain as a network; a set of concepts along with the relations between them. As a starting point, we used SemMedDB, a database of semantic relations between biomedical concepts extracted with SemRep from Medline. SemMedDB is distributed as a MySQL relational database, which has some problems when dealing with network data. We transformed and uploaded SemMedDB into the Neo4j graph database, and implemented the basic LBD discovery algorithms with the Cypher query language. We conclude that storing the data needed for semantic LBD is more natural in a graph database. Also, implementing LBD discovery algorithms is conceptually simpler with a graph query language when compared with standard SQL.
The Data Base and Decision Making in Public Schools.

ERIC Educational Resources Information Center

Hedges, William D.

1984-01-01

Describes generic types of databases--file management systems, relational database management systems, and network/hierarchical database management systems--with their respective strengths and weaknesses; discusses factors to be considered in determining whether a database is desirable; and provides evaluative criteria for use in choosing…
Biological age as a health index for mortality and major age-related disease incidence in Koreans: National Health Insurance Service – Health screening 11-year follow-up study

PubMed Central

Kang, Young Gon; Suh, Eunkyung; Lee, Jae-woo; Kim, Dong Wook; Cho, Kyung Hee; Bae, Chul-Young

2018-01-01

Purpose A comprehensive health index is needed to measure an individual’s overall health and aging status and predict the risk of death and age-related disease incidence, and evaluate the effect of a health management program. The purpose of this study is to demonstrate the validity of estimated biological age (BA) in relation to all-cause mortality and age-related disease incidence based on National Sample Cohort database. Patients and methods This study was based on National Sample Cohort database of the National Health Insurance Service – Eligibility database and the National Health Insurance Service – Medical and Health Examination database of the year 2002 through 2013. BA model was developed based on the National Health Insurance Service – National Sample Cohort (NHIS – NSC) database and Cox proportional hazard analysis was done for mortality and major age-related disease incidence. Results For every 1 year increase of the calculated BA and chronological age difference, the hazard ratio for mortality significantly increased by 1.6% (1.5% in men and 2.0% in women) and also for hypertension, diabetes mellitus, heart disease, stroke, and cancer incidence by 2.5%, 4.2%, 1.3%, 1.6%, and 0.4%, respectively (p<0.001). Conclusion Estimated BA by the developed BA model based on NHIS – NSC database is expected to be used not only as an index for assessing health and aging status and predicting mortality and major age-related disease incidence, but can also be applied to various health care fields. PMID:29593385
YAdumper: extracting and translating large information volumes from relational databases to structured flat files.

PubMed

Fernández, José M; Valencia, Alfonso

2004-10-12

Downloading the information stored in relational databases into XML and other flat formats is a common task in bioinformatics. This periodical dumping of information requires considerable CPU time, disk and memory resources. YAdumper has been developed as a purpose-specific tool to deal with the integral structured information download of relational databases. YAdumper is a Java application that organizes database extraction following an XML template based on an external Document Type Declaration. Compared with other non-native alternatives, YAdumper substantially reduces memory requirements and considerably improves writing performance.
The Use of a Relational Database in Qualitative Research on Educational Computing.

ERIC Educational Resources Information Center

Winer, Laura R.; Carriere, Mario

1990-01-01

Discusses the use of a relational database as a data management and analysis tool for nonexperimental qualitative research, and describes the use of the Reflex Plus database in the Vitrine 2001 project in Quebec to study computer-based learning environments. Information systems are also discussed, and the use of a conceptual model is explained.…
[Establishment of a comprehensive database for laryngeal cancer related genes and the miRNAs].

PubMed

Li, Mengjiao; E, Qimin; Liu, Jialin; Huang, Tingting; Liang, Chuanyu

2015-09-01

By collecting and analyzing the laryngeal cancer related genes and the miRNAs, to build a comprehensive laryngeal cancer-related gene database, which differs from the current biological information database with complex and clumsy structure and focuses on the theme of gene and miRNA, and it could make the research and teaching more convenient and efficient. Based on the B/S architecture, using Apache as a Web server, MySQL as coding language of database design and PHP as coding language of web design, a comprehensive database for laryngeal cancer-related genes was established, providing with the gene tables, protein tables, miRNA tables and clinical information tables of the patients with laryngeal cancer. The established database containsed 207 laryngeal cancer related genes, 243 proteins, 26 miRNAs, and their particular information such as mutations, methylations, diversified expressions, and the empirical references of laryngeal cancer relevant molecules. The database could be accessed and operated via the Internet, by which browsing and retrieval of the information were performed. The database were maintained and updated regularly. The database for laryngeal cancer related genes is resource-integrated and user-friendly, providing a genetic information query tool for the study of laryngeal cancer.
Data, knowledge and method bases in chemical sciences. Part IV. Current status in databases.

PubMed

Braibanti, Antonio; Rao, Rupenaguntla Sambasiva; Rao, Gollapalli Nagesvara; Ramam, Veluri Anantha; Rao, Sattiraju Veera Venkata Satyanarayana

2002-01-01

Computer readable databases have become an integral part of chemical research right from planning data acquisition to interpretation of the information generated. The databases available today are numerical, spectral and bibliographic. Data representation by different schemes--relational, hierarchical and objects--is demonstrated. Quality index (QI) throws light on the quality of data. The objective, prospects and impact of database activity on expert systems are discussed. The number and size of corporate databases available on international networks crossed manageable number leading to databases about their contents. Subsets of corporate or small databases have been developed by groups of chemists. The features and role of knowledge-based or intelligent databases are described.
Adding glycaemic index and glycaemic load functionality to DietPLUS, a Malaysian food composition database and diet intake calculator.

PubMed

Shyam, Sangeetha; Wai, Tony Ng Kock; Arshad, Fatimah

2012-01-01

This paper outlines the methodology to add glycaemic index (GI) and glycaemic load (GL) functionality to food DietPLUS, a Microsoft Excel-based Malaysian food composition database and diet intake calculator. Locally determined GI values and published international GI databases were used as the source of GI values. Previously published methodology for GI value assignment was modified to add GI and GL calculators to the database. Two popular local low GI foods were added to the DietPLUS database, bringing up the total number of foods in the database to 838 foods. Overall, in relation to the 539 major carbohydrate foods in the Malaysian Food Composition Database, 243 (45%) food items had local Malaysian values or were directly matched to International GI database and another 180 (33%) of the foods were linked to closely-related foods in the GI databases used. The mean ± SD dietary GI and GL of the dietary intake of 63 women with previous gestational diabetes mellitus, calculated using DietPLUS version3 were, 62 ± 6 and 142 ± 45, respectively. These values were comparable to those reported from other local studies. DietPLUS version3, a simple Microsoft Excel-based programme aids calculation of diet GI and GL for Malaysian diets based on food records.
Relational-database model for improving quality assurance and process control in a composite manufacturing environment

NASA Astrophysics Data System (ADS)

Gentry, Jeffery D.

2000-05-01

A relational database is a powerful tool for collecting and analyzing the vast amounts of inner-related data associated with the manufacture of composite materials. A relational database contains many individual database tables that store data that are related in some fashion. Manufacturing process variables as well as quality assurance measurements can be collected and stored in database tables indexed according to lot numbers, part type or individual serial numbers. Relationships between manufacturing process and product quality can then be correlated over a wide range of product types and process variations. This paper presents details on how relational databases are used to collect, store, and analyze process variables and quality assurance data associated with the manufacture of advanced composite materials. Important considerations are covered including how the various types of data are organized and how relationships between the data are defined. Employing relational database techniques to establish correlative relationships between process variables and quality assurance measurements is then explored. Finally, the benefits of database techniques such as data warehousing, data mining and web based client/server architectures are discussed in the context of composite material manufacturing.
Development of a database system for near-future climate change projections under the Japanese National Project SI-CAT

NASA Astrophysics Data System (ADS)

Nakagawa, Y.; Kawahara, S.; Araki, F.; Matsuoka, D.; Ishikawa, Y.; Fujita, M.; Sugimoto, S.; Okada, Y.; Kawazoe, S.; Watanabe, S.; Ishii, M.; Mizuta, R.; Murata, A.; Kawase, H.

2017-12-01

Analyses of large ensemble data are quite useful in order to produce probabilistic effect projection of climate change. Ensemble data of "+2K future climate simulations" are currently produced by Japanese national project "Social Implementation Program on Climate Change Adaptation Technology (SI-CAT)" as a part of a database for Policy Decision making for Future climate change (d4PDF; Mizuta et al. 2016) produced by Program for Risk Information on Climate Change. Those data consist of global warming simulations and regional downscaling simulations. Considering that those data volumes are too large (a few petabyte) to download to a local computer of users, a user-friendly system is required to search and download data which satisfy requests of the users. We develop "a database system for near-future climate change projections" for providing functions to find necessary data for the users under SI-CAT. The database system for near-future climate change projections mainly consists of a relational database, a data download function and user interface. The relational database using PostgreSQL is a key function among them. Temporally and spatially compressed data are registered on the relational database. As a first step, we develop the relational database for precipitation, temperature and track data of typhoon according to requests by SI-CAT members. The data download function using Open-source Project for a Network Data Access Protocol (OPeNDAP) provides a function to download temporally and spatially extracted data based on search results obtained by the relational database. We also develop the web-based user interface for using the relational database and the data download function. A prototype of the database system for near-future climate change projections are currently in operational test on our local server. The database system for near-future climate change projections will be released on Data Integration and Analysis System Program (DIAS) in fiscal year 2017. Techniques of the database system for near-future climate change projections might be quite useful for simulation and observational data in other research fields. We report current status of development and some case studies of the database system for near-future climate change projections.
Relational databases for rare disease study: application to vascular anomalies.

PubMed

Perkins, Jonathan A; Coltrera, Marc D

2008-01-01

To design a relational database integrating clinical and basic science data needed for multidisciplinary treatment and research in the field of vascular anomalies. Based on data points agreed on by the American Society of Pediatric Otolaryngology (ASPO) Vascular Anomalies Task Force. The database design enables sharing of data subsets in a Health Insurance Portability and Accountability Act (HIPAA)-compliant manner for multisite collaborative trials. Vascular anomalies pose diagnostic and therapeutic challenges. Our understanding of these lesions and treatment improvement is limited by nonstandard terminology, severity assessment, and measures of treatment efficacy. The rarity of these lesions places a premium on coordinated studies among multiple participant sites. The relational database design is conceptually centered on subjects having 1 or more lesions. Each anomaly can be tracked individually along with their treatment outcomes. This design allows for differentiation between treatment responses and untreated lesions' natural course. The relational database design eliminates data entry redundancy and results in extremely flexible search and data export functionality. Vascular anomaly programs in the United States. A relational database correlating clinical findings and photographic, radiologic, histologic, and treatment data for vascular anomalies was created for stand-alone and multiuser networked systems. Proof of concept for independent site data gathering and HIPAA-compliant sharing of data subsets was demonstrated. The collaborative effort by the ASPO Vascular Anomalies Task Force to create the database helped define a common vascular anomaly data set. The resulting relational database software is a powerful tool to further the study of vascular anomalies and the development of evidence-based treatment innovation.
SoyBase, The USDA-ARS Soybean Genetics and Genomics Database

USDA-ARS?s Scientific Manuscript database

SoyBase, the USDA-ARS soybean genetic database, is a comprehensive repository for professionally curated genetics, genomics and related data resources for soybean. SoyBase contains the most current genetic, physical and genomic sequence maps integrated with qualitative and quantitative traits. The...
Dynamic taxonomies applied to a web-based relational database for geo-hydrological risk mitigation

NASA Astrophysics Data System (ADS)

Sacco, G. M.; Nigrelli, G.; Bosio, A.; Chiarle, M.; Luino, F.

2012-02-01

In its 40 years of activity, the Research Institute for Geo-hydrological Protection of the Italian National Research Council has amassed a vast and varied collection of historical documentation on landslides, muddy-debris flows, and floods in northern Italy from 1600 to the present. Since 2008, the archive resources have been maintained through a relational database management system. The database is used for routine study and research purposes as well as for providing support during geo-hydrological emergencies, when data need to be quickly and accurately retrieved. Retrieval speed and accuracy are the main objectives of an implementation based on a dynamic taxonomies model. Dynamic taxonomies are a general knowledge management model for configuring complex, heterogeneous information bases that support exploratory searching. At each stage of the process, the user can explore or browse the database in a guided yet unconstrained way by selecting the alternatives suggested for further refining the search. Dynamic taxonomies have been successfully applied to such diverse and apparently unrelated domains as e-commerce and medical diagnosis. Here, we describe the application of dynamic taxonomies to our database and compare it to traditional relational database query methods. The dynamic taxonomy interface, essentially a point-and-click interface, is considerably faster and less error-prone than traditional form-based query interfaces that require the user to remember and type in the "right" search keywords. Finally, dynamic taxonomy users have confirmed that one of the principal benefits of this approach is the confidence of having considered all the relevant information. Dynamic taxonomies and relational databases work in synergy to provide fast and precise searching: one of the most important factors in timely response to emergencies.
A PATO-compliant zebrafish screening database (MODB): management of morpholino knockdown screen information.

PubMed

Knowlton, Michelle N; Li, Tongbin; Ren, Yongliang; Bill, Brent R; Ellis, Lynda Bm; Ekker, Stephen C

2008-01-07

The zebrafish is a powerful model vertebrate amenable to high throughput in vivo genetic analyses. Examples include reverse genetic screens using morpholino knockdown, expression-based screening using enhancer trapping and forward genetic screening using transposon insertional mutagenesis. We have created a database to facilitate web-based distribution of data from such genetic studies. The MOrpholino DataBase is a MySQL relational database with an online, PHP interface. Multiple quality control levels allow differential access to data in raw and finished formats. MODBv1 includes sequence information relating to almost 800 morpholinos and their targets and phenotypic data regarding the dose effect of each morpholino (mortality, toxicity and defects). To improve the searchability of this database, we have incorporated a fixed-vocabulary defect ontology that allows for the organization of morpholino affects based on anatomical structure affected and defect produced. This also allows comparison between species utilizing Phenotypic Attribute Trait Ontology (PATO) designated terminology. MODB is also cross-linked with ZFIN, allowing full searches between the two databases. MODB offers users the ability to retrieve morpholino data by sequence of morpholino or target, name of target, anatomical structure affected and defect produced. MODB data can be used for functional genomic analysis of morpholino design to maximize efficacy and minimize toxicity. MODB also serves as a template for future sequence-based functional genetic screen databases, and it is currently being used as a model for the creation of a mutagenic insertional transposon database.
The integrated web service and genome database for agricultural plants with biotechnology information.

PubMed

Kim, Changkug; Park, Dongsuk; Seol, Youngjoo; Hahn, Jangho

2011-01-01

The National Agricultural Biotechnology Information Center (NABIC) constructed an agricultural biology-based infrastructure and developed a Web based relational database for agricultural plants with biotechnology information. The NABIC has concentrated on functional genomics of major agricultural plants, building an integrated biotechnology database for agro-biotech information that focuses on genomics of major agricultural resources. This genome database provides annotated genome information from 1,039,823 records mapped to rice, Arabidopsis, and Chinese cabbage.
A fragile zero watermarking scheme to detect and characterize malicious modifications in database relations.

PubMed

Khan, Aihab; Husain, Syed Afaq

2013-01-01

We put forward a fragile zero watermarking scheme to detect and characterize malicious modifications made to a database relation. Most of the existing watermarking schemes for relational databases introduce intentional errors or permanent distortions as marks into the database original content. These distortions inevitably degrade the data quality and data usability as the integrity of a relational database is violated. Moreover, these fragile schemes can detect malicious data modifications but do not characterize the tempering attack, that is, the nature of tempering. The proposed fragile scheme is based on zero watermarking approach to detect malicious modifications made to a database relation. In zero watermarking, the watermark is generated (constructed) from the contents of the original data rather than introduction of permanent distortions as marks into the data. As a result, the proposed scheme is distortion-free; thus, it also resolves the inherent conflict between security and imperceptibility. The proposed scheme also characterizes the malicious data modifications to quantify the nature of tempering attacks. Experimental results show that even minor malicious modifications made to a database relation can be detected and characterized successfully.
Evaluation of relational and NoSQL database architectures to manage genomic annotations.

PubMed

Schulz, Wade L; Nelson, Brent G; Felker, Donn K; Durant, Thomas J S; Torres, Richard

2016-12-01

While the adoption of next generation sequencing has rapidly expanded, the informatics infrastructure used to manage the data generated by this technology has not kept pace. Historically, relational databases have provided much of the framework for data storage and retrieval. Newer technologies based on NoSQL architectures may provide significant advantages in storage and query efficiency, thereby reducing the cost of data management. But their relative advantage when applied to biomedical data sets, such as genetic data, has not been characterized. To this end, we compared the storage, indexing, and query efficiency of a common relational database (MySQL), a document-oriented NoSQL database (MongoDB), and a relational database with NoSQL support (PostgreSQL). When used to store genomic annotations from the dbSNP database, we found the NoSQL architectures to outperform traditional, relational models for speed of data storage, indexing, and query retrieval in nearly every operation. These findings strongly support the use of novel database technologies to improve the efficiency of data management within the biological sciences. Copyright Â© 2016 Elsevier Inc. All rights reserved.
Design of Integrated Database on Mobile Information System: A Study of Yogyakarta Smart City App

NASA Astrophysics Data System (ADS)

Nurnawati, E. K.; Ermawati, E.

2018-02-01

An integration database is a database which acts as the data store for multiple applications and thus integrates data across these applications (in contrast to an Application Database). An integration database needs a schema that takes all its client applications into account. The benefit of the schema that sharing data among applications does not require an extra layer of integration services on the applications. Any changes to data made in a single application are made available to all applications at the time of database commit - thus keeping the applications’ data use better synchronized. This study aims to design and build an integrated database that can be used by various applications in a mobile device based system platforms with the based on smart city system. The built-in database can be used by various applications, whether used together or separately. The design and development of the database are emphasized on the flexibility, security, and completeness of attributes that can be used together by various applications to be built. The method used in this study is to choice of the appropriate database logical structure (patterns of data) and to build the relational-database models (Design Databases). Test the resulting design with some prototype apps and analyze system performance with test data. The integrated database can be utilized both of the admin and the user in an integral and comprehensive platform. This system can help admin, manager, and operator in managing the application easily and efficiently. This Android-based app is built based on a dynamic clientserver where data is extracted from an external database MySQL. So if there is a change of data in the database, then the data on Android applications will also change. This Android app assists users in searching of Yogyakarta (as smart city) related information, especially in culture, government, hotels, and transportation.
A survey of commercial object-oriented database management systems

NASA Technical Reports Server (NTRS)

Atkins, John

1992-01-01

The object-oriented data model is the culmination of over thirty years of database research. Initially, database research focused on the need to provide information in a consistent and efficient manner to the business community. Early data models such as the hierarchical model and the network model met the goal of consistent and efficient access to data and were substantial improvements over simple file mechanisms for storing and accessing data. However, these models required highly skilled programmers to provide access to the data. Consequently, in the early 70's E.F. Codd, an IBM research computer scientists, proposed a new data model based on the simple mathematical notion of the relation. This model is known as the Relational Model. In the relational model, data is represented in flat tables (or relations) which have no physical or internal links between them. The simplicity of this model fostered the development of powerful but relatively simple query languages that now made data directly accessible to the general database user. Except for large, multi-user database systems, a database professional was in general no longer necessary. Database professionals found that traditional data in the form of character data, dates, and numeric data were easily represented and managed via the relational model. Commercial relational database management systems proliferated and performance of relational databases improved dramatically. However, there was a growing community of potential database users whose needs were not met by the relational model. These users needed to store data with data types not available in the relational model and who required a far richer modelling environment than that provided by the relational model. Indeed, the complexity of the objects to be represented in the model mandated a new approach to database technology. The Object-Oriented Model was the result.
Compartmental and Data-Based Modeling of Cerebral Hemodynamics: Linear Analysis.

PubMed

Henley, B C; Shin, D C; Zhang, R; Marmarelis, V Z

Compartmental and data-based modeling of cerebral hemodynamics are alternative approaches that utilize distinct model forms and have been employed in the quantitative study of cerebral hemodynamics. This paper examines the relation between a compartmental equivalent-circuit and a data-based input-output model of dynamic cerebral autoregulation (DCA) and CO2-vasomotor reactivity (DVR). The compartmental model is constructed as an equivalent-circuit utilizing putative first principles and previously proposed hypothesis-based models. The linear input-output dynamics of this compartmental model are compared with data-based estimates of the DCA-DVR process. This comparative study indicates that there are some qualitative similarities between the two-input compartmental model and experimental results.

FINDbase: a relational database recording frequencies of genetic defects leading to inherited disorders worldwide.

PubMed

van Baal, Sjozef; Kaimakis, Polynikis; Phommarinh, Manyphong; Koumbi, Daphne; Cuppens, Harry; Riccardino, Francesca; Macek, Milan; Scriver, Charles R; Patrinos, George P

2007-01-01

Frequency of INherited Disorders database (FINDbase) (http://www.findbase.org) is a relational database, derived from the ETHNOS software, recording frequencies of causative mutations leading to inherited disorders worldwide. Database records include the population and ethnic group, the disorder name and the related gene, accompanied by links to any corresponding locus-specific mutation database, to the respective Online Mendelian Inheritance in Man entries and the mutation together with its frequency in that population. The initial information is derived from the published literature, locus-specific databases and genetic disease consortia. FINDbase offers a user-friendly query interface, providing instant access to the list and frequencies of the different mutations. Query outputs can be either in a table or graphical format, accompanied by reference(s) on the data source. Registered users from three different groups, namely administrator, national coordinator and curator, are responsible for database curation and/or data entry/correction online via a password-protected interface. Databaseaccess is free of charge and there are no registration requirements for data querying. FINDbase provides a simple, web-based system for population-based mutation data collection and retrieval and can serve not only as a valuable online tool for molecular genetic testing of inherited disorders but also as a non-profit model for sustainable database funding, in the form of a 'database-journal'.
A comparison of traditional anti-inflammation and anti-infection medicinal plants with current evidence from biomedical research: Results from a regional study

PubMed Central

Vieira, A.

2010-01-01

Background: In relation to pharmacognosy, an objective of many ethnobotanical studies is to identify plant species to be further investigated, for example, tested in disease models related to the ethnomedicinal application. To further warrant such testing, research evidence for medicinal applications of these plants (or of their major phytochemical constituents and metabolic derivatives) is typically analyzed in biomedical databases. Methods: As a model of this process, the current report presents novel information regarding traditional anti-inflammation and anti-infection medicinal plant use. This information was obtained from an interview-based ethnobotanical study; and was compared with current biomedical evidence using the Medline® database. Results: Of the 8 anti-infection plant species identified in the ethnobotanical study, 7 have related activities reported in the database; and of the 6 anti-inflammation plants, 4 have related activities in the database. Conclusion: Based on novel and complimentary results from the ethnobotanical and biomedical database analyses, it is suggested that some of these plants warrant additional investigation of potential anti-inflammatory or anti-infection activities in related disease models, and also additional studies in other population groups. PMID:21589754
The integrated web service and genome database for agricultural plants with biotechnology information

PubMed Central

Kim, ChangKug; Park, DongSuk; Seol, YoungJoo; Hahn, JangHo

2011-01-01

The National Agricultural Biotechnology Information Center (NABIC) constructed an agricultural biology-based infrastructure and developed a Web based relational database for agricultural plants with biotechnology information. The NABIC has concentrated on functional genomics of major agricultural plants, building an integrated biotechnology database for agro-biotech information that focuses on genomics of major agricultural resources. This genome database provides annotated genome information from 1,039,823 records mapped to rice, Arabidopsis, and Chinese cabbage. PMID:21887015
MARC and Relational Databases.

ERIC Educational Resources Information Center

Llorens, Jose; Trenor, Asuncion

1993-01-01

Discusses the use of MARC format in relational databases and addresses problems of incompatibilities. A solution is presented that is in accordance with Open Systems Interconnection (OSI) standards and is based on experiences at the library of the Universidad Politecnica de Valencia (Spain). (four references) (EA)
Complex Adaptive Systems Based Data Integration: Theory and Applications

ERIC Educational Resources Information Center

Rohn, Eliahu

2008-01-01

Data Definition Languages (DDLs) have been created and used to represent data in programming languages and in database dictionaries. This representation includes descriptions in the form of data fields and relations in the form of a hierarchy, with the common exception of relational databases where relations are flat. Network computing created an…
SPIRE Data-Base Management System

NASA Technical Reports Server (NTRS)

Fuechsel, C. F.

1984-01-01

Spacelab Payload Integration and Rocket Experiment (SPIRE) data-base management system (DBMS) based on relational model of data bases. Data bases typically used for engineering and mission analysis tasks and, unlike most commercially available systems, allow data items and data structures stored in forms suitable for direct analytical computation. SPIRE DBMS designed to support data requests from interactive users as well as applications programs.
Relax with CouchDB - Into the non-relational DBMS era of Bioinformatics

PubMed Central

Manyam, Ganiraju; Payton, Michelle A.; Roth, Jack A.; Abruzzo, Lynne V.; Coombes, Kevin R.

2012-01-01

With the proliferation of high-throughput technologies, genome-level data analysis has become common in molecular biology. Bioinformaticians are developing extensive resources to annotate and mine biological features from high-throughput data. The underlying database management systems for most bioinformatics software are based on a relational model. Modern non-relational databases offer an alternative that has flexibility, scalability, and a non-rigid design schema. Moreover, with an accelerated development pace, non-relational databases like CouchDB can be ideal tools to construct bioinformatics utilities. We describe CouchDB by presenting three new bioinformatics resources: (a) geneSmash, which collates data from bioinformatics resources and provides automated gene-centric annotations, (b) drugBase, a database of drug-target interactions with a web interface powered by geneSmash, and (c) HapMap-CN, which provides a web interface to query copy number variations from three SNP-chip HapMap datasets. In addition to the web sites, all three systems can be accessed programmatically via web services. PMID:22609849
Heterogeneous distributed databases: A case study

NASA Technical Reports Server (NTRS)

Stewart, Tracy R.; Mukkamala, Ravi

1991-01-01

Alternatives are reviewed for accessing distributed heterogeneous databases and a recommended solution is proposed. The current study is limited to the Automated Information Systems Center at the Naval Sea Combat Systems Engineering Station at Norfolk, VA. This center maintains two databases located on Digital Equipment Corporation's VAX computers running under the VMS operating system. The first data base, ICMS, resides on a VAX11/780 and has been implemented using VAX DBMS, a CODASYL based system. The second database, CSA, resides on a VAX 6460 and has been implemented using the ORACLE relational database management system (RDBMS). Both databases are used for configuration management within the U.S. Navy. Different customer bases are supported by each database. ICMS tracks U.S. Navy ships and major systems (anti-sub, sonar, etc.). Even though the major systems on ships and submarines have totally different functions, some of the equipment within the major systems are common to both ships and submarines.
Executing Complexity-Increasing Queries in Relational (MySQL) and NoSQL (MongoDB and EXist) Size-Growing ISO/EN 13606 Standardized EHR Databases

PubMed Central

Sánchez-de-Madariaga, Ricardo; Muñoz, Adolfo; Castro, Antonio L; Moreno, Oscar; Pascual, Mario

2018-01-01

This research shows a protocol to assess the computational complexity of querying relational and non-relational (NoSQL (not only Structured Query Language)) standardized electronic health record (EHR) medical information database systems (DBMS). It uses a set of three doubling-sized databases, i.e. databases storing 5000, 10,000 and 20,000 realistic standardized EHR extracts, in three different database management systems (DBMS): relational MySQL object-relational mapping (ORM), document-based NoSQL MongoDB, and native extensible markup language (XML) NoSQL eXist. The average response times to six complexity-increasing queries were computed, and the results showed a linear behavior in the NoSQL cases. In the NoSQL field, MongoDB presents a much flatter linear slope than eXist. NoSQL systems may also be more appropriate to maintain standardized medical information systems due to the special nature of the updating policies of medical information, which should not affect the consistency and efficiency of the data stored in NoSQL databases. One limitation of this protocol is the lack of direct results of improved relational systems such as archetype relational mapping (ARM) with the same data. However, the interpolation of doubling-size database results to those presented in the literature and other published results suggests that NoSQL systems might be more appropriate in many specific scenarios and problems to be solved. For example, NoSQL may be appropriate for document-based tasks such as EHR extracts used in clinical practice, or edition and visualization, or situations where the aim is not only to query medical information, but also to restore the EHR in exactly its original form. PMID:29608174
Executing Complexity-Increasing Queries in Relational (MySQL) and NoSQL (MongoDB and EXist) Size-Growing ISO/EN 13606 Standardized EHR Databases.

PubMed

Sánchez-de-Madariaga, Ricardo; Muñoz, Adolfo; Castro, Antonio L; Moreno, Oscar; Pascual, Mario

2018-03-19

This research shows a protocol to assess the computational complexity of querying relational and non-relational (NoSQL (not only Structured Query Language)) standardized electronic health record (EHR) medical information database systems (DBMS). It uses a set of three doubling-sized databases, i.e. databases storing 5000, 10,000 and 20,000 realistic standardized EHR extracts, in three different database management systems (DBMS): relational MySQL object-relational mapping (ORM), document-based NoSQL MongoDB, and native extensible markup language (XML) NoSQL eXist. The average response times to six complexity-increasing queries were computed, and the results showed a linear behavior in the NoSQL cases. In the NoSQL field, MongoDB presents a much flatter linear slope than eXist. NoSQL systems may also be more appropriate to maintain standardized medical information systems due to the special nature of the updating policies of medical information, which should not affect the consistency and efficiency of the data stored in NoSQL databases. One limitation of this protocol is the lack of direct results of improved relational systems such as archetype relational mapping (ARM) with the same data. However, the interpolation of doubling-size database results to those presented in the literature and other published results suggests that NoSQL systems might be more appropriate in many specific scenarios and problems to be solved. For example, NoSQL may be appropriate for document-based tasks such as EHR extracts used in clinical practice, or edition and visualization, or situations where the aim is not only to query medical information, but also to restore the EHR in exactly its original form.
Database Management System

NASA Technical Reports Server (NTRS)

1990-01-01

In 1981 Wayne Erickson founded Microrim, Inc, a company originally focused on marketing a microcomputer version of RIM (Relational Information Manager). Dennis Comfort joined the firm and is now vice president, development. The team developed an advanced spinoff from the NASA system they had originally created, a microcomputer database management system known as R:BASE 4000. Microrim added many enhancements and developed a series of R:BASE products for various environments. R:BASE is now the second largest selling line of microcomputer database management software in the world.
An Object-Relational Ifc Storage Model Based on Oracle Database

NASA Astrophysics Data System (ADS)

Li, Hang; Liu, Hua; Liu, Yong; Wang, Yuan

2016-06-01

With the building models are getting increasingly complicated, the levels of collaboration across professionals attract more attention in the architecture, engineering and construction (AEC) industry. In order to adapt the change, buildingSMART developed Industry Foundation Classes (IFC) to facilitate the interoperability between software platforms. However, IFC data are currently shared in the form of text file, which is defective. In this paper, considering the object-based inheritance hierarchy of IFC and the storage features of different database management systems (DBMS), we propose a novel object-relational storage model that uses Oracle database to store IFC data. Firstly, establish the mapping rules between data types in IFC specification and Oracle database. Secondly, design the IFC database according to the relationships among IFC entities. Thirdly, parse the IFC file and extract IFC data. And lastly, store IFC data into corresponding tables in IFC database. In experiment, three different building models are selected to demonstrate the effectiveness of our storage model. The comparison of experimental statistics proves that IFC data are lossless during data exchange.
Inconsistencies in the red blood cell membrane proteome analysis: generation of a database for research and diagnostic applications

PubMed Central

Hegedűs, Tamás; Chaubey, Pururawa Mayank; Várady, György; Szabó, Edit; Sarankó, Hajnalka; Hofstetter, Lia; Roschitzki, Bernd; Sarkadi, Balázs

2015-01-01

Based on recent results, the determination of the easily accessible red blood cell (RBC) membrane proteins may provide new diagnostic possibilities for assessing mutations, polymorphisms or regulatory alterations in diseases. However, the analysis of the current mass spectrometry-based proteomics datasets and other major databases indicates inconsistencies—the results show large scattering and only a limited overlap for the identified RBC membrane proteins. Here, we applied membrane-specific proteomics studies in human RBC, compared these results with the data in the literature, and generated a comprehensive and expandable database using all available data sources. The integrated web database now refers to proteomic, genetic and medical databases as well, and contains an unexpected large number of validated membrane proteins previously thought to be specific for other tissues and/or related to major human diseases. Since the determination of protein expression in RBC provides a method to indicate pathological alterations, our database should facilitate the development of RBC membrane biomarker platforms and provide a unique resource to aid related further research and diagnostics. Database URL: http://rbcc.hegelab.org PMID:26078478
An effective model for store and retrieve big health data in cloud computing.

PubMed

Goli-Malekabadi, Zohreh; Sargolzaei-Javan, Morteza; Akbari, Mohammad Kazem

2016-08-01

The volume of healthcare data including different and variable text types, sounds, and images is increasing day to day. Therefore, the storage and processing of these data is a necessary and challenging issue. Generally, relational databases are used for storing health data which are not able to handle the massive and diverse nature of them. This study aimed at presenting the model based on NoSQL databases for the storage of healthcare data. Despite different types of NoSQL databases, document-based DBs were selected by a survey on the nature of health data. The presented model was implemented in the Cloud environment for accessing to the distribution properties. Then, the data were distributed on the database by applying the Shard property. The efficiency of the model was evaluated in comparison with the previous data model, Relational Database, considering query time, data preparation, flexibility, and extensibility parameters. The results showed that the presented model approximately performed the same as SQL Server for "read" query while it acted more efficiently than SQL Server for "write" query. Also, the performance of the presented model was better than SQL Server in the case of flexibility, data preparation and extensibility. Based on these observations, the proposed model was more effective than Relational Databases for handling health data. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
rSNPBase 3.0: an updated database of SNP-related regulatory elements, element-gene pairs and SNP-based gene regulatory networks.

PubMed

Guo, Liyuan; Wang, Jing

2018-01-04

Here, we present the updated rSNPBase 3.0 database (http://rsnp3.psych.ac.cn), which provides human SNP-related regulatory elements, element-gene pairs and SNP-based regulatory networks. This database is the updated version of the SNP regulatory annotation database rSNPBase and rVarBase. In comparison to the last two versions, there are both structural and data adjustments in rSNPBase 3.0: (i) The most significant new feature is the expansion of analysis scope from SNP-related regulatory elements to include regulatory element-target gene pairs (E-G pairs), therefore it can provide SNP-based gene regulatory networks. (ii) Web function was modified according to data content and a new network search module is provided in the rSNPBase 3.0 in addition to the previous regulatory SNP (rSNP) search module. The two search modules support data query for detailed information (related-elements, element-gene pairs, and other extended annotations) on specific SNPs and SNP-related graphic networks constructed by interacting transcription factors (TFs), miRNAs and genes. (3) The type of regulatory elements was modified and enriched. To our best knowledge, the updated rSNPBase 3.0 is the first data tool supports SNP functional analysis from a regulatory network prospective, it will provide both a comprehensive understanding and concrete guidance for SNP-related regulatory studies. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
rSNPBase 3.0: an updated database of SNP-related regulatory elements, element-gene pairs and SNP-based gene regulatory networks

PubMed Central

2018-01-01

Abstract Here, we present the updated rSNPBase 3.0 database (http://rsnp3.psych.ac.cn), which provides human SNP-related regulatory elements, element-gene pairs and SNP-based regulatory networks. This database is the updated version of the SNP regulatory annotation database rSNPBase and rVarBase. In comparison to the last two versions, there are both structural and data adjustments in rSNPBase 3.0: (i) The most significant new feature is the expansion of analysis scope from SNP-related regulatory elements to include regulatory element–target gene pairs (E–G pairs), therefore it can provide SNP-based gene regulatory networks. (ii) Web function was modified according to data content and a new network search module is provided in the rSNPBase 3.0 in addition to the previous regulatory SNP (rSNP) search module. The two search modules support data query for detailed information (related-elements, element-gene pairs, and other extended annotations) on specific SNPs and SNP-related graphic networks constructed by interacting transcription factors (TFs), miRNAs and genes. (3) The type of regulatory elements was modified and enriched. To our best knowledge, the updated rSNPBase 3.0 is the first data tool supports SNP functional analysis from a regulatory network prospective, it will provide both a comprehensive understanding and concrete guidance for SNP-related regulatory studies. PMID:29140525
Special Section: The USMARC Community Information Format.

ERIC Educational Resources Information Center

Lutz, Marilyn; And Others

1992-01-01

Five papers discuss topics related to the USMARC Community Information Format (CIF), including using CIF to create a public service resource network; development of a CIF-based database of materials relating to multicultural and differently-abled populations; background on CIF; development of an information and referral database; and CIF and…
Fuzzy Relational Databases: Representational Issues and Reduction Using Similarity Measures.

ERIC Educational Resources Information Center

Prade, Henri; Testemale, Claudette

1987-01-01

Compares and expands upon two approaches to dealing with fuzzy relational databases. The proposed similarity measure is based on a fuzzy Hausdorff distance and estimates the mismatch between two possibility distributions using a reduction process. The consequences of the reduction process on query evaluation are studied. (Author/EM)
CottonGen: a genomics, genetics and breeding database for cotton research

USDA-ARS?s Scientific Manuscript database

CottonGen (http://www.cottongen.org) is a curated and integrated web-based relational database providing access to publicly available genomic, genetic and breeding data for cotton. CottonGen supercedes CottonDB and the Cotton Marker Database, with enhanced tools for easier data sharing, mining, vis...
Rapid development of entity-based data models for bioinformatics with persistence object-oriented design and structured interfaces.

PubMed

Ezra Tsur, Elishai

2017-01-01

Databases are imperative for research in bioinformatics and computational biology. Current challenges in database design include data heterogeneity and context-dependent interconnections between data entities. These challenges drove the development of unified data interfaces and specialized databases. The curation of specialized databases is an ever-growing challenge due to the introduction of new data sources and the emergence of new relational connections between established datasets. Here, an open-source framework for the curation of specialized databases is proposed. The framework supports user-designed models of data encapsulation, objects persistency and structured interfaces to local and external data sources such as MalaCards, Biomodels and the National Centre for Biotechnology Information (NCBI) databases. The proposed framework was implemented using Java as the development environment, EclipseLink as the data persistency agent and Apache Derby as the database manager. Syntactic analysis was based on J3D, jsoup, Apache Commons and w3c.dom open libraries. Finally, a construction of a specialized database for aneurysms associated vascular diseases is demonstrated. This database contains 3-dimensional geometries of aneurysms, patient's clinical information, articles, biological models, related diseases and our recently published model of aneurysms' risk of rapture. Framework is available in: http://nbel-lab.com.

SQL is Dead; Long-live SQL: Relational Database Technology in Science Contexts

NASA Astrophysics Data System (ADS)

Howe, B.; Halperin, D.

2014-12-01

Relational databases are often perceived as a poor fit in science contexts: Rigid schemas, poor support for complex analytics, unpredictable performance, significant maintenance and tuning requirements --- these idiosyncrasies often make databases unattractive in science contexts characterized by heterogeneous data sources, complex analysis tasks, rapidly changing requirements, and limited IT budgets. In this talk, I'll argue that although the value proposition of typical relational database systems are weak in science, the core ideas that power relational databases have become incredibly prolific in open source science software, and are emerging as a universal abstraction for both big data and small data. In addition, I'll talk about two open source systems we are building to "jailbreak" the core technology of relational databases and adapt them for use in science. The first is SQLShare, a Database-as-a-Service system supporting collaborative data analysis and exchange by reducing database use to an Upload-Query-Share workflow with no installation, schema design, or configuration required. The second is Myria, a service that supports much larger scale data, complex analytics, and supports multiple back end systems. Finally, I'll describe some of the ways our collaborators in oceanography, astronomy, biology, fisheries science, and more are using these systems to replace script-based workflows for reasons of performance, flexibility, and convenience.
Serials Management by Microcomputer: The Potential of DBMS.

ERIC Educational Resources Information Center

Vogel, J. Thomas; Burns, Lynn W.

1984-01-01

Describes serials management at Philadelphia College of Textiles and Science library via a microcomputer, a file manager called PFS, and a relational database management system called dBase II. Check-in procedures, programing with dBase II, "static" and "active" databases, and claim procedures are discussed. Check-in forms are…
A Methodolgy, Based on Analytical Modeling, for the Design of Parallel and Distributed Architectures for Relational Database Query Processors.

DTIC Science & Technology

1987-12-01

Application Programs Intelligent Disk Database Controller Manangement System Operating System Host .1’ I% Figure 2. Intelligent Disk Controller Application...8217. /- - • Database Control -% Manangement System Disk Data Controller Application Programs Operating Host I"" Figure 5. Processor-Per- Head data. Therefore, the...However. these ad- ditional properties have been proven in classical set and relation theory [75]. These additional properties are described here
[The future of clinical laboratory database management system].

PubMed

Kambe, M; Imidy, D; Matsubara, A; Sugimoto, Y

1999-09-01

To assess the present status of the clinical laboratory database management system, the difference between the Clinical Laboratory Information System and Clinical Laboratory System was explained in this study. Although three kinds of database management systems (DBMS) were shown including the relational model, tree model and network model, the relational model was found to be the best DBMS for the clinical laboratory database based on our experience and developments of some clinical laboratory expert systems. As a future clinical laboratory database management system, the IC card system connected to an automatic chemical analyzer was proposed for personal health data management and a microscope/video system was proposed for dynamic data management of leukocytes or bacteria.
A web based relational database management system for filariasis control

PubMed Central

Murty, Upadhyayula Suryanarayana; Kumar, Duvvuri Venkata Rama Satya; Sriram, Kumaraswamy; Rao, Kadiri Madhusudhan; Bhattacharyulu, Chakravarthula Hayageeva Narasimha Venakata; Praveen, Bhoopathi; Krishna, Amirapu Radha

2005-01-01

The present study describes a RDBMS (relational database management system) for the effective management of Filariasis, a vector borne disease. Filariasis infects 120 million people from 83 countries. The possible re-emergence of the disease and the complexity of existing control programs warrant the development of new strategies. A database containing comprehensive data associated with filariasis finds utility in disease control. We have developed a database containing information on the socio-economic status of patients, mosquito collection procedures, mosquito dissection data, filariasis survey report and mass blood data. The database can be searched using a user friendly web interface. Availability http://www.webfil.org (login and password can be obtained from the authors) PMID:17597846
The NASA ADS Abstract Service and the Distributed Astronomy Digital Library [and] Project Soup: Comparing Evaluations of Digital Collection Efforts [and] Cross-Organizational Access Management: A Digital Library Authentication and Authorization Architecture [and] BibRelEx: Exploring Bibliographic Databases by Visualization of Annotated Content-based Relations [and] Semantics-Sensitive Retrieval for Digital Picture Libraries [and] Encoded Archival Description: An Introduction and Overview.

ERIC Educational Resources Information Center

Kurtz, Michael J.; Eichorn, Guenther; Accomazzi, Alberto; Grant, Carolyn S.; Demleitner, Markus; Murray, Stephen S.; Jones, Michael L. W.; Gay, Geri K.; Rieger, Robert H.; Millman, David; Bruggemann-Klein, Anne; Klein, Rolf; Landgraf, Britta; Wang, James Ze; Li, Jia; Chan, Desmond; Wiederhold, Gio; Pitti, Daniel V.

1999-01-01

Includes six articles that discuss a digital library for astronomy; comparing evaluations of digital collection efforts; cross-organizational access management of Web-based resources; searching scientific bibliographic databases based on content-based relations between documents; semantics-sensitive retrieval for digital picture libraries; and…
Relax with CouchDB--into the non-relational DBMS era of bioinformatics.

PubMed

Manyam, Ganiraju; Payton, Michelle A; Roth, Jack A; Abruzzo, Lynne V; Coombes, Kevin R

2012-07-01

With the proliferation of high-throughput technologies, genome-level data analysis has become common in molecular biology. Bioinformaticians are developing extensive resources to annotate and mine biological features from high-throughput data. The underlying database management systems for most bioinformatics software are based on a relational model. Modern non-relational databases offer an alternative that has flexibility, scalability, and a non-rigid design schema. Moreover, with an accelerated development pace, non-relational databases like CouchDB can be ideal tools to construct bioinformatics utilities. We describe CouchDB by presenting three new bioinformatics resources: (a) geneSmash, which collates data from bioinformatics resources and provides automated gene-centric annotations, (b) drugBase, a database of drug-target interactions with a web interface powered by geneSmash, and (c) HapMap-CN, which provides a web interface to query copy number variations from three SNP-chip HapMap datasets. In addition to the web sites, all three systems can be accessed programmatically via web services. Copyright © 2012 Elsevier Inc. All rights reserved.
Migration of legacy mumps applications to relational database servers.

PubMed

O'Kane, K C

2001-07-01

An extended implementation of the Mumps language is described that facilitates vendor neutral migration of legacy Mumps applications to SQL-based relational database servers. Implemented as a compiler, this system translates Mumps programs to operating system independent, standard C code for subsequent compilation to fully stand-alone, binary executables. Added built-in functions and support modules extend the native hierarchical Mumps database with access to industry standard, networked, relational database management servers (RDBMS) thus freeing Mumps applications from dependence upon vendor specific, proprietary, unstandardized database models. Unlike Mumps systems that have added captive, proprietary RDMBS access, the programs generated by this development environment can be used with any RDBMS system that supports common network access protocols. Additional features include a built-in web server interface and the ability to interoperate directly with programs and functions written in other languages.
Performance related issues in distributed database systems

NASA Technical Reports Server (NTRS)

Mukkamala, Ravi

1991-01-01

The key elements of research performed during the year long effort of this project are: Investigate the effects of heterogeneity in distributed real time systems; Study the requirements to TRAC towards building a heterogeneous database system; Study the effects of performance modeling on distributed database performance; and Experiment with an ORACLE based heterogeneous system.
Implementing a Dynamic Database-Driven Course Using LAMP

ERIC Educational Resources Information Center

Laverty, Joseph Packy; Wood, David; Turchek, John

2011-01-01

This paper documents the formulation of a database driven open source architecture web development course. The design of a web-based curriculum faces many challenges: a) relative emphasis of client and server-side technologies, b) choice of a server-side language, and c) the cost and efficient delivery of a dynamic web development, database-driven…
Microcomputer-Based Access to Machine-Readable Numeric Databases.

ERIC Educational Resources Information Center

Wenzel, Patrick

1988-01-01

Describes the use of microcomputers and relational database management systems to improve access to numeric databases by the Data and Program Library Service at the University of Wisconsin. The internal records management system, in-house reference tools, and plans to extend these tools to the entire campus are discussed. (3 references) (CLB)
Development of a Relational Database for Learning Management Systems

ERIC Educational Resources Information Center

Deperlioglu, Omer; Sarpkaya, Yilmaz; Ergun, Ertugrul

2011-01-01

In today's world, Web-Based Distance Education Systems have a great importance. Web-based Distance Education Systems are usually known as Learning Management Systems (LMS). In this article, a database design, which was developed to create an educational institution as a Learning Management System, is described. In this sense, developed Learning…
Agent-Based Framework for Discrete Entity Simulations

DTIC Science & Technology

2006-11-01

Postgres database server for environment queries of neighbors and continuum data. As expected for raw database queries (no database optimizations in...form. Eventually the code was ported to GNU C++ on the same single Intel Pentium 4 CPU running RedHat Linux 9.0 and Postgres database server...Again Postgres was used for environmental queries, and the tool remained relatively slow because of the immense number of queries necessary to assess
Cry-Bt identifier: a biological database for PCR detection of Cry genes present in transgenic plants.

PubMed

Singh, Vinay Kumar; Ambwani, Sonu; Marla, Soma; Kumar, Anil

2009-10-23

We describe the development of a user friendly tool that would assist in the retrieval of information relating to Cry genes in transgenic crops. The tool also helps in detection of transformed Cry genes from Bacillus thuringiensis present in transgenic plants by providing suitable designed primers for PCR identification of these genes. The tool designed based on relational database model enables easy retrieval of information from the database with simple user queries. The tool also enables users to access related information about Cry genes present in various databases by interacting with different sources (nucleotide sequences, protein sequence, sequence comparison tools, published literature, conserved domains, evolutionary and structural data). http://insilicogenomics.in/Cry-btIdentifier/welcome.html.
Database Search Engines: Paradigms, Challenges and Solutions.

PubMed

Verheggen, Kenneth; Martens, Lennart; Berven, Frode S; Barsnes, Harald; Vaudel, Marc

2016-01-01

The first step in identifying proteins from mass spectrometry based shotgun proteomics data is to infer peptides from tandem mass spectra, a task generally achieved using database search engines. In this chapter, the basic principles of database search engines are introduced with a focus on open source software, and the use of database search engines is demonstrated using the freely available SearchGUI interface. This chapter also discusses how to tackle general issues related to sequence database searching and shows how to minimize their impact.
Documentation of a spatial data-base management system for monitoring pesticide application in Washington

USGS Publications Warehouse

Schurr, K.M.; Cox, S.E.

1994-01-01

The Pesticide-Application Data-Base Management System was created as a demonstration project and was tested with data submitted to the Washington State Department of Agriculture by pesticide applicators from a small geographic area. These data were entered into the Department's relational data-base system and uploaded into the system's ARC/INFO files. Locations for pesticide applica- tions are assigned within the Public Land Survey System grids, and ARC/INFO programs in the Pesticide-Application Data-Base Management System can subdivide each survey section into sixteen idealized quarter-quarter sections for display map grids. The system provides data retrieval and geographic information system plotting capabilities from a menu of seven basic retrieval options. Additionally, ARC/INFO coverages can be created from the retrieved data when required for particular applications. The Pesticide-Application Data-Base Management System, or the general principles used in the system, could be adapted to other applica- tions or to other states.
AgeFactDB--the JenAge Ageing Factor Database--towards data integration in ageing research.

PubMed

Hühne, Rolf; Thalheim, Torsten; Sühnel, Jürgen

2014-01-01

AgeFactDB (http://agefactdb.jenage.de) is a database aimed at the collection and integration of ageing phenotype data including lifespan information. Ageing factors are considered to be genes, chemical compounds or other factors such as dietary restriction, whose action results in a changed lifespan or another ageing phenotype. Any information related to the effects of ageing factors is called an observation and is presented on observation pages. To provide concise access to the complete information for a particular ageing factor, corresponding observations are also summarized on ageing factor pages. In a first step, ageing-related data were primarily taken from existing databases such as the Ageing Gene Database--GenAge, the Lifespan Observations Database and the Dietary Restriction Gene Database--GenDR. In addition, we have started to include new ageing-related information. Based on homology data taken from the HomoloGene Database, AgeFactDB also provides observation and ageing factor pages of genes that are homologous to known ageing-related genes. These homologues are considered as candidate or putative ageing-related genes. AgeFactDB offers a variety of search and browse options, and also allows the download of ageing factor or observation lists in TSV, CSV and XML formats.
GALT protein database, a bioinformatics resource for the management and analysis of structural features of a galactosemia-related protein and its mutants.

PubMed

d'Acierno, Antonio; Facchiano, Angelo; Marabotti, Anna

2009-06-01

We describe the GALT-Prot database and its related web-based application that have been developed to collect information about the structural and functional effects of mutations on the human enzyme galactose-1-phosphate uridyltransferase (GALT) involved in the genetic disease named galactosemia type I. Besides a list of missense mutations at gene and protein sequence levels, GALT-Prot reports the analysis results of mutant GALT structures. In addition to the structural information about the wild-type enzyme, the database also includes structures of over 100 single point mutants simulated by means of a computational procedure, and the analysis to each mutant was made with several bioinformatics programs in order to investigate the effect of the mutations. The web-based interface allows querying of the database, and several links are also provided in order to guarantee a high integration with other resources already present on the web. Moreover, the architecture of the database and the web application is flexible and can be easily adapted to store data related to other proteins with point mutations. GALT-Prot is freely available at http://bioinformatica.isa.cnr.it/GALT/.
Accessing the public MIMIC-II intensive care relational database for clinical research.

PubMed

Scott, Daniel J; Lee, Joon; Silva, Ikaro; Park, Shinhyuk; Moody, George B; Celi, Leo A; Mark, Roger G

2013-01-10

The Multiparameter Intelligent Monitoring in Intensive Care II (MIMIC-II) database is a free, public resource for intensive care research. The database was officially released in 2006, and has attracted a growing number of researchers in academia and industry. We present the two major software tools that facilitate accessing the relational database: the web-based QueryBuilder and a downloadable virtual machine (VM) image. QueryBuilder and the MIMIC-II VM have been developed successfully and are freely available to MIMIC-II users. Simple example SQL queries and the resulting data are presented. Clinical studies pertaining to acute kidney injury and prediction of fluid requirements in the intensive care unit are shown as typical examples of research performed with MIMIC-II. In addition, MIMIC-II has also provided data for annual PhysioNet/Computing in Cardiology Challenges, including the 2012 Challenge "Predicting mortality of ICU Patients". QueryBuilder is a web-based tool that provides easy access to MIMIC-II. For more computationally intensive queries, one can locally install a complete copy of MIMIC-II in a VM. Both publicly available tools provide the MIMIC-II research community with convenient querying interfaces and complement the value of the MIMIC-II relational database.
Guide on Data Models in the Selection and Use of Database Management Systems. Final Report.

ERIC Educational Resources Information Center

Gallagher, Leonard J.; Draper, Jesse M.

A tutorial introduction to data models in general is provided, with particular emphasis on the relational and network models defined by the two proposed ANSI (American National Standards Institute) database language standards. Examples based on the network and relational models include specific syntax and semantics, while examples from the other…

Engineering the object-relation database model in O-Raid

NASA Technical Reports Server (NTRS)

Dewan, Prasun; Vikram, Ashish; Bhargava, Bharat

1989-01-01

Raid is a distributed database system based on the relational model. O-raid is an extension of the Raid system and will support complex data objects. The design of O-Raid is evolutionary and retains all features of relational data base systems and those of a general purpose object-oriented programming language. O-Raid has several novel properties. Objects, classes, and inheritance are supported together with a predicate-base relational query language. O-Raid objects are compatible with C++ objects and may be read and manipulated by a C++ program without any 'impedance mismatch'. Relations and columns within relations may themselves be treated as objects with associated variables and methods. Relations may contain heterogeneous objects, that is, objects of more than one class in a certain column, which can individually evolve by being reclassified. Special facilities are provided to reduce the data search in a relation containing complex objects.
Organizing, exploring, and analyzing antibody sequence data: the case for relational-database managers.

PubMed

Owens, John

2009-01-01

Technological advances in the acquisition of DNA and protein sequence information and the resulting onrush of data can quickly overwhelm the scientist unprepared for the volume of information that must be evaluated and carefully dissected to discover its significance. Few laboratories have the luxury of dedicated personnel to organize, analyze, or consistently record a mix of arriving sequence data. A methodology based on a modern relational-database manager is presented that is both a natural storage vessel for antibody sequence information and a conduit for organizing and exploring sequence data and accompanying annotation text. The expertise necessary to implement such a plan is equal to that required by electronic word processors or spreadsheet applications. Antibody sequence projects maintained as independent databases are selectively unified by the relational-database manager into larger database families that contribute to local analyses, reports, interactive HTML pages, or exported to facilities dedicated to sophisticated sequence analysis techniques. Database files are transposable among current versions of Microsoft, Macintosh, and UNIX operating systems.
A structural informatics approach to mine kinase knowledge bases.

PubMed

Brooijmans, Natasja; Mobilio, Dominick; Walker, Gary; Nilakantan, Ramaswamy; Denny, Rajiah A; Feyfant, Eric; Diller, David; Bikker, Jack; Humblet, Christine

2010-03-01

In this paper, we describe a combination of structural informatics approaches developed to mine data extracted from existing structure knowledge bases (Protein Data Bank and the GVK database) with a focus on kinase ATP-binding site data. In contrast to existing systems that retrieve and analyze protein structures, our techniques are centered on a database of ligand-bound geometries in relation to residues lining the binding site and transparent access to ligand-based SAR data. We illustrate the systems in the context of the Abelson kinase and related inhibitor structures. 2009 Elsevier Ltd. All rights reserved.
Construction of a Linux based chemical and biological information system.

PubMed

Molnár, László; Vágó, István; Fehér, András

2003-01-01

A chemical and biological information system with a Web-based easy-to-use interface and corresponding databases has been developed. The constructed system incorporates all chemical, numerical and textual data related to the chemical compounds, including numerical biological screen results. Users can search the database by traditional textual/numerical and/or substructure or similarity queries through the web interface. To build our chemical database management system, we utilized existing IT components such as ORACLE or Tripos SYBYL for database management and Zope application server for the web interface. We chose Linux as the main platform, however, almost every component can be used under various operating systems.
(BARS) -- Bibliographic Retrieval System Sandia Shock Compression (SSC) database Shock Physics Index (SPHINX) database. Volume 1: UNIX version query guide customized application for INGRES

DOE Office of Scientific and Technical Information (OSTI.GOV)

Herrmann, W.; von Laven, G.M.; Parker, T.

1993-09-01

The Bibliographic Retrieval System (BARS) is a data base management system specially designed to retrieve bibliographic references. Two databases are available, (i) the Sandia Shock Compression (SSC) database which contains over 5700 references to the literature related to stress waves in solids and their applications, and (ii) the Shock Physics Index (SPHINX) which includes over 8000 further references to stress waves in solids, material properties at intermediate and low rates, ballistic and hypervelocity impact, and explosive or shock fabrication methods. There is some overlap in the information in the two data bases.
Brain Tumor Database, a free relational database for collection and analysis of brain tumor patient information.

PubMed

Bergamino, Maurizio; Hamilton, David J; Castelletti, Lara; Barletta, Laura; Castellan, Lucio

2015-03-01

In this study, we describe the development and utilization of a relational database designed to manage the clinical and radiological data of patients with brain tumors. The Brain Tumor Database was implemented using MySQL v.5.0, while the graphical user interface was created using PHP and HTML, thus making it easily accessible through a web browser. This web-based approach allows for multiple institutions to potentially access the database. The BT Database can record brain tumor patient information (e.g. clinical features, anatomical attributes, and radiological characteristics) and be used for clinical and research purposes. Analytic tools to automatically generate statistics and different plots are provided. The BT Database is a free and powerful user-friendly tool with a wide range of possible clinical and research applications in neurology and neurosurgery. The BT Database graphical user interface source code and manual are freely available at http://tumorsdatabase.altervista.org. © The Author(s) 2013.
Study on Big Database Construction and its Application of Sample Data Collected in CHINA'S First National Geographic Conditions Census Based on Remote Sensing Images

NASA Astrophysics Data System (ADS)

Cheng, T.; Zhou, X.; Jia, Y.; Yang, G.; Bai, J.

2018-04-01

In the project of China's First National Geographic Conditions Census, millions of sample data have been collected all over the country for interpreting land cover based on remote sensing images, the quantity of data files reaches more than 12,000,000 and has grown in the following project of National Geographic Conditions Monitoring. By now, using database such as Oracle for storing the big data is the most effective method. However, applicable method is more significant for sample data's management and application. This paper studies a database construction method which is based on relational database with distributed file system. The vector data and file data are saved in different physical location. The key issues and solution method are discussed. Based on this, it studies the application method of sample data and analyzes some kinds of using cases, which could lay the foundation for sample data's application. Particularly, sample data locating in Shaanxi province are selected for verifying the method. At the same time, it takes 10 first-level classes which defined in the land cover classification system for example, and analyzes the spatial distribution and density characteristics of all kinds of sample data. The results verify that the method of database construction which is based on relational database with distributed file system is very useful and applicative for sample data's searching, analyzing and promoted application. Furthermore, sample data collected in the project of China's First National Geographic Conditions Census could be useful in the earth observation and land cover's quality assessment.
Global Distribution of Outbreaks of Water-Associated Infectious Diseases

PubMed Central

Yang, Kun; LeJeune, Jeffrey; Alsdorf, Doug; Lu, Bo; Shum, C. K.; Liang, Song

2012-01-01

Background Water plays an important role in the transmission of many infectious diseases, which pose a great burden on global public health. However, the global distribution of these water-associated infectious diseases and underlying factors remain largely unexplored. Methods and Findings Based on the Global Infectious Disease and Epidemiology Network (GIDEON), a global database including water-associated pathogens and diseases was developed. In this study, reported outbreak events associated with corresponding water-associated infectious diseases from 1991 to 2008 were extracted from the database. The location of each reported outbreak event was identified and geocoded into a GIS database. Also collected in the GIS database included geo-referenced socio-environmental information including population density (2000), annual accumulated temperature, surface water area, and average annual precipitation. Poisson models with Bayesian inference were developed to explore the association between these socio-environmental factors and distribution of the reported outbreak events. Based on model predictions a global relative risk map was generated. A total of 1,428 reported outbreak events were retrieved from the database. The analysis suggested that outbreaks of water-associated diseases are significantly correlated with socio-environmental factors. Population density is a significant risk factor for all categories of reported outbreaks of water-associated diseases; water-related diseases (e.g., vector-borne diseases) are associated with accumulated temperature; water-washed diseases (e.g., conjunctivitis) are inversely related to surface water area; both water-borne and water-related diseases are inversely related to average annual rainfall. Based on the model predictions, “hotspots” of risks for all categories of water-associated diseases were explored. Conclusions At the global scale, water-associated infectious diseases are significantly correlated with socio-environmental factors, impacting all regions which are affected disproportionately by different categories of water-associated infectious diseases. PMID:22348158
NEMiD: a web-based curated microbial diversity database with geo-based plotting.

PubMed

Bhattacharjee, Kaushik; Joshi, Santa Ram

2014-01-01

The majority of the Earth's microbes remain unknown, and that their potential utility cannot be exploited until they are discovered and characterized. They provide wide scope for the development of new strains as well as biotechnological uses. The documentation and bioprospection of microorganisms carry enormous significance considering their relevance to human welfare. This calls for an urgent need to develop a database with emphasis on the microbial diversity of the largest untapped reservoirs in the biosphere. The data annotated in the North-East India Microbial database (NEMiD) were obtained by the isolation and characterization of microbes from different parts of the Eastern Himalayan region. The database was constructed as a relational database management system (RDBMS) for data storage in MySQL in the back-end on a Linux server and implemented in an Apache/PHP environment. This database provides a base for understanding the soil microbial diversity pattern in this megabiodiversity hotspot and indicates the distribution patterns of various organisms along with identification. The NEMiD database is freely available at www.mblabnehu.info/nemid/.
NEMiD: A Web-Based Curated Microbial Diversity Database with Geo-Based Plotting

PubMed Central

Bhattacharjee, Kaushik; Joshi, Santa Ram

2014-01-01

The majority of the Earth's microbes remain unknown, and that their potential utility cannot be exploited until they are discovered and characterized. They provide wide scope for the development of new strains as well as biotechnological uses. The documentation and bioprospection of microorganisms carry enormous significance considering their relevance to human welfare. This calls for an urgent need to develop a database with emphasis on the microbial diversity of the largest untapped reservoirs in the biosphere. The data annotated in the North-East India Microbial database (NEMiD) were obtained by the isolation and characterization of microbes from different parts of the Eastern Himalayan region. The database was constructed as a relational database management system (RDBMS) for data storage in MySQL in the back-end on a Linux server and implemented in an Apache/PHP environment. This database provides a base for understanding the soil microbial diversity pattern in this megabiodiversity hotspot and indicates the distribution patterns of various organisms along with identification. The NEMiD database is freely available at www.mblabnehu.info/nemid/. PMID:24714636
PASS2: an automated database of protein alignments organised as structural superfamilies.

PubMed

Bhaduri, Anirban; Pugalenthi, Ganesan; Sowdhamini, Ramanathan

2004-04-02

The functional selection and three-dimensional structural constraints of proteins in nature often relates to the retention of significant sequence similarity between proteins of similar fold and function despite poor sequence identity. Organization of structure-based sequence alignments for distantly related proteins, provides a map of the conserved and critical regions of the protein universe that is useful for the analysis of folding principles, for the evolutionary unification of protein families and for maximizing the information return from experimental structure determination. The Protein Alignment organised as Structural Superfamily (PASS2) database represents continuously updated, structural alignments for evolutionary related, sequentially distant proteins. An automated and updated version of PASS2 is, in direct correspondence with SCOP 1.63, consisting of sequences having identity below 40% among themselves. Protein domains have been grouped into 628 multi-member superfamilies and 566 single member superfamilies. Structure-based sequence alignments for the superfamilies have been obtained using COMPARER, while initial equivalencies have been derived from a preliminary superposition using LSQMAN or STAMP 4.0. The final sequence alignments have been annotated for structural features using JOY4.0. The database is supplemented with sequence relatives belonging to different genomes, conserved spatially interacting and structural motifs, probabilistic hidden markov models of superfamilies based on the alignments and useful links to other databases. Probabilistic models and sensitive position specific profiles obtained from reliable superfamily alignments aid annotation of remote homologues and are useful tools in structural and functional genomics. PASS2 presents the phylogeny of its members both based on sequence and structural dissimilarities. Clustering of members allows us to understand diversification of the family members. The search engine has been improved for simpler browsing of the database. The database resolves alignments among the structural domains consisting of evolutionarily diverged set of sequences. Availability of reliable sequence alignments of distantly related proteins despite poor sequence identity and single-member superfamilies permit better sampling of structures in libraries for fold recognition of new sequences and for the understanding of protein structure-function relationships of individual superfamilies. PASS2 is accessible at http://www.ncbs.res.in/~faculty/mini/campass/pass2.html
Experience in running relational databases on clustered storage

NASA Astrophysics Data System (ADS)

Gaspar Aparicio, Ruben; Potocky, Miroslav

2015-12-01

For past eight years, CERN IT Database group has based its backend storage on NAS (Network-Attached Storage) architecture, providing database access via NFS (Network File System) protocol. In last two and half years, our storage has evolved from a scale-up architecture to a scale-out one. This paper describes our setup and a set of functionalities providing key features to other services like Database on Demand [1] or CERN Oracle backup and recovery service. It also outlines possible trend of evolution that, storage for databases could follow.
A Blind Reversible Robust Watermarking Scheme for Relational Databases

PubMed Central

Chang, Chin-Chen; Nguyen, Thai-Son; Lin, Chia-Chen

2013-01-01

Protecting the ownership and controlling the copies of digital data have become very important issues in Internet-based applications. Reversible watermark technology allows the distortion-free recovery of relational databases after the embedded watermark data are detected or verified. In this paper, we propose a new, blind, reversible, robust watermarking scheme that can be used to provide proof of ownership for the owner of a relational database. In the proposed scheme, a reversible data-embedding algorithm, which is referred to as “histogram shifting of adjacent pixel difference” (APD), is used to obtain reversibility. The proposed scheme can detect successfully 100% of the embedded watermark data, even if as much as 80% of the watermarked relational database is altered. Our extensive analysis and experimental results show that the proposed scheme is robust against a variety of data attacks, for example, alteration attacks, deletion attacks, mix-match attacks, and sorting attacks. PMID:24223033
A blind reversible robust watermarking scheme for relational databases.

PubMed

Chang, Chin-Chen; Nguyen, Thai-Son; Lin, Chia-Chen

2013-01-01

Protecting the ownership and controlling the copies of digital data have become very important issues in Internet-based applications. Reversible watermark technology allows the distortion-free recovery of relational databases after the embedded watermark data are detected or verified. In this paper, we propose a new, blind, reversible, robust watermarking scheme that can be used to provide proof of ownership for the owner of a relational database. In the proposed scheme, a reversible data-embedding algorithm, which is referred to as "histogram shifting of adjacent pixel difference" (APD), is used to obtain reversibility. The proposed scheme can detect successfully 100% of the embedded watermark data, even if as much as 80% of the watermarked relational database is altered. Our extensive analysis and experimental results show that the proposed scheme is robust against a variety of data attacks, for example, alteration attacks, deletion attacks, mix-match attacks, and sorting attacks.
The HISTMAG database: combining historical, archaeomagnetic and volcanic data

NASA Astrophysics Data System (ADS)

Arneitz, Patrick; Leonhardt, Roman; Schnepp, Elisabeth; Heilig, Balázs; Mayrhofer, Franziska; Kovacs, Peter; Hejda, Pavel; Valach, Fridrich; Vadasz, Gergely; Hammerl, Christa; Egli, Ramon; Fabian, Karl; Kompein, Niko

2017-09-01

Records of the past geomagnetic field can be divided into two main categories. These are instrumental historical observations on the one hand, and field estimates based on the magnetization acquired by rocks, sediments and archaeological artefacts on the other hand. In this paper, a new database combining historical, archaeomagnetic and volcanic records is presented. HISTMAG is a relational database, implemented in MySQL, and can be accessed via a web-based interface (http://www.conrad-observatory.at/zamg/index.php/data-en/histmag-database). It combines available global historical data compilations covering the last ∼500 yr as well as archaeomagnetic and volcanic data collections from the last 50 000 yr. Furthermore, new historical and archaeomagnetic records, mainly from central Europe, have been acquired. In total, 190 427 records are currently available in the HISTMAG database, whereby the majority is related to historical declination measurements (155 525). The original database structure was complemented by new fields, which allow for a detailed description of the different data types. A user-comment function provides the possibility for a scientific discussion about individual records. Therefore, HISTMAG database supports thorough reliability and uncertainty assessments of the widely different data sets, which are an essential basis for geomagnetic field reconstructions. A database analysis revealed systematic offset for declination records derived from compass roses on historical geographical maps through comparison with other historical records, while maps created for mining activities represent a reliable source.
Mashup of Geo and Space Science Data Provided via Relational Databases in the Semantic Web

NASA Astrophysics Data System (ADS)

Ritschel, B.; Seelus, C.; Neher, G.; Iyemori, T.; Koyama, Y.; Yatagai, A. I.; Murayama, Y.; King, T. A.; Hughes, J. S.; Fung, S. F.; Galkin, I. A.; Hapgood, M. A.; Belehaki, A.

2014-12-01

The use of RDBMS for the storage and management of geo and space science data and/or metadata is very common. Although the information stored in tables is based on a data model and therefore well organized and structured, a direct mashup with RDF based data stored in triple stores is not possible. One solution of the problem consists in the transformation of the whole content into RDF structures and storage in triple stores. Another interesting way is the use of a specific system/service, such as e.g. D2RQ, for the access to relational database content as virtual, read only RDF graphs. The Semantic Web based -proof of concept- GFZ ISDC uses the triple store Virtuoso for the storage of general context information/metadata to geo and space science satellite and ground station data. There is information about projects, platforms, instruments, persons, product types, etc. available but no detailed metadata about the data granuals itself. Such important information, as e.g. start or end time or the detailed spatial coverage of a single measurement is stored in RDBMS tables of the ISDC catalog system only. In order to provide a seamless access to all available information about the granuals/data products a mashup of the different data resources (triple store and RDBMS) is necessary. This paper describes the use of D2RQ for a Semantic Web/SPARQL based mashup of relational databases used for ISDC data server but also for the access to IUGONET and/or ESPAS and further geo and space science data resources. RDBMS Relational Database Management System RDF Resource Description Framework SPARQL SPARQL Protocol And RDF Query Language D2RQ Accessing Relational Databases as Virtual RDF Graphs GFZ ISDC German Research Centre for Geosciences Information System and Data Center IUGONET Inter-university Upper Atmosphere Global Observation Network (Japanese project) ESPAS Near earth space data infrastructure for e-science (European Union funded project)
Building an Integrated Environment for Multimedia

NASA Technical Reports Server (NTRS)

1997-01-01

Multimedia courseware on the solar system and earth science suitable for use in elementary, middle, and high schools was developed under this grant. The courseware runs on Silicon Graphics, Incorporated (SGI) workstations and personal computers (PCs). There is also a version of the courseware accessible via the World Wide Web. Accompanying multimedia database systems were also developed to enhance the multimedia courseware. The database systems accompanying the PC software are based on the relational model, while the database systems accompanying the SGI software are based on the object-oriented model.
In silico analysis of fragile histidine triad involved in regression of carcinoma.

PubMed

Rasheed, Muhammad Asif; Tariq, Fatima; Afzal, Sara; Mannanv, Shazia

2017-04-01

Hepatocellular carcinoma (HCCa) is a primary malignancy of the liver. Many different proteins are involved in HCCa including insulin growth factor (IGF) II , signal transducers and activators of transcription (STAT) 3, STAT4, mothers against decapentaplegic homolog 4 (SMAD 4), fragile histidine triad (FHIT) and selective internal radiation therapy (SIRT) etc. The present study is based on the bioinformatics analysis of FHIT protein in order to understand the proteomics aspect and improvement of the diagnosis of the disease based on the protein. Different information related to protein were gathered from different databases, including National Centre for Biotechnology Information (NCBI) Gene, Protein and Online Mendelian Inheritance in Man (OMIM) databases, Uniprot database, String database and Kyoto Encyclopedia of Genes and Genomes (KEGG) database. Moreover, the structure of the protein and evaluation of the quality of the structure were included from Easy modeler programme. Hence, this analysis not only helped to gather information related to the protein at one place, but also analysed the structure and quality of the protein to conclude that the protein has a role in carcinoma.
Compatibility between livestock databases used for quantitative biosecurity response in New Zealand.

PubMed

Jewell, C P; van Andel, M; Vink, W D; McFadden, A M J

2016-05-01

To characterise New Zealand's livestock biosecurity databases, and investigate their compatibility and capacity to provide a single integrated data source for quantitative outbreak analysis. Contemporary snapshots of the data in three national livestock biosecurity databases, AgriBase, FarmsOnLine (FOL) and the National Animal Identification and Tracing Scheme (NAIT), were obtained on 16 September, 1 September and 30 April 2014, respectively, and loaded into a relational database. A frequency table of animal numbers per farm was calculated for the AgriBase and FOL datasets. A two dimensional kernel density estimate was calculated for farms reporting the presence of cattle, pigs, deer, and small ruminants in each database and the ratio of farm densities for AgriBase versus FOL calculated. The extent to which records in the three databases could be matched and linked was quantified, and the level of agreement amongst them for the presence of different species on properties assessed using Cohen's kappa statistic. AgriBase contained fewer records than FOL, but recorded animal numbers present on each farm, whereas FOL contained more records, but captured only presence/absence of animals. The ratio of farm densities in AgriBase relative to FOL for pigs and deer was reasonably homogeneous across New Zealand, with AgriBase having a farm density approximately 80% of FOL. For cattle and small ruminants, there was considerable heterogeneity, with AgriBase showing a density of cattle farms in the Central Otago region that was 20% of FOL, and a density of small ruminant farms in the central West Coast area that was twice that of FOL. Only 37% of records in FOL could be linked to AgriBase, but the level of agreement for the presence of different species between these databases was substantial (kappa>0.6). Both NAIT and FOL shared common farm identifiers which could be used to georeference animal movements, and there was a fair to substantial agreement (kappa 0.32-0.69) between these databases for the presence of cattle and deer on properties. The three databases broadly agreed with each other, but important differences existed in both species composition and spatial coverage which raises concern over their accuracy. Importantly, they cannot be reliably linked together to provide a single picture of New Zealand's livestock industry, limiting the ability to use advanced quantitative techniques to provide effective decision support during disease outbreaks. We recommend that a single integrated database be developed, with alignment of resources and legislation for its upkeep.
A veterinary anatomy tutoring system.

PubMed

Theodoropoulos, G; Loumos, V; Antonopoulos, J

1994-02-14

A veterinary anatomy tutoring system was developed by using Knowledge Pro, an object-oriented software development tool with hypermedia capabilities, and MS Access, a relational database. Communication between them is facilitated by using the Structured Query Language (SQL). The architecture of the system is based on knowledge sets, each of which covers four different descriptions of an organ, namely gross anatomy (general description), gross anatomy (comparative features), histology, and embryology, which constitute the knowledge units. These knowledge units are linked with three global variables that define the animals, the topographies, and the system to which this organ belongs, creating three data-bases. These three data-bases are interrelated through the organ field in order to establish a relational model. This system allows versatility in the student's navigation through the information space by offering different modes for information location and presentation. These include course mode, review mode, reference mode, dissection mode, and comparison mode. In addition, the system provides a self-evaluation mode.

LMSD: LIPID MAPS structure database

PubMed Central

Sud, Manish; Fahy, Eoin; Cotter, Dawn; Brown, Alex; Dennis, Edward A.; Glass, Christopher K.; Merrill, Alfred H.; Murphy, Robert C.; Raetz, Christian R. H.; Russell, David W.; Subramaniam, Shankar

2007-01-01

The LIPID MAPS Structure Database (LMSD) is a relational database encompassing structures and annotations of biologically relevant lipids. Structures of lipids in the database come from four sources: (i) LIPID MAPS Consortium's core laboratories and partners; (ii) lipids identified by LIPID MAPS experiments; (iii) computationally generated structures for appropriate lipid classes; (iv) biologically relevant lipids manually curated from LIPID BANK, LIPIDAT and other public sources. All the lipid structures in LMSD are drawn in a consistent fashion. In addition to a classification-based retrieval of lipids, users can search LMSD using either text-based or structure-based search options. The text-based search implementation supports data retrieval by any combination of these data fields: LIPID MAPS ID, systematic or common name, mass, formula, category, main class, and subclass data fields. The structure-based search, in conjunction with optional data fields, provides the capability to perform a substructure search or exact match for the structure drawn by the user. Search results, in addition to structure and annotations, also include relevant links to external databases. The LMSD is publicly available at PMID:17098933
Design and implementation of a distributed large-scale spatial database system based on J2EE

NASA Astrophysics Data System (ADS)

Gong, Jianya; Chen, Nengcheng; Zhu, Xinyan; Zhang, Xia

2003-03-01

With the increasing maturity of distributed object technology, CORBA, .NET and EJB are universally used in traditional IT field. However, theories and practices of distributed spatial database need farther improvement in virtue of contradictions between large scale spatial data and limited network bandwidth or between transitory session and long transaction processing. Differences and trends among of CORBA, .NET and EJB are discussed in details, afterwards the concept, architecture and characteristic of distributed large-scale seamless spatial database system based on J2EE is provided, which contains GIS client application, web server, GIS application server and spatial data server. Moreover the design and implementation of components of GIS client application based on JavaBeans, the GIS engine based on servlet, the GIS Application server based on GIS enterprise JavaBeans(contains session bean and entity bean) are explained.Besides, the experiments of relation of spatial data and response time under different conditions are conducted, which proves that distributed spatial database system based on J2EE can be used to manage, distribute and share large scale spatial data on Internet. Lastly, a distributed large-scale seamless image database based on Internet is presented.
Database on Demand: insight how to build your own DBaaS

NASA Astrophysics Data System (ADS)

Gaspar Aparicio, Ruben; Coterillo Coz, Ignacio

2015-12-01

At CERN, a number of key database applications are running on user-managed MySQL, PostgreSQL and Oracle database services. The Database on Demand (DBoD) project was born out of an idea to provide CERN user community with an environment to develop and run database services as a complement to the central Oracle based database service. The Database on Demand empowers the user to perform certain actions that had been traditionally done by database administrators, providing an enterprise platform for database applications. It also allows the CERN user community to run different database engines, e.g. presently three major RDBMS (relational database management system) vendors are offered. In this article we show the actual status of the service after almost three years of operations, some insight of our new redesign software engineering and near future evolution.
Accessing the public MIMIC-II intensive care relational database for clinical research

PubMed Central

2013-01-01

Background The Multiparameter Intelligent Monitoring in Intensive Care II (MIMIC-II) database is a free, public resource for intensive care research. The database was officially released in 2006, and has attracted a growing number of researchers in academia and industry. We present the two major software tools that facilitate accessing the relational database: the web-based QueryBuilder and a downloadable virtual machine (VM) image. Results QueryBuilder and the MIMIC-II VM have been developed successfully and are freely available to MIMIC-II users. Simple example SQL queries and the resulting data are presented. Clinical studies pertaining to acute kidney injury and prediction of fluid requirements in the intensive care unit are shown as typical examples of research performed with MIMIC-II. In addition, MIMIC-II has also provided data for annual PhysioNet/Computing in Cardiology Challenges, including the 2012 Challenge “Predicting mortality of ICU Patients”. Conclusions QueryBuilder is a web-based tool that provides easy access to MIMIC-II. For more computationally intensive queries, one can locally install a complete copy of MIMIC-II in a VM. Both publicly available tools provide the MIMIC-II research community with convenient querying interfaces and complement the value of the MIMIC-II relational database. PMID:23302652
Retrovirus Integration Database (RID): a public database for retroviral insertion sites into host genomes.

PubMed

Shao, Wei; Shan, Jigui; Kearney, Mary F; Wu, Xiaolin; Maldarelli, Frank; Mellors, John W; Luke, Brian; Coffin, John M; Hughes, Stephen H

2016-07-04

The NCI Retrovirus Integration Database is a MySql-based relational database created for storing and retrieving comprehensive information about retroviral integration sites, primarily, but not exclusively, HIV-1. The database is accessible to the public for submission or extraction of data originating from experiments aimed at collecting information related to retroviral integration sites including: the site of integration into the host genome, the virus family and subtype, the origin of the sample, gene exons/introns associated with integration, and proviral orientation. Information about the references from which the data were collected is also stored in the database. Tools are built into the website that can be used to map the integration sites to UCSC genome browser, to plot the integration site patterns on a chromosome, and to display provirus LTRs in their inserted genome sequence. The website is robust, user friendly, and allows users to query the database and analyze the data dynamically. https://rid.ncifcrf.gov ; or http://home.ncifcrf.gov/hivdrp/resources.htm .
Scale out databases for CERN use cases

NASA Astrophysics Data System (ADS)

Baranowski, Zbigniew; Grzybek, Maciej; Canali, Luca; Lanza Garcia, Daniel; Surdy, Kacper

2015-12-01

Data generation rates are expected to grow very fast for some database workloads going into LHC run 2 and beyond. In particular this is expected for data coming from controls, logging and monitoring systems. Storing, administering and accessing big data sets in a relational database system can quickly become a very hard technical challenge, as the size of the active data set and the number of concurrent users increase. Scale-out database technologies are a rapidly developing set of solutions for deploying and managing very large data warehouses on commodity hardware and with open source software. In this paper we will describe the architecture and tests on database systems based on Hadoop and the Cloudera Impala engine. We will discuss the results of our tests, including tests of data loading and integration with existing data sources and in particular with relational databases. We will report on query performance tests done with various data sets of interest at CERN, notably data from the accelerator log database.
Ontology based heterogeneous materials database integration and semantic query

NASA Astrophysics Data System (ADS)

Zhao, Shuai; Qian, Quan

2017-10-01

Materials digital data, high throughput experiments and high throughput computations are regarded as three key pillars of materials genome initiatives. With the fast growth of materials data, the integration and sharing of data is very urgent, that has gradually become a hot topic of materials informatics. Due to the lack of semantic description, it is difficult to integrate data deeply in semantic level when adopting the conventional heterogeneous database integration approaches such as federal database or data warehouse. In this paper, a semantic integration method is proposed to create the semantic ontology by extracting the database schema semi-automatically. Other heterogeneous databases are integrated to the ontology by means of relational algebra and the rooted graph. Based on integrated ontology, semantic query can be done using SPARQL. During the experiments, two world famous First Principle Computational databases, OQMD and Materials Project are used as the integration targets, which show the availability and effectiveness of our method.
Teleform scannable data entry: an efficient method to update a community-based medical record? Community care coordination network Database Group.

PubMed Central

Guerette, P.; Robinson, B.; Moran, W. P.; Messick, C.; Wright, M.; Wofford, J.; Velez, R.

1995-01-01

Community-based multi-disciplinary care of chronically ill individuals frequently requires the efforts of several agencies and organizations. The Community Care Coordination Network (CCCN) is an effort to establish a community-based clinical database and electronic communication system to facilitate the exchange of pertinent patient data among primary care, community-based and hospital-based providers. In developing a primary care based electronic record, a method is needed to update records from the field or remote sites and agencies and yet maintain data quality. Scannable data entry with fixed fields, optical character recognition and verification was compared to traditional keyboard data entry to determine the relative efficiency of each method in updating the CCCN database. PMID:8563414
Effective spatial database support for acquiring spatial information from remote sensing images

NASA Astrophysics Data System (ADS)

Jin, Peiquan; Wan, Shouhong; Yue, Lihua

2009-12-01

In this paper, a new approach to maintain spatial information acquiring from remote-sensing images is presented, which is based on Object-Relational DBMS. According to this approach, the detected and recognized results of targets are stored and able to be further accessed in an ORDBMS-based spatial database system, and users can access the spatial information using the standard SQL interface. This approach is different from the traditional ArcSDE-based method, because the spatial information management module is totally integrated into the DBMS and becomes one of the core modules in the DBMS. We focus on three issues, namely the general framework for the ORDBMS-based spatial database system, the definitions of the add-in spatial data types and operators, and the process to develop a spatial Datablade on Informix. The results show that the ORDBMS-based spatial database support for image-based target detecting and recognition is easy and practical to be implemented.
Development of a relational database to capture and merge clinical history with the quantitative results of radionuclide renography.

PubMed

Folks, Russell D; Savir-Baruch, Bital; Garcia, Ernest V; Verdes, Liudmila; Taylor, Andrew T

2012-12-01

Our objective was to design and implement a clinical history database capable of linking to our database of quantitative results from (99m)Tc-mercaptoacetyltriglycine (MAG3) renal scans and export a data summary for physicians or our software decision support system. For database development, we used a commercial program. Additional software was developed in Interactive Data Language. MAG3 studies were processed using an in-house enhancement of a commercial program. The relational database has 3 parts: a list of all renal scans (the RENAL database), a set of patients with quantitative processing results (the Q2 database), and a subset of patients from Q2 containing clinical data manually transcribed from the hospital information system (the CLINICAL database). To test interobserver variability, a second physician transcriber reviewed 50 randomly selected patients in the hospital information system and tabulated 2 clinical data items: hydronephrosis and presence of a current stent. The CLINICAL database was developed in stages and contains 342 fields comprising demographic information, clinical history, and findings from up to 11 radiologic procedures. A scripted algorithm is used to reliably match records present in both Q2 and CLINICAL. An Interactive Data Language program then combines data from the 2 databases into an XML (extensible markup language) file for use by the decision support system. A text file is constructed and saved for review by physicians. RENAL contains 2,222 records, Q2 contains 456 records, and CLINICAL contains 152 records. The interobserver variability testing found a 95% match between the 2 observers for presence or absence of ureteral stent (κ = 0.52), a 75% match for hydronephrosis based on narrative summaries of hospitalizations and clinical visits (κ = 0.41), and a 92% match for hydronephrosis based on the imaging report (κ = 0.84). We have developed a relational database system to integrate the quantitative results of MAG3 image processing with clinical records obtained from the hospital information system. We also have developed a methodology for formatting clinical history for review by physicians and export to a decision support system. We identified several pitfalls, including the fact that important textual information extracted from the hospital information system by knowledgeable transcribers can show substantial interobserver variation, particularly when record retrieval is based on the narrative clinical records.
Experiences with the Application of Services Oriented Approaches to the Federation of Heterogeneous Geologic Data Resources

NASA Astrophysics Data System (ADS)

Cervato, C.; Fils, D.; Bohling, G.; Diver, P.; Greer, D.; Reed, J.; Tang, X.

2006-12-01

The federation of databases is not a new endeavor. Great strides have been made e.g. in the health and astrophysics communities. Reviews of those successes indicate that they have been able to leverage off key cross-community core concepts. In its simplest implementation, a federation of databases with identical base schemas that can be extended to address individual efforts, is relatively easy to accomplish. Efforts of groups like the Open Geospatial Consortium have shown methods to geospatially relate data between different sources. We present here a summary of CHRONOS's (http://www.chronos.org) experience with highly heterogeneous data. Our experience with the federation of very diverse databases shows that the wide variety of encoding options for items like locality, time scale, taxon ID, and other key parameters makes it difficult to effectively join data across them. However, the response to this is not to develop one large, monolithic database, which will suffer growth pains due to social, national, and operational issues, but rather to systematically develop the architecture that will enable cross-resource (database, repository, tool, interface) interaction. CHRONOS has accomplished the major hurdle of federating small IT database efforts with service-oriented and XML-based approaches. The application of easy-to-use procedures that allow groups of all sizes to implement and experiment with searches across various databases and to use externally created tools is vital. We are sharing with the geoinformatics community the difficulties with application frameworks, user authentication, standards compliance, and data storage encountered in setting up web sites and portals for various science initiatives (e.g., ANDRILL, EARTHTIME). The ability to incorporate CHRONOS data, services, and tools into the existing framework of a group is crucial to the development of a model that supports and extends the vitality of the small- to medium-sized research effort that is essential for a vibrant scientific community. This presentation will directly address issues of portal development related to JSR-168 and other portal API's as well as issues related to both federated and local directory-based authentication. The application of service-oriented architecture in connection with ReST-based approaches is vital to facilitate service use by experienced and less experienced information technology groups. Application of these services with XML- based schemas allows for the connection to third party tools such a GIS-based tools and software designed to perform a specific scientific analysis. The connection of all these capabilities into a combined framework based on the standard XHTML Document object model and CSS 2.0 standards used in traditional web development will be demonstrated. CHRONOS also utilizes newer client techniques such as AJAX and cross- domain scripting along with traditional server-side database, application, and web servers. The combination of the various components of this architecture creates an environment based on open and free standards that allows for the discovery, retrieval, and integration of tools and data.
The Design of Lexical Database for Indonesian Language

NASA Astrophysics Data System (ADS)

Gunawan, D.; Amalia, A.

2017-03-01

Kamus Besar Bahasa Indonesia (KBBI), an official dictionary for Indonesian language, provides lists of words with their meaning. The online version can be accessed via Internet network. Another online dictionary is Kateglo. KBBI online and Kateglo only provides an interface for human. A machine cannot retrieve data from the dictionary easily without using advanced techniques. Whereas, lexical of words is required in research or application development which related to natural language processing, text mining, information retrieval or sentiment analysis. To address this requirement, we need to build a lexical database which provides well-defined structured information about words. A well-known lexical database is WordNet, which provides the relation among words in English. This paper proposes the design of a lexical database for Indonesian language based on the combination of KBBI 4th edition, Kateglo and WordNet structure. Knowledge representation by utilizing semantic networks depict the relation among words and provide the new structure of lexical database for Indonesian language. The result of this design can be used as the foundation to build the lexical database for Indonesian language.
An Updating System for the Gridded Population Database of China Based on Remote Sensing, GIS and Spatial Database Technologies.

PubMed

Yang, Xiaohuan; Huang, Yaohuan; Dong, Pinliang; Jiang, Dong; Liu, Honghui

2009-01-01

The spatial distribution of population is closely related to land use and land cover (LULC) patterns on both regional and global scales. Population can be redistributed onto geo-referenced square grids according to this relation. In the past decades, various approaches to monitoring LULC using remote sensing and Geographic Information Systems (GIS) have been developed, which makes it possible for efficient updating of geo-referenced population data. A Spatial Population Updating System (SPUS) is developed for updating the gridded population database of China based on remote sensing, GIS and spatial database technologies, with a spatial resolution of 1 km by 1 km. The SPUS can process standard Moderate Resolution Imaging Spectroradiometer (MODIS L1B) data integrated with a Pattern Decomposition Method (PDM) and an LULC-Conversion Model to obtain patterns of land use and land cover, and provide input parameters for a Population Spatialization Model (PSM). The PSM embedded in SPUS is used for generating 1 km by 1 km gridded population data in each population distribution region based on natural and socio-economic variables. Validation results from finer township-level census data of Yishui County suggest that the gridded population database produced by the SPUS is reliable.
MIPS: analysis and annotation of proteins from whole genomes

PubMed Central

Mewes, H. W.; Amid, C.; Arnold, R.; Frishman, D.; Güldener, U.; Mannhaupt, G.; Münsterkötter, M.; Pagel, P.; Strack, N.; Stümpflen, V.; Warfsmann, J.; Ruepp, A.

2004-01-01

The Munich Information Center for Protein Sequences (MIPS-GSF), Neuherberg, Germany, provides protein sequence-related information based on whole-genome analysis. The main focus of the work is directed toward the systematic organization of sequence-related attributes as gathered by a variety of algorithms, primary information from experimental data together with information compiled from the scientific literature. MIPS maintains automatically generated and manually annotated genome-specific databases, develops systematic classification schemes for the functional annotation of protein sequences and provides tools for the comprehensive analysis of protein sequences. This report updates the information on the yeast genome (CYGD), the Neurospora crassa genome (MNCDB), the database of complete cDNAs (German Human Genome Project, NGFN), the database of mammalian protein–protein interactions (MPPI), the database of FASTA homologies (SIMAP), and the interface for the fast retrieval of protein-associated information (QUIPOS). The Arabidopsis thaliana database, the rice database, the plant EST databases (MATDB, MOsDB, SPUTNIK), as well as the databases for the comprehensive set of genomes (PEDANT genomes) are described elsewhere in the 2003 and 2004 NAR database issues, respectively. All databases described, and the detailed descriptions of our projects can be accessed through the MIPS web server (http://mips.gsf.de). PMID:14681354
MIPS: analysis and annotation of proteins from whole genomes.

PubMed

Mewes, H W; Amid, C; Arnold, R; Frishman, D; Güldener, U; Mannhaupt, G; Münsterkötter, M; Pagel, P; Strack, N; Stümpflen, V; Warfsmann, J; Ruepp, A

2004-01-01

The Munich Information Center for Protein Sequences (MIPS-GSF), Neuherberg, Germany, provides protein sequence-related information based on whole-genome analysis. The main focus of the work is directed toward the systematic organization of sequence-related attributes as gathered by a variety of algorithms, primary information from experimental data together with information compiled from the scientific literature. MIPS maintains automatically generated and manually annotated genome-specific databases, develops systematic classification schemes for the functional annotation of protein sequences and provides tools for the comprehensive analysis of protein sequences. This report updates the information on the yeast genome (CYGD), the Neurospora crassa genome (MNCDB), the database of complete cDNAs (German Human Genome Project, NGFN), the database of mammalian protein-protein interactions (MPPI), the database of FASTA homologies (SIMAP), and the interface for the fast retrieval of protein-associated information (QUIPOS). The Arabidopsis thaliana database, the rice database, the plant EST databases (MATDB, MOsDB, SPUTNIK), as well as the databases for the comprehensive set of genomes (PEDANT genomes) are described elsewhere in the 2003 and 2004 NAR database issues, respectively. All databases described, and the detailed descriptions of our projects can be accessed through the MIPS web server (http://mips.gsf.de).
FERN Ethnomedicinal Plant Database: Exploring Fern Ethnomedicinal Plants Knowledge for Computational Drug Discovery.

PubMed

Thakar, Sambhaji B; Ghorpade, Pradnya N; Kale, Manisha V; Sonawane, Kailas D

2015-01-01

Fern plants are known for their ethnomedicinal applications. Huge amount of fern medicinal plants information is scattered in the form of text. Hence, database development would be an appropriate endeavor to cope with the situation. So by looking at the importance of medicinally useful fern plants, we developed a web based database which contains information about several group of ferns, their medicinal uses, chemical constituents as well as protein/enzyme sequences isolated from different fern plants. Fern ethnomedicinal plant database is an all-embracing, content management web-based database system, used to retrieve collection of factual knowledge related to the ethnomedicinal fern species. Most of the protein/enzyme sequences have been extracted from NCBI Protein sequence database. The fern species, family name, identification, taxonomy ID from NCBI, geographical occurrence, trial for, plant parts used, ethnomedicinal importance, morphological characteristics, collected from various scientific literatures and journals available in the text form. NCBI's BLAST, InterPro, phylogeny, Clustal W web source has also been provided for the future comparative studies. So users can get information related to fern plants and their medicinal applications at one place. This Fern ethnomedicinal plant database includes information of 100 fern medicinal species. This web based database would be an advantageous to derive information specifically for computational drug discovery, botanists or botanical interested persons, pharmacologists, researchers, biochemists, plant biotechnologists, ayurvedic practitioners, doctors/pharmacists, traditional medicinal users, farmers, agricultural students and teachers from universities as well as colleges and finally fern plant lovers. This effort would be useful to provide essential knowledge for the users about the adventitious applications for drug discovery, applications, conservation of fern species around the world and finally to create social awareness.
Validated MicroRNA Target Databases: An Evaluation.

PubMed

Lee, Yun Ji Diana; Kim, Veronica; Muth, Dillon C; Witwer, Kenneth W

2015-11-01

Preclinical Research Positive findings from preclinical and clinical studies involving depletion or supplementation of microRNA (miRNA) engender optimism about miRNA-based therapeutics. However, off-target effects must be considered. Predicting these effects is complicated. Each miRNA may target many gene transcripts, and the rules governing imperfectly complementary miRNA: target interactions are incompletely understood. Several databases provide lists of the relatively small number of experimentally confirmed miRNA: target pairs. Although incomplete, this information might allow assessment of at least some of the off-target effects. We evaluated the performance of four databases of experimentally validated miRNA: target interactions (miRWalk 2.0, miRTarBase, miRecords, and TarBase 7.0) using a list of 50 alphabetically consecutive genes. We examined the provided citations to determine the degree to which each interaction was experimentally supported. To assess stability, we tested at the beginning and end of a five-month period. Results varied widely by database. Two of the databases changed significantly over the course of 5 months. Most reported evidence for miRNA: target interactions were indirect or otherwise weak, and relatively few interactions were supported by more than one publication. Some returned results appear to arise from simplistic text searches that offer no insight into the relationship of the search terms, may not even include the reported gene or miRNA, and may thus, be invalid. We conclude that validation databases provide important information, but not all information in all extant databases is up-to-date or accurate. Nevertheless, the more comprehensive validation databases may provide useful starting points for investigation of off-target effects of proposed small RNA therapies. © 2015 Wiley Periodicals, Inc.
CREDO: a structural interactomics database for drug discovery

PubMed Central

Schreyer, Adrian M.; Blundell, Tom L.

2013-01-01

CREDO is a unique relational database storing all pairwise atomic interactions of inter- as well as intra-molecular contacts between small molecules and macromolecules found in experimentally determined structures from the Protein Data Bank. These interactions are integrated with further chemical and biological data. The database implements useful data structures and algorithms such as cheminformatics routines to create a comprehensive analysis platform for drug discovery. The database can be accessed through a web-based interface, downloads of data sets and web services at http://www-cryst.bioc.cam.ac.uk/credo. Database URL: http://www-cryst.bioc.cam.ac.uk/credo PMID:23868908
Bridging international law and rights-based litigation: mapping health-related rights through the development of the Global Health and Human Rights Database.

PubMed

Meier, Benjamin Mason; Cabrera, Oscar A; Ayala, Ana; Gostin, Lawrence O

2012-06-15

The O'Neill Institute for National and Global Health Law at Georgetown University, the World Health Organization, and the Lawyers Collective have come together to develop a searchable Global Health and Human Rights Database that maps the intersection of health and human rights in judgments, international and regional instruments, and national constitutions. Where states long remained unaccountable for violations of health-related human rights, litigation has arisen as a central mechanism in an expanding movement to create rights-based accountability. Facilitated by the incorporation of international human rights standards in national law, this judicial enforcement has supported the implementation of rights-based claims, giving meaning to states' longstanding obligations to realize the highest attainable standard of health. Yet despite these advancements, there has been insufficient awareness of the international and domestic legal instruments enshrining health-related rights and little understanding of the scope and content of litigation upholding these rights. As this accountability movement evolves, the Global Health and Human Rights Database seeks to chart this burgeoning landscape of international instruments, national constitutions, and judgments for health-related rights. Employing international legal research to document and catalogue these three interconnected aspects of human rights for the public's health, the Database's categorization by human rights, health topics, and regional scope provides a comprehensive means of understanding health and human rights law. Through these categorizations, the Global Health and Human Rights Database serves as a basis for analogous legal reasoning across states to serve as precedents for future cases, for comparative legal analysis of similar health claims in different country contexts, and for empirical research to clarify the impact of human rights judgments on public health outcomes. Copyright © 2012 Meier, Nygren-Krug, Cabrera, Ayala, and Gostin.
DNA-based methods of geochemical prospecting

DOEpatents

Ashby, Matthew [Mill Valley, CA

2011-12-06

The present invention relates to methods for performing surveys of the genetic diversity of a population. The invention also relates to methods for performing genetic analyses of a population. The invention further relates to methods for the creation of databases comprising the survey information and the databases created by these methods. The invention also relates to methods for analyzing the information to correlate the presence of nucleic acid markers with desired parameters in a sample. These methods have application in the fields of geochemical exploration, agriculture, bioremediation, environmental analysis, clinical microbiology, forensic science and medicine.

SM-TF: A structural database of small molecule-transcription factor complexes.

PubMed

Xu, Xianjin; Ma, Zhiwei; Sun, Hongmin; Zou, Xiaoqin

2016-06-30

Transcription factors (TFs) are the proteins involved in the transcription process, ensuring the correct expression of specific genes. Numerous diseases arise from the dysfunction of specific TFs. In fact, over 30 TFs have been identified as therapeutic targets of about 9% of the approved drugs. In this study, we created a structural database of small molecule-transcription factor (SM-TF) complexes, available online at http://zoulab.dalton.missouri.edu/SM-TF. The 3D structures of the co-bound small molecule and the corresponding binding sites on TFs are provided in the database, serving as a valuable resource to assist structure-based drug design related to TFs. Currently, the SM-TF database contains 934 entries covering 176 TFs from a variety of species. The database is further classified into several subsets by species and organisms. The entries in the SM-TF database are linked to the UniProt database and other sequence-based TF databases. Furthermore, the druggable TFs from human and the corresponding approved drugs are linked to the DrugBank. © 2016 Wiley Periodicals, Inc. © 2016 Wiley Periodicals, Inc.
A probabilistic NF2 relational algebra for integrated information retrieval and database systems

DOE Office of Scientific and Technical Information (OSTI.GOV)

Fuhr, N.; Roelleke, T.

The integration of information retrieval (IR) and database systems requires a data model which allows for modelling documents as entities, representing uncertainty and vagueness and performing uncertain inference. For this purpose, we present a probabilistic data model based on relations in non-first-normal-form (NF2). Here, tuples are assigned probabilistic weights giving the probability that a tuple belongs to a relation. Thus, the set of weighted index terms of a document are represented as a probabilistic subrelation. In a similar way, imprecise attribute values are modelled as a set-valued attribute. We redefine the relational operators for this type of relations such thatmore » the result of each operator is again a probabilistic NF2 relation, where the weight of a tuple gives the probability that this tuple belongs to the result. By ordering the tuples according to decreasing probabilities, the model yields a ranking of answers like in most IR models. This effect also can be used for typical database queries involving imprecise attribute values as well as for combinations of database and IR queries.« less
A Bioinformatics Workflow for Variant Peptide Detection in Shotgun Proteomics*

PubMed Central

Li, Jing; Su, Zengliu; Ma, Ze-Qiang; Slebos, Robbert J. C.; Halvey, Patrick; Tabb, David L.; Liebler, Daniel C.; Pao, William; Zhang, Bing

2011-01-01

Shotgun proteomics data analysis usually relies on database search. However, commonly used protein sequence databases do not contain information on protein variants and thus prevent variant peptides and proteins from been identified. Including known coding variations into protein sequence databases could help alleviate this problem. Based on our recently published human Cancer Proteome Variation Database, we have created a protein sequence database that comprehensively annotates thousands of cancer-related coding variants collected in the Cancer Proteome Variation Database as well as noncancer-specific ones from the Single Nucleotide Polymorphism Database (dbSNP). Using this database, we then developed a data analysis workflow for variant peptide identification in shotgun proteomics. The high risk of false positive variant identifications was addressed by a modified false discovery rate estimation method. Analysis of colorectal cancer cell lines SW480, RKO, and HCT-116 revealed a total of 81 peptides that contain either noncancer-specific or cancer-related variations. Twenty-three out of 26 variants randomly selected from the 81 were confirmed by genomic sequencing. We further applied the workflow on data sets from three individual colorectal tumor specimens. A total of 204 distinct variant peptides were detected, and five carried known cancer-related mutations. Each individual showed a specific pattern of cancer-related mutations, suggesting potential use of this type of information for personalized medicine. Compatibility of the workflow has been tested with four popular database search engines including Sequest, Mascot, X!Tandem, and MyriMatch. In summary, we have developed a workflow that effectively uses existing genomic data to enable variant peptide detection in proteomics. PMID:21389108
Production and distribution of scientific and technical databases - Comparison among Japan, US and Europe

NASA Astrophysics Data System (ADS)

Onodera, Natsuo; Mizukami, Masayuki

This paper estimates several quantitative indice on production and distribution of scientific and technical databases based on various recent publications and attempts to compare the indice internationally. Raw data used for the estimation are brought mainly from the Database Directory (published by MITI) for database production and from some domestic and foreign study reports for database revenues. The ratio of the indice among Japan, US and Europe for usage of database is similar to those for general scientific and technical activities such as population and R&D expenditures. But Japanese contributions to production, revenue and over-countory distribution of databases are still lower than US and European countries. International comparison of relative database activities between public and private sectors is also discussed.
HEDS - EPA DATABASE SYSTEM FOR PUBLIC ACCESS TO HUMAN EXPOSURE DATA

EPA Science Inventory

Human Exposure Database System (HEDS) is an Internet-based system developed to provide public access to human-exposure-related data from studies conducted by EPA's National Exposure Research Laboratory (NERL). HEDS was designed to work with the EPA Office of Research and Devel...
Adding Hierarchical Objects to Relational Database General-Purpose XML-Based Information Managements

NASA Technical Reports Server (NTRS)

Lin, Shu-Chun; Knight, Chris; La, Tracy; Maluf, David; Bell, David; Tran, Khai Peter; Gawdiak, Yuri

2006-01-01

NETMARK is a flexible, high-throughput software system for managing, storing, and rapid searching of unstructured and semi-structured documents. NETMARK transforms such documents from their original highly complex, constantly changing, heterogeneous data formats into well-structured, common data formats in using Hypertext Markup Language (HTML) and/or Extensible Markup Language (XML). The software implements an object-relational database system that combines the best practices of the relational model utilizing Structured Query Language (SQL) with those of the object-oriented, semantic database model for creating complex data. In particular, NETMARK takes advantage of the Oracle 8i object-relational database model using physical-address data types for very efficient keyword searches of records across both context and content. NETMARK also supports multiple international standards such as WEBDAV for drag-and-drop file management and SOAP for integrated information management using Web services. The document-organization and -searching capabilities afforded by NETMARK are likely to make this software attractive for use in disciplines as diverse as science, auditing, and law enforcement.
A case study for a digital seabed database: Bohai Sea engineering geology database

NASA Astrophysics Data System (ADS)

Tianyun, Su; Shikui, Zhai; Baohua, Liu; Ruicai, Liang; Yanpeng, Zheng; Yong, Wang

2006-07-01

This paper discusses the designing plan of ORACLE-based Bohai Sea engineering geology database structure from requisition analysis, conceptual structure analysis, logical structure analysis, physical structure analysis and security designing. In the study, we used the object-oriented Unified Modeling Language (UML) to model the conceptual structure of the database and used the powerful function of data management which the object-oriented and relational database ORACLE provides to organize and manage the storage space and improve its security performance. By this means, the database can provide rapid and highly effective performance in data storage, maintenance and query to satisfy the application requisition of the Bohai Sea Oilfield Paradigm Area Information System.
Seasonal Forecasting of Fire Weather Based on a New Global Fire Weather Database

NASA Technical Reports Server (NTRS)

Dowdy, Andrew J.; Field, Robert D.; Spessa, Allan C.

2016-01-01

Seasonal forecasting of fire weather is examined based on a recently produced global database of the Fire Weather Index (FWI) system beginning in 1980. Seasonal average values of the FWI are examined in relation to measures of the El Nino-Southern Oscillation (ENSO) and the Indian Ocean Dipole (IOD). The results are used to examine seasonal forecasts of fire weather conditions throughout the world.
Gene annotation from scientific literature using mappings between keyword systems.

PubMed

Pérez, Antonio J; Perez-Iratxeta, Carolina; Bork, Peer; Thode, Guillermo; Andrade, Miguel A

2004-09-01

The description of genes in databases by keywords helps the non-specialist to quickly grasp the properties of a gene and increases the efficiency of computational tools that are applied to gene data (e.g. searching a gene database for sequences related to a particular biological process). However, the association of keywords to genes or protein sequences is a difficult process that ultimately implies examination of the literature related to a gene. To support this task, we present a procedure to derive keywords from the set of scientific abstracts related to a gene. Our system is based on the automated extraction of mappings between related terms from different databases using a model of fuzzy associations that can be applied with all generality to any pair of linked databases. We tested the system by annotating genes of the SWISS-PROT database with keywords derived from the abstracts linked to their entries (stored in the MEDLINE database of scientific references). The performance of the annotation procedure was much better for SWISS-PROT keywords (recall of 47%, precision of 68%) than for Gene Ontology terms (recall of 8%, precision of 67%). The algorithm can be publicly accessed and used for the annotation of sequences through a web server at http://www.bork.embl.de/kat
Initial experiences with building a health care infrastructure based on Java and object-oriented database technology.

PubMed

Dionisio, J D; Sinha, U; Dai, B; Johnson, D B; Taira, R K

1999-01-01

A multi-tiered telemedicine system based on Java and object-oriented database technology has yielded a number of practical insights and experiences on their effectiveness and suitability as implementation bases for a health care infrastructure. The advantages and drawbacks to their use, as seen within the context of the telemedicine system's development, are discussed. Overall, these technologies deliver on their early promise, with a few remaining issues that are due primarily to their relative newness.
Renal Gene Expression Database (RGED): a relational database of gene expression profiles in kidney disease

PubMed Central

Zhang, Qingzhou; Yang, Bo; Chen, Xujiao; Xu, Jing; Mei, Changlin; Mao, Zhiguo

2014-01-01

We present a bioinformatics database named Renal Gene Expression Database (RGED), which contains comprehensive gene expression data sets from renal disease research. The web-based interface of RGED allows users to query the gene expression profiles in various kidney-related samples, including renal cell lines, human kidney tissues and murine model kidneys. Researchers can explore certain gene profiles, the relationships between genes of interests and identify biomarkers or even drug targets in kidney diseases. The aim of this work is to provide a user-friendly utility for the renal disease research community to query expression profiles of genes of their own interest without the requirement of advanced computational skills. Availability and implementation: Website is implemented in PHP, R, MySQL and Nginx and freely available from http://rged.wall-eva.net. Database URL: http://rged.wall-eva.net PMID:25252782
Insect barcode information system.

PubMed

Pratheepa, Maria; Jalali, Sushil Kumar; Arokiaraj, Robinson Silvester; Venkatesan, Thiruvengadam; Nagesh, Mandadi; Panda, Madhusmita; Pattar, Sharath

2014-01-01

Insect Barcode Information System called as Insect Barcode Informática (IBIn) is an online database resource developed by the National Bureau of Agriculturally Important Insects, Bangalore. This database provides acquisition, storage, analysis and publication of DNA barcode records of agriculturally important insects, for researchers specifically in India and other countries. It bridges a gap in bioinformatics by integrating molecular, morphological and distribution details of agriculturally important insects. IBIn was developed using PHP/My SQL by using relational database management concept. This database is based on the client- server architecture, where many clients can access data simultaneously. IBIn is freely available on-line and is user-friendly. IBIn allows the registered users to input new information, search and view information related to DNA barcode of agriculturally important insects.This paper provides a current status of insect barcode in India and brief introduction about the database IBIn. http://www.nabg-nbaii.res.in/barcode.
AR Based App for Tourist Attraction in ESKİ ÇARŞI (Safranbolu)

NASA Astrophysics Data System (ADS)

Polat, Merve; Rakıp Karaş, İsmail; Kahraman, İdris; Alizadehashrafi, Behnam

2016-10-01

This research is dealing with 3D modeling of historical and heritage landmarks of Safranbolu that are registered by UNESCO. This is an Augmented Reality (AR) based project in order to trigger virtual three-dimensional (3D) models, cultural music, historical photos, artistic features and animated text information. The aim is to propose a GIS-based approach with these features and add to the system as attribute data in a relational database. The database will be available in an AR-based application to provide information for the tourists.
GrTEdb: the first web-based database of transposable elements in cotton (Gossypium raimondii).

PubMed

Xu, Zhenzhen; Liu, Jing; Ni, Wanchao; Peng, Zhen; Guo, Yue; Ye, Wuwei; Huang, Fang; Zhang, Xianggui; Xu, Peng; Guo, Qi; Shen, Xinlian; Du, Jianchang

2017-01-01

Although several diploid and tetroploid Gossypium species genomes have been sequenced, the well annotated web-based transposable elements (TEs) database is lacking. To better understand the roles of TEs in structural, functional and evolutionary dynamics of the cotton genome, a comprehensive, specific, and user-friendly web-based database, Gossypium raimondii transposable elements database (GrTEdb), was constructed. A total of 14 332 TEs were structurally annotated and clearly categorized in G. raimondii genome, and these elements have been classified into seven distinct superfamilies based on the order of protein-coding domains, structures and/or sequence similarity, including 2929 Copia-like elements, 10 368 Gypsy-like elements, 299 L1 , 12 Mutators , 435 PIF-Harbingers , 275 CACTAs and 14 Helitrons . Meanwhile, the web-based sequence browsing, searching, downloading and blast tool were implemented to help users easily and effectively to annotate the TEs or TE fragments in genomic sequences from G. raimondii and other closely related Gossypium species. GrTEdb provides resources and information related with TEs in G. raimondii , and will facilitate gene and genome analyses within or across Gossypium species, evaluating the impact of TEs on their host genomes, and investigating the potential interaction between TEs and protein-coding genes in Gossypium species. http://www.grtedb.org/. © The Author(s) 2017. Published by Oxford University Press.
Database computing in HEP

NASA Technical Reports Server (NTRS)

Day, C. T.; Loken, S.; Macfarlane, J. F.; May, E.; Lifka, D.; Lusk, E.; Price, L. E.; Baden, A.; Grossman, R.; Qin, X.

1992-01-01

The major SSC experiments are expected to produce up to 1 Petabyte of data per year each. Once the primary reconstruction is completed by farms of inexpensive processors, I/O becomes a major factor in further analysis of the data. We believe that the application of database techniques can significantly reduce the I/O performed in these analyses. We present examples of such I/O reductions in prototypes based on relational and object-oriented databases of CDF data samples.
Information Retrieval System Design Issues in a Microcomputer-Based Relational DBMS Environment.

ERIC Educational Resources Information Center

Wolfram, Dietmar

1992-01-01

Outlines the file structure requirements for a microcomputer-based information retrieval system using FoxPro, a relational database management system (DBMS). Issues relating to the design and implementation of such systems are discussed, and two possible designs are examined in terms of space economy and practicality of implementation. (15…
REDIdb: the RNA editing database.

PubMed

Picardi, Ernesto; Regina, Teresa Maria Rosaria; Brennicke, Axel; Quagliariello, Carla

2007-01-01

The RNA Editing Database (REDIdb) is an interactive, web-based database created and designed with the aim to allocate RNA editing events such as substitutions, insertions and deletions occurring in a wide range of organisms. The database contains both fully and partially sequenced DNA molecules for which editing information is available either by experimental inspection (in vitro) or by computational detection (in silico). Each record of REDIdb is organized in a specific flat-file containing a description of the main characteristics of the entry, a feature table with the editing events and related details and a sequence zone with both the genomic sequence and the corresponding edited transcript. REDIdb is a relational database in which the browsing and identification of editing sites has been simplified by means of two facilities to either graphically display genomic or cDNA sequences or to show the corresponding alignment. In both cases, all editing sites are highlighted in colour and their relative positions are detailed by mousing over. New editing positions can be directly submitted to REDIdb after a user-specific registration to obtain authorized secure access. This first version of REDIdb database stores 9964 editing events and can be freely queried at http://biologia.unical.it/py_script/search.html.
Evolution of the use of relational and NoSQL databases in the ATLAS experiment

NASA Astrophysics Data System (ADS)

Barberis, D.

2016-09-01

The ATLAS experiment used for many years a large database infrastructure based on Oracle to store several different types of non-event data: time-dependent detector configuration and conditions data, calibrations and alignments, configurations of Grid sites, catalogues for data management tools, job records for distributed workload management tools, run and event metadata. The rapid development of "NoSQL" databases (structured storage services) in the last five years allowed an extended and complementary usage of traditional relational databases and new structured storage tools in order to improve the performance of existing applications and to extend their functionalities using the possibilities offered by the modern storage systems. The trend is towards using the best tool for each kind of data, separating for example the intrinsically relational metadata from payload storage, and records that are frequently updated and benefit from transactions from archived information. Access to all components has to be orchestrated by specialised services that run on front-end machines and shield the user from the complexity of data storage infrastructure. This paper describes this technology evolution in the ATLAS database infrastructure and presents a few examples of large database applications that benefit from it.
Design and Implementation of a Three-Tiered Web-Based Inventory Ordering and Tracking System Prototype Using CORBA and Java

DTIC Science & Technology

2000-03-01

languages yet still be able to access the legacy relational databases that businesses have huge investments in. JDBC is a low-level API designed for...consider the return of investment . The system requirements, discussed in Chapter II, are the main source of input to developing the relational...1996. Inprise, Gatekeeper Guide, Inprise Corporation, 1999. Kroenke, D., Database Processing Fundementals , Design, and Implementation, Sixth Edition
SALAD database: a motif-based database of protein annotations for plant comparative genomics

PubMed Central

Mihara, Motohiro; Itoh, Takeshi; Izawa, Takeshi

2010-01-01

Proteins often have several motifs with distinct evolutionary histories. Proteins with similar motifs have similar biochemical properties and thus related biological functions. We constructed a unique comparative genomics database termed the SALAD database (http://salad.dna.affrc.go.jp/salad/) from plant-genome-based proteome data sets. We extracted evolutionarily conserved motifs by MEME software from 209 529 protein-sequence annotation groups selected by BLASTP from the proteome data sets of 10 species: rice, sorghum, Arabidopsis thaliana, grape, a lycophyte, a moss, 3 algae, and yeast. Similarity clustering of each protein group was performed by pairwise scoring of the motif patterns of the sequences. The SALAD database provides a user-friendly graphical viewer that displays a motif pattern diagram linked to the resulting bootstrapped dendrogram for each protein group. Amino-acid-sequence-based and nucleotide-sequence-based phylogenetic trees for motif combination alignment, a logo comparison diagram for each clade in the tree, and a Pfam-domain pattern diagram are also available. We also developed a viewer named ‘SALAD on ARRAYs’ to view arbitrary microarray data sets of paralogous genes linked to the same dendrogram in a window. The SALAD database is a powerful tool for comparing protein sequences and can provide valuable hints for biological analysis. PMID:19854933

SALAD database: a motif-based database of protein annotations for plant comparative genomics.

PubMed

Mihara, Motohiro; Itoh, Takeshi; Izawa, Takeshi

2010-01-01

Proteins often have several motifs with distinct evolutionary histories. Proteins with similar motifs have similar biochemical properties and thus related biological functions. We constructed a unique comparative genomics database termed the SALAD database (http://salad.dna.affrc.go.jp/salad/) from plant-genome-based proteome data sets. We extracted evolutionarily conserved motifs by MEME software from 209,529 protein-sequence annotation groups selected by BLASTP from the proteome data sets of 10 species: rice, sorghum, Arabidopsis thaliana, grape, a lycophyte, a moss, 3 algae, and yeast. Similarity clustering of each protein group was performed by pairwise scoring of the motif patterns of the sequences. The SALAD database provides a user-friendly graphical viewer that displays a motif pattern diagram linked to the resulting bootstrapped dendrogram for each protein group. Amino-acid-sequence-based and nucleotide-sequence-based phylogenetic trees for motif combination alignment, a logo comparison diagram for each clade in the tree, and a Pfam-domain pattern diagram are also available. We also developed a viewer named 'SALAD on ARRAYs' to view arbitrary microarray data sets of paralogous genes linked to the same dendrogram in a window. The SALAD database is a powerful tool for comparing protein sequences and can provide valuable hints for biological analysis.
The OAuth 2.0 Web Authorization Protocol for the Internet Addiction Bioinformatics (IABio) Database.

PubMed

Choi, Jeongseok; Kim, Jaekwon; Lee, Dong Kyun; Jang, Kwang Soo; Kim, Dai-Jin; Choi, In Young

2016-03-01

Internet addiction (IA) has become a widespread and problematic phenomenon as smart devices pervade society. Moreover, internet gaming disorder leads to increases in social expenditures for both individuals and nations alike. Although the prevention and treatment of IA are getting more important, the diagnosis of IA remains problematic. Understanding the neurobiological mechanism of behavioral addictions is essential for the development of specific and effective treatments. Although there are many databases related to other addictions, a database for IA has not been developed yet. In addition, bioinformatics databases, especially genetic databases, require a high level of security and should be designed based on medical information standards. In this respect, our study proposes the OAuth standard protocol for database access authorization. The proposed IA Bioinformatics (IABio) database system is based on internet user authentication, which is a guideline for medical information standards, and uses OAuth 2.0 for access control technology. This study designed and developed the system requirements and configuration. The OAuth 2.0 protocol is expected to establish the security of personal medical information and be applied to genomic research on IA.
Using ontology databases for scalable query answering, inconsistency detection, and data integration

PubMed Central

Dou, Dejing

2011-01-01

An ontology database is a basic relational database management system that models an ontology plus its instances. To reason over the transitive closure of instances in the subsumption hierarchy, for example, an ontology database can either unfold views at query time or propagate assertions using triggers at load time. In this paper, we use existing benchmarks to evaluate our method—using triggers—and we demonstrate that by forward computing inferences, we not only improve query time, but the improvement appears to cost only more space (not time). However, we go on to show that the true penalties were simply opaque to the benchmark, i.e., the benchmark inadequately captures load-time costs. We have applied our methods to two case studies in biomedicine, using ontologies and data from genetics and neuroscience to illustrate two important applications: first, ontology databases answer ontology-based queries effectively; second, using triggers, ontology databases detect instance-based inconsistencies—something not possible using views. Finally, we demonstrate how to extend our methods to perform data integration across multiple, distributed ontology databases. PMID:22163378
PROTICdb: a web-based application to store, track, query, and compare plant proteome data.

PubMed

Ferry-Dumazet, Hélène; Houel, Gwenn; Montalent, Pierre; Moreau, Luc; Langella, Olivier; Negroni, Luc; Vincent, Delphine; Lalanne, Céline; de Daruvar, Antoine; Plomion, Christophe; Zivy, Michel; Joets, Johann

2005-05-01

PROTICdb is a web-based application, mainly designed to store and analyze plant proteome data obtained by two-dimensional polyacrylamide gel electrophoresis (2-D PAGE) and mass spectrometry (MS). The purposes of PROTICdb are (i) to store, track, and query information related to proteomic experiments, i.e., from tissue sampling to protein identification and quantitative measurements, and (ii) to integrate information from the user's own expertise and other sources into a knowledge base, used to support data interpretation (e.g., for the determination of allelic variants or products of post-translational modifications). Data insertion into the relational database of PROTICdb is achieved either by uploading outputs of image analysis and MS identification software, or by filling web forms. 2-D PAGE annotated maps can be displayed, queried, and compared through a graphical interface. Links to external databases are also available. Quantitative data can be easily exported in a tabulated format for statistical analyses. PROTICdb is based on the Oracle or the PostgreSQL Database Management System and is freely available upon request at the following URL: http://moulon.inra.fr/ bioinfo/PROTICdb.
Sting_RDB: a relational database of structural parameters for protein analysis with support for data warehousing and data mining.

PubMed

Oliveira, S R M; Almeida, G V; Souza, K R R; Rodrigues, D N; Kuser-Falcão, P R; Yamagishi, M E B; Santos, E H; Vieira, F D; Jardine, J G; Neshich, G

2007-10-05

An effective strategy for managing protein databases is to provide mechanisms to transform raw data into consistent, accurate and reliable information. Such mechanisms will greatly reduce operational inefficiencies and improve one's ability to better handle scientific objectives and interpret the research results. To achieve this challenging goal for the STING project, we introduce Sting_RDB, a relational database of structural parameters for protein analysis with support for data warehousing and data mining. In this article, we highlight the main features of Sting_RDB and show how a user can explore it for efficient and biologically relevant queries. Considering its importance for molecular biologists, effort has been made to advance Sting_RDB toward data quality assessment. To the best of our knowledge, Sting_RDB is one of the most comprehensive data repositories for protein analysis, now also capable of providing its users with a data quality indicator. This paper differs from our previous study in many aspects. First, we introduce Sting_RDB, a relational database with mechanisms for efficient and relevant queries using SQL. Sting_rdb evolved from the earlier, text (flat file)-based database, in which data consistency and integrity was not guaranteed. Second, we provide support for data warehousing and mining. Third, the data quality indicator was introduced. Finally and probably most importantly, complex queries that could not be posed on a text-based database, are now easily implemented. Further details are accessible at the Sting_RDB demo web page: http://www.cbi.cnptia.embrapa.br/StingRDB.
Optics Toolbox: An Intelligent Relational Database System For Optical Designers

NASA Astrophysics Data System (ADS)

Weller, Scott W.; Hopkins, Robert E.

1986-12-01

Optical designers were among the first to use the computer as an engineering tool. Powerful programs have been written to do ray-trace analysis, third-order layout, and optimization. However, newer computing techniques such as database management and expert systems have not been adopted by the optical design community. For the purpose of this discussion we will define a relational database system as a database which allows the user to specify his requirements using logical relations. For example, to search for all lenses in a lens database with a F/number less than two, and a half field of view near 28 degrees, you might enter the following: FNO < 2.0 and FOV of 28 degrees ± 5% Again for the purpose of this discussion, we will define an expert system as a program which contains expert knowledge, can ask intelligent questions, and can form conclusions based on the answers given and the knowledge which it contains. Most expert systems store this knowledge in the form of rules-of-thumb, which are written in an English-like language, and which are easily modified by the user. An example rule is: IF require microscope objective in air and require NA > 0.9 THEN suggest the use of an oil immersion objective The heart of the expert system is the rule interpreter, sometimes called an inference engine, which reads the rules and forms conclusions based on them. The use of a relational database system containing lens prototypes seems to be a viable prospect. However, it is not clear that expert systems have a place in optical design. In domains such as medical diagnosis and petrology, expert systems are flourishing. These domains are quite different from optical design, however, because optical design is a creative process, and the rules are difficult to write down. We do think that an expert system is feasible in the area of first order layout, which is sufficiently diagnostic in nature to permit useful rules to be written. This first-order expert would emulate an expert designer as he interacted with a customer for the first time: asking the right questions, forming conclusions, and making suggestions. With these objectives in mind, we have developed the Optics Toolbox. Optics Toolbox is actually two programs in one: it is a powerful relational database system with twenty-one search parameters, four search modes, and multi-database support, as well as a first-order optical design expert system with a rule interpreter which has full access to the relational database. The system schematic is shown in Figure 1.
The ability of older adults to use customized online medical databases to improve their health-related knowledge.

PubMed

Freund, Ophir; Reychav, Iris; McHaney, Roger; Goland, Ella; Azuri, Joseph

2017-06-01

Patient compliance with medical advice and recommended treatment depends on perception of health condition, medical knowledge, attitude, and self-efficacy. This study investigated how use of customized online medical databases, intended to improve knowledge in a variety of relevant medical topics, influenced senior adults' perceptions. Seventy-nine older adults in residence homes completed a computerized, tablet-based questionnaire, with medical scenarios and related questions. Following an intervention, control group participants answered questions without online help while an experimental group received internet links that directed them to customized, online medical databases. Medical knowledge and test scores among the experimental group significantly improved from pre- to post-intervention (p<0.0001) and was higher in comparison with the control group (p<0.0001). No significant change occurred in the control group. Older adults improved their knowledge in desired medical topic areas using customized online medical databases. The study demonstrated how such databases help solve health-related questions among older adult population members, and that older patients appear willing to consider technology usage in information acquisition. Copyright © 2017 Elsevier B.V. All rights reserved.
ImmunemiR - A Database of Prioritized Immune miRNA Disease Associations and its Interactome.

PubMed

Prabahar, Archana; Natarajan, Jeyakumar

2017-01-01

MicroRNAs are the key regulators of gene expression and their abnormal expression in the immune system may be associated with several human diseases such as inflammation, cancer and autoimmune diseases. Elucidation of miRNA disease association through the interactome will deepen the understanding of its disease mechanisms. A specialized database for immune miRNAs is highly desirable to demonstrate the immune miRNA disease associations in the interactome. miRNAs specific to immune related diseases were retrieved from curated databases such as HMDD, miR2disease and PubMed literature based on MeSH classification of immune system diseases. The additional data such as miRNA target genes, genes coding protein-protein interaction information were compiled from related resources. Further, miRNAs were prioritized to specific immune diseases using random walk ranking algorithm. In total 245 immune miRNAs associated with 92 OMIM disease categories were identified from external databases. The resultant data were compiled as ImmunemiR, a database of prioritized immune miRNA disease associations. This database provides both text based annotation information and network visualization of its interactome. To our knowledge, ImmunemiR is the first available database to provide a comprehensive repository of human immune disease associated miRNAs with network visualization options of its target genes, protein-protein interactions (PPI) and its disease associations. It is freely available at http://www.biominingbu.org/immunemir/. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
Assessment of COPD-related outcomes via a national electronic medical record database.

PubMed

Asche, Carl; Said, Quayyim; Joish, Vijay; Hall, Charles Oaxaca; Brixner, Diana

2008-01-01

The technology and sophistication of healthcare utilization databases have expanded over the last decade to include results of lab tests, vital signs, and other clinical information. This review provides an assessment of the methodological and analytical challenges of conducting chronic obstructive pulmonary disease (COPD) outcomes research in a national electronic medical records (EMR) dataset and its potential application towards the assessment of national health policy issues, as well as a description of the challenges or limitations. An EMR database and its application to measuring outcomes for COPD are described. The ability to measure adherence to the COPD evidence-based practice guidelines, generated by the NIH and HEDIS quality indicators, in this database was examined. Case studies, before and after their publication, were used to assess the adherence to guidelines and gauge the conformity to quality indicators. EMR was the only source of information for pulmonary function tests, but low frequency in ordering by primary care was an issue. The EMR data can be used to explore impact of variation in healthcare provision on clinical outcomes. The EMR database permits access to specific lab data and biometric information. The richness and depth of information on "real world" use of health services for large population-based analytical studies at relatively low cost render such databases an attractive resource for outcomes research. Various sources of information exist to perform outcomes research. It is important to understand the desired endpoints of such research and choose the appropriate database source.
Improving data management and dissemination in web based information systems by semantic enrichment of descriptive data aspects

NASA Astrophysics Data System (ADS)

Gebhardt, Steffen; Wehrmann, Thilo; Klinger, Verena; Schettler, Ingo; Huth, Juliane; Künzer, Claudia; Dech, Stefan

2010-10-01

The German-Vietnamese water-related information system for the Mekong Delta (WISDOM) project supports business processes in Integrated Water Resources Management in Vietnam. Multiple disciplines bring together earth and ground based observation themes, such as environmental monitoring, water management, demographics, economy, information technology, and infrastructural systems. This paper introduces the components of the web-based WISDOM system including data, logic and presentation tier. It focuses on the data models upon which the database management system is built, including techniques for tagging or linking metadata with the stored information. The model also uses ordered groupings of spatial, thematic and temporal reference objects to semantically tag datasets to enable fast data retrieval, such as finding all data in a specific administrative unit belonging to a specific theme. A spatial database extension is employed by the PostgreSQL database. This object-oriented database was chosen over a relational database to tag spatial objects to tabular data, improving the retrieval of census and observational data at regional, provincial, and local areas. While the spatial database hinders processing raster data, a "work-around" was built into WISDOM to permit efficient management of both raster and vector data. The data model also incorporates styling aspects of the spatial datasets through styled layer descriptions (SLD) and web mapping service (WMS) layer specifications, allowing retrieval of rendered maps. Metadata elements of the spatial data are based on the ISO19115 standard. XML structured information of the SLD and metadata are stored in an XML database. The data models and the data management system are robust for managing the large quantity of spatial objects, sensor observations, census and document data. The operational WISDOM information system prototype contains modules for data management, automatic data integration, and web services for data retrieval, analysis, and distribution. The graphical user interfaces facilitate metadata cataloguing, data warehousing, web sensor data analysis and thematic mapping.
Content-based video indexing and searching with wavelet transformation

NASA Astrophysics Data System (ADS)

Stumpf, Florian; Al-Jawad, Naseer; Du, Hongbo; Jassim, Sabah

2006-05-01

Biometric databases form an essential tool in the fight against international terrorism, organised crime and fraud. Various government and law enforcement agencies have their own biometric databases consisting of combination of fingerprints, Iris codes, face images/videos and speech records for an increasing number of persons. In many cases personal data linked to biometric records are incomplete and/or inaccurate. Besides, biometric data in different databases for the same individual may be recorded with different personal details. Following the recent terrorist atrocities, law enforcing agencies collaborate more than before and have greater reliance on database sharing. In such an environment, reliable biometric-based identification must not only determine who you are but also who else you are. In this paper we propose a compact content-based video signature and indexing scheme that can facilitate retrieval of multiple records in face biometric databases that belong to the same person even if their associated personal data are inconsistent. We shall assess the performance of our system using a benchmark audio visual face biometric database that has multiple videos for each subject but with different identity claims. We shall demonstrate that retrieval of relatively small number of videos that are nearest, in terms of the proposed index, to any video in the database results in significant proportion of that individual biometric data.
Using semantic data modeling techniques to organize an object-oriented database for extending the mass storage model

NASA Technical Reports Server (NTRS)

Campbell, William J.; Short, Nicholas M., Jr.; Roelofs, Larry H.; Dorfman, Erik

1991-01-01

A methodology for optimizing organization of data obtained by NASA earth and space missions is discussed. The methodology uses a concept based on semantic data modeling techniques implemented in a hierarchical storage model. The modeling is used to organize objects in mass storage devices, relational database systems, and object-oriented databases. The semantic data modeling at the metadata record level is examined, including the simulation of a knowledge base and semantic metadata storage issues. The semantic data model hierarchy and its application for efficient data storage is addressed, as is the mapping of the application structure to the mass storage.
Improved orthologous databases to ease protozoan targets inference.

PubMed

Kotowski, Nelson; Jardim, Rodrigo; Dávila, Alberto M R

2015-09-29

Homology inference helps on identifying similarities, as well as differences among organisms, which provides a better insight on how closely related one might be to another. In addition, comparative genomics pipelines are widely adopted tools designed using different bioinformatics applications and algorithms. In this article, we propose a methodology to build improved orthologous databases with the potential to aid on protozoan target identification, one of the many tasks which benefit from comparative genomics tools. Our analyses are based on OrthoSearch, a comparative genomics pipeline originally designed to infer orthologs through protein-profile comparison, supported by an HMM, reciprocal best hits based approach. Our methodology allows OrthoSearch to confront two orthologous databases and to generate an improved new one. Such can be later used to infer potential protozoan targets through a similarity analysis against the human genome. The protein sequences of Cryptosporidium hominis, Entamoeba histolytica and Leishmania infantum genomes were comparatively analyzed against three orthologous databases: (i) EggNOG KOG, (ii) ProtozoaDB and (iii) Kegg Orthology (KO). That allowed us to create two new orthologous databases, "KO + EggNOG KOG" and "KO + EggNOG KOG + ProtozoaDB", with 16,938 and 27,701 orthologous groups, respectively. Such new orthologous databases were used for a regular OrthoSearch run. By confronting "KO + EggNOG KOG" and "KO + EggNOG KOG + ProtozoaDB" databases and protozoan species we were able to detect the following total of orthologous groups and coverage (relation between the inferred orthologous groups and the species total number of proteins): Cryptosporidium hominis: 1,821 (11 %) and 3,254 (12 %); Entamoeba histolytica: 2,245 (13 %) and 5,305 (19 %); Leishmania infantum: 2,702 (16 %) and 4,760 (17 %). Using our HMM-based methodology and the largest created orthologous database, it was possible to infer 13 orthologous groups which represent potential protozoan targets; these were found because of our distant homology approach. We also provide the number of species-specific, pair-to-pair and core groups from such analyses, depicted in Venn diagrams. The orthologous databases generated by our HMM-based methodology provide a broader dataset, with larger amounts of orthologous groups when compared to the original databases used as input. Those may be used for several homology inference analyses, annotation tasks and protozoan targets identification.
Development of the method and U.S. normalization database for Life Cycle Impact Assessment and sustainability metrics.

PubMed

Bare, Jane; Gloria, Thomas; Norris, Gregory

2006-08-15

Normalization is an optional step within Life Cycle Impact Assessment (LCIA) that may be used to assist in the interpretation of life cycle inventory data as well as life cycle impact assessment results. Normalization transforms the magnitude of LCI and LCIA results into relative contribution by substance and life cycle impact category. Normalization thus can significantly influence LCA-based decisions when tradeoffs exist. The U. S. Environmental Protection Agency (EPA) has developed a normalization database based on the spatial scale of the 48 continental U.S. states, Hawaii, Alaska, the District of Columbia, and Puerto Rico with a one-year reference time frame. Data within the normalization database were compiled based on the impact methodologies and lists of stressors used in TRACI-the EPA's Tool for the Reduction and Assessment of Chemical and other environmental Impacts. The new normalization database published within this article may be used for LCIA case studies within the United States, and can be used to assist in the further development of a global normalization database. The underlying data analyzed for the development of this database are included to allow the development of normalization data consistent with other impact assessment methodologies as well.
XML technology planning database : lessons learned

NASA Technical Reports Server (NTRS)

Some, Raphael R.; Neff, Jon M.

2005-01-01

A hierarchical Extensible Markup Language(XML) database called XCALIBR (XML Analysis LIBRary) has been developed by Millennium Program to assist in technology investment (ROI) analysis and technology Language Capability the New return on portfolio optimization. The database contains mission requirements and technology capabilities, which are related by use of an XML dictionary. The XML dictionary codifies a standardized taxonomy for space missions, systems, subsystems and technologies. In addition to being used for ROI analysis, the database is being examined for use in project planning, tracking and documentation. During the past year, the database has moved from development into alpha testing. This paper describes the lessons learned during construction and testing of the prototype database and the motivation for moving from an XML taxonomy to a standard XML-based ontology.
Designing a Relational Database for the Basic School; Schools Command Web Enabled Officer and Enlisted Database (Sword)

DTIC Science & Technology

2002-06-01

Student memo for personnel MCLLS . . . . . . . . . . . . . . 75 i. Migrate data to SQL Server...The Web Server is on the same server as the SWORD database in the current version. 4: results set 5: dynamic HTML page 6: dynamic HTML page 3: SQL ...still be supported by Access. SQL Server would be a more viable tool for a fully developed application based on the number of potential users and
Computer-Aided Clinical Trial Recruitment Based on Domain-Specific Language Translation: A Case Study of Retinopathy of Prematurity

PubMed Central

2017-01-01

Reusing the data from healthcare information systems can effectively facilitate clinical trials (CTs). How to select candidate patients eligible for CT recruitment criteria is a central task. Related work either depends on DBA (database administrator) to convert the recruitment criteria to native SQL queries or involves the data mapping between a standard ontology/information model and individual data source schema. This paper proposes an alternative computer-aided CT recruitment paradigm, based on syntax translation between different DSLs (domain-specific languages). In this paradigm, the CT recruitment criteria are first formally represented as production rules. The referenced rule variables are all from the underlying database schema. Then the production rule is translated to an intermediate query-oriented DSL (e.g., LINQ). Finally, the intermediate DSL is directly mapped to native database queries (e.g., SQL) automated by ORM (object-relational mapping). PMID:29065644
Computer-Aided Clinical Trial Recruitment Based on Domain-Specific Language Translation: A Case Study of Retinopathy of Prematurity.

PubMed

Zhang, Yinsheng; Zhang, Guoming; Shang, Qian

2017-01-01

Reusing the data from healthcare information systems can effectively facilitate clinical trials (CTs). How to select candidate patients eligible for CT recruitment criteria is a central task. Related work either depends on DBA (database administrator) to convert the recruitment criteria to native SQL queries or involves the data mapping between a standard ontology/information model and individual data source schema. This paper proposes an alternative computer-aided CT recruitment paradigm, based on syntax translation between different DSLs (domain-specific languages). In this paradigm, the CT recruitment criteria are first formally represented as production rules. The referenced rule variables are all from the underlying database schema. Then the production rule is translated to an intermediate query-oriented DSL (e.g., LINQ). Finally, the intermediate DSL is directly mapped to native database queries (e.g., SQL) automated by ORM (object-relational mapping).
CALINVASIVES: a revolutionary tool to monitor invasive threats

Treesearch

M. Garbelotto; S. Drill; C. Powell; J. Malpas

2017-01-01

CALinvasives is a web-based relational database and content management system (CMS) cataloging the statewide distribution of invasive pathogens and pests and the plant hosts they impact. The database has been developed as a collaboration between the Forest Pathology and Mycology Laboratory at UC Berkeley and Calflora. CALinvasives will combine information on the...
ProXL (Protein Cross-Linking Database): A Platform for Analysis, Visualization, and Sharing of Protein Cross-Linking Mass Spectrometry Data

PubMed Central

2016-01-01

ProXL is a Web application and accompanying database designed for sharing, visualizing, and analyzing bottom-up protein cross-linking mass spectrometry data with an emphasis on structural analysis and quality control. ProXL is designed to be independent of any particular software pipeline. The import process is simplified by the use of the ProXL XML data format, which shields developers of data importers from the relative complexity of the relational database schema. The database and Web interfaces function equally well for any software pipeline and allow data from disparate pipelines to be merged and contrasted. ProXL includes robust public and private data sharing capabilities, including a project-based interface designed to ensure security and facilitate collaboration among multiple researchers. ProXL provides multiple interactive and highly dynamic data visualizations that facilitate structural-based analysis of the observed cross-links as well as quality control. ProXL is open-source, well-documented, and freely available at https://github.com/yeastrc/proxl-web-app. PMID:27302480

ProXL (Protein Cross-Linking Database): A Platform for Analysis, Visualization, and Sharing of Protein Cross-Linking Mass Spectrometry Data.

PubMed

Riffle, Michael; Jaschob, Daniel; Zelter, Alex; Davis, Trisha N

2016-08-05

ProXL is a Web application and accompanying database designed for sharing, visualizing, and analyzing bottom-up protein cross-linking mass spectrometry data with an emphasis on structural analysis and quality control. ProXL is designed to be independent of any particular software pipeline. The import process is simplified by the use of the ProXL XML data format, which shields developers of data importers from the relative complexity of the relational database schema. The database and Web interfaces function equally well for any software pipeline and allow data from disparate pipelines to be merged and contrasted. ProXL includes robust public and private data sharing capabilities, including a project-based interface designed to ensure security and facilitate collaboration among multiple researchers. ProXL provides multiple interactive and highly dynamic data visualizations that facilitate structural-based analysis of the observed cross-links as well as quality control. ProXL is open-source, well-documented, and freely available at https://github.com/yeastrc/proxl-web-app .
Use of Graph Database for the Integration of Heterogeneous Biological Data.

PubMed

Yoon, Byoung-Ha; Kim, Seon-Kyu; Kim, Seon-Young

2017-03-01

Understanding complex relationships among heterogeneous biological data is one of the fundamental goals in biology. In most cases, diverse biological data are stored in relational databases, such as MySQL and Oracle, which store data in multiple tables and then infer relationships by multiple-join statements. Recently, a new type of database, called the graph-based database, was developed to natively represent various kinds of complex relationships, and it is widely used among computer science communities and IT industries. Here, we demonstrate the feasibility of using a graph-based database for complex biological relationships by comparing the performance between MySQL and Neo4j, one of the most widely used graph databases. We collected various biological data (protein-protein interaction, drug-target, gene-disease, etc.) from several existing sources, removed duplicate and redundant data, and finally constructed a graph database containing 114,550 nodes and 82,674,321 relationships. When we tested the query execution performance of MySQL versus Neo4j, we found that Neo4j outperformed MySQL in all cases. While Neo4j exhibited a very fast response for various queries, MySQL exhibited latent or unfinished responses for complex queries with multiple-join statements. These results show that using graph-based databases, such as Neo4j, is an efficient way to store complex biological relationships. Moreover, querying a graph database in diverse ways has the potential to reveal novel relationships among heterogeneous biological data.
Use of Graph Database for the Integration of Heterogeneous Biological Data

PubMed Central

Yoon, Byoung-Ha; Kim, Seon-Kyu

2017-01-01

Understanding complex relationships among heterogeneous biological data is one of the fundamental goals in biology. In most cases, diverse biological data are stored in relational databases, such as MySQL and Oracle, which store data in multiple tables and then infer relationships by multiple-join statements. Recently, a new type of database, called the graph-based database, was developed to natively represent various kinds of complex relationships, and it is widely used among computer science communities and IT industries. Here, we demonstrate the feasibility of using a graph-based database for complex biological relationships by comparing the performance between MySQL and Neo4j, one of the most widely used graph databases. We collected various biological data (protein-protein interaction, drug-target, gene-disease, etc.) from several existing sources, removed duplicate and redundant data, and finally constructed a graph database containing 114,550 nodes and 82,674,321 relationships. When we tested the query execution performance of MySQL versus Neo4j, we found that Neo4j outperformed MySQL in all cases. While Neo4j exhibited a very fast response for various queries, MySQL exhibited latent or unfinished responses for complex queries with multiple-join statements. These results show that using graph-based databases, such as Neo4j, is an efficient way to store complex biological relationships. Moreover, querying a graph database in diverse ways has the potential to reveal novel relationships among heterogeneous biological data. PMID:28416946
A functional-dependencies-based Bayesian networks learning method and its application in a mobile commerce system.

PubMed

Liao, Stephen Shaoyi; Wang, Huai Qing; Li, Qiu Dan; Liu, Wei Yi

2006-06-01

This paper presents a new method for learning Bayesian networks from functional dependencies (FD) and third normal form (3NF) tables in relational databases. The method sets up a linkage between the theory of relational databases and probabilistic reasoning models, which is interesting and useful especially when data are incomplete and inaccurate. The effectiveness and practicability of the proposed method is demonstrated by its implementation in a mobile commerce system.
A comprehensive clinical research database based on CDISC ODM and i2b2.

PubMed

Meineke, Frank A; Stäubert, Sebastian; Löbe, Matthias; Winter, Alfred

2014-01-01

We present a working approach for a clinical research database as part of an archival information system. The CDISC ODM standard is target for clinical study and research relevant routine data, thus decoupling the data ingest process from the access layer. The presented research database is comprehensive as it covers annotating, mapping and curation of poorly annotated source data. Besides a conventional relational database the medical data warehouse i2b2 serves as main frontend for end-users. The system we developed is suitable to support patient recruitment, cohort identification and quality assurance in daily routine.
GenColors-based comparative genome databases for small eukaryotic genomes.

PubMed

Felder, Marius; Romualdi, Alessandro; Petzold, Andreas; Platzer, Matthias; Sühnel, Jürgen; Glöckner, Gernot

2013-01-01

Many sequence data repositories can give a quick and easily accessible overview on genomes and their annotations. Less widespread is the possibility to compare related genomes with each other in a common database environment. We have previously described the GenColors database system (http://gencolors.fli-leibniz.de) and its applications to a number of bacterial genomes such as Borrelia, Legionella, Leptospira and Treponema. This system has an emphasis on genome comparison. It combines data from related genomes and provides the user with an extensive set of visualization and analysis tools. Eukaryote genomes are normally larger than prokaryote genomes and thus pose additional challenges for such a system. We have, therefore, adapted GenColors to also handle larger datasets of small eukaryotic genomes and to display eukaryotic gene structures. Further recent developments include whole genome views, genome list options and, for bacterial genome browsers, the display of horizontal gene transfer predictions. Two new GenColors-based databases for two fungal species (http://fgb.fli-leibniz.de) and for four social amoebas (http://sacgb.fli-leibniz.de) were set up. Both new resources open up a single entry point for related genomes for the amoebozoa and fungal research communities and other interested users. Comparative genomics approaches are greatly facilitated by these resources.
Artemis and ACT: viewing, annotating and comparing sequences stored in a relational database.

PubMed

Carver, Tim; Berriman, Matthew; Tivey, Adrian; Patel, Chinmay; Böhme, Ulrike; Barrell, Barclay G; Parkhill, Julian; Rajandream, Marie-Adèle

2008-12-01

Artemis and Artemis Comparison Tool (ACT) have become mainstream tools for viewing and annotating sequence data, particularly for microbial genomes. Since its first release, Artemis has been continuously developed and supported with additional functionality for editing and analysing sequences based on feedback from an active user community of laboratory biologists and professional annotators. Nevertheless, its utility has been somewhat restricted by its limitation to reading and writing from flat files. Therefore, a new version of Artemis has been developed, which reads from and writes to a relational database schema, and allows users to annotate more complex, often large and fragmented, genome sequences. Artemis and ACT have now been extended to read and write directly to the Generic Model Organism Database (GMOD, http://www.gmod.org) Chado relational database schema. In addition, a Gene Builder tool has been developed to provide structured forms and tables to edit coordinates of gene models and edit functional annotation, based on standard ontologies, controlled vocabularies and free text. Artemis and ACT are freely available (under a GPL licence) for download (for MacOSX, UNIX and Windows) at the Wellcome Trust Sanger Institute web sites: http://www.sanger.ac.uk/Software/Artemis/ http://www.sanger.ac.uk/Software/ACT/
Integration of Oracle and Hadoop: Hybrid Databases Affordable at Scale

NASA Astrophysics Data System (ADS)

Canali, L.; Baranowski, Z.; Kothuri, P.

2017-10-01

This work reports on the activities aimed at integrating Oracle and Hadoop technologies for the use cases of CERN database services and in particular on the development of solutions for offloading data and queries from Oracle databases into Hadoop-based systems. The goal and interest of this investigation is to increase the scalability and optimize the cost/performance footprint for some of our largest Oracle databases. These concepts have been applied, among others, to build offline copies of CERN accelerator controls and logging databases. The tested solution allows to run reports on the controls data offloaded in Hadoop without affecting the critical production database, providing both performance benefits and cost reduction for the underlying infrastructure. Other use cases discussed include building hybrid database solutions with Oracle and Hadoop, offering the combined advantages of a mature relational database system with a scalable analytics engine.
Environmental modeling and recognition for an autonomous land vehicle

NASA Technical Reports Server (NTRS)

Lawton, D. T.; Levitt, T. S.; Mcconnell, C. C.; Nelson, P. C.

1987-01-01

An architecture for object modeling and recognition for an autonomous land vehicle is presented. Examples of objects of interest include terrain features, fields, roads, horizon features, trees, etc. The architecture is organized around a set of data bases for generic object models and perceptual structures, temporary memory for the instantiation of object and relational hypotheses, and a long term memory for storing stable hypotheses that are affixed to the terrain representation. Multiple inference processes operate over these databases. Researchers describe these particular components: the perceptual structure database, the grouping processes that operate over this, schemas, and the long term terrain database. A processing example that matches predictions from the long term terrain model to imagery, extracts significant perceptual structures for consideration as potential landmarks, and extracts a relational structure to update the long term terrain database is given.
MODBASE, a database of annotated comparative protein structure models

PubMed Central

Pieper, Ursula; Eswar, Narayanan; Stuart, Ashley C.; Ilyin, Valentin A.; Sali, Andrej

2002-01-01

MODBASE (http://guitar.rockefeller.edu/modbase) is a relational database of annotated comparative protein structure models for all available protein sequences matched to at least one known protein structure. The models are calculated by MODPIPE, an automated modeling pipeline that relies on PSI-BLAST, IMPALA and MODELLER. MODBASE uses the MySQL relational database management system for flexible and efficient querying, and the MODVIEW Netscape plugin for viewing and manipulating multiple sequences and structures. It is updated regularly to reflect the growth of the protein sequence and structure databases, as well as improvements in the software for calculating the models. For ease of access, MODBASE is organized into different datasets. The largest dataset contains models for domains in 304 517 out of 539 171 unique protein sequences in the complete TrEMBL database (23 March 2001); only models based on significant alignments (PSI-BLAST E-value < 10–4) and models assessed to have the correct fold are included. Other datasets include models for target selection and structure-based annotation by the New York Structural Genomics Research Consortium, models for prediction of genes in the Drosophila melanogaster genome, models for structure determination of several ribosomal particles and models calculated by the MODWEB comparative modeling web server. PMID:11752309
An architecture for integrating distributed and cooperating knowledge-based Air Force decision aids

NASA Technical Reports Server (NTRS)

Nugent, Richard O.; Tucker, Richard W.

1988-01-01

MITRE has been developing a Knowledge-Based Battle Management Testbed for evaluating the viability of integrating independently-developed knowledge-based decision aids in the Air Force tactical domain. The primary goal for the testbed architecture is to permit a new system to be added to a testbed with little change to the system's software. Each system that connects to the testbed network declares that it can provide a number of services to other systems. When a system wants to use another system's service, it does not address the server system by name, but instead transmits a request to the testbed network asking for a particular service to be performed. A key component of the testbed architecture is a common database which uses a relational database management system (RDBMS). The RDBMS provides a database update notification service to requesting systems. Normally, each system is expected to monitor data relations of interest to it. Alternatively, a system may broadcast an announcement message to inform other systems that an event of potential interest has occurred. Current research is aimed at dealing with issues resulting from integration efforts, such as dealing with potential mismatches of each system's assumptions about the common database, decentralizing network control, and coordinating multiple agents.
Open Clients for Distributed Databases

NASA Astrophysics Data System (ADS)

Chayes, D. N.; Arko, R. A.

2001-12-01

We are actively developing a collection of open source example clients that demonstrate use of our "back end" data management infrastructure. The data management system is reported elsewhere at this meeting (Arko and Chayes: A Scaleable Database Infrastructure). In addition to their primary goal of being examples for others to build upon, some of these clients may have limited utility in them selves. More information about the clients and the data infrastructure is available on line at http://data.ldeo.columbia.edu. The available examples to be demonstrated include several web-based clients including those developed for the Community Review System of the Digital Library for Earth System Education, a real-time watch standers log book, an offline interface to use log book entries, a simple client to search on multibeam metadata and others are Internet enabled and generally web-based front ends that support searches against one or more relational databases using industry standard SQL queries. In addition to the web based clients, simple SQL searches from within Excel and similar applications will be demonstrated. By defining, documenting and publishing a clear interface to the fully searchable databases, it becomes relatively easy to construct client interfaces that are optimized for specific applications in comparison to building a monolithic data and user interface system.
Object-oriented structures supporting remote sensing databases

NASA Technical Reports Server (NTRS)

Wichmann, Keith; Cromp, Robert F.

1995-01-01

Object-oriented databases show promise for modeling the complex interrelationships pervasive in scientific domains. To examine the utility of this approach, we have developed an Intelligent Information Fusion System based on this technology, and applied it to the problem of managing an active repository of remotely-sensed satellite scenes. The design and implementation of the system is compared and contrasted with conventional relational database techniques, followed by a presentation of the underlying object-oriented data structures used to enable fast indexing into the data holdings.
Spatial Data Integration Using Ontology-Based Approach

NASA Astrophysics Data System (ADS)

Hasani, S.; Sadeghi-Niaraki, A.; Jelokhani-Niaraki, M.

2015-12-01

In today's world, the necessity for spatial data for various organizations is becoming so crucial that many of these organizations have begun to produce spatial data for that purpose. In some circumstances, the need to obtain real time integrated data requires sustainable mechanism to process real-time integration. Case in point, the disater management situations that requires obtaining real time data from various sources of information. One of the problematic challenges in the mentioned situation is the high degree of heterogeneity between different organizations data. To solve this issue, we introduce an ontology-based method to provide sharing and integration capabilities for the existing databases. In addition to resolving semantic heterogeneity, better access to information is also provided by our proposed method. Our approach is consisted of three steps, the first step is identification of the object in a relational database, then the semantic relationships between them are modelled and subsequently, the ontology of each database is created. In a second step, the relative ontology will be inserted into the database and the relationship of each class of ontology will be inserted into the new created column in database tables. Last step is consisted of a platform based on service-oriented architecture, which allows integration of data. This is done by using the concept of ontology mapping. The proposed approach, in addition to being fast and low cost, makes the process of data integration easy and the data remains unchanged and thus takes advantage of the legacy application provided.
"Mr. Database" : Jim Gray and the History of Database Technologies.

PubMed

Hanwahr, Nils C

2017-12-01

Although the widespread use of the term "Big Data" is comparatively recent, it invokes a phenomenon in the developments of database technology with distinct historical contexts. The database engineer Jim Gray, known as "Mr. Database" in Silicon Valley before his disappearance at sea in 2007, was involved in many of the crucial developments since the 1970s that constitute the foundation of exceedingly large and distributed databases. Jim Gray was involved in the development of relational database systems based on the concepts of Edgar F. Codd at IBM in the 1970s before he went on to develop principles of Transaction Processing that enable the parallel and highly distributed performance of databases today. He was also involved in creating forums for discourse between academia and industry, which influenced industry performance standards as well as database research agendas. As a co-founder of the San Francisco branch of Microsoft Research, Gray increasingly turned toward scientific applications of database technologies, e. g. leading the TerraServer project, an online database of satellite images. Inspired by Vannevar Bush's idea of the memex, Gray laid out his vision of a Personal Memex as well as a World Memex, eventually postulating a new era of data-based scientific discovery termed "Fourth Paradigm Science". This article gives an overview of Gray's contributions to the development of database technology as well as his research agendas and shows that central notions of Big Data have been occupying database engineers for much longer than the actual term has been in use.
Belgian health-related data in three international databases

PubMed Central

2011-01-01

Aims of the study This study wants to examine the availability of Belgian healthcare data in the three main international health databases: the World Health Organization European Health for All Database (WHO-HFA), the Organisation for Economic Co-operation and Development Health Data 2009 and EUROSTAT. Methods For the indicators present in the three databases, the availability of Belgian data and the source of these data were checked. Main findings The most important problem concerning the availability of Belgian health-related data in the three major international databases is the lack of recent data. Recent data are available for 27% of the indicators of the WHO-HFA database, 73% of the OECD Health Data, and for half of the Eurostat indicators. Especially recent data about health status (including mortality-based indicators) are lacking. Discussion Only the availability of the health-related data is studied in this article. The quality of the Belgian data is however also important to examine. The main problem concerning the availability of health data is the timeliness. One of the causes of this lack of (especially mortality) data is the reform of the Belgian State. Nowadays mortality data are provided by the communities. This results in a delay in the delivery of national mortality data. However several efforts are made to catch up. PMID:22958554
The Spanish National Reference Database for Ionizing Radiations (BANDRRI)

PubMed

Los Arcos JM; Bailador; Gonzalez; Gonzalez; Gorostiza; Ortiz; Sanchez; Shaw; Williart

2000-03-01

The Spanish National Reference Database for Ionizing Radiations (BANDRRI) is being implemented by a reasearch team in the frame of a joint project between CIEMAT (Unidad de Metrologia de Radiaciones Ionizantes and Direccion de Informatica) and the Universidad Nacional de Educacion a Distancia (UNED, Departamento de Mecanica y Departamento de Fisica de Materiales). This paper presents the main objectives of BANDRRI, its dynamic and relational data base structure, interactive Web accessibility and its main radionuclide-related contents at this moment.
FlavonoidSearch: A system for comprehensive flavonoid annotation by mass spectrometry.

PubMed

Akimoto, Nayumi; Ara, Takeshi; Nakajima, Daisuke; Suda, Kunihiro; Ikeda, Chiaki; Takahashi, Shingo; Muneto, Reiko; Yamada, Manabu; Suzuki, Hideyuki; Shibata, Daisuke; Sakurai, Nozomu

2017-04-28

Currently, in mass spectrometry-based metabolomics, limited reference mass spectra are available for flavonoid identification. In the present study, a database of probable mass fragments for 6,867 known flavonoids (FsDatabase) was manually constructed based on new structure- and fragmentation-related rules using new heuristics to overcome flavonoid complexity. We developed the FlavonoidSearch system for flavonoid annotation, which consists of the FsDatabase and a computational tool (FsTool) to automatically search the FsDatabase using the mass spectra of metabolite peaks as queries. This system showed the highest identification accuracy for the flavonoid aglycone when compared to existing tools and revealed accurate discrimination between the flavonoid aglycone and other compounds. Sixteen new flavonoids were found from parsley, and the diversity of the flavonoid aglycone among different fruits and vegetables was investigated.
A pilot GIS database of active faults of Mt. Etna (Sicily): A tool for integrated hazard evaluation

NASA Astrophysics Data System (ADS)

Barreca, Giovanni; Bonforte, Alessandro; Neri, Marco

2013-02-01

A pilot GIS-based system has been implemented for the assessment and analysis of hazard related to active faults affecting the eastern and southern flanks of Mt. Etna. The system structure was developed in ArcGis® environment and consists of different thematic datasets that include spatially-referenced arc-features and associated database. Arc-type features, georeferenced into WGS84 Ellipsoid UTM zone 33 Projection, represent the five main fault systems that develop in the analysed region. The backbone of the GIS-based system is constituted by the large amount of information which was collected from the literature and then stored and properly geocoded in a digital database. This consists of thirty five alpha-numeric fields which include all fault parameters available from literature such us location, kinematics, landform, slip rate, etc. Although the system has been implemented according to the most common procedures used by GIS developer, the architecture and content of the database represent a pilot backbone for digital storing of fault parameters, providing a powerful tool in modelling hazard related to the active tectonics of Mt. Etna. The database collects, organises and shares all scientific currently available information about the active faults of the volcano. Furthermore, thanks to the strong effort spent on defining the fields of the database, the structure proposed in this paper is open to the collection of further data coming from future improvements in the knowledge of the fault systems. By layering additional user-specific geographic information and managing the proposed database (topological querying) a great diversity of hazard and vulnerability maps can be produced by the user. This is a proposal of a backbone for a comprehensive geographical database of fault systems, universally applicable to other sites.
Comparison of the Frontier Distributed Database Caching System to NoSQL Databases

NASA Astrophysics Data System (ADS)

Dykstra, Dave

2012-12-01

One of the main attractions of non-relational “NoSQL” databases is their ability to scale to large numbers of readers, including readers spread over a wide area. The Frontier distributed database caching system, used in production by the Large Hadron Collider CMS and ATLAS detector projects for Conditions data, is based on traditional SQL databases but also adds high scalability and the ability to be distributed over a wide-area for an important subset of applications. This paper compares the major characteristics of the two different approaches and identifies the criteria for choosing which approach to prefer over the other. It also compares in some detail the NoSQL databases used by CMS and ATLAS: MongoDB, CouchDB, HBase, and Cassandra.

Comparison of the Frontier Distributed Database Caching System to NoSQL Databases

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dykstra, Dave

One of the main attractions of non-relational NoSQL databases is their ability to scale to large numbers of readers, including readers spread over a wide area. The Frontier distributed database caching system, used in production by the Large Hadron Collider CMS and ATLAS detector projects for Conditions data, is based on traditional SQL databases but also adds high scalability and the ability to be distributed over a wide-area for an important subset of applications. This paper compares the major characteristics of the two different approaches and identifies the criteria for choosing which approach to prefer over the other. It alsomore » compares in some detail the NoSQL databases used by CMS and ATLAS: MongoDB, CouchDB, HBase, and Cassandra.« less
Metabolonote: A Wiki-Based Database for Managing Hierarchical Metadata of Metabolome Analyses

PubMed Central

Ara, Takeshi; Enomoto, Mitsuo; Arita, Masanori; Ikeda, Chiaki; Kera, Kota; Yamada, Manabu; Nishioka, Takaaki; Ikeda, Tasuku; Nihei, Yoshito; Shibata, Daisuke; Kanaya, Shigehiko; Sakurai, Nozomu

2015-01-01

Metabolomics – technology for comprehensive detection of small molecules in an organism – lags behind the other “omics” in terms of publication and dissemination of experimental data. Among the reasons for this are difficulty precisely recording information about complicated analytical experiments (metadata), existence of various databases with their own metadata descriptions, and low reusability of the published data, resulting in submitters (the researchers who generate the data) being insufficiently motivated. To tackle these issues, we developed Metabolonote, a Semantic MediaWiki-based database designed specifically for managing metabolomic metadata. We also defined a metadata and data description format, called “Togo Metabolome Data” (TogoMD), with an ID system that is required for unique access to each level of the tree-structured metadata such as study purpose, sample, analytical method, and data analysis. Separation of the management of metadata from that of data and permission to attach related information to the metadata provide advantages for submitters, readers, and database developers. The metadata are enriched with information such as links to comparable data, thereby functioning as a hub of related data resources. They also enhance not only readers’ understanding and use of data but also submitters’ motivation to publish the data. The metadata are computationally shared among other systems via APIs, which facilitate the construction of novel databases by database developers. A permission system that allows publication of immature metadata and feedback from readers also helps submitters to improve their metadata. Hence, this aspect of Metabolonote, as a metadata preparation tool, is complementary to high-quality and persistent data repositories such as MetaboLights. A total of 808 metadata for analyzed data obtained from 35 biological species are published currently. Metabolonote and related tools are available free of cost at http://metabolonote.kazusa.or.jp/. PMID:25905099
Metabolonote: a wiki-based database for managing hierarchical metadata of metabolome analyses.

PubMed

Ara, Takeshi; Enomoto, Mitsuo; Arita, Masanori; Ikeda, Chiaki; Kera, Kota; Yamada, Manabu; Nishioka, Takaaki; Ikeda, Tasuku; Nihei, Yoshito; Shibata, Daisuke; Kanaya, Shigehiko; Sakurai, Nozomu

2015-01-01

Metabolomics - technology for comprehensive detection of small molecules in an organism - lags behind the other "omics" in terms of publication and dissemination of experimental data. Among the reasons for this are difficulty precisely recording information about complicated analytical experiments (metadata), existence of various databases with their own metadata descriptions, and low reusability of the published data, resulting in submitters (the researchers who generate the data) being insufficiently motivated. To tackle these issues, we developed Metabolonote, a Semantic MediaWiki-based database designed specifically for managing metabolomic metadata. We also defined a metadata and data description format, called "Togo Metabolome Data" (TogoMD), with an ID system that is required for unique access to each level of the tree-structured metadata such as study purpose, sample, analytical method, and data analysis. Separation of the management of metadata from that of data and permission to attach related information to the metadata provide advantages for submitters, readers, and database developers. The metadata are enriched with information such as links to comparable data, thereby functioning as a hub of related data resources. They also enhance not only readers' understanding and use of data but also submitters' motivation to publish the data. The metadata are computationally shared among other systems via APIs, which facilitate the construction of novel databases by database developers. A permission system that allows publication of immature metadata and feedback from readers also helps submitters to improve their metadata. Hence, this aspect of Metabolonote, as a metadata preparation tool, is complementary to high-quality and persistent data repositories such as MetaboLights. A total of 808 metadata for analyzed data obtained from 35 biological species are published currently. Metabolonote and related tools are available free of cost at http://metabolonote.kazusa.or.jp/.
Graph Databases for Large-Scale Healthcare Systems: A Framework for Efficient Data Management and Data Services

DOE Office of Scientific and Technical Information (OSTI.GOV)

Park, Yubin; Shankar, Mallikarjun; Park, Byung H.

Designing a database system for both efficient data management and data services has been one of the enduring challenges in the healthcare domain. In many healthcare systems, data services and data management are often viewed as two orthogonal tasks; data services refer to retrieval and analytic queries such as search, joins, statistical data extraction, and simple data mining algorithms, while data management refers to building error-tolerant and non-redundant database systems. The gap between service and management has resulted in rigid database systems and schemas that do not support effective analytics. We compose a rich graph structure from an abstracted healthcaremore » RDBMS to illustrate how we can fill this gap in practice. We show how a healthcare graph can be automatically constructed from a normalized relational database using the proposed 3NF Equivalent Graph (3EG) transformation.We discuss a set of real world graph queries such as finding self-referrals, shared providers, and collaborative filtering, and evaluate their performance over a relational database and its 3EG-transformed graph. Experimental results show that the graph representation serves as multiple de-normalized tables, thus reducing complexity in a database and enhancing data accessibility of users. Based on this finding, we propose an ensemble framework of databases for healthcare applications.« less
Knowledge acquisition to qualify Unified Medical Language System interconceptual relationships.

PubMed Central

Le Duff, F.; Burgun, A.; Cleret, M.; Pouliquen, B.; Barac'h, V.; Le Beux, P.

2000-01-01

Adding automatically relations between concepts from a database to a knowledge base such as the Unified Medical Language System can be very useful to increase the consistency of the latter one. But the transfer of qualified relationships is more interesting. The most important interest of these new acquisitions is that the UMLS became more compliant and medically pertinent to be used in different medical applications. This paper describes the possibility to inherit automatically medical inter-conceptual relationships qualifiers from a disease description included into a database and to integrate them into the UMLS knowledge base. The paper focuses on the transmission of knowledge from a French medical database to an English one. PMID:11079930
Fragment virtual screening based on Bayesian categorization for discovering novel VEGFR-2 scaffolds.

PubMed

Zhang, Yanmin; Jiao, Yu; Xiong, Xiao; Liu, Haichun; Ran, Ting; Xu, Jinxing; Lu, Shuai; Xu, Anyang; Pan, Jing; Qiao, Xin; Shi, Zhihao; Lu, Tao; Chen, Yadong

2015-11-01

The discovery of novel scaffolds against a specific target has long been one of the most significant but challengeable goals in discovering lead compounds. A scaffold that binds in important regions of the active pocket is more favorable as a starting point because scaffolds generally possess greater optimization possibilities. However, due to the lack of sufficient chemical space diversity of the databases and the ineffectiveness of the screening methods, it still remains a great challenge to discover novel active scaffolds. Since the strengths and weaknesses of both fragment-based drug design and traditional virtual screening (VS), we proposed a fragment VS concept based on Bayesian categorization for the discovery of novel scaffolds. This work investigated the proposal through an application on VEGFR-2 target. Firstly, scaffold and structural diversity of chemical space for 10 compound databases were explicitly evaluated. Simultaneously, a robust Bayesian classification model was constructed for screening not only compound databases but also their corresponding fragment databases. Although analysis of the scaffold diversity demonstrated a very unevenly distribution of scaffolds over molecules, results showed that our Bayesian model behaved better in screening fragments than molecules. Through a literature retrospective research, several generated fragments with relatively high Bayesian scores indeed exhibit VEGFR-2 biological activity, which strongly proved the effectiveness of fragment VS based on Bayesian categorization models. This investigation of Bayesian-based fragment VS can further emphasize the necessity for enrichment of compound databases employed in lead discovery by amplifying the diversity of databases with novel structures.
The development of a prototype intelligent user interface subsystem for NASA's scientific database systems

NASA Technical Reports Server (NTRS)

Campbell, William J.; Roelofs, Larry H.; Short, Nicholas M., Jr.

1987-01-01

The National Space Science Data Center (NSSDC) has initiated an Intelligent Data Management (IDM) research effort which has as one of its components the development of an Intelligent User Interface (IUI).The intent of the latter is to develop a friendly and intelligent user interface service that is based on expert systems and natural language processing technologies. The purpose is to support the large number of potential scientific and engineering users presently having need of space and land related research and technical data but who have little or no experience in query languages or understanding of the information content or architecture of the databases involved. This technical memorandum presents prototype Intelligent User Interface Subsystem (IUIS) using the Crustal Dynamics Project Database as a test bed for the implementation of the CRUDDES (Crustal Dynamics Expert System). The knowledge base has more than 200 rules and represents a single application view and the architectural view. Operational performance using CRUDDES has allowed nondatabase users to obtain useful information from the database previously accessible only to an expert database user or the database designer.
PS1-41: Just Add Data: Implementing an Event-Based Data Model for Clinical Trial Tracking

PubMed Central

Fuller, Sharon; Carrell, David; Pardee, Roy

2012-01-01

Background/Aims Clinical research trials often have similar fundamental tracking needs, despite being quite variable in their specific logic and activities. A model tracking database that can be quickly adapted by a variety of studies has the potential to achieve significant efficiencies in database development and maintenance. Methods Over the course of several different clinical trials, we have developed a database model that is highly adaptable to a variety of projects. Rather than hard-coding each specific event that might occur in a trial, along with its logical consequences, this model considers each event and its parameters to be a data record in its own right. Each event may have related variables (metadata) describing its prerequisites, subsequent events due, associated mailings, or events that it overrides. The metadata for each event is stored in the same record with the event name. When changes are made to the study protocol, no structural changes to the database are needed. One has only to add or edit events and their metadata. Changes in the event metadata automatically determine any related logic changes. In addition to streamlining application code, this model simplifies communication between the programmer and other team members. Database requirements can be phrased as changes to the underlying data, rather than to the application code. The project team can review a single report of events and metadata and easily see where changes might be needed. In addition to benefitting from streamlined code, the front end database application can also implement useful standard features such as automated mail merges and to do lists. Results The event-based data model has proven itself to be robust, adaptable and user-friendly in a variety of study contexts. We have chosen to implement it as a SQL Server back end and distributed Access front end. Interested readers may request a copy of the Access front end and scripts for creating the back end database. Discussion An event-based database with a consistent, robust set of features has the potential to significantly reduce development time and maintenance expense for clinical trial tracking databases.
Cadastral Database Positional Accuracy Improvement

NASA Astrophysics Data System (ADS)

Hashim, N. M.; Omar, A. H.; Ramli, S. N. M.; Omar, K. M.; Din, N.

2017-10-01

Positional Accuracy Improvement (PAI) is the refining process of the geometry feature in a geospatial dataset to improve its actual position. This actual position relates to the absolute position in specific coordinate system and the relation to the neighborhood features. With the growth of spatial based technology especially Geographical Information System (GIS) and Global Navigation Satellite System (GNSS), the PAI campaign is inevitable especially to the legacy cadastral database. Integration of legacy dataset and higher accuracy dataset like GNSS observation is a potential solution for improving the legacy dataset. However, by merely integrating both datasets will lead to a distortion of the relative geometry. The improved dataset should be further treated to minimize inherent errors and fitting to the new accurate dataset. The main focus of this study is to describe a method of angular based Least Square Adjustment (LSA) for PAI process of legacy dataset. The existing high accuracy dataset known as National Digital Cadastral Database (NDCDB) is then used as bench mark to validate the results. It was found that the propose technique is highly possible for positional accuracy improvement of legacy spatial datasets.
Background qualitative analysis of the European Reference Life Cycle Database (ELCD) energy datasets - part I: fuel datasets.

PubMed

Garraín, Daniel; Fazio, Simone; de la Rúa, Cristina; Recchioni, Marco; Lechón, Yolanda; Mathieux, Fabrice

2015-01-01

The aim of this study is to identify areas of potential improvement of the European Reference Life Cycle Database (ELCD) fuel datasets. The revision is based on the data quality indicators described by the ILCD Handbook, applied on sectorial basis. These indicators evaluate the technological, geographical and time-related representativeness of the dataset and the appropriateness in terms of completeness, precision and methodology. Results show that ELCD fuel datasets have a very good quality in general terms, nevertheless some findings and recommendations in order to improve the quality of Life-Cycle Inventories have been derived. Moreover, these results ensure the quality of the fuel-related datasets to any LCA practitioner, and provide insights related to the limitations and assumptions underlying in the datasets modelling. Giving this information, the LCA practitioner will be able to decide whether the use of the ELCD fuel datasets is appropriate based on the goal and scope of the analysis to be conducted. The methodological approach would be also useful for dataset developers and reviewers, in order to improve the overall DQR of databases.
A Quality-Control-Oriented Database for a Mesoscale Meteorological Observation Network

NASA Astrophysics Data System (ADS)

Lussana, C.; Ranci, M.; Uboldi, F.

2012-04-01

In the operational context of a local weather service, data accessibility and quality related issues must be managed by taking into account a wide set of user needs. This work describes the structure and the operational choices made for the operational implementation of a database system storing data from highly automated observing stations, metadata and information on data quality. Lombardy's environmental protection agency, ARPA Lombardia, manages a highly automated mesoscale meteorological network. A Quality Assurance System (QAS) ensures that reliable observational information is collected and disseminated to the users. The weather unit in ARPA Lombardia, at the same time an important QAS component and an intensive data user, has developed a database specifically aimed to: 1) providing quick access to data for operational activities and 2) ensuring data quality for real-time applications, by means of an Automatic Data Quality Control (ADQC) procedure. Quantities stored in the archive include hourly aggregated observations of: precipitation amount, temperature, wind, relative humidity, pressure, global and net solar radiation. The ADQC performs several independent tests on raw data and compares their results in a decision-making procedure. An important ADQC component is the Spatial Consistency Test based on Optimal Interpolation. Interpolated and Cross-Validation analysis values are also stored in the database, providing further information to human operators and useful estimates in case of missing data. The technical solution adopted is based on a LAMP (Linux, Apache, MySQL and Php) system, constituting an open source environment suitable for both development and operational practice. The ADQC procedure itself is performed by R scripts directly interacting with the MySQL database. Users and network managers can access the database by using a set of web-based Php applications.
NCBI GEO: mining millions of expression profiles--database and tools.

PubMed

Barrett, Tanya; Suzek, Tugba O; Troup, Dennis B; Wilhite, Stephen E; Ngau, Wing-Chi; Ledoux, Pierre; Rudnev, Dmitry; Lash, Alex E; Fujibuchi, Wataru; Edgar, Ron

2005-01-01

The Gene Expression Omnibus (GEO) at the National Center for Biotechnology Information (NCBI) is the largest fully public repository for high-throughput molecular abundance data, primarily gene expression data. The database has a flexible and open design that allows the submission, storage and retrieval of many data types. These data include microarray-based experiments measuring the abundance of mRNA, genomic DNA and protein molecules, as well as non-array-based technologies such as serial analysis of gene expression (SAGE) and mass spectrometry proteomic technology. GEO currently holds over 30,000 submissions representing approximately half a billion individual molecular abundance measurements, for over 100 organisms. Here, we describe recent database developments that facilitate effective mining and visualization of these data. Features are provided to examine data from both experiment- and gene-centric perspectives using user-friendly Web-based interfaces accessible to those without computational or microarray-related analytical expertise. The GEO database is publicly accessible through the World Wide Web at http://www.ncbi.nlm.nih.gov/geo.
Mining databases for protein aggregation: a review.

PubMed

Tsiolaki, Paraskevi L; Nastou, Katerina C; Hamodrakas, Stavros J; Iconomidou, Vassiliki A

2017-09-01

Protein aggregation is an active area of research in recent decades, since it is the most common and troubling indication of protein instability. Understanding the mechanisms governing protein aggregation and amyloidogenesis is a key component to the aetiology and pathogenesis of many devastating disorders, including Alzheimer's disease or type 2 diabetes. Protein aggregation data are currently found "scattered" in an increasing number of repositories, since advances in computational biology greatly influence this field of research. This review exploits the various resources of aggregation data and attempts to distinguish and analyze the biological knowledge they contain, by introducing protein-based, fragment-based and disease-based repositories, related to aggregation. In order to gain a broad overview of the available repositories, a novel comprehensive network maps and visualizes the current association between aggregation databases and other important databases and/or tools and discusses the beneficial role of community annotation. The need for unification of aggregation databases in a common platform is also addressed.
Supplier Management System

NASA Technical Reports Server (NTRS)

Ramirez, Eric; Gutheinz, Sandy; Brison, James; Ho, Anita; Allen, James; Ceritelli, Olga; Tobar, Claudia; Nguyen, Thuykien; Crenshaw, Harrel; Santos, Roxann

2008-01-01

Supplier Management System (SMS) allows for a consistent, agency-wide performance rating system for suppliers used by NASA. This version (2.0) combines separate databases into one central database that allows for the sharing of supplier data. Information extracted from the NBS/Oracle database can be used to generate ratings. Also, supplier ratings can now be generated in the areas of cost, product quality, delivery, and audit data. Supplier data can be charted based on real-time user input. Based on these individual ratings, an overall rating can be generated. Data that normally would be stored in multiple databases, each requiring its own log-in, is now readily available and easily accessible with only one log-in required. Additionally, the database can accommodate the storage and display of quality-related data that can be analyzed and used in the supplier procurement decision-making process. Moreover, the software allows for a Closed-Loop System (supplier feedback), as well as the capability to communicate with other federal agencies.
Ontology to relational database transformation for web application development and maintenance

NASA Astrophysics Data System (ADS)

Mahmudi, Kamal; Inggriani Liem, M. M.; Akbar, Saiful

2018-03-01

Ontology is used as knowledge representation while database is used as facts recorder in a KMS (Knowledge Management System). In most applications, data are managed in a database system and updated through the application and then they are transformed to knowledge as needed. Once a domain conceptor defines the knowledge in the ontology, application and database can be generated from the ontology. Most existing frameworks generate application from its database. In this research, ontology is used for generating the application. As the data are updated through the application, a mechanism is designed to trigger an update to the ontology so that the application can be rebuilt based on the newest ontology. By this approach, a knowledge engineer has a full flexibility to renew the application based on the latest ontology without dependency to a software developer. In many cases, the concept needs to be updated when the data changed. The framework is built and tested in a spring java environment. A case study was conducted to proof the concepts.
Cloud-Based NoSQL Open Database of Pulmonary Nodules for Computer-Aided Lung Cancer Diagnosis and Reproducible Research.

PubMed

Ferreira Junior, José Raniery; Oliveira, Marcelo Costa; de Azevedo-Marques, Paulo Mazzoncini

2016-12-01

Lung cancer is the leading cause of cancer-related deaths in the world, and its main manifestation is pulmonary nodules. Detection and classification of pulmonary nodules are challenging tasks that must be done by qualified specialists, but image interpretation errors make those tasks difficult. In order to aid radiologists on those hard tasks, it is important to integrate the computer-based tools with the lesion detection, pathology diagnosis, and image interpretation processes. However, computer-aided diagnosis research faces the problem of not having enough shared medical reference data for the development, testing, and evaluation of computational methods for diagnosis. In order to minimize this problem, this paper presents a public nonrelational document-oriented cloud-based database of pulmonary nodules characterized by 3D texture attributes, identified by experienced radiologists and classified in nine different subjective characteristics by the same specialists. Our goal with the development of this database is to improve computer-aided lung cancer diagnosis and pulmonary nodule detection and classification research through the deployment of this database in a cloud Database as a Service framework. Pulmonary nodule data was provided by the Lung Image Database Consortium and Image Database Resource Initiative (LIDC-IDRI), image descriptors were acquired by a volumetric texture analysis, and database schema was developed using a document-oriented Not only Structured Query Language (NoSQL) approach. The proposed database is now with 379 exams, 838 nodules, and 8237 images, 4029 of them are CT scans and 4208 manually segmented nodules, and it is allocated in a MongoDB instance on a cloud infrastructure.
Global search tool for the Advanced Photon Source Integrated Relational Model of Installed Systems (IRMIS) database.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Quock, D. E. R.; Cianciarulo, M. B.; APS Engineering Support Division

2007-01-01

The Integrated Relational Model of Installed Systems (IRMIS) is a relational database tool that has been implemented at the Advanced Photon Source to maintain an updated account of approximately 600 control system software applications, 400,000 process variables, and 30,000 control system hardware components. To effectively display this large amount of control system information to operators and engineers, IRMIS was initially built with nine Web-based viewers: Applications Organizing Index, IOC, PLC, Component Type, Installed Components, Network, Controls Spares, Process Variables, and Cables. However, since each viewer is designed to provide details from only one major category of the control system, themore » necessity for a one-stop global search tool for the entire database became apparent. The user requirements for extremely fast database search time and ease of navigation through search results led to the choice of Asynchronous JavaScript and XML (AJAX) technology in the implementation of the IRMIS global search tool. Unique features of the global search tool include a two-tier level of displayed search results, and a database data integrity validation and reporting mechanism.« less
An Evidence-Based Systematic Review on Cognitive Interventions for Individuals with Dementia

ERIC Educational Resources Information Center

Hopper, Tammy; Bourgeois, Michelle; Pimentel, Jane; Qualls, Constance Dean; Hickey, Ellen; Frymark, Tobi; Schooling, Tracy

2013-01-01

Purpose: To evaluate the current state of research evidence related to cognitive interventions for individuals with Alzheimer's disease or related dementias. Method: A systematic search of the literature was conducted across 27 electronic databases based on a set of a priori questions, inclusion/exclusion criteria, and search parameters. Studies…
A Foot-Mounted Inertial Measurement Unit (IMU) Positioning Algorithm Based on Magnetic Constraint

PubMed Central

Zou, Jiaheng

2018-01-01

With the development of related applications, indoor positioning techniques have been more and more widely developed. Based on Wi-Fi, Bluetooth low energy (BLE) and geomagnetism, indoor positioning techniques often rely on the physical location of fingerprint information. The focus and difficulty of establishing the fingerprint database are in obtaining a relatively accurate physical location with as little given information as possible. This paper presents a foot-mounted inertial measurement unit (IMU) positioning algorithm under the loop closure constraint based on magnetic information. It can provide relatively reliable position information without maps and geomagnetic information and provides a relatively accurate coordinate for the collection of a fingerprint database. In the experiment, the features extracted by the multi-level Fourier transform method proposed in this paper are validated and the validity of loop closure matching is tested with a RANSAC-based method. Moreover, the loop closure detection results show that the cumulative error of the trajectory processed by the graph optimization algorithm is significantly suppressed, presenting a good accuracy. The average error of the trajectory under loop closure constraint is controlled below 2.15 m. PMID:29494542
A Foot-Mounted Inertial Measurement Unit (IMU) Positioning Algorithm Based on Magnetic Constraint.

PubMed

Wang, Yan; Li, Xin; Zou, Jiaheng

2018-03-01

With the development of related applications, indoor positioning techniques have been more and more widely developed. Based on Wi-Fi, Bluetooth low energy (BLE) and geomagnetism, indoor positioning techniques often rely on the physical location of fingerprint information. The focus and difficulty of establishing the fingerprint database are in obtaining a relatively accurate physical location with as little given information as possible. This paper presents a foot-mounted inertial measurement unit (IMU) positioning algorithm under the loop closure constraint based on magnetic information. It can provide relatively reliable position information without maps and geomagnetic information and provides a relatively accurate coordinate for the collection of a fingerprint database. In the experiment, the features extracted by the multi-level Fourier transform method proposed in this paper are validated and the validity of loop closure matching is tested with a RANSAC-based method. Moreover, the loop closure detection results show that the cumulative error of the trajectory processed by the graph optimization algorithm is significantly suppressed, presenting a good accuracy. The average error of the trajectory under loop closure constraint is controlled below 2.15 m.

Re-thinking organisms: The impact of databases on model organism biology.

PubMed

Leonelli, Sabina; Ankeny, Rachel A

2012-03-01

Community databases have become crucial to the collection, ordering and retrieval of data gathered on model organisms, as well as to the ways in which these data are interpreted and used across a range of research contexts. This paper analyses the impact of community databases on research practices in model organism biology by focusing on the history and current use of four community databases: FlyBase, Mouse Genome Informatics, WormBase and The Arabidopsis Information Resource. We discuss the standards used by the curators of these databases for what counts as reliable evidence, acceptable terminology, appropriate experimental set-ups and adequate materials (e.g., specimens). On the one hand, these choices are informed by the collaborative research ethos characterising most model organism communities. On the other hand, the deployment of these standards in databases reinforces this ethos and gives it concrete and precise instantiations by shaping the skills, practices, values and background knowledge required of the database users. We conclude that the increasing reliance on community databases as vehicles to circulate data is having a major impact on how researchers conduct and communicate their research, which affects how they understand the biology of model organisms and its relation to the biology of other species. Copyright © 2011 Elsevier Ltd. All rights reserved.
The PROTICdb database for 2-DE proteomics.

PubMed

Langella, Olivier; Zivy, Michel; Joets, Johann

2007-01-01

PROTICdb is a web-based database mainly designed to store and analyze plant proteome data obtained by 2D polyacrylamide gel electrophoresis (2D PAGE) and mass spectrometry (MS). The goals of PROTICdb are (1) to store, track, and query information related to proteomic experiments, i.e., from tissue sampling to protein identification and quantitative measurements; and (2) to integrate information from the user's own expertise and other sources into a knowledge base, used to support data interpretation (e.g., for the determination of allelic variants or products of posttranslational modifications). Data insertion into the relational database of PROTICdb is achieved either by uploading outputs from Mélanie, PDQuest, IM2d, ImageMaster(tm) 2D Platinum v5.0, Progenesis, Sequest, MS-Fit, and Mascot software, or by filling in web forms (experimental design and methods). 2D PAGE-annotated maps can be displayed, queried, and compared through the GelBrowser. Quantitative data can be easily exported in a tabulated format for statistical analyses with any third-party software. PROTICdb is based on the Oracle or the PostgreSQLDataBase Management System (DBMS) and is freely available upon request at http://cms.moulon.inra.fr/content/view/14/44/.
Traditional Medicine Collection Tracking System (TM-CTS): a database for ethnobotanically driven drug-discovery programs.

PubMed

Harris, Eric S J; Erickson, Sean D; Tolopko, Andrew N; Cao, Shugeng; Craycroft, Jane A; Scholten, Robert; Fu, Yanling; Wang, Wenquan; Liu, Yong; Zhao, Zhongzhen; Clardy, Jon; Shamu, Caroline E; Eisenberg, David M

2011-05-17

Ethnobotanically driven drug-discovery programs include data related to many aspects of the preparation of botanical medicines, from initial plant collection to chemical extraction and fractionation. The Traditional Medicine Collection Tracking System (TM-CTS) was created to organize and store data of this type for an international collaborative project involving the systematic evaluation of commonly used Traditional Chinese Medicinal plants. The system was developed using domain-driven design techniques, and is implemented using Java, Hibernate, PostgreSQL, Business Intelligence and Reporting Tools (BIRT), and Apache Tomcat. The TM-CTS relational database schema contains over 70 data types, comprising over 500 data fields. The system incorporates a number of unique features that are useful in the context of ethnobotanical projects such as support for information about botanical collection, method of processing, quality tests for plants with existing pharmacopoeia standards, chemical extraction and fractionation, and historical uses of the plants. The database also accommodates data provided in multiple languages and integration with a database system built to support high throughput screening based drug discovery efforts. It is accessed via a web-based application that provides extensive, multi-format reporting capabilities. This new database system was designed to support a project evaluating the bioactivity of Chinese medicinal plants. The software used to create the database is open source, freely available, and could potentially be applied to other ethnobotanically driven natural product collection and drug-discovery programs. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.
Traditional Medicine Collection Tracking System (TM-CTS): A Database for Ethnobotanically-Driven Drug-Discovery Programs

PubMed Central

Harris, Eric S. J.; Erickson, Sean D.; Tolopko, Andrew N.; Cao, Shugeng; Craycroft, Jane A.; Scholten, Robert; Fu, Yanling; Wang, Wenquan; Liu, Yong; Zhao, Zhongzhen; Clardy, Jon; Shamu, Caroline E.; Eisenberg, David M.

2011-01-01

Aim of the study. Ethnobotanically-driven drug-discovery programs include data related to many aspects of the preparation of botanical medicines, from initial plant collection to chemical extraction and fractionation. The Traditional Medicine-Collection Tracking System (TM-CTS) was created to organize and store data of this type for an international collaborative project involving the systematic evaluation of commonly used Traditional Chinese Medicinal plants. Materials and Methods. The system was developed using domain-driven design techniques, and is implemented using Java, Hibernate, PostgreSQL, Business Intelligence and Reporting Tools (BIRT), and Apache Tomcat. Results. The TM-CTS relational database schema contains over 70 data types, comprising over 500 data fields. The system incorporates a number of unique features that are useful in the context of ethnobotanical projects such as support for information about botanical collection, method of processing, quality tests for plants with existing pharmacopoeia standards, chemical extraction and fractionation, and historical uses of the plants. The database also accommodates data provided in multiple languages and integration with a database system built to support high throughput screening based drug discovery efforts. It is accessed via a web-based application that provides extensive, multi-format reporting capabilities. Conclusions. This new database system was designed to support a project evaluating the bioactivity of Chinese medicinal plants. The software used to create the database is open source, freely available, and could potentially be applied to other ethnobotanically-driven natural product collection and drug-discovery programs. PMID:21420479
Clinical Databases for Chest Physicians.

PubMed

Courtwright, Andrew M; Gabriel, Peter E

2018-04-01

A clinical database is a repository of patient medical and sociodemographic information focused on one or more specific health condition or exposure. Although clinical databases may be used for research purposes, their primary goal is to collect and track patient data for quality improvement, quality assurance, and/or actual clinical management. This article aims to provide an introduction and practical advice on the development of small-scale clinical databases for chest physicians and practice groups. Through example projects, we discuss the pros and cons of available technical platforms, including Microsoft Excel and Access, relational database management systems such as Oracle and PostgreSQL, and Research Electronic Data Capture. We consider approaches to deciding the base unit of data collection, creating consensus around variable definitions, and structuring routine clinical care to complement database aims. We conclude with an overview of regulatory and security considerations for clinical databases. Copyright © 2018 American College of Chest Physicians. Published by Elsevier Inc. All rights reserved.
Development of a replicated database of DHCP data for evaluation of drug use.

PubMed Central

Graber, S E; Seneker, J A; Stahl, A A; Franklin, K O; Neel, T E; Miller, R A

1996-01-01

This case report describes development and testing of a method to extract clinical information stored in the Veterans Affairs (VA) Decentralized Hospital Computer System (DHCP) for the purpose of analyzing data about groups of patients. The authors used a microcomputer-based, structured query language (SQL)-compatible, relational database system to replicate a subset of the Nashville VA Hospital's DHCP patient database. This replicated database contained the complete current Nashville DHCP prescription, provider, patient, and drug data sets, and a subset of the laboratory data. A pilot project employed this replicated database to answer questions that might arise in drug-use evaluation, such as identification of cases of polypharmacy, suboptimal drug regimens, and inadequate laboratory monitoring of drug therapy. These database queries included as candidates for review all prescriptions for all outpatients. The queries demonstrated that specific drug-use events could be identified for any time interval represented in the replicated database. PMID:8653451
Development of a replicated database of DHCP data for evaluation of drug use.

PubMed

Graber, S E; Seneker, J A; Stahl, A A; Franklin, K O; Neel, T E; Miller, R A

1996-01-01

This case report describes development and testing of a method to extract clinical information stored in the Veterans Affairs (VA) Decentralized Hospital Computer System (DHCP) for the purpose of analyzing data about groups of patients. The authors used a microcomputer-based, structured query language (SQL)-compatible, relational database system to replicate a subset of the Nashville VA Hospital's DHCP patient database. This replicated database contained the complete current Nashville DHCP prescription, provider, patient, and drug data sets, and a subset of the laboratory data. A pilot project employed this replicated database to answer questions that might arise in drug-use evaluation, such as identification of cases of polypharmacy, suboptimal drug regimens, and inadequate laboratory monitoring of drug therapy. These database queries included as candidates for review all prescriptions for all outpatients. The queries demonstrated that specific drug-use events could be identified for any time interval represented in the replicated database.
Brief Report: Databases in the Asia-Pacific Region: The Potential for a Distributed Network Approach.

PubMed

Lai, Edward Chia-Cheng; Man, Kenneth K C; Chaiyakunapruk, Nathorn; Cheng, Ching-Lan; Chien, Hsu-Chih; Chui, Celine S L; Dilokthornsakul, Piyameth; Hardy, N Chantelle; Hsieh, Cheng-Yang; Hsu, Chung Y; Kubota, Kiyoshi; Lin, Tzu-Chieh; Liu, Yanfang; Park, Byung Joo; Pratt, Nicole; Roughead, Elizabeth E; Shin, Ju-Young; Watcharathanakij, Sawaeng; Wen, Jin; Wong, Ian C K; Yang, Yea-Huei Kao; Zhang, Yinghong; Setoguchi, Soko

2015-11-01

This study describes the availability and characteristics of databases in Asian-Pacific countries and assesses the feasibility of a distributed network approach in the region. A web-based survey was conducted among investigators using healthcare databases in the Asia-Pacific countries. Potential survey participants were identified through the Asian Pharmacoepidemiology Network. Investigators from a total of 11 databases participated in the survey. Database sources included four nationwide claims databases from Japan, South Korea, and Taiwan; two nationwide electronic health records from Hong Kong and Singapore; a regional electronic health record from western China; two electronic health records from Thailand; and cancer and stroke registries from Taiwan. We identified 11 databases with capabilities for distributed network approaches. Many country-specific coding systems and terminologies have been already converted to international coding systems. The harmonization of health expenditure data is a major obstacle for future investigations attempting to evaluate issues related to medical costs.
The methodology of database design in organization management systems

NASA Astrophysics Data System (ADS)

Chudinov, I. L.; Osipova, V. V.; Bobrova, Y. V.

2017-01-01

The paper describes the unified methodology of database design for management information systems. Designing the conceptual information model for the domain area is the most important and labor-intensive stage in database design. Basing on the proposed integrated approach to design, the conceptual information model, the main principles of developing the relation databases are provided and user’s information needs are considered. According to the methodology, the process of designing the conceptual information model includes three basic stages, which are defined in detail. Finally, the article describes the process of performing the results of analyzing user’s information needs and the rationale for use of classifiers.
LSE-Sign: A lexical database for Spanish Sign Language.

PubMed

Gutierrez-Sigut, Eva; Costello, Brendan; Baus, Cristina; Carreiras, Manuel

2016-03-01

The LSE-Sign database is a free online tool for selecting Spanish Sign Language stimulus materials to be used in experiments. It contains 2,400 individual signs taken from a recent standardized LSE dictionary, and a further 2,700 related nonsigns. Each entry is coded for a wide range of grammatical, phonological, and articulatory information, including handshape, location, movement, and non-manual elements. The database is accessible via a graphically based search facility which is highly flexible both in terms of the search options available and the way the results are displayed. LSE-Sign is available at the following website: http://www.bcbl.eu/databases/lse/.
Design and implementation of relational databases relevant to the diverse needs of a tuberculosis case contact study in the Gambia.

PubMed

Jeffries, D J; Donkor, S; Brookes, R H; Fox, A; Hill, P C

2004-09-01

The data requirements of a large multidisciplinary tuberculosis case contact study are complex. We describe an ACCESS-based relational database system that meets our rigorous requirements for data entry and validation, while being user-friendly, flexible, exportable, and easy to install on a network or stand alone system. This includes the development of a double data entry package for epidemiology and laboratory data, semi-automated entry of ELISPOT data directly from the plate reader, and a suite of new programmes for the manipulation and integration of flow cytometry data. The double entered epidemiology and immunology databases are combined into a separate database, providing a near-real-time analysis of immuno-epidemiological data, allowing important trends to be identified early and major decisions about the study to be made and acted on. This dynamic data management model is portable and can easily be applied to other studies.
Efficient hemodynamic event detection utilizing relational databases and wavelet analysis

NASA Technical Reports Server (NTRS)

Saeed, M.; Mark, R. G.

2001-01-01

Development of a temporal query framework for time-oriented medical databases has hitherto been a challenging problem. We describe a novel method for the detection of hemodynamic events in multiparameter trends utilizing wavelet coefficients in a MySQL relational database. Storage of the wavelet coefficients allowed for a compact representation of the trends, and provided robust descriptors for the dynamics of the parameter time series. A data model was developed to allow for simplified queries along several dimensions and time scales. Of particular importance, the data model and wavelet framework allowed for queries to be processed with minimal table-join operations. A web-based search engine was developed to allow for user-defined queries. Typical queries required between 0.01 and 0.02 seconds, with at least two orders of magnitude improvement in speed over conventional queries. This powerful and innovative structure will facilitate research on large-scale time-oriented medical databases.
Renal Gene Expression Database (RGED): a relational database of gene expression profiles in kidney disease.

PubMed

Zhang, Qingzhou; Yang, Bo; Chen, Xujiao; Xu, Jing; Mei, Changlin; Mao, Zhiguo

2014-01-01

We present a bioinformatics database named Renal Gene Expression Database (RGED), which contains comprehensive gene expression data sets from renal disease research. The web-based interface of RGED allows users to query the gene expression profiles in various kidney-related samples, including renal cell lines, human kidney tissues and murine model kidneys. Researchers can explore certain gene profiles, the relationships between genes of interests and identify biomarkers or even drug targets in kidney diseases. The aim of this work is to provide a user-friendly utility for the renal disease research community to query expression profiles of genes of their own interest without the requirement of advanced computational skills. Website is implemented in PHP, R, MySQL and Nginx and freely available from http://rged.wall-eva.net. http://rged.wall-eva.net. © The Author(s) 2014. Published by Oxford University Press.
Security and privacy qualities of medical devices: an analysis of FDA postmarket surveillance.

PubMed

Kramer, Daniel B; Baker, Matthew; Ransford, Benjamin; Molina-Markham, Andres; Stewart, Quinn; Fu, Kevin; Reynolds, Matthew R

2012-01-01

Medical devices increasingly depend on computing functions such as wireless communication and Internet connectivity for software-based control of therapies and network-based transmission of patients' stored medical information. These computing capabilities introduce security and privacy risks, yet little is known about the prevalence of such risks within the clinical setting. We used three comprehensive, publicly available databases maintained by the Food and Drug Administration (FDA) to evaluate recalls and adverse events related to security and privacy risks of medical devices. Review of weekly enforcement reports identified 1,845 recalls; 605 (32.8%) of these included computers, 35 (1.9%) stored patient data, and 31 (1.7%) were capable of wireless communication. Searches of databases specific to recalls and adverse events identified only one event with a specific connection to security or privacy. Software-related recalls were relatively common, and most (81.8%) mentioned the possibility of upgrades, though only half of these provided specific instructions for the update mechanism. Our review of recalls and adverse events from federal government databases reveals sharp inconsistencies with databases at individual providers with respect to security and privacy risks. Recalls related to software may increase security risks because of unprotected update and correction mechanisms. To detect signals of security and privacy problems that adversely affect public health, federal postmarket surveillance strategies should rethink how to effectively and efficiently collect data on security and privacy problems in devices that increasingly depend on computing systems susceptible to malware.
Security and Privacy Qualities of Medical Devices: An Analysis of FDA Postmarket Surveillance

PubMed Central

Kramer, Daniel B.; Baker, Matthew; Ransford, Benjamin; Molina-Markham, Andres; Stewart, Quinn; Fu, Kevin; Reynolds, Matthew R.

2012-01-01

Background Medical devices increasingly depend on computing functions such as wireless communication and Internet connectivity for software-based control of therapies and network-based transmission of patients’ stored medical information. These computing capabilities introduce security and privacy risks, yet little is known about the prevalence of such risks within the clinical setting. Methods We used three comprehensive, publicly available databases maintained by the Food and Drug Administration (FDA) to evaluate recalls and adverse events related to security and privacy risks of medical devices. Results Review of weekly enforcement reports identified 1,845 recalls; 605 (32.8%) of these included computers, 35 (1.9%) stored patient data, and 31 (1.7%) were capable of wireless communication. Searches of databases specific to recalls and adverse events identified only one event with a specific connection to security or privacy. Software-related recalls were relatively common, and most (81.8%) mentioned the possibility of upgrades, though only half of these provided specific instructions for the update mechanism. Conclusions Our review of recalls and adverse events from federal government databases reveals sharp inconsistencies with databases at individual providers with respect to security and privacy risks. Recalls related to software may increase security risks because of unprotected update and correction mechanisms. To detect signals of security and privacy problems that adversely affect public health, federal postmarket surveillance strategies should rethink how to effectively and efficiently collect data on security and privacy problems in devices that increasingly depend on computing systems susceptible to malware. PMID:22829874
Risk of cardiac death among cancer survivors in the United States: a SEER database analysis.

PubMed

Abdel-Rahman, Omar

2017-09-01

Population-based data on the risk of cardiac death among cancer survivors are needed. This scenario was evaluated in cancer survivors (>5 years) registered within the Surveillance, Epidemiology and End Results (SEER) database. The SEER database was queried using SEER*Stat to determine the frequency of cardiac death compared to other causes of death; and to determine heart disease-specific and cancer-specific survival rates in survivors of each of the 10 most common cancers in men and women in the SEER database. For cancer-specific survival rate, the highest rates were related to thyroid cancer survivors; while the lowest rates were related to lung cancer survivors. For heart disease-specific survival rate, the highest rates were related to thyroid cancer survivors; while the lowest rates were related to both lung cancer survivors and urinary bladder cancer survivors. The following factors were associated with a higher likelihood of cardiac death: male gender, old age at diagnosis, black race and local treatment with radiotherapy rather than surgery (P < 0.0001 for all parameters). Among cancer survivors (>5 years), cardiac death is a significant cause of death and there is a wide variability among different cancers in the relative importance of cardiac death vs. cancer-related death.
A Benchmark and Comparative Study of Video-Based Face Recognition on COX Face Database.

PubMed

Huang, Zhiwu; Shan, Shiguang; Wang, Ruiping; Zhang, Haihong; Lao, Shihong; Kuerban, Alifu; Chen, Xilin

2015-12-01

Face recognition with still face images has been widely studied, while the research on video-based face recognition is inadequate relatively, especially in terms of benchmark datasets and comparisons. Real-world video-based face recognition applications require techniques for three distinct scenarios: 1) Videoto-Still (V2S); 2) Still-to-Video (S2V); and 3) Video-to-Video (V2V), respectively, taking video or still image as query or target. To the best of our knowledge, few datasets and evaluation protocols have benchmarked for all the three scenarios. In order to facilitate the study of this specific topic, this paper contributes a benchmarking and comparative study based on a newly collected still/video face database, named COX(1) Face DB. Specifically, we make three contributions. First, we collect and release a largescale still/video face database to simulate video surveillance with three different video-based face recognition scenarios (i.e., V2S, S2V, and V2V). Second, for benchmarking the three scenarios designed on our database, we review and experimentally compare a number of existing set-based methods. Third, we further propose a novel Point-to-Set Correlation Learning (PSCL) method, and experimentally show that it can be used as a promising baseline method for V2S/S2V face recognition on COX Face DB. Extensive experimental results clearly demonstrate that video-based face recognition needs more efforts, and our COX Face DB is a good benchmark database for evaluation.
A Chado case study: an ontology-based modular schema for representing genome-associated biological information.

PubMed

Mungall, Christopher J; Emmert, David B

2007-07-01

A few years ago, FlyBase undertook to design a new database schema to store Drosophila data. It would fully integrate genomic sequence and annotation data with bibliographic, genetic, phenotypic and molecular data from the literature representing a distillation of the first 100 years of research on this major animal model system. In developing this new integrated schema, FlyBase also made a commitment to ensure that its design was generic, extensible and available as open source, so that it could be employed as the core schema of any model organism data repository, thereby avoiding redundant software development and potentially increasing interoperability. Our question was whether we could create a relational database schema that would be successfully reused. Chado is a relational database schema now being used to manage biological knowledge for a wide variety of organisms, from human to pathogens, especially the classes of information that directly or indirectly can be associated with genome sequences or the primary RNA and protein products encoded by a genome. Biological databases that conform to this schema can interoperate with one another, and with application software from the Generic Model Organism Database (GMOD) toolkit. Chado is distinctive because its design is driven by ontologies. The use of ontologies (or controlled vocabularies) is ubiquitous across the schema, as they are used as a means of typing entities. The Chado schema is partitioned into integrated subschemas (modules), each encapsulating a different biological domain, and each described using representations in appropriate ontologies. To illustrate this methodology, we describe here the Chado modules used for describing genomic sequences. GMOD is a collaboration of several model organism database groups, including FlyBase, to develop a set of open-source software for managing model organism data. The Chado schema is freely distributed under the terms of the Artistic License (http://www.opensource.org/licenses/artistic-license.php) from GMOD (www.gmod.org).
[Effects of soil data and map scale on assessment of total phosphorus storage in upland soils.

PubMed

Li, Heng Rong; Zhang, Li Ming; Li, Xiao di; Yu, Dong Sheng; Shi, Xue Zheng; Xing, Shi He; Chen, Han Yue

2016-06-01

Accurate assessment of total phosphorus storage in farmland soils is of great significance to sustainable agricultural and non-point source pollution control. However, previous studies haven't considered the estimation errors from mapping scales and various databases with different sources of soil profile data. In this study, a total of 393×10 4 hm 2 of upland in the 29 counties (or cities) of North Jiangsu was cited as a case for study. Analysis was performed of how the four sources of soil profile data, namely, "Soils of County", "Soils of Prefecture", "Soils of Province" and "Soils of China", and the six scales, i.e. 1:50000, 1:250000, 1:500000, 1:1000000, 1:4000000 and1:10000000, used in the 24 soil databases established for the four soil journals, affected assessment of soil total phosphorus. Compared with the most detailed 1:50000 soil database established with 983 upland soil profiles, relative deviation of the estimates of soil total phosphorus density (STPD) and soil total phosphorus storage (STPS) from the other soil databases varied from 4.8% to 48.9% and from 1.6% to 48.4%, respectively. The estimated STPD and STPS based on the 1:50000 database of "Soils of County" and most of the estimates based on the databases of each scale in "Soils of County" and "Soils of Prefecture" were different, with the significance levels of P＜0.001 or P＜0.05. Extremely significant differences (P＜0.001) existed between the estimates based on the 1:50000 database of "Soils of County" and the estimates based on the databases of each scale in "Soils of Province" and "Soils of China". This study demonstrated the significance of appropriate soil data sources and appropriate mapping scales in estimating STPS.
Hospital nurses' information retrieval behaviours in relation to evidence based nursing: a literature review.

PubMed

Alving, Berit Elisabeth; Christensen, Janne Buck; Thrysøe, Lars

2018-03-01

The purpose of this literature review is to provide an overview of the information retrieval behaviour of clinical nurses, in terms of the use of databases and other information resources and their frequency of use. Systematic searches carried out in five databases and handsearching were used to identify the studies from 2010 to 2016, with a populations, exposures and outcomes (PEO) search strategy, focusing on the question: In which databases or other information resources do hospital nurses search for evidence based information, and how often? Of 5272 titles retrieved based on the search strategy, only nine studies fulfilled the criteria for inclusion. The studies are from the United States, Canada, Taiwan and Nigeria. The results show that hospital nurses' primary choice of source for evidence based information is Google and peers, while bibliographic databases such as PubMed are secondary choices. Data on frequency are only included in four of the studies, and data are heterogenous. The reasons for choosing Google and peers are primarily lack of time; lack of information; lack of retrieval skills; or lack of training in database searching. Only a few studies are published on clinical nurses' retrieval behaviours, and more studies are needed from Europe and Australia. © 2018 Health Libraries Group.

Integrated database for identifying candidate genes for Aspergillus flavus resistance in maize

PubMed Central

2010-01-01

Background Aspergillus flavus Link:Fr, an opportunistic fungus that produces aflatoxin, is pathogenic to maize and other oilseed crops. Aflatoxin is a potent carcinogen, and its presence markedly reduces the value of grain. Understanding and enhancing host resistance to A. flavus infection and/or subsequent aflatoxin accumulation is generally considered an efficient means of reducing grain losses to aflatoxin. Different proteomic, genomic and genetic studies of maize (Zea mays L.) have generated large data sets with the goal of identifying genes responsible for conferring resistance to A. flavus, or aflatoxin. Results In order to maximize the usage of different data sets in new studies, including association mapping, we have constructed a relational database with web interface integrating the results of gene expression, proteomic (both gel-based and shotgun), Quantitative Trait Loci (QTL) genetic mapping studies, and sequence data from the literature to facilitate selection of candidate genes for continued investigation. The Corn Fungal Resistance Associated Sequences Database (CFRAS-DB) (http://agbase.msstate.edu/) was created with the main goal of identifying genes important to aflatoxin resistance. CFRAS-DB is implemented using MySQL as the relational database management system running on a Linux server, using an Apache web server, and Perl CGI scripts as the web interface. The database and the associated web-based interface allow researchers to examine many lines of evidence (e.g. microarray, proteomics, QTL studies, SNP data) to assess the potential role of a gene or group of genes in the response of different maize lines to A. flavus infection and subsequent production of aflatoxin by the fungus. Conclusions CFRAS-DB provides the first opportunity to integrate data pertaining to the problem of A. flavus and aflatoxin resistance in maize in one resource and to support queries across different datasets. The web-based interface gives researchers different query options for mining the database across different types of experiments. The database is publically available at http://agbase.msstate.edu. PMID:20946609
Integrated database for identifying candidate genes for Aspergillus flavus resistance in maize.

PubMed

Kelley, Rowena Y; Gresham, Cathy; Harper, Jonathan; Bridges, Susan M; Warburton, Marilyn L; Hawkins, Leigh K; Pechanova, Olga; Peethambaran, Bela; Pechan, Tibor; Luthe, Dawn S; Mylroie, J E; Ankala, Arunkanth; Ozkan, Seval; Henry, W B; Williams, W P

2010-10-07

Aspergillus flavus Link:Fr, an opportunistic fungus that produces aflatoxin, is pathogenic to maize and other oilseed crops. Aflatoxin is a potent carcinogen, and its presence markedly reduces the value of grain. Understanding and enhancing host resistance to A. flavus infection and/or subsequent aflatoxin accumulation is generally considered an efficient means of reducing grain losses to aflatoxin. Different proteomic, genomic and genetic studies of maize (Zea mays L.) have generated large data sets with the goal of identifying genes responsible for conferring resistance to A. flavus, or aflatoxin. In order to maximize the usage of different data sets in new studies, including association mapping, we have constructed a relational database with web interface integrating the results of gene expression, proteomic (both gel-based and shotgun), Quantitative Trait Loci (QTL) genetic mapping studies, and sequence data from the literature to facilitate selection of candidate genes for continued investigation. The Corn Fungal Resistance Associated Sequences Database (CFRAS-DB) (http://agbase.msstate.edu/) was created with the main goal of identifying genes important to aflatoxin resistance. CFRAS-DB is implemented using MySQL as the relational database management system running on a Linux server, using an Apache web server, and Perl CGI scripts as the web interface. The database and the associated web-based interface allow researchers to examine many lines of evidence (e.g. microarray, proteomics, QTL studies, SNP data) to assess the potential role of a gene or group of genes in the response of different maize lines to A. flavus infection and subsequent production of aflatoxin by the fungus. CFRAS-DB provides the first opportunity to integrate data pertaining to the problem of A. flavus and aflatoxin resistance in maize in one resource and to support queries across different datasets. The web-based interface gives researchers different query options for mining the database across different types of experiments. The database is publically available at http://agbase.msstate.edu.
WholeCellSimDB: a hybrid relational/HDF database for whole-cell model predictions

PubMed Central

Karr, Jonathan R.; Phillips, Nolan C.; Covert, Markus W.

2014-01-01

Mechanistic ‘whole-cell’ models are needed to develop a complete understanding of cell physiology. However, extracting biological insights from whole-cell models requires running and analyzing large numbers of simulations. We developed WholeCellSimDB, a database for organizing whole-cell simulations. WholeCellSimDB was designed to enable researchers to search simulation metadata to identify simulations for further analysis, and quickly slice and aggregate simulation results data. In addition, WholeCellSimDB enables users to share simulations with the broader research community. The database uses a hybrid relational/hierarchical data format architecture to efficiently store and retrieve both simulation setup metadata and results data. WholeCellSimDB provides a graphical Web-based interface to search, browse, plot and export simulations; a JavaScript Object Notation (JSON) Web service to retrieve data for Web-based visualizations; a command-line interface to deposit simulations; and a Python API to retrieve data for advanced analysis. Overall, we believe WholeCellSimDB will help researchers use whole-cell models to advance basic biological science and bioengineering. Database URL: http://www.wholecellsimdb.org Source code repository URL: http://github.com/CovertLab/WholeCellSimDB PMID:25231498
Hand-held computer operating system program for collection of resident experience data.

PubMed

Malan, T K; Haffner, W H; Armstrong, A Y; Satin, A J

2000-11-01

To describe a system for recording resident experience involving hand-held computers with the Palm Operating System (3 Com, Inc., Santa Clara, CA). Hand-held personal computers (PCs) are popular, easy to use, inexpensive, portable, and can share data among other operating systems. Residents in our program carry individual hand-held database computers to record Residency Review Committee (RRC) reportable patient encounters. Each resident's data is transferred to a single central relational database compatible with Microsoft Access (Microsoft Corporation, Redmond, WA). Patient data entry and subsequent transfer to a central database is accomplished with commercially available software that requires minimal computer expertise to implement and maintain. The central database can then be used for statistical analysis or to create required RRC resident experience reports. As a result, the data collection and transfer process takes less time for residents and program director alike, than paper-based or central computer-based systems. The system of collecting resident encounter data using hand-held computers with the Palm Operating System is easy to use, relatively inexpensive, accurate, and secure. The user-friendly system provides prompt, complete, and accurate data, enhancing the education of residents while facilitating the job of the program director.
MethBank: a database integrating next-generation sequencing single-base-resolution DNA methylation programming data.

PubMed

Zou, Dong; Sun, Shixiang; Li, Rujiao; Liu, Jiang; Zhang, Jing; Zhang, Zhang

2015-01-01

DNA methylation plays crucial roles during embryonic development. Here we present MethBank (http://dnamethylome.org), a DNA methylome programming database that integrates the genome-wide single-base nucleotide methylomes of gametes and early embryos in different model organisms. Unlike extant relevant databases, MethBank incorporates the whole-genome single-base-resolution methylomes of gametes and early embryos at multiple different developmental stages in zebrafish and mouse. MethBank allows users to retrieve methylation levels, differentially methylated regions, CpG islands, gene expression profiles and genetic polymorphisms for a specific gene or genomic region. Moreover, it offers a methylome browser that is capable of visualizing high-resolution DNA methylation profiles as well as other related data in an interactive manner and thus is of great helpfulness for users to investigate methylation patterns and changes of gametes and early embryos at different developmental stages. Ongoing efforts are focused on incorporation of methylomes and related data from other organisms. Together, MethBank features integration and visualization of high-resolution DNA methylation data as well as other related data, enabling identification of potential DNA methylation signatures in different developmental stages and accordingly providing an important resource for the epigenetic and developmental studies. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
ALARA database value in future outage work planning and dose management

DOE Office of Scientific and Technical Information (OSTI.GOV)

Miller, D.W.; Green, W.H.

1995-03-01

ALARA database encompassing job-specific duration and man-rem plant specific information over three refueling outages represents an invaluable tool for the outage work planner and ALARA engineer. This paper describes dose-management trends emerging based on analysis of three refueling outages at Clinton Power Station. Conclusions reached based on hard data available from a relational database dose-tracking system is a valuable tool for planning of future outage work. The system`s ability to identify key problem areas during a refueling outage is improving as more outage comparative data becomes available. Trends over a three outage period are identified in this paper in themore » categories of number and type of radiation work permits implemented, duration of jobs, projected vs. actual dose rates in work areas, and accuracy of outage person-rem projection. The value of the database in projecting 1 and 5 year station person-rem estimates is discussed.« less
Identification of Functionally Related Enzymes by Learning-to-Rank Methods.

PubMed

Stock, Michiel; Fober, Thomas; Hüllermeier, Eyke; Glinca, Serghei; Klebe, Gerhard; Pahikkala, Tapio; Airola, Antti; De Baets, Bernard; Waegeman, Willem

2014-01-01

Enzyme sequences and structures are routinely used in the biological sciences as queries to search for functionally related enzymes in online databases. To this end, one usually departs from some notion of similarity, comparing two enzymes by looking for correspondences in their sequences, structures or surfaces. For a given query, the search operation results in a ranking of the enzymes in the database, from very similar to dissimilar enzymes, while information about the biological function of annotated database enzymes is ignored. In this work, we show that rankings of that kind can be substantially improved by applying kernel-based learning algorithms. This approach enables the detection of statistical dependencies between similarities of the active cleft and the biological function of annotated enzymes. This is in contrast to search-based approaches, which do not take annotated training data into account. Similarity measures based on the active cleft are known to outperform sequence-based or structure-based measures under certain conditions. We consider the Enzyme Commission (EC) classification hierarchy for obtaining annotated enzymes during the training phase. The results of a set of sizeable experiments indicate a consistent and significant improvement for a set of similarity measures that exploit information about small cavities in the surface of enzymes.
Solving Relational Database Problems with ORDBMS in an Advanced Database Course

ERIC Educational Resources Information Center

Wang, Ming

2011-01-01

This paper introduces how to use the object-relational database management system (ORDBMS) to solve relational database (RDB) problems in an advanced database course. The purpose of the paper is to provide a guideline for database instructors who desire to incorporate the ORDB technology in their traditional database courses. The paper presents…
[The Development and Application of the Orthopaedics Implants Failure Database Software Based on WEB].

PubMed

Huang, Jiahua; Zhou, Hai; Zhang, Binbin; Ding, Biao

2015-09-01

This article develops a new failure database software for orthopaedics implants based on WEB. The software is based on B/S mode, ASP dynamic web technology is used as its main development language to achieve data interactivity, Microsoft Access is used to create a database, these mature technologies make the software extend function or upgrade easily. In this article, the design and development idea of the software, the software working process and functions as well as relative technical features are presented. With this software, we can store many different types of the fault events of orthopaedics implants, the failure data can be statistically analyzed, and in the macroscopic view, it can be used to evaluate the reliability of orthopaedics implants and operations, it also can ultimately guide the doctors to improve the clinical treatment level.
Intelligent data management

NASA Technical Reports Server (NTRS)

Campbell, William J.

1985-01-01

Intelligent data management is the concept of interfacing a user to a database management system with a value added service that will allow a full range of data management operations at a high level of abstraction using human written language. The development of such a system will be based on expert systems and related artificial intelligence technologies, and will allow the capturing of procedural and relational knowledge about data management operations and the support of a user with such knowledge in an on-line, interactive manner. Such a system will have the following capabilities: (1) the ability to construct a model of the users view of the database, based on the query syntax; (2) the ability to transform English queries and commands into database instructions and processes; (3) the ability to use heuristic knowledge to rapidly prune the data space in search processes; and (4) the ability to use an on-line explanation system to allow the user to understand what the system is doing and why it is doing it. Additional information is given in outline form.
Incremental Query Rewriting with Resolution

NASA Astrophysics Data System (ADS)

Riazanov, Alexandre; Aragão, Marcelo A. T.

We address the problem of semantic querying of relational databases (RDB) modulo knowledge bases using very expressive knowledge representation formalisms, such as full first-order logic or its various fragments. We propose to use a resolution-based first-order logic (FOL) reasoner for computing schematic answers to deductive queries, with the subsequent translation of these schematic answers to SQL queries which are evaluated using a conventional relational DBMS. We call our method incremental query rewriting, because an original semantic query is rewritten into a (potentially infinite) series of SQL queries. In this chapter, we outline the main idea of our technique - using abstractions of databases and constrained clauses for deriving schematic answers, and provide completeness and soundness proofs to justify the applicability of this technique to the case of resolution for FOL without equality. The proposed method can be directly used with regular RDBs, including legacy databases. Moreover, we propose it as a potential basis for an efficient Web-scale semantic search technology.
Lead exposure in US worksites: A literature review and development of an occupational lead exposure database from the published literature

PubMed Central

Koh, Dong-Hee; Locke, Sarah J.; Chen, Yu-Cheng; Purdue, Mark P.; Friesen, Melissa C.

2016-01-01

Background Retrospective exposure assessment of occupational lead exposure in population-based studies requires historical exposure information from many occupations and industries. Methods We reviewed published US exposure monitoring studies to identify lead exposure measurement data. We developed an occupational lead exposure database from the 175 identified papers containing 1,111 sets of lead concentration summary statistics (21% area air, 47% personal air, 32% blood). We also extracted ancillary exposure-related information, including job, industry, task/location, year collected, sampling strategy, control measures in place, and sampling and analytical methods. Results Measurements were published between 1940 and 2010 and represented 27 2-digit standardized industry classification codes. The majority of the measurements were related to lead-based paint work, joining or cutting metal using heat, primary and secondary metal manufacturing, and lead acid battery manufacturing. Conclusions This database can be used in future statistical analyses to characterize differences in lead exposure across time, jobs, and industries. PMID:25968240
Critical Needs for Robust and Reliable Database for Design and Manufacturing of Ceramic Matrix Composites

NASA Technical Reports Server (NTRS)

Singh, M.

1999-01-01

Ceramic matrix composite (CMC) components are being designed, fabricated, and tested for a number of high temperature, high performance applications in aerospace and ground based systems. The critical need for and the role of reliable and robust databases for the design and manufacturing of ceramic matrix composites are presented. A number of issues related to engineering design, manufacturing technologies, joining, and attachment technologies, are also discussed. Examples of various ongoing activities in the area of composite databases. designing to codes and standards, and design for manufacturing are given.
SAM: String-based sequence search algorithm for mitochondrial DNA database queries

PubMed Central

Röck, Alexander; Irwin, Jodi; Dür, Arne; Parsons, Thomas; Parson, Walther

2011-01-01

The analysis of the haploid mitochondrial (mt) genome has numerous applications in forensic and population genetics, as well as in disease studies. Although mtDNA haplotypes are usually determined by sequencing, they are rarely reported as a nucleotide string. Traditionally they are presented in a difference-coded position-based format relative to the corrected version of the first sequenced mtDNA. This convention requires recommendations for standardized sequence alignment that is known to vary between scientific disciplines, even between laboratories. As a consequence, database searches that are vital for the interpretation of mtDNA data can suffer from biased results when query and database haplotypes are annotated differently. In the forensic context that would usually lead to underestimation of the absolute and relative frequencies. To address this issue we introduce SAM, a string-based search algorithm that converts query and database sequences to position-free nucleotide strings and thus eliminates the possibility that identical sequences will be missed in a database query. The mere application of a BLAST algorithm would not be a sufficient remedy as it uses a heuristic approach and does not address properties specific to mtDNA, such as phylogenetically stable but also rapidly evolving insertion and deletion events. The software presented here provides additional flexibility to incorporate phylogenetic data, site-specific mutation rates, and other biologically relevant information that would refine the interpretation of mitochondrial DNA data. The manuscript is accompanied by freeware and example data sets that can be used to evaluate the new software (http://stringvalidation.org). PMID:21056022
CFTR-France, a national relational patient database for sharing genetic and phenotypic data associated with rare CFTR variants.

PubMed

Claustres, Mireille; Thèze, Corinne; des Georges, Marie; Baux, David; Girodon, Emmanuelle; Bienvenu, Thierry; Audrezet, Marie-Pierre; Dugueperoux, Ingrid; Férec, Claude; Lalau, Guy; Pagin, Adrien; Kitzis, Alain; Thoreau, Vincent; Gaston, Véronique; Bieth, Eric; Malinge, Marie-Claire; Reboul, Marie-Pierre; Fergelot, Patricia; Lemonnier, Lydie; Mekki, Chadia; Fanen, Pascale; Bergougnoux, Anne; Sasorith, Souphatta; Raynal, Caroline; Bareil, Corinne

2017-10-01

Most of the 2,000 variants identified in the CFTR (cystic fibrosis transmembrane regulator) gene are rare or private. Their interpretation is hampered by the lack of available data and resources, making patient care and genetic counseling challenging. We developed a patient-based database dedicated to the annotations of rare CFTR variants in the context of their cis- and trans-allelic combinations. Based on almost 30 years of experience of CFTR testing, CFTR-France (https://cftr.iurc.montp.inserm.fr/cftr) currently compiles 16,819 variant records from 4,615 individuals with cystic fibrosis (CF) or CFTR-RD (related disorders), fetuses with ultrasound bowel anomalies, newborns awaiting clinical diagnosis, and asymptomatic compound heterozygotes. For each of the 736 different variants reported in the database, patient characteristics and genetic information (other variations in cis or in trans) have been thoroughly checked by a dedicated curator. Combining updated clinical, epidemiological, in silico, or in vitro functional data helps to the interpretation of unclassified and the reassessment of misclassified variants. This comprehensive CFTR database is now an invaluable tool for diagnostic laboratories gathering information on rare variants, especially in the context of genetic counseling, prenatal and preimplantation genetic diagnosis. CFTR-France is thus highly complementary to the international database CFTR2 focused so far on the most common CF-causing alleles. © 2017 Wiley Periodicals, Inc.
A relational data-knowledge base system and its potential in developing a distributed data-knowledge system

NASA Technical Reports Server (NTRS)

Rahimian, Eric N.; Graves, Sara J.

1988-01-01

A new approach used in constructing a rational data knowledge base system is described. The relational database is well suited for distribution due to its property of allowing data fragmentation and fragmentation transparency. An example is formulated of a simple relational data knowledge base which may be generalized for use in developing a relational distributed data knowledge base system. The efficiency and ease of application of such a data knowledge base management system is briefly discussed. Also discussed are the potentials of the developed model for sharing the data knowledge base as well as the possible areas of difficulty in implementing the relational data knowledge base management system.
HbVar: A relational database of human hemoglobin variants and thalassemia mutations at the globin gene server.

PubMed

Hardison, Ross C; Chui, David H K; Giardine, Belinda; Riemer, Cathy; Patrinos, George P; Anagnou, Nicholas; Miller, Webb; Wajcman, Henri

2002-03-01

We have constructed a relational database of hemoglobin variants and thalassemia mutations, called HbVar, which can be accessed on the web at http://globin.cse.psu.edu. Extensive information is recorded for each variant and mutation, including a description of the variant and associated pathology, hematology, electrophoretic mobility, methods of isolation, stability information, ethnic occurrence, structure studies, functional studies, and references. The initial information was derived from books by Dr. Titus Huisman and colleagues [Huisman et al., 1996, 1997, 1998]. The current database is updated regularly with the addition of new data and corrections to previous data. Queries can be formulated based on fields in the database. Tables of common categories of variants, such as all those involving the alpha1-globin gene (HBA1) or all those that result in high oxygen affinity, are maintained by automated queries on the database. Users can formulate more precise queries, such as identifying "all beta-globin variants associated with instability and found in Scottish populations." This new database should be useful for clinical diagnosis as well as in fundamental studies of hemoglobin biochemistry, globin gene regulation, and human sequence variation at these loci. Copyright 2002 Wiley-Liss, Inc.
Greedy Sampling and Incremental Surrogate Model-Based Tailoring of Aeroservoelastic Model Database for Flexible Aircraft

NASA Technical Reports Server (NTRS)

Wang, Yi; Pant, Kapil; Brenner, Martin J.; Ouellette, Jeffrey A.

2018-01-01

This paper presents a data analysis and modeling framework to tailor and develop linear parameter-varying (LPV) aeroservoelastic (ASE) model database for flexible aircrafts in broad 2D flight parameter space. The Kriging surrogate model is constructed using ASE models at a fraction of grid points within the original model database, and then the ASE model at any flight condition can be obtained simply through surrogate model interpolation. The greedy sampling algorithm is developed to select the next sample point that carries the worst relative error between the surrogate model prediction and the benchmark model in the frequency domain among all input-output channels. The process is iterated to incrementally improve surrogate model accuracy till a pre-determined tolerance or iteration budget is met. The methodology is applied to the ASE model database of a flexible aircraft currently being tested at NASA/AFRC for flutter suppression and gust load alleviation. Our studies indicate that the proposed method can reduce the number of models in the original database by 67%. Even so the ASE models obtained through Kriging interpolation match the model in the original database constructed directly from the physics-based tool with the worst relative error far below 1%. The interpolated ASE model exhibits continuously-varying gains along a set of prescribed flight conditions. More importantly, the selected grid points are distributed non-uniformly in the parameter space, a) capturing the distinctly different dynamic behavior and its dependence on flight parameters, and b) reiterating the need and utility for adaptive space sampling techniques for ASE model database compaction. The present framework is directly extendible to high-dimensional flight parameter space, and can be used to guide the ASE model development, model order reduction, robust control synthesis and novel vehicle design of flexible aircraft.
ARCPHdb: A comprehensive protein database for SF1 and SF2 helicase from archaea.

PubMed

Moukhtar, Mirna; Chaar, Wafi; Abdel-Razzak, Ziad; Khalil, Mohamad; Taha, Samir; Chamieh, Hala

2017-01-01

Superfamily 1 and Superfamily 2 helicases, two of the largest helicase protein families, play vital roles in many biological processes including replication, transcription and translation. Study of helicase proteins in the model microorganisms of archaea have largely contributed to the understanding of their function, architecture and assembly. Based on a large phylogenomics approach, we have identified and classified all SF1 and SF2 protein families in ninety five sequenced archaea genomes. Here we developed an online webserver linked to a specialized protein database named ARCPHdb to provide access for SF1 and SF2 helicase families from archaea. ARCPHdb was implemented using MySQL relational database. Web interfaces were developed using Netbeans. Data were stored according to UniProt accession numbers, NCBI Ref Seq ID, PDB IDs and Entrez Databases. A user-friendly interactive web interface has been developed to browse, search and download archaeal helicase protein sequences, their available 3D structure models, and related documentation available in the literature provided by ARCPHdb. The database provides direct links to matching external databases. The ARCPHdb is the first online database to compile all protein information on SF1 and SF2 helicase from archaea in one platform. This database provides essential resource information for all researchers interested in the field. Copyright © 2016 Elsevier Ltd. All rights reserved.
Reference manual for data base on Nevada water-rights permits

USGS Publications Warehouse

Cartier, K.D.; Bauer, E.M.; Farnham, J.L.

1995-01-01

The U.S. Geological Survey and Nevada Division of Water Resources have cooperatively developed and implemented a data-base system for managing water-rights permit information for the State of Nevada. The Water-Rights Permit data base is part of an integrated system of computer data bases using the Ingres Relational Data-Base Manage-ment System, which allows efficient storage and access to water information from the State Engineer's office. The data base contains a main table, three ancillary tables, and five lookup tables, as well as a menu-driven system for entering, updating, and reporting on the data. This reference guide outlines the general functions of the system and provides a brief description of data tables and data-entry screens.

AgeFactDB—the JenAge Ageing Factor Database—towards data integration in ageing research

PubMed Central

Hühne, Rolf; Thalheim, Torsten; Sühnel, Jürgen

2014-01-01

AgeFactDB (http://agefactdb.jenage.de) is a database aimed at the collection and integration of ageing phenotype data including lifespan information. Ageing factors are considered to be genes, chemical compounds or other factors such as dietary restriction, whose action results in a changed lifespan or another ageing phenotype. Any information related to the effects of ageing factors is called an observation and is presented on observation pages. To provide concise access to the complete information for a particular ageing factor, corresponding observations are also summarized on ageing factor pages. In a first step, ageing-related data were primarily taken from existing databases such as the Ageing Gene Database—GenAge, the Lifespan Observations Database and the Dietary Restriction Gene Database—GenDR. In addition, we have started to include new ageing-related information. Based on homology data taken from the HomoloGene Database, AgeFactDB also provides observation and ageing factor pages of genes that are homologous to known ageing-related genes. These homologues are considered as candidate or putative ageing-related genes. AgeFactDB offers a variety of search and browse options, and also allows the download of ageing factor or observation lists in TSV, CSV and XML formats. PMID:24217911
Creating a sampling frame for population-based veteran research: representativeness and overlap of VA and Department of Defense databases.

PubMed

Washington, Donna L; Sun, Su; Canning, Mark

2010-01-01

Most veteran research is conducted in Department of Veterans Affairs (VA) healthcare settings, although most veterans obtain healthcare outside the VA. Our objective was to determine the adequacy and relative contributions of Veterans Health Administration (VHA), Veterans Benefits Administration (VBA), and Department of Defense (DOD) administrative databases for representing the U.S. veteran population, using as an example the creation of a sampling frame for the National Survey of Women Veterans. In 2008, we merged the VHA, VBA, and DOD databases. We identified the number of unique records both overall and from each database. The combined databases yielded 925,946 unique records, representing 51% of the 1,802,000 U.S. women veteran population. The DOD database included 30% of the population (with 8% overlap with other databases). The VHA enrollment database contributed an additional 20% unique women veterans (with 6% overlap with VBA databases). VBA databases contributed an additional 2% unique women veterans (beyond 10% overlap with other databases). Use of VBA and DOD databases substantially expands access to the population of veterans beyond those in VHA databases, regardless of VA use. Adoption of these additional databases would enhance the value and generalizability of a wide range of studies of both male and female veterans.
DB Dehydrogenase: an online integrated structural database on enzyme dehydrogenase.

PubMed

Nandy, Suman Kumar; Bhuyan, Rajabrata; Seal, Alpana

2012-01-01

Dehydrogenase enzymes are almost inevitable for metabolic processes. Shortage or malfunctioning of dehydrogenases often leads to several acute diseases like cancers, retinal diseases, diabetes mellitus, Alzheimer, hepatitis B & C etc. With advancement in modern-day research, huge amount of sequential, structural and functional data are generated everyday and widens the gap between structural attributes and its functional understanding. DB Dehydrogenase is an effort to relate the functionalities of dehydrogenase with its structures. It is a completely web-based structural database, covering almost all dehydrogenases [~150 enzyme classes, ~1200 entries from ~160 organisms] whose structures are known. It is created by extracting and integrating various online resources to provide the true and reliable data and implemented by MySQL relational database through user friendly web interfaces using CGI Perl. Flexible search options are there for data extraction and exploration. To summarize, sequence, structure, function of all dehydrogenases in one place along with the necessary option of cross-referencing; this database will be utile for researchers to carry out further work in this field. The database is available for free at http://www.bifku.in/DBD/
Systemic Vulnerabilities to Suicide among Veterans from the Iraq and Afghanistan Conflicts: Review of Case Reports from a National Veterans Affairs Database

ERIC Educational Resources Information Center

Mills, Peter D.; Huber, Samuel J.; Watts, Bradley Vince; Bagian, James P.

2011-01-01

While suicide among recently returned veterans is of great concern, it is a relatively rare occurrence within individual hospitals and clinics. Root cause analysis (RCA) generates a detailed case report that can be used to identify system-based vulnerabilities following an adverse event. Review of a national database of RCA reports may identify…
NVST Data Archiving System Based On FastBit NoSQL Database

NASA Astrophysics Data System (ADS)

Liu, Ying-bo; Wang, Feng; Ji, Kai-fan; Deng, Hui; Dai, Wei; Liang, Bo

2014-06-01

The New Vacuum Solar Telescope (NVST) is a 1-meter vacuum solar telescope that aims to observe the fine structures of active regions on the Sun. The main tasks of the NVST are high resolution imaging and spectral observations, including the measurements of the solar magnetic field. The NVST has been collecting more than 20 million FITS files since it began routine observations in 2012 and produces a maximum observational records of 120 thousand files in a day. Given the large amount of files, the effective archiving and retrieval of files becomes a critical and urgent problem. In this study, we implement a new data archiving system for the NVST based on the Fastbit Not Only Structured Query Language (NoSQL) database. Comparing to the relational database (i.e., MySQL; My Structured Query Language), the Fastbit database manifests distinctive advantages on indexing and querying performance. In a large scale database of 40 million records, the multi-field combined query response time of Fastbit database is about 15 times faster and fully meets the requirements of the NVST. Our study brings a new idea for massive astronomical data archiving and would contribute to the design of data management systems for other astronomical telescopes.
HBVPathDB: a database of HBV infection-related molecular interaction network.

PubMed

Zhang, Yi; Bo, Xiao-Chen; Yang, Jing; Wang, Sheng-Qi

2005-03-21

To describe molecules or genes interaction between hepatitis B viruses (HBV) and host, for understanding how virus' and host's genes and molecules are networked to form a biological system and for perceiving mechanism of HBV infection. The knowledge of HBV infection-related reactions was organized into various kinds of pathways with carefully drawn graphs in HBVPathDB. Pathway information is stored with relational database management system (DBMS), which is currently the most efficient way to manage large amounts of data and query is implemented with powerful Structured Query Language (SQL). The search engine is written using Personal Home Page (PHP) with SQL embedded and web retrieval interface is developed for searching with Hypertext Markup Language (HTML). We present the first version of HBVPathDB, which is a HBV infection-related molecular interaction network database composed of 306 pathways with 1 050 molecules involved. With carefully drawn graphs, pathway information stored in HBVPathDB can be browsed in an intuitive way. We develop an easy-to-use interface for flexible accesses to the details of database. Convenient software is implemented to query and browse the pathway information of HBVPathDB. Four search page layout options-category search, gene search, description search, unitized search-are supported by the search engine of the database. The database is freely available at http://www.bio-inf.net/HBVPathDB/HBV/. The conventional perspective HBVPathDB have already contained a considerable amount of pathway information with HBV infection related, which is suitable for in-depth analysis of molecular interaction network of virus and host. HBVPathDB integrates pathway data-sets with convenient software for query, browsing, visualization, that provides users more opportunity to identify regulatory key molecules as potential drug targets and to explore the possible mechanism of HBV infection based on gene expression datasets.
Variability sensitivity of dynamic texture based recognition in clinical CT data

NASA Astrophysics Data System (ADS)

Kwitt, Roland; Razzaque, Sharif; Lowell, Jeffrey; Aylward, Stephen

2014-03-01

Dynamic texture recognition using a database of template models has recently shown promising results for the task of localizing anatomical structures in Ultrasound video. In order to understand its clinical value, it is imperative to study the sensitivity with respect to inter-patient variability as well as sensitivity to acquisition parameters such as Ultrasound probe angle. Fully addressing patient and acquisition variability issues, however, would require a large database of clinical Ultrasound from many patients, acquired in a multitude of controlled conditions, e.g., using a tracked transducer. Since such data is not readily attainable, we advocate an alternative evaluation strategy using abdominal CT data as a surrogate. In this paper, we describe how to replicate Ultrasound variabilities by extracting subvolumes from CT and interpreting the image material as an ordered sequence of video frames. Utilizing this technique, and based on a database of abdominal CT from 45 patients, we report recognition results on an organ (kidney) recognition task, where we try to discriminate kidney subvolumes/videos from a collection of randomly sampled negative instances. We demonstrate that (1) dynamic texture recognition is relatively insensitive to inter-patient variation while (2) viewing angle variability needs to be accounted for in the template database. Since naively extending the template database to counteract variability issues can lead to impractical database sizes, we propose an alternative strategy based on automated identification of a small set of representative models.
MEGGASENSE - The Metagenome/Genome Annotated Sequence Natural Language Search Engine: A Platform for  the Construction of Sequence Data Warehouses.

PubMed

Gacesa, Ranko; Zucko, Jurica; Petursdottir, Solveig K; Gudmundsdottir, Elisabet Eik; Fridjonsson, Olafur H; Diminic, Janko; Long, Paul F; Cullum, John; Hranueli, Daslav; Hreggvidsson, Gudmundur O; Starcevic, Antonio

2017-06-01

The MEGGASENSE platform constructs relational databases of DNA or protein sequences. The default functional analysis uses 14 106 hidden Markov model (HMM) profiles based on sequences in the KEGG database. The Solr search engine allows sophisticated queries and a BLAST search function is also incorporated. These standard capabilities were used to generate the SCATT database from the predicted proteome of Streptomyces cattleya . The implementation of a specialised metagenome database (AMYLOMICS) for bioprospecting of carbohydrate-modifying enzymes is described. In addition to standard assembly of reads, a novel 'functional' assembly was developed, in which screening of reads with the HMM profiles occurs before the assembly. The AMYLOMICS database incorporates additional HMM profiles for carbohydrate-modifying enzymes and it is illustrated how the combination of HMM and BLAST analyses helps identify interesting genes. A variety of different proteome and metagenome databases have been generated by MEGGASENSE.
Multiple imputation as one tool to provide longitudinal databases for modelling human height and weight development.

PubMed

Aßmann, C

2016-06-01

Besides large efforts regarding field work, provision of valid databases requires statistical and informational infrastructure to enable long-term access to longitudinal data sets on height, weight and related issues. To foster use of longitudinal data sets within the scientific community, provision of valid databases has to address data-protection regulations. It is, therefore, of major importance to hinder identifiability of individuals from publicly available databases. To reach this goal, one possible strategy is to provide a synthetic database to the public allowing for pretesting strategies for data analysis. The synthetic databases can be established using multiple imputation tools. Given the approval of the strategy, verification is based on the original data. Multiple imputation by chained equations is illustrated to facilitate provision of synthetic databases as it allows for capturing a wide range of statistical interdependencies. Also missing values, typically occurring within longitudinal databases for reasons of item non-response, can be addressed via multiple imputation when providing databases. The provision of synthetic databases using multiple imputation techniques is one possible strategy to ensure data protection, increase visibility of longitudinal databases and enhance the analytical potential.
Development of the ECODAB into a relational database for Escherichia coli O-antigens and other bacterial polysaccharides.

PubMed

Rojas-Macias, Miguel A; Ståhle, Jonas; Lütteke, Thomas; Widmalm, Göran

2015-03-01

Escherichia coli O-antigen database (ECODAB) is a web-based application to support the collection of E. coli O-antigen structures, polymerase and flippase amino acid sequences, NMR chemical shift data of O-antigens as well as information on glycosyltransferases (GTs) involved in the assembly of O-antigen polysaccharides. The database content has been compiled from scientific literature. Furthermore, the system has evolved from being a repository to one that can be used for generating novel data on its own. GT specificity is suggested through sequence comparison with GTs whose function is known. The migration of ECODAB to a relational database has allowed the automation of all processes to update, retrieve and present information, thereby, endowing the system with greater flexibility and improved overall performance. ECODAB is freely available at http://www.casper.organ.su.se/ECODAB/. Currently, data on 169 E. coli unique O-antigen entries and 338 GTs is covered. Moreover, the scope of the database has been extended so that polysaccharide structure and related information from other bacteria subsequently can be added, for example, from Streptococcus pneumoniae. © The Author 2014. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Background qualitative analysis of the European reference life cycle database (ELCD) energy datasets - part II: electricity datasets.

PubMed

Garraín, Daniel; Fazio, Simone; de la Rúa, Cristina; Recchioni, Marco; Lechón, Yolanda; Mathieux, Fabrice

2015-01-01

The aim of this paper is to identify areas of potential improvement of the European Reference Life Cycle Database (ELCD) electricity datasets. The revision is based on the data quality indicators described by the International Life Cycle Data system (ILCD) Handbook, applied on sectorial basis. These indicators evaluate the technological, geographical and time-related representativeness of the dataset and the appropriateness in terms of completeness, precision and methodology. Results show that ELCD electricity datasets have a very good quality in general terms, nevertheless some findings and recommendations in order to improve the quality of Life-Cycle Inventories have been derived. Moreover, these results ensure the quality of the electricity-related datasets to any LCA practitioner, and provide insights related to the limitations and assumptions underlying in the datasets modelling. Giving this information, the LCA practitioner will be able to decide whether the use of the ELCD electricity datasets is appropriate based on the goal and scope of the analysis to be conducted. The methodological approach would be also useful for dataset developers and reviewers, in order to improve the overall Data Quality Requirements of databases.
A database perspective of the transition from single-use (ancillary-based) systems to integrated models supporting clinical care and research in a MUMPS-based system.

PubMed

Siegel, J; Kirkland, D

1991-01-01

The Composite Health Care System (CHCS), a MUMPS-based hospital information system (HIS), has evolved from the Decentralized Hospital Computer Program (DHCP) installed within VA Hospitals. The authors explore the evolution of an ancillary-based system toward an integrated model with a look at its current state and possible future. The history and relationships between orders of different types tie specific patient-related data into a logical and temporal model. Diagrams demonstrate how the database structure has evolved to support clinical needs for integration. It is suggested that a fully integrated model is capable of meeting traditional HIS needs.
Towards pathogenomics: a web-based resource for pathogenicity islands

PubMed Central

Yoon, Sung Ho; Park, Young-Kyu; Lee, Soohyun; Choi, Doil; Oh, Tae Kwang; Hur, Cheol-Goo; Kim, Jihyun F.

2007-01-01

Pathogenicity islands (PAIs) are genetic elements whose products are essential to the process of disease development. They have been horizontally (laterally) transferred from other microbes and are important in evolution of pathogenesis. In this study, a comprehensive database and search engines specialized for PAIs were established. The pathogenicity island database (PAIDB) is a comprehensive relational database of all the reported PAIs and potential PAI regions which were predicted by a method that combines feature-based analysis and similarity-based analysis. Also, using the PAI Finder search application, a multi-sequence query can be analyzed onsite for the presence of potential PAIs. As of April 2006, PAIDB contains 112 types of PAIs and 889 GenBank accessions containing either partial or all PAI loci previously reported in the literature, which are present in 497 strains of pathogenic bacteria. The database also offers 310 candidate PAIs predicted from 118 sequenced prokaryotic genomes. With the increasing number of prokaryotic genomes without functional inference and sequenced genetic regions of suspected involvement in diseases, this web-based, user-friendly resource has the potential to be of significant use in pathogenomics. PAIDB is freely accessible at . PMID:17090594
The crustal dynamics intelligent user interface anthology

NASA Technical Reports Server (NTRS)

Short, Nicholas M., Jr.; Campbell, William J.; Roelofs, Larry H.; Wattawa, Scott L.

1987-01-01

The National Space Science Data Center (NSSDC) has initiated an Intelligent Data Management (IDM) research effort which has, as one of its components, the development of an Intelligent User Interface (IUI). The intent of the IUI is to develop a friendly and intelligent user interface service based on expert systems and natural language processing technologies. The purpose of such a service is to support the large number of potential scientific and engineering users that have need of space and land-related research and technical data, but have little or no experience in query languages or understanding of the information content or architecture of the databases of interest. This document presents the design concepts, development approach and evaluation of the performance of a prototype IUI system for the Crustal Dynamics Project Database, which was developed using a microcomputer-based expert system tool (M. 1), the natural language query processor THEMIS, and the graphics software system GSS. The IUI design is based on a multiple view representation of a database from both the user and database perspective, with intelligent processes to translate between the views.
A literature search tool for intelligent extraction of disease-associated genes.

PubMed

Jung, Jae-Yoon; DeLuca, Todd F; Nelson, Tristan H; Wall, Dennis P

2014-01-01

To extract disorder-associated genes from the scientific literature in PubMed with greater sensitivity for literature-based support than existing methods. We developed a PubMed query to retrieve disorder-related, original research articles. Then we applied a rule-based text-mining algorithm with keyword matching to extract target disorders, genes with significant results, and the type of study described by the article. We compared our resulting candidate disorder genes and supporting references with existing databases. We demonstrated that our candidate gene set covers nearly all genes in manually curated databases, and that the references supporting the disorder-gene link are more extensive and accurate than other general purpose gene-to-disorder association databases. We implemented a novel publication search tool to find target articles, specifically focused on links between disorders and genotypes. Through comparison against gold-standard manually updated gene-disorder databases and comparison with automated databases of similar functionality we show that our tool can search through the entirety of PubMed to extract the main gene findings for human diseases rapidly and accurately.
A prediction model-based algorithm for computer-assisted database screening of adverse drug reactions in the Netherlands.

PubMed

Scholl, Joep H G; van Hunsel, Florence P A M; Hak, Eelko; van Puijenbroek, Eugène P

2018-02-01

The statistical screening of pharmacovigilance databases containing spontaneously reported adverse drug reactions (ADRs) is mainly based on disproportionality analysis. The aim of this study was to improve the efficiency of full database screening using a prediction model-based approach. A logistic regression-based prediction model containing 5 candidate predictors was developed and internally validated using the Summary of Product Characteristics as the gold standard for the outcome. All drug-ADR associations, with the exception of those related to vaccines, with a minimum of 3 reports formed the training data for the model. Performance was based on the area under the receiver operating characteristic curve (AUC). Results were compared with the current method of database screening based on the number of previously analyzed associations. A total of 25 026 unique drug-ADR associations formed the training data for the model. The final model contained all 5 candidate predictors (number of reports, disproportionality, reports from healthcare professionals, reports from marketing authorization holders, Naranjo score). The AUC for the full model was 0.740 (95% CI; 0.734-0.747). The internal validity was good based on the calibration curve and bootstrapping analysis (AUC after bootstrapping = 0.739). Compared with the old method, the AUC increased from 0.649 to 0.740, and the proportion of potential signals increased by approximately 50% (from 12.3% to 19.4%). A prediction model-based approach can be a useful tool to create priority-based listings for signal detection in databases consisting of spontaneous ADRs. © 2017 The Authors. Pharmacoepidemiology & Drug Safety Published by John Wiley & Sons Ltd.
Supporting user-defined granularities in a spatiotemporal conceptual model

USGS Publications Warehouse

Khatri, V.; Ram, S.; Snodgrass, R.T.; O'Brien, G. M.

2002-01-01

Granularities are integral to spatial and temporal data. A large number of applications require storage of facts along with their temporal and spatial context, which needs to be expressed in terms of appropriate granularities. For many real-world applications, a single granularity in the database is insufficient. In order to support any type of spatial or temporal reasoning, the semantics related to granularities needs to be embedded in the database. Specifying granularities related to facts is an important part of conceptual database design because under-specifying the granularity can restrict an application, affect the relative ordering of events and impact the topological relationships. Closely related to granularities is indeterminacy, i.e., an occurrence time or location associated with a fact that is not known exactly. In this paper, we present an ontology for spatial granularities that is a natural analog of temporal granularities. We propose an upward-compatible, annotation-based spatiotemporal conceptual model that can comprehensively capture the semantics related to spatial and temporal granularities, and indeterminacy without requiring new spatiotemporal constructs. We specify the formal semantics of this spatiotemporal conceptual model via translation to a conventional conceptual model. To underscore the practical focus of our approach, we describe an on-going case study. We apply our approach to a hydrogeologic application at the United States Geologic Survey and demonstrate that our proposed granularity-based spatiotemporal conceptual model is straightforward to use and is comprehensive.
PIPEMicroDB: microsatellite database and primer generation tool for pigeonpea genome

PubMed Central

Sarika; Arora, Vasu; Iquebal, M. A.; Rai, Anil; Kumar, Dinesh

2013-01-01

Molecular markers play a significant role for crop improvement in desirable characteristics, such as high yield, resistance to disease and others that will benefit the crop in long term. Pigeonpea (Cajanus cajan L.) is the recently sequenced legume by global consortium led by ICRISAT (Hyderabad, India) and been analysed for gene prediction, synteny maps, markers, etc. We present PIgeonPEa Microsatellite DataBase (PIPEMicroDB) with an automated primer designing tool for pigeonpea genome, based on chromosome wise as well as location wise search of primers. Total of 123 387 Short Tandem Repeats (STRs) were extracted from pigeonpea genome, available in public domain using MIcroSAtellite tool (MISA). The database is an online relational database based on ‘three-tier architecture’ that catalogues information of microsatellites in MySQL and user-friendly interface is developed using PHP. Search for STRs may be customized by limiting their location on chromosome as well as number of markers in that range. This is a novel approach and is not been implemented in any of the existing marker database. This database has been further appended with Primer3 for primer designing of selected markers with left and right flankings of size up to 500 bp. This will enable researchers to select markers of choice at desired interval over the chromosome. Furthermore, one can use individual STRs of a targeted region over chromosome to narrow down location of gene of interest or linked Quantitative Trait Loci (QTLs). Although it is an in silico approach, markers’ search based on characteristics and location of STRs is expected to be beneficial for researchers. Database URL: http://cabindb.iasri.res.in/pigeonpea/ PMID:23396298
PIPEMicroDB: microsatellite database and primer generation tool for pigeonpea genome.

PubMed

Sarika; Arora, Vasu; Iquebal, M A; Rai, Anil; Kumar, Dinesh

2013-01-01

Molecular markers play a significant role for crop improvement in desirable characteristics, such as high yield, resistance to disease and others that will benefit the crop in long term. Pigeonpea (Cajanus cajan L.) is the recently sequenced legume by global consortium led by ICRISAT (Hyderabad, India) and been analysed for gene prediction, synteny maps, markers, etc. We present PIgeonPEa Microsatellite DataBase (PIPEMicroDB) with an automated primer designing tool for pigeonpea genome, based on chromosome wise as well as location wise search of primers. Total of 123 387 Short Tandem Repeats (STRs) were extracted from pigeonpea genome, available in public domain using MIcroSAtellite tool (MISA). The database is an online relational database based on 'three-tier architecture' that catalogues information of microsatellites in MySQL and user-friendly interface is developed using PHP. Search for STRs may be customized by limiting their location on chromosome as well as number of markers in that range. This is a novel approach and is not been implemented in any of the existing marker database. This database has been further appended with Primer3 for primer designing of selected markers with left and right flankings of size up to 500 bp. This will enable researchers to select markers of choice at desired interval over the chromosome. Furthermore, one can use individual STRs of a targeted region over chromosome to narrow down location of gene of interest or linked Quantitative Trait Loci (QTLs). Although it is an in silico approach, markers' search based on characteristics and location of STRs is expected to be beneficial for researchers. Database URL: http://cabindb.iasri.res.in/pigeonpea/
A mapping review of the literature on UK-focused health and social care databases.

PubMed

Cooper, Chris; Rogers, Morwenna; Bethel, Alison; Briscoe, Simon; Lowe, Jenny

2015-03-01

Bibliographic databases are a day-to-day tool of the researcher: they offer the researcher easy and organised access to knowledge, but how much is actually known about the databases on offer? The focus of this paper is UK health and social care databases. These databases are often small, specialised by topic, and provide a complementary literature to the large, international databases. There is, however, good evidence that these databases are overlooked in systematic reviews, perhaps because little is known about what they can offer. To systematically locate and map, published and unpublished literature on the key UK health and social care bibliographic databases. Systematic searching and mapping. Two hundred and forty-two items were identified which specifically related to the 24 of the 34 databases under review. There is little published or unpublished literature specifically analysing the key UK health and social care databases. Since several UK databases have closed, others are at risk, and some are overlooked in reviews, better information is required to enhance our knowledge. Further research on UK health and social care databases is required. This paper suggests the need to develop the evidence base through a series of case studies on each of the databases. © 2014 The authors. Health Information and Libraries Journal © 2014 Health Libraries Journal.

Database for the degradation risk assessment of groundwater resources (Southern Italy)

NASA Astrophysics Data System (ADS)

Polemio, M.; Dragone, V.; Mitolo, D.

2003-04-01

The risk characterisation of quality degradation and availability lowering of groundwater resources has been pursued for a wide coastal plain (Basilicata region, Southern Italy), an area covering 40 km along the Ionian Sea and 10 km inland. The quality degradation is due two phenomena: pollution due to discharge of waste water (coming from urban areas) and due to salt pollution, related to seawater intrusion but not only. The availability lowering is due to overexploitation but also due to drought effects. To this purpose the historical data of 1,130 wells have been collected. Wells, homogenously distributed in the area, were the source of geological, stratigraphical, hydrogeological, geochemical data. In order to manage space-related information via a GIS, a database system has been devised to encompass all the surveyed wells and the body of information available per well. Geo-databases were designed to comprise the four types of data collected: a database including geometrical, geological and hydrogeological data on wells (WDB), a database devoted to chemical and physical data on groundwater (CDB), a database including the geotechnical parameters (GDB), a database concering piezometric and hydrological (rainfall, air temperature, river discharge) data (HDB). The record pertaining to each well is identified in these databases by the progressive number of the well itself. Every database is designed as follows: a) the HDB contains 1,158 records, 28 of and 31 fields, mainly describing the geometry of the well and of the stratigraphy; b) the CDB encompasses data about 157 wells, based on which the chemical and physical analyses of groundwater have been carried out. More than one record has been associated with these 157 wells, due to periodic monitoring and analysis; c) the GDB covers 61 wells to which the geotechnical parameters obtained by soil samples taken at various depths; the HDB is designed to permit the analysis of long time series (from 1918) of piezometric data, monitored by more than 60 wells, temperature, rainfall and river discharge data. Based on geo-databases, the geostatistical processing of data has permitted to characterise the degradation risk of groundwater resources of a wide coastal aquifer.
A new natural hazards data-base for volcanic ash and SO2 from global satellite remote sensing measurements

NASA Astrophysics Data System (ADS)

Prata, F.; Stebel, K.

2013-12-01

Over the last few years there has been a recognition of the utility of satellite measurements to identify and track volcanic emissions that present a natural hazard to human populations. Mitigation of the volcanic hazard to life and the environment requires understanding of the properties of volcanic emissions, identifying the hazard in near real-time and being able to provide timely and accurate forecasts to affected areas. Amongst the many ways to measure volcanic emissions, satellite remote sensing is capable of providing global quantitative retrievals of important microphysical parameters such as ash mass loading, ash particle effective radius, infrared optical depth, SO2 partial and total column abundance, plume altitude, aerosol optical depth and aerosol absorbing index. The eruption of Eyjafjallajokull in April-May, 2010 led to increased research and measurement programs to better characterize properties of volcanic ash and the need to establish a data-base in which to store and access these data was confirmed. The European Space Agency (ESA) has recognized the importance of having a quality controlled data-base of satellite retrievals and has funded an activity (VAST) to develop novel remote sensing retrieval schemes and a data-base, initially focused on several recent hazardous volcanic eruptions. As a first step, satellite retrievals for the eruptions of Eyjafjallajokull, Grimsvotn, Puyhue-Cordon Caulle, Nabro, Merapi, Okmok, Kasatochi and Sarychev Peak are being considered. Here we describe the data, retrievals and methods being developed for the data-base. Three important applications of the data-base are illustrated related to the ash/aviation problem, to the impact of the Merapi volcanic eruption on the local population, and to estimate SO2 fluxes from active volcanoes-as a means to diagnose future unrest. Dispersion model simulations are also being included in the data-base. In time, data from conventional in situ sampling instruments, airborne and ground-based remote sensing platforms and other meta-data (bulk ash and gas properties, volcanic setting, volcanic eruption chronologies, hazards and impacts etc.) will be added. The data-base has the potential to provide the natural hazards community with the first dynamic atmospheric volcanic hazards map and will be a valuable tool particularly for global transport.
Class dependency of fuzzy relational database using relational calculus and conditional probability

NASA Astrophysics Data System (ADS)

Deni Akbar, Mohammad; Mizoguchi, Yoshihiro; Adiwijaya

2018-03-01

In this paper, we propose a design of fuzzy relational database to deal with a conditional probability relation using fuzzy relational calculus. In the previous, there are several researches about equivalence class in fuzzy database using similarity or approximate relation. It is an interesting topic to investigate the fuzzy dependency using equivalence classes. Our goal is to introduce a formulation of a fuzzy relational database model using the relational calculus on the category of fuzzy relations. We also introduce general formulas of the relational calculus for the notion of database operations such as ’projection’, ’selection’, ’injection’ and ’natural join’. Using the fuzzy relational calculus and conditional probabilities, we introduce notions of equivalence class, redundant, and dependency in the theory fuzzy relational database.
Transaction Logging.

ERIC Educational Resources Information Center

Jones, S.; And Others

1997-01-01

Discusses the use of transaction logging in Okapi-related projects to allow search algorithms and user interfaces to be investigated, evaluated, and compared. A series of examples is presented, illustrating logging software for character-based and graphical user interface systems, and demonstrating the usefulness of relational database management…
PR Bibliography, 1994.

ERIC Educational Resources Information Center

Walker, Albert, Ed.

1994-01-01

Based on searches of databases and over 140 periodicals, this annotated bibliography presents a representative collection of books and journal articles related to the knowledge and practice of public relations published in 1993. The annotated bibliography is subdivided into 35 different categories, including business credibility and ethics;…
Technology for organization of the onboard system for processing and storage of ERS data for ultrasmall spacecraft

NASA Astrophysics Data System (ADS)

Strotov, Valery V.; Taganov, Alexander I.; Konkin, Yuriy V.; Kolesenkov, Aleksandr N.

2017-10-01

Task of processing and analysis of obtained Earth remote sensing data on ultra-small spacecraft board is actual taking into consideration significant expenditures of energy for data transfer and low productivity of computers. Thereby, there is an issue of effective and reliable storage of the general information flow obtained from onboard systems of information collection, including Earth remote sensing data, into a specialized data base. The paper has considered peculiarities of database management system operation with the multilevel memory structure. For storage of data in data base the format has been developed that describes a data base physical structure which contains required parameters for information loading. Such structure allows reducing a memory size occupied by data base because it is not necessary to store values of keys separately. The paper has shown architecture of the relational database management system oriented into embedment into the onboard ultra-small spacecraft software. Data base for storage of different information, including Earth remote sensing data, can be developed by means of such database management system for its following processing. Suggested database management system architecture has low requirements to power of the computer systems and memory resources on the ultra-small spacecraft board. Data integrity is ensured under input and change of the structured information.
Integration of Evidence Base into a Probabilistic Risk Assessment

NASA Technical Reports Server (NTRS)

Saile, Lyn; Lopez, Vilma; Bickham, Grandin; Kerstman, Eric; FreiredeCarvalho, Mary; Byrne, Vicky; Butler, Douglas; Myers, Jerry; Walton, Marlei

2011-01-01

INTRODUCTION: A probabilistic decision support model such as the Integrated Medical Model (IMM) utilizes an immense amount of input data that necessitates a systematic, integrated approach for data collection, and management. As a result of this approach, IMM is able to forecasts medical events, resource utilization and crew health during space flight. METHODS: Inflight data is the most desirable input for the Integrated Medical Model. Non-attributable inflight data is collected from the Lifetime Surveillance for Astronaut Health study as well as the engineers, flight surgeons, and astronauts themselves. When inflight data is unavailable cohort studies, other models and Bayesian analyses are used, in addition to subject matters experts input on occasion. To determine the quality of evidence of a medical condition, the data source is categorized and assigned a level of evidence from 1-5; the highest level is one. The collected data reside and are managed in a relational SQL database with a web-based interface for data entry and review. The database is also capable of interfacing with outside applications which expands capabilities within the database itself. Via the public interface, customers can access a formatted Clinical Findings Form (CLiFF) that outlines the model input and evidence base for each medical condition. Changes to the database are tracked using a documented Configuration Management process. DISSCUSSION: This strategic approach provides a comprehensive data management plan for IMM. The IMM Database s structure and architecture has proven to support additional usages. As seen by the resources utilization across medical conditions analysis. In addition, the IMM Database s web-based interface provides a user-friendly format for customers to browse and download the clinical information for medical conditions. It is this type of functionality that will provide Exploratory Medicine Capabilities the evidence base for their medical condition list. CONCLUSION: The IMM Database in junction with the IMM is helping NASA aerospace program improve the health care and reduce risk for the astronauts crew. Both the database and model will continue to expand to meet customer needs through its multi-disciplinary evidence based approach to managing data. Future expansion could serve as a platform for a Space Medicine Wiki of medical conditions.
DEXTER: Disease-Expression Relation Extraction from Text.

PubMed

Gupta, Samir; Dingerdissen, Hayley; Ross, Karen E; Hu, Yu; Wu, Cathy H; Mazumder, Raja; Vijay-Shanker, K

2018-01-01

Gene expression levels affect biological processes and play a key role in many diseases. Characterizing expression profiles is useful for clinical research, and diagnostics and prognostics of diseases. There are currently several high-quality databases that capture gene expression information, obtained mostly from large-scale studies, such as microarray and next-generation sequencing technologies, in the context of disease. The scientific literature is another rich source of information on gene expression-disease relationships that not only have been captured from large-scale studies but have also been observed in thousands of small-scale studies. Expression information obtained from literature through manual curation can extend expression databases. While many of the existing databases include information from literature, they are limited by the time-consuming nature of manual curation and have difficulty keeping up with the explosion of publications in the biomedical field. In this work, we describe an automated text-mining tool, Disease-Expression Relation Extraction from Text (DEXTER) to extract information from literature on gene and microRNA expression in the context of disease. One of the motivations in developing DEXTER was to extend the BioXpress database, a cancer-focused gene expression database that includes data derived from large-scale experiments and manual curation of publications. The literature-based portion of BioXpress lags behind significantly compared to expression information obtained from large-scale studies and can benefit from our text-mined results. We have conducted two different evaluations to measure the accuracy of our text-mining tool and achieved average F-scores of 88.51 and 81.81% for the two evaluations, respectively. Also, to demonstrate the ability to extract rich expression information in different disease-related scenarios, we used DEXTER to extract information on differential expression information for 2024 genes in lung cancer, 115 glycosyltransferases in 62 cancers and 826 microRNA in 171 cancers. All extractions using DEXTER are integrated in the literature-based portion of BioXpress.Database URL: http://biotm.cis.udel.edu/DEXTER.
A web-based, relational database for studying glaciers in the Italian Alps

NASA Astrophysics Data System (ADS)

Nigrelli, G.; Chiarle, M.; Nuzzi, A.; Perotti, L.; Torta, G.; Giardino, M.

2013-02-01

Glaciers are among the best terrestrial indicators of climate change and thus glacier inventories have attracted a growing, worldwide interest in recent years. In Italy, the first official glacier inventory was completed in 1925 and 774 glacial bodies were identified. As the amount of data continues to increase, and new techniques become available, there is a growing demand for computer tools that can efficiently manage the collected data. The Research Institute for Geo-hydrological Protection of the National Research Council, in cooperation with the Departments of Computer Science and Earth Sciences of the University of Turin, created a database that provides a modern tool for storing, processing and sharing glaciological data. The database was developed according to the need of storing heterogeneous information, which can be retrieved through a set of web search queries. The database's architecture is server-side, and was designed by means of an open source software. The website interface, simple and intuitive, was intended to meet the needs of a distributed public: through this interface, any type of glaciological data can be managed, specific queries can be performed, and the results can be exported in a standard format. The use of a relational database to store and organize a large variety of information about Italian glaciers collected over the last hundred years constitutes a significant step forward in ensuring the safety and accessibility of such data. Moreover, the same benefits also apply to the enhanced operability for handling information in the future, including new and emerging types of data formats, such as geographic and multimedia files. Future developments include the integration of cartographic data, such as base maps, satellite images and vector data. The relational database described in this paper will be the heart of a new geographic system that will merge data, data attributes and maps, leading to a complete description of Italian glacial environments.
Transactional Database Transformation and Its Application in Prioritizing Human Disease Genes

PubMed Central

Xiang, Yang; Payne, Philip R.O.; Huang, Kun

2013-01-01

Binary (0,1) matrices, commonly known as transactional databases, can represent many application data, including gene-phenotype data where “1” represents a confirmed gene-phenotype relation and “0” represents an unknown relation. It is natural to ask what information is hidden behind these “0”s and “1”s. Unfortunately, recent matrix completion methods, though very effective in many cases, are less likely to infer something interesting from these (0,1)-matrices. To answer this challenge, we propose IndEvi, a very succinct and effective algorithm to perform independent-evidence-based transactional database transformation. Each entry of a (0,1)-matrix is evaluated by “independent evidence” (maximal supporting patterns) extracted from the whole matrix for this entry. The value of an entry, regardless of its value as 0 or 1, has completely no effect for its independent evidence. The experiment on a gene-phenotype database shows that our method is highly promising in ranking candidate genes and predicting unknown disease genes. PMID:21422495
EDULISS: a small-molecule database with data-mining and pharmacophore searching capabilities

PubMed Central

Hsin, Kun-Yi; Morgan, Hugh P.; Shave, Steven R.; Hinton, Andrew C.; Taylor, Paul; Walkinshaw, Malcolm D.

2011-01-01

We present the relational database EDULISS (EDinburgh University Ligand Selection System), which stores structural, physicochemical and pharmacophoric properties of small molecules. The database comprises a collection of over 4 million commercially available compounds from 28 different suppliers. A user-friendly web-based interface for EDULISS (available at http://eduliss.bch.ed.ac.uk/) has been established providing a number of data-mining possibilities. For each compound a single 3D conformer is stored along with over 1600 calculated descriptor values (molecular properties). A very efficient method for unique compound recognition, especially for a large scale database, is demonstrated by making use of small subgroups of the descriptors. Many of the shape and distance descriptors are held as pre-calculated bit strings permitting fast and efficient similarity and pharmacophore searches which can be used to identify families of related compounds for biological testing. Two ligand searching applications are given to demonstrate how EDULISS can be used to extract families of molecules with selected structural and biophysical features. PMID:21051336
Updates to the Virtual Atomic and Molecular Data Centre

NASA Astrophysics Data System (ADS)

Hill, Christian; Tennyson, Jonathan; Gordon, Iouli E.; Rothman, Laurence S.; Dubernet, Marie-Lise

2014-06-01

The Virtual Atomic and Molecular Data Centre (VAMDC) has established a set of standards for the storage and transmission of atomic and molecular data and an SQL-based query language (VSS2) for searching online databases, known as nodes. The project has also created an online service, the VAMDC Portal, through which all of these databases may be searched and their results compared and aggregated. Since its inception four years ago, the VAMDC e-infrastructure has grown to encompass over 40 databases, including HITRAN, in more than 20 countries and engages actively with scientists in six continents. Associated with the portal are a growing suite of software tools for the transformation of data from its native, XML-based, XSAMS format, to a range of more convenient human-readable (such as HTML) and machinereadable (such as CSV) formats. The relational database for HITRAN1, created as part of the VAMDC project is a flexible and extensible data model which is able to represent a wider range of parameters than the current fixed-format text-based one. Over the next year, a new online interface to this database will be tested, released and fully documented - this web application, HITRANonline2, will fully replace the ageing and incomplete JavaHAWKS software suite.
Sieve-based coreference resolution enhances semi-supervised learning model for chemical-induced disease relation extraction.

PubMed

Le, Hoang-Quynh; Tran, Mai-Vu; Dang, Thanh Hai; Ha, Quang-Thuy; Collier, Nigel

2016-07-01

The BioCreative V chemical-disease relation (CDR) track was proposed to accelerate the progress of text mining in facilitating integrative understanding of chemicals, diseases and their relations. In this article, we describe an extension of our system (namely UET-CAM) that participated in the BioCreative V CDR. The original UET-CAM system's performance was ranked fourth among 18 participating systems by the BioCreative CDR track committee. In the Disease Named Entity Recognition and Normalization (DNER) phase, our system employed joint inference (decoding) with a perceptron-based named entity recognizer (NER) and a back-off model with Semantic Supervised Indexing and Skip-gram for named entity normalization. In the chemical-induced disease (CID) relation extraction phase, we proposed a pipeline that includes a coreference resolution module and a Support Vector Machine relation extraction model. The former module utilized a multi-pass sieve to extend entity recall. In this article, the UET-CAM system was improved by adding a 'silver' CID corpus to train the prediction model. This silver standard corpus of more than 50 thousand sentences was automatically built based on the Comparative Toxicogenomics Database (CTD) database. We evaluated our method on the CDR test set. Results showed that our system could reach the state of the art performance with F1 of 82.44 for the DNER task and 58.90 for the CID task. Analysis demonstrated substantial benefits of both the multi-pass sieve coreference resolution method (F1 + 4.13%) and the silver CID corpus (F1 +7.3%).Database URL: SilverCID-The silver-standard corpus for CID relation extraction is freely online available at: https://zenodo.org/record/34530 (doi:10.5281/zenodo.34530). © The Author(s) 2016. Published by Oxford University Press.
The database of the PREDICTS (Projecting Responses of Ecological Diversity In Changing Terrestrial Systems) project

Treesearch

Lawrence N. Hudson; Joseph Wunderle M.; And Others

2016-01-01

The PREDICTS projectâProjecting Responses of Ecological Diversity In Changing Terrestrial Systems (www.predicts.org.uk)âhas collated from published studies a large, reasonably representative database of comparable samples of biodiversity from multiple sites that differ in the nature or intensity of human impacts relating to land use. We have used this evidence base to...
Assessing the quality of life history information in publicly available databases.

PubMed

Thorson, James T; Cope, Jason M; Patrick, Wesley S

2014-01-01

Single-species life history parameters are central to ecological research and management, including the fields of macro-ecology, fisheries science, and ecosystem modeling. However, there has been little independent evaluation of the precision and accuracy of the life history values in global and publicly available databases. We therefore develop a novel method based on a Bayesian errors-in-variables model that compares database entries with estimates from local experts, and we illustrate this process by assessing the accuracy and precision of entries in FishBase, one of the largest and oldest life history databases. This model distinguishes biases among seven life history parameters, two types of information available in FishBase (i.e., published values and those estimated from other parameters), and two taxa (i.e., bony and cartilaginous fishes) relative to values from regional experts in the United States, while accounting for additional variance caused by sex- and region-specific life history traits. For published values in FishBase, the model identifies a small positive bias in natural mortality and negative bias in maximum age, perhaps caused by unacknowledged mortality caused by fishing. For life history values calculated by FishBase, the model identified large and inconsistent biases. The model also demonstrates greatest precision for body size parameters, decreased precision for values derived from geographically distant populations, and greatest between-sex differences in age at maturity. We recommend that our bias and precision estimates be used in future errors-in-variables models as a prior on measurement errors. This approach is broadly applicable to global databases of life history traits and, if used, will encourage further development and improvements in these databases.
Effects of Soil Data and Simulation Unit Resolution on Quantifying Changes of Soil Organic Carbon at Regional Scale with a Biogeochemical Process Model

PubMed Central

Zhang, Liming; Yu, Dongsheng; Shi, Xuezheng; Xu, Shengxiang; Xing, Shihe; Zhao, Yongcong

2014-01-01

Soil organic carbon (SOC) models were often applied to regions with high heterogeneity, but limited spatially differentiated soil information and simulation unit resolution. This study, carried out in the Tai-Lake region of China, defined the uncertainty derived from application of the DeNitrification-DeComposition (DNDC) biogeochemical model in an area with heterogeneous soil properties and different simulation units. Three different resolution soil attribute databases, a polygonal capture of mapping units at 1∶50,000 (P5), a county-based database of 1∶50,000 (C5) and county-based database of 1∶14,000,000 (C14), were used as inputs for regional DNDC simulation. The P5 and C5 databases were combined with the 1∶50,000 digital soil map, which is the most detailed soil database for the Tai-Lake region. The C14 database was combined with 1∶14,000,000 digital soil map, which is a coarse database and is often used for modeling at a national or regional scale in China. The soil polygons of P5 database and county boundaries of C5 and C14 databases were used as basic simulation units. Results project that from 1982 to 2000, total SOC change in the top layer (0–30 cm) of the 2.3 M ha of paddy soil in the Tai-Lake region was +1.48 Tg C, −3.99 Tg C and −15.38 Tg C based on P5, C5 and C14 databases, respectively. With the total SOC change as modeled with P5 inputs as the baseline, which is the advantages of using detailed, polygon-based soil dataset, the relative deviation of C5 and C14 were 368% and 1126%, respectively. The comparison illustrates that DNDC simulation is strongly influenced by choice of fundamental geographic resolution as well as input soil attribute detail. The results also indicate that improving the framework of DNDC is essential in creating accurate models of the soil carbon cycle. PMID:24523922
Population-based drug-related anaphylaxis in children and adolescents captured by South Carolina Emergency Room Hospital Discharge Database (SCERHDD) (2000-2002).

PubMed

West, Suzanne L; D'Aloisio, Aimee A; Ringel-Kulka, Tamar; Waller, Anna E; Clayton Bordley, W

2007-12-01

Anaphylaxis is a life-threatening condition; drug-related anaphylaxis represents approximately 10% of all cases. We assessed the utility of a statewide emergency department (ED) database for identifying drug-related anaphylaxis in children by developing and validating an algorithm composed of ICD-9-CM codes. There were 1 314,760 visits to South Carolina (SC) emergency departments (EDs) for patients <19 years in 2000-2002. We used ICD-9-CM disease or external cause of injury codes (E-codes) that suggested drug-related anaphylaxis or a severe drug-related allergic reaction. We found 50 cases classifiable as probable or possible drug-related anaphylaxis and 13 as drug-related allergic reactions. We used clinical evaluation by two pediatricians as the 'alloyed gold standard'1 for estimating sensitivity, specificity, and positive predictive value (PPV) of our algorithm. ED-treated drug-related anaphylaxis in the SC pediatric population was 1.56/100,000 person-years based on the algorithm and 0.50/100,000 person-years based on clinical evaluation. Assuming the disease codes we used identified all potential anaphylaxis cases in the database, the sensitivity was 1.00 (95%CI: 0.79, 1.00), specificity was 0.28 (95%CI: 0.16, 0.43), and the PPV was 0.32 (0.20, 0.47) for the algorithm. Sensitivity analyses improved the measurement properties of the algorithm. E-codes were invaluable for developing an anaphylaxis algorithm although the frequently used code of E947.9 was often incorrectly applied. We believe that our algorithm may have over-ascertained drug-related anaphylaxis patients seen in an ED, but the clinical evaluation may have under-represented this diagnosis due to limited information on the offending agent in the abstracted ED records. Post-marketing drug surveillance using ED records may be viable if clinicians were to document drug-related anaphylaxis in the charts so that billing codes could be assigned properly. Copyright 2007 John Wiley & Sons, Ltd.
AbMiner: a bioinformatic resource on available monoclonal antibodies and corresponding gene identifiers for genomic, proteomic, and immunologic studies.

PubMed

Major, Sylvia M; Nishizuka, Satoshi; Morita, Daisaku; Rowland, Rick; Sunshine, Margot; Shankavaram, Uma; Washburn, Frank; Asin, Daniel; Kouros-Mehr, Hosein; Kane, David; Weinstein, John N

2006-04-06

Monoclonal antibodies are used extensively throughout the biomedical sciences for detection of antigens, either in vitro or in vivo. We, for example, have used them for quantitation of proteins on "reverse-phase" protein lysate arrays. For those studies, we quality-controlled > 600 available monoclonal antibodies and also needed to develop precise information on the genes that encode their antigens. Translation among the various protein and gene identifier types proved non-trivial because of one-to-many and many-to-one relationships. To organize the antibody, protein, and gene information, we initially developed a relational database in Filemaker for our own use. When it became apparent that the information would be useful to many other researchers faced with the need to choose or characterize antibodies, we developed it further as AbMiner, a fully relational web-based database under MySQL, programmed in Java. AbMiner is a user-friendly, web-based relational database of information on > 600 commercially available antibodies that we validated by Western blot for protein microarray studies. It includes many types of information on the antibody, the immunogen, the vendor, the antigen, and the antigen's gene. Multiple gene and protein identifier types provide links to corresponding entries in a variety of other public databases, including resources for phosphorylation-specific antibodies. AbMiner also includes our quality-control data against a pool of 60 diverse cancer cell types (the NCI-60) and also protein expression levels for the NCI-60 cells measured using our high-density "reverse-phase" protein lysate microarrays for a selection of the listed antibodies. Some other available database resources give information on antibody specificity for one or a couple of cell types. In contrast, the data in AbMiner indicate specificity with respect to the antigens in a pool of 60 diverse cell types from nine different tissues of origin. AbMiner is a relational database that provides extensive information from our own laboratory and other sources on more than 600 available antibodies and the genes that encode the antibodies' antigens. The data will be made freely available at http://discover.nci.nih.gov/abminer.
Historical seismometry database project: A comprehensive relational database for historical seismic records

NASA Astrophysics Data System (ADS)

Bono, Andrea

2007-01-01

The recovery and preservation of the patrimony made of the instrumental registrations regarding the historical earthquakes is with no doubt a subject of great interest. This attention, besides being purely historical, must necessarily be also scientific. In fact, the availability of a great amount of parametric information on the seismic activity in a given area is a doubtless help to the seismologic researcher's activities. In this article the project of the Sismos group of the National Institute of Geophysics and Volcanology of Rome new database is presented. In the structure of the new scheme the matured experience of five years of activity is summarized. We consider it useful for those who are approaching to "recovery and reprocess" computer based facilities. In the past years several attempts on Italian seismicity have followed each other. It has almost never been real databases. Some of them have had positive success because they were well considered and organized. In others it was limited in supplying lists of events with their relative hypocentral standards. What makes this project more interesting compared to the previous work is the completeness and the generality of the managed information. For example, it will be possible to view the hypocentral information regarding a given historical earthquake; it will be possible to research the seismograms in raster, digital or digitalized format, the information on times of arrival of the phases in the various stations, the instrumental standards and so on. The relational modern logic on which the archive is based, allows the carrying out of all these operations with little effort. The database described below will completely substitute Sismos' current data bank. Some of the organizational principles of this work are similar to those that inspire the database for the real-time monitoring of the seismicity in use in the principal offices of international research. A modern planning logic in a distinctly historical context is introduced. Following are the descriptions of the various planning phases, from the conceptual level to the physical implementation of the scheme. Each time principle instructions, rules, considerations of technical-scientific nature are highlighted that take to the final result: a vanguard relational scheme for historical data.
Sequential data access with Oracle and Hadoop: a performance comparison

NASA Astrophysics Data System (ADS)

Baranowski, Zbigniew; Canali, Luca; Grancher, Eric

2014-06-01

The Hadoop framework has proven to be an effective and popular approach for dealing with "Big Data" and, thanks to its scaling ability and optimised storage access, Hadoop Distributed File System-based projects such as MapReduce or HBase are seen as candidates to replace traditional relational database management systems whenever scalable speed of data processing is a priority. But do these projects deliver in practice? Does migrating to Hadoop's "shared nothing" architecture really improve data access throughput? And, if so, at what cost? Authors answer these questions-addressing cost/performance as well as raw performance- based on a performance comparison between an Oracle-based relational database and Hadoop's distributed solutions like MapReduce or HBase for sequential data access. A key feature of our approach is the use of an unbiased data model as certain data models can significantly favour one of the technologies tested.

PomBase: a comprehensive online resource for fission yeast

PubMed Central

Wood, Valerie; Harris, Midori A.; McDowall, Mark D.; Rutherford, Kim; Vaughan, Brendan W.; Staines, Daniel M.; Aslett, Martin; Lock, Antonia; Bähler, Jürg; Kersey, Paul J.; Oliver, Stephen G.

2012-01-01

PomBase (www.pombase.org) is a new model organism database established to provide access to comprehensive, accurate, and up-to-date molecular data and biological information for the fission yeast Schizosaccharomyces pombe to effectively support both exploratory and hypothesis-driven research. PomBase encompasses annotation of genomic sequence and features, comprehensive manual literature curation and genome-wide data sets, and supports sophisticated user-defined queries. The implementation of PomBase integrates a Chado relational database that houses manually curated data with Ensembl software that supports sequence-based annotation and web access. PomBase will provide user-friendly tools to promote curation by experts within the fission yeast community. This will make a key contribution to shaping its content and ensuring its comprehensiveness and long-term relevance. PMID:22039153
A Web-based database for pathology faculty effort reporting.

PubMed

Dee, Fred R; Haugen, Thomas H; Wynn, Philip A; Leaven, Timothy C; Kemp, John D; Cohen, Michael B

2008-04-01

To ensure appropriate mission-based budgeting and equitable distribution of funds for faculty salaries, our compensation committee developed a pathology-specific effort reporting database. Principles included the following: (1) measurement should be done by web-based databases; (2) most entry should be done by departmental administration or be relational to other databases; (3) data entry categories should be aligned with funding streams; and (4) units of effort should be equal across categories of effort (service, teaching, research). MySQL was used for all data transactions (http://dev.mysql.com/downloads), and scripts were constructed using PERL (http://www.perl.org). Data are accessed with forms that correspond to fields in the database. The committee's work resulted in a novel database using pathology value units (PVUs) as a standard quantitative measure of effort for activities in an academic pathology department. The most common calculation was to estimate the number of hours required for a specific task, divide by 2080 hours (a Medicare year) and then multiply by 100. Other methods included assigning a baseline PVU for program, laboratory, or course directorship with an increment for each student or staff in that unit. With these methods, a faculty member should acquire approximately 100 PVUs. Some outcomes include (1) plotting PVUs versus salary to identify outliers for salary correction, (2) quantifying effort in activities outside the department, (3) documenting salary expenditure for unfunded research, (4) evaluating salary equity by plotting PVUs versus salary by sex, and (5) aggregating data by category of effort for mission-based budgeting and long-term planning.
Design and deployment of a large brain-image database for clinical and nonclinical research

NASA Astrophysics Data System (ADS)

Yang, Guo Liang; Lim, Choie Cheio Tchoyoson; Banukumar, Narayanaswami; Aziz, Aamer; Hui, Francis; Nowinski, Wieslaw L.

2004-04-01

An efficient database is an essential component of organizing diverse information on image metadata and patient information for research in medical imaging. This paper describes the design, development and deployment of a large database system serving as a brain image repository that can be used across different platforms in various medical researches. It forms the infrastructure that links hospitals and institutions together and shares data among them. The database contains patient-, pathology-, image-, research- and management-specific data. The functionalities of the database system include image uploading, storage, indexing, downloading and sharing as well as database querying and management with security and data anonymization concerns well taken care of. The structure of database is multi-tier client-server architecture with Relational Database Management System, Security Layer, Application Layer and User Interface. Image source adapter has been developed to handle most of the popular image formats. The database has a user interface based on web browsers and is easy to handle. We have used Java programming language for its platform independency and vast function libraries. The brain image database can sort data according to clinically relevant information. This can be effectively used in research from the clinicians" points of view. The database is suitable for validation of algorithms on large population of cases. Medical images for processing could be identified and organized based on information in image metadata. Clinical research in various pathologies can thus be performed with greater efficiency and large image repositories can be managed more effectively. The prototype of the system has been installed in a few hospitals and is working to the satisfaction of the clinicians.
Physical Activity and Yoga-Based Approaches for Pregnancy-Related Low Back and Pelvic Pain.

PubMed

Kinser, Patricia Anne; Pauli, Jena; Jallo, Nancy; Shall, Mary; Karst, Kailee; Hoekstra, Michelle; Starkweather, Angela

To conduct an integrative review to evaluate current literature about nonpharmacologic, easily accessible management strategies for pregnancy-related low back and pelvic pain (PR-LBPP). PubMed, CINAHL, Cochrane Database of Systematic Reviews. Original research articles were considered for review if they were full-length publications written in English and published in peer-reviewed journals from 2005 through 2015, included measures of pain and symptoms related to PR-LBPP, and evaluated treatment modalities that used a physical exercise or yoga-based approach for the described conditions. Electronic database searches yielded 1,435 articles. A total of 15 articles met eligibility criteria for further review. These modalities show preliminary promise for pain relief and other related symptoms, including stress and depression. However, our findings also indicate several gaps in knowledge about these therapies for PR-LBPP and methodologic issues with the current literature. Although additional research is required, the results of this integrative review suggest that clinicians may consider recommending nonpharmacologic treatment options, such as gentle physical activity and yoga-based interventions, for PR-LBPP and related symptoms. Copyright © 2017 AWHONN, the Association of Women’s Health, Obstetric and Neonatal Nurses. Published by Elsevier Inc. All rights reserved.
A web-based relational database for monitoring and analyzing mosquito population dynamics.

PubMed

Sucaet, Yves; Van Hemert, John; Tucker, Brad; Bartholomay, Lyric

2008-07-01

Mosquito population dynamics have been monitored on an annual basis in the state of Iowa since 1969. The primary goal of this project was to integrate light trap data from these efforts into a centralized back-end database and interactive website that is available through the internet at http://iowa-mosquito.ent.iastate.edu. For comparative purposes, all data were categorized according to the week of the year and normalized according to the number of traps running. Users can readily view current, weekly mosquito abundance compared with data from previous years. Additional interactive capabilities facilitate analyses of the data based on mosquito species, distribution, or a time frame of interest. All data can be viewed in graphical and tabular format and can be downloaded to a comma separated value (CSV) file for import into a spreadsheet or more specialized statistical software package. Having this long-term dataset in a centralized database/website is useful for informing mosquito and mosquito-borne disease control and for exploring the ecology of the species represented therein. In addition to mosquito population dynamics, this database is available as a standardized platform that could be modified and applied to a multitude of projects that involve repeated collection of observational data. The development and implementation of this tool provides capacity for the user to mine data from standard spreadsheets into a relational database and then view and query the data in an interactive website.
DataSpread: Unifying Databases and Spreadsheets.

PubMed

Bendre, Mangesh; Sun, Bofan; Zhang, Ding; Zhou, Xinyan; Chang, Kevin ChenChuan; Parameswaran, Aditya

2015-08-01

Spreadsheet software is often the tool of choice for ad-hoc tabular data management, processing, and visualization, especially on tiny data sets. On the other hand, relational database systems offer significant power, expressivity, and efficiency over spreadsheet software for data management, while lacking in the ease of use and ad-hoc analysis capabilities. We demonstrate DataSpread, a data exploration tool that holistically unifies databases and spreadsheets. It continues to offer a Microsoft Excel-based spreadsheet front-end, while in parallel managing all the data in a back-end database, specifically, PostgreSQL. DataSpread retains all the advantages of spreadsheets, including ease of use, ad-hoc analysis and visualization capabilities, and a schema-free nature, while also adding the advantages of traditional relational databases, such as scalability and the ability to use arbitrary SQL to import, filter, or join external or internal tables and have the results appear in the spreadsheet. DataSpread needs to reason about and reconcile differences in the notions of schema, addressing of cells and tuples, and the current "pane" (which exists in spreadsheets but not in traditional databases), and support data modifications at both the front-end and the back-end. Our demonstration will center on our first and early prototype of the DataSpread, and will give the attendees a sense for the enormous data exploration capabilities offered by unifying spreadsheets and databases.
DataSpread: Unifying Databases and Spreadsheets

PubMed Central

Bendre, Mangesh; Sun, Bofan; Zhang, Ding; Zhou, Xinyan; Chang, Kevin ChenChuan; Parameswaran, Aditya

2015-01-01

Spreadsheet software is often the tool of choice for ad-hoc tabular data management, processing, and visualization, especially on tiny data sets. On the other hand, relational database systems offer significant power, expressivity, and efficiency over spreadsheet software for data management, while lacking in the ease of use and ad-hoc analysis capabilities. We demonstrate DataSpread, a data exploration tool that holistically unifies databases and spreadsheets. It continues to offer a Microsoft Excel-based spreadsheet front-end, while in parallel managing all the data in a back-end database, specifically, PostgreSQL. DataSpread retains all the advantages of spreadsheets, including ease of use, ad-hoc analysis and visualization capabilities, and a schema-free nature, while also adding the advantages of traditional relational databases, such as scalability and the ability to use arbitrary SQL to import, filter, or join external or internal tables and have the results appear in the spreadsheet. DataSpread needs to reason about and reconcile differences in the notions of schema, addressing of cells and tuples, and the current “pane” (which exists in spreadsheets but not in traditional databases), and support data modifications at both the front-end and the back-end. Our demonstration will center on our first and early prototype of the DataSpread, and will give the attendees a sense for the enormous data exploration capabilities offered by unifying spreadsheets and databases. PMID:26900487
Conceptual Model Formalization in a Semantic Interoperability Service Framework: Transforming Relational Database Schemas to OWL.

PubMed

Bravo, Carlos; Suarez, Carlos; González, Carolina; López, Diego; Blobel, Bernd

2014-01-01

Healthcare information is distributed through multiple heterogeneous and autonomous systems. Access to, and sharing of, distributed information sources are a challenging task. To contribute to meeting this challenge, this paper presents a formal, complete and semi-automatic transformation service from Relational Databases to Web Ontology Language. The proposed service makes use of an algorithm that allows to transform several data models of different domains by deploying mainly inheritance rules. The paper emphasizes the relevance of integrating the proposed approach into an ontology-based interoperability service to achieve semantic interoperability.
DBSecSys 2.0: a database of Burkholderia mallei and Burkholderia pseudomallei secretion systems.

PubMed

Memišević, Vesna; Kumar, Kamal; Zavaljevski, Nela; DeShazer, David; Wallqvist, Anders; Reifman, Jaques

2016-09-20

Burkholderia mallei and B. pseudomallei are the causative agents of glanders and melioidosis, respectively, diseases with high morbidity and mortality rates. B. mallei and B. pseudomallei are closely related genetically; B. mallei evolved from an ancestral strain of B. pseudomallei by genome reduction and adaptation to an obligate intracellular lifestyle. Although these two bacteria cause different diseases, they share multiple virulence factors, including bacterial secretion systems, which represent key components of bacterial pathogenicity. Despite recent progress, the secretion system proteins for B. mallei and B. pseudomallei, their pathogenic mechanisms of action, and host factors are not well characterized. We previously developed a manually curated database, DBSecSys, of bacterial secretion system proteins for B. mallei. Here, we report an expansion of the database with corresponding information about B. pseudomallei. DBSecSys 2.0 contains comprehensive literature-based and computationally derived information about B. mallei ATCC 23344 and literature-based and computationally derived information about B. pseudomallei K96243. The database contains updated information for 163 B. mallei proteins from the previous database and 61 additional B. mallei proteins, and new information for 281 B. pseudomallei proteins associated with 5 secretion systems, their 1,633 human- and murine-interacting targets, and 2,400 host-B. mallei interactions and 2,286 host-B. pseudomallei interactions. The database also includes information about 13 pathogenic mechanisms of action for B. mallei and B. pseudomallei secretion system proteins inferred from the available literature or computationally. Additionally, DBSecSys 2.0 provides details about 82 virulence attenuation experiments for 52 B. mallei secretion system proteins and 98 virulence attenuation experiments for 61 B. pseudomallei secretion system proteins. We updated the Web interface and data access layer to speed-up users' search of detailed information for orthologous proteins related to secretion systems of the two pathogens. The updates of DBSecSys 2.0 provide unique capabilities to access comprehensive information about secretion systems of B. mallei and B. pseudomallei. They enable studies and comparisons of corresponding proteins of these two closely related pathogens and their host-interacting partners. The database is available at http://dbsecsys.bhsai.org .
Use of a Drosophila Genome-Wide Conserved Sequence Database to Identify Functionally Related cis-Regulatory Enhancers

PubMed Central

Brody, Thomas; Yavatkar, Amarendra S; Kuzin, Alexander; Kundu, Mukta; Tyson, Leonard J; Ross, Jermaine; Lin, Tzu-Yang; Lee, Chi-Hon; Awasaki, Takeshi; Lee, Tzumin; Odenwald, Ward F

2012-01-01

Background: Phylogenetic footprinting has revealed that cis-regulatory enhancers consist of conserved DNA sequence clusters (CSCs). Currently, there is no systematic approach for enhancer discovery and analysis that takes full-advantage of the sequence information within enhancer CSCs. Results: We have generated a Drosophila genome-wide database of conserved DNA consisting of >100,000 CSCs derived from EvoPrints spanning over 90% of the genome. cis-Decoder database search and alignment algorithms enable the discovery of functionally related enhancers. The program first identifies conserved repeat elements within an input enhancer and then searches the database for CSCs that score highly against the input CSC. Scoring is based on shared repeats as well as uniquely shared matches, and includes measures of the balance of shared elements, a diagnostic that has proven to be useful in predicting cis-regulatory function. To demonstrate the utility of these tools, a temporally-restricted CNS neuroblast enhancer was used to identify other functionally related enhancers and analyze their structural organization. Conclusions: cis-Decoder reveals that co-regulating enhancers consist of combinations of overlapping shared sequence elements, providing insights into the mode of integration of multiple regulating transcription factors. The database and accompanying algorithms should prove useful in the discovery and analysis of enhancers involved in any developmental process. Developmental Dynamics 241:169–189, 2012. © 2011 Wiley Periodicals, Inc. Key findings A genome-wide catalog of Drosophila conserved DNA sequence clusters. cis-Decoder discovers functionally related enhancers. Functionally related enhancers share balanced sequence element copy numbers. Many enhancers function during multiple phases of development. PMID:22174086
Using the Proteomics Identifications Database (PRIDE).

PubMed

Martens, Lennart; Jones, Phil; Côté, Richard

2008-03-01

The Proteomics Identifications Database (PRIDE) is a public data repository designed to store, disseminate, and analyze mass spectrometry based proteomics datasets. The PRIDE database can accommodate any level of detailed metadata about the submitted results, which can be queried, explored, viewed, or downloaded via the PRIDE Web interface. The PRIDE database also provides a simple, yet powerful, access control mechanism that fully supports confidential peer-reviewing of data related to a manuscript, ensuring that these results remain invisible to the general public while allowing referees and journal editors anonymized access to the data. This unit describes in detail the functionality that PRIDE provides with regards to searching, viewing, and comparing the available data, as well as different options for submitting data to PRIDE.
Development and application of basis database for materials life cycle assessment in china

NASA Astrophysics Data System (ADS)

Li, Xiaoqing; Gong, Xianzheng; Liu, Yu

2017-03-01

As the data intensive method, high quality environmental burden data is an important premise of carrying out materials life cycle assessment (MLCA), and the reliability of data directly influences the reliability of the assessment results and its application performance. Therefore, building Chinese MLCA database is the basic data needs and technical supports for carrying out and improving LCA practice. Firstly, some new progress on database which related to materials life cycle assessment research and development are introduced. Secondly, according to requirement of ISO 14040 series standards, the database framework and main datasets of the materials life cycle assessment are studied. Thirdly, MLCA data platform based on big data is developed. Finally, the future research works were proposed and discussed.
Crowdsourcing-Assisted Radio Environment Database for V2V Communication.

PubMed

Katagiri, Keita; Sato, Koya; Fujii, Takeo

2018-04-12

In order to realize reliable Vehicle-to-Vehicle (V2V) communication systems for autonomous driving, the recognition of radio propagation becomes an important technology. However, in the current wireless distributed network systems, it is difficult to accurately estimate the radio propagation characteristics because of the locality of the radio propagation caused by surrounding buildings and geographical features. In this paper, we propose a measurement-based radio environment database for improving the accuracy of the radio environment estimation in the V2V communication systems. The database first gathers measurement datasets of the received signal strength indicator (RSSI) related to the transmission/reception locations from V2V systems. By using the datasets, the average received power maps linked with transmitter and receiver locations are generated. We have performed measurement campaigns of V2V communications in the real environment to observe RSSI for the database construction. Our results show that the proposed method has higher accuracy of the radio propagation estimation than the conventional path loss model-based estimation.
A TEX86 surface sediment database and extended Bayesian calibration

NASA Astrophysics Data System (ADS)

Tierney, Jessica E.; Tingley, Martin P.

2015-06-01

Quantitative estimates of past temperature changes are a cornerstone of paleoclimatology. For a number of marine sediment-based proxies, the accuracy and precision of past temperature reconstructions depends on a spatial calibration of modern surface sediment measurements to overlying water temperatures. Here, we present a database of 1095 surface sediment measurements of TEX86, a temperature proxy based on the relative cyclization of marine archaeal glycerol dialkyl glycerol tetraether (GDGT) lipids. The dataset is archived in a machine-readable format with geospatial information, fractional abundances of lipids (if available), and metadata. We use this new database to update surface and subsurface temperature calibration models for TEX86 and demonstrate the applicability of the TEX86 proxy to past temperature prediction. The TEX86 database confirms that surface sediment GDGT distribution has a strong relationship to temperature, which accounts for over 70% of the variance in the data. Future efforts, made possible by the data presented here, will seek to identify variables with secondary relationships to GDGT distributions, such as archaeal community composition.
PseudoBase: a database with RNA pseudoknots.

PubMed

van Batenburg, F H; Gultyaev, A P; Pleij, C W; Ng, J; Oliehoek, J

2000-01-01

PseudoBase is a database containing structural, functional and sequence data related to RNA pseudo-knots. It can be reached at http://wwwbio. Leiden Univ.nl/ approximately Batenburg/PKB.html. This page will direct the user to a retrieval page from where a particular pseudoknot can be chosen, or to a submission page which enables the user to add pseudoknot information to the database or to an informative page that elaborates on the various aspects of the database. For each pseudoknot, 12 items are stored, e.g. the nucleotides of the region that contains the pseudoknot, the stem positions of the pseudoknot, the EMBL accession number of the sequence that contains this pseudoknot and the support that can be given regarding the reliability of the pseudoknot. Access is via a small number of steps, using 16 different categories. The development process was done by applying the evolutionary methodology for software development rather than by applying the methodology of the classical waterfall model or the more modern spiral model.
StreptomycesInforSys: A web-enabled information repository

PubMed Central

Jain, Chakresh Kumar; Gupta, Vidhi; Gupta, Ashvarya; Gupta, Sanjay; Wadhwa, Gulshan; Sharma, Sanjeev Kumar; Sarethy, Indira P

2012-01-01

Members of Streptomyces produce 70% of natural bioactive products. There is considerable amount of information available based on polyphasic approach for classification of Streptomyces. However, this information based on phenotypic, genotypic and bioactive component production profiles is crucial for pharmacological screening programmes. This is scattered across various journals, books and other resources, many of which are not freely accessible. The designed database incorporates polyphasic typing information using combinations of search options to aid in efficient screening of new isolates. This will help in the preliminary categorization of appropriate groups. It is a free relational database compatible with existing operating systems. A cross platform technology with XAMPP Web server has been used to develop, manage, and facilitate the user query effectively with database support. Employment of PHP, a platform-independent scripting language, embedded in HTML and the database management software MySQL will facilitate dynamic information storage and retrieval. The user-friendly, open and flexible freeware (PHP, MySQL and Apache) is foreseen to reduce running and maintenance cost. Availability www.sis.biowaves.org PMID:23275736
Crowdsourcing-Assisted Radio Environment Database for V2V Communication †

PubMed Central

Katagiri, Keita; Fujii, Takeo

2018-01-01

In order to realize reliable Vehicle-to-Vehicle (V2V) communication systems for autonomous driving, the recognition of radio propagation becomes an important technology. However, in the current wireless distributed network systems, it is difficult to accurately estimate the radio propagation characteristics because of the locality of the radio propagation caused by surrounding buildings and geographical features. In this paper, we propose a measurement-based radio environment database for improving the accuracy of the radio environment estimation in the V2V communication systems. The database first gathers measurement datasets of the received signal strength indicator (RSSI) related to the transmission/reception locations from V2V systems. By using the datasets, the average received power maps linked with transmitter and receiver locations are generated. We have performed measurement campaigns of V2V communications in the real environment to observe RSSI for the database construction. Our results show that the proposed method has higher accuracy of the radio propagation estimation than the conventional path loss model-based estimation. PMID:29649174
StreptomycesInforSys: A web-enabled information repository.

PubMed

Jain, Chakresh Kumar; Gupta, Vidhi; Gupta, Ashvarya; Gupta, Sanjay; Wadhwa, Gulshan; Sharma, Sanjeev Kumar; Sarethy, Indira P

2012-01-01

Members of Streptomyces produce 70% of natural bioactive products. There is considerable amount of information available based on polyphasic approach for classification of Streptomyces. However, this information based on phenotypic, genotypic and bioactive component production profiles is crucial for pharmacological screening programmes. This is scattered across various journals, books and other resources, many of which are not freely accessible. The designed database incorporates polyphasic typing information using combinations of search options to aid in efficient screening of new isolates. This will help in the preliminary categorization of appropriate groups. It is a free relational database compatible with existing operating systems. A cross platform technology with XAMPP Web server has been used to develop, manage, and facilitate the user query effectively with database support. Employment of PHP, a platform-independent scripting language, embedded in HTML and the database management software MySQL will facilitate dynamic information storage and retrieval. The user-friendly, open and flexible freeware (PHP, MySQL and Apache) is foreseen to reduce running and maintenance cost. www.sis.biowaves.org.
Experiment on building Sundanese lexical database based on WordNet

NASA Astrophysics Data System (ADS)

Dewi Budiwati, Sari; Nurani Setiawan, Novihana

2018-03-01

Sundanese language is the second biggest local language used in Indonesia. Currently, Sundanese language is rarely used since we have the Indonesian language in everyday conversation and as the national language. We built a Sundanese lexical database based on WordNet and Indonesian WordNet as an alternative way to preserve the language as one of local culture. WordNet was chosen because of Sundanese language has three levels of word delivery, called language code of conduct. Web user participant involved in this research for specifying Sundanese semantic relations, and an expert linguistic for validating the relations. The merge methodology was implemented in this experiment. Some words are equivalent with WordNet while another does not have its equivalence since some words are not exist in another culture.
MetPetDB: A database for metamorphic geochemistry

NASA Astrophysics Data System (ADS)

Spear, Frank S.; Hallett, Benjamin; Pyle, Joseph M.; Adalı, Sibel; Szymanski, Boleslaw K.; Waters, Anthony; Linder, Zak; Pearce, Shawn O.; Fyffe, Matthew; Goldfarb, Dennis; Glickenhouse, Nickolas; Buletti, Heather

2009-12-01

We present a data model for the initial implementation of MetPetDB, a geochemical database specific to metamorphic rock samples. The database is designed around the concept of preservation of spatial relationships, at all scales, of chemical analyses and their textural setting. Objects in the database (samples) represent physical rock samples; each sample may contain one or more subsamples with associated geochemical and image data. Samples, subsamples, geochemical data, and images are described with attributes (some required, some optional); these attributes also serve as search delimiters. All data in the database are classified as published (i.e., archived or published data), public or private. Public and published data may be freely searched and downloaded. All private data is owned; permission to view, edit, download and otherwise manipulate private data may be granted only by the data owner; all such editing operations are recorded by the database to create a data version log. The sharing of data permissions among a group of collaborators researching a common sample is done by the sample owner through the project manager. User interaction with MetPetDB is hosted by a web-based platform based upon the Java servlet application programming interface, with the PostgreSQL relational database. The database web portal includes modules that allow the user to interact with the database: registered users may save and download public and published data, upload private data, create projects, and assign permission levels to project collaborators. An Image Viewer module provides for spatial integration of image and geochemical data. A toolkit consisting of plotting and geochemical calculation software for data analysis and a mobile application for viewing the public and published data is being developed. Future issues to address include population of the database, integration with other geochemical databases, development of the analysis toolkit, creation of data models for derivative data, and building a community-wide user base. It is believed that this and other geochemical databases will enable more productive collaborations, generate more efficient research efforts, and foster new developments in basic research in the field of solid earth geochemistry.

Relational Database for the Geology of the Northern Rocky Mountains - Idaho, Montana, and Washington

USGS Publications Warehouse

Causey, J. Douglas; Zientek, Michael L.; Bookstrom, Arthur A.; Frost, Thomas P.; Evans, Karl V.; Wilson, Anna B.; Van Gosen, Bradley S.; Boleneus, David E.; Pitts, Rebecca A.

2008-01-01

A relational database was created to prepare and organize geologic map-unit and lithologic descriptions for input into a spatial database for the geology of the northern Rocky Mountains, a compilation of forty-three geologic maps for parts of Idaho, Montana, and Washington in U.S. Geological Survey Open File Report 2005-1235. Not all of the information was transferred to and incorporated in the spatial database due to physical file limitations. This report releases that part of the relational database that was completed for that earlier product. In addition to descriptive geologic information for the northern Rocky Mountains region, the relational database contains a substantial bibliography of geologic literature for the area. The relational database nrgeo.mdb (linked below) is available in Microsoft Access version 2000, a proprietary database program. The relational database contains data tables and other tables used to define terms, relationships between the data tables, and hierarchical relationships in the data; forms used to enter data; and queries used to extract data.
Stereoselective virtual screening of the ZINC database using atom pair 3D-fingerprints.

PubMed

Awale, Mahendra; Jin, Xian; Reymond, Jean-Louis

2015-01-01

Tools to explore large compound databases in search for analogs of query molecules provide a strategically important support in drug discovery to help identify available analogs of any given reference or hit compound by ligand based virtual screening (LBVS). We recently showed that large databases can be formatted for very fast searching with various 2D-fingerprints using the city-block distance as similarity measure, in particular a 2D-atom pair fingerprint (APfp) and the related category extended atom pair fingerprint (Xfp) which efficiently encode molecular shape and pharmacophores, but do not perceive stereochemistry. Here we investigated related 3D-atom pair fingerprints to enable rapid stereoselective searches in the ZINC database (23.2 million 3D structures). Molecular fingerprints counting atom pairs at increasing through-space distance intervals were designed using either all atoms (16-bit 3DAPfp) or different atom categories (80-bit 3DXfp). These 3D-fingerprints retrieved molecular shape and pharmacophore analogs (defined by OpenEye ROCS scoring functions) of 110,000 compounds from the Cambridge Structural Database with equal or better accuracy than the 2D-fingerprints APfp and Xfp, and showed comparable performance in recovering actives from decoys in the DUD database. LBVS by 3DXfp or 3DAPfp similarity was stereoselective and gave very different analogs when starting from different diastereomers of the same chiral drug. Results were also different from LBVS with the parent 2D-fingerprints Xfp or APfp. 3D- and 2D-fingerprints also gave very different results in LBVS of folded molecules where through-space distances between atom pairs are much shorter than topological distances. 3DAPfp and 3DXfp are suitable for stereoselective searches for shape and pharmacophore analogs of query molecules in large databases. Web-browsers for searching ZINC by 3DAPfp and 3DXfp similarity are accessible at www.gdb.unibe.ch and should provide useful assistance to drug discovery projects. Graphical abstractAtom pair fingerprints based on through-space distances (3DAPfp) provide better shape encoding than atom pair fingerprints based on topological distances (APfp) as measured by the recovery of ROCS shape analogs by fp similarity.
Migration from relational to NoSQL database

NASA Astrophysics Data System (ADS)

Ghotiya, Sunita; Mandal, Juhi; Kandasamy, Saravanakumar

2017-11-01

Data generated by various real time applications, social networking sites and sensor devices is of very huge amount and unstructured, which makes it difficult for Relational database management systems to handle the data. Data is very precious component of any application and needs to be analysed after arranging it in some structure. Relational databases are only able to deal with structured data, so there is need of NoSQL Database management System which can deal with semi -structured data also. Relational database provides the easiest way to manage the data but as the use of NoSQL is increasing it is becoming necessary to migrate the data from Relational to NoSQL databases. Various frameworks has been proposed previously which provides mechanisms for migration of data stored at warehouses in SQL, middle layer solutions which can provide facility of data to be stored in NoSQL databases to handle data which is not structured. This paper provides a literature review of some of the recent approaches proposed by various researchers to migrate data from relational to NoSQL databases. Some researchers proposed mechanisms for the co-existence of NoSQL and Relational databases together. This paper provides a summary of mechanisms which can be used for mapping data stored in Relational databases to NoSQL databases. Various techniques for data transformation and middle layer solutions are summarised in the paper.
An inventory of continental U.S. terrestrial candidate ecological restoration areas based on landscape context.

PubMed

Wickham, James; Riitters, Kurt; Vogt, Peter; Costanza, Jennifer; Neale, Anne

2017-11-01

Landscape context is an important factor in restoration ecology, but the use of landscape context for site prioritization has not been as fully developed. We used morphological image processing to identify candidate ecological restoration areas based on their proximity to existing natural vegetation. We identified 1,102,720 candidate ecological restoration areas across the continental United States. Candidate ecological restoration areas were concentrated in the Great Plains and eastern United States. We populated the database of candidate ecological restoration areas with 17 attributes related to site content and context, including factors such as soil fertility and roads (site content), and number and area of potentially conjoined vegetated regions (site context) to facilitate its use for site prioritization. We demonstrate the utility of the database in the state of North Carolina, U.S.A. for a restoration objective related to restoration of water quality (mandated by the U.S. Clean Water Act), wetlands, and forest. The database will be made publicly available on the U.S. Environmental Protection Agency's EnviroAtlas website (http://enviroatlas.epa.gov) for stakeholders interested in ecological restoration.
An inventory of continental U.S. terrestrial candidate ecological restoration areas based on landscape context

PubMed Central

Wickham, James; Riitters, Kurt; Vogt, Peter; Costanza, Jennifer; Neale, Anne

2018-01-01

Landscape context is an important factor in restoration ecology, but the use of landscape context for site prioritization has not been as fully developed. We used morphological image processing to identify candidate ecological restoration areas based on their proximity to existing natural vegetation. We identified 1,102,720 candidate ecological restoration areas across the continental United States. Candidate ecological restoration areas were concentrated in the Great Plains and eastern United States. We populated the database of candidate ecological restoration areas with 17 attributes related to site content and context, including factors such as soil fertility and roads (site content), and number and area of potentially conjoined vegetated regions (site context) to facilitate its use for site prioritization. We demonstrate the utility of the database in the state of North Carolina, U.S.A. for a restoration objective related to restoration of water quality (mandated by the U.S. Clean Water Act), wetlands, and forest. The database will be made publicly available on the U.S. Environmental Protection Agency's EnviroAtlas website (http://enviroatlas.epa.gov) for stakeholders interested in ecological restoration. PMID:29683130
Design and Implementation of CNEOST Image Database Based on NoSQL System

NASA Astrophysics Data System (ADS)

Wang, X.

2013-07-01

The China Near Earth Object Survey Telescope (CNEOST) is the largest Schmidt telescope in China, and it has acquired more than 3 TB astronomical image data since it saw the first light in 2006. After the upgradation of the CCD camera in 2013, over 10 TB data will be obtained every year. The management of massive images is not only an indispensable part of data processing pipeline but also the basis of data sharing. Based on the analysis of requirement, an image management system is designed and implemented by employing the non-relational database.
Design and Implementation of CNEOST Image Database Based on NoSQL System

NASA Astrophysics Data System (ADS)

Wang, Xin

2014-04-01

The China Near Earth Object Survey Telescope is the largest Schmidt telescope in China, and it has acquired more than 3 TB astronomical image data since it saw the first light in 2006. After the upgrade of the CCD camera in 2013, over 10 TB data will be obtained every year. The management of the massive images is not only an indispensable part of data processing pipeline but also the basis of data sharing. Based on the analysis of requirement, an image management system is designed and implemented by employing the non-relational database.
Concentrations of indoor pollutants database: User's manual

DOE Office of Scientific and Technical Information (OSTI.GOV)

Not Available

1992-05-01

This manual describes the computer-based database on indoor air pollutants. This comprehensive database alloys helps utility personnel perform rapid searches on literature related to indoor air pollutants. Besides general information, it provides guidance for finding specific information on concentrations of indoor air pollutants. The manual includes information on installing and using the database as well as a tutorial to assist the user in becoming familiar with the procedures involved in doing bibliographic and summary section searches. The manual demonstrates how to search for information by going through a series of questions that provide search parameters such as pollutants type, year,more » building type, keywords (from a specific list), country, geographic region, author's last name, and title. As more and more parameters are specified, the list of references found in the data search becomes smaller and more specific to the user's needs. Appendixes list types of information that can be input into the database when making a request. The CIP database allows individual utilities to obtain information on indoor air quality based on building types and other factors in their own service territory. This information is useful for utilities with concerns about indoor air quality and the control of indoor air pollutants. The CIP database itself is distributed by the Electric Power Software Center and runs on IBM PC-compatible computers.« less
Concentrations of indoor pollutants database: User`s manual

DOE Office of Scientific and Technical Information (OSTI.GOV)

Not Available

1992-05-01

This manual describes the computer-based database on indoor air pollutants. This comprehensive database alloys helps utility personnel perform rapid searches on literature related to indoor air pollutants. Besides general information, it provides guidance for finding specific information on concentrations of indoor air pollutants. The manual includes information on installing and using the database as well as a tutorial to assist the user in becoming familiar with the procedures involved in doing bibliographic and summary section searches. The manual demonstrates how to search for information by going through a series of questions that provide search parameters such as pollutants type, year,more » building type, keywords (from a specific list), country, geographic region, author`s last name, and title. As more and more parameters are specified, the list of references found in the data search becomes smaller and more specific to the user`s needs. Appendixes list types of information that can be input into the database when making a request. The CIP database allows individual utilities to obtain information on indoor air quality based on building types and other factors in their own service territory. This information is useful for utilities with concerns about indoor air quality and the control of indoor air pollutants. The CIP database itself is distributed by the Electric Power Software Center and runs on IBM PC-compatible computers.« less
Columba: an integrated database of proteins, structures, and annotations.

PubMed

Trissl, Silke; Rother, Kristian; Müller, Heiko; Steinke, Thomas; Koch, Ina; Preissner, Robert; Frömmel, Cornelius; Leser, Ulf

2005-03-31

Structural and functional research often requires the computation of sets of protein structures based on certain properties of the proteins, such as sequence features, fold classification, or functional annotation. Compiling such sets using current web resources is tedious because the necessary data are spread over many different databases. To facilitate this task, we have created COLUMBA, an integrated database of annotations of protein structures. COLUMBA currently integrates twelve different databases, including PDB, KEGG, Swiss-Prot, CATH, SCOP, the Gene Ontology, and ENZYME. The database can be searched using either keyword search or data source-specific web forms. Users can thus quickly select and download PDB entries that, for instance, participate in a particular pathway, are classified as containing a certain CATH architecture, are annotated as having a certain molecular function in the Gene Ontology, and whose structures have a resolution under a defined threshold. The results of queries are provided in both machine-readable extensible markup language and human-readable format. The structures themselves can be viewed interactively on the web. The COLUMBA database facilitates the creation of protein structure data sets for many structure-based studies. It allows to combine queries on a number of structure-related databases not covered by other projects at present. Thus, information on both many and few protein structures can be used efficiently. The web interface for COLUMBA is available at http://www.columba-db.de.
Palaeo sea-level and ice-sheet databases: problems, strategies and perspectives

NASA Astrophysics Data System (ADS)

Rovere, Alessio; Düsterhus, André; Carlson, Anders; Barlow, Natasha; Bradwell, Tom; Dutton, Andrea; Gehrels, Roland; Hibbert, Fiona; Hijma, Marc; Horton, Benjamin; Klemann, Volker; Kopp, Robert; Sivan, Dorit; Tarasov, Lev; Törnqvist, Torbjorn

2016-04-01

Databases of palaeoclimate data have driven many major developments in understanding the Earth system. The measurement and interpretation of palaeo sea-level and ice-sheet data that form such databases pose considerable challenges to the scientific communities that use them for further analyses. In this paper, we build on the experience of the PALSEA (PALeo constraints on SEA level rise) community, which is a working group inside the PAGES (Past Global Changes) project, to describe the challenges and best strategies that can be adopted to build a self-consistent and standardised database of geological and geochemical data related to palaeo sea levels and ice sheets. Our aim in this paper is to identify key points that need attention and subsequent funding when undertaking the task of database creation. We conclude that any sea-level or ice-sheet database must be divided into three instances: i) measurement; ii) interpretation; iii) database creation. Measurement should include postion, age, description of geological features, and quantification of uncertainties. All must be described as objectively as possible. Interpretation can be subjective, but it should always include uncertainties and include all the possible interpretations, without unjustified a priori exclusions. We propose that, in the creation of a database, an approach based on Accessibility, Transparency, Trust, Availability, Continued updating, Completeness and Communication of content (ATTAC3) must be adopted. Also, it is essential to consider the community structure that creates and benefits of a database. We conclude that funding sources should consider to address not only the creation of original data in specific research-question oriented projects, but also include the possibility to use part of the funding for IT-related and database creation tasks, which are essential to guarantee accessibility and maintenance of the collected data.
DPTEdb, an integrative database of transposable elements in dioecious plants.

PubMed

Li, Shu-Fen; Zhang, Guo-Jun; Zhang, Xue-Jin; Yuan, Jin-Hong; Deng, Chuan-Liang; Gu, Lian-Feng; Gao, Wu-Jun

2016-01-01

Dioecious plants usually harbor 'young' sex chromosomes, providing an opportunity to study the early stages of sex chromosome evolution. Transposable elements (TEs) are mobile DNA elements frequently found in plants and are suggested to play important roles in plant sex chromosome evolution. The genomes of several dioecious plants have been sequenced, offering an opportunity to annotate and mine the TE data. However, comprehensive and unified annotation of TEs in these dioecious plants is still lacking. In this study, we constructed a dioecious plant transposable element database (DPTEdb). DPTEdb is a specific, comprehensive and unified relational database and web interface. We used a combination of de novo, structure-based and homology-based approaches to identify TEs from the genome assemblies of previously published data, as well as our own. The database currently integrates eight dioecious plant species and a total of 31 340 TEs along with classification information. DPTEdb provides user-friendly web interfaces to browse, search and download the TE sequences in the database. Users can also use tools, including BLAST, GetORF, HMMER, Cut sequence and JBrowse, to analyze TE data. Given the role of TEs in plant sex chromosome evolution, the database will contribute to the investigation of TEs in structural, functional and evolutionary dynamics of the genome of dioecious plants. In addition, the database will supplement the research of sex diversification and sex chromosome evolution of dioecious plants.Database URL: http://genedenovoweb.ticp.net:81/DPTEdb/index.php. © The Author(s) 2016. Published by Oxford University Press.
A method to implement fine-grained access control for personal health records through standard relational database queries.

PubMed

Sujansky, Walter V; Faus, Sam A; Stone, Ethan; Brennan, Patricia Flatley

2010-10-01

Online personal health records (PHRs) enable patients to access, manage, and share certain of their own health information electronically. This capability creates the need for precise access-controls mechanisms that restrict the sharing of data to that intended by the patient. The authors describe the design and implementation of an access-control mechanism for PHR repositories that is modeled on the eXtensible Access Control Markup Language (XACML) standard, but intended to reduce the cognitive and computational complexity of XACML. The authors implemented the mechanism entirely in a relational database system using ANSI-standard SQL statements. Based on a set of access-control rules encoded as relational table rows, the mechanism determines via a single SQL query whether a user who accesses patient data from a specific application is authorized to perform a requested operation on a specified data object. Testing of this query on a moderately large database has demonstrated execution times consistently below 100ms. The authors include the details of the implementation, including algorithms, examples, and a test database as Supplementary materials. Copyright © 2010 Elsevier Inc. All rights reserved.
An integrated approach to reservoir modeling

DOE Office of Scientific and Technical Information (OSTI.GOV)

Donaldson, K.

1993-08-01

The purpose of this research is to evaluate the usefulness of the following procedural and analytical methods in investigating the heterogeneity of the oil reserve for the Mississipian Big Injun Sandstone of the Granny Creek field, Clay and Roane counties, West Virginia: (1) relational database, (2) two-dimensional cross sections, (3) true three-dimensional modeling, (4) geohistory analysis, (5) a rule-based expert system, and (6) geographical information systems. The large data set could not be effectively integrated and interpreted without this approach. A relational database was designed to fully integrate three- and four-dimensional data. The database provides an effective means for maintainingmore » and manipulating the data. A two-dimensional cross section program was designed to correlate stratigraphy, depositional environments, porosity, permeability, and petrographic data. This flexible design allows for additional four-dimensional data. Dynamic Graphics[sup [trademark
Covariant Evolutionary Event Analysis for Base Interaction Prediction Using a Relational Database Management System for RNA.

PubMed

Xu, Weijia; Ozer, Stuart; Gutell, Robin R

2009-01-01

With an increasingly large amount of sequences properly aligned, comparative sequence analysis can accurately identify not only common structures formed by standard base pairing but also new types of structural elements and constraints. However, traditional methods are too computationally expensive to perform well on large scale alignment and less effective with the sequences from diversified phylogenetic classifications. We propose a new approach that utilizes coevolutional rates among pairs of nucleotide positions using phylogenetic and evolutionary relationships of the organisms of aligned sequences. With a novel data schema to manage relevant information within a relational database, our method, implemented with a Microsoft SQL Server 2005, showed 90% sensitivity in identifying base pair interactions among 16S ribosomal RNA sequences from Bacteria, at a scale 40 times bigger and 50% better sensitivity than a previous study. The results also indicated covariation signals for a few sets of cross-strand base stacking pairs in secondary structure helices, and other subtle constraints in the RNA structure.
Covariant Evolutionary Event Analysis for Base Interaction Prediction Using a Relational Database Management System for RNA

PubMed Central

Xu, Weijia; Ozer, Stuart; Gutell, Robin R.

2010-01-01

With an increasingly large amount of sequences properly aligned, comparative sequence analysis can accurately identify not only common structures formed by standard base pairing but also new types of structural elements and constraints. However, traditional methods are too computationally expensive to perform well on large scale alignment and less effective with the sequences from diversified phylogenetic classifications. We propose a new approach that utilizes coevolutional rates among pairs of nucleotide positions using phylogenetic and evolutionary relationships of the organisms of aligned sequences. With a novel data schema to manage relevant information within a relational database, our method, implemented with a Microsoft SQL Server 2005, showed 90% sensitivity in identifying base pair interactions among 16S ribosomal RNA sequences from Bacteria, at a scale 40 times bigger and 50% better sensitivity than a previous study. The results also indicated covariation signals for a few sets of cross-strand base stacking pairs in secondary structure helices, and other subtle constraints in the RNA structure. PMID:20502534
JRC GMO-Amplicons: a collection of nucleic acid sequences related to genetically modified organisms

PubMed Central

Petrillo, Mauro; Angers-Loustau, Alexandre; Henriksson, Peter; Bonfini, Laura; Patak, Alex; Kreysa, Joachim

2015-01-01

The DNA target sequence is the key element in designing detection methods for genetically modified organisms (GMOs). Unfortunately this information is frequently lacking, especially for unauthorized GMOs. In addition, patent sequences are generally poorly annotated, buried in complex and extensive documentation and hard to link to the corresponding GM event. Here, we present the JRC GMO-Amplicons, a database of amplicons collected by screening public nucleotide sequence databanks by in silico determination of PCR amplification with reference methods for GMO analysis. The European Union Reference Laboratory for Genetically Modified Food and Feed (EU-RL GMFF) provides these methods in the GMOMETHODS database to support enforcement of EU legislation and GM food/feed control. The JRC GMO-Amplicons database is composed of more than 240 000 amplicons, which can be easily accessed and screened through a web interface. To our knowledge, this is the first attempt at pooling and collecting publicly available sequences related to GMOs in food and feed. The JRC GMO-Amplicons supports control laboratories in the design and assessment of GMO methods, providing inter-alia in silico prediction of primers specificity and GM targets coverage. The new tool can assist the laboratories in the analysis of complex issues, such as the detection and identification of unauthorized GMOs. Notably, the JRC GMO-Amplicons database allows the retrieval and characterization of GMO-related sequences included in patents documentation. Finally, it can help annotating poorly described GM sequences and identifying new relevant GMO-related sequences in public databases. The JRC GMO-Amplicons is freely accessible through a web-based portal that is hosted on the EU-RL GMFF website. Database URL: http://gmo-crl.jrc.ec.europa.eu/jrcgmoamplicons/ PMID:26424080
JRC GMO-Amplicons: a collection of nucleic acid sequences related to genetically modified organisms.

PubMed

Petrillo, Mauro; Angers-Loustau, Alexandre; Henriksson, Peter; Bonfini, Laura; Patak, Alex; Kreysa, Joachim

2015-01-01

The DNA target sequence is the key element in designing detection methods for genetically modified organisms (GMOs). Unfortunately this information is frequently lacking, especially for unauthorized GMOs. In addition, patent sequences are generally poorly annotated, buried in complex and extensive documentation and hard to link to the corresponding GM event. Here, we present the JRC GMO-Amplicons, a database of amplicons collected by screening public nucleotide sequence databanks by in silico determination of PCR amplification with reference methods for GMO analysis. The European Union Reference Laboratory for Genetically Modified Food and Feed (EU-RL GMFF) provides these methods in the GMOMETHODS database to support enforcement of EU legislation and GM food/feed control. The JRC GMO-Amplicons database is composed of more than 240 000 amplicons, which can be easily accessed and screened through a web interface. To our knowledge, this is the first attempt at pooling and collecting publicly available sequences related to GMOs in food and feed. The JRC GMO-Amplicons supports control laboratories in the design and assessment of GMO methods, providing inter-alia in silico prediction of primers specificity and GM targets coverage. The new tool can assist the laboratories in the analysis of complex issues, such as the detection and identification of unauthorized GMOs. Notably, the JRC GMO-Amplicons database allows the retrieval and characterization of GMO-related sequences included in patents documentation. Finally, it can help annotating poorly described GM sequences and identifying new relevant GMO-related sequences in public databases. The JRC GMO-Amplicons is freely accessible through a web-based portal that is hosted on the EU-RL GMFF website. Database URL: http://gmo-crl.jrc.ec.europa.eu/jrcgmoamplicons/. © The Author(s) 2015. Published by Oxford University Press.
Continuity of care to optimize chronic disease management in the community setting: an evidence-based analysis.

PubMed

2013-01-01

This evidence-based analysis reviews relational and management continuity of care. Relational continuity refers to the duration and quality of the relationship between the care provider and the patient. Management continuity ensures that patients receive coherent, complementary, and timely care. There are 4 components of continuity of care: duration, density, dispersion, and sequence. The objective of this evidence-based analysis was to determine if continuity of care is associated with decreased health resource utilization, improved patient outcomes, and patient satisfaction. MEDLINE, EMBASE, CINAHL, the Cochrane Library, and the Centre for Reviews and Dissemination database were searched for studies on continuity of care and chronic disease published from January 2002 until December 2011. Systematic reviews, randomized controlled trials, and observational studies were eligible if they assessed continuity of care in adults and reported health resource utilization, patient outcomes, or patient satisfaction. Eight systematic reviews and 13 observational studies were identified. The reviews concluded that there is an association between continuity of care and outcomes; however, the literature base is weak. The observational studies found that higher continuity of care was frequently associated with fewer hospitalizations and emergency department visits. Three systematic reviews reported that higher continuity of care is associated with improved patient satisfaction, especially among patients with chronic conditions. Most of the studies were retrospective cross-sectional studies of large administrative databases. The databases do not capture information on trust and confidence in the provider, which is a critical component of relational continuity of care. The definitions for the selection of patients from the databases varied across studies. There is low quality evidence that: Higher continuity of care is associated with decreased health service utilization.There is insufficient evidence on the relationship of continuity of care with disease-specific outcomes.There is an association between high continuity of care and patient satisfaction, particularly among patients with chronic diseases.
DNetDB: The human disease network database based on dysfunctional regulation mechanism.

PubMed

Yang, Jing; Wu, Su-Juan; Yang, Shao-You; Peng, Jia-Wei; Wang, Shi-Nuo; Wang, Fu-Yan; Song, Yu-Xing; Qi, Ting; Li, Yi-Xue; Li, Yuan-Yuan

2016-05-21

Disease similarity study provides new insights into disease taxonomy, pathogenesis, which plays a guiding role in diagnosis and treatment. The early studies were limited to estimate disease similarities based on clinical manifestations, disease-related genes, medical vocabulary concepts or registry data, which were inevitably biased to well-studied diseases and offered small chance of discovering novel findings in disease relationships. In other words, genome-scale expression data give us another angle to address this problem since simultaneous measurement of the expression of thousands of genes allows for the exploration of gene transcriptional regulation, which is believed to be crucial to biological functions. Although differential expression analysis based methods have the potential to explore new disease relationships, it is difficult to unravel the upstream dysregulation mechanisms of diseases. We therefore estimated disease similarities based on gene expression data by using differential coexpression analysis, a recently emerging method, which has been proved to be more potential to capture dysfunctional regulation mechanisms than differential expression analysis. A total of 1,326 disease relationships among 108 diseases were identified, and the relevant information constituted the human disease network database (DNetDB). Benefiting from the use of differential coexpression analysis, the potential common dysfunctional regulation mechanisms shared by disease pairs (i.e. disease relationships) were extracted and presented. Statistical indicators, common disease-related genes and drugs shared by disease pairs were also included in DNetDB. In total, 1,326 disease relationships among 108 diseases, 5,598 pathways, 7,357 disease-related genes and 342 disease drugs are recorded in DNetDB, among which 3,762 genes and 148 drugs are shared by at least two diseases. DNetDB is the first database focusing on disease similarity from the viewpoint of gene regulation mechanism. It provides an easy-to-use web interface to search and browse the disease relationships and thus helps to systematically investigate etiology and pathogenesis, perform drug repositioning, and design novel therapeutic interventions.Database URL: http://app.scbit.org/DNetDB/ #.

McMaster Optimal Aging Portal: an evidence-based database for geriatrics-focused health professionals.

PubMed

Barbara, Angela M; Dobbins, Maureen; Brian Haynes, R; Iorio, Alfonso; Lavis, John N; Raina, Parminder; Levinson, Anthony J

2017-07-11

The objective of this work was to provide easy access to reliable health information based on good quality research that will help health care professionals to learn what works best for seniors to stay as healthy as possible, manage health conditions and build supportive health systems. This will help meet the demands of our aging population that clinicians provide high quality care for older adults, that public health professionals deliver disease prevention and health promotion strategies across the life span, and that policymakers address the economic and social need to create a robust health system and a healthy society for all ages. The McMaster Optimal Aging Portal's (Portal) professional bibliographic database contains high quality scientific evidence about optimal aging specifically targeted to clinicians, public health professionals and policymakers. The database content comes from three information services: McMaster Premium LiteratUre Service (MacPLUS™), Health Evidence™ and Health Systems Evidence. The Portal is continually updated, freely accessible online, easily searchable, and provides email-based alerts when new records are added. The database is being continually assessed for value, usability and use. A number of improvements are planned, including French language translation of content, increased linkages between related records within the Portal database, and inclusion of additional types of content. While this article focuses on the professional database, the Portal also houses resources for patients, caregivers and the general public, which may also be of interest to geriatric practitioners and researchers.
DR HAGIS-a fundus image database for the automatic extraction of retinal surface vessels from diabetic patients.

PubMed

Holm, Sven; Russell, Greg; Nourrit, Vincent; McLoughlin, Niall

2017-01-01

A database of retinal fundus images, the DR HAGIS database, is presented. This database consists of 39 high-resolution color fundus images obtained from a diabetic retinopathy screening program in the UK. The NHS screening program uses service providers that employ different fundus and digital cameras. This results in a range of different image sizes and resolutions. Furthermore, patients enrolled in such programs often display other comorbidities in addition to diabetes. Therefore, in an effort to replicate the normal range of images examined by grading experts during screening, the DR HAGIS database consists of images of varying image sizes and resolutions and four comorbidity subgroups: collectively defined as the diabetic retinopathy, hypertension, age-related macular degeneration, and Glaucoma image set (DR HAGIS). For each image, the vasculature has been manually segmented to provide a realistic set of images on which to test automatic vessel extraction algorithms. Modified versions of two previously published vessel extraction algorithms were applied to this database to provide some baseline measurements. A method based purely on the intensity of images pixels resulted in a mean segmentation accuracy of 95.83% ([Formula: see text]), whereas an algorithm based on Gabor filters generated an accuracy of 95.71% ([Formula: see text]).
sc-PDB-Frag: a database of protein-ligand interaction patterns for Bioisosteric replacements.

PubMed

Desaphy, Jérémy; Rognan, Didier

2014-07-28

Bioisosteric replacement plays an important role in medicinal chemistry by keeping the biological activity of a molecule while changing either its core scaffold or substituents, thereby facilitating lead optimization and patenting. Bioisosteres are classically chosen in order to keep the main pharmacophoric moieties of the substructure to replace. However, notably when changing a scaffold, no attention is usually paid as whether all atoms of the reference scaffold are equally important for binding to the desired target. We herewith propose a novel database for bioisosteric replacement (scPDBFrag), capitalizing on our recently published structure-based approach to scaffold hopping, focusing on interaction pattern graphs. Protein-bound ligands are first fragmented and the interaction of the corresponding fragments with their protein environment computed-on-the-fly. Using an in-house developed graph alignment tool, interaction patterns graphs can be compared, aligned, and sorted by decreasing similarity to any reference. In the herein presented sc-PDB-Frag database ( http://bioinfo-pharma.u-strasbg.fr/scPDBFrag ), fragments, interaction patterns, alignments, and pairwise similarity scores have been extracted from the sc-PDB database of 8077 druggable protein-ligand complexes and further stored in a relational database. We herewith present the database, its Web implementation, and procedures for identifying true bioisosteric replacements based on conserved interaction patterns.
Multi-Sensor Scene Synthesis and Analysis

DTIC Science & Technology

1981-09-01

Quad Trees for Image Representation and Processing ...... ... 126 2.6.2 Databases ..... ..... ... ..... ... ..... ..... 138 2.6.2.1 Definitions and...Basic Concepts ....... 138 2.6.3 Use of Databases in Hierarchical Scene Analysis ...... ... ..................... 147 2.6.4 Use of Relational Tables...Multisensor Image Database Systems (MIDAS) . 161 2.7.2 Relational Database System for Pictures .... ..... 168 2.7.3 Relational Pictorial Database
Design and Implementation of an Interface Editor for the Amadeus Multi- Relational Database Front-end System

DTIC Science & Technology

1993-03-25

application of Object-Oriented Programming (OOP) and Human-Computer Interface (HCI) design principles. Knowledge gained from each topic has been incorporated...through the ap- plication of Object-Oriented Programming (OOP) and Human-Computer Interface (HCI) design principles. Knowledge gained from each topic has...programming and Human-Computer Interface (HCI) design. Knowledge gained from each is applied to the design of a Form-based interface for database data
Cloud-Based Distributed Control of Unmanned Systems

DTIC Science & Technology

2015-04-01

during mission execution. At best, the data is saved onto hard-drives and is accessible only by the local team. Data history in a form available and...following open source technologies: GeoServer, OpenLayers, PostgreSQL , and PostGIS are chosen to implement the back-end database and server. A brief...geospatial map data. 3. PostgreSQL : An SQL-compliant object-relational database that easily scales to accommodate large amounts of data - upwards to
Enhanced DIII-D Data Management Through a Relational Database

NASA Astrophysics Data System (ADS)

Burruss, J. R.; Peng, Q.; Schachter, J.; Schissel, D. P.; Terpstra, T. B.

2000-10-01

A relational database is being used to serve data about DIII-D experiments. The database is optimized for queries across multiple shots, allowing for rapid data mining by SQL-literate researchers. The relational database relates different experiments and datasets, thus providing a big picture of DIII-D operations. Users are encouraged to add their own tables to the database. Summary physics quantities about DIII-D discharges are collected and stored in the database automatically. Meta-data about code runs, MDSplus usage, and visualization tool usage are collected, stored in the database, and later analyzed to improve computing. Documentation on the database may be accessed through programming languages such as C, Java, and IDL, or through ODBC compliant applications such as Excel and Access. A database-driven web page also provides a convenient means for viewing database quantities through the World Wide Web. Demonstrations will be given at the poster.
Actinobase: Database on molecular diversity, phylogeny and biocatalytic potential of salt tolerant alkaliphilic actinomycetes.

PubMed

Sharma, Amit K; Gohel, Sangeeta; Singh, Satya P

2012-01-01

Actinobase is a relational database of molecular diversity, phylogeny and biocatalytic potential of haloalkaliphilic actinomycetes. The main objective of this data base is to provide easy access to range of information, data storage, comparison and analysis apart from reduced data redundancy, data entry, storage, retrieval costs and improve data security. Information related to habitat, cell morphology, Gram reaction, biochemical characterization and molecular features would allow researchers in understanding identification and stress adaptation of the existing and new candidates belonging to salt tolerant alkaliphilic actinomycetes. The PHP front end helps to add nucleotides and protein sequence of reported entries which directly help researchers to obtain the required details. Analysis of the genus wise status of the salt tolerant alkaliphilic actinomycetes indicated 6 different genera among the 40 classified entries of the salt tolerant alkaliphilic actinomycetes. The results represented wide spread occurrence of salt tolerant alkaliphilic actinomycetes belonging to diverse taxonomic positions. Entries and information related to actinomycetes in the database are publicly accessible at http://www.actinobase.in. On clustalW/X multiple sequence alignment of the alkaline protease gene sequences, different clusters emerged among the groups. The narrow search and limit options of the constructed database provided comparable information. The user friendly access to PHP front end facilitates would facilitate addition of sequences of reported entries. The database is available for free at http://www.actinobase.in.
Online-data Bases On Natural-hazard Research, Early-warning Systems and Operative Disaster Prevention Programs

NASA Astrophysics Data System (ADS)

Hermanns, R. L.; Zentel, K.-O.; Wenzel, F.; Hövel, M.; Hesse, A.

In order to benefit from synergies and to avoid replication in the field of disaster re- duction programs and related scientific projects it is important to create an overview on the state of art, the fields of activity and their key aspects. Therefore, the German Committee for Disaster Reduction intends to document projects and institution related to natural disaster prevention in three databases. One database is designed to docu- ment scientific programs and projects related to natural hazards. In a first step data acquisition concentrated on projects carried out by German institutions. In a second step projects from all other European countries will be archived. The second database focuses on projects on early-warning systems and has no regional limit. Data mining started in November 2001 and will be finished soon. The third database documents op- erational projects dealing with disaster prevention and concentrates on international projects or internationally funded projects. These databases will be available on the internet end of spring 2002 (http://www.dkkv.org) and will be updated continuously. They will allow rapid and concise information on various international projects, pro- vide up-to-date descriptions, and facilitate exchange as all relevant information in- cluding contact addresses are available to the public. The aim of this contribution is to present concepts and the work done so far, to invite participation, and to contact other organizations with similar objectives.
Database of non-canonical base pairs found in known RNA structures

NASA Technical Reports Server (NTRS)

Nagaswamy, U.; Voss, N.; Zhang, Z.; Fox, G. E.

2000-01-01

Atomic resolution RNA structures are being published at an increasing rate. It is common to find a modest number of non-canonical base pairs in these structures in addition to the usual Watson-Crick pairs. This database summarizes the occurrence of these rare base pairs in accordance with standard nomenclature. The database, http://prion.bchs.uh.edu/, contains information such as sequence context, sugar pucker conformation, anti / syn base conformations, chemical shift, p K (a)values, melting temperature and free energy. Of the 29 anticipated pairs with two or more hydrogen bonds, 20 have been encountered to date. In addition, four unexpected pairs with two hydrogen bonds have been reported bringing the total to 24. Single hydrogen bond versions of five of the expected geometries have been encountered among the single hydrogen bond interactions. In addition, 18 different types of base triplets have been encountered, each of which involves three to six hydrogen bonds. The vast majority of the rare base pairs are antiparallel with the bases in the anti configuration relative to the ribose. The most common are the GU wobble, the Sheared GA pair, the Reverse Hoogsteen pair and the GA imino pair.
Content based image retrieval using local binary pattern operator and data mining techniques.

PubMed

Vatamanu, Oana Astrid; Frandeş, Mirela; Lungeanu, Diana; Mihalaş, Gheorghe-Ioan

2015-01-01

Content based image retrieval (CBIR) concerns the retrieval of similar images from image databases, using feature vectors extracted from images. These feature vectors globally define the visual content present in an image, defined by e.g., texture, colour, shape, and spatial relations between vectors. Herein, we propose the definition of feature vectors using the Local Binary Pattern (LBP) operator. A study was performed in order to determine the optimum LBP variant for the general definition of image feature vectors. The chosen LBP variant is then subsequently used to build an ultrasound image database, and a database with images obtained from Wireless Capsule Endoscopy. The image indexing process is optimized using data clustering techniques for images belonging to the same class. Finally, the proposed indexing method is compared to the classical indexing technique, which is nowadays widely used.
Clique-based data mining for related genes in a biomedical database.

PubMed

Matsunaga, Tsutomu; Yonemori, Chikara; Tomita, Etsuji; Muramatsu, Masaaki

2009-07-01

Progress in the life sciences cannot be made without integrating biomedical knowledge on numerous genes in order to help formulate hypotheses on the genetic mechanisms behind various biological phenomena, including diseases. There is thus a strong need for a way to automatically and comprehensively search from biomedical databases for related genes, such as genes in the same families and genes encoding components of the same pathways. Here we address the extraction of related genes by searching for densely-connected subgraphs, which are modeled as cliques, in a biomedical relational graph. We constructed a graph whose nodes were gene or disease pages, and edges were the hyperlink connections between those pages in the Online Mendelian Inheritance in Man (OMIM) database. We obtained over 20,000 sets of related genes (called 'gene modules') by enumerating cliques computationally. The modules included genes in the same family, genes for proteins that form a complex, and genes for components of the same signaling pathway. The results of experiments using 'metabolic syndrome'-related gene modules show that the gene modules can be used to get a coherent holistic picture helpful for interpreting relations among genes. We presented a data mining approach extracting related genes by enumerating cliques. The extracted gene sets provide a holistic picture useful for comprehending complex disease mechanisms.
Database extraction strategies for low-template evidence.

PubMed

Bleka, Øyvind; Dørum, Guro; Haned, Hinda; Gill, Peter

2014-03-01

Often in forensic cases, the profile of at least one of the contributors to a DNA evidence sample is unknown and a database search is needed to discover possible perpetrators. In this article we consider two types of search strategies to extract suspects from a database using methods based on probability arguments. The performance of the proposed match scores is demonstrated by carrying out a study of each match score relative to the level of allele drop-out in the crime sample, simulating low-template DNA. The efficiency was measured by random man simulation and we compared the performance using the SGM Plus kit and the ESX 17 kit for the Norwegian population, demonstrating that the latter has greatly enhanced power to discover perpetrators of crime in large national DNA databases. The code for the database extraction strategies will be prepared for release in the R-package forensim. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Public participation in genetic databases: crossing the boundaries between biobanks and forensic DNA databases through the principle of solidarity

PubMed Central

Machado, Helena; Silva, Susana

2015-01-01

The ethical aspects of biobanks and forensic DNA databases are often treated as separate issues. As a reflection of this, public participation, or the involvement of citizens in genetic databases, has been approached differently in the fields of forensics and medicine. This paper aims to cross the boundaries between medicine and forensics by exploring the flows between the ethical issues presented in the two domains and the subsequent conceptualisation of public trust and legitimisation. We propose to introduce the concept of ‘solidarity’, traditionally applied only to medical and research biobanks, into a consideration of public engagement in medicine and forensics. Inclusion of a solidarity-based framework, in both medical biobanks and forensic DNA databases, raises new questions that should be included in the ethical debate, in relation to both health services/medical research and activities associated with the criminal justice system. PMID:26139851
Medicaid Services for Persons with Mental Retardation and Related Conditions. Project Report 27.

ERIC Educational Resources Information Center

Lakin, K. Charlie; And Others

This report examines policy-related trends and projections in the use of various Medicaid-funded care services for persons with mental retardation and related conditions, and identifies factors influencing these trends nationally and in the various states. The examination is based on three sets of research activities: analyses of databases on…
A Database as a Service for the Healthcare System to Store Physiological Signal Data.

PubMed

Chang, Hsien-Tsung; Lin, Tsai-Huei

2016-01-01

Wearable devices that measure physiological signals to help develop self-health management habits have become increasingly popular in recent years. These records are conducive for follow-up health and medical care. In this study, based on the characteristics of the observed physiological signal records- 1) a large number of users, 2) a large amount of data, 3) low information variability, 4) data privacy authorization, and 5) data access by designated users-we wish to resolve physiological signal record-relevant issues utilizing the advantages of the Database as a Service (DaaS) model. Storing a large amount of data using file patterns can reduce database load, allowing users to access data efficiently; the privacy control settings allow users to store data securely. The results of the experiment show that the proposed system has better database access performance than a traditional relational database, with a small difference in database volume, thus proving that the proposed system can improve data storage performance.
A Database as a Service for the Healthcare System to Store Physiological Signal Data

PubMed Central

Lin, Tsai-Huei

2016-01-01

Wearable devices that measure physiological signals to help develop self-health management habits have become increasingly popular in recent years. These records are conducive for follow-up health and medical care. In this study, based on the characteristics of the observed physiological signal records– 1) a large number of users, 2) a large amount of data, 3) low information variability, 4) data privacy authorization, and 5) data access by designated users—we wish to resolve physiological signal record-relevant issues utilizing the advantages of the Database as a Service (DaaS) model. Storing a large amount of data using file patterns can reduce database load, allowing users to access data efficiently; the privacy control settings allow users to store data securely. The results of the experiment show that the proposed system has better database access performance than a traditional relational database, with a small difference in database volume, thus proving that the proposed system can improve data storage performance. PMID:28033415
Space Launch System Booster Separation Aerodynamic Database Development and Uncertainty Quantification

NASA Technical Reports Server (NTRS)

Chan, David T.; Pinier, Jeremy T.; Wilcox, Floyd J., Jr.; Dalle, Derek J.; Rogers, Stuart E.; Gomez, Reynaldo J.

2016-01-01

The development of the aerodynamic database for the Space Launch System (SLS) booster separation environment has presented many challenges because of the complex physics of the ow around three independent bodies due to proximity e ects and jet inter- actions from the booster separation motors and the core stage engines. This aerodynamic environment is dicult to simulate in a wind tunnel experiment and also dicult to simu- late with computational uid dynamics. The database is further complicated by the high dimensionality of the independent variable space, which includes the orientation of the core stage, the relative positions and orientations of the solid rocket boosters, and the thrust lev- els of the various engines. Moreover, the clearance between the core stage and the boosters during the separation event is sensitive to the aerodynamic uncertainties of the database. This paper will present the development process for Version 3 of the SLS booster separa- tion aerodynamic database and the statistics-based uncertainty quanti cation process for the database.
Importance of Data Management in a Long-term Biological Monitoring Program

DOE Office of Scientific and Technical Information (OSTI.GOV)

Christensen, Sigurd W; Brandt, Craig C; McCracken, Kitty

2011-01-01

The long-term Biological Monitoring and Abatement Program (BMAP) has always needed to collect and retain high-quality data on which to base its assessments of ecological status of streams and their recovery after remediation. Its formal quality assurance, data processing, and data management components all contribute to this need. The Quality Assurance Program comprehensively addresses requirements from various institutions, funders, and regulators, and includes a data management component. Centralized data management began a few years into the program. An existing relational database was adapted and extended to handle biological data. Data modeling enabled the program's database to process, store, and retrievemore » its data. The data base's main data tables and several key reference tables are described. One of the most important related activities supporting long-term analyses was the establishing of standards for sampling site names, taxonomic identification, flagging, and other components. There are limitations. Some types of program data were not easily accommodated in the central systems, and many possible data-sharing and integration options are not easily accessible to investigators. The implemented relational database supports the transmittal of data to the Oak Ridge Environmental Information System (OREIS) as the permanent repository. From our experience we offer data management advice to other biologically oriented long-term environmental sampling and analysis programs.« less
ASM Based Synthesis of Handwritten Arabic Text Pages

PubMed Central

Al-Hamadi, Ayoub; Elzobi, Moftah; El-etriby, Sherif; Ghoneim, Ahmed

2015-01-01

Document analysis tasks, as text recognition, word spotting, or segmentation, are highly dependent on comprehensive and suitable databases for training and validation. However their generation is expensive in sense of labor and time. As a matter of fact, there is a lack of such databases, which complicates research and development. This is especially true for the case of Arabic handwriting recognition, that involves different preprocessing, segmentation, and recognition methods, which have individual demands on samples and ground truth. To bypass this problem, we present an efficient system that automatically turns Arabic Unicode text into synthetic images of handwritten documents and detailed ground truth. Active Shape Models (ASMs) based on 28046 online samples were used for character synthesis and statistical properties were extracted from the IESK-arDB database to simulate baselines and word slant or skew. In the synthesis step ASM based representations are composed to words and text pages, smoothed by B-Spline interpolation and rendered considering writing speed and pen characteristics. Finally, we use the synthetic data to validate a segmentation method. An experimental comparison with the IESK-arDB database encourages to train and test document analysis related methods on synthetic samples, whenever no sufficient natural ground truthed data is available. PMID:26295059

ASM Based Synthesis of Handwritten Arabic Text Pages.

PubMed

Dinges, Laslo; Al-Hamadi, Ayoub; Elzobi, Moftah; El-Etriby, Sherif; Ghoneim, Ahmed

2015-01-01

Document analysis tasks, as text recognition, word spotting, or segmentation, are highly dependent on comprehensive and suitable databases for training and validation. However their generation is expensive in sense of labor and time. As a matter of fact, there is a lack of such databases, which complicates research and development. This is especially true for the case of Arabic handwriting recognition, that involves different preprocessing, segmentation, and recognition methods, which have individual demands on samples and ground truth. To bypass this problem, we present an efficient system that automatically turns Arabic Unicode text into synthetic images of handwritten documents and detailed ground truth. Active Shape Models (ASMs) based on 28046 online samples were used for character synthesis and statistical properties were extracted from the IESK-arDB database to simulate baselines and word slant or skew. In the synthesis step ASM based representations are composed to words and text pages, smoothed by B-Spline interpolation and rendered considering writing speed and pen characteristics. Finally, we use the synthetic data to validate a segmentation method. An experimental comparison with the IESK-arDB database encourages to train and test document analysis related methods on synthetic samples, whenever no sufficient natural ground truthed data is available.
NPACT: Naturally Occurring Plant-based Anti-cancer Compound-Activity-Target database

PubMed Central

Mangal, Manu; Sagar, Parul; Singh, Harinder; Raghava, Gajendra P. S.; Agarwal, Subhash M.

2013-01-01

Plant-derived molecules have been highly valued by biomedical researchers and pharmaceutical companies for developing drugs, as they are thought to be optimized during evolution. Therefore, we have collected and compiled a central resource Naturally Occurring Plant-based Anti-cancer Compound-Activity-Target database (NPACT, http://crdd.osdd.net/raghava/npact/) that gathers the information related to experimentally validated plant-derived natural compounds exhibiting anti-cancerous activity (in vitro and in vivo), to complement the other databases. It currently contains 1574 compound entries, and each record provides information on their structure, manually curated published data on in vitro and in vivo experiments along with reference for users referral, inhibitory values (IC50/ED50/EC50/GI50), properties (physical, elemental and topological), cancer types, cell lines, protein targets, commercial suppliers and drug likeness of compounds. NPACT can easily be browsed or queried using various options, and an online similarity tool has also been made available. Further, to facilitate retrieval of existing data, each record is hyperlinked to similar databases like SuperNatural, Herbal Ingredients’ Targets, Comparative Toxicogenomics Database, PubChem and NCI-60 GI50 data. PMID:23203877
Thailand mutation and variation database (ThaiMUT).

PubMed

Ruangrit, Uttapong; Srikummool, Metawee; Assawamakin, Anunchai; Ngamphiw, Chumpol; Chuechote, Suparat; Thaiprasarnsup, Vilasinee; Agavatpanitch, Gallissara; Pasomsab, Ekawat; Yenchitsomanus, Pa-Thai; Mahasirimongkol, Surakameth; Chantratita, Wasun; Palittapongarnpim, Prasit; Uyyanonvara, Bunyarit; Limwongse, Chanin; Tongsima, Sissades

2008-08-01

With the completion of the human genome project, novel sequencing and genotyping technologies had been utilized to detect mutations. Such mutations have continually been produced at exponential rate by researchers in various communities. Based on the population's mutation spectra, occurrences of Mendelian diseases are different across ethnic groups. A proportion of Mendelian diseases can be observed in some countries at higher rates than others. Recognizing the importance of mutation effects in Thailand, we established a National and Ethnic Mutation Database (NEMDB) for Thai people. This database, named Thailand Mutation and Variation database (ThaiMUT), offers a web-based access to genetic mutation and variation information in Thai population. This NEMDB initiative is an important informatics tool for both research and clinical purposes to retrieve and deposit human variation data. The mutation data cataloged in ThaiMUT database were derived from journal articles available in PubMed and local publications. In addition to collected mutation data, ThaiMUT also records genetic polymorphisms located in drug related genes. ThaiMUT could then provide useful information for clinical mutation screening services for Mendelian diseases and pharmacogenomic researches. ThaiMUT can be publicly accessed from http://gi.biotec.or.th/thaimut.
Materials engineering data base

NASA Technical Reports Server (NTRS)

1995-01-01

The various types of materials related data that exist at the NASA Marshall Space Flight Center and compiled into databases which could be accessed by all the NASA centers and by other contractors, are presented.
Technical Aspects of Interfacing MUMPS to an External SQL Relational Database Management System

PubMed Central

Kuzmak, Peter M.; Walters, Richard F.; Penrod, Gail

1988-01-01

This paper describes an interface connecting InterSystems MUMPS (M/VX) to an external relational DBMS, the SYBASE Database Management System. The interface enables MUMPS to operate in a relational environment and gives the MUMPS language full access to a complete set of SQL commands. MUMPS generates SQL statements as ASCII text and sends them to the RDBMS. The RDBMS executes the statements and returns ASCII results to MUMPS. The interface suggests that the language features of MUMPS make it an attractive tool for use in the relational database environment. The approach described in this paper separates MUMPS from the relational database. Positioning the relational database outside of MUMPS promotes data sharing and permits a number of different options to be used for working with the data. Other languages like C, FORTRAN, and COBOL can access the RDBMS database. Advanced tools provided by the relational database vendor can also be used. SYBASE is an advanced high-performance transaction-oriented relational database management system for the VAX/VMS and UNIX operating systems. SYBASE is designed using a distributed open-systems architecture, and is relatively easy to interface with MUMPS.
Navigation integrity monitoring and obstacle detection for enhanced-vision systems

NASA Astrophysics Data System (ADS)

Korn, Bernd; Doehler, Hans-Ullrich; Hecker, Peter

2001-08-01

Typically, Enhanced Vision (EV) systems consist of two main parts, sensor vision and synthetic vision. Synthetic vision usually generates a virtual out-the-window view using databases and accurate navigation data, e. g. provided by differential GPS (DGPS). The reliability of the synthetic vision highly depends on both, the accuracy of the used database and the integrity of the navigation data. But especially in GPS based systems, the integrity of the navigation can't be guaranteed. Furthermore, only objects that are stored in the database can be displayed to the pilot. Consequently, unexpected obstacles are invisible and this might cause severe problems. Therefore, additional information has to be extracted from sensor data to overcome these problems. In particular, the sensor data analysis has to identify obstacles and has to monitor the integrity of databases and navigation. Furthermore, if a lack of integrity arises, navigation data, e.g. the relative position of runway and aircraft, has to be extracted directly from the sensor data. The main contribution of this paper is about the realization of these three sensor data analysis tasks within our EV system, which uses the HiVision 35 GHz MMW radar of EADS, Ulm as the primary EV sensor. For the integrity monitoring, objects extracted from radar images are registered with both database objects and objects (e. g. other aircrafts) transmitted via data link. This results in a classification into known and unknown radar image objects and consequently, in a validation of the integrity of database and navigation. Furthermore, special runway structures are searched for in the radar image where they should appear. The outcome of this runway check contributes to the integrity analysis, too. Concurrent to this investigation a radar image based navigation is performed without using neither precision navigation nor detailed database information to determine the aircraft's position relative to the runway. The performance of our approach is demonstrated with real data acquired during extensive flight tests to several airports in Northern Germany.
Reactome graph database: Efficient access to complex pathway data

PubMed Central

Korninger, Florian; Viteri, Guilherme; Marin-Garcia, Pablo; Ping, Peipei; Wu, Guanming; Stein, Lincoln; D’Eustachio, Peter

2018-01-01

Reactome is a free, open-source, open-data, curated and peer-reviewed knowledgebase of biomolecular pathways. One of its main priorities is to provide easy and efficient access to its high quality curated data. At present, biological pathway databases typically store their contents in relational databases. This limits access efficiency because there are performance issues associated with queries traversing highly interconnected data. The same data in a graph database can be queried more efficiently. Here we present the rationale behind the adoption of a graph database (Neo4j) as well as the new ContentService (REST API) that provides access to these data. The Neo4j graph database and its query language, Cypher, provide efficient access to the complex Reactome data model, facilitating easy traversal and knowledge discovery. The adoption of this technology greatly improved query efficiency, reducing the average query time by 93%. The web service built on top of the graph database provides programmatic access to Reactome data by object oriented queries, but also supports more complex queries that take advantage of the new underlying graph-based data storage. By adopting graph database technology we are providing a high performance pathway data resource to the community. The Reactome graph database use case shows the power of NoSQL database engines for complex biological data types. PMID:29377902
Reactome graph database: Efficient access to complex pathway data.

PubMed

Fabregat, Antonio; Korninger, Florian; Viteri, Guilherme; Sidiropoulos, Konstantinos; Marin-Garcia, Pablo; Ping, Peipei; Wu, Guanming; Stein, Lincoln; D'Eustachio, Peter; Hermjakob, Henning

2018-01-01

Reactome is a free, open-source, open-data, curated and peer-reviewed knowledgebase of biomolecular pathways. One of its main priorities is to provide easy and efficient access to its high quality curated data. At present, biological pathway databases typically store their contents in relational databases. This limits access efficiency because there are performance issues associated with queries traversing highly interconnected data. The same data in a graph database can be queried more efficiently. Here we present the rationale behind the adoption of a graph database (Neo4j) as well as the new ContentService (REST API) that provides access to these data. The Neo4j graph database and its query language, Cypher, provide efficient access to the complex Reactome data model, facilitating easy traversal and knowledge discovery. The adoption of this technology greatly improved query efficiency, reducing the average query time by 93%. The web service built on top of the graph database provides programmatic access to Reactome data by object oriented queries, but also supports more complex queries that take advantage of the new underlying graph-based data storage. By adopting graph database technology we are providing a high performance pathway data resource to the community. The Reactome graph database use case shows the power of NoSQL database engines for complex biological data types.
Relation of pregnancy and neonatal factors to subsequent development of childhood epilepsy: a population-based cohort study.

PubMed

Whitehead, Elizabeth; Dodds, Linda; Joseph, K S; Gordon, Kevin E; Wood, Ellen; Allen, Alexander C; Camfield, Peter; Dooley, Joseph M

2006-04-01

We examined the effect of pregnancy and neonatal factors on the subsequent development of childhood epilepsy in a population-based cohort study. Children born between January 1986 and December 2000 in Nova Scotia, Canada were followed up to December 2001. Data on pregnancy and neonatal events and on diagnoses of childhood epilepsy were obtained through record linkage of 2 population-based databases: the Nova Scotia Atlee Perinatal Database and the Canadian Epilepsy Database and Registry. Factors analyzed included events during the prenatal, labor and delivery, and neonatal time periods. Cox proportional hazards regression models were used to estimate relative risks and 95% confidence intervals. There were 648 new cases of epilepsy diagnosed among 124,207 live births, for an overall rate of 63 per 100,000 person-years. Incidence rates were highest among children <1 year of age. In adjusted analyses, factors significantly associated with an increased risk of epilepsy included eclampsia, neonatal seizures, central nervous system (CNS) anomalies, placental abruption, major non-CNS anomalies, neonatal metabolic disorders, neonatal CNS diseases, previous low birth weight infant, infection in pregnancy, small for gestational age, unmarried, and not breastfeeding infant at the time of discharge from hospital. Our study supports the concept that prenatal factors contribute to the occurrence of subsequent childhood epilepsy.
A blue carbon soil database: Tidal wetland stocks for the US National Greenhouse Gas Inventory

NASA Astrophysics Data System (ADS)

Feagin, R. A.; Eriksson, M.; Hinson, A.; Najjar, R. G.; Kroeger, K. D.; Herrmann, M.; Holmquist, J. R.; Windham-Myers, L.; MacDonald, G. M.; Brown, L. N.; Bianchi, T. S.

2015-12-01

Coastal wetlands contain large reservoirs of carbon, and in 2015 the US National Greenhouse Gas Inventory began the work of placing blue carbon within the national regulatory context. The potential value of a wetland carbon stock, in relation to its location, soon could be influential in determining governmental policy and management activities, or in stimulating market-based CO2 sequestration projects. To meet the national need for high-resolution maps, a blue carbon stock database was developed linking National Wetlands Inventory datasets with the USDA Soil Survey Geographic Database. Users of the database can identify the economic potential for carbon conservation or restoration projects within specific estuarine basins, states, wetland types, physical parameters, and land management activities. The database is geared towards both national-level assessments and local-level inquiries. Spatial analysis of the stocks show high variance within individual estuarine basins, largely dependent on geomorphic position on the landscape, though there are continental scale trends to the carbon distribution as well. Future plans including linking this database with a sedimentary accretion database to predict carbon flux in US tidal wetlands.
Chapter 51: How to Build a Simple Cone Search Service Using a Local Database

NASA Astrophysics Data System (ADS)

Kent, B. R.; Greene, G. R.

The cone search service protocol will be examined from the server side in this chapter. A simple cone search service will be setup and configured locally using MySQL. Data will be read into a table, and the Java JDBC will be used to connect to the database. Readers will understand the VO cone search specification and how to use it to query a database on their local systems and return an XML/VOTable file based on an input of RA/DEC coordinates and a search radius. The cone search in this example will be deployed as a Java servlet. The resulting cone search can be tested with a verification service. This basic setup can be used with other languages and relational databases.
Information resources at the National Center for Biotechnology Information.

PubMed Central

Woodsmall, R M; Benson, D A

1993-01-01

The National Center for Biotechnology Information (NCBI), part of the National Library of Medicine, was established in 1988 to perform basic research in the field of computational molecular biology as well as build and distribute molecular biology databases. The basic research has led to new algorithms and analysis tools for interpreting genomic data and has been instrumental in the discovery of human disease genes for neurofibromatosis and Kallmann syndrome. The principal database responsibility is the National Institutes of Health (NIH) genetic sequence database, GenBank. NCBI, in collaboration with international partners, builds, distributes, and provides online and CD-ROM access to over 112,000 DNA sequences. Another major program is the integration of multiple sequences databases and related bibliographic information and the development of network-based retrieval systems for Internet access. PMID:8374583
The effective use of newspaper information in corporations (2) Centered around corporate and managemant information

NASA Astrophysics Data System (ADS)

Kamio, Tatsuo

A newspaper article is a fragmentary record of fact. For information activities in corporations it is fundamental to gather newspaper articles related to the object thema as many as possible, integrate them, analyze them, and then, create new intelligence based on them. Here in databases become effective measures. It seems essential to construct searching strategy with high recall of necessary information and understand the databases in detail when we use newspaper article databases. The cases that newspaper databases are useful for business are represented by (1) research and analysis for problem solving, (2) gathering of knowledge, and confirmation of the facts, and (3) constant observation of facts without missing any change in there. Particularly for case (1) various methods are tried for analyzing the tendency.
CoReCG: a comprehensive database of genes associated with colon-rectal cancer

PubMed Central

Agarwal, Rahul; Kumar, Binayak; Jayadev, Msk; Raghav, Dhwani; Singh, Ashutosh

2016-01-01

Cancer of large intestine is commonly referred as colorectal cancer, which is also the third most frequently prevailing neoplasm across the globe. Though, much of work is being carried out to understand the mechanism of carcinogenesis and advancement of this disease but, fewer studies has been performed to collate the scattered information of alterations in tumorigenic cells like genes, mutations, expression changes, epigenetic alteration or post translation modification, genetic heterogeneity. Earlier findings were mostly focused on understanding etiology of colorectal carcinogenesis but less emphasis were given for the comprehensive review of the existing findings of individual studies which can provide better diagnostics based on the suggested markers in discrete studies. Colon Rectal Cancer Gene Database (CoReCG), contains 2056 colon-rectal cancer genes information involved in distinct colorectal cancer stages sourced from published literature with an effective knowledge based information retrieval system. Additionally, interactive web interface enriched with various browsing sections, augmented with advance search facility for querying the database is provided for user friendly browsing, online tools for sequence similarity searches and knowledge based schema ensures a researcher friendly information retrieval mechanism. Colorectal cancer gene database (CoReCG) is expected to be a single point source for identification of colorectal cancer-related genes, thereby helping with the improvement of classification, diagnosis and treatment of human cancers. Database URL: lms.snu.edu.in/corecg PMID:27114494
The Gypsy Database (GyDB) of mobile genetic elements: release 2.0

PubMed Central

Llorens, Carlos; Futami, Ricardo; Covelli, Laura; Domínguez-Escribá, Laura; Viu, Jose M.; Tamarit, Daniel; Aguilar-Rodríguez, Jose; Vicente-Ripolles, Miguel; Fuster, Gonzalo; Bernet, Guillermo P.; Maumus, Florian; Munoz-Pomer, Alfonso; Sempere, Jose M.; Latorre, Amparo; Moya, Andres

2011-01-01

This article introduces the second release of the Gypsy Database of Mobile Genetic Elements (GyDB 2.0): a research project devoted to the evolutionary dynamics of viruses and transposable elements based on their phylogenetic classification (per lineage and protein domain). The Gypsy Database (GyDB) is a long-term project that is continuously progressing, and that owing to the high molecular diversity of mobile elements requires to be completed in several stages. GyDB 2.0 has been powered with a wiki to allow other researchers participate in the project. The current database stage and scope are long terminal repeats (LTR) retroelements and relatives. GyDB 2.0 is an update based on the analysis of Ty3/Gypsy, Retroviridae, Ty1/Copia and Bel/Pao LTR retroelements and the Caulimoviridae pararetroviruses of plants. Among other features, in terms of the aforementioned topics, this update adds: (i) a variety of descriptions and reviews distributed in multiple web pages; (ii) protein-based phylogenies, where phylogenetic levels are assigned to distinct classified elements; (iii) a collection of multiple alignments, lineage-specific hidden Markov models and consensus sequences, called GyDB collection; (iv) updated RefSeq databases and BLAST and HMM servers to facilitate sequence characterization of new LTR retroelement and caulimovirus queries; and (v) a bibliographic server. GyDB 2.0 is available at http://gydb.org. PMID:21036865
The Gypsy Database (GyDB) of mobile genetic elements: release 2.0.

PubMed

Llorens, Carlos; Futami, Ricardo; Covelli, Laura; Domínguez-Escribá, Laura; Viu, Jose M; Tamarit, Daniel; Aguilar-Rodríguez, Jose; Vicente-Ripolles, Miguel; Fuster, Gonzalo; Bernet, Guillermo P; Maumus, Florian; Munoz-Pomer, Alfonso; Sempere, Jose M; Latorre, Amparo; Moya, Andres

2011-01-01

This article introduces the second release of the Gypsy Database of Mobile Genetic Elements (GyDB 2.0): a research project devoted to the evolutionary dynamics of viruses and transposable elements based on their phylogenetic classification (per lineage and protein domain). The Gypsy Database (GyDB) is a long-term project that is continuously progressing, and that owing to the high molecular diversity of mobile elements requires to be completed in several stages. GyDB 2.0 has been powered with a wiki to allow other researchers participate in the project. The current database stage and scope are long terminal repeats (LTR) retroelements and relatives. GyDB 2.0 is an update based on the analysis of Ty3/Gypsy, Retroviridae, Ty1/Copia and Bel/Pao LTR retroelements and the Caulimoviridae pararetroviruses of plants. Among other features, in terms of the aforementioned topics, this update adds: (i) a variety of descriptions and reviews distributed in multiple web pages; (ii) protein-based phylogenies, where phylogenetic levels are assigned to distinct classified elements; (iii) a collection of multiple alignments, lineage-specific hidden Markov models and consensus sequences, called GyDB collection; (iv) updated RefSeq databases and BLAST and HMM servers to facilitate sequence characterization of new LTR retroelement and caulimovirus queries; and (v) a bibliographic server. GyDB 2.0 is available at http://gydb.org.
A geodata warehouse: Using denormalisation techniques as a tool for delivering spatially enabled integrated geological information to geologists

NASA Astrophysics Data System (ADS)

Kingdon, Andrew; Nayembil, Martin L.; Richardson, Anne E.; Smith, A. Graham

2016-11-01

New requirements to understand geological properties in three dimensions have led to the development of PropBase, a data structure and delivery tools to deliver this. At the BGS, relational database management systems (RDBMS) has facilitated effective data management using normalised subject-based database designs with business rules in a centralised, vocabulary controlled, architecture. These have delivered effective data storage in a secure environment. However, isolated subject-oriented designs prevented efficient cross-domain querying of datasets. Additionally, the tools provided often did not enable effective data discovery as they struggled to resolve the complex underlying normalised structures providing poor data access speeds. Users developed bespoke access tools to structures they did not fully understand sometimes delivering them incorrect results. Therefore, BGS has developed PropBase, a generic denormalised data structure within an RDBMS to store property data, to facilitate rapid and standardised data discovery and access, incorporating 2D and 3D physical and chemical property data, with associated metadata. This includes scripts to populate and synchronise the layer with its data sources through structured input and transcription standards. A core component of the architecture includes, an optimised query object, to deliver geoscience information from a structure equivalent to a data warehouse. This enables optimised query performance to deliver data in multiple standardised formats using a web discovery tool. Semantic interoperability is enforced through vocabularies combined from all data sources facilitating searching of related terms. PropBase holds 28.1 million spatially enabled property data points from 10 source databases incorporating over 50 property data types with a vocabulary set that includes 557 property terms. By enabling property data searches across multiple databases PropBase has facilitated new scientific research, previously considered impractical. PropBase is easily extended to incorporate 4D data (time series) and is providing a baseline for new "big data" monitoring projects.
Design and development of a web-based application for diabetes patient data management.

PubMed

Deo, S S; Deobagkar, D N; Deobagkar, Deepti D

2005-01-01

A web-based database management system developed for collecting, managing and analysing information of diabetes patients is described here. It is a searchable, client-server, relational database application, developed on the Windows platform using Oracle, Active Server Pages (ASP), Visual Basic Script (VB Script) and Java Script. The software is menu-driven and allows authorized healthcare providers to access, enter, update and analyse patient information. Graphical representation of data can be generated by the system using bar charts and pie charts. An interactive web interface allows users to query the database and generate reports. Alpha- and beta-testing of the system was carried out and the system at present holds records of 500 diabetes patients and is found useful in diagnosis and treatment. In addition to providing patient data on a continuous basis in a simple format, the system is used in population and comparative analysis. It has proved to be of significant advantage to the healthcare provider as compared to the paper-based system.
Mandatory and Location-Aware Access Control for Relational Databases

NASA Astrophysics Data System (ADS)

Decker, Michael

Access control is concerned with determining which operations a particular user is allowed to perform on a particular electronic resource. For example, an access control decision could say that user Alice is allowed to perform the operation read (but not write) on the resource research report. With conventional access control this decision is based on the user's identity whereas the basic idea of Location-Aware Access Control (LAAC) is to evaluate also a user's current location when making the decision if a particular request should be granted or denied. LAAC is an interesting approach for mobile information systems because these systems are exposed to specific security threads like the loss of a device. Some data models for LAAC can be found in literature, but almost all of them are based on RBAC and none of them is designed especially for Database Management Systems (DBMS). In this paper we therefore propose a LAAC-approach for DMBS and describe a prototypical implementation of that approach that is based on database triggers.
Integration of multiple DICOM Web servers into an enterprise-wide Web-based electronic medical record

NASA Astrophysics Data System (ADS)

Stewart, Brent K.; Langer, Steven G.; Martin, Kelly P.

1999-07-01

The purpose of this paper is to integrate multiple DICOM image webservers into the currently existing enterprises- wide web-browsable electronic medical record. Over the last six years the University of Washington has created a clinical data repository combining in a distributed relational database information from multiple departmental databases (MIND). A character cell-based view of this data called the Mini Medical Record (MMR) has been available for four years, MINDscape, unlike the text-based MMR. provides a platform independent, dynamic, web browser view of the MIND database that can be easily linked with medical knowledge resources on the network, like PubMed and the Federated Drug Reference. There are over 10,000 MINDscape user accounts at the University of Washington Academic Medical Centers. The weekday average number of hits to MINDscape is 35,302 and weekday average number of individual users is 1252. DICOM images from multiple webservers are now being viewed through the MINDscape electronic medical record.

Geo-spatial Service and Application based on National E-government Network Platform and Cloud

NASA Astrophysics Data System (ADS)

Meng, X.; Deng, Y.; Li, H.; Yao, L.; Shi, J.

2014-04-01

With the acceleration of China's informatization process, our party and government take a substantive stride in advancing development and application of digital technology, which promotes the evolution of e-government and its informatization. Meanwhile, as a service mode based on innovative resources, cloud computing may connect huge pools together to provide a variety of IT services, and has become one relatively mature technical pattern with further studies and massive practical applications. Based on cloud computing technology and national e-government network platform, "National Natural Resources and Geospatial Database (NRGD)" project integrated and transformed natural resources and geospatial information dispersed in various sectors and regions, established logically unified and physically dispersed fundamental database and developed national integrated information database system supporting main e-government applications. Cross-sector e-government applications and services are realized to provide long-term, stable and standardized natural resources and geospatial fundamental information products and services for national egovernment and public users.
78 FR 42775 - CGI Federal, Inc., and Custom Applications Management; Transfer of Data

Federal Register 2010, 2011, 2012, 2013, 2014

2013-07-17

... develop applications, Web sites, Web pages, web-based applications and databases, in accordance with EPA policies and related Federal standards and procedures. The Contractor will provide [[Page 42776
Over 20 years of reaction access systems from MDL: a novel reaction substructure search algorithm.

PubMed

Chen, Lingran; Nourse, James G; Christie, Bradley D; Leland, Burton A; Grier, David L

2002-01-01

From REACCS, to MDL ISIS/Host Reaction Gateway, and most recently to MDL Relational Chemistry Server, a new product based on Oracle data cartridge technology, MDL's reaction database management and retrieval systems have undergone great changes. The evolution of the system architecture is briefly discussed. The evolution of MDL reaction substructure search (RSS) algorithms is detailed. This article mainly describes a novel RSS algorithm. This algorithm is based on a depth-first search approach and is able to fully and prospectively use reaction specific information, such as reacting center and atom-atom mapping (AAM) information. The new algorithm has been used in the recently released MDL Relational Chemistry Server and allows the user to precisely find reaction instances in databases while minimizing unrelated hits. Finally, the existing and new RSS algorithms are compared with several examples.
What have we learned in minimally invasive colorectal surgery from NSQIP and NIS large databases? A systematic review.

PubMed

Batista Rodríguez, Gabriela; Balla, Andrea; Corradetti, Santiago; Martinez, Carmen; Hernández, Pilar; Bollo, Jesús; Targarona, Eduard M

2018-06-01

"Big data" refers to large amount of dataset. Those large databases are useful in many areas, including healthcare. The American College of Surgeons National Surgical Quality Improvement Program (ACS-NSQIP) and the National Inpatient Sample (NIS) are big databases that were developed in the USA in order to record surgical outcomes. The aim of the present systematic review is to evaluate the type and clinical impact of the information retrieved through NISQP and NIS big database articles focused on laparoscopic colorectal surgery. A systematic review was conducted using The Meta-Analysis Of Observational Studies in Epidemiology (MOOSE) guidelines. The research was carried out on PubMed database and revealed 350 published papers. Outcomes of articles in which laparoscopic colorectal surgery was the primary aim were analyzed. Fifty-five studies, published between 2007 and February 2017, were included. Articles included were categorized in groups according to the main topic as: outcomes related to surgical technique comparisons, morbidity and perioperatory results, specific disease-related outcomes, sociodemographic disparities, and academic training impact. NSQIP and NIS databases are just the tip of the iceberg for the potential application of Big Data technology and analysis in MIS. Information obtained through big data is useful and could be considered as external validation in those situations where a significant evidence-based medicine exists; also, those databases establish benchmarks to measure the quality of patient care. Data retrieved helps to inform decision-making and improve healthcare delivery.
Current Standardization and Cooperative Efforts Related to Industrial Information Infrastructures.

DTIC Science & Technology

1993-05-01

Data Management Systems: Components used to store, manage, and retrieve data. Data management includes knowledge bases, database management...Application Development Tools and Methods X/Open and POSIX APIs Integrated Design Support System (IDS) Knowledge -Based Systems (KBS) Application...IDEFlx) Yourdon Jackson System Design (JSD) Knowledge -Based Systems (KBSs) Structured Systems Development (SSD) Semantic Unification Meta-Model
Ten Years Experience In Geo-Databases For Linear Facilities Risk Assessment (Lfra)

NASA Astrophysics Data System (ADS)

Oboni, F.

2003-04-01

Keywords: geo-environmental, database, ISO14000, management, decision-making, risk, pipelines, roads, railroads, loss control, SAR, hazard identification ABSTRACT: During the past decades, characterized by the development of the Risk Management (RM) culture, a variety of different RM models have been proposed by governmental agencies in various parts of the world. The most structured models appear to have originated in the field of environmental RM. These models are briefly reviewed in the first section of the paper focusing the attention on the difference between Hazard Management and Risk Management and the need to use databases in order to allow retrieval of specific information and effective updating. The core of the paper reviews a number of different RM approaches, based on extensions of geo-databases, specifically developed for linear facilities (LF) in transportation corridors since the early 90s in Switzerland, Italy, Canada, the US and South America. The applications are compared in terms of methodology, capabilities and resources necessary to their implementation. The paper then focuses the attention on the level of detail that applications and related data have to attain. Common pitfalls related to decision making based on hazards rather than on risks are discussed. The paper focuses the last sections on the description of the next generation of linear facility RA application, including examples of results and discussion of future methodological research. It is shown that geo-databases should be linked to loss control and accident reports in order to maximize their benefits. The links between RA and ISO 14000 (environmental management code) are explicitly considered.
Fee-Based Services and the Public Library: An Administrative Perspective.

ERIC Educational Resources Information Center

Gaines, Ervin J.; Huttner, Marian A.

1983-01-01

This article enumerates factors which created demand for fee-based information service (commercial databases, competition for proprietary information in business world, effectiveness of librarians) and relates experiences at two public libraries. Sources of business, value of advertising, techniques of selling, and hiring and deployment of staff…
Survey of standards applicable to a database management system

NASA Technical Reports Server (NTRS)

Urena, J. L.

1981-01-01

Industry, government, and NASA standards, and the status of standardization activities of standards setting organizations applicable to the design, implementation and operation of a data base management system for space related applications are identified. The applicability of the standards to a general purpose, multimission data base management system is addressed.
Cyclebase 3.0: a multi-organism database on cell-cycle regulation and phenotypes.

PubMed

Santos, Alberto; Wernersson, Rasmus; Jensen, Lars Juhl

2015-01-01

The eukaryotic cell division cycle is a highly regulated process that consists of a complex series of events and involves thousands of proteins. Researchers have studied the regulation of the cell cycle in several organisms, employing a wide range of high-throughput technologies, such as microarray-based mRNA expression profiling and quantitative proteomics. Due to its complexity, the cell cycle can also fail or otherwise change in many different ways if important genes are knocked out, which has been studied in several microscopy-based knockdown screens. The data from these many large-scale efforts are not easily accessed, analyzed and combined due to their inherent heterogeneity. To address this, we have created Cyclebase--available at http://www.cyclebase.org--an online database that allows users to easily visualize and download results from genome-wide cell-cycle-related experiments. In Cyclebase version 3.0, we have updated the content of the database to reflect changes to genome annotation, added new mRNA and protein expression data, and integrated cell-cycle phenotype information from high-content screens and model-organism databases. The new version of Cyclebase also features a new web interface, designed around an overview figure that summarizes all the cell-cycle-related data for a gene. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Analysis of prescription database extracted from standard textbooks of traditional Dai medicine.

PubMed

Zhang, Chuang; Chongsuvivatwong, Virasakdi; Keawpradub, Niwat; Lin, Yanfang

2012-08-29

Traditional Dai Medicine (TDM) is one of the four major ethnomedicine of China. In 2007 a group of experts produced a set of seven Dai medical textbooks on this subject. The first two were selected as the main data source to analyse well recognized prescriptions. To quantify patterns of prescriptions, common ingredients, indications and usages of TDM. A relational database linking the prescriptions, ingredients, herb names, indications, and usages was set up. Frequency of pattern of combination and common ingredients were tabulated. A total of 200 prescriptions and 402 herbs were compiled. Prescriptions based on "wind" disorders, a detoxification theory that most commonly deals with symptoms of digestive system diseases, accounted for over one third of all prescriptions. The major methods of preparations mostly used roots and whole herbs. The information extracted from the relational database may be useful for understanding symptomatic treatments. Antidote and detoxification theory deserves further research.
Combined use of computational chemistry and chemoinformatics methods for chemical discovery

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sugimoto, Manabu, E-mail: sugimoto@kumamoto-u.ac.jp; Institute for Molecular Science, 38 Nishigo-Naka, Myodaiji, Okazaki 444-8585; CREST, Japan Science and Technology Agency, 4-1-8 Honcho, Kawaguchi, Saitama 332-0012

2015-12-31

Data analysis on numerical data by the computational chemistry calculations is carried out to obtain knowledge information of molecules. A molecular database is developed to systematically store chemical, electronic-structure, and knowledge-based information. The database is used to find molecules related to a keyword of “cancer”. Then the electronic-structure calculations are performed to quantitatively evaluate quantum chemical similarity of the molecules. Among the 377 compounds registered in the database, 24 molecules are found to be “cancer”-related. This set of molecules includes both carcinogens and anticancer drugs. The quantum chemical similarity analysis, which is carried out by using numerical results of themore » density-functional theory calculations, shows that, when some energy spectra are referred to, carcinogens are reasonably distinguished from the anticancer drugs. Therefore these spectral properties are considered of as important measures for classification.« less
UniGene Tabulator: a full parser for the UniGene format.

PubMed

Lenzi, Luca; Frabetti, Flavia; Facchin, Federica; Casadei, Raffaella; Vitale, Lorenza; Canaider, Silvia; Carinci, Paolo; Zannotti, Maria; Strippoli, Pierluigi

2006-10-15

UniGene Tabulator 1.0 provides a solution for full parsing of UniGene flat file format; it implements a structured graphical representation of each data field present in UniGene following import into a common database managing system usable in a personal computer. This database includes related tables for sequence, protein similarity, sequence-tagged site (STS) and transcript map interval (TXMAP) data, plus a summary table where each record represents a UniGene cluster. UniGene Tabulator enables full local management of UniGene data, allowing parsing, querying, indexing, retrieving, exporting and analysis of UniGene data in a relational database form, usable on Macintosh (OS X 10.3.9 or later) and Windows (2000, with service pack 4, XP, with service pack 2 or later) operating systems-based computers. The current release, including both the FileMaker runtime applications, is freely available at http://apollo11.isto.unibo.it/software/
IDAAPM: integrated database of ADMET and adverse effects of predictive modeling based on FDA approved drug data.

PubMed

Legehar, Ashenafi; Xhaard, Henri; Ghemtio, Leo

2016-01-01

The disposition of a pharmaceutical compound within an organism, i.e. its Absorption, Distribution, Metabolism, Excretion, Toxicity (ADMET) properties and adverse effects, critically affects late stage failure of drug candidates and has led to the withdrawal of approved drugs. Computational methods are effective approaches to reduce the number of safety issues by analyzing possible links between chemical structures and ADMET or adverse effects, but this is limited by the size, quality, and heterogeneity of the data available from individual sources. Thus, large, clean and integrated databases of approved drug data, associated with fast and efficient predictive tools are desirable early in the drug discovery process. We have built a relational database (IDAAPM) to integrate available approved drug data such as drug approval information, ADMET and adverse effects, chemical structures and molecular descriptors, targets, bioactivity and related references. The database has been coupled with a searchable web interface and modern data analytics platform (KNIME) to allow data access, data transformation, initial analysis and further predictive modeling. Data were extracted from FDA resources and supplemented from other publicly available databases. Currently, the database contains information regarding about 19,226 FDA approval applications for 31,815 products (small molecules and biologics) with their approval history, 2505 active ingredients, together with as many ADMET properties, 1629 molecular structures, 2.5 million adverse effects and 36,963 experimental drug-target bioactivity data. IDAAPM is a unique resource that, in a single relational database, provides detailed information on FDA approved drugs including their ADMET properties and adverse effects, the corresponding targets with bioactivity data, coupled with a data analytics platform. It can be used to perform basic to complex drug-target ADMET or adverse effects analysis and predictive modeling. IDAAPM is freely accessible at http://idaapm.helsinki.fi and can be exploited through a KNIME workflow connected to the database.Graphical abstractFDA approved drug data integration for predictive modeling.
In silico mining of putative microsatellite markers from whole genome sequence of water buffalo (Bubalus bubalis) and development of first BuffSatDB

PubMed Central

2013-01-01

Background Though India has sequenced water buffalo genome but its draft assembly is based on cattle genome BTau 4.0, thus de novo chromosome wise assembly is a major pending issue for global community. The existing radiation hybrid of buffalo and these reported STR can be used further in final gap plugging and “finishing” expected in de novo genome assembly. QTL and gene mapping needs mining of putative STR from buffalo genome at equal interval on each and every chromosome. Such markers have potential role in improvement of desirable characteristics, such as high milk yields, resistance to diseases, high growth rate. The STR mining from whole genome and development of user friendly database is yet to be done to reap the benefit of whole genome sequence. Description By in silico microsatellite mining of whole genome, we have developed first STR database of water buffalo, BuffSatDb (Buffalo MicroSatellite Database (http://cabindb.iasri.res.in/buffsatdb/) which is a web based relational database of 910529 microsatellite markers, developed using PHP and MySQL database. Microsatellite markers have been generated using MIcroSAtellite tool. It is simple and systematic web based search for customised retrieval of chromosome wise and genome-wide microsatellites. Search has been enabled based on chromosomes, motif type (mono-hexa), repeat motif and repeat kind (simple and composite). The search may be customised by limiting location of STR on chromosome as well as number of markers in that range. This is a novel approach and not been implemented in any of the existing marker database. This database has been further appended with Primer3 for primer designing of the selected markers enabling researcher to select markers of choice at desired interval over the chromosome. The unique add-on of degenerate bases further helps in resolving presence of degenerate bases in current buffalo assembly. Conclusion Being first buffalo STR database in the world , this would not only pave the way in resolving current assembly problem but shall be of immense use for global community in QTL/gene mapping critically required to increase knowledge in the endeavour to increase buffalo productivity, especially for third world country where rural economy is significantly dependent on buffalo productivity. PMID:23336431
In silico mining of putative microsatellite markers from whole genome sequence of water buffalo (Bubalus bubalis) and development of first BuffSatDB.

PubMed

Sarika; Arora, Vasu; Iquebal, Mir Asif; Rai, Anil; Kumar, Dinesh

2013-01-19

Though India has sequenced water buffalo genome but its draft assembly is based on cattle genome BTau 4.0, thus de novo chromosome wise assembly is a major pending issue for global community. The existing radiation hybrid of buffalo and these reported STR can be used further in final gap plugging and "finishing" expected in de novo genome assembly. QTL and gene mapping needs mining of putative STR from buffalo genome at equal interval on each and every chromosome. Such markers have potential role in improvement of desirable characteristics, such as high milk yields, resistance to diseases, high growth rate. The STR mining from whole genome and development of user friendly database is yet to be done to reap the benefit of whole genome sequence. By in silico microsatellite mining of whole genome, we have developed first STR database of water buffalo, BuffSatDb (Buffalo MicroSatellite Database (http://cabindb.iasri.res.in/buffsatdb/) which is a web based relational database of 910529 microsatellite markers, developed using PHP and MySQL database. Microsatellite markers have been generated using MIcroSAtellite tool. It is simple and systematic web based search for customised retrieval of chromosome wise and genome-wide microsatellites. Search has been enabled based on chromosomes, motif type (mono-hexa), repeat motif and repeat kind (simple and composite). The search may be customised by limiting location of STR on chromosome as well as number of markers in that range. This is a novel approach and not been implemented in any of the existing marker database. This database has been further appended with Primer3 for primer designing of the selected markers enabling researcher to select markers of choice at desired interval over the chromosome. The unique add-on of degenerate bases further helps in resolving presence of degenerate bases in current buffalo assembly. Being first buffalo STR database in the world , this would not only pave the way in resolving current assembly problem but shall be of immense use for global community in QTL/gene mapping critically required to increase knowledge in the endeavour to increase buffalo productivity, especially for third world country where rural economy is significantly dependent on buffalo productivity.
The clinical information system GastroBase: integration of image processing and laboratory communication.

PubMed

Kocna, P

1995-01-01

GastroBase, a clinical information system, incorporates patient identification, medical records, images, laboratory data, patient history, physical examination, and other patient-related information. Program modules are written in C; all data is processed using Novell-Btrieve data manager. Patient identification database represents the main core of this information systems. A graphic library developed in the past year and graphic modules with a special video-card enables the storing, archiving, and linking of different images to the electronic patient-medical-record. GastroBase has been running for more than four years in daily routine and the database contains more than 25,000 medical records and 1,500 images. This new version of GastroBase is now incorporated into the clinical information system of University Clinic in Prague.
Coupling computer-interpretable guidelines with a drug-database through a web-based system – The PRESGUID project

PubMed Central

Dufour, Jean-Charles; Fieschi, Dominique; Fieschi, Marius

2004-01-01

Background Clinical Practice Guidelines (CPGs) available today are not extensively used due to lack of proper integration into clinical settings, knowledge-related information resources, and lack of decision support at the point of care in a particular clinical context. Objective The PRESGUID project (PREScription and GUIDelines) aims to improve the assistance provided by guidelines. The project proposes an online service enabling physicians to consult computerized CPGs linked to drug databases for easier integration into the healthcare process. Methods Computable CPGs are structured as decision trees and coded in XML format. Recommendations related to drug classes are tagged with ATC codes. We use a mapping module to enhance computerized guidelines coupling with a drug database, which contains detailed information about each usable specific medication. In this way, therapeutic recommendations are backed up with current and up-to-date information from the database. Results Two authoritative CPGs, originally diffused as static textual documents, have been implemented to validate the computerization process and to illustrate the usefulness of the resulting automated CPGs and their coupling with a drug database. We discuss the advantages of this approach for practitioners and the implications for both guideline developers and drug database providers. Other CPGs will be implemented and evaluated in real conditions by clinicians working in different health institutions. PMID:15053828
Network pharmacology-based strategy for predicting active ingredients and potential targets of Yangxinshi tablet for treating heart failure.

PubMed

Chen, Langdong; Cao, Yan; Zhang, Hai; Lv, Diya; Zhao, Yahong; Liu, Yanjun; Ye, Guan; Chai, Yifeng

2018-01-31

Yangxinshi tablet (YXST) is an effective treatment for heart failure and myocardial infarction; it consists of 13 herbal medicines formulated according to traditional Chinese Medicine (TCM) practices. It has been used for the treatment of cardiovascular disease for many years in China. In this study, a network pharmacology-based strategy was used to elucidate the mechanism of action of YXST for the treatment of heart failure. Cardiovascular disease-related protein target and compound databases were constructed for YXST. A molecular docking platform was used to predict the protein targets of YXST. The affinity between proteins and ingredients was determined using surface plasmon resonance (SPR) assays. The action modes between targets and representative ingredients were calculated using Glide docking, and the related pathways were predicted using the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. A protein target database containing 924 proteins was constructed; 179 compounds in YXST were identified, and 48 compounds with high relevance to the proteins were defined as representative ingredients. Thirty-four protein targets of the 48 representative ingredients were analyzed and classified into two categories: immune and cardiovascular systems. The SPR assay and molecular docking partly validated the interplay between protein targets and representative ingredients. Moreover, 28 pathways related to heart failure were identified, which provided directions for further research on YXST. This study demonstrated that the cardiovascular protective effect of YXST mainly involved the immune and cardiovascular systems. Through the research strategy based on network pharmacology, we analysis the complex system of YXST and found 48 representative compounds, 34 proteins and 28 related pathways of YXST, which could help us understand the underlying mechanism of YSXT's anti-heart failure effect. The network-based investigation could help researchers simplify the complex system of YXSY. It may also offer a feasible approach to decipher the chemical and pharmacological bases of other TCM formulas. Copyright © 2018 Elsevier B.V. All rights reserved.
Wavelet optimization for content-based image retrieval in medical databases.

PubMed

Quellec, G; Lamard, M; Cazuguel, G; Cochener, B; Roux, C

2010-04-01

We propose in this article a content-based image retrieval (CBIR) method for diagnosis aid in medical fields. In the proposed system, images are indexed in a generic fashion, without extracting domain-specific features: a signature is built for each image from its wavelet transform. These image signatures characterize the distribution of wavelet coefficients in each subband of the decomposition. A distance measure is then defined to compare two image signatures and thus retrieve the most similar images in a database when a query image is submitted by a physician. To retrieve relevant images from a medical database, the signatures and the distance measure must be related to the medical interpretation of images. As a consequence, we introduce several degrees of freedom in the system so that it can be tuned to any pathology and image modality. In particular, we propose to adapt the wavelet basis, within the lifting scheme framework, and to use a custom decomposition scheme. Weights are also introduced between subbands. All these parameters are tuned by an optimization procedure, using the medical grading of each image in the database to define a performance measure. The system is assessed on two medical image databases: one for diabetic retinopathy follow up and one for screening mammography, as well as a general purpose database. Results are promising: a mean precision of 56.50%, 70.91% and 96.10% is achieved for these three databases, when five images are returned by the system. Copyright 2009 Elsevier B.V. All rights reserved.
TOPSAN: a dynamic web database for structural genomics.

PubMed

Ellrott, Kyle; Zmasek, Christian M; Weekes, Dana; Sri Krishna, S; Bakolitsa, Constantina; Godzik, Adam; Wooley, John

2011-01-01

The Open Protein Structure Annotation Network (TOPSAN) is a web-based collaboration platform for exploring and annotating structures determined by structural genomics efforts. Characterization of those structures presents a challenge since the majority of the proteins themselves have not yet been characterized. Responding to this challenge, the TOPSAN platform facilitates collaborative annotation and investigation via a user-friendly web-based interface pre-populated with automatically generated information. Semantic web technologies expand and enrich TOPSAN's content through links to larger sets of related databases, and thus, enable data integration from disparate sources and data mining via conventional query languages. TOPSAN can be found at http://www.topsan.org.

DECADE Web Portal: Integrating MaGa, EarthChem and GVP Will Further Our Knowledge on Earth Degassing

NASA Astrophysics Data System (ADS)

Cardellini, C.; Frigeri, A.; Lehnert, K. A.; Ash, J.; McCormick, B.; Chiodini, G.; Fischer, T. P.; Cottrell, E.

2014-12-01

The release of gases from the Earth's interior to the exosphere takes place in both volcanic and non-volcanic areas of the planet. Fully understanding this complex process requires the integration of geochemical, petrological and volcanological data. At present, major online data repositories relevant to studies of degassing are not linked and interoperable. We are developing interoperability between three of those, which will support more powerful synoptic studies of degassing. The three data systems that will make their data accessible via the DECADE portal are: (1) the Smithsonian Institution's Global Volcanism Program database (GVP) of volcanic activity data, (2) EarthChem databases for geochemical and geochronological data of rocks and melt inclusions, and (3) the MaGa database (Mapping Gas emissions) which contains compositional and flux data of gases released at volcanic and non-volcanic degassing sites. These databases are developed and maintained by institutions or groups of experts in a specific field, and data are archived in formats specific to these databases. In the framework of the Deep Earth Carbon Degassing (DECADE) initiative of the Deep Carbon Observatory (DCO), we are developing a web portal that will create a powerful search engine of these databases from a single entry point. The portal will return comprehensive multi-component datasets, based on the search criteria selected by the user. For example, a single geographic or temporal search will return data relating to compositions of emitted gases and erupted products, the age of the erupted products, and coincident activity at the volcano. The development of this level of capability for the DECADE Portal requires complete synergy between these databases, including availability of standard-based web services (WMS, WFS) at all data systems. Data and metadata can thus be extracted from each system without interfering with each database's local schema or being replicated to achieve integration at the DECADE web portal. The DECADE portal will enable new synoptic perspectives on the Earth degassing process. Other data systems can be easily plugged in using the existing framework. Our vision is to explore Earth degassing related datasets over previously unexplored spatial or temporal ranges.
Web-based access to near real-time and archived high-density time-series data: cyber infrastructure challenges & developments in the open-source Waveform Server

NASA Astrophysics Data System (ADS)

Reyes, J. C.; Vernon, F. L.; Newman, R. L.; Steidl, J. H.

2010-12-01

The Waveform Server is an interactive web-based interface to multi-station, multi-sensor and multi-channel high-density time-series data stored in Center for Seismic Studies (CSS) 3.0 schema relational databases (Newman et al., 2009). In the last twelve months, based on expanded specifications and current user feedback, both the server-side infrastructure and client-side interface have been extensively rewritten. The Python Twisted server-side code-base has been fundamentally modified to now present waveform data stored in cluster-based databases using a multi-threaded architecture, in addition to supporting the pre-existing single database model. This allows interactive web-based access to high-density (broadband @ 40Hz to strong motion @ 200Hz) waveform data that can span multiple years; the common lifetime of broadband seismic networks. The client-side interface expands on it's use of simple JSON-based AJAX queries to now incorporate a variety of User Interface (UI) improvements including standardized calendars for defining time ranges, applying on-the-fly data calibration to display SI-unit data, and increased rendering speed. This presentation will outline the various cyber infrastructure challenges we have faced while developing this application, the use-cases currently in existence, and the limitations of web-based application development.
Method to assess the temporal persistence of potential biometric features: Application to oculomotor, gait, face and brain structure databases

PubMed Central

Nixon, Mark S.; Komogortsev, Oleg V.

2017-01-01

We introduce the intraclass correlation coefficient (ICC) to the biometric community as an index of the temporal persistence, or stability, of a single biometric feature. It requires, as input, a feature on an interval or ratio scale, and which is reasonably normally distributed, and it can only be calculated if each subject is tested on 2 or more occasions. For a biometric system, with multiple features available for selection, the ICC can be used to measure the relative stability of each feature. We show, for 14 distinct data sets (1 synthetic, 8 eye-movement-related, 2 gait-related, and 2 face-recognition-related, and one brain-structure-related), that selecting the most stable features, based on the ICC, resulted in the best biometric performance generally. Analyses based on using only the most stable features produced superior Rank-1-Identification Rate (Rank-1-IR) performance in 12 of 14 databases (p = 0.0065, one-tailed), when compared to other sets of features, including the set of all features. For Equal Error Rate (EER), using a subset of only high-ICC features also produced superior performance in 12 of 14 databases (p = 0. 0065, one-tailed). In general, then, for our databases, prescreening potential biometric features, and choosing only highly reliable features yields better performance than choosing lower ICC features or than choosing all features combined. We also determined that, as the ICC of a group of features increases, the median of the genuine similarity score distribution increases and the spread of this distribution decreases. There was no statistically significant similar relationships for the impostor distributions. We believe that the ICC will find many uses in biometric research. In case of the eye movement-driven biometrics, the use of reliable features, as measured by ICC, allowed to us achieve the authentication performance with EER = 2.01%, which was not possible before. PMID:28575030
Method to assess the temporal persistence of potential biometric features: Application to oculomotor, gait, face and brain structure databases.

PubMed

Friedman, Lee; Nixon, Mark S; Komogortsev, Oleg V

2017-01-01

We introduce the intraclass correlation coefficient (ICC) to the biometric community as an index of the temporal persistence, or stability, of a single biometric feature. It requires, as input, a feature on an interval or ratio scale, and which is reasonably normally distributed, and it can only be calculated if each subject is tested on 2 or more occasions. For a biometric system, with multiple features available for selection, the ICC can be used to measure the relative stability of each feature. We show, for 14 distinct data sets (1 synthetic, 8 eye-movement-related, 2 gait-related, and 2 face-recognition-related, and one brain-structure-related), that selecting the most stable features, based on the ICC, resulted in the best biometric performance generally. Analyses based on using only the most stable features produced superior Rank-1-Identification Rate (Rank-1-IR) performance in 12 of 14 databases (p = 0.0065, one-tailed), when compared to other sets of features, including the set of all features. For Equal Error Rate (EER), using a subset of only high-ICC features also produced superior performance in 12 of 14 databases (p = 0. 0065, one-tailed). In general, then, for our databases, prescreening potential biometric features, and choosing only highly reliable features yields better performance than choosing lower ICC features or than choosing all features combined. We also determined that, as the ICC of a group of features increases, the median of the genuine similarity score distribution increases and the spread of this distribution decreases. There was no statistically significant similar relationships for the impostor distributions. We believe that the ICC will find many uses in biometric research. In case of the eye movement-driven biometrics, the use of reliable features, as measured by ICC, allowed to us achieve the authentication performance with EER = 2.01%, which was not possible before.
Improvements on a privacy-protection algorithm for DNA sequences with generalization lattices.

PubMed

Li, Guang; Wang, Yadong; Su, Xiaohong

2012-10-01

When developing personal DNA databases, there must be an appropriate guarantee of anonymity, which means that the data cannot be related back to individuals. DNA lattice anonymization (DNALA) is a successful method for making personal DNA sequences anonymous. However, it uses time-consuming multiple sequence alignment and a low-accuracy greedy clustering algorithm. Furthermore, DNALA is not an online algorithm, and so it cannot quickly return results when the database is updated. This study improves the DNALA method. Specifically, we replaced the multiple sequence alignment in DNALA with global pairwise sequence alignment to save time, and we designed a hybrid clustering algorithm comprised of a maximum weight matching (MWM)-based algorithm and an online algorithm. The MWM-based algorithm is more accurate than the greedy algorithm in DNALA and has the same time complexity. The online algorithm can process data quickly when the database is updated. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.
Identifying and Synchronizing Health Information Technology (HIT) Events from FDA Medical Device Reports.

PubMed

Kang, Hong; Wang, Frank; Zhou, Sicheng; Miao, Qi; Gong, Yang

2017-01-01

Health information technology (HIT) events, a subtype of patient safety events, pose a major threat and barrier toward a safer healthcare system. It is crucial to gain a better understanding of the nature of the errors and adverse events caused by current HIT systems. The scarcity of HIT event-exclusive databases and event reporting systems indicates the challenge of identifying the HIT events from existing resources. FDA Manufacturer and User Facility Device Experience (MAUDE) database is a potential resource for HIT events. However, the low proportion and the rapid evolvement of HIT-related events present challenges for distinguishing them from other equipment failures and hazards. We proposed a strategy to identify and synchronize HIT events from MAUDE by using a filter based on structured features and classifiers based on unstructured features. The strategy will help us develop and grow an HIT event-exclusive database, keeping pace with updates to MAUDE toward shared learning.
A Relational Database System for Student Use.

ERIC Educational Resources Information Center

Fertuck, Len

1982-01-01

Describes an APL implementation of a relational database system suitable for use in a teaching environment in which database development and database administration are studied, and discusses the functions of the user and the database administrator. An appendix illustrating system operation and an eight-item reference list are attached. (Author/JL)
An integrated software suite for surface-based analyses of cerebral cortex.

PubMed

Van Essen, D C; Drury, H A; Dickson, J; Harwell, J; Hanlon, D; Anderson, C H

2001-01-01

The authors describe and illustrate an integrated trio of software programs for carrying out surface-based analyses of cerebral cortex. The first component of this trio, SureFit (Surface Reconstruction by Filtering and Intensity Transformations), is used primarily for cortical segmentation, volume visualization, surface generation, and the mapping of functional neuroimaging data onto surfaces. The second component, Caret (Computerized Anatomical Reconstruction and Editing Tool Kit), provides a wide range of surface visualization and analysis options as well as capabilities for surface flattening, surface-based deformation, and other surface manipulations. The third component, SuMS (Surface Management System), is a database and associated user interface for surface-related data. It provides for efficient insertion, searching, and extraction of surface and volume data from the database.
An integrated software suite for surface-based analyses of cerebral cortex

NASA Technical Reports Server (NTRS)

Van Essen, D. C.; Drury, H. A.; Dickson, J.; Harwell, J.; Hanlon, D.; Anderson, C. H.

2001-01-01

The authors describe and illustrate an integrated trio of software programs for carrying out surface-based analyses of cerebral cortex. The first component of this trio, SureFit (Surface Reconstruction by Filtering and Intensity Transformations), is used primarily for cortical segmentation, volume visualization, surface generation, and the mapping of functional neuroimaging data onto surfaces. The second component, Caret (Computerized Anatomical Reconstruction and Editing Tool Kit), provides a wide range of surface visualization and analysis options as well as capabilities for surface flattening, surface-based deformation, and other surface manipulations. The third component, SuMS (Surface Management System), is a database and associated user interface for surface-related data. It provides for efficient insertion, searching, and extraction of surface and volume data from the database.
GIS for evaluating socioeconomic data of small communities in Oklahoma

DOT National Transportation Integrated Search

2001-01-01

This document summarizes the overall tasks, functionality, and limitations related to the delivered product of this project. Specifically, we present a Geographic Information Systems (GIS)-based database and analysis package for evaluating socioecono...
A data model and database for high-resolution pathology analytical image informatics.

PubMed

Wang, Fusheng; Kong, Jun; Cooper, Lee; Pan, Tony; Kurc, Tahsin; Chen, Wenjin; Sharma, Ashish; Niedermayr, Cristobal; Oh, Tae W; Brat, Daniel; Farris, Alton B; Foran, David J; Saltz, Joel

2011-01-01

The systematic analysis of imaged pathology specimens often results in a vast amount of morphological information at both the cellular and sub-cellular scales. While microscopy scanners and computerized analysis are capable of capturing and analyzing data rapidly, microscopy image data remain underutilized in research and clinical settings. One major obstacle which tends to reduce wider adoption of these new technologies throughout the clinical and scientific communities is the challenge of managing, querying, and integrating the vast amounts of data resulting from the analysis of large digital pathology datasets. This paper presents a data model, which addresses these challenges, and demonstrates its implementation in a relational database system. This paper describes a data model, referred to as Pathology Analytic Imaging Standards (PAIS), and a database implementation, which are designed to support the data management and query requirements of detailed characterization of micro-anatomic morphology through many interrelated analysis pipelines on whole-slide images and tissue microarrays (TMAs). (1) Development of a data model capable of efficiently representing and storing virtual slide related image, annotation, markup, and feature information. (2) Development of a database, based on the data model, capable of supporting queries for data retrieval based on analysis and image metadata, queries for comparison of results from different analyses, and spatial queries on segmented regions, features, and classified objects. The work described in this paper is motivated by the challenges associated with characterization of micro-scale features for comparative and correlative analyses involving whole-slides tissue images and TMAs. Technologies for digitizing tissues have advanced significantly in the past decade. Slide scanners are capable of producing high-magnification, high-resolution images from whole slides and TMAs within several minutes. Hence, it is becoming increasingly feasible for basic, clinical, and translational research studies to produce thousands of whole-slide images. Systematic analysis of these large datasets requires efficient data management support for representing and indexing results from hundreds of interrelated analyses generating very large volumes of quantifications such as shape and texture and of classifications of the quantified features. We have designed a data model and a database to address the data management requirements of detailed characterization of micro-anatomic morphology through many interrelated analysis pipelines. The data model represents virtual slide related image, annotation, markup and feature information. The database supports a wide range of metadata and spatial queries on images, annotations, markups, and features. We currently have three databases running on a Dell PowerEdge T410 server with CentOS 5.5 Linux operating system. The database server is IBM DB2 Enterprise Edition 9.7.2. The set of databases consists of 1) a TMA database containing image analysis results from 4740 cases of breast cancer, with 641 MB storage size; 2) an algorithm validation database, which stores markups and annotations from two segmentation algorithms and two parameter sets on 18 selected slides, with 66 GB storage size; and 3) an in silico brain tumor study database comprising results from 307 TCGA slides, with 365 GB storage size. The latter two databases also contain human-generated annotations and markups for regions and nuclei. Modeling and managing pathology image analysis results in a database provide immediate benefits on the value and usability of data in a research study. The database provides powerful query capabilities, which are otherwise difficult or cumbersome to support by other approaches such as programming languages. Standardized, semantic annotated data representation and interfaces also make it possible to more efficiently share image data and analysis results.
Design of a diagnostic encyclopaedia using AIDA.

PubMed

van Ginneken, A M; Smeulders, A W; Jansen, W

1987-01-01

Diagnostic Encyclopaedia Workstation (DEW) is the name of a digital encyclopaedia constructed to contain reference knowledge with respect to the pathology of the ovary. Comparing DEW with the common sources of reference knowledge (i.e. books) leads to the following advantages of DEW: it contains more verbal knowledge, pictures and case histories, and it offers information adjusted to the needs of the user. Based on an analysis of the structure of this reference knowledge we have chosen AIDA to develop a relational database and we use a video-disc player to contain the pictorial part of the database. The system consists of a database input version and a read-only run version. The design of the database input version is discussed. Reference knowledge for ovary pathology requires 1-3 Mbytes of memory. At present 15% of this amount is available. The design of the run version is based on an analysis of which information must necessarily be specified to the system by the user to access a desired item of information. Finally, the use of AIDA in constructing DEW is evaluated.
Textpresso site-specific recombinases: A text-mining server for the recombinase literature including Cre mice and conditional alleles.

PubMed

Urbanski, William M; Condie, Brian G

2009-12-01

Textpresso Site Specific Recombinases (http://ssrc.genetics.uga.edu/) is a text-mining web server for searching a database of more than 9,000 full-text publications. The papers and abstracts in this database represent a wide range of topics related to site-specific recombinase (SSR) research tools. Included in the database are most of the papers that report the characterization or use of mouse strains that express Cre recombinase as well as papers that describe or analyze mouse lines that carry conditional (floxed) alleles or SSR-activated transgenes/knockins. The database also includes reports describing SSR-based cloning methods such as the Gateway or the Creator systems, papers reporting the development or use of SSR-based tools in systems such as Drosophila, bacteria, parasites, stem cells, yeast, plants, zebrafish, and Xenopus as well as publications that describe the biochemistry, genetics, or molecular structure of the SSRs themselves. Textpresso Site Specific Recombinases is the only comprehensive text-mining resource available for the literature describing the biology and technical applications of SSRs. (c) 2009 Wiley-Liss, Inc.
Relational Databases and Biomedical Big Data.

PubMed

de Silva, N H Nisansa D

2017-01-01

In various biomedical applications that collect, handle, and manipulate data, the amounts of data tend to build up and venture into the range identified as bigdata. In such occurrences, a design decision has to be taken as to what type of database would be used to handle this data. More often than not, the default and classical solution to this in the biomedical domain according to past research is relational databases. While this used to be the norm for a long while, it is evident that there is a trend to move away from relational databases in favor of other types and paradigms of databases. However, it still has paramount importance to understand the interrelation that exists between biomedical big data and relational databases. This chapter will review the pros and cons of using relational databases to store biomedical big data that previous researches have discussed and used.
Fossil-Fuel C02 Emissions Database and Exploration System

NASA Astrophysics Data System (ADS)

Krassovski, M.; Boden, T.

2012-04-01

Fossil-Fuel C02 Emissions Database and Exploration System Misha Krassovski and Tom Boden Carbon Dioxide Information Analysis Center Oak Ridge National Laboratory The Carbon Dioxide Information Analysis Center (CDIAC) at Oak Ridge National Laboratory (ORNL) quantifies the release of carbon from fossil-fuel use and cement production each year at global, regional, and national spatial scales. These estimates are vital to climate change research given the strong evidence suggesting fossil-fuel emissions are responsible for unprecedented levels of carbon dioxide (CO2) in the atmosphere. The CDIAC fossil-fuel emissions time series are based largely on annual energy statistics published for all nations by the United Nations (UN). Publications containing historical energy statistics make it possible to estimate fossil-fuel CO2 emissions back to 1751 before the Industrial Revolution. From these core fossil-fuel CO2 emission time series, CDIAC has developed a number of additional data products to satisfy modeling needs and to address other questions aimed at improving our understanding of the global carbon cycle budget. For example, CDIAC also produces a time series of gridded fossil-fuel CO2 emission estimates and isotopic (e.g., C13) emissions estimates. The gridded data are generated using the methodology described in Andres et al. (2011) and provide monthly and annual estimates for 1751-2008 at 1° latitude by 1° longitude resolution. These gridded emission estimates are being used in the latest IPCC Scientific Assessment (AR4). Isotopic estimates are possible thanks to detailed information for individual nations regarding the carbon content of select fuels (e.g., the carbon signature of natural gas from Russia). CDIAC has recently developed a relational database to house these baseline emissions estimates and associated derived products and a web-based interface to help users worldwide query these data holdings. Users can identify, explore and download desired CDIAC fossil-fuel CO2 emissions data. This presentation introduces the architecture and design of the new relational database and web interface, summarizes the present state and functionality of the Fossil-Fuel CO2 Emissions Database and Exploration System, and highlights future plans for expansion of the relational database and interface.
Graphical tool for navigation within the semantic network of the UMLS metathesaurus on a locally installed database.

PubMed

Frankewitsch, T; Prokosch, H U

2000-01-01

Knowledge in the environment of information technologies is bound to structured vocabularies. Medical data dictionaries are necessary for uniquely describing findings like diagnoses, procedures or functions. Therefore we decided to locally install a version of the Unified Medical Language System (UMLS) of the U.S. National Library of Medicine as a repository for defining entries of a medical multimedia database. Because of the requirement to extend the vocabulary in concepts and relations between existing concepts a graphical tool for appending new items to the database has been developed: Although the database is an instance of a semantic network the focus on single entries offers the opportunity of reducing the net to a tree within this detail. Based on the graph theorem, there are definitions of nodes of concepts and nodes of knowledge. The UMLS additionally offers the specification of sub-relations, which can be represented, too. Using this view it is possible to manage these 1:n-Relations in a simple tree view. On this background an explorer like graphical user interface has been realised to add new concepts and define new relationships between those and existing entries for adapting the UMLS for specific purposes such as describing medical multimedia objects.
High-Performance Secure Database Access Technologies for HEP Grids

DOE Office of Scientific and Technical Information (OSTI.GOV)

Matthew Vranicar; John Weicher

2006-04-17

The Large Hadron Collider (LHC) at the CERN Laboratory will become the largest scientific instrument in the world when it starts operations in 2007. Large Scale Analysis Computer Systems (computational grids) are required to extract rare signals of new physics from petabytes of LHC detector data. In addition to file-based event data, LHC data processing applications require access to large amounts of data in relational databases: detector conditions, calibrations, etc. U.S. high energy physicists demand efficient performance of grid computing applications in LHC physics research where world-wide remote participation is vital to their success. To empower physicists with data-intensive analysismore » capabilities a whole hyperinfrastructure of distributed databases cross-cuts a multi-tier hierarchy of computational grids. The crosscutting allows separation of concerns across both the global environment of a federation of computational grids and the local environment of a physicist’s computer used for analysis. Very few efforts are on-going in the area of database and grid integration research. Most of these are outside of the U.S. and rely on traditional approaches to secure database access via an extraneous security layer separate from the database system core, preventing efficient data transfers. Our findings are shared by the Database Access and Integration Services Working Group of the Global Grid Forum, who states that "Research and development activities relating to the Grid have generally focused on applications where data is stored in files. However, in many scientific and commercial domains, database management systems have a central role in data storage, access, organization, authorization, etc, for numerous applications.” There is a clear opportunity for a technological breakthrough, requiring innovative steps to provide high-performance secure database access technologies for grid computing. We believe that an innovative database architecture where the secure authorization is pushed into the database engine will eliminate inefficient data transfer bottlenecks. Furthermore, traditionally separated database and security layers provide an extra vulnerability, leaving a weak clear-text password authorization as the only protection on the database core systems. Due to the legacy limitations of the systems’ security models, the allowed passwords often can not even comply with the DOE password guideline requirements. We see an opportunity for the tight integration of the secure authorization layer with the database server engine resulting in both improved performance and improved security. Phase I has focused on the development of a proof-of-concept prototype using Argonne National Laboratory’s (ANL) Argonne Tandem-Linac Accelerator System (ATLAS) project as a test scenario. By developing a grid-security enabled version of the ATLAS project’s current relation database solution, MySQL, PIOCON Technologies aims to offer a more efficient solution to secure database access.« less
A Relational Algebra Query Language for Programming Relational Databases

ERIC Educational Resources Information Center

McMaster, Kirby; Sambasivam, Samuel; Anderson, Nicole

2011-01-01

In this paper, we describe a Relational Algebra Query Language (RAQL) and Relational Algebra Query (RAQ) software product we have developed that allows database instructors to teach relational algebra through programming. Instead of defining query operations using mathematical notation (the approach commonly taken in database textbooks), students…
Event Driven Messaging with Role-Based Subscriptions

NASA Technical Reports Server (NTRS)

Bui, Tung; Bui, Bach; Malhotra, Shantanu; Chen, Fannie; Kim, rachel; Allen, Christopher; Luong, Ivy; Chang, George; Zendejas, Silvino; Sadaqathulla, Syed

2009-01-01

Event Driven Messaging with Role-Based Subscriptions (EDM-RBS) is a framework integrated into the Service Management Database (SMDB) to allow for role-based and subscription-based delivery of synchronous and asynchronous messages over JMS (Java Messaging Service), SMTP (Simple Mail Transfer Protocol), or SMS (Short Messaging Service). This allows for 24/7 operation with users in all parts of the world. The software classifies messages by triggering data type, application source, owner of data triggering event (mission), classification, sub-classification and various other secondary classifying tags. Messages are routed to applications or users based on subscription rules using a combination of the above message attributes. This program provides a framework for identifying connected users and their applications for targeted delivery of messages over JMS to the client applications the user is logged into. EDMRBS provides the ability to send notifications over e-mail or pager rather than having to rely on a live human to do it. It is implemented as an Oracle application that uses Oracle relational database management system intrinsic functions. It is configurable to use Oracle AQ JMS API or an external JMS provider for messaging. It fully integrates into the event-logging framework of SMDB (Subnet Management Database).
Integrative neuroscience: the role of a standardized database.

PubMed

Gordon, E; Cooper, N; Rennie, C; Hermens, D; Williams, L M

2005-04-01

Most brain related databases bring together specialized information, with a growing number that include neuroimaging measures. This article outlines the potential use and insights from the first entirely standardized and centralized database, which integrates information from neuroimaging measures (EEG, event related potential (ERP), structural/functional MRI), arousal (skin conductance responses (SCR)s, heart rate, respiration), neuropsychological and personality tests, genomics and demographics: The Brain Resource International Database. It comprises data from over 2000 "normative" subjects and a growing number of patients with neurological and psychiatric illnesses, acquired from over 50 laboratories (in the U.S.A, United Kingdom, Holland, South Africa, Israel and Australia), all with identical equipment and experimental procedures. Three primary goals of this database are to quantify individual differences in normative brain function, to compare an individual's performance to their database peers, and to provide a robust normative framework for clinical assessment and treatment prediction. We present three example demonstrations in relation to these goals. First, we show how consistent age differences may be quantified when large subject numbers are available, using EEG and ERP data from nearly 2000 stringently screened. normative subjects. Second, the use of a normalization technique provides a means to compare clinical subjects (50 ADHD subjects in this study) to the normative database with the effects of age and gender taken into account. Third, we show how a profile of EEG/ERP and autonomic measures potentially provides a means to predict treatment response in ADHD subjects. The example data consists of EEG under eyes open and eyes closed and ERP data for auditory oddball, working memory and Go-NoGo paradigms. Autonomic measures of skin conductance (tonic skin conductance level, SCL, and phasic skin conductance responses, SCRs) were acquired simultaneously with central EEG/ERP measures. The findings show that the power of large samples, tested using standardized protocols, allows for the quantification of individual differences that can subsequently be used to control such variation and to enhance the sensitivity and specificity of comparisons between normative and clinical groups. In terms of broader significance, the combination of size and multidimensional measures tapping the brain's core cognitive competencies, may provide a normative and evidence-based framework for individually-based assessments in "Personalized Medicine."

GIDL: a rule based expert system for GenBank Intelligent Data Loading into the Molecular Biodiversity database

PubMed Central

2012-01-01

Background In the scientific biodiversity community, it is increasingly perceived the need to build a bridge between molecular and traditional biodiversity studies. We believe that the information technology could have a preeminent role in integrating the information generated by these studies with the large amount of molecular data we can find in bioinformatics public databases. This work is primarily aimed at building a bioinformatic infrastructure for the integration of public and private biodiversity data through the development of GIDL, an Intelligent Data Loader coupled with the Molecular Biodiversity Database. The system presented here organizes in an ontological way and locally stores the sequence and annotation data contained in the GenBank primary database. Methods The GIDL architecture consists of a relational database and of an intelligent data loader software. The relational database schema is designed to manage biodiversity information (Molecular Biodiversity Database) and it is organized in four areas: MolecularData, Experiment, Collection and Taxonomy. The MolecularData area is inspired to an established standard in Generic Model Organism Databases, the Chado relational schema. The peculiarity of Chado, and also its strength, is the adoption of an ontological schema which makes use of the Sequence Ontology. The Intelligent Data Loader (IDL) component of GIDL is an Extract, Transform and Load software able to parse data, to discover hidden information in the GenBank entries and to populate the Molecular Biodiversity Database. The IDL is composed by three main modules: the Parser, able to parse GenBank flat files; the Reasoner, which automatically builds CLIPS facts mapping the biological knowledge expressed by the Sequence Ontology; the DBFiller, which translates the CLIPS facts into ordered SQL statements used to populate the database. In GIDL Semantic Web technologies have been adopted due to their advantages in data representation, integration and processing. Results and conclusions Entries coming from Virus (814,122), Plant (1,365,360) and Invertebrate (959,065) divisions of GenBank rel.180 have been loaded in the Molecular Biodiversity Database by GIDL. Our system, combining the Sequence Ontology and the Chado schema, allows a more powerful query expressiveness compared with the most commonly used sequence retrieval systems like Entrez or SRS. PMID:22536971
CTGA: the database for genetic disorders in Arab populations.

PubMed

Tadmouri, Ghazi O; Al Ali, Mahmoud Taleb; Al-Haj Ali, Sarah; Al Khaja, Najib

2006-01-01

The Arabs comprise a genetically heterogeneous group that resulted from the admixture of different populations throughout history. They share many common characteristics responsible for a considerable proportion of perinatal and neonatal mortalities. To this end, the Centre for Arab Genomic Studies (CAGS) launched a pilot project to construct the 'Catalogue of Transmission Genetics in Arabs' (CTGA) database for genetic disorders in Arabs. Information in CTGA is drawn from published research and mined hospital records. The database offers web-based basic and advanced search approaches. In either case, the final search result is a detailed HTML record that includes text-, URL- and graphic-based fields. At present, CTGA hosts entries for 692 phenotypes and 235 related genes described in Arab individuals. Of these, 213 phenotypic descriptions and 22 related genes were observed in the Arab population of the United Arab Emirates (UAE). These results emphasize the role of CTGA as an essential tool to promote scientific research on genetic disorders in the region. The priority of CTGA is to provide timely information on the occurrence of genetic disorders in Arab individuals. It is anticipated that data from Arab countries other than the UAE will be exhaustively searched and incorporated in CTGA (http://www.cags.org.ae).
CTGA: the database for genetic disorders in Arab populations

PubMed Central

Tadmouri, Ghazi O.; Ali, Mahmoud Taleb Al; Ali, Sarah Al-Haj; Khaja, Najib Al

2006-01-01

The Arabs comprise a genetically heterogeneous group that resulted from the admixture of different populations throughout history. They share many common characteristics responsible for a considerable proportion of perinatal and neonatal mortalities. To this end, the Centre for Arab Genomic Studies (CAGS) launched a pilot project to construct the ‘Catalogue of Transmission Genetics in Arabs’ (CTGA) database for genetic disorders in Arabs. Information in CTGA is drawn from published research and mined hospital records. The database offers web-based basic and advanced search approaches. In either case, the final search result is a detailed HTML record that includes text-, URL- and graphic-based fields. At present, CTGA hosts entries for 692 phenotypes and 235 related genes described in Arab individuals. Of these, 213 phenotypic descriptions and 22 related genes were observed in the Arab population of the United Arab Emirates (UAE). These results emphasize the role of CTGA as an essential tool to promote scientific research on genetic disorders in the region. The priority of CTGA is to provide timely information on the occurrence of genetic disorders in Arab individuals. It is anticipated that data from Arab countries other than the UAE will be exhaustively searched and incorporated in CTGA (). PMID:16381941
Metabolome searcher: a high throughput tool for metabolite identification and metabolic pathway mapping directly from mass spectrometry and using genome restriction.

PubMed

Dhanasekaran, A Ranjitha; Pearson, Jon L; Ganesan, Balasubramanian; Weimer, Bart C

2015-02-25

Mass spectrometric analysis of microbial metabolism provides a long list of possible compounds. Restricting the identification of the possible compounds to those produced by the specific organism would benefit the identification process. Currently, identification of mass spectrometry (MS) data is commonly done using empirically derived compound databases. Unfortunately, most databases contain relatively few compounds, leaving long lists of unidentified molecules. Incorporating genome-encoded metabolism enables MS output identification that may not be included in databases. Using an organism's genome as a database restricts metabolite identification to only those compounds that the organism can produce. To address the challenge of metabolomic analysis from MS data, a web-based application to directly search genome-constructed metabolic databases was developed. The user query returns a genome-restricted list of possible compound identifications along with the putative metabolic pathways based on the name, formula, SMILES structure, and the compound mass as defined by the user. Multiple queries can be done simultaneously by submitting a text file created by the user or obtained from the MS analysis software. The user can also provide parameters specific to the experiment's MS analysis conditions, such as mass deviation, adducts, and detection mode during the query so as to provide additional levels of evidence to produce the tentative identification. The query results are provided as an HTML page and downloadable text file of possible compounds that are restricted to a specific genome. Hyperlinks provided in the HTML file connect the user to the curated metabolic databases housed in ProCyc, a Pathway Tools platform, as well as the KEGG Pathway database for visualization and metabolic pathway analysis. Metabolome Searcher, a web-based tool, facilitates putative compound identification of MS output based on genome-restricted metabolic capability. This enables researchers to rapidly extend the possible identifications of large data sets for metabolites that are not in compound databases. Putative compound names with their associated metabolic pathways from metabolomics data sets are returned to the user for additional biological interpretation and visualization. This novel approach enables compound identification by restricting the possible masses to those encoded in the genome.
Selecting a database for literature searches in nursing: MEDLINE or CINAHL?

PubMed

Brazier, H; Begley, C M

1996-10-01

This study compares the usefulness of the MEDLINE and CINAHL databases for students on post-registration nursing courses. We searched for nine topics, using title words only. Identical searches of the two databases retrieved 1162 references, of which 88% were in MEDLINE, 33% in CINAHL and 20% in both sources. The relevance of the references was assessed by student reviewers. The positive predictive value of CINAHL (70%) was higher than that of MEDLINE (54%), but MEDLINE produced more than twice as many relevant references as CINAHL. The sensitivity of MEDLINE was 85% (95% CI 82-88%), and that of CINAHL was 41% (95% CI 37-45%). To assess the ease of obtaining the references, we developed an index of accessibility, based on the holdings of a number of Irish and British libraries. Overall, 47% of relevant references were available in the students' own library, and 64% could be obtained within 48 hours. There was no difference between the two databases overall, but when two topics relating specifically to the organization of nursing were excluded, references found in MEDLINE were significantly more accessible. We recommend that MEDLINE should be regarded as the first choice of bibliographic database for any subject other than one related strictly to the organization of nursing.
Publication of nuclear magnetic resonance experimental data with semantic web technology and the application thereof to biomedical research of proteins.

PubMed

Yokochi, Masashi; Kobayashi, Naohiro; Ulrich, Eldon L; Kinjo, Akira R; Iwata, Takeshi; Ioannidis, Yannis E; Livny, Miron; Markley, John L; Nakamura, Haruki; Kojima, Chojiro; Fujiwara, Toshimichi

2016-05-05

The nuclear magnetic resonance (NMR) spectroscopic data for biological macromolecules archived at the BioMagResBank (BMRB) provide a rich resource of biophysical information at atomic resolution. The NMR data archived in NMR-STAR ASCII format have been implemented in a relational database. However, it is still fairly difficult for users to retrieve data from the NMR-STAR files or the relational database in association with data from other biological databases. To enhance the interoperability of the BMRB database, we present a full conversion of BMRB entries to two standard structured data formats, XML and RDF, as common open representations of the NMR-STAR data. Moreover, a SPARQL endpoint has been deployed. The described case study demonstrates that a simple query of the SPARQL endpoints of the BMRB, UniProt, and Online Mendelian Inheritance in Man (OMIM), can be used in NMR and structure-based analysis of proteins combined with information of single nucleotide polymorphisms (SNPs) and their phenotypes. We have developed BMRB/XML and BMRB/RDF and demonstrate their use in performing a federated SPARQL query linking the BMRB to other databases through standard semantic web technologies. This will facilitate data exchange across diverse information resources.
High-performance Negative Database for Massive Data Management System of The Mingantu Spectral Radioheliograph

NASA Astrophysics Data System (ADS)

Shi, Congming; Wang, Feng; Deng, Hui; Liu, Yingbo; Liu, Cuiyin; Wei, Shoulin

2017-08-01

As a dedicated synthetic aperture radio interferometer in China, the MingantU SpEctral Radioheliograph (MUSER), initially known as the Chinese Spectral RadioHeliograph (CSRH), has entered the stage of routine observation. More than 23 million data records per day need to be effectively managed to provide high-performance data query and retrieval for scientific data reduction. In light of these massive amounts of data generated by the MUSER, in this paper, a novel data management technique called the negative database (ND) is proposed and used to implement a data management system for the MUSER. Based on the key-value database, the ND technique makes complete utilization of the complement set of observational data to derive the requisite information. Experimental results showed that the proposed ND can significantly reduce storage volume in comparison with a relational database management system (RDBMS). Even when considering the time needed to derive records that were absent, its overall performance, including querying and deriving the data of the ND, is comparable with that of a relational database management system (RDBMS). The ND technique effectively solves the problem of massive data storage for the MUSER and is a valuable reference for the massive data management required in next-generation telescopes.
epiPATH: an information system for the storage and management of molecular epidemiology data from infectious pathogens.

PubMed

Amadoz, Alicia; González-Candelas, Fernando

2007-04-20

Most research scientists working in the fields of molecular epidemiology, population and evolutionary genetics are confronted with the management of large volumes of data. Moreover, the data used in studies of infectious diseases are complex and usually derive from different institutions such as hospitals or laboratories. Since no public database scheme incorporating clinical and epidemiological information about patients and molecular information about pathogens is currently available, we have developed an information system, composed by a main database and a web-based interface, which integrates both types of data and satisfies requirements of good organization, simple accessibility, data security and multi-user support. From the moment a patient arrives to a hospital or health centre until the processing and analysis of molecular sequences obtained from infectious pathogens in the laboratory, lots of information is collected from different sources. We have divided the most relevant data into 12 conceptual modules around which we have organized the database schema. Our schema is very complete and it covers many aspects of sample sources, samples, laboratory processes, molecular sequences, phylogenetics results, clinical tests and results, clinical information, treatments, pathogens, transmissions, outbreaks and bibliographic information. Communication between end-users and the selected Relational Database Management System (RDMS) is carried out by default through a command-line window or through a user-friendly, web-based interface which provides access and management tools for the data. epiPATH is an information system for managing clinical and molecular information from infectious diseases. It facilitates daily work related to infectious pathogens and sequences obtained from them. This software is intended for local installation in order to safeguard private data and provides advanced SQL-users the flexibility to adapt it to their needs. The database schema, tool scripts and web-based interface are free software but data stored in our database server are not publicly available. epiPATH is distributed under the terms of GNU General Public License. More details about epiPATH can be found at http://genevo.uv.es/epipath.
Digital Image Support in the ROADNet Real-time Monitoring Platform

NASA Astrophysics Data System (ADS)

Lindquist, K. G.; Hansen, T. S.; Newman, R. L.; Vernon, F. L.; Nayak, A.; Foley, S.; Fricke, T.; Orcutt, J.; Rajasekar, A.

2004-12-01

The ROADNet real-time monitoring infrastructure has allowed researchers to integrate geophysical monitoring data from a wide variety of signal domains. Antelope-based data transport, relational-database buffering and archiving, backup/replication/archiving through the Storage Resource Broker, and a variety of web-based distribution tools create a powerful monitoring platform. In this work we discuss our use of the ROADNet system for the collection and processing of digital image data. Remote cameras have been deployed at approximately 32 locations as of September 2004, including the SDSU Santa Margarita Ecological Reserve, the Imperial Beach pier, and the Pinon Flats geophysical observatory. Fire monitoring imagery has been obtained through a connection to the HPWREN project. Near-real-time images obtained from the R/V Roger Revelle include records of seafloor operations by the JASON submersible, as part of a maintenance mission for the H2O underwater seismic observatory. We discuss acquisition mechanisms and the packet architecture for image transport via Antelope orbservers, including multi-packet support for arbitrarily large images. Relational database storage supports archiving of timestamped images, image-processing operations, grouping of related images and cameras, support for motion-detect triggers, thumbnail images, pre-computed video frames, support for time-lapse movie generation and storage of time-lapse movies. Available ROADNet monitoring tools include both orbserver-based display of incoming real-time images and web-accessible searching and distribution of images and movies driven by the relational database (http://mercali.ucsd.edu/rtapps/rtimbank.php). An extension to the Kepler Scientific Workflow System also allows real-time image display via the Ptolemy project. Custom time-lapse movies may be made from the ROADNet web pages.
Helicobacter pylori-related chronic gastritis as a risk factor for colonic neoplasms.

PubMed

Inoue, Izumi; Kato, Jun; Tamai, Hideyuki; Iguchi, Mikitaka; Maekita, Takao; Yoshimura, Noriko; Ichinose, Masao

2014-02-14

To summarize the current views and insights on associations between Helicobacter pylori (H. pylori)-related chronic gastritis and colorectal neoplasm, we reviewed recent studies to clarify whether H. pylori infection/H. pylori-related chronic gastritis is associated with an elevated risk of colorectal neoplasm. Recent studies based on large databases with careful control for confounding variables have clearly demonstrated an increased risk of colorectal neoplasm associated with H. pylori infection. The correlation between H. pylori-related chronic atrophic gastritis (CAG) and colorectal neoplasm has only been examined in a limited number of studies. A recent large study using a national histopathological database, and our study based on the stage of H. pylori-related chronic gastritis as determined by serum levels of H. pylori antibody titer and pepsinogen, indicated that H. pylori-related CAG confers an increased risk of colorectal neoplasm, and more extensive atrophic gastritis will probably be associated with even higher risk of neoplasm. In addition, our study suggested that the activity of H. pylori-related chronic gastritis is correlated with colorectal neoplasm risk. H. pylori-related chronic gastritis could be involved in an increased risk of colorectal neoplasm that appears to be enhanced by the progression of gastric atrophy and the presence of active inflammation.
Helicobacter pylori-related chronic gastritis as a risk factor for colonic neoplasms

PubMed Central

Inoue, Izumi; Kato, Jun; Tamai, Hideyuki; Iguchi, Mikitaka; Maekita, Takao; Yoshimura, Noriko; Ichinose, Masao

2014-01-01

To summarize the current views and insights on associations between Helicobacter pylori (H. pylori)-related chronic gastritis and colorectal neoplasm, we reviewed recent studies to clarify whether H. pylori infection/H. pylori-related chronic gastritis is associated with an elevated risk of colorectal neoplasm. Recent studies based on large databases with careful control for confounding variables have clearly demonstrated an increased risk of colorectal neoplasm associated with H. pylori infection. The correlation between H. pylori-related chronic atrophic gastritis (CAG) and colorectal neoplasm has only been examined in a limited number of studies. A recent large study using a national histopathological database, and our study based on the stage of H. pylori-related chronic gastritis as determined by serum levels of H. pylori antibody titer and pepsinogen, indicated that H. pylori-related CAG confers an increased risk of colorectal neoplasm, and more extensive atrophic gastritis will probably be associated with even higher risk of neoplasm. In addition, our study suggested that the activity of H. pylori-related chronic gastritis is correlated with colorectal neoplasm risk. H. pylori-related chronic gastritis could be involved in an increased risk of colorectal neoplasm that appears to be enhanced by the progression of gastric atrophy and the presence of active inflammation. PMID:24587623
Data-Based Decision Making in Education: Challenges and Opportunities

ERIC Educational Resources Information Center

Schildkamp, Kim, Ed.; Lai, Mei Kuin, Ed.; Earl, Lorna, Ed.

2013-01-01

In a context where schools are held more and more accountable for the education they provide, data-based decision making has become increasingly important. This book brings together scholars from several countries to examine data-based decision making. Data-based decision making in this book refers to making decisions based on a broad range of…
Space flight risk data collection and analysis project: Risk and reliability database

NASA Technical Reports Server (NTRS)

1994-01-01

The focus of the NASA 'Space Flight Risk Data Collection and Analysis' project was to acquire and evaluate space flight data with the express purpose of establishing a database containing measurements of specific risk assessment - reliability - availability - maintainability - supportability (RRAMS) parameters. The developed comprehensive RRAMS database will support the performance of future NASA and aerospace industry risk and reliability studies. One of the primary goals has been to acquire unprocessed information relating to the reliability and availability of launch vehicles and the subsystems and components thereof from the 45th Space Wing (formerly Eastern Space and Missile Command -ESMC) at Patrick Air Force Base. After evaluating and analyzing this information, it was encoded in terms of parameters pertinent to ascertaining reliability and availability statistics, and then assembled into an appropriate database structure.
Structator: fast index-based search for RNA sequence-structure patterns

PubMed Central

2011-01-01

Background The secondary structure of RNA molecules is intimately related to their function and often more conserved than the sequence. Hence, the important task of searching databases for RNAs requires to match sequence-structure patterns. Unfortunately, current tools for this task have, in the best case, a running time that is only linear in the size of sequence databases. Furthermore, established index data structures for fast sequence matching, like suffix trees or arrays, cannot benefit from the complementarity constraints introduced by the secondary structure of RNAs. Results We present a novel method and readily applicable software for time efficient matching of RNA sequence-structure patterns in sequence databases. Our approach is based on affix arrays, a recently introduced index data structure, preprocessed from the target database. Affix arrays support bidirectional pattern search, which is required for efficiently handling the structural constraints of the pattern. Structural patterns like stem-loops can be matched inside out, such that the loop region is matched first and then the pairing bases on the boundaries are matched consecutively. This allows to exploit base pairing information for search space reduction and leads to an expected running time that is sublinear in the size of the sequence database. The incorporation of a new chaining approach in the search of RNA sequence-structure patterns enables the description of molecules folding into complex secondary structures with multiple ordered patterns. The chaining approach removes spurious matches from the set of intermediate results, in particular of patterns with little specificity. In benchmark experiments on the Rfam database, our method runs up to two orders of magnitude faster than previous methods. Conclusions The presented method's sublinear expected running time makes it well suited for RNA sequence-structure pattern matching in large sequence databases. RNA molecules containing several stem-loop substructures can be described by multiple sequence-structure patterns and their matches are efficiently handled by a novel chaining method. Beyond our algorithmic contributions, we provide with Structator a complete and robust open-source software solution for index-based search of RNA sequence-structure patterns. The Structator software is available at http://www.zbh.uni-hamburg.de/Structator. PMID:21619640
[Critical analysis of French DRG based information system (PMSI) databases for the epidemiology of cancer: a longitudinal approach becomes possible].

PubMed

Olive, F; Gomez, F; Schott, A-M; Remontet, L; Bossard, N; Mitton, N; Polazzi, S; Colonna, M; Trombert-Paviot, B

2011-02-01

Use of French Diagnosis Related Groups (DRGs) program databases, apart from financial purposes, has recently been improved since a unique anonymous patient identification number has been created for each inpatient in administrative case mix database. Based on the work of the group for cancer epidemiological observation in the Rhône-Alpes area, (ONC-EPI group), we review the remaining difficulties in the use of DRG data for epidemiological purposes and we consider a longitudinal approach based on analysis of database over several years. We also discuss limitations of this approach. The main problems are related to a lack of quality of administrative data, especially coding of diagnoses. These errors come from missing or inappropriate codes, or not being in accordance with prioritization rules (causing an over- or under-reporting or inconsistencies in coding over time). One difficulty, partly due to the hierarchy of coding and the type of cancer, is the choice of an extraction algorithm. In two studies designed to estimate the incidence of cancer cared in hospitals (breast, colon-rectum, kidney, ovaries), a first algorithm, including a code of cancer as principal diagnosis with a selection of surgical procedures less performed than the second one including a code of cancer as principal diagnosis only, for which the number of hospitalizations per patient ratio was stable across time and space. The chaining over several years allows, by tracing the trajectory of the patient, to detect and correct inaccuracies, errors and missing values, and for incidence studies, to correct incident cases by removing prevalent cases. However, linkage, complete only since 2007, does not correct data in all cases. Ways of future improvement certainly pass through improved algorithms for case identification and especially by linking DRG data with other databases. Copyright © 2010 Elsevier Masson SAS. All rights reserved.
[Study of algorithms to identify schizophrenia in the SNIIRAM database conducted by the REDSIAM network].

PubMed

Quantin, C; Collin, C; Frérot, M; Besson, J; Cottenet, J; Corneloup, M; Soudry-Faure, A; Mariet, A-S; Roussot, A

2017-10-01

The aim of the REDSIAM network is to foster communication between users of French medico-administrative databases and to validate and promote analysis methods suitable for the data. Within this network, the working group "Mental and behavioral disorders" took an interest in algorithms to identify adult schizophrenia in the SNIIRAM database and inventoried identification criteria for patients with schizophrenia in these databases. The methodology was based on interviews with nine experts in schizophrenia concerning the procedures they use to identify patients with schizophrenia disorders in databases. The interviews were based on a questionnaire and conducted by telephone. The synthesis of the interviews showed that the SNIIRAM contains various tables which allow coders to identify patients suffering from schizophrenia: chronic disease status, drugs and hospitalizations. Taken separately, these criteria were not sufficient to recognize patients with schizophrenia, an algorithm should be based on all of them. Apparently, only one-third of people living with schizophrenia benefit from the longstanding disease status. Not all patients are hospitalized, and coding for diagnoses at the hospitalization, notably for short stays in medicine, surgery or obstetrics departments, is not exhaustive. As for treatment with antipsychotics, it is not specific enough as such treatments are also prescribed to patients with bipolar disorders, or even other disorders. It seems appropriate to combine these complementary criteria, while keeping in mind out-patient care (every year 80,000 patients are seen exclusively in an outpatient setting), even if these data are difficult to link with other information. Finally, the experts made three propositions for selection algorithms of patients with schizophrenia. Patients with schizophrenia can be relatively accurately identified using SNIIRAM data. Different combinations of the selected criteria must be used depending on the objectives and they must be related to an appropriate length of time. Copyright © 2017 Elsevier Masson SAS. All rights reserved.
Information Retrieval Strategies of Millennial Undergraduate Students in Web and Library Database Searches

ERIC Educational Resources Information Center

Porter, Brandi

2009-01-01

Millennial students make up a large portion of undergraduate students attending colleges and universities, and they have a variety of online resources available to them to complete academically related information searches, primarily Web based and library-based online information retrieval systems. The content, ease of use, and required search…
The Danish Testicular Cancer database.

PubMed

Daugaard, Gedske; Kier, Maria Gry Gundgaard; Bandak, Mikkel; Mortensen, Mette Saksø; Larsson, Heidi; Søgaard, Mette; Toft, Birgitte Groenkaer; Engvad, Birte; Agerbæk, Mads; Holm, Niels Vilstrup; Lauritsen, Jakob

2016-01-01

The nationwide Danish Testicular Cancer database consists of a retrospective research database (DaTeCa database) and a prospective clinical database (Danish Multidisciplinary Cancer Group [DMCG] DaTeCa database). The aim is to improve the quality of care for patients with testicular cancer (TC) in Denmark, that is, by identifying risk factors for relapse, toxicity related to treatment, and focusing on late effects. All Danish male patients with a histologically verified germ cell cancer diagnosis in the Danish Pathology Registry are included in the DaTeCa databases. Data collection has been performed from 1984 to 2007 and from 2013 onward, respectively. The retrospective DaTeCa database contains detailed information with more than 300 variables related to histology, stage, treatment, relapses, pathology, tumor markers, kidney function, lung function, etc. A questionnaire related to late effects has been conducted, which includes questions regarding social relationships, life situation, general health status, family background, diseases, symptoms, use of medication, marital status, psychosocial issues, fertility, and sexuality. TC survivors alive on October 2014 were invited to fill in this questionnaire including 160 validated questions. Collection of questionnaires is still ongoing. A biobank including blood/sputum samples for future genetic analyses has been established. Both samples related to DaTeCa and DMCG DaTeCa database are included. The prospective DMCG DaTeCa database includes variables regarding histology, stage, prognostic group, and treatment. The DMCG DaTeCa database has existed since 2013 and is a young clinical database. It is necessary to extend the data collection in the prospective database in order to answer quality-related questions. Data from the retrospective database will be added to the prospective data. This will result in a large and very comprehensive database for future studies on TC patients.
Alternatives to relational database: comparison of NoSQL and XML approaches for clinical data storage.

PubMed

Lee, Ken Ka-Yin; Tang, Wai-Choi; Choi, Kup-Sze

2013-04-01

Clinical data are dynamic in nature, often arranged hierarchically and stored as free text and numbers. Effective management of clinical data and the transformation of the data into structured format for data analysis are therefore challenging issues in electronic health records development. Despite the popularity of relational databases, the scalability of the NoSQL database model and the document-centric data structure of XML databases appear to be promising features for effective clinical data management. In this paper, three database approaches--NoSQL, XML-enabled and native XML--are investigated to evaluate their suitability for structured clinical data. The database query performance is reported, together with our experience in the databases development. The results show that NoSQL database is the best choice for query speed, whereas XML databases are advantageous in terms of scalability, flexibility and extensibility, which are essential to cope with the characteristics of clinical data. While NoSQL and XML technologies are relatively new compared to the conventional relational database, both of them demonstrate potential to become a key database technology for clinical data management as the technology further advances. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
Hmrbase: a database of hormones and their receptors

PubMed Central

Rashid, Mamoon; Singla, Deepak; Sharma, Arun; Kumar, Manish; Raghava, Gajendra PS

2009-01-01

Background Hormones are signaling molecules that play vital roles in various life processes, like growth and differentiation, physiology, and reproduction. These molecules are mostly secreted by endocrine glands, and transported to target organs through the bloodstream. Deficient, or excessive, levels of hormones are associated with several diseases such as cancer, osteoporosis, diabetes etc. Thus, it is important to collect and compile information about hormones and their receptors. Description This manuscript describes a database called Hmrbase which has been developed for managing information about hormones and their receptors. It is a highly curated database for which information has been collected from the literature and the public databases. The current version of Hmrbase contains comprehensive information about ~2000 hormones, e.g., about their function, source organism, receptors, mature sequences, structures etc. Hmrbase also contains information about ~3000 hormone receptors, in terms of amino acid sequences, subcellular localizations, ligands, and post-translational modifications etc. One of the major features of this database is that it provides data about ~4100 hormone-receptor pairs. A number of online tools have been integrated into the database, to provide the facilities like keyword search, structure-based search, mapping of a given peptide(s) on the hormone/receptor sequence, sequence similarity search. This database also provides a number of external links to other resources/databases in order to help in the retrieving of further related information. Conclusion Owing to the high impact of endocrine research in the biomedical sciences, the Hmrbase could become a leading data portal for researchers. The salient features of Hmrbase are hormone-receptor pair-related information, mapping of peptide stretches on the protein sequences of hormones and receptors, Pfam domain annotations, categorical browsing options, online data submission, DrugPedia linkage etc. Hmrbase is available online for public from . PMID:19589147

Alternatives to relational databases in precision medicine: Comparison of NoSQL approaches for big data storage using supercomputers

NASA Astrophysics Data System (ADS)

Velazquez, Enrique Israel

Improvements in medical and genomic technologies have dramatically increased the production of electronic data over the last decade. As a result, data management is rapidly becoming a major determinant, and urgent challenge, for the development of Precision Medicine. Although successful data management is achievable using Relational Database Management Systems (RDBMS), exponential data growth is a significant contributor to failure scenarios. Growing amounts of data can also be observed in other sectors, such as economics and business, which, together with the previous facts, suggests that alternate database approaches (NoSQL) may soon be required for efficient storage and management of big databases. However, this hypothesis has been difficult to test in the Precision Medicine field since alternate database architectures are complex to assess and means to integrate heterogeneous electronic health records (EHR) with dynamic genomic data are not easily available. In this dissertation, we present a novel set of experiments for identifying NoSQL database approaches that enable effective data storage and management in Precision Medicine using patients' clinical and genomic information from the cancer genome atlas (TCGA). The first experiment draws on performance and scalability from biologically meaningful queries with differing complexity and database sizes. The second experiment measures performance and scalability in database updates without schema changes. The third experiment assesses performance and scalability in database updates with schema modifications due dynamic data. We have identified two NoSQL approach, based on Cassandra and Redis, which seems to be the ideal database management systems for our precision medicine queries in terms of performance and scalability. We present NoSQL approaches and show how they can be used to manage clinical and genomic big data. Our research is relevant to the public health since we are focusing on one of the main challenges to the development of Precision Medicine and, consequently, investigating a potential solution to the progressively increasing demands on health care.
Seventy Years of RN Effectiveness: A Database Development Project to Inform Best Practice.

PubMed

Lulat, Zainab; Blain-McLeod, Julie; Grinspun, Doris; Penney, Tasha; Harripaul-Yhap, Anastasia; Rey, Michelle

2018-03-23

The appropriate nursing staff mix is imperative to the provision of quality care. Nurse staffing levels and staff mix vary from country to country, as well as between care settings. Understanding how staffing skill mix impacts patient, organizational, and financial outcomes is critical in order to allow policymakers and clinicians to make evidence-informed staffing decisions. This paper reports on the methodology for creation of an electronic database of studies exploring the effectiveness of Registered Nurses (RNs) on clinical and patient outcomes, organizational and nurse outcomes, and financial outcomes. Comprehensive literature searches were conducted in four electronic databases. Inclusion criteria for the database included studies published from 1946 to 2016, peer-reviewed international literature, and studies focused on RNs in all health-care disciplines, settings, and sectors. Masters-prepared nurse researchers conducted title and abstract screening and relevance review to determine eligibility of studies for the database. High-level analysis was conducted to determine key outcomes and the frequency at which they appeared within the database. Of the initial 90,352 records, a total of 626 abstracts were included within the database. Studies were organized into three groups corresponding to clinical and patient outcomes, organizational and nurse-related outcomes, and financial outcomes. Organizational and nurse-related outcomes represented the largest category in the database with 282 studies, followed by clinical and patient outcomes with 244 studies, and lastly financial outcomes, which included 124 studies. The comprehensive database of evidence for RN effectiveness is freely available at https://rnao.ca/bpg/initiatives/RNEffectiveness. The database will serve as a resource for the Registered Nurses' Association of Ontario, as well as a tool for researchers, clinicians, and policymakers for making evidence-informed staffing decisions. © 2018 The Authors. Worldviews on Evidence-Based Nursing published by Wiley Periodicals, Inc. on behalf of Sigma Theta Tau International The Honor Society of Nursing.
The STP (Solar-Terrestrial Physics) Semantic Web based on the RSS1.0 and the RDF

NASA Astrophysics Data System (ADS)

Kubo, T.; Murata, K. T.; Kimura, E.; Ishikura, S.; Shinohara, I.; Kasaba, Y.; Watari, S.; Matsuoka, D.

2006-12-01

In the Solar-Terrestrial Physics (STP), it is pointed out that circulation and utilization of observation data among researchers are insufficient. To archive interdisciplinary researches, we need to overcome this circulation and utilization problems. Under such a background, authors' group has developed a world-wide database that manages meta-data of satellite and ground-based observation data files. It is noted that retrieving meta-data from the observation data and registering them to database have been carried out by hand so far. Our goal is to establish the STP Semantic Web. The Semantic Web provides a common framework that allows a variety of data shared and reused across applications, enterprises, and communities. We also expect that the secondary information related with observations, such as event information and associated news, are also shared over the networks. The most fundamental issue on the establishment is who generates, manages and provides meta-data in the Semantic Web. We developed an automatic meta-data collection system for the observation data using the RSS (RDF Site Summary) 1.0. The RSS1.0 is one of the XML-based markup languages based on the RDF (Resource Description Framework), which is designed for syndicating news and contents of news-like sites. The RSS1.0 is used to describe the STP meta-data, such as data file name, file server address and observation date. To describe the meta-data of the STP beyond RSS1.0 vocabulary, we defined original vocabularies for the STP resources using the RDF Schema. The RDF describes technical terms on the STP along with the Dublin Core Metadata Element Set, which is standard for cross-domain information resource descriptions. Researchers' information on the STP by FOAF, which is known as an RDF/XML vocabulary, creates a machine-readable metadata describing people. Using the RSS1.0 as a meta-data distribution method, the workflow from retrieving meta-data to registering them into the database is automated. This technique is applied for several database systems, such as the DARTS database system and NICT Space Weather Report Service. The DARTS is a science database managed by ISAS/JAXA in Japan. We succeeded in generating and collecting the meta-data automatically for the CDF (Common data Format) data, such as Reimei satellite data, provided by the DARTS. We also create an RDF service for space weather report and real-time global MHD simulation 3D data provided by the NICT. Our Semantic Web system works as follows: The RSS1.0 documents generated on the data sites (ISAS and NICT) are automatically collected by a meta-data collection agent. The RDF documents are registered and the agent extracts meta-data to store them in the Sesame, which is an open source RDF database with support for RDF Schema inferencing and querying. The RDF database provides advanced retrieval processing that has considered property and relation. Finally, the STP Semantic Web provides automatic processing or high level search for the data which are not only for observation data but for space weather news, physical events, technical terms and researches information related to the STP.
An interactive mutation database for human coagulation factor IX provides novel insights into the phenotypes and genetics of hemophilia B.

PubMed

Rallapalli, P M; Kemball-Cook, G; Tuddenham, E G; Gomez, K; Perkins, S J

2013-07-01

Factor IX (FIX) is important in the coagulation cascade, being activated to FIXa on cleavage. Defects in the human F9 gene frequently lead to hemophilia B. To assess 1113 unique F9 mutations corresponding to 3721 patient entries in a new and up-to-date interactive web database alongside the FIXa protein structure. The mutations database was built using MySQL and structural analyses were based on a homology model for the human FIXa structure based on closely-related crystal structures. Mutations have been found in 336 (73%) out of 461 residues in FIX. There were 812 unique point mutations, 182 deletions, 54 polymorphisms, 39 insertions and 26 others that together comprise a total of 1113 unique variants. The 64 unique mild severity mutations in the mature protein with known circulating protein phenotypes include 15 (23%) quantitative type I mutations and 41 (64%) predominantly qualitative type II mutations. Inhibitors were described in 59 reports (1.6%) corresponding to 25 unique mutations. The interactive database provides insights into mechanisms of hemophilia B. Type II mutations are deduced to disrupt predominantly those structural regions involved with functional interactions. The interactive features of the database will assist in making judgments about patient management. © 2013 International Society on Thrombosis and Haemostasis.
Knowledge-rich temporal relation identification and classification in clinical notes

PubMed Central

D’Souza, Jennifer; Ng, Vincent

2014-01-01

Motivation: We examine the task of temporal relation classification for the clinical domain. Our approach to this task departs from existing ones in that it is (i) ‘knowledge-rich’, employing sophisticated knowledge derived from discourse relations as well as both domain-independent and domain-dependent semantic relations, and (ii) ‘hybrid’, combining the strengths of rule-based and learning-based approaches. Evaluation results on the i2b2 Clinical Temporal Relations Challenge corpus show that our approach yields a 17–24% and 8–14% relative reduction in error over a state-of-the-art learning-based baseline system when gold-standard and automatically identified temporal relations are used, respectively. Database URL: http://www.hlt.utdallas.edu/~jld082000/temporal-relations/ PMID:25414383
A Web-based Alternative Non-animal Method Database for Safety Cosmetic Evaluations

PubMed Central

Kim, Seung Won; Kim, Bae-Hwan

2016-01-01

Animal testing was used traditionally in the cosmetics industry to confirm product safety, but has begun to be banned; alternative methods to replace animal experiments are either in development, or are being validated, worldwide. Research data related to test substances are critical for developing novel alternative tests. Moreover, safety information on cosmetic materials has neither been collected in a database nor shared among researchers. Therefore, it is imperative to build and share a database of safety information on toxicological mechanisms and pathways collected through in vivo, in vitro, and in silico methods. We developed the CAMSEC database (named after the research team; the Consortium of Alternative Methods for Safety Evaluation of Cosmetics) to fulfill this purpose. On the same website, our aim is to provide updates on current alternative research methods in Korea. The database will not be used directly to conduct safety evaluations, but researchers or regulatory individuals can use it to facilitate their work in formulating safety evaluations for cosmetic materials. We hope this database will help establish new alternative research methods to conduct efficient safety evaluations of cosmetic materials. PMID:27437094
Interactive Scene Analysis Module - A sensor-database fusion system for telerobotic environments

NASA Technical Reports Server (NTRS)

Cooper, Eric G.; Vazquez, Sixto L.; Goode, Plesent W.

1992-01-01

Accomplishing a task with telerobotics typically involves a combination of operator control/supervision and a 'script' of preprogrammed commands. These commands usually assume that the location of various objects in the task space conform to some internal representation (database) of that task space. The ability to quickly and accurately verify the task environment against the internal database would improve the robustness of these preprogrammed commands. In addition, the on-line initialization and maintenance of a task space database is difficult for operators using Cartesian coordinates alone. This paper describes the Interactive Scene' Analysis Module (ISAM) developed to provide taskspace database initialization and verification utilizing 3-D graphic overlay modelling, video imaging, and laser radar based range imaging. Through the fusion of taskspace database information and image sensor data, a verifiable taskspace model is generated providing location and orientation data for objects in a task space. This paper also describes applications of the ISAM in the Intelligent Systems Research Laboratory (ISRL) at NASA Langley Research Center, and discusses its performance relative to representation accuracy and operator interface efficiency.
A spatial-temporal system for dynamic cadastral management.

PubMed

Nan, Liu; Renyi, Liu; Guangliang, Zhu; Jiong, Xie

2006-03-01

A practical spatio-temporal database (STDB) technique for dynamic urban land management is presented. One of the STDB models, the expanded model of Base State with Amendments (BSA), is selected as the basis for developing the dynamic cadastral management technique. Two approaches, the Section Fast Indexing (SFI) and the Storage Factors of Variable Granularity (SFVG), are used to improve the efficiency of the BSA model. Both spatial graphic data and attribute data, through a succinct engine, are stored in standard relational database management systems (RDBMS) for the actual implementation of the BSA model. The spatio-temporal database is divided into three interdependent sub-databases: present DB, history DB and the procedures-tracing DB. The efficiency of database operation is improved by the database connection in the bottom layer of the Microsoft SQL Server. The spatio-temporal system can be provided at a low-cost while satisfying the basic needs of urban land management in China. The approaches presented in this paper may also be of significance to countries where land patterns change frequently or to agencies where financial resources are limited.
A Web-based Alternative Non-animal Method Database for Safety Cosmetic Evaluations.

PubMed

Kim, Seung Won; Kim, Bae-Hwan

2016-07-01

Animal testing was used traditionally in the cosmetics industry to confirm product safety, but has begun to be banned; alternative methods to replace animal experiments are either in development, or are being validated, worldwide. Research data related to test substances are critical for developing novel alternative tests. Moreover, safety information on cosmetic materials has neither been collected in a database nor shared among researchers. Therefore, it is imperative to build and share a database of safety information on toxicological mechanisms and pathways collected through in vivo, in vitro, and in silico methods. We developed the CAMSEC database (named after the research team; the Consortium of Alternative Methods for Safety Evaluation of Cosmetics) to fulfill this purpose. On the same website, our aim is to provide updates on current alternative research methods in Korea. The database will not be used directly to conduct safety evaluations, but researchers or regulatory individuals can use it to facilitate their work in formulating safety evaluations for cosmetic materials. We hope this database will help establish new alternative research methods to conduct efficient safety evaluations of cosmetic materials.
PlantTribes: a gene and gene family resource for comparative genomics in plants

PubMed Central

Wall, P. Kerr; Leebens-Mack, Jim; Müller, Kai F.; Field, Dawn; Altman, Naomi S.; dePamphilis, Claude W.

2008-01-01

The PlantTribes database (http://fgp.huck.psu.edu/tribe.html) is a plant gene family database based on the inferred proteomes of five sequenced plant species: Arabidopsis thaliana, Carica papaya, Medicago truncatula, Oryza sativa and Populus trichocarpa. We used the graph-based clustering algorithm MCL [Van Dongen (Technical Report INS-R0010 2000) and Enright et al. (Nucleic Acids Res. 2002; 30: 1575–1584)] to classify all of these species’ protein-coding genes into putative gene families, called tribes, using three clustering stringencies (low, medium and high). For all tribes, we have generated protein and DNA alignments and maximum-likelihood phylogenetic trees. A parallel database of microarray experimental results is linked to the genes, which lets researchers identify groups of related genes and their expression patterns. Unified nomenclatures were developed, and tribes can be related to traditional gene families and conserved domain identifiers. SuperTribes, constructed through a second iteration of MCL clustering, connect distant, but potentially related gene clusters. The global classification of nearly 200 000 plant proteins was used as a scaffold for sorting ∼4 million additional cDNA sequences from over 200 plant species. All data and analyses are accessible through a flexible interface allowing users to explore the classification, to place query sequences within the classification, and to download results for further study. PMID:18073194
"Trust is not something you can reclaim easily": patenting in the field of direct-to-consumer genetic testing.

PubMed

Sterckx, Sigrid; Cockbain, Julian; Howard, Heidi; Huys, Isabelle; Borry, Pascal

2013-05-01

Recently, 23andMe announced that it had obtained its first patent, related to "polymorphisms associated with Parkinson's disease" (US-B-8187811). This announcement immediately sparked controversy in the community of 23andMe users and research participants, especially with regard to issues of transparency and trust. The purpose of this article was to analyze the patent portfolio of this prominent direct-to-consumer genetic testing company and discuss the potential ethical implications of patenting in this field for public participation in Web-based genetic research. We searched the publicly accessible patent database Espacenet as well as the commercially available database Micropatent for published patents and patent applications of 23andMe. Six patent families were identified for 23andMe. These included patent applications related to: genetic comparisons between grandparents and grandchildren, family inheritance, genome sharing, processing data from genotyping chips, gamete donor selection based on genetic calculations, finding relatives in a database, and polymorphisms associated with Parkinson disease. An important lesson to be drawn from this ongoing controversy seems to be that any (private or public) organization involved in research that relies on human participation, whether by providing information, body material, or both, needs to be transparent, not only about its research goals but also about its strategies and policies regarding commercialization.
Follicle Online: an integrated database of follicle assembly, development and ovulation.

PubMed

Hua, Juan; Xu, Bo; Yang, Yifan; Ban, Rongjun; Iqbal, Furhan; Cooke, Howard J; Zhang, Yuanwei; Shi, Qinghua

2015-01-01

Folliculogenesis is an important part of ovarian function as it provides the oocytes for female reproductive life. Characterizing genes/proteins involved in folliculogenesis is fundamental for understanding the mechanisms associated with this biological function and to cure the diseases associated with folliculogenesis. A large number of genes/proteins associated with folliculogenesis have been identified from different species. However, no dedicated public resource is currently available for folliculogenesis-related genes/proteins that are validated by experiments. Here, we are reporting a database 'Follicle Online' that provides the experimentally validated gene/protein map of the folliculogenesis in a number of species. Follicle Online is a web-based database system for storing and retrieving folliculogenesis-related experimental data. It provides detailed information for 580 genes/proteins (from 23 model organisms, including Homo sapiens, Mus musculus, Rattus norvegicus, Mesocricetus auratus, Bos Taurus, Drosophila and Xenopus laevis) that have been reported to be involved in folliculogenesis, POF (premature ovarian failure) and PCOS (polycystic ovary syndrome). The literature was manually curated from more than 43,000 published articles (till 1 March 2014). The Follicle Online database is implemented in PHP + MySQL + JavaScript and this user-friendly web application provides access to the stored data. In summary, we have developed a centralized database that provides users with comprehensive information about genes/proteins involved in folliculogenesis. This database can be accessed freely and all the stored data can be viewed without any registration. Database URL: http://mcg.ustc.edu.cn/sdap1/follicle/index.php © The Author(s) 2015. Published by Oxford University Press.
Follicle Online: an integrated database of follicle assembly, development and ovulation

PubMed Central

Hua, Juan; Xu, Bo; Yang, Yifan; Ban, Rongjun; Iqbal, Furhan; Zhang, Yuanwei; Shi, Qinghua

2015-01-01

Folliculogenesis is an important part of ovarian function as it provides the oocytes for female reproductive life. Characterizing genes/proteins involved in folliculogenesis is fundamental for understanding the mechanisms associated with this biological function and to cure the diseases associated with folliculogenesis. A large number of genes/proteins associated with folliculogenesis have been identified from different species. However, no dedicated public resource is currently available for folliculogenesis-related genes/proteins that are validated by experiments. Here, we are reporting a database ‘Follicle Online’ that provides the experimentally validated gene/protein map of the folliculogenesis in a number of species. Follicle Online is a web-based database system for storing and retrieving folliculogenesis-related experimental data. It provides detailed information for 580 genes/proteins (from 23 model organisms, including Homo sapiens, Mus musculus, Rattus norvegicus, Mesocricetus auratus, Bos Taurus, Drosophila and Xenopus laevis) that have been reported to be involved in folliculogenesis, POF (premature ovarian failure) and PCOS (polycystic ovary syndrome). The literature was manually curated from more than 43 000 published articles (till 1 March 2014). The Follicle Online database is implemented in PHP + MySQL + JavaScript and this user-friendly web application provides access to the stored data. In summary, we have developed a centralized database that provides users with comprehensive information about genes/proteins involved in folliculogenesis. This database can be accessed freely and all the stored data can be viewed without any registration. Database URL: http://mcg.ustc.edu.cn/sdap1/follicle/index.php PMID:25931457
HerDing: herb recommendation system to treat diseases using genes and chemicals

PubMed Central

Choi, Wonjun; Choi, Chan-Hun; Kim, Young Ran; Kim, Seon-Jong; Na, Chang-Su; Lee, Hyunju

2016-01-01

In recent years, herbs have been researched for new drug candidates because they have a long empirical history of treating diseases and are relatively free from side effects. Studies to scientifically prove the medical efficacy of herbs for target diseases often spend a considerable amount of time and effort in choosing candidate herbs and in performing experiments to measure changes of marker genes when treating herbs. A computational approach to recommend herbs for treating diseases might be helpful to promote efficiency in the early stage of such studies. Although several databases related to traditional Chinese medicine have been already developed, there is no specialized Web tool yet recommending herbs to treat diseases based on disease-related genes. Therefore, we developed a novel search engine, HerDing, focused on retrieving candidate herb-related information with user search terms (a list of genes, a disease name, a chemical name or an herb name). HerDing was built by integrating public databases and by applying a text-mining method. The HerDing website is free and open to all users, and there is no login requirement. Database URL: http://combio.gist.ac.kr/herding PMID:26980517
Understanding the Influence of Environment on Adults' Walking Experiences: A Meta-Synthesis Study.

PubMed

Dadpour, Sara; Pakzad, Jahanshah; Khankeh, Hamidreza

2016-07-20

The environment has an important impact on physical activity, especially walking. The relationship between the environment and walking is not the same as for other types of physical activity. This study seeks to comprehensively identify the environmental factors influencing walking and to show how those environmental factors impact on walking using the experiences of adults between the ages of 18 and 65. The current study is a meta-synthesis based on a systematic review. Seven databases of related disciplines were searched, including health, transportation, physical activity, architecture, and interdisciplinary databases. In addition to the databases, two journals were searched. Of the 11,777 papers identified, 10 met the eligibility criteria and quality for selection. Qualitative content analysis was used for analysis of the results. The four themes identified as influencing walking were "safety and security", "environmental aesthetics", "social relations", and "convenience and efficiency". "Convenience and efficiency" and "environmental aesthetics" could enhance the impact of "social relations" on walking in some aspects. In addition, "environmental aesthetics" and "social relations" could hinder the influence of "convenience and efficiency" on walking in some aspects. Given the results of the study, strategies are proposed to enhance the walking experience.
HerDing: herb recommendation system to treat diseases using genes and chemicals.

PubMed

Choi, Wonjun; Choi, Chan-Hun; Kim, Young Ran; Kim, Seon-Jong; Na, Chang-Su; Lee, Hyunju

2016-01-01

In recent years, herbs have been researched for new drug candidates because they have a long empirical history of treating diseases and are relatively free from side effects. Studies to scientifically prove the medical efficacy of herbs for target diseases often spend a considerable amount of time and effort in choosing candidate herbs and in performing experiments to measure changes of marker genes when treating herbs. A computational approach to recommend herbs for treating diseases might be helpful to promote efficiency in the early stage of such studies. Although several databases related to traditional Chinese medicine have been already developed, there is no specialized Web tool yet recommending herbs to treat diseases based on disease-related genes. Therefore, we developed a novel search engine, HerDing, focused on retrieving candidate herb-related information with user search terms (a list of genes, a disease name, a chemical name or an herb name). HerDing was built by integrating public databases and by applying a text-mining method. The HerDing website is free and open to all users, and there is no login requirement. Database URL: http://combio.gist.ac.kr/herding. © The Author(s) 2016. Published by Oxford University Press.
SORTEZ: a relational translator for NCBI's ASN.1 database.

PubMed

Hart, K W; Searls, D B; Overton, G C

1994-07-01

The National Center for Biotechnology Information (NCBI) has created a database collection that includes several protein and nucleic acid sequence databases, a biosequence-specific subset of MEDLINE, as well as value-added information such as links between similar sequences. Information in the NCBI database is modeled in Abstract Syntax Notation 1 (ASN.1) an Open Systems Interconnection protocol designed for the purpose of exchanging structured data between software applications rather than as a data model for database systems. While the NCBI database is distributed with an easy-to-use information retrieval system, ENTREZ, the ASN.1 data model currently lacks an ad hoc query language for general-purpose data access. For that reason, we have developed a software package, SORTEZ, that transforms the ASN.1 database (or other databases with nested data structures) to a relational data model and subsequently to a relational database management system (Sybase) where information can be accessed through the relational query language, SQL. Because the need to transform data from one data model and schema to another arises naturally in several important contexts, including efficient execution of specific applications, access to multiple databases and adaptation to database evolution this work also serves as a practical study of the issues involved in the various stages of database transformation. We show that transformation from the ASN.1 data model to a relational data model can be largely automated, but that schema transformation and data conversion require considerable domain expertise and would greatly benefit from additional support tools.
A Model Based Mars Climate Database for the Mission Design

NASA Technical Reports Server (NTRS)

2005-01-01

A viewgraph presentation on a model based climate database is shown. The topics include: 1) Why a model based climate database?; 2) Mars Climate Database v3.1 Who uses it ? (approx. 60 users!); 3) The new Mars Climate database MCD v4.0; 4) MCD v4.0: what's new ? 5) Simulation of Water ice clouds; 6) Simulation of Water ice cycle; 7) A new tool for surface pressure prediction; 8) Acces to the database MCD 4.0; 9) How to access the database; and 10) New web access
A new natural hazards data-base for volcanic ash and SO2 from global satellite remote sensing measurements

NASA Astrophysics Data System (ADS)

Stebel, Kerstin; Prata, Fred; Theys, Nicolas; Tampellini, Lucia; Kamstra, Martijn; Zehner, Claus

2014-05-01

Over the last few years there has been a recognition of the utility of satellite measurements to identify and track volcanic emissions that present a natural hazard to human populations. Mitigation of the volcanic hazard to life and the environment requires understanding of the properties of volcanic emissions, identifying the hazard in near real-time and being able to provide timely and accurate forecasts to affected areas. Amongst the many ways to measure volcanic emissions, satellite remote sensing is capable of providing global quantitative retrievals of important microphysical parameters such as ash mass loading, ash particle effective radius, infrared optical depth, SO2 partial and total column abundance, plume altitude, aerosol optical depth and aerosol absorbing index. The eruption of Eyjafjallajökull in April May, 2010 led to increased research and measurement programs to better characterize properties of volcanic ash and the need to establish a data-base in which to store and access these data was confirmed. The European Space Agency (ESA) has recognized the importance of having a quality controlled data-base of satellite retrievals and has funded an activity called Volcanic Ash Strategic Initiative Team VAST (vast.nilu.no) to develop novel remote sensing retrieval schemes and a data-base, initially focused on several recent hazardous volcanic eruptions. In addition, the data-base will host satellite and validation data sets provided from the ESA projects Support to Aviation Control Service SACS (sacs.aeronomie.be) and Study on an end-to-end system for volcanic ash plume monitoring and prediction SMASH. Starting with data for the eruptions of Eyjafjallajökull, Grímsvötn, and Kasatochi, satellite retrievals for Puyhue-Cordon Caulle, Nabro, Merapi, Okmok, Kasatochi and Sarychev Peak will eventually be ingested. Dispersion model simulations are also being included in the data-base. Several atmospheric dispersion models (FLEXPART, SILAM and WRF-Chem) are used in VAST to simulate the dispersion of volcanic ash and SO2 emitted during an eruption. Source terms and dispersion model results will be given. In time, data from conventional in situ sampling instruments, airborne and ground-based remote sensing platforms and other meta-data (bulk ash and gas properties, volcanic setting, volcanic eruption chronologies, potential impacts etc.) will be added. Important applications of the data-base are illustrated related to the ash/aviation problem and to estimating SO2 fluxes from active volcanoes-as a means to diagnose future unrest. The data-base has the potential to provide the natural hazards community with a dynamic atmospheric volcanic hazards map and will be a valuable tool particularly for aviation.
Database for Safety-Oriented Tracking of Chemicals

NASA Technical Reports Server (NTRS)

Stump, Jacob; Carr, Sandra; Plumlee, Debrah; Slater, Andy; Samson, Thomas M.; Holowaty, Toby L.; Skeete, Darren; Haenz, Mary Alice; Hershman, Scot; Raviprakash, Pushpa

2010-01-01

SafetyChem is a computer program that maintains a relational database for tracking chemicals and associated hazards at Johnson Space Center (JSC) by use of a Web-based graphical user interface. The SafetyChem database is accessible to authorized users via a JSC intranet. All new chemicals pass through a safety office, where information on hazards, required personal protective equipment (PPE), fire-protection warnings, and target organ effects (TOEs) is extracted from material safety data sheets (MSDSs) and recorded in the database. The database facilitates real-time management of inventory with attention to such issues as stability, shelf life, reduction of waste through transfer of unused chemicals to laboratories that need them, quantification of chemical wastes, and identification of chemicals for which disposal is required. Upon searching the database for a chemical, the user receives information on physical properties of the chemical, hazard warnings, required PPE, a link to the MSDS, and references to the applicable International Standards Organization (ISO) 9000 standard work instructions and the applicable job hazard analysis. Also, to reduce the labor hours needed to comply with reporting requirements of the Occupational Safety and Health Administration, the data can be directly exported into the JSC hazardous- materials database.

Drug-Path: a database for drug-induced pathways

PubMed Central

Zeng, Hui; Cui, Qinghua

2015-01-01

Some databases for drug-associated pathways have been built and are publicly available. However, the pathways curated in most of these databases are drug-action or drug-metabolism pathways. In recent years, high-throughput technologies such as microarray and RNA-sequencing have produced lots of drug-induced gene expression profiles. Interestingly, drug-induced gene expression profile frequently show distinct patterns, indicating that drugs normally induce the activation or repression of distinct pathways. Therefore, these pathways contribute to study the mechanisms of drugs and drug-repurposing. Here, we present Drug-Path, a database of drug-induced pathways, which was generated by KEGG pathway enrichment analysis for drug-induced upregulated genes and downregulated genes based on drug-induced gene expression datasets in Connectivity Map. Drug-Path provides user-friendly interfaces to retrieve, visualize and download the drug-induced pathway data in the database. In addition, the genes deregulated by a given drug are highlighted in the pathways. All data were organized using SQLite. The web site was implemented using Django, a Python web framework. Finally, we believe that this database will be useful for related researches. Database URL: http://www.cuilab.cn/drugpath PMID:26130661
Development of an electronic database for Acute Pain Service outcomes

PubMed Central

Love, Brandy L; Jensen, Louise A; Schopflocher, Donald; Tsui, Ban CH

2012-01-01

BACKGROUND: Quality assurance is increasingly important in the current health care climate. An electronic database can be used for tracking patient information and as a research tool to provide quality assurance for patient care. OBJECTIVE: An electronic database was developed for the Acute Pain Service, University of Alberta Hospital (Edmonton, Alberta) to record patient characteristics, identify at-risk populations, compare treatment efficacies and guide practice decisions. METHOD: Steps in the database development involved identifying the goals for use, relevant variables to include, and a plan for data collection, entry and analysis. Protocols were also created for data cleaning quality control. The database was evaluated with a pilot test using existing data to assess data collection burden, accuracy and functionality of the database. RESULTS: A literature review resulted in an evidence-based list of demographic, clinical and pain management outcome variables to include. Time to assess patients and collect the data was 20 min to 30 min per patient. Limitations were primarily software related, although initial data collection completion was only 65% and accuracy of data entry was 96%. CONCLUSIONS: The electronic database was found to be relevant and functional for the identified goals of data storage and research. PMID:22518364
Using SQL Databases for Sequence Similarity Searching and Analysis.

PubMed

Pearson, William R; Mackey, Aaron J

2017-09-13

Relational databases can integrate diverse types of information and manage large sets of similarity search results, greatly simplifying genome-scale analyses. By focusing on taxonomic subsets of sequences, relational databases can reduce the size and redundancy of sequence libraries and improve the statistical significance of homologs. In addition, by loading similarity search results into a relational database, it becomes possible to explore and summarize the relationships between all of the proteins in an organism and those in other biological kingdoms. This unit describes how to use relational databases to improve the efficiency of sequence similarity searching and demonstrates various large-scale genomic analyses of homology-related data. It also describes the installation and use of a simple protein sequence database, seqdb_demo, which is used as a basis for the other protocols. The unit also introduces search_demo, a database that stores sequence similarity search results. The search_demo database is then used to explore the evolutionary relationships between E. coli proteins and proteins in other organisms in a large-scale comparative genomic analysis. © 2017 by John Wiley & Sons, Inc. Copyright © 2017 John Wiley & Sons, Inc.
MalaCards: an integrated compendium for diseases and their annotation

PubMed Central

Rappaport, Noa; Nativ, Noam; Stelzer, Gil; Twik, Michal; Guan-Golan, Yaron; Iny Stein, Tsippi; Bahir, Iris; Belinky, Frida; Morrey, C. Paul; Safran, Marilyn; Lancet, Doron

2013-01-01

Comprehensive disease classification, integration and annotation are crucial for biomedical discovery. At present, disease compilation is incomplete, heterogeneous and often lacking systematic inquiry mechanisms. We introduce MalaCards, an integrated database of human maladies and their annotations, modeled on the architecture and strategy of the GeneCards database of human genes. MalaCards mines and merges 44 data sources to generate a computerized card for each of 16 919 human diseases. Each MalaCard contains disease-specific prioritized annotations, as well as inter-disease connections, empowered by the GeneCards relational database, its searches and GeneDecks set analyses. First, we generate a disease list from 15 ranked sources, using disease-name unification heuristics. Next, we use four schemes to populate MalaCards sections: (i) directly interrogating disease resources, to establish integrated disease names, synonyms, summaries, drugs/therapeutics, clinical features, genetic tests and anatomical context; (ii) searching GeneCards for related publications, and for associated genes with corresponding relevance scores; (iii) analyzing disease-associated gene sets in GeneDecks to yield affiliated pathways, phenotypes, compounds and GO terms, sorted by a composite relevance score and presented with GeneCards links; and (iv) searching within MalaCards itself, e.g. for additional related diseases and anatomical context. The latter forms the basis for the construction of a disease network, based on shared MalaCards annotations, embodying associations based on etiology, clinical features and clinical conditions. This broadly disposed network has a power-law degree distribution, suggesting that this might be an inherent property of such networks. Work in progress includes hierarchical malady classification, ontological mapping and disease set analyses, striving to make MalaCards an even more effective tool for biomedical research. Database URL: http://www.malacards.org/ PMID:23584832
The database of the PREDICTS (Projecting Responses of Ecological Diversity In Changing Terrestrial Systems) project.

PubMed

Hudson, Lawrence N; Newbold, Tim; Contu, Sara; Hill, Samantha L L; Lysenko, Igor; De Palma, Adriana; Phillips, Helen R P; Alhusseini, Tamera I; Bedford, Felicity E; Bennett, Dominic J; Booth, Hollie; Burton, Victoria J; Chng, Charlotte W T; Choimes, Argyrios; Correia, David L P; Day, Julie; Echeverría-Londoño, Susy; Emerson, Susan R; Gao, Di; Garon, Morgan; Harrison, Michelle L K; Ingram, Daniel J; Jung, Martin; Kemp, Victoria; Kirkpatrick, Lucinda; Martin, Callum D; Pan, Yuan; Pask-Hale, Gwilym D; Pynegar, Edwin L; Robinson, Alexandra N; Sanchez-Ortiz, Katia; Senior, Rebecca A; Simmons, Benno I; White, Hannah J; Zhang, Hanbin; Aben, Job; Abrahamczyk, Stefan; Adum, Gilbert B; Aguilar-Barquero, Virginia; Aizen, Marcelo A; Albertos, Belén; Alcala, E L; Del Mar Alguacil, Maria; Alignier, Audrey; Ancrenaz, Marc; Andersen, Alan N; Arbeláez-Cortés, Enrique; Armbrecht, Inge; Arroyo-Rodríguez, Víctor; Aumann, Tom; Axmacher, Jan C; Azhar, Badrul; Azpiroz, Adrián B; Baeten, Lander; Bakayoko, Adama; Báldi, András; Banks, John E; Baral, Sharad K; Barlow, Jos; Barratt, Barbara I P; Barrico, Lurdes; Bartolommei, Paola; Barton, Diane M; Basset, Yves; Batáry, Péter; Bates, Adam J; Baur, Bruno; Bayne, Erin M; Beja, Pedro; Benedick, Suzan; Berg, Åke; Bernard, Henry; Berry, Nicholas J; Bhatt, Dinesh; Bicknell, Jake E; Bihn, Jochen H; Blake, Robin J; Bobo, Kadiri S; Bóçon, Roberto; Boekhout, Teun; Böhning-Gaese, Katrin; Bonham, Kevin J; Borges, Paulo A V; Borges, Sérgio H; Boutin, Céline; Bouyer, Jérémy; Bragagnolo, Cibele; Brandt, Jodi S; Brearley, Francis Q; Brito, Isabel; Bros, Vicenç; Brunet, Jörg; Buczkowski, Grzegorz; Buddle, Christopher M; Bugter, Rob; Buscardo, Erika; Buse, Jörn; Cabra-García, Jimmy; Cáceres, Nilton C; Cagle, Nicolette L; Calviño-Cancela, María; Cameron, Sydney A; Cancello, Eliana M; Caparrós, Rut; Cardoso, Pedro; Carpenter, Dan; Carrijo, Tiago F; Carvalho, Anelena L; Cassano, Camila R; Castro, Helena; Castro-Luna, Alejandro A; Rolando, Cerda B; Cerezo, Alexis; Chapman, Kim Alan; Chauvat, Matthieu; Christensen, Morten; Clarke, Francis M; Cleary, Daniel F R; Colombo, Giorgio; Connop, Stuart P; Craig, Michael D; Cruz-López, Leopoldo; Cunningham, Saul A; D'Aniello, Biagio; D'Cruze, Neil; da Silva, Pedro Giovâni; Dallimer, Martin; Danquah, Emmanuel; Darvill, Ben; Dauber, Jens; Davis, Adrian L V; Dawson, Jeff; de Sassi, Claudio; de Thoisy, Benoit; Deheuvels, Olivier; Dejean, Alain; Devineau, Jean-Louis; Diekötter, Tim; Dolia, Jignasu V; Domínguez, Erwin; Dominguez-Haydar, Yamileth; Dorn, Silvia; Draper, Isabel; Dreber, Niels; Dumont, Bertrand; Dures, Simon G; Dynesius, Mats; Edenius, Lars; Eggleton, Paul; Eigenbrod, Felix; Elek, Zoltán; Entling, Martin H; Esler, Karen J; de Lima, Ricardo F; Faruk, Aisyah; Farwig, Nina; Fayle, Tom M; Felicioli, Antonio; Felton, Annika M; Fensham, Roderick J; Fernandez, Ignacio C; Ferreira, Catarina C; Ficetola, Gentile F; Fiera, Cristina; Filgueiras, Bruno K C; Fırıncıoğlu, Hüseyin K; Flaspohler, David; Floren, Andreas; Fonte, Steven J; Fournier, Anne; Fowler, Robert E; Franzén, Markus; Fraser, Lauchlan H; Fredriksson, Gabriella M; Freire, Geraldo B; Frizzo, Tiago L M; Fukuda, Daisuke; Furlani, Dario; Gaigher, René; Ganzhorn, Jörg U; García, Karla P; Garcia-R, Juan C; Garden, Jenni G; Garilleti, Ricardo; Ge, Bao-Ming; Gendreau-Berthiaume, Benoit; Gerard, Philippa J; Gheler-Costa, Carla; Gilbert, Benjamin; Giordani, Paolo; Giordano, Simonetta; Golodets, Carly; Gomes, Laurens G L; Gould, Rachelle K; Goulson, Dave; Gove, Aaron D; Granjon, Laurent; Grass, Ingo; Gray, Claudia L; Grogan, James; Gu, Weibin; Guardiola, Moisès; Gunawardene, Nihara R; Gutierrez, Alvaro G; Gutiérrez-Lamus, Doris L; Haarmeyer, Daniela H; Hanley, Mick E; Hanson, Thor; Hashim, Nor R; Hassan, Shombe N; Hatfield, Richard G; Hawes, Joseph E; Hayward, Matt W; Hébert, Christian; Helden, Alvin J; Henden, John-André; Henschel, Philipp; Hernández, Lionel; Herrera, James P; Herrmann, Farina; Herzog, Felix; Higuera-Diaz, Diego; Hilje, Branko; Höfer, Hubert; Hoffmann, Anke; Horgan, Finbarr G; Hornung, Elisabeth; Horváth, Roland; Hylander, Kristoffer; Isaacs-Cubides, Paola; Ishida, Hiroaki; Ishitani, Masahiro; Jacobs, Carmen T; Jaramillo, Víctor J; Jauker, Birgit; Hernández, F Jiménez; Johnson, McKenzie F; Jolli, Virat; Jonsell, Mats; Juliani, S Nur; Jung, Thomas S; Kapoor, Vena; Kappes, Heike; Kati, Vassiliki; Katovai, Eric; Kellner, Klaus; Kessler, Michael; Kirby, Kathryn R; Kittle, Andrew M; Knight, Mairi E; Knop, Eva; Kohler, Florian; Koivula, Matti; Kolb, Annette; Kone, Mouhamadou; Kőrösi, Ádám; Krauss, Jochen; Kumar, Ajith; Kumar, Raman; Kurz, David J; Kutt, Alex S; Lachat, Thibault; Lantschner, Victoria; Lara, Francisco; Lasky, Jesse R; Latta, Steven C; Laurance, William F; Lavelle, Patrick; Le Féon, Violette; LeBuhn, Gretchen; Légaré, Jean-Philippe; Lehouck, Valérie; Lencinas, María V; Lentini, Pia E; Letcher, Susan G; Li, Qi; Litchwark, Simon A; Littlewood, Nick A; Liu, Yunhui; Lo-Man-Hung, Nancy; López-Quintero, Carlos A; Louhaichi, Mounir; Lövei, Gabor L; Lucas-Borja, Manuel Esteban; Luja, Victor H; Luskin, Matthew S; MacSwiney G, M Cristina; Maeto, Kaoru; Magura, Tibor; Mallari, Neil Aldrin; Malone, Louise A; Malonza, Patrick K; Malumbres-Olarte, Jagoba; Mandujano, Salvador; Måren, Inger E; Marin-Spiotta, Erika; Marsh, Charles J; Marshall, E J P; Martínez, Eliana; Martínez Pastur, Guillermo; Moreno Mateos, David; Mayfield, Margaret M; Mazimpaka, Vicente; McCarthy, Jennifer L; McCarthy, Kyle P; McFrederick, Quinn S; McNamara, Sean; Medina, Nagore G; Medina, Rafael; Mena, Jose L; Mico, Estefania; Mikusinski, Grzegorz; Milder, Jeffrey C; Miller, James R; Miranda-Esquivel, Daniel R; Moir, Melinda L; Morales, Carolina L; Muchane, Mary N; Muchane, Muchai; Mudri-Stojnic, Sonja; Munira, A Nur; Muoñz-Alonso, Antonio; Munyekenye, B F; Naidoo, Robin; Naithani, A; Nakagawa, Michiko; Nakamura, Akihiro; Nakashima, Yoshihiro; Naoe, Shoji; Nates-Parra, Guiomar; Navarrete Gutierrez, Dario A; Navarro-Iriarte, Luis; Ndang'ang'a, Paul K; Neuschulz, Eike L; Ngai, Jacqueline T; Nicolas, Violaine; Nilsson, Sven G; Noreika, Norbertas; Norfolk, Olivia; Noriega, Jorge Ari; Norton, David A; Nöske, Nicole M; Nowakowski, A Justin; Numa, Catherine; O'Dea, Niall; O'Farrell, Patrick J; Oduro, William; Oertli, Sabine; Ofori-Boateng, Caleb; Oke, Christopher Omamoke; Oostra, Vicencio; Osgathorpe, Lynne M; Otavo, Samuel Eduardo; Page, Navendu V; Paritsis, Juan; Parra-H, Alejandro; Parry, Luke; Pe'er, Guy; Pearman, Peter B; Pelegrin, Nicolás; Pélissier, Raphaël; Peres, Carlos A; Peri, Pablo L; Persson, Anna S; Petanidou, Theodora; Peters, Marcell K; Pethiyagoda, Rohan S; Phalan, Ben; Philips, T Keith; Pillsbury, Finn C; Pincheira-Ulbrich, Jimmy; Pineda, Eduardo; Pino, Joan; Pizarro-Araya, Jaime; Plumptre, A J; Poggio, Santiago L; Politi, Natalia; Pons, Pere; Poveda, Katja; Power, Eileen F; Presley, Steven J; Proença, Vânia; Quaranta, Marino; Quintero, Carolina; Rader, Romina; Ramesh, B R; Ramirez-Pinilla, Martha P; Ranganathan, Jai; Rasmussen, Claus; Redpath-Downing, Nicola A; Reid, J Leighton; Reis, Yana T; Rey Benayas, José M; Rey-Velasco, Juan Carlos; Reynolds, Chevonne; Ribeiro, Danilo Bandini; Richards, Miriam H; Richardson, Barbara A; Richardson, Michael J; Ríos, Rodrigo Macip; Robinson, Richard; Robles, Carolina A; Römbke, Jörg; Romero-Duque, Luz Piedad; Rös, Matthias; Rosselli, Loreta; Rossiter, Stephen J; Roth, Dana S; Roulston, T'ai H; Rousseau, Laurent; Rubio, André V; Ruel, Jean-Claude; Sadler, Jonathan P; Sáfián, Szabolcs; Saldaña-Vázquez, Romeo A; Sam, Katerina; Samnegård, Ulrika; Santana, Joana; Santos, Xavier; Savage, Jade; Schellhorn, Nancy A; Schilthuizen, Menno; Schmiedel, Ute; Schmitt, Christine B; Schon, Nicole L; Schüepp, Christof; Schumann, Katharina; Schweiger, Oliver; Scott, Dawn M; Scott, Kenneth A; Sedlock, Jodi L; Seefeldt, Steven S; Shahabuddin, Ghazala; Shannon, Graeme; Sheil, Douglas; Sheldon, Frederick H; Shochat, Eyal; Siebert, Stefan J; Silva, Fernando A B; Simonetti, Javier A; Slade, Eleanor M; Smith, Jo; Smith-Pardo, Allan H; Sodhi, Navjot S; Somarriba, Eduardo J; Sosa, Ramón A; Soto Quiroga, Grimaldo; St-Laurent, Martin-Hugues; Starzomski, Brian M; Stefanescu, Constanti; Steffan-Dewenter, Ingolf; Stouffer, Philip C; Stout, Jane C; Strauch, Ayron M; Struebig, Matthew J; Su, Zhimin; Suarez-Rubio, Marcela; Sugiura, Shinji; Summerville, Keith S; Sung, Yik-Hei; Sutrisno, Hari; Svenning, Jens-Christian; Teder, Tiit; Threlfall, Caragh G; Tiitsaar, Anu; Todd, Jacqui H; Tonietto, Rebecca K; Torre, Ignasi; Tóthmérész, Béla; Tscharntke, Teja; Turner, Edgar C; Tylianakis, Jason M; Uehara-Prado, Marcio; Urbina-Cardona, Nicolas; Vallan, Denis; Vanbergen, Adam J; Vasconcelos, Heraldo L; Vassilev, Kiril; Verboven, Hans A F; Verdasca, Maria João; Verdú, José R; Vergara, Carlos H; Vergara, Pablo M; Verhulst, Jort; Virgilio, Massimiliano; Vu, Lien Van; Waite, Edward M; Walker, Tony R; Wang, Hua-Feng; Wang, Yanping; Watling, James I; Weller, Britta; Wells, Konstans; Westphal, Catrin; Wiafe, Edward D; Williams, Christopher D; Willig, Michael R; Woinarski, John C Z; Wolf, Jan H D; Wolters, Volkmar; Woodcock, Ben A; Wu, Jihua; Wunderle, Joseph M; Yamaura, Yuichi; Yoshikura, Satoko; Yu, Douglas W; Zaitsev, Andrey S; Zeidler, Juliane; Zou, Fasheng; Collen, Ben; Ewers, Rob M; Mace, Georgina M; Purves, Drew W; Scharlemann, Jörn P W; Purvis, Andy

2017-01-01

The PREDICTS project-Projecting Responses of Ecological Diversity In Changing Terrestrial Systems (www.predicts.org.uk)-has collated from published studies a large, reasonably representative database of comparable samples of biodiversity from multiple sites that differ in the nature or intensity of human impacts relating to land use. We have used this evidence base to develop global and regional statistical models of how local biodiversity responds to these measures. We describe and make freely available this 2016 release of the database, containing more than 3.2 million records sampled at over 26,000 locations and representing over 47,000 species. We outline how the database can help in answering a range of questions in ecology and conservation biology. To our knowledge, this is the largest and most geographically and taxonomically representative database of spatial comparisons of biodiversity that has been collated to date; it will be useful to researchers and international efforts wishing to model and understand the global status of biodiversity.
PHASE I MATERIALS PROPERTY DATABASE DEVELOPMENT FOR ASME CODES AND STANDARDS

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ren, Weiju; Lin, Lianshan

2013-01-01

To support the ASME Boiler and Pressure Vessel Codes and Standard (BPVC) in modern information era, development of a web-based materials property database is initiated under the supervision of ASME Committee on Materials. To achieve efficiency, the project heavily draws upon experience from development of the Gen IV Materials Handbook and the Nuclear System Materials Handbook. The effort is divided into two phases. Phase I is planned to deliver a materials data file warehouse that offers a depository for various files containing raw data and background information, and Phase II will provide a relational digital database that provides advanced featuresmore » facilitating digital data processing and management. Population of the database will start with materials property data for nuclear applications and expand to data covering the entire ASME Code and Standards including the piping codes as the database structure is continuously optimized. The ultimate goal of the effort is to establish a sound cyber infrastructure that support ASME Codes and Standards development and maintenance.« less
GenBank.

PubMed

Benson, Dennis A; Karsch-Mizrachi, Ilene; Lipman, David J; Ostell, James; Wheeler, David L

2008-01-01

GenBank (R) is a comprehensive database that contains publicly available nucleotide sequences for more than 260 000 named organisms, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects. Most submissions are made using the web-based BankIt or standalone Sequin programs and accession numbers are assigned by GenBank staff upon receipt. Daily data exchange with the European Molecular Biology Laboratory Nucleotide Sequence Database in Europe and the DNA Data Bank of Japan ensures worldwide coverage. GenBank is accessible through NCBI's retrieval system, Entrez, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, begin at the NCBI Homepage: www.ncbi.nlm.nih.gov.
GenBank

PubMed Central

Benson, Dennis A.; Karsch-Mizrachi, Ilene; Lipman, David J.; Ostell, James; Wheeler, David L.

2008-01-01

GenBank (R) is a comprehensive database that contains publicly available nucleotide sequences for more than 260 000 named organisms, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects. Most submissions are made using the web-based BankIt or standalone Sequin programs and accession numbers are assigned by GenBank staff upon receipt. Daily data exchange with the European Molecular Biology Laboratory Nucleotide Sequence Database in Europe and the DNA Data Bank of Japan ensures worldwide coverage. GenBank is accessible through NCBI's retrieval system, Entrez, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, begin at the NCBI Homepage: www.ncbi.nlm.nih.gov PMID:18073190
High-quality unsaturated zone hydraulic property data for hydrologic applications

USGS Publications Warehouse

Perkins, Kimberlie; Nimmo, John R.

2009-01-01

In hydrologic studies, especially those using dynamic unsaturated zone moisture modeling, calculations based on property transfer models informed by hydraulic property databases are often used in lieu of measured data from the site of interest. Reliance on database-informed predicted values has become increasingly common with the use of neural networks. High-quality data are needed for databases used in this way and for theoretical and property transfer model development and testing. Hydraulic properties predicted on the basis of existing databases may be adequate in some applications but not others. An obvious problem occurs when the available database has few or no data for samples that are closely related to the medium of interest. The data set presented in this paper includes saturated and unsaturated hydraulic conductivity, water retention, particle-size distributions, and bulk properties. All samples are minimally disturbed, all measurements were performed using the same state of the art techniques and the environments represented are diverse.
Large Differences in Global and Regional Total Soil Carbon Stock Estimates Based on SoilGrids, HWSD, and NCSCD: Intercomparison and Evaluation Based on Field Data From USA, England, Wales, and France

NASA Astrophysics Data System (ADS)

Tifafi, Marwa; Guenet, Bertrand; Hatté, Christine

2018-01-01

Soils are the major component of the terrestrial ecosystem and the largest organic carbon reservoir on Earth. However, they are a nonrenewable natural resource and especially reactive to human disturbance and climate change. Despite its importance, soil carbon dynamics is an important source of uncertainty for future climate predictions and there is a growing need for more precise information to better understand the mechanisms controlling soil carbon dynamics and better constrain Earth system models. The aim of our work is to compare soil organic carbon stocks given by different global and regional databases that already exist. We calculated global and regional soil carbon stocks at 1 m depth given by three existing databases (SoilGrids, the Harmonized World Soil Database, and the Northern Circumpolar Soil Carbon Database). We observed that total stocks predicted by each product differ greatly: it is estimated to be around 3,400 Pg by SoilGrids and is about 2,500 Pg according to Harmonized World Soil Database. This difference is marked in particular for boreal regions where differences can be related to high disparities in soil organic carbon concentration. Differences in other regions are more limited and may be related to differences in bulk density estimates. Finally, evaluation of the three data sets versus ground truth data shows that (i) there is a significant difference in spatial patterns between ground truth data and compared data sets and that (ii) data sets underestimate by more than 40% the soil organic carbon stock compared to field data.
Vaxvec: The first web-based recombinant vaccine vector database and its data analysis

PubMed Central

Deng, Shunzhou; Martin, Carly; Patil, Rasika; Zhu, Felix; Zhao, Bin; Xiang, Zuoshuang; He, Yongqun

2015-01-01

A recombinant vector vaccine uses an attenuated virus, bacterium, or parasite as the carrier to express a heterologous antigen(s). Many recombinant vaccine vectors and related vaccines have been developed and extensively investigated. To compare and better understand recombinant vectors and vaccines, we have generated Vaxvec (http://www.violinet.org/vaxvec), the first web-based database that stores various recombinant vaccine vectors and those experimentally verified vaccines that use these vectors. Vaxvec has now included 59 vaccine vectors that have been used in 196 recombinant vector vaccines against 66 pathogens and cancers. These vectors are classified to 41 viral vectors, 15 bacterial vectors, 1 parasitic vector, and 1 fungal vector. The most commonly used viral vaccine vectors are double-stranded DNA viruses, including herpesviruses, adenoviruses, and poxviruses. For example, Vaxvec includes 63 poxvirus-based recombinant vaccines for over 20 pathogens and cancers. Vaxvec collects 30 recombinant vector influenza vaccines that use 17 recombinant vectors and were experimentally tested in 7 animal models. In addition, over 60 protective antigens used in recombinant vector vaccines are annotated and analyzed. User-friendly web-interfaces are available for querying various data in Vaxvec. To support data exchange, the information of vaccine vectors, vaccines, and related information is stored in the Vaccine Ontology (VO). Vaxvec is a timely and vital source of vaccine vector database and facilitates efficient vaccine vector research and development. PMID:26403370
System, method and apparatus for generating phrases from a database

NASA Technical Reports Server (NTRS)

McGreevy, Michael W. (Inventor)

2004-01-01

A phrase generation is a method of generating sequences of terms, such as phrases, that may occur within a database of subsets containing sequences of terms, such as text. A database is provided and a relational model of the database is created. A query is then input. The query includes a term or a sequence of terms or multiple individual terms or multiple sequences of terms or combinations thereof. Next, several sequences of terms that are contextually related to the query are assembled from contextual relations in the model of the database. The sequences of terms are then sorted and output. Phrase generation can also be an iterative process used to produce sequences of terms from a relational model of a database.
Architecture for biomedical multimedia information delivery on the World Wide Web

NASA Astrophysics Data System (ADS)

Long, L. Rodney; Goh, Gin-Hua; Neve, Leif; Thoma, George R.

1997-10-01

Research engineers at the National Library of Medicine are building a prototype system for the delivery of multimedia biomedical information on the World Wide Web. This paper discuses the architecture and design considerations for the system, which will be used initially to make images and text from the third National Health and Nutrition Examination Survey (NHANES) publicly available. We categorized our analysis as follows: (1) fundamental software tools: we analyzed trade-offs among use of conventional HTML/CGI, X Window Broadway, and Java; (2) image delivery: we examined the use of unconventional TCP transmission methods; (3) database manager and database design: we discuss the capabilities and planned use of the Informix object-relational database manager and the planned schema for the HNANES database; (4) storage requirements for our Sun server; (5) user interface considerations; (6) the compatibility of the system with other standard research and analysis tools; (7) image display: we discuss considerations for consistent image display for end users. Finally, we discuss the scalability of the system in terms of incorporating larger or more databases of similar data, and the extendibility of the system for supporting content-based retrieval of biomedical images. The system prototype is called the Web-based Medical Information Retrieval System. An early version was built as a Java applet and tested on Unix, PC, and Macintosh platforms. This prototype used the MiniSQL database manager to do text queries on a small database of records of participants in the second NHANES survey. The full records and associated x-ray images were retrievable and displayable on a standard Web browser. A second version has now been built, also a Java applet, using the MySQL database manager.
Snoring classified: The Munich-Passau Snore Sound Corpus.

PubMed

Janott, Christoph; Schmitt, Maximilian; Zhang, Yue; Qian, Kun; Pandit, Vedhas; Zhang, Zixing; Heiser, Clemens; Hohenhorst, Winfried; Herzog, Michael; Hemmert, Werner; Schuller, Björn

2018-03-01

Snoring can be excited in different locations within the upper airways during sleep. It was hypothesised that the excitation locations are correlated with distinct acoustic characteristics of the snoring noise. To verify this hypothesis, a database of snore sounds is developed, labelled with the location of sound excitation. Video and audio recordings taken during drug induced sleep endoscopy (DISE) examinations from three medical centres have been semi-automatically screened for snore events, which subsequently have been classified by ENT experts into four classes based on the VOTE classification. The resulting dataset containing 828 snore events from 219 subjects has been split into Train, Development, and Test sets. An SVM classifier has been trained using low level descriptors (LLDs) related to energy, spectral features, mel frequency cepstral coefficients (MFCC), formants, voicing, harmonic-to-noise ratio (HNR), spectral harmonicity, pitch, and microprosodic features. An unweighted average recall (UAR) of 55.8% could be achieved using the full set of LLDs including formants. Best performing subset is the MFCC-related set of LLDs. A strong difference in performance could be observed between the permutations of train, development, and test partition, which may be caused by the relatively low number of subjects included in the smaller classes of the strongly unbalanced data set. A database of snoring sounds is presented which are classified according to their sound excitation location based on objective criteria and verifiable video material. With the database, it could be demonstrated that machine classifiers can distinguish different excitation location of snoring sounds in the upper airway based on acoustic parameters. Copyright © 2018 Elsevier Ltd. All rights reserved.
TMDB: a literature-curated database for small molecular compounds found from tea.

PubMed

Yue, Yi; Chu, Gang-Xiu; Liu, Xue-Shi; Tang, Xing; Wang, Wei; Liu, Guang-Jin; Yang, Tao; Ling, Tie-Jun; Wang, Xiao-Gang; Zhang, Zheng-Zhu; Xia, Tao; Wan, Xiao-Chun; Bao, Guan-Hu

2014-09-16

Tea is one of the most consumed beverages worldwide. The healthy effects of tea are attributed to a wealthy of different chemical components from tea. Thousands of studies on the chemical constituents of tea had been reported. However, data from these individual reports have not been collected into a single database. The lack of a curated database of related information limits research in this field, and thus a cohesive database system should necessarily be constructed for data deposit and further application. The Tea Metabolome database (TMDB), a manually curated and web-accessible database, was developed to provide detailed, searchable descriptions of small molecular compounds found in Camellia spp. esp. in the plant Camellia sinensis and compounds in its manufactured products (different kinds of tea infusion). TMDB is currently the most complete and comprehensive curated collection of tea compounds data in the world. It contains records for more than 1393 constituents found in tea with information gathered from 364 published books, journal articles, and electronic databases. It also contains experimental 1H NMR and 13C NMR data collected from the purified reference compounds or collected from other database resources such as HMDB. TMDB interface allows users to retrieve tea compounds entries by keyword search using compound name, formula, occurrence, and CAS register number. Each entry in the TMDB contains an average of 24 separate data fields including its original plant species, compound structure, formula, molecular weight, name, CAS registry number, compound types, compound uses including healthy benefits, reference literatures, NMR, MS data, and the corresponding ID from databases such as HMDB and Pubmed. Users can also contribute novel regulatory entries by using a web-based submission page. The TMDB database is freely accessible from the URL of http://pcsb.ahau.edu.cn:8080/TCDB/index.jsp. The TMDB is designed to address the broad needs of tea biochemists, natural products chemists, nutritionists, and members of tea related research community. The TMDB database provides a solid platform for collection, standardization, and searching of compounds information found in tea. As such this database will be a comprehensive repository for tea biochemistry and tea health research community.
Validity of Administrative Data in Identifying Cancer-related Events in Adolescents and Young Adults: A Population-based Study Using the IMPACT Cohort.

PubMed

Gupta, Sumit; Nathan, Paul C; Baxter, Nancy N; Lau, Cindy; Daly, Corinne; Pole, Jason D

2018-06-01

Despite the importance of estimating population level cancer outcomes, most registries do not collect critical events such as relapse. Attempts to use health administrative data to identify these events have focused on older adults and have been mostly unsuccessful. We developed and tested administrative data-based algorithms in a population-based cohort of adolescents and young adults with cancer. We identified all Ontario adolescents and young adults 15-21 years old diagnosed with leukemia, lymphoma, sarcoma, or testicular cancer between 1992-2012. Chart abstraction determined the end of initial treatment (EOIT) date and subsequent cancer-related events (progression, relapse, second cancer). Linkage to population-based administrative databases identified fee and procedure codes indicating cancer treatment or palliative care. Algorithms determining EOIT based on a time interval free of treatment-associated codes, and new cancer-related events based on billing codes, were compared with chart-abstracted data. The cohort comprised 1404 patients. Time periods free of treatment-associated codes did not validly identify EOIT dates; using subsequent codes to identify new cancer events was thus associated with low sensitivity (56.2%). However, using administrative data codes that occurred after the EOIT date based on chart abstraction, the first cancer-related event was identified with excellent validity (sensitivity, 87.0%; specificity, 93.3%; positive predictive value, 81.5%; negative predictive value, 95.5%). Although administrative data alone did not validly identify cancer-related events, administrative data in combination with chart collected EOIT dates was associated with excellent validity. The collection of EOIT dates by cancer registries would significantly expand the potential of administrative data linkage to assess cancer outcomes.
Building an integrated neurodegenerative disease database at an academic health center.

PubMed

Xie, Sharon X; Baek, Young; Grossman, Murray; Arnold, Steven E; Karlawish, Jason; Siderowf, Andrew; Hurtig, Howard; Elman, Lauren; McCluskey, Leo; Van Deerlin, Vivianna; Lee, Virginia M-Y; Trojanowski, John Q

2011-07-01

It is becoming increasingly important to study common and distinct etiologies, clinical and pathological features, and mechanisms related to neurodegenerative diseases such as Alzheimer's disease, Parkinson's disease, amyotrophic lateral sclerosis, and frontotemporal lobar degeneration. These comparative studies rely on powerful database tools to quickly generate data sets that match diverse and complementary criteria set by them. In this article, we present a novel integrated neurodegenerative disease (INDD) database, which was developed at the University of Pennsylvania (Penn) with the help of a consortium of Penn investigators. Because the work of these investigators are based on Alzheimer's disease, Parkinson's disease, amyotrophic lateral sclerosis, and frontotemporal lobar degeneration, it allowed us to achieve the goal of developing an INDD database for these major neurodegenerative disorders. We used the Microsoft SQL server as a platform, with built-in "backwards" functionality to provide Access as a frontend client to interface with the database. We used PHP Hypertext Preprocessor to create the "frontend" web interface and then used a master lookup table to integrate individual neurodegenerative disease databases. We also present methods of data entry, database security, database backups, and database audit trails for this INDD database. Using the INDD database, we compared the results of a biomarker study with those using an alternative approach by querying individual databases separately. We have demonstrated that the Penn INDD database has the ability to query multiple database tables from a single console with high accuracy and reliability. The INDD database provides a powerful tool for generating data sets in comparative studies on several neurodegenerative diseases. Copyright © 2011 The Alzheimer's Association. Published by Elsevier Inc. All rights reserved.
eMelanoBase: an online locus-specific variant database for familial melanoma.

PubMed

Fung, David C Y; Holland, Elizabeth A; Becker, Therese M; Hayward, Nicholas K; Bressac-de Paillerets, Brigitte; Mann, Graham J

2003-01-01

A proportion of melanoma-prone individuals in both familial and non-familial contexts has been shown to carry inactivating mutations in either CDKN2A or, rarely, CDK4. CDKN2A is a complex locus that encodes two unrelated proteins from alternately spliced transcripts that are read in different frames. The alpha transcript (exons 1alpha, 2, and 3) produces the p16INK4A cyclin-dependent kinase inhibitor, while the beta transcript (exons 1beta and 2) is translated as p14ARF, a stabilizing factor of p53 levels through binding to MDM2. Mutations in exon 2 can impair both polypeptides and insertions and deletions in exons 1alpha, 1beta, and 2, which can theoretically generate p16INK4A-p14ARF fusion proteins. No online database currently takes into account all the consequences of these genotypes, a situation compounded by some problematic previous annotations of CDKN2A-related sequences and descriptions of their mutations. As an initiative of the international Melanoma Genetics Consortium, we have therefore established a database of germline variants observed in all loci implicated in familial melanoma susceptibility. Such a comprehensive, publicly accessible database is an essential foundation for research on melanoma susceptibility and its clinical application. Our database serves two types of data as defined by HUGO. The core dataset includes the nucleotide variants on the genomic and transcript levels, amino acid variants, and citation. The ancillary dataset includes keyword description of events at the transcription and translation levels and epidemiological data. The application that handles users' queries was designed in the model-view-controller architecture and was implemented in Java. The object-relational database schema was deduced using functional dependency analysis. We hereby present our first functional prototype of eMelanoBase. The service is accessible via the URL www.wmi.usyd.edu.au:8080/melanoma.html. Copyright 2002 Wiley-Liss, Inc.
Dam Removal Information Portal (DRIP)—A map-based resource linking scientific studies and associated geospatial information about dam removals

USGS Publications Warehouse

Duda, Jeffrey J.; Wieferich, Daniel J.; Bristol, R. Sky; Bellmore, J. Ryan; Hutchison, Vivian B.; Vittum, Katherine M.; Craig, Laura; Warrick, Jonathan A.

2016-08-18

The removal of dams has recently increased over historical levels due to aging infrastructure, changing societal needs, and modern safety standards rendering some dams obsolete. Where possibilities for river restoration, or improved safety, exceed the benefits of retaining a dam, removal is more often being considered as a viable option. Yet, as this is a relatively new development in the history of river management, science is just beginning to guide our understanding of the physical and ecological implications of dam removal. Ultimately, the “lessons learned” from previous scientific studies on the outcomes dam removal could inform future scientific understanding of ecosystem outcomes, as well as aid in decision-making by stakeholders. We created a database visualization tool, the Dam Removal Information Portal (DRIP), to display map-based, interactive information about the scientific studies associated with dam removals. Serving both as a bibliographic source as well as a link to other existing databases like the National Hydrography Dataset, the derived National Dam Removal Science Database serves as the foundation for a Web-based application that synthesizes the existing scientific studies associated with dam removals. Thus, using the DRIP application, users can explore information about completed dam removal projects (for example, their location, height, and date removed), as well as discover sources and details of associated of scientific studies. As such, DRIP is intended to be a dynamic collection of scientific information related to dams that have been removed in the United States and elsewhere. This report describes the architecture and concepts of this “metaknowledge” database and the DRIP visualization tool.
[Exposed workers to lung cancer risk. An estimation using the ISPESL database of enterprises].

PubMed

Scarselli, Alberto; Marinaccio, Alessandro; Nesti, Massimo

2007-01-01

lung cancer is the first cause of death in the industrialized country among males and is increasing among females. In 2001 a uniform and standardised list of occupations or jobs known or suspected to be associated with lung cancer has been prepared. The aim of this study is to set up a database of Italian enterprises corresponding to activities related to this list and to assess the number of potentially exposed workers. a detailed and unique list of codes, referred to Ateco91 ISTAT classification with exclusion of the State Railways and the public administration sectors, has been developed. The list is divided into two categories: respectively for occupations or jobs definitely entailing carcinogenic risk and for those which probably/possibly entail a risk. Firms have been selected from the ISPESL database of enterprises and the number of workers has been estimated on the basis of this list. Italy. assessment of the number of workers potentially exposed to lung cancer risk and creation of a register of involved firms. the number of potentially exposed workers in the industrial and services sector related to lung cancer risk is 650,886 blue collars and the number of firms censused in Italy is 117,006 units. Corresponding figures in the agriculture sector are 163,340 and 84,839. This type of evaluation, being based on administrative sources rather then on direct measures of exposure, certainly includes an overestimation of exposed workers. the lists based on a standard classification which have been created allow for the creation of databases which can be used to control occupational exposure to carcinogens and to increase comparability between epidemiologic studies based on job-exposure matrix.

Palaeo-sea-level and palaeo-ice-sheet databases: problems, strategies, and perspectives

NASA Astrophysics Data System (ADS)

Düsterhus, André; Rovere, Alessio; Carlson, Anders E.; Horton, Benjamin P.; Klemann, Volker; Tarasov, Lev; Barlow, Natasha L. M.; Bradwell, Tom; Clark, Jorie; Dutton, Andrea; Gehrels, W. Roland; Hibbert, Fiona D.; Hijma, Marc P.; Khan, Nicole; Kopp, Robert E.; Sivan, Dorit; Törnqvist, Torbjörn E.

2016-04-01

Sea-level and ice-sheet databases have driven numerous advances in understanding the Earth system. We describe the challenges and offer best strategies that can be adopted to build self-consistent and standardised databases of geological and geochemical information used to archive palaeo-sea-levels and palaeo-ice-sheets. There are three phases in the development of a database: (i) measurement, (ii) interpretation, and (iii) database creation. Measurement should include the objective description of the position and age of a sample, description of associated geological features, and quantification of uncertainties. Interpretation of the sample may have a subjective component, but it should always include uncertainties and alternative or contrasting interpretations, with any exclusion of existing interpretations requiring a full justification. During the creation of a database, an approach based on accessibility, transparency, trust, availability, continuity, completeness, and communication of content (ATTAC3) must be adopted. It is essential to consider the community that creates and benefits from a database. We conclude that funding agencies should not only consider the creation of original data in specific research-question-oriented projects, but also include the possibility of using part of the funding for IT-related and database creation tasks, which are essential to guarantee accessibility and maintenance of the collected data.
Attenuation relation for strong motion in Eastern Java based on appropriate database and method

NASA Astrophysics Data System (ADS)

Mahendra, Rian; Rohadi, Supriyanto; Rudyanto, Ariska

2017-07-01

The selection and determination of attenuation relation has become important for seismic hazard assessment in active seismic region. This research initially constructs the appropriate strong motion database, including site condition and type of the earthquake. The data set consisted of large number earthquakes of 5 ≤ Mw ≤ 9 and distance less than 500 km that occurred around Java from 2009 until 2016. The location and depth of earthquake are being relocated using double difference method to improve the quality of database. Strong motion data from twelve BMKG's accelerographs which are located in east Java is used. The site condition is known by using dominant period and Vs30. The type of earthquake is classified into crustal earthquake, interface, and intraslab based on slab geometry analysis. A total of 10 Ground Motion Prediction Equations (GMPEs) are tested using Likelihood (Scherbaum et al., 2004) and Euclidean Distance Ranking method (Kale and Akkar, 2012) with the associated database. The evaluation of these methods lead to a set of GMPEs that can be applied for seismic hazard in East Java where the strong motion data is collected. The result of these methods found that there is still high deviation of GMPEs, so the writer modified some GMPEs using inversion method. Validation was performed by analysing the attenuation curve of the selected GMPE and observation data in period 2015 up to 2016. The results show that the selected GMPE is suitable for estimated PGA value in East Java.
Agreement between ethnicity recorded in two New Zealand health databases: effects of discordance on cardiovascular outcome measures (PREDICT CVD3).

PubMed

Marshall, Roger J; Zhang, Zhongqian; Broad, Joanna B; Wells, Sue

2007-06-01

To assess agreement between ethnicity as recorded by two independent databases in New Zealand, PREDICT and the National Health Index (NHI), and to assess sensitivity of ethnic-specific measures of health outcomes to either ethnicity record. Patients assessed using PREDICT form the study cohort. Ethnicity was recorded for PREDICT and an associated NHI ethnicity code was identified by merge-match linking on an encrypted NHI number. Agreement between ethnicity measures was assessed by kappa scores and scaled rectangle diagrams. A cohort of 18,239 individuals was linked in both PREDICT and NHI databases. The agreement between ethnicity classifications was reasonably good, with overall kappa coefficient of 0.82. There was better agreement for women than men and agreement improved with age and with time since the PREDICT system has been operational. Ethnic-specific cardiovascular (CVD) hospital admission rates were sensitive to ethnicity coding by NHI or PREDICT; rate ratios for ethnic groups, relative to European, based on PREDICT were attenuated towards the null relative to the NHI classification. Agreement between ethnicity was moderately good. Discordances that do exist do not have a substantial effect on prevalence-based measures of effect; however, they do on measurement of the admission of CVD. Different categorisations of ethnicity data from routine (and other) databases can lead to different ethnic-specific estimates of epidemiological effects. There is an imperative to record ethnicity in a rational, systematic and consistent way.
ToxReporter: viewing the genome through the eyes of a toxicologist.

PubMed

Gosink, Mark

2016-01-01

One of the many roles of a toxicologist is to determine if an observed adverse event (AE) is related to a previously unrecognized function of a given gene/protein. Towards that end, he or she will search a variety of public and propriety databases for information linking that protein to the observed AE. However, these databases tend to present all available information about a protein, which can be overwhelming, limiting the ability to find information about the specific toxicity being investigated. ToxReporter compiles information from a broad selection of resources and limits display of the information to user-selected areas of interest. ToxReporter is a PERL-based web-application which utilizes a MySQL database to streamline this process by categorizing public and proprietary domain-derived information into predefined safety categories according to a customizable lexicon. Users can view gene information that is 'red-flagged' according to the safety issue under investigation. ToxReporter also uses a scoring system based on relative counts of the red-flags to rank all genes for the amount of information pertaining to each safety issue and to display their scored ranking as an easily interpretable 'Tox-At-A-Glance' chart. Although ToxReporter was originally developed to display safety information, its flexible design could easily be adapted to display disease information as well.Database URL: ToxReporter is freely available at https://github.com/mgosink/ToxReporter. © The Author(s) 2016. Published by Oxford University Press.
An Evidence-Based Systematic Review on Communication Treatments for Individuals with Right Hemisphere Brain Damage

ERIC Educational Resources Information Center

Blake, Margaret Lehman; Frymark, Tobi; Venedictov, Rebecca

2013-01-01

Purpose: The purpose of this review is to evaluate and summarize the research evidence related to the treatment of individuals with right hemisphere communication disorders. Method: A comprehensive search of the literature using key words related to right hemisphere brain damage and communication treatment was conducted in 27 databases (e.g.,…
Data model and relational database design for the New England Water-Use Data System (NEWUDS)

USGS Publications Warehouse

Tessler, Steven

2001-01-01

The New England Water-Use Data System (NEWUDS) is a database for the storage and retrieval of water-use data. NEWUDS can handle data covering many facets of water use, including (1) tracking various types of water-use activities (withdrawals, returns, transfers, distributions, consumptive-use, wastewater collection, and treatment); (2) the description, classification and location of places and organizations involved in water-use activities; (3) details about measured or estimated volumes of water associated with water-use activities; and (4) information about data sources and water resources associated with water use. In NEWUDS, each water transaction occurs unidirectionally between two site objects, and the sites and conveyances form a water network. The core entities in the NEWUDS model are site, conveyance, transaction/rate, location, and owner. Other important entities include water resources (used for withdrawals and returns), data sources, and aliases. Multiple water-exchange estimates can be stored for individual transactions based on different methods or data sources. Storage of user-defined details is accommodated for several of the main entities. Numerous tables containing classification terms facilitate detailed descriptions of data items and can be used for routine or custom data summarization. NEWUDS handles single-user and aggregate-user water-use data, can be used for large or small water-network projects, and is available as a stand-alone Microsoft? Access database structure. Users can customize and extend the database, link it to other databases, or implement the design in other relational database applications.
Progress towards a Spacecraft-Associated Microbial Meta-database (SAMM)

NASA Astrophysics Data System (ADS)

Mogul, Rakesh; Keagy, Laura; Nava, Argelia; Zerehi, Farah

The microbial inventories within the assembly facilities for spacecraft represent the primary pool of forward contaminants that may compromise life-detection missions. Accordingly, we are constructing a meta-database of these microorganisms for the purpose of building a bioinformatic resource for planetary protection and astrobiology-related endeavors. Using student-led efforts, the meta-database is being constructed from literature reports and is inclusive of both isolated microorganisms and those solely detected through DNA-based techniques. The Spacecraft-Associated Microbial Meta-database (SAMM) currently includes over 800 entries that are organized using 32 meta-tags involving taxonomy, location of isolation (facility and component), category of characterization (culture and/or genetic), types of characterizations (e.g., culture, 16s rDNA, phylochip, FAME, and DNA hybridization), growth conditions, Gram stain, and general physiological traits (e.g., sporulation, extremotolerance, and respiration properties). Interrogations on the database show that the cleanrooms at Kennedy Space Center (KSC) are ~ 2-fold greater in diversity in bacterial genera when compared to the Jet Propulsion Laboratory (JPL), and that bacteria related to water, plant, and human environments are more often associated with the KSC-specific genera. These results are parallel to those reported in the literature, and hence serve as benchmarks demonstrating the bioinformatic potential of this meta-database. The ultimate plans for SAMM include public availability, expansion through crowdsourcing efforts, and potential use as a companion resource to the culture collections assembled by DSMZ and JPL.
The Danish Cardiac Rehabilitation Database.

PubMed

Zwisler, Ann-Dorthe; Rossau, Henriette Knold; Nakano, Anne; Foghmar, Sussie; Eichhorst, Regina; Prescott, Eva; Cerqueira, Charlotte; Soja, Anne Merete Boas; Gislason, Gunnar H; Larsen, Mogens Lytken; Andersen, Ulla Overgaard; Gustafsson, Ida; Thomsen, Kristian K; Boye Hansen, Lene; Hammer, Signe; Viggers, Lone; Christensen, Bo; Kvist, Birgitte; Lindström Egholm, Cecilie; May, Ole

2016-01-01

The Danish Cardiac Rehabilitation Database (DHRD) aims to improve the quality of cardiac rehabilitation (CR) to the benefit of patients with coronary heart disease (CHD). Hospitalized patients with CHD with stenosis on coronary angiography treated with percutaneous coronary intervention, coronary artery bypass grafting, or medication alone. Reporting is mandatory for all hospitals in Denmark delivering CR. The database was initially implemented in 2013 and was fully running from August 14, 2015, thus comprising data at a patient level from the latter date onward. Patient-level data are registered by clinicians at the time of entry to CR directly into an online system with simultaneous linkage to other central patient registers. Follow-up data are entered after 6 months. The main variables collected are related to key outcome and performance indicators of CR: referral and adherence, lifestyle, patient-related outcome measures, risk factor control, and medication. Program-level online data are collected every third year. Based on administrative data, approximately 14,000 patients with CHD are hospitalized at 35 hospitals annually, with 75% receiving one or more outpatient rehabilitation services by 2015. The database has not yet been running for a full year, which explains the use of approximations. The DHRD is an online, national quality improvement database on CR, aimed at patients with CHD. Mandatory registration of data at both patient level as well as program level is done on the database. DHRD aims to systematically monitor the quality of CR over time, in order to improve the quality of CR throughout Denmark to benefit patients.
ARCTOS: a relational database relating specimens, specimen-based science, and archival documentation

USGS Publications Warehouse

Jarrell, Gordon H.; Ramotnik, Cindy A.; McDonald, D.L.

2010-01-01

Data are preserved when they are perpetually discoverable, but even in the Information Age, discovery of legacy data appropriate to particular investigations is uncertain. Secure Internet storage is necessary but insufficient. Data can be discovered only when they are adequately described, and visibility increases markedly if the data are related to other data that are receiving usage. Such relationships can be built within (1) the framework of a relational database, or (1) they can be built among separate resources, within the framework of the Internet. Evolving primarily around biological collections, Arctos is a database that does both of these tasks. It includes data structures for a diversity of specimen attributes, essentially all collection-management tasks, plus literature citations, project descriptions, etc. As a centralized collaboration of several university museums, Arctos is an ideal environment for capitalizing on the many relationships that often exist between items in separate collections. Arctos is related to NIH’s DNA-sequence repository (GenBank) with record-to-record reciprocal linkages, and it serves data to several discipline-specific web portals, including the Global Biodiversity Information Network (GBIF). The University of Alaska Museum’s paleontological collection is Arctos’s recent extension beyond the constraints of neontology. With about 1.3 million cataloged items, additional collections are being added each year.
An Integrated Software Suite for Surface-based Analyses of Cerebral Cortex

PubMed Central

Van Essen, David C.; Drury, Heather A.; Dickson, James; Harwell, John; Hanlon, Donna; Anderson, Charles H.

2001-01-01

The authors describe and illustrate an integrated trio of software programs for carrying out surface-based analyses of cerebral cortex. The first component of this trio, SureFit (Surface Reconstruction by Filtering and Intensity Transformations), is used primarily for cortical segmentation, volume visualization, surface generation, and the mapping of functional neuroimaging data onto surfaces. The second component, Caret (Computerized Anatomical Reconstruction and Editing Tool Kit), provides a wide range of surface visualization and analysis options as well as capabilities for surface flattening, surface-based deformation, and other surface manipulations. The third component, SuMS (Surface Management System), is a database and associated user interface for surface-related data. It provides for efficient insertion, searching, and extraction of surface and volume data from the database. PMID:11522765
Perianesthetic and Anesthesia-Related Mortality in a Southeastern United States Population: A Longitudinal Review of a Prospectively Collected Quality Assurance Data Base.

PubMed

Pollard, Richard J; Hopkins, Thomas; Smith, C Tyler; May, Bryan V; Doyle, James; Chambers, C Labron; Clark, Reese; Buhrman, William

2018-05-21

Perianesthetic mortality (death occurring within 48 hours of an anesthetic) continues to vary widely depending on the study population examined. The authors study in a private practice physician group that covers multiple anesthetizing locations in the Southeastern United States. This group has in place a robust quality assurance (QA) database to follow all patients undergoing anesthesia. With this study, we estimate the incidence of anesthesia-related and perianesthetic mortality in this QA database. Following institutional review board approval, data from 2011 to 2016 were obtained from the QA database of a large, community-based anesthesiology group practice. The physician practice covers 233 anesthetizing locations across 20 facilities in 2 US states. All detected cases of perianesthetic death were extracted from the database and compared to the patients' electronic medical record. These cases were further examined by a committee of 3 anesthesiologists to determine whether the death was anesthesia related (a perioperative death solely attributable to either the anesthesia provider or anesthetic technique), anesthetic contributory (a perioperative death in which anesthesia role could not be entirely excluded), or not due to anesthesia. A total of 785,467 anesthesia procedures were examined from the study period. A total of 592 cases of perianesthetic deaths were detected, giving an overall death rate of 75.37 in 100,000 cases (95% CI, 69.5-81.7). Mortality judged to be anesthesia related was found in 4 cases, giving a mortality rate of 0.509 in 100,000 (95% CI, 0.198-1.31). Mortality judged to be anesthesia contributory were found in 18 cases, giving a mortality of 2.29 in 100,000 patients (95% CI, 1.45-3.7). A total of 570 cases were judged to be nonanesthesia related, giving an incidence of 72.6 per 100,000 anesthetics (95% CI, 69.3-75.7). In a large, comprehensive database representing the full range of anesthesia practices and locations in the Southeastern United States, the rate of perianesthestic death was 0.509 in 100,000 (95% CI, 0.198-1.31). Future in-depth analysis of the epidemiology of perianesthetic deaths will be reported in later studies.
Clustering of 3D-Structure Similarity Based Network of Secondary Metabolites Reveals Their Relationships with Biological Activities.

PubMed

Ohtana, Yuki; Abdullah, Azian Azamimi; Altaf-Ul-Amin, Md; Huang, Ming; Ono, Naoaki; Sato, Tetsuo; Sugiura, Tadao; Horai, Hisayuki; Nakamura, Yukiko; Morita Hirai, Aki; Lange, Klaus W; Kibinge, Nelson K; Katsuragi, Tetsuo; Shirai, Tsuyoshi; Kanaya, Shigehiko

2014-12-01

Developing database systems connecting diverse species based on omics is the most important theme in big data biology. To attain this purpose, we have developed KNApSAcK Family Databases, which are utilized in a number of researches in metabolomics. In the present study, we have developed a network-based approach to analyze relationships between 3D structure and biological activity of metabolites consisting of four steps as follows: construction of a network of metabolites based on structural similarity (Step 1), classification of metabolites into structure groups (Step 2), assessment of statistically significant relations between structure groups and biological activities (Step 3), and 2-dimensional clustering of the constructed data matrix based on statistically significant relations between structure groups and biological activities (Step 4). Applying this method to a data set consisting of 2072 secondary metabolites and 140 biological activities reported in KNApSAcK Metabolite Activity DB, we obtained 983 statistically significant structure group-biological activity pairs. As a whole, we systematically analyzed the relationship between 3D-chemical structures of metabolites and biological activities. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Connecting proteins with drug-like compounds: Open source drug discovery workflows with BindingDB and KNIME

PubMed Central

Berthold, Michael R.; Hedrick, Michael P.; Gilson, Michael K.

2015-01-01

Today’s large, public databases of protein–small molecule interaction data are creating important new opportunities for data mining and integration. At the same time, new graphical user interface-based workflow tools offer facile alternatives to custom scripting for informatics and data analysis. Here, we illustrate how the large protein-ligand database BindingDB may be incorporated into KNIME workflows as a step toward the integration of pharmacological data with broader biomolecular analyses. Thus, we describe a collection of KNIME workflows that access BindingDB data via RESTful webservices and, for more intensive queries, via a local distillation of the full BindingDB dataset. We focus in particular on the KNIME implementation of knowledge-based tools to generate informed hypotheses regarding protein targets of bioactive compounds, based on notions of chemical similarity. A number of variants of this basic approach are tested for seven existing drugs with relatively ill-defined therapeutic targets, leading to replication of some previously confirmed results and discovery of new, high-quality hits. Implications for future development are discussed. Database URL: www.bindingdb.org PMID:26384374
Comparison of medical costs and healthcare resource utilization of post-menopausal women with HR+/HER2- metastatic breast cancer receiving everolimus-based therapy or chemotherapy: a retrospective claims database analysis.

PubMed

Li, Nanxin; Hao, Yanni; Koo, Valerie; Fang, Anna; Peeples, Miranda; Kageleiry, Andrew; Wu, Eric Q; Guérin, Annie

2016-01-01

To analyze medical costs and healthcare resource utilization (HRU) associated with everolimus-based therapy or chemotherapy among post-menopausal women with hormone-receptor-positive, human-epidermal-growth-factor-receptor-2-negative (HR+/HER2-) metastatic breast cancer (mBC). Patients with HR+/HER2- mBC who discontinued a non-steroidal aromatase inhibitor and began a new line of treatment with everolimus-based therapy or chemotherapy (index therapy/index date) between July 20, 2012 and April 30, 2014 were identified from two large claims databases. All-cause, BC-related, and adverse event (AE)-related medical costs (in 2014 USD) and all-cause HRU per patient per month (PPPM) were analyzed for both treatment groups across patients' first four lines of therapies for mBC. Adjusted differences in costs and HRU between the everolimus and chemotherapy treatment group were estimated pooling all lines and using multivariable generalized linear models, accounting for difference in patient characteristics. A total of 3298 patients were included: 902 everolimus-treated patients and 2636 chemotherapy-treated patients. Compared to chemotherapy, everolimus was associated with significantly lower all-cause (adjusted mean difference = $3455, p < 0.01) and BC-related ($2510, p < 0.01) total medical costs, with inpatient ($1344, p < 0.01) and outpatient costs ($1048, p < 0.01) as the main drivers for cost differences. Everolimus was also associated with significantly lower AE-related medical costs ($1730, p < 0.01), as well as significantly lower HRU (emergency room incidence rate ratio [IRR] = 0.83; inpatient IRR = 0.74; inpatient days IRR = 0.65; outpatient IRR = 0.71; BC-related outpatient IRR = 0.57; all p < 0.01). This retrospective claims database analysis of commercially-insured patients with HR+/HER2- mBC in the US showed that everolimus was associated with substantial all-cause, BC-related, and AE-related medical cost savings and less utilization of healthcare resources relative to chemotherapy.
Study of an External Neutron Source for an Accelerator-Driven System using the PHITS Code

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sugawara, Takanori; Iwasaki, Tomohiko; Chiba, Takashi

A code system for the Accelerator Driven System (ADS) has been under development for analyzing dynamic behaviors of a subcritical core coupled with an accelerator. This code system named DSE (Dynamics calculation code system for a Subcritical system with an External neutron source) consists of an accelerator part and a reactor part. The accelerator part employs a database, which is calculated by using PHITS, for investigating the effect related to the accelerator such as the changes of beam energy, beam diameter, void generation, and target level. This analysis method using the database may introduce some errors into dynamics calculations sincemore » the neutron source data derived from the database has some errors in fitting or interpolating procedures. In this study, the effects of various events are investigated to confirm that the method based on the database is appropriate.« less
ClassLess: A Comprehensive Database of Young Stellar Objects

NASA Astrophysics Data System (ADS)

Hillenbrand, Lynne; Baliber, Nairn

2015-01-01

We have designed and constructed a database housing published measurements of Young Stellar Objects (YSOs) within ~1 kpc of the Sun. ClassLess, so called because it includes YSOs in all stages of evolution, is a relational database in which user interaction is conducted via HTML web browsers, queries are performed in scientific language, and all data are linked to the sources of publication. Each star is associated with a cluster (or clusters), and both spatially resolved and unresolved measurements are stored, allowing proper use of data from multiple star systems. With this fully searchable tool, myriad ground- and space-based instruments and surveys across wavelength regimes can be exploited. In addition to primary measurements, the database self consistently calculates and serves higher level data products such as extinction, luminosity, and mass. As a result, searches for young stars with specific physical characteristics can be completed with just a few mouse clicks.
Evaluation of techniques for increasing recall in a dictionary approach to gene and protein name identification.

PubMed

Schuemie, Martijn J; Mons, Barend; Weeber, Marc; Kors, Jan A

2007-06-01

Gene and protein name identification in text requires a dictionary approach to relate synonyms to the same gene or protein, and to link names to external databases. However, existing dictionaries are incomplete. We investigate two complementary methods for automatic generation of a comprehensive dictionary: combination of information from existing gene and protein databases and rule-based generation of spelling variations. Both methods have been reported in literature before, but have hitherto not been combined and evaluated systematically. We combined gene and protein names from several existing databases of four different organisms. The combined dictionaries showed a substantial increase in recall on three different test sets, as compared to any single database. Application of 23 spelling variation rules to the combined dictionaries further increased recall. However, many rules appeared to have no effect and some appear to have a detrimental effect on precision.
Karst database development in Minnesota: Design and data assembly

USGS Publications Warehouse

Gao, Y.; Alexander, E.C.; Tipping, R.G.

2005-01-01

The Karst Feature Database (KFD) of Minnesota is a relational GIS-based Database Management System (DBMS). Previous karst feature datasets used inconsistent attributes to describe karst features in different areas of Minnesota. Existing metadata were modified and standardized to represent a comprehensive metadata for all the karst features in Minnesota. Microsoft Access 2000 and ArcView 3.2 were used to develop this working database. Existing county and sub-county karst feature datasets have been assembled into the KFD, which is capable of visualizing and analyzing the entire data set. By November 17 2002, 11,682 karst features were stored in the KFD of Minnesota. Data tables are stored in a Microsoft Access 2000 DBMS and linked to corresponding ArcView applications. The current KFD of Minnesota has been moved from a Windows NT server to a Windows 2000 Citrix server accessible to researchers and planners through networked interfaces. ?? Springer-Verlag 2005.
GenBank

PubMed Central

Benson, Dennis A.; Karsch-Mizrachi, Ilene; Lipman, David J.; Ostell, James; Wheeler, David L.

2007-01-01

GenBank (R) is a comprehensive database that contains publicly available nucleotide sequences for more than 240 000 named organisms, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects. Most submissions are made using the web-based BankIt or standalone Sequin programs and accession numbers are assigned by GenBank staff upon receipt. Daily data exchange with the EMBL Data Library in Europe and the DNA Data Bank of Japan ensures worldwide coverage. GenBank is accessible through NCBI's retrieval system, Entrez, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, begin at the NCBI Homepage (). PMID:17202161
Analysis of the Microstructure of Titles in the INSPEC Data-Base

ERIC Educational Resources Information Center

And Others; Lynch, Michael F.

1973-01-01

A high degree of constancy has been found in the microstructure of titles of samples of the INSPEC data base taken over a three-year period. Character and digram frequencies are relatively stable, while variable-length character-strings characterizing samples separated by three years in time show close similarities. (2 references) (Author/SJ)

Some links on this page may take you to non-federal websites. Their policies may differ from this site.