DOE Office of Scientific and Technical Information (OSTI.GOV)
Humphrey, Walter R.
CMS is a Windows application for tracking chemical inventories. Partners will use this application to record chemicals that are stored on their site and to perform periodic inventories of those chemicals. The application records information about stored chemicals from user input via the keyboard and barcode readers and stores that information in a single-file database (SQLite). A simple user login mechanism controls access to functions in the application. A user interface allows users to search and update data in the database.
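As a rough illustration of the single-file storage idea described in this record, the following sketch uses Python's standard `sqlite3` module. The table layout, column names, and function names are hypothetical and are not taken from CMS itself; the login mechanism is not modeled.

```python
import sqlite3


def open_inventory(path=":memory:"):
    """Open (or create) the single-file database and ensure the table exists.

    The schema is an assumption made for this sketch, not the CMS schema.
    """
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS chemicals ("
        " barcode TEXT PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " location TEXT NOT NULL,"
        " quantity REAL NOT NULL)"
    )
    return conn


def record_chemical(conn, barcode, name, location, quantity):
    """Store a scanned container; a repeated barcode overwrites the old row."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO chemicals"
            " (barcode, name, location, quantity) VALUES (?, ?, ?, ?)",
            (barcode, name, location, quantity),
        )


def search(conn, term):
    """Return all rows whose name contains the search term."""
    cursor = conn.execute(
        "SELECT barcode, name, location, quantity FROM chemicals"
        " WHERE name LIKE ? ORDER BY name",
        (f"%{term}%",),
    )
    return cursor.fetchall()
```

Passing a file path instead of `:memory:` keeps the whole inventory in one file, which is the property the abstract highlights.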
MushBase: A Mushroom Information Database Application
Le, Vang Quy; Lee, Hyun-Sook
2007-01-01
MushBase, a database application built on Microsoft Access, stores and manages biological information about mushroom species and strains together with their physiological characteristics, such as geometries and growth condition(s). It is also designed to store a second group of information: experimental data on mushroom classification by Random Amplification of Polymorphic DNA (RAPD). Both groups of information are stored and managed so that each can be retrieved conveniently and cross-referenced with the other. PMID:24015087
Mashup of Geo and Space Science Data Provided via Relational Databases in the Semantic Web
NASA Astrophysics Data System (ADS)
Ritschel, B.; Seelus, C.; Neher, G.; Iyemori, T.; Koyama, Y.; Yatagai, A. I.; Murayama, Y.; King, T. A.; Hughes, J. S.; Fung, S. F.; Galkin, I. A.; Hapgood, M. A.; Belehaki, A.
2014-12-01
The use of RDBMS for the storage and management of geo and space science data and/or metadata is very common. Although the information stored in tables is based on a data model and is therefore well organized and structured, a direct mashup with RDF-based data stored in triple stores is not possible. One solution is to transform the whole content into RDF structures and store it in triple stores. Another is to use a dedicated system or service, such as D2RQ, to access relational database content as virtual, read-only RDF graphs. The Semantic Web based proof-of-concept GFZ ISDC uses the triple store Virtuoso to store general context information and metadata for geo and space science satellite and ground station data. Information about projects, platforms, instruments, persons, product types, etc. is available, but there is no detailed metadata about the data granules themselves. Important information, such as the start or end time or the detailed spatial coverage of a single measurement, is stored only in RDBMS tables of the ISDC catalog system. To provide seamless access to all available information about the granules/data products, a mashup of the different data resources (triple store and RDBMS) is necessary. This paper describes the use of D2RQ for a Semantic Web/SPARQL based mashup of the relational databases used for the ISDC data server, and also for access to IUGONET and/or ESPAS and further geo and space science data resources. Abbreviations: RDBMS, Relational Database Management System; RDF, Resource Description Framework; SPARQL, SPARQL Protocol And RDF Query Language; D2RQ, Accessing Relational Databases as Virtual RDF Graphs; GFZ ISDC, German Research Centre for Geosciences Information System and Data Center; IUGONET, Inter-university Upper Atmosphere Global Observation Network (Japanese project); ESPAS, Near earth space data infrastructure for e-science (European Union funded project).
MRNIDX - Marine Data Index: Database Description, Operation, Retrieval, and Display
Paskevich, Valerie F.
1982-01-01
A database referencing the location and content of data stored on magnetic media was designed to assist in the indexing of time-series and spatially dependent marine geophysical data collected or processed by the U.S. Geological Survey. The database was designed and created for input to the Geologic Retrieval and Synopsis Program (GRASP) to allow selective retrievals of information pertaining to location of data, data format, cruise, geographical bounds and collection dates of data. This information is then used to locate the stored data for administrative purposes or further processing. Database utilization is divided into three distinct operations. The first is the inventorying of the data and the updating of the database, the second is the retrieval of information from the database, and the third is the graphic display of the geographical boundaries to which the retrieved information pertains.
Cross-Service Investigation of Geographical Information Systems
2004-03-01
Figure 8 illustrates the combined layers. Information for the layers is stored in a database format. The two types of storage are vector and...raster models. In a vector model, the image and information are stored as geometric objects such as points, lines, or polygons. In a raster model...DNCs are a vector-based digital database with selected maritime significant physical features from hydrographic charts. Layers within the DNC are data
Method for the reduction of image content redundancy in large image databases
Tobin, Kenneth William; Karnowski, Thomas P.
2010-03-02
A method of increasing information content for content-based image retrieval (CBIR) systems includes the steps of providing a CBIR database, the database having an index for a plurality of stored digital images using a plurality of feature vectors, the feature vectors corresponding to distinct descriptive characteristics of the images. A visual similarity parameter value is calculated based on a degree of visual similarity between features vectors of an incoming image being considered for entry into the database and feature vectors associated with a most similar of the stored images. Based on said visual similarity parameter value it is determined whether to store or how long to store the feature vectors associated with the incoming image in the database.
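The store-or-skip decision described in this record can be sketched as follows. The abstract does not say how the visual similarity parameter is computed, so the use of cosine similarity, the threshold value, and all function names here are assumptions chosen only for illustration; the "how long to store" part of the method is not modeled.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def max_similarity(incoming, stored):
    """Similarity between the incoming vector and its most similar stored one."""
    return max((cosine_similarity(incoming, v) for v in stored), default=0.0)


def should_store(incoming, stored, threshold=0.98):
    """Store the incoming image only if nothing stored is nearly identical.

    The threshold is a hypothetical tuning value, not one from the source.
    """
    return max_similarity(incoming, stored) < threshold
```

An image whose features nearly duplicate an existing entry adds little information to the index, which is the redundancy the method aims to reduce.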
Meta-All: a system for managing metabolic pathway information.
Weise, Stephan; Grosse, Ivo; Klukas, Christian; Koschützki, Dirk; Scholz, Uwe; Schreiber, Falk; Junker, Björn H
2006-10-23
Many attempts are being made to understand biological subjects at a systems level. A major resource for these approaches are biological databases, storing manifold information about DNA, RNA and protein sequences including their functional and structural motifs, molecular markers, mRNA expression levels, metabolite concentrations, protein-protein interactions, phenotypic traits or taxonomic relationships. The use of these databases is often hampered by the fact that they are designed for special application areas and thus lack universality. Databases on metabolic pathways, which provide an increasingly important foundation for many analyses of biochemical processes at a systems level, are no exception from the rule. Data stored in central databases such as KEGG, BRENDA or SABIO-RK is often limited to read-only access. If experimentalists want to store their own data, possibly still under investigation, there are two possibilities. They can either develop their own information system for managing that own data, which is very time-consuming and costly, or they can try to store their data in existing systems, which is often restricted. Hence, an out-of-the-box information system for managing metabolic pathway data is needed. We have designed META-ALL, an information system that allows the management of metabolic pathways, including reaction kinetics, detailed locations, environmental factors and taxonomic information. Data can be stored together with quality tags and in different parallel versions. META-ALL uses Oracle DBMS and Oracle Application Express. We provide the META-ALL information system for download and use. In this paper, we describe the database structure and give information about the tools for submitting and accessing the data. As a first application of META-ALL, we show how the information contained in a detailed kinetic model can be stored and accessed. META-ALL is a system for managing information about metabolic pathways. 
It facilitates the handling of pathway-related data and is designed to help biochemists and molecular biologists in their daily research. It is available on the Web at http://bic-gh.de/meta-all and can be downloaded free of charge and installed locally.
Meta-All: a system for managing metabolic pathway information
Weise, Stephan; Grosse, Ivo; Klukas, Christian; Koschützki, Dirk; Scholz, Uwe; Schreiber, Falk; Junker, Björn H
2006-01-01
Background Many attempts are being made to understand biological subjects at a systems level. A major resource for these approaches are biological databases, storing manifold information about DNA, RNA and protein sequences including their functional and structural motifs, molecular markers, mRNA expression levels, metabolite concentrations, protein-protein interactions, phenotypic traits or taxonomic relationships. The use of these databases is often hampered by the fact that they are designed for special application areas and thus lack universality. Databases on metabolic pathways, which provide an increasingly important foundation for many analyses of biochemical processes at a systems level, are no exception from the rule. Data stored in central databases such as KEGG, BRENDA or SABIO-RK is often limited to read-only access. If experimentalists want to store their own data, possibly still under investigation, there are two possibilities. They can either develop their own information system for managing that own data, which is very time-consuming and costly, or they can try to store their data in existing systems, which is often restricted. Hence, an out-of-the-box information system for managing metabolic pathway data is needed. Results We have designed META-ALL, an information system that allows the management of metabolic pathways, including reaction kinetics, detailed locations, environmental factors and taxonomic information. Data can be stored together with quality tags and in different parallel versions. META-ALL uses Oracle DBMS and Oracle Application Express. We provide the META-ALL information system for download and use. In this paper, we describe the database structure and give information about the tools for submitting and accessing the data. As a first application of META-ALL, we show how the information contained in a detailed kinetic model can be stored and accessed. 
Conclusion META-ALL is a system for managing information about metabolic pathways. It facilitates the handling of pathway-related data and is designed to help biochemists and molecular biologists in their daily research. It is available on the Web at http://bic-gh.de/meta-all and can be downloaded free of charge and installed locally. PMID:17059592
Prototype of web-based database of surface wave investigation results for site classification
NASA Astrophysics Data System (ADS)
Hayashi, K.; Cakir, R.; Martin, A. J.; Craig, M. S.; Lorenzo, J. M.
2016-12-01
As active and passive surface wave methods become more popular for evaluating the site response of earthquake ground motion, demand for a database of investigation results is also increasing. Seismic ground motion depends not only on 1D velocity structure but also on 2D and 3D structures, so spatial information on S-wave velocity must be considered in ground motion prediction. The database can support the construction of 2D and 3D underground models. Inversion in surface wave processing is essentially non-unique, so other information must be combined into the processing. A database of existing geophysical, geological and geotechnical investigation results can provide indispensable information to improve the accuracy and reliability of investigations. Most investigations, however, are carried out by individual organizations, and their results are rarely stored in a unified and organized database. To study and discuss an appropriate database and digital standard format for surface wave investigations, we developed a prototype web-based database to store observed data and processing results of surface wave investigations that we have performed at more than 400 sites in the U.S. and Japan. The database was constructed on a web server using MySQL and PHP so that users can access it through the internet from anywhere with any device. All data are registered in the database with their location, and users can search geophysical data through Google Maps. The database stores dispersion curves, horizontal-to-vertical spectral ratios and S-wave velocity profiles for each site, saved in XML files as digital data so that users can review and reuse them. The database also stores a published 3D deep basin and crustal structure, which users can refer to during the processing of surface wave data.
Evaluation of consumer drug information databases.
Choi, J A; Sullivan, J; Pankaskie, M; Brufsky, J
1999-01-01
To evaluate prescription drug information contained in six consumer drug information databases available on CD-ROM, and to make health care professionals aware of the information provided, so that they may appropriately recommend these databases for use by their patients. Observational study of six consumer drug information databases: The Corner Drug Store, Home Medical Advisor, Mayo Clinic Family Pharmacist, Medical Drug Reference, Mosby's Medical Encyclopedia, and PharmAssist. Not applicable. Not applicable. Information on 20 frequently prescribed drugs was evaluated in each database. The databases were ranked using a point-scale system based on primary and secondary assessment criteria. For the primary assessment, 20 categories of information based on those included in the 1998 edition of the USP DI Volume II, Advice for the Patient: Drug Information in Lay Language were evaluated for each of the 20 drugs, and each database could earn up to 400 points (for example, 1 point was awarded if the database mentioned a drug's mechanism of action). For the secondary assessment, the inclusion of 8 additional features that could enhance the utility of the databases was evaluated (for example, 1 point was awarded if the database contained a picture of the drug), and each database could earn up to 8 points. The results of the primary and secondary assessments, listed in order of highest to lowest number of points earned, are as follows: Primary assessment--Mayo Clinic Family Pharmacist (379), Medical Drug Reference (251), PharmAssist (176), Home Medical Advisor (113.5), The Corner Drug Store (98), and Mosby's Medical Encyclopedia (18.5); secondary assessment--The Mayo Clinic Family Pharmacist (8), The Corner Drug Store (5), Mosby's Medical Encyclopedia (5), Home Medical Advisor (4), Medical Drug Reference (4), and PharmAssist (3). 
The Mayo Clinic Family Pharmacist was the most accurate and complete source of prescription drug information based on the USP DI Volume II and would be an appropriate database for health care professionals to recommend to patients.
Giffen, Sarah E.
2002-01-01
An environmental database was developed to store water-quality data collected during the 1999 U.S. Geological Survey investigation of the occurrence and distribution of dioxins, furans, and PCBs in the riverbed sediment and fish tissue in the Penobscot River in Maine. The database can be used to store a wide range of detailed information and to perform complex queries on the data it contains. The database also could be used to store data from other historical and any future environmental studies conducted on the Penobscot River and surrounding regions.
NASA Technical Reports Server (NTRS)
Baldwin, John; Zendejas, Silvino; Gutheinz, Sandy; Borden, Chester; Wang, Yeou-Fang
2009-01-01
Mission and Assets Database (MADB) Version 1.0 is an SQL database system with a Web user interface to centralize information. The database stores flight project support resource requirements, view periods, antenna information, schedule, and forecast results for use in mid-range and long-term planning of Deep Space Network (DSN) assets.
Structure for Storing Properties of Particles (PoP)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Patel, N. R.; Mattoon, C. M.; Beck, B. R.
2014-06-01
Evaluated nuclear databases are critical for applications such as nuclear energy, nuclear medicine, homeland security, and stockpile stewardship. Particle masses, nuclear excitation levels, and other “Properties of Particles” are essential for making evaluated nuclear databases. Currently, these properties are obtained from various databases that are stored in outdated formats. A “Properties of Particles” (PoP) structure is being designed that will allow storing all information for one or more particles in a single place, so that each evaluation, simulation, model calculation, etc. can link to the same data. Information provided in PoP will include properties of nuclei, gammas and electrons (along with other particles such as pions, as evaluations extend to higher energies). Presently, PoP includes masses from the Atomic Mass Evaluation version 2003 (AME2003), and level schemes and gamma decays from the Reference Input Parameter Library (RIPL-3). The data are stored in a hierarchical structure. An example of how PoP stores nuclear masses and energy levels will be presented here.
Structure for Storing Properties of Particles (PoP)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Patel, N.R., E-mail: infinidhi@llnl.gov; Mattoon, C.M.; Beck, B.R.
2014-06-15
Evaluated nuclear databases are critical for applications such as nuclear energy, nuclear medicine, homeland security, and stockpile stewardship. Particle masses, nuclear excitation levels, and other “Properties of Particles” are essential for making evaluated nuclear databases. Currently, these properties are obtained from various databases that are stored in outdated formats. A “Properties of Particles” (PoP) structure is being designed that will allow storing all information for one or more particles in a single place, so that each evaluation, simulation, model calculation, etc. can link to the same data. Information provided in PoP will include properties of nuclei, gammas and electrons (along with other particles such as pions, as evaluations extend to higher energies). Presently, PoP includes masses from the Atomic Mass Evaluation version 2003 (AME2003), and level schemes and gamma decays from the Reference Input Parameter Library (RIPL-3). The data are stored in a hierarchical structure. An example of how PoP stores nuclear masses and energy levels will be presented here.
Kersten, Ellen; Laraia, Barbara; Kelly, Maggi; Adler, Nancy; Yen, Irene H
2012-01-01
Small food stores are prevalent in urban neighborhoods, but the availability of nutritious food at such stores is not well known. The objective of this study was to determine whether data from 3 sources would yield a single, homogenous, healthful food store category that can be used to accurately characterize community nutrition environments for public health research. We conducted in-store surveys in 2009 on store type and the availability of nutritious food in a sample of nonchain food stores (n = 102) in 6 predominantly urban counties in Northern California (Alameda, Contra Costa, Marin, Sacramento, San Francisco, and Santa Clara). We compared survey results with commercial database information and neighborhood sociodemographic data by using independent sample t tests and classification and regression trees. Sampled small food stores yielded a heterogeneous group of stores in terms of store type and nutritious food options. Most stores were identified as convenience (54%) or specialty stores (22%); others were small grocery stores (19%) and large grocery stores (5%). Convenience and specialty stores were smaller and carried fewer nutritious and fresh food items. The availability of nutritious food and produce was better in stores in neighborhoods that had a higher percentage of white residents and a lower population density but did not differ significantly by neighborhood income. Commercial databases alone may not adequately categorize small food stores and the availability of nutritious foods. Alternative measures are needed to more accurately inform research and policies that seek to address disparities in diet-related health conditions.
Laraia, Barbara; Kelly, Maggi; Adler, Nancy; Yen, Irene H.
2012-01-01
Introduction Small food stores are prevalent in urban neighborhoods, but the availability of nutritious food at such stores is not well known. The objective of this study was to determine whether data from 3 sources would yield a single, homogenous, healthful food store category that can be used to accurately characterize community nutrition environments for public health research. Methods We conducted in-store surveys in 2009 on store type and the availability of nutritious food in a sample of nonchain food stores (n = 102) in 6 predominantly urban counties in Northern California (Alameda, Contra Costa, Marin, Sacramento, San Francisco, and Santa Clara). We compared survey results with commercial database information and neighborhood sociodemographic data by using independent sample t tests and classification and regression trees. Results Sampled small food stores yielded a heterogeneous group of stores in terms of store type and nutritious food options. Most stores were identified as convenience (54%) or specialty stores (22%); others were small grocery stores (19%) and large grocery stores (5%). Convenience and specialty stores were smaller and carried fewer nutritious and fresh food items. The availability of nutritious food and produce was better in stores in neighborhoods that had a higher percentage of white residents and a lower population density but did not differ significantly by neighborhood income. Conclusion Commercial databases alone may not adequately categorize small food stores and the availability of nutritious foods. Alternative measures are needed to more accurately inform research and policies that seek to address disparities in diet-related health conditions. PMID:22789445
Three Dimensional Guidance for the NPS Autonomous Underwater Vehicle
1991-09-01
is loaded into a least-squares-fit algorithm to determine surfaces of polyhedrons. These computed surfaces are then compared with the known...the obstacle information stored in the vehicle’s environmental database, there is great potential of encountering unplanned for obstacles during the... database that holds current posture information recorded by the navigator. This data store receives a new current posture on each cycle of the control
GlycoRDF: an ontology to standardize glycomics data in RDF
Ranzinger, Rene; Aoki-Kinoshita, Kiyoko F.; Campbell, Matthew P.; Kawano, Shin; Lütteke, Thomas; Okuda, Shujiro; Shinmachi, Daisuke; Shikanai, Toshihide; Sawaki, Hiromichi; Toukach, Philip; Matsubara, Masaaki; Yamada, Issaku; Narimatsu, Hisashi
2015-01-01
Motivation: Over the last decades several glycomics-based bioinformatics resources and databases have been created and released to the public. Unfortunately, there is no common standard in the representation of the stored information or a common machine-readable interface allowing bioinformatics groups to easily extract and cross-reference the stored information. Results: An international group of bioinformatics experts in the field of glycomics have worked together to create a standard Resource Description Framework (RDF) representation for glycomics data, focused on glycan sequences and related biological source, publications and experimental data. This RDF standard is defined by the GlycoRDF ontology and will be used by database providers to generate common machine-readable exports of the data stored in their databases. Availability and implementation: The ontology, supporting documentation and source code used by database providers to generate standardized RDF are available online (http://www.glycoinfo.org/GlycoRDF/). Contact: rene@ccrc.uga.edu or kkiyoko@soka.ac.jp Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25388145
Encryption Characteristics of Two USB-based Personal Health Record Devices
Wright, Adam; Sittig, Dean F.
2007-01-01
Personal health records (PHRs) hold great promise for empowering patients and increasing the accuracy and completeness of health information. We reviewed two small USB-based PHR devices that allow a patient to easily store and transport their personal health information. Both devices offer password protection and encryption features. Analysis of the devices shows that they store their data in a Microsoft Access database. Due to a flaw in the encryption of this database, recovering the user’s password can be accomplished with minimal effort. Our analysis also showed that, rather than encrypting health information with the password chosen by the user, the devices stored the user’s password as a string in the database and then encrypted that database with a common password set by the manufacturer. This is another serious vulnerability. This article describes the weaknesses we discovered, outlines three critical flaws with the security model used by the devices, and recommends four guidelines for improving the security of similar devices. PMID:17460132
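The abstract does not list the authors' four guidelines, so the following is not their recommendation. It is only a sketch of one widely used alternative to the flaw described above (storing the user's password as a plain string): keep a salted, slow hash of the password and compare against it. The function names and iteration count are assumptions for illustration.

```python
import hashlib
import hmac
import os

# Hypothetical work factor; a real device would tune this to its hardware.
ITERATIONS = 200_000


def derive_digest(password, salt):
    """Derive a 32-byte digest from the password with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS, dklen=32
    )


def make_verifier(password):
    """Return (salt, digest) to store in place of the plaintext password."""
    salt = os.urandom(16)
    return salt, derive_digest(password, salt)


def check_password(password, salt, expected):
    """Compare in constant time so timing does not leak digest bytes."""
    return hmac.compare_digest(derive_digest(password, salt), expected)
```

With this scheme, recovering the stored record does not directly reveal the password. A separate key derived from the same password with a different salt could then encrypt the health data, rather than a password shared across all devices.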
mHealthApps: A Repository and Database of Mobile Health Apps.
Xu, Wenlong; Liu, Yin
2015-03-18
The market for mobile health (mHealth) apps has rapidly evolved in the past decade. With more than 100,000 mHealth apps currently available, there is no centralized resource that collects information on these health-related apps for researchers in this field to effectively evaluate their strengths and weaknesses. The objective of this study was to create a centralized mHealth app repository. We expect the analysis of information in this repository to provide insights for future mHealth research developments. We focused on apps from the two most established app stores, the Apple App Store and the Google Play Store. We extracted detailed information on each health-related app from these two app stores via our Python crawling program, and then stored the information in both a user-friendly array format and a standard JavaScript Object Notation (JSON) format. We have developed a centralized resource that provides detailed information on more than 60,000 health-related apps from the Apple App Store and the Google Play Store. Using this information resource, we analyzed thousands of apps systematically and provide an overview of the trends for mHealth apps. This unique database allows the meta-analysis of health-related apps and provides guidance for research designs of future apps in the mHealth field.
Informatics in radiology: use of CouchDB for document-based storage of DICOM objects.
Rascovsky, Simón J; Delgado, Jorge A; Sanz, Alexander; Calvo, Víctor D; Castrillón, Gabriel
2012-01-01
Picture archiving and communication systems traditionally have depended on schema-based Structured Query Language (SQL) databases for imaging data management. To optimize database size and performance, many such systems store a reduced set of Digital Imaging and Communications in Medicine (DICOM) metadata, discarding informational content that might be needed in the future. As an alternative to traditional database systems, document-based key-value stores recently have gained popularity. These systems store documents containing key-value pairs that facilitate data searches without predefined schemas. Document-based key-value stores are especially suited to archive DICOM objects because DICOM metadata are highly heterogeneous collections of tag-value pairs conveying specific information about imaging modalities, acquisition protocols, and vendor-supported postprocessing options. The authors used an open-source document-based database management system (Apache CouchDB) to create and test two such databases; CouchDB was selected for its overall ease of use, capability for managing attachments, and reliance on HTTP and Representational State Transfer standards for accessing and retrieving data. A large database was created first in which the DICOM metadata from 5880 anonymized magnetic resonance imaging studies (1,949,753 images) were loaded by using a Ruby script. To provide the usual DICOM query functionality, several predefined "views" (standard queries) were created by using JavaScript. For performance comparison, the same queries were executed in both the CouchDB database and a SQL-based DICOM archive. The capabilities of CouchDB for attachment management and database replication were separately assessed in tests of a similar, smaller database. 
Results showed that CouchDB allowed efficient storage and interrogation of all DICOM objects; with the use of information retrieval algorithms such as map-reduce, all the DICOM metadata stored in the large database were searchable with only a minimal increase in retrieval time over that with the traditional database management system. Results also indicated possible uses for document-based databases in data mining applications such as dose monitoring, quality assurance, and protocol optimization. RSNA, 2012
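The authors wrote their CouchDB views in JavaScript, and those views are not reproduced in the abstract. The following Python sketch only emulates the map-reduce idea over schema-free documents of tag-value pairs; the sample tags and the function names `map_by_modality` and `run_view` are hypothetical.

```python
from collections import defaultdict


def map_by_modality(doc):
    """Map step: emit (key, value) pairs for documents carrying the tag.

    Documents without a "Modality" tag are skipped, which is how a view
    copes with heterogeneous metadata without a predefined schema.
    """
    if "Modality" in doc:
        yield doc["Modality"], 1


def run_view(docs, map_fn, reduce_fn):
    """Group emitted values by key, then reduce each group to one value."""
    groups = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            groups[key].append(value)
    return {key: reduce_fn(values) for key, values in groups.items()}
```

Calling `run_view(docs, map_by_modality, sum)` counts documents per modality, the kind of predefined query the record describes as a "view".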
SQLGEN: a framework for rapid client-server database application development.
Nadkarni, P M; Cheung, K H
1995-12-01
SQLGEN is a framework for rapid client-server relational database application development. It relies on an active data dictionary on the client machine that stores metadata on one or more database servers to which the client may be connected. The dictionary generates dynamic Structured Query Language (SQL) to perform common database operations; it also stores information about the access rights of the user at log-in time, which is used to partially self-configure the behavior of the client to disable inappropriate user actions. SQLGEN uses a microcomputer database as the client to store metadata in relational form, to transiently capture server data in tables, and to allow rapid application prototyping followed by porting to client-server mode with modest effort. SQLGEN is currently used in several production biomedical databases.
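A data dictionary that both generates SQL and records access rights might look roughly like the sketch below. The dictionary layout, table and column names, and the `build_select` function are hypothetical and are not taken from SQLGEN.

```python
# Hypothetical client-side dictionary: metadata plus the user's rights,
# as they might be captured at log-in time.
DICTIONARY = {
    "patient": {
        "columns": ["patient_id", "last_name", "birth_date"],
        "rights": {"select", "update"},
    },
    "audit_log": {
        "columns": ["entry_id", "action"],
        "rights": set(),
    },
}


def build_select(table, filters, dictionary=DICTIONARY):
    """Generate a parameterized SELECT from metadata, honoring access rights."""
    meta = dictionary[table]
    if "select" not in meta["rights"]:
        raise PermissionError(f"select not permitted on {table}")
    unknown = [col for col in filters if col not in meta["columns"]]
    if unknown:
        raise ValueError(f"unknown columns: {unknown}")
    sql = f"SELECT {', '.join(meta['columns'])} FROM {table}"
    if filters:
        sql += " WHERE " + " AND ".join(f"{col} = ?" for col in filters)
    # Values travel separately as parameters, never spliced into the SQL text.
    return sql, list(filters.values())
```

Because table and column names come only from the dictionary, the client can reject a query before it reaches the server, in the spirit of the self-configuring behavior the abstract describes.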
Trustworthy History and Provenance for Files and Databases
ERIC Educational Resources Information Center
Hasan, Ragib
2009-01-01
In today's world, information is increasingly created, processed, transmitted, and stored digitally. While the digital nature of information has brought enormous benefits, it has also created new vulnerabilities and attacks against data. Unlike physical documents, digitally stored information can be rapidly copied, erased, or modified. The…
Ogawa, Yoshiko; Tanabe, Naohito; Honda, Akiko; Azuma, Tomoko; Seki, Nao; Suzuki, Tsubasa; Suzuki, Hiroshi
2011-07-01
Point-of-purchase (POP) information at food stores could help promote healthy dietary habits. However, it has been difficult to evaluate the effects of such intervention on customers' behavior. We objectively evaluated the usefulness of POP health information for vegetables in the modification of customers' purchasing behavior by using the database of a point-of-sales (POS) system. Two supermarket stores belonging to the same chain were assigned as the intervention store (store I) and control store (store C). POP health information for vegetables was presented in store I for 60 days. The percent increase in daily sales of vegetables over the sales on the same date of the previous year was compared between the stores by using the database of the POS system, adjusting for the change in monthly visitors from the previous year (adjusted ∆sales). The adjusted ∆sales significantly increased during the intervention period (Spearman's ρ = 0.258, P for trend = 0.006) at store I but did not increase at store C (ρ = -0.037, P for trend = 0.728). The growth of the mean adjusted ∆sales of total vegetables from 30 days before the intervention period through the latter half of the intervention period was estimated to be greater at store I than at store C by 18.7 percentage points (95% confidence interval 1.6-35.9). Health-related POP information for vegetables in supermarkets can encourage customers to purchase and, probably, consume vegetables.
Shao, Wei; Shan, Jigui; Kearney, Mary F; Wu, Xiaolin; Maldarelli, Frank; Mellors, John W; Luke, Brian; Coffin, John M; Hughes, Stephen H
2016-07-04
The NCI Retrovirus Integration Database is a MySQL-based relational database created for storing and retrieving comprehensive information about retroviral integration sites, primarily, but not exclusively, HIV-1. The database is accessible to the public for submission or extraction of data originating from experiments aimed at collecting information related to retroviral integration sites including: the site of integration into the host genome, the virus family and subtype, the origin of the sample, gene exons/introns associated with integration, and proviral orientation. Information about the references from which the data were collected is also stored in the database. Tools are built into the website that can be used to map the integration sites to the UCSC genome browser, to plot the integration site patterns on a chromosome, and to display provirus LTRs in their inserted genome sequence. The website is robust, user friendly, and allows users to query the database and analyze the data dynamically. https://rid.ncifcrf.gov or http://home.ncifcrf.gov/hivdrp/resources.htm
Temporal data mining for hospital management
NASA Astrophysics Data System (ADS)
Tsumoto, Shusaku; Hirano, Shoji
2009-04-01
About twenty years have passed since clinical information began to be stored electronically in hospital information systems in the 1980s. Stored data range from accounting information to laboratory data, and even patient records have now started to be accumulated: in other words, a hospital cannot function without the information system, where almost all the pieces of medical information are stored as multimedia databases. In this paper, we applied temporal data mining and exploratory data analysis techniques to hospital management data. The results show several interesting findings, which suggest that the reuse of stored data will provide a powerful tool for hospital management.
The Reach Address Database (RAD)
The Reach Address Database (RAD) stores reach address information for each Water Program feature that has been linked to the underlying surface water features (streams, lakes, etc) in the National Hydrology Database (NHD) Plus dataset.
Komatsu, Setsuko; Wang, Xin; Yin, Xiaojian; Nanjo, Yohei; Ohyanagi, Hajime; Sakata, Katsumi
2017-06-23
The Soybean Proteome Database (SPD) stores data on soybean proteins obtained with gel-based and gel-free proteomic techniques. The database was constructed to provide information on proteins for functional analyses. The majority of the data is focused on soybean (Glycine max 'Enrei'). The growth and yield of soybean are strongly affected by environmental stresses such as flooding. The database was originally constructed using data on soybean proteins separated by two-dimensional polyacrylamide gel electrophoresis, which is a gel-based proteomic technique. Since 2015, the database has been expanded to incorporate data obtained by label-free mass spectrometry-based quantitative proteomics, which is a gel-free proteomic technique. Here, the portions of the database consisting of gel-free proteomic data are described. The gel-free proteomic database contains 39,212 proteins identified in 63 sample sets, such as temporal and organ-specific samples of soybean plants grown under flooding stress or non-stressed conditions. In addition, data on organellar proteins identified in mitochondria, nuclei, and endoplasmic reticulum are stored. Furthermore, the database integrates multiple omics data such as genomics, transcriptomics, metabolomics, and proteomics. The SPD database is accessible at http://proteome.dc.affrc.go.jp/Soybean/. The Soybean Proteome Database stores data obtained from both gel-based and gel-free proteomic techniques. The gel-free proteomic database comprises 39,212 proteins identified in 63 sample sets, such as different organs of soybean plants grown under flooding stress or non-stressed conditions in a time-dependent manner. In addition, organellar proteins identified in mitochondria, nuclei, and endoplasmic reticulum are stored in the gel-free proteomics database. A total of 44,704 proteins, including 5490 proteins identified using a gel-based proteomic technique, are stored in the SPD. 
It accounts for approximately 80% of all predicted proteins from genome sequences, though some proteins overlap. Based on the demonstrated application of data stored in the database for functional analyses, it is suggested that these data will be useful for analyses of biological mechanisms in soybean. Furthermore, coupled with recent advances in information and communication technology, the usefulness of this database would increase in the analyses of biological mechanisms. Copyright © 2017 Elsevier B.V. All rights reserved.
Legal Issues for an Integrated Information Center.
ERIC Educational Resources Information Center
Rees, Warren; And Others
1991-01-01
The ability to collect, store, retrieve, and combine information in computerized databases has magnified the potential for misuse of information. Laws have begun to deal with these new threats by expanding rights of privacy, copyright, misrepresentation, products liability, and defamation. Laws regarding computerized databases are certain to…
GlycoRDF: an ontology to standardize glycomics data in RDF.
Ranzinger, Rene; Aoki-Kinoshita, Kiyoko F; Campbell, Matthew P; Kawano, Shin; Lütteke, Thomas; Okuda, Shujiro; Shinmachi, Daisuke; Shikanai, Toshihide; Sawaki, Hiromichi; Toukach, Philip; Matsubara, Masaaki; Yamada, Issaku; Narimatsu, Hisashi
2015-03-15
Over the last decades several glycomics-based bioinformatics resources and databases have been created and released to the public. Unfortunately, there is no common standard in the representation of the stored information or a common machine-readable interface allowing bioinformatics groups to easily extract and cross-reference the stored information. An international group of bioinformatics experts in the field of glycomics have worked together to create a standard Resource Description Framework (RDF) representation for glycomics data, focused on glycan sequences and related biological source, publications and experimental data. This RDF standard is defined by the GlycoRDF ontology and will be used by database providers to generate common machine-readable exports of the data stored in their databases. The ontology, supporting documentation and source code used by database providers to generate standardized RDF are available online (http://www.glycoinfo.org/GlycoRDF/). © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Deployment of Directory Service for IEEE N Bus Test System Information
NASA Astrophysics Data System (ADS)
Barman, Amal; Sil, Jaya
2008-10-01
Exchanging information over the Internet and intranets has become a de facto standard in computer applications among various users and organizations. Distributed system studies, e-governance, and similar applications require transparent information exchange between applications, constituencies, manufacturers, and vendors. To serve these purposes, a database system is needed for storing system data and other relevant information. A directory service, which is a specialized database with an access protocol, could be the single solution since it runs over TCP/IP, is supported by all POSIX-compliant platforms, and is based on an open standard. This paper describes a way to deploy a directory service to store IEEE n bus test system data and to integrate a load flow program with it.
Storage and utilization of HLA genomic data--new approaches to HLA typing.
Helmberg, W
2000-01-01
Currently available DNA-based HLA typing assays can provide detailed information about sequence motifs of a tested sample. It is still a common practice, however, for information acquired by high-resolution sequence specific oligonucleotide probe (SSOP) typing or sequence specific priming (SSP) to be presented in a low-resolution serological format. Unfortunately, this representation can lead to significant loss of useful data in many cases. An alternative to assigning allele equivalents to such DNA typing results is simply to store the observed typing pattern and utilize the information with the help of Virtual DNA Analysis (VDA). Interpretation of the stored typing patterns can then be updated based on newly defined alleles, assuming the sequence motifs detected by the typing reagents are known. Rather than updating reagent specificities in individual laboratories, such updates should be performed in a central, publicly available sequence database. By referring to this database, HLA genomic data can then be stored and transferred between laboratories without loss of information. The 13th International Histocompatibility Workshop offers an ideal opportunity to begin building this common database for the entire human MHC.
Prieto, Claudia I; Palau, María J; Martina, Pablo; Achiary, Carlos; Achiary, Andrés; Bettiol, Marisa; Montanaro, Patricia; Cazzola, María L; Leguizamón, Mariana; Massillo, Cintia; Figoli, Cecilia; Valeiras, Brenda; Perez, Silvia; Rentería, Fernando; Diez, Graciela; Yantorno, Osvaldo M; Bosch, Alejandra
2016-01-01
The epidemiological and clinical management of cystic fibrosis (CF) patients suffering from acute pulmonary exacerbations or chronic lung infections demands continuous updating of medical and microbiological processes associated with the constant evolution of pathogens during host colonization. In order to monitor the dynamics of these processes, it is essential to have expert systems capable of storing and subsequently extracting the information generated from different studies of the patients and microorganisms isolated from them. In this work we have designed and developed an on-line database based on an information system that allows users to store, manage and visualize data from clinical studies and microbiological analysis of bacteria obtained from the respiratory tract of patients suffering from cystic fibrosis. The information system, named Cystic Fibrosis Cloud database, is available at http://servoy.infocomsa.com/cfc_database and is composed of a main database and a web-based interface, which uses Servoy's product architecture based on Java technology. Although the CFC database system can be implemented as a local program for private use in CF centers, it can also be used, updated and shared by different users who can access the stored information in a systematic, practical and safe manner. The implementation of the CFC database could have a significant impact on the monitoring of respiratory infections, the prevention of exacerbations, the detection of emerging organisms, and the adequacy of control strategies for lung infections in CF patients. Copyright © 2015 Asociación Argentina de Microbiología. Publicado por Elsevier España, S.L.U. All rights reserved.
2014-06-01
central location. Each of the SQLite databases are converted and stored in one MySQL database and the pcap files are parsed to extract call information...from the specific communications applications used during the experiment. This extracted data is then stored in the same MySQL database. With all...rhythm of the event. Figure 3 demonstrates the application usage over the course of the experiment for the EXDIR. As seen, the EXDIR spent the majority
MaizeGDB: The Maize Genetics and Genomics Database.
USDA-ARS's Scientific Manuscript database
MaizeGDB is the community database for biological information about the crop plant Zea mays. Genomic, genetic, sequence, gene product, functional characterization, literature reference, and person/organization contact information are among the datatypes stored at MaizeGDB. At the project’s website...
Bleda, Marta; Tarraga, Joaquin; de Maria, Alejandro; Salavert, Francisco; Garcia-Alonso, Luz; Celma, Matilde; Martin, Ainoha; Dopazo, Joaquin; Medina, Ignacio
2012-07-01
During the past years, the advances in high-throughput technologies have produced an unprecedented growth in the number and size of repositories and databases storing relevant biological data. Today, there is more biological information than ever but, unfortunately, the current status of many of these repositories is far from optimal. Some of the most common problems are that the information is spread out over many small databases, that repositories frequently follow different standards, and that some databases are no longer supported or contain overly specific and unconnected information. In addition, data size is increasingly becoming an obstacle when accessing or storing biological data. All these issues make it very difficult to extract and integrate information from different sources, to analyze experiments, or to access and query this information in a programmatic way. CellBase provides a solution to the growing necessity of integration by easing the access to biological data. CellBase implements a set of RESTful web services that query a centralized database containing the most relevant biological data sources. The database is hosted on our servers and is regularly updated. CellBase documentation can be found at http://docs.bioinfo.cipf.es/projects/cellbase.
Burgarella, Sarah; Cattaneo, Dario; Masseroli, Marco
2006-01-01
We developed MicroGen, a multi-database Web based system for managing all the information characterizing spotted microarray experiments. It supports information gathering and storing according to the Minimum Information About Microarray Experiments (MIAME) standard. It also allows easy sharing of information and data among all multidisciplinary actors involved in spotted microarray experiments. PMID:17238488
Viral Genome DataBase: storing and analyzing genes and proteins from complete viral genomes.
Hiscock, D; Upton, C
2000-05-01
The Viral Genome DataBase (VGDB) contains detailed information on the genes and predicted protein sequences from 15 completely sequenced genomes of large (>100 kb) viruses (2847 genes). The stored data include DNA sequence, protein sequence, GenBank and user-entered notes, molecular weight (MW), isoelectric point (pI), amino acid content, A + T%, nucleotide frequency, dinucleotide frequency and codon use. The VGDB is a MySQL database with a user-friendly Java GUI. Results of queries can be easily sorted by any of the individual parameters. The software and additional figures and information are available at http://athena.bioc.uvic.ca/genomes/index.html .
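Several of the per-gene parameters listed above (A + T%, nucleotide frequency, dinucleotide frequency) are simple functions of the DNA sequence. The following is a minimal Python sketch of how such values could be computed; it is not the VGDB code, and the function names and the example sequence are hypothetical.

```python
from collections import Counter


def at_percent(seq: str) -> float:
    """Return the A+T content of a DNA sequence as a percentage."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("A") + seq.count("T")) / len(seq)


def kmer_frequencies(seq: str, k: int) -> dict:
    """Return relative frequencies of overlapping k-mers.

    k=1 gives nucleotide frequencies, k=2 gives dinucleotide frequencies.
    """
    seq = seq.upper()
    total = len(seq) - k + 1
    if total <= 0:
        return {}
    counts = Counter(seq[i:i + k] for i in range(total))
    return {kmer: n / total for kmer, n in counts.items()}


example = "ATGCATTA"  # hypothetical sequence, for illustration only
example_at = at_percent(example)
example_dinucleotides = kmer_frequencies(example, 2)
```

Values like these could be stored as ordinary numeric columns, which is what would make sorting query results by any single parameter straightforward.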
Content Is King: Databases Preserve the Collective Information of Science.
Yates, John R
2018-04-01
Databases store sequence information experimentally gathered to create resources that further science. In the last 20 years databases have become critical components of fields like proteomics where they provide the basis for large-scale and high-throughput proteomic informatics. Amos Bairoch, winner of the Association of Biomolecular Resource Facilities Frederick Sanger Award, has created some of the important databases proteomic research depends upon for accurate interpretation of data.
Text mining for metabolic pathways, signaling cascades, and protein networks.
Hoffmann, Robert; Krallinger, Martin; Andres, Eduardo; Tamames, Javier; Blaschke, Christian; Valencia, Alfonso
2005-05-10
The complexity of the information stored in databases and publications on metabolic and signaling pathways, the high throughput of experimental data, and the growing number of publications make it imperative to provide systems to help the researcher navigate through these interrelated information resources. Text-mining methods have started to play a key role in the creation and maintenance of links between the information stored in biological databases and its original sources in the literature. These links will be extremely useful for database updating and curation, especially if a number of technical problems can be solved satisfactorily, including the identification of protein and gene names (entities in general) and the characterization of their types of interactions. The first generation of openly accessible text-mining systems, such as iHOP (Information Hyperlinked over Proteins), provides additional functions to facilitate the reconstruction of protein interaction networks, combine database and text information, and support the scientist in the formulation of novel hypotheses. The next challenge is the generation of comprehensive information regarding the general function of signaling pathways and protein interaction networks.
Fernández, José M; Valencia, Alfonso
2004-10-12
Downloading the information stored in relational databases into XML and other flat formats is a common task in bioinformatics. This periodical dumping of information requires considerable CPU time, disk and memory resources. YAdumper has been developed as a purpose-specific tool to deal with the integral structured information download of relational databases. YAdumper is a Java application that organizes database extraction following an XML template based on an external Document Type Declaration. Compared with other non-native alternatives, YAdumper substantially reduces memory requirements and considerably improves writing performance.
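The memory argument above can be illustrated with a small Python sketch that streams rows from a relational table into XML one at a time instead of building the whole document in memory. This is not YAdumper (which is a Java tool driven by an XML template and DTD); the table, column names, and the `dump_table_as_xml` helper are hypothetical, and SQLite stands in for the relational database.

```python
import io
import sqlite3
from xml.sax.saxutils import XMLGenerator


def dump_table_as_xml(conn, table, out):
    """Stream every row of `table` to `out` as XML, one row at a time."""
    writer = XMLGenerator(out, encoding="utf-8")
    writer.startDocument()
    writer.startElement(table, {})
    # The table name is interpolated into SQL, so it must come from trusted code.
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    columns = [d[0] for d in cursor.description]
    for row in cursor:  # iterating the cursor avoids loading the whole table
        writer.startElement("row", {})
        for name, value in zip(columns, row):
            writer.startElement(name, {})
            writer.characters("" if value is None else str(value))
            writer.endElement(name)
        writer.endElement("row")
    writer.endElement(table)
    writer.endDocument()


# Hypothetical data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE protein (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO protein VALUES (?, ?)",
                 [(1, "kinase <A>"), (2, "ligase")])
buffer = io.StringIO()
dump_table_as_xml(conn, "protein", buffer)
xml_text = buffer.getvalue()
```

Because each row is written as soon as it is fetched, peak memory in this sketch depends on the size of one row rather than on the size of the table.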
A Medical Image Backup Architecture Based on a NoSQL Database and Cloud Computing Services.
Santos Simões de Almeida, Luan Henrique; Costa Oliveira, Marcelo
2015-01-01
The use of digital systems for storing medical images generates a huge volume of data. Digital images are commonly stored and managed on a Picture Archiving and Communication System (PACS), under the DICOM standard. However, PACS is limited because it is strongly dependent on the server's physical space. Alternatively, Cloud Computing arises as an extensive, low cost, and reconfigurable resource. However, medical images contain patient information that cannot be made available in a public cloud. Therefore, a mechanism to anonymize these images is needed. This poster presents a solution for this issue by taking digital images from PACS, converting the information contained in each image file to a NoSQL database, and using cloud computing to store digital images.
NASA Astrophysics Data System (ADS)
Boyandin, A. N.; Lankin, Y. P.; Kargatova, T. V.; Popova, L. Y.; Pechurkin, N. S.
Luminescent transgenic microorganisms are widely used to study the functioning of microbial communities, including closed ones. Bioluminescence is highly sensitive to the effects of different environmental factors. Integrating lux genes into different metabolic pathways makes it possible to study many aspects of microbial life and to carry out measurements in situ. There is much information about applications of bioluminescent bacteria in different studies, but using these data effectively requires summarizing and accumulating them in a common source. Therefore, an information system on the characteristics of transgenic microorganisms with cloned lux genes was created, and the database and related client software were developed. The database structure includes information on common characteristics of cloned lux genes, their sources and properties, the regulation of gene expression in bacterial cells, and the dependence of bioluminescence on biotic, abiotic, and anthropogenic environmental factors. The database can also store descriptions of changes in bacterial populations in response to environmental changes. It also allows storing and using bibliographic information and links to web sites of world collections of microorganisms. Internet publishing software that provides access to the database through the Internet has been developed.
A Database as a Service for the Healthcare System to Store Physiological Signal Data.
Chang, Hsien-Tsung; Lin, Tsai-Huei
2016-01-01
Wearable devices that measure physiological signals to help develop self-health management habits have become increasingly popular in recent years. These records are conducive for follow-up health and medical care. In this study, based on the characteristics of the observed physiological signal records, namely 1) a large number of users, 2) a large amount of data, 3) low information variability, 4) data privacy authorization, and 5) data access by designated users, we wish to resolve physiological signal record-relevant issues utilizing the advantages of the Database as a Service (DaaS) model. Storing a large amount of data using file patterns can reduce database load, allowing users to access data efficiently; the privacy control settings allow users to store data securely. The results of the experiment show that the proposed system has better database access performance than a traditional relational database, with a small difference in database volume, thus proving that the proposed system can improve data storage performance. PMID:28033415
Data Mining on Distributed Medical Databases: Recent Trends and Future Directions
NASA Astrophysics Data System (ADS)
Atilgan, Yasemin; Dogan, Firat
As computerization in healthcare services increases, the amount of available digital data is growing at an unprecedented rate, and as a result healthcare organizations are much more able to store data than to extract knowledge from it. Today the major challenge is to transform these data into useful information and knowledge. It is important for healthcare organizations to use stored data to improve quality while reducing cost. This paper first investigates data mining applications on centralized medical databases and how they are used for diagnostics and population health, then introduces distributed databases. The integration needs and issues of distributed medical databases are described. Finally, the paper focuses on data mining studies on distributed medical databases.
WATERS Terms of Use and Disclaimer
The Reach Address Database (RAD) stores reach address information for each Water Program feature that has been linked to the underlying surface water features (streams, lakes, etc) in the National Hydrology Database (NHD) Plus dataset.
Enabling search over encrypted multimedia databases
NASA Astrophysics Data System (ADS)
Lu, Wenjun; Swaminathan, Ashwin; Varna, Avinash L.; Wu, Min
2009-02-01
Performing information retrieval tasks while preserving data confidentiality is a desirable capability when a database is stored on a server maintained by a third-party service provider. This paper addresses the problem of enabling content-based retrieval over encrypted multimedia databases. Search indexes, along with multimedia documents, are first encrypted by the content owner and then stored onto the server. Through jointly applying cryptographic techniques, such as order preserving encryption and randomized hash functions, with image processing and information retrieval techniques, secure indexing schemes are designed to provide both privacy protection and rank-ordered search capability. Retrieval results on an encrypted color image database and security analysis of the secure indexing schemes under different attack models show that data confidentiality can be preserved while retaining very good retrieval performance. This work has promising applications in secure multimedia management.
Quantum Search in Hilbert Space
NASA Technical Reports Server (NTRS)
Zak, Michail
2003-01-01
A proposed quantum-computing algorithm would perform a search for an item of information in a database stored in a Hilbert-space memory structure. The algorithm is intended to make it possible to search relatively quickly through a large database under conditions in which available computing resources would otherwise be considered inadequate to perform such a task. The algorithm would apply, more specifically, to a relational database in which information would be stored in a set of N complex orthonormal vectors, each of N dimensions (where N can be exponentially large). Each vector would constitute one row of a unitary matrix, from which one would derive the Hamiltonian operator (and hence the evolutionary operator) of a quantum system. In other words, all the stored information would be mapped onto a unitary operator acting on a quantum state that would represent the item of information to be retrieved. Then one could exploit quantum parallelism: one could pose all search queries simultaneously by performing a quantum measurement on the system. In so doing, one would effectively solve the search problem in one computational step. One could exploit the direct- and inner-product decomposability of the unitary matrix to make the dimensionality of the memory space exponentially large by use of only linear resources. However, inasmuch as the necessary preprocessing (the mapping of the stored information into a Hilbert space) could be exponentially expensive, the proposed algorithm would likely be most beneficial in applications in which the resources available for preprocessing were much greater than those available for searching.
Sayers, Samantha; Ulysse, Guerlain; Xiang, Zuoshuang; He, Yongqun
2012-01-01
Vaccine adjuvants are compounds that enhance host immune responses to co-administered antigens in vaccines. Vaxjo is a web-based central database and analysis system that curates, stores, and analyzes vaccine adjuvants and their usages in vaccine development. Basic information of a vaccine adjuvant stored in Vaxjo includes adjuvant name, components, structure, appearance, storage, preparation, function, safety, and vaccines that use this adjuvant. Reliable references are curated and cited. Bioinformatics scripts are developed and used to link vaccine adjuvants to different adjuvanted vaccines stored in the general VIOLIN vaccine database. Presently, 103 vaccine adjuvants have been curated in Vaxjo. Among these adjuvants, 98 have been used in 384 vaccines stored in VIOLIN against over 81 pathogens, cancers, or allergies. All these vaccine adjuvants are categorized and analyzed based on adjuvant types, pathogens used, and vaccine types. As a use case study of vaccine adjuvants in infectious disease vaccines, the adjuvants used in Brucella vaccines are specifically analyzed. A user-friendly web query and visualization interface is developed for interactive vaccine adjuvant search. To support data exchange, the information of vaccine adjuvants is stored in the Vaccine Ontology (VO) in the Web Ontology Language (OWL) format. PMID:22505817
Information mining in remote sensing imagery
NASA Astrophysics Data System (ADS)
Li, Jiang
The volume of remotely sensed imagery continues to grow at an enormous rate due to the advances in sensor technology, and our capability for collecting and storing images has greatly outpaced our ability to analyze and retrieve information from the images. This motivates us to develop image information mining techniques, which is very much an interdisciplinary endeavor drawing upon expertise in image processing, databases, information retrieval, machine learning, and software design. This dissertation proposes and implements an extensive remote sensing image information mining (ReSIM) system prototype for mining useful information implicitly stored in remote sensing imagery. The system consists of three modules: image processing subsystem, database subsystem, and visualization and graphical user interface (GUI) subsystem. Land cover and land use (LCLU) information corresponding to spectral characteristics is identified by supervised classification based on support vector machines (SVM) with automatic model selection, while textural features that characterize spatial information are extracted using Gabor wavelet coefficients. Within LCLU categories, textural features are clustered using an optimized k-means clustering approach to acquire search efficient space. The clusters are stored in an object-oriented database (OODB) with associated images indexed in an image database (IDB). A k-nearest neighbor search is performed using a query-by-example (QBE) approach. Furthermore, an automatic parametric contour tracing algorithm and an O(n) time piecewise linear polygonal approximation (PLPA) algorithm are developed for shape information mining of interesting objects within the image. A fuzzy object-oriented database based on the fuzzy object-oriented data (FOOD) model is developed to handle the fuzziness and uncertainty. 
Three specific applications are presented: integrated land cover and texture pattern mining, shape information mining for change detection of lakes, and fuzzy normalized difference vegetation index (NDVI) pattern mining. The study results show the effectiveness of the proposed system prototype and the potentials for other applications in remote sensing.
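The normalized difference vegetation index mentioned above has a standard per-pixel definition, NDVI = (NIR - Red) / (NIR + Red). A minimal Python sketch of that formula follows; it does not reproduce the dissertation's fuzzy NDVI pattern mining, and the function names, the zero-signal convention, and the sample reflectance values are assumptions made for illustration.

```python
def ndvi(nir: float, red: float) -> float:
    """Normalized difference vegetation index for one pixel."""
    total = nir + red
    if total == 0:
        # Assumption: report 0.0 for pixels with no signal instead of dividing by zero.
        return 0.0
    return (nir - red) / total


def ndvi_image(nir_band, red_band):
    """Apply ndvi() pixel by pixel to two equally sized 2-D bands (lists of rows)."""
    return [
        [ndvi(n, r) for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]


# Hypothetical 2x2 reflectance bands, for illustration only.
nir_band = [[0.5, 0.3], [0.0, 0.8]]
red_band = [[0.1, 0.3], [0.0, 0.2]]
ndvi_values = ndvi_image(nir_band, red_band)
```

Values fall in the range -1 to 1, with dense green vegetation typically giving higher values because it reflects strongly in the near infrared and absorbs red light.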
Timothy A. Bottomley
2008-01-01
The BLM uses a database, called the Forest Vegetation Information System (FORVIS), to store, retrieve, and analyze forest resource information on a majority of their forested lands. FORVIS also has the capability of easily transferring appropriate data electronically into Forest Vegetation Simulator (FVS) for simulation runs. Only minor additional data inputs or...
A Database Design and Development Case: Home Theater Video
ERIC Educational Resources Information Center
Ballenger, Robert; Pratt, Renee
2012-01-01
This case consists of a business scenario of a small video rental store, Home Theater Video, which provides background information, a description of the functional business requirements, and sample data. The case provides sufficient information to design and develop a moderately complex database to assist Home Theater Video in solving their…
MEPD: a Medaka gene expression pattern database
Henrich, Thorsten; Ramialison, Mirana; Quiring, Rebecca; Wittbrodt, Beate; Furutani-Seiki, Makoto; Wittbrodt, Joachim; Kondoh, Hisato
2003-01-01
The Medaka Expression Pattern Database (MEPD) stores and integrates information of gene expression during embryonic development of the small freshwater fish Medaka (Oryzias latipes). Expression patterns of genes identified by ESTs are documented by images and by descriptions through parameters such as staining intensity, category and comments and through a comprehensive, hierarchically organized dictionary of anatomical terms. Sequences of the ESTs are available and searchable through BLAST. ESTs in the database are clustered upon entry and have been blasted against public data-bases. The BLAST results are updated regularly, stored within the database and searchable. The MEPD is a project within the Medaka Genome Initiative (MGI) and entries will be interconnected to integrated genomic map databases. MEPD is accessible through the WWW at http://medaka.dsp.jst.go.jp/MEPD. PMID:12519950
Web client and ODBC access to legacy database information: a low cost approach.
Sanders, N. W.; Mann, N. H.; Spengler, D. M.
1997-01-01
A new method has been developed for the Department of Orthopaedics of Vanderbilt University Medical Center to access departmental clinical data. Previously this data was stored only in the medical center's mainframe DB2 database; it is now additionally stored in a departmental SQL database. Access to this data is available via any ODBC compliant front-end or a web client. With a small budget and no full-time staff, we were able to give our department on-line access to many years' worth of patient data that was previously inaccessible. PMID:9357735
A Database of Historical Information on Landslides and Floods in Italy
NASA Astrophysics Data System (ADS)
Guzzetti, F.; Tonelli, G.
2003-04-01
For the past 12 years we have maintained and updated a database of historical information on landslides and floods in Italy, known as the National Research Council's AVI (Damaged Urban Areas) Project archive. The database was originally designed to respond to a specific request of the Minister of Civil Protection, and was aimed at helping the regional assessment of landslide and flood risk in Italy. The database was first constructed in 1991-92 to cover the period 1917 to 1990. Information on damaging landslide and flood events was collected by searching archives, by screening thousands of newspaper issues, by reviewing the existing technical and scientific literature on landslides and floods in Italy, and by interviewing landslide and flood experts. The database was then updated chiefly through the analysis of hundreds of newspaper articles, and it now covers systematically the period 1900 to 1998, and non-systematically the periods 1900 to 1916 and 1999 to 2002. Non-systematic information on landslide and flood events older than the 20th century is also present in the database. The database currently contains information on more than 32,000 landslide events that occurred at more than 25,700 sites, and on more than 28,800 flood events that occurred at more than 15,600 sites. After a brief outline of the history and evolution of the AVI Project archive, we present and discuss: (a) the present structure of the database, including the hardware and software solutions adopted to maintain, manage, use and disseminate the information stored in the database, (b) the type and amount of information stored in the database, including an estimate of its completeness, and (c) examples of recent applications of the database, including a web-based GIS system to show the location of sites historically affected by landslides and floods, and an estimate of geo-hydrological (i.e., landslide and flood) risk in Italy based on the available historical information.
VerSeDa: vertebrate secretome database
Cortazar, Ana R.; Oguiza, José A.
2017-01-01
Based on the current tools, de novo secretome (full set of proteins secreted by an organism) prediction is a time consuming bioinformatic task that requires a multifactorial analysis in order to obtain reliable in silico predictions. Hence, to accelerate this process and offer researchers a reliable repository where secretome information can be obtained for vertebrates and model organisms, we have developed VerSeDa (Vertebrate Secretome Database). This freely available database stores information about proteins that are predicted to be secreted through the classical and non-classical mechanisms, for the wide range of vertebrate species deposited at the NCBI, UCSC and ENSEMBL sites. To our knowledge, VerSeDa is the only state-of-the-art database designed to store secretome data from multiple vertebrate genomes, thus, saving an important amount of time spent in the prediction of protein features that can be retrieved from this repository directly. Database URL: VerSeDa is freely available at http://genomics.cicbiogune.es/VerSeDa/index.php PMID:28365718
Geospatial data infrastructure: The development of metadata for geo-information in China
NASA Astrophysics Data System (ADS)
Xu, Baiquan; Yan, Shiqiang; Wang, Qianju; Lian, Jian; Wu, Xiaoping; Ding, Keyong
2014-03-01
Stores of geoscience records are in constant flux. These stores are continually added to by new information, ideas and data, which are frequently revised. The geoscience record is restrained by human thought and technology for handling information. Conventional methods strive, with limited success, to maintain geoscience records which are readily susceptible and renewable. The information system must adapt to the diversity of ideas and data in geoscience and their changes through time. In China, more than 400,000 types of important geological data have been collected and produced in geological work during the last two decades, including oil, natural gas and marine data, mine exploration, geophysical, geochemical, remote sensing and important local geological survey and research reports. Numerous geospatial databases are formed and stored in National Geological Archives (NGA) with available formats of MapGIS, ArcGIS, ArcINFO, Metalfile, Raster, SQL Server, Access and JPEG. But there is no effective way to warrant that the quality of information is adequate in theory and practice for decision making. The need to provide the Geographic Information System (GIS) communities with fast, reliable, accurate and up-to-date information is becoming insistent for all geoinformation producers and users in China. Since 2010, a series of geoinformation projects have been carried out under the leadership of the Ministry of Land and Resources (MLR), including (1) Integration, update and maintenance of geoinformation databases; (2) Standards research on clusterization and industrialization of information services; (3) Platform construction of geological data sharing; (4) Construction of key borehole databases; (5) Product development of information services.
A "Nine-System" basic framework has been proposed for the development and improvement of the geospatial data infrastructure, focused on the construction of the cluster organization, cluster service, convergence, database, product, policy, technology, standard and infrastructure systems. The development of geoinformation stores and services puts forward a need for Geospatial Data Infrastructure (GDI) in China. In this paper, some of the ideas envisaged for the development of metadata in China are discussed.
Team X Spacecraft Instrument Database Consolidation
NASA Technical Reports Server (NTRS)
Wallenstein, Kelly A.
2005-01-01
In the past decade, many changes have been made to Team X's process of designing each spacecraft, with the purpose of making the overall procedure more efficient over time. One such improvement is the use of information databases from previous missions, designs, and research. By referring to these databases, members of the design team can locate relevant instrument data and significantly reduce the total time they spend on each design. The files in these databases were stored in several different formats with various levels of accuracy. During the past 2 months, efforts have been made in an attempt to combine and organize these files. The main focus was in the Instruments department, where spacecraft subsystems are designed based on mission measurement requirements. A common database was developed for all instrument parameters using Microsoft Excel to minimize the time and confusion experienced when searching through files stored in several different formats and locations. By making this collection of information more organized, the files within them have become more easily searchable. Additionally, the new Excel database offers the option of importing its contents into a more efficient database management system in the future. This potential for expansion enables the database to grow and acquire more search features as needed.
Real Time Monitor of Grid job executions
NASA Astrophysics Data System (ADS)
Colling, D. J.; Martyniak, J.; McGough, A. S.; Křenek, A.; Sitera, J.; Mulač, M.; Dvořák, F.
2010-04-01
In this paper we describe the architecture and operation of the Real Time Monitor (RTM), developed by the Grid team in the HEP group at Imperial College London. This is arguably the most popular dissemination tool within the EGEE [1] Grid. It has been used on many occasions, including GridFest and LHC inauguration events held at CERN in October 2008. The RTM gathers information from EGEE sites hosting Logging and Bookkeeping (LB) services. Information is cached locally at a dedicated server at Imperial College London and made available for clients to use in near real time. The system consists of three main components: the RTM server, the enquirer and an Apache Web Server which is queried by clients. The RTM server queries the LB servers at fixed time intervals, collecting job related information and storing this in a local database. Job related data includes not only job state (i.e. Scheduled, Waiting, Running or Done) along with timing information but also other attributes such as Virtual Organization and Computing Element (CE) queue - if known. The job data stored in the RTM database is read by the enquirer every minute and converted to an XML format which is stored on a Web Server. This decouples the RTM server database from the client, removing the bottleneck problem caused by many clients simultaneously accessing the database. This information can be visualized through either a 2D or 3D Java based client with live job data either being overlaid on to a 2 dimensional map of the world or rendered in 3 dimensions over a globe map using OpenGL.
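The enquirer step described above (read job rows from the local database, write them out as XML for the web server) can be sketched as follows. This is only an illustration under stated assumptions: the table name `jobs`, its columns, the XML element names and the use of SQLite are hypothetical and are not taken from the RTM itself.

```python
import sqlite3
import xml.etree.ElementTree as ET


def make_demo_db():
    """Build an in-memory database with a hypothetical 'jobs' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE jobs (job_id TEXT PRIMARY KEY, state TEXT, vo TEXT)")
    conn.executemany(
        "INSERT INTO jobs (job_id, state, vo) VALUES (?, ?, ?)",
        [("job-001", "Running", "cms"), ("job-002", "Done", None)],
    )
    return conn


def jobs_to_xml(conn):
    """Serialize every job row to an XML string (one <job> element per row)."""
    root = ET.Element("jobs")
    query = "SELECT job_id, state, vo FROM jobs ORDER BY job_id"
    for job_id, state, vo in conn.execute(query):
        job = ET.SubElement(root, "job", id=job_id, state=state)
        if vo is not None:  # the Virtual Organization is recorded only if known
            job.set("vo", vo)
    return ET.tostring(root, encoding="unicode")
```

Clients would then fetch the generated XML file from the web server rather than querying the database, which is the decoupling the abstract describes.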
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tourassi, Georgia D.; Harrawood, Brian; Singh, Swatee
2007-08-15
We have previously presented a knowledge-based computer-assisted detection (KB-CADe) system for the detection of mammographic masses. The system is designed to compare a query mammographic region with mammographic templates of known ground truth. The templates are stored in an adaptive knowledge database. Image similarity is assessed with information theoretic measures (e.g., mutual information) derived directly from the image histograms. A previous study suggested that the diagnostic performance of the system steadily improves as the knowledge database is initially enriched with more templates. However, as the database increases in size, an exhaustive comparison of the query case with each stored template becomes computationally burdensome. Furthermore, blind storing of new templates may result in redundancies that do not necessarily improve diagnostic performance. To address these concerns we investigated an entropy-based indexing scheme for improving the speed of analysis and for satisfying database storage restrictions without compromising the overall diagnostic performance of our KB-CADe system. The indexing scheme was evaluated on two different datasets as (i) a search mechanism to sort through the knowledge database, and (ii) a selection mechanism to build a smaller, concise knowledge database that is easier to maintain but still effective. There were two important findings in the study. First, entropy-based indexing is an effective strategy to identify fast a subset of templates that are most relevant to a given query. Only this subset could be analyzed in more detail using mutual information for optimized decision making regarding the query. Second, a selective entropy-based deposit strategy may be preferable where only high entropy cases are maintained in the knowledge database.
Overall, the proposed entropy-based indexing scheme was shown to reduce the computational cost of our KB-CADe system by 55% to 80% while maintaining the system's diagnostic performance.
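The two-stage idea in this abstract (a cheap entropy index to shortlist templates, then mutual information on the shortlist only) can be illustrated with a minimal sketch. The abstract does not say how relevance is derived from entropy, so ranking templates by how close their histogram entropy is to the query's is an assumption made here, as are the function names, the 16-bin quantization and the flat pixel-list representation.

```python
import math
from collections import Counter


def histogram_entropy(pixels, bins=16, max_value=256):
    """Shannon entropy (bits) of the grey-level histogram of a flat pixel list."""
    counts = Counter(p * bins // max_value for p in pixels)
    total = len(pixels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def mutual_information(a, b, bins=16, max_value=256):
    """Mutual information (bits) between two equally sized pixel lists."""
    total = len(a)
    qa = [p * bins // max_value for p in a]
    qb = [p * bins // max_value for p in b]
    joint, pa, pb = Counter(zip(qa, qb)), Counter(qa), Counter(qb)
    return sum(
        (c / total) * math.log2(c * total / (pa[x] * pb[y]))
        for (x, y), c in joint.items()
    )


def shortlist_by_entropy(query, templates, k):
    """Assumed index: keep the k templates whose stored entropy is nearest the query's."""
    hq = histogram_entropy(query)
    return sorted(templates, key=lambda t: abs(t["entropy"] - hq))[:k]


def best_match(query, templates, k=2):
    """Run the costly mutual-information comparison on the shortlist only."""
    candidates = shortlist_by_entropy(query, templates, k)
    return max(candidates, key=lambda t: mutual_information(query, t["pixels"]))
```

Each template's entropy is computed once when it is deposited, so a query costs one entropy calculation plus k mutual-information comparisons instead of one comparison per stored template.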
PROTICdb: a web-based application to store, track, query, and compare plant proteome data.
Ferry-Dumazet, Hélène; Houel, Gwenn; Montalent, Pierre; Moreau, Luc; Langella, Olivier; Negroni, Luc; Vincent, Delphine; Lalanne, Céline; de Daruvar, Antoine; Plomion, Christophe; Zivy, Michel; Joets, Johann
2005-05-01
PROTICdb is a web-based application, mainly designed to store and analyze plant proteome data obtained by two-dimensional polyacrylamide gel electrophoresis (2-D PAGE) and mass spectrometry (MS). The purposes of PROTICdb are (i) to store, track, and query information related to proteomic experiments, i.e., from tissue sampling to protein identification and quantitative measurements, and (ii) to integrate information from the user's own expertise and other sources into a knowledge base, used to support data interpretation (e.g., for the determination of allelic variants or products of post-translational modifications). Data insertion into the relational database of PROTICdb is achieved either by uploading outputs of image analysis and MS identification software, or by filling web forms. 2-D PAGE annotated maps can be displayed, queried, and compared through a graphical interface. Links to external databases are also available. Quantitative data can be easily exported in a tabulated format for statistical analyses. PROTICdb is based on the Oracle or the PostgreSQL Database Management System and is freely available upon request at the following URL: http://moulon.inra.fr/bioinfo/PROTICdb.
Intrusion Detection in Database Systems
NASA Astrophysics Data System (ADS)
Javidi, Mohammad M.; Sohrabi, Mina; Rafsanjani, Marjan Kuchaki
Data today represent a valuable asset for organizations and companies and must be protected. Ensuring the security and privacy of data assets is a crucial and very difficult problem in our modern networked world. Despite the necessity of protecting information stored in database systems (DBS), existing security models are insufficient to prevent misuse, especially insider abuse by legitimate users. One mechanism to safeguard the information in these databases is to use an intrusion detection system (IDS). The purpose of intrusion detection in database systems is to detect transactions that access data without permission. In this paper several database intrusion detection approaches are evaluated.
A Quality-Control-Oriented Database for a Mesoscale Meteorological Observation Network
NASA Astrophysics Data System (ADS)
Lussana, C.; Ranci, M.; Uboldi, F.
2012-04-01
In the operational context of a local weather service, data accessibility and quality related issues must be managed by taking into account a wide set of user needs. This work describes the structure and the operational choices made for the operational implementation of a database system storing data from highly automated observing stations, metadata and information on data quality. Lombardy's environmental protection agency, ARPA Lombardia, manages a highly automated mesoscale meteorological network. A Quality Assurance System (QAS) ensures that reliable observational information is collected and disseminated to the users. The weather unit in ARPA Lombardia, at the same time an important QAS component and an intensive data user, has developed a database specifically aimed at: 1) providing quick access to data for operational activities and 2) ensuring data quality for real-time applications, by means of an Automatic Data Quality Control (ADQC) procedure. Quantities stored in the archive include hourly aggregated observations of: precipitation amount, temperature, wind, relative humidity, pressure, global and net solar radiation. The ADQC performs several independent tests on raw data and compares their results in a decision-making procedure. An important ADQC component is the Spatial Consistency Test based on Optimal Interpolation. Interpolated and Cross-Validation analysis values are also stored in the database, providing further information to human operators and useful estimates in case of missing data. The technical solution adopted is based on a LAMP (Linux, Apache, MySQL and PHP) system, constituting an open source environment suitable for both development and operational practice. The ADQC procedure itself is performed by R scripts directly interacting with the MySQL database. Users and network managers can access the database by using a set of web-based PHP applications.
The 2002 RPA Plot Summary database users manual
Patrick D. Miles; John S. Vissage; W. Brad Smith
2004-01-01
Describes the structure of the RPA 2002 Plot Summary database and provides information on generating estimates of forest statistics from these data. The RPA 2002 Plot Summary database provides a consistent framework for storing forest inventory data across all ownerships across the entire United States. The data represents the best available data as of October 2001....
Follicle Online: an integrated database of follicle assembly, development and ovulation.
Hua, Juan; Xu, Bo; Yang, Yifan; Ban, Rongjun; Iqbal, Furhan; Cooke, Howard J; Zhang, Yuanwei; Shi, Qinghua
2015-01-01
Folliculogenesis is an important part of ovarian function as it provides the oocytes for female reproductive life. Characterizing genes/proteins involved in folliculogenesis is fundamental for understanding the mechanisms associated with this biological function and to cure the diseases associated with folliculogenesis. A large number of genes/proteins associated with folliculogenesis have been identified from different species. However, no dedicated public resource is currently available for folliculogenesis-related genes/proteins that are validated by experiments. Here, we are reporting a database 'Follicle Online' that provides the experimentally validated gene/protein map of the folliculogenesis in a number of species. Follicle Online is a web-based database system for storing and retrieving folliculogenesis-related experimental data. It provides detailed information for 580 genes/proteins (from 23 model organisms, including Homo sapiens, Mus musculus, Rattus norvegicus, Mesocricetus auratus, Bos Taurus, Drosophila and Xenopus laevis) that have been reported to be involved in folliculogenesis, POF (premature ovarian failure) and PCOS (polycystic ovary syndrome). The literature was manually curated from more than 43,000 published articles (till 1 March 2014). The Follicle Online database is implemented in PHP + MySQL + JavaScript and this user-friendly web application provides access to the stored data. In summary, we have developed a centralized database that provides users with comprehensive information about genes/proteins involved in folliculogenesis. This database can be accessed freely and all the stored data can be viewed without any registration. Database URL: http://mcg.ustc.edu.cn/sdap1/follicle/index.php © The Author(s) 2015. Published by Oxford University Press.
Follicle Online: an integrated database of follicle assembly, development and ovulation
Hua, Juan; Xu, Bo; Yang, Yifan; Ban, Rongjun; Iqbal, Furhan; Zhang, Yuanwei; Shi, Qinghua
2015-01-01
Folliculogenesis is an important part of ovarian function as it provides the oocytes for female reproductive life. Characterizing genes/proteins involved in folliculogenesis is fundamental for understanding the mechanisms associated with this biological function and to cure the diseases associated with folliculogenesis. A large number of genes/proteins associated with folliculogenesis have been identified from different species. However, no dedicated public resource is currently available for folliculogenesis-related genes/proteins that are validated by experiments. Here, we are reporting a database ‘Follicle Online’ that provides the experimentally validated gene/protein map of the folliculogenesis in a number of species. Follicle Online is a web-based database system for storing and retrieving folliculogenesis-related experimental data. It provides detailed information for 580 genes/proteins (from 23 model organisms, including Homo sapiens, Mus musculus, Rattus norvegicus, Mesocricetus auratus, Bos Taurus, Drosophila and Xenopus laevis) that have been reported to be involved in folliculogenesis, POF (premature ovarian failure) and PCOS (polycystic ovary syndrome). The literature was manually curated from more than 43 000 published articles (till 1 March 2014). The Follicle Online database is implemented in PHP + MySQL + JavaScript and this user-friendly web application provides access to the stored data. In summary, we have developed a centralized database that provides users with comprehensive information about genes/proteins involved in folliculogenesis. This database can be accessed freely and all the stored data can be viewed without any registration. Database URL: http://mcg.ustc.edu.cn/sdap1/follicle/index.php PMID:25931457
77 FR 66880 - Submission for OMB Review; Comment Request
Federal Register 2010, 2011, 2012, 2013, 2014
2012-11-07
... the database that stores information for the Lost and Stolen Securities Program. We estimate that 26... Lost and Stolen Securities Program database will be kept confidential. The Commission may not conduct... SECURITIES AND EXCHANGE COMMISSION Submission for OMB Review; Comment Request Upon Written Request...
VerSeDa: vertebrate secretome database.
Cortazar, Ana R; Oguiza, José A; Aransay, Ana M; Lavín, José L
2017-01-01
Based on the current tools, de novo secretome (full set of proteins secreted by an organism) prediction is a time consuming bioinformatic task that requires a multifactorial analysis in order to obtain reliable in silico predictions. Hence, to accelerate this process and offer researchers a reliable repository where secretome information can be obtained for vertebrates and model organisms, we have developed VerSeDa (Vertebrate Secretome Database). This freely available database stores information about proteins that are predicted to be secreted through the classical and non-classical mechanisms, for the wide range of vertebrate species deposited at the NCBI, UCSC and ENSEMBL sites. To our knowledge, VerSeDa is the only state-of-the-art database designed to store secretome data from multiple vertebrate genomes, thus, saving an important amount of time spent in the prediction of protein features that can be retrieved from this repository directly. VerSeDa is freely available at http://genomics.cicbiogune.es/VerSeDa/index.php. © The Author(s) 2017. Published by Oxford University Press.
A manual for a laboratory information management system (LIMS) for light stable isotopes
Coplen, Tyler B.
1997-01-01
The reliability and accuracy of isotopic data can be improved by utilizing database software to (i) store information about samples, (ii) store the results of mass spectrometric isotope-ratio analyses of samples, (iii) calculate analytical results using standardized algorithms stored in a database, (iv) normalize stable isotopic data to international scales using isotopic reference materials, and (v) generate multi-sheet paper templates for convenient sample loading of automated mass-spectrometer sample preparation manifolds. Such a database program is presented herein. Major benefits of this system include (i) an increase in laboratory efficiency, (ii) reduction in the use of paper, (iii) reduction in workload due to the elimination or reduction of retyping of data by laboratory personnel, and (iv) decreased errors in data reported to sample submitters. Such a database provides a complete record of when and how often laboratory reference materials have been analyzed and provides a record of what correction factors have been used through time. It provides an audit trail for stable isotope laboratories. Since the original publication of the manual for LIMS for Light Stable Isotopes, the isotopes ³H, ³He, and ¹⁴C, and the chlorofluorocarbons (CFCs), CFC-11, CFC-12, and CFC-113, have been added to this program.
A manual for a Laboratory Information Management System (LIMS) for light stable isotopes
Coplen, Tyler B.
1998-01-01
The reliability and accuracy of isotopic data can be improved by utilizing database software to (i) store information about samples, (ii) store the results of mass spectrometric isotope-ratio analyses of samples, (iii) calculate analytical results using standardized algorithms stored in a database, (iv) normalize stable isotopic data to international scales using isotopic reference materials, and (v) generate multi-sheet paper templates for convenient sample loading of automated mass-spectrometer sample preparation manifolds. Such a database program is presented herein. Major benefits of this system include (i) an increase in laboratory efficiency, (ii) reduction in the use of paper, (iii) reduction in workload due to the elimination or reduction of retyping of data by laboratory personnel, and (iv) decreased errors in data reported to sample submitters. Such a database provides a complete record of when and how often laboratory reference materials have been analyzed and provides a record of what correction factors have been used through time. It provides an audit trail for stable isotope laboratories. Since the original publication of the manual for LIMS for Light Stable Isotopes, the isotopes ³H, ³He, and ¹⁴C, and the chlorofluorocarbons (CFCs), CFC-11, CFC-12, and CFC-113, have been added to this program.
Code of Federal Regulations, 2010 CFR
2010-04-01
... USE DEVICES General Hospital and Personal Use Miscellaneous Devices § 880.6300 Implantable... identification code is used to access patient identity and corresponding health information stored in a database...
Pearson, Daniel K.; Bumgarner, Johnathan R.; Houston, Natalie A.; Stanton, Gregory P.; Teeple, Andrew; Thomas, Jonathan V.
2012-01-01
The U.S. Geological Survey, in cooperation with Middle Pecos Groundwater Conservation District, Pecos County, City of Fort Stockton, Brewster County, and Pecos County Water Control and Improvement District No. 1, compiled groundwater, surface-water, water-quality, geophysical, and geologic data for site locations in the Pecos County region, Texas, and developed a geodatabase to facilitate use of this information. Data were compiled for an approximately 4,700 square mile area of the Pecos County region, Texas. The geodatabase contains data from 8,242 sampling locations; it was designed to organize and store field-collected geochemical and geophysical data, as well as digital database resources from the U.S. Geological Survey, Middle Pecos Groundwater Conservation District, Texas Water Development Board, Texas Commission on Environmental Quality, and numerous other State and local databases. The geodatabase combines these disparate database resources into a simple data model. Site locations are geospatially enabled and stored in a geodatabase feature class for cartographic visualization and spatial analysis within a Geographic Information System. The sampling locations are related to hydrogeologic information through the use of geodatabase relationship classes. The geodatabase relationship classes provide the ability to perform complex spatial and data-driven queries to explore data stored in the geodatabase.
System of end-to-end symmetric database encryption
NASA Astrophysics Data System (ADS)
Galushka, V. V.; Aydinyan, A. R.; Tsvetkova, O. L.; Fathi, V. A.; Fathi, D. V.
2018-05-01
The article is devoted to the actual problem of protecting databases from information leakage, which is performed while bypassing access control mechanisms. To solve this problem, it is proposed to use end-to-end data encryption, implemented at the end nodes of an interaction of the information system components using one of the symmetric cryptographic algorithms. For this purpose, a key management method designed for use in a multi-user system based on the distributed key representation model, part of which is stored in the database, and the other part is obtained by converting the user's password, has been developed and described. In this case, the key is calculated immediately before the cryptographic transformations and is not stored in the memory after the completion of these transformations. Algorithms for registering and authorizing a user, as well as changing his password, have been described, and the methods for calculating parts of a key when performing these operations have been provided.
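A split-key arrangement of the kind described, with one part stored in the database and the other derived from the user's password, might look like the sketch below. The abstract does not give the conversion or the combining rule, so PBKDF2-HMAC-SHA256 for the password part and XOR for combining the parts are assumptions, and all function and field names are hypothetical.

```python
import hashlib
import os

ITERATIONS = 100_000  # assumed work factor; the article does not specify one


def password_part(password: str, salt: bytes, length: int = 32) -> bytes:
    """Derive the user's key part from the password (assumed: PBKDF2-HMAC-SHA256)."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS, dklen=length
    )


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Combine two equal-length byte strings (assumed combining rule: XOR)."""
    return bytes(x ^ y for x, y in zip(a, b))


def register_user(password: str, data_key: bytes) -> dict:
    """Return the per-user record that would be stored in the database."""
    salt = os.urandom(16)
    stored = xor_bytes(data_key, password_part(password, salt, len(data_key)))
    return {"salt": salt, "stored_part": stored}


def recover_key(password: str, record: dict) -> bytes:
    """Recompute the data key just before use; callers should not keep it around."""
    part = password_part(password, record["salt"], len(record["stored_part"]))
    return xor_bytes(record["stored_part"], part)


def change_password(old: str, new: str, record: dict) -> dict:
    """Re-wrap the same data key under a new password."""
    return register_user(new, recover_key(old, record))
```

Under this sketch the database never holds the full key: a record yields the data key only together with the correct password, and a wrong password simply produces different bytes. Python cannot guarantee that key bytes are wiped from memory, so the "not stored in the memory" property from the abstract is not reproduced here.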
PCACE-Personal-Computer-Aided Cabling Engineering
NASA Technical Reports Server (NTRS)
Billitti, Joseph W.
1987-01-01
PCACE computer program developed to provide inexpensive, interactive system for learning and using engineering approach to interconnection systems. Basically database system that stores information as files of individual connectors and handles wiring information in circuit groups stored as records. Directly emulates typical manual engineering methods of handling data, thus making interface between user and program very natural. Apple version written in P-Code Pascal and IBM PC version of PCACE written in TURBO Pascal 3.0
Code of Federal Regulations, 2013 CFR
2013-04-01
... ADMINISTRATION, DEPARTMENT OF HEALTH AND HUMAN SERVICES (CONTINUED) MEDICAL DEVICES GENERAL HOSPITAL AND PERSONAL... identification code is used to access patient identity and corresponding health information stored in a database...
Code of Federal Regulations, 2014 CFR
2014-04-01
... ADMINISTRATION, DEPARTMENT OF HEALTH AND HUMAN SERVICES (CONTINUED) MEDICAL DEVICES GENERAL HOSPITAL AND PERSONAL... identification code is used to access patient identity and corresponding health information stored in a database...
Code of Federal Regulations, 2012 CFR
2012-04-01
... ADMINISTRATION, DEPARTMENT OF HEALTH AND HUMAN SERVICES (CONTINUED) MEDICAL DEVICES GENERAL HOSPITAL AND PERSONAL... identification code is used to access patient identity and corresponding health information stored in a database...
Code of Federal Regulations, 2011 CFR
2011-04-01
... ADMINISTRATION, DEPARTMENT OF HEALTH AND HUMAN SERVICES (CONTINUED) MEDICAL DEVICES GENERAL HOSPITAL AND PERSONAL... identification code is used to access patient identity and corresponding health information stored in a database...
Acquiring geographical data with web harvesting
NASA Astrophysics Data System (ADS)
Dramowicz, K.
2016-04-01
Many websites contain very attractive and up to date geographical information. This information can be extracted, stored, analyzed and mapped using web harvesting techniques. Poorly organized data from websites are transformed with web harvesting into a more structured format, which can be stored in a database and analyzed. Almost 25% of web traffic is related to web harvesting, mostly while using search engines. This paper presents how to harvest geographic information from web documents using the free tool called the Beautiful Soup, one of the most commonly used Python libraries for pulling data from HTML and XML files. It is a relatively easy task to process one static HTML table. The more challenging task is to extract and save information from tables located in multiple and poorly organized websites. Legal and ethical aspects of web harvesting are discussed as well. The paper demonstrates two case studies. The first one shows how to extract various types of information about the Good Country Index from the multiple web pages, load it into one attribute table and map the results. The second case study shows how script tools and GIS can be used to extract information from one hundred thirty six websites about Nova Scotia wines. In a little more than three minutes a database containing one hundred and six liquor stores selling these wines is created. Then the availability and spatial distribution of various types of wines (by grape types, by wineries, and by liquor stores) are mapped and analyzed.
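The simplest case mentioned above, turning one static HTML table into structured records, can be sketched as follows. The paper uses Beautiful Soup; this sketch deliberately swaps in `html.parser` from the Python standard library so that it runs without any installation, and the sample table with its store names is invented for illustration.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def harvest_table(html):
    """Return one dict per body row, keyed by the header row's cell text."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


# Hypothetical sample page fragment, not taken from the paper.
SAMPLE = """
<table>
  <tr><th>store</th><th>town</th></tr>
  <tr><td>Store A</td><td>Halifax</td></tr>
  <tr><td>Store B</td><td>Wolfville</td></tr>
</table>
"""
```

The resulting list of dicts can be written to a database table or joined to coordinates for mapping; harvesting many poorly organized sites, as in the second case study, would need per-site handling that this sketch does not attempt.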
The Tropical Biominer Project: mining old sources for new drugs.
Artiguenave, François; Lins, André; Maciel, Wesley Dias; Junior, Antonio Celso Caldeira; Nacif-Coelho, Carla; de Souza Linhares, Maria Margarida Ribeiro; de Oliveira, Guilherme Correa; Barbosa, Luis Humberto Rezende; Lopes, Júlio César Dias; Junior, Claudionor Nunes Coelho
2005-01-01
The Tropical Biominer Project is a recent initiative from the Federal University of Minas Gerais (UFMG) and the Oswaldo Cruz Foundation, with the participation of the Biominas Foundation (Belo Horizonte, Minas Gerais, Brazil) and the start-up Homologix. The main objective of the project is to build a new resource for chemogenomics research on chemical compounds, with a strong emphasis on natural molecules. Adopted technologies include the search of information from structured, semi-structured, and non-structured documents (the last two from the web) and datamining tools in order to gather information from different sources. The database is the support for developing applications to find new potential treatments for parasitic infections by using virtual screening tools. We present here the midpoint of the project: the conception and implementation of the Tropical Biominer Database. This is a Federated Database designed to store data from different resources. Connected to the database, a web crawler is able to gather information from distinct, patented web sites and store them after automatic classification using datamining tools. Finally, we demonstrate the interest of the approach, by formulating new hypotheses on specific targets of a natural compound, violacein, using inferences from a Virtual Screening procedure.
Network Configuration of Oracle and Database Programming Using SQL
NASA Technical Reports Server (NTRS)
Davis, Melton; Abdurrashid, Jibril; Diaz, Philip; Harris, W. C.
2000-01-01
A database can be defined as a collection of information organized in such a way that it can be retrieved and used. A database management system (DBMS) can further be defined as the tool that enables us to manage and interact with the database. The Oracle 8 Server is a state-of-the-art information management environment. It is a repository for very large amounts of data, and gives users rapid access to that data. The Oracle 8 Server allows for sharing of data between applications; the information is stored in one place and used by many systems. My research will focus primarily on SQL (Structured Query Language) programming. SQL is the way you define and manipulate data in Oracle's relational database. SQL is the industry standard adopted by all database vendors. When programming with SQL, you work on sets of data (i.e., information is not processed one record at a time).
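The remark above that SQL works on sets of data rather than one record at a time can be sketched as follows. This is only an illustration: it uses Python's built-in sqlite3 module with an in-memory database as a stand-in for the Oracle 8 Server, and the table name, column names, and values are invented.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in for an Oracle server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE samples (id INTEGER PRIMARY KEY, site TEXT, reading REAL)"
)
conn.executemany(
    "INSERT INTO samples (site, reading) VALUES (?, ?)",
    [("north", 1.0), ("north", 3.0), ("south", 10.0)],
)

# One statement changes every matching row; there is no explicit loop
# over individual records in the calling program.
cursor = conn.execute(
    "UPDATE samples SET reading = reading * 2 WHERE site = 'north'"
)
updated = cursor.rowcount

# Aggregation is likewise expressed over the whole set of rows at once.
averages = dict(
    conn.execute("SELECT site, AVG(reading) FROM samples GROUP BY site")
)
conn.close()
```

The single UPDATE touches both "north" rows, and the GROUP BY query returns one average per site. Which rows are affected is described declaratively in the WHERE clause rather than by iterating in application code.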
Varela, Sara; González-Hernández, Javier; Casabella, Eduardo; Barrientos, Rafael
2014-01-01
Citizen science projects store an enormous amount of information about species distribution, diversity and characteristics. Researchers are now beginning to make use of this rich collection of data. However, access to these databases is not always straightforward. Apart from the largest and international projects, citizen science repositories often lack specific Application Programming Interfaces (APIs) to connect them to the scientific environments. Thus, it is necessary to develop simple routines to allow researchers to take advantage of the information collected by smaller citizen science projects, for instance, programming specific packages to connect them to popular scientific environments (like R). Here, we present rAvis, an R-package to connect R-users with Proyecto AVIS (http://proyectoavis.com), a Spanish citizen science project with more than 82,000 bird observation records. We develop several functions to explore the database, to plot the geographic distribution of the species occurrences, and to generate personal queries to the database about species occurrences (number of individuals, distribution, etc.) and birdwatcher observations (number of species recorded by each collaborator, UTMs visited, etc.). This new R-package will allow scientists to access this database and to exploit the information generated by Spanish birdwatchers over the last 40 years. PMID:24626233
BIRS - Bioterrorism Information Retrieval System.
Tewari, Ashish Kumar; Rashi; Wadhwa, Gulshan; Sharma, Sanjeev Kumar; Jain, Chakresh Kumar
2013-01-01
Bioterrorism is the intentional use of pathogenic strains of microbes to spread terror in a population. There is a definite need to promote research for the development of vaccines, therapeutics and diagnostic methods as part of preparedness for any future bioterror attack. BIRS is an open-access database of collective information on the organisms related to bioterrorism. The database architecture uses current open-source technology (PHP version 5.3.19 and MySQL) with an IIS server on the Windows platform. The database stores information on the literature, generic information and unique pathways of about 10 microorganisms involved in bioterrorism. This may serve as a collective repository to accelerate the drug discovery and vaccine design process against such bioterrorist agents (microbes). The available data have been validated against various online resources and through literature mining in order to provide the user with a comprehensive information system. The database is freely available at http://www.bioterrorism.biowaves.org.
HOWDY: an integrated database system for human genome research
Hirakawa, Mika
2002-01-01
HOWDY is an integrated database system for accessing and analyzing human genomic information (http://www-alis.tokyo.jst.go.jp/HOWDY/). HOWDY stores information about relationships between genetic objects and the data extracted from a number of databases. HOWDY consists of an Internet accessible user interface that allows thorough searching of the human genomic databases using the gene symbols and their aliases. It also permits flexible editing of the sequence data. The database can be searched using simple words and the search can be restricted to a specific cytogenetic location. Linear maps displaying markers and genes on contig sequences are available, from which an object can be chosen. Any search starting point identifies all the information matching the query. HOWDY provides a convenient search environment of human genomic data for scientists unsure which database is most appropriate for their search. PMID:11752279
Code of Federal Regulations, 2011 CFR
2011-01-01
... home computer systems of an employee; or (4) Whether the information is active or inactive. (k) Record... (e.g., e-mail, databases, spreadsheets, PowerPoint presentations, electronic reporting systems... information is stored or located, including network servers, desktop or laptop computers and handheld...
Application of cloud database in the management of clinical data of patients with skin diseases.
Mao, Xiao-fei; Liu, Rui; DU, Wei; Fan, Xue; Chen, Dian; Zuo, Ya-gang; Sun, Qiu-ning
2015-04-01
To evaluate the needs and applications of using a cloud database in the daily practice of a dermatology department. The cloud database was established for systemic scleroderma and localized scleroderma. Paper forms were used to record the original data including personal information, pictures, specimens, blood biochemical indicators, skin lesions, and scores of self-rating scales. The results were input into the cloud database. The applications of the cloud database in the dermatology department were summarized and analyzed. The personal and clinical information of 215 systemic scleroderma patients and 522 localized scleroderma patients were included and analyzed using the cloud database. The disease status, quality of life, and prognosis were obtained by statistical calculations. The cloud database can efficiently and rapidly store and manage the data of patients with skin diseases. As a simple, prompt, safe, and convenient tool, it can be used in patient information management, clinical decision-making, and scientific research.
MedBlock: Efficient and Secure Medical Data Sharing Via Blockchain.
Fan, Kai; Wang, Shangyang; Ren, Yanhui; Li, Hui; Yang, Yintang
2018-06-21
With the development of electronic information technology, electronic medical records (EMRs) have become a common way to store patients' data in hospitals. They are stored in different hospitals' databases, even for the same patient. Therefore, it is difficult to construct a summarized EMR for one patient from multiple hospital databases due to security and privacy concerns. Meanwhile, current EMR systems lack a standard data management and sharing policy, making it difficult for pharmaceutical scientists to develop precise medicines based on data obtained under different policies. To solve the above problems, we propose a blockchain-based information management system, MedBlock, to handle patients' information. In this scheme, the distributed ledger of MedBlock allows efficient EMR access and retrieval. The improved consensus mechanism achieves consensus on EMRs without large energy consumption or network congestion. In addition, MedBlock exhibits high information security by combining customized access control protocols and symmetric cryptography. MedBlock can play an important role in the sharing of sensitive medical information.
The forest inventory and analysis database description and users manual version 1.0
Patrick D. Miles; Gary J. Brand; Carol L. Alerich; Larry F. Bednar; Sharon W. Woudenberg; Joseph F. Glover; Edward N. Ezell
2001-01-01
Describes the structure of the Forest Inventory and Analysis Database (FIADB) and provides information on generating estimates of forest statistics from these data. The FIADB structure provides a consistent framework for storing forest inventory data across all ownerships across the entire United States. These data are available to the public.
SwePep, a database designed for endogenous peptides and mass spectrometry.
Fälth, Maria; Sköld, Karl; Norrman, Mathias; Svensson, Marcus; Fenyö, David; Andren, Per E
2006-06-01
A new database, SwePep, specifically designed for endogenous peptides, has been constructed to significantly speed up the identification process from complex tissue samples utilizing mass spectrometry. In the identification process the experimental peptide masses are compared with the peptide masses stored in the database both with and without possible post-translational modifications. This intermediate identification step is fast and singles out peptides that are potential endogenous peptides and can later be confirmed with tandem mass spectrometry data. Successful applications of this methodology are presented. The SwePep database is a relational database developed using MySQL and Java. The database contains 4180 annotated endogenous peptides from different tissues originating from 394 different species as well as 50 novel peptides from brain tissue identified in our laboratory. Information about the peptides, including mass, isoelectric point, sequence, and precursor protein, is also stored in the database. This new approach holds great potential for removing the bottleneck that occurs during the identification process in the field of peptidomics. The SwePep database is available to the public.
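The mass-comparison step described above, matching an experimental peptide mass against stored masses with and without post-translational modifications, might look roughly like the sketch below. It is not SwePep's implementation: the peptide names and stored masses are invented, the 0.01 Da tolerance is an assumption, and only two common modifications (acetylation and phosphorylation, with their approximate monoisotopic mass shifts) are considered.

```python
# Hypothetical stored peptide masses in daltons (invented values).
STORED_MASSES = {
    "peptide_A": 1000.500,
    "peptide_B": 1500.750,
}

# Approximate monoisotopic mass shifts of two common modifications.
MODIFICATION_SHIFTS = {
    "none": 0.0,
    "acetyl": 42.0106,
    "phospho": 79.9663,
}


def candidates(observed_mass, tolerance=0.01):
    """Return (peptide, modification) pairs whose mass matches the observation.

    The tolerance is an assumed absolute window in daltons.
    """
    hits = []
    for name, mass in STORED_MASSES.items():
        for modification, shift in MODIFICATION_SHIFTS.items():
            if abs(mass + shift - observed_mass) <= tolerance:
                hits.append((name, modification))
    return hits


unmodified_hit = candidates(1000.505)
phospho_hit = candidates(1580.7163)
no_hit = candidates(1200.0)
```

In a relational setting the same comparison would typically be a range query over an indexed mass column. As the abstract notes, matches from this step are only candidates to be confirmed later with tandem mass spectrometry data.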
Web application for detailed real-time database transaction monitoring for CMS condition data
NASA Astrophysics Data System (ADS)
de Gruttola, Michele; Di Guida, Salvatore; Innocente, Vincenzo; Pierro, Antonio
2012-12-01
In the upcoming LHC era, databases have become an essential part of the experiments collecting data from the LHC, in order to safely store, and consistently retrieve, a large amount of data produced by different sources. In the CMS experiment at CERN, all this information is stored in ORACLE databases, allocated in several servers, both inside and outside the CERN network. In this scenario, the task of monitoring different databases is a crucial database administration issue, since different information may be required depending on different users' tasks such as data transfer, inspection, planning and security issues. We present here a web application based on a Python web framework and Python modules for data mining purposes. To customize the GUI we record traces of user interactions that are used to build use case models. In addition the application detects errors in database transactions (for example, a mistake made by a user, an application failure, an unexpected network shutdown or a Structured Query Language (SQL) statement error) and provides warning messages from the different users' perspectives. Finally, in order to fulfill the requirements of the CMS experiment community, and to meet new developments in many Web client tools, our application was further developed, and new features were deployed.
A web-based, relational database for studying glaciers in the Italian Alps
NASA Astrophysics Data System (ADS)
Nigrelli, G.; Chiarle, M.; Nuzzi, A.; Perotti, L.; Torta, G.; Giardino, M.
2013-02-01
Glaciers are among the best terrestrial indicators of climate change and thus glacier inventories have attracted a growing, worldwide interest in recent years. In Italy, the first official glacier inventory was completed in 1925 and 774 glacial bodies were identified. As the amount of data continues to increase, and new techniques become available, there is a growing demand for computer tools that can efficiently manage the collected data. The Research Institute for Geo-hydrological Protection of the National Research Council, in cooperation with the Departments of Computer Science and Earth Sciences of the University of Turin, created a database that provides a modern tool for storing, processing and sharing glaciological data. The database was developed according to the need to store heterogeneous information, which can be retrieved through a set of web search queries. The database's architecture is server-side, and was designed using open-source software. The website interface, simple and intuitive, was intended to meet the needs of a distributed public: through this interface, any type of glaciological data can be managed, specific queries can be performed, and the results can be exported in a standard format. The use of a relational database to store and organize a large variety of information about Italian glaciers collected over the last hundred years constitutes a significant step forward in ensuring the safety and accessibility of such data. Moreover, the same benefits also apply to the enhanced operability for handling information in the future, including new and emerging types of data formats, such as geographic and multimedia files. Future developments include the integration of cartographic data, such as base maps, satellite images and vector data.
The relational database described in this paper will be the heart of a new geographic system that will merge data, data attributes and maps, leading to a complete description of Italian glacial environments.
USDA-ARS's Scientific Manuscript database
Many specialty foods cannot be found in research-focused food databases. However, some nutrient data can be found for many of these foods through individual website searches using brand and store names. Some popular diet-tracking websites contain data for over 3 million foods, data often entered by ...
Stahl, Olivier; Duvergey, Hugo; Guille, Arnaud; Blondin, Fanny; Vecchio, Alexandre Del; Finetti, Pascal; Granjeaud, Samuel; Vigy, Oana; Bidaut, Ghislain
2013-06-06
With the advance of post-genomic technologies, the need for tools to manage large scale data in biology becomes more pressing. This involves annotating and storing data securely, as well as granting permissions flexibly with several technologies (all array types, flow cytometry, proteomics) for collaborative work and data sharing. This task is not easily achieved with most systems available today. We developed Djeen (Database for Joomla!'s Extensible Engine), a new Research Information Management System (RIMS) for collaborative projects. Djeen is a user-friendly application, designed to streamline data storage and annotation collaboratively. Its database model, kept simple, is compliant with most technologies and allows storing and managing of heterogeneous data with the same system. Advanced permissions are managed through different roles. Templates allow Minimum Information (MI) compliance. Djeen allows managing projects associated with heterogeneous data types while enforcing annotation integrity and minimum information. Projects are managed within a hierarchy and user permissions are fine-grained for each project, user and group. Djeen Component source code (version 1.5.1) and installation documentation are available under CeCILL license from http://sourceforge.net/projects/djeen/files and supplementary material.
2013-01-01
Background With the advance of post-genomic technologies, the need for tools to manage large scale data in biology becomes more pressing. This involves annotating and storing data securely, as well as granting permissions flexibly with several technologies (all array types, flow cytometry, proteomics) for collaborative work and data sharing. This task is not easily achieved with most systems available today. Findings We developed Djeen (Database for Joomla!’s Extensible Engine), a new Research Information Management System (RIMS) for collaborative projects. Djeen is a user-friendly application, designed to streamline data storage and annotation collaboratively. Its database model, kept simple, is compliant with most technologies and allows storing and managing of heterogeneous data with the same system. Advanced permissions are managed through different roles. Templates allow Minimum Information (MI) compliance. Conclusion Djeen allows managing projects associated with heterogeneous data types while enforcing annotation integrity and minimum information. Projects are managed within a hierarchy and user permissions are fine-grained for each project, user and group. Djeen Component source code (version 1.5.1) and installation documentation are available under CeCILL license from http://sourceforge.net/projects/djeen/files and supplementary material. PMID:23742665
Ackerman, Katherine V.; Mixon, David M.; Sundquist, Eric T.; Stallard, Robert F.; Schwarz, Gregory E.; Stewart, David W.
2009-01-01
The Reservoir Sedimentation Survey Information System (RESIS) database, originally compiled by the Soil Conservation Service (now the Natural Resources Conservation Service) in collaboration with the Texas Agricultural Experiment Station, is the most comprehensive compilation of data from reservoir sedimentation surveys throughout the conterminous United States (U.S.). The database is a cumulative historical archive that includes data from as early as 1755 and as late as 1993. The 1,823 reservoirs included in the database range in size from farm ponds to the largest U.S. reservoirs (such as Lake Mead). Results from 6,617 bathymetric surveys are available in the database. This Data Series provides an improved version of the original RESIS database, termed RESIS-II, and a report describing RESIS-II. The RESIS-II relational database is stored in Microsoft Access and includes more precise location coordinates for most of the reservoirs than the original database but excludes information on reservoir ownership. RESIS-II is anticipated to be a template for further improvements in the database.
Sánchez-de-Madariaga, Ricardo; Muñoz, Adolfo; Castro, Antonio L; Moreno, Oscar; Pascual, Mario
2018-01-01
This research shows a protocol to assess the computational complexity of querying relational and non-relational (NoSQL (not only Structured Query Language)) standardized electronic health record (EHR) medical information database systems (DBMS). It uses a set of three doubling-sized databases, i.e. databases storing 5000, 10,000 and 20,000 realistic standardized EHR extracts, in three different database management systems (DBMS): relational MySQL object-relational mapping (ORM), document-based NoSQL MongoDB, and native extensible markup language (XML) NoSQL eXist. The average response times to six complexity-increasing queries were computed, and the results showed a linear behavior in the NoSQL cases. In the NoSQL field, MongoDB presents a much flatter linear slope than eXist. NoSQL systems may also be more appropriate to maintain standardized medical information systems due to the special nature of the updating policies of medical information, which should not affect the consistency and efficiency of the data stored in NoSQL databases. One limitation of this protocol is the lack of direct results of improved relational systems such as archetype relational mapping (ARM) with the same data. However, the interpolation of doubling-size database results to those presented in the literature and other published results suggests that NoSQL systems might be more appropriate in many specific scenarios and problems to be solved. For example, NoSQL may be appropriate for document-based tasks such as EHR extracts used in clinical practice, or edition and visualization, or situations where the aim is not only to query medical information, but also to restore the EHR in exactly its original form. PMID:29608174
Sánchez-de-Madariaga, Ricardo; Muñoz, Adolfo; Castro, Antonio L; Moreno, Oscar; Pascual, Mario
2018-03-19
This research shows a protocol to assess the computational complexity of querying relational and non-relational (NoSQL (not only Structured Query Language)) standardized electronic health record (EHR) medical information database systems (DBMS). It uses a set of three doubling-sized databases, i.e. databases storing 5000, 10,000 and 20,000 realistic standardized EHR extracts, in three different database management systems (DBMS): relational MySQL object-relational mapping (ORM), document-based NoSQL MongoDB, and native extensible markup language (XML) NoSQL eXist. The average response times to six complexity-increasing queries were computed, and the results showed a linear behavior in the NoSQL cases. In the NoSQL field, MongoDB presents a much flatter linear slope than eXist. NoSQL systems may also be more appropriate to maintain standardized medical information systems due to the special nature of the updating policies of medical information, which should not affect the consistency and efficiency of the data stored in NoSQL databases. One limitation of this protocol is the lack of direct results of improved relational systems such as archetype relational mapping (ARM) with the same data. However, the interpolation of doubling-size database results to those presented in the literature and other published results suggests that NoSQL systems might be more appropriate in many specific scenarios and problems to be solved. For example, NoSQL may be appropriate for document-based tasks such as EHR extracts used in clinical practice, or edition and visualization, or situations where the aim is not only to query medical information, but also to restore the EHR in exactly its original form.
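The doubling-size protocol described above lends itself to a simple check: if response time grows linearly with database size, a least-squares line fits the three points well and each doubling of size roughly doubles the time. The sketch below shows that arithmetic only. The timings are made-up placeholders chosen to be exactly linear; they are not measurements from the study, and the function name fit_line is an assumption.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Database sizes from the protocol: 5000, 10,000 and 20,000 EHR extracts.
sizes = [5000, 10000, 20000]

# Placeholder response times in seconds (invented, NOT the paper's data).
times = [0.10, 0.20, 0.40]

slope, intercept = fit_line(sizes, times)

# Ratio of consecutive timings; values near 2 are what linear growth
# through the origin would produce when the size doubles.
doubling_ratios = [later / earlier for earlier, later in zip(times, times[1:])]
```

A flatter slope, as the abstract reports for MongoDB relative to eXist, means each additional extract adds less to the response time. With real measurements the ratios would only approximate 2, and a nonzero intercept would indicate a fixed per-query overhead.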
Zhulin, Igor B.
2015-05-26
Databases play an increasingly important role in biology. They archive, store, maintain, and share information on genes, genomes, expression data, protein sequences and structures, metabolites and reactions, interactions, and pathways. All these data are critically important to microbiologists. Furthermore, microbiology has its own databases that deal with model microorganisms, microbial diversity, physiology, and pathogenesis. Thousands of biological databases are currently available, and it becomes increasingly difficult to keep up with their development. The purpose of this minireview is to provide a brief survey of current databases that are of interest to microbiologists.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhulin, Igor B.
Databases play an increasingly important role in biology. They archive, store, maintain, and share information on genes, genomes, expression data, protein sequences and structures, metabolites and reactions, interactions, and pathways. All these data are critically important to microbiologists. Furthermore, microbiology has its own databases that deal with model microorganisms, microbial diversity, physiology, and pathogenesis. Thousands of biological databases are currently available, and it becomes increasingly difficult to keep up with their development. The purpose of this minireview is to provide a brief survey of current databases that are of interest to microbiologists.
2015-01-01
Databases play an increasingly important role in biology. They archive, store, maintain, and share information on genes, genomes, expression data, protein sequences and structures, metabolites and reactions, interactions, and pathways. All these data are critically important to microbiologists. Furthermore, microbiology has its own databases that deal with model microorganisms, microbial diversity, physiology, and pathogenesis. Thousands of biological databases are currently available, and it becomes increasingly difficult to keep up with their development. The purpose of this minireview is to provide a brief survey of current databases that are of interest to microbiologists. PMID:26013493
A rudimentary database for three-dimensional objects using structural representation
NASA Technical Reports Server (NTRS)
Sowers, James P.
1987-01-01
A database which enables users to store and share the description of three-dimensional objects in a research environment is presented. The main objective of the design is to make it a compact structure that holds sufficient information to reconstruct the object. The database design is based on an object representation scheme which is information preserving, reasonably efficient, and yet economical in terms of the storage requirement. The determination of the needed data for the reconstruction process is guided by the belief that it is faster to do simple computations to generate needed data/information for construction than to retrieve everything from memory. Some recent techniques of three-dimensional representation that influenced the design of the database are discussed. The schema for the database and the structural definition used to define an object are given. The user manual for the software developed to create and maintain the contents of the database is included.
NASA Astrophysics Data System (ADS)
Gebhardt, Steffen; Wehrmann, Thilo; Klinger, Verena; Schettler, Ingo; Huth, Juliane; Künzer, Claudia; Dech, Stefan
2010-10-01
The German-Vietnamese water-related information system for the Mekong Delta (WISDOM) project supports business processes in Integrated Water Resources Management in Vietnam. Multiple disciplines bring together earth and ground based observation themes, such as environmental monitoring, water management, demographics, economy, information technology, and infrastructural systems. This paper introduces the components of the web-based WISDOM system including data, logic and presentation tier. It focuses on the data models upon which the database management system is built, including techniques for tagging or linking metadata with the stored information. The model also uses ordered groupings of spatial, thematic and temporal reference objects to semantically tag datasets to enable fast data retrieval, such as finding all data in a specific administrative unit belonging to a specific theme. A spatial database extension is employed by the PostgreSQL database. This object-oriented database was chosen over a relational database to tag spatial objects to tabular data, improving the retrieval of census and observational data at regional, provincial, and local areas. While the spatial database hinders processing raster data, a "work-around" was built into WISDOM to permit efficient management of both raster and vector data. The data model also incorporates styling aspects of the spatial datasets through styled layer descriptions (SLD) and web mapping service (WMS) layer specifications, allowing retrieval of rendered maps. Metadata elements of the spatial data are based on the ISO19115 standard. XML structured information of the SLD and metadata are stored in an XML database. The data models and the data management system are robust for managing the large quantity of spatial objects, sensor observations, census and document data. 
The operational WISDOM information system prototype contains modules for data management, automatic data integration, and web services for data retrieval, analysis, and distribution. The graphical user interfaces facilitate metadata cataloguing, data warehousing, web sensor data analysis and thematic mapping.
VAS: A Vision Advisor System combining agents and object-oriented databases
NASA Technical Reports Server (NTRS)
Eilbert, James L.; Lim, William; Mendelsohn, Jay; Braun, Ron; Yearwood, Michael
1994-01-01
A model-based approach to identifying and finding the orientation of non-overlapping parts on a tray has been developed. The part models contain both exact and fuzzy descriptions of part features, and are stored in an object-oriented database. Full identification of the parts involves several interacting tasks, each of which is handled by a distinct agent. Using fuzzy information stored in the model allowed part features that were essentially at the noise level to be extracted and used for identification. This was done by focusing attention on the portion of the part where the feature must be found if the current hypothesis of the part ID is correct. In going from one set of parts to another the only thing that needs to be changed is the database of part models. This work is part of an effort in developing a Vision Advisor System (VAS) that combines agents and object-oriented databases.
Region 7 Laboratory Information Management System
This is metadata documentation for the Region 7 Laboratory Information Management System (R7LIMS) which maintains records for the Regional Laboratory. Any Laboratory analytical work performed is stored in this system which replaces LIMS-Lite, and before that LAST. The EPA and its contractors may use this database. The Office of Policy & Management (PLMG) Division at EPA Region 7 is the primary managing entity; contractors can access this database but it is not accessible to the public.
Correspondence: World Wide Web access to the British Universities Human Embryo Database
AITON, JAMES F.; MCDONOUGH, ARIANA; MCLACHLAN, JOHN C.; SMART, STEVEN D.; WHITEN, SUSAN C.
1997-01-01
The British Universities Human Embryo Database has been created by merging information from the Walmsley Collection of Human Embryos at the School of Biological and Medical Sciences, University of St Andrews and from the Boyd Collection of Human Embryos at the Department of Anatomy, University of Cambridge. The database has been made available electronically on the Internet and World Wide Web browsers can be used to implement interactive access to the information stored in the British Universities Human Embryo Database. The database can, therefore, be accessed and searched from remote sites and specific embryos can be identified in terms of their location, age, developmental stage, plane of section, staining technique, and other parameters. It is intended to add information from other similar collections in the UK as it becomes available. PMID:9034891
Multiresource inventories incorporating GIS, GPS, and database management systems
Loukas G. Arvanitis; Balaji Ramachandran; Daniel P. Brackett; Hesham Abd-El Rasol; Xuesong Du
2000-01-01
Large-scale natural resource inventories generate enormous data sets. Their effective handling requires a sophisticated database management system. Such a system must be robust enough to efficiently store large amounts of data and flexible enough to allow users to manipulate a wide variety of information. In a pilot project, related to a multiresource inventory of the...
JANE, A new information retrieval system for the Radiation Shielding Information Center
DOE Office of Scientific and Technical Information (OSTI.GOV)
Trubey, D.K.
A new information storage and retrieval system has been developed for the Radiation Shielding Information Center (RSIC) at Oak Ridge National Laboratory to replace mainframe systems that have become obsolete. The database contains citations and abstracts of literature which were selected by RSIC analysts and indexed with terms from a controlled vocabulary. The database, begun in 1963, has been maintained continuously since that time. The new system, called JANE, incorporates automatic indexing techniques and on-line retrieval using the RSIC Data General Eclipse MV/4000 minicomputer. Automatic indexing and retrieval techniques based on fuzzy-set theory allow the presentation of results in order of Retrieval Status Value. The fuzzy-set membership function depends on term frequency in the titles and abstracts and on Term Discrimination Values which indicate the resolving power of the individual terms. These values are determined by the Cover Coefficient method. The use of a commercial database to store and retrieve the indexing information permits rapid retrieval of the stored documents. Comparisons of the new and presently-used systems for actual searches of the literature indicate that it is practical to replace the mainframe systems with a minicomputer system similar to the present version of JANE. 18 refs., 10 figs.
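Ranking by a Retrieval Status Value can be sketched as follows. The abstract does not give JANE's actual membership function, so the form used here (relative term frequency multiplied by a discrimination value, capped at 1, and combined across query terms with a fuzzy OR) is an assumption made purely for illustration. All function names and sample values are hypothetical.

```python
def membership(term, doc_terms, discrimination):
    """Assumed fuzzy membership of a document in the set for `term`, in [0, 1].

    Relative term frequency is scaled by the term's discrimination value.
    This is an illustrative choice, not the function used by JANE.
    """
    if not doc_terms:
        return 0.0
    tf = doc_terms.count(term) / len(doc_terms)
    return min(1.0, tf * discrimination.get(term, 0.0))


def retrieval_status_value(query_terms, doc_terms, discrimination):
    """Combine per-term memberships with a fuzzy OR (the maximum)."""
    return max(
        (membership(t, doc_terms, discrimination) for t in query_terms),
        default=0.0,
    )


def rank(query_terms, docs, discrimination):
    """Return document ids ordered by decreasing retrieval status value."""
    scored = [
        (retrieval_status_value(query_terms, terms, discrimination), doc_id)
        for doc_id, terms in docs.items()
    ]
    # Sort by score descending, then by id so that ties are deterministic.
    return [doc_id for _, doc_id in sorted(scored, key=lambda p: (-p[0], p[1]))]


# Made-up index terms and discrimination values.
docs = {
    "A": ["neutron", "shielding", "concrete", "shielding"],
    "B": ["gamma", "dose", "shielding", "measurement"],
    "C": ["reactor", "fuel", "cycle", "cost"],
}
discrimination = {"shielding": 1.0, "neutron": 2.0, "gamma": 2.0}

print(rank(["neutron", "shielding"], docs, discrimination))
```

A term that appears in nearly every document would carry a low discrimination value and so contribute little to the ranking, which is the role the abstract assigns to the Term Discrimination Values.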
Remote sensing applied to resource management
Henry M. Lachowski
1998-01-01
Effective management of forest resources requires access to current and consistent geospatial information that can be shared by resource managers and the public. Geospatial information describing our land and natural resources comes from many sources and is most effective when stored in a geospatial database and used in a geographic information system (GIS). The...
MaizeGDB: The Maize Genetics and Genomics Database.
Harper, Lisa; Gardiner, Jack; Andorf, Carson; Lawrence, Carolyn J
2016-01-01
MaizeGDB is the community database for biological information about the crop plant Zea mays. Genomic, genetic, sequence, gene product, functional characterization, literature reference, and person/organization contact information are among the datatypes stored at MaizeGDB. At the project's website ( http://www.maizegdb.org ) are custom interfaces enabling researchers to browse data and to seek out specific information matching explicit search criteria. In addition, pre-compiled reports are made available for particular types of data and bulletin boards are provided to facilitate communication and coordination among members of the community of maize geneticists.
CyBy(2): a structure-based data management tool for chemical and biological data.
Höck, Stefan; Riedl, Rainer
2012-01-01
We report the development of a powerful data management tool for chemical and biological data: CyBy(2). CyBy(2) is a structure-based information management tool used to store and visualize structural data alongside additional information such as project assignment, physical information, spectroscopic data, biological activity, functional data and synthetic procedures. The application consists of a database, an application server, used to query and update the database, and a client application with a rich graphical user interface (GUI) used to interact with the server.
A novel data storage logic in the cloud
Mátyás, Bence; Szarka, Máté; Járvás, Gábor; Kusper, Gábor; Argay, István; Fialowski, Alice
2016-01-01
Databases which store and manage long-term scientific information related to life science are used to store huge amount of quantitative attributes. Introduction of a new entity attribute requires modification of the existing data tables and the programs that use these data tables. The solution is increasing the virtual data tables while the number of screens remains the same. The main objective of the present study was to introduce a logic called Joker Tao (JT) which provides universal data storage for cloud-based databases. It means all types of input data can be interpreted as an entity and attribute at the same time, in the same data table. PMID:29026521
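The problem described above, that a new attribute normally forces a change to the table definition, is commonly avoided with a single table of (entity, attribute, value) rows. The sketch below shows that generic layout with Python's built-in `sqlite3` module. It is not the Joker Tao logic itself, whose details the abstract does not give. The table name `facts` and the helper functions are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One table holds every attribute of every entity as a separate row.
conn.execute(
    "CREATE TABLE facts ("
    " entity TEXT NOT NULL,"
    " attribute TEXT NOT NULL,"
    " value TEXT,"
    " PRIMARY KEY (entity, attribute))"
)


def set_attribute(entity, attribute, value):
    """Store or overwrite one attribute of one entity."""
    conn.execute(
        "INSERT OR REPLACE INTO facts (entity, attribute, value) VALUES (?, ?, ?)",
        (entity, attribute, str(value)),
    )


def get_entity(entity):
    """Return all attributes of an entity as a dict of strings."""
    rows = conn.execute(
        "SELECT attribute, value FROM facts WHERE entity = ? ORDER BY attribute",
        (entity,),
    )
    return dict(rows.fetchall())


set_attribute("sample_1", "ph", 6.8)
set_attribute("sample_1", "depth_cm", 30)
# A new kind of attribute needs no ALTER TABLE, only another row.
set_attribute("sample_1", "nitrate_mg_l", 12.5)

print(get_entity("sample_1"))
```

The trade-off of this layout is that values lose their native types and per-column constraints, so validation has to move into the application.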
A pilot GIS database of active faults of Mt. Etna (Sicily): A tool for integrated hazard evaluation
NASA Astrophysics Data System (ADS)
Barreca, Giovanni; Bonforte, Alessandro; Neri, Marco
2013-02-01
A pilot GIS-based system has been implemented for the assessment and analysis of hazard related to active faults affecting the eastern and southern flanks of Mt. Etna. The system structure was developed in the ArcGis® environment and consists of different thematic datasets that include spatially-referenced arc-features and an associated database. Arc-type features, georeferenced into WGS84 Ellipsoid UTM zone 33 Projection, represent the five main fault systems that develop in the analysed region. The backbone of the GIS-based system is constituted by the large amount of information which was collected from the literature and then stored and properly geocoded in a digital database. This consists of thirty-five alphanumeric fields which include all fault parameters available from the literature such as location, kinematics, landform, slip rate, etc. Although the system has been implemented according to the most common procedures used by GIS developers, the architecture and content of the database represent a pilot backbone for digital storing of fault parameters, providing a powerful tool in modelling hazard related to the active tectonics of Mt. Etna. The database collects, organises and shares all currently available scientific information about the active faults of the volcano. Furthermore, thanks to the strong effort spent on defining the fields of the database, the structure proposed in this paper is open to the collection of further data coming from future improvements in the knowledge of the fault systems. By layering additional user-specific geographic information and managing the proposed database (topological querying) a great diversity of hazard and vulnerability maps can be produced by the user. This is a proposal of a backbone for a comprehensive geographical database of fault systems, universally applicable to other sites.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gaponov, Yu.A.; Igarashi, N.; Hiraki, M.
2004-05-12
An integrated controlling system and a unified database for high throughput protein crystallography experiments have been developed. Main features of protein crystallography experiments (purification, crystallization, crystal harvesting, data collection, data processing) were integrated into the software under development. All information necessary to perform protein crystallography experiments is stored (except raw X-ray data that are stored in a central data server) in a MySQL relational database. The database contains four mutually linked hierarchical trees describing protein crystals, data collection of protein crystal and experimental data processing. A database editor was designed and developed. The editor supports basic database functions to view, create, modify and delete user records in the database. Two search engines were realized: direct search of necessary information in the database and object oriented search. The system is based on TCP/IP secure UNIX sockets with four predefined sending and receiving behaviors, which support communications between all connected servers and clients with remote control functions (creating and modifying data for experimental conditions, data acquisition, viewing experimental data, and performing data processing). Two secure login schemes were designed and developed: a direct method (using the developed Linux clients with secure connection) and an indirect method (using the secure SSL connection using secure X11 support from any operating system with X-terminal and SSH support). A part of the system has been implemented on a new MAD beam line, NW12, at the Photon Factory Advanced Ring for general user experiments.
BIRS – Bioterrorism Information Retrieval System
Tewari, Ashish Kumar; Rashi; Wadhwa, Gulshan; Sharma, Sanjeev Kumar; Jain, Chakresh Kumar
2013-01-01
Bioterrorism is the intentional use of pathogenic strains of microbes to spread terror in a population. There is a definite need to promote research for development of vaccines, therapeutics and diagnostic methods as a part of preparedness for any bioterror attack in the future. BIRS is an open-access database of collective information on the organisms related to bioterrorism. The database architecture uses current open-source technology, viz. PHP version 5.3.19, MySQL and IIS server on the Windows platform. The database stores literature, generic information and unique pathways for about 10 microorganisms involved in bioterrorism. This may serve as a collective repository to accelerate the drug discovery and vaccine design process against such bioterrorist agents (microbes). The available data has been validated against various online resources and by literature mining in order to provide the user with a comprehensive information system. Availability: the database is freely available at http://www.bioterrorism.biowaves.org PMID:23390356
Drainage identification analysis and mapping, phase 2.
DOT National Transportation Integrated Search
2017-01-01
Drainage Identification, Analysis and Mapping System (DIAMS) is a computerized database that captures and stores relevant information associated with all aboveground and underground hydraulic structures belonging to the New Jersey Department of T...
cPath: open source software for collecting, storing, and querying biological pathways.
Cerami, Ethan G; Bader, Gary D; Gross, Benjamin E; Sander, Chris
2006-11-13
Biological pathways, including metabolic pathways, protein interaction networks, signal transduction pathways, and gene regulatory networks, are currently represented in over 220 diverse databases. These data are crucial for the study of specific biological processes, including human diseases. Standard exchange formats for pathway information, such as BioPAX, CellML, SBML and PSI-MI, enable convenient collection of this data for biological research, but mechanisms for common storage and communication are required. We have developed cPath, an open source database and web application for collecting, storing, and querying biological pathway data. cPath makes it easy to aggregate custom pathway data sets available in standard exchange formats from multiple databases, present pathway data to biologists via a customizable web interface, and export pathway data via a web service to third-party software, such as Cytoscape, for visualization and analysis. cPath is software only, and does not include new pathway information. Key features include: a built-in identifier mapping service for linking identical interactors and linking to external resources; built-in support for PSI-MI and BioPAX standard pathway exchange formats; a web service interface for searching and retrieving pathway data sets; and thorough documentation. The cPath software is freely available under the LGPL open source license for academic and commercial use. cPath is a robust, scalable, modular, professional-grade software platform for collecting, storing, and querying biological pathways. It can serve as the core data handling component in information systems for pathway visualization, analysis and modeling.
HBVPathDB: a database of HBV infection-related molecular interaction network.
Zhang, Yi; Bo, Xiao-Chen; Yang, Jing; Wang, Sheng-Qi
2005-03-21
This work aims to describe the interactions of molecules and genes between hepatitis B virus (HBV) and its host, in order to understand how the genes and molecules of virus and host are networked to form a biological system and to clarify the mechanism of HBV infection. The knowledge of HBV infection-related reactions was organized into various kinds of pathways with carefully drawn graphs in HBVPathDB. Pathway information is stored in a relational database management system (DBMS), which is currently the most efficient way to manage large amounts of data, and queries are implemented in Structured Query Language (SQL). The search engine is written in Personal Home Page (PHP) with embedded SQL, and a web retrieval interface for searching is developed in Hypertext Markup Language (HTML). We present the first version of HBVPathDB, an HBV infection-related molecular interaction network database composed of 306 pathways involving 1,050 molecules. With carefully drawn graphs, pathway information stored in HBVPathDB can be browsed in an intuitive way. We developed an easy-to-use interface for flexible access to the details of the database. Convenient software is implemented to query and browse the pathway information of HBVPathDB. Four search page layout options (category search, gene search, description search and unitized search) are supported by the search engine of the database. The database is freely available at http://www.bio-inf.net/HBVPathDB/HBV/. HBVPathDB already contains a considerable amount of HBV infection-related pathway information, which is suitable for in-depth analysis of the molecular interaction network of virus and host. HBVPathDB integrates pathway datasets with convenient software for query, browsing and visualization, which gives users more opportunity to identify key regulatory molecules as potential drug targets and to explore the possible mechanism of HBV infection based on gene expression datasets.
D'Antonio, Matteo; Masseroli, Marco
2009-01-01
Background Alternative splicing has been demonstrated to affect most human genes; different isoforms from the same gene encode proteins which differ in a limited number of residues, thus yielding similar structures. This suggests possible correlations between alternative splicing and protein structure. In order to support the investigation of such relationships, we have developed the Alternative Splicing and Protein Structure Scrutinizer (PASS), a Web application to automatically extract, integrate and analyze human alternative splicing and protein structure data sparsely available in the Alternative Splicing Database, Ensembl databank and Protein Data Bank. Primary data from these databases have been integrated and analyzed using the Protein Identifier Cross-Reference, BLAST, CLUSTALW and FeatureMap3D software tools. Results A database has been developed to store the considered primary data and the results from their analysis; a system of Perl scripts has been implemented to automatically create and update the database and analyze the integrated data; a Web interface has been implemented to make the analyses easily accessible; a database has been created to manage user accesses to the PASS Web application and store users' data and searches. Conclusion PASS automatically integrates data from the Alternative Splicing Database with protein structure data from the Protein Data Bank. Additionally, it comprehensively analyzes the integrated data with publicly available well-known bioinformatics tools in order to generate structural information of isoform pairs. Further analysis of such valuable information might reveal interesting relationships between alternative splicing and protein structure differences, which may be significantly associated with different functions. PMID:19828075
Kentala, E; Pyykkö, I; Auramo, Y; Juhola, M
1995-03-01
An interactive database has been developed to assist the diagnostic procedure for vertigo and to store the data. The database offers a possibility to split and reunite the collected information when needed. It contains detailed information about a patient's history, symptoms, and findings in otoneurologic, audiologic, and imaging tests. The symptoms are classified into sets of questions on vertigo (including postural instability), hearing loss and tinnitus, and provoking factors. Confounding disorders are screened. The otoneurologic tests involve saccades, smooth pursuit, posturography, and a caloric test. In addition, findings from specific antibody tests, clinical neurotologic tests, magnetic resonance imaging, brain stem audiometry, and electrocochleography are included. The input information can be applied to workups for vertigo in an expert system called ONE. The database assists its user in that the input of information is easy. It not only can be used for diagnostic purposes but is also beneficial for research, and in combination with the expert system, it provides a tutorial guide for medical students.
Software support for Huntington's disease research.
Conneally, P. M.; Gersting, J. M.; Gray, J. M.; Beidleman, K.; Wexler, N. S.; Smith, C. L.
1991-01-01
Huntington's disease (HD) is a hereditary disorder involving the central nervous system. Its effects are devastating, to the affected person as well as his family. The Department of Medical and Molecular Genetics at Indiana University (IU) plays an integral part in Huntington's research by providing computerized repositories of HD family information for researchers and families. The National Huntington's Disease Research Roster, founded in 1979 at IU, and the Huntington's Disease in Venezuela Project database contain information that has proven to be invaluable in the worldwide field of HD research. This paper addresses the types of information stored in each database, the pedigree database program (MEGADATS) used to manage the data, and significant findings that have resulted from access to the data. PMID:1839672
DOE Office of Scientific and Technical Information (OSTI.GOV)
The system is developed to collect, process, store and present the information provided by the radio frequency identification (RFID) devices. The system contains three parts: the application software, the database and the web page. The application software manages multiple RFID devices, such as readers and portals, simultaneously. It communicates with the devices through the application programming interface (API) provided by the device vendor. The application software converts data collected by the RFID readers and portals to readable information. It is capable of encrypting data using the 256-bit Advanced Encryption Standard (AES). The application software has a graphical user interface (GUI). The GUI mimics the configurations of the nuclear material storage sites or transport vehicles. The GUI gives the user and system administrator an intuitive way to read the information and/or configure the devices. The application software is capable of sending the information to a remote, dedicated and secured web and database server. Two captured screen samples, one for storage and one for transport, are attached. The database is constructed to handle a large number of RFID tag readers and portals. A SQL server is employed for this purpose. An XML script is used to update the database once the information is sent from the application software. The design of the web page imitates the design of the application software. The web page retrieves data from the database and presents it in different panels. The user needs a user name combined with a password to access the web page. The web page is capable of sending e-mail and text messages based on preset criteria, such as when alarm thresholds are exceeded. A captured screen sample is attached. The application software is designed to be installed on a local computer. The local computer is directly connected to the RFID devices and can be controlled locally or remotely.
There are multiple local computers managing different sites or transport vehicles. Control from remote sites and transmission of information to a central database server take place over secured internet connections. The information stored in the central database server is shown on the web page. The users can view the web page on the internet. A dedicated and secured web and database server (https) is used to provide information security.
Data base management system for lymphatic filariasis--a neglected tropical disease.
Upadhyayula, Suryanaryana Murty; Mutheneni, Srinivasa Rao; Kadiri, Madhusudhan Rao; Kumaraswamy, Sriram; Nelaturu, Sarat Chandra Babu
2012-01-01
Researchers working in the area of public health are confronted with large volumes of data on various aspects of entomology and epidemiology. Obtaining the relevant information from these data requires a dedicated database management system. In this paper, we describe the use of the database we developed on lymphatic filariasis. This database application is developed using the Model View Controller (MVC) architecture, with MySQL as the database and a web-based interface. We have collected and incorporated data on filariasis from the Karimnagar, Chittoor, East Godavari and West Godavari districts of Andhra Pradesh, India. The purpose of this database is to store the collected data, retrieve the information and produce various combinational reports on filarial aspects, which in turn will help public health officials to understand the burden of disease in a particular locality. This information is likely to play an important role in decision making for effective control of filarial disease and integrated vector management operations.
Emergency Response Notification System (ERNS)
The Emergency Response Notification System (ERNS) is a database used to store information on notifications of oil discharges and hazardous substances releases. The ERNS program is a cooperative data sharing effort among the Environmental Protection Agency (EPA) Headquarters, the ...
eCOMPAGT – efficient Combination and Management of Phenotypes and Genotypes for Genetic Epidemiology
Schönherr, Sebastian; Weißensteiner, Hansi; Coassin, Stefan; Specht, Günther; Kronenberg, Florian; Brandstätter, Anita
2009-01-01
Background High-throughput genotyping and phenotyping projects of large epidemiological study populations require sophisticated laboratory information management systems. Most epidemiological studies include subject-related personal information, which needs to be handled with care by following data privacy protection guidelines. In addition, genotyping core facilities handling cooperative projects require a straightforward solution to monitor the status and financial resources of the different projects. Description We developed a database system for an efficient combination and management of phenotypes and genotypes (eCOMPAGT) deriving from genetic epidemiological studies. eCOMPAGT securely stores and manages genotype and phenotype data and enables different user modes with different rights. Special attention was paid to the import of data deriving from TaqMan and SNPlex genotyping assays. However, the database solution is adjustable to other genotyping systems by programming additional interfaces. Further important features are the scalability of the database and an export interface to statistical software. Conclusion eCOMPAGT can store, administer and connect phenotype data with all kinds of genotype data and is available as a downloadable version at . PMID:19432954
On Study of Application of Big Data and Cloud Computing Technology in Smart Campus
NASA Astrophysics Data System (ADS)
Tang, Zijiao
2017-12-01
We live in an era of networks and information, in which we produce and confront large amounts of data every day. Traditional databases cannot easily store, process and analyze such mass data, and big data technology emerged to meet this need. The development and operation of big data in turn rest on cloud computing, which provides sufficient space and resources to process and analyze the data. The current proposal for smart campus construction aims at improving the process of informatization in colleges and universities. It is therefore necessary to consider bringing big data and cloud computing technology into the construction of the smart campus, so that the campus database system and the campus management system are combined rather than isolated, and so that smart campus construction is served by integrating, storing, processing and analyzing mass data.
NoSQL for Storage and Retrieval of Large LiDAR Data Collections
NASA Astrophysics Data System (ADS)
Boehm, J.; Liu, K.
2015-08-01
Developments in LiDAR technology over the past decades have made LiDAR a mature and widely accepted source of geospatial information. This in turn has led to an enormous growth in data volume. The central idea for a file-centric storage of LiDAR point clouds is the observation that large collections of LiDAR data are typically delivered as large collections of files, rather than single files of terabyte size. This split of the dataset, commonly referred to as tiling, was usually done to accommodate a specific processing pipeline. It therefore makes sense to preserve this split. A document-oriented NoSQL database can easily emulate this data partitioning by representing each tile (file) in a separate document. The document stores the metadata of the tile. The actual files are stored in a distributed file system emulated by the NoSQL database. We demonstrate the use of MongoDB, a highly scalable document-oriented NoSQL database, for storing large LiDAR files. MongoDB, like any NoSQL database, allows for queries on the attributes of the document. As a specialty, MongoDB also allows spatial queries. Hence we can perform spatial queries on the bounding boxes of the LiDAR tiles. The speed of inserting and retrieving files on a cloud-based database is compared to native file system and cloud storage transfer speeds.
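The one-document-per-tile idea, with spatial queries on tile bounding boxes, can be sketched in plain Python without a database server. The dictionaries below stand in for the documents a document store would hold, and the intersection test stands in for the spatial query that the database would run. The field names, file names and coordinates are hypothetical, and no MongoDB API is used or implied.

```python
def make_tile_document(filename, min_x, min_y, max_x, max_y, point_count):
    """Build the metadata document for one LiDAR tile (hypothetical fields)."""
    return {
        "filename": filename,
        "bbox": {"min_x": min_x, "min_y": min_y, "max_x": max_x, "max_y": max_y},
        "point_count": point_count,
    }


def bbox_intersects(a, b):
    """True if two axis-aligned boxes overlap or touch."""
    return (
        a["min_x"] <= b["max_x"]
        and a["max_x"] >= b["min_x"]
        and a["min_y"] <= b["max_y"]
        and a["max_y"] >= b["min_y"]
    )


def find_tiles(tiles, query_bbox):
    """Return the filenames of all tiles whose bounding box meets the query box."""
    return [t["filename"] for t in tiles if bbox_intersects(t["bbox"], query_bbox)]


# Made-up tiles on a 100 m grid.
tiles = [
    make_tile_document("tile_0_0.las", 0, 0, 100, 100, 1_200_000),
    make_tile_document("tile_1_0.las", 100, 0, 200, 100, 950_000),
    make_tile_document("tile_0_1.las", 0, 100, 100, 200, 1_050_000),
]

query = {"min_x": 150, "min_y": 10, "max_x": 180, "max_y": 50}
print(find_tiles(tiles, query))
```

Only the small metadata documents are examined by the query; the tile files themselves are fetched afterwards, which is what makes the file-centric split cheap to search.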
Software Architecture Evolution
2013-12-01
system’s major components occurring via a Java Message Service message bus [69]. This architecture was designed to promote loose coupling of software...play reconfiguration of the system. The components were Java-based and platform-independent; the interfaces by which they communicated were based on...The MPCS database, a MySQL database used for storing telemetry as well as some other information, such as logs and commanding data [68]. This
The Joy of Telecomputing: Everything You Need to Know about Going On-Line at Home.
ERIC Educational Resources Information Center
Pearlman, Dara
1984-01-01
Discusses advantages and pleasures of utilizing a personal computer at home to receive electronic mail; participate in online conferences, software exchanges, and game networks; do shopping and banking; and have access to databases storing volumes of information. Information sources for the services mentioned are included. (MBR)
Federal Register 2010, 2011, 2012, 2013, 2014
2012-06-19
... handling of such items throughout GSA's supply chain system. The information is used in GSA warehouses, stored in an NSN database and provided to GSA customers. Non-Collection and/or a less frequently... any of the following methods: Regulations.gov : http://www.regulations.gov . Submit comments via the...
Kilintzis, Vassilis; Beredimas, Nikolaos; Chouvarda, Ioanna
2014-01-01
An integral part of a system that manages medical data is the persistent storage engine. For almost twenty-five years Relational Database Management Systems (RDBMS) were considered the obvious decision, yet today new technologies have emerged that require our attention as possible alternatives. Triplestores store information in terms of RDF triples without necessarily binding to a specific predefined structural model. In this paper we present an attempt to compare the performance of the Apache JENA-Fuseki and Virtuoso Universal Server 6 triplestores with that of the MySQL 5.6 RDBMS for storing and retrieving medical information that is communicated as RDF/XML ontology instances over a RESTful web service. The results show that the performance, calculated as average time of storing and retrieving instances, is significantly better using Virtuoso Server while MySQL performed better than Fuseki.
cPath: open source software for collecting, storing, and querying biological pathways
Cerami, Ethan G; Bader, Gary D; Gross, Benjamin E; Sander, Chris
2006-01-01
Background Biological pathways, including metabolic pathways, protein interaction networks, signal transduction pathways, and gene regulatory networks, are currently represented in over 220 diverse databases. These data are crucial for the study of specific biological processes, including human diseases. Standard exchange formats for pathway information, such as BioPAX, CellML, SBML and PSI-MI, enable convenient collection of this data for biological research, but mechanisms for common storage and communication are required. Results We have developed cPath, an open source database and web application for collecting, storing, and querying biological pathway data. cPath makes it easy to aggregate custom pathway data sets available in standard exchange formats from multiple databases, present pathway data to biologists via a customizable web interface, and export pathway data via a web service to third-party software, such as Cytoscape, for visualization and analysis. cPath is software only, and does not include new pathway information. Key features include: a built-in identifier mapping service for linking identical interactors and linking to external resources; built-in support for PSI-MI and BioPAX standard pathway exchange formats; a web service interface for searching and retrieving pathway data sets; and thorough documentation. The cPath software is freely available under the LGPL open source license for academic and commercial use. Conclusion cPath is a robust, scalable, modular, professional-grade software platform for collecting, storing, and querying biological pathways. It can serve as the core data handling component in information systems for pathway visualization, analysis and modeling. PMID:17101041
LAND-deFeND - An innovative database structure for landslides and floods and their consequences.
Napolitano, Elisabetta; Marchesini, Ivan; Salvati, Paola; Donnini, Marco; Bianchi, Cinzia; Guzzetti, Fausto
2018-02-01
Information on historical landslides and floods - collectively called "geo-hydrological hazards" - is key to understanding the complex dynamics of the events, to estimating the temporal and spatial frequency of damaging events, and to quantifying their impact. A number of databases on geo-hydrological hazards and their consequences have been developed worldwide at different geographical and temporal scales. Of the few available database structures that can handle information on both landslides and floods, some are outdated and others were not designed to store, organize, and manage information on single phenomena or on the type and monetary value of the damages and the remediation actions. Here, we present the LANDslides and Floods National Database (LAND-deFeND), a new database structure able to store, organize, and manage in a single digital structure spatial information collected from various sources with different accuracy. In designing LAND-deFeND, we defined four groups of entities, namely: nature-related, human-related, geospatial-related, and information-source-related entities that collectively can fully describe the geo-hydrological hazards and their consequences. In LAND-deFeND, the main entities are the nature-related entities, encompassing: (i) the "phenomenon", a single landslide or local inundation, (ii) the "event", which represents the ensemble of the inundations and/or landslides that occurred in a conventional geographical area in a limited period, and (iii) the "trigger", which is the meteo-climatic or seismic cause of the geo-hydrological hazards. LAND-deFeND maintains the relations between the nature-related entities and the human-related entities even where the information is partially missing.
The physical model of LAND-deFeND contains 32 tables, including nine input tables, 21 dictionary tables, and two association tables, and ten views, including specific views that make the database structure compliant with the EC INSPIRE and Floods Directives. The LAND-deFeND database structure is open and freely available from http://geomorphology.irpi.cnr.it/tools. Copyright © 2017 The Authors. Published by Elsevier Ltd. All rights reserved.
Geotherm: the U.S. geological survey geothermal information system
Bliss, J.D.; Rapport, A.
1983-01-01
GEOTHERM is a comprehensive system of public databases and software used to store, locate, and evaluate information on the geology, geochemistry, and hydrology of geothermal systems. Three main databases address the general characteristics of geothermal wells and fields, and the chemical properties of geothermal fluids; the last database is currently the most active. System tasks are divided into four areas: (1) data acquisition and entry, involving data entry via word processors and magnetic tape; (2) quality assurance, including the criteria and standards handbook and front-end data-screening programs; (3) operation, involving database backups and information extraction; and (4) user assistance, preparation of such items as application programs, and a quarterly newsletter. The principal task of GEOTHERM is to provide information and research support for the conduct of national geothermal-resource assessments. The principal users of GEOTHERM are those involved with the Geothermal Research Program of the U.S. Geological Survey. Information in the system is available to the public on request. © 1983.
New Zealand's National Landslide Database
NASA Astrophysics Data System (ADS)
Rosser, B.; Dellow, S.; Haubrook, S.; Glassey, P.
2016-12-01
Since 1780, landslides have caused an average of about 3 deaths a year in New Zealand and have cost the economy an average of at least NZ$250M/a (0.1% GDP). To understand the risk posed by landslide hazards to society, a thorough knowledge of where, when and why different types of landslides occur is vital. The main objective for establishing the database was to provide a centralised national-scale, publicly available database to collate landslide information that could be used for landslide hazard and risk assessment. Design of a national landslide database for New Zealand required consideration of both existing landslide data stored in a variety of digital formats, and future data, yet to be collected. Pre-existing databases were developed and populated with data reflecting the needs of the landslide or hazard project, and the database structures of the time. Bringing these data into a single unified database required a new structure capable of storing and delivering data at a variety of scales and accuracy and with different attributes. A "unified data model" was developed to enable the database to hold old and new landslide data irrespective of scale and method of capture. The database contains information on landslide locations and where available: 1) the timing of landslides and the events that may have triggered them; 2) the type of landslide movement; 3) the volume and area; 4) the source and debris tail; and 5) the impacts caused by the landslide. Information from a variety of sources including aerial photographs (and other remotely sensed data), field reconnaissance and media accounts has been collated and is presented for each landslide along with metadata describing the data sources and quality. There are currently nearly 19,000 landslide records in the database that include point locations, polygons of landslide source and deposit areas, and linear features.
Several large datasets are awaiting upload which will bring the total number of landslides to over 100,000. The geo-spatial database is publicly available via the Internet. Software components, including the underlying database (PostGIS), Web Map Server (GeoServer) and web application use open-source software. The hope is that others will add relevant information to the database as well as download the data contained in it.
Classification of Chemicals Based On Structured Toxicity Information
Thirty years and millions of dollars worth of pesticide registration toxicity studies, historically stored as hardcopy and scanned documents, have been digitized into highly standardized and structured toxicity data within the Toxicity Reference Database (ToxRefDB). Toxicity-bas...
Price, Curtis V.; Maupin, Molly A.
2014-01-01
The purpose of this report is to document the PSDB and explain the methods used to populate and update the data from the SDWIS, State datasets, and map and geospatial imagery. This report describes 3 data tables and 11 domain tables, including field contents, data sources, and relations between tables. Although the PSDB database is not available to the general public, this information should be useful for others who are developing other database systems to store and analyze public-supply system and facility data.
Compilation of the data-base of the star catalogue by ADABAS.
NASA Astrophysics Data System (ADS)
Ishikawa, T.
A data-base of the FK4 Star Catalogue is compiled by using the HITAC M-280H in the Computer Center of Tokyo University and a commercial data-base management system (DBMS), ADABAS. The purpose of this attempt is to examine whether ADABAS, which can be regarded as representative of the currently available DBMSs developed mainly for business and information retrieval purposes, proves useful for handling mass numerical data such as star catalogue data. It is concluded that the data-base can indeed be a convenient way of storing and utilizing the star catalogue data.
Starbase Data Tables: An ASCII Relational Database for Unix
NASA Astrophysics Data System (ADS)
Roll, John
2011-11-01
Database management is an increasingly important part of astronomical data analysis. Astronomers need easy and convenient ways of storing, editing, filtering, and retrieving data about data. Commercial databases do not provide good solutions for many of the everyday and informal types of database access astronomers need. The Starbase database system with simple data file formatting rules and command line data operators has been created to answer this need. The system includes a complete set of relational and set operators, fast search/index and sorting operators, and many formatting and I/O operators. Special features are included to enhance the usefulness of the database when manipulating astronomical data. The software runs under UNIX, MSDOS and IRAF.
DeAngelo, Jacob
1983-01-01
GEOTHERM is a comprehensive system of public databases and software used to store, locate, and evaluate information on the geology, geochemistry, and hydrology of geothermal systems. Three main databases address the general characteristics of geothermal wells and fields, and the chemical properties of geothermal fluids; the last database is currently the most active. System tasks are divided into four areas: (1) data acquisition and entry, involving data entry via word processors and magnetic tape; (2) quality assurance, including the criteria and standards handbook and front-end data-screening programs; (3) operation, involving database backups and information extraction; and (4) user assistance, preparation of such items as application programs, and a quarterly newsletter. The principal task of GEOTHERM is to provide information and research support for the conduct of national geothermal-resource assessments. The principal users of GEOTHERM are those involved with the Geothermal Research Program of the U.S. Geological Survey.
Data Structures in Natural Computing: Databases as Weak or Strong Anticipatory Systems
NASA Astrophysics Data System (ADS)
Rossiter, B. N.; Heather, M. A.
2004-08-01
Information systems anticipate the real world. Classical databases store, organise and search collections of data of that real world but only as weak anticipatory information systems. This is because of the reductionism and normalisation needed to map the structuralism of natural data on to idealised machines with von Neumann architectures consisting of fixed instructions. Category theory developed as a formalism to explore the theoretical concept of naturality shows that methods like sketches arising from graph theory as only non-natural models of naturality cannot capture real-world structures for strong anticipatory information systems. Databases need a schema of the natural world. Natural computing databases need the schema itself to be also natural. Natural computing methods including neural computers, evolutionary automata, molecular and nanocomputing and quantum computation have the potential to be strong. At present they are mainly at the stage of weak anticipatory systems.
JNDMS Task Authorization 2 Report
2013-10-01
uses Barnyard to store alarms from all DREnet Snort sensors in a MySQL database. Barnyard is an open source tool designed to work with Snort to take...Technology ITI Information Technology Infrastructure J2EE Java 2 Enterprise Edition JAR Java Archive. This is an archive file format defined by Java ...standards. JDBC Java Database Connectivity JDW JNDMS Data Warehouse JNDMS Joint Network and Defence Management System JNDMS Joint Network Defence and
Integration and management of massive remote-sensing data based on GeoSOT subdivision model
NASA Astrophysics Data System (ADS)
Li, Shuang; Cheng, Chengqi; Chen, Bo; Meng, Li
2016-07-01
Owing to the rapid development of earth observation technology, the volume of spatial information is growing rapidly; therefore, improving query retrieval speed from large, rich data sources for remote-sensing data management systems is quite urgent. A global subdivision model, geographic coordinate subdivision grid with one-dimension integer coding on 2^n-tree (GeoSOT), which we propose as a solution, has been used in data management organizations. However, because a spatial object may cover several grids, considerable data redundancy will occur when data are stored in relational databases. To solve this redundancy problem, we first combined the subdivision model with the spatial array database containing the inverted index. We proposed an improved approach for integrating and managing massive remote-sensing data. By adding a spatial code column in an array format in a database, spatial information in remote-sensing metadata can be stored and logically subdivided. We implemented our method in a Kingbase Enterprise Server database system and compared the results with the Oracle platform by simulating worldwide image data. Experimental results showed that our approach performed better than Oracle in terms of data integration and time and space efficiency. Our approach also offers an efficient storage management system for existing storage centers and management systems.
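The idea of keeping all covering grid codes in one array column, plus an inverted index from code to record, can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the GeoSOT encoding itself: it uses a plain quadtree over longitude -180..180 and latitude -90..90 with bit-interleaved (Z-order) integer codes, and the image identifiers and extents are invented.

```python
from collections import defaultdict


def _cell_index(value, origin, extent, n):
    """Map a coordinate to a cell index in 0..n-1 along one axis."""
    return min(max(int((value - origin) / extent * n), 0), n - 1)


def _interleave(col, row, level):
    """Interleave column and row bits into a single integer cell code."""
    code = 0
    for bit in range(level):
        code |= ((col >> bit) & 1) << (2 * bit)
        code |= ((row >> bit) & 1) << (2 * bit + 1)
    return code


def covering_codes(west, south, east, north, level):
    """Return the sorted codes of all cells touched by a bounding box."""
    n = 1 << level
    c0 = _cell_index(west, -180.0, 360.0, n)
    c1 = _cell_index(east, -180.0, 360.0, n)
    r0 = _cell_index(south, -90.0, 180.0, n)
    r1 = _cell_index(north, -90.0, 180.0, n)
    return sorted(
        _interleave(c, r, level)
        for c in range(c0, c1 + 1)
        for r in range(r0, r1 + 1)
    )


# One metadata record per image; the array of codes replaces one row per cell.
images = {
    "img_a": covering_codes(-10.0, -10.0, 10.0, 10.0, level=2),
    "img_b": covering_codes(100.0, 20.0, 120.0, 40.0, level=2),
}

# Inverted index: cell code -> identifiers of images covering that cell.
inverted = defaultdict(set)
for image_id, codes in images.items():
    for code in codes:
        inverted[code].add(image_id)


def images_at(lon, lat, level=2):
    """Look up the images whose footprint touches the cell containing a point."""
    n = 1 << level
    code = _interleave(
        _cell_index(lon, -180.0, 360.0, n), _cell_index(lat, -90.0, 180.0, n), level
    )
    return sorted(inverted.get(code, set()))
```

In this sketch `img_a` spans four cells but is still a single record, which is the redundancy the abstract says a row-per-cell relational layout would introduce.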
Respiratory cancer database: An open access database of respiratory cancer gene and miRNA.
Choubey, Jyotsna; Choudhari, Jyoti Kant; Patel, Ashish; Verma, Mukesh Kumar
2017-01-01
Respiratory cancer database (RespCanDB) is a genomic and proteomic database of cancers of the respiratory organs. It also includes information on medicinal plants used for the treatment of various respiratory cancers, with the structures of their active constituents, as well as pharmacological and chemical information on drugs associated with various respiratory cancers. Data in RespCanDB have been manually collected from published research articles and from other databases. Data have been integrated using MySQL, a relational database management system. MySQL manages all data in the back-end and provides commands to retrieve and store the data in the database. The web interface of the database has been built in ASP. RespCanDB is expected to contribute to the scientific community's understanding of respiratory cancer biology as well as to the development of new ways of diagnosing and treating respiratory cancer. Currently, the database consists of the oncogenomic information of lung cancer, laryngeal cancer, and nasopharyngeal cancer. Data for other cancers, such as oral and tracheal cancers, will be added in the near future. The URL of RespCanDB is http://ridb.subdic-bioinformatics-nitrr.in/.
A probabilistic approach to information retrieval in heterogeneous databases
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chatterjee, A.; Segev, A.
During the past decade, organizations have increased their scope and operations beyond their traditional geographic boundaries. At the same time, they have adopted heterogeneous and incompatible information systems independent of each other without careful consideration that one day they may need to be integrated. As a result of this diversity, many important business applications today require access to data stored in multiple autonomous databases. This paper examines a problem of inter-database information retrieval in a heterogeneous environment, where conventional techniques are no longer efficient. To solve the problem, broader definitions for join, union, intersection and selection operators are proposed. Also, a probabilistic method to specify the selectivity of these operators is discussed. An algorithm to compute these probabilities is provided in pseudocode.
Functions and requirements document for interim store solidified high-level and transuranic waste
DOE Office of Scientific and Technical Information (OSTI.GOV)
Smith-Fewell, M.A., Westinghouse Hanford
1996-05-17
The functions, requirements, interfaces, and architectures contained within the Functions and Requirements (F&R) Document are based on the information currently contained within the TWRS Functions and Requirements database. The database also documents the set of technically defensible functions and requirements associated with the solidified waste interim storage mission. The F&R Document provides a snapshot in time of the technical baseline for the project. The F&R document is the product of functional analysis, requirements allocation and architectural structure definition. The technical baseline described in this document is traceable to the TWRS function 4.2.4.1, Interim Store Solidified Waste, and its related requirements, architecture, and interfaces.
An object-oriented, technology-adaptive information model
NASA Technical Reports Server (NTRS)
Anyiwo, Joshua C.
1995-01-01
The primary objective was to develop a computer information system for effectively presenting NASA's technologies to American industries, for appropriate commercialization. To this end a comprehensive information management model, applicable to a wide variety of situations, and immune to computer software/hardware technological gyrations, was developed. The model consists of four main elements: a DATA_STORE, a data PRODUCER/UPDATER_CLIENT and a data PRESENTATION_CLIENT, anchored to a central object-oriented SERVER engine. This server engine facilitates exchanges among the other model elements and safeguards the integrity of the DATA_STORE element. It is designed to support new technologies, as they become available, such as Object Linking and Embedding (OLE), on-demand audio-video data streaming with compression (such as is required for video conferencing), Worldwide Web (WWW) and other information services and browsing, fax-back data requests, presentation of information on CD-ROM, and regular in-house database management, regardless of the data model in place. The four components of this information model interact through a system of intelligent message agents which are customized to specific information exchange needs. This model is at the leading edge of modern information management models. It is independent of technological changes and can be implemented in a variety of ways to meet the specific needs of any communications situation. This summer a partial implementation of the model has been achieved. The structure of the DATA_STORE has been fully specified and successfully tested using Microsoft's FoxPro 2.6 database management system. Data PRODUCER/UPDATER and PRESENTATION architectures have been developed and also successfully implemented in FoxPro; and work has started on a full implementation of the SERVER engine. 
The model has also been successfully applied to a CD-ROM presentation of NASA's technologies in support of Langley Research Center's TAG efforts.
Effective spatial database support for acquiring spatial information from remote sensing images
NASA Astrophysics Data System (ADS)
Jin, Peiquan; Wan, Shouhong; Yue, Lihua
2009-12-01
In this paper, a new approach to maintaining spatial information acquired from remote-sensing images is presented, which is based on an Object-Relational DBMS. According to this approach, the detected and recognized results of targets are stored in an ORDBMS-based spatial database system, where they can be accessed further, and users can access the spatial information using the standard SQL interface. This approach differs from the traditional ArcSDE-based method because the spatial information management module is fully integrated into the DBMS and becomes one of its core modules. We focus on three issues, namely the general framework for the ORDBMS-based spatial database system, the definitions of the add-in spatial data types and operators, and the process of developing a spatial Datablade on Informix. The results show that ORDBMS-based spatial database support for image-based target detection and recognition is easy and practical to implement.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fox, P.B.; Yatabe, M.
1987-01-01
In this report the Nuclear Criticality Safety Analytical Methods Resource Center describes a new interactive version of CESAR, a critical experiments storage and retrieval program available on the Nuclear Criticality Information System (NCIS) database at Lawrence Livermore National Laboratory. The original version of CESAR did not include interactive search capabilities. The CESAR database was developed to provide a convenient, readily accessible means of storing and retrieving code input data for the SCALE Criticality Safety Analytical Sequences and the codes comprising those sequences. The database includes data for both cross section preparation and criticality safety calculations. 3 refs., 1 tab.
SDAR 1.0 a New Quantitative Toolkit for Analyze Stratigraphic Data
NASA Astrophysics Data System (ADS)
Ortiz, John; Moreno, Carlos; Cardenas, Andres; Jaramillo, Carlos
2015-04-01
Since the foundation of stratigraphy, geoscientists have recognized that data obtained from stratigraphic columns (SCs), two-dimensional schemes recording descriptions of both geological and paleontological features (e.g., thickness of rock packages, grain size, fossil and lithological components, and sedimentary structures), are key elements for establishing reliable hypotheses about the distribution in space and time of rock sequences, and about ancient sedimentary environmental and paleobiological dynamics. Despite the tremendous advances in the way geoscientists store, plot, and quantitatively analyze sedimentological and paleontological data (e.g., Macrostrat [http://www.macrostrat.org/] and the Paleobiology Database [http://www.paleodb.org/], respectively), there is still a lack of computational methodologies designed to quantitatively examine data from highly detailed SCs. Moreover, stratigraphic information is frequently plotted "manually" using vector graphics editors (e.g., Corel Draw, Illustrator); although stored in a digital format, this information cannot readily be used for any quantitative analysis. Therefore, any attempt to examine the stratigraphic data in an analytical fashion necessarily takes further steps. Given these issues, we have developed the software 'Stratigraphic Data Analysis in R' (SDAR), which stores in a database all sedimentological, stratigraphic, and paleontological information collected from an SC, allowing users to generate high-quality graphic plots (including one or multiple features stored in the database). SDAR also encompasses quantitative analyses that help users quantify stratigraphic information (e.g., grain size, sorting and rounding, proportion of sand/shale).
Finally, given that the SDAR analysis module has been written in the open-source, high-level "R graphics/statistics language" [R Development Core Team, 2014], it is already loaded with many of the crucial features required to accomplish basic and complex tasks of statistical analysis (e.g., the R language provides more than a hundred spatial libraries that allow users to explore various geostatistical and spatial analyses). Consequently, SDAR allows a deeper exploration of the stratigraphic data collected in the field, and in the near future it will allow the geoscientific community to develop complex analyses related to the distribution in space and time of rock sequences, such as lithofacial correlations, through multivariate comparison of empirical SCs with quantitative lithofacial models established from modern sedimentary environments.
Analysis of high accuracy, quantitative proteomics data in the MaxQB database.
Schaab, Christoph; Geiger, Tamar; Stoehr, Gabriele; Cox, Juergen; Mann, Matthias
2012-03-01
MS-based proteomics generates rapidly increasing amounts of precise and quantitative information. Analysis of individual proteomic experiments has made great strides, but the crucial ability to compare and store information across different proteome measurements still presents many challenges. For example, it has been difficult to avoid contamination of databases with low quality peptide identifications, to control for the inflation in false positive identifications when combining data sets, and to integrate quantitative data. Although, for example, the contamination with low quality identifications has been addressed by joint analysis of deposited raw data in some public repositories, we reasoned that there should be a role for a database specifically designed for high resolution and quantitative data. Here we describe a novel database termed MaxQB that stores and displays collections of large proteomics projects and allows joint analysis and comparison. We demonstrate the analysis tools of MaxQB using proteome data of 11 different human cell lines and 28 mouse tissues. The database-wide false discovery rate is controlled by adjusting the project specific cutoff scores for the combined data sets. The 11 cell line proteomes together identify proteins expressed from more than half of all human genes. For each protein of interest, expression levels estimated by label-free quantification can be visualized across the cell lines. Similarly, the expression rank order and estimated amount of each protein within each proteome are plotted. We used MaxQB to calculate the signal reproducibility of the detected peptides for the same proteins across different proteomes. Spearman rank correlation between peptide intensity and detection probability of identified proteins was greater than 0.8 for 64% of the proteome, whereas a minority of proteins have negative correlation. 
This information can be used to pinpoint false protein identifications, independently of peptide database scores. The information contained in MaxQB, including high resolution fragment spectra, is accessible to the community via a user-friendly web interface at http://www.biochem.mpg.de/maxqb.
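The Spearman rank correlation mentioned in the abstract is the Pearson correlation computed on ranks rather than raw values. The following Python sketch shows that calculation for one hypothetical protein; the intensity and detection-probability values are invented, and the abstract does not describe how MaxQB itself implements the statistic.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical peptides of one protein: measured intensity and the fraction
# of proteomes in which each peptide was detected.
peptide_intensity = [1.0e5, 3.0e6, 2.0e4, 8.0e5]
detection_probability = [0.4, 0.9, 0.1, 0.7]
rho = spearman(peptide_intensity, detection_probability)
```

Because the invented values rise together without exception, `rho` is at the upper bound of the statistic; a protein whose peptides show a negative value would be a candidate for the false identifications the abstract refers to.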
NASA Astrophysics Data System (ADS)
Graham, Jim; Jarnevich, Catherine S.; Simpson, Annie; Newman, Gregory J.; Stohlgren, Thomas J.
2011-06-01
Invasive species are a universal global problem, but the information to identify them, manage them, and prevent invasions is stored around the globe in a variety of formats. The Global Invasive Species Information Network is a consortium of organizations working toward providing seamless access to these disparate databases via the Internet. A distributed network of databases can be created using the Internet and a standard web service protocol. There are two options to provide this integration. First, federated searches are being proposed to allow users to search "deep" web documents such as databases for invasive species. A second method is to create a cache of data from the databases for searching. We compare these two methods and show that federated searches will not provide the performance and flexibility that users require, and that a central cache of the data is required to improve performance.
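The two integration options compared in the abstract can be contrasted with a small Python sketch. Everything in it is hypothetical: the `Source` class stands in for a remote database behind a web service, the species lists are invented, and the call counter is only a rough proxy for network cost.

```python
class Source:
    """Stand-in for a remote invasive-species database (hypothetical)."""

    def __init__(self, name, records):
        self.name = name
        self.records = records
        self.calls = 0  # number of simulated remote requests

    def search(self, term):
        self.calls += 1
        return [r for r in self.records if term.lower() in r.lower()]

    def dump(self):
        self.calls += 1
        return list(self.records)


def federated_search(sources, term):
    """Query every source at request time and merge the answers."""
    hits = []
    for source in sources:
        hits.extend((source.name, record) for record in source.search(term))
    return hits


def build_cache(sources):
    """Harvest all records once into a central store."""
    return [(s.name, record) for s in sources for record in s.dump()]


def cached_search(cache, term):
    """Answer a query from the central store without contacting any source."""
    return [(name, record) for name, record in cache if term.lower() in record.lower()]


sources = [
    Source("db_a", ["Bromus tectorum", "Tamarix ramosissima"]),
    Source("db_b", ["Bromus inermis"]),
]
federated_hits = federated_search(sources, "bromus")  # one request per source
cache = build_cache(sources)                          # one harvest per source
cached_hits = cached_search(cache, "bromus")          # no further requests
```

Both paths return the same hits here, but every federated query costs one request per source, while cached queries cost none after the initial harvest. The sketch ignores the cache-freshness problem that a real deployment would have to handle.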
The Unified Database for BM@N experiment data handling
NASA Astrophysics Data System (ADS)
Gertsenberger, Konstantin; Rogachevsky, Oleg
2018-04-01
The article describes the Unified Database, developed as a comprehensive relational data store for the BM@N experiment at the Joint Institute for Nuclear Research in Dubna. The BM@N experiment, which is one of the main elements of the first stage of the NICA project, is a fixed-target experiment at extracted Nuclotron beams of the Laboratory of High Energy Physics (LHEP JINR). The structure and purposes of the BM@N setup are briefly presented. The article considers the scheme of the Unified Database, its attributes and implemented features in detail. The use of the developed BM@N database provides correct multi-user access to up-to-date information about the experiment for data processing. It stores information on the experiment runs, detectors and their geometries, and the different configuration, calibration and algorithm parameters used in offline data processing. User interfaces, an important part of any database, are also presented.
Graham, Jim; Jarnevich, Catherine S.; Simpson, Annie; Newman, Gregory J.; Stohlgren, Thomas J.
2011-01-01
Invasive species are a universal global problem, but the information to identify them, manage them, and prevent invasions is stored around the globe in a variety of formats. The Global Invasive Species Information Network is a consortium of organizations working toward providing seamless access to these disparate databases via the Internet. A distributed network of databases can be created using the Internet and a standard web service protocol. There are two options to provide this integration. First, federated searches are being proposed to allow users to search “deep” web documents such as databases for invasive species. A second method is to create a cache of data from the databases for searching. We compare these two methods and show that federated searches will not provide the performance and flexibility that users require, and that a central cache of the data is required to improve performance.
1986 Year End Report for Road Following at Carnegie-Mellon
1987-05-01
how to make them work efficiently. We designed a hierarchical structure and a monitor module which manages all parts of the hierarchy (see figure 1...database, called the Local Map, is managed by a program known as the Local Map Builder (LMB). Each module stores and retrieves information in the...knowledge-intensive modules, and a database manager that synchronizes the modules-is characteristic of a traditional blackboard system. Such a system is
RNA Bricks—a database of RNA 3D motifs and their interactions
Chojnowski, Grzegorz; Waleń, Tomasz; Bujnicki, Janusz M.
2014-01-01
The RNA Bricks database (http://iimcb.genesilico.pl/rnabricks) stores information about recurrent RNA 3D motifs and their interactions, found in experimentally determined RNA structures and in RNA–protein complexes. In contrast to other similar tools (RNA 3D Motif Atlas, RNA Frabase, Rloom), RNA motifs, i.e. ‘RNA bricks’, are presented in the molecular environment in which they were determined, including RNA, protein, metal ions, water molecules and ligands. All nucleotide residues in RNA bricks are annotated with structural quality scores that describe real-space correlation coefficients with the electron density data (if available), backbone geometry and possible steric conflicts, which can be used to identify poorly modeled residues. The database is also equipped with an algorithm for 3D motif search and comparison. The algorithm compares spatial positions of backbone atoms of the user-provided query structure and of stored RNA motifs, without relying on sequence or secondary structure information. This enables the identification of local structural similarities among evolutionarily related and unrelated RNA molecules. In addition, the search utility enables searching ‘RNA bricks’ according to sequence similarity, and makes it possible to identify motifs with modified ribonucleotide residues at specific positions. PMID:24220091
The PROTICdb database for 2-DE proteomics.
Langella, Olivier; Zivy, Michel; Joets, Johann
2007-01-01
PROTICdb is a web-based database mainly designed to store and analyze plant proteome data obtained by 2D polyacrylamide gel electrophoresis (2D PAGE) and mass spectrometry (MS). The goals of PROTICdb are (1) to store, track, and query information related to proteomic experiments, i.e., from tissue sampling to protein identification and quantitative measurements; and (2) to integrate information from the user's own expertise and other sources into a knowledge base, used to support data interpretation (e.g., for the determination of allelic variants or products of posttranslational modifications). Data insertion into the relational database of PROTICdb is achieved either by uploading outputs from Mélanie, PDQuest, IM2d, ImageMaster(tm) 2D Platinum v5.0, Progenesis, Sequest, MS-Fit, and Mascot software, or by filling in web forms (experimental design and methods). 2D PAGE-annotated maps can be displayed, queried, and compared through the GelBrowser. Quantitative data can be easily exported in a tabulated format for statistical analyses with any third-party software. PROTICdb is based on the Oracle or the PostgreSQL Database Management System (DBMS) and is freely available upon request at http://cms.moulon.inra.fr/content/view/14/44/.
Publishing Linked Open Data for Physical Samples - Lessons Learned
NASA Astrophysics Data System (ADS)
Ji, P.; Arko, R. A.; Lehnert, K.; Bristol, S.
2016-12-01
Most data and information about physical samples and associated sampling features currently reside in relational databases. Integrating common concepts from various databases has motivated us to publish Linked Open Data for collections of physical samples, using Semantic Web technologies including the Resource Description Framework (RDF), RDF Query Language (SPARQL), and Web Ontology Language (OWL). The goal of our work is threefold: to evaluate and select ontologies in different granularities for common concepts; to establish best practices and develop a generic methodology for publishing physical sample data stored in relational databases as Linked Open Data; and to reuse standard community vocabularies from the International Commission on Stratigraphy (ICS), Global Volcanism Program (GVP), General Bathymetric Chart of the Oceans (GEBCO), and others. Our work leverages developments in the EarthCube GeoLink project and the Interdisciplinary Earth Data Alliance (IEDA) facility for modeling and extracting physical sample data stored in relational databases. Reusing ontologies developed by GeoLink and IEDA has facilitated discovery and integration of data and information across multiple collections including the USGS National Geochemical Database (NGDB), System for Earth Sample Registration (SESAR), and Index to Marine & Lacustrine Geological Samples (IMLGS). We have evaluated, tested, and deployed Linked Open Data tools including Morph, Virtuoso Server, LodView, LodLive, and YASGUI for converting, storing, representing, and querying data in a knowledge base (RDF triplestore). Using persistent identifiers such as Open Researcher & Contributor IDs (ORCIDs) and International Geo Sample Numbers (IGSNs) at the record level makes it possible for other repositories to link related resources such as persons, datasets, documents, expeditions, awards, etc. to samples, features, and collections.
This work is supported by the EarthCube "GeoLink" project (NSF# ICER14-40221 and others) and the "USGS-IEDA Partnership to Support a Data Lifecycle Framework and Tools" project (USGS# G13AC00381).
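As an illustration of the general idea of publishing relational rows as RDF, the following is a minimal Python sketch. It is not the Morph-based pipeline the abstract describes. The table layout, the `http://example.org/` namespace, the `igsn` property URI, and the sample identifier are all hypothetical; only the `rdfs:label` URI is a real vocabulary term.

```python
import sqlite3

# Hypothetical namespace and property URI, used for illustration only.
BASE = "http://example.org/sample/"
IGSN_PROP = "http://example.org/vocab/igsn"
# Real RDFS vocabulary term.
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def escape_literal(text):
    """Escape characters that may not appear raw in an N-Triples literal."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r"))


def rows_to_ntriples(conn):
    """Turn each row of the (hypothetical) `sample` table into two triples."""
    lines = []
    query = "SELECT igsn, name FROM sample ORDER BY igsn"
    for igsn, name in conn.execute(query):
        subject = f"<{BASE}{igsn}>"
        lines.append(f'{subject} <{IGSN_PROP}> "{escape_literal(igsn)}" .')
        lines.append(f'{subject} <{RDFS_LABEL}> "{escape_literal(name)}" .')
    return lines


def demo():
    """Build a one-row in-memory database and export it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sample (igsn TEXT PRIMARY KEY, name TEXT)")
    # 'XXX000001' is a made-up identifier, not a registered IGSN.
    conn.execute("INSERT INTO sample VALUES ('XXX000001', 'Basalt core')")
    return rows_to_ntriples(conn)


if __name__ == "__main__":
    print("\n".join(demo()))
```

Using the persistent identifier as the last segment of the subject URI is one possible design choice; it is what lets other datasets link to the same sample record.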
Population and Activity of On-road Vehicles in MOVES2014
This report describes the sources and derivation for on-road vehicle population and activity information and associated adjustments as stored in the MOVES2014 default databases. Motor Vehicle Emission Simulator, the MOVES2014 model, is a set of modeling tools for estimating emiss...
Bioinformatics and the Undergraduate Curriculum
ERIC Educational Resources Information Center
Maloney, Mark; Parker, Jeffrey; LeBlanc, Mark; Woodard, Craig T.; Glackin, Mary; Hanrahan, Michael
2010-01-01
Recent advances involving high-throughput techniques for data generation and analysis have made familiarity with basic bioinformatics concepts and programs a necessity in the biological sciences. Undergraduate students increasingly need training in methods related to finding and retrieving information stored in vast databases. The rapid rise of…
CBS Genome Atlas Database: a dynamic storage for bioinformatic results and sequence data.
Hallin, Peter F; Ussery, David W
2004-12-12
Currently, new bacterial genomes are being published on a monthly basis. With the growing amount of genome sequence data, there is a demand for a flexible and easy-to-maintain structure for storing sequence data and results from bioinformatic analysis. More than 150 sequenced bacterial genomes are now available, and comparisons of properties for taxonomically similar organisms are not readily available to many biologists. In addition to the most basic information, such as AT content, chromosome length, tRNA count and rRNA count, a large number of more complex calculations are needed to perform detailed comparative genomics. DNA structural calculations like curvature and stacking energy, DNA compositions like base skews, oligo skews and repeats at the local and global level are just a few of the analyses that are presented on the CBS Genome Atlas Web page. Complex analysis, changing methods and frequent addition of new models are factors that require a dynamic database layout. Using basic tools like the GNU Make system, csh, Perl and MySQL, we have created a flexible database environment for storing and maintaining such results for a collection of complete microbial genomes. Currently, these results amount to more than 220 pieces of information. The backbone of this solution consists of a program package written in Perl, which enables administrators to synchronize and update the database content. The MySQL database has been connected to the CBS web-server via PHP4, to present dynamic web content for users outside the center. This solution is tightly fitted to existing server infrastructure and the solutions proposed here can perhaps serve as a template for other research groups to solve database issues. A web-based user interface which is dynamically linked to the Genome Atlas Database can be accessed via www.cbs.dtu.dk/services/GenomeAtlas/.
This paper has a supplemental information page which links to the examples presented: www.cbs.dtu.dk/services/GenomeAtlas/suppl/bioinfdatabase.
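The simplest of the per-genome properties mentioned above can be computed in a few lines. The Python sketch below is only an illustration and is not taken from the Genome Atlas Perl package. It assumes the common definitions AT content = (A + T) / length and GC skew = (G − C) / (G + C); the abstract itself does not spell out its formulas, and the function names are hypothetical.

```python
def at_content(seq):
    """Fraction of positions that are A or T (assumed definition)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("A") + seq.count("T")) / len(seq)


def gc_skew(seq):
    """(G - C) / (G + C); returns 0.0 when the sequence has no G or C."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    if g + c == 0:
        return 0.0
    return (g - c) / (g + c)


if __name__ == "__main__":
    example = "AATTGGGC"
    print(at_content(example), gc_skew(example))
```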
Hedefalk, Finn; Svensson, Patrick; Harrie, Lars
2017-01-01
This paper presents datasets that enable historical longitudinal studies of micro-level geographic factors in a rural setting. These types of datasets are new, as historical demography studies have generally failed to properly include the micro-level geographic factors. Our datasets describe the geography over five Swedish rural parishes, and by linking them to a longitudinal demographic database, we obtain a geocoded population (at the property unit level) for this area for the period 1813–1914. The population is a subset of the Scanian Economic Demographic Database (SEDD). The geographic information includes the following feature types: property units, wetlands, buildings, roads and railroads. The property units and wetlands are stored in object-lifeline time representations (information about creation, changes and ends of objects are recorded in time), whereas the other feature types are stored as snapshots in time. Thus, the datasets present one of the first opportunities to study historical spatio-temporal patterns at the micro-level. PMID:28398288
Improving medical stores management through automation and effective communication.
Kumar, Ashok; Cariappa, M P; Marwaha, Vishal; Sharma, Mukti; Arora, Manu
2016-01-01
Medical stores management in hospitals is a tedious and time-consuming chore with limited resources tasked for the purpose and poor penetration of Information Technology. The process of automation is slow paced due to various inherent factors and is being challenged by the increasing inventory loads and escalating budgets for procurement of drugs. We carried out an in-depth case study at the Medical Stores of a tertiary care health care facility. An iterative six step Quality Improvement (QI) process was implemented based on the Plan-Do-Study-Act (PDSA) cycle. The QI process was modified as per requirement to fit the medical stores management model. The results were evaluated after six months. After the implementation of the QI process, 55 drugs of the medical store inventory which had expired since 2009 onwards were replaced with fresh stock by the suppliers as a result of effective communication through upgraded database management. Various pending audit objections were dropped due to the streamlined documentation and processes. Inventory management improved drastically due to automation, with disposal orders being initiated four months prior to the expiry of drugs and correct demands being generated two months prior to depletion of stocks. The monthly expense summary of drugs was now being done within ten days of the closing month. Improving communication systems within the hospital with vendor database management and reaching out to clinicians is important. Automation of inventory management needs to be simple and user-friendly, utilizing existing hardware. Physical stores monitoring is indispensable, especially due to the scattered nature of stores. Staff training and standardized documentation protocols are the other keystones for optimal medical store management.
Open Access Internet Resources for Nano-Materials Physics Education
NASA Astrophysics Data System (ADS)
Moeck, Peter; Seipel, Bjoern; Upreti, Girish; Harvey, Morgan; Garrick, Will
2006-05-01
Because a great deal of nano-material science and engineering relies on crystalline materials, materials physicists have to provide their own specific contributions to the National Nanotechnology Initiative. Here we briefly review two freely accessible internet-based crystallographic databases, the Nano-Crystallography Database (http://nanocrystallography.research.pdx.edu) and the Crystallography Open Database (http://crystallography.net). Information on over 34,000 full structure determinations is stored in these two databases in the Crystallographic Information File format. The availability of such crystallographic data on the internet in a standardized format allows for all kinds of web-based crystallographic calculations and visualizations. Two examples dealt with in this paper are interactive crystal structure visualizations in three dimensions and calculations of lattice-fringe fingerprints for the identification of unknown nanocrystals from their atomic-resolution transmission electron microscopy images.
Space Station Freedom environmental database system (FEDS) for MSFC testing
NASA Technical Reports Server (NTRS)
Story, Gail S.; Williams, Wendy; Chiu, Charles
1991-01-01
The Water Recovery Test (WRT) at Marshall Space Flight Center (MSFC) is the first demonstration of integrated water recovery systems for potable and hygiene water reuse as envisioned for Space Station Freedom (SSF). In order to satisfy the safety and health requirements placed on the SSF program and facilitate test data assessment, an extensive laboratory analysis database was established to provide a central archive and data retrieval function. The database is required to store analysis results for physical, chemical, and microbial parameters measured from water, air and surface samples collected at various locations throughout the test facility. The Oracle Relational Database Management System (RDBMS) was utilized to implement a secured on-line information system with the ECLSS WRT program as the foundation for this system. The database is supported on a VAX/VMS 8810 series mainframe and is accessible from the Marshall Information Network System (MINS). This paper summarizes the database requirements, system design, interfaces, and future enhancements.
Virus Database and Online Inquiry System Based on Natural Vectors.
Dong, Rui; Zheng, Hui; Tian, Kun; Yau, Shek-Chung; Mao, Weiguang; Yu, Wenping; Yin, Changchuan; Yu, Chenglong; He, Rong Lucy; Yang, Jie; Yau, Stephen St
2017-01-01
We construct a virus database called VirusDB (http://yaulab.math.tsinghua.edu.cn/VirusDB/) and an online inquiry system to serve people who are interested in viral classification and prediction. The database stores all viral genomes, their corresponding natural vectors, and the classification information of the single/multiple-segmented viral reference sequences downloaded from the National Center for Biotechnology Information. The online inquiry system serves the purpose of computing natural vectors and their distances based on submitted genomes, providing an online interface for accessing and using the database for viral classification and prediction, and back-end processes for automatic and manual updating of database content to synchronize with GenBank. Submitted genome data in FASTA format are processed, and the prediction results, with the 5 closest neighbors and their classifications, are returned by email. Considering the one-to-one correspondence between sequence and natural vector, time efficiency, and high accuracy, the natural vector is a significant advance compared with alignment methods, which makes VirusDB a useful database for further research.
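The abstract does not define the natural vector. The Python sketch below assumes the commonly cited 12-dimensional form: for each nucleotide k in A, C, G, T, the count n_k, the mean 1-based position mu_k, and the normalized second central moment D2_k = sum((position − mu_k)^2) / (n_k · n), where n is the sequence length. Whether VirusDB uses exactly this form, and the Euclidean distance shown here, is an assumption; the function names are hypothetical.

```python
import math


def natural_vector(seq):
    """Return [n_A..n_T, mu_A..mu_T, D2_A..D2_T] for a DNA sequence.

    Assumed 12-dimensional definition; a nucleotide that never occurs
    contributes zeros for its count, mean position, and moment.
    """
    seq = seq.upper()
    n = len(seq)
    counts, means, moments = [], [], []
    for base in "ACGT":
        # 1-based positions at which this nucleotide occurs.
        positions = [i + 1 for i, ch in enumerate(seq) if ch == base]
        n_k = len(positions)
        if n_k:
            mu_k = sum(positions) / n_k
            d2_k = sum((p - mu_k) ** 2 for p in positions) / (n_k * n)
        else:
            mu_k, d2_k = 0.0, 0.0
        counts.append(float(n_k))
        means.append(mu_k)
        moments.append(d2_k)
    return counts + means + moments


def distance(u, v):
    """Euclidean distance between two natural vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


if __name__ == "__main__":
    print(natural_vector("ACGTA"))
```

Ranking stored genomes by `distance` to a query vector and keeping the five smallest values would give a nearest-neighbor list of the kind the abstract mentions.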
Sagace: A web-based search engine for biomedical databases in Japan
2012-01-01
Background In the big data era, biomedical research continues to generate a large amount of data, and the generated information is often stored in a database and made publicly available. Although combining data from multiple databases should accelerate further studies, the current number of life sciences databases is too large to grasp features and contents of each database. Findings We have developed Sagace, a web-based search engine that enables users to retrieve information from a range of biological databases (such as gene expression profiles and proteomics data) and biological resource banks (such as mouse models of disease and cell lines). With Sagace, users can search more than 300 databases in Japan. Sagace offers features tailored to biomedical research, including manually tuned ranking, a faceted navigation to refine search results, and rich snippets constructed with retrieved metadata for each database entry. Conclusions Sagace will be valuable for experts who are involved in biomedical research and drug development in both academia and industry. Sagace is freely available at http://sagace.nibio.go.jp/en/. PMID:23110816
Indexing and retrieval of MPEG compressed video
NASA Astrophysics Data System (ADS)
Kobla, Vikrant; Doermann, David S.
1998-04-01
To keep pace with the increased popularity of digital video as an archival medium, the development of techniques for fast and efficient analysis of video streams is essential. In particular, solutions to the problems of storing, indexing, browsing, and retrieving video data from large multimedia databases are necessary to allow access to these collections. Given that video is often stored efficiently in a compressed format, the costly overhead of decompression can be reduced by analyzing the compressed representation directly. In earlier work, we presented compressed domain parsing techniques which identified shots, subshots, and scenes. In this article, we present efficient key frame selection, feature extraction, indexing, and retrieval techniques that are directly applicable to MPEG compressed video. We develop a frame type independent representation which normalizes spatial and temporal features including frame type, frame size, macroblock encoding, and motion compensation vectors. Features for indexing are derived directly from this representation and mapped to a low-dimensional space where they can be accessed using standard database techniques. Spatial information is used as the primary index into the database and temporal information is used to rank retrieved clips and enhance the robustness of the system. The techniques presented enable efficient indexing, querying, and retrieval of compressed video as demonstrated by our system which typically takes a fraction of a second to retrieve similar video scenes from a database, with over 95 percent recall.
NASA Astrophysics Data System (ADS)
Veneranda, M.; Negro, J. I.; Medina, J.; Rull, F.; Lantz, C.; Poulet, F.; Cousin, A.; Dypvik, H.; Hellevang, H.; Werner, S. C.
2018-04-01
The PTAL website will store multispectral analyses of samples collected from several terrestrial analogue sites and aims to become a cornerstone tool for the scientific community interested in deepening knowledge of Mars geological processes.
Space Images for NASA JPL Android Version
NASA Technical Reports Server (NTRS)
Nelson, Jon D.; Gutheinz, Sandy C.; Strom, Joshua R.; Arca, Jeremy M.; Perez, Martin; Boggs, Karen; Stanboli, Alice
2013-01-01
This software addresses the demand for easily accessible NASA JPL images and videos by providing a user friendly and simple graphical user interface that can be run via the Android platform from any location where Internet connection is available. This app is complementary to the iPhone version of the application. A backend infrastructure stores, tracks, and retrieves space images from the JPL Photojournal and Institutional Communications Web server, and catalogs the information into a streamlined rating infrastructure. This system consists of four distinguishing components: image repository, database, server-side logic, and Android mobile application. The image repository contains images from various JPL flight projects. The database stores the image information as well as the user rating. The server-side logic retrieves the image information from the database and categorizes each image for display. The Android mobile application is an interfacing delivery system that retrieves the image information from the server for each Android mobile device user. Also created is a reporting and tracking system for charting and monitoring usage. Unlike other Android mobile image applications, this system uses the latest emerging technologies to produce image listings based directly on user input. This allows for countless combinations of images returned. The backend infrastructure uses industry-standard coding and database methods, enabling future software improvement and technology updates. The flexibility of the system design framework permits multiple levels of display possibilities and provides integration capabilities. Unique features of the software include image/video retrieval from a selected set of categories, image Web links that can be shared among e-mail users, sharing to Facebook/Twitter, marking as user's favorites, and image metadata searchable for instant results.
Kocna, P
1995-01-01
GastroBase, a clinical information system, incorporates patient identification, medical records, images, laboratory data, patient history, physical examination, and other patient-related information. Program modules are written in C; all data is processed using the Novell-Btrieve data manager. The patient identification database represents the core of this information system. A graphic library developed in the past year and graphic modules with a special video card enable the storing, archiving, and linking of different images to the electronic patient medical record. GastroBase has been running for more than four years in daily routine and the database contains more than 25,000 medical records and 1,500 images. This new version of GastroBase is now incorporated into the clinical information system of University Clinic in Prague.
Design and Implementation of a Set-Top Box-Based Homecare System Using Hybrid Cloud.
Lin, Bor-Shing; Hsiao, Pei-Chi; Cheng, Po-Hsun; Lee, I-Jung; Jan, Gene Eu
2015-11-01
Telemedicine has become a prevalent topic in recent years, and several telemedicine systems have been proposed; however, such systems are an unsuitable fit for the daily requirements of users. The system proposed in this study was developed as a set-top box integrated with the Android™ (Google, Mountain View, CA) operating system to provide a convenient and user-friendly interface. The proposed system can assist with family healthcare management, telemedicine service delivery, and information exchange among hospitals. To manage the system, a novel type of hybrid cloud architecture was also developed. Updated information is stored on a public cloud, enabling medical staff members to rapidly access information when diagnosing patients. In the long term, the stored data can be reduced to improve the efficiency of the database. The proposed design offers a robust architecture for storing data in a homecare system and can thus resolve network overload and congestion resulting from accumulating data, which are inherent problems in centralized architectures, thereby improving system efficiency.
Case retrieval in medical databases by fusing heterogeneous information.
Quellec, Gwénolé; Lamard, Mathieu; Cazuguel, Guy; Roux, Christian; Cochener, Béatrice
2011-01-01
A novel content-based heterogeneous information retrieval framework, particularly well suited to browse medical databases and support new generation computer aided diagnosis (CADx) systems, is presented in this paper. It was designed to retrieve possibly incomplete documents, consisting of several images and semantic information, from a database; more complex data types such as videos can also be included in the framework. The proposed retrieval method relies on image processing, in order to characterize each individual image in a document by its digital content, and information fusion. Once the available images in a query document are characterized, a degree of match, between the query document and each reference document stored in the database, is defined for each attribute (an image feature or a metadata item). A Bayesian network is used to recover missing information if need be. Finally, two novel information fusion methods are proposed to combine these degrees of match, in order to rank the reference documents by decreasing relevance for the query. In the first method, the degrees of match are fused by the Bayesian network itself. In the second method, they are fused by the Dezert-Smarandache theory: the second approach lets us model our confidence in each source of information (i.e., each attribute) and take it into account in the fusion process for a better retrieval performance. The proposed methods were applied to two heterogeneous medical databases, a diabetic retinopathy database and a mammography screening database, for computer aided diagnosis. Precisions at five of 0.809 ± 0.158 and 0.821 ± 0.177, respectively, were obtained for these two databases, which is very promising.
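The "precision at five" figures reported above measure how many of the top five retrieved documents are relevant. The Python sketch below shows the usual calculation for a single query, assuming the standard definition (relevant items among the first k results, divided by k). The document identifiers are hypothetical and the code is unrelated to the authors' fusion methods.

```python
def precision_at_k(ranked_ids, relevant_ids, k=5):
    """Fraction of the first k ranked documents that are relevant.

    Divides by k even when fewer than k documents were returned,
    which is one common convention (an assumption here).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top = ranked_ids[:k]
    hits = sum(1 for doc_id in top if doc_id in relevant_ids)
    return hits / k


if __name__ == "__main__":
    # Hypothetical ranking returned for one query document.
    ranking = ["d3", "d7", "d1", "d9", "d4", "d2"]
    relevant = {"d3", "d1", "d4", "d8"}
    print(precision_at_k(ranking, relevant))
```

Averaging this value over all query documents would give a mean precision at five, which is presumably how a mean and spread such as those reported are obtained.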
DNApod: DNA polymorphism annotation database from next-generation sequence read archives
Mochizuki, Takako; Tanizawa, Yasuhiro; Fujisawa, Takatomo; Ohta, Tazro; Nikoh, Naruo; Shimizu, Tokurou; Toyoda, Atsushi; Fujiyama, Asao; Kurata, Nori; Nagasaki, Hideki; Kaminuma, Eli; Nakamura, Yasukazu
2017-01-01
With the rapid advances in next-generation sequencing (NGS), datasets for DNA polymorphisms among various species and strains have been produced, stored, and distributed. However, reliability varies among these datasets because the experimental and analytical conditions used differ among assays. Furthermore, such datasets have been frequently distributed from the websites of individual sequencing projects. It is desirable to integrate DNA polymorphism data into one database featuring uniform quality control that is distributed from a single platform at a single place. DNA polymorphism annotation database (DNApod; http://tga.nig.ac.jp/dnapod/) is an integrated database that stores genome-wide DNA polymorphism datasets acquired under uniform analytical conditions, and this includes uniformity in the quality of the raw data, the reference genome version, and evaluation algorithms. DNApod genotypic data are re-analyzed whole-genome shotgun datasets extracted from sequence read archives, and DNApod distributes genome-wide DNA polymorphism datasets and known-gene annotations for each DNA polymorphism. This new database was developed for storing genome-wide DNA polymorphism datasets of plants, with crops being the first priority. Here, we describe our analyzed data for 679, 404, and 66 strains of rice, maize, and sorghum, respectively. The analytical methods are available as a DNApod workflow in an NGS annotation system of the DNA Data Bank of Japan and a virtual machine image. Furthermore, DNApod provides tables of links of identifiers between DNApod genotypic data and public phenotypic data. To advance the sharing of organism knowledge, DNApod offers basic and ubiquitous functions for multiple alignment and phylogenetic tree construction by using orthologous gene information. PMID:28234924
1988-06-01
DETAILED PROBLEM STATEMENT ... A. INTRODUCTION ... assorted information about the world land masses. When this is done, the problem of storage, manipulation, and display of realistic, dense, and accurate ... elevation data becomes a problem of paramount importance. If the data which is stored can be utilized to recreate specific information about certain
A user friendly database for use in ALARA job dose assessment
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zodiates, A.M.; Willcock, A.
1995-03-01
The pressurized water reactor (PWR) design chosen for adoption by Nuclear Electric plc was based on the Westinghouse Standard Nuclear Unit Power Plant (SNUPPS). This design was developed to meet the United Kingdom requirements and these improvements are embodied in the Sizewell B plant which will start commercial operation in 1994. A user-friendly database was developed to assist the station in the dose and ALARP assessments of the work expected to be carried out during station operation and outage. The database stores the information in an easily accessible form and enables updating, editing, retrieval, and searches of the information. The database contains job-related information such as job locations, number of workers required, job times, and the expected plant dose rates. It also contains the means to flag job requirements such as requirements for temporary shielding, flushing, scaffolding, etc. Typical uses of the database are envisaged to be in the prediction of occupational doses, the identification of high collective and individual dose jobs, use in ALARP assessments, setting of dose targets, monitoring of dose control performance, and others.
NASA Astrophysics Data System (ADS)
Liu, G.; Wu, C.; Li, X.; Song, P.
2013-12-01
The 3D urban geological information system has been a major part of the national urban geological survey project of China Geological Survey in recent years. Large amounts of multi-source and multi-subject data are to be stored in the urban geological databases. Various models and vocabularies have been drafted and applied by industrial companies in urban geological data. Issues such as duplicate and ambiguous definitions of terms and differing coding structures increase the difficulty of information sharing and data integration. To solve this problem, we proposed a national standard-driven information classification and coding method to effectively store and integrate urban geological data, and we applied the data dictionary technology to achieve structural and standard data storage. The overall purpose of this work is to set up a common data platform to provide information sharing service. Research progress is as follows: (1) A unified classification and coding method for multi-source data based on national standards. Underlying national standards include GB 9649-88 for geology and GB/T 13923-2006 for geography. Current industrial models are compared with national standards to build a mapping table. The attributes of various urban geological data entity models are reduced to several categories according to their application phases and domains. Then a logical data model is set up as a standard format to design data file structures for a relational database. (2) A multi-level data dictionary for data standardization constraint. Three levels of data dictionary are designed: model data dictionary is used to manage system database files and enhance maintenance of the whole database system; attribute dictionary organizes fields used in database tables; term and code dictionary is applied to provide a standard for urban information system by adopting appropriate classification and coding methods; comprehensive data dictionary manages system operation and security.
(3) An extension of the system data management functions based on the data dictionary. The data item constraint input function uses the standard term and code dictionary to obtain standardized input. The attribute dictionary organizes all the fields of an urban geological information database to ensure the consistency of term use for fields. The model dictionary is used to generate a database operation interface automatically with standard semantic content via the term and code dictionary. The above method and technology have been applied to the construction of the Fuzhou Urban Geological Information System in south-east China, with satisfactory results.
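The constrained-input idea in item (3) can be sketched as a lookup against a term and code dictionary. The Python example below is only an illustration under stated assumptions: the field name, the terms, and the codes are invented for the example and are not taken from GB 9649-88, GB/T 13923-2006, or the Fuzhou system.

```python
# Hypothetical term and code dictionary: field -> {standard term: code}.
# The terms and codes below are invented for illustration.
TERM_CODE_DICTIONARY = {
    "lithology": {"sandstone": "L01", "mudstone": "L02"},
}


def encode_field(field, term):
    """Return the standard code for a term, rejecting non-standard input."""
    terms = TERM_CODE_DICTIONARY.get(field)
    if terms is None:
        raise KeyError(f"unknown field: {field}")
    key = term.strip().lower()
    if key not in terms:
        allowed = ", ".join(sorted(terms))
        raise ValueError(f"'{term}' is not a standard term; allowed: {allowed}")
    return terms[key]


if __name__ == "__main__":
    print(encode_field("lithology", " Sandstone "))
```

Rejecting anything outside the dictionary at input time is what keeps stored values consistent across data sources; a real system would presumably load the dictionary from database tables rather than hard-code it.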
Oral cancer databases: A comprehensive review.
Sarode, Gargi S; Sarode, Sachin C; Maniyar, Nikunj; Anand, Rahul; Patil, Shankargouda
2017-11-29
A cancer database is a systematic collection and analysis of information on various human cancers at the genomic and molecular level that can be utilized to understand various steps in carcinogenesis and for therapeutic advancement in the cancer field. Oral cancer is one of the leading causes of morbidity and mortality all over the world. The current research efforts in this field are aimed at cancer etiology and therapy. Advanced genomic technologies including microarrays, proteomics, transcriptomics, and gene sequencing development have culminated in generation of extensive data and subjection of several genes and microRNAs that are distinctively expressed and this information is stored in the form of various databases. Extensive data from various resources have brought the need for collaboration and data sharing to make effective use of this new knowledge. The current review provides comprehensive information of various publicly accessible databases that contain information pertinent to oral squamous cell carcinoma (OSCC) and databases designed exclusively for OSCC. The databases discussed in this paper are Protein-Coding Gene Databases and microRNA Databases. This paper also describes gene overlap in various databases, which will help researchers to reduce redundancy and focus on only those genes that are common to more than one database. We hope such introduction will promote awareness and facilitate the usage of these resources in the cancer research community, and researchers can explore the molecular mechanisms involved in the development of cancer, which can help in subsequent crafting of therapeutic strategies. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Milc, Justyna; Sala, Antonio; Bergamaschi, Sonia; Pecchioni, Nicola
2011-01-01
The CEREALAB database aims to store genotypic and phenotypic data obtained by the CEREALAB project and to integrate them with already existing data sources in order to create a tool for plant breeders and geneticists. The database can help them in unravelling the genetics of economically important phenotypic traits; in identifying and choosing molecular markers associated to key traits; and in choosing the desired parentals for breeding programs. The database is divided into three sub-schemas corresponding to the species of interest: wheat, barley and rice; each sub-schema is then divided into two sub-ontologies, regarding genotypic and phenotypic data, respectively. Database URL: http://www.cerealab.unimore.it/jws/cerealab.jnlp PMID:21247929
Atlas - a data warehouse for integrative bioinformatics.
Shah, Sohrab P; Huang, Yong; Xu, Tao; Yuen, Macaire M S; Ling, John; Ouellette, B F Francis
2005-02-21
We present a biological data warehouse called Atlas that locally stores and integrates biological sequences, molecular interactions, homology information, functional annotations of genes, and biological ontologies. The goal of the system is to provide data, as well as a software infrastructure for bioinformatics research and development. The Atlas system is based on relational data models that we developed for each of the source data types. Data stored within these relational models are managed through Structured Query Language (SQL) calls that are implemented in a set of Application Programming Interfaces (APIs). The APIs include three languages: C++, Java, and Perl. The methods in these API libraries are used to construct a set of loader applications, which parse and load the source datasets into the Atlas database, and a set of toolbox applications which facilitate data retrieval. Atlas stores and integrates local instances of GenBank, RefSeq, UniProt, Human Protein Reference Database (HPRD), Biomolecular Interaction Network Database (BIND), Database of Interacting Proteins (DIP), Molecular Interactions Database (MINT), IntAct, NCBI Taxonomy, Gene Ontology (GO), Online Mendelian Inheritance in Man (OMIM), LocusLink, Entrez Gene and HomoloGene. The retrieval APIs and toolbox applications are critical components that offer end-users flexible, easy, integrated access to this data. We present use cases that use Atlas to integrate these sources for genome annotation, inference of molecular interactions across species, and gene-disease associations. The Atlas biological data warehouse serves as data infrastructure for bioinformatics research and development. It forms the backbone of the research activities in our laboratory and facilitates the integration of disparate, heterogeneous biological sources of data enabling new scientific inferences. Atlas achieves integration of diverse data sets at two levels. 
First, Atlas stores data of similar types using common data models, enforcing the relationships between data types. Second, integration is achieved through a combination of APIs, ontology, and tools. The Atlas software is freely available under the GNU General Public License at: http://bioinformatics.ubc.ca/atlas/
Atlas – a data warehouse for integrative bioinformatics
Shah, Sohrab P; Huang, Yong; Xu, Tao; Yuen, Macaire MS; Ling, John; Ouellette, BF Francis
2005-01-01
Background We present a biological data warehouse called Atlas that locally stores and integrates biological sequences, molecular interactions, homology information, functional annotations of genes, and biological ontologies. The goal of the system is to provide data, as well as a software infrastructure for bioinformatics research and development. Description The Atlas system is based on relational data models that we developed for each of the source data types. Data stored within these relational models are managed through Structured Query Language (SQL) calls that are implemented in a set of Application Programming Interfaces (APIs). The APIs include three languages: C++, Java, and Perl. The methods in these API libraries are used to construct a set of loader applications, which parse and load the source datasets into the Atlas database, and a set of toolbox applications which facilitate data retrieval. Atlas stores and integrates local instances of GenBank, RefSeq, UniProt, Human Protein Reference Database (HPRD), Biomolecular Interaction Network Database (BIND), Database of Interacting Proteins (DIP), Molecular Interactions Database (MINT), IntAct, NCBI Taxonomy, Gene Ontology (GO), Online Mendelian Inheritance in Man (OMIM), LocusLink, Entrez Gene and HomoloGene. The retrieval APIs and toolbox applications are critical components that offer end-users flexible, easy, integrated access to this data. We present use cases that use Atlas to integrate these sources for genome annotation, inference of molecular interactions across species, and gene-disease associations. Conclusion The Atlas biological data warehouse serves as data infrastructure for bioinformatics research and development. It forms the backbone of the research activities in our laboratory and facilitates the integration of disparate, heterogeneous biological sources of data enabling new scientific inferences. Atlas achieves integration of diverse data sets at two levels. 
First, Atlas stores data of similar types using common data models, enforcing the relationships between data types. Second, integration is achieved through a combination of APIs, ontology, and tools. The Atlas software is freely available under the GNU General Public License at: PMID:15723693
Sharing mutants and experimental information prepublication using FgMutantDB
USDA-ARS's Scientific Manuscript database
There has been no central location for storing generated mutants of Fusarium graminearum or for data associated with these mutants. Instead researchers relied on several independent, non-integrated databases. FgMutantDB was designed as a simple spreadsheet that is accessible globally on the web th...
75 FR 6641 - Proposed Collection; Comment Request
Federal Register 2010, 2011, 2012, 2013, 2014
2010-02-10
... from members of the public is to make these submissions available for public viewing on the Internet at... is stored in the VRS database. It is used by the installation police in traffic and parking enforcement. The DESC Parking Administrator uses the information to track parking infractions, and to identify...
Parallel Processable Cryptographic Methods with Unbounded Practical Security.
ERIC Educational Resources Information Center
Rothstein, Jerome
Addressing the problem of protecting confidential information and data stored in computer databases from access by unauthorized parties, this paper details coding schemes which present such astronomical work factors to potential code breakers that security breaches are hopeless in any practical sense. Two procedures which can be used to encode for…
NCBI GEO: mining tens of millions of expression profiles--database and tools update.
Barrett, Tanya; Troup, Dennis B; Wilhite, Stephen E; Ledoux, Pierre; Rudnev, Dmitry; Evangelista, Carlos; Kim, Irene F; Soboleva, Alexandra; Tomashevsky, Maxim; Edgar, Ron
2007-01-01
The Gene Expression Omnibus (GEO) repository at the National Center for Biotechnology Information (NCBI) archives and freely disseminates microarray and other forms of high-throughput data generated by the scientific community. The database has a minimum information about a microarray experiment (MIAME)-compliant infrastructure that captures fully annotated raw and processed data. Several data deposit options and formats are supported, including web forms, spreadsheets, XML and Simple Omnibus Format in Text (SOFT). In addition to data storage, a collection of user-friendly web-based interfaces and applications are available to help users effectively explore, visualize and download the thousands of experiments and tens of millions of gene expression patterns stored in GEO. This paper provides a summary of the GEO database structure and user facilities, and describes recent enhancements to database design, performance, submission format options, data query and retrieval utilities. GEO is accessible at http://www.ncbi.nlm.nih.gov/geo/
HLLV avionics requirements study and electronic filing system database development
NASA Technical Reports Server (NTRS)
1994-01-01
This final report provides a summary of achievements and activities performed under Contract NAS8-39215. The contract's objective was to explore a new way of delivering, storing, accessing, and archiving study products and information and to define top level system requirements for Heavy Lift Launch Vehicle (HLLV) avionics that incorporate Vehicle Health Management (VHM). This report includes technical objectives, methods, assumptions, recommendations, sample data, and issues as specified by DPD No. 772, DR-3. The report is organized into two major subsections, one specific to each of the two tasks defined in the Statement of Work: the Index Database Task and the HLLV Avionics Requirements Task. The Index Database Task resulted in the selection and modification of a commercial database software tool to contain the data developed during the HLLV Avionics Requirements Task. All summary information is addressed within each task's section.
Adaptive search in mobile peer-to-peer databases
NASA Technical Reports Server (NTRS)
Wolfson, Ouri (Inventor); Xu, Bo (Inventor)
2010-01-01
Information is stored in a plurality of mobile peers. The peers communicate in a peer to peer fashion, using a short-range wireless network. Occasionally, a peer initiates a search for information in the peer to peer network by issuing a query. Queries and pieces of information, called reports, are transmitted among peers that are within a transmission range. For each search additional peers are utilized, wherein these additional peers search and relay information on behalf of the originator of the search.
Job monitoring on DIRAC for Belle II distributed computing
NASA Astrophysics Data System (ADS)
Kato, Yuji; Hayasaka, Kiyoshi; Hara, Takanori; Miyake, Hideki; Ueda, Ikuo
2015-12-01
We developed a monitoring system for Belle II distributed computing, which consists of active and passive methods. In this paper we describe the passive monitoring system, where information stored in the DIRAC database is processed and visualized. We divide the DIRAC workload management flow into steps and store characteristic variables which indicate issues. These variables are chosen carefully based on our experiences, then visualized. As a result, we are able to effectively detect issues. Finally, we discuss the future development for automating log analysis, notification of issues, and disabling problematic sites.
NASA Technical Reports Server (NTRS)
1991-01-01
R:BASE for DOS, a computer program developed under NASA contract, has been adapted by the National Marine Mammal Laboratory and the College of the Atlantic to provide an advanced computerized photo matching technique for identification of humpback whales. The program compares photos with stored digitized descriptions, enabling researchers to track and determine distribution and migration patterns. R:BASE is a spinoff of RIM (Relational Information Manager), which was used to store data for analyzing heat shielding tiles on the Space Shuttle Orbiter. It is now the world's second largest selling line of microcomputer database management software.
Construction of In-house Databases in a Corporation
NASA Astrophysics Data System (ADS)
Okuda, Yasukazu; Yoshikawa, Ichirou; Sasano, Fumio
The authors describe the outline and the construction process of the in-house technical information system of Mitsui Petrochemical Industries Ltd., “MITOLIS”. This system was constructed in 1981 and has been improved since then to make better use of in-house technical reports. Bibliographic data and keywords of technical reports of R & D division are stored in the host computer system in Iwakuni and can be retrieved by the company members on the desk-side terminal connected to the local area network (LAN). The number of stored reports reaches 6100 from 1970 to 1987.
NASA Astrophysics Data System (ADS)
Sheldon, W.
2013-12-01
Managing data for a large, multidisciplinary research program such as a Long Term Ecological Research (LTER) site is a significant challenge, but also presents unique opportunities for data stewardship. LTER research is conducted within multiple organizational frameworks (i.e. a specific LTER site as well as the broader LTER network), and addresses both specific goals defined in an NSF proposal and broader goals of the network; therefore, every LTER data set can be linked to rich contextual information to guide interpretation and comparison. The challenge is how to link the data to this wealth of contextual metadata. At the Georgia Coastal Ecosystems LTER we developed an integrated information management system (GCE-IMS) to manage, archive and distribute data, metadata and other research products as well as manage project logistics, administration and governance (figure 1). This system allows us to store all project information in one place, and provide dynamic links through web applications and services to ensure content is always up to date on the web as well as in data set metadata. The database model supports tracking changes over time in personnel roles, projects and governance decisions, allowing these databases to serve as canonical sources of project history. Storing project information in a central database has also allowed us to standardize both the formatting and content of critical project information, including personnel names, roles, keywords, place names, attribute names, units, and instrumentation, providing consistency and improving data and metadata comparability. Lookup services for these standard terms also simplify data entry in web and database interfaces. We have also coupled the GCE-IMS to our MATLAB- and Python-based data processing tools (i.e. through database connections) to automate metadata generation and packaging of tabular and GIS data products for distribution. 
Data processing history is automatically tracked throughout the data lifecycle, from initial import through quality control, revision and integration by our data processing system (GCE Data Toolbox for MATLAB), and included in metadata for versioned data products. This high level of automation and system integration has proven very effective in managing the chaos and scalability of our information management program.
Linking NCBI to Wikipedia: a wiki-based approach.
Page, Roderic D M
2011-03-31
The NCBI Taxonomy underpins many bioinformatics and phyloinformatics databases, but by itself provides limited information on the taxa it contains. One readily available source of information on many taxa is Wikipedia. This paper describes iPhylo Linkout, a Semantic wiki that maps taxa in NCBI's taxonomy database onto corresponding pages in Wikipedia. Storing the mapping in a wiki makes it easy to edit, correct, or otherwise annotate the links between NCBI and Wikipedia. The mapping currently comprises some 53,000 taxa, and is available at http://iphylo.org/linkout. The links between NCBI and Wikipedia are also made available to NCBI users through the NCBI LinkOut service.
Improving medical stores management through automation and effective communication
Kumar, Ashok; Cariappa, M.P.; Marwaha, Vishal; Sharma, Mukti; Arora, Manu
2016-01-01
Background Medical stores management in hospitals is a tedious and time-consuming chore, with limited resources tasked for the purpose and poor penetration of Information Technology. The process of automation is slow paced due to various inherent factors and is being challenged by increasing inventory loads and escalating budgets for procurement of drugs. Methods We carried out an in-depth case study at the Medical Stores of a tertiary care health care facility. An iterative six-step Quality Improvement (QI) process was implemented based on the Plan–Do–Study–Act (PDSA) cycle. The QI process was modified as required to fit the medical stores management model. The results were evaluated after six months. Results After the implementation of the QI process, 55 drugs in the medical store inventory that had expired from 2009 onwards were replaced with fresh stock by the suppliers as a result of effective communication through upgraded database management. Various pending audit objections were dropped due to the streamlined documentation and processes. Inventory management improved drastically due to automation, with disposal orders being initiated four months prior to the expiry of drugs and correct demands being generated two months prior to depletion of stocks. The monthly expense summary of drugs was now being completed within ten days of the closing month. Conclusion Improving communication systems within the hospital with vendor database management and reaching out to clinicians is important. Automation of inventory management needs to be simple and user-friendly, utilizing existing hardware. Physical stores monitoring is indispensable, especially due to the scattered nature of stores. Staff training and standardized documentation protocols are the other keystones for optimal medical store management. PMID:26900225
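The two lead times reported in the Results (disposal orders four months before expiry, demands two months before depletion) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the `StockItem` record, the approximation of four and two months as 120 and 60 days, and the use of average daily consumption to project depletion.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Assumed approximations of the lead times quoted in the abstract.
DISPOSAL_LEAD = timedelta(days=120)  # roughly four months before expiry
DEMAND_LEAD_DAYS = 60                # roughly two months before depletion


@dataclass
class StockItem:
    """Hypothetical inventory record; field names are illustrative only."""
    name: str
    quantity: int      # units currently in store
    daily_use: float   # average units issued per day
    expiry: date


def needs_disposal_order(item: StockItem, today: date) -> bool:
    """True once the item is within the disposal lead time of its expiry."""
    return item.expiry - today <= DISPOSAL_LEAD


def needs_demand(item: StockItem) -> bool:
    """True once projected stock lasts no longer than the demand lead time."""
    if item.daily_use <= 0:
        return False  # no consumption, so no projected depletion
    days_left = item.quantity / item.daily_use
    return days_left <= DEMAND_LEAD_DAYS
```

A nightly job could run both checks over the whole inventory and pass the flagged items to the stores officer.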
Fine-grained Database Field Search Using Attribute-Based Encryption for E-Healthcare Clouds.
Guo, Cheng; Zhuang, Ruhan; Jie, Yingmo; Ren, Yizhi; Wu, Ting; Choo, Kim-Kwang Raymond
2016-11-01
An effectively designed e-healthcare system can significantly enhance the quality of access and experience of healthcare users, including facilitating medical and healthcare providers in ensuring a smooth delivery of services. Ensuring the security of patients' electronic health records (EHRs) in the e-healthcare system is an active research area. EHRs may be outsourced to a third party, such as a community healthcare cloud service provider, for storage as a cost-saving measure. Generally, encrypting the EHRs when they are stored in the system (i.e. data-at-rest) or prior to outsourcing the data is used to ensure data confidentiality. A searchable encryption (SE) scheme is a promising technique that can ensure the protection of private information without compromising on performance. In this paper, we propose a novel framework for controlling access to EHRs stored in semi-trusted cloud servers (e.g. a private cloud or a community cloud). To achieve fine-grained access control for EHRs, we leverage the ciphertext-policy attribute-based encryption (CP-ABE) technique to encrypt tables published by hospitals, including patients' EHRs, and the table is stored in the database with the primary key being the patient's unique identity. Our framework can enable different users with different privileges to search on different database fields. Unlike previous attempts to secure outsourcing of data, we emphasize the control of the searches of the fields within the database. We demonstrate the utility of the scheme by evaluating it using datasets from the University of California, Irvine.
NASA Astrophysics Data System (ADS)
Willmes, C.
2017-12-01
Within the Collaborative Research Centre 806 (CRC 806), an interdisciplinary research project that needs to manage data, information and knowledge from heterogeneous domains such as archeology, cultural sciences, and the geosciences, a collaborative internal knowledge base system was developed. The system is based on the open source MediaWiki software, well known as the software behind Wikipedia, which provides a web-based platform for collaborative knowledge and information management. The software is further enhanced with the Semantic MediaWiki (SMW) extension, which allows structured data to be stored and managed within the wiki platform and provides complex query and API interfaces to the structured data stored in the SMW database. An additional open source tool called mobo improves the data model development process and supports automated data imports, from small spreadsheets to large relational databases. Mobo is a command line tool that helps build and deploy SMW structure in an agile, schema-driven way, and allows the data models, which are formalized in JSON-Schema format, to be managed and developed collaboratively using version control systems such as git. The combination of a well-equipped collaborative web platform provided by MediaWiki, the ability to store and query structured data in this collaborative database provided by SMW, and the automated data import and data model development enabled by mobo results in a powerful but flexible system for building and developing a collaborative knowledge base. Furthermore, SMW allows the application of Semantic Web technology: the structured data can be exported to RDF, so it is possible to set up a triple store, including a SPARQL endpoint, on top of the database. 
The JSON-Schema-based data models can be enhanced into JSON-LD to take advantage of Linked Data technology.
2005-01-01
Gene expression databases contain a wealth of information, but current data mining tools are limited in their speed and effectiveness in extracting meaningful biological knowledge from them. Online analytical processing (OLAP) can be used as a supplement to cluster analysis for fast and effective data mining of gene expression databases. We used Analysis Services 2000, a product that ships with SQL Server 2000, to construct an OLAP cube that was used to mine a time series experiment designed to identify genes associated with resistance of soybean to the soybean cyst nematode, a devastating pest of soybean. The data for these experiments is stored in the soybean genomics and microarray database (SGMD). A number of candidate resistance genes and pathways were found. Compared to traditional cluster analysis of gene expression data, OLAP was more effective and faster in finding biologically meaningful information. OLAP is available from a number of vendors and can work with any relational database management system through OLE DB. PMID:16046824
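To make the idea of an OLAP cube concrete, the following is a minimal in-memory sketch in Python. It is not the Analysis Services implementation used in the study: the `build_cube` function, the `"*"` roll-up marker, and the choice of the mean as the aggregate are all assumptions for illustration. The cube pre-computes the aggregate for every combination of kept and rolled-up dimensions, so any slice can be read with a single lookup.

```python
from collections import defaultdict
from itertools import product

ALL = "*"  # marker for a dimension that has been rolled up (aggregated away)


def build_cube(rows, dims, measure):
    """Pre-compute the mean of `measure` for every roll-up of `dims`.

    rows    -- iterable of dicts, one per fact (e.g. one expression value)
    dims    -- tuple of dimension names present in every row
    measure -- name of the numeric field to aggregate
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        # Each fact contributes to every combination of kept/rolled-up dims.
        for mask in product((True, False), repeat=len(dims)):
            key = tuple(row[d] if keep else ALL for d, keep in zip(dims, mask))
            sums[key] += row[measure]
            counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}
```

With dimensions such as gene and time point, `cube[("g1", "*")]` would give the mean expression of a hypothetical gene `g1` across all time points, and `cube[("*", 6)]` the mean across all genes at the 6-hour point.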
Integrating RFID technique to design mobile handheld inventory management system
NASA Astrophysics Data System (ADS)
Huang, Yo-Ping; Yen, Wei; Chen, Shih-Chung
2008-04-01
An RFID-based mobile handheld inventory management system is proposed in this paper. Differing from the manual inventory management method, the proposed system works on the personal digital assistant (PDA) with an RFID reader. The system identifies electronic tags on the properties and checks the property information in the back-end database server through a ubiquitous wireless network. The system also provides a set of functions to manage the back-end inventory database and assigns different levels of access privilege according to various user categories. In the back-end database server, to prevent improper or illegal accesses, the server not only stores the inventory database and user privilege information, but also keeps track of the user activities in the server including the login and logout time and location, the records of database accessing, and every modification of the tables. Some experimental results are presented to verify the applicability of the integrated RFID-based mobile handheld inventory management system.
Jabłoński, Michał; Starčuková, Jana; Starčuk, Zenon
2017-01-23
Proton magnetic resonance spectroscopy is a non-invasive measurement technique that provides information about the concentrations of up to 20 metabolites participating in intracellular biochemical processes. To obtain metabolic information from measured spectra, the spectra must be processed in specialized software such as jMRUI. The processing is interactive and complex and often requires many trials before a correct result is obtained. This paper proposes a jMRUI enhancement for efficient and unambiguous history tracking and file identification. A database storing all processing steps, parameters and files used in processing was developed for jMRUI. The solution was developed in Java; the authors used an SQL database for robust storage of parameters and SHA-256 hash codes for unambiguous file identification. The developed system was integrated directly into jMRUI and will be publicly available. A graphical user interface was implemented to make the user experience more comfortable. The database operation is invisible to the common user; all tracking operations are performed in the background. The implemented jMRUI database is a tool that can significantly help the user track the processing history of data in jMRUI. The tool is designed to be user-friendly, robust and easy to use. The database GUI allows the user to browse the whole processing history of a selected file and learn, for example, what processing led to the results and where the original data are stored, and to obtain the list of all processing actions performed on spectra.
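The combination of content hashing and an SQL history table can be sketched as follows. The original work was implemented in Java inside jMRUI; this Python/SQLite version is only an illustration, and the table name `processing_step`, its columns, and the helper functions are hypothetical rather than the jMRUI schema. Files are identified by their SHA-256 digest, so a renamed or moved file still maps to the same history.

```python
import hashlib
import sqlite3


def file_sha256(path: str) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def open_history(db_path: str = ":memory:") -> sqlite3.Connection:
    """Open the history database and create the (hypothetical) step table."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS processing_step (
               id          INTEGER PRIMARY KEY AUTOINCREMENT,
               input_hash  TEXT NOT NULL,
               operation   TEXT NOT NULL,
               parameters  TEXT NOT NULL,
               output_hash TEXT NOT NULL
           )"""
    )
    return conn


def record_step(conn, input_hash, operation, parameters, output_hash):
    """Store one processing step linking an input file to its output file."""
    conn.execute(
        "INSERT INTO processing_step "
        "(input_hash, operation, parameters, output_hash) VALUES (?, ?, ?, ?)",
        (input_hash, operation, parameters, output_hash),
    )
    conn.commit()


def history_of(conn, file_hash):
    """Walk back from a result file to the original data, oldest step first."""
    steps, seen, current = [], set(), file_hash
    while current not in seen:  # guard against cycles such as no-op steps
        seen.add(current)
        row = conn.execute(
            "SELECT input_hash, operation, parameters FROM processing_step "
            "WHERE output_hash = ? ORDER BY id DESC LIMIT 1",
            (current,),
        ).fetchone()
        if row is None:
            break
        steps.append((row[1], row[2]))
        current = row[0]
    steps.reverse()
    return steps
```

Calling `history_of(conn, file_sha256(path))` on a result file would list the recorded operations and parameters that produced it.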
Shah, Sachin D.; Maltby, David R.
2010-01-01
The U.S. Geological Survey, in cooperation with the U.S. Army Corps of Engineers, compiled salinity-related water-quality data and information in a geodatabase containing more than 6,000 sampling sites. The geodatabase was designed as a tool for water-resource management and includes readily available digital data sources from the U.S. Geological Survey, U.S. Environmental Protection Agency, New Mexico Interstate Stream Commission, Sustainability of semi-Arid Hydrology and Riparian Areas, Paso del Norte Watershed Council, numerous other State and local databases, and selected databases maintained by the University of Arizona and New Mexico State University. Salinity information was compiled for an approximately 26,000-square-mile area of the Rio Grande Basin from the Rio Arriba-Sandoval County line, New Mexico, to Presidio, Texas. The geodatabase relates the spatial location of sampling sites with salinity-related water-quality data reported by multiple agencies. The sampling sites are stored in a geodatabase feature class; each site is linked by a relationship class to the corresponding sample and results stored in data tables.
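The site-to-sample-to-result linkage described above can be approximated in plain SQL. The sketch below uses SQLite from Python rather than an actual geodatabase, with foreign keys standing in for the relationship class; the table names, column names, and the single demo row are invented for illustration and do not come from the report.

```python
import sqlite3

# Hypothetical schema: `site` stands in for the feature class, and the
# foreign keys stand in for the geodatabase relationship class.
SCHEMA = """
CREATE TABLE site (
    site_id   TEXT PRIMARY KEY,
    latitude  REAL NOT NULL,
    longitude REAL NOT NULL
);
CREATE TABLE sample (
    sample_id   INTEGER PRIMARY KEY,
    site_id     TEXT NOT NULL REFERENCES site(site_id),
    sample_date TEXT NOT NULL
);
CREATE TABLE result (
    sample_id INTEGER NOT NULL REFERENCES sample(sample_id),
    parameter TEXT NOT NULL,
    value     REAL NOT NULL,
    unit      TEXT NOT NULL
);
"""


def build_demo() -> sqlite3.Connection:
    """Create an in-memory database with one invented site and result."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO site VALUES ('S1', 31.8, -106.5)")
    conn.execute("INSERT INTO sample VALUES (1, 'S1', '2009-06-01')")
    conn.execute(
        "INSERT INTO result VALUES (1, 'dissolved solids', 850.0, 'mg/L')"
    )
    return conn


def results_for_site(conn, site_id):
    """Follow the site -> sample -> result links for one sampling site."""
    return conn.execute(
        """SELECT s.sample_date, r.parameter, r.value, r.unit
           FROM sample AS s
           JOIN result AS r ON r.sample_id = s.sample_id
           WHERE s.site_id = ?
           ORDER BY s.sample_date""",
        (site_id,),
    ).fetchall()
```

Keeping locations, samples, and results in separate linked tables lets data reported by different agencies share one set of sampling sites.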
Social media based NPL system to find and retrieve ARM data: Concept paper
DOE Office of Scientific and Technical Information (OSTI.GOV)
Devarakonda, Ranjeet; Giansiracusa, Michael T.; Kumar, Jitendra
Information connectivity and retrieval has a role in our daily lives. The most pervasive source of online information is databases. The amount of data is growing at a rapid rate and database technology is improving and having a profound effect. Almost all online applications are storing and retrieving information from databases. One challenge in supplying the public with wider access to informational databases is the need for knowledge of database languages like Structured Query Language (SQL). Although the SQL language has been published in many forms, not everybody is able to write SQL queries. Another challenge is that it may not be practical to make the public aware of the structure of the database. There is a need for novice users to query relational databases using their natural language. To solve this problem, many natural language interfaces to structured databases have been developed. The goal is to provide a more intuitive method for generating database queries and delivering responses. Social media makes it possible to interact with a wide section of the population. Through this medium, and with the help of Natural Language Processing (NLP), we can make the data of the Atmospheric Radiation Measurement Data Center (ADC) more accessible to the public. We propose an architecture for using Apache Lucene/Solr [1], OpenML [2,3], and Kafka [4] to generate an automated query/response system with inputs from Twitter [5], our Cassandra DB, and our log database. Using the Twitter API and NLP we can give the public the ability to ask questions of our database and get automated responses.
Groundwater levels for selected wells in Upper Kittitas County, Washington
Fasser, E.T.; Julich, R.J.
2011-01-01
Groundwater levels for selected wells in Upper Kittitas County, Washington, are presented on an interactive, web-based map to document the spatial distribution of groundwater levels in the study area measured during spring 2011. Groundwater-level data and well information were collected by the U.S. Geological Survey using standard techniques and are stored in the U.S. Geological Survey National Water Information System, Groundwater Site-Inventory database.
Yoo, Danny; Xu, Iris; Berardini, Tanya Z; Rhee, Seung Yon; Narayanasamy, Vijay; Twigger, Simon
2006-03-01
For most systems in biology, a large body of literature exists that describes the complexity of the system based on experimental results. Manual review of this literature to extract targeted information into biological databases is difficult and time consuming. To address this problem, we developed PubSearch and PubFetch, which store literature, keyword, and gene information in a relational database, index the literature with keywords and gene names, and provide a Web user interface for annotating the genes from experimental data found in the associated literature. A set of protocols is provided in this unit for installing, populating, running, and using PubSearch and PubFetch. In addition, we provide support protocols for performing controlled vocabulary annotations. Intended users of PubSearch and PubFetch are database curators and biology researchers interested in tracking the literature and capturing information about genes of interest in a more effective way than with conventional spreadsheets and lab notebooks.
EPA Facility Registry Service (FRS): OIL
This dataset contains location and facility identification information from EPA's Facility Registry Service (FRS) for the subset of facilities that link to the Oil database. The Oil database contains information on Spill Prevention, Control, and Countermeasure (SPCC) and Facility Response Plan (FRP) subject facilities to prevent and respond to oil spills. FRP facilities are referred to as substantial harm facilities due to the quantities of oil stored and facility characteristics. FRS identifies and geospatially locates facilities, sites or places subject to environmental regulations or of environmental interest. Using vigorous verification and data management procedures, FRS integrates facility data from EPA's national program systems, other federal agencies, and State and tribal master facility records and provides EPA with a centrally managed, single source of comprehensive and authoritative information on facilities. This data set contains the subset of FRS integrated facilities that link to Oil facilities once the Oil data has been integrated into the FRS database. Additional information on FRS is available at the EPA website https://www.epa.gov/enviro/facility-registry-service-frs.
A Community Data Model for Hydrologic Observations
NASA Astrophysics Data System (ADS)
Tarboton, D. G.; Horsburgh, J. S.; Zaslavsky, I.; Maidment, D. R.; Valentine, D.; Jennings, B.
2006-12-01
The CUAHSI Hydrologic Information System project is developing information technology infrastructure to support hydrologic science. Hydrologic information science involves the description of hydrologic environments in a consistent way, using data models for information integration. This includes a hydrologic observations data model for the storage and retrieval of hydrologic observations in a relational database designed to facilitate data retrieval for integrated analysis of information collected by multiple investigators. It is intended to provide a standard format to facilitate the effective sharing of information between investigators and to facilitate analysis of information within a single study area or hydrologic observatory, or across hydrologic observatories and regions. The observations data model is designed to store hydrologic observations and sufficient ancillary information (metadata) about the observations to allow them to be unambiguously interpreted and used, and to provide traceable heritage from raw measurements to usable information. The design is based on the premise that a relational database at the single observation level is most effective for providing querying capability and cross-dimension data retrieval and analysis. This premise is being tested through the implementation of a prototype hydrologic observations database, and the development of web services for the retrieval of data from and ingestion of data into the database. These web services, hosted by the San Diego Supercomputer Center, make data in the database accessible both through a Hydrologic Data Access System portal and directly from applications software such as Excel, Matlab and ArcGIS that have Simple Object Access Protocol (SOAP) capability.
This paper will (1) describe the data model; (2) demonstrate the capability for representing diverse data in the same database; (3) demonstrate the use of the database from applications software for the performance of hydrologic analysis across different observation types.
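The premise of storing data at the single observation level can be illustrated with a minimal sketch. The table and column names below (sites, variables, observations, source) are hypothetical and are not taken from the CUAHSI data model; SQLite is used only so the example is self-contained.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with one row per individual observation."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE sites (site_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE variables (
            variable_id INTEGER PRIMARY KEY,
            name TEXT NOT NULL,
            units TEXT NOT NULL
        );
        -- Hypothetical layout: each measurement is its own row, carrying
        -- the metadata needed to interpret it (site, variable, time, source).
        CREATE TABLE observations (
            obs_id INTEGER PRIMARY KEY,
            site_id INTEGER NOT NULL REFERENCES sites(site_id),
            variable_id INTEGER NOT NULL REFERENCES variables(variable_id),
            observed_at TEXT NOT NULL,
            value REAL NOT NULL,
            source TEXT NOT NULL
        );
        """
    )
    conn.executemany("INSERT INTO sites VALUES (?, ?)",
                     [(1, "Site A"), (2, "Site B")])
    conn.executemany("INSERT INTO variables VALUES (?, ?, ?)",
                     [(1, "streamflow", "m3/s"), (2, "precipitation", "mm")])
    conn.executemany(
        "INSERT INTO observations "
        "(site_id, variable_id, observed_at, value, source) "
        "VALUES (?, ?, ?, ?, ?)",
        [
            (1, 1, "2006-01-01T00:00", 12.5, "investigator 1"),
            (1, 2, "2006-01-01T00:00", 3.0, "investigator 2"),
            (1, 1, "2006-01-02T00:00", 14.1, "investigator 1"),
            (2, 1, "2006-01-01T00:00", 7.8, "investigator 3"),
        ],
    )
    return conn


def observations_at_site(conn, site_name):
    """Return all observations at one site, across variables and sources."""
    return conn.execute(
        """
        SELECT v.name, v.units, o.observed_at, o.value, o.source
        FROM observations AS o
        JOIN sites AS s ON s.site_id = o.site_id
        JOIN variables AS v ON v.variable_id = o.variable_id
        WHERE s.name = ?
        ORDER BY o.observed_at, v.name
        """,
        (site_name,),
    ).fetchall()
```

Because every measurement is a separate row, one query can combine observation types collected by different investigators, which is the kind of cross-dimension retrieval the abstract refers to.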
Newt-omics: a comprehensive repository for omics data from the newt Notophthalmus viridescens
Bruckskotten, Marc; Looso, Mario; Reinhardt, Richard; Braun, Thomas; Borchardt, Thilo
2012-01-01
Notophthalmus viridescens, a member of the salamander family, is an excellent model organism to study regenerative processes due to its unique ability to replace lost appendages and to repair internal organs. Molecular insights into regenerative events have been severely hampered by the lack of genomic, transcriptomic and proteomic data, as well as an appropriate database to store such novel information. Here, we describe ‘Newt-omics’ (http://newt-omics.mpi-bn.mpg.de), a database that enables researchers to locate, retrieve and store data sets dedicated to the molecular characterization of newts. Newt-omics is a transcript-centred database, based on an Expressed Sequence Tag (EST) data set from the newt, covering ∼50 000 Sanger sequenced transcripts and a set of high-density microarray data, generated from regenerating hearts. Newt-omics also contains a large set of peptides identified by mass spectrometry, which was used to validate 13 810 ESTs as true protein coding. Newt-omics is open to implement additional high-throughput data sets without changing the database structure. Via a user-friendly interface Newt-omics allows access to a huge set of molecular data without the need for prior bioinformatical expertise. PMID:22039101
Intelligent robot control using an adaptive critic with a task control center and dynamic database
NASA Astrophysics Data System (ADS)
Hall, E. L.; Ghaffari, M.; Liao, X.; Alhaj Ali, S. M.
2006-10-01
The purpose of this paper is to describe the design, development and simulation of a real time controller for an intelligent, vision guided robot. The use of a creative controller that can select its own tasks is demonstrated. This creative controller uses a task control center and dynamic database. The dynamic database stores both global environmental information and local information including the kinematic and dynamic models of the intelligent robot. The kinematic model is very useful for position control and simulations. However, models of the dynamics of the manipulators are needed for tracking control of the robot's motions. Such models are also necessary for sizing the actuators, tuning the controller, and achieving superior performance. Simulations of various control designs are shown. Much of the model has also been used for the actual prototype Bearcat Cub mobile robot. This vision guided robot was designed for the Intelligent Ground Vehicle Contest. A novel feature of the proposed approach is that the method is applicable to both robot arm manipulators and robot bases such as wheeled mobile robots. This generality should encourage the development of more mobile robots with manipulator capability since both models can be easily stored in the dynamic database. The multi task controller also permits wide applications. Manipulators and mobile bases with high-level control are potentially useful for space exploration, certain rescue robots, defense robots, and medical robotics aids.
Cros, Annick; Ahamad Fatan, Nurulhuda; White, Alan; Teoh, Shwu Jiau; Tan, Stanley; Handayani, Christian; Huang, Charles; Peterson, Nate; Venegas Li, Ruben; Siry, Hendra Yusran; Fitriana, Ria; Gove, Jamison; Acoba, Tomoko; Knight, Maurice; Acosta, Renerio; Andrew, Neil; Beare, Doug
2014-01-01
In this paper we describe the construction of an online GIS database system, hosted by WorldFish, which stores bio-physical, ecological and socio-economic data for the 'Coral Triangle Area' in South-east Asia and the Pacific. The database has been built in partnership with all six (Timor-Leste, Malaysia, Indonesia, The Philippines, Solomon Islands and Papua New Guinea) of the Coral Triangle countries, and represents a valuable source of information for natural resource managers at the regional scale. Its utility is demonstrated using biophysical data, data summarising marine habitats, and data describing the extent of marine protected areas in the region.
We've Got Plenty of Data, Now How Can We Use It?
ERIC Educational Resources Information Center
Weiler, Jeffrey K.; Mears, Robert L.
1999-01-01
To mine a large store of school data, a new technology (variously termed data warehousing, data marts, online analytical processing, and executive information systems) is emerging. Data warehousing helps school districts extract and restructure desired data from automated systems and create new databases designed to enhance analytical and…
A web-based 3D geological information visualization system
NASA Astrophysics Data System (ADS)
Song, Renbo; Jiang, Nan
2013-03-01
Construction of 3D geological visualization systems has attracted growing attention in the GIS, computer modeling, simulation and visualization fields. Such systems not only help with geological interpretation and analysis work, but can also help improve professional geoscience education. In this paper, an applet-based method is introduced for developing a web-based 3D geological information visualization system. The main aim of this paper is to explore a rapid and low-cost development method for constructing a web-based 3D geological system. First, the borehole data stored in Excel spreadsheets were extracted and then stored in a SQL Server database on a web server. Second, the JDBC data access component was used to provide access to the database. Third, the user interface was implemented with an applet component embedded in a JSP page, and the 3D viewing and querying functions were implemented with PickCanvas of Java3D. Last, borehole data acquired from a geological survey were used to test the system, and the test results show that the methods presented in this paper have practical application value.
mpMoRFsDB: a database of molecular recognition features in membrane proteins.
Gypas, Foivos; Tsaousis, Georgios N; Hamodrakas, Stavros J
2013-10-01
Molecular recognition features (MoRFs) are small, intrinsically disordered regions in proteins that undergo a disorder-to-order transition on binding to their partners. MoRFs are involved in protein-protein interactions and may function as the initial step in molecular recognition. The aim of this work was to collect, organize and store all membrane proteins that contain MoRFs. Membrane proteins constitute ∼30% of fully sequenced proteomes and are responsible for a wide variety of cellular functions. MoRFs were classified according to their secondary structure, after interacting with their partners. We identified MoRFs in transmembrane and peripheral membrane proteins. The position of transmembrane protein MoRFs was determined in relation to a protein's topology. All information was stored in a publicly available mySQL database with a user-friendly web interface. A Jmol applet is integrated for visualization of the structures. mpMoRFsDB provides valuable information related to disorder-based protein-protein interactions in membrane proteins. http://bioinformatics.biol.uoa.gr/mpMoRFsDB
A survey of commercial object-oriented database management systems
NASA Technical Reports Server (NTRS)
Atkins, John
1992-01-01
The object-oriented data model is the culmination of over thirty years of database research. Initially, database research focused on the need to provide information in a consistent and efficient manner to the business community. Early data models such as the hierarchical model and the network model met the goal of consistent and efficient access to data and were substantial improvements over simple file mechanisms for storing and accessing data. However, these models required highly skilled programmers to provide access to the data. Consequently, in the early 1970s, E.F. Codd, an IBM research computer scientist, proposed a new data model based on the simple mathematical notion of the relation. This model is known as the Relational Model. In the relational model, data is represented in flat tables (or relations) which have no physical or internal links between them. The simplicity of this model fostered the development of powerful but relatively simple query languages that made data directly accessible to the general database user. Except for large, multi-user database systems, a database professional was in general no longer necessary. Database professionals found that traditional data in the form of character data, dates, and numeric data were easily represented and managed via the relational model. Commercial relational database management systems proliferated and performance of relational databases improved dramatically. However, there was a growing community of potential database users whose needs were not met by the relational model. These users needed to store data with data types not available in the relational model and required a far richer modelling environment than that provided by the relational model. Indeed, the complexity of the objects to be represented in the model mandated a new approach to database technology. The Object-Oriented Model was the result.
Development of a replicated database of DHCP data for evaluation of drug use.
Graber, S E; Seneker, J A; Stahl, A A; Franklin, K O; Neel, T E; Miller, R A
1996-01-01
This case report describes development and testing of a method to extract clinical information stored in the Veterans Affairs (VA) Decentralized Hospital Computer System (DHCP) for the purpose of analyzing data about groups of patients. The authors used a microcomputer-based, structured query language (SQL)-compatible, relational database system to replicate a subset of the Nashville VA Hospital's DHCP patient database. This replicated database contained the complete current Nashville DHCP prescription, provider, patient, and drug data sets, and a subset of the laboratory data. A pilot project employed this replicated database to answer questions that might arise in drug-use evaluation, such as identification of cases of polypharmacy, suboptimal drug regimens, and inadequate laboratory monitoring of drug therapy. These database queries included as candidates for review all prescriptions for all outpatients. The queries demonstrated that specific drug-use events could be identified for any time interval represented in the replicated database. PMID:8653451
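A query of the kind described above, flagging possible polypharmacy within a chosen time interval, might look like the following minimal sketch. The prescriptions table, its columns, the drug names, and the threshold on distinct drugs are all hypothetical and are not taken from DHCP or from the authors' replicated database; SQLite stands in for the microcomputer SQL system.

```python
import sqlite3


def demo_prescriptions():
    """Build a tiny in-memory prescriptions table with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE prescriptions ("
        "patient_id INTEGER NOT NULL, "
        "drug TEXT NOT NULL, "
        "issued_on TEXT NOT NULL)"  # ISO dates sort correctly as text
    )
    conn.executemany(
        "INSERT INTO prescriptions VALUES (?, ?, ?)",
        [
            (1, "drug_a", "1995-02-01"),
            (1, "drug_b", "1995-03-15"),
            (1, "drug_c", "1995-06-30"),
            (2, "drug_a", "1995-04-10"),
            (2, "drug_b", "1995-05-05"),
            (2, "drug_c", "1996-01-20"),
        ],
    )
    return conn


def find_polypharmacy(conn, min_drugs, start, end):
    """List patients with at least min_drugs distinct drugs in [start, end]."""
    return conn.execute(
        """
        SELECT patient_id, COUNT(DISTINCT drug) AS n_drugs
        FROM prescriptions
        WHERE issued_on BETWEEN ? AND ?
        GROUP BY patient_id
        HAVING COUNT(DISTINCT drug) >= ?
        ORDER BY patient_id
        """,
        (start, end, min_drugs),
    ).fetchall()
```

Changing the start and end arguments restricts the review to any interval covered by the data, mirroring the abstract's point that events could be identified for any time interval in the replicated database.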
Wu, Tsung-Jung; Shamsaddini, Amirhossein; Pan, Yang; Smith, Krista; Crichton, Daniel J; Simonyan, Vahan; Mazumder, Raja
2014-01-01
Years of sequence feature curation by UniProtKB/Swiss-Prot, PIR-PSD, NCBI-CDD, RefSeq and other database biocurators has led to a rich repository of information on functional sites of genes and proteins. This information along with variation-related annotation can be used to scan human short sequence reads from next-generation sequencing (NGS) pipelines for presence of non-synonymous single-nucleotide variations (nsSNVs) that affect functional sites. This and similar workflows are becoming more important because thousands of NGS data sets are being made available through projects such as The Cancer Genome Atlas (TCGA), and researchers want to evaluate their biomarkers in genomic data. BioMuta, an integrated sequence feature database, provides a framework for automated and manual curation and integration of cancer-related sequence features so that they can be used in NGS analysis pipelines. Sequence feature information in BioMuta is collected from the Catalogue of Somatic Mutations in Cancer (COSMIC), ClinVar, UniProtKB and through biocuration of information available from publications. Additionally, nsSNVs identified through automated analysis of NGS data from TCGA are also included in the database. Because of the petabytes of data and information present in NGS primary repositories, a platform HIVE (High-performance Integrated Virtual Environment) for storing, analyzing, computing and curating NGS data and associated metadata has been developed. Using HIVE, 31 979 nsSNVs were identified in TCGA-derived NGS data from breast cancer patients. All variations identified through this process are stored in a Curated Short Read archive, and the nsSNVs from the tumor samples are included in BioMuta. Currently, BioMuta has 26 cancer types with 13 896 small-scale and 308 986 large-scale study-derived variations. Integration of variation data allows identifications of novel or common nsSNVs that can be prioritized in validation studies. 
Database URL: BioMuta: http://hive.biochemistry.gwu.edu/tools/biomuta/index.php; CSR: http://hive.biochemistry.gwu.edu/dna.cgi?cmd=csr; HIVE: http://hive.biochemistry.gwu.edu.
An SQL query generator for CLIPS
NASA Technical Reports Server (NTRS)
Snyder, James; Chirica, Laurian
1990-01-01
As expert systems become more widely used, their access to large amounts of external information becomes increasingly important. This information exists in several forms such as statistical, tabular data, knowledge gained by experts and large databases of information maintained by companies. Because many expert systems, including CLIPS, do not provide access to this external information, much of the usefulness of expert systems is left untapped. The scope of this paper is to describe a database extension for the CLIPS expert system shell. The current industry standard database language is SQL. Due to SQL standardization, large amounts of information stored on various computers, potentially at different locations, will be more easily accessible. Expert systems should be able to directly access these existing databases rather than requiring information to be re-entered into the expert system environment. The ORACLE relational database management system (RDBMS) was used to provide a database connection within the CLIPS environment. To facilitate relational database access, a query generation system was developed as a CLIPS user function. The queries are entered in a CLIPS-like syntax and are passed to the query generator, which constructs an SQL query and submits it for execution to the ORACLE RDBMS. The query results are asserted as CLIPS facts. The query generator was developed primarily for use within the ICADS project (Intelligent Computer Aided Design System) currently being developed by the CAD Research Unit at the California Polytechnic State University (Cal Poly). In ICADS, there are several parallel or distributed expert systems accessing a common knowledge base of facts. Each expert system has a narrow domain of interest and therefore needs only certain portions of the information. The query generator provides a common method of accessing this information and allows the expert system to specify what data is needed without specifying how to retrieve it.
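The general idea of a query generator, where the caller states what data is needed and the generator builds the SQL, can be sketched as follows. This is not the CLIPS user function from the paper: the request format, the function names, and the parts table are hypothetical, SQLite replaces ORACLE so the sketch is self-contained, and facts are modelled as plain tuples.

```python
import sqlite3


def build_query(table, columns, conditions):
    """Turn a declarative request into a parameterized SQL string.

    table: table name; columns: list of column names;
    conditions: dict mapping column name to required value.
    """
    # Identifiers cannot be bound as parameters, so validate them instead.
    for ident in [table, *columns, *conditions]:
        if not ident.isidentifier():
            raise ValueError(f"unsafe identifier: {ident!r}")
    sql = f"SELECT {', '.join(columns)} FROM {table}"
    params = []
    if conditions:
        sql += " WHERE " + " AND ".join(f"{col} = ?" for col in conditions)
        params = list(conditions.values())
    return sql, params


def query_as_facts(conn, fact_name, table, columns, conditions):
    """Run the generated query and return each row as a fact-like tuple."""
    sql, params = build_query(table, columns, conditions)
    return [(fact_name, *row) for row in conn.execute(sql, params)]


def demo_connection():
    """Hypothetical design database with a single parts table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE parts (name TEXT, kind TEXT, material TEXT)")
    conn.executemany(
        "INSERT INTO parts VALUES (?, ?, ?)",
        [("b1", "beam", "steel"), ("c1", "column", "concrete"),
         ("b2", "beam", "timber")],
    )
    return conn
```

An expert system with a narrow domain would then request only the portion it needs, for example `query_as_facts(conn, "beam", "parts", ["name", "material"], {"kind": "beam"})`, without knowing how the retrieval is carried out.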
Database recovery using redundant disk arrays
NASA Technical Reports Server (NTRS)
Mourad, Antoine N.; Fuchs, W. K.; Saab, Daniel G.
1992-01-01
Redundant disk arrays provide a way for achieving rapid recovery from media failures with a relatively low storage cost for large scale database systems requiring high availability. In this paper a method is proposed for using redundant disk arrays to support rapid-recovery from system crashes and transaction aborts in addition to their role in providing media failure recovery. A twin page scheme is used to store the parity information in the array so that the time for transaction commit processing is not degraded. Using an analytical model, it is shown that the proposed method achieves a significant increase in the throughput of database systems using redundant disk arrays by reducing the number of recovery operations needed to maintain the consistency of the database.
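For readers unfamiliar with parity in redundant disk arrays, the sketch below shows the common XOR parity idea used for media failure recovery: the parity block is the bytewise XOR of the data blocks in a stripe, so any single lost block can be rebuilt from the survivors. It does not reproduce the twin page scheme proposed in the paper, and the function names are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equally sized blocks; used to compute stripe parity."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def reconstruct_block(surviving_blocks, parity):
    """Rebuild the single missing data block from the survivors and parity."""
    return xor_blocks([*surviving_blocks, parity])
```

The storage cost is one extra block per stripe rather than a full mirror copy, which is the relatively low overhead the abstract alludes to.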
Recovery issues in databases using redundant disk arrays
NASA Technical Reports Server (NTRS)
Mourad, Antoine N.; Fuchs, W. K.; Saab, Daniel G.
1993-01-01
Redundant disk arrays provide a way for achieving rapid recovery from media failures with a relatively low storage cost for large scale database systems requiring high availability. In this paper we propose a method for using redundant disk arrays to support rapid recovery from system crashes and transaction aborts in addition to their role in providing media failure recovery. A twin page scheme is used to store the parity information in the array so that the time for transaction commit processing is not degraded. Using an analytical model, we show that the proposed method achieves a significant increase in the throughput of database systems using redundant disk arrays by reducing the number of recovery operations needed to maintain the consistency of the database.
Naval Ship Database: Database Design, Implementation, and Schema
2013-09-01
incoming data. The solution allows database users to store and analyze data collected by navy ships in the Royal Canadian Navy ( RCN ). The data...understanding RCN jargon and common practices on a typical RCN vessel. This experience led to the development of several error detection methods to...data to be stored in the database. Mr. Massel has also collected data pertaining to day to day activities on RCN vessels that has been imported into
The Hawaiian Algal Database: a laboratory LIMS and online resource for biodiversity data
Wang, Norman; Sherwood, Alison R; Kurihara, Akira; Conklin, Kimberly Y; Sauvage, Thomas; Presting, Gernot G
2009-01-01
Background Organization and presentation of biodiversity data is greatly facilitated by databases that are specially designed to allow easy data entry and organized data display. Such databases also have the capacity to serve as Laboratory Information Management Systems (LIMS). The Hawaiian Algal Database was designed to showcase specimens collected from the Hawaiian Archipelago, enabling users around the world to compare their specimens with our photographs and DNA sequence data, and to provide lab personnel with an organizational tool for storing various biodiversity data types. Description We describe the Hawaiian Algal Database, a comprehensive and searchable database containing photographs and micrographs, geo-referenced collecting information, taxonomic checklists and standardized DNA sequence data. All data for individual samples are linked through unique accession numbers. Users can search online for sample information by accession number, numerous levels of taxonomy, or collection site. At the present time the database contains data representing over 2,000 samples of marine, freshwater and terrestrial algae from the Hawaiian Archipelago. These samples are primarily red algae, although other taxa are being added. Conclusion The Hawaiian Algal Database is a digital repository for Hawaiian algal samples and acts as a LIMS for the laboratory. Users can make use of the online search tool to view and download specimen photographs and micrographs, DNA sequences and relevant habitat data, including georeferenced collecting locations. It is publicly available at . PMID:19728892
A web-based platform for virtual screening.
Watson, Paul; Verdonk, Marcel; Hartshorn, Michael J
2003-09-01
A fully integrated, web-based, virtual screening platform has been developed to allow rapid virtual screening of large numbers of compounds. ORACLE is used to store information at all stages of the process. The system includes a large database of historical compounds from high throughput screenings (HTS) chemical suppliers, ATLAS, containing over 3.1 million unique compounds with their associated physiochemical properties (ClogP, MW, etc.). The database can be screened using a web-based interface to produce compound subsets for virtual screening or virtual library (VL) enumeration. In order to carry out the latter task within ORACLE a reaction data cartridge has been developed. Virtual libraries can be enumerated rapidly using the web-based interface to the cartridge. The compound subsets can be seamlessly submitted for virtual screening experiments, and the results can be viewed via another web-based interface allowing ad hoc querying of the virtual screening data stored in ORACLE.
BioBenchmark Toyama 2012: an evaluation of the performance of triple stores on biological data
Wu, Hongyan; Fujiwara, Toyofumi; Yamamoto, Yasunori; Bolleman, Jerven; Yamaguchi, Atsuko
2014-01-01
Background Biological databases vary enormously in size and data complexity, from small databases that contain a few million Resource Description Framework (RDF) triples to large databases that contain billions of triples. In this paper, we evaluate whether RDF native stores can be used to meet the needs of a biological database provider. Prior evaluations have used synthetic data with a limited database size. For example, the largest BSBM benchmark uses 1 billion synthetic e-commerce knowledge RDF triples on a single node. However, real-world biological data differ considerably from simple synthetic data, and it is difficult to determine whether synthetic e-commerce data can adequately represent biological databases. Therefore, for this evaluation, we used five real data sets from biological databases. Results We evaluated five triple stores, 4store, Bigdata, Mulgara, Virtuoso, and OWLIM-SE, with five biological data sets, Cell Cycle Ontology, Allie, PDBj, UniProt, and DDBJ, ranging in size from approximately 10 million to 8 billion triples. For each database, we loaded all the data into our single node and prepared the database for use in a classical data warehouse scenario. Then, we ran a series of SPARQL queries against each endpoint and recorded the execution time and the accuracy of the query response. Conclusions Our paper shows that, with appropriate configuration, Virtuoso and OWLIM-SE can satisfy the basic requirements to load and query biological data sets of fewer than about 8 billion triples on a single node, with simultaneous access by 64 clients. OWLIM-SE performs best for databases with approximately 11 million triples. For data sets that contain 94 million and 590 million triples, OWLIM-SE and Virtuoso perform best, and neither shows an overwhelming advantage over the other. For data over 4 billion triples, Virtuoso works best. 4store performs well on small data sets with limited features when the number of triples is less than 100 million, but our tests show that its scalability is poor. Bigdata demonstrates average performance and is a good open source triple store for middle-sized data sets (500 million triples or so). Mulgara shows some fragility. PMID:25089180
Databases on biotechnology and biosafety of GMOs.
Degrassi, Giuliano; Alexandrova, Nevena; Ripandelli, Decio
2003-01-01
Due to the involvement of scientific, industrial, commercial and public sectors of society, the complexity of the issues concerning the safety of genetically modified organisms (GMOs) for the environment, agriculture, and human and animal health calls for a wide coverage of information. Accordingly, development of the field of biotechnology, along with concerns related to the fate of released GMOs, has led to a rapid development of tools for disseminating such information. As a result, there is a growing number of databases aimed at collecting and storing information related to GMOs. Most of the sites deal with information on environmental releases, field trials, transgenes and related sequences, regulations and legislation, risk assessment documents, and literature. Databases are mainly established and managed by scientific, national or international authorities, and are addressed towards scientists, government officials, policy makers, consumers, farmers, environmental groups and civil society representatives. This complexity can lead to an overlapping of information. The purpose of the present review is to analyse the relevant databases currently available on the web, providing comments on their vastly different information and on the structure of the sites pertaining to different users. A preliminary overview on the development of these sites during the last decade, at both the national and international level, is also provided.
A Codasyl-Type Schema for Natural Language Medical Records
Sager, N.; Tick, L.; Story, G.; Hirschman, L.
1980-01-01
This paper describes a CODASYL (network) database schema for information derived from narrative clinical reports. The goal of this work is to create an automated process that accepts natural language documents as input and maps this information into a database of a type managed by existing database management systems. The schema described here represents the medical events and facts identified through the natural language processing. This processing decomposes each narrative into a set of elementary assertions, represented as MEDFACT records in the database. Each assertion in turn consists of a subject and a predicate classed according to a limited number of medical event types, e.g., signs/symptoms, laboratory tests, etc. The subject and predicate are represented by EVENT records which are owned by the MEDFACT record associated with the assertion. The CODASYL-type network structure was found to be suitable for expressing most of the relations needed to represent the natural language information. However, special mechanisms were developed for storing the time relations between EVENT records and for recording connections (such as causality) between certain MEDFACT records. This schema has been implemented using the UNIVAC DMS-1100 DBMS.
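The ownership structure described above, in which a MEDFACT record owns the EVENT records for its subject and predicate and may be connected to other MEDFACT records, can be approximated with a small in-memory sketch. The field names and the relation label are hypothetical, the storage of time relations is omitted, and this is not the DMS-1100 schema itself.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    """One EVENT record: a classed piece of an elementary assertion."""
    event_type: str  # e.g. "sign/symptom" or "laboratory test"
    text: str


@dataclass
class MedFact:
    """One MEDFACT record owning its subject and predicate EVENT records."""
    subject: Event
    predicate: Event
    # Links to other MEDFACT records, stored as (relation, target) pairs.
    connections: list = field(default_factory=list)


def connect(source, relation, target):
    """Record a connection (such as causality) between two MEDFACT records."""
    source.connections.append((relation, target))
```

In a CODASYL database the owner-member link would be a set type maintained by the DBMS; here object references play that role.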
GIS Toolsets for Planetary Geomorphology and Landing-Site Analysis
NASA Astrophysics Data System (ADS)
Nass, Andrea; van Gasselt, Stephan
2015-04-01
Modern Geographic Information Systems (GIS) allow expert and lay users alike to load and position geographic data and perform simple to highly complex surface analyses. For many applications dedicated and ready-to-use GIS tools are available in standard software systems while other applications require the modular combination of available basic tools to answer more specific questions. This also applies to analyses in modern planetary geomorphology where many of such (basic) tools can be used to build complex analysis tools, e.g. in image- and terrain model analysis. Apart from the simple application of sets of different tools, many complex tasks require a more sophisticated design for storing and accessing data using databases (e.g. ArcHydro for hydrological data analysis). In planetary sciences, complex database-driven models are often required to efficiently analyse potential landings sites or store rover data, but also geologic mapping data can be efficiently stored and accessed using database models rather than stand-alone shapefiles. For landings-site analyses, relief and surface roughness estimates are two common concepts that are of particular interest and for both, a number of different definitions co-exist. We here present an advanced toolset for the analysis of image and terrain-model data with an emphasis on extraction of landing site characteristics using established criteria. We provide working examples and particularly focus on the concepts of terrain roughness as it is interpreted in geomorphology and engineering studies.
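Since several definitions of terrain roughness co-exist, the following minimal sketch illustrates just one simple variant: the standard deviation of elevation within a square moving window over a gridded terrain model. The function name, the window size, and the choice of this particular definition are assumptions for illustration and are not taken from the toolset described above.

```python
from statistics import pstdev


def roughness_map(dem, window=3):
    """Standard deviation of elevation in a window centred on each cell.

    dem: list of equal-length rows of elevations; window: odd window size.
    Windows are clipped at the grid edges.
    """
    half = window // 2
    rows, cols = len(dem), len(dem[0])
    result = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            cells = [
                dem[i][j]
                for i in range(max(0, r - half), min(rows, r + half + 1))
                for j in range(max(0, c - half), min(cols, c + half + 1))
            ]
            out_row.append(pstdev(cells))
        result.append(out_row)
    return result
```

A perfectly flat grid yields zero everywhere, while a single raised cell produces non-zero values in every window that contains it; other definitions (for example slope-based or spectral measures) would give different maps for the same terrain.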
Design and Implementation of a Set-Top Box–Based Homecare System Using Hybrid Cloud
Lin, Bor-Shing; Hsiao, Pei-Chi; Cheng, Po-Hsun; Jan, Gene Eu
2015-01-01
Abstract Introduction: Telemedicine has become a prevalent topic in recent years, and several telemedicine systems have been proposed; however, such systems are an unsuitable fit for the daily requirements of users. Materials and Methods: The system proposed in this study was developed as a set-top box integrated with the Android™ (Google, Mountain View, CA) operating system to provide a convenient and user-friendly interface. The proposed system can assist with family healthcare management, telemedicine service delivery, and information exchange among hospitals. To manage the system, a novel type of hybrid cloud architecture was also developed. Results: Updated information is stored on a public cloud, enabling medical staff members to rapidly access information when diagnosing patients. In the long term, the stored data can be reduced to improve the efficiency of the database. Conclusions: The proposed design offers a robust architecture for storing data in a homecare system and can thus resolve network overload and congestion resulting from accumulating data, which are inherent problems in centralized architectures, thereby improving system efficiency. PMID:26075333
Morris, Chris; Pajon, Anne; Griffiths, Susanne L.; Daniel, Ed; Savitsky, Marc; Lin, Bill; Diprose, Jonathan M.; Wilter da Silva, Alan; Pilicheva, Katya; Troshin, Peter; van Niekerk, Johannes; Isaacs, Neil; Naismith, James; Nave, Colin; Blake, Richard; Wilson, Keith S.; Stuart, David I.; Henrick, Kim; Esnouf, Robert M.
2011-01-01
The techniques used in protein production and structural biology have been developing rapidly, but techniques for recording the laboratory information produced have not kept pace. One approach is the development of laboratory information-management systems (LIMS), which typically use a relational database schema to model and store results from a laboratory workflow. The underlying philosophy and implementation of the Protein Information Management System (PiMS), a LIMS development specifically targeted at the flexible and unpredictable workflows of protein-production research laboratories of all scales, is described. PiMS is a web-based Java application that uses either Postgres or Oracle as the underlying relational database-management system. PiMS is available under a free licence to all academic laboratories either for local installation or for use as a managed service. PMID:21460443
Architecture for Multiple Interacting Robot Intelligences
NASA Technical Reports Server (NTRS)
Peters, Richard Alan, II (Inventor)
2008-01-01
An architecture for robot intelligence enables a robot to learn new behaviors and create new behavior sequences autonomously and interact with a dynamically changing environment. Sensory information is mapped onto a Sensory Ego-Sphere (SES) that rapidly identifies important changes in the environment and functions much like short term memory. Behaviors are stored in a database associative memory (DBAM) that creates an active map from the robot's current state to a goal state and functions much like long term memory. A dream state converts recent activities stored in the SES and creates or modifies behaviors in the DBAM.
Big data mining: In-database Oracle data mining over hadoop
NASA Astrophysics Data System (ADS)
Kovacheva, Zlatinka; Naydenova, Ina; Kaloyanova, Kalinka; Markov, Krasimir
2017-07-01
Big data challenges different aspects of storing, processing and managing data, as well as analyzing and using data for business purposes. Applying Data Mining methods over Big Data is another challenge because of huge data volumes, the variety of information, and the dynamics of the sources. Different applications have been built in this area, but their successful usage depends on understanding many specific parameters. In this paper we present several opportunities for using Data Mining techniques provided by the analytical engine of RDBMS Oracle over data stored in Hadoop Distributed File System (HDFS). Some experimental results are given and discussed.
Yamamoto, Naoki; Suzuki, Tomohiro; Kobayashi, Masaaki; Dohra, Hideo; Sasaki, Yohei; Hirai, Hirofumi; Yokoyama, Koji; Kawagishi, Hirokazu; Yano, Kentaro
2014-12-03
The angel's wing oyster mushroom (Pleurocybella porrigens, Sugihiratake) is a well-known delicacy. However, its potential risk in acute encephalopathy was recently revealed by a food poisoning incident. To disclose the genes underlying the accident and provide mechanistic insight, we seek to develop an information infrastructure containing omics data. In our previous work, we sequenced the genome and transcriptome using next-generation sequencing techniques. The next step in achieving our goal is to develop a web database to facilitate the efficient mining of large-scale omics data and identification of genes specifically expressed in the mushroom. This paper introduces a web database A-WINGS (http://bioinf.mind.meiji.ac.jp/a-wings/) that provides integrated genomic and transcriptomic information for the angel's wing oyster mushroom. The database contains structure and functional annotations of transcripts and gene expressions. Functional annotations contain information on homologous sequences from NCBI nr and UniProt, Gene Ontology, and KEGG Orthology. Digital gene expression profiles were derived from RNA sequencing (RNA-seq) analysis in the fruiting bodies and mycelia. The omics information stored in the database is freely accessible through interactive and graphical interfaces by search functions that include 'GO TREE VIEW' browsing, keyword searches, and BLAST searches. The A-WINGS database will accelerate omics studies on specific aspects of the angel's wing oyster mushroom and the family Tricholomataceae.
Multimedia Database at National Museum of Ethnology
NASA Astrophysics Data System (ADS)
Sugita, Shigeharu
This paper describes the information management system at the National Museum of Ethnology, Osaka, Japan. The museum is a kind of research center for cultural anthropology and has many computer systems, such as IBM 3090, VAX11/780, Fujitsu M340R, etc. With these computers, distributed multimedia databases have been constructed in which not only bibliographic data but also artifact images, slide images, book page images, etc. are stored. The data now amount to about 1.3 million items. These data can be retrieved and displayed on a multimedia workstation that has several displays.
URS DataBase: universe of RNA structures and their motifs.
Baulin, Eugene; Yacovlev, Victor; Khachko, Denis; Spirin, Sergei; Roytberg, Mikhail
2016-01-01
The Universe of RNA Structures DataBase (URSDB) stores information obtained from all RNA-containing PDB entries (2935 entries in October 2015). The content of the database is updated regularly. The database consists of 51 tables containing indexed data on various elements of the RNA structures. The database provides a web interface allowing users to select a subset of structures with desired features and to obtain various statistical data for a selected subset of structures or for all structures. In particular, one can easily obtain statistics on geometric parameters of base pairs, on structural motifs (stems, loops, etc.) or on different types of pseudoknots. The user can also view and get information on an individual structure or its selected parts, e.g. RNA-protein hydrogen bonds. URSDB employs a new original definition of loops in RNA structures. That definition fits both pseudoknot-free and pseudoknotted secondary structures and coincides with the classical definition in case of pseudoknot-free structures. To our knowledge, URSDB is the first database supporting searches based on topological classification of pseudoknots and on extended loop classification. Database URL: http://server3.lpm.org.ru/urs/. © The Author(s) 2016. Published by Oxford University Press. PMID:27242032
Evaluation of NoSQL databases for DIRAC monitoring and beyond
NASA Astrophysics Data System (ADS)
Mathe, Z.; Casajus Ramo, A.; Stagni, F.; Tomassetti, L.
2015-12-01
Nowadays, many database systems are available, but they may not be optimized for storing time series data. Monitoring DIRAC jobs would be better done using a database optimised for storing time series data. So far this has been done using a MySQL database, which is not well suited for such an application. Therefore alternatives have been investigated. Choosing an appropriate database for storing huge amounts of time series data is not trivial, as one must take into account different aspects such as manageability, scalability and extensibility. We compared the performance of the Elasticsearch, OpenTSDB (based on HBase) and InfluxDB NoSQL databases, using the same set of machines and the same data. We also evaluated the effort required for maintaining them. Using the LHCb Workload Management System (WMS), based on DIRAC, as a use case, we set up a new monitoring system in parallel with the current MySQL system, and we stored the same data in the databases under test. We evaluated the Grafana (for OpenTSDB) and Kibana (for Elasticsearch) metrics and graph editors for creating dashboards, in order to have a clear picture of the usability of each candidate. In this paper we present the results of this study and the performance of the selected technology. We also give an outlook on other potential applications of NoSQL databases within the DIRAC project.
Implementation of a Big Data Accessing and Processing Platform for Medical Records in Cloud.
Yang, Chao-Tung; Liu, Jung-Chun; Chen, Shuo-Tsung; Lu, Hsin-Wen
2017-08-18
Big Data analysis has become a key factor in being innovative and competitive. Along with worldwide population growth and the aging trend of the population in developed countries, the rate of national medical care usage has been increasing. Because individual medical data are usually scattered across different institutions and their data formats vary, integrating these continually growing data is challenging. To give these data platforms scalable load capacity, they must be built on a sound platform architecture. Some issues must be considered in order to use cloud computing to quickly integrate big medical data into a database for easy analysis, searching, and filtering to obtain valuable information. This work builds a cloud storage system with HBase of Hadoop for storing and analyzing big data of medical records and improves the performance of importing data into the database. The medical records are stored in the HBase database platform for big data analysis. The system performs distributed processing of medical records data through Hadoop MapReduce programming and provides functions including keyword search, data filtering, and basic statistics for the HBase database. The system uses the Put with the single-threaded method and the CompleteBulkload mechanism to import medical data. From the experimental results, we find that when the file size is less than 300MB, the Put with single-threaded method is used, and when the file size is larger than 300MB, the CompleteBulkload mechanism is used to improve the performance of data import into the database. The system provides a web interface that allows users to search data, filter out meaningful information through the web, and analyze and convert data into suitable forms that will be helpful for medical staff and institutions.
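The size-based import rule reported in this abstract can be sketched as a small selector. This is only an illustration: the function name, the returned labels, the reading of "MB" as 1024 × 1024 bytes, and the handling of a file of exactly 300MB (which the abstract does not specify) are all assumptions, and no HBase API is called.

```python
# Assumed interpretation of the 300MB threshold from the abstract.
THRESHOLD_BYTES = 300 * 1024 * 1024


def choose_import_method(file_size_bytes: int) -> str:
    """Pick an import strategy label from the file size.

    Files below the threshold use the single-threaded Put path; larger
    files use the CompleteBulkload path. A file of exactly the threshold
    size is sent to bulk load here, which is an assumption.
    """
    if file_size_bytes < 0:
        raise ValueError("file size must be non-negative")
    if file_size_bytes < THRESHOLD_BYTES:
        return "put_single_threaded"
    return "complete_bulkload"
```

A caller would dispatch on the returned label to whatever import routine its own system provides.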
Current Standardization and Cooperative Efforts Related to Industrial Information Infrastructures.
1993-05-01
Data Management Systems: Components used to store, manage, and retrieve data. Data management includes knowledge bases, database management...Application Development Tools and Methods X/Open and POSIX APIs Integrated Design Support System (IDS) Knowledge-Based Systems (KBS) Application...IDEFlx) Yourdon Jackson System Design (JSD) Knowledge-Based Systems (KBSs) Structured Systems Development (SSD) Semantic Unification Meta-Model
ERIC Educational Resources Information Center
Miley, David W.
Many reference librarians still rely on manual searches to access vertical files, ready reference files, and other information stored in card files, drawers, and notebooks scattered around the reference department. Automated access to these materials via microcomputers using database management software may speed up the process. This study focuses…
NASA Technical Reports Server (NTRS)
Maluf, David A.; Tran, Peter B.
2003-01-01
An object-relational database management system is an integrated hybrid cooperative approach that combines the best practices of both the relational model utilizing SQL queries and the object-oriented, semantic paradigm for supporting complex data creation. In this paper, a highly scalable, information-on-demand database framework called NETMARK is introduced. NETMARK takes advantage of the Oracle 8i object-relational database, using physical-address data types for very efficient keyword search of records spanning both context and content. NETMARK was originally developed in early 2000 as a research and development prototype to cope with the vast amounts of unstructured and semi-structured documents existing within NASA enterprises. Today, NETMARK is a flexible, high-throughput open database framework for managing, storing, and searching unstructured or semi-structured arbitrary hierarchical models, such as XML and HTML.
An Extensible Schema-less Database Framework for Managing High-throughput Semi-Structured Documents
NASA Technical Reports Server (NTRS)
Maluf, David A.; Tran, Peter B.; La, Tracy; Clancy, Daniel (Technical Monitor)
2002-01-01
An object-relational database management system is an integrated hybrid cooperative approach that combines the best practices of both the relational model utilizing SQL queries and the object-oriented, semantic paradigm for supporting complex data creation. In this paper, a highly scalable, information-on-demand database framework called NETMARK is introduced. NETMARK takes advantage of the Oracle 8i object-relational database, using physical-address data types for very efficient keyword searches of records for both context and content. NETMARK was originally developed in early 2000 as a research and development prototype to cope with the vast amounts of unstructured and semi-structured documents existing within NASA enterprises. Today, NETMARK is a flexible, high-throughput open database framework for managing, storing, and searching unstructured or semi-structured arbitrary hierarchical models, such as XML and HTML.
Storage and retrieval of medical images from data warehouses
NASA Astrophysics Data System (ADS)
Tikekar, Rahul V.; Fotouhi, Farshad A.; Ragan, Don P.
1995-11-01
As our applications continue to become more sophisticated, the demand for more storage continues to rise. Hence many businesses are looking toward data warehousing technology to satisfy their storage needs. A warehouse is different from a conventional database and hence deserves a different approach while storing data that might be retrieved at a later point in time. In this paper we look at the problem of storing and retrieving medical image data from a warehouse. We regard the warehouse as a pyramid with fast storage devices at the top and slower storage devices at the bottom. Our approach is to store the most needed information abstract at the top of the pyramid and more detailed and storage consuming data toward the end of the pyramid. This information is linked for browsing purposes. In a similar fashion, during the retrieval of data, the user is given a sample representation with browse option of the detailed data and, as required, more and more details are made available.
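The pyramid idea in this abstract, with a compact abstract of each image in fast storage linked to detailed data in slower storage, can be illustrated with a minimal in-memory sketch. The class names, the two-tier split, and the `detail/<id>` key format are assumptions made for illustration; the paper's actual storage layout is not described in the abstract.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class ImageAbstract:
    """Small summary kept at the top of the pyramid (hypothetical fields)."""
    image_id: str
    summary: bytes   # e.g. a thumbnail or sample representation
    detail_key: str  # link to the full data held in the slow tier


class PyramidStore:
    """Two-tier sketch: dicts stand in for fast and slow storage devices."""

    def __init__(self) -> None:
        self.fast: Dict[str, ImageAbstract] = {}  # top of the pyramid
        self.slow: Dict[str, bytes] = {}          # bottom of the pyramid

    def put(self, image_id: str, summary: bytes, detail: bytes) -> None:
        """Store the summary in the fast tier and the detail in the slow tier."""
        detail_key = f"detail/{image_id}"
        self.slow[detail_key] = detail
        self.fast[image_id] = ImageAbstract(image_id, summary, detail_key)

    def browse(self, image_id: str) -> bytes:
        """Return only the summary; the slow tier is not touched."""
        return self.fast[image_id].summary

    def fetch_detail(self, image_id: str) -> bytes:
        """Follow the link from the summary to the detailed data."""
        return self.slow[self.fast[image_id].detail_key]
```

Browsing touches only the fast tier, and the slow tier is read only when the user asks for more detail, which matches the retrieval pattern the abstract describes.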
Authomatization of Digital Collection Access Using Mobile and Wireless Data Terminals
NASA Astrophysics Data System (ADS)
Leontiev, I. V.
Information technologies have become vital for information processing, database access, data analysis and decision support. Currently, many scientific projects are oriented toward database integration of heterogeneous systems. Rapid online access to large integrated systems of digital collections is also very important. Users typically move between different locations, at work or at home, and in most cases they need efficient remote access to information stored in integrated data collections. Desktop computers cannot fulfill these needs, so mobile and wireless devices become helpful. Handhelds and data terminals are necessary in medical assistance (they store detailed information about each patient and are helpful for nurses), and immediate access to data collections is used in highway patrol services (databanks of cars, owners, and driver licences). Using mobile access, warehouse operations can be validated. Cycle counting of library and museum items is sped up by online barcode scanning and central database access. That is why mobile devices (cell phones, PDAs, handheld computers with wireless access, WindowsCE and PalmOS terminals) have become popular. Generally, mobile devices have a relatively slow processor and limited display capabilities, but they are effective for storing and displaying textual data, recognize user handwriting with a stylus, and support a GUI. Users can perform operations on a handheld terminal and exchange data with the main system (using immediate radio access, or offline access during the synchronization process) for updates. In our report, we give an approach for mobile access to data collections that raises the efficiency of data processing in a book library and helps to control available books and books in stock, validate service charges, eliminate staff mistakes, and generate requests for book delivery.
Our system uses Symbol RF mobile devices (with radio-channel access) and Symbol Palm Terminal data terminals for batch processing and synchronization with remote library databases. We discuss the use of PalmOS-compatible devices and WindowsCE terminals. Our software system is based on a modular, scalable three-tier architecture. Additional functionality can be easily customized. Scalability is also provided by Internet/Intranet technologies and radio access points. The base module of the system supports generic warehouse operations: cycle counting with handheld barcode scanners, efficient item delivery and issue, item movement, reserving, and report generation on finished and in-process operations. Movements are optimized using the worker's current location; operations are sorted in priority order and transmitted to the workers' mobile and wireless terminals. Mobile terminals improve control of task processing, eliminate staff mistakes, display current information about main processes, provide data for online reports, and significantly raise the efficiency of data exchange.
PseudoBase: a database with RNA pseudoknots.
van Batenburg, F H; Gultyaev, A P; Pleij, C W; Ng, J; Oliehoek, J
2000-01-01
PseudoBase is a database containing structural, functional and sequence data related to RNA pseudoknots. It can be reached at http://wwwbio.LeidenUniv.nl/~Batenburg/PKB.html. This page will direct the user to a retrieval page from where a particular pseudoknot can be chosen, or to a submission page which enables the user to add pseudoknot information to the database, or to an informative page that elaborates on the various aspects of the database. For each pseudoknot, 12 items are stored, e.g. the nucleotides of the region that contains the pseudoknot, the stem positions of the pseudoknot, the EMBL accession number of the sequence that contains this pseudoknot and the support that can be given regarding the reliability of the pseudoknot. Access is via a small number of steps, using 16 different categories. The development process was done by applying the evolutionary methodology for software development rather than by applying the methodology of the classical waterfall model or the more modern spiral model.
Adding Hierarchical Objects to Relational Database General-Purpose XML-Based Information Managements
NASA Technical Reports Server (NTRS)
Lin, Shu-Chun; Knight, Chris; La, Tracy; Maluf, David; Bell, David; Tran, Khai Peter; Gawdiak, Yuri
2006-01-01
NETMARK is a flexible, high-throughput software system for managing, storing, and rapidly searching unstructured and semi-structured documents. NETMARK transforms such documents from their original highly complex, constantly changing, heterogeneous data formats into well-structured, common data formats using Hypertext Markup Language (HTML) and/or Extensible Markup Language (XML). The software implements an object-relational database system that combines the best practices of the relational model utilizing Structured Query Language (SQL) with those of the object-oriented, semantic database model for creating complex data. In particular, NETMARK takes advantage of the Oracle 8i object-relational database model using physical-address data types for very efficient keyword searches of records across both context and content. NETMARK also supports multiple international standards such as WEBDAV for drag-and-drop file management and SOAP for integrated information management using Web services. The document-organization and -searching capabilities afforded by NETMARK are likely to make this software attractive for use in disciplines as diverse as science, auditing, and law enforcement.
Vaxvec: The first web-based recombinant vaccine vector database and its data analysis
Deng, Shunzhou; Martin, Carly; Patil, Rasika; Zhu, Felix; Zhao, Bin; Xiang, Zuoshuang; He, Yongqun
2015-01-01
A recombinant vector vaccine uses an attenuated virus, bacterium, or parasite as the carrier to express a heterologous antigen(s). Many recombinant vaccine vectors and related vaccines have been developed and extensively investigated. To compare and better understand recombinant vectors and vaccines, we have generated Vaxvec (http://www.violinet.org/vaxvec), the first web-based database that stores various recombinant vaccine vectors and those experimentally verified vaccines that use these vectors. Vaxvec now includes 59 vaccine vectors that have been used in 196 recombinant vector vaccines against 66 pathogens and cancers. These vectors are classified into 41 viral vectors, 15 bacterial vectors, 1 parasitic vector, and 1 fungal vector. The most commonly used viral vaccine vectors are double-stranded DNA viruses, including herpesviruses, adenoviruses, and poxviruses. For example, Vaxvec includes 63 poxvirus-based recombinant vaccines for over 20 pathogens and cancers. Vaxvec collects 30 recombinant vector influenza vaccines that use 17 recombinant vectors and were experimentally tested in 7 animal models. In addition, over 60 protective antigens used in recombinant vector vaccines are annotated and analyzed. User-friendly web interfaces are available for querying various data in Vaxvec. To support data exchange, the information on vaccine vectors, vaccines, and related data is stored in the Vaccine Ontology (VO). Vaxvec is a timely and vital vaccine vector database that facilitates efficient vaccine vector research and development. PMID:26403370
Scientific information repository assisting reflectance spectrometry in legal medicine.
Belenki, Liudmila; Sterzik, Vera; Bohnert, Michael; Zimmermann, Klaus; Liehr, Andreas W
2012-06-01
Reflectance spectrometry is a fast and reliable method for the characterization of human skin if the spectra are analyzed with respect to a physical model describing the optical properties of human skin. For a field study performed at the Institute of Legal Medicine and the Freiburg Materials Research Center of the University of Freiburg, a scientific information repository has been developed, which is a variant of an electronic laboratory notebook and assists in the acquisition, management, and high-throughput analysis of reflectance spectra in heterogeneous research environments. At the core of the repository is a database management system hosting the master data. It is filled with primary data via a graphical user interface (GUI) programmed in Java, which also enables the user to browse the database and access the results of data analysis. The latter is carried out via Matlab, Python, and C programs, which retrieve the primary data from the scientific information repository, perform the analysis, and store the results in the database for further usage.
Population and Activity of On-road Vehicles in MOVES2014 ...
This report describes the sources and derivation for on-road vehicle population and activity information and associated adjustments as stored in the MOVES2014 default databases. Motor Vehicle Emission Simulator, the MOVES2014 model, is a set of modeling tools for estimating emissions produced by on-road (cars, trucks, motorcycles, etc.) and nonroad (backhoes, lawnmowers, etc.) mobile sources. The national default activity information in MOVES2014 provides a reasonable basis for estimating national emissions. However, the uncertainties and variability in the default data contribute to the uncertainty in the resulting emission estimates. Properly characterizing emissions from the on-road vehicle subset requires a detailed understanding of the cars and trucks that make up the vehicle fleet and their patterns of operation. The MOVES model calculates emission inventories by multiplying emission rates by the appropriate emission-related activity, applying correction (adjustment) factors as needed to simulate specific situations, and then adding up the emissions from all sources (populations) and regions.
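The inventory calculation summarized in this abstract (rate times activity, adjusted, then summed over sources and regions) can be written as a short sketch. The record fields, units, and numbers below are hypothetical and chosen only for illustration; this is not MOVES code and does not use MOVES data.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class SourceActivity:
    """One source population in one region (hypothetical fields and units)."""
    region: str
    source_type: str
    emission_rate: float     # e.g. grams per unit of activity
    activity: float          # e.g. vehicle miles travelled
    adjustment: float = 1.0  # correction factor for a specific situation


def total_emissions(records: Iterable[SourceActivity]) -> float:
    """Sum rate * activity * adjustment over all sources and regions."""
    return sum(r.emission_rate * r.activity * r.adjustment for r in records)


# Illustrative values only.
example = [
    SourceActivity("region A", "car", emission_rate=2.0, activity=100.0),
    SourceActivity("region B", "truck", emission_rate=0.5, activity=200.0,
                   adjustment=1.2),
]
example_total = total_emissions(example)
```

Keeping the adjustment as a separate multiplicative field makes it easy to rerun the same activity data under different scenario corrections.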
Querying databases of trajectories of differential equations: Data structures for trajectories
NASA Technical Reports Server (NTRS)
Grossman, Robert
1989-01-01
One approach to qualitative reasoning about dynamical systems is to extract qualitative information by searching or making queries on databases containing very large numbers of trajectories. The efficiency of such queries depends crucially upon finding an appropriate data structure for trajectories of dynamical systems. Suppose that a large number of parameterized trajectories $\gamma$ of a dynamical system evolving in $\mathbb{R}^N$ are stored in a database. Let $\eta \subset \mathbb{R}^N$ denote a parameterized path in Euclidean space, and let $\|\cdot\|$ denote a norm on the space of paths. A data structure is defined to represent trajectories of dynamical systems, and an algorithm is sketched which answers queries.
Cros, Annick; Ahamad Fatan, Nurulhuda; White, Alan; Teoh, Shwu Jiau; Tan, Stanley; Handayani, Christian; Huang, Charles; Peterson, Nate; Venegas Li, Ruben; Siry, Hendra Yusran; Fitriana, Ria; Gove, Jamison; Acoba, Tomoko; Knight, Maurice; Acosta, Renerio; Andrew, Neil; Beare, Doug
2014-01-01
In this paper we describe the construction of an online GIS database system, hosted by WorldFish, which stores bio-physical, ecological and socio-economic data for the ‘Coral Triangle Area’ in South-east Asia and the Pacific. The database has been built in partnership with all six (Timor-Leste, Malaysia, Indonesia, The Philippines, Solomon Islands and Papua New Guinea) of the Coral Triangle countries, and represents a valuable source of information for natural resource managers at the regional scale. Its utility is demonstrated using biophysical data, data summarising marine habitats, and data describing the extent of marine protected areas in the region. PMID:24941442
NASA Astrophysics Data System (ADS)
Strotov, Valery V.; Taganov, Alexander I.; Konkin, Yuriy V.; Kolesenkov, Aleksandr N.
2017-10-01
Processing and analysing Earth remote sensing data on board ultra-small spacecraft is a pressing task, given the significant energy cost of data transfer and the low performance of onboard computers. This raises the issue of effective and reliable storage of the general information flow obtained from onboard data-collection systems, including Earth remote sensing data, in a specialized database. The paper considers the peculiarities of database management system operation with a multilevel memory structure. For storing data in the database, a format has been developed that describes the physical structure of the database and contains the parameters required for information loading. Such a structure reduces the memory occupied by the database because key values need not be stored separately. The paper presents the architecture of a relational database management system designed for embedding into the onboard software of ultra-small spacecraft. With such a database management system, a database for storing different information, including Earth remote sensing data, can be developed for subsequent processing. The suggested database management system architecture places low demands on the computing power and memory resources on board the ultra-small spacecraft. Data integrity is ensured during input and modification of structured information.
GIS Methodic and New Database for Magmatic Rocks. Application for Atlantic Oceanic Magmatism.
NASA Astrophysics Data System (ADS)
Asavin, A. M.
2001-12-01
Several geochemical databases are now available on the Internet. One of the main features of the geochemical information stored in those databases is the geographical coordinates of each sample. As a rule, the database software uses this spatial information only for search procedures in the user interface. On the other hand, GIS (Geographical Information System) software, for example ARC/INFO, which is used for creating and analysing special geological, geochemical and geophysical e-maps, works closely with the geographical coordinates of samples. We combine the features of GIS systems and a relational geochemical database by means of special software. Our geochemical information system was created at the Vernadsky Geological State Museum and the Institute of Geochemistry and Analytical Chemistry in Moscow. So far we have tested the system with geochemical data on oceanic rocks from the Atlantic and Pacific oceans, about 10000 chemical analyses. The GIS information content consists of e-map covers of the world globe. For the Atlantic Ocean these include a gravity map (with a 2'' grid), ocean-bottom heat flow, altimetric maps, seismic activity, a tectonic map and a geological map. Combining this information content makes it possible to create new geochemical maps and to combine spatial analysis with numerical geochemical modeling of volcanic processes in an ocean segment. So far we have tested the information system using thick-client technology. The interface between the Arc/View GIS and the database consists of a special sequence of multiple SQL queries. The result of these queries is a simple DBF file with geographical coordinates. This file is used when creating geochemical and other special e-maps of the oceanic region. We used a more complex method for geophysical data: in Arc/View we created a grid cover for polygon spatial geophysical information.
A public HTLV-1 molecular epidemiology database for sequence management and data mining.
Araujo, Thessika Hialla Almeida; Souza-Brito, Leandro Inacio; Libin, Pieter; Deforche, Koen; Edwards, Dustin; de Albuquerque-Junior, Antonio Eduardo; Vandamme, Anne-Mieke; Galvao-Castro, Bernardo; Alcantara, Luiz Carlos Junior
2012-01-01
It is estimated that 15 to 20 million people are infected with the human T-cell lymphotropic virus type 1 (HTLV-1). At present, there are more than 2,000 unique HTLV-1 isolate sequences published. A central database to aggregate sequence information from a range of epidemiological aspects including HTLV-1 infections, pathogenesis, origins, and evolutionary dynamics would be useful to scientists and physicians worldwide. Here we describe a database that collects and annotates sequence data and can be accessed through a user-friendly search interface. The HTLV-1 Molecular Epidemiology Database website is available at http://htlv1db.bahia.fiocruz.br/. All data were obtained from publications available at GenBank or through contact with the authors. The database was developed using Apache Webserver 2.1.6 and the MySQL database management system. The webpage interfaces were developed in HTML, with server-side scripting written in PHP. The HTLV-1 Molecular Epidemiology Database is hosted on the Gonçalo Moniz/FIOCRUZ Research Center server. There are currently 2,457 registered sequences with 2,024 (82.37%) of those sequences representing unique isolates. Of these sequences, 803 (39.67%) contain information about clinical status (TSP/HAM, 17.19%; ATL, 7.41%; asymptomatic, 12.89%; other diseases, 2.17%; and no information, 60.32%). Further, 7.26% of sequences contain information on patient gender while 5.23% of sequences provide the age of the patient. The HTLV-1 Molecular Epidemiology Database retrieves and stores annotated HTLV-1 proviral sequences from clinical, epidemiological, and geographical studies. The collected sequences and related information are now accessible on a publicly available and user-friendly website. This open-access database will support clinical research and vaccine development related to viral genotype.
Design and implementation of a health data interoperability mediator.
Kuo, Mu-Hsing; Kushniruk, Andre William; Borycki, Elizabeth Marie
2010-01-01
The objective of this study is to design and implement a common-gateway oriented mediator to solve the health data interoperability problems that exist among heterogeneous health information systems. The proposed mediator has three main components: (1) a Synonym Dictionary (SD) that stores a set of global metadata and terminologies to serve as the mapping intermediary, (2) a Semantic Mapping Engine (SME) that can be used to map metadata and instance semantics, and (3) a DB-to-XML module that translates source health data stored in a database into XML format and back. A routine admission notification data exchange scenario is used to test the efficiency and feasibility of the proposed mediator. The study results show that the proposed mediator can make health information exchange more efficient.
Combined use of computational chemistry and chemoinformatics methods for chemical discovery
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sugimoto, Manabu, E-mail: sugimoto@kumamoto-u.ac.jp; Institute for Molecular Science, 38 Nishigo-Naka, Myodaiji, Okazaki 444-8585; CREST, Japan Science and Technology Agency, 4-1-8 Honcho, Kawaguchi, Saitama 332-0012
2015-12-31
Data analysis on numerical data by the computational chemistry calculations is carried out to obtain knowledge information of molecules. A molecular database is developed to systematically store chemical, electronic-structure, and knowledge-based information. The database is used to find molecules related to a keyword of “cancer”. Then the electronic-structure calculations are performed to quantitatively evaluate quantum chemical similarity of the molecules. Among the 377 compounds registered in the database, 24 molecules are found to be “cancer”-related. This set of molecules includes both carcinogens and anticancer drugs. The quantum chemical similarity analysis, which is carried out by using numerical results of the density-functional theory calculations, shows that, when some energy spectra are referred to, carcinogens are reasonably distinguished from the anticancer drugs. Therefore these spectral properties are considered important measures for classification.
Remote monitoring of patients with implanted devices: data exchange and integration.
Van der Velde, Enno T; Atsma, Douwe E; Foeken, Hylke; Witteman, Tom A; Hoekstra, Wybo H G J
2013-06-01
Remote follow-up of implantable cardioverter defibrillators (ICDs) may offer a solution to the problem of overcrowded outpatient clinics, and may also be effective in detecting clinical events early. Data obtained from remote follow-up systems, as developed by all major device companies, are stored in a central database system, operated and owned by the device company. A problem now arises that the patient's clinical information is partly stored in the local electronic health record (EHR) system in the hospital, and partly in the remote monitoring database, which may potentially result in patient safety issues. To address the requirement of integrating remote monitoring data in the local EHR, the Integrating the Healthcare Enterprise (IHE) Implantable Device Cardiac Observation (IDCO) profile has been developed. This IHE IDCO profile has been adopted by all major device companies. In our hospital, we have implemented the IHE IDCO profile to import data from the remote databases of two device vendors into the departmental Cardiology Information System (EPD-Vision). Data are exchanged via an HL7/XML communication protocol, as defined in the IHE IDCO profile. By implementing the IHE IDCO profile, we have been able to integrate the data from the remote monitoring databases in our local EHRs. It can be expected that remote monitoring systems will develop into dedicated monitoring and therapy platforms. Data retrieved from these systems should form an integral part of the electronic patient record as more and more outpatient clinic care will shift to personalized care provided at a distance, in other words at the patient's home.
NASA Technical Reports Server (NTRS)
Bertelrud, Arild; Johnson, Sherylene; Anders, J. B. (Technical Monitor)
2002-01-01
A 2-D (two dimensional) high-lift system experiment was conducted in August of 1996 in the Low Turbulence Pressure Tunnel at NASA Langley Research Center, Hampton, VA. The purpose of the experiment was to obtain transition measurements on a three element high-lift system for CFD (computational fluid dynamics) code validation studies. A transition database has been created using the data from this experiment. The present report details how the hot-film data and the related pressure data are organized in the database. Data processing codes to access the data in an efficient and reliable manner are described and limited examples are given on how to access the database and store acquired information.
The liver tissue bank and clinical database in China.
Yang, Yuan; Liu, Yi-Min; Wei, Ming-Yue; Wu, Yi-Fei; Gao, Jun-Hui; Liu, Lei; Zhou, Wei-Ping; Wang, Hong-Yang; Wu, Meng-Chao
2010-12-01
To provide standardized, well-rounded material for hepatology research, the National Liver Tissue Bank (NLTB) Project began in China in 2008 to build a collection of well-characterized, optimally preserved liver tumor tissue and an accompanying clinical database. From Dec 2008 to Jun 2010, over 3000 individuals were enrolled as liver tumor donors to the NLTB, including 2317 cases of newly diagnosed hepatocellular carcinoma (HCC) and about 1000 cases of diagnosed benign or malignant liver tumors. The clinical database and sample store can be managed easily and correctly with the data management platform used. We believe that the high-quality samples, together with the detailed information database, will become the cornerstone of hepatology research, especially in studies exploring the diagnosis and new treatments for HCC and other liver diseases.
Database Dictionary for Ethiopian National Ground-Water DAtabase (ENGDA) Data Fields
Kuniansky, Eve L.; Litke, David W.; Tucci, Patrick
2007-01-01
Introduction This document describes the data fields that are used for both field forms and the Ethiopian National Ground-water Database (ENGDA) tables associated with information stored about production wells, springs, test holes, test wells, and water level or water-quality observation wells. Several different words are used in this database dictionary and in the ENGDA database to describe a narrow shaft constructed in the ground. The most general term is borehole, which is applicable to any type of hole. A well is a borehole specifically constructed to extract water from the ground; however, for this data dictionary and for the ENGDA database, the words well and borehole are used interchangeably. A production well is defined as any well used for water supply and includes hand-dug wells, small-diameter bored wells equipped with hand pumps, or large-diameter bored wells equipped with large-capacity motorized pumps. Test holes are borings made to collect information about the subsurface with continuous core or non-continuous core and/or where geophysical logs are collected. Test holes are not converted into wells. A test well is a well constructed for hydraulic testing of an aquifer in order to plan a larger ground-water production system. A water-level or water-quality observation well is a well that is used to collect information about an aquifer and not used for water supply. A spring is any naturally flowing, local, ground-water discharge site. The database dictionary is designed to help define all fields on both field data collection forms (provided in attachment 2 of this report) and for the ENGDA software screen entry forms (described in Litke, 2007). The data entered into each screen entry field are stored in relational database tables within the computer database. The organization of the database dictionary is designed based on field data collection and the field forms, because this is what the majority of people will use. 
After each field, however, the ENGDA database field name and relational database table is designated; along with the ENGDA screen entry form(s) and the ENGDA field form (attachment 2). The database dictionary is separated into sections. The first section, Basic Site Data Fields, describes the basic site information that is similar for all of the different types of sites. The remaining sections may be applicable for only one type of site; for example, the Well Drilling and Construction Data Fields and Lithologic Description Data Fields are applicable to boreholes and not to springs. Attachment 1 contains a table for conversion from English to metric units. Attachment 2 contains selected field forms used in conjunction with ENGDA. A separate document, 'Users Reference Manual for the Ethiopian National Ground-Water DAtabase (ENGDA),' by David W. Litke was developed as a users guide for the computer database and screen entry. This database dictionary serves as a reference for both the field forms and the computer database. Every effort has been made to have identical field names between the field forms and the screen entry forms in order to avoid confusion.
An image database management system for conducting CAD research
NASA Astrophysics Data System (ADS)
Gruszauskas, Nicholas; Drukker, Karen; Giger, Maryellen L.
2007-03-01
The development of image databases for CAD research is not a trivial task. The collection and management of images and their related metadata from multiple sources is a time-consuming but necessary process. By standardizing and centralizing the methods in which these data are maintained, one can generate subsets of a larger database that match the specific criteria needed for a particular research project in a quick and efficient manner. A research-oriented management system of this type is highly desirable in a multi-modality CAD research environment. An online, web-based database system for the storage and management of research-specific medical image metadata was designed for use with four modalities of breast imaging: screen-film mammography, full-field digital mammography, breast ultrasound and breast MRI. The system was designed to consolidate data from multiple clinical sources and provide the user with the ability to anonymize the data. Input concerning the type of data to be stored as well as desired searchable parameters was solicited from researchers in each modality. The backbone of the database was created using MySQL. A robust and easy-to-use interface for entering, removing, modifying and searching information in the database was created using HTML and PHP. This standardized system can be accessed using any modern web-browsing software and is fundamental for our various research projects on computer-aided detection, diagnosis, cancer risk assessment, multimodality lesion assessment, and prognosis. Our CAD database system stores large amounts of research-related metadata and successfully generates subsets of cases that match the user's desired search criteria.
GEOMAGIA50: An archeointensity database with PHP and MySQL
NASA Astrophysics Data System (ADS)
Korhonen, K.; Donadini, F.; Riisager, P.; Pesonen, L. J.
2008-04-01
The GEOMAGIA50 database stores 3798 archeomagnetic and paleomagnetic intensity determinations dated to the past 50,000 years. It also stores details of the measurement setup for each determination, which are used for ranking the data according to prescribed reliability criteria. The ranking system aims to alleviate the data reliability problem inherent in this kind of data. GEOMAGIA50 is based on two popular open source technologies. The MySQL database management system is used for storing the data, whereas the functionality and user interface are provided by server-side PHP scripts. This technical brief gives a detailed description of GEOMAGIA50 from a technical viewpoint.
The Hazard Notification System (HANS)
NASA Astrophysics Data System (ADS)
Snedigar, S. F.; Venezky, D. Y.
2009-12-01
The Volcano Hazards Program (VHP) has developed a Hazard Notification System (HANS) for distributing volcanic activity information collected by scientists to airlines, emergency services, and the general public. In the past year, data from HANS have been used by airlines to make decisions about diverting or canceling flights during the eruption of Mount Redoubt. HANS was developed to provide a single system that each of the five U.S. volcano observatories could use for communicating and storing volcanic information about the 160+ potentially active U.S. volcanoes. The data, which cover ten tables and nearly 100 fields, are now stored in similar formats, and the information can be released in styles requested by our agency partners, such as the International Civil Aviation Organization (ICAO). Currently, HANS has about 4500 reports stored; on average, two to three reports are added daily. HANS (in its most basic form) consists of a user interface for entering data into one of many release types (Daily Status Reports, Weekly Updates, Volcano Activity Notifications, etc.); a database holding previous releases as well as observatory information such as email address lists and volcano boilerplates; and a transmission system for formatting releases and sending them out by email or other web-related system. The user interface to HANS is completely web based, providing access to our observatory scientists from any online PC. The underlying database stores the observatory information and drives the observatory and program websites' dynamic updates and archived information releases. HANS also runs scripts for generating several different feeds including the program home page Volcano Status Map. Each observatory has the capability of running an instance of HANS. There are currently three instances of HANS and each instance is synchronized to all other instances using a master-slave environment. 
Information can be entered on any node; slave nodes transmit data to the master node, and the master retransmits that data to all slave nodes. All data transfer between instances uses the Simple Object Access Protocol (SOAP) as the envelope in which data are transmitted between nodes. The HANS data synchronization not only works as a backup feature, but also acts as a simple fault-tolerant system. Information from any observatory can be entered on any instance, and still be transmitted to the specified observatory's distribution list, which provides added flexibility if there is a disruption in access from an area that needs to send an update. Additionally, having the same information available on our multiple websites is necessary for communicating our scientists' most up-to-date information.
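The synchronization described above (entry on any node, slaves forwarding to the master, the master retransmitting to every slave) can be sketched as follows. This is only an illustration under stated assumptions: the `Node` class, its method names and the node names are hypothetical, plain in-memory dictionaries stand in for the HANS database, and the SOAP transport is omitted entirely.

```python
class Node:
    """Hypothetical HANS-like instance holding reports in memory."""

    def __init__(self, name):
        self.name = name
        self.reports = {}    # report_id -> report text
        self.master = None   # None means this node is the master
        self.slaves = []

    def attach(self, slave):
        """Register a slave node under this (master) node."""
        slave.master = self
        self.slaves.append(slave)

    def enter(self, report_id, text):
        """A user enters a report on this node."""
        self.reports[report_id] = text
        if self.master is None:
            self._retransmit(report_id, text)
        else:
            # Slave nodes transmit new data to the master node.
            self.master._receive(report_id, text)

    def _receive(self, report_id, text):
        self.reports[report_id] = text
        self._retransmit(report_id, text)

    def _retransmit(self, report_id, text):
        # The master retransmits to all slaves; rewriting an existing
        # report with the same content is harmless.
        for slave in self.slaves:
            slave.reports[report_id] = text


master = Node("master")
node_a, node_b = Node("node_a"), Node("node_b")
master.attach(node_a)
master.attach(node_b)
node_a.enter("r1", "Daily Status Report")
```

After the call to `enter`, all three nodes hold report `r1`, which is the property that lets any instance act as a backup for the others.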
Construction of In-house Databases in a Corporation
NASA Astrophysics Data System (ADS)
Kato, Toshio
Osaka Gas Co., Ltd. constructed the Osaka Gas Technical Information System (OGTIS) in 1979. It stores and retrieves in-house technical information and can even supply primary materials by combining optical disk files, a facsimile system and so on. The major information sources are technical materials, survey materials, planning documents, design materials, research reports and business trip reports, all generated inside the company. At present it holds 25,000 items in total, with 1,000 items added annually. The data file is updated once a month, and the abstract journal OGTIS Report is also issued monthly. In 1983 the company constructed the System for International Exchange of Personal Information (SIP) as a subsystem of OGTIS in order to compile the SIP database, which covers outlines of exchanges with overseas enterprises or organizations. It holds 2,600 items in total, with about 500 added annually and monthly data updates.
PlantDB – a versatile database for managing plant research
Exner, Vivien; Hirsch-Hoffmann, Matthias; Gruissem, Wilhelm; Hennig, Lars
2008-01-01
Background Research in plant science laboratories often involves usage of many different species, cultivars, ecotypes, mutants, alleles or transgenic lines. This creates a great challenge to keep track of the identity of experimental plants and stored samples or seeds. Results Here, we describe PlantDB – a Microsoft® Office Access database – with a user-friendly front-end for managing information relevant for experimental plants. PlantDB can hold information about plants of different species, cultivars or genetic composition. Introduction of a concise identifier system allows easy generation of pedigree trees. In addition, all information about any experimental plant – from growth conditions and dates over extracted samples such as RNA to files containing images of the plants – can be linked unequivocally. Conclusion We have been using PlantDB for several years in our laboratory and found that it greatly facilitates access to relevant information. PMID:18182106
[Construction of chemical information database based on optical structure recognition technique].
Lv, C Y; Li, M N; Zhang, L R; Liu, Z M
2018-04-18
To create a protocol that could be used to construct a chemical information database from scientific literature quickly and automatically. Scientific literature, patents and technical reports from different chemical disciplines were collected and stored in PDF format as fundamental datasets. Chemical structures were transformed from published documents and images to machine-readable data by using name conversion technology and the optical structure recognition tool CLiDE. In the process of molecular structure information extraction, Markush structures were enumerated into well-defined monomer molecules by means of QueryTools in the molecule editor ChemDraw. The document management software EndNote X8 was applied to acquire bibliographical references including title, author, journal and year of publication. The text mining toolkit ChemDataExtractor was adopted to retrieve information that could be used to populate a structured chemical database from figures, tables, and textual paragraphs. After this step, detailed manual revision and annotation were conducted in order to ensure the accuracy and completeness of the data. In addition to the literature data, the computing simulation platform Pipeline Pilot 7.5 was utilized to calculate physical and chemical properties and predict molecular attributes. Furthermore, the open database ChEMBL was linked to fetch known bioactivities, such as indications and targets. After information extraction and data expansion, five separate metadata files were generated: molecular structure data file, molecular information, bibliographical references, predictable attributes and known bioactivities. With the canonical simplified molecular input line entry specification (SMILES) as primary key, the metadata files were associated through common key nodes, including molecular number and PDF number, to construct an integrated chemical information database. A reasonable construction protocol for a chemical information database was created successfully. 
A total of 174 research articles and 25 reviews published in Marine Drugs from January 2015 to June 2016 were collected as the essential data source, and an elementary marine natural product database named PKU-MNPD was built in accordance with this protocol, which contained 3,262 molecules and 19,821 records. This data aggregation protocol is of great help for chemical information database construction in terms of accuracy, comprehensiveness and efficiency based on original documents. The structured chemical information database can facilitate access to medical intelligence and accelerate the transformation of scientific research achievements.
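The way the metadata files are tied together through canonical SMILES, molecular number and PDF number can be pictured with a small relational sketch. The table and column names below are assumptions made for illustration (the abstract does not give the actual schema), only three of the five files are represented, and the single inserted row is a placeholder, not data from PKU-MNPD.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE molecule (
    smiles TEXT PRIMARY KEY,          -- canonical SMILES as primary key
    mol_no INTEGER UNIQUE NOT NULL    -- molecular number (common key node)
);
CREATE TABLE reference (
    pdf_no  INTEGER PRIMARY KEY,      -- PDF number (common key node)
    title   TEXT,
    journal TEXT,
    year    INTEGER
);
CREATE TABLE record (                 -- links a molecule to its source document
    mol_no INTEGER REFERENCES molecule(mol_no),
    pdf_no INTEGER REFERENCES reference(pdf_no)
);
""")

# Placeholder rows for illustration only.
conn.execute("INSERT INTO molecule VALUES ('CCO', 1)")
conn.execute("INSERT INTO reference VALUES (1, 'Placeholder title', 'Marine Drugs', 2015)")
conn.execute("INSERT INTO record VALUES (1, 1)")


def references_for(smiles):
    """Return (title, year) of every document that reports the given molecule."""
    rows = conn.execute(
        """
        SELECT r.title, r.year
        FROM molecule m
        JOIN record x ON x.mol_no = m.mol_no
        JOIN reference r ON r.pdf_no = x.pdf_no
        WHERE m.smiles = ?
        """,
        (smiles,),
    )
    return rows.fetchall()
```

Calling `references_for('CCO')` walks from the SMILES key through the molecular number and PDF number to the bibliographic entry; predicted attributes and bioactivities could be joined on `mol_no` in the same way.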
Research Directions in Database Security IV
1993-07-01
second algorithm, which is based on multiversion timestamp ordering, is that high level transactions can be forced to read arbitrarily old data values...system. The first, the single version model, stores only the latest version of each data item, while the second, the multiversion model, stores... Multiversion Database Model In the standard database model, where there is only one version of each data item, all transactions compete for the most recent
[Design of computerised database for clinical and basic management of uveal melanoma].
Bande Rodríguez, M F; Santiago Varela, M; Blanco Teijeiro, M J; Mera Yañez, P; Pardo Perez, M; Capeans Tome, C; Piñeiro Ces, A
2012-09-01
The uveal melanoma is the most common primary intraocular tumour in adults. The objective of this work is to show how a computerised database has been formed with specific applications, for clinical and research use, to an extensive group of patients diagnosed with uveal melanoma. For the design of the database a selection of categories, attributes and values was created based on the classifications and parameters given by various authors of articles which have had great relevance in the field of uveal melanoma in recent years. The database has over 250 patient entries with specific information on their clinical history, diagnosis, treatment and progress. It enables us to search any parameter of the entry and make quick and simple statistical studies of them. The database models have been transformed into a basic tool for clinical practice, as they are an efficient way of storing, compiling and selective searching of information. When creating a database it is very important to define a common strategy and the use of a standard language. Copyright © 2011 Sociedad Española de Oftalmología. Published by Elsevier Espana. All rights reserved.
HepSEQ: International Public Health Repository for Hepatitis B
Gnaneshan, Saravanamuttu; Ijaz, Samreen; Moran, Joanne; Ramsay, Mary; Green, Jonathan
2007-01-01
HepSEQ is a repository for an extensive library of public health and molecular data relating to hepatitis B virus (HBV) infection collected from international sources. It is hosted by the Centre for Infections, Health Protection Agency (HPA), England, United Kingdom. This repository has been developed as a web-enabled, quality-controlled database to act as a tool for surveillance, HBV case management and for research. The web front-end for the database system can be accessed from . The format of the database system allows for comprehensive molecular, clinical and epidemiological data to be deposited into a functional database, to search and manipulate the stored data and to extract and visualize the information on epidemiological, virological, clinical, nucleotide sequence and mutational aspects of HBV infection through web front-end. Specific tools, built into the database, can be utilized to analyse deposited data and provide information on HBV genotype, identify mutations with known clinical significance (e.g. vaccine escape, precore and antiviral-resistant mutations) and carry out sequence homology searches against other deposited strains. Further mechanisms are also in place to allow specific tailored searches of the database to be undertaken. PMID:17130143
Shuttle-Data-Tape XML Translator
NASA Technical Reports Server (NTRS)
Barry, Matthew R.; Osborne, Richard N.
2005-01-01
JSDTImport is a computer program for translating native Shuttle Data Tape (SDT) files from American Standard Code for Information Interchange (ASCII) format into databases in other formats. JSDTImport solves the problem of organizing the SDT content, affording flexibility to enable users to choose how to store the information in a database to better support client and server applications. JSDTImport can be dynamically configured by use of a simple Extensible Markup Language (XML) file. JSDTImport uses this XML file to define how each record and field will be parsed, its layout and definition, and how the resulting database will be structured. JSDTImport also includes a client application programming interface (API) layer that provides abstraction for the data-querying process. The API enables a user to specify the search criteria to apply in gathering all the data relevant to a query. The API can be used to organize the SDT content and translate into a native XML database. The XML format is structured into efficient sections, enabling excellent query performance by use of the XPath query language. Optionally, the content can be translated into a Structured Query Language (SQL) database for fast, reliable SQL queries on standard database server computers.
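A configuration-driven importer of this kind can be sketched as follows. Everything specific here is assumed for illustration: the layout XML vocabulary (`layout`, `record`, `field`, `start`, `length`), the `SAMPLE` record type, the fixed-width record line and the function names are hypothetical and are not the actual JSDTImport configuration or SDT format. The sketch reads a layout from XML, parses one ASCII record with it, emits XML, and queries the result with the XPath subset supported by Python's standard library.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout file: one record type with three fixed-width fields.
LAYOUT_XML = """
<layout>
  <record type="SAMPLE">
    <field name="id" start="0" length="8"/>
    <field name="units" start="8" length="4"/>
    <field name="description" start="12" length="20"/>
  </record>
</layout>
"""


def load_layout(xml_text):
    """Map each record type to a list of (field name, start, length)."""
    root = ET.fromstring(xml_text)
    return {
        rec.get("type"): [
            (f.get("name"), int(f.get("start")), int(f.get("length")))
            for f in rec.findall("field")
        ]
        for rec in root.findall("record")
    }


def parse_line(line, fields):
    """Slice one fixed-width ASCII record into named fields."""
    return {name: line[start:start + length].strip()
            for name, start, length in fields}


def to_xml(record_type, rows):
    """Build an XML tree with one <record> element per parsed row."""
    root = ET.Element("records")
    for row in rows:
        el = ET.SubElement(root, "record", type=record_type)
        for name, value in row.items():
            ET.SubElement(el, name).text = value
    return root


layout = load_layout(LAYOUT_XML)
row = parse_line("V01X0001PSIATank pressure", layout["SAMPLE"])
tree = to_xml("SAMPLE", [row])
matches = tree.findall("./record[units='PSIA']")  # XPath-style query
```

Changing the field offsets or adding a record type only requires editing the layout XML, which is the flexibility the abstract attributes to the XML configuration file.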
SinEx DB: a database for single exon coding sequences in mammalian genomes.
Jorquera, Roddy; Ortiz, Rodrigo; Ossandon, F; Cárdenas, Juan Pablo; Sepúlveda, Rene; González, Carolina; Holmes, David S
2016-01-01
Eukaryotic genes are typically interrupted by intragenic, noncoding sequences termed introns. However, some genes lack introns in their coding sequence (CDS) and are generally known as 'single exon genes' (SEGs). In this work, a SEG is defined as a nuclear, protein-coding gene that lacks introns in its CDS. Whereas many public databases of eukaryotic multi-exon genes are available, there are only two specialized databases for SEGs. The present work addresses the need for a more extensive and diverse database by creating SinEx DB, a publicly available, searchable database of predicted SEGs from 10 completely sequenced mammalian genomes including human. SinEx DB houses the DNA and protein sequence information of these SEGs and includes their functional predictions (KOG) and the relative distribution of these functions within species. The information is stored in a relational database built with MySQL Server 5.1.33, and the complete dataset of SEG sequences and their functional predictions is available for downloading. SinEx DB can be interrogated by: (i) a browsable phylogenetic schema, (ii) carrying out BLAST searches against the in-house SinEx DB of SEGs and (iii) an advanced search mode in which the database can be searched by key words and any combination of searches by species and predicted functions. SinEx DB provides a rich source of information for advancing our understanding of the evolution and function of SEGs. Database URL: www.sinex.cl. © The Author(s) 2016. Published by Oxford University Press.
The ATLAS conditions database architecture for the Muon spectrometer
NASA Astrophysics Data System (ADS)
Verducci, Monica; ATLAS Muon Collaboration
2010-04-01
The Muon System, facing demanding requirements for conditions data storage, has started to make extensive use of the conditions database project 'COOL' as the basis for all its conditions data storage, both at CERN and throughout the worldwide collaboration, as decided by the ATLAS Collaboration. The management of the Muon COOL conditions database will be one of the most challenging applications for the Muon System, in terms of data volumes and rates as well as the variety of data stored. The Muon conditions database is responsible for storing almost all of the 'non event' data and detector quality flags needed for debugging detector operations and for performing reconstruction and analysis. The COOL database allows database applications to be written independently of the underlying database technology and ensures long-term compatibility with the entire ATLAS software. COOL implements an interval-of-validity database, i.e. objects stored or referenced in COOL have an associated start and end time between which they are valid. The data are stored in folders, which are themselves arranged in a hierarchical structure of folder sets. The structure is simple and mainly optimized to store and retrieve object(s) associated with a particular time. In this work, an overview of the entire Muon conditions database architecture is given, including the different sources of the data and the storage model used. In addition, the software interfaces used to access the conditions data are described, with emphasis on the offline reconstruction framework ATHENA and the services developed to provide the conditions data to the reconstruction.
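The interval-of-validity idea can be illustrated with a minimal sketch. This is not the COOL API: the `Folder` class, its methods, the folder path and the payload contents are all hypothetical, intervals are assumed to be half-open and non-overlapping, and a plain dictionary keyed by path stands in for the folder-set hierarchy.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class Folder:
    """Hypothetical conditions folder: payloads keyed by validity interval."""
    _since: list = field(default_factory=list)    # sorted interval start times
    _until: list = field(default_factory=list)    # matching end times (exclusive)
    _payload: list = field(default_factory=list)

    def store(self, since, until, payload):
        """Store a payload valid for since <= t < until (no overlap check)."""
        if since >= until:
            raise ValueError("empty validity interval")
        i = bisect_right(self._since, since)
        self._since.insert(i, since)
        self._until.insert(i, until)
        self._payload.insert(i, payload)

    def find(self, t):
        """Return the payload valid at time t, or None if none covers t."""
        i = bisect_right(self._since, t) - 1
        if i >= 0 and t < self._until[i]:
            return self._payload[i]
        return None


# Hypothetical folder path and payloads, for illustration only.
folders = {"/MUON/ALIGN": Folder()}
folders["/MUON/ALIGN"].store(0, 100, {"shift_mm": 0.10})
folders["/MUON/ALIGN"].store(100, 250, {"shift_mm": 0.12})
```

Keeping the start times sorted makes retrieval by time a binary search, which matches the stated design goal of storing and retrieving the object associated with a particular time.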
A Web-Based Information System for Field Data Management
NASA Astrophysics Data System (ADS)
Weng, Y. H.; Sun, F. S.
2014-12-01
A web-based field data management system has been designed and developed to allow field geologists to store, organize, manage, and share field data online. System requirements were analyzed and clearly defined first regarding what data are to be stored, who the potential users are, and what system functions are needed in order to deliver the right data in the right way to the right user. A 3-tiered architecture was adopted to create this secure, scalable system, which consists of a web browser at the front end, a database at the back end, and a functional logic server in the middle. Specifically, HTML, CSS, and JavaScript were used to implement the user interface in the front-end tier, the Apache web server runs PHP scripts in the middle tier, and a MySQL server is used for the back-end database. The system accepts various types of field information, including image, audio, video, numeric, and text. It allows users to select data and display them on either Google Earth or Google Maps for the examination of spatial relations. It also makes the sharing of field data easy by converting them into XML format that is both human-readable and machine-readable, and thus ready for reuse.
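As a rough illustration of the XML export step, the following Python sketch serializes one field record. The described system uses PHP, so Python is only a stand-in here; the `record_to_xml` function, the `fieldRecord` element name, and the sample fields are all hypothetical, and the sketch assumes every record key is a valid XML element name.

```python
import xml.etree.ElementTree as ET


def record_to_xml(record):
    """Serialize a flat dict of field observations to an XML string."""
    root = ET.Element("fieldRecord")  # element name is an assumption
    for key, value in record.items():
        child = ET.SubElement(root, key)
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")


# Sample values are invented for illustration only.
record = {"site": "Outcrop A", "latitude": 35.1, "rockType": "sandstone"}
xml_text = record_to_xml(record)
```

The resulting string can be parsed back with `ET.fromstring`, which is what makes the format machine-readable as well as human-readable.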
An Ontology-Based GIS for Genomic Data Management of Rumen Microbes
Jelokhani-Niaraki, Saber; Minuchehr, Zarrin; Nassiri, Mohammad Reza
2015-01-01
During recent years, there has been exponential growth in biological information. With the emergence of large datasets in biology, life scientists are encountering bottlenecks in handling the biological data. This study presents an integrated geographic information system (GIS)-ontology application for handling microbial genome data. The application uses a linear referencing technique as one of the GIS functionalities to represent genes as linear events on the genome layer, where users can define/change the attributes of genes in an event table and interactively see the gene events on a genome layer. Our application adopted ontology to portray and store genomic data in a semantic framework, which facilitates data-sharing among biology domains, applications, and experts. The application was developed in two steps. In the first step, the genome annotated data were prepared and stored in a MySQL database. The second step involved the connection of the database to both ArcGIS and Protégé as the GIS engine and ontology platform, respectively. We have designed this application specifically to manage the genome-annotated data of rumen microbial populations. Such a GIS-ontology application offers powerful capabilities for visualizing, managing, reusing, sharing, and querying genome-related data. PMID:25873847
An Ontology-Based GIS for Genomic Data Management of Rumen Microbes.
Jelokhani-Niaraki, Saber; Tahmoorespur, Mojtaba; Minuchehr, Zarrin; Nassiri, Mohammad Reza
2015-03-01
During recent years, there has been exponential growth in biological information. With the emergence of large datasets in biology, life scientists are encountering bottlenecks in handling the biological data. This study presents an integrated geographic information system (GIS)-ontology application for handling microbial genome data. The application uses a linear referencing technique as one of the GIS functionalities to represent genes as linear events on the genome layer, where users can define/change the attributes of genes in an event table and interactively see the gene events on a genome layer. Our application adopted ontology to portray and store genomic data in a semantic framework, which facilitates data-sharing among biology domains, applications, and experts. The application was developed in two steps. In the first step, the genome annotated data were prepared and stored in a MySQL database. The second step involved the connection of the database to both ArcGIS and Protégé as the GIS engine and ontology platform, respectively. We have designed this application specifically to manage the genome-annotated data of rumen microbial populations. Such a GIS-ontology application offers powerful capabilities for visualizing, managing, reusing, sharing, and querying genome-related data.
Ben Said, Mohamed; Robel, Laurence; Messiaen, Claude; Craus, Yann; Jais, Jean Philippe; Golse, Bernard; Landais, Paul
2014-01-01
Patients' explicit and unambiguous information, patients' consent and privacy protection are reviewed in this article, in the context of the deployment of the TEDIS information system dedicated to autism spectrum disorders. The role of the Delegate to the Protection of Data is essential at this stage. We developed a privacy protection scheme based on storing encrypted patients' personal data on the server database and decrypting it in the Web browser. It tries to respond to the end-users' request to manage nominative data in a human-readable form and to comply with the privacy protection framework.
NASA Astrophysics Data System (ADS)
Gentry, Jeffery D.
2000-05-01
A relational database is a powerful tool for collecting and analyzing the vast amounts of interrelated data associated with the manufacture of composite materials. A relational database contains many individual database tables that store data that are related in some fashion. Manufacturing process variables as well as quality assurance measurements can be collected and stored in database tables indexed according to lot numbers, part type or individual serial numbers. Relationships between manufacturing process and product quality can then be correlated over a wide range of product types and process variations. This paper presents details on how relational databases are used to collect, store, and analyze process variables and quality assurance data associated with the manufacture of advanced composite materials. Important considerations are covered including how the various types of data are organized and how relationships between the data are defined. Employing relational database techniques to establish correlative relationships between process variables and quality assurance measurements is then explored. Finally, the benefits of database techniques such as data warehousing, data mining and web based client/server architectures are discussed in the context of composite material manufacturing.
Data-Base Software For Tracking Technological Developments
NASA Technical Reports Server (NTRS)
Aliberti, James A.; Wright, Simon; Monteith, Steve K.
1996-01-01
Technology Tracking System (TechTracS) computer program developed for use in storing and retrieving information on technology and related patent information developed under auspices of NASA Headquarters and NASA's field centers. Contents of data base include multiple scanned still images and quick-time movies as well as text. TechTracS includes word-processing, report-editing, chart-and-graph-editing, and search-editing subprograms. Extensive keyword searching capabilities enable rapid location of technologies, innovators, and companies. System performs routine functions automatically and serves multiple users.
1993-07-09
Calculate Oil and solve iteratively equation (18) for q and (l)-(S) forex . 4, Solve the velocity problem through equation (19) to calculate q and (6)-(10) to...object-oriented models for the database to store the system information f1l. Using OOP on the formalism level is more difficult and a current field of...Multidimensional Physical Systems: Graph-theoretic Modeling, Systems and Cybernetics, vol 21 (1992), 5 .9-71 JV A RELATIONAL DATABASE FOR GENERAL
Mancardi, G L; Uccelli, M M; Sonnati, M; Comi, G; Milanese, C; De Vincentiis, A; Battaglia, M A
2000-04-01
The SMile Card was developed as a means for computerising clinical information for the purpose of transferability, accessibility, standardisation and compilation of a national database of demographic and clinical information about multiple sclerosis (MS) patients. In many European countries, centres for MS are organised independently from one another making collaboration, consultation and patient referral complicated. Only the more highly advanced clinical centres, generally located in large urban areas, have had the possibility to utilise technical possibilities for improving the organisation of patient clinical and research information, although independently from other centres. The information system, developed utilising the Visual Basic language for Microsoft Windows 95, stores information via a 'smart card' in a database which is initiated and updated utilising a microprocessor, located at each neurological clinic. The SMile Card, currently being tested in Italy, permits patients to carry with them all relevant medical information without limitations. Neurologists are able to access and update, via the microprocessor, the patient's entire medical history and MS-related information, including the complete neurological examination and laboratory test results. The SMile Card provides MS patients and neurologists with a complete computerised archive of clinical information which is accessible throughout the country. In addition, data from the SMile Card system can be exported to other database programs.
Application Analysis and Decision with Dynamic Analysis
2014-12-01
pushes the application file and the JSON file containing the metadata from the database. When the 2 files are in place, the consumer thread starts...human analysts and stores it in a database. It would then use some of these data to generate a risk score for the application. However, static analysis...and store them in the primary A2D database for future analysis. 15. SUBJECT TERMS Android, dynamic analysis 16. SECURITY CLASSIFICATION OF: 17
Sánchez-de-Madariaga, Ricardo; Muñoz, Adolfo; Lozano-Rubí, Raimundo; Serrano-Balazote, Pablo; Castro, Antonio L; Moreno, Oscar; Pascual, Mario
2017-08-18
The objective of this research is to compare the relational and non-relational (NoSQL) database systems approaches in order to store, recover, query and persist standardized medical information in the form of ISO/EN 13606 normalized Electronic Health Record XML extracts, both in isolation and concurrently. NoSQL database systems have recently attracted much attention, but few studies in the literature address their direct comparison with relational databases when applied to build the persistence layer of a standardized medical information system. One relational and two NoSQL databases (one document-based and one native XML database) of three different sizes have been created in order to evaluate and compare the response times (algorithmic complexity) of six different complexity growing queries, which have been performed on them. Similar appropriate results available in the literature have also been considered. Relational and non-relational NoSQL database systems show almost linear algorithmic complexity query execution. However, they show very different linear slopes, the former being much steeper than the two latter. Document-based NoSQL databases perform better in concurrency than in isolation, and also better than relational databases in concurrency. Non-relational NoSQL databases seem to be more appropriate than standard relational SQL databases when database size is extremely high (secondary use, research applications). Document-based NoSQL databases perform in general better than native XML NoSQL databases. EHR extracts visualization and edition are also document-based tasks more appropriate to NoSQL database systems. However, the appropriate database solution much depends on each particular situation and specific problem.
Self-aligning and compressed autosophy video databases
NASA Astrophysics Data System (ADS)
Holtz, Klaus E.
1993-04-01
Autosophy, an emerging new science, explains `self-assembling structures,' such as crystals or living trees, in mathematical terms. This research provides a new mathematical theory of `learning' and a new `information theory' which permits the growing of self-assembling data network in a computer memory similar to the growing of `data crystals' or `data trees' without data processing or programming. Autosophy databases are educated very much like a human child to organize their own internal data storage. Input patterns, such as written questions or images, are converted to points in a mathematical omni dimensional hyperspace. The input patterns are then associated with output patterns, such as written answers or images. Omni dimensional information storage will result in enormous data compression because each pattern fragment is only stored once. Pattern recognition in the text or image files is greatly simplified by the peculiar omni dimensional storage method. Video databases will absorb input images from a TV camera and associate them with textual information. The `black box' operations are totally self-aligning where the input data will determine their own hyperspace storage locations. Self-aligning autosophy databases may lead to a new generation of brain-like devices.
VaProS: a database-integration approach for protein/genome information retrieval.
Gojobori, Takashi; Ikeo, Kazuho; Katayama, Yukie; Kawabata, Takeshi; Kinjo, Akira R; Kinoshita, Kengo; Kwon, Yeondae; Migita, Ohsuke; Mizutani, Hisashi; Muraoka, Masafumi; Nagata, Koji; Omori, Satoshi; Sugawara, Hideaki; Yamada, Daichi; Yura, Kei
2016-12-01
Life science research now heavily relies on all sorts of databases for genome sequences, transcription, protein three-dimensional (3D) structures, protein-protein interactions, phenotypes and so forth. The knowledge accumulated by all the omics research is so vast that a computer-aided search of data is now a prerequisite for starting a new study. In addition, a combinatory search throughout these databases has a chance to extract new ideas and new hypotheses that can be examined by wet-lab experiments. By virtually integrating the related databases on the Internet, we have built a new web application that helps life science researchers retrieve experts' knowledge stored in the databases and build new hypotheses about the research target. This web application, named VaProS, puts stress on the interconnection between the functional information of genome sequences and protein 3D structures, such as the structural effect of a gene mutation. In this manuscript, we present the notion of VaProS, the databases and tools that can be accessed without any knowledge of database locations and data formats, and the power of search exemplified in the quest for the molecular mechanisms of lysosomal storage disease. VaProS can be freely accessed at http://p4d-info.nig.ac.jp/vapros/.
NCBI GEO: archive for functional genomics data sets--10 years on.
Barrett, Tanya; Troup, Dennis B; Wilhite, Stephen E; Ledoux, Pierre; Evangelista, Carlos; Kim, Irene F; Tomashevsky, Maxim; Marshall, Kimberly A; Phillippy, Katherine H; Sherman, Patti M; Muertter, Rolf N; Holko, Michelle; Ayanbule, Oluwabukunmi; Yefanov, Andrey; Soboleva, Alexandra
2011-01-01
A decade ago, the Gene Expression Omnibus (GEO) database was established at the National Center for Biotechnology Information (NCBI). The original objective of GEO was to serve as a public repository for high-throughput gene expression data generated mostly by microarray technology. However, the research community quickly applied microarrays to non-gene-expression studies, including examination of genome copy number variation and genome-wide profiling of DNA-binding proteins. Because the GEO database was designed with a flexible structure, it was possible to quickly adapt the repository to store these data types. More recently, as the microarray community switches to next-generation sequencing technologies, GEO has again adapted to host these data sets. Today, GEO stores over 20,000 microarray- and sequence-based functional genomics studies, and continues to handle the majority of direct high-throughput data submissions from the research community. Multiple mechanisms are provided to help users effectively search, browse, download and visualize the data at the level of individual genes or entire studies. This paper describes recent database enhancements, including new search and data representation tools, as well as a brief review of how the community uses GEO data. GEO is freely accessible at http://www.ncbi.nlm.nih.gov/geo/.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Naumann, Axel; /CERN; Canal, Philippe
2008-01-01
High performance computing with a large code base and C++ has proved to be a good combination. But when it comes to storing data, C++ is a problematic choice: it offers no support for serialization, type definitions are amazingly complex to parse, and the dependency analysis (what does object A need to be stored?) is incredibly difficult. Nevertheless, the LHC data consists of C++ objects that are serialized with help from ROOT's reflection database and interpreter CINT. The fact that we can do it on that scale, and the performance with which we do it makes this approach unique and stirs interest even outside HEP. I will show how CINT collects and stores information about C++ types, what the current major challenges are (dictionary size), and what CINT and ROOT have done and plan to do about it.
Migration from relational to NoSQL database
NASA Astrophysics Data System (ADS)
Ghotiya, Sunita; Mandal, Juhi; Kandasamy, Saravanakumar
2017-11-01
Data generated by various real-time applications, social networking sites and sensor devices are huge in volume and unstructured, which makes it difficult for relational database management systems to handle them. Data are a precious component of any application and need to be analysed after being arranged in some structure. Relational databases are only able to deal with structured data, so there is a need for NoSQL database management systems, which can also deal with semi-structured data. Relational databases provide the easiest way to manage data, but as the use of NoSQL increases it is becoming necessary to migrate data from relational to NoSQL databases. Various frameworks have been proposed previously that provide mechanisms for migrating data stored in SQL warehouses, as well as middle-layer solutions that allow data to be stored in NoSQL databases in order to handle data that are not structured. This paper provides a literature review of some of the recent approaches proposed by various researchers to migrate data from relational to NoSQL databases. Some researchers proposed mechanisms for the co-existence of NoSQL and relational databases. This paper provides a summary of mechanisms which can be used for mapping data stored in relational databases to NoSQL databases. Various techniques for data transformation and middle-layer solutions are summarised in the paper.
3D Nearest Neighbour Search Using a Clustered Hierarchical Tree Structure
NASA Astrophysics Data System (ADS)
Suhaibah, A.; Uznir, U.; Anton, F.; Mioc, D.; Rahman, A. A.
2016-06-01
Locating and analysing the location of new stores or outlets is one of the common issues facing retailers and franchisers. The aim is to ensure that newly opened stores are at strategic locations that attract the highest possible number of customers. Spatial information is used to manage, maintain and analyse these store locations. However, since franchises and chain stores in urban areas operate within high-rise, multi-level buildings, a three-dimensional (3D) method is required in order to locate and identify surrounding information, such as the level at which a franchise unit will be located or whether a franchise unit is located at the best level for visibility. One of the analyses commonly used for retrieving the surrounding information is Nearest Neighbour (NN) analysis. It uses a point location and identifies the surrounding neighbours. However, with the immense number of urban datasets, the retrieval and analysis of nearest neighbour information and their efficiency will become more complex and crucial. In this paper, we present a technique to retrieve nearest neighbour information in 3D space using a clustered hierarchical tree structure. Based on our findings, the proposed approach showed a substantial improvement in response time compared to existing spatial access methods in databases. The query performance was tested using a dataset consisting of 500,000 point locations of buildings and franchising units. The results are presented in this paper. Another advantage of this structure is that it also offers minimal overlap and coverage among nodes, which can reduce repetitive data entry.
TryTransDB: A web-based resource for transport proteins in Trypanosomatidae.
Sonar, Krushna; Kabra, Ritika; Singh, Shailza
2018-03-12
TryTransDB is a web-based resource that stores transport protein data which can be retrieved using a standalone BLAST tool. We have attempted to create an integrated database that can be a one-stop shop for researchers working with transport proteins of the Trypanosomatidae family. TryTransDB (Trypanosomatidae Transport Protein Database) is a comprehensive web-based resource that can run a BLAST search against most of the transport protein sequences (protein and nucleotide) from organisms of the Trypanosomatidae family. This web resource further allows users to compute a phylogenetic tree by performing multiple sequence alignment (MSA) using the CLUSTALW suite embedded in it. Also, cross-linking to other databases helps in gathering more information about a given transport protein in a single website.
Conversion of a traditional image archive into an image resource on compact disc.
Andrew, S M; Benbow, E W
1997-01-01
A traditional archive of pathology images organised on 35 mm slides was converted into a database of images stored on compact disc (CD-ROM), and textual descriptions were added to each image record. Students on a didactic pathology course found this resource useful as an aid to revision, despite relative computer illiteracy, and it is anticipated that students on a new problem based learning course, which incorporates experience with information technology, will benefit even more readily when they use the database as an educational resource. A text and image database on CD-ROM can be updated repeatedly, and the content manipulated to reflect the content and style of the courses it supports. PMID:9306931
The Kepler DB: a database management system for arrays, sparse arrays, and binary data
NASA Astrophysics Data System (ADS)
McCauliff, Sean; Cote, Miles T.; Girouard, Forrest R.; Middour, Christopher; Klaus, Todd C.; Wohler, Bill
2010-07-01
The Kepler Science Operations Center stores pixel values on approximately six million pixels collected every 30 minutes, as well as data products that are generated as a result of running the Kepler science processing pipeline. The Kepler Database management system (Kepler DB) was created to act as the repository of this information. After one year of flight usage, Kepler DB is managing 3 TiB of data and is expected to grow to over 10 TiB over the course of the mission. Kepler DB is a non-relational, transactional database where data are represented as one-dimensional arrays, sparse arrays or binary large objects. We will discuss Kepler DB's APIs, implementation, usage and deployment at the Kepler Science Operations Center.
The Kepler DB, a Database Management System for Arrays, Sparse Arrays and Binary Data
NASA Technical Reports Server (NTRS)
McCauliff, Sean; Cote, Miles T.; Girouard, Forrest R.; Middour, Christopher; Klaus, Todd C.; Wohler, Bill
2010-01-01
The Kepler Science Operations Center stores pixel values on approximately six million pixels collected every 30 minutes, as well as data products that are generated as a result of running the Kepler science processing pipeline. The Kepler Database (Kepler DB) management system was created to act as the repository of this information. After one year of flight usage, Kepler DB is managing 3 TiB of data and is expected to grow to over 10 TiB over the course of the mission. Kepler DB is a non-relational, transactional database where data are represented as one-dimensional arrays, sparse arrays or binary large objects. We will discuss Kepler DB's APIs, implementation, usage and deployment at the Kepler Science Operations Center.
The Iranian National Geodata Revision Strategy and Realization Based on Geodatabase
NASA Astrophysics Data System (ADS)
Haeri, M.; Fasihi, A.; Ayazi, S. M.
2012-07-01
In recent years, the use of spatial databases for storing and managing spatial data has become a hot topic in the field of GIS. Accordingly, the National Cartographic Center of Iran (NCC) periodically produces spatial data that are usually held in databases. One of the NCC's major projects was designing the National Topographic Database (NTDB). NCC decided to create a National Topographic Database of the entire country based on 1:25000 coverage maps. The NTDB standard was published in 1994 and its database was created at the same time. In NTDB, geometric data were stored in MicroStation design format (DGN), in which each feature has a link to its attribute data (stored in a Microsoft Access file). NTDB files were also produced sheet-wise and then stored in a file-based style. Besides map compilation, revision of existing maps has already started. The key problems for NCC are the revision strategy, the file-based storage of NTDB, and operator challenges (NCC operators mostly prefer to edit and revise geometry data in CAD environments). A geodatabase solution for national geodata, based on NTDB map files and operators' revision preferences, is introduced herein. The proposed solution extends the traditional methods to provide a seamless spatial database that can be revised in CAD and GIS environments simultaneously. The proposed system is the common data framework to create a central data repository for spatial data storage and management.
Dynamic graph system for a semantic database
Mizell, David
2016-04-12
A method and system in a computer system for dynamically providing a graphical representation of a data store of entries via a matrix interface is disclosed. A dynamic graph system provides a matrix interface that exposes to an application program a graphical representation of data stored in a data store such as a semantic database storing triples. To the application program, the matrix interface represents the graph as a sparse adjacency matrix that is stored in compressed form. Each entry of the data store is considered to represent a link between nodes of the graph. Each entry has a first field and a second field identifying the nodes connected by the link and a third field with a value for the link that connects the identified nodes. The first, second, and third fields represent the rows, columns, and elements of the adjacency matrix.
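The mapping from three-field entries to a compressed sparse adjacency matrix can be sketched in Python as follows. This is an illustration only, not the patented method: compressed sparse row (CSR) is assumed as the compressed form, and the `triples_to_csr` and `neighbours` functions and the sample entries are hypothetical.

```python
def triples_to_csr(triples):
    """Build CSR arrays from (first_node, second_node, value) entries."""
    # Assign each distinct node a row/column index.
    nodes = sorted({n for first, second, _ in triples for n in (first, second)})
    index = {name: i for i, name in enumerate(nodes)}

    # Group links by row (first field), remembering column and value.
    rows = [[] for _ in nodes]
    for first, second, value in triples:
        rows[index[first]].append((index[second], value))

    # Flatten into the three CSR arrays: row pointers, column indices, values.
    indptr, indices, data = [0], [], []
    for row in rows:
        for col, value in sorted(row, key=lambda cv: cv[0]):
            indices.append(col)
            data.append(value)
        indptr.append(len(indices))
    return nodes, indptr, indices, data


def neighbours(row, nodes, indptr, indices):
    """Return the names of the nodes linked from the given row index."""
    return [nodes[c] for c in indices[indptr[row]:indptr[row + 1]]]


# Sample entries are invented for illustration only.
triples = [
    ("alice", "bob", "knows"),
    ("alice", "carol", "manages"),
    ("bob", "carol", "knows"),
]
nodes, indptr, indices, data = triples_to_csr(triples)
```

Only the non-empty elements are stored, so a graph with many nodes but few links per node stays compact, and the links leaving one node occupy a contiguous slice of `indices` and `data`.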
Dynamic graph system for a semantic database
Mizell, David
2015-01-27
A method and system in a computer system for dynamically providing a graphical representation of a data store of entries via a matrix interface is disclosed. A dynamic graph system provides a matrix interface that exposes to an application program a graphical representation of data stored in a data store such as a semantic database storing triples. To the application program, the matrix interface represents the graph as a sparse adjacency matrix that is stored in compressed form. Each entry of the data store is considered to represent a link between nodes of the graph. Each entry has a first field and a second field identifying the nodes connected by the link and a third field with a value for the link that connects the identified nodes. The first, second, and third fields represent the rows, columns, and elements of the adjacency matrix.
Selection and Management of DNA Markers for Use in Genomic Evaluation
USDA-ARS's Scientific Manuscript database
A database was constructed to store genotypes for 50,972 single-nucleotide polymorphisms (SNP) from the Illumina BovineSNP50 BeadChip for over 30,000 animals. The database allows storage of multiple samples per animal and stores all SNP genotypes for a sample in a single row. An indicator specifies ...
2012-11-27
with powerful analysis tools and an informatics approach leveraging best-of-breed NoSQL databases, in order to store, search and retrieve relevant...dictionaries, and JavaScript also has good support. The MongoDB project[15] was chosen as a scalable NoSQL data store for the cheminformatics components
"TPSX: Thermal Protection System Expert and Material Property Database"
NASA Technical Reports Server (NTRS)
Squire, Thomas H.; Milos, Frank S.; Rasky, Daniel J. (Technical Monitor)
1997-01-01
The Thermal Protection Branch at NASA Ames Research Center has developed a computer program for storing, organizing, and accessing information about thermal protection materials. The program, called Thermal Protection Systems Expert and Material Property Database, or TPSX, is available for the Microsoft Windows operating system. An "on-line" version is also accessible on the World Wide Web. TPSX is designed to be a high-quality source for TPS material properties presented in a convenient, easily accessible form for use by engineers and researchers in the field of high-speed vehicle design. Data can be displayed and printed in several formats. An information window displays a brief description of the material with properties at standard pressure and temperature. A spreadsheet window displays complete, detailed property information. Properties which are a function of temperature and/or pressure can be displayed as graphs. In any display the data can be converted from English to SI units with the click of a button. Two material databases included with TPSX are: 1) materials used and/or developed by the Thermal Protection Branch at NASA Ames Research Center, and 2) a database compiled by NASA Johnson Space Center (JSC). The Ames database contains over 60 advanced TPS materials including flexible blankets, rigid ceramic tiles, and ultra-high temperature ceramics. The JSC database contains over 130 insulative and structural materials. The Ames database is periodically updated and expanded as required to include newly developed materials and material property refinements.
NASA Technical Reports Server (NTRS)
Curto, Paul A. (Inventor); Brown, Gerald E. (Inventor); Zysko, Jan A. (Inventor)
2001-01-01
The present invention is a two-part wind advisory system comprising a ground station at an airfield and an airborne unit placed inside an aircraft. The ground station monitors wind conditions (wind speed, wind direction, and wind gust) at the airfield and transmits the wind conditions and an airfield ID to the airborne unit. The airborne unit identifies the airfield by comparing the received airfield ID with airfield IDs stored in a database. The airborne unit also calculates the headwind and crosswind for each runway in both directions at the airfield using the received wind conditions and runway information stored in the database. The airborne unit then determines a recommended runway for takeoff and landing operations of the aircraft based on the runway having the greatest headwind value and displays the airfield ID, wind conditions, and recommended runway to the pilot. Another embodiment of the present invention includes a wireless internet based airborne unit in which the airborne unit can receive the wind conditions from the ground station over the internet.
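The runway selection step can be sketched in Python as below. The abstract does not give the formulas, so this sketch assumes the standard trigonometric decomposition of wind into headwind and crosswind components; the function names, the use of knots, and the sample runway headings are hypothetical.

```python
import math


def wind_components(wind_speed, wind_from_deg, runway_heading_deg):
    """Return (headwind, crosswind) for one runway direction.

    wind_from_deg is the direction the wind blows from; a negative
    headwind is a tailwind. Units follow wind_speed (knots assumed).
    """
    angle = math.radians(wind_from_deg - runway_heading_deg)
    return wind_speed * math.cos(angle), wind_speed * math.sin(angle)


def recommend_runway(wind_speed, wind_from_deg, runway_headings_deg):
    """Check each runway in both directions; pick the greatest headwind."""
    best = None
    for heading in runway_headings_deg:
        for h in (heading % 360, (heading + 180) % 360):
            head, cross = wind_components(wind_speed, wind_from_deg, h)
            if best is None or head > best[1]:
                best = (h, head, cross)
    return best  # (heading in degrees, headwind, crosswind)


# Invented example: 20 kt wind from the west, two physical runways.
heading, headwind, crosswind = recommend_runway(20.0, 270.0, [90.0, 180.0])
```

In this example the west-facing direction of the first runway gives the full 20 kt as headwind, so it is the one selected.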
Scalable global grid catalogue for Run3 and beyond
NASA Astrophysics Data System (ADS)
Martinez Pedreira, M.; Grigoras, C.;
2017-10-01
The AliEn (ALICE Environment) file catalogue is a global unique namespace providing mapping between a UNIX-like logical name structure and the corresponding physical files distributed over 80 storage elements worldwide. Powerful search tools and hierarchical metadata information are integral parts of the system and are used by the Grid jobs as well as local users to store and access all files on the Grid storage elements. The catalogue has been in production since 2005 and over the past 11 years has grown to more than 2 billion logical file names. The backend is a set of distributed relational databases, ensuring smooth growth and fast access. Due to the anticipated fast future growth, we are looking for ways to enhance the performance and scalability by simplifying the catalogue schema while keeping the functionality intact. We investigated different backend solutions, such as distributed key value stores, as replacement for the relational database. This contribution covers the architectural changes in the system, together with the technology evaluation, benchmark results and conclusions.
R.E.DD.B.: A database for RESP and ESP atomic charges, and force field libraries
Dupradeau, François-Yves; Cézard, Christine; Lelong, Rodolphe; Stanislawiak, Élodie; Pêcher, Julien; Delepine, Jean Charles; Cieplak, Piotr
2008-01-01
The web-based RESP ESP charge DataBase (R.E.DD.B., http://q4md-forcefieldtools.org/REDDB) is a free and new source of RESP and ESP atomic charge values and force field libraries for model systems and/or small molecules. R.E.DD.B. stores highly effective and reproducible charge values and molecular structures in the Tripos mol2 file format, information about the charge derivation procedure, scripts to integrate the charges and molecular topology in the most common molecular dynamics packages. Moreover, R.E.DD.B. allows users to freely store and distribute RESP or ESP charges and force field libraries to the scientific community, via a web interface. The first version of R.E.DD.B., released in January 2006, contains force field libraries for molecules as well as molecular fragments for standard residues and their analogs (amino acids, monosaccharides, nucleotides and ligands), hence covering a vast area of relevant biological applications. PMID:17962302
Enhanced DIII-D Data Management Through a Relational Database
NASA Astrophysics Data System (ADS)
Burruss, J. R.; Peng, Q.; Schachter, J.; Schissel, D. P.; Terpstra, T. B.
2000-10-01
A relational database is being used to serve data about DIII-D experiments. The database is optimized for queries across multiple shots, allowing for rapid data mining by SQL-literate researchers. The relational database relates different experiments and datasets, thus providing a big picture of DIII-D operations. Users are encouraged to add their own tables to the database. Summary physics quantities about DIII-D discharges are collected and stored in the database automatically. Meta-data about code runs, MDSplus usage, and visualization tool usage are collected, stored in the database, and later analyzed to improve computing. Documentation on the database may be accessed through programming languages such as C, Java, and IDL, or through ODBC compliant applications such as Excel and Access. A database-driven web page also provides a convenient means for viewing database quantities through the World Wide Web. Demonstrations will be given at the poster.
Challenges in developing medicinal plant databases for sharing ethnopharmacological knowledge.
Ningthoujam, Sanjoy Singh; Talukdar, Anupam Das; Potsangbam, Kumar Singh; Choudhury, Manabendra Dutta
2012-05-07
Major research contributions in ethnopharmacology have generated vast amount of data associated with medicinal plants. Computerized databases facilitate data management and analysis making coherent information available to researchers, planners and other users. Web-based databases also facilitate knowledge transmission and feed the circle of information exchange between the ethnopharmacological studies and public audience. However, despite the development of many medicinal plant databases, a lack of uniformity is still discernible. Therefore, it calls for defining a common standard to achieve the common objectives of ethnopharmacology. The aim of the study is to review the diversity of approaches in storing ethnopharmacological information in databases and to provide some minimal standards for these databases. Survey for articles on medicinal plant databases was done on the Internet by using selective keywords. Grey literatures and printed materials were also searched for information. Listed resources were critically analyzed for their approaches in content type, focus area and software technology. Necessity for rapid incorporation of traditional knowledge by compiling primary data has been felt. While citation collection is common approach for information compilation, it could not fully assimilate local literatures which reflect traditional knowledge. Need for defining standards for systematic evaluation, checking quality and authenticity of the data is felt. Databases focussing on thematic areas, viz., traditional medicine system, regional aspect, disease and phytochemical information are analyzed. Issues pertaining to data standard, data linking and unique identification need to be addressed in addition to general issues like lack of update and sustainability. In the background of the present study, suggestions have been made on some minimum standards for development of medicinal plant database. 
In spite of variations in approaches, existence of many overlapping features indicates redundancy of resources and efforts. As the development of global data in a single database may not be possible in view of the culture-specific differences, efforts can be given to specific regional areas. Existing scenario calls for collaborative approach for defining a common standard in medicinal plant database for knowledge sharing and scientific advancement. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
Hayes, Laura; Horn, Marilee A.
2009-01-01
The U.S. Geological Survey, in cooperation with the New Hampshire Department of Environmental Services, estimated the amount of water demand, consumptive use, withdrawal, and return flow for each U.S. Census block in New Hampshire for the years 2005 (current) and 2020. Estimates of domestic, commercial, industrial, irrigation, and other nondomestic water use were derived through the use and innovative integration of several State and Federal databases, and by use of previously developed techniques. The New Hampshire Water Demand database was created as part of this study to store and integrate State of New Hampshire data central to the project. Within the New Hampshire Water Demand database, a lookup table was created to link the State databases and identify water users common to more than one database. The lookup table also allowed identification of withdrawal and return-flow locations of registered and unregistered commercial, industrial, agricultural, and other nondomestic users. Geographic information system data from the State were used in combination with U.S. Census Bureau spatial data to locate and quantify withdrawals and return flow for domestic users in each census block. Analyzing and processing the most recently available data resulted in census-block estimations of 2005 water use. Applying population projections developed by the State to the data sets enabled projection of water use for the year 2020. The results for each census block are stored in the New Hampshire Water Demand database and may be aggregated to larger political areas or watersheds to assess relative hydrologic stress on the basis of current and potential water availability.
Vaxvec: The first web-based recombinant vaccine vector database and its data analysis.
Deng, Shunzhou; Martin, Carly; Patil, Rasika; Zhu, Felix; Zhao, Bin; Xiang, Zuoshuang; He, Yongqun
2015-11-27
A recombinant vector vaccine uses an attenuated virus, bacterium, or parasite as the carrier to express a heterologous antigen(s). Many recombinant vaccine vectors and related vaccines have been developed and extensively investigated. To compare and better understand recombinant vectors and vaccines, we have generated Vaxvec (http://www.violinet.org/vaxvec), the first web-based database that stores various recombinant vaccine vectors and those experimentally verified vaccines that use these vectors. Vaxvec has now included 59 vaccine vectors that have been used in 196 recombinant vector vaccines against 66 pathogens and cancers. These vectors are classified to 41 viral vectors, 15 bacterial vectors, 1 parasitic vector, and 1 fungal vector. The most commonly used viral vaccine vectors are double-stranded DNA viruses, including herpesviruses, adenoviruses, and poxviruses. For example, Vaxvec includes 63 poxvirus-based recombinant vaccines for over 20 pathogens and cancers. Vaxvec collects 30 recombinant vector influenza vaccines that use 17 recombinant vectors and were experimentally tested in 7 animal models. In addition, over 60 protective antigens used in recombinant vector vaccines are annotated and analyzed. User-friendly web-interfaces are available for querying various data in Vaxvec. To support data exchange, the information of vaccine vectors, vaccines, and related information is stored in the Vaccine Ontology (VO). Vaxvec is a timely and vital source of vaccine vector database and facilitates efficient vaccine vector research and development. Copyright © 2015 Elsevier Ltd. All rights reserved.
Imaged Document Optical Correlation and Conversion System (IDOCCS)
NASA Astrophysics Data System (ADS)
Stalcup, Bruce W.; Dennis, Phillip W.; Dydyk, Robert B.
1999-03-01
Today, the paper document is fast becoming a thing of the past. With the rapid development of fast, inexpensive computing and storage devices, many government and private organizations are archiving their documents in electronic form (e.g., personnel records, medical records, patents, etc.). In addition, many organizations are converting their paper archives to electronic images, which are stored in a computer database. Because of this, there is a need to efficiently organize this data into comprehensive and accessible information resources. The Imaged Document Optical Correlation and Conversion System (IDOCCS) provides a total solution to the problem of managing and retrieving textual and graphic information from imaged document archives. At the heart of IDOCCS, optical correlation technology provides the search and retrieval capability of document images. The IDOCCS can be used to rapidly search for key words or phrases within the imaged document archives and can even determine the types of languages contained within a document. In addition, IDOCCS can automatically compare an input document with the archived database to determine if it is a duplicate, thereby reducing the overall resources required to maintain and access the document database. Embedded graphics on imaged pages can also be exploited, e.g., imaged documents containing an agency's seal or logo, or documents with a particular individual's signature block, can be singled out. With this dual capability, IDOCCS outperforms systems that rely on optical character recognition as a basis for indexing and storing only the textual content of documents for later retrieval.
A framework for interval-valued information system
NASA Astrophysics Data System (ADS)
Yin, Yunfei; Gong, Guanghong; Han, Liang
2012-09-01
An interval-valued information system is used to transform a conventional dataset into interval-valued form. To conduct interval-valued data mining, we carry out two investigations: (1) constructing the interval-valued information system, and (2) conducting interval-valued knowledge discovery. In constructing the interval-valued information system, we first discover the paired attributes in the database, then store them in neighbouring locations in a common database and regard them as 'one' new field. In conducting the interval-valued knowledge discovery, we use related prior knowledge as the control objectives and design an approximate closed-loop control mining system. On the implemented experimental platform (prototype), we conduct the corresponding experiments and compare the proposed algorithms with several typical algorithms, such as the Apriori algorithm, the FP-growth algorithm and the CLOSE+ algorithm. The experimental results show that the interval-valued information system method is more effective than the conventional algorithms in discovering interval-valued patterns.
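The construction step, in which paired attributes are merged into one interval-valued field, can be illustrated roughly as follows. The sketch assumes the pairs are already known (the paper discovers them; that step is not shown), and the function and field names are hypothetical.

```python
def to_interval_records(rows, pairs):
    """Merge paired attributes into single interval-valued fields.

    rows  -- list of dicts, one per conventional record
    pairs -- maps a new field name to (low_attr, high_attr); assumed given
    """
    paired = {attr for low_high in pairs.values() for attr in low_high}
    result = []
    for row in rows:
        # Keep unpaired attributes unchanged.
        record = {k: v for k, v in row.items() if k not in paired}
        # Replace each pair with one (low, high) interval.
        for name, (low_attr, high_attr) in pairs.items():
            a, b = row[low_attr], row[high_attr]
            record[name] = (min(a, b), max(a, b))
        result.append(record)
    return result
```

For example, a record with separate `t_min` and `t_max` columns becomes a record with a single `temp` interval.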
Mackey, Aaron J; Pearson, William R
2004-10-01
Relational databases are designed to integrate diverse types of information and manage large sets of search results, greatly simplifying genome-scale analyses. Relational databases are essential for management and analysis of large-scale sequence analyses, and can also be used to improve the statistical significance of similarity searches by focusing on subsets of sequence libraries most likely to contain homologs. This unit describes using relational databases to improve the efficiency of sequence similarity searching and to demonstrate various large-scale genomic analyses of homology-related data. This unit describes the installation and use of a simple protein sequence database, seqdb_demo, which is used as a basis for the other protocols. These include basic use of the database to generate a novel sequence library subset, how to extend and use seqdb_demo for the storage of sequence similarity search results and making use of various kinds of stored search results to address aspects of comparative genomic analysis.
Hall, Aaron Smalter; Shan, Yunfeng; Lushington, Gerald; Visvanathan, Mahesh
2016-01-01
Databases and exchange formats describing biological entities such as chemicals and proteins, along with their relationships, are a critical component of research in life sciences disciplines, including chemical biology wherein information about small molecule properties converges with cellular and molecular biology. Databases for storing biological entities are growing not only in size, but also in type, with many similarities between them and often subtle differences. The data formats available to describe and exchange these entities are numerous as well. In general, each format is optimized for a particular purpose or database, and hence some understanding of these formats is required when choosing one for research purposes. This paper reviews a selection of different databases and data formats with the goal of summarizing their purposes, features, and limitations. Databases are reviewed under the categories of 1) protein interactions, 2) metabolic pathways, 3) chemical interactions, and 4) drug discovery. Representation formats will be discussed according to those describing chemical structures, and those describing genomic/proteomic entities. PMID:22934944
Smalter Hall, Aaron; Shan, Yunfeng; Lushington, Gerald; Visvanathan, Mahesh
2013-03-01
Databases and exchange formats describing biological entities such as chemicals and proteins, along with their relationships, are a critical component of research in life sciences disciplines, including chemical biology wherein information about small molecule properties converges with cellular and molecular biology. Databases for storing biological entities are growing not only in size, but also in type, with many similarities between them and often subtle differences. The data formats available to describe and exchange these entities are numerous as well. In general, each format is optimized for a particular purpose or database, and hence some understanding of these formats is required when choosing one for research purposes. This paper reviews a selection of different databases and data formats with the goal of summarizing their purposes, features, and limitations. Databases are reviewed under the categories of 1) protein interactions, 2) metabolic pathways, 3) chemical interactions, and 4) drug discovery. Representation formats will be discussed according to those describing chemical structures, and those describing genomic/proteomic entities.
ARCPHdb: A comprehensive protein database for SF1 and SF2 helicase from archaea.
Moukhtar, Mirna; Chaar, Wafi; Abdel-Razzak, Ziad; Khalil, Mohamad; Taha, Samir; Chamieh, Hala
2017-01-01
Superfamily 1 and Superfamily 2 helicases, two of the largest helicase protein families, play vital roles in many biological processes including replication, transcription and translation. Study of helicase proteins in the model microorganisms of archaea has largely contributed to the understanding of their function, architecture and assembly. Based on a large phylogenomics approach, we have identified and classified all SF1 and SF2 protein families in ninety-five sequenced archaea genomes. Here we developed an online webserver linked to a specialized protein database named ARCPHdb to provide access to SF1 and SF2 helicase families from archaea. ARCPHdb was implemented using a MySQL relational database. Web interfaces were developed using Netbeans. Data were stored according to UniProt accession numbers, NCBI Ref Seq ID, PDB IDs and Entrez Databases. A user-friendly interactive web interface has been developed to browse, search and download archaeal helicase protein sequences, their available 3D structure models, and related documentation available in the literature provided by ARCPHdb. The database provides direct links to matching external databases. ARCPHdb is the first online database to compile all protein information on SF1 and SF2 helicases from archaea in one platform. This database provides essential resource information for all researchers interested in the field. Copyright © 2016 Elsevier Ltd. All rights reserved.
MIPS: a database for protein sequences, homology data and yeast genome information.
Mewes, H W; Albermann, K; Heumann, K; Liebl, S; Pfeiffer, F
1997-01-01
The MIPS group (Martinsried Institute for Protein Sequences) at the Max-Planck-Institute for Biochemistry, Martinsried near Munich, Germany, collects, processes and distributes protein sequence data within the framework of the tripartite association of the PIR-International Protein Sequence Database. MIPS contributes nearly 50% of the data input to the PIR-International Protein Sequence Database. The database is distributed on CD-ROM together with PATCHX, an exhaustive supplement of unique, unverified protein sequences from external sources compiled by MIPS. Through its WWW server (http://www.mips.biochem.mpg.de/) MIPS permits internet access to sequence databases, homology data and to yeast genome information. (i) Sequence similarity results from the FASTA program are stored in the FASTA database for all proteins from PIR-International and PATCHX. The database is dynamically maintained and permits instant access to FASTA results. (ii) Starting with FASTA database queries, proteins have been classified into families and superfamilies (PROT-FAM). (iii) The HPT (hashed position tree) data structure developed at MIPS is a new approach for rapid sequence and pattern searching. (iv) MIPS provides access to the sequence and annotation of the complete yeast genome, the functional classification of yeast genes (FunCat) and its graphical display, the 'Genome Browser'. A CD-ROM based on the JAVA programming language providing dynamic interactive access to the yeast genome and the related protein sequences has been compiled and is available on request. PMID:9016498
Extracting Databases from Dark Data with DeepDive.
Zhang, Ce; Shin, Jaeho; Ré, Christopher; Cafarella, Michael; Niu, Feng
2016-01-01
DeepDive is a system for extracting relational databases from dark data: the mass of text, tables, and images that are widely collected and stored but which cannot be exploited by standard relational tools. If the information in dark data - scientific papers, Web classified ads, customer service notes, and so on - were instead in a relational database, it would give analysts a massive and valuable new set of "big data." DeepDive is distinctive when compared to previous information extraction systems in its ability to obtain very high precision and recall at reasonable engineering cost; in a number of applications, we have used DeepDive to create databases with accuracy that meets that of human annotators. To date we have successfully deployed DeepDive to create data-centric applications for insurance, materials science, genomics, paleontologists, law enforcement, and others. The data unlocked by DeepDive represents a massive opportunity for industry, government, and scientific researchers. DeepDive is enabled by an unusual design that combines large-scale probabilistic inference with a novel developer interaction cycle. This design is enabled by several core innovations around probabilistic training and inference.
d'Acierno, Antonio; Facchiano, Angelo; Marabotti, Anna
2009-06-01
We describe the GALT-Prot database and its related web-based application that have been developed to collect information about the structural and functional effects of mutations on the human enzyme galactose-1-phosphate uridyltransferase (GALT) involved in the genetic disease named galactosemia type I. Besides a list of missense mutations at gene and protein sequence levels, GALT-Prot reports the analysis results of mutant GALT structures. In addition to the structural information about the wild-type enzyme, the database also includes structures of over 100 single point mutants simulated by means of a computational procedure, and the analysis to each mutant was made with several bioinformatics programs in order to investigate the effect of the mutations. The web-based interface allows querying of the database, and several links are also provided in order to guarantee a high integration with other resources already present on the web. Moreover, the architecture of the database and the web application is flexible and can be easily adapted to store data related to other proteins with point mutations. GALT-Prot is freely available at http://bioinformatica.isa.cnr.it/GALT/.
Using SQL Databases for Sequence Similarity Searching and Analysis.
Pearson, William R; Mackey, Aaron J
2017-09-13
Relational databases can integrate diverse types of information and manage large sets of similarity search results, greatly simplifying genome-scale analyses. By focusing on taxonomic subsets of sequences, relational databases can reduce the size and redundancy of sequence libraries and improve the statistical significance of homologs. In addition, by loading similarity search results into a relational database, it becomes possible to explore and summarize the relationships between all of the proteins in an organism and those in other biological kingdoms. This unit describes how to use relational databases to improve the efficiency of sequence similarity searching and demonstrates various large-scale genomic analyses of homology-related data. It also describes the installation and use of a simple protein sequence database, seqdb_demo, which is used as a basis for the other protocols. The unit also introduces search_demo, a database that stores sequence similarity search results. The search_demo database is then used to explore the evolutionary relationships between E. coli proteins and proteins in other organisms in a large-scale comparative genomic analysis. © 2017 by John Wiley & Sons, Inc. Copyright © 2017 John Wiley & Sons, Inc.
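The idea of building a sequence library from a taxonomic subset can be sketched with a small relational example. The sketch below uses Python's built-in sqlite3 module rather than the database engine used in the unit, and the table layout, accession strings and sequences are hypothetical placeholders, not the actual seqdb_demo schema.

```python
import sqlite3

def build_demo_db():
    """Create a tiny in-memory protein database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE taxon (taxon_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE protein (
            acc TEXT PRIMARY KEY,
            taxon_id INTEGER REFERENCES taxon(taxon_id),
            seq TEXT
        );
        """
    )
    conn.executemany(
        "INSERT INTO taxon VALUES (?, ?)",
        [(562, "Escherichia coli"), (9606, "Homo sapiens")],
    )
    conn.executemany(
        "INSERT INTO protein VALUES (?, ?, ?)",
        [
            ("P_DEMO1", 562, "MKTAYIAK"),   # made-up sequences
            ("P_DEMO2", 9606, "MEEPQSDP"),
            ("P_DEMO3", 562, "MSKGEELF"),
        ],
    )
    return conn

def fasta_subset(conn, taxon_name):
    """Return a FASTA-formatted library restricted to one taxon."""
    rows = conn.execute(
        "SELECT p.acc, p.seq FROM protein p "
        "JOIN taxon t ON t.taxon_id = p.taxon_id "
        "WHERE t.name = ? ORDER BY p.acc",
        (taxon_name,),
    ).fetchall()
    return "".join(f">{acc}\n{seq}\n" for acc, seq in rows)
```

Searching against the smaller library returned by `fasta_subset` is what reduces the number of comparisons and, with it, the statistical penalty for multiple tests.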
Corwin, John; Silberschatz, Avi; Miller, Perry L; Marenco, Luis
2007-01-01
Data sparsity and schema evolution issues affecting clinical informatics and bioinformatics communities have led to the adoption of vertical or object-attribute-value-based database schemas to overcome limitations posed when using conventional relational database technology. This paper explores these issues and discusses why biomedical data are difficult to model using conventional relational techniques. The authors propose a solution to these obstacles based on a relational database engine using a sparse, column-store architecture. The authors provide benchmarks comparing the performance of queries and schema-modification operations using three different strategies: (1) the standard conventional relational design; (2) past approaches used by biomedical informatics researchers; and (3) their sparse, column-store architecture. The performance results show that their architecture is a promising technique for storing and processing many types of data that are not handled well by the other two semantic data models.
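For readers unfamiliar with the vertical (entity-attribute-value) layout that the paper compares against, the following sketch shows why it tolerates sparse data and why queries need a pivot step. It illustrates only the earlier EAV approach, not the authors' sparse column-store engine; the table name, attributes and values are invented for the example, and sqlite3 is used purely for convenience.

```python
import sqlite3

def make_eav():
    """Build a small EAV table: one row per (entity, attribute) actually observed."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE eav (entity INTEGER, attribute TEXT, value TEXT)")
    conn.executemany(
        "INSERT INTO eav VALUES (?, ?, ?)",
        [
            (1, "heart_rate", "72"),
            (1, "glucose", "5.4"),
            (2, "heart_rate", "88"),  # entity 2 has no glucose row at all
        ],
    )
    return conn

def pivot(conn, attributes):
    """Rebuild one wide row per entity; unobserved attributes come back as None."""
    columns = ", ".join(
        "MAX(CASE WHEN attribute = ? THEN value END)" for _ in attributes
    )
    sql = f"SELECT entity, {columns} FROM eav GROUP BY entity ORDER BY entity"
    return conn.execute(sql, list(attributes)).fetchall()
```

Adding a new attribute needs no schema change in this layout, but every wide query pays for the pivot, which is the kind of trade-off the benchmarks in the paper examine.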
Study on parallel and distributed management of RS data based on spatial database
NASA Astrophysics Data System (ADS)
Chen, Yingbiao; Qian, Qinglan; Wu, Hongqiao; Liu, Shijin
2009-10-01
With the rapid development of earth-observing technology, the storage, management and publication of remote sensing (RS) image data have become a bottleneck for its application and popularization. There are two prominent problems in RS image data storage and management systems. First, the background server can hardly handle the heavy processing of large volumes of RS data stored at different nodes in a distributed environment, which places a heavy burden on it. Second, there is no unique, standard and rational organization of multi-sensor RS data for storage and management, and much information is lost or not included at storage time. To address these two problems, the paper puts forward a framework for a parallel and distributed RS image data management and storage system. The system aims at an RS data information system based on a parallel background server and a distributed data management system. Towards these two goals, the paper studies the following key techniques and draws some instructive conclusions. It puts forward a solid index of "Pyramid, Block, Layer, Epoch" according to the properties of RS image data. With the solid index mechanism, a rational organization of multi-sensor RS image data of different resolutions, areas, bands and periods is achieved. In data storage, RS data is not divided into binary large objects to be stored in a current relational database system; instead, it is reorganized through the solid index mechanism, and a logical image database for the RS image data files is constructed. In system architecture, the paper sets up a framework based on a parallel server built from several common computers. Under this framework, the background process is divided into two parts: the common web process and the parallel process.
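One way to picture the "Pyramid, Block, Layer, Epoch" index is as a four-part composite key over image tiles. The sketch below is an assumption about how such a key could be used, not the paper's data structure: the class names, the meaning given to each component, and the storage locations are all hypothetical.

```python
from typing import NamedTuple, Optional, Tuple

class TileKey(NamedTuple):
    pyramid: int             # resolution level (assumed: 0 = full resolution)
    block: Tuple[int, int]   # (row, col) of the tile within that level
    layer: str               # spectral band
    epoch: str               # acquisition period

class TileIndex:
    """In-memory map from composite key to a storage location string."""

    def __init__(self):
        self._tiles = {}

    def put(self, key: TileKey, location: str) -> None:
        self._tiles[key] = location

    def query(self, pyramid: Optional[int] = None, block=None,
              layer: Optional[str] = None, epoch: Optional[str] = None):
        """Return sorted locations matching every component that is not None."""
        wanted = (pyramid, block, layer, epoch)
        return sorted(
            location
            for key, location in self._tiles.items()
            if all(w is None or w == k for w, k in zip(wanted, key))
        )
```

Leaving a component as `None` acts as a wildcard, so the same index answers "all bands of one block" and "one band across all epochs" without separate tables.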
Case and Model Driven Dynamic Template Linking
2005-06-01
store the trips in a PostgreSQL database (www.postgresql.org) and the values stored in this database could be re-used to provide values for similar trips...Preferences YES Yes but limited Print Form YES NO Close Form YES NO Just “X” Quit YES NO Just “X” Show User Action History YES NO 6.5 DAML Ontologies
Nemesis I: Parallel Enhancements to ExodusII
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hennigan, Gary L.; John, Matthew S.; Shadid, John N.
2006-03-28
NEMESIS I is an enhancement to the EXODUS II finite element database model used to store and retrieve data for unstructured parallel finite element analyses. NEMESIS I adds data structures which facilitate the partitioning of a scalar (standard serial) EXODUS II file onto parallel disk systems found on many parallel computers. Since the NEMESIS I application programming interface (API) can be used to append information to an existing EXODUS II database, any existing software that reads EXODUS II files can be used on files which contain NEMESIS I information. The NEMESIS I information is written and read via C or C++ callable functions which comprise the NEMESIS I API.
Development of a medical module for disaster information systems.
Calik, Elif; Atilla, Rıdvan; Kaya, Hilal; Aribaş, Alirıza; Cengiz, Hakan; Dicle, Oğuz
2014-01-01
This study aims to develop a medical module that provides a real-time flow of medical information about the pre-hospital processes of health care delivery in disasters, transferring, storing and processing electronic records over the internet as part of a disaster information system. The study was carried out within the framework of providing information flow among professionals in a disaster, coordinating the health care team and transferring complete information to specified people in real time. A Microsoft Access database and the SQL query language were used for the database applications. The system was developed on the Microsoft .NET platform using the C# language. The disaster information system medical module was designed to be used in the disaster area, field hospitals, nearby hospitals, temporary settlements such as tent cities, and dispatch vehicles, and to provide information flow between medical officials and data centres. For fast recording of disaster victim data, health care professionals were granted access to the database, following an analysis of process steps and the creation of minimal datasets. Database fields were created so that new data can be entered and old data recorded before the disaster can be searched. A web application provides access to the database, such as data entry and searching, through the designed interfaces according to the access level of the login credentials. The homepage and user interfaces, built on the database as a result of the system analyses, were made available to users through the www.afmedinfo.com web site. With this study, a recommendation was made about how to use disaster-based information systems in the field of health. Awareness has been raised that a disaster information system should not be perceived only as an early warning system. 
The contents and differences of the health care practices of disaster information systems were revealed. A web application was developed that links the user and the database, supporting data entry and data queries through the developed interfaces.
BIO-Plex Information System Concept
NASA Technical Reports Server (NTRS)
Jones, Harry; Boulanger, Richard; Arnold, James O. (Technical Monitor)
1999-01-01
This paper describes a suggested design for an integrated information system for the proposed BIO-Plex (Bioregenerative Planetary Life Support Systems Test Complex) at Johnson Space Center (JSC), including distributed control systems, central control, networks, database servers, personal computers and workstations, applications software, and external communications. The system will have an open commercial computing and networking architecture. The network will provide automatic real-time transfer of information to database server computers which perform data collection and validation. This information system will support integrated, data-sharing applications for everything from system alarms to management summaries. Most existing complex process control systems have information gaps between the different real-time subsystems, between these subsystems and the central controller, between the central controller and system-level planning and analysis application software, and between the system-level applications and management overview reporting. An integrated information system is vitally necessary as the basis for the integration of planning, scheduling, modeling, monitoring, and control, which will allow improved monitoring and control based on timely, accurate and complete data. Data describing the system configuration and the real-time processes can be collected, checked and reconciled, analyzed and stored in database servers that can be accessed by all applications. The required technology is available. The only opportunity to design a distributed, nonredundant, integrated system is before it is built. Retrofit is extremely difficult and costly.
Kim, Hwa Sun; Cho, Hune
2011-01-01
Objectives: The Health Level Seven Interface Engine (HL7 IE), developed by Kyungpook National University, has been employed in health information systems; however, users without a background in programming have reported difficulties in using it. Therefore, we developed a graphical user interface (GUI) engine to make the use of the HL7 IE more convenient. Methods: The GUI engine was directly connected with the HL7 IE to handle the HL7 version 2.x messages. Furthermore, the information exchange rules (called the mapping data), represented by a conceptual graph in the GUI engine, were transformed into program objects that were made available to the HL7 IE; the mapping data were stored as binary files for reuse. The usefulness of the GUI engine was examined through information exchange tests between an HL7 version 2.x message and a health information database system. Results: Users could easily create HL7 version 2.x messages by creating a conceptual graph through the GUI engine without requiring assistance from programmers. In addition, time could be saved when creating new information exchange rules by reusing the stored mapping data. Conclusions: The GUI engine was not able to incorporate information types (e.g., extensible markup language, XML) other than the HL7 version 2.x messages and the database, because it was designed exclusively for the HL7 IE protocol. However, in future work, by including additional parsers to manage XML-based information such as Continuity of Care Documents (CCD) and Continuity of Care Records (CCR), we plan to ensure that the GUI engine will be more widely accessible for the health field. PMID:22259723
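To make the notion of mapping between an HL7 version 2.x message and database fields concrete, here is a minimal hand-written sketch of the kind of rule the GUI engine is meant to spare users from coding. It relies only on the standard delimiters of HL7 v2 (segments separated by carriage returns, fields by `|`, components by `^`); the sample message, the function names and the target column names are hypothetical, and this is not the HL7 IE's own API.

```python
# Hypothetical sample message: an MSH header segment and one PID segment.
SAMPLE = (
    "MSH|^~\\&|SENDER|FAC|RECEIVER|FAC|201101010000||ADT^A01|MSG0001|P|2.5\r"
    "PID|1||12345^^^HOSP||DOE^JANE||19800101|F"
)

def parse_hl7_v2(message):
    """Split a message into {segment_id: [field lists]}.

    For segments other than MSH, fields[n] corresponds to field n of the
    segment (fields[0] is the segment ID itself).
    """
    segments = {}
    for line in filter(None, message.replace("\n", "\r").split("\r")):
        fields = line.split("|")
        segments.setdefault(fields[0], []).append(fields)
    return segments

def patient_row(message):
    """Map PID-3, PID-5 and PID-7 to a flat dict ready for a database insert."""
    pid = parse_hl7_v2(message)["PID"][0]
    family, given = (pid[5].split("^") + [""])[:2]
    return {
        "patient_id": pid[3].split("^")[0],  # first component of PID-3
        "family_name": family,               # PID-5.1
        "given_name": given,                 # PID-5.2
        "birth_date": pid[7],                # PID-7, YYYYMMDD
    }
```

Each line in `patient_row` is one exchange rule; storing such rules as reusable data rather than code is the design choice the abstract describes.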
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lamont, Stephen Philip; Brisson, Marcia; Curry, Michael
2011-02-17
Nuclear forensics assessments to determine material process history require careful comparison of sample data to both measured and modeled nuclear material characteristics. Developing centralized databases, or nuclear forensics libraries, to house this information is an important step to ensure all relevant data will be available for comparison during a nuclear forensics analysis and help expedite the assessment of material history. The approach most widely accepted by the international community at this time is the implementation of National Nuclear Forensics libraries, which would be developed and maintained by individual nations. This is an attractive alternative to an international database since it provides an understanding that each country has data on materials produced and stored within their borders, but eliminates the need to reveal any proprietary or sensitive information to other nations. To support the concept of National Nuclear Forensics libraries, the United States Department of Energy has developed a model library, based on a data dictionary, or set of parameters designed to capture all nuclear forensic relevant information about a nuclear material. Specifically, information includes material identification, collection background and current location, analytical laboratories where measurements were made, material packaging and container descriptions, physical characteristics including mass and dimensions, chemical and isotopic characteristics, particle morphology or metallurgical properties, process history including facilities, and measurement quality assurance information. While not necessarily required, it may also be valuable to store modeled data sets including reactor burn-up or enrichment cascade data for comparison.
It is fully expected that only a subset of this information is available or relevant to many materials, and much of the data populating a National Nuclear Forensics library would be process analytical or material accountability measurement data as opposed to a complete forensic analysis of each material in the library.
Kim, Hwa Sun; Cho, Hune; Lee, In Keun
2011-12-01
The Health Level Seven Interface Engine (HL7 IE), developed by Kyungpook National University, has been employed in health information systems; however, users without a background in programming have reported difficulties in using it. Therefore, we developed a graphical user interface (GUI) engine to make the use of the HL7 IE more convenient. The GUI engine was directly connected with the HL7 IE to handle the HL7 version 2.x messages. Furthermore, the information exchange rules (called the mapping data), represented by a conceptual graph in the GUI engine, were transformed into program objects that were made available to the HL7 IE; the mapping data were stored as binary files for reuse. The usefulness of the GUI engine was examined through information exchange tests between an HL7 version 2.x message and a health information database system. Users could easily create HL7 version 2.x messages by creating a conceptual graph through the GUI engine without requiring assistance from programmers. In addition, time could be saved when creating new information exchange rules by reusing the stored mapping data. The GUI engine was not able to incorporate information types (e.g., extensible markup language, XML) other than the HL7 version 2.x messages and the database, because it was designed exclusively for the HL7 IE protocol. However, in future work, by including additional parsers to manage XML-based information such as Continuity of Care Documents (CCD) and Continuity of Care Records (CCR), we plan to ensure that the GUI engine will be more widely accessible for the health field.
E&P data lifecycle: a case study in Petrobras Company
NASA Astrophysics Data System (ADS)
Mastella, Laura; Campinho, Vania; Alonso, João
2013-04-01
Petrobras, the biggest Brazilian petroleum company, has been studying and working on Brazilian sedimentary basins for nearly 60 years. The corporate database currently registers over 25000 wells and all their associated products (geophysical logs, cores, sidewall samples) and analyses. There are thousands of samples, descriptions, pictures, measurements, and other scientific data resulting from petroleum exploration and production. This data constitutes a huge scientific database which is applied to support Petrobras' economic strategy. Geological models built during the exploration phase continue to be refined during both the development and production phases: data should be continually manipulated, correlated and integrated. As E&P assets reach maturity, a new cycle starts: data is re-analyzed and new hypotheses are made in order to increase hydrocarbon productivity. Initial geological models then evolve from accumulated knowledge throughout all the E&P phases. Therefore, quality control must be performed in the first phases of data acquisition, i.e., during the exploration phase, to avoid reworking and loss of information. The last decade witnessed a great evolution in petroleum industry technology. As a consequence, the complexity and particulars of the information generated have increased accordingly. Current technology has also facilitated access to networks and databases, making it possible to store large amounts of information. This scenario makes available a large mass of information from different sources, which use heterogeneous vocabulary as well as different scales and measurement units. In this context, knowledge might be diluted and the total amount of information cannot be applied in the E&P process. In order to provide adequate data governance, data input is controlled by rules, standards and policies, implemented by corporate software systems. Petrobras' integrated E&P database is a centralized repository to which all E&P systems can have access.
The quality of the data that goes into the database can be increased by means of information management practices: data validation; language internationalization; and dictionaries, patterns, and metadata. Moreover, stored data must be kept consistent, and any changes in the data should be registered while maintaining, if possible, the original data, associating the modification with its author, timestamp and reason. These practices lead to the creation of a database that serves and benefits the company's knowledge. Information retrieval and visualization is one of the main issues concerning petroleum industries. In order to make significant information available for end-users, it is fundamental to have an efficient data integration strategy. The integration of E&P data, such as geological, geophysical, geographical and operational data, is the end goal of the exploratory activities. Petrobras corporate systems are evolving towards it so as to make available various data from diverse sources and to create a dashboard that can be easily accessed at any time by geoscientists and reservoir engineers. The main goal is to maintain the scientific integrity of information, from generators to consumers, throughout the entire E&P data life cycle.
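The practice of registering each change while keeping the original value, together with its author, timestamp and reason, can be sketched as a small change log. This is an illustration only, not Petrobras' system: the `sample` and `change_log` tables, the `depth_m` column, and the function names are hypothetical, and SQLite stands in for whatever corporate database is actually used.

```python
import sqlite3
from datetime import datetime, timezone


def open_store():
    """Create an in-memory store with a data table and a change log (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE sample (id INTEGER PRIMARY KEY, depth_m REAL NOT NULL);
        CREATE TABLE change_log (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            sample_id INTEGER NOT NULL REFERENCES sample(id),
            old_depth_m REAL,
            new_depth_m REAL,
            author TEXT NOT NULL,
            changed_at TEXT NOT NULL,
            reason TEXT NOT NULL
        );
    """)
    return conn


def update_depth(conn, sample_id, new_depth_m, author, reason):
    """Update a value and record the original, author, timestamp and reason atomically."""
    with conn:  # commits on success, rolls back if anything below raises
        row = conn.execute(
            "SELECT depth_m FROM sample WHERE id = ?", (sample_id,)
        ).fetchone()
        if row is None:
            raise KeyError(sample_id)
        conn.execute(
            "INSERT INTO change_log "
            "(sample_id, old_depth_m, new_depth_m, author, changed_at, reason) "
            "VALUES (?, ?, ?, ?, ?, ?)",
            (sample_id, row[0], new_depth_m, author,
             datetime.now(timezone.utc).isoformat(), reason),
        )
        conn.execute(
            "UPDATE sample SET depth_m = ? WHERE id = ?", (new_depth_m, sample_id)
        )
```

Writing the log entry and the update in one transaction means a value can never change without a matching record of who changed it and why.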
Swetha, Rayapadi G; Kala Sekar, Dinesh Kumar; Ramaiah, Sudha; Anbarasu, Anand; Sekar, Kanagaraj
2014-12-01
Haemophilus influenzae (H. influenzae) is the causative agent of pneumonia, bacteraemia and meningitis. The organism is responsible for a large number of deaths in both developed and developing countries. Even though the first bacterial genome to be sequenced was that of H. influenzae, there is no exclusive database dedicated to H. influenzae. This prompted us to develop the Haemophilus influenzae Genome Database (HIGDB). All data of HIGDB are stored and managed in a MySQL database. The HIGDB is hosted on a Solaris server and developed using PERL modules. Ajax and JavaScript are used for the interface development. The HIGDB contains detailed information on 42,741 proteins, 18,077 genes including 10 whole genome sequences and also 284 three-dimensional structures of proteins of H. influenzae. In addition, the database provides "Motif search" and "GBrowse". The HIGDB is freely accessible through the URL: http://bioserver1.physics.iisc.ernet.in/HIGDB/. The HIGDB will be a single point of access for bacteriological, clinical, genomic and proteomic information of H. influenzae. The database can also be used to identify DNA motifs within H. influenzae genomes and to compare gene or protein sequences of a particular strain with other strains of H. influenzae. Copyright © 2014 Elsevier Ltd. All rights reserved.
Stability assessment of structures under earthquake hazard through GRID technology
NASA Astrophysics Data System (ADS)
Prieto Castrillo, F.; Boton Fernandez, M.
2009-04-01
This work presents a GRID framework to estimate the vulnerability of structures under earthquake hazard. The tool has been designed to cover the needs of a typical earthquake engineering stability analysis: preparation of input data (pre-processing), response computation and stability analysis (post-processing). In order to validate the application over GRID, a simplified model of a structure under artificially generated earthquake records has been implemented. To achieve this goal, the proposed scheme exploits the GRID technology and its main advantages (parallel intensive computing, huge storage capacity and collaboration analysis among institutions) through intensive interaction among the GRID elements (Computing Element, Storage Element, LHC File Catalogue, federated database, etc.). The dynamical model is described by a set of ordinary differential equations (ODEs) and by a set of parameters. Both elements, along with the integration engine, are encapsulated into Java classes. With this high-level design, subsequent improvements/changes of the model can be addressed with little effort. In the procedure, an earthquake record database is prepared and stored (pre-processing) in the GRID Storage Element (SE). The Metadata of these records is also stored in the GRID federated database. This Metadata contains both relevant information about the earthquake (as is usual in a seismic repository) and also the Logical File Name (LFN) of the record for its later retrieval. Then, from the available set of accelerograms in the SE, the user can specify a range of earthquake parameters to carry out a dynamic analysis. This way, a GRID job is created for each selected accelerogram in the database. At the GRID Computing Element (CE), displacements are then obtained by numerical integration of the ODEs over time. The resulting response for that configuration is stored in the GRID Storage Element (SE) and the maximum structure displacement is computed.
Then, the corresponding Metadata containing the response LFN, earthquake magnitude and maximum structure displacement is also stored. Finally, the displacements are post-processed through a statistically-based algorithm from the available Metadata to obtain the probability of collapse of the structure for different earthquake magnitudes. From this study, it is possible to build a vulnerability report for the structure type and seismic data. The proposed methodology can be combined with the on-going initiatives to build a European earthquake record database. In this context, Grid enables collaboration analysis over shared seismic data and results among different institutions.
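The per-accelerogram job and the post-processing step described above can be sketched in a few lines. The abstract does not give the structural model or the statistical algorithm, so everything here is an assumption: a single-degree-of-freedom damped oscillator stands in for the "simplified model", a semi-implicit Euler scheme stands in for the integration engine, and the collapse probability is taken as the plain fraction of records whose peak displacement exceeds a limit. The function names and default parameters are hypothetical.

```python
import math


def max_displacement(accel, dt, omega=2.0 * math.pi, zeta=0.05):
    """Peak |x| of x'' + 2*zeta*omega*x' + omega**2 * x = -a_g(t).

    accel: ground acceleration samples a_g, one per time step of length dt.
    Assumed model: one-degree-of-freedom oscillator, semi-implicit Euler steps.
    """
    x, v, peak = 0.0, 0.0, 0.0
    for a_g in accel:
        a = -a_g - 2.0 * zeta * omega * v - omega ** 2 * x
        v += a * dt          # update velocity first
        x += v * dt          # then position, using the new velocity
        peak = max(peak, abs(x))
    return peak


def collapse_probability(results, limit):
    """Fraction of records per magnitude whose peak displacement exceeds limit.

    results: iterable of (magnitude, max_displacement) pairs, as would be read
    back from the stored metadata. The exceedance fraction is an assumed
    stand-in for the paper's statistically based algorithm.
    """
    counts = {}
    for magnitude, displacement in results:
        total, failed = counts.get(magnitude, (0, 0))
        counts[magnitude] = (total + 1, failed + (displacement > limit))
    return {m: failed / total for m, (total, failed) in counts.items()}
```

Because each call to `max_displacement` depends only on its own record, the jobs are independent, which is what makes the one-job-per-accelerogram layout a natural fit for distributed computing.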
Gavrielides, Mike; Furney, Simon J; Yates, Tim; Miller, Crispin J; Marais, Richard
2014-01-01
Whole genomes, whole exomes and transcriptomes of tumour samples are sequenced routinely to identify the drivers of cancer. The systematic sequencing and analysis of tumour samples, as well as other oncogenomic experiments, necessitates the tracking of relevant sample information throughout the investigative process. These meta-data of the sequencing and analysis procedures include information about the samples and projects as well as the sequencing centres, platforms, data locations, results locations, alignments, analysis specifications and further information relevant to the experiments. The current work presents a sample tracking system for oncogenomic studies (Onco-STS) to store these data and make them easily accessible to the researchers who work with the samples. The system is a web application, which includes a database and a front-end web page that allows the remote access, submission and updating of the sample data in the database. The web application development framework Grails was used for the development and implementation of the system. The resulting Onco-STS solution is efficient, secure and easy to use and is intended to replace the manual data handling of text records. Onco-STS allows simultaneous remote access to the system, making collaboration among researchers more effective. The system stores both information on the samples in oncogenomic studies and details of the analyses conducted on the resulting data. Onco-STS is based on open-source software, is easy to develop and can be modified according to a research group's needs. Hence it is suitable for laboratories that do not require a commercial system.
International Database of Volcanic Ash Impacts
NASA Astrophysics Data System (ADS)
Wallace, K.; Cameron, C.; Wilson, T. M.; Jenkins, S.; Brown, S.; Leonard, G.; Deligne, N.; Stewart, C.
2015-12-01
Volcanic ash creates extensive impacts to people and property, yet we lack a global ash impacts catalog to organize, distribute, and archive this important information. Critical impact information is often stored in ephemeral news articles or other isolated resources, which cannot be queried or located easily. A global ash impacts database would improve 1) warning messages, 2) public and lifeline emergency preparation, and 3) eruption response and recovery. Ashfall can have varying consequences, such as disabling critical lifeline infrastructure (e.g. electrical generation and transmission, water supplies, telecommunications, aircraft and airports) or merely creating limited and expensive inconvenience to local communities. Impacts to the aviation sector can be a far-reaching global issue. The international volcanic ash impacts community formed a committee to develop a database to catalog the impacts of volcanic ash. We identify three user populations for this database: 1) research teams, who would use the database to assist in systematic collection, recording, and storage of ash impact data, and to prioritize impact assessment trips and lab experiments; 2) volcanic risk assessment scientists, who rely on impact data for assessments (especially vulnerability/fragility assessments), for whom a complete dataset would have utility for global, regional, national and local scale risk assessments; and 3) citizen science volcanic hazard reporting. Publication of an international ash impacts database will encourage standardization and development of best practices for collecting and reporting impact information. Data entered will be highly categorized, searchable, and open source. Systematic cataloging of impact data will allow users to query the data and extract valuable information to aid in the development of improved emergency preparedness, response and recovery measures.
FBIS: A regional DNA barcode archival & analysis system for Indian fishes.
Nagpure, Naresh Sahebrao; Rashid, Iliyas; Pathak, Ajey Kumar; Singh, Mahender; Singh, Shri Prakash; Sarkar, Uttam Kumar
2012-01-01
DNA barcode is a new tool for taxon recognition and classification of biological organisms based on sequence of a fragment of mitochondrial gene, cytochrome c oxidase I (COI). In view of the growing importance of the fish DNA barcoding for species identification, molecular taxonomy and fish diversity conservation, we developed a Fish Barcode Information System (FBIS) for Indian fishes, which will serve as a regional DNA barcode archival and analysis system. The database presently contains 2334 sequence records of COI gene for 472 aquatic species belonging to 39 orders and 136 families, collected from available published data sources. Additionally, it contains information on phenotype, distribution and IUCN Red List status of fishes. The web version of FBIS was designed using MySQL, Perl and PHP under Linux operating platform to (a) store and manage the acquisition (b) analyze and explore DNA barcode records (c) identify species and estimate genetic divergence. FBIS has also been integrated with appropriate tools for retrieving and viewing information about the database statistics and taxonomy. It is expected that FBIS would be useful as a potent information system in fish molecular taxonomy, phylogeny and genomics. The database is available for free at http://mail.nbfgr.res.in/fbis/
NASA Astrophysics Data System (ADS)
Kouziokas, Georgios N.
2016-01-01
The adoption of Information and Communication Technologies (ICT) in environmental management has become a significant demand nowadays with the rapid growth of environmental information. This paper presents a prototype Environmental Management Information System (EMIS) that was developed to provide a systematic way of managing environmental data and human resources of an environmental organization. The system was designed using programming languages, a Database Management System (DBMS) and other technologies and programming tools and combines information from the relational database in order to achieve the principal goals of the environmental organization. The developed application can be used to store and elaborate information regarding: human resources data, environmental projects, observations, reports, data about the protected species, environmental measurements of pollutant factors or other kinds of analytical measurements and also the financial data of the organization. Furthermore, the system supports the visualization of spatial data structures by using geographic information systems (GIS) and web mapping technologies. This paper describes this prototype software application, its structure, its functions and how this system can be utilized to facilitate technology-based environmental management and decision-making process.
Introducing glycomics data into the Semantic Web
2013-01-01
Background Glycoscience is a research field focusing on complex carbohydrates (otherwise known as glycans), which can, for example, serve as “switches” that toggle between different functions of a glycoprotein or glycolipid. Due to the advancement of glycomics technologies that are used to characterize glycan structures, many glycomics databases are now publicly available and provide useful information for glycoscience research. However, these databases have almost no link to other life science databases. Results In order to implement support for the Semantic Web most efficiently for glycomics research, the developers of major glycomics databases agreed on a minimal standard for representing glycan structure and annotation information using RDF (Resource Description Framework). Moreover, all of the participants implemented this standard prototype and generated preliminary RDF versions of their data. To test the utility of the converted data, all of the data sets were uploaded into a Virtuoso triple store, and several SPARQL queries were tested as “proofs-of-concept” to illustrate the utility of the Semantic Web in querying across databases which were originally difficult to implement. Conclusions We were able to successfully retrieve information by linking UniCarbKB, GlycomeDB and JCGGDB in a single SPARQL query to obtain our target information. We also tested queries linking UniProt with GlycoEpitope as well as lectin data with GlycomeDB through PDB. As a result, we have been able to link proteomics data with glycomics data through the implementation of Semantic Web technologies, allowing for more flexible queries across these domains. PMID:24280648
Introducing glycomics data into the Semantic Web.
Aoki-Kinoshita, Kiyoko F; Bolleman, Jerven; Campbell, Matthew P; Kawano, Shin; Kim, Jin-Dong; Lütteke, Thomas; Matsubara, Masaaki; Okuda, Shujiro; Ranzinger, Rene; Sawaki, Hiromichi; Shikanai, Toshihide; Shinmachi, Daisuke; Suzuki, Yoshinori; Toukach, Philip; Yamada, Issaku; Packer, Nicolle H; Narimatsu, Hisashi
2013-11-26
Glycoscience is a research field focusing on complex carbohydrates (otherwise known as glycans), which can, for example, serve as "switches" that toggle between different functions of a glycoprotein or glycolipid. Due to the advancement of glycomics technologies that are used to characterize glycan structures, many glycomics databases are now publicly available and provide useful information for glycoscience research. However, these databases have almost no link to other life science databases. In order to implement support for the Semantic Web most efficiently for glycomics research, the developers of major glycomics databases agreed on a minimal standard for representing glycan structure and annotation information using RDF (Resource Description Framework). Moreover, all of the participants implemented this standard prototype and generated preliminary RDF versions of their data. To test the utility of the converted data, all of the data sets were uploaded into a Virtuoso triple store, and several SPARQL queries were tested as "proofs-of-concept" to illustrate the utility of the Semantic Web in querying across databases which were originally difficult to implement. We were able to successfully retrieve information by linking UniCarbKB, GlycomeDB and JCGGDB in a single SPARQL query to obtain our target information. We also tested queries linking UniProt with GlycoEpitope as well as lectin data with GlycomeDB through PDB. As a result, we have been able to link proteomics data with glycomics data through the implementation of Semantic Web technologies, allowing for more flexible queries across these domains.
2004-01-01
The Ground-Water Site-Inventory (GWSI) System is a ground-water data storage and retrieval system that is part of the National Water Information System (NWIS) developed by the U.S. Geological Survey (USGS). The NWIS is a distributed water database in which data can be processed over a network of workstations and file servers at USGS offices throughout the United States. This system comprises the GWSI, the Automated Data Processing System (ADAPS), the Water-Quality System (QWDATA), and the Site-Specific Water-Use Data System (SWUDS). The GWSI System provides for entering new sites and updating existing sites within the local database. In addition, the GWSI provides for retrieving and displaying ground-water and sitefile data stored in the local database. Finally, the GWSI provides for routine maintenance of the local and national data records. This manual contains instructions for users of the GWSI and discusses the general operating procedures for the programs found within the GWSI Main Menu.
2005-01-01
The Ground-Water Site-Inventory (GWSI) System is a ground-water data storage and retrieval system that is part of the National Water Information System (NWIS) developed by the U.S. Geological Survey (USGS). The NWIS is a distributed water database in which data can be processed over a network of workstations and file servers at USGS offices throughout the United States. This system comprises the GWSI, the Automated Data Processing System (ADAPS), the Water-Quality System (QWDATA), and the Site-Specific Water-Use Data System (SWUDS). The GWSI System provides for entering new sites and updating existing sites within the local database. In addition, the GWSI provides for retrieving and displaying groundwater and Sitefile data stored in the local database. Finally, the GWSI provides for routine maintenance of the local and national data records. This manual contains instructions for users of the GWSI and discusses the general operating procedures for the programs found within the GWSI Main Menu.
Graph Learning in Knowledge Bases
DOE Office of Scientific and Technical Information (OSTI.GOV)
Goldberg, Sean; Wang, Daisy Zhe
The amount of text data has been growing exponentially in recent years, giving rise to automatic information extraction methods that store text annotations in a database. The current state-of-the-art structured prediction methods, however, are likely to contain errors and it’s important to be able to manage the overall uncertainty of the database. On the other hand, the advent of crowdsourcing has enabled humans to aid machine algorithms at scale. As part of this project we introduced pi-CASTLE, a system that optimizes and integrates human and machine computing as applied to a complex structured prediction problem involving conditional random fields (CRFs). We proposed strategies grounded in information theory to select a token subset, formulate questions for the crowd to label, and integrate these labelings back into the database using a method of constrained inference. On both a text segmentation task over academic citations and a named entity recognition task over tweets we showed an order of magnitude improvement in accuracy gain over baseline methods.
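One common information-theoretic criterion for choosing which tokens to send to the crowd is the Shannon entropy of each token's predicted label distribution. The sketch below illustrates that idea only; the abstract does not say which criterion pi-CASTLE uses, and the function names and the per-token `marginals` input (as might be produced by a CRF) are assumptions.

```python
import math


def entropy(distribution):
    """Shannon entropy in bits of a {label: probability} mapping."""
    return -sum(p * math.log2(p) for p in distribution.values() if p > 0)


def select_tokens(marginals, k):
    """Return the positions of the k most uncertain tokens, in document order.

    marginals: one {label: probability} mapping per token (assumed input).
    """
    ranked = sorted(
        range(len(marginals)),
        key=lambda i: entropy(marginals[i]),
        reverse=True,
    )
    return sorted(ranked[:k])
```

A token whose labels are predicted with near certainty has entropy close to zero and is never selected, so crowd effort is spent where the machine prediction is least reliable.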
VitisExpDB: a database resource for grape functional genomics.
Doddapaneni, Harshavardhan; Lin, Hong; Walker, M Andrew; Yao, Jiqiang; Civerolo, Edwin L
2008-02-28
The family Vitaceae consists of many different grape species that grow in a range of climatic conditions. In the past few years, several studies have generated functional genomic information on different Vitis species and cultivars, including the European grape vine, Vitis vinifera. Our goal is to develop a comprehensive web data source for Vitaceae. VitisExpDB is an online MySQL-PHP driven relational database that houses annotated EST and gene expression data for V. vinifera and non-vinifera grape species and varieties. Currently, the database stores approximately 320,000 EST sequences derived from 8 species/hybrids, their annotation (BLAST top match) details and Gene Ontology based structured vocabulary. Putative homologs for each EST in other species and varieties along with information on their percent nucleotide identities, phylogenetic relationship and common primers can be retrieved. The database also includes information on probe sequence and annotation features of the high density 60-mer gene expression chip consisting of approximately 20,000 non-redundant set of ESTs. Finally, the database includes 14 processed global microarray expression profile sets. Data from 12 of these expression profile sets have been mapped onto metabolic pathways. A user-friendly web interface with multiple search indices and extensively hyperlinked result features that permit efficient data retrieval has been developed. Several online bioinformatics tools that interact with the database along with other sequence analysis tools have been added. In addition, users can submit their ESTs to the database. The developed database provides genomic resource to grape community for functional analysis of genes in the collection and for the grape genome annotation and gene function identification. The VitisExpDB database is available through our website http://cropdisease.ars.usda.gov/vitis_at/main-page.htm.
VitisExpDB: A database resource for grape functional genomics
Doddapaneni, Harshavardhan; Lin, Hong; Walker, M Andrew; Yao, Jiqiang; Civerolo, Edwin L
2008-01-01
Background The family Vitaceae consists of many different grape species that grow in a range of climatic conditions. In the past few years, several studies have generated functional genomic information on different Vitis species and cultivars, including the European grape vine, Vitis vinifera. Our goal is to develop a comprehensive web data source for Vitaceae. Description VitisExpDB is an online MySQL-PHP driven relational database that houses annotated EST and gene expression data for V. vinifera and non-vinifera grape species and varieties. Currently, the database stores ~320,000 EST sequences derived from 8 species/hybrids, their annotation (BLAST top match) details and Gene Ontology based structured vocabulary. Putative homologs for each EST in other species and varieties along with information on their percent nucleotide identities, phylogenetic relationship and common primers can be retrieved. The database also includes information on probe sequence and annotation features of the high density 60-mer gene expression chip consisting of ~20,000 non-redundant set of ESTs. Finally, the database includes 14 processed global microarray expression profile sets. Data from 12 of these expression profile sets have been mapped onto metabolic pathways. A user-friendly web interface with multiple search indices and extensively hyperlinked result features that permit efficient data retrieval has been developed. Several online bioinformatics tools that interact with the database along with other sequence analysis tools have been added. In addition, users can submit their ESTs to the database. Conclusion The developed database provides genomic resource to grape community for functional analysis of genes in the collection and for the grape genome annotation and gene function identification. The VitisExpDB database is available through our website . PMID:18307813
Design and deployment of hybrid-telemedicine applications
NASA Astrophysics Data System (ADS)
Ikhu-Omoregbe, N. A.; Atayero, A. A.; Ayo, C. K.; Olugbara, O. O.
2005-01-01
With advances and availability of information and communication technology infrastructures in some nations and institutions, patients are now able to receive healthcare services from doctors and healthcare centers even when they are physically separated. The availability and transfer of patient data which often include medical images for specialist opinion is invaluable both to the patient and the medical practitioner in a telemedicine session. Two existing approaches to telemedicine are real-time and stored-and-forward. The real-time requires the availability or development of video-conferencing infrastructures which are expensive especially for most developing nations of the world while stored-and-forward could allow data transmission between any hospital with computer and telephone by landline link which is less expensive but with delays. We therefore propose a hybrid design of applications using hypermedia database capable of harnessing the features of real-time and stored-and-forward deployed over a wireless Virtual Private Network for the participating centers and healthcare providers.
Techniques for integrating ‐omics data
Akula, Siva Prasad; Miriyala, Raghava Naidu; Thota, Hanuman; Rao, Allam Appa; Gedela, Srinubabu
2009-01-01
The challenge for -omics research is to tackle the problem of fragmentation of knowledge by integrating several sources of heterogeneous information into a coherent entity. It is widely recognized that successful data integration is one of the keys to improving productivity for stored data. Through proper data integration tools and algorithms, researchers may correlate relationships that enable them to make better and faster decisions. Data integration is essential for the present ‐omics community, because ‐omics data is currently spread worldwide in a wide variety of formats. These formats can be integrated and migrated across platforms through different techniques, and one of the important techniques often used is XML. XML is used to provide a document markup language that is easier to learn, retrieve, store and transmit. It is semantically richer than HTML. Here, we describe bio warehousing, database federation, and controlled vocabularies, and highlight the application of XML to store, migrate and validate -omics data. PMID:19255651
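The use of XML to store, migrate and validate a record can be illustrated with a small round trip. This is a sketch under assumptions, not a format from the paper: the `omicsRecord` element, its child elements, and the required-field list are hypothetical, and the validation is a hand-written presence check rather than DTD or XML Schema validation, which Python's standard library does not provide.

```python
import xml.etree.ElementTree as ET

# Hypothetical required child elements of an <omicsRecord>.
REQUIRED = ("accession", "organism", "sequence")


def to_xml(record):
    """Serialize a record dict to an XML string (store / migrate)."""
    root = ET.Element("omicsRecord", {"type": record["type"]})
    for key in REQUIRED:
        ET.SubElement(root, key).text = record[key]
    return ET.tostring(root, encoding="unicode")


def from_xml(text):
    """Parse an XML string back into a dict, rejecting incomplete records (validate)."""
    root = ET.fromstring(text)
    missing = [k for k in REQUIRED if not (root.findtext(k) or "").strip()]
    if missing:
        raise ValueError("missing required elements: " + ", ".join(missing))
    record = {"type": root.get("type")}
    record.update({k: root.findtext(k) for k in REQUIRED})
    return record
```

Because the serialized form is plain text with named elements, the same record can be read on another platform without knowing how the originating database laid out its tables.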
R2 Water Quality Portal Monitoring Stations
The Water Quality Data Portal (WQP) provides an easy way to access data stored in various large water quality databases. The WQP provides various input parameters on the form including location, site, sampling, and date parameters to filter and customize the returned results. The Water Quality Portal (WQP) is a cooperative service sponsored by the United States Geological Survey (USGS), the Environmental Protection Agency (EPA) and the National Water Quality Monitoring Council (NWQMC) that integrates publicly available water quality data from the USGS National Water Information System (NWIS), the EPA STOrage and RETrieval (STORET) Data Warehouse, and the USDA ARS Sustaining The Earth's Watersheds - Agricultural Research Database System (STEWARDS).
Implementation of a WAP-based telemedicine system for patient monitoring.
Hung, Kevin; Zhang, Yuan-Ting
2003-06-01
Many parties have already demonstrated telemedicine applications that use cellular phones and the Internet. A current trend in telecommunication is the convergence of wireless communication and computer network technologies, and the emergence of wireless application protocol (WAP) devices is an example. Since WAP will also be a common feature found in future mobile communication devices, it is worthwhile to investigate its use in telemedicine. This paper describes the implementation and experiences with a WAP-based telemedicine system for patient-monitoring that has been developed in our laboratory. It utilizes WAP devices as mobile access terminals for general inquiry and patient-monitoring services. Authorized users can browse the patients' general data, monitored blood pressure (BP), and electrocardiogram (ECG) on WAP devices in store-and-forward mode. The applications, written in wireless markup language (WML), WMLScript, and Perl, resided in a content server. A MySQL relational database system was set up to store the BP readings, ECG data, patient records, clinic and hospital information, and doctors' appointments with patients. A wireless ECG subsystem was built for recording ambulatory ECG in an indoor environment and for storing ECG data into the database. For testing, a WAP phone compliant with WAP 1.1 was used at GSM 1800 MHz by circuit-switched data (CSD) to connect to the content server through a WAP gateway, which was provided by a mobile phone service provider in Hong Kong. Data were successfully retrieved from the database and displayed on the WAP phone. The system shows how WAP can be feasible in remote patient-monitoring and patient data retrieval.
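The relational storage described in the abstract above can be illustrated with a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for the MySQL server used in the paper, and the table names, column names and sample rows are hypothetical, since the abstract does not give the actual schema.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical patient/BP schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE patient (
            patient_id INTEGER PRIMARY KEY,
            name       TEXT NOT NULL
        );
        CREATE TABLE bp_reading (
            reading_id INTEGER PRIMARY KEY,
            patient_id INTEGER NOT NULL REFERENCES patient(patient_id),
            taken_at   TEXT NOT NULL,      -- ISO 8601 timestamp
            systolic   INTEGER NOT NULL,   -- mmHg
            diastolic  INTEGER NOT NULL    -- mmHg
        );
        """
    )
    conn.execute("INSERT INTO patient VALUES (1, 'Demo Patient')")
    # Illustrative values only; not data from the paper.
    conn.executemany(
        "INSERT INTO bp_reading (patient_id, taken_at, systolic, diastolic) "
        "VALUES (?, ?, ?, ?)",
        [
            (1, "2003-01-01T08:00", 128, 84),
            (1, "2003-01-01T20:00", 135, 88),
            (1, "2003-01-02T08:00", 122, 80),
        ],
    )
    return conn


def latest_bp(conn, patient_id, limit=2):
    """Return the most recent readings, newest first, for a small display."""
    return conn.execute(
        "SELECT taken_at, systolic, diastolic FROM bp_reading "
        "WHERE patient_id = ? ORDER BY taken_at DESC LIMIT ?",
        (patient_id, limit),
    ).fetchall()
```

A content server in store-and-forward mode would run a query such as `latest_bp(conn, 1)` when a page is requested and render the rows for the handset; keeping the result set small suits the limited display of a WAP device.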
Oceanography Information System of Spanish Institute of Oceanography (IEO)
NASA Astrophysics Data System (ADS)
Tello, Olvido; Gómez, María; González, Sonsoles
2016-04-01
Since 1914, the Spanish Institute of Oceanography (IEO) has performed multidisciplinary studies of the marine environment. In some cases these are systematic studies, and in others they are specific studies for special requirements (the El Hierro submarine volcanic episode, the Prestige spill, and others). Different methodologies and data acquisition techniques are used depending on the aims of each study. The acquired data are stored and presented in different formats. The information is organized into different databases according to the subject and the variables represented (geology, fisheries, aquaculture, pollution, habitats, etc.). For physical and chemical oceanography data, the IEO Data Center (CEDO) was created in 1964 in order to organize the data about physical and chemical variables, to standardize this information and to serve the international data network SeaDataNet. www.seadatanet.org. This database integrates data about temperature, salinity, nutrients, and tidal data. CEDO allows the data to be consulted and downloaded. http://indamar.ieo.es For data about marine species, the SIRENO database was developed in 1999. All data about species collected in oceanographic surveys carried out by IEO researchers, and data from observers on fishing vessels, are incorporated in the SIRENO database. This database stores catch data, biomass, abundance, etc. The system is based on an Oracle architecture. Due to the large amount of information collected over the 100 years of IEO history, there is a clear need to organize, standardize, integrate and relate the different databases and information, and to provide interoperability and access to the information. Consequently, in 2000 the first initiative emerged to organize the IEO spatial information in an Oceanography Information System, based on a Geographical Information System (GIS). The GIS was consolidated as the IEO institutional GIS, and the Spatial Data Infrastructure of IEO (IDEO) was created following the INSPIRE trend.
All data included in the GIS have their corresponding metadata following ISO 19115 and INSPIRE. IDEO is based on Web services, Quality of Services, Open standards, ISO (OGC) and INSPIRE standards, and both the GIS and IDEO provide access to the geographical marine information of IEO. The GIS allows the information to be organized, visualized, consulted and analyzed. The data from different IEO databases are integrated into a GIS corporate Geodatabase (Esri format). This tool is essential in decision making on aspects such as: - Protection of the marine environment - Sustainable management of resources - Natural hazards - Marine spatial planning. Examples of the use of GIS as a spatial analysis tool are: - Mud volcanoes explored in the LIFE-INDEMARES project. - Cartographic series about the Spanish continental shelf, developed from data integrated in the IEO marine GIS and acquired from oceanographic surveys in the ESPACE project. - Cartography developed from the information gathered in the Initial Assessment of the Marine Strategy Framework Directive. - Studies of natural hazards related to submarine canyons in the southeastern Spanish marine region. Currently the IEO is participating in many European initiatives, especially in several lots of EMODNET. The IEO is also working in line with INSPIRE, Blue Growth, Horizon 2020, etc., to contribute to knowledge of the marine environment; its protection and its spatial planning are extremely relevant issues. In order to facilitate access to the Spatial Data Infrastructure of IEO, the IEO Geoportal was developed in 2012. It mainly involves a metadata catalog and access to the data viewers and Web Services of IDEO. http://www.geo-ideo.ieo.es/geoportalideo/catalog/main/home.page
Integrating Scientific Array Processing into Standard SQL
NASA Astrophysics Data System (ADS)
Misev, Dimitar; Bachhuber, Johannes; Baumann, Peter
2014-05-01
We live in a time that is dominated by data. Data storage is cheap and more applications than ever accrue vast amounts of data. Storing the emerging multidimensional data sets efficiently, however, and allowing them to be queried by their inherent structure, is a challenge many databases have to face today. Despite the fact that multidimensional array data is almost always linked to additional, non-array information, array databases have mostly developed separately from relational systems, resulting in a disparity between the two database categories. The current SQL standard and SQL DBMSs support arrays - and in an extension also multidimensional arrays - but do so in a very rudimentary and inefficient way. This poster demonstrates the practicality of an SQL extension for array processing, implemented in a proof-of-concept multi-faceted system that manages a federation of array and relational database systems, providing transparent, efficient and scalable access to the heterogeneous data in them.
NASA Technical Reports Server (NTRS)
1993-01-01
All the options in the NASA VEGetation Workbench (VEG) make use of a database of historical cover types. This database contains results from experiments by scientists on a wide variety of different cover types. The learning system uses the database to provide positive and negative training examples of classes that enable it to learn distinguishing features between classes of vegetation. All the other VEG options use the database to estimate the error bounds involved in the results obtained when various analysis techniques are applied to the sample of cover type data that is being studied. In the previous version of VEG, the historical cover type database was stored as part of the VEG knowledge base. This database was removed from the knowledge base. It is now stored as a series of flat files that are external to VEG. An interface between VEG and these files was provided. The interface allows the user to select which files of historical data to use. The files are then read, and the data are stored in Knowledge Engineering Environment (KEE) units using the same organization of units as in the previous version of VEG. The interface also allows the user to delete some or all of the historical database units from VEG and load new historical data from a file. This report summarizes the use of the historical cover type database in VEG. It then describes the new interface to the files containing the historical data. It describes minor changes that were made to VEG to enable the externally stored database to be used. Test runs to test the operation of the new interface and also to test the operation of VEG using historical data loaded from external files are described. Task F was completed. A Sun cartridge tape containing the KEE and Common Lisp code for the new interface and the modified version of the VEG knowledge base was delivered to the NASA GSFC technical representative.
Data entry module and manuals for the Land Treatment Digital Library
Welty, Justin L.; Pilliod, David S.
2013-01-01
Across the country, public land managers make decisions each year that influence landscapes and ecosystems within their jurisdictions. Many of these decisions involve vegetation manipulations, which often are referred to as land treatments. These treatments include removal or alteration of plant biomass, seeding of burned areas, application of herbicides, and other activities. Data documenting these land treatments usually are stored at local management offices in various formats. Therefore, anyone interested in the types and effects of land treatments across multiple jurisdictions must first assemble the information, which can be difficult if data discovery and organization involve multiple local offices. A centralized system for storing and accessing the data helps inform land managers when making policy and management considerations and assists scientists in developing sampling designs and studies. The Land Treatment Digital Library (LTDL) was created by the U.S. Geological Survey (USGS) as a comprehensive database incorporating tabular data, documentation, photographs, and spatial data about land treatments in a single system. It was developed over a period of several years and refined based on feedback from partner agencies and stakeholders. Currently, Bureau of Land Management (BLM) land treatment data are being entered by USGS personnel as part of a memorandum of understanding between the USGS and BLM. The LTDL has a website maintained by the USGS Forest and Rangeland Ecosystem Science Center where LTDL data can be viewed http://ltdl.wr.usgs.gov/. The resources and information provided in this data series allow other agencies, organizations, and individuals to download an empty, stand-alone LTDL database to individual or networked computers. Data entered in these databases may be submitted to the USGS for possible inclusion in the online LTDL. Multiple computer programs are used to accomplish the objective of the LTDL. 
The support of an information-technology specialist or professionals familiar with Microsoft Access™, ESRI’s ArcGIS™, Python, Adobe Acrobat Professional™, and computer settings is essential when installing and operating the LTDL. After the program is operational, a critical element for successful data entry is an understanding of the difference between database tables and forms, and how to edit data in both formats. Complete instructions accompany the program, and they should be followed carefully to ensure the setup and operation of the database goes smoothly.
Geospatial database for heritage building conservation
NASA Astrophysics Data System (ADS)
Basir, W. N. F. W. A.; Setan, H.; Majid, Z.; Chong, A.
2014-02-01
Heritage buildings are icons from the past that exist in the present. Through heritage architecture, we can learn about economic issues and social activities of the past. Nowadays, heritage buildings are under threat from natural disasters, uncertain weather, pollution and other factors. In order to preserve this heritage for future generations, recording and documenting of heritage buildings are required. With the development of information systems and data collection techniques, it is possible to create a 3D digital model. This 3D information plays an important role in recording and documenting heritage buildings. 3D modeling and virtual reality techniques have demonstrated the ability to visualize the real world in 3D. They can provide a better platform for communication and understanding of heritage buildings. Combining 3D modelling with Geographic Information System (GIS) technology will create a database that supports various analyses of spatial data in the form of a 3D model. The objectives of this research are to determine the reliability of the Terrestrial Laser Scanning (TLS) technique for data acquisition of heritage buildings and to develop a geospatial database for heritage building conservation purposes. The result from data acquisition will become a guideline for 3D model development. This 3D model will be exported to the GIS format in order to develop a database for heritage building conservation. In this database, requirements for the heritage building conservation process are included. Through this research, a proper database for storing and documenting heritage building conservation data will be developed.
The Magnetics Information Consortium (MagIC)
NASA Astrophysics Data System (ADS)
Johnson, C.; Constable, C.; Tauxe, L.; Koppers, A.; Banerjee, S.; Jackson, M.; Solheid, P.
2003-12-01
The Magnetics Information Consortium (MagIC) is a multi-user facility to establish and maintain a state-of-the-art relational database and digital archive for rock and paleomagnetic data. The goal of MagIC is to make such data generally available and to provide an information technology infrastructure for these and other research-oriented databases run by the international community. As its name implies, MagIC will not be restricted to paleomagnetic or rock magnetic data only, although MagIC will focus on these kinds of information during its setup phase. MagIC will be hosted under EarthRef.org at http://earthref.org/MAGIC/ where two "integrated" web portals will be developed, one for paleomagnetism (currently functional as a prototype that can be explored via the http://earthref.org/databases/PMAG/ link) and one for rock magnetism. The MagIC database will store all measurements and their derived properties for studies of paleomagnetic directions (inclination, declination) and their intensities, and for rock magnetic experiments (hysteresis, remanence, susceptibility, anisotropy). Ultimately, this database will allow researchers to study "on the internet" and to download important data sets that display paleo-secular variations in the intensity of the Earth's magnetic field over geological time, or that display magnetic data in typical Zijderveld, hysteresis/FORC and various magnetization/remanence diagrams. The MagIC database is completely integrated in the EarthRef.org relational database structure and thus benefits significantly from already-existing common database components, such as the EarthRef Reference Database (ERR) and Address Book (ERAB). The ERR allows researchers to find complete sets of literature resources as used in GERM (Geochemical Earth Reference Model), REM (Reference Earth Model) and MagIC. 
The ERAB contains addresses for all contributors to the EarthRef.org databases, and also for those who participated in data collection, archiving and analysis in the magnetic studies. Integration with these existing components will guarantee direct traceability to the original sources of the MagIC data and metadata. The MagIC database design focuses around the general workflow that results in the determination of typical paleomagnetic and rock magnetic analyses. This ensures that individual data points can be traced between the actual measurements and their associated specimen, sample, site, rock formation and locality. This permits a distinction between original and derived data, where the actual measurements are performed at the specimen level, and data at the sample level and higher are then derived products in the database. These relations will also allow recalculation of derived properties, such as site means, when new data becomes available for a specific locality. Data contribution to the MagIC database is critical in achieving a useful research tool. We have developed a standard data and metadata template that can be used to provide all data at the same time as publication. Software tools are provided to facilitate easy population of these templates. The tools allow for the import/export of data files in a delimited text format, and they provide some advanced functionality to validate data and to check internal coherence of the data in the template. During and after publication these standardized MagIC templates will be stored in the ERR database of EarthRef.org from where they can be downloaded at all times. Finally, the contents of these template files will be automatically parsed into the online relational database.
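The distinction the abstract above draws between original specimen-level measurements and derived site-level products can be sketched as follows. This is an illustration only, not the MagIC software: the function name and the sample directions are hypothetical, and the mean is the standard vector (Fisher-type) mean of unit direction vectors, which is one common way to compute a site mean.

```python
import math


def site_mean_direction(specimens):
    """Vector mean of (declination_deg, inclination_deg) pairs.

    Returns (mean_declination_deg, mean_inclination_deg, n).
    """
    x = y = z = 0.0
    n = 0
    for dec, inc in specimens:
        d, i = math.radians(dec), math.radians(inc)
        # Convert each direction to a unit vector (north, east, down).
        x += math.cos(i) * math.cos(d)
        y += math.cos(i) * math.sin(d)
        z += math.sin(i)
        n += 1
    r = math.sqrt(x * x + y * y + z * z)
    if n == 0 or r == 0.0:
        raise ValueError("mean direction is undefined for this input")
    mean_dec = math.degrees(math.atan2(y, x)) % 360.0
    # Clamp to guard against rounding slightly outside [-1, 1].
    mean_inc = math.degrees(math.asin(max(-1.0, min(1.0, z / r))))
    return mean_dec, mean_inc, n


# Original data are kept per specimen; the site mean is always recomputed.
site_specimens = {"site-A": [(350.0, 45.0), (10.0, 45.0)]}  # hypothetical values
site_specimens["site-A"].append((0.0, 50.0))                # new data arrives
recalculated = site_mean_direction(site_specimens["site-A"])
```

Because only the specimen-level rows are treated as primary, adding a measurement never requires editing a stored mean; the derived value is simply recalculated from the rows that trace back to the site.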
An Efficient, Lossless Database for Storing and Transmitting Medical Images
NASA Technical Reports Server (NTRS)
Fenstermacher, Marc J.
1998-01-01
This research aimed at creating new compression methods based on the central idea of Set Redundancy Compression (SRC). Set Redundancy refers to the common information that exists in a set of similar images. SRC compression methods take advantage of this common information and can achieve improved compression of similar images by reducing their Set Redundancy. The current research resulted in the development of three new lossless SRC compression methods: MARS (Median-Aided Region Sorting), MAZE (Max-Aided Zero Elimination) and MaxGBA (Max-Guided Bit Allocation).
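The general idea of reducing set redundancy can be illustrated with a small sketch. It is not MARS, MAZE or MaxGBA, whose details the abstract does not give; it assumes a simpler stand-in technique in which a per-pixel median image serves as a shared reference and each image is stored as its difference from that reference. The function names and pixel values are hypothetical.

```python
from statistics import median_low


def encode_set(images):
    """Split a set of equally sized images into a shared reference and residuals.

    Each image is a flat list of integer pixel values.
    """
    # median_low returns an actual pixel value, so everything stays an integer.
    reference = [median_low(pixels) for pixels in zip(*images)]
    residuals = [[p - r for p, r in zip(img, reference)] for img in images]
    return reference, residuals


def decode_set(reference, residuals):
    """Exactly reconstruct every image from the reference and its residuals."""
    return [[r + d for r, d in zip(reference, res)] for res in residuals]


# Three similar 2x2 "images" (illustrative values only).
images = [[10, 20, 30, 40], [11, 20, 29, 40], [10, 21, 30, 42]]
reference, residuals = encode_set(images)
restored = decode_set(reference, residuals)
```

For similar images the residuals cluster near zero, which is what a subsequent entropy coder could exploit; the round trip through `decode_set` is exact, so the scheme remains lossless.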
Fukushima Daiichi Information Repository FY13 Status
DOE Office of Scientific and Technical Information (OSTI.GOV)
Smith, Curtis; Phelan, Cherie; Schwieder, Dave
The accident at the Fukushima Daiichi nuclear power station in Japan is one of the most serious in commercial nuclear power plant operating history. Much will be learned that may be applicable to the U.S. reactor fleet, nuclear fuel cycle facilities, and supporting systems, and the international reactor fleet. For example, lessons from Fukushima Daiichi may be applied to emergency response planning, reactor operator training, accident scenario modeling, human factors engineering, radiation protection, and accident mitigation; as well as influence U.S. policies towards the nuclear fuel cycle including power generation, and spent fuel storage, reprocessing, and disposal. This document describes the database used to establish a centralized information repository to store and manage the Fukushima data that has been gathered. The data is stored in a secured (password protected and encrypted) repository that is searchable and available to researchers at diverse locations.
XCEDE: An Extensible Schema For Biomedical Data
Gadde, Syam; Aucoin, Nicole; Grethe, Jeffrey S.; Keator, David B.; Marcus, Daniel S.; Pieper, Steve
2013-01-01
The XCEDE (XML-based Clinical and Experimental Data Exchange) XML schema, developed by members of the BIRN (Biomedical Informatics Research Network), provides an extensive metadata hierarchy for storing, describing and documenting the data generated by scientific studies. Currently at version 2.0, the XCEDE schema serves as a specification for the exchange of scientific data between databases, analysis tools, and web services. It provides a structured metadata hierarchy, storing information relevant to various aspects of an experiment (project, subject, protocol, etc.). Each hierarchy level also provides for the storage of data provenance information allowing for a traceable record of processing and/or changes to the underlying data. The schema is extensible to support the needs of various data modalities and to express types of data not originally envisioned by the developers. The latest version of the XCEDE schema and manual are available from http://www.xcede.org/ PMID:21479735
SIMS: addressing the problem of heterogeneity in databases
NASA Astrophysics Data System (ADS)
Arens, Yigal
1997-02-01
The heterogeneity of remotely accessible databases -- with respect to contents, query language, semantics, organization, etc. -- presents serious obstacles to convenient querying. The SIMS (single interface to multiple sources) system addresses this global integration problem. It does so by defining a single language for describing the domain about which information is stored in the databases and using this language as the query language. Each database to which SIMS is to provide access is modeled using this language. The model describes a database's contents, organization, and other relevant features. SIMS uses these models, together with a planning system drawing on techniques from artificial intelligence, to decompose a given user's high-level query into a series of queries against the databases and other data manipulation steps. The retrieval plan is constructed so as to minimize data movement over the network and maximize parallelism to increase execution speed. SIMS can recover from network failures during plan execution by obtaining data from alternate sources, when possible. SIMS has been demonstrated in the domains of medical informatics and logistics, using real databases.
Implementing Relational Operations in an Object-Oriented Database
1992-03-01
computer aided software engineering (CASE) and computer aided design (CAD) tools. There has been some research done in the area of combining...35 2. Prograph Database Engine .................................................................. 38 III. W HY A N R/O...in most business applications where the bulk of data being stored and manipulated is simply textual or numeric data that can be stored and manipulated
NASA Astrophysics Data System (ADS)
Sano, Tomoyuki; Suzuki, Masataka; Nishida, Hideo
A CAI system using CD-ROM and NAPLPS (North American Presentation Level Protocol Syntax) was developed at Himeji Dokkyo University. The characteristics of CAI using CD-ROM as an information processing series for liberal arts department students are described. In this system, the computer program and a vast amount of voice and graphics data are stored on a CD-ROM. It is very effective in improving students' learning ability.
Toxic substances registry system: Index of material safety data sheets
NASA Technical Reports Server (NTRS)
1991-01-01
The Material Safety Data Sheets (MSDSs) listed in this index reflect product inventories and associated MSDSs which have been submitted to the Toxic Substance Registry database maintained by the Base Operations Contractor at the Kennedy Space Center. The purpose of this index is to provide a means to access information on the hazards associated with the toxic and otherwise hazardous chemicals stored and used at the Kennedy Space Center.
Meta4: a web application for sharing and annotating metagenomic gene predictions using web services.
Richardson, Emily J; Escalettes, Franck; Fotheringham, Ian; Wallace, Robert J; Watson, Mick
2013-01-01
Whole-genome shotgun metagenomics experiments produce DNA sequence data from entire ecosystems, and provide a huge amount of novel information. Gene discovery projects require up-to-date information about sequence homology and domain structure for millions of predicted proteins to be presented in a simple, easy-to-use system. There is a lack of simple, open, flexible tools that allow the rapid sharing of metagenomics datasets with collaborators in a format they can easily interrogate. We present Meta4, a flexible and extensible web application that can be used to share and annotate metagenomic gene predictions. Proteins and predicted domains are stored in a simple relational database, with a dynamic front-end which displays the results in an internet browser. Web services are used to provide up-to-date information about the proteins from homology searches against public databases. Information about Meta4 can be found on the project website, code is available on Github, a cloud image is available, and an example implementation can be seen at.
Lee, Sunghoon; Lee, Byungwook; Jang, Insoo; Kim, Sangsoo; Bhak, Jong
2006-01-01
The Localizome server predicts the transmembrane (TM) helix number and TM topology of a user-supplied eukaryotic protein and presents the result as an intuitive graphic representation. It utilizes hmmpfam to detect the presence of Pfam domains and a prediction algorithm, Phobius, to predict the TM helices. The results are combined and checked against the TM topology rules stored in a protein domain database called LocaloDom. LocaloDom is a curated database that contains TM topologies and TM helix numbers of known protein domains. It was constructed from Pfam domains combined with Swiss-Prot annotations and Phobius predictions. The Localizome server corrects the combined results of the user sequence to conform to the rules stored in LocaloDom. Compared with other programs, this server showed the highest accuracy for TM topology prediction: for soluble proteins, the accuracy and coverage were 99 and 75%, respectively, while for TM protein domain regions, they were 96 and 68%, respectively. With a graphical representation of TM topology and TM helix positions with the domain units, the Localizome server is a highly accurate and comprehensive information source for subcellular localization for soluble proteins as well as membrane proteins. The Localizome server can be found at . PMID:16845118
NASA Astrophysics Data System (ADS)
Altini, V.; Carena, F.; Carena, W.; Chapeland, S.; Chibante Barroso, V.; Costa, F.; Divià, R.; Fuchs, U.; Makhlyueva, I.; Roukoutakis, F.; Schossmaier, K.; Soòs, C.; Vande Vyvre, P.; Von Haller, B.; ALICE Collaboration
2010-04-01
All major experiments need tools that provide a way to keep a record of the events and activities, both during commissioning and operations. In ALICE (A Large Ion Collider Experiment) at CERN, this task is performed by the Alice Electronic Logbook (eLogbook), a custom-made application developed and maintained by the Data-Acquisition group (DAQ). Started as a statistics repository, the eLogbook has evolved to become not only a fully functional electronic logbook, but also a massive information repository used to store the conditions and statistics of the several online systems. It's currently used by more than 600 users in 30 different countries and it plays an important role in the daily ALICE collaboration activities. This paper will describe the LAMP (Linux, Apache, MySQL and PHP) based architecture of the eLogbook, the database schema and the relevance of the information stored in the eLogbook to the different ALICE actors, not only for near real time procedures but also for long term data-mining and analysis. It will also present the web interface, including the different used technologies, the implemented security measures and the current main features. Finally it will present the roadmap for the future, including a migration to the web 2.0 paradigm, the handling of the database ever-increasing data volume and the deployment of data-mining tools.
The Universal Protein Resource (UniProt): an expanding universe of protein information.
Wu, Cathy H; Apweiler, Rolf; Bairoch, Amos; Natale, Darren A; Barker, Winona C; Boeckmann, Brigitte; Ferro, Serenella; Gasteiger, Elisabeth; Huang, Hongzhan; Lopez, Rodrigo; Magrane, Michele; Martin, Maria J; Mazumder, Raja; O'Donovan, Claire; Redaschi, Nicole; Suzek, Baris
2006-01-01
The Universal Protein Resource (UniProt) provides a central resource on protein sequences and functional annotation with three database components, each addressing a key need in protein bioinformatics. The UniProt Knowledgebase (UniProtKB), comprising the manually annotated UniProtKB/Swiss-Prot section and the automatically annotated UniProtKB/TrEMBL section, is the preeminent storehouse of protein annotation. The extensive cross-references, functional and feature annotations and literature-based evidence attribution enable scientists to analyse proteins and query across databases. The UniProt Reference Clusters (UniRef) speed similarity searches via sequence space compression by merging sequences that are 100% (UniRef100), 90% (UniRef90) or 50% (UniRef50) identical. Finally, the UniProt Archive (UniParc) stores all publicly available protein sequences, containing the history of sequence data with links to the source databases. UniProt databases continue to grow in size and in availability of information. Recent and upcoming changes to database contents, formats, controlled vocabularies and services are described. New download availability includes all major releases of UniProtKB, sequence collections by taxonomic division and complete proteomes. A bibliography mapping service has been added, and an ID mapping service will be available soon. UniProt databases can be accessed online at http://www.uniprot.org or downloaded at ftp://ftp.uniprot.org/pub/databases/.
A comparison of database systems for XML-type data.
Risse, Judith E; Leunissen, Jack A M
2010-01-01
In the field of bioinformatics, interchangeable data formats based on XML are widely used. XML-type data is also at the core of most web services. With the increasing amount of data stored in XML comes the need for storing and accessing the data. In this paper we analyse the suitability of different database systems for storing and querying large datasets in general and Medline in particular. All reviewed database systems perform well when tested with small to medium sized datasets; however, when the full Medline dataset is queried, a large variation in query times is observed. There is not one system that is vastly superior to the others in this comparison and, depending on the database size and the query requirements, different systems are most suitable. The best all-round solution is the Oracle 11g database system using the new binary storage option. Alias-i's Lingpipe is a more lightweight, customizable and sufficiently fast solution. It does however require more initial configuration steps. For data with a changing XML structure, Sedna and BaseX as native XML database systems or MySQL with an XML-type column are suitable.
The Chandra Source Catalog: Storage and Interfaces
NASA Astrophysics Data System (ADS)
van Stone, David; Harbo, Peter N.; Tibbetts, Michael S.; Zografou, Panagoula; Evans, Ian N.; Primini, Francis A.; Glotfelty, Kenny J.; Anderson, Craig S.; Bonaventura, Nina R.; Chen, Judy C.; Davis, John E.; Doe, Stephen M.; Evans, Janet D.; Fabbiano, Giuseppina; Galle, Elizabeth C.; Gibbs, Danny G., II; Grier, John D.; Hain, Roger; Hall, Diane M.; He, Xiang Qun (Helen); Houck, John C.; Karovska, Margarita; Kashyap, Vinay L.; Lauer, Jennifer; McCollough, Michael L.; McDowell, Jonathan C.; Miller, Joseph B.; Mitschang, Arik W.; Morgan, Douglas L.; Mossman, Amy E.; Nichols, Joy S.; Nowak, Michael A.; Plummer, David A.; Refsdal, Brian L.; Rots, Arnold H.; Siemiginowska, Aneta L.; Sundheim, Beth A.; Winkelman, Sherry L.
2009-09-01
The Chandra Source Catalog (CSC) is part of the Chandra Data Archive (CDA) at the Chandra X-ray Center. The catalog contains source properties and associated data objects such as images, spectra, and lightcurves. The source properties are stored in relational databases and the data objects are stored in files with their metadata stored in databases. The CDA supports different versions of the catalog: multiple fixed release versions and a live database version. There are several interfaces to the catalog: CSCview, a graphical interface for building and submitting queries and for retrieving data objects; a command-line interface for property and source searches using ADQL; and VO-compliant services discoverable through the VO registry. This poster describes the structure of the catalog and provides an overview of the interfaces.
Schomburg, Ida; Chang, Antje; Placzek, Sandra; Söhngen, Carola; Rother, Michael; Lang, Maren; Munaretto, Cornelia; Ulas, Susanne; Stelzer, Michael; Grote, Andreas; Scheer, Maurice; Schomburg, Dietmar
2013-01-01
The BRENDA (BRaunschweig ENzyme DAtabase) enzyme portal (http://www.brenda-enzymes.org) is the main information system of functional biochemical and molecular enzyme data and provides access to seven interconnected databases. BRENDA contains 2.7 million manually annotated data on enzyme occurrence, function, kinetics and molecular properties. Each entry is connected to a reference and the source organism. Enzyme ligands are stored with their structures and can be accessed via their names, synonyms or via a structure search. FRENDA (Full Reference ENzyme DAta) and AMENDA (Automatic Mining of ENzyme DAta) are based on text mining methods and represent a complete survey of PubMed abstracts with information on enzymes in different organisms, tissues or organelles. The supplemental database DRENDA provides more than 910 000 new EC number-disease relations in more than 510 000 references from automatic search and a classification of enzyme-disease-related information. KENDA (Kinetic ENzyme DAta), a new amendment, extracts and displays kinetic values from PubMed abstracts. The integration of the EnzymeDetector offers an automatic comparison, evaluation and prediction of enzyme function annotations for prokaryotic genomes. The biochemical reaction database BKM-react contains non-redundant enzyme-catalysed and spontaneous reactions and was developed to facilitate and accelerate the construction of biochemical models.
Geoinformatics paves the way for a zoo information system
NASA Astrophysics Data System (ADS)
Michel, Ulrich
2008-10-01
The use of modern electronic media offers new ways of (environmental) knowledge transfer. All kinds of information can be made quickly available as well as queryable and can be processed individually. The Institute for Geoinformatics and Remote Sensing (IGF), in collaboration with the Osnabrueck Zoo, is developing a zoo information system, especially for new media (e.g. mobile devices), which provides information about the animals living there, their natural habitat and endangerment status. In this way, multimedia information is offered to the zoo visitors. The 2D/3D components are implemented with modern database and Mapserver technologies. Among other technologies, the VRML (Virtual Reality Modeling Language) standard is used for the 3D visualization so that it can be viewed in every conventional web browser. Also, a mobile information system for Pocket PCs, Smartphones and Ultra Mobile PCs (UMPC) is being developed. All contents, including the coordinates, are stored in a PostgreSQL database. The data input, the processing and other administrative operations are executed by a content management system (CMS).
Data model and relational database design for the New Jersey Water-Transfer Data System (NJWaTr)
Tessler, Steven
2003-01-01
The New Jersey Water-Transfer Data System (NJWaTr) is a database design for the storage and retrieval of water-use data. NJWaTr can manage data encompassing many facets of water use, including (1) the tracking of various types of water-use activities (withdrawals, returns, transfers, distributions, consumptive-use, wastewater collection, and treatment); (2) the storage of descriptions, classifications and locations of places and organizations involved in water-use activities; (3) the storage of details about measured or estimated volumes of water associated with water-use activities; and (4) the storage of information about data sources and water resources associated with water use. In NJWaTr, each water transfer occurs unidirectionally between two site objects, and the sites and conveyances form a water network. The core entities in the NJWaTr model are site, conveyance, transfer/volume, location, and owner. Other important entities include water resource (used for withdrawals and returns), data source, permit, and alias. Multiple water-exchange estimates based on different methods or data sources can be stored for individual transfers. Storage of user-defined details is accommodated for several of the main entities. Many tables contain classification terms to facilitate the detailed description of data items and can be used for routine or custom data summarization. NJWaTr accommodates single-user and aggregate-user water-use data, can be used for large or small water-network projects, and is available as a stand-alone Microsoft® Access database. Data stored in the NJWaTr structure can be retrieved in user-defined combinations to serve visualization and analytical applications. Users can customize and extend the database, link it to other databases, or implement the design in other relational database applications.
Multimedia explorer: image database, image proxy-server and search-engine.
Frankewitsch, T.; Prokosch, U.
1999-01-01
Multimedia plays a major role in medicine. Databases containing images, movies or other types of multimedia objects are increasing in number, especially on the WWW. However, no good retrieval mechanism or search engine currently exists to efficiently track down such multimedia sources in the vast amount of information provided by the WWW. Secondly, the tools for searching databases are usually not adapted to the properties of images. HTML pages do not allow complex searches. Therefore establishing a more comfortable retrieval involves the use of a higher programming level like JAVA. With this platform independent language it is possible to create extensions to commonly used web browsers. These applets offer a graphical user interface for high level navigation. We implemented a database using JAVA objects as the primary storage container which are then stored by a JAVA controlled ORACLE8 database. Navigation depends on a structured vocabulary enhanced by a semantic network. With this approach multimedia objects can be encapsulated within a logical module for quick data retrieval. PMID:10566463
NASA Technical Reports Server (NTRS)
Ramirez, Eric; Gutheinz, Sandy; Brison, James; Ho, Anita; Allen, James; Ceritelli, Olga; Tobar, Claudia; Nguyen, Thuykien; Crenshaw, Harrel; Santos, Roxann
2008-01-01
Supplier Management System (SMS) allows for a consistent, agency-wide performance rating system for suppliers used by NASA. This version (2.0) combines separate databases into one central database that allows for the sharing of supplier data. Information extracted from the NBS/Oracle database can be used to generate ratings. Also, supplier ratings can now be generated in the areas of cost, product quality, delivery, and audit data. Supplier data can be charted based on real-time user input. Based on these individual ratings, an overall rating can be generated. Data that normally would be stored in multiple databases, each requiring its own log-in, is now readily available and easily accessible with only one log-in required. Additionally, the database can accommodate the storage and display of quality-related data that can be analyzed and used in the supplier procurement decision-making process. Moreover, the software allows for a Closed-Loop System (supplier feedback), as well as the capability to communicate with other federal agencies.
Footprint Database and web services for the Herschel space observatory
NASA Astrophysics Data System (ADS)
Verebélyi, Erika; Dobos, László; Kiss, Csaba
2015-08-01
Using all telemetry and observational meta-data, we created a searchable database of Herschel observation footprints. Data from the Herschel space observatory is freely available for everyone but no uniformly processed catalog of all observations has been published yet. As a first step, we unified the data model for all three Herschel instruments in all observation modes and compiled a database of sky coverage information. As opposed to methods using a pixellation of the sphere, in our database, sky coverage is stored in exact geometric form allowing for precise area calculations. Indexing of the footprints allows for very fast search among observations based on pointing, time, sky coverage overlap and meta-data. This enables us, for example, to find moving objects easily in Herschel fields. The database is accessible via a web site and also as a set of REST web service functions which makes it usable from program clients like Python or IDL scripts. Data is available in various formats including Virtual Observatory standards.
Comparative Analysis of Data Structures for Storing Massive TINs in a DBMS
NASA Astrophysics Data System (ADS)
Kumar, K.; Ledoux, H.; Stoter, J.
2016-06-01
Point cloud data are an important source for 3D geoinformation. Modern day 3D data acquisition and processing techniques such as airborne laser scanning and multi-beam echosounding generate billions of 3D points for an area of just a few square kilometers. With the size of the point clouds exceeding the billion mark for even a small area, there is a need for their efficient storage and management. These point clouds are sometimes associated with attributes and constraints as well. Storing billions of 3D points is currently possible, as confirmed by the initial implementations in Oracle Spatial SDO_PC and the PostgreSQL Point Cloud extension. But to be able to analyse and extract useful information from point clouds, we need more than just points, i.e. we require the surface defined by these points in space. There are different ways to represent surfaces in GIS including grids, TINs, boundary representations, etc. In this study, we investigate the database solutions for the storage and management of massive TINs. The classical (face and edge based) and compact (star based) data structures are discussed at length with reference to their structure, advantages and limitations in handling massive triangulations and are compared with the current solution of PostGIS Simple Feature. The main test dataset is the TIN generated from the third national elevation model of the Netherlands (AHN3) with a point density of over 10 points/m². PostgreSQL/PostGIS DBMS is used for storing the generated TIN. The data structures are tested with the generated TIN models to account for their geometry, topology, storage, indexing, and loading time in a database. Our study is useful in identifying the limitations of the existing data structures for storing massive TINs and what is required to optimise these structures for managing massive triangulations in a database.
Aryanto, Kadek Y E; Broekema, André; Oudkerk, Matthijs; van Ooijen, Peter M A
2012-01-01
To present an adapted Clinical Trial Processor (CTP) test set-up for receiving, anonymising and saving Digital Imaging and Communications in Medicine (DICOM) data using external input from the original database of an existing clinical study information system to guide the anonymisation process. Two methods are presented for an adapted CTP test set-up. In the first method, images are pushed from the Picture Archiving and Communication System (PACS) using the DICOM protocol through a local network. In the second method, images are transferred through the internet using the HTTPS protocol. In total 25,000 images from 50 patients were moved from the PACS, anonymised and stored within roughly 2 h using the first method. In the second method, an average of 10 images per minute were transferred and processed over a residential connection. In both methods, no duplicated images were stored when previous images were retransferred. The anonymised images are stored in appropriate directories. The CTP can transfer and process DICOM images correctly in a very easy set-up providing a fast, secure and stable environment. The adapted CTP allows easy integration into an environment in which patient data are already included in an existing information system.
Cloud-based adaptive exon prediction for DNA analysis.
Putluri, Srinivasareddy; Zia Ur Rahman, Md; Fathima, Shaik Yasmeen
2018-02-01
Cloud computing offers significant research and economic benefits to healthcare organisations. Cloud services provide a safe place for storing and managing large amounts of such sensitive data. Under the conventional flow of gene information, gene sequence laboratories send out raw and inferred information via the Internet to several sequence libraries. DNA sequencing storage costs will be minimised by use of a cloud service. In this study, the authors put forward a novel genomic informatics system using Amazon Cloud Services, where genomic sequence information is stored and accessed for processing. True identification of exon regions in a DNA sequence is a key task in bioinformatics, which helps in disease identification and drug design. The three-base periodicity property of exons forms the basis of all exon identification techniques. Adaptive signal processing techniques are found to be promising in comparison with several other methods. Several adaptive exon predictors (AEPs) are developed using variable normalised least mean square and its maximum normalised variants to reduce computational complexity. Finally, performance evaluation of various AEPs is done based on measures such as sensitivity, specificity and precision using various standard genomic datasets taken from the National Center for Biotechnology Information genomic sequence database.
searchSCF: Using MongoDB to Enable Richer Searches of Locally Hosted Science Data Repositories
NASA Astrophysics Data System (ADS)
Knosp, B.
2016-12-01
Science teams today are in the unusual position of almost having too much data available to them. Modern sensors and models are capable of outputting terabytes of data per day, which can make it difficult to find specific subsets of data. The sheer size of files can also make it time consuming to retrieve this big data from national data archive centers. Thus, many science teams choose to store what data they can on their local systems, but they are not always equipped with tools to help them intelligently organize and search their data. In its local data repository, the Aura Microwave Limb Sounder (MLS) science team at NASA's Jet Propulsion Laboratory has collected over 300TB of atmospheric science data from 71 missions/models that aid in validation, algorithm development, and research activities. When the project began, the team developed a MySQL database to aid in data queries, but this database was only designed to keep track of MLS and a few ancillary data sets, leaving much of the data uncatalogued. The team has also seen database query time rise over the life of the mission. Even though the MLS science team's data holdings are not the size of a national data center's, team members still need tools to help them discover and utilize the data that they have on-hand. Over the past year, members of the science team have been looking for solutions to (1) store information on all the data sets they have collected in a single database, (2) store more metadata about each data file, (3) develop queries that can find relationships among these disparate data types, and (4) plug any new functions developed around this database into existing analysis, visualization, and web tools, transparently to users. In this presentation, I will discuss the searchSCF package that is currently under development. This package includes a NoSQL database management system (MongoDB) and a set of Python tools that both ingest data into the database and support user queries. 
I will also highlight case studies of how this system could be used by the MLS science team, and how it could be implemented by other science teams with local data repositories.
Digital data storage systems, computers, and data verification methods
Groeneveld, Bennett J.; Austad, Wayne E.; Walsh, Stuart C.; Herring, Catherine A.
2005-12-27
Digital data storage systems, computers, and data verification methods are provided. According to a first aspect of the invention, a computer includes an interface adapted to couple with a dynamic database; and processing circuitry configured to provide a first hash from digital data stored within a portion of the dynamic database at an initial moment in time, to provide a second hash from digital data stored within the portion of the dynamic database at a subsequent moment in time, and to compare the first hash and the second hash.
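The comparison of two hashes taken at different moments can be illustrated with a minimal Python sketch. It is not the patented implementation: the choice of SHA-256, the length-prefixing of records, and the names `hash_portion` and `portion_unchanged` are all assumptions made for illustration.

```python
import hashlib


def hash_portion(records):
    """Return a SHA-256 hex digest over an ordered sequence of byte records.

    'records' stands in for the digital data held in one portion of a
    database (hypothetical representation, chosen for this sketch).
    """
    h = hashlib.sha256()
    for rec in records:
        # Length-prefix each record so that record boundaries affect the digest.
        h.update(len(rec).to_bytes(8, "big"))
        h.update(rec)
    return h.hexdigest()


def portion_unchanged(first_hash, records_now):
    """Compare a hash taken earlier with a hash of the current contents."""
    return first_hash == hash_portion(records_now)


# Hash at an initial moment, then re-check at a subsequent moment.
portion = [b"row-1:alpha", b"row-2:beta"]
first = hash_portion(portion)
unchanged = portion_unchanged(first, portion)
tampered = portion_unchanged(first, [b"row-1:alpha", b"row-2:BETA"])
```

Here `unchanged` is true and `tampered` is false, because any modification of the stored bytes changes the second hash.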
ARIANE: integration of information databases within a hospital intranet.
Joubert, M; Aymard, S; Fieschi, D; Volot, F; Staccini, P; Robert, J J; Fieschi, M
1998-05-01
Large information systems handle massive volumes of data stored in heterogeneous sources. Each server has its own model of representation of concepts with regard to its aims. One of the main problems end-users encounter when accessing different servers is to match their own viewpoint on biomedical concepts with the various representations that are made in the database servers. The aim of the project ARIANE is to provide end-users with easy-to-use and natural means to access and query heterogeneous information databases. The objectives of this research work consist in building a conceptual interface by means of Internet technology inside an enterprise Intranet and in proposing a method to realize it. This method is based on the knowledge sources provided by the Unified Medical Language System (UMLS) project of the US National Library of Medicine. Experiments concern queries to three different information servers: PubMed, a Medline server of the NLM; Thériaque, a French database on drugs implemented in the Hospital Intranet; and a Web site dedicated to Internet resources in gastroenterology and nutrition, located at the Faculty of Medicine of Nice (France). Access to each of these servers differs according to the kind of information delivered and according to the technology used to query it. Dealing with the health care professional workstation, the authors introduced quality criteria in the ARIANE project in order to attempt a homogeneous and efficient way to build a query system able to be integrated in existing information systems and to integrate existing and new information sources.
Informatics in neurocritical care: new ideas for Big Data.
Flechet, Marine; Grandas, Fabian Güiza; Meyfroidt, Geert
2016-04-01
Big data is the new hype in business and healthcare. Data storage and processing has become cheap, fast, and easy. Business analysts and scientists are trying to design methods to mine these data for hidden knowledge. Neurocritical care is a field that typically produces large amounts of patient-related data, and these data are increasingly being digitized and stored. This review will try to look beyond the hype, and focus on possible applications in neurointensive care amenable to Big Data research that can potentially improve patient care. The first challenge in Big Data research will be the development of large, multicenter, and high-quality databases. These databases could be used to further investigate recent findings from mathematical models, developed in smaller datasets. Randomized clinical trials and Big Data research are complementary. Big Data research might be used to identify subgroups of patients that could benefit most from a certain intervention, or can be an alternative in areas where randomized clinical trials are not possible. The processing and the analysis of the large amount of patient-related information stored in clinical databases is beyond normal human cognitive ability. Big Data research applications have the potential to discover new medical knowledge, and improve care in the neurointensive care unit.
NCBI GEO: archive for functional genomics data sets—10 years on
Barrett, Tanya; Troup, Dennis B.; Wilhite, Stephen E.; Ledoux, Pierre; Evangelista, Carlos; Kim, Irene F.; Tomashevsky, Maxim; Marshall, Kimberly A.; Phillippy, Katherine H.; Sherman, Patti M.; Muertter, Rolf N.; Holko, Michelle; Ayanbule, Oluwabukunmi; Yefanov, Andrey; Soboleva, Alexandra
2011-01-01
A decade ago, the Gene Expression Omnibus (GEO) database was established at the National Center for Biotechnology Information (NCBI). The original objective of GEO was to serve as a public repository for high-throughput gene expression data generated mostly by microarray technology. However, the research community quickly applied microarrays to non-gene-expression studies, including examination of genome copy number variation and genome-wide profiling of DNA-binding proteins. Because the GEO database was designed with a flexible structure, it was possible to quickly adapt the repository to store these data types. More recently, as the microarray community switches to next-generation sequencing technologies, GEO has again adapted to host these data sets. Today, GEO stores over 20 000 microarray- and sequence-based functional genomics studies, and continues to handle the majority of direct high-throughput data submissions from the research community. Multiple mechanisms are provided to help users effectively search, browse, download and visualize the data at the level of individual genes or entire studies. This paper describes recent database enhancements, including new search and data representation tools, as well as a brief review of how the community uses GEO data. GEO is freely accessible at http://www.ncbi.nlm.nih.gov/geo/. PMID:21097893
Toward an integrated knowledge environment to support modern oncology.
Blake, Patrick M; Decker, David A; Glennon, Timothy M; Liang, Yong Michael; Losko, Sascha; Navin, Nicholas; Suh, K Stephen
2011-01-01
Around the world, teams of researchers continue to develop a wide range of systems to capture, store, and analyze data including treatment, patient outcomes, tumor registries, next-generation sequencing, single-nucleotide polymorphism, copy number, gene expression, drug chemistry, drug safety, and toxicity. Scientists mine, curate, and manually annotate growing mountains of data to produce high-quality databases, while clinical information is aggregated in distant systems. Databases are currently scattered, and relationships between variables coded in disparate datasets are frequently invisible. The challenge is to evolve oncology informatics from a "systems" orientation of standalone platforms and silos into an "integrated knowledge environment" that will connect "knowable" research data with patient clinical information. The aim of this article is to review progress toward an integrated knowledge environment to support modern oncology with a focus on supporting scientific discovery and improving cancer care.
Begley, Dale A; Sundberg, John P; Krupke, Debra M; Neuhauser, Steven B; Bult, Carol J; Eppig, Janan T; Morse, Herbert C; Ward, Jerrold M
2015-12-01
Many mouse models have been created to study hematopoietic cancer types. There are over thirty hematopoietic tumor types and subtypes, both human and mouse, with various origins, characteristics and clinical prognoses. Determining the specific type of hematopoietic lesion produced in a mouse model and identifying mouse models that correspond to the human subtypes of these lesions has been a continuing challenge for the scientific community. The Mouse Tumor Biology Database (MTB; http://tumor.informatics.jax.org) is designed to facilitate use of mouse models of human cancer by providing detailed histopathologic and molecular information on lymphoma subtypes, including expertly annotated, on line, whole slide scans, and providing a repository for storing information on and querying these data for specific lymphoma models. Copyright © 2015 Elsevier Inc. All rights reserved.
Distributed On-line Monitoring System Based on Modem and Public Phone Net
NASA Astrophysics Data System (ADS)
Chen, Dandan; Zhang, Qiushi; Li, Guiru
In order to solve the monitoring problem of urban sewage disposal, a distributed on-line monitoring system is proposed. By introducing dial-up communication technology based on Modem, the serial communication program can rationally solve the information transmission problem between master station and slave station. The serial communication program is based on the MSComm control of C++ Builder 6.0. The software includes a real-time data operation part and a history data handling part, and uses Microsoft SQL Server 2000 for the database and C++ Builder 6.0 for the user interface. The monitoring center displays a user interface with alarm information for over-standard data and a real-time curve. Practical application shows that the system has successfully accomplished real-time data acquisition from the data gathering stations and stored the data in the terminal database.
Quantum search of a real unstructured database
NASA Astrophysics Data System (ADS)
Broda, Bogusław
2016-02-01
A simple circuit implementation of the oracle for Grover's quantum search of a real unstructured classical database is proposed. The oracle contains a kind of quantumly accessible classical memory, which stores the database.
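As a rough illustration of what Grover's search does with a database-backed oracle, the following Python sketch simulates the amplitudes classically. It does not reproduce the circuit proposed in the paper: the plain Python list stands in for the quantumly accessible classical memory, the function name `grover_search` is hypothetical, and the sketch assumes exactly one matching record.

```python
import math


def grover_search(database, target):
    """Classically simulate Grover's search for the index holding 'target'.

    Returns the most probable index and its measurement probability.
    Assumes exactly one entry of 'database' equals 'target'.
    """
    n = len(database)
    # Oracle lookup table: True where the stored record matches the target.
    marked = [item == target for item in database]
    # Start in the uniform superposition over all n indices.
    amps = [1.0 / math.sqrt(n)] * n
    # About (pi/4) * sqrt(n) iterations maximise the success probability.
    iterations = int(math.floor(math.pi / 4 * math.sqrt(n)))
    for _ in range(iterations):
        # Oracle step: flip the sign of the marked amplitude.
        amps = [-a if m else a for a, m in zip(amps, marked)]
        # Diffusion step: reflect every amplitude about the mean.
        mean = sum(amps) / n
        amps = [2 * mean - a for a in amps]
    probs = [a * a for a in amps]
    best = max(range(n), key=probs.__getitem__)
    return best, probs[best]


# An unstructured 16-record database; we look for the record "rec11".
records = [f"rec{i}" for i in range(16)]
index, probability = grover_search(records, "rec11")
```

With 16 records the loop runs three times, after which the amplitude of the matching index dominates. A classical scan would need up to 16 lookups, whereas the quantum algorithm needs a number of oracle calls on the order of the square root of the database size.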
Evaluating the Cassandra NoSQL Database Approach for Genomic Data Persistency.
Aniceto, Rodrigo; Xavier, Rene; Guimarães, Valeria; Hondo, Fernanda; Holanda, Maristela; Walter, Maria Emilia; Lifschitz, Sérgio
2015-01-01
Rapid advances in high-throughput sequencing techniques have created interesting computational challenges in bioinformatics. One of them refers to management of massive amounts of data generated by automatic sequencers. We need to deal with the persistency of genomic data, particularly storing and analyzing these large-scale processed data. To find an alternative to the frequently considered relational database model becomes a compelling task. Other data models may be more effective when dealing with a very large amount of nonconventional data, especially for writing and retrieving operations. In this paper, we discuss the Cassandra NoSQL database approach for storing genomic data. We perform an analysis of persistency and I/O operations with real data, using the Cassandra database system. We also compare the results obtained with a classical relational database system and another NoSQL database approach, MongoDB.
Use of Graph Database for the Integration of Heterogeneous Biological Data.
Yoon, Byoung-Ha; Kim, Seon-Kyu; Kim, Seon-Young
2017-03-01
Understanding complex relationships among heterogeneous biological data is one of the fundamental goals in biology. In most cases, diverse biological data are stored in relational databases, such as MySQL and Oracle, which store data in multiple tables and then infer relationships by multiple-join statements. Recently, a new type of database, called the graph-based database, was developed to natively represent various kinds of complex relationships, and it is widely used among computer science communities and IT industries. Here, we demonstrate the feasibility of using a graph-based database for complex biological relationships by comparing the performance between MySQL and Neo4j, one of the most widely used graph databases. We collected various biological data (protein-protein interaction, drug-target, gene-disease, etc.) from several existing sources, removed duplicate and redundant data, and finally constructed a graph database containing 114,550 nodes and 82,674,321 relationships. When we tested the query execution performance of MySQL versus Neo4j, we found that Neo4j outperformed MySQL in all cases. While Neo4j exhibited a very fast response for various queries, MySQL exhibited latent or unfinished responses for complex queries with multiple-join statements. These results show that using graph-based databases, such as Neo4j, is an efficient way to store complex biological relationships. Moreover, querying a graph database in diverse ways has the potential to reveal novel relationships among heterogeneous biological data.
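The contrast between multiple-join statements and graph traversal can be sketched in Python. This is an illustration of the two modelling styles only, under loud assumptions: it uses the standard-library `sqlite3` module and an in-memory adjacency map rather than MySQL or Neo4j, the node names are invented, and it says nothing about the performance figures reported in the abstract.

```python
import sqlite3
from collections import defaultdict

# Hypothetical toy data: (source, relationship, destination) triples.
edges = [
    ("drugA", "targets", "gene1"),
    ("gene1", "interacts_with", "gene2"),
    ("gene2", "associated_with", "diseaseX"),
    ("drugB", "targets", "gene3"),
]

# Relational style: one edge table, and one self-join per hop.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edge (src TEXT, rel TEXT, dst TEXT)")
conn.executemany("INSERT INTO edge VALUES (?, ?, ?)", edges)
rows = conn.execute(
    """
    SELECT e3.dst
    FROM edge e1
    JOIN edge e2 ON e2.src = e1.dst
    JOIN edge e3 ON e3.src = e2.dst
    WHERE e1.src = ?
    """,
    ("drugA",),
).fetchall()
via_joins = sorted(r[0] for r in rows)

# Graph style: relationships are stored with the nodes and followed directly.
adjacency = defaultdict(list)
for src, _rel, dst in edges:
    adjacency[src].append(dst)


def nodes_at_depth(start, depth):
    """Return the nodes reachable from 'start' in exactly 'depth' hops."""
    frontier = {start}
    for _ in range(depth):
        frontier = {nxt for node in frontier for nxt in adjacency.get(node, [])}
    return sorted(frontier)


via_traversal = nodes_at_depth("drugA", 3)
```

Both approaches find `diseaseX` three hops from `drugA`. The SQL query needs one more join for every additional hop, while the traversal only changes its `depth` argument.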
Use of Graph Database for the Integration of Heterogeneous Biological Data
Yoon, Byoung-Ha; Kim, Seon-Kyu
2017-01-01
Understanding complex relationships among heterogeneous biological data is one of the fundamental goals in biology. In most cases, diverse biological data are stored in relational databases, such as MySQL and Oracle, which store data in multiple tables and then infer relationships by multiple-join statements. Recently, a new type of database, called the graph-based database, was developed to natively represent various kinds of complex relationships, and it is widely used among computer science communities and IT industries. Here, we demonstrate the feasibility of using a graph-based database for complex biological relationships by comparing the performance between MySQL and Neo4j, one of the most widely used graph databases. We collected various biological data (protein-protein interaction, drug-target, gene-disease, etc.) from several existing sources, removed duplicate and redundant data, and finally constructed a graph database containing 114,550 nodes and 82,674,321 relationships. When we tested the query execution performance of MySQL versus Neo4j, we found that Neo4j outperformed MySQL in all cases. While Neo4j exhibited a very fast response for various queries, MySQL exhibited latent or unfinished responses for complex queries with multiple-join statements. These results show that using graph-based databases, such as Neo4j, is an efficient way to store complex biological relationships. Moreover, querying a graph database in diverse ways has the potential to reveal novel relationships among heterogeneous biological data. PMID:28416946
NoSQL technologies for the CMS Conditions Database
NASA Astrophysics Data System (ADS)
Sipos, Roland
2015-12-01
With the restart of the LHC in 2015, the CMS Conditions dataset will continue to grow, and the need for consistent and highly available access to the Conditions is a strong reason to revisit different aspects of the current data storage solutions. We present a study of alternative data storage backends for the Conditions Databases, evaluating some of the most popular NoSQL databases for their support of a key-value representation of the CMS Conditions. The definition of the database infrastructure is based on the need to store the conditions as BLOBs. Because of this, each condition can reach a size that may require special treatment (splitting) in these NoSQL databases. As big binary objects may be problematic in several database systems, and also to give an accurate baseline, a testing framework extension was implemented to measure the characteristics of the handling of arbitrary binary data in these databases. Based on the evaluation, prototypes of a document store, using a column-oriented and plain key-value store, are deployed. An adaptation layer to access the backends in the CMS Offline software was developed to provide transparent support for these NoSQL databases in the CMS context. Additional data modelling approaches and considerations in the software layer, deployment and automation of the databases are also covered in the research. In this paper we present the results of the evaluation as well as a performance comparison of the prototypes studied.
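The splitting of large BLOBs mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions: the key layout (`<key>/meta`, `<key>/chunk/<n>`), the function names, and the use of a plain dict as the key-value store are all hypothetical and are not taken from the CMS software.

```python
def split_blob(key, payload, chunk_size):
    """Split a binary payload into fixed-size chunks keyed under `key`.

    Hypothetical layout: '<key>/meta' holds the chunk count and
    '<key>/chunk/<index>' holds each piece.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    chunks = [
        payload[i:i + chunk_size] for i in range(0, len(payload), chunk_size)
    ] or [b""]  # an empty payload is still stored as one empty chunk
    entries = {f"{key}/meta": str(len(chunks)).encode()}
    for index, chunk in enumerate(chunks):
        entries[f"{key}/chunk/{index:06d}"] = chunk
    return entries


def join_blob(store, key):
    """Reassemble a payload previously written with split_blob."""
    count = int(store[f"{key}/meta"].decode())
    return b"".join(store[f"{key}/chunk/{i:06d}"] for i in range(count))
```

Usage: `store.update(split_blob("cond", data, 4))` writes the pieces into any mapping-like store, and `join_blob(store, "cond")` reads them back in order.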
Automated extraction of radiation dose information for CT examinations.
Cook, Tessa S; Zimmerman, Stefan; Maidment, Andrew D A; Kim, Woojin; Boonn, William W
2010-11-01
Exposure to radiation as a result of medical imaging is currently in the spotlight, receiving attention from Congress as well as the lay press. Although scanner manufacturers are moving toward including effective dose information in the Digital Imaging and Communications in Medicine headers of imaging studies, there is a vast repository of retrospective CT data at every imaging center that stores dose information in an image-based dose sheet. As such, it is difficult for imaging centers to participate in the ACR's Dose Index Registry. The authors have designed an automated extraction system to query their PACS archive and parse CT examinations to extract the dose information stored in each dose sheet. First, an open-source optical character recognition program processes each dose sheet and converts the information to American Standard Code for Information Interchange (ASCII) text. Each text file is parsed, and radiation dose information is extracted and stored in a database which can be queried using an existing pathology and radiology enterprise search tool. Using this automated extraction pipeline, it is possible to perform dose analysis on the >800,000 CT examinations in the PACS archive and generate dose reports for all of these patients. It is also possible to more effectively educate technologists, radiologists, and referring physicians about exposure to radiation from CT by generating report cards for interpreted and performed studies. The automated extraction pipeline enables compliance with the ACR's reporting guidelines and greater awareness of radiation dose to patients, thus resulting in improved patient care and management. Copyright © 2010 American College of Radiology. Published by Elsevier Inc. All rights reserved.
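The parsing step that follows OCR in the pipeline described above might look roughly like the sketch below. The row layout (series number, scan type, CTDIvol, DLP) and all field names are assumptions made for illustration; the abstract does not specify the dose-sheet format or the fields the authors extracted.

```python
import re

# Hypothetical row layout: "<series> <scan type> <CTDIvol> <DLP>".
ROW = re.compile(
    r"^(?P<series>\d+)\s+(?P<scan_type>[A-Za-z]+)\s+"
    r"(?P<ctdivol>\d+(?:\.\d+)?)\s+(?P<dlp>\d+(?:\.\d+)?)\s*$"
)


def parse_dose_text(text):
    """Return one record per line that matches the assumed row layout."""
    records = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:  # header and summary lines simply fail to match
            records.append({
                "series": int(match["series"]),
                "scan_type": match["scan_type"],
                "ctdivol_mgy": float(match["ctdivol"]),
                "dlp_mgy_cm": float(match["dlp"]),
            })
    return records


SAMPLE = """Series Type CTDIvol DLP
1 Scout 0.10 2.50
2 Helical 12.30 450.00
Total DLP 452.50
"""
```

Calling `parse_dose_text(SAMPLE)` yields two records; real OCR output would likely need more tolerant patterns for misread characters.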
Short-term memory and long-term memory are still different.
Norris, Dennis
2017-09-01
A commonly expressed view is that short-term memory (STM) is nothing more than activated long-term memory. If true, this would overturn a central tenet of cognitive psychology: the idea that there are functionally and neurobiologically distinct short- and long-term stores. Here I present an updated case for a separation between short- and long-term stores, focusing on the computational demands placed on any STM system. STM must support memory for previously unencountered information, the storage of multiple tokens of the same type, and variable binding. None of these can be achieved simply by activating long-term memory. For example, even a simple sequence of digits such as "1, 3, 1" where there are 2 tokens of the digit "1" cannot be stored in the correct order simply by activating the representations of the digits "1" and "3" in LTM. I also review recent neuroimaging data that have been presented as evidence that STM is activated LTM and show that these data are exactly what one would expect to see based on a conventional 2-store view. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Object-oriented analysis and design of an ECG storage and retrieval system integrated with an HIS.
Wang, C; Ohe, K; Sakurai, T; Nagase, T; Kaihara, S
1996-03-01
For a hospital information system, object-oriented methodology plays an increasingly important role, especially for the management of digitized data, e.g., the electrocardiogram, electroencephalogram, electromyogram, spirogram, X-ray, CT and histopathological images, which are not yet computerized in most hospitals. As a first step in an object-oriented approach to hospital information management and storing medical data in an object-oriented database, we connected electrocardiographs to a hospital network and established the integration of ECG storage and retrieval systems with a hospital information system. In this paper, the object-oriented analysis and design of the ECG storage and retrieval systems is reported.
Thai, Quan Ke; Chung, Dung Anh; Tran, Hoang-Dung
2017-06-26
Canine and wolf mitochondrial DNA haplotypes, which can be used for forensic or phylogenetic analyses, have been defined in various schemes depending on the region analyzed. In recent studies, the 582 bp fragment of the HV1 region is most commonly used. A total of 317 different canine HV1 haplotypes have been reported in the rapidly growing public database GenBank. These reported haplotypes contain several inconsistencies in their haplotype information. To overcome this issue, we have developed a Canis mtDNA HV1 database. This database collects data on the HV1 582 bp region in dog mitochondrial DNA from GenBank to screen and correct the inconsistencies. It also supports users in detecting novel mutation profiles and assigning new haplotypes. The Canis mtDNA HV1 database (CHD) contains 5567 nucleotide entries originating from 15 subspecies in the species Canis lupus. Of these entries, 3646 were haplotypes and grouped into 804 distinct sequences. Of these sequences, 319 were recognized as previously assigned haplotypes, while the remaining 485 had new mutation profiles and were marked as new haplotype candidates awaiting further analysis for haplotype assignment. Of the 3646 nucleotide entries, only 414 were annotated with correct haplotype information, while 3232 had insufficient or lacked haplotype information and were corrected or modified before storing in the CHD. The CHD can be accessed at http://chd.vnbiology.com. It provides sequences, haplotype information, and a web-based tool for mtDNA HV1 haplotyping. The CHD is updated monthly and supplies all data for download. The Canis mtDNA HV1 database contains information about canine mitochondrial DNA HV1 sequences with reconciled annotation. It serves as a tool for detecting inconsistencies in GenBank and helps identify new HV1 haplotypes. Thus, it supports the scientific community in naming new HV1 haplotypes and reconciling existing annotation of HV1 582 bp sequences.
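The grouping described above (entries collapsed into distinct sequences, then split into previously assigned haplotypes and new candidates) can be sketched in a few lines. The function name, input shapes, and example sequences are hypothetical; the CHD's actual matching procedure is not described in the abstract and may differ, for example by aligning sequences rather than comparing them exactly.

```python
from collections import defaultdict


def group_entries(entries, known_haplotypes):
    """Group (accession, sequence) pairs by identical sequence.

    known_haplotypes maps a sequence to its assigned haplotype name.
    Returns (assigned, candidates): accessions per known haplotype name,
    and accessions per unmatched sequence (new haplotype candidates).
    """
    by_sequence = defaultdict(list)
    for accession, sequence in entries:
        by_sequence[sequence.upper()].append(accession)

    assigned, candidates = {}, {}
    for sequence, accessions in by_sequence.items():
        name = known_haplotypes.get(sequence)
        if name is not None:
            assigned[name] = sorted(accessions)
        else:
            candidates[sequence] = sorted(accessions)
    return assigned, candidates
```

The counts reported in the abstract are consistent with this kind of partition: 319 + 485 = 804 distinct sequences, and 414 + 3232 = 3646 entries.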
A mobile trauma database with charge capture.
Moulton, Steve; Myung, Dan; Chary, Aron; Chen, Joshua; Agarwal, Suresh; Emhoff, Tim; Burke, Peter; Hirsch, Erwin
2005-11-01
Charge capture plays an important role in every surgical practice. We have developed and merged a custom mobile database (DB) system with our trauma registry (TRACS), to better understand our billing methods, revenue generators, and areas for improved revenue capture. The mobile database runs on handheld devices using the Windows Compact Edition platform. The front end was written in C# and the back end is SQL. The mobile database operates as a thick client; it includes active and inactive patient lists, billing screens, hot pick lists, and Current Procedural Terminology and International Classification of Diseases, Ninth Revision code sets. Microsoft Internet Information Server provides secure data transaction services between the back ends stored on each device. Traditional, handwritten billing information for three of five adult trauma surgeons was averaged over a 5-month period. Electronic billing information was then collected over a 3-month period using handheld devices and the subject software application. One surgeon used the software for all 3 months, and two surgeons used it for the latter 2 months of the electronic data collection period. This electronic billing information was combined with TRACS data to determine the clinical characteristics of the trauma patients who were and were not captured using the mobile database. Total charges increased by 135%, 148%, and 228% for each of the three trauma surgeons who used the mobile DB application. The majority of additional charges were for evaluation and management services. Patients who were captured and billed at the point of care using the mobile DB had higher Injury Severity Scores, were more likely to undergo an operative procedure, and had longer lengths of stay compared with those who were not captured. Total charges more than doubled using a mobile database to bill at the point of care. A subsequent comparison of TRACS data with billing information revealed a large amount of uncaptured patient revenue. Greater familiarity with and broader use of mobile database technology hold the potential for even greater revenue capture.
NASA Astrophysics Data System (ADS)
Cristóbal-Hornillos, D.; Varela, J.; Ederoclite, A.; Vázquez Ramió, H.; López-Sainz, A.; Hernández-Fuertes, J.; Civera, T.; Muniesa, D.; Moles, M.; Cenarro, A. J.; Marín-Franch, A.; Yanes-Díaz, A.
2015-05-01
The Observatorio Astrofísico de Javalambre consists of two main telescopes: JST/T250, a 2.5 m telescope with a FoV of 3 deg, and JAST/T80, an 83 cm telescope with a 2 deg FoV. JST/T250 will be devoted to completing the Javalambre-PAU Astronomical Survey (J-PAS), a photometric survey with a system of 54 narrow-band plus 3 broad-band filters covering an area of 8500 deg^2. The JAST/T80 will perform the J-PLUS survey, covering the same area in a system of 12 filters. This contribution presents the software and hardware architecture designed to store and process the data. The processing pipeline runs daily and is devoted to correcting the instrumental signature on the science images, performing astrometric and photometric calibration, and computing individual image catalogs. In a second stage, the pipeline performs the combination of the tile mosaics and the computation of final catalogs. The catalogs are ingested into a scientific database to be provided to the community. The processing software is connected with a management database to store persistent information about the pipeline operations done on each frame. The processing pipeline is executed in a computing cluster under a batch queuing system. The storage system will combine disk and tape technologies. The disk storage system will have the capacity to store the data that is accessed by the pipeline. The tape library will store and archive the raw data and earlier data releases with lower access frequency.
Mobile Apps for Suicide Prevention: Review of Virtual Stores and Literature
Castillo, Gema; Arambarri, Jon; López-Coronado, Miguel; Franco, Manuel A
2017-01-01
Background The best manner to prevent suicide is to recognize suicidal signs and signals, and know how to respond to them. Objective We aim to study the existing mobile apps for suicide prevention in the literature and the most commonly used virtual stores. Methods Two reviews were carried out. The first was done by searching the most commonly used commercial app stores, which are iTunes and Google Play. The second was a review of mobile health (mHealth) apps in published articles within the last 10 years in the following 7 scientific databases: Science Direct, Medline, PsycINFO, Embase, The Cochrane Library, IEEE Xplore, and Google Scholar. Results A total of 124 apps related to suicide were found in the cited virtual stores but only 20 apps were specifically designed for suicide prevention. All apps were free and most were designed for Android. Furthermore, 6 relevant papers were found in the indicated scientific databases; in these studies, some real experiences with physicians, caregivers, and families were described. The importance of these people in suicide prevention was indicated. Conclusions The number of apps regarding suicide prevention is small, and there was little information available from literature searches, indicating that technology-based suicide prevention remains understudied. Many of the apps provided no interactive features. It is important to verify the accuracy of the results of different apps that are available on iOS and Android. The confidence generated by these apps can benefit end users, either by improving their health monitoring or simply to verify their body condition. PMID:29017992
Toward a mtDNA locus-specific mutation database using the LOVD platform.
Elson, Joanna L; Sweeney, Mary G; Procaccio, Vincent; Yarham, John W; Salas, Antonio; Kong, Qing-Peng; van der Westhuizen, Francois H; Pitceathly, Robert D S; Thorburn, David R; Lott, Marie T; Wallace, Douglas C; Taylor, Robert W; McFarland, Robert
2012-09-01
The Human Variome Project (HVP) is a global effort to collect and curate all human genetic variation affecting health. Mutations of mitochondrial DNA (mtDNA) are an important cause of neurogenetic disease in humans; however, identification of the pathogenic mutations responsible can be problematic. In this article, we provide explanations as to why and suggest how such difficulties might be overcome. We put forward a case in support of a new Locus Specific Mutation Database (LSDB) implemented using the Leiden Open-source Variation Database (LOVD) system that will not only list primary mutations, but also present the evidence supporting their role in disease. Critically, we feel that this new database should have the capacity to store information on the observed phenotypes alongside the genetic variation, thereby facilitating our understanding of the complex and variable presentation of mtDNA disease. LOVD supports fast queries of both seen and hidden data and allows storage of sequence variants from high-throughput sequence analysis. The LOVD platform will allow construction of a secure mtDNA database; one that can fully utilize currently available data, as well as that being generated by high-throughput sequencing, to link genotype with phenotype enhancing our understanding of mitochondrial disease, with a view to providing better prognostic information. © 2012 Wiley Periodicals, Inc.
Toward a mtDNA Locus-Specific Mutation Database Using the LOVD Platform
Elson, Joanna L.; Sweeney, Mary G.; Procaccio, Vincent; Yarham, John W.; Salas, Antonio; Kong, Qing-Peng; van der Westhuizen, Francois H.; Pitceathly, Robert D.S.; Thorburn, David R.; Lott, Marie T.; Wallace, Douglas C.; Taylor, Robert W.; McFarland, Robert
2015-01-01
The Human Variome Project (HVP) is a global effort to collect and curate all human genetic variation affecting health. Mutations of mitochondrial DNA (mtDNA) are an important cause of neurogenetic disease in humans; however, identification of the pathogenic mutations responsible can be problematic. In this article, we provide explanations as to why and suggest how such difficulties might be overcome. We put forward a case in support of a new Locus Specific Mutation Database (LSDB) implemented using the Leiden Open-source Variation Database (LOVD) system that will not only list primary mutations, but also present the evidence supporting their role in disease. Critically, we feel that this new database should have the capacity to store information on the observed phenotypes alongside the genetic variation, thereby facilitating our understanding of the complex and variable presentation of mtDNA disease. LOVD supports fast queries of both seen and hidden data and allows storage of sequence variants from high-throughput sequence analysis. The LOVD platform will allow construction of a secure mtDNA database; one that can fully utilize currently available data, as well as that being generated by high-throughput sequencing, to link genotype with phenotype enhancing our understanding of mitochondrial disease, with a view to providing better prognostic information. PMID:22581690
Reasoning and memory: People make varied use of the information available in working memory.
Hardman, Kyle O; Cowan, Nelson
2016-05-01
Working memory (WM) is used for storing information in a highly accessible state so that other mental processes, such as reasoning, can use that information. Some WM tasks require that participants not only store information, but also reason about that information to perform optimally on the task. In this study, we used visual WM tasks that had both storage and reasoning components to determine both how ideally people are able to reason about information in WM and if there is a relationship between information storage and reasoning. We developed novel psychological process models of the tasks that allowed us to estimate for each participant both how much information they had in WM and how efficiently they reasoned about that information. Our estimates of information use showed that participants are not all ideal information users or minimal information users, but rather that there are individual differences in the thoroughness of information use in our WM tasks. However, we found that our participants tended to be more ideal than minimal. One implication of this work is that to accurately estimate the amount of information in WM, it is important to also estimate how efficiently that information is used. This new analysis contributes to the theoretical premise that human rationality may be bounded by the complexity of task demands. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Security and privacy qualities of medical devices: an analysis of FDA postmarket surveillance.
Kramer, Daniel B; Baker, Matthew; Ransford, Benjamin; Molina-Markham, Andres; Stewart, Quinn; Fu, Kevin; Reynolds, Matthew R
2012-01-01
Medical devices increasingly depend on computing functions such as wireless communication and Internet connectivity for software-based control of therapies and network-based transmission of patients' stored medical information. These computing capabilities introduce security and privacy risks, yet little is known about the prevalence of such risks within the clinical setting. We used three comprehensive, publicly available databases maintained by the Food and Drug Administration (FDA) to evaluate recalls and adverse events related to security and privacy risks of medical devices. Review of weekly enforcement reports identified 1,845 recalls; 605 (32.8%) of these included computers, 35 (1.9%) stored patient data, and 31 (1.7%) were capable of wireless communication. Searches of databases specific to recalls and adverse events identified only one event with a specific connection to security or privacy. Software-related recalls were relatively common, and most (81.8%) mentioned the possibility of upgrades, though only half of these provided specific instructions for the update mechanism. Our review of recalls and adverse events from federal government databases reveals sharp inconsistencies with databases at individual providers with respect to security and privacy risks. Recalls related to software may increase security risks because of unprotected update and correction mechanisms. To detect signals of security and privacy problems that adversely affect public health, federal postmarket surveillance strategies should rethink how to effectively and efficiently collect data on security and privacy problems in devices that increasingly depend on computing systems susceptible to malware.
Security and Privacy Qualities of Medical Devices: An Analysis of FDA Postmarket Surveillance
Kramer, Daniel B.; Baker, Matthew; Ransford, Benjamin; Molina-Markham, Andres; Stewart, Quinn; Fu, Kevin; Reynolds, Matthew R.
2012-01-01
Background Medical devices increasingly depend on computing functions such as wireless communication and Internet connectivity for software-based control of therapies and network-based transmission of patients’ stored medical information. These computing capabilities introduce security and privacy risks, yet little is known about the prevalence of such risks within the clinical setting. Methods We used three comprehensive, publicly available databases maintained by the Food and Drug Administration (FDA) to evaluate recalls and adverse events related to security and privacy risks of medical devices. Results Review of weekly enforcement reports identified 1,845 recalls; 605 (32.8%) of these included computers, 35 (1.9%) stored patient data, and 31 (1.7%) were capable of wireless communication. Searches of databases specific to recalls and adverse events identified only one event with a specific connection to security or privacy. Software-related recalls were relatively common, and most (81.8%) mentioned the possibility of upgrades, though only half of these provided specific instructions for the update mechanism. Conclusions Our review of recalls and adverse events from federal government databases reveals sharp inconsistencies with databases at individual providers with respect to security and privacy risks. Recalls related to software may increase security risks because of unprotected update and correction mechanisms. To detect signals of security and privacy problems that adversely affect public health, federal postmarket surveillance strategies should rethink how to effectively and efficiently collect data on security and privacy problems in devices that increasingly depend on computing systems susceptible to malware. PMID:22829874
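As a quick arithmetic check, the percentages reported in the Results can be recomputed from the stated counts. The label strings and function name below are illustrative only.

```python
# Counts quoted in the abstract above.
TOTAL_RECALLS = 1845
SUBSETS = {
    "included computers": 605,
    "stored patient data": 35,
    "wireless capable": 31,
}


def percent_of_total(count, total=TOTAL_RECALLS):
    """Share of all recalls, rounded to one decimal place."""
    return round(100 * count / total, 1)


shares = {label: percent_of_total(n) for label, n in SUBSETS.items()}
```

The recomputed shares agree with the reported 32.8%, 1.9%, and 1.7%.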
Database tomography for commercial application
NASA Technical Reports Server (NTRS)
Kostoff, Ronald N.; Eberhart, Henry J.
1994-01-01
Database tomography is a method for extracting themes and their relationships from text. The algorithms employed begin with word frequency and word proximity analysis and build upon these results. When the word 'database' is used, think of medical or police records, patents, journals, or papers, etc. (any text information that can be computer stored). Database tomography features a full-text, user-interactive technique enabling the user to identify areas of interest, establish relationships, and map trends for a deeper understanding of an area of interest. Database tomography concepts and applications have been reported in journals and presented at conferences. One important feature of the database tomography algorithm is that it can be used on a database of any size, and will facilitate the user's ability to understand the volume of content therein. While employing the process to identify research opportunities, it became obvious that this promising technology has potential applications for business, science, engineering, law, and academe. Examples include evaluating marketing trends, strategies, relationships and associations. Also, the database tomography process would be a powerful component in the areas of competitive intelligence, national security intelligence and patent analysis. User interests and involvement cannot be overemphasized.
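The two starting points named above, word frequency and word proximity analysis, can be sketched as follows. This is a generic illustration, not the database tomography algorithm itself: the tokenizer, the fixed-size window, and the function names are all assumptions.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic runs only (a simplification)."""
    return re.findall(r"[a-z]+", text.lower())


def word_frequencies(text):
    """Count how often each word occurs."""
    return Counter(tokenize(text))


def proximity_counts(text, theme, window):
    """Count words that occur within `window` tokens of each `theme` hit."""
    tokens = tokenize(text)
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == theme:
            before = tokens[max(0, i - window):i]
            after = tokens[i + 1:i + 1 + window]
            counts.update(before + after)
    return counts
```

Frequent words suggest candidate themes, and the words that repeatedly appear near a chosen theme hint at its relationships; a practical version would also filter stop words.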
A Taxonomic Search Engine: Federating taxonomic databases using web services
Page, Roderic DM
2005-01-01
Background The taxonomic name of an organism is a key link between different databases that store information on that organism. However, in the absence of a single, comprehensive database of organism names, individual databases lack an easy means of checking the correctness of a name. Furthermore, the same organism may have more than one name, and the same name may apply to more than one organism. Results The Taxonomic Search Engine (TSE) is a web application written in PHP that queries multiple taxonomic databases (ITIS, Index Fungorum, IPNI, NCBI, and uBIO) and summarises the results in a consistent format. It supports "drill-down" queries to retrieve a specific record. The TSE can optionally suggest alternative spellings the user can try. It also acts as a Life Science Identifier (LSID) authority for the source taxonomic databases, providing globally unique identifiers (and associated metadata) for each name. Conclusion The Taxonomic Search Engine is available at and provides a simple demonstration of the potential of the federated approach to providing access to taxonomic names. PMID:15757517
AquaPathogen X--A template database for tracking field isolates of aquatic pathogens
Emmenegger, Evi; Kurath, Gael
2012-01-01
AquaPathogen X is a template database for recording information on individual isolates of aquatic pathogens and is available for download from the U.S. Geological Survey (USGS) Western Fisheries Research Center (WFRC) website (http://wfrc.usgs.gov). This template database can accommodate the nucleotide sequence data generated in molecular epidemiological studies along with the myriad of abiotic and biotic traits associated with isolates of various pathogens (for example, viruses, parasites, or bacteria) from multiple aquatic animal host species (for example, fish, shellfish, or shrimp). The simultaneous cataloging of isolates from different aquatic pathogens is a unique feature to the AquaPathogen X database, which can be used in surveillance of emerging aquatic animal diseases and clarification of main risk factors associated with pathogen incursions into new water systems. As a template database, the data fields are empty upon download and can be modified to user specifications. For example, an application of the template database that stores the epidemiological profiles of fish virus isolates, called Fish ViroTrak (fig. 1), was also developed (Emmenegger and others, 2011).
Hayakawa, Kazuo; Iwatani, Yoshinori
2013-02-01
Osaka University Center for Twin Research is currently organizing a government-funded, multidisciplinary research project using a large registry of aged twins living in Japan. The purpose of the project is to collect various information as well as biological resources from registered twins, and to establish a biobank and databases for preserving and managing these data and resources. The Center is collecting data from twin pairs, both of whom have agreed to participate in a one-day comprehensive medical examination. The following data are being collected: physical data (e.g., height, body mass, blood pressure, theoretical visceral fat, pulse wave velocity, and bone density), data regarding epidemiology (e.g., medical history, lifestyle, quality of life, mood status, cognitive function, and nutrition), electrocardiogram, ultrasonography (carotid artery and thyroid), dentistry, plastic surgery, positron emission tomography, magnetoencephalogram, and magnetic resonance imaging of brain. These data are then aggregated and systematically stored in specific databases. In addition, peripheral blood is obtained from the participants, and then genomic DNA is purified and sera are stored. A wide variety of studies are ongoing, and more are in the planning stage.
Using Social Media to Identify Sources of Healthy Food in Urban Neighborhoods.
Gomez-Lopez, Iris N; Clarke, Philippa; Hill, Alex B; Romero, Daniel M; Goodspeed, Robert; Berrocal, Veronica J; Vinod Vydiswaran, V G; Veinot, Tiffany C
2017-06-01
An established body of research has used secondary data sources (such as proprietary business databases) to demonstrate the importance of the neighborhood food environment for multiple health outcomes. However, documenting food availability using secondary sources in low-income urban neighborhoods can be particularly challenging since small businesses play a crucial role in food availability. These small businesses are typically underrepresented in national databases, which rely on secondary sources to develop data for marketing purposes. Using social media and other crowdsourced data to account for these smaller businesses holds promise, but the quality of these data remains unknown. This paper compares the quality of full-line grocery store information from Yelp, a crowdsourced content service, to a "ground truth" data set (Detroit Food Map) and a commercially-available dataset (Reference USA) for the greater Detroit area. Results suggest that Yelp is more accurate than Reference USA in identifying healthy food stores in urban areas. Researchers investigating the relationship between the nutrition environment and health may consider Yelp as a reliable and valid source for identifying sources of healthy food in urban environments.
NASA Astrophysics Data System (ADS)
McEnery, J. A.; Jitkajornwanich, K.
2012-12-01
This presentation will describe the methodology and overall system development by which a benchmark dataset of precipitation information has been used to characterize the depth-area-duration relations in heavy rain storms occurring over regions of Texas. Over the past two years project investigators along with the National Weather Service (NWS) West Gulf River Forecast Center (WGRFC) have developed and operated a gateway data system to ingest, store, and disseminate NWS multi-sensor precipitation estimates (MPE). As a pilot project of the Integrated Water Resources Science and Services (IWRSS) initiative, this testbed uses a Structured Query Language (SQL) server to maintain a full archive of current and historic MPE values within the WGRFC service area. These time series values are made available for public access as web services in the standard WaterML format. Having this volume of information maintained in a comprehensive database now allows the use of relational analysis capabilities within SQL to leverage these multi-sensor precipitation values and produce a valuable derivative product. The area of focus for this study is North Texas, and the study will utilize values that originated from the West Gulf River Forecast Center (WGRFC), one of three River Forecast Centers currently represented in the holdings of this data system. Over the past two decades, NEXRAD radar has dramatically improved the ability to record rainfall. The resulting hourly MPE values, distributed over an approximate 4 km by 4 km grid, are considered by the NWS to be the "best estimate" of rainfall. The data server provides an accepted standard interface for internet access to the largest time-series dataset of NEXRAD-based MPE values ever assembled. An automated script has been written to search and extract storms over the 18-year period of record from the contents of this massive historical precipitation database. Not only can it extract site-specific storms, but also duration-specific storms and storms separated by user-defined inter-event periods. A separate storm database has been created to store the selected output. By storing output within tables in a separate database, we can make use of powerful SQL capabilities to perform flexible pattern analysis. Previous efforts have made use of historic data from limited clusters of irregularly spaced physical gauges. Spatial extent of the observational network has been a limiting factor. The relatively dense distribution of MPE provides a virtual mesh of observations stretched over the landscape. This work combines a unique hydrologic data resource with programming and database analysis to characterize storm depth-area-duration relationships.
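Separating storms by a user-defined inter-event period, as described above, can be illustrated for a single grid cell with a minimal sketch. The function names, the wet-hour threshold, and the output fields are assumptions; the authors' script and its SQL schema are not shown in the abstract.

```python
def _summarise(series, start, end):
    """Summarise one storm spanning hours start..end (inclusive)."""
    return {
        "start": start,
        "end": end,
        "duration_h": end - start + 1,
        "depth_mm": sum(series[start:end + 1]),
    }


def split_storms(hourly_mm, min_dry_hours, threshold_mm=0.0):
    """Split an hourly series into storms separated by dry gaps.

    A new storm begins when at least `min_dry_hours` consecutive hours at or
    below `threshold_mm` separate two wet hours (the inter-event period).
    """
    storms = []
    start = last_wet = None
    for hour, depth in enumerate(hourly_mm):
        if depth > threshold_mm:
            if start is None:
                start = hour
            elif hour - last_wet - 1 >= min_dry_hours:
                storms.append(_summarise(hourly_mm, start, last_wet))
                start = hour
            last_wet = hour
    if start is not None:
        storms.append(_summarise(hourly_mm, start, last_wet))
    return storms
```

For example, with the series `[0, 2, 3, 0, 1, 0, 0, 0, 4, 0]` and a 3-hour inter-event period, hours 1 to 4 form one storm and hour 8 forms a second; shortening the period to 1 hour splits the first storm in two.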
Extracting Databases from Dark Data with DeepDive
Zhang, Ce; Shin, Jaeho; Ré, Christopher; Cafarella, Michael; Niu, Feng
2016-01-01
DeepDive is a system for extracting relational databases from dark data: the mass of text, tables, and images that are widely collected and stored but which cannot be exploited by standard relational tools. If the information in dark data — scientific papers, Web classified ads, customer service notes, and so on — were instead in a relational database, it would give analysts a massive and valuable new set of “big data.” DeepDive is distinctive when compared to previous information extraction systems in its ability to obtain very high precision and recall at reasonable engineering cost; in a number of applications, we have used DeepDive to create databases with accuracy that meets that of human annotators. To date we have successfully deployed DeepDive to create data-centric applications for insurance, materials science, genomics, paleontologists, law enforcement, and others. The data unlocked by DeepDive represents a massive opportunity for industry, government, and scientific researchers. DeepDive is enabled by an unusual design that combines large-scale probabilistic inference with a novel developer interaction cycle. This design is enabled by several core innovations around probabilistic training and inference. PMID:28316365
Tomato Expression Database (TED): a suite of data presentation and analysis tools
Fei, Zhangjun; Tang, Xuemei; Alba, Rob; Giovannoni, James
2006-01-01
The Tomato Expression Database (TED) includes three integrated components. The Tomato Microarray Data Warehouse serves as a central repository for raw gene expression data derived from the public tomato cDNA microarray. In addition to expression data, TED stores experimental design and array information in compliance with the MIAME guidelines and provides web interfaces for researchers to retrieve data for their own analysis and use. The Tomato Microarray Expression Database contains normalized and processed microarray data for ten time points with nine pair-wise comparisons during fruit development and ripening in a normal tomato variety and nearly isogenic single gene mutants impacting fruit development and ripening. Finally, the Tomato Digital Expression Database contains raw and normalized digital expression (EST abundance) data derived from analysis of the complete public tomato EST collection containing >150,000 ESTs derived from 27 different non-normalized EST libraries. This last component also includes tools for the comparison of tomato and Arabidopsis digital expression data. A set of query interfaces and analysis, and visualization tools have been developed and incorporated into TED, which aid users in identifying and deciphering biologically important information from our datasets. TED can be accessed at http://ted.bti.cornell.edu. PMID:16381976
External Data and Attribute Hyperlink Programs for Promis*e(Registered Trademark)
NASA Technical Reports Server (NTRS)
Derengowski, Rich; Gruel, Andrew
2001-01-01
External Data and Attribute Hyperlink are computer programs that can be added to Promis*e(trademark) which is a commercial software system that automates routine tasks in the design (including drawing schematic diagrams) of electrical control systems. The programs were developed under the Stennis Space Center's (SSC) Dual Use Technology Development Program to provide capabilities for SSC's BMCS configuration management system which uses Promis*e(trademark). The External Data program enables the storage and management of information in an external database linked to a drawing. Changes can be made either in the database or on the drawing. Information that originates outside Promis*e(trademark) can be stored in custom fields that can be added to the database. Although this information is not available in Promis*e(trademark) printed drawings, it can be associated with symbols in the drawings, and can be retrieved through the drawings when the software is running. The Attribute Hyperlink program enables the addition of hyperlink information as attributes of symbols. This program enables the formation of a direct hyperlink between a schematic diagram and an Internet site or a file on a compact disk, on the user's hard drive, or on another computer on a network to which the user's computer is connected. The user can then obtain information directly related to the part (e.g., maintenance, or troubleshooting information) associated with the hyperlink.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Karpinets, Tatiana V; Park, Byung; Syed, Mustafa H
2010-01-01
The Carbohydrate-Active Enzyme (CAZy) database provides a rich set of manually annotated enzymes that degrade, modify, or create glycosidic bonds. Despite rich and invaluable information stored in the database, software tools utilizing this information for annotation of newly sequenced genomes by CAZy families are limited. We have employed two annotation approaches to fill the gap between manually curated high-quality protein sequences collected in the CAZy database and the growing number of other protein sequences produced by genome or metagenome sequencing projects. The first approach is based on a similarity search against the entire non-redundant sequences of the CAZy database. The second approach performs annotation using links or correspondences between the CAZy families and protein family domains. The links were discovered using the association rule learning algorithm applied to sequences from the CAZy database. The approaches complement each other and in combination achieved high specificity and sensitivity when cross-evaluated with the manually curated genomes of Clostridium thermocellum ATCC 27405 and Saccharophagus degradans 2-40. The capability of the proposed framework to predict the function of unknown protein domains (DUF) and of hypothetical proteins in the genome of Neurospora crassa is demonstrated. The framework is implemented as a Web service, the CAZymes Analysis Toolkit (CAT), and is available at http://cricket.ornl.gov/cgi-bin/cat.cgi.
Park, Byung H; Karpinets, Tatiana V; Syed, Mustafa H; Leuze, Michael R; Uberbacher, Edward C
2010-12-01
The Carbohydrate-Active Enzyme (CAZy) database provides a rich set of manually annotated enzymes that degrade, modify, or create glycosidic bonds. Despite rich and invaluable information stored in the database, software tools utilizing this information for annotation of newly sequenced genomes by CAZy families are limited. We have employed two annotation approaches to fill the gap between manually curated high-quality protein sequences collected in the CAZy database and the growing number of other protein sequences produced by genome or metagenome sequencing projects. The first approach is based on a similarity search against the entire nonredundant sequences of the CAZy database. The second approach performs annotation using links or correspondences between the CAZy families and protein family domains. The links were discovered using the association rule learning algorithm applied to sequences from the CAZy database. The approaches complement each other and in combination achieved high specificity and sensitivity when cross-evaluated with the manually curated genomes of Clostridium thermocellum ATCC 27405 and Saccharophagus degradans 2-40. The capability of the proposed framework to predict the function of unknown protein domains and of hypothetical proteins in the genome of Neurospora crassa is demonstrated. The framework is implemented as a Web service, the CAZymes Analysis Toolkit, and is available at http://cricket.ornl.gov/cgi-bin/cat.cgi.
Guo, Wei; Song, Binbin; Shen, Junfei; Wu, Jiong; Zhang, Chunyan; Wang, Beili; Pan, Baishen
2015-08-25
To establish an indirect reference interval based on the test results of alanine aminotransferase stored in a laboratory information system. All alanine aminotransferase results were included for outpatients and physical examinations that were stored in the laboratory information system of Zhongshan Hospital during 2014. The original data were transformed using a Box-Cox transformation to obtain an approximate normal distribution. Outliers were identified and omitted using the Chauvenet and Tukey methods. The indirect reference intervals were obtained by simultaneously applying nonparametric and Hoffmann methods. The reference change value was selected to determine the statistical significance of the observed differences between the calculated and published reference intervals. The indirect reference intervals for alanine aminotransferase of all groups were 12 to 41 U/L (male, outpatient), 12 to 48 U/L (male, physical examination), 9 to 32 U/L (female, outpatient), and 8 to 35 U/L (female, physical examination), respectively. The absolute differences when compared with the direct results were all smaller than the reference change value of alanine aminotransferase. The Box-Cox transformation combined with the Hoffmann and Tukey methods is a simple and reliable technique that should be promoted and used by clinical laboratories.
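Part of the pipeline described above (transform, remove outliers, take nonparametric limits) can be sketched in Python using only the standard library. This is an illustration under stated assumptions, not the authors' procedure: the Box-Cox parameter is fixed by hand rather than estimated, only the Tukey fence is applied (the Chauvenet criterion and the Hoffmann method are omitted), and the function names and sample values are hypothetical.

```python
import math
import statistics
from typing import List, Tuple


def box_cox(x: float, lam: float) -> float:
    """Box-Cox transform of a positive value; lam == 0 is the log transform."""
    if x <= 0:
        raise ValueError("Box-Cox requires positive values")
    return math.log(x) if lam == 0 else (x ** lam - 1.0) / lam


def tukey_filter(values: List[float], lam: float) -> List[float]:
    """Keep values whose transformed result lies within the Tukey fences."""
    transformed = [box_cox(v, lam) for v in values]
    q1, _, q3 = statistics.quantiles(transformed, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v, t in zip(values, transformed) if low <= t <= high]


def reference_interval(values: List[float], lam: float = 0.0) -> Tuple[float, float]:
    """Nonparametric 2.5th to 97.5th percentile interval after outlier removal."""
    kept = tukey_filter(values, lam)
    cuts = statistics.quantiles(kept, n=40, method="inclusive")
    return cuts[0], cuts[-1]  # 2.5th and 97.5th percentiles


# Hypothetical results in U/L; 400 stands in for a pathological value.
alt_results = [12, 14, 15, 17, 18, 20, 21, 22, 24, 25, 27, 30, 33, 36, 400]
kept = tukey_filter(alt_results, lam=0.0)
lower, upper = reference_interval(alt_results)
```

The limits are computed on the original scale from the retained values, so they can be compared directly with published intervals. A real analysis would use far more results and would estimate the transformation parameter from the data.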
Characterizing Rural Food Access in Remote Areas.
Bardenhagen, Chris J; Pinard, Courtney A; Pirog, Rich; Yaroch, Amy Lazarus
2017-10-01
Residents of rural areas may have limited access to healthy foods, leading to higher incidence of diet related health issues. Smaller grocers in rural areas experience challenges in maintaining fresh produce and other healthy foods available for customers. This study assessed the rural food environment in northeast Lower Michigan in order to inform healthy food financing projects such as the Michigan Good Food Fund. The area's retail food businesses were categorized using secondary licensing, business, and nutrition program databases. Twenty of these stores were visited in person to verify the validity of the categories created, and to assess the availability of healthy foods in their aisles. In-depth interviews with key informants were carried out with store owners, economic development personnel, and other food system stakeholders having knowledge about food access, in order to learn more about the specific challenges that the area faces. Out-shopping, seasonality, and economic challenges were found to affect healthy food availability. Mid-sized independent stores were generally found to have a larger selection of healthy foods, but smaller rural groceries also have potential to provide fresh produce and increase food access. Potential healthy food financing projects are described and areas in need of further research are identified.
Web-based flood database for Colorado, water years 1867 through 2011
Kohn, Michael S.; Jarrett, Robert D.; Krammes, Gary S.; Mommandi, Amanullah
2013-01-01
In order to provide a centralized repository of flood information for the State of Colorado, the U.S. Geological Survey, in cooperation with the Colorado Department of Transportation, created a Web-based geodatabase for flood information from water years 1867 through 2011 and data for paleofloods occurring in the past 5,000 to 10,000 years. The geodatabase was created using the Environmental Systems Research Institute ArcGIS JavaScript Application Programming Interface 3.2. The database can be accessed at http://cwscpublic2.cr.usgs.gov/projects/coflood/COFloodMap.html. Data on 6,767 flood events at 1,597 individual sites throughout Colorado were compiled to generate the flood database. The data sources of flood information are indirect discharge measurements that were stored in U.S. Geological Survey offices (water years 1867–2011), flood data from indirect discharge measurements referenced in U.S. Geological Survey reports (water years 1884–2011), paleoflood studies from six peer-reviewed journal articles (data on events occurring in the past 5,000 to 10,000 years), and the U.S. Geological Survey National Water Information System peak-discharge database (water years 1883–2010). A number of tests were performed on the flood database to ensure the quality of the data. The Web interface was programmed using the Environmental Systems Research Institute ArcGIS JavaScript Application Programming Interface 3.2, which allows for display, query, georeference, and export of the data in the flood database. The data fields in the flood database used to search and filter the database include hydrologic unit code, U.S. Geological Survey station number, site name, county, drainage area, elevation, data source, date of flood, peak discharge, and field method used to determine discharge. Additional data fields can be viewed and exported, but the data fields described above are the only ones that can be used for queries.
Bridging the Qualitative/Quantitative Software Divide
Annechino, Rachelle; Antin, Tamar M. J.; Lee, Juliet P.
2011-01-01
To compare and combine qualitative and quantitative data collected from respondents in a mixed methods study, the research team developed a relational database to merge survey responses stored and analyzed in SPSS and semistructured interview responses stored and analyzed in the qualitative software package ATLAS.ti. The process of developing the database, as well as practical considerations for researchers who may wish to use similar methods, are explored. PMID:22003318
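One way to picture such a merge is a relational join on a shared respondent identifier. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the table names, column names, and sample rows are hypothetical, and the actual exports from SPSS and ATLAS.ti would need their own import step.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One row per respondent, as exported from the survey software.
    CREATE TABLE survey (
        respondent_id INTEGER PRIMARY KEY,
        q1_score      INTEGER
    );
    -- One row per code applied to a respondent's interview.
    CREATE TABLE interview_code (
        respondent_id INTEGER REFERENCES survey(respondent_id),
        code          TEXT
    );
""")
conn.executemany("INSERT INTO survey VALUES (?, ?)",
                 [(1, 4), (2, 2), (3, 5)])
conn.executemany("INSERT INTO interview_code VALUES (?, ?)",
                 [(1, "stress"), (1, "family"), (3, "stress")])

# LEFT JOIN keeps respondents who completed the survey but have no coded
# interview segments; their code column is NULL (None in Python).
rows = conn.execute("""
    SELECT s.respondent_id, s.q1_score, c.code
    FROM survey AS s
    LEFT JOIN interview_code AS c
           ON c.respondent_id = s.respondent_id
    ORDER BY s.respondent_id, c.code
""").fetchall()
```

Keeping the qualitative codes in their own table, rather than as extra survey columns, lets a respondent carry any number of codes without changing the schema.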
Ontology-based geospatial data query and integration
Zhao, T.; Zhang, C.; Wei, M.; Peng, Z.-R.
2008-01-01
Geospatial data sharing is an increasingly important subject as large amounts of data are produced by a variety of sources, stored in incompatible formats, and accessible through different GIS applications. Past efforts to enable sharing have produced standardized data formats such as GML and data access protocols such as Web Feature Service (WFS). While these standards help client applications gain access to heterogeneous data stored in different formats from diverse sources, the usability of the access is limited due to the lack of data semantics encoded in the WFS feature types. Past research has used ontology languages to describe the semantics of geospatial data, but ontology-based queries cannot be applied directly to legacy data stored in databases or shapefiles, or to feature data in WFS services. This paper presents a method to enable ontology queries on spatial data available from WFS services and on data stored in databases. We do not create ontology instances explicitly and thus avoid the problems of data replication. Instead, user queries are rewritten to WFS getFeature requests and SQL queries to the database. The method also has the benefit of being able to utilize existing database, WFS, and GML tools while enabling queries based on ontology semantics. © 2008 Springer-Verlag Berlin Heidelberg.
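The rewriting idea can be sketched in Python as a lookup from ontology terms to concrete feature types and tables. This is a simplified illustration, not the paper's algorithm: the mapping, the `hyp:roads` feature type, the `example.org` endpoint, and the property names are all hypothetical, and only equality filters are handled.

```python
from typing import Dict, List, Tuple
from urllib.parse import urlencode

# Hypothetical mapping from an ontology class to its physical sources.
MAPPING = {
    "Road": {
        "wfs_type": "hyp:roads",          # assumed WFS feature type name
        "table": "roads",                 # assumed database table
        "properties": {"hasSurface": "surface", "hasLanes": "lanes"},
    }
}


def rewrite_to_wfs(base_url: str, ontology_class: str,
                   max_features: int = 100) -> str:
    """Build a WFS 1.1.0 GetFeature key-value request for an ontology class."""
    params = {
        "service": "WFS",
        "version": "1.1.0",
        "request": "GetFeature",
        "typeName": MAPPING[ontology_class]["wfs_type"],
        "maxFeatures": max_features,
    }
    return f"{base_url}?{urlencode(params)}"


def rewrite_to_sql(ontology_class: str,
                   filters: Dict[str, object]) -> Tuple[str, List[object]]:
    """Build a parameterized SQL query from ontology property filters."""
    entry = MAPPING[ontology_class]
    columns = [entry["properties"][prop] for prop in filters]
    sql = f"SELECT * FROM {entry['table']}"
    if columns:
        sql += " WHERE " + " AND ".join(f"{col} = ?" for col in columns)
    return sql, list(filters.values())


url = rewrite_to_wfs("http://example.org/wfs", "Road")
sql, params = rewrite_to_sql("Road", {"hasSurface": "paved"})
```

Because the query is translated at request time, no ontology instances are materialized, which is the replication problem the abstract says the method avoids.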
RDFBuilder: a tool to automatically build RDF-based interfaces for MAGE-OM microarray data sources.
Anguita, Alberto; Martin, Luis; Garcia-Remesal, Miguel; Maojo, Victor
2013-07-01
This paper presents RDFBuilder, a tool that enables RDF-based access to MAGE-ML-compliant microarray databases. We have developed a system that automatically transforms the MAGE-OM model and microarray data stored in the ArrayExpress database into RDF format. Additionally, the system automatically enables a SPARQL endpoint. This allows users to execute SPARQL queries for retrieving microarray data, either from specific experiments or from more than one experiment at a time. Our system optimizes response times by caching and reusing information from previous queries. In this paper, we describe our methods for achieving this transformation. We show that our approach is complementary to other existing initiatives, such as Bio2RDF, for accessing and retrieving data from the ArrayExpress database. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
C-ME: A 3D Community-Based, Real-Time Collaboration Tool for Scientific Research and Training
Kolatkar, Anand; Kennedy, Kevin; Halabuk, Dan; Kunken, Josh; Marrinucci, Dena; Bethel, Kelly; Guzman, Rodney; Huckaby, Tim; Kuhn, Peter
2008-01-01
The need for effective collaboration tools is growing as multidisciplinary proteome-wide projects and distributed research teams become more common. The resulting data is often quite disparate, stored in separate locations, and not contextually related. Collaborative Molecular Modeling Environment (C-ME) is an interactive community-based collaboration system that allows researchers to organize information, visualize data on a two-dimensional (2-D) or three-dimensional (3-D) basis, and share and manage that information with collaborators in real time. C-ME stores the information in industry-standard databases that are immediately accessible by appropriate permission within the computer network directory service or anonymously across the internet through the C-ME application or through a web browser. The system addresses two important aspects of collaboration: context and information management. C-ME allows a researcher to use a 3-D atomic structure model or a 2-D image as a contextual basis on which to attach and share annotations to specific atoms or molecules or to specific regions of a 2-D image. These annotations provide additional information about the atomic structure or image data that can then be evaluated, amended or added to by other project members. PMID:18286178
Virtual working systems to support R&D groups
NASA Astrophysics Data System (ADS)
Dew, Peter M.; Leigh, Christine; Drew, Richard S.; Morris, David; Curson, Jayne
1995-03-01
The paper reports on the progress at Leeds University to build a Virtual Science Park (VSP) to enhance the University's ability to interact with industry, grow its applied research and workplace learning activities. The VSP exploits the advances in real time collaborative computing and networking to provide an environment that meets the objectives of physically based science parks without the need for the organizations to relocate. It provides an integrated set of services (e.g. virtual consultancy, workbased learning) built around a structured person-centered information model. This model supports the integration of tools for: (a) navigating around the information space; (b) browsing information stored within the VSP database; (c) communicating through a variety of Person-to-Person collaborative tools; and (d) the ability to the information stored in the VSP including the relationships to other information that support the underlying model. The paper gives an overview of a generic virtual working system based on X.500 directory services and the World-Wide Web that can be used to support the Virtual Science Park. Finally the paper discusses some of the research issues that need to be addressed to fully realize a Virtual Science Park.
Lilley, Rebbecca; Davie, Gabrielle; Wilson, Suzanne
2016-10-01
Large administrative databases provide powerful opportunities for examining the epidemiology of injury. The National Coronial Information System (NCIS) contains Coronial data from Australia and New Zealand (NZ); however, only closed cases are stored for NZ. This paper examines the completeness of NZ data within the NCIS and its impact upon the validity and utility of this database. A retrospective review of the capture of NZ cases of quad-related fatalities held in the NCIS was undertaken by identifying outstanding Coronial cases held on the NZ Coronial Management System (primary source of NZ Coronial data). NZ data held on the NCIS database were incomplete due to the non-capture of closed cases and the unavailability of open cases. Improvements to the information provided on the NCIS about the completeness of NZ data are needed to improve the validity of NCIS-derived findings and the overall utility of the NCIS for research. Published by the BMJ Publishing Group Limited.
Vector and Raster Data Storage Based on Morton Code
NASA Astrophysics Data System (ADS)
Zhou, G.; Pan, Q.; Yue, T.; Wang, Q.; Sha, H.; Huang, S.; Liu, X.
2018-05-01
Although geomatics is well developed today, integrating spatial data in vector and raster formats remains a difficult problem in geographic information system environments, and no satisfactory solution exists yet. This article proposes a method to interpret vector data and raster data. We saved the image data and building vector data of Guilin University of Technology to an Oracle database. We then used the ADO interface to connect the database to Visual C++ and converted the row and column numbers of the raster data and the X and Y coordinates of the vector data to Morton code in the Visual C++ environment. This method stores vector and raster data in an Oracle database and uses Morton code, instead of row and column numbers and X and Y coordinates, to mark the position information of vector and raster data. Using Morton code to mark geographic information lets stored data make full use of storage space, makes simultaneous analysis of vector and raster data more efficient, and makes visualization of vector and raster data more intuitive. This method is helpful in situations that require analysing or displaying vector data and raster data at the same time.
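A Morton code interleaves the bits of two integer coordinates into a single integer key, so that one column can stand in for a row and column pair. The Python sketch below shows the standard bit interleaving; it is not the authors' Visual C++ implementation, the function names are hypothetical, and the choice to place column bits in the even positions is an assumption (the opposite convention is equally valid).

```python
from typing import Tuple


def morton_encode(row: int, col: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of row and col into one integer.

    Column bits go to even positions, row bits to odd positions.
    """
    code = 0
    for i in range(bits):
        code |= ((col >> i) & 1) << (2 * i)
        code |= ((row >> i) & 1) << (2 * i + 1)
    return code


def morton_decode(code: int, bits: int = 16) -> Tuple[int, int]:
    """Recover (row, col) from a Morton code built by morton_encode."""
    row = col = 0
    for i in range(bits):
        col |= ((code >> (2 * i)) & 1) << i
        row |= ((code >> (2 * i + 1)) & 1) << i
    return row, col


key = morton_encode(2, 3)          # row 0b10, col 0b11 -> 0b1101
row, col = morton_decode(key)
```

Vector X and Y coordinates are real numbers, so they would first have to be quantized to non-negative integer grid indices (for example by subtracting an origin and dividing by a cell size) before encoding; that step is omitted here.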
Yager, Douglas B.; Hofstra, Albert H.; Granitto, Matthew
2012-01-01
This report emphasizes geographic information system analysis and the display of data stored in the legacy U.S. Geological Survey National Geochemical Database for use in mineral resource investigations. Geochemical analyses of soils, stream sediments, and rocks that are archived in the National Geochemical Database provide an extensive data source for investigating geochemical anomalies. A study area in the Egan Range of east-central Nevada was used to develop a geographic information system analysis methodology for two different geochemical datasets involving detailed (Bureau of Land Management Wilderness) and reconnaissance-scale (National Uranium Resource Evaluation) investigations. ArcGIS was used to analyze and thematically map geochemical information at point locations. Watershed-boundary datasets served as a geographic reference to relate potentially anomalous sample sites with hydrologic unit codes at varying scales. The National Hydrography Dataset was analyzed with Hydrography Event Management and ArcGIS Utility Network Analyst tools to delineate potential sediment-sample provenance along a stream network. These tools can be used to track potential upstream-sediment-contributing areas to a sample site. This methodology identifies geochemically anomalous sample sites, watersheds, and streams that could help focus mineral resource investigations in the field.
Carey, George B; Kazantsev, Stephanie; Surati, Mosmi; Rolle, Cleo E; Kanteti, Archana; Sadiq, Ahad; Bahroos, Neil; Raumann, Brigitte; Madduri, Ravi; Dave, Paul; Starkey, Adam; Hensing, Thomas; Husain, Aliya N; Vokes, Everett E; Vigneswaran, Wickii; Armato, Samuel G; Kindler, Hedy L; Salgia, Ravi
2012-01-01
Objective An area of need in cancer informatics is the ability to store images in a comprehensive database as part of translational cancer research. To meet this need, we have implemented a novel tandem database infrastructure that facilitates image storage and utilisation. Background We had previously implemented the Thoracic Oncology Program Database Project (TOPDP) database for our translational cancer research needs. While useful for many research endeavours, it is unable to store images, hence our need to implement an imaging database which could communicate easily with the TOPDP database. Methods The Thoracic Oncology Research Program (TORP) imaging database was designed using the Research Electronic Data Capture (REDCap) platform, which was developed by Vanderbilt University. To demonstrate proof of principle and evaluate utility, we performed a retrospective investigation into tumour response for malignant pleural mesothelioma (MPM) patients treated at the University of Chicago Medical Center with either of two analogous chemotherapy regimens and consented to at least one of two UCMC IRB protocols, 9571 and 13473A. Results A cohort of 22 MPM patients was identified using clinical data in the TOPDP database. After measurements were acquired, two representative CT images and 0–35 histological images per patient were successfully stored in the TORP database, along with clinical and demographic data. Discussion We implemented the TORP imaging database to be used in conjunction with our comprehensive TOPDP database. While it requires an additional effort to use two databases, our database infrastructure facilitates more comprehensive translational research. Conclusions The investigation described herein demonstrates the successful implementation of this novel tandem imaging database infrastructure, as well as the potential utility of investigations enabled by it. 
The data model presented here can be utilised as the basis for further development of other larger, more streamlined databases in the future. PMID:23103606
Barbosa-Silva, A; Pafilis, E; Ortega, J M; Schneider, R
2007-12-11
Data integration has become an important task for biological database providers. The current model for data exchange among different sources simplifies the manner that distinct information is accessed by users. The evolution of data representation from HTML to XML enabled programs, instead of humans, to interact with biological databases. We present here SRS.php, a PHP library that can interact with the data integration Sequence Retrieval System (SRS). The library has been written using SOAP definitions, and permits the programmatic communication through webservices with the SRS. The interactions are possible by invoking the methods described in WSDL by exchanging XML messages. The current functions available in the library have been built to access specific data stored in any of the 90 different databases (such as UNIPROT, KEGG and GO) using the same query syntax format. The inclusion of the described functions in the source of scripts written in PHP enables them as webservice clients to the SRS server. The functions permit one to query the whole content of any SRS database, to list specific records in these databases, to get specific fields from the records, and to link any record among any pair of linked databases. The case study presented exemplifies the library usage to retrieve information regarding registries of a Plant Defense Mechanisms database. The Plant Defense Mechanisms database is currently being developed, and the proposal of SRS.php library usage is to enable the data acquisition for the further warehousing tasks related to its setup and maintenance.
APADB: a database for alternative polyadenylation and microRNA regulation events
Müller, Sören; Rycak, Lukas; Afonso-Grunz, Fabian; Winter, Peter; Zawada, Adam M.; Damrath, Ewa; Scheider, Jessica; Schmäh, Juliane; Koch, Ina; Kahl, Günter; Rotter, Björn
2014-01-01
Alternative polyadenylation (APA) is a widespread mechanism that contributes to the sophisticated dynamics of gene regulation. Approximately 50% of all protein-coding human genes harbor multiple polyadenylation (PA) sites; their selective and combinatorial use gives rise to transcript variants with differing length of their 3′ untranslated region (3′UTR). Shortened variants escape UTR-mediated regulation by microRNAs (miRNAs), especially in cancer, where global 3′UTR shortening accelerates disease progression, dedifferentiation and proliferation. Here we present APADB, a database of vertebrate PA sites determined by 3′ end sequencing, using massive analysis of complementary DNA ends. APADB provides (A)PA sites for coding and non-coding transcripts of human, mouse and chicken genes. For human and mouse, several tissue types, including different cancer specimens, are available. APADB records the loss of predicted miRNA binding sites and visualizes next-generation sequencing reads that support each PA site in a genome browser. The database tables can either be browsed according to organism and tissue or alternatively searched for a gene of interest. APADB is the largest database of APA in human, chicken and mouse. The stored information provides experimental evidence for thousands of PA sites and APA events. APADB combines 3′ end sequencing data with prediction algorithms of miRNA binding sites, allowing to further improve prediction algorithms. Current databases lack correct information about 3′UTR lengths, especially for chicken, and APADB provides necessary information to close this gap. Database URL: http://tools.genxpro.net/apadb/ PMID:25052703
An On-line Technology Information System (OTIS) for Advanced Life Support
NASA Technical Reports Server (NTRS)
Levri, Julie A.; Boulanger, Richard; Hogan, John A.; Rodriquez, Luis
2003-01-01
OTIS is an on-line communication platform designed for smooth flow of technology information between advanced life support (ALS) technology developers, researchers, system analysts, and managers. With pathways for efficient transfer of information, several improvements in the ALS Program will result. With OTIS, it will be possible to provide programmatic information for technology developers and researchers, technical information for analysts, and managerial decision support. OTIS is a platform that enables the effective research, development, and delivery of complex systems for life support. An electronic data collection form has been developed for the solid waste element, drafted by the Solid Waste Working Group. Forms for other elements (air revitalization, water recovery, food processing, biomass production and thermal control) will also be developed, based on lessons learned from the development of the solid waste form. All forms will be developed by consultation with other working groups, comprised of experts in the area of interest. Forms will be converted to an on-line data collection interface that technology developers will use to transfer information into OTIS. Funded technology developers will log in to OTIS annually to complete the element- specific forms for their technology. The type and amount of information requested expands as the technology readiness level (TRL) increases. The completed forms will feed into a regularly updated and maintained database that will store technology information and allow for database searching. To ensure confidentiality of proprietary information, security permissions will be customized for each user. Principal investigators of a project will be able to designate certain data as proprietary and only technical monitors of a task, ALS Management, and the principal investigator will have the ability to view this information. 
The typical OTIS user will be able to read all non-proprietary information about all projects. Interaction with the database will occur over encrypted connections, and data will be stored on the server in an encrypted form. Implementation of OTIS will initiate a community-accessible repository of technology development information. With OTIS, ALS element leads and managers will be able to carry out informed technology selection for programmatic decisions. OTIS will also allow analysts to make accurate evaluations of technology options. Additionally, the range and specificity of information solicited will help educate technology developers of program needs. With augmentation, OTIS reporting is capable of replacing the current fiscal year-end reporting process. Overall, the system will enable more informed R&TD decisions and more rapid attainment of ALS Program goals.
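The visibility rule described in this abstract (non-proprietary data readable by every user; proprietary data readable only by the principal investigator, the technical monitors of the task, and ALS Management) can be sketched as a single predicate. This is a minimal illustration only: the class names, field names, and the `can_view` function are hypothetical and are not taken from OTIS itself.

```python
from dataclasses import dataclass


@dataclass
class User:
    # Hypothetical user record; OTIS's real user model is not described.
    name: str
    is_als_management: bool = False


@dataclass
class DataItem:
    # Hypothetical data item belonging to one project/task.
    project: str
    principal_investigator: str
    technical_monitors: frozenset = frozenset()
    proprietary: bool = False


def can_view(user: User, item: DataItem) -> bool:
    """Return True if the user may read the item under the stated rule."""
    if not item.proprietary:
        # Every OTIS user can read non-proprietary information.
        return True
    return (
        user.name == item.principal_investigator
        or user.name in item.technical_monitors
        or user.is_als_management
    )
```

Keeping the rule in one function means the same check could be applied both to search results and to individual record views.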
FBIS: A regional DNA barcode archival & analysis system for Indian fishes
Nagpure, Naresh Sahebrao; Rashid, Iliyas; Pathak, Ajey Kumar; Singh, Mahender; Singh, Shri Prakash; Sarkar, Uttam Kumar
2012-01-01
DNA barcode is a new tool for taxon recognition and classification of biological organisms based on sequence of a fragment of mitochondrial gene, cytochrome c oxidase I (COI). In view of the growing importance of the fish DNA barcoding for species identification, molecular taxonomy and fish diversity conservation, we developed a Fish Barcode Information System (FBIS) for Indian fishes, which will serve as a regional DNA barcode archival and analysis system. The database presently contains 2334 sequence records of COI gene for 472 aquatic species belonging to 39 orders and 136 families, collected from available published data sources. Additionally, it contains information on phenotype, distribution and IUCN Red List status of fishes. The web version of FBIS was designed using MySQL, Perl and PHP under Linux operating platform to (a) store and manage the acquisition (b) analyze and explore DNA barcode records (c) identify species and estimate genetic divergence. FBIS has also been integrated with appropriate tools for retrieving and viewing information about the database statistics and taxonomy. It is expected that FBIS would be useful as a potent information system in fish molecular taxonomy, phylogeny and genomics. Availability The database is available for free at http://mail.nbfgr.res.in/fbis/ PMID:22715304
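The abstract does not say which measure FBIS uses to estimate genetic divergence. As an assumption for illustration, the sketch below uses the Kimura 2-parameter (K2P) distance, a measure often applied to aligned COI barcode sequences. The function name and the handling of gaps are choices made here, not details from the paper.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura 2-parameter distance between two aligned DNA sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; everything else is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q); math.log raises ValueError
    # when the sequences are too divergent for the model.
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)
```

A species-identification step could then compare a query sequence against each stored record and report the records with the smallest distance.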
Ocean Drilling Program: Janus Web Database
in Janus Data Types and Examples Leg 199, sunrise. Janus Web Database ODP and IODP data are stored in as time permits (see Database Overview for available data). Data are available to everyone. There are
Evaluating the Cassandra NoSQL Database Approach for Genomic Data Persistency
Aniceto, Rodrigo; Xavier, Rene; Guimarães, Valeria; Hondo, Fernanda; Holanda, Maristela; Walter, Maria Emilia; Lifschitz, Sérgio
2015-01-01
Rapid advances in high-throughput sequencing techniques have created interesting computational challenges in bioinformatics. One of them refers to management of massive amounts of data generated by automatic sequencers. We need to deal with the persistency of genomic data, particularly storing and analyzing these large-scale processed data. To find an alternative to the frequently considered relational database model becomes a compelling task. Other data models may be more effective when dealing with a very large amount of nonconventional data, especially for writing and retrieving operations. In this paper, we discuss the Cassandra NoSQL database approach for storing genomic data. We perform an analysis of persistency and I/O operations with real data, using the Cassandra database system. We also compare the results obtained with a classical relational database system and another NoSQL database approach, MongoDB. PMID:26558254
Development of a national, dynamic reservoir-sedimentation database
Gray, J.R.; Bernard, J.M.; Stewart, D.W.; McFaul, E.J.; Laurent, K.W.; Schwarz, G.E.; Stinson, J.T.; Jonas, M.M.; Randle, T.J.; Webb, J.W.
2010-01-01
The importance of dependable, long-term water supplies, coupled with the need to quantify rates of capacity loss of the Nation’s reservoirs due to sediment deposition, were the most compelling reasons for developing the REServoir-SEDimentation survey information (RESSED) database and website. Created under the auspices of the Advisory Committee on Water Information’s Subcommittee on Sedimentation by the U.S. Geological Survey and the Natural Resources Conservation Service, the RESSED database is the most comprehensive compilation of data from reservoir bathymetric and dry-basin surveys in the United States. As of March 2010, the database, which contains data compiled on the 1950s-vintage Soil Conservation Service’s Form SCS-34 data sheets, contained results from 6,616 surveys on 1,823 reservoirs in the United States and two surveys on one reservoir in Puerto Rico. The data span the period 1755–1997, with 95 percent of the surveys performed from 1930–1990. The reservoir surface areas range from sub-hectare-scale farm ponds to 658 km² Lake Powell. The data in the RESSED database can be useful for a number of purposes, including calculating changes in reservoir-storage characteristics, quantifying sediment budgets, and estimating erosion rates in a reservoir’s watershed. The March 2010 version of the RESSED database has a number of deficiencies, including a cryptic and out-of-date database architecture; some geospatial inaccuracies (although most have been corrected); other data errors; an inability to store all data in a readily retrievable manner; and an inability to store all data types that currently exist. Perhaps most importantly, the March 2010 version of the RESSED database provides no publicly available means to submit new data and corrections to existing data. To address these and other deficiencies, the Subcommittee on Sedimentation, through the U.S. Geological Survey and the U.S. 
Army Corps of Engineers, began a collaborative project in November 2009 to modernize the RESSED database architecture; provide public online input capability; and produce online reports. The ultimate goal of the Subcommittee on Sedimentation is to build a comprehensive, quality-assured database describing capacity changes over time for the largest suite of the Nation’s reservoirs.
Generation And Understanding Of Natural Language Using Information In A Frame Structure
NASA Astrophysics Data System (ADS)
Perkins, Walton A.
1989-03-01
Many expert systems and relational database systems store factual information in the form of attribute values of objects. Problems arise in transforming from that attribute (frame) database representation into English surface structure and in transforming the English surface structure into a representation that references information in the frame database. In this paper we consider mainly the generation process, as it is this area in which we have made the most significant progress. In its interaction with the user, the expert system must generate questions, declarations, and uncertain declarations. Attributes such as COLOR, LENGTH, and ILLUMINATION can be referenced using the template: "
Anderson, Beth M.; Stevens, Michael C.; Glahn, David C.; Assaf, Michal; Pearlson, Godfrey D.
2013-01-01
We present a modular, high performance, open-source database system that incorporates popular neuroimaging database features with novel peer-to-peer sharing, and a simple installation. An increasing number of imaging centers have created a massive amount of neuroimaging data since fMRI became popular more than 20 years ago, with much of that data unshared. The Neuroinformatics Database (NiDB) provides a stable platform to store and manipulate neuroimaging data and addresses several of the impediments to data sharing presented by the INCF Task Force on Neuroimaging Datasharing, including 1) motivation to share data, 2) technical issues, and 3) standards development. NiDB solves these problems by 1) minimizing PHI use, providing a cost effective simple locally stored platform, 2) storing and associating all data (including genome) with a subject and creating a peer-to-peer sharing model, and 3) defining a sample, normalized definition of a data storage structure that is used in NiDB. NiDB not only simplifies the local storage and analysis of neuroimaging data, but also enables simple sharing of raw data and analysis methods, which may encourage further sharing. PMID:23912507
USDA-ARS's Scientific Manuscript database
Tomato Functional Genomics Database (TFGD; http://ted.bti.cornell.edu) provides a comprehensive systems biology resource to store, mine, analyze, visualize and integrate large-scale tomato functional genomics datasets. The database is expanded from the previously described Tomato Expression Database...
Development of Human Face Literature Database Using Text Mining Approach: Phase I.
Kaur, Paramjit; Krishan, Kewal; Sharma, Suresh K
2018-06-01
The face is an important part of the human body by which an individual communicates in society. Its importance can be highlighted by the fact that a person deprived of a face cannot sustain in the living world. The number of experiments being performed and the number of research papers being published in the domain of the human face have surged in the past few decades. Several scientific disciplines conduct research on the human face, including Medical Science, Anthropology, Information Technology (Biometrics, Robotics, Artificial Intelligence, etc.), Psychology, Forensic Science, and Neuroscience. This signals the need to collect and manage data concerning the human face so that public and free access to it can be provided to the scientific community. This can be attained by developing databases and tools on the human face using a bioinformatics approach. The current research focuses on creating a database of literature data on the human face. The database can be accessed on the basis of specific keywords, journal name, date of publication, author's name, etc. The collected research papers will be stored in the form of a database. Hence, the database will be beneficial to the research community, as comprehensive information dedicated to the human face can be found in one place. Information related to facial morphologic features, facial disorders, facial asymmetry, facial abnormalities, and many other parameters can be extracted from this database. The front end has been developed using HyperText Markup Language (HTML) and Cascading Style Sheets (CSS). The back end has been developed using PHP (Hypertext Preprocessor). JavaScript has been used as the scripting language. MySQL is used for database development, as it is the most widely used Relational Database Management System. 
XAMPP (X (cross platform), Apache, MySQL, PHP, Perl) open source web application software has been used as the server. The database is still in the developmental phase, and this paper discusses the initial steps of its creation and describes the work done to date.
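The search described in this abstract (lookup by keyword, journal name, publication date, or author's name) can be sketched with parameterized SQL. The paper uses MySQL and PHP. The example below substitutes Python's built-in `sqlite3` module so that it is self-contained, and the table layout, column names, and `search_papers` function are assumptions made for illustration.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical literature table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE papers (
               id INTEGER PRIMARY KEY,
               title TEXT, authors TEXT, journal TEXT,
               pub_date TEXT,   -- ISO date string, e.g. '2015-03-01'
               keywords TEXT
           )"""
    )
    return conn


def search_papers(conn, keyword=None, journal=None, author=None, since=None):
    """Return (id, title) rows matching every filter that was supplied."""
    clauses, params = [], []
    if keyword:
        clauses.append("(title LIKE ? OR keywords LIKE ?)")
        params += [f"%{keyword}%", f"%{keyword}%"]
    if journal:
        clauses.append("journal = ?")
        params.append(journal)
    if author:
        clauses.append("authors LIKE ?")
        params.append(f"%{author}%")
    if since:
        clauses.append("pub_date >= ?")
        params.append(since)
    where = " AND ".join(clauses) if clauses else "1"
    # Only fixed clause text is interpolated; all user values are bound.
    sql = f"SELECT id, title FROM papers WHERE {where} ORDER BY id"
    return conn.execute(sql, params).fetchall()
```

Binding user input as parameters rather than concatenating it into the SQL string avoids injection problems in a public search form.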
Process description language: an experiment in robust programming for manufacturing systems
NASA Astrophysics Data System (ADS)
Spooner, Natalie R.; Creak, G. Alan
1998-10-01
Maintaining stable, robust, and consistent software is difficult in face of the increasing rate of change of customers' preferences, materials, manufacturing techniques, computer equipment, and other characteristic features of manufacturing systems. It is argued that software is commonly difficult to keep up to date because many of the implications of these changing features on software details are obscure. A possible solution is to use a software generation system in which the transformation of system properties into system software is made explicit. The proposed generation system stores the system properties, such as machine properties, product properties and information on manufacturing techniques, in databases. As a result this information, on which system control is based, can also be made available to other programs. In particular, artificial intelligence programs such as fault diagnosis programs, can benefit from using the same information as the control system, rather than a separate database which must be developed and maintained separately to ensure consistency. Experience in developing a simplified model of such a system is presented.
A Data Management System for International Space Station Simulation Tools
NASA Technical Reports Server (NTRS)
Betts, Bradley J.; DelMundo, Rommel; Elcott, Sharif; McIntosh, Dawn; Niehaus, Brian; Papasin, Richard; Mah, Robert W.; Clancy, Daniel (Technical Monitor)
2002-01-01
Groups associated with the design, operational, and training aspects of the International Space Station make extensive use of modeling and simulation tools. Users of these tools often need to access and manipulate large quantities of data associated with the station, ranging from design documents to wiring diagrams. Retrieving and manipulating this data directly within the simulation and modeling environment can provide substantial benefit to users. An approach for providing these kinds of data management services, including a database schema and class structure, is presented. Implementation details are also provided as a data management system is integrated into the Intelligent Virtual Station, a modeling and simulation tool developed by the NASA Ames Smart Systems Research Laboratory. One use of the Intelligent Virtual Station is generating station-related training procedures in a virtual environment. The data management component allows users to quickly and easily retrieve information related to objects on the station, enhancing their ability to generate accurate procedures. Users can associate new information with objects and have that information stored in a database.
The design and implementation of the immune epitope database and analysis resource
Peters, Bjoern; Sidney, John; Bourne, Phil; Bui, Huynh-Hoa; Buus, Soeren; Doh, Grace; Fleri, Ward; Kronenberg, Mitch; Kubo, Ralph; Lund, Ole; Nemazee, David; Ponomarenko, Julia V.; Sathiamurthy, Muthu; Schoenberger, Stephen P.; Stewart, Scott; Surko, Pamela; Way, Scott; Wilson, Steve; Sette, Alessandro
2016-01-01
Epitopes are defined as parts of antigens interacting with receptors of the immune system. Knowledge about their intrinsic structure and how they affect the immune response is required to continue development of techniques that detect, monitor, and fight diseases. Their scientific importance is reflected in the vast amount of epitope-related information gathered, ranging from interactions between epitopes and major histocompatibility complex molecules determined by X-ray crystallography to clinical studies analyzing correlates of protection for epitope based vaccines. Our goal is to provide a central resource capable of capturing this information, allowing users to access and connect realms of knowledge that are currently separated and difficult to access. Here, we portray a new initiative, “The Immune Epitope Database and Analysis Resource.” We describe how we plan to capture, structure, and store this information, what query interfaces we will make available to the public, and what additional predictive and analytical tools we will provide. PMID:15895191
Portales-Casamar, Elodie; Arenillas, David; Lim, Jonathan; Swanson, Magdalena I.; Jiang, Steven; McCallum, Anthony; Kirov, Stefan; Wasserman, Wyeth W.
2009-01-01
The PAZAR database unites independently created and maintained data collections of transcription factor and regulatory sequence annotation. The flexible PAZAR schema permits the representation of diverse information derived from experiments ranging from biochemical protein–DNA binding to cellular reporter gene assays. Data collections can be made available to the public, or restricted to specific system users. The data ‘boutiques’ within the shopping-mall-inspired system facilitate the analysis of genomics data and the creation of predictive models of gene regulation. Since its initial release, PAZAR has grown in terms of data, features and through the addition of an associated package of software tools called the ORCA toolkit (ORCAtk). ORCAtk allows users to rapidly develop analyses based on the information stored in the PAZAR system. PAZAR is available at http://www.pazar.info. ORCAtk can be accessed through convenient buttons located in the PAZAR pages or via our website at http://www.cisreg.ca/ORCAtk. PMID:18971253
Amadoz, Alicia; González-Candelas, Fernando
2007-04-20
Most research scientists working in the fields of molecular epidemiology, population and evolutionary genetics are confronted with the management of large volumes of data. Moreover, the data used in studies of infectious diseases are complex and usually derive from different institutions such as hospitals or laboratories. Since no public database scheme incorporating clinical and epidemiological information about patients and molecular information about pathogens is currently available, we have developed an information system, composed of a main database and a web-based interface, which integrates both types of data and satisfies requirements of good organization, simple accessibility, data security and multi-user support. From the moment a patient arrives at a hospital or health centre until the processing and analysis of molecular sequences obtained from infectious pathogens in the laboratory, a great deal of information is collected from different sources. We have divided the most relevant data into 12 conceptual modules around which we have organized the database schema. Our schema is very complete and it covers many aspects of sample sources, samples, laboratory processes, molecular sequences, phylogenetics results, clinical tests and results, clinical information, treatments, pathogens, transmissions, outbreaks and bibliographic information. Communication between end-users and the selected Relational Database Management System (RDMS) is carried out by default through a command-line window or through a user-friendly, web-based interface which provides access and management tools for the data. epiPATH is an information system for managing clinical and molecular information from infectious diseases. It facilitates daily work related to infectious pathogens and sequences obtained from them. This software is intended for local installation in order to safeguard private data and provides advanced SQL users the flexibility to adapt it to their needs. 
The database schema, tool scripts and web-based interface are free software but data stored in our database server are not publicly available. epiPATH is distributed under the terms of GNU General Public License. More details about epiPATH can be found at http://genevo.uv.es/epipath.
Technology Applications Group Multimedia CD-ROM Project
NASA Technical Reports Server (NTRS)
McRacken, Kristi D.
1995-01-01
To produce a multimedia CD-ROM for the Technology Applications Group which would present the Technology Opportunity Showcase (TOPS) exhibits and Small Business Innovative Research (SBIR) projects to interested companies. The CD-ROM format is being used and developed especially for those companies who do not have Internet access, and cannot directly visit Langley through the World Wide Web. The CD-ROM will include text, pictures, sound, and movies. The information for the CD-ROM will be stored in a database from which the users can query and browse the information, and future CD's can be maintained and updated.
An annotation system for 3D fluid flow visualization
NASA Technical Reports Server (NTRS)
Loughlin, Maria M.; Hughes, John F.
1995-01-01
Annotation is a key activity of data analysis. However, current systems for data analysis focus almost exclusively on visualization. We propose a system which integrates annotations into a visualization system. Annotations are embedded in 3D data space, using the Post-it metaphor. This embedding allows contextual-based information storage and retrieval, and facilitates information sharing in collaborative environments. We provide a traditional database filter and a Magic Lens filter to create specialized views of the data. The system has been customized for fluid flow applications, with features which allow users to store parameters of visualization tools and sketch 3D volumes.
The use of intelligent database systems in acute pancreatitis--a systematic review.
van den Heever, Marc; Mittal, Anubhav; Haydock, Matthew; Windsor, John
2014-01-01
Acute pancreatitis (AP) is a complex disease with multiple aetiological factors, wide ranging severity, and multiple challenges to effective triage and management. Databases, data mining and machine learning algorithms (MLAs), including artificial neural networks (ANNs), may assist by storing and interpreting data from multiple sources, potentially improving clinical decision-making. 1) Identify database technologies used to store AP data, 2) collate and categorise variables stored in AP databases, 3) identify the MLA technologies, including ANNs, used to analyse AP data, and 4) identify clinical and non-clinical benefits and obstacles in establishing a national or international AP database. Comprehensive systematic search of online reference databases. The predetermined inclusion criteria were all papers discussing 1) databases, 2) data mining or 3) MLAs, pertaining to AP, independently assessed by two reviewers with conflicts resolved by a third author. Forty-three papers were included. Three data mining technologies and five ANN methodologies were reported in the literature. There were 187 collected variables identified. ANNs increase accuracy of severity prediction, one study showed ANNs had a sensitivity of 0.89 and specificity of 0.96 six hours after admission--compare APACHE II (cutoff score ≥8) with 0.80 and 0.85 respectively. Problems with databases were incomplete data, lack of clinical data, diagnostic reliability and missing clinical data. This is the first systematic review examining the use of databases, MLAs and ANNs in the management of AP. The clinical benefits these technologies have over current systems and other advantages to adopting them are identified. Copyright © 2013 IAP and EPC. Published by Elsevier B.V. All rights reserved.
A Novel Database to Rank and Display Archeomagnetic Intensity Data
NASA Astrophysics Data System (ADS)
Donadini, F.; Korhonen, K.; Riisager, P.; Pesonen, L. J.; Kahma, K.
2005-12-01
To understand the content and the causes of the changes in the Earth's magnetic field beyond the observatory records one has to rely on archeomagnetic and lake sediment paleomagnetic data. The regional archeointensity curves are often of different quality and temporally variable, which hampers the global analysis of the data in terms of dipole vs non-dipole field. We have developed a novel archeointensity database application utilizing MySQL, PHP (PHP Hypertext Preprocessor), and the Generic Mapping Tools (GMT) for ranking and displaying geomagnetic intensity data from the last 12,000 years. Our application has the advantage that no specific software is required to query the database and view the results. Querying the database is performed using any Web browser; a fill-out form is used to enter the site location and a minimum ranking value to select the data points to be displayed. The form also features the possibility to select plotting of the data as an archeointensity curve with error bars, and a Virtual Axial Dipole Moment (VADM) or ancient field value (Ba) curve calculated using the CALS7K model (Continuous Archaeomagnetic and Lake Sediment geomagnetic model) of Korte and Constable (2005). The results of a query are displayed on a Web page containing a table summarizing the query parameters, a table showing the archeointensity values satisfying the query parameters, and a plot of VADM or Ba as a function of sample age. The database consists of eight related tables. The main one, INTENSITIES, stores the 3704 archeointensity measurements collected from 159 publications as VADM (and VDM when available) and Ba values, including their standard deviations and sampling locations. It also contains the number of samples and specimens measured from each site. The REFS table stores the references to a particular study. The names, latitudes, and longitudes of the regions where the samples were collected are stored in the SITES table. 
The MATERIALS, METHODS, SPECIMEN_TYPES and DATING_METHODS tables store information about the sample materials, intensity determination methods, specimen types and age determination methods. The SIGMA_COUNT table is used indirectly for ranking data according to the number of samples measured and their standard deviations. Each intensity measurement is assigned a score (0-2) depending on the number of specimens measured and their standard deviations, the intensity determination method, the type of specimens measured and materials. The ranking of each data point is calculated as the sum of the four scores and varies between 0 and 8. Additionally, users can select the parameters that will be included in the ranking.
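The ranking arithmetic described in this abstract (four per-criterion scores of 0 to 2, summed to a rank of 0 to 8, with the user choosing which criteria count and a minimum rank for display) can be sketched as follows. The abstract does not give the rules that assign each score, so the scores are taken as inputs here. The criterion names and function names are hypothetical.

```python
# Hypothetical names for the four scored criteria named in the abstract.
CRITERIA = ("sigma_count", "method", "specimen_type", "material")


def ranking(scores: dict, include=CRITERIA) -> int:
    """Sum the selected per-criterion scores (each 0..2) into one rank."""
    total = 0
    for name in include:
        if name not in CRITERIA:
            raise ValueError(f"unknown criterion: {name}")
        score = scores[name]
        if not 0 <= score <= 2:
            raise ValueError(f"score for {name} must be between 0 and 2")
        total += score
    return total


def select_points(points: list, min_rank: int, include=CRITERIA) -> list:
    """Keep the data points whose rank meets the user's minimum value."""
    return [p for p in points if ranking(p["scores"], include) >= min_rank]
```

With all four criteria included the rank spans 0 to 8, and dropping a criterion lowers the maximum by 2, so a minimum rank chosen in the query form is only meaningful relative to the criteria selected.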
Cross-Matching Source Observations from the Palomar Transient Factory (PTF)
NASA Astrophysics Data System (ADS)
Laher, Russ; Grillmair, C.; Surace, J.; Monkewitz, S.; Jackson, E.
2009-01-01
Over the four-year lifetime of the PTF project, approximately 40 billion instances of astronomical-source observations will be extracted from the image data. The instances will correspond to the same astronomical objects being observed at roughly 25-50 different times, and so a very large catalog containing important object-variability information will be the chief PTF product. Organizing astronomical-source catalogs is conventionally done by dividing the catalog into declination zones and sorting by right ascension within each zone (e.g., the USNOA star catalog), in order to facilitate catalog searches. This method was reincarnated as the "zones" algorithm in a SQL-Server database implementation (Szalay et al., MSR-TR-2004-32), with corrections given by Gray et al. (MSR-TR-2006-52). The primary advantage of this implementation is that all of the work is done entirely on the database server and client/server communication is eliminated. We implemented the methods outlined in Gray et al. for a PostgreSQL database. We programmed the methods as database functions in the PL/pgSQL procedural language. The cross-matching is currently based on source positions, but we intend to extend it to use both positions and positional uncertainties to form a chi-square statistic for optimal thresholding. The database design includes three main tables, plus a handful of internal tables. The Sources table stores the SExtractor source extractions taken at various times; the MergedSources table stores statistics about the astronomical objects, which are the result of cross-matching records in the Sources table; and the Merges table associates cross-matched primary keys in the Sources table with primary keys in the MergedSources table. Besides judicious database indexing, we have also internally partitioned the Sources table by declination zone, in order to speed up the population of Sources records and make the database more manageable. 
The catalog will be accessible to the public after the proprietary period through IRSA (irsa.ipac.caltech.edu).
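The declination-zone idea behind the cross-matching can be sketched outside the database. This is a simplified, in-memory illustration, not the authors' PL/pgSQL code: it buckets sources by declination zone and checks the exact angular separation only in the zones a search circle can touch. The right-ascension range pruning within each zone (and its wraparound at 0/360 degrees), which the full zones algorithm also uses, is omitted here. Function names and the tuple layout are assumptions.

```python
import math
from collections import defaultdict


def zone_id(dec_deg: float, zone_height_deg: float) -> int:
    """Index of the declination zone containing dec_deg (zone 0 starts at -90)."""
    return int(math.floor((dec_deg + 90.0) / zone_height_deg))


def angular_sep_deg(ra1, dec1, ra2, dec2) -> float:
    """Great-circle separation in degrees (haversine formula)."""
    r1, d1, r2, d2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((d2 - d1) / 2) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def build_zone_index(sources, zone_height_deg):
    """Group (source_id, ra, dec) tuples by zone, sorted by RA within each zone."""
    zones = defaultdict(list)
    for source_id, ra, dec in sources:
        zones[zone_id(dec, zone_height_deg)].append((ra, dec, source_id))
    for members in zones.values():
        members.sort()
    return zones


def cross_match(ra, dec, zones, zone_height_deg, radius_deg):
    """Return ids of indexed sources within radius_deg of (ra, dec)."""
    lo = zone_id(max(dec - radius_deg, -90.0), zone_height_deg)
    hi = zone_id(min(dec + radius_deg, 90.0), zone_height_deg)
    matches = []
    for z in range(lo, hi + 1):
        for ra2, dec2, source_id in zones.get(z, ()):
            if angular_sep_deg(ra, dec, ra2, dec2) <= radius_deg:
                matches.append(source_id)
    return matches
```

Because only the zones overlapping the search radius are scanned, the cost of a match depends on the local source density rather than on the size of the whole catalog, which is also the motivation for partitioning a table by zone.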
Windsor, J S; Rodway, G W; Middleton, P M; McCarthy, S
2006-01-01
Objective The emergence of a new generation of “point‐and‐shoot” digital cameras offers doctors a compact, portable and user‐friendly solution to the recording of highly detailed digital photographs and video images. This work highlights the use of such technology, and provides information for those who wish to record, store and display their own medical images. Methods Over a 3‐month period, a digital camera was carried by a doctor in a busy, adult emergency department and used to record a range of clinical images that were subsequently transferred to a computer database. Results In total, 493 digital images were recorded, of which 428 were photographs and 65 were video clips. These were successfully used for teaching purposes, publications and patient records. Conclusions This study highlights the importance of informed consent, the selection of a suitable package of digital technology and the role of basic photographic technique in developing a successful digital database in a busy clinical environment. PMID:17068281
NASA Astrophysics Data System (ADS)
Xu, Mingzhu; Gao, Zhiqiang; Ning, Jicai
2014-10-01
To improve the access efficiency of geoscience data, efficient data model and storage solutions should be used. Geoscience data are usually classified by format or coordinate system in existing storage solutions. When the data volume is large, this is not conducive to searching geographic features. In this study, a geographical information integration system for Shandong province, China was developed based on ArcGIS Engine, .NET, and SQL Server. It uses the Geodatabase spatial data model and ArcSDE to organize and store spatial and attribute data and establishes a geoscience database of Shandong. Seven function modules were designed: map browse, database and subject management, layer control, map query, spatial analysis and map symbolization. Because the system can be browsed and managed by geoscience subject, it is convenient for geographic researchers and decision-making departments to use the data.
BIND: the Biomolecular Interaction Network Database
Bader, Gary D.; Betel, Doron; Hogue, Christopher W. V.
2003-01-01
The Biomolecular Interaction Network Database (BIND: http://bind.ca) archives biomolecular interaction, complex and pathway information. A web-based system is available to query, view and submit records. BIND continues to grow with the addition of individual submissions as well as interaction data from the PDB and a number of large-scale interaction and complex mapping experiments using yeast two hybrid, mass spectrometry, genetic interactions and phage display. We have developed a new graphical analysis tool that provides users with a view of the domain composition of proteins in interaction and complex records to help relate functional domains to protein interactions. An interaction network clustering tool has also been developed to help focus on regions of interest. Continued input from users has helped further mature the BIND data specification, which now includes the ability to store detailed information about genetic interactions. The BIND data specification is available as ASN.1 and XML DTD. PMID:12519993
Mariner, R.H.; Venezky, D.Y.; Hurwitz, S.
2006-01-01
Chemical and isotope data accumulated by two USGS Projects (led by I. Barnes and R. Mariner) over a time period of about 40 years can now be found using a basic web search or through an image search (left). The data are primarily chemical and isotopic analyses of waters (thermal, mineral, or fresh) and associated gas (free and/or dissolved) collected from hot springs, mineral springs, cold springs, geothermal wells, fumaroles, and gas seeps. Additional information is available about the collection methods and analysis procedures. The chemical and isotope data are stored in a MySQL database and accessed using PHP from a basic search form below. Data can also be accessed using an Open Source GIS called WorldKit by clicking on the image to the left. Additional information is available about WorldKit including the files used to set up the site.
Extending the data dictionary for data/knowledge management
NASA Technical Reports Server (NTRS)
Hydrick, Cecile L.; Graves, Sara J.
1988-01-01
Current relational database technology provides the means for efficiently storing and retrieving large amounts of data. By combining techniques learned from the field of artificial intelligence with this technology, it is possible to expand the capabilities of such systems. This paper suggests using the expanded domain concept, an object-oriented organization, and the storing of knowledge rules within the relational database as a solution to the unique problems associated with CAD/CAM and engineering data.
Why Save Your Course as a Relational Database?
ERIC Educational Resources Information Center
Hamilton, Gregory C.; Katz, David L.; Davis, James E.
2000-01-01
Describes a system that stores course materials for computer-based training programs in a relational database called Of Course! Outlines the basic structure of the databases; explains distinctions between Of Course! and other authoring languages; and describes how data is retrieved from the database and presented to the student. (Author/LRW)
ERIC Educational Resources Information Center
Moore, Pam
2010-01-01
The Internet and electronic commerce (e-commerce) generate lots of data. Data must be stored, organized, and managed. Database administrators, or DBAs, work with database software to find ways to do this. They identify user needs, set up computer databases, and test systems. They ensure that systems perform as they should and add people to the…
John F. Caratti
2006-01-01
The FIREMON database software allows users to enter data, store, analyze, and summarize plot data, photos, and related documents. The FIREMON database software consists of a Java application and a Microsoft® Access database. The Java application provides the user interface with FIREMON data through data entry forms, data summary reports, and other data management tools...
NASA Technical Reports Server (NTRS)
Huber, P. D.; Gallagher, J. P.
1994-01-01
This report describes the organization, format and content of the NASA Johnson damage tolerant database which was created to store damage tolerant property data for non-aerospace structural materials. The database is designed to store fracture toughness data (K(sub IC), K(sub c), J(sub IC) and CTOD(sub IC)), resistance curve data (K(sub R) vs. delta a (sub eff) and JR vs. delta a (sub eff)), as well as subcritical crack growth data (a vs. N and da/dN vs. delta K). The database contains complementary material property data for both stainless and alloy steels, as well as for aluminum, nickel, and titanium alloys which were not incorporated into the Damage Tolerant Design Handbook database.
High Performance Semantic Factoring of Giga-Scale Semantic Graph Databases
DOE Office of Scientific and Technical Information (OSTI.GOV)
Joslyn, Cliff A.; Adolf, Robert D.; Al-Saffar, Sinan
2010-10-04
As semantic graph database technology grows to address components ranging from extant large triple stores to SPARQL endpoints over SQL-structured relational databases, it will become increasingly important to be able to bring high performance computational resources to bear on their analysis, interpretation, and visualization, especially with respect to their innate semantic structure. Our research group built a novel high performance hybrid system comprising computational capability for semantic graph database processing utilizing the large multithreaded architecture of the Cray XMT platform, conventional clusters, and large data stores. In this paper we describe that architecture, and present the results of our deploying that for the analysis of the Billion Triple dataset with respect to its semantic factors.
Shark: SQL and Analytics with Cost-Based Query Optimization on Coarse-Grained Distributed Memory
2014-01-13
RDBMS and contains a database (often MySQL or Derby) with a namespace for tables, table metadata and partition information. Table data is stored in an...serialization/deserialization) Java interface implementations with corresponding object inspectors. The Hive driver controls the processing of queries, coordinat...native API, RDD operations are invoked through a functional interface similar to DryadLINQ [32] in Scala, Java or Python. For example, the Scala code for
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kinnan, Mark K.; Valerio, Richard Arthur; Flanagan, Tatiana Paz
2016-12-01
This report gives introductory guidance on the level of effort required to create a data warehouse for mining data. Numerous tutorials have been provided to demonstrate the process of downloading raw data, processing the raw data, and importing the data into a PostgreSQL database. Additional information and a tutorial have been provided on setting up a Hadoop cluster for storing vast amounts of data. This report has been generated as a deliverable for a New Mexico Small Business Assistance (NMSBA) project.
Development of a Database Design for Serials Control in the Defense Communications Agency.
1985-10-21
USERS 0 UNCLASSIFIED 22s. NAME OF RESPONSIBLE INDIVIDUAL 22b. TELEPHONE NUMBER 22c. OFFICE SYMBOL (Include Area Code) GUERRIERO, Donald A. (703) 692-0373...effective control over the subscriptions. Studies had shown that an automated system was both possible and desirable from a technical and management...way, and to determine what software and hardware could be used to enter, store, and retrieve the information. This study showed the logical structure
Biological sequence compression algorithms.
Matsumoto, T; Sadakane, K; Imai, H
2000-01-01
Today, more and more DNA sequences are becoming available. The information about DNA sequences is stored in molecular biology databases. The size and importance of these databases will continue to grow; therefore, this information must be stored or communicated efficiently. Furthermore, sequence compression can be used to define similarities between biological sequences. Standard compression algorithms such as gzip or compress cannot compress DNA sequences, but only expand them in size. On the other hand, CTW (the Context Tree Weighting method) can compress DNA sequences to less than two bits per symbol. These algorithms do not use the special structures of biological sequences. Two characteristic structures of DNA sequences are known: one is called palindromes or reverse complements, and the other is approximate repeats. Several algorithms specific to DNA sequences that use these structures can compress them to less than two bits per symbol. In this paper, we improve CTW so that it can exploit the characteristic structures of DNA sequences. Before encoding the next symbol, the algorithm searches for an approximate repeat and a palindrome using hashing and dynamic programming. If there is a palindrome or an approximate repeat of sufficient length, our algorithm represents it by its length and distance. With this preprocessing, the new program achieves a slightly higher compression ratio than existing DNA-oriented compression algorithms. We also describe a new compression algorithm for protein sequences.
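The "two bits per symbol" figure is the baseline that any DNA compressor has to beat, and the (length, distance) representation of a reverse complement is easy to show on a toy input. The sketch below is not the authors' algorithm: it contains no CTW model, it finds only exact reverse complements, and it uses a brute-force search in place of the hashing and dynamic programming mentioned in the abstract. All function names are hypothetical.

```python
# Fixed 2-bit code for the four nucleotides (the uncompressed baseline).
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASE = "ACGT"
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def pack(seq):
    """Pack a DNA string into bytes at 2 bits per symbol (4 symbols per byte)."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for ch in chunk:
            byte = (byte << 2) | CODE[ch]
        # Left-align a short final chunk so decoding always reads from the top bits.
        byte <<= 2 * (4 - len(chunk))
        out.append(byte)
    return bytes(out)


def unpack(data, length):
    """Inverse of pack; `length` is the original number of symbols."""
    symbols = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            symbols.append(BASE[(byte >> shift) & 3])
    return "".join(symbols[:length])


def reverse_complement(seq):
    """Reverse the string and swap A<->T and C<->G."""
    return "".join(COMPLEMENT[ch] for ch in reversed(seq))


def find_reverse_complement(history, upcoming, min_len):
    """Find the longest prefix of `upcoming` whose reverse complement occurs
    in `history`. Returns (distance, length), where distance counts back from
    the end of `history` to the start of the match, or None if no prefix of
    at least `min_len` symbols matches. Brute force, for illustration only."""
    for length in range(len(upcoming), min_len - 1, -1):
        target = reverse_complement(upcoming[:length])
        pos = history.rfind(target)
        if pos != -1:
            return len(history) - pos, length
    return None


packed = pack("ACGTAC")
print(len(packed), unpack(packed, 6))
print(find_reverse_complement("GGATCCA", "TGGAT", 3))
```

A pair such as (distance, length) costs a fixed number of bits regardless of how long the match is, which is why long repeats and palindromes can push the average below two bits per symbol.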
Searching Across the International Space Station Databases
NASA Technical Reports Server (NTRS)
Maluf, David A.; McDermott, William J.; Smith, Ernest E.; Bell, David G.; Gurram, Mohana
2007-01-01
Data access in the enterprise generally requires us to combine data from different sources and different formats. It is thus advantageous to focus on the intersection of the knowledge across sources and domains; keeping irrelevant knowledge around only serves to make the integration more unwieldy and more complicated than necessary. This paper proposes a context search over multiple domains that uses context-sensitive queries to support disciplined manipulation of domain knowledge resources. The objective of a context search is to provide the capability for interrogating many domain knowledge resources, which are largely semantically disjoint. The search formally supports the tasks of selecting, combining, extending, specializing, and modifying components from a diverse set of domains. This paper demonstrates a new paradigm in composition of information for enterprise applications. In particular, it discusses an approach to achieving data integration across multiple sources, in a manner that does not require heavy investment in database and middleware maintenance. This lean approach to integration leads to cost-effectiveness and scalability of data integration with an underlying schemaless object-relational database management system. This highly scalable, information-on-demand system framework, called NX-Search, is an implementation of an information system built on NETMARK. NETMARK is a flexible, high-throughput open database integration framework for managing, storing, and searching unstructured or semi-structured arbitrary XML and HTML, used widely at the National Aeronautics and Space Administration (NASA) and in industry.
The composite load spectra project
NASA Technical Reports Server (NTRS)
Newell, J. F.; Ho, H.; Kurth, R. E.
1990-01-01
Probabilistic methods and generic load models capable of simulating the load spectra that are induced in space propulsion system components are being developed. Four engine component types (the transfer ducts, the turbine blades, the liquid oxygen posts and the turbopump oxidizer discharge duct) were selected as representative hardware examples. The composite load spectra that simulate the probabilistic loads for these components are typically used as the input loads for a probabilistic structural analysis. The knowledge-based system approach used for the composite load spectra project provides an ideal environment for incremental development. The intelligent database paradigm employed in developing the expert system provides a smooth coupling between the numerical processing and the symbolic (information) processing. Large volumes of engine load information and engineering data are stored in database format and managed by a database management system. Numerical procedures for probabilistic load simulation and database management functions are controlled by rule modules. Rules were hard-wired as decision trees into rule modules to perform process control tasks. There are modules to retrieve load information and models. There are modules to select loads and models to carry out quick load calculations or make an input file for full duty-cycle time dependent load simulation. The composite load spectra load expert system implemented today is capable of performing intelligent rocket engine load spectra simulation. Further development of the expert system will provide tutorial capability for users to learn from it.
NASA Astrophysics Data System (ADS)
Fletcher, Alex; Yoo, Terry S.
2004-04-01
Public databases today can be constructed with a wide variety of authoring and management structures. The widespread appeal of Internet search engines suggests that public information be made open and available to common search strategies, making accessible information that would otherwise be hidden by the infrastructure and software interfaces of a traditional database management system. We present the construction and organizational details for managing NOVA, the National Online Volumetric Archive. As an archival effort of the Visible Human Project for supporting medical visualization research, archiving 3D multimodal radiological teaching files, and enhancing medical education with volumetric data, our overall database structure is simplified; archives grow by accruing information, but seldom have to modify, delete, or overwrite stored records. NOVA is being constructed and populated so that it is transparent to the Internet; that is, much of its internal structure is mirrored in HTML allowing internet search engines to investigate, catalog, and link directly to the deep relational structure of the collection index. The key organizational concept for NOVA is the Image Content Group (ICG), an indexing strategy for cataloging incoming data as a set structure rather than by keyword management. These groups are managed through a series of XML files and authoring scripts. We cover the motivation for Image Content Groups, their overall construction, authorship, and management in XML, and the pilot results for creating public data repositories using this strategy.
Design and development of an interactive medical teleconsultation system over the World Wide Web.
Bai, J; Zhang, Y; Dai, B
1998-06-01
The objective of the medical teleconsultation system presented in this paper is to demonstrate the use of the World Wide Web (WWW) for telemedicine and interactive medical information exchange. The system, which is developed based on Java, could provide several basic Java tools to fulfill the requirements of medical applications, including a file manager, data tool, bulletin board, and digital audio tool. The digital audio tool uses point-to-point structure to enable two physicians to communicate directly through voice. The others use multipoint structure. The file manager manages the medical images stored in the WWW information server, which come from a hospital database. The data tool supports cooperative operations on the medical data between the participating physicians. The bulletin board enables the users to discuss special cases by writing text on the board, send their personal or group diagnostic reports on the cases, and reorganize the reports and store them in its report file for later use. The system provides a hardware-independent platform for physicians to interact with one another as well as to access medical information over the WWW.
Frank, M S; Dreyer, K
2001-06-01
We describe a virtual web site hosting technology that enables educators in radiology to emblazon and make available for delivery on the world wide web their own interactive educational content, free from dependencies on in-house resources and policies. This suite of technologies includes a graphically oriented software application, designed for the computer novice, to facilitate the input, storage, and management of domain expertise within a database system. The database stores this expertise as choreographed and interlinked multimedia entities including text, imagery, interactive questions, and audio. Case-based presentations or thematic lectures can be authored locally, previewed locally within a web browser, then uploaded at will as packaged knowledge objects to an educator's (or department's) personal web site housed within a virtual server architecture. This architecture can host an unlimited number of unique educational web sites for individuals or departments in need of such service. Each virtual site's content is stored within that site's protected back-end database connected to Internet Information Server (Microsoft Corp, Redmond WA) using a suite of Active Server Page (ASP) modules that incorporate Microsoft's Active Data Objects (ADO) technology. Each person's or department's electronic teaching material appears as an independent web site with different levels of access--controlled by a username-password strategy--for teachers and students. There is essentially no static hypertext markup language (HTML). Rather, all pages displayed for a given site are rendered dynamically from case-based or thematic content that is fetched from that virtual site's database. The dynamically rendered HTML is displayed within a web browser in a Socratic fashion that can assess the recipient's current fund of knowledge while providing instantaneous user-specific feedback. Each site is emblazoned with the logo and identification of the participating institution. 
Individuals with teacher-level access can use a web browser to upload new content as well as manage content already stored on their virtual site. Each virtual site stores, collates, and scores participants' responses to the interactive questions posed on line. This virtual web site strategy empowers the educator with an end-to-end solution for creating interactive educational content and hosting that content within the educator's personalized and protected educational site on the world wide web, thus providing a valuable outlet that can magnify the impact of his or her talents and contributions.
NASA Astrophysics Data System (ADS)
Rack, F. R.
2005-12-01
The Integrated Ocean Drilling Program (IODP: 2003-2013 initial phase) is the successor to the Deep Sea Drilling Project (DSDP: 1968-1983) and the Ocean Drilling Program (ODP: 1985-2003). These earlier scientific drilling programs amassed collections of sediment and rock cores (over 300 kilometers stored in four repositories) and data organized in distributed databases and in print or electronic publications. International members of the IODP have established, through memoranda, the right to have access to: (1) all data, samples, scientific and technical results, all engineering plans, data or other information produced under contract to the program; and, (2) all data from geophysical and other site surveys performed in support of the program which are used for drilling planning. The challenge that faces the individual platform operators and management of IODP is to find the right balance and appropriate synergies among the needs, expectations and requirements of stakeholders. The evolving model for IODP database services consists of the management and integration of data collected onboard the various IODP platforms (including downhole logging and syn-cruise site survey information), legacy data from DSDP and ODP, data derived from post-cruise research and publications, and other IODP-relevant information types, to form a common, program-wide IODP information system (e.g., IODP Portal) which will be accessible to both researchers and the public. The JANUS relational database of ODP was introduced in 1997 and the bulk of ODP shipboard data has been migrated into this system, which is comprised of a relational data model consisting of over 450 tables. The JANUS database includes paleontological, lithostratigraphic, chemical, physical, sedimentological, and geophysical data from a global distribution of sites. 
For ODP Legs 100 through 210, and including IODP Expeditions 301 through 308, JANUS has been used to store data from 233,835 meters of core recovered, which are comprised of 38,039 cores, with 202,281 core sections stored in repositories, which have resulted in the taking of 2,299,180 samples for scientists and other users (http://iodp.tamu.edu/janusweb/general/dbtable.cgi). JANUS and other IODP databases are viewed as components of an evolving distributed network of databases, supported by metadata catalogs and middleware with XML workflows, that are intended to provide access to DSDP/ODP/IODP cores and sample-based data as well as other distributed geoscience data collections (e.g., CHRONOS, PetDB, SedDB). These data resources can be explored through the use of emerging data visualization environments, such as GeoWall, CoreWall (http://www.evl.uic.edu/cavern/corewall), a multi-screen display for viewing cores and related data, GeoWall-2 and LambdaVision, a very-high resolution, networked environment for data exploration and visualization, and others. The U.S. Implementing Organization (USIO) for the IODP, also known as the JOI Alliance, is a partnership between Joint Oceanographic Institutions (JOI), Texas A&M University, and Lamont-Doherty Earth Observatory of Columbia University. JOI is a consortium of 20 premier oceanographic research institutions that serves the U.S. scientific community by leading large-scale, global research programs in scientific ocean drilling and ocean observing. For more than 25 years, JOI has helped facilitate discovery and advance global understanding of the Earth and its oceans through excellence in program management.
Design and implementation of the first nationwide, web-based Chinese Renal Data System (CNRDS)
2012-01-01
Background: In April 2010, with an endorsement from the Ministry of Health of the People's Republic of China, the Chinese Society of Nephrology launched the first nationwide, web-based prospective renal data registration platform, the Chinese Renal Data System (CNRDS), to collect structured demographic, clinical, and laboratory data for dialysis cases, as well as to establish a kidney disease database for researchers and policy makers. Methods: The CNRDS program uses information technology to help healthcare professionals create a blood purification registry and deliver an evidence-based care and education protocol tailored to chronic kidney disease, as well as an online forum for communication between nephrologists. The online portal https://www.cnrds.net is implemented as a Java web application using an Apache Tomcat web server and a MySQL database. All data are stored in a central databank to establish a Chinese renal database for research and publication purposes. Results: Currently, over 270,000 clinical cases, including general patient information, diagnostics, therapies, medications, and laboratory tests, have been registered in CNRDS by 3,669 healthcare institutions qualified for hemodialysis therapy. At the 2011 annual blood purification forum of the Chinese Society of Nephrology, the CNRDS 2010 annual report was reviewed and accepted by the society members and government representatives. Conclusions: CNRDS is the first national, web-based application for collecting and managing electronic medical records of patients with dialysis in China. It both provides an easily accessible platform for nephrologists to store and organize their patient data and acts as a communication platform among participating doctors.
Moreover, it is the largest database for treatment and patient care of end-stage renal disease (ESRD) patients in China, which will be beneficial for scientific research and epidemiological investigations aimed at improving the quality of life of such patients. Furthermore, it is a model nationwide disease registry, which could potentially be used for other diseases. PMID:22369692
Design and implementation of the first nationwide, web-based Chinese Renal Data System (CNRDS).
Xie, Fengbo; Zhang, Dong; Wu, Jinzhao; Zhang, Yunfeng; Yang, Qing; Sun, Xuefeng; Cheng, Jing; Chen, Xiangmei
2012-02-28
In April 2010, with an endorsement from the Ministry of Health of the People's Republic of China, the Chinese Society of Nephrology launched the first nationwide, web-based prospective renal data registration platform, the Chinese Renal Data System (CNRDS), to collect structured demographic, clinical, and laboratory data for dialysis cases, as well as to establish a kidney disease database for researchers and policy makers. The CNRDS program uses information technology to help healthcare professionals create a blood purification registry and deliver an evidence-based care and education protocol tailored to chronic kidney disease, as well as an online forum for communication between nephrologists. The online portal https://www.cnrds.net is implemented as a Java web application using an Apache Tomcat web server and a MySQL database. All data are stored in a central databank to establish a Chinese renal database for research and publication purposes. Currently, over 270,000 clinical cases, including general patient information, diagnostics, therapies, medications, and laboratory tests, have been registered in CNRDS by 3,669 healthcare institutions qualified for hemodialysis therapy. At the 2011 annual blood purification forum of the Chinese Society of Nephrology, the CNRDS 2010 annual report was reviewed and accepted by the society members and government representatives. CNRDS is the first national, web-based application for collecting and managing electronic medical records of patients with dialysis in China. It both provides an easily accessible platform for nephrologists to store and organize their patient data and acts as a communication platform among participating doctors. Moreover, it is the largest database for treatment and patient care of end-stage renal disease (ESRD) patients in China, which will be beneficial for scientific research and epidemiological investigations aimed at improving the quality of life of such patients.
Furthermore, it is a model nationwide disease registry, which could potentially be used for other diseases.
Manrow, Richard E; Beckwith, Margaret; Johnson, Lenora E
2014-03-01
In the National Cancer Act of 1971, the Director of the National Cancer Institute (NCI) was given a mandate to "Collect, analyze, and disseminate all data useful in the prevention, diagnosis, and treatment of cancer, including the establishment of an International Cancer Research Data Bank (ICRDB) to collect, catalog, store, and disseminate insofar as feasible the results of cancer research undertaken in any country for the use of any person involved in cancer research in any country" (National Cancer Act of 1971, S 1828, 92nd Congress, 1st Sess (1971)). In subsequent legislation, the audience for NCI's information dissemination activities was expanded to include physicians and other healthcare professionals, patients and their families, and the general public, in addition to cancer researchers. The Institute's response to these legislative requirements was to create what is now known as the Physician Data Query (PDQ®) cancer information database. From its beginnings in 1977 as a database of NCI-sponsored cancer clinical trials, PDQ has grown to include extensive information about cancer treatment, screening, prevention, supportive and palliative care, genetics, drugs, and more. Herein, we describe the history, editorial processes, influence, and global reach of one component of the PDQ database, namely its evidence-based cancer information summaries for health professionals. These summaries are widely recognized as important cancer information and education resources, and they further serve as foundational documents for the development of other cancer information products by NCI and other organizations.
A data model and database for high-resolution pathology analytical image informatics.
Wang, Fusheng; Kong, Jun; Cooper, Lee; Pan, Tony; Kurc, Tahsin; Chen, Wenjin; Sharma, Ashish; Niedermayr, Cristobal; Oh, Tae W; Brat, Daniel; Farris, Alton B; Foran, David J; Saltz, Joel
2011-01-01
The systematic analysis of imaged pathology specimens often results in a vast amount of morphological information at both the cellular and sub-cellular scales. While microscopy scanners and computerized analysis are capable of capturing and analyzing data rapidly, microscopy image data remain underutilized in research and clinical settings. One major obstacle which tends to reduce wider adoption of these new technologies throughout the clinical and scientific communities is the challenge of managing, querying, and integrating the vast amounts of data resulting from the analysis of large digital pathology datasets. This paper presents a data model, which addresses these challenges, and demonstrates its implementation in a relational database system. This paper describes a data model, referred to as Pathology Analytic Imaging Standards (PAIS), and a database implementation, which are designed to support the data management and query requirements of detailed characterization of micro-anatomic morphology through many interrelated analysis pipelines on whole-slide images and tissue microarrays (TMAs). (1) Development of a data model capable of efficiently representing and storing virtual slide related image, annotation, markup, and feature information. (2) Development of a database, based on the data model, capable of supporting queries for data retrieval based on analysis and image metadata, queries for comparison of results from different analyses, and spatial queries on segmented regions, features, and classified objects. The work described in this paper is motivated by the challenges associated with characterization of micro-scale features for comparative and correlative analyses involving whole-slide tissue images and TMAs. Technologies for digitizing tissues have advanced significantly in the past decade. Slide scanners are capable of producing high-magnification, high-resolution images from whole slides and TMAs within several minutes.
Hence, it is becoming increasingly feasible for basic, clinical, and translational research studies to produce thousands of whole-slide images. Systematic analysis of these large datasets requires efficient data management support for representing and indexing results from hundreds of interrelated analyses generating very large volumes of quantifications such as shape and texture and of classifications of the quantified features. We have designed a data model and a database to address the data management requirements of detailed characterization of micro-anatomic morphology through many interrelated analysis pipelines. The data model represents virtual slide related image, annotation, markup and feature information. The database supports a wide range of metadata and spatial queries on images, annotations, markups, and features. We currently have three databases running on a Dell PowerEdge T410 server with CentOS 5.5 Linux operating system. The database server is IBM DB2 Enterprise Edition 9.7.2. The set of databases consists of 1) a TMA database containing image analysis results from 4740 cases of breast cancer, with 641 MB storage size; 2) an algorithm validation database, which stores markups and annotations from two segmentation algorithms and two parameter sets on 18 selected slides, with 66 GB storage size; and 3) an in silico brain tumor study database comprising results from 307 TCGA slides, with 365 GB storage size. The latter two databases also contain human-generated annotations and markups for regions and nuclei. Modeling and managing pathology image analysis results in a database provide immediate benefits on the value and usability of data in a research study. The database provides powerful query capabilities, which are otherwise difficult or cumbersome to support by other approaches such as programming languages. 
Standardized, semantic annotated data representation and interfaces also make it possible to more efficiently share image data and analysis results.
BNDB - the Biochemical Network Database.
Küntzer, Jan; Backes, Christina; Blum, Torsten; Gerasch, Andreas; Kaufmann, Michael; Kohlbacher, Oliver; Lenhof, Hans-Peter
2007-10-02
Technological advances in high-throughput techniques and efficient data acquisition methods have resulted in a massive amount of life science data. The data is stored in numerous databases that have been established over the last decades and are essential resources for scientists nowadays. However, the diversity of the databases and the underlying data models make it difficult to combine this information for solving complex problems in systems biology. Currently, researchers typically have to browse several, often highly focused, databases to obtain the required information. Hence, there is a pressing need for more efficient systems for integrating, analyzing, and interpreting these data. The standardization and virtual consolidation of the databases is a major challenge resulting in a unified access to a variety of data sources. We present the Biochemical Network Database (BNDB), a powerful relational database platform, allowing a complete semantic integration of an extensive collection of external databases. BNDB is built upon a comprehensive and extensible object model called BioCore, which is powerful enough to model most known biochemical processes and at the same time easily extensible to be adapted to new biological concepts. Besides a web interface for the search and curation of the data, a Java-based viewer (BiNA) provides a powerful platform-independent visualization and navigation of the data. BiNA uses sophisticated graph layout algorithms for an interactive visualization and navigation of BNDB. BNDB allows a simple, unified access to a variety of external data sources. Its tight integration with the biochemical network library BN++ offers the possibility for import, integration, analysis, and visualization of the data. BNDB is freely accessible at http://www.bndb.org.
Providing R-Tree Support for MongoDB
NASA Astrophysics Data System (ADS)
Xiang, Longgang; Shao, Xiaotian; Wang, Dehao
2016-06-01
Supporting large amounts of spatial data is a significant characteristic of modern databases. However, unlike some mature relational databases, such as Oracle and PostgreSQL, most of the current burgeoning NoSQL databases are not well designed for storing geospatial data, which is becoming increasingly important in various fields. In this paper, we propose a novel method to provide an R-tree index, as well as corresponding spatial range query and nearest neighbour query functions, for MongoDB, one of the most prevalent NoSQL databases. First, after in-depth analysis of MongoDB's features, we devise an efficient tabular document structure which flattens the R-tree index into MongoDB collections. Further, the relevant mechanisms of R-tree operations are described, and then we discuss in detail how to integrate the R-tree into MongoDB. Finally, we present the experimental results which show that our proposed method outperforms the built-in spatial index of MongoDB. Our research will greatly facilitate big data management issues with MongoDB in a variety of geospatial information applications.
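Flattening an R-tree into a collection means storing each tree node as one document and replacing child pointers with document identifiers. The abstract does not publish the document layout, so the field names below (`_id`, `leaf`, `entries`, `mbr`, `ref`) are assumptions, and a plain Python dict stands in for a MongoDB collection so that the sketch runs without a database server.

```python
def intersects(a, b):
    """True if two rectangles [xmin, ymin, xmax, ymax] overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def range_query(nodes, root_id, window):
    """Return ids of all objects whose bounding rectangle intersects `window`.

    `nodes` maps a node id to a node document. In a real deployment each
    lookup `nodes[node_id]` would be a fetch of one document by its id.
    """
    hits = []
    stack = [root_id]
    while stack:
        node = nodes[stack.pop()]
        for entry in node["entries"]:
            if not intersects(entry["mbr"], window):
                continue  # prune this whole subtree or object
            if node["leaf"]:
                hits.append(entry["ref"])   # ref is an object id
            else:
                stack.append(entry["ref"])  # ref is a child node id
    return sorted(hits)


# A two-level tree: one root document and two leaf documents.
nodes = {
    "root": {"_id": "root", "leaf": False, "entries": [
        {"mbr": [0, 0, 5, 5], "ref": "n1"},
        {"mbr": [6, 6, 10, 10], "ref": "n2"},
    ]},
    "n1": {"_id": "n1", "leaf": True, "entries": [
        {"mbr": [1, 1, 2, 2], "ref": "park"},
        {"mbr": [4, 4, 5, 5], "ref": "school"},
    ]},
    "n2": {"_id": "n2", "leaf": True, "entries": [
        {"mbr": [6, 6, 7, 7], "ref": "lake"},
        {"mbr": [9, 9, 10, 10], "ref": "mall"},
    ]},
}

print(range_query(nodes, "root", [0, 0, 3, 3]))
print(range_query(nodes, "root", [4.5, 4.5, 6.5, 6.5]))
```

The first query visits only the `n1` leaf because the bounding rectangle of `n2` does not intersect the window; this pruning is what lets an R-tree avoid scanning every object.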
NASA Astrophysics Data System (ADS)
Protsyuk, Yu.; Pinigin, G.; Shulga, A.
2005-06-01
Results of the development and organization of the digital database of the Nikolaev Astronomical Observatory (NAO) are presented. At present, three telescopes are connected to the local area network of NAO. All the data obtained, and results of data processing are entered into the common database of NAO. The daily average volume of new astronomical information obtained from the CCD instruments ranges from 300 MB up to 2 GB, depending on the purposes and conditions of observations. The overwhelming majority of the data are stored in the FITS format. Development and further improvement of storage standards, procedures of data handling and data processing are being carried out. It is planned to create an astronomical web portal with the possibility to have interactive access to databases and telescopes. In the future, this resource may become a part of an international virtual observatory. There are the prototypes of search tools with the use of PHP and MySQL. Efforts for getting more links to the Internet are being made.
Frishkoff, Gwen; Sydes, Jason; Mueller, Kurt; Frank, Robert; Curran, Tim; Connolly, John; Kilborn, Kerry; Molfese, Dennis; Perfetti, Charles; Malony, Allen
2011-01-01
We present MINEMO (Minimal Information for Neural ElectroMagnetic Ontologies), a checklist for the description of event-related potential (ERP) studies. MINEMO extends MINI (Minimal Information for Neuroscience Investigations) to the ERP domain. Checklist terms are explicated in NEMO, a formal ontology that is designed to support ERP data sharing and integration. MINEMO is also linked to an ERP database and web application (the NEMO portal). Users upload their data and enter MINEMO information through the portal. The database then stores these entries in RDF (Resource Description Framework), along with summary metrics, i.e., spatial and temporal metadata. Together these spatial, temporal, and functional metadata provide a complete description of ERP data and the context in which these data were acquired. The RDF files then serve as inputs to ontology-based labeling and meta-analysis. Our ultimate goal is to represent ERPs using a rich semantic structure, so results can be queried at multiple levels, to stimulate novel hypotheses and to promote a high-level, integrative account of ERP results across diverse study methods and paradigms. PMID:22180824
Inoue, Masashi; Hasegawa, Shinsaku; Suyama, Akihiko; Meshitsuka, Shunsuke
2003-11-01
Infectious disease surveillance schemes have been established to detect infectious disease outbreak in the early stages, to identify the causative viral strains, and to rapidly assess related morbidity and mortality. To make a scheme function well, two things are required. Firstly, it must have sufficient sensitivity and be timely to guarantee as short a delay as possible from collection to redistribution of information. Secondly, it must provide a good representation of the results of the surveillance. To do this, we have developed a database system that can redistribute the information via the Internet. The feature of this system is to automatically generate the graphic images based on the numerical data stored in the database by using Hypertext Preprocessor (PHP) script and Graphics Drawing (GD) library. It dynamically displays the information as a map or bar chart as well as a numerical impression according to the real time demand of the users. This system will be a useful tool for medical personnel and researchers working on infectious disease problems and will save significant time in the redistribution of information.
Coplen, Tyler B.
2000-01-01
The reliability and accuracy of isotopic data can be improved by utilizing database software to (i) store information about samples, (ii) store the results of mass spectrometric isotope-ratio analyses of samples, (iii) calculate analytical results using standardized algorithms stored in a database, (iv) normalize stable isotopic data to international scales using isotopic reference materials, and (v) generate multi-sheet paper templates for convenient sample loading of automated mass-spectrometer sample preparation manifolds. Such a database program, the Laboratory Information Management System (LIMS) for Light Stable Isotopes, is presented herein. Major benefits of this system include (i) a dramatic improvement in quality assurance, (ii) an increase in laboratory efficiency, (iii) a reduction in workload due to the elimination or reduction of retyping of data by laboratory personnel, and (iv) a decrease in errors in data reported to sample submitters. Such a database provides a complete record of when and how often laboratory reference materials have been analyzed and provides a record of what correction factors have been used through time. It provides an audit trail for laboratories. LIMS for Light Stable Isotopes is available for both Microsoft Office 97 Professional and Microsoft Office 2000 Professional as versions 7 and 8, respectively. Both source code (mdb file) and precompiled executable files (mde) are available. Numerous improvements have been made for continuous flow isotopic analysis in this version (specifically 7.13 for Microsoft Access 97 and 8.13 for Microsoft Access 2000). It is much easier to import isotopic results from Finnigan ISODAT worksheets, even worksheets on which corrections for amount of sample (linearity corrections) have been added. The capability to determine blank corrections using isotope mass balance from analyses of elemental analyzer samples has been added. 
It is now possible to calculate and apply drift corrections to isotopic data based on the time of day of analysis. Whereas Finnigan ISODAT software is confined to using only a single peak for calculating delta values, LIMS now enables one to use the mean of two or more reference injections during a continuous flow analysis to calculate delta values. This is useful with Finnigan's GasBench II online sample preparation system. Concentrations of carbon, nitrogen, and sulfur can be calculated based on one or more isotopic reference materials analyzed with a group of samples. Both sample data and isotopic analysis data can now be exported to Excel files. A calculator for determining the amount of sample needed for isotopic analysis based on a previous amount of sample and continuous flow area is now an integral part of LIMS for Light Stable Isotopes. LIMS for Light Stable Isotopes can now assign an error code to Finnigan elemental analyzer analyses in which one of the electrometers has saturated due to analysis of too much sample material, giving rise to incorrect isotopic abundances. Information on downloading this report and downloading code and databases is provided at the Internet addresses: http://water.usgs.gov/software/geochemical.html or http://www.geogr.uni-jena.de/software/geochemical.html in the Eastern Hemisphere.
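The abstract mentions normalizing stable isotopic data to international scales with reference materials but does not state the algorithm. As an illustration only, the sketch below assumes a common two-point linear normalization, in which two reference materials with accepted delta values define a straight line that maps measured values onto the scale. The function name and the numbers in the usage lines are hypothetical and are not taken from LIMS.

```python
def two_point_normalization(measured_refs, accepted_refs):
    """Build a function mapping measured delta values onto a reference scale.

    measured_refs: (m1, m2), measured delta values of two reference materials
    accepted_refs: (a1, a2), their accepted delta values on the target scale
    Assumes a linear relation between measured and accepted values.
    """
    (m1, m2), (a1, a2) = measured_refs, accepted_refs
    slope = (a2 - a1) / (m2 - m1)
    intercept = a1 - slope * m1
    return lambda delta: slope * delta + intercept

# Illustrative numbers only: two references measured at +1.0 and -399.0,
# with assumed accepted values of 0.0 and -428.0 (per mil).
normalize = two_point_normalization((1.0, -399.0), (0.0, -428.0))
sample_on_scale = normalize(-199.0)
```

By construction the two reference materials map exactly onto their accepted values, and every sample in between is stretched or compressed by the same factor.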
Design of Integrated Database on Mobile Information System: A Study of Yogyakarta Smart City App
NASA Astrophysics Data System (ADS)
Nurnawati, E. K.; Ermawati, E.
2018-02-01
An integration database is a database that acts as the data store for multiple applications and thus integrates data across these applications (in contrast to an application database). An integration database needs a schema that takes all its client applications into account. The benefit of such a schema is that sharing data among applications does not require an extra layer of integration services on the applications. Any changes to data made in a single application are made available to all applications at the time of database commit, thus keeping the applications' data use better synchronized. This study aims to design and build an integrated database that can be used by various applications on a mobile-device-based platform for a smart city system. The resulting database can be used by various applications, whether together or separately. The design and development of the database emphasize flexibility, security, and completeness of the attributes that can be shared by the various applications to be built. The method used in this study is to choose the appropriate logical database structure (patterns of data) and to build the relational database models (database design). The resulting design is then tested with some prototype apps, and system performance is analyzed with test data. The integrated database can be used by both the admin and the user in an integral and comprehensive platform. This system can help admins, managers, and operators manage the application easily and efficiently. The Android-based app is built on a dynamic client-server model in which data is extracted from an external MySQL database, so if data in the database changes, the data in the Android application also changes. The app assists users in searching for information related to Yogyakarta (as a smart city), especially on culture, government, hotels, and transportation.
Automatic public access to documents and maps stored on an internal secure system.
NASA Astrophysics Data System (ADS)
Trench, James; Carter, Mary
2013-04-01
The Geological Survey of Ireland operates a Document Management System that passes documents and maps, stored internally in high resolution in a highly secure environment, to an external service where they are automatically presented in a lower resolution to members of the public. Security is implemented through roles and individual users, where role level and folder level can be set. The application is an electronic document/data management (EDM) system with an integrated Geographical Information System (GIS) component that allows users to query an interactive map of Ireland for data relating to a particular area of interest. The data stored in the database consist of Bedrock Field Sheets, Bedrock Notebooks, Bedrock Maps, Geophysical Surveys, Geotechnical Maps & Reports, Groundwater, GSI Publications, Marine, Mine Records, Mineral Localities, Open File, Quaternary and Unpublished Reports. The Konfig application tool is both an internal and a public-facing application. Internally, it acts as a tool for high-resolution data entry, with the entries stored in a high-resolution vault. The public-facing application is a mirror of the internal application and differs only in that it converts the high-resolution data into a low-resolution format, which is stored in a low-resolution vault, thus making the data web-friendly for the end user to download.
Glanz, Karen; Johnson, Lauren; Yaroch, Amy L; Phillips, Matthew; Ayala, Guadalupe X; Davis, Erica L
2016-04-01
This review describes available measures of retail food store environments, including data collection methods, characteristics of measures, the dimensions most commonly captured across methods, and their strengths and limitations. Articles were included if they were published between 1990 and 2015 in an English-language peer-reviewed journal and presented original research findings on the development and/or use of a measure or method to assess retail food store environments. Four sources were used, including literature databases, backward searching of identified articles, published reviews, and measurement registries. From 3,013 citations identified, 125 observational studies and 5 studies that used sales records were reviewed in-depth. Most studies were cross-sectional and based in the US. The most common tools used were the US Department of Agriculture's Thrifty Food Plan and the Nutrition Environment Measures Survey for Stores. The most common attribute captured was availability of healthful options, followed by price. Measurement quality indicators were minimal and focused mainly on assessments of reliability. Two widely used tools to measure retail food store environments are available and can be refined and adapted. Standardization of measurement across studies and reports of measurement quality (eg, reliability, validity) may better inform practice and policy changes. Copyright © 2016 Society for Nutrition Education and Behavior. Published by Elsevier Inc. All rights reserved.
SPSmart: adapting population based SNP genotype databases for fast and comprehensive web access.
Amigo, Jorge; Salas, Antonio; Phillips, Christopher; Carracedo, Angel
2008-10-10
In the last five years large online resources of human variability have appeared, notably HapMap, Perlegen and the CEPH foundation. These databases of genotypes with population information act as catalogues of human diversity, and are widely used as reference sources for population genetics studies. Although many useful conclusions may be extracted by querying databases individually, the lack of flexibility for combining data from within and between each database does not allow the calculation of key population variability statistics. We have developed a novel tool for accessing and combining large-scale genomic databases of single nucleotide polymorphisms (SNPs) in widespread use in human population genetics: SPSmart (SNPs for Population Studies). A fast pipeline creates and maintains a data mart from the most commonly accessed databases of genotypes containing population information: data is mined, summarized into the standard statistical reference indices, and stored into a relational database that currently handles as many as 4 × 10^9 genotypes and that can be easily extended to new database initiatives. We have also built a web interface to the data mart that allows the browsing of underlying data indexed by population and the combining of populations, allowing intuitive and straightforward comparison of population groups. All the information served is optimized for web display, and most of the computations are already pre-processed in the data mart to speed up the data browsing and any computational treatment requested. In practice, SPSmart allows populations to be combined into user-defined groups, while multiple databases can be accessed and compared in a few simple steps from a single query. It performs the queries rapidly and gives straightforward graphical summaries of SNP population variability through visual inspection of allele frequencies outlined in standard pie-chart format. 
In addition, full numerical description of the data is output in statistical results panels that include common population genetics metrics such as heterozygosity, Fst and In.
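The abstract names heterozygosity and Fst among the reported metrics without giving formulas. The sketch below uses textbook definitions for a biallelic SNP, assuming equally weighted populations: expected heterozygosity H = 2p(1 - p), and Fst = (H_T - H_S) / H_T. The estimators SPSmart actually implements may differ, and the function names are hypothetical.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity of a biallelic SNP with allele frequency p."""
    return 2.0 * p * (1.0 - p)

def fst(freqs):
    """Fst from per-population allele frequencies, populations weighted equally.

    H_T is the heterozygosity of the pooled (mean) frequency and
    H_S is the mean of the within-population heterozygosities.
    """
    p_bar = sum(freqs) / len(freqs)
    h_t = expected_heterozygosity(p_bar)
    if h_t == 0.0:
        return 0.0  # SNP is monomorphic overall, no differentiation to measure
    h_s = sum(expected_heterozygosity(p) for p in freqs) / len(freqs)
    return (h_t - h_s) / h_t
```

Identical frequencies in all populations give Fst = 0, and populations fixed for different alleles give Fst = 1, which brackets the values seen when user-defined population groups are compared.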
Genomics Portals: integrative web-platform for mining genomics data.
Shinde, Kaustubh; Phatak, Mukta; Johannes, Freudenberg M; Chen, Jing; Li, Qian; Vineet, Joshi K; Hu, Zhen; Ghosh, Krishnendu; Meller, Jaroslaw; Medvedovic, Mario
2010-01-13
A large amount of experimental data generated by modern high-throughput technologies is available through various public repositories. Our knowledge about molecular interaction networks, functional biological pathways and transcriptional regulatory modules is rapidly expanding, and is being organized in lists of functionally related genes. Jointly, these two sources of information hold a tremendous potential for gaining new insights into functioning of living systems. Genomics Portals platform integrates access to an extensive knowledge base and a large database of human, mouse, and rat genomics data with basic analytical visualization tools. It provides the context for analyzing and interpreting new experimental data and the tool for effective mining of a large number of publicly available genomics datasets stored in the back-end databases. The uniqueness of this platform lies in the volume and the diversity of genomics data that can be accessed and analyzed (gene expression, ChIP-chip, ChIP-seq, epigenomics, computationally predicted binding sites, etc), and the integration with an extensive knowledge base that can be used in such analysis. The integrated access to primary genomics data, functional knowledge and analytical tools makes Genomics Portals platform a unique tool for interpreting results of new genomics experiments and for mining the vast amount of data stored in the Genomics Portals backend databases. Genomics Portals can be accessed and used freely at http://GenomicsPortals.org.
Genomics Portals: integrative web-platform for mining genomics data
2010-01-01
Background A large amount of experimental data generated by modern high-throughput technologies is available through various public repositories. Our knowledge about molecular interaction networks, functional biological pathways and transcriptional regulatory modules is rapidly expanding, and is being organized in lists of functionally related genes. Jointly, these two sources of information hold a tremendous potential for gaining new insights into functioning of living systems. Results Genomics Portals platform integrates access to an extensive knowledge base and a large database of human, mouse, and rat genomics data with basic analytical visualization tools. It provides the context for analyzing and interpreting new experimental data and the tool for effective mining of a large number of publicly available genomics datasets stored in the back-end databases. The uniqueness of this platform lies in the volume and the diversity of genomics data that can be accessed and analyzed (gene expression, ChIP-chip, ChIP-seq, epigenomics, computationally predicted binding sites, etc), and the integration with an extensive knowledge base that can be used in such analysis. Conclusion The integrated access to primary genomics data, functional knowledge and analytical tools makes Genomics Portals platform a unique tool for interpreting results of new genomics experiments and for mining the vast amount of data stored in the Genomics Portals backend databases. Genomics Portals can be accessed and used freely at http://GenomicsPortals.org. PMID:20070909
Footprint Representation of Planetary Remote Sensing Data
NASA Astrophysics Data System (ADS)
Walter, S. H. G.; Gasselt, S. V.; Michael, G.; Neukum, G.
The geometric outline of remote sensing image data, the so-called footprint, can be represented as a number of coordinate tuples. These polygons are associated with corresponding attribute information such as orbit name, ground and image resolution, solar longitude and illumination conditions to generate a powerful base for classification of planetary experiment data. Speed, handling and extended capabilities are the reasons for using geodatabases to store and access these data types. Techniques for such a spatial database of footprint data are demonstrated using the Relational Database Management System (RDBMS) PostgreSQL, spatially enabled by the PostGIS extension. As examples, footprints of the HRSC and OMEGA instruments, both onboard ESA's Mars Express Orbiter, are generated and connected to attribute information. The aim is to provide high-resolution footprints of the OMEGA instrument to the science community for the first time and make them available for web-based mapping applications like the "Planetary Interactive GIS-on-the-Web Analyzable Database" (PIGWAD), produced by the USGS. Map overlays with HRSC or other instruments like MOC and THEMIS (footprint maps are already available for these instruments and can be integrated into the database) allow on-the-fly intersection and comparison as well as extended statistics of the data. Footprint polygons are generated one by one using standard software provided by the instrument teams. Attribute data is calculated and stored together with the geometric information. In the case of HRSC, the coordinates of the footprints are already available in the VICAR label of each image file. Using the VICAR RTL and PostgreSQL's libpq C library they are loaded into the database using the Well-Known Text (WKT) notation by the Open Geospatial Consortium, Inc. (OGC). For the OMEGA instrument, image data is read using IDL routines developed and distributed by the OMEGA team. 
Image outlines are exported together with relevant attribute data to the industry-standard Shapefile format. These files are translated to a Structured Query Language (SQL) command sequence suitable for insertion into the PostGIS/PostgreSQL database using the shp2pgsql data loader provided by the PostGIS software. PostgreSQL's advanced features such as geometry types, rules, operators and functions allow complex spatial queries and on-the-fly processing of data at the DBMS level, e.g. generalisation of the outlines. Processing done by the DBMS, visualisation via GIS systems and utilisation for web-based applications like mapservers will be demonstrated.
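As a rough illustration of the WKT loading step described in this abstract, the sketch below turns a list of coordinate tuples into a WKT polygon and pairs it with a parameterized INSERT that uses the PostGIS function `ST_GeomFromText`. The table name `footprints`, its columns, the orbit label and the SRID value are all hypothetical, and the authors' own loader is written against the VICAR RTL and libpq in C rather than Python.

```python
def footprint_to_wkt(coords):
    """Format (lon, lat) tuples as a WKT POLYGON, closing the ring if needed."""
    ring = list(coords)
    if ring[0] != ring[-1]:
        ring.append(ring[0])  # WKT polygon rings must end on their first point
    points = ", ".join(f"{lon} {lat}" for lon, lat in ring)
    return f"POLYGON(({points}))"

def build_insert(orbit, coords, srid):
    """Return (sql, params) for a parameterized insert; nothing is executed here."""
    sql = ("INSERT INTO footprints (orbit, geom) "
           "VALUES (%s, ST_GeomFromText(%s, %s))")
    return sql, (orbit, footprint_to_wkt(coords), srid)

# Hypothetical orbit label and SRID, for illustration only.
sql, params = build_insert("h0001", [(0, 0), (1, 0), (1, 1), (0, 1)], 4326)
```

Keeping the geometry as a bound parameter rather than pasting it into the SQL string leaves quoting to the database driver.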
NASA Technical Reports Server (NTRS)
Saeed, M.; Lieu, C.; Raber, G.; Mark, R. G.
2002-01-01
Development and evaluation of Intensive Care Unit (ICU) decision-support systems would be greatly facilitated by the availability of a large-scale ICU patient database. Following our previous efforts with the MIMIC (Multi-parameter Intelligent Monitoring for Intensive Care) Database, we have leveraged advances in networking and storage technologies to develop a far more massive temporal database, MIMIC II. MIMIC II is an ongoing effort: data is continuously and prospectively archived from all ICU patients in our hospital. MIMIC II now consists of over 800 ICU patient records including over 120 gigabytes of data and is growing. A customized archiving system was used to store continuously up to four waveforms and 30 different parameters from ICU patient monitors. An integrated user-friendly relational database was developed for browsing of patients' clinical information (lab results, fluid balance, medications, nurses' progress notes). Based upon its unprecedented size and scope, MIMIC II will prove to be an important resource for intelligent patient monitoring research, and will support efforts in medical data mining and knowledge-discovery.
Gene Expression Omnibus (GEO): Microarray data storage, submission, retrieval, and analysis
Barrett, Tanya
2006-01-01
The Gene Expression Omnibus (GEO) repository at the National Center for Biotechnology Information (NCBI) archives and freely distributes high-throughput molecular abundance data, predominantly gene expression data generated by DNA microarray technology. The database has a flexible design that can handle diverse styles of both unprocessed and processed data in a MIAME- (Minimum Information About a Microarray Experiment) supportive infrastructure that promotes fully annotated submissions. GEO currently stores about a billion individual gene expression measurements, derived from over 100 organisms, submitted by over 1,500 laboratories, addressing a wide range of biological phenomena. To maximize the utility of these data, several user-friendly Web-based interfaces and applications have been implemented that enable effective exploration, query, and visualization of these data, at the level of individual genes or entire studies. This chapter describes how the data are stored, submission procedures, and mechanisms for data retrieval and query. GEO is publicly accessible at http://www.ncbi.nlm.nih.gov/projects/geo/. PMID:16939800
Cloud-based adaptive exon prediction for DNA analysis
Putluri, Srinivasareddy; Fathima, Shaik Yasmeen
2018-01-01
Cloud computing offers significant research and economic benefits to healthcare organisations. Cloud services provide a safe place for storing and managing large amounts of such sensitive data. Under the conventional flow of gene information, gene sequence laboratories send out raw and inferred information via the Internet to several sequence libraries. DNA sequencing storage costs will be minimised by the use of cloud services. In this study, the authors put forward a novel genomic informatics system using Amazon Cloud Services, where genomic sequence information is stored and accessed for processing. True identification of exon regions in a DNA sequence is a key task in bioinformatics, which helps in disease identification and drug design. The three-base periodicity property of exons forms the basis of all exon identification techniques. Adaptive signal processing techniques are found to be promising in comparison with several other methods. Several adaptive exon predictors (AEPs) are developed using variable normalised least mean square and its maximum normalised variants to reduce computational complexity. Finally, performance evaluation of various AEPs is done based on measures such as sensitivity, specificity and precision using various standard genomic datasets taken from the National Center for Biotechnology Information genomic sequence database. PMID:29515813
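The three-base periodicity mentioned in this abstract can be illustrated with a simple spectral measure. The sketch below is not the adaptive (normalised least mean square) predictor the authors develop. It assumes the classic approach of converting the sequence into four binary indicator sequences and summing their power at the frequency corresponding to a period of three bases. The function name is hypothetical.

```python
import cmath

def period3_power(seq):
    """Power of the period-3 spectral component, summed over the four bases.

    For each base, an indicator sequence marks the positions where the base
    occurs, and its discrete Fourier component at frequency 1/3 is evaluated.
    """
    n = len(seq)
    total = 0.0
    for base in "ACGT":
        component = sum(cmath.exp(-2j * cmath.pi * k / 3)
                        for k, b in enumerate(seq) if b == base)
        total += abs(component) ** 2
    return total / n

codon_like = period3_power("ATG" * 10)   # strongly period-3 sequence
period_four = period3_power("ACGT" * 9)  # period-4 sequence, no period-3 signal
```

A sequence that repeats every three bases concentrates its power at this frequency, whereas the period-4 repeat gives a value near zero. Exon predictors threshold a windowed version of this kind of signal.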
Epistemonikos: a free, relational, collaborative, multilingual database of health evidence.
Rada, Gabriel; Pérez, Daniel; Capurro, Daniel
2013-01-01
Epistemonikos (www.epistemonikos.org) is a free, multilingual database of the best available health evidence. This paper describes the design, development and implementation of the Epistemonikos project. Using several web technologies to store systematic reviews, their included articles, overviews of reviews and structured summaries, Epistemonikos is able to provide a simple and powerful search tool to access health evidence for sound decision making. Currently, Epistemonikos stores more than 115,000 unique documents and more than 100,000 relationships between documents. In addition, since its database is translated into 9 different languages, Epistemonikos ensures that non-English speaking decision-makers can access the best available evidence without language barriers.
OWLing Clinical Data Repositories With the Ontology Web Language
Pastor, Xavier; Lozano, Esther
2014-01-01
Background The health sciences are based upon information. Clinical information is usually stored and managed by physicians with precarious tools, such as spreadsheets. The biomedical domain is more complex than other domains that have adopted information and communication technologies as pervasive business tools. Moreover, medicine continuously changes its corpus of knowledge because of new discoveries and the rearrangements in the relationships among concepts. This scenario makes it especially difficult to offer good tools to answer the professional needs of researchers and constitutes a barrier that needs innovation to discover useful solutions. Objective The objective was to design and implement a framework for the development of clinical data repositories, capable of facing the continuous change in the biomedicine domain and minimizing the technical knowledge required from final users. Methods We combined knowledge management tools and methodologies with relational technology. We present an ontology-based approach that is flexible and efficient for dealing with complexity and change, integrated with a solid relational storage and a Web graphical user interface. Results Onto Clinical Research Forms (OntoCRF) is a framework for the definition, modeling, and instantiation of data repositories. It does not need any database design or programming. All required information to define a new project is explicitly stated in ontologies. Moreover, the user interface is built automatically on the fly as Web pages, whereas data are stored in a generic repository. This allows for immediate deployment and population of the database as well as instant online availability of any modification. Conclusions OntoCRF is a complete framework to build data repositories with a solid relational storage. 
Driven by ontologies, OntoCRF is more flexible and efficient in dealing with complexity and change than traditional systems and does not require highly skilled technical staff, which facilitates the engineering of clinical software systems. PMID:25599697
OWLing Clinical Data Repositories With the Ontology Web Language.
Lozano-Rubí, Raimundo; Pastor, Xavier; Lozano, Esther
2014-08-01
The health sciences are based upon information. Clinical information is usually stored and managed by physicians with precarious tools, such as spreadsheets. The biomedical domain is more complex than other domains that have adopted information and communication technologies as pervasive business tools. Moreover, medicine continuously changes its corpus of knowledge because of new discoveries and the rearrangements in the relationships among concepts. This scenario makes it especially difficult to offer good tools to answer the professional needs of researchers and constitutes a barrier that needs innovation to discover useful solutions. The objective was to design and implement a framework for the development of clinical data repositories, capable of facing the continuous change in the biomedicine domain and minimizing the technical knowledge required from final users. We combined knowledge management tools and methodologies with relational technology. We present an ontology-based approach that is flexible and efficient for dealing with complexity and change, integrated with a solid relational storage and a Web graphical user interface. Onto Clinical Research Forms (OntoCRF) is a framework for the definition, modeling, and instantiation of data repositories. It does not need any database design or programming. All required information to define a new project is explicitly stated in ontologies. Moreover, the user interface is built automatically on the fly as Web pages, whereas data are stored in a generic repository. This allows for immediate deployment and population of the database as well as instant online availability of any modification. OntoCRF is a complete framework to build data repositories with a solid relational storage. Driven by ontologies, OntoCRF is more flexible and efficient to deal with complexity and change than traditional systems and does not require very skilled technical people facilitating the engineering of clinical software systems.
NASA Astrophysics Data System (ADS)
Gasser, Deta; Viola, Giulio; Bingen, Bernard
2016-04-01
Since 2010, the Geological Survey of Norway has been implementing and continuously developing a digital workflow for geological bedrock mapping in Norway, from fieldwork to final product. Our workflow is based on the ESRI ArcGIS platform, and we use rugged Windows computers in the field. Three different hardware solutions have been tested over the past 5 years (2010-2015). (1) Panasonic Toughbook CE-19 (2.3 kg), (2) Panasonic Toughbook CF H2 Field (1.6 kg) and (3) Motion MC F5t tablet (1.5 kg). For collection of point observations in the field we mainly use the SIGMA Mobile application in ESRI ArcGIS developed by the British Geological Survey, which allows the mappers to store georeferenced comments, structural measurements, sample information, photographs, sketches, log information etc. in a Microsoft Access database. The application is freely downloadable from the BGS websites. For line- and polygon work we use our in-house database, which is currently under revision. Our line database consists of three feature classes: (1) bedrock boundaries, (2) bedrock lineaments, and (3) bedrock lines, with each feature class having up to 24 different attribute fields. Our polygon database consists of one feature class with 38 attribute fields enabling to store various information concerning lithology, stratigraphic order, age, metamorphic grade and tectonic subdivision. The polygon and line databases are coupled via topology in ESRI ArcGIS, which allows us to edit them simultaneously. This approach has been applied in two large-scale 1:50 000 bedrock mapping projects, one in the Kongsberg domain of the Sveconorwegian orogen, and the other in the greater Trondheim area (Orkanger) in the Caledonian belt. The mapping projects combined collection of high-resolution geophysical data, digital acquisition of field data, and collection of geochronological, geochemical and petrological data. 
During the Kongsberg project, some 25000 field observation points were collected by eight geologists. For the Orkanger project, some 2100 field observation points were collected by three geologists. Several advantages of the applied digital approach became clear during these projects: (1) the systematic collection of geological field data in a common format allows easy access and exchange of data among different geologists, (2) easier access to background information such as geophysics and DEMs in the field, and (3) a faster workflow from field data collection to final map product. Obvious disadvantages include: (1) heavy(ish) and expensive hardware, (2) battery life and other technical issues in the field, (3) the need for central in-house storage of field observation points (large amounts of data!), and (4) the need for acceptance of, and training in, a common workflow by all involved geologists.
A publication database for optical long baseline interferometry
NASA Astrophysics Data System (ADS)
Malbet, Fabien; Mella, Guillaume; Lawson, Peter; Taillifet, Esther; Lafrasse, Sylvain
2010-07-01
Optical long baseline interferometry is a technique that has generated almost 850 refereed papers to date. The targets span a large variety of objects from planetary systems to extragalactic studies and all branches of stellar physics. We have created a database hosted by the JMMC and connected to the Optical Long Baseline Interferometry Newsletter (OLBIN) web site using MySQL and a collection of XML or PHP scripts in order to store and classify these publications. Each entry is identified by its ADS bibcode and includes basic ADS information and metadata. The metadata are specified by tags sorted into categories, for example: interferometric facilities, instrumentation, wavelength of operation, spectral resolution, type of measurement, target type, and paper category. The whole OLBIN publication list has been processed, and we present how the database is organized and how it can be accessed. We use this tool to generate statistical plots of interest for the optical long baseline interferometry community.
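The tagging scheme described in this abstract (entries keyed by ADS bibcode, tags grouped into categories, statistics derived from the tags) can be illustrated with a minimal sketch. The abstract does not give the actual schema, so the table names, column names, and helper functions below are all hypothetical, and Python's built-in sqlite3 module stands in for the MySQL server the authors used. The bibcodes and tag names in the usage note are placeholders, not real records.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical three-table layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE publication (
            bibcode TEXT PRIMARY KEY,   -- ADS bibcode identifies the entry
            title   TEXT NOT NULL
        );
        CREATE TABLE tag (
            id       INTEGER PRIMARY KEY,
            category TEXT NOT NULL,     -- e.g. 'facility', 'target type'
            name     TEXT NOT NULL,
            UNIQUE (category, name)
        );
        CREATE TABLE publication_tag (
            bibcode TEXT NOT NULL REFERENCES publication(bibcode),
            tag_id  INTEGER NOT NULL REFERENCES tag(id),
            PRIMARY KEY (bibcode, tag_id)
        );
        """
    )
    return conn


def add_publication(conn, bibcode, title, tags):
    """Insert one publication and attach (category, name) tags to it."""
    conn.execute("INSERT INTO publication VALUES (?, ?)", (bibcode, title))
    for category, name in tags:
        # Create the tag on first use; later uses reuse the same row.
        conn.execute(
            "INSERT OR IGNORE INTO tag (category, name) VALUES (?, ?)",
            (category, name),
        )
        (tag_id,) = conn.execute(
            "SELECT id FROM tag WHERE category = ? AND name = ?",
            (category, name),
        ).fetchone()
        conn.execute("INSERT INTO publication_tag VALUES (?, ?)", (bibcode, tag_id))


def count_by_tag(conn, category):
    """Return {tag name: number of publications} for one tag category."""
    rows = conn.execute(
        """
        SELECT t.name, COUNT(*)
        FROM publication_tag AS pt
        JOIN tag AS t ON t.id = pt.tag_id
        WHERE t.category = ?
        GROUP BY t.name
        ORDER BY t.name
        """,
        (category,),
    )
    return dict(rows.fetchall())
```

With placeholder data, `add_publication(conn, "demo-1", "Paper one", [("facility", "A")])` followed by `count_by_tag(conn, "facility")` gives the per-facility counts that a statistical plot could be drawn from. Keeping tags in their own table, rather than as free-text columns, is one common way to make such per-category counts a single grouped query.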
Bottom-Up Evaluation of Twig Join Pattern Queries in XML Document Databases
NASA Astrophysics Data System (ADS)
Chen, Yangjun
Since the extensible markup language XML emerged as a new standard for information representation and exchange on the Internet, the problem of storing, indexing, and querying XML documents has been among the major issues of database research. In this paper, we study twig pattern matching and discuss a new algorithm for processing ordered twig pattern queries. The time complexity of the algorithm is bounded by $O(|D| \cdot |Q| + |T| \cdot \mathrm{leaf}_Q)$ and its space overhead by $O(\mathrm{leaf}_T \cdot \mathrm{leaf}_Q)$, where $T$ stands for a document tree, $Q$ for a twig pattern, and $D$ for a largest data stream associated with a node $q$ of $Q$, which contains the database nodes that match the node predicate at $q$; $\mathrm{leaf}_T$ (resp. $\mathrm{leaf}_Q$) denotes the number of leaf nodes of $T$ (resp. $Q$). In addition, the algorithm can be adapted to an indexing environment in which XB-trees are used.
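To make the notion of an ordered twig pattern concrete, the following is a naive sketch and not the algorithm of the paper: it does not use data streams or XB-trees and makes no attempt to reach the complexity bounds quoted above. It assumes a simplified setting in which every query edge is a parent-child edge (ancestor-descendant edges are left out) and the node predicate is plain label equality. The `Node` class and both function names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A labelled, ordered tree node, used for both document and query trees."""
    label: str
    children: list = field(default_factory=list)


def matches_at(doc, query):
    """Return True if `query` embeds at `doc` with parent-child edges only.

    The query children must be matched to distinct children of `doc`
    in left-to-right order (the 'ordered' part of an ordered twig).
    """
    if doc.label != query.label:
        return False
    i = 0  # index of the next unused document child
    for q_child in query.children:
        # Greedily take the leftmost remaining document child that matches.
        while i < len(doc.children) and not matches_at(doc.children[i], q_child):
            i += 1
        if i == len(doc.children):
            return False
        i += 1
    return True


def find_matches(root, query):
    """Return, in document (preorder) order, every node where `query` embeds."""
    found = []
    stack = [root]
    while stack:
        node = stack.pop()
        if matches_at(node, query):
            found.append(node)
        # Push children reversed so the leftmost child is visited first.
        stack.extend(reversed(node.children))
    return found
```

For a document `a(b, c, a(c, b))`, the query `a(b, c)` embeds only at the root, while the query `a(c, b)` embeds only at the inner `a` node, which shows how sibling order changes the result. The greedy leftmost choice is enough here because each query child is matched against document children independently.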
LHCb experience with LFC replication
NASA Astrophysics Data System (ADS)
Bonifazi, F.; Carbone, A.; Perez, E. D.; D'Apice, A.; dell'Agnello, L.; Duellmann, D.; Girone, M.; Re, G. L.; Martelli, B.; Peco, G.; Ricci, P. P.; Sapunenko, V.; Vagnoni, V.; Vitlacil, D.
2008-07-01
Database replication is a key topic in the framework of the LHC Computing Grid to allow processing of data in a distributed environment. In particular, the LHCb computing model relies on the LHC File Catalog, i.e. a database which stores information about files spread across the GRID, their logical names and the physical locations of all the replicas. The LHCb computing model requires the LFC to be replicated at Tier-1s. The LCG 3D project deals with the database replication issue and provides a replication service based on Oracle Streams technology. This paper describes the deployment of the LHC File Catalog replication to the INFN National Center for Telematics and Informatics (CNAF) and to other LHCb Tier-1 sites. We performed stress tests designed to evaluate any delay in the propagation of the streams and the scalability of the system. The tests show the robustness of the replica implementation with performance going much beyond the LHCb requirements.
Mobile Apps for Suicide Prevention: Review of Virtual Stores and Literature.
de la Torre, Isabel; Castillo, Gema; Arambarri, Jon; López-Coronado, Miguel; Franco, Manuel A
2017-10-10
The best manner to prevent suicide is to recognize suicidal signs and signals, and know how to respond to them. We aim to study the existing mobile apps for suicide prevention in the literature and the most commonly used virtual stores. Two reviews were carried out. The first was done by searching the most commonly used commercial app stores, which are iTunes and Google Play. The second was a review of mobile health (mHealth) apps in published articles within the last 10 years in the following 7 scientific databases: Science Direct, Medline, PsycINFO, Embase, The Cochrane Library, IEEE Xplore, and Google Scholar. A total of 124 apps related to suicide were found in the cited virtual stores but only 20 apps were specifically designed for suicide prevention. All apps were free and most were designed for Android. Furthermore, 6 relevant papers were found in the indicated scientific databases; in these studies, some real experiences with physicians, caregivers, and families were described. The importance of these people in suicide prevention was indicated. The number of apps regarding suicide prevention is small, and there was little information available from literature searches, indicating that technology-based suicide prevention remains understudied. Many of the apps provided no interactive features. It is important to verify the accuracy of the results of different apps that are available on iOS and Android. The confidence generated by these apps can benefit end users, either by improving their health monitoring or simply to verify their body condition. ©Isabel de la Torre, Gema Castillo, Jon Arambarri, Miguel López-Coronado, Manuel A Franco. Originally published in JMIR Mhealth and Uhealth (http://mhealth.jmir.org), 10.10.2017.
DIMA.Tools: An R package for working with the database for inventory, monitoring, and assessment
USDA-ARS's Scientific Manuscript database
The Database for Inventory, Monitoring, and Assessment (DIMA) is a Microsoft Access database used to collect, store and summarize monitoring data. This database is used by both local and national monitoring efforts within the National Park Service, the Forest Service, the Bureau of Land Management, ...
Implementing model-based system engineering for the whole lifecycle of a spacecraft
NASA Astrophysics Data System (ADS)
Fischer, P. M.; Lüdtke, D.; Lange, C.; Roshani, F.-C.; Dannemann, F.; Gerndt, A.
2017-09-01
Design information of a spacecraft is collected over all phases in the lifecycle of a project. A lot of this information is exchanged between different engineering tasks and business processes. In some lifecycle phases, model-based system engineering (MBSE) has introduced system models and databases that help to organize such information and to keep it consistent for everyone. Nevertheless, none of the existing databases has yet covered the whole lifecycle. Virtual Satellite is the MBSE database developed at DLR. It has been used for quite some time in Phase A studies and is currently being extended to cover the whole lifecycle of spacecraft projects. Since it is unforeseeable which future use cases such a database needs to support in all these different projects, the underlying data model has to provide tailoring and extension mechanisms for its conceptual data model (CDM). This paper explains the mechanisms as they are implemented in Virtual Satellite, which enable the CDM to be extended over the course of a project without corrupting already stored information. As an upcoming major use case, Virtual Satellite will be implemented as the MBSE tool in the S2TEP project. This project provides a new satellite bus for internal research and several different payload missions in the future. This paper explains how Virtual Satellite will be used to manage configuration control problems associated with such a multi-mission platform. It discusses how the S2TEP project starts by using the software to collect the first design information from concurrent engineering studies and then uses the extension mechanisms of the CDM to introduce further information artefacts such as the functional electrical architecture, thus linking more and more processes into an integrated MBSE approach.
Emmenegger, E.J.; Kentop, E.; Thompson, T.M.; Pittam, S.; Ryan, A.; Keon, D.; Carlino, J.A.; Ranson, J.; Life, R.B.; Troyer, R.M.; Garver, K.A.; Kurath, G.
2011-01-01
The AquaPathogen X database is a template for recording information on individual isolates of aquatic pathogens and is freely available for download (http://wfrc.usgs.gov). This database can accommodate the nucleotide sequence data generated in molecular epidemiological studies along with the myriad of abiotic and biotic traits associated with isolates of various pathogens (e.g. viruses, parasites and bacteria) from multiple aquatic animal host species (e.g. fish, shellfish and shrimp). The simultaneous cataloguing of isolates from different aquatic pathogens is a unique feature of the AquaPathogen X database, which can be used in surveillance of emerging aquatic animal diseases and elucidation of key risk factors associated with pathogen incursions into new water systems. An application of the template database that stores the epidemiological profiles of fish virus isolates, called Fish ViroTrak, was also developed. Exported records for two aquatic rhabdovirus species emerging in North America were used in the implementation of two separate web-accessible databases: the Molecular Epidemiology of Aquatic Pathogens infectious haematopoietic necrosis virus (MEAP-IHNV) database (http://gis.nacse.org/ihnv/) released in 2006 and the MEAP viral haemorrhagic septicaemia virus (MEAP-VHSV) database (http://gis.nacse.org/vhsv/) released in 2010.
DRUMS: a human disease related unique gene mutation search engine.
Li, Zuofeng; Liu, Xingnan; Wen, Jingran; Xu, Ye; Zhao, Xin; Li, Xuan; Liu, Lei; Zhang, Xiaoyan
2011-10-01
With the completion of the human genome project and the development of new methods for gene variant detection, the integration of mutation data and its phenotypic consequences has become more important than ever. Among all available resources, locus-specific databases (LSDBs) curate one or more specific genes' mutation data along with high-quality phenotypes. Although some genotype-phenotype data from LSDBs have been integrated into central databases, little effort has been made to integrate all these data by a search engine approach. In this work, we have developed the disease related unique gene mutation search engine (DRUMS), a search engine for human disease related unique gene mutations, as a convenient tool for biologists or physicians to retrieve gene variant and related phenotype information. Gene variant and phenotype information was stored in a gene-centred relational database. Moreover, the relationships between mutations and diseases were indexed by the uniform resource identifier from an LSDB or another central database. By querying DRUMS, users can access the most popular mutation databases under one interface. DRUMS could be treated as a domain-specific search engine. By using web crawling, indexing, and searching technologies, it provides a competitively efficient interface for searching and retrieving mutation data and their relationships to diseases. The present system is freely accessible at http://www.scbit.org/glif/new/drums/index.html. © 2011 Wiley-Liss, Inc.