SQL/NF Translator for the Triton Nested Relational Database System
1990-12-01
18as., Ohio .. 9~~ ~~ 1 4- AFIT/GCE/ENG/90D-05 SQL/Nk1 TRANSLATOR FOR THE TRITON NESTED RELATIONAL DATABASE SYSTEM THESIS Craig William Schnepf Captain...FOR THE TRITON NESTED RELATIONAL DATABASE SYSTEM THESIS Presented to the Faculty of the School of Engineering of the Air Force Institute of Technnlogy... systems . The SQL/NF query language used for the nested relationil model is an extension of the popular relational model query language SQL. The query
An Experimental Investigation of Complexity in Database Query Formulation Tasks
ERIC Educational Resources Information Center
Casterella, Gretchen Irwin; Vijayasarathy, Leo
2013-01-01
Information Technology professionals and other knowledge workers rely on their ability to extract data from organizational databases to respond to business questions and support decision making. Structured query language (SQL) is the standard programming language for querying data in relational databases, and SQL skills are in high demand and are…
Analysis and Development of a Web-Enabled Planning and Scheduling Database Application
2013-09-01
establishes an entity—relationship diagram for the desired process, constructs an operable database using MySQL , and provides a web- enabled interface for...development, develop, design, process, re- engineering, reengineering, MySQL , structured query language, SQL, myPHPadmin. 15. NUMBER OF PAGES 107 16...relationship diagram for the desired process, constructs an operable database using MySQL , and provides a web-enabled interface for the population of
NASA Technical Reports Server (NTRS)
McGlynn, T.; Santisteban, M.
2007-01-01
This chapter provides a very brief introduction to the Structured Query Language (SQL) for getting information from relational databases. We make no pretense that this is a complete or comprehensive discussion of SQL. There are many aspects of the language the will be completely ignored in the presentation. The goal here is to provide enough background so that users understand the basic concepts involved in building and using relational databases. We also go through the steps involved in building a particular astronomical database used in some of the other presentations in this volume.
Sánchez-de-Madariaga, Ricardo; Muñoz, Adolfo; Castro, Antonio L; Moreno, Oscar; Pascual, Mario
2018-01-01
This research shows a protocol to assess the computational complexity of querying relational and non-relational (NoSQL (not only Structured Query Language)) standardized electronic health record (EHR) medical information database systems (DBMS). It uses a set of three doubling-sized databases, i.e. databases storing 5000, 10,000 and 20,000 realistic standardized EHR extracts, in three different database management systems (DBMS): relational MySQL object-relational mapping (ORM), document-based NoSQL MongoDB, and native extensible markup language (XML) NoSQL eXist. The average response times to six complexity-increasing queries were computed, and the results showed a linear behavior in the NoSQL cases. In the NoSQL field, MongoDB presents a much flatter linear slope than eXist. NoSQL systems may also be more appropriate to maintain standardized medical information systems due to the special nature of the updating policies of medical information, which should not affect the consistency and efficiency of the data stored in NoSQL databases. One limitation of this protocol is the lack of direct results of improved relational systems such as archetype relational mapping (ARM) with the same data. However, the interpolation of doubling-size database results to those presented in the literature and other published results suggests that NoSQL systems might be more appropriate in many specific scenarios and problems to be solved. For example, NoSQL may be appropriate for document-based tasks such as EHR extracts used in clinical practice, or edition and visualization, or situations where the aim is not only to query medical information, but also to restore the EHR in exactly its original form. PMID:29608174
Sánchez-de-Madariaga, Ricardo; Muñoz, Adolfo; Castro, Antonio L; Moreno, Oscar; Pascual, Mario
2018-03-19
This research shows a protocol to assess the computational complexity of querying relational and non-relational (NoSQL (not only Structured Query Language)) standardized electronic health record (EHR) medical information database systems (DBMS). It uses a set of three doubling-sized databases, i.e. databases storing 5000, 10,000 and 20,000 realistic standardized EHR extracts, in three different database management systems (DBMS): relational MySQL object-relational mapping (ORM), document-based NoSQL MongoDB, and native extensible markup language (XML) NoSQL eXist. The average response times to six complexity-increasing queries were computed, and the results showed a linear behavior in the NoSQL cases. In the NoSQL field, MongoDB presents a much flatter linear slope than eXist. NoSQL systems may also be more appropriate to maintain standardized medical information systems due to the special nature of the updating policies of medical information, which should not affect the consistency and efficiency of the data stored in NoSQL databases. One limitation of this protocol is the lack of direct results of improved relational systems such as archetype relational mapping (ARM) with the same data. However, the interpolation of doubling-size database results to those presented in the literature and other published results suggests that NoSQL systems might be more appropriate in many specific scenarios and problems to be solved. For example, NoSQL may be appropriate for document-based tasks such as EHR extracts used in clinical practice, or edition and visualization, or situations where the aim is not only to query medical information, but also to restore the EHR in exactly its original form.
Methods to Secure Databases Against Vulnerabilities
2015-12-01
for several languages such as C, C++, PHP, Java and Python [16]. MySQL will work well with very large databases. The documentation references...using Eclipse and connected to each database management system using Python and Java drivers provided by MySQL , MongoDB, and Datastax (for Cassandra...tiers in Python and Java . Problem MySQL MongoDB Cassandra 1. Injection a. Tautologies Vulnerable Vulnerable Not Vulnerable b. Illegal query
Information Security Considerations for Applications Using Apache Accumulo
2014-09-01
Distributed File System INSCOM United States Army Intelligence and Security Command JPA Java Persistence API JSON JavaScript Object Notation MAC Mandatory... MySQL [13]. BigTable can process 20 petabytes per day [14]. High degree of scalability on commodity hardware. NoSQL databases do not rely on highly...manipulation in relational databases. NoSQL databases each have a unique programming interface that uses a lower level procedural language (e.g., Java
Social media based NPL system to find and retrieve ARM data: Concept paper
DOE Office of Scientific and Technical Information (OSTI.GOV)
Devarakonda, Ranjeet; Giansiracusa, Michael T.; Kumar, Jitendra
Information connectivity and retrieval has a role in our daily lives. The most pervasive source of online information is databases. The amount of data is growing at rapid rate and database technology is improving and having a profound effect. Almost all online applications are storing and retrieving information from databases. One challenge in supplying the public with wider access to informational databases is the need for knowledge of database languages like Structured Query Language (SQL). Although the SQL language has been published in many forms, not everybody is able to write SQL queries. Another challenge is that it may notmore » be practical to make the public aware of the structure of the database. There is a need for novice users to query relational databases using their natural language. To solve this problem, many natural language interfaces to structured databases have been developed. The goal is to provide more intuitive method for generating database queries and delivering responses. Social media makes it possible to interact with a wide section of the population. Through this medium, and with the help of Natural Language Processing (NLP) we can make the data of the Atmospheric Radiation Measurement Data Center (ADC) more accessible to the public. We propose an architecture for using Apache Lucene/Solr [1], OpenML [2,3], and Kafka [4] to generate an automated query/response system with inputs from Twitter5, our Cassandra DB, and our log database. Using the Twitter API and NLP we can give the public the ability to ask questions of our database and get automated responses.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Devarakonda, Ranjeet; Giansiracusa, Michael T.; Kumar, Jitendra
Information connectivity and retrieval has a role in our daily lives. The most pervasive source of online information is databases. The amount of data is growing at rapid rate and database technology is improving and having a profound effect. Almost all online applications are storing and retrieving information from databases. One challenge in supplying the public with wider access to informational databases is the need for knowledge of database languages like Structured Query Language (SQL). Although the SQL language has been published in many forms, not everybody is able to write SQL queries. Another challenge is that it may notmore » be practical to make the public aware of the structure of the database. There is a need for novice users to query relational databases using their natural language. To solve this problem, many natural language interfaces to structured databases have been developed. The goal is to provide more intuitive method for generating database queries and delivering responses. Social media makes it possible to interact with a wide section of the population. Through this medium, and with the help of Natural Language Processing (NLP) we can make the data of the Atmospheric Radiation Measurement Data Center (ADC) more accessible to the public. We propose an architecture for using Apache Lucene/Solr [1], OpenML [2,3], and Kafka [4] to generate an automated query/response system with inputs from Twitter5, our Cassandra DB, and our log database. Using the Twitter API and NLP we can give the public the ability to ask questions of our database and get automated responses.« less
Network Configuration of Oracle and Database Programming Using SQL
NASA Technical Reports Server (NTRS)
Davis, Melton; Abdurrashid, Jibril; Diaz, Philip; Harris, W. C.
2000-01-01
A database can be defined as a collection of information organized in such a way that it can be retrieved and used. A database management system (DBMS) can further be defined as the tool that enables us to manage and interact with the database. The Oracle 8 Server is a state-of-the-art information management environment. It is a repository for very large amounts of data, and gives users rapid access to that data. The Oracle 8 Server allows for sharing of data between applications; the information is stored in one place and used by many systems. My research will focus primarily on SQL (Structured Query Language) programming. SQL is the way you define and manipulate data in Oracle's relational database. SQL is the industry standard adopted by all database vendors. When programming with SQL, you work on sets of data (i.e., information is not processed one record at a time).
Technical Aspects of Interfacing MUMPS to an External SQL Relational Database Management System
Kuzmak, Peter M.; Walters, Richard F.; Penrod, Gail
1988-01-01
This paper describes an interface connecting InterSystems MUMPS (M/VX) to an external relational DBMS, the SYBASE Database Management System. The interface enables MUMPS to operate in a relational environment and gives the MUMPS language full access to a complete set of SQL commands. MUMPS generates SQL statements as ASCII text and sends them to the RDBMS. The RDBMS executes the statements and returns ASCII results to MUMPS. The interface suggests that the language features of MUMPS make it an attractive tool for use in the relational database environment. The approach described in this paper separates MUMPS from the relational database. Positioning the relational database outside of MUMPS promotes data sharing and permits a number of different options to be used for working with the data. Other languages like C, FORTRAN, and COBOL can access the RDBMS database. Advanced tools provided by the relational database vendor can also be used. SYBASE is an advanced high-performance transaction-oriented relational database management system for the VAX/VMS and UNIX operating systems. SYBASE is designed using a distributed open-systems architecture, and is relatively easy to interface with MUMPS.
Evaluating a NoSQL Alternative for Chilean Virtual Observatory Services
NASA Astrophysics Data System (ADS)
Antognini, J.; Araya, M.; Solar, M.; Valenzuela, C.; Lira, F.
2015-09-01
Currently, the standards and protocols for data access in the Virtual Observatory architecture (DAL) are generally implemented with relational databases based on SQL. In particular, the Astronomical Data Query Language (ADQL), language used by IVOA to represent queries to VO services, was created to satisfy the different data access protocols, such as Simple Cone Search. ADQL is based in SQL92, and has extra functionality implemented using PgSphere. An emergent alternative to SQL are the so called NoSQL databases, which can be classified in several categories such as Column, Document, Key-Value, Graph, Object, etc.; each one recommended for different scenarios. Within their notable characteristics we can find: schema-free, easy replication support, simple API, Big Data, etc. The Chilean Virtual Observatory (ChiVO) is developing a functional prototype based on the IVOA architecture, with the following relevant factors: Performance, Scalability, Flexibility, Complexity, and Functionality. Currently, it's very difficult to compare these factors, due to a lack of alternatives. The objective of this paper is to compare NoSQL alternatives with SQL through the implementation of a Web API REST that satisfies ChiVO's needs: a SESAME-style name resolver for the data from ALMA. Therefore, we propose a test scenario by configuring a NoSQL database with data from different sources and evaluating the feasibility of creating a Simple Cone Search service and its performance. This comparison will allow to pave the way for the application of Big Data databases in the Virtual Observatory.
Standard Port-Visit Cost Forecasting Model for U.S. Navy Husbanding Contracts
2009-12-01
Protocol (HTTP) server.35 2. MySQL . An open-source database.36 3. PHP . A common scripting language used for Web development.37 E. IMPLEMENTATION OF...Inc. (2009). MySQL Community Server (Version 5.1) [Software]. Available from http://dev.mysql.com/downloads/ 37 The PHP Group (2009). PHP (Version...Logistics Services MySQL My Structured Query Language NAVSUP Navy Supply Systems Command NC Non-Contract Items NPS Naval Postgraduate
Learning Asset Technology Integration Support Tool Design Document
2010-05-11
language known as Hypertext Preprocessor ( PHP ) and by MySQL – a relational database management system that can also be used for content management. It...Requirements The LATIST tool will be implemented utilizing a WordPress platform with MySQL as the database. Also the LATIST system must effectively work... MySQL . When designing the LATIST system there are several considerations which must be accounted for in the working prototype. These include: • DAU
Hewitt, Robin; Gobbi, Alberto; Lee, Man-Ling
2005-01-01
Relational databases are the current standard for storing and retrieving data in the pharmaceutical and biotech industries. However, retrieving data from a relational database requires specialized knowledge of the database schema and of the SQL query language. At Anadys, we have developed an easy-to-use system for searching and reporting data in a relational database to support our drug discovery project teams. This system is fast and flexible and allows users to access all data without having to write SQL queries. This paper presents the hierarchical, graph-based metadata representation and SQL-construction methods that, together, are the basis of this system's capabilities.
Flexible network reconstruction from relational databases with Cytoscape and CytoSQL
2010-01-01
Background Molecular interaction networks can be efficiently studied using network visualization software such as Cytoscape. The relevant nodes, edges and their attributes can be imported in Cytoscape in various file formats, or directly from external databases through specialized third party plugins. However, molecular data are often stored in relational databases with their own specific structure, for which dedicated plugins do not exist. Therefore, a more generic solution is presented. Results A new Cytoscape plugin 'CytoSQL' is developed to connect Cytoscape to any relational database. It allows to launch SQL ('Structured Query Language') queries from within Cytoscape, with the option to inject node or edge features of an existing network as SQL arguments, and to convert the retrieved data to Cytoscape network components. Supported by a set of case studies we demonstrate the flexibility and the power of the CytoSQL plugin in converting specific data subsets into meaningful network representations. Conclusions CytoSQL offers a unified approach to let Cytoscape interact with relational databases. Thanks to the power of the SQL syntax, this tool can rapidly generate and enrich networks according to very complex criteria. The plugin is available at http://www.ptools.ua.ac.be/CytoSQL. PMID:20594316
Flexible network reconstruction from relational databases with Cytoscape and CytoSQL.
Laukens, Kris; Hollunder, Jens; Dang, Thanh Hai; De Jaeger, Geert; Kuiper, Martin; Witters, Erwin; Verschoren, Alain; Van Leemput, Koenraad
2010-07-01
Molecular interaction networks can be efficiently studied using network visualization software such as Cytoscape. The relevant nodes, edges and their attributes can be imported in Cytoscape in various file formats, or directly from external databases through specialized third party plugins. However, molecular data are often stored in relational databases with their own specific structure, for which dedicated plugins do not exist. Therefore, a more generic solution is presented. A new Cytoscape plugin 'CytoSQL' is developed to connect Cytoscape to any relational database. It allows to launch SQL ('Structured Query Language') queries from within Cytoscape, with the option to inject node or edge features of an existing network as SQL arguments, and to convert the retrieved data to Cytoscape network components. Supported by a set of case studies we demonstrate the flexibility and the power of the CytoSQL plugin in converting specific data subsets into meaningful network representations. CytoSQL offers a unified approach to let Cytoscape interact with relational databases. Thanks to the power of the SQL syntax, this tool can rapidly generate and enrich networks according to very complex criteria. The plugin is available at http://www.ptools.ua.ac.be/CytoSQL.
Using PHP/MySQL to Manage Potential Mass Impacts
NASA Technical Reports Server (NTRS)
Hager, Benjamin I.
2010-01-01
This paper presents a new application using commercially available software to manage mass properties for spaceflight vehicles. PHP/MySQL(PHP: Hypertext Preprocessor and My Structured Query Language) are a web scripting language and a database language commonly used in concert with each other. They open up new opportunities to develop cutting edge mass properties tools, and in particular, tools for the management of potential mass impacts (threats and opportunities). The paper begins by providing an overview of the functions and capabilities of PHP/MySQL. The focus of this paper is on how PHP/MySQL are being used to develop an advanced "web accessible" database system for identifying and managing mass impacts on NASA's Ares I Upper Stage program, managed by the Marshall Space Flight Center. To fully describe this application, examples of the data, search functions, and views are provided to promote, not only the function, but the security, ease of use, simplicity, and eye-appeal of this new application. This paper concludes with an overview of the other potential mass properties applications and tools that could be developed using PHP/MySQL. The premise behind this paper is that PHP/MySQL are software tools that are easy to use and readily available for the development of cutting edge mass properties applications. These tools are capable of providing "real-time" searching and status of an active database, automated report generation, and other capabilities to streamline and enhance mass properties management application. By using PHP/MySQL, proven existing methods for managing mass properties can be adapted to present-day information technology to accelerate mass properties data gathering, analysis, and reporting, allowing mass property management to keep pace with today's fast-pace design and development processes.
Database Reports Over the Internet
NASA Technical Reports Server (NTRS)
Smith, Dean Lance
2002-01-01
Most of the summer was spent developing software that would permit existing test report forms to be printed over the web on a printer that is supported by Adobe Acrobat Reader. The data is stored in a DBMS (Data Base Management System). The client asks for the information from the database using an HTML (Hyper Text Markup Language) form in a web browser. JavaScript is used with the forms to assist the user and verify the integrity of the entered data. Queries to a database are made in SQL (Sequential Query Language), a widely supported standard for making queries to databases. Java servlets, programs written in the Java programming language running under the control of network server software, interrogate the database and complete a PDF form template kept in a file. The completed report is sent to the browser requesting the report. Some errors are sent to the browser in an HTML web page, others are reported to the server. Access to the databases was restricted since the data are being transported to new DBMS software that will run on new hardware. However, the SQL queries were made to Microsoft Access, a DBMS that is available on most PCs (Personal Computers). Access does support the SQL commands that were used, and a database was created with Access that contained typical data for the report forms. Some of the problems and features are discussed below.
A Tutorial in Creating Web-Enabled Databases with Inmagic DB/TextWorks through ODBC.
ERIC Educational Resources Information Center
Breeding, Marshall
2000-01-01
Explains how to create Web-enabled databases. Highlights include Inmagic's DB/Text WebPublisher product called DB/TextWorks; ODBC (Open Database Connectivity) drivers; Perl programming language; HTML coding; Structured Query Language (SQL); Common Gateway Interface (CGI) programming; and examples of HTML pages and Perl scripts. (LRW)
A UML Profile for Developing Databases that Conform to the Third Manifesto
NASA Astrophysics Data System (ADS)
Eessaar, Erki
The Third Manifesto (TTM) presents the principles of a relational database language that is free of deficiencies and ambiguities of SQL. There are database management systems that are created according to TTM. Developers need tools that support the development of databases by using these database management systems. UML is a widely used visual modeling language. It provides built-in extension mechanism that makes it possible to extend UML by creating profiles. In this paper, we introduce a UML profile for designing databases that correspond to the rules of TTM. We created the first version of the profile by translating existing profiles of SQL database design. After that, we extended and improved the profile. We implemented the profile by using UML CASE system StarUML™. We present an example of using the new profile. In addition, we describe problems that occurred during the profile development.
NVST Data Archiving System Based On FastBit NoSQL Database
NASA Astrophysics Data System (ADS)
Liu, Ying-bo; Wang, Feng; Ji, Kai-fan; Deng, Hui; Dai, Wei; Liang, Bo
2014-06-01
The New Vacuum Solar Telescope (NVST) is a 1-meter vacuum solar telescope that aims to observe the fine structures of active regions on the Sun. The main tasks of the NVST are high resolution imaging and spectral observations, including the measurements of the solar magnetic field. The NVST has been collecting more than 20 million FITS files since it began routine observations in 2012 and produces a maximum observational records of 120 thousand files in a day. Given the large amount of files, the effective archiving and retrieval of files becomes a critical and urgent problem. In this study, we implement a new data archiving system for the NVST based on the Fastbit Not Only Structured Query Language (NoSQL) database. Comparing to the relational database (i.e., MySQL; My Structured Query Language), the Fastbit database manifests distinctive advantages on indexing and querying performance. In a large scale database of 40 million records, the multi-field combined query response time of Fastbit database is about 15 times faster and fully meets the requirements of the NVST. Our study brings a new idea for massive astronomical data archiving and would contribute to the design of data management systems for other astronomical telescopes.
Interactive DataBase of Cosmic Ray Anisotropy (DB A10)
NASA Astrophysics Data System (ADS)
Asipenka, A.S.; Belov, A.V.; Eroshenko, E.F.; Klepach, E.G.; Oleneva, V.A.; Yake, V.G.
Data on the hourly means of cosmic ray density and anisotropy derived by the GSM method over the 1957-2006 are introduced in to MySQL database. This format allowed an access to data both in local and in the Internet. Using the realized combination of script-language Php and My SQL database the Internet project was created on the access for users data on the CR anisotropy in different formats (http://cr20.izmiran.ru/AnisotropyCR/main.htm/). Usage the sheaf Php and MySQL provides fast receiving data even in the Internet since a request and following process of data are accomplished on the project server. Usage of MySQL basis for the storing data on cosmic ray variations give a possibility to construct requests of different structures, extends the variety of data reflection, makes it possible the conformity data to other systems and usage them in other projects.
Petaminer: Using ROOT for efficient data storage in MySQL database
NASA Astrophysics Data System (ADS)
Cranshaw, J.; Malon, D.; Vaniachine, A.; Fine, V.; Lauret, J.; Hamill, P.
2010-04-01
High Energy and Nuclear Physics (HENP) experiments store Petabytes of event data and Terabytes of calibration data in ROOT files. The Petaminer project is developing a custom MySQL storage engine to enable the MySQL query processor to directly access experimental data stored in ROOT files. Our project is addressing the problem of efficient navigation to PetaBytes of HENP experimental data described with event-level TAG metadata, which is required by data intensive physics communities such as the LHC and RHIC experiments. Physicists need to be able to compose a metadata query and rapidly retrieve the set of matching events, where improved efficiency will facilitate the discovery process by permitting rapid iterations of data evaluation and retrieval. Our custom MySQL storage engine enables the MySQL query processor to directly access TAG data stored in ROOT TTrees. As ROOT TTrees are column-oriented, reading them directly provides improved performance over traditional row-oriented TAG databases. Leveraging the flexible and powerful SQL query language to access data stored in ROOT TTrees, the Petaminer approach enables rich MySQL index-building capabilities for further performance optimization.
Epstein, Richard H; Dexter, Franklin
2017-07-01
Comorbidity adjustment is often performed during outcomes and health care resource utilization research. Our goal was to develop an efficient algorithm in structured query language (SQL) to determine the Elixhauser comorbidity index. We wrote an SQL algorithm to calculate the Elixhauser comorbidities from Diagnosis Related Group and International Classification of Diseases (ICD) codes. Validation was by comparison to expected comorbidities from combinations of these codes and to the 2013 Nationwide Readmissions Database (NRD). The SQL algorithm matched perfectly with expected comorbidities for all combinations of ICD-9 or ICD-10, and Diagnosis Related Groups. Of 13 585 859 evaluable NRD records, the algorithm matched 100% of the listed comorbidities. Processing time was ∼0.05 ms/record. The SQL Elixhauser code was efficient and computationally identical to the SAS algorithm used for the NRD. This algorithm may be useful where preprocessing of large datasets in a relational database environment and comorbidity determination is desired before statistical analysis. A validated SQL procedure to calculate Elixhauser comorbidities and the van Walraven index from ICD-9 or ICD-10 discharge diagnosis codes has been published. © The Author 2017. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com
ERIC Educational Resources Information Center
Mills, Robert J.; Dupin-Bryant, Pamela A.; Johnson, John D.; Beaulieu, Tanya Y.
2015-01-01
The demand for Information Systems (IS) graduates with expertise in Structured Query Language (SQL) and database management is vast and projected to increase as "big data" becomes ubiquitous. To prepare students to solve complex problems in a data-driven world, educators must explore instructional strategies to help link prior knowledge…
Shuttle-Data-Tape XML Translator
NASA Technical Reports Server (NTRS)
Barry, Matthew R.; Osborne, Richard N.
2005-01-01
JSDTImport is a computer program for translating native Shuttle Data Tape (SDT) files from American Standard Code for Information Interchange (ASCII) format into databases in other formats. JSDTImport solves the problem of organizing the SDT content, affording flexibility to enable users to choose how to store the information in a database to better support client and server applications. JSDTImport can be dynamically configured by use of a simple Extensible Markup Language (XML) file. JSDTImport uses this XML file to define how each record and field will be parsed, its layout and definition, and how the resulting database will be structured. JSDTImport also includes a client application programming interface (API) layer that provides abstraction for the data-querying process. The API enables a user to specify the search criteria to apply in gathering all the data relevant to a query. The API can be used to organize the SDT content and translate into a native XML database. The XML format is structured into efficient sections, enabling excellent query performance by use of the XPath query language. Optionally, the content can be translated into a Structured Query Language (SQL) database for fast, reliable SQL queries on standard database server computers.
Constructing a Graph Database for Semantic Literature-Based Discovery.
Hristovski, Dimitar; Kastrin, Andrej; Dinevski, Dejan; Rindflesch, Thomas C
2015-01-01
Literature-based discovery (LBD) generates discoveries, or hypotheses, by combining what is already known in the literature. Potential discoveries have the form of relations between biomedical concepts; for example, a drug may be determined to treat a disease other than the one for which it was intended. LBD views the knowledge in a domain as a network; a set of concepts along with the relations between them. As a starting point, we used SemMedDB, a database of semantic relations between biomedical concepts extracted with SemRep from Medline. SemMedDB is distributed as a MySQL relational database, which has some problems when dealing with network data. We transformed and uploaded SemMedDB into the Neo4j graph database, and implemented the basic LBD discovery algorithms with the Cypher query language. We conclude that storing the data needed for semantic LBD is more natural in a graph database. Also, implementing LBD discovery algorithms is conceptually simpler with a graph query language when compared with standard SQL.
Supporting temporal queries on clinical relational databases: the S-WATCH-QL language.
Combi, C.; Missora, L.; Pinciroli, F.
1996-01-01
Due to the ubiquitous and special nature of time, specially in clinical datábases there's the need of particular temporal data and operators. In this paper we describe S-WATCH-QL (Structured Watch Query Language), a temporal extension of SQL, the widespread query language based on the relational model. S-WATCH-QL extends the well-known SQL by the addition of: a) temporal data types that allow the storage of information with different levels of granularity; b) historical relations that can store together both instantaneous valid times and intervals; c) some temporal clauses, functions and predicates allowing to define complex temporal queries. PMID:8947722
Integrated Array/Metadata Analytics
NASA Astrophysics Data System (ADS)
Misev, Dimitar; Baumann, Peter
2015-04-01
Data comes in various forms and types, and integration usually presents a problem that is often simply ignored and solved with ad-hoc solutions. Multidimensional arrays are an ubiquitous data type, that we find at the core of virtually all science and engineering domains, as sensor, model, image, statistics data. Naturally, arrays are richly described by and intertwined with additional metadata (alphanumeric relational data, XML, JSON, etc). Database systems, however, a fundamental building block of what we call "Big Data", lack adequate support for modelling and expressing these array data/metadata relationships. Array analytics is hence quite primitive or non-existent at all in modern relational DBMS. Recognizing this, we extended SQL with a new SQL/MDA part seamlessly integrating multidimensional array analytics into the standard database query language. We demonstrate the benefits of SQL/MDA with real-world examples executed in ASQLDB, an open-source mediator system based on HSQLDB and rasdaman, that already implements SQL/MDA.
2017-01-01
Reusing the data from healthcare information systems can effectively facilitate clinical trials (CTs). How to select candidate patients eligible for CT recruitment criteria is a central task. Related work either depends on DBA (database administrator) to convert the recruitment criteria to native SQL queries or involves the data mapping between a standard ontology/information model and individual data source schema. This paper proposes an alternative computer-aided CT recruitment paradigm, based on syntax translation between different DSLs (domain-specific languages). In this paradigm, the CT recruitment criteria are first formally represented as production rules. The referenced rule variables are all from the underlying database schema. Then the production rule is translated to an intermediate query-oriented DSL (e.g., LINQ). Finally, the intermediate DSL is directly mapped to native database queries (e.g., SQL) automated by ORM (object-relational mapping). PMID:29065644
Zhang, Yinsheng; Zhang, Guoming; Shang, Qian
2017-01-01
Reusing the data from healthcare information systems can effectively facilitate clinical trials (CTs). How to select candidate patients eligible for CT recruitment criteria is a central task. Related work either depends on DBA (database administrator) to convert the recruitment criteria to native SQL queries or involves the data mapping between a standard ontology/information model and individual data source schema. This paper proposes an alternative computer-aided CT recruitment paradigm, based on syntax translation between different DSLs (domain-specific languages). In this paradigm, the CT recruitment criteria are first formally represented as production rules. The referenced rule variables are all from the underlying database schema. Then the production rule is translated to an intermediate query-oriented DSL (e.g., LINQ). Finally, the intermediate DSL is directly mapped to native database queries (e.g., SQL) automated by ORM (object-relational mapping).
Relational Algebra and SQL: Better Together
ERIC Educational Resources Information Center
McMaster, Kirby; Sambasivam, Samuel; Hadfield, Steven; Wolthuis, Stuart
2013-01-01
In this paper, we describe how database instructors can teach Relational Algebra and Structured Query Language together through programming. Students write query programs consisting of sequences of Relational Algebra operations vs. Structured Query Language SELECT statements. The query programs can then be run interactively, allowing students to…
Enabling On-Demand Database Computing with MIT SuperCloud Database Management System
2015-09-15
arc.liv.ac.uk/trac/SGE) provides these services and is independent of programming language (C, Fortran, Java , Matlab, etc) or parallel programming...a MySQL database to store DNS records. The DNS records are controlled via a simple web service interface that allows records to be created
ERIC Educational Resources Information Center
Liu, Xia; Liu, Lai C.; Koong, Kai S.; Lu, June
2003-01-01
Analysis of 300 information technology job postings in two Internet databases identified the following skill categories: programming languages (Java, C/C++, and Visual Basic were most frequent); website development (57% sought SQL and HTML skills); databases (nearly 50% required Oracle); networks (only Windows NT or wide-area/local-area networks);…
ADVICE--Educational System for Teaching Database Courses
ERIC Educational Resources Information Center
Cvetanovic, M.; Radivojevic, Z.; Blagojevic, V.; Bojovic, M.
2011-01-01
This paper presents a Web-based educational system, ADVICE, that helps students to bridge the gap between database management system (DBMS) theory and practice. The usage of ADVICE is presented through a set of laboratory exercises developed to teach students conceptual and logical modeling, SQL, formal query languages, and normalization. While…
Improved Information Retrieval Performance on SQL Database Using Data Adapter
NASA Astrophysics Data System (ADS)
Husni, M.; Djanali, S.; Ciptaningtyas, H. T.; Wicaksana, I. G. N. A.
2018-02-01
The NoSQL databases, short for Not Only SQL, are increasingly being used as the number of big data applications increases. Most systems still use relational databases (RDBs), but as the number of data increases each year, the system handles big data with NoSQL databases to analyze and access data more quickly. NoSQL emerged as a result of the exponential growth of the internet and the development of web applications. The query syntax in the NoSQL database differs from the SQL database, therefore requiring code changes in the application. Data adapter allow applications to not change their SQL query syntax. Data adapters provide methods that can synchronize SQL databases with NotSQL databases. In addition, the data adapter provides an interface which is application can access to run SQL queries. Hence, this research applied data adapter system to synchronize data between MySQL database and Apache HBase using direct access query approach, where system allows application to accept query while synchronization process in progress. From the test performed using data adapter, the results obtained that the data adapter can synchronize between SQL databases, MySQL, and NoSQL database, Apache HBase. This system spends the percentage of memory resources in the range of 40% to 60%, and the percentage of processor moving from 10% to 90%. In addition, from this system also obtained the performance of database NoSQL better than SQL database.
A SQL-Database Based Meta-CASE System and its Query Subsystem
NASA Astrophysics Data System (ADS)
Eessaar, Erki; Sgirka, Rünno
Meta-CASE systems simplify the creation of CASE (Computer Aided System Engineering) systems. In this paper, we present a meta-CASE system that provides a web-based user interface and uses an object-relational database system (ORDBMS) as its basis. The use of ORDBMSs allows us to integrate different parts of the system and simplify the creation of meta-CASE and CASE systems. ORDBMSs provide powerful query mechanism. The proposed system allows developers to use queries to evaluate and gradually improve artifacts and calculate values of software measures. We illustrate the use of the systems by using SimpleM modeling language and discuss the use of SQL in the context of queries about artifacts. We have created a prototype of the meta-CASE system by using PostgreSQL™ ORDBMS and PHP scripting language.
Improved oilfield GHG accounting using a global oilfield database
NASA Astrophysics Data System (ADS)
Roberts, S.; Brandt, A. R.; Masnadi, M.
2016-12-01
The definition of oil is shifting in considerable ways. Conventional oil resources are declining as oil sands, heavy oils, and others emerge. Technological advances mean that these unconventional hydrocarbons are now viable resources. Meanwhile, scientific evidence is mounting that climate change is occurring. The oil sector is responsible for 35% of global greenhouse gas (GHG) emissions, but the climate impacts of these new unconventional oils are not well understood. As such, the Oil Climate Index (OCI) project has been an international effort to evaluate the total life-cycle environmental GHG emissions of different oil fields globally. Over the course of the first and second phases of the project, 30 and 75 global oil fields have been investigated, respectively. The 75 fields account for about 25% of global oil production. For the third phase of the project, it is aimed to expand the OCI to contain closing to 100% of global oil production; leading to the analysis of 8000 fields. To accomplish this, a robust database system is required to handle and manipulate the data. Therefore, the integration of the data into the computer science language SQL (Structured Query Language) was performed. The implementation of SQL allows users to process the data more efficiently than would be possible by using the previously established program (Microsoft Excel). Next, a graphic user interface (gui) was implemented, in the computer science language of C#, in order to make the data interactive; enabling people to update the database without prior knowledge of SQL being necessary.
Astronomical Data Processing Using SciQL, an SQL Based Query Language for Array Data
NASA Astrophysics Data System (ADS)
Zhang, Y.; Scheers, B.; Kersten, M.; Ivanova, M.; Nes, N.
2012-09-01
SciQL (pronounced as ‘cycle’) is a novel SQL-based array query language for scientific applications with both tables and arrays as first class citizens. SciQL lowers the entrance fee of adopting relational DBMS (RDBMS) in scientific domains, because it includes functionality often only found in mathematics software packages. In this paper, we demonstrate the usefulness of SciQL for astronomical data processing using examples from the Transient Key Project of the LOFAR radio telescope. In particular, how the LOFAR light-curve database of all detected sources can be constructed, by correlating sources across the spatial, frequency, time and polarisation domains.
Software Engineering Laboratory (SEL) database organization and user's guide, revision 2
NASA Technical Reports Server (NTRS)
Morusiewicz, Linda; Bristow, John
1992-01-01
The organization of the Software Engineering Laboratory (SEL) database is presented. Included are definitions and detailed descriptions of the database tables and views, the SEL data, and system support data. The mapping from the SEL and system support data to the base table is described. In addition, techniques for accessing the database through the Database Access Manager for the SEL (DAMSEL) system and via the ORACLE structured query language (SQL) are discussed.
Software Engineering Laboratory (SEL) database organization and user's guide
NASA Technical Reports Server (NTRS)
So, Maria; Heller, Gerard; Steinberg, Sandra; Spiegel, Douglas
1989-01-01
The organization of the Software Engineering Laboratory (SEL) database is presented. Included are definitions and detailed descriptions of the database tables and views, the SEL data, and system support data. The mapping from the SEL and system support data to the base tables is described. In addition, techniques for accessing the database, through the Database Access Manager for the SEL (DAMSEL) system and via the ORACLE structured query language (SQL), are discussed.
Datacube Services in Action, Using Open Source and Open Standards
NASA Astrophysics Data System (ADS)
Baumann, P.; Misev, D.
2016-12-01
Array Databases comprise novel, promising technology for massive spatio-temporal datacubes, extending the SQL paradigm of "any query, anytime" to n-D arrays. On server side, such queries can be optimized, parallelized, and distributed based on partitioned array storage. The rasdaman ("raster data manager") system, which has pioneered Array Databases, is available in open source on www.rasdaman.org. Its declarative query language extends SQL with array operators which are optimized and parallelized on server side. The rasdaman engine, which is part of OSGeo Live, is mature and in operational use databases individually holding dozens of Terabytes. Further, the rasdaman concepts have strongly impacted international Big Data standards in the field, including the forthcoming MDA ("Multi-Dimensional Array") extension to ISO SQL, the OGC Web Coverage Service (WCS) and Web Coverage Processing Service (WCPS) standards, and the forthcoming INSPIRE WCS/WCPS; in both OGC and INSPIRE, OGC is WCS Core Reference Implementation. In our talk we present concepts, architecture, operational services, and standardization impact of open-source rasdaman, as well as experiences made.
SiC: An Agent Based Architecture for Preventing and Detecting Attacks to Ubiquitous Databases
NASA Astrophysics Data System (ADS)
Pinzón, Cristian; de Paz, Yanira; Bajo, Javier; Abraham, Ajith; Corchado, Juan M.
One of the main attacks to ubiquitous databases is the structure query language (SQL) injection attack, which causes severe damages both in the commercial aspect and in the user’s confidence. This chapter proposes the SiC architecture as a solution to the SQL injection attack problem. This is a hierarchical distributed multiagent architecture, which involves an entirely new approach with respect to existing architectures for the prevention and detection of SQL injections. SiC incorporates a kind of intelligent agent, which integrates a case-based reasoning system. This agent, which is the core of the architecture, allows the application of detection techniques based on anomalies as well as those based on patterns, providing a great degree of autonomy, flexibility, robustness and dynamic scalability. The characteristics of the multiagent system allow an architecture to detect attacks from different types of devices, regardless of the physical location. The architecture has been tested on a medical database, guaranteeing safe access from various devices such as PDAs and notebook computers.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Roberts, D
Purpose: A unified database system was developed to allow accumulation, review and analysis of quality assurance (QA) data for measurement, treatment, imaging and simulation equipment in our department. Recording these data in a database allows a unified and structured approach to review and analysis of data gathered using commercial database tools. Methods: A clinical database was developed to track records of quality assurance operations on linear accelerators, a computed tomography (CT) scanner, high dose rate (HDR) afterloader and imaging systems such as on-board imaging (OBI) and Calypso in our department. The database was developed using Microsoft Access database and visualmore » basic for applications (VBA) programming interface. Separate modules were written for accumulation, review and analysis of daily, monthly and annual QA data. All modules were designed to use structured query language (SQL) as the basis of data accumulation and review. The SQL strings are dynamically re-written at run time. The database also features embedded documentation, storage of documents produced during QA activities and the ability to annotate all data within the database. Tests are defined in a set of tables that define test type, specific value, and schedule. Results: Daily, Monthly and Annual QA data has been taken in parallel with established procedures to test MQA. The database has been used to aggregate data across machines to examine the consistency of machine parameters and operations within the clinic for several months. Conclusion: The MQA application has been developed as an interface to a commercially available SQL engine (JET 5.0) and a standard database back-end. The MQA system has been used for several months for routine data collection.. The system is robust, relatively simple to extend and can be migrated to a commercial SQL server.« less
Research on high availability architecture of SQL and NoSQL
NASA Astrophysics Data System (ADS)
Wang, Zhiguo; Wei, Zhiqiang; Liu, Hao
2017-03-01
With the advent of the era of big data, amount and importance of data have increased dramatically. SQL database develops in performance and scalability, but more and more companies tend to use NoSQL database as their databases, because NoSQL database has simpler data model and stronger extension capacity than SQL database. Almost all database designers including SQL database and NoSQL database aim to improve performance and ensure availability by reasonable architecture which can reduce the effects of software failures and hardware failures, so that they can provide better experiences for their customers. In this paper, I mainly discuss the architectures of MySQL, MongoDB, and Redis, which are high available and have been deployed in practical application environment, and design a hybrid architecture.
Levy, C.; Beauchamp, C.
1996-01-01
This poster describes the methods used and working prototype that was developed from an abstraction of the relational model from the VA's hierarchical DHCP database. Overlaying the relational model on DHCP permits multiple user views of the physical data structure, enhances access to the database by providing a link to commercial (SQL based) software, and supports a conceptual managed care data model based on primary and longitudinal patient care. The goal of this work was to create a relational abstraction of the existing hierarchical database; to construct, using SQL data definition language, user views of the database which reflect the clinical conceptual view of DHCP, and to allow the user to work directly with the logical view of the data using GUI based commercial software of their choosing. The workstation is intended to serve as a platform from which a managed care information model could be implemented and evaluated.
Maintaining Multimedia Data in a Geospatial Database
2012-09-01
at PostgreSQL and MySQL as spatial databases was offered. Given their results, as each database produced result sets from zero to 100,000, it was...excelled given multiple conditions. A different look at PostgreSQL and MySQL as spatial databases was offered. Given their results, as each database... MySQL ................................................................................................14 B. BENCHMARKING DATA RETRIEVED FROM TABLE
Sánchez-de-Madariaga, Ricardo; Muñoz, Adolfo; Lozano-Rubí, Raimundo; Serrano-Balazote, Pablo; Castro, Antonio L; Moreno, Oscar; Pascual, Mario
2017-08-18
The objective of this research is to compare the relational and non-relational (NoSQL) database systems approaches in order to store, recover, query and persist standardized medical information in the form of ISO/EN 13606 normalized Electronic Health Record XML extracts, both in isolation and concurrently. NoSQL database systems have recently attracted much attention, but few studies in the literature address their direct comparison with relational databases when applied to build the persistence layer of a standardized medical information system. One relational and two NoSQL databases (one document-based and one native XML database) of three different sizes have been created in order to evaluate and compare the response times (algorithmic complexity) of six different complexity growing queries, which have been performed on them. Similar appropriate results available in the literature have also been considered. Relational and non-relational NoSQL database systems show almost linear algorithmic complexity query execution. However, they show very different linear slopes, the former being much steeper than the two latter. Document-based NoSQL databases perform better in concurrency than in isolation, and also better than relational databases in concurrency. Non-relational NoSQL databases seem to be more appropriate than standard relational SQL databases when database size is extremely high (secondary use, research applications). Document-based NoSQL databases perform in general better than native XML NoSQL databases. EHR extracts visualization and edition are also document-based tasks more appropriate to NoSQL database systems. However, the appropriate database solution much depends on each particular situation and specific problem.
An SQL query generator for CLIPS
NASA Technical Reports Server (NTRS)
Snyder, James; Chirica, Laurian
1990-01-01
As expert systems become more widely used, their access to large amounts of external information becomes increasingly important. This information exists in several forms such as statistical, tabular data, knowledge gained by experts and large databases of information maintained by companies. Because many expert systems, including CLIPS, do not provide access to this external information, much of the usefulness of expert systems is left untapped. The scope of this paper is to describe a database extension for the CLIPS expert system shell. The current industry standard database language is SQL. Due to SQL standardization, large amounts of information stored on various computers, potentially at different locations, will be more easily accessible. Expert systems should be able to directly access these existing databases rather than requiring information to be re-entered into the expert system environment. The ORACLE relational database management system (RDBMS) was used to provide a database connection within the CLIPS environment. To facilitate relational database access a query generation system was developed as a CLIPS user function. The queries are entered in a CLlPS-like syntax and are passed to the query generator, which constructs and submits for execution, an SQL query to the ORACLE RDBMS. The query results are asserted as CLIPS facts. The query generator was developed primarily for use within the ICADS project (Intelligent Computer Aided Design System) currently being developed by the CAD Research Unit in the California Polytechnic State University (Cal Poly). In ICADS, there are several parallel or distributed expert systems accessing a common knowledge base of facts. Expert system has a narrow domain of interest and therefore needs only certain portions of the information. The query generator provides a common method of accessing this information and allows the expert system to specify what data is needed without specifying how to retrieve it.
Establishment and Assessment of Plasma Disruption and Warning Databases from EAST
NASA Astrophysics Data System (ADS)
Wang, Bo; Robert, Granetz; Xiao, Bingjia; Li, Jiangang; Yang, Fei; Li, Junjun; Chen, Dalong
2016-12-01
Disruption database and disruption warning database of the EAST tokamak had been established by a disruption research group. The disruption database, based on Structured Query Language (SQL), comprises 41 disruption parameters, which include current quench characteristics, EFIT equilibrium characteristics, kinetic parameters, halo currents, and vertical motion. Presently most disruption databases are based on plasma experiments of non-superconducting tokamak devices. The purposes of the EAST database are to find disruption characteristics and disruption statistics to the fully superconducting tokamak EAST, to elucidate the physics underlying tokamak disruptions, to explore the influence of disruption on superconducting magnets and to extrapolate toward future burning plasma devices. In order to quantitatively assess the usefulness of various plasma parameters for predicting disruptions, a similar SQL database to Alcator C-Mod for EAST has been created by compiling values for a number of proposed disruption-relevant parameters sampled from all plasma discharges in the 2015 campaign. The detailed statistic results and analysis of two databases on the EAST tokamak are presented. supported by the National Magnetic Confinement Fusion Science Program of China (No. 2014GB103000)
TabSQL: a MySQL tool to facilitate mapping user data to public databases.
Xia, Xiao-Qin; McClelland, Michael; Wang, Yipeng
2010-06-23
With advances in high-throughput genomics and proteomics, it is challenging for biologists to deal with large data files and to map their data to annotations in public databases. We developed TabSQL, a MySQL-based application tool, for viewing, filtering and querying data files with large numbers of rows. TabSQL provides functions for downloading and installing table files from public databases including the Gene Ontology database (GO), the Ensembl databases, and genome databases from the UCSC genome bioinformatics site. Any other database that provides tab-delimited flat files can also be imported. The downloaded gene annotation tables can be queried together with users' data in TabSQL using either a graphic interface or command line. TabSQL allows queries across the user's data and public databases without programming. It is a convenient tool for biologists to annotate and enrich their data.
TabSQL: a MySQL tool to facilitate mapping user data to public databases
2010-01-01
Background With advances in high-throughput genomics and proteomics, it is challenging for biologists to deal with large data files and to map their data to annotations in public databases. Results We developed TabSQL, a MySQL-based application tool, for viewing, filtering and querying data files with large numbers of rows. TabSQL provides functions for downloading and installing table files from public databases including the Gene Ontology database (GO), the Ensembl databases, and genome databases from the UCSC genome bioinformatics site. Any other database that provides tab-delimited flat files can also be imported. The downloaded gene annotation tables can be queried together with users' data in TabSQL using either a graphic interface or command line. Conclusions TabSQL allows queries across the user's data and public databases without programming. It is a convenient tool for biologists to annotate and enrich their data. PMID:20573251
Data structures and organisation: Special problems in scientific applications
NASA Astrophysics Data System (ADS)
Read, Brian J.
1989-12-01
In this paper we discuss and offer answers to the following questions: What, really, are the benifits of databases in physics? Are scientific databases essentially different from conventional ones? What are the drawbacks of a commercial database management system for use with scientific data? Do they outweigh the advantages? Do databases systems have adequate graphics facilities, or is a separate graphics package necessary? SQL as a standard language has deficiencies, but what are they for scientific data in particular? Indeed, is the relational model appropriate anyway? Or, should we turn to object oriented databases?
ERIC Educational Resources Information Center
Bosc, P.; Lietard, L.; Pivert, O.
2003-01-01
Considers flexible querying of relational databases. Highlights include SQL languages and basic aggregate operators; Sugeno's fuzzy integral; evaluation examples; and how and under what conditions other aggregate functions could be applied to fuzzy sets in a flexible query. (Author/LRW)
Migration from relational to NoSQL database
NASA Astrophysics Data System (ADS)
Ghotiya, Sunita; Mandal, Juhi; Kandasamy, Saravanakumar
2017-11-01
Data generated by various real time applications, social networking sites and sensor devices is of very huge amount and unstructured, which makes it difficult for Relational database management systems to handle the data. Data is very precious component of any application and needs to be analysed after arranging it in some structure. Relational databases are only able to deal with structured data, so there is need of NoSQL Database management System which can deal with semi -structured data also. Relational database provides the easiest way to manage the data but as the use of NoSQL is increasing it is becoming necessary to migrate the data from Relational to NoSQL databases. Various frameworks has been proposed previously which provides mechanisms for migration of data stored at warehouses in SQL, middle layer solutions which can provide facility of data to be stored in NoSQL databases to handle data which is not structured. This paper provides a literature review of some of the recent approaches proposed by various researchers to migrate data from relational to NoSQL databases. Some researchers proposed mechanisms for the co-existence of NoSQL and Relational databases together. This paper provides a summary of mechanisms which can be used for mapping data stored in Relational databases to NoSQL databases. Various techniques for data transformation and middle layer solutions are summarised in the paper.
PylotDB - A Database Management, Graphing, and Analysis Tool Written in Python
DOE Office of Scientific and Technical Information (OSTI.GOV)
Barnette, Daniel W.
2012-01-04
PylotDB, written completely in Python, provides a user interface (UI) with which to interact with, analyze, graph data from, and manage open source databases such as MySQL. The UI mitigates the user having to know in-depth knowledge of the database application programming interface (API). PylotDB allows the user to generate various kinds of plots from user-selected data; generate statistical information on text as well as numerical fields; backup and restore databases; compare database tables across different databases as well as across different servers; extract information from any field to create new fields; generate, edit, and delete databases, tables, and fields;more » generate or read into a table CSV data; and similar operations. Since much of the database information is brought under control of the Python computer language, PylotDB is not intended for huge databases for which MySQL and Oracle, for example, are better suited. PylotDB is better suited for smaller databases that might be typically needed in a small research group situation. PylotDB can also be used as a learning tool for database applications in general.« less
Extending SQL to Support Privacy Policies
NASA Astrophysics Data System (ADS)
Ghazinour, Kambiz; Pun, Sampson; Majedi, Maryam; Chinaci, Amir H.; Barker, Ken
Increasing concerns over Internet applications that violate user privacy by exploiting (back-end) database vulnerabilities must be addressed to protect both customer privacy and to ensure corporate strategic assets remain trustworthy. This chapter describes an extension onto database catalogues and Structured Query Language (SQL) for supporting privacy in Internet applications, such as in social networks, e-health, e-governmcnt, etc. The idea is to introduce new predicates to SQL commands to capture common privacy requirements, such as purpose, visibility, generalization, and retention for both mandatory and discretionary access control policies. The contribution is that corporations, when creating the underlying databases, will be able to define what their mandatory privacy policies arc with which all application users have to comply. Furthermore, each application user, when providing their own data, will be able to define their own privacy policies with which other users have to comply. The extension is supported with underlying catalogues and algorithms. The experiments demonstrate a very reasonable overhead for the extension. The result is a low-cost mechanism to create new systems that arc privacy aware and also to transform legacy databases to their privacy-preserving equivalents. Although the examples arc from social networks, one can apply the results to data security and user privacy of other enterprises as well.
BioWarehouse: a bioinformatics database warehouse toolkit
Lee, Thomas J; Pouliot, Yannick; Wagner, Valerie; Gupta, Priyanka; Stringer-Calvert, David WJ; Tenenbaum, Jessica D; Karp, Peter D
2006-01-01
Background This article addresses the problem of interoperation of heterogeneous bioinformatics databases. Results We introduce BioWarehouse, an open source toolkit for constructing bioinformatics database warehouses using the MySQL and Oracle relational database managers. BioWarehouse integrates its component databases into a common representational framework within a single database management system, thus enabling multi-database queries using the Structured Query Language (SQL) but also facilitating a variety of database integration tasks such as comparative analysis and data mining. BioWarehouse currently supports the integration of a pathway-centric set of databases including ENZYME, KEGG, and BioCyc, and in addition the UniProt, GenBank, NCBI Taxonomy, and CMR databases, and the Gene Ontology. Loader tools, written in the C and JAVA languages, parse and load these databases into a relational database schema. The loaders also apply a degree of semantic normalization to their respective source data, decreasing semantic heterogeneity. The schema supports the following bioinformatics datatypes: chemical compounds, biochemical reactions, metabolic pathways, proteins, genes, nucleic acid sequences, features on protein and nucleic-acid sequences, organisms, organism taxonomies, and controlled vocabularies. As an application example, we applied BioWarehouse to determine the fraction of biochemically characterized enzyme activities for which no sequences exist in the public sequence databases. The answer is that no sequence exists for 36% of enzyme activities for which EC numbers have been assigned. These gaps in sequence data significantly limit the accuracy of genome annotation and metabolic pathway prediction, and are a barrier for metabolic engineering. Complex queries of this type provide examples of the value of the data warehousing approach to bioinformatics research. Conclusion BioWarehouse embodies significant progress on the database integration problem for bioinformatics. PMID:16556315
BioWarehouse: a bioinformatics database warehouse toolkit.
Lee, Thomas J; Pouliot, Yannick; Wagner, Valerie; Gupta, Priyanka; Stringer-Calvert, David W J; Tenenbaum, Jessica D; Karp, Peter D
2006-03-23
This article addresses the problem of interoperation of heterogeneous bioinformatics databases. We introduce BioWarehouse, an open source toolkit for constructing bioinformatics database warehouses using the MySQL and Oracle relational database managers. BioWarehouse integrates its component databases into a common representational framework within a single database management system, thus enabling multi-database queries using the Structured Query Language (SQL) but also facilitating a variety of database integration tasks such as comparative analysis and data mining. BioWarehouse currently supports the integration of a pathway-centric set of databases including ENZYME, KEGG, and BioCyc, and in addition the UniProt, GenBank, NCBI Taxonomy, and CMR databases, and the Gene Ontology. Loader tools, written in the C and JAVA languages, parse and load these databases into a relational database schema. The loaders also apply a degree of semantic normalization to their respective source data, decreasing semantic heterogeneity. The schema supports the following bioinformatics datatypes: chemical compounds, biochemical reactions, metabolic pathways, proteins, genes, nucleic acid sequences, features on protein and nucleic-acid sequences, organisms, organism taxonomies, and controlled vocabularies. As an application example, we applied BioWarehouse to determine the fraction of biochemically characterized enzyme activities for which no sequences exist in the public sequence databases. The answer is that no sequence exists for 36% of enzyme activities for which EC numbers have been assigned. These gaps in sequence data significantly limit the accuracy of genome annotation and metabolic pathway prediction, and are a barrier for metabolic engineering. Complex queries of this type provide examples of the value of the data warehousing approach to bioinformatics research. BioWarehouse embodies significant progress on the database integration problem for bioinformatics.
Computerized database management system for breast cancer patients.
Sim, Kok Swee; Chong, Sze Siang; Tso, Chih Ping; Nia, Mohsen Esmaeili; Chong, Aun Kee; Abbas, Siti Fathimah
2014-01-01
Data analysis based on breast cancer risk factors such as age, race, breastfeeding, hormone replacement therapy, family history, and obesity was conducted on breast cancer patients using a new enhanced computerized database management system. My Structural Query Language (MySQL) is selected as the application for database management system to store the patient data collected from hospitals in Malaysia. An automatic calculation tool is embedded in this system to assist the data analysis. The results are plotted automatically and a user-friendly graphical user interface is developed that can control the MySQL database. Case studies show breast cancer incidence rate is highest among Malay women, followed by Chinese and Indian. The peak age for breast cancer incidence is from 50 to 59 years old. Results suggest that the chance of developing breast cancer is increased in older women, and reduced with breastfeeding practice. The weight status might affect the breast cancer risk differently. Additional studies are needed to confirm these findings.
StreptomycesInforSys: A web-enabled information repository
Jain, Chakresh Kumar; Gupta, Vidhi; Gupta, Ashvarya; Gupta, Sanjay; Wadhwa, Gulshan; Sharma, Sanjeev Kumar; Sarethy, Indira P
2012-01-01
Members of Streptomyces produce 70% of natural bioactive products. There is considerable amount of information available based on polyphasic approach for classification of Streptomyces. However, this information based on phenotypic, genotypic and bioactive component production profiles is crucial for pharmacological screening programmes. This is scattered across various journals, books and other resources, many of which are not freely accessible. The designed database incorporates polyphasic typing information using combinations of search options to aid in efficient screening of new isolates. This will help in the preliminary categorization of appropriate groups. It is a free relational database compatible with existing operating systems. A cross platform technology with XAMPP Web server has been used to develop, manage, and facilitate the user query effectively with database support. Employment of PHP, a platform-independent scripting language, embedded in HTML and the database management software MySQL will facilitate dynamic information storage and retrieval. The user-friendly, open and flexible freeware (PHP, MySQL and Apache) is foreseen to reduce running and maintenance cost. Availability www.sis.biowaves.org PMID:23275736
StreptomycesInforSys: A web-enabled information repository.
Jain, Chakresh Kumar; Gupta, Vidhi; Gupta, Ashvarya; Gupta, Sanjay; Wadhwa, Gulshan; Sharma, Sanjeev Kumar; Sarethy, Indira P
2012-01-01
Members of Streptomyces produce 70% of natural bioactive products. There is considerable amount of information available based on polyphasic approach for classification of Streptomyces. However, this information based on phenotypic, genotypic and bioactive component production profiles is crucial for pharmacological screening programmes. This is scattered across various journals, books and other resources, many of which are not freely accessible. The designed database incorporates polyphasic typing information using combinations of search options to aid in efficient screening of new isolates. This will help in the preliminary categorization of appropriate groups. It is a free relational database compatible with existing operating systems. A cross platform technology with XAMPP Web server has been used to develop, manage, and facilitate the user query effectively with database support. Employment of PHP, a platform-independent scripting language, embedded in HTML and the database management software MySQL will facilitate dynamic information storage and retrieval. The user-friendly, open and flexible freeware (PHP, MySQL and Apache) is foreseen to reduce running and maintenance cost. www.sis.biowaves.org.
Comprehensive Routing Security Development and Deployment for the Internet
2015-02-01
feature enhancement and bug fixes. • MySQL : MySQL is a widely used and popular open source database package. It was chosen for database support in the...RPSTIR depends on several other open source packages. • MySQL : MySQL is used for the the local RPKI database cache. • OpenSSL: OpenSSL is used for...cryptographic libraries for X.509 certificates. • ODBC mySql Connector: ODBC (Open Database Connectivity) is a standard programming interface (API) for
Development of Human Face Literature Database Using Text Mining Approach: Phase I.
Kaur, Paramjit; Krishan, Kewal; Sharma, Suresh K
2018-06-01
The face is an important part of the human body by which an individual communicates in the society. Its importance can be highlighted by the fact that a person deprived of face cannot sustain in the living world. The amount of experiments being performed and the number of research papers being published under the domain of human face have surged in the past few decades. Several scientific disciplines, which are conducting research on human face include: Medical Science, Anthropology, Information Technology (Biometrics, Robotics, and Artificial Intelligence, etc.), Psychology, Forensic Science, Neuroscience, etc. This alarms the need of collecting and managing the data concerning human face so that the public and free access of it can be provided to the scientific community. This can be attained by developing databases and tools on human face using bioinformatics approach. The current research emphasizes on creating a database concerning literature data of human face. The database can be accessed on the basis of specific keywords, journal name, date of publication, author's name, etc. The collected research papers will be stored in the form of a database. Hence, the database will be beneficial to the research community as the comprehensive information dedicated to the human face could be found at one place. The information related to facial morphologic features, facial disorders, facial asymmetry, facial abnormalities, and many other parameters can be extracted from this database. The front end has been developed using Hyper Text Mark-up Language and Cascading Style Sheets. The back end has been developed using hypertext preprocessor (PHP). The JAVA Script has used as scripting language. MySQL (Structured Query Language) is used for database development as it is most widely used Relational Database Management System. XAMPP (X (cross platform), Apache, MySQL, PHP, Perl) open source web application software has been used as the server.The database is still under the developmental phase and discusses the initial steps of its creation. The current paper throws light on the work done till date.
NASA Astrophysics Data System (ADS)
Dziedzic, Adam; Mulawka, Jan
2014-11-01
NoSQL is a new approach to data storage and manipulation. The aim of this paper is to gain more insight into NoSQL databases, as we are still in the early stages of understanding when to use them and how to use them in an appropriate way. In this submission descriptions of selected NoSQL databases are presented. Each of the databases is analysed with primary focus on its data model, data access, architecture and practical usage in real applications. Furthemore, the NoSQL databases are compared in fields of data references. The relational databases offer foreign keys, whereas NoSQL databases provide us with limited references. An intermediate model between graph theory and relational algebra which can address the problem should be created. Finally, the proposal of a new approach to the problem of inconsistent references in Big Data storage systems is introduced.
Migration of legacy mumps applications to relational database servers.
O'Kane, K C
2001-07-01
An extended implementation of the Mumps language is described that facilitates vendor neutral migration of legacy Mumps applications to SQL-based relational database servers. Implemented as a compiler, this system translates Mumps programs to operating system independent, standard C code for subsequent compilation to fully stand-alone, binary executables. Added built-in functions and support modules extend the native hierarchical Mumps database with access to industry standard, networked, relational database management servers (RDBMS) thus freeing Mumps applications from dependence upon vendor specific, proprietary, unstandardized database models. Unlike Mumps systems that have added captive, proprietary RDMBS access, the programs generated by this development environment can be used with any RDBMS system that supports common network access protocols. Additional features include a built-in web server interface and the ability to interoperate directly with programs and functions written in other languages.
NASA Astrophysics Data System (ADS)
Kuznetsov, Valentin; Riley, Daniel; Afaq, Anzar; Sekhri, Vijay; Guo, Yuyi; Lueking, Lee
2010-04-01
The CMS experiment has implemented a flexible and powerful system enabling users to find data within the CMS physics data catalog. The Dataset Bookkeeping Service (DBS) comprises a database and the services used to store and access metadata related to CMS physics data. To this, we have added a generalized query system in addition to the existing web and programmatic interfaces to the DBS. This query system is based on a query language that hides the complexity of the underlying database structure by discovering the join conditions between database tables. This provides a way of querying the system that is simple and straightforward for CMS data managers and physicists to use without requiring knowledge of the database tables or keys. The DBS Query Language uses the ANTLR tool to build the input query parser and tokenizer, followed by a query builder that uses a graph representation of the DBS schema to construct the SQL query sent to underlying database. We will describe the design of the query system, provide details of the language components and overview of how this component fits into the overall data discovery system architecture.
Recommender System for Learning SQL Using Hints
ERIC Educational Resources Information Center
Lavbic, Dejan; Matek, Tadej; Zrnec, Aljaž
2017-01-01
Today's software industry requires individuals who are proficient in as many programming languages as possible. Structured query language (SQL), as an adopted standard, is no exception, as it is the most widely used query language to retrieve and manipulate data. However, the process of learning SQL turns out to be challenging. The need for a…
Evaluation of relational and NoSQL database architectures to manage genomic annotations.
Schulz, Wade L; Nelson, Brent G; Felker, Donn K; Durant, Thomas J S; Torres, Richard
2016-12-01
While the adoption of next generation sequencing has rapidly expanded, the informatics infrastructure used to manage the data generated by this technology has not kept pace. Historically, relational databases have provided much of the framework for data storage and retrieval. Newer technologies based on NoSQL architectures may provide significant advantages in storage and query efficiency, thereby reducing the cost of data management. But their relative advantage when applied to biomedical data sets, such as genetic data, has not been characterized. To this end, we compared the storage, indexing, and query efficiency of a common relational database (MySQL), a document-oriented NoSQL database (MongoDB), and a relational database with NoSQL support (PostgreSQL). When used to store genomic annotations from the dbSNP database, we found the NoSQL architectures to outperform traditional, relational models for speed of data storage, indexing, and query retrieval in nearly every operation. These findings strongly support the use of novel database technologies to improve the efficiency of data management within the biological sciences. Copyright © 2016 Elsevier Inc. All rights reserved.
2015-09-01
Detectability ...............................................................................................37 Figure 20. Excel VBA Codes for Checker...National Vulnerability Database OS Operating System SQL Structured Query Language VC Verification Condition VBA Visual Basic for Applications...checks each of these assertions for detectability by Daikon. The checker is an Excel Visual Basic for Applications ( VBA ) script that checks the
Flexible Decision Support in Device-Saturated Environments
2003-10-01
also output tuples to a remote MySQL or Postgres database. 3.3 GUI The GUI allows the user to pose queries using SQL and to display query...DatabaseConnection.java – handles connections to an external database (such as MySQL or Postgres ). • Debug.java – contains the code for printing out Debug messages...also provided. It is possible to output the results of queries to a MySQL or Postgres database for archival and the GUI can query those results
Adding Hierarchical Objects to Relational Database General-Purpose XML-Based Information Managements
NASA Technical Reports Server (NTRS)
Lin, Shu-Chun; Knight, Chris; La, Tracy; Maluf, David; Bell, David; Tran, Khai Peter; Gawdiak, Yuri
2006-01-01
NETMARK is a flexible, high-throughput software system for managing, storing, and rapid searching of unstructured and semi-structured documents. NETMARK transforms such documents from their original highly complex, constantly changing, heterogeneous data formats into well-structured, common data formats in using Hypertext Markup Language (HTML) and/or Extensible Markup Language (XML). The software implements an object-relational database system that combines the best practices of the relational model utilizing Structured Query Language (SQL) with those of the object-oriented, semantic database model for creating complex data. In particular, NETMARK takes advantage of the Oracle 8i object-relational database model using physical-address data types for very efficient keyword searches of records across both context and content. NETMARK also supports multiple international standards such as WEBDAV for drag-and-drop file management and SOAP for integrated information management using Web services. The document-organization and -searching capabilities afforded by NETMARK are likely to make this software attractive for use in disciplines as diverse as science, auditing, and law enforcement.
Agile Datacube Analytics (not just) for the Earth Sciences
NASA Astrophysics Data System (ADS)
Misev, Dimitar; Merticariu, Vlad; Baumann, Peter
2017-04-01
Metadata are considered small, smart, and queryable; data, on the other hand, are known as big, clumsy, hard to analyze. Consequently, gridded data - such as images, image timeseries, and climate datacubes - are managed separately from the metadata, and with different, restricted retrieval capabilities. One reason for this silo approach is that databases, while good at tables, XML hierarchies, RDF graphs, etc., traditionally do not support multi-dimensional arrays well. This gap is being closed by Array Databases which extend the SQL paradigm of "any query, anytime" to NoSQL arrays. They introduce semantically rich modelling combined with declarative, high-level query languages on n-D arrays. On Server side, such queries can be optimized, parallelized, and distributed based on partitioned array storage. This way, they offer new vistas in flexibility, scalability, performance, and data integration. In this respect, the forthcoming ISO SQL extension MDA ("Multi-dimensional Arrays") will be a game changer in Big Data Analytics. We introduce concepts and opportunities through the example of rasdaman ("raster data manager") which in fact has pioneered the field of Array Databases and forms the blueprint for ISO SQL/MDA and further Big Data standards, such as OGC WCPS for querying spatio-temporal Earth datacubes. With operational installations exceeding 140 TB queries have been split across more than one thousand cloud nodes, using CPUs as well as GPUs. Installations can easily be mashed up securely, enabling large-scale location-transparent query processing in federations. Federation queries have been demonstrated live at EGU 2016 spanning Europe and Australia in the context of the intercontinental EarthServer initiative, visualized through NASA WorldWind.
Agile Datacube Analytics (not just) for the Earth Sciences
NASA Astrophysics Data System (ADS)
Baumann, P.
2016-12-01
Metadata are considered small, smart, and queryable; data, on the other hand, are known as big, clumsy, hard to analyze. Consequently, gridded data - such as images, image timeseries, and climate datacubes - are managed separately from the metadata, and with different, restricted retrieval capabilities. One reason for this silo approach is that databases, while good at tables, XML hierarchies, RDF graphs, etc., traditionally do not support multi-dimensional arrays well.This gap is being closed by Array Databases which extend the SQL paradigm of "any query, anytime" to NoSQL arrays. They introduce semantically rich modelling combined with declarative, high-level query languages on n-D arrays. On Server side, such queries can be optimized, parallelized, and distributed based on partitioned array storage. This way, they offer new vistas in flexibility, scalability, performance, and data integration. In this respect, the forthcoming ISO SQL extension MDA ("Multi-dimensional Arrays") will be a game changer in Big Data Analytics.We introduce concepts and opportunities through the example of rasdaman ("raster data manager") which in fact has pioneered the field of Array Databases and forms the blueprint for ISO SQL/MDA and further Big Data standards, such as OGC WCPS for querying spatio-temporal Earth datacubes. With operational installations exceeding 140 TB queries have been split across more than one thousand cloud nodes, using CPUs as well as GPUs. Installations can easily be mashed up securely, enabling large-scale location-transparent query processing in federations. Federation queries have been demonstrated live at EGU 2016 spanning Europe and Australia in the context of the intercontinental EarthServer initiative, visualized through NASA WorldWind.
A Database-Based and Web-Based Meta-CASE System
NASA Astrophysics Data System (ADS)
Eessaar, Erki; Sgirka, Rünno
Each Computer Aided Software Engineering (CASE) system provides support to a software process or specific tasks or activities that are part of a software process. Each meta-CASE system allows us to create new CASE systems. The creators of a new CASE system have to specify abstract syntax of the language that is used in the system and functionality as well as non-functional properties of the new system. Many meta-CASE systems record their data directly in files. In this paper, we introduce a meta-CASE system, the enabling technology of which is an object-relational database system (ORDBMS). The system allows users to manage specifications of languages and create models by using these languages. The system has web-based and form-based user interface. We have created a proof-of-concept prototype of the system by using PostgreSQL ORDBMS and PHP scripting language.
Using an image-extended relational database to support content-based image retrieval in a PACS.
Traina, Caetano; Traina, Agma J M; Araújo, Myrian R B; Bueno, Josiane M; Chino, Fabio J T; Razente, Humberto; Azevedo-Marques, Paulo M
2005-12-01
This paper presents a new Picture Archiving and Communication System (PACS), called cbPACS, which has content-based image retrieval capabilities. The cbPACS answers range and k-nearest- neighbor similarity queries, employing a relational database manager extended to support images. The images are compared through their features, which are extracted by an image-processing module and stored in the extended relational database. The database extensions were developed aiming at efficiently answering similarity queries by taking advantage of specialized indexing methods. The main concept supporting the extensions is the definition, inside the relational manager, of distance functions based on features extracted from the images. An extension to the SQL language enables the construction of an interpreter that intercepts the extended commands and translates them to standard SQL, allowing any relational database server to be used. By now, the system implemented works on features based on color distribution of the images through normalized histograms as well as metric histograms. Metric histograms are invariant regarding scale, translation and rotation of images and also to brightness transformations. The cbPACS is prepared to integrate new image features, based on texture and shape of the main objects in the image.
The Comparison of SQL, QBE, and DFQL as Query Languages for Relational Databases
1994-03-01
is: Dname F-mune Laame Headquarter James Borg b. Query 7: RetieMl involving explicit sets Retrieve the Social Security Numbers of employees who worked...i •••,• I• i , i I I • I 10. Ka Dispullahta MABES TNI-AL Cilangkap-Jakarta Timur Indonesia 11. Parunmungan Girsang 3 Jl. Cawang Baru 34-36 Jakarta
Performance Evaluation of NoSQL Databases: A Case Study
2015-02-01
a centralized relational database. The customer decided to consider NoSQL technologies for two specific uses, namely: the primary data store for...17 custom specific 6. FU NoSQL availab data mo arking of data g a specific wo sin benchmark f hmark for tran le workload de o publish meas their...The choice of a particular NoSQL database imposes a specific distributed software architecture and data model, and is a major determinant of the
ADASS Web Database XML Project
NASA Astrophysics Data System (ADS)
Barg, M. I.; Stobie, E. B.; Ferro, A. J.; O'Neil, E. J.
In the spring of 2000, at the request of the ADASS Program Organizing Committee (POC), we began organizing information from previous ADASS conferences in an effort to create a centralized database. The beginnings of this database originated from data (invited speakers, participants, papers, etc.) extracted from HyperText Markup Language (HTML) documents from past ADASS host sites. Unfortunately, not all HTML documents are well formed and parsing them proved to be an iterative process. It was evident at the beginning that if these Web documents were organized in a standardized way, such as XML (Extensible Markup Language), the processing of this information across the Web could be automated, more efficient, and less error prone. This paper will briefly review the many programming tools available for processing XML, including Java, Perl and Python, and will explore the mapping of relational data from our MySQL database to XML.
[Design and establishment of modern literature database about acupuncture Deqi].
Guo, Zheng-rong; Qian, Gui-feng; Pan, Qiu-yin; Wang, Yang; Xin, Si-yuan; Li, Jing; Hao, Jie; Hu, Ni-juan; Zhu, Jiang; Ma, Liang-xiao
2015-02-01
A search on acupuncture Deqi was conducted using four Chinese-language biomedical databases (CNKI, Wan-Fang, VIP and CBM) and PubMed database and using keywords "Deqi" or "needle sensation" "needling feeling" "needle feel" "obtaining qi", etc. Then, a "Modern Literature Database for Acupuncture Deqi" was established by employing Microsoft SQL Server 2005 Express Edition, introducing the contents, data types, information structure and logic constraint of the system table fields. From this Database, detailed inquiries about general information of clinical trials, acupuncturists' experience, ancient medical works, comprehensive literature, etc. can be obtained. The present databank lays a foundation for subsequent evaluation of literature quality about Deqi and data mining of undetected Deqi knowledge.
2012-09-01
relative performance of several conventional SQL and NoSQL databases with a set of one billion file block hashes. Digital Forensics, Sector Hashing, Full... NoSQL databases with a set of one billion file block hashes. v THIS PAGE INTENTIONALLY LEFT BLANK vi Table of Contents List of Acronyms and...Operating System NOOP No Operation assembly instruction NoSQL “Not only SQL” model for non-relational database management NSRL National Software
Lee, Ken Ka-Yin; Tang, Wai-Choi; Choi, Kup-Sze
2013-04-01
Clinical data are dynamic in nature, often arranged hierarchically and stored as free text and numbers. Effective management of clinical data and the transformation of the data into structured format for data analysis are therefore challenging issues in electronic health records development. Despite the popularity of relational databases, the scalability of the NoSQL database model and the document-centric data structure of XML databases appear to be promising features for effective clinical data management. In this paper, three database approaches--NoSQL, XML-enabled and native XML--are investigated to evaluate their suitability for structured clinical data. The database query performance is reported, together with our experience in the databases development. The results show that NoSQL database is the best choice for query speed, whereas XML databases are advantageous in terms of scalability, flexibility and extensibility, which are essential to cope with the characteristics of clinical data. While NoSQL and XML technologies are relatively new compared to the conventional relational database, both of them demonstrate potential to become a key database technology for clinical data management as the technology further advances. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
Photo-z-SQL: Photometric redshift estimation framework
NASA Astrophysics Data System (ADS)
Beck, Róbert; Dobos, László; Budavári, Tamás; Szalay, Alexander S.; Csabai, István
2017-04-01
Photo-z-SQL is a flexible template-based photometric redshift estimation framework that can be seamlessly integrated into a SQL database (or DB) server and executed on demand in SQL. The DB integration eliminates the need to move large photometric datasets outside a database for redshift estimation, and uses the computational capabilities of DB hardware. Photo-z-SQL performs both maximum likelihood and Bayesian estimation and handles inputs of variable photometric filter sets and corresponding broad-band magnitudes.
HBVPathDB: a database of HBV infection-related molecular interaction network.
Zhang, Yi; Bo, Xiao-Chen; Yang, Jing; Wang, Sheng-Qi
2005-03-21
To describe molecules or genes interaction between hepatitis B viruses (HBV) and host, for understanding how virus' and host's genes and molecules are networked to form a biological system and for perceiving mechanism of HBV infection. The knowledge of HBV infection-related reactions was organized into various kinds of pathways with carefully drawn graphs in HBVPathDB. Pathway information is stored with relational database management system (DBMS), which is currently the most efficient way to manage large amounts of data and query is implemented with powerful Structured Query Language (SQL). The search engine is written using Personal Home Page (PHP) with SQL embedded and web retrieval interface is developed for searching with Hypertext Markup Language (HTML). We present the first version of HBVPathDB, which is a HBV infection-related molecular interaction network database composed of 306 pathways with 1 050 molecules involved. With carefully drawn graphs, pathway information stored in HBVPathDB can be browsed in an intuitive way. We develop an easy-to-use interface for flexible accesses to the details of database. Convenient software is implemented to query and browse the pathway information of HBVPathDB. Four search page layout options-category search, gene search, description search, unitized search-are supported by the search engine of the database. The database is freely available at http://www.bio-inf.net/HBVPathDB/HBV/. The conventional perspective HBVPathDB have already contained a considerable amount of pathway information with HBV infection related, which is suitable for in-depth analysis of molecular interaction network of virus and host. HBVPathDB integrates pathway data-sets with convenient software for query, browsing, visualization, that provides users more opportunity to identify regulatory key molecules as potential drug targets and to explore the possible mechanism of HBV infection based on gene expression datasets.
Named Entity Recognition in a Hungarian NL Based QA System
NASA Astrophysics Data System (ADS)
Tikkl, Domonkos; Szidarovszky, P. Ferenc; Kardkovacs, Zsolt T.; Magyar, Gábor
In WoW project our purpose is to create a complex search interface with the following features: search in the deep web content of contracted partners' databases, processing Hungarian natural language (NL) questions and transforming them to SQL queries for database access, image search supported by a visual thesaurus that describes in a structural form the visual content of images (also in Hungarian). This paper primarily focuses on a particular problem of question processing task: the entity recognition. Before going into details we give a short overview of the project's aims.
Database Entity Persistence with Hibernate for the Network Connectivity Analysis Model
2014-04-01
time savings in the Java coding development process. Appendices A and B describe address setup procedures for installing the MySQL database...development environment is required: • The open source MySQL Database Management System (DBMS) from Oracle, which is a Java Database Connectivity (JDBC...compliant DBMS • MySQL JDBC Driver library that comes as a plug-in with the Netbeans distribution • The latest Java Development Kit with the latest
Teaching Case: Introduction to NoSQL in a Traditional Database Course
ERIC Educational Resources Information Center
Fowler, Brad; Godin, Joy; Geddy, Margaret
2016-01-01
Many organizations are dealing with the increasing demands of big data, so they are turning to NoSQL databases as their preferred system for handling the unique problems of capturing and storing massive amounts of data. Therefore, it is likely that employees in all sizes of organizations will encounter NoSQL databases. Thus, to be more job-ready,…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Enders, Alexander L.; Lousteau, Angela L.
The Desktop Analysis Reporting Tool (DART) is a software package that allows users to easily view and analyze daily files that span long periods. DART gives users the capability to quickly determine the state of health of a radiation portal monitor (RPM), troubleshoot and diagnose problems, and view data in various time frames to perform trend analysis. In short, it converts the data strings written in the daily files into meaningful tables and plots. The standalone version of DART (“soloDART”) utilizes a database engine that is included with the application; no additional installations are necessary. There is also a networkedmore » version of DART (“polyDART”) that is designed to maximize the benefit of a centralized data repository while distributing the workload to individual desktop machines. This networked approach requires a more complex database manager Structured Query Language (SQL) Server; however, SQL Server is not currently provided with DART. Regardless of which version is used, DART will import daily files from RPMs, store the relevant data in its database, and it can produce reports for status, trend analysis, and reporting purposes.« less
Comparison of the Frontier Distributed Database Caching System to NoSQL Databases
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dykstra, Dave
One of the main attractions of non-relational NoSQL databases is their ability to scale to large numbers of readers, including readers spread over a wide area. The Frontier distributed database caching system, used in production by the Large Hadron Collider CMS and ATLAS detector projects for Conditions data, is based on traditional SQL databases but also adds high scalability and the ability to be distributed over a wide-area for an important subset of applications. This paper compares the major characteristics of the two different approaches and identifies the criteria for choosing which approach to prefer over the other. It alsomore » compares in some detail the NoSQL databases used by CMS and ATLAS: MongoDB, CouchDB, HBase, and Cassandra.« less
Sujansky, Walter V; Faus, Sam A; Stone, Ethan; Brennan, Patricia Flatley
2010-10-01
Online personal health records (PHRs) enable patients to access, manage, and share certain of their own health information electronically. This capability creates the need for precise access-controls mechanisms that restrict the sharing of data to that intended by the patient. The authors describe the design and implementation of an access-control mechanism for PHR repositories that is modeled on the eXtensible Access Control Markup Language (XACML) standard, but intended to reduce the cognitive and computational complexity of XACML. The authors implemented the mechanism entirely in a relational database system using ANSI-standard SQL statements. Based on a set of access-control rules encoded as relational table rows, the mechanism determines via a single SQL query whether a user who accesses patient data from a specific application is authorized to perform a requested operation on a specified data object. Testing of this query on a moderately large database has demonstrated execution times consistently below 100ms. The authors include the details of the implementation, including algorithms, examples, and a test database as Supplementary materials. Copyright © 2010 Elsevier Inc. All rights reserved.
Multi-Resolution Playback of Network Trace Files
2015-06-01
a com- plete MySQL database, C++ developer tools and the libraries utilized in the development of the system (Boost and Libcrafter), and Wireshark...XE suite has a limit to the allowed size of each database. In order to be scalable, the project had to switch to the MySQL database suite. The...programs that access the database use the MySQL C++ connector, provided by Oracle, and the supplied methods and libraries. 4.4 Flow Generator Chapter 3
An approach in building a chemical compound search engine in oracle database.
Wang, H; Volarath, P; Harrison, R
2005-01-01
A searching or identifying of chemical compounds is an important process in drug design and in chemistry research. An efficient search engine involves a close coupling of the search algorithm and database implementation. The database must process chemical structures, which demands the approaches to represent, store, and retrieve structures in a database system. In this paper, a general database framework for working as a chemical compound search engine in Oracle database is described. The framework is devoted to eliminate data type constrains for potential search algorithms, which is a crucial step toward building a domain specific query language on top of SQL. A search engine implementation based on the database framework is also demonstrated. The convenience of the implementation emphasizes the efficiency and simplicity of the framework.
2004-03-01
with MySQL . This choice was made because MySQL is open source. Any significant database engine such as Oracle or MS- SQL or even MS Access can be used...10 Figure 6. The DoD vs . Commercial Life Cycle...necessarily be interested in SCADA network security 13. MySQL (Database server) – This station represents a typical data server for a web page
A new relational database structure and online interface for the HITRAN database
NASA Astrophysics Data System (ADS)
Hill, Christian; Gordon, Iouli E.; Rothman, Laurence S.; Tennyson, Jonathan
2013-11-01
A new format for the HITRAN database is proposed. By storing the line-transition data in a number of linked tables described by a relational database schema, it is possible to overcome the limitations of the existing format, which have become increasingly apparent over the last few years as new and more varied data are being used by radiative-transfer models. Although the database in the new format can be searched using the well-established Structured Query Language (SQL), a web service, HITRANonline, has been deployed to allow users to make most common queries of the database using a graphical user interface in a web page. The advantages of the relational form of the database to ensuring data integrity and consistency are explored, and the compatibility of the online interface with the emerging standards of the Virtual Atomic and Molecular Data Centre (VAMDC) project is discussed. In particular, the ability to access HITRAN data using a standard query language from other websites, command line tools and from within computer programs is described.
Yu, Kaijun
2010-07-01
This paper Analys the design goals of Medical Instrumentation standard information retrieval system. Based on the B /S structure,we established a medical instrumentation standard retrieval system with ASP.NET C # programming language, IIS f Web server, SQL Server 2000 database, in the. NET environment. The paper also Introduces the system structure, retrieval system modules, system development environment and detailed design of the system.
DESPIC: Detecting Early Signatures of Persuasion in Information Cascades
2015-08-27
over NoSQL Databases, Proceedings of the 14th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 2014). 26-MAY-14, . : , P...over NoSQL Databases. Proceedings of the 14th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 2014). Chicago, IL, USA...distributed NoSQL databases including HBase and Riak, we finalized the requirements of the optimal computational architecture to support our framework
Cyclone: java-based querying and computing with Pathway/Genome databases.
Le Fèvre, François; Smidtas, Serge; Schächter, Vincent
2007-05-15
Cyclone aims at facilitating the use of BioCyc, a collection of Pathway/Genome Databases (PGDBs). Cyclone provides a fully extensible Java Object API to analyze and visualize these data. Cyclone can read and write PGDBs, and can write its own data in the CycloneML format. This format is automatically generated from the BioCyc ontology by Cyclone itself, ensuring continued compatibility. Cyclone objects can also be stored in a relational database CycloneDB. Queries can be written in SQL, and in an intuitive and concise object-oriented query language, Hibernate Query Language (HQL). In addition, Cyclone interfaces easily with Java software including the Eclipse IDE for HQL edition, the Jung API for graph algorithms or Cytoscape for graph visualization. Cyclone is freely available under an open source license at: http://sourceforge.net/projects/nemo-cyclone. For download and installation instructions, tutorials, use cases and examples, see http://nemo-cyclone.sourceforge.net.
Catalogue of HI PArameters (CHIPA)
NASA Astrophysics Data System (ADS)
Saponara, J.; Benaglia, P.; Koribalski, B.; Andruchow, I.
2015-08-01
The catalogue of HI parameters of galaxies HI (CHIPA) is the natural continuation of the compilation by M.C. Martin in 1998. CHIPA provides the most important parameters of nearby galaxies derived from observations of the neutral Hydrogen line. The catalogue contains information of 1400 galaxies across the sky and different morphological types. Parameters like the optical diameter of the galaxy, the blue magnitude, the distance, morphological type, HI extension are listed among others. Maps of the HI distribution, velocity and velocity dispersion can also be display for some cases. The main objective of this catalogue is to facilitate the bibliographic queries, through searching in a database accessible from the internet that will be available in 2015 (the website is under construction). The database was built using the open source `` mysql (SQL, Structured Query Language, management system relational database) '', while the website was built with ''HTML (Hypertext Markup Language)'' and ''PHP (Hypertext Preprocessor)''.
SQLGEN: a framework for rapid client-server database application development.
Nadkarni, P M; Cheung, K H
1995-12-01
SQLGEN is a framework for rapid client-server relational database application development. It relies on an active data dictionary on the client machine that stores metadata on one or more database servers to which the client may be connected. The dictionary generates dynamic Structured Query Language (SQL) to perform common database operations; it also stores information about the access rights of the user at log-in time, which is used to partially self-configure the behavior of the client to disable inappropriate user actions. SQLGEN uses a microcomputer database as the client to store metadata in relational form, to transiently capture server data in tables, and to allow rapid application prototyping followed by porting to client-server mode with modest effort. SQLGEN is currently used in several production biomedical databases.
Design of Instant Messaging System of Multi-language E-commerce Platform
NASA Astrophysics Data System (ADS)
Yang, Heng; Chen, Xinyi; Li, Jiajia; Cao, Yaru
2017-09-01
This paper aims at researching the message system in the instant messaging system based on the multi-language e-commerce platform in order to design the instant messaging system in multi-language environment and exhibit the national characteristics based information as well as applying national languages to e-commerce. In order to develop beautiful and friendly system interface for the front end of the message system and reduce the development cost, the mature jQuery framework is adopted in this paper. The high-performance server Tomcat is adopted at the back end to process user requests, and MySQL database is adopted for data storage to persistently store user data, and meanwhile Oracle database is adopted as the message buffer for system optimization. Moreover, AJAX technology is adopted for the client to actively pull the newest data from the server at the specified time. In practical application, the system has strong reliability, good expansibility, short response time, high system throughput capacity and high user concurrency.
Footprint Representation of Planetary Remote Sensing Data
NASA Astrophysics Data System (ADS)
Walter, S. H. G.; Gasselt, S. V.; Michael, G.; Neukum, G.
The geometric outline of remote sensing image data, the so called footprint, can be represented as a number of coordinate tuples. These polygons are associated with according attribute information such as orbit name, ground- and image resolution, solar longitude and illumination conditions to generate a powerful base for classification of planetary experiment data. Speed, handling and extended capabilites are the reasons for using geodatabases to store and access these data types. Techniques for such a spatial database of footprint data are demonstrated using the Relational Database Management System (RDBMS) PostgreSQL, spatially enabled by the PostGIS extension. Exemplary, footprints of the HRSC and OMEGA instruments, both onboard ESA's Mars Express Orbiter, are generated and connected to attribute information. The aim is to provide high-resolution footprints of the OMEGA instrument to the science community for the first time and make them available for web-based mapping applications like the "Planetary Interactive GIS-on-the-Web Analyzable Database" (PIG- WAD), produced by the USGS. Map overlays with HRSC or other instruments like MOC and THEMIS (footprint maps are already available for these instruments and can be integrated into the database) allow on-the-fly intersection and comparison as well as extended statistics of the data. Footprint polygons are generated one by one using standard software provided by the instrument teams. Attribute data is calculated and stored together with the geometric information. In the case of HRSC, the coordinates of the footprints are already available in the VICAR label of each image file. Using the VICAR RTL and PostgreSQL's libpq C library they are loaded into the database using the Well-Known Text (WKT) notation by the Open Geospatial Consortium, Inc. (OGC). For the OMEGA instrument, image data is read using IDL routines developed and distributed by the OMEGA team. Image outlines are exported together with relevant attribute data to the industry standard Shapefile format. These files are translated to a Structured Query Language (SQL) command sequence suitable for insertion into the PostGIS/PostgrSQL database using the shp2pgsql data loader provided by the PostGIS software. PostgreSQL's advanced features such as geometry types, rules, operators and functions allow complex spatial queries and on-the-fly processing of data on DBMS level e.g. generalisation of the outlines. Processing done by the DBMS, visualisation via GIS systems and utilisation for web-based applications like mapservers will be demonstrated.
Cross-Matching Source Observations from the Palomar Transient Factory (PTF)
NASA Astrophysics Data System (ADS)
Laher, Russ; Grillmair, C.; Surace, J.; Monkewitz, S.; Jackson, E.
2009-01-01
Over the four-year lifetime of the PTF project, approximately 40 billion instances of astronomical-source observations will be extracted from the image data. The instances will correspond to the same astronomical objects being observed at roughly 25-50 different times, and so a very large catalog containing important object-variability information will be the chief PTF product. Organizing astronomical-source catalogs is conventionally done by dividing the catalog into declination zones and sorting by right ascension within each zone (e.g., the USNOA star catalog), in order to facilitate catalog searches. This method was reincarnated as the "zones" algorithm in a SQL-Server database implementation (Szalay et al., MSR-TR-2004-32), with corrections given by Gray et al. (MSR-TR-2006-52). The primary advantage of this implementation is that all of the work is done entirely on the database server and client/server communication is eliminated. We implemented the methods outlined in Gray et al. for a PostgreSQL database. We programmed the methods as database functions in PL/pgSQL procedural language. The cross-matching is currently based on source positions, but we intend to extend it to use both positions and positional uncertainties to form a chi-square statistic for optimal thresholding. The database design includes three main tables, plus a handful of internal tables. The Sources table stores the SExtractor source extractions taken at various times; the MergedSources table stores statistics about the astronomical objects, which are the result of cross-matching records in the Sources table; and the Merges table, which associates cross-matched primary keys in the Sources table with primary keys in the MergedSoures table. Besides judicious database indexing, we have also internally partitioned the Sources table by declination zone, in order to speed up the population of Sources records and make the database more manageable. The catalog will be accessible to the public after the proprietary period through IRSA (irsa.ipac.caltech.edu).
NASA Technical Reports Server (NTRS)
Steeman, Gerald; Connell, Christopher
2000-01-01
Many librarians may feel that dynamic Web pages are out of their reach, financially and technically. Yet we are reminded in library and Web design literature that static home pages are a thing of the past. This paper describes how librarians at the Institute for Defense Analyses (IDA) library developed a database-driven, dynamic intranet site using commercial off-the-shelf applications. Administrative issues include surveying a library users group for interest and needs evaluation; outlining metadata elements; and, committing resources from managing time to populate the database and training in Microsoft FrontPage and Web-to-database design. Technical issues covered include Microsoft Access database fundamentals, lessons learned in the Web-to-database process (including setting up Database Source Names (DSNs), redesigning queries to accommodate the Web interface, and understanding Access 97 query language vs. Standard Query Language (SQL)). This paper also offers tips on editing Active Server Pages (ASP) scripting to create desired results. A how-to annotated resource list closes out the paper.
Deng, Chen-Hui; Zhang, Guan-Min; Bi, Shan-Shan; Zhou, Tian-Yan; Lu, Wei
2011-07-01
This study is to develop a therapeutic drug monitoring (TDM) network server of tacrolimus for Chinese renal transplant patients, which can facilitate doctor to manage patients' information and provide three levels of predictions. Database management system MySQL was employed to build and manage the database of patients and doctors' information, and hypertext mark-up language (HTML) and Java server pages (JSP) technology were employed to construct network server for database management. Based on the population pharmacokinetic model of tacrolimus for Chinese renal transplant patients, above program languages were used to construct the population prediction and subpopulation prediction modules. Based on Bayesian principle and maximization of the posterior probability function, an objective function was established, and minimized by an optimization algorithm to estimate patient's individual pharmacokinetic parameters. It is proved that the network server has the basic functions for database management and three levels of prediction to aid doctor to optimize the regimen of tacrolimus for Chinese renal transplant patients.
[Establishment of a comprehensive database for laryngeal cancer related genes and the miRNAs].
Li, Mengjiao; E, Qimin; Liu, Jialin; Huang, Tingting; Liang, Chuanyu
2015-09-01
By collecting and analyzing the laryngeal cancer related genes and the miRNAs, to build a comprehensive laryngeal cancer-related gene database, which differs from the current biological information database with complex and clumsy structure and focuses on the theme of gene and miRNA, and it could make the research and teaching more convenient and efficient. Based on the B/S architecture, using Apache as a Web server, MySQL as coding language of database design and PHP as coding language of web design, a comprehensive database for laryngeal cancer-related genes was established, providing with the gene tables, protein tables, miRNA tables and clinical information tables of the patients with laryngeal cancer. The established database containsed 207 laryngeal cancer related genes, 243 proteins, 26 miRNAs, and their particular information such as mutations, methylations, diversified expressions, and the empirical references of laryngeal cancer relevant molecules. The database could be accessed and operated via the Internet, by which browsing and retrieval of the information were performed. The database were maintained and updated regularly. The database for laryngeal cancer related genes is resource-integrated and user-friendly, providing a genetic information query tool for the study of laryngeal cancer.
Enhanced DIII-D Data Management Through a Relational Database
NASA Astrophysics Data System (ADS)
Burruss, J. R.; Peng, Q.; Schachter, J.; Schissel, D. P.; Terpstra, T. B.
2000-10-01
A relational database is being used to serve data about DIII-D experiments. The database is optimized for queries across multiple shots, allowing for rapid data mining by SQL-literate researchers. The relational database relates different experiments and datasets, thus providing a big picture of DIII-D operations. Users are encouraged to add their own tables to the database. Summary physics quantities about DIII-D discharges are collected and stored in the database automatically. Meta-data about code runs, MDSplus usage, and visualization tool usage are collected, stored in the database, and later analyzed to improve computing. Documentation on the database may be accessed through programming languages such as C, Java, and IDL, or through ODBC compliant applications such as Excel and Access. A database-driven web page also provides a convenient means for viewing database quantities through the World Wide Web. Demonstrations will be given at the poster.
2002-06-01
Student memo for personnel MCLLS . . . . . . . . . . . . . . 75 i. Migrate data to SQL Server...The Web Server is on the same server as the SWORD database in the current version. 4: results set 5: dynamic HTML page 6: dynamic HTML page 3: SQL ...still be supported by Access. SQL Server would be a more viable tool for a fully developed application based on the number of potential users and
2013-01-01
commercial NoSQL database system. The results show that In-dexedHBase provides a data loading speed that is 6 times faster than Riak, and is...compare it with Riak, a widely adopted commercial NoSQL database system. The results show that In- dexedHBase provides a data loading speed that is 6...events. This chapter describes our research towards building an efficient and scalable storage platform for Truthy. Many existing NoSQL databases
Quality Attribute-Guided Evaluation of NoSQL Databases: A Case Study
2015-01-16
evaluations of NoSQL databases specifically, and big data systems in general, that have become apparent during our study. Keywords—NoSQL, distributed...technology, namely that of big data , software systems [1]. At the heart of big data systems are a collection of database technologies that are more...born organizations such as Google and Amazon [3][4], along with those of numerous other big data innovators, have created a variety of open source and
Validation of the United States Marine Corps Qualified Candidate Population Model
2003-03-01
time. Fields are created in the database to support this forecasting. User forms and a macro are programmed in Microsoft VBA to develop the...at 0.001. To accomplish 50,000 iterations of a minimization problem, this study wrote a macro in the VBA programming language that guides the solver...success in the commissioning process. **To improve the diagnostics of this propensity model, other factors were considered as well. Applying SQL
Global ISR: Toward a Comprehensive Defense Against Unauthorized Code Execution
2010-10-01
implementation using two of the most popular open- source servers: the Apache web server, and the MySQL database server. For Apache, we measure the effect that...utility ab. T o ta l T im e ( s e c ) 0 500 1000 1500 2000 2500 3000 Native Null ISR ISR−MP Fig. 3. The MySQL test-insert bench- mark measures...various SQL operations. The figure draws total execution time as reported by the benchmark utility. Finally, we benchmarked a MySQL database server using
Meta sequence analysis of human blood peptides and their parent proteins.
Bowden, Peter; Pendrak, Voitek; Zhu, Peihong; Marshall, John G
2010-04-18
Sequence analysis of the blood peptides and their qualities will be key to understanding the mechanisms that contribute to error in LC-ESI-MS/MS. Analysis of peptides and their proteins at the level of sequences is much more direct and informative than the comparison of disparate accession numbers. A portable database of all blood peptide and protein sequences with descriptor fields and gene ontology terms might be useful for designing immunological or MRM assays from human blood. The results of twelve studies of human blood peptides and/or proteins identified by LC-MS/MS and correlated against a disparate array of genetic libraries were parsed and matched to proteins from the human ENSEMBL, SwissProt and RefSeq databases by SQL. The reported peptide and protein sequences were organized into an SQL database with full protein sequences and up to five unique peptides in order of prevalence along with the peptide count for each protein. Structured query language or BLAST was used to acquire descriptive information in current databases. Sampling error at the level of peptides is the largest source of disparity between groups. Chi Square analysis of peptide to protein distributions confirmed the significant agreement between groups on identified proteins. Copyright 2010. Published by Elsevier B.V.
Reliability Information Analysis Center 1st Quarter 2007, Technical Area Task (TAT) Report
2007-02-05
34* Created new SQL server database for "PC Configuration" web application. Added roles for security closed 4235 and posted application to production. "e Wrote...and ran SQL Server scripts to migrate production databases to new server . "e Created backup jobs for new SQL Server databases. "* Continued...second phase of the TENA demo. Extensive tasking was established and assigned. A TENA interface to EW Server was reaffirmed after some uncertainty about
A Brief Assessment of LC2IEDM, MIST and Web Services for use in Naval Tactical Data Management
2004-07-01
server software, messaging between the client and server, and a database. The MIST database is implemented in an open source DBMS named PostGreSQL ... PostGreSQL had its beginnings at the University of California, Berkley, in 1986 [11]. The development of PostGreSQL has since evolved into a...contact history from the database. DRDC Atlantic TM 2004-148 9 Request Software Request Software Server Side Response from service
Evaluation of NoSQL databases for DIRAC monitoring and beyond
NASA Astrophysics Data System (ADS)
Mathe, Z.; Casajus Ramo, A.; Stagni, F.; Tomassetti, L.
2015-12-01
Nowadays, many database systems are available but they may not be optimized for storing time series data. Monitoring DIRAC jobs would be better done using a database optimised for storing time series data. So far it was done using a MySQL database, which is not well suited for such an application. Therefore alternatives have been investigated. Choosing an appropriate database for storing huge amounts of time series data is not trivial as one must take into account different aspects such as manageability, scalability and extensibility. We compared the performance of Elasticsearch, OpenTSDB (based on HBase) and InfluxDB NoSQL databases, using the same set of machines and the same data. We also evaluated the effort required for maintaining them. Using the LHCb Workload Management System (WMS), based on DIRAC as a use case we set up a new monitoring system, in parallel with the current MySQL system, and we stored the same data into the databases under test. We evaluated Grafana (for OpenTSDB) and Kibana (for ElasticSearch) metrics and graph editors for creating dashboards, in order to have a clear picture on the usability of each candidate. In this paper we present the results of this study and the performance of the selected technology. We also give an outlook of other potential applications of NoSQL databases within the DIRAC project.
Ferreira Junior, José Raniery; Oliveira, Marcelo Costa; de Azevedo-Marques, Paulo Mazzoncini
2016-12-01
Lung cancer is the leading cause of cancer-related deaths in the world, and its main manifestation is pulmonary nodules. Detection and classification of pulmonary nodules are challenging tasks that must be done by qualified specialists, but image interpretation errors make those tasks difficult. In order to aid radiologists on those hard tasks, it is important to integrate the computer-based tools with the lesion detection, pathology diagnosis, and image interpretation processes. However, computer-aided diagnosis research faces the problem of not having enough shared medical reference data for the development, testing, and evaluation of computational methods for diagnosis. In order to minimize this problem, this paper presents a public nonrelational document-oriented cloud-based database of pulmonary nodules characterized by 3D texture attributes, identified by experienced radiologists and classified in nine different subjective characteristics by the same specialists. Our goal with the development of this database is to improve computer-aided lung cancer diagnosis and pulmonary nodule detection and classification research through the deployment of this database in a cloud Database as a Service framework. Pulmonary nodule data was provided by the Lung Image Database Consortium and Image Database Resource Initiative (LIDC-IDRI), image descriptors were acquired by a volumetric texture analysis, and database schema was developed using a document-oriented Not only Structured Query Language (NoSQL) approach. The proposed database is now with 379 exams, 838 nodules, and 8237 images, 4029 of them are CT scans and 4208 manually segmented nodules, and it is allocated in a MongoDB instance on a cloud infrastructure.
2003-06-01
delivery Data Access (1980s) "What were unit sales in New England last March?" Relational databases (RDBMS), Structured Query Language ( SQL ...macros written in Visual Basic for Applications ( VBA ). 32 Iteration Two: Class Diagram Tech OASIS Export ScriptImport Filter Data ProcessingMethod 1...MS Excel * 1 VBA Macro*1 contains sends data to co nt ai ns executes * * 1 1 contains contains Figure 20. Iteration two class diagram The
2002-09-01
Basic for Applications ( VBA ) 6.0 as macros may not be supported in 8 future versions of Access. Access 2000 offers Internet- related features for...security features from Microsoft’s SQL Server. [1] 3. System Requirements Access 2000 is a resource-intensive application as are all Office 2000...1] • Modules – Functions and procedures written in the Visual Basic for Applications ( VBA ) programming language. The capabilities of modules
Software Application for Supporting the Education of Database Systems
ERIC Educational Resources Information Center
Vágner, Anikó
2015-01-01
The article introduces an application which supports the education of database systems, particularly the teaching of SQL and PL/SQL in Oracle Database Management System environment. The application has two parts, one is the database schema and its content, and the other is a C# application. The schema is to administrate and store the tasks and the…
Speed up of XML parsers with PHP language implementation
NASA Astrophysics Data System (ADS)
Georgiev, Bozhidar; Georgieva, Adriana
2012-11-01
In this paper, authors introduce PHP5's XML implementation and show how to read, parse, and write a short and uncomplicated XML file using Simple XML in a PHP environment. The possibilities for mutual work of PHP5 language and XML standard are described. The details of parsing process with Simple XML are also cleared. A practical project PHP-XML-MySQL presents the advantages of XML implementation in PHP modules. This approach allows comparatively simple search of XML hierarchical data by means of PHP software tools. The proposed project includes database, which can be extended with new data and new XML parsing functions.
Use of Graph Database for the Integration of Heterogeneous Biological Data.
Yoon, Byoung-Ha; Kim, Seon-Kyu; Kim, Seon-Young
2017-03-01
Understanding complex relationships among heterogeneous biological data is one of the fundamental goals in biology. In most cases, diverse biological data are stored in relational databases, such as MySQL and Oracle, which store data in multiple tables and then infer relationships by multiple-join statements. Recently, a new type of database, called the graph-based database, was developed to natively represent various kinds of complex relationships, and it is widely used among computer science communities and IT industries. Here, we demonstrate the feasibility of using a graph-based database for complex biological relationships by comparing the performance between MySQL and Neo4j, one of the most widely used graph databases. We collected various biological data (protein-protein interaction, drug-target, gene-disease, etc.) from several existing sources, removed duplicate and redundant data, and finally constructed a graph database containing 114,550 nodes and 82,674,321 relationships. When we tested the query execution performance of MySQL versus Neo4j, we found that Neo4j outperformed MySQL in all cases. While Neo4j exhibited a very fast response for various queries, MySQL exhibited latent or unfinished responses for complex queries with multiple-join statements. These results show that using graph-based databases, such as Neo4j, is an efficient way to store complex biological relationships. Moreover, querying a graph database in diverse ways has the potential to reveal novel relationships among heterogeneous biological data.
Use of Graph Database for the Integration of Heterogeneous Biological Data
Yoon, Byoung-Ha; Kim, Seon-Kyu
2017-01-01
Understanding complex relationships among heterogeneous biological data is one of the fundamental goals in biology. In most cases, diverse biological data are stored in relational databases, such as MySQL and Oracle, which store data in multiple tables and then infer relationships by multiple-join statements. Recently, a new type of database, called the graph-based database, was developed to natively represent various kinds of complex relationships, and it is widely used among computer science communities and IT industries. Here, we demonstrate the feasibility of using a graph-based database for complex biological relationships by comparing the performance between MySQL and Neo4j, one of the most widely used graph databases. We collected various biological data (protein-protein interaction, drug-target, gene-disease, etc.) from several existing sources, removed duplicate and redundant data, and finally constructed a graph database containing 114,550 nodes and 82,674,321 relationships. When we tested the query execution performance of MySQL versus Neo4j, we found that Neo4j outperformed MySQL in all cases. While Neo4j exhibited a very fast response for various queries, MySQL exhibited latent or unfinished responses for complex queries with multiple-join statements. These results show that using graph-based databases, such as Neo4j, is an efficient way to store complex biological relationships. Moreover, querying a graph database in diverse ways has the potential to reveal novel relationships among heterogeneous biological data. PMID:28416946
Development of a replicated database of DHCP data for evaluation of drug use.
Graber, S E; Seneker, J A; Stahl, A A; Franklin, K O; Neel, T E; Miller, R A
1996-01-01
This case report describes development and testing of a method to extract clinical information stored in the Veterans Affairs (VA) Decentralized Hospital Computer System (DHCP) for the purpose of analyzing data about groups of patients. The authors used a microcomputer-based, structured query language (SQL)-compatible, relational database system to replicate a subset of the Nashville VA Hospital's DHCP patient database. This replicated database contained the complete current Nashville DHCP prescription, provider, patient, and drug data sets, and a subset of the laboratory data. A pilot project employed this replicated database to answer questions that might arise in drug-use evaluation, such as identification of cases of polypharmacy, suboptimal drug regimens, and inadequate laboratory monitoring of drug therapy. These database queries included as candidates for review all prescriptions for all outpatients. The queries demonstrated that specific drug-use events could be identified for any time interval represented in the replicated database. PMID:8653451
Development of a replicated database of DHCP data for evaluation of drug use.
Graber, S E; Seneker, J A; Stahl, A A; Franklin, K O; Neel, T E; Miller, R A
1996-01-01
This case report describes development and testing of a method to extract clinical information stored in the Veterans Affairs (VA) Decentralized Hospital Computer System (DHCP) for the purpose of analyzing data about groups of patients. The authors used a microcomputer-based, structured query language (SQL)-compatible, relational database system to replicate a subset of the Nashville VA Hospital's DHCP patient database. This replicated database contained the complete current Nashville DHCP prescription, provider, patient, and drug data sets, and a subset of the laboratory data. A pilot project employed this replicated database to answer questions that might arise in drug-use evaluation, such as identification of cases of polypharmacy, suboptimal drug regimens, and inadequate laboratory monitoring of drug therapy. These database queries included as candidates for review all prescriptions for all outpatients. The queries demonstrated that specific drug-use events could be identified for any time interval represented in the replicated database.
ProteoLens: a visual analytic tool for multi-scale database-driven biological network data mining.
Huan, Tianxiao; Sivachenko, Andrey Y; Harrison, Scott H; Chen, Jake Y
2008-08-12
New systems biology studies require researchers to understand how interplay among myriads of biomolecular entities is orchestrated in order to achieve high-level cellular and physiological functions. Many software tools have been developed in the past decade to help researchers visually navigate large networks of biomolecular interactions with built-in template-based query capabilities. To further advance researchers' ability to interrogate global physiological states of cells through multi-scale visual network explorations, new visualization software tools still need to be developed to empower the analysis. A robust visual data analysis platform driven by database management systems to perform bi-directional data processing-to-visualizations with declarative querying capabilities is needed. We developed ProteoLens as a JAVA-based visual analytic software tool for creating, annotating and exploring multi-scale biological networks. It supports direct database connectivity to either Oracle or PostgreSQL database tables/views, on which SQL statements using both Data Definition Languages (DDL) and Data Manipulation languages (DML) may be specified. The robust query languages embedded directly within the visualization software help users to bring their network data into a visualization context for annotation and exploration. ProteoLens supports graph/network represented data in standard Graph Modeling Language (GML) formats, and this enables interoperation with a wide range of other visual layout tools. The architectural design of ProteoLens enables the de-coupling of complex network data visualization tasks into two distinct phases: 1) creating network data association rules, which are mapping rules between network node IDs or edge IDs and data attributes such as functional annotations, expression levels, scores, synonyms, descriptions etc; 2) applying network data association rules to build the network and perform the visual annotation of graph nodes and edges according to associated data values. We demonstrated the advantages of these new capabilities through three biological network visualization case studies: human disease association network, drug-target interaction network and protein-peptide mapping network. The architectural design of ProteoLens makes it suitable for bioinformatics expert data analysts who are experienced with relational database management to perform large-scale integrated network visual explorations. ProteoLens is a promising visual analytic platform that will facilitate knowledge discoveries in future network and systems biology studies.
DOE Office of Scientific and Technical Information (OSTI.GOV)
David Nix, Lisa Simirenko
2006-10-25
The Biolmaging Database (BID) is a relational database developed to store the data and meta-data for the 3D gene expression in early Drosophila embryo development on a cellular level. The schema was written to be used with the MySQL DBMS but with minor modifications can be used on any SQL compliant relational DBMS.
Chapter 51: How to Build a Simple Cone Search Service Using a Local Database
NASA Astrophysics Data System (ADS)
Kent, B. R.; Greene, G. R.
The cone search service protocol will be examined from the server side in this chapter. A simple cone search service will be setup and configured locally using MySQL. Data will be read into a table, and the Java JDBC will be used to connect to the database. Readers will understand the VO cone search specification and how to use it to query a database on their local systems and return an XML/VOTable file based on an input of RA/DEC coordinates and a search radius. The cone search in this example will be deployed as a Java servlet. The resulting cone search can be tested with a verification service. This basic setup can be used with other languages and relational databases.
The Ruby UCSC API: accessing the UCSC genome database using Ruby.
Mishima, Hiroyuki; Aerts, Jan; Katayama, Toshiaki; Bonnal, Raoul J P; Yoshiura, Koh-ichiro
2012-09-21
The University of California, Santa Cruz (UCSC) genome database is among the most used sources of genomic annotation in human and other organisms. The database offers an excellent web-based graphical user interface (the UCSC genome browser) and several means for programmatic queries. A simple application programming interface (API) in a scripting language aimed at the biologist was however not yet available. Here, we present the Ruby UCSC API, a library to access the UCSC genome database using Ruby. The API is designed as a BioRuby plug-in and built on the ActiveRecord 3 framework for the object-relational mapping, making writing SQL statements unnecessary. The current version of the API supports databases of all organisms in the UCSC genome database including human, mammals, vertebrates, deuterostomes, insects, nematodes, and yeast.The API uses the bin index-if available-when querying for genomic intervals. The API also supports genomic sequence queries using locally downloaded *.2bit files that are not stored in the official MySQL database. The API is implemented in pure Ruby and is therefore available in different environments and with different Ruby interpreters (including JRuby). Assisted by the straightforward object-oriented design of Ruby and ActiveRecord, the Ruby UCSC API will facilitate biologists to query the UCSC genome database programmatically. The API is available through the RubyGem system. Source code and documentation are available at https://github.com/misshie/bioruby-ucsc-api/ under the Ruby license. Feedback and help is provided via the website at http://rubyucscapi.userecho.com/.
The Ruby UCSC API: accessing the UCSC genome database using Ruby
2012-01-01
Background The University of California, Santa Cruz (UCSC) genome database is among the most used sources of genomic annotation in human and other organisms. The database offers an excellent web-based graphical user interface (the UCSC genome browser) and several means for programmatic queries. A simple application programming interface (API) in a scripting language aimed at the biologist was however not yet available. Here, we present the Ruby UCSC API, a library to access the UCSC genome database using Ruby. Results The API is designed as a BioRuby plug-in and built on the ActiveRecord 3 framework for the object-relational mapping, making writing SQL statements unnecessary. The current version of the API supports databases of all organisms in the UCSC genome database including human, mammals, vertebrates, deuterostomes, insects, nematodes, and yeast. The API uses the bin index—if available—when querying for genomic intervals. The API also supports genomic sequence queries using locally downloaded *.2bit files that are not stored in the official MySQL database. The API is implemented in pure Ruby and is therefore available in different environments and with different Ruby interpreters (including JRuby). Conclusions Assisted by the straightforward object-oriented design of Ruby and ActiveRecord, the Ruby UCSC API will facilitate biologists to query the UCSC genome database programmatically. The API is available through the RubyGem system. Source code and documentation are available at https://github.com/misshie/bioruby-ucsc-api/ under the Ruby license. Feedback and help is provided via the website at http://rubyucscapi.userecho.com/. PMID:22994508
Lu, Ying-Hao; Kuo, Chen-Chun; Huang, Yaw-Bin
2011-08-01
We selected HTML, PHP and JavaScript as the programming languages to build "WebBio", a web-based system for patient data of biological products and used MySQL as database. WebBio is based on the PHP-MySQL suite and is run by Apache server on Linux machine. WebBio provides the functions of data management, searching function and data analysis for 20 kinds of biological products (plasma expanders, human immunoglobulin and hematological products). There are two particular features in WebBio: (1) pharmacists can rapidly find out whose patients used contaminated products for medication safety, and (2) the statistics charts for a specific product can be automatically generated to reduce pharmacist's work loading. WebBio has successfully turned traditional paper work into web-based data management.
Architecture Knowledge for Evaluating Scalable Databases
2015-01-16
problems, arising from the proliferation of new data models and distributed technologies for building scalable, available data stores . Architects must...longer are relational databases the de facto standard for building data repositories. Highly distributed, scalable “ NoSQL ” databases [11] have emerged...This is especially challenging at the data storage layer. The multitude of competing NoSQL database technologies creates a complex and rapidly
Benchmarking database performance for genomic data.
Khushi, Matloob
2015-06-01
Genomic regions represent features such as gene annotations, transcription factor binding sites and epigenetic modifications. Performing various genomic operations such as identifying overlapping/non-overlapping regions or nearest gene annotations are common research needs. The data can be saved in a database system for easy management, however, there is no comprehensive database built-in algorithm at present to identify overlapping regions. Therefore I have developed a novel region-mapping (RegMap) SQL-based algorithm to perform genomic operations and have benchmarked the performance of different databases. Benchmarking identified that PostgreSQL extracts overlapping regions much faster than MySQL. Insertion and data uploads in PostgreSQL were also better, although general searching capability of both databases was almost equivalent. In addition, using the algorithm pair-wise, overlaps of >1000 datasets of transcription factor binding sites and histone marks, collected from previous publications, were reported and it was found that HNF4G significantly co-locates with cohesin subunit STAG1 (SA1).Inc. © 2015 Wiley Periodicals, Inc.
Integrating Scientific Array Processing into Standard SQL
NASA Astrophysics Data System (ADS)
Misev, Dimitar; Bachhuber, Johannes; Baumann, Peter
2014-05-01
We live in a time that is dominated by data. Data storage is cheap and more applications than ever accrue vast amounts of data. Storing the emerging multidimensional data sets efficiently, however, and allowing them to be queried by their inherent structure, is a challenge many databases have to face today. Despite the fact that multidimensional array data is almost always linked to additional, non-array information, array databases have mostly developed separately from relational systems, resulting in a disparity between the two database categories. The current SQL standard and SQL DBMS supports arrays - and in an extension also multidimensional arrays - but does so in a very rudimentary and inefficient way. This poster demonstrates the practicality of an SQL extension for array processing, implemented in a proof-of-concept multi-faceted system that manages a federation of array and relational database systems, providing transparent, efficient and scalable access to the heterogeneous data in them.
Information integration for a sky survey by data warehousing
NASA Astrophysics Data System (ADS)
Luo, A.; Zhang, Y.; Zhao, Y.
The virtualization service of data system for a sky survey LAMOST is very important for astronomers The service needs to integrate information from data collections catalogs and references and support simple federation of a set of distributed files and associated metadata Data warehousing has been in existence for several years and demonstrated superiority over traditional relational database management systems by providing novel indexing schemes that supported efficient on-line analytical processing OLAP of large databases Now relational database systems such as Oracle etc support the warehouse capability which including extensions to the SQL language to support OLAP operations and a number of metadata management tools have been created The information integration of LAMOST by applying data warehousing is to effectively provide data and knowledge on-line
DOMe: A deduplication optimization method for the NewSQL database backups
Wang, Longxiang; Zhu, Zhengdong; Zhang, Xingjun; Wang, Yinfeng
2017-01-01
Reducing duplicated data of database backups is an important application scenario for data deduplication technology. NewSQL is an emerging database system and is now being used more and more widely. NewSQL systems need to improve data reliability by periodically backing up in-memory data, resulting in a lot of duplicated data. The traditional deduplication method is not optimized for the NewSQL server system and cannot take full advantage of hardware resources to optimize deduplication performance. A recent research pointed out that the future NewSQL server will have thousands of CPU cores, large DRAM and huge NVRAM. Therefore, how to utilize these hardware resources to optimize the performance of data deduplication is an important issue. To solve this problem, we propose a deduplication optimization method (DOMe) for NewSQL system backup. To take advantage of the large number of CPU cores in the NewSQL server to optimize deduplication performance, DOMe parallelizes the deduplication method based on the fork-join framework. The fingerprint index, which is the key data structure in the deduplication process, is implemented as pure in-memory hash table, which makes full use of the large DRAM in NewSQL system, eliminating the performance bottleneck problem of fingerprint index existing in traditional deduplication method. The H-store is used as a typical NewSQL database system to implement DOMe method. DOMe is experimentally analyzed by two representative backup data. The experimental results show that: 1) DOMe can reduce the duplicated NewSQL backup data. 2) DOMe significantly improves deduplication performance by parallelizing CDC algorithms. In the case of the theoretical speedup ratio of the server is 20.8, the speedup ratio of DOMe can achieve up to 18; 3) DOMe improved the deduplication throughput by 1.5 times through the pure in-memory index optimization method. PMID:29049307
Comparison of the Frontier Distributed Database Caching System to NoSQL Databases
NASA Astrophysics Data System (ADS)
Dykstra, Dave
2012-12-01
One of the main attractions of non-relational “NoSQL” databases is their ability to scale to large numbers of readers, including readers spread over a wide area. The Frontier distributed database caching system, used in production by the Large Hadron Collider CMS and ATLAS detector projects for Conditions data, is based on traditional SQL databases but also adds high scalability and the ability to be distributed over a wide-area for an important subset of applications. This paper compares the major characteristics of the two different approaches and identifies the criteria for choosing which approach to prefer over the other. It also compares in some detail the NoSQL databases used by CMS and ATLAS: MongoDB, CouchDB, HBase, and Cassandra.
Experiences with DCE: the pro7 communication server based on OSF-DCE functionality.
Schulte, M; Lordieck, W
1997-01-01
The pro7-communication server is a new approach to manage communication between different applications on different hardware platforms in a hospital environment. The most important features are the use of OSF/DCE for realising remote procedure calls between different platforms, the use of an SQL-92 compatible relational database and the design of a new software development tool (called protocol definition language compiler) for describing the interface of a new application, which is to integrate in a hospital environment.
Security Behavior Observatory: Infrastructure for Long-term Monitoring of Client Machines
2014-07-14
desired data. In Wmdows, this is most often a .NET language (e.g., C#, PowerShell), a command-line batch script, or Java . 3) Least privilege: To ensure...modules are written in Java , and thus should be easily-portable to any OS. B. Deployment There are several high-level requirements the SBO must meet...practically feasible with such solutions. Instead, one researcher with access to all the clients’ keys (stored in an isolated and secured MySQL database
Abstracting data warehousing issues in scientific research.
Tews, Cody; Bracio, Boris R
2002-01-01
This paper presents the design and implementation of the Idaho Biomedical Data Management System (IBDMS). This system preprocesses biomedical data from the IMPROVE (Improving Control of Patient Status in Critical Care) library via an Open Database Connectivity (ODBC) connection. The ODBC connection allows for local and remote simulations to access filtered, joined, and sorted data using the Structured Query Language (SQL). The tool is capable of providing an overview of available data in addition to user defined data subset for verification of models of the human respiratory system.
Knowledge Query Language (KQL)
2016-02-12
Lexington Massachusetts This page intentionally left blank. iii EXECUTIVE SUMMARY Currently, queries for data ...retrieval from non-Structured Query Language (NoSQL) data stores are tightly coupled to the specific implementation of the data store implementation...independent of the storage content and format for querying NoSQL or relational data stores. This approach uses address expressions (or A-Expressions
A database for coconut crop improvement.
Rajagopal, Velamoor; Manimekalai, Ramaswamy; Devakumar, Krishnamurthy; Rajesh; Karun, Anitha; Niral, Vittal; Gopal, Murali; Aziz, Shamina; Gunasekaran, Marimuthu; Kumar, Mundappurathe Ramesh; Chandrasekar, Arumugam
2005-12-08
Coconut crop improvement requires a number of biotechnology and bioinformatics tools. A database containing information on CG (coconut germplasm), CCI (coconut cultivar identification), CD (coconut disease), MIFSPC (microbial information systems in plantation crops) and VO (vegetable oils) is described. The database was developed using MySQL and PostgreSQL running in Linux operating system. The database interface is developed in PHP, HTML and JAVA. http://www.bioinfcpcri.org.
Reactive Aggregate Model Protecting Against Real-Time Threats
2014-09-01
on the underlying functionality of three core components. • MS SQL server 2008 backend database. • Microsoft IIS running on Windows server 2008...services. The capstone tested a Linux-based Apache web server with the following software implementations: • MySQL as a Linux-based backend server for...malicious compromise. 1. Assumptions • GINA could connect to a backend MS SQL database through proper configuration of DotNetNuke. • GINA had access
Cloud-Based Distributed Control of Unmanned Systems
2015-04-01
during mission execution. At best, the data is saved onto hard-drives and is accessible only by the local team. Data history in a form available and...following open source technologies: GeoServer, OpenLayers, PostgreSQL , and PostGIS are chosen to implement the back-end database and server. A brief...geospatial map data. 3. PostgreSQL : An SQL-compliant object-relational database that easily scales to accommodate large amounts of data - upwards to
Mynodbcsv: lightweight zero-config database solution for handling very large CSV files.
Adaszewski, Stanisław
2014-01-01
Volumes of data used in science and industry are growing rapidly. When researchers face the challenge of analyzing them, their format is often the first obstacle. Lack of standardized ways of exploring different data layouts requires an effort each time to solve the problem from scratch. Possibility to access data in a rich, uniform manner, e.g. using Structured Query Language (SQL) would offer expressiveness and user-friendliness. Comma-separated values (CSV) are one of the most common data storage formats. Despite its simplicity, with growing file size handling it becomes non-trivial. Importing CSVs into existing databases is time-consuming and troublesome, or even impossible if its horizontal dimension reaches thousands of columns. Most databases are optimized for handling large number of rows rather than columns, therefore, performance for datasets with non-typical layouts is often unacceptable. Other challenges include schema creation, updates and repeated data imports. To address the above-mentioned problems, I present a system for accessing very large CSV-based datasets by means of SQL. It's characterized by: "no copy" approach--data stay mostly in the CSV files; "zero configuration"--no need to specify database schema; written in C++, with boost [1], SQLite [2] and Qt [3], doesn't require installation and has very small size; query rewriting, dynamic creation of indices for appropriate columns and static data retrieval directly from CSV files ensure efficient plan execution; effortless support for millions of columns; due to per-value typing, using mixed text/numbers data is easy; very simple network protocol provides efficient interface for MATLAB and reduces implementation time for other languages. The software is available as freeware along with educational videos on its website [4]. It doesn't need any prerequisites to run, as all of the libraries are included in the distribution package. I test it against existing database solutions using a battery of benchmarks and discuss the results.
Mynodbcsv: Lightweight Zero-Config Database Solution for Handling Very Large CSV Files
Adaszewski, Stanisław
2014-01-01
Volumes of data used in science and industry are growing rapidly. When researchers face the challenge of analyzing them, their format is often the first obstacle. Lack of standardized ways of exploring different data layouts requires an effort each time to solve the problem from scratch. Possibility to access data in a rich, uniform manner, e.g. using Structured Query Language (SQL) would offer expressiveness and user-friendliness. Comma-separated values (CSV) are one of the most common data storage formats. Despite its simplicity, with growing file size handling it becomes non-trivial. Importing CSVs into existing databases is time-consuming and troublesome, or even impossible if its horizontal dimension reaches thousands of columns. Most databases are optimized for handling large number of rows rather than columns, therefore, performance for datasets with non-typical layouts is often unacceptable. Other challenges include schema creation, updates and repeated data imports. To address the above-mentioned problems, I present a system for accessing very large CSV-based datasets by means of SQL. It's characterized by: “no copy” approach – data stay mostly in the CSV files; “zero configuration” – no need to specify database schema; written in C++, with boost [1], SQLite [2] and Qt [3], doesn't require installation and has very small size; query rewriting, dynamic creation of indices for appropriate columns and static data retrieval directly from CSV files ensure efficient plan execution; effortless support for millions of columns; due to per-value typing, using mixed text/numbers data is easy; very simple network protocol provides efficient interface for MATLAB and reduces implementation time for other languages. The software is available as freeware along with educational videos on its website [4]. It doesn't need any prerequisites to run, as all of the libraries are included in the distribution package. I test it against existing database solutions using a battery of benchmarks and discuss the results. PMID:25068261
An effective model for store and retrieve big health data in cloud computing.
Goli-Malekabadi, Zohreh; Sargolzaei-Javan, Morteza; Akbari, Mohammad Kazem
2016-08-01
The volume of healthcare data including different and variable text types, sounds, and images is increasing day to day. Therefore, the storage and processing of these data is a necessary and challenging issue. Generally, relational databases are used for storing health data which are not able to handle the massive and diverse nature of them. This study aimed at presenting the model based on NoSQL databases for the storage of healthcare data. Despite different types of NoSQL databases, document-based DBs were selected by a survey on the nature of health data. The presented model was implemented in the Cloud environment for accessing to the distribution properties. Then, the data were distributed on the database by applying the Shard property. The efficiency of the model was evaluated in comparison with the previous data model, Relational Database, considering query time, data preparation, flexibility, and extensibility parameters. The results showed that the presented model approximately performed the same as SQL Server for "read" query while it acted more efficiently than SQL Server for "write" query. Also, the performance of the presented model was better than SQL Server in the case of flexibility, data preparation and extensibility. Based on these observations, the proposed model was more effective than Relational Databases for handling health data. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
NoSQL data model for semi-automatic integration of ethnomedicinal plant data from multiple sources.
Ningthoujam, Sanjoy Singh; Choudhury, Manabendra Dutta; Potsangbam, Kumar Singh; Chetia, Pankaj; Nahar, Lutfun; Sarker, Satyajit D; Basar, Norazah; Das Talukdar, Anupam
2014-01-01
Sharing traditional knowledge with the scientific community could refine scientific approaches to phytochemical investigation and conservation of ethnomedicinal plants. As such, integration of traditional knowledge with scientific data using a single platform for sharing is greatly needed. However, ethnomedicinal data are available in heterogeneous formats, which depend on cultural aspects, survey methodology and focus of the study. Phytochemical and bioassay data are also available from many open sources in various standards and customised formats. To design a flexible data model that could integrate both primary and curated ethnomedicinal plant data from multiple sources. The current model is based on MongoDB, one of the Not only Structured Query Language (NoSQL) databases. Although it does not contain schema, modifications were made so that the model could incorporate both standard and customised ethnomedicinal plant data format from different sources. The model presented can integrate both primary and secondary data related to ethnomedicinal plants. Accommodation of disparate data was accomplished by a feature of this database that supported a different set of fields for each document. It also allowed storage of similar data having different properties. The model presented is scalable to a highly complex level with continuing maturation of the database, and is applicable for storing, retrieving and sharing ethnomedicinal plant data. It can also serve as a flexible alternative to a relational and normalised database. Copyright © 2014 John Wiley & Sons, Ltd.
Distributed Episodic Exploratory Planning (DEEP)
2008-12-01
API). For DEEP, Hibernate offered the following advantages: • Abstracts SQL by utilizing HQL so any database with a Java Database Connectivity... Hibernate SQL ICCRTS International Command and Control Research and Technology Symposium JDB Java Distributed Blackboard JDBC Java Database Connectivity...selected because of its opportunistic reasoning capabilities and implemented in Java for platform independence. Java was chosen for ease of
ExplorEnz: a MySQL database of the IUBMB enzyme nomenclature.
McDonald, Andrew G; Boyce, Sinéad; Moss, Gerard P; Dixon, Henry B F; Tipton, Keith F
2007-07-27
We describe the database ExplorEnz, which is the primary repository for EC numbers and enzyme data that are being curated on behalf of the IUBMB. The enzyme nomenclature is incorporated into many other resources, including the ExPASy-ENZYME, BRENDA and KEGG bioinformatics databases. The data, which are stored in a MySQL database, preserve the formatting of chemical and enzyme names. A simple, easy to use, web-based query interface is provided, along with an advanced search engine for more complex queries. The database is publicly available at http://www.enzyme-database.org. The data are available for download as SQL and XML files via FTP. ExplorEnz has powerful and flexible search capabilities and provides the scientific community with the most up-to-date version of the IUBMB Enzyme List.
Evaluating the Cassandra NoSQL Database Approach for Genomic Data Persistency.
Aniceto, Rodrigo; Xavier, Rene; Guimarães, Valeria; Hondo, Fernanda; Holanda, Maristela; Walter, Maria Emilia; Lifschitz, Sérgio
2015-01-01
Rapid advances in high-throughput sequencing techniques have created interesting computational challenges in bioinformatics. One of them refers to management of massive amounts of data generated by automatic sequencers. We need to deal with the persistency of genomic data, particularly storing and analyzing these large-scale processed data. To find an alternative to the frequently considered relational database model becomes a compelling task. Other data models may be more effective when dealing with a very large amount of nonconventional data, especially for writing and retrieving operations. In this paper, we discuss the Cassandra NoSQL database approach for storing genomic data. We perform an analysis of persistency and I/O operations with real data, using the Cassandra database system. We also compare the results obtained with a classical relational database system and another NoSQL database approach, MongoDB.
A database for coconut crop improvement
Rajagopal, Velamoor; Manimekalai, Ramaswamy; Devakumar, Krishnamurthy; Rajesh; Karun, Anitha; Niral, Vittal; Gopal, Murali; Aziz, Shamina; Gunasekaran, Marimuthu; Kumar, Mundappurathe Ramesh; Chandrasekar, Arumugam
2005-01-01
Coconut crop improvement requires a number of biotechnology and bioinformatics tools. A database containing information on CG (coconut germplasm), CCI (coconut cultivar identification), CD (coconut disease), MIFSPC (microbial information systems in plantation crops) and VO (vegetable oils) is described. The database was developed using MySQL and PostgreSQL running in Linux operating system. The database interface is developed in PHP, HTML and JAVA. Availability http://www.bioinfcpcri.org PMID:17597858
New Software for Ensemble Creation in the Spitzer-Space-Telescope Operations Database
NASA Technical Reports Server (NTRS)
Laher, Russ; Rector, John
2004-01-01
Some of the computer pipelines used to process digital astronomical images from NASA's Spitzer Space Telescope require multiple input images, in order to generate high-level science and calibration products. The images are grouped into ensembles according to well documented ensemble-creation rules by making explicit associations in the operations Informix database at the Spitzer Science Center (SSC). The advantage of this approach is that a simple database query can retrieve the required ensemble of pipeline input images. New and improved software for ensemble creation has been developed. The new software is much faster than the existing software because it uses pre-compiled database stored-procedures written in Informix SPL (SQL programming language). The new software is also more flexible because the ensemble creation rules are now stored in and read from newly defined database tables. This table-driven approach was implemented so that ensemble rules can be inserted, updated, or deleted without modifying software.
ERIC Educational Resources Information Center
Piyayodilokchai, Hongsiri; Panjaburee, Patcharin; Laosinchai, Parames; Ketpichainarong, Watcharee; Ruenwongsa, Pintip
2013-01-01
With the benefit of multimedia and the learning cycle approach in promoting effective active learning, this paper proposed a learning cycle approach-based, multimedia-supplemented instructional unit for Structured Query Language (SQL) for second-year undergraduate students with the aim of enhancing their basic knowledge of SQL and ability to apply…
Knowledge Query Language (KQL)
2016-02-01
unlimited. This page intentionally left blank. iii EXECUTIVE SUMMARY Currently, queries for data ...retrieval from non-Structured Query Language (NoSQL) data stores are tightly coupled to the specific implementation of the data store implementation, making...of the storage content and format for querying NoSQL or relational data stores. This approach uses address expressions (or A-Expressions) embedded in
Temsch, W; Luger, A; Riedl, M
2008-01-01
This article presents a mathematical model to calculate HbA1c values based on self-measured blood glucose and past HbA1c levels, thereby enabling patients to monitor diabetes therapy between scheduled checkups. This method could help physicians to make treatment decisions if implemented in a system where glucose data are transferred to a remote server. The method, however, cannot replace HbA1c measurements; past HbA1c values are needed to gauge the method. The mathematical model of HbA1c formation was developed based on biochemical principles. Unlike an existing HbA1c formula, the new model respects the decreasing contribution of older glucose levels to current HbA1c values. About 12 standard SQL statements embedded in a php program were used to perform Fourier transform. Regression analysis was used to gauge results with previous HbA1c values. The method can be readily implemented in any SQL database. The predicted HbA1c values thus obtained were in accordance with measured values. They also matched the results of the HbA1c formula in the elevated range. By contrast, the formula was too "optimistic" in the range of better glycemic control. Individual analysis of two subjects improved the accuracy of values and reflected the bias introduced by different glucometers and individual measurement habits.
Development of a platform-independent receiver control system for SISIFOS
NASA Astrophysics Data System (ADS)
Lemke, Roland; Olberg, Michael
1998-05-01
Up to now receiver control software was a time consuming development usually written by receiver engineers who had mainly the hardware in mind. We are presenting a low-cost and very flexible system which uses a minimal interface to the real hardware, and which makes it easy to adapt to new receivers. Our system uses Tcl/Tk as a graphical user interface (GUI), SpecTcl as a GUI builder, Pgplot as plotting software, a simple query language (SQL) database for information storage and retrieval, Ethernet socket to socket communication and SCPI as a command control language. The complete system is in principal platform independent but for cost saving reasons we are using it actually on a PC486 running Linux 2.0.30, which is a copylefted Unix. The only hardware dependent part are the digital input/output boards, analog to digital and digital to analog convertors. In the case of the Linux PC we are using a device driver development kit to integrate the boards fully into the kernel of the operating system, which indeed makes them look like an ordinary device. The advantage of this system is firstly the low price and secondly the clear separation between the different software components which are available for many operating systems. If it is not possible, due to CPU performance limitations, to run all the software in a single machine,the SQL-database or the graphical user interface could be installed on separate computers.
An advanced web query interface for biological databases
Latendresse, Mario; Karp, Peter D.
2010-01-01
Although most web-based biological databases (DBs) offer some type of web-based form to allow users to author DB queries, these query forms are quite restricted in the complexity of DB queries that they can formulate. They can typically query only one DB, and can query only a single type of object at a time (e.g. genes) with no possible interaction between the objects—that is, in SQL parlance, no joins are allowed between DB objects. Writing precise queries against biological DBs is usually left to a programmer skillful enough in complex DB query languages like SQL. We present a web interface for building precise queries for biological DBs that can construct much more precise queries than most web-based query forms, yet that is user friendly enough to be used by biologists. It supports queries containing multiple conditions, and connecting multiple object types without using the join concept, which is unintuitive to biologists. This interactive web interface is called the Structured Advanced Query Page (SAQP). Users interactively build up a wide range of query constructs. Interactive documentation within the SAQP describes the schema of the queried DBs. The SAQP is based on BioVelo, a query language based on list comprehension. The SAQP is part of the Pathway Tools software and is available as part of several bioinformatics web sites powered by Pathway Tools, including the BioCyc.org site that contains more than 500 Pathway/Genome DBs. PMID:20624715
2009-01-01
Oracle 9i, 10g MySQL MS SQL Server MS SQL Server Operating System Supported Windows 2003 Server Windows 2000 Server (32 bit...WebStar (Mac OS X) SunOne Internet Information Services (IIS) Database Server Supported MS SQL Server MS SQL Server Oracle 9i, 10g...challenges of Web-based surveys are: 1) identifying the best Commercial Off the Shelf (COTS) Web-based survey packages to serve the particular
2001-09-01
replication) -- all from Visual Basic and VBA . In fact, we found that the SQL Server engine actually had a plethora of options, most formidable of...2002, the new SQL Server 2000 database engine, and Microsoft Visual Basic.NET. This thesis describes our use of the Spiral Development Model to...versions of Microsoft products? Specifically, the pending release of Microsoft Office 2002, the new SQL Server 2000 database engine, and Microsoft
Spectrum Savings from High Performance Recording and Playback Onboard the Test Article
2013-02-20
execute within a Windows 7 environment, and data is recorded on SSDs. The underlying database is implemented using MySQL . Figure 1 illustrates the... MySQL database. This is effectively the time at which the recorded data are available for retransmission. CPU and Memory utilization were collected...17.7% MySQL avg. 3.9% EQDR Total avg. 21.6% Table 1 CPU Utilization with260 Mbits/sec Load The difference between the total System CPU (27.8
Evaluating the Cassandra NoSQL Database Approach for Genomic Data Persistency
Aniceto, Rodrigo; Xavier, Rene; Guimarães, Valeria; Hondo, Fernanda; Holanda, Maristela; Walter, Maria Emilia; Lifschitz, Sérgio
2015-01-01
Rapid advances in high-throughput sequencing techniques have created interesting computational challenges in bioinformatics. One of them refers to management of massive amounts of data generated by automatic sequencers. We need to deal with the persistency of genomic data, particularly storing and analyzing these large-scale processed data. To find an alternative to the frequently considered relational database model becomes a compelling task. Other data models may be more effective when dealing with a very large amount of nonconventional data, especially for writing and retrieving operations. In this paper, we discuss the Cassandra NoSQL database approach for storing genomic data. We perform an analysis of persistency and I/O operations with real data, using the Cassandra database system. We also compare the results obtained with a classical relational database system and another NoSQL database approach, MongoDB. PMID:26558254
2014-09-01
NoSQL Data Store Technologies John Klein, Software Engineering Institute Patrick Donohoe, Software Engineering Institute Neil Ernst...REPORT TYPE N/A 3. DATES COVERED 4. TITLE AND SUBTITLE NoSQL Data Store Technologies 5a. CONTRACT NUMBER 5b. GRANT NUMBER 5c. PROGRAM ELEMENT...distribute data 4. Data Replication – determines how a NoSQL database facilitates reliable, high performance data replication to build
ExplorEnz: a MySQL database of the IUBMB enzyme nomenclature
McDonald, Andrew G; Boyce, Sinéad; Moss, Gerard P; Dixon, Henry BF; Tipton, Keith F
2007-01-01
Background We describe the database ExplorEnz, which is the primary repository for EC numbers and enzyme data that are being curated on behalf of the IUBMB. The enzyme nomenclature is incorporated into many other resources, including the ExPASy-ENZYME, BRENDA and KEGG bioinformatics databases. Description The data, which are stored in a MySQL database, preserve the formatting of chemical and enzyme names. A simple, easy to use, web-based query interface is provided, along with an advanced search engine for more complex queries. The database is publicly available at . The data are available for download as SQL and XML files via FTP. Conclusion ExplorEnz has powerful and flexible search capabilities and provides the scientific community with the most up-to-date version of the IUBMB Enzyme List. PMID:17662133
Novel Method of Storing and Reconstructing Events at Fermilab E-906/SeaQuest Using a MySQL Database
NASA Astrophysics Data System (ADS)
Hague, Tyler
2010-11-01
Fermilab E-906/SeaQuest is a fixed target experiment at Fermi National Accelerator Laboratory. We are investigating the antiquark asymmetry in the nucleon sea. By examining the ratio of the Drell- Yan cross sections of proton-proton and proton-deuterium collisions we can determine the asymmetry ratio. An essential feature in the development of the analysis software is to update the event reconstruction to modern software tools. We are doing this in a unique way by doing a majority of the calculations within an SQL database. Using a MySQL database allows us to take advantage of off-the-shelf software without sacrificing ROOT compatibility and avoid network bottlenecks with server-side data selection. Using our raw data we create stubs, or partial tracks, at each station which are pieced together to create full tracks. Our reconstruction process uses dynamically created SQL statements to analyze the data. These SQL statements create tables that contain the final reconstructed tracks as well as intermediate values. This poster will explain the reconstruction process and how it is being implemented.
The SQL Server Database for Non Computer Professional Teaching Reform
ERIC Educational Resources Information Center
Liu, Xiangwei
2012-01-01
A summary of the teaching methods of the non-computer professional SQL Server database, analyzes the current situation of the teaching course. According to non computer professional curriculum teaching characteristic, put forward some teaching reform methods, and put it into practice, improve the students' analysis ability, practice ability and…
Comparative study on the customization of natural language interfaces to databases.
Pazos R, Rodolfo A; Aguirre L, Marco A; González B, Juan J; Martínez F, José A; Pérez O, Joaquín; Verástegui O, Andrés A
2016-01-01
In the last decades the popularity of natural language interfaces to databases (NLIDBs) has increased, because in many cases information obtained from them is used for making important business decisions. Unfortunately, the complexity of their customization by database administrators make them difficult to use. In order for a NLIDB to obtain a high percentage of correctly translated queries, it is necessary that it is correctly customized for the database to be queried. In most cases the performance reported in NLIDB literature is the highest possible; i.e., the performance obtained when the interfaces were customized by the implementers. However, for end users it is more important the performance that the interface can yield when the NLIDB is customized by someone different from the implementers. Unfortunately, there exist very few articles that report NLIDB performance when the NLIDBs are not customized by the implementers. This article presents a semantically-enriched data dictionary (which permits solving many of the problems that occur when translating from natural language to SQL) and an experiment in which two groups of undergraduate students customized our NLIDB and English language frontend (ELF), considered one of the best available commercial NLIDBs. The experimental results show that, when customized by the first group, our NLIDB obtained a 44.69 % of correctly answered queries and ELF 11.83 % for the ATIS database, and when customized by the second group, our NLIDB attained 77.05 % and ELF 13.48 %. The performance attained by our NLIDB, when customized by ourselves was 90 %.
NoSQL: collection document and cloud by using a dynamic web query form
NASA Astrophysics Data System (ADS)
Abdalla, Hemn B.; Lin, Jinzhao; Li, Guoquan
2015-07-01
Mongo-DB (from "humongous") is an open-source document database and the leading NoSQL database. A NoSQL (Not Only SQL, next generation databases, being non-relational, deal, open-source and horizontally scalable) presenting a mechanism for storage and retrieval of documents. Previously, we stored and retrieved the data using the SQL queries. Here, we use the MonogoDB that means we are not utilizing the MySQL and SQL queries. Directly importing the documents into our Drives, retrieving the documents on that drive by not applying the SQL queries, using the IO BufferReader and Writer, BufferReader for importing our type of document files to my folder (Drive). For retrieving the document files, the usage is BufferWriter from the particular folder (or) Drive. In this sense, providing the security for those storing files for what purpose means if we store the documents in our local folder means all or views that file and modified that file. So preventing that file, we are furnishing the security. The original document files will be changed to another format like in this paper; Binary format is used. Our documents will be converting to the binary format after that direct storing in one of our folder, that time the storage space will provide the private key for accessing that file. Wherever any user tries to discover the Document files means that file data are in the binary format, the document's file owner simply views that original format using that personal key from receive the secret key from the cloud.
NPTool: Towards Scalability and Reliability of Business Process Management
NASA Astrophysics Data System (ADS)
Braghetto, Kelly Rosa; Ferreira, João Eduardo; Pu, Calton
Currently one important challenge in business process management is provide at the same time scalability and reliability of business process executions. This difficulty becomes more accentuated when the execution control assumes complex countless business processes. This work presents NavigationPlanTool (NPTool), a tool to control the execution of business processes. NPTool is supported by Navigation Plan Definition Language (NPDL), a language for business processes specification that uses process algebra as formal foundation. NPTool implements the NPDL language as a SQL extension. The main contribution of this paper is a description of the NPTool showing how the process algebra features combined with a relational database model can be used to provide a scalable and reliable control in the execution of business processes. The next steps of NPTool include reuse of control-flow patterns and support to data flow management.
NASA Astrophysics Data System (ADS)
Hu, Haibin
2017-05-01
Among numerous WEB security issues, SQL injection is the most notable and dangerous. In this study, characteristics and procedures of SQL injection are analyzed, and the method for detecting the SQL injection attack is illustrated. The defense resistance and remedy model of SQL injection attack is established from the perspective of non-intrusive SQL injection attack and defense. Moreover, the ability of resisting the SQL injection attack of the server has been comprehensively improved through the security strategies on operation system, IIS and database, etc.. Corresponding codes are realized. The method is well applied in the actual projects.
Applications of GIS and database technologies to manage a Karst Feature Database
Gao, Y.; Tipping, R.G.; Alexander, E.C.
2006-01-01
This paper describes the management of a Karst Feature Database (KFD) in Minnesota. Two sets of applications in both GIS and Database Management System (DBMS) have been developed for the KFD of Minnesota. These applications were used to manage and to enhance the usability of the KFD. Structured Query Language (SQL) was used to manipulate transactions of the database and to facilitate the functionality of the user interfaces. The Database Administrator (DBA) authorized users with different access permissions to enhance the security of the database. Database consistency and recovery are accomplished by creating data logs and maintaining backups on a regular basis. The working database provides guidelines and management tools for future studies of karst features in Minnesota. The methodology of designing this DBMS is applicable to develop GIS-based databases to analyze and manage geomorphic and hydrologic datasets at both regional and local scales. The short-term goal of this research is to develop a regional KFD for the Upper Mississippi Valley Karst and the long-term goal is to expand this database to manage and study karst features at national and global scales.
A Magnetic Petrology Database for Satellite Magnetic Anomaly Interpretations
NASA Astrophysics Data System (ADS)
Nazarova, K.; Wasilewski, P.; Didenko, A.; Genshaft, Y.; Pashkevich, I.
2002-05-01
A Magnetic Petrology Database (MPDB) is now being compiled at NASA/Goddard Space Flight Center in cooperation with Russian and Ukrainian Institutions. The purpose of this database is to provide the geomagnetic community with a comprehensive and user-friendly method of accessing magnetic petrology data via Internet for more realistic interpretation of satellite magnetic anomalies. Magnetic Petrology Data had been accumulated in NASA/Goddard Space Flight Center, United Institute of Physics of the Earth (Russia) and Institute of Geophysics (Ukraine) over several decades and now consists of many thousands of records of data in our archives. The MPDB was, and continues to be in big demand especially since recent launching in near Earth orbit of the mini-constellation of three satellites - Oersted (in 1999), Champ (in 2000), and SAC-C (in 2000) which will provide lithospheric magnetic maps with better spatial and amplitude resolution (about 1 nT). The MPDB is focused on lower crustal and upper mantle rocks and will include data on mantle xenoliths, serpentinized ultramafic rocks, granulites, iron quartzites and rocks from Archean-Proterozoic metamorphic sequences from all around the world. A substantial amount of data is coming from the area of unique Kursk Magnetic Anomaly and Kola Deep Borehole (which recovered 12 km of continental crust). A prototype MPDB can be found on the Geodynamics Branch web server of Goddard Space Flight Center at http://core2.gsfc.nasa.gov/terr_mag/magnpetr.html. The MPDB employs a searchable relational design and consists of 7 interrelated tables. The schema of database is shown at http://core2.gsfc.nasa.gov/terr_mag/doc.html. MySQL database server was utilized to implement MPDB. The SQL (Structured Query Language) is used to query the database. To present the results of queries on WEB and for WEB programming we utilized PHP scripting language and CGI scripts. The prototype MPDB is designed to search database by major satellite magnetic anomaly, tectonic structure, geographical location, rock type, magnetic properties, chemistry and reference, see http://core2.gsfc.nasa.gov/terr_mag/query1.html. The output of database is HTML structured table, text file, and downloadable file. This database will be very useful for studies of lithospheric satellite magnetic anomalies on the Earth and other terrestrial planets.
SORTEZ: a relational translator for NCBI's ASN.1 database.
Hart, K W; Searls, D B; Overton, G C
1994-07-01
The National Center for Biotechnology Information (NCBI) has created a database collection that includes several protein and nucleic acid sequence databases, a biosequence-specific subset of MEDLINE, as well as value-added information such as links between similar sequences. Information in the NCBI database is modeled in Abstract Syntax Notation 1 (ASN.1) an Open Systems Interconnection protocol designed for the purpose of exchanging structured data between software applications rather than as a data model for database systems. While the NCBI database is distributed with an easy-to-use information retrieval system, ENTREZ, the ASN.1 data model currently lacks an ad hoc query language for general-purpose data access. For that reason, we have developed a software package, SORTEZ, that transforms the ASN.1 database (or other databases with nested data structures) to a relational data model and subsequently to a relational database management system (Sybase) where information can be accessed through the relational query language, SQL. Because the need to transform data from one data model and schema to another arises naturally in several important contexts, including efficient execution of specific applications, access to multiple databases and adaptation to database evolution this work also serves as a practical study of the issues involved in the various stages of database transformation. We show that transformation from the ASN.1 data model to a relational data model can be largely automated, but that schema transformation and data conversion require considerable domain expertise and would greatly benefit from additional support tools.
Freire, Sergio Miranda; Teodoro, Douglas; Wei-Kleiner, Fang; Sundvall, Erik; Karlsson, Daniel; Lambrix, Patrick
2016-01-01
This study provides an experimental performance evaluation on population-based queries of NoSQL databases storing archetype-based Electronic Health Record (EHR) data. There are few published studies regarding the performance of persistence mechanisms for systems that use multilevel modelling approaches, especially when the focus is on population-based queries. A healthcare dataset with 4.2 million records stored in a relational database (MySQL) was used to generate XML and JSON documents based on the openEHR reference model. Six datasets with different sizes were created from these documents and imported into three single machine XML databases (BaseX, eXistdb and Berkeley DB XML) and into a distributed NoSQL database system based on the MapReduce approach, Couchbase, deployed in different cluster configurations of 1, 2, 4, 8 and 12 machines. Population-based queries were submitted to those databases and to the original relational database. Database size and query response times are presented. The XML databases were considerably slower and required much more space than Couchbase. Overall, Couchbase had better response times than MySQL, especially for larger datasets. However, Couchbase requires indexing for each differently formulated query and the indexing time increases with the size of the datasets. The performances of the clusters with 2, 4, 8 and 12 nodes were not better than the single node cluster in relation to the query response time, but the indexing time was reduced proportionally to the number of nodes. The tested XML databases had acceptable performance for openEHR-based data in some querying use cases and small datasets, but were generally much slower than Couchbase. Couchbase also outperformed the response times of the relational database, but required more disk space and had a much longer indexing time. Systems like Couchbase are thus interesting research targets for scalable storage and querying of archetype-based EHR data when population-based use cases are of interest. PMID:26958859
Freire, Sergio Miranda; Teodoro, Douglas; Wei-Kleiner, Fang; Sundvall, Erik; Karlsson, Daniel; Lambrix, Patrick
2016-01-01
This study provides an experimental performance evaluation on population-based queries of NoSQL databases storing archetype-based Electronic Health Record (EHR) data. There are few published studies regarding the performance of persistence mechanisms for systems that use multilevel modelling approaches, especially when the focus is on population-based queries. A healthcare dataset with 4.2 million records stored in a relational database (MySQL) was used to generate XML and JSON documents based on the openEHR reference model. Six datasets with different sizes were created from these documents and imported into three single machine XML databases (BaseX, eXistdb and Berkeley DB XML) and into a distributed NoSQL database system based on the MapReduce approach, Couchbase, deployed in different cluster configurations of 1, 2, 4, 8 and 12 machines. Population-based queries were submitted to those databases and to the original relational database. Database size and query response times are presented. The XML databases were considerably slower and required much more space than Couchbase. Overall, Couchbase had better response times than MySQL, especially for larger datasets. However, Couchbase requires indexing for each differently formulated query and the indexing time increases with the size of the datasets. The performances of the clusters with 2, 4, 8 and 12 nodes were not better than the single node cluster in relation to the query response time, but the indexing time was reduced proportionally to the number of nodes. The tested XML databases had acceptable performance for openEHR-based data in some querying use cases and small datasets, but were generally much slower than Couchbase. Couchbase also outperformed the response times of the relational database, but required more disk space and had a much longer indexing time. Systems like Couchbase are thus interesting research targets for scalable storage and querying of archetype-based EHR data when population-based use cases are of interest.
NASA Astrophysics Data System (ADS)
Velazquez, Enrique Israel
Improvements in medical and genomic technologies have dramatically increased the production of electronic data over the last decade. As a result, data management is rapidly becoming a major determinant, and urgent challenge, for the development of Precision Medicine. Although successful data management is achievable using Relational Database Management Systems (RDBMS), exponential data growth is a significant contributor to failure scenarios. Growing amounts of data can also be observed in other sectors, such as economics and business, which, together with the previous facts, suggests that alternate database approaches (NoSQL) may soon be required for efficient storage and management of big databases. However, this hypothesis has been difficult to test in the Precision Medicine field since alternate database architectures are complex to assess and means to integrate heterogeneous electronic health records (EHR) with dynamic genomic data are not easily available. In this dissertation, we present a novel set of experiments for identifying NoSQL database approaches that enable effective data storage and management in Precision Medicine using patients' clinical and genomic information from the cancer genome atlas (TCGA). The first experiment draws on performance and scalability from biologically meaningful queries with differing complexity and database sizes. The second experiment measures performance and scalability in database updates without schema changes. The third experiment assesses performance and scalability in database updates with schema modifications due dynamic data. We have identified two NoSQL approach, based on Cassandra and Redis, which seems to be the ideal database management systems for our precision medicine queries in terms of performance and scalability. We present NoSQL approaches and show how they can be used to manage clinical and genomic big data. Our research is relevant to the public health since we are focusing on one of the main challenges to the development of Precision Medicine and, consequently, investigating a potential solution to the progressively increasing demands on health care.
Assessment & Commitment Tracking System (ACTS)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bryant, Robert A.; Childs, Teresa A.; Miller, Michael A.
2004-12-20
The ACTS computer code provides a centralized tool for planning and scheduling assessments, tracking and managing actions associated with assessments or that result from an event or condition, and "mining" data for reporting and analyzing information for improving performance. The ACTS application is designed to work with the MS SQL database management system. All database interfaces are written in SQL. The following software is used to develop and support the ACTS application: Cold Fusion HTML JavaScript Quest TOAD Microsoft Visual Source Safe (VSS) HTML Mailer for sending email Microsoft SQL Microsoft Internet Information Server
NoSQL technologies for the CMS Conditions Database
NASA Astrophysics Data System (ADS)
Sipos, Roland
2015-12-01
With the restart of the LHC in 2015, the growth of the CMS Conditions dataset will continue, therefore the need of consistent and highly available access to the Conditions makes a great cause to revisit different aspects of the current data storage solutions. We present a study of alternative data storage backends for the Conditions Databases, by evaluating some of the most popular NoSQL databases to support a key-value representation of the CMS Conditions. The definition of the database infrastructure is based on the need of storing the conditions as BLOBs. Because of this, each condition can reach the size that may require special treatment (splitting) in these NoSQL databases. As big binary objects may be problematic in several database systems, and also to give an accurate baseline, a testing framework extension was implemented to measure the characteristics of the handling of arbitrary binary data in these databases. Based on the evaluation, prototypes of a document store, using a column-oriented and plain key-value store, are deployed. An adaption layer to access the backends in the CMS Offline software was developed to provide transparent support for these NoSQL databases in the CMS context. Additional data modelling approaches and considerations in the software layer, deployment and automatization of the databases are also covered in the research. In this paper we present the results of the evaluation as well as a performance comparison of the prototypes studied.
Comparing IndexedHBase and Riak for Serving Truthy: Performance of Data Loading and Query Evaluation
2013-08-01
Research Triangle Park, NC 27709-2211 15. SUBJECT TERMS performance evaluation, distributed database, noSQL , HBase, indexing Xiaoming Gao, Judy Qiu...common hashtags created during a given time window. With the purpose of finding a solution for these challenges, we evaluate NoSQL databases such as
2012-11-27
with powerful analysis tools and an informatics approach leveraging best-of-breed NoSQL databases, in order to store, search and retrieve relevant...dictionaries, and JavaScript also has good support. The MongoDB project[15] was chosen as a scalable NoSQL data store for the cheminfor- matics components
A mobile information management system used in textile enterprises
NASA Astrophysics Data System (ADS)
Huang, C.-R.; Yu, W.-D.
2008-02-01
The mobile information management system (MIMS) for textile enterprises is based on Microsoft Visual Studios. NET2003 Server, Microsoft SQL Server 2000, C++ language and wireless application protocol (WAP) and wireless markup language (WML) technology. The portable MIMS is composed of three-layer structures, i.e. showing layer; operating layer; and data visiting layer corresponding to the port-link module; processing module; and database module. By using the MIMS, not only the information exchanges become more convenient and easier, but also the compatible between the giant information capacity and a micro-cell phone and functional expansion nature in operating and designing can be realized by means of build-in units. The development of MIMS is suitable for the utilization in textile enterprises.
[Establishement for regional pelvic trauma database in Hunan Province].
Cheng, Liang; Zhu, Yong; Long, Haitao; Yang, Junxiao; Sun, Buhua; Li, Kanghua
2017-04-28
To establish a database for pelvic trauma in Hunan Province, and to start the work of multicenter pelvic trauma registry. Methods: To establish the database, literatures relevant to pelvic trauma were screened, the experiences from the established trauma database in China and abroad were learned, and the actual situations for pelvic trauma rescue in Hunan Province were considered. The database for pelvic trauma was established based on the PostgreSQL and the advanced programming language Java 1.6. Results: The complex procedure for pelvic trauma rescue was described structurally. The contents for the database included general patient information, injurious condition, prehospital rescue, conditions in admission, treatment in hospital, status on discharge, diagnosis, classification, complication, trauma scoring and therapeutic effect. The database can be accessed through the internet by browser/servicer. The functions for the database include patient information management, data export, history query, progress report, video-image management and personal information management. Conclusion: The database with whole life cycle pelvic trauma is successfully established for the first time in China. It is scientific, functional, practical, and user-friendly.
Evolution of the LBT Telemetry System
NASA Astrophysics Data System (ADS)
Summers, K.; Biddick, C.; De La Peña, M. D.; Summers, D.
2014-05-01
The Large Binocular Telescope (LBT) Telescope Control System (TCS) records about 10GB of telemetry data per night. Additionally, the vibration monitoring system records about 9GB of telemetry data per night. Through 2013, we have amassed over 6TB of Hierarchical Data Format (HDF5) files and almost 9TB in a MySQL database of TCS and vibration data. The LBT telemetry system, in its third major revision since 2004, provides the mechanism to capture and store this data. The telemetry system has evolved from a simple HDF file system with MySQL stream definitions within the TCS, to a separate system using a MySQL database system for the definitions and data, and finally to no database use at all, using HDF5 files.
The MAO NASU Plate Archive Database. Current Status and Perspectives
NASA Astrophysics Data System (ADS)
Pakuliak, L. K.; Sergeeva, T. P.
2006-04-01
The preliminary online version of the database of the MAO NASU plate archive is constructed on the basis of the relational database management system MySQL and permits an easy supplement of database with new collections of astronegatives, provides a high flexibility in constructing SQL-queries for data search optimization, PHP Basic Authorization protected access to administrative interface and wide range of search parameters. The current status of the database will be reported and the brief description of the search engine and means of the database integrity support will be given. Methods and means of the data verification and tasks for the further development will be discussed.
Database constraints applied to metabolic pathway reconstruction tools.
Vilaplana, Jordi; Solsona, Francesc; Teixido, Ivan; Usié, Anabel; Karathia, Hiren; Alves, Rui; Mateo, Jordi
2014-01-01
Our group developed two biological applications, Biblio-MetReS and Homol-MetReS, accessing the same database of organisms with annotated genes. Biblio-MetReS is a data-mining application that facilitates the reconstruction of molecular networks based on automated text-mining analysis of published scientific literature. Homol-MetReS allows functional (re)annotation of proteomes, to properly identify both the individual proteins involved in the process(es) of interest and their function. It also enables the sets of proteins involved in the process(es) in different organisms to be compared directly. The efficiency of these biological applications is directly related to the design of the shared database. We classified and analyzed the different kinds of access to the database. Based on this study, we tried to adjust and tune the configurable parameters of the database server to reach the best performance of the communication data link to/from the database system. Different database technologies were analyzed. We started the study with a public relational SQL database, MySQL. Then, the same database was implemented by a MapReduce-based database named HBase. The results indicated that the standard configuration of MySQL gives an acceptable performance for low or medium size databases. Nevertheless, tuning database parameters can greatly improve the performance and lead to very competitive runtimes.
Database Migration for Command and Control
2002-11-01
Sql - proprietary JDP Private Area Air defense data Defended asset list Oracle 7.3.2 - Automated process (OLTP...TADIL warnings Oracle 7.3.2 Flat File - Discrete transaction with data upds - NRT response required Pull mission data Std SQL ...level execution data Oracle 7.3 User update External interfaces Auto/manual backup Messaging Proprietary replication (internally) SQL Web server
NASA Astrophysics Data System (ADS)
Gaspar Aparicio, R.; Gomez, D.; Coterillo Coz, I.; Wojcik, D.
2012-12-01
At CERN a number of key database applications are running on user-managed MySQL database services. The database on demand project was born out of an idea to provide the CERN user community with an environment to develop and run database services outside of the actual centralised Oracle based database services. The Database on Demand (DBoD) empowers the user to perform certain actions that had been traditionally done by database administrators, DBA's, providing an enterprise platform for database applications. It also allows the CERN user community to run different database engines, e.g. presently open community version of MySQL and single instance Oracle database server. This article describes a technology approach to face this challenge, a service level agreement, the SLA that the project provides, and an evolution of possible scenarios.
2014-06-01
central location. Each of the SQLite databases are converted and stored in one MySQL database and the pcap files are parsed to extract call information...from the specific communications applications used during the experiment. This extracted data is then stored in the same MySQL database. With all...rhythm of the event. Figure 3 demonstrates the application usage over the course of the experiment for the EXDIR. As seen, the EXDIR spent the majority
An Analysis Platform for Mobile Ad Hoc Network (MANET) Scenario Execution Log Data
2016-01-01
these technologies. 4.1 Backend Technologies • Java 1.8 • my-sql-connector- java -5.0.8.jar • Tomcat • VirtualBox • Kali MANET Virtual Machine 4.2...Frontend Technologies • LAMPP 4.3 Database • MySQL Server 5. Database The SEDAP database settings and structure are described in this section...contains all the backend java functionality including the web services, should be placed in the webapps directory inside the Tomcat installation
A veterinary anatomy tutoring system.
Theodoropoulos, G; Loumos, V; Antonopoulos, J
1994-02-14
A veterinary anatomy tutoring system was developed by using Knowledge Pro, an object-oriented software development tool with hypermedia capabilities, and MS Access, a relational database. Communication between them is facilitated by using the Structured Query Language (SQL). The architecture of the system is based on knowledge sets, each of which covers four different descriptions of an organ, namely gross anatomy (general description), gross anatomy (comparative features), histology, and embryology, which constitute the knowledge units. These knowledge units are linked with three global variables that define the animals, the topographies, and the system to which this organ belongs, creating three data-bases. These three data-bases are interrelated through the organ field in order to establish a relational model. This system allows versatility in the student's navigation through the information space by offering different modes for information location and presentation. These include course mode, review mode, reference mode, dissection mode, and comparison mode. In addition, the system provides a self-evaluation mode.
Tamm, E P; Kawashima, A; Silverman, P
2001-06-01
Current commercial radiology information systems (RIS) are designed for scheduling, billing, charge collection, and report dissemination. Academic institutions have additional requirements for their missions for teaching, research and clinical care. The newest versions of commercial RIS offer greater flexibility than prior systems. We sent questionnaires to Cerner Corporation, ADAC Health Care Information Systems, IDX Systems, Per-Se' Technologies, and Siemens Health Services regarding features of their products. All of the products we surveyed offer user customizable fields. However, most products did not allow the user to expand their product's data table. The search capabilities of the products varied. All of the products supported the Health Level 7 (HL-7) interface and the use of structured query language (SQL). All of the products were offered with an SQL editor for creating customized queries and custom reports. All products included capabilities for collecting data for quality assurance and included capabilities for tracking "interesting cases," though they varied in the functionality offered. No product offered dedicated functions for research. Alternatively, radiology departments can create their own client-server Windows-based database systems to supplement the capabilities of commercial systems. Such systems can be developed with "web-enabled" database products like Microsoft Access or Apple Filemaker Pro.
Lü, Yiran; Hao, Shuxin; Zhang, Guoqing; Liu, Jie; Liu, Yue; Xu, Dongqun
2018-01-01
To implement the online statistical analysis function in information system of air pollution and health impact monitoring, and obtain the data analysis information real-time. Using the descriptive statistical method as well as time-series analysis and multivariate regression analysis, SQL language and visual tools to implement online statistical analysis based on database software. Generate basic statistical tables and summary tables of air pollution exposure and health impact data online; Generate tendency charts of each data part online and proceed interaction connecting to database; Generate butting sheets which can lead to R, SAS and SPSS directly online. The information system air pollution and health impact monitoring implements the statistical analysis function online, which can provide real-time analysis result to its users.
Master Metadata Repository and Metadata-Management System
NASA Technical Reports Server (NTRS)
Armstrong, Edward; Reed, Nate; Zhang, Wen
2007-01-01
A master metadata repository (MMR) software system manages the storage and searching of metadata pertaining to data from national and international satellite sources of the Global Ocean Data Assimilation Experiment (GODAE) High Resolution Sea Surface Temperature Pilot Project [GHRSSTPP]. These sources produce a total of hundreds of data files daily, each file classified as one of more than ten data products representing global sea-surface temperatures. The MMR is a relational database wherein the metadata are divided into granulelevel records [denoted file records (FRs)] for individual satellite files and collection-level records [denoted data set descriptions (DSDs)] that describe metadata common to all the files from a specific data product. FRs and DSDs adhere to the NASA Directory Interchange Format (DIF). The FRs and DSDs are contained in separate subdatabases linked by a common field. The MMR is configured in MySQL database software with custom Practical Extraction and Reporting Language (PERL) programs to validate and ingest the metadata records. The database contents are converted into the Federal Geographic Data Committee (FGDC) standard format by use of the Extensible Markup Language (XML). A Web interface enables users to search for availability of data from all sources.
Nosql for Storage and Retrieval of Large LIDAR Data Collections
NASA Astrophysics Data System (ADS)
Boehm, J.; Liu, K.
2015-08-01
Developments in LiDAR technology over the past decades have made LiDAR to become a mature and widely accepted source of geospatial information. This in turn has led to an enormous growth in data volume. The central idea for a file-centric storage of LiDAR point clouds is the observation that large collections of LiDAR data are typically delivered as large collections of files, rather than single files of terabyte size. This split of the dataset, commonly referred to as tiling, was usually done to accommodate a specific processing pipeline. It makes therefore sense to preserve this split. A document oriented NoSQL database can easily emulate this data partitioning, by representing each tile (file) in a separate document. The document stores the metadata of the tile. The actual files are stored in a distributed file system emulated by the NoSQL database. We demonstrate the use of MongoDB a highly scalable document oriented NoSQL database for storing large LiDAR files. MongoDB like any NoSQL database allows for queries on the attributes of the document. As a specialty MongoDB also allows spatial queries. Hence we can perform spatial queries on the bounding boxes of the LiDAR tiles. Inserting and retrieving files on a cloud-based database is compared to native file system and cloud storage transfer speed.
Database Constraints Applied to Metabolic Pathway Reconstruction Tools
Vilaplana, Jordi; Solsona, Francesc; Teixido, Ivan; Usié, Anabel; Karathia, Hiren; Alves, Rui; Mateo, Jordi
2014-01-01
Our group developed two biological applications, Biblio-MetReS and Homol-MetReS, accessing the same database of organisms with annotated genes. Biblio-MetReS is a data-mining application that facilitates the reconstruction of molecular networks based on automated text-mining analysis of published scientific literature. Homol-MetReS allows functional (re)annotation of proteomes, to properly identify both the individual proteins involved in the process(es) of interest and their function. It also enables the sets of proteins involved in the process(es) in different organisms to be compared directly. The efficiency of these biological applications is directly related to the design of the shared database. We classified and analyzed the different kinds of access to the database. Based on this study, we tried to adjust and tune the configurable parameters of the database server to reach the best performance of the communication data link to/from the database system. Different database technologies were analyzed. We started the study with a public relational SQL database, MySQL. Then, the same database was implemented by a MapReduce-based database named HBase. The results indicated that the standard configuration of MySQL gives an acceptable performance for low or medium size databases. Nevertheless, tuning database parameters can greatly improve the performance and lead to very competitive runtimes. PMID:25202745
Database on Demand: insight how to build your own DBaaS
NASA Astrophysics Data System (ADS)
Gaspar Aparicio, Ruben; Coterillo Coz, Ignacio
2015-12-01
At CERN, a number of key database applications are running on user-managed MySQL, PostgreSQL and Oracle database services. The Database on Demand (DBoD) project was born out of an idea to provide CERN user community with an environment to develop and run database services as a complement to the central Oracle based database service. The Database on Demand empowers the user to perform certain actions that had been traditionally done by database administrators, providing an enterprise platform for database applications. It also allows the CERN user community to run different database engines, e.g. presently three major RDBMS (relational database management system) vendors are offered. In this article we show the actual status of the service after almost three years of operations, some insight of our new redesign software engineering and near future evolution.
Informatics in radiology: use of CouchDB for document-based storage of DICOM objects.
Rascovsky, Simón J; Delgado, Jorge A; Sanz, Alexander; Calvo, Víctor D; Castrillón, Gabriel
2012-01-01
Picture archiving and communication systems traditionally have depended on schema-based Structured Query Language (SQL) databases for imaging data management. To optimize database size and performance, many such systems store a reduced set of Digital Imaging and Communications in Medicine (DICOM) metadata, discarding informational content that might be needed in the future. As an alternative to traditional database systems, document-based key-value stores recently have gained popularity. These systems store documents containing key-value pairs that facilitate data searches without predefined schemas. Document-based key-value stores are especially suited to archive DICOM objects because DICOM metadata are highly heterogeneous collections of tag-value pairs conveying specific information about imaging modalities, acquisition protocols, and vendor-supported postprocessing options. The authors used an open-source document-based database management system (Apache CouchDB) to create and test two such databases; CouchDB was selected for its overall ease of use, capability for managing attachments, and reliance on HTTP and Representational State Transfer standards for accessing and retrieving data. A large database was created first in which the DICOM metadata from 5880 anonymized magnetic resonance imaging studies (1,949,753 images) were loaded by using a Ruby script. To provide the usual DICOM query functionality, several predefined "views" (standard queries) were created by using JavaScript. For performance comparison, the same queries were executed in both the CouchDB database and a SQL-based DICOM archive. The capabilities of CouchDB for attachment management and database replication were separately assessed in tests of a similar, smaller database. Results showed that CouchDB allowed efficient storage and interrogation of all DICOM objects; with the use of information retrieval algorithms such as map-reduce, all the DICOM metadata stored in the large database were searchable with only a minimal increase in retrieval time over that with the traditional database management system. Results also indicated possible uses for document-based databases in data mining applications such as dose monitoring, quality assurance, and protocol optimization. RSNA, 2012
A Data Warehouse to Support Condition Based Maintenance (CBM)
2005-05-01
Application ( VBA ) code sequence to import the original MAST-generated CSV and then create a single output table in DBASE IV format. The DBASE IV format...database architecture (Oracle, Sybase, MS- SQL , etc). This design includes table definitions, comments, specification of table attributes, primary and foreign...built queries and applications. Needs the application developers to construct data views. No SQL programming experience. b. Power Database User - knows
2016-03-01
science IT information technology JBOD just a bunch of disks JDBC java database connectivity xviii JPME Joint Professional Military Education JSO...Joint Service Officer JVM java virtual machine MPP massively parallel processing MPTE Manpower, Personnel, Training, and Education NAVMAC Navy...27 external database, whether it is MySQL , Oracle, DB2, or SQL Server (Teller, 2015). Connectors optimize the data transfer by obtaining metadata
Quality Attribute-Guided Evaluation of NoSQL Databases: An Experience Report
2014-10-18
detailed technical evaluations of NoSQL databases specifically, and big data systems in general, that have become apparent during our study... big data , software systems [Agarwal 2011]. Internet-born organizations such as Google and Amazon are at the cutting edge of this revolution...Chang 2008], along with those of numerous other big data innovators, have made a variety of open source and commercial data management technologies
Alignment of high-throughput sequencing data inside in-memory databases.
Firnkorn, Daniel; Knaup-Gregori, Petra; Lorenzo Bermejo, Justo; Ganzinger, Matthias
2014-01-01
In times of high-throughput DNA sequencing techniques, performance-capable analysis of DNA sequences is of high importance. Computer supported DNA analysis is still an intensive time-consuming task. In this paper we explore the potential of a new In-Memory database technology by using SAP's High Performance Analytic Appliance (HANA). We focus on read alignment as one of the first steps in DNA sequence analysis. In particular, we examined the widely used Burrows-Wheeler Aligner (BWA) and implemented stored procedures in both, HANA and the free database system MySQL, to compare execution time and memory management. To ensure that the results are comparable, MySQL has been running in memory as well, utilizing its integrated memory engine for database table creation. We implemented stored procedures, containing exact and inexact searching of DNA reads within the reference genome GRCh37. Due to technical restrictions in SAP HANA concerning recursion, the inexact matching problem could not be implemented on this platform. Hence, performance analysis between HANA and MySQL was made by comparing the execution time of the exact search procedures. Here, HANA was approximately 27 times faster than MySQL which means, that there is a high potential within the new In-Memory concepts, leading to further developments of DNA analysis procedures in the future.
Techniques for Efficiently Managing Large Geosciences Data Sets
NASA Astrophysics Data System (ADS)
Kruger, A.; Krajewski, W. F.; Bradley, A. A.; Smith, J. A.; Baeck, M. L.; Steiner, M.; Lawrence, R. E.; Ramamurthy, M. K.; Weber, J.; Delgreco, S. A.; Domaszczynski, P.; Seo, B.; Gunyon, C. A.
2007-12-01
We have developed techniques and software tools for efficiently managing large geosciences data sets. While the techniques were developed as part of an NSF-Funded ITR project that focuses on making NEXRAD weather data and rainfall products available to hydrologists and other scientists, they are relevant to other geosciences disciplines that deal with large data sets. Metadata, relational databases, data compression, and networking are central to our methodology. Data and derived products are stored on file servers in a compressed format. URLs to, and metadata about the data and derived products are managed in a PostgreSQL database. Virtually all access to the data and products is through this database. Geosciences data normally require a number of processing steps to transform the raw data into useful products: data quality assurance, coordinate transformations and georeferencing, applying calibration information, and many more. We have developed the concept of crawlers that manage this scientific workflow. Crawlers are unattended processes that run indefinitely, and at set intervals query the database for their next assignment. A database table functions as a roster for the crawlers. Crawlers perform well-defined tasks that are, except for perhaps sequencing, largely independent from other crawlers. Once a crawler is done with its current assignment, it updates the database roster table, and gets its next assignment by querying the database. We have developed a library that enables one to quickly add crawlers. The library provides hooks to external (i.e., C-language) compiled codes, so that developers can work and contribute independently. Processes called ingesters inject data into the system. The bulk of the data are from a real-time feed using UCAR/Unidata's IDD/LDM software. An exciting recent development is the establishment of a Unidata HYDRO feed that feeds value-added metadata over the IDD/LDM. Ingesters grab the metadata and populate the PostgreSQL tables. These and other concepts we have developed have enabled us to efficiently manage a 70 Tb (and growing) data weather radar data set.
Providing R-Tree Support for Mongodb
NASA Astrophysics Data System (ADS)
Xiang, Longgang; Shao, Xiaotian; Wang, Dehao
2016-06-01
Supporting large amounts of spatial data is a significant characteristic of modern databases. However, unlike some mature relational databases, such as Oracle and PostgreSQL, most of current burgeoning NoSQL databases are not well designed for storing geospatial data, which is becoming increasingly important in various fields. In this paper, we propose a novel method to provide R-tree index, as well as corresponding spatial range query and nearest neighbour query functions, for MongoDB, one of the most prevalent NoSQL databases. First, after in-depth analysis of MongoDB's features, we devise an efficient tabular document structure which flattens R-tree index into MongoDB collections. Further, relevant mechanisms of R-tree operations are issued, and then we discuss in detail how to integrate R-tree into MongoDB. Finally, we present the experimental results which show that our proposed method out-performs the built-in spatial index of MongoDB. Our research will greatly facilitate big data management issues with MongoDB in a variety of geospatial information applications.
Experience with ATLAS MySQL PanDA database service
NASA Astrophysics Data System (ADS)
Smirnov, Y.; Wlodek, T.; De, K.; Hover, J.; Ozturk, N.; Smith, J.; Wenaus, T.; Yu, D.
2010-04-01
The PanDA distributed production and analysis system has been in production use for ATLAS data processing and analysis since late 2005 in the US, and globally throughout ATLAS since early 2008. Its core architecture is based on a set of stateless web services served by Apache and backed by a suite of MySQL databases that are the repository for all PanDA information: active and archival job queues, dataset and file catalogs, site configuration information, monitoring information, system control parameters, and so on. This database system is one of the most critical components of PanDA, and has successfully delivered the functional and scaling performance required by PanDA, currently operating at a scale of half a million jobs per week, with much growth still to come. In this paper we describe the design and implementation of the PanDA database system, its architecture of MySQL servers deployed at BNL and CERN, backup strategy and monitoring tools. The system has been developed, thoroughly tested, and brought to production to provide highly reliable, scalable, flexible and available database services for ATLAS Monte Carlo production, reconstruction and physics analysis.
NASA Astrophysics Data System (ADS)
Knosp, B.; Gangl, M.; Hristova-Veleva, S. M.; Kim, R. M.; Li, P.; Turk, J.; Vu, Q. A.
2015-12-01
The JPL Tropical Cyclone Information System (TCIS) brings together satellite, aircraft, and model forecast data from several NASA, NOAA, and other data centers to assist researchers in comparing and analyzing data and model forecast related to tropical cyclones. The TCIS has been running a near-real time (NRT) data portal during North Atlantic hurricane season that typically runs from June through October each year, since 2010. Data collected by the TCIS varies by type, format, contents, and frequency and is served to the user in two ways: (1) as image overlays on a virtual globe and (2) as derived output from a suite of analysis tools. In order to support these two functions, the data must be collected and then made searchable by criteria such as date, mission, product, pressure level, and geospatial region. Creating a database architecture that is flexible enough to manage, intelligently interrogate, and ultimately present this disparate data to the user in a meaningful way has been the primary challenge. The database solution for the TCIS has been to use a hybrid MySQL + Solr implementation. After testing other relational database and NoSQL solutions, such as PostgreSQL and MongoDB respectively, this solution has given the TCIS the best offerings in terms of query speed and result reliability. This database solution also supports the challenging (and memory overwhelming) geospatial queries that are necessary to support analysis tools requested by users. Though hardly new technologies on their own, our implementation of MySQL + Solr had to be customized and tuned to be able to accurately store, index, and search the TCIS data holdings. In this presentation, we will discuss how we arrived on our MySQL + Solr database architecture, why it offers us the most consistent fast and reliable results, and how it supports our front end so that we can offer users a look into our "big data" holdings.
CheD: chemical database compilation tool, Internet server, and client for SQL servers.
Trepalin, S V; Yarkov, A V
2001-01-01
An efficient program, which runs on a personal computer, for the storage, retrieval, and processing of chemical information, is presented, The program can work both as a stand-alone application or in conjunction with a specifically written Web server application or with some standard SQL servers, e.g., Oracle, Interbase, and MS SQL. New types of data fields are introduced, e.g., arrays for spectral information storage, HTML and database links, and user-defined functions. CheD has an open architecture; thus, custom data types, controls, and services may be added. A WWW server application for chemical data retrieval features an easy and user-friendly installation on Windows NT or 95 platforms.
Domain fusion analysis by applying relational algebra to protein sequence and domain databases
Truong, Kevin; Ikura, Mitsuhiko
2003-01-01
Background Domain fusion analysis is a useful method to predict functionally linked proteins that may be involved in direct protein-protein interactions or in the same metabolic or signaling pathway. As separate domain databases like BLOCKS, PROSITE, Pfam, SMART, PRINTS-S, ProDom, TIGRFAMs, and amalgamated domain databases like InterPro continue to grow in size and quality, a computational method to perform domain fusion analysis that leverages on these efforts will become increasingly powerful. Results This paper proposes a computational method employing relational algebra to find domain fusions in protein sequence databases. The feasibility of this method was illustrated on the SWISS-PROT+TrEMBL sequence database using domain predictions from the Pfam HMM (hidden Markov model) database. We identified 235 and 189 putative functionally linked protein partners in H. sapiens and S. cerevisiae, respectively. From scientific literature, we were able to confirm many of these functional linkages, while the remainder offer testable experimental hypothesis. Results can be viewed at . Conclusion As the analysis can be computed quickly on any relational database that supports standard SQL (structured query language), it can be dynamically updated along with the sequence and domain databases, thereby improving the quality of predictions over time. PMID:12734020
Assembling proteomics data as a prerequisite for the analysis of large scale experiments
Schmidt, Frank; Schmid, Monika; Thiede, Bernd; Pleißner, Klaus-Peter; Böhme, Martina; Jungblut, Peter R
2009-01-01
Background Despite the complete determination of the genome sequence of a huge number of bacteria, their proteomes remain relatively poorly defined. Beside new methods to increase the number of identified proteins new database applications are necessary to store and present results of large- scale proteomics experiments. Results In the present study, a database concept has been developed to address these issues and to offer complete information via a web interface. In our concept, the Oracle based data repository system SQL-LIMS plays the central role in the proteomics workflow and was applied to the proteomes of Mycobacterium tuberculosis, Helicobacter pylori, Salmonella typhimurium and protein complexes such as 20S proteasome. Technical operations of our proteomics labs were used as the standard for SQL-LIMS template creation. By means of a Java based data parser, post-processed data of different approaches, such as LC/ESI-MS, MALDI-MS and 2-D gel electrophoresis (2-DE), were stored in SQL-LIMS. A minimum set of the proteomics data were transferred in our public 2D-PAGE database using a Java based interface (Data Transfer Tool) with the requirements of the PEDRo standardization. Furthermore, the stored proteomics data were extractable out of SQL-LIMS via XML. Conclusion The Oracle based data repository system SQL-LIMS played the central role in the proteomics workflow concept. Technical operations of our proteomics labs were used as standards for SQL-LIMS templates. Using a Java based parser, post-processed data of different approaches such as LC/ESI-MS, MALDI-MS and 1-DE and 2-DE were stored in SQL-LIMS. Thus, unique data formats of different instruments were unified and stored in SQL-LIMS tables. Moreover, a unique submission identifier allowed fast access to all experimental data. This was the main advantage compared to multi software solutions, especially if personnel fluctuations are high. Moreover, large scale and high-throughput experiments must be managed in a comprehensive repository system such as SQL-LIMS, to query results in a systematic manner. On the other hand, these database systems are expensive and require at least one full time administrator and specialized lab manager. Moreover, the high technical dynamics in proteomics may cause problems to adjust new data formats. To summarize, SQL-LIMS met the requirements of proteomics data handling especially in skilled processes such as gel-electrophoresis or mass spectrometry and fulfilled the PSI standardization criteria. The data transfer into a public domain via DTT facilitated validation of proteomics data. Additionally, evaluation of mass spectra by post-processing using MS-Screener improved the reliability of mass analysis and prevented storage of data junk. PMID:19166578
2004-06-01
remote databases, has seen little vendor acceptance. Each database ( Oracle , DB2, MySQL , etc.) has its own client- server protocol. Therefore each...existing standards – SQL , X.500/LDAP, FTP, etc. • View information dissemination as selective replication – State-oriented vs . message-oriented...allowing the 8 application to start. The resource management system would serve as a broker to the resources, making sure that resources are not
Delayed Instantiation Bulk Operations for Management of Distributed, Object-Based Storage Systems
2009-08-01
source and destination object sets, while they have attribute pages to indicate that history . Fourth, we allow for operations to occur on any objects...client dialogue to the PostgreSQL database where server-side functions implement the service logic for the requests. The translation is done...to satisfy client requests, and performs delayed instantiation bulk operations. It is built around a PostgreSQL database with tables for storing
Wong, Wing Chung; Kim, Dewey; Carter, Hannah; Diekhans, Mark; Ryan, Michael C; Karchin, Rachel
2011-08-01
Thousands of cancer exomes are currently being sequenced, yielding millions of non-synonymous single nucleotide variants (SNVs) of possible relevance to disease etiology. Here, we provide a software toolkit to prioritize SNVs based on their predicted contribution to tumorigenesis. It includes a database of precomputed, predictive features covering all positions in the annotated human exome and can be used either stand-alone or as part of a larger variant discovery pipeline. MySQL database, source code and binaries freely available for academic/government use at http://wiki.chasmsoftware.org, Source in Python and C++. Requires 32 or 64-bit Linux system (tested on Fedora Core 8,10,11 and Ubuntu 10), 2.5*≤ Python <3.0*, MySQL server >5.0, 60 GB available hard disk space (50 MB for software and data files, 40 GB for MySQL database dump when uncompressed), 2 GB of RAM.
Comparison of response times of a mobile-web EHRs system using PHP and JSP languages.
De la Torre-Díez, Isabel; Antón-Rodríguez, Míriam; Díaz-Pernas, Francisco Javier; Perozo-Rondón, Freddy José
2012-12-01
Performance evaluation is highly important in the Electronic Health Records (EHRs) system implementation. Response time's measurement can be considered as one manner to make that evaluation. In the e-health field, after the creation of EHRs available through different platforms such as Web and/or mobile, a performance evaluation is necessary. The operation of the system in the right way is essential. In this paper, a comparison of the response times for the MEHRmobile system is presented. The first version uses PHP language with a MySQL database and the second one employs JSP with an eXist database. Both versions have got the same functionalities. In addition to the technological aspects, a significant difference is the way the information is stored. The main goal of this paper is choosing the version which offers better response times. We have created a new benchmark to calculate the response times. Better results have been obtained for the PHP version. Nowadays, this version is being used for specialists from Fundación Intras, Spain.
An XML-Based Knowledge Management System of Port Information for U.S. Coast Guard Cutters
2003-03-01
using DTDs was not chosen. XML Schema performs many of the same functions as SQL type schemas, but differ by the unique structure of XML documents...to access data from content files within the developed system. XPath is not equivalent to SQL . While XPath is very powerful at reaching into an XML...document and finding nodes or node sets, it is not a complete query language. For operations like joins, unions, intersections, etc., SQL is far
Defense Information Systems Agency Technical Integration Support (DISA- TIS). MUMPS Study.
1993-01-01
usable in DoD, MUMPS must continue to improve in its support of DoD and OSE standards such as SQL , X-Windows, POSIX, PHIGS, etc. MUMPS and large AlSs...Language ( SQL ), X-Windows, and Graphical Kernel Services (GKS)) 2.2.2.3 FIPS Adoption by NIST The National Institute of Standards and Technology (NIST...many of the performance tuning mechanisms that must be performed explicitly with other systems. The VA looks forward to the SQL binding (1993 ANS) that
De-MA: a web Database for electron Microprobe Analyses to assist EMP lab manager and users
NASA Astrophysics Data System (ADS)
Allaz, J. M.
2012-12-01
Lab managers and users of electron microprobe (EMP) facilities require comprehensive, yet flexible documentation structures, as well as an efficient scheduling mechanism. A single on-line database system for managing reservations, and providing information on standards, quantitative and qualitative setups (element mapping, etc.), and X-ray data has been developed for this purpose. This system is particularly useful in multi-user facilities where experience ranges from beginners to the highly experienced. New users and occasional facility users will find these tools extremely useful in developing and maintaining high quality, reproducible, and efficient analyses. This user-friendly database is available through the web, and uses MySQL as a database and PHP/HTML as script language (dynamic website). The database includes several tables for standards information, X-ray lines, X-ray element mapping, PHA, element setups, and agenda. It is configurable for up to five different EMPs in a single lab, each of them having up to five spectrometers and as many diffraction crystals as required. The installation should be done on a web server supporting PHP/MySQL, although installation on a personal computer is possible using third-party freeware to create a local Apache server, and to enable PHP/MySQL. Since it is web-based, any user outside the EMP lab can access this database anytime through any web browser and on any operating system. The access can be secured using a general password protection (e.g. htaccess). The web interface consists of 6 main menus. (1) "Standards" lists standards defined in the database, and displays detailed information on each (e.g. material type, name, reference, comments, and analyses). Images such as EDS spectra or BSE can be associated with a standard. (2) "Analyses" lists typical setups to use for quantitative analyses, allows calculation of mineral composition based on a mineral formula, or calculation of mineral formula based on a fixed amount of oxygen, or of cation (using an analysis in element or oxide weight-%); this latter includes re-calculation of H2O/CO2 based on stoichiometry, and oxygen correction for F and Cl. Another option offers a list of any available standards and possible peak or background interferences for a series of elements. (3) "X-ray maps" lists the different setups recommended for element mapping using WDS, and a map calculator to facilitate maps setups and to estimate the total mapping time. (4) "X-ray data" lists all x-ray lines for a specific element (K, L, M, absorption edges, and satellite peaks) in term of energy, wavelength and peak position. A check for possible interferences on peak or background is also possible. Theoretical x-ray peak positions for each crystal are calculated based on the 2d spacing of each crystal and the wavelength of each line. (5) "Agenda" menu displays the reservation dates for each month and for each EMP lab defined. It also offers a reservation request option, this request being sent by email to the EMP manager for approval. (6) Finally, "Admin" is password restricted, and contains all necessary options to manage the database through user-friendly forms. The installation of this database is made easy and knowledge of HTML, PHP, or MySQL is unnecessary to install, configure, manage, or use it. A working database is accessible at http://cub.geoloweb.ch.
NASA Astrophysics Data System (ADS)
Ifimov, Gabriela; Pigeau, Grace; Arroyo-Mora, J. Pablo; Soffer, Raymond; Leblanc, George
2017-10-01
In this study the development and implementation of a geospatial database model for the management of multiscale datasets encompassing airborne imagery and associated metadata is presented. To develop the multi-source geospatial database we have used a Relational Database Management System (RDBMS) on a Structure Query Language (SQL) server which was then integrated into ArcGIS and implemented as a geodatabase. The acquired datasets were compiled, standardized, and integrated into the RDBMS, where logical associations between different types of information were linked (e.g. location, date, and instrument). Airborne data, at different processing levels (digital numbers through geocorrected reflectance), were implemented in the geospatial database where the datasets are linked spatially and temporally. An example dataset consisting of airborne hyperspectral imagery, collected for inter and intra-annual vegetation characterization and detection of potential hydrocarbon seepage events over pipeline areas, is presented. Our work provides a model for the management of airborne imagery, which is a challenging aspect of data management in remote sensing, especially when large volumes of data are collected.
NASA Astrophysics Data System (ADS)
Merticariu, Vlad; Misev, Dimitar; Baumann, Peter
2017-04-01
While python has developed into the lingua franca in Data Science there is often a paradigm break when accessing specialized tools. In particular for one of the core data categories in science and engineering, massive multi-dimensional arrays, out-of-memory solutions typically employ their own, different models. We discuss this situation on the example of the scalable open-source array engine, rasdaman ("raster data manager") which offers access to and processing of Petascale multi-dimensional arrays through an SQL-style array query language, rasql. Such queries are executed in the server on a storage engine utilizing adaptive array partitioning and based on a processing engine implementing a "tile streaming" paradigm to allow processing of arrays massively larger than server RAM. The rasdaman QL has acted as blueprint for forthcoming ISO Array SQL and the Open Geospatial Consortium (OGC) geo analytics language, Web Coverage Processing Service, adopted in 2008. Not surprisingly, rasdaman is OGC and INSPIRE Reference Implementation for their "Big Earth Data" standards suite. Recently, rasdaman has been augmented with a python interface which allows to transparently interact with the database (credits go to Siddharth Shukla's Master Thesis at Jacobs University). Programmers do not need to know the rasdaman query language, as the operators are silently transformed, through lazy evaluation, into queries. Arrays delivered are likewise automatically transformed into their python representation. In the talk, the rasdaman concept will be illustrated with the help of large-scale real-life examples of operational satellite image and weather data services, and sample python code.
Introduction to an Open Source Internet-Based Testing Program for Medical Student Examinations
2009-01-01
The author developed a freely available open source internet-based testing program for medical examination. PHP and Java script were used as the programming language and postgreSQL as the database management system on an Apache web server and Linux operating system. The system approach was that a super user inputs the items, each school administrator inputs the examinees' information, and examinees access the system. The examinee's score is displayed immediately after examination with item analysis. The set-up of the system beginning with installation is described. This may help medical professors to easily adopt an internet-based testing system for medical education. PMID:20046457
Introduction to an open source internet-based testing program for medical student examinations.
Lee, Yoon-Hwan
2009-12-20
The author developed a freely available open source internet-based testing program for medical examination. PHP and Java script were used as the programming language and postgreSQL as the database management system on an Apache web server and Linux operating system. The system approach was that a super user inputs the items, each school administrator inputs the examinees' information, and examinees access the system. The examinee's score is displayed immediately after examination with item analysis. The set-up of the system beginning with installation is described. This may help medical professors to easily adopt an internet-based testing system for medical education.
NASA Astrophysics Data System (ADS)
Ivankovic, D.; Dadic, V.
2009-04-01
Some of oceanographic parameters have to be manually inserted into database; some (for example data from CTD probe) are inserted from various files. All this parameters requires visualization, validation and manipulation from research vessel or scientific institution, and also public presentation. For these purposes is developed web based system, containing dynamic sql procedures and java applets. Technology background is Oracle 10g relational database, and Oracle application server. Web interfaces are developed using PL/SQL stored database procedures (mod PL/SQL). Additional parts for data visualization include use of Java applets and JavaScript. Mapping tool is Google maps API (javascript) and as alternative java applet. Graph is realized as dynamically generated web page containing java applet. Mapping tool and graph are georeferenced. That means that click on some part of graph, automatically initiate zoom or marker onto location where parameter was measured. This feature is very useful for data validation. Code for data manipulation and visualization are partially realized with dynamic SQL and that allow as to separate data definition and code for data manipulation. Adding new parameter in system requires only data definition and description without programming interface for this kind of data.
TriatoKey: a web and mobile tool for biodiversity identification of Brazilian triatomine species
Márcia de Oliveira, Luciana; Nogueira de Brito, Raissa; Anderson Souza Guimarães, Paul; Vitor Mastrângelo Amaro dos Santos, Rômulo; Gonçalves Diotaiuti, Liléia; de Cássia Moreira de Souza, Rita
2017-01-01
Abstract Triatomines are blood-sucking insects that transmit the causative agent of Chagas disease, Trypanosoma cruzi. Despite being recognized as a difficult task, the correct taxonomic identification of triatomine species is crucial for vector control in Latin America, where the disease is endemic. In this context, we have developed a web and mobile tool based on PostgreSQL database to help healthcare technicians to overcome the difficulties to identify triatomine vectors when the technical expertise is missing. The web and mobile version makes use of real triatomine species pictures and dichotomous key method to support the identification of potential vectors that occur in Brazil. It provides a user example-driven interface with simple language. TriatoKey can also be useful for educational purposes. Database URL: http://triatokey.cpqrr.fiocruz.br PMID:28605769
Streamlining the Process of Acquiring Secure Open Architecture Software Systems
2013-10-08
Microsoft.NET, Enterprise Java Beans, GNU Lesser General Public License (LGPL) libraries, and data communication protocols like the Hypertext Transfer...NetBeans development environments), customer relationship management (SugarCRM), database management systems (PostgreSQL, MySQL ), operating
The unified database for the fixed target experiment BM@N
NASA Astrophysics Data System (ADS)
Gertsenberger, K. V.
2016-09-01
The article describes the developed database designed as comprehensive data storage of the fixed target experiment BM@N [1] at Joint Institute for Nuclear Research (JINR) in Dubna. The structure and purposes of the BM@N facility will be briefly presented. The scheme of the unified database and its parameters will be described in detail. The use of the BM@N database implemented on the PostgreSQL database management system (DBMS) allows one to provide user access to the actual information of the experiment. Also the interfaces developed for the access to the database will be presented. One was implemented as the set of C++ classes to access the data without SQL statements, the other-Web-interface being available on the Web page of the BM@N experiment.
Life in extra dimensions of database world or penetration of NoSQL in HEP community
NASA Astrophysics Data System (ADS)
Kuznetsov, V.; Evans, D.; Metson, S.
2012-12-01
The recent buzzword in IT world is NoSQL. Major players, such as Facebook, Yahoo, Google, etc. are widely adopted different “NoSQL” solutions for their needs. Horizontal scalability, flexible data model and management of big data volumes are only a few advantages of NoSQL. In CMS experiment we use several of them in production environment. Here, we present CMS projects based on NoSQL solutions, their strengths and weaknesses as well as our experience with those tools and their coexistence with standard RDBMS solutions in our applications.
A Database Design for the Brazilian Air Force Military Personnel Control System.
1987-06-01
GIVEN A RECNum GET MOVING HISTORICAL". 77 SEL4 PlC X(70) VALUE ". 4. GIVEN A RECNUM GET NOMINATION HISTORICAL". 77 SEL5 PIC X(70) VALUE it 5. GIVEN A...WHERE - "°RECNUM = :RECNUM". 77 SQL-SEL3-LENGTH PIC S9999 VALUE 150 COMP. 77 SQL- SEL4 PIC X(150) VALUE "SELECT ABBREV,DTNOM,DTEXO,SITN FROM...NOMINATION WHERE RECNUM 77 SQL- SEL4 -LENGTH PIC S9999 VALUE 150 COMP. 77 SQL-SEL5 PIC X(150) VALUE "SELECT ABBREVDTDES,DTWAIVER,SITD FROM DESIG WHERE RECNUM It
Joint Battlespace Infosphere: Information Management Within a C2 Enterprise
2005-06-01
using. In version 1.2, we support both MySQL and Oracle as underlying implementations where the XML metadata schema is mapped into relational tables in...Identity Servers, Role-Based Access Control, and Policy Representation – Databases: Oracle , MySQL , TigerLogic, Berkeley XML DB 15 Instrumentation Services...converted to SQL for execution. Invocations are then forwarded to the appropriate underlying IOR core components that have the responsibility of issuing
Information Collection using Handheld Devices in Unreliable Networking Environments
2014-06-01
different types of mobile devices that connect wirelessly to a database 8 server. The actual backend database is not important to the mobile clients...Google’s infrastructure and local servers with MySQL and PostgreSQL on the backend (ODK 2014b). (2) Google Fusion Tables are used to do basic link...how we conduct business. Our requirements to share information do not change simply because there is little or no existing infrastructure in our
MicroRNA Gene Regulatory Networks in Peripheral Nerve Sheath Tumors
2013-09-01
3.0 hierarchical clustering of both the X and the Y-axis using Centroid linkage. The resulting clustered matrixes were visualized using Java Treeview...To score potential ceRNA interactions, the 54979 human interactions were loaded into a mySQL database and when the user selects a given mRNA all...on the fly using PHP interactions with mySQL in a similar fashion as previously described in our publicly available databases such as sarcoma
Astronomical Archive at Tartu Observatory
NASA Astrophysics Data System (ADS)
Annuk, K.
2007-10-01
Archiving astronomical data is important task not only at large observatories but also at small observatories. Here we describe the astronomical archive at Tartu Observatory. The archive consists of old photographic plate images, photographic spectrograms, CCD direct--images and CCD spectroscopic data. The photographic plate digitizing project was started in 2005. An on-line database (based on MySQL) was created. The database includes CCD data as well photographic data. A PHP-MySQL interface was written for access to all data.
The HARPS-N archive through a Cassandra, NoSQL database suite?
NASA Astrophysics Data System (ADS)
Molinari, Emilio; Guerra, Jose; Harutyunyan, Avet; Lodi, Marcello; Martin, Adrian
2016-07-01
The TNG-INAF is developing the science archive for the WEAVE instrument. The underlying architecture of the archive is based on a non relational database, more precisely, on Apache Cassandra cluster, which uses a NoSQL technology. In order to test and validate the use of this architecture, we created a local archive which we populated with all the HARPSN spectra collected at the TNG since the instrument's start of operations in mid-2012, as well as developed tools for the analysis of this data set. The HARPS-N data set is two orders of magnitude smaller than WEAVE, but we want to demonstrate the ability to walk through a complete data set and produce scientific output, as valuable as that produced by an ordinary pipeline, though without accessing directly the FITS files. The analytics is done by Apache Solr and Spark and on a relational PostgreSQL database. As an example, we produce observables like metallicity indexes for the targets in the archive and compare the results with the ones coming from the HARPS-N regular data reduction software. The aim of this experiment is to explore the viability of a high availability cluster and distributed NoSQL database as a platform for complex scientific analytics on a large data set, which will then be ported to the WEAVE Archive System (WAS) which we are developing for the WEAVE multi object, fiber spectrograph.
PACSY, a relational database management system for protein structure and chemical shift analysis.
Lee, Woonghee; Yu, Wookyung; Kim, Suhkmann; Chang, Iksoo; Lee, Weontae; Markley, John L
2012-10-01
PACSY (Protein structure And Chemical Shift NMR spectroscopY) is a relational database management system that integrates information from the Protein Data Bank, the Biological Magnetic Resonance Data Bank, and the Structural Classification of Proteins database. PACSY provides three-dimensional coordinates and chemical shifts of atoms along with derived information such as torsion angles, solvent accessible surface areas, and hydrophobicity scales. PACSY consists of six relational table types linked to one another for coherence by key identification numbers. Database queries are enabled by advanced search functions supported by an RDBMS server such as MySQL or PostgreSQL. PACSY enables users to search for combinations of information from different database sources in support of their research. Two software packages, PACSY Maker for database creation and PACSY Analyzer for database analysis, are available from http://pacsy.nmrfam.wisc.edu.
MyMolDB: a micromolecular database solution with open source and free components.
Xia, Bing; Tai, Zheng-Fu; Gu, Yu-Cheng; Li, Bang-Jing; Ding, Li-Sheng; Zhou, Yan
2011-10-01
To manage chemical structures in small laboratories is one of the important daily tasks. Few solutions are available on the internet, and most of them are closed source applications. The open-source applications typically have limited capability and basic cheminformatics functionalities. In this article, we describe an open-source solution to manage chemicals in research groups based on open source and free components. It has a user-friendly interface with the functions of chemical handling and intensive searching. MyMolDB is a micromolecular database solution that supports exact, substructure, similarity, and combined searching. This solution is mainly implemented using scripting language Python with a web-based interface for compound management and searching. Almost all the searches are in essence done with pure SQL on the database by using the high performance of the database engine. Thus, impressive searching speed has been archived in large data sets for no external Central Processing Unit (CPU) consuming languages were involved in the key procedure of the searching. MyMolDB is an open-source software and can be modified and/or redistributed under GNU General Public License version 3 published by the Free Software Foundation (Free Software Foundation Inc. The GNU General Public License, Version 3, 2007. Available at: http://www.gnu.org/licenses/gpl.html). The software itself can be found at http://code.google.com/p/mymoldb/. Copyright © 2011 Wiley Periodicals, Inc.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Femec, D.A.
This report describes two code-generating tools used to speed design and implementation of relational databases and user interfaces: CREATE-SCHEMA and BUILD-SCREEN. CREATE-SCHEMA produces the SQL commands that actually create and define the database. BUILD-SCREEN takes templates for data entry screens and generates the screen management system routine calls to display the desired screen. Both tools also generate the related FORTRAN declaration statements and precompiled SQL calls. Included with this report is the source code for a number of FORTRAN routines and functions used by the user interface. This code is broadly applicable to a number of different databases.
Reactome graph database: Efficient access to complex pathway data
Korninger, Florian; Viteri, Guilherme; Marin-Garcia, Pablo; Ping, Peipei; Wu, Guanming; Stein, Lincoln; D’Eustachio, Peter
2018-01-01
Reactome is a free, open-source, open-data, curated and peer-reviewed knowledgebase of biomolecular pathways. One of its main priorities is to provide easy and efficient access to its high quality curated data. At present, biological pathway databases typically store their contents in relational databases. This limits access efficiency because there are performance issues associated with queries traversing highly interconnected data. The same data in a graph database can be queried more efficiently. Here we present the rationale behind the adoption of a graph database (Neo4j) as well as the new ContentService (REST API) that provides access to these data. The Neo4j graph database and its query language, Cypher, provide efficient access to the complex Reactome data model, facilitating easy traversal and knowledge discovery. The adoption of this technology greatly improved query efficiency, reducing the average query time by 93%. The web service built on top of the graph database provides programmatic access to Reactome data by object oriented queries, but also supports more complex queries that take advantage of the new underlying graph-based data storage. By adopting graph database technology we are providing a high performance pathway data resource to the community. The Reactome graph database use case shows the power of NoSQL database engines for complex biological data types. PMID:29377902
Reactome graph database: Efficient access to complex pathway data.
Fabregat, Antonio; Korninger, Florian; Viteri, Guilherme; Sidiropoulos, Konstantinos; Marin-Garcia, Pablo; Ping, Peipei; Wu, Guanming; Stein, Lincoln; D'Eustachio, Peter; Hermjakob, Henning
2018-01-01
Reactome is a free, open-source, open-data, curated and peer-reviewed knowledgebase of biomolecular pathways. One of its main priorities is to provide easy and efficient access to its high quality curated data. At present, biological pathway databases typically store their contents in relational databases. This limits access efficiency because there are performance issues associated with queries traversing highly interconnected data. The same data in a graph database can be queried more efficiently. Here we present the rationale behind the adoption of a graph database (Neo4j) as well as the new ContentService (REST API) that provides access to these data. The Neo4j graph database and its query language, Cypher, provide efficient access to the complex Reactome data model, facilitating easy traversal and knowledge discovery. The adoption of this technology greatly improved query efficiency, reducing the average query time by 93%. The web service built on top of the graph database provides programmatic access to Reactome data by object oriented queries, but also supports more complex queries that take advantage of the new underlying graph-based data storage. By adopting graph database technology we are providing a high performance pathway data resource to the community. The Reactome graph database use case shows the power of NoSQL database engines for complex biological data types.
Thriving on Chaos: The Development of a Surgical Information System
Olund, Steven R.
1988-01-01
Hospitals present unique challenges to the computer industry, generating a greater quantity and variety of data than nearly any other enterprise. This is complicated by the fact that a hospital is not one homogenous organization, but a bundle of semi-independent groups with unique data requirements. Therefore hospital information systems must be fast, flexible, reliable, easy to use and maintain, and cost-effective. The Surgical Information System at Rush Presbyterian-St. Luke's Medical Center, Chicago is such system. It uses a Sequent Balance 21000 multi-processor superminicomputer, running industry standard tools such as the Unix operating system, a 4th generation programming language (4GL), and Structured Query Language (SQL) relational database management software. This treatise illustrates a comprehensive yet generic approach which can be applied to almost any clinical situation where access to patient data is required by a variety of medical professionals.
Intelligent search in Big Data
NASA Astrophysics Data System (ADS)
Birialtsev, E.; Bukharaev, N.; Gusenkov, A.
2017-10-01
An approach to data integration, aimed on the ontology-based intelligent search in Big Data, is considered in the case when information objects are represented in the form of relational databases (RDB), structurally marked by their schemes. The source of information for constructing an ontology and, later on, the organization of the search are texts in natural language, treated as semi-structured data. For the RDBs, these are comments on the names of tables and their attributes. Formal definition of RDBs integration model in terms of ontologies is given. Within framework of the model universal RDB representation ontology, oil production subject domain ontology and linguistic thesaurus of subject domain language are built. Technique of automatic SQL queries generation for subject domain specialists is proposed. On the base of it, information system for TATNEFT oil-producing company RDBs was implemented. Exploitation of the system showed good relevance with majority of queries.
NASA Astrophysics Data System (ADS)
Hendikawati, P.; Arifudin, R.; Zahid, M. Z.
2018-03-01
This study aims to design an android Statistics Data Analysis application that can be accessed through mobile devices to making it easier for users to access. The Statistics Data Analysis application includes various topics of basic statistical along with a parametric statistics data analysis application. The output of this application system is parametric statistics data analysis that can be used for students, lecturers, and users who need the results of statistical calculations quickly and easily understood. Android application development is created using Java programming language. The server programming language uses PHP with the Code Igniter framework, and the database used MySQL. The system development methodology used is the Waterfall methodology with the stages of analysis, design, coding, testing, and implementation and system maintenance. This statistical data analysis application is expected to support statistical lecturing activities and make students easier to understand the statistical analysis of mobile devices.
PACSY, a relational database management system for protein structure and chemical shift analysis
Lee, Woonghee; Yu, Wookyung; Kim, Suhkmann; Chang, Iksoo
2012-01-01
PACSY (Protein structure And Chemical Shift NMR spectroscopY) is a relational database management system that integrates information from the Protein Data Bank, the Biological Magnetic Resonance Data Bank, and the Structural Classification of Proteins database. PACSY provides three-dimensional coordinates and chemical shifts of atoms along with derived information such as torsion angles, solvent accessible surface areas, and hydrophobicity scales. PACSY consists of six relational table types linked to one another for coherence by key identification numbers. Database queries are enabled by advanced search functions supported by an RDBMS server such as MySQL or PostgreSQL. PACSY enables users to search for combinations of information from different database sources in support of their research. Two software packages, PACSY Maker for database creation and PACSY Analyzer for database analysis, are available from http://pacsy.nmrfam.wisc.edu. PMID:22903636
An XML-based Generic Tool for Information Retrieval in Solar Databases
NASA Astrophysics Data System (ADS)
Scholl, Isabelle F.; Legay, Eric; Linsolas, Romain
This paper presents the current architecture of the `Solar Web Project' now in its development phase. This tool will provide scientists interested in solar data with a single web-based interface for browsing distributed and heterogeneous catalogs of solar observations. The main goal is to have a generic application that can be easily extended to new sets of data or to new missions with a low level of maintenance. It is developed with Java and XML is used as a powerful configuration language. The server, independent of any database scheme, can communicate with a client (the user interface) and several local or remote archive access systems (such as existing web pages, ftp sites or SQL databases). Archive access systems are externally described in XML files. The user interface is also dynamically generated from an XML file containing the window building rules and a simplified database description. This project is developed at MEDOC (Multi-Experiment Data and Operations Centre), located at the Institut d'Astrophysique Spatiale (Orsay, France). Successful tests have been conducted with other solar archive access systems.
ExportAid: database of RNA elements regulating nuclear RNA export in mammals.
Giulietti, Matteo; Milantoni, Sara Armida; Armeni, Tatiana; Principato, Giovanni; Piva, Francesco
2015-01-15
Regulation of nuclear mRNA export or retention is carried out by RNA elements but the mechanism is not yet well understood. To understand the mRNA export process, it is important to collect all the involved RNA elements and their trans-acting factors. By hand-curated literature screening we collected, in ExportAid database, experimentally assessed data about RNA elements regulating nuclear export or retention of endogenous, heterologous or artificial RNAs in mammalian cells. This database could help to understand the RNA export language and to study the possible export efficiency alterations owing to mutations or polymorphisms. Currently, ExportAid stores 235 and 96 RNA elements, respectively, increasing and decreasing export efficiency, and 98 neutral assessed sequences. Freely accessible without registration at http://www.introni.it/ExportAid/ExportAid.html. Database and web interface are implemented in Perl, MySQL, Apache and JavaScript with all major browsers supported. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
CHASM and SNVBox: toolkit for detecting biologically important single nucleotide mutations in cancer
Carter, Hannah; Diekhans, Mark; Ryan, Michael C.; Karchin, Rachel
2011-01-01
Summary: Thousands of cancer exomes are currently being sequenced, yielding millions of non-synonymous single nucleotide variants (SNVs) of possible relevance to disease etiology. Here, we provide a software toolkit to prioritize SNVs based on their predicted contribution to tumorigenesis. It includes a database of precomputed, predictive features covering all positions in the annotated human exome and can be used either stand-alone or as part of a larger variant discovery pipeline. Availability and Implementation: MySQL database, source code and binaries freely available for academic/government use at http://wiki.chasmsoftware.org, Source in Python and C++. Requires 32 or 64-bit Linux system (tested on Fedora Core 8,10,11 and Ubuntu 10), 2.5*≤ Python <3.0*, MySQL server >5.0, 60 GB available hard disk space (50 MB for software and data files, 40 GB for MySQL database dump when uncompressed), 2 GB of RAM. Contact: karchin@jhu.edu Supplementary Information: Supplementary data are available at Bioinformatics online. PMID:21685053
GEOMAGIA50: An archeointensity database with PHP and MySQL
NASA Astrophysics Data System (ADS)
Korhonen, K.; Donadini, F.; Riisager, P.; Pesonen, L. J.
2008-04-01
The GEOMAGIA50 database stores 3798 archeomagnetic and paleomagnetic intensity determinations dated to the past 50,000 years. It also stores details of the measurement setup for each determination, which are used for ranking the data according to prescribed reliability criteria. The ranking system aims to alleviate the data reliability problem inherent in this kind of data. GEOMAGIA50 is based on two popular open source technologies. The MySQL database management system is used for storing the data, whereas the functionality and user interface are provided by server-side PHP scripts. This technical brief gives a detailed description of GEOMAGIA50 from a technical viewpoint.
Monitoring SLAC High Performance UNIX Computing Systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lettsome, Annette K.; /Bethune-Cookman Coll. /SLAC
2005-12-15
Knowledge of the effectiveness and efficiency of computers is important when working with high performance systems. The monitoring of such systems is advantageous in order to foresee possible misfortunes or system failures. Ganglia is a software system designed for high performance computing systems to retrieve specific monitoring information. An alternative storage facility for Ganglia's collected data is needed since its default storage system, the round-robin database (RRD), struggles with data integrity. The creation of a script-driven MySQL database solves this dilemma. This paper describes the process took in the creation and implementation of the MySQL database for use by Ganglia.more » Comparisons between data storage by both databases are made using gnuplot and Ganglia's real-time graphical user interface.« less
Photo-z-SQL: Integrated, flexible photometric redshift computation in a database
NASA Astrophysics Data System (ADS)
Beck, R.; Dobos, L.; Budavári, T.; Szalay, A. S.; Csabai, I.
2017-04-01
We present a flexible template-based photometric redshift estimation framework, implemented in C#, that can be seamlessly integrated into a SQL database (or DB) server and executed on-demand in SQL. The DB integration eliminates the need to move large photometric datasets outside a database for redshift estimation, and utilizes the computational capabilities of DB hardware. The code is able to perform both maximum likelihood and Bayesian estimation, and can handle inputs of variable photometric filter sets and corresponding broad-band magnitudes. It is possible to take into account the full covariance matrix between filters, and filter zero points can be empirically calibrated using measurements with given redshifts. The list of spectral templates and the prior can be specified flexibly, and the expensive synthetic magnitude computations are done via lazy evaluation, coupled with a caching of results. Parallel execution is fully supported. For large upcoming photometric surveys such as the LSST, the ability to perform in-place photo-z calculation would be a significant advantage. Also, the efficient handling of variable filter sets is a necessity for heterogeneous databases, for example the Hubble Source Catalog, and for cross-match services such as SkyQuery. We illustrate the performance of our code on two reference photo-z estimation testing datasets, and provide an analysis of execution time and scalability with respect to different configurations. The code is available for download at https://github.com/beckrob/Photo-z-SQL.
Development of user-friendly and interactive data collection system for cerebral palsy.
Raharjo, I; Burns, T G; Venugopalan, J; Wang, M D
2016-02-01
Cerebral palsy (CP) is a permanent motor disorder that appears in early age and it requires multiple tests to assess the physical and mental capabilities of the patients. Current medical record data collection systems, e.g., EPIC, employed for CP are very general, difficult to navigate, and prone to errors. The data cannot easily be extracted which limits data analysis on this rich source of information. To overcome these limitations, we designed and prototyped a database with a graphical user interface geared towards clinical research specifically in CP. The platform with MySQL and Java framework is reliable, secure, and can be easily integrated with other programming languages for data analysis such as MATLAB. This database with GUI design is a promising tool for data collection and can be applied in many different fields aside from CP to infer useful information out of the vast amount of data being collected.
Development of user-friendly and interactive data collection system for cerebral palsy
Raharjo, I.; Burns, T. G.; Venugopalan, J.; Wang., M. D.
2016-01-01
Cerebral palsy (CP) is a permanent motor disorder that appears in early age and it requires multiple tests to assess the physical and mental capabilities of the patients. Current medical record data collection systems, e.g., EPIC, employed for CP are very general, difficult to navigate, and prone to errors. The data cannot easily be extracted which limits data analysis on this rich source of information. To overcome these limitations, we designed and prototyped a database with a graphical user interface geared towards clinical research specifically in CP. The platform with MySQL and Java framework is reliable, secure, and can be easily integrated with other programming languages for data analysis such as MATLAB. This database with GUI design is a promising tool for data collection and can be applied in many different fields aside from CP to infer useful information out of the vast amount of data being collected. PMID:28133638
Astronomical databases of Nikolaev Observatory
NASA Astrophysics Data System (ADS)
Protsyuk, Y.; Mazhaev, A.
2008-07-01
Several astronomical databases were created at Nikolaev Observatory during the last years. The databases are built by using MySQL search engine and PHP scripts. They are available on NAO web-site http://www.mao.nikolaev.ua.
Domain fusion analysis by applying relational algebra to protein sequence and domain databases.
Truong, Kevin; Ikura, Mitsuhiko
2003-05-06
Domain fusion analysis is a useful method to predict functionally linked proteins that may be involved in direct protein-protein interactions or in the same metabolic or signaling pathway. As separate domain databases like BLOCKS, PROSITE, Pfam, SMART, PRINTS-S, ProDom, TIGRFAMs, and amalgamated domain databases like InterPro continue to grow in size and quality, a computational method to perform domain fusion analysis that leverages on these efforts will become increasingly powerful. This paper proposes a computational method employing relational algebra to find domain fusions in protein sequence databases. The feasibility of this method was illustrated on the SWISS-PROT+TrEMBL sequence database using domain predictions from the Pfam HMM (hidden Markov model) database. We identified 235 and 189 putative functionally linked protein partners in H. sapiens and S. cerevisiae, respectively. From scientific literature, we were able to confirm many of these functional linkages, while the remainder offer testable experimental hypothesis. Results can be viewed at http://calcium.uhnres.utoronto.ca/pi. As the analysis can be computed quickly on any relational database that supports standard SQL (structured query language), it can be dynamically updated along with the sequence and domain databases, thereby improving the quality of predictions over time.
Search extension transforms Wiki into a relational system: a case for flavonoid metabolite database.
Arita, Masanori; Suwa, Kazuhiro
2008-09-17
In computer science, database systems are based on the relational model founded by Edgar Codd in 1970. On the other hand, in the area of biology the word 'database' often refers to loosely formatted, very large text files. Although such bio-databases may describe conflicts or ambiguities (e.g. a protein pair do and do not interact, or unknown parameters) in a positive sense, the flexibility of the data format sacrifices a systematic query mechanism equivalent to the widely used SQL. To overcome this disadvantage, we propose embeddable string-search commands on a Wiki-based system and designed a half-formatted database. As proof of principle, a database of flavonoid with 6902 molecular structures from over 1687 plant species was implemented on MediaWiki, the background system of Wikipedia. Registered users can describe any information in an arbitrary format. Structured part is subject to text-string searches to realize relational operations. The system was written in PHP language as the extension of MediaWiki. All modifications are open-source and publicly available. This scheme benefits from both the free-formatted Wiki style and the concise and structured relational-database style. MediaWiki supports multi-user environments for document management, and the cost for database maintenance is alleviated.
Search extension transforms Wiki into a relational system: A case for flavonoid metabolite database
Arita, Masanori; Suwa, Kazuhiro
2008-01-01
Background In computer science, database systems are based on the relational model founded by Edgar Codd in 1970. On the other hand, in the area of biology the word 'database' often refers to loosely formatted, very large text files. Although such bio-databases may describe conflicts or ambiguities (e.g. a protein pair do and do not interact, or unknown parameters) in a positive sense, the flexibility of the data format sacrifices a systematic query mechanism equivalent to the widely used SQL. Results To overcome this disadvantage, we propose embeddable string-search commands on a Wiki-based system and designed a half-formatted database. As proof of principle, a database of flavonoid with 6902 molecular structures from over 1687 plant species was implemented on MediaWiki, the background system of Wikipedia. Registered users can describe any information in an arbitrary format. Structured part is subject to text-string searches to realize relational operations. The system was written in PHP language as the extension of MediaWiki. All modifications are open-source and publicly available. Conclusion This scheme benefits from both the free-formatted Wiki style and the concise and structured relational-database style. MediaWiki supports multi-user environments for document management, and the cost for database maintenance is alleviated. PMID:18822113
Zhu, Xinjie; Zhang, Qiang; Ho, Eric Dun; Yu, Ken Hung-On; Liu, Chris; Huang, Tim H; Cheng, Alfred Sze-Lok; Kao, Ben; Lo, Eric; Yip, Kevin Y
2017-09-22
A genomic signal track is a set of genomic intervals associated with values of various types, such as measurements from high-throughput experiments. Analysis of signal tracks requires complex computational methods, which often make the analysts focus too much on the detailed computational steps rather than on their biological questions. Here we propose Signal Track Query Language (STQL) for simple analysis of signal tracks. It is a Structured Query Language (SQL)-like declarative language, which means one only specifies what computations need to be done but not how these computations are to be carried out. STQL provides a rich set of constructs for manipulating genomic intervals and their values. To run STQL queries, we have developed the Signal Track Analytical Research Tool (START, http://yiplab.cse.cuhk.edu.hk/start/ ), a system that includes a Web-based user interface and a back-end execution system. The user interface helps users select data from our database of around 10,000 commonly-used public signal tracks, manage their own tracks, and construct, store and share STQL queries. The back-end system automatically translates STQL queries into optimized low-level programs and runs them on a computer cluster in parallel. We use STQL to perform 14 representative analytical tasks. By repeating these analyses using bedtools, Galaxy and custom Python scripts, we show that the STQL solution is usually the simplest, and the parallel execution achieves significant speed-up with large data files. Finally, we describe how a biologist with minimal formal training in computer programming self-learned STQL to analyze DNA methylation data we produced from 60 pairs of hepatocellular carcinoma (HCC) samples. Overall, STQL and START provide a generic way for analyzing a large number of genomic signal tracks in parallel easily.
Code of Federal Regulations, 2011 CFR
2011-10-01
... Microsoft Open Database Connectivity (ODBC) standard. ODBC is a Windows technology that allows a database software package to import data from a database created using a different software package. We currently...-compatible format. All databases must be supported with adequate documentation on data attributes, SQL...
Integrating Radar Image Data with Google Maps
NASA Technical Reports Server (NTRS)
Chapman, Bruce D.; Gibas, Sarah
2010-01-01
A public Web site has been developed as a method for displaying the multitude of radar imagery collected by NASA s Airborne Synthetic Aperture Radar (AIRSAR) instrument during its 16-year mission. Utilizing NASA s internal AIRSAR site, the new Web site features more sophisticated visualization tools that enable the general public to have access to these images. The site was originally maintained at NASA on six computers: one that held the Oracle database, two that took care of the software for the interactive map, and three that were for the Web site itself. Several tasks were involved in moving this complicated setup to just one computer. First, the AIRSAR database was migrated from Oracle to MySQL. Then the back-end of the AIRSAR Web site was updated in order to access the MySQL database. To do this, a few of the scripts needed to be modified; specifically three Perl scripts that query that database. The database connections were then updated from Oracle to MySQL, numerous syntax errors were corrected, and a query was implemented that replaced one of the stored Oracle procedures. Lastly, the interactive map was designed, implemented, and tested so that users could easily browse and access the radar imagery through the Google Maps interface.
Wiley, Laura K.; Sivley, R. Michael; Bush, William S.
2013-01-01
Efficient storage and retrieval of genomic annotations based on range intervals is necessary, given the amount of data produced by next-generation sequencing studies. The indexing strategies of relational database systems (such as MySQL) greatly inhibit their use in genomic annotation tasks. This has led to the development of stand-alone applications that are dependent on flat-file libraries. In this work, we introduce MyNCList, an implementation of the NCList data structure within a MySQL database. MyNCList enables the storage, update and rapid retrieval of genomic annotations from the convenience of a relational database system. Range-based annotations of 1 million variants are retrieved in under a minute, making this approach feasible for whole-genome annotation tasks. Database URL: https://github.com/bushlab/mynclist PMID:23894185
Wiley, Laura K; Sivley, R Michael; Bush, William S
2013-01-01
Efficient storage and retrieval of genomic annotations based on range intervals is necessary, given the amount of data produced by next-generation sequencing studies. The indexing strategies of relational database systems (such as MySQL) greatly inhibit their use in genomic annotation tasks. This has led to the development of stand-alone applications that are dependent on flat-file libraries. In this work, we introduce MyNCList, an implementation of the NCList data structure within a MySQL database. MyNCList enables the storage, update and rapid retrieval of genomic annotations from the convenience of a relational database system. Range-based annotations of 1 million variants are retrieved in under a minute, making this approach feasible for whole-genome annotation tasks. Database URL: https://github.com/bushlab/mynclist.
Salary Management System for Small and Medium-sized Enterprises
NASA Astrophysics Data System (ADS)
Hao, Zhang; Guangli, Xu; Yuhuan, Zhang; Yilong, Lei
Small and Medium-sized Enterprises (SMEs) in the process of wage entry, calculation, the total number are needed to be done manually in the past, the data volume is quite large, processing speed is low, and it is easy to make error, which is resulting in low efficiency. The main purpose of writing this paper is to present the basis of salary management system, establish a scientific database, the computer payroll system, using the computer instead of a lot of past manual work in order to reduce duplication of staff labor, it will improve working efficiency.This system combines the actual needs of SMEs, through in-depth study and practice of the C/S mode, PowerBuilder10.0 development tools, databases and SQL language, Completed a payroll system needs analysis, database design, application design and development work. Wages, departments, units and personnel database file are included in this system, and have data management, department management, personnel management and other functions, through the control and management of the database query, add, delete, modify, and other functions can be realized. This system is reasonable design, a more complete function, stable operation has been tested to meet the basic needs of the work.
NASA Astrophysics Data System (ADS)
McEnery, J. A.; Jitkajornwanich, K.
2012-12-01
This presentation will describe the methodology and overall system development by which a benchmark dataset of precipitation information has been used to characterize the depth-area-duration relations in heavy rain storms occurring over regions of Texas. Over the past two years project investigators along with the National Weather Service (NWS) West Gulf River Forecast Center (WGRFC) have developed and operated a gateway data system to ingest, store, and disseminate NWS multi-sensor precipitation estimates (MPE). As a pilot project of the Integrated Water Resources Science and Services (IWRSS) initiative, this testbed uses a Standard Query Language (SQL) server to maintain a full archive of current and historic MPE values within the WGRFC service area. These time series values are made available for public access as web services in the standard WaterML format. Having this volume of information maintained in a comprehensive database now allows the use of relational analysis capabilities within SQL to leverage these multi-sensor precipitation values and produce a valuable derivative product. The area of focus for this study is North Texas and will utilize values that originated from the West Gulf River Forecast Center (WGRFC); one of three River Forecast Centers currently represented in the holdings of this data system. Over the past two decades, NEXRAD radar has dramatically improved the ability to record rainfall. The resulting hourly MPE values, distributed over an approximate 4 km by 4 km grid, are considered by the NWS to be the "best estimate" of rainfall. The data server provides an accepted standard interface for internet access to the largest time-series dataset of NEXRAD based MPE values ever assembled. An automated script has been written to search and extract storms over the 18 year period of record from the contents of this massive historical precipitation database. Not only can it extract site-specific storms, but also duration-specific storms and storms separated by user defined inter-event periods. A separate storm database has been created to store the selected output. By storing output within tables in a separate database, we can make use of powerful SQL capabilities to perform flexible pattern analysis. Previous efforts have made use of historic data from limited clusters of irregularly spaced physical gauges. Spatial extent of the observational network has been a limiting factor. The relatively dense distribution of MPE provides a virtual mesh of observations stretched over the landscape. This work combines a unique hydrologic data resource with programming and database analysis to characterize storm depth-area-duration relationships.
Brave New World: Data Intensive Science with SDSS and the VO
NASA Astrophysics Data System (ADS)
Thakar, A. R.; Szalay, A. S.; O'Mullane, W.; Nieto-Santisteban, M.; Budavari, T.; Li, N.; Carliles, S.; Haridas, V.; Malik, T.; Gray, J.
2004-12-01
With the advent of digital archives and the VO, astronomy is quickly changing from a data-hungry to a data-intensive science. Local and specialized access to data will remain the most direct and efficient way to get data out of individual archives, especially if you know what you are looking for. However, the enormous sizes of the upcoming archives will preclude this type of access for most institutions, and will not allow researchers to tap the vast potential for discovery in cross-matching and comparing data between different archives. The VO makes this type of interoperability and distributed data access possible by adopting industry standards for data access (SQL) and data interchange (SOAP/XML) with platform independence (Web services). As a sneak preview of this brave new world where astronomers may need to become SQL warriors, we present a look at VO-enabled access to catalog data in the SDSS Catalog Archive Server (CAS): CasJobs - a workbench environment that allows arbitrarily complex SQL queries and your own personal database (MyDB) that you can share with collaborators; OpenSkyQuery - an IVOA (International Virtual Observatory Alliance) compliant federation of multiple archives (OpenSkyNodes) that currently links nearly 20 catalogs and allows cross-match queries (in ADQL - Astronomical Data Query Language) between them; Spectrum and Filter Profile Web services that provide access to an open database of spectra (registered users may add their own spectra); and VO-enabled Mirage - a Java visualizatiion tool developed at Bell Labs and enhanced at JHU that allows side-by-side comparison of SDSS catalog and FITS image data. Anticipating the next generation of Petabyte archives like LSST by the end of the decade, we are developing a parallel cross-match engine for all-sky cross-matches between large surveys, along with a 100-Terabyte data intensive science laboratory with high-speed parallel data access.
2MASS Catalog Server Kit Version 2.1
NASA Astrophysics Data System (ADS)
Yamauchi, C.
2013-10-01
The 2MASS Catalog Server Kit is open source software for use in easily constructing a high performance search server for important astronomical catalogs. This software utilizes the open source RDBMS PostgreSQL, therefore, any users can setup the database on their local computers by following step-by-step installation guide. The kit provides highly optimized stored functions for positional searchs similar to SDSS SkyServer. Together with these, the powerful SQL environment of PostgreSQL will meet various user's demands. We released 2MASS Catalog Server Kit version 2.1 in 2012 May, which supports the latest WISE All-Sky catalog (563,921,584 rows) and 9 major all-sky catalogs. Local databases are often indispensable for observatories with unstable or narrow-band networks or severe use, such as retrieving large numbers of records within a small period of time. This software is the best for such purposes, and increasing supported catalogs and improvements of version 2.1 can cover a wider range of applications including advanced calibration system, scientific studies using complicated SQL queries, etc. Official page: http://www.ir.isas.jaxa.jp/~cyamauch/2masskit/
ExplorEnz: the primary source of the IUBMB enzyme list
McDonald, Andrew G.; Boyce, Sinéad; Tipton, Keith F.
2009-01-01
ExplorEnz is the MySQL database that is used for the curation and dissemination of the International Union of Biochemistry and Molecular Biology (IUBMB) Enzyme Nomenclature. A simple web-based query interface is provided, along with an advanced search engine for more complex Boolean queries. The WWW front-end is accessible at http://www.enzyme-database.org, from where downloads of the database as SQL and XML are also available. An associated form-based curatorial application has been developed to facilitate the curation of enzyme data as well as the internal and public review processes that occur before an enzyme entry is made official. Suggestions for new enzyme entries, or modifications to existing ones, can be made using the forms provided at http://www.enzyme-database.org/forms.php. PMID:18776214
Array Databases: Agile Analytics (not just) for the Earth Sciences
NASA Astrophysics Data System (ADS)
Baumann, P.; Misev, D.
2015-12-01
Gridded data, such as images, image timeseries, and climate datacubes, today are managed separately from the metadata, and with different, restricted retrieval capabilities. While databases are good at metadata modelled in tables, XML hierarchies, or RDF graphs, they traditionally do not support multi-dimensional arrays.This gap is being closed by Array Databases, pioneered by the scalable rasdaman ("raster data manager") array engine. Its declarative query language, rasql, extends SQL with array operators which are optimized and parallelized on server side. Installations can easily be mashed up securely, thereby enabling large-scale location-transparent query processing in federations. Domain experts value the integration with their commonly used tools leading to a quick learning curve.Earth, Space, and Life sciences, but also Social sciences as well as business have massive amounts of data and complex analysis challenges that are answered by rasdaman. As of today, rasdaman is mature and in operational use on hundreds of Terabytes of timeseries datacubes, with transparent query distribution across more than 1,000 nodes. Additionally, its concepts have shaped international Big Data standards in the field, including the forthcoming array extension to ISO SQL, many of which are supported by both open-source and commercial systems meantime. In the geo field, rasdaman is reference implementation for the Open Geospatial Consortium (OGC) Big Data standard, WCS, now also under adoption by ISO. Further, rasdaman is in the final stage of OSGeo incubation.In this contribution we present array queries a la rasdaman, describe the architecture and novel optimization and parallelization techniques introduced in 2015, and put this in context of the intercontinental EarthServer initiative which utilizes rasdaman for enabling agile analytics on Petascale datacubes.
Monitoring of IaaS and scientific applications on the Cloud using the Elasticsearch ecosystem
NASA Astrophysics Data System (ADS)
Bagnasco, S.; Berzano, D.; Guarise, A.; Lusso, S.; Masera, M.; Vallero, S.
2015-05-01
The private Cloud at the Torino INFN computing centre offers IaaS services to different scientific computing applications. The infrastructure is managed with the OpenNebula cloud controller. The main stakeholders of the facility are a grid Tier-2 site for the ALICE collaboration at LHC, an interactive analysis facility for the same experiment and a grid Tier-2 site for the BES-III collaboration, plus an increasing number of other small tenants. Besides keeping track of the usage, the automation of dynamic allocation of resources to tenants requires detailed monitoring and accounting of the resource usage. As a first investigation towards this, we set up a monitoring system to inspect the site activities both in terms of IaaS and applications running on the hosted virtual instances. For this purpose we used the Elasticsearch, Logstash and Kibana stack. In the current implementation, the heterogeneous accounting information is fed to different MySQL databases and sent to Elasticsearch via a custom Logstash plugin. For the IaaS metering, we developed sensors for the OpenNebula API. The IaaS level information gathered through the API is sent to the MySQL database through an ad-hoc developed RESTful web service, which is also used for other accounting purposes. Concerning the application level, we used the Root plugin TProofMonSenderSQL to collect accounting data from the interactive analysis facility. The BES-III virtual instances used to be monitored with Zabbix, as a proof of concept we also retrieve the information contained in the Zabbix database. Each of these three cases is indexed separately in Elasticsearch. We are now starting to consider dismissing the intermediate level provided by the SQL database and evaluating a NoSQL option as a unique central database for all the monitoring information. We setup a set of Kibana dashboards with pre-defined queries in order to monitor the relevant information in each case. In this way we have achieved a uniform monitoring interface for both the IaaS and the scientific applications, mostly leveraging off-the-shelf tools.
Efficient data management tools for the heterogeneous big data warehouse
NASA Astrophysics Data System (ADS)
Alekseev, A. A.; Osipova, V. V.; Ivanov, M. A.; Klimentov, A.; Grigorieva, N. V.; Nalamwar, H. S.
2016-09-01
The traditional RDBMS has been consistent for the normalized data structures. RDBMS served well for decades, but the technology is not optimal for data processing and analysis in data intensive fields like social networks, oil-gas industry, experiments at the Large Hadron Collider, etc. Several challenges have been raised recently on the scalability of data warehouse like workload against the transactional schema, in particular for the analysis of archived data or the aggregation of data for summary and accounting purposes. The paper evaluates new database technologies like HBase, Cassandra, and MongoDB commonly referred as NoSQL databases for handling messy, varied and large amount of data. The evaluation depends upon the performance, throughput and scalability of the above technologies for several scientific and industrial use-cases. This paper outlines the technologies and architectures needed for processing Big Data, as well as the description of the back-end application that implements data migration from RDBMS to NoSQL data warehouse, NoSQL database organization and how it could be useful for further data analytics.
An Array Library for Microsoft SQL Server with Astrophysical Applications
NASA Astrophysics Data System (ADS)
Dobos, L.; Szalay, A. S.; Blakeley, J.; Falck, B.; Budavári, T.; Csabai, I.
2012-09-01
Today's scientific simulations produce output on the 10-100 TB scale. This unprecedented amount of data requires data handling techniques that are beyond what is used for ordinary files. Relational database systems have been successfully used to store and process scientific data, but the new requirements constantly generate new challenges. Moving terabytes of data among servers on a timely basis is a tough problem, even with the newest high-throughput networks. Thus, moving the computations as close to the data as possible and minimizing the client-server overhead are absolutely necessary. At least data subsetting and preprocessing have to be done inside the server process. Out of the box commercial database systems perform very well in scientific applications from the prospective of data storage optimization, data retrieval, and memory management but lack basic functionality like handling scientific data structures or enabling advanced math inside the database server. The most important gap in Microsoft SQL Server is the lack of a native array data type. Fortunately, the technology exists to extend the database server with custom-written code that enables us to address these problems. We present the prototype of a custom-built extension to Microsoft SQL Server that adds array handling functionality to the database system. With our Array Library, fix-sized arrays of all basic numeric data types can be created and manipulated efficiently. Also, the library is designed to be able to be seamlessly integrated with the most common math libraries, such as BLAS, LAPACK, FFTW, etc. With the help of these libraries, complex operations, such as matrix inversions or Fourier transformations, can be done on-the-fly, from SQL code, inside the database server process. We are currently testing the prototype with two different scientific data sets: The Indra cosmological simulation will use it to store particle and density data from N-body simulations, and the Milky Way Laboratory project will use it to store galaxy simulation data.
NASA Technical Reports Server (NTRS)
Baldwin, John; Zendejas, Silvino; Gutheinz, Sandy; Borden, Chester; Wang, Yeou-Fang
2009-01-01
Mission and Assets Database (MADB) Version 1.0 is an SQL database system with a Web user interface to centralize information. The database stores flight project support resource requirements, view periods, antenna information, schedule, and forecast results for use in mid-range and long-term planning of Deep Space Network (DSN) assets.
Protecting Database Centric Web Services against SQL/XPath Injection Attacks
NASA Astrophysics Data System (ADS)
Laranjeiro, Nuno; Vieira, Marco; Madeira, Henrique
Web services represent a powerful interface for back-end database systems and are increasingly being used in business critical applications. However, field studies show that a large number of web services are deployed with security flaws (e.g., having SQL Injection vulnerabilities). Although several techniques for the identification of security vulnerabilities have been proposed, developing non-vulnerable web services is still a difficult task. In fact, security-related concerns are hard to apply as they involve adding complexity to already complex code. This paper proposes an approach to secure web services against SQL and XPath Injection attacks, by transparently detecting and aborting service invocations that try to take advantage of potential vulnerabilities. Our mechanism was applied to secure several web services specified by the TPC-App benchmark, showing to be 100% effective in stopping attacks, non-intrusive and very easy to use.
Web application for detailed real-time database transaction monitoring for CMS condition data
NASA Astrophysics Data System (ADS)
de Gruttola, Michele; Di Guida, Salvatore; Innocente, Vincenzo; Pierro, Antonio
2012-12-01
In the upcoming LHC era, database have become an essential part for the experiments collecting data from LHC, in order to safely store, and consistently retrieve, a wide amount of data, which are produced by different sources. In the CMS experiment at CERN, all this information is stored in ORACLE databases, allocated in several servers, both inside and outside the CERN network. In this scenario, the task of monitoring different databases is a crucial database administration issue, since different information may be required depending on different users' tasks such as data transfer, inspection, planning and security issues. We present here a web application based on Python web framework and Python modules for data mining purposes. To customize the GUI we record traces of user interactions that are used to build use case models. In addition the application detects errors in database transactions (for example identify any mistake made by user, application failure, unexpected network shutdown or Structured Query Language (SQL) statement error) and provides warning messages from the different users' perspectives. Finally, in order to fullfill the requirements of the CMS experiment community, and to meet the new development in many Web client tools, our application was further developed, and new features were deployed.
EDCs DataBank: 3D-Structure database of endocrine disrupting chemicals.
Montes-Grajales, Diana; Olivero-Verbel, Jesus
2015-01-02
Endocrine disrupting chemicals (EDCs) are a group of compounds that affect the endocrine system, frequently found in everyday products and epidemiologically associated with several diseases. The purpose of this work was to develop EDCs DataBank, the only database of EDCs with three-dimensional structures. This database was built on MySQL using the EU list of potential endocrine disruptors and TEDX list. It contains the three-dimensional structures available on PubChem, as well as a wide variety of information from different databases and text mining tools, useful for almost any kind of research regarding EDCs. The web platform was developed employing HTML, CSS and PHP languages, with dynamic contents in a graphic environment, facilitating information analysis. Currently EDCs DataBank has 615 molecules, including pesticides, natural and industrial products, cosmetics, drugs and food additives, among other low molecular weight xenobiotics. Therefore, this database can be used to study the toxicological effects of these molecules, or to develop pharmaceuticals targeting hormone receptors, through docking studies, high-throughput virtual screening and ligand-protein interaction analysis. EDCs DataBank is totally user-friendly and the 3D-structures of the molecules can be downloaded in several formats. This database is freely available at http://edcs.unicartagena.edu.co. Copyright © 2014. Published by Elsevier Ireland Ltd.
Enhancing SAMOS Data Access in DOMS via a Neo4j Property Graph Database.
NASA Astrophysics Data System (ADS)
Stallard, A. P.; Smith, S. R.; Elya, J. L.
2016-12-01
The Shipboard Automated Meteorological and Oceanographic System (SAMOS) initiative provides routine access to high-quality marine meteorological and near-surface oceanographic observations from research vessels. The Distributed Oceanographic Match-Up Service (DOMS) under development is a centralized service that allows researchers to easily match in situ and satellite oceanographic data from distributed sources to facilitate satellite calibration, validation, and retrieval algorithm development. The service currently uses Apache Solr as a backend search engine on each node in the distributed network. While Solr is a high-performance solution that facilitates creation and maintenance of indexed data, it is limited in the sense that its schema is fixed. The property graph model escapes this limitation by creating relationships between data objects. The authors will present the development of the SAMOS Neo4j property graph database including new search possibilities that take advantage of the property graph model, performance comparisons with Apache Solr, and a vision for graph databases as a storage tool for oceanographic data. The integration of the SAMOS Neo4j graph into DOMS will also be described. Currently, Neo4j contains spatial and temporal records from SAMOS which are modeled into a time tree and r-tree using Graph Aware and Spatial plugin tools for Neo4j. These extensions provide callable Java procedures within CYPHER (Neo4j's query language) that generate in-graph structures. Once generated, these structures can be queried using procedures from these libraries, or directly via CYPHER statements. Neo4j excels at performing relationship and path-based queries, which challenge relational-SQL databases because they require memory intensive joins due to the limitation of their design. Consider a user who wants to find records over several years, but only for specific months. If a traditional database only stores timestamps, this type of query would be complex and likely prohibitively slow. Using the time tree model, one can specify a path from the root to the data which restricts resolutions to certain timeframes (e.g., months). This query can be executed without joins, unions, or other compute-intensive operations, putting Neo4j at a computational advantage to the SQL database alternative.
NASA Technical Reports Server (NTRS)
Alfaro, Victor O.; Casey, Nancy J.
2005-01-01
SQL-RAMS (where "SQL" signifies Structured Query Language and "RAMS" signifies Rocketdyne Automated Management System) is a successor to the legacy version of RAMS -- a computer program used to manage all work, nonconformance, corrective action, and configuration management on rocket engines and ground support equipment at Stennis Space Center. The legacy version resided in the File-Maker Pro software system and was constructed in modules that could act as standalone programs. There was little or no integration among modules. Because of limitations on file-management capabilities in FileMaker Pro, and because of difficulty of integration of FileMaker Pro with other software systems for exchange of data using such industry standards as SQL, the legacy version of RAMS proved to be limited, and working to circumvent its limitations too time-consuming. In contrast, SQL-RAMS is an integrated SQL-server-based program that supports all data-exchange software industry standards. Whereas in the legacy version, it was necessary to access individual modules to gain insight into a particular workstatus document, SQL-RAMS provides access through a single-screen presentation of core modules. In addition, SQL-RAMS enables rapid and efficient filtering of displayed statuses by predefined categories and test numbers. SQL-RAMS is rich in functionality and encompasses significant improvements over the legacy system. It provides users the ability to perform many tasks, which in the past required administrator intervention. Additionally, many of the design limitations have been corrected, allowing for a robust application that is user centric.
NASA Technical Reports Server (NTRS)
Alfaro, Victor O.; Casey, Nancy J.
2005-01-01
SQL-RAMS (where "SQL" signifies Structured Query Language and "RAMS" signifies Rocketdyne Automated Management System) is a successor to the legacy version of RAMS a computer program used to manage all work, nonconformance, corrective action, and configuration management on rocket engines and ground support equipment at Stennis Space Center. The legacy version resided in the FileMaker Pro software system and was constructed in modules that could act as stand-alone programs. There was little or no integration among modules. Because of limitations on file-management capabilities in FileMaker Pro, and because of difficulty of integration of FileMaker Pro with other software systems for exchange of data using such industry standards as SQL, the legacy version of RAMS proved to be limited, and working to circumvent its limitations too time-consuming. In contrast, SQL-RAMS is an integrated SQL-server-based program that supports all data-exchange software industry standards. Whereas in the legacy version, it was necessary to access individual modules to gain insight to a particular work-status documents, SQL-RAMS provides access through a single-screen presentation of core modules. In addition, SQL-RAMS enable rapid and efficient filtering of displayed statuses by predefined categories and test numbers. SQL-RAMS is rich in functionality and encompasses significant improvements over the legacy system. It provides users the ability to perform many tasks which in the past required administrator intervention. Additionally many of the design limitations have been corrected allowing for a robust application that is user centric.
MOSAIC: An organic geochemical and sedimentological database for marine surface sediments
NASA Astrophysics Data System (ADS)
Tavagna, Maria Luisa; Usman, Muhammed; De Avelar, Silvania; Eglinton, Timothy
2015-04-01
Modern ocean sediments serve as the interface between the biosphere and the geosphere, play a key role in biogeochemical cycles and provide a window on how contemporary processes are written into the sedimentary record. Research over past decades has resulted in a wealth of information on the content and composition of organic matter in marine sediments, with ever-more sophisticated techniques continuing to yield information of greater detail and as an accelerating pace. However, there has been no attempt to synthesize this wealth of information. We are establishing a new database that incorporates information relevant to local, regional and global-scale assessment of the content, source and fate of organic materials accumulating in contemporary marine sediments. In the MOSAIC (Modern Ocean Sediment Archive and Inventory of Carbon) database, particular emphasis is placed on molecular and isotopic information, coupled with relevant contextual information (e.g., sedimentological properties) relevant to elucidating factors that influence the efficiency and nature of organic matter burial. The main features of MOSAIC include: (i) Emphasis on continental margin sediments as major loci of carbon burial, and as the interface between terrestrial and oceanic realms; (ii) Bulk to molecular-level organic geochemical properties and parameters, including concentration and isotopic compositions; (iii) Inclusion of extensive contextual data regarding the depositional setting, in particular with respect to sedimentological and redox characteristics. The ultimate goal is to create an open-access instrument, available on the web, to be utilized for research and education by the international community who can both contribute to, and interrogate the database. The submission will be accomplished by means of a pre-configured table available on the MOSAIC webpage. The information on the filled tables will be checked and eventually imported, via the Structural Query Language (SQL), into MOSAIC. MOSAIC is programmed with PostgreSQL, an open-source database management system. In order to locate geographically the data, each element/datum is associated to a latitude, longitude and depth, facilitating creation of a geospatial database which can be easily interfaced to a Geographic Information System (GIS). In order to make the database broadly accessible, a HTML-PHP language-based website will ultimately be created and linked to the database. Consulting the website will allow for both data visualization as well as export of data in txt format for utilization with common software solutions (e.g. ODV, Excel, Matlab, Python, Word, PPT, Illustrator…). In this very early stage, MOSAIC presently contains approximately 10000 analyses conducted on more than 1800 samples which were collected from over 1600 different geographical locations around the world. Through participation of the international research community, MOSAIC will rapidly develop into a rich archive and versatile tool for investigation of distribution and composition of organic matter accumulating in seafloor sediments. The present contribution will outline the structure of MOSAIC, provide examples of data output, and solicit feedback on desirable features to be included in the database and associated software tools.
System Enhancements for Mechanical Inspection Processes
NASA Technical Reports Server (NTRS)
Hawkins, Myers IV
2011-01-01
Quality inspection of parts is a major component to any project that requires hardware implementation. Keeping track of all of the inspection jobs is essential to having a smooth running process. By using HTML, the programming language ColdFusion, and the MySQL database, I created a web-based job management system for the 170 Mechanical Inspection Group that will replace the Microsoft Access based management system. This will improve the ways inspectors and the people awaiting inspection view and keep track of hardware as it is in the inspection process. In the end, the management system should be able to insert jobs into a queue, place jobs in and out of a bonded state, pre-release bonded jobs, and close out inspection jobs.
DOE Office of Scientific and Technical Information (OSTI.GOV)
This software is a plug-in that interfaces between the Phoenix Integration's Model Center and the Base SAS 9.2 applications. The end use of the plug-in is to link input and output data that resides in SAS tables or MS SQL to and from "legacy" software programs without recoding. The potential end users are users who need to run legacy code and want data stored in a SQL database.
A Web-based Tool for SDSS and 2MASS Database Searches
NASA Astrophysics Data System (ADS)
Hendrickson, M. A.; Uomoto, A.; Golimowski, D. A.
We have developed a web site using HTML, Php, Python, and MySQL that extracts, processes, and displays data from the Sloan Digital Sky Survey (SDSS) and the Two-Micron All-Sky Survey (2MASS). The goal is to locate brown dwarf candidates in the SDSS database by looking at color cuts; however, this site could also be useful for targeted searches of other databases as well. MySQL databases are created from broad searches of SDSS and 2MASS data. Broad queries on the SDSS and 2MASS database servers are run weekly so that observers have the most up-to-date information from which to select candidates for observation. Observers can look at detailed information about specific objects including finding charts, images, and available spectra. In addition, updates from previous observations can be added by any collaborators; this format makes observational collaboration simple. Observers can also restrict the database search, just before or during an observing run, to select objects of special interest.
NASA Astrophysics Data System (ADS)
Bemm, Stefan; Sandmeier, Christine; Wilde, Martina; Jaeger, Daniel; Schwindt, Daniel; Terhorst, Birgit
2014-05-01
The area of the Swabian-Franconian cuesta landscape (Southern Germany) is highly prone to landslides. This was apparent in the late spring of 2013, when numerous landslides occurred as a consequence of heavy and long-lasting rainfalls. The specific climatic situation caused numerous damages with serious impact on settlements and infrastructure. Knowledge on spatial distribution of landslides, processes and characteristics are important to evaluate the potential risk that can occur from mass movements in those areas. In the frame of two projects about 400 landslides were mapped and detailed data sets were compiled during years 2011 to 2014 at the Franconian Alb. The studies are related to the project "Slope stability and hazard zones in the northern Bavarian cuesta" (DFG, German Research Foundation) as well as to the LfU (The Bavarian Environment Agency) within the project "Georisks and climate change - hazard indication map Jura". The central goal of the present study is to create a spatial database for landslides. The database should contain all fundamental parameters to characterize the mass movements and should provide the potential for secure data storage and data management, as well as statistical evaluations. The spatial database was created with PostgreSQL, an object-relational database management system and PostGIS, a spatial database extender for PostgreSQL, which provides the possibility to store spatial and geographic objects and to connect to several GIS applications, like GRASS GIS, SAGA GIS, QGIS and GDAL, a geospatial library (Obe et al. 2011). Database access for querying, importing, and exporting spatial and non-spatial data is ensured by using GUI or non-GUI connections. The database allows the use of procedural languages for writing advanced functions in the R, Python or Perl programming languages. It is possible to work directly with the (spatial) data entirety of the database in R. The inventory of the database includes (amongst others), informations on location, landslide types and causes, geomorphological positions, geometries, hazards and damages, as well as assessments related to the activity of landslides. Furthermore, there are stored spatial objects, which represent the components of a landslide, in particular the scarps and the accumulation areas. Besides, waterways, map sheets, contour lines, detailed infrastructure data, digital elevation models, aspect and slope data are included. Examples of spatial queries to the database are intersections of raster and vector data for calculating values for slope gradients or aspects of landslide areas and for creating multiple, overlaying sections for the comparison of slopes, as well as distances to the infrastructure or to the next receiving drainage. Furthermore, getting informations on landslide magnitudes, distribution and clustering, as well as potential correlations concerning geomorphological or geological conditions. The data management concept in this study can be implemented for any academic, public or private use, because it is independent from any obligatory licenses. The created spatial database offers a platform for interdisciplinary research and socio-economic questions, as well as for landslide susceptibility and hazard indication mapping. Obe, R.O., Hsu, L.S. 2011. PostGIS in action. - pp 492, Manning Publications, Stamford
Teaching Advanced SQL Skills: Text Bulk Loading
ERIC Educational Resources Information Center
Olsen, David; Hauser, Karina
2007-01-01
Studies show that advanced database skills are important for students to be prepared for today's highly competitive job market. A common task for database administrators is to insert a large amount of data into a database. This paper illustrates how an up-to-date, advanced database topic, namely bulk insert, can be incorporated into a database…
A Customizable Dashboarding System for Watershed Model Interpretation
NASA Astrophysics Data System (ADS)
Easton, Z. M.; Collick, A.; Wagena, M. B.; Sommerlot, A.; Fuka, D.
2017-12-01
Stakeholders, including policymakers, agricultural water managers, and small farm managers, can benefit from the outputs of commonly run watershed models. However, the information that each stakeholder needs is be different. While policy makers are often interested in the broader effects that small farm management may have on a watershed during extreme events or over long periods, farmers are often interested in field specific effects at daily or seasonal period. To provide stakeholders with the ability to analyze and interpret data from large scale watershed models, we have developed a framework that can support custom exploration of the large datasets produced. For the volume of data produced by these models, SQL-based data queries are not efficient; thus, we employ a "Not Only SQL" (NO-SQL) query language, which allows data to scale in both quantity and query volumes. We demonstrate a stakeholder customizable Dashboarding system that allows stakeholders to create custom `dashboards' to summarize model output specific to their needs. Dashboarding is a dynamic and purpose-based visual interface needed to display one-to-many database linkages so that the information can be presented for a single time period or dynamically monitored over time and allows a user to quickly define focus areas of interest for their analysis. We utilize a single watershed model that is run four times daily with a combined set of climate projections, which are then indexed, and added to an ElasticSearch datastore. ElasticSearch is a NO-SQL search engine built on top of Apache Lucene, a free and open-source information retrieval software library. Aligned with the ElasticSearch project is the open source visualization and analysis system, Kibana, which we utilize for custom stakeholder dashboarding. The dashboards create a visualization of the stakeholder selected analysis and can be extended to recommend robust strategies to support decision-making.
Advances in Data Management in Remote Sensing and Climate Modeling
NASA Astrophysics Data System (ADS)
Brown, P. G.
2014-12-01
Recent commercial interest in "Big Data" information systems has yielded little more than a sense of deja vu among scientists whose work has always required getting their arms around extremely large databases, and writing programs to explore and analyze it. On the flip side, there are some commercial DBMS startups building "Big Data" platform using techniques taken from earth science, astronomy, high energy physics and high performance computing. In this talk, we will introduce one such platform; Paradigm4's SciDB, the first DBMS designed from the ground up to combine the kinds of quality-of-service guarantees made by SQL DBMS platforms—high level data model, query languages, extensibility, transactions—with the kinds of functionality familiar to scientific users—arrays as structural building blocks, integrated linear algebra, and client language interfaces that minimize the learning curve. We will review how SciDB is used to manage and analyze earth science data by several teams of scientific users.
2006-06-01
SPARQL SPARQL Protocol and RDF Query Language SQL Structured Query Language SUMO Suggested Upper Merged Ontology SW... Query optimization algorithms are implemented in the Pellet reasoner in order to ensure querying a knowledge base is efficient . These algorithms...memory as a treelike structure in order for the data to be queried . XML Query (XQuery) is the standard language used when querying XML
How to maintain blood supply during computer network breakdown: a manual backup system.
Zeiler, T; Slonka, J; Bürgi, H R; Kretschmer, V
2000-12-01
Electronic data management systems using computer network systems and client/server architecture are increasingly used in laboratories and transfusion services. Severe problems arise if there is no network access to the database server and critical functions are not available. We describe a manual backup system (MBS) developed to maintain the delivery of blood products to patients in a hospital transfusion service in case of a computer network breakdown. All data are kept on a central SQL database connected to peripheral workstations in a local area network (LAN). Request entry from wards is performed via machine-readable request forms containing self-adhesive specimen labels with barcodes for test tubes. Data entry occurs on-line by bidirectional automated systems or off-line manually. One of the workstations in the laboratory contains a second SQL database which is frequently and incrementally updated. This workstation is run as a stand-alone, read-only database if the central SQL database is not available. In case of a network breakdown, the time-graded MBS is launched. Patient data, requesting ward and ordered tests/requests, are photocopied through a template from the request forms on special MBS worksheets serving as laboratory journal for manual processing and result report (a copy is left in the laboratory). As soon as the network is running again the data from the off-line period are entered into the primary SQL server. The MBS was successfully used at several occasions. The documentation of a 90-min breakdown period is presented in detail. Additional work resulted from the copy work and the belated manual data entry after restoration of the system. There was no delay in issue of blood products or result reporting. The backup system described has been proven to be simple, quick and safe to maintain urgent blood supply and distribution of laboratory results in case of unexpected network breakdown.
Updates to the Virtual Atomic and Molecular Data Centre
NASA Astrophysics Data System (ADS)
Hill, Christian; Tennyson, Jonathan; Gordon, Iouli E.; Rothman, Laurence S.; Dubernet, Marie-Lise
2014-06-01
The Virtual Atomic and Molecular Data Centre (VAMDC) has established a set of standards for the storage and transmission of atomic and molecular data and an SQL-based query language (VSS2) for searching online databases, known as nodes. The project has also created an online service, the VAMDC Portal, through which all of these databases may be searched and their results compared and aggregated. Since its inception four years ago, the VAMDC e-infrastructure has grown to encompass over 40 databases, including HITRAN, in more than 20 countries and engages actively with scientists in six continents. Associated with the portal are a growing suite of software tools for the transformation of data from its native, XML-based, XSAMS format, to a range of more convenient human-readable (such as HTML) and machinereadable (such as CSV) formats. The relational database for HITRAN1, created as part of the VAMDC project is a flexible and extensible data model which is able to represent a wider range of parameters than the current fixed-format text-based one. Over the next year, a new online interface to this database will be tested, released and fully documented - this web application, HITRANonline2, will fully replace the ageing and incomplete JavaHAWKS software suite.
Data Processing on Database Management Systems with Fuzzy Query
NASA Astrophysics Data System (ADS)
Şimşek, Irfan; Topuz, Vedat
In this study, a fuzzy query tool (SQLf) for non-fuzzy database management systems was developed. In addition, samples of fuzzy queries were made by using real data with the tool developed in this study. Performance of SQLf was tested with the data about the Marmara University students' food grant. The food grant data were collected in MySQL database by using a form which had been filled on the web. The students filled a form on the web to describe their social and economical conditions for the food grant request. This form consists of questions which have fuzzy and crisp answers. The main purpose of this fuzzy query is to determine the students who deserve the grant. The SQLf easily found the eligible students for the grant through predefined fuzzy values. The fuzzy query tool (SQLf) could be used easily with other database system like ORACLE and SQL server.
GEOmetadb: powerful alternative search engine for the Gene Expression Omnibus
Zhu, Yuelin; Davis, Sean; Stephens, Robert; Meltzer, Paul S.; Chen, Yidong
2008-01-01
The NCBI Gene Expression Omnibus (GEO) represents the largest public repository of microarray data. However, finding data in GEO can be challenging. We have developed GEOmetadb in an attempt to make querying the GEO metadata both easier and more powerful. All GEO metadata records as well as the relationships between them are parsed and stored in a local MySQL database. A powerful, flexible web search interface with several convenient utilities provides query capabilities not available via NCBI tools. In addition, a Bioconductor package, GEOmetadb that utilizes a SQLite export of the entire GEOmetadb database is also available, rendering the entire GEO database accessible with full power of SQL-based queries from within R. Availability: The web interface and SQLite databases available at http://gbnci.abcc.ncifcrf.gov/geo/. The Bioconductor package is available via the Bioconductor project. The corresponding MATLAB implementation is also available at the same website. Contact: yidong@mail.nih.gov PMID:18842599
In-database processing of a large collection of remote sensing data: applications and implementation
NASA Astrophysics Data System (ADS)
Kikhtenko, Vladimir; Mamash, Elena; Chubarov, Dmitri; Voronina, Polina
2016-04-01
Large archives of remote sensing data are now available to scientists, yet the need to work with individual satellite scenes or product files constrains studies that span a wide temporal range or spatial extent. The resources (storage capacity, computing power and network bandwidth) required for such studies are often beyond the capabilities of individual geoscientists. This problem has been tackled before in remote sensing research and inspired several information systems. Some of them such as NASA Giovanni [1] and Google Earth Engine have already proved their utility for science. Analysis tasks involving large volumes of numerical data are not unique to Earth Sciences. Recent advances in data science are enabled by the development of in-database processing engines that bring processing closer to storage, use declarative query languages to facilitate parallel scalability and provide high-level abstraction of the whole dataset. We build on the idea of bridging the gap between file archives containing remote sensing data and databases by integrating files into relational database as foreign data sources and performing analytical processing inside the database engine. Thereby higher level query language can efficiently address problems of arbitrary size: from accessing the data associated with a specific pixel or a grid cell to complex aggregation over spatial or temporal extents over a large number of individual data files. This approach was implemented using PostgreSQL for a Siberian regional archive of satellite data products holding hundreds of terabytes of measurements from multiple sensors and missions taken over a decade-long span. While preserving the original storage layout and therefore compatibility with existing applications the in-database processing engine provides a toolkit for provisioning remote sensing data in scientific workflows and applications. The use of SQL - a widely used higher level declarative query language - simplifies interoperability between desktop GIS, web applications and geographic web services and interactive scientific applications (MATLAB, IPython). The system is also automatically ingesting direct readout data from meteorological and research satellites in near-real time with distributed acquisition workflows managed by Taverna workflow engine [2]. The system has demonstrated its utility in performing non-trivial analytic processing such as the computation of the Robust Satellite Technique (RST) indices [3]. It had been useful in different tasks such as studying urban heat islands, analyzing patterns in the distribution of wildfire occurrences, detecting phenomena related to seismic and earthquake activity. Initial experience has highlighted several limitations of the proposed approach yet it has demonstrated ability to facilitate the use of large archives of remote sensing data by geoscientists. 1. J.G. Acker, G. Leptoukh, Online analysis enhances use of NASA Earth science data. EOS Trans. AGU, 2007, 88(2), P. 14-17. 2. D. Hull, K. Wolsfencroft, R. Stevens, C. Goble, M.R. Pocock, P. Li and T. Oinn, Taverna: a tool for building and running workflows of services. Nucleic Acids Research. 2006. V. 34. P. W729-W732. 3. V. Tramutoli, G. Di Bello, N. Pergola, S. Piscitelli, Robust satellite techniques for remote sensing of seismically active areas // Annals of Geophysics. 2001. no. 44(2). P. 295-312.
Secure web book to store structural genomics research data.
Manjasetty, Babu A; Höppner, Klaus; Mueller, Uwe; Heinemann, Udo
2003-01-01
Recently established collaborative structural genomics programs aim at significantly accelerating the crystal structure analysis of proteins. These large-scale projects require efficient data management systems to ensure seamless collaboration between different groups of scientists working towards the same goal. Within the Berlin-based Protein Structure Factory, the synchrotron X-ray data collection and the subsequent crystal structure analysis tasks are located at BESSY, a third-generation synchrotron source. To organize file-based communication and data transfer at the BESSY site of the Protein Structure Factory, we have developed the web-based BCLIMS, the BESSY Crystallography Laboratory Information Management System. BCLIMS is a relational data management system which is powered by MySQL as the database engine and Apache HTTP as the web server. The database interface routines are written in Python programing language. The software is freely available to academic users. Here we describe the storage, retrieval and manipulation of laboratory information, mainly pertaining to the synchrotron X-ray diffraction experiments and the subsequent protein structure analysis, using BCLIMS.
Charbonnier, Amandine; Knapp, Jenny; Demonmerot, Florent; Bresson-Hadni, Solange; Raoul, Francis; Grenouillet, Frédéric; Millon, Laurence; Vuitton, Dominique Angèle; Damy, Sylvie
2014-01-01
Alveolar echinococcosis (AE) is an endemic zoonosis in France due to the cestode Echinococcus multilocularis. The French National Reference Centre for Alveolar Echinococcosis (CNR-EA), connected to the FrancEchino network, is responsible for recording all AE cases diagnosed in France. Administrative, epidemiological and medical information on the French AE cases may currently be considered exhaustive only on the diagnosis time. To constitute a reference data set, an information system (IS) was developed thanks to a relational database management system (MySQL language). The current data set will evolve towards a dynamic surveillance system, including follow-up data (e.g. imaging, serology) and will be connected to environmental and parasitological data relative to E. multilocularis to better understand the pathogen transmission pathway. A particularly important goal is the possible interoperability of the IS with similar European and other databases abroad; this new IS could play a supporting role in the creation of new AE registries. © A. Charbonnier et al., published by EDP Sciences, 2014.
A new data management system for the French National Registry of human alveolar echinococcosis cases
Charbonnier, Amandine; Knapp, Jenny; Demonmerot, Florent; Bresson-Hadni, Solange; Raoul, Francis; Grenouillet, Frédéric; Millon, Laurence; Vuitton, Dominique Angèle; Damy, Sylvie
2014-01-01
Alveolar echinococcosis (AE) is an endemic zoonosis in France due to the cestode Echinococcus multilocularis. The French National Reference Centre for Alveolar Echinococcosis (CNR-EA), connected to the FrancEchino network, is responsible for recording all AE cases diagnosed in France. Administrative, epidemiological and medical information on the French AE cases may currently be considered exhaustive only on the diagnosis time. To constitute a reference data set, an information system (IS) was developed thanks to a relational database management system (MySQL language). The current data set will evolve towards a dynamic surveillance system, including follow-up data (e.g. imaging, serology) and will be connected to environmental and parasitological data relative to E. multilocularis to better understand the pathogen transmission pathway. A particularly important goal is the possible interoperability of the IS with similar European and other databases abroad; this new IS could play a supporting role in the creation of new AE registries. PMID:25526544
ERIC Educational Resources Information Center
Lundquist, Carol; Frieder, Ophir; Holmes, David O.; Grossman, David
1999-01-01
Describes a scalable, parallel, relational database-drive information retrieval engine. To support portability across a wide range of execution environments, all algorithms adhere to the SQL-92 standard. By incorporating relevance feedback algorithms, accuracy is enhanced over prior database-driven information retrieval efforts. Presents…
Respiratory cancer database: An open access database of respiratory cancer gene and miRNA.
Choubey, Jyotsna; Choudhari, Jyoti Kant; Patel, Ashish; Verma, Mukesh Kumar
2017-01-01
Respiratory cancer database (RespCanDB) is a genomic and proteomic database of cancer of respiratory organ. It also includes the information of medicinal plants used for the treatment of various respiratory cancers with structure of its active constituents as well as pharmacological and chemical information of drug associated with various respiratory cancers. Data in RespCanDB has been manually collected from published research article and from other databases. Data has been integrated using MySQL an object-relational database management system. MySQL manages all data in the back-end and provides commands to retrieve and store the data into the database. The web interface of database has been built in ASP. RespCanDB is expected to contribute to the understanding of scientific community regarding respiratory cancer biology as well as developments of new way of diagnosing and treating respiratory cancer. Currently, the database consist the oncogenomic information of lung cancer, laryngeal cancer, and nasopharyngeal cancer. Data for other cancers, such as oral and tracheal cancers, will be added in the near future. The URL of RespCanDB is http://ridb.subdic-bioinformatics-nitrr.in/.
Network Security Visualization
1999-09-27
performing SQL generation and result-set binding, inserting acquired security events into the database and gathering the requested data for Console scene...objects is also auto-generated by a VBA script. Built into the auto-generated table access objects are the preferred join paths between tables. This...much of the Server itself) never have to deal with SQL directly. This is one aspect of laying the groundwork for supporting RDBMSs from multiple vendors
High dimensional biological data retrieval optimization with NoSQL technology.
Wang, Shicai; Pandis, Ioannis; Wu, Chao; He, Sijin; Johnson, David; Emam, Ibrahim; Guitton, Florian; Guo, Yike
2014-01-01
High-throughput transcriptomic data generated by microarray experiments is the most abundant and frequently stored kind of data currently used in translational medicine studies. Although microarray data is supported in data warehouses such as tranSMART, when querying relational databases for hundreds of different patient gene expression records queries are slow due to poor performance. Non-relational data models, such as the key-value model implemented in NoSQL databases, hold promise to be more performant solutions. Our motivation is to improve the performance of the tranSMART data warehouse with a view to supporting Next Generation Sequencing data. In this paper we introduce a new data model better suited for high-dimensional data storage and querying, optimized for database scalability and performance. We have designed a key-value pair data model to support faster queries over large-scale microarray data and implemented the model using HBase, an implementation of Google's BigTable storage system. An experimental performance comparison was carried out against the traditional relational data model implemented in both MySQL Cluster and MongoDB, using a large publicly available transcriptomic data set taken from NCBI GEO concerning Multiple Myeloma. Our new key-value data model implemented on HBase exhibits an average 5.24-fold increase in high-dimensional biological data query performance compared to the relational model implemented on MySQL Cluster, and an average 6.47-fold increase on query performance on MongoDB. The performance evaluation found that the new key-value data model, in particular its implementation in HBase, outperforms the relational model currently implemented in tranSMART. We propose that NoSQL technology holds great promise for large-scale data management, in particular for high-dimensional biological data such as that demonstrated in the performance evaluation described in this paper. We aim to use this new data model as a basis for migrating tranSMART's implementation to a more scalable solution for Big Data.
High dimensional biological data retrieval optimization with NoSQL technology
2014-01-01
Background High-throughput transcriptomic data generated by microarray experiments is the most abundant and frequently stored kind of data currently used in translational medicine studies. Although microarray data is supported in data warehouses such as tranSMART, when querying relational databases for hundreds of different patient gene expression records queries are slow due to poor performance. Non-relational data models, such as the key-value model implemented in NoSQL databases, hold promise to be more performant solutions. Our motivation is to improve the performance of the tranSMART data warehouse with a view to supporting Next Generation Sequencing data. Results In this paper we introduce a new data model better suited for high-dimensional data storage and querying, optimized for database scalability and performance. We have designed a key-value pair data model to support faster queries over large-scale microarray data and implemented the model using HBase, an implementation of Google's BigTable storage system. An experimental performance comparison was carried out against the traditional relational data model implemented in both MySQL Cluster and MongoDB, using a large publicly available transcriptomic data set taken from NCBI GEO concerning Multiple Myeloma. Our new key-value data model implemented on HBase exhibits an average 5.24-fold increase in high-dimensional biological data query performance compared to the relational model implemented on MySQL Cluster, and an average 6.47-fold increase on query performance on MongoDB. Conclusions The performance evaluation found that the new key-value data model, in particular its implementation in HBase, outperforms the relational model currently implemented in tranSMART. We propose that NoSQL technology holds great promise for large-scale data management, in particular for high-dimensional biological data such as that demonstrated in the performance evaluation described in this paper. We aim to use this new data model as a basis for migrating tranSMART's implementation to a more scalable solution for Big Data. PMID:25435347
Web-based application on employee performance assessment using exponential comparison method
NASA Astrophysics Data System (ADS)
Maryana, S.; Kurnia, E.; Ruyani, A.
2017-02-01
Employee performance assessment is also called a performance review, performance evaluation, or assessment of employees, is an effort to assess the achievements of staffing performance with the aim to increase productivity of employees and companies. This application helps in the assessment of employee performance using five criteria: Presence, Quality of Work, Quantity of Work, Discipline, and Teamwork. The system uses the Exponential Comparative Method and Weighting Eckenrode. Calculation results using graphs were provided to see the assessment of each employee. Programming language used in this system is written in Notepad++ and MySQL database. The testing result on the system can be concluded that this application is correspond with the design and running properly. The test conducted is structural test, functional test, and validation, sensitivity analysis, and SUMI testing.
Zuberbuhler, Bruno; Galloway, Peter; Reddy, Aravind; Saldana, Manuel; Gale, Richard
2007-12-01
The aim was to develop a software tool for refractive surgeons using a standard user-friendly web-based interface, providing the user with a secure environment to protect large volumes of patient data. The software application was named "Internet-based refractive analysis" (IBRA), and was programmed with the computer languages PHP, HTML and JavaScript, attached to the opensource MySQL database. IBRA facilitated internationally accepted presentation methods including the stability chart, the predictability chart and the safety chart; it was able to perform vector analysis for the course of a single patient or for group data. With the integrated nomogram calculation, treatment could be customised to reduce the postoperative refractive error. Multicenter functions permitted quality-control comparisons between different surgeons and laser units.
BtoxDB: a comprehensive database of protein structural data on toxin-antitoxin systems.
Barbosa, Luiz Carlos Bertucci; Garrido, Saulo Santesso; Marchetto, Reinaldo
2015-03-01
Toxin-antitoxin (TA) systems are diverse and abundant genetic modules in prokaryotic cells that are typically formed by two genes encoding a stable toxin and a labile antitoxin. Because TA systems are able to repress growth or kill cells and are considered to be important actors in cell persistence (multidrug resistance without genetic change), these modules are considered potential targets for alternative drug design. In this scenario, structural information for the proteins in these systems is highly valuable. In this report, we describe the development of a web-based system, named BtoxDB, that stores all protein structural data on TA systems. The BtoxDB database was implemented as a MySQL relational database using PHP scripting language. Web interfaces were developed using HTML, CSS and JavaScript. The data were collected from the PDB, UniProt and Entrez databases. These data were appropriately filtered using specialized literature and our previous knowledge about toxin-antitoxin systems. The database provides three modules ("Search", "Browse" and "Statistics") that enable searches, acquisition of contents and access to statistical data. Direct links to matching external databases are also available. The compilation of all protein structural data on TA systems in one platform is highly useful for researchers interested in this content. BtoxDB is publicly available at http://www.gurupi.uft.edu.br/btoxdb. Copyright © 2015 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Miles, B.; Chepudira, K.; LaBar, W.
2017-12-01
The Open Geospatial Consortium (OGC) SensorThings API (STA) specification, ratified in 2016, is a next-generation open standard for enabling real-time communication of sensor data. Building on over a decade of OGC Sensor Web Enablement (SWE) Standards, STA offers a rich data model that can represent a range of sensor and phenomena types (e.g. fixed sensors sensing fixed phenomena, fixed sensors sensing moving phenomena, mobile sensors sensing fixed phenomena, and mobile sensors sensing moving phenomena) and is data agnostic. Additionally, and in contrast to previous SWE standards, STA is developer-friendly, as is evident from its convenient JSON serialization, and expressive OData-based query language (with support for geospatial queries); with its Message Queue Telemetry Transport (MQTT), STA is also well-suited to efficient real-time data publishing and discovery. All these attributes make STA potentially useful for use in environmental monitoring sensor networks. Here we present Kinota(TM), an Open-Source NoSQL implementation of OGC SensorThings for large-scale high-resolution real-time environmental monitoring. Kinota, which roughly stands for Knowledge from Internet of Things Analyses, relies on Cassandra its underlying data store, which is a horizontally scalable, fault-tolerant open-source database that is often used to store time-series data for Big Data applications (though integration with other NoSQL or rational databases is possible). With this foundation, Kinota can scale to store data from an arbitrary number of sensors collecting data every 500 milliseconds. Additionally, Kinota architecture is very modular allowing for customization by adopters who can choose to replace parts of the existing implementation when desirable. The architecture is also highly portable providing the flexibility to choose between cloud providers like azure, amazon, google etc. The scalable, flexible and cloud friendly architecture of Kinota makes it ideal for use in next-generation large-scale and high-resolution real-time environmental monitoring networks used in domains such as hydrology, geomorphology, and geophysics, as well as management applications such as flood early warning, and regulatory enforcement.
Zero tolerance for incorrect data: Best practices in SQL transaction programming
NASA Astrophysics Data System (ADS)
Laiho, M.; Skourlas, C.; Dervos, D. A.
2015-02-01
DBMS products differ in the way they support even the basic SQL transaction services. In this paper, a framework of best practices in SQL transaction programming is given and discussed. The SQL developers are advised to experiment with and verify the services supported by the DBMS product used. The framework has been developed by DBTechNet, a European network of teachers, trainers and ICT professionals. A course module on SQL transactions, offered by the LLP "DBTech VET Teachers" programme, is also presented and discussed. Aims and objectives of the programme include the introduction of the topics and content of SQL transactions and concurrency control to HE/VET curricula and addressing the need for initial and continuous training on these topics to in-company trainers, VET teachers, and Higher Education students. An overview of the course module, its learning outcomes, the education and training (E&T) content, virtual database labs with hands-on self-practicing exercises, plus instructions for the teacher/trainer on the pedagogy and the usage of the course modules' content are briefly described. The main principle adopted is to "Learn by verifying in practice" and the transactions course motto is: "Zero Tolerance for Incorrect Data".
Motamed, Cyrus; Bourgain, Jean Louis
2016-06-01
Anaesthesia Information Management Systems (AIMS) generate large amounts of data, which might be useful for quality assurance programs. This study was designed to highlight the multiple contributions of our AIMS system in extracting quality indicators over a period of 10years. The study was conducted from 2002 to 2011. Two methods were used to extract anaesthesia indicators: the manual extraction of individual files for monitoring neuromuscular relaxation and structured query language (SQL) extraction for other indicators which were postoperative nausea and vomiting (PONV), pain, sedation scores, pain-related medications, scores and postoperative hypothermia. For each indicator, a program of information/meetings and adaptation/suggestions for operating room and PACU personnel was initiated to improve quality assurance, while data were extracted each year. The study included 77,573 patients. The mean overall completeness of data for the initial years ranged from 55 to 85% and was indicator-dependent, which then improved to 95% completeness for the last 5years. The incidence of neuromuscular monitoring was initially 67% and then increased to 95% (P<0.05). The rate of pharmacological reversal remained around 53% throughout the study. Regarding SQL data, an improvement of severe postoperative pain and PONV scores was observed throughout the study, while mild postoperative hypothermia remained a challenge, despite efforts for improvement. The AIMS system permitted the follow-up of certain indicators through manual sampling and many more via SQL extraction in a sustained and non-time-consuming way across years. However, it requires competent and especially dedicated resources to handle the database. Copyright © 2016 Société française d'anesthésie et de réanimation (Sfar). Published by Elsevier Masson SAS. All rights reserved.
Scale-Independent Relational Query Processing
2013-10-04
source options are also available, including Postgresql, MySQL , and SQLite. These mod- ern relational databases are generally very complex software systems...and Their Application to Data Stream Management. IGI Global, 2010. [68] George Reese. Database Programming with JDBC and Java , Second Edition. Ed. by
Information Flow Integrity for Systems of Independently-Developed Components
2015-06-22
We also examined three programs (Apache, MySQL , and PHP) in detail to evaluate the efficacy of using the provided package test suites to generate...method are just as effective as hooks that were manually placed over the course of years while greatly reducing the burden on programmers. ”Leveraging...to validate optimizations of real-world, mature applications: the Apache software suite, the Mozilla Suite, and the MySQL database. ”Validating Library
Processing of the WLCG monitoring data using NoSQL
NASA Astrophysics Data System (ADS)
Andreeva, J.; Beche, A.; Belov, S.; Dzhunov, I.; Kadochnikov, I.; Karavakis, E.; Saiz, P.; Schovancova, J.; Tuckett, D.
2014-06-01
The Worldwide LHC Computing Grid (WLCG) today includes more than 150 computing centres where more than 2 million jobs are being executed daily and petabytes of data are transferred between sites. Monitoring the computing activities of the LHC experiments, over such a huge heterogeneous infrastructure, is extremely demanding in terms of computation, performance and reliability. Furthermore, the generated monitoring flow is constantly increasing, which represents another challenge for the monitoring systems. While existing solutions are traditionally based on Oracle for data storage and processing, recent developments evaluate NoSQL for processing large-scale monitoring datasets. NoSQL databases are getting increasingly popular for processing datasets at the terabyte and petabyte scale using commodity hardware. In this contribution, the integration of NoSQL data processing in the Experiment Dashboard framework is described along with first experiences of using this technology for monitoring the LHC computing activities.
Ontology-based geospatial data query and integration
Zhao, T.; Zhang, C.; Wei, M.; Peng, Z.-R.
2008-01-01
Geospatial data sharing is an increasingly important subject as large amount of data is produced by a variety of sources, stored in incompatible formats, and accessible through different GIS applications. Past efforts to enable sharing have produced standardized data format such as GML and data access protocols such as Web Feature Service (WFS). While these standards help enabling client applications to gain access to heterogeneous data stored in different formats from diverse sources, the usability of the access is limited due to the lack of data semantics encoded in the WFS feature types. Past research has used ontology languages to describe the semantics of geospatial data but ontology-based queries cannot be applied directly to legacy data stored in databases or shapefiles, or to feature data in WFS services. This paper presents a method to enable ontology query on spatial data available from WFS services and on data stored in databases. We do not create ontology instances explicitly and thus avoid the problems of data replication. Instead, user queries are rewritten to WFS getFeature requests and SQL queries to database. The method also has the benefits of being able to utilize existing tools of databases, WFS, and GML while enabling query based on ontology semantics. ?? 2008 Springer-Verlag Berlin Heidelberg.
PhyloExplorer: a web server to validate, explore and query phylogenetic trees
Ranwez, Vincent; Clairon, Nicolas; Delsuc, Frédéric; Pourali, Saeed; Auberval, Nicolas; Diser, Sorel; Berry, Vincent
2009-01-01
Background Many important problems in evolutionary biology require molecular phylogenies to be reconstructed. Phylogenetic trees must then be manipulated for subsequent inclusion in publications or analyses such as supertree inference and tree comparisons. However, no tool is currently available to facilitate the management of tree collections providing, for instance: standardisation of taxon names among trees with respect to a reference taxonomy; selection of relevant subsets of trees or sub-trees according to a taxonomic query; or simply computation of descriptive statistics on the collection. Moreover, although several databases of phylogenetic trees exist, there is currently no easy way to find trees that are both relevant and complementary to a given collection of trees. Results We propose a tool to facilitate assessment and management of phylogenetic tree collections. Given an input collection of rooted trees, PhyloExplorer provides facilities for obtaining statistics describing the collection, correcting invalid taxon names, extracting taxonomically relevant parts of the collection using a dedicated query language, and identifying related trees in the TreeBASE database. Conclusion PhyloExplorer is a simple and interactive website implemented through underlying Python libraries and MySQL databases. It is available at: and the source code can be downloaded from: . PMID:19450253
NASA Astrophysics Data System (ADS)
Anugrah, Wirdah; Suryono; Suseno, Jatmiko Endro
2018-02-01
Management of water resources based on Geographic Information System can provide substantial benefits to water availability settings. Monitoring the potential water level is needed in the development sector, agriculture, energy and others. In this research is developed water resource information system using real-time Geographic Information System concept for monitoring the potential water level of web based area by applying rule based system method. GIS consists of hardware, software, and database. Based on the web-based GIS architecture, this study uses a set of computer that are connected to the network, run on the Apache web server and PHP programming language using MySQL database. The Ultrasound Wireless Sensor System is used as a water level data input. It also includes time and geographic location information. This GIS maps the five sensor locations. GIS is processed through a rule based system to determine the level of potential water level of the area. Water level monitoring information result can be displayed on thematic maps by overlaying more than one layer, and also generating information in the form of tables from the database, as well as graphs are based on the timing of events and the water level values.
A web-based quantitative signal detection system on adverse drug reaction in China.
Li, Chanjuan; Xia, Jielai; Deng, Jianxiong; Chen, Wenge; Wang, Suzhen; Jiang, Jing; Chen, Guanquan
2009-07-01
To establish a web-based quantitative signal detection system for adverse drug reactions (ADRs) based on spontaneous reporting to the Guangdong province drug-monitoring database in China. Using Microsoft Visual Basic and Active Server Pages programming languages and SQL Server 2000, a web-based system with three software modules was programmed to perform data preparation and association detection, and to generate reports. Information component (IC), the internationally recognized measure of disproportionality for quantitative signal detection, was integrated into the system, and its capacity for signal detection was tested with ADR reports collected from 1 January 2002 to 30 June 2007 in Guangdong. A total of 2,496 associations including known signals were mined from the test database. Signals (e.g., cefradine-induced hematuria) were found early by using the IC analysis. In addition, 291 drug-ADR associations were alerted for the first time in the second quarter of 2007. The system can be used for the detection of significant associations from the Guangdong drug-monitoring database and could be an extremely useful adjunct to the expert assessment of very large numbers of spontaneously reported ADRs for the first time in China.
Automating Data Submission to a National Archive
NASA Astrophysics Data System (ADS)
Work, T. T.; Chandler, C. L.; Groman, R. C.; Allison, M. D.; Gegg, S. R.; Biological; Chemical Oceanography Data Management Office
2010-12-01
In late 2006, the U.S. National Science Foundation (NSF) funded the Biological and Chemical Oceanographic Data Management Office (BCO-DMO) at Woods Hole Oceanographic Institution (WHOI) to work closely with investigators to manage oceanographic data generated from their research projects. One of the final data management tasks is to ensure that the data are permanently archived at the U.S. National Oceanographic Data Center (NODC) or other appropriate national archiving facility. In the past, BCO-DMO submitted data to NODC as an email with attachments including a PDF file (a manually completed metadata record) and one or more data files. This method is no longer feasible given the rate at which data sets are contributed to BCO-DMO. Working with collaborators at NODC, a more streamlined and automated workflow was developed to keep up with the increased volume of data that must be archived at NODC. We will describe our new workflow; a semi-automated approach for contributing data to NODC that includes a Federal Geographic Data Committee (FGDC) compliant Extensible Markup Language (XML) metadata file accompanied by comma-delimited data files. The FGDC XML file is populated from information stored in a MySQL database. A crosswalk described by an Extensible Stylesheet Language Transformation (XSLT) is used to transform the XML formatted MySQL result set to a FGDC compliant XML metadata file. To ensure data integrity, the MD5 algorithm is used to generate a checksum and manifest of the files submitted to NODC for permanent archive. The revised system supports preparation of detailed, standards-compliant metadata that facilitate data sharing and enable accurate reuse of multidisciplinary information. The approach is generic enough to be adapted for use by other data management groups.
DAS: A Data Management System for Instrument Tests and Operations
NASA Astrophysics Data System (ADS)
Frailis, M.; Sartor, S.; Zacchei, A.; Lodi, M.; Cirami, R.; Pasian, F.; Trifoglio, M.; Bulgarelli, A.; Gianotti, F.; Franceschi, E.; Nicastro, L.; Conforti, V.; Zoli, A.; Smart, R.; Morbidelli, R.; Dadina, M.
2014-05-01
The Data Access System (DAS) is a and data management software system, providing a reusable solution for the storage of data acquired both from telescopes and auxiliary data sources during the instrument development phases and operations. It is part of the Customizable Instrument WorkStation system (CIWS-FW), a framework for the storage, processing and quick-look at the data acquired from scientific instruments. The DAS provides a data access layer mainly targeted to software applications: quick-look displays, pre-processing pipelines and scientific workflows. It is logically organized in three main components: an intuitive and compact Data Definition Language (DAS DDL) in XML format, aimed for user-defined data types; an Application Programming Interface (DAS API), automatically adding classes and methods supporting the DDL data types, and providing an object-oriented query language; a data management component, which maps the metadata of the DDL data types in a relational Data Base Management System (DBMS), and stores the data in a shared (network) file system. With the DAS DDL, developers define the data model for a particular project, specifying for each data type the metadata attributes, the data format and layout (if applicable), and named references to related or aggregated data types. Together with the DDL user-defined data types, the DAS API acts as the only interface to store, query and retrieve the metadata and data in the DAS system, providing both an abstract interface and a data model specific one in C, C++ and Python. The mapping of metadata in the back-end database is automatic and supports several relational DBMSs, including MySQL, Oracle and PostgreSQL.
Advanced Query and Data Mining Capabilities for MaROS
NASA Technical Reports Server (NTRS)
Wang, Paul; Wallick, Michael N.; Allard, Daniel A.; Gladden, Roy E.; Hy, Franklin H.
2013-01-01
The Mars Relay Operational Service (MaROS) comprises a number of tools to coordinate, plan, and visualize various aspects of the Mars Relay network. These levels include a Web-based user interface, a back-end "ReSTlet" built in Java, and databases that store the data as it is received from the network. As part of MaROS, the innovators have developed and implemented a feature set that operates on several levels of the software architecture. This new feature is an advanced querying capability through either the Web-based user interface, or through a back-end REST interface to access all of the data gathered from the network. This software is not meant to replace the REST interface, but to augment and expand the range of available data. The current REST interface provides specific data that is used by the MaROS Web application to display and visualize the information; however, the returned information from the REST interface has typically been pre-processed to return only a subset of the entire information within the repository, particularly only the information that is of interest to the GUI (graphical user interface). The new, advanced query and data mining capabilities allow users to retrieve the raw data and/or to perform their own data processing. The query language used to access the repository is a restricted subset of the structured query language (SQL) that can be built safely from the Web user interface, or entered as freeform SQL by a user. The results are returned in a CSV (Comma Separated Values) format for easy exporting to third party tools and applications that can be used for data mining or user-defined visualization and interpretation. This is the first time that a service is capable of providing access to all cross-project relay data from a single Web resource. Because MaROS contains the data for a variety of missions from the Mars network, which span both NASA and ESA, the software also establishes an access control list (ACL) on each data record in the database repository to enforce user access permissions through a multilayered approach.
Large Survey Database: A Distributed Framework for Storage and Analysis of Large Datasets
NASA Astrophysics Data System (ADS)
Juric, Mario
2011-01-01
The Large Survey Database (LSD) is a Python framework and DBMS for distributed storage, cross-matching and querying of large survey catalogs (>10^9 rows, >1 TB). The primary driver behind its development is the analysis of Pan-STARRS PS1 data. It is specifically optimized for fast queries and parallel sweeps of positionally and temporally indexed datasets. It transparently scales to more than >10^2 nodes, and can be made to function in "shared nothing" architectures. An LSD database consists of a set of vertically and horizontally partitioned tables, physically stored as compressed HDF5 files. Vertically, we partition the tables into groups of related columns ('column groups'), storing together logically related data (e.g., astrometry, photometry). Horizontally, the tables are partitioned into partially overlapping ``cells'' by position in space (lon, lat) and time (t). This organization allows for fast lookups based on spatial and temporal coordinates, as well as data and task distribution. The design was inspired by the success of Google BigTable (Chang et al., 2006). Our programming model is a pipelined extension of MapReduce (Dean and Ghemawat, 2004). An SQL-like query language is used to access data. For complex tasks, map-reduce ``kernels'' that operate on query results on a per-cell basis can be written, with the framework taking care of scheduling and execution. The combination leverages users' familiarity with SQL, while offering a fully distributed computing environment. LSD adds little overhead compared to direct Python file I/O. In tests, we sweeped through 1.1 Grows of PanSTARRS+SDSS data (220GB) less than 15 minutes on a dual CPU machine. In a cluster environment, we achieved bandwidths of 17Gbits/sec (I/O limited). Based on current experience, we believe LSD should scale to be useful for analysis and storage of LSST-scale datasets. It can be downloaded from http://mwscience.net/lsd.
Techniques of Photometry and Astrometry with APASS, Gaia, and Pan-STARRs Results (Abstract)
NASA Astrophysics Data System (ADS)
Green, W.
2017-12-01
(Abstract only) The databases with the APASS DR9, Gaia DR1, and the Pan-STARRs 3pi DR1 data releases are publicly available for use. There is a bit of data-mining involved to download and manage these reference stars. This paper discusses the use of these databases to acquire accurate photometric references as well as techniques for improving results. Images are prepared in the usual way: zero, dark, flat-fields, and WCS solutions with Astrometry.net. Images are then processed with Sextractor to produce an ASCII table of identifying photometric features. The database manages photometics catalogs and images converted to ASCII tables. Scripts convert the files into SQL and assimilate them into database tables. Using SQL techniques, each image star is merged with reference data to produce publishable results. The VYSOS has over 13,000 images of the ONC5 field to process with roughly 100 total fields in the campaign. This paper provides the overview for this daunting task.
Image query and indexing for digital x rays
NASA Astrophysics Data System (ADS)
Long, L. Rodney; Thoma, George R.
1998-12-01
The web-based medical information retrieval system (WebMIRS) allows interned access to databases containing 17,000 digitized x-ray spine images and associated text data from National Health and Nutrition Examination Surveys (NHANES). WebMIRS allows SQL query of the text, and viewing of the returned text records and images using a standard browser. We are now working (1) to determine utility of data directly derived from the images in our databases, and (2) to investigate the feasibility of computer-assisted or automated indexing of the images to support image retrieval of images of interest to biomedical researchers in the field of osteoarthritis. To build an initial database based on image data, we are manually segmenting a subset of the vertebrae, using techniques from vertebral morphometry. From this, we will derive and add to the database vertebral features. This image-derived data will enhance the user's data access capability by enabling the creation of combined SQL/image-content queries.
The Protein Disease Database of human body fluids: II. Computer methods and data issues.
Lemkin, P F; Orr, G A; Goldstein, M P; Creed, G J; Myrick, J E; Merril, C R
1995-01-01
The Protein Disease Database (PDD) is a relational database of proteins and diseases. With this database it is possible to screen for quantitative protein abnormalities associated with disease states. These quantitative relationships use data drawn from the peer-reviewed biomedical literature. Assays may also include those observed in high-resolution electrophoretic gels that offer the potential to quantitate many proteins in a single test as well as data gathered by enzymatic or immunologic assays. We are using the Internet World Wide Web (WWW) and the Web browser paradigm as an access method for wide distribution and querying of the Protein Disease Database. The WWW hypertext transfer protocol and its Common Gateway Interface make it possible to build powerful graphical user interfaces that can support easy-to-use data retrieval using query specification forms or images. The details of these interactions are totally transparent to the users of these forms. Using a client-server SQL relational database, user query access, initial data entry and database maintenance are all performed over the Internet with a Web browser. We discuss the underlying design issues, mapping mechanisms and assumptions that we used in constructing the system, data entry, access to the database server, security, and synthesis of derived two-dimensional gel image maps and hypertext documents resulting from SQL database searches.
Hosting and pulishing astronomical data in SQL databases
NASA Astrophysics Data System (ADS)
Galkin, Anastasia; Klar, Jochen; Riebe, Kristin; Matokevic, Gal; Enke, Harry
2017-04-01
In astronomy, terabytes and petabytes of data are produced by ground instruments, satellite missions and simulations. At Leibniz-Institute for Astrophysics Potsdam (AIP) we host and publish terabytes of cosmological simulation and observational data. The public archive at AIP has now reached a size of 60TB and growing and helps to produce numerous scientific papers. The web framework Daiquiri offers a dedicated web interface for each of the hosted scientific databases. Scientists all around the world run SQL queries which include specific astrophysical functions and get their desired data in reasonable time. Daiquiri supports the scientific projects by offering a number of administration tools such as database and user management, contact messages to the staff and support for organization of meetings and workshops. The webpages can be customized and the Wordpress integration supports the participating scientists in maintaining the documentation and the projects' news sections.
2016-03-01
Representational state transfer Java messaging service Java application programming interface (API) Internet relay chat (IRC)/extensible messaging and...JBoss application server or an Apache Tomcat servlet container instance. The relational database management system can be either PostgreSQL or MySQL ... Java library called direct web remoting. This library has been part of the core CACE architecture for quite some time; however, there have not been
Understanding and Capturing People’s Mobile App Privacy Preferences
2013-10-28
The entire apps’ metadata takes up about 500MB of storage space when stored in a MySQL database and all the binary files take approximately 300GB of...functionality that can de- compile Dalvik bytecodes to Java source code faster than other de-compilers. Given the scale of the app analysis we planned on... java libraries, such as parser, sql connectors, etc Targeted Ads 137 admob, adwhirl, greystripe… Provided by mobile behavioral ads company to
Shark: SQL and Analytics with Cost-Based Query Optimization on Coarse-Grained Distributed Memory
2014-01-13
RDBMS and contains a database (often MySQL or Derby) with a namespace for tables, table metadata and partition information. Table data is stored in an...serialization/deserialization) Java interface implementations with corresponding object inspectors. The Hive driver controls the processing of queries, coordinat...native API, RDD operations are invoked through a functional interface similar to DryadLINQ [32] in Scala, Java or Python. For example, the Scala code for
Survey data and metadata modelling using document-oriented NoSQL
NASA Astrophysics Data System (ADS)
Rahmatuti Maghfiroh, Lutfi; Gusti Bagus Baskara Nugraha, I.
2018-03-01
Survey data that are collected from year to year have metadata change. However it need to be stored integratedly to get statistical data faster and easier. Data warehouse (DW) can be used to solve this limitation. However there is a change of variables in every period that can not be accommodated by DW. Traditional DW can not handle variable change via Slowly Changing Dimension (SCD). Previous research handle the change of variables in DW to manage metadata by using multiversion DW (MVDW). MVDW is designed using relational model. Some researches also found that developing nonrelational model in NoSQL database has reading time faster than the relational model. Therefore, we propose changes to metadata management by using NoSQL. This study proposes a model DW to manage change and algorithms to retrieve data with metadata changes. Evaluation of the proposed models and algorithms result in that database with the proposed design can retrieve data with metadata changes properly. This paper has contribution in comprehensive data analysis with metadata changes (especially data survey) in integrated storage.
NASA Astrophysics Data System (ADS)
Boichard, Jean-Luc; Brissebrat, Guillaume; Cloche, Sophie; Eymard, Laurence; Fleury, Laurence; Mastrorillo, Laurence; Moulaye, Oumarou; Ramage, Karim
2010-05-01
The AMMA project includes aircraft, ground-based and ocean measurements, an intensive use of satellite data and diverse modelling studies. Therefore, the AMMA database aims at storing a great amount and a large variety of data, and at providing the data as rapidly and safely as possible to the AMMA research community. In order to stimulate the exchange of information and collaboration between researchers from different disciplines or using different tools, the database provides a detailed description of the products and uses standardized formats. The AMMA database contains: - AMMA field campaigns datasets; - historical data in West Africa from 1850 (operational networks and previous scientific programs); - satellite products from past and future satellites, (re-)mapped on a regular latitude/longitude grid and stored in NetCDF format (CF Convention); - model outputs from atmosphere or ocean operational (re-)analysis and forecasts, and from research simulations. The outputs are processed as the satellite products are. Before accessing the data, any user has to sign the AMMA data and publication policy. This chart only covers the use of data in the framework of scientific objectives and categorically excludes the redistribution of data to third parties and the usage for commercial applications. Some collaboration between data producers and users, and the mention of the AMMA project in any publication is also required. The AMMA database and the associated on-line tools have been fully developed and are managed by two teams in France (IPSL Database Centre, Paris and OMP, Toulouse). Users can access data of both data centres using an unique web portal. This website is composed of different modules : - Registration: forms to register, read and sign the data use chart when an user visits for the first time - Data access interface: friendly tool allowing to build a data extraction request by selecting various criteria like location, time, parameters... The request can concern local, satellite and model data. - Documentation: catalogue of all the available data and their metadata. These tools have been developed using standard and free languages and softwares: - Linux system with an Apache web server and a Tomcat application server; - J2EE tools : JSF and Struts frameworks, hibernate; - relational database management systems: PostgreSQL and MySQL; - OpenLDAP directory. In order to facilitate the access to the data by African scientists, the complete system has been mirrored at AGHRYMET Regional Centre in Niamey and is operational there since January 2009. Users can now access metadata and request data through one or the other of two equivalent portals: http://database.amma-international.org or http://amma.agrhymet.ne/amma-data.
A Framework for Building and Reasoning with Adaptive and Interoperable PMESII Models
2007-11-01
Description Logic SOA Service Oriented Architecture SPARQL Simple Protocol And RDF Query Language SQL Standard Query Language SROM Stability and...another by providing a more expressive ontological structure for one of the models, e.g., semantic networks can be mapped to first- order logical...Pellet is an open-source reasoner that works with OWL-DL. It accepts the SPARQL protocol and RDF query language ( SPARQL ) and provides a Java API to
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ahrens, J.P.; Shapiro, L.G.; Tanimoto, S.L.
1997-04-01
This paper describes a computing environment which supports computer-based scientific research work. Key features include support for automatic distributed scheduling and execution and computer-based scientific experimentation. A new flexible and extensible scheduling technique that is responsive to a user`s scheduling constraints, such as the ordering of program results and the specification of task assignments and processor utilization levels, is presented. An easy-to-use constraint language for specifying scheduling constraints, based on the relational database query language SQL, is described along with a search-based algorithm for fulfilling these constraints. A set of performance studies show that the environment can schedule and executemore » program graphs on a network of workstations as the user requests. A method for automatically generating computer-based scientific experiments is described. Experiments provide a concise method of specifying a large collection of parameterized program executions. The environment achieved significant speedups when executing experiments; for a large collection of scientific experiments an average speedup of 3.4 on an average of 5.5 scheduled processors was obtained.« less
SQL is Dead; Long-live SQL: Relational Database Technology in Science Contexts
NASA Astrophysics Data System (ADS)
Howe, B.; Halperin, D.
2014-12-01
Relational databases are often perceived as a poor fit in science contexts: Rigid schemas, poor support for complex analytics, unpredictable performance, significant maintenance and tuning requirements --- these idiosyncrasies often make databases unattractive in science contexts characterized by heterogeneous data sources, complex analysis tasks, rapidly changing requirements, and limited IT budgets. In this talk, I'll argue that although the value proposition of typical relational database systems are weak in science, the core ideas that power relational databases have become incredibly prolific in open source science software, and are emerging as a universal abstraction for both big data and small data. In addition, I'll talk about two open source systems we are building to "jailbreak" the core technology of relational databases and adapt them for use in science. The first is SQLShare, a Database-as-a-Service system supporting collaborative data analysis and exchange by reducing database use to an Upload-Query-Share workflow with no installation, schema design, or configuration required. The second is Myria, a service that supports much larger scale data, complex analytics, and supports multiple back end systems. Finally, I'll describe some of the ways our collaborators in oceanography, astronomy, biology, fisheries science, and more are using these systems to replace script-based workflows for reasons of performance, flexibility, and convenience.
A Medical Image Backup Architecture Based on a NoSQL Database and Cloud Computing Services.
Santos Simões de Almeida, Luan Henrique; Costa Oliveira, Marcelo
2015-01-01
The use of digital systems for storing medical images generates a huge volume of data. Digital images are commonly stored and managed on a Picture Archiving and Communication System (PACS), under the DICOM standard. However, PACS is limited because it is strongly dependent on the server's physical space. Alternatively, Cloud Computing arises as an extensive, low cost, and reconfigurable resource. However, medical images contain patient information that can not be made available in a public cloud. Therefore, a mechanism to anonymize these images is needed. This poster presents a solution for this issue by taking digital images from PACS, converting the information contained in each image file to a NoSQL database, and using cloud computing to store digital images.
MySQL/PHP web database applications for IPAC proposal submission
NASA Astrophysics Data System (ADS)
Crane, Megan K.; Storrie-Lombardi, Lisa J.; Silbermann, Nancy A.; Rebull, Luisa M.
2008-07-01
The Infrared Processing and Analysis Center (IPAC) is NASA's multi-mission center of expertise for long-wavelength astrophysics. Proposals for various IPAC missions and programs are ingested via MySQL/PHP web database applications. Proposers use web forms to enter coversheet information and upload PDF files related to the proposal. Upon proposal submission, a unique directory is created on the webserver into which all of the uploaded files are placed. The coversheet information is converted into a PDF file using a PHP extension called FPDF. The files are concatenated into one PDF file using the command-line tool pdftk and then forwarded to the review committee. This work was performed at the California Institute of Technology under contract to the National Aeronautics and Space Administration.
Using Virtual Servers to Teach the Implementation of Enterprise-Level DBMSs: A Teaching Note
ERIC Educational Resources Information Center
Wagner, William P.; Pant, Vik
2010-01-01
One of the areas where demand has remained strong for MIS students is in the area of database management. Since the early days, this topic has been a mainstay in the MIS curriculum. Students of database management today typically learn about relational databases, SQL, normalization, and how to design and implement various kinds of database…
NASA Astrophysics Data System (ADS)
Guion, A., Jr.; Hodgkins, H.
2015-12-01
The Center of Excellence in Remote Sensing Education and Research (CERSER) has implemented three research projects during the summer Research Experience for Undergraduates (REU) program gathering water quality data for local waterways. The data has been compiled manually utilizing pen and paper and then entered into a spreadsheet. With the spread of electronic devices capable of interacting with databases, the development of an electronic method of entering and manipulating the water quality data was pursued during this project. This project focused on the development of an interactive database to gather, display, and analyze data collected from local waterways. The database and entry form was built in MySQL on a PHP server allowing participants to enter data from anywhere Internet access is available. This project then researched applying this data to the Google Maps site to provide labeling and information to users. The NIA server at http://nia.ecsu.edu is used to host the application for download and for storage of the databases. Water Quality Database Team members included the authors plus Derek Morris Jr., Kathryne Burton and Mr. Jeff Wood as mentor.
Evolution of the use of relational and NoSQL databases in the ATLAS experiment
NASA Astrophysics Data System (ADS)
Barberis, D.
2016-09-01
The ATLAS experiment used for many years a large database infrastructure based on Oracle to store several different types of non-event data: time-dependent detector configuration and conditions data, calibrations and alignments, configurations of Grid sites, catalogues for data management tools, job records for distributed workload management tools, run and event metadata. The rapid development of "NoSQL" databases (structured storage services) in the last five years allowed an extended and complementary usage of traditional relational databases and new structured storage tools in order to improve the performance of existing applications and to extend their functionalities using the possibilities offered by the modern storage systems. The trend is towards using the best tool for each kind of data, separating for example the intrinsically relational metadata from payload storage, and records that are frequently updated and benefit from transactions from archived information. Access to all components has to be orchestrated by specialised services that run on front-end machines and shield the user from the complexity of data storage infrastructure. This paper describes this technology evolution in the ATLAS database infrastructure and presents a few examples of large database applications that benefit from it.
NASA Astrophysics Data System (ADS)
Mueller, Wolfgang; Mueller, Henning; Marchand-Maillet, Stephane; Pun, Thierry; Squire, David M.; Pecenovic, Zoran; Giess, Christoph; de Vries, Arjen P.
2000-10-01
While in the area of relational databases interoperability is ensured by common communication protocols (e.g. ODBC/JDBC using SQL), Content Based Image Retrieval Systems (CBIRS) and other multimedia retrieval systems are lacking both a common query language and a common communication protocol. Besides its obvious short term convenience, interoperability of systems is crucial for the exchange and analysis of user data. In this paper, we present and describe an extensible XML-based query markup language, called MRML (Multimedia Retrieval markup Language). MRML is primarily designed so as to ensure interoperability between different content-based multimedia retrieval systems. Further, MRML allows researchers to preserve their freedom in extending their system as needed. MRML encapsulates multimedia queries in a way that enable multimedia (MM) query languages, MM content descriptions, MM query engines, and MM user interfaces to grow independently from each other, reaching a maximum of interoperability while ensuring a maximum of freedom for the developer. For benefitting from this, only a few simple design principles have to be respected when extending MRML for one's fprivate needs. The design of extensions withing the MRML framework will be described in detail in the paper. MRML has been implemented and tested for the CBIRS Viper, using the user interface Snake Charmer. Both are part of the GNU project and can be downloaded at our site.
Harris, Eric S J; Erickson, Sean D; Tolopko, Andrew N; Cao, Shugeng; Craycroft, Jane A; Scholten, Robert; Fu, Yanling; Wang, Wenquan; Liu, Yong; Zhao, Zhongzhen; Clardy, Jon; Shamu, Caroline E; Eisenberg, David M
2011-05-17
Ethnobotanically driven drug-discovery programs include data related to many aspects of the preparation of botanical medicines, from initial plant collection to chemical extraction and fractionation. The Traditional Medicine Collection Tracking System (TM-CTS) was created to organize and store data of this type for an international collaborative project involving the systematic evaluation of commonly used Traditional Chinese Medicinal plants. The system was developed using domain-driven design techniques, and is implemented using Java, Hibernate, PostgreSQL, Business Intelligence and Reporting Tools (BIRT), and Apache Tomcat. The TM-CTS relational database schema contains over 70 data types, comprising over 500 data fields. The system incorporates a number of unique features that are useful in the context of ethnobotanical projects such as support for information about botanical collection, method of processing, quality tests for plants with existing pharmacopoeia standards, chemical extraction and fractionation, and historical uses of the plants. The database also accommodates data provided in multiple languages and integration with a database system built to support high throughput screening based drug discovery efforts. It is accessed via a web-based application that provides extensive, multi-format reporting capabilities. This new database system was designed to support a project evaluating the bioactivity of Chinese medicinal plants. The software used to create the database is open source, freely available, and could potentially be applied to other ethnobotanically driven natural product collection and drug-discovery programs. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.
Harris, Eric S. J.; Erickson, Sean D.; Tolopko, Andrew N.; Cao, Shugeng; Craycroft, Jane A.; Scholten, Robert; Fu, Yanling; Wang, Wenquan; Liu, Yong; Zhao, Zhongzhen; Clardy, Jon; Shamu, Caroline E.; Eisenberg, David M.
2011-01-01
Aim of the study. Ethnobotanically-driven drug-discovery programs include data related to many aspects of the preparation of botanical medicines, from initial plant collection to chemical extraction and fractionation. The Traditional Medicine-Collection Tracking System (TM-CTS) was created to organize and store data of this type for an international collaborative project involving the systematic evaluation of commonly used Traditional Chinese Medicinal plants. Materials and Methods. The system was developed using domain-driven design techniques, and is implemented using Java, Hibernate, PostgreSQL, Business Intelligence and Reporting Tools (BIRT), and Apache Tomcat. Results. The TM-CTS relational database schema contains over 70 data types, comprising over 500 data fields. The system incorporates a number of unique features that are useful in the context of ethnobotanical projects such as support for information about botanical collection, method of processing, quality tests for plants with existing pharmacopoeia standards, chemical extraction and fractionation, and historical uses of the plants. The database also accommodates data provided in multiple languages and integration with a database system built to support high throughput screening based drug discovery efforts. It is accessed via a web-based application that provides extensive, multi-format reporting capabilities. Conclusions. This new database system was designed to support a project evaluating the bioactivity of Chinese medicinal plants. The software used to create the database is open source, freely available, and could potentially be applied to other ethnobotanically-driven natural product collection and drug-discovery programs. PMID:21420479
Schulz, Erich; Barrett, James W.; Price, Colin
1998-01-01
As controlled clinical vocabularies assume an increasing role in modern clinical information systems, so the issue of their quality demands greater attention. In order to meet the resulting stringent criteria for completeness and correctness, a quality assurance system comprising a database of more than 500 rules is being developed and applied to the Read Thesaurus. The authors discuss the requirement to apply quality assurance processes to their dynamic editing database in order to ensure the quality of exported products. Sources of errors include human, hardware, and software factors as well as new rules and transactions. The overall quality strategy includes prevention, detection, and correction of errors. The quality assurance process encompasses simple data specification, internal consistency, inspection procedures and, eventually, field testing. The quality assurance system is driven by a small number of tables and UNIX scripts, with “business rules” declared explicitly as Structured Query Language (SQL) statements. Concurrent authorship, client-server technology, and an initial failure to implement robust transaction control have all provided valuable lessons. The feedback loop for error management needs to be short. PMID:9670131
Geoinformatics paves the way for a zoo information system
NASA Astrophysics Data System (ADS)
Michel, Ulrich
2008-10-01
The use of modern electronic media offers new ways of (environmental) knowledge transfer. All kind of information can be made quickly available as well as queryable and can be processed individually. The Institute for Geoinformatics and Remote Sensing (IGF) in collaboration with the Osnabrueck Zoo, is developing a zoo information system, especially for new media (e.g. mobile devices), which provides information about the animals living there, their natural habitat and endangerment status. Thereby multimedia information is being offered to the zoo visitors. The implementation of the 2D/3D components is realized by modern database and Mapserver technologies. Among other technologies, the VRML (Virtual Reality Modeling Language) standard is used for the realization of the 3D visualization so that it can be viewed in every conventional web browser. Also, a mobile information system for Pocket PCs, Smartphones and Ultra Mobile PCs (UMPC) is being developed. All contents, including the coordinates, are stored in a PostgreSQL database. The data input, the processing and other administrative operations are executed by a content management system (CMS).
Read Code quality assurance: from simple syntax to semantic stability.
Schulz, E B; Barrett, J W; Price, C
1998-01-01
As controlled clinical vocabularies assume an increasing role in modern clinical information systems, so the issue of their quality demands greater attention. In order to meet the resulting stringent criteria for completeness and correctness, a quality assurance system comprising a database of more than 500 rules is being developed and applied to the Read Thesaurus. The authors discuss the requirement to apply quality assurance processes to their dynamic editing database in order to ensure the quality of exported products. Sources of errors include human, hardware, and software factors as well as new rules and transactions. The overall quality strategy includes prevention, detection, and correction of errors. The quality assurance process encompasses simple data specification, internal consistency, inspection procedures and, eventually, field testing. The quality assurance system is driven by a small number of tables and UNIX scripts, with "business rules" declared explicitly as Structured Query Language (SQL) statements. Concurrent authorship, client-server technology, and an initial failure to implement robust transaction control have all provided valuable lessons. The feedback loop for error management needs to be short.
Changes in Exercise Data Management
NASA Technical Reports Server (NTRS)
Buxton, R. E.; Kalogera, K. L.; Hanson, A. M.
2018-01-01
The suite of exercise hardware aboard the International Space Station (ISS) generates an immense amount of data. The data collected from the treadmill, cycle ergometer, and resistance strength training hardware are basic exercise parameters (time, heart rate, speed, load, etc.). The raw data are post processed in the laboratory and more detailed parameters are calculated from each exercise data file. Updates have recently been made to how this valuable data are stored, adding an additional level of data security, increasing data accessibility, and resulting in overall increased efficiency of medical report delivery. Questions regarding exercise performance or how exercise may influence other variables of crew health frequently arise within the crew health care community. Inquiries over the health of the exercise hardware often need quick analysis and response to ensure the exercise system is operable on a continuous basis. Consolidating all of the exercise system data in a single repository enables a quick response to both the medical and engineering communities. A SQL server database is currently in use, and provides a secure location for all of the exercise data starting at ISS Expedition 1 - current day. The database has been structured to update derived metrics automatically, making analysis and reporting available within minutes of dropping the inflight data it into the database. Commercial tools were evaluated to help aggregate and visualize data from the SQL database. The Tableau software provides manageable interface, which has improved the laboratory's output time of crew reports by 67%. Expansion of the SQL database to be inclusive of additional medical requirement metrics, addition of 'app-like' tools for mobile visualization, and collaborative use (e.g. operational support teams, research groups, and International Partners) of the data system is currently being explored.
Design and Implementation of 3D Model Data Management System Based on SQL
NASA Astrophysics Data System (ADS)
Li, Shitao; Zhang, Shixin; Zhang, Zhanling; Li, Shiming; Jia, Kun; Hu, Zhongxu; Ping, Liang; Hu, Youming; Li, Yanlei
CAD/CAM technology plays an increasingly important role in the machinery manufacturing industry. As an important means of production, the accumulated three-dimensional models in many years of design work are valuable. Thus the management of these three-dimensional models is of great significance. This paper gives detailed explanation for a method to design three-dimensional model databases based on SQL and to implement the functions such as insertion, modification, inquiry, preview and so on.
"Science SQL" as a Building Block for Flexible, Standards-based Data Infrastructures
NASA Astrophysics Data System (ADS)
Baumann, Peter
2016-04-01
We have learnt to live with the pain of separating data and metadata into non-interoperable silos. For metadata, we enjoy the flexibility of databases, be they relational, graph, or some other NoSQL. Contrasting this, users still "drown in files" as an unstructured, low-level archiving paradigm. It is time to bridge this chasm which once was technologically induced, but today can be overcome. One building block towards a common re-integrated information space is to support massive multi-dimensional spatio-temporal arrays. These "datacubes" appear as sensor, image, simulation, and statistics data in all science and engineering domains, and beyond. For example, 2-D satellilte imagery, 2-D x/y/t image timeseries and x/y/z geophysical voxel data, and 4-D x/y/z/t climate data contribute to today's data deluge in the Earth sciences. Virtual observatories in the Space sciences routinely generate Petabytes of such data. Life sciences deal with microarray data, confocal microscopy, human brain data, which all fall into the same category. The ISO SQL/MDA (Multi-Dimensional Arrays) candidate standard is extending SQL with modelling and query support for n-D arrays ("datacubes") in a flexible, domain-neutral way. This heralds a new generation of services with new quality parameters, such as flexibility, ease of access, embedding into well-known user tools, and scalability mechanisms that remain completely transparent to users. Technology like the EU rasdaman ("raster data manager") Array Database system can support all of the above examples simultaneously, with one technology. This is practically proven: As of today, rasdaman is in operational use on hundreds of Terabytes of satellite image timeseries datacubes, with transparent query distribution across more than 1,000 nodes. Therefore, Array Databases offering SQL/MDA constitute a natural common building block for next-generation data infrastructures. Being initiator and editor of the standard we present principles, implementation facets, and application examples as a basis for further discussion. Further, we highlight recent implementation progress in parallelization, data distribution, and query optimization showing their effects on real-life use cases.
CardioTF, a database of deconstructing transcriptional circuits in the heart system
2016-01-01
Background: Information on cardiovascular gene transcription is fragmented and far behind the present requirements of the systems biology field. To create a comprehensive source of data for cardiovascular gene regulation and to facilitate a deeper understanding of genomic data, the CardioTF database was constructed. The purpose of this database is to collate information on cardiovascular transcription factors (TFs), position weight matrices (PWMs), and enhancer sequences discovered using the ChIP-seq method. Methods: The Naïve-Bayes algorithm was used to classify literature and identify all PubMed abstracts on cardiovascular development. The natural language learning tool GNAT was then used to identify corresponding gene names embedded within these abstracts. Local Perl scripts were used to integrate and dump data from public databases into the MariaDB management system (MySQL). In-house R scripts were written to analyze and visualize the results. Results: Known cardiovascular TFs from humans and human homologs from fly, Ciona, zebrafish, frog, chicken, and mouse were identified and deposited in the database. PWMs from Jaspar, hPDI, and UniPROBE databases were deposited in the database and can be retrieved using their corresponding TF names. Gene enhancer regions from various sources of ChIP-seq data were deposited into the database and were able to be visualized by graphical output. Besides biocuration, mouse homologs of the 81 core cardiac TFs were selected using a Naïve-Bayes approach and then by intersecting four independent data sources: RNA profiling, expert annotation, PubMed abstracts and phenotype. Discussion: The CardioTF database can be used as a portal to construct transcriptional network of cardiac development. Availability and Implementation: Database URL: http://www.cardiosignal.org/database/cardiotf.html. PMID:27635320
CardioTF, a database of deconstructing transcriptional circuits in the heart system.
Zhen, Yisong
2016-01-01
Information on cardiovascular gene transcription is fragmented and far behind the present requirements of the systems biology field. To create a comprehensive source of data for cardiovascular gene regulation and to facilitate a deeper understanding of genomic data, the CardioTF database was constructed. The purpose of this database is to collate information on cardiovascular transcription factors (TFs), position weight matrices (PWMs), and enhancer sequences discovered using the ChIP-seq method. The Naïve-Bayes algorithm was used to classify literature and identify all PubMed abstracts on cardiovascular development. The natural language learning tool GNAT was then used to identify corresponding gene names embedded within these abstracts. Local Perl scripts were used to integrate and dump data from public databases into the MariaDB management system (MySQL). In-house R scripts were written to analyze and visualize the results. Known cardiovascular TFs from humans and human homologs from fly, Ciona, zebrafish, frog, chicken, and mouse were identified and deposited in the database. PWMs from Jaspar, hPDI, and UniPROBE databases were deposited in the database and can be retrieved using their corresponding TF names. Gene enhancer regions from various sources of ChIP-seq data were deposited into the database and were able to be visualized by graphical output. Besides biocuration, mouse homologs of the 81 core cardiac TFs were selected using a Naïve-Bayes approach and then by intersecting four independent data sources: RNA profiling, expert annotation, PubMed abstracts and phenotype. The CardioTF database can be used as a portal to construct transcriptional network of cardiac development. Database URL: http://www.cardiosignal.org/database/cardiotf.html.
Observational database for studies of nearby universe
NASA Astrophysics Data System (ADS)
Kaisina, E. I.; Makarov, D. I.; Karachentsev, I. D.; Kaisin, S. S.
2012-01-01
We present the description of a database of galaxies of the Local Volume (LVG), located within 10 Mpc around the Milky Way. It contains more than 800 objects. Based on an analysis of functional capabilities, we used the PostgreSQL DBMS as a management system for our LVG database. Applying semantic modelling methods, we developed a physical ER-model of the database. We describe the developed architecture of the database table structure, and the implemented web-access, available at http://www.sao.ru/lv/lvgdb.
Atlas - a data warehouse for integrative bioinformatics.
Shah, Sohrab P; Huang, Yong; Xu, Tao; Yuen, Macaire M S; Ling, John; Ouellette, B F Francis
2005-02-21
We present a biological data warehouse called Atlas that locally stores and integrates biological sequences, molecular interactions, homology information, functional annotations of genes, and biological ontologies. The goal of the system is to provide data, as well as a software infrastructure for bioinformatics research and development. The Atlas system is based on relational data models that we developed for each of the source data types. Data stored within these relational models are managed through Structured Query Language (SQL) calls that are implemented in a set of Application Programming Interfaces (APIs). The APIs include three languages: C++, Java, and Perl. The methods in these API libraries are used to construct a set of loader applications, which parse and load the source datasets into the Atlas database, and a set of toolbox applications which facilitate data retrieval. Atlas stores and integrates local instances of GenBank, RefSeq, UniProt, Human Protein Reference Database (HPRD), Biomolecular Interaction Network Database (BIND), Database of Interacting Proteins (DIP), Molecular Interactions Database (MINT), IntAct, NCBI Taxonomy, Gene Ontology (GO), Online Mendelian Inheritance in Man (OMIM), LocusLink, Entrez Gene and HomoloGene. The retrieval APIs and toolbox applications are critical components that offer end-users flexible, easy, integrated access to this data. We present use cases that use Atlas to integrate these sources for genome annotation, inference of molecular interactions across species, and gene-disease associations. The Atlas biological data warehouse serves as data infrastructure for bioinformatics research and development. It forms the backbone of the research activities in our laboratory and facilitates the integration of disparate, heterogeneous biological sources of data enabling new scientific inferences. Atlas achieves integration of diverse data sets at two levels. First, Atlas stores data of similar types using common data models, enforcing the relationships between data types. Second, integration is achieved through a combination of APIs, ontology, and tools. The Atlas software is freely available under the GNU General Public License at: http://bioinformatics.ubc.ca/atlas/
Atlas – a data warehouse for integrative bioinformatics
Shah, Sohrab P; Huang, Yong; Xu, Tao; Yuen, Macaire MS; Ling, John; Ouellette, BF Francis
2005-01-01
Background We present a biological data warehouse called Atlas that locally stores and integrates biological sequences, molecular interactions, homology information, functional annotations of genes, and biological ontologies. The goal of the system is to provide data, as well as a software infrastructure for bioinformatics research and development. Description The Atlas system is based on relational data models that we developed for each of the source data types. Data stored within these relational models are managed through Structured Query Language (SQL) calls that are implemented in a set of Application Programming Interfaces (APIs). The APIs include three languages: C++, Java, and Perl. The methods in these API libraries are used to construct a set of loader applications, which parse and load the source datasets into the Atlas database, and a set of toolbox applications which facilitate data retrieval. Atlas stores and integrates local instances of GenBank, RefSeq, UniProt, Human Protein Reference Database (HPRD), Biomolecular Interaction Network Database (BIND), Database of Interacting Proteins (DIP), Molecular Interactions Database (MINT), IntAct, NCBI Taxonomy, Gene Ontology (GO), Online Mendelian Inheritance in Man (OMIM), LocusLink, Entrez Gene and HomoloGene. The retrieval APIs and toolbox applications are critical components that offer end-users flexible, easy, integrated access to this data. We present use cases that use Atlas to integrate these sources for genome annotation, inference of molecular interactions across species, and gene-disease associations. Conclusion The Atlas biological data warehouse serves as data infrastructure for bioinformatics research and development. It forms the backbone of the research activities in our laboratory and facilitates the integration of disparate, heterogeneous biological sources of data enabling new scientific inferences. Atlas achieves integration of diverse data sets at two levels. First, Atlas stores data of similar types using common data models, enforcing the relationships between data types. Second, integration is achieved through a combination of APIs, ontology, and tools. The Atlas software is freely available under the GNU General Public License at: PMID:15723693
JBioWH: an open-source Java framework for bioinformatics data integration
Vera, Roberto; Perez-Riverol, Yasset; Perez, Sonia; Ligeti, Balázs; Kertész-Farkas, Attila; Pongor, Sándor
2013-01-01
The Java BioWareHouse (JBioWH) project is an open-source platform-independent programming framework that allows a user to build his/her own integrated database from the most popular data sources. JBioWH can be used for intensive querying of multiple data sources and the creation of streamlined task-specific data sets on local PCs. JBioWH is based on a MySQL relational database scheme and includes JAVA API parser functions for retrieving data from 20 public databases (e.g. NCBI, KEGG, etc.). It also includes a client desktop application for (non-programmer) users to query data. In addition, JBioWH can be tailored for use in specific circumstances, including the handling of massive queries for high-throughput analyses or CPU intensive calculations. The framework is provided with complete documentation and application examples and it can be downloaded from the Project Web site at http://code.google.com/p/jbiowh. A MySQL server is available for demonstration purposes at hydrax.icgeb.trieste.it:3307. Database URL: http://code.google.com/p/jbiowh PMID:23846595
Mining Bug Databases for Unidentified Software Vulnerabilities
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dumidu Wijayasekara; Milos Manic; Jason Wright
2012-06-01
Identifying software vulnerabilities is becoming more important as critical and sensitive systems increasingly rely on complex software systems. It has been suggested in previous work that some bugs are only identified as vulnerabilities long after the bug has been made public. These vulnerabilities are known as hidden impact vulnerabilities. This paper discusses the feasibility and necessity to mine common publicly available bug databases for vulnerabilities that are yet to be identified. We present bug database analysis of two well known and frequently used software packages, namely Linux kernel and MySQL. It is shown that for both Linux and MySQL, amore » significant portion of vulnerabilities that were discovered for the time period from January 2006 to April 2011 were hidden impact vulnerabilities. It is also shown that the percentage of hidden impact vulnerabilities has increased in the last two years, for both software packages. We then propose an improved hidden impact vulnerability identification methodology based on text mining bug databases, and conclude by discussing a few potential problems faced by such a classifier.« less
JBioWH: an open-source Java framework for bioinformatics data integration.
Vera, Roberto; Perez-Riverol, Yasset; Perez, Sonia; Ligeti, Balázs; Kertész-Farkas, Attila; Pongor, Sándor
2013-01-01
The Java BioWareHouse (JBioWH) project is an open-source platform-independent programming framework that allows a user to build his/her own integrated database from the most popular data sources. JBioWH can be used for intensive querying of multiple data sources and the creation of streamlined task-specific data sets on local PCs. JBioWH is based on a MySQL relational database scheme and includes JAVA API parser functions for retrieving data from 20 public databases (e.g. NCBI, KEGG, etc.). It also includes a client desktop application for (non-programmer) users to query data. In addition, JBioWH can be tailored for use in specific circumstances, including the handling of massive queries for high-throughput analyses or CPU intensive calculations. The framework is provided with complete documentation and application examples and it can be downloaded from the Project Web site at http://code.google.com/p/jbiowh. A MySQL server is available for demonstration purposes at hydrax.icgeb.trieste.it:3307. Database URL: http://code.google.com/p/jbiowh.
Experience with Multi-Tier Grid MySQL Database Service Resiliency at BNL
NASA Astrophysics Data System (ADS)
Wlodek, Tomasz; Ernst, Michael; Hover, John; Katramatos, Dimitrios; Packard, Jay; Smirnov, Yuri; Yu, Dantong
2011-12-01
We describe the use of F5's BIG-IP smart switch technology (3600 Series and Local Traffic Manager v9.0) to provide load balancing and automatic fail-over to multiple Grid services (GUMS, VOMS) and their associated back-end MySQL databases. This resiliency is introduced in front of the external application servers and also for the back-end database systems, which is what makes it "multi-tier". The combination of solutions chosen to ensure high availability of the services, in particular the database replication and fail-over mechanism, are discussed in detail. The paper explains the design and configuration of the overall system, including virtual servers, machine pools, and health monitors (which govern routing), as well as the master-slave database scheme and fail-over policies and procedures. Pre-deployment planning and stress testing will be outlined. Integration of the systems with our Nagios-based facility monitoring and alerting is also described. And application characteristics of GUMS and VOMS which enable effective clustering will be explained. We then summarize our practical experiences and real-world scenarios resulting from operating a major US Grid center, and assess the applicability of our approach to other Grid services in the future.
HAEdb: a novel interactive, locus-specific mutation database for the C1 inhibitor gene.
Kalmár, Lajos; Hegedüs, Tamás; Farkas, Henriette; Nagy, Melinda; Tordai, Attila
2005-01-01
Hereditary angioneurotic edema (HAE) is an autosomal dominant disorder characterized by episodic local subcutaneous and submucosal edema and is caused by the deficiency of the activated C1 esterase inhibitor protein (C1-INH or C1INH; approved gene symbol SERPING1). Published C1-INH mutations are represented in large universal databases (e.g., OMIM, HGMD), but these databases update their data rather infrequently, they are not interactive, and they do not allow searches according to different criteria. The HAEdb, a C1-INH gene mutation database (http://hae.biomembrane.hu) was created to contribute to the following expectations: 1) help the comprehensive collection of information on genetic alterations of the C1-INH gene; 2) create a database in which data can be searched and compared according to several flexible criteria; and 3) provide additional help in new mutation identification. The website uses MySQL, an open-source, multithreaded, relational database management system. The user-friendly graphical interface was written in the PHP web programming language. The website consists of two main parts, the freely browsable search function, and the password-protected data deposition function. Mutations of the C1-INH gene are divided in two parts: gross mutations involving DNA fragments >1 kb, and micro mutations encompassing all non-gross mutations. Several attributes (e.g., affected exon, molecular consequence, family history) are collected for each mutation in a standardized form. This database may facilitate future comprehensive analyses of C1-INH mutations and also provide regular help for molecular diagnostic testing of HAE patients in different centers.
Time series patterns and language support in DBMS
NASA Astrophysics Data System (ADS)
Telnarova, Zdenka
2017-07-01
This contribution is focused on pattern type Time Series as a rich in semantics representation of data. Some example of implementation of this pattern type in traditional Data Base Management Systems is briefly presented. There are many approaches how to manipulate with patterns and query patterns. Crucial issue can be seen in systematic approach to pattern management and specific pattern query language which takes into consideration semantics of patterns. Query language SQL-TS for manipulating with patterns is shown on Time Series data.
Lessons Learned With a Global Graph and Ozone Widget Framework (OWF) Testbed
2013-05-01
of operating system and database environments. The following is one example. Requirements are: Java 1.6 + and a Relational Database Management...We originally tried to use MySQL as our database, because we were more familiar with it, but since the database dumps as well as most of the...Global Graph Rest Services In order to set up the Global Graph Rest Services, you will need to have the following dependencies installed: Java 1.6
Integrated Substrate and Thin Film Design Methods
1999-02-01
Proper Representation Once the required chemical databases had been converted to the Excel format, VBA macros were written to convert chemical...ternary systems databases were imported from MS Excel to MS Access to implement SQL queries. Further, this database was connected via an ODBC model, to the... VBA macro, corresponding to each of the elements A, B, and C, respectively. The B loop began with the next alphabetical choice of element symbols
Incremental Query Rewriting with Resolution
NASA Astrophysics Data System (ADS)
Riazanov, Alexandre; Aragão, Marcelo A. T.
We address the problem of semantic querying of relational databases (RDB) modulo knowledge bases using very expressive knowledge representation formalisms, such as full first-order logic or its various fragments. We propose to use a resolution-based first-order logic (FOL) reasoner for computing schematic answers to deductive queries, with the subsequent translation of these schematic answers to SQL queries which are evaluated using a conventional relational DBMS. We call our method incremental query rewriting, because an original semantic query is rewritten into a (potentially infinite) series of SQL queries. In this chapter, we outline the main idea of our technique - using abstractions of databases and constrained clauses for deriving schematic answers, and provide completeness and soundness proofs to justify the applicability of this technique to the case of resolution for FOL without equality. The proposed method can be directly used with regular RDBs, including legacy databases. Moreover, we propose it as a potential basis for an efficient Web-scale semantic search technology.
NASA Astrophysics Data System (ADS)
Barnsley, R. M.; Steele, Iain A.; Smith, R. J.; Mawson, Neil R.
2014-07-01
The Small Telescopes Installed at the Liverpool Telescope (STILT) project has been in operation since March 2009, collecting data with three wide field unfiltered cameras: SkycamA, SkycamT and SkycamZ. To process the data, a pipeline was developed to automate source extraction, catalogue cross-matching, photometric calibration and database storage. In this paper, modifications and further developments to this pipeline will be discussed, including a complete refactor of the pipeline's codebase into Python, migration of the back-end database technology from MySQL to PostgreSQL, and changing the catalogue used for source cross-matching from USNO-B1 to APASS. In addition to this, details will be given relating to the development of a preliminary front-end to the source extracted database which will allow a user to perform common queries such as cone searches and light curve comparisons of catalogue and non-catalogue matched objects. Some next steps and future ideas for the project will also be presented.
An integrated data-analysis and database system for AMS 14C
NASA Astrophysics Data System (ADS)
Kjeldsen, Henrik; Olsen, Jesper; Heinemeier, Jan
2010-04-01
AMSdata is the name of a combined database and data-analysis system for AMS 14C and stable-isotope work that has been developed at Aarhus University. The system (1) contains routines for data analysis of AMS and MS data, (2) allows a flexible and accurate description of sample extraction and pretreatment, also when samples are split into several fractions, and (3) keeps track of all measured, calculated and attributed data. The structure of the database is flexible and allows an unlimited number of measurement and pretreatment procedures. The AMS 14C data analysis routine is fairly advanced and flexible, and it can be easily optimized for different kinds of measuring processes. Technically, the system is based on a Microsoft SQL server and includes stored SQL procedures for the data analysis. Microsoft Office Access is used for the (graphical) user interface, and in addition Excel, Word and Origin are exploited for input and output of data, e.g. for plotting data during data analysis.
Development of a medical module for disaster information systems.
Calik, Elif; Atilla, Rıdvan; Kaya, Hilal; Aribaş, Alirıza; Cengiz, Hakan; Dicle, Oğuz
2014-01-01
This study aims to improve a medical module which provides a real-time medical information flow about pre-hospital processes that gives health care in disasters; transferring, storing and processing the records that are in electronic media and over internet as a part of disaster information systems. In this study which is handled within the frame of providing information flow among professionals in a disaster case, to supply the coordination of healthcare team and transferring complete information to specified people at real time, Microsoft Access database and SQL query language were used to inform database applications. System was prepared on Microsoft .Net platform using C# language. Disaster information system-medical module was designed to be used in disaster area, field hospital, nearby hospitals, temporary inhabiting areas like tent city, vehicles that are used for dispatch, and providing information flow between medical officials and data centres. For fast recording of the disaster victim data, accessing to database which was used by health care professionals was provided (or granted) among analysing process steps and creating minimal datasets. Database fields were created in the manner of giving opportunity to enter new data and search old data which is recorded before disaster. Web application which provides access such as data entry to the database and searching towards the designed interfaces according to the login credentials access level. In this study, homepage and users' interfaces which were built on database in consequence of system analyses were provided with www.afmedinfo.com web site to the user access. With this study, a recommendation was made about how to use disaster-based information systems in the field of health. Awareness has been developed about the fact that disaster information system should not be perceived only as an early warning system. Contents and the differences of the health care practices of disaster information systems were revealed. A web application was developed supplying a link between the user and the database to make date entry and data query practices by the help of the developed interfaces.
Development of Integrated Information System for Travel Bureau Company
NASA Astrophysics Data System (ADS)
Karma, I. G. M.; Susanti, J.
2018-01-01
Related to the effectiveness of decision-making by the management of travel bureau company, especially by managers, information serves frequent delays or incomplete. Although already computer-assisted, the existing application-based is used only handle one particular activity only, not integrated. This research is intended to produce an integrated information system that handles the overall operational activities of the company. By applying the object-oriented system development approach, the system is built with Visual Basic. Net programming language and MySQL database package. The result is a system that consists of 4 (four) separated program packages, including Reservation System, AR System, AP System and Accounting System. Based on the output, we can conclude that this system is able to produce integrated information that related to the problem of reservation, operational and financial those produce up-to-date information in order to support operational activities and decisionmaking process by related parties.
Research on sudden environmental pollution public service platform construction based on WebGIS
NASA Astrophysics Data System (ADS)
Bi, T. P.; Gao, D. Y.; Zhong, X. Y.
2016-08-01
In order to actualize the social sharing and service of the emergency-response information for sudden pollution accidents, the public can share the risk source information service, dangerous goods control technology service and so on, The SQL Server and ArcSDE software are used to establish a spatial database to restore all kinds of information including risk sources, hazardous chemicals and handling methods in case of accidents. Combined with Chinese atmospheric environmental assessment standards, the SCREEN3 atmospheric dispersion model and one-dimensional liquid diffusion model are established to realize the query of related information and the display of the diffusion effect under B/S structure. Based on the WebGIS technology, C#.Net language is used to develop the sudden environmental pollution public service platform. As a result, the public service platform can make risk assessments and provide the best emergency processing services.
Design and implementation of ticket price forecasting system
NASA Astrophysics Data System (ADS)
Li, Yuling; Li, Zhichao
2018-05-01
With the advent of the aviation travel industry, a large number of data mining technologies have been developed to increase profits for airlines in the past two decades. The implementation of the digital optimization strategy leads to price discrimination, for example, similar seats on the same flight are purchased at different prices, depending on the time of purchase, the supplier, and so on. Price fluctuations make the prediction of ticket prices have application value. In this paper, a combination of ARMA algorithm and random forest algorithm is proposed to predict the price of air ticket. The experimental results show that the model is more reliable by comparing the forecasting results with the actual results of each price model. The model is helpful for passengers to buy tickets and to save money. Based on the proposed model, using Python language and SQL Server database, we design and implement the ticket price forecasting system.
TargetCompare: A web interface to compare simultaneous miRNAs targets
Moreira, Fabiano Cordeiro; Dustan, Bruno; Hamoy, Igor G; Ribeiro-dos-Santos, André M; dos Santos, Ândrea Ribeiro
2014-01-01
MicroRNAs (miRNAs) are small non-coding nucleotide sequences between 17 and 25 nucleotides in length that primarily function in the regulation of gene expression. A since miRNA has thousand of predict targets in a complex, regulatory cell signaling network. Therefore, it is of interest to study multiple target genes simultaneously. Hence, we describe a web tool (developed using Java programming language and MySQL database server) to analyse multiple targets of pre-selected miRNAs. We cross validated the tool in eight most highly expressed miRNAs in the antrum region of stomach. This helped to identify 43 potential genes that are target of at least six of the referred miRNAs. The developed tool aims to reduce the randomness and increase the chance of selecting strong candidate target genes and miRNAs responsible for playing important roles in the studied tissue. Availability http://lghm.ufpa.br/targetcompare PMID:25352731
TargetCompare: A web interface to compare simultaneous miRNAs targets.
Moreira, Fabiano Cordeiro; Dustan, Bruno; Hamoy, Igor G; Ribeiro-Dos-Santos, André M; Dos Santos, Andrea Ribeiro
2014-01-01
MicroRNAs (miRNAs) are small non-coding nucleotide sequences between 17 and 25 nucleotides in length that primarily function in the regulation of gene expression. A since miRNA has thousand of predict targets in a complex, regulatory cell signaling network. Therefore, it is of interest to study multiple target genes simultaneously. Hence, we describe a web tool (developed using Java programming language and MySQL database server) to analyse multiple targets of pre-selected miRNAs. We cross validated the tool in eight most highly expressed miRNAs in the antrum region of stomach. This helped to identify 43 potential genes that are target of at least six of the referred miRNAs. The developed tool aims to reduce the randomness and increase the chance of selecting strong candidate target genes and miRNAs responsible for playing important roles in the studied tissue. http://lghm.ufpa.br/targetcompare.
Development of Web-Based Menu Planning Support System and its Solution Using Genetic Algorithm
NASA Astrophysics Data System (ADS)
Kashima, Tomoko; Matsumoto, Shimpei; Ishii, Hiroaki
2009-10-01
Recently lifestyle-related diseases have become an object of public concern, while at the same time people are being more health conscious. As an essential factor for causing the lifestyle-related diseases, we assume that the knowledge circulation on dietary habits is still insufficient. This paper focuses on everyday meals close to our life and proposes a well-balanced menu planning system as a preventive measure of lifestyle-related diseases. The system is developed by using a Web-based frontend and it provides multi-user services and menu information sharing capabilities like social networking services (SNS). The system is implemented on a Web server running Apache (HTTP server software), MySQL (database management system), and PHP (scripting language for dynamic Web pages). For the menu planning, a genetic algorithm is applied by understanding this problem as multidimensional 0-1 integer programming.
LSST communications middleware implementation
NASA Astrophysics Data System (ADS)
Mills, Dave; Schumacher, German; Lotz, Paul
2016-07-01
The LSST communications middleware is based on a set of software abstractions; which provide standard interfaces for common communications services. The observatory requires communication between diverse subsystems, implemented by different contractors, and comprehensive archiving of subsystem status data. The Service Abstraction Layer (SAL) is implemented using open source packages that implement open standards of DDS (Data Distribution Service1) for data communication, and SQL (Standard Query Language) for database access. For every subsystem, abstractions for each of the Telemetry datastreams, along with Command/Response and Events, have been agreed with the appropriate component vendor (such as Dome, TMA, Hexapod), and captured in ICD's (Interface Control Documents).The OpenSplice (Prismtech) Community Edition of DDS provides an LGPL licensed distribution which may be freely redistributed. The availability of the full source code provides assurances that the project will be able to maintain it over the full 10 year survey, independent of the fortunes of the original providers.
Virtual file system on NoSQL for processing high volumes of HL7 messages.
Kimura, Eizen; Ishihara, Ken
2015-01-01
The Standardized Structured Medical Information Exchange (SS-MIX) is intended to be the standard repository for HL7 messages that depend on a local file system. However, its scalability is limited. We implemented a virtual file system using NoSQL to incorporate modern computing technology into SS-MIX and allow the system to integrate local patient IDs from different healthcare systems into a universal system. We discuss its implementation using the database MongoDB and describe its performance in a case study.
Robinson, Judas; de Lusignan, Simon; Kostkova, Patty; Madge, Bruce; Marsh, A; Biniaris, C
2006-01-01
Rich Site Summary (RSS) feeds are a method for disseminating and syndicating the contents of a website using extensible mark-up language (XML). The Primary Care Electronic Library (PCEL) distributes recent additions to the site in the form of an RSS feed. When new resources are added to PCEL, they are manually assigned medical subject headings (MeSH terms), which are then automatically mapped to SNOMED-CT terms using the Unified Medical Language System (UMLS) Metathesaurus. The library is thus searchable using MeSH or SNOMED-CT. Our syndicate partner wished to have remote access to PCEL coronary heart disease (CHD) information resources based on SNOMED-CT search terms. To pilot the supply of relevant information resources in response to clinically coded requests, using RSS syndication for transmission between web servers. Our syndicate partner provided a list of CHD SNOMED-CT terms to its end-users, a list which was coded according to UMLS specifications. When the end-user requested relevant information resources, this request was relayed from our syndicate partner's web server to the PCEL web server. The relevant resources were retrieved from the PCEL MySQL database. This database is accessed using a server side scripting language (PHP), which enables the production of dynamic RSS feeds on the basis of Source Asserted Identifiers (CODEs) contained in UMLS. Retrieving resources using SNOMED-CT terms using syndication can be used to build a functioning application. The process from request to display of syndicated resources took less than one second. The results of the pilot illustrate that it is possible to exchange data between servers using RSS syndication. This method could be utilised dynamically to supply digital library resources to a clinical system with SNOMED-CT data used as the standard of reference.
2010-07-01
http://www.iono.noa.gr/ElectronDensity/EDProfile.php The web service has been developed with the following open source tools: a) PHP , for the... MySQL for the database, which was based on the enhancement of the DIAS database. Below we present some screen shots to demonstrate the functionality
Case and Model Driven Dynamic Template Linking
2005-06-01
store the trips in a PostgreSQL database (www.postgresql.org) and the values stored in this database could be re-used to provide values for similar trips...Preferences YES Yes but limited Print Form YES NO Close Form YES NO Just “X” Quit YES NO Just “X” Show User Action History YES NO 6.5 DAML Ontologies
DataSpread: Unifying Databases and Spreadsheets.
Bendre, Mangesh; Sun, Bofan; Zhang, Ding; Zhou, Xinyan; Chang, Kevin ChenChuan; Parameswaran, Aditya
2015-08-01
Spreadsheet software is often the tool of choice for ad-hoc tabular data management, processing, and visualization, especially on tiny data sets. On the other hand, relational database systems offer significant power, expressivity, and efficiency over spreadsheet software for data management, while lacking in the ease of use and ad-hoc analysis capabilities. We demonstrate DataSpread, a data exploration tool that holistically unifies databases and spreadsheets. It continues to offer a Microsoft Excel-based spreadsheet front-end, while in parallel managing all the data in a back-end database, specifically, PostgreSQL. DataSpread retains all the advantages of spreadsheets, including ease of use, ad-hoc analysis and visualization capabilities, and a schema-free nature, while also adding the advantages of traditional relational databases, such as scalability and the ability to use arbitrary SQL to import, filter, or join external or internal tables and have the results appear in the spreadsheet. DataSpread needs to reason about and reconcile differences in the notions of schema, addressing of cells and tuples, and the current "pane" (which exists in spreadsheets but not in traditional databases), and support data modifications at both the front-end and the back-end. Our demonstration will center on our first and early prototype of the DataSpread, and will give the attendees a sense for the enormous data exploration capabilities offered by unifying spreadsheets and databases.
DataSpread: Unifying Databases and Spreadsheets
Bendre, Mangesh; Sun, Bofan; Zhang, Ding; Zhou, Xinyan; Chang, Kevin ChenChuan; Parameswaran, Aditya
2015-01-01
Spreadsheet software is often the tool of choice for ad-hoc tabular data management, processing, and visualization, especially on tiny data sets. On the other hand, relational database systems offer significant power, expressivity, and efficiency over spreadsheet software for data management, while lacking in the ease of use and ad-hoc analysis capabilities. We demonstrate DataSpread, a data exploration tool that holistically unifies databases and spreadsheets. It continues to offer a Microsoft Excel-based spreadsheet front-end, while in parallel managing all the data in a back-end database, specifically, PostgreSQL. DataSpread retains all the advantages of spreadsheets, including ease of use, ad-hoc analysis and visualization capabilities, and a schema-free nature, while also adding the advantages of traditional relational databases, such as scalability and the ability to use arbitrary SQL to import, filter, or join external or internal tables and have the results appear in the spreadsheet. DataSpread needs to reason about and reconcile differences in the notions of schema, addressing of cells and tuples, and the current “pane” (which exists in spreadsheets but not in traditional databases), and support data modifications at both the front-end and the back-end. Our demonstration will center on our first and early prototype of the DataSpread, and will give the attendees a sense for the enormous data exploration capabilities offered by unifying spreadsheets and databases. PMID:26900487
A Comparison of Different Database Technologies for the CMS AsyncStageOut Transfer Database
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ciangottini, D.; Balcas, J.; Mascheroni, M.
AsyncStageOut (ASO) is the component of the CMS distributed data analysis system (CRAB) that manages users transfers in a centrally controlled way using the File Transfer System (FTS3) at CERN. It addresses a major weakness of the previous, decentralized model, namely that the transfer of the user’s output data to a single remote site was part of the job execution, resulting in inefficient use of job slots and an unacceptable failure rate. Currently ASO manages up to 600k files of various sizes per day from more than 500 users per month, spread over more than 100 sites. ASO uses amore » NoSQL database (CouchDB) as internal bookkeeping and as way to communicate with other CRAB components. Since ASO/CRAB were put in production in 2014, the number of transfers constantly increased up to a point where the pressure to the central CouchDB instance became critical, creating new challenges for the system scalability, performance, and monitoring. This forced a re-engineering of the ASO application to increase its scalability and lowering its operational effort. In this contribution we present a comparison of the performance of the current NoSQL implementation and a new SQL implementation, and how their different strengths and features influenced the design choices and operational experience. We also discuss other architectural changes introduced in the system to handle the increasing load and latency in delivering output to the user.« less
A comparison of different database technologies for the CMS AsyncStageOut transfer database
NASA Astrophysics Data System (ADS)
Ciangottini, D.; Balcas, J.; Mascheroni, M.; Rupeika, E. A.; Vaandering, E.; Riahi, H.; Silva, J. M. D.; Hernandez, J. M.; Belforte, S.; Ivanov, T. T.
2017-10-01
AsyncStageOut (ASO) is the component of the CMS distributed data analysis system (CRAB) that manages users transfers in a centrally controlled way using the File Transfer System (FTS3) at CERN. It addresses a major weakness of the previous, decentralized model, namely that the transfer of the user’s output data to a single remote site was part of the job execution, resulting in inefficient use of job slots and an unacceptable failure rate. Currently ASO manages up to 600k files of various sizes per day from more than 500 users per month, spread over more than 100 sites. ASO uses a NoSQL database (CouchDB) as internal bookkeeping and as way to communicate with other CRAB components. Since ASO/CRAB were put in production in 2014, the number of transfers constantly increased up to a point where the pressure to the central CouchDB instance became critical, creating new challenges for the system scalability, performance, and monitoring. This forced a re-engineering of the ASO application to increase its scalability and lowering its operational effort. In this contribution we present a comparison of the performance of the current NoSQL implementation and a new SQL implementation, and how their different strengths and features influenced the design choices and operational experience. We also discuss other architectural changes introduced in the system to handle the increasing load and latency in delivering output to the user.
Zaman, Babar; Khandekar, Rajiv; Al Shahwan, Sami; Song, Jonathan; Al Jadaan, Ibrahim; Al Jiasim, Leyla; Owaydha, Ohood; Asghar, Nasira; Hijazi, Amar; Edward, Deepak P.
2014-01-01
In this brief communication, we present the steps used to establish a web-based congenital glaucoma registry at our institution. The contents of a case report form (CRF) were developed by a group of glaucoma subspecialists. Information Technology (IT) specialists used Lime Survey softwareTM to create an electronic CRF. A MY Structured Query Language (MySQL) server was used as a database with a virtual machine operating system. Two ophthalmologists and 2 IT specialists worked for 7 hours, and a biostatistician and a data registrar worked for 24 hours each to establish the electronic CRF. Using the CRF which was transferred to the Lime survey tool, and the MYSQL server application, data could be directly stored in spreadsheet programs that included Microsoft Excel, SPSS, and R-Language and queried in real-time. In a pilot test, clinical data from 80 patients with congenital glaucoma were entered into the registry and successful descriptive analysis and data entry validation was performed. A web-based disease registry was established in a short period of time in a cost-efficient manner using available resources and a team-based approach. PMID:24791112
Zaman, Babar; Khandekar, Rajiv; Al Shahwan, Sami; Song, Jonathan; Al Jadaan, Ibrahim; Al Jiasim, Leyla; Owaydha, Ohood; Asghar, Nasira; Hijazi, Amar; Edward, Deepak P
2014-01-01
In this brief communication, we present the steps used to establish a web-based congenital glaucoma registry at our institution. The contents of a case report form (CRF) were developed by a group of glaucoma subspecialists. Information Technology (IT) specialists used Lime Survey softwareTM to create an electronic CRF. A MY Structured Query Language (MySQL) server was used as a database with a virtual machine operating system. Two ophthalmologists and 2 IT specialists worked for 7 hours, and a biostatistician and a data registrar worked for 24 hours each to establish the electronic CRF. Using the CRF which was transferred to the Lime survey tool, and the MYSQL server application, data could be directly stored in spreadsheet programs that included Microsoft Excel, SPSS, and R-Language and queried in real-time. In a pilot test, clinical data from 80 patients with congenital glaucoma were entered into the registry and successful descriptive analysis and data entry validation was performed. A web-based disease registry was established in a short period of time in a cost-efficient manner using available resources and a team-based approach.
Toward Phase IV, Populating the WOVOdat Database
NASA Astrophysics Data System (ADS)
Ratdomopurbo, A.; Newhall, C. G.; Schwandner, F. M.; Selva, J.; Ueda, H.
2009-12-01
One of challenges for volcanologists is the fact that more and more people are likely to live on volcanic slopes. Information about volcanic activity during unrest should be accurate and rapidly distributed. As unrest may lead to eruption, evacuation may be necessary to minimize damage and casualties. The decision to evacuate people is usually based on the interpretation of monitoring data. Over the past several decades, monitoring volcanoes has used more and more sophisticated instruments. A huge volume of data is collected in order to understand the state of activity and behaviour of a volcano. WOVOdat, The World Organization of Volcano Observatories (WOVO) Database of Volcanic Unrest, will provide context within which scientists can interpret the state of their own volcano, during and between crises. After a decision during the 2000 IAVCEI General Assembly to create WOVOdat, development has passed through several phases, from Concept Development (Phase-I in 2000-2002), Database Design (Phase-II, 2003-2006) and Pilot Testing (Phase-III in 2007-2008). For WOVOdat to be operational, there are still two (2) steps to complete, which are: Database Population (Phase-IV) and Enhancement and Maintenance (Phase-V). Since January 2009, the WOVOdat project is hosted by Earth Observatory of Singapore for at least a 5-year period. According to the original planning in 2002, this 5-year period will be used for completing the Phase-IV. As the WOVOdat design is not yet tested for all types of data, 2009 is still reserved for building the back-end relational database management system (RDBMS) of WOVOdat and testing it with more complex data. Fine-tuning of the WOVOdat’s RDBMS design is being done with each new upload of observatory data. The next and main phase of WOVOdat development will be data population, managing data transfer from multiple observatory formats to WOVOdat format. Data population will depend on two important things, the availability of SQL database in volcano observatories and their data sharing policy. Hence, a strong collaboration with every WOVO observatory is important. For some volcanoes where the data are not in an SQL system, the WOVOdat project will help scientists working on the volcano to start building an SQL database.
Killion, Patrick J; Sherlock, Gavin; Iyer, Vishwanath R
2003-01-01
Background The power of microarray analysis can be realized only if data is systematically archived and linked to biological annotations as well as analysis algorithms. Description The Longhorn Array Database (LAD) is a MIAME compliant microarray database that operates on PostgreSQL and Linux. It is a fully open source version of the Stanford Microarray Database (SMD), one of the largest microarray databases. LAD is available at Conclusions Our development of LAD provides a simple, free, open, reliable and proven solution for storage and analysis of two-color microarray data. PMID:12930545
A future Outlook: Web based Simulation of Hydrodynamic models
NASA Astrophysics Data System (ADS)
Islam, A. S.; Piasecki, M.
2003-12-01
Despite recent advances to present simulation results as 3D graphs or animation contours, the modeling user community still faces some shortcomings when trying to move around and analyze data. Typical problems include the lack of common platforms with standard vocabulary to exchange simulation results from different numerical models, insufficient descriptions about data (metadata), lack of robust search and retrieval tools for data, and difficulties to reuse simulation domain knowledge. This research demonstrates how to create a shared simulation domain in the WWW and run a number of models through multi-user interfaces. Firstly, meta-datasets have been developed to describe hydrodynamic model data based on geographic metadata standard (ISO 19115) that has been extended to satisfy the need of the hydrodynamic modeling community. The Extended Markup Language (XML) is used to publish this metadata by the Resource Description Framework (RDF). Specific domain ontology for Web Based Simulation (WBS) has been developed to explicitly define vocabulary for the knowledge based simulation system. Subsequently, this knowledge based system is converted into an object model using Meta Object Family (MOF). The knowledge based system acts as a Meta model for the object oriented system, which aids in reusing the domain knowledge. Specific simulation software has been developed based on the object oriented model. Finally, all model data is stored in an object relational database. Database back-ends help store, retrieve and query information efficiently. This research uses open source software and technology such as Java Servlet and JSP, Apache web server, Tomcat Servlet Engine, PostgresSQL databases, Protégé ontology editor, RDQL and RQL for querying RDF in semantic level, Jena Java API for RDF. Also, we use international standards such as the ISO 19115 metadata standard, and specifications such as XML, RDF, OWL, XMI, and UML. The final web based simulation product is deployed as Web Archive (WAR) files which is platform and OS independent and can be used by Windows, UNIX, or Linux. Keywords: Apache, ISO 19115, Java Servlet, Jena, JSP, Metadata, MOF, Linux, Ontology, OWL, PostgresSQL, Protégé, RDF, RDQL, RQL, Tomcat, UML, UNIX, Windows, WAR, XML
NASA Astrophysics Data System (ADS)
Shanafield, Harold; Shamblin, Stephanie; Devarakonda, Ranjeet; McMurry, Ben; Walker Beaty, Tammy; Wilson, Bruce; Cook, Robert B.
2011-02-01
The FLUXNET global network of regional flux tower networks serves to coordinate the regional and global analysis of eddy covariance based CO2, water vapor and energy flux measurements taken at more than 500 sites in continuous long-term operation. The FLUXNET database presently contains information about the location, characteristics, and data availability of each of these sites. To facilitate the coordination and distribution of this information, we redesigned the underlying database and associated web site. We chose the PostgreSQL database as a platform based on its performance, stability and GIS extensions. PostreSQL allows us to enhance our search and presentation capabilities, which will in turn provide increased functionality for users seeking to understand the FLUXNET data. The redesigned database will also significantly decrease the burden of managing such highly varied data. The website is being developed using the Drupal content management system, which provides many community-developed modules and a robust framework for custom feature development. In parallel, we are working with the regional networks to ensure that the information in the FLUXNET database is identical to that in the regional networks. Going forward, we also plan to develop an automated way to synchronize information with the regional networks.
Schacht Hansen, M; Dørup, J
2001-01-01
The Wireless Application Protocol technology implemented in newer mobile phones has built-in facilities for handling much of the information processing needed in clinical work. To test a practical approach we ported a relational database of the Danish pharmaceutical catalogue to Wireless Application Protocol using open source freeware at all steps. We used Apache 1.3 web software on a Linux server. Data containing the Danish pharmaceutical catalogue were imported from an ASCII file into a MySQL 3.22.32 database using a Practical Extraction and Report Language script for easy update of the database. Data were distributed in 35 interrelated tables. Each pharmaceutical brand name was given its own card with links to general information about the drug, active substances, contraindications etc. Access was available through 1) browsing therapeutic groups and 2) searching for a brand name. The database interface was programmed in the server-side scripting language PHP3. A free, open source Wireless Application Protocol gateway to a pharmaceutical catalogue was established to allow dial-in access independent of commercial Wireless Application Protocol service providers. The application was tested on the Nokia 7110 and Ericsson R320s cellular phones. We have demonstrated that Wireless Application Protocol-based access to a dynamic clinical database can be established using open source freeware. The project opens perspectives for a further integration of Wireless Application Protocol phone functions in clinical information processing: Global System for Mobile communication telephony for bilateral communication, asynchronous unilateral communication via e-mail and Short Message Service, built-in calculator, calendar, personal organizer, phone number catalogue and Dictaphone function via answering machine technology. An independent Wireless Application Protocol gateway may be placed within hospital firewalls, which may be an advantage with respect to security. However, if Wireless Application Protocol phones are to become effective tools for physicians, special attention must be paid to the limitations of the devices. Input tools of Wireless Application Protocol phones should be improved, for instance by increased use of speech control.
Hansen, Michael Schacht
2001-01-01
Background The Wireless Application Protocol technology implemented in newer mobile phones has built-in facilities for handling much of the information processing needed in clinical work. Objectives To test a practical approach we ported a relational database of the Danish pharmaceutical catalogue to Wireless Application Protocol using open source freeware at all steps. Methods We used Apache 1.3 web software on a Linux server. Data containing the Danish pharmaceutical catalogue were imported from an ASCII file into a MySQL 3.22.32 database using a Practical Extraction and Report Language script for easy update of the database. Data were distributed in 35 interrelated tables. Each pharmaceutical brand name was given its own card with links to general information about the drug, active substances, contraindications etc. Access was available through 1) browsing therapeutic groups and 2) searching for a brand name. The database interface was programmed in the server-side scripting language PHP3. Results A free, open source Wireless Application Protocol gateway to a pharmaceutical catalogue was established to allow dial-in access independent of commercial Wireless Application Protocol service providers. The application was tested on the Nokia 7110 and Ericsson R320s cellular phones. Conclusions We have demonstrated that Wireless Application Protocol-based access to a dynamic clinical database can be established using open source freeware. The project opens perspectives for a further integration of Wireless Application Protocol phone functions in clinical information processing: Global System for Mobile communication telephony for bilateral communication, asynchronous unilateral communication via e-mail and Short Message Service, built-in calculator, calendar, personal organizer, phone number catalogue and Dictaphone function via answering machine technology. An independent Wireless Application Protocol gateway may be placed within hospital firewalls, which may be an advantage with respect to security. However, if Wireless Application Protocol phones are to become effective tools for physicians, special attention must be paid to the limitations of the devices. Input tools of Wireless Application Protocol phones should be improved, for instance by increased use of speech control. PMID:11720946
Managing hydrological measurements for small and intermediate projects: RObsDat
NASA Astrophysics Data System (ADS)
Reusser, Dominik E.
2014-05-01
Hydrological measurements need good management for the data not to be lost. Multiple, often overlapping files from various loggers with heterogeneous formats need to be merged. Data needs to be validated and cleaned and subsequently converted to the format for the hydrological target application. Preferably, all these steps should be easily tracable. RObsDat is an R package designed to support such data management. It comes with a command line user interface to support hydrologists to enter and adjust their data in a database following the Observations Data Model (ODM) standard by QUASHI. RObsDat helps in the setup of the database within one of the free database engines MySQL, PostgreSQL or SQLite. It imports the controlled water vocabulary from the QUASHI web service and provides a smart interface between the hydrologist and the database: Already existing data entries are detected and duplicates avoided. The data import function converts different data table designes to make import simple. Cleaning and modifications of data are handled with a simple version control system. Variable and location names are treated in a user friendly way, accepting and processing multiple versions. A new development is the use of spacetime objects for subsequent processing.
Big Geo Data Services: From More Bytes to More Barrels
NASA Astrophysics Data System (ADS)
Misev, Dimitar; Baumann, Peter
2016-04-01
The data deluge is affecting the oil and gas industry just as much as many other industries. However, aside from the sheer volume there is the challenge of data variety, such as regular and irregular grids, multi-dimensional space/time grids, point clouds, and TINs and other meshes. A uniform conceptualization for modelling and serving them could save substantial effort, such as the proverbial "department of reformatting". The notion of a coverage actually can accomplish this. Its abstract model in ISO 19123 together with the concrete, interoperable OGC Coverage Implementation Schema (CIS), which is currently under adoption as ISO 19123-2, provieds a common platform for representing any n-D grid type, point clouds, and general meshes. This is paired by the OGC Web Coverage Service (WCS) together with its datacube analytics language, the OGC Web Coverage Processing Service (WCPS). The OGC WCS Core Reference Implementation, rasdaman, relies on Array Database technology, i.e. a NewSQL/NoSQL approach. It supports the grid part of coverages, with installations of 100+ TB known and single queries parallelized across 1,000+ cloud nodes. Recent research attempts to address the point cloud and mesh part through a unified query model. The Holy Grail envisioned is that these approaches can be merged into a single service interface at some time. We present both grid amd point cloud / mesh approaches and discuss status, implementation, standardization, and research perspectives, including a live demo.
mantisGRID: a grid platform for DICOM medical images management in Colombia and Latin America.
Garcia Ruiz, Manuel; Garcia Chaves, Alvin; Ruiz Ibañez, Carlos; Gutierrez Mazo, Jorge Mario; Ramirez Giraldo, Juan Carlos; Pelaez Echavarria, Alejandro; Valencia Diaz, Edison; Pelaez Restrepo, Gustavo; Montoya Munera, Edwin Nelson; Garcia Loaiza, Bernardo; Gomez Gonzalez, Sebastian
2011-04-01
This paper presents the mantisGRID project, an interinstitutional initiative from Colombian medical and academic centers aiming to provide medical grid services for Colombia and Latin America. The mantisGRID is a GRID platform, based on open source grid infrastructure that provides the necessary services to access and exchange medical images and associated information following digital imaging and communications in medicine (DICOM) and health level 7 standards. The paper focuses first on the data abstraction architecture, which is achieved via Open Grid Services Architecture Data Access and Integration (OGSA-DAI) services and supported by the Globus Toolkit. The grid currently uses a 30-Mb bandwidth of the Colombian High Technology Academic Network, RENATA, connected to Internet 2. It also includes a discussion on the relational database created to handle the DICOM objects that were represented using Extensible Markup Language Schema documents, as well as other features implemented such as data security, user authentication, and patient confidentiality. Grid performance was tested using the three current operative nodes and the results demonstrated comparable query times between the mantisGRID (OGSA-DAI) and Distributed mySQL databases, especially for a large number of records.
Naturally Occurring Human Urinary Peptides for Use in Diagnosis of Chronic Kidney Disease*
Good, David M.; Zürbig, Petra; Argilés, Àngel; Bauer, Hartwig W.; Behrens, Georg; Coon, Joshua J.; Dakna, Mohammed; Decramer, Stéphane; Delles, Christian; Dominiczak, Anna F.; Ehrich, Jochen H. H.; Eitner, Frank; Fliser, Danilo; Frommberger, Moritz; Ganser, Arnold; Girolami, Mark A.; Golovko, Igor; Gwinner, Wilfried; Haubitz, Marion; Herget-Rosenthal, Stefan; Jankowski, Joachim; Jahn, Holger; Jerums, George; Julian, Bruce A.; Kellmann, Markus; Kliem, Volker; Kolch, Walter; Krolewski, Andrzej S.; Luppi, Mario; Massy, Ziad; Melter, Michael; Neusüss, Christian; Novak, Jan; Peter, Karlheinz; Rossing, Kasper; Rupprecht, Harald; Schanstra, Joost P.; Schiffer, Eric; Stolzenburg, Jens-Uwe; Tarnow, Lise; Theodorescu, Dan; Thongboonkerd, Visith; Vanholder, Raymond; Weissinger, Eva M.; Mischak, Harald; Schmitt-Kopplin, Philippe
2010-01-01
Because of its availability, ease of collection, and correlation with physiology and pathology, urine is an attractive source for clinical proteomics/peptidomics. However, the lack of comparable data sets from large cohorts has greatly hindered the development of clinical proteomics. Here, we report the establishment of a reproducible, high resolution method for peptidome analysis of naturally occurring human urinary peptides and proteins, ranging from 800 to 17,000 Da, using samples from 3,600 individuals analyzed by capillary electrophoresis coupled to MS. All processed data were deposited in an Structured Query Language (SQL) database. This database currently contains 5,010 relevant unique urinary peptides that serve as a pool of potential classifiers for diagnosis and monitoring of various diseases. As an example, by using this source of information, we were able to define urinary peptide biomarkers for chronic kidney diseases, allowing diagnosis of these diseases with high accuracy. Application of the chronic kidney disease-specific biomarker set to an independent test cohort in the subsequent replication phase resulted in 85.5% sensitivity and 100% specificity. These results indicate the potential usefulness of capillary electrophoresis coupled to MS for clinical applications in the analysis of naturally occurring urinary peptides. PMID:20616184
A data management infrastructure for bridge monitoring
NASA Astrophysics Data System (ADS)
Jeong, Seongwoon; Byun, Jaewook; Kim, Daeyoung; Sohn, Hoon; Bae, In Hwan; Law, Kincho H.
2015-04-01
This paper discusses a data management infrastructure framework for bridge monitoring applications. As sensor technologies mature and become economically affordable, their deployment for bridge monitoring will continue to grow. Data management becomes a critical issue not only for storing the sensor data but also for integrating with the bridge model to support other functions, such as management, maintenance and inspection. The focus of this study is on the effective data management of bridge information and sensor data, which is crucial to structural health monitoring and life cycle management of bridge structures. We review the state-of-the-art of bridge information modeling and sensor data management, and propose a data management framework for bridge monitoring based on NoSQL database technologies that have been shown useful in handling high volume, time-series data and to flexibly deal with unstructured data schema. Specifically, Apache Cassandra and Mongo DB are deployed for the prototype implementation of the framework. This paper describes the database design for an XML-based Bridge Information Modeling (BrIM) schema, and the representation of sensor data using Sensor Model Language (SensorML). The proposed prototype data management framework is validated using data collected from the Yeongjong Bridge in Incheon, Korea.
A Recommender System in the Cyber Defense Domain
2014-03-27
monitoring software is a java based program sending updates to the database on the sensor machine. The host monitoring program gathers information about...3.2.2 Database. A MySQL database located on the sensor machine acts as the storage for the sensors on the network. Snort, Nmap, vulnerability scores, and...machine with the IDS and the recommender is labeled “sensor”. The recommender system code is written in java and compiled using java version 1.6.024
A System for Web-based Access to the HSOS Database
NASA Astrophysics Data System (ADS)
Lin, G.
Huairou Solar Observing Station's (HSOS) magnetogram and dopplergram are world-class instruments. Access to their data has opened to the world. Web-based access to the data will provide a powerful, convenient tool for data searching and solar physics. It is necessary that our data be provided to users via the Web when it is opened to the world. In this presentation, the author describes general design and programming construction of the system. The system will be generated by PHP and MySQL. The author also introduces basic feature of PHP and MySQL.
Toxicity ForeCaster (ToxCast™) Data
Data is organized into different data sets and includes descriptions of ToxCast chemicals and assays and files summarizing the screening results, a MySQL database, chemicals screened through Tox21, and available data generated from animal toxicity studies.
STINGRAY: system for integrated genomic resources and analysis.
Wagner, Glauber; Jardim, Rodrigo; Tschoeke, Diogo A; Loureiro, Daniel R; Ocaña, Kary A C S; Ribeiro, Antonio C B; Emmel, Vanessa E; Probst, Christian M; Pitaluga, André N; Grisard, Edmundo C; Cavalcanti, Maria C; Campos, Maria L M; Mattoso, Marta; Dávila, Alberto M R
2014-03-07
The STINGRAY system has been conceived to ease the tasks of integrating, analyzing, annotating and presenting genomic and expression data from Sanger and Next Generation Sequencing (NGS) platforms. STINGRAY includes: (a) a complete and integrated workflow (more than 20 bioinformatics tools) ranging from functional annotation to phylogeny; (b) a MySQL database schema, suitable for data integration and user access control; and (c) a user-friendly graphical web-based interface that makes the system intuitive, facilitating the tasks of data analysis and annotation. STINGRAY showed to be an easy to use and complete system for analyzing sequencing data. While both Sanger and NGS platforms are supported, the system could be faster using Sanger data, since the large NGS datasets could potentially slow down the MySQL database usage. STINGRAY is available at http://stingray.biowebdb.org and the open source code at http://sourceforge.net/projects/stingray-biowebdb/.
NASA Astrophysics Data System (ADS)
Dervos, D. A.; Skourlas, C.; Laiho, M.
2015-02-01
"DBTech VET Teachers" project is Leonardo da Vinci Multilateral Transfer of Innovation Project co-financed by the European Commission's Lifelong Learning Programme. The aim of the project is to renew the teaching of database technologies in VET (Vocational Education and Training) institutes, on the basis of the current and real needs of ICT industry in Europe. Training of the VET teachers is done with the systems used in working life and they are taught to guide students to learning by verifying. In this framework, a course module on SQL transactions is prepared and offered. In this paper we present and briefly discuss some qualitative/quantitative data collected from its first pilot offering to an international audience in Greece during May-June 2013. The questionnaire/evaluation results, and the types of participants who have attended the course offering, are presented. Conclusions are also presented.
STINGRAY: system for integrated genomic resources and analysis
2014-01-01
Background The STINGRAY system has been conceived to ease the tasks of integrating, analyzing, annotating and presenting genomic and expression data from Sanger and Next Generation Sequencing (NGS) platforms. Findings STINGRAY includes: (a) a complete and integrated workflow (more than 20 bioinformatics tools) ranging from functional annotation to phylogeny; (b) a MySQL database schema, suitable for data integration and user access control; and (c) a user-friendly graphical web-based interface that makes the system intuitive, facilitating the tasks of data analysis and annotation. Conclusion STINGRAY showed to be an easy to use and complete system for analyzing sequencing data. While both Sanger and NGS platforms are supported, the system could be faster using Sanger data, since the large NGS datasets could potentially slow down the MySQL database usage. STINGRAY is available at http://stingray.biowebdb.org and the open source code at http://sourceforge.net/projects/stingray-biowebdb/. PMID:24606808
Ontological Approach to Military Knowledge Modeling and Management
2004-03-01
federated search mechanism has to reformulate user queries (expressed using the ontology) in the query languages of the different sources (e.g. SQL...ontologies as a common terminology – Unified query to perform federated search • Query processing – Ontology mapping to sources reformulate queries
Motivational and metacognitive feedback in SQL-Tutor*
NASA Astrophysics Data System (ADS)
Hull, Alison; du Boulay, Benedict
2015-04-01
Motivation and metacognition are strongly intertwined, with learners high in self-efficacy more likely to use a variety of self-regulatory learning strategies, as well as to persist longer on challenging tasks. The aim of the research was to improve the learner's focus on the process and experience of problem-solving while using an Intelligent Tutoring System (ITS) and including motivational and metacognitive feedback based on the learner's past states and experiences. An existing ITS, SQL-Tutor, was used with first-year undergraduates studying a database module. The study used two versions of SQL-Tutor: the Control group used a base version providing domain feedback and the Study group used an extended version that also provided motivational and metacognitive feedback. This paper summarises the pre- and post-process results. Comparisons between groups showed some differing trends both in learning outcomes and behaviour in favour of the Study group.
Performance analysis of different database in new internet mapping system
NASA Astrophysics Data System (ADS)
Yao, Xing; Su, Wei; Gao, Shuai
2017-03-01
In the Mapping System of New Internet, Massive mapping entries between AID and RID need to be stored, added, updated, and deleted. In order to better deal with the problem when facing a large number of mapping entries update and query request, the Mapping System of New Internet must use high-performance database. In this paper, we focus on the performance of Redis, SQLite, and MySQL these three typical databases, and the results show that the Mapping System based on different databases can adapt to different needs according to the actual situation.
The establishment and use of the point source catalog database of the 2MASS near infrared survey
NASA Astrophysics Data System (ADS)
Gao, Y. F.; Shan, H. G.; Cheng, D.
2003-02-01
The 2MASS near infrared survey project is introduced briefly. The 2MASS point sources catalog (2MASS PSC) database and the network query system are established by using the PHP Hypertext Preprocessor and MySQL database server. By using the system, one can not only query information of sources listed in the catalog, but also draw the plots related. Moreover, after the 2MASS data are diagnosed , some research fields which can be benefited from this database are suggested.
A complete history of everything
NASA Astrophysics Data System (ADS)
Lanclos, Kyle; Deich, William T. S.
2012-09-01
This paper discusses Lick Observatory's local solution for retaining a complete history of everything. Leveraging our existing deployment of a publish/subscribe communications model that is used to broadcast the state of all systems at Lick Observatory, a monitoring daemon runs on a dedicated server that subscribes to and records all published messages. Our success with this system is a testament to the power of simple, straightforward approaches to complex problems. The solution itself is written in Python, and the initial version required about a week of development time; the data are stored in PostgreSQL database tables using a distinctly simple schema. Over time, we addressed scaling issues as the data set grew, which involved reworking the PostgreSQL database schema on the back-end. We also duplicate the data in flat files to enable recovery or migration of the data from one server to another. This paper will cover both the initial design as well as the solutions to the subsequent deployment issues, the trade-offs that motivated those choices, and the integration of this history database with existing client applications.
CBD: a biomarker database for colorectal cancer.
Zhang, Xueli; Sun, Xiao-Feng; Cao, Yang; Ye, Benchen; Peng, Qiliang; Liu, Xingyun; Shen, Bairong; Zhang, Hong
2018-01-01
Colorectal cancer (CRC) biomarker database (CBD) was established based on 870 identified CRC biomarkers and their relevant information from 1115 original articles in PubMed published from 1986 to 2017. In this version of the CBD, CRC biomarker data were collected, sorted, displayed and analysed. The CBD with the credible contents as a powerful and time-saving tool provide more comprehensive and accurate information for further CRC biomarker research. The CBD was constructed under MySQL server. HTML, PHP and JavaScript languages have been used to implement the web interface. The Apache was selected as HTTP server. All of these web operations were implemented under the Windows system. The CBD could provide to users the multiple individual biomarker information and categorized into the biological category, source and application of biomarkers; the experiment methods, results, authors and publication resources; the research region, the average age of cohort, gender, race, the number of tumours, tumour location and stage. We only collect data from the articles with clear and credible results to prove the biomarkers are useful in the diagnosis, treatment or prognosis of CRC. The CBD can also provide a professional platform to researchers who are interested in CRC research to communicate, exchange their research ideas and further design high-quality research in CRC. They can submit their new findings to our database via the submission page and communicate with us in the CBD.Database URL: http://sysbio.suda.edu.cn/CBD/.
Earthquake forecasting studies using radon time series data in Taiwan
NASA Astrophysics Data System (ADS)
Walia, Vivek; Kumar, Arvind; Fu, Ching-Chou; Lin, Shih-Jung; Chou, Kuang-Wu; Wen, Kuo-Liang; Chen, Cheng-Hong
2017-04-01
For few decades, growing number of studies have shown usefulness of data in the field of seismogeochemistry interpreted as geochemical precursory signals for impending earthquakes and radon is idendified to be as one of the most reliable geochemical precursor. Radon is recognized as short-term precursor and is being monitored in many countries. This study is aimed at developing an effective earthquake forecasting system by inspecting long term radon time series data. The data is obtained from a network of radon monitoring stations eastblished along different faults of Taiwan. The continuous time series radon data for earthquake studies have been recorded and some significant variations associated with strong earthquakes have been observed. The data is also examined to evaluate earthquake precursory signals against environmental factors. An automated real-time database operating system has been developed recently to improve the data processing for earthquake precursory studies. In addition, the study is aimed at the appraisal and filtrations of these environmental parameters, in order to create a real-time database that helps our earthquake precursory study. In recent years, automatic operating real-time database has been developed using R, an open source programming language, to carry out statistical computation on the data. To integrate our data with our working procedure, we use the popular and famous open source web application solution, AMP (Apache, MySQL, and PHP), creating a website that could effectively show and help us manage the real-time database.
CBD: a biomarker database for colorectal cancer
Zhang, Xueli; Sun, Xiao-Feng; Ye, Benchen; Peng, Qiliang; Liu, Xingyun; Shen, Bairong; Zhang, Hong
2018-01-01
Abstract Colorectal cancer (CRC) biomarker database (CBD) was established based on 870 identified CRC biomarkers and their relevant information from 1115 original articles in PubMed published from 1986 to 2017. In this version of the CBD, CRC biomarker data were collected, sorted, displayed and analysed. The CBD with the credible contents as a powerful and time-saving tool provide more comprehensive and accurate information for further CRC biomarker research. The CBD was constructed under MySQL server. HTML, PHP and JavaScript languages have been used to implement the web interface. The Apache was selected as HTTP server. All of these web operations were implemented under the Windows system. The CBD could provide to users the multiple individual biomarker information and categorized into the biological category, source and application of biomarkers; the experiment methods, results, authors and publication resources; the research region, the average age of cohort, gender, race, the number of tumours, tumour location and stage. We only collect data from the articles with clear and credible results to prove the biomarkers are useful in the diagnosis, treatment or prognosis of CRC. The CBD can also provide a professional platform to researchers who are interested in CRC research to communicate, exchange their research ideas and further design high-quality research in CRC. They can submit their new findings to our database via the submission page and communicate with us in the CBD. Database URL: http://sysbio.suda.edu.cn/CBD/ PMID:29846545
Construction of Database for Pulsating Variable Stars
NASA Astrophysics Data System (ADS)
Chen, B. Q.; Yang, M.; Jiang, B. W.
2011-07-01
A database for the pulsating variable stars is constructed for Chinese astronomers to study the variable stars conveniently. The database includes about 230000 variable stars in the Galactic bulge, LMC and SMC observed by the MACHO (MAssive Compact Halo Objects) and OGLE (Optical Gravitational Lensing Experiment) projects at present. The software used for the construction is LAMP, i.e., Linux+Apache+MySQL+PHP. A web page is provided to search the photometric data and the light curve in the database through the right ascension and declination of the object. More data will be incorporated into the database.
Visualization of Vgi Data Through the New NASA Web World Wind Virtual Globe
NASA Astrophysics Data System (ADS)
Brovelli, M. A.; Kilsedar, C. E.; Zamboni, G.
2016-06-01
GeoWeb 2.0, laying the foundations of Volunteered Geographic Information (VGI) systems, has led to platforms where users can contribute to the geographic knowledge that is open to access. Moreover, as a result of the advancements in 3D visualization, virtual globes able to visualize geographic data even on browsers emerged. However the integration of VGI systems and virtual globes has not been fully realized. The study presented aims to visualize volunteered data in 3D, considering also the ease of use aspects for general public, using Free and Open Source Software (FOSS). The new Application Programming Interface (API) of NASA, Web World Wind, written in JavaScript and based on Web Graphics Library (WebGL) is cross-platform and cross-browser, so that the virtual globe created using this API can be accessible through any WebGL supported browser on different operating systems and devices, as a result not requiring any installation or configuration on the client-side, making the collected data more usable to users, which is not the case with the World Wind for Java as installation and configuration of the Java Virtual Machine (JVM) is required. Furthermore, the data collected through various VGI platforms might be in different formats, stored in a traditional relational database or in a NoSQL database. The project developed aims to visualize and query data collected through Open Data Kit (ODK) platform and a cross-platform application, where data is stored in a relational PostgreSQL and NoSQL CouchDB databases respectively.
Software Architecture Evolution
2013-12-01
system’s major components occurring via a Java Message Service message bus [69]. This architecture was designed to promote loose coupling of soft- ware...play reconfiguration of the system. The components were Java -based and platform-independent; the interfaces by which they communicated were based on...The MPCS database, a MySQL database used for storing telemetry as well as some other information, such as logs and commanding data [68]. This
2011-09-01
rate Python’s maturity as “High.” Python is nine years old and has been continuously been developed and enhanced since then. During fiscal year 2010...We rate Python’s developer toolkit availability/extensibility as “Yes.” Python runs on a SQL database and is 64 compatible with Oracle database...MODEL...........................................................................11 D. GOAL DEVELOPMENT
Cost Considerations in Cloud Computing
2014-01-01
investments. 2. Database Options The potential promise that “ big data ” analytics holds for many enterprise mission areas makes relevant the question of the...development of a range of new distributed file systems and data - bases that have better scalability properties than traditional SQL databases. Hadoop ... data . Many systems exist that extend or supplement Hadoop —such as Apache Accumulo, which provides a highly granular mechanism for managing security
Querying and Computing with BioCyc Databases
Krummenacker, Markus; Paley, Suzanne; Mueller, Lukas; Yan, Thomas; Karp, Peter D.
2006-01-01
Summary We describe multiple methods for accessing and querying the complex and integrated cellular data in the BioCyc family of databases: access through multiple file formats, access through Application Program Interfaces (APIs) for LISP, Perl and Java, and SQL access through the BioWarehouse relational database. Availability The Pathway Tools software and 20 BioCyc DBs in Tiers 1 and 2 are freely available to academic users; fees apply to some types of commercial use. For download instructions see http://BioCyc.org/download.shtml PMID:15961440
InverPep: A database of invertebrate antimicrobial peptides.
Gómez, Esteban A; Giraldo, Paula; Orduz, Sergio
2017-03-01
The aim of this work was to construct InverPep, a database specialised in experimentally validated antimicrobial peptides (AMPs) from invertebrates. AMP data contained in InverPep were manually curated from other databases and the scientific literature. MySQL was integrated with the development platform Laravel; this framework allows to integrate programming in PHP with HTML and was used to design the InverPep web page's interface. InverPep contains 18 separated fields, including InverPep code, phylum and species source, peptide name, sequence, peptide length, secondary structure, molar mass, charge, isoelectric point, hydrophobicity, Boman index, aliphatic index and percentage of hydrophobic amino acids. CALCAMPI, an algorithm to calculate the physicochemical properties of multiple peptides simultaneously, was programmed in PERL language. To date, InverPep contains 702 experimentally validated AMPs from invertebrate species. All of the peptides contain information associated with their source, physicochemical properties, secondary structure, biological activity and links to external literature. Most AMPs in InverPep have a length between 10 and 50 amino acids, a positive charge, a Boman index between 0 and 2 kcal/mol, and 30-50% hydrophobic amino acids. InverPep includes 33 AMPs not reported in other databases. Besides, CALCAMPI and statistical analysis of InverPep data is presented. The InverPep database is available in English and Spanish. InverPep is a useful database to study invertebrate AMPs and its information could be used for the design of new peptides. The user-friendly interface of InverPep and its information can be freely accessed via a web-based browser at http://ciencias.medellin.unal.edu.co/gruposdeinvestigacion/prospeccionydisenobiomoleculas/InverPep/public/home_en. Copyright © 2016 International Society for Chemotherapy of Infection and Cancer. Published by Elsevier Ltd. All rights reserved.
Survey of Machine Learning Methods for Database Security
NASA Astrophysics Data System (ADS)
Kamra, Ashish; Ber, Elisa
Application of machine learning techniques to database security is an emerging area of research. In this chapter, we present a survey of various approaches that use machine learning/data mining techniques to enhance the traditional security mechanisms of databases. There are two key database security areas in which these techniques have found applications, namely, detection of SQL Injection attacks and anomaly detection for defending against insider threats. Apart from the research prototypes and tools, various third-party commercial products are also available that provide database activity monitoring solutions by profiling database users and applications. We present a survey of such products. We end the chapter with a primer on mechanisms for responding to database anomalies.
NASA Astrophysics Data System (ADS)
Bashev, A.
2012-04-01
Currently there is an enormous amount of various geoscience databases. Unfortunately the only users of the majority of the databases are their elaborators. There are several reasons for that: incompaitability, specificity of tasks and objects and so on. However the main obstacles for wide usage of geoscience databases are complexity for elaborators and complication for users. The complexity of architecture leads to high costs that block the public access. The complication prevents users from understanding when and how to use the database. Only databases, associated with GoogleMaps don't have these drawbacks, but they could be hardly named "geoscience" Nevertheless, open and simple geoscience database is necessary at least for educational purposes (see our abstract for ESSI20/EOS12). We developed a database and web interface to work with them and now it is accessible at maps.sch192.ru. In this database a result is a value of a parameter (no matter which) in a station with a certain position, associated with metadata: the date when the result was obtained; the type of a station (lake, soil etc); the contributor that sent the result. Each contributor has its own profile, that allows to estimate the reliability of the data. The results can be represented on GoogleMaps space image as a point in a certain position, coloured according to the value of the parameter. There are default colour scales and each registered user can create the own scale. The results can be also extracted in *.csv file. For both types of representation one could select the data by date, object type, parameter type, area and contributor. The data are uploaded in *.csv format: Name of the station; Lattitude(dd.dddddd); Longitude(ddd.dddddd); Station type; Parameter type; Parameter value; Date(yyyy-mm-dd). The contributor is recognised while entering. This is the minimal set of features that is required to connect a value of a parameter with a position and see the results. All the complicated data treatment could be conducted in other programs after extraction the filtered data into *.csv file. It makes the database understandable for non-experts. The database employs open data format (*.csv) and wide spread tools: PHP as the program language, MySQL as database management system, JavaScript for interaction with GoogleMaps and JQueryUI for create user interface. The database is multilingual: there are association tables, which connect with elements of the database. In total the development required about 150 hours. The database still has several problems. The main problem is the reliability of the data. Actually it needs an expert system for estimation the reliability, but the elaboration of such a system would take more resources than the database itself. The second problem is the problem of stream selection - how to select the stations that are connected with each other (for example, belong to one water stream) and indicate their sequence. Currently the interface is English and Russian. However it can be easily translated to your language. But some problems we decided. For example problem "the problem of the same station" (sometimes the distance between stations is smaller, than the error of position): when you adding new station to the database our application automatically find station near this place. Also we decided problem of object and parameter type (how to regard "EC" and "electrical conductivity" as the same parameter). This problem has been solved using "associative tables". If you would like to see the interface on your language, just contact us. We should send you the list of terms and phrases for translation on your language. The main advantage of the database is that it is totally open: everybody can see, extract the data from the database and use them for non-commercial purposes with no charge. Registered users can contribute to the database without getting paid. We hope, that it will be widely used first of all for education purposes, but professional scientists could use it also.
2014-01-01
Protein biomarkers offer major benefits for diagnosis and monitoring of disease processes. Recent advances in protein mass spectrometry make it feasible to use this very sensitive technology to detect and quantify proteins in blood. To explore the potential of blood biomarkers, we conducted a thorough review to evaluate the reliability of data in the literature and to determine the spectrum of proteins reported to exist in blood with a goal of creating a Federated Database of Blood Proteins (FDBP). A unique feature of our approach is the use of a SQL database for all of the peptide data; the power of the SQL database combined with standard informatic algorithms such as BLAST and the statistical analysis system (SAS) allowed the rapid annotation and analysis of the database without the need to create special programs to manage the data. Our mathematical analysis and review shows that in addition to the usual secreted proteins found in blood, there are many reports of intracellular proteins and good agreement on transcription factors, DNA remodelling factors in addition to cellular receptors and their signal transduction enzymes. Overall, we have catalogued about 12,130 proteins identified by at least one unique peptide, and of these 3858 have 3 or more peptide correlations. The FDBP with annotations should facilitate testing blood for specific disease biomarkers. PMID:24476026
Nadkarni, P M
1997-08-01
Concept Locator (CL) is a client-server application that accesses a Sybase relational database server containing a subset of the UMLS Metathesaurus for the purpose of retrieval of concepts corresponding to one or more query expressions supplied to it. CL's query grammar permits complex Boolean expressions, wildcard patterns, and parenthesized (nested) subexpressions. CL translates the query expressions supplied to it into one or more SQL statements that actually perform the retrieval. The generated SQL is optimized by the client to take advantage of the strengths of the server's query optimizer, and sidesteps its weaknesses, so that execution is reasonably efficient.
JNDMS Task Authorization 2 Report
2013-10-01
uses Barnyard to store alarms from all DREnet Snort sensors in a MySQL database. Barnyard is an open source tool designed to work with Snort to take...Technology ITI Information Technology Infrastructure J2EE Java 2 Enterprise Edition JAR Java Archive. This is an archive file format defined by Java ...standards. JDBC Java Database Connectivity JDW JNDMS Data Warehouse JNDMS Joint Network and Defence Management System JNDMS Joint Network Defence and
Open Clients for Distributed Databases
NASA Astrophysics Data System (ADS)
Chayes, D. N.; Arko, R. A.
2001-12-01
We are actively developing a collection of open source example clients that demonstrate use of our "back end" data management infrastructure. The data management system is reported elsewhere at this meeting (Arko and Chayes: A Scaleable Database Infrastructure). In addition to their primary goal of being examples for others to build upon, some of these clients may have limited utility in them selves. More information about the clients and the data infrastructure is available on line at http://data.ldeo.columbia.edu. The available examples to be demonstrated include several web-based clients including those developed for the Community Review System of the Digital Library for Earth System Education, a real-time watch standers log book, an offline interface to use log book entries, a simple client to search on multibeam metadata and others are Internet enabled and generally web-based front ends that support searches against one or more relational databases using industry standard SQL queries. In addition to the web based clients, simple SQL searches from within Excel and similar applications will be demonstrated. By defining, documenting and publishing a clear interface to the fully searchable databases, it becomes relatively easy to construct client interfaces that are optimized for specific applications in comparison to building a monolithic data and user interface system.
BIGNASim: a NoSQL database structure and analysis portal for nucleic acids simulation data
Hospital, Adam; Andrio, Pau; Cugnasco, Cesare; Codo, Laia; Becerra, Yolanda; Dans, Pablo D.; Battistini, Federica; Torres, Jordi; Goñi, Ramón; Orozco, Modesto; Gelpí, Josep Ll.
2016-01-01
Molecular dynamics simulation (MD) is, just behind genomics, the bioinformatics tool that generates the largest amounts of data, and that is using the largest amount of CPU time in supercomputing centres. MD trajectories are obtained after months of calculations, analysed in situ, and in practice forgotten. Several projects to generate stable trajectory databases have been developed for proteins, but no equivalence exists in the nucleic acids world. We present here a novel database system to store MD trajectories and analyses of nucleic acids. The initial data set available consists mainly of the benchmark of the new molecular dynamics force-field, parmBSC1. It contains 156 simulations, with over 120 μs of total simulation time. A deposition protocol is available to accept the submission of new trajectory data. The database is based on the combination of two NoSQL engines, Cassandra for storing trajectories and MongoDB to store analysis results and simulation metadata. The analyses available include backbone geometries, helical analysis, NMR observables and a variety of mechanical analyses. Individual trajectories and combined meta-trajectories can be downloaded from the portal. The system is accessible through http://mmb.irbbarcelona.org/BIGNASim/. Supplementary Material is also available on-line at http://mmb.irbbarcelona.org/BIGNASim/SuppMaterial/. PMID:26612862
Chen, R S; Nadkarni, P; Marenco, L; Levin, F; Erdos, J; Miller, P L
2000-01-01
The entity-attribute-value representation with classes and relationships (EAV/CR) provides a flexible and simple database schema to store heterogeneous biomedical data. In certain circumstances, however, the EAV/CR model is known to retrieve data less efficiently than conventionally based database schemas. To perform a pilot study that systematically quantifies performance differences for database queries directed at real-world microbiology data modeled with EAV/CR and conventional representations, and to explore the relative merits of different EAV/CR query implementation strategies. Clinical microbiology data obtained over a ten-year period were stored using both database models. Query execution times were compared for four clinically oriented attribute-centered and entity-centered queries operating under varying conditions of database size and system memory. The performance characteristics of three different EAV/CR query strategies were also examined. Performance was similar for entity-centered queries in the two database models. Performance in the EAV/CR model was approximately three to five times less efficient than its conventional counterpart for attribute-centered queries. The differences in query efficiency became slightly greater as database size increased, although they were reduced with the addition of system memory. The authors found that EAV/CR queries formulated using multiple, simple SQL statements executed in batch were more efficient than single, large SQL statements. This paper describes a pilot project to explore issues in and compare query performance for EAV/CR and conventional database representations. Although attribute-centered queries were less efficient in the EAV/CR model, these inefficiencies may be addressable, at least in part, by the use of more powerful hardware or more memory, or both.
Chen, Mingyang; Stott, Amanda C; Li, Shenggang; Dixon, David A
2012-04-01
A robust metadata database called the Collaborative Chemistry Database Tool (CCDBT) for massive amounts of computational chemistry raw data has been designed and implemented. It performs data synchronization and simultaneously extracts the metadata. Computational chemistry data in various formats from different computing sources, software packages, and users can be parsed into uniform metadata for storage in a MySQL database. Parsing is performed by a parsing pyramid, including parsers written for different levels of data types and sets created by the parser loader after loading parser engines and configurations. Copyright © 2011 Elsevier Inc. All rights reserved.
SAMSON Technology Demonstrator
2014-06-01
requested. The SAMSON TD was testing with two different policy engines: 1. A custom XACML-based element matching engine using a MySQL database for...performed during the course of the event. Full information protection across the sphere of access management, information protection and auditing was in...
Methods, Knowledge Support, and Experimental Tools for Modeling
2006-10-01
open source software entities: the PostgreSQL relational database management system (http://www.postgres.org), the Apache web server (http...past. The revision control system allows the program to capture disagreements, and allows users to explore the history of such disagreements by
Automatic helmet-wearing detection for law enforcement using CCTV cameras
NASA Astrophysics Data System (ADS)
Wonghabut, P.; Kumphong, J.; Satiennam, T.; Ung-arunyawee, R.; Leelapatra, W.
2018-04-01
The objective of this research is to develop an application for enforcing helmet wearing using CCTV cameras. The developed application aims to help law enforcement by police, and eventually resulting in changing risk behaviours and consequently reducing the number of accidents and its severity. Conceptually, the application software implemented using C++ language and OpenCV library uses two different angle of view CCTV cameras. Video frames recorded by the wide-angle CCTV camera are used to detect motorcyclists. If any motorcyclist without helmet is found, then the zoomed (narrow-angle) CCTV is activated to capture image of the violating motorcyclist and the motorcycle license plate in real time. Captured images are managed by database implemented using MySQL for ticket issuing. The results show that the developed program is able to detect 81% of motorcyclists on various motorcycle types during daytime and night-time. The validation results reveal that the program achieves 74% accuracy in detecting the motorcyclist without helmet.
STILTS -- Starlink Tables Infrastructure Library Tool Set
NASA Astrophysics Data System (ADS)
Taylor, Mark
STILTS is a set of command-line tools for processing tabular data. It has been designed for, but is not restricted to, use on astronomical data such as source catalogues. It contains both generic (format-independent) table processing tools and tools for processing VOTable documents. Facilities offered include crossmatching, format conversion, format validation, column calculation and rearrangement, row selection, sorting, plotting, statistical calculations and metadata display. Calculations on cell data can be performed using a powerful and extensible expression language. The package is written in pure Java and based on STIL, the Starlink Tables Infrastructure Library. This gives it high portability, support for many data formats (including FITS, VOTable, text-based formats and SQL databases), extensibility and scalability. Where possible the tools are written to accept streamed data so the size of tables which can be processed is not limited by available memory. As well as the tutorial and reference information in this document, detailed on-line help is available from the tools themselves. STILTS is available under the GNU General Public Licence.
Development of Account Receivable and Payable System for Travel Bureau Company
NASA Astrophysics Data System (ADS)
Karma, I. G. M.; Susanti, J.
2018-01-01
Sales and purchases of products on credit made by travel bureau companies require serious handling because it involves a lot of money and many parties. This research aims to build information systems to handle account payables and receivables related to the purchase and sale of tour packages on credit. The methodology is object-oriented approach, by using MS. Visual Basic. Net as a programming language and MySQL as its database package. As the results are the Account Receivable information system that is used to handle accounts receivable on agents who have purchased a tour package on credit for the guests it sends, and the Account Payable information system that is used to handle company’s account payable to suppliers who provided products or services to guests who purchase tour packages. Both of these systems handle the interrelated matter of a particular guest. Therefore, if both systems are integrated with the reservation system will be able to provide income statement on the reservation of certain guests.
A web application to support telemedicine services in Brazil.
Barbosa, Ana Karina P; de A Novaes, Magdala; de Vasconcelos, Alexandre M L
2003-01-01
This paper describes a system that has been developed to support Telemedicine activities in Brazil, a country that has serious problems in the delivery of health services. The system is a part of the broader Tele-health Project that has been developed to make health services more accessible to the low-income population in the northeast region. The HealthNet system is based upon a pilot area that uses fetal and pediatric cardiology. This article describes both the system's conceptual model, including the tele-diagnosis and second medical opinion services, as well as its architecture and development stages. The system model describes both collaborating tools used asynchronously, such as discussion forums, and synchronous tools, such as videoconference services. Web and free-of-charge tools are utilized for implementation, such as Java and MySQL database. Furthermore, an interface with Electronic Patient Record (EPR) systems using Extended Markup Language (XML) technology is also proposed. Finally, considerations concerning the development and implementation process are presented.
Genomic region operation kit for flexible processing of deep sequencing data.
Ovaska, Kristian; Lyly, Lauri; Sahu, Biswajyoti; Jänne, Olli A; Hautaniemi, Sampsa
2013-01-01
Computational analysis of data produced in deep sequencing (DS) experiments is challenging due to large data volumes and requirements for flexible analysis approaches. Here, we present a mathematical formalism based on set algebra for frequently performed operations in DS data analysis to facilitate translation of biomedical research questions to language amenable for computational analysis. With the help of this formalism, we implemented the Genomic Region Operation Kit (GROK), which supports various DS-related operations such as preprocessing, filtering, file conversion, and sample comparison. GROK provides high-level interfaces for R, Python, Lua, and command line, as well as an extension C++ API. It supports major genomic file formats and allows storing custom genomic regions in efficient data structures such as red-black trees and SQL databases. To demonstrate the utility of GROK, we have characterized the roles of two major transcription factors (TFs) in prostate cancer using data from 10 DS experiments. GROK is freely available with a user guide from >http://csbi.ltdk.helsinki.fi/grok/.
Warehouses information system design and development
NASA Astrophysics Data System (ADS)
Darajatun, R. A.; Sukanta
2017-12-01
Materials/goods handling industry is fundamental for companies to ensure the smooth running of their warehouses. Efficiency and organization within every aspect of the business is essential in order to gain a competitive advantage. The purpose of this research is design and development of Kanban of inventory storage and delivery system. Application aims to facilitate inventory stock checks to be more efficient and effective. Users easily input finished goods from production department, warehouse, customer, and also suppliers. Master data designed as complete as possible to be prepared applications used in a variety of process logistic warehouse variations. The author uses Java programming language to develop the application, which is used for building Java Web applications, while the database used is MySQL. System development methodology that I use is the Waterfall methodology. Waterfall methodology has several stages of the Analysis, System Design, Implementation, Integration, Operation and Maintenance. In the process of collecting data the author uses the method of observation, interviews, and literature.
Specialized microbial databases for inductive exploration of microbial genome sequences
Fang, Gang; Ho, Christine; Qiu, Yaowu; Cubas, Virginie; Yu, Zhou; Cabau, Cédric; Cheung, Frankie; Moszer, Ivan; Danchin, Antoine
2005-01-01
Background The enormous amount of genome sequence data asks for user-oriented databases to manage sequences and annotations. Queries must include search tools permitting function identification through exploration of related objects. Methods The GenoList package for collecting and mining microbial genome databases has been rewritten using MySQL as the database management system. Functions that were not available in MySQL, such as nested subquery, have been implemented. Results Inductive reasoning in the study of genomes starts from "islands of knowledge", centered around genes with some known background. With this concept of "neighborhood" in mind, a modified version of the GenoList structure has been used for organizing sequence data from prokaryotic genomes of particular interest in China. GenoChore , a set of 17 specialized end-user-oriented microbial databases (including one instance of Microsporidia, Encephalitozoon cuniculi, a member of Eukarya) has been made publicly available. These databases allow the user to browse genome sequence and annotation data using standard queries. In addition they provide a weekly update of searches against the world-wide protein sequences data libraries, allowing one to monitor annotation updates on genes of interest. Finally, they allow users to search for patterns in DNA or protein sequences, taking into account a clustering of genes into formal operons, as well as providing extra facilities to query sequences using predefined sequence patterns. Conclusion This growing set of specialized microbial databases organize data created by the first Chinese bacterial genome programs (ThermaList, Thermoanaerobacter tencongensis, LeptoList, with two different genomes of Leptospira interrogans and SepiList, Staphylococcus epidermidis) associated to related organisms for comparison. PMID:15698474
Creating databases for biological information: an introduction.
Stein, Lincoln
2013-06-01
The essence of bioinformatics is dealing with large quantities of information. Whether it be sequencing data, microarray data files, mass spectrometric data (e.g., fingerprints), the catalog of strains arising from an insertional mutagenesis project, or even large numbers of PDF files, there inevitably comes a time when the information can simply no longer be managed with files and directories. This is where databases come into play. This unit briefly reviews the characteristics of several database management systems, including flat file, indexed file, relational databases, and NoSQL databases. It compares their strengths and weaknesses and offers some general guidelines for selecting an appropriate database management system. Copyright 2013 by JohnWiley & Sons, Inc.
Construction of the Database for Pulsating Variable Stars
NASA Astrophysics Data System (ADS)
Chen, Bing-Qiu; Yang, Ming; Jiang, Bi-Wei
2012-01-01
A database for pulsating variable stars is constructed to favor the study of variable stars in China. The database includes about 230,000 variable stars in the Galactic bulge, LMC and SMC observed in an about 10 yr period by the MACHO(MAssive Compact Halo Objects) and OGLE(Optical Gravitational Lensing Experiment) projects. The software used for the construction is LAMP, i.e., Linux+Apache+MySQL+PHP. A web page is provided for searching the photometric data and light curves in the database through the right ascension and declination of an object. Because of the flexibility of this database, more up-to-date data of variable stars can be incorporated into the database conveniently.
NASA Astrophysics Data System (ADS)
Vacca, G.; Pili, D.; Fiorino, D. R.; Pintus, V.
2017-05-01
The presented work is part of the research project, titled "Tecniche murarie tradizionali: conoscenza per la conservazione ed il miglioramento prestazionale" (Traditional building techniques: from knowledge to conservation and performance improvement), with the purpose of studying the building techniques of the 13th-18th centuries in the Sardinia Region (Italy) for their knowledge, conservation, and promotion. The end purpose of the entire study is to improve the performance of the examined structures. In particular, the task of the authors within the research project was to build a WebGIS to manage the data collected during the examination and study phases. This infrastructure was entirely built using Open Source software. The work consisted of designing a database built in PostgreSQL and its spatial extension PostGIS, which allows to store and manage feature geometries and spatial data. The data input is performed via a form built in HTML and PHP. The HTML part is based on Bootstrap, an open tools library for websites and web applications. The implementation of this template used both PHP and Javascript code. The PHP code manages the reading and writing of data to the database, using embedded SQL queries. As of today, we surveyed and archived more than 300 buildings, belonging to three main macro categories: fortification architectures, religious architectures, residential architectures. The masonry samples investigated in relation to the construction techniques are more than 150. The database is published on the Internet as a WebGIS built using the Leaflet Javascript open libraries, which allows creating map sites with background maps and navigation, input and query tools. This too uses an interaction of HTML, Javascript, PHP and SQL code.
Cario, Clinton L; Witte, John S
2018-03-15
As whole-genome tumor sequence and biological annotation datasets grow in size, number and content, there is an increasing basic science and clinical need for efficient and accurate data management and analysis software. With the emergence of increasingly sophisticated data stores, execution environments and machine learning algorithms, there is also a need for the integration of functionality across frameworks. We present orchid, a python based software package for the management, annotation and machine learning of cancer mutations. Building on technologies of parallel workflow execution, in-memory database storage and machine learning analytics, orchid efficiently handles millions of mutations and hundreds of features in an easy-to-use manner. We describe the implementation of orchid and demonstrate its ability to distinguish tissue of origin in 12 tumor types based on 339 features using a random forest classifier. Orchid and our annotated tumor mutation database are freely available at https://github.com/wittelab/orchid. Software is implemented in python 2.7, and makes use of MySQL or MemSQL databases. Groovy 2.4.5 is optionally required for parallel workflow execution. JWitte@ucsf.edu. Supplementary data are available at Bioinformatics online.
Finding Protein and Nucleotide Similarities with FASTA
Pearson, William R.
2016-01-01
The FASTA programs provide a comprehensive set of rapid similarity searching tools ( fasta36, fastx36, tfastx36, fasty36, tfasty36), similar to those provided by the BLAST package, as well as programs for slower, optimal, local and global similarity searches ( ssearch36, ggsearch36) and for searching with short peptides and oligonucleotides ( fasts36, fastm36). The FASTA programs use an empirical strategy for estimating statistical significance that accommodates a range of similarity scoring matrices and gap penalties, improving alignment boundary accuracy and search sensitivity (Unit 3.5). The FASTA programs can produce “BLAST-like” alignment and tabular output, for ease of integration into existing analysis pipelines, and can search small, representative databases, and then report results for a larger set of sequences, using links from the smaller dataset. The FASTA programs work with a wide variety of database formats, including mySQL and postgreSQL databases (Unit 9.4). The programs also provide a strategy for integrating domain and active site annotations into alignments and highlighting the mutational state of functionally critical residues. These protocols describe how to use the FASTA programs to characterize protein and DNA sequences, using protein:protein, protein:DNA, and DNA:DNA comparisons. PMID:27010337
Finding Protein and Nucleotide Similarities with FASTA.
Pearson, William R
2016-03-24
The FASTA programs provide a comprehensive set of rapid similarity searching tools (fasta36, fastx36, tfastx36, fasty36, tfasty36), similar to those provided by the BLAST package, as well as programs for slower, optimal, local, and global similarity searches (ssearch36, ggsearch36), and for searching with short peptides and oligonucleotides (fasts36, fastm36). The FASTA programs use an empirical strategy for estimating statistical significance that accommodates a range of similarity scoring matrices and gap penalties, improving alignment boundary accuracy and search sensitivity. The FASTA programs can produce "BLAST-like" alignment and tabular output, for ease of integration into existing analysis pipelines, and can search small, representative databases, and then report results for a larger set of sequences, using links from the smaller dataset. The FASTA programs work with a wide variety of database formats, including mySQL and postgreSQL databases. The programs also provide a strategy for integrating domain and active site annotations into alignments and highlighting the mutational state of functionally critical residues. These protocols describe how to use the FASTA programs to characterize protein and DNA sequences, using protein:protein, protein:DNA, and DNA:DNA comparisons. Copyright © 2016 John Wiley & Sons, Inc.
tcpl: The ToxCast Pipeline for High-Throughput Screening Data
Motivation: The large and diverse high-throughput chemical screening efforts carried out by the US EPAToxCast program requires an efficient, transparent, and reproducible data pipeline.Summary: The tcpl R package and its associated MySQL database provide a generalized platform fo...
DOE Office of Scientific and Technical Information (OSTI.GOV)
Christopher Slominski
2009-10-01
Archiving a large fraction of the EPICS signals within the Jefferson Lab (JLAB) Accelerator control system is vital for postmortem and real-time analysis of the accelerator performance. This analysis is performed on a daily basis by scientists, operators, engineers, technicians, and software developers. Archiving poses unique challenges due to the magnitude of the control system. A MySQL Archiving system (Mya) was developed to scale to the needs of the control system; currently archiving 58,000 EPICS variables, updating at a rate of 11,000 events per second. In addition to the large collection rate, retrieval of the archived data must also bemore » fast and robust. Archived data retrieval clients obtain data at a rate over 100,000 data points per second. Managing the data in a relational database provides a number of benefits. This paper describes an archiving solution that uses an open source database and standard off the shelf hardware to reach high performance archiving needs. Mya has been in production at Jefferson Lab since February of 2007.« less
NoSQL Based 3D City Model Management System
NASA Astrophysics Data System (ADS)
Mao, B.; Harrie, L.; Cao, J.; Wu, Z.; Shen, J.
2014-04-01
To manage increasingly complicated 3D city models, a framework based on NoSQL database is proposed in this paper. The framework supports import and export of 3D city model according to international standards such as CityGML, KML/COLLADA and X3D. We also suggest and implement 3D model analysis and visualization in the framework. For city model analysis, 3D geometry data and semantic information (such as name, height, area, price and so on) are stored and processed separately. We use a Map-Reduce method to deal with the 3D geometry data since it is more complex, while the semantic analysis is mainly based on database query operation. For visualization, a multiple 3D city representation structure CityTree is implemented within the framework to support dynamic LODs based on user viewpoint. Also, the proposed framework is easily extensible and supports geoindexes to speed up the querying. Our experimental results show that the proposed 3D city management system can efficiently fulfil the analysis and visualization requirements.
The BDNYC database of low-mass stars, brown dwarfs, and planetary mass companions
NASA Astrophysics Data System (ADS)
Cruz, Kelle; Rodriguez, David; Filippazzo, Joseph; Gonzales, Eileen; Faherty, Jacqueline K.; Rice, Emily; BDNYC
2018-01-01
We present a web-interface to a database of low-mass stars, brown dwarfs, and planetary mass companions. Users can send SELECT SQL queries to the database, perform searches by coordinates or name, check the database inventory on specified objects, and even plot spectra interactively. The initial version of this database contains information for 198 objects and version 2 will contain over 1000 objects. The database currently includes photometric data from 2MASS, WISE, and Spitzer and version 2 will include a significant portion of the publicly available optical and NIR spectra for brown dwarfs. The database is maintained and curated by the BDNYC research group and we welcome contributions from other researchers via GitHub.
Architecture for biomedical multimedia information delivery on the World Wide Web
NASA Astrophysics Data System (ADS)
Long, L. Rodney; Goh, Gin-Hua; Neve, Leif; Thoma, George R.
1997-10-01
Research engineers at the National Library of Medicine are building a prototype system for the delivery of multimedia biomedical information on the World Wide Web. This paper discuses the architecture and design considerations for the system, which will be used initially to make images and text from the third National Health and Nutrition Examination Survey (NHANES) publicly available. We categorized our analysis as follows: (1) fundamental software tools: we analyzed trade-offs among use of conventional HTML/CGI, X Window Broadway, and Java; (2) image delivery: we examined the use of unconventional TCP transmission methods; (3) database manager and database design: we discuss the capabilities and planned use of the Informix object-relational database manager and the planned schema for the HNANES database; (4) storage requirements for our Sun server; (5) user interface considerations; (6) the compatibility of the system with other standard research and analysis tools; (7) image display: we discuss considerations for consistent image display for end users. Finally, we discuss the scalability of the system in terms of incorporating larger or more databases of similar data, and the extendibility of the system for supporting content-based retrieval of biomedical images. The system prototype is called the Web-based Medical Information Retrieval System. An early version was built as a Java applet and tested on Unix, PC, and Macintosh platforms. This prototype used the MiniSQL database manager to do text queries on a small database of records of participants in the second NHANES survey. The full records and associated x-ray images were retrievable and displayable on a standard Web browser. A second version has now been built, also a Java applet, using the MySQL database manager.
An expression database for roots of the model legume Medicago truncatula under salt stress
2009-01-01
Background Medicago truncatula is a model legume whose genome is currently being sequenced by an international consortium. Abiotic stresses such as salt stress limit plant growth and crop productivity, including those of legumes. We anticipate that studies on M. truncatula will shed light on other economically important legumes across the world. Here, we report the development of a database called MtED that contains gene expression profiles of the roots of M. truncatula based on time-course salt stress experiments using the Affymetrix Medicago GeneChip. Our hope is that MtED will provide information to assist in improving abiotic stress resistance in legumes. Description The results of our microarray experiment with roots of M. truncatula under 180 mM sodium chloride were deposited in the MtED database. Additionally, sequence and annotation information regarding microarray probe sets were included. MtED provides functional category analysis based on Gene and GeneBins Ontology, and other Web-based tools for querying and retrieving query results, browsing pathways and transcription factor families, showing metabolic maps, and comparing and visualizing expression profiles. Utilities like mapping probe sets to genome of M. truncatula and In-Silico PCR were implemented by BLAT software suite, which were also available through MtED database. Conclusion MtED was built in the PHP script language and as a MySQL relational database system on a Linux server. It has an integrated Web interface, which facilitates ready examination and interpretation of the results of microarray experiments. It is intended to help in selecting gene markers to improve abiotic stress resistance in legumes. MtED is available at http://bioinformatics.cau.edu.cn/MtED/. PMID:19906315
An expression database for roots of the model legume Medicago truncatula under salt stress.
Li, Daofeng; Su, Zhen; Dong, Jiangli; Wang, Tao
2009-11-11
Medicago truncatula is a model legume whose genome is currently being sequenced by an international consortium. Abiotic stresses such as salt stress limit plant growth and crop productivity, including those of legumes. We anticipate that studies on M. truncatula will shed light on other economically important legumes across the world. Here, we report the development of a database called MtED that contains gene expression profiles of the roots of M. truncatula based on time-course salt stress experiments using the Affymetrix Medicago GeneChip. Our hope is that MtED will provide information to assist in improving abiotic stress resistance in legumes. The results of our microarray experiment with roots of M. truncatula under 180 mM sodium chloride were deposited in the MtED database. Additionally, sequence and annotation information regarding microarray probe sets were included. MtED provides functional category analysis based on Gene and GeneBins Ontology, and other Web-based tools for querying and retrieving query results, browsing pathways and transcription factor families, showing metabolic maps, and comparing and visualizing expression profiles. Utilities like mapping probe sets to genome of M. truncatula and In-Silico PCR were implemented by BLAT software suite, which were also available through MtED database. MtED was built in the PHP script language and as a MySQL relational database system on a Linux server. It has an integrated Web interface, which facilitates ready examination and interpretation of the results of microarray experiments. It is intended to help in selecting gene markers to improve abiotic stress resistance in legumes. MtED is available at http://bioinformatics.cau.edu.cn/MtED/.
Li, Min; Dong, Xiang-yu; Liang, Hao; Leng, Li; Zhang, Hui; Wang, Shou-zhi; Li, Hui; Du, Zhi-Qiang
2017-05-20
Effective management and analysis of precisely recorded phenotypic traits are important components of the selection and breeding of superior livestocks. Over two decades, we divergently selected chicken lines for abdominal fat content at Northeast Agricultural University (Northeast Agricultural University High and Low Fat, NEAUHLF), and collected large volume of phenotypic data related to the investigation on molecular genetic basis of adipose tissue deposition in broilers. To effectively and systematically store, manage and analyze phenotypic data, we built the NEAUHLF Phenome Database (NEAUHLFPD). NEAUHLFPD included the following phenotypic records: pedigree (generations 1-19) and 29 phenotypes, such as body sizes and weights, carcass traits and their corresponding rates. The design and construction strategy of NEAUHLFPD were executed as follows: (1) Framework design. We used Apache as our web server, MySQL and Navicat as database management tools, and PHP as the HTML-embedded language to create dynamic interactive website. (2) Structural components. On the main interface, detailed introduction on the composition, function, and the index buttons of the basic structure of the database could be found. The functional modules of NEAUHLFPD had two main components: the first module referred to the physical storage space for phenotypic data, in which functional manipulation on data can be realized, such as data indexing, filtering, range-setting, searching, etc.; the second module related to the calculation of basic descriptive statistics, where data filtered from the database can be used for the computation of basic statistical parameters and the simultaneous conditional sorting. NEAUHLFPD could be used to effectively store and manage not only phenotypic, but also genotypic and genomics data, which can facilitate further investigation on the molecular genetic basis of chicken adipose tissue growth and development, and expedite the selection and breeding of broilers with low fat content.
A generic minimization random allocation and blinding system on web.
Cai, Hongwei; Xia, Jielai; Xu, Dezhong; Gao, Donghuai; Yan, Yongping
2006-12-01
Minimization is a dynamic randomization method for clinical trials. Although recommended by many researchers, the utilization of minimization has been seldom reported in randomized trials mainly because of the controversy surrounding the validity of conventional analyses and its complexity in implementation. However, both the statistical and clinical validity of minimization were demonstrated in recent studies. Minimization random allocation system integrated with blinding function that could facilitate the implementation of this method in general clinical trials has not been reported. SYSTEM OVERVIEW: The system is a web-based random allocation system using Pocock and Simon minimization method. It also supports multiple treatment arms within a trial, multiple simultaneous trials, and blinding without further programming. This system was constructed with generic database schema design method, Pocock and Simon minimization method and blinding method. It was coded with Microsoft Visual Basic and Active Server Pages (ASP) programming languages. And all dataset were managed with a Microsoft SQL Server database. Some critical programming codes were also provided. SIMULATIONS AND RESULTS: Two clinical trials were simulated simultaneously to test the system's applicability. Not only balanced groups but also blinded allocation results were achieved in both trials. Practical considerations for minimization method, the benefits, general applicability and drawbacks of the technique implemented in this system are discussed. Promising features of the proposed system are also summarized.
Expert system for skin problem consultation in Thai traditional medicine.
Nopparatkiat, Pornchai; na Nagara, Byaporn; Chansa-ngavej, Chuvej
2014-01-01
This paper aimed to demonstrate the research and development of a rule-based expert system for skin problem consulting in the areas of acne, melasma, freckle, wrinkle, and uneven skin tone, with recommended treatments from Thai traditional medicine knowledge. The tool selected for developing the expert system is a software program written in the PHP language. MySQL database is used to work together with PHP for building database of the expert system. The system is web-based and can be reached from anywhere with Internet access. The developed expert system gave recommendations on the skin problem treatment with Thai herbal recipes and Thai herbal cosmetics based on 416 rules derived from primary and secondary sources. The system had been tested by 50 users consisting of dermatologists, Thai traditional medicine doctors, and general users. The developed system was considered good for learning and consultation. The present work showed how such a scattered body of traditional knowledge as Thai traditional medicine and herbal recipes could be collected, organised and made accessible to users and interested parties. The expert system developed herein should contribute in a meaningful way towards preserving the knowledge and helping promote the use of Thai traditional medicine as a practical alternative medicine for the treatment of illnesses.
Evaluation of Sub Query Performance in SQL Server
NASA Astrophysics Data System (ADS)
Oktavia, Tanty; Sujarwo, Surya
2014-03-01
The paper explores several sub query methods used in a query and their impact on the query performance. The study uses experimental approach to evaluate the performance of each sub query methods combined with indexing strategy. The sub query methods consist of in, exists, relational operator and relational operator combined with top operator. The experimental shows that using relational operator combined with indexing strategy in sub query has greater performance compared with using same method without indexing strategy and also other methods. In summary, for application that emphasized on the performance of retrieving data from database, it better to use relational operator combined with indexing strategy. This study is done on Microsoft SQL Server 2012.
Unobtrusive Social Network Data From Email
2008-12-01
PERSON Standard Form 298 (Rev. 8-98) Prescribed by ANSI Std Z39-18 outlook archived files and stores that data into an SQL - database. Communication...Applications ( VBA ) program was installed on the personal computers (PC) of all participants, in the session window of their Microsoft Outlook. Details of
Dr. John H. Hopps Jr. Research Scholars Program
2014-10-20
Program staff, alumni and existing participants. Over the course of the last five months, SageFox has successfully obtained IRB approval for all...and awards. Progress made in development of the HoppsNet system included design and implementation of a relational database in MySQL , development of
Acoustic Metadata Management and Transparent Access to Networked Oceanographic Data Sets
2013-09-30
connectivity (ODBC) compliant data source for which drivers are available (e.g. MySQL , Oracle database, Postgres) can now be imported. Implementation...the possibility of speeding data transmission through compression (implemented) or the potential to use alternative data formats such as Java script
Social Impacts Module (SIM) Transition
2012-09-28
User String The authorized user’s name to access the PAVE database. Applies only to Microsoft SQL Server; leave blank, otherwise. passwd String The...otherwise. passwd String The password if an authorized user’s name is required; otherwise, leave blank driver String The class name for the driver to
TNAURice: Database on rice varieties released from Tamil Nadu Agricultural University
Ramalingam, Jegadeesan; Arul, Loganathan; Sathishkumar, Natarajan; Vignesh, Dhandapani; Thiyagarajan, Katiannan; Samiyappan, Ramasamy
2010-01-01
We developed, TNAURice: a database comprising of the rice varieties released from a public institution, Tamil Nadu Agricultural University (TNAU), Coimbatore, India. Backed by MS-SQL, and ASP-Net at the front end, this database provide information on both quantitative and qualitative descriptors of the rice varities inclusive of their parental details. Enabled by an user friendly search utility, the database can be effectively searched by the varietal descriptors, and the entire contents are navigable as well. The database comes handy to the plant breeders involved in the varietal improvement programs to decide on the choice of parental lines. TNAURice is available for public access at http://www.btistnau.org/germdefault.aspx. PMID:21364829
TNAURice: Database on rice varieties released from Tamil Nadu Agricultural University.
Ramalingam, Jegadeesan; Arul, Loganathan; Sathishkumar, Natarajan; Vignesh, Dhandapani; Thiyagarajan, Katiannan; Samiyappan, Ramasamy
2010-11-27
WE DEVELOPED, TNAURICE: a database comprising of the rice varieties released from a public institution, Tamil Nadu Agricultural University (TNAU), Coimbatore, India. Backed by MS-SQL, and ASP-Net at the front end, this database provide information on both quantitative and qualitative descriptors of the rice varities inclusive of their parental details. Enabled by an user friendly search utility, the database can be effectively searched by the varietal descriptors, and the entire contents are navigable as well. The database comes handy to the plant breeders involved in the varietal improvement programs to decide on the choice of parental lines. TNAURice is available for public access at http://www.btistnau.org/germdefault.aspx.
EST databases and web tools for EST projects.
Shen, Yao-Qing; O'Brien, Emmet; Koski, Liisa; Lang, B Franz; Burger, Gertraud
2009-01-01
This chapter outlines key considerations for constructing and implementing an EST database. Instead of showing the technological details step by step, emphasis is put on the design of an EST database suited to the specific needs of EST projects and how to choose the most suitable tools. Using TBestDB as an example, we illustrate the essential factors to be considered for database construction and the steps for data population and annotation. This process employs technologies such as PostgreSQL, Perl, and PHP to build the database and interface, and tools such as AutoFACT for data processing and annotation. We discuss these in comparison to other available technologies and tools, and explain the reasons for our choices.
A standards-based clinical information system for HIV/AIDS.
Stitt, F W
1995-01-01
To create a clinical data repository to interface the Veteran's Administration (VA) Decentralized Hospital Computer Program (DHCP) and a departmental clinical information system for the management of HIV patients. This system supports record-keeping, decision-making, reporting, and analysis. The database development was designed to overcome two impediments to successful implementations of clinical databases: (i) lack of a standard reference data model, and; (ii) lack of a universal standard for medical concept representation. Health Level Seven (HL7) is a standard protocol that specifies the implementation of interfaces between two computer applications (sender and receiver) from different vendors or sources of electronic data exchange in the health care environment. This eliminates or substantially reduces the custom interface programming and program maintenance that would otherwise be required. HL7 defines the data to be exchanged, the timing of the interchange, and the communication of errors to the application. The formats are generic in nature and must be configured to meet the needs of the two applications involved. The standard conceptually operates at the seventh level of the ISO model for Open Systems Interconnection (OSI). The OSI simply defines the data elements that are exchanged as abstract messages, and does not prescribe the exact bit stream of the messages that flow over the network. Lower level network software developed according to the OSI model may be used to encode and decode the actual bit stream. The OSI protocols are not universally implemented and, therefore, a set of encoding rules for defining the exact representation of a message must be specified. The VA has created an HL7 module to assist DHCP applications in exchanging health care information with other applications using the HL7 protocol. The DHCP HL7 module consists of a set of utility routines and files that provide a generic interface to the HL7 protocol for all DHCP applications. The VA's DHCP core modules are in standard use at 169 hospitals, and the role of the VA system in health care delivery has been discussed elsewhere. This development was performed at the Miami VA Medical Center Special Immunology Unit, where a database was created for an HIV patient registry in 1987. Over 2,300 patient have been entered into a database that supports a problem-oriented summary of the patient's clinical record. The interface to the VA DHCP was designed and implemented to capture information from the patient treatment file, pharmacy, laboratory, radiology, and other modules. We obtained a suite of programs for implementing the HL7 encoding rules from Columbia-Presbyterian Medical Center in New York, written in ANSI C. This toolkit isolates our application programs from the details of the HL7 encoding rules, and allows them to deal with abstract messages and the programming level. While HL7 has become a standard for healthcare message exchange, SQL (Structured Query Language) is the standard for database definition, data manipulation, and query. The target database (Stitt F.W. The Problem-Oriented Medical Synopsis: a patient-centered clinical information system. Proc 17 SCAMC. 1993:88-93) provides clinical workstation functionality. Medical concepts are encoded using a preferred terminology derived from over 15 sources that include the Unified Medical Language System and SNOMed International ( Stitt F.W. The Problem-Oriented Medical Synopsis: coding, indexing, and classification sub-model. Proc 18 SCAMC, 1994: in press). The databases were modeled using the Information Engineering CASE tools, and were written using relational database utilities, including embedded SQL in C (ESQL/C). We linked ESQL/C programs to the HL7 toolkit to allow data to be inserted, deleted, or updated, under transaction control. A graphical format will be used to display the entity-rel
NASA Astrophysics Data System (ADS)
Shao, Weber; Kupelian, Patrick A.; Wang, Jason; Low, Daniel A.; Ruan, Dan
2014-03-01
We devise a paradigm for representing the DICOM-RT structure sets in a database management system, in such way that secondary calculations of geometric information can be performed quickly from the existing contour definitions. The implementation of this paradigm is achieved using the PostgreSQL database system and the PostGIS extension, a geographic information system commonly used for encoding geographical map data. The proposed paradigm eliminates the overhead of retrieving large data records from the database, as well as the need to implement various numerical and data parsing routines, when additional information related to the geometry of the anatomy is desired.
Xia, Jun; Tashpolat, Tiyip; Zhang, Fei; Ji, Hong-jiang
2011-07-01
The characteristic of object spectrum is not only the base of the quantification analysis of remote sensing, but also the main content of the basic research of remote sensing. The typical surface object spectral database in arid areas oasis is of great significance for applied research on remote sensing in soil salinization. In the present paper, the authors took the Ugan-Kuqa River Delta Oasis as an example, unified .NET and the SuperMap platform with SQL Server database stored data, used the B/S pattern and the C# language to design and develop the typical surface object spectral information system, and established the typical surface object spectral database according to the characteristics of arid areas oasis. The system implemented the classified storage and the management of typical surface object spectral information and the related attribute data of the study areas; this system also implemented visualized two-way query between the maps and attribute data, the drawings of the surface object spectral response curves and the processing of the derivative spectral data and its drawings. In addition, the system initially possessed a simple spectral data mining and analysis capabilities, and this advantage provided an efficient, reliable and convenient data management and application platform for the Ugan-Kuqa River Delta Oasis's follow-up study in soil salinization. Finally, It's easy to maintain, convinient for secondary development and practically operating in good condition.
The HITRAN2016 molecular spectroscopic database
NASA Astrophysics Data System (ADS)
Gordon, I. E.; Rothman, L. S.; Hill, C.; Kochanov, R. V.; Tan, Y.; Bernath, P. F.; Birk, M.; Boudon, V.; Campargue, A.; Chance, K. V.; Drouin, B. J.; Flaud, J.-M.; Gamache, R. R.; Hodges, J. T.; Jacquemart, D.; Perevalov, V. I.; Perrin, A.; Shine, K. P.; Smith, M.-A. H.; Tennyson, J.; Toon, G. C.; Tran, H.; Tyuterev, V. G.; Barbe, A.; Császár, A. G.; Devi, V. M.; Furtenbacher, T.; Harrison, J. J.; Hartmann, J.-M.; Jolly, A.; Johnson, T. J.; Karman, T.; Kleiner, I.; Kyuberis, A. A.; Loos, J.; Lyulin, O. M.; Massie, S. T.; Mikhailenko, S. N.; Moazzen-Ahmadi, N.; Müller, H. S. P.; Naumenko, O. V.; Nikitin, A. V.; Polyansky, O. L.; Rey, M.; Rotger, M.; Sharpe, S. W.; Sung, K.; Starikova, E.; Tashkun, S. A.; Auwera, J. Vander; Wagner, G.; Wilzewski, J.; Wcisło, P.; Yu, S.; Zak, E. J.
2017-12-01
This paper describes the contents of the 2016 edition of the HITRAN molecular spectroscopic compilation. The new edition replaces the previous HITRAN edition of 2012 and its updates during the intervening years. The HITRAN molecular absorption compilation is composed of five major components: the traditional line-by-line spectroscopic parameters required for high-resolution radiative-transfer codes, infrared absorption cross-sections for molecules not yet amenable to representation in a line-by-line form, collision-induced absorption data, aerosol indices of refraction, and general tables such as partition sums that apply globally to the data. The new HITRAN is greatly extended in terms of accuracy, spectral coverage, additional absorption phenomena, added line-shape formalisms, and validity. Moreover, molecules, isotopologues, and perturbing gases have been added that address the issues of atmospheres beyond the Earth. Of considerable note, experimental IR cross-sections for almost 300 additional molecules important in different areas of atmospheric science have been added to the database. The compilation can be accessed through www.hitran.org. Most of the HITRAN data have now been cast into an underlying relational database structure that offers many advantages over the long-standing sequential text-based structure. The new structure empowers the user in many ways. It enables the incorporation of an extended set of fundamental parameters per transition, sophisticated line-shape formalisms, easy user-defined output formats, and very convenient searching, filtering, and plotting of data. A powerful application programming interface making use of structured query language (SQL) features for higher-level applications of HITRAN is also provided.
NetIntel: A Database for Manipulation of Rich Social Network Data
2005-03-03
between entities in a social or organizational system. For most of its history , social network analysis has operated on a notion of a dataset - a clearly...and procedural), as well as stored procedure and trigger capabilities. For the current implementation, we have chosen PostgreSQL [1] database. Of the...data and easy-to-use facilities for export of data into analysis tools as well as online browsing and data entry. References [1] Postgresql
2009-07-01
data were recognized as being largely geospatial and thus a GIS was considered the most reasonable way to proceed. The Postgre suite of software also...for the ESRI (2009) geodatabase environment but is applicable for this Postgre -based system. We then introduce and discuss spatial reference...PostgreSQL database using a Postgre ODBC connection. This procedure identified 100 tables with 737 columns. This is after the removal of two
BIGNASim: a NoSQL database structure and analysis portal for nucleic acids simulation data.
Hospital, Adam; Andrio, Pau; Cugnasco, Cesare; Codo, Laia; Becerra, Yolanda; Dans, Pablo D; Battistini, Federica; Torres, Jordi; Goñi, Ramón; Orozco, Modesto; Gelpí, Josep Ll
2016-01-04
Molecular dynamics simulation (MD) is, just behind genomics, the bioinformatics tool that generates the largest amounts of data, and that is using the largest amount of CPU time in supercomputing centres. MD trajectories are obtained after months of calculations, analysed in situ, and in practice forgotten. Several projects to generate stable trajectory databases have been developed for proteins, but no equivalence exists in the nucleic acids world. We present here a novel database system to store MD trajectories and analyses of nucleic acids. The initial data set available consists mainly of the benchmark of the new molecular dynamics force-field, parmBSC1. It contains 156 simulations, with over 120 μs of total simulation time. A deposition protocol is available to accept the submission of new trajectory data. The database is based on the combination of two NoSQL engines, Cassandra for storing trajectories and MongoDB to store analysis results and simulation metadata. The analyses available include backbone geometries, helical analysis, NMR observables and a variety of mechanical analyses. Individual trajectories and combined meta-trajectories can be downloaded from the portal. The system is accessible through http://mmb.irbbarcelona.org/BIGNASim/. Supplementary Material is also available on-line at http://mmb.irbbarcelona.org/BIGNASim/SuppMaterial/. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Rasdaman for Big Spatial Raster Data
NASA Astrophysics Data System (ADS)
Hu, F.; Huang, Q.; Scheele, C. J.; Yang, C. P.; Yu, M.; Liu, K.
2015-12-01
Spatial raster data have grown exponentially over the past decade. Recent advancements on data acquisition technology, such as remote sensing, have allowed us to collect massive observation data of various spatial resolution and domain coverage. The volume, velocity, and variety of such spatial data, along with the computational intensive nature of spatial queries, pose grand challenge to the storage technologies for effective big data management. While high performance computing platforms (e.g., cloud computing) can be used to solve the computing-intensive issues in big data analysis, data has to be managed in a way that is suitable for distributed parallel processing. Recently, rasdaman (raster data manager) has emerged as a scalable and cost-effective database solution to store and retrieve massive multi-dimensional arrays, such as sensor, image, and statistics data. Within this paper, the pros and cons of using rasdaman to manage and query spatial raster data will be examined and compared with other common approaches, including file-based systems, relational databases (e.g., PostgreSQL/PostGIS), and NoSQL databases (e.g., MongoDB and Hive). Earth Observing System (EOS) data collected from NASA's Atmospheric Scientific Data Center (ASDC) will be used and stored in these selected database systems, and a set of spatial and non-spatial queries will be designed to benchmark their performance on retrieving large-scale, multi-dimensional arrays of EOS data. Lessons learnt from using rasdaman will be discussed as well.
Toward An Unstructured Mesh Database
NASA Astrophysics Data System (ADS)
Rezaei Mahdiraji, Alireza; Baumann, Peter Peter
2014-05-01
Unstructured meshes are used in several application domains such as earth sciences (e.g., seismology), medicine, oceanography, cli- mate modeling, GIS as approximate representations of physical objects. Meshes subdivide a domain into smaller geometric elements (called cells) which are glued together by incidence relationships. The subdivision of a domain allows computational manipulation of complicated physical structures. For instance, seismologists model earthquakes using elastic wave propagation solvers on hexahedral meshes. The hexahedral con- tains several hundred millions of grid points and millions of hexahedral cells. Each vertex node in the hexahedrals stores a multitude of data fields. To run simulation on such meshes, one needs to iterate over all the cells, iterate over incident cells to a given cell, retrieve coordinates of cells, assign data values to cells, etc. Although meshes are used in many application domains, to the best of our knowledge there is no database vendor that support unstructured mesh features. Currently, the main tool for querying and manipulating unstructured meshes are mesh libraries, e.g., CGAL and GRAL. Mesh li- braries are dedicated libraries which includes mesh algorithms and can be run on mesh representations. The libraries do not scale with dataset size, do not have declarative query language, and need deep C++ knowledge for query implementations. Furthermore, due to high coupling between the implementations and input file structure, the implementations are less reusable and costly to maintain. A dedicated mesh database offers the following advantages: 1) declarative querying, 2) ease of maintenance, 3) hiding mesh storage structure from applications, and 4) transparent query optimization. To design a mesh database, the first challenge is to define a suitable generic data model for unstructured meshes. We proposed ImG-Complexes data model as a generic topological mesh data model which extends incidence graph model to multi-incidence relationships. We instrument ImG model with sets of optional and application-specific constraints which can be used to check validity of meshes for a specific class of object such as manifold, pseudo-manifold, and simplicial manifold. We conducted experiments to measure the performance of the graph database solution in processing mesh queries and compare it with GrAL mesh library and PostgreSQL database on synthetic and real mesh datasets. The experiments show that each system perform well on specific types of mesh queries, e.g., graph databases perform well on global path-intensive queries. In the future, we investigate database operations for the ImG model and design a mesh query language.
Cellular Consequences of Telomere Shortening in Histologically Normal Breast Tissues
2013-09-01
using the open source, JAVA -based image analysis software package ImageJ (http://rsb.info.nih.gov/ij/) and a custom designed plugin (“Telometer...Tabulated data were stored in a MySQL (http://www.mysql.com) database and viewed through Microsoft Access (Microsoft Corp.). Statistical Analysis For
Teaching Structured Design of Network Algorithms in Enhanced Versions of SQL
ERIC Educational Resources Information Center
de Brock, Bert
2004-01-01
From time to time developers of (database) applications will encounter, explicitly or implicitly, structures such as trees, graphs, and networks. Such applications can, for instance, relate to bills of material, organization charts, networks of (rail)roads, networks of conduit pipes (e.g., plumbing, electricity), telecom networks, and data…
Building a High Performance Metadata Broker using Clojure, NoSQL and Message Queues
NASA Astrophysics Data System (ADS)
Truslove, I.; Reed, S.
2013-12-01
In practice, Earth and Space Science Informatics often relies on getting more done with less: fewer hardware resources, less IT staff, fewer lines of code. As a capacity-building exercise focused on rapid development of high-performance geoinformatics software, the National Snow and Ice Data Center (NSIDC) built a prototype metadata brokering system using a new JVM language, modern database engines and virtualized or cloud computing resources. The metadata brokering system was developed with the overarching goals of (i) demonstrating a technically viable product with as little development effort as possible, (ii) using very new yet very popular tools and technologies in order to get the most value from the least legacy-encumbered code bases, and (iii) being a high-performance system by using scalable subcomponents, and implementation patterns typically used in web architectures. We implemented the system using the Clojure programming language (an interactive, dynamic, Lisp-like JVM language), Redis (a fast in-memory key-value store) as both the data store for original XML metadata content and as the provider for the message queueing service, and ElasticSearch for its search and indexing capabilities to generate search results. On evaluating the results of the prototyping process, we believe that the technical choices did in fact allow us to do more for less, due to the expressive nature of the Clojure programming language and its easy interoperability with Java libraries, and the successful reuse or re-application of high performance products or designs. This presentation will describe the architecture of the metadata brokering system, cover the tools and techniques used, and describe lessons learned, conclusions, and potential next steps.
NASA Astrophysics Data System (ADS)
Moreau, N.; Dubernet, M. L.
2006-07-01
Basecol is a combination of a website (using PHP and HTML) and a MySQL database concerning molecular ro-vibrational transitions induced by collisions with atoms or molecules. This database has been created in view of the scientific preparation of the Heterodyne Instrument for the Far-Infrared on board the Herschel Space Observatory (HSO). Basecol offers an access to numerical and bibliographic data through various output methods such as ASCII, HTML or VOTable (which is a first step towards a VO compliant system). A web service using Apache Axis has been developed in order to provide a direct access to data for external applications.
Web client and ODBC access to legacy database information: a low cost approach.
Sanders, N. W.; Mann, N. H.; Spengler, D. M.
1997-01-01
A new method has been developed for the Department of Orthopaedics of Vanderbilt University Medical Center to access departmental clinical data. Previously this data was stored only in the medical center's mainframe DB2 database, it is now additionally stored in a departmental SQL database. Access to this data is available via any ODBC compliant front-end or a web client. With a small budget and no full time staff, we were able to give our department on-line access to many years worth of patient data that was previously inaccessible. PMID:9357735
NASA Astrophysics Data System (ADS)
Krishna, B.; Gustafson, W. I., Jr.; Vogelmann, A. M.; Toto, T.; Devarakonda, R.; Palanisamy, G.
2016-12-01
This paper presents a new way of providing ARM data discovery through data analysis and visualization services. ARM stands for Atmospheric Radiation Measurement. This Program was created to study cloud formation processes and their influence on radiative transfer and also include additional measurements of aerosol and precipitation at various highly instrumented ground and mobile stations. The total volume of ARM data is roughly 900TB. The current search for ARM data is performed by using its metadata, such as the site name, instrument name, date, etc. NoSQL technologies were explored to improve the capabilities of data searching, not only by their metadata, but also by using the measurement values. Two technologies that are currently being implemented for testing are Apache Cassandra (noSQL database) and Apache Spark (noSQL based analytics framework). Both of these technologies were developed to work in a distributed environment and hence can handle large data for storing and analytics. D3.js is a JavaScript library that can generate interactive data visualizations in web browsers by making use of commonly used SVG, HTML5, and CSS standards. To test the performance of NoSQL for ARM data, we will be using ARM's popular measurements to locate the data based on its value. Recently noSQL technology has been applied to a pilot project called LASSO, which stands for LES ARM Symbiotic Simulation and Observation Workflow. LASSO will be packaging LES output and observations in "data bundles" and analyses will require the ability for users to analyze both observations and LES model output either individually or together across multiple time periods. The LASSO implementation strategy suggests that enormous data storage is required to store the above mentioned quantities. Thus noSQL was used to provide a powerful means to store portions of the data that provided users with search capabilities on each simulation's traits through a web application. Based on the user selection, plots are created dynamically along with ancillary information that enables the user to locate and download data that fulfilled their required traits.
A Grid Metadata Service for Earth and Environmental Sciences
NASA Astrophysics Data System (ADS)
Fiore, Sandro; Negro, Alessandro; Aloisio, Giovanni
2010-05-01
Critical challenges for climate modeling researchers are strongly connected with the increasingly complex simulation models and the huge quantities of produced datasets. Future trends in climate modeling will only increase computational and storage requirements. For this reason the ability to transparently access to both computational and data resources for large-scale complex climate simulations must be considered as a key requirement for Earth Science and Environmental distributed systems. From the data management perspective (i) the quantity of data will continuously increases, (ii) data will become more and more distributed and widespread, (iii) data sharing/federation will represent a key challenging issue among different sites distributed worldwide, (iv) the potential community of users (large and heterogeneous) will be interested in discovery experimental results, searching of metadata, browsing collections of files, compare different results, display output, etc.; A key element to carry out data search and discovery, manage and access huge and distributed amount of data is the metadata handling framework. What we propose for the management of distributed datasets is the GRelC service (a data grid solution focusing on metadata management). Despite the classical approaches, the proposed data-grid solution is able to address scalability, transparency, security and efficiency and interoperability. The GRelC service we propose is able to provide access to metadata stored in different and widespread data sources (relational databases running on top of MySQL, Oracle, DB2, etc. leveraging SQL as query language, as well as XML databases - XIndice, eXist, and libxml2 based documents, adopting either XPath or XQuery) providing a strong data virtualization layer in a grid environment. Such a technological solution for distributed metadata management leverages on well known adopted standards (W3C, OASIS, etc.); (ii) supports role-based management (based on VOMS), which increases flexibility and scalability; (iii) provides full support for Grid Security Infrastructure, which means (authorization, mutual authentication, data integrity, data confidentiality and delegation); (iv) is compatible with existing grid middleware such as gLite and Globus and finally (v) is currently adopted at the Euro-Mediterranean Centre for Climate Change (CMCC - Italy) to manage the entire CMCC data production activity as well as in the international Climate-G testbed.
Operationalizing Semantic Medline for meeting the information needs at point of care.
Rastegar-Mojarad, Majid; Li, Dingcheng; Liu, Hongfang
2015-01-01
Scientific literature is one of the popular resources for providing decision support at point of care. It is highly desirable to bring the most relevant literature to support the evidence-based clinical decision making process. Motivated by the recent advance in semantically enhanced information retrieval, we have developed a system, which aims to bring semantically enriched literature, Semantic Medline, to meet the information needs at point of care. This study reports our work towards operationalizing the system for real time use. We demonstrate that the migration of a relational database implementation to a NoSQL (Not only SQL) implementation significantly improves the performance and makes the use of Semantic Medline at point of care decision support possible.
MaGnET: Malaria Genome Exploration Tool.
Sharman, Joanna L; Gerloff, Dietlind L
2013-09-15
The Malaria Genome Exploration Tool (MaGnET) is a software tool enabling intuitive 'exploration-style' visualization of functional genomics data relating to the malaria parasite, Plasmodium falciparum. MaGnET provides innovative integrated graphic displays for different datasets, including genomic location of genes, mRNA expression data, protein-protein interactions and more. Any selection of genes to explore made by the user is easily carried over between the different viewers for different datasets, and can be changed interactively at any point (without returning to a search). Free online use (Java Web Start) or download (Java application archive and MySQL database; requires local MySQL installation) at http://malariagenomeexplorer.org joanna.sharman@ed.ac.uk or dgerloff@ffame.org Supplementary data are available at Bioinformatics online.
Operationalizing Semantic Medline for meeting the information needs at point of care
Rastegar-Mojarad, Majid; Li, Dingcheng; Liu, Hongfang
2015-01-01
Scientific literature is one of the popular resources for providing decision support at point of care. It is highly desirable to bring the most relevant literature to support the evidence-based clinical decision making process. Motivated by the recent advance in semantically enhanced information retrieval, we have developed a system, which aims to bring semantically enriched literature, Semantic Medline, to meet the information needs at point of care. This study reports our work towards operationalizing the system for real time use. We demonstrate that the migration of a relational database implementation to a NoSQL (Not only SQL) implementation significantly improves the performance and makes the use of Semantic Medline at point of care decision support possible. PMID:26306259
Spatio-Temporal Data Model for Integrating Evolving Nation-Level Datasets
NASA Astrophysics Data System (ADS)
Sorokine, A.; Stewart, R. N.
2017-10-01
Ability to easily combine the data from diverse sources in a single analytical workflow is one of the greatest promises of the Big Data technologies. However, such integration is often challenging as datasets originate from different vendors, governments, and research communities that results in multiple incompatibilities including data representations, formats, and semantics. Semantics differences are hardest to handle: different communities often use different attribute definitions and associate the records with different sets of evolving geographic entities. Analysis of global socioeconomic variables across multiple datasets over prolonged time is often complicated by the difference in how boundaries and histories of countries or other geographic entities are represented. Here we propose an event-based data model for depicting and tracking histories of evolving geographic units (countries, provinces, etc.) and their representations in disparate data. The model addresses the semantic challenge of preserving identity of geographic entities over time by defining criteria for the entity existence, a set of events that may affect its existence, and rules for mapping between different representations (datasets). Proposed model is used for maintaining an evolving compound database of global socioeconomic and environmental data harvested from multiple sources. Practical implementation of our model is demonstrated using PostgreSQL object-relational database with the use of temporal, geospatial, and NoSQL database extensions.
Database usage and performance for the Fermilab Run II experiments
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bonham, D.; Box, D.; Gallas, E.
2004-12-01
The Run II experiments at Fermilab, CDF and D0, have extensive database needs covering many areas of their online and offline operations. Delivering data to users and processing farms worldwide has represented major challenges to both experiments. The range of applications employing databases includes, calibration (conditions), trigger information, run configuration, run quality, luminosity, data management, and others. Oracle is the primary database product being used for these applications at Fermilab and some of its advanced features have been employed, such as table partitioning and replication. There is also experience with open source database products such as MySQL for secondary databasesmore » used, for example, in monitoring. Tools employed for monitoring the operation and diagnosing problems are also described.« less
Brassica ASTRA: an integrated database for Brassica genomic research.
Love, Christopher G; Robinson, Andrew J; Lim, Geraldine A C; Hopkins, Clare J; Batley, Jacqueline; Barker, Gary; Spangenberg, German C; Edwards, David
2005-01-01
Brassica ASTRA is a public database for genomic information on Brassica species. The database incorporates expressed sequences with Swiss-Prot and GenBank comparative sequence annotation as well as secondary Gene Ontology (GO) annotation derived from the comparison with Arabidopsis TAIR GO annotations. Simple sequence repeat molecular markers are identified within resident sequences and mapped onto the closely related Arabidopsis genome sequence. Bacterial artificial chromosome (BAC) end sequences derived from the Multinational Brassica Genome Project are also mapped onto the Arabidopsis genome sequence enabling users to identify candidate Brassica BACs corresponding to syntenic regions of Arabidopsis. This information is maintained in a MySQL database with a web interface providing the primary means of interrogation. The database is accessible at http://hornbill.cspp.latrobe.edu.au.
EasyKSORD: A Platform of Keyword Search Over Relational Databases
NASA Astrophysics Data System (ADS)
Peng, Zhaohui; Li, Jing; Wang, Shan
Keyword Search Over Relational Databases (KSORD) enables casual users to use keyword queries (a set of keywords) to search relational databases just like searching the Web, without any knowledge of the database schema or any need of writing SQL queries. Based on our previous work, we design and implement a novel KSORD platform named EasyKSORD for users and system administrators to use and manage different KSORD systems in a novel and simple manner. EasyKSORD supports advanced queries, efficient data-graph-based search engines, multiform result presentations, and system logging and analysis. Through EasyKSORD, users can search relational databases easily and read search results conveniently, and system administrators can easily monitor and analyze the operations of KSORD and manage KSORD systems much better.
Ultrabroadband photonic internet: safety aspects
NASA Astrophysics Data System (ADS)
Kalicki, Arkadiusz; Romaniuk, Ryszard
2008-11-01
Web applications became most popular medium in the Internet. Popularity, easiness of web application frameworks together with careless development results in high number of vulnerabilities and attacks. There are several types of attacks possible because of improper input validation. SQL injection is ability to execute arbitrary SQL queries in a database through an existing application. Cross-site scripting is the vulnerability which allows malicious web users to inject code into the web pages viewed by other users. Cross-Site Request Forgery (CSRF) is an attack that tricks the victim into loading a page that contains malicious request. Web spam in blogs. There are several techniques to mitigate attacks. Most important are web application strong design, correct input validation, defined data types for each field and parameterized statements in SQL queries. Server hardening with firewall, modern security policies systems and safe web framework interpreter configuration are essential. It is advised to keep proper security level on client side, keep updated software and install personal web firewalls or IDS/IPS systems. Good habits are logging out from services just after finishing work and using even separate web browser for most important sites, like e-banking.
StarView: The object oriented design of the ST DADS user interface
NASA Technical Reports Server (NTRS)
Williams, J. D.; Pollizzi, J. A.
1992-01-01
StarView is the user interface being developed for the Hubble Space Telescope Data Archive and Distribution Service (ST DADS). ST DADS is the data archive for HST observations and a relational database catalog describing the archived data. Users will use StarView to query the catalog and select appropriate datasets for study. StarView sends requests for archived datasets to ST DADS which processes the requests and returns the database to the user. StarView is designed to be a powerful and extensible user interface. Unique features include an internal relational database to navigate query results, a form definition language that will work with both CRT and X interfaces, a data definition language that will allow StarView to work with any relational database, and the ability to generate adhoc queries without requiring the user to understand the structure of the ST DADS catalog. Ultimately, StarView will allow the user to refine queries in the local database for improved performance and merge in data from external sources for correlation with other query results. The user will be able to create a query from single or multiple forms, merging the selected attributes into a single query. Arbitrary selection of attributes for querying is supported. The user will be able to select how query results are viewed. A standard form or table-row format may be used. Navigation capabilities are provided to aid the user in viewing query results. Object oriented analysis and design techniques were used in the design of StarView to support the mechanisms and concepts required to implement these features. One such mechanism is the Model-View-Controller (MVC) paradigm. The MVC allows the user to have multiple views of the underlying database, while providing a consistent mechanism for interaction regardless of the view. This approach supports both CRT and X interfaces while providing a common mode of user interaction. Another powerful abstraction is the concept of a Query Model. This concept allows a single query to be built form a single or multiple forms before it is submitted to ST DADS. Supporting this concept is the adhoc query generator which allows the user to select and qualify an indeterminate number attributes from the database. The user does not need any knowledge of how the joins across various tables are to be resolved. The adhoc generator calculates the joins automatically and generates the correct SQL query.
The Armed Forces Casualty Assistance Readiness Enhancement System (CARES): Design for Flexibility
2006-06-01
Special Form SQL Structured Query Language SSA Social Security Administration U USMA United States Military Academy V VB Visual Basic VBA Visual Basic for...of Abbreviations ................................................................... 26 Appendix B: Key VBA Macros and MS Excel Coding...internet portal, CARES Version 1.0 is a MS Excel spreadsheet application that contains a considerable number of Visual Basic for Applications ( VBA
NASA Astrophysics Data System (ADS)
Maffei, A. R.; Chandler, C. L.; Work, T.; Allen, J.; Groman, R. C.; Fox, P. A.
2009-12-01
Content Management Systems (CMSs) provide powerful features that can be of use to oceanographic (and other geo-science) data managers. However, in many instances, geo-science data management offices have previously designed customized schemas for their metadata. The WHOI Ocean Informatics initiative and the NSF funded Biological Chemical and Biological Data Management Office (BCO-DMO) have jointly sponsored a project to port an existing, relational database containing oceanographic metadata, along with an existing interface coded in Cold Fusion middleware, to a Drupal6 Content Management System. The goal was to translate all the existing database tables, input forms, website reports, and other features present in the existing system to employ Drupal CMS features. The replacement features include Drupal content types, CCK node-reference fields, themes, RDB, SPARQL, workflow, and a number of other supporting modules. Strategic use of some Drupal6 CMS features enables three separate but complementary interfaces that provide access to oceanographic research metadata via the MySQL database: 1) a Drupal6-powered front-end; 2) a standard SQL port (used to provide a Mapserver interface to the metadata and data; and 3) a SPARQL port (feeding a new faceted search capability being developed). Future plans include the creation of science ontologies, by scientist/technologist teams, that will drive semantically-enabled faceted search capabilities planned for the site. Incorporation of semantic technologies included in the future Drupal 7 core release is also anticipated. Using a public domain CMS as opposed to proprietary middleware, and taking advantage of the many features of Drupal 6 that are designed to support semantically-enabled interfaces will help prepare the BCO-DMO database for interoperability with other ecosystem databases.
A Quality-Control-Oriented Database for a Mesoscale Meteorological Observation Network
NASA Astrophysics Data System (ADS)
Lussana, C.; Ranci, M.; Uboldi, F.
2012-04-01
In the operational context of a local weather service, data accessibility and quality related issues must be managed by taking into account a wide set of user needs. This work describes the structure and the operational choices made for the operational implementation of a database system storing data from highly automated observing stations, metadata and information on data quality. Lombardy's environmental protection agency, ARPA Lombardia, manages a highly automated mesoscale meteorological network. A Quality Assurance System (QAS) ensures that reliable observational information is collected and disseminated to the users. The weather unit in ARPA Lombardia, at the same time an important QAS component and an intensive data user, has developed a database specifically aimed to: 1) providing quick access to data for operational activities and 2) ensuring data quality for real-time applications, by means of an Automatic Data Quality Control (ADQC) procedure. Quantities stored in the archive include hourly aggregated observations of: precipitation amount, temperature, wind, relative humidity, pressure, global and net solar radiation. The ADQC performs several independent tests on raw data and compares their results in a decision-making procedure. An important ADQC component is the Spatial Consistency Test based on Optimal Interpolation. Interpolated and Cross-Validation analysis values are also stored in the database, providing further information to human operators and useful estimates in case of missing data. The technical solution adopted is based on a LAMP (Linux, Apache, MySQL and Php) system, constituting an open source environment suitable for both development and operational practice. The ADQC procedure itself is performed by R scripts directly interacting with the MySQL database. Users and network managers can access the database by using a set of web-based Php applications.
The VERITAS Facility: A Virtual Environment Platform for Human Performance Research
2016-01-01
IAE The IAE supports the audio environment that users experience during the course of an experiment. This includes environmental sounds, user-to...future, we are looking towards a database-based system that would use MySQL or an equivalent product to store the large data sets and provide standard
Teaching Undergraduate Software Engineering Using Open Source Development Tools
2012-01-01
ware. Some example appliances are: a LAMP stack, Redmine, MySQL database, Moodle, Tom- cat on Apache, and Bugzilla. Some of the important features...Ada, C, C++, PHP , Py- thon, etc., and also supports a wide range of SDKs such as Google’s Android SDK and the Google Web Toolkit SDK. Additionally
78 FR 55699 - Privacy Act of 1974; Proposed New Systems of Records
Federal Register 2010, 2011, 2012, 2013, 2014
2013-09-11
... FMC OIT staff at its Washington, DC headquarters. The FMC GSS is made up of servers, switches, gateways, and two firewall devices. The servers, switches, gateways, and firewall devices are physically... within the confines of FMC-39, FMC General Support System (FMC GSS) and FMC-41, FMC SQL Database (FMCDB...
Oracle Applications Patch Administration Tool (PAT) Beta Version
DOE Office of Scientific and Technical Information (OSTI.GOV)
2002-01-04
PAT is a Patch Administration Tool that provides analysis, tracking, and management of Oracle Application patches. This includes capabilities as outlined below: Patch Analysis & Management Tool Outline of capabilities: Administration Patch Data Maintenance -- track Oracle Application patches applied to what database instance & machine Patch Analysis capture text files (readme.txt and driver files) form comparison detail report comparison detail PL/SQL package comparison detail SQL scripts detail JSP module comparison detail Parse and load the current applptch.txt (10.7) or load patch data from Oracle Application database patch tables (11i) Display Analysis -- Compare patch to be applied with currentmore » Oracle Application installed Appl_top code versions Patch Detail Module comparison detail Analyze and display one Oracle Application module patch. Patch Management -- automatic queue and execution of patches Administration Parameter maintenance -- setting for directory structure of Oracle Application appl_top Validation data maintenance -- machine names and instances to patch Operation Patch Data Maintenance Schedule a patch (queue for later execution) Run a patch (queue for immediate execution) Review the patch logs Patch Management Reports« less
WebCN: A web-based computation tool for in situ-produced cosmogenic nuclides
NASA Astrophysics Data System (ADS)
Ma, Xiuzeng; Li, Yingkui; Bourgeois, Mike; Caffee, Marc; Elmore, David; Granger, Darryl; Muzikar, Paul; Smith, Preston
2007-06-01
Cosmogenic nuclide techniques are increasingly being utilized in geoscience research. For this it is critical to establish an effective, easily accessible and well defined tool for cosmogenic nuclide computations. We have been developing a web-based tool (WebCN) to calculate surface exposure ages and erosion rates based on the nuclide concentrations measured by the accelerator mass spectrometry. WebCN for 10Be and 26Al has been finished and published at http://www.physics.purdue.edu/primelab/for_users/rockage.html. WebCN for 36Cl is under construction. WebCN is designed as a three-tier client/server model and uses the open source PostgreSQL for the database management and PHP for the interface design and calculations. On the client side, an internet browser and Microsoft Access are used as application interfaces to access the system. Open Database Connectivity is used to link PostgreSQL and Microsoft Access. WebCN accounts for both spatial and temporal distributions of the cosmic ray flux to calculate the production rates of in situ-produced cosmogenic nuclides at the Earth's surface.
Multiple-Feature Extracting Modules Based Leak Mining System Design
Cho, Ying-Chiang; Pan, Jen-Yi
2013-01-01
Over the years, human dependence on the Internet has increased dramatically. A large amount of information is placed on the Internet and retrieved from it daily, which makes web security in terms of online information a major concern. In recent years, the most problematic issues in web security have been e-mail address leakage and SQL injection attacks. There are many possible causes of information leakage, such as inadequate precautions during the programming process, which lead to the leakage of e-mail addresses entered online or insufficient protection of database information, a loophole that enables malicious users to steal online content. In this paper, we implement a crawler mining system that is equipped with SQL injection vulnerability detection, by means of an algorithm developed for the web crawler. In addition, we analyze portal sites of the governments of various countries or regions in order to investigate the information leaking status of each site. Subsequently, we analyze the database structure and content of each site, using the data collected. Thus, we make use of practical verification in order to focus on information security and privacy through black-box testing. PMID:24453892
Multiple-feature extracting modules based leak mining system design.
Cho, Ying-Chiang; Pan, Jen-Yi
2013-01-01
Over the years, human dependence on the Internet has increased dramatically. A large amount of information is placed on the Internet and retrieved from it daily, which makes web security in terms of online information a major concern. In recent years, the most problematic issues in web security have been e-mail address leakage and SQL injection attacks. There are many possible causes of information leakage, such as inadequate precautions during the programming process, which lead to the leakage of e-mail addresses entered online or insufficient protection of database information, a loophole that enables malicious users to steal online content. In this paper, we implement a crawler mining system that is equipped with SQL injection vulnerability detection, by means of an algorithm developed for the web crawler. In addition, we analyze portal sites of the governments of various countries or regions in order to investigate the information leaking status of each site. Subsequently, we analyze the database structure and content of each site, using the data collected. Thus, we make use of practical verification in order to focus on information security and privacy through black-box testing.
Ontology-based data integration between clinical and research systems.
Mate, Sebastian; Köpcke, Felix; Toddenroth, Dennis; Martin, Marcus; Prokosch, Hans-Ulrich; Bürkle, Thomas; Ganslandt, Thomas
2015-01-01
Data from the electronic medical record comprise numerous structured but uncoded elements, which are not linked to standard terminologies. Reuse of such data for secondary research purposes has gained in importance recently. However, the identification of relevant data elements and the creation of database jobs for extraction, transformation and loading (ETL) are challenging: With current methods such as data warehousing, it is not feasible to efficiently maintain and reuse semantically complex data extraction and trans-formation routines. We present an ontology-supported approach to overcome this challenge by making use of abstraction: Instead of defining ETL procedures at the database level, we use ontologies to organize and describe the medical concepts of both the source system and the target system. Instead of using unique, specifically developed SQL statements or ETL jobs, we define declarative transformation rules within ontologies and illustrate how these constructs can then be used to automatically generate SQL code to perform the desired ETL procedures. This demonstrates how a suitable level of abstraction may not only aid the interpretation of clinical data, but can also foster the reutilization of methods for un-locking it.
An Expert System for Diagnosing Eye Diseases using Forward Chaining Method
NASA Astrophysics Data System (ADS)
Munaiseche, C. P. C.; Kaparang, D. R.; Rompas, P. T. D.
2018-02-01
Expert System is a system that seeks to adopt human knowledge to the computer, so that the computer can solve problems which are usually done by experts. The purpose of medical expert system is to support the diagnosis process of physicians. It considers facts and symptoms to provide diagnosis. This implies that a medical expert system uses knowledge about diseases and facts about the patients to suggest diagnosis. The aim of this research is to design an expert system application for diagnosing eye diseases using forward chaining method and to figure out user acceptance to this application through usability testing. Eye is selected because it is one of the five senses which is very sensitive and important. The scope of the work is extended to 16 types of eye diseases with 41 symptoms of the disease, arranged in 16 rules. The computer programming language employed was the PHP programming language and MySQL as the Relational Database Management System (RDBMS). The results obtained showed that the expert system was able to successfully diagnose eye diseases corresponding to the selected symptoms entered as query and the system evaluation through usability testing showed the expert system for diagnosis eye diseases had very good rate of usability, which includes learnability, efficiency, memorability, errors, and satisfaction so that the system can be received in the operational environment.
NASA Astrophysics Data System (ADS)
Grimes, J.; Mahoney, A. R.; Heinrichs, T. A.; Eicken, H.
2012-12-01
Sensor data can be highly variable in nature and also varied depending on the physical quantity being observed, sensor hardware and sampling parameters. The sea ice mass balance site (MBS) operated in Barrow by the University of Alaska Fairbanks (http://seaice.alaska.edu/gi/observatories/barrow_sealevel) is a multisensor platform consisting of a thermistor string, air and water temperature sensors, acoustic altimeters above and below the ice and a humidity sensor. Each sensor has a unique specification and configuration. The data from multiple sensors are combined to generate sea ice data products. For example, ice thickness is calculated from the positions of the upper and lower ice surfaces, which are determined using data from downward-looking and upward-looking acoustic altimeters above and below the ice, respectively. As a data clearinghouse, the Geographic Information Network of Alaska (GINA) processes real time data from many sources, including the Barrow MBS. Doing so requires a system that is easy to use, yet also offers the flexibility to handle data from multisensor observing platforms. In the case of the Barrow MBS, the metadata system needs to accommodate the addition of new and retirement of old sensors from year to year as well as instrument configuration changes caused by, for example, spring melt or inquisitive polar bears. We also require ease of use for both administrators and end users. Here we present the data and processing steps of using sensor data system powered by the NoSQL storage engine, MongoDB. The system has been developed to ingest, process, disseminate and archive data from the Barrow MBS. Storing sensor data in a generalized format, from many different sources, is a challenging task, especially for traditional SQL databases with a set schema. MongoDB is a NoSQL (not only SQL) database that does not require a fixed schema. There are several advantages using this model over the traditional relational database management system (RDBMS) model databases. The lack of a required schema allows flexibility in how the data can be ingested into the database. For example, MongoDB imposes no restrictions on field names. For researchers using the system, this means that the name they have chosen for the sensor is carried through the database, any processing, and to the final output helping to preserve data integrity. Also, MongoDB allows the data to be pushed to it dynamically meaning that field attributes can be defined at the point of ingestion. This allows any sensor data to be ingested as a document and for this functionality to be transferred to the user interface, allowing greater adaptability to different use-case scenarios. In presenting the MondoDB data system being developed for the Barrow MBS, we demonstrate the versatility of this approach and its suitability as the foundation of a Barrow node of the Arctic Observing Network. Authors Jason Grimes - Geographic Information Network of Alaska - jason@gina.alaska.edu Andy Mahony - Geophysical Institute - mahoney@gi.alaska.edu Hajo Eiken - Geophysical Institute - Hajo.Eicken@gi.alaska.edu Tom Heinrichs - Geographic Information Network of Alaska - Tom.Heinrichs@alaska.edu
NASA Astrophysics Data System (ADS)
Celicourt, P.; Piasecki, M.
2014-12-01
The high cost of hydro-meteorological data acquisition, communication and publication systems along with limited qualified human resources is considered as the main reason why hydro-meteorological data collection remains a challenge especially in developing countries. Despite significant advances in sensor network technologies which gave birth to open hardware and software, low-cost (less than $50) and low-power (in the order of a few miliWatts) sensor platforms in the last two decades, sensors and sensor network deployment remains a labor-intensive, time consuming, cumbersome, and thus expensive task. These factors give rise for the need to develop a affordable, simple to deploy, scalable and self-organizing end-to-end (from sensor to publication) system suitable for deployment in such countries. The design of the envisioned system will consist of a few Sensed-And-Programmed Arduino-based sensor nodes with low-cost sensors measuring parameters relevant to hydrological processes and a Raspberry Pi micro-computer hosting the in-the-field back-end data management. This latter comprises the Python/Django model of the CUAHSI Observations Data Model (ODM) namely DjangODM backed by a PostgreSQL Database Server. We are also developing a Python-based data processing script which will be paired with the data autoloading capability of Django to populate the DjangODM database with the incoming data. To publish the data, the WOFpy (WaterOneFlow Web Services in Python) developed by the Texas Water Development Board for 'Water Data for Texas' which can produce WaterML web services from a variety of back-end database installations such as SQLite, MySQL, and PostgreSQL will be used. A step further would be the development of an appealing online visualization tool using Python statistics and analytics tools (Scipy, Numpy, Pandas) showing the spatial distribution of variables across an entire watershed as a time variant layer on top of a basemap.
Implementation of a WAP-based telemedicine system for patient monitoring.
Hung, Kevin; Zhang, Yuan-Ting
2003-06-01
Many parties have already demonstrated telemedicine applications that use cellular phones and the Internet. A current trend in telecommunication is the convergence of wireless communication and computer network technologies, and the emergence of wireless application protocol (WAP) devices is an example. Since WAP will also be a common feature found in future mobile communication devices, it is worthwhile to investigate its use in telemedicine. This paper describes the implementation and experiences with a WAP-based telemedicine system for patient-monitoring that has been developed in our laboratory. It utilizes WAP devices as mobile access terminals for general inquiry and patient-monitoring services. Authorized users can browse the patients' general data, monitored blood pressure (BP), and electrocardiogram (ECG) on WAP devices in store-and-forward mode. The applications, written in wireless markup language (WML), WMLScript, and Perl, resided in a content server. A MySQL relational database system was set up to store the BP readings, ECG data, patient records, clinic and hospital information, and doctors' appointments with patients. A wireless ECG subsystem was built for recording ambulatory ECG in an indoor environment and for storing ECG data into the database. For testing, a WAP phone compliant with WAP 1.1 was used at GSM 1800 MHz by circuit-switched data (CSD) to connect to the content server through a WAP gateway, which was provided by a mobile phone service provider in Hong Kong. Data were successfully retrieved from the database and displayed on the WAP phone. The system shows how WAP can be feasible in remote patient-monitoring and patient data retrieval.
The HITRAN2016 Molecular Spectroscopic Database
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gordon, I. E.; Rothman, L. S.; Hill, C.
This article describes the contents of the 2016 edition of the HITRAN molecular spectroscopic compilation. The new edition replaces the previous HITRAN edition of 2012 and its updates during the intervening years. The HITRAN molecular absorption compilation is composed of five major components: the traditional line-by-line spectroscopic parameters required for high-resolution radiative-transfer codes, infrared absorption cross-sections for molecules not yet amenable to representation in a line-by-line form, collision-induced absorption data, aerosol indices of refraction, and general tables such as partition sums that apply globally to the data. The new HITRAN is greatly extended in terms of accuracy, spectral coverage, additionalmore » absorption phenomena, added line-shape formalisms, and validity. Moreover, molecules, isotopologues, and perturbing gases have been added that address the issues of atmospheres beyond the Earth. Of considerable note, experimental IR cross-sections for almost 300 additional molecules important in different areas of atmospheric science have been added to the database. The compilation can be accessed through www.hitran.org. Most of the HITRAN data have now been cast into an underlying relational database structure that offers many advantages over the long-standing sequential text-based structure. The new structure empowers the user in many ways. It enables the incorporation of an extended set of fundamental parameters per transition, sophisticated line-shape formalisms, easy user-defined output formats, and very convenient searching, filtering, and plotting of data. Finally, a powerful application programming interface making use of structured query language (SQL) features for higher-level applications of HITRAN is also provided.« less
The HITRAN2016 Molecular Spectroscopic Database
Gordon, I. E.; Rothman, L. S.; Hill, C.; ...
2017-07-05
This article describes the contents of the 2016 edition of the HITRAN molecular spectroscopic compilation. The new edition replaces the previous HITRAN edition of 2012 and its updates during the intervening years. The HITRAN molecular absorption compilation is composed of five major components: the traditional line-by-line spectroscopic parameters required for high-resolution radiative-transfer codes, infrared absorption cross-sections for molecules not yet amenable to representation in a line-by-line form, collision-induced absorption data, aerosol indices of refraction, and general tables such as partition sums that apply globally to the data. The new HITRAN is greatly extended in terms of accuracy, spectral coverage, additionalmore » absorption phenomena, added line-shape formalisms, and validity. Moreover, molecules, isotopologues, and perturbing gases have been added that address the issues of atmospheres beyond the Earth. Of considerable note, experimental IR cross-sections for almost 300 additional molecules important in different areas of atmospheric science have been added to the database. The compilation can be accessed through www.hitran.org. Most of the HITRAN data have now been cast into an underlying relational database structure that offers many advantages over the long-standing sequential text-based structure. The new structure empowers the user in many ways. It enables the incorporation of an extended set of fundamental parameters per transition, sophisticated line-shape formalisms, easy user-defined output formats, and very convenient searching, filtering, and plotting of data. Finally, a powerful application programming interface making use of structured query language (SQL) features for higher-level applications of HITRAN is also provided.« less
An Ada/SQL (Structured Query Language) Application Scanner.
1988-03-01
Digital ...8217 (" DIGITS "), 46 new STRING’ ("DO"), new STRING’ ("ELSE"), new STRING’ ("ELSIF"), new STRING’ ("END"), new STRING’ ("ENTRY"), new STRING’ ("EXCEPTION...INTEGERPRINT; generic type NUM is digits <>; package FLOATPRINT is package txtprts.ada 18 prcdr PR (FL inFL %YE LINE n LINTYPE UNCLASSIFIED procedure
ERIC Educational Resources Information Center
Sanchez, Pablo; Zorrilla, Marta; Duque, Rafael; Nieto-Reyes, Alicia
2011-01-01
Models in Software Engineering are considered as abstract representations of software systems. Models highlight relevant details for a certain purpose, whereas irrelevant ones are hidden. Models are supposed to make system comprehension easier by reducing complexity. Therefore, models should play a key role in education, since they would ease the…
New Directions in the NOAO Observing Proposal System
NASA Astrophysics Data System (ADS)
Gasson, David; Bell, Dave
For the past eight years NOAO has been refining its on-line observing proposal system. Virtually all related processes are now handled electronically. Members of the astronomical community can submit proposals through email, web form, or via the Gemini Phase I Tool. NOAO staff can use the system to do administrative tasks, scheduling, and compilation of various statistics. In addition, all information relevant to the TAC process is made available on-line, including the proposals themselves (in HTML, PDF and PostScript) and technical comments. Grades and TAC comments are entered and edited through web forms, and can be sorted and filtered according to specified criteria. Current developments include a move away from proprietary solutions, toward open standards such as SQL (in the form of the MySQL relational database system), Perl, PHP and XML.
Development of a Prototype Detailing Management System for the Civil Engineer Corps
2002-09-01
73 Figure 30. Associate Members To Billets SQL Statement...American Standard Code for Information Interchange EMPRS Electronic Military Personnel Record System VBA Visual Basic for Applications SDLC...capturing keystrokes or carrying out a series of actions when opening an Access database. In the place of macros, VBA should be used because of its
2001-09-01
of MEIMS was programmed in Microsoft Access 97 using Visual Basic for Applications ( VBA ). This prototype had very little documentation. The FAA...using Acess 2000 as an interface and SQL server as the database engine. Question 1: Did you have any problems accessing the program? Y / N
DOE Office of Scientific and Technical Information (OSTI.GOV)
Paulson, Patrick R.; Gibson, Tara D.; Schuchardt, Karen L.
2008-03-01
Requirements for the provenance store and access API are developed. Existing RDF stores and APIs are evaluated against the requirements and performance benchmarks. The team’s conclusion is to use MySQL as a database backend, with a possible move to Oracle in the near-term future. Both Jena and Sesame’s APIs will be supported, but new code will use the Jena API
Web-Based Search and Plot System for Nuclear Reaction Data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Otuka, N.; Nakagawa, T.; Fukahori, T.
2005-05-24
A web-based search and plot system for nuclear reaction data has been developed, covering experimental data in EXFOR format and evaluated data in ENDF format. The system is implemented for Linux OS, with Perl and MySQL used for CGI scripts and the database manager, respectively. Two prototypes for experimental and evaluated data are presented.
High Performance Semantic Factoring of Giga-Scale Semantic Graph Databases
DOE Office of Scientific and Technical Information (OSTI.GOV)
Joslyn, Cliff A.; Adolf, Robert D.; Al-Saffar, Sinan
2010-10-04
As semantic graph database technology grows to address components ranging from extant large triple stores to SPARQL endpoints over SQL-structured relational databases, it will become increasingly important to be able to bring high performance computational resources to bear on their analysis, interpretation, and visualization, especially with respect to their innate semantic structure. Our research group built a novel high performance hybrid system comprising computational capability for semantic graph database processing utilizing the large multithreaded architecture of the Cray XMT platform, conventional clusters, and large data stores. In this paper we describe that architecture, and present the results of our deployingmore » that for the analysis of the Billion Triple dataset with respect to its semantic factors.« less
Clinical Databases for Chest Physicians.
Courtwright, Andrew M; Gabriel, Peter E
2018-04-01
A clinical database is a repository of patient medical and sociodemographic information focused on one or more specific health condition or exposure. Although clinical databases may be used for research purposes, their primary goal is to collect and track patient data for quality improvement, quality assurance, and/or actual clinical management. This article aims to provide an introduction and practical advice on the development of small-scale clinical databases for chest physicians and practice groups. Through example projects, we discuss the pros and cons of available technical platforms, including Microsoft Excel and Access, relational database management systems such as Oracle and PostgreSQL, and Research Electronic Data Capture. We consider approaches to deciding the base unit of data collection, creating consensus around variable definitions, and structuring routine clinical care to complement database aims. We conclude with an overview of regulatory and security considerations for clinical databases. Copyright © 2018 American College of Chest Physicians. Published by Elsevier Inc. All rights reserved.
A Framework for Cloudy Model Optimization and Database Storage
NASA Astrophysics Data System (ADS)
Calvén, Emilia; Helton, Andrew; Sankrit, Ravi
2018-01-01
We present a framework for producing Cloudy photoionization models of the nebular emission from novae ejecta and storing a subset of the results in SQL database format for later usage. The database can be searched for models best fitting observed spectral line ratios. Additionally, the framework includes an optimization feature that can be used in tandem with the database to search for and improve on models by creating new Cloudy models while, varying the parameters. The database search and optimization can be used to explore the structures of nebulae by deriving their properties from the best-fit models. The goal is to provide the community with a large database of Cloudy photoionization models, generated from parameters reflecting conditions within novae ejecta, that can be easily fitted to observed spectral lines; either by directly accessing the database using the framework code or by usage of a website specifically made for this purpose.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Poliakov, Alexander; Couronne, Olivier
2002-11-04
Aligning large vertebrate genomes that are structurally complex poses a variety of problems not encountered on smaller scales. Such genomes are rich in repetitive elements and contain multiple segmental duplications, which increases the difficulty of identifying true orthologous SNA segments in alignments. The sizes of the sequences make many alignment algorithms designed for comparing single proteins extremely inefficient when processing large genomic intervals. We integrated both local and global alignment tools and developed a suite of programs for automatically aligning large vertebrate genomes and identifying conserved non-coding regions in the alignments. Our method uses the BLAT local alignment program tomore » find anchors on the base genome to identify regions of possible homology for a query sequence. These regions are postprocessed to find the best candidates which are then globally aligned using the AVID global alignment program. In the last step conserved non-coding segments are identified using VISTA. Our methods are fast and the resulting alignments exhibit a high degree of sensitivity, covering more than 90% of known coding exons in the human genome. The GenomeVISTA software is a suite of Perl programs that is built on a MySQL database platform. The scheduler gets control data from the database, builds a queve of jobs, and dispatches them to a PC cluster for execution. The main program, running on each node of the cluster, processes individual sequences. A Perl library acts as an interface between the database and the above programs. The use of a separate library allows the programs to function independently of the database schema. The library also improves on the standard Perl MySQL database interfere package by providing auto-reconnect functionality and improved error handling.« less
CBS Genome Atlas Database: a dynamic storage for bioinformatic results and sequence data.
Hallin, Peter F; Ussery, David W
2004-12-12
Currently, new bacterial genomes are being published on a monthly basis. With the growing amount of genome sequence data, there is a demand for a flexible and easy-to-maintain structure for storing sequence data and results from bioinformatic analysis. More than 150 sequenced bacterial genomes are now available, and comparisons of properties for taxonomically similar organisms are not readily available to many biologists. In addition to the most basic information, such as AT content, chromosome length, tRNA count and rRNA count, a large number of more complex calculations are needed to perform detailed comparative genomics. DNA structural calculations like curvature and stacking energy, DNA compositions like base skews, oligo skews and repeats at the local and global level are just a few of the analysis that are presented on the CBS Genome Atlas Web page. Complex analysis, changing methods and frequent addition of new models are factors that require a dynamic database layout. Using basic tools like the GNU Make system, csh, Perl and MySQL, we have created a flexible database environment for storing and maintaining such results for a collection of complete microbial genomes. Currently, these results counts to more than 220 pieces of information. The backbone of this solution consists of a program package written in Perl, which enables administrators to synchronize and update the database content. The MySQL database has been connected to the CBS web-server via PHP4, to present a dynamic web content for users outside the center. This solution is tightly fitted to existing server infrastructure and the solutions proposed here can perhaps serve as a template for other research groups to solve database issues. A web based user interface which is dynamically linked to the Genome Atlas Database can be accessed via www.cbs.dtu.dk/services/GenomeAtlas/. This paper has a supplemental information page which links to the examples presented: www.cbs.dtu.dk/services/GenomeAtlas/suppl/bioinfdatabase.