Why Save Your Course as a Relational Database?
ERIC Educational Resources Information Center
Hamilton, Gregory C.; Katz, David L.; Davis, James E.
2000-01-01
Describes a system that stores course materials for computer-based training programs in a relational database called Of Course! Outlines the basic structure of the databases; explains distinctions between Of Course! and other authoring languages; and describes how data is retrieved from the database and presented to the student. (Author/LRW)
[A relational database to store Poison Centers calls].
Barelli, Alessandro; Biondi, Immacolata; Tafani, Chiara; Pellegrini, Aristide; Soave, Maurizio; Gaspari, Rita; Annetta, Maria Giuseppina
2006-01-01
Italian Poison Centers answer approximately 100,000 calls per year. Potentially, this activity is a huge source of data for toxicovigilance and for syndromic surveillance. During the last decade, surveillance systems for early detection of outbreaks have drawn the attention of public health institutions due to the threat of terrorism and high-profile disease outbreaks. Poisoning surveillance needs the ongoing, systematic collection, analysis, interpretation, and dissemination of harmonised data about poisonings from all Poison Centers, for use in public health action to reduce morbidity and mortality and to improve health. The entity-relationship model for a Poison Center relational database is extremely complex and has not been studied in detail; for this reason, data collection is not harmonised across Italian Poison Centers. Entities are recognizable concepts, either concrete or abstract, such as patients and poisons, or events which have relevance to the database, such as calls. The connectivity and cardinality of the relationships are complex as well. A one-to-many relationship exists between calls and patients: for one instance of the entity calls, there are zero, one, or many instances of the entity patients. At the same time, a one-to-many relationship exists between patients and poisons: for one instance of the entity patients, there are zero, one, or many instances of the entity poisons. This paper presents a relational model for a Poison Center database that allows harmonised collection of Poison Center call data.
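The cardinalities described above map directly onto foreign keys. A minimal sketch of such a schema, using SQLite for illustration (table and column names are hypothetical, not taken from the paper):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE calls (
    call_id     INTEGER PRIMARY KEY,
    received_at TEXT NOT NULL
);
-- One call concerns zero, one, or many patients.
CREATE TABLE patients (
    patient_id INTEGER PRIMARY KEY,
    call_id    INTEGER NOT NULL REFERENCES calls(call_id),
    age_years  INTEGER
);
-- One patient is exposed to zero, one, or many poisons.
CREATE TABLE poisons (
    poison_id  INTEGER PRIMARY KEY,
    patient_id INTEGER NOT NULL REFERENCES patients(patient_id),
    substance  TEXT NOT NULL
);
""")
```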
Serials Management by Microcomputer: The Potential of DBMS.
ERIC Educational Resources Information Center
Vogel, J. Thomas; Burns, Lynn W.
1984-01-01
Describes serials management at Philadelphia College of Textiles and Science library via a microcomputer, a file manager called PFS, and a relational database management system called dBase II. Check-in procedures, programming with dBase II, "static" and "active" databases, and claim procedures are discussed. Check-in forms are…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Femec, D.A.
This report describes two code-generating tools used to speed design and implementation of relational databases and user interfaces: CREATE-SCHEMA and BUILD-SCREEN. CREATE-SCHEMA produces the SQL commands that actually create and define the database. BUILD-SCREEN takes templates for data entry screens and generates the screen management system routine calls to display the desired screen. Both tools also generate the related FORTRAN declaration statements and precompiled SQL calls. Included with this report is the source code for a number of FORTRAN routines and functions used by the user interface. This code is broadly applicable to a number of different databases.
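As an illustration of this kind of code generation (not the actual CREATE-SCHEMA tool, whose input format is not given here), a few lines of Python can turn a table description into the corresponding SQL DDL:

```python
def create_table_sql(table, columns):
    """Emit a CREATE TABLE statement from a {name: sql_type} mapping."""
    cols = ",\n    ".join(f"{name} {sqltype}" for name, sqltype in columns.items())
    return f"CREATE TABLE {table} (\n    {cols}\n);"

# Hypothetical table description; a tool like CREATE-SCHEMA would also emit
# the matching FORTRAN declarations and precompiled SQL calls.
print(create_table_sql("samples", {"sample_id": "INTEGER PRIMARY KEY",
                                   "measured_at": "TEXT NOT NULL",
                                   "value": "REAL"}))
```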
ERIC Educational Resources Information Center
Lynch, Clifford A.
1991-01-01
Describes several aspects of the problem of supporting information retrieval system query requirements in the relational database management system (RDBMS) environment and proposes an extension to query processing called nonmaterialized relations. User interactions with information retrieval systems are discussed, and nonmaterialized relations are…
Performance assessment of EMR systems based on post-relational database.
Yu, Hai-Yan; Li, Jing-Song; Zhang, Xiao-Guang; Tian, Yu; Suzuki, Muneou; Araki, Kenji
2012-08-01
Post-relational databases provide high performance and are currently widely used in American hospitals. As few hospital information systems (HIS) in either China or Japan are based on post-relational databases, here we introduce a new-generation electronic medical records (EMR) system called Hygeia, which was developed with the post-relational database Caché and the latest platform Ensemble. Utilizing the benefits of a post-relational database, Hygeia is equipped with an "integration" feature that allows all system users to access data, with a fast response time, anywhere and at any time. Performance tests of databases in EMR systems were implemented in both China and Japan. First, a comparison test was conducted between a post-relational database, Caché, and a relational database, Oracle, embedded in the EMR systems of a medium-sized first-class hospital in China. Second, a user terminal test was done on the EMR system Izanami, which is based on the same database, Caché, and operates efficiently at the Miyazaki University Hospital in Japan. The results showed that the post-relational database Caché works faster than the relational database Oracle and performs well in a real-time EMR system.
Optimization of the Controlled Evaluation of Closed Relational Queries
NASA Astrophysics Data System (ADS)
Biskup, Joachim; Lochner, Jan-Hendrik; Sonntag, Sebastian
For relational databases, controlled query evaluation is an effective inference control mechanism that preserves confidentiality with respect to a previously declared confidentiality policy. Implementations of controlled query evaluation usually lack efficiency due to costly theorem-prover calls. Suitably constrained controlled query evaluation can be implemented efficiently, but is not flexible enough from the perspective of database users and security administrators. In this paper, we propose an optimized framework for controlled query evaluation in relational databases that is efficiently implementable on the one hand and relaxes the constraints of previous approaches on the other.
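A deliberately simplified sketch of the core idea (mine, not the paper's framework): before answering, the evaluator checks whether releasing the truthful answer would reveal a declared secret, and refuses if so. Real controlled query evaluation also checks what follows logically from all previously released answers, which is where the costly theorem-prover calls come in; all names below are illustrative.

```python
# Facts are tuples; the policy is a set of secret facts.
SECRETS = {("diagnosis", "patient7", "hiv")}
DB = {("diagnosis", "patient7", "hiv"), ("ward", "patient7", "W3")}

def controlled_query(fact):
    """Refuse any query whose truthful answer would release a secret."""
    if fact in DB and fact in SECRETS:
        return "refused"   # answering truthfully would violate the policy
    return fact in DB

print(controlled_query(("ward", "patient7", "W3")))        # True
print(controlled_query(("diagnosis", "patient7", "hiv")))  # refused
```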
THE NATIONAL EXPOSURE RESEARCH LABORATORY'S CONSOLIDATED HUMAN ACTIVITY DATABASE
EPA's National Exposure Research Laboratory (NERL) has combined data from 12 U.S. studies related to human activities into one comprehensive data system that can be accessed via the Internet. The data system is called the Consolidated Human Activity Database (CHAD), and it is ...
Calling and life satisfaction: it's not about having it, it's about living it.
Duffy, Ryan D; Allan, Blake A; Autin, Kelsey L; Bott, Elizabeth M
2013-01-01
The present study examined the relation of career calling to life satisfaction among a diverse sample of 553 working adults, with a specific focus on the distinction between perceiving a calling (sensing a calling to a career) and living a calling (actualizing one's calling in one's current career). As hypothesized, the relation of perceiving a calling to life satisfaction was fully mediated by living a calling. On the basis of this finding, a structural equation model was tested to examine possible mediators between living a calling and life satisfaction. As hypothesized, the relation of living a calling to life satisfaction was partially mediated by job satisfaction and life meaning, and the link between living a calling and job satisfaction was mediated by work meaning and career commitment. Modifications of the model also revealed that the link of living a calling to life meaning was mediated by work meaning. Implications for research and practice are discussed.
THE NATIONAL EXPOSURE RESEARCH LABORATORY'S COMPREHENSIVE HUMAN ACTIVITY DATABASE
EPA's National Exposure Research Laboratory (NERL) has combined data from nine U.S. studies related to human activities into one comprehensive data system that can be accessed via the world-wide web. The data system is called CHAD-Consolidated Human Activity Database-and it is ...
NASA Technical Reports Server (NTRS)
Maluf, David A.; Tran, Peter B.
2003-01-01
An object-relational database management system is an integrated, hybrid, cooperative approach that combines the best practices of both the relational model, with its SQL queries, and the object-oriented, semantic paradigm for supporting complex data creation. In this paper, a highly scalable, information-on-demand database framework, called NETMARK, is introduced. NETMARK takes advantage of the Oracle 8i object-relational database, using physical-address data types for very efficient keyword search of records spanning both context and content. NETMARK was originally developed in early 2000 as a research and development prototype to handle the vast amounts of unstructured and semi-structured documents existing within NASA enterprises. Today, NETMARK is a flexible, high-throughput open database framework for managing, storing, and searching unstructured or semi-structured arbitrary hierarchical models, such as XML and HTML.
Impact of the mass media on calls to the CDC National AIDS Hotline.
Fan, D P
1996-06-01
This paper considers new computer methodologies for assessing the impact of different types of public health information. The example used public service announcements (PSAs) and mass media news to predict the volume of attempted calls to the CDC National AIDS Hotline from December 1992 through the end of 1993. The analysis relied solely on data from electronic databases. Newspaper stories and television news transcripts were obtained from the NEXIS electronic database and were scored by machine for AIDS coverage. The PSA database was generated by computer monitoring of advertising distributed by the Centers for Disease Control and Prevention (CDC) and by others. The volume of call attempts was collected automatically by the private branch exchange (PBX) of the Hotline telephone system. The call attempts, the PSAs and the news story data were related to each other using both a standard time series method and the statistical model of ideodynamics. The analysis indicated that the only significant explanatory variable for the call attempts was PSAs produced by the CDC. One possible explanation is that these commercials all included the Hotline telephone number, while the other information sources did not.
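A minimal sketch of the kind of time-series regression involved (synthetic data; the paper's actual model, ideodynamics, is more elaborate): regress daily call attempts on the number of PSA airings.

```python
import numpy as np

rng = np.random.default_rng(0)
days = 365
psa_airings = rng.poisson(3.0, size=days)                       # synthetic daily PSA counts
calls = 200 + 40 * psa_airings + rng.normal(0, 25, size=days)   # synthetic call attempts

# Ordinary least squares: calls ~ intercept + beta * psa_airings
X = np.column_stack([np.ones(days), psa_airings])
beta, *_ = np.linalg.lstsq(X, calls, rcond=None)
print(f"intercept={beta[0]:.1f}, extra calls per PSA airing={beta[1]:.1f}")
```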
Security Controls in the Stockpoint Logistics Integrated Communications Environment (SPLICE).
1985-03-01
call programs as authorized after checks by the Terminal Management Subsystem on SAS databases. SAS overlays the TANDEM GUARDIAN operating system to... Security Access Profile database (SAP) and a query capability generating various security reports. SAS operates with the System Monitor (SMON) subsystem... system to DDN and other components. The first SAS component to be reviewed is the SAP database. SAP is organized into two types of files. Relational
A Methodology for Benchmarking Relational Database Machines,
1984-01-01
user benchmarks is to compare the multiple users to the best-case performance The data for each query classification coll and the performance... called a benchmark. The term benchmark originates from the markers used by surveyors in establishing common reference points for their measure... formatted databases. In order to further simplify the problem, we restrict our study to those DBMs which support the relational model. A survey
An Extensible Schema-less Database Framework for Managing High-throughput Semi-Structured Documents
NASA Technical Reports Server (NTRS)
Maluf, David A.; Tran, Peter B.; La, Tracy; Clancy, Daniel (Technical Monitor)
2002-01-01
An object-relational database management system is an integrated, hybrid, cooperative approach that combines the best practices of both the relational model, with its SQL queries, and the object-oriented, semantic paradigm for supporting complex data creation. In this paper, a highly scalable, information-on-demand database framework, called NETMARK, is introduced. NETMARK takes advantage of the Oracle 8i object-relational database, using physical-address data types for very efficient keyword searches of records for both context and content. NETMARK was originally developed in early 2000 as a research and development prototype to handle the vast amounts of unstructured and semi-structured documents existing within NASA enterprises. Today, NETMARK is a flexible, high-throughput open database framework for managing, storing, and searching unstructured or semi-structured arbitrary hierarchical models, such as XML and HTML.
NASA Astrophysics Data System (ADS)
Bößwetter, Daniel
Much has been written about the pros and cons of column orientation as a means to speed up read-mostly analytic workloads in relational databases. In this paper we try to dissect the primitive mechanisms of a database that help express the coherence of tuples, and we present a novel way of organizing relational data in order to exploit the advantages of both the row-oriented and the column-oriented worlds. As we go, we break with yet another bad habit of databases, namely the equal granularity of reads and writes, which leads us to the introduction of consecutive clusters of disk pages called super-pages.
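The layout difference is easy to see in miniature. A sketch (illustrative only, not the paper's super-page format): the same relation stored row-wise and column-wise, where an attribute scan touches contiguous data only in the second layout.

```python
# Three tuples of (id, price, qty).
rows = [(1, 9.99, 3), (2, 4.50, 1), (3, 7.25, 8)]

# Row-oriented: tuples stored contiguously; good for whole-record reads/writes.
row_store = [value for tup in rows for value in tup]

# Column-oriented: each attribute stored contiguously; good for analytic scans.
col_store = {
    "id":    [t[0] for t in rows],
    "price": [t[1] for t in rows],
    "qty":   [t[2] for t in rows],
}

# Summing one attribute scans a contiguous list in the column layout ...
print(sum(col_store["price"]))
# ... but strides through unrelated fields in the row layout.
print(sum(row_store[i] for i in range(1, len(row_store), 3)))
```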
47 CFR 64.615 - TRS User Registration Database and administrator.
Code of Federal Regulations, 2014 CFR
2014-10-01
... 47 Telecommunication 3 2014-10-01 TRS User Registration Database and... Registration Database and administrator. (a) TRS User Registration Database. (1) VRS providers shall validate... Database on a per-call basis. Emergency 911 calls are excepted from this requirement. (i) Validation shall...
47 CFR 64.615 - TRS User Registration Database and administrator.
Code of Federal Regulations, 2013 CFR
2013-10-01
... 47 Telecommunication 3 2013-10-01 TRS User Registration Database and... Registration Database and administrator. (a) TRS User Registration Database. (1) VRS providers shall validate... Database on a per-call basis. Emergency 911 calls are excepted from this requirement. (i) Validation shall...
Code of Federal Regulations, 2014 CFR
2014-10-01
... subscriber calls. (e) The term database method means a number portability method that utilizes one or more external databases for providing called party routing information. (f) The term downstream database means a database owned and operated by an individual carrier for the purpose of providing number portability in...
Code of Federal Regulations, 2011 CFR
2011-10-01
... subscriber calls. (e) The term database method means a number portability method that utilizes one or more external databases for providing called party routing information. (f) The term downstream database means a database owned and operated by an individual carrier for the purpose of providing number portability in...
Code of Federal Regulations, 2012 CFR
2012-10-01
... subscriber calls. (e) The term database method means a number portability method that utilizes one or more external databases for providing called party routing information. (f) The term downstream database means a database owned and operated by an individual carrier for the purpose of providing number portability in...
Code of Federal Regulations, 2013 CFR
2013-10-01
... subscriber calls. (e) The term database method means a number portability method that utilizes one or more external databases for providing called party routing information. (f) The term downstream database means a database owned and operated by an individual carrier for the purpose of providing number portability in...
AgeFactDB--the JenAge Ageing Factor Database--towards data integration in ageing research.
Hühne, Rolf; Thalheim, Torsten; Sühnel, Jürgen
2014-01-01
AgeFactDB (http://agefactdb.jenage.de) is a database aimed at the collection and integration of ageing phenotype data including lifespan information. Ageing factors are considered to be genes, chemical compounds or other factors such as dietary restriction, whose action results in a changed lifespan or another ageing phenotype. Any information related to the effects of ageing factors is called an observation and is presented on observation pages. To provide concise access to the complete information for a particular ageing factor, corresponding observations are also summarized on ageing factor pages. In a first step, ageing-related data were primarily taken from existing databases such as the Ageing Gene Database--GenAge, the Lifespan Observations Database and the Dietary Restriction Gene Database--GenDR. In addition, we have started to include new ageing-related information. Based on homology data taken from the HomoloGene Database, AgeFactDB also provides observation and ageing factor pages of genes that are homologous to known ageing-related genes. These homologues are considered as candidate or putative ageing-related genes. AgeFactDB offers a variety of search and browse options, and also allows the download of ageing factor or observation lists in TSV, CSV and XML formats.
XML technology planning database : lessons learned
NASA Technical Reports Server (NTRS)
Some, Raphael R.; Neff, Jon M.
2005-01-01
A hierarchical Extensible Markup Language (XML) database called XCALIBR (XML Analysis LIBRary) has been developed by the New Millennium Program to assist in technology return-on-investment (ROI) analysis and technology portfolio optimization. The database contains mission requirements and technology capabilities, which are related by use of an XML dictionary. The XML dictionary codifies a standardized taxonomy for space missions, systems, subsystems and technologies. In addition to being used for ROI analysis, the database is being examined for use in project planning, tracking and documentation. During the past year, the database has moved from development into alpha testing. This paper describes the lessons learned during construction and testing of the prototype database and the motivation for moving from an XML taxonomy to a standard XML-based ontology.
Producing approximate answers to database queries
NASA Technical Reports Server (NTRS)
Vrbsky, Susan V.; Liu, Jane W. S.
1993-01-01
We have designed and implemented a query processor, called APPROXIMATE, that makes approximate answers available if part of the database is unavailable or if there is not enough time to produce an exact answer. The accuracy of the approximate answers produced improves monotonically with the amount of data retrieved to produce the result. The exact answer is produced if all of the needed data are available and query processing is allowed to continue until completion. The monotone query processing algorithm of APPROXIMATE works within the standard relational algebra framework and can be implemented on a relational database system with little change to the relational architecture. We describe here the approximation semantics of APPROXIMATE that serves as the basis for meaningful approximations of both set-valued and single-valued queries. We show how APPROXIMATE is implemented to make effective use of semantic information, provided by an object-oriented view of the database, and describe the additional overhead required by APPROXIMATE.
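A toy illustration of the monotone idea (mine, not APPROXIMATE's algorithm): represent an approximate answer as a pair of bounds, the tuples certainly in the answer and the tuples possibly in it, and refine the pair as more data becomes available.

```python
def approximate_select(predicate, available_chunks, all_ids):
    """Yield successive (certain, possible) approximations of a selection.

    `certain` only grows and `possible` only shrinks, so each step is at
    least as accurate as the last; the final step is the exact answer.
    """
    certain = set()
    possible = set(all_ids)            # anything unexamined might qualify
    for chunk in available_chunks:     # chunks become available over time
        for row_id, value in chunk:
            if predicate(value):
                certain.add(row_id)
            else:
                possible.discard(row_id)
        yield certain, possible

chunks = [[(1, 5), (2, 12)], [(3, 40)]]
for certain, possible in approximate_select(lambda v: v > 10, chunks, {1, 2, 3}):
    print(certain, possible)
```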
RDIS: The Rabies Disease Information System.
Dharmalingam, Baskeran; Jothi, Lydia
2015-01-01
Rabies is a deadly viral disease causing acute inflammation or encephalitis of the brain in human beings and other mammals. It is therefore of interest to collect information related to the disease from several sources, including known literature databases, for further analysis and interpretation. Hence, we describe the development of a database called the Rabies Disease Information System (RDIS) for this purpose. The online database describes the etiology, epidemiology, pathogenesis and pathology of the disease using diagrammatic representations. It provides information on several carriers of the rabies virus, such as dog, bat, fox and civet, and their distributions around the world. Information related to the urban and sylvatic cycles of transmission of the virus is also made available. The database also contains information related to available diagnostic methods and vaccines for humans and other animals. This information is of use to medical, veterinary and paramedical practitioners, students, researchers, pet owners, animal lovers, livestock handlers, travelers and many others. The database is freely available at http://rabies.mscwbif.org/home.html.
Insect barcode information system.
Pratheepa, Maria; Jalali, Sushil Kumar; Arokiaraj, Robinson Silvester; Venkatesan, Thiruvengadam; Nagesh, Mandadi; Panda, Madhusmita; Pattar, Sharath
2014-01-01
The Insect Barcode Information System, called Insect Barcode Informática (IBIn), is an online database resource developed by the National Bureau of Agriculturally Important Insects, Bangalore. This database provides acquisition, storage, analysis and publication of DNA barcode records of agriculturally important insects, for researchers specifically in India and other countries. It bridges a gap in bioinformatics by integrating molecular, morphological and distribution details of agriculturally important insects. IBIn was developed in PHP/MySQL using relational database management concepts. It is based on the client-server architecture, where many clients can access data simultaneously. IBIn is freely available online and is user-friendly. It allows registered users to input new information and to search and view information related to DNA barcodes of agriculturally important insects. This paper provides the current status of insect barcoding in India and a brief introduction to the IBIn database. http://www.nabg-nbaii.res.in/barcode.
Three Dimensional Visualization of GOES Cloud Data Using Octrees
1993-06-01
structure for CAD of integrated circuits that can subdivide the cubes into more complex polyhedrons. Medical imaging is also taking advantage of the... image name (no .ext):' ACCEPT 501, CIGOES 501 FORMAT(A) CALL OPENDB('PARAM', ISTATRM) IF (ISTATRM .NE. 0) CALL FRIMERR('Error opening database.', + ISTATRM) CALL OLDIMAGE(1, CIGOES, STATUS...
Using an image-extended relational database to support content-based image retrieval in a PACS.
Traina, Caetano; Traina, Agma J M; Araújo, Myrian R B; Bueno, Josiane M; Chino, Fabio J T; Razente, Humberto; Azevedo-Marques, Paulo M
2005-12-01
This paper presents a new Picture Archiving and Communication System (PACS), called cbPACS, which has content-based image retrieval capabilities. The cbPACS answers range and k-nearest-neighbor similarity queries, employing a relational database manager extended to support images. The images are compared through their features, which are extracted by an image-processing module and stored in the extended relational database. The database extensions were developed to answer similarity queries efficiently by taking advantage of specialized indexing methods. The main concept supporting the extensions is the definition, inside the relational manager, of distance functions based on features extracted from the images. An extension to the SQL language enables the construction of an interpreter that intercepts the extended commands and translates them to standard SQL, allowing any relational database server to be used. So far, the implemented system works on features based on the color distribution of the images, through normalized histograms as well as metric histograms. Metric histograms are invariant to scale, translation and rotation of images, and also to brightness transformations. The cbPACS is prepared to integrate new image features, based on texture and on the shape of the main objects in the image.
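For flavor, a minimal sketch of the feature side (not the cbPACS code): compute a normalized intensity histogram and rank stored images by histogram distance, the kind of distance function the extended SQL would expose.

```python
import numpy as np

def normalized_histogram(pixels, bins=16):
    """Normalized intensity histogram: invariant to image size."""
    hist, _ = np.histogram(pixels, bins=bins, range=(0, 256))
    return hist / hist.sum()

def knn(query_hist, stored, k=2):
    """Return the k image ids whose histograms are closest (L1 distance)."""
    dists = [(np.abs(query_hist - h).sum(), img_id) for img_id, h in stored.items()]
    return [img_id for _, img_id in sorted(dists)[:k]]

rng = np.random.default_rng(1)
stored = {i: normalized_histogram(rng.integers(0, 256, 4096)) for i in range(5)}
query = normalized_histogram(rng.integers(0, 256, 10000))
print(knn(query, stored, k=2))
```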
Database architectures for Space Telescope Science Institute
NASA Astrophysics Data System (ADS)
Lubow, Stephen
1993-08-01
At STScI nearly all large applications require database support. A general purpose architecture has been developed and is in use that relies upon an extended client-server paradigm. Processing is in general distributed across three processes, each of which generally resides on its own processor. Database queries are evaluated on one such process, called the DBMS server. The DBMS server software is provided by a database vendor. The application issues database queries and is called the application client. This client uses a set of generic DBMS application programming calls through our STDB/NET programming interface. Intermediate between the application client and the DBMS server is the STDB/NET server. This server accepts generic query requests from the application and converts them into the specific requirements of the DBMS server. In addition, it accepts query results from the DBMS server and passes them back to the application. Typically the STDB/NET server is local to the DBMS server, while the application client may be remote. The STDB/NET server provides additional capabilities such as database deadlock restart and performance monitoring. This architecture is currently in use for some major STScI applications, including the ground support system. We are currently investigating means of providing ad hoc query support to users through the above architecture. Such support is critical for providing flexible user interface capabilities. The Universal Relation advocated by Ullman, Kernighan, and others appears to be promising. In this approach, the user sees the entire database as a single table, thereby freeing the user from needing to understand the detailed schema. A software layer provides the translation between the user and detailed schema views of the database. However, many subtle issues arise in making this transformation. We are currently exploring this scheme for use in the Hubble Space Telescope user interface to the data archive system (DADS).
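A rough sketch of the three-process pattern described above (STDB/NET is the paper's name; the code is an illustrative stand-in, not STScI's interface): the application issues generic calls, and an intermediate layer translates them into the specific requirements of one DBMS.

```python
import sqlite3

class GenericDBServer:
    """Stand-in for the STDB/NET middle tier: accepts generic query requests
    and converts them into the dialect of a specific DBMS server."""
    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)   # plays the "DBMS server" role here

    def query(self, relation, columns, where=None, params=()):
        sql = f"SELECT {', '.join(columns)} FROM {relation}"
        if where:
            sql += f" WHERE {where}"
        return self.conn.execute(sql, params).fetchall()

server = GenericDBServer()
server.conn.executescript(
    "CREATE TABLE exposures (id INTEGER, target TEXT);"
    "INSERT INTO exposures VALUES (1, 'M31'), (2, 'M87');")
# The application client sees only the generic call, not the SQL dialect.
print(server.query("exposures", ["id", "target"], "target = ?", ("M31",)))
```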
NASA Technical Reports Server (NTRS)
Wrenn, Gregory A.
2005-01-01
This report describes a database routine called DB90 which is intended for use with scientific and engineering computer programs. The software is written in the Fortran 90/95 programming language standard with file input and output routines written in the C programming language. These routines should be completely portable to any computing platform and operating system that has Fortran 90/95 and C compilers. DB90 allows a program to supply relation names and up to 5 integer key values to uniquely identify each record of each relation. This permits the user to select records or retrieve data in any desired order.
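The keying scheme is easy to mimic: each record is addressed by a relation name plus up to five integer keys. A tiny Python analogue (illustrative; DB90 itself is Fortran with C I/O):

```python
class KeyedStore:
    """Records addressed by (relation, k1..k5), in the spirit of DB90's keys."""
    def __init__(self):
        self.records = {}

    def put(self, relation, keys, data):
        if not 1 <= len(keys) <= 5:
            raise ValueError("DB90-style records use 1 to 5 integer keys")
        self.records[(relation,) + tuple(keys)] = data

    def get(self, relation, keys):
        return self.records[(relation,) + tuple(keys)]

store = KeyedStore()
store.put("tile", (42, 7), {"thickness_mm": 12.3})
print(store.get("tile", (42, 7)))
```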
Kiener, Joos
2013-12-11
Research in organic chemistry generates samples of novel chemicals together with their properties and other related data. The involved scientists must be able to store this data and search it by chemical structure. There are commercial solutions for common needs like chemical registration systems or electronic lab notebooks. However, for the specific requirements of in-house databases and processes no such solutions exist. Another issue is that commercial solutions carry the risk of vendor lock-in and may require an expensive license for a proprietary relational database management system. To speed up and simplify the development of applications that require chemical structure search capabilities, I have developed Molecule Database Framework. The framework abstracts the storing and searching of chemical structures into method calls, so software developers do not require extensive knowledge about chemistry or the underlying database cartridge. This decreases application development time. Molecule Database Framework is written in Java and was created by integrating existing free and open-source tools and frameworks. The core functionality includes support for multi-component compounds (mixtures), import and export of SD-files, and optional security (authorization). For chemical structure searching, Molecule Database Framework leverages the capabilities of the Bingo Cartridge for PostgreSQL and provides type-safe searching, caching, transactions and optional method-level security. Furthermore, the design of the entity classes and the reasoning behind it are explained. By means of a simple web application I describe how the framework could be used; I then benchmarked this example application to create some basic performance expectations for chemical structure searches and for import and export of SD-files. The simple web application showed that Molecule Database Framework successfully abstracts chemical structure searches and SD-file import and export into simple method calls. The framework offers good search performance on a standard laptop without any database tuning, due in part to the fact that chemical structure searches are paged and cached. Molecule Database Framework is available for download on the project's web page on bitbucket: https://bitbucket.org/kienerj/moleculedatabaseframework.
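From a caller's perspective, the method-call abstraction the author describes might look roughly like this (a hypothetical Python analogue of the Java API; class and method names are invented for illustration):

```python
class MoleculeRepository:
    """Illustrative stand-in for a structure-search abstraction: callers pass
    structures as SMILES strings and never see the cartridge SQL underneath."""
    def __init__(self):
        self._store = {}   # id -> SMILES; a real backend would be PostgreSQL + Bingo

    def register(self, mol_id, smiles):
        self._store[mol_id] = smiles

    def substructure_search(self, fragment):
        # Toy matcher: a substring test stands in for a real substructure query.
        return [mid for mid, smi in self._store.items() if fragment in smi]

repo = MoleculeRepository()
repo.register("mol-1", "CCO")       # ethanol
repo.register("mol-2", "c1ccccc1")  # benzene
print(repo.substructure_search("CO"))  # -> ['mol-1']
```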
Kumar, Pankaj; Chaitanya, Pasumarthy S; Nagarajaram, Hampapathalu A
2011-01-01
PSSRdb (Polymorphic Simple Sequence Repeats database) (http://www.cdfd.org.in/PSSRdb/) is a relational database of polymorphic simple sequence repeats (PSSRs) extracted from 85 different species of prokaryotes. Simple sequence repeats (SSRs) are tandem repeats of nucleotide motifs of sizes 1-6 bp and are highly polymorphic. SSR mutations in and around coding regions affect transcription and translation of genes; such changes underpin the phase variations and antigenic variations seen in some bacteria. Although SSR-mediated phase variation and antigenic variation have been well studied in some bacteria, many other prokaryotic species are yet to be investigated for SSR-mediated adaptive and other evolutionary advantages. As part of our ongoing studies on SSR polymorphism in prokaryotes, we compared the genome sequences of the various strains and isolates available for 85 different species of prokaryotes, extracted the SSRs showing length variations, and created a relational database called PSSRdb. This database gives useful information such as the location of PSSRs in genomes, length variation across genomes, the regions harboring PSSRs, etc. The information provided in this database is very useful for further research and analysis of SSRs in prokaryotes.
Lee, Christopher T; Bulterys, Marc; Martel, Lise D; Dahl, Benjamin A
2016-03-11
The epidemic of Ebola virus disease (Ebola) in West Africa began in Guinea in late 2013 (1), and on August 8, 2014, the World Health Organization (WHO) declared the epidemic a Public Health Emergency of International Concern (2). Guinea was declared Ebola-free on December 29, 2015, and is under a 90 day period of enhanced surveillance, following 3,351 confirmed and 453 probable cases of Ebola and 2,536 deaths (3). Passive surveillance for Ebola in Guinea has been conducted principally through the use of a telephone alert system. Community members and health facilities report deaths and suspected Ebola cases to local alert numbers operated by prefecture health departments or to a national toll-free call center. The national call center additionally functions as a source of public health information by responding to questions from the public about Ebola. To evaluate the sensitivity of the two systems and compare the sensitivity of the national call center with the local alerts system, the CDC country team performed probabilistic record linkage of the combined prefecture alerts database, as well as the national call center database, with the national viral hemorrhagic fever (VHF) database; the VHF database contains records of all known confirmed Ebola cases. Among 17,309 alert calls analyzed from the national call center, 71 were linked to 1,838 confirmed Ebola cases in the VHF database, yielding a sensitivity of 3.9%. The sensitivity of the national call center was highest in the capital city of Conakry (11.4%) and lower in other prefectures. In comparison, the local alerts system had a sensitivity of 51.1%. Local public health infrastructure plays an important role in surveillance in an epidemic setting.
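Once record linkage has matched alerts to confirmed cases, the sensitivity figures reduce to a simple proportion. A sketch of the arithmetic, using the numbers given in the abstract (the linked count for the local alerts system is not reported there, so only the call-center figure is reproduced):

```python
def sensitivity(linked_cases, total_cases):
    """Fraction of confirmed cases that the surveillance system captured."""
    return linked_cases / total_cases

confirmed_in_vhf = 1838   # confirmed Ebola cases in the VHF database
print(f"national call center: {sensitivity(71, confirmed_in_vhf):.1%}")  # ~3.9%
```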
Experiences with DCE: the pro7 communication server based on OSF-DCE functionality.
Schulte, M; Lordieck, W
1997-01-01
The pro7 communication server is a new approach to managing communication between different applications on different hardware platforms in a hospital environment. The most important features are the use of OSF/DCE for realising remote procedure calls between different platforms, the use of an SQL-92-compatible relational database, and the design of a new software development tool (called the protocol definition language compiler) for describing the interface of a new application that is to be integrated into a hospital environment.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ball, G.; Kuznetsov, V.; Evans, D.
We present the Data Aggregation System, a system for information retrieval and aggregation from heterogeneous sources of relational and non-relational data for the Compact Muon Solenoid experiment at the CERN Large Hadron Collider. The experiment currently has a number of organically developed data sources, including front-ends to a number of different relational databases and non-database data services which do not share common data structures or APIs (Application Programming Interfaces), and which cannot at this stage be readily converged. DAS provides a single interface for querying all these services, a caching layer to speed up access to expensive underlying calls, and the ability to merge records from different data services pertaining to a single primary key.
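In miniature, the aggregate-and-cache pattern looks like this (a generic sketch, not CMS's DAS code; service shapes and field names are invented):

```python
import time

class Aggregator:
    """Query several backends, merge records sharing a primary key, cache results."""
    def __init__(self, services, ttl_seconds=300):
        self.services = services   # callables: key -> dict of fields
        self.ttl = ttl_seconds
        self.cache = {}            # key -> (expiry_time, merged_record)

    def lookup(self, key):
        hit = self.cache.get(key)
        if hit and hit[0] > time.time():
            return hit[1]          # fresh cached answer; skip expensive calls
        merged = {}
        for service in self.services:
            merged.update(service(key))   # later services may add or override fields
        self.cache[key] = (time.time() + self.ttl, merged)
        return merged

das = Aggregator([lambda k: {"dataset": k, "size_gb": 12},
                  lambda k: {"dataset": k, "site": "T1_US"}])
print(das.lookup("/Run2012/MinBias"))
```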
Does a TV Public Service Advertisement Campaign for Suicide Prevention Really Work?
Song, In Han; You, Jung-Won; Kim, Ji Eun; Kim, Jung-Soo; Kwon, Se Won; Park, Jong-Ik
2017-05-01
One of the critical measures in suicide prevention is promoting public awareness of crisis hotline numbers so that individuals can more readily seek help in a time of crisis. Although public service advertisements (PSA) may be effective in raising the rates of both awareness and use of a suicide hotline, few investigations have been performed regarding their effectiveness in South Korea, where the suicide rate is the highest among OECD countries. The goal of this study was to evaluate the effectiveness of a television PSA campaign. We analyzed a database of crisis phone calls compiled by the Korean Ministry of Health and Welfare to track changes in call volume to a crisis hotline that was promoted in a TV campaign. We compared daily call counts for three periods of equal length: before, during, and after the campaign. The number of crisis calls during the campaign was about 1.6 times greater than the number before or after the campaign. Relative to the number of suicide-related calls in the previous year, the number of calls during the campaign period surged, displaying a noticeable increase. The findings confirmed that this campaign had a positive impact on call volume to the suicide hotline.
Relational Information Management Data-Base System
NASA Technical Reports Server (NTRS)
Storaasli, O. O.; Erickson, W. J.; Gray, F. P.; Comfort, D. L.; Wahlstrom, S. O.; Von Limbach, G.
1985-01-01
A DBMS with several features particularly useful to scientists and engineers. RIM5 interfaces with any application program written in a language capable of calling FORTRAN routines. Applications include data management for Space Shuttle Columbia tiles, aircraft flight tests, high-pressure piping, atmospheric chemistry, census, university registration, CAD/CAM geometry, and civil-engineering dam construction.
[Presence of the biomedical periodicals of Hungarian editions in international databases].
Vasas, Lívia; Hercsel, Imréné
2006-01-15
The majority of Hungarian scientific results in medicine and related sciences are published in scientific periodicals of foreign edition with high impact factor (IF) values, and they appear in the international scientific literature in foreign languages. In this study the authors dealt only with the presence and registered citation in international databases of those periodicals that had been published in Hungary and/or in cooperation with foreign publishing companies. The examination went back to 1980 and covered a 25-year period. 110 periodicals were selected for more detailed examination. The authors analyzed the situation of the current periodicals in the three most often visited databases (MEDLINE, EMBASE, Web of Science) and found that the biomedical scientific periodicals of Hungarian interest were not represented with reasonable emphasis in the relevant international bibliographic databases. Because of the large amount of data, the scientific literature of medicine and related sciences could not be represented in its entirety; this publication, however, may give useful information to inquirers and call the attention of those concerned to the problem.
Online database for documenting clinical pathology resident education.
Hoofnagle, Andrew N; Chou, David; Astion, Michael L
2007-01-01
Training of clinical pathologists is evolving and must now address the 6 core competencies described by the Accreditation Council for Graduate Medical Education (ACGME), which include patient care. A substantial portion of the patient care performed by the clinical pathology resident takes place while the resident is on call for the laboratory, a practice that provides the resident with clinical experience and assists the laboratory in providing quality service to clinicians in the hospital and surrounding community. Documenting the educational value of these on-call experiences and providing evidence of competence is difficult for residency directors. An online database of these calls, entered by residents and reviewed by faculty, would provide a mechanism for documenting and improving the education of clinical pathology residents. Using Microsoft Access, we developed an online database that uses Active Server Pages and Secure Sockets Layer encryption to document calls to the clinical pathology resident. Using the data collected, we evaluated the efficacy of 3 interventions aimed at improving resident education. The database facilitated the documentation of more than 4,700 calls in the first 21 months it was online, provided archived resident-generated data to assist in serving clients, and demonstrated that 2 of the interventions aimed at improving resident education were successful. We have developed a secure online database, accessible from any computer with Internet access, that can be used to easily document clinical pathology resident education and competency.
Lee, Byungwook; Kim, Taehyung; Kim, Seon-Kyu; Lee, Kwang H; Lee, Doheon
2007-01-01
With the advent of automated and high-throughput techniques, the number of patent applications containing biological sequences has been increasing rapidly. However, they have attracted relatively little attention compared to other sequence resources. We have built a database server called Patome, which contains biological sequence data disclosed in patents and published applications, as well as their analysis information. The analysis is divided into two steps. The first is an annotation step, in which the disclosed sequences were annotated against the RefSeq database. The second is an association step, in which the sequences were linked to the Entrez Gene, OMIM and GO databases and the results were saved as a gene-patent table. From the analysis, we found that 55% of human genes were associated with patenting. The gene-patent table can be used to identify whether a particular gene or disease is related to patenting. Patome is available at http://www.patome.org/; the information is updated bimonthly.
What do consumers want to know about antibiotics? Analysis of a medicines call centre database.
Hawke, Kate L; McGuire, Treasure M; Ranmuthugala, Geetha; van Driel, Mieke L
2016-02-01
Australia is one of the highest users of antibiotics in the developed world. This study aimed to identify consumer antibiotic information needs to improve targeting of medicines information. We conducted a retrospective, mixed-method study of consumers' antibiotic-related calls to Australia's National Prescribing Service (NPS) Medicines Line from September 2002 to June 2010. Demographic and question data were analysed, and the most common enquiry type in each age group was explored for key narrative themes. Relative antibiotic call frequencies were determined by comparing number of calls to antibiotic utilization in Australian Statistics on Medicines (ASM) data. Between 2002 and 2010, consumers made 8696 antibiotic calls to Medicines Line. The most common reason was questions about the role of their medicine (22.4%). Patient age groups differed in enquiry pattern, with more questions about lactation in the 0- to 4-year age group (33.6%), administration (5-14 years: 32.4%), interactions (15-24 years: 33.4% and 25-54 years: 23.3%) and role of the medicine (55 years and over: 26.6%). Key themes were identified for each age group. Relative to use in the community, antibiotics most likely to attract consumer calls were ciprofloxacin (18.0 calls/100,000 ASM prescriptions) and metronidazole (12.9 calls/100,000 ASM prescriptions), with higher call rates than the most commonly prescribed antibiotic amoxicillin (3.9 calls/100,000 ASM prescriptions). Consumers' knowledge gaps and concerns about antibiotics vary with age, and certain antibiotics generate greater concern relative to their usage. Clinicians should target medicines information to proactively address consumer concerns.
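The relative call frequencies are a simple rate computation. A sketch of the arithmetic (the raw counts below are hypothetical, chosen only to reproduce the published rates; the abstract reports the rates, not the counts):

```python
def calls_per_100k(calls, prescriptions):
    """Calls per 100,000 prescriptions, as in the ASM comparison."""
    return calls / prescriptions * 100_000

print(calls_per_100k(180, 1_000_000))    # ciprofloxacin-like rate: 18.0
print(calls_per_100k(390, 10_000_000))   # amoxicillin-like rate: 3.9
```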
Hardison, Ross C; Chui, David H K; Giardine, Belinda; Riemer, Cathy; Patrinos, George P; Anagnou, Nicholas; Miller, Webb; Wajcman, Henri
2002-03-01
We have constructed a relational database of hemoglobin variants and thalassemia mutations, called HbVar, which can be accessed on the web at http://globin.cse.psu.edu. Extensive information is recorded for each variant and mutation, including a description of the variant and associated pathology, hematology, electrophoretic mobility, methods of isolation, stability information, ethnic occurrence, structure studies, functional studies, and references. The initial information was derived from books by Dr. Titus Huisman and colleagues [Huisman et al., 1996, 1997, 1998]. The current database is updated regularly with the addition of new data and corrections to previous data. Queries can be formulated based on fields in the database. Tables of common categories of variants, such as all those involving the alpha1-globin gene (HBA1) or all those that result in high oxygen affinity, are maintained by automated queries on the database. Users can formulate more precise queries, such as identifying "all beta-globin variants associated with instability and found in Scottish populations." This new database should be useful for clinical diagnosis as well as in fundamental studies of hemoglobin biochemistry, globin gene regulation, and human sequence variation at these loci.
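Queries like the quoted example map naturally onto SQL over such a schema. A sketch (hypothetical table, column names, and variant rows, not HbVar's actual schema or data):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE variants (name TEXT, gene TEXT, unstable INTEGER, ethnic_occurrence TEXT);
INSERT INTO variants VALUES
  ('Hb Example-A', 'HBB',  1, 'Scottish'),
  ('Hb Example-B', 'HBB',  0, 'Scottish'),
  ('Hb Example-C', 'HBA1', 1, 'Italian');
""")
-- = None  # (comment marker placeholder removed)
rows = conn.execute(
    "SELECT name FROM variants "
    "WHERE gene = 'HBB' AND unstable = 1 AND ethnic_occurrence = 'Scottish'")
print(rows.fetchall())  # -> [('Hb Example-A',)]
```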
DOE Office of Scientific and Technical Information (OSTI.GOV)
Coleman, Andre M.; Johnson, Gary E.; Borde, Amy B.
Pacific Northwest National Laboratory (PNNL) conducted this project for the U.S. Army Corps of Engineers, Portland District (Corps). The purpose of the project is to develop a geospatial, web-accessible database (called “Oncor”) for action effectiveness and related data from monitoring and research efforts for the Columbia Estuary Ecosystem Restoration Program (CEERP). The intent is for the Oncor database to enable synthesis and evaluation, the results of which can then be applied in subsequent CEERP decision-making. This is the first annual report in what is expected to be a 3- to 4-year project, which commenced on February 14, 2012.
Secure, web-accessible call rosters for academic radiology departments.
Nguyen, A V; Tellis, W M; Avrin, D E
2000-05-01
Traditionally, radiology department call rosters have been posted via paper and bulletin boards. Frequently, changes to these lists are made by multiple people independently, but often not synchronized, resulting in confusion among the house staff and technical staff as to who is on call and when. In addition, multiple and disparate copies exist in different sections of the department, and changes made would not be propagated to all the schedules. To eliminate such difficulties, a paperless call scheduling application was developed. Our call scheduling program allowed Java-enabled web access to a database by designated personnel from each radiology section who have privileges to make the necessary changes. Once a person made a change, everyone accessing the database would see the modification. This eliminates the chaos resulting from people swapping shifts at the last minute and not having the time to record or broadcast the change. Furthermore, all changes to the database were logged. Users are given a log-in name and password and can only edit their section; however, all personnel have access to all sections' schedules. Our applet was written in Java 2 using the latest technology in database access. We access our Interbase database through the DataExpress and DB Swing (Borland, Scotts Valley, CA) components. The result is secure access to the call rosters via the web. There are many advantages to the web-enabled access, mainly the ability for people to make changes and have the changes recorded and propagated in a single virtual location and available to all who need to know.
A Pilot Study Using an Online, Experimental, Two-Asset Market.
ERIC Educational Resources Information Center
Lypny, Gregory
2003-01-01
Describes an online, securities market, research tool, called Borsa, to engage students in the exploration of asset pricing in microeconomics courses. Defines Borsa as related database files served on the Internet using a dedicated IP address. Discusses practical considerations in running the market. Offers questions that arise from using the…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rohatgi, Upendra S.
Nuclear reactor codes require validation with appropriate data representing the plant for specific scenarios. The thermal-hydraulic data is scattered across different locations and formats, and some of it is in danger of being lost. A relational database is being developed to organize the international thermal-hydraulic test data for various reactor concepts and different scenarios. At the reactor system level, the data is organized to include separate effect tests and integral effect tests for specific scenarios and corresponding phenomena. The database relies on the phenomena identification sections of expert-developed PIRTs. The database will provide a summary of appropriate data, a review of facility information, test descriptions, instrumentation, references for the experimental data, and some examples of application of the data for validation. The current database platform includes scenarios for PWR, BWR, VVER, and specific benchmarks for CFD modelling data, and is to be expanded to include references for molten salt reactors. There are placeholders for high-temperature gas-cooled reactors, CANDU, and liquid metal reactors. This relational database is called The International Experimental Thermal Hydraulic Systems (TIETHYS) database; it currently resides at the Nuclear Energy Agency (NEA) of the OECD and is freely open to public access. Going forward, the database will be extended to include additional links and data as they become available. https://www.oecd-nea.org/tiethysweb/
Maximizing the use of Special Olympics International's Healthy Athletes database: A call to action.
Lloyd, Meghann; Foley, John T; Temple, Viviene A
2018-02-01
There is a critical need for high-quality population-level data related to the health of individuals with intellectual disabilities. For more than 15 years Special Olympics International has been conducting free Healthy Athletes screenings at local, national and international events. The Healthy Athletes database is the largest known international database specifically on the health of people with intellectual disabilities; however, it is relatively under-utilized by the research community. A consensus meeting with two dozen North American researchers, stakeholders, clinicians and policymakers took place in Toronto, Canada. The purpose of the meeting was to: 1) establish the perceived utility of the database, and 2) identify and prioritize 3-5 specific priorities related to using the Healthy Athletes database to promote the health of individuals with intellectual disabilities. There was unanimous agreement from the meeting participants that this database represents an immense opportunity, both from the data already collected and from data that will be collected in the future. The 3 top priorities for the database were deemed to be: 1) establish the representativeness of data collected on Special Olympics athletes compared to the general population with intellectual disabilities, 2) create a scientific advisory group for Special Olympics International, and 3) use the data to improve Special Olympics programs around the world. The Special Olympics Healthy Athletes database includes data not found in any other source and should be used, in partnership with Special Olympics International, by researchers to significantly increase our knowledge and understanding of the health of individuals with intellectual disabilities.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yu, Haoyu S.; Zhang, Wenjing; Verma, Pragya
2015-01-01
The goal of this work is to develop a gradient approximation to the exchange–correlation functional of Kohn–Sham density functional theory for treating molecular problems, with a special emphasis on the prediction of quantities important for homogeneous catalysis and other molecular energetics. Our training and validation of exchange–correlation functionals is organized in terms of databases and subdatabases. The key properties required for homogeneous catalysis are main-group bond energies (database MGBE137), transition metal bond energies (database TMBE32), reaction barrier heights (database BH76), and molecular structures (database MS10). We also consider 26 other databases, most of which are subdatabases of a newly extended broad database called Database 2015, which is presented in this article and in its ESI. Based on the mathematical form of a nonseparable gradient approximation (NGA), as first employed in the N12 functional, we design a new functional by using Database 2015 and by adding smoothness constraints to the optimization of the functional. The resulting functional is called the gradient approximation for molecules, or GAM. The GAM functional gives better results for MGBE137, TMBE32, and BH76 than any available generalized gradient approximation (GGA) and than N12. The GAM functional also gives reasonable results for MS10, with an MUE of 0.018 Å. The GAM functional provides good results both within and outside the training sets. The convergence tests and the smooth curves of the exchange–correlation enhancement factor as a function of the reduced density gradient show that GAM is a smooth functional that should not lead to extra expense or instability in optimizations. NGAs, like GGAs, have the advantage over meta-GGAs and hybrid GGAs of, respectively, smaller grid-size requirements for integrations and lower costs for extended systems. These computational advantages, combined with relatively high accuracy for all the key properties needed for molecular catalysis, make the GAM functional very promising for future applications.
NEMiD: a web-based curated microbial diversity database with geo-based plotting.
Bhattacharjee, Kaushik; Joshi, Santa Ram
2014-01-01
The majority of the Earth's microbes remain unknown, and their potential utility cannot be exploited until they are discovered and characterized. They provide wide scope for the development of new strains as well as for biotechnological uses. The documentation and bioprospection of microorganisms carry enormous significance considering their relevance to human welfare. This calls for the urgent development of a database with emphasis on the microbial diversity of the largest untapped reservoirs in the biosphere. The data annotated in the North-East India Microbial database (NEMiD) were obtained by the isolation and characterization of microbes from different parts of the Eastern Himalayan region. The database was constructed as a relational database management system (RDBMS) with data stored in MySQL at the back-end on a Linux server and implemented in an Apache/PHP environment. This database provides a basis for understanding the soil microbial diversity pattern in this megabiodiversity hotspot and indicates the distribution patterns of various organisms along with their identification. The NEMiD database is freely available at www.mblabnehu.info/nemid/.
Clinton, Michael E; Conway, Neil; Sturges, Jane
2017-01-01
It has been argued that when people believe that their work is a calling, it can often be experienced as an intense and consuming passion with significant personal meaning. While callings have been demonstrated to have several positive outcomes for individuals, less is known about the potential downsides for those who experience work in this way. This study develops a multiple-mediation model proposing that, while the intensity of a calling has a positive direct effect on work-related vigor, it also motivates people to work longer hours, which, both directly and indirectly through those longer hours, limits psychological detachment from work in the evenings. In turn, this process reduces sleep quality and morning vigor. Survey and diary data from 193 church ministers supported all hypotheses associated with this model, implying that intense callings may limit the process of recovery from work experiences. The findings contribute to a more balanced theoretical understanding of callings. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
ClassLess: A Comprehensive Database of Young Stellar Objects
NASA Astrophysics Data System (ADS)
Hillenbrand, Lynne; Baliber, Nairn
2015-01-01
We have designed and constructed a database housing published measurements of Young Stellar Objects (YSOs) within ~1 kpc of the Sun. ClassLess, so called because it includes YSOs in all stages of evolution, is a relational database in which user interaction is conducted via HTML web browsers, queries are performed in scientific language, and all data are linked to the sources of publication. Each star is associated with a cluster (or clusters), and both spatially resolved and unresolved measurements are stored, allowing proper use of data from multiple star systems. With this fully searchable tool, myriad ground- and space-based instruments and surveys across wavelength regimes can be exploited. In addition to primary measurements, the database self-consistently calculates and serves higher-level data products such as extinction, luminosity, and mass. As a result, searches for young stars with specific physical characteristics can be completed with just a few mouse clicks.
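A query layer of this kind typically reduces to parameterized SQL behind the web forms. As a rough sketch only (the table and column names below are invented, not ClassLess's actual schema), a search for young stars with specific physical characteristics might look like:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical, much-reduced shape of a YSO table: each star carries a
-- cluster membership and derived quantities such as mass and extinction.
CREATE TABLE star (id INTEGER, cluster TEXT, mass_msun REAL, a_v REAL);
INSERT INTO star VALUES (1, 'Taurus', 0.8, 1.2), (2, 'Taurus', 2.5, 3.0);
""")

# "Searches for young stars with specific physical characteristics":
print(conn.execute(
    "SELECT id FROM star WHERE cluster = ? AND mass_msun < ?",
    ("Taurus", 1.0),
).fetchall())   # [(1,)]
```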
Clique-based data mining for related genes in a biomedical database.
Matsunaga, Tsutomu; Yonemori, Chikara; Tomita, Etsuji; Muramatsu, Masaaki
2009-07-01
Progress in the life sciences cannot be made without integrating biomedical knowledge on numerous genes in order to help formulate hypotheses on the genetic mechanisms behind various biological phenomena, including diseases. There is thus a strong need for a way to automatically and comprehensively search biomedical databases for related genes, such as genes in the same families and genes encoding components of the same pathways. Here we address the extraction of related genes by searching for densely connected subgraphs, which are modeled as cliques, in a biomedical relational graph. We constructed a graph whose nodes were gene or disease pages, and whose edges were the hyperlink connections between those pages in the Online Mendelian Inheritance in Man (OMIM) database. We obtained over 20,000 sets of related genes (called 'gene modules') by enumerating cliques computationally. The modules included genes in the same family, genes for proteins that form a complex, and genes for components of the same signaling pathway. The results of experiments using 'metabolic syndrome'-related gene modules show that the gene modules can be used to obtain a coherent, holistic picture helpful for interpreting relations among genes. We presented a data mining approach that extracts related genes by enumerating cliques. The extracted gene sets provide a holistic picture useful for comprehending complex disease mechanisms.
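The core computational step, enumerating maximal cliques in the page-hyperlink graph, can be illustrated with a toy example. The node names and edges below are invented, and this uses networkx's generic clique enumerator rather than the authors' own algorithm:

```python
import networkx as nx

# Toy undirected graph: nodes stand for OMIM-style gene/disease pages,
# edges for hyperlinks between them (names invented for illustration).
G = nx.Graph()
G.add_edges_from([
    ("TNF", "IL6"), ("IL6", "LEP"), ("TNF", "LEP"),          # one triangle
    ("TNF", "metabolic_syndrome"), ("LEP", "metabolic_syndrome"),
])

# Enumerate maximal cliques; keep those with >= 3 members as
# candidate "gene modules".
modules = [c for c in nx.find_cliques(G) if len(c) >= 3]
print(modules)   # two overlapping 3-cliques
```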
Emission Database for Global Atmospheric Research (EDGAR).
ERIC Educational Resources Information Center
Olivier, J. G. J.; And Others
1994-01-01
Presents the objective and methodology chosen for the construction of a global emissions source database called EDGAR and the structural design of the database system. The database estimates, on a regional and grid basis, 1990 annual emissions of greenhouse gases and of ozone-depleting compounds from all known sources. (LZ)
One approach to design of speech emotion database
NASA Astrophysics Data System (ADS)
Uhrin, Dominik; Chmelikova, Zdenka; Tovarek, Jaromir; Partila, Pavol; Voznak, Miroslav
2016-05-01
This article describes a system for evaluating the credibility of recordings with emotional content. The sound recordings form a Czech-language database for training and testing speech emotion recognition systems, which are designed to detect human emotions in the voice. Information about a speaker's emotional state is useful to the security forces and the emergency call service. A person in action (a soldier, police officer or firefighter) is often exposed to stress, and information about their emotional state, carried in the voice, helps the dispatcher adapt control commands for the intervention procedure. Call agents of the emergency call service must recognize the mental state of the caller to adjust the mood of the conversation; here, evaluation of the psychological state is the key factor for a successful intervention. A quality database of sound recordings is essential for creating such systems. Quality databases already exist, such as the Berlin Database of Emotional Speech or Humaine, but actors created these databases in an audio studio, meaning the recordings contain simulated rather than real emotions. Our research aims at creating a database of Czech emotional recordings of real human speech. Collecting sound samples for the database is only one of the tasks; no less important is evaluating the significance of the recordings from the perspective of emotional states. The design of a methodology for evaluating the credibility of emotional recordings is described in this article, and the results describe the advantages and applicability of the developed method.
Geothermal NEPA Database on OpenEI (Poster)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Young, K. R.; Levine, A.
2014-09-01
The National Renewable Energy Laboratory (NREL) developed the Geothermal National Environmental Policy Act (NEPA) Database as a platform for government agencies and industry to access and maintain information related to geothermal NEPA documents. The data were collected to inform analyses of NEPA timelines, and the collected data were made publicly available via this tool in case others might find them useful. NREL staff and contractors collected documents from agency websites, during visits to the two busiest Bureau of Land Management (BLM) field offices for geothermal development, and through email and phone call requests to other BLM field offices. They then entered the information into the database, hosted by Open Energy Information (http://en.openei.org/wiki/RAPID/NEPA). The long-term success of the project will depend on the willingness of federal agencies, industry, and others to populate the database with NEPA and related documents, and to use the data for their own analyses. As the information and capabilities of the database expand, developers and agencies can save time on new NEPA reports by accessing a single location to research related activities, their potential impacts, and previously proposed and imposed mitigation measures. NREL used a wiki platform to allow industry and agencies to maintain the content in the future so that it continues to provide relevant and accurate information to users.
History Places: A Case Study for Relational Database and Information Retrieval System Design
ERIC Educational Resources Information Center
Hendry, David G.
2007-01-01
This article presents a project-based case study that was developed for students with diverse backgrounds and varied inclinations for engaging technical topics. The project, called History Places, requires that student teams develop a vision for a kind of digital library, propose a conceptual model, and use the model to derive a logical model and…
Atlas - a data warehouse for integrative bioinformatics.
Shah, Sohrab P; Huang, Yong; Xu, Tao; Yuen, Macaire M S; Ling, John; Ouellette, B F Francis
2005-02-21
We present a biological data warehouse called Atlas that locally stores and integrates biological sequences, molecular interactions, homology information, functional annotations of genes, and biological ontologies. The goal of the system is to provide data, as well as a software infrastructure for bioinformatics research and development. The Atlas system is based on relational data models that we developed for each of the source data types. Data stored within these relational models are managed through Structured Query Language (SQL) calls that are implemented in a set of Application Programming Interfaces (APIs). The APIs include three languages: C++, Java, and Perl. The methods in these API libraries are used to construct a set of loader applications, which parse and load the source datasets into the Atlas database, and a set of toolbox applications which facilitate data retrieval. Atlas stores and integrates local instances of GenBank, RefSeq, UniProt, Human Protein Reference Database (HPRD), Biomolecular Interaction Network Database (BIND), Database of Interacting Proteins (DIP), Molecular Interactions Database (MINT), IntAct, NCBI Taxonomy, Gene Ontology (GO), Online Mendelian Inheritance in Man (OMIM), LocusLink, Entrez Gene and HomoloGene. The retrieval APIs and toolbox applications are critical components that offer end-users flexible, easy, integrated access to this data. We present use cases that use Atlas to integrate these sources for genome annotation, inference of molecular interactions across species, and gene-disease associations. The Atlas biological data warehouse serves as data infrastructure for bioinformatics research and development. It forms the backbone of the research activities in our laboratory and facilitates the integration of disparate, heterogeneous biological sources of data enabling new scientific inferences. Atlas achieves integration of diverse data sets at two levels. First, Atlas stores data of similar types using common data models, enforcing the relationships between data types. Second, integration is achieved through a combination of APIs, ontology, and tools. The Atlas software is freely available under the GNU General Public License at: http://bioinformatics.ubc.ca/atlas/
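The division of labor the abstract describes, loader applications writing through an API and toolbox applications reading through it, so that neither writes SQL directly, can be mocked up in a few lines. The class, method and table names below are illustrative stand-ins, not the actual Atlas API:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sequence (accession TEXT, organism TEXT)")

class SequenceAPI:
    """Toy stand-in for an Atlas-style API: methods wrap SQL so that
    loader and toolbox applications never write SQL themselves."""
    def __init__(self, conn):
        self.conn = conn
    def insert(self, accession, organism):
        self.conn.execute("INSERT INTO sequence VALUES (?, ?)",
                          (accession, organism))
    def by_organism(self, organism):
        return self.conn.execute(
            "SELECT accession FROM sequence WHERE organism = ?",
            (organism,)).fetchall()

api = SequenceAPI(conn)
api.insert("NM_000546", "Homo sapiens")   # loader path
print(api.by_organism("Homo sapiens"))    # toolbox/retrieval path
```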
Linking CALL and SLA: Using the IRIS Database to Locate Research Instruments
ERIC Educational Resources Information Center
Handley, Zöe; Marsden, Emma
2014-01-01
To establish an evidence base for future computer-assisted language learning (CALL) design, CALL research needs to move away from CALL versus non-CALL comparisons, and focus on investigating the differential impact of individual coding elements, that is, specific features of a technology which might have an impact on learning (Pederson, 1987).…
Incremental Query Rewriting with Resolution
NASA Astrophysics Data System (ADS)
Riazanov, Alexandre; Aragão, Marcelo A. T.
We address the problem of semantic querying of relational databases (RDB) modulo knowledge bases using very expressive knowledge representation formalisms, such as full first-order logic or its various fragments. We propose to use a resolution-based first-order logic (FOL) reasoner for computing schematic answers to deductive queries, with the subsequent translation of these schematic answers to SQL queries which are evaluated using a conventional relational DBMS. We call our method incremental query rewriting, because an original semantic query is rewritten into a (potentially infinite) series of SQL queries. In this chapter, we outline the main idea of our technique - using abstractions of databases and constrained clauses for deriving schematic answers, and provide completeness and soundness proofs to justify the applicability of this technique to the case of resolution for FOL without equality. The proposed method can be directly used with regular RDBs, including legacy databases. Moreover, we propose it as a potential basis for an efficient Web-scale semantic search technology.
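To make the rewriting idea concrete: one schematic answer corresponds to one derivation over a rule, and translates into one ordinary SQL query evaluated by the RDBMS; further derivations yield further queries in the (potentially infinite) series. The rule, table names and query below are an invented toy, not the authors' calculus:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE exposed (person TEXT, substance TEXT);
CREATE TABLE toxic   (substance TEXT);
INSERT INTO exposed VALUES ('ann', 'benzene'), ('bob', 'water');
INSERT INTO toxic   VALUES ('benzene');
""")

# Hypothetical rule:  risky(X) :- exposed(X, S), toxic(S).
# The schematic answer the reasoner derives for the query risky(X)
# becomes one ordinary SQL join over the base tables:
sql = """
SELECT DISTINCT e.person
FROM exposed AS e JOIN toxic AS t ON t.substance = e.substance;
"""
print(conn.execute(sql).fetchall())   # [('ann',)]
```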
Experiment on building Sundanese lexical database based on WordNet
NASA Astrophysics Data System (ADS)
Dewi Budiwati, Sari; Nurani Setiawan, Novihana
2018-03-01
Sundanese is the second most widely spoken local language in Indonesia. Currently, Sundanese is rarely used, since Indonesian serves both in everyday conversation and as the national language. We built a Sundanese lexical database based on WordNet and the Indonesian WordNet as an alternative way to preserve the language as part of the local culture. WordNet was chosen because the Sundanese language has three levels of word delivery, called the language code of conduct. Web user participants were involved in this research to specify Sundanese semantic relations, and an expert linguist validated the relations. The merge methodology was implemented in this experiment. Some words have equivalents in WordNet, while others do not, since some words do not exist in other cultures.
A novel data storage logic in the cloud
Mátyás, Bence; Szarka, Máté; Járvás, Gábor; Kusper, Gábor; Argay, István; Fialowski, Alice
2016-01-01
Databases which store and manage long-term scientific information related to life science are used to store huge amounts of quantitative attributes. Introduction of a new entity attribute requires modification of the existing data tables and the programs that use them. The solution is to increase the number of virtual data tables while the number of screens remains the same. The main objective of the present study was to introduce a logic called Joker Tao (JT) which provides universal data storage for cloud-based databases. This means all types of input data can be interpreted as an entity and an attribute at the same time, in the same data table. PMID:29026521
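Read literally, the JT proposal keeps one universal table in which the same string may serve as a value in one row and as an entity in another. A minimal sketch under that reading (the table layout and all names are our assumption, not taken from the paper):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One universal table: every row names an entity, an attribute and a
# value; new attributes need no schema change.
conn.execute("CREATE TABLE jt (entity TEXT, attribute TEXT, value TEXT)")
rows = [
    ("sample_42", "species", "E. coli"),
    ("sample_42", "glucose_mM", "5.5"),
    # A value ("E. coli") reused as an entity: the same datum is
    # interpreted as entity and attribute carrier at once.
    ("E. coli", "gram_stain", "negative"),
]
conn.executemany("INSERT INTO jt VALUES (?, ?, ?)", rows)
print(conn.execute(
    "SELECT attribute, value FROM jt WHERE entity = ?", ("sample_42",)
).fetchall())
```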
A pier-scour database: 2,427 field and laboratory measurements of pier scour
Benedict, Stephen T.; Caldwell, Andral W.
2014-01-01
The U.S. Geological Survey conducted a literature review to identify potential sources of published pier-scour data, and selected data were compiled into a digital spreadsheet called the 2014 USGS Pier-Scour Database (PSDb-2014) consisting of 569 laboratory and 1,858 field measurements. These data encompass a wide range of laboratory and field conditions and represent field data from 23 States within the United States and from 6 other countries. The digital spreadsheet is available on the Internet and offers a valuable resource to engineers and researchers seeking to understand pier-scour relations in the laboratory and field.
Interconnecting heterogeneous database management systems
NASA Technical Reports Server (NTRS)
Gligor, V. D.; Luckenbaugh, G. L.
1984-01-01
It is pointed out that there is still a great need for the development of improved communication between remote, heterogeneous database management systems (DBMSs). Problems regarding effective communication between distributed DBMSs are primarily related to significant differences between local data managers, local data models and representations, and local transaction managers. A system of interconnected DBMSs which exhibit such differences is called a network of distributed, heterogeneous DBMSs. In order to achieve effective interconnection of remote, heterogeneous DBMSs, users must have uniform, integrated access to the different DBMSs. The present investigation is mainly concerned with an analysis of the existing approaches to interconnecting heterogeneous DBMSs, taking into account four experimental DBMS projects.
Databases on biotechnology and biosafety of GMOs.
Degrassi, Giuliano; Alexandrova, Nevena; Ripandelli, Decio
2003-01-01
Due to the involvement of scientific, industrial, commercial and public sectors of society, the complexity of the issues concerning the safety of genetically modified organisms (GMOs) for the environment, agriculture, and human and animal health calls for a wide coverage of information. Accordingly, development of the field of biotechnology, along with concerns related to the fate of released GMOs, has led to a rapid development of tools for disseminating such information. As a result, there is a growing number of databases aimed at collecting and storing information related to GMOs. Most of the sites deal with information on environmental releases, field trials, transgenes and related sequences, regulations and legislation, risk assessment documents, and literature. Databases are mainly established and managed by scientific, national or international authorities, and are addressed towards scientists, government officials, policy makers, consumers, farmers, environmental groups and civil society representatives. This complexity can lead to an overlapping of information. The purpose of the present review is to analyse the relevant databases currently available on the web, providing comments on their vastly different information and on the structure of the sites pertaining to different users. A preliminary overview on the development of these sites during the last decade, at both the national and international level, is also provided.
NASA Astrophysics Data System (ADS)
Shi, Congming; Wang, Feng; Deng, Hui; Liu, Yingbo; Liu, Cuiyin; Wei, Shoulin
2017-08-01
As a dedicated synthetic aperture radio interferometer in China, the MingantU SpEctral Radioheliograph (MUSER), initially known as the Chinese Spectral RadioHeliograph (CSRH), has entered the stage of routine observation. More than 23 million data records per day need to be effectively managed to provide high-performance data query and retrieval for scientific data reduction. In light of the massive amounts of data generated by the MUSER, in this paper, a novel data management technique called the negative database (ND) is proposed and used to implement a data management system for the MUSER. Built on a key-value database, the ND technique makes complete use of the complement set of observational data to derive the requisite information. Experimental results showed that the proposed ND can significantly reduce storage volume in comparison with a relational database management system (RDBMS). Even when considering the time needed to derive records that are absent, its overall performance, including querying and deriving the data of the ND, is comparable with that of an RDBMS. The ND technique effectively solves the problem of massive data storage for the MUSER and is a valuable reference for the massive data management required by next-generation telescopes.
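One way to read the complement-set idea: over a finite space of possible observation keys, the ND stores exactly the keys for which no data exist, and presence is derived from absence. The key space below is invented for illustration, not MUSER's actual record layout:

```python
# Finite space of possible observation keys, e.g. (channel, time-slot)
# pairs for one interval; sizes and key structure are invented here.
ALL_KEYS = {(ch, t) for ch in range(4) for t in range(6)}

observed = {(0, 1), (0, 2), (1, 3), (2, 5)}   # actual records
negative_db = ALL_KEYS - observed             # what the ND stores

def was_observed(key):
    # Presence is derived from absence in the negative database.
    return key not in negative_db

print(was_observed((0, 1)), was_observed((3, 0)))   # True False
```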
LAND-deFeND - An innovative database structure for landslides and floods and their consequences.
Napolitano, Elisabetta; Marchesini, Ivan; Salvati, Paola; Donnini, Marco; Bianchi, Cinzia; Guzzetti, Fausto
2018-02-01
Information on historical landslides and floods - collectively called "geo-hydrological hazards" - is key to understanding the complex dynamics of the events, to estimating the temporal and spatial frequency of damaging events, and to quantifying their impact. A number of databases on geo-hydrological hazards and their consequences have been developed worldwide at different geographical and temporal scales. Of the few available database structures that can handle information on both landslides and floods, some are outdated and others were not designed to store, organize, and manage information on single phenomena or on the type and monetary value of the damages and the remediation actions. Here, we present the LANDslides and Floods National Database (LAND-deFeND), a new database structure able to store, organize, and manage in a single digital structure spatial information collected from various sources with different accuracy. In designing LAND-deFeND, we defined four groups of entities, namely: nature-related, human-related, geospatial-related, and information-source-related entities, that collectively can describe fully the geo-hydrological hazards and their consequences. In LAND-deFeND, the main entities are the nature-related entities, encompassing: (i) the "phenomenon", a single landslide or local inundation, (ii) the "event", which represents the ensemble of the inundations and/or landslides that occurred in a conventional geographical area in a limited period, and (iii) the "trigger", which is the meteo-climatic or seismic cause of the geo-hydrological hazards. LAND-deFeND maintains the relations between the nature-related entities and the human-related entities even where the information is partially missing. The physical model of LAND-deFeND contains 32 tables, including nine input tables, 21 dictionary tables, and two association tables, and ten views, including specific views that make the database structure compliant with the EC INSPIRE and Floods Directives. The LAND-deFeND database structure is open and freely available from http://geomorphology.irpi.cnr.it/tools. Copyright © 2017 The Authors. Published by Elsevier Ltd. All rights reserved.
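The phenomenon/event/trigger hierarchy, with relations kept nullable so partial information survives, can be sketched as data-definition statements. This is a guessed, much-reduced shape, not the published 32-table physical model:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- "trigger" is an SQL keyword, hence the trailing underscore.
CREATE TABLE trigger_   (id INTEGER PRIMARY KEY, kind TEXT);  -- meteo-climatic or seismic
CREATE TABLE event      (id INTEGER PRIMARY KEY, area TEXT, period TEXT,
                         trigger_id INTEGER REFERENCES trigger_(id));
CREATE TABLE phenomenon (id INTEGER PRIMARY KEY, type TEXT,   -- landslide or inundation
                         event_id INTEGER REFERENCES event(id));  -- nullable: partial info allowed
""")
```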
Computer Science Research in Europe.
1984-08-29
most attention: multi-database and its structure, and (3) the dependencies between databases and multi-databases. Distributed Systems. Newcastle University, UK: at the University of Newcastle ... INRIA: a project called SIRIUS was established in 1977 at INRIA; having completed a multi-database system for distributed data management, INRIA is now working on a real ... communications requirements of distributed database systems, protocols for checking the ...
2014-04-25
EA’s Java application programming interface (API), the team built a tool called OWL2EA that can ingest an OWL file and generate the corresponding UML...ObjectItemStructure specification shown in Figure 10. Running this script in the relational database server MySQL creates the physical schema that
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chain, Patrick
Genomics — the genetic mapping and DNA sequencing of sets of genes or the complete genomes of organisms, along with related genome analysis and database work — is emerging as one of the transformative sciences of the 21st century. But current bioinformatics tools are not accessible to most biological researchers. Now, a new computational and web-based tool called EDGE Bioinformatics is working to fulfill the promise of democratizing genomics.
Database Software for Social Studies. A MicroSIFT Quarterly Report.
ERIC Educational Resources Information Center
Weaver, Dave
The report describes and evaluates the use of a set of learning tools called database managers and their creation of databases to help teach problem solving skills in social studies. Details include the design, building, and use of databases in a social studies setting, along with advantages and disadvantages of using them. The three types of…
CRAVE: a database, middleware and visualization system for phenotype ontologies.
Gkoutos, Georgios V; Green, Eain C J; Greenaway, Simon; Blake, Andrew; Mallon, Ann-Marie; Hancock, John M
2005-04-01
A major challenge in modern biology is to link genome sequence information to organismal function. In many organisms this is being done by characterizing phenotypes resulting from mutations. Efficiently expressing phenotypic information requires the combinatorial use of ontologies. However, tools are not currently available to visualize combinations of ontologies. Here we describe CRAVE (Concept Relation Assay Value Explorer), a package allowing storage, active updating and visualization of multiple ontologies. CRAVE is a web-accessible JAVA application that accesses an underlying MySQL database of ontologies via a JAVA persistent middleware layer (Chameleon). This maps the database tables into discrete JAVA classes and creates memory-resident, interlinked objects corresponding to the ontology data. These JAVA objects are accessed via calls through the middleware's application programming interface. CRAVE allows simultaneous display and linking of multiple ontologies and searching using Boolean and advanced searches.
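The middleware pattern described, tables mapped to classes whose instances become memory-resident, interlinked objects, is easy to miniaturize. The sketch below uses Python and an invented `term` table rather than CRAVE's JAVA/Chameleon stack:

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE term (id INTEGER, name TEXT, parent_id INTEGER);
INSERT INTO term VALUES (1, 'behaviour', NULL), (2, 'grooming', 1);
""")

@dataclass
class Term:                      # one memory-resident object per table row
    id: int
    name: str
    parent_id: Optional[int]

def load_terms(conn):
    # The persistence layer's job in miniature: rows in, linked objects out.
    return {r[0]: Term(*r) for r in conn.execute("SELECT * FROM term")}

terms = load_terms(conn)
print(terms[2].name, "is-a", terms[terms[2].parent_id].name)  # grooming is-a behaviour
```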
An approach to efficient mobility management in intelligent networks
NASA Technical Reports Server (NTRS)
Murthy, K. M. S.
1995-01-01
Providing personal communication systems that support full mobility requires intelligent networks for tracking mobile users and facilitating outgoing and incoming calls over different physical and network environments. In realizing intelligent network functionalities, databases play a major role. Currently proposed network architectures envision using the SS7-based signaling network for linking these databases and also for interconnecting databases with switches. If the network has to support ubiquitous, seamless mobile services, then it has to additionally support mobile application parts, viz., mobile origination calls, mobile destination calls, mobile location updates and inter-switch handovers. These functions will generate a significant amount of data and require it to be transferred between databases (HLR, VLR) and switches (MSCs) very efficiently. In the future, users (fixed or mobile) may use and communicate with sophisticated CPEs (e.g. multimedia, multipoint and multisession calls) which may require complex signaling functions. This will generate voluminous service-handling data and require efficient transfer of these messages between databases and switches. Consequently, network providers would be able to add new services and capabilities to their networks incrementally, quickly and cost-effectively.
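The database side of these mobile application parts can be caricatured with two in-memory registers: a location update writes through the visitor register (VLR) to the home register (HLR), and an incoming call consults the HLR for routing. Field and register names here are illustrative, not from any IN standard:

```python
# Toy home/visitor register bookkeeping (fields invented; not a
# GSM/IN specification).
HLR = {"+15551234": {"vlr": None}}         # home register: one entry per user
VLRS = {"vlr_a": set(), "vlr_b": set()}    # one visitor register per switch area

def location_update(msisdn, new_vlr):
    old = HLR[msisdn]["vlr"]
    if old is not None:
        VLRS[old].discard(msisdn)          # leave the old area
    VLRS[new_vlr].add(msisdn)              # register in the new area
    HLR[msisdn]["vlr"] = new_vlr           # home register learns the location

def route_incoming_call(msisdn):
    return HLR[msisdn]["vlr"]              # switch asks the HLR where to route

location_update("+15551234", "vlr_b")
print(route_incoming_call("+15551234"))    # vlr_b
```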
ClassLess: A Comprehensive Database of Young Stellar Objects
NASA Astrophysics Data System (ADS)
Hillenbrand, Lynne A.; Baliber, Nairn
2015-08-01
We have designed and constructed a database intended to house catalog and literature-published measurements of Young Stellar Objects (YSOs) within ~1 kpc of the Sun. ClassLess, so called because it includes YSOs in all stages of evolution, is a relational database in which user interaction is conducted via HTML web browsers, queries are performed in scientific language, and all data are linked to the sources of publication. Each star is associated with a cluster (or clusters), and both spatially resolved and unresolved measurements are stored, allowing proper use of data from multiple star systems. With this fully searchable tool, myriad ground- and space-based instruments and surveys across wavelength regimes can be exploited. In addition to primary measurements, the database self-consistently calculates and serves higher-level data products such as extinction, luminosity, and mass. As a result, searches for young stars with specific physical characteristics can be completed with just a few mouse clicks. We are in the database population phase now, and are eager to engage with interested experts worldwide on local galactic star formation and young stellar populations.
High Performance Descriptive Semantic Analysis of Semantic Graph Databases
DOE Office of Scientific and Technical Information (OSTI.GOV)
Joslyn, Cliff A.; Adolf, Robert D.; al-Saffar, Sinan
As semantic graph database technology grows to address components ranging from extant large triple stores to SPARQL endpoints over SQL-structured relational databases, it will become increasingly important to be able to understand their inherent semantic structure, whether codified in explicit ontologies or not. Our group is researching novel methods for what we call descriptive semantic analysis of RDF triplestores, to serve purposes of analysis, interpretation, visualization, and optimization. But data size and computational complexity make it increasingly necessary to bring high performance computational resources to bear on this task. Our research group built a novel high performance hybrid system comprising computational capability for semantic graph database processing utilizing the large multi-threaded architecture of the Cray XMT platform, conventional servers, and large data stores. In this paper we describe that architecture and our methods, and present the results of our analyses of basic properties, connected components, namespace interaction, and typed paths for the Billion Triple Challenge 2010 dataset.
NASA Astrophysics Data System (ADS)
van Rensburg, L.; Claassens, S.; Bezuidenhout, J. J.; Jansen van Rensburg, P. J.
2009-03-01
The much-publicised problem of major asbestos pollution and related health issues in South Africa has called for action to be taken to remedy the situation. The aim of this project was to establish a prioritisation index that would provide a scientifically based sequence in which polluted asbestos mines in Southern Africa ought to be rehabilitated. It was reasoned that a computerised database capable of calculating such a Rehabilitation Prioritisation Index (RPI) would be a fruitful departure from the previously used subjective selection prone to human bias. The database was developed in Microsoft Access, and both quantitative and qualitative data were used for the calculation of the RPI value. The logical database structure consists of a number of mines, each consisting of a number of dumps, for which a number of samples have been analysed to determine asbestos fibre content. For this system to be accurate as well as relevant, the data in the database should be revalidated and updated on a regular basis.
Guidelines for the Effective Use of Entity-Attribute-Value Modeling for Biomedical Databases
Dinu, Valentin; Nadkarni, Prakash
2007-01-01
Purpose To introduce the goals of EAV database modeling, to describe the situations where Entity-Attribute-Value (EAV) modeling is a useful alternative to conventional relational methods of database modeling, and to describe the fine points of implementation in production systems. Methods We analyze the following circumstances: 1) data are sparse and have a large number of applicable attributes, but only a small fraction will apply to a given entity; 2) numerous classes of data need to be represented, each class has a limited number of attributes, but the number of instances of each class is very small. We also consider situations calling for a mixed approach where both conventional and EAV design are used for appropriate data classes. Results and Conclusions In robust production systems, EAV-modeled databases trade a modest data sub-schema for a complex metadata sub-schema. The need to design the metadata effectively makes EAV design potentially more challenging than conventional design. PMID:17098467
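A minimal EAV sketch, using an invented clinical example: the sparse observations live as (entity, attribute, value) rows beside a small conventional sub-schema, with attribute governance left to the metadata sub-schema the authors emphasize:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Conventional sub-schema: one row per entity.
CREATE TABLE patient (id INTEGER PRIMARY KEY, name TEXT);

-- EAV sub-schema: one row per (entity, attribute, value) fact, so a
-- sparse attribute costs nothing for the entities that lack it.
CREATE TABLE observation (
    patient_id INTEGER REFERENCES patient(id),
    attribute  TEXT,   -- in production, constrained by the metadata sub-schema
    value      TEXT
);

INSERT INTO patient VALUES (1, 'doe');
INSERT INTO observation VALUES (1, 'serum_k_mmol_l', '4.1'),
                               (1, 'allergy', 'penicillin');
""")

# Pull the sparse facts back out for one entity:
print(conn.execute(
    "SELECT attribute, value FROM observation WHERE patient_id = 1"
).fetchall())
```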
Yanagita, Satoshi; Imahana, Masato; Suwa, Kazuaki; Sugimura, Hitomi; Nishiki, Masayuki
2016-01-01
The Japanese Society of Radiological Technology (JSRT) standard digital image database contains many useful cases of chest X-ray images and has been used in much state-of-the-art research. However, the pixel values of all the images are simply digitized as relative density values by utilizing a scanned-film digitizer. As a result, the pixel values are completely different from the standardized display system input value of Digital Imaging and Communications in Medicine (DICOM), called the presentation value (P-value), which can maintain visual consistency when observing images on displays of different luminance. Therefore, we converted all the images from the JSRT standard digital image database to DICOM format, followed by conversion of the pixel values to P-values, using an original program developed by ourselves. Consequently, the JSRT standard digital image database has been modified so that the visual consistency of images is maintained among displays of different luminance.
Chemical databases evaluated by order theoretical tools.
Voigt, Kristina; Brüggemann, Rainer; Pudenz, Stefan
2004-10-01
Data on environmental chemicals are urgently needed to comply with the future chemicals policy in the European Union. The availability of data on parameters and chemicals can be evaluated by chemometrical and environmetrical methods. Different mathematical and statistical methods are taken into account in this paper. The emphasis is placed on a new, discrete mathematical method called METEOR (method of evaluation by order theory). Application of the Hasse diagram technique (HDT) to the complete data matrix comprising 12 objects (databases) x 27 attributes (parameters + chemicals) reveals that ECOTOX (ECO), the environmental fate database (EFD) and extoxnet (EXT), also called multi-database databases, are best. Most specialised single databases are found in a minimal position in the Hasse diagram; these are the biocatalysis/biodegradation database (BID), the pesticide database (PES) and UmweltInfo (UMW). The aggregation of environmental parameters and chemicals (equal weight) leads to a slimmer data matrix on the attribute side. However, no significant differences are found in the "best" and "worst" objects. The whole approach indicates a rather poor situation in terms of the availability of data on existing chemicals, and hence an alarming signal concerning the new and existing chemicals policies of the EEC.
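The Hasse diagram technique orders objects by componentwise dominance of their attribute vectors. A toy version with made-up 0/1 availability scores (not the paper's 12 x 27 data matrix):

```python
# Hasse-style dominance over a tiny 3-database x 4-attribute availability
# matrix (1 = parameter/chemical covered; values invented).
scores = {
    "ECO": (1, 1, 1, 1),
    "EFD": (1, 1, 0, 1),
    "UMW": (0, 1, 0, 0),
}

def dominates(a, b):
    # a >= b componentwise, and strictly better somewhere
    return all(x >= y for x, y in zip(a, b)) and a != b

order = [(p, q) for p in scores for q in scores
         if dominates(scores[p], scores[q])]
print(order)   # ECO above EFD and UMW; EFD above UMW
```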
Metabolomics analysis: Finding out metabolic building blocks
2017-01-01
In this paper we propose a new methodology for the analysis of metabolic networks. We use the notion of strongly connected components of a graph, called in this context metabolic building blocks. Every strongly connected component is contracted to a single node in such a way that the resulting graph is a directed acyclic graph, called a metabolic DAG, with a considerably reduced number of nodes. The property of being a directed acyclic graph brings out a background graph topology that reveals the connectivity of the metabolic network, as well as its bridges, isolated nodes and cut nodes. Altogether, this becomes key information for the discovery of functional metabolic relations. Our methodology has been applied to the glycolysis and purine metabolic pathways for all organisms in the KEGG database, although it is general enough to work on any database. As expected, using the metabolic DAG formalism, a considerable reduction in the size of the metabolic networks has been obtained, especially in the case of the purine pathway due to its relatively larger size. As a proof of concept, from the information captured by a metabolic DAG and its corresponding metabolic building blocks, we obtain the core of the glycolysis pathway and the core of the purine metabolism pathway and detect some essential metabolic building blocks that reveal the key reactions in both pathways. Finally, the application of our methodology to the glycolysis pathway and the purine metabolism pathway reproduces the tree of life for the whole set of organisms represented in the KEGG database, which supports the utility of this research. PMID:28493998
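The contraction the authors describe is exactly graph condensation: every strongly connected component collapses to one node, and the result is guaranteed acyclic. A toy illustration with invented compound names, using networkx's built-in condensation rather than the authors' code:

```python
import networkx as nx

# Toy directed metabolic network: nodes are compounds, edges reactions.
G = nx.DiGraph([
    ("G6P", "F6P"), ("F6P", "G6P"),   # a reversible pair -> one SCC
    ("F6P", "FBP"), ("FBP", "PYR"),
])

# Contract every strongly connected component (a "metabolic building
# block") to a single node; the result is the metabolic DAG.
dag = nx.condensation(G)
for n, data in dag.nodes(data=True):
    print(n, sorted(data["members"]))   # each node lists its block's compounds
print(list(dag.edges()))
```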
Subject Specific Databases: A Powerful Research Tool
ERIC Educational Resources Information Center
Young, Terrence E., Jr.
2004-01-01
Subject specific databases, or vortals (vertical portals), are databases that provide highly detailed research information on a particular topic. They are the smallest, most focused search tools on the Internet and, in recent years, they've been on the rise. Currently, more of the so-called "mainstream" search engines, subject directories, and…
Finger, Robert P; Porz, Gabriele; Fleckenstein, Monika; Charbel Issa, Peter; Lechtenfeld, Werner; Brohlburg, Daniela; Scholl, Hendrik P N; Holz, Frank G
2010-04-01
The purpose of this study was to establish and evaluate a nationwide telephone counseling hotline for patients with retinal diseases in Germany, against the background of an increasing demand for information and counseling in the field of retina services as a result of current demographic trends. The telephone Retina Hotline was installed, advertised, and run for 1.5 years at the Department of Ophthalmology, University of Bonn, and was open to callers from the whole of Germany. The hotline was staffed by ophthalmologists. Calls were handled according to standard flow charts, and counsel given adhered to a list of standardized answers as appropriate in the individual case. All calls were documented in an online database, which was subsequently analyzed and used for evaluation. A total of 1,384 calls were documented, an average of 7.6 calls per afternoon. The average length of calls was 8.5 minutes. The majority of callers were female patients (63%) who had age-related macular degeneration. Only 17% of callers were relatives. Most callers (59%) were >60 years of age. The majority of questions were related to therapeutic options for dry or neovascular age-related macular degeneration as well as various forms of retinitis pigmentosa (45%). A service such as the Retina Hotline seems necessary and well justified against the documented need for information and support. However, on the basis of an adequate computer program and a standard catalog of answers or flow charts, it may not need to be staffed by ophthalmologists; well-trained nonmedical staff may be sufficient.
ERIC Educational Resources Information Center
Terawaki, Yuki; Takahashi, Yuichi; Kodama, Yasushi; Yana, Kazuo
2011-01-01
This paper describes an integration of the different Relational Database Management Systems (RDBMSs) of two Course Management Systems (CMSs) called Sakai and the Common Factory for Inspiration and Value in Education (CFIVE). First, when the service of a CMS is provided campus-wide, the problems of user support, CMS operation and customization of the CMS are…
Intelligent communication assistant for databases
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jakobson, G.; Shaked, V.; Rowley, S.
1983-01-01
An intelligent communication assistant for databases, called FRED (front end for databases), is explored. FRED is designed to facilitate access to database systems by users of varying levels of experience. FRED is a second-generation natural language front-end for databases and intends to solve two critical interface problems existing between end-users and databases: connectivity and communication. The authors report their experiences in developing software for natural language query processing, dialog control, and knowledge representation, as well as the direction of future work. 10 references.
An overview of the multi-database manipulation language MDSL
DOE Office of Scientific and Technical Information (OSTI.GOV)
Litwin, W.; Abdellatif, A.
With the increase in availability of databases, data needed by a user are frequently in separate autonomous databases. The logical properties of such data differ from the classical ones with a single database. In particular, they call for new functions for data manipulation. MDSL is a new data manipulation language providing such functions. Most of the MDSL functions are not available in other languages.
Using CLIPS in a distributed system: The Network Control Center (NCC) expert system
NASA Technical Reports Server (NTRS)
Wannemacher, Tom
1990-01-01
This paper describes an intelligent troubleshooting system for the Help Desk domain. It was developed on an IBM-compatible 80286 PC using Microsoft C and CLIPS, and on an AT&T 3B2 minicomputer using the UNIFY database and a combination of shell scripts, C programs and SQL queries. The two computers are linked by a LAN. The functions of this system are to help non-technical NCC personnel handle trouble calls, to keep a log of problem calls with complete, concise information, and to keep a historical database of problems. The database helps identify hardware and software problem areas and provides a source of new rules for the troubleshooting knowledge base.
Dynamic XML-based exchange of relational data: application to the Human Brain Project.
Tang, Zhengming; Kadiyska, Yana; Li, Hao; Suciu, Dan; Brinkley, James F
2003-01-01
This paper discusses an approach to exporting relational data in XML format for data exchange over the web. We describe the first real-world application of SilkRoute, a middleware program that dynamically converts existing relational data to a user-defined XML DTD. The application, called XBrain, wraps SilkRoute in a Java Server Pages framework, thus permitting a web-based XQuery interface to a legacy relational database. The application is demonstrated as a query interface to the University of Washington Brain Project's Language Map Experiment Management System, which is used to manage data about language organization in the brain.
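The essential move, rendering relational rows under a user-chosen XML vocabulary, can be sketched without SilkRoute itself. The table, element and attribute names below are invented, not the Language Map Experiment Management System's schema:

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE experiment (id INTEGER, subject TEXT);
INSERT INTO experiment VALUES (7, 'L1');
""")

# Serve a relational result set under a user-defined XML vocabulary,
# in the spirit of a dynamic relational-to-XML view.
root = ET.Element("experiments")
for row_id, subject in conn.execute("SELECT id, subject FROM experiment"):
    e = ET.SubElement(root, "experiment", id=str(row_id))
    ET.SubElement(e, "subject").text = subject
print(ET.tostring(root, encoding="unicode"))
```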
Fifteen hundred guidelines and growing: the UK database of clinical guidelines.
van Loo, John; Leonard, Niamh
2006-06-01
The National Library for Health offers a comprehensive searchable database of nationally approved clinical guidelines, called the Guidelines Finder. This resource, commissioned in 2002, is managed and developed by the University of Sheffield Health Sciences Library. The authors introduce the historical and political dimension of guidelines and the nature of guidelines as a mechanism to ensure clinical effectiveness in practice. The article then outlines the maintenance and organisation of the Guidelines Finder database itself, the criteria for selection, who publishes guidelines and guideline formats, usage of the Guidelines Finder service and finally looks at some lessons learnt from a local library offering a national service. Clinical guidelines are central to effective clinical practice at the national, organisational and individual level. The Guidelines Finder is one of the most visited resources within the National Library for Health and is successful in answering information needs related to specific patient care, clinical research, guideline development and education.
Hmrbase: a database of hormones and their receptors
Rashid, Mamoon; Singla, Deepak; Sharma, Arun; Kumar, Manish; Raghava, Gajendra PS
2009-01-01
Background Hormones are signaling molecules that play vital roles in various life processes, like growth and differentiation, physiology, and reproduction. These molecules are mostly secreted by endocrine glands and transported to target organs through the bloodstream. Deficient or excessive levels of hormones are associated with several diseases, such as cancer, osteoporosis and diabetes. Thus, it is important to collect and compile information about hormones and their receptors. Description This manuscript describes a database called Hmrbase which has been developed for managing information about hormones and their receptors. It is a highly curated database for which information has been collected from the literature and the public databases. The current version of Hmrbase contains comprehensive information about ~2000 hormones, e.g., about their function, source organism, receptors, mature sequences and structures. Hmrbase also contains information about ~3000 hormone receptors, in terms of amino acid sequences, subcellular localizations, ligands, and post-translational modifications. One of the major features of this database is that it provides data about ~4100 hormone-receptor pairs. A number of online tools have been integrated into the database to provide facilities like keyword search, structure-based search, mapping of given peptide(s) onto the hormone/receptor sequence, and sequence similarity search. This database also provides a number of external links to other resources/databases in order to help in retrieving further related information. Conclusion Owing to the high impact of endocrine research in the biomedical sciences, Hmrbase could become a leading data portal for researchers. The salient features of Hmrbase are hormone-receptor pair-related information, mapping of peptide stretches on the protein sequences of hormones and receptors, Pfam domain annotations, categorical browsing options, online data submission and DrugPedia linkage. Hmrbase is available online to the public from . PMID:19589147
dbDSM: a manually curated database for deleterious synonymous mutations.
Wen, Pengbo; Xiao, Peng; Xia, Junfeng
2016-06-15
Synonymous mutations (SMs), which change the sequence of a gene without directly altering the amino acid sequence of the encoded protein, were long thought to have no functional consequences. They are often assumed to be neutral in models of mutation and selection and were completely ignored in many studies. However, accumulating experimental evidence has demonstrated that these mutations exert their impact on gene functions via splicing accuracy, mRNA stability, translation fidelity, protein folding and expression, and some of these mutations are implicated in human diseases. To the best of our knowledge, there is still no database specifically focusing on disease-related SMs. We have developed a new database called dbDSM (database of Deleterious Synonymous Mutation), a continually updated database that collects, curates and manages available human disease-related SM data obtained from published literature. In the current release, dbDSM collects 1936 SM-disease association entries, including 1289 SMs and 443 human diseases from ClinVar, GRASP, GWAS Catalog, GWASdb, PolymiRTS database, PubMed database and Web of Knowledge. Additionally, we provide users a link to download all the data in dbDSM and a link to submit novel data into the database. We hope dbDSM will be a useful resource for investigating the roles of SMs in human disease. dbDSM is freely available online at http://bioinfo.ahu.edu.cn:8080/dbDSM/index.jsp with all major browsers supported. jfxia@ahu.edu.cn Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Metabolonote: A Wiki-Based Database for Managing Hierarchical Metadata of Metabolome Analyses
Ara, Takeshi; Enomoto, Mitsuo; Arita, Masanori; Ikeda, Chiaki; Kera, Kota; Yamada, Manabu; Nishioka, Takaaki; Ikeda, Tasuku; Nihei, Yoshito; Shibata, Daisuke; Kanaya, Shigehiko; Sakurai, Nozomu
2015-01-01
Metabolomics – technology for comprehensive detection of small molecules in an organism – lags behind the other “omics” in terms of publication and dissemination of experimental data. Among the reasons for this are difficulty precisely recording information about complicated analytical experiments (metadata), existence of various databases with their own metadata descriptions, and low reusability of the published data, resulting in submitters (the researchers who generate the data) being insufficiently motivated. To tackle these issues, we developed Metabolonote, a Semantic MediaWiki-based database designed specifically for managing metabolomic metadata. We also defined a metadata and data description format, called “Togo Metabolome Data” (TogoMD), with an ID system that is required for unique access to each level of the tree-structured metadata such as study purpose, sample, analytical method, and data analysis. Separation of the management of metadata from that of data and permission to attach related information to the metadata provide advantages for submitters, readers, and database developers. The metadata are enriched with information such as links to comparable data, thereby functioning as a hub of related data resources. They also enhance not only readers’ understanding and use of data but also submitters’ motivation to publish the data. The metadata are computationally shared among other systems via APIs, which facilitate the construction of novel databases by database developers. A permission system that allows publication of immature metadata and feedback from readers also helps submitters to improve their metadata. Hence, this aspect of Metabolonote, as a metadata preparation tool, is complementary to high-quality and persistent data repositories such as MetaboLights. A total of 808 metadata for analyzed data obtained from 35 biological species are published currently. Metabolonote and related tools are available free of cost at http://metabolonote.kazusa.or.jp/. PMID:25905099
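The tree-structured metadata with one ID per level (study, sample, method, data analysis) can be mimicked with dotted identifiers. The ID syntax here is our invention for illustration; TogoMD's actual convention may differ:

```python
# Tree-structured metadata with one record per level; the dotted ID
# scheme below is hypothetical, not TogoMD's published format.
metadata = {
    "SE1":          {"title": "Rice salt-stress study"},
    "SE1.S1":       {"sample": "leaf, 7 d"},
    "SE1.S1.M1":    {"method": "LC-MS, positive mode"},
    "SE1.S1.M1.D1": {"data_analysis": "peak alignment v2"},
}

def level(record_id):
    # Each dot adds one level: study < sample < method < data analysis.
    return record_id.count(".")

print(sorted(metadata, key=level))   # parents sort before children
```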
A Tutorial in Creating Web-Enabled Databases with Inmagic DB/TextWorks through ODBC.
ERIC Educational Resources Information Center
Breeding, Marshall
2000-01-01
Explains how to create Web-enabled databases. Highlights include Inmagic's DB/Text WebPublisher product called DB/TextWorks; ODBC (Open Database Connectivity) drivers; Perl programming language; HTML coding; Structured Query Language (SQL); Common Gateway Interface (CGI) programming; and examples of HTML pages and Perl scripts. (LRW)
Setlik, Jennifer; Bond, G Randall; Ho, Mona
2009-09-01
We sought to better understand the trend in abuse of prescription attention-deficit/hyperactivity disorder (ADHD) medications by teenagers. We queried the American Association of Poison Control Centers' National Poison Data System for the years 1998-2005 for all cases involving people aged 13 to 19 years in which the reason was intentional abuse or intentional misuse and the substance was a prescription medication used for ADHD treatment. For trend comparison, we sought data on the total number of exposures. In addition, we used teen and preteen ADHD medication sales data from IMS Health's National Disease and Therapeutic Index database to compare poison center call trends with likely availability. Calls related to teenaged victims of prescription ADHD medication abuse rose 76%, a faster rise than that of calls for substance abuse generally or for teen substance abuse. The annual rate of total and teen exposures was unchanged. Over the 8 years, estimated prescriptions for teenagers and preteenagers increased 133% for amphetamine products, 52% for methylphenidate products, and 80% for both together. Reports of exposure to methylphenidate fell from 78% to 30%, whereas methylphenidate as a percentage of ADHD prescriptions decreased from 66% to 56%. Substance-related abuse calls per million adolescent prescriptions rose 140%. The sharp increase, out of proportion to other poison center calls, suggests a rising problem with teen ADHD stimulant medication abuse. Case severity increased over time. Sales data for ADHD medications suggest that the increase in use and call volume reflects availability, but the increase disproportionately involves amphetamines.
Five years of poisons information on the internet: the UK experience of TOXBASE
Bateman, D N; Good, A M
2006-01-01
Introduction: In 1999, the UK adopted a policy of using TOXBASE, an internet service available free to registered National Health Service (NHS) departments and professionals, as the first point of information on poisoning. This was the first use worldwide of the internet for provision of clinical advice at a national level. We report the impact on database usage and NPIS telephone call loads. Methods: Trends in the pattern of TOXBASE usage from 2000–2004 are reported by user category. Information on the monographs accessed most frequently was also extracted from the webserver and sorted by user category. The numbers of telephone calls to the National Poisons Information Service (NPIS) were extracted from NPIS annual reports. Results: Numbers of database logons increased 3.5-fold, from 102 352 in 2000 to 368 079 in 2004, with a total of 789 295 accesses to product monographs in 2004. Registered users increased almost tenfold, with approximately half accessing the database at least once a year. Telephone calls to the NPIS dropped by over half. Total contacts with NPIS (web and telephone) increased 50%. Major users in 2004 were hospital emergency departments (60.5% of logons) and NHS public access helplines (NHS Direct and NHS24) (29.4%). Different user groups access different parts of the database. Emergency departments access printable fact sheets for about 10% of the monographs they access. Conclusion: Provision of poisons information by the internet has been successful in reducing NPIS call loads. Provision of basic poisons information by this method appears to be acceptable to different professional groups, and to be effective in reducing telephone call loads and increasing service cost effectiveness. PMID:16858093
DOE Office of Scientific and Technical Information (OSTI.GOV)
Courteau, J.
1991-10-11
Since the Genome Project began several years ago, a plethora of databases have been developed or are in the works. They range from the massive Genome Data Base at Johns Hopkins University, the central repository of all gene mapping information, to small databases focusing on single chromosomes or organisms. Some are publicly available, others are essentially private electronic lab notebooks. Still others limit access to a consortium of researchers working on, say, a single human chromosome. An increasing number incorporate sophisticated search and analytical software, while others operate as little more than data lists. In consultation with numerous experts in the field, a list has been compiled of some key genome-related databases. The list was not limited to map and sequence databases but also included the tools investigators use to interpret and elucidate genetic data, such as protein sequence and protein structure databases. Because a major goal of the Genome Project is to map and sequence the genomes of several experimental animals, including E. coli, yeast, fruit fly, nematode, and mouse, the available databases for those organisms are listed as well. The author also includes several databases that are still under development - including some ambitious efforts that go beyond data compilation to create what are being called electronic research communities, enabling many users, rather than just one or a few curators, to add or edit the data and tag it as raw or confirmed.
Integrated Array/Metadata Analytics
NASA Astrophysics Data System (ADS)
Misev, Dimitar; Baumann, Peter
2015-04-01
Data comes in various forms and types, and integration usually presents a problem that is often simply ignored or solved with ad-hoc solutions. Multidimensional arrays are a ubiquitous data type that we find at the core of virtually all science and engineering domains, as sensor, model, image, and statistics data. Naturally, arrays are richly described by and intertwined with additional metadata (alphanumeric relational data, XML, JSON, etc.). Database systems, however, a fundamental building block of what we call "Big Data", lack adequate support for modelling and expressing these array data/metadata relationships. Array analytics is hence quite primitive, or absent entirely, in modern relational DBMSs. Recognizing this, we extended SQL with a new SQL/MDA part, seamlessly integrating multidimensional array analytics into the standard database query language. We demonstrate the benefits of SQL/MDA with real-world examples executed in ASQLDB, an open-source mediator system based on HSQLDB and rasdaman that already implements SQL/MDA.
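SQL/MDA pushes array operations into the query itself; without it, applications must glue the two worlds together by hand. The sketch below shows that manual pattern in Python: a relational predicate (here against SQLite) selects which arrays to process, and the array aggregate then runs client-side in NumPy. The schema and raster payloads are invented for illustration.

```python
# Emulates, in application code, what SQL/MDA expresses in a single query:
# relational metadata selects which arrays to process, then an array operation runs on them.
import sqlite3
import numpy as np

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scenes (id INTEGER PRIMARY KEY, sensor TEXT, cloud_pct REAL)")
conn.executemany("INSERT INTO scenes VALUES (?, ?, ?)",
                 [(1, "landsat", 5.0), (2, "landsat", 80.0), (3, "modis", 2.0)])

# Array payloads live outside the relational engine (here: mock in-memory rasters).
rasters = {i: np.random.default_rng(i).random((4, 4)) for i in (1, 2, 3)}

# The relational predicate picks the arrays; the array aggregate runs client-side.
ids = [row[0] for row in conn.execute(
    "SELECT id FROM scenes WHERE sensor = ? AND cloud_pct < ?", ("landsat", 10.0))]
for scene_id in ids:
    print(scene_id, rasters[scene_id].mean())  # SQL/MDA would push this mean into the query
```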
Kalyanaraman, Ananth; Cannon, William R; Latt, Benjamin; Baxter, Douglas J
2011-11-01
A MapReduce-based implementation called MR-MSPolygraph for parallelizing peptide identification from mass spectrometry data is presented. The underlying serial method, MSPolygraph, uses a novel hybrid approach to match an experimental spectrum against a combination of a protein sequence database and a spectral library. Our MapReduce implementation can run on any Hadoop cluster environment. Experimental results demonstrate that, relative to the serial version, MR-MSPolygraph reduces the time to solution from weeks to hours for processing tens of thousands of experimental spectra. Speedup and other related performance studies are also reported on a 400-core Hadoop cluster, using spectral datasets from environmental microbial communities as inputs. The source code, along with user documentation, is available at http://compbio.eecs.wsu.edu/MR-MSPolygraph. Contact: ananth@eecs.wsu.edu; william.cannon@pnnl.gov. Supplementary data are available at Bioinformatics online.
Ikeda, Shun; Abe, Takashi; Nakamura, Yukiko; Kibinge, Nelson; Hirai Morita, Aki; Nakatani, Atsushi; Ono, Naoaki; Ikemura, Toshimichi; Nakamura, Kensuke; Altaf-Ul-Amin, Md; Kanaya, Shigehiko
2013-05-01
Biology is increasingly becoming a data-intensive science with the recent progress of the omics fields, e.g. genomics, transcriptomics, proteomics and metabolomics. The species-metabolite relationship database, KNApSAcK Core, has been widely utilized and cited in metabolomics research, and chronological analysis of that research work has helped to reveal recent trends in metabolomics research. To meet the needs of these trends, the KNApSAcK database has been extended by incorporating a secondary metabolic pathway database called Motorcycle DB. We examined the enzyme sequence diversity related to secondary metabolism by means of batch-learning self-organizing maps (BL-SOMs). Initially, we constructed a map by using a big data matrix consisting of the frequencies of all possible dipeptides in the protein sequence segments of plants and bacteria. The enzyme sequence diversity of the secondary metabolic pathways was examined by identifying clusters of segments associated with certain enzyme groups in the resulting map. The extent of diversity of 15 secondary metabolic enzyme groups is discussed. Data-intensive approaches such as BL-SOM applied to big data matrices are needed for systematizing protein sequences. Handling big data has become an inevitable part of biology.
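As a concrete illustration of the input the BL-SOM is trained on, the following sketch builds the dipeptide-frequency matrix for a few toy protein segments: one row per segment, one column for each of the 400 possible dipeptides. Window extraction and the SOM training itself are omitted.

```python
# Build the dipeptide-frequency matrix used as BL-SOM input: one row per
# sequence segment, one column per possible dipeptide (20 x 20 = 400).
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
INDEX = {dp: i for i, dp in enumerate(DIPEPTIDES)}

def dipeptide_frequencies(segment: str) -> list[float]:
    counts = [0] * len(DIPEPTIDES)
    for i in range(len(segment) - 1):
        idx = INDEX.get(segment[i:i + 2])
        if idx is not None:          # skip pairs containing non-standard residues
            counts[idx] += 1
    total = sum(counts) or 1
    return [c / total for c in counts]

# Example: two toy segments; real input would be windows over plant and bacterial proteins.
matrix = [dipeptide_frequencies(s) for s in ("MKTAYIAKQR", "GAVLIPFW")]
print(len(matrix), len(matrix[0]))   # 2 rows x 400 dipeptide columns
```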
Construction of In-house Databases in a Corporation
NASA Astrophysics Data System (ADS)
Tamura, Haruki; Mezaki, Koji
This paper describes the fundamental ideas of technical information management in Mitsubishi Heavy Industries, Ltd., and the present status of those activities. It then introduces the background and history of the development of the Mitsubishi Heavy Industries Technical Information Retrieval System (called MARON), which started service in May 1985, together with the problems encountered and the countermeasures taken against them. The system deals with databases covering information common to the whole company (in-house research and technical reports, holdings information for books, journals, and so on) and local information held in each business division or department. Anybody from any division can access these databases through the company-wide network. An in-house interlibrary loan subsystem called Orderentry is available, which supports the acquisition of original materials.
MySQL/PHP web database applications for IPAC proposal submission
NASA Astrophysics Data System (ADS)
Crane, Megan K.; Storrie-Lombardi, Lisa J.; Silbermann, Nancy A.; Rebull, Luisa M.
2008-07-01
The Infrared Processing and Analysis Center (IPAC) is NASA's multi-mission center of expertise for long-wavelength astrophysics. Proposals for various IPAC missions and programs are ingested via MySQL/PHP web database applications. Proposers use web forms to enter coversheet information and upload PDF files related to the proposal. Upon proposal submission, a unique directory is created on the webserver into which all of the uploaded files are placed. The coversheet information is converted into a PDF file using a PHP extension called FPDF. The files are concatenated into one PDF file using the command-line tool pdftk and then forwarded to the review committee. This work was performed at the California Institute of Technology under contract to the National Aeronautics and Space Administration.
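A rough Python analogue of the submission pipeline (the original uses PHP with the FPDF extension): coversheet fields are rendered to a PDF and then concatenated with the uploaded files by calling the pdftk command-line tool. The fpdf Python package stands in for FPDF here, and all file names and fields are hypothetical.

```python
# Sketch of the submission pipeline: render coversheet fields to PDF, then
# concatenate with the uploaded files using the pdftk command-line tool.
import subprocess
from fpdf import FPDF

def build_proposal(fields: dict, uploads: list[str], out: str = "proposal.pdf") -> None:
    pdf = FPDF()
    pdf.add_page()
    pdf.set_font("Helvetica", size=12)
    for key, value in fields.items():        # one coversheet line per form field
        pdf.cell(0, 10, f"{key}: {value}")
        pdf.ln()
    pdf.output("coversheet.pdf")
    # pdftk <in1> <in2> ... cat output <out> concatenates the PDFs in order.
    subprocess.run(["pdftk", "coversheet.pdf", *uploads, "cat", "output", out],
                   check=True)

build_proposal({"PI": "A. Proposer", "Title": "Dust in M31"}, ["science_case.pdf"])
```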
Use of Graph Database for the Integration of Heterogeneous Biological Data.
Yoon, Byoung-Ha; Kim, Seon-Kyu; Kim, Seon-Young
2017-03-01
Understanding complex relationships among heterogeneous biological data is one of the fundamental goals in biology. In most cases, diverse biological data are stored in relational databases, such as MySQL and Oracle, which store data in multiple tables and then infer relationships by multiple-join statements. Recently, a new type of database, called the graph-based database, was developed to natively represent various kinds of complex relationships, and it is widely used among computer science communities and IT industries. Here, we demonstrate the feasibility of using a graph-based database for complex biological relationships by comparing the performance between MySQL and Neo4j, one of the most widely used graph databases. We collected various biological data (protein-protein interaction, drug-target, gene-disease, etc.) from several existing sources, removed duplicate and redundant data, and finally constructed a graph database containing 114,550 nodes and 82,674,321 relationships. When we tested the query execution performance of MySQL versus Neo4j, we found that Neo4j outperformed MySQL in all cases. While Neo4j exhibited a very fast response for various queries, MySQL exhibited latent or unfinished responses for complex queries with multiple-join statements. These results show that using graph-based databases, such as Neo4j, is an efficient way to store complex biological relationships. Moreover, querying a graph database in diverse ways has the potential to reveal novel relationships among heterogeneous biological data.
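To make the contrast concrete, here is the same kind of two-hop question asked in both idioms: a multiple-join SQL statement versus a single Cypher pattern sent through the official neo4j Python driver. The node labels, relationship types, and connection details are illustrative, not the schema used in the paper.

```python
# Two-hop query "which drugs target genes associated with a disease", both ways.
from neo4j import GraphDatabase

SQL_VERSION = """
SELECT DISTINCT dr.name
FROM gene g
JOIN gene_disease gd ON gd.gene_id = g.id
JOIN drug_target  dt ON dt.gene_id = g.id
JOIN drug         dr ON dr.id = dt.drug_id
WHERE gd.disease = 'melanoma';
"""  # relational form: every hop is another join

# Graph form: the same traversal is a single path pattern.
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
with driver.session() as session:
    records = session.run(
        "MATCH (d:Drug)-[:TARGETS]->(:Gene)-[:ASSOCIATED_WITH]->(x:Disease {name: $disease}) "
        "RETURN DISTINCT d.name",
        disease="melanoma",
    )
    for record in records:
        print(record["d.name"])
driver.close()
```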
Optics Toolbox: An Intelligent Relational Database System For Optical Designers
NASA Astrophysics Data System (ADS)
Weller, Scott W.; Hopkins, Robert E.
1986-12-01
Optical designers were among the first to use the computer as an engineering tool. Powerful programs have been written to do ray-trace analysis, third-order layout, and optimization. However, newer computing techniques such as database management and expert systems have not been adopted by the optical design community. For the purpose of this discussion we will define a relational database system as a database which allows the user to specify his requirements using logical relations. For example, to search for all lenses in a lens database with an F/number less than two and a half field of view near 28 degrees, you might enter the following: FNO < 2.0 and FOV of 28 degrees ± 5%. Again for the purpose of this discussion, we will define an expert system as a program which contains expert knowledge, can ask intelligent questions, and can form conclusions based on the answers given and the knowledge which it contains. Most expert systems store this knowledge in the form of rules-of-thumb, which are written in an English-like language and which are easily modified by the user. An example rule is: IF require microscope objective in air and require NA > 0.9 THEN suggest the use of an oil immersion objective. The heart of the expert system is the rule interpreter, sometimes called an inference engine, which reads the rules and forms conclusions based on them. The use of a relational database system containing lens prototypes seems to be a viable prospect. However, it is not clear that expert systems have a place in optical design. In domains such as medical diagnosis and petrology, expert systems are flourishing. These domains are quite different from optical design, however, because optical design is a creative process, and the rules are difficult to write down. We do think that an expert system is feasible in the area of first-order layout, which is sufficiently diagnostic in nature to permit useful rules to be written. This first-order expert would emulate an expert designer as he interacted with a customer for the first time: asking the right questions, forming conclusions, and making suggestions. With these objectives in mind, we have developed the Optics Toolbox. Optics Toolbox is actually two programs in one: it is a powerful relational database system with twenty-one search parameters, four search modes, and multi-database support, as well as a first-order optical design expert system with a rule interpreter which has full access to the relational database. The system schematic is shown in Figure 1.
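A toy sketch of the two halves described above, in Python: a relational search over lens prototypes with a tolerance predicate matching the "FNO < 2.0 and FOV of 28 degrees ± 5%" example, and a one-rule inference step in the IF/THEN style quoted. The lens records and rule encoding are invented for illustration.

```python
# Toy version of the two halves of Optics Toolbox: a tolerance-based relational
# search over lens prototypes, and a minimal rule interpreter for first-order advice.
lenses = [
    {"name": "proto-17", "fno": 1.8, "fov": 28.5},
    {"name": "proto-42", "fno": 2.8, "fov": 27.0},
]

def near(value: float, target: float, pct: float) -> bool:
    return abs(value - target) <= target * pct / 100.0

# "FNO < 2.0 and FOV of 28 degrees +/- 5%"
hits = [l for l in lenses if l["fno"] < 2.0 and near(l["fov"], 28.0, 5.0)]
print([l["name"] for l in hits])

# IF <conditions> THEN <suggestion>, evaluated against the designer's answers.
rules = [({"application": "microscope objective in air", "na_gt": 0.9},
          "suggest the use of an oil immersion objective")]
answers = {"application": "microscope objective in air", "na": 0.95}
for cond, advice in rules:
    if answers["application"] == cond["application"] and answers["na"] > cond["na_gt"]:
        print(advice)
```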
2016-03-01
Representational state transfer; Java messaging service; Java application programming interface (API); Internet relay chat (IRC)/extensible messaging and... JBoss application server or an Apache Tomcat servlet container instance. The relational database management system can be either PostgreSQL or MySQL... Java library called direct web remoting. This library has been part of the core CACE architecture for quite some time; however, there have not been...
Nuclear Energy Infrastructure Database Description and User’s Manual
DOE Office of Scientific and Technical Information (OSTI.GOV)
Heidrich, Brenden
In 2014, the Deputy Assistant Secretary for Science and Technology Innovation initiated the Nuclear Energy (NE)–Infrastructure Management Project by tasking the Nuclear Science User Facilities, formerly the Advanced Test Reactor National Scientific User Facility, to create a searchable and interactive database of all pertinent NE-supported and -related infrastructure. This database, known as the Nuclear Energy Infrastructure Database (NEID), is used for analyses to establish needs, redundancies, efficiencies, distributions, etc., to best understand the utility of NE’s infrastructure and inform the content of infrastructure calls. The Nuclear Science User Facilities developed the database by utilizing data and policy direction from a variety of reports from the U.S. Department of Energy, the National Research Council, the International Atomic Energy Agency, and various other federal and civilian resources. The NEID currently contains data on 802 research and development instruments housed in 377 facilities at 84 institutions in the United States and abroad. The effort to maintain and expand the database is ongoing. Detailed information on many facilities must be gathered from associated institutions and added to complete the database. The data must be validated and kept current to capture facility and instrumentation status as well as to cover new acquisitions and retirements. This document provides a short tutorial on the navigation of the NEID web portal at NSUF-Infrastructure.INL.gov.
NATIONAL URBAN DATABASE AND ACCESS PORTAL TOOL
Current mesoscale weather prediction and microscale dispersion models are limited in their ability to perform accurate assessments in urban areas. A project called the National Urban Database with Access Portal Tool (NUDAPT) is beginning to provide urban data and improve the para...
The XSD-Builder Specification Language—Toward a Semantic View of XML Schema Definition
NASA Astrophysics Data System (ADS)
Fong, Joseph; Cheung, San Kuen
In the present database market, the XML database model is a main structure for the forthcoming database systems in the Internet environment. As a conceptual schema of an XML database, the XML model has limitations in presenting its data semantics, and system analysts have had no toolset for modeling and analyzing an XML system. We apply the XML Tree Model (shown in Figure 2) as a conceptual schema of an XML database to model and analyze the structure of an XML database. It is important not only for visualizing, specifying, and documenting structural models, but also for constructing executable systems. The tree model represents the inter-relationships among elements inside different logical schemas such as XML Schema Definition (XSD), DTD, Schematron, XDR, SOX, and DSD (shown in Figure 1; an explanation of the terms in the figure is given in Table 1). The XSD-Builder consists of the XML Tree Model, a source language, a translator, and XSD. The source language, called XSD-Source, mainly provides a user-friendly environment for writing an XSD. The source language is then translated by the XSD-Translator. The output of the XSD-Translator is an XSD, which is our target and is called the object language.
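Since the end product of XSD-Builder is an ordinary XSD, any validating parser can consume it. A small Python check using lxml, assuming a translator output file named course.xsd and a matching toy instance document (both hypothetical):

```python
# Validate an XML instance document against a generated XSD using lxml.
from lxml import etree

schema = etree.XMLSchema(etree.parse("course.xsd"))   # hypothetical translator output
doc = etree.fromstring(b"<course><title>Databases</title></course>")
print("valid" if schema.validate(doc) else schema.error_log)
```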
Pan-cancer analysis reveals technical artifacts in TCGA germline variant calls.
Buckley, Alexandra R; Standish, Kristopher A; Bhutani, Kunal; Ideker, Trey; Lasken, Roger S; Carter, Hannah; Harismendy, Olivier; Schork, Nicholas J
2017-06-12
Cancer research to date has largely focused on somatically acquired genetic aberrations. In contrast, the degree to which germline, or inherited, variation contributes to tumorigenesis remains unclear, possibly due to a lack of accessible germline variant data. Here we called germline variants on 9618 cases from The Cancer Genome Atlas (TCGA) database representing 31 cancer types. We identified batch effects affecting loss of function (LOF) variant calls that can be traced back to differences in the way the sequence data were generated both within and across cancer types. Overall, LOF indel calls were more sensitive to technical artifacts than LOF Single Nucleotide Variant (SNV) calls. In particular, whole genome amplification of DNA prior to sequencing led to an artificially increased burden of LOF indel calls, which confounded association analyses relating germline variants to tumor type despite stringent indel filtering strategies. The samples affected by these technical artifacts include all acute myeloid leukemia and practically all ovarian cancer samples. We demonstrate how technical artifacts induced by whole genome amplification of DNA can lead to false positive germline-tumor type associations and suggest TCGA whole genome amplified samples be used with caution. This study draws attention to the need to be sensitive to problems associated with a lack of uniformity in data generation in TCGA data.
HITRAN2016: new and improved data and tools towards studies of planetary atmospheres
NASA Astrophysics Data System (ADS)
Gordon, Iouli; Rothman, Laurence S.; Wilzewski, Jonas S.; Kochanov, Roman V.; Hill, Christian; Tan, Yan; Wcislo, Piotr
2016-10-01
The HITRAN2016 molecular spectroscopic database is scheduled to be released this year. It will replace the current edition, HITRAN2012 [1], which has been in use, along with some intermediate updates, since 2012. We have added, revised, and improved many transitions and bands of molecular species and their isotopologues. The number of parameters has also been significantly increased, now incorporating, for instance, broadening by He, H2 and CO2, which are dominant in different planetary atmospheres [2]; non-Voigt line profiles [3]; and other phenomena. This poster will provide a summary of the updates, emphasizing details of some of the most important or drastic improvements or additions. To allow flexible incorporation of the new parameters and improve the efficiency of database usage, the whole database has been reorganized into a relational database structure and presented to the user by means of a very powerful, easy-to-use internet program called HITRANonline [4], accessible at
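Programmatic access of the kind HITRANonline provides is commonly scripted through HAPI, the HITRAN Application Programming Interface for Python. A minimal fetch, assuming the hapi package is installed and the service is reachable; the wavenumber range and molecule choice are arbitrary examples.

```python
# Download HITRAN line data into a local cache via HAPI.
from hapi import db_begin, fetch, tableList

db_begin("hitran_data")            # local cache directory for downloaded tables
fetch("H2O", 1, 1, 3400, 4100)     # H2O, isotopologue 1, 3400-4100 cm^-1
print(tableList())                 # list the tables now available locally
```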
Yu, Kebing; Salomon, Arthur R
2009-12-01
Recently, dramatic progress has been achieved in expanding the sensitivity, resolution, mass accuracy, and scan rate of mass spectrometers able to fragment and identify peptides through MS/MS. Unfortunately, this enhanced ability to acquire proteomic data has not been accompanied by a concomitant increase in the availability of flexible tools allowing users to rapidly assimilate, explore, and analyze this data and adapt to various experimental workflows with minimal user intervention. Here we fill this critical gap by providing a flexible relational database called PeptideDepot for organization of expansive proteomic data sets, collation of proteomic data with available protein information resources, and visual comparison of multiple quantitative proteomic experiments. Our software design, built upon the synergistic combination of a MySQL database for safe warehousing of proteomic data with a FileMaker-driven graphical user interface for flexible adaptation to diverse workflows, enables proteomic end-users to directly tailor the presentation of proteomic data to the unique analysis requirements of the individual proteomics lab. PeptideDepot may be deployed as an independent software tool or integrated directly with our high throughput autonomous proteomic pipeline used in the automated acquisition and post-acquisition analysis of proteomic data.
Minerva: using a software program to improve resident performance during independent call
NASA Astrophysics Data System (ADS)
Itri, Jason N.; Redfern, Regina O.; Cook, Tessa; Scanlon, Mary H.
2010-03-01
We have developed an application called Minerva that allows tracking of resident discrepancy rates and missed cases. Minerva mines the radiology information system (RIS) for preliminary interpretations provided by residents during independent call and copies both the preliminary and final interpretations to a database. Both versions are displayed for direct comparison by Minerva and classified as 'in agreement', 'minor discrepancy' or 'major discrepancy' by the residency program director. Minerva compiles statistics comparing minor, major and total discrepancy rates for individual residents relative to the overall group. Discrepant cases are categorized according to date, modality and body part and reviewed for trends in missed cases. The rates of minor, major and total discrepancies for residents on call at our institution were similar to rates previously published, including a 2.4% major discrepancy rate for second-year radiology residents in the DePICTORS study and a 2.6% major discrepancy rate for residents at a community hospital. Trend analysis of missed cases was used to generate a topic-specific resident missed-case conference on acromioclavicular (AC) joint separation injuries, which resulted in a 75% decrease in the number of missed cases related to AC separation subsequent to the conference. Using a software program to track minor and major discrepancy rates for residents taking independent call, using modified RadPeer scoring guidelines, provides a competency-based metric to determine resident performance. Topic-specific conferences using the cases identified by Minerva can result in a decrease in missed cases.
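The statistics-compilation step lends itself to a few lines of code. A sketch of per-resident discrepancy rates over classified report pairs, with invented records standing in for what Minerva mines from the RIS:

```python
# Compile per-resident discrepancy rates from classified preliminary/final report pairs.
from collections import Counter

cases = [
    {"resident": "rlee", "classification": "in agreement"},
    {"resident": "rlee", "classification": "minor discrepancy"},
    {"resident": "jdoe", "classification": "major discrepancy"},
    {"resident": "jdoe", "classification": "in agreement"},
]

def rates(records):
    by_resident = {}
    for rec in records:
        by_resident.setdefault(rec["resident"], Counter())[rec["classification"]] += 1
    for resident, counts in by_resident.items():
        total = sum(counts.values())
        minor = counts["minor discrepancy"] / total   # Counter returns 0 when absent
        major = counts["major discrepancy"] / total
        yield resident, minor, major, minor + major

for resident, minor, major, total in rates(cases):
    print(f"{resident}: minor {minor:.1%}, major {major:.1%}, total {total:.1%}")
```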
Abductive Equivalential Translation and its application to Natural Language Database Interfacing
NASA Astrophysics Data System (ADS)
Rayner, Manny
1994-05-01
The thesis describes a logical formalization of natural-language database interfacing. We assume the existence of a ``natural language engine'' capable of mediating between surface linguistic strings and their representations as ``literal'' logical forms: the focus of interest will be the question of relating ``literal'' logical forms to representations in terms of primitives meaningful to the underlying database engine. We begin by describing the nature of the problem, and show how a variety of interface functionalities can be considered as instances of a type of formal inference task which we call ``Abductive Equivalential Translation'' (AET); functionalities which can be reduced to this form include answering questions, responding to commands, reasoning about the completeness of answers, answering meta-questions of type ``Do you know...'', and generating assertions and questions. In each case, a ``linguistic domain theory'' (LDT) Γ and an input formula F are given, and the goal is to construct a formula with certain properties which is equivalent to F, given Γ and a set of permitted assumptions. If the LDT is of a certain specified type, whose formulas are either conditional equivalences or Horn-clauses, we show that the AET problem can be reduced to a goal-directed inference method. We present an abstract description of this method, and sketch its realization in Prolog. The relationship between AET and several problems previously discussed in the literature is discussed. In particular, we show how AET can provide a simple and elegant solution to the so-called ``Doctor on Board'' problem and in effect allows a ``relativization'' of the Closed World Assumption. The ideas in the thesis have all been implemented concretely within the SRI CLARE project, using a real projects and payments database. The LDT for the example database is described in detail, and examples of the types of functionality that can be achieved within the example domain are presented.
The Design and Analysis of a Complete Hierarchical Interface for the Multi-Backend Database System.
1984-06-01
Change the prerequisite of Course# 4 from Math to Discrete Math. The DL/I call to accomplish this is as follows: GHU COURSE (COURSE# = ...) PREREQ; change the title to 'Discrete Math' in the I/O work area; REPL. The interface would respond to this call by treating the Get Hold Unique call as a Get Unique call ... ((COURSE# = 4) & (PREREQ.COURSE# = COURSE#1)) <TITLE = DISCRETE MATH>. Upon execution of this request, the call is completed.
Gupta, Amarnath; Bug, William; Marenco, Luis; Qian, Xufei; Condit, Christopher; Rangarajan, Arun; Müller, Hans Michael; Miller, Perry L.; Sanders, Brian; Grethe, Jeffrey S.; Astakhov, Vadim; Shepherd, Gordon; Sternberg, Paul W.; Martone, Maryann E.
2009-01-01
The overarching goal of the NIF (Neuroscience Information Framework) project is to be a one-stop-shop for Neuroscience. This paper provides a technical overview of how the system is designed. The technical goal of the first version of the NIF system was to develop an information system that a neuroscientist can use to locate relevant information from a wide variety of information sources by simple keyword queries. Although the user would provide only keywords to retrieve information, the NIF system is designed to treat them as concepts whose meanings are interpreted by the system. Thus, a search for a term should find records containing synonyms of the term. The system is targeted to find information from web pages, publications, databases, web sites built upon databases, XML documents and any other modality in which such information may be published. We have designed a system to achieve this functionality. A central element in the system is an ontology called NIFSTD (for NIF Standard) constructed by amalgamating a number of known and newly developed ontologies. NIFSTD is used by our ontology management module, called OntoQuest, to perform ontology-based search over data sources. The NIF architecture currently provides three different mechanisms for searching heterogeneous data sources including relational databases, web sites, XML documents and full text of publications. Version 1.0 of the NIF system is currently in beta test and may be accessed through http://nif.nih.gov. PMID:18958629
Response to Pilaar Birch and Graham
USDA-ARS?s Scientific Manuscript database
We are delighted that our call for IsoBank, a database for isotopes, has generated interest among our colleagues, and we applaud Pilaar Birch and Graham in their letter for offering a potential repository, Neotoma Paleoecological Database. Their suggestion is promising, and should be explored. We en...
NASA Astrophysics Data System (ADS)
Stewart, Brent K.; Langer, Steven G.; Martin, Kelly P.
1999-07-01
The purpose of this paper is to integrate multiple DICOM image webservers into the currently existing enterprise-wide web-browsable electronic medical record. Over the last six years the University of Washington has created a clinical data repository (MIND) combining, in a distributed relational database, information from multiple departmental databases. A character cell-based view of this data, called the Mini Medical Record (MMR), has been available for four years. MINDscape, unlike the text-based MMR, provides a platform-independent, dynamic, web browser view of the MIND database that can be easily linked with medical knowledge resources on the network, like PubMed and the Federated Drug Reference. There are over 10,000 MINDscape user accounts at the University of Washington Academic Medical Centers. The weekday average number of hits to MINDscape is 35,302 and the weekday average number of individual users is 1252. DICOM images from multiple webservers are now being viewed through the MINDscape electronic medical record.
HotRegion: a database of predicted hot spot clusters.
Cukuroglu, Engin; Gursoy, Attila; Keskin, Ozlem
2012-01-01
Hot spots are energetically important residues at protein interfaces and they are not randomly distributed across the interface but rather clustered. These clustered hot spots form hot regions. Hot regions are important for the stability of protein complexes, as well as providing specificity to binding sites. We propose a database called HotRegion, which provides the hot region information of the interfaces by using predicted hot spot residues, and structural properties of these interface residues such as pair potentials of interface residues, accessible surface area (ASA) and relative ASA values of interface residues of both monomer and complex forms of proteins. Also, the 3D visualization of the interface and interactions among hot spot residues are provided. HotRegion is accessible at http://prism.ccbb.ku.edu.tr/hotregion.
ERIC Educational Resources Information Center
Kahle, Brewster; Prelinger, Rick; Jackson, Mary E.; Boyack, Kevin W.; Wylie, Brian N.; Davidson, George S.; Witten, Ian H.; Bainbridge, David; Boddie, Stefan J.; Garrison, William A.; Cunningham, Sally Jo; Borgman, Christine L.; Hessel, Heather
2001-01-01
These six articles discuss various issues relating to digital libraries. Highlights include public access to digital materials; intellectual property concerns; the need for collaboration across disciplines; Greenstone software for construction and presentation of digital information collections; the Colorado Digitization Project; and conferences…
Jiménez-Hernández, M; Castro-Zamudio, S; Guzmán Parra, J; Martínez-García, A I; Guillén-Benítez, C; Moreno-Küstner, B
2017-12-29
Suicidal behaviour (fatal and non-fatal) has become a serious public health problem in many countries. The aim of the study was to describe the differential characteristics of emergency calls due to suicidal behaviour made to the Emergency Coordinating Centre (CCUE) in the province of Málaga, in comparison with calls due to physical or psychiatric problems. This was a retrospective observational study of the calls recorded in the database of the Public Company for Emergency Health during one year. Multivariate logistic regression analyses were carried out including age, gender and the following variables related to the demand: hour of the day, type of day (working day or bank holiday), month and quarter of the year, number of resources mobilized, and type of resolution. The analyses were carried out on 163,331 calls, of which 1,380 were due to suicidal behaviour (0.8%), 9,951 to psychiatric reasons (6.1%) and 152,000 to physical reasons (93%). The emergency calls for suicidal behaviour were mainly made by females, between 31-60 years, in the evening and at night, and required transfer to hospital and more than one mobilized resource. Calls due to completed suicide were more frequently made by older men. Calls due to suicidal tendencies predominated over those due to attempted or threatened suicide during the first quarter of the year, while the opposite was the case during the third quarter. The results indicated differential characteristics of suicide calls that are potentially relevant for prevention, in spite of the limitations of the present study.
DynGO: a tool for visualizing and mining of Gene Ontology and its associations
Liu, Hongfang; Hu, Zhang-Zhi; Wu, Cathy H
2005-01-01
Background: A large volume of data and information about genes and gene products has been stored in various molecular biology databases. A major challenge for knowledge discovery using these databases is to identify related genes and gene products in disparate databases. The development of Gene Ontology (GO) as a common vocabulary for annotation allows integrated queries across multiple databases and identification of semantically related genes and gene products (i.e., genes and gene products that have similar GO annotations). Meanwhile, dozens of tools have been developed for browsing, mining or editing GO terms, their hierarchical relationships, or their "associated" genes and gene products (i.e., genes and gene products annotated with GO terms). Tools that allow users to directly search and inspect relations among all GO terms and their associated genes and gene products from multiple databases are needed. Results: We present a standalone package called DynGO, which provides several advanced functionalities in addition to the standard browsing capability of the official GO browsing tool (AmiGO). DynGO allows users to conduct batch retrieval of GO annotations for a list of genes and gene products, and semantic retrieval of genes and gene products sharing similar GO annotations. The results are shown in an association tree organized according to GO hierarchies and supported with many dynamic display options such as sorting tree nodes or changing orientation of the tree. For GO curators and frequent GO users, DynGO provides fast and convenient access to GO annotation data. DynGO is generally applicable to any data set where the records are annotated with GO terms, as illustrated by two examples. Conclusion: We have presented a standalone package DynGO that provides functionalities to search and browse GO and its association databases as well as several additional functions such as batch retrieval and semantic retrieval. The complete documentation and software are freely available for download from the website. PMID:16091147
NASA Astrophysics Data System (ADS)
Kadhem, Hasan; Amagasa, Toshiyuki; Kitagawa, Hiroyuki
Encryption can provide strong security for sensitive data against inside and outside attacks. This is especially true in the “Database as Service” model, where confidentiality and privacy are important issues for the client. In fact, existing encryption approaches are vulnerable to a statistical attack because each value is encrypted to another fixed value. This paper presents a novel database encryption scheme called MV-OPES (Multivalued-Order Preserving Encryption Scheme), which allows privacy-preserving queries over encrypted databases with an improved security level. Our idea is to encrypt a value to different multiple values to prevent statistical attacks. At the same time, MV-OPES preserves the order of the integer values to allow comparison operations to be directly applied on encrypted data. Using calculated distance (range), we propose a novel method that allows a join query between relations based on inequality over encrypted values. We also present techniques to offload query execution load to a database server as much as possible, thereby making a better use of server resources in a database outsourcing environment. Our scheme can easily be integrated with current database systems as it is designed to work with existing indexing structures. It is robust against statistical attack and the estimation of true values. MV-OPES experiments show that security for sensitive data can be achieved with reasonable overhead, establishing the practicability of the scheme.
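The core trick can be illustrated in a few lines: give each plaintext integer its own disjoint interval and pick a random point inside it, so one value encrypts to many ciphertexts while order across distinct values survives. This toy (emphatically not the published scheme, and not secure cryptography) only mirrors the idea:

```python
# Toy multivalued order-preserving encoding: NOT secure cryptography.
import random

K = 1000  # interval width; plaintext v owns the ciphertext interval [v*K, v*K + K)

def encrypt(v: int) -> int:
    # The same plaintext can yield many ciphertexts; intervals never overlap,
    # so order across distinct plaintexts is preserved.
    return v * K + random.randrange(K)

def compare(c1: int, c2: int) -> int:
    # Order test directly on ciphertexts via the interval index (no decryption).
    v1, v2 = c1 // K, c2 // K
    return (v1 > v2) - (v1 < v2)

a1, a2 = encrypt(42), encrypt(42)
print(a1, a2)                    # typically two different ciphertexts for 42
print(encrypt(7) < encrypt(8))   # True: inequality survives encryption
print(compare(a1, a2))           # 0: equal plaintexts compare equal
```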
Ameur, Adam; Bunikis, Ignas; Enroth, Stefan; Gyllensten, Ulf
2014-01-01
CanvasDB is an infrastructure for management and analysis of genetic variants from massively parallel sequencing (MPS) projects. The system stores SNP and indel calls in a local database, designed to handle very large datasets, to allow for rapid analysis using simple commands in R. Functional annotations are included in the system, making it suitable for direct identification of disease-causing mutations in human exome- (WES) or whole-genome sequencing (WGS) projects. The system has a built-in filtering function implemented to simultaneously take into account variant calls from all individual samples. This enables advanced comparative analysis of variant distribution between groups of samples, including detection of candidate causative mutations within family structures and genome-wide association by sequencing. In most cases, these analyses are executed within just a matter of seconds, even when there are several hundreds of samples and millions of variants in the database. We demonstrate the scalability of canvasDB by importing the individual variant calls from all 1092 individuals present in the 1000 Genomes Project into the system, over 4.4 billion SNPs and indels in total. Our results show that canvasDB makes it possible to perform advanced analyses of large-scale WGS projects on a local server. Database URL: https://github.com/UppsalaGenomeCenter/CanvasDB PMID:25281234
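The filtering concept, independent of canvasDB's own R implementation, reduces to a set query over a calls table: keep variants present in every affected sample and absent from all unaffected ones. A standalone SQLite sketch with invented calls:

```python
# Variant filtering across samples, canvasDB-style, in miniature.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calls (sample TEXT, chrom TEXT, pos INTEGER, alt TEXT)")
conn.executemany("INSERT INTO calls VALUES (?, ?, ?, ?)", [
    ("affected1", "chr1", 100, "T"), ("affected2", "chr1", 100, "T"),
    ("affected1", "chr2", 555, "G"), ("unaffected1", "chr2", 555, "G"),
])

affected, unaffected = ("affected1", "affected2"), ("unaffected1",)
query = f"""
SELECT chrom, pos, alt FROM calls
WHERE sample IN ({",".join("?" * len(affected))})
GROUP BY chrom, pos, alt
HAVING COUNT(DISTINCT sample) = ?
   AND NOT EXISTS (SELECT 1 FROM calls c2
                   WHERE c2.chrom = calls.chrom AND c2.pos = calls.pos
                     AND c2.alt = calls.alt
                     AND c2.sample IN ({",".join("?" * len(unaffected))}))
"""
for row in conn.execute(query, (*affected, len(affected), *unaffected)):
    print(row)   # chr1:100 T passes; chr2:555 G fails the unaffected filter
```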
Ahmadi, Farshid Farnood; Ebadi, Hamid
2009-01-01
3D spatial data acquired from aerial and remote sensing images by photogrammetric techniques is one of the most accurate and economical data sources for GIS, map production, and spatial data updating. However, there are still many problems concerning the storage, structuring and appropriate management of spatial data obtained using these techniques. Given the capabilities of spatial database management systems (SDBMSs), direct integration of photogrammetric systems and SDBMSs can save time and cost in producing and updating digital maps. This integration is accomplished by replacing digital maps with a single spatial database. Applying spatial databases overcomes the problem of managing spatial and attribute data in a coupled approach; this coupled management approach is one of the main problems in GISs when using the map products of photogrammetric workstations. By means of these integrated systems, it is also possible to provide structured spatial data, based on OGC (Open GIS Consortium) standards and topological relations between different feature classes, at the time of the feature digitizing process. In this paper, the integration of photogrammetric systems and SDBMSs is evaluated. Then, different levels of integration are described. Finally, the design, implementation and testing of a software package called Integrated Photogrammetric and Oracle Spatial Systems (IPOSS) are presented.
Drinking Water - National Drinking Water Clearinghouse
relevant to drinking water issues. We provide free and low-cost publications, products, databases, referrals, and more. Free Technical Assistance Calls: The NDWC can answer common questions involving issues such as system troubleshooting. Call our engineers and technical assistance specialists toll-free at (304) 293
Discovering Knowledge from Noisy Databases Using Genetic Programming.
ERIC Educational Resources Information Center
Wong, Man Leung; Leung, Kwong Sak; Cheng, Jack C. Y.
2000-01-01
Presents a framework that combines Genetic Programming and Inductive Logic Programming, two approaches in data mining, to induce knowledge from noisy databases. The framework is based on a formalism of logic grammars and is implemented as a data mining system called LOGENPRO (Logic Grammar-based Genetic Programming System). (Contains 34…
Collision Cross Section (CCS) Database: An Additional Measure to Characterize Steroids.
Hernández-Mesa, Maykel; Le Bizec, Bruno; Monteau, Fabrice; García-Campaña, Ana M; Dervilly-Pinel, Gaud
2018-04-03
Ion mobility spectrometry enhances the performance characteristics of liquid chromatography-mass spectrometry workflows intended for steroid profiling by providing a new separation dimension and a novel characterization parameter, the so-called collision cross section (CCS). This work proposes the first CCS database for 300 steroids (i.e., endogenous, including phase I and phase II metabolites, and exogenous synthetic compounds), which involves 1080 ions and covers the CCS of 127 androgens, 84 estrogens, 50 corticosteroids, and 39 progestagens. This large database provides information related to all the ionized species identified for each steroid in positive electrospray ionization mode, as well as for estrogens in negative ionization mode. CCS values have been measured using nitrogen as the drift gas in the ion mobility cell. Generally, a direct correlation exists between the mass-to-charge ratio (m/z) and CCS because both are related parameters. However, several steroids, mainly steroid glucuronides and steroid esters, have been characterized as more compact or elongated molecules than expected. In such cases, CCS provides relevant information additional to retention time and mass spectral data for the identification of steroids. Moreover, several isomeric steroid pairs (e.g., 5β-androstane-3,17-dione and 5α-androstane-3,17-dione) have been separated based on their CCS differences. These results indicate that adding CCS to databases in analytical workflows increases selectivity, thus improving confidence in steroid analysis. Consequences in terms of identification and quantification are discussed. Quality criteria and an interlaboratory reproducibility approach are also reported for the obtained CCS values. The CCS database described here is made publicly available.
Naveja, J. Jesús; Medina-Franco, José L.
2017-01-01
We present a novel approach called ChemMaps for visualizing chemical space based on the similarity matrix of compound datasets generated with molecular fingerprints’ similarity. The method uses a ‘satellites’ approach, where satellites are, in principle, molecules whose similarity to the rest of the molecules in the database provides sufficient information for generating a visualization of the chemical space. Such an approach could help make chemical space visualizations more efficient. We hereby describe a proof-of-principle application of the method to various databases that have different diversity measures. Unsurprisingly, we found the method works better with databases that have low 2D diversity. 3D diversity played a secondary role, although it seems to be more relevant as 2D diversity increases. For less diverse datasets, taking as few as 25% satellites seems to be sufficient for a fair depiction of the chemical space. We propose to iteratively increase the satellites number by a factor of 5% relative to the whole database, and stop when the new and the prior chemical space correlate highly. This Research Note represents a first exploratory step, prior to the full application of this method for several datasets. PMID:28794856
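The iterative satellite-selection loop sketched below makes the stopping rule concrete: grow the satellite set in 5% increments and stop when the distance profile of the full set against the satellites stabilizes between iterations. Random bit vectors and Euclidean distance stand in for real fingerprints and similarity, and the 0.99 threshold is a free parameter, not the paper's.

```python
# Iterative satellite selection with a correlation-based stopping rule.
import numpy as np

rng = np.random.default_rng(0)
fps = rng.integers(0, 2, size=(400, 128)).astype(float)   # 400 mock fingerprints

def dist_to(sat_idx):
    # Distance profile of every molecule against the current satellite set.
    return np.linalg.norm(fps[:, None, :] - fps[sat_idx][None, :, :], axis=2)

order = rng.permutation(len(fps))
step = max(1, len(fps) * 5 // 100)      # 5% of the dataset per iteration
n = step
prev = dist_to(order[:n]).mean(axis=1)
while n < len(fps):
    n = min(len(fps), n + step)
    cur = dist_to(order[:n]).mean(axis=1)
    r = np.corrcoef(prev, cur)[0, 1]
    if r > 0.99:                        # stop once the profile has stabilized
        break
    prev = cur
print(f"{n} satellites ({100 * n // len(fps)}% of the set), r = {r:.3f}")
```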
A History of Commitment in CALL.
ERIC Educational Resources Information Center
Jamieson, Joan
The evolution of computer-assisted language learning (CALL) is examined, focusing on what has changed and what has not changed much during that time. A variety of changes are noted: the development of multimedia capabilities, color, animation, and technical improvement of audio and video quality; availability of databases, better fit between…
BNDB - the Biochemical Network Database.
Küntzer, Jan; Backes, Christina; Blum, Torsten; Gerasch, Andreas; Kaufmann, Michael; Kohlbacher, Oliver; Lenhof, Hans-Peter
2007-10-02
Technological advances in high-throughput techniques and efficient data acquisition methods have resulted in a massive amount of life science data. The data is stored in numerous databases that have been established over the last decades and are essential resources for scientists nowadays. However, the diversity of the databases and the underlying data models make it difficult to combine this information for solving complex problems in systems biology. Currently, researchers typically have to browse several, often highly focused, databases to obtain the required information. Hence, there is a pressing need for more efficient systems for integrating, analyzing, and interpreting these data. The standardization and virtual consolidation of the databases is a major challenge, resulting in unified access to a variety of data sources. We present the Biochemical Network Database (BNDB), a powerful relational database platform allowing a complete semantic integration of an extensive collection of external databases. BNDB is built upon a comprehensive and extensible object model called BioCore, which is powerful enough to model most known biochemical processes and at the same time easily extensible to be adapted to new biological concepts. Besides a web interface for the search and curation of the data, a Java-based viewer (BiNA) provides a powerful platform-independent visualization and navigation of the data. BiNA uses sophisticated graph layout algorithms for an interactive visualization and navigation of BNDB. BNDB allows a simple, unified access to a variety of external data sources. Its tight integration with the biochemical network library BN++ offers the possibility for import, integration, analysis, and visualization of the data. BNDB is freely accessible at http://www.bndb.org.
PlantTribes: a gene and gene family resource for comparative genomics in plants
Wall, P. Kerr; Leebens-Mack, Jim; Müller, Kai F.; Field, Dawn; Altman, Naomi S.; dePamphilis, Claude W.
2008-01-01
The PlantTribes database (http://fgp.huck.psu.edu/tribe.html) is a plant gene family database based on the inferred proteomes of five sequenced plant species: Arabidopsis thaliana, Carica papaya, Medicago truncatula, Oryza sativa and Populus trichocarpa. We used the graph-based clustering algorithm MCL [Van Dongen (Technical Report INS-R0010 2000) and Enright et al. (Nucleic Acids Res. 2002; 30: 1575–1584)] to classify all of these species’ protein-coding genes into putative gene families, called tribes, using three clustering stringencies (low, medium and high). For all tribes, we have generated protein and DNA alignments and maximum-likelihood phylogenetic trees. A parallel database of microarray experimental results is linked to the genes, which lets researchers identify groups of related genes and their expression patterns. Unified nomenclatures were developed, and tribes can be related to traditional gene families and conserved domain identifiers. SuperTribes, constructed through a second iteration of MCL clustering, connect distant, but potentially related gene clusters. The global classification of nearly 200 000 plant proteins was used as a scaffold for sorting ∼4 million additional cDNA sequences from over 200 plant species. All data and analyses are accessible through a flexible interface allowing users to explore the classification, to place query sequences within the classification, and to download results for further study. PMID:18073194
Meetei, Potshangbam Angamba; Singh, Pankaj; Nongdam, Potshangbam; Prabhu, N Prakash; Rathore, RS; Vindal, Vaibhav
2012-01-01
The North-East region of India is one of the twelve mega-biodiversity regions, containing many rare and endangered species. NeMedPlant, a curated database of medicinal and aromatic plants from the region, has been developed. The database contains traditional, scientific, and medicinal information about plants and their active constituents, obtained from scholarly literature and local sources. The database is cross-linked with major biochemical databases and analytical tools. The integrated database provides a resource for investigations into hitherto unexplored medicinal plants and serves to speed up the discovery of natural product-based drugs. Availability: The database is available for free at http://bif.uohyd.ac.in/nemedplant/ or http://202.41.85.11/nemedplant/ PMID:22419844
2002-12-01
Brazilian Air Force has been testing a new surveillance system called Sistema de Vigilancia da Amazonia (SIVAM), designed to...
Large scale database scrubbing using object oriented software components.
Herting, R L; Barnes, M R
1998-01-01
Now that case managers, quality improvement teams, and researchers use medical databases extensively, the ability to share and disseminate such databases while maintaining patient confidentiality is paramount. A process called scrubbing addresses this problem by removing personally identifying information while keeping the integrity of the medical information intact. Scrubbing entire databases, containing multiple tables, requires that the implicit relationships between data elements in different tables of the database be maintained. To address this issue we developed DBScrub, a Java program that interfaces with any JDBC compliant database and scrubs the database while maintaining the implicit relationships within it. DBScrub uses a small number of highly configurable object-oriented software components to carry out the scrubbing. We describe the structure of these software components and how they maintain the implicit relationships within the database.
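The key property described, that scrubbing must preserve implicit relationships between tables, can be illustrated with a minimal sketch (in Python with SQLite rather than DBScrub's Java/JDBC; table and column names are invented): the same identifying value must map to the same surrogate everywhere.

```python
# Consistent pseudonymization across tables: a sketch of the scrubbing idea,
# not the DBScrub implementation.
import sqlite3, secrets

def scrub_column(conn, tables, column):
    """Replace `column` with surrogates, consistently across all listed tables."""
    mapping = {}                                   # real value -> surrogate
    cur = conn.cursor()
    for table in tables:
        rows = cur.execute(f"SELECT DISTINCT {column} FROM {table}").fetchall()
        for (value,) in rows:
            if value not in mapping:               # reuse surrogate if already seen
                mapping[value] = secrets.token_hex(8)
        cur.executemany(f"UPDATE {table} SET {column} = ? WHERE {column} = ?",
                        [(fake, real) for real, fake in mapping.items()])
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE visits (mrn TEXT, visit_date TEXT);
    CREATE TABLE labs   (mrn TEXT, result TEXT);
    INSERT INTO visits VALUES ('12345', '1998-01-02');
    INSERT INTO labs   VALUES ('12345', 'HbA1c 7.1');
""")
scrub_column(conn, ["visits", "labs"], "mrn")
# Both tables now carry the same surrogate, so the visits-labs join still works.
print(conn.execute("SELECT v.mrn FROM visits v JOIN labs l ON v.mrn = l.mrn").fetchall())
```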
Citation Discovery Tools for Conducting Adaptive Meta-analyses to Update Systematic Reviews.
Bae, Jong-Myon; Kim, Eun Hee
2016-03-01
The systematic review (SR) is a research methodology that aims to synthesize related evidence. Updating previously conducted SRs is necessary when new evidence has been produced, but no consensus has yet emerged on the appropriate update methodology. The authors have developed a new SR update method called 'adaptive meta-analysis' (AMA) using the 'cited by', 'similar articles', and 'related articles' citation discovery tools in the PubMed and Scopus databases. This study evaluates the usefulness of these citation discovery tools for updating SRs. Lists were constructed by applying the citation discovery tools in the two databases to the articles analyzed by a published SR. The degree of overlap between the lists and distribution of excluded results were evaluated. The articles ultimately selected for the SR update meta-analysis were found in the lists obtained from the 'cited by' and 'similar' tools in PubMed. Most of the selected articles appeared in both the 'cited by' lists in Scopus and PubMed. The Scopus 'related' tool did not identify the appropriate articles. The AMA, which involves using both citation discovery tools in PubMed, and optionally, the 'related' tool in Scopus, was found to be useful for updating an SR.
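For readers who want to reproduce the citation-discovery step, a minimal sketch using NCBI's public E-utilities is shown below. The `pubmed_pubmed` (similar articles) and `pubmed_pubmed_citedin` (cited by) link names are documented PubMed link types; the seed PMID and the union-based screening strategy are illustrative assumptions rather than the authors' exact protocol.

```python
# Fetch "similar articles" and "cited by" lists for a seed PMID via NCBI elink.
import json
from urllib.request import urlopen

def linked_pmids(pmid, linkname):
    url = ("https://eutils.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi"
           f"?dbfrom=pubmed&db=pubmed&id={pmid}&linkname={linkname}&retmode=json")
    with urlopen(url) as response:
        data = json.load(response)
    linksetdbs = data["linksets"][0].get("linksetdbs", [])
    return [pmid for db in linksetdbs for pmid in db["links"]]

seed = "22419844"   # example: one article already included in the review
similar  = linked_pmids(seed, "pubmed_pubmed")          # "similar articles"
cited_by = linked_pmids(seed, "pubmed_pubmed_citedin")  # "cited by"
# Candidate list for the SR update: union of both lists, minus articles
# already screened in the original review.
print(len(similar), len(cited_by))
```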
Cooper, T A; Wiggans, G R; VanRaden, P M
2013-05-01
Call rates on both a single nucleotide polymorphism (SNP) basis and an animal basis are used as measures of data quality and as screening tools for genomic studies and evaluations of dairy cattle. To investigate the relationship of SNP call rate and genotype accuracy for individual SNP, the correlation between percentages of missing genotypes and parent-progeny conflicts for each SNP was calculated for 103,313 Holsteins. Correlations ranged from 0.14 to 0.38 for the BovineSNP50 and BovineLD (Illumina Inc., San Diego, CA) and GeneSeek Genomic Profiler (Neogen Corp., Lincoln, NE) chips, with lower correlations for newer chips. For US genomic evaluations, genotypes are excluded for animals with a call rate of <90% across autosomal SNP or <80% across X-specific SNP. Mean call rate for 220,175 Holstein, Jersey, and Brown Swiss genotypes was 99.6%. Animal genotypes with a call rate of ≤99% were examined from the US Department of Agriculture genotype database to determine how genotype call rate is related to accuracy of calls on an animal basis. Animal call rate was determined from SNP used in genomic evaluation and is the number of called autosomal and X-specific SNP genotypes divided by the number of SNP from that type of chip. To investigate the relationship of animal call rate and parentage validation, conflicts between a genotyped animal and its sire or dam were determined through a duo test (opposite homozygous SNP genotypes between sire and progeny; 1,374 animal genotypes) and a trio test (also including conflicts with dam and heterozygous SNP genotype for the animal when both parents are the same homozygote; 482 animal genotypes). When animal call rate was ≤ 80%, parentage validation was no longer reliable with the duo test. With the trio test, parentage validation was no longer reliable when animal call rate was ≤ 90%. To investigate how animal call rate was related to genotyping accuracy for animals with multiple genotypes, concordance between genotypes for 1,216 animals that had a genotype with a call rate of ≤ 99% (low call rate) as well as a genotype with a call rate of >99% (high call rate) were calculated by dividing the number of identical SNP genotype calls by the number of SNP that were called for both genotypes. Mean concordance between low- and high-call genotypes was >99% for a low call rate of >90% but decreased to 97% for a call rate of 86 to 90% and to 58% for a call rate of <60%. Edits on call rate reduce the use of incorrect SNP genotypes to calculate genomic evaluations. Copyright © 2013 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
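The two quantities defined in the abstract, animal call rate and genotype concordance, reduce to simple ratios; a small worked example with fabricated genotype calls:

```python
# Made-up genotype calls for one animal genotyped twice (None = no call).
genotype_low  = ['AA', 'AB', None, 'BB', 'AB', None]   # low-call-rate run
genotype_high = ['AA', 'AB', 'BB', 'BB', 'AA', 'AB']   # high-call-rate run

# Animal call rate: called SNPs divided by SNPs on the chip.
call_rate = sum(g is not None for g in genotype_low) / len(genotype_low)

# Concordance: identical calls divided by SNPs called in *both* genotypes.
both = [(a, b) for a, b in zip(genotype_low, genotype_high)
        if a is not None and b is not None]
concordance = sum(a == b for a, b in both) / len(both)

print(f"call rate {call_rate:.0%}, concordance {concordance:.0%}")
# -> call rate 67%, concordance 75% (3 of the 4 jointly called SNPs agree)
```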
First use of LHC Run 3 Conditions Database infrastructure for auxiliary data files in ATLAS
NASA Astrophysics Data System (ADS)
Aperio Bella, L.; Barberis, D.; Buttinger, W.; Formica, A.; Gallas, E. J.; Rinaldi, L.; Rybkin, G.; ATLAS Collaboration
2017-10-01
Processing of the large amount of data produced by the ATLAS experiment requires fast and reliable access to what we call Auxiliary Data Files (ADF). These files, produced by Combined Performance, Trigger and Physics groups, contain conditions, calibrations, and other derived data used by the ATLAS software. In ATLAS, this data has thus far, for historical reasons, been collected and accessed outside the ATLAS Conditions Database infrastructure and its related software. For this reason, along with the fact that ADF are effectively read by the software as binary objects, this class of data appears ideal for testing the proposed Run 3 conditions data infrastructure now in development. This paper describes this implementation as well as the lessons learned in exploring and refining the new infrastructure, with the potential for deployment during Run 2.
Nuclear Energy Infrastructure Database Fitness and Suitability Review
DOE Office of Scientific and Technical Information (OSTI.GOV)
Heidrich, Brenden
In 2014, the Deputy Assistant Secretary for Science and Technology Innovation (NE-4) initiated the Nuclear Energy-Infrastructure Management Project by tasking the Nuclear Science User Facilities (NSUF) to create a searchable and interactive database of all pertinent NE-supported or related infrastructure. This database will be used for analyses to establish needs, redundancies, efficiencies, distributions, etc., in order to best understand the utility of NE's infrastructure and inform the content of the infrastructure calls. The NSUF developed the database by utilizing data and policy direction from a wide variety of reports from the Department of Energy, the National Research Council, the International Atomic Energy Agency, and various other federal and civilian resources. The NEID contains data on 802 R&D instruments housed in 377 facilities at 84 institutions in the US and abroad. A Database Review Panel (DRP) was formed to review and provide advice on the development, implementation, and utilization of the NEID. The panel comprises five members with expertise in nuclear energy-associated research, representing the major constituencies associated with nuclear energy research: academia, industry, research reactors, national laboratories, and Department of Energy program management. The panel concludes that the NSUF has succeeded in creating a capability and infrastructure database that identifies and documents the major nuclear energy research and development capabilities across the DOE complex. The effort to maintain and expand the database will be ongoing. Detailed information on many facilities must still be gathered from the associated institutions and added to complete the database. The data must be validated and kept current to capture facility and instrumentation status as well as to cover new acquisitions and retirements.
NASA Technical Reports Server (NTRS)
Steck, Daniel
2009-01-01
This report documents the generation of a preliminary database of outbound Earth-to-Moon transfers, consisting of four cases calculated twice a day over a 19-year period. The database was desired as the first step in enabling NASA to rapidly generate Earth-to-Moon trajectories for the Constellation Program using the Mission Assessment Post Processor. The completed database was created by running a flight trajectory and optimization program called Copernicus in batch mode, with the use of newly created Matlab functions. The database is accurate and has high data resolution. The techniques and scripts developed to generate the trajectory information will also be used directly in generating a comprehensive database.
Chen, Mingyang; Stott, Amanda C; Li, Shenggang; Dixon, David A
2012-04-01
A robust metadata database called the Collaborative Chemistry Database Tool (CCDBT) for massive amounts of computational chemistry raw data has been designed and implemented. It performs data synchronization and simultaneously extracts the metadata. Computational chemistry data in various formats from different computing sources, software packages, and users can be parsed into uniform metadata for storage in a MySQL database. Parsing is performed by a parsing pyramid, including parsers written for different levels of data types and sets created by the parser loader after loading parser engines and configurations. Copyright © 2011 Elsevier Inc. All rights reserved.
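A hedged sketch of the described pattern, a format-specific parser that extracts uniform metadata and stores it in MySQL, is shown below. The field names, regular expressions, sample text, and table schema are all invented; CCDBT's actual parser pyramid and schema are not given in the abstract. The mysql-connector-python package is assumed for the storage step.

```python
# Extract uniform metadata from raw computational-chemistry output, then store in MySQL.
import re
import mysql.connector

sample = """\
Method: CCSD(T)
Basis set: aug-cc-pVTZ
Total energy: -76.341234
"""

def parse_metadata(text):
    """Toy parser: pull a few labeled fields out of a text output file."""
    patterns = {
        "method": re.compile(r"^Method:\s+(\S+)", re.M),
        "basis":  re.compile(r"^Basis set:\s+(\S+)", re.M),
        "energy": re.compile(r"^Total energy:\s+(-?\d+\.\d+)", re.M),
    }
    return {key: (m.group(1) if (m := rx.search(text)) else None)
            for key, rx in patterns.items()}

meta = parse_metadata(sample)

# Store the uniform metadata centrally (connection details are placeholders).
conn = mysql.connector.connect(host="localhost", user="ccdbt",
                               password="...", database="ccdbt")
cur = conn.cursor()
cur.execute("INSERT INTO metadata (method, basis, energy) VALUES (%s, %s, %s)",
            (meta["method"], meta["basis"], meta["energy"]))
conn.commit()
```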
Barry, N'Diris; M Miller, Karen; Ryshen, Gregory; Uffman, Joshua; Taghon, Thomas A; Tobias, Joseph D
2016-05-01
The goal of this study was to identify the etiology of events and demographics of patients that experience complications requiring activation of the Rapid Response Team (RRT) during the first 24 h following anesthetic care. We performed a retrospective review of the Quality Improvement database from the Department of Anesthesiology & Pain Medicine at Nationwide Children's Hospital. The database was searched to identify those patients who had a RRT evaluation activated within 24 h of receiving anesthesia or procedural sedation. These patients' charts were reviewed to obtain demographic information, etiology of the RRT call, and outcomes. The study cohort included 106 RRT calls that were made over a 3-year period. Six patients were excluded from analysis due to incomplete datasets. One hundred patients remained for analysis including 60 males and 40 females. Patients ranged in age from 0.08 to 31.21 years (7.8 ± 7.7 years, median 5.3 years). Seventy-one patients were American Society of Anesthesiologists' (ASA) status 3 or 4 and 29 patients were ASA status 1 or 2. Five calls were made for patients who had undergone procedural sedation while the other 95 were on patients who received general anesthesia. The average time to the RRT call after the end of anesthetic care was 11.4 ± 6.6 h. Respiratory concern was the most common reason for RRT initiation, accounting for 71 of the 100 calls. Forty-nine patients had a recent respiratory illness, chronic respiratory-related disease, or history of preterm birth. Fifty patients (50%) were transferred to a higher level of care following the RRT consult. There was no significant difference between age, gender, ASA status, or etiology of the event for patients transferred vs. those who were not. A significant difference was noted in the Pediatric Early Warning Score of patients transferred to a higher level of care in comparison to patients who remained on the floor (4 ± 2 vs. 3 ± 2, P = 0.0097). RRT calls were most common for respiratory concerns. High ASA status, general anesthesia administration, and the presence of acute or chronic conditions prior to anesthetic administration predispose a patient to perioperative complications resulting in the need for an RRT call. © 2016 John Wiley & Sons Ltd.
ERIC Educational Resources Information Center
Porter, Stephen R.
Almost all studies of retention inappropriately combine stopouts with transfer-outs because of a lack of data. The National Student Clearinghouse (NSC) (formerly called the National Student Loan Clearinghouse) created a new database that tracks students across institutions. These data, in combination with institutional databases, now allow…
High-resolution information on urban morphological features is needed to properly model and characterize the meteorological and air quality fields in urban areas. We describe a new project called the National Urban Database with Access Portal Tool (NUDAPT) that addresses this nee...
Update on terrestrial ecological classification in the highlands of West Virginia
James P. Vanderhorst
2010-01-01
The West Virginia Natural Heritage Program (WVNHP) maintains databases on the biological diversity of the state, including species and natural communities, to help focus conservation efforts by agencies and organizations. Information on terrestrial communities (also called vegetation, or habitat, depending on user or audience focus) is maintained in two databases. The...
47 CFR 52.26 - NANC Recommendations on Local Number Portability Administration.
Code of Federal Regulations, 2012 CFR
2012-10-01
... perform a database query to determine if the telephone number has been ported to another local exchange carrier, the local exchange carrier may block the unqueried call only if performing the database query is... manage and oversee the local number portability administrators, subject to review by the NANC, but only...
47 CFR 52.26 - NANC Recommendations on Local Number Portability Administration.
Code of Federal Regulations, 2014 CFR
2014-10-01
... perform a database query to determine if the telephone number has been ported to another local exchange carrier, the local exchange carrier may block the unqueried call only if performing the database query is... manage and oversee the local number portability administrators, subject to review by the NANC, but only...
Using decision-tree classifier systems to extract knowledge from databases
NASA Technical Reports Server (NTRS)
St.clair, D. C.; Sabharwal, C. L.; Hacke, Keith; Bond, W. E.
1990-01-01
One difficulty in applying artificial intelligence techniques to the solution of real world problems is that the development and maintenance of many AI systems, such as those used in diagnostics, require large amounts of human resources. At the same time, databases frequently exist which contain information about the process(es) of interest. Recently, efforts to reduce development and maintenance costs of AI systems have focused on using machine learning techniques to extract knowledge from existing databases. Research is described in the area of knowledge extraction using a class of machine learning techniques called decision-tree classifier systems. Results of this research suggest ways of performing knowledge extraction which may be applied in numerous situations. In addition, a measurement called the concept strength metric (CSM) is described which can be used to determine how well the resulting decision tree can differentiate between the concepts it has learned. The CSM can be used to determine whether or not additional knowledge needs to be extracted from the database. An experiment involving real world data is presented to illustrate the concepts described.
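As an illustration of the general idea (not the authors' code), the snippet below induces a decision tree from a handful of fabricated diagnostic records and prints its if/then structure, which is exactly the kind of knowledge that would otherwise be hand-engineered into an expert system. The concept strength metric itself is specific to the paper and is not implemented here.

```python
# Knowledge extraction by decision-tree induction over database records (fabricated data).
from sklearn.tree import DecisionTreeClassifier, export_text

# Each row: [temperature_high, vibration_high, pressure_low] -> fault class
records = [[1, 0, 0], [1, 1, 0], [0, 1, 1], [0, 0, 1], [0, 0, 0]]
labels  = ["overheat", "overheat", "pump", "pump", "ok"]

tree = DecisionTreeClassifier(criterion="entropy").fit(records, labels)

# Print the learned rules as a readable if/then structure.
print(export_text(tree, feature_names=["temp_high", "vib_high", "press_low"]))
```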
Yu, Kebing; Salomon, Arthur R.
2010-01-01
Recently, dramatic progress has been achieved in expanding the sensitivity, resolution, mass accuracy, and scan rate of mass spectrometers able to fragment and identify peptides through tandem mass spectrometry (MS/MS). Unfortunately, this enhanced ability to acquire proteomic data has not been accompanied by a concomitant increase in the availability of flexible tools allowing users to rapidly assimilate, explore, and analyze this data and adapt to a variety of experimental workflows with minimal user intervention. Here we fill this critical gap by providing a flexible relational database called PeptideDepot for organization of expansive proteomic data sets, collation of proteomic data with available protein information resources, and visual comparison of multiple quantitative proteomic experiments. Our software design, built upon the synergistic combination of a MySQL database for safe warehousing of proteomic data with a FileMaker-driven graphical user interface for flexible adaptation to diverse workflows, enables proteomic end-users to directly tailor the presentation of proteomic data to the unique analysis requirements of the individual proteomics lab. PeptideDepot may be deployed as an independent software tool or integrated directly with our High Throughput Autonomous Proteomic Pipeline (HTAPP) used in the automated acquisition and post-acquisition analysis of proteomic data. PMID:19834895
Children's Concerns about Their Parents' Health and Well-Being: Researching with ChildLine Scotland
ERIC Educational Resources Information Center
Backett-Milburn, Kathryn; Jackson, Sharon
2012-01-01
This paper reports on collaborative research conducted with ChildLine Scotland, a free, confidential, telephone counselling service, using their database. We focussed on children's calls about parental health and well-being and how this affected their own lives. Children's concerns emerged within multi-layered calls in which they discussed…
Program for Generating Graphs and Charts
NASA Technical Reports Server (NTRS)
Ackerson, C. T.
1986-01-01
The Office Automation Pilot (OAP) Graphics Database system offers IBM personal computer users assistance in producing a wide variety of graphs and charts, together with a convenient database system, called chartbase, for creating and maintaining data associated with graphs and charts. Thirteen different graphics packages are available, and access to their graphics capabilities is obtained in a similar manner. The user chooses creation, revision, or chartbase-maintenance options from the initial menu, then enters or modifies data displayed on a graphic chart. The OAP graphics database system is written in Microsoft PASCAL.
Matloob, Samir A; Hyam, Jonathan A; Thorne, Lewis; Bradford, Robert
2016-01-01
Documentation of urgent referrals to neurosurgical units and communication with referring hospitals are critical for effective handover and appropriate continuity of care within a tertiary service. Referrals to our neurosurgical unit were audited, and we found that the majority of referrals were not documented; this led to more calls to the on-call neurosurgery registrar regarding old referrals. We implemented a new referral system in an attempt to improve documentation of referrals and communication with our referring hospitals, and to professionalise the service we offer them. During a 14-day period, the number of bleeps, missed bleeps, calls discussing new referrals, and calls about previously processed referrals was recorded. Whether new referrals were appropriately documented and whether referrers received a written response were also recorded. A commercially provided secure cloud-based data archiving, telecommunications, and database platform for referrals was subsequently introduced within the Trust, and the questionnaire was repeated during another 14-day period one year after implementation. Missed bleeps per day reduced from 16% (SD ± 6.4%) to 9% (SD ± 4.8%; df = 13, paired t-test p = 0.007), and mean calls per day clarifying previous referrals reduced from 10 (SD ± 4) to 5 (SD ± 3.5; df = 13, p = 0.003). Documentation of new referrals increased from 43% (74/174) to 85% (181/210), and responses to referrals increased from 74% to 98%. The use of a secure cloud-based data archiving, telecommunications, and database platform significantly increased the documentation of new referrals. This led to fewer missed bleeps and fewer calls about old referrals for the on-call registrar. This system of documenting referrals results in improved continuity of care for neurosurgical patients, a significant reduction in risk for Trusts, and a more efficient use of registrar time.
El-Bastawissi, Ay; McAfee, T; Zbikowski, S M; Hollis, J; Stark, M; Wassum, K; Clark, N; Barwinski, R; Broughton, E
2003-03-01
To describe the experience of uninsured and Medicaid Oregon tobacco users who registered in Free & Clear (F&C), a telephone based cessation programme including five scheduled outbound calls. Using a retrospective cohort design, 1334 (423 uninsured, 806 Medicaid, and 105 commercially insured) Oregon tobacco users who registered in F&C between 18 November 1998 and 28 February 2000 were identified and followed for 12 months post-registration; 648 (48.6%) were successfully contacted at 12 months. Information was collected from the F&C database. Unconditional logistic regression, adjusted for race and education, was used. The seven day quit rate at 12 months, assuming non-respondents were smokers, was 14.8% (95% confidence interval (CI) 13.0 to 16.9). This rate was significantly higher among commercially insured participants (v Medicaid but not uninsured) and among participants who completed ≥5 calls (v <5 calls). The quit rate for those contacted at 12 months was 30.6% (95% CI 27.0% to 34.3%) and varied, however not significantly, by insurance and number of calls. After adjustment, respondents who completed ≥5 calls were 60% more likely to quit tobacco (odds ratio (OR) 1.6, 95% CI 0.9 to 3.1), and uninsured respondents who completed ≥5 calls were 70% more likely to quit tobacco (OR 1.7, 95% CI 0.9 to 3.5), relative to those who completed <5 calls, but the difference was not significant. The quit rates are similar to those reported in efficacy trials. The observed variation in quitting tobacco for respondents by number of calls completed and by insurance merits further investigation concentrating on increasing compliance with the call schedule, particularly for the uninsured.
El-Bastawissi, A.; McAfee, T; Zbikowski, S; Hollis, J; Stark, M; Wassum, K; Clark, N; Barwinski, R; Broughton, E
2003-01-01
Objective: To describe the experience of uninsured and Medicaid Oregon tobacco users who registered in Free & Clear (F&C), a telephone based cessation programme including five scheduled outbound calls. Design and setting: Using a retrospective cohort design, 1334 (423 uninsured, 806 Medicaid, and 105 commercially insured) Oregon tobacco users who registered in F&C between 18 November 1998 and 28 February 2000 were identified and followed for 12 months post-registration; 648 (48.6%) were successfully contacted at 12 months. Information was collected from the F&C database. Unconditional logistic regression, adjusted for race and education, was used. Results: The seven day quit rate at 12 months, assuming non-respondents were smokers, was 14.8% (95% confidence interval (CI) 13.0 to 16.9). This rate was significantly higher among commercially insured participants (v Medicaid but not uninsured) and among participants who completed ⩾ 5 calls (v < 5 calls). The quit rate for those contacted at 12 months was 30.6% (95% CI 27.0% to 34.3%) and varied, however not significantly, by insurance and number of calls. After adjustment, respondents who completed ⩾ 5 calls were 60% more likely to quit tobacco (odds ratio (OR) 1.6, 95% CI 0.9 to 3.1), and uninsured respondents who completed ⩾ 5 calls were 70% more likely to quit tobacco (OR 1.7, 95% CI 0.9 to 3.5), relative to those who completed < 5 calls, but the difference was not significant. Conclusions: The quit rates are similar to those reported in efficacy trials. The observed variation in quitting tobacco for respondents by number of calls completed and by insurance merits further investigation concentrating on increasing compliance with the call schedule, particularly for the uninsured. PMID:12612361
SPMBR: a scalable algorithm for mining sequential patterns based on bitmaps
NASA Astrophysics Data System (ADS)
Xu, Xiwei; Zhang, Changhai
2013-12-01
Many current sequential pattern mining algorithms generate too many candidate sequences, which increases the processing cost of support counting. We therefore present an effective and scalable algorithm called SPMBR (Sequential Patterns Mining based on Bitmap Representation) for mining sequential patterns from large databases. Our method differs from previous work on sequential pattern mining mainly in that the sequence database is represented by bitmaps, for which a simplified bitmap structure is first introduced. The algorithm generates candidate sequences by SE (Sequence Extension) and IE (Item Extension), and then obtains all frequent sequences by comparing the original bitmap with the extended-item bitmap. This method simplifies the problem of mining sequential patterns and avoids the high processing cost of support counting. Both analysis and experiments indicate that SPMBR performs strongly on large transaction databases, requires much less memory for temporary data during mining, and can feasibly mine all sequential patterns.
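The bitmap idea at the heart of SPMBR can be sketched in a few lines: represent, for each item, the set of transactions containing it as a bitmap, so support counting becomes a bitwise AND plus a popcount instead of a database scan. The sketch below shows only this support-counting principle on plain itemsets; SPMBR's actual position-aware bitmaps for sequential patterns are more involved.

```python
# Bitmap-based support counting (transactions and items are fabricated).
transactions = [
    {"a", "b"},        # transaction 0
    {"a", "b", "c"},   # transaction 1
    {"b", "c"},        # transaction 2
    {"a", "c"},        # transaction 3
]

# One bitmap per item: bit t is set iff transaction t contains the item.
bitmap = {}
for tid, items in enumerate(transactions):
    for item in items:
        bitmap[item] = bitmap.get(item, 0) | (1 << tid)

def support(itemset):
    """Support = popcount of the AND of the member bitmaps."""
    bits = ~0                    # all ones
    for item in itemset:
        bits &= bitmap.get(item, 0)
    return bin(bits & ((1 << len(transactions)) - 1)).count("1")

print(support({"a", "b"}))       # 2  (transactions 0 and 1)
print(support({"b", "c"}))       # 2  (transactions 1 and 2)
```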
The Gypsy Database (GyDB) of mobile genetic elements: release 2.0
Llorens, Carlos; Futami, Ricardo; Covelli, Laura; Domínguez-Escribá, Laura; Viu, Jose M.; Tamarit, Daniel; Aguilar-Rodríguez, Jose; Vicente-Ripolles, Miguel; Fuster, Gonzalo; Bernet, Guillermo P.; Maumus, Florian; Munoz-Pomer, Alfonso; Sempere, Jose M.; Latorre, Amparo; Moya, Andres
2011-01-01
This article introduces the second release of the Gypsy Database of Mobile Genetic Elements (GyDB 2.0): a research project devoted to the evolutionary dynamics of viruses and transposable elements based on their phylogenetic classification (per lineage and protein domain). The Gypsy Database (GyDB) is a long-term project that is continuously progressing, and that owing to the high molecular diversity of mobile elements requires to be completed in several stages. GyDB 2.0 has been powered with a wiki to allow other researchers participate in the project. The current database stage and scope are long terminal repeats (LTR) retroelements and relatives. GyDB 2.0 is an update based on the analysis of Ty3/Gypsy, Retroviridae, Ty1/Copia and Bel/Pao LTR retroelements and the Caulimoviridae pararetroviruses of plants. Among other features, in terms of the aforementioned topics, this update adds: (i) a variety of descriptions and reviews distributed in multiple web pages; (ii) protein-based phylogenies, where phylogenetic levels are assigned to distinct classified elements; (iii) a collection of multiple alignments, lineage-specific hidden Markov models and consensus sequences, called GyDB collection; (iv) updated RefSeq databases and BLAST and HMM servers to facilitate sequence characterization of new LTR retroelement and caulimovirus queries; and (v) a bibliographic server. GyDB 2.0 is available at http://gydb.org. PMID:21036865
ERIC Educational Resources Information Center
Kurhan, Scott H.; Griffing, Elizabeth A.
2011-01-01
Reference services in public libraries are changing dramatically. The Internet, online databases, and shrinking budgets are all making it necessary for non-traditional reference staff to become familiar with online reference tools. Recognizing the need for cross-training, Chesapeake Public Library (CPL) developed a program called the Database…
RPA tree-level database users guide
Patrick D. Miles; Scott A. Pugh; Brad Smith; Sonja N. Oswalt
2014-01-01
The Forest and Rangeland Renewable Resources Planning Act (RPA) of 1974 calls for a periodic assessment of the Nation's renewable resources. The Forest Inventory and Analysis (FIA) program of the U.S. Forest Service supports the RPA effort by providing information on the forest resources of the United States. The RPA tree-level database (RPAtreeDB) was generated...
Analysis of a virtual memory model for maintaining database views
NASA Technical Reports Server (NTRS)
Kinsley, Kathryn C.; Hughes, Charles E.
1992-01-01
This paper presents an analytical model for predicting the performance of a new support strategy for database views. This strategy, called the virtual method, is compared with traditional methods for supporting views. The analytical model's predictions of improved performance by the virtual method are then validated by comparing these results with those achieved in an experimental implementation.
Architecture for biomedical multimedia information delivery on the World Wide Web
NASA Astrophysics Data System (ADS)
Long, L. Rodney; Goh, Gin-Hua; Neve, Leif; Thoma, George R.
1997-10-01
Research engineers at the National Library of Medicine are building a prototype system for the delivery of multimedia biomedical information on the World Wide Web. This paper discusses the architecture and design considerations for the system, which will be used initially to make images and text from the third National Health and Nutrition Examination Survey (NHANES) publicly available. We categorized our analysis as follows: (1) fundamental software tools: we analyzed trade-offs among use of conventional HTML/CGI, X Window Broadway, and Java; (2) image delivery: we examined the use of unconventional TCP transmission methods; (3) database manager and database design: we discuss the capabilities and planned use of the Informix object-relational database manager and the planned schema for the NHANES database; (4) storage requirements for our Sun server; (5) user interface considerations; (6) the compatibility of the system with other standard research and analysis tools; (7) image display: we discuss considerations for consistent image display for end users. Finally, we discuss the scalability of the system in terms of incorporating larger or more databases of similar data, and the extendibility of the system for supporting content-based retrieval of biomedical images. The system prototype is called the Web-based Medical Information Retrieval System. An early version was built as a Java applet and tested on Unix, PC, and Macintosh platforms. This prototype used the MiniSQL database manager to do text queries on a small database of records of participants in the second NHANES survey. The full records and associated x-ray images were retrievable and displayable on a standard Web browser. A second version has now been built, also a Java applet, using the MySQL database manager.
2014-06-01
central location. Each of the SQLite databases is converted and stored in one MySQL database and the pcap files are parsed to extract call information...from the specific communications applications used during the experiment. This extracted data is then stored in the same MySQL database. With all...rhythm of the event. Figure 3 demonstrates the application usage over the course of the experiment for the EXDIR. As seen, the EXDIR spent the majority
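A sketch of the consolidation step this fragment describes, copying rows from per-application SQLite files into one central MySQL database, might look as follows (all file, table, and column names are invented, and the mysql-connector-python package is assumed):

```python
# Consolidate per-application SQLite call tables into one central MySQL database.
import sqlite3
import mysql.connector

central = mysql.connector.connect(host="localhost", user="exdata",
                                  password="...", database="experiment")
ccur = central.cursor()

for path in ["app1.db", "app2.db"]:            # one SQLite file per application
    local = sqlite3.connect(path)
    for caller, callee, start, duration in local.execute(
            "SELECT caller, callee, start_time, duration FROM calls"):
        ccur.execute(
            "INSERT INTO calls (caller, callee, start_time, duration) "
            "VALUES (%s, %s, %s, %s)",
            (caller, callee, start, duration))
    local.close()

central.commit()
```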
Staradmin -- Starlink User Database Maintainer
NASA Astrophysics Data System (ADS)
Fish, Adrian
The subject of this SSN is a utility called STARADMIN. This utility allows the system administrator to build and maintain a Starlink User Database (UDB). The principal source of information for each user is a text file, named after their username. The content of each file is a list consisting of one keyword followed by the relevant user data per line. These user database files reside in a single directory. The STARADMIN program is used to manipulate these user data files and automatically generate user summary lists.
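A file of the kind described, one keyword followed by the relevant user data per line, is straightforward to parse; the keywords in the sketch below are invented examples rather than the actual STARADMIN set.

```python
# Parse a keyword-per-line user-database file into a dict.
def parse_udb(lines):
    record = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):   # skip blanks and comments
            continue
        keyword, _, value = line.partition(" ")
        record[keyword.upper()] = value.strip()
    return record

# In real use this would be open(username); the file is named after the user.
sample = """\
NAME Joe Bloggs
SITE RAL
EMAIL jbloggs@star.rl.ac.uk
"""
print(parse_udb(sample.splitlines()))
# {'NAME': 'Joe Bloggs', 'SITE': 'RAL', 'EMAIL': 'jbloggs@star.rl.ac.uk'}
```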
2004-01-01
Ms R appealed for a decision by the Court to overturn the refusal of the Medical Director of Health to her request that health information in medical records pertaining to her deceased father should not be entered into the Health Sector Database. Furthermore, she called for recognition of her right to prohibit the transfer of such information into a database. Article 8 of Act No. 139/1998 on a Health Sector Database provides for the right of patients to refuse permission, by notification to the Medical Director of Health, for information concerning them to be entered into the Health Sector Database. The Court concluded that R could not exercise this right acting as a substitute for her deceased father, but it was recognised that she might, on the basis of her right to protection of privacy, have an interest in preventing the transfer of health data concerning her father into the database, as information could be inferred from such data relating to the hereditary characteristics of her father which might also apply to herself. It was revealed in the course of proceedings that extensive information concerning people's health is entered into medical records, e.g. medical treatment, life-style and social conditions, employment and family circumstances, together with a detailed identification of the person that the information concerns. It was recognised as unequivocal that the provisions of Paragraph 1 of Article 71 of the Constitution applied to such information and guaranteed to every person the right to protection of privacy in this respect. The Court concluded that the opinion of the District Court, which, inter alia, was based on the opinion of an assessor, to the effect that so-called one-way encryption could be carried out in such a secure manner that it would be virtually impossible to read the encrypted data, had not been refuted. It was noted, however, that Act No. 139/1998 provides no details as to what information from medical records is required to be encrypted in this manner prior to transfer into the database or whether certain information contained in the medical records will not be transferred into the database. The documents of the case indicate that only the identity number of the patient would be encrypted in the database, and that names, both those of the patient and his relatives, as well as the precise address, would be omitted. It is obvious that information on these items is not the only information appearing in the medical records which could, in certain cases, unequivocally identify the person concerned. Act No. 139/1998 also provides for authorisation to the licensee to process information from the medical records transferred into the database. The Act stipulates that certain specified public entities must approve procedures and process methods and monitor all queries and processing of information in the database. However, there is no clear definition of what type of queries will be directed to the database or in what form the replies to such queries will appear. The Court concluded that even though individual provisions of Act No. 139/1998 repeatedly stipulate that health information in the Health Sector Database should be non-personally identifiable, it is far from adequately ensured under statutory law that this stated objective will be achieved.
In light of the obligations imposed on the legislature by Paragraph 1 of Article 71 of the Constitution, the Court concluded that various forms of monitoring of the creation and operation of the database are, in this respect, no substitute for a foundation in definite statutory norms. In light of these circumstances, and taking into account the principles of Icelandic law concerning the confidentiality and protection of privacy, the Court concluded that the right of R in this matter must be recognised, and her court claims, therefore, upheld.
Work-related hand injuries in Ontario: an historical perspective.
Schofield, Michel M E
2005-10-01
Workers' compensation legislation was enacted in Ontario almost 90 years ago. Workers injured on the job gave up their right to sue employers and received no-fault compensation from an independent, employer-funded body called the Workmen's Compensation Board. Three academic health sciences centers in Ontario that are recognized for their commitment to patient care, research, and education form part of the Specialty Program network with the Ontario Workplace Safety and Insurance Board (WSIB). Statistical data from the WSIB database for workers with hand injuries from 1996 to 2003 show an increase in fractures from fall injuries in the group of women older than 60 that may be related to osteoporosis, a common condition in this group.
Understanding immigrants, schooling, and school psychology: Contemporary science and practice.
Frisby, Craig L; Jimerson, Shane R
2016-06-01
Immigration into the United States is a particularly salient topic of current contemporary educational, social, and political discussions. The school-related needs of immigrant children and youth can be well served by rigorous research and effective school psychology preservice training and preparation. This overview highlights key definitions, demographic statistics, and current resources related to immigration in U.S. society. This special topic section on understanding immigrants, schooling, and school psychology features articles relevant to this important topic. We conclude with a call for this effort to serve as a springboard for future discussions, scholarship, and school psychology training in preparing practitioners for serving children who are immigrants. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Carl Rogers and the larger context of therapeutic thought.
Wachtel, Paul L
2007-09-01
Carl Rogers' classic account (see record 2007-14639-002) of the necessary and sufficient conditions for therapeutic personality change is examined in light of developments in theory and practice since the time he wrote. Rogers' ideas, which diverged from, and were very largely a challenge to, the dominant psychoanalytic ideology of the era in which he wrote, are considered in relation to new theoretical developments in what has come to be called relational psychoanalysis. They are also considered in light of the greatly increased influence of, and substantial evidence supporting, behavioral and cognitive-behavioral approaches. Points of convergence and divergence among these approaches are examined. (PsycINFO Database Record (c) 2010 APA, all rights reserved).
Hanaki, Nao; Yamashita, Kazuto; Kunisawa, Susumu; Imanaka, Yuichi
2016-12-09
In Japan, ambulance staff sometimes must make request calls to find hospitals that can accept patients because of an inadequate information sharing system. This study aimed to quantify effects of the number of request calls on the time interval between an emergency call and hospital arrival. A cross-sectional study of an ambulance records database in Nara prefecture, Japan. A total of 43 663 patients (50% women; 31.2% aged 80 years and over): (1) transported by ambulance from April 2013 to March 2014, (2) aged 15 years and over, and (3) with suspected major illness. The time from call to hospital arrival, defined as the time interval from receipt of an emergency call to ambulance arrival at a hospital. The mean time interval from emergency call to hospital arrival was 44.5 min, and the mean number of requests was 1.8. Multilevel linear regression analysis showed that ∼43.8% of variations in transportation times were explained by patient age, sex, season, day of the week, time, category of suspected illness, person calling for the ambulance, emergency status at request call, area and number of request calls. A higher number of request calls was associated with longer time intervals to hospital arrival (addition of 6.3 min per request call; p<0.001). In an analysis dividing areas into three groups, there were differences in transportation time for diseases needing cardiologists, neurologists, neurosurgeons and orthopaedists. The study revealed 6.3 additional minutes needed in transportation time for every refusal of a request call, and also revealed disease-specific delays among specific areas. An effective system should be collaboratively established by policymakers and physicians to ensure the rapid identification of an available hospital for patient transportation in order to reduce the time from the initial emergency call to hospital arrival. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://www.bmj.com/company/products-services/rights-and-licensing/.
Hanaki, Nao; Yamashita, Kazuto; Kunisawa, Susumu; Imanaka, Yuichi
2016-01-01
Objectives In Japan, ambulance staff sometimes must make request calls to find hospitals that can accept patients because of an inadequate information sharing system. This study aimed to quantify effects of the number of request calls on the time interval between an emergency call and hospital arrival. Design and setting A cross-sectional study of an ambulance records database in Nara prefecture, Japan. Cases A total of 43 663 patients (50% women; 31.2% aged 80 years and over): (1) transported by ambulance from April 2013 to March 2014, (2) aged 15 years and over, and (3) with suspected major illness. Primary outcome measures The time from call to hospital arrival, defined as the time interval from receipt of an emergency call to ambulance arrival at a hospital. Results The mean time interval from emergency call to hospital arrival was 44.5 min, and the mean number of requests was 1.8. Multilevel linear regression analysis showed that ∼43.8% of variations in transportation times were explained by patient age, sex, season, day of the week, time, category of suspected illness, person calling for the ambulance, emergency status at request call, area and number of request calls. A higher number of request calls was associated with longer time intervals to hospital arrival (addition of 6.3 min per request call; p<0.001). In an analysis dividing areas into three groups, there were differences in transportation time for diseases needing cardiologists, neurologists, neurosurgeons and orthopaedists. Conclusions The study revealed 6.3 additional minutes needed in transportation time for every refusal of a request call, and also revealed disease-specific delays among specific areas. An effective system should be collaboratively established by policymakers and physicians to ensure the rapid identification of an available hospital for patient transportation in order to reduce the time from the initial emergency call to hospital arrival. PMID:27940625
Siberchicot, Aurélie; Bessy, Adrien; Guéguen, Laurent; Marais, Gabriel A B
2017-10-01
Given the importance of meiotic recombination in biology, there is a need to develop robust methods to estimate meiotic recombination rates. A popular approach, called the Marey map approach, relies on comparing genetic and physical maps of a chromosome to estimate local recombination rates. In the past, we have implemented this approach in an R package called MareyMap, which includes many functionalities useful to get reliable recombination rate estimates in a semi-automated way. MareyMap has been used repeatedly in studies looking at the effect of recombination on genome evolution. Here, we propose a simpler user-friendly web service version of MareyMap, called MareyMap Online, which allows a user to get recombination rates from her/his own data or from a publicly available database that we offer in a few clicks. When the analysis is done, the user is asked whether her/his curated data can be placed in the database and shared with other users, which we hope will make meta-analysis on recombination rates including many species easy in the future. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
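The Marey map idea itself is compact enough to sketch: local recombination rate is the slope of genetic position (cM) against physical position (Mb) along the chromosome. The illustration below (in Python, not the R package) uses fabricated marker positions and a simple finite-difference slope; MareyMap itself fits smoothing curves (e.g., loess or splines) and supports interactive outlier removal before differentiating.

```python
# Marey map approach: recombination rate as the slope of genetic vs. physical position.
import numpy as np

physical_mb = np.array([0.0, 5.0, 10.0, 15.0, 20.0, 25.0])   # physical map (Mb)
genetic_cm  = np.array([0.0, 1.0,  4.0, 10.0, 13.0, 14.0])   # genetic map (cM)

# Slope of the Marey map, in cM/Mb, estimated at each marker.
rate = np.gradient(genetic_cm, physical_mb)
for pos, r in zip(physical_mb, rate):
    print(f"{pos:5.1f} Mb: {r:.2f} cM/Mb")
```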
TPMG Northern California appointments and advice call center.
Conolly, Patricia; Levine, Leslie; Amaral, Debra J; Fireman, Bruce H; Driscoll, Tom
2005-08-01
Kaiser Permanente (KP) has been developing its use of call centers as a way to provide an expansive set of healthcare services to KP members efficiently and cost effectively. Since 1995, when The Permanente Medical Group (TPMG) began to consolidate primary care phone services into three physical call centers, the TPMG Appointments and Advice Call Center (AACC) has become the "front office" for primary care services across approximately 89% of Northern California. The AACC provides primary care phone service for approximately 3 million Kaiser Foundation Health Plan members in Northern California and responds to approximately 1 million calls per month across the three AACC sites. A database records each caller's identity as well as the day, time, and duration of each call; reason for calling; services provided to callers as a result of calls; and clinical outcomes of calls. We here summarize this information for the period 2000 through 2003.
Multiple Image Arrangement for Subjective Quality Assessment
NASA Astrophysics Data System (ADS)
Wang, Yan; Zhai, Guangtao
2017-12-01
Subjective quality assessment serves as the foundation for almost all visual quality related researches. Size of the image quality databases has expanded from dozens to thousands in the last decades. Since each subjective rating therein has to be averaged over quite a few participants, the ever-increasing overall size of those databases calls for an evolution of existing subjective test methods. Traditional single/double stimulus based approaches are being replaced by multiple image tests, where several distorted versions of the original one are displayed and rated at once. And this naturally brings upon the question of how to arrange those multiple images on screen during the test. In this paper, we answer this question by performing subjective viewing test with eye tracker for different types arrangements. Our research indicates that isometric arrangement imposes less duress on participants and has more uniform distribution of eye fixations and movements and therefore is expected to generate more reliable subjective ratings.
Comparative homology agreement search: An effective combination of homology-search methods
Alam, Intikhab; Dress, Andreas; Rehmsmeier, Marc; Fuellen, Georg
2004-01-01
Many methods have been developed to search for homologous members of a protein family in databases, and the reliability of results and conclusions may be compromised if only one method is used, neglecting the others. Here we introduce a general scheme for combining such methods. Based on this scheme, we implemented a tool called comparative homology agreement search (chase) that integrates different search strategies to obtain a combined “E value.” Our results show that a consensus method integrating distinct strategies easily outperforms any of its component algorithms. More specifically, an evaluation based on the Structural Classification of Proteins database reveals that, on average, a coverage of 47% can be obtained in searches for distantly related homologues (i.e., members of the same superfamily but not the same family, which is a very difficult task), accepting only 10 false positives, whereas the individual methods obtain a coverage of 28–38%. PMID:15367730
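The abstract does not reproduce the chase combination formula, so the sketch below shows one generic way to fuse evidence from several homology-search methods: convert each method's E-value to a p-value and combine them with Fisher's method. This is an illustrative stand-in, not the paper's actual scheme.

```python
# Combine evidence from several homology-search methods (hypothetical E-values).
import numpy as np
from scipy.stats import combine_pvalues

# E-values reported for one candidate sequence by three different methods.
e_values = np.array([1e-4, 3e-2, 8e-1])
p_values = 1.0 - np.exp(-e_values)      # standard E-value -> p-value conversion

stat, p_combined = combine_pvalues(p_values, method="fisher")
print(f"combined p = {p_combined:.2e}")
# A hit strongly supported by one method and weakly by the others can still
# score better than it would under any single method alone.
```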
Toward a Bio-Medical Thesaurus: Building the Foundation of the UMLS
Tuttle, Mark S.; Blois, Marsden S.; Erlbaum, Mark S.; Nelson, Stuart J.; Sherertz, David D.
1988-01-01
The Unified Medical Language System (UMLS) is being designed to provide a uniform user interface to heterogeneous machine-readable bio-medical information resources, such as bibliographic databases, genetic databases, expert systems and patient records. Such an interface will have to recognize different ways of saying the same thing, and provide links to ways of saying related things. One way to represent the necessary associations is via a domain thesaurus. As no such thesaurus exists, and because, once built, it will be both sizable and in need of continuous maintenance, its design should include a methodology for building and maintaining it. We propose a methodology, utilizing lexically expanded schema inversion, and a design, called T. Lex, which together form one approach to the problem of defining and building a bio-medical thesaurus. We argue that the semantic locality implicit in such a thesaurus will support model-based reasoning in bio-medicine.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Abbott, Jennifer; Sandberg, Tami
The Wind-Wildlife Impacts Literature Database (WILD), formerly known as the Avian Literature Database, was created in 1997. The goal of the database was to begin tracking the research that detailed the potential impact of wind energy development on birds. The Avian Literature Database was originally housed on a proprietary platform called Livelink ECM from OpenText and maintained by in-house technical staff. The initial set of records was added by library staff. A vital part of the newly launched Drupal-based WILD database is the Bibliography module. Many of the resources included in the database have digital object identifiers (DOI). The bibliographic information for any item that has a DOI can be imported into the database using this module. This greatly reduces the amount of manual data entry required to add records to the database. The content available in WILD is international in scope, which can be easily discerned by looking at the tags available in the browse menu.
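The DOI-based import the Bibliography module performs can be approximated with the public CrossRef REST API, which returns citation metadata for a DOI as JSON; the sketch below is illustrative (WILD itself relies on the Drupal module, and the DOI shown is a placeholder).

```python
# Fetch citation metadata for a DOI from the CrossRef REST API.
import json
from urllib.request import urlopen

def fetch_metadata(doi):
    with urlopen(f"https://api.crossref.org/works/{doi}") as response:
        work = json.load(response)["message"]
    return {
        "title":   work.get("title", [""])[0],
        "journal": work.get("container-title", [""])[0],
        "year":    work.get("issued", {}).get("date-parts", [[None]])[0][0],
        "authors": [f"{a.get('given', '')} {a.get('family', '')}"
                    for a in work.get("author", [])],
    }

print(fetch_metadata("10.1000/example-doi"))   # placeholder DOI
```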
CampusGIS of the University of Cologne: a tool for orientation, navigation, and management
NASA Astrophysics Data System (ADS)
Baaser, U.; Gnyp, M. L.; Hennig, S.; Hoffmeister, D.; Köhn, N.; Laudien, R.; Bareth, G.
2006-10-01
The working group for GIS and Remote Sensing at the Department of Geography at the University of Cologne has established a WebGIS called the CampusGIS of the University of Cologne. The overall task of the CampusGIS is the connection of several existing databases at the University of Cologne with spatial data. These existing databases comprise data about staff, buildings, rooms, lectures, and general infrastructure such as bus stops. This information had not previously been linked to its spatial location. Therefore, a GIS-based method was developed to link all the different databases to spatial entities. In line with the CampusGIS philosophy, an online GUI was programmed that enables users to search for staff, buildings, or institutions. The query results are linked to the GIS database, which allows the visualization of the spatial location of the searched entity. The system was established in 2005 and has been operational since early 2006. In this contribution, the focus is on further developments. First results are presented of (i) adding routing services to, (ii) programming GUIs for mobile devices for, and (iii) integrating infrastructure management tools into the CampusGIS. Consequently, the CampusGIS is not only available for spatial information retrieval and orientation; it also serves for on-campus navigation and administrative management.
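The core linkage, joining existing administrative databases to spatial entities, amounts to a foreign-key join against a table of geocoded buildings. A toy version with an invented schema and rows:

```python
# Link a staff database to building coordinates so a name search returns a location.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staff     (name TEXT, building_id INTEGER);
    CREATE TABLE buildings (id INTEGER, label TEXT, lon REAL, lat REAL);
    INSERT INTO staff     VALUES ('G. Bareth', 1);
    INSERT INTO buildings VALUES (1, 'Geography Dept.', 6.9284, 50.9271);
""")
row = conn.execute("""
    SELECT s.name, b.label, b.lon, b.lat
    FROM staff s JOIN buildings b ON s.building_id = b.id
    WHERE s.name LIKE ?""", ("%Bareth%",)).fetchone()
print(row)   # ('G. Bareth', 'Geography Dept.', 6.9284, 50.9271)
```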
pGenN, a gene normalization tool for plant genes and proteins in scientific literature.
Ding, Ruoyao; Arighi, Cecilia N; Lee, Jung-Youn; Wu, Cathy H; Vijay-Shanker, K
2015-01-01
Automatically detecting gene/protein names in the literature and connecting them to database records, also known as gene normalization, provides a means to structure the information buried in free-text literature. Gene normalization is critical for improving the coverage of annotation in the databases, and is an essential component of many text mining systems and database curation pipelines. In this manuscript, we describe a gene normalization system specifically tailored for plant species, called pGenN (pivot-based Gene Normalization). The system consists of three steps: dictionary-based gene mention detection, species assignment, and intra-species normalization. We have developed new heuristics to improve each of these phases. We evaluated the performance of pGenN on an in-house expertly annotated corpus consisting of 104 plant-relevant abstracts. Our system achieved an F-value of 88.9% (Precision 90.9% and Recall 87.2%) on this corpus, outperforming state-of-the-art systems presented in BioCreative III. We have processed over 440,000 plant-related Medline abstracts using pGenN. The gene normalization results are stored in a local database for direct query from the pGenN web interface (proteininformationresource.org/pgenn/). The annotated literature corpus is also publicly available through the PIR text mining portal (proteininformationresource.org/iprolink/).
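The first step, dictionary-based gene mention detection, can be sketched as a longest-match-first dictionary scan; the tiny dictionary and example sentence below are invented, and the real system additionally performs species assignment and normalization to database identifiers.

```python
# Dictionary-based gene mention detection, longest names matched first.
import re

dictionary = {
    "phytochrome b":     "PHYB",
    "phyb":              "PHYB",
    "flowering locus t": "FT",
    "ft":                "FT",
}
# Sort longest-first so "phytochrome b" isn't split into shorter matches.
pattern = re.compile(
    r"\b(?:" + "|".join(re.escape(name)
                        for name in sorted(dictionary, key=len, reverse=True)) + r")\b",
    re.IGNORECASE)

text = "Phytochrome B represses FT expression in Arabidopsis."
for m in pattern.finditer(text):
    print(m.group(0), "->", dictionary[m.group(0).lower()])
# Phytochrome B -> PHYB
# FT -> FT
```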
Application driven interface generation for EASIE. M.S. Thesis
NASA Technical Reports Server (NTRS)
Kao, Ya-Chen
1992-01-01
The Environment for Application Software Integration and Execution (EASIE) provides a user interface and a set of utility programs which support the rapid integration and execution of analysis programs about a central relational database. EASIE provides users with two basic modes of execution. One of them is a menu-driven execution mode, called Application-Driven Execution (ADE), which provides sufficient guidance to review data, select a menu action item, and execute an application program. The other mode of execution, called Complete Control Execution (CCE), provides an extended executive interface which allows in-depth control of the design process. Currently, the EASIE system is based on alphanumeric techniques only. It is the purpose of this project to extend the flexibility of the EASIE system in the ADE mode by implementing it in a window system. Secondly, a set of utilities will be developed to assist the experienced engineer in the generation of an ADE application.
Mock jurors' use of error rates in DNA database trawls.
Scurich, Nicholas; John, Richard S
2013-12-01
Forensic science is not infallible, as data collected by the Innocence Project have revealed. The rate at which errors occur in forensic DNA testing-the so-called "gold standard" of forensic science-is not currently known. This article presents a Bayesian analysis to demonstrate the profound impact that error rates have on the probative value of a DNA match. Empirical evidence on whether jurors are sensitive to this effect is equivocal: Studies have typically found they are not, while a recent, methodologically rigorous study found that they can be. This article presents the results of an experiment that examined this issue within the context of a database trawl case in which one DNA profile was tested against a multitude of profiles. The description of the database was manipulated (i.e., "medical" or "offender" database, or not specified) as was the rate of error (i.e., one-in-10 or one-in-1,000). Jury-eligible participants were nearly twice as likely to convict in the offender database condition compared to the condition not specified. The error rates did not affect verdicts. Both factors, however, affected the perception of the defendant's guilt, in the expected direction, although the size of the effect was meager compared to Bayesian prescriptions. The results suggest that the disclosure of an offender database to jurors might constitute prejudicial evidence, and calls for proficiency testing in forensic science as well as training of jurors are echoed. (c) 2013 APA, all rights reserved
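The arithmetic behind the Bayesian point can be sketched as follows; this is an illustrative simplification (treating a reported match as arising either from a true match or from a false positive), not the authors' exact model, and the rates used are assumptions.

```python
# If a reported match can arise from a true match (probability ~1 for the true
# source, rmp for a random person) or from a testing error (fpr), the
# likelihood ratio is roughly 1 / (rmp + fpr): the error rate caps the
# probative value of the match no matter how rare the profile is.
def likelihood_ratio(rmp, fpr):
    return 1.0 / (rmp + fpr)

for fpr in (0.0, 1e-3, 1e-1):  # no error, one-in-1,000, one-in-10
    print(f"fpr={fpr}: LR ~ {likelihood_ratio(rmp=1e-9, fpr=fpr):.3g}")
# fpr=0.0: LR ~ 1e+09
# fpr=0.001: LR ~ 1e+03
# fpr=0.1: LR ~ 10
```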
Timothy A. Bottomley
2008-01-01
The BLM uses a database, called the Forest Vegetation Information System (FORVIS), to store, retrieve, and analyze forest resource information on a majority of their forested lands. FORVIS also has the capability of easily transferring appropriate data electronically into the Forest Vegetation Simulator (FVS) for simulation runs. Only minor additional data inputs or...
Call-Center Based Disease Management of Pediatric Asthmatics
2006-04-01
This study will measure the impact of CBDMP, which promotes patient education and empowerment, on multiple factors including patient/caregiver quality... Prepare and reproduce patient education materials and informed consent work sheets. Contract an Oracle database administrator to establish a database for... Patient education materials and informed consent documents were reproduced. A web-based Oracle database was determined to be both prohibitively
Centralized database for interconnection system design. [for spacecraft
NASA Technical Reports Server (NTRS)
Billitti, Joseph W.
1989-01-01
A database application called DFACS (Database, Forms and Applications for Cabling and Systems) is described. The objective of DFACS is to improve the speed and accuracy of interconnection system information flow during the design and fabrication stages of a project, while simultaneously supporting both the horizontal (end-to-end wiring) and the vertical (wiring by connector) design stratagems used by the Jet Propulsion Laboratory (JPL) project engineering community. The DFACS architecture is centered around a centralized database and program methodology which emulates the manual design process hitherto used at JPL. DFACS has been tested and successfully applied to existing JPL hardware tasks with a resulting reduction in schedule time and costs.
NASA Astrophysics Data System (ADS)
Jacquinet-Husson, N.; Lmd Team
The GEISA (Gestion et Etude des Informations Spectroscopiques Atmosphériques: Management and Study of Atmospheric Spectroscopic Information) computer-accessible database system, in its former 1997 and 2001 versions, was updated in 2003 (GEISA-03). It has been developed by the ARA (Atmospheric Radiation Analysis) group at LMD (Laboratoire de Météorologie Dynamique, France) since 1974. This early effort implemented the so-called "line-by-line and layer-by-layer" approach to forward radiative transfer modelling. The GEISA 2003 system comprises three databases with their associated management software: (i) a database of spectroscopic parameters required to adequately describe the individual spectral lines belonging to 42 molecules (96 isotopic species) located in a spectral range from the microwave to the limit of the visible; the featured molecules are of interest in studies of the terrestrial as well as the other planetary atmospheres, especially those of the Giant Planets; (ii) a database of absorption cross-sections of molecules, such as chlorofluorocarbons, which exhibit unresolvable spectra; and (iii) a database of refractive indices of basic atmospheric aerosol components. Illustrations will be given of the GEISA-03 data archiving method, contents, management software and Web access facilities at http://ara.lmd.polytechnique.fr. The performance of instruments like AIRS (Atmospheric Infrared Sounder; http://www-airs.jpl.nasa.gov) in the USA and IASI (Infrared Atmospheric Sounding Interferometer; http://smsc.cnes.fr/IASI/index.htm) in Europe, which have a better vertical resolution and accuracy than the presently existing satellite infrared vertical sounders, is directly related to the quality of the spectroscopic parameters of the optically active gases, since these are essential inputs to the forward models used to simulate recorded radiance spectra. For these upcoming atmospheric sounders, the so-called GEISA/IASI sub-database system has been elaborated from GEISA; its content will be described as well. This work is ongoing, with the purpose of assessing the IASI measurement capabilities and the spectroscopic information quality within the ISSWG (IASI Sounding Science Working Group), in the frame of the CNES (Centre National d'Etudes Spatiales, France)/EUMETSAT (EUropean organization for the exploitation of METeorological SATellites) Polar System (EPS) project, by simulating high-resolution radiances and/or using experimental data. EUMETSAT will implement GEISA/IASI into the EPS ground segment. The IASI sounding spectroscopic data archive requirements will be discussed in the context of comparisons between recorded and calculated experimental spectra, using the ARA/4A forward line-by-line radiative transfer modelling code in its latest version.
Carter, Sarah P; Osborne, Laura J; Renshaw, Keith D; Allen, Elizabeth S; Loew, Benjamin A; Markman, Howard J; Stanley, Scott M
2018-02-01
Long-distance communication has been frequently identified as essential to military couples trying to maintain their relationship during a deployment. Little quantitative research, however, has assessed the types of topics discussed during such communication and how those topics relate to overall relationship satisfaction. The current study draws on a sample of 56 Army couples who provided data through online surveys while the service member was actively deployed. These couples provided information on current marital satisfaction, topics discussed during deployment (problem talk, friendship talk, love talk), and how they communicated via synchronous media (e.g., phone calls, video calls) and letters during deployment. Nonparametric Friedman tests followed by paired t tests revealed that synchronous communication was primarily utilized for friendship talk, whereas letters included friendship talk and love talk in similar amounts. Both synchronous communication and letters included less problem talk than other topics. In mixed-level modeling, only topics of communication for synchronous media (not for letters) were related to relationship satisfaction. Love talk via synchronous media was related to higher relationship satisfaction, whereas problem talk via synchronous media was related to less relationship satisfaction. The current study offers the first quantitative assessment of topics within deployment communication media and associations with relationship satisfaction. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
TreeQ-VISTA: An Interactive Tree Visualization Tool withFunctional Annotation Query Capabilities
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gu, Shengyin; Anderson, Iain; Kunin, Victor
2007-05-07
Summary: We describe a general multiplatform exploratory tool called TreeQ-Vista, designed for presenting functional annotations in a phylogenetic context. Traits, such as phenotypic and genomic properties, are interactively queried from a relational database with a user-friendly interface which provides a set of tools for users with or without SQL knowledge. The query results are projected onto a phylogenetic tree and can be displayed in multiple color groups. A rich set of browsing, grouping and query tools are provided to facilitate trait exploration, comparison and analysis. Availability: The program, detailed tutorial and examples are available online at http://genome-test.lbl.gov/vista/TreeQVista.
Tandem Mass Spectrum Sequencing: An Alternative to Database Search Engines in Shotgun Proteomics.
Muth, Thilo; Rapp, Erdmann; Berven, Frode S; Barsnes, Harald; Vaudel, Marc
2016-01-01
Protein identification via database searches has become the gold standard in mass spectrometry-based shotgun proteomics. However, as the quality of tandem mass spectra improves, direct mass spectrum sequencing gains interest as a database-independent alternative. In this chapter, the general principle of this so-called de novo sequencing is introduced along with pitfalls and challenges of the technique. The main tools available are presented, with a focus on user-friendly open-source software that can be directly applied in everyday proteomic workflows.
Document image database indexing with pictorial dictionary
NASA Astrophysics Data System (ADS)
Akbari, Mohammad; Azimi, Reza
2010-02-01
In this paper we introduce a new approach for information retrieval from Persian document image databases without using Optical Character Recognition (OCR). First, an attribute called the subword upper contour label is defined; then a pictorial dictionary is constructed for the subwords based on this attribute. With this approach we address two issues in document image retrieval: keyword spotting and retrieval according to document similarity. The proposed methods have been evaluated on a Persian document image database. The results demonstrate the ability of this approach in document image information retrieval.
Nacul, L C; Stewart, A; Alberg, C; Chowdhury, S; Darlison, M W; Grollman, C; Hall, A; Modell, B; Moorthie, S; Sagoo, G S; Burton, H
2014-06-01
In 2010 the World Health Assembly called for action to improve the care and prevention of congenital disorders, noting that technical guidance would be required for this task, especially in low- and middle-income countries. Responding to this call, we have developed a freely available web-accessible Toolkit for assessing health needs for congenital disorders. Materials for the Toolkit website (http://toolkit.phgfoundation.org) were prepared by an iterative process of writing, discussion and modification by the project team, with advice from external experts. A customized database was developed using epidemiological, demographic, socio-economic and health-services data from a range of validated sources. Document-processing and data integration software combines data from the database with a template to generate topic- and country-specific Calculator documents for quantitative analysis. The Toolkit guides users through selection of topics (including both clinical conditions and relevant health services), assembly and evaluation of qualitative and quantitative information, assessment of the potential effects of selected interventions, and planning and prioritization of actions to reduce the risk or prevalence of congenital disorders. The Toolkit enables users without epidemiological or public health expertise to undertake health needs assessment as a prerequisite for strategic planning in relation to congenital disorders in their country or region. © The Author 2013. Published by Oxford University Press on behalf of Faculty of Public Health.
Pediatric emergencies on a US-based commercial airline.
Moore, Brian R; Ping, Jennifer M; Claypool, David W
2005-11-01
The purpose of this investigation was to determine the incidence and character of pediatric emergencies on a US-based commercial airline and to evaluate current in-flight medical kits. In-flight consultations to a major US airline by a member of our staff are recorded in an institutional database. In this observational retrospective review, the database was queried for consultations for all passengers up to 18 years old between January 1, 1995, and December 31, 2002. Consultations were reviewed for type of emergency, use of the medical kit, and unscheduled landings. Two hundred twenty-two pediatric consultations were identified, representing 1 pediatric call per 20,775 flights. The mean age of patients was 6.8 years. Fifty-three emergencies were preflight calls, and 169 were in-flight pediatric consultations. The most common in-flight consultations concerned infectious disease (45 calls, 27%), neurological (25 calls, 15%), and respiratory tract (22 calls, 13%) emergencies. The emergency medical kit was used for 60 emergencies. Nineteen consultations (11%) resulted in flight diversions (1/240,000 flights), most commonly because of in-flight neurological (9) and respiratory tract (5) emergencies. International flights had a higher incidence of consultations and diversions for pediatric emergencies than domestic flights. The most common in-flight pediatric emergencies involved infectious diseases and neurological and respiratory tract problems. Emergency medical kits should be expanded to include pediatric medications.
Communication with Family and Friends across the Life Course.
David-Barrett, Tamas; Kertesz, Janos; Rotkirch, Anna; Ghosh, Asim; Bhattacharya, Kunal; Monsivais, Daniel; Kaski, Kimmo
2016-01-01
Each stage of the human life course is characterised by a distinctive pattern of social relations. We study how the intensity and importance of the closest social contacts vary across the life course, using a large database of mobile communication from a European country. We first determine the most likely social relationship type from these mobile phone records by relating the age and gender of the caller and recipient to the frequency, length, and direction of calls. We then show how communication patterns between parents and children, romantic partner, and friends vary across the six main stages of the adult family life course. Young adulthood is dominated by a gradual shift of call activity from parents to close friends, and then to a romantic partner, culminating in the period of early family formation, during which the focus is on the romantic partner. During middle adulthood, call patterns suggest a high dependence on the parents of the ego, who presumably often provide alloparental care, while at this stage female same-gender friendship also peaks. During post-reproductive adulthood, individuals, and especially women, balance close social contacts among three generations. The age of grandparenthood brings into focus the children, who are now entering adulthood and forming their own families, and is associated with a realignment of close social contacts, especially among women, while old age is dominated by dependence on one's children.
Radar target classification studies: Software development and documentation
NASA Astrophysics Data System (ADS)
Kamis, A.; Garber, F.; Walton, E.
1985-09-01
Three computer programs were developed to process and analyze calibrated radar returns. The first program, called DATABASE, was developed to create and manage a random-access database. The second program, called FTRAN DB, was developed to process horizontal and vertical polarization radar returns into different formats (i.e., time domain, circular polarizations and polarization parameters). The third program, called RSSE, was developed to simulate a variety of radar systems and to evaluate their ability to identify radar returns. Complete computer listings are included in the appendix volumes.
Challenges in developing medicinal plant databases for sharing ethnopharmacological knowledge.
Ningthoujam, Sanjoy Singh; Talukdar, Anupam Das; Potsangbam, Kumar Singh; Choudhury, Manabendra Dutta
2012-05-07
Major research contributions in ethnopharmacology have generated vast amounts of data associated with medicinal plants. Computerized databases facilitate data management and analysis, making coherent information available to researchers, planners and other users. Web-based databases also facilitate knowledge transmission and feed the circle of information exchange between ethnopharmacological studies and the public audience. However, despite the development of many medicinal plant databases, a lack of uniformity is still discernible. This calls for defining a common standard to achieve the common objectives of ethnopharmacology. The aim of the study is to review the diversity of approaches to storing ethnopharmacological information in databases and to provide some minimal standards for these databases. A survey of articles on medicinal plant databases was conducted on the Internet using selected keywords. Grey literature and printed materials were also searched for information. Listed resources were critically analyzed for their approaches in content type, focus area and software technology. The necessity of rapidly incorporating traditional knowledge by compiling primary data has been felt. While citation collection is a common approach to information compilation, it cannot fully assimilate the local literature that reflects traditional knowledge. A need to define standards for systematic evaluation and for checking the quality and authenticity of the data is also felt. Databases focusing on thematic areas, viz. traditional medicine systems, regional aspects, diseases and phytochemical information, are analyzed. Issues pertaining to data standards, data linking and unique identification need to be addressed, in addition to general issues such as lack of updates and sustainability. Against the background of the present study, suggestions have been made on some minimum standards for the development of medicinal plant databases. In spite of variations in approaches, the existence of many overlapping features indicates redundancy of resources and efforts. As the development of global data in a single database may not be possible in view of culture-specific differences, efforts can be directed to specific regional areas. The existing scenario calls for a collaborative approach to defining a common standard for medicinal plant databases for knowledge sharing and scientific advancement. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
Viereck, Søren; Møller, Thea Palsgaard; Ersbøll, Annette Kjær; Bækgaard, Josefine Stokholm; Claesson, Andreas; Hollenberg, Jacob; Folke, Fredrik; Lippert, Freddy K
2017-06-01
Initiation of early bystander cardiopulmonary resuscitation (CPR) depends on bystanders' or medical dispatchers' recognition of out-of-hospital cardiac arrest (OHCA). The primary aim of our study was to investigate whether OHCA recognition during the emergency call was associated with bystander CPR, return of spontaneous circulation (ROSC), and 30-day survival. Our secondary aim was to identify patient-, setting-, and dispatcher-related predictors of OHCA recognition. We performed an observational study of all OHCA patients' emergency calls in the Capital Region of Denmark from 01/01/2013 to 31/12/2013. OHCAs were collected from the Danish Cardiac Arrest Registry and the Mobile Critical Care Unit database. Emergency call recordings were identified and evaluated. Multivariable logistic regression analyses were applied to all OHCAs and to witnessed OHCAs only to analyse the association between OHCA recognition and bystander CPR, ROSC, and 30-day survival. Univariable logistic regression analyses were applied to identify predictors of OHCA recognition. We included 779 emergency calls in the analyses. During the emergency calls, 70.1% (n=534) of OHCAs were recognised; OHCA recognition was positively associated with bystander CPR (odds ratio [OR]=7.84, 95% confidence interval [CI]: 5.10-12.05) in all OHCAs, and with ROSC (OR=1.86, 95% CI: 1.13-3.06) and 30-day survival (OR=2.80, 95% CI: 1.58-4.96) in witnessed OHCAs. Predictors of OHCA recognition were addressing breathing (OR=1.76, 95% CI: 1.17-2.66) and callers located by the patient's side (OR=2.16, 95% CI: 1.46-3.19). Recognition of OHCA during emergency calls was positively associated with the provision of bystander CPR, ROSC, and 30-day survival in witnessed OHCA. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.
Peshin, Sharda Shah; Srivastava, Amita; Halder, Nabanita; Gupta, Yogendra Kumar
2014-02-01
The study was designed to analyze the incidence and pattern of pesticide poisoning calls reported to the National Poisons Information Centre (NPIC), AIIMS, New Delhi and to highlight the common classes of pesticides involved in poisoning. The telephone calls received by the Centre during the thirteen-year period (1999-2012) were entered into a preset proforma and then into a retrievable database. A total of 4929 calls of pesticide poisoning were recorded. The data were analyzed with respect to age, gender, mode and type of poisoning. Ages ranged from 1 to 65 years, with a preponderance of males (M = 62.19%, F = 37.80%). The age group mainly involved in poisoning was 18-35 years. While 59.38% of calls pertained to household pesticides, 40.61% related to agricultural pesticides. The most common mode of poisoning was intentional (64.60%), followed by accidental (34.40%) and unknown (1%). Amongst the household pesticides, the highest number of calls was due to pyrethroids (26.23%), followed by rodenticides (17.06%), organophosphates (6.26%), carbamates (4.95%) and others (4.86%). In the agricultural pesticides group, organophosphates (9.79%) ranked first, followed by aluminium phosphide (9.65%), organochlorines (9.31%), pyrethroids (3.87%), herbicides, weedicides and fungicides (3.20%), ethylene dibromide (2.82%), and others (1.70%). The data analysis shows a high incidence of poisoning due to household pesticides as compared to agricultural pesticides, clearly emphasizing the need for creating awareness and education about proper use and the implementation of prevention programmes. Copyright © 2013 Elsevier Ltd and Faculty of Forensic and Legal Medicine. All rights reserved.
On the connection of gamma-ray bursts and X-ray flashes in the BATSE and RHESSI databases
NASA Astrophysics Data System (ADS)
Řípa, J.; Mészáros, A.
2016-12-01
Classification of gamma-ray bursts (GRBs) into groups has been intensively studied by various statistical tests in previous years. It has been suggested that there is a distinct group of GRBs, beyond the long and short ones, with intermediate durations. However, such a group has not yet been securely confirmed. Strangely, concerning spectral hardness, the observations from the Swift and RHESSI satellites give different results. For the Swift/BAT database it is found that the intermediate-duration bursts might well be related to so-called X-ray flashes (XRFs). On the other hand, for the RHESSI dataset the intermediate-duration bursts seem to be spectrally too hard to be given by XRFs. The connection between the intermediate-duration bursts and XRFs for the BATSE database is likewise unclear. The purpose of this article is to check the relation between XRFs and GRBs for the BATSE and RHESSI databases, respectively. We use an empirical definition of XRFs introduced earlier by other authors. For the RHESSI database we also use a transformation between the detected counts and the fluences based on the simulated detector response function. The purpose is to compare the hardnesses of GRBs with the definition of XRFs. XRFs constitute a 1.3-4.2% fraction of the whole BATSE database. The vast majority of the BATSE short bursts are not XRFs, because only 0.7-5.7% of the short bursts can be given by XRFs. However, there is a large uncertainty in the fraction of XRFs among the intermediate-duration bursts: between 1% and 85% of the BATSE intermediate-duration bursts can be related to XRFs. For the long bursts this fraction is between 1.0% and 3.4%. The uncertainties in these fractions are large; however, it can be claimed that not all BATSE intermediate-duration bursts can be given by XRFs. At least 79% of RHESSI short bursts, at least 53% of RHESSI intermediate-duration bursts, and at least 45% of RHESSI long bursts should not be given by XRFs. A simulation of XRFs observed by HETE-2 and Swift has shown that RHESSI would detect, and in fact detected, only one long-duration XRF out of the 26 observed by those two satellites. We arrive at the conclusion that the intermediate-duration bursts in the BATSE database can be partly populated by XRFs, but the RHESSI intermediate-duration bursts are most likely not given by XRFs. The results claiming that the Swift/BAT intermediate-duration bursts are closely related to XRFs do not hold for the BATSE and RHESSI databases.
The Co-regulation Data Harvester: Automating gene annotation starting from a transcriptome database
NASA Astrophysics Data System (ADS)
Tsypin, Lev M.; Turkewitz, Aaron P.
Identifying co-regulated genes provides a useful approach for defining pathway-specific machinery in an organism. To be efficient, this approach relies on thorough genome annotation, a process much slower than genome sequencing per se. Tetrahymena thermophila, a unicellular eukaryote, has been a useful model organism and has a fully sequenced but sparsely annotated genome. One important resource for studying this organism has been an online transcriptomic database. We have developed an automated approach to gene annotation in the context of transcriptome data in T. thermophila, called the Co-regulation Data Harvester (CDH). Beginning with a gene of interest, the CDH identifies co-regulated genes by accessing the Tetrahymena transcriptome database. It then identifies their closely related genes (orthologs) in other organisms by using reciprocal BLAST searches. Finally, it collates the annotations of those orthologs' functions, which provides the user with information to help predict the cellular role of the initial query. The CDH, which is freely available, represents a powerful new tool for analyzing cell biological pathways in Tetrahymena. Moreover, to the extent that genes and pathways are conserved between organisms, the inferences obtained via the CDH should be relevant, and can be explored, in many other systems.
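The ortholog-detection step can be sketched with the standard reciprocal best-hit criterion; the gene identifiers below are made-up placeholders, and in practice the two dictionaries would be parsed from BLAST output run in each direction.

```python
# best_hits_ab: each Tetrahymena gene's best BLAST hit in the other organism;
# best_hits_ba: each of that organism's genes' best hit back in Tetrahymena.
best_hits_ab = {"TTHERM_A": "GeneX", "TTHERM_B": "GeneY"}
best_hits_ba = {"GeneX": "TTHERM_A", "GeneY": "TTHERM_C"}

def reciprocal_best_hits(ab, ba):
    """Keep only pairs whose best hits point at each other (putative orthologs)."""
    return [(a, b) for a, b in ab.items() if ba.get(b) == a]

print(reciprocal_best_hits(best_hits_ab, best_hits_ba))
# [('TTHERM_A', 'GeneX')] -- TTHERM_B/GeneY is not reciprocal, so it is dropped
```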
The DNA database search controversy revisited: bridging the Bayesian-frequentist gap.
Storvik, Geir; Egeland, Thore
2007-09-01
Two different quantities have been suggested for the quantification of evidence in cases where a suspect is found by a search through a database of DNA profiles. The likelihood ratio, typically motivated from a Bayesian setting, is preferred by most experts in the field. The so-called np rule, motivated by frequentist arguments, has been advocated by the American National Research Council and by Stockmarr (1999, Biometrics 55, 671-677). The two quantities differ substantially and have given rise to the DNA database search controversy. Although several authors have criticized the different approaches, a full explanation of why these differences appear is still lacking. In this article we show that a P-value in a frequentist hypothesis setting is approximately equal to the result of the np rule. We argue, however, that a more reasonable procedure in this case is to use conditional testing, in which case a P-value directly related to posterior probabilities and the likelihood ratio is obtained. This way of viewing the problem bridges the gap between the Bayesian and frequentist approaches. At the same time it indicates that the np rule should not be used to quantify evidence.
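The approximation at the heart of the controversy can be reconstructed as follows; this is the standard textbook derivation, not the authors' full conditional-testing argument.

```latex
% With n unrelated profiles in the database, each matching by chance with
% probability p, the frequentist P-value for observing at least one match is
\[
  P(\text{at least one match}) = 1 - (1 - p)^{n} \approx np
  \qquad (np \ll 1),
\]
% which recovers the np rule; the likelihood ratio preferred by most experts,
% by contrast, is LR = 1/p and does not depend on the database size n.
```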
Foley, Desmond H; Wilkerson, Richard C; Birney, Ian; Harrison, Stanley; Christensen, Jamie; Rueda, Leopoldo M
2010-02-18
Mosquitoes are important vectors of diseases but, in spite of various mosquito faunistic surveys globally, there is a need for a spatial online database of mosquito collection data and distribution summaries. Such a resource could provide entomologists with the results of previous mosquito surveys, and vector disease control workers, preventative medicine practitioners, and health planners with information relating mosquito distribution to vector-borne disease risk. A web application called MosquitoMap was constructed comprising mosquito collection point data stored in an ArcGIS 9.3 Server/SQL geodatabase that includes administrative area and vector species x country lookup tables. In addition to the layer containing mosquito collection points, other map layers were made available including environmental, and vector and pathogen/disease distribution layers. An application within MosquitoMap called the Mal-area calculator (MAC) was constructed to quantify the area of overlap, for any area of interest, of vector, human, and disease distribution models. Data standards for mosquito records were developed for MosquitoMap. MosquitoMap is a public domain web resource that maps and compares georeferenced mosquito collection points to other spatial information, in a geographical information system setting. The MAC quantifies the Mal-area, i.e. the area where it is theoretically possible for vector-borne disease transmission to occur, thus providing a useful decision tool where other disease information is limited. The Mal-area approach emphasizes the independent but cumulative contribution to disease risk of the vector species predicted present. MosquitoMap adds value to, and makes accessible, the results of past collecting efforts, as well as providing a template for other arthropod spatial databases.
On-line resources for bacterial micro-evolution studies using MLVA or CRISPR typing.
Grissa, Ibtissem; Bouchon, Patrick; Pourcel, Christine; Vergnaud, Gilles
2008-04-01
The control of bacterial pathogens requires the development of tools allowing the precise identification of strains at the subspecies level. It is now widely accepted that these tools will need to be DNA-based assays (in contrast to identification at the species level, where biochemically based assays are still widely used, even though very powerful 16S DNA sequence databases exist). Typing assays need to be cheap and amenable to the design of international databases. The success of such subspecies typing tools will eventually be measured by the size of the associated reference databases accessible over the internet. Three methods have shown some potential in this direction: the so-called spoligotyping assay (Mycobacterium tuberculosis, 40,000-entry database), Multiple Loci Sequence Typing (MLST; up to a few thousand entries for each of more than 20 bacterial species), and more recently Multiple Loci VNTR Analysis (MLVA; up to a few hundred entries, with assays available for more than 20 pathogens). In the present report we review the current status of the tools and resources we have developed over the past seven years to help in the setting-up or use of MLVA assays and, lately, for analysing Clustered Regularly Interspaced Short Palindromic Repeats (CRISPRs), which are the basis for spoligotyping assays.
Carolus, Marshall; Biglarbigi, Khosrow; Warwick, Peter D.; Attanasi, Emil D.; Freeman, Philip A.; Lohr, Celeste D.
2017-10-24
A database called the “Comprehensive Resource Database” (CRD) was prepared to support U.S. Geological Survey (USGS) assessments of technically recoverable hydrocarbons that might result from the injection of miscible or immiscible carbon dioxide (CO2) for enhanced oil recovery (EOR). The CRD was designed by INTEK Inc., a consulting company under contract to the USGS. The CRD contains data on the location, key petrophysical properties, production, and well counts (number of wells) for the major oil and gas reservoirs in onshore areas and State waters of the conterminous United States and Alaska. The CRD includes proprietary data on petrophysical properties of fields and reservoirs from the “Significant Oil and Gas Fields of the United States Database,” prepared by Nehring Associates in 2012, and proprietary production and drilling data from the “Petroleum Information Data Model Relational U.S. Well Data,” prepared by IHS Inc. in 2012. This report describes the CRD and the computer algorithms used to (1) estimate missing reservoir property values in the Nehring Associates (2012) database and (2) generate values of additional properties used to characterize reservoirs suitable for miscible or immiscible CO2 flooding for EOR. Because of the proprietary nature of the data and contractual obligations, the CRD and the actual data from Nehring Associates (2012) and IHS Inc. (2012) cannot be presented in this report.
Searching Across the International Space Station Databases
NASA Technical Reports Server (NTRS)
Maluf, David A.; McDermott, William J.; Smith, Ernest E.; Bell, David G.; Gurram, Mohana
2007-01-01
Data access in the enterprise generally requires combining data from different sources and different formats. It is thus advantageous to focus on the intersection of the knowledge across sources and domains; keeping irrelevant knowledge around only serves to make the integration more unwieldy and more complicated than necessary. A context search over multiple domains is proposed in this paper, which uses context-sensitive queries to support disciplined manipulation of domain knowledge resources. The objective of a context search is to provide the capability for interrogating many domain knowledge resources, which are largely semantically disjoint. The search formally supports the tasks of selecting, combining, extending, specializing, and modifying components from a diverse set of domains. This paper demonstrates a new paradigm in the composition of information for enterprise applications. In particular, it discusses an approach to achieving data integration across multiple sources in a manner that does not require heavy investment in database and middleware maintenance. This lean approach to integration leads to cost-effectiveness and scalability of data integration, with an underlying schemaless object-relational database management system. This highly scalable, information-on-demand system framework, called NX-Search, is an implementation of an information system built on NETMARK. NETMARK is a flexible, high-throughput open database integration framework for managing, storing, and searching unstructured or semi-structured arbitrary XML and HTML used widely at the National Aeronautics and Space Administration (NASA) and in industry.
NASA Astrophysics Data System (ADS)
Rogers, Steven P.; Hamilton, David B.
1994-06-01
To employ the most readily comprehensible presentation methods and symbology with helmet-mounted displays (HMDs), it is critical to identify the information elements needed to perform each pilot function and to analytically determine the attributes of these elements. The extensive analyses of mission requirements currently performed for pilot-vehicle interface design can be aided and improved by the new capabilities of intelligent systems and relational databases. An intelligent system, named ACIDTEST, has been developed specifically for organizing and applying rules to identify the best display modalities, locations, and formats. The primary objectives of the ACIDTEST system are to provide rapid accessibility to pertinent display research data, to integrate guidelines from many disciplines and identify conflicts among these guidelines, to force a consistent display approach among the design team members, and to serve as an 'audit trail' of design decisions and justifications. A powerful relational database called TAWL ORDIR has been developed to document information requirements and attributes for use by ACIDTEST as well as to greatly augment the applicability of mission analysis data. TAWL ORDIR can be used to rapidly reorganize mission analysis data components for study, perform commonality analyses for groups of tasks, determine the information content requirement for tailored display modes, and identify symbology integration opportunities.
A data analysis expert system for large established distributed databases
NASA Technical Reports Server (NTRS)
Gnacek, Anne-Marie; An, Y. Kim; Ryan, J. Patrick
1987-01-01
A design for a natural language database interface system, called the Deductively Augmented NASA Management Decision support System (DANMDS), is presented. The DANMDS system components have been chosen on the basis of the following considerations: maximal employment of the existing NASA IBM-PC computers and supporting software; local structuring and storing of external data via the entity-relationship model; a natural, easy-to-use, error-free database query language; user ability to alter the query language vocabulary and data analysis heuristics; and significant artificial intelligence data analysis heuristic techniques that allow the system to become progressively and automatically more useful.
Functional Interaction Network Construction and Analysis for Disease Discovery.
Wu, Guanming; Haw, Robin
2017-01-01
Network-based approaches project seemingly unrelated genes or proteins onto a large-scale network context, providing a holistic visualization and analysis platform for genomic data generated from high-throughput experiments, reducing the dimensionality of the data by using network modules, and increasing statistical power. Based on the Reactome database, the most popular and comprehensive open-source biological pathway knowledgebase, we have developed a highly reliable protein functional interaction network covering around 60% of total human genes, and an app called ReactomeFIViz for Cytoscape, the most popular biological network visualization and analysis platform. In this chapter, we describe the detailed procedures by which this functional interaction network is constructed: integrating multiple external data sources, extracting functional interactions from human-curated pathway databases, building a machine learning classifier called a Naïve Bayesian Classifier, predicting interactions based on the trained classifier, and finally constructing the functional interaction database. We also provide an example of how to use ReactomeFIViz to perform network-based data analysis for a list of genes.
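A minimal sketch of the classifier-training step, assuming binary evidence features (e.g., co-expression, shared domain, PPI support) and labeled interaction pairs; the features and data are illustrative, not Reactome's actual training set.

```python
from sklearn.naive_bayes import BernoulliNB

# Rows are candidate protein pairs; columns record presence/absence of each
# evidence type supporting a functional interaction.
X_train = [[1, 1, 0], [1, 0, 1], [0, 0, 0], [0, 1, 0]]
y_train = [1, 1, 0, 0]  # 1 = known functional interaction, 0 = negative pair

clf = BernoulliNB()
clf.fit(X_train, y_train)

# Candidate pairs scoring above a chosen probability cutoff would be added
# to the predicted functional interaction network.
print(clf.predict_proba([[1, 1, 1]])[0, 1])
```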
PeTMbase: A Database of Plant Endogenous Target Mimics (eTMs).
Karakülah, Gökhan; Yücebilgili Kurtoğlu, Kuaybe; Unver, Turgay
2016-01-01
MicroRNAs (miRNAs) are small endogenous RNA molecules that regulate target gene expression at the post-transcriptional level. In addition, miRNA activity can be controlled by a newly discovered regulatory mechanism called endogenous target mimicry (eTM). In target mimicry, eTMs bind to their corresponding miRNAs and block the binding of specific transcripts, leading to increased mRNA expression. Because miRNA-eTM-target-mRNA regulation modules are involved in a wide range of biological processes, an increasing need for a comprehensive eTM database arose. Apart from miRSponge, which holds a limited amount of Arabidopsis eTM data, no database or repository for plant eTMs had yet been developed and released. Here, we present an online plant eTM database, called PeTMbase (http://petmbase.org), with a highly efficient search tool. To establish the repository, eTMs were identified from high-throughput RNA-sequencing data of 11 plant species. Each transcriptome library is first mapped to the corresponding plant genome, and long non-coding RNA (lncRNA) transcripts are characterized. Additional lncRNAs retrieved from GREENC and PNRD were incorporated into the lncRNA catalog. Then, utilizing the lncRNA and miRNA sources, a total of 2,728 eTMs were successfully predicted. Our regularly updated database, PeTMbase, provides high-quality information on miRNA:eTM modules and will aid functional genomics studies, particularly on miRNA regulatory networks.
Using Web Ontology Language to Integrate Heterogeneous Databases in the Neurosciences
Lam, Hugo Y.K.; Marenco, Luis; Shepherd, Gordon M.; Miller, Perry L.; Cheung, Kei-Hoi
2006-01-01
Integrative neuroscience involves the integration and analysis of diverse types of neuroscience data involving many different experimental techniques. This data will increasingly be distributed across many heterogeneous databases that are web-accessible. Currently, these databases do not expose their schemas (database structures) and their contents to web applications/agents in a standardized, machine-friendly way. This limits database interoperation. To address this problem, we describe a pilot project that illustrates how neuroscience databases can be expressed using the Web Ontology Language, which is a semantically-rich ontological language, as a common data representation language to facilitate complex cross-database queries. In this pilot project, an existing tool called “D2RQ” was used to translate two neuroscience databases (NeuronDB and CoCoDat) into OWL, and the resulting OWL ontologies were then merged. An OWL-based reasoner (Racer) was then used to provide a sophisticated query language (nRQL) to perform integrated queries across the two databases based on the merged ontology. This pilot project is one step toward exploring the use of semantic web technologies in the neurosciences. PMID:17238384
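As a rough analogue of the pipeline described above, the sketch below uses rdflib to merge two ontology files and run one query across them; the file names and predicate IRI are hypothetical, and rdflib's SPARQL engine stands in for the D2RQ translation plus Racer/nRQL querying used in the actual project.

```python
from rdflib import Graph

# Merge two (hypothetical) OWL exports into a single RDF graph.
merged = Graph()
merged.parse("neurondb.owl")  # stand-in for the NeuronDB OWL translation
merged.parse("cocodat.owl")   # stand-in for the CoCoDat OWL translation

# One query now spans both databases via the merged ontology.
results = merged.query("""
    PREFIX ns: <http://example.org/neuro#>
    SELECT ?neuron ?property
    WHERE { ?neuron ns:hasProperty ?property . }
""")
for neuron, prop in results:
    print(neuron, prop)
```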
Creation of the NaSCoRD Database
DOE Office of Scientific and Technical Information (OSTI.GOV)
Denman, Matthew R.; Jankovsky, Zachary Kyle; Stuart, William
This report was written as part of a United States Department of Energy (DOE), Office of Nuclear Energy, Advanced Reactor Technologies program funded project to re-create the capabilities of the legacy Centralized Reliability Database Organization (CREDO) database. The CREDO database provided a record of component design and performance documentation across various systems that used sodium as a working fluid. Regaining this capability will allow the DOE complex and the domestic sodium reactor industry to better understand how previous systems were designed and built for use in improving the design and operations of future loops. The contents of this report include: an overview of the current state of domestic sodium reliability databases; a summary of the ongoing effort to improve, understand, and process the CREDO information; a summary of the initial efforts to develop a unified sodium reliability database called the Sodium System Component Reliability Database (NaSCoRD); and an explanation of how potential users can access the domestic sodium reliability databases and the type of information that can be accessed from these databases.
Factors Implicated in Safety-related Firefighter Fatalities.
Kahn, Steven A; Palmieri, Tina L; Sen, Soman; Woods, Jason; Gunter, Oliver L
Firefighting is fraught with risk, as 80-100 firefighters (FFs) die on the job each year in the United States. Many of the fatalities have been analyzed by the National Institute for Occupational Safety and Health (NIOSH) to determine contributing factors. The purpose of this study is to determine variables that put FFs at risk for potentially preventable workplace mortality, such as use of personal protective equipment (PPE), seat belts, and appropriate training/fitness/clearance for duty. The NIOSH FF Fatality Database reports from 2009 to 2014 were analyzed. Data including age, gender, years on the job, weather, other calls on the same shift, and department type were compared between FFs who employed PPE, seat belts, or wellness/fitness and those who did not. A second group of FFs was determined by NIOSH to have inexperience, lack of training, or inappropriate clearance for duty implicated in their fatalities. Comparisons for the second group were between those whose department used training and safety-related standard operating protocols and those who did not. In 84/176 deaths, PPE/seat belts/fitness was implicated in the fatality. Lack of PPE was more likely on clear days (P = .03) but less likely on cloudy and windy days (P < .001). The FFs who died with lack of PPE had more time on the job in a single department, 18 vs 13 years (P = .03), and more time in a volunteer department, 17 vs 8 years (P < .01). Being deployed on another call during the same shift was associated with lack of PPE: 34% vs 16% of those who had not been on another call (P = .005). Lack of training, experience, or medical clearance was implicated in the fatalities of 100/176 FFs. FFs who worked in departments that lacked standard operating protocols for respirator fit testing, PPE, fitness testing, rapid intervention, medical clearance, safety/distress alarms, vehicle maintenance, or incident command were statistically more likely to have lack of experience/training/clearance implicated in the fatality. Good weather during a call and more years on the job, particularly in a volunteer department, are associated with FF mortality related to unsafe practices. These factors might create an air of complacency that puts FFs at risk for safety-related omissions. Having been on a recent call may create distraction or fatigue that puts FFs at risk during subsequent calls. Lack of key safety-related protocols appears to put FFs at risk of mortality, and the risk may be increasing over time. Further study and prevention efforts from multidisciplinary groups are needed to better understand and combat this problem.
Hahn, Lars; Leimeister, Chris-André; Ounit, Rachid; Lonardi, Stefano; Morgenstern, Burkhard
2016-10-01
Many algorithms for sequence analysis rely on word matching or word statistics. Often, these approaches can be improved if binary patterns representing match and don't-care positions are used as a filter, such that only those positions of words are considered that correspond to the match positions of the patterns. The performance of these approaches, however, depends on the underlying patterns. Herein, we show that the overlap complexity of a pattern set, a measure introduced by Ilie and Ilie, is closely related to the variance of the number of matches between two evolutionarily related sequences with respect to this pattern set. We propose a modified hill-climbing algorithm to optimize pattern sets for database searching, read mapping and alignment-free sequence comparison of nucleic-acid sequences; our implementation of this algorithm is called rasbhari. Depending on the application at hand, rasbhari can either minimize the overlap complexity of pattern sets, maximize their sensitivity in database searching or minimize the variance of the number of pattern-based matches in alignment-free sequence comparison. We show that, for database searching, rasbhari generates pattern sets with slightly higher sensitivity than existing approaches. In our Spaced Words approach to alignment-free sequence comparison, pattern sets calculated with rasbhari led to more accurate estimates of phylogenetic distances than the randomly generated pattern sets that we previously used. Finally, we used rasbhari to generate patterns for short read classification with CLARK-S. Here, too, the sensitivity of the results could be improved compared to the default patterns of the program. We integrated rasbhari into Spaced Words; the source code of rasbhari is freely available at http://rasbhari.gobics.de/.
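The pattern-based matching that rasbhari optimizes can be sketched in a few lines: a binary pattern marks match ('1') and don't-care ('0') positions, and two words match if they agree on all match positions. The pattern below is arbitrary, not one produced by rasbhari.

```python
from collections import Counter

def spaced_word(seq, offset, pattern):
    """Extract the characters of seq at the pattern's match ('1') positions."""
    return "".join(seq[offset + i] for i, c in enumerate(pattern) if c == "1")

def count_spaced_matches(s1, s2, pattern):
    """Count offset pairs at which s1 and s2 agree on all match positions."""
    w = len(pattern)
    words2 = Counter(spaced_word(s2, j, pattern) for j in range(len(s2) - w + 1))
    return sum(words2[spaced_word(s1, i, pattern)]
               for i in range(len(s1) - w + 1))

# The don't-care position (the '0') lets a mismatched base pass the filter.
print(count_spaced_matches("ACGTACGT", "ACCTACGT", "1101"))
```

It is the variance of exactly this match count, taken over evolutionarily related sequence pairs, that rasbhari can minimize when optimizing a pattern set.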
A New Paradigm to Analyze Data Completeness of Patient Data.
Nasir, Ayan; Gurupur, Varadraj; Liu, Xinliang
2016-08-03
Background There is a need to develop a tool that will measure data completeness of patient records using sophisticated statistical metrics. Patient data integrity is important in providing timely and appropriate care. Completeness is an important step, with an emphasis on understanding the complex relationships between data fields and their relative importance in delivering care. This tool will not only help understand where data problems are but also help uncover the underlying issues behind them. Objectives Develop a tool that can be used alongside a variety of health care database software packages to determine the completeness of individual patient records as well as aggregate patient records across health care centers and subpopulations. Methods The methodology of this project is encapsulated within the Data Completeness Analysis Package (DCAP) tool, with the major components including concept mapping, CSV parsing, and statistical analysis. Results The results from testing DCAP with Healthcare Cost and Utilization Project (HCUP) State Inpatient Database (SID) data show that this tool is successful in identifying relative data completeness at the patient, subpopulation, and database levels. These results also solidify a need for further analysis and call for hypothesis-driven research to find underlying causes of data incompleteness. Conclusion DCAP examines patient records and generates statistics that can be used to determine the completeness of individual patient data as well as the general thoroughness of record keeping in a medical database. DCAP uses a component that is customized to the settings of the software package used for storing patient data, as well as a Comma Separated Values (CSV) file parser, to determine the appropriate measurements. DCAP itself is assessed through a proof-of-concept exercise using hypothetical data as well as available HCUP SID patient data.
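A minimal sketch of a weighted completeness score of the kind DCAP reports; the field names, weights, and CSV layout are hypothetical placeholders, not DCAP's actual configuration.

```python
import csv

# Assumed field importances; in DCAP these would come from concept mapping.
FIELD_WEIGHTS = {"age": 1.0, "diagnosis": 2.0, "procedure": 1.5}

def record_completeness(record):
    """Weighted fraction of non-empty fields in one patient record."""
    total = sum(FIELD_WEIGHTS.values())
    present = sum(w for field, w in FIELD_WEIGHTS.items()
                  if record.get(field, "").strip())
    return present / total

def database_completeness(path):
    """Mean completeness over all records in a CSV export of the database."""
    with open(path, newline="") as fh:
        scores = [record_completeness(row) for row in csv.DictReader(fh)]
    return sum(scores) / len(scores) if scores else 0.0
```

Aggregating the same per-record score over a subpopulation (e.g., one hospital or one diagnosis group) gives the subpopulation-level statistic.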
NASA Technical Reports Server (NTRS)
Abiteboul, Serge
1997-01-01
The amount of data of all kinds available electronically has increased dramatically in recent years. The data resides in different forms, ranging from unstructured data in file systems to highly structured data in relational database systems. Data is accessible through a variety of interfaces including Web browsers, database query languages, application-specific interfaces, or data exchange formats. Some of this data is raw data, e.g., images or sound. Some of it has structure, even if the structure is often implicit and not as rigid or regular as that found in standard database systems. Sometimes the structure exists but has to be extracted from the data. Sometimes also it exists but we prefer to ignore it for certain purposes such as browsing. We call semi-structured data this data that is (from a particular viewpoint) neither raw data nor strictly typed, i.e., not table-oriented as in a relational model or sorted-graph as in object databases. As will be seen later, when the notion of semi-structured data is more precisely defined, the need for semi-structured data arises naturally in the context of data integration, even when the data sources are themselves well-structured. Although data integration is an old topic, the need to integrate a wider variety of data formats (e.g., SGML or ASN.1 data) and data found on the Web has brought the topic of semi-structured data to the forefront of research. The main purpose of the paper is to isolate the essential aspects of semi-structured data. We also survey some proposals of models and query languages for semi-structured data. In particular, we consider recent works at Stanford U. and U. Penn on semi-structured data. In both cases, the motivation is found in the integration of heterogeneous data.
2009-01-01
Background Polymerase chain reaction (PCR) is very useful in many areas of molecular biology research. It is commonly observed that PCR success critically depends on the design of an effective primer pair. Current tools for primer design do not adequately address the problem of PCR failure due to mis-priming on target-related sequences and structural variations in the genome. Methods We have developed an integrated graphical web-based application for primer design, called RExPrimer, written in the Python language. The software uses Primer3 as its core primer-design algorithm. Locally stored sequence information and genomic variant information were hosted on MySQL v5.0 and incorporated into RExPrimer. Results RExPrimer provides many functionalities for improved PCR primer design. Several databases, namely annotated human SNP databases, an insertion/deletion (indel) polymorphism database, a pseudogene database, and structural genomic variation databases, were integrated into RExPrimer, enabling effective validation of the resulting primers without leaving the website. By incorporating these databases, the primers reported by RExPrimer avoid mis-priming to related sequences (e.g., pseudogenes, segmental duplications) as well as possible PCR failure caused by structural polymorphisms (SNPs, indels, and copy number variations (CNVs)). To prevent mismatching caused by unexpected SNPs in the designed primers, in particular at the 3' end (SNP-in-Primer), several SNP databases covering a broad range of population-specific SNP information are used to report SNPs present in the primer sequences. Population-specific SNP information also helps customize primer design for a specific population. Furthermore, RExPrimer offers a user-friendly graphical interface that uses Scalable Vector Graphics (SVG) images to intuitively present the resulting primers along with the corresponding gene structure. In this study, we demonstrated the program's effectiveness in successfully generating primers for strongly homologous sequences. Conclusion The improvements for primer design incorporated into RExPrimer were demonstrated to be effective in designing primers for challenging PCR experiments. The integration of SNP and structural variation databases allows robust primer design for a variety of PCR applications, irrespective of the sequence complexity in the region of interest. This software is freely available at http://www4a.biotec.or.th/rexprimer. PMID:19958502
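For illustration, here is a minimal Python sketch of one of the checks described above, flagging SNPs that fall inside a candidate primer and especially near its 3' end; the coordinates, window size, and SNP list are hypothetical, and this is not RExPrimer's actual code.

```python
# Illustrative check for SNPs falling inside a candidate primer, with extra
# attention to the 3' end (positions and SNP coordinates are hypothetical).
def snps_in_primer(primer_start, primer_len, snp_positions, three_prime_window=5):
    """Return (all SNPs inside the primer, SNPs within the 3'-end window), 0-based."""
    primer_end = primer_start + primer_len          # exclusive
    inside = [p for p in snp_positions if primer_start <= p < primer_end]
    at_3prime = [p for p in inside if p >= primer_end - three_prime_window]
    return inside, at_3prime

# Forward primer at positions 100-119 on the reference; known SNPs at 105 and 118.
inside, critical = snps_in_primer(100, 20, [50, 105, 118, 300])
print(inside)    # [105, 118]
print(critical)  # [118]  -> likely to cause PCR failure; redesign the primer
```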
2013-01-01
Background Due to the growing number of biomedical entries in the data repositories of the National Center for Biotechnology Information (NCBI), it is difficult for third-party software developers to collect, manage and process all of these entries in one place without significant investment in hardware and software infrastructure, and in its maintenance and administration. Web services allow the development of software applications that integrate in one place the functionality and processing logic of distributed software components, without integrating the components themselves and without integrating the resources to which they have access. This is achieved by appropriate orchestration or choreography of available Web services and their shared functions. After the successful application of Web services in the business sector, this technology can now be used to build composite software tools oriented towards biomedical data processing. Results We have developed a new tool for efficient and dynamic data exploration in GenBank and other NCBI databases. The dedicated search GenBank system makes use of NCBI Web services and the package of Entrez Programming Utilities (eUtils) to provide extended searching capabilities in NCBI data repositories. In search GenBank, users can follow one of three exploration paths: simple data searching based on a specified query, advanced data searching based on a specified query, and advanced data exploration with the use of macros. search GenBank orchestrates calls of particular tools available through the NCBI Web service to provide the requested functionality, while users interactively browse selected records in search GenBank and traverse between NCBI databases using available links. On the other hand, by building macros in the advanced data exploration mode, users create choreographies of eUtils calls, which can lead to the automatic discovery of related data in the specified databases. Conclusions search GenBank extends the standard capabilities of the NCBI Entrez search engine in querying biomedical databases. The possibility of creating and saving macros in search GenBank is a unique feature with great potential, which will only grow as the networks of relationships between data stored in particular databases become denser. search GenBank is available for public use at http://sgb.biotools.pl/. PMID:23452691
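As a rough illustration of the kind of eUtils call choreography described above (not the search GenBank implementation itself), the Python sketch below chains an ESearch call to an ESummary call against the public NCBI eUtils endpoints; the query term and retmax value are arbitrary examples.

```python
# Sketch of a simple eUtils choreography: ESearch for matching IDs, then
# ESummary for record summaries (standard NCBI eUtils endpoints).
import json
import urllib.parse
import urllib.request

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"

def eutils(tool, **params):
    params["retmode"] = "json"
    url = f"{EUTILS}/{tool}.fcgi?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

ids = eutils("esearch", db="nucleotide", term="BRCA1[Gene] AND human[Organism]",
             retmax=5)["esearchresult"]["idlist"]
summaries = eutils("esummary", db="nucleotide", id=",".join(ids))
for uid in ids:
    print(uid, summaries["result"][uid]["title"])
```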
Hardin, Andrew
2017-09-01
In this issue, Bollen and Diamantopoulos (2017) defend causal-formative indicators against several common criticisms leveled by scholars who oppose their use. In doing so, the authors make several convincing assertions: Constructs exist independently from their measures; theory determines whether indicators cause or measure latent variables; and reflective and causal-formative indicators are both subject to interpretational confounding. However, well reasoned and comprehensive as their defense of causal-formative indicators is, no single article can address all of the issues associated with this debate. Thus, Bollen and Diamantopoulos leave a few fundamental issues unresolved. For example, how can researchers establish the reliability of indicators that may include measurement error? Moreover, how should researchers interpret disturbance terms that capture sources of influence related both to the empirical definition of the latent variable and to the theoretical definition of the construct? Relatedly, how should researchers reconcile the requirement for a census of causal-formative indicators with the knowledge that indicators are likely missing from the empirically estimated latent variable? This commentary develops 6 related research questions to draw attention to these fundamental issues and to call for future research that can lead to the development of theory to guide the use of causal-formative indicators. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
A scalable database model for multiparametric time series: a volcano observatory case study
NASA Astrophysics Data System (ADS)
Montalto, Placido; Aliotta, Marco; Cassisi, Carmelo; Prestifilippo, Michele; Cannata, Andrea
2014-05-01
The variables collected by a sensor network constitute a heterogeneous data source that needs to be properly organized in order to be used in research and geophysical monitoring. By the term time series we refer to a set of observations of a given phenomenon acquired sequentially in time. When the time intervals are equally spaced, one speaks of a sampling period or sampling frequency. Our work describes in detail a possible methodology for the storage and management of time series using a specific data structure. We designed a framework, hereinafter called TSDSystem (Time Series Database System), to acquire time series from different data sources and standardize them within a relational database. This standardization provides the ability to perform operations, such as querying and visualization, on many measures, synchronizing them on a common time scale. The proposed architecture follows a multiple-layer paradigm (Loaders layer, Database layer and Business Logic layer). Each layer is specialized in performing particular operations for the reorganization and archiving of data from different sources such as ASCII, Excel, ODBC (Open DataBase Connectivity), and files accessible from the Internet (web pages, XML). In particular, the Loaders layer performs a security check of the working status of each running software component through a heartbeat system, in order to automate the discovery of acquisition issues and other warning conditions. Although our system has to manage huge amounts of data, performance is guaranteed by a smart table-partitioning strategy that keeps the percentage of data stored in each database table balanced. TSDSystem also contains modules for the visualization of acquired data, which make it possible to query different time series over a specified time range, or to follow real-time signal acquisition, according to a per-user data access policy.
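A minimal sketch of the standardization step, assuming pandas and invented series names and sampling rates: two series acquired at different periods are resampled onto one common time scale so they can be queried and visualized together, as TSDSystem does inside the relational database.

```python
# Resample two heterogeneous time series onto a common 60-second time scale
# (pandas-based; signal names and sampling rates are illustrative).
import pandas as pd

seismic = pd.Series([1.0, 1.2, 0.9, 1.1],
                    index=pd.date_range("2014-05-01", periods=4, freq="30s"))
gas_flux = pd.Series([210.0, 215.0],
                     index=pd.date_range("2014-05-01", periods=2, freq="60s"))

# Once both series share a time scale, joint queries and plots become trivial.
common = pd.DataFrame({
    "seismic": seismic.resample("60s").mean(),
    "gas_flux": gas_flux.resample("60s").mean(),
})
print(common)
```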
Real-time intelligent decision making with data mining
NASA Astrophysics Data System (ADS)
Gupta, Deepak P.; Gopalakrishnan, Bhaskaran
2004-03-01
Database mining, widely known as knowledge discovery and data mining (KDD), has attracted a lot of attention in recent years. With the rapid growth of databases in commercial, industrial, administrative and other applications, it is necessary and interesting to extract knowledge automatically from huge amounts of data. Almost all organizations are generating data and information at an unprecedented rate, and they need to get useful information out of this data. Data mining is the extraction of non-trivial, previously unknown and potentially useful patterns, trends, dependences and correlations, known as association rules, among data values in large databases. In the last ten to fifteen years, data mining has spread from one company to another, helping them understand more about their customers' perceptions of quality and responsiveness, and also distinguish the customers they want from those they do not. A credit-card company found that customers who complete their applications in pencil rather than pen are more likely to default. There is a program that identifies callers by purchase history: the bigger the spender, the quicker the call will be answered. If you feel your call is being answered in the order in which it was received, think again. Many algorithms assume that data is static in nature and mine the rules and relations in that data. But for a dynamic database, e.g., in most manufacturing industries, the rules and relations developed among the variables/items no longer hold true. A simple approach may be to re-mine the associations among the variables after every fixed period of time. But again, how long this period should be is a question to be answered. The next problem with static data mining is that some of the relationships that might be of interest from one period to the next may be lost after a new set of data is used. To reflect the effect of a new data set and the current status of the association rules, where some of the strong rules might become weak and vice versa, there is a need for an efficient algorithm that adapts to the current patterns and associations. Some work has been done on developing association rules for incremental databases, but to the best of the author's knowledge no work has been done on periodic cause-and-effect analysis for online association rules in manufacturing industries. The present research attempts to answer these questions and develop an algorithm that can display the association rules online, find the periodic patterns in the data and detect the root cause of the problem.
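To illustrate one simple way such periodic re-mining could work (a sketch, not the dissertation's algorithm), the Python fragment below keeps only a sliding window of recent transactions and re-counts pair supports over it, so rules that weaken under new data drop out; the window size, support threshold, and item names are invented.

```python
# Re-mined pair supports over a sliding window of recent transactions, so that
# rules weakened by new data are dropped (thresholds are illustrative).
from collections import Counter, deque
from itertools import combinations

WINDOW, MIN_SUPPORT = 1000, 0.05

transactions = deque(maxlen=WINDOW)   # only the most recent transactions count

def add_transaction(items):
    transactions.append(frozenset(items))

def frequent_pairs():
    """Re-count pair supports over the current window."""
    counts = Counter(pair for t in transactions
                     for pair in combinations(sorted(t), 2))
    n = len(transactions)
    return {pair: c / n for pair, c in counts.items() if c / n >= MIN_SUPPORT}

for t in [{"vibration_high", "tool_wear"}, {"vibration_high", "tool_wear"},
          {"temp_high"}]:
    add_transaction(t)
print(frequent_pairs())  # {('tool_wear', 'vibration_high'): 0.666...}
```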
Text Mining to Support Gene Ontology Curation and Vice Versa.
Ruch, Patrick
2017-01-01
In this chapter, we explain how text mining can support the curation of molecular biology databases dealing with protein functions. We also show how curated data can play a disruptive role in the development of text mining methods. We review a decade of efforts to improve the automatic assignment of Gene Ontology (GO) descriptors, the reference ontology for the characterization of genes and gene products. To illustrate the high potential of this approach, we compare the performances of automatic text categorizers and show a large improvement of +225% in both precision and recall on benchmarked data. We argue that automatic text categorization functions can ultimately be embedded into a Question-Answering (QA) system to answer questions related to protein functions. Because GO descriptors can be relatively long and specific, traditional QA systems cannot answer such questions. A new type of QA system, so-called Deep QA, which uses machine learning methods trained with curated contents, is thus emerging. Finally, future advances in text mining instruments depend directly on the availability of high-quality annotated contents at every curation step. Database workflows must start recording explicitly all the data they curate and ideally also some of the data they do not curate.
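As a toy illustration of automatic text categorization for GO-style descriptor assignment (not the system evaluated in the chapter), the sketch below trains a TF-IDF plus logistic-regression pipeline on a few invented text snippets labeled with GO identifiers.

```python
# Toy text categorizer assigning GO-style descriptors to short text snippets
# (scikit-learn; training snippets and labels are invented for illustration).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

docs = ["kinase that phosphorylates serine residues",
        "transporter mediating glucose uptake across the membrane",
        "phosphorylation of threonine by this kinase",
        "glucose transport activity at the plasma membrane"]
labels = ["GO:0004674", "GO:0005355", "GO:0004674", "GO:0005355"]

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(docs, labels)
print(model.predict(["serine/threonine kinase activity"]))  # ['GO:0004674']
```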
Power system modeling and optimization methods vis-a-vis integrated resource planning (IRP)
NASA Astrophysics Data System (ADS)
Arsali, Mohammad H.
1998-12-01
The ongoing restructuring of the power industry is changing the fundamental nature of the retail electricity business. As a result, the so-called Integrated Resource Planning (IRP) strategies implemented by electric utilities are also undergoing modifications. Such modifications stem from the need to minimize revenue requirements and maximize electrical system reliability vis-a-vis capacity additions (viewed as potential investments). IRP modifications also provide service-design bases to meet customer needs profitably. The purpose of this research, as presented in this dissertation, is to propose procedures for optimal IRP intended to expand the generation facilities of a power system over an extended period of time. The relevant topics addressed in this research towards IRP optimization are as follows: (1) historical perspective and evolutionary aspects of power system production-costing models and optimization techniques; (2) a survey of major U.S. electric utilities adopting IRP under a changing socioeconomic environment; (3) a new technique, designated the Segmentation Method, for production costing via IRP optimization; (4) construction of a fuzzy relational database of a typical electric power utility system for IRP purposes; (5) a genetic-algorithm-based approach for IRP optimization using the fuzzy relational database.
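To give a flavor of item (5), here is a minimal genetic-algorithm sketch for a toy capacity-expansion choice, assuming a bit-string encoding of build/no-build decisions and entirely hypothetical cost and reliability numbers; the dissertation's actual formulation is far richer.

```python
# Minimal GA for a toy capacity-expansion decision: each gene says whether to
# build a candidate unit; fitness trades reliability credit against capital
# cost. All numbers are hypothetical.
import random

COST = [120, 80, 200, 60]            # capital cost per candidate unit
RELIABILITY = [0.9, 0.5, 1.8, 0.4]   # reliability credit per unit

def fitness(plan):
    return (sum(r for r, g in zip(RELIABILITY, plan) if g)
            - 0.01 * sum(c for c, g in zip(COST, plan) if g))

def evolve(pop_size=20, generations=50):
    pop = [[random.randint(0, 1) for _ in COST] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]
        children = []
        while len(children) < pop_size - len(survivors):
            a, b = random.sample(survivors, 2)
            cut = random.randrange(1, len(COST))   # one-point crossover
            child = a[:cut] + b[cut:]
            if random.random() < 0.1:              # mutation
                i = random.randrange(len(COST))
                child[i] ^= 1
            children.append(child)
        pop = survivors + children
    return max(pop, key=fitness)

print(evolve())
```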
Surate Solaligue, David Emanuel; Hederman, Lucy; Martin, Carmel Mary
2014-08-01
Timely access to general practitioner (GP) care is a recognized strategy for addressing avoidable hospitalization. Little is known about patients seeking planned (decided ahead) and unplanned (decided on the day) GP visits. The Patient Journey Record System (PaJR) provides a biopsychosocial real-time monitoring and support service to chronically ill people and those over 65 who may be at risk of an avoidable hospital admission. This study aims to describe the reported profiles associated with planned and unplanned weekday GP visits in the PaJR database of regular outbound phone calls made by Care Guides to multi-morbid older patients. The study sample comprised 150 consecutively recruited patients with one or more chronic conditions (including chronic obstructive pulmonary disease, heart/vascular disease, heart failure and/or diabetes) and one or more hospital admissions in the previous year, drawn from hospital discharge, out-of-hours care and GP practices. Using a semistructured script, Care Guides telephoned the patients approximately every three weekdays and entered call data into the PaJR database in 2011. The PaJR project identified and prompted unplanned visits according to its algorithms. Logistic regression modelling and descriptive statistics identified significant predictors of planned and unplanned visits and patterns of weekday GP visits reported in calls. In 5096 telephone calls, unplanned versus planned GP visits were predicted by a change in health state, significant symptom concerns, poor self-rated health, bodily pain and concerns about a caregiver or intimates. Calls not reporting visits had significantly fewer of these features. Planned visits were associated with general and medication concerns, reduced social participation and feeling down. Planned visits were highest on Mondays and trended downward toward Fridays. Unplanned visits were reported at the same rate each weekday and more frequently when the interval between calls was ≥3 days. The PaJR project Care Guides advised patients to make unplanned visits in 6.3% of calls and advised planned GP visits in 2.5% of calls. In this study of older multi-morbid patients in general practice, monitored by regular calls about every three days, unplanned GP visits consistently indicated a significant change to worse health, while planned visits presented less acuity. Assessing and predicting acuity in older multi-morbid patients appears to be a promising strategy for improving access to primary care, and thus for reducing avoidable hospital utilization. Further research is needed to investigate the topic on a wider scale. © 2014 John Wiley & Sons, Ltd.
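As a schematic of the logistic-regression modelling mentioned above (with invented data, not the study's), the sketch below fits a classifier on binary call features resembling the reported predictors and outputs a probability of an unplanned visit.

```python
# Toy logistic regression predicting unplanned GP visits from call features
# (feature names mirror the reported predictors; the data rows are invented).
from sklearn.linear_model import LogisticRegression

# [health_state_worse, symptom_concern, poor_self_rated_health, bodily_pain]
X = [[1, 1, 1, 0], [0, 0, 0, 0], [1, 0, 1, 1],
     [0, 1, 0, 0], [1, 1, 0, 1], [0, 0, 1, 0]]
y = [1, 0, 1, 0, 1, 0]  # 1 = unplanned GP visit reported in the call

model = LogisticRegression().fit(X, y)
print(model.predict_proba([[1, 1, 1, 1]])[0, 1])  # probability of an unplanned visit
```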
Chang, Larry William; Kagaayi, Joseph; Nakigozi, Gertrude; Galiwango, Ronald; Mulamba, Jeremiah; Ludigo, James; Ruwangula, Andrew; Gray, Ronald H; Quinn, Thomas C; Bollinger, Robert C; Reynolds, Steven J
2008-01-01
Hotlines and warmlines have been successfully used in the developed world to provide clinical advice; however, reports on their replicability in resource-limited settings are limited. A warmline was established in Rakai, Uganda, to support an antiretroviral therapy program. Over a 17-month period, a database was kept of who called, why they called, and the result of the call. A program evaluation was also administered to clinical staff. A total of 1303 calls (3.5 calls per weekday) were logged. The warmline was used mostly by field staff and peripherally based peer health workers. Calls addressed important clinical issues, including the need for urgent care, medication side effects, and follow-up needs. Most clinical staff felt that the warmline made their jobs easier and improved the health of patients. An HIV/AIDS warmline leveraged the skills of a limited workforce to provide increased access to HIV/AIDS care, advice, and education.
An evaluation of medical knowledge contained in Wikipedia and its use in the LOINC database.
Friedlin, Jeff; McDonald, Clement J
2010-01-01
The Logical Observation Identifiers Names and Codes (LOINC) database contains 55 000 terms consisting of more atomic components called parts. LOINC carries more than 18 000 distinct parts. It is necessary to have definitions/descriptions for each of these parts to assist users in mapping local laboratory codes to LOINC. It is believed that much of this information can be obtained from the internet; the first effort was with Wikipedia. This project focused on 1705 laboratory analytes (the analyte being the first part in the LOINC laboratory name). Of the 1705 parts queried, 1314 matching articles were found in Wikipedia. Of these, 1299 (98.9%) were perfect matches that exactly described the LOINC part, 15 (1.14%) were partial matches (the description in Wikipedia was related to the LOINC part but did not describe it fully), and 102 (7.76%) were mismatches. The current releases of RELMA and LOINC include Wikipedia descriptions of LOINC parts obtained as a direct result of this project.
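A minimal sketch of the match classification used in this kind of review follows, with an invented keyword heuristic standing in for the project's manual judgment of perfect/partial/mismatch.

```python
# Illustrative classification of a Wikipedia lookup for a LOINC part into
# perfect match / partial match / mismatch (the heuristic is invented; the
# project used human review).
def classify_match(part_name, article_title, article_text):
    title_hit = part_name.lower() == article_title.lower()
    text_hit = part_name.lower() in article_text.lower()
    if title_hit and text_hit:
        return "perfect"    # article exactly describes the LOINC part
    if text_hit:
        return "partial"    # related, but does not fully describe the part
    return "mismatch"

print(classify_match("Hemoglobin", "Hemoglobin",
                     "Hemoglobin is the oxygen-transport protein..."))  # perfect
```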
The Biomolecular Crystallization Database Version 4: expanded content and new features.
Tung, Michael; Gallagher, D Travis
2009-01-01
The Biological Macromolecular Crystallization Database (BMCD) has been a publicly available resource since 1988, providing a curated archive of information on crystal growth for proteins and other biological macromolecules. The BMCD content has recently been expanded to include 14 372 crystal entries. The resource continues to be freely available at http://xpdb.nist.gov:8060/BMCD4. In addition, the software has been adapted to support the Java-based Lucene query language, enabling detailed searching over specific parameters, and explicit search of parameter ranges is offered for five numeric variables. Extensive tools have been developed for import and handling of data from the RCSB Protein Data Bank. The updated BMCD is called version 4.02 or BMCD4. BMCD4 entries have been expanded to include macromolecule sequence, enabling more elaborate analysis of relations among protein properties, crystal-growth conditions and the geometric and diffraction properties of the crystals. The BMCD version 4.02 contains greatly expanded content and enhanced search capabilities to facilitate scientific analysis and design of crystal-growth strategies.
1986 Year End Report for Road Following at Carnegie-Mellon
1987-05-01
... how to make them work efficiently. We designed a hierarchical structure and a monitor module which manages all parts of the hierarchy (see figure 1) ... database, called the Local Map, is managed by a program known as the Local Map Builder (LMB). Each module stores and retrieves information in the ... knowledge-intensive modules, and a database manager that synchronizes the modules-is characteristic of a traditional blackboard system. Such a system is ...
An Introduction to MAMA (Meta-Analysis of MicroArray data) System.
Zhang, Zhe; Fenstermacher, David
2005-01-01
Analyzing microarray data across multiple experiments has been proven advantageous. To support this kind of analysis, we are developing a software system called MAMA (Meta-Analysis of MicroArray data). MAMA utilizes a client-server architecture with a relational database on the server side for the storage of microarray datasets collected from various resources. The client side is an application running on the end user's computer that allows the user to manipulate microarray data and analytical results locally. MAMA's implementation will integrate several analytical methods, including meta-analysis, within an open-source framework offering other developers the flexibility to plug in additional statistical algorithms.
Delaney, Aogán; Tamás, Peter A
2018-03-01
Despite recognition that database search alone is inadequate even within the health sciences, it appears that reviewers in fields that have adopted systematic review are choosing to rely primarily, or only, on database search for information retrieval. This commentary reminds readers of factors that call into question the appropriateness of default reliance on database searches particularly as systematic review is adapted for use in new and lower consensus fields. It then discusses alternative methods for information retrieval that require development, formalisation, and evaluation. Our goals are to encourage reviewers to reflect critically and transparently on their choice of information retrieval methods and to encourage investment in research on alternatives. Copyright © 2017 John Wiley & Sons, Ltd.
Carrault, G; Cordier, M-O; Quiniou, R; Wang, F
2003-07-01
This paper proposes a novel approach to cardiac arrhythmia recognition from electrocardiograms (ECGs). ECGs record the electrical activity of the heart and are used to diagnose many heart disorders. The numerical ECG is first temporally abstracted into a series of time-stamped events. Temporal abstraction makes use of artificial neural networks to extract interesting waves and their features from the input signals. A temporal reasoner called a chronicle recogniser processes such series in order to discover temporal patterns called chronicles, which can be related to cardiac arrhythmias. Generally, it is difficult to elicit an accurate set of chronicles from a doctor. Thus, we propose to learn automatically, from symbolic ECG examples, the chronicles discriminating the arrhythmias belonging to some specific subset. Since temporal relationships are of major importance, inductive logic programming (ILP) is the tool of choice, as it enables first-order relational learning. The approach has been evaluated on real ECGs taken from the MIT-BIH database. The performance of the different modules as well as the efficiency of the whole system are presented. The results are rather good and demonstrate that integrating numerical techniques for low-level perception and symbolic techniques for high-level classification is very valuable.
Modeling Free Energies of Solvation in Olive Oil
Chamberlin, Adam C.; Levitt, David G.; Cramer, Christopher J.; Truhlar, Donald G.
2009-01-01
Olive oil partition coefficients are useful for modeling the bioavailability of drug-like compounds. We have recently developed an accurate solvation model called SM8 for aqueous and organic solvents (Marenich, A. V.; Olson, R. M.; Kelly, C. P.; Cramer, C. J.; Truhlar, D. G. J. Chem. Theory Comput. 2007, 3, 2011) and a temperature-dependent solvation model called SM8T for aqueous solution (Chamberlin, A. C.; Cramer, C. J.; Truhlar, D. G. J. Phys. Chem. B 2008, 112, 3024). Here we describe an extension of SM8T to predict air–olive oil and water–olive oil partitioning for drug-like solutes as functions of temperature. We also describe the database of experimental partition coefficients used to parameterize the model; this database includes 371 entries for 304 compounds spanning the 291–310 K temperature range. PMID:19434923
Mining Co-Location Patterns with Clustering Items from Spatial Data Sets
NASA Astrophysics Data System (ADS)
Zhou, G.; Li, Q.; Deng, G.; Yue, T.; Zhou, X.
2018-05-01
The explosive growth of spatial data and the widespread use of spatial databases emphasize the need for spatial data mining. Co-location pattern discovery is an important branch of spatial data mining. Spatial co-locations represent subsets of features that are frequently located together in geographic space. However, the appearance of a spatial feature C is often determined not by a single spatial feature A or B but by the two spatial features A and B together; that is to say, where A and B appear together, C often appears. We note that this co-location pattern differs from the traditional co-location pattern. Thus, this paper presents a new concept called clustering items, and this kind of co-location pattern is called a co-location pattern with clustering items. Since traditional algorithms cannot mine this co-location pattern, we introduce the related concepts in detail and propose a novel algorithm. The algorithm extends the join-based approach proposed by Huang. Finally, we evaluate the performance of this algorithm.
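As an illustration of the pattern (a brute-force toy, not the join-based algorithm of the paper), the sketch below measures how often feature C appears near a site where features A and B co-occur; the coordinates and distance threshold are invented.

```python
# Toy check of a co-location pattern with clustering items {A, B} -> C:
# count how often C appears within distance d of an A-B co-occurrence.
from math import hypot

def near(p, q, d=1.0):
    return hypot(p[0] - q[0], p[1] - q[1]) <= d

def confidence_AB_implies_C(points_a, points_b, points_c, d=1.0):
    """Fraction of A-B co-occurrence sites that also have a C nearby."""
    co_sites = [a for a in points_a if any(near(a, b, d) for b in points_b)]
    if not co_sites:
        return 0.0
    with_c = sum(1 for s in co_sites if any(near(s, c, d) for c in points_c))
    return with_c / len(co_sites)

A = [(0, 0), (5, 5)]; B = [(0.5, 0), (5, 5.5)]; C = [(0, 0.5)]
print(confidence_AB_implies_C(A, B, C))  # 0.5: C appears near half the A-B sites
```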
Corcoran, Callan C.; Grady, Cameron R.; Pisitkun, Trairak; Parulekar, Jaya
2017-01-01
The organization of the mammalian genome into gene subsets corresponding to specific functional classes has provided key tools for systems biology research. Here, we have created a web-accessible resource called the Mammalian Metabolic Enzyme Database (https://hpcwebapps.cit.nih.gov/ESBL/Database/MetabolicEnzymes/MetabolicEnzymeDatabase.html) keyed to the biochemical reactions represented on iconic metabolic pathway wall charts created in the previous century. Overall, we have mapped 1,647 genes to these pathways, representing ~7 percent of the protein-coding genome. To illustrate the use of the database, we apply it to the area of kidney physiology. In so doing, we have created an additional database (Database of Metabolic Enzymes in Kidney Tubule Segments: https://hpcwebapps.cit.nih.gov/ESBL/Database/MetabolicEnzymes/), mapping mRNA abundance measurements (mined from RNA-Seq studies) for all metabolic enzymes to each of 14 renal tubule segments. We carry out bioinformatics analysis of the enzyme expression pattern among renal tubule segments and mine various data sources to identify vasopressin-regulated metabolic enzymes in the renal collecting duct. PMID:27974320
Structure and software tools of AIDA.
Duisterhout, J S; Franken, B; Witte, F
1987-01-01
AIDA consists of a set of software tools to allow for the fast development of easy-to-maintain Medical Information Systems. AIDA supports all aspects of such a system, both during development and in operation. It contains tools to build and maintain forms for interactive data entry and on-line input validation, a database management system including a data dictionary and a set of run-time routines for database access, and routines for querying the database and formatting output. Unlike an application generator, the user of AIDA may select parts of the tools to fulfill his needs and program other subsystems not developed with AIDA. The AIDA software uses as its host language the ANSI-standard programming language MUMPS, an interpreted language embedded in an integrated database and programming environment. This greatly facilitates the portability of AIDA applications. The database facilities supported by AIDA are based on a relational data model. This data model is built on top of the MUMPS database, the so-called global structure. The relational model overcomes the restrictions of the global structure regarding string length, while the global structure is especially powerful for sorting purposes. Using MUMPS as a host language gives the user an easy interface between user-defined data validation checks or other user-defined code and the AIDA tools. AIDA has been designed primarily for prototyping and for the construction of Medical Information Systems in a research environment, which requires a flexible approach. The prototyping facility of AIDA is terminal-independent and even, to a great extent, multilingual. Most of these features are table-driven; this allows on-line changes in terminal type and language, but also causes overhead. AIDA provides a set of optimizing tools that make it possible to build faster, but (of course) less flexible, code from these table definitions. By separating the AIDA software into a source and a run-time version, one is able to write implementation-specific code that can be selected and loaded by a special source loader, which is part of the AIDA software. This feature is also useful for maintaining software at different sites and on different installations.
ERIC Educational Resources Information Center
Goodgion, Laurel; And Others
1986-01-01
Eight articles in special supplement to "Library Journal" and "School Library Journal" cover a computer program called "Byte into Books"; microcomputers and the small library; creating databases with students; online searching with a microcomputer; quality automation software; Meckler Publishing Company's…
Cowan, Nelson
2015-07-01
Miller's (1956) article about storage capacity limits, "The Magical Number Seven Plus or Minus Two . . .," is one of the best-known articles in psychology. Though influential in several ways, for about 40 years it was oddly followed by rather little research on the numerical limit of capacity in working memory, or on the relation between 3 potentially related phenomena that Miller described. Given that the article was written in a humorous tone and was framed around a tongue-in-cheek premise (persecution by an integer), I argue that it may have inadvertently stymied progress on these topics as researchers attempted to avoid ridicule. This commentary relates some correspondence with Miller on his article and concludes with a call to avoid self-censorship of our less conventional ideas. (PsycINFO Database Record (c) 2015 APA, all rights reserved).
DOE Office of Scientific and Technical Information (OSTI.GOV)
Langan, Roisin T.; Archibald, Richard K.; Lamberti, Vincent
We have applied a new imputation-based method for analyzing incomplete data, called Monte Carlo Bayesian Database Generation (MCBDG), to the Spent Fuel Isotopic Composition (SFCOMPO) database. About 60% of the entries are absent from SFCOMPO. The method estimates missing values of a property from a probability distribution created from the existing data for that property, and then generates multiple instances of the completed database for training a machine learning algorithm. Uncertainty in the data is represented by an empirical or an assumed error distribution. The method makes few assumptions about the underlying data, and compares favorably against results obtained by replacing missing information with constant values.
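A minimal sketch of the core generation step under simplifying assumptions (sampling independently per column from the empirical distribution, with invented property names; the actual MCBDG method also models error distributions):

```python
# Monte Carlo database generation sketch: sample each missing value from the
# empirical distribution of its column, producing multiple completed copies
# of the table for downstream training (data are illustrative).
import random

def generate_completed(table, n_copies=10, rng=random.Random(0)):
    """table: list of dicts; None marks a missing entry."""
    columns = {k for row in table for k in row}
    observed = {c: [r[c] for r in table if r.get(c) is not None] for c in columns}
    copies = []
    for _ in range(n_copies):
        # Assumes every column has at least one observed value to sample from.
        copies.append([{c: (r[c] if r.get(c) is not None else rng.choice(observed[c]))
                        for c in columns} for r in table])
    return copies

fuel = [{"burnup": 30.1, "u235": 1.2}, {"burnup": None, "u235": 0.9},
        {"burnup": 45.0, "u235": None}]
print(generate_completed(fuel, n_copies=2))
```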
Pan Air Geometry Management System (PAGMS): A data-base management system for PAN AIR geometry data
NASA Technical Reports Server (NTRS)
Hall, J. F.
1981-01-01
A data-base management system called PAGMS was developed to facilitate the data transfer in applications computer programs that create, modify, plot or otherwise manipulate PAN AIR type geometry data in preparation for input to the PAN AIR system of computer programs. PAGMS is composed of a series of FORTRAN callable subroutines which can be accessed directly from applications programs. Currently only a NOS version of PAGMS has been developed.
Environmental distribution, abundance and activity of the Miscellaneous Crenarchaeotal Group
NASA Astrophysics Data System (ADS)
Lloyd, K. G.; Biddle, J.; Teske, A.
2011-12-01
Many marine sedimentary microbes have only been identified by 16S rRNA sequences. Consequently, little is known about the types of metabolism, activity levels, or relative abundance of these groups in marine sediments. We found that one of these uncultured groups, called the Miscellaneous Crenarchaeotal Group (MCG), dominated clone libraries made from reverse-transcribed 16S rRNA and 454-pyrosequenced 16S rRNA genes in the White Oak River estuary. Primers suitable for quantitative PCR were developed for MCG and used to show that MCG 16S rRNA gene copy numbers account for nearly all the archaeal 16S rRNA genes present. RT-qPCR shows much less MCG rRNA than total archaeal rRNA, but comparisons of different primers for each group suggest bias in the RNA-based work relative to the DNA-based work. There is no evidence of a population shift with depth below the sulfate-methane transition zone, suggesting that the metabolism of MCG may not be tied to the sulfur or methane cycles. We classified 2,771 new sequences within the SSU Silva 106 database; these, along with the classified sequences already in the Silva database, were used to build an MCG database of 4,646 sequences that allowed us to increase the number of named MCG subgroups from 7 to 19. The percentage of terrestrial sequences in each subgroup is positively correlated with the percentage of that subgroup's marine sequences that are nearshore, suggesting that membership in the different subgroups is not random but dictated by environmental selective pressures. Given their high phylogenetic diversity, ubiquitous distribution in anoxic environments, and high DNA copy number relative to total archaea, members of MCG are most likely anaerobic heterotrophs integral to the post-depositional marine carbon cycle.
An integrative relational point of view.
Wachtel, Paul L
2014-09-01
This article, part of a special section on the Relational Foundations of Psychotherapy, describes a particular relational approach called cyclical psychodynamics. Cyclical psychodynamics is rooted both in the relational perspective in psychoanalysis and in an integrative melding of psychodynamic, cognitive-behavioral, systemic, and experiential points of view. Central to its theoretical structure is a focus on the vicious and virtuous circles that perpetuate (or contribute to changing) personality patterns that may have originated in childhood but that persist because they often generate the very feedback from others that is necessary to keep them going. As a consequence of this latter focus, the relational foundation of cyclical psychodynamic therapy addresses in equal and dynamically reciprocal fashion both the therapeutic relationship in the consulting room and the key relationships outside the consulting room that play an essential role in the maintenance or change of the problematic patterns the person has come to therapy to work on. PsycINFO Database Record (c) 2014 APA, all rights reserved.
A framework for cross-observatory volcanological database management
NASA Astrophysics Data System (ADS)
Aliotta, Marco Antonio; Amore, Mauro; Cannavò, Flavio; Cassisi, Carmelo; D'Agostino, Marcello; Dolce, Mario; Mastrolia, Andrea; Mangiagli, Salvatore; Messina, Giuseppe; Montalto, Placido; Fabio Pisciotta, Antonino; Prestifilippo, Michele; Rossi, Massimo; Scarpato, Giovanni; Torrisi, Orazio
2017-04-01
In recent years, it has been clearly shown that the multiparametric approach is the winning strategy for investigating the complex dynamics of volcanic systems. This involves the use of different sensor networks, each one dedicated to the acquisition of particular data useful for research and monitoring. The increasing interest devoted to the study of volcanological phenomena has led to the constitution of different research organizations or observatories, sometimes covering the same volcanoes, which acquire large amounts of data from sensor networks for multiparametric monitoring. At INGV we developed a framework, hereinafter called TSDSystem (Time Series Database System), which makes it possible to acquire data streams from several geophysical and geochemical permanent sensor networks (also represented by different data sources such as ASCII, ODBC, URL etc.), located in the main volcanic areas of Southern Italy, and relate them within a relational database management system. Furthermore, spatial data related to the different datasets are managed using a GIS module for sharing and visualization purposes. The standardization provides the ability to perform operations, such as querying and visualization, on many measures, synchronizing them using a common space and time scale. In order to share data between INGV observatories, and also with the Civil Protection department, whose activity covers the same volcanic districts, we designed a "Master View" system that, starting from a number of instances of the TSDSystem framework (one for each observatory), makes possible the joint interrogation of data, both temporal and spatial, on instances located in different observatories, through the use of web service technology (RESTful, SOAP). Similarly, it provides metadata for equipment using standard schemas (such as FDSN StationXML). The "Master View" is also responsible for managing the data policy through a "who owns what" system, which allows viewing/download rights for particular spatial or time intervals to be associated with specific users or groups.
Techniques for Efficiently Managing Large Geosciences Data Sets
NASA Astrophysics Data System (ADS)
Kruger, A.; Krajewski, W. F.; Bradley, A. A.; Smith, J. A.; Baeck, M. L.; Steiner, M.; Lawrence, R. E.; Ramamurthy, M. K.; Weber, J.; Delgreco, S. A.; Domaszczynski, P.; Seo, B.; Gunyon, C. A.
2007-12-01
We have developed techniques and software tools for efficiently managing large geosciences data sets. While the techniques were developed as part of an NSF-funded ITR project that focuses on making NEXRAD weather radar data and rainfall products available to hydrologists and other scientists, they are relevant to other geosciences disciplines that deal with large data sets. Metadata, relational databases, data compression, and networking are central to our methodology. Data and derived products are stored on file servers in a compressed format. URLs to, and metadata about, the data and derived products are managed in a PostgreSQL database. Virtually all access to the data and products is through this database. Geosciences data normally require a number of processing steps to transform the raw data into useful products: data quality assurance, coordinate transformations and georeferencing, applying calibration information, and many more. We have developed the concept of crawlers that manage this scientific workflow. Crawlers are unattended processes that run indefinitely and at set intervals query the database for their next assignment. A database table functions as a roster for the crawlers. Crawlers perform well-defined tasks that are, except perhaps for sequencing, largely independent of other crawlers. Once a crawler is done with its current assignment, it updates the database roster table and gets its next assignment by querying the database. We have developed a library that enables one to add crawlers quickly. The library provides hooks to external (i.e., C-language) compiled codes, so that developers can work and contribute independently. Processes called ingesters inject data into the system. The bulk of the data comes from a real-time feed using UCAR/Unidata's IDD/LDM software. An exciting recent development is the establishment of a Unidata HYDRO feed that carries value-added metadata over the IDD/LDM. Ingesters grab the metadata and populate the PostgreSQL tables. These and other concepts we have developed have enabled us to efficiently manage a 70 TB (and growing) weather radar data set.
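A minimal sketch of the crawler/roster pattern described above, using sqlite3 in place of PostgreSQL and hypothetical table and column names:

```python
# Crawler/roster sketch: an unattended process polls a roster table for its
# next assignment, does the work, and reports back (sqlite3 stands in for
# PostgreSQL; table and column names are hypothetical).
import sqlite3
import time

db = sqlite3.connect("workflow.db")
db.execute("""CREATE TABLE IF NOT EXISTS roster
              (id INTEGER PRIMARY KEY, task TEXT, status TEXT DEFAULT 'pending')""")

def process(task):
    print("processing", task)  # e.g., georeference one radar volume

def crawler_loop(poll_seconds=60):
    while True:  # crawlers run indefinitely
        row = db.execute(
            "SELECT id, task FROM roster WHERE status = 'pending' LIMIT 1").fetchone()
        if row:
            task_id, task = row
            db.execute("UPDATE roster SET status = 'running' WHERE id = ?", (task_id,))
            db.commit()
            process(task)
            db.execute("UPDATE roster SET status = 'done' WHERE id = ?", (task_id,))
            db.commit()
        else:
            time.sleep(poll_seconds)  # idle until the roster has work
```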
Soranno, Patricia A; Bissell, Edward G; Cheruvelil, Kendra S; Christel, Samuel T; Collins, Sarah M; Fergus, C Emi; Filstrup, Christopher T; Lapierre, Jean-Francois; Lottig, Noah R; Oliver, Samantha K; Scott, Caren E; Smith, Nicole J; Stopyak, Scott; Yuan, Shuai; Bremigan, Mary Tate; Downing, John A; Gries, Corinna; Henry, Emily N; Skaff, Nick K; Stanley, Emily H; Stow, Craig A; Tan, Pang-Ning; Wagner, Tyler; Webster, Katherine E
2015-01-01
Although there are considerable site-based data for individual or groups of ecosystems, these datasets are widely scattered, have different data formats and conventions, and often have limited accessibility. At the broader scale, national datasets exist for a large number of geospatial features of land, water, and air that are needed to fully understand variation among these ecosystems. However, such datasets originate from different sources and have different spatial and temporal resolutions. By taking an open-science perspective and by combining site-based ecosystem datasets and national geospatial datasets, science gains the ability to ask important research questions related to grand environmental challenges that operate at broad scales. Documentation of such complicated database integration efforts, through peer-reviewed papers, is recommended to foster reproducibility and future use of the integrated database. Here, we describe the major steps, challenges, and considerations in building an integrated database of lake ecosystems, called LAGOS (LAke multi-scaled GeOSpatial and temporal database), that was developed at the sub-continental study extent of 17 US states (1,800,000 km2). LAGOS includes two modules: LAGOSGEO, with geospatial data on every lake with surface area larger than 4 ha in the study extent (~50,000 lakes), including climate, atmospheric deposition, land use/cover, hydrology, geology, and topography measured across a range of spatial and temporal extents; and LAGOSLIMNO, with lake water quality data compiled from ~100 individual datasets for a subset of lakes in the study extent (~10,000 lakes). Procedures for the integration of datasets included: creating a flexible database design; authoring and integrating metadata; documenting data provenance; quantifying spatial measures of geographic data; quality-controlling integrated and derived data; and extensively documenting the database. Our procedures make a large, complex, and integrated database reproducible and extensible, allowing users to ask new research questions with the existing database or through the addition of new data. The largest challenge of this task was the heterogeneity of the data, formats, and metadata. Many steps of data integration need manual input from experts in diverse fields, requiring close collaboration.
How Does Hamas End: A Historical Overview And Where The Future Leads
2014-04-10
26 September 2013, http://www.jpost.com/Middle-East/Hamas-Islamic-Jihad-call-for-a-third-intifada-327202 (accessed 6 April 2014). ... Interview with a ... University of Maryland, Global Terrorism Database. ... From a religious standpoint all three monotheistic religions, Judaism, Christianity, and Islam claim a common patriarch in Abraham, who settled in what is modern ...
Virus Database and Online Inquiry System Based on Natural Vectors.
Dong, Rui; Zheng, Hui; Tian, Kun; Yau, Shek-Chung; Mao, Weiguang; Yu, Wenping; Yin, Changchuan; Yu, Chenglong; He, Rong Lucy; Yang, Jie; Yau, Stephen St
2017-01-01
We construct a virus database called VirusDB (http://yaulab.math.tsinghua.edu.cn/VirusDB/) and an online inquiry system to serve people who are interested in viral classification and prediction. The database stores all viral genomes, their corresponding natural vectors, and the classification information of the single/multiple-segmented viral reference sequences downloaded from the National Center for Biotechnology Information. The online inquiry system serves the purpose of computing natural vectors and their distances based on submitted genomes, providing an online interface for accessing and using the database for viral classification and prediction, and running back-end processes for automatic and manual updating of database content to synchronize with GenBank. Genome data submitted in FASTA format are processed, and the prediction results, with the 5 closest neighbors and their classifications, are returned by email. Considering the one-to-one correspondence between sequence and natural vector, its time efficiency, and its high accuracy, the natural vector method is a significant advance over alignment methods, which makes VirusDB a useful database for further research.
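For illustration, a sketch of a 12-dimensional natural vector (count, mean position, and a normalized second central moment per nucleotide) follows; normalization conventions vary across papers, so treat this as indicative rather than VirusDB's exact formula.

```python
# Sketch of a 12-dimensional natural vector for a DNA sequence: for each
# nucleotide, the count, the mean position, and a normalized second central
# moment (following the general form of the natural vector method).
def natural_vector(seq):
    seq = seq.upper()
    n = len(seq)
    vector = []
    for base in "ACGT":
        positions = [i + 1 for i, ch in enumerate(seq) if ch == base]
        k = len(positions)
        mu = sum(positions) / k if k else 0.0
        d2 = sum((p - mu) ** 2 for p in positions) / (k * n) if k else 0.0
        vector.extend([k, mu, d2])
    return vector

print(natural_vector("ACGTACGT"))
```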
Prakash, Peralam Yegneswaran; Irinyi, Laszlo; Halliday, Catriona; Chen, Sharon; Robert, Vincent; Meyer, Wieland
2017-04-01
The increase in public online databases dedicated to fungal identification is noteworthy. This can be attributed to improved access to molecular approaches to characterize fungi, as well as to delineate species within specific fungal groups in the last 2 decades, leading to an ever-increasing complexity of taxonomic assortments and nomenclatural reassignments. Thus, well-curated fungal databases with substantial accurate sequence data play a pivotal role for further research and diagnostics in the field of mycology. This minireview aims to provide an overview of currently available online databases for the taxonomy and identification of human and animal-pathogenic fungi and calls for the establishment of a cloud-based dynamic data network platform. Copyright © 2017 American Society for Microbiology.
Construction of In-house Databases in a Corporation
NASA Astrophysics Data System (ADS)
Senoo, Tetsuo
As computer technology, communication technology and related fields have progressed, many corporations have come to place the construction and use of their own databases at the center of their information activities, and aim to develop those activities in new directions. This paper considers how information management in a corporation is affected by changing management and technology environments, and clarifies and generalizes what in-house databases should be constructed and utilized, from the viewpoints of the requirements to be met, the types and forms of information to be handled, indexing, usage type and frequency, evaluation methods, and so on. The author outlines an information system at Matsushita called MATIS (Matsushita Technical Information System) as a concrete example, and describes its present status and some points to keep in mind in constructing and utilizing the REP, BOOK and SYMP databases.
Classifying environmental pollutants: Part 3. External validation of the classification system.
Verhaar, H J; Solbé, J; Speksnijder, J; van Leeuwen, C J; Hermens, J L
2000-04-01
In order to validate a classification system for predicting the toxic effect concentrations of organic environmental pollutants to fish, all available fish acute toxicity data were retrieved from the ECETOC database, a database of quality-evaluated aquatic toxicity measurements created and maintained by the European Centre for the Ecotoxicology and Toxicology of Chemicals. The individual chemicals for which these data were available were classified according to the rulebase under consideration, and predictions of effect concentrations or ranges of possible effect concentrations were generated. These predictions were compared to the actual toxicity data retrieved from the database. The results of this comparison show that, generally, the classification system provides adequate predictions of either the aquatic toxicity (class 1) or the possible range of toxicity (other classes) of organic compounds. A slight underestimation of effect concentrations occurs for some highly water-soluble, reactive chemicals with low log Kow values. At the other end of the scale, some compounds that are classified as belonging to a relatively toxic class appear to behave as so-called baseline toxicity compounds. For some of these, additional classification rules are proposed. Furthermore, some groups of compounds cannot be classified, although they should be amenable to predictions. For these compounds, additional research into class membership and the associated prediction rules is proposed.
The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST)
Overbeek, Ross; Olson, Robert; Pusch, Gordon D.; Olsen, Gary J.; Davis, James J.; Disz, Terry; Edwards, Robert A.; Gerdes, Svetlana; Parrello, Bruce; Shukla, Maulik; Vonstein, Veronika; Wattam, Alice R.; Xia, Fangfang; Stevens, Rick
2014-01-01
In 2004, the SEED (http://pubseed.theseed.org/) was created to provide consistent and accurate genome annotations across thousands of genomes and as a platform for discovering and developing de novo annotations. The SEED is a constantly updated integration of genomic data with a genome database, web front end, API and server scripts. It is used by many scientists for predicting gene functions and discovering new pathways. In addition to being a powerful database for bioinformatics research, the SEED also houses subsystems (collections of functionally related protein families) and their derived FIGfams (protein families), which represent the core of the RAST annotation engine (http://rast.nmpdr.org/). When a new genome is submitted to RAST, genes are called and their annotations are made by comparison to the FIGfam collection. If the genome is made public, it is then housed within the SEED and its proteins populate the FIGfam collection. This annotation cycle has proven to be a robust and scalable solution to the problem of annotating the exponentially increasing number of genomes. To date, >12 000 users worldwide have annotated >60 000 distinct genomes using RAST. Here we describe the interconnectedness of the SEED database and RAST, the RAST annotation pipeline and updates to both resources. PMID:24293654
A Java API for working with PubChem datasets.
Southern, Mark R; Griffin, Patrick R
2011-03-01
PubChem is a public repository of chemical structures and associated biological activities. The PubChem BioAssay database contains assay descriptions, conditions and readouts and biological screening results that have been submitted by the biomedical research community. The PubChem web site and Power User Gateway (PUG) web service allow users to interact with the data and raw files are available via FTP. These resources are helpful to many but there can also be great benefit by using a software API to manipulate the data. Here, we describe a Java API with entity objects mapped to the PubChem Schema and with wrapper functions for calling the NCBI eUtilities and PubChem PUG web services. PubChem BioAssays and associated chemical compounds can then be queried and manipulated in a local relational database. Features include chemical structure searching and generation and display of curve fits from stored dose-response experiments, something that is not yet available within PubChem itself. The aim is to provide researchers with a fast, consistent, queryable local resource from which to manipulate PubChem BioAssays in a database agnostic manner. It is not intended as an end user tool but to provide a platform for further automation and tools development. http://code.google.com/p/pubchemdb.
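The "queryable local resource" idea above can be sketched in a few lines. The following is a minimal illustration only: the table layout (aid/cid/outcome/potency) is hypothetical, not the PubChemDB schema, and Python's built-in sqlite3 stands in for whatever relational engine a reader might use.

```python
import sqlite3

# Hypothetical local mirror of a few bioassay fields; this is NOT the
# PubChemDB schema, just an illustration of a queryable local store.
con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE bioassay (
    aid INTEGER,        -- assay identifier
    cid INTEGER,        -- compound identifier
    outcome TEXT,       -- e.g. 'Active' / 'Inactive'
    potency_um REAL)""")
con.execute("INSERT INTO bioassay VALUES (?, ?, ?, ?)",
            (1000, 2244, "Active", 1.3))
for row in con.execute(
        "SELECT cid, potency_um FROM bioassay WHERE outcome = 'Active'"):
    print(row)   # -> (2244, 1.3)
```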
Tai, David; Fang, Jianwen
2012-08-27
The large sizes of today's chemical databases require efficient algorithms to perform similarity searches, and comparing two large chemical databases can be very time consuming. This paper builds upon existing research efforts by describing a novel strategy for accelerating existing search algorithms when comparing large chemical collections. The quest for efficiency has focused on developing better indexing algorithms, creating heuristics for searching an individual chemical against a chemical library by detecting and eliminating needless similarity calculations. For comparing two chemical collections, these algorithms simply execute searches for each chemical in the query set sequentially. The strategy presented in this paper achieves a speedup over these algorithms by indexing the set of all query chemicals, so that redundant calculations arising from sequential searches are eliminated. We implement this novel algorithm in a similarity search program called Symmetric inDexing, or SymDex. SymDex shows over a 232% maximum speedup compared to the state-of-the-art single-query search algorithm on real data for various fingerprint lengths. Considerable speedup is seen even for batch searches where query set sizes are relatively small compared to typical database sizes. To the best of our knowledge, SymDex is the first search algorithm designed specifically for comparing chemical libraries. It can be adapted to most, if not all, existing indexing algorithms and shows potential for accelerating future similarity search algorithms for comparing chemical databases.
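As a rough illustration of "eliminating needless similarity calculations", the sketch below prunes candidate pairs with the standard bit-count bound on the Tanimoto coefficient (it can never exceed min(|a|,|b|)/max(|a|,|b|)). This is a simplified stand-in for the kind of indexing SymDex builds on, not the published algorithm; all names are hypothetical and fingerprints are assumed non-empty.

```python
from typing import Dict, List, Set, Tuple

def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    inter = len(a & b)
    return inter / (len(a) + len(b) - inter)

def search(queries: Dict[str, Set[int]], library: Dict[str, Set[int]],
           threshold: float) -> List[Tuple[str, str, float]]:
    """Return all query/library pairs with Tanimoto >= threshold,
    skipping pairs that the bit-count bound proves cannot qualify."""
    # Index the library by on-bit count so a query only visits counts
    # that can possibly pass the bound.
    by_count: Dict[int, List[str]] = {}
    for name, fp in library.items():
        by_count.setdefault(len(fp), []).append(name)
    hits = []
    for qname, qfp in queries.items():
        nq = len(qfp)
        for nb, names in by_count.items():
            if min(nq, nb) / max(nq, nb) < threshold:
                continue  # bound rules out every fingerprint of this size
            for name in names:
                s = tanimoto(qfp, library[name])
                if s >= threshold:
                    hits.append((qname, name, s))
    return hits

print(search({"q1": {1, 2, 3}}, {"m1": {1, 2, 3, 4}, "m2": {9}}, 0.5))
```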
Evaluating a NoSQL Alternative for Chilean Virtual Observatory Services
NASA Astrophysics Data System (ADS)
Antognini, J.; Araya, M.; Solar, M.; Valenzuela, C.; Lira, F.
2015-09-01
Currently, the standards and protocols for data access in the Virtual Observatory architecture (DAL) are generally implemented with relational databases based on SQL. In particular, the Astronomical Data Query Language (ADQL), the language used by the IVOA to represent queries to VO services, was created to satisfy the different data access protocols, such as Simple Cone Search. ADQL is based on SQL92 and has extra functionality implemented using PgSphere. An emerging alternative to SQL is the so-called NoSQL databases, which can be classified into several categories such as Column, Document, Key-Value, Graph, and Object stores, each one recommended for different scenarios. Notable characteristics include schema-free design, easy replication support, simple APIs, and suitability for Big Data. The Chilean Virtual Observatory (ChiVO) is developing a functional prototype based on the IVOA architecture, with the following relevant factors: performance, scalability, flexibility, complexity, and functionality. Currently, it is very difficult to compare these factors due to a lack of alternatives. The objective of this paper is to compare NoSQL alternatives with SQL through the implementation of a REST Web API that satisfies ChiVO's needs: a SESAME-style name resolver for the data from ALMA. Therefore, we propose a test scenario by configuring a NoSQL database with data from different sources and evaluating the feasibility of creating a Simple Cone Search service and its performance. This comparison will help pave the way for the application of Big Data databases in the Virtual Observatory.
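A Simple Cone Search ultimately reduces to an angular-separation filter, which is independent of whether the rows live in a SQL table or a NoSQL document store. A minimal sketch, assuming a hypothetical document-style catalog (the field names and values are invented):

```python
import math

def angular_sep_deg(ra1, dec1, ra2, dec2):
    """Angular separation in degrees via the haversine formula (inputs in degrees)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))

def cone_search(catalog, ra, dec, radius_deg):
    """Return catalog entries (dicts with 'ra' and 'dec') inside the cone."""
    return [row for row in catalog
            if angular_sep_deg(ra, dec, row["ra"], row["dec"]) <= radius_deg]

# Document-style catalog, as a NoSQL store would return it.
catalog = [{"name": "src1", "ra": 83.82, "dec": -5.39},
           {"name": "src2", "ra": 84.10, "dec": -5.60}]
print(cone_search(catalog, 83.8, -5.4, 0.5))
```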
Grissa, Ibtissem; Vergnaud, Gilles; Pourcel, Christine
2007-01-01
Background In Archaea and Bacteria, the repeated elements called CRISPRs, for "clustered regularly interspaced short palindromic repeats", are believed to participate in the defence against viruses. Short sequences called spacers are stored in between repeated elements. In the current model, motifs comprising spacers and repeats may target invading DNA and lead to its degradation through a proposed mechanism similar to RNA interference. Analysis of intra-species polymorphism shows that new motifs (one spacer and one repeated element) are added in a polarised fashion. Although their principal characteristics have been described, much remains to be discovered about the way CRISPRs are created and evolve. As new genome sequences become available it appears necessary to develop automated scanning tools to make CRISPR-related information available and to facilitate additional investigations. Description We have produced a program, CRISPRFinder, which identifies CRISPRs and extracts the repeated and unique sequences. Using this software, a database is constructed which is automatically updated monthly from newly released genome sequences. Additional tools were created to allow the alignment of flanking sequences in search of similarities between different loci and to build dictionaries of unique sequences. To date, almost six hundred CRISPRs have been identified in 475 published genomes. Two Archaea out of thirty-seven and about half of Bacteria do not possess a CRISPR. Fine analysis of repeated sequences strongly supports the current view that new motifs are added at one end of the CRISPR adjacent to the putative promoter. Conclusion It is hoped that the availability of a public database, regularly updated and queryable on the web, will help in further dissecting and understanding CRISPR structure and the evolution of flanking sequences. Subsequent analyses of intra-species CRISPR polymorphism will be facilitated by CRISPRFinder and the dictionary creator. CRISPRdb is accessible at PMID:17521438
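The repeat/spacer layout described above can be shown with a toy extraction routine. This is not CRISPRFinder's detection logic (which must discover the repeated element itself); it is only a sketch that assumes the repeat sequence is already known, with an invented locus string.

```python
import re

def extract_spacers(genome: str, repeat: str):
    """Spacers are the unique sequences stored between consecutive copies
    of the repeated element (a toy rendering of the CRISPR layout)."""
    starts = [m.start() for m in re.finditer(repeat, genome)]
    ends = [m.end() for m in re.finditer(repeat, genome)]
    return [genome[e:s] for e, s in zip(ends[:-1], starts[1:])]

# Invented locus: repeat, spacer, repeat, spacer, repeat.
locus = "GTTTAGAGC" + "ACGTACGTACGT" + "GTTTAGAGC" + "TTGCAATTGCAA" + "GTTTAGAGC"
print(extract_spacers(locus, "GTTTAGAGC"))  # ['ACGTACGTACGT', 'TTGCAATTGCAA']
```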
NASA Astrophysics Data System (ADS)
Schiefele, Jens; Bader, Joachim; Kastner, S.; Wiesemann, Thorsten; von Viebahn, Harro
2002-07-01
The next generation of cockpit display systems will display mass data, including terrain, obstacle, and airport databases. Display formats will be two-dimensional and eventually three-dimensional. A prerequisite for the introduction of these new functions is the availability of certified graphics hardware. The paper describes the functionality and required features of an aviation-certified 2D/3D graphics board. This graphics board should be based on low-level and high-level API calls, and these graphics calls should be very similar to OpenGL. All software and the API must be aviation certified. As example applications, a 2D airport navigation function and a 3D terrain visualization are presented. The airport navigation format is based on a highly precise airport database following EUROCAE ED-99/RTCA DO-272 specifications. Terrain resolution is based on EUROCAE ED-98/RTCA DO-276 requirements.
Domain Regeneration for Cross-Database Micro-Expression Recognition
NASA Astrophysics Data System (ADS)
Zong, Yuan; Zheng, Wenming; Huang, Xiaohua; Shi, Jingang; Cui, Zhen; Zhao, Guoying
2018-05-01
In this paper, we investigate the cross-database micro-expression recognition problem, where the training and testing samples come from two different micro-expression databases. Under this setting, the training and testing samples have different feature distributions, and hence the performance of most existing micro-expression recognition methods may decrease greatly. To solve this problem, we propose a simple yet effective method called the Target Sample Re-Generator (TSRG). Using TSRG, we are able to re-generate the samples from the target micro-expression database such that the re-generated target samples share the same or similar feature distributions as the original source samples. For this reason, we can then use the classifier learned on the labeled source samples to accurately predict the micro-expression categories of the unlabeled target samples. To evaluate the performance of the proposed TSRG method, extensive cross-database micro-expression recognition experiments designed on the SMIC and CASME II databases were conducted. Compared with recent state-of-the-art cross-database emotion recognition methods, the proposed TSRG achieves more promising results.
Greenlee, Dave
2007-01-01
A week after Hurricane Katrina made landfall in Louisiana, a collaboration among multiple organizations began building a database called the Geographic Information System for the Gulf, shortened to "GIS for the Gulf," to support the geospatial data needs of people in the hurricane-affected area. Data were gathered from diverse sources and entered into a consistent and standardized data model in a manner that is Web accessible.
Efficient Privacy-Enhancing Techniques for Medical Databases
NASA Astrophysics Data System (ADS)
Schartner, Peter; Schaffer, Martin
In this paper, we introduce an alternative to using linkable unique health identifiers: locally generated, system-wide unique digital pseudonyms. The presented techniques are based on a novel technique called collision-free number generation, which is discussed in the introductory part of the article. Afterwards, attention is paid to two specific variants of collision-free number generation: one based on the RSA problem and the other based on the Elliptic Curve Discrete Logarithm Problem. Finally, two applications are sketched: centralized medical records and anonymous medical databases.
Thanacoody, H K R; Good, A M; Waring, W S; Bateman, D N
2008-03-01
Paracetamol is the most common means of drug overdose in the UK. Guidance on management is available to junior doctors through TOXBASE, the online resource managed by the UK National Poisons Information Service (NPIS) and in poster form. TOXBASE is supported by NPIS units and further by a UK national rota of clinical toxicologists. A study was undertaken to examine reasons why calls about paracetamol are referred to consultants to better understand issues in managing this common poisoning. Calls relating to paracetamol overdose referred by a poisons information specialist to the duty NPIS consultant between 1 May 2005 and 30 April 2006 were identified from the database and the number of TOXBASE accesses during the same time period was determined. Enquiries that resulted in consultant referral were classified into six categories. Calls referred to NPIS consultants pertain mainly to patients who present late, staggered overdoses, adverse reactions to N-acetylcysteine, and interpretation of blood results. This information has been used to inform the development of TOXBASE so that comprehensive advice is readily available to end users. The operation of a national consultant rota enables information on difficult or unusual cases of poisoning to be pooled so that treatment guidelines can be developed to optimise treatment throughout the UK.
Dextromethorphan Abuse in Adolescence
Bryner, Jodi K.; Wang, Uerica K.; Hui, Jenny W.; Bedodo, Merilin; MacDougall, Conan; Anderson, Ilene B.
2008-01-01
Objectives To analyze the trend of dextromethorphan abuse in California and to compare these findings with national trends. Design A 6-year retrospective review. Setting California Poison Control System (CPCS), American Association of Poison Control Centers (AAPCC), and Drug Abuse Warning Network (DAWN) databases from January 1, 1999, to December 31, 2004. Participants All dextromethorphan abuse cases reported to the CPCS, AAPCC, and DAWN. The main exposures of dextromethorphan abuse cases included date of exposure, age, acute vs long-term use, coingestants, product formulation, and clinical outcome. Main Outcome Measure The annual proportion of dextromethorphan abuse cases among all exposures reported to the CPCS, AAPCC, and DAWN databases. Results A total of 1382 CPCS cases were included in the study. A 10-fold increase in CPCS dextromethorphan abuse cases from 1999 (0.23 cases per 1000 calls) to 2004 (2.15 cases per 1000 calls) (odds ratio, 1.48; 95% confidence interval, 1.43–1.54) was identified. Of all CPCS dextromethorphan abuse cases, 74.5% were aged 9 to 17 years; the frequency of cases among this age group increased more than 15-fold during the study (from 0.11 to 1.68 cases per 1000 calls). Similar trends were seen in the AAPCC and DAWN databases. The highest frequency of dextromethorphan abuse occurred among adolescents aged 15 and 16 years. The most commonly abused product was Coricidin HBP Cough & Cold Tablets. Conclusions Our study revealed an increasing trend of dextromethorphan abuse cases reported to the CPCS that is paralleled nationally as reported to the AAPCC and DAWN. This increase was most evident in the adolescent population. PMID:17146018
Gradishar, William; Johnson, KariAnne; Brown, Krystal; Mundt, Erin; Manley, Susan
2017-07-01
There is a growing move to consult public databases following receipt of a genetic test result from a clinical laboratory; however, the well-documented limitations of these databases call into question how often clinicians will encounter discordant variant classifications that may introduce uncertainty into patient management. Here, we evaluate discordance in BRCA1 and BRCA2 variant classifications between a single commercial testing laboratory and a public database commonly consulted in clinical practice. BRCA1 and BRCA2 variant classifications were obtained from ClinVar and compared with the classifications from a reference laboratory. Full concordance and discordance were determined for variants whose ClinVar entries were of the same pathogenicity (pathogenic, benign, or uncertain). Variants with conflicting ClinVar classifications were considered partially concordant if ≥1 of the listed classifications agreed with the reference laboratory classification. Four thousand two hundred and fifty unique BRCA1 and BRCA2 variants were available for analysis. Overall, 73.2% of classifications were fully concordant and 12.3% were partially concordant. The remaining 14.5% of variants had discordant classifications, most of which had a definitive classification (pathogenic or benign) from the reference laboratory compared with an uncertain classification in ClinVar (14.0%). Here, we show that discrepant classifications between a public database and single reference laboratory potentially account for 26.7% of variants in BRCA1 and BRCA2. The time and expertise required of clinicians to research these discordant classifications call into question the practicality of checking all test results against a database and suggest that discordant classifications should be interpreted with these limitations in mind. With the increasing use of clinical genetic testing for hereditary cancer risk, accurate variant classification is vital to ensuring appropriate medical management. There is a growing move to consult public databases following receipt of a genetic test result from a clinical laboratory; however, we show that up to 26.7% of variants in BRCA1 and BRCA2 have discordant classifications between ClinVar and a reference laboratory. The findings presented in this paper serve as a note of caution regarding the utility of database consultation. © AlphaMed Press 2017.
Evaluating the quality of Marfan genotype-phenotype correlations in existing FBN1 databases.
Groth, Kristian A; Von Kodolitsch, Yskert; Kutsche, Kerstin; Gaustadnes, Mette; Thorsen, Kasper; Andersen, Niels H; Gravholt, Claus H
2017-07-01
Genetic FBN1 testing is pivotal for confirming the clinical diagnosis of Marfan syndrome. In an effort to evaluate variant causality, FBN1 databases are often used. We evaluated the current databases regarding FBN1 variants and validated associated phenotype records with a new Marfan syndrome geno-phenotyping tool called the Marfan score. We evaluated four databases (UMD-FBN1, ClinVar, the Human Gene Mutation Database (HGMD), and Uniprot) containing 2,250 FBN1 variants supported by 4,904 records presented in 307 references. The Marfan score calculated for phenotype data from the records quantified variant associations with Marfan syndrome phenotype. We calculated a Marfan score for 1,283 variants, of which we confirmed the database diagnosis of Marfan syndrome in 77.1%. This represented only 35.8% of the total registered variants; 18.5-33.3% (UMD-FBN1 versus HGMD) of variants associated with Marfan syndrome in the databases could not be confirmed by the recorded phenotype. FBN1 databases can be imprecise and incomplete. Data should be used with caution when evaluating FBN1 variants. At present, the UMD-FBN1 database seems to be the biggest and best curated; therefore, it is the most comprehensive database. However, the need for better genotype-phenotype curated databases is evident, and we hereby present such a database. Genet Med advance online publication, 1 December 2016.
Sánchez-de-Madariaga, Ricardo; Muñoz, Adolfo; Lozano-Rubí, Raimundo; Serrano-Balazote, Pablo; Castro, Antonio L; Moreno, Oscar; Pascual, Mario
2017-08-18
The objective of this research is to compare relational and non-relational (NoSQL) database systems for storing, recovering, querying, and persisting standardized medical information in the form of ISO/EN 13606 normalized Electronic Health Record XML extracts, both in isolation and concurrently. NoSQL database systems have recently attracted much attention, but few studies in the literature address their direct comparison with relational databases when applied to build the persistence layer of a standardized medical information system. One relational and two NoSQL databases (one document-based and one native XML database) of three different sizes were created in order to evaluate and compare the response times (algorithmic complexity) of six queries of growing complexity, which were performed on each of them. Similar appropriate results available in the literature have also been considered. Both relational and non-relational NoSQL database systems show almost linear query execution times with respect to database size. However, they show very different linear slopes, the relational system's being much steeper than those of the two NoSQL systems. Document-based NoSQL databases perform better in concurrency than in isolation, and also better than relational databases in concurrency. Non-relational NoSQL databases seem to be more appropriate than standard relational SQL databases when database size is extremely high (secondary use, research applications). Document-based NoSQL databases perform in general better than native XML NoSQL databases. EHR extract visualization and editing are also document-based tasks better suited to NoSQL database systems. However, the appropriate database solution depends greatly on each particular situation and specific problem.
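The linear-complexity comparison can be reproduced in spirit with a toy harness: time a fixed query at several database sizes and fit a slope. The sketch below uses Python's sqlite3 as a stand-in relational engine; the query, table, and sizes are arbitrary choices for illustration, not those of the study.

```python
import sqlite3, time
import numpy as np

def time_query(n_rows: int) -> float:
    """Build an in-memory table of n_rows and time one full-scan query."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE extract (id INTEGER, payload TEXT)")
    con.executemany("INSERT INTO extract VALUES (?, ?)",
                    ((i, f"record-{i}") for i in range(n_rows)))
    t0 = time.perf_counter()
    con.execute("SELECT COUNT(*) FROM extract WHERE payload LIKE '%7%'").fetchone()
    return time.perf_counter() - t0

sizes = [10_000, 50_000, 100_000, 200_000]
times = [time_query(n) for n in sizes]
slope, intercept = np.polyfit(sizes, times, 1)  # linear fit: t ~ slope * n
print(f"fitted slope: {slope:.3e} s/row")
```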
Chang, Larry William; Kagaayi, Joseph; Nakigozi, Gertrude; Galiwango, Ronald; Mulamba, Jeremiah; Ludigo, James; Ruwangula, Andrew; Gray, Ronald H.; Quinn, Thomas C.; Bollinger, Robert C.; Reynolds, Steven J.
2009-01-01
Hotlines and warmlines have been successfully used in the developed world to provide clinical advice; however, reports on their replicability in resource-limited settings are limited. A warmline was established in Rakai, Uganda, to support an antiretroviral therapy program. Over a 17-month period, a database was kept of who called, why they called, and the result of the call. A program evaluation was also administered to clinical staff. A total of 1303 calls (3.5 calls per weekday) were logged. The warmline was used mostly by field staff and peripherally based peer health workers. Calls addressed important clinical issues, including the need for urgent care, medication side effects, and follow-up needs. Most clinical staff felt that the warmline made their jobs easier and improved the health of patients. An HIV/AIDS warmline leveraged the skills of a limited workforce to provide increased access to HIV/AIDS care, advice, and education. PMID:18441254
Speech-Like Rhythm in a Voiced and Voiceless Orangutan Call
Lameira, Adriano R.; Hardus, Madeleine E.; Bartlett, Adrian M.; Shumaker, Robert W.; Wich, Serge A.; Menken, Steph B. J.
2015-01-01
The evolutionary origins of speech remain obscure. Recently, it was proposed that speech derived from monkey facial signals which exhibit a speech-like rhythm of ∼5 open-close lip cycles per second. In monkeys, these signals may also be vocalized, offering a plausible evolutionary stepping stone towards speech. Three essential predictions remain, however, to be tested to assess this hypothesis' validity; (i) Great apes, our closest relatives, should likewise produce 5Hz-rhythm signals, (ii) speech-like rhythm should involve calls articulatorily similar to consonants and vowels given that speech rhythm is the direct product of stringing together these two basic elements, and (iii) speech-like rhythm should be experience-based. Via cinematic analyses we demonstrate that an ex-entertainment orangutan produces two calls at a speech-like rhythm, coined "clicks" and "faux-speech." Like voiceless consonants, clicks required no vocal fold action, but did involve independent manoeuvring over lips and tongue. In parallel to vowels, faux-speech showed harmonic and formant modulations, implying vocal fold and supralaryngeal action. This rhythm was several times faster than orangutan chewing rates, as observed in monkeys and humans. Critically, this rhythm was seven-fold faster than, and contextually distinct from, any other known rhythmic calls described to date in the largest database of the orangutan repertoire ever assembled. The first two predictions advanced by this study are validated and, based on parsimony and exclusion of potential alternative explanations, initial support is given to the third prediction. Irrespective of the putative origins of these calls and underlying mechanisms, our findings demonstrate irrevocably that great apes are not respiratorily, articulatorily, or neurologically constrained for the production of consonant- and vowel-like calls at speech rhythm. Orangutan clicks and faux-speech confirm the importance of rhythmic speech antecedents within the primate lineage, and highlight potential articulatory homologies between great ape calls and human consonants and vowels. PMID:25569211
2011-01-01
Training databases for the LRE2007 and LRE2009 language-recognition systems included the CallFriend (CF), CallHome (CH), Fisher English Parts 1 and 2, Fisher Levantine Arabic, and HKUST Mandarin corpora.
Interdisciplinary analysis procedures in the modeling and control of large space-based structures
NASA Technical Reports Server (NTRS)
Cooper, Paul A.; Stockwell, Alan E.; Kim, Zeen C.
1987-01-01
The paper describes a computer software system called the Integrated Multidisciplinary Analysis Tool, IMAT, that has been developed at NASA Langley Research Center. IMAT provides researchers and analysts with an efficient capability to analyze satellite control systems influenced by structural dynamics. Using a menu-driven interactive executive program, IMAT links a relational database to commercial structural and controls analysis codes. The paper describes the procedures followed to analyze a complex satellite structure and control system. The codes used to accomplish the analysis are described, and an example is provided of an application of IMAT to the analysis of a reference space station subject to a rectangular pulse loading at its docking port.
Solving Relational Database Problems with ORDBMS in an Advanced Database Course
ERIC Educational Resources Information Center
Wang, Ming
2011-01-01
This paper introduces how to use the object-relational database management system (ORDBMS) to solve relational database (RDB) problems in an advanced database course. The purpose of the paper is to provide a guideline for database instructors who desire to incorporate the ORDB technology in their traditional database courses. The paper presents…
LDSplitDB: a database for studies of meiotic recombination hotspots in MHC using human genomic data.
Guo, Jing; Chen, Hao; Yang, Peng; Lee, Yew Ti; Wu, Min; Przytycka, Teresa M; Kwoh, Chee Keong; Zheng, Jie
2018-04-20
Meiotic recombination happens during the process of meiosis when chromosomes inherited from two parents exchange genetic materials to generate chromosomes in the gamete cells. The recombination events tend to occur in narrow genomic regions called recombination hotspots. Its dysregulation could lead to serious human diseases such as birth defects. Although the regulatory mechanism of recombination events is still unclear, DNA sequence polymorphisms have been found to play crucial roles in the regulation of recombination hotspots. To facilitate the studies of the underlying mechanism, we developed a database named LDSplitDB which provides an integrative and interactive data mining and visualization platform for the genome-wide association studies of recombination hotspots. It contains the pre-computed association maps of the major histocompatibility complex (MHC) region in the 1000 Genomes Project and the HapMap Phase III datasets, and a genome-scale study of the European population from the HapMap Phase II dataset. Besides the recombination profiles, related data of genes, SNPs and different types of epigenetic modifications, which could be associated with meiotic recombination, are provided for comprehensive analysis. To meet the computational requirement of the rapidly increasing population genomics data, we prepared a lookup table of 400 haplotypes for recombination rate estimation using the well-known LDhat algorithm which includes all possible two-locus haplotype configurations. To the best of our knowledge, LDSplitDB is the first large-scale database for the association analysis of human recombination hotspots with DNA sequence polymorphisms. It provides valuable resources for the discovery of the mechanism of meiotic recombination hotspots. The information about MHC in this database could help understand the roles of recombination in human immune system. DATABASE URL: http://histone.scse.ntu.edu.sg/LDSplitDB.
LIRIS flight database and its use toward noncooperative rendezvous
NASA Astrophysics Data System (ADS)
Mongrard, O.; Ankersen, F.; Casiez, P.; Cavrois, B.; Donnard, A.; Vergnol, A.; Southivong, U.
2018-06-01
ESA's fifth and last Automated Transfer Vehicle, ATV Georges Lemaître, tested new rendezvous technology before docking with the International Space Station (ISS) in August 2014. The technology demonstration called Laser Infrared Imaging Sensors (LIRIS) provided a previously unseen view of the ISS. During Georges Lemaître's rendezvous, the LIRIS sensors, composed of two infrared cameras, one visible camera, and a scanning LIDAR (Light Detection and Ranging), were turned on two and a half hours and 3500 m out from the Space Station. All sensors worked as expected and a large amount of data was recorded and stored within ATV-5's cargo hold before being returned to Earth with the Soyuz flight 38S in September 2014. As a part of the LIRIS post-flight activities, the information gathered by all sensors was collected into a flight database together with the reference ATV trajectory and attitude estimated by the ATV main navigation sensors. Although decoupled from the ATV main computer, the LIRIS data were carefully synchronized with ATV guidance, navigation, and control (GNC) data. Hence, the LIRIS database can be used to assess the performance of various image processing algorithms to provide range and line-of-sight (LoS) navigation at long/medium range but also 6 degree-of-freedom (DoF) navigation at short range. The database also contains information related to the overall ATV position with respect to Earth and the Sun direction within the ATV frame, such that the effect of the environment on the sensors can also be investigated. This paper introduces the structure of the LIRIS database and provides some examples of applications to increase the technology readiness level of noncooperative rendezvous.
An expression database for roots of the model legume Medicago truncatula under salt stress
Li, Daofeng; Su, Zhen; Dong, Jiangli; Wang, Tao
2009-11-11
Background Medicago truncatula is a model legume whose genome is currently being sequenced by an international consortium. Abiotic stresses such as salt stress limit plant growth and crop productivity, including those of legumes. We anticipate that studies on M. truncatula will shed light on other economically important legumes across the world. Here, we report the development of a database called MtED that contains gene expression profiles of the roots of M. truncatula based on time-course salt stress experiments using the Affymetrix Medicago GeneChip. Our hope is that MtED will provide information to assist in improving abiotic stress resistance in legumes. Description The results of our microarray experiment with roots of M. truncatula under 180 mM sodium chloride were deposited in the MtED database. Additionally, sequence and annotation information regarding microarray probe sets were included. MtED provides functional category analysis based on Gene and GeneBins Ontology, and other Web-based tools for querying and retrieving query results, browsing pathways and transcription factor families, showing metabolic maps, and comparing and visualizing expression profiles. Utilities such as mapping probe sets to the M. truncatula genome and in-silico PCR were implemented with the BLAT software suite and are also available through the MtED database. Conclusion MtED was built in the PHP script language and as a MySQL relational database system on a Linux server. It has an integrated Web interface, which facilitates ready examination and interpretation of the results of microarray experiments. It is intended to help in selecting gene markers to improve abiotic stress resistance in legumes. MtED is available at http://bioinformatics.cau.edu.cn/MtED/. PMID:19906315
Corcoran, Callan C; Grady, Cameron R; Pisitkun, Trairak; Parulekar, Jaya; Knepper, Mark A
2017-03-01
The organization of the mammalian genome into gene subsets corresponding to specific functional classes has provided key tools for systems biology research. Here, we have created a web-accessible resource called the Mammalian Metabolic Enzyme Database (https://hpcwebapps.cit.nih.gov/ESBL/Database/MetabolicEnzymes/MetabolicEnzymeDatabase.html) keyed to the biochemical reactions represented on iconic metabolic pathway wall charts created in the previous century. Overall, we have mapped 1,647 genes to these pathways, representing ~7 percent of the protein-coding genome. To illustrate the use of the database, we apply it to the area of kidney physiology. In so doing, we have created an additional database (Database of Metabolic Enzymes in Kidney Tubule Segments: https://hpcwebapps.cit.nih.gov/ESBL/Database/MetabolicEnzymes/), mapping mRNA abundance measurements (mined from RNA-Seq studies) for all metabolic enzymes to each of 14 renal tubule segments. We carry out bioinformatics analysis of the enzyme expression pattern among renal tubule segments and mine various data sources to identify vasopressin-regulated metabolic enzymes in the renal collecting duct. Copyright © 2017 the American Physiological Society.
Anestis, Michael D; Anestis, Joye C; Zawilinski, Laci L; Hopkins, Tiffany A; Lilienfeld, Scott O
2014-12-01
Equine-related treatments (ERT) for mental disorders are becoming increasingly popular for a variety of diagnoses; however, they have been subjected only to limited systematic investigation. To examine the quality of and results from peer-reviewed research on ERT for mental disorders and related outcomes. Peer-reviewed studies (k = 14) examining treatments for mental disorders or closely related outcomes were identified from databases and article reference sections. All studies were compromised by a substantial number of threats to validity, calling into question the meaning and clinical significance of their findings. Additionally, studies failed to provide consistent evidence that ERT is superior to the mere passage of time in the treatment of any mental disorder. The current evidence base does not justify the marketing and utilization of ERT for mental disorders. Such services should not be offered to the public unless and until well-designed studies provide evidence that justify different conclusions. © 2014 Wiley Periodicals, Inc.
Matthews, Edwin J; Kruhlak, Naomi L; Weaver, James L; Benz, R Daniel; Contrera, Joseph F
2004-12-01
The FDA's Spontaneous Reporting System (SRS) database contains over 1.5 million adverse drug reaction (ADR) reports for 8620 drugs/biologics that are listed for 1191 Coding Symbols for Thesaurus of Adverse Reaction (COSTAR) terms of adverse effects. We have linked the trade names of the drugs to 1861 generic names and retrieved molecular structures for each chemical to obtain a set of 1515 organic chemicals that are suitable for modeling with commercially available QSAR software packages. ADR report data for 631 of these compounds were extracted and pooled for the first five years that each drug was marketed. Patient exposure was estimated during this period using pharmaceutical shipping units obtained from IMS Health. Significant drug effects were identified using a Reporting Index (RI), where RI = (# ADR reports / # shipping units) x 1,000,000. MCASE/MC4PC software was used to identify the optimal conditions for defining a significant adverse effect finding. Results suggest that a significant effect in our database is characterized by > or = 4 ADR reports and > or = 20,000 shipping units during five years of marketing, and an RI > or = 4.0. Furthermore, for a test chemical to be evaluated as active it must contain a statistically significant molecular structural alert, called a decision alert, in two or more toxicologically related endpoints. We also report the use of a composite module, which pools observations from two or more toxicologically related COSTAR term endpoints to provide signal enhancement for detecting adverse effects.
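The significance rule quoted above is explicit enough to code directly. A minimal sketch follows; the formula and thresholds are taken from the abstract, while the example counts are invented, not values from the SRS data.

```python
def reporting_index(n_reports: int, n_shipping_units: int) -> float:
    """RI = (# ADR reports / # shipping units) x 1,000,000, as defined above."""
    return n_reports / n_shipping_units * 1_000_000

def is_significant(n_reports: int, n_shipping_units: int) -> bool:
    """Significance rule from the abstract: >= 4 ADR reports, >= 20,000
    shipping units during five years of marketing, and RI >= 4.0."""
    return (n_reports >= 4
            and n_shipping_units >= 20_000
            and reporting_index(n_reports, n_shipping_units) >= 4.0)

# Example: 120 reports against 5 million shipping units -> RI = 24.0, significant.
print(reporting_index(120, 5_000_000), is_significant(120, 5_000_000))
```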
Pirooznia, Mehdi; Gong, Ping; Guan, Xin; Inouye, Laura S; Yang, Kuan; Perkins, Edward J; Deng, Youping
2007-01-01
Background Eisenia fetida, commonly known as red wiggler or compost worm, belongs to the Lumbricidae family of the Annelida phylum. Little is known about its genome sequence although it has been extensively used as a test organism in terrestrial ecotoxicology. In order to understand its gene expression response to environmental contaminants, we cloned 4032 cDNAs or expressed sequence tags (ESTs) from two E. fetida libraries enriched with genes responsive to ten ordnance related compounds using suppressive subtractive hybridization-PCR. Results A total of 3144 good quality ESTs (GenBank dbEST accession number EH669363–EH672369 and EL515444–EL515580) were obtained from the raw clone sequences after cleaning. Clustering analysis yielded 2231 unique sequences including 448 contigs (from 1361 ESTs) and 1783 singletons. Comparative genomic analysis showed that 743 or 33% of the unique sequences shared high similarity with existing genes in the GenBank nr database. Provisional function annotation assigned 830 Gene Ontology terms to 517 unique sequences based on their homology with the annotated genomes of four model organisms Drosophila melanogaster, Mus musculus, Saccharomyces cerevisiae, and Caenorhabditis elegans. Seven percent of the unique sequences were further mapped to 99 Kyoto Encyclopedia of Genes and Genomes pathways based on their matching Enzyme Commission numbers. All the information is stored and retrievable at a high-performance, web-based and user-friendly relational database called the EST model database, or ESTMD, version 2. Conclusion The ESTMD containing the sequence and annotation information of 4032 E. fetida ESTs is publicly accessible at . PMID:18047730
Nuclear Forensics Analysis with Missing and Uncertain Data
Langan, Roisin T.; Archibald, Richard K.; Lamberti, Vincent
2015-10-05
We have applied a new imputation-based method for analyzing incomplete data, called Monte Carlo Bayesian Database Generation (MCBDG), to the Spent Fuel Isotopic Composition (SFCOMPO) database. About 60% of the entries are absent for SFCOMPO. The method estimates missing values of a property from a probability distribution created from the existing data for the property, and then generates multiple instances of the completed database for training a machine learning algorithm. Uncertainty in the data is represented by an empirical or an assumed error distribution. The method makes few assumptions about the underlying data, and compares favorably against results obtained by replacing missing information with constant values.
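A minimal sketch of the general imputation idea, not the actual MCBDG implementation: draw each missing entry from the empirical distribution of its column's observed values, optionally perturbed by an assumed error distribution, and emit several completed instances of the table.

```python
import numpy as np

rng = np.random.default_rng(0)

def complete_instances(data: np.ndarray, n_instances: int,
                       noise_sd: float = 0.0) -> list[np.ndarray]:
    """Generate completed copies of `data` (2-D array, NaN = missing).

    Each missing entry is drawn from the empirical distribution of the
    observed values in its own column (assumed non-empty); optional
    Gaussian noise stands in for an assumed error distribution."""
    instances = []
    for _ in range(n_instances):
        filled = data.copy()
        for j in range(data.shape[1]):
            col = data[:, j]
            observed = col[~np.isnan(col)]
            missing = np.isnan(col)
            draws = rng.choice(observed, size=missing.sum(), replace=True)
            filled[missing, j] = draws + rng.normal(0.0, noise_sd, missing.sum())
        instances.append(filled)
    return instances

# Example: a small table with most of one column missing.
X = np.array([[1.0, np.nan], [2.0, 4.0], [3.0, np.nan], [4.0, np.nan], [5.0, 6.0]])
for inst in complete_instances(X, n_instances=3):
    print(inst)
```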
Application of Large-Scale Database-Based Online Modeling to Plant State Long-Term Estimation
NASA Astrophysics Data System (ADS)
Ogawa, Masatoshi; Ogai, Harutoshi
Recently, attention has been drawn to a local modeling technique based on a new idea called "Just-In-Time (JIT) modeling". To apply JIT modeling online to large databases, "Large-scale database-based Online Modeling (LOM)" has been proposed. LOM is a technique that makes the retrieval of neighboring data more efficient by using both "stepwise selection" and quantization. In order to predict the long-term state of the plant without using future data of manipulated variables, an Extended Sequential Prediction method of LOM (ESP-LOM) has been proposed. In this paper, the LOM and the ESP-LOM are introduced.
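JIT modeling's core move, deferring model fitting until a query arrives and fitting only on retrieved neighbors, can be sketched as follows. This schematic local linear model omits LOM's stepwise selection and quantization; the plant data are synthetic.

```python
import numpy as np

def jit_predict(X: np.ndarray, y: np.ndarray, query: np.ndarray, k: int = 20) -> float:
    """Just-in-time style local model: retrieve the k nearest stored samples
    and fit a local linear model only when a prediction is requested."""
    d = np.linalg.norm(X - query, axis=1)
    idx = np.argsort(d)[:k]                      # neighbor retrieval
    Xk = np.hstack([X[idx], np.ones((k, 1))])    # local design matrix with bias
    coef, *_ = np.linalg.lstsq(Xk, y[idx], rcond=None)
    return float(np.append(query, 1.0) @ coef)

# Example: noisy nonlinear plant data; the local model tracks it piecewise.
rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(500, 1))
y = np.sin(X[:, 0]) + 0.05 * rng.standard_normal(500)
print(jit_predict(X, y, np.array([1.0])))  # ~ sin(1.0) = 0.84
```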
Identifying work-related motor vehicle crashes in multiple databases.
Thomas, Andrea M; Thygerson, Steven M; Merrill, Ray M; Cook, Lawrence J
2012-01-01
To compare and estimate the magnitude of work-related motor vehicle crashes in Utah using 2 probabilistically linked statewide databases. Data from 2006 and 2007 motor vehicle crash and hospital databases were joined through probabilistic linkage. Summary statistics and capture-recapture were used to describe occupants injured in work-related motor vehicle crashes and estimate the size of this population. There were 1597 occupants in the motor vehicle crash database and 1673 patients in the hospital database identified as being in a work-related motor vehicle crash. We identified 1443 occupants with at least one record from either the motor vehicle crash or hospital database indicating work-relatedness that linked to any record in the opposing database. We found that 38.7 percent of occupants injured in work-related motor vehicle crashes identified in the motor vehicle crash database did not have a primary payer code of workers' compensation in the hospital database and 40.0 percent of patients injured in work-related motor vehicle crashes identified in the hospital database did not meet our definition of a work-related motor vehicle crash in the motor vehicle crash database. Depending on how occupants injured in work-related motor crashes are identified, we estimate the population to be between 1852 and 8492 in Utah for the years 2006 and 2007. Research on single databases may lead to biased interpretations of work-related motor vehicle crashes. Combining 2 population based databases may still result in an underestimate of the magnitude of work-related motor vehicle crashes. Improved coding of work-related incidents is needed in current databases.
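Capture-recapture on two linked databases is often summarized with Chapman's bias-corrected estimator. A minimal sketch using the abstract's two capture counts (1597 and 1673); the linkage count m below is invented for illustration, not the paper's figure.

```python
def chapman_estimate(n1: int, n2: int, m: int) -> float:
    """Chapman's bias-corrected capture-recapture estimate of population size.

    n1: records flagged work-related in the crash database
    n2: records flagged work-related in the hospital database
    m:  records linked across both (captured twice)
    """
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1

# Illustrative only: m = 900 is a made-up linkage count.
print(round(chapman_estimate(1597, 1673, 900)))
```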
Informatics in radiology: Efficiency metrics for imaging device productivity.
Hu, Mengqi; Pavlicek, William; Liu, Patrick T; Zhang, Muhong; Langer, Steve G; Wang, Shanshan; Place, Vicki; Miranda, Rafael; Wu, Teresa Tong
2011-01-01
Acute awareness of the costs associated with medical imaging equipment is an ever-present aspect of the current healthcare debate. However, the monitoring of productivity associated with expensive imaging devices is likely to be labor intensive, relies on summary statistics, and lacks accepted and standardized benchmarks of efficiency. In the context of the general Six Sigma DMAIC (design, measure, analyze, improve, and control) process, a World Wide Web-based productivity tool called the Imaging Exam Time Monitor was developed to accurately and remotely monitor imaging efficiency with use of Digital Imaging and Communications in Medicine (DICOM) combined with a picture archiving and communication system. Five device efficiency metrics (examination duration, table utilization, interpatient time, appointment interval time, and interseries time) were derived from DICOM values. These metrics allow the standardized measurement of productivity, to facilitate the comparative evaluation of imaging equipment use and ongoing efforts to improve efficiency. A relational database was constructed to store patient imaging data, along with device- and examination-related data. The database provides full access to ad hoc queries and can automatically generate detailed reports for administrative and business use, thereby allowing staff to monitor data for trends and to better identify possible changes that could lead to improved productivity and reduced costs in association with imaging services. © RSNA, 2011.
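Once per-exam start and end timestamps have been distilled from DICOM headers, two of the metrics reduce to simple time differences. A sketch with hypothetical records (the exam list and values are invented):

```python
from datetime import datetime

# Hypothetical per-exam records as would be distilled from DICOM timestamps.
exams = [
    {"patient": "A", "start": datetime(2011, 1, 3, 8, 0),  "end": datetime(2011, 1, 3, 8, 25)},
    {"patient": "B", "start": datetime(2011, 1, 3, 8, 40), "end": datetime(2011, 1, 3, 9, 5)},
    {"patient": "C", "start": datetime(2011, 1, 3, 9, 30), "end": datetime(2011, 1, 3, 9, 50)},
]

durations = [(e["end"] - e["start"]).total_seconds() / 60 for e in exams]
gaps = [(b["start"] - a["end"]).total_seconds() / 60
        for a, b in zip(exams, exams[1:])]  # interpatient (table idle) time

print(f"mean exam duration: {sum(durations) / len(durations):.1f} min")
print(f"mean interpatient time: {sum(gaps) / len(gaps):.1f} min")
```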
A review on quantum search algorithms
NASA Astrophysics Data System (ADS)
Giri, Pulak Ranjan; Korepin, Vladimir E.
2017-12-01
The use of superposition of states in quantum computation, known as quantum parallelism, has a significant speed advantage over classical computation. This is evident from early quantum algorithms such as Deutsch's algorithm, the Deutsch-Jozsa algorithm and its variation the Bernstein-Vazirani algorithm, Simon's algorithm, Shor's algorithms, etc. Quantum parallelism also significantly speeds up database search, which is important in computer science because search appears as a subroutine in many important algorithms. Grover's quantum database search finds the target element in an unsorted database in a time quadratically faster than a classical computer. We review Grover's quantum search algorithms for single and multiple target elements in a database. The partial search algorithm of Grover and Radhakrishnan and its optimization by Korepin, called the GRK algorithm, are also discussed.
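Grover's algorithm is small enough to simulate classically with a state vector: each iteration phase-flips the target amplitude (the oracle) and then inverts all amplitudes about their mean (the diffusion operator), with roughly (π/4)√N iterations in total. A minimal numpy sketch of that textbook scheme:

```python
import numpy as np

def grover(n_items: int, target: int) -> np.ndarray:
    """Classical state-vector simulation of Grover's search."""
    amp = np.full(n_items, 1 / np.sqrt(n_items))   # uniform superposition
    n_iter = int(np.floor(np.pi / 4 * np.sqrt(n_items)))
    for _ in range(n_iter):
        amp[target] *= -1                          # oracle: phase-flip the target
        amp = 2 * amp.mean() - amp                 # diffusion: invert about the mean
    return amp

amp = grover(n_items=64, target=42)
print(f"P(target) after ~(pi/4)*sqrt(N) iterations: {amp[42]**2:.3f}")  # ~0.996
```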
Pardo-Hernandez, Hector; Urrútia, Gerard; Barajas-Nava, Leticia A; Buitrago-Garcia, Diana; Garzón, Julieth Vanessa; Martínez-Zapata, María José; Bonfill, Xavier
2017-06-13
Systematic reviews provide the best evidence on the effect of health care interventions. They rely on comprehensive access to the available scientific literature. Electronic search strategies alone may not suffice, requiring the implementation of a handsearching approach. We have developed a database to provide an Internet-based platform from which handsearching activities can be coordinated, including a procedure to streamline the submission of these references into CENTRAL, the Cochrane Collaboration Central Register of Controlled Trials. We developed a database and a descriptive analysis. Through brainstorming and discussion among stakeholders involved in handsearching projects, we designed a database that met identified needs that had to be addressed in order to ensure the viability of handsearching activities. Three handsearching teams pilot tested the proposed database. Once the final version of the database was approved, we proceeded to train the staff involved in handsearching. The proposed database is called BADERI (Database of Iberoamerican Clinical Trials and Journals, by its initials in Spanish). BADERI was officially launched in October 2015, and it can be accessed at www.baderi.com/login.php free of cost. BADERI has an administration subsection, from which the roles of users are managed; a references subsection, where information associated to identified controlled clinical trials (CCTs) can be entered; a reports subsection, from which reports can be generated to track and analyse the results of handsearching activities; and a built-in free text search engine. BADERI allows all references to be exported in ProCite files that can be directly uploaded into CENTRAL. To date, 6284 references to CCTs have been uploaded to BADERI and sent to CENTRAL. The identified CCTs were published in a total of 420 journals related to 46 medical specialties. The year of publication ranged between 1957 and 2016. BADERI allows the efficient management of handsearching activities across different countries and institutions. References to all CCTs available in BADERI can be readily submitted to CENTRAL for their potential inclusion in systematic reviews.
FreeSolv: A database of experimental and calculated hydration free energies, with input files
Mobley, David L.; Guthrie, J. Peter
2014-01-01
This work provides a curated database of experimental and calculated hydration free energies for small neutral molecules in water, along with molecular structures, input files, references, and annotations. We call this the Free Solvation Database, or FreeSolv. Experimental values were taken from prior literature and will continue to be curated, with updated experimental references and data added as they become available. Calculated values are based on alchemical free energy calculations using molecular dynamics simulations. These used the GAFF small molecule force field in TIP3P water with AM1-BCC charges. Values were calculated with the GROMACS simulation package, with full details given in references cited within the database itself. This database builds in part on a previous, 504-molecule database containing similar information. However, additional curation of both experimental data and calculated values has been done here, and the total number of molecules is now up to 643. Additional information is now included in the database, such as SMILES strings, PubChem compound IDs, accurate reference DOIs, and others. One version of the database is provided in the Supporting Information of this article, but as ongoing updates are envisioned, the database is now versioned and hosted online. In addition to providing the database, this work describes its construction process. The database is available free-of-charge via http://www.escholarship.org/uc/item/6sd403pz. PMID:24928188
Bohn, Justin; Eddings, Wesley; Schneeweiss, Sebastian
2017-03-15
Distributed networks of health-care data sources are increasingly being utilized to conduct pharmacoepidemiologic database studies. Such networks may contain data that are not physically pooled but instead are distributed horizontally (separate patients within each data source) or vertically (separate measures within each data source) in order to preserve patient privacy. While multivariable methods for the analysis of horizontally distributed data are frequently employed, few practical approaches have been put forth to deal with vertically distributed health-care databases. In this paper, we propose 2 propensity score-based approaches to vertically distributed data analysis and test their performance using 5 example studies. We found that these approaches produced point estimates close to what could be achieved without partitioning. We further found a performance benefit (i.e., lower mean squared error) for sequentially passing a propensity score through each data domain (called the "sequential approach") as compared with fitting separate domain-specific propensity scores (called the "parallel approach"). These results were validated in a small simulation study. This proof-of-concept study suggests a new multivariable analysis approach to vertically distributed health-care databases that is practical, preserves patient privacy, and warrants further investigation for use in clinical research applications that rely on health-care databases. © The Author 2017. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
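The two approaches can be caricatured on synthetic data: "parallel" fits a separate domain-specific propensity score per data domain, while "sequential" passes the first domain's score into the next domain's model, so that only scores, never raw covariates, cross domain boundaries. A toy sketch under those assumptions, not the authors' estimator:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
n = 2000
X1 = rng.standard_normal((n, 3))   # covariates held in data domain 1
X2 = rng.standard_normal((n, 3))   # covariates held in data domain 2
treat = rng.binomial(1, 1 / (1 + np.exp(-(X1[:, 0] + X2[:, 0]))))

# Parallel approach: a separate domain-specific propensity score per domain.
ps1 = LogisticRegression().fit(X1, treat).predict_proba(X1)[:, 1]
ps2 = LogisticRegression().fit(X2, treat).predict_proba(X2)[:, 1]

# Sequential approach: the score from domain 1 enters domain 2's model
# as an extra covariate, so only scores cross the domain boundary.
seq_input = np.column_stack([X2, ps1])
ps_seq = LogisticRegression().fit(seq_input, treat).predict_proba(seq_input)[:, 1]
print(ps_seq[:3])
```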
National Urban Database and Access Portal Tool
Based on the need for advanced treatments of high resolution urban morphological features (e.g., buildings, trees) in meteorological, dispersion, air quality and human exposure modeling systems for future urban applications, a new project was launched called the National Urban Database and Access Portal Tool.
7 CFR 1.7 - Agency response to requests for records.
Code of Federal Regulations, 2010 CFR
2010-01-01
... Archives and Records Administration (“NARA”), the agency shall inform the requester of this fact and shall...) Database at http://www/nara.gov/nara.nail.html, or by calling NARA at (301) 713-6800. If the agency has no...
Myria: Scalable Analytics as a Service
NASA Astrophysics Data System (ADS)
Howe, B.; Halperin, D.; Whitaker, A.
2014-12-01
At the UW eScience Institute, we're working to empower non-experts, especially in the sciences, to write and use data-parallel algorithms. To this end, we are building Myria, a web-based platform for scalable analytics and data-parallel programming. Myria's internal model of computation is the relational algebra extended with iteration, such that every program is inherently data-parallel, just as every query in a database is inherently data-parallel. But unlike databases, iteration is a first-class concept, allowing us to express machine learning tasks, graph traversal tasks, and more. Programs can be expressed in a number of languages and can be executed on a number of execution environments, but we emphasize a particular language called MyriaL that supports both imperative and declarative styles and a particular execution engine called MyriaX that uses an in-memory column-oriented representation and asynchronous iteration. We deliver Myria over the web as a service, providing an editor, performance analysis tools, and catalog browsing features in a single environment. We find that this web-based "delivery vector" is critical in reaching non-experts: they are insulated from the irrelevant technical work associated with installation, configuration, and resource management. The MyriaX backend, one of several execution runtimes we support, is a main-memory, column-oriented, RDBMS-on-the-worker system that supports cyclic data flows as a first-class citizen and has been shown to outperform competitive systems on 100-machine cluster sizes. I will describe the Myria system, give a demo, and present some new results in large-scale oceanographic microbiology.
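Iteration as a first-class relational construct is what lets such a system express graph traversal. The classic example is transitive closure by repeated join-and-union until a fixpoint, sketched here in plain Python over a set-of-tuples "relation"; this is only illustrative of the idea, not MyriaL syntax.

```python
def transitive_closure(edges: set[tuple[str, str]]) -> set[tuple[str, str]]:
    """Iterated relational algebra: repeat (join + union) until fixpoint,
    joining only the newly derived tuples each round (semi-naive style)."""
    reach, delta = set(edges), set(edges)
    while delta:
        # join the frontier with the base edge relation
        new = {(a, c) for (a, b) in delta for (b2, c) in edges if b == b2}
        delta = new - reach
        reach |= delta
    return reach

print(sorted(transitive_closure({("a", "b"), ("b", "c"), ("c", "d")})))
```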
Novel algorithm by low complexity filter on retinal vessel segmentation
NASA Astrophysics Data System (ADS)
Rostampour, Samad
2011-10-01
This article presents a new method to detect blood vessels in the retina from digital images. Retinal vessel segmentation is important for detecting side effects of diabetic disease, because diabetes can form new capillaries which are very brittle. The research was done in two phases: preprocessing and processing. The preprocessing phase applies a new filter that produces a suitable output: it shows vessels in dark color on a white background and creates a clear contrast between vessels and background. Its complexity is very low, and extra images are eliminated. The second phase, processing, uses a Bayesian method, a supervised classification method that uses the mean and variance of pixel intensities to calculate class probabilities. Finally, the pixels of the image are divided into two classes: vessels and background. The images used are from the DRIVE database. After performing this operation, the calculation gives an average efficiency of 95 percent. The method was also applied to a sample from outside the DRIVE database which had retinopathy, and a perfect result was obtained.
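The Bayesian step described above can be sketched as a two-class Gaussian classifier on pixel intensity, since the abstract says the method uses the mean and variance of pixel intensities. The training intensities below are synthetic stand-ins, not DRIVE data, and equal class priors are assumed.

```python
# Two-class Gaussian (Bayesian) pixel classifier: each class is summarized by
# the mean and standard deviation of its training intensities, and a pixel is
# assigned to the class with the higher likelihood.
import numpy as np
from scipy.stats import norm

vessel_train = np.array([30.0, 35.0, 40.0, 38.0])          # dark vessel pixels
background_train = np.array([200.0, 210.0, 190.0, 205.0])  # bright background

mu_v, sd_v = vessel_train.mean(), vessel_train.std(ddof=1)
mu_b, sd_b = background_train.mean(), background_train.std(ddof=1)

def classify(pixel):
    # compare class-conditional likelihoods (equal priors assumed)
    if norm.pdf(pixel, mu_v, sd_v) >= norm.pdf(pixel, mu_b, sd_b):
        return "vessel"
    return "background"

print(classify(42.0), classify(198.0))   # -> vessel background
```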
Nakayama, Takeo
2012-01-01
The concept of evidence-based medicine (EBM) has spread among healthcare professionals in recent years; at the same time, the problem of underuse of useful clinical evidence is becoming important. This is called the evidence-practice gap. The major concern about the evidence-practice gap is insufficient implementation of evidence-based effective treatment; however, the perspective can be extended to measures to improve drug safety and the prevention of drug-related adverse events. First, this article reviews the characteristics of receipt (healthcare claims) databases and their usefulness for pharmacoepidemiologic research. Second, as a real example of a study of the evidence-practice gap using a receipt database, the case of ergot-derived anti-Parkinson drugs, for which a risk of valvulopathy has been identified, is introduced. The receipt analysis showed that more than 70% of Parkinson's disease patients prescribed cabergoline or pergolide did not undergo echocardiography despite the revision of the product label recommendation. Finally, issues of pharmaceutical risk management and risk communication are discussed.
NASA Technical Reports Server (NTRS)
Anderson, David J.; Mizukami, Masashi
1993-01-01
NASA has initiated the High Speed Research (HSR) program with the goal of developing technologies for a new-generation, economically viable, environmentally acceptable supersonic transport (SST) called the High Speed Civil Transport (HSCT). A significant part of this effort is expected to be in multidisciplinary systems integration, such as propulsion airframe integration (PAI). In order to assimilate the knowledge database on PAI for SST-type aircraft, a bibliography on this subject was compiled. The bibliography contains over 1200 entries, with full abstracts and indexes. Related topics are also covered, such as the following: engine inlets, engine cycles, nozzles, existing supersonic cruise aircraft, noise issues, computational fluid dynamics, aerodynamics, and external interference. All identified documents from 1980 through early 1991 are included; this covers the latter part of the NASA Supersonic Cruise Research (SCR) program and the beginnings of the HSR program. In addition, some pre-1980 documents of significant merit or reference value are also included. The references were retrieved via a computerized literature search using the NASA RECON database system.
Schel, Anne Marijke; Tranquilli, Sandra; Zuberbühler, Klaus
2009-05-01
Vervet monkey alarm calling has long been the paradigmatic example of how primates use vocalizations in response to predators. In vervets, there is a close and direct relationship between the production of distinct alarm vocalizations and the presence of distinct predator types. Recent fieldwork has however revealed the use of several additional alarm calling systems in primates. Here, the authors describe playback studies on the alarm call system of two colobine species, the King colobus (Colobus polykomos) of Taï Forest, Ivory Coast, and the Guereza colobus (C. guereza) of Budongo Forest, Uganda. Both species produce two basic alarm call types, snorts and acoustically variable roaring phrases, when confronted with leopards or crowned eagles. Neither call type is given exclusively to one predator, but the authors found strong regularities in call sequencing. Leopards typically elicited sequences consisting of a snort followed by few phrases, while eagles typically elicited sequences with no snorts and many phrases. The authors discuss how these call sequences have the potential to encode information at different levels, such as predator type, response-urgency, or the caller's imminent behavior. (PsycINFO Database Record (c) 2009 APA, all rights reserved).
Moore, P Quincy; Weber, Joseph; Cina, Steven; Aks, Steven
2017-11-01
We describe surveillance data from three existing surveillance systems during an unexpected fentanyl outbreak in a large metropolitan area. We performed a retrospective analysis of three data sets: Chicago Fire Department EMS, Cook County Medical Examiner, and Illinois Poison Center. Each included data from January 1, 2015 through December 31, 2015. EMS data included all EMS responses in Chicago, Illinois, for suspected opioid overdose in which naloxone was administered and EMS personnel documented other criteria indicative of opioid overdose. Medical Examiner data included all deaths in Cook County, Illinois, related to heroin, fentanyl or both. Illinois Poison Center data included all calls in Chicago, Illinois, related to fentanyl, heroin, and other prescription opioids. Descriptive statistics using Microsoft Excel® were used to analyze the data and create figures. We identified a spike in opioid-related EMS responses during an 11-day period from September 30-October 10, 2015. Medical Examiner data showed an increase in both fentanyl and mixed fentanyl/heroin related deaths during the months of September and October, 2015 (375% and 550% above the median, respectively). Illinois Poison Center data showed no significant increase in heroin, fentanyl, or other opioid-related calls during September and October 2015. Our data suggest that EMS data are an effective real-time surveillance mechanism for changes in the rate of opioid overdoses. Medical Examiner data were found to be valuable for confirmation of EMS surveillance data and identification of specific intoxicants. Poison Center data did not correlate with EMS or Medical Examiner data. Copyright © 2017 Elsevier Inc. All rights reserved.
The impact of work-related stress on medication errors in Eastern Region Saudi Arabia.
Salam, Abdul; Segal, David M; Abu-Helalah, Munir Ahmad; Gutierrez, Mary Lou; Joosub, Imran; Ahmed, Wasim; Bibi, Rubina; Clarke, Elizabeth; Qarni, Ali Ahmed Al
2018-05-07
To examine the relationship between overall stress level, source-specific work-related stressors, and the medication error rate. A cross-sectional study examined the relationship between overall levels of stress, 25 source-specific work-related stressors, and the medication error rate based on documented incident reports in a Saudi Arabian (SA) hospital, using secondary databases. King Abdulaziz Hospital in Al-Ahsa, Eastern Region, SA. Two hundred and sixty-nine healthcare professionals (HCPs). The odds ratio (OR) and corresponding 95% confidence interval (CI) for HCP-documented incident-report medication errors and self-reported sources of stress from the Job Stress Survey. Multiple logistic regression analysis identified source-specific work-related stressors significantly associated with HCPs who made at least one medication error per month (P < 0.05), including disruption to home life, pressure to meet deadlines, difficulties with colleagues, excessive workload, income over 10 000 riyals and compulsory night/weekend call duties either some or all of the time. Although not statistically significant, HCPs who reported overall stress were two times more likely to make at least one medication error per month than non-stressed HCPs (OR: 1.95, P = 0.081). This is the first study to use documented incident reports for medication errors rather than self-report to evaluate the level of stress-related medication errors in SA HCPs. Job demands, such as social stressors (home life disruption, difficulties with colleagues), time pressures, structural determinants (compulsory night/weekend call duties) and higher income, were significantly associated with medication errors, whereas overall stress revealed a 2-fold higher trend.
Modernization and multiscale databases at the U.S. geological survey
Morrison, J.L.
1992-01-01
The U.S. Geological Survey (USGS) has begun a digital cartographic modernization program. Keys to that program are the creation of a multiscale database, a feature-based file structure that is derived from a spatial data model, and a series of "templates" or rules that specify the relationships between instances of entities in reality and features in the database. The database will initially hold data collected from the USGS standard map products at scales of 1:24,000, 1:100,000, and 1:2,000,000. The spatial data model is called the digital line graph-enhanced model, and the comprehensive rule set consists of collection rules, product generation rules, and conflict resolution rules. This modernization program will affect the USGS mapmaking process because both digital and graphic products will be created from the database. In addition, non-USGS map users will have more flexibility in uses of the databases. These remarks are those of the session discussant, made in response to the six papers and the keynote address given in the session. © 1992.
Active browsing using similarity pyramids
NASA Astrophysics Data System (ADS)
Chen, Jau-Yuen; Bouman, Charles A.; Dalton, John C.
1998-12-01
In this paper, we describe a new approach to managing large image databases, which we call active browsing. Active browsing integrates relevance feedback into the browsing environment, so that users can modify the database's organization to suit the desired task. Our method is based on a similarity pyramid data structure, which hierarchically organizes the database, so that it can be efficiently browsed. At coarse levels, the similarity pyramid allows users to view the database as large clusters of similar images. Alternatively, users can 'zoom into' finer levels to view individual images. We discuss relevance feedback for the browsing process, and argue that it is fundamentally different from relevance feedback for more traditional search-by-query tasks. We propose two fundamental operations for active browsing: pruning and reorganization. Both of these operations depend on a user-defined relevance set, which represents the image or set of images desired by the user. We present statistical methods for accurately pruning the database, and we propose a new 'worm hole' distance metric for reorganizing the database, so that members of the relevance set are grouped together.
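The coarse-to-fine browsing idea can be illustrated with an ordinary hierarchical clustering: cutting the tree high gives a few large clusters (the coarse pyramid levels), while cutting it lower gives the finer levels a user "zooms into". The sketch below uses SciPy agglomerative clustering on random placeholder feature vectors; it stands in for, and is not, the paper's similarity pyramid construction.

```python
# Browse a collection at two levels of a cluster hierarchy: a coarse cut with
# few clusters and a finer cut the user can zoom into.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

features = np.random.rand(64, 16)            # 64 images, 16-D feature vectors
tree = linkage(features, method="average")   # hierarchy over the collection

coarse = fcluster(tree, t=4, criterion="maxclust")   # top level: 4 clusters
fine = fcluster(tree, t=16, criterion="maxclust")    # zoomed level: 16 clusters
print(coarse[:10])
print(fine[:10])
```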
AquaPathogen X--A template database for tracking field isolates of aquatic pathogens
Emmenegger, Evi; Kurath, Gael
2012-01-01
AquaPathogen X is a template database for recording information on individual isolates of aquatic pathogens and is available for download from the U.S. Geological Survey (USGS) Western Fisheries Research Center (WFRC) website (http://wfrc.usgs.gov). This template database can accommodate the nucleotide sequence data generated in molecular epidemiological studies along with the myriad of abiotic and biotic traits associated with isolates of various pathogens (for example, viruses, parasites, or bacteria) from multiple aquatic animal host species (for example, fish, shellfish, or shrimp). The simultaneous cataloging of isolates from different aquatic pathogens is a unique feature of the AquaPathogen X database, which can be used in surveillance of emerging aquatic animal diseases and clarification of the main risk factors associated with pathogen incursions into new water systems. As a template database, the data fields are empty upon download and can be modified to user specifications. For example, an application of the template database that stores the epidemiological profiles of fish virus isolates, called Fish ViroTrak (fig. 1), was also developed (Emmenegger and others, 2011).
SoyFN: a knowledge database of soybean functional networks.
Xu, Yungang; Guo, Maozu; Liu, Xiaoyan; Wang, Chunyu; Liu, Yang
2014-01-01
Many databases for soybean genomic analysis have been built and made publicly available, but few of them contain knowledge specifically targeting the omics-level gene-gene, gene-microRNA (miRNA) and miRNA-miRNA interactions. Here, we present SoyFN, a knowledge database of soybean functional gene networks and miRNA functional networks. SoyFN provides user-friendly interfaces to retrieve, visualize, analyze and download the functional networks of soybean genes and miRNAs. In addition, it incorporates much information about KEGG pathways, gene ontology annotations and 3'-UTR sequences as well as many useful tools including SoySearch, ID mapping, Genome Browser, eFP Browser and promoter motif scan. SoyFN is a schema-free database that can be accessed as a Web service from any modern programming language using a simple Hypertext Transfer Protocol call. The Web site is implemented in Java, JavaScript, PHP, HTML and Apache, with all major browsers supported. We anticipate that this database will be useful for members of research communities both in soybean experimental science and bioinformatics. Database URL: http://nclab.hit.edu.cn/SoyFN.
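As the abstract notes, SoyFN can be reached "using a simple Hypertext Transfer Protocol call" from any language. A minimal Python illustration follows; the endpoint path and parameter name are hypothetical placeholders, not documented SoyFN routes.

```python
# Minimal HTTP call to the SoyFN web service. The /search path and the "gene"
# parameter are invented for illustration; consult the site for actual routes.
import requests

resp = requests.get(
    "http://nclab.hit.edu.cn/SoyFN/search",   # hypothetical endpoint
    params={"gene": "Glyma01g01010"},         # hypothetical parameter
    timeout=30,
)
resp.raise_for_status()
print(resp.text[:200])   # inspect the beginning of the response
```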
A Data Analysis Expert System For Large Established Distributed Databases
NASA Astrophysics Data System (ADS)
Gnacek, Anne-Marie; An, Y. Kim; Ryan, J. Patrick
1987-05-01
The purpose of this work is to analyze the applicability of artificial intelligence techniques for developing a user-friendly, parallel interface to large, isolated, incompatible NASA databases for the purpose of assisting the management decision process. To carry out this work, a survey was conducted to establish the data access requirements of several key NASA user groups. In addition, current NASA database access methods were evaluated. The results of this work are presented in the form of a design for a natural language database interface system, called the Deductively Augmented NASA Management Decision Support System (DANMDS). This design is feasible principally because of recently announced commercial hardware and software product developments which allow cross-vendor compatibility. The goal of the DANMDS system addresses the central dilemma confronting most large companies and institutions in America: the retrieval of information from large, established, incompatible database systems. The DANMDS system implementation would represent a significant first step toward this problem's resolution.
NASA MEaSUREs Combined ASTER and MODIS Emissivity over Land (CAMEL)
NASA Astrophysics Data System (ADS)
Borbas, E. E.; Hulley, G. C.; Feltz, M.; Knuteson, R. O.; Hook, S. J.
2016-12-01
A land surface emissivity product of the NASA MEaSUREs project called Combined ASTER and MODIS Emissivity over Land (CAMEL) is being made available as part of the Unified and Coherent Land Surface Temperature and Emissivity (LST&E) Earth System Data Record (ESDR). The CAMEL database has been created by merging the UW MODIS-based baseline-fit emissivity database (UWIREMIS) developed at the University of Wisconsin-Madison, and the ASTER Global Emissivity Database (ASTER GED V4) produced at JPL. This poster will introduce the beta version of the database, which is available globally for the period 2003 through 2015 at 5-km resolution in mean monthly time steps and for 13 bands from 3.6 to 14.3 micron. An algorithm to create high-spectral-resolution emissivity at 417 wavenumbers is also provided for high-spectral-resolution IR applications. In this poster, the CAMEL database is evaluated against the IASI Emissivity Atlas (Zhou et al., 2010) and laboratory measurements, and also through simulation of IASI BTs in the RTTOV forward model.
A segmentation-free approach to Arabic and Urdu OCR
NASA Astrophysics Data System (ADS)
Sabbour, Nazly; Shafait, Faisal
2013-01-01
In this paper, we present a generic Optical Character Recognition system for Arabic script languages called Nabocr. Nabocr uses OCR approaches specific to Arabic script recognition. Performing recognition on Arabic script text is relatively more difficult than on Latin text due to the nature of Arabic script, which is cursive and context sensitive. Moreover, Arabic script has different writing styles that vary in complexity. Nabocr is initially trained to recognize both Urdu Nastaleeq and Arabic Naskh fonts. However, it can be trained by users for other Arabic script languages. We have evaluated our system's performance for both Urdu and Arabic. In order to evaluate Urdu recognition, we have generated a dataset of Urdu text called UPTI (Urdu Printed Text Image Database), which measures different aspects of a recognition system. The performance of our system on clean Urdu text is 91%. On clean Arabic text, the performance is 86%. Moreover, we have compared the performance of our system against Tesseract's newly released Arabic recognition, and the performance of both systems on clean images is almost the same.
EVALUATION OF PUBLIC DATABASES AS SOURCES OF DATA FOR LIFE CYCLE ASSESSMENTS
Methods to determine the environmental effects of production systems must encourage a comprehensive evaluation of all "upstream" and "downstream" effects and their interrelationships. This cradle-to-grave approach, called Life Cycle Assessment (LCA), has led to the development...
47 CFR 64.5105 - Use of customer proprietary network information without customer approval.
Code of Federal Regulations, 2014 CFR
2014-10-01
... calls; (ii) Access, either directly or via a third party, a commercially available database that will... permit access to CPNI upon request by the administrator of the TRS Fund, as that term is defined in § 64...
47 CFR 64.5105 - Use of customer proprietary network information without customer approval.
Code of Federal Regulations, 2013 CFR
2013-10-01
... calls; (ii) Access, either directly or via a third party, a commercially available database that will... permit access to CPNI upon request by the administrator of the TRS Fund, as that term is defined in § 64...
Database of significant deposits of gold, silver, copper, lead, and zinc in the United States
Long, Keith R.; DeYoung,, John H.; Ludington, Stephen
1998-01-01
It has long been recognized that the largest mineral deposits contain most of the known mineral endowment (Singer and DeYoung, 1980). Sometimes called giant or world-class deposits, these largest deposits account for a very large share of historic and current mineral production and resources in industrial society (Singer, 1995). For example, Singer (1995) shows that the largest 10 percent of the world’s gold deposits contain 86 percent of the gold discovered to date. Many mineral resource issues and investigations are more easily addressed if limited to the relatively small number of deposits that contain most of the known mineral resources. An estimate of known resources using just these deposits would normally be sufficient, because considering smaller deposits would not add significantly to the total estimate. Land-use planning should deal mainly with these deposits due to their relative scarcity, the large share of known resources they contain, and the fact that economies of scale allow minerals to be produced much more cheaply from larger deposits. Investigation of environmental and other hazards that result from mining operations can be limited to these largest deposits because they account for most of past and current production. The National Mineral Resource Assessment project of the U.S. Geological Survey (USGS) has compiled a database on the largest known deposits of gold, silver, copper, lead, and zinc in the United States to complement the 1996 national assessment of undiscovered deposits of these same metals (Ludington and Cox, 1996). The deposits in this database account for approximately 99 percent of domestic production of these metals and probably a similar share of identified resources. These data may be compared with results of the assessment of undiscovered resources to characterize the nation’s total mineral endowment for these metals. This database is a starting point for any national or regional mineral-resource or mineral-environmental investigation.
Schneider, Jeffrey C; Chen, Liang; Simko, Laura C; Warren, Katherine N; Nguyen, Brian Phu; Thorpe, Catherine R; Jeng, James C; Hickerson, William L; Kazis, Lewis E; Ryan, Colleen M
2018-02-20
The use of common data elements (CDEs) is growing in medical research; CDEs have demonstrated benefit in maximizing the impact of existing research infrastructure and funding. However, the field of burn care does not have a standard set of CDEs. The objective of this study is to examine the extent of common data collected in current burn databases. This study examines the data dictionaries of six U.S. burn databases to ascertain the extent of common data. This was assessed from a quantitative and qualitative perspective. Thirty-two demographic and clinical data elements were examined. The number of databases that collect each data element was calculated. The data values for each data element were compared across the six databases for common terminology. Finally, the data prompts of the data elements were examined for common language and structure. Five (16%) of the 32 data elements are collected by all six burn databases; additionally, five data elements (16%) are present in only one database. Furthermore, there are considerable variations in data values and prompts used among the burn databases. Only one of the 32 data elements (age) contains the same data values across all databases. The burn databases examined show minimal evidence of common data. There is a need to develop CDEs and standardized coding to enhance interoperability of burn databases.
Short Fiction on Film: A Relational DataBase.
ERIC Educational Resources Information Center
May, Charles
Short Fiction on Film is a database that was created and will run on DataRelator, a relational database manager created by Bill Finzer for the California State Department of Education in 1986. DataRelator was designed for use in teaching students database management skills and to provide teachers with examples of how a database manager might be…
ARACHNID: A prototype object-oriented database tool for distributed systems
NASA Technical Reports Server (NTRS)
Younger, Herbert; Oreilly, John; Frogner, Bjorn
1994-01-01
This paper discusses the results of a Phase 2 SBIR project sponsored by NASA and performed by MIMD Systems, Inc. A major objective of this project was to develop specific concepts for improved performance in accessing large databases. An object-oriented and distributed approach was used for the general design, while a geographical decomposition was used as a specific solution. The resulting software framework is called ARACHNID. The Faint Source Catalog developed by NASA was the initial database testbed. This is a database of many gigabytes, where an order-of-magnitude improvement in query speed is being sought. This database contains faint infrared point sources obtained from telescope measurements of the sky. A geographical decomposition of this database is an attractive approach to dividing it into pieces. Each piece can then be searched on individual processors, with only a weak data linkage between the processors being required. As a further demonstration of the concepts implemented in ARACHNID, a tourist information system is discussed. This version of ARACHNID is the commercial result of the project. It is a distributed, networked database application where speed, maintenance, and reliability are important considerations. This paper focuses on the design concepts and technologies that form the basis for ARACHNID.
NASA Astrophysics Data System (ADS)
Parise, Mario; Vennari, Carmela
2015-04-01
Sinkholes are definitely the most typical geohazard affecting karst territories. Even though their formation is typically related to an underground cave and the related subterranean drainage, sinkholes can also be observed on non-soluble deposits such as alluvial and/or colluvial materials. Further, the presence of cavities excavated by man (for different purposes, and in different ages) may be at the origin of other phenomena, the so-called anthropogenic sinkholes, that characterize many historical centres of built-up areas. In Italy, due to the long history of the country, these latter, too, are of great importance, being those that typically involve human buildings and infrastructure, and cause damage and losses to society. As for any other geohazard, building a database through the collection of information on past events is a mandatory step to start the analyses aimed at the evaluation of susceptibility, hazard, and risk. The Institute of Research for the Hydrological Protection (IRPI) of the National Research Council of Italy (CNR) has been working in recent years on the construction of a specific chronological database of sinkholes in the whole country. In the database, natural and anthropogenic sinkholes are treated as two different subsets, given the strong differences in both the causal and triggering factors and the stabilization works. Particular care was given in the database to the precise site and date of occurrence of the events, as crucial information for assessing, respectively, the susceptibility and the hazard related to the particular phenomenon under study. As a requirement for inclusion in the database, a temporal reference for the sinkhole occurrence must therefore be known. Certainty in the geographical position of the event is fundamental information for correctly locating the sinkhole and for developing the geological and morphological considerations needed for a susceptibility analysis. This factor must not be disregarded since, especially for the most ancient events, the data from the sources may not be precise enough to correctly position the sinkhole site. As a consequence, each sinkhole in the database was ranked according to the degree of certainty in its location, subdivided into three levels. The accuracy of the date of occurrence was then evaluated, with the highest accuracy assigned when all the required information (hour, day, month and year of occurrence) was available. The temporal reference is of crucial importance in the IRPI database, since the final goal of the research project is the definition of sinkhole hazard in Italy. To reach this goal, given the definition of hazard, the time of occurrence and the most likely return time of the events have to be assessed. Overall, the aforementioned elements of the database allow some conclusions to be drawn about the reliability and precision of the information presented, and give the correct weight to the outcomes deriving from its analyses. Such issues are discussed in the present contribution, as crucial elements that need to be clearly defined in a scientifically sound database. The database has so far reached about 900 events (31% natural sinkholes and 48% anthropogenic sinkholes, whilst 21% of sinkholes have an uncertain origin).
It is continuously updated, and represents a good starting point for analysis of the sinkhole hazard at the national scale, aimed at increasing the level of attention by scientists, practitioners and authorities on this subtle hazard.
High-energy physics software parallelization using database techniques
NASA Astrophysics Data System (ADS)
Argante, E.; van der Stok, P. D. V.; Willers, I.
1997-02-01
A programming model for software parallelization, called CoCa, is introduced that copes with problems caused by typical features of high-energy physics software. Because CoCa is based on the database transaction paradigm, the complexity induced by the parallelization is largely transparent to the programmer, resulting in a higher level of abstraction than the native message passing software. CoCa is implemented on a Meiko CS-2 and on a SUN SPARCcenter 2000 parallel computer. On the CS-2, the performance is comparable with the performance of native PVM and MPI.
Class dependency of fuzzy relational database using relational calculus and conditional probability
NASA Astrophysics Data System (ADS)
Deni Akbar, Mohammad; Mizoguchi, Yoshihiro; Adiwijaya
2018-03-01
In this paper, we propose a design of a fuzzy relational database to deal with a conditional probability relation using fuzzy relational calculus. Previous research has investigated equivalence classes in fuzzy databases using similarity or approximate relations, and it is an interesting topic to investigate fuzzy dependency using equivalence classes. Our goal is to introduce a formulation of a fuzzy relational database model using the relational calculus on the category of fuzzy relations. We also introduce general formulas of the relational calculus for database operations such as 'projection', 'selection', 'injection' and 'natural join'. Using the fuzzy relational calculus and conditional probabilities, we introduce notions of equivalence class, redundancy, and dependency in the theory of fuzzy relational databases.
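To make the flavor of these operations concrete, here is a small Python sketch in which a fuzzy relation maps tuples to membership degrees in [0, 1], projection max-combines the degrees of merged tuples, and natural join min-combines the degrees of joined tuples. The data and the max/min combination rules are illustrative of standard fuzzy relational algebra, not necessarily the exact formulation in this paper.

```python
# Fuzzy relations as dicts from tuples to membership degrees in [0, 1].
employees = {("alice", "db"): 0.9, ("bob", "db"): 0.6, ("bob", "ml"): 0.8}
skills = {("db", "sql"): 1.0, ("ml", "python"): 0.7}

def project_first(rel):
    # projection onto the first attribute: max-combine degrees of merged tuples
    out = {}
    for (a, _), d in rel.items():
        out[a] = max(out.get(a, 0.0), d)
    return out

def natural_join(r, s):
    # join on the shared middle attribute: min-combine the degrees
    return {
        (a, b, c): min(d1, d2)
        for (a, b), d1 in r.items()
        for (b2, c), d2 in s.items()
        if b == b2
    }

print(project_first(employees))          # {'alice': 0.9, 'bob': 0.8}
print(natural_join(employees, skills))   # ('alice', 'db', 'sql'): 0.9, ...
```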
Liu, Shanshan; Chen, Guanxing; Xu, Haidong; Zou, Weibin; Yan, Wenrui; Wang, Qianqian; Deng, Hengwei; Zhang, Heqian; Yu, Guojiao; He, Jianguo; Weng, Shaoping
2017-01-01
Mud crab (Scylla paramamosain) is an economically important marine cultured species in China's coastal area. Mud crab reovirus (MCRV) is the most important pathogen of mud crab, resulting in large economic losses in crab farming. In this paper, next-generation sequencing technology and bioinformatics analysis were used to study transcriptome differences between MCRV-infected mud crab and a normal control. A total of 104.3 million clean reads were obtained, including 52.7 million and 51.6 million clean reads from MCRV-infected (CA) and control (HA) mud crabs, respectively. In total, 81,901, 70,059 and 67,279 unigenes were obtained from the HA reads, CA reads and combined HA&CA reads, respectively. A total of 32,547 unigenes from the HA&CA reads, called All-Unigenes, were matched to at least one database among the Nr, Nt, Swiss-Prot, COG, GO and KEGG databases. Among these, 13,039, 20,260 and 11,866 unigenes belonged to the 3, 258 and 25 categories of the GO, KEGG pathway, and COG databases, respectively. Solexa/Illumina's DGE platform was also used, and 13,856 differentially expressed genes (DEGs), including 4444 significantly upregulated and 9412 downregulated DEGs, were detected in diseased crabs compared with the control. KEGG pathway analysis revealed that DEGs were obviously enriched in the pathways related to different diseases or infections. This transcriptome analysis provided valuable information on gene functions associated with the response to MCRV in mud crab, as well as detailed information for identifying novel genes in the absence of the mud crab genome database. Copyright © 2016. Published by Elsevier Ltd.
EuCliD (European Clinical Database): a database comparing different realities.
Marcelli, D; Kirchgessner, J; Amato, C; Steil, H; Mitteregger, A; Moscardò, V; Carioni, C; Orlandini, G; Gatti, E
2001-01-01
Quality and variability of dialysis practice are generally gaining more and more importance. Fresenius Medical Care (FMC), as a provider of dialysis, has the duty to continuously monitor and guarantee the quality of care delivered to patients treated in its European dialysis units. Accordingly, a new clinical database called EuCliD has been developed. It is a multilingual and fully codified database, using as far as possible international standard coding tables. EuCliD collects and handles sensitive medical patient data, fully assuring confidentiality. The infrastructure: a Domino server is installed in each country connected to EuCliD. All the centres belonging to a country are connected via modem to the country server. All the Domino servers are connected via Wide Area Network to the headquarters server in Bad Homburg (Germany). Inside each country server, only anonymous data related to that particular country are available. The only place where all the anonymous data are available is the headquarters server. The data collection is strongly supported in each country by "key persons" with solid relationships to their respective national dialysis units. The quality of the data in EuCliD is ensured at different levels. At the end of January 2001, more than 11,000 patients treated in 135 centres located in 7 countries were already included in the system. FMC has put patient care at the centre of its activities for many years and is now able to provide transparency to the community (authorities, nephrologists, patients...), thus demonstrating the quality of the service.
A Tool for Conditions Tag Management in ATLAS
NASA Astrophysics Data System (ADS)
Sharmazanashvili, A.; Batiashvili, G.; Gvaberidze, G.; Shekriladze, L.; Formica, A.; Atlas Collaboration
2014-06-01
ATLAS Conditions data include about 2 TB in a relational database and 400 GB of files referenced from the database. Conditions data are entered and retrieved using COOL, the API for accessing data in the LCG Conditions Database infrastructure, and are managed using an ATLAS-customized Python-based tool set. Conditions data are required for every reconstruction and simulation job, so access to them is crucial for all aspects of ATLAS data taking and analysis, as well as for preceding tasks that derive optimal corrections for reconstruction. Optimized sets of conditions for processing are accomplished using strict version control on those conditions: a process which assigns COOL Tags to sets of conditions, and then unifies those conditions over data-taking intervals into a COOL Global Tag. This Global Tag identifies the set of conditions used to process data so that the underlying conditions can be uniquely identified with 100% reproducibility should the processing be executed again. Understanding shifts in the underlying conditions from one tag to another and ensuring interval completeness for all detectors for a set of runs to be processed is a complex task, requiring tools beyond the above-mentioned Python utilities. Therefore, a JavaScript/PHP-based utility called the Conditions Tag Browser (CTB) has been developed. CTB gives detector and conditions experts the possibility to navigate through the different databases and COOL folders; explore the content of given tags and the differences between them, as well as their extent in time; and visualize the content of channels associated with leaf tags. This report describes the structure and the PHP/JavaScript classes and functions of the CTB.
Database of synesthetic color associations for Japanese kanji.
Hamada, Daisuke; Yamamoto, Hiroki; Saiki, Jun
2017-02-01
Synesthesia is a neurological phenomenon in which certain types of stimuli elicit involuntary perceptions in an unrelated pathway. A common type of synesthesia is grapheme-color synesthesia, in which the visual perception of letters and numbers stimulates the perception of a specific color. Previous studies have often collected relatively small numbers of grapheme-color associations per synesthete, but the accumulation of a large quantity of data has greater promise for uncovering the mechanisms underlying synesthetic association. In this study, we therefore collected large samples of data from a total of eight synesthetes. All told, we obtained over 1000 synesthetic colors associated with Japanese kanji characters from each of two synesthetes, over 100 synesthetic colors from each of three synesthetes, and about 80 synesthetic colors associated with Japanese hiragana, Latin letters, and Arabic numerals from each of three synesthetes. We then compiled the data into a database, called the KANJI-Synesthetic Colors Database (K-SCD), which has a total of 5122 colors for 483, 46, and 46 Japanese kanji, hiragana, and katakana characters, respectively, as well as for 26 Latin letters and ten Arabic numerals. In addition to introducing the K-SCD, this article demonstrates the database's merits by using two examples, in which two new rules for synesthetic association, "shape similarity" and "synesthetic color clustering," were found. The K-SCD is publicly accessible ( www.cv.jinkan.kyoto-u.ac.jp/site/uploads/K-SCD.xlsm ) and will be a valuable resource for those who wish to conduct statistical analyses using a rich dataset in order to uncover the rules governing synesthetic association and to understand its mechanisms.
DenHunt - A Comprehensive Database of the Intricate Network of Dengue-Human Interactions.
Karyala, Prashanthi; Metri, Rahul; Bathula, Christopher; Yelamanchi, Syam K; Sahoo, Lipika; Arjunan, Selvam; Sastri, Narayan P; Chandra, Nagasuma
2016-09-01
Dengue virus (DENV) is a human pathogen and its etiology has been widely established. There are many interactions between DENV and human proteins that have been reported in literature. However, no publicly accessible resource for efficiently retrieving the information is yet available. In this study, we mined all publicly available dengue-human interactions that have been reported in the literature into a database called DenHunt. We retrieved 682 direct interactions of human proteins with dengue viral components, 382 indirect interactions and 4120 differentially expressed human genes in dengue infected cell lines and patients. We have illustrated the importance of DenHunt by mapping the dengue-human interactions on to the host interactome and observed that the virus targets multiple host functional complexes of important cellular processes such as metabolism, immune system and signaling pathways suggesting a potential role of these interactions in viral pathogenesis. We also observed that 7 percent of the dengue virus interacting human proteins are also associated with other infectious and non-infectious diseases. Finally, the understanding that comes from such analyses could be used to design better strategies to counteract the diseases caused by dengue virus. The whole dataset has been catalogued in a searchable database, called DenHunt (http://proline.biochem.iisc.ernet.in/DenHunt/).
Unluturk, Mehmet S
2012-06-01
A nurse call system is an electrically functioning system by which patients can call upon a nurse from a bedside station or from a duty station. An intermittent tone is heard and a corridor lamp located outside the room starts blinking at a slow or faster rate depending on the call origination. It is essential to alert nurses on time so that they can offer care and comfort without any delay. There are currently many devices available for a nurse call system to improve communication between nurses and patients, such as pagers, RFID (radio frequency identification) badges, wireless phones and so on. To integrate all these devices into an existing nurse call system and make them communicate with each other, we propose software client applications called bridges in this paper. We also propose a Windows server application called SEE (Supervised Event Executive) that delivers messages among these devices. A single hardware dongle is utilized for authentication and copy protection for SEE. Protecting SEE with the security provided by the dongle alone is a weak defense against hackers. In this paper, we develop some defense patterns against hackers, such as calculating checksums at runtime, making calls to the dongle from multiple places in the code, and handling errors properly by logging them into a database.
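One of the defense patterns mentioned above, calculating checksums at runtime, can be sketched as follows: the program hashes its own code and compares the digest with a value recorded at build time. The expected digest below is a placeholder, and in the spirit of the paper the check would be invoked from several call sites and failures logged to a database.

```python
# Runtime self-checksum: hash the program's own source and compare it with a
# digest recorded at build time (placeholder here).
import hashlib

EXPECTED_SHA256 = "placeholder-digest-recorded-at-build-time"  # assumption

def code_is_intact(path=__file__):
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest() == EXPECTED_SHA256

if not code_is_intact():
    # handle the error properly, e.g. log the event to a database as the
    # authors describe, then refuse to continue
    print("integrity check failed")
```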
Kentala, E; Pyykkö, I; Auramo, Y; Juhola, M
1995-03-01
An interactive database has been developed to assist the diagnostic procedure for vertigo and to store the data. The database offers the possibility to split and reunite the collected information when needed. It contains detailed information about a patient's history, symptoms, and findings in otoneurologic, audiologic, and imaging tests. The symptoms are classified into sets of questions on vertigo (including postural instability), hearing loss and tinnitus, and provoking factors. Confounding disorders are screened. The otoneurologic tests involve saccades, smooth pursuit, posturography, and a caloric test. In addition, findings from specific antibody tests, clinical neurotologic tests, magnetic resonance imaging, brain stem audiometry, and electrocochleography are included. The input information can be applied to workups for vertigo in an expert system called ONE. The database assists its user by making the input of information easy. It can be used not only for diagnostic purposes but is also beneficial for research, and in combination with the expert system, it provides a tutorial guide for medical students.
Data-driven indexing mechanism for the recognition of polyhedral objects
NASA Astrophysics Data System (ADS)
McLean, Stewart; Horan, Peter; Caelli, Terry M.
1992-02-01
This paper is concerned with the problem of searching large model databases. To date, most object recognition systems have concentrated on the problem of matching using simple searching algorithms. This is quite acceptable when the number of object models is small. However, in the future, general purpose computer vision systems will be required to recognize hundreds or perhaps thousands of objects and, in such circumstances, efficient searching algorithms will be needed. The problem of searching a large model database is one which must be addressed if future computer vision systems are to be at all effective. In this paper we present a method we call data-driven feature-indexed hypothesis generation as one solution to the problem of searching large model databases.
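A minimal version of feature-indexed hypothesis generation looks like this: model features are hashed into a table offline, and observed scene features then index directly into candidate models instead of scanning the whole database. The feature encoding below is an invented toy example.

```python
# Offline: index every model in the database by its local features.
from collections import defaultdict

model_features = {
    "cube": [("corner", 3), ("face", 4)],
    "prism": [("corner", 3), ("face", 3)],
    "pyramid": [("apex", 4), ("face", 3)],
}
index = defaultdict(set)
for model, feats in model_features.items():
    for f in feats:
        index[f].add(model)

# Online: features observed in the scene vote for candidate hypotheses, so
# only models sharing at least one feature are ever considered.
observed = [("corner", 3), ("face", 3)]
votes = defaultdict(int)
for f in observed:
    for model in index[f]:
        votes[model] += 1
print(max(votes, key=votes.get))   # best-supported hypothesis: 'prism'
```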
Clegg, Gareth R; Lyon, Richard M; James, Scott; Branigan, Holly P; Bard, Ellen G; Egan, Gerry J
2014-01-01
Survival from out-of-hospital cardiac arrest (OHCA) is dependent on the chain of survival. Early recognition of cardiac arrest and provision of bystander cardiopulmonary resuscitation (CPR) are key determinants of OHCA survival. Emergency medical dispatchers play a key role in cardiac arrest recognition and giving telephone CPR advice. The interaction between caller and dispatcher can influence the time to bystander CPR and quality of resuscitation. We sought to pilot the use of emergency call transcription to audit and evaluate the holdups in performing dispatch-assisted CPR. A retrospective case selection of 50 consecutive suspected OHCA calls was performed. Audio recordings of calls were downloaded from the emergency medical dispatch centre computer database. All calls were transcribed using proprietary software and voice dialogue was compared with the corresponding stage on the Medical Priority Dispatch System (MPDS). Time to progress through each stage and the number of caller-dispatcher interactions were calculated. Of the 50 downloaded calls, 47 were confirmed cases of OHCA. Call transcription was successfully completed for all OHCA calls. Bystander CPR was performed in 39 (83%) of these. In the remaining cases, the caller decided the patient was beyond help (n = 7) or the caller said that they were physically unable to perform CPR (n = 1). MPDS stages varied substantially in time to completion. Stage 9 (determining if the patient is breathing through airway instructions) took the longest time to complete (median = 59 s, IQR 22-82 s). Stage 11 (giving CPR instructions) also took a relatively long time to complete compared to the other stages (median = 46 s, IQR 37-75 s). Stage 5 (establishing the patient's age) took the shortest time to complete (median = 5.5 s, IQR 3-9 s). Transcription of OHCA emergency calls and comparison of caller-dispatcher interaction to MPDS stage is feasible. Confirming whether a patient is breathing and completing CPR instructions required the longest time and the most interactions between caller and dispatcher. Use of call transcription has the potential to identify key factors in caller-dispatcher interaction that could improve time to CPR, and further research is warranted in this area. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
A Java API for working with PubChem datasets
Southern, Mark R.; Griffin, Patrick R.
2011-01-01
Summary: PubChem is a public repository of chemical structures and associated biological activities. The PubChem BioAssay database contains assay descriptions, conditions and readouts and biological screening results that have been submitted by the biomedical research community. The PubChem web site and Power User Gateway (PUG) web service allow users to interact with the data and raw files are available via FTP. These resources are helpful to many but there can also be great benefit by using a software API to manipulate the data. Here, we describe a Java API with entity objects mapped to the PubChem Schema and with wrapper functions for calling the NCBI eUtilities and PubChem PUG web services. PubChem BioAssays and associated chemical compounds can then be queried and manipulated in a local relational database. Features include chemical structure searching and generation and display of curve fits from stored dose–response experiments, something that is not yet available within PubChem itself. The aim is to provide researchers with a fast, consistent, queryable local resource from which to manipulate PubChem BioAssays in a database agnostic manner. It is not intended as an end user tool but to provide a platform for further automation and tools development. Availability: http://code.google.com/p/pubchemdb Contact: southern@scripps.edu PMID:21216779
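The underlying web services the API wraps can be exercised directly. As a hedged illustration (in Python rather than the paper's Java), the call below queries NCBI eUtilities' esearch for PubChem BioAssay records; the search term is an arbitrary example.

```python
# Query NCBI eUtilities (esearch) for PubChem BioAssay identifiers.
import requests

resp = requests.get(
    "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi",
    params={"db": "pcassay", "term": "dose response", "retmode": "json"},
    timeout=30,
)
resp.raise_for_status()
ids = resp.json()["esearchresult"]["idlist"]
print(ids[:5])   # first few matching assay identifiers (AIDs)
```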
Engels, Michael F M; Gibbs, Alan C; Jaeger, Edward P; Verbinnen, Danny; Lobanov, Victor S; Agrafiotis, Dimitris K
2006-01-01
We report on the structural comparison of the corporate collections of Johnson & Johnson Pharmaceutical Research & Development (JNJPRD) and 3-Dimensional Pharmaceuticals (3DP), performed in the context of the recent acquisition of 3DP by JNJPRD. The main objective of the study was to assess the druglikeness of the 3DP library and the extent to which it enriched the chemical diversity of the JNJPRD corporate collection. The two databases, at the time of acquisition, collectively contained more than 1.1 million compounds with a clearly defined structural description. The analysis was based on a clustering approach and aimed at providing an intuitive quantitative estimate and visual representation of this enrichment. A novel hierarchical clustering algorithm called divisive k-means was employed in combination with Kelley's cluster-level selection method to partition the combined data set into clusters, and the diversity contribution of each library was evaluated as a function of the relative occupancy of these clusters. Typical 3DP chemotypes enriching the diversity of the JNJPRD collection were catalogued and visualized using a modified maximum common substructure algorithm. The joint collection of JNJPRD and 3DP compounds was also compared to other databases of known medicinally active or druglike compounds. The potential of the methodology for the analysis of very large chemical databases is discussed.
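The divisive k-means idea can be sketched as repeated bisection: start with everything in one cluster and keep splitting the largest cluster with 2-means until the desired count is reached (this variant is often called bisecting k-means; Kelley's cluster-level selection, which the authors use to pick the partition, is omitted here). Descriptors are random placeholders.

```python
# Divisive (bisecting) k-means: repeatedly split the largest cluster in two.
import numpy as np
from sklearn.cluster import KMeans

def divisive_kmeans(X, n_clusters):
    clusters = [np.arange(len(X))]      # start with one all-inclusive cluster
    while len(clusters) < n_clusters:
        biggest = max(range(len(clusters)), key=lambda i: len(clusters[i]))
        idx = clusters.pop(biggest)
        labels = KMeans(n_clusters=2, n_init=10).fit_predict(X[idx])
        clusters += [idx[labels == 0], idx[labels == 1]]
    return clusters

X = np.random.rand(500, 32)             # 500 compounds, 32-D descriptors
for c in divisive_kmeans(X, 5):
    print(len(c))
```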
Improving accuracy and power with transfer learning using a meta-analytic database.
Schwartz, Yannick; Varoquaux, Gaël; Pallier, Christophe; Pinel, Philippe; Poline, Jean-Baptiste; Thirion, Bertrand
2012-01-01
Typical cohorts in brain imaging studies are not large enough for systematic testing of all the information contained in the images. To build testable working hypotheses, investigators thus rely on analysis of previous work, sometimes formalized in a so-called meta-analysis. In brain imaging, this approach underlies the specification of regions of interest (ROIs) that are usually selected on the basis of the coordinates of previously detected effects. In this paper, we propose to use a database of images, rather than coordinates, and frame the problem as transfer learning: learning a discriminant model on a reference task to apply it to a different but related new task. To facilitate statistical analysis of small cohorts, we use a sparse discriminant model that selects predictive voxels on the reference task and thus provides a principled procedure to define ROIs. The benefits of our approach are twofold. First it uses the reference database for prediction, i.e., to provide potential biomarkers in a clinical setting. Second it increases statistical power on the new task. We demonstrate on a set of 18 pairs of functional MRI experimental conditions that our approach gives good prediction. In addition, on a specific transfer situation involving different scanners at different locations, we show that voxel selection based on transfer learning leads to higher detection power on small cohorts.
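A minimal sketch of this transfer-learning scheme, under the assumption that the sparse discriminant model is an L1-penalized logistic regression: the model is fitted on the large reference task, the voxels with nonzero weights define the ROI, and only those voxels are used on the small new cohort. All data are synthetic placeholders.

```python
# Transfer learning via sparse voxel selection on a reference task.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_ref = rng.normal(size=(200, 5000))                   # large reference task
y_ref = (X_ref[:, :10].sum(axis=1) > 0).astype(int)    # signal in 10 voxels
X_new = rng.normal(size=(18, 5000))                    # small new cohort
y_new = (X_new[:, :10].sum(axis=1) > 0).astype(int)    # related new task

# Sparse (L1) model on the reference task selects predictive voxels
ref_model = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
ref_model.fit(X_ref, y_ref)
roi = np.flatnonzero(ref_model.coef_)
assert roi.size > 0, "no voxels selected; relax the penalty"
print(f"{roi.size} voxels selected as ROI")

# The new task is analysed only within the transferred ROI, which is what
# raises statistical power on the small cohort
new_model = LogisticRegression().fit(X_new[:, roi], y_new)
print("training accuracy on new task:", new_model.score(X_new[:, roi], y_new))
```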
Public perceptions of animal experimentation across Europe.
von Roten, Fabienne Crettaz
2013-08-01
The goal of this article is to map out public perceptions of animal experimentation in 28 European countries. Postulating cross-cultural differences, this study mixes country-level variables (from the Eurostat database) and individual-level variables (from Eurobarometer Science and Technology 2010). It is shown that experimentation on animals such as mice is generally accepted in European countries, but perceptions are divided on dogs and monkeys. Between 2005 and 2010, we observe globally a change of approval on dogs and monkeys, with a significant decrease in nine countries. Multilevel analysis results show differences at country level (related to a post-industrialism model) and at individual level (related to gender, age, education, proximity and perceptions of science and the environment). These results may have consequences for public perceptions of science and we call for more cross-cultural research on press coverage of animal research and on the level of public engagement of scientists doing animal research.
NASA Astrophysics Data System (ADS)
Miritello, Giovanna; Lara, Rubén; Moro, Esteban
Recent research has shown the deep impact of the dynamics of human interactions (or temporal social networks) on the spreading of information, opinion formation, etc. In general, the bursty nature of human interactions weakens the interaction between people to the extent that both the speed and reach of information diffusion are diminished. Using a large database of mobile phone calls from 20 million users, we show evidence that this effect is not homogeneous in the social network; in fact, there is a large correlation between this effect and the social topological structure around a given individual. In particular, we show that the social relations of hubs in a network are, from a dynamical point of view, relatively weaker in the information diffusion process than those of more poorly connected individuals. Our results show the importance of the temporal patterns of communication when analyzing and modeling dynamical processes on social networks.
Emmenegger, E.J.; Kentop, E.; Thompson, T.M.; Pittam, S.; Ryan, A.; Keon, D.; Carlino, J.A.; Ranson, J.; Life, R.B.; Troyer, R.M.; Garver, K.A.; Kurath, G.
2011-01-01
The AquaPathogen X database is a template for recording information on individual isolates of aquatic pathogens and is freely available for download (http://wfrc.usgs.gov). This database can accommodate the nucleotide sequence data generated in molecular epidemiological studies along with the myriad of abiotic and biotic traits associated with isolates of various pathogens (e.g. viruses, parasites and bacteria) from multiple aquatic animal host species (e.g. fish, shellfish and shrimp). The simultaneous cataloguing of isolates from different aquatic pathogens is a unique feature of the AquaPathogen X database, which can be used in surveillance of emerging aquatic animal diseases and elucidation of key risk factors associated with pathogen incursions into new water systems. An application of the template database that stores the epidemiological profiles of fish virus isolates, called Fish ViroTrak, was also developed. Exported records for two aquatic rhabdovirus species emerging in North America were used in the implementation of two separate web-accessible databases: the Molecular Epidemiology of Aquatic Pathogens infectious haematopoietic necrosis virus (MEAP-IHNV) database (http://gis.nacse.org/ihnv/) released in 2006 and the MEAP-viral haemorrhagic septicaemia virus (http://gis.nacse.org/vhsv/) database released in 2010.
ToxMiner Software Interface for Visualizing and Analyzing ToxCast Data
The ToxCast dataset represents a collection of assays and endpoints that will require both standard statistical approaches as well as customized data analysis workflows. To analyze this unique dataset, we have developed an integrated database with Javabased interface called ToxMi...
MODELING DISPERSANT INTERACTIONS WITH OIL SPILLS
EPA is developing a model called the EPA Research Object-Oriented Oil Spill Model (ERO3S) and associated databases to simulate the impacts of dispersants on oil slicks. Because there are features of oil slicks that align naturally with major concepts of object-oriented programmi...
Coding the Eggen Cards (Poster abstract)
NASA Astrophysics Data System (ADS)
Silvis, G.
2014-06-01
(Abstract only) A look at the Eggen Portal for accessing the Eggen cards. And a call for volunteers to help code the cards: 100,000 cards must be looked at and their star references identified and coded into the database for this to be a valuable resource.
2009.1 Revision of the Evaluated Nuclear Data Library (ENDL2009.1)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Thompson, I. J.; Beck, B.; Descalles, M. A.
LLNL’s Computational Nuclear Data and Theory Group have created a 2009.1 revised release of the Evaluated Nuclear Data Library (ENDL2009.1). This library is designed to support LLNL’s current and future nuclear data needs and will be employed in nuclear reactor, nuclear security and stockpile stewardship simulations with ASC codes. The ENDL2009 database was the most complete nuclear database for Monte Carlo and deterministic transport of neutrons and charged particles. It was assembled with strong support from the ASC PEM and Attribution programs, leveraged with support from Campaign 4 and the DOE/Office of Science’s US Nuclear Data Program. This document lists the revisions and fixes made in a new release called ENDL2009.1, by comparing with the existing data in the original release, which is now called ENDL2009.0. These changes are made in conjunction with the revisions for ENDL2011.1, so that both the .1 releases are as free as possible of known defects.
The NASA Program Management Tool: A New Vision in Business Intelligence
NASA Technical Reports Server (NTRS)
Maluf, David A.; Swanson, Keith; Putz, Peter; Bell, David G.; Gawdiak, Yuri
2006-01-01
This paper describes a novel approach to business intelligence and program management for large technology enterprises like the U.S. National Aeronautics and Space Administration (NASA). Two key distinctions of the approach are that 1) standard business documents are the user interface, and 2) a "schema-less" XML database enables flexible integration of technology information for use by both humans and machines in a highly dynamic environment. The implementation utilizes patent-pending NASA software called the NASA Program Management Tool (PMT) and its underlying "schema-less" XML database called Netmark. Initial benefits of PMT include elimination of discrepancies between business documents that use the same information and "paperwork reduction" for program and project management in the form of reducing the effort required to understand standard reporting requirements and to comply with those reporting requirements. We project that the underlying approach to business intelligence will enable significant benefits in the timeliness, integrity and depth of business information available to decision makers on all organizational levels.
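A minimal sketch of the "schema-less" idea (Netmark's actual API is not shown; the store below is a hypothetical stand-in): documents of unrelated shapes are kept as raw XML and queried by structure rather than through a predeclared schema.

    import xml.etree.ElementTree as ET

    # Hypothetical in-memory "schema-less" store: documents are arbitrary XML,
    # and queries walk structure instead of relying on a fixed relational schema.
    store = []

    def add_document(xml_text):
        store.append(ET.fromstring(xml_text))

    def find(tag):
        """Return the text of every element with this tag, across all documents."""
        return [el.text for doc in store for el in doc.iter(tag)]

    add_document("<report><project>PMT</project><status>green</status></report>")
    add_document("<budget><project>PMT</project><fy2006>1.2</fy2006></budget>")

    print(find("project"))  # ['PMT', 'PMT'] -- same field, two unrelated document shapes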
Close Call Action Log Form
NASA Technical Reports Server (NTRS)
Spuler, Linda M.; Ford, Patricia K.; Skeete, Darren C.; Hershman, Scot; Raviprakash, Pushpa; Arnold, John W.; Tran, Victor; Haenze, Mary Alice
2005-01-01
"Close Call Action Log Form" ("CCALF") is the name of both a computer program and a Web-based service provided by the program for creating an enhanced database of close calls (in the colloquial sense of mishaps that were avoided by small margins) assigned to the Center Operations Directorate (COD) at Johnson Space Center. CCALF provides a single facility for on-line collaborative review of close calls. Through CCALF, managers can delegate responses to employees. CCALF utilizes a pre-existing e-mail system to notify managers that there are close calls to review, but eliminates the need for the prior practices of passing multiple e-mail messages around the COD, then collecting and consolidating them into final responses: CCALF now collects comments from all responders for incorporation into reports that it generates. Also, whereas it was previously necessary to manually calculate metrics (e.g., numbers of maintenance-work orders necessitated by close calls) for inclusion in the reports, CCALF now computes the metrics, summarizes them, and displays them in graphical form. The reports and all pertinent information used to generate the reports are logged, tracked, and retained by CCALF for historical purposes.
Symptom attribution after a plane crash: comparison between self-reported symptoms and GP records.
Donker, G A; Yzermans, C J; Spreeuwenberg, P; van der Zee, J
2002-01-01
BACKGROUND: On 4 October 1992, an El Al Boeing 747-F cargo aeroplane crashed on two apartment buildings in Amsterdam. Thirty-nine residents on the ground and the four crew members of the plane died. In the years after, a gradually increasing number of people attributed physical signs and symptoms to their presence at the disaster scene. AIM: To investigate the consistency between patients' symptoms attributed to the crash and GPs' diagnoses and perception of the association with the crash. DESIGN OF STUDY: Comparison between self-reported symptoms to a call centre and GPs' medical records on onset and type of symptoms, diagnoses, and GPs' perception of association with the disaster, assessed by questionnaire. SETTING: Consenting patients (n = 621) contacting the call centre and their GPs. METHOD: Patients were interviewed by the call centre staff and interview data were recorded on a database. Questionnaires were sent to the consenting patients' GPs, requesting their opinions on whether or not their patients' symptoms were attributable to the effects of disaster. Baseline differences and differences in reported symptoms between interviewed patients and their GP records were tested using the chi2 test. RESULTS: The 553 responders reported on average 4.3 symptoms to the call centre. The majority of these symptoms (74%) were reported to the GP. Of the ten most commonly reported symptoms, fatigue, skin complaints, feeling anxious or nervous, dyspnoea, and backache featured in 80% of symptoms reported to the GP. One out of four symptoms was either reported to the GP before the disaster took place, or six or more years after (1998/1999, during a period of much media attention). Depression (7%), post-traumatic stress disorder (PTSD) (5%) and eczema (5%) were most frequently diagnosed by GPs. They related 6% of all reported symptoms to the disaster. CONCLUSIONS: Most of the symptoms attributed to a disaster by patients have been reported to their GP, who related only a small proportion of these to the disaster. PMID:12434961
Critical evaluation and thermodynamic optimization of the Iron-Rare-Earth systems
NASA Astrophysics Data System (ADS)
Konar, Bikram
Rare-earth (RE) elements, by virtue of their distinctive magnetic, electronic and chemical properties, are gaining importance in the power, electronics, telecommunications and sustainable green-technology industries. Magnets made from RE alloys are more powerful than conventional magnets and offer greater longevity and high-temperature workability. The imbalance between rare-earth supply and demand has increased the importance of recycling and of extracting REEs from used permanent magnets. However, the lack of thermodynamic data on RE alloys has made it difficult to design an effective extraction and recycling process. In this regard, computational thermodynamic calculations can serve as a cost-effective and time-saving tool for designing a waste-magnet recycling process. The most common RE permanent magnet is the Nd magnet (Nd2Fe14B). Various elements such as Dy, Tb, Pr, Cu, Co and Ni are added to improve its magnetic and mechanical properties. In order to perform reliable thermodynamic calculations for the RE recycling process, an accurate thermodynamic database for RE and related alloys is required. Such a database can be developed using the so-called CALPHAD method: the critical evaluation and optimization of all available thermodynamic and phase diagram data. As a result, one set of self-consistent thermodynamic functions for all phases in a given system is obtained, which can reproduce all reliable thermodynamic and phase diagram data. The database containing the optimized Gibbs energy functions can be used to calculate complex chemical reactions for any high-temperature process. Typically a Gibbs energy minimization routine, such as that in FactSage software, is used to obtain accurate thermodynamic equilibria in multicomponent systems. As part of a larger thermodynamic database development for permanent-magnet recycling and Mg alloy design, all thermodynamic and phase diagram data in the literature for the fourteen Fe-RE binary systems (Fe-La, Fe-Ce, Fe-Pr, Fe-Nd, Fe-Sm, Fe-Gd, Fe-Tb, Fe-Dy, Fe-Ho, Fe-Er, Fe-Tm, Fe-Lu, Fe-Sc and Fe-Y) were critically evaluated and optimized to obtain thermodynamic model parameters. The model parameters can be used to calculate phase diagrams and Gibbs energies of all phases as functions of temperature and composition. This database can be incorporated into the present thermodynamic database in FactSage to perform complex chemical reaction and phase diagram calculations for the RE magnet recycling process.
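As a sketch of what such an assessment produces, the molar Gibbs energy of a binary solution phase is commonly written with a Redlich-Kister excess term; the interaction parameters L are what the optimization fits to the experimental data. This generic form is standard CALPHAD practice rather than a formula quoted from this work:

    G_m = x_A\,{}^{0}G_A + x_B\,{}^{0}G_B + RT\,(x_A \ln x_A + x_B \ln x_B)
          + x_A x_B \sum_{i=0}^{n} {}^{i}L_{A,B}\,(x_A - x_B)^{i}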
Digital divide, biometeorological data infrastructures and human vulnerability definition
NASA Astrophysics Data System (ADS)
Fdez-Arroyabe, Pablo; Lecha Estela, Luis; Schimt, Falko
2018-05-01
Nowadays, the design and implementation of any climate-related health service implies avoiding the digital divide, since using such a service means having access to, and being able to use, complex technological devices, massive meteorological data, the user's geographic location and biophysical information. This article presents in detail the co-creation of a biometeorological data infrastructure: a complex platform formed by multiple components, namely a mainframe, a biometeorological model called Pronbiomet, a relational database management system, data procedures, communication protocols, different software packages, users, datasets and a mobile application. The system produces four daily world maps of the partial density of atmospheric oxygen and collects user feedback on their health condition. The infrastructure is shown to be a useful tool for delineating individual vulnerability to meteorological changes, one key factor in the definition of any biometeorological risk. This technological approach to studying weather-related health impacts is the initial seed for the definition of biometeorological profiles of persons and for the future development of customized climate services.
Morphological filtering and multiresolution fusion for mammographic microcalcification detection
NASA Astrophysics Data System (ADS)
Chen, Lulin; Chen, Chang W.; Parker, Kevin J.
1997-04-01
Mammographic images are often of relatively low contrast and poor sharpness, with non-stationary background or clutter, and are usually corrupted by noise. In this paper, we propose a new method for microcalcification detection using gray-scale morphological filtering followed by multiresolution fusion, and present a unified general filtering form called the local operating transformation for whitening filtering and adaptive thresholding. The gray-scale morphological filters are used to remove all large areas that are considered non-stationary background or clutter variations, i.e., to prewhiten the images. The multiresolution fusion decision is based on matched-filter theory. In addition to the normal matched filter, the Laplacian matched filter, which is directly related through the wavelet transform to multiresolution analysis, is exploited for microcalcification feature detection. At the multiresolution fusion stage, region-growing techniques are used at each resolution level, and parent-child relations between resolution levels are used to make the final detection decision. FROC performance is computed from tests on the Nijmegen database.
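A minimal sketch of the prewhitening step on a synthetic image (the paper's exact structuring-element sizes and fusion rules are not reproduced; the size below is an assumption): a morphological white top-hat removes the slowly varying background, leaving small bright structures such as microcalcifications.

    import numpy as np
    from scipy import ndimage

    # Synthetic mammogram-like image: smooth background plus one bright spot.
    rng = np.random.default_rng(0)
    yy, xx = np.mgrid[0:256, 0:256]
    background = 100 + 0.2 * yy + 10 * np.sin(xx / 40.0)  # non-stationary background
    image = background + rng.normal(0, 2, size=(256, 256))
    image[64, 64] += 30                                   # a "microcalcification"

    # White top-hat = image minus its morphological opening: structures smaller
    # than the structuring element survive, large background areas are removed.
    prewhitened = ndimage.white_tophat(image, size=9)     # size is an assumption

    # Simple adaptive threshold on the prewhitened image.
    detections = prewhitened > prewhitened.mean() + 4 * prewhitened.std()
    print(detections[64, 64], detections.sum())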
Identification of membrane proteome of Paracoccidioides lutzii and its regulation by zinc
de Curcio, Juliana Santana; Silva, Marielle Garcia; Silva Bailão, Mirelle Garcia; Báo, Sônia Nair; Casaletti, Luciana; Bailão, Alexandre Mello; de Almeida Soares, Célia Maria
2017-01-01
Aim: During infection development in the host, Paracoccidioides spp. face the deprivation of micronutrients, a mechanism called nutritional immunity. This condition induces the remodeling of proteins present in different metabolic pathways. Therefore, we attempted to identify membrane proteins and their regulation by zinc in Paracoccidioides lutzii. Materials & methods: Membrane-enriched fractions of P. lutzii yeast cells were isolated and purified, and proteins were identified by 2D LC-MS/MS and database searching. Results & conclusion: Zinc deprivation suppressed the expression of membrane proteins such as glycoproteins, proteins involved in cell wall synthesis and proteins related to oxidative phosphorylation. This is the first study describing membrane proteins and the effect of zinc deficiency on their regulation in a member of the genus Paracoccidioides. PMID:29134119
A virtual university Web system for a medical school.
Séka, L P; Duvauferrier, R; Fresnel, A; Le Beux, P
1998-01-01
This paper describes a virtual medical university Web server. The project started in 1994 with the development of the French Radiology Server. The main objective of our virtual medical university is to offer not only initial training (for students) but also continuing professional education (for practitioners). Our system is based on electronic textbooks, clinical cases (around 4000) and a medical knowledge base called A.D.M. ("Aide au Diagnostic Medical"). We have indexed all electronic textbooks and clinical cases according to the ADM base in order to facilitate navigation through the system. The knowledge base is supported by a relational database management system. The virtual medical university, available on the Web, is presently undergoing external evaluation.
Incorporating Spatial Data into Enterprise Applications
NASA Astrophysics Data System (ADS)
Akiki, Pierre; Maalouf, Hoda
The main goal of this chapter is to discuss the usage of spatial data within enterprise as well as smaller line-of-business applications. In particular, this chapter proposes new methodologies for storing and manipulating vague spatial data and provides methods for visualizing both crisp and vague spatial data. It also provides a comparison between different types of spatial data, mainly 2D crisp and vague spatial data, and their respective fields of application. Additionally, it compares existing commercial relational database management systems, which are the most widely used with enterprise applications, and discusses their deficiencies in terms of spatial data support. A new spatial extension package called Spatial Extensions (SPEX) is provided in this chapter and is tested on a software prototype.
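One common way to model a vague region, offered here as a hedged sketch rather than SPEX's actual representation, is a pair of nested polygons: a core that certainly belongs to the region and a wider extent that possibly belongs, which yields three-valued point queries.

    from shapely.geometry import Point, Polygon

    # Hypothetical "egg-yolk" style model of a vague region; this construction
    # is illustrative and is not the representation used by SPEX itself.
    core = Polygon([(2, 2), (6, 2), (6, 6), (2, 6)])    # certainly inside
    extent = Polygon([(0, 0), (8, 0), (8, 8), (0, 8)])  # possibly inside

    def classify(p: Point) -> str:
        if core.contains(p):
            return "inside"
        if extent.contains(p):
            return "maybe"
        return "outside"

    print(classify(Point(4, 4)))  # inside
    print(classify(Point(7, 7)))  # maybe
    print(classify(Point(9, 9)))  # outside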
Relational Database for the Geology of the Northern Rocky Mountains - Idaho, Montana, and Washington
Causey, J. Douglas; Zientek, Michael L.; Bookstrom, Arthur A.; Frost, Thomas P.; Evans, Karl V.; Wilson, Anna B.; Van Gosen, Bradley S.; Boleneus, David E.; Pitts, Rebecca A.
2008-01-01
A relational database was created to prepare and organize geologic map-unit and lithologic descriptions for input into a spatial database for the geology of the northern Rocky Mountains, a compilation of forty-three geologic maps for parts of Idaho, Montana, and Washington in U.S. Geological Survey Open File Report 2005-1235. Not all of the information was transferred to and incorporated in the spatial database due to physical file limitations. This report releases that part of the relational database that was completed for that earlier product. In addition to descriptive geologic information for the northern Rocky Mountains region, the relational database contains a substantial bibliography of geologic literature for the area. The relational database nrgeo.mdb (linked below) is available in Microsoft Access version 2000, a proprietary database program. The relational database contains data tables and other tables used to define terms, relationships between the data tables, and hierarchical relationships in the data; forms used to enter data; and queries used to extract data.
Correcting ligands, metabolites, and pathways
Ott, Martin A; Vriend, Gert
2006-01-01
Background: A wide range of research areas in bioinformatics, molecular biology and medicinal chemistry require precise chemical structure information about molecules and reactions, e.g. drug design, ligand docking, metabolic network reconstruction, and systems biology. Most available databases, however, treat chemical structures more as illustrations than as data fields in their own right. This lack of chemical accuracy impedes progress in the areas mentioned above. We present a database of metabolites called BioMeta that augments the existing pathway databases by explicitly assessing the validity, correctness, and completeness of chemical structure and reaction information. Description: The main bulk of the data in BioMeta were obtained from the KEGG Ligand database. We developed a tool for chemical structure validation which assesses the chemical validity and stereochemical completeness of a molecule description. The validation tool was used to examine the compounds in BioMeta, showing that a relatively small number of compounds had an incorrect constitution (connectivity only, not considering stereochemistry) and that a considerable number (about one third) had incomplete or even incorrect stereochemistry. We made a large effort to correct the errors and to complete the structural descriptions. A total of 1468 structures were corrected and/or completed. We also established the reaction balance of the reactions in BioMeta and corrected 55% of the unbalanced (stoichiometrically incorrect) reactions in an automatic procedure. The BioMeta database was implemented in PostgreSQL and provided with a web-based interface. Conclusion: We demonstrate that the validation of metabolite structures and reactions is a feasible and worthwhile undertaking, and that the validation results can be used to trigger corrections and improvements to BioMeta, our metabolite database. BioMeta provides tools for rational drug design, reaction searches, and visualization. It is freely available provided that the copyright notice of all original data is cited. The database will be useful for querying and browsing biochemical pathways, and for obtaining reference information for identifying compounds. However, these applications require that the underlying data be correct, and that is the focus of BioMeta. PMID:17132165
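A hedged sketch of the reaction-balance check using RDKit (BioMeta's own validation code is not shown; the example reaction, a sugar isomerization, is chosen purely for illustration): count atoms, including implicit hydrogens, on each side of a reaction and flag any mismatch as unbalanced.

    from collections import Counter
    from rdkit import Chem

    def atom_counts(smiles: str) -> Counter:
        """Count atoms (including implicit hydrogens) in one molecule."""
        mol = Chem.MolFromSmiles(smiles)
        counts = Counter()
        for atom in mol.GetAtoms():
            counts[atom.GetSymbol()] += 1
            counts["H"] += atom.GetTotalNumHs()
        return counts

    def is_balanced(substrates, products) -> bool:
        left = sum((atom_counts(s) for s in substrates), Counter())
        right = sum((atom_counts(p) for p in products), Counter())
        return left == right

    # Glucose <-> fructose (both C6H12O6), so the reaction is balanced.
    print(is_balanced(["OCC1OC(O)C(O)C(O)C1O"], ["OCC(=O)C(O)C(O)C(O)CO"]))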
Hayashi, Takanori; Matsuzaki, Yuri; Yanagisawa, Keisuke; Ohue, Masahito; Akiyama, Yutaka
2018-05-08
Protein-protein interactions (PPIs) play several roles in living cells, and computational PPI prediction is a major focus of many researchers. The three-dimensional (3D) structure and binding surface are important for the design of PPI inhibitors. Therefore, rigid-body protein-protein docking calculations for two protein structures are expected to allow elucidation of PPIs different from known complexes in terms of 3D structure, because known PPI information is not explicitly required. We have developed rapid PPI prediction software based on protein-protein docking, called MEGADOCK. In order to fully utilize the benefits of computational PPI predictions, it is necessary to construct a comprehensive database that gathers prediction results and their predicted 3D complex structures and makes them easily accessible. Although several databases exist that provide predicted PPIs, previous databases do not contain a sufficient number of entries for the purpose of discovering novel PPIs. In this study, we constructed an integrated database of MEGADOCK PPI predictions, named MEGADOCK-Web. MEGADOCK-Web provides more than 10 times as many PPI predictions as previous databases and enables users to conduct PPI predictions that cannot be found in conventional PPI prediction databases. In MEGADOCK-Web, there are 7528 protein chains and 28,331,628 predicted PPIs from all possible combinations of those proteins. Each protein structure is annotated with its PDB ID, chain ID, UniProt AC, related KEGG pathway IDs, and known PPI pairs. Additionally, MEGADOCK-Web provides four powerful functions: 1) searching precalculated PPI predictions, 2) providing annotations for each predicted protein pair with an experimentally known PPI, 3) visualizing candidates that may interact with the query protein on biochemical pathways, and 4) visualizing predicted complex structures through a 3D molecular viewer. MEGADOCK-Web provides a huge number of comprehensive PPI predictions based on docking calculations with biochemical pathways and enables users to easily and quickly assess PPI feasibility by archiving PPI predictions. MEGADOCK-Web also promotes the discovery of new PPIs and protein functions and is freely available for use at http://www.bi.cs.titech.ac.jp/megadock-web/ .
Migration from relational to NoSQL database
NASA Astrophysics Data System (ADS)
Ghotiya, Sunita; Mandal, Juhi; Kandasamy, Saravanakumar
2017-11-01
The data generated by real-time applications, social networking sites and sensor devices is huge in volume and unstructured, which makes it difficult for relational database management systems to handle. Data is a very precious component of any application and needs to be analysed after arranging it in some structure. Relational databases are only able to deal with structured data, so there is a need for NoSQL database management systems, which can also deal with semi-structured data. Relational databases provide the easiest way to manage data, but as the use of NoSQL increases it is becoming necessary to migrate data from relational to NoSQL databases. Various frameworks have been proposed previously that provide mechanisms for migrating data stored in SQL warehouses, as well as middle-layer solutions that allow unstructured data to be stored in NoSQL databases. This paper provides a literature review of some of the recent approaches proposed by various researchers to migrate data from relational to NoSQL databases. Some researchers have proposed mechanisms for the co-existence of NoSQL and relational databases. The paper summarises mechanisms which can be used for mapping data stored in relational databases to NoSQL databases, together with various techniques for data transformation and middle-layer solutions.
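A minimal sketch of one such mapping, under assumed table names: rows from a normalized pair of relational tables are denormalized into self-contained JSON documents, the shape most document-oriented NoSQL stores expect.

    import json
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE post (id INTEGER PRIMARY KEY, author_id INTEGER, body TEXT);
        INSERT INTO author VALUES (1, 'Ada');
        INSERT INTO post VALUES (10, 1, 'hello'), (11, 1, 'world');
    """)

    # One document per author, embedding the one-to-many posts relation.
    documents = []
    for author_id, name in conn.execute("SELECT id, name FROM author"):
        posts = [body for (body,) in conn.execute(
            "SELECT body FROM post WHERE author_id = ?", (author_id,))]
        documents.append({"_id": author_id, "name": name, "posts": posts})

    # The JSON lines below could be bulk-loaded into a document store.
    print("\n".join(json.dumps(d) for d in documents))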
Chiu, Yi-Yuan; Lin, Chun-Yu; Lin, Chih-Ta; Hsu, Kai-Cheng; Chang, Li-Zen; Yang, Jinn-Moon
2012-01-01
To discover a compound inhibiting multiple proteins (i.e., polypharmacological targets) is a new paradigm for complex diseases (e.g., cancers and diabetes). In general, polypharmacological proteins often share similar local binding environments and motifs. With the exponential growth in the number of protein structures, finding similar structural binding motifs (pharma-motifs) has become an urgent task for drug discovery (e.g., side effects and new uses for old drugs) and for understanding protein functions. We have developed a Space-Related Pharma-motifs method (called SRPmotif) to recognize binding motifs by searching against a protein structure database. SRPmotif is able to recognize conserved binding environments containing spatially discontinuous pharma-motifs, which are often short conserved peptides with specific physico-chemical properties important for protein functions. Among 356 pharma-motifs, 56.5% of interacting residues are highly conserved. Experimental results indicate that 81.1% and 92.7% of the polypharmacological targets of each protein-ligand complex are annotated with the same biological process (BP) and molecular function (MF) terms, respectively, based on Gene Ontology (GO). Our experimental results show that the identified pharma-motifs often consist of key residues in functional (active) sites and play key roles in protein functions. SRPmotif is available at http://gemdock.life.nctu.edu.tw/SRP/. SRPmotif is able to identify similar pharma-interfaces and pharma-motifs sharing similar binding environments for polypharmacological targets by rapidly searching against the protein structure database. Pharma-motifs describe the conservation of binding environments for drug discovery and protein functions. Additionally, these pharma-motifs provide clues for discovering new sequence-based motifs to predict protein functions from protein sequence databases. We believe that SRPmotif is useful for elucidating protein functions and drug discovery.
Automating Relational Database Design for Microcomputer Users.
ERIC Educational Resources Information Center
Pu, Hao-Che
1991-01-01
Discusses issues involved in automating the relational database design process for microcomputer users and presents a prototype of a microcomputer-based system (RA, Relation Assistant) that is based on expert systems technology and helps avoid database maintenance problems. Relational database design is explained and the importance of easy input…
Cultural variation in communal versus exchange norms: Implications for social support.
Miller, Joan G; Akiyama, Hiroko; Kapadia, Shagufa
2017-07-01
Whereas an interdependent cultural view of self has been linked to communal norms and to socially supportive behavior, its relationship to social support has been called into question by research suggesting that discomfort in social support is associated with an interdependent cultural view of self (e.g., Taylor et al., 2004). These contrasting claims were addressed in 2 studies conducted among Japanese, Indian, and American adults. Assessing everyday social support, Study 1 showed that Japanese and Americans rely on exchange norms more frequently than Indians among friends, whereas Americans rely on exchange norms more frequently than Indians and Japanese among siblings. Assessing responses to vignettes, Study 2 demonstrated that Japanese and Americans rely more frequently on exchange norms than Indians, with the greatest relational concerns and most negative outlooks on social support observed among Japanese, less among Americans, and least among Indians. Results further indicated that relational concerns mediated the link between exchange norms and negative social support outlooks. Supporting past claims that relational concerns explain cultural variation in discomfort in social support (e.g., Kim, Sherman, & Taylor, 2008), the findings underscore the need to also take into account the role of exchange norms in explaining such discomfort. The findings also highlight the existence of culturally variable approaches to exchange and call into question claims that discomfort in social support can be explained in terms of the global concept of an interdependent cultural view of self. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Rural Leadership. January 1979-December 1988. Quick Bibliography Series.
ERIC Educational Resources Information Center
La Caille John, Patricia, Comp.
This bibliography contains 126 entries of written materials available from the National Agricultural Library's (NAL) AGRICOLA database pertaining to agricultural or rural leadership. Each entry includes bibliographical information and the NAL call number, while some also include abstracts. All the listed materials, including books, reports,…
78 FR 73535 - Privacy Act System of Records
Federal Register 2010, 2011, 2012, 2013, 2014
2013-12-06
... (WCB) will use the information contained in FCC/WCB- 1 to cover the personally identifiable information... direction. USAC will maintain the databases containing consumer PII that are necessary to eliminate waste... practices.'' The contractor will operate this call center, which individuals may use who are seeking to...
A Probabilistic Approach to Crosslingual Information Retrieval
2001-06-01
language expansion step can be performed before the translation process. Implemented as a call to the INQUERY function get_modified_query with one of the...database consists of American English while the dictionary is British English. Therefore, e.g. the Spanish word basura is translated to rubbish and
False Fronts? Behind Higher Education's Voluntary Accountability Systems
ERIC Educational Resources Information Center
Kelly, Andrew P.; Aldeman, Chad
2010-01-01
The major higher education trade associations have addressed the calls for transparency and accountability by creating two public online databases into which colleges are able to voluntarily submit information on costs and outcomes. The National Association of Independent Colleges and Universities (NAICU) launched its University and College…
Multi-Sensor Scene Synthesis and Analysis
1981-09-01
Table-of-contents excerpt: Quad Trees for Image Representation and Processing; 2.6.2 Databases; 2.6.2.1 Definitions and Basic Concepts; 2.6.3 Use of Databases in Hierarchical Scene Analysis; 2.6.4 Use of Relational Tables; ... Multisensor Image Database Systems (MIDAS); 2.7.2 Relational Database System for Pictures; 2.7.3 Relational Pictorial Database
Bach, A; Christensen, E F
2007-07-01
The first link in the 'chain of survival' is the activation of Emergency Medical Services (EMS). In the major part of Denmark, police officers operate the alarm 1-1-2 centre, including calls for EMS. Our aim was to study the police 1-1-2 operators' accuracy in identifying calls concerning patients with loss of consciousness as a key symptom of life-threatening conditions. 'Unconsciousness' was defined as patients with a Glasgow Coma Scale (GCS) score of < 9, scored by the on-scene anaesthesiologist from the Mobile Emergency Care Unit (MECU). This study was an observational cohort study based on data from the Police 1-1-2 Database and the Aarhus County Pre-hospital Database containing data from MECU cases during 6 months in 2004-05. Two thousand, three hundred and forty-three emergency calls with MECU dispatch were identified. In 1655 cases, both 1-1-2 data and the GCS score were recorded. Two hundred and ninety-five patients were found with a GCS score of < 9 at MECU arrival, 243 of whom were reported 'unconscious' by 1-1-2, giving a sensitivity of 82%. One thousand, three hundred and sixty patients were found with a GCS score of > or = 9, 972 of whom were reported 'awake', giving a specificity of 72%. The positive predictive value (percentage of patients found with a GCS score of < 9 at MECU arrival amongst patients reported as 'unconscious') was 39%. The accuracy was moderate with room for improvement. The positive predictive value was low, indicating over-triage of MECU.
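The reported accuracy figures can be reproduced from the 2x2 table implied by the numbers above; a small arithmetic check:

    # From the abstract: 295 patients with GCS < 9 (243 reported 'unconscious'),
    # 1360 patients with GCS >= 9 (972 reported 'awake').
    true_pos = 243              # unconscious, reported unconscious
    false_neg = 295 - 243       # unconscious, reported awake = 52
    true_neg = 972              # awake, reported awake
    false_pos = 1360 - 972      # awake, reported unconscious = 388

    sensitivity = true_pos / (true_pos + false_neg)  # 0.82
    specificity = true_neg / (true_neg + false_pos)  # 0.71 (reported as ~72%)
    ppv = true_pos / (true_pos + false_pos)          # 0.39

    print(f"sensitivity={sensitivity:.2f} specificity={specificity:.2f} ppv={ppv:.2f}")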
Li, John; Maclehose, Rich; Smith, Kirk; Kaehler, Dawn; Hedberg, Craig
2011-01-01
Foodborne illness surveillance based on consumer complaints detects outbreaks by finding common exposures among callers, but this process is often difficult. Laboratory testing of ill callers could also help identify potential outbreaks. However, collection of stool samples from all callers is not feasible. Methods to help screen calls for etiology are needed to increase the efficiency of complaint surveillance systems and increase the likelihood of detecting foodborne outbreaks caused by Salmonella. Data from the Minnesota Department of Health foodborne illness surveillance database (2000 to 2008) were analyzed. Complaints with identified etiologies were examined to create a predictive model for Salmonella. Bootstrap methods were used to internally validate the model. Seventy-one percent of complaints in the foodborne illness database with known etiologies were due to norovirus. The predictive model had a good discriminatory ability to identify Salmonella calls. Three cutoffs for the predictive model were tested: one that maximized sensitivity, one that maximized specificity, and one that maximized predictive ability, providing sensitivities and specificities of 32 and 96%, 100 and 54%, and 89 and 72%, respectively. Development of a predictive model for Salmonella could help screen calls for etiology. The cutoff that provided the best predictive ability for Salmonella corresponded to a caller reporting diarrhea and fever with no vomiting, and five or fewer people ill. Screening calls for etiology would help identify complaints for further follow-up and result in identifying Salmonella cases that would otherwise go unconfirmed; in turn, this could lead to the identification of more outbreaks.
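The best-performing cutoff is described concretely enough to implement directly; a minimal sketch (the fitted model itself is not given in this abstract, and the example call records are hypothetical):

    def flag_for_salmonella(diarrhea: bool, fever: bool, vomiting: bool,
                            n_ill: int) -> bool:
        """Screening rule matching the best-predictive-ability cutoff described
        above: diarrhea and fever, no vomiting, five or fewer people ill."""
        return diarrhea and fever and not vomiting and n_ill <= 5

    # Hypothetical call records for illustration.
    calls = [
        {"diarrhea": True, "fever": True, "vomiting": False, "n_ill": 3},
        {"diarrhea": True, "fever": False, "vomiting": True, "n_ill": 12},
    ]
    for c in calls:
        print(flag_for_salmonella(**c))  # True, then False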
Enhanced DIII-D Data Management Through a Relational Database
NASA Astrophysics Data System (ADS)
Burruss, J. R.; Peng, Q.; Schachter, J.; Schissel, D. P.; Terpstra, T. B.
2000-10-01
A relational database is being used to serve data about DIII-D experiments. The database is optimized for queries across multiple shots, allowing rapid data mining by SQL-literate researchers. The relational database relates different experiments and datasets, thus providing a big picture of DIII-D operations. Users are encouraged to add their own tables to the database. Summary physics quantities about DIII-D discharges are collected and stored in the database automatically. Meta-data about code runs, MDSplus usage, and visualization tool usage are collected, stored in the database, and later analyzed to improve computing. The database may be accessed through programming languages such as C, Java, and IDL, or through ODBC-compliant applications such as Excel and Access. A database-driven web page also provides a convenient means of viewing database quantities through the World Wide Web. Demonstrations will be given at the poster.
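A sketch of the kind of cross-shot query such a database enables, using hypothetical table and column names (the actual DIII-D schema is not given here) and SQLite as a stand-in for the ODBC connection a researcher would use:

    import sqlite3  # stand-in for an ODBC connection

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE shot_summary (shot INTEGER PRIMARY KEY,
                                   ip_max REAL,     -- peak plasma current (MA)
                                   beta_n REAL);    -- normalized beta
        INSERT INTO shot_summary VALUES (100001, 1.2, 1.8),
                                        (100002, 1.6, 2.4),
                                        (100003, 1.5, 2.9);
    """)

    # Data mining across multiple shots in one SQL statement.
    rows = conn.execute("""
        SELECT shot, beta_n FROM shot_summary
        WHERE ip_max > 1.4 ORDER BY beta_n DESC
    """).fetchall()
    print(rows)  # [(100003, 2.9), (100002, 2.4)]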
Urban, Michal; Leššo, Roman; Pelclová, Daniela
2016-07-01
The purpose of the article was to study unintentional pharmaceutical-related poisonings committed by laypersons that were reported to the Toxicological Information Centre in the Czech Republic. Identifying frequency, sources, reasons and consequences of the medication errors in laypersons could help to reduce the overall rate of medication errors. Records of medication error enquiries from 2013 to 2014 were extracted from the electronic database, and the following variables were reviewed: drug class, dosage form, dose, age of the subject, cause of the error, time interval from ingestion to the call, symptoms, prognosis at the time of the call and first aid recommended. Of the calls, 1354 met the inclusion criteria. Among them, central nervous system-affecting drugs (23.6%), respiratory drugs (18.5%) and alimentary drugs (16.2%) were the most common drug classes involved in the medication errors. The highest proportion of the patients was in the youngest age subgroup 0-5 year-old (46%). The reasons for the medication errors involved the leaflet misinterpretation and mistaken dose (53.6%), mixing up medications (19.2%), attempting to reduce pain with repeated doses (6.4%), erroneous routes of administration (2.2%), psychiatric/elderly patients (2.7%), others (9.0%) or unknown (6.9%). A high proportion of children among the patients may be due to the fact that children's dosages for many drugs vary by their weight, and more medications come in a variety of concentrations. Most overdoses could be prevented by safer labelling, proper cap closure systems for liquid products and medication reconciliation by both physicians and pharmacists. © 2016 Nordic Association for the Publication of BCPT (former Nordic Pharmacological Society).
Bezombes, Lucie; Gaucherand, Stéphanie; Kerbiriou, Christian; Reinert, Marie-Eve; Spiegelberger, Thomas
2017-08-01
In many countries, biodiversity compensation is required to counterbalance the negative impacts of development projects on biodiversity by carrying out ecological measures, called offsets when the goal is to reach "no net loss" of biodiversity. One main issue is to ensure that offset gains are equivalent to impact-related losses. Ecological equivalence is assessed with ecological equivalence assessment methods that take into account a range of key considerations, which we summarize as ecological, spatial, temporal, and uncertainty-related. When equivalence assessment methods take all of these considerations into account, we call them "comprehensive". Equivalence assessment methods should also aim to be science-based and operational, which is challenging. Many equivalence assessment methods have been developed worldwide, but none is fully satisfying. In the present study, we examine 13 equivalence assessment methods in order to identify (i) their general structure and (ii) the synergies and trade-offs between the characteristics of equivalence assessment methods related to operationality, scientific basis and comprehensiveness (called "challenges" in this paper). We evaluate each equivalence assessment method on the basis of 12 criteria describing the level of achievement of each challenge. We observe that all equivalence assessment methods share a general structure, with possible improvements in the choice of target biodiversity, the indicators used, the integration of landscape context, and the multipliers reflecting time lags and uncertainties. We show that no equivalence assessment method combines all challenges perfectly. There are trade-offs between and within the challenges: operationality tends to be favored, while scientific basis is integrated heterogeneously in the development of equivalence assessment methods. One way of improving the combination of challenges would be the use of offset-dedicated databases providing scientific feedback on previous offset measures.
Makkar, Steve R.; Haynes, Abby; Williamson, Anna; Redman, Sally
2018-01-01
There are calls for policymakers to make greater use of research when formulating policies. Therefore, it is important that policy organisations have a range of tools and systems to support their staff in using research in their work. The aim of the present study was to measure the extent to which a range of tools and systems to support research use were available within six Australian agencies with a role in health policy, and examine whether this was related to the extent of engagement with, and use of research in policymaking by their staff. The presence of relevant systems and tools was assessed via a structured interview called ORACLe which is conducted with a senior executive from the agency. To measure research use, four policymakers from each agency undertook a structured interview called SAGE, which assesses and scores the extent to which policymakers engaged with (i.e., searched for, appraised, and generated) research, and used research in the development of a specific policy document. The results showed that all agencies had at least a moderate range of tools and systems in place, in particular policy development processes; resources to access and use research (such as journals, databases, libraries, and access to research experts); processes to generate new research; and mechanisms to establish relationships with researchers. Agencies were less likely, however, to provide research training for staff and leaders, or to have evidence-based processes for evaluating existing policies. For the majority of agencies, the availability of tools and systems was related to the extent to which policymakers engaged with, and used research when developing policy documents. However, some agencies did not display this relationship, suggesting that other factors, namely the organisation’s culture towards research use, must also be considered. PMID:29513669
A survey of commercial object-oriented database management systems
NASA Technical Reports Server (NTRS)
Atkins, John
1992-01-01
The object-oriented data model is the culmination of over thirty years of database research. Initially, database research focused on the need to provide information in a consistent and efficient manner to the business community. Early data models such as the hierarchical model and the network model met the goal of consistent and efficient access to data and were substantial improvements over simple file mechanisms for storing and accessing data. However, these models required highly skilled programmers to provide access to the data. Consequently, in the early 1970s E.F. Codd, an IBM research computer scientist, proposed a new data model based on the simple mathematical notion of the relation. This model is known as the relational model. In the relational model, data is represented in flat tables (or relations) which have no physical or internal links between them. The simplicity of this model fostered the development of powerful but relatively simple query languages that made data directly accessible to the general database user. Except for large, multi-user database systems, a database professional was in general no longer necessary. Database professionals found that traditional data in the form of character data, dates, and numeric data were easily represented and managed via the relational model. Commercial relational database management systems proliferated and the performance of relational databases improved dramatically. However, there was a growing community of potential database users whose needs were not met by the relational model: users who needed to store data with data types not available in the relational model and who required a far richer modelling environment. Indeed, the complexity of the objects to be represented in the model mandated a new approach to database technology. The object-oriented model was the result.
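A minimal illustration of the point about flat tables: the tables below are hypothetical, and the relationship between them exists only through shared values, recomputed at query time by a join rather than stored as a physical link.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT,
                               dept_id INTEGER);  -- a value, not a pointer
        INSERT INTO department VALUES (1, 'Research');
        INSERT INTO employee VALUES (7, 'Codd', 1);
    """)

    # The "link" is recomputed declaratively by matching values.
    print(conn.execute("""
        SELECT e.name, d.name FROM employee e
        JOIN department d ON e.dept_id = d.dept_id
    """).fetchall())  # [('Codd', 'Research')]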
hEIDI: An Intuitive Application Tool To Organize and Treat Large-Scale Proteomics Data.
Hesse, Anne-Marie; Dupierris, Véronique; Adam, Claire; Court, Magali; Barthe, Damien; Emadali, Anouk; Masselon, Christophe; Ferro, Myriam; Bruley, Christophe
2016-10-07
Advances in high-throughput proteomics have led to a rapid increase in the number, size, and complexity of the associated data sets. Managing and extracting reliable information from such large series of data sets require the use of dedicated software organized in a consistent pipeline to reduce, validate, exploit, and ultimately export data. The compilation of multiple mass-spectrometry-based identification and quantification results obtained in the context of a large-scale project represents a real challenge for developers of bioinformatics solutions. In response to this challenge, we developed a dedicated software suite called hEIDI to manage and combine both identifications and semiquantitative data related to multiple LC-MS/MS analyses. This paper describes how, through a user-friendly interface, hEIDI can be used to compile analyses and retrieve lists of nonredundant protein groups. Moreover, hEIDI allows direct comparison of series of analyses, on the basis of protein groups, while ensuring consistent protein inference and also computing spectral counts. hEIDI ensures that validated results are compliant with MIAPE guidelines as all information related to samples and results is stored in appropriate databases. Thanks to the database structure, validated results generated within hEIDI can be easily exported in the PRIDE XML format for subsequent publication. hEIDI can be downloaded from http://biodev.extra.cea.fr/docs/heidi .
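A hedged sketch of the spectral-counting comparison described above (hEIDI itself is a Java application and its internals are not shown; the input rows are hypothetical): a spectral count is simply the number of identified spectra per protein group per analysis.

    import pandas as pd

    # Hypothetical peptide-spectrum matches from two LC-MS/MS analyses.
    psms = pd.DataFrame({
        "analysis": ["run1", "run1", "run1", "run2", "run2"],
        "protein_group": ["P1", "P1", "P2", "P1", "P2"],
    })

    # Spectral count = identified spectra per protein group per analysis.
    counts = (psms.groupby(["protein_group", "analysis"])
                  .size()
                  .unstack(fill_value=0))
    print(counts)  # rows P1/P2, columns run1/run2, values 2,1 and 1,1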
Indexing Temporal XML Using FIX
NASA Astrophysics Data System (ADS)
Zheng, Tiankun; Wang, Xinjun; Zhou, Yingchun
XML has become an important standard for the description and exchange of information. It is of practical significance to add temporal information to XML, because time is an important property in nearly every domain. A database of this kind can track document history and recover information as of any previous point in time; it is called a temporal XML database. We propose a new feature vector based on FIX, a feature-based XML index, and build an index on a temporal XML database using a B+ tree, denoted TFIX. We also put forward a new query algorithm for temporal queries. Our experiments show that this index outperforms other kinds of XML indexes. The index can satisfy all TXPath queries with depth up to K (>0).
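A minimal sketch of the temporal idea (illustrative only; the paper's feature vector and B+-tree layout are not reproduced): each element version carries a validity interval, and a time-point query returns the versions valid at that instant.

    from dataclasses import dataclass

    @dataclass
    class ElementVersion:
        path: str    # simplified XPath-like location
        value: str
        start: int   # validity interval [start, end), e.g. years
        end: int

    versions = [
        ElementVersion("/doc/price", "10", 2005, 2008),
        ElementVersion("/doc/price", "12", 2008, 9999),
    ]

    def as_of(path: str, t: int):
        """Return the element versions at `path` that are valid at time t."""
        return [v for v in versions if v.path == path and v.start <= t < v.end]

    print(as_of("/doc/price", 2006))  # the '10' version
    print(as_of("/doc/price", 2010))  # the '12' version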
The DREO Elint Browser Utility (DEBU) reference manual
NASA Astrophysics Data System (ADS)
Ford, Barbara; Jones, David
1992-04-01
An electronic intelligence (ELINT) database browsing tool called DEBU has been developed that allows databases such as ELP, Kilting, EWIR, and AFEWC to be reviewed and analyzed from a user-friendly environment on a personal computer. DEBU's basic function is to allow users to examine the contents of user-selected subfiles of user-selected emitters of user-selected databases. DEBU augments this functionality with support for selecting (filtering) and combining subsets of emitters by user-selected attributes such as name, parameter type, or parameter value. DEBU provides facilities for examining histograms and x-y plots of selected parameters, for doing ambiguity analysis and mode level analysis, and for generating and printing a variety of reports. A manual is provided for users of DEBU, including descriptions and illustrations of menus and windows.
Associative memory model for searching an image database by image snippet
NASA Astrophysics Data System (ADS)
Khan, Javed I.; Yun, David Y.
1994-09-01
This paper presents an associative memory called multidimensional holographic associative computing (MHAC), which can potentially be used to perform feature-based image database queries using an image snippet. MHAC has the unique capability to selectively focus on specific segments of a query frame during associative retrieval. As a result, this model can search on the basis of featural significance described by a subset of the snippet pixels. This capability is critical for visual query in image databases because the cognitive index features in the snippet are often statistically weak. Unlike conventional artificial associative memories, MHAC uses a two-level representation and incorporates additional meta-knowledge about the reliability status of the segments of information it receives and forwards.
Kawano, Shin; Watanabe, Tsutomu; Mizuguchi, Sohei; Araki, Norie; Katayama, Toshiaki; Yamaguchi, Atsuko
2014-07-01
TogoTable (http://togotable.dbcls.jp/) is a web tool that adds user-specified annotations to a table that a user uploads. Annotations are drawn from several biological databases that use the Resource Description Framework (RDF) data model. TogoTable uses database identifiers (IDs) in the table as a query key for searching. RDF data, which form a network called Linked Open Data (LOD), can be searched from SPARQL endpoints using a SPARQL query language. Because TogoTable uses RDF, it can integrate annotations from not only the reference database to which the IDs originally belong, but also externally linked databases via the LOD network. For example, annotations in the Protein Data Bank can be retrieved using GeneID through links provided by the UniProt RDF. Because RDF has been standardized by the World Wide Web Consortium, any database with annotations based on the RDF data model can be easily incorporated into this tool. We believe that TogoTable is a valuable Web tool, particularly for experimental biologists who need to process huge amounts of data such as high-throughput experimental output. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
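A hedged sketch of the mechanism TogoTable builds on: a SPARQL query keyed by a database identifier, here against the public UniProt endpoint (the endpoint URL and the up:mnemonic property are assumptions about the current UniProt RDF, and this is not the query TogoTable itself issues).

    from SPARQLWrapper import SPARQLWrapper, JSON

    # Assumed public endpoint; TogoTable's internal queries may differ.
    sparql = SPARQLWrapper("https://sparql.uniprot.org/sparql")
    sparql.setQuery("""
        PREFIX up: <http://purl.uniprot.org/core/>
        SELECT ?mnemonic WHERE {
            <http://purl.uniprot.org/uniprot/P04637> up:mnemonic ?mnemonic .
        }
    """)
    sparql.setReturnFormat(JSON)
    for row in sparql.query().convert()["results"]["bindings"]:
        print(row["mnemonic"]["value"])  # e.g. P53_HUMAN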
NCBI2RDF: enabling full RDF-based access to NCBI databases.
Anguita, Alberto; García-Remesal, Miguel; de la Iglesia, Diana; Maojo, Victor
2013-01-01
RDF has become the standard technology for enabling interoperability among heterogeneous biomedical databases. The NCBI provides access to a large set of life sciences databases through a common interface called Entrez. However, the latter does not provide RDF-based access to such databases, and, therefore, they cannot be integrated with other RDF-compliant databases and accessed via SPARQL query interfaces. This paper presents the NCBI2RDF system, aimed at providing RDF-based access to the complete NCBI data repository. This API creates a virtual endpoint for servicing SPARQL queries over different NCBI repositories and presenting to users the query results in SPARQL results format, thus enabling this data to be integrated and/or stored with other RDF-compliant repositories. SPARQL queries are dynamically resolved, decomposed, and forwarded to the NCBI-provided E-utilities programmatic interface to access the NCBI data. Furthermore, we show how our approach increases the expressiveness of the native NCBI querying system, allowing several databases to be accessed simultaneously. This feature significantly boosts productivity when working with complex queries and saves time and effort to biomedical researchers. Our approach has been validated with a large number of SPARQL queries, thus proving its reliability and enhanced capabilities in biomedical environments.
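For context, the E-utilities layer that NCBI2RDF forwards queries to is an ordinary HTTP API; a minimal direct call might look as follows (the search term is arbitrary, and error handling is omitted):

    import json
    import urllib.parse
    import urllib.request

    # ESearch: find PubMed IDs matching a term (real NCBI E-utilities endpoint).
    params = urllib.parse.urlencode({
        "db": "pubmed",
        "term": "relational database[Title]",
        "retmode": "json",
        "retmax": 5,
    })
    url = f"https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?{params}"
    with urllib.request.urlopen(url) as resp:
        result = json.load(resp)
    print(result["esearchresult"]["idlist"])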
A Participants' DSS for a Management Game with a DSS Generator.
ERIC Educational Resources Information Center
Yeo, Gee Kin; Nah, Fui Hoon
1992-01-01
Describes the design of a decision support system (DSS) for a management game called MAGNUS (Management Game for National University of Singapore). Built-in models for performance analysis and decision making are explained; database query and model building are described; and future work is discussed. (11 references) (LRW)
Huge Databases Offer a Research Gold Mine and Privacy Worries
ERIC Educational Resources Information Center
Glenn, David
2008-01-01
Last month several news organizations reported on the emergence of "fusion centers"--vast data clearinghouses, operated by state law-enforcement agencies, that can instantly call up key personal information on anyone: telephone numbers, insurance records, family ties, and much more. Architects of the fusion centers say they are a…
A Call for Feminist Research: A Limited Client Perspective
ERIC Educational Resources Information Center
Murray, Kirsten
2006-01-01
Feminist approaches embrace a counselor stance that is both collaborative and supportive, seeking client empowerment. On review of the feminist family and couple counseling literature of the past 20 years using several academic databases, no research was found that explored a client's experience of feminist-informed family and couple counseling. The…
Reply to Comment by Briere and Elliott.
ERIC Educational Resources Information Center
Nash, Michael R.; And Others
1993-01-01
Nash et al. respond to Briere and Elliott's (this issue) comments regarding their study (this issue) on effects of controlling for family environment when studying sexual abuse sequelae. Cites limitations of Briere and Elliott's survey study database. Agrees with Briere and Elliott in call for longitudinal, multimethod designs for examining…
Close the Textbook & Open "The Cell: An Image Library"
ERIC Educational Resources Information Center
Saunders, Cheston; Taylor, Amy
2014-01-01
Many students leave the biology classroom with misconceptions centered on cellular structure. This article presents an activity in which students utilize images from an online database called "The Cell: An Image Library" (http://www.cellimagelibrary.org/) to gain a greater understanding of the diversity of cellular structure and the…
49 CFR 1001.1 - Records available from the Board.
Code of Federal Regulations, 2010 CFR
2010-10-01
... Transportation Board Administrative Issuances. (b) The following records, so-called “reading room” documents, are... or date of issuance and are available for viewing and downloading from the Board's Electronic Reading Room at www.stb.dot.gov, the Board's website. Final decisions are maintained in a database that is full...
Extending the ARIADNE Web-Based Learning Environment.
ERIC Educational Resources Information Center
Van Durm, Rafael; Duval, Erik; Verhoeven, Bart; Cardinaels, Kris; Olivie, Henk
One of the central notions of the ARIADNE learning platform is a share-and-reuse approach toward the development of digital course material. The ARIADNE infrastructure includes a distributed database called the Knowledge Pool System (KPS), which acts as a repository of pedagogical material, described with standardized IEEE LTSC Learning Object…
Standardization of Keyword Search Mode
ERIC Educational Resources Information Center
Su, Di
2010-01-01
In spite of its popularity, keyword search mode has not been standardized. Though information professionals are quick to adapt to various presentations of keyword search mode, novice end-users may find keyword search confusing. This article compares keyword search mode in some major reference databases and calls for standardization. (Contains 3…
Mining and Indexing Graph Databases
ERIC Educational Resources Information Center
Yuan, Dayu
2013-01-01
Graphs are widely used to model structures and relationships of objects in various scientific and commercial fields. Chemical molecules, proteins, malware system-call dependencies and three-dimensional mechanical parts are all modeled as graphs. In this dissertation, we propose to mine and index those graph data to enable fast and scalable search.…
Technical Aspects of Interfacing MUMPS to an External SQL Relational Database Management System
Kuzmak, Peter M.; Walters, Richard F.; Penrod, Gail
1988-01-01
This paper describes an interface connecting InterSystems MUMPS (M/VX) to an external relational DBMS, the SYBASE Database Management System. The interface enables MUMPS to operate in a relational environment and gives the MUMPS language full access to a complete set of SQL commands. MUMPS generates SQL statements as ASCII text and sends them to the RDBMS. The RDBMS executes the statements and returns ASCII results to MUMPS. The interface suggests that the language features of MUMPS make it an attractive tool for use in the relational database environment. The approach described in this paper separates MUMPS from the relational database. Positioning the relational database outside of MUMPS promotes data sharing and permits a number of different options to be used for working with the data. Other languages like C, FORTRAN, and COBOL can access the RDBMS database. Advanced tools provided by the relational database vendor can also be used. SYBASE is an advanced high-performance transaction-oriented relational database management system for the VAX/VMS and UNIX operating systems. SYBASE is designed using a distributed open-systems architecture, and is relatively easy to interface with MUMPS.
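The pattern described above, in which the host language composes SQL as plain ASCII text and consumes plain-text results, can be illustrated outside MUMPS. In this minimal Python sketch, SQLite stands in for the external SYBASE server (an assumption for illustration only; the MUMPS side is not reproduced):

```python
# Sketch of the interface pattern: build SQL as ASCII text, hand it to an
# external relational engine, read the results back as text.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patient (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO patient VALUES (1, 'SMITH'), (2, 'JONES')")

# The 'host language' side: the statement is just a character string.
sql_text = "SELECT id, name FROM patient ORDER BY id"

# The 'RDBMS' side: execute and render rows back to ASCII.
for row in conn.execute(sql_text):
    print("|".join(str(field) for field in row))  # e.g. 1|SMITH
```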
Duffy, Ryan D; Bott, Elizabeth M; Allan, Blake A; Torrey, Carrie L; Dik, Bryan J
2012-01-01
The current study examined the relation between perceiving a calling, living a calling, and job satisfaction among a diverse group of employed adults who completed an online survey (N = 201). Perceiving a calling and living a calling were positively correlated with career commitment, work meaning, and job satisfaction. Living a calling moderated the relations of perceiving a calling with career commitment and work meaning, such that these relations were more robust for those with a stronger sense they were living their calling. Additionally, a moderated, multiple mediator model was run to examine the mediating role of career commitment and work meaning in the relation of perceiving a calling and job satisfaction, while accounting for the moderating role of living a calling. Results indicated that work meaning and career commitment fully mediated the relation between perceiving a calling and job satisfaction. However, the indirect effects of work meaning and career commitment were only significant for individuals with high levels of living a calling, indicating the importance of living a calling in the link between perceiving a calling and job satisfaction. Implications for research and practice are discussed. (c) 2012 APA, all rights reserved.
This document may be of assistance in applying the New Source Review (NSR) air permitting regulations including the Prevention of Significant Deterioration (PSD) requirements. This document is part of the NSR Policy and Guidance Database. Some documents in the database are a scanned or retyped version of a paper photocopy of the original. Although we have taken considerable effort to quality assure the documents, some may contain typographical errors. Contact the office that issued the document if you need a copy of the original.
GlycomeDB – integration of open-access carbohydrate structure databases
Ranzinger, René; Herget, Stephan; Wetter, Thomas; von der Lieth, Claus-Wilhelm
2008-01-01
Background Although carbohydrates are the third major class of biological macromolecules, after proteins and DNA, there is neither a comprehensive database for carbohydrate structures nor an established universal structure encoding scheme for computational purposes. Funding for further development of the Complex Carbohydrate Structure Database (CCSD or CarbBank) ceased in 1997, and since then several initiatives have developed independent databases with partially overlapping foci. For each database, different encoding schemes for residues and sequence topology were designed. Therefore, it is virtually impossible to obtain an overview of all deposited structures or to compare the contents of the various databases. Results We have implemented procedures which download the structures contained in the seven major databases, e.g. GLYCOSCIENCES.de, the Consortium for Functional Glycomics (CFG), the Kyoto Encyclopedia of Genes and Genomes (KEGG) and the Bacterial Carbohydrate Structure Database (BCSDB). We have created a new database called GlycomeDB, containing all structures, their taxonomic annotations and references (IDs) for the original databases. More than 100,000 datasets were imported, resulting in more than 33,000 unique sequences now encoded in GlycomeDB using the universal format GlycoCT. Inconsistencies were found in all public databases, which were discussed and corrected in multiple feedback rounds with the responsible curators. Conclusion GlycomeDB is a new, publicly available database for carbohydrate sequences with a unified, all-encompassing structure encoding format and NCBI taxonomic referencing. The database is updated weekly and can be downloaded free of charge. The JAVA application GlycoUpdateDB is also available for establishing and updating a local installation of GlycomeDB. With the advent of GlycomeDB, the distributed islands of knowledge in glycomics are now bridged to form a single resource. PMID:18803830
Handwritten word preprocessing for database adaptation
NASA Astrophysics Data System (ADS)
Oprean, Cristina; Likforman-Sulem, Laurence; Mokbel, Chafic
2013-01-01
Handwriting recognition systems are typically trained using publicly available databases, where data have been collected in controlled conditions (image resolution, paper background, noise level,...). Since this is not often the case in real-world scenarios, classification performance can be affected when novel data is presented to the word recognition system. To overcome this problem, we present in this paper a new approach called database adaptation. It consists of processing one set (training or test) in order to adapt it to the other set (test or training, respectively). Specifically, two kinds of preprocessing, namely stroke thickness normalization and pixel intensity normalization are considered. The advantage of such approach is that we can re-use the existing recognition system trained on controlled data. We conduct several experiments with the Rimes 2011 word database and with a real-world database. We adapt either the test set or the training set. Results show that training set adaptation achieves better results than test set adaptation, at the cost of a second training stage on the adapted data. Accuracy of data set adaptation is increased by 2% to 3% in absolute value over no adaptation.
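A minimal sketch of the intensity side of such preprocessing, assuming a simple mean/standard-deviation matching rule (the paper's exact normalizations are not reproduced here):

```python
# Rescale a word image's grey levels so its intensity statistics match a
# reference set; matching mean and std is one simple adaptation choice.
import numpy as np

def match_intensity(img: np.ndarray, ref_mean: float, ref_std: float) -> np.ndarray:
    """Shift/scale grey levels of `img` toward reference statistics."""
    std = img.std() or 1.0  # guard against flat images
    out = (img - img.mean()) / std * ref_std + ref_mean
    return np.clip(out, 0.0, 255.0)

# Adapt a synthetic 'real-world' image toward training-set statistics.
rng = np.random.default_rng(0)
real_word = rng.normal(90.0, 15.0, size=(32, 128))   # dark, low contrast
adapted = match_intensity(real_word, ref_mean=180.0, ref_std=40.0)
print(round(adapted.mean()), round(adapted.std()))   # ~180, ~40 (clipping aside)
```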
A linked GeoData map for enabling information access
Powell, Logan J.; Varanka, Dalia E.
2018-01-10
Overview: The Geospatial Semantic Web (GSW) is an emerging technology that uses the Internet for more effective knowledge engineering and information extraction. Among the aims of the GSW are to structure the semantic specifications of data to reduce ambiguity and to link those data more efficiently. The data are stored as triples, the basic data unit in graph databases, which are similar to the vector data model of geographic information systems (GIS); that is, a node-edge-node model that forms a graph of semantically related information. The GSW is supported by emerging technologies such as linked geospatial data, described below, that enable it to store and manage geographical data that require new cartographic methods for visualization. This report describes a map that can interact with linked geospatial data using a simulation of a data query approach called the browsable graph to find information that is semantically related to a subject of interest, visualized using the Data Driven Documents (D3) library. Such a semantically enabled map functions as a map knowledge base (MKB) (Varanka and Usery, 2017). An MKB differs from a database in an important way. The central element of a triple, alternatively called the edge or property, is composed of a logic formalization that structures the relation between the first and third parts, the nodes or objects. Node-edge-node represents the graphic form of the triple, and the subject-property-object terms represent the data structure. Object classes connect to build a federated graph, similar to a network in visual form. Because the triple property is a logical statement (a predicate), the data graph represents logical propositions or assertions accepted to be true about the subject matter. These logical formalizations can be manipulated to calculate new triples, representing inferred logical assertions, from the existing data. To demonstrate an MKB system, a technical proof-of-concept is developed that uses geographically attributed Resource Description Framework (RDF) serializations of linked data for mapping. The proof-of-concept focuses on accessing triple data from visual elements of a geographic map as the interface to the MKB. The map interface is embedded with other essential functions such as SPARQL Protocol and RDF Query Language (SPARQL) data query endpoint services and reasoning capabilities of Apache Marmotta (Apache Software Foundation, 2017). An RDF database of the Geographic Names Information System (GNIS), which contains official names of domestic features in the United States, was linked to a county data layer from The National Map of the U.S. Geological Survey. The county data are part of a broader Government Units theme offered to the public as Esri shapefiles. The shapefile used to draw the map itself was converted to a geographic-oriented JavaScript Object Notation (JSON) (GeoJSON) format and linked through various properties with a linked geodata version of the GNIS database called “GNIS–LD” (Butler and others, 2016; B. Regalia and others, University of California-Santa Barbara, written commun., 2017). The GNIS–LD files originated in Terse RDF Triple Language (Turtle) format but were converted to a JSON format specialized in linked data, “JSON–LD” (Beckett and Berners-Lee, 2011; Sporny and others, 2014). The GNIS–LD database is composed of roughly three predominant triple data graphs: Features, Names, and History. The graphs include a set of namespace prefixes used by each of the attributes.
Predefining the prefixes made the conversion to the JSON–LD format simple to complete because Turtle and JSON–LD are variant specifications of the basic RDF concept.To convert a shapefile into GeoJSON format to capture the geospatial coordinate geometry objects, an online converter, Mapshaper, was used (Bloch, 2013). To convert the Turtle files, a custom converter written in Java reconstructs the files by parsing each grouping of attributes belonging to one subject and pasting the data into a new file that follows the syntax of JSON–LD. Additionally, the Features file contained its own set of geometries, which was exported into a separate JSON–LD file along with its elevation value to form a fourth file, named “features-geo.json.” Extracted data from external files can be represented in HyperText Markup Language (HTML) path objects. The goal was to import multiple JSON–LD files using this approach.
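The report's Turtle-to-JSON-LD step was done with a custom Java converter; for illustration, the same transformation can be sketched with Python's rdflib package (JSON-LD output is built in from rdflib 6 onward). The namespace and feature below are invented for the example, not actual GNIS-LD identifiers:

```python
# Parse a small Turtle document and re-serialize it as JSON-LD.
from rdflib import Graph

turtle_doc = """
@prefix ex:  <http://example.org/gnis/> .
@prefix geo: <http://www.opengis.net/ont/geosparql#> .

ex:feature1 a ex:Feature ;
    ex:featureName "Example Creek" ;
    geo:hasGeometry ex:geom1 .
"""

g = Graph()
g.parse(data=turtle_doc, format="turtle")
print(g.serialize(format="json-ld", indent=2))
```

Both serializations express the same triples, which is why predefining the prefixes makes the conversion largely mechanical.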
Ling, Sophia L-Y; McD Taylor, David; Robinson, Jeffery
2018-04-01
The aim of this study is to determine the period prevalence, nature and causes of workplace chemical and toxin exposures reported to the Victorian Poisons Information Centre (VPIC). All cases classified as 'workplace: acute' when entered into the VPIC database (June 2005-December 2013) were analysed. Data were collected on patient sex, the nature of the chemical or toxin, route of exposure and season. Overall, 4928 cases were extracted. Exposures to men (71.5% of calls) differed from those to women (P<0.001), with most exposures relating to industry/trade substances (23.7%) and cleaners/bleaches/detergents (36.9%), respectively. Ocular (33.2%), inhalational (27.7%) and dermal (22.1%) exposures were most common. Exposures were most common in spring, and the most seasonal variation was found for the veterinary/animal, agricultural/plant and household categories (P<0.05). In all, 3445 (69.9%) cases had symptoms related to their exposure at the time of the call. However, the proportion of symptomatic cases within the major substance categories differed significantly (P<0.001). The chemicals associated with the most symptoms were cleaners/bleaches/detergents, industrial/trade substances and acids. Mild to moderately severe workplace exposures are common. Significant variations exist between the sexes and seasons. Poisons Information Centres may play a role in ongoing surveillance of chemical and toxin exposures, and a minimum exposure dataset is recommended.
Concentrations of indoor pollutants (CIP) database user's manual (Version 4. 0)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Apte, M.G.; Brown, S.R.; Corradi, C.A.
1990-10-01
This is the latest release of the database and the user manual. The user manual is a tutorial and reference for utilizing the CIP Database system. An installation guide is included to cover various hardware configurations. Numerous examples and explanations of the dialogue between the user and the database program are provided. It is hoped that this resource will, along with on-line help and the menu-driven software, make for a quick and easy learning curve. For the purposes of this manual, it is assumed that the user is acquainted with the goals of the CIP Database, which are: (1) to collect existing measurements of concentrations of indoor air pollutants in a user-oriented database and (2) to provide a repository of references citing measured field results openly accessible to a wide audience of researchers, policy makers, and others interested in the issues of indoor air quality. The database software, as distinct from the data, is contained in two files, CIP.EXE and PFIL.COM. CIP.EXE is made up of a number of programs written in dBase III command code and compiled using Clipper into a single, executable file. PFIL.COM is a program written in Turbo Pascal that handles the output of summary text files and is called from CIP.EXE. Version 4.0 of the CIP Database is current through March 1990.
Quality assessment and improvement of nationwide cancer registration system in Taiwan: a review.
Chiang, Chun-Ju; You, San-Lin; Chen, Chien-Jen; Yang, Ya-Wen; Lo, Wei-Cheng; Lai, Mei-Shu
2015-03-01
Cancer registration provides core information for cancer surveillance and control. The population-based Taiwan Cancer Registry was implemented in 1979. After the Cancer Control Act was promulgated in 2003, the completeness (97%) and data quality of the cancer registry database reached an excellent level. Hospitals with 50 or more beds, which provide outpatient and hospitalized cancer care, are recruited to report 20 items of information on all newly diagnosed cancers to the central registry office (called the short-form database). The Taiwan Cancer Registry is organized and funded by the Ministry of Health and Welfare. The National Taiwan University has been contracted to operate the registry and has organized an advisory board to standardize the definitions of terminology, coding and procedures of the registry's reporting system since 1996. To monitor cancer care patterns and evaluate cancer treatment outcomes, the central cancer registry has been reformed since 2002 to include detailed items on the stage at diagnosis and the first course of treatment (called the long-form database). There are 80 hospitals, which account for >90% of total cancer cases, involved in the long-form registration. The Taiwan Cancer Registry has run smoothly for >30 years, providing an essential foundation for academic research and cancer control policy in Taiwan. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
HuMiChip: Development of a Functional Gene Array for the Study of Human Microbiomes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tu, Q.; Deng, Ye; Lin, Lu
Microbiomes play very important roles in terms of nutrition, health and disease by interacting with their hosts. Based on sequence data currently available in public domains, we have developed a functional gene array to monitor both organismal and functional gene profiles of normal microbiota in human and mouse hosts, and such an array is called the human and mouse microbiota array (HMM-Chip). First, seed sequences were identified from KEGG databases, and used to construct a seed database (seedDB) containing 136 gene families in 19 metabolic pathways closely related to human and mouse microbiomes. Second, a mother database (motherDB) was constructed with 81 genomes of bacterial strains, 54 from gut and 27 from oral environments, and 16 metagenomes, and used for selection of genes and probe design. Gene prediction was performed by Glimmer3 for bacterial genomes, and by the Metagene program for metagenomes. In total, 228,240 and 801,599 genes were identified for bacterial genomes and metagenomes, respectively. Then the motherDB was searched against the seedDB using the HMMer program, and gene sequences in the motherDB that were highly homologous with seed sequences in the seedDB were used for probe design by the CommOligo software. Different degrees of specific probes, including gene-specific, inclusive and exclusive group-specific probes, were selected. All candidate probes were checked against the motherDB and NCBI databases for specificity. Finally, 7,763 probes covering 91.2% (12,601 out of 13,814) of the HMMer-confirmed sequences from 75 bacterial genomes and 16 metagenomes were selected. The developed HMM-Chip is able to detect the diversity and abundance of functional genes, the gene expression of microbial communities, and potentially, the interactions of microorganisms and their hosts.
Correlates of Gay-Related Name-Calling in Schools
ERIC Educational Resources Information Center
Slaatten, Hilde; Hetland, Jørn; Anderssen, Norman
2015-01-01
The aim of this study was to examine whether attitudes about gay-related name-calling, social norms concerning gay-related name-calling among co-students, teacher intervention, and school-related support would predict whether secondary school pupils had called another pupil a gay-related name during the last month. A total of 921 ninth-grade…
Akune, Yukie; Lin, Chi-Hung; Abrahams, Jodie L; Zhang, Jingyu; Packer, Nicolle H; Aoki-Kinoshita, Kiyoko F; Campbell, Matthew P
2016-08-05
Glycan structures attached to proteins are composed of diverse monosaccharide sequences and linkages that are produced from precursor nucleotide-sugars by a series of glycosyltransferases. Databases of these structures are an essential resource for the interpretation of analytical data and the development of bioinformatics tools. However, with no template to predict which structures are possible, the human glycan structure databases are incomplete and rely heavily on the curation of published, experimentally determined, glycan structure data. In this work, a library of 45 human glycosyltransferases was used to generate a theoretical database of N-glycan structures comprised of 15 or fewer monosaccharide residues. Enzyme specificities were sourced from major online databases including Kyoto Encyclopedia of Genes and Genomes (KEGG) Glycan, Consortium for Functional Glycomics (CFG), Carbohydrate-Active enZymes (CAZy), GlycoGene DataBase (GGDB) and BRENDA. Based on the known activities, more than 1.1 million theoretical structures and 4.7 million synthetic reactions were generated and stored in our database called UniCorn. Furthermore, we analyzed the differences between the predicted glycan structures in UniCorn and those contained in UniCarbKB (www.unicarbkb.org), a database which stores experimentally described glycan structures reported in the literature, and demonstrate that UniCorn can be used to aid in the assignment of ambiguous structures whilst also serving as a discovery database. Copyright © 2016 Elsevier Ltd. All rights reserved.
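The generation strategy behind such a theoretical database can be sketched abstractly: repeatedly apply enzyme-like extension rules to growing structures until a residue cap is reached. The rules and cap below are invented for the example; real glycosyltransferase specificities act on linkages and branching topology, not the flat sequences used here:

```python
# Breadth-first enumeration of theoretical structures under extension rules.
from collections import deque

# Hypothetical 'enzyme' rules: residue on the right may follow the left one.
RULES = {"GlcNAc": ["Man", "Gal"], "Man": ["Man", "GlcNAc"], "Gal": ["Neu5Ac"]}
MAX_RESIDUES = 5

def generate(seed: tuple[str, ...]) -> set[tuple[str, ...]]:
    """Enumerate every structure reachable from `seed` within the cap."""
    seen, queue = {seed}, deque([seed])
    while queue:
        glycan = queue.popleft()
        if len(glycan) == MAX_RESIDUES:
            continue
        for nxt in RULES.get(glycan[-1], []):
            extended = glycan + (nxt,)
            if extended not in seen:
                seen.add(extended)
                queue.append(extended)
    return seen

print(len(generate(("GlcNAc",))))  # count of theoretical structures
```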
Patel, Vanash M.; Ashrafian, Hutan; Almoudaris, Alex; Makanjuola, Jonathan; Bucciarelli-Ducci, Chiara; Darzi, Ara; Athanasiou, Thanos
2013-01-01
Objectives To compare H index scores for healthcare researchers returned by the Google Scholar, Web of Science and Scopus databases, and to assess whether a researcher's age, country of institutional affiliation and physician status influences calculations. Subjects and Methods One hundred and ninety-five Nobel laureates in Physiology and Medicine from 1901 to 2009 were considered. Year of first and last publications, total publications and citation counts, and the H index for each laureate were calculated from each database. The Cronbach's alpha statistic was used to measure the reliability of H index scores across the databases. The influence of laureate characteristics on the H index was analysed using linear regression. Results There was no concordance between the databases when considering the number of publications and citation counts per laureate. The H index was the most reliably calculated bibliometric across the three databases (Cronbach's alpha = 0.900). All databases returned significantly higher H index scores for younger laureates (p < 0.0001). Google Scholar and Web of Science returned significantly higher H index scores for physician laureates (p = 0.025 and p = 0.029, respectively). Country of institutional affiliation did not influence the H index in any database. Conclusion The H index appeared to be the most consistently calculated bibliometric between the databases for Nobel laureates in Physiology and Medicine. Researcher-specific characteristics constituted an important component of objective research assessment. The findings of this study call into question the choice of current and future academic performance databases. PMID:22964880
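For reference, the H index compared across these databases is the largest h such that a researcher has h papers with at least h citations each; a minimal sketch:

```python
# H index from a list of per-paper citation counts.
def h_index(citations: list[int]) -> int:
    ranked = sorted(citations, reverse=True)
    # Count ranks i where the i-th most cited paper has >= i citations.
    return sum(1 for i, c in enumerate(ranked, start=1) if c >= i)

# A researcher with these citation counts has h = 3.
print(h_index([10, 8, 5, 2, 1]))  # 3
```

Differences between databases arise entirely from the citation counts they feed into this formula, which is why publication and citation coverage drive the discordance reported above.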
Stellar Abundances for Galactic Archaeology Database. IV. Compilation of stars in dwarf galaxies
NASA Astrophysics Data System (ADS)
Suda, Takuma; Hidaka, Jun; Aoki, Wako; Katsuta, Yutaka; Yamada, Shimako; Fujimoto, Masayuki Y.; Ohtani, Yukari; Masuyama, Miyu; Noda, Kazuhiro; Wada, Kentaro
2017-10-01
We have constructed a database of stars in Local Group galaxies using the extended version of the SAGA (Stellar Abundances for Galactic Archaeology) database that contains stars in 24 dwarf spheroidal galaxies and ultra-faint dwarfs. The new version of the database includes more than 4500 stars in the Milky Way, by removing the previous metallicity criterion of [Fe/H] ≤ -2.5, and more than 6000 stars in the Local Group galaxies. We examined the validity of using a combined data set for elemental abundances. We also checked the consistency between the derived distances to individual stars and those to galaxies as given in the literature. Using the updated database, the characteristics of stars in dwarf galaxies are discussed. Our statistical analyses of α-element abundances show that the change of the slope of the [α/Fe] relative to [Fe/H] (so-called "knee") occurs at [Fe/H] = -1.0 ± 0.1 for the Milky Way. The knee positions for selected galaxies are derived by applying the same method. The star formation history of individual galaxies is explored using the slope of the cumulative metallicity distribution function. Radial gradients along the four directions are inspected in six galaxies where we find no direction-dependence of metallicity gradients along the major and minor axes. The compilation of all the available data shows a lack of CEMP-s population in dwarf galaxies, while there may be some CEMP-no stars at [Fe/H] ≲ -3 even in the very small sample. The inspection of the relationship between Eu and Ba abundances confirms an anomalously Ba-rich population in Fornax, which indicates a pre-enrichment of interstellar gas with r-process elements. We do not find any evidence of anti-correlations in O-Na and Mg-Al abundances, which characterizes the abundance trends in the Galactic globular clusters.
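One simple way to locate such a knee, sketched below on synthetic data (not SAGA values), is to grid-search the breakpoint of a two-segment linear fit and keep the breakpoint with the lowest squared residual:

```python
# Broken-line fit: find the x position where the slope of y(x) changes.
import numpy as np

def broken_line_knee(x: np.ndarray, y: np.ndarray) -> float:
    """Return the x position of the best-fitting slope change."""
    best_knee, best_err = float(x[0]), np.inf
    for knee in np.linspace(x.min() + 0.1, x.max() - 0.1, 100):
        # Basis: intercept, slope, and extra slope activated past the knee.
        basis = np.column_stack([np.ones_like(x), x, np.maximum(x - knee, 0)])
        coef, *_ = np.linalg.lstsq(basis, y, rcond=None)
        err = ((basis @ coef - y) ** 2).sum()
        if err < best_err:
            best_knee, best_err = knee, err
    return best_knee

# Synthetic [alpha/Fe] vs [Fe/H]: flat below -1.0, declining above it.
rng = np.random.default_rng(1)
feh = rng.uniform(-3.0, 0.3, 400)
afe = np.where(feh < -1.0, 0.35, 0.35 - 0.3 * (feh + 1.0)) + rng.normal(0, 0.03, 400)
print(round(broken_line_knee(feh, afe), 2))  # ~ -1.0
```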
RCDB: Renal Cancer Gene Database.
Ramana, Jayashree
2012-05-18
Renal cell carcinoma or RCC is one of the most common and most lethal urological cancers, with 40% of patients succumbing to death because of metastatic progression of the disease. Treatment of metastatic RCC remains highly challenging because of its resistance to chemotherapy as well as radiotherapy, besides surgical resection. Whereas RCC comprises tumors with differing histological types, clear cell RCC (ccRCC) remains the most common. A major problem in the clinical management of patients presenting with localized ccRCC is the inability to determine tumor aggressiveness and accurately predict the risk of metastasis following surgery. As a measure to improve the diagnosis and prognosis of RCC, researchers have identified several molecular markers through a number of techniques. However, the wealth of information available is scattered in the literature and not easily amenable to data-mining. To reduce this gap, this work describes a comprehensive repository called the Renal Cancer Gene Database (RCDB), as an integrated gateway to study renal cancer related data. The Renal Cancer Gene Database is a manually curated compendium of 240 protein-coding and 269 miRNA genes contributing to the etiology and pathogenesis of various forms of renal cell carcinomas. The protein-coding genes have been classified according to the kind of gene alteration observed in RCC. RCDB also includes the miRNAs dysregulated in RCC, along with the corresponding information regarding the type of RCC and/or metastatic or prognostic significance. While some of the miRNA genes showed an association with other types of cancers, a few were unique to RCC. Users can query the database using keywords, category and chromosomal location of the genes. The knowledgebase can be freely accessed via a user-friendly web interface at http://www.juit.ac.in/attachments/jsr/rcdb/homenew.html. It is hoped that this database would serve as a useful complement to the existing public resources and as a good starting point for researchers and physicians interested in RCC genetics.
jqcML: an open-source java API for mass spectrometry quality control data in the qcML format.
Bittremieux, Wout; Kelchtermans, Pieter; Valkenborg, Dirk; Martens, Lennart; Laukens, Kris
2014-07-03
The awareness that systematic quality control is an essential factor to enable the growth of proteomics into a mature analytical discipline has increased over the past few years. To this aim, a controlled vocabulary and document structure have recently been proposed by Walzer et al. to store and disseminate quality-control metrics for mass-spectrometry-based proteomics experiments, called qcML. To facilitate the adoption of this standardized quality control routine, we introduce jqcML, a Java application programming interface (API) for the qcML data format. First, jqcML provides a complete object model to represent qcML data. Second, jqcML provides the ability to read, write, and work in a uniform manner with qcML data from different sources, including the XML-based qcML file format and the relational database qcDB. Interaction with the XML-based file format is obtained through the Java Architecture for XML Binding (JAXB), while generic database functionality is obtained by the Java Persistence API (JPA). jqcML is released as open-source software under the permissive Apache 2.0 license and can be downloaded from https://bitbucket.org/proteinspector/jqcml .
Comprehensive Reconstruction and Visualization of Non-Coding Regulatory Networks in Human
Bonnici, Vincenzo; Russo, Francesco; Bombieri, Nicola; Pulvirenti, Alfredo; Giugno, Rosalba
2014-01-01
Research attention has increasingly focused on understanding the functional roles of non-coding RNAs (ncRNAs). Many studies have demonstrated their deregulation in cancer and other human disorders. ncRNAs are also present in extracellular human body fluids such as serum and plasma, giving them great potential as non-invasive biomarkers. However, non-coding RNAs were discovered relatively recently, and a comprehensive database including all of them is still missing. Reconstructing and visualizing the network of ncRNA interactions are important steps to understand their regulatory mechanism in complex systems. This work presents ncRNA-DB, a NoSQL database that integrates ncRNA data interactions from a large number of well-established on-line repositories. The interactions involve RNA, DNA, proteins, and diseases. ncRNA-DB is available at http://ncrnadb.scienze.univr.it/ncrnadb/. It is equipped with three interfaces: web based, command-line, and a Cytoscape app called ncINetView. By accessing only one resource, users can search for ncRNAs and their interactions, build a network annotated with all known ncRNAs and associated diseases, and use all visual and mining features available in Cytoscape. PMID:25540777
Ice Nucleating Particles around the world - a global review
NASA Astrophysics Data System (ADS)
Kanji, Zamin A.; Atkinson, James; Sierau, Berko; Lohmann, Ulrike
2017-04-01
In the atmosphere, the formation of new ice particles at temperatures above -36 °C is due to a subset of aerosol called Ice Nucleating Particles (INP). However, the spatial and temporal evolution of such particles is poorly understood. Current modelling efforts attempt to estimate the sources and transport of INP, but are hampered by the limited availability and accessibility of INP observations. As part of the EU FP7 project impact of Biogenic versus Anthropogenic emissions on Clouds and Climate: towards a Holistic UnderStanding (BACCHUS), historical and contemporary observations of INP have been collated into a database (http://www.bacchus-env.eu/in/) and are reviewed here. Outside of Europe and North America the coverage of measurements is sparse, especially for the modern-day climate; in many areas the only measurements available are from the mid-20th century. As well as an overview of all the data in the database, correlations with several accompanying variables are presented. For example, immersion-freezing INP seem to be negatively correlated with altitude, whereas CFDC-based condensation-freezing INP show no height correlation. An initial global parameterisation of INP concentrations, taking into account freezing temperature and relative humidity, is provided for use in modelling.
NASA Astrophysics Data System (ADS)
Flewelling, Heather
2018-01-01
On December 19, 2016, Pan-STARRS released the stacked images, mean attributes catalogs, and static sky catalogs for the 3pi survey, in 5 filters (g,r,i,z,y), covering 3/4 of the sky, everything north of -30 degrees in declination. This set of data is called Data Release 1 (DR1), and it is available to all at http://panstarrs.stsci.edu. It contains more than 10 billion objects; 3 billion of those objects have stack photometry. We give an update on the progress of the forthcoming Data Release 2 (DR2) database, which will provide time domain catalogs and single exposures for the 3pi survey. This includes 3pi data taken between 2010 and 2014, covering approximately 60 epochs per patch of sky, and includes measurements detected in the single exposures as well as forced photometry measurements (photometry measured on single exposures using the positions from sources detected in the stacks). We also provide information on future releases (DR3 and beyond), which will contain the rest of the 3pi database (specifically, the data products related to difference imaging), as well as the data products for the Medium Deep (MD) survey.
Biomedical question answering using semantic relations.
Hristovski, Dimitar; Dinevski, Dejan; Kastrin, Andrej; Rindflesch, Thomas C
2015-01-16
The proliferation of the scientific literature in the field of biomedicine makes it difficult to keep abreast of current knowledge, even for domain experts. While general Web search engines and specialized information retrieval (IR) systems have made important strides in recent decades, the problem of accurate knowledge extraction from the biomedical literature is far from solved. Classical IR systems usually return a list of documents that have to be read by the user to extract relevant information. This tedious and time-consuming work can be lessened with automatic Question Answering (QA) systems, which aim to provide users with direct and precise answers to their questions. In this work we propose a novel methodology for QA based on semantic relations extracted from the biomedical literature. We extracted semantic relations with the SemRep natural language processing system from 122,421,765 sentences, which came from 21,014,382 MEDLINE citations (i.e., the complete MEDLINE distribution up to the end of 2012). A total of 58,879,300 semantic relation instances were extracted and organized in a relational database. The QA process is implemented as a search in this database, which is accessed through a Web-based application, called SemBT (available at http://sembt.mf.uni-lj.si ). We conducted an extensive evaluation of the proposed methodology in order to estimate the accuracy of extracting a particular semantic relation from a particular sentence. Evaluation was performed by 80 domain experts. In total 7,510 semantic relation instances belonging to 2,675 distinct relations were evaluated 12,083 times. The instances were evaluated as correct 8,228 times (68%). In this work we propose an innovative methodology for biomedical QA. The system is implemented as a Web-based application that is able to provide precise answers to a wide range of questions. A typical question is answered within a few seconds. The tool has some extensions that make it especially useful for interpretation of DNA microarray results.
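Question answering as a search over extracted relations can be sketched as follows; the table layout is an illustrative assumption, not the actual SemBT schema, though TREATS is a real SemRep predicate:

```python
# QA-as-database-search: a question becomes a lookup on (subject, predicate)
# over subject-predicate-object relation instances.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE relation
                (subject TEXT, predicate TEXT, object TEXT, sentence TEXT)""")
conn.executemany("INSERT INTO relation VALUES (?, ?, ?, ?)", [
    ("metformin", "TREATS", "type 2 diabetes", "source sentence / citation"),
    ("aspirin",   "TREATS", "headache",        "source sentence / citation"),
])

# 'What does metformin treat?' -> all objects of (metformin, TREATS, ?).
answers = conn.execute(
    "SELECT object, sentence FROM relation "
    "WHERE subject = ? AND predicate = ?",
    ("metformin", "TREATS")).fetchall()
print(answers)
```

Indexing the subject and predicate columns is what makes answering a typical question a matter of seconds over tens of millions of relation instances.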
[Establishment of a comprehensive database for laryngeal cancer related genes and the miRNAs].
Li, Mengjiao; E, Qimin; Liu, Jialin; Huang, Tingting; Liang, Chuanyu
2015-09-01
By collecting and analyzing laryngeal cancer-related genes and miRNAs, we aimed to build a comprehensive laryngeal cancer-related gene database that, unlike current biological information databases with complex and unwieldy structures, focuses on the theme of genes and miRNAs, making research and teaching more convenient and efficient. Based on the B/S architecture, using Apache as the Web server, MySQL for the database design and PHP for the web design, a comprehensive database for laryngeal cancer-related genes was established, providing gene tables, protein tables, miRNA tables and clinical information tables for patients with laryngeal cancer. The established database contained 207 laryngeal cancer-related genes, 243 proteins, 26 miRNAs, and their particular information such as mutations, methylations, differential expression, and the empirical references of laryngeal cancer-relevant molecules. The database can be accessed and operated via the Internet, through which browsing and retrieval of the information are performed. The database is maintained and updated regularly. The database for laryngeal cancer-related genes is resource-integrated and user-friendly, providing a genetic information query tool for the study of laryngeal cancer.
Pourabbasi, Ata; Farzami, Jalal; Shirvani, Mahbubeh-Sadat Ebrahimnegad; Shams, Amir Hossein; Larijani, Bagher
2017-01-01
One of the main uses of social networks in clinical studies is facilitating the process of sampling and case finding for scientists. The main focus of this study is to compare two different sampling methods: phone calls and a social network. One of the researchers called 214 families of children with diabetes over 90 days. After this period, phone calls stopped, and the team started communicating with families through Telegram, a virtual social network, for 30 days. The number of children who participated in the study was evaluated. Although the Telegram method ran 60 days shorter than the phone call method, the researchers found that the proportion of participants recruited through Telegram (17.6%) did not differ significantly from that recruited by phone call (12.9%). Using social networks can be suggested as a beneficial method for local researchers looking for easier sampling, winning participants' trust, following up on procedures, and maintaining an easily accessible database.
Liu, Yu; Hong, Yang; Lin, Chun-Yuan; Hung, Che-Lun
2015-01-01
The Smith-Waterman (SW) algorithm has been widely utilized for searching biological sequence databases in bioinformatics. Recently, several works have adopted graphics cards with Graphics Processing Units (GPUs) and the associated CUDA model to enhance the performance of SW computations. However, these works mainly focused on protein database search using the intertask parallelization technique, employing the GPU only to perform the SW computations one by one. Hence, in this paper, we propose an efficient SW alignment method, called CUDA-SWfr, for protein database search that uses the intratask parallelization technique based on a CPU-GPU collaborative system. Before the SW computations are done on the GPU, a procedure is applied on the CPU using the frequency distance filtration scheme (FDFS) to eliminate unnecessary alignments. The experimental results indicate that CUDA-SWfr runs 9.6 times and 96 times faster than the CPU-based SW method without and with FDFS, respectively.
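The filter-then-align pattern can be sketched in pure Python; the composition-based filter and its threshold below are illustrative stand-ins, not the paper's exact FDFS definition:

```python
# Cheap composition statistics prune candidates before the costly
# Smith-Waterman pass (done on the GPU in the paper, on the CPU here).
from collections import Counter

def freq_distance(a: str, b: str) -> int:
    """L1 distance between residue-count vectors (a cheap dissimilarity)."""
    ca, cb = Counter(a), Counter(b)
    return sum(abs(ca[r] - cb[r]) for r in set(ca) | set(cb))

def smith_waterman(a: str, b: str, match=2, mismatch=-1, gap=-2) -> int:
    """Plain O(len(a)*len(b)) local alignment score."""
    rows = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    best = 0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            diag = rows[i-1][j-1] + (match if a[i-1] == b[j-1] else mismatch)
            rows[i][j] = max(0, diag, rows[i-1][j] + gap, rows[i][j-1] + gap)
            best = max(best, rows[i][j])
    return best

query, database = "MKLVFLAG", ["MKLVFLTG", "PPPPPPPP", "MKIVFLAG"]
for seq in database:
    if freq_distance(query, seq) > 6:       # filtration step
        continue                            # skip the expensive alignment
    print(seq, smith_waterman(query, seq))  # alignment step
```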
TRedD—A database for tandem repeats over the edit distance
Sokol, Dina; Atagun, Firat
2010-01-01
A ‘tandem repeat’ in DNA is a sequence of two or more contiguous, approximate copies of a pattern of nucleotides. Tandem repeats are common in the genomes of both eukaryotic and prokaryotic organisms. They are significant markers for human identity testing, disease diagnosis, sequence homology and population studies. In this article, we describe a new database, TRedD, which contains the tandem repeats found in the human genome. The database is publicly available online, and the software for locating the repeats is also freely available. The definition of tandem repeats used by TRedD is a new and innovative definition based upon the concept of ‘evolutive tandem repeats’. In addition, we have developed a tool, called TandemGraph, to graphically depict the repeats occurring in a sequence. This tool can be coupled with any repeat finding software, and it should greatly facilitate analysis of results. Database URL: http://tandem.sci.brooklyn.cuny.edu/ PMID:20624712
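At the core of such a definition is the edit distance between adjacent segments; a toy sketch of the pairwise test is below (TRedD's full evolutive definition chains many approximate copies, which is not reproduced here):

```python
# Two adjacent segments count as approximate copies when their edit
# distance (insertions, deletions, substitutions) is within threshold k.
def edit_distance(s: str, t: str) -> int:
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]
        for j, ct in enumerate(t, start=1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j-1] + 1,            # insertion
                            prev[j-1] + (cs != ct)))  # substitution
        prev = curr
    return prev[-1]

def is_approx_tandem(segment_a: str, segment_b: str, k: int = 1) -> bool:
    return edit_distance(segment_a, segment_b) <= k

print(is_approx_tandem("ACGTA", "ACGGA"))  # True: one substitution apart
```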
2009-01-01
Background Insertional mutagenesis is an effective method for functional genomic studies in various organisms. It can rapidly generate easily tractable mutations. A large-scale insertional mutagenesis with the piggyBac (PB) transposon is currently performed in mice at the Institute of Developmental Biology and Molecular Medicine (IDM), Fudan University in Shanghai, China. This project is carried out via collaborations among multiple groups overseeing interconnected experimental steps and generates a large volume of experimental data continuously. Therefore, the project calls for an efficient database system for recording, management, statistical analysis, and information exchange. Results This paper presents a database application called MP-PBmice (insertional mutation mapping system of PB Mutagenesis Information Center), which was developed to serve the ongoing large-scale PB insertional mutagenesis project. The lightweight enterprise-level development framework Struts-Spring-Hibernate is used to provide robust and flexible support for the application. The MP-PBmice database system has three major features: strict access control, efficient workflow control, and good expandability. It supports the collaboration among different groups that enter data and exchange information on a daily basis, and is capable of providing real-time progress reports for the whole project. MP-PBmice can be easily adapted for other large-scale insertional mutation mapping projects, and the source code of this software is freely available at http://www.idmshanghai.cn/PBmice. Conclusion MP-PBmice is a web-based application for large-scale insertional mutation mapping onto the mouse genome, implemented with the widely used framework Struts-Spring-Hibernate. This system is already in use by the on-going genome-wide PB insertional mutation mapping project at IDM, Fudan University. PMID:19958505
Visualization and interaction tools for aerial photograph mosaics
NASA Astrophysics Data System (ADS)
Fernandes, João Pedro; Fonseca, Alexandra; Pereira, Luís; Faria, Adriano; Figueira, Helder; Henriques, Inês; Garção, Rita; Câmara, António
1997-05-01
This paper describes the development of a digital spatial library based on mosaics of digital orthophotos, called Interactive Portugal, that will enable users both to retrieve geospatial information existing in the Portuguese National System for Geographic Information World Wide Web server, and to develop local databases connected to the main system. A set of navigation, interaction, and visualization tools are proposed and discussed. They include sketching, dynamic sketching, and navigation capabilities over the digital orthophotos mosaics. Main applications of this digital spatial library are pointed out and discussed, namely for education, professional, and tourism markets. Future developments are considered. These developments are related to user reactions, technological advancements, and projects that also aim at delivering and exploring digital imagery on the World Wide Web. Future capabilities for site selection and change detection are also considered.
Content-level deduplication on mobile internet datasets
NASA Astrophysics Data System (ADS)
Hou, Ziyu; Chen, Xunxun; Wang, Yang
2017-06-01
Various systems and applications involve a large volume of duplicate items. Given the high data redundancy in real-world datasets, data deduplication can reduce storage requirements and improve the utilization of network bandwidth. However, the chunks used by existing deduplication systems range in size from 4 KB to over 16 KB, so these systems are not applicable to datasets consisting of short records. In this paper, we propose a new framework called SF-Dedup which is able to implement the deduplication process on a large set of Mobile Internet records whose size can be smaller than 100 B, or even smaller than 10 B. SF-Dedup is a short-fingerprint, in-line deduplication scheme that resolves hash collisions. Results of experimental applications illustrate that SF-Dedup is able to reduce storage capacity and shorten query time on a relational database.
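The idea, a short truncated fingerprint as the index key with full-record comparison to resolve collisions, can be sketched as follows; the fingerprint width and hash choice are illustrative assumptions, not the paper's parameters:

```python
# Short-fingerprint deduplication with explicit collision resolution.
import hashlib

def short_fingerprint(record: bytes, width: int = 4) -> bytes:
    """Truncated hash: small index entries, at the price of collisions."""
    return hashlib.sha1(record).digest()[:width]

def deduplicate(records: list[bytes]) -> list[bytes]:
    index: dict[bytes, list[bytes]] = {}
    unique = []
    for rec in records:
        bucket = index.setdefault(short_fingerprint(rec), [])
        if rec not in bucket:      # exact comparison settles collisions
            bucket.append(rec)
            unique.append(rec)
    return unique

records = [b"user:1|GET /a", b"user:2|GET /b", b"user:1|GET /a"]
print(len(deduplicate(records)))  # 2
```

Truncating the fingerprint keeps the index small enough for very short records, and the per-bucket exact comparison guarantees that collisions never merge distinct records.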
NASA Astrophysics Data System (ADS)
Tinti, S.; Armigliato, A.; Pagnoni, G.; Zaniboni, F.
2012-04-01
One of the most challenging goals that the geo-scientific community is facing after the catastrophic tsunami that occurred in December 2004 in the Indian Ocean is to develop the so-called "next generation" Tsunami Early Warning Systems (TEWS). Indeed, the meaning of "next generation" does not refer to the aim of a TEWS, which obviously remains to detect whether a tsunami has been generated or not by a given source and, in the first case, to send proper warnings and/or alerts in a suitable time to all the countries and communities that can be affected by the tsunami. Instead, "next generation" refers to the development of a Decision Support System (DSS) that, in general terms, relies on 1) an integrated set of seismic, geodetic and marine sensors whose objective is to detect and characterise the possible tsunamigenic sources and to monitor instrumentally the time and space evolution of the generated tsunami, 2) databases of pre-computed numerical tsunami scenarios to be suitably combined based on the information coming from the sensor environment and to be used to forecast the degree of exposure of different coastal places both in the near- and in the far-field, 3) a proper overall (software) system architecture. The EU-FP7 TRIDEC Project aims at developing such a DSS and has selected two test areas in the Euro-Mediterranean region, namely the western Iberian margin and the eastern Mediterranean (Turkish coasts). In this study, we discuss the strategies that are being adopted in TRIDEC to build the databases of pre-computed tsunami scenarios and we show some applications to the western Iberian margin. In particular, two different databases are being populated, called the "Virtual Scenario Database" (VSDB) and the "Matching Scenario Database" (MSDB). The VSDB contains detailed simulations of a few selected earthquake-generated tsunamis. The cases included in the VSDB are computed "real events"; in other words, they represent the unknowns that the TRIDEC platform must be able to recognise and match during the early crisis management phase. The MSDB contains a very large number (order of thousands) of tsunami simulations performed starting from many different simple earthquake sources of different magnitudes and located in the "vicinity" of the virtual scenario earthquake. Examples from both databases will be presented.
A Relational Database System for Student Use.
ERIC Educational Resources Information Center
Fertuck, Len
1982-01-01
Describes an APL implementation of a relational database system suitable for use in a teaching environment in which database development and database administration are studied, and discusses the functions of the user and the database administrator. An appendix illustrating system operation and an eight-item reference list are attached. (Author/JL)
Nolan, Jerry P; Soar, Jasmeet; Smith, Gary B; Gwinnutt, Carl; Parrott, Francesca; Power, Sarah; Harrison, David A; Nixon, Edel; Rowan, Kathryn
2014-08-01
To report the incidence, characteristics and outcome of adult in-hospital cardiac arrest in the United Kingdom (UK) National Cardiac Arrest Audit database. A prospectively defined analysis of the UK National Cardiac Arrest Audit (NCAA) database. 144 acute hospitals contributed data relating to 22,628 patients aged 16 years or over receiving chest compressions and/or defibrillation and attended by a hospital-based resuscitation team in response to a 2222 call. The main outcome measures were incidence of adult in-hospital cardiac arrest and survival to hospital discharge. The overall incidence of adult in-hospital cardiac arrest was 1.6 per 1000 hospital admissions with a median across hospitals of 1.5 (interquartile range 1.2-2.2). Incidence varied seasonally, peaking in winter. Overall unadjusted survival to hospital discharge was 18.4%. The presenting rhythm was shockable (ventricular fibrillation or pulseless ventricular tachycardia) in 16.9% and non-shockable (asystole or pulseless electrical activity) in 72.3%; rates of survival to hospital discharge associated with these rhythms were 49.0% and 10.5%, respectively, but varied substantially across hospitals. These first results from the NCAA database describing the current incidence and outcome of adult in-hospital cardiac arrest in UK hospitals will serve as a benchmark from which to assess the future impact of changes in service delivery, organisation and treatment for in-hospital cardiac arrest. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Foster, Joseph M; Moreno, Pablo; Fabregat, Antonio; Hermjakob, Henning; Steinbeck, Christoph; Apweiler, Rolf; Wakelam, Michael J O; Vizcaíno, Juan Antonio
2013-01-01
Protein sequence databases are the pillar upon which modern proteomics is supported, representing a stable reference space of predicted and validated proteins. One example of such resources is UniProt, enriched with both expertly curated and automatic annotations. While often taken for granted, similarly mature resources are not yet available in some other "omics" fields, lipidomics being one of them. While having a seasoned community of wet lab scientists, lipidomics lies significantly behind proteomics in the adoption of data standards and other core bioinformatics concepts. This work aims to reduce the gap by developing an equivalent resource to UniProt called 'LipidHome', providing theoretically generated lipid molecules and useful metadata. Using the 'FASTLipid' Java library, a database was populated with theoretical lipids, generated from a set of community-agreed chemical bounds. In parallel, a web application was developed to present the information and provide computational access via a web service. Designed specifically to accommodate high-throughput mass spectrometry based approaches, lipids are organised into a hierarchy that reflects the variety in the structural resolution of lipid identifications. Additionally, cross-references to other lipid-related resources and papers that cite specific lipids were used to annotate lipid records. The web application encompasses a browser for viewing lipid records and a 'tools' section where an MS1 search engine is currently implemented. LipidHome can be accessed at http://www.ebi.ac.uk/apweiler-srv/lipidhome.
A Novel Method for Constructing a WIFI Positioning System with Efficient Manpower
Du, Yuanfeng; Yang, Dongkai; Xiu, Chundi
2015-01-01
With the rapid development of WIFI technology, WIFI-based indoor positioning technology has been widely studied for location-based services. To solve the problems related to the signal strength database adopted in the widely used fingerprint positioning technology, we first introduce a new system framework in this paper, which includes a modified AP firmware and some cheap self-made WIFI sensor anchors. The periodically scanned reports regarding the neighboring APs and sensor anchors are sent to the positioning server and serve as the calibration points. Besides the calculation of correlations between the target points and the neighboring calibration points, we take full advantage of the important but easily overlooked feature that the signal attenuation model varies across regions in the regression algorithm to get more accurate results. Thus, a novel method called RSSI Geography Weighted Regression (RGWR) is proposed to solve the fingerprint database construction problem. The average error of all the calibration points' self-localization results helps to make the final decision of whether the database is up to date or has to be updated automatically. The effects of anchors on system performance are further investigated, leading to the conclusion that anchors should be deployed at locations that represent the features of the RSSI distributions. The proposed system is convenient for the establishment of a practical positioning system, and extensive experiments have been performed to validate that the proposed method is robust and manpower-efficient. PMID:25868078
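The region-dependent attenuation idea can be illustrated with a weighted least-squares fit of the standard log-distance path-loss model, RSSI = A - 10 n log10(d), fitted per region with proximity weights; the values and the distance-based weighting below are illustrative, not the paper's parameters:

```python
# Weighted least squares for (A, n) in RSSI = A - 10*n*log10(d),
# as one would fit separately for each region's calibration points.
import numpy as np

def fit_path_loss(d: np.ndarray, rssi: np.ndarray, w: np.ndarray):
    """Return [A, n] minimizing the weighted squared residuals."""
    X = np.column_stack([np.ones_like(d), -10.0 * np.log10(d)])
    W = np.diag(w)
    return np.linalg.solve(X.T @ W @ X, X.T @ W @ rssi)

rng = np.random.default_rng(2)
d = rng.uniform(1.0, 30.0, 50)                       # metres to the AP
rssi = -40.0 - 10 * 2.8 * np.log10(d) + rng.normal(0, 2.0, 50)
weights = 1.0 / d                                    # nearer points count more
A, n = fit_path_loss(d, rssi, weights)
print(round(A, 1), round(n, 2))                      # ~ -40, ~2.8
```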
Yue, Ming; Zhou, Dianshuang; Zhi, Hui; Wang, Peng; Zhang, Yan; Gao, Yue; Guo, Maoni; Li, Xin; Wang, Yanxia; Zhang, Yunpeng; Ning, Shangwei; Li, Xia
2018-01-04
The MiRNA SNP Disease Database (MSDD, http://www.bio-bigdata.com/msdd/) is a manually curated database that provides comprehensive experimentally supported associations among microRNAs (miRNAs), single nucleotide polymorphisms (SNPs) and human diseases. SNPs in miRNA-related functional regions such as mature miRNAs, promoter regions, pri-miRNAs, pre-miRNAs and target gene 3'-UTRs, collectively called 'miRSNPs', represent a novel category of functional molecules. miRSNPs can lead to dysregulation of miRNAs and their target genes, resulting in susceptibility to or onset of human diseases. A curated collection and summary of miRSNP-associated diseases is essential for a thorough understanding of the mechanisms and functions of miRSNPs. Here, we describe MSDD, which currently documents 525 associations among 182 human miRNAs, 197 SNPs, 153 genes and 164 human diseases through a review of more than 2000 published papers. Each association incorporates information on the miRNAs, SNPs, miRNA target genes and disease names, SNP locations and alleles, the miRNA dysfunctional pattern, experimental techniques, a brief functional description, the original reference and additional annotation. MSDD provides a user-friendly interface to conveniently browse, retrieve, download and submit novel data. MSDD will significantly improve our understanding of miRNA dysfunction in disease, and thus has the potential to serve as a timely and valuable resource. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Adderson, Elisabeth E.; Boudreaux, Jan W.; Cummings, Jessica R.; Pounds, Stanley; Wilson, Deborah A.; Procop, Gary W.; Hayden, Randall T.
2008-01-01
We compared the relative effectiveness of three commercial identification kits and three nucleic acid amplification tests for the identification of coryneform bacteria by testing 50 diverse isolates, including 12 well-characterized control strains and 38 organisms obtained from pediatric oncology patients at our institution. Between 33.3 and 75.0% of control strains were correctly identified to the species level by phenotypic systems or nucleic acid amplification assays. The most sensitive tests were the API Coryne system and amplification and sequencing of the 16S rRNA gene using primers optimized for coryneform bacteria, which correctly identified 9 of 12 control isolates to the species level; all strains with a high-confidence call were correctly identified. Organisms that were not correctly identified were species absent from the test kit databases, species producing reaction patterns not included in the kit databases, or species that could not be differentiated among several genospecies on the basis of reaction patterns. Nucleic acid amplification assays had limited abilities to identify some bacteria to the species level, and comparison of sequence homologies was complicated by the inclusion in databases of allele sequences obtained from uncultivated and uncharacterized strains. The utility of rpoB genotyping was limited by the small number of representative gene sequences currently available for comparison. The correlation between identifications produced by different classification systems was poor, particularly for clinical isolates. PMID:18160450
Semantic annotation of Web data applied to risk in food.
Hignette, Gaëlle; Buche, Patrice; Couvert, Olivier; Dibie-Barthélemy, Juliette; Doussot, David; Haemmerlé, Ollivier; Mettler, Eric; Soler, Lydie
2008-11-30
A preliminary step in food risk assessment is the gathering of experimental data. In the framework of the Sym'Previus project (http://www.symprevius.org), a complete data integration system has been designed, grouping data provided by industrial partners and data extracted from papers published in the main scientific journals of the domain. Those data have been classified by means of a predefined vocabulary, called an ontology. Our aim is to complement the database with data extracted from the Web. In the framework of the WebContent project (www.webcontent.fr), we have designed a semi-automatic acquisition tool, called @WEB, which retrieves scientific documents from the Web. During the @WEB process, data tables are extracted from the documents and then annotated with the ontology. We focus on the data tables as they generally contain a synthesis of the data published in the documents. In this paper, we explain how the columns of the data tables are automatically annotated with data types of the ontology and how the relations represented by the tables are recognised. We also give the results of our experiments assessing the quality of such an annotation.
NASA Astrophysics Data System (ADS)
Gentry, Jeffery D.
2000-05-01
A relational database is a powerful tool for collecting and analyzing the vast amounts of interrelated data associated with the manufacture of composite materials. A relational database contains many individual database tables that store data related in some fashion. Manufacturing process variables as well as quality assurance measurements can be collected and stored in database tables indexed according to lot numbers, part type or individual serial numbers. Relationships between manufacturing process and product quality can then be correlated over a wide range of product types and process variations. This paper presents details on how relational databases are used to collect, store, and analyze process variables and quality assurance data associated with the manufacture of advanced composite materials. Important considerations are covered, including how the various types of data are organized and how relationships between the data are defined. Employing relational database techniques to establish correlative relationships between process variables and quality assurance measurements is then explored. Finally, the benefits of database techniques such as data warehousing, data mining and web-based client/server architectures are discussed in the context of composite material manufacturing.
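As a concrete illustration of the indexing scheme described above, here is a minimal sketch of such a schema using SQLite from Python; the table layout, column names, and values are assumptions for illustration, not the paper's actual design.

```python
# Sketch: process variables and QA measurements keyed by lot number,
# joined to look for process/quality relationships. All data invented.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE lots(lot_id TEXT PRIMARY KEY, part_type TEXT);
CREATE TABLE process(lot_id TEXT REFERENCES lots, cure_temp_c REAL, cure_time_min REAL);
CREATE TABLE qa(lot_id TEXT REFERENCES lots, void_content_pct REAL, tensile_mpa REAL);
""")
con.executemany("INSERT INTO lots VALUES (?, ?)",
                [("L001", "spar"), ("L002", "spar"), ("L003", "skin")])
con.executemany("INSERT INTO process VALUES (?, ?, ?)",
                [("L001", 177, 120), ("L002", 182, 110), ("L003", 177, 125)])
con.executemany("INSERT INTO qa VALUES (?, ?, ?)",
                [("L001", 1.2, 550), ("L002", 2.1, 510), ("L003", 0.9, 565)])

# Join process variables to quality measurements across all lots.
for row in con.execute("""
    SELECT p.cure_temp_c, q.void_content_pct, q.tensile_mpa
    FROM process p JOIN qa q USING (lot_id)
    ORDER BY p.cure_temp_c"""):
    print(row)
```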
Relational Databases and Biomedical Big Data.
de Silva, N H Nisansa D
2017-01-01
In various biomedical applications that collect, handle, and manipulate data, the amounts of data tend to build up and venture into the range identified as big data. In such cases, a design decision must be made as to what type of database will be used to handle the data. More often than not, past research shows that the default and classical solution in the biomedical domain is the relational database. While this was the norm for a long while, there is an evident trend away from relational databases in favor of other database types and paradigms. Nevertheless, it remains of paramount importance to understand the interrelation between biomedical big data and relational databases. This chapter reviews the pros and cons of using relational databases to store biomedical big data, as discussed and applied in previous research.
ERIC Educational Resources Information Center
Chudgar, Amita; Luschei, Thomas F.
2016-01-01
The objective of this commentary is to call attention to the feasibility and importance of large-scale, systematic, quantitative analysis in international and comparative education research. We contend that although many existing databases are under- or unutilized in quantitative international-comparative research, these resources present the…
Mining Hidden Gems Beneath the Surface: A Look At the Invisible Web.
ERIC Educational Resources Information Center
Carlson, Randal D.; Repman, Judi
2002-01-01
Describes resources for researchers called the Invisible Web that are hidden from the usual search engines and other tools and contrasts them with those resources available on the surface Web. Identifies specialized search tools, databases, and strategies that can be used to locate credible in-depth information. (Author/LRW)
Relationships between Computer Skills and Technostress: How Does This Affect Me?
ERIC Educational Resources Information Center
Shepherd, Sonya S. Gaither
2004-01-01
The creation of computer software and hardware, telecommunications, databases, and the Internet has affected society as a whole, and particularly higher education by giving people new productivity options and changing the way they work (Hulbert, 1998). In the so-called "Information Age" the increasing use of technology has become the driving force…
Integrating Health Information Systems into a Database Course: A Case Study
ERIC Educational Resources Information Center
Anderson, Nicole; Zhang, Mingrui; McMaster, Kirby
2011-01-01
Computer Science is a rich field with many growing application areas, such as Health Information Systems. What we suggest here is that multi-disciplinary threads can be introduced to supplement, enhance, and strengthen the primary area of study in a course. We call these supplementary materials "threads," because they are executed…
Childhood Obesity and Cardiovascular Disease: January 1985-May 1990. Quick Bibliography Series.
ERIC Educational Resources Information Center
Updegrove, Natalie A.
This bibliography consists of 212 recent citations (January 1985 through May 1990) from AGRICOLA, the National Agricultural Library (NAL) computerized database. The bibliography addresses issues concerning childhood obesity and cardiovascular disease. Each citation includes the NAL call number, the title, the author(s), the city of publication, the…
Federal Register 2010, 2011, 2012, 2013, 2014
2010-10-15
... request an appointment with the Disclosure Officer by telephoning (703) 905-5034 (not a toll-free call)... terrorism, and to implement counter-money laundering programs and compliance procedures. Regulations... public comments received in response to this notice. BSA E-Filing is a free service provided by Fin...
GPO Gate: University of California, San Diego's New Gateway to Electronic Government Information.
ERIC Educational Resources Information Center
Cruse, Patricia; Jahns, Cynthia
1996-01-01
Describes the development of a new interface called GPO Gate for accessing the Government Printing Office (GPO) WAIS (wide area information server) databases, GPO Access. Highlights include development and use of GPO Gate at the University of California, San Diego, and implications for public service. (Author/LRW)
Anthropology and environmental policy: What counts?
Susan Charnley; William H. Durham
2010-01-01
In this article, we call for enhanced quantitative and environmental analysis in the work of environmental anthropologists who wish to influence policy. Using a database of 77 leading monographs published between 1967 and 2006, 147 articles by the same authors, and a separate sample of 137 articles from the journal Human Organization, we document a...
The Case for Creating a Scholars Portal to the Web: A White Paper.
ERIC Educational Resources Information Center
Campbell, Jerry D.
2001-01-01
Considers the need for reliable, scholarly access to the Web and suggests that the Association for Research Libraries, in partnership with OCLC and the Library of Congress, develop a so-called scholar's portal. Topics include quality content; enhanced library services; and gateway functions, including access to commercial databases and focused…
IRIS TOXICOLOGICAL REVIEW AND SUMMARY DOCUMENTS FOR 2,2,4-TRIMETHYLPENTANE (EXTERNAL REVIEW DRAFT)
EPA has conducted a peer review of the scientific basis supporting the human health hazard and dose-response assessment of 2,2,4-trimethylpentane, also called TMP, that will appear on the Integrated Risk Information System (IRIS) database. Peer review is meant to ensure that scie...
Qian, Jianjun; Yang, Jian; Xu, Yong
2013-09-01
This paper presents a robust but simple image feature extraction method, called image decomposition based on local structure (IDLS). It is assumed that in the local window of an image, the macro-pixel (patch) of the central pixel, and those of its neighbors, are locally linear. IDLS captures the local structural information by describing the relationship between the central macro-pixel and its neighbors. This relationship is represented with the linear representation coefficients determined using ridge regression. One image is actually decomposed into a series of sub-images (also called structure images) according to a local structure feature vector. All the structure images, after being down-sampled for dimensionality reduction, are concatenated into one super-vector. Fisher linear discriminant analysis is then used to provide a low-dimensional, compact, and discriminative representation for each super-vector. The proposed method is applied to face recognition and examined using our real-world face image database, NUST-RWFR, and five popular, publicly available, benchmark face image databases (AR, Extended Yale B, PIE, FERET, and LFW). Experimental results show the performance advantages of IDLS over state-of-the-art algorithms.
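The heart of IDLS is a per-pixel ridge regression expressing the central macro-pixel as a linear combination of its neighbors. A minimal numpy sketch follows, with random patches standing in for image data; patch sizes and the regularization value are arbitrary choices, not the paper's settings.

```python
# Illustrative sketch of the IDLS idea: represent a central patch as a
# ridge-regression combination of its neighboring patches.
import numpy as np

def local_structure_coeffs(center, neighbors, lam=0.1):
    """Ridge-regression coefficients expressing `center` (d-vector)
    as a linear combination of the columns of `neighbors` (d x k)."""
    A = neighbors
    k = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(k), A.T @ center)

rng = np.random.default_rng(0)
patch = rng.normal(size=9)        # 3x3 macro-pixel, flattened
neigh = rng.normal(size=(9, 8))   # its 8 neighboring macro-pixels
coeffs = local_structure_coeffs(patch, neigh)
print(coeffs)  # one structure-feature vector per pixel position
```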
Pathway Tools version 13.0: integrated software for pathway/genome informatics and systems biology
Paley, Suzanne M.; Krummenacker, Markus; Latendresse, Mario; Dale, Joseph M.; Lee, Thomas J.; Kaipa, Pallavi; Gilham, Fred; Spaulding, Aaron; Popescu, Liviu; Altman, Tomer; Paulsen, Ian; Keseler, Ingrid M.; Caspi, Ron
2010-01-01
Pathway Tools is a production-quality software environment for creating a type of model-organism database called a Pathway/Genome Database (PGDB). A PGDB such as EcoCyc integrates the evolving understanding of the genes, proteins, metabolic network and regulatory network of an organism. This article provides an overview of Pathway Tools capabilities. The software performs multiple computational inferences including prediction of metabolic pathways, prediction of metabolic pathway hole fillers and prediction of operons. It enables interactive editing of PGDBs by DB curators. It supports web publishing of PGDBs, and provides a large number of query and visualization tools. The software also supports comparative analyses of PGDBs, and provides several systems biology analyses of PGDBs including reachability analysis of metabolic networks, and interactive tracing of metabolites through a metabolic network. More than 800 PGDBs have been created using Pathway Tools by scientists around the world, many of which are curated DBs for important model organisms. Those PGDBs can be exchanged using a peer-to-peer DB sharing system called the PGDB Registry. PMID:19955237
A Relational Algebra Query Language for Programming Relational Databases
ERIC Educational Resources Information Center
McMaster, Kirby; Sambasivam, Samuel; Anderson, Nicole
2011-01-01
In this paper, we describe a Relational Algebra Query Language (RAQL) and Relational Algebra Query (RAQ) software product we have developed that allows database instructors to teach relational algebra through programming. Instead of defining query operations using mathematical notation (the approach commonly taken in database textbooks), students…
BioCreative V CDR task corpus: a resource for chemical disease relation extraction.
Li, Jiao; Sun, Yueping; Johnson, Robin J; Sciaky, Daniela; Wei, Chih-Hsuan; Leaman, Robert; Davis, Allan Peter; Mattingly, Carolyn J; Wiegers, Thomas C; Lu, Zhiyong
2016-01-01
Community-run, formal evaluations and manually annotated text corpora are critically important for advancing biomedical text-mining research. Recently in BioCreative V, a new challenge was organized for the tasks of disease named entity recognition (DNER) and chemical-induced disease (CID) relation extraction. Given the nature of both tasks, a test collection is required to contain both disease/chemical annotations and relation annotations in the same set of articles. Despite previous efforts in biomedical corpus construction, none was found to be sufficient for the task. Thus, we developed our own corpus, called BC5CDR, during the challenge by inviting a team of Medical Subject Headings (MeSH) indexers for disease/chemical entity annotation and Comparative Toxicogenomics Database (CTD) curators for CID relation annotation. To ensure high annotation quality and productivity, detailed annotation guidelines and automatic annotation tools were provided. The resulting BC5CDR corpus consists of 1500 PubMed articles with 4409 annotated chemicals, 5818 diseases and 3116 chemical-disease interactions. Each entity annotation includes both the mention text spans and normalized concept identifiers, using MeSH as the controlled vocabulary. To ensure accuracy, the entities were first captured independently by two annotators, followed by a consensus annotation: the average inter-annotator agreement (IAA) scores were 87.49% and 96.05% for diseases and chemicals, respectively, in the test set according to the Jaccard similarity coefficient. Our corpus was successfully used for the BioCreative V challenge tasks and should serve as a valuable resource for the text-mining research community. Database URL: http://www.biocreative.org/tasks/biocreative-v/track-3-cdr/. Published by Oxford University Press 2016. This work is written by US Government employees and is in the public domain in the United States.
Illinois hospital using Web to build database for relationship marketing.
Rees, T
2000-01-01
Silver Cross Hospital and Medical Centers, Joliet, Ill., is promoting its Web site as a tool for gathering health information about patients and prospective patients in order to build a relationship marketing database. The database will enable the hospital to identify the health care needs of consumers in Joliet, Will County and many southwestern suburbs of Chicago. The Web site is promoted in a multimedia advertising campaign that invites residents to participate in a Healthy Living Quiz that rewards respondents with free health screenings. The effort is part of a growing planning and marketing strategy in the health care industry called customer relationship management (CRM). Not only does a total CRM plan offer health care organizations the chance to discover the potential for meeting consumers' needs, it also helps find any marketplace gaps that may exist.
Ibmdbpy-spatial : An Open-source implementation of in-database geospatial analytics in Python
NASA Astrophysics Data System (ADS)
Roy, Avipsa; Fouché, Edouard; Rodriguez Morales, Rafael; Moehler, Gregor
2017-04-01
As the amount of spatial data acquired from several geodetic sources has grown over the years and as data infrastructure has become more powerful, the need for adoption of in-database analytic technology within the geosciences has grown rapidly. In-database analytics on spatial data stored in a traditional enterprise data warehouse enables much faster retrieval and analysis for making better predictions about risks and opportunities, identifying trends and spotting anomalies. Although a number of open-source spatial analysis libraries such as geopandas and shapely are available today, most of them are restricted to the manipulation and analysis of geometric objects, with a dependency on GEOS and similar libraries. We present an open-source software package, written in Python, to fill the gap between spatial analysis and in-database analytics. Ibmdbpy-spatial provides a geospatial extension to the ibmdbpy package, implemented in 2015. It provides an interface for spatial data manipulation and access to in-database algorithms in IBM dashDB, a data warehouse platform with a spatial extender that runs as a service on IBM's cloud platform, Bluemix. Working in-database reduces network overload, as the complete data need not be replicated to the user's local system; only a subset of the dataset is fetched into memory at any one time. Ibmdbpy-spatial accelerates Python analytics by seamlessly pushing operations written in Python into the underlying database for execution using the dashDB spatial extender, thereby benefiting from in-database performance-enhancing features such as columnar storage and parallel processing. The package currently supports Python versions from 2.7 up to 3.4. The basic architecture of the package consists of three main components: 1) a connection to dashDB represented by the instance IdaDataBase, which uses a middleware API (pypyodbc or jaydebeapi) to establish the database connection via ODBC or JDBC, respectively; 2) an instance representing the spatial data stored in the database as a dataframe in Python, called the IdaGeoDataFrame, with a specific geometry attribute that recognises a planar geometry column in dashDB; and 3) Python wrappers for spatial functions like within, distance, area, buffer and more, which dashDB currently supports, to make the querying process from Python much simpler for users. The spatial functions translate well-known geopandas-like syntax into SQL queries, utilising the database connection to perform spatial operations in-database, and can operate on single geometries as well as on two geometries from different IdaGeoDataFrames. The in-database queries strictly follow the standards of the OpenGIS Implementation Specification for Geographic information - Simple feature access for SQL. The results of the operations can thereby be accessed dynamically via interactive Jupyter notebooks from any system that supports Python, without additional dependencies, and can also be combined with other open-source libraries such as matplotlib and folium within Jupyter notebooks for visualization purposes. We built a use case analysing crime hotspots in New York City to validate our implementation and visualized the results as a choropleth map for each borough.
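Putting the three components together, a typical session might look like the sketch below. The module paths, table names, DSN, and method signatures are assumptions reconstructed from the description above and may differ from the released ibmdbpy/ibmdbpy-spatial package.

```python
# Usage sketch assembled from the abstract's description; exact names are
# assumptions, not verified against the released package.
from ibmdbpy import IdaDataBase           # connection object (via ODBC/JDBC)
from ibmdbpy.geo import IdaGeoDataFrame   # assumed import path

# 1) Connect to dashDB through a middleware driver (pypyodbc or jaydebeapi).
idadb = IdaDataBase(dsn="DASHDB")         # DSN name is illustrative

# 2) Wrap tables with planar geometry columns as geo-dataframes.
crimes = IdaGeoDataFrame(idadb, "NYC.CRIMES", indexer="ID")
crimes.set_geometry("LOCATION")           # recognise the geometry column
boroughs = IdaGeoDataFrame(idadb, "NYC.BOROUGHS", indexer="BORO_ID")
boroughs.set_geometry("SHAPE")

# 3) Geopandas-like predicates are pushed down to dashDB as spatial SQL.
hotspots = crimes.within(boroughs)        # executed in-database
print(hotspots.head())

idadb.close()
```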
WASP: a Web-based Allele-Specific PCR assay designing tool for detecting SNPs and mutations
Wangkumhang, Pongsakorn; Chaichoompu, Kridsadakorn; Ngamphiw, Chumpol; Ruangrit, Uttapong; Chanprasert, Juntima; Assawamakin, Anunchai; Tongsima, Sissades
2007-01-01
Background Allele-specific (AS) Polymerase Chain Reaction is a convenient and inexpensive method for genotyping Single Nucleotide Polymorphisms (SNPs) and mutations. It is applied in many recent studies including population genetics, molecular genetics and pharmacogenomics. Using known AS primer design tools to create primers is a cumbersome process for inexperienced users, since information about the SNP/mutation must be acquired from public databases prior to the design. Furthermore, most of these tools do not offer mismatch enhancement of the designed primers. The available web applications do not provide a user-friendly graphical input interface or intuitive visualization of their primer results. Results This work presents a web-based AS primer design application called WASP. This tool can efficiently design AS primers for human SNPs as well as mutations. To assist scientists with collecting the necessary information about target polymorphisms, the tool provides a local SNP database containing over 10 million SNPs of various populations from public domain databases, namely NCBI dbSNP, HapMap and JSNP. This database is tightly integrated with the tool so that users can perform the design for existing SNPs without leaving the site. To guarantee the specificity of AS primers, the proposed system incorporates a primer specificity enhancement technique widely used in experimental protocols. In particular, WASP exploits the destabilizing effect of introducing one deliberate 'mismatch' at the penultimate (second to last of the 3'-end) base of AS primers to improve the resulting primers. Furthermore, WASP offers a graphical user interface, drawn with scalable vector graphics (SVG), that allows users to select SNPs and graphically visualize the designed primers and their conditions. Conclusion WASP offers a tool for designing AS primers for both SNPs and mutations. By integrating a database of known SNPs (searchable by gene ID or rs number), this tool eases the awkward process of getting flanking sequences and other related information from public SNP databases. It takes into account the underlying destabilizing effect to ensure the effectiveness of designed primers. With its user-friendly SVG interface, WASP intuitively presents the resulting designed primers, assisting users to export them or make further adjustments to the design. This software can be freely accessed at . PMID:17697334
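The penultimate-mismatch trick lends itself to a compact illustration. Below is a toy sketch assuming a simplified substitution rule (replace the penultimate 3' base with its complement, so it can no longer pair with the template); WASP's actual rules for choosing the substituted base are more nuanced.

```python
# Toy sketch of the mismatch-enhancement idea: introduce one deliberate
# mismatch at the penultimate (second-to-last) 3'-end base of an
# allele-specific primer. The substitution rule here is a simplification.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def enhance_specificity(primer: str) -> str:
    """Return the primer with its penultimate 3' base deliberately changed."""
    bases = list(primer.upper())
    # Replacing the base with its complement guarantees it differs from the
    # original template-matching base, creating a mismatch at that position.
    bases[-2] = COMPLEMENT[bases[-2]]
    return "".join(bases)

primer = "ACGTGGCTAACGT"   # 3' end on the right; allele-specific base is last
print(enhance_specificity(primer))
```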
A dynamic clinical dental relational database.
Taylor, D; Naguib, R N G; Boulton, S
2004-09-01
The traditional approach to relational database design is based on the logical organization of data into a number of related normalized tables. One assumption is that the nature and structure of the data is known at the design stage. In the case of designing a relational database to store historical dental epidemiological data from individual clinical surveys, the structure of the data is not known until the data is presented for inclusion into the database. This paper addresses the issues concerned with the theoretical design of a clinical dynamic database capable of adapting the internal table structure to accommodate clinical survey data, and presents a prototype database application capable of processing, displaying, and querying the dental data.
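A minimal sketch of this adaptive-table idea follows, assuming survey records arrive as key-value pairs and using SQLite for brevity; the column handling is far simpler than a production design would require.

```python
# Sketch: derive a table definition from whatever columns a clinical survey
# presents, rather than fixing the schema at design time. Data invented.
import sqlite3

def load_survey(con, survey_name, records):
    """Create a table whose columns mirror the keys of the survey records."""
    cols = sorted({k for r in records for k in r})
    col_defs = ", ".join(f'"{c}" TEXT' for c in cols)
    con.execute(f'CREATE TABLE "{survey_name}" ({col_defs})')
    placeholders = ", ".join("?" for _ in cols)
    con.executemany(
        f'INSERT INTO "{survey_name}" VALUES ({placeholders})',
        [tuple(r.get(c) for c in cols) for r in records],
    )

con = sqlite3.connect(":memory:")
load_survey(con, "survey_1998", [
    {"subject_id": "S1", "dmft": "3", "region": "NW"},
    {"subject_id": "S2", "dmft": "0"},   # missing keys become NULL
])
print(con.execute('SELECT * FROM "survey_1998"').fetchall())
```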
The Danish Testicular Cancer database.
Daugaard, Gedske; Kier, Maria Gry Gundgaard; Bandak, Mikkel; Mortensen, Mette Saksø; Larsson, Heidi; Søgaard, Mette; Toft, Birgitte Groenkaer; Engvad, Birte; Agerbæk, Mads; Holm, Niels Vilstrup; Lauritsen, Jakob
2016-01-01
The nationwide Danish Testicular Cancer database consists of a retrospective research database (DaTeCa database) and a prospective clinical database (Danish Multidisciplinary Cancer Group [DMCG] DaTeCa database). The aim is to improve the quality of care for patients with testicular cancer (TC) in Denmark, that is, by identifying risk factors for relapse, toxicity related to treatment, and focusing on late effects. All Danish male patients with a histologically verified germ cell cancer diagnosis in the Danish Pathology Registry are included in the DaTeCa databases. Data collection has been performed from 1984 to 2007 and from 2013 onward, respectively. The retrospective DaTeCa database contains detailed information, with more than 300 variables related to histology, stage, treatment, relapses, pathology, tumor markers, kidney function, lung function, etc. A questionnaire related to late effects has been conducted, which includes questions regarding social relationships, life situation, general health status, family background, diseases, symptoms, use of medication, marital status, psychosocial issues, fertility, and sexuality. TC survivors alive in October 2014 were invited to fill in this questionnaire, which includes 160 validated questions. Collection of questionnaires is still ongoing. A biobank including blood/sputum samples for future genetic analyses has been established; samples related to both the DaTeCa and DMCG DaTeCa databases are included. The prospective DMCG DaTeCa database includes variables regarding histology, stage, prognostic group, and treatment. The DMCG DaTeCa database has existed since 2013 and is a young clinical database. It is necessary to extend the data collection in the prospective database in order to answer quality-related questions. Data from the retrospective database will be added to the prospective data, resulting in a large and very comprehensive database for future studies on TC patients.
Lee, Ken Ka-Yin; Tang, Wai-Choi; Choi, Kup-Sze
2013-04-01
Clinical data are dynamic in nature, often arranged hierarchically and stored as free text and numbers. Effective management of clinical data and the transformation of the data into structured format for data analysis are therefore challenging issues in electronic health records development. Despite the popularity of relational databases, the scalability of the NoSQL database model and the document-centric data structure of XML databases appear to be promising features for effective clinical data management. In this paper, three database approaches--NoSQL, XML-enabled and native XML--are investigated to evaluate their suitability for structured clinical data. The database query performance is reported, together with our experience in the databases development. The results show that NoSQL database is the best choice for query speed, whereas XML databases are advantageous in terms of scalability, flexibility and extensibility, which are essential to cope with the characteristics of clinical data. While NoSQL and XML technologies are relatively new compared to the conventional relational database, both of them demonstrate potential to become a key database technology for clinical data management as the technology further advances. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
Plant Genome Resources at the National Center for Biotechnology Information
Wheeler, David L.; Smith-White, Brian; Chetvernin, Vyacheslav; Resenchuk, Sergei; Dombrowski, Susan M.; Pechous, Steven W.; Tatusova, Tatiana; Ostell, James
2005-01-01
The National Center for Biotechnology Information (NCBI) integrates data from more than 20 biological databases through a flexible search and retrieval system called Entrez. A core Entrez database, Entrez Nucleotide, includes GenBank and is tightly linked to the NCBI Taxonomy database, the Entrez Protein database, and the scientific literature in PubMed. A suite of more specialized databases for genomes, genes, gene families, gene expression, gene variation, and protein domains dovetails with the core databases to make Entrez a powerful system for genomic research. Linked to the full range of Entrez databases is the NCBI Map Viewer, which displays aligned genetic, physical, and sequence maps for eukaryotic genomes including those of many plants. A specialized plant query page allows maps from all plant genomes covered by the Map Viewer to be searched in tandem to produce a display of aligned maps from several species. PlantBLAST searches against the sequences shown in the Map Viewer allow BLAST alignments to be viewed within a genomic context. In addition, precomputed sequence similarities, such as those for proteins offered by BLAST Link, enable fluid navigation from unannotated to annotated sequences, quickening the pace of discovery. NCBI Web pages for plants, such as Plant Genome Central, complete the system by providing centralized access to NCBI's genomic resources as well as links to organism-specific Web pages beyond NCBI. PMID:16010002
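Entrez is also scriptable through NCBI's public E-utilities interface. The sketch below issues an ESearch call against the nucleotide database; the endpoint and parameters follow the public E-utilities documentation, and the query term is just an example.

```python
# Programmatic Entrez search via NCBI E-utilities (ESearch).
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
params = {
    "db": "nucleotide",
    "term": "Arabidopsis thaliana[Organism] AND rbcL[Gene]",  # example query
    "retmode": "json",
    "retmax": 5,
}
with urlopen(f"{BASE}?{urlencode(params)}") as resp:
    result = json.load(resp)["esearchresult"]

print(result["count"], result["idlist"])  # hit count and first few UIDs
```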
Development of a prototype commonality analysis tool for use in space programs
NASA Technical Reports Server (NTRS)
Yeager, Dorian P.
1988-01-01
A software tool to aid in performing commonality analyses, called the Commonality Analysis Problem Solver (CAPS), was designed, and a prototype version (CAPS 1.0) was implemented and tested. CAPS 1.0 runs in an MS-DOS or IBM PC-DOS environment. CAPS is designed around a simple input language which provides a natural syntax for the description of feasibility constraints. It provides its users with the ability to load a database representing a set of design items, describe the feasibility constraints on items in that database, and perform a comprehensive cost analysis to find the most economical substitution pattern.
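To make the idea of a substitution-pattern cost analysis concrete, here is a toy sketch under invented data: design items, feasibility constraints, and a fixed per-item-type logistics cost standing in for the savings of commonality. None of this reflects CAPS' actual input language or cost model.

```python
# Toy commonality analysis: pick the feasible substitution pattern with the
# lowest total cost, rewarding the reuse of a single item type.
from itertools import product

items = {"valve_A": 120.0, "valve_B": 100.0, "valve_C": 95.0}
needs = ["slot1", "slot2", "slot3"]
# Feasibility constraints: which item may serve which slot.
feasible = {"slot1": {"valve_A", "valve_B"},
            "slot2": {"valve_B", "valve_C"},
            "slot3": {"valve_B"}}

def total_cost(assignment):
    # Each *distinct* item type adds a fixed logistics cost, so commonality
    # (fewer distinct types) is rewarded.
    logistics = 30.0 * len(set(assignment.values()))
    return sum(items[i] for i in assignment.values()) + logistics

best = min(
    (dict(zip(needs, combo))
     for combo in product(items, repeat=len(needs))
     if all(i in feasible[s] for s, i in zip(needs, combo))),
    key=total_cost,
)
print(best, total_cost(best))
```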
Bjerring, Ole Steen; Fristrup, Claus; Mortensen, Michael Bau
2012-08-01
As seven out of every ten patients with upper gastrointestinal malignancies (UGIM) are not eligible for curative treatment, life after diagnosis is characterised by rapid deterioration and uncertainty. To accommodate these issues, we established a telephone hotline. Over a two-year period, all patients evaluated for UGIM were given the hotline phone number. The hotline was staffed by either a nurse or a secretary, and the specialist in charge of the patient would subsequently return the call. All calls were registered in a prospective database. The following data were recorded: diagnosis, time from call to return call, the problem, and its solution. A total of 477 patients were included, and 172 (36%) patients used the hotline a total of 254 times. Of the 254 calls, 210 (83%) were returned the same day. A total of 104 (41%) calls were elaborative questions, 89% of which were resolved over the phone. Dysphagia was the problem in 51 cases, giving rise to an endoscopy in 86% of these. Pain was the problem in 35 cases. Overall, 152 (60%) of the 254 problems were solved over the phone. Furthermore, 75 calls triggered a hospital visit and 27 calls led to the patient being referred for further examinations. The establishment of a telephone hotline was feasible and it was used by a substantial share of patients. Most callers only made one call. Nearly all calls (96%) were returned by the day after the initial call at the latest. The problem pattern did not differ between disease groups, apart from dysphagia in oesophageal cancer. We found that the hotline was an effective and inexpensive part of overall patient management.
NCBI2RDF: Enabling Full RDF-Based Access to NCBI Databases
Anguita, Alberto; García-Remesal, Miguel; de la Iglesia, Diana; Maojo, Victor
2013-01-01
RDF has become the standard technology for enabling interoperability among heterogeneous biomedical databases. The NCBI provides access to a large set of life sciences databases through a common interface called Entrez. However, the latter does not provide RDF-based access to such databases, and, therefore, they cannot be integrated with other RDF-compliant databases and accessed via SPARQL query interfaces. This paper presents the NCBI2RDF system, aimed at providing RDF-based access to the complete NCBI data repository. This API creates a virtual endpoint for servicing SPARQL queries over different NCBI repositories and presenting to users the query results in SPARQL results format, thus enabling this data to be integrated and/or stored with other RDF-compliant repositories. SPARQL queries are dynamically resolved, decomposed, and forwarded to the NCBI-provided E-utilities programmatic interface to access the NCBI data. Furthermore, we show how our approach increases the expressiveness of the native NCBI querying system, allowing several databases to be accessed simultaneously. This feature significantly boosts productivity when working with complex queries and saves time and effort to biomedical researchers. Our approach has been validated with a large number of SPARQL queries, thus proving its reliability and enhanced capabilities in biomedical environments. PMID:23984425
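For flavor, here is a sketch of a SPARQL query posed from Python against such an endpoint; the endpoint URL, prefixes, and predicate names are placeholders, since the system's actual vocabulary is not given here.

```python
# Hypothetical SPARQL query spanning two repositories through one endpoint.
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

ENDPOINT = "http://example.org/ncbi2rdf/sparql"   # placeholder URL
QUERY = """
PREFIX ex: <http://example.org/ncbi#>
SELECT ?gene ?pub WHERE {
  ?gene ex:symbol "BRCA1" .
  ?gene ex:citedIn ?pub .    # crosses database boundaries in one query
} LIMIT 10
"""

req = Request(f"{ENDPOINT}?{urlencode({'query': QUERY})}",
              headers={"Accept": "application/sparql-results+json"})
with urlopen(req) as resp:
    for b in json.load(resp)["results"]["bindings"]:
        print(b["gene"]["value"], b["pub"]["value"])
```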
Semi-automatic feedback using concurrence between mixture vectors for general databases
NASA Astrophysics Data System (ADS)
Larabi, Mohamed-Chaker; Richard, Noel; Colot, Olivier; Fernandez-Maloigne, Christine
2001-12-01
This paper describes how a query system can exploit basic knowledge by employing semi-automatic relevance feedback to refine queries and runtimes. For general databases, it is often useless to invoke complex attributes, because we do not have sufficient information about the images in the database. Moreover, these images can be topologically very different from one another, and an attribute that is powerful for one database category may be very weak for the others. The idea is to use very simple features, such as the color histogram, correlograms and Color Coherence Vectors (CCV), to fill out the signature vector. Then, a number of mixture vectors are prepared, depending on the number of clearly distinctive categories in the database; a mixture vector contains the weight of each attribute that will be used to compute a similarity distance. We submit a query to the database using each of the previously defined mixture vectors in turn. We then retain the N first images for each vector in order to perform a mapping using the following information: Is image I present in the results of several mixture vectors? What is its rank in the results? This information allows us to switch the system to either unsupervised relevance feedback or user (supervised) feedback.
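A minimal sketch of this mixture-vector scheme follows, with made-up signatures: each mixture weights the per-attribute distances, and the mapping step favors images that recur, with good ranks, across several mixtures' top-N results.

```python
# Illustrative mixture-vector retrieval and consensus ranking. Data invented.
import numpy as np

def weighted_distance(query_sig, image_sig, mixture):
    """Similarity distance: mixture-weighted sum of per-attribute distances."""
    per_attr = [np.linalg.norm(q - s) for q, s in zip(query_sig, image_sig)]
    return float(np.dot(mixture, per_attr))

# Signature = [histogram, correlogram, CCV], one small vector per attribute.
rng = np.random.default_rng(1)
db = [[rng.random(8), rng.random(8), rng.random(8)] for _ in range(100)]
query = [rng.random(8), rng.random(8), rng.random(8)]

mixtures = [np.array([0.7, 0.2, 0.1]),    # one mixture per database category
            np.array([0.2, 0.6, 0.2]),
            np.array([0.1, 0.2, 0.7])]

hits = {}
for m in mixtures:
    ranked = sorted(range(len(db)),
                    key=lambda i: weighted_distance(query, db[i], m))
    for rank, i in enumerate(ranked[:10]):       # keep the N=10 best
        hits.setdefault(i, []).append(rank)

# Images present in several mixtures' results, with good ranks, come first.
consensus = sorted(hits, key=lambda i: (-len(hits[i]), sum(hits[i])))
print(consensus[:5])
```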
Interactive searching of facial image databases
NASA Astrophysics Data System (ADS)
Nicholls, Robert A.; Shepherd, John W.; Shepherd, Jean
1995-09-01
A set of psychological facial descriptors has been devised to enable computerized searching of criminal photograph albums. The descriptors have been used to encode image databases of up to twelve thousand images. Using a system called FACES, the databases are searched by translating a witness' verbal description into the corresponding facial descriptors. Trials of FACES have shown that this coding scheme is more productive and efficient than searching traditional photograph albums. An alternative method of searching the encoded database, using a genetic algorithm, is currently being tested. The genetic search method does not require the witness to verbalize a description of the target, but merely to indicate a degree of similarity between the target and a limited selection of images from the database. The major drawback of FACES is that it requires manual encoding of images. Research is being undertaken to automate the process; however, this will require an algorithm that can predict human descriptive values. Alternatives to human-derived coding schemes exist, using statistical classifications of images. Since databases encoded using statistical classifiers do not have an obvious direct mapping to human-derived descriptors, a search method that does not require the entry of human descriptors is needed. A genetic search algorithm is being tested for this purpose.
Caputo, Sandrine; Benboudjema, Louisa; Sinilnikova, Olga; Rouleau, Etienne; Béroud, Christophe; Lidereau, Rosette
2012-01-01
BRCA1 and BRCA2 are the two main genes responsible for predisposition to breast and ovarian cancers, as a result of protein-inactivating monoallelic mutations. It remains to be established whether many of the variants identified in these two genes, so-called unclassified/unknown variants (UVs), contribute to the disease phenotype or are simply neutral variants (or polymorphisms). Given the clinical importance of establishing their status, a nationwide effort to annotate these UVs was launched by laboratories belonging to the French GGC consortium (Groupe Génétique et Cancer), leading to the creation of the UMD-BRCA1/BRCA2 databases (http://www.umd.be/BRCA1/ and http://www.umd.be/BRCA2/). These databases have been endorsed by the French National Cancer Institute (INCa) and are designed to collect all variants detected in France, whether causal, neutral or UV. They differ from other BRCA databases in that they contain co-occurrence data for all variants. Using these data, the GGC French consortium has been able to classify certain UVs also contained in other databases. In this article, we report some novel UVs not contained in the BIC database and explore their impact in cancer predisposition based on a structural approach.
Detection and Rectification of Distorted Fingerprints.
Si, Xuanbin; Feng, Jianjiang; Zhou, Jie; Luo, Yuxuan
2015-03-01
Elastic distortion of fingerprints is one of the major causes of false non-matches. While this problem affects all fingerprint recognition applications, it is especially dangerous in negative recognition applications, such as watchlist and deduplication applications, in which malicious users may purposely distort their fingerprints to evade identification. In this paper, we propose novel algorithms to detect and rectify skin distortion based on a single fingerprint image. Distortion detection is viewed as a two-class classification problem, for which the registered ridge orientation map and period map of a fingerprint are used as the feature vector and an SVM classifier is trained to perform the classification task. Distortion rectification (or, equivalently, distortion field estimation) is viewed as a regression problem, where the input is a distorted fingerprint and the output is the distortion field. To solve this problem, a database (called the reference database) of various distorted reference fingerprints and corresponding distortion fields is built in the offline stage; in the online stage, the nearest neighbor of the input fingerprint is found in the reference database and the corresponding distortion field is used to transform the input fingerprint into a normal one. Promising results have been obtained on three databases containing many distorted fingerprints, namely FVC2004 DB1, the Tsinghua Distorted Fingerprint database, and the NIST SD27 latent fingerprint database.
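The detection stage reduces to a standard two-class problem, sketched below with scikit-learn on random stand-in features; in the paper the feature vector comes from the registered orientation and period maps, not from synthetic data.

```python
# Schematic sketch of the detection stage: an SVM over fingerprint features.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(42)
n, d = 200, 64                 # samples and feature dimension (illustrative)
X_normal = rng.normal(0.0, 1.0, size=(n, d))
X_distorted = rng.normal(0.6, 1.2, size=(n, d))   # shifted to be separable
X = np.vstack([X_normal, X_distorted])
y = np.array([0] * n + [1] * n)                   # 0 = normal, 1 = distorted

clf = SVC(kernel="rbf").fit(X, y)
probe = rng.normal(0.6, 1.2, size=(1, d))
print("distorted" if clf.predict(probe)[0] else "normal")
```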
SORTEZ: a relational translator for NCBI's ASN.1 database.
Hart, K W; Searls, D B; Overton, G C
1994-07-01
The National Center for Biotechnology Information (NCBI) has created a database collection that includes several protein and nucleic acid sequence databases, a biosequence-specific subset of MEDLINE, as well as value-added information such as links between similar sequences. Information in the NCBI database is modeled in Abstract Syntax Notation 1 (ASN.1), an Open Systems Interconnection protocol designed for exchanging structured data between software applications rather than as a data model for database systems. While the NCBI database is distributed with an easy-to-use information retrieval system, ENTREZ, the ASN.1 data model currently lacks an ad hoc query language for general-purpose data access. For that reason, we have developed a software package, SORTEZ, that transforms the ASN.1 database (or other databases with nested data structures) to a relational data model and subsequently to a relational database management system (Sybase), where information can be accessed through the relational query language SQL. Because the need to transform data from one data model and schema to another arises naturally in several important contexts, including efficient execution of specific applications, access to multiple databases and adaptation to database evolution, this work also serves as a practical study of the issues involved in the various stages of database transformation. We show that transformation from the ASN.1 data model to a relational data model can be largely automated, but that schema transformation and data conversion require considerable domain expertise and would greatly benefit from additional support tools.
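The core of such a transformation is unnesting: each repeated substructure becomes a child table carrying a foreign key to its parent. Here is a toy sketch with an invented ASN.1-like record, SQLite standing in for Sybase.

```python
# Toy sketch of flattening a nested record into parent/child tables.
import sqlite3

record = {
    "accession": "U00001",
    "title": "example sequence entry",
    "references": [
        {"pmid": 1234567, "year": 1993},
        {"pmid": 2345678, "year": 1994},
    ],
}

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE entry(accession TEXT PRIMARY KEY, title TEXT);
CREATE TABLE ref(accession TEXT REFERENCES entry, pmid INTEGER, year INTEGER);
""")
con.execute("INSERT INTO entry VALUES (?, ?)",
            (record["accession"], record["title"]))
con.executemany("INSERT INTO ref VALUES (?, ?, ?)",
                [(record["accession"], r["pmid"], r["year"])
                 for r in record["references"]])

# The nested list is now reachable through an ordinary SQL join.
print(con.execute("""SELECT e.title, r.pmid FROM entry e
                     JOIN ref r USING (accession)""").fetchall())
```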
CycADS: an annotation database system to ease the development and update of BioCyc databases
Vellozo, Augusto F.; Véron, Amélie S.; Baa-Puyoulet, Patrice; Huerta-Cepas, Jaime; Cottret, Ludovic; Febvay, Gérard; Calevro, Federica; Rahbé, Yvan; Douglas, Angela E.; Gabaldón, Toni; Sagot, Marie-France; Charles, Hubert; Colella, Stefano
2011-01-01
In recent years, genomes from an increasing number of organisms have been sequenced, but their annotation remains a time-consuming process. The BioCyc databases offer a framework for the integrated analysis of metabolic networks. The Pathway Tools software suite allows the automated construction of a database starting from an annotated genome, but it requires prior integration of all annotations into a specific summary file or into a GenBank file. To allow the easy creation and update of a BioCyc database starting from the multiple genome annotation resources available over time, we have developed an ad hoc data management system that we call the Cyc Annotation Database System (CycADS). CycADS is centred on a specific database model and on a set of Java programs to import, filter and export relevant information. Data from GenBank and other annotation sources (including, for example, KAAS, PRIAM, Blast2GO and PhylomeDB) are collected into a database to be subsequently filtered and extracted to generate a complete annotation file. This file is then used to build an enriched BioCyc database using the PathoLogic program of Pathway Tools. The CycADS pipeline for annotation management was used to build the AcypiCyc database for the pea aphid (Acyrthosiphon pisum), whose genome was recently sequenced. The AcypiCyc database webpage also includes, for comparative analyses, two other metabolic reconstruction BioCyc databases generated using CycADS: TricaCyc for Tribolium castaneum and DromeCyc for Drosophila melanogaster. Owing to its flexible design, CycADS offers a powerful software tool for the generation and regular updating of enriched BioCyc databases. The CycADS system is particularly suited for metabolic gene annotation and network reconstruction in newly sequenced genomes. Because of the uniform annotation used for metabolic network reconstruction, CycADS is particularly useful for comparative analysis of the metabolism of different organisms. Database URL: http://www.cycadsys.org PMID:21474551
Using SQL Databases for Sequence Similarity Searching and Analysis.
Pearson, William R; Mackey, Aaron J
2017-09-13
Relational databases can integrate diverse types of information and manage large sets of similarity search results, greatly simplifying genome-scale analyses. By focusing on taxonomic subsets of sequences, relational databases can reduce the size and redundancy of sequence libraries and improve the statistical significance of homologs. In addition, by loading similarity search results into a relational database, it becomes possible to explore and summarize the relationships between all of the proteins in an organism and those in other biological kingdoms. This unit describes how to use relational databases to improve the efficiency of sequence similarity searching and demonstrates various large-scale genomic analyses of homology-related data. It also describes the installation and use of a simple protein sequence database, seqdb_demo, which is used as a basis for the other protocols. The unit also introduces search_demo, a database that stores sequence similarity search results. The search_demo database is then used to explore the evolutionary relationships between E. coli proteins and proteins in other organisms in a large-scale comparative genomic analysis. © 2017 by John Wiley & Sons, Inc.
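In the spirit of search_demo, the sketch below loads a few similarity hits into SQLite and summarizes them with a GROUP BY; the schema and data are invented for illustration, not the unit's actual tables.

```python
# Similarity search hits in a relational table, summarized with SQL.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE hits(
    query_acc TEXT, subject_acc TEXT, subject_taxon TEXT,
    evalue REAL, bit_score REAL)""")
con.executemany("INSERT INTO hits VALUES (?, ?, ?, ?, ?)", [
    ("ECOLI_P1", "HUMAN_Q1", "Eukaryota", 1e-30, 120.0),
    ("ECOLI_P1", "YEAST_Q2", "Eukaryota", 1e-12, 65.0),
    ("ECOLI_P2", "BACSU_Q3", "Bacteria",  1e-80, 300.0),
])

# How many E. coli queries have a significant homolog in each kingdom?
for row in con.execute("""
    SELECT subject_taxon, COUNT(DISTINCT query_acc)
    FROM hits WHERE evalue < 1e-10
    GROUP BY subject_taxon"""):
    print(row)
```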
System, method and apparatus for generating phrases from a database
NASA Technical Reports Server (NTRS)
McGreevy, Michael W. (Inventor)
2004-01-01
Phrase generation is a method of generating sequences of terms, such as phrases, that may occur within a database of subsets containing sequences of terms, such as text. A database is provided and a relational model of the database is created. A query is then input; the query comprises a term, a sequence of terms, multiple individual terms, multiple sequences of terms, or combinations thereof. Next, several sequences of terms that are contextually related to the query are assembled from contextual relations in the model of the database. The sequences of terms are then sorted and output. Phrase generation can also be an iterative process used to produce sequences of terms from a relational model of a database.
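A heavily simplified sketch of the idea: treat bigram adjacency as the contextual relation and extend a query term greedily. The real method builds a richer relational model, but the flavor is similar.

```python
# Toy phrase generation from bigram contextual relations. Texts invented.
from collections import Counter, defaultdict

texts = ["engine failure during climb", "engine failure on takeoff",
         "failure during climb to altitude"]

follows = defaultdict(Counter)
for t in texts:
    words = t.split()
    for a, b in zip(words, words[1:]):
        follows[a][b] += 1           # contextual relation: b follows a

def generate_phrase(term, length=4):
    phrase = [term]
    while len(phrase) < length and follows[phrase[-1]]:
        phrase.append(follows[phrase[-1]].most_common(1)[0][0])
    return " ".join(phrase)

print(generate_phrase("engine"))     # -> "engine failure during climb"
```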
GOClonto: an ontological clustering approach for conceptualizing PubMed abstracts.
Zheng, Hai-Tao; Borchert, Charles; Kim, Hong-Gee
2010-02-01
Concurrent with progress in the biomedical sciences, an overwhelming amount of textual knowledge is accumulating in the biomedical literature. PubMed is the most comprehensive database for collecting and managing biomedical literature. To help researchers easily understand collections of PubMed abstracts, numerous clustering methods have been proposed to group similar abstracts based on their shared features. However, most of these methods do not explore the semantic relationships among groupings of documents, which could help better illuminate the groupings of PubMed abstracts. To address this issue, we propose an ontological clustering method called GOClonto for conceptualizing PubMed abstracts. GOClonto uses latent semantic analysis (LSA) and the Gene Ontology (GO) to identify key gene-related concepts and their relationships, and to allocate PubMed abstracts based on these key gene-related concepts. On two PubMed abstract collections, the experimental results show that GOClonto is able to identify key gene-related concepts and outperforms the STC (suffix tree clustering) algorithm, the Lingo algorithm, the Fuzzy Ants algorithm, and the TRS (tolerance rough set) based clustering algorithm. Moreover, the two ontologies generated by GOClonto show significant informative conceptual structures.
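The LSA component can be sketched in a few lines with scikit-learn: TF-IDF vectors are reduced by truncated SVD and then clustered. GOClonto additionally grounds the latent concepts in the Gene Ontology, which this toy example omits; the abstracts and cluster count are stand-ins.

```python
# LSA-style clustering of toy abstracts: TF-IDF -> truncated SVD -> k-means.
from sklearn.cluster import KMeans
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer

abstracts = [
    "BRCA1 mutation increases breast cancer risk",
    "BRCA2 variants and ovarian cancer predisposition",
    "TP53 pathway regulates apoptosis in tumors",
    "p53 signalling and programmed cell death",
]

X = TfidfVectorizer().fit_transform(abstracts)
lsa = TruncatedSVD(n_components=2, random_state=0).fit_transform(X)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(lsa)
print(labels)  # abstracts sharing latent concepts fall in the same cluster
```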
Akerjordet, Kristin; Severinsson, Elisabeth
2010-05-01
To explore the state of the science of emotional intelligence (EI) related to nursing leadership, together with its critiques. The phenomenon of EI has emerged as a potential new construct of importance for nursing leadership, one that enhances educational, organizational, staff and patient outcomes. Nevertheless, important questions and critical reflections related to exaggerated claims, conceptualizations and measurements exist. A literature search was conducted using international databases covering the period January 1999 to December 2009. A manual search of relevant journals and significant references supplemented the data. Critical reflection centres on the unsubstantiated predictive validity of EI in the area of nursing leadership; in addition, important moral issues are called into question. It is important to possess in-depth knowledge of EI and its scientific critique when integrating the concept into nursing research, education and practical settings. More attention to the nature of emotion in EI is necessary. Implications for nursing leadership: the dynamics of EI should be explored in the context of both the surrounding environment and individual differences, as the latter can be adaptive in some settings but harmful in others.
The relative benefits of green versus lean office space: three field experiments.
Nieuwenhuis, Marlon; Knight, Craig; Postmes, Tom; Haslam, S Alexander
2014-09-01
Principles of lean office management increasingly call for space to be stripped of extraneous decorations so that it can flexibly accommodate changing numbers of people and different office functions within the same area. Yet this practice is at odds with evidence that office workers' quality of life can be enriched by office landscaping that involves the use of plants that have no formal work-related function. To examine the impact of these competing approaches, 3 field experiments were conducted in large commercial offices in The Netherlands and the U.K. These examined the impact of lean and "green" offices on subjective perceptions of air quality, concentration, and workplace satisfaction as well as objective measures of productivity. Two studies were longitudinal, examining effects of interventions over subsequent weeks and months. In all 3 experiments enhanced outcomes were observed when offices were enriched by plants. Implications for theory and practice are discussed. PsycINFO Database Record (c) 2014 APA, all rights reserved.
[Intoxication with paramethoxymethamphetamine].
Al-Samarraie, Muhammad S; Vevelstad, Merete; Nygaard, Ilah Le; Bachs, Liliana; Mørland, Jørg
2013-05-07
Since the summer of 2010, there has been an epidemic of deaths related to paramethoxymethamphetamine (PMMA) in Norway. We present a review of the pharmacology and toxicology of the substance. The review is based on a literature search in the databases PubMed, Ovid and MEDLINE. A discretionary selection was made of relevant articles. Paramethoxymethamphetamine and paramethoxyamphetamine (PMA) are two so-called designer amphetamines which appear from time to time on the illegal narcotics market in many countries. They are frequently sold as ecstasy or amphetamine, often mixed with amphetamine or methamphetamine. The substances, known on the street as «Death», have potent serotonergic effects and are associated with significant toxicity. Many deaths have been reported worldwide, even after intake of an «ordinary user dose». The narcotic effect is not very pronounced and its onset is slow, which may lead to unintentional overdosing. In cases of severe intoxication apparently related to intake of amphetamine or ecstasy, PMMA/PMA intoxication should be suspected.
Ramifications of increased training in quantitative methodology.
Zimiles, Herbert
2009-01-01
Comments on the article "Doctoral training in statistics, measurement, and methodology in psychology: Replication and extension of Aiken, West, Sechrest, and Reno's (1990) survey of PhD programs in North America" by Aiken, West, and Millsap. The current author asks three questions that are provoked by the comprehensive identification of gaps and deficiencies in the training of quantitative methodology that led Aiken, West, and Millsap to call for expanded graduate instruction resources and programs. This comment calls for greater attention to how advances and expansion in the training of quantitative analysis are influencing who chooses to study psychology and how and what will be studied. PsycINFO Database Record 2009 APA.
Hoyer, Chad E; Ghosh, Soumen; Truhlar, Donald G; Gagliardi, Laura
2016-02-04
A correct description of electronically excited states is critical to the interpretation of visible-ultraviolet spectra, photochemical reactions, and excited-state charge-transfer processes in chemical systems. We have recently proposed a theory called multiconfiguration pair-density functional theory (MC-PDFT), which is based on a combination of multiconfiguration wave function theory and a new kind of density functional called an on-top density functional. Here, we show that MC-PDFT with a first-generation on-top density functional performs as well as CASPT2 for an organic chemistry database including valence, Rydberg, and charge-transfer excitations. The results are very encouraging for practical applications.
Xu, Huilei; Baroukh, Caroline; Dannenfelser, Ruth; Chen, Edward Y; Tan, Christopher M; Kou, Yan; Kim, Yujin E; Lemischka, Ihor R; Ma'ayan, Avi
2013-01-01
High-content studies that profile mouse and human embryonic stem cells (m/hESCs) using various genome-wide technologies such as transcriptomics and proteomics are constantly being published. However, efforts to integrate such data to obtain a global view of the molecular circuitry in m/hESCs are lagging behind. Here, we present an m/hESC-centered database called the Embryonic Stem Cell Atlas from Pluripotency Evidence (ESCAPE), integrating data from many recent diverse high-throughput studies including chromatin immunoprecipitation followed by deep sequencing, genome-wide inhibitory RNA screens, gene expression microarrays or RNA-seq after knockdown (KD) or overexpression of critical factors, immunoprecipitation followed by mass spectrometry proteomics and phosphoproteomics. The database provides web-based interactive search and visualization tools that can be used to build subnetworks and to identify known and novel regulatory interactions across various regulatory layers. The web interface also includes tools to predict the effects of combinatorial KDs by additive effects controlled by sliders, or through simulation software implemented in MATLAB. Overall, the Embryonic Stem Cell Atlas from Pluripotency Evidence database is a comprehensive resource for the stem cell systems biology community. Database URL: http://www.maayanlab.net/ESCAPE
Advanced SPARQL querying in small molecule databases.
Galgonek, Jakub; Hurt, Tomáš; Michlíková, Vendula; Onderka, Petr; Schwarz, Jan; Vondrášek, Jiří
2016-01-01
In recent years, the Resource Description Framework (RDF) and the SPARQL query language have become more widely used in the area of cheminformatics and bioinformatics databases. These technologies allow better interoperability of various data sources and powerful searching facilities. However, we identified several deficiencies that make usage of such RDF databases restrictive or challenging for common users. We extended a SPARQL engine to be able to use special procedures inside SPARQL queries. This allows the user to work with data that cannot be simply precomputed and thus cannot be directly stored in the database. We designed an algorithm that checks a query against data ontology to identify possible user errors. This greatly improves query debugging. We also introduced an approach to visualize retrieved data in a user-friendly way, based on templates describing visualizations of resource classes. To integrate all of our approaches, we developed a simple web application. Our system was implemented successfully, and we demonstrated its usability on the ChEBI database transformed into RDF form. To demonstrate procedure call functions, we employed compound similarity searching based on OrChem. The application is publicly available at https://bioinfo.uochb.cas.cz/projects/chemRDF.
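To make the procedure-call idea concrete, here is a minimal sketch of how such a query might be issued from Python with the SPARQLWrapper library. The endpoint path, the procedure IRI and the result properties are illustrative assumptions, not the vocabulary actually used by the chemRDF application.

    from SPARQLWrapper import SPARQLWrapper, JSON

    # Endpoint path, procedure IRI and result properties are assumptions.
    sparql = SPARQLWrapper("https://bioinfo.uochb.cas.cz/projects/chemRDF/sparql")
    sparql.setQuery("""
        PREFIX ex: <http://example.org/procedures#>
        SELECT ?compound ?score WHERE {
          # Hypothetical procedure call: similarity scores cannot be
          # precomputed, so the extended engine evaluates them at query time.
          ?hit ex:similaritySearch ("CC(=O)Oc1ccccc1C(=O)O" 0.8) .
          ?hit ex:compound ?compound ;
               ex:score ?score .
        }
    """)
    sparql.setReturnFormat(JSON)
    for row in sparql.query().convert()["results"]["bindings"]:
        print(row["compound"]["value"], row["score"]["value"])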
Analyzing a multimodal biometric system using real and virtual users
NASA Astrophysics Data System (ADS)
Scheidat, Tobias; Vielhauer, Claus
2007-02-01
Three main topics of recent research on multimodal biometric systems are addressed in this article: the lack of sufficiently large multimodal test data sets, the influence of cultural aspects, and data protection issues of multimodal biometric data. In this contribution, different possibilities are presented for extending multimodal databases by generating so-called virtual users, which are created by combining single biometric modality data of different users. Comparative tests on databases containing real and virtual users, based on a multimodal system using handwriting and speech, are presented to study to what degree the use of virtual multimodal databases allows conclusions about recognition accuracy in comparison to real multimodal data. All tests have been carried out on databases created from donations from three different nationality groups. This makes it possible to review the experimental results both in general and in the context of cultural origin. The results show that in most cases the use of virtual persons leads to lower accuracy than the use of real users in terms of the measure applied, the Equal Error Rate. Finally, this article addresses the general question of how the concept of virtual users may influence the data protection requirements for multimodal evaluation databases in the future.
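As an illustration of the virtual-user idea (not the authors' code), the following sketch pairs the handwriting data of one real user with the speech data of another, so that N real bimodal users yield up to N*(N-1) additional virtual identities.

    from itertools import permutations

    def make_virtual_users(handwriting, speech):
        """handwriting, speech: dicts mapping user id -> list of samples."""
        virtual = {}
        # Each ordered pair of distinct users becomes one virtual identity.
        for hw_user, sp_user in permutations(handwriting, 2):
            if sp_user in speech:
                virtual[(hw_user, sp_user)] = {
                    "handwriting": handwriting[hw_user],
                    "speech": speech[sp_user],
                }
        return virtual

    real_hw = {"A": ["hw_A1"], "B": ["hw_B1"], "C": ["hw_C1"]}
    real_sp = {"A": ["sp_A1"], "B": ["sp_B1"], "C": ["sp_C1"]}
    print(len(make_virtual_users(real_hw, real_sp)))  # 6 virtual users from 3 real ones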
Han, Guanghui; Liu, Xiabi; Han, Feifei; Santika, I Nyoman Tenaya; Zhao, Yanfeng; Zhao, Xinming; Zhou, Chunwu
2015-02-01
Lung computed tomography (CT) imaging signs play important roles in the diagnosis of lung diseases. In this paper, we review the significance of CT imaging signs in disease diagnosis and determine the inclusion criteria for the CT scans and CT imaging signs of our database. We develop software for annotating abnormal regions and design the storage scheme for CT images and annotation data. Then, we present a publicly available database of lung CT imaging signs, called LISS for short, which contains 271 CT scans and the 677 abnormal regions within them. The 677 abnormal regions are divided into nine categories of common CT imaging signs of lung disease (CISLs). The ground truth of these CISL regions and the corresponding categories is provided. Furthermore, to make the database publicly available, all private data in the CT scans are removed or replaced with substitute values. The main characteristic of our LISS database is that it is developed from the new perspective of CT imaging signs of lung diseases instead of the commonly considered lung nodules. Thus, it is a promising resource for computer-aided detection and diagnosis research and for medical education.
Khan, Aihab; Husain, Syed Afaq
2013-01-01
We put forward a fragile zero-watermarking scheme to detect and characterize malicious modifications made to a database relation. Most existing watermarking schemes for relational databases introduce intentional errors or permanent distortions as marks into the original database content. These distortions inevitably degrade data quality and usability, as the integrity of the relational database is violated. Moreover, such fragile schemes can detect malicious data modifications but do not characterize the tampering attack, that is, the nature of the tampering. The proposed fragile scheme is based on a zero-watermarking approach to detect malicious modifications made to a database relation. In zero watermarking, the watermark is generated (constructed) from the contents of the original data rather than by introducing permanent distortions as marks into the data. As a result, the proposed scheme is distortion-free; thus, it also resolves the inherent conflict between security and imperceptibility. The proposed scheme also characterizes the malicious data modifications to quantify the nature of the tampering attacks. Experimental results show that even minor malicious modifications made to a database relation can be detected and characterized successfully.
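A minimal sketch of the zero-watermarking idea follows, under a deliberately simplified scheme of our own (not the paper's algorithm): per-attribute digests computed from the relation's content serve as the watermark, registered at protection time and recomputed at verification time; a mismatching digest localizes the tampered attribute while no distortion is ever added to the data.

    import hashlib

    def generate_watermark(rows, key):
        """rows: list of tuples; returns one digest per attribute (column)."""
        columns = zip(*rows)  # transpose rows into columns
        return [hashlib.sha256((key + "".join(map(str, col))).encode()).hexdigest()
                for col in columns]

    original = [(1, "aspirin", 100), (2, "ibuprofen", 200)]
    registered = generate_watermark(original, key="secret")   # stored with a trusted party

    tampered = [(1, "aspirin", 100), (2, "ibuprofen", 999)]   # third attribute modified
    current = generate_watermark(tampered, key="secret")

    for i, (reg, cur) in enumerate(zip(registered, current)):
        if reg != cur:
            print(f"attribute {i} fails verification -> possible tampering")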
Evaluation of relational and NoSQL database architectures to manage genomic annotations.
Schulz, Wade L; Nelson, Brent G; Felker, Donn K; Durant, Thomas J S; Torres, Richard
2016-12-01
While the adoption of next generation sequencing has rapidly expanded, the informatics infrastructure used to manage the data generated by this technology has not kept pace. Historically, relational databases have provided much of the framework for data storage and retrieval. Newer technologies based on NoSQL architectures may provide significant advantages in storage and query efficiency, thereby reducing the cost of data management. But their relative advantage when applied to biomedical data sets, such as genetic data, has not been characterized. To this end, we compared the storage, indexing, and query efficiency of a common relational database (MySQL), a document-oriented NoSQL database (MongoDB), and a relational database with NoSQL support (PostgreSQL). When used to store genomic annotations from the dbSNP database, we found the NoSQL architectures to outperform traditional, relational models for speed of data storage, indexing, and query retrieval in nearly every operation. These findings strongly support the use of novel database technologies to improve the efficiency of data management within the biological sciences.
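As a hedged illustration of the document-oriented approach compared in the study, the sketch below stores a dbSNP-style annotation as a single MongoDB document via pymongo; the field names are invented for illustration, not the study's schema.

    from pymongo import MongoClient, ASCENDING

    client = MongoClient("localhost", 27017)
    variants = client["genomics"]["variants"]

    # One self-contained document replaces the joins a normalized
    # relational schema would need to answer the same question.
    variants.insert_one({
        "rsid": "rs12345",
        "chrom": "1",
        "pos": 1014143,
        "alleles": ["C", "T"],
        "annotations": {"gene": "ISG15", "consequence": "missense_variant"},
    })
    variants.create_index([("chrom", ASCENDING), ("pos", ASCENDING)])

    hit = variants.find_one({"chrom": "1", "pos": {"$gte": 1000000, "$lte": 1100000}})
    print(hit["rsid"])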
Building an R&D chemical registration system.
Martin, Elyette; Monge, Aurélien; Duret, Jacques-Antoine; Gualandi, Federico; Peitsch, Manuel C; Pospisil, Pavel
2012-05-31
Small molecule chemistry is of central importance to a number of R&D companies in diverse areas such as the pharmaceutical, nutraceutical, food flavoring, and cosmeceutical industries. In order to store and manage thousands of chemical compounds in such an environment, we have built a state-of-the-art master chemical database with unique structure identifiers. Here, we present the concept and methodology we used to build the system that we call the Unique Compound Database (UCD). In the UCD, each molecule is registered only once (uniqueness), structures with alternative representations are entered in a uniform way (normalization), and the chemical structure drawings are recognizable to chemists and to a cartridge. In brief, structural molecules are entered as neutral entities which can be associated with a salt. The salts are listed in a dictionary and bound to the molecule with the appropriate stoichiometric coefficient in an entity called "substance". The substances are associated with batches. Once a molecule is registered, some properties (e.g., ADMET prediction, IUPAC name, chemical properties) are calculated automatically. The UCD has both automated and manual data controls. Moreover, the UCD concept enables the management of user errors in the structure entry by reassigning or archiving the batches. It also allows updating of the records to include newly discovered properties of individual structures. As our research spans a wide variety of scientific fields, the database enables registration of mixtures of compounds, enantiomers, tautomers, and compounds with unknown stereochemistries.
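The registration model can be pictured with a few illustrative data structures (the names are assumptions, not the UCD schema): a neutral molecule registered once, a salt drawn from the dictionary, and a substance binding the two with a stoichiometric coefficient, with batches attached to the substance.

    from dataclasses import dataclass, field
    from typing import List, Optional

    @dataclass(frozen=True)
    class Molecule:                 # registered only once (uniqueness)
        ucd_id: str
        structure: str              # normalized representation, e.g. canonical SMILES

    @dataclass(frozen=True)
    class Salt:                     # drawn from the salt dictionary
        name: str

    @dataclass
    class Substance:                # molecule bound to a salt with stoichiometry
        molecule: Molecule
        salt: Optional[Salt] = None
        salt_coefficient: float = 0.0
        batches: List[str] = field(default_factory=list)

    caffeine = Molecule("UCD-000001", "CN1C=NC2=C1C(=O)N(C)C(=O)N2C")
    caffeine_citrate = Substance(caffeine, Salt("citrate"), salt_coefficient=1.0)
    caffeine_citrate.batches.append("BATCH-2012-0001")
    print(caffeine_citrate.molecule.ucd_id, caffeine_citrate.salt.name)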
A database for TMT interface control documents
NASA Astrophysics Data System (ADS)
Gillies, Kim; Roberts, Scott; Brighton, Allan; Rogers, John
2016-08-01
The TMT Software System consists of software components that interact with one another through a software infrastructure called TMT Common Software (CSW). CSW consists of software services and library code that is used by developers to create the subsystems and components that participate in the software system. CSW also defines the types of components that can be constructed and their roles. The use of common component types and shared middleware services allows standardized software interfaces for the components. A software system called the TMT Interface Database System was constructed to support the documentation of the interfaces for components based on CSW. The programmer describes a subsystem and each of its components using JSON-style text files. A command interface file describes each command a component can receive and any commands a component sends. The event interface files describe status, alarms, and events a component publishes and status and events subscribed to by a component. A web application was created to provide a user interface for the required features. Files are ingested into the software system's database. The user interface allows browsing subsystem interfaces, publishing versions of subsystem interfaces, and constructing and publishing interface control documents that consist of the intersection of two subsystem interfaces. All published subsystem interfaces and interface control documents are versioned for configuration control and follow the standard TMT change control processes. Subsystem interfaces and interface control documents can be visualized in the browser or exported as PDF files.
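The "intersection of two subsystem interfaces" can be pictured with a toy sketch: the commands one subsystem sends that the other receives, plus the events one publishes that the other subscribes to. The dictionary layout below mimics the JSON-style files described above but is an assumption, not the actual TMT schema.

    def icd_intersection(subsystem_a, subsystem_b):
        """One direction of an ICD: what A sends/subscribes against what B offers."""
        return {
            "commands": sorted(set(subsystem_a["commands"]["sends"])
                               & set(subsystem_b["commands"]["receives"])),
            "events": sorted(set(subsystem_b["events"]["publishes"])
                             & set(subsystem_a["events"]["subscribes"])),
        }

    tcs = {"commands": {"sends": ["m1cs.follow"], "receives": []},
           "events": {"publishes": [], "subscribes": ["m1cs.status"]}}
    m1cs = {"commands": {"sends": [], "receives": ["m1cs.follow", "m1cs.halt"]},
            "events": {"publishes": ["m1cs.status", "m1cs.alarm"], "subscribes": []}}

    print(icd_intersection(tcs, m1cs))
    # {'commands': ['m1cs.follow'], 'events': ['m1cs.status']}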
ERIC Educational Resources Information Center
Leonard, Scott A., Comp.; Dobert, Raymond, Comp.
This bibliography on the commercialization and economic aspects of biotechnology was produced by the National Agricultural Library. It contains 151 citations in English from the AGRICOLA database. The search strategy is included, call numbers are given for each entry, and abstracts are provided for some citations. The bibliography concludes with…
ERIC Educational Resources Information Center
Wiley, Emily A.; Stover, Nicholas A.
2014-01-01
Use of inquiry-based research modules in the classroom has soared over recent years, largely in response to national calls for teaching that provides experience with scientific processes and methodologies. To increase the visibility of in-class studies among interested researchers and to strengthen their impact on student learning, we have…
ERIC Educational Resources Information Center
Fountain, Kathleen Carlisle
2013-01-01
Library instruction methods most frequently focus on teaching students searching skills to navigate the maze of library databases to locate appropriate research materials. The current theory of critical information literacy instruction calls on librarians to spend more of their time in the classroom focused on understanding the social, political,…
Computers Track the Elusive Metaphor
ERIC Educational Resources Information Center
Guernsey, Lisa
2009-01-01
Computers may not be able to master poetics like Aristotle, but they have become smart enough to know a metaphor when they see one. An online database called The Mind Is a Metaphor, created by Brad Pasanek, an assistant professor of English at the University of Virginia, is a searchable bank of phrases, verses, and lines from literature that…
Flip-J: Development of the System for Flipped Jigsaw Supported Language Learning
ERIC Educational Resources Information Center
Yamada, Masanori; Goda, Yoshiko; Hata, Kojiro; Matsukawa, Hideya; Yasunami, Seisuke
2016-01-01
This study aims to develop and evaluate a language learning system supported by the "flipped jigsaw" technique, called "Flip-J". This system mainly consists of three functions: (1) the creation of a learning material database, (2) allocation of learning materials, and (3) formation of an expert and jigsaw group. Flip-J was…
A Framework for Transparently Accessing Deep Web Sources
ERIC Educational Resources Information Center
Dragut, Eduard Constantin
2010-01-01
An increasing number of Web sites expose their content via query interfaces, many of them offering the same type of products/services (e.g., flight tickets, car rental/purchasing). They constitute the so-called "Deep Web". Accessing the content on the Deep Web has been a long-standing challenge for the database community. For a user interested in…
ERIC Educational Resources Information Center
Weiskel, Timothy C.
1991-01-01
An online system designed to help global environmental research, the electronic research system called Eco-Link draws data from various electronic sources including online catalogs and databases, CD-ROMs, electronic news sources, and electronic data subscription services to produce briefing booklets on environmental issues. It can be accessed by…
Acquiring geographical data with web harvesting
NASA Astrophysics Data System (ADS)
Dramowicz, K.
2016-04-01
Many websites contain attractive and up-to-date geographical information. This information can be extracted, stored, analyzed and mapped using web harvesting techniques. Web harvesting transforms poorly organized data from websites into a more structured format, which can be stored in a database and analyzed. Almost 25% of web traffic is related to web harvesting, mostly from search engines. This paper presents how to harvest geographic information from web documents using the free tool Beautiful Soup, one of the most commonly used Python libraries for pulling data from HTML and XML files. Processing one static HTML table is a relatively easy task; the more challenging task is to extract and save information from tables located in multiple and poorly organized websites. Legal and ethical aspects of web harvesting are discussed as well. The paper demonstrates two case studies. The first shows how to extract various types of information about the Good Country Index from multiple web pages, load it into one attribute table and map the results. The second shows how script tools and GIS can be used to extract information from one hundred thirty-six websites about Nova Scotia wines. In a little more than three minutes, a database containing one hundred and six liquor stores selling these wines is created. Then the availability and spatial distribution of various types of wines (by grape type, by winery, and by liquor store) are mapped and analyzed.
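A minimal example of the easy case described above, harvesting one static HTML table with Beautiful Soup, might look as follows (the URL is a placeholder; requests and bs4 must be installed):

    import requests
    from bs4 import BeautifulSoup

    html = requests.get("https://example.org/good-country-index.html").text  # placeholder URL
    soup = BeautifulSoup(html, "html.parser")

    table = soup.find("table")                   # first table on the page
    records = []
    for tr in table.find_all("tr")[1:]:          # skip the header row
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if cells:
            records.append(cells)

    print(records[:3])  # first harvested rows, ready to load into a database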
Services for Emodnet-Chemistry Data Products
NASA Astrophysics Data System (ADS)
Santinelli, Giorgio; Hendriksen, Gerrit; Barth, Alexander
2016-04-01
In the framework of the Emodnet Chemistry lot, data products from regional leaders were made available in order to transform the information into a database. This has been done using functions and scripts that read so-called enriched ODV files and insert the data directly into a cloud relational geodatabase. The main table is the observations table, which contains the data and metadata associated with the enriched ODV files. A particular data-loading implementation is used in order to improve on-the-fly computational speed. Data from the Baltic Sea, North Sea, Mediterranean, Black Sea and part of the Atlantic region have been entered into the geodatabase and are consequently instantly available from the OceanBrowser Emodnet portal. Furthermore, Deltares has developed an application that provides additional visualisation services for the aggregated and validated data collections. The visualisations are produced by making use of part of the OpenEarth tool stack (http://www.openearth.eu), by the integration of Web Feature Services and by the implementation of Web Processing Services. The goal is the generation of server-side plots of time series, profiles, time profiles and maps of selected parameters from data sets of selected stations. Regional data collections are retrieved using the Emodnet Chemistry cloud relational geodatabase. The temporal resolution and the intensity of data availability for selected parameters are shown using Web Service requests via the OceanBrowser Emodnet Web portal. OceanBrowser also shows station reference codes, which are used to establish a link to additional metadata, further data shopping and download.
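A hedged sketch of the loading step might look as follows: one observation parsed from an enriched-ODV-style record is inserted into a PostGIS-enabled geodatabase with psycopg2. The table and column names are assumptions; the abstract does not publish the actual Emodnet schema.

    import psycopg2

    record = {"station": "BS-042", "lon": 28.65, "lat": 43.17,
              "time": "2014-07-01T10:00:00Z", "parameter": "nitrate",
              "value": 0.82, "unit": "umol/l"}

    conn = psycopg2.connect("dbname=emodnet user=loader")
    with conn, conn.cursor() as cur:             # commits on success
        cur.execute(
            """INSERT INTO observations
               (station, geom, obs_time, parameter, value, unit)
               VALUES (%s, ST_SetSRID(ST_MakePoint(%s, %s), 4326), %s, %s, %s, %s)""",
            (record["station"], record["lon"], record["lat"],
             record["time"], record["parameter"], record["value"], record["unit"]))
    conn.close()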
Nonintrusive multibiometrics on a mobile device: a comparison of fusion techniques
NASA Astrophysics Data System (ADS)
Allano, Lorene; Morris, Andrew C.; Sellahewa, Harin; Garcia-Salicetti, Sonia; Koreman, Jacques; Jassim, Sabah; Ly-Van, Bao; Wu, Dalei; Dorizzi, Bernadette
2006-04-01
In this article we test a number of score fusion methods for the purpose of multimodal biometric authentication. These tests were made for the SecurePhone project, whose aim is to develop a prototype mobile communication system enabling biometrically authenticated users to conclude legally binding m-contracts during a mobile phone call on a PDA. The three biometrics of voice, face and signature were selected because they are all traditional, non-intrusive and easy-to-use means of authentication which can readily be captured on a PDA. By combining multiple biometrics of relatively low security, it may be possible to obtain a combined level of security which is at least as high as that provided by a PIN or handwritten signature, traditionally used for user authentication. As the relative success of different fusion methods depends on the database used and the tests made, the database we used was recorded on a suitable PDA (the Qtek2020) and the test protocol was designed to reflect the intended application scenario, which is expected to use short text prompts. Not all of the fusion methods tested are original; they were selected for their suitability for implementation within the constraints imposed by the application. All of the methods tested are based on fusion of the match scores output by each modality. Though computationally simple, the methods tested have shown very promising results: all four fusion methods obtained a significant performance increase.
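Score-level fusion of this kind can be illustrated with a small sketch: each modality's match score is min-max normalized and the results combined with a weighted sum. The weights, score ranges and threshold below are arbitrary placeholders, not the SecurePhone values.

    def normalize(score, lo, hi):
        """Min-max normalization onto [0, 1]."""
        return (score - lo) / (hi - lo)

    def fuse(scores, bounds, weights):
        """Weighted-sum fusion of normalized per-modality match scores."""
        return sum(w * normalize(s, *b)
                   for s, b, w in zip(scores, bounds, weights))

    scores = [0.72, 58.0, 0.31]                  # voice, face, signature matcher outputs
    bounds = [(0, 1), (0, 100), (0, 1)]          # per-modality score ranges
    weights = [0.4, 0.35, 0.25]                  # placeholder weights

    accept = fuse(scores, bounds, weights) > 0.5 # arbitrary decision threshold
    print(accept)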
Clauson, Kevin A; Polen, Hyla H; Marsh, Wallace A
2007-12-01
To evaluate personal digital assistant (PDA) drug information databases used to support clinical decision-making, and to compare the performance of PDA databases with their online versions. Prospective evaluation with descriptive analysis. Five drug information databases available for PDAs and online were evaluated according to their scope (inclusion of correct answers), completeness (on a 3-point scale), and ease of use; 158 question-answer pairs across 15 weighted categories of drug information essential to health care professionals were used to evaluate these databases. An overall composite score integrating these three measures was then calculated. Scores for the PDA databases and for each PDA-online pair were compared. Among the PDA databases, composite rankings, from highest to lowest, were as follows: Lexi-Drugs, Clinical Pharmacology OnHand, Epocrates Rx Pro, mobileMicromedex (now called Thomson Clinical Xpert), and Epocrates Rx free version. When we compared database pairs, online databases that had greater scope than their PDA counterparts were Clinical Pharmacology (137 vs 100 answers, p<0.001), Micromedex (132 vs 96 answers, p<0.001), Lexi-Comp Online (131 vs 119 answers, p<0.001), and Epocrates Online Premium (103 vs 98 answers, p=0.001). Only Micromedex online was more complete than its PDA version (p=0.008). Regarding ease of use, the Lexi-Drugs PDA database was superior to Lexi-Comp Online (p<0.001); however, Epocrates Online Premium, Epocrates Online Free, and Micromedex online were easier to use than their PDA counterparts (p<0.001). In terms of composite scores, only the online versions of Clinical Pharmacology and Micromedex demonstrated superiority over their PDA versions (p<0.01). Online and PDA drug information databases assist practitioners in improving their clinical decision-making. Lexi-Drugs performed significantly better than all of the other PDA databases evaluated. No PDA database demonstrated superiority to its online counterpart; however, the online versions of Clinical Pharmacology and Micromedex were superior to their PDA versions in answering questions.
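The composite of the three measures might be computed along the following lines; the weights and aggregation are illustrative assumptions, since the abstract does not give the study's exact formula.

    def composite(scope_fraction, completeness_mean, ease, weights=(0.5, 0.3, 0.2)):
        """scope_fraction: share of the 158 questions answered;
        completeness_mean: mean rating on the 3-point scale;
        ease: ease-of-use rating scaled to 0-1."""
        w_scope, w_comp, w_ease = weights
        return (w_scope * scope_fraction
                + w_comp * completeness_mean / 3    # rescale 3-point ratings to 0-1
                + w_ease * ease)

    print(round(composite(119 / 158, 2.4, 0.8), 3))  # invented inputs for illustration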
Hewitt, Robin; Gobbi, Alberto; Lee, Man-Ling
2005-01-01
Relational databases are the current standard for storing and retrieving data in the pharmaceutical and biotech industries. However, retrieving data from a relational database requires specialized knowledge of the database schema and of the SQL query language. At Anadys, we have developed an easy-to-use system for searching and reporting data in a relational database to support our drug discovery project teams. This system is fast and flexible and allows users to access all data without having to write SQL queries. This paper presents the hierarchical, graph-based metadata representation and SQL-construction methods that, together, are the basis of this system's capabilities.
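A toy version of the idea conveys the flavor: a graph of permitted table-to-table joins serves as metadata, and a path through that graph is compiled into a SQL statement, so end users never write SQL themselves. The schema below is invented for illustration, not Anadys's metadata representation.

    # Edge set of the join graph: which tables connect, and on what condition.
    JOIN_GRAPH = {
        ("compound", "assay_result"): "compound.id = assay_result.compound_id",
        ("assay_result", "assay"): "assay_result.assay_id = assay.id",
    }

    def build_sql(path, columns, where=""):
        """Compile a path through the join graph into a SELECT statement."""
        sql = f"SELECT {', '.join(columns)} FROM {path[0]}"
        for a, b in zip(path, path[1:]):
            sql += f" JOIN {b} ON {JOIN_GRAPH[(a, b)]}"
        return sql + (f" WHERE {where}" if where else "")

    print(build_sql(["compound", "assay_result", "assay"],
                    ["compound.name", "assay_result.ic50"],
                    where="assay.target = 'HCV'"))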
The effect of work shift configurations on emergency medical dispatch center response.
Montassier, Emmanuel; Labady, Julien; Andre, Antoine; Potel, Gilles; Berthier, Frederic; Jenvrin, Joel; Penverne, Yann
2015-01-01
It has been shown that emergency medical dispatch centers (EMDC) save lives by promoting an appropriate allocation of emergency medical service resources. Indeed, optimal dispatcher call duration is pivotal to reducing the time gap between the placement of a call and the delivery of medical care. However, little is known about the impact of work shift configurations (i.e., work shift duration and work shift rotation throughout the day) on dispatcher call duration. Thus, the objective of our study was to assess the effect of work shift configurations on dispatcher call duration. During a 1-year study period, we analyzed the dispatcher call durations for medical and trauma calls during the four different work shift rotations (day, morning, evening, and night) and during the 10-hour work shift of each dispatcher in the EMDC of Nantes. We extracted dispatcher call durations from our advanced telephone system, configured with CC Pulse+ (Genesys, Alcatel-Lucent), and collected them in a custom-designed database (Excel, Microsoft). We then analyzed these data using linear mixed-effects models. During the study period, our EMDC received 408,077 calls. Globally, the mean dispatcher call duration was 107 ± 45 seconds. Based on multivariate linear mixed-effects models, dispatcher call duration was affected by the night work shift and by work shift durations greater than 8 hours, which increased it by about 10 ± 1 seconds and 4 ± 1 seconds, respectively (both p < 0.001). Our study showed a statistically significant difference in dispatcher call duration across work shift rotations and durations, with longer call durations seen during night shifts and shifts over 8 hours. While these differences are small and may not be clinically significant, they may have implications for EMDC efficiency.
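For readers who want to reproduce this kind of analysis, a hedged sketch with statsmodels follows: a linear mixed-effects model of call duration with the dispatcher as the grouping (random-effect) factor. The column names are assumptions about the study's database export.

    import pandas as pd
    import statsmodels.formula.api as smf

    calls = pd.read_csv("emdc_calls.csv")         # placeholder file name

    # duration_s: dispatcher call duration in seconds
    # shift_rotation: day / morning / evening / night
    # long_shift: 1 if the work shift exceeds 8 hours
    model = smf.mixedlm(
        "duration_s ~ C(shift_rotation, Treatment('day')) + long_shift",
        data=calls,
        groups=calls["dispatcher_id"],            # random intercept per dispatcher
    )
    print(model.fit().summary())                  # e.g. the night-shift coefficient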
Centre-based restricted nearest feature plane with angle classifier for face recognition
NASA Astrophysics Data System (ADS)
Tang, Linlin; Lu, Huifen; Zhao, Liang; Li, Zuohua
2017-10-01
An improved classifier based on the nearest feature plane (NFP), called the centre-based restricted nearest feature plane with angle (RNFPA) classifier, is proposed here for face recognition problems. The well-known NFP classifier uses the geometrical information of samples to increase the effective number of training samples, but it increases the computational complexity and suffers from an inaccuracy problem caused by the extended feature plane. To solve these problems, RNFPA exploits a centre-based feature plane and utilizes an angle threshold to restrict the extended feature space. By choosing an appropriate angle threshold, RNFPA can improve performance and decrease computational complexity. Experiments on the AT&T face database, the AR face database and the FERET face database are used to evaluate the proposed classifier. Compared with the original NFP classifier, the nearest feature line (NFL) classifier, the nearest neighbour (NN) classifier and some other improved NFP classifiers, the proposed classifier achieves competitive performance.
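The geometric core of an NFP-style classifier can be sketched as follows: the distance from a query to the plane spanned by three class prototypes, found by least squares. The angle check against the class centre shown here is only a stand-in for the RNFPA criterion, which the paper defines precisely.

    import numpy as np

    def plane_distance(x, p1, p2, p3):
        """Distance from x to the plane through prototypes p1, p2, p3."""
        basis = np.stack([p2 - p1, p3 - p1], axis=1)      # plane directions
        coef, *_ = np.linalg.lstsq(basis, x - p1, rcond=None)
        foot = p1 + basis @ coef                          # projection onto the plane
        return np.linalg.norm(x - foot), foot

    x = np.array([1.0, 0.2, 0.1])
    p1, p2, p3 = np.eye(3)                                # three toy prototypes
    dist, foot = plane_distance(x, p1, p2, p3)

    # Restrict the extended plane: reject feet that deviate too far in
    # angle from the class centre (simplified stand-in for RNFPA).
    centre = (p1 + p2 + p3) / 3
    cos_angle = ((foot - centre) @ (x - centre) /
                 (np.linalg.norm(foot - centre) * np.linalg.norm(x - centre)))
    print(dist, np.degrees(np.arccos(cos_angle)))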
Karp, Peter D; Paley, Suzanne; Romero, Pedro
2002-01-01
Bioinformatics requires reusable software tools for creating model-organism databases (MODs). The Pathway Tools is a reusable, production-quality software environment for creating a type of MOD called a Pathway/Genome Database (PGDB). A PGDB such as EcoCyc (see http://ecocyc.org) integrates our evolving understanding of the genes, proteins, metabolic network, and genetic network of an organism. This paper provides an overview of the four main components of the Pathway Tools: The PathoLogic component supports creation of new PGDBs from the annotated genome of an organism. The Pathway/Genome Navigator provides query, visualization, and Web-publishing services for PGDBs. The Pathway/Genome Editors support interactive updating of PGDBs. The Pathway Tools ontology defines the schema of PGDBs. The Pathway Tools makes use of the Ocelot object database system for data management services for PGDBs. The Pathway Tools has been used to build PGDBs for 13 organisms within SRI and by external users.
Using bibliographic databases in technology transfer
NASA Technical Reports Server (NTRS)
Huffman, G. David
1987-01-01
When technology developed for a specific purpose is used in another application, the process is called technology transfer--the application of an existing technology to a new use or user for purposes other than those for which the technology was originally intended. Using Bibliographical Databases in Technology Transfer deals with demand-pull transfer, technology transfer that arises from need recognition, and is a guide for conducting demand-pull technology transfer studies. It can be used by a researcher as a self-teaching manual or by an instructor as a classroom text. A major problem of technology transfer is finding applicable technology to transfer. Described in detail is the solution to this problem, the use of computerized, bibliographic databases, which currently contain virtually all documented technology of the past 15 years. A general framework for locating technology is described. NASA technology organizations and private technology transfer firms are listed for consultation.
NASA Astrophysics Data System (ADS)
Wan, Qianwen; Panetta, Karen; Agaian, Sos
2017-05-01
Autonomous facial recognition systems are widely used in real-life applications, such as homeland and border security, law enforcement identification and authentication, and video-based surveillance analysis. Issues like low image quality, non-uniform illumination, and variations in pose and facial expression can impair the performance of recognition systems. To address the non-uniform illumination challenge, we present a novel robust autonomous facial recognition system based on a so-called logarithmic image visualization technique inspired by the human visual system. In this paper, the proposed method, for the first time, couples the logarithmic image visualization technique with the local binary pattern to perform discriminative feature extraction for a facial recognition system. The Yale database, the Yale-B database and the AT&T database are used for computer simulation accuracy and efficiency testing. The extensive computer simulations demonstrate the method's efficiency, accuracy, and robustness to illumination variation.
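An illustrative pipeline in the spirit of the method, using scikit-image's LBP rather than the authors' implementation, is sketched below; the simplified log transform stands in for the logarithmic image visualization technique.

    import numpy as np
    from skimage.feature import local_binary_pattern

    def log_lbp_features(image, radius=1, n_points=8):
        img = image.astype(np.float64)
        # Logarithmic intensity compression to reduce the effect of
        # non-uniform illumination (simplified stand-in).
        log_img = np.log1p(img) / np.log1p(img.max())
        lbp = local_binary_pattern(log_img, n_points, radius, method="uniform")
        hist, _ = np.histogram(lbp, bins=n_points + 2, range=(0, n_points + 2),
                               density=True)
        return hist                                  # feature vector for a classifier

    face = np.random.randint(0, 256, (64, 64))       # stand-in for a Yale/AT&T image
    print(log_lbp_features(face).shape)              # (10,)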
Kessell, Eric R.; Alvidrez, Jennifer; McConnell, William A.; Shumway, Martha
2010-01-01
Objective: This study investigated the association between San Francisco neighborhoods' racial/ethnic residential composition and the rate of mental-health-related 911 calls. Methods: Calls to the San Francisco 911 system from January 2001 through June 2003 (n=1,341,608) were divided into mental-health-related and other calls. Police sector data in the call records were overlaid onto U.S. Census tracts to estimate sector demographic and socioeconomic characteristics. Negative binomial regression was used to estimate the association between black, Asian, Latino and white resident percentage and rates of mental-health-related calls. Results: The percentage of black residents was associated with a lower rate of mental-health-related calls (IRR=.99, 95% CI .98–1.00). The percentage of Asian and Latino residents had no significant effect. Conclusions: The observed relationship between black residents and mental-health-related calls is not consistent with known emergency mental health service utilization patterns. The paradox between underutilization of the 911 system and overutilization of psychiatric emergency services deserves further investigation. PMID:19797379
Kessell, Eric R; Alvidrez, Jennifer; McConnell, William A; Shumway, Martha
2009-10-01
This study investigated the association between the racial and ethnic residential composition of San Francisco neighborhoods and the rate of mental health-related 911 calls. A total of 1,341,608 emergency calls (28,197 calls related to mental health) to San Francisco's 911 system were made from January 2001 through June 2003. Police sector data in the call records were overlaid onto U.S. census tracts to estimate sector demographic and socioeconomic characteristics. Negative binomial regression was used to estimate the association between the percentage of black, Asian, Latino, and white residents and rates of mental health-related calls. A one-point increase in a sector's percentage of black residents was associated with a lower rate of mental health-related calls (incidence rate ratio=.99, p<.05). A sector's percentage of Asian and Latino residents had no significant effect. The observed relationship between the percentage of black residents and mental health-related calls is not consistent with known emergency mental health service utilization patterns.
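Both records describe a negative binomial regression of call counts on sector composition. A hedged statsmodels sketch follows, with total calls as the exposure term; the column names are assumptions, not the study's data dictionary.

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm
    import statsmodels.formula.api as smf

    sectors = pd.read_csv("sf_sectors.csv")       # placeholder file name

    model = smf.glm(
        "mh_calls ~ pct_black + pct_asian + pct_latino",
        data=sectors,
        family=sm.families.NegativeBinomial(),
        offset=np.log(sectors["total_calls"]),    # exposure: all 911 calls in the sector
    )
    result = model.fit()
    print(np.exp(result.params))                  # incidence rate ratios (IRRs)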
Simple Logic for Big Problems: An Inside Look at Relational Databases.
ERIC Educational Resources Information Center
Seba, Douglas B.; Smith, Pat
1982-01-01
Discusses database design concept termed "normalization" (process replacing associations between data with associations in two-dimensional tabular form) which results in formation of relational databases (they are to computers what dictionaries are to spoken languages). Applications of the database in serials control and complex systems…
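A small concrete example of normalization in the quoted sense, applied to serials control with Python's built-in sqlite3, might look like this (the schema is invented for illustration):

    import sqlite3

    db = sqlite3.connect(":memory:")
    db.executescript("""
    CREATE TABLE serial (                       -- one row per journal title
        serial_id INTEGER PRIMARY KEY,
        title TEXT NOT NULL
    );
    CREATE TABLE issue (                        -- repeating group split out
        issue_id INTEGER PRIMARY KEY,
        serial_id INTEGER REFERENCES serial(serial_id),
        volume INTEGER, number INTEGER, received TEXT
    );
    """)
    db.execute("INSERT INTO serial VALUES (1, 'Journal of Documentation')")
    db.execute("INSERT INTO issue VALUES (1, 1, 38, 2, '1982-06-01')")

    for row in db.execute("""SELECT s.title, i.volume, i.number
                             FROM serial s JOIN issue i USING (serial_id)"""):
        print(row)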
Relational Database Design in Information Science Education.
ERIC Educational Resources Information Center
Brooks, Terrence A.
1985-01-01
Reports on database management system (dbms) applications designed by library school students for university community at University of Iowa. Three dbms design issues are examined: synthesis of relations, analysis of relations (normalization procedure), and data dictionary usage. Database planning prior to automation using data dictionary approach…
Calling and Life Satisfaction: It's Not about Having It, It's about Living It
ERIC Educational Resources Information Center
Duffy, Ryan D.; Allan, Blake A.; Autin, Kelsey L.; Bott, Elizabeth M.
2013-01-01
The present study examined the relation of career calling to life satisfaction among a diverse sample of 553 working adults, with a specific focus on the distinction between perceiving a calling (sensing a calling to a career) and living a calling (actualizing one's calling in one's current career). As hypothesized, the relation of perceiving a…