Sample records for distributed databases

  1. Building a generalized distributed system model

    NASA Technical Reports Server (NTRS)

    Mukkamala, Ravi

    1991-01-01

    A number of topics related to building a generalized distributed system model are discussed. The effects of distributed database modeling on evaluation of transaction rollbacks, the measurement of effects of distributed database models on transaction availability measures, and a performance analysis of static locking in replicated distributed database systems are covered.

  2. Computer Science Research in Europe.

    DTIC Science & Technology

    1984-08-29

    most attention, multi-database and its structure, and (3) the dependencies between databases and multi-databases. Having...completed a multi-database system for distributed data management at the University of Newcastle (Newcastle University, UK), INRIA is now working on a real...communications requirements of distributed database systems, protocols for checking the...A project called SIRIUS was established in 1977 at the

  3. Distribution Grid Integration Unit Cost Database | Solar Research | NREL

    Science.gov Websites

    NREL's Distribution Grid Integration Unit Cost Database contains unit cost information for different components that may be used to integrate distributed PV. It includes information from the California utility unit cost guides on traditional ...

  4. VIEWCACHE: An incremental pointer-based access method for autonomous interoperable databases

    NASA Technical Reports Server (NTRS)

    Roussopoulos, N.; Sellis, Timos

    1992-01-01

    One of the biggest problems facing NASA today is to provide scientists efficient access to a large number of distributed databases. Our pointer-based incremental database access method, VIEWCACHE, provides such an interface for accessing distributed data sets and directories. VIEWCACHE allows database browsing and searching, performing inter-database cross-referencing with no actual data movement between database sites. This organization and processing is especially suitable for managing Astrophysics databases which are physically distributed all over the world. Once the search is complete, the set of collected pointers pointing to the desired data is cached. VIEWCACHE includes spatial access methods for accessing image data sets, which provide much easier query formulation by referring directly to the image and very efficient search for objects contained within a two-dimensional window. We will develop and optimize a VIEWCACHE External Gateway Access to database management systems to facilitate distributed database search.

  5. Data Mining on Distributed Medical Databases: Recent Trends and Future Directions

    NASA Astrophysics Data System (ADS)

    Atilgan, Yasemin; Dogan, Firat

    As computerization in healthcare services increase, the amount of available digital data is growing at an unprecedented rate and as a result healthcare organizations are much more able to store data than to extract knowledge from it. Today the major challenge is to transform these data into useful information and knowledge. It is important for healthcare organizations to use stored data to improve quality while reducing cost. This paper first investigates the data mining applications on centralized medical databases, and how they are used for diagnostic and population health, then introduces distributed databases. The integration needs and issues of distributed medical databases are described. Finally the paper focuses on data mining studies on distributed medical databases.

  6. Production and distribution of scientific and technical databases - Comparison among Japan, US and Europe

    NASA Astrophysics Data System (ADS)

    Onodera, Natsuo; Mizukami, Masayuki

    This paper estimates several quantitative indices on the production and distribution of scientific and technical databases based on various recent publications and attempts to compare the indices internationally. Raw data used for the estimation are drawn mainly from the Database Directory (published by MITI) for database production and from some domestic and foreign study reports for database revenues. The ratios of the indices among Japan, the US, and Europe for database usage are similar to those for general scientific and technical activities such as population and R&D expenditures. But Japanese contributions to the production, revenue, and cross-border distribution of databases are still lower than those of the US and European countries. An international comparison of relative database activities between the public and private sectors is also discussed.

  7. Performance related issues in distributed database systems

    NASA Technical Reports Server (NTRS)

    Mukkamala, Ravi

    1991-01-01

    The key elements of research performed during the year-long effort of this project are: investigate the effects of heterogeneity in distributed real-time systems; study the requirements of TRAC towards building a heterogeneous database system; study the effects of performance modeling on distributed database performance; and experiment with an ORACLE-based heterogeneous system.

  8. WLN's Database: New Directions.

    ERIC Educational Resources Information Center

    Ziegman, Bruce N.

    1988-01-01

    Describes features of the Western Library Network's database, including the database structure, authority control, contents, quality control, and distribution methods. The discussion covers changes in distribution necessitated by increasing telecommunications costs and the development of optical data disk products. (CLB)

  9. VIEWCACHE: An incremental pointer-based access method for autonomous interoperable databases

    NASA Technical Reports Server (NTRS)

    Roussopoulos, N.; Sellis, Timos

    1993-01-01

    One of the biggest problems facing NASA today is to provide scientists efficient access to a large number of distributed databases. Our pointer-based incremental database access method, VIEWCACHE, provides such an interface for accessing distributed datasets and directories. VIEWCACHE allows database browsing and searching, performing inter-database cross-referencing with no actual data movement between database sites. This organization and processing is especially suitable for managing Astrophysics databases which are physically distributed all over the world. Once the search is complete, the set of collected pointers pointing to the desired data is cached. VIEWCACHE includes spatial access methods for accessing image datasets, which provide much easier query formulation by referring directly to the image and very efficient search for objects contained within a two-dimensional window. We will develop and optimize a VIEWCACHE External Gateway Access to database management systems to facilitate database search.

  10. Performance analysis of static locking in replicated distributed database systems

    NASA Technical Reports Server (NTRS)

    Kuang, Yinghong; Mukkamala, Ravi

    1991-01-01

    Data replication and transaction deadlocks can severely affect the performance of distributed database systems. Many current evaluation techniques ignore these aspects, because they are difficult to evaluate through analysis and time-consuming to evaluate through simulation. A technique is used that combines simulation and analysis to closely illustrate the impact of deadlocks and to evaluate the performance of replicated distributed databases with both shared and exclusive locks.
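
    A minimal sketch of the simulation half of such a combined approach, in Python. The workload parameters and the conflict rule below are illustrative assumptions, not the paper's actual model: transactions statically lock a few items with shared (S) or exclusive (X) locks, and we estimate how often two concurrent transactions conflict.

    ```python
    import random

    N_ITEMS = 100        # logical data items in the database
    N_PAIRS = 10_000     # simulated transaction pairs
    ITEMS_PER_TXN = 4    # items each transaction locks statically, up front
    P_EXCLUSIVE = 0.3    # assumed probability a requested lock is exclusive

    def lock_set():
        items = random.sample(range(N_ITEMS), ITEMS_PER_TXN)
        return {i: ("X" if random.random() < P_EXCLUSIVE else "S") for i in items}

    def conflict(a, b):
        # Two transactions conflict if they lock a common item and at least
        # one of the two locks on that item is exclusive.
        return any(m == "X" or b[i] == "X" for i, m in a.items() if i in b)

    hits = sum(conflict(lock_set(), lock_set()) for _ in range(N_PAIRS))
    print(f"estimated pairwise conflict probability: {hits / N_PAIRS:.4f}")
    ```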

  11. Heterogeneous distributed query processing: The DAVID system

    NASA Technical Reports Server (NTRS)

    Jacobs, Barry E.

    1985-01-01

    The objective of the Distributed Access View Integrated Database (DAVID) project is the development of an easy-to-use computer system with which NASA scientists, engineers, and administrators can uniformly access distributed heterogeneous databases. Basically, DAVID will be a database management system that sits alongside already existing database and file management systems. Its function is to enable users to access the data in other database and file systems without having to learn their data manipulation languages. Given here is an outline of a talk on the DAVID project and several charts.

  12. Architecture Knowledge for Evaluating Scalable Databases

    DTIC Science & Technology

    2015-01-16

    problems, arising from the proliferation of new data models and distributed technologies for building scalable, available data stores. Architects must...longer are relational databases the de facto standard for building data repositories. Highly distributed, scalable “NoSQL” databases [11] have emerged...This is especially challenging at the data storage layer. The multitude of competing NoSQL database technologies creates a complex and rapidly

  13. Preliminary surficial geologic map database of the Amboy 30 x 60 minute quadrangle, California

    USGS Publications Warehouse

    Bedford, David R.; Miller, David M.; Phelps, Geoffrey A.

    2006-01-01

    The surficial geologic map database of the Amboy 30x60 minute quadrangle presents characteristics of surficial materials for an area approximately 5,000 km2 in the eastern Mojave Desert of California. This map consists of new surficial mapping conducted between 2000 and 2005, as well as compilations of previous surficial mapping. Surficial geology units are mapped and described based on depositional process and age categories that reflect the mode of deposition, pedogenic effects occurring post-deposition, and, where appropriate, the lithologic nature of the material. The physical properties recorded in the database focus on those that drive hydrologic, biologic, and physical processes such as particle size distribution (PSD) and bulk density. This version of the database is distributed with point data representing locations of samples for both laboratory-determined physical properties and semi-quantitative field-based information. Future publications will include the field and laboratory data as well as maps of distributed physical properties across the landscape tied to physical process models where appropriate. The database is distributed in three parts: documentation, spatial map-based data, and printable map graphics of the database. Documentation includes this file, which provides a discussion of the surficial geology and describes the format and content of the map data; a database 'readme' file, which describes the database contents; and FGDC metadata for the spatial map information. Spatial data are distributed as an Arc/Info coverage in ESRI interchange (e00) format, or as tabular data in DBF3 (.DBF) format. Map graphics files are distributed as Postscript and Adobe Portable Document Format (PDF) files, and are appropriate for representing a view of the spatial database at the mapped scale.

  14. Design and implementation of a distributed large-scale spatial database system based on J2EE

    NASA Astrophysics Data System (ADS)

    Gong, Jianya; Chen, Nengcheng; Zhu, Xinyan; Zhang, Xia

    2003-03-01

    With the increasing maturity of distributed object technology, CORBA, .NET and EJB are universally used in the traditional IT field. However, theories and practices of distributed spatial databases need further improvement because of the tensions between large-scale spatial data and limited network bandwidth, or between transitory sessions and long transaction processing. Differences and trends among CORBA, .NET and EJB are discussed in detail; afterwards the concept, architecture and characteristics of a distributed large-scale seamless spatial database system based on J2EE are provided, which contains a GIS client application, web server, GIS application server and spatial data server. Moreover, the design and implementation of the GIS client application components based on JavaBeans, the GIS engine based on servlets, and the GIS application server based on GIS Enterprise JavaBeans (containing session beans and entity beans) are explained. Besides, experiments on the relation between spatial data volume and response time under different conditions are conducted, which prove that a distributed spatial database system based on J2EE can be used to manage, distribute and share large-scale spatial data on the Internet. Lastly, a distributed large-scale seamless image database based on the Internet is presented.

  15. Comparison of the Frontier Distributed Database Caching System to NoSQL Databases

    NASA Astrophysics Data System (ADS)

    Dykstra, Dave

    2012-12-01

    One of the main attractions of non-relational “NoSQL” databases is their ability to scale to large numbers of readers, including readers spread over a wide area. The Frontier distributed database caching system, used in production by the Large Hadron Collider CMS and ATLAS detector projects for Conditions data, is based on traditional SQL databases but also adds high scalability and the ability to be distributed over a wide-area for an important subset of applications. This paper compares the major characteristics of the two different approaches and identifies the criteria for choosing which approach to prefer over the other. It also compares in some detail the NoSQL databases used by CMS and ATLAS: MongoDB, CouchDB, HBase, and Cassandra.

  16. Comparison of the Frontier Distributed Database Caching System to NoSQL Databases

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dykstra, Dave

    One of the main attractions of non-relational NoSQL databases is their ability to scale to large numbers of readers, including readers spread over a wide area. The Frontier distributed database caching system, used in production by the Large Hadron Collider CMS and ATLAS detector projects for Conditions data, is based on traditional SQL databases but also adds high scalability and the ability to be distributed over a wide-area for an important subset of applications. This paper compares the major characteristics of the two different approaches and identifies the criteria for choosing which approach to prefer over the other. It also compares in some detail the NoSQL databases used by CMS and ATLAS: MongoDB, CouchDB, HBase, and Cassandra.

  17. The ATLAS TAGS database distribution and management - Operational challenges of a multi-terabyte distributed database

    NASA Astrophysics Data System (ADS)

    Viegas, F.; Malon, D.; Cranshaw, J.; Dimitrov, G.; Nowak, M.; Nairz, A.; Goossens, L.; Gallas, E.; Gamboa, C.; Wong, A.; Vinek, E.

    2010-04-01

    The TAG files store summary event quantities that allow a quick selection of interesting events. These data will be produced at a nominal rate of 200 Hz and uploaded into a relational database for access from websites and other tools. The estimated database volume is 6 TB per year, making it the largest application running on the ATLAS relational databases, at CERN and at other voluntary sites. The sheer volume and high rate of production make this application a challenge to data and resource management in many aspects. This paper will focus on the operational challenges of this system. These include: uploading the data from files to the CERN and remote site databases; distributing the TAG metadata that is essential to guide the user through event selection; and controlling resource usage of the database, from the user query load to the strategy of cleaning and archiving of old TAG data.

  18. Brief Report: Databases in the Asia-Pacific Region: The Potential for a Distributed Network Approach.

    PubMed

    Lai, Edward Chia-Cheng; Man, Kenneth K C; Chaiyakunapruk, Nathorn; Cheng, Ching-Lan; Chien, Hsu-Chih; Chui, Celine S L; Dilokthornsakul, Piyameth; Hardy, N Chantelle; Hsieh, Cheng-Yang; Hsu, Chung Y; Kubota, Kiyoshi; Lin, Tzu-Chieh; Liu, Yanfang; Park, Byung Joo; Pratt, Nicole; Roughead, Elizabeth E; Shin, Ju-Young; Watcharathanakij, Sawaeng; Wen, Jin; Wong, Ian C K; Yang, Yea-Huei Kao; Zhang, Yinghong; Setoguchi, Soko

    2015-11-01

    This study describes the availability and characteristics of databases in Asian-Pacific countries and assesses the feasibility of a distributed network approach in the region. A web-based survey was conducted among investigators using healthcare databases in the Asia-Pacific countries. Potential survey participants were identified through the Asian Pharmacoepidemiology Network. Investigators from a total of 11 databases participated in the survey. Database sources included four nationwide claims databases from Japan, South Korea, and Taiwan; two nationwide electronic health records from Hong Kong and Singapore; a regional electronic health record from western China; two electronic health records from Thailand; and cancer and stroke registries from Taiwan. We identified 11 databases with capabilities for distributed network approaches. Many country-specific coding systems and terminologies have been already converted to international coding systems. The harmonization of health expenditure data is a major obstacle for future investigations attempting to evaluate issues related to medical costs.

  19. Performance analysis of static locking in replicated distributed database systems

    NASA Technical Reports Server (NTRS)

    Kuang, Yinghong; Mukkamala, Ravi

    1991-01-01

    Data replication and transaction deadlocks can severely affect the performance of distributed database systems. Many current evaluation techniques ignore these aspects, because they are difficult to evaluate through analysis and time-consuming to evaluate through simulation. Here, a technique is discussed that combines simulation and analysis to closely illustrate the impact of deadlocks and evaluate the performance of replicated distributed databases with both shared and exclusive locks.

  20. A Database for Decision-Making in Training and Distributed Learning Technology

    DTIC Science & Technology

    1998-04-01

    developer must answer these questions: ♦ Who will develop the courseware? Should we outsource? ♦ What media should we use? How much will it cost? ♦ What...to develop, the database can be useful for answering staffing questions and planning transitions to technology-assisted courses. The database...of distributed learning curricula in comparison to traditional methods. To develop a military-wide distributed learning plan, the existing course

  1. Distribution System Upgrade Unit Cost Database

    DOE Data Explorer

    Horowitz, Kelsey

    2017-11-30

    This database contains unit cost information for different components that may be used to integrate distributed photovoltaic (D-PV) systems onto distribution systems. Some of these upgrades and costs may also apply to integration of other distributed energy resources (DER). Which components are required, and how many of each, is system-specific and should be determined by analyzing the effects of distributed PV at a given penetration level on the circuit of interest in combination with engineering assessments on the efficacy of different solutions to increase the ability of the circuit to host additional PV as desired. The current state of the distribution system should always be considered in these types of analyses. The data in this database were collected from a variety of utilities, PV developers, technology vendors, and published research reports. Where possible, we have included information on the source of each data point and relevant notes. In some cases where data provided is sensitive or proprietary, we were not able to specify the source, but provide other information that may be useful to the user (e.g. year, location where equipment was installed). NREL has carefully reviewed these sources prior to inclusion in this database. Additional information about the database, data sources, and assumptions is included in the "Unit_cost_database_guide.doc" file included in this submission. This guide provides important information on what costs are included in each entry. Please refer to this guide before using the unit cost database for any purpose.

  2. Resident database interfaces to the DAVID system, a heterogeneous distributed database management system

    NASA Technical Reports Server (NTRS)

    Moroh, Marsha

    1988-01-01

    A methodology for building interfaces of resident database management systems to a heterogeneous distributed database management system under development at NASA, the DAVID system, was developed. The feasibility of that methodology was demonstrated by construction of the software necessary to perform the interface task. The interface terminology developed in the course of this research is presented. The work performed and the results are summarized.

  3. CMO: Cruise Metadata Organizer for JAMSTEC Research Cruises

    NASA Astrophysics Data System (ADS)

    Fukuda, K.; Saito, H.; Hanafusa, Y.; Vanroosebeke, A.; Kitayama, T.

    2011-12-01

    JAMSTEC's Data Research Center for Marine-Earth Sciences manages and distributes a wide variety of observational data and samples obtained from JAMSTEC research vessels and deep sea submersibles. Generally, metadata are essential to identify the data and samples obtained. In JAMSTEC, cruise metadata include cruise information such as cruise ID, name of vessel, and research theme, and diving information such as dive number, name of submersible, and position of diving point. They are submitted by chief scientists of research cruises in the Microsoft Excel® spreadsheet format, and registered into a data management database to confirm receipt of observational data files, cruise summaries, and cruise reports. The cruise metadata are also published via "JAMSTEC Data Site for Research Cruises" within two months after the end of a cruise. Furthermore, these metadata are distributed with observational data, images and samples via several data and sample distribution websites after a publication moratorium period. However, there are two operational issues in the metadata publishing process. One is duplicated effort and asynchronous metadata across multiple distribution websites, due to manual metadata entry into individual websites by administrators. The other is differing data types or representations of metadata in each website. To solve those problems, we have developed a cruise metadata organizer (CMO) which allows cruise metadata to be connected from the data management database to several distribution websites. CMO is comprised of three components: an Extensible Markup Language (XML) database, Enterprise Application Integration (EAI) software, and a web-based interface. The XML database is used because of its flexibility for any change of metadata. Daily differential uptake of metadata from the data management database to the XML database is automatically processed via the EAI software. Some metadata are entered into the XML database using the web-based interface by a metadata editor in CMO as needed. Then daily differential uptake of metadata from the XML database to databases in several distribution websites is automatically processed using a converter defined by the EAI software. Currently, CMO is available for three distribution websites: "Deep Sea Floor Rock Sample Database GANSEKI", "Marine Biological Sample Database", and "JAMSTEC E-library of Deep-sea Images". CMO is planned to provide "JAMSTEC Data Site for Research Cruises" with metadata in the future.

  4. SMALL-SCALE AND GLOBAL DYNAMOS AND THE AREA AND FLUX DISTRIBUTIONS OF ACTIVE REGIONS, SUNSPOT GROUPS, AND SUNSPOTS: A MULTI-DATABASE STUDY

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Muñoz-Jaramillo, Andrés; Windmueller, John C.; Amouzou, Ernest C.

    2015-02-10

    In this work, we take advantage of 11 different sunspot group, sunspot, and active region databases to characterize the area and flux distributions of photospheric magnetic structures. We find that, when taken separately, different databases are better fitted by different distributions (as has been reported previously in the literature). However, we find that all our databases can be reconciled by the simple application of a proportionality constant, and that, in reality, different databases are sampling different parts of a composite distribution. This composite distribution is made up by linear combination of Weibull and log-normal distributions, where a pure Weibull (log-normal) characterizes the distribution of structures with fluxes below (above) 10^21 Mx (10^22 Mx). Additionally, we demonstrate that the Weibull distribution shows the expected linear behavior of a power-law distribution (when extended to smaller fluxes), making our results compatible with the results of Parnell et al. We propose that this is evidence of two separate mechanisms giving rise to visible structures on the photosphere: one directly connected to the global component of the dynamo (and the generation of bipolar active regions), and the other with the small-scale component of the dynamo (and the fragmentation of magnetic structures due to their interaction with turbulent convection).
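
    The composite distribution described here is a weighted sum of a Weibull and a log-normal density. The sketch below evaluates such a mixture with scipy; the mixture weight and the shape and scale parameters are placeholders, not the values fitted in the paper.

    ```python
    import numpy as np
    from scipy import stats

    w = 0.6                                         # assumed mixture weight
    weibull = stats.weibull_min(c=0.7, scale=1e20)  # dominant below ~1e21 Mx
    lognorm = stats.lognorm(s=1.0, scale=3e21)      # dominant above ~1e22 Mx

    flux = np.logspace(19, 23, 200)                 # magnetic flux grid (Mx)
    pdf = w * weibull.pdf(flux) + (1 - w) * lognorm.pdf(flux)  # composite density
    ```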

  5. Bridging the Gap between the Data Base and User in a Distributed Environment.

    ERIC Educational Resources Information Center

    Howard, Richard D.; And Others

    1989-01-01

    The distribution of databases physically separates users from the administrators who perform database administration. By drawing on the work of social scientists in reliability and validity, a set of concepts and a list of questions to ensure data quality were developed. (Author/MLW)

  6. A Web-based open-source database for the distribution of hyperspectral signatures

    NASA Astrophysics Data System (ADS)

    Ferwerda, J. G.; Jones, S. D.; Du, Pei-Jun

    2006-10-01

    With the coming of age of field spectroscopy as a non-destructive means to collect information on the physiology of vegetation, there is a need for storage of signatures and, more importantly, their metadata. Without the proper organisation of metadata, the signatures themselves become limited. In order to facilitate re-distribution of data, a database for the storage and distribution of hyperspectral signatures and their metadata was designed. The database was built using open-source software, and can be used by the hyperspectral community to share their data. Data is uploaded through a simple web-based interface. The database recognizes major file formats by ASD, GER and International Spectronics. The database source code is available for download through the hyperspectral.info web domain, and we happily invite suggestions for additions and modifications to the database to be submitted through the online forums on the same website.

  7. Distributed Database Control and Allocation. Volume 3. Distributed Database System Designer’s Handbook.

    DTIC Science & Technology

    1983-10-01

    Multiversion Data 2-18; 2.7.1 Multiversion Timestamping 2-20; 2.7.2 Multiversion Locking 2-20; 2.8 Combining the Techniques 2-22; 3. Database Recovery Algorithms...See [THEM79, GIFF79] for details. 2.7 Multiversion Data: Let us return to a database system model where each logical data item is stored at one DM...In a multiversion database each Write wi[x] produces a new copy (or version) of x, denoted xi. Thus, the value of x is a set of versions. For each

  8. The Design and Implementation of a Relational to Network Query Translator for a Distributed Database Management System.

    DTIC Science & Technology

    1985-12-01

    RELATIONAL TO NETWORK QUERY TRANSLATOR FOR A DISTRIBUTED DATABASE MANAGEMENT SYSTEM. THESIS. Kevin H. Mahoney, Captain, USAF. AFIT/GCS/ENG/85D-7...NETWORK QUERY TRANSLATOR FOR A DISTRIBUTED DATABASE MANAGEMENT SYSTEM. THESIS Presented to the Faculty of the School of Engineering of the Air Force Institute of Technology, Air University, In Partial Fulfillment of the Requirements for the Degree of Master of Science in Computer Systems. Kevin H. Mahoney

  9. Distributed Episodic Exploratory Planning (DEEP)

    DTIC Science & Technology

    2008-12-01

    API). For DEEP, Hibernate offered the following advantages: • Abstracts SQL by utilizing HQL so any database with a Java Database Connectivity...Hibernate; SQL; ICCRTS International Command and Control Research and Technology Symposium; JDB Java Distributed Blackboard; JDBC Java Database Connectivity...selected because of its opportunistic reasoning capabilities and implemented in Java for platform independence. Java was chosen for ease of

  10. Monte Carlo simulations of product distributions and contained metal estimates

    USGS Publications Warehouse

    Gettings, Mark E.

    2013-01-01

    Estimation of product distributions of two factors was simulated by conventional Monte Carlo techniques using factor distributions that were independent (uncorrelated). Several simulations using uniform distributions of factors show that the product distribution has a central peak approximately centered at the product of the medians of the factor distributions. Factor distributions that are peaked, such as Gaussian (normal) produce an even more peaked product distribution. Piecewise analytic solutions can be obtained for independent factor distributions and yield insight into the properties of the product distribution. As an example, porphyry copper grades and tonnages are now available in at least one public database and their distributions were analyzed. Although both grade and tonnage can be approximated with lognormal distributions, they are not exactly fit by them. The grade shows some nonlinear correlation with tonnage for the published database. Sampling by deposit from available databases of grade, tonnage, and geological details of each deposit specifies both grade and tonnage for that deposit. Any correlation between grade and tonnage is then preserved and the observed distribution of grades and tonnages can be used with no assumption of distribution form.
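
    The first observation above is easy to reproduce by direct Monte Carlo: sample two independent uniform factors, multiply them, and compare the median of the products with the product of the medians. The ranges below are arbitrary illustrative grade and tonnage values, not data from the cited database.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    grade = rng.uniform(0.2, 1.2, 100_000)    # illustrative grade factor (% Cu)
    tonnage = rng.uniform(10, 500, 100_000)   # illustrative tonnage factor (Mt)

    contained_metal = grade * tonnage         # the product distribution
    print("product of medians:", np.median(grade) * np.median(tonnage))
    print("median of products:", np.median(contained_metal))
    # The two values are close, and a histogram of contained_metal shows a
    # central peak near that value, as described above.
    ```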

  11. How to ensure sustainable interoperability in heterogeneous distributed systems through architectural approach.

    PubMed

    Pape-Haugaard, Louise; Frank, Lars

    2011-01-01

    A major obstacle in ensuring ubiquitous information is the utilization of heterogeneous systems in eHealth. The objective of this paper is to illustrate how an architecture for distributed eHealth databases can be designed without losing the characteristic features of traditional sustainable databases. The approach is firstly to explain traditional architecture in central and homogeneous distributed database computing, followed by a possible approach that uses an architectural framework to obtain sustainability across disparate systems, i.e., heterogeneous databases, concluding with a discussion. It is seen that, through a method of using relaxed ACID properties on a service-oriented architecture, it is possible to achieve data consistency, which is essential when ensuring sustainable interoperability.

  12. Database System Design and Implementation for Marine Air-Traffic-Controller Training

    DTIC Science & Technology

    2017-06-01

    NAVAL POSTGRADUATE SCHOOL, MONTEREY, CALIFORNIA. THESIS. Approved for public release; distribution is unlimited. TITLE AND SUBTITLE: DATABASE SYSTEM DESIGN AND IMPLEMENTATION FOR MARINE AIR-TRAFFIC-CONTROLLER TRAINING...ABSTRACT: This project focused on the design, development, and implementation of a centralized

  13. Effects of distributed database modeling on evaluation of transaction rollbacks

    NASA Technical Reports Server (NTRS)

    Mukkamala, Ravi

    1991-01-01

    Data distribution, degree of data replication, and transaction access patterns are key factors in determining the performance of distributed database systems. In order to simplify the evaluation of performance measures, database designers and researchers tend to make simplistic assumptions about the system. The effect of modeling assumptions on the evaluation of one such measure, the number of transaction rollbacks, is studied in a partitioned distributed database system. Six probabilistic models are developed, and expressions are derived for the number of rollbacks under each of these models. Essentially, the models differ in terms of the available system information. The analytical results so obtained are compared to results from simulation. From this, it is concluded that most of the probabilistic models yield overly conservative estimates of the number of rollbacks. The effect of transaction commutativity on system throughput is also grossly undermined when such models are employed.

  14. Effects of distributed database modeling on evaluation of transaction rollbacks

    NASA Technical Reports Server (NTRS)

    Mukkamala, Ravi

    1991-01-01

    Data distribution, degree of data replication, and transaction access patterns are key factors in determining the performance of distributed database systems. In order to simplify the evaluation of performance measures, database designers and researchers tend to make simplistic assumptions about the system. Here, researchers investigate the effect of modeling assumptions on the evaluation of one such measure, the number of transaction rollbacks in a partitioned distributed database system. The researchers developed six probabilistic models and expressions for the number of rollbacks under each of these models. Essentially, the models differ in terms of the available system information. The analytical results obtained are compared to results from simulation. It was concluded that most of the probabilistic models yield overly conservative estimates of the number of rollbacks. The effect of transaction commutativity on system throughput is also grossly undermined when such models are employed.

  15. Distribution Characteristics of Air-Bone Gaps – Evidence of Bias in Manual Audiometry

    PubMed Central

    Margolis, Robert H.; Wilson, Richard H.; Popelka, Gerald R.; Eikelboom, Robert H.; Swanepoel, De Wet; Saly, George L.

    2015-01-01

    Objective Five databases were mined to examine distributions of air-bone gaps obtained by automated and manual audiometry. Differences in distribution characteristics were examined for evidence of influences unrelated to the audibility of test signals. Design The databases provided air- and bone-conduction thresholds that permitted examination of air-bone gap distributions that were free of ceiling and floor effects. Cases with conductive hearing loss were eliminated based on air-bone gaps, tympanometry, and otoscopy, when available. The analysis is based on 2,378,921 threshold determinations from 721,831 subjects from five databases. Results Automated audiometry produced air-bone gaps that were normally distributed suggesting that air- and bone-conduction thresholds are normally distributed. Manual audiometry produced air-bone gaps that were not normally distributed and show evidence of biasing effects of assumptions of expected results. In one database, the form of the distributions showed evidence of inclusion of conductive hearing losses. Conclusions Thresholds obtained by manual audiometry show tester bias effects from assumptions of the patient’s hearing loss characteristics. Tester bias artificially reduces the variance of bone-conduction thresholds and the resulting air-bone gaps. Because the automated method is free of bias from assumptions of expected results, these distributions are hypothesized to reflect the true variability of air- and bone-conduction thresholds and the resulting air-bone gaps. PMID:26627469
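
    The paper's basic quantity is simple to state: the air-bone gap is the air-conduction threshold minus the bone-conduction threshold at a given frequency, and its distribution can be tested for normality. A small sketch on simulated thresholds (the data and parameters are invented for illustration, not drawn from the five databases):

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    ac = rng.normal(20, 8, 5_000)       # simulated air-conduction thresholds (dB HL)
    bc = ac - rng.normal(0, 5, 5_000)   # simulated bone-conduction thresholds (dB HL)

    abg = ac - bc                       # air-bone gaps
    stat, p = stats.normaltest(abg)     # D'Agostino-Pearson test of normality
    print(f"mean ABG = {abg.mean():.1f} dB, normality test p = {p:.3f}")
    ```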

  16. Private database queries based on counterfactual quantum key distribution

    NASA Astrophysics Data System (ADS)

    Zhang, Jia-Li; Guo, Fen-Zhuo; Gao, Fei; Liu, Bin; Wen, Qiao-Yan

    2013-08-01

    Based on the fundamental concept of quantum counterfactuality, we propose a protocol to achieve quantum private database queries, which is a theoretical study of how counterfactuality can be employed beyond counterfactual quantum key distribution (QKD). By adding crucial detecting apparatus to the device of QKD, the privacy of both the distrustful user and the database owner can be guaranteed. Furthermore, the proposed private-database-query protocol makes full use of the low efficiency in the counterfactual QKD, and by adjusting the relevant parameters, the protocol obtains excellent flexibility and extensibility.

  17. Surviving the Glut: The Management of Event Streams in Cyberphysical Systems

    NASA Astrophysics Data System (ADS)

    Buchmann, Alejandro

    Alejandro Buchmann is Professor in the Department of Computer Science, Technische Universität Darmstadt, where he heads the Databases and Distributed Systems Group. He received his MS (1977) and PhD (1980) from the University of Texas at Austin. He was an Assistant/Associate Professor at the Institute for Applied Mathematics and Systems IIMAS/UNAM in Mexico, doing research on databases for CAD, geographic information systems, and object-oriented databases. At Computer Corporation of America (later Xerox Advanced Information Systems) in Cambridge, Mass., he worked in the areas of active databases and real-time databases, and at GTE Laboratories, Waltham, in the areas of distributed object systems and the integration of heterogeneous legacy systems. In 1991 he returned to academia and joined T.U. Darmstadt. His current research interests are at the intersection of middleware, databases, event-based distributed systems, ubiquitous computing, and very large distributed systems (P2P, WSN). Much of the current research is concerned with guaranteeing quality of service and reliability properties in these systems, for example, scalability, performance, transactional behaviour, consistency, and end-to-end security. Many research projects imply collaboration with industry and cover a broad spectrum of application domains. Further information can be found at http://www.dvs.tu-darmstadt.de

  18. New model for distributed multimedia databases and its application to networking of museums

    NASA Astrophysics Data System (ADS)

    Kuroda, Kazuhide; Komatsu, Naohisa; Komiya, Kazumi; Ikeda, Hiroaki

    1998-02-01

    This paper proposes a new distributed multimedia database system where the databases storing MPEG-2 videos and/or super high definition images are connected together through B-ISDNs, and also refers to an example of the networking of museums on the basis of the proposed database system. The proposed database system introduces a new concept of the 'retrieval manager', which functions as an intelligent controller so that the user can recognize a set of image databases as one logical database. A user terminal issues a retrieval request to the retrieval manager located nearest to the user terminal on the network. The retrieved contents are then sent directly through the B-ISDNs to the user terminal from the server which stores the designated contents. In this case, the designated logical database dynamically generates the best combination of retrieval parameters, such as the data transfer path, on the basis of the system environment. The generated retrieval parameters are then used to select the most suitable data transfer path on the network, so that the best combination of these parameters fits the distributed multimedia database system.

  19. ARACHNID: A prototype object-oriented database tool for distributed systems

    NASA Technical Reports Server (NTRS)

    Younger, Herbert; Oreilly, John; Frogner, Bjorn

    1994-01-01

    This paper discusses the results of a Phase 2 SBIR project sponsored by NASA and performed by MIMD Systems, Inc. A major objective of this project was to develop specific concepts for improved performance in accessing large databases. An object-oriented and distributed approach was used for the general design, while a geographical decomposition was used as a specific solution. The resulting software framework is called ARACHNID. The Faint Source Catalog developed by NASA was the initial database testbed. This is a database of many gigabytes, where an order of magnitude improvement in query speed is being sought. This database contains faint infrared point sources obtained from telescope measurements of the sky. A geographical decomposition of this database is an attractive approach to dividing it into pieces. Each piece can then be searched on individual processors with only a weak data linkage between the processors being required. As a further demonstration of the concepts implemented in ARACHNID, a tourist information system is discussed. This version of ARACHNID is the commercial result of the project. It is a distributed, networked database application where speed, maintenance, and reliability are important considerations. This paper focuses on the design concepts and technologies that form the basis for ARACHNID.
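
    A rough sketch of the geographical-decomposition idea described above, not ARACHNID's actual design: the catalog is split into declination bands, a query prunes to the bands it overlaps, and each band is searched independently with only weak linkage between the pieces.

    ```python
    from concurrent.futures import ThreadPoolExecutor

    N_BANDS = 18  # 10-degree declination bands covering -90..+90

    def band_of(dec):
        return min(int((dec + 90) // 10), N_BANDS - 1)

    def search_band(band, dec_lo, dec_hi):
        # band is a list of (source_id, dec) tuples held by one partition
        return [s for s in band if dec_lo <= s[1] <= dec_hi]

    def query(bands, dec_lo, dec_hi):
        hit = range(band_of(dec_lo), band_of(dec_hi) + 1)  # prune partitions
        with ThreadPoolExecutor() as pool:
            futures = [pool.submit(search_band, bands[i], dec_lo, dec_hi)
                       for i in hit]
            return [s for f in futures for s in f.result()]
    ```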

  20. Content Based Image Retrieval based on Wavelet Transform coefficients distribution

    PubMed Central

    Lamard, Mathieu; Cazuguel, Guy; Quellec, Gwénolé; Bekri, Lynda; Roux, Christian; Cochener, Béatrice

    2007-01-01

    In this paper we propose a content-based image retrieval method for diagnosis aid in medical fields. We characterize images without extracting significant features, building signatures from the distribution of wavelet transform coefficients. Retrieval is carried out by computing signature distances between the query and database images. Several signatures are proposed; they use a model of the wavelet coefficient distribution. To enhance results, a weighted distance between signatures is used and an adapted wavelet basis is proposed. Retrieval efficiency is given for different databases including a diabetic retinopathy, a mammography and a face database. Results are promising: the retrieval efficiency is higher than 95% for some cases using an optimization process. PMID:18003013
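
    A minimal sketch of the signature idea, assuming the PyWavelets package: summarize the distribution of coefficients in each detail subband (here with just mean absolute value and standard deviation) and compare signatures with a weighted distance. The statistics and weights are illustrative, not the paper's fitted distribution model.

    ```python
    import numpy as np
    import pywt  # PyWavelets

    def signature(image, wavelet="db2", level=2):
        coeffs = pywt.wavedec2(image, wavelet, level=level)
        feats = []
        for detail in coeffs[1:]:          # skip the approximation subband
            for band in detail:            # horizontal, vertical, diagonal
                feats += [np.mean(np.abs(band)), np.std(band)]
        return np.array(feats)

    def weighted_distance(sig_a, sig_b, weights):
        return float(np.sum(weights * np.abs(sig_a - sig_b)))

    # Retrieval: rank database images by weighted_distance to the query signature.
    ```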

  1. Design considerations, architecture, and use of the Mini-Sentinel distributed data system.

    PubMed

    Curtis, Lesley H; Weiner, Mark G; Boudreau, Denise M; Cooper, William O; Daniel, Gregory W; Nair, Vinit P; Raebel, Marsha A; Beaulieu, Nicolas U; Rosofsky, Robert; Woodworth, Tiffany S; Brown, Jeffrey S

    2012-01-01

    We describe the design, implementation, and use of a large, multiorganizational distributed database developed to support the Mini-Sentinel Pilot Program of the US Food and Drug Administration (FDA). As envisioned by the US FDA, this implementation will inform and facilitate the development of an active surveillance system for monitoring the safety of medical products (drugs, biologics, and devices) in the USA. A common data model was designed to address the priorities of the Mini-Sentinel Pilot and to leverage the experience and data of participating organizations and data partners. A review of existing common data models informed the process. Each participating organization designed a process to extract, transform, and load its source data, applying the common data model to create the Mini-Sentinel Distributed Database. Transformed data were characterized and evaluated using a series of programs developed centrally and executed locally by participating organizations. A secure communications portal was designed to facilitate queries of the Mini-Sentinel Distributed Database and transfer of confidential data, analytic tools were developed to facilitate rapid response to common questions, and distributed querying software was implemented to facilitate rapid querying of summary data. As of July 2011, information on 99,260,976 health plan members was included in the Mini-Sentinel Distributed Database. The database includes 316,009,067 person-years of observation time, with members contributing, on average, 27.0 months of observation time. All data partners have successfully executed distributed code and returned findings to the Mini-Sentinel Operations Center. This work demonstrates the feasibility of building a large, multiorganizational distributed data system in which organizations retain possession of their data that are used in an active surveillance system. Copyright © 2012 John Wiley & Sons, Ltd.
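
    The distributed-querying pattern described above can be sketched as follows: every partner runs the same aggregate program against its own copy of the common data model and returns only summary rows, which the coordinating center combines. The table and column names below are assumptions, not the actual Mini-Sentinel schema.

    ```python
    import sqlite3

    SUMMARY_QUERY = """
        SELECT sex, COUNT(*) AS members, SUM(months_enrolled) AS person_months
        FROM enrollment GROUP BY sex
    """

    def run_at_partner(db_path):
        # Executed locally at each data partner; row-level data never leave the site.
        with sqlite3.connect(db_path) as conn:
            return conn.execute(SUMMARY_QUERY).fetchall()

    def combine(all_results):
        totals = {}
        for rows in all_results:
            for sex, members, months in rows:
                m, pm = totals.get(sex, (0, 0))
                totals[sex] = (m + members, pm + (months or 0))
        return totals
    ```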

  2. Integrating a local database into the StarView distributed user interface

    NASA Technical Reports Server (NTRS)

    Silberberg, D. P.

    1992-01-01

    A distributed user interface to the Space Telescope Data Archive and Distribution Service (DADS) known as StarView is being developed. The DADS architecture consists of the data archive as well as a relational database catalog describing the archive. StarView is a client/server system in which the user interface is the front-end client to the DADS catalog and archive servers. Users query the DADS catalog from the StarView interface. Query commands are transmitted via a network and evaluated by the database. The results are returned via the network and are displayed on StarView forms. Based on the results, users decide which data sets to retrieve from the DADS archive. Archive requests are packaged by StarView and sent to DADS, which returns the requested data sets to the users. The advantages of distributed client/server user interfaces over traditional one-machine systems are well known. Since users run software on machines separate from the database, the overall client response time is much faster. Also, since the server is free to process only database requests, the database response time is much faster. Disadvantages inherent in this architecture are slow overall database access time due to the network delays, lack of a 'get previous row' command, and that refinements of a previously issued query must be submitted to the database server, even though the domain of values has already been returned by the previous query. This architecture also does not allow users to cross correlate DADS catalog data with other catalogs. Clearly, a distributed user interface would be more powerful if it overcame these disadvantages. A local database is being integrated into StarView to overcome these disadvantages. When a query is made through a StarView form, which is often composed of fields from multiple tables, it is translated to an SQL query and issued to the DADS catalog. At the same time, a local database table is created to contain the resulting rows of the query. The returned rows are displayed on the form as well as inserted into the local database table. Identical results are produced by reissuing the query to either the DADS catalog or to the local table. Relational databases do not provide a 'get previous row' function because of the inherent complexity of retrieving previous rows of multiple-table joins. However, since this function is easily implemented on a single table, StarView uses the local table to retrieve the previous row. Also, StarView issues subsequent query refinements to the local table instead of the DADS catalog, eliminating the network transmission overhead. Finally, other catalogs can be imported into the local database for cross correlation with local tables. Overall, it is believed that this is a more powerful architecture for distributed database user interfaces.
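
    The local-cache idea lends itself to a small sketch: rows returned by the remote catalog query are inserted into a local table, after which "previous row" navigation and query refinements run against the local copy instead of the network. The schema here is invented for illustration; the real StarView/DADS catalog differs.

    ```python
    import sqlite3

    local = sqlite3.connect(":memory:")  # the local database
    local.execute("CREATE TABLE results (dataset_id TEXT, instrument TEXT, exptime REAL)")

    def cache_remote_rows(rows):
        # rows: tuples returned by the remote DADS catalog query
        local.executemany("INSERT INTO results VALUES (?, ?, ?)", rows)

    def refine(min_exptime):
        # A refinement of the previous query hits the local table, not the server.
        cur = local.execute("SELECT * FROM results WHERE exptime >= ?", (min_exptime,))
        return cur.fetchall()
    ```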

  3. Process evaluation distributed system

    NASA Technical Reports Server (NTRS)

    Moffatt, Christopher L. (Inventor)

    2006-01-01

    The distributed system includes a database server, an administration module, a process evaluation module, and a data display module. The administration module is in communication with the database server for providing observation criteria information to the database server. The process evaluation module is in communication with the database server for obtaining the observation criteria information from the database server and collecting process data based on the observation criteria information. The process evaluation module utilizes a personal digital assistant (PDA). A data display module in communication with the database server, including a website for viewing collected process data in a desired metrics form, the data display module also for providing desired editing and modification of the collected process data. The connectivity established by the database server to the administration module, the process evaluation module, and the data display module, minimizes the requirement for manual input of the collected process data.

  4. Heterogeneous distributed databases: A case study

    NASA Technical Reports Server (NTRS)

    Stewart, Tracy R.; Mukkamala, Ravi

    1991-01-01

    Alternatives are reviewed for accessing distributed heterogeneous databases and a recommended solution is proposed. The current study is limited to the Automated Information Systems Center at the Naval Sea Combat Systems Engineering Station at Norfolk, VA. This center maintains two databases located on Digital Equipment Corporation's VAX computers running under the VMS operating system. The first database, ICMS, resides on a VAX 11/780 and has been implemented using VAX DBMS, a CODASYL-based system. The second database, CSA, resides on a VAX 6460 and has been implemented using the ORACLE relational database management system (RDBMS). Both databases are used for configuration management within the U.S. Navy. Different customer bases are supported by each database. ICMS tracks U.S. Navy ships and major systems (anti-sub, sonar, etc.). Even though the major systems on ships and submarines have totally different functions, some of the equipment within the major systems is common to both ships and submarines.

  5. Database Search Strategies & Tips. Reprints from the Best of "ONLINE" [and]"DATABASE."

    ERIC Educational Resources Information Center

    Online, Inc., Weston, CT.

    Reprints of 17 articles presenting strategies and tips for searching databases online appear in this collection, which is one in a series of volumes of reprints from "ONLINE" and "DATABASE" magazines. Edited for information professionals who use electronically distributed databases, these articles address such topics as: (1)…

  6. Resources | Division of Cancer Prevention

    Cancer.gov

    Manual of Operations Version 3, 12/13/2012 (PDF, 162KB). Database sources: Consortium for Functional Glycomics databases; Design Studies Related to the Development of Distributed, Web-based European Carbohydrate Databases (EUROCarbDB)

  7. Analysis and Design of a Distributed System for Management and Distribution of Natural Language Assertions

    DTIC Science & Technology

    2010-09-01

    2. SCIL Architecture...6; 3. Assertions...; LIST OF FIGURES: Figure 1. SCIL architecture...; Acronyms:...Database Connectivity; LAN Local Area Network; ODBC Open Database Connectivity; SCIL Social-Cultural Content in Language; UMD

  8. DataHub knowledge based assistance for science visualization and analysis using large distributed databases

    NASA Technical Reports Server (NTRS)

    Handley, Thomas H., Jr.; Collins, Donald J.; Doyle, Richard J.; Jacobson, Allan S.

    1991-01-01

    Viewgraphs on DataHub knowledge based assistance for science visualization and analysis using large distributed databases. Topics covered include: DataHub functional architecture; data representation; logical access methods; preliminary software architecture; LinkWinds; data knowledge issues; expert systems; and data management.

  9. Incorporating client-server database architecture and graphical user interface into outpatient medical records.

    PubMed Central

    Fiacco, P. A.; Rice, W. H.

    1991-01-01

    Computerized medical record systems require structured database architectures for information processing. However, the data must be able to be transferred across heterogeneous platforms and software systems. Client-server architecture allows for distributive processing of information among networked computers and provides the flexibility needed to link diverse systems together effectively. We have incorporated this client-server model with a graphical user interface into an outpatient medical record system, known as SuperChart, for the Department of Family Medicine at SUNY Health Science Center at Syracuse. SuperChart was developed using SuperCard and Oracle. SuperCard uses modern object-oriented programming to support a hypermedia environment. Oracle is a powerful relational database management system that incorporates a client-server architecture. This provides both a distributed database and distributed processing, which improves performance. PMID:1807732

  10. Toward unification of taxonomy databases in a distributed computer environment

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kitakami, Hajime; Tateno, Yoshio; Gojobori, Takashi

    1994-12-31

    All the taxonomy databases constructed with the DNA databases of the international DNA data banks are powerful electronic dictionaries which aid in biological research by computer. The taxonomy databases are, however, not consistently unified with a relational format. If we can achieve consistent unification of the taxonomy databases, it will be useful in comparing many research results, and investigating future research directions from existent research results. In particular, it will be useful in comparing relationships between phylogenetic trees inferred from molecular data and those constructed from morphological data. The goal of the present study is to unify the existent taxonomy databases and eliminate inconsistencies (errors) that are present in them. Inconsistencies occur particularly in the restructuring of the existent taxonomy databases, since classification rules for constructing the taxonomy have rapidly changed with biological advancements. A repair system is needed to remove inconsistencies in each data bank and mismatches among data banks. This paper describes a new methodology for removing both inconsistencies and mismatches from the databases on a distributed computer environment. The methodology is implemented in a relational database management system, SYBASE.

  11. Design of special purpose database for credit cooperation bank business processing network system

    NASA Astrophysics Data System (ADS)

    Yu, Yongling; Zong, Sisheng; Shi, Jinfa

    2011-12-01

    With the popularization of e-finance in the city, the construction of e-finance is transferring to the vast rural market, developing quickly and in depth. Developing a business processing network system suitable for rural credit cooperative banks can make business processing convenient, and it has a good application prospect. In this paper, we analyse the necessity of adopting a special purpose distributed database in the credit cooperation bank system, give the corresponding distributed database system structure, and design the special purpose database and interface technology. The application in Tongbai Rural Credit Cooperatives has shown that the system has better performance and higher efficiency.

  12. Experimental evaluation of dynamic data allocation strategies in a distributed database with changing workloads

    NASA Technical Reports Server (NTRS)

    Brunstrom, Anna; Leutenegger, Scott T.; Simha, Rahul

    1995-01-01

    Traditionally, allocation of data in distributed database management systems has been determined by off-line analysis and optimization. This technique works well for static database access patterns, but is often inadequate for frequently changing workloads. In this paper we address how to dynamically reallocate data for partitionable distributed databases with changing access patterns. Rather than complicated and expensive optimization algorithms, a simple heuristic is presented and shown, via an implementation study, to improve system throughput by 30 percent in a local area network based system. Based on artificial wide area network delays, we show that dynamic reallocation can improve system throughput by a factor of two and a half for wide area networks. We also show that individual site load must be taken into consideration when reallocating data, and provide a simple policy that incorporates load in the reallocation decision.
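
    A sketch in the spirit of that load-aware policy (the paper's actual heuristic is not reproduced here): move a partition to the site that accesses it most, unless that site is already over an assumed utilization threshold.

    ```python
    LOAD_CAP = 0.8  # assumed utilization threshold

    def choose_site(access_counts, site_load):
        """access_counts: {site: accesses to this partition};
        site_load: {site: current utilization in [0, 1]}."""
        for site in sorted(access_counts, key=access_counts.get, reverse=True):
            if site_load[site] < LOAD_CAP:
                return site        # most-accessing site with headroom wins
        return None                # no candidate; leave the partition in place

    print(choose_site({"A": 120, "B": 45}, {"A": 0.95, "B": 0.40}))  # -> B
    ```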

  13. Building the Infrastructure of Resource Sharing: Union Catalogs, Distributed Search, and Cross-Database Linkage.

    ERIC Educational Resources Information Center

    Lynch, Clifford A.

    1997-01-01

    Union catalogs and distributed search systems are two ways users can locate materials in print and electronic formats. This article examines the advantages and limitations of both approaches and argues that they should be considered complementary rather than competitive. Discusses technologies creating linkage between catalogs and databases and…

  14. DISTRIBUTED STRUCTURE-SEARCHABLE TOXICITY (DSSTOX) DATABASE NETWORK: MAKING PUBLIC TOXICITY DATA RESOURCES MORE ACCESSIBLE AND USABLE FOR DATA EXPLORATION AND SAR DEVELOPMENT

    EPA Science Inventory


    Distributed Structure-Searchable Toxicity (DSSTox) Database Network: Making Public Toxicity Data Resources More Accessible and Usable for Data Exploration and SAR Development

    Many sources of public toxicity data are not currently linked to chemical structure, are not ...

  15. SPANG: a SPARQL client supporting generation and reuse of queries for distributed RDF databases.

    PubMed

    Chiba, Hirokazu; Uchiyama, Ikuo

    2017-02-08

    Toward improved interoperability of distributed biological databases, an increasing number of datasets have been published in the standardized Resource Description Framework (RDF). Although the powerful SPARQL Protocol and RDF Query Language (SPARQL) provides a basis for exploiting RDF databases, writing SPARQL code is burdensome for users, including bioinformaticians. Thus, an easy-to-use interface is necessary. We developed SPANG, a SPARQL client that has unique features for querying RDF datasets. SPANG dynamically generates typical SPARQL queries according to specified arguments. It can also call SPARQL template libraries constructed in a local system or published on the Web. Further, it enables combinatorial execution of multiple queries, each with a distinct target database. These features facilitate easy and effective access to RDF datasets and integrative analysis of distributed data. SPANG helps users to exploit RDF datasets by generation and reuse of SPARQL queries through a simple interface. This client will enhance integrative exploitation of biological RDF datasets distributed across the Web. This software package is freely available at http://purl.org/net/spang.
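    SPANG's own interface is not reproduced here; as a generic illustration of querying a remote RDF endpoint with SPARQL from Python (the SPARQLWrapper package and the public UniProt endpoint URL are assumptions, not part of SPANG):

    ```python
    from SPARQLWrapper import SPARQLWrapper, JSON  # pip install sparqlwrapper

    # Query a public RDF endpoint; any SPARQL endpoint URL could be substituted.
    endpoint = SPARQLWrapper("https://sparql.uniprot.org/sparql")
    endpoint.setQuery("""
        PREFIX up: <http://purl.uniprot.org/core/>
        SELECT ?protein WHERE { ?protein a up:Protein } LIMIT 5
    """)
    endpoint.setReturnFormat(JSON)

    for binding in endpoint.query().convert()["results"]["bindings"]:
        print(binding["protein"]["value"])
    ```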

  16. Compressing DNA sequence databases with coil.

    PubMed

    White, W Timothy J; Hendy, Michael D

    2008-05-20

    Publicly available DNA sequence databases such as GenBank are large, and are growing at an exponential rate. The sheer volume of data being dealt with presents serious storage and data communications problems. Currently, sequence data is usually kept in large "flat files," which are then compressed using standard Lempel-Ziv (gzip) compression - an approach which rarely achieves good compression ratios. While much research has been done on compressing individual DNA sequences, surprisingly little has focused on the compression of entire databases of such sequences. In this study we introduce the sequence database compression software coil. We have designed and implemented a portable software package, coil, for compressing and decompressing DNA sequence databases based on the idea of edit-tree coding. coil is geared towards achieving high compression ratios at the expense of execution time and memory usage during compression - the compression time represents a "one-off investment" whose cost is quickly amortised if the resulting compressed file is transmitted many times. Decompression requires little memory and is extremely fast. We demonstrate a 5% improvement in compression ratio over state-of-the-art general-purpose compression tools for a large GenBank database file containing Expressed Sequence Tag (EST) data. Finally, coil can efficiently encode incremental additions to a sequence database. coil presents a compelling alternative to conventional compression of flat files for the storage and distribution of DNA sequence databases having a narrow distribution of sequence lengths, such as EST data. Increasing compression levels for databases having a wide distribution of sequence lengths is a direction for future work.
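    coil's edit-tree coding is too involved for a short sketch, but the underlying observation (generic Lempel-Ziv compression leaves DNA-specific redundancy unexploited) can be illustrated by comparing gzip on raw sequence text with gzip on a 2-bit-per-base packed encoding; the sequences below are synthetic, and real DNA would compress better still once inter-sequence similarity is exploited as coil does:

    ```python
    import gzip, random

    random.seed(0)
    seqs = ["".join(random.choice("ACGT") for _ in range(500)) for _ in range(200)]
    raw = ("\n".join(seqs)).encode()

    # Pack 4 bases per byte (2 bits each) before generic compression.
    code = {"A": 0, "C": 1, "G": 2, "T": 3}
    bits = [code[b] for s in seqs for b in s]
    packed = bytes(
        bits[i] << 6 | bits[i + 1] << 4 | bits[i + 2] << 2 | bits[i + 3]
        for i in range(0, len(bits) - 3, 4)
    )

    print("gzip(raw)   :", len(gzip.compress(raw)), "bytes")
    print("gzip(packed):", len(gzip.compress(packed)), "bytes")
    ```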

  17. Compressing DNA sequence databases with coil

    PubMed Central

    White, W Timothy J; Hendy, Michael D

    2008-01-01

    Background Publicly available DNA sequence databases such as GenBank are large, and are growing at an exponential rate. The sheer volume of data being dealt with presents serious storage and data communications problems. Currently, sequence data is usually kept in large "flat files," which are then compressed using standard Lempel-Ziv (gzip) compression – an approach which rarely achieves good compression ratios. While much research has been done on compressing individual DNA sequences, surprisingly little has focused on the compression of entire databases of such sequences. In this study we introduce the sequence database compression software coil. Results We have designed and implemented a portable software package, coil, for compressing and decompressing DNA sequence databases based on the idea of edit-tree coding. coil is geared towards achieving high compression ratios at the expense of execution time and memory usage during compression – the compression time represents a "one-off investment" whose cost is quickly amortised if the resulting compressed file is transmitted many times. Decompression requires little memory and is extremely fast. We demonstrate a 5% improvement in compression ratio over state-of-the-art general-purpose compression tools for a large GenBank database file containing Expressed Sequence Tag (EST) data. Finally, coil can efficiently encode incremental additions to a sequence database. Conclusion coil presents a compelling alternative to conventional compression of flat files for the storage and distribution of DNA sequence databases having a narrow distribution of sequence lengths, such as EST data. Increasing compression levels for databases having a wide distribution of sequence lengths is a direction for future work. PMID:18489794

  18. Benchmarking distributed data warehouse solutions for storing genomic variant information

    PubMed Central

    Wiewiórka, Marek S.; Wysakowicz, Dawid P.; Okoniewski, Michał J.

    2017-01-01

    Abstract Genomic-based personalized medicine encompasses storing, analysing and interpreting genomic variants as its central issues. At a time when thousands of patients' sequenced exomes and genomes are becoming available, there is a growing need for efficient database storage and querying. The answer could be the application of modern distributed storage systems and query engines. However, the application of such solutions to large genomic variant databases has not yet been sufficiently explored in the literature. To investigate the effectiveness of modern columnar storage [column-oriented Database Management System (DBMS)] and query engines, we have developed a prototypic genomic variant data warehouse, populated with a large volume of generated genomic variant and phenotypic data. Next, we have benchmarked the performance of a number of combinations of distributed storages and query engines on a set of SQL queries that address biological questions essential for both research and medical applications. In addition, a non-distributed, analytical database (MonetDB) has been used as a baseline. Comparison of query execution times confirms that distributed data warehousing solutions outperform classic relational DBMSs. Moreover, pre-aggregation and further denormalization of data, which reduce the number of distributed join operations, improve query performance by several orders of magnitude. Most of the distributed back-ends offer good performance for complex analytical queries, while the Optimized Row Columnar (ORC) format paired with Presto and Parquet with Spark 2 query engines provide, on average, the lowest execution times. Apache Kudu, on the other hand, is the only solution that guarantees sub-second performance for simple genome range queries returning a small subset of data, where a low-latency response is expected, while still offering decent performance for running analytical queries. In summary, research and clinical applications that require the storage and analysis of variants from thousands of samples can benefit from the scalability and performance of distributed data warehouse solutions. Database URL: https://github.com/ZSI-Bio/variantsdwh PMID:29220442
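    A genome range query of the kind benchmarked (sub-second on Kudu) might look roughly as follows in Spark; the Parquet path, column names, and genomic coordinates are hypothetical, and a Spark 2+ installation is assumed:

    ```python
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("variants-dwh").getOrCreate()

    # Hypothetical variant table stored as Parquet, as in the benchmark.
    spark.read.parquet("/dwh/variants.parquet").createOrReplaceTempView("variants")

    # Simple genome range query returning a small subset of data.
    hits = spark.sql("""
        SELECT sample_id, chrom, pos, ref, alt
        FROM variants
        WHERE chrom = '17' AND pos BETWEEN 41196312 AND 41277500
    """)
    hits.show(10)
    ```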

  19. Collection Fusion Using Bayesian Estimation of a Linear Regression Model in Image Databases on the Web.

    ERIC Educational Resources Information Center

    Kim, Deok-Hwan; Chung, Chin-Wan

    2003-01-01

    Discusses the collection fusion problem of image databases, concerned with retrieving relevant images by content based retrieval from image databases distributed on the Web. Focuses on a metaserver which selects image databases supporting similarity measures and proposes a new algorithm which exploits a probabilistic technique using Bayesian…

  20. The Make 2D-DB II package: conversion of federated two-dimensional gel electrophoresis databases into a relational format and interconnection of distributed databases.

    PubMed

    Mostaguir, Khaled; Hoogland, Christine; Binz, Pierre-Alain; Appel, Ron D

    2003-08-01

    The Make 2D-DB tool has been previously developed to help build federated two-dimensional gel electrophoresis (2-DE) databases on one's own web site. The purpose of our work is to extend the strength of the first package and to build a more efficient environment. Such an environment should be able to fulfill the different needs and requirements arising from both the growing use of 2-DE techniques and the increasing amount of distributed experimental data.

  1. Establishment of an international database for genetic variants in esophageal cancer.

    PubMed

    Vihinen, Mauno

    2016-10-01

    The establishment of a database has been suggested in order to collect, organize, and distribute genetic information about esophageal cancer. The World Organization for Specialized Studies on Diseases of the Esophagus and the Human Variome Project will be in charge of a central database of information about esophageal cancer-related variations from publications, databases, and laboratories; in addition to genetic details, clinical parameters will also be included. The aim will be to get all the central players in research, clinical, and commercial laboratories to contribute. The database will follow established recommendations and guidelines. The database will require a team of dedicated curators with different backgrounds. Numerous layers of systematics will be applied to facilitate computational analyses. The data items will be extensively integrated with other information sources. The database will be distributed as open access to ensure exchange of the data with other databases. Variations will be reported in relation to reference sequences on three levels--DNA, RNA, and protein--whenever applicable. In the first phase, the database will concentrate on genetic variations including both somatic and germline variations for susceptibility genes. Additional types of information can be integrated at a later stage. © 2016 New York Academy of Sciences.

  2. A design for the geoinformatics system

    NASA Astrophysics Data System (ADS)

    Allison, M. L.

    2002-12-01

    Informatics integrates and applies information technologies with scientific and technical disciplines. A geoinformatics system targets the spatially based sciences. The system is not a master database, but will collect pertinent information from disparate databases distributed around the world. Seamless interoperability of databases promises quantum leaps in productivity not only for scientific researchers but also for many areas of society including business and government. The system will incorporate: acquisition of analog and digital legacy data; efficient information and data retrieval mechanisms (via data mining and web services); accessibility to and application of visualization, analysis, and modeling capabilities; online workspace, software, and tutorials; GIS; integration with online scientific journal aggregates and digital libraries; access to real-time data collection and dissemination; user-defined automatic notification and quality control filtering for selection of new resources; and application to field techniques such as mapping. In practical terms, such a system will provide the ability to gather data over the Web from a variety of distributed sources, regardless of computer operating systems, database formats, and servers. Search engines will gather data about any geographic location, above, on, or below ground, covering any geologic time, and at any scale or detail. A distributed network of digital geolibraries can archive permanent copies of databases at risk of being discontinued and those that continue to be maintained by the data authors. The geoinformatics system will generate results from widely distributed sources to function as a dynamic data network. Instead of posting a variety of pre-made tables, charts, or maps based on static databases, the interactive dynamic system creates these products on the fly, each time an inquiry is made, using the latest information in the appropriate databases. Thus, in the dynamic system, a map generated today may differ from one created yesterday and one to be created tomorrow, because the databases used to make it are constantly (and sometimes automatically) being updated.

  3. Mass measurement errors of Fourier-transform mass spectrometry (FTMS): distribution, recalibration, and application.

    PubMed

    Zhang, Jiyang; Ma, Jie; Dou, Lei; Wu, Songfeng; Qian, Xiaohong; Xie, Hongwei; Zhu, Yunping; He, Fuchu

    2009-02-01

    The hybrid linear trap quadrupole Fourier-transform (LTQ-FT) ion cyclotron resonance mass spectrometer, an instrument with high accuracy and resolution, is widely used in the identification and quantification of peptides and proteins. However, time-dependent errors in the system may lead to deterioration of the accuracy of these instruments, negatively influencing the determination of the mass error tolerance (MET) in database searches. Here, a comprehensive discussion of LTQ/FT precursor ion mass error is provided. On the basis of an investigation of the mass error distribution, we propose an improved recalibration formula and introduce a new tool, FTDR (Fourier-transform data recalibration), that employs a graphic user interface (GUI) for automatic calibration. It was found that the calibration could adjust the mass error distribution to more closely approximate a normal distribution and reduce the standard deviation (SD). Consequently, we present a new strategy, LDSF (Large MET database search and small MET filtration), for database search MET specification and validation of database search results. As the name implies, a large-MET database search is conducted and the search results are then filtered using the statistical MET estimated from high-confidence results. By applying this strategy to a standard protein data set and a complex data set, we demonstrate that LDSF can significantly improve the sensitivity of the result validation procedure.
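    A minimal sketch of the LDSF idea (search with a large MET, then filter with a tolerance estimated from high-confidence hits); the error values, normality assumption, and 3-SD window below are illustrative assumptions:

    ```python
    import numpy as np

    rng = np.random.default_rng(1)
    # Hypothetical precursor mass errors (ppm) of high-confidence identifications
    # taken from an initial large-MET search.
    errors_ppm = rng.normal(loc=1.2, scale=0.8, size=2000)

    mu, sd = errors_ppm.mean(), errors_ppm.std(ddof=1)
    lo, hi = mu - 3 * sd, mu + 3 * sd   # statistical MET window

    # Filter the remaining search results with the estimated window.
    candidates = rng.normal(loc=1.2, scale=2.5, size=1000)
    kept = candidates[(candidates >= lo) & (candidates <= hi)]
    print(f"MET window: [{lo:.2f}, {hi:.2f}] ppm; kept {kept.size} of {candidates.size}")
    ```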

  4. The importance of data quality for generating reliable distribution models for rare, elusive, and cryptic species

    Treesearch

    Keith B. Aubry; Catherine M. Raley; Kevin S. McKelvey

    2017-01-01

    The availability of spatially referenced environmental data and species occurrence records in online databases enables practitioners to easily generate species distribution models (SDMs) for a broad array of taxa. Such databases often include occurrence records of unknown reliability, yet little information is available on the influence of data quality on SDMs generated...

  5. DSSTOX WEBSITE LAUNCH: IMPROVING PUBLIC ACCESS TO DATABASES FOR BUILDING STRUCTURE-TOXICITY PREDICTION MODELS

    EPA Science Inventory

    DSSTox Website Launch: Improving Public Access to Databases for Building Structure-Toxicity Prediction Models
    Ann M. Richard
    US Environmental Protection Agency, Research Triangle Park, NC, USA

    Distributed: Decentralized set of standardized, field-delimited databases,...

  6. PROGRESS REPORT ON THE DSSTOX DATABASE NETWORK: NEWLY LAUNCHED WEBSITE, APPLICATIONS, FUTURE PLANS

    EPA Science Inventory

    Progress Report on the DSSTox Database Network: Newly Launched Website, Applications, Future Plans

    Progress will be reported on development of the Distributed Structure-Searchable Toxicity (DSSTox) Database Network and the newly launched public website that coordinates and...

  7. Image Databases.

    ERIC Educational Resources Information Center

    Pettersson, Rune

    Different kinds of pictorial databases are described with respect to aims, user groups, search possibilities, storage, and distribution. Some specific examples are given for databases used for the following purposes: (1) labor markets for artists; (2) document management; (3) telling a story; (4) preservation (archives and museums); (5) research;…

  8. Practical Quantum Private Database Queries Based on Passive Round-Robin Differential Phase-shift Quantum Key Distribution.

    PubMed

    Li, Jian; Yang, Yu-Guang; Chen, Xiu-Bo; Zhou, Yi-Hua; Shi, Wei-Min

    2016-08-19

    A novel quantum private database query protocol is proposed, based on passive round-robin differential phase-shift quantum key distribution. Compared with previous quantum private database query protocols, the present protocol has the following unique merits: (i) the user Alice can obtain one and only one key bit, so that both the efficiency and security of the present protocol can be ensured, and (ii) it does not require changing the length difference of the two arms in a Mach-Zehnder interferometer, and simply chooses two pulses passively to interfere, so that it is much simpler and more practical. The present protocol is also proved to be secure in terms of user security and database security.

  9. Domain Regeneration for Cross-Database Micro-Expression Recognition

    NASA Astrophysics Data System (ADS)

    Zong, Yuan; Zheng, Wenming; Huang, Xiaohua; Shi, Jingang; Cui, Zhen; Zhao, Guoying

    2018-05-01

    In this paper, we investigate the cross-database micro-expression recognition problem, where the training and testing samples are from two different micro-expression databases. Under this setting, the training and testing samples would have different feature distributions, and hence the performance of most existing micro-expression recognition methods may decrease greatly. To solve this problem, we propose a simple yet effective method called the Target Sample Re-Generator (TSRG). By using TSRG, we are able to re-generate the samples from the target micro-expression database such that the re-generated target samples share the same or similar feature distributions with the original source samples. For this reason, we can then use the classifier learned on the labeled source samples to accurately predict the micro-expression categories of the unlabeled target samples. To evaluate the performance of the proposed TSRG method, extensive cross-database micro-expression recognition experiments designed based on the SMIC and CASME II databases are conducted. Compared with recent state-of-the-art cross-database emotion recognition methods, the proposed TSRG achieves more promising results.

  10. Conducting Privacy-Preserving Multivariable Propensity Score Analysis When Patient Covariate Information Is Stored in Separate Locations.

    PubMed

    Bohn, Justin; Eddings, Wesley; Schneeweiss, Sebastian

    2017-03-15

    Distributed networks of health-care data sources are increasingly being utilized to conduct pharmacoepidemiologic database studies. Such networks may contain data that are not physically pooled but instead are distributed horizontally (separate patients within each data source) or vertically (separate measures within each data source) in order to preserve patient privacy. While multivariable methods for the analysis of horizontally distributed data are frequently employed, few practical approaches have been put forth to deal with vertically distributed health-care databases. In this paper, we propose 2 propensity score-based approaches to vertically distributed data analysis and test their performance using 5 example studies. We found that these approaches produced point estimates close to what could be achieved without partitioning. We further found a performance benefit (i.e., lower mean squared error) for sequentially passing a propensity score through each data domain (called the "sequential approach") as compared with fitting separate domain-specific propensity scores (called the "parallel approach"). These results were validated in a small simulation study. This proof-of-concept study suggests a new multivariable analysis approach to vertically distributed health-care databases that is practical, preserves patient privacy, and warrants further investigation for use in clinical research applications that rely on health-care databases. © The Author 2017. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
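    A toy version of the "sequential approach" described above, in which the propensity score fitted in one data domain is passed to the next domain as a single covariate; the synthetic data, two-domain split, and variable names are all assumptions for illustration:

    ```python
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    n = 1000
    x1 = rng.normal(size=(n, 3))          # covariates held by data domain 1
    x2 = rng.normal(size=(n, 2))          # covariates held by data domain 2
    treat = rng.binomial(1, 1 / (1 + np.exp(-(x1[:, 0] + x2[:, 0]))))

    # Domain 1 fits a propensity model on its own covariates only...
    ps1 = LogisticRegression().fit(x1, treat).predict_proba(x1)[:, 1]

    # ...and passes just the score (not the covariates) to domain 2,
    # which refits using the score plus its local covariates.
    features2 = np.column_stack([ps1, x2])
    ps2 = LogisticRegression().fit(features2, treat).predict_proba(features2)[:, 1]

    print("final propensity scores:", np.round(ps2[:5], 3))
    ```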

  11. System for Performing Single Query Searches of Heterogeneous and Dispersed Databases

    NASA Technical Reports Server (NTRS)

    Maluf, David A. (Inventor); Okimura, Takeshi (Inventor); Gurram, Mohana M. (Inventor); Tran, Vu Hoang (Inventor); Knight, Christopher D. (Inventor); Trinh, Anh Ngoc (Inventor)

    2017-01-01

    The present invention is a distributed computer system of heterogeneous databases joined in an information grid and configured with Application Programming Interface hardware which includes a search engine component for performing user-structured queries on multiple heterogeneous databases in real time. This invention reduces the overhead associated with the impedance mismatch that commonly occurs in heterogeneous database queries.

  12. Organization and dissemination of multimedia medical databases on the WWW.

    PubMed

    Todorovski, L; Ribaric, S; Dimec, J; Hudomalj, E; Lunder, T

    1999-01-01

    In this paper, we focus on the problem of building and disseminating multimedia medical databases on the World Wide Web (WWW). We present the current results of an ongoing project building a prototype dermatology image database and its WWW presentation. The dermatology database is part of an ambitious plan concerning the organization of a network of medical institutions building distributed and federated multimedia databases on a much wider scale.

  13. Recent advances on terrain database correlation testing

    NASA Astrophysics Data System (ADS)

    Sakude, Milton T.; Schiavone, Guy A.; Morelos-Borja, Hector; Martin, Glenn; Cortes, Art

    1998-08-01

    Terrain database correlation is a major requirement for interoperability in distributed simulation. There are numerous situations in which terrain database correlation problems can occur that, in turn, lead to a lack of interoperability in distributed training simulations. Examples are the use of different run-time terrain databases derived from inconsistent source data, the use of different resolutions, and the use of different data models between databases for both terrain and culture data. IST has been developing a suite of software tools, named ZCAP, to address terrain database interoperability issues. In this paper we discuss recent enhancements made to this suite, including improved algorithms for sampling and calculating line-of-sight, an improved method for measuring terrain roughness, and the application of a sparse matrix method to the terrain remediation solution developed at the Visual Systems Lab of the Institute for Simulation and Training. We review the application of some of these new algorithms to the terrain correlation measurement processes. The application of these new algorithms improves our support for very large terrain databases, and provides the capability for performing test replications to estimate the sampling error of the tests. With this set of tools, a user can quantitatively assess the degree of correlation between large terrain databases.

  14. Development, deployment and operations of ATLAS databases

    NASA Astrophysics Data System (ADS)

    Vaniachine, A. V.; Schmitt, J. G. v. d.

    2008-07-01

    In preparation for ATLAS data taking, a coordinated shift from development towards operations has occurred in ATLAS database activities. In addition to development and commissioning activities in databases, ATLAS is active in the development and deployment (in collaboration with the WLCG 3D project) of the tools that allow the worldwide distribution and installation of databases and related datasets, as well as the actual operation of this system on ATLAS multi-grid infrastructure. We describe development and commissioning of major ATLAS database applications for online and offline. We present the first scalability test results and ramp-up schedule over the initial LHC years of operations towards the nominal year of ATLAS running, when the database storage volumes are expected to reach 6.1 TB for the Tag DB and 1.0 TB for the Conditions DB. ATLAS database applications require robust operational infrastructure for data replication between online and offline at Tier-0, and for the distribution of the offline data to Tier-1 and Tier-2 computing centers. We describe ATLAS experience with Oracle Streams and other technologies for coordinated replication of databases in the framework of the WLCG 3D services.

  15. PIGD: a database for intronless genes in the Poaceae.

    PubMed

    Yan, Hanwei; Jiang, Cuiping; Li, Xiaoyu; Sheng, Lei; Dong, Qing; Peng, Xiaojian; Li, Qian; Zhao, Yang; Jiang, Haiyang; Cheng, Beijiu

    2014-10-01

    Intronless genes are a feature of prokaryotes; however, they are widespread and unequally distributed among eukaryotes and represent an important resource to study the evolution of gene architecture. Although many databases on exons and introns exist, there is currently no cohesive resource that collects intronless genes in plants into a single database. In this study, we present the Poaceae Intronless Genes Database (PIGD), a user-friendly web interface to explore information on intronless genes from different plants. Five Poaceae species, Sorghum bicolor, Zea mays, Setaria italica, Panicum virgatum and Brachypodium distachyon, are included in the current release of PIGD. Gene annotations and sequence data were collected and integrated from different databases. The primary focus of this study was to provide gene descriptions and gene product records. In addition, functional annotations, subcellular localization prediction and taxonomic distribution are reported. PIGD allows users to readily browse, search and download data. BLAST and comparative analyses are also provided through this online database, which is available at http://pigd.ahau.edu.cn/. PIGD provides a solid platform for the collection, integration and analysis of intronless genes in the Poaceae. As such, this database will be useful for subsequent bio-computational analysis in comparative genomics and evolutionary studies.

  16. Virtual Queue in a Centralized Database Environment

    NASA Astrophysics Data System (ADS)

    Kar, Amitava; Pal, Dibyendu Kumar

    2010-10-01

    Today is the era of the Internet. Almost anything, whether gathering knowledge, planning a holiday, or booking a ticket, can be done online. This paper calculates various queuing measures for bookings or purchases made through the Internet, subject to limits on the number of tickets or seats. Such transactions involve many database activities, such as reads and writes. The paper treats the time involved in requesting a service as the arrival process and the time involved in providing the required information as the service process, and on that basis estimates the arrival and service distributions and the various queuing measures. For simplicity, the database is taken to be centralized, as the alternative concept of a distributed database would complicate the calculation.
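    The abstract does not name a specific queuing model; assuming the simplest M/M/1 case purely for illustration, the standard measures it refers to can be computed as:

    ```python
    def mm1_measures(arrival_rate, service_rate):
        """Standard M/M/1 queuing measures; requires arrival_rate < service_rate."""
        rho = arrival_rate / service_rate          # server utilisation
        L = rho / (1 - rho)                        # mean number in system
        Lq = rho**2 / (1 - rho)                    # mean number in queue
        W = 1 / (service_rate - arrival_rate)      # mean time in system
        Wq = rho / (service_rate - arrival_rate)   # mean waiting time
        return {"rho": rho, "L": L, "Lq": Lq, "W": W, "Wq": Wq}

    # e.g. 8 booking requests/s arriving, 10 requests/s served by the database
    print(mm1_measures(8.0, 10.0))
    ```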

  17. Digital Video of Live-Scan Fingerprint Data

    National Institute of Standards and Technology Data Gateway

    NIST Digital Video of Live-Scan Fingerprint Data (PC database for purchase)   NIST Special Database 24 contains MPEG-2 (Moving Picture Experts Group) compressed digital video of live-scan fingerprint data. The database is being distributed for use in developing and testing of fingerprint verification systems.

  18. "Mr. Database" : Jim Gray and the History of Database Technologies.

    PubMed

    Hanwahr, Nils C

    2017-12-01

    Although the widespread use of the term "Big Data" is comparatively recent, it invokes a phenomenon in the developments of database technology with distinct historical contexts. The database engineer Jim Gray, known as "Mr. Database" in Silicon Valley before his disappearance at sea in 2007, was involved in many of the crucial developments since the 1970s that constitute the foundation of exceedingly large and distributed databases. Jim Gray was involved in the development of relational database systems based on the concepts of Edgar F. Codd at IBM in the 1970s before he went on to develop principles of Transaction Processing that enable the parallel and highly distributed performance of databases today. He was also involved in creating forums for discourse between academia and industry, which influenced industry performance standards as well as database research agendas. As a co-founder of the San Francisco branch of Microsoft Research, Gray increasingly turned toward scientific applications of database technologies, e.g., leading the TerraServer project, an online database of satellite images. Inspired by Vannevar Bush's idea of the memex, Gray laid out his vision of a Personal Memex as well as a World Memex, eventually postulating a new era of data-based scientific discovery termed "Fourth Paradigm Science". This article gives an overview of Gray's contributions to the development of database technology as well as his research agendas and shows that central notions of Big Data have been occupying database engineers for much longer than the actual term has been in use.

  19. Security in the CernVM File System and the Frontier Distributed Database Caching System

    NASA Astrophysics Data System (ADS)

    Dykstra, D.; Blomer, J.

    2014-06-01

    Both the CernVM File System (CVMFS) and the Frontier Distributed Database Caching System (Frontier) distribute centrally updated data worldwide for LHC experiments using http proxy caches. Neither system provides privacy or access control on reading the data, but both control access to updates of the data and can guarantee the authenticity and integrity of the data transferred to clients over the internet. CVMFS has since its early days required digital signatures and secure hashes on all distributed data, and recently Frontier has added X.509-based authenticity and integrity checking. In this paper we detail and compare the security models of CVMFS and Frontier.
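    The mechanism common to both systems (clients verify the authenticity and integrity of data fetched through untrusted proxy caches against securely distributed digests) reduces to a hash comparison; the sketch below uses SHA-256 as an illustration and is not CVMFS's or Frontier's actual catalog or wire format:

    ```python
    import hashlib

    def verify(content: bytes, expected_sha256: str) -> bool:
        """Accept data from an untrusted proxy cache only if its hash matches
        a digest obtained over a trusted (signed) channel."""
        return hashlib.sha256(content).hexdigest() == expected_sha256

    blob = b"conditions-db payload fetched via http proxy"
    trusted_digest = hashlib.sha256(blob).hexdigest()  # stands in for a signed catalog entry

    print(verify(blob, trusted_digest))                 # True: content intact
    print(verify(blob + b"tampered", trusted_digest))   # False: reject modified data
    ```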

  20. Application of new type of distributed multimedia databases to networked electronic museum

    NASA Astrophysics Data System (ADS)

    Kuroda, Kazuhide; Komatsu, Naohisa; Komiya, Kazumi; Ikeda, Hiroaki

    1999-01-01

    Recently, various kinds of multimedia application systems have been actively developed, building on advanced high-speed communication networks, computer processing technologies, and digital content-handling technologies. Against this background, this paper proposes a new distributed multimedia database system which can effectively perform a new function of cooperative retrieval among distributed databases. The proposed system introduces a new concept of a 'retrieval manager', which functions as an intelligent controller so that the user can recognize a set of distributed databases as one logical database. The logical database dynamically generates and performs a preferred combination of retrieval parameters on the basis of both directory data and the system environment. Moreover, a concept of 'domain' is defined in the system as a managing unit of retrieval. Retrieval can be performed effectively by cooperative processing among multiple domains. Communication language and protocols are also defined in the system. These are used in every action for communications in the system. A language interpreter in each machine translates the communication language into the internal language used by that machine. Using the language interpreter, internal modules such as the DBMS and user interface can be freely selected. A concept of 'content-set' is also introduced. A content-set is defined as a package of contents. Contents in a content-set are related to each other. The system handles a content-set as one object. The user terminal can effectively control the display of retrieved contents, referring to data indicating the relations among the contents in the content-set. In order to verify the function of the proposed system, a networked electronic museum was experimentally built. The results of this experiment indicate that the proposed system can effectively retrieve the desired contents under the control of a number of distributed domains. The results also indicate that the system can work effectively even as the system grows large.

  1. The distribution of common construction materials at risk to acid deposition in the United States

    NASA Astrophysics Data System (ADS)

    Lipfert, Frederick W.; Daum, Mary L.

    Information on the geographic distribution of various types of exposed materials is required to estimate the economic costs of damage to construction materials from acid deposition. This paper focuses on the identification, evaluation and interpretation of data describing the distributions of exterior construction materials, primarily in the United States. This information could provide guidance on how data needed for future economic assessments might be acquired in the most cost-effective ways. Materials distribution surveys from 16 cities in the U.S. and Canada and five related databases from government agencies and trade organizations were examined. Data on residential buildings are more commonly available than on nonresidential buildings; little geographically resolved information on distributions of materials in infrastructure was found. Survey results generally agree with the appropriate ancillary databases, but the usefulness of the databases is often limited by their coarse spatial resolution. Information on those materials which are most sensitive to acid deposition is especially scarce. Since a comprehensive error analysis has never been performed on the data required for an economic assessment, it is not possible to specify the corresponding detailed requirements for data on the distributions of materials.

  2. Maritime Operations in Disconnected, Intermittent, and Low-Bandwidth Environments

    DTIC Science & Technology

    2013-06-01

    of a Dynamic Distributed Database (DDD) is a core element enabling the distributed operation of networks and applications, as described in this...document. The DDD is a database containing all the relevant information required to reconfigure the applications, routing, and other network services...optimize application configuration. Figure 5 gives a snapshot of entries in the DDD. In current testing, the DDD is replicated using Domino

  3. A knowledge base architecture for distributed knowledge agents

    NASA Technical Reports Server (NTRS)

    Riedesel, Joel; Walls, Bryan

    1990-01-01

    A tuple space based object oriented model for knowledge base representation and interpretation is presented. An architecture for managing distributed knowledge agents is then implemented within the model. The general model is based upon a database implementation of a tuple space. Objects are then defined as an additional layer upon the database. The tuple space may or may not be distributed depending upon the database implementation. A language for representing knowledge and inference strategy is defined whose implementation takes advantage of the tuple space. The general model may then be instantiated in many different forms, each of which may be a distinct knowledge agent. Knowledge agents may communicate using tuple space mechanisms as in the LINDA model as well as using more well known message passing mechanisms. An implementation of the model is presented describing strategies used to keep inference tractable without giving up expressivity. An example applied to a power management and distribution network for Space Station Freedom is given.
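    A minimal in-process tuple space with the LINDA-style operations the abstract refers to (out, rd, in); this is an illustrative toy, not the paper's database-backed implementation:

    ```python
    import threading

    class TupleSpace:
        def __init__(self):
            self._tuples, self._cv = [], threading.Condition()

        def out(self, tup):                     # publish a tuple
            with self._cv:
                self._tuples.append(tup)
                self._cv.notify_all()

        def _match(self, pattern):              # None acts as a wildcard field
            for t in self._tuples:
                if len(t) == len(pattern) and all(
                    p is None or p == v for p, v in zip(pattern, t)
                ):
                    return t
            return None

        def rd(self, pattern):                  # blocking read, non-destructive
            with self._cv:
                while (t := self._match(pattern)) is None:
                    self._cv.wait()
                return t

        def in_(self, pattern):                 # blocking read, destructive
            with self._cv:
                while (t := self._match(pattern)) is None:
                    self._cv.wait()
                self._tuples.remove(t)
                return t

    ts = TupleSpace()
    ts.out(("sensor", "bus-A", 28.4))
    print(ts.in_(("sensor", "bus-A", None)))   # -> ('sensor', 'bus-A', 28.4)
    ```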

  4. Study on distributed generation algorithm of variable precision concept lattice based on ontology heterogeneous database

    NASA Astrophysics Data System (ADS)

    WANG, Qingrong; ZHU, Changfeng

    2017-06-01

    Integration of distributed heterogeneous data sources is a key issue in big data applications. In this paper, the strategy of variable precision is introduced to the concept lattice, and a one-to-one mapping between the variable precision concept lattice and the ontology concept lattice is constructed to produce a local ontology by building the variable precision concept lattice for each subsystem. A distributed generation algorithm for variable precision concept lattices based on an ontology heterogeneous database is then proposed, drawing on the special relationship between concept lattices and ontology construction. Finally, taking the main concept lattice generated from the existing heterogeneous database as a standard, a case study was carried out to verify the feasibility and validity of the algorithm, and the differences between the main concept lattice and the standard concept lattice were compared. The analysis results show that the algorithm can automatically construct distributed concept lattices over heterogeneous data sources.

  5. Practical Quantum Private Database Queries Based on Passive Round-Robin Differential Phase-shift Quantum Key Distribution

    PubMed Central

    Li, Jian; Yang, Yu-Guang; Chen, Xiu-Bo; Zhou, Yi-Hua; Shi, Wei-Min

    2016-01-01

    A novel quantum private database query protocol is proposed, based on passive round-robin differential phase-shift quantum key distribution. Compared with previous quantum private database query protocols, the present protocol has the following unique merits: (i) the user Alice can obtain one and only one key bit, so that both the efficiency and security of the present protocol can be ensured, and (ii) it does not require changing the length difference of the two arms in a Mach-Zehnder interferometer, and simply chooses two pulses passively to interfere, so that it is much simpler and more practical. The present protocol is also proved to be secure in terms of user security and database security. PMID:27539654

  6. Information resources at the National Center for Biotechnology Information.

    PubMed Central

    Woodsmall, R M; Benson, D A

    1993-01-01

    The National Center for Biotechnology Information (NCBI), part of the National Library of Medicine, was established in 1988 to perform basic research in the field of computational molecular biology as well as build and distribute molecular biology databases. The basic research has led to new algorithms and analysis tools for interpreting genomic data and has been instrumental in the discovery of human disease genes for neurofibromatosis and Kallmann syndrome. The principal database responsibility is the National Institutes of Health (NIH) genetic sequence database, GenBank. NCBI, in collaboration with international partners, builds, distributes, and provides online and CD-ROM access to over 112,000 DNA sequences. Another major program is the integration of multiple sequence databases and related bibliographic information and the development of network-based retrieval systems for Internet access. PMID:8374583

  7. Evaluation and validity of a LORETA normative EEG database.

    PubMed

    Thatcher, R W; North, D; Biver, C

    2005-04-01

    To evaluate the reliability and validity of a Z-score normative EEG database for Low Resolution Electromagnetic Tomography (LORETA), EEG digital samples (2-second intervals sampled at 128 Hz, 1 to 2 minutes eyes closed) were acquired from 106 normal subjects, and the cross-spectrum was computed and multiplied by the Key Institute's LORETA 2,394 gray matter pixel T Matrix. After a log10 transform or a Box-Cox transform, the mean and standard deviation of the *.lor files were computed for each of the 2,394 gray matter pixels, from 1 to 30 Hz, for each of the subjects. Tests of Gaussianity were computed in order to best approximate a normal distribution for each frequency and gray matter pixel. The relative sensitivity of a Z-score database was computed by measuring the approximation to a Gaussian distribution. The validity of the LORETA normative database was evaluated by the degree to which confirmed brain pathologies were localized using the LORETA normative database. Log10 and Box-Cox transforms approximated a Gaussian distribution in the range of 95.64% to 99.75% accuracy. The percentage of normative Z-score values at 2 standard deviations ranged from 1.21% to 3.54%, and the percentage of Z-scores at 3 standard deviations ranged from 0% to 0.83%. Left temporal lobe epilepsy, right sensory motor hematoma and a right hemisphere stroke exhibited maximum Z-score deviations in the same locations as the pathologies. We conclude: (1) adequate approximation to a Gaussian distribution can be achieved using LORETA by using a log10 transform or a Box-Cox transform and parametric statistics, (2) a Z-score normative database is valid with adequate sensitivity when using LORETA, and (3) the Z-score LORETA normative database also consistently localized known pathologies to the expected Brodmann areas as a hypothesis test based on the surface EEG before computing LORETA.
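    In symbols, the per-pixel, per-frequency normative comparison amounts to the following (a minimal rendering, assuming the log10 variant of the transform):

    ```latex
    Z_{f,p} \;=\; \frac{\log_{10} x_{f,p} \;-\; \mu_{f,p}}{\sigma_{f,p}},
    \qquad f = 1,\dots,30 \text{ Hz}, \quad p = 1,\dots,2394,
    ```

    where \mu_{f,p} and \sigma_{f,p} are the normative mean and standard deviation of the transformed values for frequency f and gray matter pixel p.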

  8. GPCALMA: A Tool For Mammography With A GRID-Connected Distributed Database

    NASA Astrophysics Data System (ADS)

    Bottigli, U.; Cerello, P.; Cheran, S.; Delogu, P.; Fantacci, M. E.; Fauci, F.; Golosio, B.; Lauria, A.; Lopez Torres, E.; Magro, R.; Masala, G. L.; Oliva, P.; Palmiero, R.; Raso, G.; Retico, A.; Stumbo, S.; Tangaro, S.

    2003-09-01

    The GPCALMA (Grid Platform for Computer Assisted Library for MAmmography) collaboration involves several departments of physics, INFN (National Institute of Nuclear Physics) sections, and Italian hospitals. The aim of this collaboration is to develop a tool that can help radiologists in the early detection of breast cancer. GPCALMA has built a large distributed database of digitised mammographic images (about 5500 images corresponding to 1650 patients) and developed a CAD (Computer Aided Detection) software which is integrated in a station that can also be used to acquire new images, serve as an archive, and perform statistical analyses. The images (18×24 cm², digitised by a CCD linear scanner with an 85 μm pitch and 4096 gray levels) are completely described: pathological ones have a characterization consistent with the radiologist's diagnosis and histological data, while non-pathological ones correspond to patients with a follow-up of at least three years. The distributed database is realized through the connection of all the hospitals and research centers using GRID technology. In each hospital, local patients' digital images are stored in the local database. Using the GRID connection, GPCALMA will allow each node to work on distributed database data as well as local database data. Using its database, the GPCALMA tools perform several analyses. A texture analysis, i.e. an automated classification of adipose, dense or glandular texture, can be provided by the system. GPCALMA software also allows classification of pathological features, in particular massive lesions (both opacities and spiculated lesions) analysis and microcalcification cluster analysis. The detection of pathological features is made using neural network software that provides a selection of areas showing a given "suspicion level" of lesion occurrence. The performance of the GPCALMA system will be presented in terms of ROC (Receiver Operating Characteristic) curves. The results of the GPCALMA system as a "second reader" will also be presented.

  9. Apollo2Go: a web service adapter for the Apollo genome viewer to enable distributed genome annotation.

    PubMed

    Klee, Kathrin; Ernst, Rebecca; Spannagl, Manuel; Mayer, Klaus F X

    2007-08-30

    Apollo, a genome annotation viewer and editor, has become a widely used genome annotation and visualization tool for distributed genome annotation projects. When using Apollo for annotation, database updates are carried out by uploading intermediate annotation files into the respective database. This non-direct database upload is laborious and evokes problems of data synchronicity. To overcome these limitations we extended the Apollo data adapter with a generic, configurable web service client that is able to retrieve annotation data in a GAME-XML-formatted string and pass it on to Apollo's internal input routine. This Apollo web service adapter, Apollo2Go, simplifies the data exchange in distributed projects and aims to render the annotation process more comfortable. The Apollo2Go software is freely available from ftp://ftpmips.gsf.de/plants/apollo_webservice.

  10. Apollo2Go: a web service adapter for the Apollo genome viewer to enable distributed genome annotation

    PubMed Central

    Klee, Kathrin; Ernst, Rebecca; Spannagl, Manuel; Mayer, Klaus FX

    2007-01-01

    Background Apollo, a genome annotation viewer and editor, has become a widely used genome annotation and visualization tool for distributed genome annotation projects. When using Apollo for annotation, database updates are carried out by uploading intermediate annotation files into the respective database. This non-direct database upload is laborious and evokes problems of data synchronicity. Results To overcome these limitations we extended the Apollo data adapter with a generic, configurable web service client that is able to retrieve annotation data in a GAME-XML-formatted string and pass it on to Apollo's internal input routine. Conclusion This Apollo web service adapter, Apollo2Go, simplifies the data exchange in distributed projects and aims to render the annotation process more comfortable. The Apollo2Go software is freely available from ftp://ftpmips.gsf.de/plants/apollo_webservice. PMID:17760972

  11. Constructing distributed Hippocratic video databases for privacy-preserving online patient training and counseling.

    PubMed

    Peng, Jinye; Babaguchi, Noboru; Luo, Hangzai; Gao, Yuli; Fan, Jianping

    2010-07-01

    Digital video now plays an important role in supporting more profitable online patient training and counseling, and integration of patient training videos from multiple competitive organizations in the health care network will result in better offerings for patients. However, privacy concerns often prevent multiple competitive organizations from sharing and integrating their patient training videos. In addition, patients with infectious or chronic diseases may not want the online patient training organizations to identify who they are or even which video clips they are interested in. Thus, there is an urgent need to develop more effective techniques to protect both video content privacy and access privacy. In this paper, we have developed a new approach to constructing a distributed Hippocratic video database system for supporting more profitable online patient training and counseling. First, a new database modeling approach is developed to support concept-oriented video database organization and automatically assign a degree of privacy of the video content for each database level. Second, a new algorithm is developed to protect video content privacy at the level of the individual video clip by automatically filtering out privacy-sensitive human objects. In order to integrate the patient training videos from multiple competitive organizations for constructing a centralized video database indexing structure, a privacy-preserving video sharing scheme is developed to support privacy-preserving distributed classifier training and prevent statistical inferences from the videos that are shared for cross-validation of video classifiers. Our experiments on large-scale video databases have also provided very convincing results.

  12. Database Development for Ocean Impacts: Imaging, Outreach, and Rapid Response

    DTIC Science & Technology

    2012-09-30

    DISTRIBUTION STATEMENT A. Approved for public release; distribution is unlimited. Database Development for Ocean Impacts: Imaging, Outreach, and Rapid Response. ...hoses (Applied Ocean Physics & Engineering Department, WHOI) to evaluate wear and locate in mooring optical cables used in the Right Whale monitoring

  13. Relativistic quantum private database queries

    NASA Astrophysics Data System (ADS)

    Sun, Si-Jia; Yang, Yu-Guang; Zhang, Ming-Ou

    2015-04-01

    Recently, Jakobi et al. (Phys Rev A 83, 022301, 2011) suggested the first practical private database query protocol (J-protocol) based on the Scarani et al. (Phys Rev Lett 92, 057901, 2004) quantum key distribution protocol. Unfortunately, the J-protocol is just a cheat-sensitive private database query protocol. In this paper, we present an idealized relativistic quantum private database query protocol based on Minkowski causality and the properties of quantum information. Also, we prove that the protocol is secure in terms of the user security and the database security.

  14. The Network Configuration of an Object Relational Database Management System

    NASA Technical Reports Server (NTRS)

    Diaz, Philip; Harris, W. C.

    2000-01-01

    The networking and implementation of the Oracle Database Management System (ODBMS) requires developers to have knowledge of the UNIX operating system as well as all the features of the Oracle Server. The server is an object relational database management system (DBMS). By using distributed processing, processes are split up between the database server and client application programs. The DBMS handles all the responsibilities of the server. The workstations running the database application concentrate on the interpretation and display of data.

  15. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Langan, Roisin T.; Archibald, Richard K.; Lamberti, Vincent

    We have applied a new imputation-based method for analyzing incomplete data, called Monte Carlo Bayesian Database Generation (MCBDG), to the Spent Fuel Isotopic Composition (SFCOMPO) database. About 60% of the entries are absent for SFCOMPO. The method estimates missing values of a property from a probability distribution created from the existing data for the property, and then generates multiple instances of the completed database for training a machine learning algorithm. Uncertainty in the data is represented by an empirical or an assumed error distribution. The method makes few assumptions about the underlying data, and compares favorably against results obtained by replacing missing information with constant values.
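    A crude sketch of the imputation step as described (sample each missing value from the empirical distribution of the observed values for that property, and repeat to obtain multiple completed instances); the columns and values below are fabricated, and the full MCBDG method additionally models error distributions:

    ```python
    import numpy as np
    import pandas as pd

    rng = np.random.default_rng(42)
    df = pd.DataFrame({
        "burnup":     [12.1, np.nan, 30.5, np.nan, 18.9],
        "enrichment": [3.2, 4.1, np.nan, 3.6, np.nan],
    })

    def complete_once(frame, rng):
        """Fill each missing entry by sampling the column's observed values."""
        out = frame.copy()
        for col in out.columns:
            observed = out[col].dropna().to_numpy()
            mask = out[col].isna()
            out.loc[mask, col] = rng.choice(observed, size=mask.sum())
        return out

    # Multiple completed instances of the database, as in MCBDG.
    instances = [complete_once(df, rng) for _ in range(3)]
    print(instances[0])
    ```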

  16. GlobTherm, a global database on thermal tolerances for aquatic and terrestrial organisms.

    PubMed

    Bennett, Joanne M; Calosi, Piero; Clusella-Trullas, Susana; Martínez, Brezo; Sunday, Jennifer; Algar, Adam C; Araújo, Miguel B; Hawkins, Bradford A; Keith, Sally; Kühn, Ingolf; Rahbek, Carsten; Rodríguez, Laura; Singer, Alexander; Villalobos, Fabricio; Ángel Olalla-Tárraga, Miguel; Morales-Castilla, Ignacio

    2018-03-13

    How climate affects species distributions is a longstanding question receiving renewed interest owing to the need to predict the impacts of global warming on biodiversity. Is climate change forcing species to live near their critical thermal limits? Are these limits likely to change through natural selection? These and other important questions can be addressed with models relating geographical distributions of species with climate data, but inferences made with these models are highly contingent on non-climatic factors such as biotic interactions. Improved understanding of climate change effects on species will require extensive analysis of thermal physiological traits, but such data are both scarce and scattered. To overcome current limitations, we created the GlobTherm database. The database contains experimentally derived species' thermal tolerance data currently comprising over 2,000 species of terrestrial, freshwater, intertidal and marine multicellular algae, plants, fungi, and animals. The GlobTherm database will be maintained and curated by iDiv with the aim to keep expanding it, and enable further investigations on the effects of climate on the distribution of life on Earth.

  17. Distributed data collection for a database of radiological image interpretations

    NASA Astrophysics Data System (ADS)

    Long, L. Rodney; Ostchega, Yechiam; Goh, Gin-Hua; Thoma, George R.

    1997-01-01

    The National Library of Medicine, in collaboration with the National Center for Health Statistics and the National Institute for Arthritis and Musculoskeletal and Skin Diseases, has built a system for collecting radiological interpretations for a large set of x-ray images acquired as part of the data gathered in the second National Health and Nutrition Examination Survey. This system is capable of delivering across the Internet 5- and 10-megabyte x-ray images to Sun workstations equipped with X Window based 2048 × 2560 image displays, for the purpose of having these images interpreted for the degree of presence of particular osteoarthritic conditions in the cervical and lumbar spines. The collected interpretations can then be stored in a database at the National Library of Medicine, under control of the Illustra DBMS. This system is a client/server database application which integrates (1) distributed server processing of client requests, (2) a customized image transmission method for faster Internet data delivery, (3) distributed client workstations with high resolution displays, image processing functions and an on-line digital atlas, and (4) relational database management of the collected data.

  18. Library Micro-Computing, Vol. 2. Reprints from the Best of "ONLINE" [and]"DATABASE."

    ERIC Educational Resources Information Center

    Online, Inc., Weston, CT.

    Reprints of 19 articles pertaining to library microcomputing appear in this collection, the second of two volumes on this topic in a series of volumes of reprints from "ONLINE" and "DATABASE" magazines. Edited for information professionals who use electronically distributed databases, these articles address such topics as: (1)…

  19. Web Database Development: Implications for Academic Publishing.

    ERIC Educational Resources Information Center

    Fernekes, Bob

    This paper discusses the preliminary planning, design, and development of a pilot project to create an Internet accessible database and search tool for locating and distributing company data and scholarly work. Team members established four project objectives: (1) to develop a Web accessible database and decision tool that creates Web pages on the…

  20. Prevalence and geographical distribution of Usher syndrome in Germany.

    PubMed

    Spandau, Ulrich H M; Rohrschneider, Klaus

    2002-06-01

    To estimate the prevalence of Usher syndrome in Heidelberg and Mannheim and to map its geographical distribution in Germany. Usher syndrome patients were ascertained through the databases of the Low Vision Department at the University of Heidelberg, and of the patient support group Pro Retina. Ophthalmic and audiologic examinations and medical records were used to classify patients into one of the subtypes. The database of the University of Heidelberg contains 247 Usher syndrome patients, 63 with Usher syndrome type 1 (USH1) and 184 with Usher syndrome type 2 (USH2). The USH1:USH2 ratio in the Heidelberg database was 1:3. The Pro Retina database includes 248 Usher syndrome patients, 21 with USH1 and 227 with USH2. The total number of Usher syndrome patients was 424, with 75 USH1 and 349 USH2 patients; 71 patients were in both databases. The prevalence of Usher syndrome in Heidelberg and suburbs was calculated to be 6.2 per 100,000 inhabitants. There seems to be a homogeneous distribution in Germany for both subtypes. Knowledge of the high prevalence of Usher syndrome, with up to 5,000 patients in Germany, should lead to increased awareness and timely diagnosis by ophthalmologists and otologists. It should also ensure that these patients receive good support through hearing and vision aids.

  1. Geologic map and map database of parts of Marin, San Francisco, Alameda, Contra Costa, and Sonoma counties, California

    USGS Publications Warehouse

    Blake, M.C.; Jones, D.L.; Graymer, R.W.; digital database by Soule, Adam

    2000-01-01

    This digital map database, compiled from previously published and unpublished data, and new mapping by the authors, represents the general distribution of bedrock and surficial deposits in the mapped area. Together with the accompanying text file (mageo.txt, mageo.pdf, or mageo.ps), it provides current information on the geologic structure and stratigraphy of the area covered. The database delineates map units that are identified by general age and lithology following the stratigraphic nomenclature of the U.S. Geological Survey. The scale of the source maps limits the spatial resolution (scale) of the database to 1:62,500 or smaller.

  2. FishTraits Database

    USGS Publications Warehouse

    Angermeier, Paul L.; Frimpong, Emmanuel A.

    2009-01-01

    The need for integrated and widely accessible sources of species traits data to facilitate studies of ecology, conservation, and management has motivated development of traits databases for various taxa. In spite of the increasing number of traits-based analyses of freshwater fishes in the United States, no consolidated database of traits of this group exists publicly, and much useful information on these species is documented only in obscure sources. The largely inaccessible and unconsolidated traits information makes large-scale analysis involving many fishes and/or traits particularly challenging. FishTraits is a database of >100 traits for 809 (731 native and 78 exotic) fish species found in freshwaters of the conterminous United States, including 37 native families and 145 native genera. The database contains information on four major categories of traits: (1) trophic ecology, (2) body size and reproductive ecology (life history), (3) habitat associations, and (4) salinity and temperature tolerances. Information on geographic distribution and conservation status is also included. Together, we refer to the traits, distribution, and conservation status information as attributes. Descriptions of attributes are available here. Many sources were consulted to compile attributes, including state and regional species accounts and other databases.

  3. Income distribution patterns from a complete social security database

    NASA Astrophysics Data System (ADS)

    Derzsy, N.; Néda, Z.; Santos, M. A.

    2012-11-01

    We analyze the income distribution of employees for 9 consecutive years (2001-2009) using a complete social security database for an economically important district of Romania. The database contains detailed information on more than half a million taxpayers, including their monthly salaries from all employers where they worked. Besides studying the characteristic distribution functions in the high and low/medium income limits, the database allows a detailed dynamical study by following the time evolution of the taxpayers’ income. To our knowledge, this is the first extensive study of this kind (a previous Japanese taxpayers survey was limited to two years). In the high income limit we prove once again the validity of Pareto’s law, obtaining a perfect scaling on four orders of magnitude in the rank for all the studied years. The obtained Pareto exponents are quite stable with values around α≈2.5, in spite of the fact that during this period the economy developed rapidly and a financial-economic crisis hit Romania in 2007-2008. For the low and medium income categories we confirmed the exponential-type income distribution. Following the income of employees in time, we have found that the top of the income distribution is a highly dynamical region with strong fluctuations in the rank. In this region, the observed dynamics is consistent with a multiplicative random growth hypothesis. Contrary to previous results obtained for Japanese employees, we find that the logarithmic growth rate is not independent of the income.
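
    A hedged sketch of the rank-size regression behind a Pareto-exponent estimate of this kind, run on synthetic data (the actual social security records are not public, and the paper's fitting procedure may differ):

        import numpy as np

        rng = np.random.default_rng(0)
        # Synthetic stand-in for the top incomes (the real data are not public).
        incomes = rng.pareto(2.5, 5000) + 1.0

        # Rank-size view of the upper tail: for a Pareto law P(X > x) ~ x^(-alpha),
        # log(rank) is linear in log(income) with slope -alpha.
        top = np.sort(incomes)[::-1][:1000]
        ranks = np.arange(1, top.size + 1)
        slope, intercept = np.polyfit(np.log(top), np.log(ranks), 1)
        alpha_hat = -slope
        print(f"estimated Pareto exponent: {alpha_hat:.2f}")  # should be near 2.5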

  4. Extending GIS Technology to Study Karst Features of Southeastern Minnesota

    NASA Astrophysics Data System (ADS)

    Gao, Y.; Tipping, R. G.; Alexander, E. C.; Alexander, S. C.

    2001-12-01

    This paper summarizes ongoing research on karst feature distribution in southeastern Minnesota. The main goals of this interdisciplinary research are: 1) to look for large-scale patterns in the rate and distribution of sinkhole development; 2) to conduct statistical tests of hypotheses about the formation of sinkholes; 3) to create management tools for land-use managers and planners; and 4) to deliver geomorphic and hydrogeologic criteria for making scientifically valid land-use policies and ethical decisions in karst areas of southeastern Minnesota. Existing county and sub-county karst feature datasets of southeastern Minnesota have been assembled into a large GIS-based database capable of analyzing the entire data set. The central database management system (DBMS) is a relational GIS-based system interacting with three modules: GIS, statistical and hydrogeologic modules. ArcInfo and ArcView were used to generate a series of 2D and 3D maps depicting karst feature distributions in southeastern Minnesota. IRIS Explorer™ was used to produce 3D maps and animations from data exported from the GIS-based database. Nearest-neighbor analysis has been used to test sinkhole distributions in different topographic and geologic settings. All nearest-neighbor analyses performed to date indicate that sinkholes in southeastern Minnesota are not evenly distributed (i.e., they tend to be clustered). More detailed statistical methods such as cluster analysis, histograms, probability estimation, correlation and regression have been used to study the spatial distributions of some mapped karst features of southeastern Minnesota. A sinkhole probability map for Goodhue County has been constructed based on sinkhole distribution, bedrock geology, depth to bedrock, GIS buffer analysis and nearest-neighbor analysis. A series of karst features for Winona County, including sinkholes, springs, seeps, stream sinks and outcrops, has been mapped and entered into the Karst Feature Database of Southeastern Minnesota, which is being expanded to include all the mapped karst features of southeastern Minnesota. Air photos from the 1930s to the 1990s of the Spring Valley Cavern Area in Fillmore County were scanned and geo-referenced into our GIS system. This technique has proved very useful for identifying sinkholes and studying the rate of sinkhole development.
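
    As an illustration of the nearest-neighbor clustering test mentioned above, a sketch of the Clark-Evans ratio on synthetic point patterns (the paper's exact formulation is not given, so this form is an assumption):

        import numpy as np
        from scipy.spatial import cKDTree

        def clark_evans_ratio(points, area):
            """Nearest-neighbor ratio R: R < 1 suggests clustering, R ~ 1 randomness."""
            tree = cKDTree(points)
            # k=2 because the nearest neighbor of each point is itself at distance 0.
            dists, _ = tree.query(points, k=2)
            observed = dists[:, 1].mean()
            expected = 0.5 / np.sqrt(len(points) / area)  # mean NN distance under randomness
            return observed / expected

        rng = np.random.default_rng(1)
        random_pts = rng.uniform(0, 100, size=(500, 2))
        clustered = rng.normal(loc=(50, 50), scale=2.0, size=(500, 2))
        print(clark_evans_ratio(random_pts, 100 * 100))  # close to 1.0: random
        print(clark_evans_ratio(clustered, 100 * 100))   # well below 1: clustered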

  5. The Raid distributed database system

    NASA Technical Reports Server (NTRS)

    Bhargava, Bharat; Riedl, John

    1989-01-01

    Raid, a robust and adaptable distributed database system for transaction processing (TP), is described. Raid is a message-passing system, with server processes on each site to manage concurrent processing, consistent replicated copies during site failures, and atomic distributed commitment. A high-level layered communications package provides a clean location-independent interface between servers. The latest design of the package delivers messages via shared memory in a configuration with several servers linked into a single process. Raid provides the infrastructure to investigate various methods for supporting reliable distributed TP. Measurements on TP and server CPU time are presented, along with data from experiments on communications software, consistent replicated copy control during site failures, and concurrent distributed checkpointing. A software tool for evaluating the implementation of TP algorithms in an operating-system kernel is proposed.
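
    A minimal sketch (hypothetical, not the actual Raid code) of the location-independent messaging interface described above, where delivery between servers linked into a single process is a shared in-memory queue and a network transport could be substituted transparently:

        import queue

        class Endpoint:
            """Location-independent mailbox: senders address servers by name only."""
            def __init__(self):
                self.inbox = queue.Queue()

        class Communicator:
            """Stand-in for a layered communications package: the send() interface
            is the same whether delivery is in-memory or over a network."""
            def __init__(self):
                self.registry = {}

            def register(self, name):
                self.registry[name] = Endpoint()
                return self.registry[name]

            def send(self, dest, message):
                # Same interface regardless of whether dest is local or remote.
                self.registry[dest].inbox.put(message)

        comm = Communicator()
        cc = comm.register("concurrency-controller")
        comm.register("replication-controller")
        comm.send("concurrency-controller", ("lock-request", "tx42", "item7"))
        print(cc.inbox.get())  # ('lock-request', 'tx42', 'item7')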

  6. Distributed processor allocation for launching applications in a massively connected processors complex

    DOEpatents

    Pedretti, Kevin

    2008-11-18

    A compute processor allocator architecture for allocating compute processors to run applications in a multiple processor computing apparatus is distributed among a subset of processors within the computing apparatus. Each processor of the subset includes a compute processor allocator. The compute processor allocators can share a common database of information pertinent to compute processor allocation. A communication path permits retrieval of information from the database independently of the compute processor allocators.

  7. Practical private database queries based on a quantum-key-distribution protocol

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jakobi, Markus; Humboldt-Universitaet zu Berlin, D-10117 Berlin; Simon, Christoph

    2011-02-15

    Private queries allow a user, Alice, to learn an element of a database held by a provider, Bob, without revealing which element she is interested in, while limiting her information about the other elements. We propose to implement private queries based on a quantum-key-distribution protocol, with changes only in the classical postprocessing of the key. This approach makes our scheme both easy to implement and loss tolerant. While unconditionally secure private queries are known to be impossible, we argue that an interesting degree of security can be achieved by relying on fundamental physical principles instead of unverifiable security assumptions in order to protect both the user and the database. We think that the scope exists for such practical private queries to become another remarkable application of quantum information in the footsteps of quantum key distribution.

  8. A revision of the distribution of sea kraits (Reptilia, Laticauda) with an updated occurrence dataset for ecological and conservation research

    PubMed Central

    Gherghel, Iulian; Papeş, Monica; Brischoux, François; Sahlean, Tiberiu; Strugariu, Alexandru

    2016-01-01

    Abstract The genus Laticauda (Reptilia: Elapidae), commonly known as sea kraits, comprises eight species of marine amphibious snakes distributed along the shores of the Western Pacific Ocean and the Eastern Indian Ocean. We review the information available on the geographic range of sea kraits and analyze their distribution patterns. Generally, we found that south and south-west of Japan, Philippines Archipelago, parts of Indonesia, and Vanuatu have the highest diversity of sea krait species. Further, we compiled the information available on sea kraits’ occurrences from a variety of sources, including museum records, field surveys, and the scientific literature. The final database comprises 694 occurrence records, with Laticauda colubrina having the highest number of records and Laticauda schistorhyncha the lowest. The occurrence records were georeferenced and compiled as a database for each sea krait species. This database can be freely used for future studies. PMID:27110155

  9. A revision of the distribution of sea kraits (Reptilia, Laticauda) with an updated occurrence dataset for ecological and conservation research.

    PubMed

    Gherghel, Iulian; Papeş, Monica; Brischoux, François; Sahlean, Tiberiu; Strugariu, Alexandru

    2016-01-01

    The genus Laticauda (Reptilia: Elapidae), commonly known as sea kraits, comprises eight species of marine amphibious snakes distributed along the shores of the Western Pacific Ocean and the Eastern Indian Ocean. We review the information available on the geographic range of sea kraits and analyze their distribution patterns. Generally, we found that south and south-west of Japan, Philippines Archipelago, parts of Indonesia, and Vanuatu have the highest diversity of sea krait species. Further, we compiled the information available on sea kraits' occurrences from a variety of sources, including museum records, field surveys, and the scientific literature. The final database comprises 694 occurrence records, with Laticauda colubrina having the highest number of records and Laticauda schistorhyncha the lowest. The occurrence records were georeferenced and compiled as a database for each sea krait species. This database can be freely used for future studies.

  10. Mugshot Identification Database (MID)

    National Institute of Standards and Technology Data Gateway

    NIST Mugshot Identification Database (MID) (Web, free access)   NIST Special Database 18 is being distributed for use in development and testing of automated mugshot identification systems. The database consists of three CD-ROMs, containing a total of 3248 images of variable size using lossless compression. A newer version of the compression/decompression software on the CD-ROM can be found at the website http://www.nist.gov/itl/iad/ig/nigos.cfm as part of the NBIS package.

  11. Database Entity Persistence with Hibernate for the Network Connectivity Analysis Model

    DTIC Science & Technology

    2014-04-01

    time savings in the Java coding development process. Appendices A and B describe address setup procedures for installing the MySQL database...development environment is required:
    • The open source MySQL Database Management System (DBMS) from Oracle, which is a Java Database Connectivity (JDBC...compliant DBMS
    • MySQL JDBC Driver library that comes as a plug-in with the Netbeans distribution
    • The latest Java Development Kit with the latest

  12. DNApod: DNA polymorphism annotation database from next-generation sequence read archives.

    PubMed

    Mochizuki, Takako; Tanizawa, Yasuhiro; Fujisawa, Takatomo; Ohta, Tazro; Nikoh, Naruo; Shimizu, Tokurou; Toyoda, Atsushi; Fujiyama, Asao; Kurata, Nori; Nagasaki, Hideki; Kaminuma, Eli; Nakamura, Yasukazu

    2017-01-01

    With the rapid advances in next-generation sequencing (NGS), datasets for DNA polymorphisms among various species and strains have been produced, stored, and distributed. However, reliability varies among these datasets because the experimental and analytical conditions used differ among assays. Furthermore, such datasets have been frequently distributed from the websites of individual sequencing projects. It is desirable to integrate DNA polymorphism data into one database featuring uniform quality control that is distributed from a single platform at a single place. DNA polymorphism annotation database (DNApod; http://tga.nig.ac.jp/dnapod/) is an integrated database that stores genome-wide DNA polymorphism datasets acquired under uniform analytical conditions, and this includes uniformity in the quality of the raw data, the reference genome version, and evaluation algorithms. DNApod genotypic data are re-analyzed whole-genome shotgun datasets extracted from sequence read archives, and DNApod distributes genome-wide DNA polymorphism datasets and known-gene annotations for each DNA polymorphism. This new database was developed for storing genome-wide DNA polymorphism datasets of plants, with crops being the first priority. Here, we describe our analyzed data for 679, 404, and 66 strains of rice, maize, and sorghum, respectively. The analytical methods are available as a DNApod workflow in an NGS annotation system of the DNA Data Bank of Japan and a virtual machine image. Furthermore, DNApod provides tables of links of identifiers between DNApod genotypic data and public phenotypic data. To advance the sharing of organism knowledge, DNApod offers basic and ubiquitous functions for multiple alignment and phylogenetic tree construction by using orthologous gene information.

  13. DNApod: DNA polymorphism annotation database from next-generation sequence read archives

    PubMed Central

    Mochizuki, Takako; Tanizawa, Yasuhiro; Fujisawa, Takatomo; Ohta, Tazro; Nikoh, Naruo; Shimizu, Tokurou; Toyoda, Atsushi; Fujiyama, Asao; Kurata, Nori; Nagasaki, Hideki; Kaminuma, Eli; Nakamura, Yasukazu

    2017-01-01

    With the rapid advances in next-generation sequencing (NGS), datasets for DNA polymorphisms among various species and strains have been produced, stored, and distributed. However, reliability varies among these datasets because the experimental and analytical conditions used differ among assays. Furthermore, such datasets have been frequently distributed from the websites of individual sequencing projects. It is desirable to integrate DNA polymorphism data into one database featuring uniform quality control that is distributed from a single platform at a single place. DNA polymorphism annotation database (DNApod; http://tga.nig.ac.jp/dnapod/) is an integrated database that stores genome-wide DNA polymorphism datasets acquired under uniform analytical conditions, and this includes uniformity in the quality of the raw data, the reference genome version, and evaluation algorithms. DNApod genotypic data are re-analyzed whole-genome shotgun datasets extracted from sequence read archives, and DNApod distributes genome-wide DNA polymorphism datasets and known-gene annotations for each DNA polymorphism. This new database was developed for storing genome-wide DNA polymorphism datasets of plants, with crops being the first priority. Here, we describe our analyzed data for 679, 404, and 66 strains of rice, maize, and sorghum, respectively. The analytical methods are available as a DNApod workflow in an NGS annotation system of the DNA Data Bank of Japan and a virtual machine image. Furthermore, DNApod provides tables of links of identifiers between DNApod genotypic data and public phenotypic data. To advance the sharing of organism knowledge, DNApod offers basic and ubiquitous functions for multiple alignment and phylogenetic tree construction by using orthologous gene information. PMID:28234924

  14. Database Creation and Statistical Analysis: Finding Connections Between Two or More Secondary Storage Device

    DTIC Science & Technology

    2017-09-01

    Naval Postgraduate School thesis, Monterey, California. Approved for public release; distribution is unlimited. [Front-matter and table-of-contents residue; recoverable headings: 1.1 Problem and Motivation; 1.2 DOD Applicability; 1.3 Research...]

  15. Checkpointing and Recovery in Distributed and Database Systems

    ERIC Educational Resources Information Center

    Wu, Jiang

    2011-01-01

    A transaction-consistent global checkpoint of a database records a state of the database which reflects the effect of only completed transactions and not the results of any partially executed transactions. This thesis establishes the necessary and sufficient conditions for a checkpoint of a data item (or the checkpoints of a set of data items) to…

  16. Library Micro-Computing, Vol. 1. Reprints from the Best of "ONLINE" [and]"DATABASE."

    ERIC Educational Resources Information Center

    Online, Inc., Weston, CT.

    Reprints of 18 articles pertaining to library microcomputing appear in this collection, the first of two volumes on this topic in a series of volumes of reprints from "ONLINE" and "DATABASE" magazines. Edited for information professionals who use electronically distributed databases, these articles address such topics as: (1) an integrated library…

  17. An Improved Algorithm to Generate a Wi-Fi Fingerprint Database for Indoor Positioning

    PubMed Central

    Chen, Lina; Li, Binghao; Zhao, Kai; Rizos, Chris; Zheng, Zhengqi

    2013-01-01

    The major problem of Wi-Fi fingerprint-based positioning technology is the signal strength fingerprint database creation and maintenance. The significant temporal variation of received signal strength (RSS) is the main factor responsible for the positioning error. A probabilistic approach can be used, but the RSS distribution is required. The Gaussian distribution or an empirically-derived distribution (histogram) is typically used. However, these distributions are either not always correct or require a large amount of data for each reference point. Double peaks of the RSS distribution have been observed in experiments at some reference points. In this paper a new algorithm based on an improved double-peak Gaussian distribution is proposed. Kurtosis testing is used to decide if this new distribution, or the normal Gaussian distribution, should be applied. Test results show that the proposed algorithm can significantly improve the positioning accuracy, as well as reduce the workload of the off-line data training phase. PMID:23966197
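
    A sketch of the kurtosis-driven model choice the abstract describes, with a two-component Gaussian mixture standing in for the paper's improved double-peak distribution (the threshold and parameterization are illustrative assumptions):

        import numpy as np
        from scipy.stats import kurtosis
        from sklearn.mixture import GaussianMixture

        def fit_rss_model(rss, kurtosis_threshold=-0.5):
            """Pick a single Gaussian or a two-peak mixture for RSS samples at one
            reference point. The threshold is illustrative, not the paper's value."""
            rss = np.asarray(rss, dtype=float)
            # A strongly platykurtic (flat/bimodal) sample hints at double peaks.
            if kurtosis(rss) < kurtosis_threshold:
                gmm = GaussianMixture(n_components=2).fit(rss.reshape(-1, 1))
                return "double-peak", gmm
            return "gaussian", (rss.mean(), rss.std())

        rng = np.random.default_rng(2)
        single = rng.normal(-60, 2, 300)
        double = np.concatenate([rng.normal(-65, 1.5, 150), rng.normal(-55, 1.5, 150)])
        print(fit_rss_model(single)[0])  # gaussian
        print(fit_rss_model(double)[0])  # double-peak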

  18. An improved algorithm to generate a Wi-Fi fingerprint database for indoor positioning.

    PubMed

    Chen, Lina; Li, Binghao; Zhao, Kai; Rizos, Chris; Zheng, Zhengqi

    2013-08-21

    The major problem of Wi-Fi fingerprint-based positioning technology is the signal strength fingerprint database creation and maintenance. The significant temporal variation of received signal strength (RSS) is the main factor responsible for the positioning error. A probabilistic approach can be used, but the RSS distribution is required. The Gaussian distribution or an empirically-derived distribution (histogram) is typically used. However, these distributions are either not always correct or require a large amount of data for each reference point. Double peaks of the RSS distribution have been observed in experiments at some reference points. In this paper a new algorithm based on an improved double-peak Gaussian distribution is proposed. Kurtosis testing is used to decide if this new distribution, or the normal Gaussian distribution, should be applied. Test results show that the proposed algorithm can significantly improve the positioning accuracy, as well as reduce the workload of the off-line data training phase.

  19. Privacy-Aware Location Database Service for Granular Queries

    NASA Astrophysics Data System (ADS)

    Kiyomoto, Shinsaku; Martin, Keith M.; Fukushima, Kazuhide

    Future mobile markets are expected to increasingly embrace location-based services. This paper presents a new system architecture for location-based services, which consists of a location database and distributed location anonymizers. The service is privacy-aware in the sense that the location database always maintains a degree of anonymity. The location database service permits three different levels of query and can thus be used to implement a wide range of location-based services. Furthermore, the architecture is scalable and employs simple functions that are similar to those found in general database systems.
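
    The abstract names three query levels without defining them; a purely hypothetical sketch of what level-dependent granularity could look like:

        # Hypothetical illustration of "three levels of query" granularity; the
        # paper does not define the exact levels, so these are assumptions.
        locations = {"alice": (35.6581, 139.7017), "bob": (35.0116, 135.7681)}

        def query(user, level):
            lat, lon = locations[user]
            if level == "exact":          # full-precision coordinates
                return (lat, lon)
            if level == "coarse":         # ~11 km grid cell: one decimal place
                return (round(lat, 1), round(lon, 1))
            if level == "presence":      # only whether the user is registered
                return user in locations
            raise ValueError(level)

        print(query("alice", "exact"))     # (35.6581, 139.7017)
        print(query("alice", "coarse"))    # (35.7, 139.7)
        print(query("alice", "presence"))  # True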

  20. Nuclear Forensics Analysis with Missing and Uncertain Data

    DOE PAGES

    Langan, Roisin T.; Archibald, Richard K.; Lamberti, Vincent

    2015-10-05

    We have applied a new imputation-based method for analyzing incomplete data, called Monte Carlo Bayesian Database Generation (MCBDG), to the Spent Fuel Isotopic Composition (SFCOMPO) database. About 60% of the entries are absent for SFCOMPO. The method estimates missing values of a property from a probability distribution created from the existing data for the property, and then generates multiple instances of the completed database for training a machine learning algorithm. Uncertainty in the data is represented by an empirical or an assumed error distribution. The method makes few assumptions about the underlying data, and compares favorably against results obtained by replacing missing information with constant values.
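
    A minimal sketch of the imputation idea as stated in the abstract: draw each missing value from the empirical distribution of the observed values (optionally perturbed by an assumed error distribution) and emit multiple completed copies. This is not the MCBDG code; all names are illustrative:

        import numpy as np

        def generate_completed_databases(data, n_copies=5, noise_scale=0.0, seed=0):
            """Monte Carlo imputation in the spirit of MCBDG (illustrative only):
            each missing value (NaN) is drawn from the empirical distribution of
            the observed values in its column, plus an assumed error term."""
            rng = np.random.default_rng(seed)
            copies = []
            for _ in range(n_copies):
                filled = data.copy()
                for j in range(data.shape[1]):
                    col = data[:, j]
                    missing = np.isnan(col)
                    draws = rng.choice(col[~missing], size=missing.sum())
                    draws = draws + rng.normal(0.0, noise_scale, size=draws.size)
                    filled[missing, j] = draws
                copies.append(filled)
            return copies

        data = np.array([[1.0, 10.0], [2.0, np.nan], [np.nan, 30.0], [4.0, 40.0]])
        for db in generate_completed_databases(data, n_copies=2, noise_scale=0.1):
            print(db)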

  1. A multidisciplinary database for global distribution

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wolfe, P.J.

    The issue of selenium toxicity in the environment has been documented in the scientific literature for over 50 years. Recent studies reveal a complex connection between selenium and human and animal populations. This article introduces a bibliographic citation database on selenium in the environment developed for global distribution via the Internet by the University of Wyoming Libraries. The database incorporates material from commercial sources, print abstracts, indexes, and U.S. government literature, resulting in a multidisciplinary resource. Relevant disciplines include biology, medicine, veterinary science, botany, chemistry, geology, pollution, aquatic sciences, ecology, and others. It covers the years 1985-1996 for most subject material, with additional years being added as resources permit.

  2. An approach for access differentiation design in medical distributed applications built on databases.

    PubMed

    Shoukourian, S K; Vasilyan, A M; Avagyan, A A; Shukurian, A K

    1999-01-01

    A formalized "top to bottom" design approach was described in [1] for distributed applications built on databases, which were considered as a medium between virtual and real user environments for a specific medical application. Merging different components within a unified distributed application posits new essential problems for software. Particularly protection tools, which are sufficient separately, become deficient during the integration due to specific additional links and relationships not considered formerly. E.g., it is impossible to protect a shared object in the virtual operating room using only DBMS protection tools, if the object is stored as a record in DB tables. The solution of the problem should be found only within the more general application framework. Appropriate tools are absent or unavailable. The present paper suggests a detailed outline of a design and testing toolset for access differentiation systems (ADS) in distributed medical applications which use databases. The appropriate formal model as well as tools for its mapping to a DMBS are suggested. Remote users connected via global networks are considered too.

  3. Technology and Its Use in Education: Present Roles and Future Prospects

    ERIC Educational Resources Information Center

    Courville, Keith

    2011-01-01

    (Purpose) This article describes two current trends in Educational Technology: distributed learning and electronic databases. (Findings) Topics addressed in this paper include: (1) distributed learning as a means of professional development; (2) distributed learning for content visualization; (3) usage of distributed learning for educational…

  4. MIPS: a database for protein sequences, homology data and yeast genome information.

    PubMed Central

    Mewes, H W; Albermann, K; Heumann, K; Liebl, S; Pfeiffer, F

    1997-01-01

    The MIPS group (Martinsried Institute for Protein Sequences) at the Max-Planck-Institute for Biochemistry, Martinsried near Munich, Germany, collects, processes and distributes protein sequence data within the framework of the tripartite association of the PIR-International Protein Sequence Database (,). MIPS contributes nearly 50% of the data input to the PIR-International Protein Sequence Database. The database is distributed on CD-ROM together with PATCHX, an exhaustive supplement of unique, unverified protein sequences from external sources compiled by MIPS. Through its WWW server (http://www.mips.biochem.mpg.de/ ) MIPS permits internet access to sequence databases, homology data and to yeast genome information. (i) Sequence similarity results from the FASTA program () are stored in the FASTA database for all proteins from PIR-International and PATCHX. The database is dynamically maintained and permits instant access to FASTA results. (ii) Starting with FASTA database queries, proteins have been classified into families and superfamilies (PROT-FAM). (iii) The HPT (hashed position tree) data structure () developed at MIPS is a new approach for rapid sequence and pattern searching. (iv) MIPS provides access to the sequence and annotation of the complete yeast genome (), the functional classification of yeast genes (FunCat) and its graphical display, the 'Genome Browser' (). A CD-ROM based on the JAVA programming language providing dynamic interactive access to the yeast genome and the related protein sequences has been compiled and is available on request. PMID:9016498

  5. Database for Simulation of Electron Spectra for Surface Analysis (SESSA)

    National Institute of Standards and Technology Data Gateway

    SRD 100 Database for Simulation of Electron Spectra for Surface Analysis (SESSA) (PC database for purchase)   This database has been designed to facilitate quantitative interpretation of Auger-electron and X-ray photoelectron spectra and to improve the accuracy of quantitation in routine analysis. The database contains all physical data needed to perform quantitative interpretation of an electron spectrum for a thin-film specimen of given composition. A simulation module provides an estimate of peak intensities as well as the energy and angular distributions of the emitted electron flux.

  6. An incremental database access method for autonomous interoperable databases

    NASA Technical Reports Server (NTRS)

    Roussopoulos, Nicholas; Sellis, Timos

    1994-01-01

    We investigated a number of design and performance issues of interoperable database management systems (DBMS's). The major results of our investigation were obtained in the areas of client-server database architectures for heterogeneous DBMS's, incremental computation models, buffer management techniques, and query optimization. We finished a prototype of an advanced client-server workstation-based DBMS which allows access to multiple heterogeneous commercial DBMS's. Experiments and simulations were then run to compare its performance with the standard client-server architectures. The focus of this research was on adaptive optimization methods of heterogeneous database systems. Adaptive buffer management accounts for the random and object-oriented access methods for which no known characterization of the access patterns exists. Adaptive query optimization means that value distributions and selectives, which play the most significant role in query plan evaluation, are continuously refined to reflect the actual values as opposed to static ones that are computed off-line. Query feedback is a concept that was first introduced to the literature by our group. We employed query feedback for both adaptive buffer management and for computing value distributions and selectivities. For adaptive buffer management, we use the page faults of prior executions to achieve more 'informed' management decisions. For the estimation of the distributions of the selectivities, we use curve-fitting techniques, such as least squares and splines, for regressing on these values.
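
    A sketch of query feedback applied to selectivity estimation, using least-squares curve fitting as the abstract mentions (the class and refinement policy are assumptions, not the prototype's design):

        import numpy as np

        class FeedbackSelectivityEstimator:
            """Refine a selectivity curve sel(v) from query feedback: after each
            query 'attr <= v', record the fraction of rows actually returned and
            refit a low-degree polynomial (least squares) to all observations."""
            def __init__(self, degree=3):
                self.degree = degree
                self.values, self.selectivities = [], []
                self.coeffs = None

            def record_feedback(self, v, observed_selectivity):
                self.values.append(v)
                self.selectivities.append(observed_selectivity)
                if len(self.values) > self.degree:
                    self.coeffs = np.polyfit(self.values, self.selectivities, self.degree)

            def estimate(self, v):
                if self.coeffs is None:
                    return 0.5  # uninformed prior before enough feedback arrives
                return float(np.clip(np.polyval(self.coeffs, v), 0.0, 1.0))

        est = FeedbackSelectivityEstimator()
        # Feedback from prior executions: the true selectivity here is v/100.
        for v in (10, 30, 50, 70, 90):
            est.record_feedback(v, v / 100.0)
        print(est.estimate(60))  # ~0.6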

  7. The StarLite Project Prototyping Real-Time Software

    DTIC Science & Technology

    1991-10-01

    multiversion data objects using the prototyping environment. Section 5 concludes the paper. 2. Message-Based Simulation When prototyping distributed...phase locking and priority-based synchronization algorithms, and between a multiversion database and its corresponding single-version database, through...its deadline, since the transaction is only aborted in the validation phase. 4.5. A Multiversion Database System To illustrate the effectiveness of the

  8. Distributing the ERIC Database on SilverPlatter Compact Disc--A Brief Case History.

    ERIC Educational Resources Information Center

    Brandhorst, Ted

    This description of the development of the Education Resources Information Center (ERIC) compact disc by two companies, SilverPlatter and ORI, Inc., provides background information on ERIC and the ERIC database, discusses reasons for choosing to put the ERIC database on compact discs, and describes the formulation of an ERIC CD-ROM team as part of…

  9. Chesapeake Bay Program Water Quality Database

    EPA Pesticide Factsheets

    The Chesapeake Information Management System (CIMS), designed in 1996, is an integrated, accessible information management system for the Chesapeake Bay Region. CIMS is an organized, distributed library of information and software tools designed to increase basin-wide public access to Chesapeake Bay information. The information delivered by CIMS includes technical and public information, educational material, environmental indicators, policy documents, and scientific data. Through the use of relational databases, web-based programming, and web-based GIS a large number of Internet resources have been established. These resources include multiple distributed on-line databases, on-demand graphing and mapping of environmental data, and geographic searching tools for environmental information. Baseline monitoring data, summarized data and environmental indicators that document ecosystem status and trends, confirm linkages between water quality, habitat quality and abundance, and the distribution and integrity of biological populations are also available. One of the major features of the CIMS network is the Chesapeake Bay Program's Data Hub, providing users access to a suite of long- term water quality and living resources databases. Chesapeake Bay mainstem and tidal tributary water quality, benthic macroinvertebrates, toxics, plankton, and fluorescence data can be obtained for a network of over 800 monitoring stations.

  10. The phytophthora genome initiative database: informatics and analysis for distributed pathogenomic research.

    PubMed

    Waugh, M; Hraber, P; Weller, J; Wu, Y; Chen, G; Inman, J; Kiphart, D; Sobral, B

    2000-01-01

    The Phytophthora Genome Initiative (PGI) is a distributed collaboration to study the genome and evolution of a particularly destructive group of plant-pathogenic oomycetes, with the goal of understanding the mechanisms of infection and resistance. NCGR provides informatics support for the collaboration as well as a centralized data repository. In the pilot phase of the project, several investigators prepared Phytophthora infestans and Phytophthora sojae EST and Phytophthora sojae BAC libraries and sent them to another laboratory for sequencing. Data from sequencing reactions were transferred to NCGR for analysis and curation. An analysis pipeline transforms raw data by performing simple analyses (i.e., vector removal and similarity searching) that are stored and can be retrieved by investigators using a web browser. Here we describe the database and access tools, provide an overview of the data therein and outline future plans. This resource has provided a unique opportunity for the distributed, collaborative study of a genus from which relatively little sequence data are available. Results may lead to insight into how better to control these pathogens. The homepage of PGI can be accessed at http://www.ncgr.org/pgi, with database access through the database access hyperlink.

  11. Towards G2G: Systems of Technology Database Systems

    NASA Technical Reports Server (NTRS)

    Maluf, David A.; Bell, David

    2005-01-01

    We present an approach and methodology for developing Government-to-Government (G2G) Systems of Technology Database Systems. G2G will deliver technologies for distributed and remote integration of technology data for internal use in analysis and planning as well as for external communications. G2G enables NASA managers, engineers, operational teams and information systems to "compose" technology roadmaps and plans by selecting, combining, extending, specializing and modifying components of technology database systems. G2G will interoperate information and knowledge distributed across the organizational entities involved, which is ideal for NASA's future Exploration Enterprise. Key contributions of the G2G system will include the creation of an integrated approach that sustains effective management of technology investments while allowing the various technology database systems to be independently managed. The integration technology will comply with emerging open standards. Applications can thus be customized for local needs while enabling an integrated management-of-technology approach that serves the global needs of NASA. The G2G capabilities will use NASA's breakthrough in database "composition" and integration technology, will use and advance emerging open standards, and will use commercial information technologies to enable effective Systems of Technology Database Systems.

  12. Advanced technologies for scalable ATLAS conditions database access on the grid

    NASA Astrophysics Data System (ADS)

    Basset, R.; Canali, L.; Dimitrov, G.; Girone, M.; Hawkings, R.; Nevski, P.; Valassi, A.; Vaniachine, A.; Viegas, F.; Walker, R.; Wong, A.

    2010-04-01

    During massive data reprocessing operations an ATLAS Conditions Database application must support concurrent access from numerous ATLAS data processing jobs running on the Grid. By simulating realistic work-flow, ATLAS database scalability tests provided feedback for Conditions Db software optimization and allowed precise determination of required distributed database resources. In distributed data processing one must take into account the chaotic nature of Grid computing characterized by peak loads, which can be much higher than average access rates. To validate database performance at peak loads, we tested database scalability at very high concurrent jobs rates. This has been achieved through coordinated database stress tests performed in series of ATLAS reprocessing exercises at the Tier-1 sites. The goal of database stress tests is to detect scalability limits of the hardware deployed at the Tier-1 sites, so that the server overload conditions can be safely avoided in a production environment. Our analysis of server performance under stress tests indicates that Conditions Db data access is limited by the disk I/O throughput. An unacceptable side-effect of the disk I/O saturation is a degradation of the WLCG 3D Services that update Conditions Db data at all ten ATLAS Tier-1 sites using the technology of Oracle Streams. To avoid such bottlenecks we prototyped and tested a novel approach for database peak load avoidance in Grid computing. Our approach is based upon the proven idea of pilot job submission on the Grid: instead of the actual query, an ATLAS utility library sends to the database server a pilot query first.
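
    A hedged sketch of the pilot-query idea: probe the server cheaply first and back off when it looks saturated. The probe, threshold, and backoff policy here are illustrative assumptions, not the ATLAS utility library:

        import random
        import time

        def server_load():
            """Stand-in for a cheap pilot probe of the database server; a real
            system might time a trivial query or read a load metric."""
            return random.uniform(0.0, 1.0)

        def run_with_pilot(actual_query, max_load=0.8, base_backoff=1.0, retries=5):
            """Send a pilot first; run the real query only when the server is
            below the load threshold, otherwise back off exponentially."""
            for attempt in range(retries):
                if server_load() < max_load:
                    return actual_query()
                time.sleep(base_backoff * (2 ** attempt))  # exponential backoff
            raise RuntimeError("server stayed overloaded; giving up")

        print(run_with_pilot(lambda: "conditions payload"))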

  13. Quantum partial search for uneven distribution of multiple target items

    NASA Astrophysics Data System (ADS)

    Zhang, Kun; Korepin, Vladimir

    2018-06-01

    Quantum partial search algorithm is an approximate search. It aims to find a target block (which has the target items). It runs a little faster than full Grover search. In this paper, we consider quantum partial search algorithm for multiple target items unevenly distributed in a database (target blocks have different number of target items). The algorithm we describe can locate one of the target blocks. Efficiency of the algorithm is measured by number of queries to the oracle. We optimize the algorithm in order to improve efficiency. By perturbation method, we find that the algorithm runs the fastest when target items are evenly distributed in database.

  14. Study on Big Database Construction and its Application of Sample Data Collected in CHINA'S First National Geographic Conditions Census Based on Remote Sensing Images

    NASA Astrophysics Data System (ADS)

    Cheng, T.; Zhou, X.; Jia, Y.; Yang, G.; Bai, J.

    2018-04-01

    In the project of China's First National Geographic Conditions Census, millions of sample data have been collected across the country for interpreting land cover from remote sensing images; the number of data files exceeds 12,000,000 and has continued to grow in the follow-on project of National Geographic Conditions Monitoring. At present, storing such big data in a database such as Oracle is the most effective method; however, an applicable method matters even more for managing and applying the sample data. This paper studies a database construction method based on a relational database combined with a distributed file system, in which the vector data and the file data are saved in different physical locations. The key issues and their solutions are discussed. On this basis, the paper studies methods for applying the sample data and analyzes several use cases, which lays the foundation for the application of sample data. In particular, sample data from Shaanxi province were selected to verify the method. Taking the 10 first-level classes defined in the land cover classification system as an example, the spatial distribution and density characteristics of each kind of sample data are analyzed. The results verify that the database construction method based on a relational database with a distributed file system is effective and applicable for searching, analyzing, and promoting the application of sample data. Furthermore, sample data collected in the project of China's First National Geographic Conditions Census could be useful for Earth observation and land cover quality assessment.
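
    A minimal sketch of the described storage split, with sqlite3 and a local directory standing in for the relational database and the distributed file system (the schema and names are assumptions):

        import sqlite3
        from pathlib import Path

        # sqlite3 and a local directory stand in for the relational database
        # and the distributed file system described in the paper.
        fs_root = Path("sample_store")
        fs_root.mkdir(exist_ok=True)

        db = sqlite3.connect(":memory:")
        db.execute("""CREATE TABLE samples (
                        id INTEGER PRIMARY KEY,
                        land_cover_class TEXT,  -- e.g., one of the 10 first-level classes
                        province TEXT,
                        file_path TEXT          -- reference into the file system
                      )""")

        def add_sample(sample_id, land_cover_class, province, payload: bytes):
            path = fs_root / f"{sample_id}.jpg"
            path.write_bytes(payload)  # the bulky file goes to the file system
            db.execute("INSERT INTO samples VALUES (?, ?, ?, ?)",
                       (sample_id, land_cover_class, province, str(path)))

        add_sample(1, "cultivated land", "Shaanxi", b"\xff\xd8...")
        row = db.execute("SELECT file_path FROM samples WHERE province = 'Shaanxi'").fetchone()
        print(row[0], Path(row[0]).stat().st_size, "bytes")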

  15. Development of geotechnical analysis and design modules for the Virginia Department of Transportation's geotechnical database.

    DOT National Transportation Integrated Search

    2005-01-01

    In 2003, an Internet-based Geotechnical Database Management System (GDBMS) was developed for the Virginia Department of Transportation (VDOT) using distributed Geographic Information System (GIS) methodology for data management, archival, retrieval, ...

  16. DSSTox and Chemical Information Technologies in Support of PredictiveToxicology

    EPA Science Inventory

    The EPA NCCT Distributed Structure-Searchable Toxicity (DSSTox) Database project initially focused on the curation and publication of high-quality, standardized, chemical structure-annotated toxicity databases for use in structure-activity relationship (SAR) modeling. In recent y...

  17. A VBA Desktop Database for Proposal Processing at National Optical Astronomy Observatories

    NASA Astrophysics Data System (ADS)

    Brown, Christa L.

    National Optical Astronomy Observatories (NOAO) has developed a relational Microsoft Windows desktop database using Microsoft Access and the Microsoft Office programming language, Visual Basic for Applications (VBA). The database is used to track data relating to observing proposals from original receipt through the review process, scheduling, observing, and final statistical reporting. The database has automated proposal processing and distribution of information. It allows NOAO to collect and archive data so as to query and analyze information about our science programs in new ways.

  18. Electron-Impact Ionization Cross Section Database

    National Institute of Standards and Technology Data Gateway

    SRD 107 Electron-Impact Ionization Cross Section Database (Web, free access)   This is a database primarily of total ionization cross sections of molecules by electron impact. The database also includes cross sections for a small number of atoms and energy distributions of ejected electrons for H, He, and H2. The cross sections were calculated using the Binary-Encounter-Bethe (BEB) model, which combines the Mott cross section with the high-incident energy behavior of the Bethe cross section. Selected experimental data are included.
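
    For reference, the BEB model named in this record combines the Mott and Bethe cross sections into a closed form; the per-orbital cross section, as published by Kim and Rudd (1994), is:

        \sigma_{\mathrm{BEB}}(t) = \frac{S}{t+u+1}\left[\frac{\ln t}{2}\left(1-\frac{1}{t^{2}}\right) + 1 - \frac{1}{t} - \frac{\ln t}{t+1}\right]

    where t = T/B and u = U/B are the incident and orbital kinetic energies in units of the orbital binding energy B, S = 4\pi a_0^2 N (R/B)^2, N is the orbital occupation number, a_0 the Bohr radius, and R the Rydberg energy.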

  19. Distributed Access View Integrated Database (DAVID) system

    NASA Technical Reports Server (NTRS)

    Jacobs, Barry E.

    1991-01-01

    The Distributed Access View Integrated Database (DAVID) System, which was adopted by the Astrophysics Division for their Astrophysics Data System, is a solution to the system heterogeneity problem. The heterogeneous components of the Astrophysics problem is outlined. The Library and Library Consortium levels of the DAVID approach are described. The 'books' and 'kits' level is discussed. The Universal Object Typer Management System level is described. The relation of the DAVID project with the Small Business Innovative Research (SBIR) program is explained.

  20. Central Appalachian basin natural gas database: distribution, composition, and origin of natural gases

    USGS Publications Warehouse

    Román Colón, Yomayra A.; Ruppert, Leslie F.

    2015-01-01

    The U.S. Geological Survey (USGS) has compiled a database consisting of three worksheets of central Appalachian basin natural gas analyses and isotopic compositions from published and unpublished sources of 1,282 gas samples from Kentucky, Maryland, New York, Ohio, Pennsylvania, Tennessee, Virginia, and West Virginia. The database includes field and reservoir names, well and State identification number, selected geologic reservoir properties, and the composition of natural gases (methane; ethane; propane; butane, iso-butane [i-butane]; normal butane [n-butane]; iso-pentane [i-pentane]; normal pentane [n-pentane]; cyclohexane, and hexanes). In the first worksheet, location and American Petroleum Institute (API) numbers from public or published sources are provided for 1,231 of the 1,282 gas samples. A second worksheet of 186 gas samples was compiled from published sources and augmented with public location information and contains carbon, hydrogen, and nitrogen isotopic measurements of natural gas. The third worksheet is a key for all abbreviations in the database. The database can be used to better constrain the stratigraphic distribution, composition, and origin of natural gas in the central Appalachian basin.

  1. RAId_DbS: Peptide Identification using Database Searches with Realistic Statistics

    PubMed Central

    Alves, Gelio; Ogurtsov, Aleksey Y; Yu, Yi-Kuo

    2007-01-01

    Background The key to mass-spectrometry-based proteomics is peptide identification. A major challenge in peptide identification is to obtain realistic E-values when assigning statistical significance to candidate peptides. Results Using a simple scoring scheme, we propose a database search method with theoretically characterized statistics. Taking into account possible skewness in the random variable distribution and the effect of finite sampling, we provide a theoretical derivation for the tail of the score distribution. For every experimental spectrum examined, we collect the scores of peptides in the database and find good agreement between the collected score statistics and our theoretical distribution. Using Student's t-tests, we quantify the degree of agreement between the theoretical distribution and the score statistics collected. These t-tests may be used to measure the reliability of the reported statistics. When combined with the P-value reported for a peptide hit under a score-distribution model, this new measure prevents exaggerated statistics. Another feature of RAId_DbS is its capability of detecting multiple co-eluted peptides. The peptide identification performance and statistical accuracy of RAId_DbS are assessed and compared with several other search tools. The executables and data related to RAId_DbS are freely available upon request. PMID:17961253
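
    A toy version of the generic E-value computation alluded to above: model the tail of the collected score distribution and multiply the top hit's tail probability by the number of candidates. A normal tail is used as a stand-in; RAId_DbS derives its own skew-corrected tail:

        import numpy as np
        from scipy import stats

        def evalue_for_top_hit(scores):
            """Illustrative only: RAId_DbS derives a theoretical tail that accounts
            for skewness and finite sampling; a normal tail is a simple stand-in."""
            scores = np.asarray(scores, dtype=float)
            mu, sigma = scores.mean(), scores.std(ddof=1)
            top = scores.max()
            p = stats.norm.sf(top, loc=mu, scale=sigma)  # P(score >= top) under the model
            return p * scores.size                       # expected number of random hits

        rng = np.random.default_rng(3)
        background = rng.normal(10.0, 2.0, 10000)        # scores of database peptides
        scores = np.append(background, 25.0)             # one genuinely high-scoring peptide
        print(f"E-value of top hit: {evalue_for_top_hit(scores):.2e}")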

  2. Analysis of Lunar Highland Regolith Samples From Apollo 16 Drive Core 64001/2 and Lunar Regolith Simulants - an Expanding Comparative Database

    NASA Technical Reports Server (NTRS)

    Schrader, Christian M.; Rickman, Doug; Stoeser, Douglas; Wentworth, Susan; McKay, Dave S.; Botha, Pieter; Butcher, Alan R.; Horsch, Hanna E.; Benedictus, Aukje; Gottlieb, Paul

    2008-01-01

    This slide presentation reviews the work to analyze the lunar highland regolith samples that came from the Apollo 16 core sample 64001/2 and simulants of lunar regolith, and build a comparative database. The work is part of a larger effort to compile an internally consistent database on lunar regolith (Apollo Samples) and lunar regolith simulants. This is in support of a future lunar outpost. The work is to characterize existing lunar regolith and simulants in terms of particle type, particle size distribution, particle shape distribution, bulk density, and other compositional characteristics, and to evaluate the regolith simulants by the same properties in comparison to the Apollo sample lunar regolith.

  3. Distributed Structure-Searchable Toxicity (DSSTox) Database

    EPA Pesticide Factsheets

    The Distributed Structure-Searchable Toxicity network provides a public forum for publishing downloadable, structure-searchable, standardized chemical structure files associated with chemical inventories or toxicity data sets of environmental relevance.

  4. UNSODA UNSATURATED SOIL HYDRAULIC DATABASE USER'S MANUAL VERSION 1.0

    EPA Science Inventory

    This report contains general documentation and serves as a user manual of the UNSODA program. UNSODA is a database of unsaturated soil hydraulic properties (water retention, hydraulic conductivity, and soil water diffusivity), basic soil properties (particle-size distribution, b...

  5. Yaquina Bay, Oregon, Intertidal Sediment Temperature Database, 1998 - 2006.

    EPA Science Inventory

    Detailed, long term sediment temperature records were obtained and compiled in a database to determine the influence of daily, monthly, seasonal and annual temperature variation on eelgrass distribution across the intertidal habitat in Yaquina Bay, Oregon. Both currently and hi...

  6. The Starlite Project

    DTIC Science & Technology

    1990-09-01

    conflicts. The current prototyping tool also provides a multiversion data object control mechanism. From a series of experiments, we found that the...performance of a multiversion distributed database system is quite sensitive to the size of read-sets and write-sets of transactions. A multiversion database...

  7. Database interfaces on NASA's heterogeneous distributed database system

    NASA Technical Reports Server (NTRS)

    Huang, S. H. S.

    1986-01-01

    The purpose of the ORACLE interface is to enable the DAVID program to submit queries and transactions to databases running under the ORACLE DBMS. The interface package is made up of several modules. The progress of these modules is described below. The two approaches used in implementing the interface are also discussed. Detailed discussion of the design of the templates is shown and concluding remarks are presented.

  8. Hydra: a scalable proteomic search engine which utilizes the Hadoop distributed computing framework

    PubMed Central

    2012-01-01

    Background For shotgun mass spectrometry based proteomics the most computationally expensive step is in matching the spectra against an increasingly large database of sequences and their post-translational modifications with known masses. Each mass spectrometer can generate data at an astonishingly high rate, and the scope of what is searched for is continually increasing. Therefore solutions for improving our ability to perform these searches are needed. Results We present a sequence database search engine that is specifically designed to run efficiently on the Hadoop MapReduce distributed computing framework. The search engine implements the K-score algorithm, generating comparable output for the same input files as the original implementation. The scalability of the system is shown, and the architecture required for the development of such distributed processing is discussed. Conclusion The software is scalable in its ability to handle a large peptide database, numerous modifications and large numbers of spectra. Performance scales with the number of processors in the cluster, allowing throughput to expand with the available resources. PMID:23216909
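
    A pure-Python toy of the map/reduce decomposition described here: map each database partition against all spectra, then reduce per spectrum to the best match. Shared-peak counting stands in for the K-score, and none of this is the Hydra code:

        from collections import defaultdict
        from itertools import chain

        # Toy stand-in data: spectra as sets of rounded peak m/z values, and a
        # peptide database split into partitions as Hadoop would split its input.
        spectra = {"spec1": {300, 450, 600}, "spec2": {210, 330, 480}}
        db_partitions = [
            [("PEPTIDEA", {300, 450, 580}), ("PEPTIDEB", {100, 200})],
            [("PEPTIDEC", {210, 330, 480}), ("PEPTIDED", {600})],
        ]

        def map_partition(partition):
            """Map step: score every (spectrum, peptide) pair in one partition."""
            for peptide, peaks in partition:
                for spec_id, spec_peaks in spectra.items():
                    yield spec_id, (len(spec_peaks & peaks), peptide)

        def reduce_by_spectrum(mapped):
            """Reduce step: keep the best-scoring peptide per spectrum."""
            best = defaultdict(lambda: (-1, None))
            for spec_id, scored in mapped:
                best[spec_id] = max(best[spec_id], scored)
            return dict(best)

        mapped = chain.from_iterable(map_partition(p) for p in db_partitions)
        print(reduce_by_spectrum(mapped))
        # {'spec1': (2, 'PEPTIDEA'), 'spec2': (3, 'PEPTIDEC')}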

  9. Restoration, Enhancement, and Distribution of the ATLAS-1 Imaging Spectrometric Observatory (ISO) Space Science Data Set

    NASA Technical Reports Server (NTRS)

    Germany, G. A.

    2001-01-01

    The primary goal of the funded task was to restore and distribute the ISO ATLAS-1 space science data set with enhanced software and database utilities. The first year was primarily dedicated to physically transferring the data from its original format to its initial CD archival format. The remainder of the first year was devoted to the verification of the restored data set and database. The second year was devoted to the enhancement of the data set, especially the development of IDL utilities and redesign of the database and search interface as needed. This period was also devoted to distribution of the rescued data set, principally the creation and maintenance of a web interface to the data set. The final six months were dedicated to working with NSSDC to create a permanent, off-site archive of the data set and supporting utilities. This time was also used to resolve last-minute quality and design issues.

  10. Hydra: a scalable proteomic search engine which utilizes the Hadoop distributed computing framework.

    PubMed

    Lewis, Steven; Csordas, Attila; Killcoyne, Sarah; Hermjakob, Henning; Hoopmann, Michael R; Moritz, Robert L; Deutsch, Eric W; Boyle, John

    2012-12-05

    For shotgun mass spectrometry based proteomics the most computationally expensive step is in matching the spectra against an increasingly large database of sequences and their post-translational modifications with known masses. Each mass spectrometer can generate data at an astonishingly high rate, and the scope of what is searched for is continually increasing. Therefore solutions for improving our ability to perform these searches are needed. We present a sequence database search engine that is specifically designed to run efficiently on the Hadoop MapReduce distributed computing framework. The search engine implements the K-score algorithm, generating comparable output for the same input files as the original implementation. The scalability of the system is shown, and the architecture required for the development of such distributed processing is discussed. The software is scalable in its ability to handle a large peptide database, numerous modifications and large numbers of spectra. Performance scales with the number of processors in the cluster, allowing throughput to expand with the available resources.

  11. Kristin Munch | NREL

    Science.gov Websites

    Presented a photovoltaics informatics information management system at the Materials Research Society Fall Meeting (2013). Her expertise includes scientific data management, database and data systems design, database clusters, storage systems integration, and distributed data analytics. She has used her experience in laboratory data management systems, lab

  12. Development of the interconnectivity and enhancement (ICE) module in the Virginia Department of Transportation's Geotechnical Database Management System Framework.

    DOT National Transportation Integrated Search

    2007-01-01

    An Internet-based, spatiotemporal Geotechnical Database Management System (GDBMS) Framework was implemented at the Virginia Department of Transportation (VDOT) in 2002 to manage geotechnical data using a distributed Geographical Information System (G...

  13. Bibliographic Databases Outside of the United States.

    ERIC Educational Resources Information Center

    McGinn, Thomas P.; And Others

    1988-01-01

    Eight articles describe the development, content, and structure of databases outside of the United States. Features discussed include library involvement, authority control, shared cataloging services, union catalogs, thesauri, abstracts, and distribution methods. Countries and areas represented are Latin America, Australia, the United Kingdom,…

  14. Eta-Sub-Earth Projection from Kepler Data

    NASA Technical Reports Server (NTRS)

    Traub, Wesley A.

    2012-01-01

    Outline of talk: (1) The Kepler database (2) Biases (3) The radius distribution (4) The period distribution (5) Projecting from the sample to the population (6) Extrapolating the period distribution (7) The Habitable Zone (8) Calculating the number of terrestrial, HZ planets (10) Conclusions

  15. Host range, host ecology, and distribution of more than 11800 fish parasite species

    USGS Publications Warehouse

    Strona, Giovanni; Palomares, Maria Lourdes D.; Bailly, Nicholas; Galli, Paolo; Lafferty, Kevin D.

    2013-01-01

    Our data set includes 38 008 fish parasite records (for Acanthocephala, Cestoda, Monogenea, Nematoda, Trematoda) compiled from the scientific literature, Internet databases, and museum collections paired to the corresponding host ecological, biogeographical, and phylogenetic traits (maximum length, growth rate, life span, age at maturity, trophic level, habitat preference, geographical range size, taxonomy). The data focus on host features, because specific parasite traits are not consistently available across records. For this reason, the data set is intended as a flexible resource able to extend the principles of ecological niche modeling to the host–parasite system, providing researchers with the data to model parasite niches based on their distribution in host species and the associated host features. In this sense, the database offers a framework for testing general ecological, biogeographical, and phylogenetic hypotheses based on the identification of hosts as parasite habitat. Potential applications of the data set are, for example, the investigation of species–area relationships or the taxonomic distribution of host-specificity. The provided host–parasite list is that currently used by Fish Parasite Ecology Software Tool (FishPEST, http://purl.oclc.org/fishpest), which is a website that allows researchers to model several aspects of the relationships between fish parasites and their hosts. The database is intended for researchers who wish to have more freedom to analyze the database than currently possible with FishPEST. However, for readers who have not seen FishPEST, we recommend using this as a starting point for interacting with the database.
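
    As an illustration of one analysis the authors mention (the taxonomic distribution of host specificity), the hedged sketch below uses pandas with hypothetical column names; the published data set's actual fields may differ.

```python
import pandas as pd

# Hypothetical records; the real data set pairs each host-parasite record
# with host ecological, biogeographical, and phylogenetic traits.
records = pd.DataFrame({
    "parasite_species": ["Gyrodactylus sp.", "Gyrodactylus sp.", "Anisakis sp."],
    "host_species":     ["Salmo salar",      "Salmo trutta",     "Gadus morhua"],
    "parasite_group":   ["Monogenea",        "Monogenea",        "Nematoda"],
})

# Host range: number of distinct host species per parasite.
host_range = records.groupby("parasite_species")["host_species"].nunique()

# Mean host range within each higher parasite group.
per_parasite = (records.groupby(["parasite_group", "parasite_species"])
                        ["host_species"].nunique()
                        .reset_index(name="n_hosts"))
mean_by_group = per_parasite.groupby("parasite_group")["n_hosts"].mean()

print(host_range)
print(mean_by_group)
```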

  16. Accessing and distributing EMBL data using CORBA (common object request broker architecture).

    PubMed

    Wang, L; Rodriguez-Tomé, P; Redaschi, N; McNeil, P; Robinson, A; Lijnzaad, P

    2000-01-01

    The EMBL Nucleotide Sequence Database is a comprehensive database of DNA and RNA sequences and related information traditionally made available in flat-file format. Queries through tools such as SRS (Sequence Retrieval System) also return data in flat-file format. Flat files have a number of shortcomings, however, and the resources therefore currently lack a flexible environment to meet individual researchers' needs. The Object Management Group's common object request broker architecture (CORBA) is an industry standard that provides platform-independent programming interfaces and models for portable distributed object-oriented computing applications. Its independence from programming languages, computing platforms and network protocols makes it attractive for developing new applications for querying and distributing biological data. A CORBA infrastructure developed by EMBL-EBI provides an efficient means of accessing and distributing EMBL data. The EMBL object model is defined such that it provides a basis for specifying interfaces in interface definition language (IDL) and thus for developing the CORBA servers. The mapping from the object model to the relational schema in the underlying Oracle database uses the facilities provided by PersistenceTM, an object/relational tool. The techniques of developing loaders and 'live object caching' with persistent objects achieve a smart live object cache where objects are created on demand. The objects are managed by an evictor pattern mechanism. The CORBA interfaces to the EMBL database address some of the problems of traditional flat-file formats and provide an efficient means for accessing and distributing EMBL data. CORBA also provides a flexible environment for users to develop their applications by building clients to our CORBA servers, which can be integrated into existing systems.
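
    The sketch below illustrates the client-side CORBA pattern the abstract describes, using the omniORBpy language mapping. The interface and method names (Embl.EmblSeq, get_description), the accession number, and the IOR file are hypothetical stand-ins; the real interfaces are defined by the EBI's IDL.

```python
import sys
from omniORB import CORBA

# Stubs would be generated from the IDL, e.g.: omniidl -bpython embl.idl
# import Embl  # hypothetical generated stub module

orb = CORBA.ORB_init(sys.argv, CORBA.ORB_ID)

# Bootstrap from a stringified object reference (IOR) published by the
# server, then narrow the generic reference to the typed interface.
with open("embl_server.ior") as f:
    obj = orb.string_to_object(f.read().strip())
# seq = obj._narrow(Embl.EmblSeq)        # type-safe downcast
# print(seq.get_description("X56734"))   # remote method call
```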

  17. Accessing and distributing EMBL data using CORBA (common object request broker architecture)

    PubMed Central

    Wang, Lichun; Rodriguez-Tomé, Patricia; Redaschi, Nicole; McNeil, Phil; Robinson, Alan; Lijnzaad, Philip

    2000-01-01

    Background: The EMBL Nucleotide Sequence Database is a comprehensive database of DNA and RNA sequences and related information traditionally made available in flat-file format. Queries through tools such as SRS (Sequence Retrieval System) also return data in flat-file format. Flat files have a number of shortcomings, however, and the resources therefore currently lack a flexible environment to meet individual researchers' needs. The Object Management Group's common object request broker architecture (CORBA) is an industry standard that provides platform-independent programming interfaces and models for portable distributed object-oriented computing applications. Its independence from programming languages, computing platforms and network protocols makes it attractive for developing new applications for querying and distributing biological data. Results: A CORBA infrastructure developed by EMBL-EBI provides an efficient means of accessing and distributing EMBL data. The EMBL object model is defined such that it provides a basis for specifying interfaces in interface definition language (IDL) and thus for developing the CORBA servers. The mapping from the object model to the relational schema in the underlying Oracle database uses the facilities provided by PersistenceTM, an object/relational tool. The techniques of developing loaders and 'live object caching' with persistent objects achieve a smart live object cache where objects are created on demand. The objects are managed by an evictor pattern mechanism. Conclusions: The CORBA interfaces to the EMBL database address some of the problems of traditional flat-file formats and provide an efficient means for accessing and distributing EMBL data. CORBA also provides a flexible environment for users to develop their applications by building clients to our CORBA servers, which can be integrated into existing systems. PMID:11178259

  18. Geologic database for digital geology of California, Nevada, and Utah: an application of the North American Data Model

    USGS Publications Warehouse

    Bedford, David R.; Ludington, Steve; Nutt, Constance M.; Stone, Paul A.; Miller, David M.; Miller, Robert J.; Wagner, David L.; Saucedo, George J.

    2003-01-01

    The USGS is creating an integrated national database for digital state geologic maps that includes stratigraphic, age, and lithologic information. The majority of the conterminous 48 states have digital geologic base maps available, often at scales of 1:500,000. This product is a prototype, and is intended to demonstrate the types of derivative maps that will be possible with the national integrated database. This database permits the creation of a number of types of maps via simple or sophisticated queries, maps that may be useful in a number of areas, including mineral-resource assessment, environmental assessment, and regional tectonic evolution. This database is distributed with three main parts: a Microsoft Access 2000 database containing geologic map attribute data, an Arc/Info (Environmental Systems Research Institute, Redlands, California) Export format file containing points representing designation of stratigraphic regions for the Geologic Map of Utah, and an ArcView 3.2 (Environmental Systems Research Institute, Redlands, California) project containing scripts and dialogs for performing a series of generalization and mineral resource queries. IMPORTANT NOTE: Spatial data for the respective state geologic maps are not distributed with this report. The digital state geologic maps for the states involved in this report are separate products, and two of them are produced by individual state agencies, which may be legally and/or financially responsible for the data. However, the spatial datasets for maps discussed in this report are available to the public. Questions regarding the distribution, sale, and use of individual state geologic maps should be sent to the respective state agency. We do provide suggestions for obtaining and formatting the spatial data to make them compatible with data in this report. See section 'Obtaining and Formatting Spatial Data' in the PDF version of the report.

  19. An Overview of ARL’s Multimodal Signatures Database and Web Interface

    DTIC Science & Technology

    2007-12-01

    ...ActiveX components, which hindered distribution due to license agreements and run-time license software to use such components... The database consists of multimodal signature data files in the HDF5 format. Generally, each signature file contains all the ancillary... only contains information in the database, Web interface, and signature files that is releasable to the public. The Web interface consists of static

  20. DESPIC: Detecting Early Signatures of Persuasion in Information Cascades

    DTIC Science & Technology

    2015-08-27

    ...over NoSQL Databases. Proceedings of the 14th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 2014), Chicago, IL, USA, 26-MAY-14... distributed NoSQL databases including HBase and Riak, we finalized the requirements of the optimal computational architecture to support our framework

  1. Verification of the databases EXFOR and ENDF

    NASA Astrophysics Data System (ADS)

    Berton, Gottfried; Damart, Guillaume; Cabellos, Oscar; Beauzamy, Bernard; Soppera, Nicolas; Bossant, Manuel

    2017-09-01

    The objective of this work is the verification of the large experimental (EXFOR) and evaluated nuclear reaction databases (JEFF, ENDF, JENDL, TENDL…). The work is applied to neutron reactions in EXFOR data, including threshold reactions, isomeric transitions, angular distributions and data in the resonance region of both isotopes and natural elements. Finally, a comparison of the resonance integrals compiled in the EXFOR database with those derived from the evaluated libraries is also performed.

  2. Intelligent distributed medical image management

    NASA Astrophysics Data System (ADS)

    Garcia, Hong-Mei C.; Yun, David Y.

    1995-05-01

    The rapid advancements in high performance global communication have accelerated cooperative image-based medical services to a new frontier. Traditional image-based medical services such as radiology and diagnostic consultation can now fully utilize multimedia technologies in order to provide novel services, including remote cooperative medical triage, distributed virtual simulation of operations, as well as cross-country collaborative medical research and training. Fast (efficient) and easy (flexible) retrieval of relevant images remains a critical requirement for the provision of remote medical services. This paper describes the database system requirements, identifies technological building blocks for meeting the requirements, and presents a system architecture for our target image database system, MISSION-DBS, which has been designed to fulfill the goals of Project MISSION (medical imaging support via satellite integrated optical network) -- an experimental high performance gigabit satellite communication network with access to remote supercomputing power, medical image databases, and 3D visualization capabilities in addition to medical expertise anywhere and anytime around the country. The MISSION-DBS design employs a synergistic fusion of techniques in distributed databases (DDB) and artificial intelligence (AI) for storing, migrating, accessing, and exploring images. The efficient storage and retrieval of voluminous image information is achieved by integrating DDB modeling and AI techniques for image processing while the flexible retrieval mechanisms are accomplished by combining attribute-based and content-based retrievals.

  3. rAvis: an R-package for downloading information stored in Proyecto AVIS, a citizen science bird project.

    PubMed

    Varela, Sara; González-Hernández, Javier; Casabella, Eduardo; Barrientos, Rafael

    2014-01-01

    Citizen science projects store an enormous amount of information about species distribution, diversity and characteristics. Researchers are now beginning to make use of this rich collection of data. However, access to these databases is not always straightforward. Apart from the largest and international projects, citizen science repositories often lack specific Application Programming Interfaces (APIs) to connect them to the scientific environments. Thus, it is necessary to develop simple routines to allow researchers to take advantage of the information collected by smaller citizen science projects, for instance, programming specific packages to connect them to popular scientific environments (like R). Here, we present rAvis, an R-package to connect R-users with Proyecto AVIS (http://proyectoavis.com), a Spanish citizen science project with more than 82,000 bird observation records. We develop several functions to explore the database, to plot the geographic distribution of the species occurrences, and to generate personal queries to the database about species occurrences (number of individuals, distribution, etc.) and birdwatcher observations (number of species recorded by each collaborator, UTMs visited, etc.). This new R-package will allow scientists to access this database and to exploit the information generated by Spanish birdwatchers over the last 40 years.

  4. rAvis: An R-Package for Downloading Information Stored in Proyecto AVIS, a Citizen Science Bird Project

    PubMed Central

    Varela, Sara; González-Hernández, Javier; Casabella, Eduardo; Barrientos, Rafael

    2014-01-01

    Citizen science projects store an enormous amount of information about species distribution, diversity and characteristics. Researchers are now beginning to make use of this rich collection of data. However, access to these databases is not always straightforward. Apart from the largest and international projects, citizen science repositories often lack specific Application Programming Interfaces (APIs) to connect them to the scientific environments. Thus, it is necessary to develop simple routines to allow researchers to take advantage of the information collected by smaller citizen science projects, for instance, programming specific packages to connect them to popular scientific environments (like R). Here, we present rAvis, an R-package to connect R-users with Proyecto AVIS (http://proyectoavis.com), a Spanish citizen science project with more than 82,000 bird observation records. We develop several functions to explore the database, to plot the geographic distribution of the species occurrences, and to generate personal queries to the database about species occurrences (number of individuals, distribution, etc.) and birdwatcher observations (number of species recorded by each collaborator, UTMs visited, etc.). This new R-package will allow scientists to access this database and to exploit the information generated by Spanish birdwatchers over the last 40 years. PMID:24626233

  5. A TEX86 surface sediment database and extended Bayesian calibration

    NASA Astrophysics Data System (ADS)

    Tierney, Jessica E.; Tingley, Martin P.

    2015-06-01

    Quantitative estimates of past temperature changes are a cornerstone of paleoclimatology. For a number of marine sediment-based proxies, the accuracy and precision of past temperature reconstructions depends on a spatial calibration of modern surface sediment measurements to overlying water temperatures. Here, we present a database of 1095 surface sediment measurements of TEX86, a temperature proxy based on the relative cyclization of marine archaeal glycerol dialkyl glycerol tetraether (GDGT) lipids. The dataset is archived in a machine-readable format with geospatial information, fractional abundances of lipids (if available), and metadata. We use this new database to update surface and subsurface temperature calibration models for TEX86 and demonstrate the applicability of the TEX86 proxy to past temperature prediction. The TEX86 database confirms that surface sediment GDGT distribution has a strong relationship to temperature, which accounts for over 70% of the variance in the data. Future efforts, made possible by the data presented here, will seek to identify variables with secondary relationships to GDGT distributions, such as archaeal community composition.
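
    To make the calibration idea concrete, the sketch below fits a simple least-squares line to synthetic TEX86/temperature pairs and inverts it for prediction. This is only a schematic stand-in: the published calibration is Bayesian and spatially varying, and all numbers here are invented.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sst = rng.uniform(2, 30, 200)                           # "modern" temperatures, deg C
tex86 = 0.28 + 0.015 * sst + rng.normal(0, 0.03, 200)   # synthetic proxy values

fit = stats.linregress(sst, tex86)                      # tex86 = slope*T + intercept

def predict_temperature(tex_value):
    """Invert the calibration: T = (TEX86 - intercept) / slope."""
    return (tex_value - fit.intercept) / fit.slope

print(f"variance explained (r^2): {fit.rvalue**2:.2f}")
print(f"T estimate for TEX86=0.55: {predict_temperature(0.55):.1f} C")
```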

  6. Database interfaces on NASA's heterogeneous distributed database system

    NASA Technical Reports Server (NTRS)

    Huang, Shou-Hsuan Stephen

    1987-01-01

    The purpose of the Distributed Access View Integrated Database (DAVID) interface module (Module 9: Resident Primitive Processing Package) is to provide data transfer between local DAVID systems and resident Data Base Management Systems (DBMSs). The results of the current research are summarized, and a detailed description of the interface module is provided. Several Pascal templates were constructed, and the Resident Processor program was also developed. Even though it is designed for the Pascal templates, it can be modified for templates in other languages, such as C, without much difficulty. The Resident Processor itself can be written in any programming language. Since the Module 5 routines are not ready yet, there is no way to test the interface module. However, simulation shows that the database access programs produced by the Resident Processor work according to the specifications.

  7. A Multiagent System for Dynamic Data Aggregation in Medical Research

    PubMed Central

    Urovi, Visara; Barba, Imanol; Aberer, Karl; Schumacher, Michael Ignaz

    2016-01-01

    The collection of medical data for research purposes is a challenging and long-lasting process. In an effort to accelerate and facilitate this process we propose a new framework for dynamic aggregation of medical data from distributed sources. We use agent-based coordination between medical and research institutions. Our system employs principles of peer-to-peer network organization and coordination models to search over already constructed distributed databases and to identify the potential contributors when a new database has to be built. Our framework takes into account both the requirements of a research study and current data availability. This leads to better definition of database characteristics such as schema, content, and privacy parameters. We show that this approach enables a more efficient way to collect data for medical research. PMID:27975063

  8. The Identity Mapping Project: Demographic differences in patterns of distributed identity.

    PubMed

    Gilbert, Richard L; Dionisio, John David N; Forney, Andrew; Dorin, Philip

    2015-01-01

    The advent of cloud computing and a multi-platform digital environment is giving rise to a new phase of human identity called "The Distributed Self." In this conception, aspects of the self are distributed into a variety of 2D and 3D digital personas with the capacity to reflect any number of combinations of now-malleable personality traits. In this way, the source of human identity remains internal and embodied, but the expression or enactment of the self becomes increasingly external, disembodied, and distributed on demand. The Identity Mapping Project (IMP) is an interdisciplinary collaboration between psychology and computer science designed to empirically investigate the development of distributed forms of identity. Methodologically, it collects a large database of "identity maps" - computerized graphical representations of how active someone is online and how their identity is expressed and distributed across 7 core digital domains: email, blogs/personal websites, social networks, online forums, online dating sites, character-based digital games, and virtual worlds. The current paper reports on gender and age differences in online identity based on an initial database of distributed identity profiles.

  9. Estimating Traveler Populations at Airport and Cruise Terminals for Population Distribution and Dynamics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jochem, Warren C; Sims, Kelly M; Bright, Eddie A

    In recent years, uses of high-resolution population distribution databases are increasing steadily for environmental, socioeconomic, public health, and disaster-related research and operations. With the development of daytime population distribution, temporal resolution of such databases has been improved. However, the lack of incorporation of transitional population, namely business and leisure travelers, leaves a significant population unaccounted for within the critical infrastructure networks, such as at transportation hubs. This paper presents two general methodologies for estimating passenger populations in airport and cruise port terminals at a high temporal resolution which can be incorporated into existing population distribution models. The methodologies are geographically scalable and are based on, and demonstrate how, two different transportation hubs with disparate temporal population dynamics can be modeled utilizing publicly available databases including novel data sources of flight activity from the Internet which are updated in near-real time. The airport population estimation model shows great potential for rapid implementation for a large collection of airports on a national scale, and the results suggest reasonable accuracy in the estimated passenger traffic. By incorporating population dynamics at high temporal resolutions into population distribution models, we hope to improve the estimates of populations exposed to or at risk to disasters, thereby improving emergency planning and response, and leading to more informed policy decisions.

  10. A curated gluten protein sequence database to support development of proteomics methods for determination of gluten in gluten-free foods.

    PubMed

    Bromilow, Sophie; Gethings, Lee A; Buckley, Mike; Bromley, Mike; Shewry, Peter R; Langridge, James I; Clare Mills, E N

    2017-06-23

    The unique physiochemical properties of wheat gluten enable a diverse range of food products to be manufactured. However, gluten triggers coeliac disease, a condition which is treated using a gluten-free diet. Analytical methods are required to confirm if foods are gluten-free, but current immunoassay-based methods can be unreliable; proteomic methods offer an alternative but require comprehensive, well-annotated sequence databases, which are lacking for gluten. A manually curated database (GluPro V1.0) of gluten proteins, comprising 630 discrete unique full-length protein sequences, has been compiled. It is representative of the different types of gliadin and glutenin components found in gluten. An in silico comparison of their coeliac toxicity was undertaken by analysing the distribution of coeliac-toxic motifs. This demonstrated that whilst the α-gliadin proteins contained more toxic motifs, these were distributed across all gluten protein sub-types. Comparison of annotations observed using a discovery proteomics dataset acquired using ion mobility MS/MS showed that more reliable identifications were obtained using the GluPro V1.0 database compared to the complete reviewed Viridiplantae database. This highlights the value of a curated sequence database specifically designed to support proteomic workflows and the development of methods to detect and quantify gluten. We have constructed the first manually curated open-source wheat gluten protein sequence database (GluPro V1.0) in FASTA format to support the application of proteomic methods for gluten protein detection and quantification. We have also analysed the manually verified sequences to give the first comprehensive overview of the distribution of sequences able to elicit a reaction in coeliac disease, the prevalent form of gluten intolerance. Provision of this database will improve the reliability of gluten protein identification by proteomic analysis, and aid the development of targeted mass spectrometry methods in line with Codex Alimentarius Commission requirements for foods designed to meet the needs of gluten-intolerant individuals. Copyright © 2017. Published by Elsevier B.V.
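
    The motif-distribution analysis can be illustrated with a short sketch that counts overlapping occurrences of coeliac-toxic motifs in protein sequences. The motif list and sequences below are illustrative placeholders, not the curated GluPro content.

```python
TOXIC_MOTIFS = ["PQQPF", "QQPFP", "PFPQP"]  # hypothetical example motifs

def count_motifs(sequence, motifs=TOXIC_MOTIFS):
    """Count motif occurrences, including overlapping matches."""
    total = 0
    for motif in motifs:
        start = 0
        while (hit := sequence.find(motif, start)) != -1:
            total += 1
            start = hit + 1   # advance one residue to allow overlaps
    return total

# Invented sequences standing in for gliadin/glutenin entries.
proteins = {
    "alpha_gliadin_like": "MKTFLILALLAIVAPQQPFPQQPFPQPQPFPS",
    "lmw_glutenin_like":  "MKTFLILALLAIVASQQQPPFSQQQQSPFSQ",
}
for name, seq in proteins.items():
    print(name, count_motifs(seq))
```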

  11. The National Landslide Database and GIS for Great Britain: construction, development, data acquisition, application and communication

    NASA Astrophysics Data System (ADS)

    Pennington, Catherine; Dashwood, Claire; Freeborough, Katy

    2014-05-01

    The National Landslide Database has been developed by the British Geological Survey (BGS) and is the focus for national geohazard research for landslides in Great Britain. The history and structure of the geospatial database and associated Geographical Information System (GIS) are explained, along with the future developments of the database and its applications. The database is the most extensive source of information on landslides in Great Britain with over 16,500 records of landslide events, each documented as fully as possible. Data are gathered through a range of procedures, including: incorporation of other databases; automated trawling of current and historical scientific literature and media reports; new field- and desk-based mapping technologies with digital data capture, and crowd-sourcing information through social media and other online resources. This information is invaluable for the investigation, prevention and mitigation of areas of unstable ground in accordance with Government planning policy guidelines. The national landslide susceptibility map (GeoSure) and a national landslide domain map currently under development rely heavily on the information contained within the landslide database. Assessing susceptibility to landsliding requires knowledge of the distribution of failures and an understanding of causative factors and their spatial distribution, whilst understanding the frequency and types of landsliding present is integral to modelling how rainfall will influence the stability of a region. Communication of landslide data through the Natural Hazard Partnership (NHP) contributes to national hazard mitigation and disaster risk reduction with respect to weather and climate. Daily reports of landslide potential are published by BGS through the NHP and data collected for the National Landslide Database is used widely for the creation of these assessments. The National Landslide Database is freely available via an online GIS and is used by a variety of stakeholders for research purposes.

  12. Database technology and the management of multimedia data in the Mirror project

    NASA Astrophysics Data System (ADS)

    de Vries, Arjen P.; Blanken, H. M.

    1998-10-01

    Multimedia digital libraries require an open distributed architecture instead of a monolithic database system. In the Mirror project, we use the Monet extensible database kernel to manage different representations of multimedia objects. To maintain independence between content, meta-data, and the creation of meta-data, we allow distribution of data and operations using CORBA. This open architecture introduces new problems for data access. From an end user's perspective, the problem is how to search the available representations to fulfill an actual information need; the conceptual gap between human perceptual processes and the meta-data is too large. From a system's perspective, several representations of the data may semantically overlap or be irrelevant. We address these problems with an iterative query process and active user participation through relevance feedback. A retrieval model based on inference networks assists the user with query formulation. The integration of this model into the database design has two advantages. First, the user can query both the logical and the content structure of multimedia objects. Second, the use of different data models in the logical and the physical database design provides data independence and allows algebraic query optimization. We illustrate query processing with a music retrieval application.

  13. A web-based system architecture for ontology-based data integration in the domain of IT benchmarking

    NASA Astrophysics Data System (ADS)

    Pfaff, Matthias; Krcmar, Helmut

    2018-03-01

    In the domain of IT benchmarking (ITBM), a variety of data and information are collected. Although these data serve as the basis for business analyses, no unified semantic representation of such data yet exists. Consequently, data analysis across different distributed data sets and different benchmarks is almost impossible. This paper presents a system architecture and prototypical implementation for an integrated data management of distributed databases based on a domain-specific ontology. To preserve the semantic meaning of the data, the ITBM ontology is linked to data sources and functions as the central concept for database access. Thus, additional databases can be integrated by linking them to this domain-specific ontology and are directly available for further business analyses. Moreover, the web-based system supports the process of mapping ontology concepts to external databases by introducing a semi-automatic mapping recommender and by visualizing possible mapping candidates. The system also provides a natural language interface to easily query linked databases. The expected result of this ontology-based approach of knowledge representation and data access is an increase in knowledge and data sharing in this domain, which will enhance existing business analysis methods.
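
    A minimal sketch of the mapping idea, under assumed names: ontology concepts serve as the central access point and are linked to concrete tables and columns in the integrated databases, so a concept-level query can be translated into SQL per source. The concept IRIs, table, and column names below are hypothetical, and the in-memory SQLite database stands in for the distributed sources.

```python
import sqlite3

# Hypothetical mapping from ontology concept IRIs to (table, column).
CONCEPT_MAP = {
    "itbm:ServerCount":  ("benchmark_2016", "num_servers"),
    "itbm:HelpdeskCost": ("benchmark_2016", "helpdesk_cost_eur"),
}

def query_concept(conn, concept_iri):
    # Identifiers come from the trusted internal map, not user input,
    # so interpolating them into the SQL string is safe here.
    table, column = CONCEPT_MAP[concept_iri]
    cur = conn.execute(f"SELECT {column} FROM {table}")
    return [row[0] for row in cur.fetchall()]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE benchmark_2016 (num_servers INTEGER, helpdesk_cost_eur REAL)")
conn.executemany("INSERT INTO benchmark_2016 VALUES (?, ?)",
                 [(120, 30500.0), (85, 21000.0)])
print(query_concept(conn, "itbm:ServerCount"))   # -> [120, 85]
```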

  14. Information Security Considerations for Applications Using Apache Accumulo

    DTIC Science & Technology

    2014-09-01

    ...Distributed File System; INSCOM, United States Army Intelligence and Security Command; JPA, Java Persistence API; JSON, JavaScript Object Notation; MAC, Mandatory... MySQL [13]. BigTable can process 20 petabytes per day [14]. High degree of scalability on commodity hardware. NoSQL databases do not rely on highly... manipulation in relational databases. NoSQL databases each have a unique programming interface that uses a lower-level procedural language (e.g., Java

  15. Object recognition for autonomous robot utilizing distributed knowledge database

    NASA Astrophysics Data System (ADS)

    Takatori, Jiro; Suzuki, Kenji; Hartono, Pitoyo; Hashimoto, Shuji

    2003-10-01

    In this paper we present a novel method of object recognition utilizing a remote knowledge database for an autonomous robot. The developed robot has three robot arms with different sensors: two CCD cameras and haptic sensors. It can see, touch and move the target object from different directions. Referring to a remote knowledge database of geometry and materials, the robot observes and handles objects in order to understand them, including their physical characteristics.

  16. Functional and Database Architecture Design.

    DTIC Science & Technology

    1983-09-26

    Functional and Database Architecture Design (U), Alpha/Omega Group Inc., Harvard, MA, 26 Sep 83... Report A001, Functional and Database Architecture Design. Submitted to: Office of Naval Research, Department of the Navy, 800 N. Quincy Street...

  17. Validation of temporal and spatial consistency of facility- and speed-specific vehicle-specific power distributions for emission estimation: A case study in Beijing, China.

    PubMed

    Zhai, Zhiqiang; Song, Guohua; Lu, Hongyu; He, Weinan; Yu, Lei

    2017-09-01

    Vehicle-specific power (VSP) has been found to be highly correlated with vehicle emissions. It is used in many studies on emission modeling, such as the MOVES (Motor Vehicle Emissions Simulator) model. Existing studies develop specific VSP distributions (or OpMode distributions in MOVES) for different road types and various average speeds to represent vehicle operating modes on the road. However, it is still not clear whether facility- and speed-specific VSP distributions are consistent temporally and spatially. For instance, is it necessary to update the database of VSP distributions in the emission model periodically? Are the VSP distributions developed in a city's central business district (CBD) applicable to its suburbs? In this context, this study examined the temporal and spatial consistency of facility- and speed-specific VSP distributions in Beijing. VSP distributions in different years and in different areas were developed based on real-world vehicle activity data. The root mean square error (RMSE) was employed to quantify the difference between VSP distributions. The maximum differences of the VSP distributions between years and between areas are approximately 20% of those between road types. Analysis of the carbon dioxide (CO2) emission factor indicates that the temporal and spatial differences of the VSP distributions have no significant impact on vehicle emission estimation, with a relative error of less than 3%. The temporal and spatial differences therefore have no significant impact on the development of facility- and speed-specific VSP distributions for vehicle emission estimation: the database of specific VSP distributions in VSP-based emission models can be maintained over time, it is unnecessary to update the database regularly, and it is reliable to use historical vehicle activity data to forecast future emissions. Within a city, areas with less data can still develop accurate VSP distributions based on better data from other areas.
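
    The sketch below shows how a VSP distribution can be built from speed/acceleration traces and how two distributions can be compared with RMSE. It uses the widely cited light-duty VSP equation (Jimenez-Palacios, 1999) and synthetic activity data; the study's exact binning scheme and coefficients may differ.

```python
import numpy as np

def vsp(v, a, grade=0.0):
    """Light-duty VSP in kW/tonne; v in m/s, a in m/s^2, grade as a fraction."""
    return v * (1.1 * a + 9.81 * grade + 0.132) + 0.000302 * v**3

def vsp_distribution(speeds, accels, bins=np.arange(-20, 21, 1.0)):
    """Fraction of operating time falling into each 1 kW/tonne VSP bin."""
    values = vsp(np.asarray(speeds), np.asarray(accels))
    hist, _ = np.histogram(values, bins=bins)
    return hist / hist.sum()

def rmse(p, q):
    return float(np.sqrt(np.mean((p - q) ** 2)))

# Synthetic second-by-second activity standing in for two areas' data.
rng = np.random.default_rng(1)
dist_cbd    = vsp_distribution(rng.uniform(0, 15, 5000), rng.normal(0, 0.6, 5000))
dist_suburb = vsp_distribution(rng.uniform(0, 20, 5000), rng.normal(0, 0.6, 5000))
print(f"RMSE between area distributions: {rmse(dist_cbd, dist_suburb):.4f}")
```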

  18. Distribution of late Pleistocene ice-rich syngenetic permafrost of the Yedoma Suite in east and central Siberia, Russia

    USGS Publications Warehouse

    Grosse, Guido; Robinson, Joel E.; Bryant, Robin; Taylor, Maxwell D.; Harper, William; DeMasi, Amy; Kyker-Snowman, Emily; Veremeeva, Alexandra; Schirrmeister, Lutz; Harden, Jennifer

    2013-01-01

    This digital database is the product of collaboration between the U.S. Geological Survey, the Geophysical Institute at the University of Alaska, Fairbanks; the Los Altos Hills Foothill College GeoSpatial Technology Certificate Program; the Alfred Wegener Institute for Polar and Marine Research, Potsdam, Germany; and the Institute of Physical Chemical and Biological Problems in Soil Science of the Russian Academy of Sciences. The primary goal for creating this digital database is to enhance current estimates of soil organic carbon stored in deep permafrost, in particular the late Pleistocene syngenetic ice-rich permafrost deposits of the Yedoma Suite. Previous studies estimated that Yedoma deposits cover about 1 million square kilometers of a large region in central and eastern Siberia, but these estimates generally are based on maps with scales smaller than 1:10,000,000. Taking into account this large area, it was estimated that Yedoma may store as much as 500 petagrams of soil organic carbon, a large part of which is vulnerable to thaw and mobilization from thermokarst and erosion. To refine assessments of the spatial distribution of Yedoma deposits, we digitized 11 Russian Quaternary geologic maps. Our study focused on extracting geologic units interpreted by us as late Pleistocene ice-rich syngenetic Yedoma deposits based on lithology, ground ice conditions, stratigraphy, and geomorphological and spatial association. These Yedoma units then were merged into a single data layer across map tiles. The spatial database provides a useful update of the spatial distribution of this deposit for an approximately 2.32 million square kilometers land area in Siberia that will (1) serve as a core database for future refinements of Yedoma distribution in additional regions, and (2) provide a starting point to revise the size of deep but thaw-vulnerable permafrost carbon pools in the Arctic based on surface geology and the distribution of cryolithofacies types at high spatial resolution. However, we recognize that the extent of Yedoma deposits presented in this database is not complete for a global assessment, because Yedoma deposits also occur in the Taymyr lowlands and Chukotka, and in parts of Alaska and northwestern Canada.

  19. Terrestrial Sediments of the Earth: Development of a Global Unconsolidated Sediments Map Database (GUM)

    NASA Astrophysics Data System (ADS)

    Börker, J.; Hartmann, J.; Amann, T.; Romero-Mujalli, G.

    2018-04-01

    Mapped unconsolidated sediments cover half of the global land surface. They are of considerable importance for many Earth surface processes like weathering, hydrological fluxes or biogeochemical cycles. Ignoring their characteristics or spatial extent may lead to misinterpretations in Earth System studies. Therefore, a new Global Unconsolidated Sediments Map database (GUM) was compiled, using regional maps specifically representing unconsolidated and quaternary sediments. The new GUM database provides insights into the regional distribution of unconsolidated sediments and their properties. The GUM comprises 911,551 polygons and describes not only sediment types and subtypes, but also parameters like grain size, mineralogy, age and thickness where available. Previous global lithological maps or databases lacked detail for reported unconsolidated sediment areas or missed large areas, and reported a global coverage of 25 to 30%, considering the ice-free land area. Here, alluvial sediments cover about 23% of the mapped total ice-free area, followed by aeolian sediments (˜21%), glacial sediments (˜20%), and colluvial sediments (˜16%). A specific focus during the creation of the database was on the distribution of loess deposits, since loess is highly reactive and relevant to understand geochemical cycles related to dust deposition and weathering processes. An additional layer compiling pyroclastic sediment is added, which merges consolidated and unconsolidated pyroclastic sediments. The compilation shows latitudinal abundances of sediment types related to climate of the past. The GUM database is available at the PANGAEA database (https://doi.org/10.1594/PANGAEA.884822).

  20. Fuzzy Relational Databases: Representational Issues and Reduction Using Similarity Measures.

    ERIC Educational Resources Information Center

    Prade, Henri; Testemale, Claudette

    1987-01-01

    Compares and expands upon two approaches to dealing with fuzzy relational databases. The proposed similarity measure is based on a fuzzy Hausdorff distance and estimates the mismatch between two possibility distributions using a reduction process. The consequences of the reduction process on query evaluation are studied. (Author/EM)
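
    As a simplified sketch of the underlying notion, the snippet below computes the classical Hausdorff distance between the supports of two discrete possibility distributions; the fuzzy variant discussed in the article additionally weights points by their membership grades.

```python
def hausdorff(a, b):
    """Classical Hausdorff distance between two finite sets of numbers."""
    d_ab = max(min(abs(x - y) for y in b) for x in a)
    d_ba = max(min(abs(x - y) for y in a) for x in b)
    return max(d_ab, d_ba)

# Supports of two possibility distributions for an imprecise attribute
# (invented values, e.g. "about 175" vs. "about 182").
support_p = [170, 175, 180]
support_q = [178, 182, 186]
print(hausdorff(support_p, support_q))   # -> 8
```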

  1. Imprecision and Uncertainty in the UFO Database Model.

    ERIC Educational Resources Information Center

    Van Gyseghem, Nancy; De Caluwe, Rita

    1998-01-01

    Discusses how imprecision and uncertainty are dealt with in the UFO (Uncertainty and Fuzziness in an Object-oriented) database model. Such information is expressed by means of possibility distributions, and modeled by means of the proposed concept of "role objects." The role objects model uncertain, tentative information about objects,…

  2. The BioMart community portal: an innovative alternative to large, centralized data repositories

    USDA-ARS?s Scientific Manuscript database

    The BioMart Community Portal (www.biomart.org) is a community-driven effort to provide a unified interface to biomedical databases that are distributed worldwide. The portal provides access to numerous database projects supported by 30 scientific organizations. It includes over 800 different biologi...

  3. HOED: Hypermedia Online Educational Database.

    ERIC Educational Resources Information Center

    Duval, E.; Olivie, H.

    This paper presents HOED, a distributed hypermedia client-server system for educational resources. The aim of HOED is to provide a library facility for hyperdocuments that is accessible via the world wide web. Its main application domain is education. The HOED database not only holds the educational resources themselves, but also data describing…

  4. An integrated chronostratigraphic data system for the twenty-first century

    USGS Publications Warehouse

    Sikora, P.J.; Ogg, James G.; Gary, A.; Cervato, C.; Gradstein, Felix; Huber, B.T.; Marshall, C.; Stein, J.A.; Wardlaw, B.

    2006-01-01

    Research in stratigraphy is increasingly multidisciplinary and conducted by diverse research teams whose members can be widely separated. This developing distributed-research process, facilitated by the availability of the Internet, promises tremendous future benefits to researchers. However, its full potential is hindered by the absence of a development strategy for the necessary infrastructure. At a National Science Foundation workshop convened in November 2001, thirty quantitative stratigraphers and database specialists from both academia and industry met to discuss how best to integrate their respective chronostratigraphic databases. The main goal was to develop a strategy that would allow efficient distribution and integration of existing data relevant to the study of geologic time. Discussions concentrated on three major themes: database standards and compatibility, strategies and tools for information retrieval and analysis of all types of global and regional stratigraphic data, and future directions for database integration and centralization of currently distributed depositories. The result was a recommendation to establish an integrated chronostratigraphic database, to be called Chronos, which would facilitate greater efficiency in stratigraphic studies (http://www.chronos.org/). The Chronos system will both provide greater ease of data gathering and allow for multidisciplinary synergies, functions of fundamental importance in a variety of research, including time scale construction, paleoenvironmental analysis, paleoclimatology and paleoceanography. Beyond scientific research, Chronos will also provide educational and societal benefits by providing an accessible source of information of general interest (e.g., mass extinctions) and concern (e.g., climatic change). The National Science Foundation has currently funded a three-year program for implementing Chronos. © 2006 Geological Society of America. All rights reserved.

  5. Predicting the performance of fingerprint similarity searching.

    PubMed

    Vogt, Martin; Bajorath, Jürgen

    2011-01-01

    Fingerprints are bit string representations of molecular structure that typically encode structural fragments, topological features, or pharmacophore patterns. Various fingerprint designs are utilized in virtual screening and their search performance essentially depends on three parameters: the nature of the fingerprint, the active compounds serving as reference molecules, and the composition of the screening database. It is of considerable interest and practical relevance to predict the performance of fingerprint similarity searching. A quantitative assessment of the potential that a fingerprint search might successfully retrieve active compounds, if available in the screening database, would substantially help to select the type of fingerprint most suitable for a given search problem. The method presented herein utilizes concepts from information theory to relate the fingerprint feature distributions of reference compounds to screening libraries. If these feature distributions do not sufficiently differ, active database compounds that are similar to reference molecules cannot be retrieved because they disappear in the "background." By quantifying the difference in feature distribution using the Kullback-Leibler divergence and relating the divergence to compound recovery rates obtained for different benchmark classes, fingerprint search performance can be quantitatively predicted.
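
    A hedged sketch of the core computation: treat each fingerprint bit as a Bernoulli feature, estimate its frequency in the reference set and in the screening database, and sum per-bit Kullback-Leibler divergences. A small divergence predicts that actives will be hard to lift out of the background. The data and smoothing constant below are illustrative.

```python
import numpy as np

def bernoulli_kl(p, q, eps=1e-4):
    """Sum of per-bit KL divergences D(p || q) for two bit-frequency vectors."""
    p = np.clip(p, eps, 1 - eps)   # smoothing to avoid log(0)
    q = np.clip(q, eps, 1 - eps)
    return float(np.sum(p * np.log(p / q) + (1 - p) * np.log((1 - p) / (1 - q))))

# Synthetic 1024-bit fingerprints for reference actives and the database.
rng = np.random.default_rng(2)
ref_fps = rng.random((50, 1024)) < 0.35
db_fps  = rng.random((10000, 1024)) < 0.30

kl = bernoulli_kl(ref_fps.mean(axis=0), db_fps.mean(axis=0))
print(f"feature-distribution divergence: {kl:.2f}")
```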

  6. RAINBIO: a mega-database of tropical African vascular plants distributions

    PubMed Central

    Dauby, Gilles; Zaiss, Rainer; Blach-Overgaard, Anne; Catarino, Luís; Damen, Theo; Deblauwe, Vincent; Dessein, Steven; Dransfield, John; Droissart, Vincent; Duarte, Maria Cristina; Engledow, Henry; Fadeur, Geoffrey; Figueira, Rui; Gereau, Roy E.; Hardy, Olivier J.; Harris, David J.; de Heij, Janneke; Janssens, Steven; Klomberg, Yannick; Ley, Alexandra C.; Mackinder, Barbara A.; Meerts, Pierre; van de Poel, Jeike L.; Sonké, Bonaventure; Sosef, Marc S. M.; Stévart, Tariq; Stoffelen, Piet; Svenning, Jens-Christian; Sepulchre, Pierre; van der Burgt, Xander; Wieringa, Jan J.; Couvreur, Thomas L. P.

    2016-01-01

    Abstract  The tropical vegetation of Africa is characterized by high levels of species diversity but is undergoing important shifts in response to ongoing climate change and increasing anthropogenic pressures. Although our knowledge of plant species distribution patterns in the African tropics has been improving over the years, it remains limited. Here we present RAINBIO, a unique comprehensive mega-database of georeferenced records for vascular plants in continental tropical Africa. The geographic focus of the database is the region south of the Sahel and north of Southern Africa, and the majority of data originate from tropical forest regions. RAINBIO is a compilation of 13 datasets, either publicly available or personal ones. Numerous in-depth data quality checks, both automatic and manual (via several African flora experts), were undertaken for georeferencing, standardization of taxonomic names, and identification and merging of duplicated records. The resulting RAINBIO data allow exploration and extraction of distribution data for 25,356 native tropical African vascular plant species, which represents ca. 89% of all known plant species in the area of interest. Habit information is also provided for 91% of these species. PMID:28127234

  7. Overcoming barriers to a research-ready national commercial claims database.

    PubMed

    Newman, David; Herrera, Carolina-Nicole; Parente, Stephen T

    2014-11-01

    Billions of dollars have been spent on the goal of making healthcare data available to clinicians and researchers in the hopes of improving healthcare and lowering costs. However, the problems of data governance, distribution, and accessibility remain challenges for the healthcare system to overcome. In this study, we discuss some of the issues around holding, reporting, and distributing data, including the newest "big data" challenge: making the data accessible to researchers and policy makers. This article presents a case study in "big healthcare data" involving the Health Care Cost Institute (HCCI). HCCI is a nonprofit, nonpartisan, independent research institute that serves as a voluntary repository of national commercial healthcare claims data. Governance of large healthcare databases is complicated by the data-holding model and further complicated by issues related to distribution to research teams. For multi-payer healthcare claims databases, the 2 most common models of data holding (mandatory and voluntary) have different data security requirements. Furthermore, data transport and accessibility may require technological investment. HCCI's efforts offer insights from which other data managers and healthcare leaders may benefit when contemplating a data collaborative.

  8. NESDIS OSPO Data Access Policy and CRM

    NASA Astrophysics Data System (ADS)

    Seybold, M. G.; Donoho, N. A.; McNamara, D.; Paquette, J.; Renkevens, T.

    2012-12-01

    The Office of Satellite and Product Operations (OSPO) is the NESDIS office responsible for satellite operations, product generation, and product distribution. Access to and distribution of OSPO data was formally established in a Data Access Policy dated February, 2011. An extension of the data access policy is the OSPO Customer Relationship Management (CRM) Database, which has been in development since 2008 and is reaching a critical level of maturity. This presentation will provide a summary of the data access policy and standard operating procedure (SOP) for handling data access requests. The tangential CRM database will be highlighted including the incident tracking system, reporting and notification capabilities, and the first comprehensive portfolio of NESDIS satellites, instruments, servers, applications, products, user organizations, and user contacts. Select examples of CRM data exploitation will show how OSPO is utilizing the CRM database to more closely satisfy the user community's satellite data needs with new product promotions, as well as new data and imagery distribution methods in OSPO's Environmental Satellite Processing Center (ESPC). In addition, user services and outreach initiatives from the Satellite Products and Services Division will be highlighted.

  9. A Probabilistic Model of Local Sequence Alignment That Simplifies Statistical Significance Estimation

    PubMed Central

    Eddy, Sean R.

    2008-01-01

    Sequence database searches require accurate estimation of the statistical significance of scores. Optimal local sequence alignment scores follow Gumbel distributions, but determining an important parameter of the distribution (λ) requires time-consuming computational simulation. Moreover, optimal alignment scores are less powerful than probabilistic scores that integrate over alignment uncertainty (“Forward” scores), but the expected distribution of Forward scores remains unknown. Here, I conjecture that both expected score distributions have simple, predictable forms when full probabilistic modeling methods are used. For a probabilistic model of local sequence alignment, optimal alignment bit scores (“Viterbi” scores) are Gumbel-distributed with constant λ = log 2, and the high scoring tail of Forward scores is exponential with the same constant λ. Simulation studies support these conjectures over a wide range of profile/sequence comparisons, using 9,318 profile-hidden Markov models from the Pfam database. This enables efficient and accurate determination of expectation values (E-values) for both Viterbi and Forward scores for probabilistic local alignments. PMID:18516236
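
    A worked sketch of the resulting statistics: with λ fixed at log 2 for bit scores, a P-value follows from the Gumbel survival function and an E-value from multiplying by the number of database comparisons. The location parameter μ below is a placeholder that would come from model calibration.

```python
import math

LAMBDA = math.log(2)   # the conjectured constant for Viterbi bit scores
MU = 5.0               # hypothetical Gumbel location parameter

def gumbel_pvalue(score):
    """P(S >= score) for a Gumbel distribution with scale LAMBDA."""
    return 1.0 - math.exp(-math.exp(-LAMBDA * (score - MU)))

def evalue(score, db_size):
    """Expected number of chance hits at this score in db_size comparisons."""
    return db_size * gumbel_pvalue(score)

# For high scores the tail is approximately exp(-LAMBDA*(x - MU)), the
# exponential form conjectured for the tail of Forward scores.
print(f"P-value at 30 bits: {gumbel_pvalue(30.0):.3e}")
print(f"E-value against 1e6 sequences: {evalue(30.0, 1_000_000):.3f}")
```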

  10. A Climatological Study of Cloud to Ground Lightning Strikes in the Vicinity of the Kennedy Space Center

    NASA Technical Reports Server (NTRS)

    Burns, Lee; Decker, Ryan

    2004-01-01

    Lightning strike location and peak current are monitored operationally in the Kennedy Space Center (KSC)/Cape Canaveral Air Force Station (CCAFS) area by the Cloud to Ground Lightning Surveillance System (CGLSS). The present study compiles ten years of CGLSS data into a climatological database of all strikes recorded within a 20-mile radius of space shuttle launch platform LP39A, which serves as a convenient central point. The period of record (POR) for the database runs from January 1, 1993 to December 31, 2002. Histograms and cumulative probability curves are produced to determine the distribution of occurrence rates for the spectrum of strike intensities (given in kA). Further analysis of the database provides a description of both seasonal and interannual variations in the lightning distribution.
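
    The analysis products named above reduce to a histogram over intensity bins and its running sum. The sketch below builds both from a synthetic sample of peak currents standing in for the CGLSS records; the bin width and distribution parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
peak_ka = rng.lognormal(mean=3.0, sigma=0.6, size=10_000)  # synthetic kA values

bins = np.arange(0, 201, 5)                  # 5-kA intensity bins
counts, edges = np.histogram(peak_ka, bins=bins)
cum_prob = np.cumsum(counts) / counts.sum()  # empirical cumulative probability

for edge, cp in list(zip(edges[1:], cum_prob))[:5]:
    print(f"P(peak current <= {edge:3.0f} kA) = {cp:.3f}")
```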

  11. On the predictability of protein database search complexity and its relevance to optimization of distributed searches.

    PubMed

    Deciu, Cosmin; Sun, Jun; Wall, Mark A

    2007-09-01

    We discuss several aspects related to load balancing of database search jobs in a distributed computing environment, such as a Linux cluster. Load balancing is a technique for making the most of multiple computational resources, which is particularly relevant in environments in which the usage of such resources is very high. The particular case of the Sequest program is considered here, but the general methodology should apply to any similar database search program. We show how the runtimes for Sequest searches of tandem mass spectral data can be predicted from profiles of previous representative searches, and how this information can be used for better load balancing of novel data. A well-known heuristic load balancing method is shown to be applicable to this problem, and its performance is analyzed for a variety of search parameters.
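
    The abstract does not name its heuristic, but a standard choice for this problem is Longest Processing Time (LPT) scheduling: sort the jobs by predicted runtime, then repeatedly give the longest remaining job to the least-loaded node. The sketch below assumes runtimes already predicted from profiles of earlier searches; the job names and node count are invented.

```python
import heapq

def lpt_schedule(predicted_runtimes, n_nodes):
    """Greedy LPT assignment; returns (load, node id, jobs) per node."""
    heap = [(0.0, i, []) for i in range(n_nodes)]   # (load, node id, jobs)
    heapq.heapify(heap)
    for job, runtime in sorted(predicted_runtimes.items(),
                               key=lambda kv: kv[1], reverse=True):
        load, node, jobs = heapq.heappop(heap)      # least-loaded node first
        jobs.append(job)
        heapq.heappush(heap, (load + runtime, node, jobs))
    return sorted(heap, key=lambda x: x[1])

# Hypothetical predicted runtimes (seconds) for four search jobs.
jobs = {"run_A": 340.0, "run_B": 120.0, "run_C": 305.0, "run_D": 95.0}
for load, node, assigned in lpt_schedule(jobs, 2):
    print(f"node {node}: {assigned} (predicted {load:.0f} s)")
```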

  12. Insertion algorithms for network model database management systems

    NASA Astrophysics Data System (ADS)

    Mamadolimov, Abdurashid; Khikmat, Saburov

    2017-12-01

    The network model is a database model conceived as a flexible way of representing objects and their relationships. Its distinguishing feature is that the schema, viewed as a graph in which object types are nodes and relationship types are arcs, forms a partial order. When a database is large and query comparisons are expensive, the efficiency requirement for management algorithms is to minimize the number of query comparisons. We consider the updating operation for network model database management systems and develop a new sequential algorithm for it. We also suggest a distributed version of the algorithm.

  13. Environmental Database For Water-Quality Data for the Penobscot River, Maine: Design Documentation and User Guide

    USGS Publications Warehouse

    Giffen, Sarah E.

    2002-01-01

    An environmental database was developed to store water-quality data collected during the 1999 U.S. Geological Survey investigation of the occurrence and distribution of dioxins, furans, and PCBs in the riverbed sediment and fish tissue in the Penobscot River in Maine. The database can be used to store a wide range of detailed information and to perform complex queries on the data it contains. The database also could be used to store data from other historical and any future environmental studies conducted on the Penobscot River and surrounding regions.

  14. Comprehensive, comprehensible, distributed and intelligent databases: current status.

    PubMed

    Frishman, D; Heumann, K; Lesk, A; Mewes, H W

    1998-01-01

    It is only a matter of time until a user will see not many but one integrated database of information for molecular biology. Is this true? Is it a good thing? Why will it happen? Where are we now? What developments are fostering and what developments are impeding progress towards this end? A list of WWW resources devoted to database issues in molecular biology is available at http://www.mips.biochem.mpg.de. Contact: frishman@mips.biochem.mpg.de

  15. Performance Evaluation of NoSQL Databases: A Case Study

    DTIC Science & Technology

    2015-02-01

    ...a centralized relational database. The customer decided to consider NoSQL technologies for two specific uses, namely: the primary data store for... The choice of a particular NoSQL database imposes a specific distributed software architecture and data model, and is a major determinant of the

  16. A Methodolgy, Based on Analytical Modeling, for the Design of Parallel and Distributed Architectures for Relational Database Query Processors.

    DTIC Science & Technology

    1987-12-01

    [Figure 2, Intelligent Disk Controller, and Figure 5, Processor-Per-Head: architecture diagrams showing application programs, the database management system, the disk/data controller, the operating system, and the host.] ...However, these additional properties have been proven in classical set and relation theory [75]. These additional properties are described here

  17. A Unified Approach to Joint Regional/Teleseismic Calibration and Event Location with a 3D Earth Model

    DTIC Science & Technology

    2010-09-01

    raytracing and travel-time calculation in 3D Earth models, such as the finite-difference eikonal method (e.g., Podvin and Lecomte, 1991), fast...by Reiter and Rodi (2009) in constructing JWM. Two teleseismic data sets were considered, both extracted from the EHB database (Engdahl et al...extracted from the updated EHB database distributed by the International Seismological Centre (http://www.isc.ac.uk/EHB/index.html). The new database

  18. Quality Attribute-Guided Evaluation of NoSQL Databases: A Case Study

    DTIC Science & Technology

    2015-01-16

    evaluations of NoSQL databases specifically, and big data systems in general, that have become apparent during our study. Keywords: NoSQL, distributed...technology, namely that of big data software systems [1]. At the heart of big data systems is a collection of database technologies that are more...born organizations such as Google and Amazon [3][4], along with those of numerous other big data innovators, have created a variety of open source and

  19. Automatic pattern localization across layout database and photolithography mask

    NASA Astrophysics Data System (ADS)

    Morey, Philippe; Brault, Frederic; Beisser, Eric; Ache, Oliver; Röth, Klaus-Dieter

    2016-03-01

    Advanced process photolithography masks require more and more controls for registration versus design and critical dimension uniformity (CDU). The measurement points should be distributed over the whole mask and may be denser in areas critical to wafer overlay requirements. This means that some, if not many, of these controls should be made inside the customer die and may use non-dedicated patterns. It is then mandatory to access the original layout database to select patterns for the metrology process. Finding hundreds of relevant patterns in a database containing billions of polygons may be possible, but in addition, it is mandatory to create the complete metrology job quickly and reliably. Combining, on one hand, software expertise in mask database processing and, on the other hand, advanced skills in control and registration equipment, we have developed a Mask Dataprep Station able to select an appropriate number of measurement targets and their positions in a huge database and to automatically create measurement jobs on the corresponding areas on the mask for the registration metrology system. In addition, the required design clips are generated from the database in order to perform the rendering procedure on the metrology system. This new methodology has been validated on a real production line for the most advanced processes. This paper presents the main challenges that we have faced, as well as some results on the global performance.
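
    A minimal sketch of one way to spread measurement targets across a mask, binning candidate pattern locations into a coarse grid and keeping at most one per cell; the names and the selection rule are illustrative assumptions, not the Mask Dataprep Station's actual algorithm:

```python
# Illustrative target selection: spread picks across the mask by
# bucketing candidate pattern locations into a coarse grid.
def select_targets(candidates, mask_w, mask_h, nx=10, ny=10):
    """candidates: list of (x, y) pattern locations in mask coordinates.
    Returns at most one target per grid cell, giving even coverage."""
    chosen = {}
    for x, y in candidates:
        cell = (int(x / mask_w * nx), int(y / mask_h * ny))
        chosen.setdefault(cell, (x, y))  # keep first candidate per cell
    return list(chosen.values())

targets = select_targets([(1.0, 2.0), (1.1, 2.1), (90.0, 50.0)], 100.0, 100.0)
# -> two targets: the two nearby candidates collapse into one grid cell
```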

  20. An effective model for store and retrieve big health data in cloud computing.

    PubMed

    Goli-Malekabadi, Zohreh; Sargolzaei-Javan, Morteza; Akbari, Mohammad Kazem

    2016-08-01

    The volume of healthcare data, including different and variable text types, sounds, and images, is increasing day by day. Therefore, the storage and processing of these data is a necessary and challenging issue. Generally, relational databases are used for storing health data, but they are not able to handle the massive and diverse nature of such data. This study aimed at presenting a model based on NoSQL databases for the storage of healthcare data. Among the different types of NoSQL databases, document-based DBs were selected after a survey of the nature of health data. The presented model was implemented in the Cloud environment to gain access to its distribution properties. Then, the data were distributed over the database by applying sharding. The efficiency of the model was evaluated in comparison with the previous data model, a relational database, considering query time, data preparation, flexibility, and extensibility parameters. The results showed that the presented model performed approximately the same as SQL Server for "read" queries while it acted more efficiently than SQL Server for "write" queries. Also, the performance of the presented model was better than SQL Server in the case of flexibility, data preparation, and extensibility. Based on these observations, the proposed model was more effective than relational databases for handling health data.
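
    A minimal sketch of the document-oriented approach, using MongoDB via pymongo with hypothetical collection and field names; enabling sharding is shown only as a comment because it requires a sharded cluster:

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
db = client["health"]

# Document stores accept heterogeneous records without a fixed schema:
db.records.insert_one({"patient_id": 1, "type": "note", "text": "..."})
db.records.insert_one({"patient_id": 1, "type": "image",
                       "modality": "MRI", "uri": "s3://bucket/scan-42"})

# On a sharded cluster, the collection could be distributed by a shard key,
# e.g. (run against mongos):
# client.admin.command("shardCollection", "health.records",
#                      key={"patient_id": "hashed"})
```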

  1. High Resolution Electro-Optical Aerosol Phase Function Database PFNDAT2006

    DTIC Science & Technology

    2006-08-01

    snow models use the gamma distribution (equation 12) with m = 0. 3.4.1 Rain Model: The most widely used analytical parameterization for raindrop size... Uijlenhoet and Stricker (22), as the result of an analytical derivation based on a theoretical parameterization for the raindrop size distribution...
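
    For reference, the gamma raindrop size distribution mentioned here is conventionally written as below; with shape parameter m = 0 it reduces to the exponential (Marshall-Palmer) form. This is the standard textbook parameterization, not a reproduction of the report's equation 12:

```latex
% Gamma drop size distribution: drops per unit volume per unit diameter D
N(D) = N_0 \, D^{m} \, e^{-\Lambda D}
% Setting m = 0 recovers the exponential (Marshall--Palmer) form
N(D) = N_0 \, e^{-\Lambda D}
```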

  2. Heterogeneous database integration in biomedicine.

    PubMed

    Sujansky, W

    2001-08-01

    The rapid expansion of biomedical knowledge, reduction in computing costs, and spread of internet access have created an ocean of electronic data. The decentralized nature of our scientific community and healthcare system, however, has resulted in a patchwork of diverse, or heterogeneous, database implementations, making access to and aggregation of data across databases very difficult. The database heterogeneity problem applies equally to clinical data describing individual patients and biological data characterizing our genome. Specifically, databases are highly heterogeneous with respect to the data models they employ, the data schemas they specify, the query languages they support, and the terminologies they recognize. Heterogeneous database systems attempt to unify disparate databases by providing uniform conceptual schemas that resolve representational heterogeneities, and by providing querying capabilities that aggregate and integrate distributed data. Research in this area has applied a variety of database and knowledge-based techniques, including semantic data modeling, ontology definition, query translation, query optimization, and terminology mapping. Existing systems have addressed heterogeneous database integration in the realms of molecular biology, hospital information systems, and application portability.

  3. Conceptual Model Formalization in a Semantic Interoperability Service Framework: Transforming Relational Database Schemas to OWL.

    PubMed

    Bravo, Carlos; Suarez, Carlos; González, Carolina; López, Diego; Blobel, Bernd

    2014-01-01

    Healthcare information is distributed through multiple heterogeneous and autonomous systems. Accessing and sharing distributed information sources is a challenging task. To contribute to meeting this challenge, this paper presents a formal, complete, and semi-automatic transformation service from Relational Databases to the Web Ontology Language. The proposed service makes use of an algorithm that can transform several data models from different domains, deploying mainly inheritance rules. The paper emphasizes the relevance of integrating the proposed approach into an ontology-based interoperability service to achieve semantic interoperability.
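
    A minimal sketch of the general idea behind such transformations, mapping a relational table to an OWL class and its columns to datatype properties with rdflib; the table and the rules shown are illustrative assumptions, not the paper's algorithm:

```python
from rdflib import Graph, Namespace
from rdflib.namespace import OWL, RDF, RDFS, XSD

EX = Namespace("http://example.org/hospital#")
g = Graph()
g.bind("ex", EX)

# Hypothetical relational schema: Patient(id PK, name VARCHAR, born DATE).
# Rule 1: a table becomes an OWL class.
g.add((EX.Patient, RDF.type, OWL.Class))

# Rule 2: each non-key column becomes a datatype property of that class.
for column, dtype in [("name", XSD.string), ("born", XSD.date)]:
    prop = EX["patient_" + column]
    g.add((prop, RDF.type, OWL.DatatypeProperty))
    g.add((prop, RDFS.domain, EX.Patient))
    g.add((prop, RDFS.range, dtype))

# Rule 3 (not shown): foreign keys become object properties, and tables
# sharing a primary key can be mapped to rdfs:subClassOf (inheritance).
print(g.serialize(format="turtle"))
```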

  4. Distributed query plan generation using multiobjective genetic algorithm.

    PubMed

    Panicker, Shina; Kumar, T V Vijay

    2014-01-01

    A distributed query processing strategy, which is a key performance determinant in accessing distributed databases, aims to minimize the total query processing cost. One way to achieve this is by generating efficient distributed query plans that involve fewer sites for processing a query. In the case of distributed relational databases, the number of possible query plans increases exponentially with the number of relations accessed by the query and the number of sites where these relations reside. Consequently, computing optimal distributed query plans becomes a complex problem. This distributed query plan generation (DQPG) problem has already been addressed using a single-objective genetic algorithm, where the objective is to minimize the total query processing cost comprising the local processing cost (LPC) and the site-to-site communication cost (CC). In this paper, the DQPG problem is formulated and solved as a biobjective optimization problem, with the two objectives being to minimize total LPC and to minimize total CC. These objectives are simultaneously optimized using the multiobjective genetic algorithm NSGA-II. Experimental comparison of the proposed NSGA-II based DQPG algorithm with the single-objective genetic algorithm shows that the former performs comparatively better and converges quickly towards optimal solutions for an observed crossover and mutation probability.
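
    A minimal sketch of the biobjective formulation, evaluating candidate plans (one site per relation) against the two costs and testing Pareto dominance; the cost matrices and the shipping rule are illustrative assumptions, and the NSGA-II loop itself (e.g., via pymoo or DEAP) is omitted:

```python
# A query plan assigns each relation to a processing site.
plan_a = [0, 0, 1]   # relations r0, r1 at site 0; r2 at site 1
plan_b = [0, 1, 2]

# Hypothetical costs: lpc[r][s] = local processing cost of relation r at
# site s; cc[s][t] = communication cost between sites s and t.
lpc = [[2, 3, 5], [1, 4, 4], [6, 2, 3]]
cc = [[0, 5, 9], [5, 0, 4], [9, 4, 0]]

def objectives(plan):
    total_lpc = sum(lpc[r][s] for r, s in enumerate(plan))
    # Ship every relation to the site of the final join (site of r0 here).
    dest = plan[0]
    total_cc = sum(cc[s][dest] for s in plan)
    return total_lpc, total_cc

def dominates(p, q):
    """Pareto dominance: p is no worse in both objectives, better in one."""
    (pl, pc), (ql, qc) = objectives(p), objectives(q)
    return pl <= ql and pc <= qc and (pl < ql or pc < qc)

print(objectives(plan_a), objectives(plan_b), dominates(plan_a, plan_b))
# (5, 5) (9, 14) True -> plan_a dominates plan_b on both objectives
```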

  5. Spatial distribution of citizen science casuistic observations for different taxonomic groups.

    PubMed

    Tiago, Patrícia; Ceia-Hasse, Ana; Marques, Tiago A; Capinha, César; Pereira, Henrique M

    2017-10-16

    Opportunistic citizen science databases are becoming an important way of gathering information on species distributions. These data are temporally and spatially dispersed and may suffer from biases in the distribution of the observations in space and/or time. In this work, we test the influence of landscape variables on the distribution of citizen science observations for eight taxonomic groups. We use data collected through a Portuguese citizen science database (biodiversity4all.org). We use a zero-inflated negative binomial regression to model the distribution of observations as a function of a set of variables representing the landscape features plausibly influencing the spatial distribution of the records. Results suggest that the density of paths is the most important variable, having a statistically significant positive relationship with the number of observations for seven of the eight taxa considered. Wetland coverage was also identified as having a significant positive relationship for birds, amphibians and reptiles, and mammals. Our results highlight that the distribution of species observations in citizen science projects is spatially biased. Higher frequency of observations is driven largely by accessibility and by the presence of water bodies. We conclude that efforts are required to increase the spatial evenness of sampling effort from volunteers.
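
    A minimal sketch of fitting a zero-inflated negative binomial model of observation counts with statsmodels; the covariates and counts are synthetic stand-ins for the paper's landscape variables, not its data:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.discrete.count_model import ZeroInflatedNegativeBinomialP

rng = np.random.default_rng(0)
n = 500
path_density = rng.gamma(2.0, 1.0, n)   # hypothetical covariate
wetland = rng.uniform(0, 1, n)          # hypothetical covariate
X = sm.add_constant(np.column_stack([path_density, wetland]))

# Synthetic counts with excess zeros, standing in for records per cell.
mu = np.exp(0.2 + 0.5 * path_density + 0.8 * wetland)
y = rng.poisson(mu) * (rng.uniform(size=n) > 0.4)

model = ZeroInflatedNegativeBinomialP(y, X, exog_infl=np.ones((n, 1)), p=2)
result = model.fit(maxiter=200, disp=False)
print(result.summary())
```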

  6. Framework for Optimizing Selection of Interspecies Correlation Estimation Models to Address Species Diversity and Toxicity Gaps in an Aquatic Database

    EPA Science Inventory

    The Chemical Aquatic Fate and Effects (CAFE) database is a tool that facilitates assessments of accidental chemical releases into aquatic environments. CAFE contains aquatic toxicity data used in the development of species sensitivity distributions (SSDs) and the estimation of ha...

  7. Space Object Radiometric Modeling for Hardbody Optical Signature Database Generation

    DTIC Science & Technology

    2009-09-01

    Introduction: This presentation summarizes recent activity in monitoring spacecraft health status using passive remote optical nonimaging... It is beneficial to the observer/analyst to understand the fundamental optical signature variability associated with these detection and

  8. Comparing IndexedHBase and Riak for Serving Truthy: Performance of Data Loading and Query Evaluation

    DTIC Science & Technology

    2013-08-01

    Subject terms: performance evaluation, distributed database, NoSQL, HBase, indexing. Xiaoming Gao, Judy Qiu...common hashtags created during a given time window. With the purpose of finding a solution for these challenges, we evaluate NoSQL databases such as

  9. Hypercat: A Database for Extragalactic Astronomy

    NASA Astrophysics Data System (ADS)

    Prugniel, Ph.; Maubon, G.

    The Hypercat Database is developed at Observatoire de Lyon and is distributed on the Web (www-obs.univ-lyon1.fr/hypercat) through different mirrors in Europe. The goal of Hypercat is to gather data necessary for studying the evolution of galaxies (dynamics and stellar content) and particularly to provide a z = 0 reference for these studies.

  10. Computerization of the Arkansas Fishes Database

    Treesearch

    Henry W. Robison; L. Gayle Henderson; Melvin L. Warren; Janet S. Rader

    2004-01-01

    Until recently, distributional data for the fishes of Arkansas existed in the form of museum records, field notebooks of various ichthyologists, and published fish survey data, none of which was in a digital format. In 1995, a relational database system was used to design a PC platform data entry module for the capture of information on...

  11. CALINVASIVES: a revolutionary tool to monitor invasive threats

    Treesearch

    M. Garbelotto; S. Drill; C. Powell; J. Malpas

    2017-01-01

    CALinvasives is a web-based relational database and content management system (CMS) cataloging the statewide distribution of invasive pathogens and pests and the plant hosts they impact. The database has been developed as a collaboration between the Forest Pathology and Mycology Laboratory at UC Berkeley and Calflora. CALinvasives will combine information on the...

  12. Integrated remote sensing and visualization (IRSV) system for transportation infrastructure operations and management, phase two, volume 4: web-based bridge information database--visualization analytics and distributed sensing.

    DOT National Transportation Integrated Search

    2012-03-01

    This report introduces the design and implementation of a Web-based bridge information visual analytics system. This project integrates Internet, multiple databases, remote sensing, and other visualization technologies. The result combines a GIS ...

  13. Compression of Index Term Dictionary in an Inverted-File-Oriented Database: Some Effective Algorithms.

    ERIC Educational Resources Information Center

    Wisniewski, Janusz L.

    1986-01-01

    Discussion of a new method of index term dictionary compression in an inverted-file-oriented database highlights a technique of word coding, which generates short fixed-length codes obtained from the index terms themselves by analysis of monogram and bigram statistical distributions. Substantial savings in communication channel utilization are…
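
    A small sketch in the spirit of that technique, deriving a codebook from the bigram frequency distribution of the index terms and substituting one-byte codes for frequent bigrams; this is an illustrative reading, not the article's exact fixed-length coding scheme:

```python
from collections import Counter

terms = ["database", "databases", "distributed", "distribution", "dictionary"]

# Bigram frequency distribution across the index term dictionary.
bigrams = Counter(t[i:i + 2] for t in terms for i in range(len(t) - 1))

# Map the most frequent bigrams to unused one-byte codes (0x80-0xFF).
codebook = {bg: bytes([0x80 + i])
            for i, (bg, _) in enumerate(bigrams.most_common(128))}

def encode(term: str) -> bytes:
    out, i = b"", 0
    while i < len(term):
        bg = term[i:i + 2]
        if bg in codebook:        # substitute a frequent bigram
            out += codebook[bg]
            i += 2
        else:                     # fall back to the raw character
            out += term[i].encode("ascii")
            i += 1
    return out

for t in terms:
    print(t, len(t), "->", len(encode(t)), "bytes")
```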

  14. GLAD: a system for developing and deploying large-scale bioinformatics grid.

    PubMed

    Teo, Yong-Meng; Wang, Xianbing; Ng, Yew-Kwong

    2005-03-01

    Grid computing is used to solve large-scale bioinformatics problems with gigabyte-scale databases by distributing the computation across multiple platforms. Until now, in developing bioinformatics grid applications, it has been extremely tedious to design and implement the component algorithms and parallelization techniques for different classes of problems, and to access remotely located sequence database files of varying formats across the grid. In this study, we propose a grid programming toolkit, GLAD (Grid Life sciences Applications Developer), which facilitates the development and deployment of bioinformatics applications on a grid. GLAD has been developed using ALiCE (Adaptive scaLable Internet-based Computing Engine), a Java-based grid middleware, which exploits task-based parallelism. Two bioinformatics benchmark applications, distributed sequence comparison and distributed progressive multiple sequence alignment, have been developed using GLAD.

  15. Distributing Variable Star Data to the Virtual Observatory

    NASA Astrophysics Data System (ADS)

    Kinne, Richard C.; Templeton, M. R.; Henden, A. A.; Zografou, P.; Harbo, P.; Evans, J.; Rots, A. H.; LAZIO, J.

    2013-01-01

    Effective distribution of data is a core element of astronomy today. The AAVSO is the home of several unique databases. The AAVSO International Database (AID) contains over a century of photometric and time-series data on thousands of individual variable stars, comprising over 22 million observations. The AAVSO Photometric All-Sky Survey (APASS) is a new photometric catalog containing calibrated photometry in Johnson B, V and Sloan g', r' and i' filters for stars with magnitudes of 10 < V < 17. The AAVSO is partnering with researchers and technologists at the Virtual Astronomical Observatory (VAO) to solve the data distribution problem for these datasets by making them available via various VO tools. We give specific examples of how these data can be accessed through Virtual Observatory (VO) toolsets and utilized for astronomical research.

  16. A distributed database view of network tracking systems

    NASA Astrophysics Data System (ADS)

    Yosinski, Jason; Paffenroth, Randy

    2008-04-01

    In distributed tracking systems, multiple non-collocated trackers cooperate to fuse local sensor data into a global track picture. Generating this global track picture at a central location is fairly straightforward, but the single point of failure and excessive bandwidth requirements introduced by centralized processing motivate the development of decentralized methods. In many decentralized tracking systems, trackers communicate with their peers via a lossy, bandwidth-limited network in which dropped, delayed, and out of order packets are typical. Oftentimes the decentralized tracking problem is viewed as a local tracking problem with a networking twist; we believe this view can underestimate the network complexities to be overcome. Indeed, a subsequent 'oversight' layer is often introduced to detect and handle track inconsistencies arising from a lack of robustness to network conditions. We instead pose the decentralized tracking problem as a distributed database problem, enabling us to draw inspiration from the vast extant literature on distributed databases. Using the two-phase commit algorithm, a well known technique for resolving transactions across a lossy network, we describe several ways in which one may build a distributed multiple hypothesis tracking system from the ground up to be robust to typical network intricacies. We pay particular attention to the dissimilar challenges presented by network track initiation vs. maintenance and suggest a hybrid system that balances speed and robustness by utilizing two-phase commit for only track initiation transactions. Finally, we present simulation results contrasting the performance of such a system with that of more traditional decentralized tracking implementations.
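
    A minimal sketch of the two-phase commit pattern the authors draw on, with a coordinator that commits a track-initiation transaction only if every participant votes yes; the participant logic and the network layer are stubbed out as assumptions:

```python
# Minimal two-phase commit: a transaction (e.g., initiating a new track)
# is applied only if every tracker votes to prepare it.
class Participant:
    def __init__(self, name):
        self.name, self.staged, self.tracks = name, None, []

    def prepare(self, txn) -> bool:
        self.staged = txn      # a real system stages this durably
        return True            # vote yes (a real tracker may vote no)

    def commit(self):
        self.tracks.append(self.staged)
        self.staged = None

    def abort(self):
        self.staged = None

def two_phase_commit(txn, participants) -> bool:
    # Phase 1: ask every participant to prepare and collect votes.
    votes = [p.prepare(txn) for p in participants]
    # Phase 2: commit everywhere on unanimous yes, else abort everywhere.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.abort()
    return False

trackers = [Participant(f"tracker{i}") for i in range(3)]
ok = two_phase_commit({"track_id": 17, "state": [1.0, 2.0]}, trackers)
print(ok, [t.tracks for t in trackers])
```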

  17. Workstation Analytics in Distributed Warfighting Experimentation: Results from Coalition Attack Guidance Experiment 3A

    DTIC Science & Technology

    2014-06-01

    central location. Each of the SQLite databases is converted and stored in one MySQL database, and the pcap files are parsed to extract call information...from the specific communications applications used during the experiment. This extracted data is then stored in the same MySQL database. With all...rhythm of the event. Figure 3 shows the application usage over the course of the experiment for the EXDIR. As seen, the EXDIR spent the majority

  18. Inter Annual Variability of the Acoustic Propagation in the Yellow Sea Identified from a Synoptic Monthly Gridded Database as Compared with GDEM

    DTIC Science & Technology

    2016-09-01

    the world climate is in fact warming due to anthropogenic causes (Anderegg et al. 2010; Solomon et al. 2009). To put this in terms for this research...2006). The present research uses a 0.5' resolution. B. SEDIMENTS DATABASE: There are four openly available sediment databases: Enhanced, Standard... This research investigates the inter-annual acoustic variability in the Yellow Sea identified from

  19. SPERM COUNT DISTRIBUTIONS IN FERTILE MEN

    EPA Science Inventory

    Sperm concentration and count are often used as indicators of environmental impacts on male reproductive health. Existing clinical databases may be biased towards subfertile men with low sperm counts and less is known about expected sperm count distributions in cohorts of fertil...

  1. Mass-storage management for distributed image/video archives

    NASA Astrophysics Data System (ADS)

    Franchi, Santina; Guarda, Roberto; Prampolini, Franco

    1993-04-01

    The realization of an image/video database requires a specific design for both the database structures and the mass storage management. This issue was addressed in the project for the digital image/video database system designed at the IBM SEMEA Scientific & Technical Solution Center. Proper database structures have been defined to catalog image/video coding techniques with their related parameters, and descriptions of image/video contents. User workstations and servers are distributed along a local area network. Image/video files are not managed directly by the DBMS server. Because of their large size, they are stored outside the database on network devices. The database contains the pointers to the image/video files and the descriptions of the storage devices. The system can use different kinds of storage media, organized in a hierarchical structure. Three levels of functions are available to manage the storage resources. The functions of the lower level provide media management: they catalog devices and modify device status and device network location. The medium level manages image/video files on a physical basis: it handles file migration between high-capacity media and low-access-time media. The functions of the upper level work on image/video files on a logical basis, as they archive, move, and copy image/video data selected by user-defined queries. These functions are used to support the implementation of a storage management strategy. The database information about the characteristics of both storage devices and coding techniques is used by the third-level functions to fit delivery/visualization requirements and to reduce archiving costs.
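
    A minimal sketch of the pointer-based organization described above, with the database holding device and file metadata while the payloads live outside it; the schema and the migration rule are illustrative assumptions:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE device (device_id INTEGER PRIMARY KEY, location TEXT,
                     tier INTEGER);   -- 0 = fast/low-capacity, 1 = slow/high-capacity
CREATE TABLE video  (video_id INTEGER PRIMARY KEY, codec TEXT, bitrate INTEGER,
                     device_id INTEGER REFERENCES device,
                     path TEXT,       -- pointer: payload stays outside the DBMS
                     last_access DATE);
""")

def migrate_cold_files(con, cutoff_date):
    """Medium-level function: move files untouched since cutoff_date from
    tier 0 to tier 1 by updating their pointers (copying the payload
    itself would be done by the file/storage layer)."""
    con.execute("""
        UPDATE video SET device_id =
            (SELECT device_id FROM device WHERE tier = 1 LIMIT 1)
        WHERE last_access < ? AND device_id IN
            (SELECT device_id FROM device WHERE tier = 0)
    """, (cutoff_date,))
```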

  2. An Updating System for the Gridded Population Database of China Based on Remote Sensing, GIS and Spatial Database Technologies.

    PubMed

    Yang, Xiaohuan; Huang, Yaohuan; Dong, Pinliang; Jiang, Dong; Liu, Honghui

    2009-01-01

    The spatial distribution of population is closely related to land use and land cover (LULC) patterns on both regional and global scales. Population can be redistributed onto geo-referenced square grids according to this relation. In the past decades, various approaches to monitoring LULC using remote sensing and Geographic Information Systems (GIS) have been developed, which makes it possible for efficient updating of geo-referenced population data. A Spatial Population Updating System (SPUS) is developed for updating the gridded population database of China based on remote sensing, GIS and spatial database technologies, with a spatial resolution of 1 km by 1 km. The SPUS can process standard Moderate Resolution Imaging Spectroradiometer (MODIS L1B) data integrated with a Pattern Decomposition Method (PDM) and an LULC-Conversion Model to obtain patterns of land use and land cover, and provide input parameters for a Population Spatialization Model (PSM). The PSM embedded in SPUS is used for generating 1 km by 1 km gridded population data in each population distribution region based on natural and socio-economic variables. Validation results from finer township-level census data of Yishui County suggest that the gridded population database produced by the SPUS is reliable.

  3. A scientific database for real-time Neutron Monitor measurements - taking Neutron Monitors into the 21st century

    NASA Astrophysics Data System (ADS)

    Steigies, Christian

    2012-07-01

    The Neutron Monitor Database project, www.nmdb.eu, was funded in 2008 and 2009 by the European Commission's 7th Framework Programme (FP7). Neutron monitors (NMs) have been in use worldwide since the International Geophysical Year (IGY) in 1957, and cosmic ray data from the IGY and the improved NM64 NMs have been distributed since that time, but a common data format existed only for data with one-hour resolution. These data were first distributed in printed books, later via the World Data Center ftp server. In the 1990s, the first NM stations started to record data at higher resolutions (typically 1 minute) and to publish them on their web pages. However, every NM station chose its own format, making it cumbersome to work with these distributed data. In NMDB, all European and some neighboring NM stations came together to agree on a common format for high-resolution data and made this available via a centralized database. The goal of NMDB is to make all data from all NM stations available in real-time. The original NMDB network has recently been joined by the Bartol Research Institute (Newark DE, USA), the National Autonomous University of Mexico and the North-West University (Potchefstroom, South Africa). The data are accessible to everyone via an easy-to-use web interface, but expert users can also directly access the database to build applications like real-time space weather alerts. Even though SQL databases are used today by most web services (blogs, wikis, social media, e-commerce), the power of an SQL database has not yet been fully realized by the scientific community. In training courses, we are teaching how to make use of NMDB, how to join NMDB, and how to ensure data quality. The present status of the extended NMDB will be presented. The consortium welcomes further data providers to help increase the scientific contributions of the worldwide neutron monitor network to heliospheric physics and space weather.
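
    A minimal sketch of the kind of direct SQL access described, against an illustrative one-table layout of 1-minute count rates (NMDB's real schema is not reproduced here); SQLite stands in for the project's MySQL server:

```python
import sqlite3  # stands in for a MySQL connection to the real database

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE nm_1min (
    station TEXT,    -- station code
    ts TEXT,         -- UTC timestamp
    counts REAL)     -- pressure-corrected count rate
""")

# A real-time alert application might watch for a sudden count-rate
# change averaged over the last few minutes, per station:
rows = con.execute("""
    SELECT station, AVG(counts) AS recent
    FROM nm_1min
    WHERE ts >= datetime('now', '-5 minutes')
    GROUP BY station
""").fetchall()
```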

  4. Prescriber Compliance With Liver Monitoring Guidelines for Pazopanib in the Postapproval Setting: Results From a Distributed Research Network.

    PubMed

    Shantakumar, Sumitra; Nordstrom, Beth L; Hall, Susan A; Djousse, Luc; van Herk-Sukel, Myrthe P P; Fraeman, Kathy H; Gagnon, David R; Chagin, Karen; Nelson, Jeanenne J

    2017-04-20

    Pazopanib received US Food and Drug Administration approval in 2009 for advanced renal cell carcinoma. During clinical development, liver chemistry abnormalities and adverse hepatic events were observed, leading to a boxed warning for hepatotoxicity and detailed label prescriber guidelines for liver monitoring. As part of postapproval regulatory commitments, a cohort study was conducted to assess prescriber compliance with liver monitoring guidelines. Over a 4-year period, a distributed network approach was used across 3 databases: the US Veterans Affairs Healthcare System, a US outpatient oncology community practice database, and the Dutch PHARMO Database Network. Measures of prescriber compliance were designed using the original pazopanib label guidelines for liver monitoring. Results from the VA (n = 288) and oncology databases (n = 283) indicate that prescriber liver chemistry monitoring was less than 100%: 73% to 74% compliance with baseline testing and 37% to 39% compliance with testing every 4 weeks. Compliance was highest near drug initiation and decreased over time. Among patients who should have had weekly testing, compliance was 56% in both databases. The more serious elevations examined, including combinations of liver enzyme elevations meeting the laboratory definition of Hy's law, were infrequent but always led to appropriate discontinuation of pazopanib. Only 4 patients were identified for analysis in the Dutch database; none had recorded baseline testing. In this population-based study, prescriber compliance was reasonable near pazopanib initiation but low during subsequent weeks of treatment. This study provides information from real-world community practice settings and offers feedback to regulators on the effectiveness of label monitoring guidelines.

  5. Distributed structure-searchable toxicity (DSSTox) public database network: a proposal.

    PubMed

    Richard, Ann M; Williams, ClarLynda R

    2002-01-29

    The ability to assess the potential genotoxicity, carcinogenicity, or other toxicity of pharmaceutical or industrial chemicals based on chemical structure information is a highly coveted and shared goal of varied academic, commercial, and government regulatory groups. These diverse interests often employ different approaches and have different criteria and use for toxicity assessments, but they share a need for unrestricted access to existing public toxicity data linked with chemical structure information. Currently, there exists no central repository of toxicity information, commercial or public, that adequately meets the data requirements for flexible analogue searching, Structure-Activity Relationship (SAR) model development, or building of chemical relational databases (CRD). The distributed structure-searchable toxicity (DSSTox) public database network is being proposed as a community-supported, web-based effort to address these shared needs of the SAR and toxicology communities. The DSSTox project has the following major elements: (1) to adopt and encourage the use of a common standard file format (structure data file (SDF)) for public toxicity databases that includes chemical structure, text and property information, and that can easily be imported into available CRD applications; (2) to implement a distributed source approach, managed by a DSSTox Central Website, that will enable decentralized, free public access to structure-toxicity data files, and that will effectively link knowledgeable toxicity data sources with potential users of these data from other disciplines (such as chemistry, modeling, and computer science); and (3) to engage public/commercial/academic/industry groups in contributing to and expanding this community-wide, public data sharing and distribution effort. The DSSTox project's overall aims are to effect the closer association of chemical structure information with existing toxicity data, and to promote and facilitate structure-based exploration of these data within a common chemistry-based framework that spans toxicological disciplines.
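
    A minimal sketch of why the proposed SD file (SDF) standard suits this purpose: structure and named property fields travel together in one record, so files import directly into chemical relational databases. Shown with RDKit and hypothetical file and field names:

```python
from rdkit import Chem

# Each SDF record carries a structure plus arbitrary named data fields,
# e.g. a toxicity endpoint attached to the molecule (names hypothetical).
for mol in Chem.SDMolSupplier("dsstox_sample.sdf"):
    if mol is None:      # skip unparsable records
        continue
    name = mol.GetProp("_Name")
    if mol.HasProp("Carcinogenicity"):
        print(name, Chem.MolToSmiles(mol), mol.GetProp("Carcinogenicity"))
```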

  6. Estimating a Service-Life Distribution Based on Production Counts and a Failure Database

    DOE PAGES

    Ryan, Kenneth J.; Hamada, Michael Scott; Vardeman, Stephen B.

    2017-04-01

    A manufacturer wanted to compare the service-life distributions of two similar products. These concern product lifetimes after installation (not manufacture). For each product, there were available production counts and an imperfect database providing information on failing units. In the real case, these units were expensive repairable units warrantied against repairs. Failure (of interest here) was relatively rare and driven by a different mode/mechanism than ordinary repair events (not of interest here). Approach: Data models for the service life based on a standard parametric lifetime distribution and a related limited failure population were developed. These models were used to develop expressions for the likelihood of the available data that properly accounts for information missing in the failure database. Results: A Bayesian approach was employed to obtain estimates of model parameters (with associated uncertainty) in order to investigate characteristics of the service-life distribution. Custom software was developed and is included as Supplemental Material to this case study. One part of a responsible approach to the original case was a simulation experiment used to validate the correctness of the software and the behavior of the statistical methodology before using its results in the application, and an example of such an experiment is included here. Because of confidentiality issues that prevent use of the original data, simulated data with characteristics like the manufacturer’s proprietary data are used to illustrate some aspects of our real analyses. Lastly, we also note that, although this case focuses on rare and complete product failure, the statistical methodology provided is directly applicable to more standard warranty data problems involving typically much larger warranty databases where entries are warranty claims (often for repairs) rather than reports of complete failures.

  7. Preliminary Geologic Map of the Buxton 7.5' Quadrangle, Washington County, Oregon

    USGS Publications Warehouse

    Dinterman, Philip A.; Duvall, Alison R.

    2009-01-01

    This map, compiled from previously published and unpublished data, and new mapping by the authors, represents the general distribution of bedrock and surficial deposits of the Buxton 7.5-minute quadrangle. The database delineates map units that are identified by general age and lithology following the stratigraphic nomenclature of the U.S. Geological Survey. The scale of the source maps limits the spatial resolution (scale) of the database to 1:24,000 or smaller. This plot file and accompanying database depict the distribution of geologic materials and structures at a regional (1:24,000) scale. The report is intended to provide geologic information for the regional study of materials properties, earthquake shaking, landslide potential, mineral hazards, seismic velocity, and earthquake faults. In addition, the report contains new information and interpretations about the regional geologic history and framework. However, the regional scale of this report does not provide sufficient detail for site development purposes.

  8. The frequency and distribution of high-velocity gas in the Galaxy

    NASA Technical Reports Server (NTRS)

    Nichols, Joy S.

    1995-01-01

    The purpose of this study was to estimate the frequency and distribution of high-velocity gas in the Galaxy using UV absorption line measurements from archival high-dispersion IUE spectra and to identify particularly interesting regions for future study. Approximately 500 spectra have been examined. The study began with the creation of a database of all O and B stars with b less than or equal to 30 degrees observed with IUE at high dispersion over its 18-year lifetime. The original database of 2500 unique objects was reduced to 1200 objects which had optimal exposures available. The next task was to determine the distances of these stars so the high-velocity structures could be mapped in the Galaxy. Spectroscopic distances were calculated for each star for which photometry was available. The photometry was acquired for each star using the SIMBAD database. Preference was given to the uvby system where available; otherwise the UBV system was used.

  9. Data-mining analysis of the global distribution of soil carbon in observational databases and Earth system models

    NASA Astrophysics Data System (ADS)

    Hashimoto, Shoji; Nanko, Kazuki; Ťupek, Boris; Lehtonen, Aleksi

    2017-03-01

    Future climate change will dramatically change the carbon balance in the soil, and this change will affect the terrestrial carbon stock and the climate itself. Earth system models (ESMs) are used to understand the current climate and to project future climate conditions, but the soil organic carbon (SOC) stock simulated by ESMs and that of observational databases are not well correlated when the two are compared at fine grid scales. However, the specific key processes and factors, as well as the relationships among these factors that govern the SOC stock, remain unclear; the inclusion of such missing information would improve the agreement between modeled and observational data. In this study, we sought to identify the influential factors that govern global SOC distribution in observational databases, as well as those simulated by ESMs. We used a data-mining (machine-learning) scheme, boosted regression trees (BRT), to identify the factors affecting the SOC stock. We applied the BRT scheme to three observational databases and 15 ESM outputs from the fifth phase of the Coupled Model Intercomparison Project (CMIP5) and examined the effects of 13 variables/factors categorized into five groups (climate, soil property, topography, vegetation, and land-use history). Globally, the contributions of mean annual temperature, clay content, carbon-to-nitrogen (CN) ratio, wetland ratio, and land cover were high in the observational databases, whereas the contributions of mean annual temperature, land cover, and net primary productivity (NPP) were predominant in the SOC distribution in ESMs. A comparison of the influential factors at a global scale revealed that the most distinct differences between the SOCs from the observational databases and the ESMs were the low clay content and CN ratio contributions and the high NPP contribution in the ESMs. The results of this study will aid in identifying the causes of the current mismatches between observational SOC databases and ESM outputs and improve the modeling of terrestrial carbon dynamics in ESMs. This study also reveals how a data-mining algorithm can be used to assess model outputs.
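
    A minimal sketch of the BRT idea using scikit-learn's gradient boosting, ranking predictors of SOC by relative influence; the predictor names mirror the study's factor groups, but the data are synthetic stand-ins:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(1)
n = 2000
# Stand-in predictors echoing the study's factor groups.
X = np.column_stack([
    rng.normal(8, 6, n),     # mean annual temperature (deg C)
    rng.uniform(0, 60, n),   # clay content (%)
    rng.uniform(8, 30, n),   # C:N ratio
    rng.uniform(0, 1, n),    # wetland ratio
])
soc = (60 - 1.5 * X[:, 0] + 0.3 * X[:, 1] + 0.8 * X[:, 2]
       + 20 * X[:, 3] + rng.normal(0, 5, n))   # synthetic SOC stock

brt = GradientBoostingRegressor(n_estimators=500, learning_rate=0.01,
                                max_depth=3, subsample=0.5).fit(X, soc)
for name, imp in zip(["MAT", "clay", "C:N", "wetland"],
                     brt.feature_importances_):
    print(f"{name}: {imp:.2f}")   # relative influence of each factor
```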

  10. Surficial geologic map of the Amboy 30' x 60' quadrangle, San Bernardino County, California

    USGS Publications Warehouse

    Bedford, David R.; Miller, David M.; Phelps, Geoffrey A.

    2010-01-01

    The surficial geologic map of the Amboy 30' x 60' quadrangle presents characteristics of surficial materials for an area of approximately 5,000 km2 in the eastern Mojave Desert of southern California. This map consists of new surficial mapping conducted between 2000 and 2007, as well as compilations from previous surficial mapping. Surficial geologic units are mapped and described based on depositional process and age categories that reflect the mode of deposition, pedogenic effects following deposition, and, where appropriate, the lithologic nature of the material. Many physical properties were noted and measured during the geologic mapping. This information was used to classify surficial deposits and to understand their ecological importance. We focus on physical properties that drive hydrologic, biologic, and physical processes, such as particle-size distribution (PSD) and bulk density. The database contains point data representing locations of samples for both laboratory-determined physical properties and semiquantitative field-based information. We include the locations of all field observations and note the type of information collected in the field to assist in assessing the quality of the mapping. The publication is separated into three parts: documentation, spatial data, and printable map graphics of the database. Documentation includes this pamphlet, which provides a discussion of the surficial geology and units and the map. Spatial data are distributed as an ArcGIS Geodatabase in Microsoft Access format and are accompanied by a readme file, which describes the database contents, and FGDC metadata for the spatial map information. Map graphics files are distributed as PostScript and Adobe Portable Document Format (PDF) files that provide a view of the spatial database at the mapped scale.

  11. Design and Implementation of an Environmental Mercury Database for Northeastern North America

    NASA Astrophysics Data System (ADS)

    Clair, T. A.; Evers, D.; Smith, T.; Goodale, W.; Bernier, M.

    2002-12-01

    An important issue faced when attempting to interpret geochemical variability across large regions is the accumulation, access, and consistent display of data from a large number of sources. We were given the opportunity to provide a regional assessment of mercury distribution in surface waters, sediments, invertebrates, fish, and birds in a region extending from New York State to the Island of Newfoundland. We received over 20 individual databases from State, Provincial, and Federal governments, as well as from university researchers in both Canada and the United States. These databases came in a variety of formats and sizes. Our challenge was to find a way of accumulating and presenting the large amounts of acquired data in a consistent, easily accessible fashion that could then be readily interpreted. Moreover, the database had to be portable and easily distributable to the large number of study participants. We developed a static database structure using a web-based approach, which we were able to mount on a server accessible to all project participants. The site also contained all the necessary documentation related to the data, its acquisition, and the methods used in its analysis and interpretation. We then copied the complete web site onto CD-ROMs, which we distributed to all project participants, funding agencies, and other interested parties. The CD-ROM formed a permanent record of the project and was issued ISSN and ISBN numbers so that the information remains accessible to researchers in perpetuity. Here we present an overview of the CD-ROM and data structures, of the information accumulated over the first year of the study, and an initial interpretation of the results.

  12. Global Distribution of Outbreaks of Water-Associated Infectious Diseases

    PubMed Central

    Yang, Kun; LeJeune, Jeffrey; Alsdorf, Doug; Lu, Bo; Shum, C. K.; Liang, Song

    2012-01-01

    Background: Water plays an important role in the transmission of many infectious diseases, which pose a great burden on global public health. However, the global distribution of these water-associated infectious diseases and the underlying factors remain largely unexplored. Methods and Findings: Based on the Global Infectious Disease and Epidemiology Network (GIDEON), a global database including water-associated pathogens and diseases was developed. In this study, reported outbreak events associated with corresponding water-associated infectious diseases from 1991 to 2008 were extracted from the database. The location of each reported outbreak event was identified and geocoded into a GIS database. The GIS database also included geo-referenced socio-environmental information: population density (2000), annual accumulated temperature, surface water area, and average annual precipitation. Poisson models with Bayesian inference were developed to explore the association between these socio-environmental factors and the distribution of the reported outbreak events. Based on model predictions, a global relative risk map was generated. A total of 1,428 reported outbreak events were retrieved from the database. The analysis suggested that outbreaks of water-associated diseases are significantly correlated with socio-environmental factors. Population density is a significant risk factor for all categories of reported outbreaks of water-associated diseases; water-related diseases (e.g., vector-borne diseases) are associated with accumulated temperature; water-washed diseases (e.g., conjunctivitis) are inversely related to surface water area; and both water-borne and water-related diseases are inversely related to average annual rainfall. Based on the model predictions, "hotspots" of risk for all categories of water-associated diseases were explored. Conclusions: At the global scale, water-associated infectious diseases are significantly correlated with socio-environmental factors, impacting all regions, which are affected disproportionately by different categories of water-associated infectious diseases. PMID:22348158
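
    A minimal sketch of the count-model structure used here, fitting a Poisson regression of outbreak counts on socio-environmental covariates with statsmodels; the data are synthetic stand-ins, and the paper's Bayesian inference is replaced by maximum likelihood for brevity:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 300
pop_density = rng.lognormal(3, 1, n)   # persons / km^2 (stand-in)
temperature = rng.normal(20, 5, n)     # accumulated temperature proxy
rainfall = rng.uniform(200, 2000, n)   # average annual rainfall (mm)

X = sm.add_constant(np.column_stack(
    [np.log(pop_density), temperature, rainfall]))
lam = np.exp(-1 + 0.6 * np.log(pop_density) + 0.02 * temperature
             - 0.0005 * rainfall)
outbreaks = rng.poisson(lam)           # reported outbreak events per unit

fit = sm.GLM(outbreaks, X, family=sm.families.Poisson()).fit()
print(fit.params)  # positive density effect, negative rainfall effect
```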

  13. SIRSALE: integrated video database management tools

    NASA Astrophysics Data System (ADS)

    Brunie, Lionel; Favory, Loic; Gelas, J. P.; Lefevre, Laurent; Mostefaoui, Ahmed; Nait-Abdesselam, F.

    2002-07-01

    Video databases became an active field of research during the last decade. The main objective of such systems is to provide users with capabilities to conveniently search, access, and play back distributed, stored video data in the same way as they do for traditional distributed databases. Hence, such systems need to deal with hard issues: (a) video documents generate huge volumes of data and are time sensitive (streams must be delivered at a specific bitrate), and (b) the contents of video data are very hard to extract automatically and need to be annotated manually. To cope with these issues, many approaches have been proposed in the literature, including data models, query languages, video indexing, etc. In this paper, we present SIRSALE: a set of video database management tools that allow users to manipulate video documents and streams stored in large distributed repositories. All the proposed tools are based on generic models that can be customized for specific applications using ad hoc adaptation modules. More precisely, SIRSALE allows users to: (a) browse video documents by structure (sequences, scenes, shots) and (b) query the video database content by using a graphical tool adapted to the nature of the target video documents. This paper also presents an annotation interface that allows archivists to describe the content of video documents. All these tools are coupled to a video player integrating remote VCR functionalities and are based on active network technology. We also present how dedicated active services allow optimized transport of video streams (with Tamanoir active nodes). We then describe experiments using SIRSALE on an archive of news videos and soccer matches. The system has been demonstrated to professionals with positive feedback. Finally, we discuss open issues and present some perspectives.

  14. Development of a land-cover characteristics database for the conterminous U.S.

    USGS Publications Warehouse

    Loveland, Thomas R.; Merchant, J.W.; Ohlen, D.O.; Brown, Jesslyn F.

    1991-01-01

    Information regarding the characteristics and spatial distribution of the Earth's land cover is critical to global environmental research. A prototype land-cover database for the conterminous United States designed for use in a variety of global modelling, monitoring, mapping, and analytical endeavors has been created. The resultant database contains multiple layers, including the source AVHRR data, the ancillary data layers, the land-cover regions defined by the research, and translation tables linking the regions to other land classification schema (for example, UNESCO, USGS Anderson System). The land-cover characteristics database can be analyzed, transformed, or aggregated by users to meet a broad spectrum of requirements. -from Authors

  15. Geologic map and map database of the Palo Alto 30' x 60' quadrangle, California

    USGS Publications Warehouse

    Brabb, E.E.; Jones, D.L.; Graymer, R.W.

    2000-01-01

    This digital map database, compiled from previously published and unpublished data, and new mapping by the authors, represents the general distribution of bedrock and surficial deposits in the mapped area. Together with the accompanying text file (pamf.ps, pamf.pdf, pamf.txt), it provides current information on the geologic structure and stratigraphy of the area covered. The database delineates map units that are identified by general age and lithology following the stratigraphic nomenclature of the U.S. Geological Survey. The scale of the source maps limits the spatial resolution (scale) of the database to 1:62,500 or smaller.

  16. Geologic map and map database of western Sonoma, northernmost Marin, and southernmost Mendocino counties, California

    USGS Publications Warehouse

    Blake, M.C.; Graymer, R.W.; Stamski, R.E.

    2002-01-01

    This digital map database, compiled from previously published and unpublished data, and new mapping by the authors, represents the general distribution of bedrock and surficial deposits in the mapped area. Together with the accompanying text file (wsomf.ps, wsomf.pdf, wsomf.txt), it provides current information on the geologic structure and stratigraphy of the area covered. The database delineates map units that are identified by general age and lithology following the stratigraphic nomenclature of the U.S. Geological Survey. The scale of the source maps limits the spatial resolution (scale) of the database to 1:62,500 or smaller.

  17. Geographical Distribution of Biomass Carbon in Tropical Southeast Asian Forests: A Database (NPD-068)

    DOE Data Explorer

    Brown, Sandra [University of Illinois, Urbana, Illinois (USA); Iverson, Louis R. [University of Illinois, Urbana, Illinois (USA); Prasad, Anantha [University of Illinois, Urbana, Illinois (USA); Beaty, Tammy W. [CDIAC, Oak Ridge National Laboratory, Oak Ridge, TN (USA); Olsen, Lisa M. [CDIAC, Oak Ridge National Laboratory, Oak Ridge, TN (USA); Cushman, Robert M. [CDIAC, Oak Ridge National Laboratory, Oak Ridge, TN (USA); Brenkert, Antoinette L. [CDIAC, Oak Ridge National Laboratory, Oak Ridge, TN (USA)

    2001-03-01

    A database was generated of estimates of geographically referenced carbon densities of forest vegetation in tropical Southeast Asia for 1980. A geographic information system (GIS) was used to incorporate spatial databases of climatic, edaphic, and geomorphological indices and vegetation to estimate potential (i.e., in the absence of human intervention and natural disturbance) carbon densities of forests. The resulting map was then modified to estimate actual 1980 carbon density as a function of population density and climatic zone. The database covers the following 13 countries: Bangladesh, Brunei, Cambodia (Campuchea), India, Indonesia, Laos, Malaysia, Myanmar (Burma), Nepal, the Philippines, Sri Lanka, Thailand, and Vietnam.

  18. The U.S. Geological Survey’s nonindigenous aquatic species database: over thirty years of tracking introduced aquatic species in the United States (and counting)

    USGS Publications Warehouse

    Fuller, Pamela L.; Neilson, Matthew E.

    2015-01-01

    The U.S. Geological Survey’s Nonindigenous Aquatic Species (NAS) Database has tracked introductions of freshwater aquatic organisms in the United States for the past four decades. A website provides access to occurrence reports, distribution maps, and fact sheets for more than 1,000 species. The site also includes an on-line reporting system and an alert system for new occurrences. We provide an historical overview of the database, a description of its current capabilities and functionality, and a basic characterization of the data contained within the database.

  19. Mitigating component performance variation

    DOEpatents

    Gara, Alan G.; Sylvester, Steve S.; Eastep, Jonathan M.; Nagappan, Ramkumar; Cantalupo, Christopher M.

    2018-01-09

    Apparatus and methods may provide for characterizing a plurality of similar components of a distributed computing system based on a maximum safe operation level associated with each component, storing characterization data in a database, and allocating non-uniform power to each similar component based at least in part on the characterization data in the database to substantially equalize performance of the components.

  1. Fault-tolerant symmetrically-private information retrieval

    NASA Astrophysics Data System (ADS)

    Wang, Tian-Yin; Cai, Xiao-Qiu; Zhang, Rui-Ling

    2016-08-01

    We propose two symmetrically-private information retrieval protocols based on quantum key distribution, which provide a good degree of database and user privacy while being flexible, loss-resistant, and easily generalized to a large database, similar to previous works. Furthermore, one protocol is robust to collective-dephasing noise, and the other is robust to collective-rotation noise.

  2. Developing a Near Real-time System for Earthquake Slip Distribution Inversion

    NASA Astrophysics Data System (ADS)

    Zhao, Li; Hsieh, Ming-Che; Luo, Yan; Ji, Chen

    2016-04-01

    Advances in observational and computational seismology in the past two decades have enabled completely automatic and real-time determinations of the focal mechanisms of earthquake point sources. However, seismic radiation from moderate and large earthquakes often exhibits a strong finite-source directivity effect, which is critically important for accurate ground motion estimation and earthquake damage assessment. Therefore, an effective procedure to determine earthquake rupture processes in near real-time is in high demand for hazard mitigation and risk assessment purposes. In this study, we develop an efficient waveform inversion approach for solving for finite-fault models in 3D structure. Full slip distribution inversions are carried out based on the fault planes identified in the point-source solutions. To ensure efficiency in calculating 3D synthetics during slip distribution inversions, a database of strain Green tensors (SGT) is established for a 3D structural model with realistic surface topography. The SGT database enables rapid calculation of accurate synthetic seismograms for waveform inversion on a regular desktop or even a laptop PC. We demonstrate our source inversion approach using two moderate earthquakes (Mw~6.0) in Taiwan and in mainland China. Our results show that the 3D velocity model provides better waveform fitting with more spatially concentrated slip distributions. Our source inversion technique based on the SGT database is effective for semi-automatic, near real-time determination of finite-source solutions for seismic hazard mitigation purposes.

  3. Influenza Virus Database (IVDB): an integrated information resource and analysis platform for influenza virus research.

    PubMed

    Chang, Suhua; Zhang, Jiajie; Liao, Xiaoyun; Zhu, Xinxing; Wang, Dahai; Zhu, Jiang; Feng, Tao; Zhu, Baoli; Gao, George F; Wang, Jian; Yang, Huanming; Yu, Jun; Wang, Jing

    2007-01-01

    Frequent outbreaks of highly pathogenic avian influenza and the increasing data available for comparative analysis require a central database specialized in influenza viruses (IVs). We have established the Influenza Virus Database (IVDB) to integrate information and create an analysis platform for genetic, genomic, and phylogenetic studies of the virus. IVDB hosts complete genome sequences of influenza A virus generated by Beijing Institute of Genomics (BIG) and curates all other published IV sequences after expert annotation. Our Q-Filter system classifies and ranks all nucleotide sequences into seven categories according to sequence content and integrity. IVDB provides a series of tools and viewers for comparative analysis of the viral genomes, genes, genetic polymorphisms and phylogenetic relationships. A search system has been developed for users to retrieve a combination of different data types by setting search options. To facilitate analysis of global viral transmission and evolution, the IV Sequence Distribution Tool (IVDT) has been developed to display the worldwide geographic distribution of chosen viral genotypes and to couple genomic data with epidemiological data. The BLAST, multiple sequence alignment and phylogenetic analysis tools were integrated for online data analysis. Furthermore, IVDB offers instant access to pre-computed alignments and polymorphisms of IV genes and proteins, and presents the results as SNP distribution plots and minor allele distributions. IVDB is publicly available at http://influenza.genomics.org.cn.

  4. Intelligent Control of Micro Grid: A Big Data-Based Control Center

    NASA Astrophysics Data System (ADS)

    Liu, Lu; Wang, Yanping; Liu, Li; Wang, Zhiseng

    2018-01-01

    In this paper, a structure of a micro grid system with a big data-based control center is introduced. Energy data from distributed generation, storage, and load are analyzed through the control center, and from the results new trends will be predicted and applied as feedback to optimize the control. Therefore, each step performed in the micro grid can be adjusted and organized in a form of comprehensive management. A framework of real-time data collection, data processing, and data analysis will be proposed by employing big data technology. Consequently, an integrated distributed generation and an optimized energy storage and transmission process can be implemented in the micro grid system.
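
    A minimal sketch of such a collect/analyze/feedback loop, assuming a moving-average forecast of net load and made-up dispatch thresholds; the class and method names are hypothetical, not from the paper.

    ```python
    # Hedged sketch of a collect -> analyze/predict -> feedback loop.
    # A moving-average forecast of net load (generation minus demand)
    # drives a storage charge/discharge decision. Thresholds are made up.
    from collections import deque

    class MicroGridController:
        def __init__(self, window: int = 12):
            self.history = deque(maxlen=window)  # recent net-load samples (kW)

        def ingest(self, generation_kw: float, load_kw: float) -> None:
            self.history.append(generation_kw - load_kw)

        def predict_net(self) -> float:
            return sum(self.history) / len(self.history) if self.history else 0.0

        def storage_action(self) -> str:
            surplus = self.predict_net()
            if surplus > 1.0:
                return "charge"       # excess generation predicted
            if surplus < -1.0:
                return "discharge"    # shortfall predicted
            return "hold"

    ctrl = MicroGridController()
    for gen, load in [(50, 40), (48, 45), (30, 44)]:
        ctrl.ingest(gen, load)
    print(ctrl.storage_action())
    ```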

  5. Searching and exploitation of distributed geospatial data sources via the Naval Research Lab's Geospatial Information Database (GIDB) Portal System

    NASA Astrophysics Data System (ADS)

    McCreedy, Frank P.; Sample, John T.; Ladd, William P.; Thomas, Michael L.; Shaw, Kevin B.

    2005-05-01

    The Naval Research Laboratory's Geospatial Information Database (GIDB™) Portal System has been extended to now include an extensive geospatial search functionality. The GIDB Portal System interconnects over 600 distributed geospatial data sources via the Internet with a thick client, thin client and a PDA client. As the GIDB Portal System has rapidly grown over the last two years (adding hundreds of geospatial sources), the obvious requirement has arisen to more effectively mine the interconnected sources in near real-time. How the GIDB Search addresses this issue is the prime focus of this paper.

  6. Interconnecting heterogeneous database management systems

    NASA Technical Reports Server (NTRS)

    Gligor, V. D.; Luckenbaugh, G. L.

    1984-01-01

    It is pointed out that there is still a great need for the development of improved communication between remote, heterogeneous database management systems (DBMS). Problems regarding the effective communication between distributed DBMSs are primarily related to significant differences between local data managers, local data models and representations, and local transaction managers. A system of interconnected DBMSs which exhibit such differences is called a network of distributed, heterogeneous DBMSs. In order to achieve effective interconnection of remote, heterogeneous DBMSs, the users must have uniform, integrated access to the different DBMSs. The present investigation is mainly concerned with an analysis of the existing approaches to interconnecting heterogeneous DBMSs, taking into account four experimental DBMS projects.

  7. DISTRIBUTED STRUCTURE-SEARCHABLE TOXICITY ...

    EPA Pesticide Factsheets

    The ability to assess the potential genotoxicity, carcinogenicity, or other toxicity of pharmaceutical or industrial chemicals based on chemical structure information is a highly coveted and shared goal of varied academic, commercial, and government regulatory groups. These diverse interests often employ different approaches and have different criteria and use for toxicity assessments, but they share a need for unrestricted access to existing public toxicity data linked with chemical structure information. Currently, there exists no central repository of toxicity information, commercial or public, that adequately meets the data requirements for flexible analogue searching, SAR model development, or building of chemical relational databases (CRD). The Distributed Structure-Searchable Toxicity (DSSTox) Public Database Network is being proposed as a community-supported, web-based effort to address these shared needs of the SAR and toxicology communities. The DSSTox project has the following major elements: 1) to adopt and encourage the use of a common standard file format (SDF) for public toxicity databases that includes chemical structure, text and property information, and that can easily be imported into available CRD applications; 2) to implement a distributed source approach, managed by a DSSTox Central Website, that will enable decentralized, free public access to structure-toxicity data files, and that will effectively link knowledgeable toxicity data s

  8. Using Web Ontology Language to Integrate Heterogeneous Databases in the Neurosciences

    PubMed Central

    Lam, Hugo Y.K.; Marenco, Luis; Shepherd, Gordon M.; Miller, Perry L.; Cheung, Kei-Hoi

    2006-01-01

    Integrative neuroscience involves the integration and analysis of diverse types of neuroscience data involving many different experimental techniques. This data will increasingly be distributed across many heterogeneous databases that are web-accessible. Currently, these databases do not expose their schemas (database structures) and their contents to web applications/agents in a standardized, machine-friendly way. This limits database interoperation. To address this problem, we describe a pilot project that illustrates how neuroscience databases can be expressed using the Web Ontology Language, which is a semantically-rich ontological language, as a common data representation language to facilitate complex cross-database queries. In this pilot project, an existing tool called “D2RQ” was used to translate two neuroscience databases (NeuronDB and CoCoDat) into OWL, and the resulting OWL ontologies were then merged. An OWL-based reasoner (Racer) was then used to provide a sophisticated query language (nRQL) to perform integrated queries across the two databases based on the merged ontology. This pilot project is one step toward exploring the use of semantic web technologies in the neurosciences. PMID:17238384
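
    For flavor, a minimal sketch of the merge-then-query idea, using rdflib as a stand-in for the D2RQ/Racer toolchain used in the pilot project; the file names and the query are hypothetical.

    ```python
    # Hedged sketch: merge two OWL exports into one RDF graph and run a
    # cross-database SPARQL query. rdflib is a stand-in for the paper's
    # D2RQ/Racer/nRQL stack; file names are hypothetical placeholders.
    from rdflib import Graph

    merged = Graph()
    merged.parse("neurondb.owl")   # OWL export of the first database
    merged.parse("cocodat.owl")    # OWL export of the second database

    # Query whatever both ontologies assert, across database boundaries.
    results = merged.query("""
        SELECT ?neuron ?property ?value
        WHERE { ?neuron ?property ?value }
        LIMIT 10
    """)
    for row in results:
        print(row.neuron, row.property, row.value)
    ```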

  9. An Entropy Approach to Disclosure Risk Assessment: Lessons from Real Applications and Simulated Domains

    PubMed Central

    Airoldi, Edoardo M.; Bai, Xue; Malin, Bradley A.

    2011-01-01

    We live in an increasingly mobile world, which leads to the duplication of information across domains. Though organizations attempt to obscure the identities of their constituents when sharing information for worthwhile purposes, such as basic research, the uncoordinated nature of such environments can lead to privacy vulnerabilities. For instance, disparate healthcare providers can collect information on the same patient. Federal policy requires that such providers share “de-identified” sensitive data, such as biomedical (e.g., clinical and genomic) records. But at the same time, such providers can share identified information, devoid of sensitive biomedical data, for administrative functions. On a provider-by-provider basis, the biomedical and identified records appear unrelated; however, links can be established when multiple providers’ databases are studied jointly. The problem, known as trail disclosure, is a generalized phenomenon and occurs because an individual’s location access pattern can be matched across the shared databases. Due to technical and legal constraints, it is often difficult to coordinate between providers, and thus it is critical to assess the disclosure risk in distributed environments so that we can develop techniques to mitigate such risks. Research on privacy protection has so far focused on developing technologies to suppress or encrypt identifiers associated with sensitive information. There is a growing body of work on the formal assessment of the disclosure risk of database entries in publicly shared databases, but less attention has been paid to the distributed setting. In this research, we review the trail disclosure problem in several domains with known vulnerabilities and show that disclosure risk is influenced by the distribution of how people visit service providers. Based on empirical evidence, we propose an entropy metric for assessing such risk in shared databases prior to their release. This metric assesses risk by leveraging the statistical characteristics of a visit distribution, as opposed to person-level data. It is computationally efficient and superior to existing risk assessment methods, which rely on ad hoc assessments that are often computationally expensive and unreliable. We evaluate our approach on a range of location access patterns in simulated environments. Our results demonstrate the approach is effective at estimating trail disclosure risks, and the amount of self-information contained in a distributed system is one of the main driving factors. PMID:21647242
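
    A minimal sketch of an entropy-style indicator over a visit distribution; this is plain Shannon entropy over illustrative counts, not the paper's exact metric.

    ```python
    # Hedged sketch: Shannon entropy of a visit distribution, where
    # visit_counts[i] is the number of visits to provider i. The paper's
    # metric is entropy-based; counts below are made up.
    import math

    def shannon_entropy(visit_counts):
        total = sum(visit_counts)
        probs = [c / total for c in visit_counts if c > 0]
        return -sum(p * math.log2(p) for p in probs)

    uniform = [100, 100, 100, 100]   # visits spread evenly across providers
    skewed = [370, 10, 10, 10]       # visits concentrated at one provider
    print(shannon_entropy(uniform))  # higher entropy: less concentrated
    print(shannon_entropy(skewed))   # lower entropy: more concentrated
    ```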

  10. Attributes of the Federal Energy Management Program's Federal Site Building Characteristics Database

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Loper, Susan A.; Sandusky, William F.

    2010-12-31

    Typically, the Federal building stock is referred to as a group of about one-half million buildings throughout the United States. Additional information beyond this level is generally limited to distribution of that total by agency and perhaps distribution of the total by state. However, additional characterization of the Federal building stock is required as the Federal sector seeks ways to implement efficiency projects to reduce energy and water use intensity as mandated by legislation and Executive Order. Using a Federal facility database that was assembled for use in a geographic information system tool, additional characterization of the Federal building stock is provided, including information regarding the geographical distribution of sites, building counts and percentage of total by agency, distribution of sites and building totals by agency, distribution of building count and floor space by Federal building type classification by agency, and rank ordering of sites, buildings, and floor space by state. A case study is provided regarding how the building stock has changed for the Department of Energy from 2000 through 2008.

  11. Tutorial videos of bioinformatics resources: online distribution trial in Japan named TogoTV.

    PubMed

    Kawano, Shin; Ono, Hiromasa; Takagi, Toshihisa; Bono, Hidemasa

    2012-03-01

    In recent years, biological web resources such as databases and tools have become more complex because of the enormous amounts of data generated in the field of life sciences. Traditional methods of distributing tutorials include publishing textbooks and posting web documents, but these static contents cannot adequately describe recent dynamic web services. Due to improvements in computer technology, it is now possible to create dynamic content such as video with minimal effort and low cost on most modern computers. The ease of creating and distributing video tutorials instead of static content improves accessibility for researchers, annotators and curators. This article focuses on online video repositories for educational and tutorial videos provided by resource developers and users. It also describes a project in Japan named TogoTV (http://togotv.dbcls.jp/en/) and discusses the production and distribution of high-quality tutorial videos, which would be useful to viewers, with examples. This article intends to stimulate and encourage researchers who develop and use databases and tools to distribute how-to videos as a tool to enhance product usability.

  12. Tutorial videos of bioinformatics resources: online distribution trial in Japan named TogoTV

    PubMed Central

    Kawano, Shin; Ono, Hiromasa; Takagi, Toshihisa

    2012-01-01

    In recent years, biological web resources such as databases and tools have become more complex because of the enormous amounts of data generated in the field of life sciences. Traditional methods of distributing tutorials include publishing textbooks and posting web documents, but these static contents cannot adequately describe recent dynamic web services. Due to improvements in computer technology, it is now possible to create dynamic content such as video with minimal effort and low cost on most modern computers. The ease of creating and distributing video tutorials instead of static content improves accessibility for researchers, annotators and curators. This article focuses on online video repositories for educational and tutorial videos provided by resource developers and users. It also describes a project in Japan named TogoTV (http://togotv.dbcls.jp/en/) and discusses the production and distribution of high-quality tutorial videos, which would be useful to viewers, with examples. This article intends to stimulate and encourage researchers who develop and use databases and tools to distribute how-to videos as a tool to enhance product usability. PMID:21803786

  13. Macromolecular Structure Database. Final Progress Report

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gilliland, Gary L.

    2003-09-23

    The central activity of the PDB continues to be the collection, archiving and distribution of high quality structural data to the scientific community on a timely basis. In support of these activities NIST has continued its roles in developing the physical archive, in developing data uniformity, in dealing with NMR issues and in the distribution of PDB data through CD-ROMs. The physical archive holdings have been organized, inventoried, and a database has been created to facilitate their use. Data from individual PDB entries have been annotated to produce uniform values, improving tremendously the accuracy of results of queries. Working with the NMR community we have established data items specific for NMR that will be included in new entries and facilitate data deposition. The PDB CD-ROM production has continued on a quarterly basis, and new products are being distributed.

  14. Lognormal Behavior of the Size Distributions of Animation Characters

    NASA Astrophysics Data System (ADS)

    Yamamoto, Ken

    This study investigates the statistical properties of the character sizes of animation, superhero series, and video games. By using online databases of Pokémon (video game) and Power Rangers (superhero series), the height and weight distributions are constructed, and we find that the weight distributions of Pokémon and Zords (robots in Power Rangers) both follow the lognormal distribution. For the theoretical mechanism of this lognormal behavior, the combination of the normal distribution and the Weber-Fechner law is proposed.
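
    A minimal sketch of how one might check lognormality: fit the log-transformed weights with a normal distribution and test the log values for normality; the sample values below are made up, not from the Pokémon or Power Rangers databases.

    ```python
    # Hedged sketch: a lognormality check via log-transform. If weights
    # are lognormal, log-weights should look normal; Shapiro-Wilk tests
    # that on the log scale. The weight sample is illustrative.
    import numpy as np
    from scipy import stats

    weights = np.array([4.2, 6.0, 8.5, 9.0, 12.5, 15.2, 20.0, 32.0, 40.5, 60.0])
    log_w = np.log(weights)
    mu, sigma = log_w.mean(), log_w.std(ddof=1)   # lognormal parameters
    stat, p_value = stats.shapiro(log_w)          # normality test on logs
    print(f"log-mean={mu:.2f}, log-sd={sigma:.2f}, Shapiro p={p_value:.3f}")
    ```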

  15. The Data Dealers.

    ERIC Educational Resources Information Center

    Tenopir, Carol; Barry, Jeff

    1997-01-01

    Profiles 25 database distribution and production companies, all of which responded to a 1997 survey with information on 54 separate online, Web-based, or CD-ROM systems. Highlights increased competition, distribution formats, Web versions versus local area networks, full-text delivery, and pricing policies. Tables present a sampling of customers…

  16. A Methodology for Distributing the Corporate Database.

    ERIC Educational Resources Information Center

    McFadden, Fred R.

    The trend to distributed processing is being fueled by numerous forces, including advances in technology, corporate downsizing, increasing user sophistication, and acquisitions and mergers. Increasingly, the trend in corporate information systems (IS) departments is toward sharing resources over a network of multiple types of processors, operating…

  17. Virtual time and time warp on the JPL hypercube. [operating system implementation for distributed simulation

    NASA Technical Reports Server (NTRS)

    Jefferson, David; Beckman, Brian

    1986-01-01

    This paper describes the concept of virtual time and its implementation in the Time Warp Operating System at the Jet Propulsion Laboratory. Virtual time is a distributed synchronization paradigm that is appropriate for distributed simulation, database concurrency control, real time systems, and coordination of replicated processes. The Time Warp Operating System is targeted toward the distributed simulation application and runs on a 32-node JPL Mark II Hypercube.

  18. Retrieving high-resolution images over the Internet from an anatomical image database

    NASA Astrophysics Data System (ADS)

    Strupp-Adams, Annette; Henderson, Earl

    1999-12-01

    The Visible Human Data set is an important contribution to the national collection of anatomical images. To enhance the availability of these images, the National Library of Medicine has supported the design and development of a prototype object-oriented image database which imports, stores, and distributes high resolution anatomical images in both pixel and voxel formats. One of the key database modules is its client-server Internet interface. This Web interface provides a query engine with retrieval access to high-resolution anatomical images that range in size from 100KB for browser viewable rendered images, to 1GB for anatomical structures in voxel file formats. The Web query and retrieval client-server system is composed of applet GUIs, servlets, and RMI application modules which communicate with each other to allow users to query for specific anatomical structures, and retrieve image data as well as associated anatomical images from the database. Selected images can be downloaded individually as single files via HTTP or downloaded in batch-mode over the Internet to the user's machine through an applet that uses Netscape's Object Signing mechanism. The image database uses ObjectDesign's object-oriented DBMS, ObjectStore, which has a Java interface. The query and retrieval system has been tested with a Java-CDE window system, and on the x86 architecture using Windows NT 4.0. This paper describes the Java applet client search engine that queries the database; the Java client module that enables users to view anatomical images online; the Java application server interface to the database which organizes data returned to the user; and its distribution engine that allows users to download image files individually and/or in batch-mode.

  19. Solar Market Research and Analysis Publications | Solar Research | NREL

    Science.gov Websites

    ...lifespan, and saving costs. The report is an expanded edition of an interim report published in 2015. Another publication addresses achieving the SETO 2030 residential PV cost target of $0.05/kWh by identifying and quantifying cost reduction opportunities. Distribution Grid Integration Unit Cost Database: this database contains unit cost

  20. Privacy-Preserving Classifier Learning

    NASA Astrophysics Data System (ADS)

    Brickell, Justin; Shmatikov, Vitaly

    We present an efficient protocol for the privacy-preserving, distributed learning of decision-tree classifiers. Our protocol allows a user to construct a classifier on a database held by a remote server without learning any additional information about the records held in the database. The server does not learn anything about the constructed classifier, not even the user’s choice of feature and class attributes.

  1. A Web-Based Multi-Database System Supporting Distributed Collaborative Management and Sharing of Microarray Experiment Information

    PubMed Central

    Burgarella, Sarah; Cattaneo, Dario; Masseroli, Marco

    2006-01-01

    We developed MicroGen, a multi-database Web based system for managing all the information characterizing spotted microarray experiments. It supports information gathering and storing according to the Minimum Information About Microarray Experiments (MIAME) standard. It also allows easy sharing of information and data among all multidisciplinary actors involved in spotted microarray experiments. PMID:17238488

  2. Scalable Database Design of End-Game Model with Decoupled Countermeasure and Threat Information

    DTIC Science & Technology

    2017-11-01

    By Decetria Akole and Michael Chen. Approved for public release; distribution is unlimited.

  3. Linking U.S. School District Test Score Distributions to a Common Scale. CEPA Working Paper No. 16-09

    ERIC Educational Resources Information Center

    Reardon, Sean F.; Kalogrides, Demetra; Ho, Andrew D.

    2017-01-01

    There is no comprehensive database of U.S. district-level test scores that is comparable across states. We describe and evaluate a method for constructing such a database. First, we estimate linear, reliability-adjusted linking transformations from state test score scales to the scale of the National Assessment of Educational Progress (NAEP). We…

  4. Patterns of aquatic species imperilment in the Southern Appalachians: an evaluation of regional databases

    Treesearch

    Patricia A. Flebbe; James A. Herrig

    2000-01-01

    For regional analyses of species imperilment patterns, data on species distributions are available from the U.S. Fish and Wildlife Service and from the State heritage programs. The authors compared these two different databases as sources of best available information for regional analyses of patterns of aquatic species imperilment for 132 counties in the Southern...

  5. Database interfaces on NASA's heterogeneous distributed database system

    NASA Technical Reports Server (NTRS)

    Huang, Shou-Hsuan Stephen

    1989-01-01

    The syntax and semantics of all commands used in the template are described. Template builders should consult this document for proper commands in the template. Previous documents (Semiannual reports) described other aspects of this project. Appendix 1 contains all substituting commands used in the system. Appendix 2 includes all repeating commands. Appendix 3 is a collection of DEFINE templates from eight different DBMS's.

  6. Glycan fragment database: a database of PDB-based glycan 3D structures.

    PubMed

    Jo, Sunhwan; Im, Wonpil

    2013-01-01

    The glycan fragment database (GFDB), freely available at http://www.glycanstructure.org, is a database of the glycosidic torsion angles derived from the glycan structures in the Protein Data Bank (PDB). Analogous to protein structure, the structure of an oligosaccharide chain in a glycoprotein, referred to as a glycan, can be characterized by the torsion angles of glycosidic linkages between relatively rigid carbohydrate monomeric units. Knowledge of accessible conformations of biologically relevant glycans is essential in understanding their biological roles. The GFDB provides an intuitive glycan sequence search tool that allows the user to search complex glycan structures. After a glycan search is complete, each glycosidic torsion angle distribution is displayed in terms of the exact match and the fragment match. The exact match results are from the PDB entries that contain the glycan sequence identical to the query sequence. The fragment match results are from the entries with the glycan sequence whose substructure (fragment) or entire sequence is matched to the query sequence, such that the fragment results implicitly include the influences from the nearby carbohydrate residues. In addition, clustering analysis based on the torsion angle distribution can be performed to obtain the representative structures among the searched glycan structures.
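
    A minimal sketch of the underlying quantity the GFDB catalogues: the torsion (dihedral) angle defined by four atomic positions; the coordinates below are illustrative, not taken from any PDB entry.

    ```python
    # Hedged sketch: computing a glycosidic-style torsion angle from four
    # atomic positions, the basic quantity stored in the GFDB.
    import numpy as np

    def torsion(p0, p1, p2, p3):
        """Dihedral angle (degrees) defined by four 3D points."""
        b0, b1, b2 = p1 - p0, p2 - p1, p3 - p2
        b1 /= np.linalg.norm(b1)
        v = b0 - np.dot(b0, b1) * b1   # project onto plane normal to b1
        w = b2 - np.dot(b2, b1) * b1
        x = np.dot(v, w)
        y = np.dot(np.cross(b1, v), w)
        return np.degrees(np.arctan2(y, x))

    atoms = [np.array(p, float) for p in
             [(0, 0, 0), (1.5, 0, 0), (2.0, 1.4, 0), (3.2, 1.6, 1.0)]]
    print(f"torsion = {torsion(*atoms):.1f} degrees")
    ```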

  7. [Development and evaluation of the medical imaging distribution system with dynamic web application and clustering technology].

    PubMed

    Yokohama, Noriya; Tsuchimoto, Tadashi; Oishi, Masamichi; Itou, Katsuya

    2007-01-20

    It has been noted that the downtime of medical informatics systems is often long. Many systems encounter downtimes of hours or even days, which can have a critical effect on daily operations. Such systems remain especially weak in the areas of database and medical imaging data. The scheme design shows the three-layer architecture of the system: application, database, and storage layers. The application layer uses the DICOM protocol (Digital Imaging and Communication in Medicine) and HTTP (Hyper Text Transport Protocol) with AJAX (Asynchronous JavaScript+XML). The database is designed to be decentralized in parallel using cluster technology. Consequently, restoration of the database can be done not only with ease but also with improved retrieval speed. In the storage layer, a network RAID (Redundant Array of Independent Disks) system makes it possible to construct exabyte-scale parallel file systems that exploit storage spread. Development and evaluation of the test-bed has been successful for medical information data backup and recovery in a network environment. This paper presents a schematic design of the new medical informatics system, covering recovery and a dynamic Web application for medical imaging distribution using AJAX.

  8. Hierarchical Data Distribution Scheme for Peer-to-Peer Networks

    NASA Astrophysics Data System (ADS)

    Bhushan, Shashi; Dave, M.; Patel, R. B.

    2010-11-01

    In the past few years, peer-to-peer (P2P) networks have become an extremely popular mechanism for large-scale content sharing. P2P systems have focused on specific application domains (e.g. music files, video files) or on providing file system like capabilities. P2P is a powerful paradigm, which provides a large-scale and cost-effective mechanism for data sharing, and a P2P system may be used for storing data globally. Can a conventional database be implemented on a P2P system? A successful implementation of conventional databases on P2P systems is yet to be reported. In this paper we present a mathematical model for the replication of the partitions and a hierarchical data distribution scheme for P2P networks. We also analyze the resource utilization and throughput of the P2P system with respect to availability when a conventional database is implemented over the P2P system with a variable query rate. Simulation results show that database partitions placed on the peers with a higher availability factor perform better. Degradation index, throughput, and resource utilization are the parameters evaluated with respect to the availability factor.
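
    A minimal sketch of the availability intuition behind replicated partitions: a partition stays reachable unless every peer holding a replica is offline. The availability values are illustrative, not the paper's model.

    ```python
    # Hedged sketch: availability of a replicated database partition,
    # assuming independent peer failures. Values are illustrative.
    def partition_availability(peer_availabilities):
        unavailable = 1.0
        for a in peer_availabilities:
            unavailable *= (1.0 - a)   # all replica hosts down at once
        return 1.0 - unavailable

    print(partition_availability([0.6]))             # single copy: 0.60
    print(partition_availability([0.6, 0.6, 0.6]))   # three copies: ~0.94
    ```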

  9. Comparison of the NCI open database with seven large chemical structural databases.

    PubMed

    Voigt, J H; Bienfait, B; Wang, S; Nicklaus, M C

    2001-01-01

    Eight large chemical databases have been analyzed and compared to each other. Central to this comparison is the open National Cancer Institute (NCI) database, consisting of approximately 250 000 structures. The other databases analyzed are the Available Chemicals Directory ("ACD," from MDL, release 1.99, 3D-version); the ChemACX ("ACX," from CamSoft, Version 4.5); the Maybridge Catalog and the Asinex database (both as distributed by CamSoft as part of ChemInfo 4.5); the Sigma-Aldrich Catalog (CD-ROM, 1999 Version); the World Drug Index ("WDI," Derwent, version 1999.03); and the organic part of the Cambridge Crystallographic Database ("CSD," from Cambridge Crystallographic Data Center, 1999 Version 5.18). The database properties analyzed are internal duplication rates; compounds unique to each database; cumulative occurrence of compounds in an increasing number of databases; overlap of identical compounds between two databases; similarity overlap; diversity; and others. The crystallographic database CSD and the WDI show somewhat less overlap with the other databases than those with each other. In particular the collections of commercial compounds and compilations of vendor catalogs have a substantial degree of overlap among each other. Still, no database is completely a subset of any other, and each appears to have its own niche and thus "raison d'être". The NCI database has by far the highest number of compounds that are unique to it. Approximately 200 000 of the NCI structures were not found in any of the other analyzed databases.
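
    A minimal sketch of the kind of pairwise-overlap measurement reported above, treating each database as a set of canonical structure keys; the keys shown are placeholders, not real database content.

    ```python
    # Hedged sketch: pairwise database overlap via canonical structure
    # keys (e.g., a canonicalized identifier per compound). Placeholders.
    nci = {"KEY-A", "KEY-B", "KEY-C", "KEY-D"}
    wdi = {"KEY-C", "KEY-D", "KEY-E"}

    shared = nci & wdi
    unique_to_nci = nci - wdi
    jaccard = len(shared) / len(nci | wdi)
    print(f"shared={len(shared)}, unique to NCI={len(unique_to_nci)}, "
          f"Jaccard={jaccard:.2f}")
    ```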

  10. Joint Experimentation on Scalable Parallel Processors (JESPP)

    DTIC Science & Technology

    2006-04-01

    The implementation made use of local embedded relational databases, implemented using sqlite on each node of an SPP, to execute queries and return results via an ad hoc ... The Joint Experimentation Directorate (J9) required expansion of its joint semi-automated forces (JSAF) code capabilities, including number of entities and behavior complexity.

  11. Distributed operating system for NASA ground stations

    NASA Technical Reports Server (NTRS)

    Doyle, John F.

    1987-01-01

    NASA ground stations are characterized by ever changing support requirements, so application software is developed and modified on a continuing basis. A distributed operating system was designed to optimize the generation and maintenance of those applications. Unusual features include automatic program generation from detailed design graphs, on-line software modification in the testing phase, and the incorporation of a relational database within a real-time, distributed system.

  12. Improving Department of Defense Global Distribution Performance Through Network Analysis

    DTIC Science & Technology

    2016-06-01

    ...network performance increase. Subject terms: supply chain metrics, distribution networks, requisition shipping time, strategic distribution database. "...peace and war" (p. 4). USTRANSCOM Metrics and Analysis Branch defines, develops, tracks, and maintains outcomes-based supply chain metrics (2014a, p. 8). The Joint Staff defines a TDD standard as the maximum number of days the supply chain can take to deliver requisitioned materiel.

  13. How I do it: a practical database management system to assist clinical research teams with data collection, organization, and reporting.

    PubMed

    Lee, Howard; Chapiro, Julius; Schernthaner, Rüdiger; Duran, Rafael; Wang, Zhijun; Gorodetski, Boris; Geschwind, Jean-François; Lin, MingDe

    2015-04-01

    The objective of this study was to demonstrate that an intra-arterial liver therapy clinical research database system is a more workflow-efficient and robust tool for clinical research than a spreadsheet storage system. The database system could be used to generate clinical research study populations easily with custom search and retrieval criteria. A questionnaire was designed and distributed to 21 board-certified radiologists to assess current data storage problems and clinician reception to a database management system. Based on the questionnaire findings, a customized database and user interface system were created to perform automatic calculations of clinical scores, including staging systems such as the Child-Pugh and Barcelona Clinic Liver Cancer, and to facilitate data input and output. Questionnaire participants were favorable to a database system. The interface retrieved study-relevant data accurately and effectively. The database effectively produced easy-to-read study-specific patient populations with custom-defined inclusion/exclusion criteria. The database management system is workflow-efficient and robust in retrieving, storing, and analyzing data. Copyright © 2015 AUR. Published by Elsevier Inc. All rights reserved.
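
    A minimal sketch of the kind of automatic clinical-score calculation mentioned above, here the Child-Pugh score from its five standard components; the cutoffs follow commonly published criteria, should be verified against a clinical reference, and this is not the paper's implementation.

    ```python
    # Hedged sketch: Child-Pugh score from its five standard components.
    # Cutoffs follow commonly published criteria; verify against a
    # clinical reference before any real use.
    def child_pugh(bilirubin_mg_dl, albumin_g_dl, inr, ascites, encephalopathy):
        score = 0
        score += 1 if bilirubin_mg_dl < 2 else 2 if bilirubin_mg_dl <= 3 else 3
        score += 1 if albumin_g_dl > 3.5 else 2 if albumin_g_dl >= 2.8 else 3
        score += 1 if inr < 1.7 else 2 if inr <= 2.3 else 3
        score += {"none": 1, "mild": 2, "severe": 3}[ascites]
        score += {"none": 1, "grade1-2": 2, "grade3-4": 3}[encephalopathy]
        grade = "A" if score <= 6 else "B" if score <= 9 else "C"
        return score, grade

    print(child_pugh(1.5, 3.8, 1.2, "none", "none"))  # -> (5, 'A')
    ```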

  14. Developing an A Priori Database for Passive Microwave Snow Water Retrievals Over Ocean

    NASA Astrophysics Data System (ADS)

    Yin, Mengtao; Liu, Guosheng

    2017-12-01

    A physically optimized a priori database is developed for Global Precipitation Measurement Microwave Imager (GMI) snow water retrievals over ocean. The initial snow water content profiles are derived from CloudSat Cloud Profiling Radar (CPR) measurements. A radiative transfer model in which the single-scattering properties of nonspherical snowflakes are based on discrete dipole approximation results is employed to simulate brightness temperatures and their gradients. Snow water content profiles are then optimized through a one-dimensional variational (1D-Var) method. The standard deviations of the difference between observed and simulated brightness temperatures are of a magnitude similar to the observation errors defined in the observation error covariance matrix after the 1D-Var optimization, indicating that this variational method is successful. This optimized database is applied in a Bayesian snow water retrieval algorithm. The retrieval results indicate that the 1D-Var approach has a positive impact on the GMI-retrieved snow water content profiles by improving the physical consistency between snow water content profiles and observed brightness temperatures. The global distribution of snow water contents retrieved from the a priori database is compared with CloudSat CPR estimates. Results show that the two estimates have a similar pattern of global distribution, and the difference of their global means is small. In addition, we investigate the impact of using physical parameters to subset the database on snow water retrievals. It is shown that using total precipitable water to subset the database with 1D-Var optimization is beneficial for snow water retrievals.
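
    A minimal sketch of a 1D-Var cost function of the standard form J(x) = (x-xb)^T B^-1 (x-xb) + (y-Hx)^T R^-1 (y-Hx), with a linear operator standing in for the radiative transfer model; all matrices and values below are synthetic, not the paper's configuration.

    ```python
    # Hedged sketch: minimizing a standard 1D-Var cost to nudge an
    # a priori profile xb toward observations y. H is a linear stand-in
    # for the radiative transfer forward model; data are synthetic.
    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.default_rng(1)
    n_state, n_obs = 5, 3
    xb = np.ones(n_state)                    # background snow water profile
    B_inv = np.eye(n_state) / 0.25           # inverse background covariance
    R_inv = np.eye(n_obs) / 1.0              # inverse observation covariance
    H = rng.normal(size=(n_obs, n_state))    # linearized forward operator
    y = H @ (xb + 0.3)                       # synthetic brightness temps

    def cost(x):
        dx, dy = x - xb, y - H @ x
        return dx @ B_inv @ dx + dy @ R_inv @ dy

    x_opt = minimize(cost, xb).x
    print(np.round(x_opt, 2))
    ```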

  15. THE SOUTHWEST REGIONAL GAP PROJECT: A DATABASE MODEL FOR REGIONAL LANDSCAPE ASSESSMENT, RESOURCE PLANNING, AND VULNERABILITY ANALYSIS

    EPA Science Inventory

    The Gap Analysis Program (GAP) is a national interagency program that maps the distribution of plant communities and selected animal species and compares these distributions with land stewardship to identify biotic elements at potential risk of endangerment. Acquisition of primar...

  16. Distributed Generation Energy Technology Operations and Maintenance Costs |

    Science.gov Websites

    Distributed Generation Energy Technology Operations and Maintenance Costs. Transparent Cost Database: the following charts indicate recent operations and maintenance (O&M) cost estimates, drawing on available national-level cost data from a variety of sources. Costs in your specific location will vary.

  17. A Simulation Tool for Distributed Databases.

    DTIC Science & Technology

    1981-09-01

    Reed’s multiversion system [RE1T8] may also be viewed as updating only copies until the commit is made. The decision to make the changes...distributed voting, and Ellis’ ring algorithm. Other, significantly different algorithms not covered in his work include Reed’s multiversion algorithm, the

  18. Interfaces for Distributed Systems of Information Servers.

    ERIC Educational Resources Information Center

    Kahle, Brewster M.; And Others

    1993-01-01

    Describes five interfaces to remote, full-text databases accessed through distributed systems of servers. These are WAIStation for the Macintosh, XWAIS for X-Windows, GWAIS for Gnu-Emacs; SWAIS for dumb terminals, and Rosebud for the Macintosh. Sixteen illustrations provide examples of display screens. Problems and needed improvements are…

  19. Statistical organelle dissection of Arabidopsis guard cells using image database LIPS.

    PubMed

    Higaki, Takumi; Kutsuna, Natsumaro; Hosokawa, Yoichiroh; Akita, Kae; Ebine, Kazuo; Ueda, Takashi; Kondo, Noriaki; Hasezawa, Seiichiro

    2012-01-01

    To comprehensively grasp cell biological events in plant stomatal movement, we have captured microscopic images of guard cells with various organelle markers. The 28,530 serial optical sections of 930 pairs of Arabidopsis guard cells have been released as a new image database, named Live Images of Plant Stomata (LIPS). We visualized the average organellar distributions in guard cells using probabilistic mapping and image clustering techniques. The results indicated that actin microfilaments and endoplasmic reticulum (ER) are mainly localized to the dorsal side and connection regions of guard cells. Subtractive images of open and closed stomata showed distribution changes in intracellular structures, including the ER, during stomatal movement. Time-lapse imaging showed that similar ER distribution changes occurred during stomatal opening induced by light irradiation or femtosecond laser shots on neighboring epidermal cells, indicating that our image analysis approach has identified a novel ER relocation in stomatal opening.
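
    A minimal sketch of the probabilistic-mapping idea: averaging aligned binary organelle masks gives, per pixel, the fraction of cells in which the organelle was observed; the tiny arrays stand in for segmented images and are not LIPS data.

    ```python
    # Hedged sketch: a probability map as the mean of aligned binary
    # organelle masks; subtracting two such maps (e.g., open vs. closed
    # stomata) would highlight distribution changes. Arrays are made up.
    import numpy as np

    masks = np.array([
        [[1, 0], [1, 1]],   # cell 1: organelle presence per pixel
        [[1, 0], [0, 1]],   # cell 2
        [[0, 0], [1, 1]],   # cell 3
    ])
    probability_map = masks.mean(axis=0)
    print(probability_map)   # ~0.67 where two of three cells overlap
    ```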

  20. Geomasking sensitive health data and privacy protection: an evaluation using an E911 database.

    PubMed

    Allshouse, William B; Fitch, Molly K; Hampton, Kristen H; Gesink, Dionne C; Doherty, Irene A; Leone, Peter A; Serre, Marc L; Miller, William C

    2010-10-01

    Geomasking is used to provide privacy protection for individual address information while maintaining spatial resolution for mapping purposes. Donut geomasking and other random perturbation geomasking algorithms rely on the assumption of a homogeneously distributed population to calculate displacement distances, leading to possible under-protection of individuals when this condition is not met. Using household data from 2007, we evaluated the performance of donut geomasking in Orange County, North Carolina. We calculated the estimated k-anonymity for every household based on the assumption of uniform household distribution. We then determined the actual k-anonymity by revealing household locations contained in the county E911 database. Census block groups in mixed-use areas with high population distribution heterogeneity were the most likely to have privacy protection below selected criteria. For heterogeneous populations, we suggest tripling the minimum displacement area in the donut to protect privacy with a less than 1% error rate.
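
    A minimal sketch of donut geomasking under simplifying assumptions (displacement uniform over an annulus, planar coordinates); real implementations work on geographic coordinates and population-adjusted radii.

    ```python
    # Hedged sketch: donut geomasking displaces a point by a random
    # distance between an inner radius (privacy floor) and an outer
    # radius (spatial-resolution ceiling). Planar coordinates for brevity.
    import math
    import random

    def donut_mask(x, y, r_min, r_max, rng=random):
        theta = rng.uniform(0.0, 2.0 * math.pi)
        # Sample radius so points are uniform over the annulus area.
        r = math.sqrt(rng.uniform(r_min ** 2, r_max ** 2))
        return x + r * math.cos(theta), y + r * math.sin(theta)

    print(donut_mask(100.0, 200.0, r_min=50.0, r_max=250.0))
    ```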

  1. Geomasking sensitive health data and privacy protection: an evaluation using an E911 database

    PubMed Central

    Allshouse, William B; Fitch, Molly K; Hampton, Kristen H; Gesink, Dionne C; Doherty, Irene A; Leone, Peter A; Serre, Marc L; Miller, William C

    2010-01-01

    Geomasking is used to provide privacy protection for individual address information while maintaining spatial resolution for mapping purposes. Donut geomasking and other random perturbation geomasking algorithms rely on the assumption of a homogeneously distributed population to calculate displacement distances, leading to possible under-protection of individuals when this condition is not met. Using household data from 2007, we evaluated the performance of donut geomasking in Orange County, North Carolina. We calculated the estimated k-anonymity for every household based on the assumption of uniform household distribution. We then determined the actual k-anonymity by revealing household locations contained in the county E911 database. Census block groups in mixed-use areas with high population distribution heterogeneity were the most likely to have privacy protection below selected criteria. For heterogeneous populations, we suggest tripling the minimum displacement area in the donut to protect privacy with a less than 1% error rate. PMID:20953360

  2. Element Distribution in Silicon Refining: Thermodynamic Model and Industrial Measurements

    NASA Astrophysics Data System (ADS)

    Næss, Mari K.; Kero, Ida; Tranell, Gabriella; Tang, Kai; Tveit, Halvard

    2014-11-01

    To establish an overview of impurity elemental distribution among silicon, slag, and gas/fume in the refining process of metallurgical grade silicon (MG-Si), an industrial measurement campaign was performed at the Elkem Salten MG-Si plant in Norway. Samples of in- and outgoing mass streams, i.e., tapped Si, flux and cooling materials, refined Si, slag, and fume, were analyzed by high-resolution inductively coupled plasma mass spectrometry (HR-ICP-MS), with respect to 62 elements. The elemental distributions were calculated and the experimental data compared with equilibrium estimations based on commercial and proprietary, published databases and carried out using the ChemSheet software. The results are discussed in terms of boiling temperatures, vapor pressures, redox potentials, and activities of the elements. These model calculations indicate a need for expanded databases with more and reliable thermodynamic data for trace elements in general and fume constituents in particular.

  3. Distributed computing for macromolecular crystallography

    PubMed Central

    Krissinel, Evgeny; Uski, Ville; Lebedev, Andrey; Ballard, Charles

    2018-01-01

    Modern crystallographic computing is characterized by the growing role of automated structure-solution pipelines, which represent complex expert systems utilizing a number of program components, decision makers and databases. They also require considerable computational resources and regular database maintenance, which is increasingly more difficult to provide at the level of individual desktop-based CCP4 setups. On the other hand, there is a significant growth in data processed in the field, which brings up the issue of centralized facilities for keeping both the data collected and structure-solution projects. The paradigm of distributed computing and data management offers a convenient approach to tackling these problems, which has become more attractive in recent years owing to the popularity of mobile devices such as tablets and ultra-portable laptops. In this article, an overview is given of developments by CCP4 aimed at bringing distributed crystallographic computations to a wide crystallographic community. PMID:29533240

  4. Distributed computing for macromolecular crystallography.

    PubMed

    Krissinel, Evgeny; Uski, Ville; Lebedev, Andrey; Winn, Martyn; Ballard, Charles

    2018-02-01

    Modern crystallographic computing is characterized by the growing role of automated structure-solution pipelines, which represent complex expert systems utilizing a number of program components, decision makers and databases. They also require considerable computational resources and regular database maintenance, which is increasingly more difficult to provide at the level of individual desktop-based CCP4 setups. On the other hand, there is a significant growth in data processed in the field, which brings up the issue of centralized facilities for keeping both the data collected and structure-solution projects. The paradigm of distributed computing and data management offers a convenient approach to tackling these problems, which has become more attractive in recent years owing to the popularity of mobile devices such as tablets and ultra-portable laptops. In this article, an overview is given of developments by CCP4 aimed at bringing distributed crystallographic computations to a wide crystallographic community.

  5. [Computerised monitoring of integrated cervical screening. Indicators of diagnostic performance].

    PubMed

    Bucchi, L; Pierri, C; Amadori, A; Folicaldi, S; Ghidoni, D; Nannini, R; Bondi, A

    2003-12-01

    In a previous issue of this journal, we presented the background, rationale, general methods, and indicators of participation of a computerised system for the monitoring of integrated cervical screening, i.e. the integration of spontaneous Pap smear practice into organised screening. We also reported the results of the application of those indicators in the general database of the Pathology Department of Imola Health District in northern Italy. In the current paper, we present the rationale and definitions of indicators of diagnostic performance (total Pap smears and rate of unsatisfactory Pap smears, distribution by cytology class reported, rate of patients without timely follow-up, detection rate, positive predictive value, distribution of cytology classes reported by histology diagnosis, and distribution of cases of CIN and carcinoma registered by detection modality) as well as the results of their application in the same database as above.

  6. A Survey on Distributed Mobile Database and Data Mining

    NASA Astrophysics Data System (ADS)

    Goel, Ajay Mohan; Mangla, Neeraj; Patel, R. B.

    2010-11-01

    The anticipated increase in popular use of the Internet has created more opportunity in information dissemination, e-commerce, and multimedia communication. It has also created more challenges in organizing information and facilitating its efficient retrieval. In response to this, new techniques have evolved which facilitate the creation of such applications. Certainly the most promising among the new paradigms is the use of mobile agents. In this paper, mobile agent and distributed database technologies are applied in the banking system. Many approaches have been proposed to schedule data items for broadcasting in a mobile environment. In this paper, we propose an efficient strategy for accessing multiple data items in mobile environments and address the bottleneck of current banking systems.

  7. A portal for the ocean biogeographic information system

    USGS Publications Warehouse

    Zhang, Yunqing; Grassle, J. F.

    2002-01-01

    Since its inception in 1999 the Ocean Biogeographic Information System (OBIS) has developed into an international science program as well as a globally distributed network of biogeographic databases. An OBIS portal at Rutgers University provides the links and functional interoperability among member database systems. Protocols and standards have been established to support effective communication between the portal and these functional units. The portal provides distributed data searching, a taxonomy name service, a GIS with access to relevant environmental data, biological modeling, and education modules for mariners, students, environmental managers, and scientists. The portal will integrate Census of Marine Life field projects, national data archives, and other functional modules, and provides for network-wide analyses and modeling tools.

  8. [LONI & Co: about the epistemic specificity of digital spaces of knowledge in cognitive neuroscience].

    PubMed

    Huber, Lara

    2011-06-01

    In the neurosciences, digital databases are increasingly becoming important tools for data rendering and distribution. This development is due to the growing impact of imaging-based trial design in cognitive neuroscience, including morphological as much as functional imaging technologies. As the case of the 'Laboratory of Neuro Imaging' (LONI) shows, databases are attributed a specific epistemological power: since the 1990s, databasing has been seen to foster the integration of neuroscientific data, although local regimes of data production, manipulation, and interpretation are also challenging this development. Databasing in the neurosciences goes along with the introduction of new structures for integrating local data, hence establishing digital spaces of knowledge (epistemic spaces): at this stage, inherent norms of digital databases are affecting regimes of imaging-based trial design, for example clinical research into Alzheimer's disease.

  9. DataTri, a database of American triatomine species occurrence

    NASA Astrophysics Data System (ADS)

    Ceccarelli, Soledad; Balsalobre, Agustín; Medone, Paula; Cano, María Eugenia; Gurgel Gonçalves, Rodrigo; Feliciangeli, Dora; Vezzani, Darío; Wisnivesky-Colli, Cristina; Gorla, David E.; Marti, Gerardo A.; Rabinovich, Jorge E.

    2018-04-01

    Trypanosoma cruzi, the causative agent of Chagas disease, is transmitted to mammals - including humans - by insect vectors of the subfamily Triatominae. We present the results of a compilation of triatomine occurrence and complementary ecological data that represents the most complete, integrated and updated database (DataTri) available on triatomine species at a continental scale. This database was assembled by collecting the records of triatomine species published from 1904 to 2017, spanning all American countries with triatomine presence. A total of 21815 georeferenced records were obtained from published literature, personal fieldwork and data provided by colleagues. The data compiled includes 24 American countries, 14 genera and 135 species. From a taxonomic perspective, 67.33% of the records correspond to the genus Triatoma, 20.81% to Panstrongylus, 9.01% to Rhodnius and the remaining 2.85% are distributed among the other 11 triatomine genera. We encourage using DataTri information in various areas, especially to improve knowledge of the geographical distribution of triatomine species and its variations in time.

  10. Distributed database kriging for adaptive sampling (D²KAS)

    DOE PAGES

    Roehm, Dominic; Pavel, Robert S.; Barros, Kipton; ...

    2015-03-18

    We present an adaptive sampling method supplemented by a distributed database and a prediction method for multiscale simulations using the Heterogeneous Multiscale Method. A finite-volume scheme integrates the macro-scale conservation laws for elastodynamics, which are closed by momentum and energy fluxes evaluated at the micro-scale. In the original approach, molecular dynamics (MD) simulations are launched for every macro-scale volume element. Our adaptive sampling scheme replaces a large fraction of costly micro-scale MD simulations with fast table lookup and prediction. The cloud database Redis provides the plain table lookup, and with locality aware hashing we gather input data for our prediction scheme. For the latter we use kriging, which estimates an unknown value and its uncertainty (error) at a specific location in parameter space by using weighted averages of the neighboring points. We find that our adaptive scheme significantly improves simulation performance by a factor of 2.5 to 25, while retaining high accuracy for various choices of the algorithm parameters.
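
    A minimal sketch of the predict-instead-of-simulate step: a simple-kriging estimate and its variance from cached neighbors with a Gaussian covariance model. Parameters and data are illustrative; the paper's scheme additionally uses the predicted uncertainty to decide whether a fresh MD simulation is needed.

    ```python
    # Hedged sketch: simple kriging over cached (input, response) pairs,
    # as a stand-in for predicting a micro-scale flux from database hits.
    import numpy as np

    def gaussian_cov(a, b, length=1.0):
        d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
        return np.exp(-(d / length) ** 2)

    X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])  # cached inputs
    z = np.array([1.0, 2.0, 1.5])                       # cached responses
    x_new = np.array([[0.5, 0.5]])                      # query point

    K = gaussian_cov(X, X) + 1e-9 * np.eye(len(X))      # jitter for stability
    k = gaussian_cov(X, x_new)
    weights = np.linalg.solve(K, k).ravel()
    estimate = weights @ z
    variance = 1.0 - weights @ k.ravel()                # kriging variance
    print(f"estimate={estimate:.3f}, variance={variance:.3f}")
    ```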

  11. Environmental Chemistry Compound Identification Using High ...

    EPA Pesticide Factsheets

    There is a growing need for rapid chemical screening and prioritization to inform regulatory decision-making on thousands of chemicals in the environment. We have previously used high-resolution mass spectrometry to examine household vacuum dust samples using liquid chromatography time-of-flight mass spectrometry (LC-TOF/MS). Using a combination of exact mass, isotope distribution, and isotope spacing, molecular features were matched with a list of chemical formulas from the EPA's Distributed Structure-Searchable Toxicity (DSSTox) database. This has further developed our understanding of how openly available chemical databases, together with the appropriate searches, could be used for the purpose of compound identification. We report here on the utility of the EPA's iCSS Chemistry Dashboard for the purpose of compound identification using searches against a database of over 720,000 chemicals. We also examine the benefits of QSAR prediction for the purpose of retention time prediction to allow for alignment of both chromatographic and mass spectral properties. This abstract does not reflect U.S. EPA policy. Presentation at the Eastern Analytical Symposium.

  12. DataTri, a database of American triatomine species occurrence.

    PubMed

    Ceccarelli, Soledad; Balsalobre, Agustín; Medone, Paula; Cano, María Eugenia; Gurgel Gonçalves, Rodrigo; Feliciangeli, Dora; Vezzani, Darío; Wisnivesky-Colli, Cristina; Gorla, David E; Marti, Gerardo A; Rabinovich, Jorge E

    2018-04-24

    Trypanosoma cruzi, the causative agent of Chagas disease, is transmitted to mammals - including humans - by insect vectors of the subfamily Triatominae. We present the results of a compilation of triatomine occurrence and complementary ecological data that represents the most complete, integrated and updated database (DataTri) available on triatomine species at a continental scale. This database was assembled by collecting the records of triatomine species published from 1904 to 2017, spanning all American countries with triatomine presence. A total of 21815 georeferenced records were obtained from published literature, personal fieldwork and data provided by colleagues. The data compiled includes 24 American countries, 14 genera and 135 species. From a taxonomic perspective, 67.33% of the records correspond to the genus Triatoma, 20.81% to Panstrongylus, 9.01% to Rhodnius and the remaining 2.85% are distributed among the other 11 triatomine genera. We encourage using DataTri information in various areas, especially to improve knowledge of the geographical distribution of triatomine species and its variations in time.

  13. Distributed policy based access to networked heterogeneous ISR data sources

    NASA Astrophysics Data System (ADS)

    Bent, G.; Vyvyan, D.; Wood, David; Zerfos, Petros; Calo, Seraphin

    2010-04-01

    Within a coalition environment, ad hoc Communities of Interest (CoIs) come together, perhaps for only a short time, with different sensors, sensor platforms, data fusion elements, and networks to conduct a task (or set of tasks) with different coalition members taking different roles. In such a coalition, each organization will have its own inherent restrictions on how it will interact with the others. These are usually stated as a set of policies, including security and privacy policies. The capability that we want to enable for a coalition operation is to provide access to information from any coalition partner in conformance with the policies of all. One of the challenges in supporting such ad-hoc coalition operations is that of providing efficient access to distributed sources of data, where the applications requiring the data do not have knowledge of the location of the data within the network. To address this challenge the International Technology Alliance (ITA) program has been developing the concept of a Dynamic Distributed Federated Database (DDFD), also known as a Gaian Database. This type of database provides a means for accessing data across a network of distributed heterogeneous data sources where access to the information is controlled by a mixture of local and global policies. We describe how a network of disparate ISR elements can be expressed as a DDFD and how this approach enables sensor and other information sources to be discovered autonomously or semi-autonomously and combined or fused, subject to formally defined local and global policies.
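
    A minimal sketch of policy-mediated federated access in the spirit of a DDFD: a query fans out to registered sources, and results are filtered by the intersection of each source's local policy with a coalition-wide global policy; all names, sources, and policies are illustrative.

    ```python
    # Hedged sketch: a federated query gated by local and global release
    # policies. Source names, rows, and policy vocabulary are made up.
    GLOBAL_POLICY = {"releasable_to": {"coalition"}}

    SOURCES = [
        {"name": "uav_tracks", "policy": {"releasable_to": {"coalition"}},
         "rows": [("track-17", "hostile")]},
        {"name": "partner_sigint", "policy": {"releasable_to": {"national"}},
         "rows": [("intercept-3", "restricted")]},
    ]

    def federated_query(requester_domain):
        for src in SOURCES:
            # A row is released only if both local and global policy allow it.
            allowed = src["policy"]["releasable_to"] & GLOBAL_POLICY["releasable_to"]
            if requester_domain in allowed:
                yield from ((src["name"], row) for row in src["rows"])

    print(list(federated_query("coalition")))  # only the releasable source answers
    ```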

  14. Patterns, biases and prospects in the distribution and diversity of Neotropical snakes.

    PubMed

    Guedes, Thaís B; Sawaya, Ricardo J; Zizka, Alexander; Laffan, Shawn; Faurby, Søren; Pyron, R Alexander; Bérnils, Renato S; Jansen, Martin; Passos, Paulo; Prudente, Ana L C; Cisneros-Heredia, Diego F; Braz, Henrique B; Nogueira, Cristiano de C; Antonelli, Alexandre; Meiri, Shai

    2018-01-01

    We generated a novel database of Neotropical snakes (one of the world's richest herpetofauna) combining the most comprehensive, manually compiled distribution dataset with publicly available data. We assess, for the first time, the diversity patterns for all Neotropical snakes as well as sampling density and sampling biases. We compiled three databases of species occurrences: a dataset downloaded from the Global Biodiversity Information Facility (GBIF), a verified dataset built through taxonomic work and specialized literature, and a combined dataset comprising a cleaned version of the GBIF dataset merged with the verified dataset. Location: Neotropics, Behrmann projection equivalent to 1° × 1°. Time period: specimens housed in museums during the last 150 years. Major taxa studied: Squamata: Serpentes. Methods: geographical information system (GIS). The combined dataset provides the most comprehensive distribution database for Neotropical snakes to date. It contains 147,515 records for 886 species across 12 families, representing 74% of all species of snakes, spanning 27 countries in the Americas. Species richness and phylogenetic diversity show overall similar patterns. Amazonia is the least sampled Neotropical region, whereas most well-sampled sites are located near large universities and scientific collections. We provide a list and updated maps of geographical distribution of all snake species surveyed. The biodiversity metrics of Neotropical snakes reflect patterns previously documented for other vertebrates, suggesting that similar factors may determine the diversity of both ectothermic and endothermic animals. We suggest conservation strategies for high-diversity areas and sampling efforts be directed towards Amazonia and poorly known species.

  15. Storage and distribution of pathology digital images using integrated web-based viewing systems.

    PubMed

    Marchevsky, Alberto M; Dulbandzhyan, Ronda; Seely, Kevin; Carey, Steve; Duncan, Raymond G

    2002-05-01

    Health care providers have expressed increasing interest in incorporating digital images of gross pathology specimens and photomicrographs in routine pathology reports. To describe the multiple technical and logistical challenges involved in the integration of the various components needed for the development of a system for integrated Web-based viewing, storage, and distribution of digital images in a large health system. An Oracle version 8.1.6 database was developed to store, index, and deploy pathology digital photographs via our Intranet. The database allows for retrieval of images by patient demographics or by SNOMED code information. The Intranet of a large health system accessible from multiple computers located within the medical center and at distant private physician offices. The images can be viewed using any of the workstations of the health system that have authorized access to our Intranet, using a standard browser or a browser configured with an external viewer or inexpensive plug-in software, such as Prizm 2.0. The images can be printed on paper or transferred to film using a digital film recorder. Digital images can also be displayed at pathology conferences by using wireless local area network (LAN) and secure remote technologies. The standardization of technologies and the adoption of a Web interface for all our computer systems allows us to distribute digital images from a pathology database to a potentially large group of users distributed in multiple locations throughout a large medical center.

  16. Using a Materials Database System as the Backbone for a Certified Quality System (AS/NZS ISO 9001:1994) for a Distance Education Centre.

    ERIC Educational Resources Information Center

    Hughes, Norm

    The Distance Education Center (DEC) of the University of Southern Queensland (Australia) has developed a unique materials database system which is used to monitor pre-production, design and development, production and post-production planning, scheduling, and distribution of all types of materials including courses offered only on the Internet. In…

  17. XML Technology Assessment

    DTIC Science & Technology

    2001-01-01

    System (GCCS) Track Database Management System (TDBM) (3) GCCS Integrated Imagery and Intelligence (3) Intelligence Shared Data Server (ISDS) General ...The CTH is a powerful model that will allow more than just message systems to exchange information. It could be used for object-oriented databases, as...of the Naval Integrated Tactical Environmental System I (NITES I) is used as a case study to demonstrate the utility of this distributed component

  18. Dynamic Terrain

    DTIC Science & Technology

    1991-12-30

    York, 1985. [Serway 86]: Raymond Serway, Physics for Scientists and Engineers, 2nd Edition, Saunders College Publishing, Philadelphia, 1986, pp. 200... Physical Modeling System 3.4 Realtime Hydrology 3.5 Soil Dynamics and Kinematics 4. Database Issues 4.1 Goals 4.2 Object Oriented Databases 4.3 Distributed... Animation System F. Constraints and Physical Modeling G. The PM Physical Modeling System H. Realtime Hydrology I. A Simplified Model of Soil Slumping

  19. Cost Considerations in Cloud Computing

    DTIC Science & Technology

    2014-01-01

    investments. 2. Database Options. The potential promise that "big data" analytics holds for many enterprise mission areas makes relevant the question of the... development of a range of new distributed file systems and databases that have better scalability properties than traditional SQL databases. Hadoop... data. Many systems exist that extend or supplement Hadoop, such as Apache Accumulo, which provides a highly granular mechanism for managing security

  20. Filling the gap in functional trait databases: use of ecological hypotheses to replace missing data.

    PubMed

    Taugourdeau, Simon; Villerd, Jean; Plantureux, Sylvain; Huguenin-Elie, Olivier; Amiaud, Bernard

    2014-04-01

    Functional trait databases are powerful tools in ecology, though most of them contain large amounts of missing values. The goal of this study was to test the effect of imputation methods on the evaluation of trait values at species level and on the subsequent calculation of functional diversity indices at community level using functional trait databases. Two simple imputation methods (average and median), two methods based on ecological hypotheses, and one multiple imputation method were tested using a large plant trait database, together with the influence of the percentage of missing data and differences between functional traits. At community level, the complete-case approach and three functional diversity indices calculated from grassland plant communities were included. At the species level, one of the methods based on ecological hypotheses was more accurate for all traits than imputation with average or median values, but the multiple imputation method was superior for most of the traits. The method based on functional proximity between species was the best method for traits with an unbalanced distribution, while the method based on the existence of relationships between traits was the best for traits with a balanced distribution. The ranking of the grassland communities for their functional diversity indices was not robust with the complete-case approach, even for low percentages of missing data. With the imputation methods based on ecological hypotheses, functional diversity indices could be computed with a maximum of 30% of missing data without affecting the ranking between grassland communities. The multiple imputation method performed well, but not better than single imputation based on ecological hypotheses and adapted to the distribution of the trait values for the functional identity and range of the communities. Ecological studies using functional trait databases have to deal with missing data using imputation methods corresponding to their specific needs and making the most of the information available in the databases. Within this framework, this study indicates the possibilities and limits of single imputation methods based on ecological hypotheses and concludes that they could be useful when studying the ranking of communities for their functional diversity indices.

  1. Enhancing SAMOS Data Access in DOMS via a Neo4j Property Graph Database.

    NASA Astrophysics Data System (ADS)

    Stallard, A. P.; Smith, S. R.; Elya, J. L.

    2016-12-01

    The Shipboard Automated Meteorological and Oceanographic System (SAMOS) initiative provides routine access to high-quality marine meteorological and near-surface oceanographic observations from research vessels. The Distributed Oceanographic Match-Up Service (DOMS) under development is a centralized service that allows researchers to easily match in situ and satellite oceanographic data from distributed sources to facilitate satellite calibration, validation, and retrieval algorithm development. The service currently uses Apache Solr as a backend search engine on each node in the distributed network. While Solr is a high-performance solution that facilitates creation and maintenance of indexed data, it is limited in the sense that its schema is fixed. The property graph model escapes this limitation by creating relationships between data objects. The authors will present the development of the SAMOS Neo4j property graph database including new search possibilities that take advantage of the property graph model, performance comparisons with Apache Solr, and a vision for graph databases as a storage tool for oceanographic data. The integration of the SAMOS Neo4j graph into DOMS will also be described. Currently, Neo4j contains spatial and temporal records from SAMOS which are modeled into a time tree and r-tree using GraphAware and Spatial plugin tools for Neo4j. These extensions provide callable Java procedures within CYPHER (Neo4j's query language) that generate in-graph structures. Once generated, these structures can be queried using procedures from these libraries, or directly via CYPHER statements. Neo4j excels at performing relationship and path-based queries, which challenge relational SQL databases because they require memory-intensive joins, a limitation of their design. Consider a user who wants to find records over several years, but only for specific months. If a traditional database only stores timestamps, this type of query would be complex and likely prohibitively slow. Using the time tree model, one can specify a path from the root to the data that restricts resolution to certain timeframes (e.g., months). This query can be executed without joins, unions, or other compute-intensive operations, putting Neo4j at a computational advantage over the SQL database alternative.
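
    As a concrete illustration of the month-across-years query described above, the following Python sketch runs a Cypher statement through the official neo4j driver. It assumes a GraphAware-style time tree with (:Year)-[:CHILD]->(:Month)-[:CHILD]->(:Day) nodes and SAMOS records attached to day nodes; the labels, relationship types, and connection details are illustrative assumptions, not the actual DOMS schema.

      # Hypothetical time-tree query (labels and schema are assumptions).
      from neo4j import GraphDatabase  # pip install neo4j

      CYPHER = """
      MATCH (y:Year)-[:CHILD]->(:Month {value: $month})
            -[:CHILD]->(:Day)<-[:OBSERVED_ON]-(rec:SamosRecord)
      WHERE y.value IN $years
      RETURN y.value AS year, count(rec) AS n_records
      ORDER BY year
      """

      driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
      with driver.session() as session:
          # e.g., only July records for 2012-2015, resolved by walking the tree
          # rather than scanning and filtering raw timestamps.
          for record in session.run(CYPHER, month=7, years=[2012, 2013, 2014, 2015]):
              print(record["year"], record["n_records"])
      driver.close()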

  2. Filling the gap in functional trait databases: use of ecological hypotheses to replace missing data

    PubMed Central

    Taugourdeau, Simon; Villerd, Jean; Plantureux, Sylvain; Huguenin-Elie, Olivier; Amiaud, Bernard

    2014-01-01

    Functional trait databases are powerful tools in ecology, though most of them contain large amounts of missing values. The goal of this study was to test the effect of imputation methods on the evaluation of trait values at species level and on the subsequent calculation of functional diversity indices at community level using functional trait databases. Two simple imputation methods (average and median), two methods based on ecological hypotheses, and one multiple imputation method were tested using a large plant trait database, together with the influence of the percentage of missing data and differences between functional traits. At community level, the complete-case approach and three functional diversity indices calculated from grassland plant communities were included. At the species level, one of the methods based on ecological hypotheses was more accurate for all traits than imputation with average or median values, but the multiple imputation method was superior for most of the traits. The method based on functional proximity between species was the best method for traits with an unbalanced distribution, while the method based on the existence of relationships between traits was the best for traits with a balanced distribution. The ranking of the grassland communities for their functional diversity indices was not robust with the complete-case approach, even for low percentages of missing data. With the imputation methods based on ecological hypotheses, functional diversity indices could be computed with a maximum of 30% of missing data without affecting the ranking between grassland communities. The multiple imputation method performed well, but not better than single imputation based on ecological hypotheses and adapted to the distribution of the trait values for the functional identity and range of the communities. Ecological studies using functional trait databases have to deal with missing data using imputation methods corresponding to their specific needs and making the most of the information available in the databases. Within this framework, this study indicates the possibilities and limits of single imputation methods based on ecological hypotheses and concludes that they could be useful when studying the ranking of communities for their functional diversity indices. PMID:24772273

  3. Wind Power Forecasting Error Frequency Analyses for Operational Power System Studies: Preprint

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Florita, A.; Hodge, B. M.; Milligan, M.

    2012-08-01

    The examination of wind power forecasting errors is crucial for optimal unit commitment and economic dispatch of power systems with significant wind power penetrations. This scheduling process includes both renewable and nonrenewable generators, and the incorporation of wind power forecasts will become increasingly important as wind fleets constitute a larger portion of generation portfolios. This research considers the Western Wind and Solar Integration Study database of wind power forecasts and numerical actualizations. This database comprises more than 30,000 locations spread over the western United States, with a total wind power capacity of 960 GW. Error analyses for individual sites and for specific balancing areas are performed using the database, quantifying the fit to theoretical distributions through goodness-of-fit metrics. Insights into wind-power forecasting error distributions are established for various levels of temporal and spatial resolution, contrasts made among the frequency distribution alternatives, and recommendations put forth for harnessing the results. Empirical data are used to produce more realistic site-level forecasts than previously employed, such that higher resolution operational studies are possible. This research feeds into a larger work of renewable integration through the links wind power forecasting has with various operational issues, such as stochastic unit commitment and flexible reserve level determination.
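
    To make the goodness-of-fit step concrete, here is a small hedged sketch: it fits several candidate error distributions to stand-in forecast-error data and ranks them with a Kolmogorov-Smirnov statistic. The data are simulated, not the Western Wind and Solar Integration database, and the candidate set is illustrative.

      # Distribution fitting and goodness-of-fit on synthetic forecast errors.
      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(0)
      errors = rng.laplace(loc=0.0, scale=0.05, size=10_000)  # stand-in errors (p.u.)

      for name, dist in [("normal", stats.norm), ("laplace", stats.laplace),
                         ("cauchy", stats.cauchy)]:
          params = dist.fit(errors)                      # maximum-likelihood fit
          ks_stat, p_value = stats.kstest(errors, dist.cdf, args=params)
          print(f"{name:8s} KS={ks_stat:.4f}")           # smaller KS = better fit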

  4. The Current Landscape of US Pediatric Anesthesiologists: Demographic Characteristics and Geographic Distribution.

    PubMed

    Muffly, Matthew K; Muffly, Tyler M; Weterings, Robbie; Singleton, Mark; Honkanen, Anita

    2016-07-01

    There is no comprehensive database of pediatric anesthesiologists, their demographic characteristics, or geographic location in the United States. We endeavored to create a comprehensive database of pediatric anesthesiologists by merging individuals identified as US pediatric anesthesiologists by the American Board of Anesthesiology, National Provider Identifier registry, Healthgrades.com database, and the Society for Pediatric Anesthesia membership list as of November 5, 2015. Professorial rank was accessed via the Association of American Medical Colleges and other online sources. Descriptive statistics characterized pediatric anesthesiologists' demographics. Pediatric anesthesiologists' locations at the city and state level were geocoded and mapped with the use of ArcGIS Desktop 10.1 mapping software (Redlands, CA). We identified 4048 pediatric anesthesiologists in the United States, which is approximately 8.8% of the physician anesthesiology workforce (n = 46,000). The median age of pediatric anesthesiologists was 49 years (interquartile range, 40-57 years), and the majority (56.4%) were men. Approximately two-thirds of identified pediatric anesthesiologists were subspecialty board certified in pediatric anesthesiology, and 33% of pediatric anesthesiologists had an identified academic affiliation. There is substantial heterogeneity in the geographic distribution of pediatric anesthesiologists by state and US Census Division with urban clustering. This description of pediatric anesthesiologists' demographic characteristics and geographic distribution fills an important gap in our understanding of pediatric anesthesia systems of care.

  5. Software Quality Measurement for Distributed Systems. Volume 3. Distributed Computing Systems: Impact on Software Quality.

    DTIC Science & Technology

    1983-07-01

    "C3I Application", "Space Systems Network", "Need for Distributed Database Management", and "Adaptive Routing". This is discussed in the last para... data reduction, buffering, encryption, and error detection and correction functions. Examples of such data streams include imagery data, video

  6. Creation of clinical research databases in the 21st century: a practical algorithm for HIPAA Compliance.

    PubMed

    Schell, Scott R

    2006-02-01

    Enforcement of the Health Insurance Portability and Accountability Act (HIPAA) began in April 2003. Designed as a law mandating health insurance availability when coverage was lost, HIPAA imposed sweeping and broad-reaching protections of patient privacy. These changes dramatically altered clinical research by placing sizeable regulatory burdens upon investigators with threat of severe and costly federal and civil penalties. This report describes development of an algorithmic approach to clinical research database design based upon a central key-shared data (CK-SD) model allowing researchers to easily analyze, distribute, and publish clinical research without disclosure of HIPAA Protected Health Information (PHI). Three clinical database formats (small clinical trial, operating room performance, and genetic microchip array datasets) were modeled using standard structured query language (SQL)-compliant databases. The CK database was created to contain PHI data, whereas a shareable SD database was generated in real-time containing relevant clinical outcome information while protecting PHI items. Small (< 100 records), medium (< 50,000 records), and large (> 10^8 records) model databases were created, and the resultant data models were evaluated in consultation with a HIPAA compliance officer. The SD database models complied fully with HIPAA regulations, and resulting "shared" data could be distributed freely. Unique patient identifiers were not required for treatment or outcome analysis. Age data were resolved to single-integer years, grouping patients aged > 89 years. Admission, discharge, treatment, and follow-up dates were replaced with enrollment year, and follow-up/outcome intervals were calculated, eliminating the original data. Two additional data fields identified as PHI (treating physician and facility) were replaced with integer values, and the original data corresponding to these values were stored in the CK database. Use of the algorithm at the time of database design did not increase cost or design effort. The CK-SD model for clinical database design provides an algorithm for investigators to create, maintain, and share clinical research data compliant with HIPAA regulations. This model is applicable to new projects and large institutional datasets, and should decrease regulatory efforts required for conduct of clinical research. Application of the design algorithm early in the clinical research enterprise does not increase cost or the effort of data collection.
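
    The CK-SD split lends itself to a short sketch. The following Python example is a minimal, hypothetical rendering of the scheme described above, using SQLite in place of a production RDBMS: PHI stays in the central key (CK) table, while the shareable (SD) table carries only a surrogate key, an age capped per the >89-years grouping, the enrollment year, and a computed follow-up interval. All field names are invented.

      # Minimal CK-SD de-identification sketch (illustrative schema, not the paper's).
      import sqlite3
      from datetime import date

      db = sqlite3.connect(":memory:")
      db.execute("CREATE TABLE ck (sid INTEGER PRIMARY KEY, name TEXT, physician TEXT)")
      db.execute("""CREATE TABLE sd (sid INTEGER, age_years INTEGER,
                    enroll_year INTEGER, followup_days INTEGER, outcome TEXT)""")

      def deidentify(name, physician, birth, admit, followup, outcome):
          # PHI goes into the central key table only.
          cur = db.execute("INSERT INTO ck (name, physician) VALUES (?, ?)",
                           (name, physician))
          sid = cur.lastrowid
          age = (admit - birth).days // 365
          age = 90 if age > 89 else age  # group patients aged > 89 years
          # The shared table keeps intervals and years, never raw dates.
          db.execute("INSERT INTO sd VALUES (?, ?, ?, ?, ?)",
                     (sid, age, admit.year, (followup - admit).days, outcome))

      deidentify("Jane Doe", "Dr. Smith", date(1930, 4, 2),
                 date(2005, 6, 1), date(2005, 9, 1), "recovered")
      print(db.execute("SELECT * FROM sd").fetchall())  # shareable, PHI-free rows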

  7. Measurement and application of bidirectional reflectance distribution function

    NASA Astrophysics Data System (ADS)

    Liao, Fei; Li, Lin; Lu, Chengwen

    2016-10-01

    When a beam of light with a certain intensity and distribution reaches the surface of a material, the distribution of the diffused light depends on the incident angle, the receiving angle, the wavelength of the light, and the type of material. The Bidirectional Reflectance Distribution Function (BRDF) describes this distribution. For an optical system, the BRDF of each optical and mechanical material is unique, and calculating the stray light of the system requires correct BRDF data for all of its materials. BRDF is of fundamental significance for space remote sensors, where it is needed for precise radiation calibration; it is also important in the military field, where it can be used for object identification and target tracking. In this paper, the BRDF of 11 kinds of aerospace materials is measured, more than 310,000 groups of BRDF data are acquired, and a BRDF database is established in China for the first time. With the BRDF data in the database, we can create the detector model and build the stray light radiation surface model in stray light analysis software. In this way, the stray radiation on the detector can be calculated correctly.
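
    For reference, the quantity being measured is the standard radiometric BRDF (this definition is textbook material, not taken from the record above): the ratio of the differential reflected radiance in the outgoing direction to the differential incident irradiance from the incoming direction,

      % Standard (Nicodemus) definition of the BRDF
      f_r(\omega_i, \omega_o)
        = \frac{\mathrm{d}L_r(\omega_o)}{\mathrm{d}E_i(\omega_i)}
        = \frac{\mathrm{d}L_r(\omega_o)}{L_i(\omega_i)\,\cos\theta_i\,\mathrm{d}\omega_i}

    where \theta_i is the angle between the incident direction and the surface normal. Measuring f_r over many incident/receiving angle pairs and wavelengths is what populates a BRDF database entry for one material.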

  8. Geologic map and map database of northeastern San Francisco Bay region, California, [including] most of Solano County and parts of Napa, Marin, Contra Costa, San Joaquin, Sacramento, Yolo, and Sonoma Counties

    USGS Publications Warehouse

    Graymer, Russell Walter; Jones, David Lawrence; Brabb, Earl E.

    2002-01-01

    This digital map database, compiled from previously published and unpublished data, and new mapping by the authors, represents the general distribution of bedrock and surficial deposits in the mapped area. Together with the accompanying text file (nesfmf.ps, nesfmf.pdf, nesfmf.txt), it provides current information on the geologic structure and stratigraphy of the area covered. The database delineates map units that are identified by general age and lithology following the stratigraphic nomenclature of the U.S. Geological Survey. The scale of the source maps limits the spatial resolution (scale) of the database to 1:62,500 or smaller.

  9. Global building inventory for earthquake loss estimation and risk management

    USGS Publications Warehouse

    Jaiswal, Kishor; Wald, David; Porter, Keith

    2010-01-01

    We develop a global database of building inventories using a taxonomy of global building types for use in near-real-time post-earthquake loss estimation and pre-earthquake risk analysis, for the U.S. Geological Survey’s Prompt Assessment of Global Earthquakes for Response (PAGER) program. The database is available for public use, subject to peer review, scrutiny, and open enhancement. On a country-by-country level, it contains estimates of the distribution of building types categorized by material, lateral force resisting system, and occupancy type (residential or nonresidential, urban or rural). The database draws on and harmonizes numerous sources: (1) UN statistics, (2) UN Habitat’s demographic and health survey (DHS) database, (3) national housing censuses, (4) the World Housing Encyclopedia, and (5) other literature.

  10. Geologic map of the Grand Canyon 30' x 60' quadrangle, Coconino and Mohave Counties, northwestern Arizona

    USGS Publications Warehouse

    Billingsley, G.H.

    2000-01-01

    This digital map database, compiled from previously published and unpublished data as well as new mapping by the author, represents the general distribution of bedrock and surficial deposits in the map area. Together with the accompanying pamphlet, it provides current information on the geologic structure and stratigraphy of the Grand Canyon area. The database delineates map units that are identified by general age and lithology following the stratigraphic nomenclature of the U.S. Geological Survey. The scale of the source maps limits the spatial resolution (scale) of the database to 1:100,000 or smaller.

  11. A data analysis expert system for large established distributed databases

    NASA Technical Reports Server (NTRS)

    Gnacek, Anne-Marie; An, Y. Kim; Ryan, J. Patrick

    1987-01-01

    A design for a natural language database interface system, called the Deductively Augmented NASA Management Decision support System (DANMDS), is presented. The DANMDS system components have been chosen on the basis of the following considerations: maximal employment of the existing NASA IBM-PC computers and supporting software; local structuring and storing of external data via the entity-relationship model; a natural, easy-to-use, error-free database query language; user ability to alter the query language vocabulary and data analysis heuristics; and significant artificial intelligence data analysis heuristic techniques that allow the system to become progressively and automatically more useful.

  12. Online bibliographic sources in hydrology

    USGS Publications Warehouse

    Wild, Emily C.; Havener, W. Michael

    2001-01-01

    Traditional commercial bibliographic databases and indexes provide some access to hydrology materials produced by the government; however, these sources do not provide comprehensive coverage of relevant hydrologic publications. This paper discusses bibliographic information available from the federal government and state geological surveys, water resources agencies, and depositories. In addition to information in these databases, the paper describes the scope, styles of citing, subject terminology, and the ways these information sources are currently being searched, formally and informally, by hydrologists. Information available from the federal and state agencies and from the state depositories might be missed by limiting searches to commercially distributed databases.

  13. An empirical assessment of taxic paleobiology.

    PubMed

    Adrain, J M; Westrop, S R

    2000-07-07

    The analysis of major changes in faunal diversity through time is a central theme of analytical paleobiology. The most important sources of data are literature-based compilations of stratigraphic ranges of fossil taxa. The levels of error in these compilations and the possible effects of such error have often been discussed but never directly assessed. We compared our comprehensive database of trilobites to the equivalent portion of J. J. Sepkoski Jr.'s widely used global genus database. More than 70% of entries in the global database are inaccurate; however, as predicted, the error is randomly distributed and does not introduce bias.

  14. Documentation of the U.S. Geological Survey Oceanographic Time-Series Measurement Database

    USGS Publications Warehouse

    Montgomery, Ellyn T.; Martini, Marinna A.; Lightsom, Frances L.; Butman, Bradford

    2008-01-02

    This report describes the instrumentation and platforms used to make the measurements; the methods used to process, quality-control, and archive the data; the data storage format; and how the data are released and distributed. The report also includes instructions on how to access the data from the online database at http://stellwagen.er.usgs.gov/. As of 2016, the database contains about 5,000 files, which may include observations of current velocity, wave statistics, ocean temperature, conductivity, pressure, and light transmission at one or more depths over some duration of time.

  15. An Investigation of the Fine Spatial Structure of Meteor Streams Using the Relational Database "Meteor"

    NASA Astrophysics Data System (ADS)

    Karpov, A. V.; Yumagulov, E. Z.

    2003-05-01

    We have restored and ordered the archive of meteor observations carried out with the "KGU-M5" meteor radar complex since 1986. A relational database has been formed under the control of the Oracle 8 Database Management System (DBMS). We also improved and tested a statistical method for studying the fine spatial structure of meteor streams with allowance for the specific features of application of the DBMS. Statistical analysis of the results of observations made it possible to obtain information about the substance distribution in the Quadrantid, Geminid, and Perseid meteor streams.

  16. Neuroimaging Data Sharing on the Neuroinformatics Database Platform

    PubMed Central

    Book, Gregory A; Stevens, Michael; Assaf, Michal; Glahn, David; Pearlson, Godfrey D

    2015-01-01

    We describe the Neuroinformatics Database (NiDB), an open-source database platform for archiving, analysis, and sharing of neuroimaging data. Data from the multi-site projects Autism Brain Imaging Data Exchange (ABIDE), Bipolar-Schizophrenia Network on Intermediate Phenotypes parts one and two (B-SNIP1, B-SNIP2), and Monetary Incentive Delay task (MID) are available for download from the public instance of NiDB, with more projects sharing data as it becomes available. As demonstrated by making several large datasets available, NiDB is an extensible platform appropriately suited to archive and distribute shared neuroimaging data. PMID:25888923

  17. Preliminary surficial geologic map of a Calico Mountains piedmont and part of Coyote Lake, Mojave desert, San Bernardino County, California

    USGS Publications Warehouse

    Dudash, Stephanie L.

    2006-01-01

    This 1:24,000 scale detailed surficial geologic map and digital database of a Calico Mountains piedmont and part of Coyote Lake in south-central California depicts surficial deposits and generalized bedrock units. The mapping is part of a USGS project to investigate the spatial distribution of deposits linked to changes in climate, to provide framework geology for land use management (http://deserts.wr.usgs.gov), to understand the Quaternary tectonic history of the Mojave Desert, and to provide additional information on the history of Lake Manix, of which Coyote Lake is a sub-basin. Mapping is displayed on parts of four USGS 7.5 minute series topographic maps. The map area lies in the central Mojave Desert of California, northeast of Barstow, Calif. and south of Fort Irwin, Calif., and covers 258 sq km (99.5 sq mi). Geologic deposits in the area consist of Paleozoic metamorphic rocks, Mesozoic plutonic rocks, Miocene volcanic rocks, Pliocene-Pleistocene basin fill, and Quaternary surficial deposits. McCulloh (1960, 1965) conducted bedrock mapping, and a generalized version of his maps is compiled into this map. McCulloh's maps contain many bedrock structures within the Calico Mountains that are not shown on the present map. This study resulted in several new findings, including the discovery of previously unrecognized faults, one of which is the Tin Can Alley fault. The north-striking Tin Can Alley fault is part of the Paradise fault zone (Miller and others, 2005), a potentially important feature for studying neo-tectonic strain in the Mojave Desert. Additionally, many Anodonta shells were collected in Coyote Lake lacustrine sediments for radiocarbon dating. Preliminary results support some of Meek's (1999) conclusions on the timing of Mojave River inflow into the Coyote Basin. The database includes information on geologic deposits, samples, and geochronology. The database is distributed in three parts: spatial map-based data, documentation, and printable map graphics of the database. Spatial data are distributed as an ArcInfo personal geodatabase, or as tabular data in the form of Microsoft Access Database (MDB) or dBase Format (DBF) file formats. Documentation includes this file, which provides a discussion of the surficial geology and describes the format and content of the map data, and Federal Geographic Data Committee (FGDC) metadata for the spatial map information. Map graphics files are distributed as Postscript and Adobe Acrobat Portable Document Format (PDF) files, and are appropriate for representing a view of the spatial database at the mapped scale.

  18. Method to assess the temporal persistence of potential biometric features: Application to oculomotor, gait, face and brain structure databases

    PubMed Central

    Friedman, Lee; Nixon, Mark S.; Komogortsev, Oleg V.

    2017-01-01

    We introduce the intraclass correlation coefficient (ICC) to the biometric community as an index of the temporal persistence, or stability, of a single biometric feature. It requires, as input, a feature on an interval or ratio scale that is reasonably normally distributed, and it can only be calculated if each subject is tested on 2 or more occasions. For a biometric system, with multiple features available for selection, the ICC can be used to measure the relative stability of each feature. We show, for 14 distinct data sets (1 synthetic, 8 eye-movement-related, 2 gait-related, 2 face-recognition-related, and 1 brain-structure-related), that selecting the most stable features, based on the ICC, generally resulted in the best biometric performance. Analyses based on using only the most stable features produced superior Rank-1-Identification Rate (Rank-1-IR) performance in 12 of 14 databases (p = 0.0065, one-tailed), when compared to other sets of features, including the set of all features. For Equal Error Rate (EER), using a subset of only high-ICC features also produced superior performance in 12 of 14 databases (p = 0.0065, one-tailed). In general, then, for our databases, prescreening potential biometric features and choosing only highly reliable features yields better performance than choosing lower-ICC features or all features combined. We also determined that, as the ICC of a group of features increases, the median of the genuine similarity score distribution increases and the spread of this distribution decreases. There were no statistically significant similar relationships for the impostor distributions. We believe that the ICC will find many uses in biometric research. In the case of eye-movement-driven biometrics, the use of reliable features, as measured by the ICC, allowed us to achieve authentication performance with EER = 2.01%, which was not possible before. PMID:28575030

  19. Method to assess the temporal persistence of potential biometric features: Application to oculomotor, gait, face and brain structure databases.

    PubMed

    Friedman, Lee; Nixon, Mark S; Komogortsev, Oleg V

    2017-01-01

    We introduce the intraclass correlation coefficient (ICC) to the biometric community as an index of the temporal persistence, or stability, of a single biometric feature. It requires, as input, a feature on an interval or ratio scale that is reasonably normally distributed, and it can only be calculated if each subject is tested on 2 or more occasions. For a biometric system, with multiple features available for selection, the ICC can be used to measure the relative stability of each feature. We show, for 14 distinct data sets (1 synthetic, 8 eye-movement-related, 2 gait-related, 2 face-recognition-related, and 1 brain-structure-related), that selecting the most stable features, based on the ICC, generally resulted in the best biometric performance. Analyses based on using only the most stable features produced superior Rank-1-Identification Rate (Rank-1-IR) performance in 12 of 14 databases (p = 0.0065, one-tailed), when compared to other sets of features, including the set of all features. For Equal Error Rate (EER), using a subset of only high-ICC features also produced superior performance in 12 of 14 databases (p = 0.0065, one-tailed). In general, then, for our databases, prescreening potential biometric features and choosing only highly reliable features yields better performance than choosing lower-ICC features or all features combined. We also determined that, as the ICC of a group of features increases, the median of the genuine similarity score distribution increases and the spread of this distribution decreases. There were no statistically significant similar relationships for the impostor distributions. We believe that the ICC will find many uses in biometric research. In the case of eye-movement-driven biometrics, the use of reliable features, as measured by the ICC, allowed us to achieve authentication performance with EER = 2.01%, which was not possible before.
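
    To make the stability index concrete, here is a small sketch computing a one-way random-effects ICC for a single feature measured on two occasions per subject. The data are synthetic, and the specific ICC variant used in the paper is not restated here, so treat this as an illustration of the general quantity rather than a reproduction of the authors' computation.

      # One-way random-effects ICC on synthetic two-session feature data.
      import numpy as np

      def icc1(x: np.ndarray) -> float:
          """x has shape (n_subjects, k_sessions)."""
          n, k = x.shape
          grand = x.mean()
          ms_between = k * ((x.mean(axis=1) - grand) ** 2).sum() / (n - 1)
          ms_within = ((x - x.mean(axis=1, keepdims=True)) ** 2).sum() / (n * (k - 1))
          return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

      rng = np.random.default_rng(1)
      trait = rng.normal(0.0, 1.0, size=(200, 1))        # stable per-subject component
      sessions = trait + rng.normal(0.0, 0.5, (200, 2))  # plus per-session noise
      print(f"ICC = {icc1(sessions):.3f}")  # near 1.0 = temporally persistent feature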

  20. CRETACEOUS CLIMATE SENSITIVITY STUDY USING DINOSAUR & PLANT PALEOBIOGEOGRAPHY

    NASA Astrophysics Data System (ADS)

    Goswami, A.; Main, D. J.; Noto, C. R.; Moore, T. L.; Scotese, C.

    2009-12-01

    The Early Cretaceous was characterized by cool poles and moderate global temperatures (~16 °C). During the mid and late Cretaceous, long-term global warming (~20-22 °C) was driven by increasing levels of CO2, rising sea level (lowering albedo), and the continuing breakup of Pangea. Paleoclimatic reconstructions for four time intervals during the Cretaceous are presented here: Middle Campanian (80 Ma), Cenomanian/Turonian (90 Ma), Early Albian (110 Ma), and Barremian-Hauterivian (130 Ma). These paleoclimate simulations were prepared using the Fast Ocean and Atmosphere Model (FOAM). The simulated results show the pattern of the pole-to-Equator temperature gradients, rainfall, surface run-off, and the location of major rivers and deltas. In order to investigate the effect of potential dispersal routes on paleobiogeographic patterns, a time-slice series of maps from the Early to Late Cretaceous was produced showing plots of dinosaur and plant fossil distributions. These maps were created utilizing: 1) plant fossil localities from the GEON and Paleobiology Database (PBDB) databases; and 2) dinosaur fossil localities from an updated version of the Dinosauria (Weishampel, 2004) database. These results are compared to two different types of datasets: 1) a paleotemperature database for the Cretaceous, and 2) locality data obtained from the GEON, PBDB, and Dinosauria databases. Global latitudinal mean temperatures from both the model and the paleotemperature database were plotted on a series of latitudinal graphs along with the distributions of fossil plants and dinosaurs. It was found that most dinosaur localities through the Cretaceous tend to cluster within specific climate belts, or envelopes. These Cretaceous maps also show variance in the biogeographic zonation of both plants and dinosaurs that is commensurate with reconstructed climate patterns and geography. These data are particularly useful for understanding the response of late Mesozoic ecosystems to geographic and climatic conditions that differed markedly from the present. Studies of past biotas and their changes may elucidate the role of climatic and geographic factors in driving changes in species distributions, ecosystem organization, and evolutionary dynamics over time.

  1. Column Store for GWAC: A High-cadence, High-density, Large-scale Astronomical Light Curve Pipeline and Distributed Shared-nothing Database

    NASA Astrophysics Data System (ADS)

    Wan, Meng; Wu, Chao; Wang, Jing; Qiu, Yulei; Xin, Liping; Mullender, Sjoerd; Mühleisen, Hannes; Scheers, Bart; Zhang, Ying; Nes, Niels; Kersten, Martin; Huang, Yongpan; Deng, Jinsong; Wei, Jianyan

    2016-11-01

    The ground-based wide-angle camera array (GWAC), a part of the SVOM space mission, will search for various types of optical transients by continuously imaging a field of view (FOV) of 5000 square degrees every 15 s. Each exposure consists of 36 × 4k × 4k pixels, typically resulting in 36 × ~175,600 extracted sources. For a modern time-domain astronomy project like GWAC, which produces massive amounts of data with a high cadence, it is challenging to search for short timescale transients in both real-time and archived data, and to build long-term light curves for variable sources. Here, we develop a high-cadence, high-density light curve pipeline (HCHDLP) to process the GWAC data in real-time, and design a distributed shared-nothing database to manage the massive amount of archived data which will be used to generate a source catalog with more than 100 billion records during 10 years of operation. First, we develop HCHDLP based on the column-store DBMS of MonetDB, taking advantage of MonetDB’s high performance when applied to massive data processing. To realize the real-time functionality of HCHDLP, we optimize the pipeline in its source association function, including both time and space complexity from outside the database (SQL semantics) and inside (RANGE-JOIN implementation), as well as in its strategy of building complex light curves. The optimized source association function is accelerated by three orders of magnitude. Second, we build a distributed database using a two-level time partitioning strategy via the MERGE TABLE and REMOTE TABLE technology of MonetDB. Intensive tests validate that our database architecture is able to achieve both linear scalability in response time and concurrent access by multiple users. In summary, our studies provide guidance for a solution to GWAC in real-time data processing and management of massive data.
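
    The two-level partitioning can be sketched with MonetDB's documented MERGE TABLE and REMOTE TABLE features. The following Python fragment (via pymonetdb) is illustrative only: the table layout, node addresses, and database names are invented, not the actual GWAC schema.

      # Hypothetical two-level time partitioning in MonetDB (names are invented).
      import pymonetdb  # pip install pymonetdb

      conn = pymonetdb.connect(username="monetdb", password="monetdb",
                               hostname="localhost", database="gwac")
      cur = conn.cursor()

      # Level 1: a local merge table that unions the time partitions.
      cur.execute("CREATE MERGE TABLE lightcurve (src BIGINT, mjd DOUBLE, mag REAL)")
      # Level 2: each partition lives on a shard, referenced as a remote table.
      cur.execute("""CREATE REMOTE TABLE lightcurve_2016_q1
                     (src BIGINT, mjd DOUBLE, mag REAL)
                     ON 'mapi:monetdb://node1:50000/gwac'""")
      cur.execute("ALTER TABLE lightcurve ADD TABLE lightcurve_2016_q1")

      # Queries against the merge table are pruned to the relevant partitions.
      cur.execute("SELECT count(*) FROM lightcurve WHERE mjd BETWEEN 57400 AND 57450")
      print(cur.fetchone())
      conn.commit()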

  2. The National Landslide Database of Great Britain: Acquisition, communication and the role of social media

    NASA Astrophysics Data System (ADS)

    Pennington, Catherine; Freeborough, Katy; Dashwood, Claire; Dijkstra, Tom; Lawrie, Kenneth

    2015-11-01

    The British Geological Survey (BGS) is the national geological agency for Great Britain that provides geoscientific information to government, other institutions and the public. The National Landslide Database has been developed by the BGS and is the focus for national geohazard research for landslides in Great Britain. The history and structure of the geospatial database and associated Geographical Information System (GIS) are explained, along with the future developments of the database and its applications. The database is the most extensive source of information on landslides in Great Britain with over 17,000 records of landslide events to date, each documented as fully as possible for inland, coastal and artificial slopes. Data are gathered through a range of procedures, including: incorporation of other databases; automated trawling of current and historical scientific literature and media reports; new field- and desk-based mapping technologies with digital data capture, and using citizen science through social media and other online resources. This information is invaluable for directing the investigation, prevention and mitigation of areas of unstable ground in accordance with Government planning policy guidelines. The national landslide susceptibility map (GeoSure) and a national landslide domains map currently under development, as well as regional mapping campaigns, rely heavily on the information contained within the landslide database. Assessing susceptibility to landsliding requires knowledge of the distribution of failures, an understanding of causative factors, their spatial distribution and likely impacts, whilst understanding the frequency and types of landsliding present is integral to modelling how rainfall will influence the stability of a region. Communication of landslide data through the Natural Hazard Partnership (NHP) and Hazard Impact Model contributes to national hazard mitigation and disaster risk reduction with respect to weather and climate. Daily reports of landslide potential are published by BGS through the NHP partnership and data collected for the National Landslide Database are used widely for the creation of these assessments. The National Landslide Database is freely available via an online GIS and is used by a variety of stakeholders for research purposes.

  3. Distributed Generation Energy Technology Capital Costs

    Science.gov Websites

    The Transparent Cost Database provides the following charts of recent capital costs for distributed generation energy technologies, compiling available national-level cost data from a variety of sources, along with the data used within these charts. If you are seeking utility-scale technology capital cost

  4. Very Large Scale Distributed Information Processing Systems

    DTIC Science & Technology

    1991-09-27

    USENIX Conference Proceedings, pp. 31-43. USENIX, February 1988. [KLA90] Michael L. Kazar, Bruce W. Leverett, Owen T. Anderson, Vasilis Apostolides, Beth... [Figure 2: A Distributed Database System]

  5. Distributing an Online Catalog on CD-ROM...The University of Illinois Experience.

    ERIC Educational Resources Information Center

    Watson, Paula D.; Golden, Gary A.

    1987-01-01

    Description of the planning of a project designed to test the feasibility of distributing a statewide union catalog database on optical disk discusses the relationship of the project's goals to those of statewide library development; dealing with vendors in a volatile, high technology industry; and plans for testing and evaluation. (EM)

  6. Environmental Carcinogen Releases and Lung Cancer Mortality in Rural-Urban Areas of the United States

    ERIC Educational Resources Information Center

    Luo, Juhua; Hendryx, Michael

    2011-01-01

    Purpose: Environmental hazards are unevenly distributed across communities and populations; however, little is known about the distribution of environmental carcinogenic pollutants and lung cancer risk across populations defined by race, sex, and rural-urban setting. Methods: We used the Toxics Release Inventory (TRI) database to conduct an…

  7. PHYLOGENETIC AFFILIATION OF WATER DISTRIBUTION SYSTEM BACTERIAL ISOLATES USING 16S RDNA SEQUENCE ANALYSIS

    EPA Science Inventory

    In a previously described study, only 15% of the bacterial strains isolated from a water distribution system (WDS) grown on R2A agar were identifiable using fatty acid methyl ester (FAME) profiling. The lack of success was attributed to the use of fatty acid databases of bacter...

  8. Upgrades to the TPSX Material Properties Database

    NASA Technical Reports Server (NTRS)

    Squire, T. H.; Milos, F. S.; Partridge, Harry (Technical Monitor)

    2001-01-01

    The TPSX Material Properties Database is a web-based tool that serves as a database for properties of advanced thermal protection materials. TPSX provides an easy user interface for retrieving material property information in a variety of forms, both graphical and text. The primary purpose and advantage of TPSX is to maintain a high quality source of often-used thermal protection material properties in a convenient, easily accessible form, for distribution to government and aerospace industry communities. Last year a major upgrade to the TPSX web site was completed. This year, through the efforts of researchers at several NASA centers, the Office of the Chief Engineer awarded funds to update and expand the databases in TPSX. The FY01 effort focuses on updating and correcting the Ames and Johnson thermal protection materials databases. In this session we will summarize the improvements made to the web site last year, report on the status of the on-going database updates, describe the planned upgrades for FY02 and FY03, and provide a demonstration of TPSX.

  9. Equations for hydraulic conductivity estimation from particle size distribution: A dimensional analysis

    NASA Astrophysics Data System (ADS)

    Wang, Ji-Peng; François, Bertrand; Lambert, Pierre

    2017-09-01

    Estimating hydraulic conductivity from particle size distribution (PSD) is an important issue for various engineering problems. Classical models such as the Hazen model, the Beyer model, and the Kozeny-Carman model usually regard the grain diameter at 10% passing (d10) as an effective grain size, and the effects of particle size uniformity (in the Beyer model) or porosity (in the Kozeny-Carman model) are sometimes embedded. This technical note applies dimensional analysis (Buckingham's Π theorem) to analyze the relationship between hydraulic conductivity and particle size distribution (PSD). The porosity is regarded as a dependent variable on the grain size distribution in unconsolidated conditions. The analysis indicates that the coefficient of grain size uniformity and a dimensionless group representing the gravity effect, which is proportional to the mean grain volume, are the two main determinative parameters for estimating hydraulic conductivity. Regression analysis is then carried out on a database comprising 431 samples collected from different depositional environments, and new equations are developed for hydraulic conductivity estimation. The new equation, validated on specimens beyond the database, shows improved prediction compared to the classic models.
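
    For context, the classical estimators mentioned above are commonly written as follows (standard textbook forms, with C_H an empirical coefficient, d_10 the 10%-passing grain diameter, n the porosity, g gravitational acceleration, and \nu the kinematic viscosity):

      K_{\text{Hazen}} = C_H\, d_{10}^{2},
      \qquad
      K_{\text{Kozeny-Carman}} = \frac{g}{\nu}\,\frac{n^{3}}{(1-n)^{2}}\,\frac{d_{10}^{2}}{180}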

  10. NEMiD: a web-based curated microbial diversity database with geo-based plotting.

    PubMed

    Bhattacharjee, Kaushik; Joshi, Santa Ram

    2014-01-01

    The majority of the Earth's microbes remain unknown, and their potential utility cannot be exploited until they are discovered and characterized. They provide wide scope for the development of new strains as well as biotechnological uses. The documentation and bioprospection of microorganisms carry enormous significance considering their relevance to human welfare. This calls for an urgent need to develop a database with emphasis on the microbial diversity of the largest untapped reservoirs in the biosphere. The data annotated in the North-East India Microbial database (NEMiD) were obtained by the isolation and characterization of microbes from different parts of the Eastern Himalayan region. The database was constructed as a relational database management system (RDBMS) for data storage in MySQL in the back-end on a Linux server and implemented in an Apache/PHP environment. This database provides a base for understanding the soil microbial diversity pattern in this megabiodiversity hotspot and indicates the distribution patterns of various organisms along with their identification. The NEMiD database is freely available at www.mblabnehu.info/nemid/.

  11. Using ontology databases for scalable query answering, inconsistency detection, and data integration

    PubMed Central

    Dou, Dejing

    2011-01-01

    An ontology database is a basic relational database management system that models an ontology plus its instances. To reason over the transitive closure of instances in the subsumption hierarchy, for example, an ontology database can either unfold views at query time or propagate assertions using triggers at load time. In this paper, we use existing benchmarks to evaluate our method—using triggers—and we demonstrate that by forward computing inferences, we not only improve query time, but the improvement appears to cost only more space (not time). However, we go on to show that the true penalties were simply opaque to the benchmark, i.e., the benchmark inadequately captures load-time costs. We have applied our methods to two case studies in biomedicine, using ontologies and data from genetics and neuroscience to illustrate two important applications: first, ontology databases answer ontology-based queries effectively; second, using triggers, ontology databases detect instance-based inconsistencies—something not possible using views. Finally, we demonstrate how to extend our methods to perform data integration across multiple, distributed ontology databases. PMID:22163378
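
    The trigger strategy is easy to demonstrate in miniature. The sketch below uses SQLite in place of the paper's RDBMS: an AFTER INSERT trigger re-asserts each individual under the direct superclass of its asserted class, and because the trigger refires on its own inserts, the membership propagates to the root, materializing the transitive closure at load time. The schema is invented for illustration.

      # Forward-computing subsumption inferences with a trigger (illustrative schema).
      import sqlite3

      db = sqlite3.connect(":memory:")
      db.execute("PRAGMA recursive_triggers = ON")  # let the trigger refire on its own inserts
      db.executescript("""
      CREATE TABLE subclass_of (sub TEXT, sup TEXT);
      CREATE TABLE assertion (individual TEXT, class TEXT, UNIQUE(individual, class));
      CREATE TRIGGER propagate AFTER INSERT ON assertion
      BEGIN
        -- re-insert the individual under every direct superclass
        INSERT OR IGNORE INTO assertion
          SELECT NEW.individual, sup FROM subclass_of WHERE sub = NEW.class;
      END;
      INSERT INTO subclass_of VALUES ('PurkinjeCell', 'Neuron'), ('Neuron', 'Cell');
      INSERT INTO assertion VALUES ('cell_42', 'PurkinjeCell');
      """)
      # The superclass query needs no view unfolding or recursive join at query time.
      print(db.execute("SELECT individual FROM assertion WHERE class = 'Cell'").fetchall())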

  12. A blue carbon soil database: Tidal wetland stocks for the US National Greenhouse Gas Inventory

    NASA Astrophysics Data System (ADS)

    Feagin, R. A.; Eriksson, M.; Hinson, A.; Najjar, R. G.; Kroeger, K. D.; Herrmann, M.; Holmquist, J. R.; Windham-Myers, L.; MacDonald, G. M.; Brown, L. N.; Bianchi, T. S.

    2015-12-01

    Coastal wetlands contain large reservoirs of carbon, and in 2015 the US National Greenhouse Gas Inventory began the work of placing blue carbon within the national regulatory context. The potential value of a wetland carbon stock, in relation to its location, soon could be influential in determining governmental policy and management activities, or in stimulating market-based CO2 sequestration projects. To meet the national need for high-resolution maps, a blue carbon stock database was developed linking National Wetlands Inventory datasets with the USDA Soil Survey Geographic Database. Users of the database can identify the economic potential for carbon conservation or restoration projects within specific estuarine basins, states, wetland types, physical parameters, and land management activities. The database is geared towards both national-level assessments and local-level inquiries. Spatial analysis of the stocks shows high variance within individual estuarine basins, largely dependent on geomorphic position on the landscape, though there are continental-scale trends to the carbon distribution as well. Future plans include linking this database with a sedimentary accretion database to predict carbon flux in US tidal wetlands.

  13. Indigenous species barcode database improves the identification of zooplankton

    PubMed Central

    Yang, Jianghua; Zhang, Wanwan; Sun, Jingying; Xie, Yuwei; Zhang, Yimin; Burton, G. Allen; Yu, Hongxia

    2017-01-01

    Incompleteness and inaccuracy of DNA barcode databases is considered an important hindrance to the use of metabarcoding in biodiversity analysis of zooplankton at the species level. Species barcoding by Sanger sequencing is inefficient for organisms with small body sizes, such as zooplankton. Here mitochondrial cytochrome c oxidase I (COI) fragment barcodes from 910 freshwater zooplankton specimens (87 morphospecies) were recovered by a high-throughput sequencing platform, Ion Torrent PGM. Intraspecific divergence of most zooplankton taxa was < 5%, except Branchionus leydign (Rotifer, 14.3%), Trichocerca elongate (Rotifer, 11.5%), Lecane bulla (Rotifer, 15.9%), Synchaeta oblonga (Rotifer, 5.95%) and Schmackeria forbesi (Copepod, 6.5%). Metabarcoding data of 28 environmental samples from Lake Tai were annotated by both an indigenous database and the NCBI GenBank database. The indigenous database improved the taxonomic assignment of metabarcoding of zooplankton. Most zooplankton (81%) with barcode sequences in the indigenous database were identified by metabarcoding monitoring. Furthermore, the frequency and distribution of zooplankton were also consistent between metabarcoding and morphology identification. Overall, the indigenous database improved the taxonomic assignment of zooplankton. PMID:28977035

  14. NEMiD: A Web-Based Curated Microbial Diversity Database with Geo-Based Plotting

    PubMed Central

    Bhattacharjee, Kaushik; Joshi, Santa Ram

    2014-01-01

    The majority of the Earth's microbes remain unknown, and their potential utility cannot be exploited until they are discovered and characterized. They provide wide scope for the development of new strains as well as biotechnological uses. The documentation and bioprospection of microorganisms carry enormous significance considering their relevance to human welfare. This calls for an urgent need to develop a database with emphasis on the microbial diversity of the largest untapped reservoirs in the biosphere. The data annotated in the North-East India Microbial database (NEMiD) were obtained by the isolation and characterization of microbes from different parts of the Eastern Himalayan region. The database was constructed as a relational database management system (RDBMS) for data storage in MySQL in the back-end on a Linux server and implemented in an Apache/PHP environment. This database provides a base for understanding the soil microbial diversity pattern in this megabiodiversity hotspot and indicates the distribution patterns of various organisms along with their identification. The NEMiD database is freely available at www.mblabnehu.info/nemid/. PMID:24714636

  15. Adopting a corporate perspective on databases. Improving support for research and decision making.

    PubMed

    Meistrell, M; Schlehuber, C

    1996-03-01

    The Veterans Health Administration (VHA) is at the forefront of designing and managing health care information systems that accommodate the needs of clinicians, researchers, and administrators at all levels. Rather than using one single-site, centralized corporate database, VHA has constructed several large databases with different configurations to meet the needs of users with different perspectives. The largest VHA database is the Decentralized Hospital Computer Program (DHCP), a multisite, distributed data system that uses decoupled hospital databases. The centralization of DHCP policy has promoted data coherence, whereas the decentralization of DHCP management has permitted system development to be done with maximum relevance to the users' local practices. A more recently developed VHA data system, the Event Driven Reporting system (EDR), uses multiple, highly coupled databases to provide workload data at facility, regional, and national levels. The EDR automatically posts a subset of DHCP data to local and national VHA management. The development of the EDR illustrates how adoption of a corporate perspective can offer significant database improvements at reasonable cost and with modest impact on the legacy system.

  16. HTSFinder: Powerful Pipeline of DNA Signature Discovery by Parallel and Distributed Computing

    PubMed Central

    Karimi, Ramin; Hajdu, Andras

    2016-01-01

    Comprehensive efforts toward low-cost sequencing in the past few years have led to the growth of complete genome databases. In parallel with this effort, fast and cost-effective methods and applications have been developed to accelerate sequence analysis. Identification is the very first step of this task. Due to the difficulties, high costs, and computational challenges of alignment-based approaches, an alternative universal identification method is highly desirable. As an alignment-free approach, DNA signatures have provided new opportunities for the rapid identification of species. In this paper, we present an effective pipeline, HTSFinder (high-throughput signature finder), with a corresponding k-mer generator, GkmerG (genome k-mers generator). Using this pipeline, we determine the frequency of k-mers from the available complete genome databases for the detection of extensive DNA signatures in a reasonably short time. Our application can detect both unique and common signatures in arbitrarily selected target and nontarget databases. Hadoop and MapReduce, as parallel and distributed computing tools with commodity hardware, are used in this pipeline. This approach brings the power of high-performance computing onto ordinary desktop personal computers for discovering DNA signatures in large databases such as bacterial genomes. The considerable number of detected unique and common DNA signatures of the target database brings opportunities to improve the identification process not only for polymerase chain reaction and microarray assays but also for more complex scenarios such as metagenomics and next-generation sequencing analysis. PMID:26884678

  17. HTSFinder: Powerful Pipeline of DNA Signature Discovery by Parallel and Distributed Computing.

    PubMed

    Karimi, Ramin; Hajdu, Andras

    2016-01-01

    Comprehensive efforts toward low-cost sequencing in the past few years have led to the growth of complete genome databases. In parallel with this effort, and responding to a strong need, fast and cost-effective methods and applications have been developed to accelerate sequence analysis. Identification is the very first step of this task. Due to the difficulties, high costs, and computational challenges of alignment-based approaches, an alternative universal identification method is highly desirable. As an alignment-free approach, DNA signatures provide new opportunities for the rapid identification of species. In this paper, we present an effective pipeline, HTSFinder (high-throughput signature finder), with a corresponding k-mer generator, GkmerG (genome k-mers generator). Using this pipeline, we determine the frequency of k-mers from the available complete genome databases for the detection of extensive DNA signatures in a reasonably short time. Our application can detect both unique and common signatures in arbitrarily selected target and nontarget databases. Hadoop and MapReduce, as parallel and distributed computing tools running on commodity hardware, are used in this pipeline. This approach brings the power of high-performance computing to ordinary desktop personal computers for discovering DNA signatures in large databases such as bacterial genomes. The considerable number of detected unique and common DNA signatures of the target database brings opportunities to improve the identification process not only for polymerase chain reaction and microarray assays but also for more complex scenarios such as metagenomics and next-generation sequencing analysis.

  18. Evaluation of Online Information Sources on Alien Species in Europe: The Need of Harmonization and Integration

    NASA Astrophysics Data System (ADS)

    Gatto, Francesca; Katsanevakis, Stelios; Vandekerkhove, Jochen; Zenetos, Argyro; Cardoso, Ana Cristina

    2013-06-01

    Europe is severely affected by alien invasions, which impact biodiversity, ecosystem services, the economy, and human health. A large number of national, regional, and global online databases provide information on the distribution, pathways of introduction, and impacts of alien species. The sufficiency and efficiency of the current online information systems in assisting European policy on alien species were investigated through a comparative analysis of occurrence data across 43 online databases. Large differences among databases were found, which are partially explained by variations in their taxonomical, environmental, and geographical scopes, but also by variable efforts at continuous updating and by inconsistencies in the definition of "alien" or "invasive" species. No single database covered all European environments, countries, and taxonomic groups. In many European countries national databases do not exist, which greatly affects the quality of reported information. To be operational and useful to scientists, managers, and policy makers, online information systems need to be regularly updated through continuous monitoring at the country or regional level. As an efficient approach to increasing the accessibility of information, we propose the creation of a network of online interoperable web services through which information in distributed resources can be accessed, aggregated, and then used for reporting and further analysis at different geographical and political scales. Harmonization, standardization, conformity with international standards for nomenclature, and agreement on common definitions of alien and invasive species are among the necessary prerequisites.

  19. Crystallography Open Database (COD): an open-access collection of crystal structures and platform for world-wide collaboration

    PubMed Central

    Gražulis, Saulius; Daškevič, Adriana; Merkys, Andrius; Chateigner, Daniel; Lutterotti, Luca; Quirós, Miguel; Serebryanaya, Nadezhda R.; Moeck, Peter; Downs, Robert T.; Le Bail, Armel

    2012-01-01

    Using an open-access distribution model, the Crystallography Open Database (COD, http://www.crystallography.net) collects all known ‘small molecule / small to medium sized unit cell’ crystal structures and makes them available freely on the Internet. As of today, the COD has aggregated ∼150 000 structures, offering basic search capabilities and the possibility to download the whole database, or parts thereof, using a variety of standard open communication protocols. A newly developed website provides capabilities for all registered users to deposit published and so far unpublished structures as personal communications or pre-publication depositions. Such a setup enables extension of the COD database by many users simultaneously. This increases the possibilities for growth of the COD database, and is the first step towards establishing a world-wide Internet-based collaborative platform dedicated to the collection and curation of structural knowledge. PMID:22070882

  20. Analysis and Exchange of Multimedia Laboratory Data Using the Brain Database

    PubMed Central

    Wertheim, Steven L.

    1990-01-01

    Two principal goals of the Brain Database are: 1) to support laboratory data collection and analysis of multimedia information about the nervous system and 2) to support exchange of these data among researchers and clinicians who may be physically distant. This has been achieved by an implementation of experimental and clinical records within a relational database. An Image Series Editor has been created that provides a graphical interface to these data for the purposes of annotation, quantification and other analyses. Cooperating laboratories each maintain their own copies of the Brain Database to which they may add private data. Although the data in a given experimental or patient record will be distributed among many tables and external image files, the user can treat each record as a unit that can be extracted from the local database and sent to a distant colleague.

  1. Information-seeking behavior and the use of online resources: a snapshot of current health sciences faculty.

    PubMed

    De Groote, Sandra L; Shultz, Mary; Blecic, Deborah D

    2014-07-01

    The research assesses the information-seeking behaviors of health sciences faculty, including their use of online databases, journals, and social media. A survey was designed and distributed via email to 754 health sciences faculty at a large urban research university with 6 health sciences colleges. Twenty-six percent (198) of the faculty responded. MEDLINE was the primary database utilized, with 78.5% of respondents indicating that they use the database at least once a week. Compared to MEDLINE, Google was utilized more often on a daily basis. Other databases showed much lower usage. Low use of online databases other than MEDLINE, of link-out tools to online journals, and of online social media and collaboration tools demonstrates a need for meaningful promotion of online resources and informatics literacy instruction for faculty. Library resources are plentiful and perhaps somewhat overwhelming. Librarians need to help faculty discover and utilize the resources and tools that libraries have to offer.

  2. Federated or cached searches: Providing expected performance from multiple invasive species databases

    NASA Astrophysics Data System (ADS)

    Graham, Jim; Jarnevich, Catherine S.; Simpson, Annie; Newman, Gregory J.; Stohlgren, Thomas J.

    2011-06-01

    Invasive species are a universal global problem, but the information to identify them, manage them, and prevent invasions is stored around the globe in a variety of formats. The Global Invasive Species Information Network is a consortium of organizations working toward providing seamless access to these disparate databases via the Internet. A distributed network of databases can be created using the Internet and a standard web service protocol. There are two options to provide this integration. First, federated searches have been proposed to allow users to search "deep" web documents such as databases for invasive species. A second method is to create a cache of data from the databases for searching. We compare these two methods, and show that federated searches will not provide the performance and flexibility required by users; a central cache of the data is required to improve performance.
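
    The performance argument can be illustrated with a toy sketch: a federated search must fan the query out over the network and wait for the slowest source to answer, while a cached search reads from a locally harvested copy. The code below is only an illustration of that contrast; the database names, latencies, and cache contents are invented, not GISIN components.

        import asyncio
        import random

        async def query_remote(db_name, term):
            """Simulate one remote database query with variable latency."""
            await asyncio.sleep(random.uniform(0.1, 2.0))  # hypothetical delay
            return [f"{db_name}:{term}"]

        async def federated_search(term, databases):
            """Fan the query out to every source and wait for all replies."""
            results = await asyncio.gather(
                *(query_remote(db, term) for db in databases))
            return [hit for hits in results for hit in hits]

        def cached_search(term, cache):
            """Answer from a pre-harvested local cache: no network round trips."""
            return cache.get(term, [])

        databases = ["db_a", "db_b", "db_c"]
        cache = {"kudzu": ["db_a:kudzu", "db_c:kudzu"]}  # built by harvesting
        print(asyncio.run(federated_search("kudzu", databases)))
        print(cached_search("kudzu", cache))

    The fan-out's latency is governed by the slowest simulated source, whereas the cache answers immediately; the trade-off is that the cache must be refreshed by periodic harvesting.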

  3. Insect barcode information system.

    PubMed

    Pratheepa, Maria; Jalali, Sushil Kumar; Arokiaraj, Robinson Silvester; Venkatesan, Thiruvengadam; Nagesh, Mandadi; Panda, Madhusmita; Pattar, Sharath

    2014-01-01

    The Insect Barcode Information System, called Insect Barcode Informática (IBIn), is an online database resource developed by the National Bureau of Agriculturally Important Insects, Bangalore. This database provides acquisition, storage, analysis and publication of DNA barcode records of agriculturally important insects, for researchers in India and other countries. It bridges a gap in bioinformatics by integrating molecular, morphological and distribution details of agriculturally important insects. IBIn was developed in PHP/MySQL using the relational database management concept. The database follows a client-server architecture, in which many clients can access data simultaneously. IBIn is freely available online and is user-friendly. IBIn allows registered users to input new information and to search and view information related to DNA barcodes of agriculturally important insects. This paper describes the current status of insect barcoding in India and gives a brief introduction to the IBIn database. http://www.nabg-nbaii.res.in/barcode.

  4. PhamDB: a web-based application for building Phamerator databases.

    PubMed

    Lamine, James G; DeJong, Randall J; Nelesen, Serita M

    2016-07-01

    PhamDB is a web application which creates databases of bacteriophage genes, grouped by gene similarity. It is backwards compatible with the existing Phamerator desktop software while providing an improved database creation workflow. Key features include a graphical user interface, validation of uploaded GenBank files, and the ability to import phages from existing databases, modify existing databases and queue multiple jobs. Source code and installation instructions for Linux, Windows and Mac OSX are freely available at https://github.com/jglamine/phage. PhamDB is also distributed as a Docker image which can be managed via Kitematic; this image contains the application and all third-party software dependencies as a pre-configured system, and is freely available via the installation instructions provided. Contact: snelesen@calvin.edu. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  5. Federated or cached searches: providing expected performance from multiple invasive species databases

    USGS Publications Warehouse

    Graham, Jim; Jarnevich, Catherine S.; Simpson, Annie; Newman, Gregory J.; Stohlgren, Thomas J.

    2011-01-01

    Invasive species are a universal global problem, but the information to identify them, manage them, and prevent invasions is stored around the globe in a variety of formats. The Global Invasive Species Information Network is a consortium of organizations working toward providing seamless access to these disparate databases via the Internet. A distributed network of databases can be created using the Internet and a standard web service protocol. There are two options to provide this integration. First, federated searches have been proposed to allow users to search “deep” web documents such as databases for invasive species. A second method is to create a cache of data from the databases for searching. We compare these two methods, and show that federated searches will not provide the performance and flexibility required by users; a central cache of the data is required to improve performance.

  6. The Starlite Project

    DTIC Science & Technology

    1989-10-01

    Only reference-list fragments of this report survive in the index. The recoverable citations are to work by S. H. Son and colleagues at the University of Virginia (1989), including "Performance Evaluation of Multiversion Database Systems" (Son and Haghighi) and "A Software Prototyping Environment and Its Use in Developing a Multiversion Distributed Database System" (Son and Kim), as well as a paper in Operating Systems for Mission-Critical Computing (Sept. 1989).

  7. Recent improvements in the NASA technical report server

    NASA Technical Reports Server (NTRS)

    Maa, Ming-Hokng; Nelson, Michael L.

    1995-01-01

    The NASA Technical Report Server (NTRS), a World Wide Web (WWW) report distribution service, has been modified to provide: parallel database queries, which decrease user access time by an average factor of 2.3; access from clients behind firewalls and/or proxies which truncate excessively long Uniform Resource Locators (URLs); access to non-Wide Area Information Server (WAIS) databases; and compatibility with the Z39.50 protocol.

  8. US Army Research Laboratory Visualization Framework Design Document

    DTIC Science & Technology

    2016-01-01

    This section highlights each module in the ARL-VF, and subsequent sections provide details on how the modules interact. The indexed text preserves only figure residue: Fig. 2 lists the ARL-VF modules (ConfigAgent, MultiTouch, VizDatabase, VizController, TUIO, VizDaemon, TestPoint), and Fig. 4 is a sequence diagram showing how a message sent between modules is received by the destination. (Approved for public release; distribution unlimited.)

  9. Verification of road databases using multiple road models

    NASA Astrophysics Data System (ADS)

    Ziems, Marcel; Rottensteiner, Franz; Heipke, Christian

    2017-08-01

    In this paper a new approach for automatic road database verification based on remote sensing images is presented. In contrast to existing methods, the applicability of the new approach is not restricted to specific road types, context areas or geographic regions. This is achieved by combining several state-of-the-art road detection and road verification approaches that work well under different circumstances. Each one serves as an independent module representing a unique road model and a specific processing strategy. All modules provide independent solutions for the verification problem of each road object stored in the database in the form of two probability distributions: the first for the state of a database object (correct or incorrect), and the second for the state of the underlying road model (applicable or not applicable). In accordance with Dempster-Shafer theory, both distributions are mapped to a new state space comprising the classes correct, incorrect and unknown. Statistical reasoning is applied to obtain the optimal state of a road object. A comparison with state-of-the-art road detection approaches using benchmark datasets shows that in general the proposed approach provides results with greater completeness. Additional experiments reveal that a highly reliable semi-automatic approach for road database verification can be designed on the basis of the proposed method.
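
    A minimal sketch of this kind of evidence fusion is given below. The mapping from the two per-module probabilities to belief masses, and the module probabilities themselves, are invented for illustration; only the combination step follows the standard Dempster rule over the frame {correct, incorrect}, with mass on the full frame playing the role of "unknown".

        def module_masses(p_correct, p_applicable):
            """Assumed mass assignment: belief is discounted by model
            applicability; the remainder goes to 'unknown'."""
            return {
                "correct": p_applicable * p_correct,
                "incorrect": p_applicable * (1.0 - p_correct),
                "unknown": 1.0 - p_applicable,
            }

        def combine(m1, m2):
            """Dempster's rule over the frame {correct, incorrect}."""
            conflict = (m1["correct"] * m2["incorrect"] +
                        m1["incorrect"] * m2["correct"])
            k = 1.0 - conflict  # normalization constant
            return {
                "correct": (m1["correct"] * m2["correct"] +
                            m1["correct"] * m2["unknown"] +
                            m1["unknown"] * m2["correct"]) / k,
                "incorrect": (m1["incorrect"] * m2["incorrect"] +
                              m1["incorrect"] * m2["unknown"] +
                              m1["unknown"] * m2["incorrect"]) / k,
                "unknown": m1["unknown"] * m2["unknown"] / k,
            }

        m_a = module_masses(p_correct=0.9, p_applicable=0.8)  # confident module
        m_b = module_masses(p_correct=0.6, p_applicable=0.3)  # barely applicable
        print(combine(m_a, m_b))

    A module whose road model barely applies contributes mostly "unknown" mass, so it cannot overrule a confident, applicable module; this is the behavior the paper exploits when combining heterogeneous detectors.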

  10. [Analysis on composition and medication regularities of prescriptions treating hypochondriac pain based on traditional Chinese medicine inheritance support system inheritance support platform].

    PubMed

    Zhao, Yan-qing; Teng, Jing

    2015-03-01

    To analyze the composition and medication regularities of prescriptions treating hypochondriac pain in the Chinese journal full-text database (CNKI) based on the traditional Chinese medicine inheritance support system, in order to provide a reference for further research and development of new traditional Chinese medicines treating hypochondriac pain. The traditional Chinese medicine inheritance support platform software V2.0 was used to build a prescription database of Chinese medicines treating hypochondriac pain. The software's integrated data mining methods were used to analyze the distribution of prescriptions in the database according to "four odors", "five flavors" and "meridians", and to produce frequency statistics, syndrome distributions, prescription regularities and new prescription analyses. An analysis was made of 192 prescriptions treating hypochondriac pain to determine the frequencies of medicines in prescriptions and commonly used medicine pairs and combinations, and to summarize 15 new prescriptions. This study indicated that the prescriptions treating hypochondriac pain in the Chinese journal full-text database are mostly those for soothing liver-qi stagnation, promoting qi and activating blood, clearing heat and promoting dampness, and invigorating the spleen and removing phlegm, with a cold property and bitter taste, and reflect the principle of "distinguishing deficiency and excess and relieving pain by smoothening meridians" in treating hypochondriac pain.

  11. The IUGS/IAGC Task Group on Global Geochemical Baselines

    USGS Publications Warehouse

    Smith, David B.; Wang, Xueqiu; Reeder, Shaun; Demetriades, Alecos

    2012-01-01

    The Task Group on Global Geochemical Baselines, operating under the auspices of both the International Union of Geological Sciences (IUGS) and the International Association of Geochemistry (IAGC), has the long-term goal of establishing a global geochemical database to document the concentration and distribution of chemical elements in the Earth’s surface or near-surface environment. The database and accompanying element distribution maps represent a geochemical baseline against which future human-induced or natural changes to the chemistry of the land surface may be recognized and quantified. In order to accomplish this long-term goal, the activities of the Task Group include: (1) developing partnerships with countries conducting broad-scale geochemical mapping studies; (2) providing consultation and training in the form of workshops and short courses; (3) organizing periodic international symposia to foster communication among the geochemical mapping community; (4) developing criteria for certifying those projects whose data are acceptable in a global geochemical database; (5) acting as a repository for data collected by those projects meeting the criteria for standardization; (6) preparing complete metadata for the certified projects; and (7) preparing, ultimately, a global geochemical database. This paper summarizes the history and accomplishments of the Task Group since its first predecessor project was established in 1988.

  12. Requirements, Verification, and Compliance (RVC) Database Tool

    NASA Technical Reports Server (NTRS)

    Rainwater, Neil E., II; McDuffee, Patrick B.; Thomas, L. Dale

    2001-01-01

    This paper describes the development, design, and implementation of the Requirements, Verification, and Compliance (RVC) database used on the International Space Welding Experiment (ISWE) project managed at Marshall Space Flight Center. The RVC is a systems engineer's tool for automating and managing the following information: requirements; requirements traceability; verification requirements; verification planning; verification success criteria; and compliance status. This information normally contained within documents (e.g. specifications, plans) is contained in an electronic database that allows the project team members to access, query, and status the requirements, verification, and compliance information from their individual desktop computers. Using commercial-off-the-shelf (COTS) database software that contains networking capabilities, the RVC was developed not only with cost savings in mind but primarily for the purpose of providing a more efficient and effective automated method of maintaining and distributing the systems engineering information. In addition, the RVC approach provides the systems engineer the capability to develop and tailor various reports containing the requirements, verification, and compliance information that meets the needs of the project team members. The automated approach of the RVC for capturing and distributing the information improves the productivity of the systems engineer by allowing that person to concentrate more on the job of developing good requirements and verification programs and not on the effort of being a "document developer".

  13. Atomic and Molecular Databases, VAMDC (Virtual Atomic and Molecular Data Centre)

    NASA Astrophysics Data System (ADS)

    Dubernet, Marie-Lise; Zwölf, Carlo Maria; Moreau, Nicolas; Awa Ba, Yaya; VAMDC Consortium

    2015-08-01

    The "Virtual Atomic and Molecular Data Centre Consortium",(VAMDC Consortium, http://www.vamdc.eu) is a Consortium bound by an Memorandum of Understanding aiming at ensuring the sustainability of the VAMDC e-infrastructure. The current VAMDC e-infrastructure inter-connects about 30 atomic and molecular databases with the number of connected databases increasing every year: some databases are well-known databases such as CDMS, JPL, HITRAN, VALD,.., other databases have been created since the start of VAMDC. About 90% of our databases are used for astrophysical applications. The data can be queried, retrieved, visualized in a single format from a general portal (http://portal.vamdc.eu) and VAMDC is also developing standalone tools in order to retrieve and handle the data. VAMDC provides software and support in order to include databases within the VAMDC e-infrastructure. One current feature of VAMDC is the constrained environnement of description of data that ensures a higher quality for distribution of data; a future feature is the link of VAMDC with evaluation/validation groups. The talk will present the VAMDC Consortium and the VAMDC e infrastructure with its underlying technology, its services, its science use cases and its etension towards other communities than the academic research community.

  14. Technical Aspects of Interfacing MUMPS to an External SQL Relational Database Management System

    PubMed Central

    Kuzmak, Peter M.; Walters, Richard F.; Penrod, Gail

    1988-01-01

    This paper describes an interface connecting InterSystems MUMPS (M/VX) to an external relational DBMS, the SYBASE Database Management System. The interface enables MUMPS to operate in a relational environment and gives the MUMPS language full access to a complete set of SQL commands. MUMPS generates SQL statements as ASCII text and sends them to the RDBMS. The RDBMS executes the statements and returns ASCII results to MUMPS. The interface suggests that the language features of MUMPS make it an attractive tool for use in the relational database environment. The approach described in this paper separates MUMPS from the relational database. Positioning the relational database outside of MUMPS promotes data sharing and permits a number of different options to be used for working with the data. Other languages like C, FORTRAN, and COBOL can access the RDBMS database. Advanced tools provided by the relational database vendor can also be used. SYBASE is an advanced high-performance transaction-oriented relational database management system for the VAX/VMS and UNIX operating systems. SYBASE is designed using a distributed open-systems architecture, and is relatively easy to interface with MUMPS.

  15. Development of Web-based Distributed Cooperative Development Environment of Sign-Language Animation System and its Evaluation

    NASA Astrophysics Data System (ADS)

    Yuizono, Takaya; Hara, Kousuke; Nakayama, Shigeru

    A web-based distributed cooperative development environment for a sign-language animation system has been developed. We have extended the previous animation system, which was constructed as a three-tiered system consisting of a sign-language animation interface layer, a sign-language data processing layer, and a sign-language animation database. Two components, a web client using a VRML plug-in and a web servlet, have been added to the previous system. The system supports a humanoid-model avatar for interoperability and can use the stored sign-language animation data shared on the database. The evaluation of the system noted that the inverse kinematics function of the web client improves the making of sign-language animations.

  16. Metacatalog of Planetary Surface Features for Multicriteria Evaluation of Surface Evolution: the Integrated Planetary Feature Database

    NASA Astrophysics Data System (ADS)

    Hargitai, Henrik

    2016-10-01

    We have created a metacatalog, or catalog of catalogs, of surface features of Mars that also includes the actual data in the catalogs listed. The goal is to make mesoscale surface feature databases available in one place, in a GIS-ready format. The databases can be directly imported into ArcGIS or other GIS platforms, like Google Mars. Some of the catalogs in our database are also ingested into the JMARS platform. All catalogs have been previously published in a peer-reviewed journal, but they may contain updates of the published catalogs. Many of the catalogs are "integrated", i.e. they merge databases or information from various papers on the same topic, including references for each individual feature listed. Where available, we have included shapefiles with polygon or linear features; however, most of the catalogs only contain point data for feature center points along with morphological data. One of the unexpected results of the planetary feature metacatalog is that some features have been described by several papers using different, i.e., conflicting designations. This shows the need for the development of an identification system suitable for mesoscale (100s of m to km sized) features that tracks papers and thus prevents multiple naming of the same feature. The feature database can be used for multicriteria analysis of a terrain, and thus enables easy distribution-pattern analysis and the correlation of the distributions of different landforms and features on Mars. Such a catalog makes the scientific evaluation of potential landing sites easier and more effective during the selection process and also supports automated landing site selection. The catalog is accessible at https://planetarydatabase.wordpress.com/.

  17. Integrating Distributed Homogeneous and Heterogeneous Databases: Prototypes. Volume 3.

    DTIC Science & Technology

    1987-12-01

    The indexed text of this report documentation page is heavily garbled OCR. The recoverable information identifies the Knowledge-Based Integrated Information Systems Engineering (KBIISE) prototypes, the Transportation Systems Center (Broadway, MA 02142), December 1987, 218 pages, unclassified, with unlimited distribution.

  18. The Protein Disease Database of human body fluids: II. Computer methods and data issues.

    PubMed

    Lemkin, P F; Orr, G A; Goldstein, M P; Creed, G J; Myrick, J E; Merril, C R

    1995-01-01

    The Protein Disease Database (PDD) is a relational database of proteins and diseases. With this database it is possible to screen for quantitative protein abnormalities associated with disease states. These quantitative relationships use data drawn from the peer-reviewed biomedical literature. Assays may also include those observed in high-resolution electrophoretic gels that offer the potential to quantitate many proteins in a single test as well as data gathered by enzymatic or immunologic assays. We are using the Internet World Wide Web (WWW) and the Web browser paradigm as an access method for wide distribution and querying of the Protein Disease Database. The WWW hypertext transfer protocol and its Common Gateway Interface make it possible to build powerful graphical user interfaces that can support easy-to-use data retrieval using query specification forms or images. The details of these interactions are totally transparent to the users of these forms. Using a client-server SQL relational database, user query access, initial data entry and database maintenance are all performed over the Internet with a Web browser. We discuss the underlying design issues, mapping mechanisms and assumptions that we used in constructing the system, data entry, access to the database server, security, and synthesis of derived two-dimensional gel image maps and hypertext documents resulting from SQL database searches.

  19. Levelling and merging of two discrete national-scale geochemical databases: A case study showing the surficial expression of metalliferous black shales

    USGS Publications Warehouse

    Smith, Steven M.; Neilson, Ryan T.; Giles, Stuart A.

    2015-01-01

    Government-sponsored, national-scale, soil and sediment geochemical databases are used to estimate regional and local background concentrations for environmental issues, identify possible anthropogenic contamination, estimate mineral endowment, explore for new mineral deposits, evaluate nutrient levels for agriculture, and establish concentration relationships with human or animal health. Because of these different uses, it is difficult for any single database to accommodate all the needs of each client. Smith et al. (2013, p. 168) reviewed six national-scale soil and sediment geochemical databases for the United States (U.S.) and, for each, evaluated “its appropriateness as a national-scale geochemical database and its usefulness for national-scale geochemical mapping.” Each of the evaluated databases has strengths and weaknesses that were listed in that review. Two of these U.S. national-scale geochemical databases are similar in their sample media and collection protocols but have different strengths—primarily sampling density and analytical consistency. This project was implemented to determine whether those databases could be merged to produce a combined dataset that could be used for mineral resource assessments. The utility of the merged database was tested to see whether mapped distributions could identify metalliferous black shales at a national scale.

  20. SORTEZ: a relational translator for NCBI's ASN.1 database.

    PubMed

    Hart, K W; Searls, D B; Overton, G C

    1994-07-01

    The National Center for Biotechnology Information (NCBI) has created a database collection that includes several protein and nucleic acid sequence databases, a biosequence-specific subset of MEDLINE, as well as value-added information such as links between similar sequences. Information in the NCBI database is modeled in Abstract Syntax Notation 1 (ASN.1), an Open Systems Interconnection protocol designed for the purpose of exchanging structured data between software applications rather than as a data model for database systems. While the NCBI database is distributed with an easy-to-use information retrieval system, ENTREZ, the ASN.1 data model currently lacks an ad hoc query language for general-purpose data access. For that reason, we have developed a software package, SORTEZ, that transforms the ASN.1 database (or other databases with nested data structures) to a relational data model and subsequently to a relational database management system (Sybase), where information can be accessed through the relational query language SQL. Because the need to transform data from one data model and schema to another arises naturally in several important contexts, including efficient execution of specific applications, access to multiple databases, and adaptation to database evolution, this work also serves as a practical study of the issues involved in the various stages of database transformation. We show that transformation from the ASN.1 data model to a relational data model can be largely automated, but that schema transformation and data conversion require considerable domain expertise and would greatly benefit from additional support tools.
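
    The core of such a transformation is turning nested structures into flat tables linked by keys. The sketch below illustrates the idea on a toy nested record; the record shape, table names, and key scheme are invented for the example and are not SORTEZ's actual mapping.

        def flatten(name, record, parent_id, tables, next_id):
            """Store the scalar fields of `record` as one row of table `name`;
            list-valued fields recurse into child tables with a foreign key."""
            row = {"id": next_id[0], "parent_id": parent_id}
            next_id[0] += 1
            for key, value in record.items():
                if isinstance(value, list):
                    for child in value:
                        flatten(key, child, row["id"], tables, next_id)
                else:
                    row[key] = value
            tables.setdefault(name, []).append(row)
            return tables

        # A toy ASN.1-like nested record: a sequence entry with references.
        entry = {
            "accession": "X12345",
            "title": "example sequence",
            "references": [
                {"medline_uid": 111, "year": 1992},
                {"medline_uid": 222, "year": 1993},
            ],
        }
        for table, rows in flatten("entry", entry, None, {}, [1]).items():
            print(table, rows)

    Each nesting level becomes its own table with a parent_id foreign key, which is the part the paper reports can be automated; deciding good table names, types, and indexes is where the domain expertise comes in.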

  1. Supplier's Status for Critical Solid Propellants, Explosive, and Pyrotechnic Ingredients

    NASA Technical Reports Server (NTRS)

    Sims, B. L.; Painter, C. R.; Nauflett, G. W.; Cramer, R. J.; Mulder, E. J.

    2000-01-01

    In the early 1970's a program was initiated at the Naval Surface Warfare Center/Indian Head Division (NSWC/IHDIV) to address the well-known problems associated with availability and suppliers of critical ingredients. These critical ingredients are necessary for preparation of solid propellants and explosives manufactured by the Navy. The objective of the program was to identify primary and secondary (or back-up) vendor information for these critical ingredients, and to develop suitable alternative materials if an ingredient is unavailable. In 1992 NSWC/IHDIV funded the Chemical Propulsion Information Agency (CPIA) under a Technical Area Task (TAT) to expedite the task of creating a database listing critical ingredients used to manufacture Navy propellants and explosives based on known formulation quantities. Under this task CPIA provided employees who were 100 percent dedicated to the task of obtaining critical ingredient supplier information, selecting the software, and designing the interface between the computer program and the database users. TAT objectives included creating the Explosive Ingredients Source Database (EISD) for Propellant, Explosive and Pyrotechnic (PEP) critical elements. The goal was to create a readily accessible database, to provide users a quick-view summary of critical ingredient suppliers' information, and to create a centralized archive that CPIA would update and distribute. EISD funding ended in 1996. At that time, the database entries included 53 formulations and 108 critical ingredients used to manufacture Navy propellants and explosives. CPIA turned the database tasking back over to NSWC/IHDIV to maintain and distribute at their discretion. Due to significant interest in the status of propellant/explosives critical ingredient suppliers, the Propellant Development and Characterization Subcommittee (PDCS) approached the JANNAF Executive Committee (EC) for authorization to continue the critical ingredient database work. In 1999, JANNAF EC approved the PDCS panel task. This paper is designed to emphasize the necessity of maintaining a JANNAF community supported database, which monitors the status of PEP critical ingredient suppliers. The final product of this task is a user-friendly, searchable database that provides a quick-view summary of critical ingredient suppliers' information. This database must be designed to serve the needs of JANNAF and the propellant and energetic commercial manufacturing community as well. This paper provides a summary of the type of information archived for each critical ingredient.

  2. Application of SQL database to the control system of MOIRCS

    NASA Astrophysics Data System (ADS)

    Yoshikawa, Tomohiro; Omata, Koji; Konishi, Masahiro; Ichikawa, Takashi; Suzuki, Ryuji; Tokoku, Chihiro; Uchimoto, Yuka Katsuno; Nishimura, Tetsuo

    2006-06-01

    MOIRCS (Multi-Object Infrared Camera and Spectrograph) is a new instrument for the Subaru telescope. In order to perform near-infrared imaging and spectroscopy with cold slit masks, MOIRCS contains many device components, which are distributed over an Ethernet LAN. Two PCs wired to the focal plane array electronics operate the two HAWAII2 detectors, and two other PCs are used for integrated control and quick data reduction. Though most of the devices (e.g., the filter and grism turrets and the slit exchange mechanism for spectroscopy) are controlled via RS232C interfaces, they are accessible over TCP/IP connections using TCP/IP-to-RS232C converters. Moreover, other devices are also connected to the Ethernet LAN. This network-distributed structure provides flexibility of hardware configuration. We have constructed an integrated control system for such network-distributed hardware, named T-LECS (Tohoku University - Layered Electronic Control System). T-LECS also has a network-distributed software design, applying TCP/IP socket communication to interprocess communication. In order to help the communication between the device interfaces and the user interfaces, we defined three layers in T-LECS: an external layer for user interface applications, an internal layer for device interface applications, and a communication layer connecting the two layers above. In the communication layer, we store the system's data in an SQL database server: status data, FITS header data, and also metadata such as device configuration data and FITS configuration data. We present our software system design and the database schema to manage observations of MOIRCS with Subaru.
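
    The device-access pattern described above can be sketched as follows: a command string travels over a TCP socket to a TCP/IP-to-RS232C converter, and the reply is recorded in an SQL table for other layers to query. The hostname, port, command syntax, and table layout are all hypothetical, not T-LECS internals, and SQLite stands in for the SQL server.

        import socket
        import sqlite3  # stand-in for the SQL server used in the real system

        def send_device_command(host: str, port: int, command: str) -> str:
            """Send one RS232C command through the converter, read the reply."""
            with socket.create_connection((host, port), timeout=5.0) as sock:
                sock.sendall((command + "\r\n").encode("ascii"))
                return sock.recv(1024).decode("ascii").strip()

        def record_status(db: sqlite3.Connection, device: str, status: str):
            """Store the reported status so other layers can query it via SQL."""
            db.execute(
                "CREATE TABLE IF NOT EXISTS status (device TEXT, value TEXT)")
            db.execute("INSERT INTO status VALUES (?, ?)", (device, status))
            db.commit()

        # Example (requires a real converter listening at the given address):
        # reply = send_device_command("filter-turret.local", 4001, "GET POS")
        # record_status(sqlite3.connect("tlecs.db"), "filter_turret", reply)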

  3. VIEWCACHE: An incremental pointer-base access method for distributed databases. Part 1: The universal index system design document. Part 2: The universal index system low-level design document. Part 3: User's guide. Part 4: Reference manual. Part 5: UIMS test suite

    NASA Technical Reports Server (NTRS)

    Kelley, Steve; Roussopoulos, Nick; Sellis, Timos

    1992-01-01

    The goal of the Universal Index System (UIS) is to provide an easy-to-use and reliable interface to many different kinds of database systems. The impetus for this system was to simplify database index management for users, thus encouraging the use of indexes. As the idea grew into an actual system design, the concept of increasing database performance by facilitating the use of time-saving techniques at the user level became a theme for the project. This final report describes the design and implementation of UIS and its language interfaces. It also includes the User's Guide and the Reference Manual.

  4. A global building inventory for earthquake loss estimation and risk management

    USGS Publications Warehouse

    Jaiswal, K.; Wald, D.; Porter, K.

    2010-01-01

    We develop a global database of building inventories using a taxonomy of global building types for use in near-real-time post-earthquake loss estimation and pre-earthquake risk analysis, for the U.S. Geological Survey's Prompt Assessment of Global Earthquakes for Response (PAGER) program. The database is available for public use, subject to peer review, scrutiny, and open enhancement. On a country-by-country level, it contains estimates of the distribution of building types categorized by material, lateral force resisting system, and occupancy type (residential or nonresidential, urban or rural). The database draws on and harmonizes numerous sources: (1) UN statistics, (2) UN Habitat's demographic and health survey (DHS) database, (3) national housing censuses, (4) the World Housing Encyclopedia and (5) other literature. © 2010, Earthquake Engineering Research Institute.

  5. Normative database of donor keratographic readings in an eye-bank setting.

    PubMed

    Lewis, Jennifer R; Bogucki, Jennifer M; Mahmoud, Ashraf M; Lembach, Richard G; Roberts, Cynthia J

    2010-04-01

    To generate a normative donor topographic database from rasterstereography images of whole globes acquired in an eye-bank setting with minimal manipulation or handling. Eye-bank laboratory. In a retrospective study, rasterstereography topographic images of donor eyes received by the Central Ohio Lions Eye Bank between 1997 and 1999, prospectively collected in duplicate, were analyzed. Best-fit sphere (BFS) and simulated keratometry (K) values were extracted. These values were recalculated after application of custom software to correct any tilt of the mapped surfaces relative to the image plane. The variances of the mean values between right eyes and left eyes, between consecutive scans, and after untilting were analyzed by repeated-measures analysis of variance and t tests (significance at P<.05; normality assessed with the Kolmogorov-Smirnov test). There was no difference between right and left eyes or consecutive scans (P>.05). The mean values changed when the images were tilt-corrected (P<.05). The right eye BFS, Kflat, and Ksteep values of 42.03 diopters (D) +/- 1.88 (SD), 42.21 +/- 2.10 D, and 43.82 +/- 2.00 D, respectively, increased to 42.52 +/- 1.73 D, 43.05 +/- 1.99 D, and 44.57 +/- 2.02 D, respectively, after tilt correction. Keratometric parameter frequency distributions from the donor database of tilt-corrected data were normal in distribution and comparable to parameters reported for normal eyes in a living population. These findings show the feasibility and reliability of routine donor-eye topography by rasterstereography. No author has a financial or proprietary interest in any material or method mentioned. Additional disclosures are found in the footnotes. Copyright (c) 2010 ASCRS and ESCRS. Published by Elsevier Inc. All rights reserved.

  6. The Advanced Composition Explorer Shock Database and Application to Particle Acceleration Theory

    NASA Technical Reports Server (NTRS)

    Parker, L. Neergaard; Zank, G. P.

    2015-01-01

    The theory of particle acceleration via diffusive shock acceleration (DSA) has been studied in depth by Gosling et al. (1981), van Nes et al. (1984), Mason (2000), Desai et al. (2003), and Zank et al. (2006), among many others. Recently, Parker and Zank (2012, 2014) and Parker et al. (2014), using the Advanced Composition Explorer (ACE) shock database at 1 AU, explored two questions: does the upstream distribution alone have enough particles to account for the accelerated downstream distribution, and can the slope of the downstream accelerated spectrum be explained using DSA? As was shown in this research, diffusive shock acceleration can account for a large population of the shocks. However, Parker and Zank (2012, 2014) and Parker et al. (2014) used a subset of the larger ACE database. Recently, work has successfully been completed that allows the entire ACE database to be considered in a larger statistical analysis. We explain DSA as it applies to single and multiple shocks and the shock criteria used in this statistical analysis. We calculate the expected injection energy via diffusive shock acceleration given upstream parameters defined from ACE Solar Wind Electron, Proton, and Alpha Monitor (SWEPAM) data to construct the theoretical upstream distribution. We show the comparison of shock strength derived from diffusive shock acceleration theory to observations in the 50 keV to 5 MeV range from an instrument on ACE. Parameters such as shock velocity, shock obliquity, particle number, and time between shocks are considered. This study is further divided into single and multiple shock categories, with an additional emphasis on forward-forward multiple shock pairs. Finally, with regard to forward-forward shock pairs, results comparing injection energies of the first shock, the second shock, and the second shock with the previous energetic population will be given.
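
    For reference, the "slope" question above is usually posed against the standard test-particle DSA prediction, a textbook result not reproduced from the paper itself: the downstream phase-space distribution is a power law whose index is set solely by the shock compression ratio.

        \[
          f(p) \propto p^{-q},
          \qquad
          q = \frac{3r}{r-1},
          \qquad
          r = \frac{u_1}{u_2} = \frac{(\gamma + 1) M_1^2}{(\gamma - 1) M_1^2 + 2},
        \]

    In the strong-shock limit (gamma = 5/3, M1 large) the compression ratio r tends to 4 and the index q to 4, so measured downstream slopes can be compared directly against compression ratios inferred from the upstream SWEPAM parameters.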

  7. The Advanced Composition Explorer Shock Database and Application to Particle Acceleration Theory

    NASA Technical Reports Server (NTRS)

    Parker, L. Neergaard; Zank, G. P.

    2015-01-01

    The theory of particle acceleration via diffusive shock acceleration (DSA) has been studied in depth by Gosling et al. (1981), van Nes et al. (1984), Mason (2000), Desai et al. (2003), and Zank et al. (2006), among many others. Recently, Parker and Zank (2012, 2014) and Parker et al. (2014), using the Advanced Composition Explorer (ACE) shock database at 1 AU, explored two questions: does the upstream distribution alone have enough particles to account for the accelerated downstream distribution, and can the slope of the downstream accelerated spectrum be explained using DSA? As was shown in this research, diffusive shock acceleration can account for a large population of the shocks. However, Parker and Zank (2012, 2014) and Parker et al. (2014) used a subset of the larger ACE database. Recently, work has successfully been completed that allows the entire ACE database to be considered in a larger statistical analysis. We explain DSA as it applies to single and multiple shocks and the shock criteria used in this statistical analysis. We calculate the expected injection energy via diffusive shock acceleration given upstream parameters defined from ACE Solar Wind Electron, Proton, and Alpha Monitor (SWEPAM) data to construct the theoretical upstream distribution. We show the comparison of shock strength derived from diffusive shock acceleration theory to observations in the 50 keV to 5 MeV range from an instrument on ACE. Parameters such as shock velocity, shock obliquity, particle number, and time between shocks are considered. This study is further divided into single and multiple shock categories, with an additional emphasis on forward-forward multiple shock pairs. Finally, with regard to forward-forward shock pairs, results comparing injection energies of the first shock, the second shock, and the second shock with the previous energetic population will be given.

  8. Spatial configuration and distribution of forest patches in Champaign County, Illinois: 1940 to 1993

    Treesearch

    J. Danilo Chinea

    1997-01-01

    Spatial configuration and distribution of landscape elements have implications for the dynamics of forest ecosystems, and, therefore, for the management of these resources. The forest cover of Champaign County, in east-central Illinois, was mapped from 1940 and 1993 aerial photography and entered in a geographical information system database. In 1940, 208 forest...

  9. A GIS approach to identifying the distribution and structure of coast redwood across its range

    Treesearch

    Peter Cowan; Emily E. Burns; Richard Campbell

    2017-01-01

    To better understand the distribution and current structure of coast redwood (Sequoia sempervirens (D.Don) Endl.) forests throughout the range, and how they vary by land ownership, the Save the Redwoods League has conducted a redwood-specific analysis of a high-resolution forest structure database encompassing the entire natural coast redwood range...

  10. Spatiotemporal distribution patterns of forest fires in northern Mexico

    Treesearch

    Gustavo Pérez-Verdin; M. A. Márquez-Linares; A. Cortes-Ortiz; M. Salmerón-Macias

    2013-01-01

    Using the 2000-2011 CONAFOR databases, a spatiotemporal analysis of the occurrence of forest fires was conducted for Durango, one of the most affected states in Mexico. Moran's index was used to determine the spatial distribution pattern, and an analysis of seasonal and temporal autocorrelation of the collected data was completed. The geographically weighted...
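
    The spatial statistic mentioned here is the standard global Moran's index (the textbook definition, assumed to match the paper's usage): for fire counts x_i observed in n spatial units with spatial weights w_ij,

        \[
          I = \frac{n}{\sum_{i}\sum_{j} w_{ij}}
              \cdot
              \frac{\sum_{i}\sum_{j} w_{ij}\,(x_i - \bar{x})(x_j - \bar{x})}
                   {\sum_{i} (x_i - \bar{x})^{2}},
        \]

    Values of I significantly above the expectation -1/(n-1) indicate spatial clustering of fire occurrence; values below it indicate dispersion.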

  11. PRESENTED AT TRIANGLE CONSORTIUM FOR REPRODUCTIVE BIOLOGY MEETING IN CHAPEL HILL, NC ON 2/11/2006: SPERM COUNT DISTRIBUTIONS IN FERTILE MEN

    EPA Science Inventory

    Sperm concentration and count are often used as indicators of environmental impacts on male reproductive health. Existing clinical databases may be biased towards sub-fertile men with low sperm counts and less is known about expected sperm count distributions in cohorts of ferti...

  12. Group-oriented coordination models for distributed client-server computing

    NASA Technical Reports Server (NTRS)

    Adler, Richard M.; Hughes, Craig S.

    1994-01-01

    This paper describes group-oriented control models for distributed client-server interactions. These models transparently coordinate requests for services that involve multiple servers, such as queries across distributed databases. Specific capabilities include: decomposing and replicating client requests; dispatching request subtasks or copies to independent, networked servers; and combining server results into a single response for the client. The control models were implemented by combining request broker and process group technologies with an object-oriented communication middleware tool. The models are illustrated in the context of a distributed operations support application for space-based systems.
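
    The coordination pattern described above (decompose, dispatch, combine) can be sketched in a few lines; the server partitions, query, and combining rule below are invented stand-ins for the networked database servers of the paper.

        from concurrent.futures import ThreadPoolExecutor

        SERVERS = {
            "east": {"alice": 3, "bob": 1},  # stand-ins for distributed DBs
            "west": {"bob": 2, "carol": 5},
        }

        def subquery(server: str, key: str) -> int:
            """Run one subtask against a single server's partition."""
            return SERVERS[server].get(key, 0)

        def coordinated_query(key: str) -> int:
            """Decompose, dispatch in parallel, and combine server results."""
            with ThreadPoolExecutor() as pool:
                partials = pool.map(lambda s: subquery(s, key), SERVERS)
            return sum(partials)  # combining step: one response for the client

        print(coordinated_query("bob"))  # -> 3

    Replicating a request for fault tolerance follows the same shape: dispatch copies to redundant servers and keep the first result instead of summing.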

  13. Distributed software framework and continuous integration in hydroinformatics systems

    NASA Astrophysics Data System (ADS)

    Zhou, Jianzhong; Zhang, Wei; Xie, Mengfei; Lu, Chengwei; Chen, Xiao

    2017-08-01

    When encountering multiple and complicated models, multisource structured and unstructured data, and complex requirements analysis, the platform design and integration of hydroinformatics systems become a challenge. To address these problems properly, we describe a distributed software framework and its continuous integration process for hydroinformatics systems. The distributed framework mainly consists of a server cluster for models, a distributed database, GIS (Geographic Information System) servers, a master node and clients. Based on it, a GIS-based decision support system for the joint regulation of water quantity and water quality of a group of lakes in Wuhan, China was established.

  14. A Neural Network Aero Design System for Advanced Turbo-Engines

    NASA Technical Reports Server (NTRS)

    Sanz, Jose M.

    1999-01-01

    An inverse design method calculates the blade shape that produces a prescribed input pressure distribution. By controlling this input pressure distribution, the aerodynamic design objectives can easily be met. Because of the intrinsic relationship between the pressure distribution and the airfoil's physical properties, a neural network can be trained to choose the optimal pressure distribution that would meet a set of physical requirements. The neural network technique works well not only as an interpolating device but also as an extrapolating device for achieving blade designs from a given database. Two validating test cases are discussed.

  15. The HyperLeda project en route to the astronomical virtual observatory

    NASA Astrophysics Data System (ADS)

    Golev, V.; Georgiev, V.; Prugniel, Ph.

    2002-07-01

    HyperLeda (Hyper-Linked Extragalactic Databases and Archives) is aimed at the study of the evolution of galaxies, their kinematics and stellar populations, and the structure of the Local Universe. HyperLeda is involved in catalogue and software production, data mining and massive data processing. The products are served to the community through web mirrors. The development of HyperLeda is distributed between different sites and builds on the background experience of the LEDA and Hypercat databases. The HyperLeda project is focused both on the European iAstro collaboration and on serving as a unique database for studies of the physics of extragalactic objects.

  16. Distributed databases for materials study of thermo-kinetic properties

    NASA Astrophysics Data System (ADS)

    Toher, Cormac

    2015-03-01

    High-throughput computational materials science provides researchers with the opportunity to rapidly generate large databases of materials properties. To rapidly add thermal properties to the AFLOWLIB consortium and Materials Project repositories, we have implemented an automated quasi-harmonic Debye model, the Automatic GIBBS Library (AGL). This enables us to screen thousands of materials for thermal conductivity, bulk modulus, thermal expansion and related properties. The search and sort functions of the online database can then be used to identify suitable materials for more in-depth study using more precise computational or experimental techniques. The AFLOW-AGL source code is public domain and will soon be released under the GNU GPL license.
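
    In the standard quasi-harmonic Debye approach on which such automated libraries are built (stated here as the textbook formulation, not transcribed from the abstract), a non-equilibrium Gibbs function is minimized over volume at each pressure and temperature:

        \[
          G^{*}(V; P, T) = E(V) + PV + A_{\mathrm{vib}}\bigl(\Theta_{D}(V); T\bigr),
        \]
        \[
          A_{\mathrm{vib}} = n k_{B} T \left[ \frac{9}{8}\,\frac{\Theta_{D}}{T}
            + 3 \ln\bigl(1 - e^{-\Theta_{D}/T}\bigr)
            - D\!\left(\frac{\Theta_{D}}{T}\right) \right],
        \]

    Here E(V) is the static energy from electronic-structure calculations, Θ_D(V) the volume-dependent Debye temperature, and D the Debye integral; the equilibrium V(P, T) then yields the bulk modulus, thermal expansion and related properties that the library screens.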

  17. GetData: A filesystem-based, column-oriented database format for time-ordered binary data

    NASA Astrophysics Data System (ADS)

    Wiebe, Donald V.; Netterfield, Calvin B.; Kisner, Theodore S.

    2015-12-01

    The GetData Project is the reference implementation of the Dirfile Standards, a filesystem-based, column-oriented database format for time-ordered binary data. Dirfiles provide a fast, simple format for storing and reading data, suitable for both quicklook and analysis pipelines. GetData provides a C API and bindings exist for various other languages. GetData is distributed under the terms of the GNU Lesser General Public License.

  18. Cloud-Based Distributed Control of Unmanned Systems

    DTIC Science & Technology

    2015-04-01

    The indexed text is fragmentary. Its recoverable content states that, at best, mission data is saved onto hard drives and is accessible only by the local team; that the following open-source technologies were chosen to implement the back-end database and server for geospatial map data: GeoServer, OpenLayers, PostgreSQL, and PostGIS; and that PostgreSQL is an SQL-compliant object-relational database that easily scales to accommodate large amounts of data.

  19. Delayed Instantiation Bulk Operations for Management of Distributed, Object-Based Storage Systems

    DTIC Science & Technology

    2009-08-01

    The indexed text is fragmentary. Its recoverable content describes a system built around a PostgreSQL database: tables store source and destination object sets, attribute pages record operation history, client dialogue is translated to the database where server-side functions implement the service logic for requests, and the system performs delayed instantiation bulk operations on stored objects.

  20. LSD: Large Survey Database framework

    NASA Astrophysics Data System (ADS)

    Juric, Mario

    2012-09-01

    The Large Survey Database (LSD) is a Python framework and DBMS for distributed storage, cross-matching and querying of large survey catalogs (>10^9 rows, >1 TB). The primary driver behind its development is the analysis of Pan-STARRS PS1 data. It is specifically optimized for fast queries and parallel sweeps of positionally and temporally indexed datasets. It transparently scales to more than 10^2 nodes, and can be made to function in "shared nothing" architectures.

  1. Scalable global grid catalogue for Run3 and beyond

    NASA Astrophysics Data System (ADS)

    Martinez Pedreira, M.; Grigoras, C.; ALICE Collaboration

    2017-10-01

    The AliEn (ALICE Environment) file catalogue is a global unique namespace providing mapping between a UNIX-like logical name structure and the corresponding physical files distributed over 80 storage elements worldwide. Powerful search tools and hierarchical metadata information are integral parts of the system and are used by the Grid jobs as well as local users to store and access all files on the Grid storage elements. The catalogue has been in production since 2005 and over the past 11 years has grown to more than 2 billion logical file names. The backend is a set of distributed relational databases, ensuring smooth growth and fast access. Due to the anticipated fast future growth, we are looking for ways to enhance the performance and scalability by simplifying the catalogue schema while keeping the functionality intact. We investigated different backend solutions, such as distributed key value stores, as replacement for the relational database. This contribution covers the architectural changes in the system, together with the technology evaluation, benchmark results and conclusions.
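
    The catalogue's essential operation, mapping a logical name to its physical replicas, is what makes a key-value backend plausible. The sketch below illustrates that mapping with a plain Python dict standing in for a distributed key-value store; the logical file name and storage-element URLs are invented examples, not real AliEn entries.

        catalogue: dict[str, list[str]] = {}

        def register(lfn: str, physical_url: str) -> None:
            """Add one physical replica for a logical file name (LFN)."""
            catalogue.setdefault(lfn, []).append(physical_url)

        def lookup(lfn: str) -> list[str]:
            """Resolve an LFN to all of its physical replicas."""
            return catalogue.get(lfn, [])

        register("/alice/data/run123/file.root",
                 "root://se01.example.org//store/ab/file.root")
        register("/alice/data/run123/file.root",
                 "root://se07.example.org//store/cd/file.root")
        print(lookup("/alice/data/run123/file.root"))

    Simple gets and puts by key scale horizontally more easily than relational joins, which is the motivation the abstract gives for evaluating key-value stores; the cost is that hierarchical listing and metadata search must then be provided by other means.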

  2. A Correction for IUE UV Flux Distributions from Comparisons with CALSPEC

    NASA Astrophysics Data System (ADS)

    Bohlin, Ralph C.; Bianchi, Luciana

    2018-04-01

    A collection of spectral energy distributions (SEDs) is available in the Hubble Space Telescope (HST) CALSPEC database that is based on calculated model atmospheres for pure hydrogen white dwarfs (WDs). A much larger set (∼100,000) of UV SEDs covering the range 1150–3350 Å, with somewhat lower quality, is available in the IUE database. IUE low-dispersion flux distributions are compared with CALSPEC to provide a correction that places IUE fluxes on the CALSPEC scale. While IUE observations are repeatable to only 4%–10% in regions of good sensitivity, the average flux corrections have a precision of 2%–3%. Our re-calibration places the IUE flux scale on the current UV reference standard and is relevant for any project based on IUE archival data, including our planned comparison of GALEX to the corrected IUE fluxes. IUE SEDs may be used to plan observations and cross-calibrate data from future missions, so the IUE flux calibration must be consistent with HST instrumental calibrations to the best possible precision.

  3. Wide-area-distributed storage system for a multimedia database

    NASA Astrophysics Data System (ADS)

    Ueno, Masahiro; Kinoshita, Shigechika; Kuriki, Makato; Murata, Setsuko; Iwatsu, Shigetaro

    1998-12-01

    We have developed a wide-area-distributed storage system for multimedia databases that minimizes the possibility of simultaneous failure of multiple disks in the event of a major disaster. It features a RAID system whose member disks are spatially distributed over a wide area. Each node has a device that includes the controller of the RAID and the controllers of the member disks managed by other nodes. The devices at a node are connected to a computer using fiber-optic cables and communicate using fiber-channel technology. Any computer at a node can utilize the multiple devices connected by optical fibers as a single 'virtual disk.' The advantage of this system structure is that the devices and fiber-optic cables are shared by the computers. In this report, we first describe our proposed system and the prototype used for testing. We then discuss its performance, i.e., how read and write throughputs are affected by data-access delay, the RAID level, and queuing.
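
    The fault-tolerance idea, one surviving parity block reconstructing a disk lost in a disaster, can be illustrated with byte-wise XOR, the parity used in RAID levels 4 and 5. This toy (equal-size blocks, one parity block, names invented) is not the authors' implementation:

        from functools import reduce

        def parity(blocks: list[bytes]) -> bytes:
            """Byte-wise XOR of equally sized blocks; doubles as recovery,
            since XOR-ing the survivors with the parity rebuilds the loss."""
            return bytes(reduce(lambda a, b: a ^ b, column)
                         for column in zip(*blocks))

        # Three data blocks striped across distant nodes, plus their parity.
        nodes = [b"block-A1", b"block-B2", b"block-C3"]
        p = parity(nodes)

        # Node 1 is destroyed; its block is rebuilt from the survivors.
        rebuilt = parity([nodes[0], nodes[2], p])
        assert rebuilt == nodes[1]
        print(rebuilt)

    Spreading the member disks over a wide area means a single disaster is unlikely to destroy more blocks of a stripe than the parity can recover.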

  4. A Hybrid Semi-supervised Classification Scheme for Mining Multisource Geospatial Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Vatsavai, Raju; Bhaduri, Budhendra L

    2011-01-01

    Supervised learning methods such as Maximum Likelihood (ML) are often used in land cover (thematic) classification of remote sensing imagery. The ML classifier relies exclusively on spectral characteristics of thematic classes whose statistical distributions (class conditional probability densities) are often overlapping. The spectral response distributions of thematic classes are dependent on many factors including elevation, soil types, and ecological zones. A second problem with statistical classifiers is the requirement of a large number of accurate training samples (10 to 30 times the number of dimensions), which are often costly and time-consuming to acquire over large geographic regions. With the increasing availability of geospatial databases, it is possible to exploit the knowledge derived from these ancillary datasets to improve classification accuracies even when the class distributions are highly overlapping. Likewise, newer semi-supervised techniques can be adopted to improve the parameter estimates of the statistical model by utilizing a large number of easily available unlabeled training samples. Unfortunately, there is no convenient multivariate statistical model that can be employed for multisource geospatial databases. In this paper we present a hybrid semi-supervised learning algorithm that effectively exploits freely available unlabeled training samples from multispectral remote sensing images and also incorporates ancillary geospatial databases. We have conducted several experiments on real datasets, and our new hybrid approach shows a 25% to 35% improvement in overall classification accuracy over conventional classification schemes.
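    The semi-supervised idea can be shown on a toy one-dimensional problem (a generic EM sketch under equal priors and unit variances, not the paper's multisource algorithm): a few labeled samples seed the class means, and many unlabeled samples refine them.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    labeled = {0: rng.normal(0.0, 1.0, 10), 1: rng.normal(3.0, 1.0, 10)}
    unlabeled = np.concatenate([rng.normal(0.0, 1.0, 500),
                                rng.normal(3.0, 1.0, 500)])

    mu = np.array([labeled[0].mean(), labeled[1].mean()])  # seed estimates
    for _ in range(20):                                    # EM iterations
        # E-step: soft class memberships for the unlabeled samples
        d = np.exp(-0.5 * (unlabeled[:, None] - mu) ** 2)
        r = d / d.sum(axis=1, keepdims=True)
        # M-step: update each mean from labeled + soft-labeled data
        for k in (0, 1):
            w = np.concatenate([np.ones_like(labeled[k]), r[:, k]])
            x = np.concatenate([labeled[k], unlabeled])
            mu[k] = (w * x).sum() / w.sum()
    print(mu)   # approaches the true class means 0 and 3
    ```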

  5. Multiple elastic scattering of electrons in condensed matter

    NASA Astrophysics Data System (ADS)

    Jablonski, A.

    2017-01-01

    Since the 1940s, much attention has been devoted to the problem of an accurate theoretical description of electron transport in condensed matter. The information needed for describing different aspects of electron transport is the angular distribution of electron directions after multiple elastic collisions. This distribution can be expanded into a series of Legendre polynomials with coefficients A_l. In the present work, a database of these coefficients has been created for all elements up to uranium (Z=92) and a dense grid of electron energies varying from 50 to 5000 eV. The database makes possible the following applications: (i) accurate interpolation of the coefficients A_l for any element and any energy in the above range, (ii) fast calculations of the differential and total elastic-scattering cross sections, (iii) determination of the angular distribution of directions after multiple collisions, (iv) calculations of the probability of elastic backscattering from solids, and (v) calculations of the calibration curves for determination of the inelastic mean free paths of electrons. The last two applications provide data with accuracy comparable to Monte Carlo simulations, yet the running time is decreased by several orders of magnitude. All of the above applications are implemented in the Fortran program MULTI_SCATT. Numerous illustrative runs of this program are described. Despite the relatively large volume of the database of coefficients A_l, the program MULTI_SCATT can be readily run on personal computers.
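    Once coefficients A_l are interpolated from such a database, the angular distribution can be rebuilt as a Legendre series. The coefficient values and the (2l+1)/2 normalization below are illustrative assumptions, not values from the database:

    ```python
    import numpy as np
    from numpy.polynomial import legendre as leg

    A = np.array([1.0, 0.6, 0.3, 0.1])        # hypothetical A_l values
    c = (2 * np.arange(len(A)) + 1) / 2 * A   # weight each P_l term

    theta = np.linspace(0.0, np.pi, 181)
    f = leg.legval(np.cos(theta), c)          # f(theta) after collisions
    print(f[:5])
    ```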

  6. Is the spatial distribution of brain lesions associated with closed-head injury predictive of subsequent development of attention-deficit/hyperactivity disorder? Analysis with brain-image database

    NASA Technical Reports Server (NTRS)

    Herskovits, E. H.; Megalooikonomou, V.; Davatzikos, C.; Chen, A.; Bryan, R. N.; Gerring, J. P.

    1999-01-01

    PURPOSE: To determine whether there is an association between the spatial distribution of lesions detected at magnetic resonance (MR) imaging of the brain in children after closed-head injury and the development of secondary attention-deficit/hyperactivity disorder (ADHD). MATERIALS AND METHODS: Data obtained from 76 children without prior history of ADHD were analyzed. MR images were obtained 3 months after closed-head injury. After manual delineation of lesions, images were registered to the Talairach coordinate system. For each subject, registered images and secondary ADHD status were integrated into a brain-image database, which contains depiction (visualization) and statistical analysis software. Using this database, we assessed visually the spatial distributions of lesions and performed statistical analysis of image and clinical variables. RESULTS: Of the 76 children, 15 developed secondary ADHD. Depiction of the data suggested that children who developed secondary ADHD had more lesions in the right putamen than children who did not develop secondary ADHD; this impression was confirmed statistically. After Bonferroni correction, we could not demonstrate significant differences between secondary ADHD status and lesion burdens for the right caudate nucleus or the right globus pallidus. CONCLUSION: Closed-head injury-induced lesions in the right putamen in children are associated with subsequent development of secondary ADHD. Depiction software is useful in guiding statistical analysis of image data.

  7. Site partitioning for distributed redundant disk arrays

    NASA Technical Reports Server (NTRS)

    Mourad, Antoine N.; Fuchs, W. K.; Saab, Daniel G.

    1992-01-01

    Distributed redundant disk arrays can be used in a distributed computing system or database system to provide recovery in the presence of temporary and permanent failures of single sites. In this paper, we look at the problem of partitioning the sites into redundant arrays in such a way that the communication costs for maintaining the parity information are minimized. We show that the partitioning problem is NP-complete and we propose two heuristic algorithms for finding approximate solutions.
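    A simple greedy heuristic of the kind one might use for this problem (illustrative only; the paper's two algorithms are not reproduced here) grows each array around a seed site by repeatedly adding the cheapest remaining site:

    ```python
    # Greedy grouping of sites into fixed-size redundant arrays so that
    # intra-array communication cost stays small (hypothetical sketch).
    def greedy_partition(cost, group_size):
        sites = set(range(len(cost)))
        groups = []
        while sites:
            group = [sites.pop()]                 # arbitrary seed site
            while len(group) < group_size and sites:
                best = min(sites, key=lambda s: sum(cost[g][s] for g in group))
                sites.remove(best)
                group.append(best)
            groups.append(group)
        return groups

    cost = [[0, 1, 9, 9], [1, 0, 9, 9], [9, 9, 0, 2], [9, 9, 2, 0]]
    print(greedy_partition(cost, 2))   # pairs the two cheap site pairs
    ```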

  8. Implementation of medical monitor system based on networks

    NASA Astrophysics Data System (ADS)

    Yu, Hui; Cao, Yuzhen; Zhang, Lixin; Ding, Mingshi

    2006-11-01

    In this paper, the development trend of medical monitor systems is analyzed; portability and network functionality are becoming more and more popular among all kinds of medical monitor devices. The architecture of a networked medical monitor system solution is provided, and the design and implementation details of the medical monitor terminal, the monitor center software, the distributed medical database and two kinds of medical information terminals are discussed in particular. The Rabbit3000 system is used in the medical monitor terminal to implement security administration of data transfer on the network, the human-machine interface, power management and the DSP interface, while the DSP chip TMS5402 is used for signal analysis and data compression. The distributed medical database is designed for the hospital center according to the DICOM information model and the HL7 standard. A pocket medical information terminal based on an ARM9 embedded platform has also been developed to interact with the center database over networks. Two kernels based on WinCE are customized and corresponding terminal software is developed for nurses' routine care and doctors' auxiliary diagnosis. An invention patent for the monitor terminal has now been approved, and manufacturing and clinical test plans are scheduled. Applications for invention patents have also been filed for the two medical information terminals.

  9. [Public scientific knowledge distribution in health information, communication and information technology indexed in MEDLINE and LILACS databases].

    PubMed

    Packer, Abel Laerte; Tardelli, Adalberto Otranto; Castro, Regina Célia Figueiredo

    2007-01-01

    This study explores the distribution of international, regional and national scientific output in health information and communication, indexed in the MEDLINE and LILACS databases, between 1996 and 2005. A selection of articles was based on the hierarchical structure of Information Science in MeSH vocabulary. Four specific domains were determined: health information, medical informatics, scientific communications on healthcare and healthcare communications. The variables analyzed were: most-covered subjects and journals, author affiliation and publication countries and languages, in both databases. The Information Science category is represented in nearly 5% of MEDLINE and LILACS articles. The four domains under analysis showed a relative annual increase in MEDLINE. The Medical Informatics domain showed the highest number of records in MEDLINE, representing about half of all indexed articles. The importance of Information Science as a whole is more visible in publications from developed countries and the findings indicate the predominance of the United States, with significant growth in scientific output from China and South Korea and, to a lesser extent, Brazil.

  10. Australia's continental-scale acoustic tracking database and its automated quality control process

    NASA Astrophysics Data System (ADS)

    Hoenner, Xavier; Huveneers, Charlie; Steckenreuter, Andre; Simpfendorfer, Colin; Tattersall, Katherine; Jaine, Fabrice; Atkins, Natalia; Babcock, Russ; Brodie, Stephanie; Burgess, Jonathan; Campbell, Hamish; Heupel, Michelle; Pasquer, Benedicte; Proctor, Roger; Taylor, Matthew D.; Udyawer, Vinay; Harcourt, Robert

    2018-01-01

    Our ability to predict species responses to environmental changes relies on accurate records of animal movement patterns. Continental-scale acoustic telemetry networks are increasingly being established worldwide, producing large volumes of information-rich geospatial data. During the last decade, the Integrated Marine Observing System's Animal Tracking Facility (IMOS ATF) established a permanent array of acoustic receivers around Australia. Simultaneously, IMOS developed a centralised national database to foster collaborative research across the user community and quantify individual behaviour across a broad range of taxa. Here we present the database and quality control procedures developed to collate 49.6 million valid detections from 1891 receiving stations. This dataset consists of detections for 3,777 tags deployed on 117 marine species, with distances travelled ranging from a few to thousands of kilometres. Connectivity between regions was only made possible by the joint contribution of IMOS infrastructure and researcher-funded receivers. This dataset constitutes a valuable resource facilitating meta-analysis of animal movement, distributions, and habitat use, and is important for relating species distribution shifts with environmental covariates.

  11. Methods to elicit probability distributions from experts: a systematic review of reported practice in health technology assessment.

    PubMed

    Grigore, Bogdan; Peters, Jaime; Hyde, Christopher; Stein, Ken

    2013-11-01

    Elicitation is a technique that can be used to obtain probability distributions from experts about unknown quantities. We conducted a methodology review of reports where probability distributions had been elicited from experts to be used in model-based health technology assessments. Databases including MEDLINE, EMBASE and the CRD database were searched from inception to April 2013. Reference lists were checked and citation mapping was also used. Studies describing their approach to the elicitation of probability distributions were included. Data were abstracted on pre-defined aspects of the elicitation technique. Reports were critically appraised on their consideration of the validity, reliability and feasibility of the elicitation exercise. Fourteen articles were included. Across these studies, the most marked features were heterogeneity in elicitation approach and failure to report key aspects of the elicitation method. The most frequently used approaches to elicitation were the histogram technique and the bisection method. Only three papers explicitly considered the validity, reliability and feasibility of the elicitation exercises. Judged by the studies identified in the review, reports of expert elicitation are insufficiently detailed, and this impacts the perceived usability of expert-elicited probability distributions. In this context, the wider credibility of elicitation will only be improved by better reporting and greater standardisation of approach. Until then, the advantage of eliciting probability distributions from experts may be lost.
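    As a concrete example of turning elicited summaries into a distribution (a sketch assuming a normal fit; the elicited values are invented), one can choose the normal whose quartiles match the expert's statements:

    ```python
    from scipy import stats

    median, lower_q, upper_q = 0.30, 0.22, 0.38   # elicited P50, P25, P75

    mu = median
    sigma = (upper_q - median) / stats.norm.ppf(0.75)  # P75 = mu + 0.6745*sigma

    dist = stats.norm(mu, sigma)
    print(dist.ppf([0.25, 0.50, 0.75]))   # reproduces the elicited quartiles
    ```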

  12. BioModels Database: An enhanced, curated and annotated resource for published quantitative kinetic models

    PubMed Central

    2010-01-01

    Background Quantitative models of biochemical and cellular systems are used to answer a variety of questions in the biological sciences. The number of published quantitative models is growing steadily thanks to increasing interest in the use of models as well as the development of improved software systems and the availability of better, cheaper computer hardware. To maximise the benefits of this growing body of models, the field needs centralised model repositories that will encourage, facilitate and promote model dissemination and reuse. Ideally, the models stored in these repositories should be extensively tested and encoded in community-supported and standardised formats. In addition, the models and their components should be cross-referenced with other resources in order to allow their unambiguous identification. Description BioModels Database http://www.ebi.ac.uk/biomodels/ is aimed at addressing exactly these needs. It is a freely-accessible online resource for storing, viewing, retrieving, and analysing published, peer-reviewed quantitative models of biochemical and cellular systems. The structure and behaviour of each simulation model distributed by BioModels Database are thoroughly checked; in addition, model elements are annotated with terms from controlled vocabularies as well as linked to relevant data resources. Models can be examined online or downloaded in various formats. Reaction network diagrams generated from the models are also available in several formats. BioModels Database also provides features such as online simulation and the extraction of components from large scale models into smaller submodels. Finally, the system provides a range of web services that external software systems can use to access up-to-date data from the database. Conclusions BioModels Database has become a recognised reference resource for systems biology. It is being used by the community in a variety of ways; for example, it is used to benchmark different simulation systems, and to study the clustering of models based upon their annotations. Model deposition to the database today is advised by several publishers of scientific journals. The models in BioModels Database are freely distributed and reusable; the underlying software infrastructure is also available from SourceForge https://sourceforge.net/projects/biomodels/ under the GNU General Public License. PMID:20587024

  13. User’s guide to the North Pacific Pelagic Seabird Database 2.0

    USGS Publications Warehouse

    Drew, Gary S.; Piatt, John F.; Renner, Martin

    2015-07-13

    The North Pacific Pelagic Seabird Database (NPPSD) was created in 2005 to consolidate data on the oceanic distribution of marine bird species in the North Pacific. Most of these data were collected on surveys by counting species within defined areas and at known locations (that is, on strip transects). The NPPSD also contains observations of other bird species and marine mammals. The original NPPSD combined data from 465 surveys conducted between 1973 and 2002, primarily in waters adjacent to Alaska. These surveys included 61,195 sample transects with location, environment, and metadata information, and the data were organized in a flat-file format. In developing NPPSD 2.0, our goals were to add new datasets, to make significant improvements to database functionality and to provide the database online. NPPSD 2.0 includes data from a broader geographic range within the North Pacific, including new observations made offshore of the Russian Federation, Japan, Korea, British Columbia (Canada), Oregon, and California. These data were imported into a relational database, proofed, and structured in a common format. NPPSD 2.0 contains 351,674 samples (transects) collected between 1973 and 2012, representing a total sampled area of 270,259 square kilometers, and extends the time series of samples in some areas—notably the Bering Sea—to four decades. It contains observations of 16,988,138 birds and 235,545 marine mammals and is available on the NPPSD Web site. Supplementary materials include an updated set of standardized taxonomic codes, reference maps that show the spatial and temporal distribution of the survey efforts and a downloadable query tool.

  14. Vanderbilt University Institute of Imaging Science Center for Computational Imaging XNAT: A multimodal data archive and processing environment.

    PubMed

    Harrigan, Robert L; Yvernault, Benjamin C; Boyd, Brian D; Damon, Stephen M; Gibney, Kyla David; Conrad, Benjamin N; Phillips, Nicholas S; Rogers, Baxter P; Gao, Yurui; Landman, Bennett A

    2016-01-01

    The Vanderbilt University Institute for Imaging Science (VUIIS) Center for Computational Imaging (CCI) has developed a database built on XNAT housing over a quarter of a million scans. The database provides framework for (1) rapid prototyping, (2) large scale batch processing of images and (3) scalable project management. The system uses the web-based interfaces of XNAT and REDCap to allow for graphical interaction. A python middleware layer, the Distributed Automation for XNAT (DAX) package, distributes computation across the Vanderbilt Advanced Computing Center for Research and Education high performance computing center. All software are made available in open source for use in combining portable batch scripting (PBS) grids and XNAT servers. Copyright © 2015 Elsevier Inc. All rights reserved.

  15. Catalog of databases and reports

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Burtis, M.D.

    1997-04-01

    This catalog provides information about the many reports and materials made available by the US Department of Energy's (DOE's) Global Change Research Program (GCRP) and the Carbon Dioxide Information Analysis Center (CDIAC). The catalog is divided into nine sections plus the author and title indexes: Section A--US Department of Energy Global Change Research Program Research Plans and Summaries; Section B--US Department of Energy Global Change Research Program Technical Reports; Section C--US Department of Energy Atmospheric Radiation Measurement (ARM) Program Reports; Section D--Other US Department of Energy Reports; Section E--CDIAC Reports; Section F--CDIAC Numeric Data and Computer Model Distribution; Section G--Other Databases Distributed by CDIAC; Section H--US Department of Agriculture Reports on Response of Vegetation to Carbon Dioxide; and Section I--Other Publications.

  16. Fullerene data mining using bibliometrics and database tomography

    PubMed

    Kostoff; Braun; Schubert; Toothman; Humenik

    2000-01-01

    Database tomography (DT) is a textual database analysis system consisting of two major components: (1) algorithms for extracting multiword phrase frequencies and phrase proximities (physical closeness of the multiword technical phrases) from any type of large textual database, to augment (2) the interpretative capabilities of the expert human analyst. DT was used to derive technical intelligence from a fullerenes database derived from the Science Citation Index and the Engineering Compendex. Phrase frequency analysis by the technical domain experts provided the pervasive technical themes of the fullerenes database, and phrase proximity analysis provided the relationships among those themes. Bibliometric analysis of the fullerenes literature supplemented the DT results with author/journal/institution publication and citation data. Comparisons of the fullerenes results with past analyses of similarly structured near-earth space, chemistry, hypersonic/supersonic flow, aircraft, and ship hydrodynamics databases are made. One important finding is that many of the normalized bibliometric distribution functions are extremely consistent across these diverse technical domains and could reasonably be expected to apply to chemical topics broader than fullerenes that span multiple structural classes. Finally, lessons learned about integrating the technical domain experts with the data mining tools are presented.
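    The two DT components are easy to illustrate in miniature (a toy sketch, not the DT system itself): phrase frequency counts multiword phrases across records, and phrase proximity counts how often two phrases co-occur in the same record.

    ```python
    from collections import Counter
    from itertools import combinations

    records = [
        "carbon nanotube growth by laser ablation of fullerene soot",
        "fullerene soot and carbon nanotube formation in arc discharge",
    ]

    def bigrams(text):
        words = text.split()
        return [" ".join(p) for p in zip(words, words[1:])]

    freq = Counter(b for r in records for b in bigrams(r))
    proximity = Counter()
    for r in records:
        for pair in combinations(sorted(set(bigrams(r))), 2):
            proximity[pair] += 1

    print(freq.most_common(3))        # pervasive phrases
    print(proximity.most_common(2))   # related phrase pairs
    ```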

  17. Very fast road database verification using textured 3D city models obtained from airborne imagery

    NASA Astrophysics Data System (ADS)

    Bulatov, Dimitri; Ziems, Marcel; Rottensteiner, Franz; Pohl, Melanie

    2014-10-01

    Road databases are known to be an important part of any geodata infrastructure, e.g. as the basis for urban planning or emergency services. Updating road databases for crisis events must be performed quickly and with the highest possible degree of automation. We present a semi-automatic algorithm for road verification using textured 3D city models, starting from aerial or even UAV images. This algorithm contains two processes, which exchange input and output but basically run independently of each other: textured urban terrain reconstruction and road verification. The first process performs a dense photogrammetric reconstruction of the 3D geometry of the scene using depth maps. The second process is our core procedure, since it contains various methods for road verification. Each method represents a unique road model and a specific strategy, and thus is able to deal with a specific type of road. Each method is designed to provide two probability distributions, where the first describes the state of a road object (correct, incorrect) and the second describes the state of its underlying road model (applicable, not applicable). Based on the Dempster-Shafer theory, both distributions are mapped to a single distribution that refers to three states: correct, incorrect, and unknown. With respect to the interaction of both processes, the normalized elevation map and the digital orthophoto generated during 3D reconstruction are the necessary input, together with initial road database entries, for the road verification process. If the entries of the database are too obsolete or not available at all, sensor data evaluation enables classification of the road pixels of the elevation map, followed by road map extraction by means of vectorization and filtering of geometrically and topologically inconsistent objects. Depending on time constraints and the availability of a geo-database for buildings, the urban terrain reconstruction procedure outputs semantic models of buildings, trees, and ground. Buildings and ground are textured by means of the available images. This facilitates orientation in the model and the interactive verification of the road objects that were initially classified as unknown. The three main modules of the texturing algorithm are pose estimation (if the videos are not geo-referenced), occlusion analysis, and texture synthesis.
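    One simple reading of the evidence-combination step (hedged; the paper's exact Dempster-Shafer mapping may differ) treats "model not applicable" as ignorance, which discounts the road-state evidence onto the unknown state:

    ```python
    # Map (road-state, model-state) distributions to three states.
    def combine(p_correct, p_applicable):
        return {
            "correct":   p_correct * p_applicable,
            "incorrect": (1.0 - p_correct) * p_applicable,
            "unknown":   1.0 - p_applicable,   # mass left to ignorance
        }

    print(combine(p_correct=0.9, p_applicable=0.7))
    # roughly {'correct': 0.63, 'incorrect': 0.07, 'unknown': 0.30}
    ```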

  18. 3D radiation belt diffusion model results using new empirical models of whistler chorus and hiss

    NASA Astrophysics Data System (ADS)

    Cunningham, G.; Chen, Y.; Henderson, M. G.; Reeves, G. D.; Tu, W.

    2012-12-01

    3D diffusion codes model the energization, radial transport, and pitch angle scattering due to wave-particle interactions. Diffusion codes are powerful but are limited by the lack of knowledge of the spatial and temporal distribution of waves that drive the interactions for a specific event. We present results from the 3D DREAM model using diffusion coefficients driven by new, activity-dependent, statistical models of chorus and hiss waves. Most 3D codes parameterize the diffusion coefficients or wave amplitudes as functions of magnetic activity indices like Kp, AE, or Dst. These functional representations produce the average value of the wave intensities for a given level of magnetic activity; however, the variability of the wave population at a given activity level is lost with such a representation. Our 3D code makes use of the full sample distributions contained in a set of empirical wave databases (one database for each wave type, including plasmaspheric hiss, lower- and upper-band chorus) that were recently produced by our team using CRRES and THEMIS observations. The wave databases store the full probability distribution of observed wave intensity binned by AE, MLT, MLAT and L*. In this presentation, we show results that make use of the wave intensity sample probability distributions for lower-band and upper-band chorus by sampling the distributions stochastically during a representative CRRES-era storm. The sampling of the wave intensity probability distributions produces a collection of possible evolutions of the phase space density, which quantifies the uncertainty in the model predictions caused by the uncertainty of the chorus wave amplitudes for a specific event. A significant issue is the determination of an appropriate model for the spatio-temporal correlations of the wave intensities, since the diffusion coefficients are computed as spatio-temporal averages of the waves over MLT, MLAT and L*. The spatio-temporal correlations cannot be inferred from the wave databases. In this study we use a temporal correlation of ~1 hour for the sampled wave intensities that is informed by the observed autocorrelation in the AE index, a spatial correlation length of ~100 km in the two directions perpendicular to the magnetic field, and a spatial correlation length of 5000 km in the direction parallel to the magnetic field, according to the work of Santolik et al. (2003), who used multi-spacecraft measurements from Cluster to quantify the correlation length scales for equatorial chorus. We find that, despite the small correlation length scale for chorus, there remains significant variability in the model outcomes driven by variability in the chorus wave intensities.
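    The stochastic sampling with a ~1 hour correlation time can be sketched as an AR(1) process in log intensity whose marginal follows an assumed distribution (all parameter values below are invented for illustration):

    ```python
    import numpy as np

    rng = np.random.default_rng(1)
    dt, tau = 300.0, 3600.0             # 5 min steps, 1 h correlation time
    phi = np.exp(-dt / tau)             # AR(1) coefficient per step

    n = 288                             # one day of samples
    z = np.empty(n)
    z[0] = rng.normal()
    for i in range(1, n):
        z[i] = phi * z[i - 1] + np.sqrt(1.0 - phi**2) * rng.normal()

    # Correlated standard normals mapped onto a log-normal intensity
    # (median and spread are hypothetical, not database values).
    intensity = np.exp(np.log(10.0) + 1.0 * z)
    print(intensity[:5])
    ```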

  19. Distributed health data networks: a practical and preferred approach to multi-institutional evaluations of comparative effectiveness, safety, and quality of care.

    PubMed

    Brown, Jeffrey S; Holmes, John H; Shah, Kiran; Hall, Ken; Lazarus, Ross; Platt, Richard

    2010-06-01

    Comparative effectiveness research, medical product safety evaluation, and quality measurement will require the ability to use electronic health data held by multiple organizations. There is no consensus about whether to create regional or national combined (eg, "all payer") databases for these purposes, or distributed data networks that leave most Protected Health Information and proprietary data in the possession of the original data holders. Our objective was to demonstrate functions of a distributed research network that supports research needs and also addresses data holders' concerns about participation. Key design functions included strong local control of data uses and a centralized web-based querying interface. We implemented a pilot distributed research network and evaluated the design considerations, utility for research, and the acceptability to data holders of methods for menu-driven querying. We developed and tested a central, web-based interface with supporting network software. Specific functions assessed included query formation and distribution, query execution and review, and aggregation of results. This pilot successfully evaluated temporal trends in medication use and diagnoses at 5 separate sites, demonstrating some of the possibilities of using a distributed research network. The pilot demonstrated the potential utility of the design, which addressed the major concerns of both users and data holders. No serious obstacles were identified that would prevent development of a fully functional, scalable network. Distributed networks are capable of addressing nearly all anticipated uses of routinely collected electronic healthcare data. Distributed networks would obviate the need for centralized databases, thus avoiding numerous obstacles.
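    The querying pattern at the heart of such a network is compact enough to sketch (a schematic with invented site data, not the pilot's software): the coordinator distributes a query, each data holder executes it locally, and only aggregates travel back.

    ```python
    # Each "site" keeps its records; only aggregate counts leave the site.
    SITES = {
        "site_a": [{"drug": "statin", "year": 2008},
                   {"drug": "ssri",   "year": 2009}],
        "site_b": [{"drug": "statin", "year": 2009}],
    }

    def run_local(records, drug, year):
        """Executed by the data holder; returns an aggregate only."""
        return sum(1 for r in records if r["drug"] == drug and r["year"] == year)

    def distributed_query(drug, year):
        per_site = {name: run_local(recs, drug, year)
                    for name, recs in SITES.items()}
        return per_site, sum(per_site.values())

    print(distributed_query("statin", 2009))   # per-site counts and total
    ```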

  20. Construction of crystal structure prototype database: methods and applications.

    PubMed

    Su, Chuanxun; Lv, Jian; Li, Quan; Wang, Hui; Zhang, Lijun; Wang, Yanchao; Ma, Yanming

    2017-04-26

    Crystal structure prototype data have become a useful source of information for materials discovery in the fields of crystallography, chemistry, physics, and materials science. This work reports the development of a robust and efficient method for assessing the similarity of structures on the basis of their interatomic distances. Using this method, we proposed a simple and unambiguous definition of crystal structure prototype based on hierarchical clustering theory, and constructed the crystal structure prototype database (CSPD) by filtering the known crystallographic structures in a database. With a similar method, a structure prototype analysis package (SPAP) program was developed to remove similar structures from CALYPSO prediction results and to extract predicted low-energy structures for a separate theoretical structure database. A series of statistics describing the distribution of crystal structure prototypes in the CSPD was compiled to provide important insight for structure prediction and high-throughput calculations. Illustrative examples of the application of the proposed database are given, including the generation of initial structures for structure prediction and the determination of prototype structures in databases. These examples demonstrate the CSPD to be a generally applicable and useful tool for materials discovery.
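    The similarity-and-clustering idea can be illustrated with a toy fingerprint (sorted interatomic distances; the real CSPD metric is more elaborate): structures whose fingerprints nearly coincide fall into one prototype.

    ```python
    import numpy as np
    from scipy.spatial.distance import pdist
    from scipy.cluster.hierarchy import linkage, fcluster

    structures = {                      # toy 3-atom "structures"
        "A": np.array([[0, 0, 0], [1.00, 0, 0], [0, 1.00, 0]]),
        "B": np.array([[0, 0, 0], [1.01, 0, 0], [0, 0.99, 0]]),  # ~ A
        "C": np.array([[0, 0, 0], [2.00, 0, 0], [0, 2.00, 0]]),
    }

    fps = np.array([np.sort(pdist(x)) for x in structures.values()])
    labels = fcluster(linkage(pdist(fps)), t=0.1, criterion="distance")
    print(dict(zip(structures, labels)))   # A and B share one prototype
    ```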

  1. Construction of crystal structure prototype database: methods and applications

    NASA Astrophysics Data System (ADS)

    Su, Chuanxun; Lv, Jian; Li, Quan; Wang, Hui; Zhang, Lijun; Wang, Yanchao; Ma, Yanming

    2017-04-01

    Crystal structure prototype data have become a useful source of information for materials discovery in the fields of crystallography, chemistry, physics, and materials science. This work reports the development of a robust and efficient method for assessing the similarity of structures on the basis of their interatomic distances. Using this method, we proposed a simple and unambiguous definition of crystal structure prototype based on hierarchical clustering theory, and constructed the crystal structure prototype database (CSPD) by filtering the known crystallographic structures in a database. With a similar method, a structure prototype analysis package (SPAP) program was developed to remove similar structures from CALYPSO prediction results and to extract predicted low-energy structures for a separate theoretical structure database. A series of statistics describing the distribution of crystal structure prototypes in the CSPD was compiled to provide important insight for structure prediction and high-throughput calculations. Illustrative examples of the application of the proposed database are given, including the generation of initial structures for structure prediction and the determination of prototype structures in databases. These examples demonstrate the CSPD to be a generally applicable and useful tool for materials discovery.

  2. Big Data and Total Hip Arthroplasty: How Do Large Databases Compare?

    PubMed

    Bedard, Nicholas A; Pugely, Andrew J; McHugh, Michael A; Lux, Nathan R; Bozic, Kevin J; Callaghan, John J

    2018-01-01

    Use of large databases for orthopedic research has become extremely popular in recent years. Each database varies in the methods used to capture data and the population it represents. The purpose of this study was to evaluate how these databases differ in reported demographics, comorbidities, and postoperative complications for primary total hip arthroplasty (THA) patients. Primary THA patients were identified within the National Surgical Quality Improvement Program (NSQIP), the Nationwide Inpatient Sample (NIS), Medicare Standard Analytic Files (MED), and the Humana administrative claims database (HAC). NSQIP definitions for comorbidities and complications were matched to corresponding International Classification of Diseases, 9th Revision/Current Procedural Terminology codes to query the other databases. Demographics, comorbidities, and postoperative complications were compared. The number of patients from each database was 22,644 in HAC, 371,715 in MED, 188,779 in NIS, and 27,818 in NSQIP. Age and gender distributions were clinically similar. Overall, there was variation in the prevalence of comorbidities and rates of postoperative complications between databases. As an example, NSQIP had more than twice the obesity prevalence of NIS, and HAC and MED had more than twice as many diabetics as NSQIP. Rates of deep infection and stroke 30 days after THA showed more than 2-fold differences between databases. Among databases commonly used in orthopedic research, there is considerable variation in complication rates following THA depending upon the database used for analysis. It is important to consider these differences when critically evaluating database research. Additionally, with the advent of bundled payments, these differences must be considered in risk adjustment models. Copyright © 2017 Elsevier Inc. All rights reserved.

  3. DISTRIBUTED CONTROL AND DA FOR ATLAS

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    D. SCUDDER; ET AL

    1999-05-01

    The control system for the Atlas pulsed power generator being built at Los Alamos National Laboratory will utilize a significant level of distributed control. Other principal design characteristics include noise immunity, modularity and use of commercial products wherever possible. The data acquisition system is tightly coordinated with the control system. Both share a common database server and a fiber-optic ethernet communications backbone.

  4. Environmental Justice and the Spatial Distribution of Outdoor Recreation Sites: An Application of Geographic Information Systems

    Treesearch

    Michael A. Tarrant; H. Ken Cordell

    1999-01-01

    This study examines the spatial distribution of outdoor recreation sites and their proximity to census block groups (CBGs), in order to determine potential socio-economic inequities. It is framed within the context of environmental justice. Information from the Southern Appalachian Assessment database was applied to a case study of the Chattahoochee National Forest in...

  5. RISK MANAGEMENT USING PROJECT RECON

    DTIC Science & Technology

    2016-11-28

    Risk Management Using Project Recon. UNCLASSIFIED: Distribution Statement A, approved for public release; distribution is unlimited. Bonnie Leece, Project Recon Lead. What is Project Recon? A web-based GOTS tool designed to capture, manage, and link Risks, Issues, and Opportunities in a centralized database. Project Recon (formerly Risk Recon) is designed to be used by all Program Management Offices, Integrated Project Teams and any

  6. Regeneration of cervix after excisional treatment for cervical intraepithelial neoplasia: a study of collagen distribution.

    PubMed

    Phadnis, S V; Atilade, A; Bowring, J; Kyrgiou, M; Young, M P A; Evans, H; Paraskevaidis, E; Walker, P

    2011-12-01

    To study the distribution of collagen in the regenerated cervical tissue after excisional treatment for cervical intraepithelial neoplasia (CIN). Cohort study. A large tertiary teaching hospital in London. Women who underwent repeat excisional treatment for treatment failure or persistent CIN. Eligible women who underwent a repeat excisional treatment for treatment failure, including hysterectomy, between January 2002 and December 2007 in our colposcopy unit were identified through the Infoflex® database and the SNOMED-coded histopathology database. Collagen expression was assessed using picro-Sirius red stain, and the intensity of staining was compared in paired specimens from the first and second treatments. Differences in collagen expression were examined in the paired excisional treatment specimens. A total of 17 women were included. Increased collagen expression in the regenerated cervical tissue of the second cone compared with the first cone was noted in six women, decreased expression was noted in five women, and the pattern of collagen distribution was equivocal in six women. There is no overall change in collagen distribution during regeneration following excisional treatment for CIN. © 2011 The Authors BJOG An International Journal of Obstetrics and Gynaecology © 2011 RCOG.

  7. Designing and Implementing a Distributed System Architecture for the Mars Rover Mission Planning Software (Maestro)

    NASA Technical Reports Server (NTRS)

    Goldgof, Gregory M.

    2005-01-01

    Distributed systems allow scientists from around the world to plan missions concurrently, while being updated on the revisions of their colleagues in real time. However, permitting multiple clients to simultaneously modify a single data repository can quickly lead to data corruption or inconsistent states between users. Since our message broker, the Java Message Service, does not ensure that messages will be received in the order they were published, we must implement our own numbering scheme to guarantee that changes to mission plans are performed in the correct sequence. Furthermore, distributed architectures must ensure that as new users connect to the system, they synchronize with the database without missing any messages or falling into an inconsistent state. Robust systems must also guarantee that all clients will remain synchronized with the database even in the case of multiple client failure, which can occur at any time due to lost network connections or a user's own system instability. The final design for the distributed system behind the Mars rover mission planning software fulfills all of these requirements and upon completion will be deployed to MER at the end of 2005 as well as Phoenix (2007) and MSL (2009).
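    The numbering scheme described above can be sketched as follows (an illustration of the idea, not the Maestro source): each client applies changes strictly in sequence order and treats a held-back message as evidence of a gap that requires resynchronization.

    ```python
    class PlanReplica:
        def __init__(self):
            self.next_seq = 1
            self.pending = {}     # out-of-order messages held back
            self.applied = []

        def on_message(self, seq, change):
            self.pending[seq] = change
            while self.next_seq in self.pending:   # apply any in-order run
                self.applied.append(self.pending.pop(self.next_seq))
                self.next_seq += 1

        def missing(self):
            """A held-back message implies a lost or reordered one."""
            return bool(self.pending)

    r = PlanReplica()
    for seq, change in [(1, "add-obs"), (3, "edit-obs")]:  # msg 2 not seen
        r.on_message(seq, change)
    print(r.applied, r.missing())   # ['add-obs'] True -> resync needed
    ```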

  8. Adaptation of Decoy Fusion Strategy for Existing Multi-Stage Search Workflows

    NASA Astrophysics Data System (ADS)

    Ivanov, Mark V.; Levitsky, Lev I.; Gorshkov, Mikhail V.

    2016-09-01

    A number of proteomic database search engines implement multi-stage strategies aimed at increasing the sensitivity of proteome analysis. These approaches often employ a subset of the original database for the secondary stage of analysis. However, if the target-decoy approach (TDA) is used for false discovery rate (FDR) estimation, multi-stage strategies may violate the underlying assumption of TDA that false matches are distributed uniformly across the target and decoy databases. This violation occurs if the numbers of target and decoy proteins selected for the second search are not equal. Here, we propose a method of decoy database generation based on the previously reported decoy fusion strategy. This method allows unbiased TDA-based FDR estimation in multi-stage searches and can be easily integrated into existing workflows utilizing popular search engines and post-search algorithms.
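    For context, the standard TDA estimate that decoy fusion is designed to preserve can be computed in a few lines (the generic technique, not the paper's variant): decoy matches above a score threshold estimate the false matches among targets.

    ```python
    def tda_fdr(target_scores, decoy_scores, threshold):
        """FDR estimate with equal-sized target and decoy databases."""
        t = sum(s >= threshold for s in target_scores)
        d = sum(s >= threshold for s in decoy_scores)
        return d / t if t else 0.0

    targets = [42.1, 37.5, 35.0, 22.4, 21.9, 20.2]
    decoys  = [23.0, 19.5, 12.1]
    print(tda_fdr(targets, decoys, threshold=21.0))  # 1 decoy / 5 targets = 0.2
    ```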

  9. Development of a database system for operational use in the selection of titanium alloys

    NASA Astrophysics Data System (ADS)

    Han, Yuan-Fei; Zeng, Wei-Dong; Sun, Yu; Zhao, Yong-Qing

    2011-08-01

    The selection of titanium alloys has become a complex decision-making task due to the growing number of titanium alloys being created and utilized, each having its own characteristics, advantages, and limitations. In choosing the most appropriate titanium alloy, it is essential to offer a reasonable and intelligent service to technical engineers. One possible solution to this problem is to develop a database system (DS) to help retrieve rational proposals from different databases and information sources and analyze them to provide useful and explicit information. For this purpose, a design strategy based on fuzzy set theory is proposed, and a distributed database system is developed. Through ranking of the candidate titanium alloys, the most suitable material is determined. The selection results are found to be in good agreement with the practical situation.
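    The ranking step can be illustrated with a weighted fuzzy-membership score (the alloy names are real designations, but all property values, criteria and weights here are invented):

    ```python
    import numpy as np

    alloys = ["Ti-6Al-4V", "Ti-5553", "TA15"]
    # columns: strength (MPa, higher is better), density (g/cm3, lower is better)
    props = np.array([[950., 4.43], [1200., 4.65], [980., 4.45]])
    weights = np.array([0.7, 0.3])

    score = np.empty_like(props)
    score[:, 0] = props[:, 0] / props[:, 0].max()   # benefit criterion
    score[:, 1] = props[:, 1].min() / props[:, 1]   # cost criterion

    for name, s in sorted(zip(alloys, score @ weights), key=lambda t: -t[1]):
        print(f"{name}: {s:.3f}")
    ```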

  10. Geographical Distribution of Woody Biomass Carbon in Tropical Africa: An Updated Database for 2000 (NDP-055.2007, NDP-055b))

    DOE Data Explorer

    Gibbs, Holly K. [Center for Sustainability and the Global Environment (SAGE), University of Wisconsin, Madison, WI (USA); Brown, Sandra [Winrock International, Arlington, VA (USA); Olsen, L. M. [Carbon Dioxide Information Analysis Center (CDIAC), Oak Ridge National Laboratory, Oak Ridge, TN (USA); Boden, Thomas A. [Carbon Dioxide Information Analysis Center (CDIAC), Oak Ridge National Laboratory, Oak Ridge, TN (USA)

    2007-09-01

    Maps of biomass density are critical inputs for estimating carbon emissions from deforestation and degradation of tropical forests. Brown and Gaston (1996) pioneered methods using GIS analysis to map forest biomass based on forest inventory data (NDP-055). This database is an update of NDP-055 (which represents conditions circa 1980) and accounts for land cover changes occurring up to the year 2000.

  11. Semantic encoding of relational databases in wireless networks

    NASA Astrophysics Data System (ADS)

    Benjamin, David P.; Walker, Adrian

    2005-03-01

    Semantic Encoding is a new, patented technology that greatly increases the speed of transmission of distributed databases over networks, especially over ad hoc wireless networks, while providing a novel method of data security. It reduces bandwidth consumption and storage requirements, while speeding up query processing, encryption and computation of digital signatures. We describe the application of Semantic Encoding in a wireless setting and provide an example of its operation in which a compression of 290:1 would be achieved.

  12. Quaternary Geology and Liquefaction Susceptibility, San Francisco, California 1:100,000 Quadrangle: A Digital Database

    USGS Publications Warehouse

    Knudsen, Keith L.; Noller, Jay S.; Sowers, Janet M.; Lettis, William R.

    1997-01-01

    This Open-File report is a digital geologic map database. This pamphlet serves to introduce and describe the digital data. There are no paper maps included in the Open-File report. The report does include, however, PostScript plot files containing the images of the geologic map sheets with explanations, as well as the accompanying text describing the geology of the area. For those interested in a paper plot of information contained in the database or in obtaining the PostScript plot files, please see the section entitled 'For Those Who Aren't Familiar With Digital Geologic Map Databases' below. This digital map database, compiled from previously unpublished data and new mapping by the authors, represents the general distribution of surficial deposits in the San Francisco Bay region. Together with the accompanying text file (sf_geo.txt or sf_geo.pdf), it provides current information on Quaternary geology and liquefaction susceptibility of the San Francisco, California, 1:100,000 quadrangle. The database delineates map units that are identified by general age and lithology following the stratigraphic nomenclature of the U.S. Geological Survey. The scale of the source maps limits the spatial resolution (scale) of the database to 1:100,000 or smaller. The content and character of the database, as well as three methods of obtaining the database, are described below.

  13. A Data Analysis Expert System For Large Established Distributed Databases

    NASA Astrophysics Data System (ADS)

    Gnacek, Anne-Marie; An, Y. Kim; Ryan, J. Patrick

    1987-05-01

    The purpose of this work is to analyze the applicability of artificial intelligence techniques for developing a user-friendly, parallel interface to large isolated, incompatible NASA databases for the purpose of assisting the management decision process. To carry out this work, a survey was conducted to establish the data access requirements of several key NASA user groups. In addition, current NASA database access methods were evaluated. The results of this work are presented in the form of a design for a natural language database interface system, called the Deductively Augmented NASA Management Decision Support System (DANMDS). This design is feasible principally because of recently announced commercial hardware and software product developments which allow cross-vendor compatibility. The goal of the DANMDS system is commensurate with the central dilemma confronting most large companies and institutions in America, the retrieval of information from large, established, incompatible database systems. The DANMDS system implementation would represent a significant first step toward this problem's resolution.

  14. A High-Resolution LC-MS-Based Secondary Metabolite Fingerprint Database of Marine Bacteria

    PubMed Central

    Lu, Liang; Wang, Jijie; Xu, Ying; Wang, Kailing; Hu, Yingwei; Tian, Renmao; Yang, Bo; Lai, Qiliang; Li, Yongxin; Zhang, Weipeng; Shao, Zongze; Lam, Henry; Qian, Pei-Yuan

    2014-01-01

    Marine bacteria are the most widely distributed organisms in the ocean environment and produce a wide variety of secondary metabolites. However, traditional screening for bioactive natural compounds is greatly hindered by the lack of a systematic way of cataloguing the chemical profiles of bacterial strains found in nature. Here we present a chemical fingerprint database of marine bacteria based on their secondary metabolite profiles, acquired by high-resolution LC-MS. To date, 1,430 bacterial strains spanning 168 known species collected from different marine environments have been cultured and profiled. Using this database, we demonstrated that secondary metabolite profile similarity is approximately, but not always, correlated with taxonomic similarity. We also validated the ability of this database to find species-specific metabolites, as well as to discover known bioactive compounds from previously unknown sources. An online interface to this database, as well as the accompanying software, is provided freely for the community to use. PMID:25298017
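    A toy version of comparing two strains' metabolite fingerprints (values invented; the database's actual processing is more involved) bins LC-MS features onto a common m/z grid and scores cosine similarity:

    ```python
    import numpy as np

    def binned(features, mz_min=100.0, mz_max=1000.0, width=1.0):
        """Accumulate (m/z, intensity) features into fixed-width bins."""
        vec = np.zeros(int((mz_max - mz_min) / width))
        for mz, intensity in features:
            vec[int((mz - mz_min) / width)] += intensity
        return vec

    strain1 = [(231.1, 5e4), (455.3, 2e4), (612.8, 9e3)]   # (m/z, intensity)
    strain2 = [(231.2, 4e4), (455.2, 3e4), (890.5, 1e4)]

    a, b = binned(strain1), binned(strain2)
    print(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))  # cosine similarity
    ```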

  15. Gramene database in 2010: updates and extensions.

    PubMed

    Youens-Clark, Ken; Buckler, Ed; Casstevens, Terry; Chen, Charles; Declerck, Genevieve; Derwent, Paul; Dharmawardhana, Palitha; Jaiswal, Pankaj; Kersey, Paul; Karthikeyan, A S; Lu, Jerry; McCouch, Susan R; Ren, Liya; Spooner, William; Stein, Joshua C; Thomason, Jim; Wei, Sharon; Ware, Doreen

    2011-01-01

    Now in its 10th year, the Gramene database (http://www.gramene.org) has grown from its primary focus on rice, the first fully-sequenced grass genome, to become a resource for major model and crop plants including Arabidopsis, Brachypodium, maize, sorghum, poplar and grape in addition to several species of rice. Gramene began with the addition of an Ensembl genome browser and has expanded in the last decade to become a robust resource for plant genomics hosting a wide array of data sets including quantitative trait loci (QTL), metabolic pathways, genetic diversity, genes, proteins, germplasm, literature, ontologies and a fully-structured markers and sequences database integrated with genome browsers and maps from various published studies (genetic, physical, bin, etc.). In addition, Gramene now hosts a variety of web services including a Distributed Annotation Server (DAS), BLAST and a public MySQL database. Twice a year, Gramene releases a major build of the database and makes interim releases to correct errors or to make important updates to software and/or data.

  16. Constructing a Graph Database for Semantic Literature-Based Discovery.

    PubMed

    Hristovski, Dimitar; Kastrin, Andrej; Dinevski, Dejan; Rindflesch, Thomas C

    2015-01-01

    Literature-based discovery (LBD) generates discoveries, or hypotheses, by combining what is already known in the literature. Potential discoveries have the form of relations between biomedical concepts; for example, a drug may be determined to treat a disease other than the one for which it was intended. LBD views the knowledge in a domain as a network; a set of concepts along with the relations between them. As a starting point, we used SemMedDB, a database of semantic relations between biomedical concepts extracted with SemRep from Medline. SemMedDB is distributed as a MySQL relational database, which has some problems when dealing with network data. We transformed and uploaded SemMedDB into the Neo4j graph database, and implemented the basic LBD discovery algorithms with the Cypher query language. We conclude that storing the data needed for semantic LBD is more natural in a graph database. Also, implementing LBD discovery algorithms is conceptually simpler with a graph query language when compared with standard SQL.
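    An open-discovery query of the classic ABC type is natural to express in Cypher. The node labels and relationship types below are illustrative placeholders rather than SemMedDB's actual schema, and the snippet assumes a reachable Neo4j server:

    ```python
    from neo4j import GraphDatabase

    # Find candidate concepts c linked to drug a only through some
    # intermediate concept b (no direct a-c relation known yet).
    QUERY = """
    MATCH (a:Concept {name: $drug})-[:AFFECTS]->(b:Concept)
          -[:ASSOCIATED_WITH]->(c:Concept)
    WHERE NOT (a)-[:TREATS]->(c)
    RETURN c.name AS candidate, count(b) AS support
    ORDER BY support DESC LIMIT 10
    """

    driver = GraphDatabase.driver("bolt://localhost:7687",
                                  auth=("neo4j", "password"))
    with driver.session() as session:
        for record in session.run(QUERY, drug="metformin"):
            print(record["candidate"], record["support"])
    driver.close()
    ```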

  17. High-quality unsaturated zone hydraulic property data for hydrologic applications

    USGS Publications Warehouse

    Perkins, Kimberlie; Nimmo, John R.

    2009-01-01

    In hydrologic studies, especially those using dynamic unsaturated zone moisture modeling, calculations based on property transfer models informed by hydraulic property databases are often used in lieu of measured data from the site of interest. Reliance on database-informed predicted values has become increasingly common with the use of neural networks. High-quality data are needed for databases used in this way and for theoretical and property transfer model development and testing. Hydraulic properties predicted on the basis of existing databases may be adequate in some applications but not others. An obvious problem occurs when the available database has few or no data for samples that are closely related to the medium of interest. The dataset presented in this paper includes saturated and unsaturated hydraulic conductivity, water retention, particle-size distributions, and bulk properties. All samples are minimally disturbed, all measurements were performed using the same state-of-the-art techniques, and the environments represented are diverse.

  18. RDIS: The Rabies Disease Information System.

    PubMed

    Dharmalingam, Baskeran; Jothi, Lydia

    2015-01-01

    Rabies is a deadly viral disease causing acute inflammation or encephalitis of the brain in human beings and other mammals. It is therefore of interest to collect information related to the disease from several sources, including known literature databases, for further analysis and interpretation. Hence, we describe the development of a database called the Rabies Disease Information System (RDIS) for this purpose. The online database describes the etiology, epidemiology, pathogenesis and pathology of the disease using diagrammatic representations. It provides information on several carriers of the rabies virus, such as dogs, bats, foxes and civets, and their distributions around the world. Information related to the urban and sylvatic cycles of transmission of the virus is also made available. The database also contains information related to available diagnostic methods and vaccines for humans and other animals. This information is of use to medical, veterinary and paramedical practitioners, students, researchers, pet owners, animal lovers, livestock handlers, travelers and many others. The database is available free of charge at http://rabies.mscwbif.org/home.html.

  19. GlycomeDB – integration of open-access carbohydrate structure databases

    PubMed Central

    Ranzinger, René; Herget, Stephan; Wetter, Thomas; von der Lieth, Claus-Wilhelm

    2008-01-01

    Background Although carbohydrates are the third major class of biological macromolecules, after proteins and DNA, there is neither a comprehensive database for carbohydrate structures nor an established universal structure encoding scheme for computational purposes. Funding for further development of the Complex Carbohydrate Structure Database (CCSD or CarbBank) ceased in 1997, and since then several initiatives have developed independent databases with partially overlapping foci. For each database, different encoding schemes for residues and sequence topology were designed. Therefore, it is virtually impossible to obtain an overview of all deposited structures or to compare the contents of the various databases. Results We have implemented procedures which download the structures contained in the seven major databases, e.g. GLYCOSCIENCES.de, the Consortium for Functional Glycomics (CFG), the Kyoto Encyclopedia of Genes and Genomes (KEGG) and the Bacterial Carbohydrate Structure Database (BCSDB). We have created a new database called GlycomeDB, containing all structures, their taxonomic annotations and references (IDs) for the original databases. More than 100000 datasets were imported, resulting in more than 33000 unique sequences now encoded in GlycomeDB using the universal format GlycoCT. Inconsistencies were found in all public databases, which were discussed and corrected in multiple feedback rounds with the responsible curators. Conclusion GlycomeDB is a new, publicly available database for carbohydrate sequences with a unified, all-encompassing structure encoding format and NCBI taxonomic referencing. The database is updated weekly and can be downloaded free of charge. The JAVA application GlycoUpdateDB is also available for establishing and updating a local installation of GlycomeDB. With the advent of GlycomeDB, the distributed islands of knowledge in glycomics are now bridged to form a single resource. PMID:18803830

  20. Using Large Diabetes Databases for Research.

    PubMed

    Wild, Sarah; Fischbacher, Colin; McKnight, John

    2016-09-01

    There are an increasing number of clinical, administrative and trial databases that can be used for research. These are particularly valuable if there are opportunities for linkage to other databases. This paper describes examples of the use of large diabetes databases for research. It reviews the advantages and disadvantages of using large diabetes databases for research and suggests solutions for some challenges. Large, high-quality databases offer potential sources of information for research at relatively low cost. Fundamental issues for using databases for research are the completeness of capture of cases within the population and time period of interest and accuracy of the diagnosis of diabetes and outcomes of interest. The extent to which people included in the database are representative should be considered if the database is not population based and there is the intention to extrapolate findings to the wider diabetes population. Information on key variables such as date of diagnosis or duration of diabetes may not be available at all, may be inaccurate or may contain a large amount of missing data. Information on key confounding factors is rarely available for the nondiabetic or general population limiting comparisons with the population of people with diabetes. However comparisons that allow for differences in distribution of important demographic factors may be feasible using data for the whole population or a matched cohort study design. In summary, diabetes databases can be used to address important research questions. Understanding the strengths and limitations of this approach is crucial to interpret the findings appropriately. © 2016 Diabetes Technology Society.

  1. The Protein Information Resource: an integrated public resource of functional annotation of proteins

    PubMed Central

    Wu, Cathy H.; Huang, Hongzhan; Arminski, Leslie; Castro-Alvear, Jorge; Chen, Yongxing; Hu, Zhang-Zhi; Ledley, Robert S.; Lewis, Kali C.; Mewes, Hans-Werner; Orcutt, Bruce C.; Suzek, Baris E.; Tsugita, Akira; Vinayaka, C. R.; Yeh, Lai-Su L.; Zhang, Jian; Barker, Winona C.

    2002-01-01

    The Protein Information Resource (PIR) serves as an integrated public resource of functional annotation of protein data to support genomic/proteomic research and scientific discovery. The PIR, in collaboration with the Munich Information Center for Protein Sequences (MIPS) and the Japan International Protein Information Database (JIPID), produces the PIR-International Protein Sequence Database (PSD), the major annotated protein sequence database in the public domain, containing about 250 000 proteins. To improve protein annotation and the coverage of experimentally validated data, a bibliography submission system is developed for scientists to submit, categorize and retrieve literature information. Comprehensive protein information is available from iProClass, which includes family classification at the superfamily, domain and motif levels, structural and functional features of proteins, as well as cross-references to over 40 biological databases. To provide timely and comprehensive protein data with source attribution, we have introduced a non-redundant reference protein database, PIR-NREF. The database consists of about 800 000 proteins collected from PIR-PSD, SWISS-PROT, TrEMBL, GenPept, RefSeq and PDB, with composite protein names and literature data. To promote database interoperability, we provide XML data distribution and open database schema, and adopt common ontologies. The PIR web site (http://pir.georgetown.edu/) features data mining and sequence analysis tools for information retrieval and functional identification of proteins based on both sequence and annotation information. The PIR databases and other files are also available by FTP (ftp://nbrfa.georgetown.edu/pir_databases). PMID:11752247

  2. Study on parallel and distributed management of RS data based on spatial database

    NASA Astrophysics Data System (ADS)

    Chen, Yingbiao; Qian, Qinglan; Wu, Hongqiao; Liu, Shijin

    2009-10-01

    With the rapid development of current earth-observing technology, RS image data storage, management and information publication have become a bottleneck for its application and popularization. There are two prominent problems in RS image data storage and management systems. First, a single background server can hardly handle the heavy processing load imposed by the great volume of RS data stored at different nodes in a distributed environment, so a tough burden is placed on the background server. Second, there is no unique, standard and rational organization of multi-sensor RS data for its storage and management, and much information is lost or omitted at storage time. Facing these two problems, the paper puts forward a framework for a parallel and distributed RS image data management and storage system. This system aims at an RS data information system based on a parallel background server and a distributed data management system. Toward these two goals, this paper studies the following key techniques and draws some instructive conclusions. The paper puts forward a solid index of "Pyramid, Block, Layer, Epoch" according to the properties of RS image data. With this solid index mechanism, a rational organization of multi-sensor RS image data across different resolutions, areas, bands and periods is achieved. For data storage, RS data is not divided into binary large objects stored in a conventional relational database system; instead, it is reconstructed through the above solid index mechanism, and a logical image database for the RS image data files is constructed. In system architecture, this paper sets up a framework based on a parallel server composed of several common computers. Under this framework, the background process is divided into two parts: the common web process and the parallel process.
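
    For illustration only, a minimal Python sketch of how a composite "Pyramid, Block, Layer, Epoch" key might organize such a logical image database; all names and paths here are hypothetical, not taken from the paper:

      from dataclasses import dataclass

      @dataclass(frozen=True)
      class TileKey:
          pyramid_level: int    # resolution level in the image pyramid
          block: tuple          # (row, col) of the spatial tile at this level
          layer: str            # spectral band or product layer
          epoch: str            # acquisition period, e.g. "2009-10"

      # The logical image database maps keys to file locations on storage nodes,
      # so a request for one band/area/period touches only the tiles it needs.
      catalog = {
          TileKey(3, (12, 7), "NIR", "2009-10"): "//node02/rs/l3_012_007_nir.tif",
      }

      def lookup(level, block, layer, epoch):
          return catalog.get(TileKey(level, block, layer, epoch))

    Under such a scheme, parallel worker processes can be assigned disjoint sets of keys, which is one plausible way to spread processing across the several common computers mentioned above.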

  3. Patterns, biases and prospects in the distribution and diversity of Neotropical snakes

    PubMed Central

    Sawaya, Ricardo J.; Zizka, Alexander; Laffan, Shawn; Faurby, Søren; Pyron, R. Alexander; Bérnils, Renato S.; Jansen, Martin; Passos, Paulo; Prudente, Ana L. C.; Cisneros‐Heredia, Diego F.; Braz, Henrique B.; Nogueira, Cristiano de C.; Antonelli, Alexandre; Meiri, Shai

    2017-01-01

    Motivation: We generated a novel database of Neotropical snakes (one of the world's richest herpetofaunas) combining the most comprehensive, manually compiled distribution dataset with publicly available data. We assess, for the first time, the diversity patterns for all Neotropical snakes as well as sampling density and sampling biases. Main types of variables contained: We compiled three databases of species occurrences: a dataset downloaded from the Global Biodiversity Information Facility (GBIF), a verified dataset built through taxonomic work and specialized literature, and a combined dataset comprising a cleaned version of the GBIF dataset merged with the verified dataset. Spatial location and grain: Neotropics, Behrmann projection equivalent to 1° × 1°. Time period: Specimens housed in museums during the last 150 years. Major taxa studied: Squamata: Serpentes. Software format: Geographical information system (GIS). Results: The combined dataset provides the most comprehensive distribution database for Neotropical snakes to date. It contains 147,515 records for 886 species across 12 families, representing 74% of all species of snakes, spanning 27 countries in the Americas. Species richness and phylogenetic diversity show overall similar patterns. Amazonia is the least sampled Neotropical region, whereas most well-sampled sites are located near large universities and scientific collections. We provide a list and updated maps of the geographical distribution of all snake species surveyed. Main conclusions: The biodiversity metrics of Neotropical snakes reflect patterns previously documented for other vertebrates, suggesting that similar factors may determine the diversity of both ectothermic and endothermic animals. We suggest conservation strategies for high-diversity areas and that sampling efforts be directed towards Amazonia and poorly known species. PMID:29398972

  4. [Integrated use of data bases to map manufacturing processes involving exposure to carcinogens in the Piedmont Region: the example of formaldehyde].

    PubMed

    Falcone, U; Gilardi, Luisella; Pasqualini, O; Santoro, S; Coffano, Elena

    2010-01-01

    Exposure to carcinogens is still widespread in working environments. For the purpose of defining priority of interventions, it is necessary to estimate the number and the geographic distribution of workers potentially exposed to carcinogens. It could therefore be useful to test the use of tools and information sources already available in order to map the distribution of exposure to carcinogens. Formaldehyde is suggested as an example of an occupational carcinogen in this study. The study aimed at verifying and investigating the potential of 3 integrated databases: MATline, CAREX, and company databases resulting from occupational accident and disease claims (INAIL), in order to estimate the number of workers exposed to formaldehyde and map their distribution in the Piedmont Region. The list of manufacturing processes involving exposure to formaldehyde was sorted by MATline; for each process the number of firms and employees were obtained from the INAIL archives. By applying the prevalence of exposed workers obtained with CAREX, an estimate of exposure for each process was determined. A map of the distribution of employees associated with a specific process was produced using ArcView GIS software. It was estimated that more than 13,000 employees are exposed to formaldehyde in the Piedmont Region. The manufacture of furniture was identified as the process with the highest number of workers exposed to formaldehyde (3,130), followed by metal workers (2,301 exposed) and synthetic resin processing (1,391 exposed). The results obtained from the integrated use of databases provide a basis for defining the priority of preventive interventions required in the industrial processes involving exposure to carcinogens in the Piedmont Region.
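
    The estimation step is a simple product of employment counts and exposure prevalences per process. A Python sketch of the arithmetic follows; the employee counts and prevalence values are invented here so that the products match the reported estimates, and are not the study's actual inputs:

      processes = {
          # process: (employees_from_INAIL, CAREX_exposure_prevalence) -- hypothetical
          "furniture manufacture":      (12520, 0.25),
          "metal working":              (23010, 0.10),
          "synthetic resin processing": ( 2782, 0.50),
      }

      estimates = {name: round(n * p) for name, (n, p) in processes.items()}
      for name, exposed in sorted(estimates.items(), key=lambda kv: -kv[1]):
          print(f"{name}: ~{exposed} exposed workers")
      print("total:", sum(estimates.values()))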

  5. Filling in the GAPS: evaluating completeness and coverage of open-access biodiversity databases in the United States

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Troia, Matthew J.; McManamay, Ryan A.

    Primary biodiversity data constitute observations of particular species at given points in time and space. Open-access electronic databases provide unprecedented access to these data, but their usefulness in characterizing species distributions and patterns in biodiversity depends on how complete species inventories are at a given survey location and how uniformly distributed survey locations are along dimensions of time, space, and environment. Our aim was to compare completeness and coverage among three open-access databases representing ten taxonomic groups (amphibians, birds, freshwater bivalves, crayfish, freshwater fish, fungi, insects, mammals, plants, and reptiles) in the contiguous United States. We compiled occurrence records from the Global Biodiversity Information Facility (GBIF), the North American Breeding Bird Survey (BBS), and federally administered fish surveys (FFS). In this study, we aggregated occurrence records by 0.1° × 0.1° grid cells and computed three completeness metrics to classify each grid cell as well-surveyed or not. Next, we compared frequency distributions of surveyed grid cells to background environmental conditions in a GIS and performed Kolmogorov–Smirnov tests to quantify coverage through time, along two spatial gradients, and along eight environmental gradients. The three databases contributed >13.6 million reliable occurrence records distributed among >190,000 grid cells. The percent of well-surveyed grid cells was substantially lower for GBIF (5.2%) than for systematic surveys (BBS and FFS; 82.5%). Still, the large number of GBIF occurrence records produced at least 250 well-surveyed grid cells for six of nine taxonomic groups. Coverages of systematic surveys were less biased across spatial and environmental dimensions but were more biased in temporal coverage compared to GBIF data. GBIF coverages also varied among taxonomic groups, consistent with commonly recognized geographic, environmental, and institutional sampling biases. Lastly, this comprehensive assessment of biodiversity data across the contiguous United States provides a prioritization scheme to fill in the gaps by contributing existing occurrence records to the public domain and planning future surveys.

  6. Filling in the GAPS: evaluating completeness and coverage of open-access biodiversity databases in the United States

    DOE PAGES

    Troia, Matthew J.; McManamay, Ryan A.

    2016-06-12

    Primary biodiversity data constitute observations of particular species at given points in time and space. Open-access electronic databases provide unprecedented access to these data, but their usefulness in characterizing species distributions and patterns in biodiversity depends on how complete species inventories are at a given survey location and how uniformly distributed survey locations are along dimensions of time, space, and environment. Our aim was to compare completeness and coverage among three open-access databases representing ten taxonomic groups (amphibians, birds, freshwater bivalves, crayfish, freshwater fish, fungi, insects, mammals, plants, and reptiles) in the contiguous United States. We compiled occurrence records from the Global Biodiversity Information Facility (GBIF), the North American Breeding Bird Survey (BBS), and federally administered fish surveys (FFS). In this study, we aggregated occurrence records by 0.1° × 0.1° grid cells and computed three completeness metrics to classify each grid cell as well-surveyed or not. Next, we compared frequency distributions of surveyed grid cells to background environmental conditions in a GIS and performed Kolmogorov–Smirnov tests to quantify coverage through time, along two spatial gradients, and along eight environmental gradients. The three databases contributed >13.6 million reliable occurrence records distributed among >190,000 grid cells. The percent of well-surveyed grid cells was substantially lower for GBIF (5.2%) than for systematic surveys (BBS and FFS; 82.5%). Still, the large number of GBIF occurrence records produced at least 250 well-surveyed grid cells for six of nine taxonomic groups. Coverages of systematic surveys were less biased across spatial and environmental dimensions but were more biased in temporal coverage compared to GBIF data. GBIF coverages also varied among taxonomic groups, consistent with commonly recognized geographic, environmental, and institutional sampling biases. Lastly, this comprehensive assessment of biodiversity data across the contiguous United States provides a prioritization scheme to fill in the gaps by contributing existing occurrence records to the public domain and planning future surveys.
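
    As a rough illustration of the gridding step shared by both versions of this study, the Python sketch below snaps occurrence records to 0.1° × 0.1° cells and applies a single naive completeness proxy; the paper uses three formal completeness metrics, and the thresholds here are made up:

      import math
      from collections import defaultdict

      def grid_cell(lon, lat, size=0.1):
          # Snap a coordinate to the lower-left corner of its grid cell.
          return (math.floor(lon / size) * size, math.floor(lat / size) * size)

      def well_surveyed_cells(records, min_records=50, min_species=10):
          # records: iterable of (species, lon, lat)
          by_cell = defaultdict(list)
          for species, lon, lat in records:
              by_cell[grid_cell(lon, lat)].append(species)
          return {cell for cell, sp in by_cell.items()
                  if len(sp) >= min_records and len(set(sp)) >= min_species}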

  7. Using CLIPS in a distributed system: The Network Control Center (NCC) expert system

    NASA Technical Reports Server (NTRS)

    Wannemacher, Tom

    1990-01-01

    This paper describes an intelligent troubleshooting system for the Help Desk domain. It was developed on an IBM-compatible 80286 PC using Microsoft C and CLIPS, and on an AT&T 3B2 minicomputer using the UNIFY database and a combination of shell scripts, C programs and SQL queries. The two computers are linked by a LAN. The functions of this system are to help non-technical NCC personnel handle trouble calls, to keep a log of problem calls with complete, concise information, and to keep a historical database of problems. The database helps identify hardware and software problem areas and provides a source of new rules for the troubleshooting knowledge base.

  8. A global organism detection and monitoring system for non-native species

    USGS Publications Warehouse

    Graham, J.; Newman, G.; Jarnevich, C.; Shory, R.; Stohlgren, T.J.

    2007-01-01

    Harmful invasive non-native species are a significant threat to native species and ecosystems, and the costs associated with non-native species in the United States are estimated at over $120 billion/year. While some local or regional databases exist for some taxonomic groups, there are no effective geographic databases designed to detect and monitor all species of non-native plants, animals, and pathogens. We developed a web-based solution called the Global Organism Detection and Monitoring (GODM) system to provide real-time data from a broad spectrum of users on the distribution and abundance of non-native species, including attributes of their habitats for predictive spatial modeling of current and potential distributions. The four major subsystems of GODM provide dynamic links between the organism data, web pages, spatial data, and modeling capabilities. The core survey database tables for recording invasive species survey data are organized into three categories: "Where, Who & When, and What." Organisms are identified with Taxonomic Serial Numbers from the Integrated Taxonomic Information System. To allow users to immediately see a map of their data combined with other users' data, a custom geographic information system (GIS) Internet solution was required. The GIS solution provides an unprecedented level of flexibility in database access, allowing users to display maps of invasive species distributions or abundances based on various criteria, including taxonomic classification (i.e., phylum or division, order, class, family, genus, species, subspecies, and variety), a specific project, a range of dates, and a range of attributes (percent cover, age, height, sex, weight). This is a significant paradigm shift from "map servers" to true Internet-based GIS solutions. The remainder of the system was created with a mix of commercial products, open source software, and custom software. Custom GIS libraries were created where required for processing large datasets, accessing the operating system, and using existing libraries in C++, R, and other languages to develop the tools to track harmful species in space and time. The GODM database and system are crucial for early detection and rapid containment of invasive species. © 2007 Elsevier B.V. All rights reserved.
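
    A hedged sketch of how the "Where, Who & When, What" organization might look as record types; the field names are illustrative assumptions, not the actual GODM schema:

      from dataclasses import dataclass
      from datetime import date

      @dataclass
      class Where:
          lon: float
          lat: float
          habitat_notes: str = ""

      @dataclass
      class WhoWhen:
          observer: str
          observed_on: date
          project: str = ""

      @dataclass
      class What:
          itis_tsn: int            # ITIS Taxonomic Serial Number
          percent_cover: float = 0.0
          abundance: int = 0

      @dataclass
      class SurveyRecord:
          where: Where
          who_when: WhoWhen
          what: What

    Keying the "What" part on ITIS serial numbers, as the paper describes, lets queries roll records up to any taxonomic rank without storing names redundantly.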

  9. Spatial distribution of GRBs and large scale structure of the Universe

    NASA Astrophysics Data System (ADS)

    Bagoly, Zsolt; Rácz, István I.; Balázs, Lajos G.; Tóth, L. Viktor; Horváth, István

    We studied the spatial distribution of starburst galaxies from the Millennium XXL database at z = 0.82. We examined the starburst distribution in the classical Millennium I simulation (De Lucia et al. 2006), using a semi-analytical model for the genesis of the galaxies. We simulated a starburst galaxy sample with a Markov chain Monte Carlo method. The connection between the homogeneity of the large-scale structure and the distribution of starburst groups (Kofman and Shandarin 1998; Suhhonenko et al. 2011; Liivamägi et al. 2012; Park et al. 2012; Horvath et al. 2014; Horvath et al. 2015) was also checked on a defined scale.

  10. Neyman Pearson detection of K-distributed random variables

    NASA Astrophysics Data System (ADS)

    Tucker, J. Derek; Azimi-Sadjadi, Mahmood R.

    2010-04-01

    In this paper a new detection method for sonar imagery is developed for K-distributed background clutter. The equation for the log-likelihood is derived and compared to the corresponding counterparts derived under the Gaussian and Rayleigh assumptions. Test results of the proposed method on a data set of synthetic underwater sonar images are also presented. This database contains images with targets of different shapes inserted into backgrounds generated using a correlated K-distributed model. Results illustrating the effectiveness of the K-distributed detector are presented in terms of probability of detection, false alarm, and correct classification rates for various bottom clutter scenarios.
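
    For reference, one common parameterization of the K-distribution amplitude density (the paper's exact form may differ) is

      f(x) = \frac{2b}{\Gamma(\nu)} \left(\frac{bx}{2}\right)^{\nu} K_{\nu-1}(bx), \qquad x > 0,

    where \nu is the shape parameter, b a scale parameter, and K_{\nu-1} the modified Bessel function of the second kind, so the log-likelihood of a sample x_1, \ldots, x_n is

      \ell(\nu, b) = \sum_{i=1}^{n} \left[ \ln\frac{2b}{\Gamma(\nu)} + \nu \ln\frac{b x_i}{2} + \ln K_{\nu-1}(b x_i) \right].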

  11. Efficiently Distributing Component-based Applications Across Wide-Area Environments

    DTIC Science & Technology

    2002-01-01

    a variety of sophisticated network-accessible services such as e-mail, banking, on-line shopping, entertainment, and serving as a data exchange...product database; Customer: serves as a façade to Order and Account (stateful session beans); ShoppingCart: maintains the list of items to be bought by a customer...Pet Store tests; and JBoss 3.0.3 with Jetty 4.1.0, for the RUBiS tests) and a single database server (Oracle 8.1.7 Enterprise Edition), each running

  12. Optimizing the NASA Technical Report Server

    NASA Technical Reports Server (NTRS)

    Nelson, Michael L.; Maa, Ming-Hokng

    1996-01-01

    The NASA Technical Report Server (NTRS), a World Wide Web-based distribution service for NASA technical publications, is modified for performance enhancement, greater protocol support, and human interface optimization. Results include: parallel database queries, significantly decreasing user access times by an average factor of 2.3; access from clients behind firewalls and/or proxies which truncate excessively long Uniform Resource Locators (URLs); access to non-Wide Area Information Server (WAIS) databases and compatibility with the Z39.50 protocol; and a streamlined user interface.

  13. Struggling with Excellence in All We Do: Is the Lure of New Technology Affecting How We Process Our Members’ Information

    DTIC Science & Technology

    2016-02-01

    Approved for public release: distribution unlimited. The views expressed in this academic research paper are those of the author...is managed today is far too complex and riddled with risk. Why is a member's information duplicated across multiple disparate databases? To better... databases. The purpose of this paper is to provide a viable solution, within a given set of constraints, that the Air Force can implement. Utilizing the

  14. Modern Hardware Technologies and Software Techniques for On-Line Database Storage and Access.

    DTIC Science & Technology

    1985-12-01

    of the information in a message narrative. This method employs artificial intelligence techniques to extract information. In simplest terms, an...distribution (tape replacement) systems, database distribution, on-line mass storage, videogame ROM (juke-box)... Media cost: $2-10/GB, $10-50/GB...training of the great intelligence for the analyst would be required. If, on the other hand, a sentence analysis scheme simple enough for the low-level

  15. Heterogenous database integration in a physician workstation.

    PubMed

    Annevelink, J; Young, C Y; Tang, P C

    1991-01-01

    We discuss the integration of a variety of data and information sources in a Physician Workstation (PWS), focusing on the integration of data from DHCP, the Veterans Administration's Decentralized Hospital Computer Program. We designed a logically centralized, object-oriented data schema, used by end users and applications to explore the data accessible through an object-oriented database using a declarative query language. We emphasize the use of procedural abstraction to transparently integrate a variety of information sources into the data schema.

  16. Heterogenous database integration in a physician workstation.

    PubMed Central

    Annevelink, J.; Young, C. Y.; Tang, P. C.

    1991-01-01

    We discuss the integration of a variety of data and information sources in a Physician Workstation (PWS), focusing on the integration of data from DHCP, the Veterans Administration's Decentralized Hospital Computer Program. We designed a logically centralized, object-oriented data schema, used by end users and applications to explore the data accessible through an object-oriented database using a declarative query language. We emphasize the use of procedural abstraction to transparently integrate a variety of information sources into the data schema. PMID:1807624

  17. Genomics Community Resources | Informatics Technology for Cancer Research (ITCR)

    Cancer.gov

    To facilitate genomic research and the dissemination of its products, the National Human Genome Research Institute (NHGRI) supports genomic resources that are crucial for basic research, disease studies, model organism studies, and other biomedical research. Awards under this FOA will support the development and distribution of genomic resources that will be valuable for the broad research community, using cost-effective approaches. Such resources include (but are not limited to) databases and informatics resources (such as human and model organism databases, ontologies, and analysis tools).

  18. A digital library for medical imaging activities

    NASA Astrophysics Data System (ADS)

    dos Santos, Marcelo; Furuie, Sérgio S.

    2007-03-01

    This work presents the development of an electronic infrastructure to make available a free, online, multipurpose and multimodality medical image database. The proposed infrastructure implements a distributed architecture for the medical image database, authoring tools, and a repository for multimedia documents. It also includes a peer-review model that assures the quality of the dataset. This public repository provides a single point of access for medical images and related information to facilitate retrieval tasks. The proposed approach has also been used as an electronic teaching system in radiology.

  19. Distributed Database Control and Allocation. Volume 1. Frameworks for Understanding Concurrency Control and Recovery Algorithms.

    DTIC Science & Technology

    1983-10-01

    an Abort(Ti), it forwards the operation directly to the recovery system. When the recovery system acknowledges that the operation has been processed, the...list... Abort(Ti): Write Ti into the abort list. Then undo all of Ti's writes by reading their before-images from the audit trail and writing them back into the stable database. [Ack] Then, delete Ti from the active list. Restart: Process Abort(Ti) for each Ti on the active list. [Ack] In this algorithm
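
    A minimal Python sketch of the undo-based abort just described, using before-images from an audit trail; the structure names are illustrative, not the report's:

      audit_trail = []   # (txn, address, before_image), appended on each write
      stable_db = {}     # address -> value
      active, abort_list = set(), set()

      def write(txn, addr, value):
          audit_trail.append((txn, addr, stable_db.get(addr)))
          stable_db[addr] = value

      def abort(txn):
          abort_list.add(txn)                  # write Ti into the abort list
          for t, addr, before in reversed(audit_trail):
              if t == txn:                     # undo Ti's writes from before-images
                  stable_db[addr] = before
          active.discard(txn)                  # then delete Ti from the active list

      def restart():
          for txn in list(active):             # process Abort(Ti) for each active Ti
              abort(txn)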

  20. The StarLite Project

    DTIC Science & Technology

    1988-09-01

    The current prototyping tool also provides a multiversion data object control mechanism. In a real-time database system, synchronization protocols...data in distributed real-time systems. The semantic information of read-only transactions is exploited for improved efficiency, and a multiversion ...are discussed. Index Terms: distributed system, replication, read-only transaction, consistency, multiversion.

  1. Effects of Energetic Additives on Combustion Dynamics

    DTIC Science & Technology

    2010-04-19

    ...and ethanol drops loaded with nano-Al additives burned differently. An exploratory computational study using Large Eddy Simulation indicated that

  2. A Distributed User Information System

    DTIC Science & Technology

    1990-03-01

    Department of Computer Science, University of Maryland, College Park, MD 20742. Abstract: Current user information database technology ...Transactions on Computer Systems, May 1988. [Sol89] K. Sollins. A plan for internet directory services. Technical report, DDN Network Information Center...A Distributed User Information System. Steven D. Miller, Scott Carson, and Leo Mark, Institute for Advanced Computer Studies and

  3. Allele frequency distribution for 21 autosomal STR loci in Nepal.

    PubMed

    Kraaijenbrink, T; van Driem, G L; Opgenort, J R M L; Tuladhar, N M; de Knijff, P

    2007-05-24

    The allele frequency distributions of 21 autosomal loci contained in the AmpFlSTR Identifiler, Powerplex 16 and FFFL multiplex PCR kits were studied in 953 unrelated individuals from Nepal. Several new alleles (i.e. not yet reported in the NIST Short Tandem Repeat DNA Internet DataBase [http://www.cstl.nist.gov/biotech/strbase/]) have been detected in the process.

  4. Modelling the distribution of domestic ducks in Monsoon Asia

    USGS Publications Warehouse

    Van Bockel, Thomas P.; Prosser, Diann; Franceschini, Gianluca; Biradar, Chandra; Wint, William; Robinson, Tim; Gilbert, Marius

    2011-01-01

    Domestic ducks are considered to be an important reservoir of highly pathogenic avian influenza (HPAI), as shown by a number of geospatial studies in which they have been identified as a significant risk factor associated with disease presence. Despite their importance in HPAI epidemiology, their large-scale distribution in Monsoon Asia is poorly understood. In this study, we created a spatial database of domestic duck census data in Asia and used it to train statistical distribution models for domestic duck distributions at a spatial resolution of 1 km. The method is based on a modelling framework used by the Food and Agriculture Organisation to produce the Gridded Livestock of the World (GLW) database, and relies on stratified regression models between domestic duck densities and a set of agro-ecological explanatory variables. We evaluated different ways of stratifying the analysis and of combining the predictions to optimize the goodness of fit. We found that domestic duck density could be predicted with reasonable accuracy (the mean RMSE and correlation coefficient between log-transformed observed and predicted densities being 0.58 and 0.80, respectively), using a stratification based on livestock production systems. We tested the use of artificially degraded data on duck distributions in Thailand and Vietnam as training data, and compared the modelled outputs with the original high-resolution data. This showed, for these two countries at least, that these approaches can be used to accurately disaggregate provincial-level (administrative level 1) statistical data into high-resolution modelled distributions.
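
    A toy numpy sketch of the stratified log-density regression idea (fit within one production-system stratum, then score with the same RMSE/correlation metrics the paper reports); the predictors and data are placeholders:

      import numpy as np

      def fit_stratum(X, density):
          y = np.log(density + 1.0)                   # log-transform densities
          Xb = np.column_stack([np.ones(len(X)), X])  # add intercept column
          beta, *_ = np.linalg.lstsq(Xb, y, rcond=None)
          return beta

      def evaluate(beta, X, density):
          y = np.log(density + 1.0)
          pred = np.column_stack([np.ones(len(X)), X]) @ beta
          rmse = float(np.sqrt(np.mean((pred - y) ** 2)))
          corr = float(np.corrcoef(pred, y)[0, 1])
          return rmse, corr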

  5. Percentiles of the product of uncertainty factors for establishing probabilistic reference doses.

    PubMed

    Gaylor, D W; Kodell, R L

    2000-04-01

    Exposure guidelines for potentially toxic substances are often based on a reference dose (RfD) that is determined by dividing a no-observed-adverse-effect-level (NOAEL), lowest-observed-adverse-effect-level (LOAEL), or benchmark dose (BD) corresponding to a low level of risk, by a product of uncertainty factors. The uncertainty factors for animal-to-human extrapolation, variable sensitivities among humans, extrapolation from measured subchronic effects to unknown results for chronic exposures, and extrapolation from a LOAEL to a NOAEL can be thought of as random variables that vary from chemical to chemical. Selected databases are examined that provide distributions across chemicals of inter- and intraspecies effects, ratios of LOAELs to NOAELs, and differences in acute and chronic effects, to illustrate the determination of percentiles for uncertainty factors. The distributions of uncertainty factors tend to be approximately lognormal. The logarithm of the product of independent uncertainty factors is therefore approximately distributed as the sum of normally distributed variables, making it possible to estimate percentiles for the product. Hence, the size of the product of uncertainty factors can be selected to provide adequate safety for a large percentage (e.g., approximately 95%) of RfDs. For the databases used to describe the distributions of uncertainty factors, using values of 10 appears to be reasonable and conservative. For the databases examined, the following simple "Rule of 3s" is suggested, which exceeds the estimated 95th percentile of the product of uncertainty factors: if only a single uncertainty factor is required, use 33; for any two uncertainty factors, use 3 × 33 ≈ 100; for any three uncertainty factors, use a combined factor of 3 × 100 = 300; and if all four uncertainty factors are needed, use a total factor of 3 × 300 = 900. If something near the 99th percentile is desired, use another factor of 3. An additional factor may be needed for inadequate data, or a modifying factor for other uncertainties (e.g., different routes of exposure) not covered above.
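
    The sum-of-normals argument behind the percentile calculation can be stated compactly (a generic restatement, not the paper's fitted parameters): if each factor is lognormal, \ln \mathrm{UF}_i \sim N(\mu_i, \sigma_i^2), and the factors are independent, then

      \ln \prod_i \mathrm{UF}_i = \sum_i \ln \mathrm{UF}_i \sim N\!\Big(\sum_i \mu_i,\; \sum_i \sigma_i^2\Big),

    so the 95th percentile of the product is \exp\!\big(\sum_i \mu_i + 1.645\,\sqrt{\sum_i \sigma_i^2}\big), which is generally well below the product of the individual 95th percentiles.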

  6. Collaboration systems for classroom instruction

    NASA Astrophysics Data System (ADS)

    Chen, C. Y. Roger; Meliksetian, Dikran S.; Chang, Martin C.

    1996-01-01

    In this paper we discuss how classroom instruction can benefit from state-of-the-art technologies in networks, World Wide Web access through the Internet, multimedia, databases, and computing. Functional requirements for establishing such a high-tech classroom are identified, followed by descriptions of our current experimental implementations. The focus of the paper is on the capabilities of distributed collaboration, which supports both synchronous multimedia information sharing and a shared work environment for distributed teamwork and group decision making. Our ultimate goal is to achieve the concept of a 'living world in a classroom,' such that live, dynamic, up-to-date information and material from all over the world can be integrated into classroom instruction on a real-time basis. We describe how we incorporate application developments in a geography study tool, World Wide Web information retrieval, databases, and programming environments into the collaborative system.

  7. Video quality pooling adaptive to perceptual distortion severity.

    PubMed

    Park, Jincheol; Seshadrinathan, Kalpana; Lee, Sanghoon; Bovik, Alan Conrad

    2013-02-01

    It is generally recognized that severe video distortions that are transient in space and/or time have a large effect on overall perceived video quality. In order to understand this phenomenon, we study the distribution of spatio-temporally local quality scores obtained from several video quality assessment (VQA) algorithms on videos suffering from compression and lossy transmission over communication channels. We propose a content-adaptive spatial and temporal pooling strategy based on the observed distribution. Our method adaptively emphasizes "worst" scores along both the spatial and temporal dimensions of a video sequence and also considers the perceptual effect of large-area cohesive motion flow such as egomotion. We demonstrate the efficacy of the method by testing it using three different VQA algorithms on the LIVE Video Quality database and the EPFL-PoliMI video quality database.
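
    The core "worst-scores" pooling idea can be illustrated in a few lines of Python; this shows only the non-adaptive kernel of the approach (the published strategy is content adaptive and also models cohesive motion), with a made-up percentile:

      import numpy as np

      def worst_percent_pool(local_scores, p=0.1):
          # Average the lowest fraction p of spatio-temporally local quality
          # scores (lower score = worse quality), emphasizing severe distortions.
          s = np.sort(np.asarray(local_scores, dtype=float).ravel())
          k = max(1, int(np.ceil(p * s.size)))
          return float(s[:k].mean())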

  8. Distributed On-line Monitoring System Based on Modem and Public Phone Net

    NASA Astrophysics Data System (ADS)

    Chen, Dandan; Zhang, Qiushi; Li, Guiru

    In order to solve the monitoring problem of urban sewage disposal, a distributed on-line monitoring system is proposed. By introducing dial-up communication technology based on modems, the serial communication program rationally solves the information transmission problem between the master station and the slave stations. The serial communication program is implemented with the MSComm control of C++ Builder 6.0. The software includes a real-time data operation part and a history data handling part, using Microsoft SQL Server 2000 for the database and C++ Builder 6.0 for the user interface. The monitoring center displays a user interface with alarm information for data exceeding standards and real-time curves. Practical application shows that the system successfully accomplishes real-time data acquisition from the data gathering stations and stores the data in the terminal database.

  9. Biometric analysis of the palm vein distribution by means of two different techniques of feature extraction

    NASA Astrophysics Data System (ADS)

    Castro-Ortega, R.; Toxqui-Quitl, C.; Solís-Villarreal, J.; Padilla-Vivanco, A.; Castro-Ramos, J.

    2014-09-01

    Vein patterns can be used for access, identification, and authentication purposes, and are more reliable than classical means of identification. Furthermore, these patterns can be used for venipuncture in health fields to locate the veins of patients when they cannot be seen with the naked eye. In this paper, an image acquisition system is implemented in order to acquire digital images of people's hands in the near infrared. The image acquisition system consists of a CCD camera and a light source with peak emission at 880 nm. This radiation can penetrate the tissue and is strongly absorbed by the deoxyhemoglobin present in the blood of the veins. Our method of analysis is composed of several steps, the first of which is the enhancement of the acquired images, implemented with spatial filters. After that, adaptive thresholding and mathematical morphology operations are used in order to obtain the distribution of the vein patterns. The above process is aimed at recognizing people from images of their palm-dorsal vein distributions obtained in the near infrared. This work compares two different techniques of feature extraction: moments and veincode. The classification task is achieved using Artificial Neural Networks. Two databases are used for the analysis of the performance of the algorithms: the first is owned by the Hong Kong Polytechnic University and the second is our own database.
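
    A hedged OpenCV sketch of the enhancement, adaptive thresholding, and morphology stages described above; the parameter values and file name are assumptions for illustration:

      import cv2

      img = cv2.imread("palm_nir.png", cv2.IMREAD_GRAYSCALE)  # hypothetical NIR image
      img = cv2.medianBlur(img, 5)                            # spatial-filter enhancement
      veins = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                    cv2.THRESH_BINARY_INV, 21, 4)
      kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
      veins = cv2.morphologyEx(veins, cv2.MORPH_OPEN, kernel)   # remove speckle
      veins = cv2.morphologyEx(veins, cv2.MORPH_CLOSE, kernel)  # bridge small gaps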

  10. BioMart: a data federation framework for large collaborative projects.

    PubMed

    Zhang, Junjun; Haider, Syed; Baran, Joachim; Cros, Anthony; Guberman, Jonathan M; Hsu, Jack; Liang, Yong; Yao, Long; Kasprzyk, Arek

    2011-01-01

    BioMart is a freely available, open source, federated database system that provides a unified access to disparate, geographically distributed data sources. It is designed to be data agnostic and platform independent, such that existing databases can easily be incorporated into the BioMart framework. BioMart allows databases hosted on different servers to be presented seamlessly to users, facilitating collaborative projects between different research groups. BioMart contains several levels of query optimization to efficiently manage large data sets and offers a diverse selection of graphical user interfaces and application programming interfaces to ensure that queries can be performed in whatever manner is most convenient for the user. The software has now been adopted by a large number of different biological databases spanning a wide range of data types and providing a rich source of annotation available to bioinformaticians and biologists alike.

  11. QKD-based quantum private query without a failure probability

    NASA Astrophysics Data System (ADS)

    Liu, Bin; Gao, Fei; Huang, Wei; Wen, QiaoYan

    2015-10-01

    In this paper, we present a quantum-key-distribution (QKD)-based quantum private query (QPQ) protocol utilizing single-photon signals of multiple optical pulses. It maintains the advantages of QKD-based QPQ, i.e., it is easy to implement and loss tolerant. In addition, unlike in previous QKD-based QPQ protocols, in our protocol the number of items an honest user will obtain is always one and the failure probability is always zero. This characteristic not only improves the stability (in the sense that, ignoring noise and attacks, the protocol would always succeed), but also benefits the privacy of the database (since the database will no longer reveal additional secrets to honest users). Furthermore, for the user's privacy, the proposed protocol is cheat sensitive, and for the security of the database, we obtain a theoretical upper bound on the information leaked by the database.

  12. The global compendium of Aedes aegypti and Ae. albopictus occurrence

    NASA Astrophysics Data System (ADS)

    Kraemer, Moritz U. G.; Sinka, Marianne E.; Duda, Kirsten A.; Mylne, Adrian; Shearer, Freya M.; Brady, Oliver J.; Messina, Jane P.; Barker, Christopher M.; Moore, Chester G.; Carvalho, Roberta G.; Coelho, Giovanini E.; van Bortel, Wim; Hendrickx, Guy; Schaffner, Francis; Wint, G. R. William; Elyazar, Iqbal R. F.; Teng, Hwa-Jen; Hay, Simon I.

    2015-07-01

    Aedes aegypti and Ae. albopictus are the main vectors transmitting dengue and chikungunya viruses. Despite being pathogens of global public health importance, knowledge of their vectors’ global distribution remains patchy and sparse. A global geographic database of known occurrences of Ae. aegypti and Ae. albopictus between 1960 and 2014 was compiled. Herein we present the database, which comprises occurrence data linked to point or polygon locations, derived from peer-reviewed literature and unpublished studies including national entomological surveys and expert networks. We describe all data collection processes, as well as geo-positioning methods, database management and quality-control procedures. This is the first comprehensive global database of Ae. aegypti and Ae. albopictus occurrence, consisting of 19,930 and 22,137 geo-positioned occurrence records respectively. Both datasets can be used for a variety of mapping and spatial analyses of the vectors and, by inference, the diseases they transmit.

  13. The global compendium of Aedes aegypti and Ae. albopictus occurrence

    PubMed Central

    Kraemer, Moritz U. G.; Sinka, Marianne E.; Duda, Kirsten A.; Mylne, Adrian; Shearer, Freya M.; Brady, Oliver J.; Messina, Jane P.; Barker, Christopher M.; Moore, Chester G.; Carvalho, Roberta G.; Coelho, Giovanini E.; Van Bortel, Wim; Hendrickx, Guy; Schaffner, Francis; Wint, G. R. William; Elyazar, Iqbal R. F.; Teng, Hwa-Jen; Hay, Simon I.

    2015-01-01

    Aedes aegypti and Ae. albopictus are the main vectors transmitting dengue and chikungunya viruses. Despite being pathogens of global public health importance, knowledge of their vectors’ global distribution remains patchy and sparse. A global geographic database of known occurrences of Ae. aegypti and Ae. albopictus between 1960 and 2014 was compiled. Herein we present the database, which comprises occurrence data linked to point or polygon locations, derived from peer-reviewed literature and unpublished studies including national entomological surveys and expert networks. We describe all data collection processes, as well as geo-positioning methods, database management and quality-control procedures. This is the first comprehensive global database of Ae. aegypti and Ae. albopictus occurrence, consisting of 19,930 and 22,137 geo-positioned occurrence records respectively. Both datasets can be used for a variety of mapping and spatial analyses of the vectors and, by inference, the diseases they transmit. PMID:26175912

  14. Realization of Real-Time Clinical Data Integration Using Advanced Database Technology

    PubMed Central

    Yoo, Sooyoung; Kim, Boyoung; Park, Heekyong; Choi, Jinwook; Chun, Jonghoon

    2003-01-01

    As information & communication technologies have advanced, interest in mobile health care systems has grown. In order to obtain information seamlessly from distributed and fragmented clinical data from heterogeneous institutions, we need solutions that integrate data. In this article, we introduce a method for information integration based on real-time message communication using trigger and advanced database technologies. Messages were devised to conform to HL7, a standard for electronic data exchange in healthcare environments. The HL7 based system provides us with an integrated environment in which we are able to manage the complexities of medical data. We developed this message communication interface to generate and parse HL7 messages automatically from the database point of view. We discuss how easily real time data exchange is performed in the clinical information system, given the requirement for minimum loading of the database system. PMID:14728271
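
    For readers unfamiliar with HL7 v2, messages are pipe-delimited segments that are straightforward to generate and parse programmatically, which is what makes the trigger-driven approach practical. A toy Python example follows; the field values are invented, and real ADT messages carry many more segments:

      msh = "MSH|^~\\&|LAB|HOSP_A|EMR|HOSP_B|20030101120000||ADT^A01|MSG0001|P|2.3"
      pid = "PID|1||123456^^^HOSP_A||DOE^JOHN||19600101|M"
      message = "\r".join([msh, pid])   # segments are separated by carriage returns

      def fields(segment):
          # Split one HL7 segment into its pipe-delimited fields.
          return segment.split("|")

      print(fields(pid)[5])  # -> DOE^JOHN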

  15. CHOmine: an integrated data warehouse for CHO systems biology and modeling

    PubMed Central

    Hanscho, Michael; Ruckerbauer, David E.; Zanghellini, Jürgen; Borth, Nicole

    2017-01-01

    The last decade has seen a surge in published genome-scale information for Chinese hamster ovary (CHO) cells, which are the main production vehicles for therapeutic proteins. While a single access point is available at www.CHOgenome.org, the primary data are distributed over several databases at different institutions. Currently, research is frequently hampered by a plethora of gene names and IDs that vary between published draft genomes and databases, making systems biology analyses cumbersome and elaborate. Here we present CHOmine, an integrative data warehouse connecting data from various databases and linking out to others. Furthermore, we introduce CHOmodel, a web-based resource that provides access to recently published CHO cell line specific metabolic reconstructions. Both resources allow users to query CHO-relevant data and find interconnections between different types of data, thus providing a simple, standardized entry point to the world of CHO systems biology. Database URL: http://www.chogenome.org PMID:28605771

  16. Expert system development for commonality analysis in space programs

    NASA Technical Reports Server (NTRS)

    Yeager, Dorian P.

    1987-01-01

    This report is a combination of foundational mathematics and software design. A mathematical model of the Commonality Analysis problem was developed and some important properties discovered. The complexity of the problem is described herein and techniques, both deterministic and heuristic, for reducing that complexity are presented. Weaknesses are pointed out in the existing software (System Commonality Analysis Tool) and several improvements are recommended. It is recommended that: (1) an expert system for guiding the design of new databases be developed; (2) a distributed knowledge base be created and maintained for the purpose of encoding the commonality relationships between design items in commonality databases; (3) a software module be produced which automatically generates commonality alternative sets from commonality databases using the knowledge associated with those databases; and (4) a more complete commonality analysis module be written which is capable of generating any type of feasible solution.

  17. Atlas of Iberian water beetles (ESACIB database).

    PubMed

    Sánchez-Fernández, David; Millán, Andrés; Abellán, Pedro; Picazo, Félix; Carbonell, José A; Ribera, Ignacio

    2015-01-01

    The ESACIB ('EScarabajos ACuáticos IBéricos') database is provided, including all available distributional data of Iberian and Balearic water beetles from the literature up to 2013, as well as from museum and private collections, PhD theses, and other unpublished sources. The database contains 62,015 records with associated geographic data (10×10 km UTM squares) for 488 species and subspecies of water beetles, 120 of them endemic to the Iberian Peninsula and eight to the Balearic Islands. This database was used for the elaboration of the "Atlas de los Coleópteros Acuáticos de España Peninsular". In this dataset, data for 15 additional species have been added: 11 that occur in the Balearic Islands or mainland Portugal but not in peninsular Spain, and another four with mainly terrestrial habits within the genus Helophorus (for taxonomic coherence). The complete dataset is provided in Darwin Core Archive format.

  18. Atlas of Iberian water beetles (ESACIB database)

    PubMed Central

    Sánchez-Fernández, David; Millán, Andrés; Abellán, Pedro; Picazo, Félix; Carbonell, José A.; Ribera, Ignacio

    2015-01-01

    The ESACIB (‘EScarabajos ACuáticos IBéricos’) database is provided, including all available distributional data of Iberian and Balearic water beetles from the literature up to 2013, as well as from museum and private collections, PhD theses, and other unpublished sources. The database contains 62,015 records with associated geographic data (10×10 km UTM squares) for 488 species and subspecies of water beetles, 120 of them endemic to the Iberian Peninsula and eight to the Balearic Islands. This database was used for the elaboration of the “Atlas de los Coleópteros Acuáticos de España Peninsular”. In this dataset, data for 15 additional species have been added: 11 that occur in the Balearic Islands or mainland Portugal but not in peninsular Spain, and another four with mainly terrestrial habits within the genus Helophorus (for taxonomic coherence). The complete dataset is provided in Darwin Core Archive format. PMID:26448717

  19. Monitoring of services with non-relational databases and map-reduce framework

    NASA Astrophysics Data System (ADS)

    Babik, M.; Souto, F.

    2012-12-01

    Service Availability Monitoring (SAM) is a well-established monitoring framework that performs regular measurements of the core site services and reports the corresponding availability and reliability of the Worldwide LHC Computing Grid (WLCG) infrastructure. One of the existing extensions of SAM is Site Wide Area Testing (SWAT), which gathers monitoring information from the worker nodes via instrumented jobs. This generates quite a lot of monitoring data to process, as there are several data points for every job and several million jobs are executed every day. The recent uptake of non-relational databases opens a new paradigm in the large-scale storage and distributed processing of systems with heavy read-write workloads. For SAM this brings new possibilities to improve its model, from performing aggregation of measurements to storing raw data and subsequent re-processing. Both SAM and SWAT are currently tuned to run at top performance, reaching some of the limits in storage and processing power of their existing Oracle relational database. We investigated the usability and performance of non-relational storage together with its distributed data processing capabilities. For this, several popular systems have been compared. In this contribution we describe our investigation of the existing non-relational databases suited for monitoring systems covering Cassandra, HBase and MongoDB. Further, we present our experiences in data modeling and prototyping map-reduce algorithms focusing on the extension of the already existing availability and reliability computations. Finally, possible future directions in this area are discussed, analyzing the current deficiencies of the existing Grid monitoring systems and proposing solutions to leverage the benefits of the non-relational databases to get more scalable and flexible frameworks.
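
    A toy map-reduce pass over raw probe results, of the kind such frameworks distribute across nodes; the site names and statuses below are illustrative, not SAM's actual schema:

      from collections import defaultdict

      results = [
          {"site": "SITE-A", "status": "OK"},
          {"site": "SITE-A", "status": "CRITICAL"},
          {"site": "SITE-B", "status": "OK"},
      ]

      def mapper(record):
          # Map each probe result to (site, 1-if-ok-else-0).
          yield record["site"], 1 if record["status"] == "OK" else 0

      def reducer(site, oks):
          # Reduce a site's flags to its availability fraction.
          return site, sum(oks) / len(oks)

      groups = defaultdict(list)
      for r in results:
          for site, ok in mapper(r):
              groups[site].append(ok)

      availability = dict(reducer(s, v) for s, v in groups.items())
      print(availability)  # {'SITE-A': 0.5, 'SITE-B': 1.0}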

  20. Community-Supported Data Repositories in Paleobiology: A 'Middle Tail' Between the Geoscientific and Informatics Communities

    NASA Astrophysics Data System (ADS)

    Williams, J. W.; Ashworth, A. C.; Betancourt, J. L.; Bills, B.; Blois, J.; Booth, R.; Buckland, P.; Charles, D.; Curry, B. B.; Goring, S. J.; Davis, E.; Grimm, E. C.; Graham, R. W.; Smith, A. J.

    2015-12-01

    Community-supported data repositories (CSDRs) in paleoecology and paleoclimatology have a decades-long tradition and serve multiple critical scientific needs. CSDRs facilitate synthetic large-scale scientific research by providing open-access, curated data that employ community-supported metadata and data standards. CSDRs serve as a 'middle tail' or boundary organization between information scientists and the long-tail community of individual geoscientists collecting and analyzing paleoecological data. Over the past decades, a distributed network of CSDRs has emerged, each serving a particular suite of data and research communities, e.g. the Neotoma Paleoecology Database, Paleobiology Database, International Tree Ring Database, NOAA NCEI for Paleoclimatology, Morphobank, iDigPaleo, and Integrated Earth Data Alliance. Recently, these groups have organized into a common Paleobiology Data Consortium dedicated to improving interoperability and sharing best practices and protocols. The Neotoma Paleoecology Database offers one example of an active and growing CSDR, designed to facilitate research into ecological and evolutionary dynamics during recent past global change. Neotoma combines a centralized database structure with distributed scientific governance via multiple virtual constituent data working groups. The Neotoma data model is flexible and can accommodate a variety of paleoecological proxies from many depositional contexts. Data input into Neotoma is done by trained Data Stewards drawn from their communities. Neotoma data can be searched, viewed, and retrieved through multiple interfaces, including the interactive Neotoma Explorer map interface, RESTful Application Programming Interfaces (APIs), the neotoma R package, and the Tilia stratigraphic software. Neotoma is governed by geoscientists and provides community engagement through training workshops for data contributors, stewards, and users. Neotoma participates in the Paleobiology Data Consortium and other efforts to improve interoperability among cyberinfrastructure in the paleogeosciences.

  1. A BRDF-BPDF database for the analysis of Earth target reflectances

    NASA Astrophysics Data System (ADS)

    Breon, Francois-Marie; Maignan, Fabienne

    2017-01-01

    Land surface reflectance is not isotropic. It varies with the observation geometry, which is defined by the sun and view zenith angles and the relative azimuth. In addition, the reflectance is linearly polarized. The reflectance anisotropy is quantified by the bidirectional reflectance distribution function (BRDF), while its polarization properties are defined by the bidirectional polarization distribution function (BPDF). The POLDER radiometer that flew onboard the PARASOL microsatellite remains the only space instrument that measured numerous samples of the BRDF and BPDF of Earth targets. Here, we describe a database of representative BRDFs and BPDFs derived from the POLDER measurements. From the huge number of data acquired by the spaceborne instrument over a period of 7 years, we selected a set of targets with high-quality observations. The selection aimed for a large number of observations, free of significant cloud or aerosol contamination, acquired in diverse observation geometries with a focus on the backscatter direction that shows the specific hot spot signature. The targets are sorted according to the 16-class International Geosphere-Biosphere Programme (IGBP) land cover classification system, and the target selection aims at a spatial representativeness within the class. The database thus provides a set of high-quality BRDF and BPDF samples that can be used to assess the typical variability of natural surface reflectances or to evaluate models. It is available freely from the PANGAEA website (doi:10.1594/PANGAEA.864090). In addition to the database, we provide a visualization and analysis tool based on the Interactive Data Language (IDL). It allows an interactive analysis of the measurements and a comparison against various BRDF and BPDF analytical models. The present paper describes the input data, the selection principles, the database format, and the analysis tool.
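
    For reference, the standard radiometric definition underlying such a database (not specific to this paper) is

      f_r(\omega_i, \omega_o) = \frac{\mathrm{d}L_o(\omega_o)}{L_i(\omega_i)\,\cos\theta_i\,\mathrm{d}\omega_i},

    i.e. the BRDF is the ratio of reflected radiance toward the view direction \omega_o to the incident irradiance from direction \omega_i; the BPDF is defined analogously for the linearly polarized component of the reflected light.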

  2. DASTCOM5: A Portable and Current Database of Asteroid and Comet Orbit Solutions

    NASA Astrophysics Data System (ADS)

    Giorgini, Jon D.; Chamberlin, Alan B.

    2014-11-01

    A portable direct-access database containing all NASA/JPL asteroid and comet orbit solutions, with the software to access it, is available for download (ftp://ssd.jpl.nasa.gov/pub/xfr/dastcom5.zip; unzip -ao dastcom5.zip). DASTCOM5 contains the latest heliocentric IAU76/J2000 ecliptic osculating orbital elements for all known asteroids and comets as determined by a least-squares best-fit to ground-based optical, spacecraft, and radar astrometric measurements. Other physical, dynamical, and covariance parameters are included when known. A total of 142 parameters per object are supported within DASTCOM5. This information is suitable for initializing high-precision numerical integrations, assessing orbit geometry, computing trajectory uncertainties, visual magnitude, and summarizing physical characteristics of the body. The DASTCOM5 distribution is updated as often as hourly to include newly discovered objects or orbit solution updates. It includes an ASCII index of objects that supports look-ups based on name, current or past designation, SPK ID, MPC packed-designations, or record number. DASTCOM5 is the database used by the NASA/JPL Horizons ephemeris system. It is a subset exported from a larger MySQL-based relational Small-Body Database ("SBDB") maintained at JPL. The DASTCOM5 distribution is intended for programmers comfortable with UNIX/LINUX/MacOSX command-line usage who need to develop stand-alone applications. The goal of the implementation is to provide small, fast, portable, and flexibly programmatic access to JPL comet and asteroid orbit solutions. The supplied software library, examples, and application programs have been verified under gfortran, Lahey, Intel, and Sun 32/64-bit Linux/UNIX FORTRAN compilers. A command-line tool ("dxlook") is provided to enable database access from shell or script environments.

  3. An environmental database for Venice and tidal zones

    NASA Astrophysics Data System (ADS)

    Macaluso, L.; Fant, S.; Marani, A.; Scalvini, G.; Zane, O.

    2003-04-01

    The natural environment is a complex, highly variable system that cannot be physically reproduced (neither in the laboratory nor in a confined territory). Environmental experimental studies are thus necessarily based on field measurements distributed in time and space. Only extensive data collections can provide the representative samples of system behavior that are essential for scientific advancement. The assimilation of large data collections into accessible archives must necessarily be implemented in electronic databases. In the case of tidal environments in general, and of the Venice Lagoon in particular, it is useful to establish a database, freely accessible to the scientific community, documenting the dynamics of such systems and their response to anthropic pressures and climatic variability. At the Istituto Veneto di Scienze, Lettere ed Arti in Venice (Italy), two Internet environmental databases have been developed: one collects detailed information regarding the Venice Lagoon; the other coordinates the research consortium of the "TIDE" EU RTD project, which covers three different tidal areas: the Venice Lagoon (Italy), Morecambe Bay (England), and the Forth Estuary (Scotland). The archives may be accessed through the URL www.istitutoveneto.it. The first is freely available to anyone who is interested. It is continuously updated and has been structured to promote documentation concerning the Venetian environment and to disseminate this information for educational purposes (see the "Dissemination" section). The second is supplied by scientists and engineers working on these tidal systems for various purposes (scientific, management, conservation, etc.); it is intended for interested researchers and grows with their own contributions. Both intend to promote scientific communication, to contribute to the realization of a distributed information system collecting homogeneous themes, and to initiate interconnection among databases regarding different kinds of environments.

  4. Towards communication-efficient quantum oblivious key distribution

    NASA Astrophysics Data System (ADS)

    Panduranga Rao, M. V.; Jakobi, M.

    2013-01-01

    Symmetrically private information retrieval, a fundamental problem in the field of secure multiparty computation, is defined as follows: a database D of N bits held by Bob is queried by a user Alice who is interested in the bit D_b, in such a way that (1) Alice learns D_b and only D_b and (2) Bob does not learn anything about Alice's choice b. While solutions to this problem in the classical domain rely largely on unproven computational complexity theoretic assumptions, it is also known that perfect solutions that guarantee both database and user privacy are impossible in the quantum domain. Jakobi et al. [Phys. Rev. A 83, 022301 (2011)] proposed a protocol for oblivious transfer using well-known quantum key distribution (QKD) techniques to establish an oblivious key to solve this problem. Their solution provided a good degree of database and user privacy (using physical principles like the impossibility of perfectly distinguishing nonorthogonal quantum states and the impossibility of superluminal communication) while being loss-resistant and implementable with commercial QKD devices (due to the use of the Scarani-Acin-Ribordy-Gisin 2004 protocol). However, their quantum oblivious key distribution (QOKD) protocol requires a communication complexity of O(N log N). Since modern databases can be extremely large, it is important to reduce this communication as much as possible. In this paper, we first suggest a modification of their protocol wherein the number of qubits that need to be exchanged is reduced to O(N). A subsequent generalization reduces the quantum communication complexity even further, in such a way that only a few hundred qubits need to be transferred even for very large databases.

  5. ProbOnto: ontology and knowledge base of probability distributions.

    PubMed

    Swat, Maciej J; Grenon, Pierre; Wimalaratne, Sarala

    2016-09-01

    Probability distributions play a central role in mathematical and statistical modelling. The encoding, annotation and exchange of such models could be greatly simplified by a resource providing a common reference for the definition of probability distributions. Although some resources exist, none offers a suitably detailed ontology or a database allowing programmatic access. ProbOnto is an ontology-based knowledge base of probability distributions, featuring more than 80 uni- and multivariate distributions with their defining functions, characteristics, relationships and re-parameterization formulas. It can be used for model annotation and facilitates the encoding of distribution-based models, related functions and quantities. Availability and implementation: http://probonto.org. Contact: mjswat@ebi.ac.uk. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
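
    To make the notion of re-parameterization formulas concrete, the following minimal sketch converts between two common parameterizations of the normal distribution, by standard deviation and by precision. The class and function names are invented for illustration and are not ProbOnto's API.

        from dataclasses import dataclass

        @dataclass
        class NormalSD:
            """Normal distribution parameterized by mean and standard deviation."""
            mu: float
            sigma: float

        @dataclass
        class NormalPrecision:
            """Normal distribution parameterized by mean and precision tau = 1/sigma^2."""
            mu: float
            tau: float

        def sd_to_precision(d: NormalSD) -> NormalPrecision:
            # Re-parameterization formula: tau = 1 / sigma^2
            return NormalPrecision(mu=d.mu, tau=1.0 / d.sigma**2)

        def precision_to_sd(d: NormalPrecision) -> NormalSD:
            # Inverse formula: sigma = 1 / sqrt(tau)
            return NormalSD(mu=d.mu, sigma=d.tau**-0.5)

        print(sd_to_precision(NormalSD(mu=0.0, sigma=2.0)))        # tau = 0.25
        print(precision_to_sd(NormalPrecision(mu=0.0, tau=0.25)))  # sigma = 2.0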

  6. [Developing forensic reference database by 18 autosomal STR for DNA identification in Republic of Belarus].

    PubMed

    Tsybovskii, I S; Veremeichik, V M; Kotova, S A; Kritskaya, S V; Evmenenko, S A; Udina, I G

    2017-02-01

    For the Republic of Belarus, the development of a forensic reference database on the basis of 18 autosomal microsatellites (STR) is described, using a population dataset (N = 1040), a "familial" genotypic dataset (N = 2550) obtained from expert paternity-testing practice, and a dataset of genotypes from a criminal registration database (N = 8756). The population samples studied consist of 80% ethnic Belarusians and 20% individuals of other nationality or of mixed origin (by questionnaire data). Genotypes of 12,346 inhabitants of the Republic of Belarus from 118 regional samples, typed at 18 autosomal microsatellites, are included in the sample: 16 tetranucleotide STR (D2S1338, TPOX, D3S1358, CSF1PO, D5S818, D8S1179, D7S820, TH01, vWA, D13S317, D16S539, D18S51, D19S433, D21S11, F13B, and FGA) and two pentanucleotide STR (Penta D and Penta E). The samples studied are in Hardy–Weinberg equilibrium with respect to the distribution of genotypes at the 18 STR. No significant differences were detected between individual populations or between samples from the various historical ethnographic regions of the Republic of Belarus (Western and Eastern Polesie, Podneprovye, Ponemanye, Poozerye, and the Center), indicating the absence of pronounced genetic differentiation. Statistically significant differences between the studied genotypic datasets were likewise not detected, which made it possible to combine the datasets and treat the total sample as a unified forensic reference database for the 18 "criminalistic" STR loci. No differences were detected between the reference database of the Republic of Belarus and those of Russians and Ukrainians in the distributions of the autosomal STR, consistent with the close genetic relationship of the three Eastern Slavic peoples mediated by common origin and intense mutual migrations. Significant differences at separate STR loci were observed between the reference database of the Republic of Belarus and populations of Southern and Western Slavs. The necessity of using an original national reference database to support forensic expertise practice in the Republic of Belarus was demonstrated.
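
    As a reminder of what the Hardy–Weinberg check involves, here is a minimal sketch for a single biallelic locus. It is purely illustrative: real STR loci are multi-allelic, and the authors' actual pipeline is not described in code.

        from scipy.stats import chi2

        def hwe_chi_square(n_AA: int, n_Aa: int, n_aa: int) -> float:
            """Chi-square test of Hardy-Weinberg equilibrium at one biallelic locus.

            Returns the p-value; small values indicate departure from HWE.
            """
            n = n_AA + n_Aa + n_aa
            p = (2 * n_AA + n_Aa) / (2 * n)   # frequency of allele A
            q = 1.0 - p
            expected = [n * p * p, 2 * n * p * q, n * q * q]
            observed = [n_AA, n_Aa, n_aa]
            stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
            # One degree of freedom: 3 genotype classes - 1 - 1 estimated allele frequency.
            return chi2.sf(stat, df=1)

        print(hwe_chi_square(298, 489, 213))  # counts near HWE give a large p-value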

  7. In-Memory Graph Databases for Web-Scale Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Castellana, Vito G.; Morari, Alessandro; Weaver, Jesse R.

    RDF databases have emerged as one of the most relevant ways of organizing, integrating, and managing exponentially growing, often heterogeneous, and not rigidly structured data for a variety of scientific and commercial fields. In this paper we discuss the solutions integrated in GEMS (Graph database Engine for Multithreaded Systems), a software framework for implementing RDF databases on commodity, distributed-memory high-performance clusters. Unlike the majority of current RDF databases, GEMS has been designed from the ground up to primarily employ graph-based methods, and this is reflected in all the layers of its stack. The GEMS framework is composed of a SPARQL-to-C++ compiler, a library of data structures and related methods to access and modify them, and a custom runtime providing lightweight software multithreading, network message aggregation, and a partitioned global address space. We provide an overview of the framework, detailing its components and how they have been co-designed and customized to address the issues of graph methods applied to large-scale datasets on clusters. We discuss in detail the principles that enable automatic translation of queries (expressed in SPARQL, the query language of choice for RDF databases) to graph methods, and identify differences with respect to other RDF databases.
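
    For readers unfamiliar with SPARQL, the following minimal sketch runs a small graph-pattern query with the rdflib Python library. rdflib serves here only as a stand-in; GEMS itself compiles SPARQL to C++ and executes on clusters.

        from rdflib import Graph, Namespace, RDF

        EX = Namespace("http://example.org/")
        g = Graph()
        # A tiny RDF graph: two people and a "knows" relation.
        g.add((EX.alice, RDF.type, EX.Person))
        g.add((EX.bob, RDF.type, EX.Person))
        g.add((EX.alice, EX.knows, EX.bob))

        # SPARQL expresses graph-pattern matching directly, which is why
        # graph-based back ends such as GEMS map it onto graph traversals.
        query = """
        PREFIX ex: <http://example.org/>
        SELECT ?a ?b WHERE {
            ?a a ex:Person .
            ?b a ex:Person .
            ?a ex:knows ?b .
        }
        """
        for row in g.query(query):
            print(row.a, "knows", row.b)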

  8. Geology of Point Reyes National Seashore and vicinity, California: a digital database

    USGS Publications Warehouse

    Clark, Joseph C.; Brabb, Earl E.

    1997-01-01

    This Open-File report is a digital geologic map database. This pamphlet serves to introduce and describe the digital data. There is no paper map included in the Open-File report. The report does include, however, a PostScript plot file containing an image of the geologic map sheet with explanation, as well as the accompanying text describing the geology of the area. For those interested in a paper plot of information contained in the database or in obtaining the PostScript plot files, please see the section entitled 'For Those Who Aren't Familiar With Digital Geologic Map Databases' below. This digital map database, compiled from previously published and unpublished data and new mapping by the authors, represents the general distribution of surficial deposits and rock units in Point Reyes and surrounding areas. Together with the accompanying text file (pr-geo.txt or pr-geo.ps), it provides current information on the stratigraphy and structural geology of the area covered. The database delineates map units that are identified by general age and lithology following the stratigraphic nomenclature of the U.S. Geological Survey. The scale of the source maps limits the spatial resolution (scale) of the database to 1:48,000 or smaller.

  9. Northeast India Helminth Parasite Information Database (NEIHPID): Knowledge Base for Helminth Parasites

    PubMed Central

    Debnath, Manish; Kharumnuid, Graciously; Thongnibah, Welfrank; Tandon, Veena

    2016-01-01

    Most metazoan parasites that invade vertebrate hosts belong to three phyla: Platyhelminthes, Nematoda and Acanthocephala. Many of the parasitic members of these phyla are collectively known as helminths and are causative agents of many debilitating, deforming and lethal diseases of humans and animals. The North-East India Helminth Parasite Information Database (NEIHPID) project aimed to document and characterise the spectrum of helminth parasites in the north-eastern region of India, providing host, geographical distribution, diagnostic characters and image data. The morphology-based taxonomic data are supplemented with information on DNA sequences of nuclear, ribosomal and mitochondrial gene marker regions that aid in parasite identification. In addition, the database contains raw next generation sequencing (NGS) data for 3 foodborne trematode parasites, with more to follow. The database will also provide study material for students interested in parasite biology. Users can search the database at various taxonomic levels (phylum, class, order, superfamily, family, genus, and species), or by host, habitat and geographical location. Specimen collection locations are noted as co-ordinates in a MySQL database and can be viewed on Google maps, using Google Maps JavaScript API v3. The NEIHPID database has been made freely available at http://nepiac.nehu.ac.in/index.php PMID:27285615

  10. Northeast India Helminth Parasite Information Database (NEIHPID): Knowledge Base for Helminth Parasites.

    PubMed

    Biswal, Devendra Kumar; Debnath, Manish; Kharumnuid, Graciously; Thongnibah, Welfrank; Tandon, Veena

    2016-01-01

    Most metazoan parasites that invade vertebrate hosts belong to three phyla: Platyhelminthes, Nematoda and Acanthocephala. Many of the parasitic members of these phyla are collectively known as helminths and are causative agents of many debilitating, deforming and lethal diseases of humans and animals. The North-East India Helminth Parasite Information Database (NEIHPID) project aimed to document and characterise the spectrum of helminth parasites in the north-eastern region of India, providing host, geographical distribution, diagnostic characters and image data. The morphology-based taxonomic data are supplemented with information on DNA sequences of nuclear, ribosomal and mitochondrial gene marker regions that aid in parasite identification. In addition, the database contains raw next generation sequencing (NGS) data for 3 foodborne trematode parasites, with more to follow. The database will also provide study material for students interested in parasite biology. Users can search the database at various taxonomic levels (phylum, class, order, superfamily, family, genus, and species), or by host, habitat and geographical location. Specimen collection locations are noted as co-ordinates in a MySQL database and can be viewed on Google maps, using Google Maps JavaScript API v3. The NEIHPID database has been made freely available at http://nepiac.nehu.ac.in/index.php.

  11. Q2Stress: A database for multiple cues to stress assignment in Italian.

    PubMed

    Spinelli, Giacomo; Sulpizio, Simone; Burani, Cristina

    2017-12-01

    In languages where the position of lexical stress within a word is not predictable from print, readers rely on distributional information extracted from the lexicon in order to assign stress. Lexical databases are thus especially important for researchers wishing to address stress assignment in those languages. Here we present Q2Stress, a new database aimed at filling the lack of such a resource for Italian. Q2Stress includes multiple cues readers may use in assigning stress, such as the type and token frequency of stress patterns, as well as their distribution with respect to number of syllables, grammatical category, word beginnings, word endings, and consonant-vowel structures. Furthermore, for the first time, data for both adults and children are available. Q2Stress may help researchers answer empirical as well as theoretical questions about stress assignment and stress-related issues and, more generally, explore the orthography-to-phonology relation in reading. Q2Stress is designed as a user-friendly resource: it comes with scripts allowing researchers to explore and select their own stimuli according to several criteria, as well as summary tables for overall data analysis.
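
    The kind of distributional cue such a database tabulates can be illustrated with a short sketch computing type and token frequencies of stress patterns by word ending. The toy lexicon and field layout are invented for illustration and do not reflect Q2Stress's actual format.

        from collections import defaultdict

        # Toy lexicon: (word, stress pattern, token frequency).
        lexicon = [
            ("tavolo", "antepenultimate", 120),
            ("amico",  "penultimate",     300),
            ("medico", "antepenultimate",  90),
            ("gelato", "penultimate",     210),
        ]

        type_freq = defaultdict(int)
        token_freq = defaultdict(int)
        for word, stress, tokens in lexicon:
            ending = word[-3:]                      # a crude word-ending cue
            type_freq[(ending, stress)] += 1        # count distinct words (types)
            token_freq[(ending, stress)] += tokens  # weight by corpus frequency

        for key in sorted(type_freq):
            print(key, "types:", type_freq[key], "tokens:", token_freq[key])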

  12. Assessment and mapping of water pollution indices in zone-III of municipal corporation of hyderabad using remote sensing and geographic information system.

    PubMed

    Asadi, S S; Vuppala, Padmaja; Reddy, M Anji

    2005-01-01

    A preliminary survey of the area under Zone-III of MCH was undertaken to assess the groundwater quality, demonstrate its spatial distribution, and correlate it with land use patterns using advanced techniques of remote sensing and geographic information systems (GIS). Twenty-seven groundwater samples were collected and chemically analyzed to form the attribute database. A water quality index was calculated from the measured parameters, on the basis of which the study area was classified into five groups with respect to the suitability of water for drinking purposes. Thematic maps, viz. base map, road network, drainage, and land use/land cover, were prepared from IRS 1D PAN + LISS III merged satellite imagery, forming the spatial database. The attribute database was integrated with the spatial map of sampling locations in Arc/Info, and maps showing the spatial distribution of water quality parameters were prepared in ArcView. Results indicated high concentrations of total dissolved solids (TDS), nitrates, fluorides and total hardness in a few industrial and densely populated areas, indicating deteriorated water quality there, while the other areas exhibited moderate to good water quality.
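
    A common way to collapse such measurements into a single score is the weighted-arithmetic water quality index; the sketch below shows the general computation. The parameter values, standards and weighting are placeholders, not those used in the study.

        # Weighted-arithmetic water quality index (WQI) sketch.
        # Each parameter: (measured value, drinking-water standard, ideal value).
        parameters = {
            "TDS (mg/L)":      (850.0, 500.0, 0.0),
            "nitrate (mg/L)":  (52.0,  45.0,  0.0),
            "fluoride (mg/L)": (1.4,   1.0,   0.0),
            "hardness (mg/L)": (420.0, 300.0, 0.0),
        }

        # Weight each parameter inversely to its permissible standard, normalized
        # so that the weights sum to one.
        k = 1.0 / sum(1.0 / std for _, std, _ in parameters.values())
        wqi = 0.0
        for value, standard, ideal in parameters.values():
            quality_rating = 100.0 * (value - ideal) / (standard - ideal)
            weight = k / standard
            wqi += weight * quality_rating

        print(f"WQI = {wqi:.1f}")  # scores above 100 suggest water unfit for drinking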

  13. Management of information in distributed biomedical collaboratories.

    PubMed

    Keator, David B

    2009-01-01

    Organizing and annotating biomedical data in structured ways has gained much interest and focus in the last 30 years. Driven by decreases in digital storage costs and advances in genetics sequencing, imaging, electronic data collection, and microarray technologies, data are being collected at an alarming rate. The specialization of fields in biology and medicine demonstrates the need for somewhat different structures for the storage and retrieval of data. For biologists, development is driven by the need for structured information and integration across a number of domains. For clinical researchers and hospitals, the driving factors are the need for a structured medical record accessible, ideally, to any medical practitioner who might require it during the course of research or patient treatment, together with patient confidentiality and security. Scientific data management systems generally consist of a few core services: a back-end database system, a front-end graphical user interface, and an export/import mechanism or data interchange format, both to get data into and out of the database and to share data with collaborators. The chapter introduces some existing databases, distributed file systems, and interchange languages used within the biomedical research and clinical communities for scientific data management and exchange.

  14. A dedicated database system for handling multi-level data in systems biology.

    PubMed

    Pornputtapong, Natapol; Wanichthanarak, Kwanjeera; Nilsson, Avlant; Nookaew, Intawat; Nielsen, Jens

    2014-01-01

    Advances in high-throughput technologies have enabled extensive generation of multi-level omics data. These data are crucial for systems biology research, though they are complex, heterogeneous, highly dynamic, incomplete and distributed among public databases. This leads to difficulties in data accessibility and often results in errors when data are merged and integrated from varied resources. Therefore, the integration and management of systems biological data remain very challenging. To overcome this, we designed and developed a dedicated database system that can serve and solve the vital issues in data management, and thereby facilitate data integration, modeling and analysis in systems biology, within a single database. In addition, a yeast data repository was implemented as an integrated database environment operated by the database system. Two applications were implemented to demonstrate the extensibility and utilization of the system. Both illustrate how the user can access the database via the web query function and implemented scripts. These scripts are specific to two sample cases: (1) detecting the pheromone pathway in protein interaction networks; and (2) finding metabolic reactions regulated by the Snf1 kinase. In this study we present the design of a database system which offers an extensible environment to efficiently capture the majority of biological entities and relations encountered in systems biology. Critical functions and control processes were designed and implemented to ensure consistent, efficient, secure and reliable transactions. The two sample cases on the integrated yeast data clearly demonstrate the value of a single database environment for systems biology research.
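
    To give a flavor of the first sample case, detecting a signaling pathway in a protein interaction network amounts to a graph search. A minimal sketch with networkx follows; the edges are an illustrative subset of the yeast pheromone pathway, not the repository's actual schema or scripts.

        import networkx as nx

        # Toy protein-interaction network; edges are physical interactions.
        G = nx.Graph()
        G.add_edges_from([
            ("Ste2", "Gpa1"), ("Gpa1", "Ste4"), ("Ste4", "Ste5"),
            ("Ste5", "Ste11"), ("Ste11", "Ste7"), ("Ste7", "Fus3"),
            ("Ste5", "Fus3"), ("Fus3", "Ste12"),  # MAPK to transcription factor
        ])

        # "Detecting the pathway" here means finding a receptor-to-effector route.
        path = nx.shortest_path(G, source="Ste2", target="Ste12")
        print(" -> ".join(path))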

  15. Bayesian screening for active compounds in high-dimensional chemical spaces combining property descriptors and molecular fingerprints.

    PubMed

    Vogt, Martin; Bajorath, Jürgen

    2008-01-01

    Bayesian classifiers are increasingly being used to distinguish active from inactive compounds and search large databases for novel active molecules. We introduce an approach to directly combine the contributions of property descriptors and molecular fingerprints in the search for active compounds that is based on a Bayesian framework. Conventionally, property descriptors and fingerprints are used as alternative features for virtual screening methods. Following the approach introduced here, probability distributions of descriptor values and fingerprint bit settings are calculated for active and database molecules and the divergence between the resulting combined distributions is determined as a measure of biological activity. In test calculations on a large number of compound activity classes, this methodology was found to consistently perform better than similarity searching using fingerprints and multiple reference compounds or Bayesian screening calculations using probability distributions calculated only from property descriptors. These findings demonstrate that there is considerable synergy between different types of property descriptors and fingerprints in recognizing diverse structure-activity relationships, at least in the context of Bayesian modeling.
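
    A minimal sketch of the underlying idea follows, assuming Gaussian models for continuous descriptors and Bernoulli models for fingerprint bits; the paper's exact estimators and weighting may differ.

        import numpy as np

        def gaussian_kl(mu1, var1, mu2, var2, eps=1e-9):
            """KL divergence KL(N1 || N2) between two univariate Gaussians."""
            var1, var2 = var1 + eps, var2 + eps
            return 0.5 * np.log(var2 / var1) + (var1 + (mu1 - mu2) ** 2) / (2 * var2) - 0.5

        def bernoulli_kl(p, q, eps=1e-6):
            """Elementwise KL divergence between Bernoulli bit distributions."""
            p = np.clip(p, eps, 1 - eps)
            q = np.clip(q, eps, 1 - eps)
            return p * np.log(p / q) + (1 - p) * np.log((1 - p) / (1 - q))

        def activity_score(act_desc, db_desc, act_fp, db_fp):
            """Divergence of combined feature distributions, actives vs database.

            act_desc/db_desc: (n, d) continuous descriptor arrays;
            act_fp/db_fp: (n, b) binary fingerprint arrays.
            KL divergence adds over features treated as independent (naive
            Bayes view), so descriptor and fingerprint contributions combine
            by simple summation.
            """
            score = 0.0
            for j in range(act_desc.shape[1]):
                score += gaussian_kl(act_desc[:, j].mean(), act_desc[:, j].var(),
                                     db_desc[:, j].mean(), db_desc[:, j].var())
            score += bernoulli_kl(act_fp.mean(axis=0), db_fp.mean(axis=0)).sum()
            return float(score)

        # Tiny synthetic demo: actives shifted in descriptors and bit frequencies.
        rng = np.random.default_rng(0)
        act_desc = rng.normal(1.0, 1.0, size=(50, 3))
        db_desc = rng.normal(0.0, 1.0, size=(500, 3))
        act_fp = (rng.random((50, 8)) < 0.7).astype(int)
        db_fp = (rng.random((500, 8)) < 0.3).astype(int)
        print(activity_score(act_desc, db_desc, act_fp, db_fp))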

  16. Fish Karyome: A karyological information network database of Indian Fishes.

    PubMed

    Nagpure, Naresh Sahebrao; Pathak, Ajey Kumar; Pati, Rameshwar; Singh, Shri Prakash; Singh, Mahender; Sarkar, Uttam Kumar; Kushwaha, Basdeo; Kumar, Ravindra

    2012-01-01

    'Fish Karyome', a database of karyological information on Indian fishes, has been developed; it serves as a central source for karyotype data about Indian fishes compiled from the published literature. Fish Karyome is intended to serve as a liaison tool for researchers; it contains karyological information about 171 of the 2438 finfish species reported in India and is publicly available via the World Wide Web. The database provides information on chromosome number, morphology, sex chromosomes, karyotype formula, and cytogenetic markers. Additionally, it provides phenotypic information that includes the species name, its classification, locality of sample collection, common name, local name, sex, geographical distribution, and IUCN Red List status. Besides fish and karyotype images, references for the 171 finfish species have been included in the database. Fish Karyome has been developed using SQL Server 2008, a relational database management system, Microsoft's ASP.NET 2008, and Macromedia's Flash technology under the Windows 7 operating environment. The system also enables users to input new information and images into the database, and to search and view the information and images of interest using various search options. Fish Karyome has a wide range of applications in species characterization and identification, sex determination, chromosomal mapping, karyo-evolution, and the systematics of fishes.

  17. Database for the geologic map of the Mount Baker 30- by 60-minute quadrangle, Washington (I-2660)

    USGS Publications Warehouse

    Tabor, R.W.; Haugerud, R.A.; Hildreth, Wes; Brown, E.H.

    2006-01-01

    This digital map database has been prepared by R.W. Tabor from the published Geologic map of the Mount Baker 30- by 60-Minute Quadrangle, Washington. Together with the accompanying text files as PDF, it provides information on the geologic structure and stratigraphy of the area covered. The database delineates map units that are identified by general age and lithology following the stratigraphic nomenclature of the U.S. Geological Survey. The authors mapped most of the geology at 1:100,000. The Quaternary contacts and structural data have been much simplified for the 1:100,000-scale map and database. The spatial resolution (scale) of the database is 1:100,000 or smaller. This database depicts the distribution of geologic materials and structures at a regional (1:100,000) scale. The report is intended to provide geologic information for the regional study of materials properties, earthquake shaking, landslide potential, mineral hazards, seismic velocity, and earthquake faults. In addition, the report contains information and interpretations about the regional geologic history and framework. However, the regional scale of this report does not provide sufficient detail for site development purposes.

  18. GANESH: software for customized annotation of genome regions.

    PubMed

    Huntley, Derek; Hummerich, Holger; Smedley, Damian; Kittivoravitkul, Sasivimol; McCarthy, Mark; Little, Peter; Sergot, Marek

    2003-09-01

    GANESH is a software package designed to support the genetic analysis of regions of human and other genomes. It provides a set of components that may be assembled to construct a self-updating database of DNA sequence, mapping data, and annotations of possible genome features. Once one or more remote sources of data for the target region have been identified, all sequences for that region are downloaded, assimilated, and subjected to a (configurable) set of standard database-searching and genome-analysis packages. The results are stored in compressed form in a relational database, and are updated automatically on a regular schedule so that they are always immediately available in their most up-to-date versions. A Java front-end, executed as a stand-alone application or web applet, provides a graphical interface for navigating the database and for viewing the annotations. There are facilities for importing and exporting data in the format of the Distributed Annotation System (DAS), enabling a GANESH database to be used as a component of a DAS configuration. The system has been used to construct databases for about a dozen regions of human chromosomes and for three regions of mouse chromosomes.

  19. FishTraits: a database of ecological and life-history traits of freshwater fishes of the United States

    USGS Publications Warehouse

    Angermeier, Paul L.; Frimpong, Emmanuel A.

    2011-01-01

    The need for integrated and widely accessible sources of species traits data to facilitate studies of ecology, conservation, and management has motivated the development of traits databases for various taxa. In spite of the increasing number of traits-based analyses of freshwater fishes in the United States, no consolidated database of traits of this group exists publicly, and much useful information on these species is documented only in obscure sources. The largely inaccessible and unconsolidated traits information makes large-scale analysis involving many fishes and/or traits particularly challenging. We have compiled a database of more than 100 traits for 809 (731 native and 78 nonnative) fish species found in freshwaters of the conterminous United States, including 37 native families and 145 native genera. The database, named FishTraits, contains information on four major categories of traits: (1) trophic ecology; (2) body size, reproductive ecology, and life history; (3) habitat preferences; and (4) salinity and temperature tolerances. Information on geographic distribution and conservation status was also compiled. The database opens many opportunities for conducting research on fish species traits and constitutes the first step toward establishing a central repository for a continually expanding set of traits of North American fishes.

  20. Evolution of the use of relational and NoSQL databases in the ATLAS experiment

    NASA Astrophysics Data System (ADS)

    Barberis, D.

    2016-09-01

    The ATLAS experiment has for many years used a large database infrastructure based on Oracle to store several different types of non-event data: time-dependent detector configuration and conditions data, calibrations and alignments, configurations of Grid sites, catalogues for data management tools, job records for distributed workload management tools, and run and event metadata. The rapid development of "NoSQL" databases (structured storage services) in the last five years has allowed an extended and complementary usage of traditional relational databases and new structured storage tools, in order to improve the performance of existing applications and to extend their functionalities using the possibilities offered by modern storage systems. The trend is towards using the best tool for each kind of data: separating, for example, the intrinsically relational metadata from payload storage, and records that are frequently updated and benefit from transactions from archived information. Access to all components has to be orchestrated by specialised services that run on front-end machines and shield the user from the complexity of the data storage infrastructure. This paper describes this technology evolution in the ATLAS database infrastructure and presents a few examples of large database applications that benefit from it.

  1. Database for the geologic map of the Chelan 30-minute by 60-minute quadrangle, Washington (I-1661)

    USGS Publications Warehouse

    Tabor, R.W.; Frizzell, V.A.; Whetten, J.T.; Waitt, R.B.; Swanson, D.A.; Byerly, G.R.; Booth, D.B.; Hetherington, M.J.; Zartman, R.E.

    2006-01-01

    This digital map database has been prepared by R. W. Tabor from the published Geologic map of the Chelan 30-Minute Quadrangle, Washington. Together with the accompanying text files as PDF, it provides information on the geologic structure and stratigraphy of the area covered. The database delineates map units that are identified by general age and lithology following the stratigraphic nomenclature of the U.S. Geological Survey. The authors mapped most of the bedrock geology at 1:100,000 scale, but compiled Quaternary units at 1:24,000 scale. The Quaternary contacts and structural data have been much simplified for the 1:100,000-scale map and database. The spatial resolution (scale) of the database is 1:100,000 or smaller. This database depicts the distribution of geologic materials and structures at a regional (1:100,000) scale. The report is intended to provide geologic information for the regional study of materials properties, earthquake shaking, landslide potential, mineral hazards, seismic velocity, and earthquake faults. In addition, the report contains information and interpretations about the regional geologic history and framework. However, the regional scale of this report does not provide sufficient detail for site development purposes.

  2. Database for the geologic map of the Snoqualmie Pass 30-minute by 60-minute quadrangle, Washington (I-2538)

    USGS Publications Warehouse

    Tabor, R.W.; Frizzell, V.A.; Booth, D.B.; Waitt, R.B.

    2006-01-01

    This digital map database has been prepared by R.W. Tabor from the published Geologic map of the Snoqualmie Pass 30' X 60' Quadrangle, Washington. Together with the accompanying text files as PDF, it provides information on the geologic structure and stratigraphy of the area covered. The database delineates map units that are identified by general age and lithology following the stratigraphic nomenclature of the U.S. Geological Survey. The authors mapped most of the bedrock geology at 1:100,000 scale, but compiled Quaternary units at 1:24,000 scale. The Quaternary contacts and structural data have been much simplified for the 1:100,000-scale map and database. The spatial resolution (scale) of the database is 1:100,000 or smaller. This database depicts the distribution of geologic materials and structures at a regional (1:100,000) scale. The report is intended to provide geologic information for the regional study of materials properties, earthquake shaking, landslide potential, mineral hazards, seismic velocity, and earthquake faults. In addition, the report contains information and interpretations about the regional geologic history and framework. However, the regional scale of this report does not provide sufficient detail for site development purposes.

  3. Geologic Map of the Wenatchee 1:100,000 Quadrangle, Central Washington: A Digital Database

    USGS Publications Warehouse

    Tabor, R.W.; Waitt, R.B.; Frizzell, V.A.; Swanson, D.A.; Byerly, G.R.; Bentley, R.D.

    2005-01-01

    This digital map database has been prepared by R.W. Tabor from the published Geologic map of the Wenatchee 1:100,000 Quadrangle, Central Washington. Together with the accompanying text files as PDF, it provides information on the geologic structure and stratigraphy of the area covered. The database delineates map units that are identified by general age and lithology following the stratigraphic nomenclature of the U.S. Geological Survey. The authors mapped most of the bedrock geology at 1:100,000 scale, but compiled Quaternary units at 1:24,000 scale. The Quaternary contacts and structural data have been much simplified for the 1:100,000-scale map and database. The spatial resolution (scale) of the database is 1:100,000 or smaller. This database depicts the distribution of geologic materials and structures at a regional (1:100,000) scale. The report is intended to provide geologic information for the regional study of materials properties, earthquake shaking, landslide potential, mineral hazards, seismic velocity, and earthquake faults. In addition, the report contains information and interpretations about the regional geologic history and framework. However, the regional scale of this report does not provide sufficient detail for site development purposes.

  4. Electronic Publishing.

    ERIC Educational Resources Information Center

    Lancaster, F. W.

    1989-01-01

    Describes various stages involved in the applications of electronic media to the publishing industry. Highlights include computer typesetting, or photocomposition; machine-readable databases; the distribution of publications in electronic form; computer conferencing and electronic mail; collaborative authorship; hypertext; hypermedia publications;…

  5. State and Local Government Publications.

    ERIC Educational Resources Information Center

    Nakata, Yuri; Kopec, Karen

    1980-01-01

    Reviews trends in library programs for state and local government publications and documents the increased interest in microforms and databases. Discussion focuses on publication distribution and control, and efforts to support interstate networking. There are 28 references. (RAA)

  6. Reliability-based econometrics of aerospace structural systems: Design criteria and test options. Ph.D. Thesis - Georgia Inst. of Tech.

    NASA Technical Reports Server (NTRS)

    Thomas, J. M.; Hanagud, S.

    1974-01-01

    The design criteria and test options for aerospace structural reliability were investigated. A decision methodology was developed for selecting a combination of structural tests and structural design factors. The decision method involves the use of Bayesian statistics and statistical decision theory. Procedures are discussed for obtaining and updating data-based probabilistic strength distributions for aerospace structures when test information is available and for obtaining subjective distributions when data are not available. The techniques used in developing the distributions are explained.
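
    The Bayesian updating idea can be illustrated with a generic conjugate-prior sketch (not the dissertation's actual procedure): a Beta prior on a structural failure probability, reflecting subjective engineering judgment, updated with proof-test outcomes.

        from scipy.stats import beta

        # Prior belief about the failure probability: Beta(a, b), mean a / (a + b).
        a, b = 1.0, 19.0          # prior mean failure probability = 0.05

        # New test data: n proof tests with k failures.
        n, k = 10, 0

        # Conjugate update: the posterior is Beta(a + k, b + n - k).
        a_post, b_post = a + k, b + n - k

        print("prior mean:     ", a / (a + b))
        print("posterior mean: ", a_post / (a_post + b_post))
        # 95% credible upper bound on the failure probability after testing:
        print("95% upper bound:", beta.ppf(0.95, a_post, b_post))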

  7. Information specialist for a coming age (7)

    NASA Astrophysics Data System (ADS)

    Kishimoto, Tamotsu

    The present status and effective use of in-house data are described, using the case of Kokuyo as an example. The Integrated Distribution Information System, in which information on production, sales and distribution is integrated, and the databases loaded on it are introduced. The outlines of "KOPS" and "KROS", external systems connected with the above system, and how Kokuyo makes use of information obtained from them, are explained. Recently, Kokuyo has focused its efforts on selling goods directly to users, among its diversified distribution channels. The Customer Information System, which supports such sales activities, is also introduced.

  8. Development of a Dynamically Configurable, Object-Oriented Framework for Distributed, Multi-modal Computational Aerospace Systems Simulation

    NASA Technical Reports Server (NTRS)

    Afjeh, Abdollah A.; Reed, John A.

    2003-01-01

    The following reports are presented on this project: a first-year progress report on Development of a Dynamically Configurable, Object-Oriented Framework for Distributed, Multi-modal Computational Aerospace Systems Simulation; a second-year progress report on Development of a Dynamically Configurable, Object-Oriented Framework for Distributed, Multi-modal Computational Aerospace Systems Simulation; An Extensible, Interchangeable and Sharable Database Model for Improving Multidisciplinary Aircraft Design; Interactive, Secure Web-enabled Aircraft Engine Simulation Using XML Databinding Integration; and Improving the Aircraft Design Process Using Web-based Modeling and Simulation.

  9. LHCb Conditions database operation assistance systems

    NASA Astrophysics Data System (ADS)

    Clemencic, M.; Shapoval, I.; Cattaneo, M.; Degaudenzi, H.; Santinelli, R.

    2012-12-01

    The Conditions Database (CondDB) of the LHCb experiment provides versioned, time-dependent geometry and conditions data for all LHCb data processing applications (simulation, high-level trigger (HLT), reconstruction, analysis) in a heterogeneous computing environment ranging from user laptops to the HLT farm and the Grid. These different use cases impose front-end support for multiple database technologies (Oracle and SQLite are used). Sophisticated distribution tools are required to ensure timely and robust delivery of updates to all environments. The content of the database has to be managed to ensure that updates are internally consistent and externally compatible with multiple versions of the physics application software. In this paper we describe three systems that we have developed to address these issues. The first is a CondDB state-tracking extension to the Oracle 3D Streams replication technology, to trap cases when the CondDB replication is corrupted. The second is an automated distribution system for the SQLite-based CondDB, which also provides smart backup mechanisms for the CondDB managers and checkout mechanisms for LHCb users. The third is a system to verify and monitor the internal (CondDB self-consistency) and external (LHCb physics software vs. CondDB) compatibility. The former two systems are used in production in the LHCb experiment and have achieved the desired goal of higher flexibility and robustness for the management and operation of the CondDB. The latter has been fully designed and is currently moving to the implementation stage.
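
    Conditions databases key their payloads on intervals of validity; the basic access pattern can be sketched with Python's sqlite3 as follows. This is a toy schema, not the actual LHCb CondDB layout.

        import sqlite3

        conn = sqlite3.connect(":memory:")
        conn.execute("""
            CREATE TABLE conditions (
                tag     TEXT,    -- version of the conditions data
                since   INTEGER, -- start of the interval of validity
                until   INTEGER, -- end of the interval of validity
                payload TEXT     -- the actual geometry/alignment data
            )
        """)
        rows = [
            ("v1", 0,   100, "alignment-A"),
            ("v1", 100, 200, "alignment-B"),
            ("v2", 0,   200, "alignment-C"),  # a newer, reprocessed version
        ]
        conn.executemany("INSERT INTO conditions VALUES (?, ?, ?, ?)", rows)

        def lookup(tag: str, event_time: int) -> str:
            """Return the payload valid at event_time for a given tag."""
            cur = conn.execute(
                "SELECT payload FROM conditions "
                "WHERE tag = ? AND since <= ? AND ? < until",
                (tag, event_time, event_time),
            )
            return cur.fetchone()[0]

        print(lookup("v1", 150))  # alignment-B
        print(lookup("v2", 150))  # alignment-C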

  10. Geologic map of Yosemite National Park and vicinity, California

    USGS Publications Warehouse

    Huber, N.K.; Bateman, P.C.; Wahrhaftig, Clyde

    1989-01-01

    This digital map database represents the general distribution of bedrock and surficial deposits of the Yosemite National Park vicinity. It was produced directly from the file used to create the print version in 1989. The Yosemite National Park region comprises portions of fifteen 7.5-minute quadrangles. The original publication of the map in 1989 included the map, described map units and provided correlations, as well as a geologic summary and references, all on the same sheet. The database delineates map units that are identified by general age and lithology following the stratigraphic nomenclature of the U.S. Geological Survey. The scale of the source maps limits the spatial resolution (scale) of the database to 1:125,000 or smaller.

  11. Cataloging and indexing - The development of the Space Shuttle mission data base and catalogs from earth observations hand-held photography

    NASA Technical Reports Server (NTRS)

    Nelson, Raymond M.; Willis, Kimberly J.; Daley, William J.; Brumbaugh, Fred R.; Bremer, Jeffrey M.

    1992-01-01

    All earth-looking photographs acquired by Space Shuttle astronauts are identified, located, and catalogued after each mission. The photographs have been entered into a computerized database at the NASA Johnson Space Center. The database in its two modes - computer and catalog - is organized and presented to provide a scope and level of detail designed to be useful in Earth science activities, resource management, environmental studies, and public affairs. The computerized database can be accessed free through standard communication networks 24 hours a day, and the catalogs are distributed throughout the world. Photograph viewing centers are available in the United States, and photographic copies can be obtained through government-supported centers.

  12. Marfan Database (second edition): software and database for the analysis of mutations in the human FBN1 gene.

    PubMed Central

    Collod-Béroud, G; Béroud, C; Adès, L; Black, C; Boxer, M; Brock, D J; Godfrey, M; Hayward, C; Karttunen, L; Milewicz, D; Peltonen, L; Richards, R I; Wang, M; Junien, C; Boileau, C

    1997-01-01

    Fibrillin is the major component of extracellular microfibrils. Mutations in the fibrillin gene on chromosome 15 (FBN1) were first described in the heritable connective tissue disorder Marfan syndrome (MFS). More recently, FBN1 has also been shown to harbor mutations related to a spectrum of conditions phenotypically related to MFS. These mutations are private, essentially missense, generally non-recurrent and widely distributed throughout the gene. To date, no clear genotype/phenotype relationship has been observed, except for the localization of neonatal mutations in a cluster between exons 24 and 32. The second version of the computerized Marfan database contains 89 entries. The software has been modified to accommodate new functions and routines. PMID:9016526

  13. Generation of the Ares I-X Flight Test Vehicle Aerodynamic Data Book and Comparison To Flight

    NASA Technical Reports Server (NTRS)

    Bauer, Steven X.; Krist, Steven E.; Compton, William B.

    2011-01-01

    A 3.5-year effort to characterize the aerodynamic behavior of the Ares I-X Flight Test Vehicle (AIX FTV) is described in this paper. The AIX FTV was designed to be representative of the Ares I Crew Launch Vehicle (CLV). While there are several differences in the outer mold line from the current revision of the CLV, the overall length, mass distribution, and flight systems of the two vehicles are very similar. This paper briefly touches on each of the aerodynamic databases developed in the program, describing the methodology employed, experimental and computational contributions to the generation of the databases, and how well the databases and underlying computations compare to actual flight test results.

  14. Information integration for a sky survey by data warehousing

    NASA Astrophysics Data System (ADS)

    Luo, A.; Zhang, Y.; Zhao, Y.

    The virtualization service of the data system for the sky survey LAMOST is very important for astronomers. The service needs to integrate information from data collections, catalogs and references, and to support simple federation of a set of distributed files and associated metadata. Data warehousing has been in existence for several years and has demonstrated superiority over traditional relational database management systems by providing novel indexing schemes that support efficient on-line analytical processing (OLAP) of large databases. Now relational database systems such as Oracle support the warehouse capability, including extensions to the SQL language to support OLAP operations, and a number of metadata management tools have been created. The information integration of LAMOST by applying data warehousing is intended to effectively provide data and knowledge on-line.
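
    OLAP operations are essentially multi-dimensional aggregations; the roll-up style of query such warehouses accelerate can be sketched with pandas. The catalog fields below are invented for illustration, not LAMOST's actual schema.

        import pandas as pd

        # Toy survey catalog: one row per observed object.
        catalog = pd.DataFrame({
            "sky_region": ["N1", "N1", "N2", "N2", "N2", "S1"],
            "object_type": ["star", "galaxy", "star", "star", "qso", "galaxy"],
            "magnitude": [14.2, 17.8, 13.1, 15.6, 18.9, 16.4],
        })

        # An OLAP-style cube: counts and mean magnitude per region x object type.
        cube = catalog.pivot_table(
            index="sky_region",
            columns="object_type",
            values="magnitude",
            aggfunc=["count", "mean"],
        )
        print(cube)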

  15. USBombus, a database of contemporary survey data for North American Bumble Bees (Hymenoptera, Apidae, Bombus) distributed in the United States.

    PubMed

    Koch, Jonathan B; Lozier, Jeffrey; Strange, James P; Ikerd, Harold; Griswold, Terry; Cordes, Nils; Solter, Leellen; Stewart, Isaac; Cameron, Sydney A

    2015-01-01

    Bumble bees (Hymenoptera: Apidae, Bombus) are pollinators of wild and economically important flowering plants. However, at least four bumble bee species have declined significantly in population abundance and geographic range relative to historic estimates, and one species is possibly extinct. While a wealth of historic data is now available in online databases for many of the North American species found to be in decline, systematic survey data on stable species is still not publicly available. The availability of contemporary survey data is critically important for the future monitoring of wild bumble bee populations; without such data, ascertaining the conservation status of bumble bees in the United States will remain challenging. This paper describes USBombus, a large database that represents the outcomes of one of the largest standardized surveys of bumble bee pollinators (Hymenoptera, Apidae, Bombus) globally. The motivation to collect live bumble bees across the United States was to examine the decline and conservation status of Bombus affinis, B. occidentalis, B. pensylvanicus, and B. terricola. Prior to our national survey of bumble bees in the United States from 2007 to 2010, there had been only regional accounts of bumble bee abundance and richness. In addition to surveying declining bumble bees, we also collected and documented a diversity of co-occurring bumble bees; until now, however, their distribution and diversity had not been completely reported on a public online platform. Here, for the first time, we report the geographic distribution of bumble bees reported to be in decline (Cameron et al. 2011), as well as bumble bees that appeared to be stable on a large geographic scale in the United States (not in decline). In this database we report a total of 17,930 adult occurrence records across 397 locations and 39 species of Bombus detected in our national survey. We summarize their abundance and distribution across the United States and their associations with different ecoregions. The geospatial coverage of the dataset extends across 41 of the 50 US states, and from 0 to 3500 m a.s.l. The authors and their field crews spent a total of 512 hours surveying bumble bees from 2007 to 2010. The dataset was developed using SQL Server 2008 R2. For each specimen, the following information is generally provided: species name, sex, caste, temporal and geospatial details, Cartesian coordinates, data collector(s), and, when available, host plants. This database has already proven useful for a variety of studies on bumble bee ecology and conservation. Considering the value of pollinators in agriculture and wild ecosystems, this large database of bumble bees will likely prove useful for investigations of the effects of anthropogenic activities on pollinator community composition and conservation status.
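
    Given such a table of occurrence records, typical conservation queries reduce to filters and group-bys; a brief pandas sketch follows. The column names are invented to mirror the fields listed above, and the records are made up.

        import pandas as pd

        # Toy occurrence records mirroring the database fields described above.
        records = pd.DataFrame({
            "species": ["Bombus affinis", "Bombus occidentalis", "Bombus impatiens",
                        "Bombus occidentalis", "Bombus affinis"],
            "state": ["WI", "WA", "IL", "OR", "MN"],
            "elevation_m": [250, 1200, 180, 900, 300],
            "year": [2008, 2009, 2007, 2010, 2009],
        })

        # How many records per species flagged as declining?
        declining = ["Bombus affinis", "Bombus occidentalis",
                     "Bombus pensylvanicus", "Bombus terricola"]
        subset = records[records["species"].isin(declining)]
        print(subset.groupby("species").size())

        # Records of declining species above 1000 m a.s.l.
        print(subset[subset["elevation_m"] > 1000])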

  16. Very large database of lipids: rationale and design.

    PubMed

    Martin, Seth S; Blaha, Michael J; Toth, Peter P; Joshi, Parag H; McEvoy, John W; Ahmed, Haitham M; Elshazly, Mohamed B; Swiger, Kristopher J; Michos, Erin D; Kwiterovich, Peter O; Kulkarni, Krishnaji R; Chimera, Joseph; Cannon, Christopher P; Blumenthal, Roger S; Jones, Steven R

    2013-11-01

    Blood lipids have major cardiovascular and public health implications. Lipid-lowering drugs are prescribed based in part on categorization of patients into normal or abnormal lipid metabolism, yet relatively little emphasis has been placed on: (1) the accuracy of current lipid measures used in clinical practice, (2) the reliability of current categorizations of dyslipidemia states, and (3) the relationship of advanced lipid characterization to other cardiovascular disease biomarkers. To these ends, we developed the Very Large Database of Lipids (NCT01698489), an ongoing database protocol that harnesses deidentified data from the daily operations of a commercial lipid laboratory. The database includes individuals who were referred for clinical purposes for a Vertical Auto Profile (Atherotech Inc., Birmingham, AL), which directly measures cholesterol concentrations of low-density lipoprotein, very low-density lipoprotein, intermediate-density lipoprotein, high-density lipoprotein, their subclasses, and lipoprotein(a). Individual Very Large Database of Lipids studies, ranging from studies of measurement accuracy, to dyslipidemia categorization, to biomarker associations, to characterization of rare lipid disorders, are investigator-initiated and utilize peer-reviewed statistical analysis plans to address a priori hypotheses/aims. In the first database harvest (Very Large Database of Lipids 1.0) from 2009 to 2011, there were 1,340,614 adult and 10,294 pediatric patients; the adult sample had a median age of 59 years (interquartile range, 49-70 years) with even representation by sex. Lipid distributions closely matched those from the population-representative National Health and Nutrition Examination Survey. The second harvest of the database (Very Large Database of Lipids 2.0) is underway. Overall, the Very Large Database of Lipids database provides an opportunity for collaboration and new knowledge generation through careful examination of granular lipid data on a large scale. © 2013 Wiley Periodicals, Inc.

  17. Results on three predictions for July 2012 federal elections in Mexico based on past regularities.

    PubMed

    Hernández-Saldaña, H

    2013-01-01

    The presidential election in Mexico of July 2012 was the third time that the PREP, the Preliminary Electoral Results Program, has operated. The PREP reports voting outcomes based on the electoral certificates of each polling station as they arrive at capture centers. In previous elections, some statistical regularities had been observed; three of them were selected to make predictions, which were published in arXiv:1207.0078 [physics.soc-ph]. Using the database made public in July 2012, two of the predictions were completely fulfilled, while the third was measured and confirmed using the database obtained upon request to the electoral authorities. The first two predictions confirmed by actual measurements are: (ii) the Partido Revolucionario Institucional, PRI, is a sprinter and performs better in polling stations arriving late at capture centers during the process; (iii) the distribution of this party's vote is well described by a smooth function named a Daisy model; a Gamma distribution, also compatible with a Daisy model, fits the distribution as well. The third prediction confirms that errare humanum est, since the error distributions of all the self-consistency variables appeared as a central power law with lateral lobes, as in the 2000 and 2006 electoral processes.
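
    To show what fitting such a distribution involves, here is a brief sketch with scipy on synthetic data; the numbers are stand-ins, not the actual PREP database.

        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(42)
        # Synthetic per-polling-station vote counts standing in for real data.
        votes = rng.gamma(shape=4.0, scale=30.0, size=5000)

        # Fit a Gamma distribution; loc is fixed at 0 since counts are non-negative.
        shape, loc, scale = stats.gamma.fit(votes, floc=0)
        print(f"fitted shape={shape:.2f}, scale={scale:.2f}")

        # Goodness of fit via the Kolmogorov-Smirnov test.
        stat, pvalue = stats.kstest(votes, "gamma", args=(shape, loc, scale))
        print(f"KS statistic={stat:.4f}, p-value={pvalue:.3f}")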

  18. Vulnerability of freshwater native biodiversity to non-native ...

    EPA Pesticide Factsheets

    Background/Question/Methods Non-native species pose one of the greatest threats to native biodiversity. The literature provides plentiful empirical and anecdotal evidence of this phenomenon; however, such evidence is limited to local or regional scales. Employing geospatial analyses, we investigate the potential threat of non-native species to threatened and endangered aquatic animal taxa inhabiting unprotected areas across the continental US. We compiled distribution information from existing publicly available databases at the watershed scale (12-digit hydrologic unit code). We mapped non-native aquatic plant and animal species richness, and an index of cumulative invasion pressure, which weights non-native richness by the time since invasion of each species. These distributions were compared to the distributions of native aquatic taxa (fish, amphibians, mollusks, and decapods) from the International Union for the Conservation of Nature (IUCN) database. We mapped the proportion of species listed by IUCN as threatened and endangered, and a species rarity index per watershed. An overlay analysis identified watersheds experiencing high pressure from non-native species and also containing high proportions of threatened and endangered species or exhibiting high species rarity. Conservation priorities were identified by generating priority indices from these overlays and mapping them relative to the distribution of protected areas across the US. Results/Conclusion

  19. Distributed spatial information integration based on web service

    NASA Astrophysics Data System (ADS)

    Tong, Hengjian; Zhang, Yun; Shao, Zhenfeng

    2008-10-01

    Spatial information systems and spatial information in different geographic locations usually belong to different organizations. They are distributed and often heterogeneous and independent from each other. This leads to the formation of many isolated spatial information islands, reducing the efficiency of information utilization. In order to address this issue, we present a method for effective spatial information integration based on web services. The method applies asynchronous invocation and dynamic invocation of web services to implement distributed, parallel execution of web map services. All isolated information islands are connected by the dispatcher of web services and its registration database to form a uniform collaborative system. According to the web service registration database, the dispatcher of web services can dynamically invoke each web map service through an asynchronous delegating mechanism. All of the web map services can be executed at the same time. When each web map service is done, an image is returned to the dispatcher. After all of the web services are done, all images are transparently overlaid together in the dispatcher. Thus, users can browse and analyze the integrated spatial information. Experiments demonstrate that the utilization rate of spatial information resources is significantly raised through the proposed method of distributed spatial information integration.
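
    The dispatcher pattern described above (asynchronous, parallel invocation of several map services followed by an overlay step) can be sketched with Python's asyncio. The service URLs and the overlay step are placeholders; the paper's implementation uses web-service middleware rather than this toy.

        import asyncio

        # Placeholder endpoints standing in for distributed web map services.
        SERVICES = [
            "http://example.org/wms/roads",
            "http://example.org/wms/rivers",
            "http://example.org/wms/landuse",
        ]

        async def fetch_layer(url: str) -> str:
            """Invoke one map service asynchronously (simulated network delay)."""
            await asyncio.sleep(0.1)     # stands in for the real service call
            return f"image from {url}"

        async def dispatcher() -> list[str]:
            # All map services are invoked at the same time, as described above:
            # each returns an image when done, and results are gathered for overlay.
            return await asyncio.gather(*(fetch_layer(u) for u in SERVICES))

        layers = asyncio.run(dispatcher())
        print("overlaying:", layers)     # stand-in for transparent image overlay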

  20. Distributed spatial information integration based on web service

    NASA Astrophysics Data System (ADS)

    Tong, Hengjian; Zhang, Yun; Shao, Zhenfeng

    2009-10-01

    Spatial information systems and spatial information in different geographic locations usually belong to different organizations. They are distributed and often heterogeneous and independent from each other. This leads to the formation of many isolated spatial information islands, reducing the efficiency of information utilization. In order to address this issue, we present a method for effective spatial information integration based on web services. The method applies asynchronous invocation and dynamic invocation of web services to implement distributed, parallel execution of web map services. All isolated information islands are connected by the dispatcher of web services and its registration database to form a uniform collaborative system. According to the web service registration database, the dispatcher of web services can dynamically invoke each web map service through an asynchronous delegating mechanism. All of the web map services can be executed at the same time. When each web map service is done, an image is returned to the dispatcher. After all of the web services are done, all images are transparently overlaid together in the dispatcher. Thus, users can browse and analyze the integrated spatial information. Experiments demonstrate that the utilization rate of spatial information resources is significantly raised through the proposed method of distributed spatial information integration.
