Sample records for database retrieval system

  1. Kingfisher: a system for remote sensing image database management

    NASA Astrophysics Data System (ADS)

    Bruzzo, Michele; Giordano, Ferdinando; Dellepiane, Silvana G.

    2003-04-01

    At present retrieval methods in remote sensing image database are mainly based on spatial-temporal information. The increasing amount of images to be collected by the ground station of earth observing systems emphasizes the need for database management with intelligent data retrieval capabilities. The purpose of the proposed method is to realize a new content based retrieval system for remote sensing images database with an innovative search tool based on image similarity. This methodology is quite innovative for this application, at present many systems exist for photographic images, as for example QBIC and IKONA, but they are not able to extract and describe properly remote image content. The target database is set by an archive of images originated from an X-SAR sensor (spaceborne mission, 1994). The best content descriptors, mainly texture parameters, guarantees high retrieval performances and can be extracted without losses independently of image resolution. The latter property allows DBMS (Database Management System) to process low amount of information, as in the case of quick-look images, improving time performance and memory access without reducing retrieval accuracy. The matching technique has been designed to enable image management (database population and retrieval) independently of dimensions (width and height). Local and global content descriptors are compared, during retrieval phase, with the query image and results seem to be very encouraging.

  2. New model for distributed multimedia databases and its application to networking of museums

    NASA Astrophysics Data System (ADS)

    Kuroda, Kazuhide; Komatsu, Naohisa; Komiya, Kazumi; Ikeda, Hiroaki

    1998-02-01

    This paper proposes a new distributed multimedia data base system where the databases storing MPEG-2 videos and/or super high definition images are connected together through the B-ISDN's, and also refers to an example of the networking of museums on the basis of the proposed database system. The proposed database system introduces a new concept of the 'retrieval manager' which functions an intelligent controller so that the user can recognize a set of image databases as one logical database. A user terminal issues a request to retrieve contents to the retrieval manager which is located in the nearest place to the user terminal on the network. Then, the retrieved contents are directly sent through the B-ISDN's to the user terminal from the server which stores the designated contents. In this case, the designated logical data base dynamically generates the best combination of such a retrieving parameter as a data transfer path referring to directly or data on the basis of the environment of the system. The generated retrieving parameter is then executed to select the most suitable data transfer path on the network. Therefore, the best combination of these parameters fits to the distributed multimedia database system.

  3. Nonmaterialized Relations and the Support of Information Retrieval Applications by Relational Database Systems.

    ERIC Educational Resources Information Center

    Lynch, Clifford A.

    1991-01-01

    Describes several aspects of the problem of supporting information retrieval system query requirements in the relational database management system (RDBMS) environment and proposes an extension to query processing called nonmaterialized relations. User interactions with information retrieval systems are discussed, and nonmaterialized relations are…

  4. Integration of Information Retrieval and Database Management Systems.

    ERIC Educational Resources Information Center

    Deogun, Jitender S.; Raghavan, Vijay V.

    1988-01-01

    Discusses the motivation for integrating information retrieval and database management systems, and proposes a probabilistic retrieval model in which records in a file may be composed of attributes (formatted data items) and descriptors (content indicators). The details and resolutions of difficulties involved in integrating such systems are…

  5. A Parallel Relational Database Management System Approach to Relevance Feedback in Information Retrieval.

    ERIC Educational Resources Information Center

    Lundquist, Carol; Frieder, Ophir; Holmes, David O.; Grossman, David

    1999-01-01

    Describes a scalable, parallel, relational database-drive information retrieval engine. To support portability across a wide range of execution environments, all algorithms adhere to the SQL-92 standard. By incorporating relevance feedback algorithms, accuracy is enhanced over prior database-driven information retrieval efforts. Presents…

  6. Rapid storage and retrieval of genomic intervals from a relational database system using nested containment lists

    PubMed Central

    Wiley, Laura K.; Sivley, R. Michael; Bush, William S.

    2013-01-01

    Efficient storage and retrieval of genomic annotations based on range intervals is necessary, given the amount of data produced by next-generation sequencing studies. The indexing strategies of relational database systems (such as MySQL) greatly inhibit their use in genomic annotation tasks. This has led to the development of stand-alone applications that are dependent on flat-file libraries. In this work, we introduce MyNCList, an implementation of the NCList data structure within a MySQL database. MyNCList enables the storage, update and rapid retrieval of genomic annotations from the convenience of a relational database system. Range-based annotations of 1 million variants are retrieved in under a minute, making this approach feasible for whole-genome annotation tasks. Database URL: https://github.com/bushlab/mynclist PMID:23894185

  7. Rapid storage and retrieval of genomic intervals from a relational database system using nested containment lists.

    PubMed

    Wiley, Laura K; Sivley, R Michael; Bush, William S

    2013-01-01

    Efficient storage and retrieval of genomic annotations based on range intervals is necessary, given the amount of data produced by next-generation sequencing studies. The indexing strategies of relational database systems (such as MySQL) greatly inhibit their use in genomic annotation tasks. This has led to the development of stand-alone applications that are dependent on flat-file libraries. In this work, we introduce MyNCList, an implementation of the NCList data structure within a MySQL database. MyNCList enables the storage, update and rapid retrieval of genomic annotations from the convenience of a relational database system. Range-based annotations of 1 million variants are retrieved in under a minute, making this approach feasible for whole-genome annotation tasks. Database URL: https://github.com/bushlab/mynclist.

  8. NATIONAL PESTICIDE INFORMATION RETRIEVAL SYSTEM (NPIRS)

    EPA Science Inventory

    The National Pesticide Information Retrieval System (NPIRS) is a collection of pesticide-related databases available through subscription to the Center for Environmental and Regulatory Information Systems, CERIS. The following is a summary of data found in the databases, data sou...

  9. How I do it: a practical database management system to assist clinical research teams with data collection, organization, and reporting.

    PubMed

    Lee, Howard; Chapiro, Julius; Schernthaner, Rüdiger; Duran, Rafael; Wang, Zhijun; Gorodetski, Boris; Geschwind, Jean-François; Lin, MingDe

    2015-04-01

    The objective of this study was to demonstrate that an intra-arterial liver therapy clinical research database system is a more workflow efficient and robust tool for clinical research than a spreadsheet storage system. The database system could be used to generate clinical research study populations easily with custom search and retrieval criteria. A questionnaire was designed and distributed to 21 board-certified radiologists to assess current data storage problems and clinician reception to a database management system. Based on the questionnaire findings, a customized database and user interface system were created to perform automatic calculations of clinical scores including staging systems such as the Child-Pugh and Barcelona Clinic Liver Cancer, and facilitates data input and output. Questionnaire participants were favorable to a database system. The interface retrieved study-relevant data accurately and effectively. The database effectively produced easy-to-read study-specific patient populations with custom-defined inclusion/exclusion criteria. The database management system is workflow efficient and robust in retrieving, storing, and analyzing data. Copyright © 2015 AUR. Published by Elsevier Inc. All rights reserved.

  10. Application of new type of distributed multimedia databases to networked electronic museum

    NASA Astrophysics Data System (ADS)

    Kuroda, Kazuhide; Komatsu, Naohisa; Komiya, Kazumi; Ikeda, Hiroaki

    1999-01-01

    Recently, various kinds of multimedia application systems have actively been developed based on the achievement of advanced high sped communication networks, computer processing technologies, and digital contents-handling technologies. Under this background, this paper proposed a new distributed multimedia database system which can effectively perform a new function of cooperative retrieval among distributed databases. The proposed system introduces a new concept of 'Retrieval manager' which functions as an intelligent controller so that the user can recognize a set of distributed databases as one logical database. The logical database dynamically generates and performs a preferred combination of retrieving parameters on the basis of both directory data and the system environment. Moreover, a concept of 'domain' is defined in the system as a managing unit of retrieval. The retrieval can effectively be performed by cooperation of processing among multiple domains. Communication language and protocols are also defined in the system. These are used in every action for communications in the system. A language interpreter in each machine translates a communication language into an internal language used in each machine. Using the language interpreter, internal processing, such internal modules as DBMS and user interface modules can freely be selected. A concept of 'content-set' is also introduced. A content-set is defined as a package of contents. Contents in the content-set are related to each other. The system handles a content-set as one object. The user terminal can effectively control the displaying of retrieved contents, referring to data indicating the relation of the contents in the content- set. In order to verify the function of the proposed system, a networked electronic museum was experimentally built. The results of this experiment indicate that the proposed system can effectively retrieve the objective contents under the control to a number of distributed domains. The result also indicate that the system can effectively work even if the system becomes large.

  11. (BARS) -- Bibliographic Retrieval System Sandia Shock Compression (SSC) database Shock Physics Index (SPHINX) database. Volume 1: UNIX version query guide customized application for INGRES

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Herrmann, W.; von Laven, G.M.; Parker, T.

    1993-09-01

    The Bibliographic Retrieval System (BARS) is a data base management system specially designed to retrieve bibliographic references. Two databases are available, (i) the Sandia Shock Compression (SSC) database which contains over 5700 references to the literature related to stress waves in solids and their applications, and (ii) the Shock Physics Index (SPHINX) which includes over 8000 further references to stress waves in solids, material properties at intermediate and low rates, ballistic and hypervelocity impact, and explosive or shock fabrication methods. There is some overlap in the information in the two data bases.

  12. Mobile object retrieval in server-based image databases

    NASA Astrophysics Data System (ADS)

    Manger, D.; Pagel, F.; Widak, H.

    2013-05-01

    The increasing number of mobile phones equipped with powerful cameras leads to huge collections of user-generated images. To utilize the information of the images on site, image retrieval systems are becoming more and more popular to search for similar objects in an own image database. As the computational performance and the memory capacity of mobile devices are constantly increasing, this search can often be performed on the device itself. This is feasible, for example, if the images are represented with global image features or if the search is done using EXIF or textual metadata. However, for larger image databases, if multiple users are meant to contribute to a growing image database or if powerful content-based image retrieval methods with local features are required, a server-based image retrieval backend is needed. In this work, we present a content-based image retrieval system with a client server architecture working with local features. On the server side, the scalability to large image databases is addressed with the popular bag-of-word model with state-of-the-art extensions. The client end of the system focuses on a lightweight user interface presenting the most similar images of the database highlighting the visual information which is common with the query image. Additionally, new images can be added to the database making it a powerful and interactive tool for mobile contentbased image retrieval.

  13. A proposal of fuzzy connective with learning function and its application to fuzzy retrieval system

    NASA Technical Reports Server (NTRS)

    Hayashi, Isao; Naito, Eiichi; Ozawa, Jun; Wakami, Noboru

    1993-01-01

    A new fuzzy connective and a structure of network constructed by fuzzy connectives are proposed to overcome a drawback of conventional fuzzy retrieval systems. This network represents a retrieval query and the fuzzy connectives in networks have a learning function to adjust its parameters by data from a database and outputs of a user. The fuzzy retrieval systems employing this network are also constructed. Users can retrieve results even with a query whose attributes do not exist in a database schema and can get satisfactory results for variety of thinkings by learning function.

  14. SLIMMER--A UNIX System-Based Information Retrieval System.

    ERIC Educational Resources Information Center

    Waldstein, Robert K.

    1988-01-01

    Describes an information retrieval system developed at Bell Laboratories to create and maintain a variety of different but interrelated databases, and to provide controlled access to these databases. The components discussed include the interfaces, indexing rules, display languages, response time, and updating procedures of the system. (6 notes…

  15. Materials properties numerical database system established and operational at CINDAS/Purdue University

    NASA Technical Reports Server (NTRS)

    Ho, C. Y.; Li, H. H.

    1989-01-01

    A computerized comprehensive numerical database system on the mechanical, thermophysical, electronic, electrical, magnetic, optical, and other properties of various types of technologically important materials such as metals, alloys, composites, dielectrics, polymers, and ceramics has been established and operational at the Center for Information and Numerical Data Analysis and Synthesis (CINDAS) of Purdue University. This is an on-line, interactive, menu-driven, user-friendly database system. Users can easily search, retrieve, and manipulate the data from the database system without learning special query language, special commands, standardized names of materials, properties, variables, etc. It enables both the direct mode of search/retrieval of data for specified materials, properties, independent variables, etc., and the inverted mode of search/retrieval of candidate materials that meet a set of specified requirements (which is the computer-aided materials selection). It enables also tabular and graphical displays and on-line data manipulations such as units conversion, variables transformation, statistical analysis, etc., of the retrieved data. The development, content, accessibility, etc., of the database system are presented and discussed.

  16. Documentation of a spatial data-base management system for monitoring pesticide application in Washington

    USGS Publications Warehouse

    Schurr, K.M.; Cox, S.E.

    1994-01-01

    The Pesticide-Application Data-Base Management System was created as a demonstration project and was tested with data submitted to the Washington State Department of Agriculture by pesticide applicators from a small geographic area. These data were entered into the Department's relational data-base system and uploaded into the system's ARC/INFO files. Locations for pesticide applica- tions are assigned within the Public Land Survey System grids, and ARC/INFO programs in the Pesticide-Application Data-Base Management System can subdivide each survey section into sixteen idealized quarter-quarter sections for display map grids. The system provides data retrieval and geographic information system plotting capabilities from a menu of seven basic retrieval options. Additionally, ARC/INFO coverages can be created from the retrieved data when required for particular applications. The Pesticide-Application Data-Base Management System, or the general principles used in the system, could be adapted to other applica- tions or to other states.

  17. JANE, A new information retrieval system for the Radiation Shielding Information Center

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Trubey, D.K.

    A new information storage and retrieval system has been developed for the Radiation Shielding Information Center (RSIC) at Oak Ridge National Laboratory to replace mainframe systems that have become obsolete. The database contains citations and abstracts of literature which were selected by RSIC analysts and indexed with terms from a controlled vocabulary. The database, begun in 1963, has been maintained continuously since that time. The new system, called JANE, incorporates automatic indexing techniques and on-line retrieval using the RSIC Data General Eclipse MV/4000 minicomputer, Automatic indexing and retrieval techniques based on fuzzy-set theory allow the presentation of results in ordermore » of Retrieval Status Value. The fuzzy-set membership function depends on term frequency in the titles and abstracts and on Term Discrimination Values which indicate the resolving power of the individual terms. These values are determined by the Cover Coefficient method. The use of a commercial database base to store and retrieve the indexing information permits rapid retrieval of the stored documents. Comparisons of the new and presently-used systems for actual searches of the literature indicate that it is practical to replace the mainframe systems with a minicomputer system similar to the present version of JANE. 18 refs., 10 figs.« less

  18. A searching and reporting system for relational databases using a graph-based metadata representation.

    PubMed

    Hewitt, Robin; Gobbi, Alberto; Lee, Man-Ling

    2005-01-01

    Relational databases are the current standard for storing and retrieving data in the pharmaceutical and biotech industries. However, retrieving data from a relational database requires specialized knowledge of the database schema and of the SQL query language. At Anadys, we have developed an easy-to-use system for searching and reporting data in a relational database to support our drug discovery project teams. This system is fast and flexible and allows users to access all data without having to write SQL queries. This paper presents the hierarchical, graph-based metadata representation and SQL-construction methods that, together, are the basis of this system's capabilities.

  19. Term Relevance Feedback and Mediated Database Searching: Implications for Information Retrieval Practice and Systems Design.

    ERIC Educational Resources Information Center

    Spink, Amanda

    1995-01-01

    This study uses the human approach to examine the sources and effectiveness of search terms selected during 40 mediated interactive database searches and focuses on determining the retrieval effectiveness of search terms identified by users and intermediaries from retrieved items during term relevance feedback. (Author/JKP)

  20. WWW Entrez: A Hypertext Retrieval Tool for Molecular Biology.

    ERIC Educational Resources Information Center

    Epstein, Jonathan A.; Kans, Jonathan A.; Schuler, Gregory D.

    This article describes the World Wide Web (WWW) Entrez server which is based upon the National Center for Biotechnology Information's (NCBI) Entrez retrieval database and software. Entrez is a molecular sequence retrieval system that contains an integrated view of portions of Medline and all publicly available nucleotide and protein databases,…

  1. Developing a Large Lexical Database for Information Retrieval, Parsing, and Text Generation Systems.

    ERIC Educational Resources Information Center

    Conlon, Sumali Pin-Ngern; And Others

    1993-01-01

    Important characteristics of lexical databases and their applications in information retrieval and natural language processing are explained. An ongoing project using various machine-readable sources to build a lexical database is described, and detailed designs of individual entries with examples are included. (Contains 66 references.) (EAM)

  2. Deeply learnt hashing forests for content based image retrieval in prostate MR images

    NASA Astrophysics Data System (ADS)

    Shah, Amit; Conjeti, Sailesh; Navab, Nassir; Katouzian, Amin

    2016-03-01

    Deluge in the size and heterogeneity of medical image databases necessitates the need for content based retrieval systems for their efficient organization. In this paper, we propose such a system to retrieve prostate MR images which share similarities in appearance and content with a query image. We introduce deeply learnt hashing forests (DL-HF) for this image retrieval task. DL-HF effectively leverages the semantic descriptiveness of deep learnt Convolutional Neural Networks. This is used in conjunction with hashing forests which are unsupervised random forests. DL-HF hierarchically parses the deep-learnt feature space to encode subspaces with compact binary code words. We propose a similarity preserving feature descriptor called Parts Histogram which is derived from DL-HF. Correlation defined on this descriptor is used as a similarity metric for retrieval from the database. Validations on publicly available multi-center prostate MR image database established the validity of the proposed approach. The proposed method is fully-automated without any user-interaction and is not dependent on any external image standardization like image normalization and registration. This image retrieval method is generalizable and is well-suited for retrieval in heterogeneous databases other imaging modalities and anatomies.

  3. Data Compression in Full-Text Retrieval Systems.

    ERIC Educational Resources Information Center

    Bell, Timothy C.; And Others

    1993-01-01

    Describes compression methods for components of full-text systems such as text databases on CD-ROM. Topics discussed include storage media; structures for full-text retrieval, including indexes, inverted files, and bitmaps; compression tools; memory requirements during retrieval; and ranking and information retrieval. (Contains 53 references.)…

  4. A biomedical information system for retrieval and manipulation of NHANES data.

    PubMed

    Mukherjee, Sukrit; Martins, David; Norris, Keith C; Jenders, Robert A

    2013-01-01

    The retrieval and manipulation of data from large public databases like the U.S. National Health and Nutrition Examination Survey (NHANES) may require sophisticated statistical software and significant expertise that may be unavailable in the university setting. In response, we have developed the Data Retrieval And Manipulation System (DReAMS), an automated information system to handle all processes of data extraction and cleaning and then joining different subsets to produce analysis-ready output. The system is a browser-based data warehouse application in which the input data from flat files or operational systems are aggregated in a structured way so that the desired data can be read, recoded, queried and extracted efficiently. The current pilot implementation of the system provides access to a limited amount of NHANES database. We plan to increase the amount of data available through the system in the near future and to extend the techniques to other large databases from CDU archive with a current holding of about 53 databases.

  5. Medical Language Processing for Knowledge Representation and Retrievals

    PubMed Central

    Lyman, Margaret; Sager, Naomi; Chi, Emile C.; Tick, Leo J.; Nhan, Ngo Thanh; Su, Yun; Borst, Francois; Scherrer, Jean-Raoul

    1989-01-01

    The Linguistic String Project-Medical Language Processor, a system for computer analysis of narrative patient documents in English, is being adapted for French Lettres de Sortie. The system converts the free-text input to a semantic representation which is then mapped into a relational database. Retrievals of clinical data from the database are described.

  6. [Design and implementation of medical instrument standard information retrieval system based on APS.NET].

    PubMed

    Yu, Kaijun

    2010-07-01

    This paper Analys the design goals of Medical Instrumentation standard information retrieval system. Based on the B /S structure,we established a medical instrumentation standard retrieval system with ASP.NET C # programming language, IIS f Web server, SQL Server 2000 database, in the. NET environment. The paper also Introduces the system structure, retrieval system modules, system development environment and detailed design of the system.

  7. A mathematical model of neuro-fuzzy approximation in image classification

    NASA Astrophysics Data System (ADS)

    Gopalan, Sasi; Pinto, Linu; Sheela, C.; Arun Kumar M., N.

    2016-06-01

    Image digitization and explosion of World Wide Web has made traditional search for image, an inefficient method for retrieval of required grassland image data from large database. For a given input query image Content-Based Image Retrieval (CBIR) system retrieves the similar images from a large database. Advances in technology has increased the use of grassland image data in diverse areas such has agriculture, art galleries, education, industry etc. In all the above mentioned diverse areas it is necessary to retrieve grassland image data efficiently from a large database to perform an assigned task and to make a suitable decision. A CBIR system based on grassland image properties and it uses the aid of a feed-forward back propagation neural network for an effective image retrieval is proposed in this paper. Fuzzy Memberships plays an important role in the input space of the proposed system which leads to a combined neural fuzzy approximation in image classification. The CBIR system with mathematical model in the proposed work gives more clarity about fuzzy-neuro approximation and the convergence of the image features in a grassland image.

  8. Archetype relational mapping - a practical openEHR persistence solution.

    PubMed

    Wang, Li; Min, Lingtong; Wang, Rui; Lu, Xudong; Duan, Huilong

    2015-11-05

    One of the primary obstacles to the widespread adoption of openEHR methodology is the lack of practical persistence solutions for future-proof electronic health record (EHR) systems as described by the openEHR specifications. This paper presents an archetype relational mapping (ARM) persistence solution for the archetype-based EHR systems to support healthcare delivery in the clinical environment. First, the data requirements of the EHR systems are analysed and organized into archetype-friendly concepts. The Clinical Knowledge Manager (CKM) is queried for matching archetypes; when necessary, new archetypes are developed to reflect concepts that are not encompassed by existing archetypes. Next, a template is designed for each archetype to apply constraints related to the local EHR context. Finally, a set of rules is designed to map the archetypes to data tables and provide data persistence based on the relational database. A comparison study was conducted to investigate the differences among the conventional database of an EHR system from a tertiary Class A hospital in China, the generated ARM database, and the Node + Path database. Five data-retrieving tests were designed based on clinical workflow to retrieve exams and laboratory tests. Additionally, two patient-searching tests were designed to identify patients who satisfy certain criteria. The ARM database achieved better performance than the conventional database in three of the five data-retrieving tests, but was less efficient in the remaining two tests. The time difference of query executions conducted by the ARM database and the conventional database is less than 130 %. The ARM database was approximately 6-50 times more efficient than the conventional database in the patient-searching tests, while the Node + Path database requires far more time than the other two databases to execute both the data-retrieving and the patient-searching tests. The ARM approach is capable of generating relational databases using archetypes and templates for archetype-based EHR systems, thus successfully adapting to changes in data requirements. ARM performance is similar to that of conventionally-designed EHR systems, and can be applied in a practical clinical environment. System components such as ARM can greatly facilitate the adoption of openEHR architecture within EHR systems.

  9. Comparing features sets for content-based image retrieval in a medical-case database

    NASA Astrophysics Data System (ADS)

    Muller, Henning; Rosset, Antoine; Vallee, Jean-Paul; Geissbuhler, Antoine

    2004-04-01

    Content-based image retrieval systems (CBIRSs) have frequently been proposed for the use in medical image databases and PACS. Still, only few systems were developed and used in a real clinical environment. It rather seems that medical professionals define their needs and computer scientists develop systems based on data sets they receive with little or no interaction between the two groups. A first study on the diagnostic use of medical image retrieval also shows an improvement in diagnostics when using CBIRSs which underlines the potential importance of this technique. This article explains the use of an open source image retrieval system (GIFT - GNU Image Finding Tool) for the retrieval of medical images in the medical case database system CasImage that is used in daily, clinical routine in the university hospitals of Geneva. Although the base system of GIFT shows an unsatisfactory performance, already little changes in the feature space show to significantly improve the retrieval results. The performance of variations in feature space with respect to color (gray level) quantizations and changes in texture analysis (Gabor filters) is compared. Whereas stock photography relies mainly on colors for retrieval, medical images need a large number of gray levels for successful retrieval, especially when executing feedback queries. The results also show that a too fine granularity in the gray levels lowers the retrieval quality, especially with single-image queries. For the evaluation of the retrieval peformance, a subset of the entire case database of more than 40,000 images is taken with a total of 3752 images. Ground truth was generated by a user who defined the expected query result of a perfect system by selecting images relevant to a given query image. The results show that a smaller number of gray levels (32 - 64) leads to a better retrieval performance, especially when using relevance feedback. The use of more scales and directions for the Gabor filters in the texture analysis also leads to improved results but response time is going up equally due to the larger feature space. CBIRSs can be of great use in managing large medical image databases. They allow to find images that might otherwise be lost for research and publications. They also give students students the possibility to navigate within large image repositories. In the future, CBIR might also become more important in case-based reasoning and evidence-based medicine to support the diagnostics because first studies show good results.

  10. Microcomputer Database Management Systems for Bibliographic Data.

    ERIC Educational Resources Information Center

    Pollard, Richard

    1986-01-01

    Discusses criteria for evaluating microcomputer database management systems (DBMS) used for storage and retrieval of bibliographic data. Two popular types of microcomputer DBMS--file management systems and relational database management systems--are evaluated with respect to these criteria. (Author/MBR)

  11. HUC--A User Designed System for All Recorded Knowledge and Information.

    ERIC Educational Resources Information Center

    Hilton, Howard J.

    This paper proposes a user designed system, HUC, intended to provide a single index and retrieval system covering all recorded knowledge and information capable of being retrieved from all modes of storage, from manual to the most sophisticated retrieval system. The concept integrates terminal hardware, software, and database structure to allow…

  12. Wavelet optimization for content-based image retrieval in medical databases.

    PubMed

    Quellec, G; Lamard, M; Cazuguel, G; Cochener, B; Roux, C

    2010-04-01

    We propose in this article a content-based image retrieval (CBIR) method for diagnosis aid in medical fields. In the proposed system, images are indexed in a generic fashion, without extracting domain-specific features: a signature is built for each image from its wavelet transform. These image signatures characterize the distribution of wavelet coefficients in each subband of the decomposition. A distance measure is then defined to compare two image signatures and thus retrieve the most similar images in a database when a query image is submitted by a physician. To retrieve relevant images from a medical database, the signatures and the distance measure must be related to the medical interpretation of images. As a consequence, we introduce several degrees of freedom in the system so that it can be tuned to any pathology and image modality. In particular, we propose to adapt the wavelet basis, within the lifting scheme framework, and to use a custom decomposition scheme. Weights are also introduced between subbands. All these parameters are tuned by an optimization procedure, using the medical grading of each image in the database to define a performance measure. The system is assessed on two medical image databases: one for diabetic retinopathy follow up and one for screening mammography, as well as a general purpose database. Results are promising: a mean precision of 56.50%, 70.91% and 96.10% is achieved for these three databases, when five images are returned by the system. Copyright 2009 Elsevier B.V. All rights reserved.

  13. Experiments with a novel content-based image retrieval software: can we eliminate classification systems in adolescent idiopathic scoliosis?

    PubMed

    Menon, K Venugopal; Kumar, Dinesh; Thomas, Tessamma

    2014-02-01

    Study Design Preliminary evaluation of new tool. Objective To ascertain whether the newly developed content-based image retrieval (CBIR) software can be used successfully to retrieve images of similar cases of adolescent idiopathic scoliosis (AIS) from a database to help plan treatment without adhering to a classification scheme. Methods Sixty-two operated cases of AIS were entered into the newly developed CBIR database. Five new cases of different curve patterns were used as query images. The images were fed into the CBIR database that retrieved similar images from the existing cases. These were analyzed by a senior surgeon for conformity to the query image. Results Within the limits of variability set for the query system, all the resultant images conformed to the query image. One case had no similar match in the series. The other four retrieved several images that were matching with the query. No matching case was left out in the series. The postoperative images were then analyzed to check for surgical strategies. Broad guidelines for treatment could be derived from the results. More precise query settings, inclusion of bending films, and a larger database will enhance accurate retrieval and better decision making. Conclusion The CBIR system is an effective tool for accurate documentation and retrieval of scoliosis images. Broad guidelines for surgical strategies can be made from the postoperative images of the existing cases without adhering to any classification scheme.

  14. Application and Effects of Linguistic Functions on Information Retrieval in a German Language Full-Text Database: Comparison between Retrieval in Abstract and Full Text.

    ERIC Educational Resources Information Center

    Tauchert, Wolfgang; And Others

    1991-01-01

    Describes the PADOK-II project in Germany, which was designed to give information on the effects of linguistic algorithms on retrieval in a full-text database, the German Patent Information System (GPI). Relevance assessments are discussed, statistical evaluations are described, and searches are compared for the full-text section versus the…

  15. Selecting Data-Base Management Software for Microcomputers in Libraries and Information Units.

    ERIC Educational Resources Information Center

    Pieska, K. A. O.

    1986-01-01

    Presents a model for the evaluation of database management systems software from the viewpoint of librarians and information specialists. The properties of data management systems, database management systems, and text retrieval systems are outlined and compared. (10 references) (CLB)

  16. The present status and problems in document retrieval system : document input type retrieval system

    NASA Astrophysics Data System (ADS)

    Inagaki, Hirohito

    The office-automation (OA) made many changes. Many documents were begun to maintained in an electronic filing system. Therefore, it is needed to establish efficient document retrieval system to extract useful information. Current document retrieval systems are using simple word-matching, syntactic-matching, semantic-matching to obtain high retrieval efficiency. On the other hand, the document retrieval systems using special hardware devices, such as ISSP, were developed for aiming high speed retrieval. Since these systems can accept a single sentence or keywords as input, it is difficult to explain searcher's request. We demonstrated document input type retrieval system, which can directly accept document as an input, and can search similar documents from document data-base.

  17. Retrieving high-resolution images over the Internet from an anatomical image database

    NASA Astrophysics Data System (ADS)

    Strupp-Adams, Annette; Henderson, Earl

    1999-12-01

    The Visible Human Data set is an important contribution to the national collection of anatomical images. To enhance the availability of these images, the National Library of Medicine has supported the design and development of a prototype object-oriented image database which imports, stores, and distributes high resolution anatomical images in both pixel and voxel formats. One of the key database modules is its client-server Internet interface. This Web interface provides a query engine with retrieval access to high-resolution anatomical images that range in size from 100KB for browser viewable rendered images, to 1GB for anatomical structures in voxel file formats. The Web query and retrieval client-server system is composed of applet GUIs, servlets, and RMI application modules which communicate with each other to allow users to query for specific anatomical structures, and retrieve image data as well as associated anatomical images from the database. Selected images can be downloaded individually as single files via HTTP or downloaded in batch-mode over the Internet to the user's machine through an applet that uses Netscape's Object Signing mechanism. The image database uses ObjectDesign's object-oriented DBMS, ObjectStore that has a Java interface. The query and retrieval systems has been tested with a Java-CDE window system, and on the x86 architecture using Windows NT 4.0. This paper describes the Java applet client search engine that queries the database; the Java client module that enables users to view anatomical images online; the Java application server interface to the database which organizes data returned to the user, and its distribution engine that allow users to download image files individually and/or in batch-mode.

  18. A humming retrieval system based on music fingerprint

    NASA Astrophysics Data System (ADS)

    Han, Xingkai; Cao, Baiyu

    2011-10-01

    In this paper, we proposed an improved music information retrieval method utilizing the music fingerprint. The goal of this method is to represent the music with compressed musical information. Based on the selected MIDI files, which are generated automatically as our music target database, we evaluate the accuracy, effectiveness, and efficiency of this method. In this research we not only extract the feature sequence, which can represent the file effectively, from the query and melody database, but also make it possible for retrieving the results in an innovative way. We investigate on the influence of noise to the performance of our system. As experimental result shows, the retrieval accuracy arriving at up to91% without noise is pretty well

  19. Health indicators 1991.

    PubMed

    Dawson, N

    1991-01-01

    This is the second edition of a database developed by the Canadian Centre for Health Information (CCHI). It features 49 health indicators, under one cover containing the most recent data available from a variety of national surveys. This information may be used to establish health goals for the population and to offer objective measures of their success. The database can be accessed through CANSIM, Statistics Canada's socio-economic electronic database and retrieval system, or through a personal computer package which enables the user to retrieve and analyze the 1.2 million data points in the system.

  20. Image-Based Airborne LiDAR Point Cloud Encoding for 3d Building Model Retrieval

    NASA Astrophysics Data System (ADS)

    Chen, Yi-Chen; Lin, Chao-Hung

    2016-06-01

    With the development of Web 2.0 and cyber city modeling, an increasing number of 3D models have been available on web-based model-sharing platforms with many applications such as navigation, urban planning, and virtual reality. Based on the concept of data reuse, a 3D model retrieval system is proposed to retrieve building models similar to a user-specified query. The basic idea behind this system is to reuse these existing 3D building models instead of reconstruction from point clouds. To efficiently retrieve models, the models in databases are compactly encoded by using a shape descriptor generally. However, most of the geometric descriptors in related works are applied to polygonal models. In this study, the input query of the model retrieval system is a point cloud acquired by Light Detection and Ranging (LiDAR) systems because of the efficient scene scanning and spatial information collection. Using Point clouds with sparse, noisy, and incomplete sampling as input queries is more difficult than that by using 3D models. Because that the building roof is more informative than other parts in the airborne LiDAR point cloud, an image-based approach is proposed to encode both point clouds from input queries and 3D models in databases. The main goal of data encoding is that the models in the database and input point clouds can be consistently encoded. Firstly, top-view depth images of buildings are generated to represent the geometry surface of a building roof. Secondly, geometric features are extracted from depth images based on height, edge and plane of building. Finally, descriptors can be extracted by spatial histograms and used in 3D model retrieval system. For data retrieval, the models are retrieved by matching the encoding coefficients of point clouds and building models. In experiments, a database including about 900,000 3D models collected from the Internet is used for evaluation of data retrieval. The results of the proposed method show a clear superiority over related methods.

  1. Experiments and Analysis on a Computer Interface to an Information-Retrieval Network.

    ERIC Educational Resources Information Center

    Marcus, Richard S.; Reintjes, J. Francis

    A primary goal of this project was to develop an interface that would provide direct access for inexperienced users to existing online bibliographic information retrieval networks. The experiment tested the concept of a virtual-system mode of access to a network of heterogeneous interactive retrieval systems and databases. An experimental…

  2. Non-Agricultural Databases and Thesauri: Retrieval of Subject Headings and Non-Controlled Terms in Relation to Agriculture

    ERIC Educational Resources Information Center

    Bartol, Tomaz

    2012-01-01

    Purpose: The paper aims to assess the utility of non-agriculture-specific information systems, databases, and respective controlled vocabularies (thesauri) in organising and retrieving agricultural information. The purpose is to identify thesaurus-linked tree structures, controlled subject headings/terms (heading words, descriptors), and principal…

  3. Requirements for benchmarking personal image retrieval systems

    NASA Astrophysics Data System (ADS)

    Bouguet, Jean-Yves; Dulong, Carole; Kozintsev, Igor; Wu, Yi

    2006-01-01

    It is now common to have accumulated tens of thousands of personal ictures. Efficient access to that many pictures can only be done with a robust image retrieval system. This application is of high interest to Intel processor architects. It is highly compute intensive, and could motivate end users to upgrade their personal computers to the next generations of processors. A key question is how to assess the robustness of a personal image retrieval system. Personal image databases are very different from digital libraries that have been used by many Content Based Image Retrieval Systems.1 For example a personal image database has a lot of pictures of people, but a small set of different people typically family, relatives, and friends. Pictures are taken in a limited set of places like home, work, school, and vacation destination. The most frequent queries are searched for people, and for places. These attributes, and many others affect how a personal image retrieval system should be benchmarked, and benchmarks need to be different from existing ones based on art images, or medical images for examples. The attributes of the data set do not change the list of components needed for the benchmarking of such systems as specified in2: - data sets - query tasks - ground truth - evaluation measures - benchmarking events. This paper proposed a way to build these components to be representative of personal image databases, and of the corresponding usage models.

  4. Improving data management and dissemination in web based information systems by semantic enrichment of descriptive data aspects

    NASA Astrophysics Data System (ADS)

    Gebhardt, Steffen; Wehrmann, Thilo; Klinger, Verena; Schettler, Ingo; Huth, Juliane; Künzer, Claudia; Dech, Stefan

    2010-10-01

    The German-Vietnamese water-related information system for the Mekong Delta (WISDOM) project supports business processes in Integrated Water Resources Management in Vietnam. Multiple disciplines bring together earth and ground based observation themes, such as environmental monitoring, water management, demographics, economy, information technology, and infrastructural systems. This paper introduces the components of the web-based WISDOM system including data, logic and presentation tier. It focuses on the data models upon which the database management system is built, including techniques for tagging or linking metadata with the stored information. The model also uses ordered groupings of spatial, thematic and temporal reference objects to semantically tag datasets to enable fast data retrieval, such as finding all data in a specific administrative unit belonging to a specific theme. A spatial database extension is employed by the PostgreSQL database. This object-oriented database was chosen over a relational database to tag spatial objects to tabular data, improving the retrieval of census and observational data at regional, provincial, and local areas. While the spatial database hinders processing raster data, a "work-around" was built into WISDOM to permit efficient management of both raster and vector data. The data model also incorporates styling aspects of the spatial datasets through styled layer descriptions (SLD) and web mapping service (WMS) layer specifications, allowing retrieval of rendered maps. Metadata elements of the spatial data are based on the ISO19115 standard. XML structured information of the SLD and metadata are stored in an XML database. The data models and the data management system are robust for managing the large quantity of spatial objects, sensor observations, census and document data. The operational WISDOM information system prototype contains modules for data management, automatic data integration, and web services for data retrieval, analysis, and distribution. The graphical user interfaces facilitate metadata cataloguing, data warehousing, web sensor data analysis and thematic mapping.

  5. Indexing and retrieval of MPEG compressed video

    NASA Astrophysics Data System (ADS)

    Kobla, Vikrant; Doermann, David S.

    1998-04-01

    To keep pace with the increased popularity of digital video as an archival medium, the development of techniques for fast and efficient analysis of ideo streams is essential. In particular, solutions to the problems of storing, indexing, browsing, and retrieving video data from large multimedia databases are necessary to a low access to these collections. Given that video is often stored efficiently in a compressed format, the costly overhead of decompression can be reduced by analyzing the compressed representation directly. In earlier work, we presented compressed domain parsing techniques which identified shots, subshots, and scenes. In this article, we present efficient key frame selection, feature extraction, indexing, and retrieval techniques that are directly applicable to MPEG compressed video. We develop a frame type independent representation which normalizes spatial and temporal features including frame type, frame size, macroblock encoding, and motion compensation vectors. Features for indexing are derived directly from this representation and mapped to a low- dimensional space where they can be accessed using standard database techniques. Spatial information is used as primary index into the database and temporal information is used to rank retrieved clips and enhance the robustness of the system. The techniques presented enable efficient indexing, querying, and retrieval of compressed video as demonstrated by our system which typically takes a fraction of a second to retrieve similar video scenes from a database, with over 95 percent recall.

  6. Interfaces and Expert Systems for Online Retrieval.

    ERIC Educational Resources Information Center

    Kehoe, Cynthia A.

    1985-01-01

    This paper reviews the history of separate online system interfaces which led to efforts to develop expert systems for searching databases, particularly for end users, and introduces the research on such expert systems. Appended is a bibliography of sources on interfaces and expert systems for online retrieval. (Author/EJS)

  7. A User Interface for Multiple Retrieval Systems.

    ERIC Educational Resources Information Center

    Teskey, Niall; And Others

    1987-01-01

    Reviews current systems designed to help end-users search online databases without the assistance of an intermediary and describes a prototype system which emulates the Deco (the text storage and retrieval system used by Unilever) interface on Dialog and Data-Star. Initial trials of the prototype system are reported. (15 references) (MES)

  8. Artificial Intelligence and Information Retrieval.

    ERIC Educational Resources Information Center

    Teodorescu, Ioana

    1987-01-01

    Compares artificial intelligence and information retrieval paradigms for natural language understanding, reviews progress to date, and outlines the applicability of artificial intelligence to question answering systems. A list of principal artificial intelligence software for database front end systems is appended. (CLB)

  9. Architecture for biomedical multimedia information delivery on the World Wide Web

    NASA Astrophysics Data System (ADS)

    Long, L. Rodney; Goh, Gin-Hua; Neve, Leif; Thoma, George R.

    1997-10-01

    Research engineers at the National Library of Medicine are building a prototype system for the delivery of multimedia biomedical information on the World Wide Web. This paper discuses the architecture and design considerations for the system, which will be used initially to make images and text from the third National Health and Nutrition Examination Survey (NHANES) publicly available. We categorized our analysis as follows: (1) fundamental software tools: we analyzed trade-offs among use of conventional HTML/CGI, X Window Broadway, and Java; (2) image delivery: we examined the use of unconventional TCP transmission methods; (3) database manager and database design: we discuss the capabilities and planned use of the Informix object-relational database manager and the planned schema for the HNANES database; (4) storage requirements for our Sun server; (5) user interface considerations; (6) the compatibility of the system with other standard research and analysis tools; (7) image display: we discuss considerations for consistent image display for end users. Finally, we discuss the scalability of the system in terms of incorporating larger or more databases of similar data, and the extendibility of the system for supporting content-based retrieval of biomedical images. The system prototype is called the Web-based Medical Information Retrieval System. An early version was built as a Java applet and tested on Unix, PC, and Macintosh platforms. This prototype used the MiniSQL database manager to do text queries on a small database of records of participants in the second NHANES survey. The full records and associated x-ray images were retrievable and displayable on a standard Web browser. A second version has now been built, also a Java applet, using the MySQL database manager.

  10. Content-based image retrieval on mobile devices

    NASA Astrophysics Data System (ADS)

    Ahmad, Iftikhar; Abdullah, Shafaq; Kiranyaz, Serkan; Gabbouj, Moncef

    2005-03-01

    Content-based image retrieval area possesses a tremendous potential for exploration and utilization equally for researchers and people in industry due to its promising results. Expeditious retrieval of desired images requires indexing of the content in large-scale databases along with extraction of low-level features based on the content of these images. With the recent advances in wireless communication technology and availability of multimedia capable phones it has become vital to enable query operation in image databases and retrieve results based on the image content. In this paper we present a content-based image retrieval system for mobile platforms, providing the capability of content-based query to any mobile device that supports Java platform. The system consists of light-weight client application running on a Java enabled device and a server containing a servlet running inside a Java enabled web server. The server responds to image query using efficient native code from selected image database. The client application, running on a mobile phone, is able to initiate a query request, which is handled by a servlet in the server for finding closest match to the queried image. The retrieved results are transmitted over mobile network and images are displayed on the mobile phone. We conclude that such system serves as a basis of content-based information retrieval on wireless devices and needs to cope up with factors such as constraints on hand-held devices and reduced network bandwidth available in mobile environments.

  11. Intelligent distributed medical image management

    NASA Astrophysics Data System (ADS)

    Garcia, Hong-Mei C.; Yun, David Y.

    1995-05-01

    The rapid advancements in high performance global communication have accelerated cooperative image-based medical services to a new frontier. Traditional image-based medical services such as radiology and diagnostic consultation can now fully utilize multimedia technologies in order to provide novel services, including remote cooperative medical triage, distributed virtual simulation of operations, as well as cross-country collaborative medical research and training. Fast (efficient) and easy (flexible) retrieval of relevant images remains a critical requirement for the provision of remote medical services. This paper describes the database system requirements, identifies technological building blocks for meeting the requirements, and presents a system architecture for our target image database system, MISSION-DBS, which has been designed to fulfill the goals of Project MISSION (medical imaging support via satellite integrated optical network) -- an experimental high performance gigabit satellite communication network with access to remote supercomputing power, medical image databases, and 3D visualization capabilities in addition to medical expertise anywhere and anytime around the country. The MISSION-DBS design employs a synergistic fusion of techniques in distributed databases (DDB) and artificial intelligence (AI) for storing, migrating, accessing, and exploring images. The efficient storage and retrieval of voluminous image information is achieved by integrating DDB modeling and AI techniques for image processing while the flexible retrieval mechanisms are accomplished by combining attribute- based and content-based retrievals.

  12. Classroom Laboratory Report: Using an Image Database System in Engineering Education.

    ERIC Educational Resources Information Center

    Alam, Javed; And Others

    1991-01-01

    Describes an image database system assembled using separate computer components that was developed to overcome text-only computer hardware storage and retrieval limitations for a pavement design class. (JJK)

  13. Social Science Data Bases and Data Banks in the United States and Canada.

    ERIC Educational Resources Information Center

    Black, John B.

    This overview of North American social science databases, including scope and services, identifies five trends: (1) growth--in the number of databases, subjects covered, and system availability; (2) increased competition in the retrieval systems marketplace with more databases being offered on multiple systems, improvements being made to the…

  14. Intelligent Text Retrieval and Knowledge Acquisition from Texts for NASA Applications: Preprocessing Issues

    NASA Technical Reports Server (NTRS)

    2002-01-01

    A system that retrieves problem reports from a NASA database is described. The database is queried with natural language questions. Part-of-speech tags are first assigned to each word in the question using a rule based tagger. A partial parse of the question is then produced with independent sets of deterministic finite state a utomata. Using partial parse information, a look up strategy searches the database for problem reports relevant to the question. A bigram stemmer and irregular verb conjugates have been incorporated into the system to improve accuracy. The system is evaluated by a set of fifty five questions posed by NASA engineers. A discussion of future research is also presented.

  15. Combining computational models, semantic annotations and simulation experiments in a graph database

    PubMed Central

    Henkel, Ron; Wolkenhauer, Olaf; Waltemath, Dagmar

    2015-01-01

    Model repositories such as the BioModels Database, the CellML Model Repository or JWS Online are frequently accessed to retrieve computational models of biological systems. However, their storage concepts support only restricted types of queries and not all data inside the repositories can be retrieved. In this article we present a storage concept that meets this challenge. It grounds on a graph database, reflects the models’ structure, incorporates semantic annotations and simulation descriptions and ultimately connects different types of model-related data. The connections between heterogeneous model-related data and bio-ontologies enable efficient search via biological facts and grant access to new model features. The introduced concept notably improves the access of computational models and associated simulations in a model repository. This has positive effects on tasks such as model search, retrieval, ranking, matching and filtering. Furthermore, our work for the first time enables CellML- and Systems Biology Markup Language-encoded models to be effectively maintained in one database. We show how these models can be linked via annotations and queried. Database URL: https://sems.uni-rostock.de/projects/masymos/ PMID:25754863

  16. The Video PATSEARCH System: An Interview with Peter Urbach.

    ERIC Educational Resources Information Center

    Videodisc/Videotext, 1982

    1982-01-01

    The Video PATSEARCH system consists of a microcomputer with a special keyboard and two display screens which accesses the PATSEARCH database of United States government patents on the Bibliographic Retrieval Services (BRS) search system. The microcomputer retrieves text from BRS and matching graphics from an analog optical videodisc. (Author/JJD)

  17. Information Retrieval System Design Issues in a Microcomputer-Based Relational DBMS Environment.

    ERIC Educational Resources Information Center

    Wolfram, Dietmar

    1992-01-01

    Outlines the file structure requirements for a microcomputer-based information retrieval system using FoxPro, a relational database management system (DBMS). Issues relating to the design and implementation of such systems are discussed, and two possible designs are examined in terms of space economy and practicality of implementation. (15…

  18. Comparing the Document Representations of Two IR-Systems: CLARIT and TOPIC.

    ERIC Educational Resources Information Center

    Paijmans, Hans

    1993-01-01

    Compares two information retrieval systems, CLARIT and TOPIC, in terms of assigned versus derived and precoordinate versus postcoordinate indexing. Models of information retrieval systems are discussed, and a test of the systems using a demonstration database of full-text articles from the "Wall Street Journal" is described. (Contains 21…

  19. Database Software Selection for the Egyptian National STI Network.

    ERIC Educational Resources Information Center

    Slamecka, Vladimir

    The evaluation and selection of information/data management system software for the Egyptian National Scientific and Technical (STI) Network are described. An overview of the state-of-the-art of database technology elaborates on the differences between information retrieval and database management systems (DBMS). The desirable characteristics of…

  20. Thematic video indexing to support video database retrieval and query processing

    NASA Astrophysics Data System (ADS)

    Khoja, Shakeel A.; Hall, Wendy

    1999-08-01

    This paper presents a novel video database system, which caters for complex and long videos, such as documentaries, educational videos, etc. As compared to relatively structured format videos like CNN news or commercial advertisements, this database system has the capacity to work with long and unstructured videos.

  1. Managing Data in a GIS Environment

    NASA Technical Reports Server (NTRS)

    Beltran, Maria; Yiasemis, Haris

    1997-01-01

    A Geographic Information System (GIS) is a computer-based system that enables capture, modeling, manipulation, retrieval, analysis and presentation of geographically referenced data. A GIS operates in a dynamic environment of spatial and temporal information. This information is held in a database like any other information system, but performance is more of an issue for a geographic database than a traditional database due to the nature of the data. What distinguishes a GIS from other information systems is the spatial and temporal dimensions of the data and the volume of data (several gigabytes). Most traditional information systems are usually based around tables and textual reports, whereas GIS requires the use of cartographic forms and other visualization techniques. Much of the data can be represented using computer graphics, but a GIS is not a graphics database. A graphical system is concerned with the manipulation and presentation of graphical objects whereas a GIS handles geographic objects that have not only spatial dimensions but non-visual, i e., attribute and components. Furthermore, the nature of the data on which a GIS operates makes the traditional relational database approach inadequate for retrieving data and answering queries that reference spatial data. The purpose of this paper is to describe the efficiency issues behind storage and retrieval of data within a GIS database. Section 2 gives a general background on GIS, and describes the issues involved in custom vs. commercial and hybrid vs. integrated geographic information systems. Section 3 describes the efficiency issues concerning the management of data within a GIS environment. The paper ends with a summary of the main concerns of this paper.

  2. A comparison of the performance of seven key bibliographic databases in identifying all relevant systematic reviews of interventions for hypertension.

    PubMed

    Rathbone, John; Carter, Matt; Hoffmann, Tammy; Glasziou, Paul

    2016-02-09

    Bibliographic databases are the primary resource for identifying systematic reviews of health care interventions. Reliable retrieval of systematic reviews depends on the scope of indexing used by database providers. Therefore, searching one database may be insufficient, but it is unclear how many need to be searched. We sought to evaluate the performance of seven major bibliographic databases for the identification of systematic reviews for hypertension. We searched seven databases (Cochrane library, Database of Abstracts of Reviews of Effects (DARE), Excerpta Medica Database (EMBASE), Epistemonikos, Medical Literature Analysis and Retrieval System Online (MEDLINE), PubMed Health and Turning Research Into Practice (TRIP)) from 2003 to 2015 for systematic reviews of any intervention for hypertension. Citations retrieved were screened for relevance, coded and checked for screening consistency using a fuzzy text matching query. The performance of each database was assessed by calculating its sensitivity, precision, the number of missed reviews and the number of unique records retrieved. Four hundred systematic reviews were identified for inclusion from 11,381 citations retrieved from seven databases. No single database identified all the retrieved systematic reviews for hypertension. EMBASE identified the most reviews (sensitivity 69 %) but also retrieved the most irrelevant citations with 7.2 % precision (Pr). The sensitivity of the Cochrane library was 60 %, DARE 57 %, MEDLINE 57 %, PubMed Health 53 %, Epistemonikos 49 % and TRIP 33 %. EMBASE contained the highest number of unique records (n = 43). The Cochrane library identified seven unique records and had the highest precision (Pr = 30 %), followed by Epistemonikos (n = 2, Pr = 19 %). No unique records were found in PubMed Health (Pr = 24 %) DARE (Pr = 21 %), TRIP (Pr = 10 %) or MEDLINE (Pr = 10 %). Searching EMBASE and the Cochrane library identified 88 % of all systematic reviews in the reference set, and searching the freely available databases (Cochrane, Epistemonikos, MEDLINE) identified 83 % of all the reviews. The databases were re-analysed after systematic reviews of non-conventional interventions (e.g. yoga, acupuncture) were removed. Similarly, no database identified all the retrieved systematic reviews. EMBASE identified the most relevant systematic reviews (sensitivity 73 %) but also retrieved the most irrelevant citations with Pr = 5 %. The sensitivity of the Cochrane database was 62 %, followed by MEDLINE (60 %), DARE (55 %), PubMed Health (54 %), Epistemonikos (50 %) and TRIP (31 %). The precision of the Cochrane library was the highest (20 %), followed by PubMed Health (Pr = 16 %), DARE (Pr = 13 %), Epistemonikos (Pr = 12 %), MEDLINE (Pr = 6 %), TRIP (Pr = 6 %) and EMBASE (Pr = 5 %). EMBASE contained the most unique records (n = 34). The Cochrane library identified seven unique records. The other databases held no unique records. The coverage of bibliographic databases varies considerably due to differences in their scope and content. Researchers wishing to identify systematic reviews should not rely on one database but search multiple databases.

  3. JICST Factual Database JICST DNA Database

    NASA Astrophysics Data System (ADS)

    Shirokizawa, Yoshiko; Abe, Atsushi

    Japan Information Center of Science and Technology (JICST) has started the on-line service of DNA database in October 1988. This database is composed of EMBL Nucleotide Sequence Library and Genetic Sequence Data Bank. The authors outline the database system, data items and search commands. Examples of retrieval session are presented.

  4. An evaluation of information retrieval accuracy with simulated OCR output

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Croft, W.B.; Harding, S.M.; Taghva, K.

    Optical Character Recognition (OCR) is a critical part of many text-based applications. Although some commercial systems use the output from OCR devices to index documents without editing, there is very little quantitative data on the impact of OCR errors on the accuracy of a text retrieval system. Because of the difficulty of constructing test collections to obtain this data, we have carried out evaluation using simulated OCR output on a variety of databases. The results show that high quality OCR devices have little effect on the accuracy of retrieval, but low quality devices used with databases of short documents canmore » result in significant degradation.« less

  5. The Effects of Noisy Data on Text Retrieval.

    ERIC Educational Resources Information Center

    Taghva, Kazem; And Others

    1994-01-01

    Discusses the use of optical character recognition (OCR) for inputting documents in an information retrieval system and describes a study that used an OCR-generated database and its corresponding corrected version to examine query evaluation in the presence of noisy data. Scanning technology, recognition technology, and retrieval technology are…

  6. The use of a personal digital assistant for wireless entry of data into a database via the Internet.

    PubMed

    Fowler, D L; Hogle, N J; Martini, F; Roh, M S

    2002-01-01

    Researchers typically record data on a worksheet and at some later time enter it into the database. Wireless data entry and retrieval using a personal digital assistant (PDA) at the site of patient contact can simplify this process and improve efficiency. A surgeon and a nurse coordinator provided the content for the database. The computer programmer created the database, placed the pages of the database on the PDA screen, and researched and installed security measures. Designing the database took 6 months. Meeting Health Insurance Portability and Accountability Act of 1996 (HIPAA) requirements for patient confidentiality, satisfying institutional Information Services requirements, and ensuring connectivity required an additional 8 months before the functional system was complete. It is now possible to achieve wireless entry and retrieval of data using a PDA. Potential advantages include collection and entry of data at the same time, easy entry of data from multiple sites, and retrieval of data at the patient's bedside.

  7. The comparative effectiveness of conventional and digital image libraries.

    PubMed

    McColl, R I; Johnson, A

    2001-03-01

    Before introducing a hospital-wide image database to improve access, navigation and retrieval speed, a comparative study between a conventional slide library and a matching image database was undertaken to assess its relative benefits. Paired time trials and personal questionnaires revealed faster retrieval rates, higher image quality, and easier viewing for the pilot digital image database. Analysis of confidentiality, copyright and data protection exposed similar issues for both systems, thus concluding that the digital image database is a more effective library system. The authors suggest that in the future, medical images will be stored on large, professionally administered, centrally located file servers, allowing specialist image libraries to be tailored locally for individual users. The further integration of the database with web technology will enable cheap and efficient remote access for a wide range of users.

  8. Pathology report data extraction from relational database using R, with extraction from reports on melanoma of skin as an example.

    PubMed

    Ye, Jay J

    2016-01-01

    Different methods have been described for data extraction from pathology reports with varying degrees of success. Here a technique for directly extracting data from relational database is described. Our department uses synoptic reports modified from College of American Pathologists (CAP) Cancer Protocol Templates to report most of our cancer diagnoses. Choosing the melanoma of skin synoptic report as an example, R scripting language extended with RODBC package was used to query the pathology information system database. Reports containing melanoma of skin synoptic report in the past 4 and a half years were retrieved and individual data elements were extracted. Using the retrieved list of the cases, the database was queried a second time to retrieve/extract the lymph node staging information in the subsequent reports from the same patients. 426 synoptic reports corresponding to unique lesions of melanoma of skin were retrieved, and data elements of interest were extracted into an R data frame. The distribution of Breslow depth of melanomas grouped by year is used as an example of intra-report data extraction and analysis. When the new pN staging information was present in the subsequent reports, 82% (77/94) was precisely retrieved (pN0, pN1, pN2 and pN3). Additional 15% (14/94) was retrieved with certain ambiguity (positive or knowing there was an update). The specificity was 100% for both. The relationship between Breslow depth and lymph node status was graphed as an example of lesion-specific multi-report data extraction and analysis. R extended with RODBC package is a simple and versatile approach well-suited for the above tasks. The success or failure of the retrieval and extraction depended largely on whether the reports were formatted and whether the contents of the elements were consistently phrased. This approach can be easily modified and adopted for other pathology information systems that use relational database for data management.

  9. Kellogg Library and Archive Retrieval System (KLARS) Document Capture Manual. Draft Version.

    ERIC Educational Resources Information Center

    Hugo, Jane

    This manual is designed to supply background information for Kellogg Library and Archive Retrieval System (KLARS) processors and others who might work with the system, outline detailed policies and procedures for processors who prepare and enter data into the adult education database on KLARS, and inform general readers about the system. KLARS is…

  10. Development of geotechnical analysis and design modules for the Virginia Department of Transportation's geotechnical database.

    DOT National Transportation Integrated Search

    2005-01-01

    In 2003, an Internet-based Geotechnical Database Management System (GDBMS) was developed for the Virginia Department of Transportation (VDOT) using distributed Geographic Information System (GIS) methodology for data management, archival, retrieval, ...

  11. A Strategy for Reusing the Data of Electronic Medical Record Systems for Clinical Research.

    PubMed

    Matsumura, Yasushi; Hattori, Atsushi; Manabe, Shiro; Tsuda, Tsutomu; Takeda, Toshihiro; Okada, Katsuki; Murata, Taizo; Mihara, Naoki

    2016-01-01

    There is a great need to reuse data stored in electronic medical records (EMR) databases for clinical research. We previously reported the development of a system in which progress notes and case report forms (CRFs) were simultaneously recorded using a template in the EMR in order to exclude redundant data entry. To make the data collection process more efficient, we are developing a system in which the data originally stored in the EMR database can be populated within a frame in a template. We developed interface plugin modules that retrieve data from the databases of other EMR applications. A universal keyword written in a template master is converted to a local code using a data conversion table, then the objective data is retrieved from the corresponding database. The template element data, which are entered by a template, are stored in the template element database. To retrieve the data entered by other templates, the objective data is designated by the template element code with the template code, or by the concept code if it is written for the element. When the application systems in the EMR generate documents, they also generate a PDF file and a corresponding document profile XML, which includes important data, and send them to the document archive server and the data sharing saver, respectively. In the data sharing server, the data are represented by an item with an item code with a document class code and its value. By linking a concept code to an item identifier, an objective data can be retrieved by designating a concept code. We employed a flexible strategy in which a unique identifier for a hospital is initially attached to all of the data that the hospital generates. The identifier is secondarily linked with concept codes. The data that are not linked with a concept code can also be retrieved using the unique identifier of the hospital. This strategy makes it possible to reuse any of a hospital's data.

  12. A Database System for Course Administration.

    ERIC Educational Resources Information Center

    Benbasat, Izak; And Others

    1982-01-01

    Describes a computer-assisted testing system which produces multiple-choice examinations for a college course in business administration. The system uses SPIRES (Stanford Public Information REtrieval System) to manage a database of questions and related data, mark-sense cards for machine grading tests, and ACL (6) (Audit Command Language) to…

  13. Case retrieval in medical databases by fusing heterogeneous information.

    PubMed

    Quellec, Gwénolé; Lamard, Mathieu; Cazuguel, Guy; Roux, Christian; Cochener, Béatrice

    2011-01-01

    A novel content-based heterogeneous information retrieval framework, particularly well suited to browse medical databases and support new generation computer aided diagnosis (CADx) systems, is presented in this paper. It was designed to retrieve possibly incomplete documents, consisting of several images and semantic information, from a database; more complex data types such as videos can also be included in the framework. The proposed retrieval method relies on image processing, in order to characterize each individual image in a document by their digital content, and information fusion. Once the available images in a query document are characterized, a degree of match, between the query document and each reference document stored in the database, is defined for each attribute (an image feature or a metadata). A Bayesian network is used to recover missing information if need be. Finally, two novel information fusion methods are proposed to combine these degrees of match, in order to rank the reference documents by decreasing relevance for the query. In the first method, the degrees of match are fused by the Bayesian network itself. In the second method, they are fused by the Dezert-Smarandache theory: the second approach lets us model our confidence in each source of information (i.e., each attribute) and take it into account in the fusion process for a better retrieval performance. The proposed methods were applied to two heterogeneous medical databases, a diabetic retinopathy database and a mammography screening database, for computer aided diagnosis. Precisions at five of 0.809 ± 0.158 and 0.821 ± 0.177, respectively, were obtained for these two databases, which is very promising.

  14. An ECG storage and retrieval system embedded in client server HIS utilizing object-oriented DB.

    PubMed

    Wang, C; Ohe, K; Sakurai, T; Nagase, T; Kaihara, S

    1996-02-01

    In the University of Tokyo Hospital, the improved client server HIS has been applied to clinical practice and physicians can order prescription, laboratory examination, ECG examination and radiographic examination, etc. directly by themselves and read results of these examinations, except medical signal waves, schema and image, on UNIX workstations. Recently, we designed and developed an ECG storage and retrieval system embedded in the client server HIS utilizing object-oriented database to take the first step in dealing with digitized signal, schema and image data and show waves, graphics, and images directly to physicians by the client server HIS. The system was developed based on object-oriented analysis and design, and implemented with object-oriented database management system (OODMS) and C++ programming language. In this paper, we describe the ECG data model, functions of the storage and retrieval system, features of user interface and the result of its implementation in the HIS.

  15. Interactive radiographic image retrieval system.

    PubMed

    Kundu, Malay Kumar; Chowdhury, Manish; Das, Sudeb

    2017-02-01

    Content based medical image retrieval (CBMIR) systems enable fast diagnosis through quantitative assessment of the visual information and is an active research topic over the past few decades. Most of the state-of-the-art CBMIR systems suffer from various problems: computationally expensive due to the usage of high dimensional feature vectors and complex classifier/clustering schemes. Inability to properly handle the "semantic gap" and the high intra-class versus inter-class variability problem of the medical image database (like radiographic image database). This yields an exigent demand for developing highly effective and computationally efficient retrieval system. We propose a novel interactive two-stage CBMIR system for diverse collection of medical radiographic images. Initially, Pulse Coupled Neural Network based shape features are used to find out the most probable (similar) image classes using a novel "similarity positional score" mechanism. This is followed by retrieval using Non-subsampled Contourlet Transform based texture features considering only the images of the pre-identified classes. Maximal information compression index is used for unsupervised feature selection to achieve better results. To reduce the semantic gap problem, the proposed system uses a novel fuzzy index based relevance feedback mechanism by incorporating subjectivity of human perception in an analytic manner. Extensive experiments were carried out to evaluate the effectiveness of the proposed CBMIR system on a subset of Image Retrieval in Medical Applications (IRMA)-2009 database consisting of 10,902 labeled radiographic images of 57 different modalities. We obtained overall average precision of around 98% after only 2-3 iterations of relevance feedback mechanism. We assessed the results by comparisons with some of the state-of-the-art CBMIR systems for radiographic images. Unlike most of the existing CBMIR systems, in the proposed two-stage hierarchical framework, main importance is given on constructing efficient and compact feature vector representation, search-space reduction and handling the "semantic gap" problem effectively, without compromising the retrieval performance. Experimental results and comparisons show that the proposed system performs efficiently in the radiographic medical image retrieval field. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  16. Development of bilateral data transferability in the Virginia Department of Transportation's Geotechnical Database Management System Framework.

    DOT National Transportation Integrated Search

    2006-01-01

    An Internet-based, spatiotemporal Geotechnical Database Management System (GDBMS) Framework was designed, developed, and implemented at the Virginia Department of Transportation (VDOT) in 2002 to retrieve, manage, archive, and analyze geotechnical da...

  17. Information Retrieval Using ADABAS-NATURAL (with Applications for Television and Radio).

    ERIC Educational Resources Information Center

    Silbergeld, I.; Kutok, P.

    1984-01-01

    Describes use of the software ADABAS (general purpose database management system) and NATURAL (interactive programing language) in development and implementation of an information retrieval system for the National Television and Radio Network of Israel. General design considerations, files contained in each archive, search strategies, and keywords…

  18. On Inference Rules of Logic-Based Information Retrieval Systems.

    ERIC Educational Resources Information Center

    Chen, Patrick Shicheng

    1994-01-01

    Discussion of relevance and the needs of the users in information retrieval focuses on a deductive object-oriented approach and suggests eight inference rules for the deduction. Highlights include characteristics of a deductive object-oriented system, database and data modeling language, implementation, and user interface. (Contains 24…

  19. The new interactive CESAR

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fox, P.B.; Yatabe, M.

    1987-01-01

    In this report the Nuclear Criticality Safety Analytical Methods Resource Center describes a new interactive version of CESAR, a critical experiments storage and retrieval program available on the Nuclear Criticality Information System (NCIS) database at Lawrence Livermore National Laboratory. The original version of CESAR did not include interactive search capabilities. The CESAR database was developed to provide a convenient, readily accessible means of storing and retrieving code input data for the SCALE Criticality Safety Analytical Sequences and the codes comprising those sequences. The database includes data for both cross section preparation and criticality safety calculations. 3 refs., 1 tab.

  20. Informatics in radiology: use of CouchDB for document-based storage of DICOM objects.

    PubMed

    Rascovsky, Simón J; Delgado, Jorge A; Sanz, Alexander; Calvo, Víctor D; Castrillón, Gabriel

    2012-01-01

    Picture archiving and communication systems traditionally have depended on schema-based Structured Query Language (SQL) databases for imaging data management. To optimize database size and performance, many such systems store a reduced set of Digital Imaging and Communications in Medicine (DICOM) metadata, discarding informational content that might be needed in the future. As an alternative to traditional database systems, document-based key-value stores recently have gained popularity. These systems store documents containing key-value pairs that facilitate data searches without predefined schemas. Document-based key-value stores are especially suited to archive DICOM objects because DICOM metadata are highly heterogeneous collections of tag-value pairs conveying specific information about imaging modalities, acquisition protocols, and vendor-supported postprocessing options. The authors used an open-source document-based database management system (Apache CouchDB) to create and test two such databases; CouchDB was selected for its overall ease of use, capability for managing attachments, and reliance on HTTP and Representational State Transfer standards for accessing and retrieving data. A large database was created first in which the DICOM metadata from 5880 anonymized magnetic resonance imaging studies (1,949,753 images) were loaded by using a Ruby script. To provide the usual DICOM query functionality, several predefined "views" (standard queries) were created by using JavaScript. For performance comparison, the same queries were executed in both the CouchDB database and a SQL-based DICOM archive. The capabilities of CouchDB for attachment management and database replication were separately assessed in tests of a similar, smaller database. Results showed that CouchDB allowed efficient storage and interrogation of all DICOM objects; with the use of information retrieval algorithms such as map-reduce, all the DICOM metadata stored in the large database were searchable with only a minimal increase in retrieval time over that with the traditional database management system. Results also indicated possible uses for document-based databases in data mining applications such as dose monitoring, quality assurance, and protocol optimization. RSNA, 2012

  1. Why Save Your Course as a Relational Database?

    ERIC Educational Resources Information Center

    Hamilton, Gregory C.; Katz, David L.; Davis, James E.

    2000-01-01

    Describes a system that stores course materials for computer-based training programs in a relational database called Of Course! Outlines the basic structure of the databases; explains distinctions between Of Course! and other authoring languages; and describes how data is retrieved from the database and presented to the student. (Author/LRW)

  2. A Prototype System for Retrieval of Gene Functional Information

    PubMed Central

    Folk, Lillian C.; Patrick, Timothy B.; Pattison, James S.; Wolfinger, Russell D.; Mitchell, Joyce A.

    2003-01-01

    Microarrays allow researchers to gather data about the expression patterns of thousands of genes simultaneously. Statistical analysis can reveal which genes show statistically significant results. Making biological sense of those results requires the retrieval of functional information about the genes thus identified, typically a manual gene-by-gene retrieval of information from various on-line databases. For experiments generating thousands of genes of interest, retrieval of functional information can become a significant bottleneck. To address this issue, we are currently developing a prototype system to automate the process of retrieval of functional information from multiple on-line sources. PMID:14728346

  3. Front End Software for Online Database Searching Part 1: Definitions, System Features, and Evaluation.

    ERIC Educational Resources Information Center

    Hawkins, Donald T.; Levy, Louise R.

    1985-01-01

    This initial article in series of three discusses barriers inhibiting use of current online retrieval systems by novice users and notes reasons for front end and gateway online retrieval systems. Definitions, front end features, user interface, location (personal computer, host mainframe), evaluation, and strengths and weaknesses are covered. (16…

  4. Development of a web-based video management and application processing system

    NASA Astrophysics Data System (ADS)

    Chan, Shermann S.; Wu, Yi; Li, Qing; Zhuang, Yueting

    2001-07-01

    How to facilitate efficient video manipulation and access in a web-based environment is becoming a popular trend for video applications. In this paper, we present a web-oriented video management and application processing system, based on our previous work on multimedia database and content-based retrieval. In particular, we extend the VideoMAP architecture with specific web-oriented mechanisms, which include: (1) Concurrency control facilities for the editing of video data among different types of users, such as Video Administrator, Video Producer, Video Editor, and Video Query Client; different users are assigned various priority levels for different operations on the database. (2) Versatile video retrieval mechanism which employs a hybrid approach by integrating a query-based (database) mechanism with content- based retrieval (CBR) functions; its specific language (CAROL/ST with CBR) supports spatio-temporal semantics of video objects, and also offers an improved mechanism to describe visual content of videos by content-based analysis method. (3) Query profiling database which records the `histories' of various clients' query activities; such profiles can be used to provide the default query template when a similar query is encountered by the same kind of users. An experimental prototype system is being developed based on the existing VideoMAP prototype system, using Java and VC++ on the PC platform.

  5. Research on keyword retrieval method of HBase database based on index structure

    NASA Astrophysics Data System (ADS)

    Gong, Pijin; Lv, Congmin; Gong, Yongsheng; Ma, Haozhi; Sun, Yang; Wang, Lu

    2017-10-01

    With the rapid development of manned spaceflight engineering, the scientific experimental data in space application system is increasing rapidly. How to efficiently query the specific data in the mass data volume has become a problem. In this paper, a method of retrieving the object data based on the object attribute as the keyword is proposed. The HBase database is used to store the object data and object attributes, and the secondary index is constructed. The research shows that this method is a good way to retrieve specified data based on object attributes.

  6. Content-based image retrieval from a database of fracture images

    NASA Astrophysics Data System (ADS)

    Müller, Henning; Do Hoang, Phuong Anh; Depeursinge, Adrien; Hoffmeyer, Pierre; Stern, Richard; Lovis, Christian; Geissbuhler, Antoine

    2007-03-01

    This article describes the use of a medical image retrieval system on a database of 16'000 fractures, selected from surgical routine over several years. Image retrieval has been a very active domain of research for several years. It was frequently proposed for the medical domain, but only few running systems were ever tested in clinical routine. For the planning of surgical interventions after fractures, x-ray images play an important role. The fractures are classified according to exact fracture location, plus whether and to which degree the fracture is damaging articulations to see how complicated a reparation will be. Several classification systems for fractures exist and the classification plus the experience of the surgeon lead in the end to the choice of surgical technique (screw, metal plate, ...). This choice is strongly influenced by the experience and knowledge of the surgeons with respect to a certain technique. Goal of this article is to describe a prototype that supplies similar cases to an example to help treatment planning and find the most appropriate technique for a surgical intervention. Our database contains over 16'000 fracture images before and after a surgical intervention. We use an image retrieval system (GNU Image Finding Tool, GIFT) to find cases/images similar to an example case currently under observation. Problems encountered are varying illumination of images as well as strong anatomic differences between patients. Regions of interest are usually small and the retrieval system needs to focus on this region. Results show that GIFT is capable of supplying similar cases, particularly when using relevance feedback, on such a large database. Usual image retrieval is based on a single image as search target but for this application we have to select images by case as similar cases need to be found and not images. A few false positive cases often remain in the results but they can be sorted out quickly by the surgeons. Image retrieval can well be used for the planning of operations by supplying similar cases. A variety of challenges has been identified and partly solved (varying luminosity, small region of interested, case-based instead of image-based). This article mainly presents a case study to identify potential benefits and problems. Several steps for improving the system have been identified as well and will be described at the end of the paper.

  7. An Efficient Method for the Retrieval of Objects by Topological Relations in Spatial Database Systems.

    ERIC Educational Resources Information Center

    Lin, P. L.; Tan, W. H.

    2003-01-01

    Presents a new method to improve the performance of query processing in a spatial database. Experiments demonstrated that performance of database systems can be improved because both the number of objects accessed and number of objects requiring detailed inspection are much less than those in the previous approach. (AEF)

  8. A fully automatic end-to-end method for content-based image retrieval of CT scans with similar liver lesion annotations.

    PubMed

    Spanier, A B; Caplan, N; Sosna, J; Acar, B; Joskowicz, L

    2018-01-01

    The goal of medical content-based image retrieval (M-CBIR) is to assist radiologists in the decision-making process by retrieving medical cases similar to a given image. One of the key interests of radiologists is lesions and their annotations, since the patient treatment depends on the lesion diagnosis. Therefore, a key feature of M-CBIR systems is the retrieval of scans with the most similar lesion annotations. To be of value, M-CBIR systems should be fully automatic to handle large case databases. We present a fully automatic end-to-end method for the retrieval of CT scans with similar liver lesion annotations. The input is a database of abdominal CT scans labeled with liver lesions, a query CT scan, and optionally one radiologist-specified lesion annotation of interest. The output is an ordered list of the database CT scans with the most similar liver lesion annotations. The method starts by automatically segmenting the liver in the scan. It then extracts a histogram-based features vector from the segmented region, learns the features' relative importance, and ranks the database scans according to the relative importance measure. The main advantages of our method are that it fully automates the end-to-end querying process, that it uses simple and efficient techniques that are scalable to large datasets, and that it produces quality retrieval results using an unannotated CT scan. Our experimental results on 9 CT queries on a dataset of 41 volumetric CT scans from the 2014 Image CLEF Liver Annotation Task yield an average retrieval accuracy (Normalized Discounted Cumulative Gain index) of 0.77 and 0.84 without/with annotation, respectively. Fully automatic end-to-end retrieval of similar cases based on image information alone, rather that on disease diagnosis, may help radiologists to better diagnose liver lesions.

  9. Social media based NPL system to find and retrieve ARM data: Concept paper

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Devarakonda, Ranjeet; Giansiracusa, Michael T.; Kumar, Jitendra

    Information connectivity and retrieval has a role in our daily lives. The most pervasive source of online information is databases. The amount of data is growing at rapid rate and database technology is improving and having a profound effect. Almost all online applications are storing and retrieving information from databases. One challenge in supplying the public with wider access to informational databases is the need for knowledge of database languages like Structured Query Language (SQL). Although the SQL language has been published in many forms, not everybody is able to write SQL queries. Another challenge is that it may notmore » be practical to make the public aware of the structure of the database. There is a need for novice users to query relational databases using their natural language. To solve this problem, many natural language interfaces to structured databases have been developed. The goal is to provide more intuitive method for generating database queries and delivering responses. Social media makes it possible to interact with a wide section of the population. Through this medium, and with the help of Natural Language Processing (NLP) we can make the data of the Atmospheric Radiation Measurement Data Center (ADC) more accessible to the public. We propose an architecture for using Apache Lucene/Solr [1], OpenML [2,3], and Kafka [4] to generate an automated query/response system with inputs from Twitter5, our Cassandra DB, and our log database. Using the Twitter API and NLP we can give the public the ability to ask questions of our database and get automated responses.« less

  10. Etude Comparative des Systemes de Reperage de l'Information Bibliographique en Mode Dialogue. Rapports de Recherche, No. 13 (Comparative Study of Systems for the Online Retrieval of Bibliographic Information. Research Report No. 13).

    ERIC Educational Resources Information Center

    Perusse, Lyse

    This study, which was conducted at the University of Quebec over a period of six months, focuses on four systems for online information retrieval that are available to users at the university and provide access to some of the same databases: BRS (Bibliographic Retrieval Services Inc.), CAN/OLE (Canadian On-Line Inquiry), Lockheed DIALOG, and…

  11. A Community Data Model for Hydrologic Observations

    NASA Astrophysics Data System (ADS)

    Tarboton, D. G.; Horsburgh, J. S.; Zaslavsky, I.; Maidment, D. R.; Valentine, D.; Jennings, B.

    2006-12-01

    The CUAHSI Hydrologic Information System project is developing information technology infrastructure to support hydrologic science. Hydrologic information science involves the description of hydrologic environments in a consistent way, using data models for information integration. This includes a hydrologic observations data model for the storage and retrieval of hydrologic observations in a relational database designed to facilitate data retrieval for integrated analysis of information collected by multiple investigators. It is intended to provide a standard format to facilitate the effective sharing of information between investigators and to facilitate analysis of information within a single study area or hydrologic observatory, or across hydrologic observatories and regions. The observations data model is designed to store hydrologic observations and sufficient ancillary information (metadata) about the observations to allow them to be unambiguously interpreted and used and provide traceable heritage from raw measurements to usable information. The design is based on the premise that a relational database at the single observation level is most effective for providing querying capability and cross dimension data retrieval and analysis. This premise is being tested through the implementation of a prototype hydrologic observations database, and the development of web services for the retrieval of data from and ingestion of data into the database. These web services hosted by the San Diego Supercomputer center make data in the database accessible both through a Hydrologic Data Access System portal and directly from applications software such as Excel, Matlab and ArcGIS that have Standard Object Access Protocol (SOAP) capability. This paper will (1) describe the data model; (2) demonstrate the capability for representing diverse data in the same database; (3) demonstrate the use of the database from applications software for the performance of hydrologic analysis across different observation types.

  12. User's manual for the national water information system of the U.S. Geological Survey: Ground-water site-inventory system

    USGS Publications Warehouse

    ,

    2004-01-01

    The Ground-Water Site-Inventory (GWSI) System is a ground-water data storage and retrieval system that is part of the National Water Information System (NWIS) developed by the U.S. Geological Survey (USGS). The NWIS is a distributed water database in which data can be processed over a network of workstations and file servers at USGS offices throughout the United States. This system comprises the GWSI, the Automated Data Processing System (ADAPS), the Water-Quality System (QWDATA), and the Site-Specific Water-Use Data System (SWUDS). The GWSI System provides for entering new sites and updating existing sites within the local database. In addition, the GWSI provides for retrieving and displaying ground-water and sitefile data stored in the local database. Finally, the GWSI provides for routine maintenance of the local and national data records. This manual contains instructions for users of the GWSI and discusses the general operating procedures for the programs found within the GWSI Main Menu.

  13. User's Manual for the National Water Information System of the U.S. Geological Survey: Ground-water site-inventory system

    USGS Publications Warehouse

    ,

    2005-01-01

    The Ground-Water Site-Inventory (GWSI) System is a ground-water data storage and retrieval system that is part of the National Water Information System (NWIS) developed by the U.S. Geological Survey (USGS). The NWIS is a distributed water database in which data can be processed over a network of workstations and file servers at USGS offices throughout the United States. This system comprises the GWSI, the Automated Data Processing System (ADAPS), the Water-Quality System (QWDATA), and the Site- Specific Water-Use Data System (SWUDS). The GWSI System provides for entering new sites and updating existing sites within the local database. In addition, the GWSI provides for retrieving and displaying groundwater and Sitefile data stored in the local database. Finally, the GWSI provides for routine maintenance of the local and national data records. This manual contains instructions for users of the GWSI and discusses the general operating procedures for the programs found within the GWSI Main Menu.

  14. Comparison of the effectiveness of alternative feature sets in shape retrieval of multicomponent images

    NASA Astrophysics Data System (ADS)

    Eakins, John P.; Edwards, Jonathan D.; Riley, K. Jonathan; Rosin, Paul L.

    2001-01-01

    Many different kinds of features have been used as the basis for shape retrieval from image databases. This paper investigates the relative effectiveness of several types of global shape feature, both singly and in combination. The features compared include well-established descriptors such as Fourier coefficients and moment invariants, as well as recently-proposed measures of triangularity and ellipticity. Experiments were conducted within the framework of the ARTISAN shape retrieval system, and retrieval effectiveness assessed on a database of over 10,000 images, using 24 queries and associated ground truth supplied by the UK Patent Office . Our experiments revealed only minor differences in retrieval effectiveness between different measures, suggesting that a wide variety of shape feature combinations can provide adequate discriminating power for effective shape retrieval in multi-component image collections such as trademark registries. Marked differences between measures were observed for some individual queries, suggesting that there could be considerable scope for improving retrieval effectiveness by providing users with an improved framework for searching multi-dimensional feature space.

  15. Comparison of the effectiveness of alternative feature sets in shape retrieval of multicomponent images

    NASA Astrophysics Data System (ADS)

    Eakins, John P.; Edwards, Jonathan D.; Riley, K. Jonathan; Rosin, Paul L.

    2000-12-01

    Many different kinds of features have been used as the basis for shape retrieval from image databases. This paper investigates the relative effectiveness of several types of global shape feature, both singly and in combination. The features compared include well-established descriptors such as Fourier coefficients and moment invariants, as well as recently-proposed measures of triangularity and ellipticity. Experiments were conducted within the framework of the ARTISAN shape retrieval system, and retrieval effectiveness assessed on a database of over 10,000 images, using 24 queries and associated ground truth supplied by the UK Patent Office . Our experiments revealed only minor differences in retrieval effectiveness between different measures, suggesting that a wide variety of shape feature combinations can provide adequate discriminating power for effective shape retrieval in multi-component image collections such as trademark registries. Marked differences between measures were observed for some individual queries, suggesting that there could be considerable scope for improving retrieval effectiveness by providing users with an improved framework for searching multi-dimensional feature space.

  16. Finding relevant biomedical datasets: the UC San Diego solution for the bioCADDIE Retrieval Challenge

    PubMed Central

    Wei, Wei; Ji, Zhanglong; He, Yupeng; Zhang, Kai; Ha, Yuanchi; Li, Qi; Ohno-Machado, Lucila

    2018-01-01

    Abstract The number and diversity of biomedical datasets grew rapidly in the last decade. A large number of datasets are stored in various repositories, with different formats. Existing dataset retrieval systems lack the capability of cross-repository search. As a result, users spend time searching datasets in known repositories, and they typically do not find new repositories. The biomedical and healthcare data discovery index ecosystem (bioCADDIE) team organized a challenge to solicit new indexing and searching strategies for retrieving biomedical datasets across repositories. We describe the work of one team that built a retrieval pipeline and examined its performance. The pipeline used online resources to supplement dataset metadata, automatically generated queries from users’ free-text questions, produced high-quality retrieval results and achieved the highest inferred Normalized Discounted Cumulative Gain among competitors. The results showed that it is a promising solution for cross-database, cross-domain and cross-repository biomedical dataset retrieval. Database URL: https://github.com/w2wei/dataset_retrieval_pipeline PMID:29688374

  17. Development of a full-text information retrieval system

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Keizo Oyama; AKira Miyazawa, Atsuhiro Takasu; Kouji Shibano

    The authors have executed a project to realize a full-text information retrieval system. The system is designed to deal with a document database comprising full text of a large number of documents such as academic papers. The document structures are utilized in searching and extracting appropriate information. The concept of structure handling and the configuration of the system are described in this paper.

  18. An Intelligent Pictorial Information System

    NASA Astrophysics Data System (ADS)

    Lee, Edward T.; Chang, B.

    1987-05-01

    In examining the history of computer application, we discover that early computer systems were developed primarily for applications related to scientific computation, as in weather prediction, aerospace applications, and nuclear physics applications. At this stage, the computer system served as a big calculator to perform, in the main, manipulation of numbers. Then it was found that computer systems could also be used for business applications, information storage and retrieval, word processing, and report generation. The history of computer application is summarized in Table I. The complexity of pictures makes picture processing much more difficult than number and alphanumerical processing. Therefore, new techniques, new algorithms, and above all, new pictorial knowledge, [1] are needed to overcome the limitatins of existing computer systems. New frontiers in designing computer systems are the ways to handle the representation,[2,3] classification, manipulation, processing, storage, and retrieval of pictures. Especially, the ways to deal with similarity measures and the meaning of the word "approximate" and the phrase "approximate reasoning" are an important and an indispensable part of an intelligent pictorial information system. [4,5] The main objective of this paper is to investigate the mathematical foundation for the effective organization and efficient retrieval of pictures in similarity-directed pictorial databases, [6] based on similarity retrieval techniques [7] and fuzzy languages [8]. The main advantage of this approach is that similar pictures are stored logically close to each other by using quantitative similarity measures. Thus, for answering queries, the amount of picture data needed to be searched can be reduced and the retrieval time can be improved. In addition, in a pictorial database, very often it is desired to find pictures (or feature vectors, histograms, etc.) that are most similar to or most dissimilar [9] to a test picture (or feature vector). Using similarity measures, one can not only store similar pictures logically or physically close to each other in order to improve retrieval or updating efficiency, one can also use such similarity measures to answer fuzzy queries involving nonexact retrieval conditions. In this paper, similarity directed pictorial databases involving geometric figures, chromosome images, [10] leukocyte images, cardiomyopathy images, and satellite images [11] are presented as illustrative examples.

  19. Constraining the Structure of Hot Jupiter Atmospheres Using a Hybrid Version of the NEMESIS Retrieval Algorithm

    NASA Astrophysics Data System (ADS)

    Badhan, Mahmuda A.; Mandell, Avi M.; Hesman, Brigette; Nixon, Conor; Deming, Drake; Irwin, Patrick; Barstow, Joanna; Garland, Ryan

    2015-11-01

    Understanding the formation environments and evolution scenarios of planets in nearby planetary systems requires robust measures for constraining their atmospheric physical properties. Here we have utilized a combination of two different parameter retrieval approaches, Optimal Estimation and Markov Chain Monte Carlo, as part of the well-validated NEMESIS atmospheric retrieval code, to infer a range of temperature profiles and molecular abundances of H2O, CO2, CH4 and CO from available dayside thermal emission observations of several hot-Jupiter candidates. In order to keep the number of parameters low and henceforth retrieve more plausible profile shapes, we have used a parametrized form of the temperature profile based upon an analytic radiative equilibrium derivation in Guillot et al. 2010 (Line et al. 2012, 2014). We show retrieval results on published spectroscopic and photometric data from both the Hubble Space Telescope and Spitzer missions, and compare them with simulations from the upcoming JWST mission. In addition, since NEMESIS utilizes correlated distribution of absorption coefficients (k-distribution) amongst atmospheric layers to compute these models, updates to spectroscopic databases can impact retrievals quite significantly for such high-temperature atmospheres. As high-temperature line databases are continually being improved, we also compare retrievals between old and newer databases.

  20. Space Images for NASA JPL Android Version

    NASA Technical Reports Server (NTRS)

    Nelson, Jon D.; Gutheinz, Sandy C.; Strom, Joshua R.; Arca, Jeremy M.; Perez, Martin; Boggs, Karen; Stanboli, Alice

    2013-01-01

    This software addresses the demand for easily accessible NASA JPL images and videos by providing a user friendly and simple graphical user interface that can be run via the Android platform from any location where Internet connection is available. This app is complementary to the iPhone version of the application. A backend infrastructure stores, tracks, and retrieves space images from the JPL Photojournal and Institutional Communications Web server, and catalogs the information into a streamlined rating infrastructure. This system consists of four distinguishing components: image repository, database, server-side logic, and Android mobile application. The image repository contains images from various JPL flight projects. The database stores the image information as well as the user rating. The server-side logic retrieves the image information from the database and categorizes each image for display. The Android mobile application is an interfacing delivery system that retrieves the image information from the server for each Android mobile device user. Also created is a reporting and tracking system for charting and monitoring usage. Unlike other Android mobile image applications, this system uses the latest emerging technologies to produce image listings based directly on user input. This allows for countless combinations of images returned. The backend infrastructure uses industry-standard coding and database methods, enabling future software improvement and technology updates. The flexibility of the system design framework permits multiple levels of display possibilities and provides integration capabilities. Unique features of the software include image/video retrieval from a selected set of categories, image Web links that can be shared among e-mail users, sharing to Facebook/Twitter, marking as user's favorites, and image metadata searchable for instant results.

  1. Term Dependence: Truncating the Bahadur Lazarsfeld Expansion.

    ERIC Educational Resources Information Center

    Losee, Robert M., Jr.

    1994-01-01

    Studies the performance of probabilistic information retrieval systems using differing statistical dependence assumptions when estimating the probabilities inherent in the retrieval model. Experimental results using the Bahadur Lazarsfeld expansion on the Cystic Fibrosis database are discussed that suggest that incorporating term dependence…

  2. Managing Heterogeneous Information Systems through Discovery and Retrieval of Generic Concepts.

    ERIC Educational Resources Information Center

    Srinivasan, Uma; Ngu, Anne H. H.; Gedeon, Tom

    2000-01-01

    Introduces a conceptual integration approach to heterogeneous databases or information systems that exploits the similarity in metalevel information and performs metadata mining on database objects to discover a set of concepts that serve as a domain abstraction and provide a conceptual layer above existing legacy systems. Presents results of…

  3. Automated search and retrieval of information from imaged documents using optical correlation techniques

    NASA Astrophysics Data System (ADS)

    Stalcup, Bruce W.; Dennis, Phillip W.; Dydyk, Robert B.

    1999-10-01

    Litton PRC and Litton Data Systems Division are developing a system, the Imaged Document Optical Correlation and Conversion System (IDOCCS), to provide a total solution to the problem of managing and retrieving textual and graphic information from imaged document archives. At the heart of IDOCCS, optical correlation technology provides the search and retrieval of information from imaged documents. IDOCCS can be used to rapidly search for key words or phrases within the imaged document archives. In addition, IDOCCS can automatically compare an input document with the archived database to determine if it is a duplicate, thereby reducing the overall resources required to maintain and access the document database. Embedded graphics on imaged pages can also be exploited; e.g., imaged documents containing an agency's seal or logo can be singled out. In this paper, we present a description of IDOCCS as well as preliminary performance results and theoretical projections.

  4. RDFBuilder: a tool to automatically build RDF-based interfaces for MAGE-OM microarray data sources.

    PubMed

    Anguita, Alberto; Martin, Luis; Garcia-Remesal, Miguel; Maojo, Victor

    2013-07-01

    This paper presents RDFBuilder, a tool that enables RDF-based access to MAGE-ML-compliant microarray databases. We have developed a system that automatically transforms the MAGE-OM model and microarray data stored in the ArrayExpress database into RDF format. Additionally, the system automatically enables a SPARQL endpoint. This allows users to execute SPARQL queries for retrieving microarray data, either from specific experiments or from more than one experiment at a time. Our system optimizes response times by caching and reusing information from previous queries. In this paper, we describe our methods for achieving this transformation. We show that our approach is complementary to other existing initiatives, such as Bio2RDF, for accessing and retrieving data from the ArrayExpress database. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  5. Buckets: Smart Objects for Digital Libraries

    NASA Technical Reports Server (NTRS)

    Nelson, Michael L.

    2001-01-01

    Current discussion of digital libraries (DLs) is often dominated by the merits of the respective storage, search and retrieval functionality of archives, repositories, search engines, search interfaces and database systems. While these technologies are necessary for information management, the information content is more important than the systems used for its storage and retrieval. Digital information should have the same long-term survivability prospects as traditional hardcopy information and should be protected to the extent possible from evolving search engine technologies and vendor vagaries in database management systems. Information content and information retrieval systems should progress on independent paths and make limited assumptions about the status or capabilities of the other. Digital information can achieve independence from archives and DL systems through the use of buckets. Buckets are an aggregative, intelligent construct for publishing in DLs. Buckets allow the decoupling of information content from information storage and retrieval. Buckets exist within the Smart Objects and Dumb Archives model for DLs in that many of the functionalities and responsibilities traditionally associated with archives are pushed down (making the archives dumber) into the buckets (making them smarter). Some of the responsibilities imbued to buckets are the enforcement of their terms and conditions, and maintenance and display of their contents.

  6. EROS Main Image File: A Picture Perfect Database for Landsat Imagery and Aerial Photography.

    ERIC Educational Resources Information Center

    Jack, Robert F.

    1984-01-01

    Describes Earth Resources Observation System online database, which provides access to computerized images of Earth obtained via satellite. Highlights include retrieval system and commands, types of images, search strategies, other online functions, and interpretation of accessions. Satellite information, sources and samples of accessions, and…

  7. Content Based Retrieval Database Management System with Support for Similarity Searching and Query Refinement

    DTIC Science & Technology

    2002-01-01

    to the OODBMS approach. The ORDBMS approach produced such research prototypes as Postgres [155], and Starburst [67] and commercial products such as...Kemnitz. The POSTGRES Next-Generation Database Management System. Communications of the ACM, 34(10):78–92, 1991. [156] Michael Stonebreaker and Dorothy

  8. An end to end secure CBIR over encrypted medical database.

    PubMed

    Bellafqira, Reda; Coatrieux, Gouenou; Bouslimi, Dalel; Quellec, Gwenole

    2016-08-01

    In this paper, we propose a new secure content based image retrieval (SCBIR) system adapted to the cloud framework. This solution allows a physician to retrieve images of similar content within an outsourced and encrypted image database, without decrypting them. Contrarily to actual CBIR approaches in the encrypted domain, the originality of the proposed scheme stands on the fact that the features extracted from the encrypted images are themselves encrypted. This is achieved by means of homomorphic encryption and two non-colluding servers, we however both consider as honest but curious. In that way an end to end secure CBIR process is ensured. Experimental results carried out on a diabetic retinopathy database encrypted with the Paillier cryptosystem indicate that our SCBIR achieves retrieval performance as good as if images were processed in their non-encrypted form.

  9. Science information systems: Archive, access, and retrieval

    NASA Technical Reports Server (NTRS)

    Campbell, William J.

    1991-01-01

    The objective of this research is to develop technology for the automated characterization and interactive retrieval and visualization of very large, complex scientific data sets. Technologies will be developed for the following specific areas: (1) rapidly archiving data sets; (2) automatically characterizing and labeling data in near real-time; (3) providing users with the ability to browse contents of databases efficiently and effectively; (4) providing users with the ability to access and retrieve system independent data sets electronically; and (5) automatically alerting scientists to anomalies detected in data.

  10. Tautomerism in chemical information management systems

    NASA Astrophysics Data System (ADS)

    Warr, Wendy A.

    2010-06-01

    Tautomerism has an impact on many of the processes in chemical information management systems including novelty checking during registration into chemical structure databases; storage of structures; exact and substructure searching in chemical structure databases; and depiction of structures retrieved by a search. The approaches taken by 27 different software vendors and database producers are compared. It is hoped that this comparison will act as a discussion document that could ultimately improve databases and software for researchers in the future.

  11. Bookshelf: a simple curation system for the storage of biomolecular simulation data.

    PubMed

    Vohra, Shabana; Hall, Benjamin A; Holdbrook, Daniel A; Khalid, Syma; Biggin, Philip C

    2010-01-01

    Molecular dynamics simulations can now routinely generate data sets of several hundreds of gigabytes in size. The ability to generate this data has become easier over recent years and the rate of data production is likely to increase rapidly in the near future. One major problem associated with this vast amount of data is how to store it in a way that it can be easily retrieved at a later date. The obvious answer to this problem is a database. However, a key issue in the development and maintenance of such a database is its sustainability, which in turn depends on the ease of the deposition and retrieval process. Encouraging users to care about meta-data is difficult and thus the success of any storage system will ultimately depend on how well used by end-users the system is. In this respect we suggest that even a minimal amount of metadata if stored in a sensible fashion is useful, if only at the level of individual research groups. We discuss here, a simple database system which we call 'Bookshelf', that uses python in conjunction with a mysql database to provide an extremely simple system for curating and keeping track of molecular simulation data. It provides a user-friendly, scriptable solution to the common problem amongst biomolecular simulation laboratories; the storage, logging and subsequent retrieval of large numbers of simulations. Download URL: http://sbcb.bioch.ox.ac.uk/bookshelf/

  12. Bookshelf: a simple curation system for the storage of biomolecular simulation data

    PubMed Central

    Vohra, Shabana; Hall, Benjamin A.; Holdbrook, Daniel A.; Khalid, Syma; Biggin, Philip C.

    2010-01-01

    Molecular dynamics simulations can now routinely generate data sets of several hundreds of gigabytes in size. The ability to generate this data has become easier over recent years and the rate of data production is likely to increase rapidly in the near future. One major problem associated with this vast amount of data is how to store it in a way that it can be easily retrieved at a later date. The obvious answer to this problem is a database. However, a key issue in the development and maintenance of such a database is its sustainability, which in turn depends on the ease of the deposition and retrieval process. Encouraging users to care about meta-data is difficult and thus the success of any storage system will ultimately depend on how well used by end-users the system is. In this respect we suggest that even a minimal amount of metadata if stored in a sensible fashion is useful, if only at the level of individual research groups. We discuss here, a simple database system which we call ‘Bookshelf’, that uses python in conjunction with a mysql database to provide an extremely simple system for curating and keeping track of molecular simulation data. It provides a user-friendly, scriptable solution to the common problem amongst biomolecular simulation laboratories; the storage, logging and subsequent retrieval of large numbers of simulations. Download URL: http://sbcb.bioch.ox.ac.uk/bookshelf/ PMID:21169341

  13. A comparison of Boolean-based retrieval to the WAIS system for retrieval of aeronautical information

    NASA Technical Reports Server (NTRS)

    Marchionini, Gary; Barlow, Diane

    1994-01-01

    An evaluation of an information retrieval system using a Boolean-based retrieval engine and inverted file architecture and WAIS, which uses a vector-based engine, was conducted. Four research questions in aeronautical engineering were used to retrieve sets of citations from the NASA Aerospace Database which was mounted on a WAIS server and available through Dialog File 108 which served as the Boolean-based system (BBS). High recall and high precision searches were done in the BBS and terse and verbose queries were used in the WAIS condition. Precision values for the WAIS searches were consistently above the precision values for high recall BBS searches and consistently below the precision values for high precision BBS searches. Terse WAIS queries gave somewhat better precision performance than verbose WAIS queries. In every case, a small number of relevant documents retrieved by one system were not retrieved by the other, indicating the incomplete nature of the results from either retrieval system. Relevant documents in the WAIS searches were found to be randomly distributed in the retrieved sets rather than distributed by ranks. Advantages and limitations of both types of systems are discussed.

  14. Information Retrieval in Telemedicine: a Comparative Study on Bibliographic Databases

    PubMed Central

    Ahmadi, Maryam; Sarabi, Roghayeh Ershad; Orak, Roohangiz Jamshidi; Bahaadinbeigy, Kambiz

    2015-01-01

    Background and Aims: The first step in each systematic review is selection of the most valid database that can provide the highest number of relevant references. This study was carried out to determine the most suitable database for information retrieval in telemedicine field. Methods: Cinhal, PubMed, Web of Science and Scopus databases were searched for telemedicine matched with Education, cost benefit and patient satisfaction. After analysis of the obtained results, the accuracy coefficient, sensitivity, uniqueness and overlap of databases were calculated. Results: The studied databases differed in the number of retrieved articles. PubMed was identified as the most suitable database for retrieving information on the selected topics with the accuracy and sensitivity ratios of 50.7% and 61.4% respectively. The uniqueness percent of retrieved articles ranged from 38% for Pubmed to 3.0% for Cinhal. The highest overlap rate (18.6%) was found between PubMed and Web of Science. Less than 1% of articles have been indexed in all searched databases. Conclusion: PubMed is suggested as the most suitable database for starting search in telemedicine and after PubMed, Scopus and Web of Science can retrieve about 90% of the relevant articles. PMID:26236086

  15. Information Retrieval in Telemedicine: a Comparative Study on Bibliographic Databases.

    PubMed

    Ahmadi, Maryam; Sarabi, Roghayeh Ershad; Orak, Roohangiz Jamshidi; Bahaadinbeigy, Kambiz

    2015-06-01

    The first step in each systematic review is selection of the most valid database that can provide the highest number of relevant references. This study was carried out to determine the most suitable database for information retrieval in telemedicine field. Cinhal, PubMed, Web of Science and Scopus databases were searched for telemedicine matched with Education, cost benefit and patient satisfaction. After analysis of the obtained results, the accuracy coefficient, sensitivity, uniqueness and overlap of databases were calculated. The studied databases differed in the number of retrieved articles. PubMed was identified as the most suitable database for retrieving information on the selected topics with the accuracy and sensitivity ratios of 50.7% and 61.4% respectively. The uniqueness percent of retrieved articles ranged from 38% for Pubmed to 3.0% for Cinhal. The highest overlap rate (18.6%) was found between PubMed and Web of Science. Less than 1% of articles have been indexed in all searched databases. PubMed is suggested as the most suitable database for starting search in telemedicine and after PubMed, Scopus and Web of Science can retrieve about 90% of the relevant articles.

  16. Recent Experiments with INQUERY

    DTIC Science & Technology

    1995-11-01

    were conducted with version of the INQUERY information retrieval system INQUERY is based on the Bayesian inference network retrieval model It is...corpus based query expansion For TREC a subset of of the adhoc document set was used to build the InFinder database None of the...experiments that showed signi cant improvements in retrieval eectiveness when document rankings based on the entire document text are combined with

  17. Object-oriented analysis and design of an ECG storage and retrieval system integrated with an HIS.

    PubMed

    Wang, C; Ohe, K; Sakurai, T; Nagase, T; Kaihara, S

    1996-03-01

    For a hospital information system, object-oriented methodology plays an increasingly important role, especially for the management of digitized data, e.g., the electrocardiogram, electroencephalogram, electromyogram, spirogram, X-ray, CT and histopathological images, which are not yet computerized in most hospitals. As a first step in an object-oriented approach to hospital information management and storing medical data in an object-oriented database, we connected electrocardiographs to a hospital network and established the integration of ECG storage and retrieval systems with a hospital information system. In this paper, the object-oriented analysis and design of the ECG storage and retrieval systems is reported.

  18. Computer Storage and Retrieval of Position - Dependent Data.

    DTIC Science & Technology

    1982-06-01

    This thesis covers the design of a new digital database system to replace the merged (observation and geographic location) record, one file per cruise...68 "The Digital Data Library System: Library Storage and Retrieval of Digital Geophysical Data" by Robert C. Groan) provided a relatively simple...dependent, ’geophysical’ data. The system is operational on a Digital Equipment Corporation VAX-11/780 computer. Values of measured and computed

  19. Keyless Entry: Building a Text Database Using OCR Technology.

    ERIC Educational Resources Information Center

    Grotophorst, Clyde W.

    1989-01-01

    Discusses the use of optical character recognition (OCR) technology to produce an ASCII text database. A tutorial on digital scanning and OCR is provided, and a systems integration project which used the Calera CDP-3000XF scanner and text retrieval software to construct a database of dissertations at George Mason University is described. (four…

  20. Computer systems and methods for the query and visualization of multidimensional databases

    DOEpatents

    Stolte, Chris; Tang, Diane L.; Hanrahan, Patrick

    2006-08-08

    A method and system for producing graphics. A hierarchical structure of a database is determined. A visual table, comprising a plurality of panes, is constructed by providing a specification that is in a language based on the hierarchical structure of the database. In some cases, this language can include fields that are in the database schema. The database is queried to retrieve a set of tuples in accordance with the specification. A subset of the set of tuples is associated with a pane in the plurality of panes.

  1. Computer systems and methods for the query and visualization of multidimensional database

    DOEpatents

    Stolte, Chris; Tang, Diane L.; Hanrahan, Patrick

    2010-05-11

    A method and system for producing graphics. A hierarchical structure of a database is determined. A visual table, comprising a plurality of panes, is constructed by providing a specification that is in a language based on the hierarchical structure of the database. In some cases, this language can include fields that are in the database schema. The database is queried to retrieve a set of tuples in accordance with the specification. A subset of the set of tuples is associated with a pane in the plurality of panes.

  2. MRNIDX - Marine Data Index: Database Description, Operation, Retrieval, and Display

    USGS Publications Warehouse

    Paskevich, Valerie F.

    1982-01-01

    A database referencing the location and content of data stored on magnetic medium was designed to assist in the indexing of time-series and spatially dependent marine geophysical data collected or processed by the U. S. Geological Survey. The database was designed and created for input to the Geologic Retrieval and Synopsis Program (GRASP) to allow selective retrievals of information pertaining to location of data, data format, cruise, geographical bounds and collection dates of data. This information is then used to locate the stored data for administrative purposes or further processing. Database utilization is divided into three distinct operations. The first is the inventorying of the data and the updating of the database, the second is the retrieval of information from the database, and the third is the graphic display of the geographical boundaries to which the retrieved information pertains.

  3. Medical image retrieval system using multiple features from 3D ROIs

    NASA Astrophysics Data System (ADS)

    Lu, Hongbing; Wang, Weiwei; Liao, Qimei; Zhang, Guopeng; Zhou, Zhiming

    2012-02-01

    Compared to a retrieval using global image features, features extracted from regions of interest (ROIs) that reflect distribution patterns of abnormalities would benefit more for content-based medical image retrieval (CBMIR) systems. Currently, most CBMIR systems have been designed for 2D ROIs, which cannot reflect 3D anatomical features and region distribution of lesions comprehensively. To further improve the accuracy of image retrieval, we proposed a retrieval method with 3D features including both geometric features such as Shape Index (SI) and Curvedness (CV) and texture features derived from 3D Gray Level Co-occurrence Matrix, which were extracted from 3D ROIs, based on our previous 2D medical images retrieval system. The system was evaluated with 20 volume CT datasets for colon polyp detection. Preliminary experiments indicated that the integration of morphological features with texture features could improve retrieval performance greatly. The retrieval result using features extracted from 3D ROIs accorded better with the diagnosis from optical colonoscopy than that based on features from 2D ROIs. With the test database of images, the average accuracy rate for 3D retrieval method was 76.6%, indicating its potential value in clinical application.

  4. Location-Driven Image Retrieval for Images Collected by a Mobile Robot

    NASA Astrophysics Data System (ADS)

    Tanaka, Kanji; Hirayama, Mitsuru; Okada, Nobuhiro; Kondo, Eiji

    Mobile robot teleoperation is a method for a human user to interact with a mobile robot over time and distance. Successful teleoperation depends on how well images taken by the mobile robot are visualized to the user. To enhance the efficiency and flexibility of the visualization, an image retrieval system on such a robot’s image database would be very useful. The main difference of the robot’s image database from standard image databases is that various relevant images exist due to variety of viewing conditions. The main contribution of this paper is to propose an efficient retrieval approach, named location-driven approach, utilizing correlation between visual features and real world locations of images. Combining the location-driven approach with the conventional feature-driven approach, our goal can be viewed as finding an optimal classifier between relevant and irrelevant feature-location pairs. An active learning technique based on support vector machine is extended for this aim.

  5. Publication search and retrieval system

    USGS Publications Warehouse

    Winget, Elizabeth A.

    1981-01-01

    The publication search and retrieval system of the Branch of Atlantic-Gulf of Mexico Geology, U.S. Geological Survey, Woods Hole, Mass., is a procedure for listing and describing branch-sponsored publications. It is designed for maintenance and retrieval by those having limited knowledge of computer languages and programs. Because this branch currently utilizes the Hewlett-Packard HP-1000 computer with RTE-IVB operating system, database entry and maintenance is performed in accordance with the TE-IVB Terminal User’s Reference Manual (Hewlett-Packard Company, 1980) and within the constraints of GRASP (Bowen and Botbol, 1975) and WOLF (Evenden, 1978).

  6. A novel biomedical image indexing and retrieval system via deep preference learning.

    PubMed

    Pang, Shuchao; Orgun, Mehmet A; Yu, Zhezhou

    2018-05-01

    The traditional biomedical image retrieval methods as well as content-based image retrieval (CBIR) methods originally designed for non-biomedical images either only consider using pixel and low-level features to describe an image or use deep features to describe images but still leave a lot of room for improving both accuracy and efficiency. In this work, we propose a new approach, which exploits deep learning technology to extract the high-level and compact features from biomedical images. The deep feature extraction process leverages multiple hidden layers to capture substantial feature structures of high-resolution images and represent them at different levels of abstraction, leading to an improved performance for indexing and retrieval of biomedical images. We exploit the current popular and multi-layered deep neural networks, namely, stacked denoising autoencoders (SDAE) and convolutional neural networks (CNN) to represent the discriminative features of biomedical images by transferring the feature representations and parameters of pre-trained deep neural networks from another domain. Moreover, in order to index all the images for finding the similarly referenced images, we also introduce preference learning technology to train and learn a kind of a preference model for the query image, which can output the similarity ranking list of images from a biomedical image database. To the best of our knowledge, this paper introduces preference learning technology for the first time into biomedical image retrieval. We evaluate the performance of two powerful algorithms based on our proposed system and compare them with those of popular biomedical image indexing approaches and existing regular image retrieval methods with detailed experiments over several well-known public biomedical image databases. Based on different criteria for the evaluation of retrieval performance, experimental results demonstrate that our proposed algorithms outperform the state-of-the-art techniques in indexing biomedical images. We propose a novel and automated indexing system based on deep preference learning to characterize biomedical images for developing computer aided diagnosis (CAD) systems in healthcare. Our proposed system shows an outstanding indexing ability and high efficiency for biomedical image retrieval applications and it can be used to collect and annotate the high-resolution images in a biomedical database for further biomedical image research and applications. Copyright © 2018 Elsevier B.V. All rights reserved.

  7. Aquatic information and retrieval (AQUIRE) database system

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hunter, R.; Niemi, G.; Pilli, A.

    The AQUIRE database system is one of the foremost international resources for finding aquatic toxicity information. Information in the system is organized around the concept of an 'aquatic toxicity test.' A toxicity test record contains information about the chemical, species, endpoint, endpoint concentrations, and test conditions under which the toxicity test was conducted. For the past 10 years aquatic literature has been reviewed and entered into the system. Currently, the AQUIRE database system contains data on more than 2,400 species, 160 endpoints, 5,000 chemicals, 6,000 references, and 104,000 toxicity tests.

  8. 'The surface management system' (SuMS) database: a surface-based database to aid cortical surface reconstruction, visualization and analysis

    NASA Technical Reports Server (NTRS)

    Dickson, J.; Drury, H.; Van Essen, D. C.

    2001-01-01

    Surface reconstructions of the cerebral cortex are increasingly widely used in the analysis and visualization of cortical structure, function and connectivity. From a neuroinformatics perspective, dealing with surface-related data poses a number of challenges. These include the multiplicity of configurations in which surfaces are routinely viewed (e.g. inflated maps, spheres and flat maps), plus the diversity of experimental data that can be represented on any given surface. To address these challenges, we have developed a surface management system (SuMS) that allows automated storage and retrieval of complex surface-related datasets. SuMS provides a systematic framework for the classification, storage and retrieval of many types of surface-related data and associated volume data. Within this classification framework, it serves as a version-control system capable of handling large numbers of surface and volume datasets. With built-in database management system support, SuMS provides rapid search and retrieval capabilities across all the datasets, while also incorporating multiple security levels to regulate access. SuMS is implemented in Java and can be accessed via a Web interface (WebSuMS) or using downloaded client software. Thus, SuMS is well positioned to act as a multiplatform, multi-user 'surface request broker' for the neuroscience community.

  9. Teaching Three-Dimensional Structural Chemistry Using Crystal Structure Databases. 3. The Cambridge Structural Database System: Information Content and Access Software in Educational Applications

    ERIC Educational Resources Information Center

    Battle, Gary M.; Allen, Frank H.; Ferrence, Gregory M.

    2011-01-01

    Parts 1 and 2 of this series described the educational value of experimental three-dimensional (3D) chemical structures determined by X-ray crystallography and retrieved from the crystallographic databases. In part 1, we described the information content of the Cambridge Structural Database (CSD) and discussed a representative teaching subset of…

  10. Data Aggregation System: A system for information retrieval on demand over relational and non-relational distributed data sources

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ball, G.; Kuznetsov, V.; Evans, D.

    We present the Data Aggregation System, a system for information retrieval and aggregation from heterogenous sources of relational and non-relational data for the Compact Muon Solenoid experiment on the CERN Large Hadron Collider. The experiment currently has a number of organically-developed data sources, including front-ends to a number of different relational databases and non-database data services which do not share common data structures or APIs (Application Programming Interfaces), and cannot at this stage be readily converged. DAS provides a single interface for querying all these services, a caching layer to speed up access to expensive underlying calls and the abilitymore » to merge records from different data services pertaining to a single primary key.« less

  11. Huntington's Disease Research Roster Support with a Microcomputer Database Management System

    PubMed Central

    Gersting, J. M.; Conneally, P. M.; Beidelman, K.

    1983-01-01

    This paper chronicles the MEGADATS (Medical Genetics Acquisition and DAta Transfer System) database development effort in collecting, storing, retrieving, and plotting human family pedigrees. The newest system, MEGADATS-3M, is detailed. Emphasis is on the microcomputer version of MEGADATS-3M and its use to support the Huntington's Disease research roster project. Examples of data input and pedigree plotting are included.

  12. A similarity learning approach to content-based image retrieval: application to digital mammography.

    PubMed

    El-Naqa, Issam; Yang, Yongyi; Galatsanos, Nikolas P; Nishikawa, Robert M; Wernick, Miles N

    2004-10-01

    In this paper, we describe an approach to content-based retrieval of medical images from a database, and provide a preliminary demonstration of our approach as applied to retrieval of digital mammograms. Content-based image retrieval (CBIR) refers to the retrieval of images from a database using information derived from the images themselves, rather than solely from accompanying text indices. In the medical-imaging context, the ultimate aim of CBIR is to provide radiologists with a diagnostic aid in the form of a display of relevant past cases, along with proven pathology and other suitable information. CBIR may also be useful as a training tool for medical students and residents. The goal of information retrieval is to recall from a database information that is relevant to the user's query. The most challenging aspect of CBIR is the definition of relevance (similarity), which is used to guide the retrieval machine. In this paper, we pursue a new approach, in which similarity is learned from training examples provided by human observers. Specifically, we explore the use of neural networks and support vector machines to predict the user's notion of similarity. Within this framework we propose using a hierarchal learning approach, which consists of a cascade of a binary classifier and a regression module to optimize retrieval effectiveness and efficiency. We also explore how to incorporate online human interaction to achieve relevance feedback in this learning framework. Our experiments are based on a database consisting of 76 mammograms, all of which contain clustered microcalcifications (MCs). Our goal is to retrieve mammogram images containing similar MC clusters to that in a query. The performance of the retrieval system is evaluated using precision-recall curves computed using a cross-validation procedure. Our experimental results demonstrate that: 1) the learning framework can accurately predict the perceptual similarity reported by human observers, thereby serving as a basis for CBIR; 2) the learning-based framework can significantly outperform a simple distance-based similarity metric; 3) the use of the hierarchical two-stage network can improve retrieval performance; and 4) relevance feedback can be effectively incorporated into this learning framework to achieve improvement in retrieval precision based on online interaction with users; and 5) the retrieved images by the network can have predicting value for the disease condition of the query.

  13. EROS main image file - A picture perfect database for Landsat imagery and aerial photography

    NASA Technical Reports Server (NTRS)

    Jack, R. F.

    1984-01-01

    The Earth Resources Observation System (EROS) Program was established by the U.S. Department of the Interior in 1966 under the administration of the Geological Survey. It is primarily concerned with the application of remote sensing techniques for the management of natural resources. The retrieval system employed to search the EROS database is called INORAC (Inquiry, Ordering, and Accounting). A description is given of the types of images identified in EROS, taking into account Landsat imagery, Skylab images, Gemini/Apollo photography, and NASA aerial photography. Attention is given to retrieval commands, geographic coordinate searching, refinement techniques, various online functions, and questions regarding the access to the EROS Main Image File.

  14. Information Retrieval Strategies of Millennial Undergraduate Students in Web and Library Database Searches

    ERIC Educational Resources Information Center

    Porter, Brandi

    2009-01-01

    Millennial students make up a large portion of undergraduate students attending colleges and universities, and they have a variety of online resources available to them to complete academically related information searches, primarily Web based and library-based online information retrieval systems. The content, ease of use, and required search…

  15. Improving Database Simulations for Bayesian Precipitation Retrieval using Non-Spherical Ice Particles

    NASA Astrophysics Data System (ADS)

    Ringerud, S.; Skofronick Jackson, G.; Kulie, M.; Randel, D.

    2016-12-01

    NASA's Global Precipitation Measurement Mission (GPM) provides a wealth of both active and passive microwave observations aimed at furthering understanding of global precipitation and the hydrologic cycle. Employing a constellation of passive microwave radiometers increases global coverage and sampling, while the core satellite acts as a transfer standard, enabling consistent retrievals across individual constellation members. The transfer standard is applied in the form of a physically based a priori database constructed for use in Bayesian retrieval algorithms for each radiometer. The database is constructed using hydrometeor profiles optimized for the best fit to simultaneous active/passive core satellite measurements via the GPM Combined Algorithm. Initial validation of GPM rainfall products using the combined database suggests high retrieval errors for convective precipitation over land and at high latitudes. In such regimes, the signal from ice scattering observed at the higher microwave frequencies becomes particularly important for detecting and retrieving precipitation. For cross-track sounders such as MHS and SAPHIR, this signal is crucial. It is therefore important that the scattering signals associated with precipitation are accurately represented and modeled in the retrieval database. In the current GPM combined retrieval and constellation databases, ice hydrometeors are represented as "fluffy spheres", with assumed density and scattering parameters calculated using Mie theory. Resulting simulated Tb agree reasonably well at frequencies up to 89 GHz, but show significant biases at higher frequencies. In this work the database is recreated using an ensemble of non-spherical ice particles with single scattering properties calculated using discrete dipole approximation. Simulated Tb agreement is significantly improved across the high frequencies, decreasing biases by an order of magnitude in several of the channels. The new database is applied for a sample of GPM constellation retrievals and the retrieved precipitation rates compared, to demonstrate areas where the use of more complex ice particles will have the greatest effect upon the final retrievals.

  16. Design Considerations for a Web-based Database System of ELISpot Assay in Immunological Research

    PubMed Central

    Ma, Jingming; Mosmann, Tim; Wu, Hulin

    2005-01-01

    The enzyme-linked immunospot (ELISpot) assay has been a primary means in immunological researches (such as HIV-specific T cell response). Due to huge amount of data involved in ELISpot assay testing, the database system is needed for efficient data entry, easy retrieval, secure storage, and convenient data process. Besides, the NIH has recently issued a policy to promote the sharing of research data (see http://grants.nih.gov/grants/policy/data_sharing). The Web-based database system will be definitely benefit to data sharing among broad research communities. Here are some considerations for a database system of ELISpot assay (DBSEA). PMID:16779326

  17. A database paradigm for the management of DICOM-RT structure sets using a geographic information system

    NASA Astrophysics Data System (ADS)

    Shao, Weber; Kupelian, Patrick A.; Wang, Jason; Low, Daniel A.; Ruan, Dan

    2014-03-01

    We devise a paradigm for representing the DICOM-RT structure sets in a database management system, in such way that secondary calculations of geometric information can be performed quickly from the existing contour definitions. The implementation of this paradigm is achieved using the PostgreSQL database system and the PostGIS extension, a geographic information system commonly used for encoding geographical map data. The proposed paradigm eliminates the overhead of retrieving large data records from the database, as well as the need to implement various numerical and data parsing routines, when additional information related to the geometry of the anatomy is desired.

  18. Content-Based Image Retrieval System for Pulmonary Nodules: Assisting Radiologists in Self-Learning and Diagnosis of Lung Cancer.

    PubMed

    Dhara, Ashis Kumar; Mukhopadhyay, Sudipta; Dutta, Anirvan; Garg, Mandeep; Khandelwal, Niranjan

    2017-02-01

    Visual information of similar nodules could assist the budding radiologists in self-learning. This paper presents a content-based image retrieval (CBIR) system for pulmonary nodules, observed in lung CT images. The reported CBIR systems of pulmonary nodules cannot be put into practice as radiologists need to draw the boundary of nodules during query formation and feature database creation. In the proposed retrieval system, the pulmonary nodules are segmented using a semi-automated technique, which requires a seed point on the nodule from the end-user. The involvement of radiologists in feature database creation is also reduced, as only a seed point is expected from radiologists instead of manual delineation of the boundary of the nodules. The performance of the retrieval system depends on the accuracy of the segmentation technique. Several 3D features are explored to improve the performance of the proposed retrieval system. A set of relevant shape and texture features are considered for efficient representation of the nodules in the feature space. The proposed CBIR system is evaluated for three configurations such as configuration-1 (composite rank of malignancy "1","2" as benign and "4","5" as malignant), configuration-2 (composite rank of malignancy "1","2", "3" as benign and "4","5" as malignant), and configuration-3 (composite rank of malignancy "1","2" as benign and "3","4","5" as malignant). Considering top 5 retrieved nodules and Euclidean distance metric, the precision achieved by the proposed method for configuration-1, configuration-2, and configuration-3 are 82.14, 75.91, and 74.27 %, respectively. The performance of the proposed CBIR system is close to the most recent technique, which is dependent on radiologists for manual segmentation of nodules. A computer-aided diagnosis (CAD) system is also developed based on CBIR paradigm. Performance of the proposed CBIR-based CAD system is close to performance of the CAD system using support vector machine.

  19. Quantum Private Queries

    NASA Astrophysics Data System (ADS)

    Giovannetti, Vittorio; Lloyd, Seth; Maccone, Lorenzo

    2008-06-01

    We propose a cheat sensitive quantum protocol to perform a private search on a classical database which is efficient in terms of communication complexity. It allows a user to retrieve an item from the database provider without revealing which item he or she retrieved: if the provider tries to obtain information on the query, the person querying the database can find it out. The protocol ensures also perfect data privacy of the database: the information that the user can retrieve in a single query is bounded and does not depend on the size of the database. With respect to the known (quantum and classical) strategies for private information retrieval, our protocol displays an exponential reduction in communication complexity and in running-time computational complexity.

  20. Forest Vegetation Simulator translocation techniques with the Bureau of Land Management's Forest Vegetation Information system database

    Treesearch

    Timothy A. Bottomley

    2008-01-01

    The BLM uses a database, called the Forest Vegetation Information System (FORVIS), to store, retrieve, and analyze forest resource information on a majority of their forested lands. FORVIS also has the capability of easily transferring appropriate data electronically into Forest Vegetation Simulator (FVS) for simulation runs. Only minor additional data inputs or...

  1. Evaluation of information-theoretic similarity measures for content-based retrieval and detection of masses in mammograms.

    PubMed

    Tourassi, Georgia D; Harrawood, Brian; Singh, Swatee; Lo, Joseph Y; Floyd, Carey E

    2007-01-01

    The purpose of this study was to evaluate image similarity measures employed in an information-theoretic computer-assisted detection (IT-CAD) scheme. The scheme was developed for content-based retrieval and detection of masses in screening mammograms. The study is aimed toward an interactive clinical paradigm where physicians query the proposed IT-CAD scheme on mammographic locations that are either visually suspicious or indicated as suspicious by other cuing CAD systems. The IT-CAD scheme provides an evidence-based, second opinion for query mammographic locations using a knowledge database of mass and normal cases. In this study, eight entropy-based similarity measures were compared with respect to retrieval precision and detection accuracy using a database of 1820 mammographic regions of interest. The IT-CAD scheme was then validated on a separate database for false positive reduction of progressively more challenging visual cues generated by an existing, in-house mass detection system. The study showed that the image similarity measures fall into one of two categories; one category is better suited to the retrieval of semantically similar cases while the second is more effective with knowledge-based decisions regarding the presence of a true mass in the query location. In addition, the IT-CAD scheme yielded a substantial reduction in false-positive detections while maintaining high detection rate for malignant masses.

  2. Database systems for knowledge-based discovery.

    PubMed

    Jagarlapudi, Sarma A R P; Kishan, K V Radha

    2009-01-01

    Several database systems have been developed to provide valuable information from the bench chemist to biologist, medical practitioner to pharmaceutical scientist in a structured format. The advent of information technology and computational power enhanced the ability to access large volumes of data in the form of a database where one could do compilation, searching, archiving, analysis, and finally knowledge derivation. Although, data are of variable types the tools used for database creation, searching and retrieval are similar. GVK BIO has been developing databases from publicly available scientific literature in specific areas like medicinal chemistry, clinical research, and mechanism-based toxicity so that the structured databases containing vast data could be used in several areas of research. These databases were classified as reference centric or compound centric depending on the way the database systems were designed. Integration of these databases with knowledge derivation tools would enhance the value of these systems toward better drug design and discovery.

  3. Enhancing navigation in biomedical databases by community voting and database-driven text classification

    PubMed Central

    Duchrow, Timo; Shtatland, Timur; Guettler, Daniel; Pivovarov, Misha; Kramer, Stefan; Weissleder, Ralph

    2009-01-01

    Background The breadth of biological databases and their information content continues to increase exponentially. Unfortunately, our ability to query such sources is still often suboptimal. Here, we introduce and apply community voting, database-driven text classification, and visual aids as a means to incorporate distributed expert knowledge, to automatically classify database entries and to efficiently retrieve them. Results Using a previously developed peptide database as an example, we compared several machine learning algorithms in their ability to classify abstracts of published literature results into categories relevant to peptide research, such as related or not related to cancer, angiogenesis, molecular imaging, etc. Ensembles of bagged decision trees met the requirements of our application best. No other algorithm consistently performed better in comparative testing. Moreover, we show that the algorithm produces meaningful class probability estimates, which can be used to visualize the confidence of automatic classification during the retrieval process. To allow viewing long lists of search results enriched by automatic classifications, we added a dynamic heat map to the web interface. We take advantage of community knowledge by enabling users to cast votes in Web 2.0 style in order to correct automated classification errors, which triggers reclassification of all entries. We used a novel framework in which the database "drives" the entire vote aggregation and reclassification process to increase speed while conserving computational resources and keeping the method scalable. In our experiments, we simulate community voting by adding various levels of noise to nearly perfectly labelled instances, and show that, under such conditions, classification can be improved significantly. Conclusion Using PepBank as a model database, we show how to build a classification-aided retrieval system that gathers training data from the community, is completely controlled by the database, scales well with concurrent change events, and can be adapted to add text classification capability to other biomedical databases. The system can be accessed at . PMID:19799796

  4. Population groups: indexing, coverage, and retrieval effectiveness of ethnically related health care issues in health sciences databases.

    PubMed Central

    Efthimiadis, E N; Afifi, M

    1996-01-01

    OBJECTIVES: This study examined methods of accessing (for indexing and retrieval purposes) medical research on population groups in the major abstracting and indexing services of the health sciences literature. DESIGN: The study of diseases in specific population groups is facilitated by the indexing of both diseases and populations in a database. The MEDLINE, PsycINFO, and Embase databases were selected for the study. The published thesauri for these databases were examined to establish the vocabulary in use. Indexing terms were identified and examined as to their representation in the current literature. Terms were clustered further into groups thought to reflect an end user's perspective and to facilitate subsequent analysis. The medical literature contained in the three online databases was searched with both controlled vocabulary and natural language terms. RESULTS: The three thesauri revealed shallow pre-coordinated hierarchical structures, rather difficult-to-use terms for post-coordination, and a blurring of cultural, genetic, and racial facets of populations. Post-coordination is difficult because of the system-oriented terminology, which is intended mostly for information professionals. The terminology unintentionally restricts access by the end users who lack the knowledge needed to use the thesauri effectively for information retrieval. CONCLUSIONS: Population groups are not represented adequately in the index languages of health sciences databases. Users of these databases need to be alerted to the difficulties that may be encountered in searching for information on population groups. Information and health professionals may not be able to access the literature if they are not familiar with the indexing policies on population groups. Consequently, the study points to a problem that needs to be addressed, through either the redesign of existing systems or the design of new ones to meet the goals of Healthy People 2000 and beyond. PMID:8883987

  5. Extracting semantics from audio-visual content: the final frontier in multimedia retrieval.

    PubMed

    Naphade, M R; Huang, T S

    2002-01-01

    Multimedia understanding is a fast emerging interdisciplinary research area. There is tremendous potential for effective use of multimedia content through intelligent analysis. Diverse application areas are increasingly relying on multimedia understanding systems. Advances in multimedia understanding are related directly to advances in signal processing, computer vision, pattern recognition, multimedia databases, and smart sensors. We review the state-of-the-art techniques in multimedia retrieval. In particular, we discuss how multimedia retrieval can be viewed as a pattern recognition problem. We discuss how reliance on powerful pattern recognition and machine learning techniques is increasing in the field of multimedia retrieval. We review the state-of-the-art multimedia understanding systems with particular emphasis on a system for semantic video indexing centered around multijects and multinets. We discuss how semantic retrieval is centered around concepts and context and the various mechanisms for modeling concepts and context.

  6. Music information retrieval in compressed audio files: a survey

    NASA Astrophysics Data System (ADS)

    Zampoglou, Markos; Malamos, Athanasios G.

    2014-07-01

    In this paper, we present an organized survey of the existing literature on music information retrieval systems in which descriptor features are extracted directly from the compressed audio files, without prior decompression to pulse-code modulation format. Avoiding the decompression step and utilizing the readily available compressed-domain information can significantly lighten the computational cost of a music information retrieval system, allowing application to large-scale music databases. We identify a number of systems relying on compressed-domain information and form a systematic classification of the features they extract, the retrieval tasks they tackle and the degree in which they achieve an actual increase in the overall speed-as well as any resulting loss in accuracy. Finally, we discuss recent developments in the field, and the potential research directions they open toward ultra-fast, scalable systems.

  7. Elsevier’s approach to the bioCADDIE 2016 Dataset Retrieval Challenge

    PubMed Central

    Scerri, Antony; Kuriakose, John; Deshmane, Amit Ajit; Stanger, Mark; Moore, Rebekah; Naik, Raj; de Waard, Anita

    2017-01-01

    Abstract We developed a two-stream, Apache Solr-based information retrieval system in response to the bioCADDIE 2016 Dataset Retrieval Challenge. One stream was based on the principle of word embeddings, the other was rooted in ontology based indexing. Despite encountering several issues in the data, the evaluation procedure and the technologies used, the system performed quite well. We provide some pointers towards future work: in particular, we suggest that more work in query expansion could benefit future biomedical search engines. Database URL: https://data.mendeley.com/datasets/zd9dxpyybg/1 PMID:29220454

  8. Monitoring SLAC High Performance UNIX Computing Systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lettsome, Annette K.; /Bethune-Cookman Coll. /SLAC

    2005-12-15

    Knowledge of the effectiveness and efficiency of computers is important when working with high performance systems. The monitoring of such systems is advantageous in order to foresee possible misfortunes or system failures. Ganglia is a software system designed for high performance computing systems to retrieve specific monitoring information. An alternative storage facility for Ganglia's collected data is needed since its default storage system, the round-robin database (RRD), struggles with data integrity. The creation of a script-driven MySQL database solves this dilemma. This paper describes the process took in the creation and implementation of the MySQL database for use by Ganglia.more » Comparisons between data storage by both databases are made using gnuplot and Ganglia's real-time graphical user interface.« less

  9. Development of a database for the verification of trans-ionospheric remote sensing systems

    NASA Astrophysics Data System (ADS)

    Leitinger, R.

    2005-08-01

    Remote sensing systems need verification by means of in-situ data or by means of model data. In the case of ionospheric occultation inversion, ionosphere tomography and other imaging methods on the basis of satellite-to-ground or satellite-to-satellite electron content, the availability of in-situ data with adequate spatial and temporal co-location is a very rare case, indeed. Therefore the method of choice for verification is to produce artificial electron content data with realistic properties, subject these data to the inversion/retrieval method, compare the results with model data and apply a suitable type of “goodness of fit” classification. Inter-comparison of inversion/retrieval methods should be done with sets of artificial electron contents in a “blind” (or even “double blind”) way. The set up of a relevant database for the COST 271 Action is described. One part of the database will be made available to everyone interested in testing of inversion/retrieval methods. The artificial electron content data are calculated by means of large-scale models that are “modulated” in a realistic way to include smaller scale and dynamic structures, like troughs and traveling ionospheric disturbances.

  10. Multiple Object Retrieval in Image Databases Using Hierarchical Segmentation Tree

    ERIC Educational Resources Information Center

    Chen, Wei-Bang

    2012-01-01

    The purpose of this research is to develop a new visual information analysis, representation, and retrieval framework for automatic discovery of salient objects of user's interest in large-scale image databases. In particular, this dissertation describes a content-based image retrieval framework which supports multiple-object retrieval. The…

  11. Data augmentation-assisted deep learning of hand-drawn partially colored sketches for visual search

    PubMed Central

    Muhammad, Khan; Baik, Sung Wook

    2017-01-01

    In recent years, image databases are growing at exponential rates, making their management, indexing, and retrieval, very challenging. Typical image retrieval systems rely on sample images as queries. However, in the absence of sample query images, hand-drawn sketches are also used. The recent adoption of touch screen input devices makes it very convenient to quickly draw shaded sketches of objects to be used for querying image databases. This paper presents a mechanism to provide access to visual information based on users’ hand-drawn partially colored sketches using touch screen devices. A key challenge for sketch-based image retrieval systems is to cope with the inherent ambiguity in sketches due to the lack of colors, textures, shading, and drawing imperfections. To cope with these issues, we propose to fine-tune a deep convolutional neural network (CNN) using augmented dataset to extract features from partially colored hand-drawn sketches for query specification in a sketch-based image retrieval framework. The large augmented dataset contains natural images, edge maps, hand-drawn sketches, de-colorized, and de-texturized images which allow CNN to effectively model visual contents presented to it in a variety of forms. The deep features extracted from CNN allow retrieval of images using both sketches and full color images as queries. We also evaluated the role of partial coloring or shading in sketches to improve the retrieval performance. The proposed method is tested on two large datasets for sketch recognition and sketch-based image retrieval and achieved better classification and retrieval performance than many existing methods. PMID:28859140

  12. Hydrology of lakes in the Minneapolis-Saint Paul Metropolitan Area: A summary of available dat

    USGS Publications Warehouse

    McBride, Mark S.

    1976-01-01

    SYSTEM 2000, a generalized computer data-base management system, was used to organize the data and prepare the tables. SYSTEM 2000 provides powerful capabilities for future retrieval and analyses of the data. The data base is available to potential users so that questions not implicitly anticipated in the preparation of the published tables can be answered readily, and the user can retrieve data in tabular or other forms to meet his particular needs.

  13. Intelligent web image retrieval system

    NASA Astrophysics Data System (ADS)

    Hong, Sungyong; Lee, Chungwoo; Nah, Yunmook

    2001-07-01

    Recently, the web sites such as e-business sites and shopping mall sites deal with lots of image information. To find a specific image from these image sources, we usually use web search engines or image database engines which rely on keyword only retrievals or color based retrievals with limited search capabilities. This paper presents an intelligent web image retrieval system. We propose the system architecture, the texture and color based image classification and indexing techniques, and representation schemes of user usage patterns. The query can be given by providing keywords, by selecting one or more sample texture patterns, by assigning color values within positional color blocks, or by combining some or all of these factors. The system keeps track of user's preferences by generating user query logs and automatically add more search information to subsequent user queries. To show the usefulness of the proposed system, some experimental results showing recall and precision are also explained.

  14. Concept-oriented indexing of video databases: toward semantic sensitive retrieval and browsing.

    PubMed

    Fan, Jianping; Luo, Hangzai; Elmagarmid, Ahmed K

    2004-07-01

    Digital video now plays an important role in medical education, health care, telemedicine and other medical applications. Several content-based video retrieval (CBVR) systems have been proposed in the past, but they still suffer from the following challenging problems: semantic gap, semantic video concept modeling, semantic video classification, and concept-oriented video database indexing and access. In this paper, we propose a novel framework to make some advances toward the final goal to solve these problems. Specifically, the framework includes: 1) a semantic-sensitive video content representation framework by using principal video shots to enhance the quality of features; 2) semantic video concept interpretation by using flexible mixture model to bridge the semantic gap; 3) a novel semantic video-classifier training framework by integrating feature selection, parameter estimation, and model selection seamlessly in a single algorithm; and 4) a concept-oriented video database organization technique through a certain domain-dependent concept hierarchy to enable semantic-sensitive video retrieval and browsing.

  15. Managing biomedical image metadata for search and retrieval of similar images.

    PubMed

    Korenblum, Daniel; Rubin, Daniel; Napel, Sandy; Rodriguez, Cesar; Beaulieu, Chris

    2011-08-01

    Radiology images are generally disconnected from the metadata describing their contents, such as imaging observations ("semantic" metadata), which are usually described in text reports that are not directly linked to the images. We developed a system, the Biomedical Image Metadata Manager (BIMM) to (1) address the problem of managing biomedical image metadata and (2) facilitate the retrieval of similar images using semantic feature metadata. Our approach allows radiologists, researchers, and students to take advantage of the vast and growing repositories of medical image data by explicitly linking images to their associated metadata in a relational database that is globally accessible through a Web application. BIMM receives input in the form of standard-based metadata files using Web service and parses and stores the metadata in a relational database allowing efficient data query and maintenance capabilities. Upon querying BIMM for images, 2D regions of interest (ROIs) stored as metadata are automatically rendered onto preview images included in search results. The system's "match observations" function retrieves images with similar ROIs based on specific semantic features describing imaging observation characteristics (IOCs). We demonstrate that the system, using IOCs alone, can accurately retrieve images with diagnoses matching the query images, and we evaluate its performance on a set of annotated liver lesion images. BIMM has several potential applications, e.g., computer-aided detection and diagnosis, content-based image retrieval, automating medical analysis protocols, and gathering population statistics like disease prevalences. The system provides a framework for decision support systems, potentially improving their diagnostic accuracy and selection of appropriate therapies.

  16. Starbase Data Tables: An ASCII Relational Database for Unix

    NASA Astrophysics Data System (ADS)

    Roll, John

    2011-11-01

    Database management is an increasingly important part of astronomical data analysis. Astronomers need easy and convenient ways of storing, editing, filtering, and retrieving data about data. Commercial databases do not provide good solutions for many of the everyday and informal types of database access astronomers need. The Starbase database system with simple data file formatting rules and command line data operators has been created to answer this need. The system includes a complete set of relational and set operators, fast search/index and sorting operators, and many formatting and I/O operators. Special features are included to enhance the usefulness of the database when manipulating astronomical data. The software runs under UNIX, MSDOS and IRAF.

  17. A novel content-based medical image retrieval method based on query topic dependent image features (QTDIF)

    NASA Astrophysics Data System (ADS)

    Xiong, Wei; Qiu, Bo; Tian, Qi; Mueller, Henning; Xu, Changsheng

    2005-04-01

    Medical image retrieval is still mainly a research domain with a large variety of applications and techniques. With the ImageCLEF 2004 benchmark, an evaluation framework has been created that includes a database, query topics and ground truth data. Eleven systems (with a total of more than 50 runs) compared their performance in various configurations. The results show that there is not any one feature that performs well on all query tasks. Key to successful retrieval is rather the selection of features and feature weights based on a specific set of input features, thus on the query task. In this paper we propose a novel method based on query topic dependent image features (QTDIF) for content-based medical image retrieval. These feature sets are designed to capture both inter-category and intra-category statistical variations to achieve good retrieval performance in terms of recall and precision. We have used Gaussian Mixture Models (GMM) and blob representation to model medical images and construct the proposed novel QTDIF for CBIR. Finally, trained multi-class support vector machines (SVM) are used for image similarity ranking. The proposed methods have been tested over the Casimage database with around 9000 images, for the given 26 image topics, used for imageCLEF 2004. The retrieval performance has been compared with the medGIFT system, which is based on the GNU Image Finding Tool (GIFT). The experimental results show that the proposed QTDIF-based CBIR can provide significantly better performance than systems based general features only.

  18. Representation and alignment of sung queries for music information retrieval

    NASA Astrophysics Data System (ADS)

    Adams, Norman H.; Wakefield, Gregory H.

    2005-09-01

    The pursuit of robust and rapid query-by-humming systems, which search melodic databases using sung queries, is a common theme in music information retrieval. The retrieval aspect of this database problem has received considerable attention, whereas the front-end processing of sung queries and the data structure to represent melodies has been based on musical intuition and historical momentum. The present work explores three time series representations for sung queries: a sequence of notes, a ``smooth'' pitch contour, and a sequence of pitch histograms. The performance of the three representations is compared using a collection of naturally sung queries. It is found that the most robust performance is achieved by the representation with highest dimension, the smooth pitch contour, but that this representation presents a formidable computational burden. For all three representations, it is necessary to align the query and target in order to achieve robust performance. The computational cost of the alignment is quadratic, hence it is necessary to keep the dimension small for rapid retrieval. Accordingly, iterative deepening is employed to achieve both robust performance and rapid retrieval. Finally, the conventional iterative framework is expanded to adapt the alignment constraints based on previous iterations, further expediting retrieval without degrading performance.

  19. A structured vocabulary for indexing dietary supplements in databases in the United States

    PubMed Central

    Saldanha, Leila G; Dwyer, Johanna T; Holden, Joanne M; Ireland, Jayne D.; Andrews, Karen W; Bailey, Regan L; Gahche, Jaime J.; Hardy, Constance J; Møller, Anders; Pilch, Susan M.; Roseland, Janet M

    2011-01-01

    Food composition databases are critical to assess and plan dietary intakes. Dietary supplement databases are also needed because dietary supplements make significant contributions to total nutrient intakes. However, no uniform system exists for classifying dietary supplement products and indexing their ingredients in such databases. Differing approaches to classifying these products make it difficult to retrieve or link information effectively. A consistent approach to classifying information within food composition databases led to the development of LanguaL™, a structured vocabulary. LanguaL™ is being adapted as an interface tool for classifying and retrieving product information in dietary supplement databases. This paper outlines proposed changes to the LanguaL™ thesaurus for indexing dietary supplement products and ingredients in databases. The choice of 12 of the original 14 LanguaL™ facets pertinent to dietary supplements, modifications to their scopes, and applications are described. The 12 chosen facets are: Product Type; Source; Part of Source; Physical State, Shape or Form; Ingredients; Preservation Method, Packing Medium, Container or Wrapping; Contact Surface; Consumer Group/Dietary Use/Label Claim; Geographic Places and Regions; and Adjunct Characteristics of food. PMID:22611303

  20. Document image database indexing with pictorial dictionary

    NASA Astrophysics Data System (ADS)

    Akbari, Mohammad; Azimi, Reza

    2010-02-01

    In this paper we introduce a new approach for information retrieval from Persian document image database without using Optical Character Recognition (OCR).At first an attribute called subword upper contour label is defined then, a pictorial dictionary is constructed based on this attribute for the subwords. By this approach we address two issues in document image retrieval: keyword spotting and retrieval according to the document similarities. The proposed methods have been evaluated on a Persian document image database. The results have proved the ability of this approach in document image information retrieval.

  1. Enabling search over encrypted multimedia databases

    NASA Astrophysics Data System (ADS)

    Lu, Wenjun; Swaminathan, Ashwin; Varna, Avinash L.; Wu, Min

    2009-02-01

    Performing information retrieval tasks while preserving data confidentiality is a desirable capability when a database is stored on a server maintained by a third-party service provider. This paper addresses the problem of enabling content-based retrieval over encrypted multimedia databases. Search indexes, along with multimedia documents, are first encrypted by the content owner and then stored onto the server. Through jointly applying cryptographic techniques, such as order preserving encryption and randomized hash functions, with image processing and information retrieval techniques, secure indexing schemes are designed to provide both privacy protection and rank-ordered search capability. Retrieval results on an encrypted color image database and security analysis of the secure indexing schemes under different attack models show that data confidentiality can be preserved while retaining very good retrieval performance. This work has promising applications in secure multimedia management.

  2. Retrieving handwriting by combining word spotting and manifold ranking

    NASA Astrophysics Data System (ADS)

    Peña Saldarriaga, Sebastián; Morin, Emmanuel; Viard-Gaudin, Christian

    2012-01-01

    Online handwritten data, produced with Tablet PCs or digital pens, consists in a sequence of points (x, y). As the amount of data available in this form increases, algorithms for retrieval of online data are needed. Word spotting is a common approach used for the retrieval of handwriting. However, from an information retrieval (IR) perspective, word spotting is a primitive keyword based matching and retrieval strategy. We propose a framework for handwriting retrieval where an arbitrary word spotting method is used, and then a manifold ranking algorithm is applied on the initial retrieval scores. Experimental results on a database of more than 2,000 handwritten newswires show that our method can improve the performances of a state-of-the-art word spotting system by more than 10%.

  3. An enhanced VIIRS aerosol optical thickness (AOT) retrieval algorithm over land using a global surface reflectance ratio database

    NASA Astrophysics Data System (ADS)

    Zhang, Hai; Kondragunta, Shobha; Laszlo, Istvan; Liu, Hongqing; Remer, Lorraine A.; Huang, Jingfeng; Superczynski, Stephen; Ciren, Pubu

    2016-09-01

    The Visible/Infrared Imager Radiometer Suite (VIIRS) on board the Suomi National Polar-orbiting Partnership (S-NPP) satellite has been retrieving aerosol optical thickness (AOT), operationally and globally, over ocean and land since shortly after S-NPP launch in 2011. However, the current operational VIIRS AOT retrieval algorithm over land has two limitations in its assumptions for land surfaces: (1) it only retrieves AOT over the dark surfaces and (2) it assumes that the global surface reflectance ratios between VIIRS bands are constants. In this work, we develop a surface reflectance ratio database over land with a spatial resolution 0.1° × 0.1° using 2 years of VIIRS top of atmosphere reflectances. We enhance the current operational VIIRS AOT retrieval algorithm by applying the surface reflectance ratio database in the algorithm. The enhanced algorithm is able to retrieve AOT over both dark and bright surfaces. Over bright surfaces, the VIIRS AOT retrievals from the enhanced algorithm have a correlation of 0.79, mean bias of -0.008, and standard deviation (STD) of error of 0.139 when compared against the ground-based observations at the global AERONET (Aerosol Robotic Network) sites. Over dark surfaces, the VIIRS AOT retrievals using the surface reflectance ratio database improve the root-mean-square error from 0.150 to 0.123. The use of the surface reflectance ratio database also increases the data coverage of more than 20% over dark surfaces. The AOT retrievals over bright surfaces are comparable to MODIS Deep Blue AOT retrievals.

  4. GenBank

    PubMed Central

    Benson, Dennis A.; Karsch-Mizrachi, Ilene; Lipman, David J.; Ostell, James; Wheeler, David L.

    2007-01-01

    GenBank (R) is a comprehensive database that contains publicly available nucleotide sequences for more than 240 000 named organisms, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects. Most submissions are made using the web-based BankIt or standalone Sequin programs and accession numbers are assigned by GenBank staff upon receipt. Daily data exchange with the EMBL Data Library in Europe and the DNA Data Bank of Japan ensures worldwide coverage. GenBank is accessible through NCBI's retrieval system, Entrez, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, begin at the NCBI Homepage (). PMID:17202161

  5. Using an image-extended relational database to support content-based image retrieval in a PACS.

    PubMed

    Traina, Caetano; Traina, Agma J M; Araújo, Myrian R B; Bueno, Josiane M; Chino, Fabio J T; Razente, Humberto; Azevedo-Marques, Paulo M

    2005-12-01

    This paper presents a new Picture Archiving and Communication System (PACS), called cbPACS, which has content-based image retrieval capabilities. The cbPACS answers range and k-nearest- neighbor similarity queries, employing a relational database manager extended to support images. The images are compared through their features, which are extracted by an image-processing module and stored in the extended relational database. The database extensions were developed aiming at efficiently answering similarity queries by taking advantage of specialized indexing methods. The main concept supporting the extensions is the definition, inside the relational manager, of distance functions based on features extracted from the images. An extension to the SQL language enables the construction of an interpreter that intercepts the extended commands and translates them to standard SQL, allowing any relational database server to be used. By now, the system implemented works on features based on color distribution of the images through normalized histograms as well as metric histograms. Metric histograms are invariant regarding scale, translation and rotation of images and also to brightness transformations. The cbPACS is prepared to integrate new image features, based on texture and shape of the main objects in the image.

  6. Database technology and the management of multimedia data in the Mirror project

    NASA Astrophysics Data System (ADS)

    de Vries, Arjen P.; Blanken, H. M.

    1998-10-01

    Multimedia digital libraries require an open distributed architecture instead of a monolithic database system. In the Mirror project, we use the Monet extensible database kernel to manage different representation of multimedia objects. To maintain independence between content, meta-data, and the creation of meta-data, we allow distribution of data and operations using CORBA. This open architecture introduces new problems for data access. From an end user's perspective, the problem is how to search the available representations to fulfill an actual information need; the conceptual gap between human perceptual processes and the meta-data is too large. From a system's perspective, several representations of the data may semantically overlap or be irrelevant. We address these problems with an iterative query process and active user participating through relevance feedback. A retrieval model based on inference networks assists the user with query formulation. The integration of this model into the database design has two advantages. First, the user can query both the logical and the content structure of multimedia objects. Second, the use of different data models in the logical and the physical database design provides data independence and allows algebraic query optimization. We illustrate query processing with a music retrieval application.

  7. Designing an Integrated System of Databases: A Workstation for Information Seekers.

    ERIC Educational Resources Information Center

    Micco, Mary; Smith, Irma

    1987-01-01

    Proposes a framework for the design of a full function workstation for information retrieval based on study of information seeking behavior. A large amount of local storage of the CD-ROM jukebox variety and full networking capability to both local and external databases are identified as requirements of the prototype. (MES)

  8. Learning about and Practice of Designing Local Data Bases as an Harmonizing Factor.

    ERIC Educational Resources Information Center

    Neelameghan, A.

    This paper provides information workers with some practical approaches to the design, development, and use of local databases that form components of information storage and retrieval systems (ISR) and of automated library operations. Topics discussed include: (1) course objectives for the design and development of local databases for library and…

  9. Sentence-Based Metadata: An Approach and Tool for Viewing Database Designs.

    ERIC Educational Resources Information Center

    Boyle, John M.; Gunge, Jakob; Bryden, John; Librowski, Kaz; Hanna, Hsin-Yi

    2002-01-01

    Describes MARS (Museum Archive Retrieval System), a research tool which enables organizations to exchange digital images and documents by means of a common thesaurus structure, and merge the descriptive data and metadata of their collections. Highlights include theoretical basis; searching the MARS database; and examples in European museums.…

  10. A probabilistic NF2 relational algebra for integrated information retrieval and database systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fuhr, N.; Roelleke, T.

    The integration of information retrieval (IR) and database systems requires a data model which allows for modelling documents as entities, representing uncertainty and vagueness and performing uncertain inference. For this purpose, we present a probabilistic data model based on relations in non-first-normal-form (NF2). Here, tuples are assigned probabilistic weights giving the probability that a tuple belongs to a relation. Thus, the set of weighted index terms of a document are represented as a probabilistic subrelation. In a similar way, imprecise attribute values are modelled as a set-valued attribute. We redefine the relational operators for this type of relations such thatmore » the result of each operator is again a probabilistic NF2 relation, where the weight of a tuple gives the probability that this tuple belongs to the result. By ordering the tuples according to decreasing probabilities, the model yields a ranking of answers like in most IR models. This effect also can be used for typical database queries involving imprecise attribute values as well as for combinations of database and IR queries.« less

  11. Developing an A Priori Database for Passive Microwave Snow Water Retrievals Over Ocean

    NASA Astrophysics Data System (ADS)

    Yin, Mengtao; Liu, Guosheng

    2017-12-01

    A physically optimized a priori database is developed for Global Precipitation Measurement Microwave Imager (GMI) snow water retrievals over ocean. The initial snow water content profiles are derived from CloudSat Cloud Profiling Radar (CPR) measurements. A radiative transfer model in which the single-scattering properties of nonspherical snowflakes are based on the discrete dipole approximate results is employed to simulate brightness temperatures and their gradients. Snow water content profiles are then optimized through a one-dimensional variational (1D-Var) method. The standard deviations of the difference between observed and simulated brightness temperatures are in a similar magnitude to the observation errors defined for observation error covariance matrix after the 1D-Var optimization, indicating that this variational method is successful. This optimized database is applied in a Bayesian retrieval snow water algorithm. The retrieval results indicated that the 1D-Var approach has a positive impact on the GMI retrieved snow water content profiles by improving the physical consistency between snow water content profiles and observed brightness temperatures. Global distribution of snow water contents retrieved from the a priori database is compared with CloudSat CPR estimates. Results showed that the two estimates have a similar pattern of global distribution, and the difference of their global means is small. In addition, we investigate the impact of using physical parameters to subset the database on snow water retrievals. It is shown that using total precipitable water to subset the database with 1D-Var optimization is beneficial for snow water retrievals.

  12. GenBank.

    PubMed

    Benson, Dennis A; Karsch-Mizrachi, Ilene; Lipman, David J; Ostell, James; Wheeler, David L

    2008-01-01

    GenBank (R) is a comprehensive database that contains publicly available nucleotide sequences for more than 260 000 named organisms, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects. Most submissions are made using the web-based BankIt or standalone Sequin programs and accession numbers are assigned by GenBank staff upon receipt. Daily data exchange with the European Molecular Biology Laboratory Nucleotide Sequence Database in Europe and the DNA Data Bank of Japan ensures worldwide coverage. GenBank is accessible through NCBI's retrieval system, Entrez, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, begin at the NCBI Homepage: www.ncbi.nlm.nih.gov.

  13. GenBank

    PubMed Central

    Benson, Dennis A.; Karsch-Mizrachi, Ilene; Lipman, David J.; Ostell, James; Wheeler, David L.

    2008-01-01

    GenBank (R) is a comprehensive database that contains publicly available nucleotide sequences for more than 260 000 named organisms, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects. Most submissions are made using the web-based BankIt or standalone Sequin programs and accession numbers are assigned by GenBank staff upon receipt. Daily data exchange with the European Molecular Biology Laboratory Nucleotide Sequence Database in Europe and the DNA Data Bank of Japan ensures worldwide coverage. GenBank is accessible through NCBI's retrieval system, Entrez, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, begin at the NCBI Homepage: www.ncbi.nlm.nih.gov PMID:18073190

  14. Ray Modeling Methods for Range Dependent Ocean Environments

    DTIC Science & Technology

    1983-12-01

    the eikonal equation, gives rise to equations for ray paths which are perpendicular to the wave fronts. Equation II.4, the transport equation, leads... databases for use by MEDUSA. The author has assisted in the installation of MEDUSA at computer facilities which possess databases containing archives of...sound velocity profiles, bathymetry, and bottom loss data. At each computer site, programs convert the archival data retrieved by the database system

  15. Content-based TV sports video retrieval using multimodal analysis

    NASA Astrophysics Data System (ADS)

    Yu, Yiqing; Liu, Huayong; Wang, Hongbin; Zhou, Dongru

    2003-09-01

    In this paper, we propose content-based video retrieval, which is a kind of retrieval by its semantical contents. Because video data is composed of multimodal information streams such as video, auditory and textual streams, we describe a strategy of using multimodal analysis for automatic parsing sports video. The paper first defines the basic structure of sports video database system, and then introduces a new approach that integrates visual stream analysis, speech recognition, speech signal processing and text extraction to realize video retrieval. The experimental results for TV sports video of football games indicate that the multimodal analysis is effective for video retrieval by quickly browsing tree-like video clips or inputting keywords within predefined domain.

  16. Environmental Factor(tm) system: RCRA hazardous waste handler information (on cd-rom). Database

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    NONE

    1996-04-01

    Environmental Factor(tm) RCRA Hazardous Waste Handler Information on CD-ROM unleashes the invaluable information found in two key EPA data sources on hazardous waste handlers and offers cradle-to-grave waste tracking. It`s easy to search and display: (1) Permit status, design capacity and compliance history for facilities found in the EPA Resource Conservation and Recovery Information System (RCRIS) program tracking database; (2) Detailed information on hazardous wastes generation, management and minimization by companies who are large quantity generators, and (3) Data on the waste management practices of treatment, storage and disposal (TSD) facilities from the EPA Biennial Reporting System which is collectedmore » every other year. Environmental Factor`s powerful database retrieval system lets you: (1) Search for RCRA facilities by permit type, SIC code, waste codes, corrective action or violation information, TSD status, generator and transporter status and more; (2) View compliance information - dates of evaluation, violation, enforcement and corrective action; (3) Lookup facilities by waste processing categories of marketing, transporting, processing and energy recovery; (4) Use owner/operator information and names, titles and telephone numbers of project managers for prospecting; and (5) Browse detailed data on TSD facility and large quantity generators` activities such as onsite waste treatment, disposal, or recycling, offsite waste received, and waste generation and management. The product contains databases, search and retrieval software on two CD-ROMs, an installation diskette and User`s Guide. Environmental Factor has online context-sensitive help from any screen and a printed User`s Guide describing installation and step-by-step procedures for searching, retrieving and exporting. Hotline support is also available for no additional charge.« less

  17. A probabilistic approach to information retrieval in heterogeneous databases

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chatterjee, A.; Segev, A.

    During the post decade, organizations have increased their scope and operations beyond their traditional geographic boundaries. At the same time, they have adopted heterogeneous and incompatible information systems independent of each other without a careful consideration that one day they may need to be integrated. As a result of this diversity, many important business applications today require access to data stored in multiple autonomous databases. This paper examines a problem of inter-database information retrieval in a heterogeneous environment, where conventional techniques are no longer efficient. To solve the problem, broader definitions for join, union, intersection and selection operators are proposed.more » Also, a probabilistic method to specify the selectivity of these operators is discussed. An algorithm to compute these probabilities is provided in pseudocode.« less

  18. Collection Fusion Using Bayesian Estimation of a Linear Regression Model in Image Databases on the Web.

    ERIC Educational Resources Information Center

    Kim, Deok-Hwan; Chung, Chin-Wan

    2003-01-01

    Discusses the collection fusion problem of image databases, concerned with retrieving relevant images by content based retrieval from image databases distributed on the Web. Focuses on a metaserver which selects image databases supporting similarity measures and proposes a new algorithm which exploits a probabilistic technique using Bayesian…

  19. Video PATSEARCH: A Mixed-Media System.

    ERIC Educational Resources Information Center

    Schulman, Jacque-Lynne

    1982-01-01

    Describes a videodisc-based information display system in which a computer terminal is used to search the online PATSEARCH database from a remote host with local microcomputer control to select and display drawings from the retrieved records. System features and system components are discussed and criteria for system evaluation are presented.…

  20. Where's My Data - WMD

    NASA Technical Reports Server (NTRS)

    Quach, William L.; Sesplaukis, Tadas; Owen-Mankovich, Kyran J.; Nakamura, Lori L.

    2012-01-01

    WMD provides a centralized interface to access data stored in the Mission Data Processing and Control System (MPCS) GDS (Ground Data Systems) databases during MSL (Mars Science Laboratory) Testbeds and ATLO (Assembly, Test, and Launch Operations) test sessions. The MSL project organizes its data based on venue (Testbed, ATLO, Ops), with each venue's data stored on a separate database, making it cumbersome for users to access data across the various venues. WMD allows sessions to be retrieved through a Web-based search using several criteria: host name, session start date, or session ID number. Sessions matching the search criteria will be displayed and users can then select a session to obtain and analyze the associated data. The uniqueness of this software comes from its collection of data retrieval and analysis features provided through a single interface. This allows users to obtain their data and perform the necessary analysis without having to worry about where and how to get the data, which may be stored in various locations. Additionally, this software is a Web application that only requires a standard browser without additional plug-ins, providing a cross-platform, lightweight solution for users to retrieve and analyze their data. This software solves the problem of efficiently and easily finding and retrieving data from thousands of MSL Testbed and ATLO sessions. WMD allows the user to retrieve their session in as little as one mouse click, and then to quickly retrieve additional data associated with the session.

  1. Performance Evaluation of a Database System in a Multiple Backend Configurations,

    DTIC Science & Technology

    1984-10-01

    leaving a systemn process , the * internal performance measuremnents of MMSD have been carried out. Mathodo lo.- gies for constructing test databases...access d i rectory data via the AT, EDIT, and CDT. In designing the test database, one of the key concepts is the choice of the directory attributes in...internal timing. These requests are selected since they retrieve the seIaI lest portion of the test database and the processing time for each request is

  2. Searching for evidence or approval? A commentary on database search in systematic reviews and alternative information retrieval methodologies.

    PubMed

    Delaney, Aogán; Tamás, Peter A

    2018-03-01

    Despite recognition that database search alone is inadequate even within the health sciences, it appears that reviewers in fields that have adopted systematic review are choosing to rely primarily, or only, on database search for information retrieval. This commentary reminds readers of factors that call into question the appropriateness of default reliance on database searches particularly as systematic review is adapted for use in new and lower consensus fields. It then discusses alternative methods for information retrieval that require development, formalisation, and evaluation. Our goals are to encourage reviewers to reflect critically and transparently on their choice of information retrieval methods and to encourage investment in research on alternatives. Copyright © 2017 John Wiley & Sons, Ltd.

  3. Database resources of the National Center for Biotechnology Information

    PubMed Central

    2015-01-01

    The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank® nucleic acid sequence database and the PubMed database of citations and abstracts for published life science journals. Additional NCBI resources focus on literature (Bookshelf, PubMed Central (PMC) and PubReader); medical genetics (ClinVar, dbMHC, the Genetic Testing Registry, HIV-1/Human Protein Interaction Database and MedGen); genes and genomics (BioProject, BioSample, dbSNP, dbVar, Epigenomics, Gene, Gene Expression Omnibus (GEO), Genome, HomoloGene, the Map Viewer, Nucleotide, PopSet, Probe, RefSeq, Sequence Read Archive, the Taxonomy Browser, Trace Archive and UniGene); and proteins and chemicals (Biosystems, COBALT, the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART), the Molecular Modeling Database (MMDB), Protein Clusters, Protein and the PubChem suite of small molecule databases). The Entrez system provides search and retrieval operations for many of these databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page at http://www.ncbi.nlm.nih.gov. PMID:25398906

  4. Database resources of the National Center for Biotechnology Information

    PubMed Central

    2016-01-01

    The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank® nucleic acid sequence database and the PubMed database of citations and abstracts for published life science journals. Additional NCBI resources focus on literature (PubMed Central (PMC), Bookshelf and PubReader), health (ClinVar, dbGaP, dbMHC, the Genetic Testing Registry, HIV-1/Human Protein Interaction Database and MedGen), genomes (BioProject, Assembly, Genome, BioSample, dbSNP, dbVar, Epigenomics, the Map Viewer, Nucleotide, Probe, RefSeq, Sequence Read Archive, the Taxonomy Browser and the Trace Archive), genes (Gene, Gene Expression Omnibus (GEO), HomoloGene, PopSet and UniGene), proteins (Protein, the Conserved Domain Database (CDD), COBALT, Conserved Domain Architecture Retrieval Tool (CDART), the Molecular Modeling Database (MMDB) and Protein Clusters) and chemicals (Biosystems and the PubChem suite of small molecule databases). The Entrez system provides search and retrieval operations for most of these databases. Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized datasets. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov. PMID:26615191

  5. Prototyping a Web-Enabled Decision Support System to Improve Capacity Management of Aviation Training

    DTIC Science & Technology

    2005-09-01

    sharing, cooperation, and cost optimization International Journal of Production Economics Amsterdam, 93,94, 41-52. Retrieved July 13, 2005, from the... Journal of Production Economics , 59(1-3), 53-64. Calogero, B. (2000). Who is to blame for ERP failure? SunServer, 14(6), 8-9. Retrieved July 24...database. Bonney, M. C., Zhang, Z., Head, M. A., Tien, C. C., & Barson, R. J. (1999). Are push and pull systems really so different? International

  6. SIRW: A web server for the Simple Indexing and Retrieval System that combines sequence motif searches with keyword searches.

    PubMed

    Ramu, Chenna

    2003-07-01

    SIRW (http://sirw.embl.de/) is a World Wide Web interface to the Simple Indexing and Retrieval System (SIR) that is capable of parsing and indexing various flat file databases. In addition it provides a framework for doing sequence analysis (e.g. motif pattern searches) for selected biological sequences through keyword search. SIRW is an ideal tool for the bioinformatics community for searching as well as analyzing biological sequences of interest.

  7. Biological data integration: wrapping data and tools.

    PubMed

    Lacroix, Zoé

    2002-06-01

    Nowadays scientific data is inevitably digital and stored in a wide variety of formats in heterogeneous systems. Scientists need to access an integrated view of remote or local heterogeneous data sources with advanced data accessing, analyzing, and visualization tools. Building a digital library for scientific data requires accessing and manipulating data extracted from flat files or databases, documents retrieved from the Web as well as data generated by software. We present an approach to wrapping web data sources, databases, flat files, or data generated by tools through a database view mechanism. Generally, a wrapper has two tasks: it first sends a query to the source to retrieve data and, second builds the expected output with respect to the virtual structure. Our wrappers are composed of a retrieval component based on an intermediate object view mechanism called search views mapping the source capabilities to attributes, and an eXtensible Markup Language (XML) engine, respectively, to perform these two tasks. The originality of the approach consists of: 1) a generic view mechanism to access seamlessly data sources with limited capabilities and 2) the ability to wrap data sources as well as the useful specific tools they may provide. Our approach has been developed and demonstrated as part of the multidatabase system supporting queries via uniform object protocol model (OPM) interfaces.

  8. Developing a comprehensive system for content-based retrieval of image and text data from a national survey

    NASA Astrophysics Data System (ADS)

    Antani, Sameer K.; Natarajan, Mukil; Long, Jonathan L.; Long, L. Rodney; Thoma, George R.

    2005-04-01

    The article describes the status of our ongoing R&D at the U.S. National Library of Medicine (NLM) towards the development of an advanced multimedia database biomedical information system that supports content-based image retrieval (CBIR). NLM maintains a collection of 17,000 digitized spinal X-rays along with text survey data from the Second National Health and Nutritional Examination Survey (NHANES II). These data serve as a rich data source for epidemiologists and researchers of osteoarthritis and musculoskeletal diseases. It is currently possible to access these through text keyword queries using our Web-based Medical Information Retrieval System (WebMIRS). CBIR methods developed specifically for biomedical images could offer direct visual searching of these images by means of example image or user sketch. We are building a system which supports hybrid queries that have text and image-content components. R&D goals include developing algorithms for robust image segmentation for localizing and identifying relevant anatomy, labeling the segmented anatomy based on its pathology, developing suitable indexing and similarity matching methods for images and image features, and associating the survey text information for query and retrieval along with the image data. Some highlights of the system developed in MATLAB and Java are: use of a networked or local centralized database for text and image data; flexibility to incorporate new research work; provides a means to control access to system components under development; and use of XML for structured reporting. The article details the design, features, and algorithms in this third revision of this prototype system, CBIR3.

  9. Establishment of an inferior vena cava filter database and interventional radiology led follow-up - retrieval rates and patients lost to follow-up.

    PubMed

    Klinken, Sven; Humphries, Charlotte; Ferguson, John

    2017-10-01

    To evaluate the rates of inferior vena cava (IVC) filter retrieval and the number of patient's lost to follow-up, before and after the establishment of an IVC filter database and interventional radiology (inserting physician) led follow-up. On the 1st of June 2012, an electronic interventional radiology database was established at our Institution. In addition, the interventional radiology team took responsibility for follow-up of IVC filters. Data were prospectively collected from the database for all patients who had an IVC filter inserted between the 1st June 2012 and the 31st May 2014. Data on patients who had an IVC filter inserted between the 1st of June 2009 to the 31st of May 2012 were retrospectively reviewed. Patient demographics, insertion indications, filter types, retrieval status, documented retrieval decisions, time in situ, trackable events and complications were obtained in the pre-database (n = 136) and post-database (n = 118) cohorts. Attempted IVC filter retrieval rates were improved from 52.9% to 72.9% (P = 0.001) following the establishment of the database. The number of patients with no documented decision (lost to follow-up) regarding their IVC filter reduced from 31 of 136 (23%) to 0 of 118 patients (P = < 0.001). There was a non-significant reduction in IVC filter dwell time in the post-database group (113 as compared to 137 days, P = 0.129). Following the establishment of an IVC filter database and interventional radiology led follow-up, we demonstrate a significant improvement in the attempted retrieval rates of IVC filters and the number of patient's lost to follow-up. © 2017 The Royal Australian and New Zealand College of Radiologists.

  10. Content-based unconstrained color logo and trademark retrieval with color edge gradient co-occurrence histograms

    NASA Astrophysics Data System (ADS)

    Phan, Raymond; Androutsos, Dimitrios

    2008-01-01

    In this paper, we present a logo and trademark retrieval system for unconstrained color image databases that extends the Color Edge Co-occurrence Histogram (CECH) object detection scheme. We introduce more accurate information to the CECH, by virtue of incorporating color edge detection using vector order statistics. This produces a more accurate representation of edges in color images, in comparison to the simple color pixel difference classification of edges as seen in the CECH. Our proposed method is thus reliant on edge gradient information, and as such, we call this the Color Edge Gradient Co-occurrence Histogram (CEGCH). We use this as the main mechanism for our unconstrained color logo and trademark retrieval scheme. Results illustrate that the proposed retrieval system retrieves logos and trademarks with good accuracy, and outperforms the CECH object detection scheme with higher precision and recall.

  11. A retrieval algorithm of hydrometer profile for submillimeter-wave radiometer

    NASA Astrophysics Data System (ADS)

    Liu, Yuli; Buehler, Stefan; Liu, Heguang

    2017-04-01

    Vertical profiles of particle microphysics perform vital functions for the estimation of climatic feedback. This paper proposes a new algorithm to retrieve the profile of the parameters of the hydrometeor(i.e., ice, snow, rain, liquid cloud, graupel) based on passive submillimeter-wave measurements. These parameters include water content and particle size. The first part of the algorithm builds the database and retrieves the integrated quantities. Database is built up by Atmospheric Radiative Transfer Simulator(ARTS), which uses atmosphere data to simulate the corresponding brightness temperature. Neural network, trained by the precalculated database, is developed to retrieve the water path for each type of particles. The second part of the algorithm analyses the statistical relationship between water path and vertical parameters profiles. Based on the strong dependence existing between vertical layers in the profiles, Principal Component Analysis(PCA) technique is applied. The third part of the algorithm uses the forward model explicitly to retrieve the hydrometeor profiles. Cost function is calculated in each iteration, and Differential Evolution(DE) algorithm is used to adjust the parameter values during the evolutionary process. The performance of this algorithm is planning to be verified for both simulation database and measurement data, by retrieving profiles in comparison with the initial one. Results show that this algorithm has the ability to retrieve the hydrometeor profiles efficiently. The combination of ARTS and optimization algorithm can get much better results than the commonly used database approach. Meanwhile, the concept that ARTS can be used explicitly in the retrieval process shows great potential in providing solution to other retrieval problems.

  12. Aero/fluids database system

    NASA Technical Reports Server (NTRS)

    Reardon, John E.; Violett, Duane L., Jr.

    1991-01-01

    The AFAS Database System was developed to provide the basic structure of a comprehensive database system for the Marshall Space Flight Center (MSFC) Structures and Dynamics Laboratory Aerophysics Division. The system is intended to handle all of the Aerophysics Division Test Facilities as well as data from other sources. The system was written for the DEC VAX family of computers in FORTRAN-77 and utilizes the VMS indexed file system and screen management routines. Various aspects of the system are covered, including a description of the user interface, lists of all code structure elements, descriptions of the file structures, a description of the security system operation, a detailed description of the data retrieval tasks, a description of the session log, and a description of the archival system.

  13. Space Station Freedom environmental database system (FEDS) for MSFC testing

    NASA Technical Reports Server (NTRS)

    Story, Gail S.; Williams, Wendy; Chiu, Charles

    1991-01-01

    The Water Recovery Test (WRT) at Marshall Space Flight Center (MSFC) is the first demonstration of integrated water recovery systems for potable and hygiene water reuse as envisioned for Space Station Freedom (SSF). In order to satisfy the safety and health requirements placed on the SSF program and facilitate test data assessment, an extensive laboratory analysis database was established to provide a central archive and data retrieval function. The database is required to store analysis results for physical, chemical, and microbial parameters measured from water, air and surface samples collected at various locations throughout the test facility. The Oracle Relational Database Management System (RDBMS) was utilized to implement a secured on-line information system with the ECLSS WRT program as the foundation for this system. The database is supported on a VAX/VMS 8810 series mainframe and is accessible from the Marshall Information Network System (MINS). This paper summarizes the database requirements, system design, interfaces, and future enhancements.

  14. Construction of In-house Databases in a Corporation

    NASA Astrophysics Data System (ADS)

    Sano, Hikomaro

    This report outlines “Repoir” (Report information retrieval) system of Toyota Central R & D Laboratories, Inc. as an example of in-house information retrieval system. The online system was designed to process in-house technical reports with the aid of a mainframe computer and has been in operation since 1979. Its features are multiple use of the information for technical and managerial purposes and simplicity in indexing and data input. The total number of descriptors, specially selected for the system, was minimized for ease of indexing. The report also describes the input items, processing flow and typical outputs in kanji letters.

  15. Ten Most Searched Databases by a Business Generalist--Part 1 or A Day in the Life of....

    ERIC Educational Resources Information Center

    Meredith, Meri

    1986-01-01

    Describes databases frequently used in Business Information Center, Cummins Engine Company (Columbus, Indiana): Dun and Bradstreet Business Information Report System, Newsearch, Dun and Bradstreet Market Identifiers, Trade and Industry Index, PTS PROMT, Bureau of Labor Statistics files, ABI/INFORM, Magazine Index, NEXIS, Dow Jones News/Retrieval.…

  16. Developing an Information Infrastructure To Support Information Retrieval: Towards a Theory of Clustering Based in Classification.

    ERIC Educational Resources Information Center

    Micco, Mary; Popp, Rich

    Techniques for building a world-wide information infrastructure by reverse engineering existing databases to link them in a hierarchical system of subject clusters to create an integrated database are explored. The controlled vocabulary of the Library of Congress Subject Headings is used to ensure consistency and group similar items. Each database…

  17. Artificial intelligence techniques for modeling database user behavior

    NASA Technical Reports Server (NTRS)

    Tanner, Steve; Graves, Sara J.

    1990-01-01

    The design and development of the adaptive modeling system is described. This system models how a user accesses a relational database management system in order to improve its performance by discovering use access patterns. In the current system, these patterns are used to improve the user interface and may be used to speed data retrieval, support query optimization and support a more flexible data representation. The system models both syntactic and semantic information about the user's access and employs both procedural and rule-based logic to manipulate the model.

  18. ORFer--retrieval of protein sequences and open reading frames from GenBank and storage into relational databases or text files.

    PubMed

    Büssow, Konrad; Hoffmann, Steve; Sievert, Volker

    2002-12-19

    Functional genomics involves the parallel experimentation with large sets of proteins. This requires management of large sets of open reading frames as a prerequisite of the cloning and recombinant expression of these proteins. A Java program was developed for retrieval of protein and nucleic acid sequences and annotations from NCBI GenBank, using the XML sequence format. Annotations retrieved by ORFer include sequence name, organism and also the completeness of the sequence. The program has a graphical user interface, although it can be used in a non-interactive mode. For protein sequences, the program also extracts the open reading frame sequence, if available, and checks its correct translation. ORFer accepts user input in the form of single or lists of GenBank GI identifiers or accession numbers. It can be used to extract complete sets of open reading frames and protein sequences from any kind of GenBank sequence entry, including complete genomes or chromosomes. Sequences are either stored with their features in a relational database or can be exported as text files in Fasta or tabulator delimited format. The ORFer program is freely available at http://www.proteinstrukturfabrik.de/orfer. The ORFer program allows for fast retrieval of DNA sequences, protein sequences and their open reading frames and sequence annotations from GenBank. Furthermore, storage of sequences and features in a relational database is supported. Such a database can supplement a laboratory information system (LIMS) with appropriate sequence information.

  19. A diagram retrieval method with multi-label learning

    NASA Astrophysics Data System (ADS)

    Fu, Songping; Lu, Xiaoqing; Liu, Lu; Qu, Jingwei; Tang, Zhi

    2015-01-01

    In recent years, the retrieval of plane geometry figures (PGFs) has attracted increasing attention in the fields of mathematics education and computer science. However, the high cost of matching complex PGF features leads to the low efficiency of most retrieval systems. This paper proposes an indirect classification method based on multi-label learning, which improves retrieval efficiency by reducing the scope of compare operation from the whole database to small candidate groups. Label correlations among PGFs are taken into account for the multi-label classification task. The primitive feature selection for multi-label learning and the feature description of visual geometric elements are conducted individually to match similar PGFs. The experiment results show the competitive performance of the proposed method compared with existing PGF retrieval methods in terms of both time consumption and retrieval quality.

  20. 1986 Year End Report for Road Following at Carnegie-Mellon

    DTIC Science & Technology

    1987-05-01

    how to make them work efficiently. We designed a hierarchical structure and a monitor module which manages all parts of the hierarchy (see figure 1...database, called the Local Map, is managed by a program known as the Local Map Builder (LMB). Each module stores and retrieves information in the...knowledge-intensive modules, and a database manager that synchronizes the modules-is characteristic of a traditional blackboard system. Such a system is

  1. Artificial Intelligence: Applications in Education.

    ERIC Educational Resources Information Center

    Thorkildsen, Ron J.; And Others

    1986-01-01

    Artificial intelligence techniques are used in computer programs to search out rapidly and retrieve information from very large databases. Programing advances have also led to the development of systems that provide expert consultation (expert systems). These systems, as applied to education, are the primary emphasis of this article. (LMO)

  2. Integrating Borrowed Records into a Database: Impact on Thesaurus Development and Retrieval.

    ERIC Educational Resources Information Center

    And Others; Kirtland, Monika

    1980-01-01

    Discusses three approaches to thesaurus and indexing/retrieval language maintenance for combined databases: reindexing, merging, and initial standardization. Two thesauri for a combined database are evaluated in terms of their compatibility, and indexing practices are compared. Tables and figures help illustrate aspects of the comparison. (SW)

  3. Subject Retrieval from Full-Text Databases in the Humanities

    ERIC Educational Resources Information Center

    East, John W.

    2007-01-01

    This paper examines the problems involved in subject retrieval from full-text databases of secondary materials in the humanities. Ten such databases were studied and their search functionality evaluated, focusing on factors such as Boolean operators, document surrogates, limiting by subject area, proximity operators, phrase searching, wildcards,…

  4. Assessment of forest fire impacts and emissions in the European Union based on the European forest fire information system

    Treesearch

    Paulo Barbosa; Andrea Camia; Jan Kucera; Giorgio Libertá; Ilaria Palumbo; Jesus San-Miguel-Ayanz; Guido Schmuck

    2009-01-01

    An analysis on the number of forest fires and burned area distribution as retrieved by the European Forest Fire Information System (EFFIS) database is presented. On average, from 2000 to 2005 about...

  5. Desktop Access to Full-Text NACA and NASA Reports: Systems Developed by NASA Langley Technical Library

    NASA Technical Reports Server (NTRS)

    Ambur, Manjula Y.; Adams, David L.; Trinidad, P. Paul

    1997-01-01

    NASA Langley Technical Library has been involved in developing systems for full-text information delivery of NACA/NASA technical reports since 1991. This paper will describe the two prototypes it has developed and the present production system configuration. The prototype systems are a NACA CD-ROM of thirty-three classic paper NACA reports and a network-based Full-text Electronic Reports Documents System (FEDS) constructed from both paper and electronic formats of NACA and NASA reports. The production system is the DigiDoc System (DIGItal Documents) presently being developed based on the experiences gained from the two prototypes. DigiDoc configuration integrates the on-line catalog database World Wide Web interface and PDF technology to provide a powerful and flexible search and retrieval system. It describes in detail significant achievements and lessons learned in terms of data conversion, storage technologies, full-text searching and retrieval, and image databases. The conclusions from the experiences of digitization and full- text access and future plans for DigiDoc system implementation are discussed.

  6. Quantifying the sensitivity of aerosol optical depths retrieved from MSG SEVIRI to a priori data

    NASA Astrophysics Data System (ADS)

    Bulgin, C. E.; Palmer, P. I.; Merchant, C. J.; Siddans, R.; Poulsen, C.; Grainger, R. G.; Thomas, G.; Carboni, E.; McConnell, C.; Highwood, E.

    2009-12-01

    Radiative forcing contributions from aerosol direct and indirect effects remain one of the most uncertain components of the climate system. Satellite observations of aerosol optical properties offer important constraints on atmospheric aerosols but their sensitivity to prior assumptions must be better characterized before they are used effectively to reduce uncertainty in aerosol radiative forcing. We assess the sensitivity of the Oxford-RAL Aerosol and Cloud (ORAC) optimal estimation retrieval of aerosol optical depth (AOD) from the Spinning Enhanced Visible and InfraRed Imager (SEVIRI) to a priori aerosol data. SEVIRI is a geostationary satellite instrument centred over Africa and the neighbouring Atlantic Ocean, routinely sampling desert dust and biomass burning outflow from Africa. We quantify the uncertainty in SEVIRI AOD retrievals in the presence of desert dust by comparing retrievals that use prior information from the Optical Properties of Aerosol and Cloud (OPAC) database, with those that use measured aerosol properties during the Dust Outflow and Deposition to the Ocean (DODO) aircraft campaign (August, 2006). We also assess the sensitivity of retrieved AODs to changes in solar zenith angle, and the vertical profile of aerosol effective radius and extinction coefficient input into the retrieval forward model. Currently the ORAC retrieval scheme retrieves AODs for five aerosol types (desert dust, biomass burning, maritime, urban and continental) and chooses the most appropriate AOD based on the cost functions. We generate an improved prior aerosol speciation database for SEVIRI based on a statistical analysis of a Saharan Dust Index (SDI) determined using variances of different brightness temperatures, and organic and black carbon tracers from the GEOS-Chem chemistry transport model. This database is described as a function of season and time of day. We quantify the difference in AODs between those chosen based on prior information from the SDI and GEOS-Chem and those chosen based on the smallest cost function.

  7. Integrating In Silico Resources to Map a Signaling Network

    PubMed Central

    Liu, Hanqing; Beck, Tim N.; Golemis, Erica A.; Serebriiskii, Ilya G.

    2013-01-01

    The abundance of publicly available life science databases offer a wealth of information that can support interpretation of experimentally derived data and greatly enhance hypothesis generation. Protein interaction and functional networks are not simply new renditions of existing data: they provide the opportunity to gain insights into the specific physical and functional role a protein plays as part of the biological system. In this chapter, we describe different in silico tools that can quickly and conveniently retrieve data from existing data repositories and discuss how the available tools are best utilized for different purposes. While emphasizing protein-protein interaction databases (e.g., BioGrid and IntAct), we also introduce metasearch platforms such as STRING and GeneMANIA, pathway databases (e.g., BioCarta and Pathway Commons), text mining approaches (e.g., PubMed and Chilibot), and resources for drug-protein interactions, genetic information for model organisms and gene expression information based on microarray data mining. Furthermore, we provide a simple step-by-step protocol to building customized protein-protein interaction networks in Cytoscape, a powerful network assembly and visualization program, integrating data retrieved from these various databases. As we illustrate, generation of composite interaction networks enables investigators to extract significantly more information about a given biological system than utilization of a single database or sole reliance on primary literature. PMID:24233784

  8. Imaged Document Optical Correlation and Conversion System (IDOCCS)

    NASA Astrophysics Data System (ADS)

    Stalcup, Bruce W.; Dennis, Phillip W.; Dydyk, Robert B.

    1999-03-01

    Today, the paper document is fast becoming a thing of the past. With the rapid development of fast, inexpensive computing and storage devices, many government and private organizations are archiving their documents in electronic form (e.g., personnel records, medical records, patents, etc.). In addition, many organizations are converting their paper archives to electronic images, which are stored in a computer database. Because of this, there is a need to efficiently organize this data into comprehensive and accessible information resources. The Imaged Document Optical Correlation and Conversion System (IDOCCS) provides a total solution to the problem of managing and retrieving textual and graphic information from imaged document archives. At the heart of IDOCCS, optical correlation technology provides the search and retrieval capability of document images. The IDOCCS can be used to rapidly search for key words or phrases within the imaged document archives and can even determine the types of languages contained within a document. In addition, IDOCCS can automatically compare an input document with the archived database to determine if it is a duplicate, thereby reducing the overall resources required to maintain and access the document database. Embedded graphics on imaged pages can also be exploited, e.g., imaged documents containing an agency's seal or logo, or documents with a particular individual's signature block, can be singled out. With this dual capability, IDOCCS outperforms systems that rely on optical character recognition as a basis for indexing and storing only the textual content of documents for later retrieval.

  9. Current Standardization and Cooperative Efforts Related to Industrial Information Infrastructures.

    DTIC Science & Technology

    1993-05-01

    Data Management Systems: Components used to store, manage, and retrieve data. Data management includes knowledge bases, database management...Application Development Tools and Methods X/Open and POSIX APIs Integrated Design Support System (IDS) Knowledge -Based Systems (KBS) Application...IDEFlx) Yourdon Jackson System Design (JSD) Knowledge -Based Systems (KBSs) Structured Systems Development (SSD) Semantic Unification Meta-Model

  10. Cost and Search Result Comparisons of BRS After Dark and Knowledge Index.

    ERIC Educational Resources Information Center

    Cloud, Gayla Staples; Hambric, Jacqueline

    This two-part study was designed (1) to determine differences in the costs of searching BRS After Dark (BRS AD) and Knowledge Index (KI) generally and across ten selected databases, and (2) to determine whether there is a difference in the citations retrieved when the same search is conducted on the same database in both systems. Study methodology…

  11. Environmental factor(tm) system: RCRA hazardous waste handler information (on CD-ROM). Data file

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    NONE

    1995-11-01

    Environmental Factor(trademark) RCRA Hazardous Waste Handler Information on CD-ROM unleashes the invaluable information found in two key EPA data sources on hazardous waste handlers and offers cradle-to-grave waste tracking. It`s easy to search and display: (1) Permit status, design capacity, and compliance history for facilities found in the EPA Research Conservation and Recovery Information System (RCRIS) program tracking database; (2) Detailed information on hazardous wastes generation, management, and minimization by companies who are large quantity generators; and (3) Data on the waste management practices of treatment, storage, and disposal (TSD) facilities from the EPA Biennial Reporting System which is collectedmore » every other year. Environmental Factor`s powerful database retrieval system lets you: (1) Search for RCRA facilities by permit type, SIC code, waste codes, corrective action, or violation information, TSD status, generator and transporter status, and more. (2) View compliance information - dates of evaluation, violation, enforcement, and corrective action. (3) Lookup facilities by waste processing categories of marketing, transporting, processing, and energy recovery. (4) Use owner/operator information and names, titles, and telephone numbers of project managers for prospecting. (5) Browse detailed data on TSD facility and large quantity generators` activities such as onsite waste treatment, disposal, or recycling, offsite waste received, and waste generation and management. The product contains databases, search and retrieval software on two CD-ROMs, an installation diskette and User`s Guide. Environmental Factor has online context-sensitive help from any screen and a printed User`s Guide describing installation and step-by-step procedures for searching, retrieving, and exporting.« less

  12. GenBank.

    PubMed

    Benson, Dennis A; Karsch-Mizrachi, Ilene; Lipman, David J; Ostell, James; Wheeler, David L

    2007-01-01

    GenBank (R) is a comprehensive database that contains publicly available nucleotide sequences for more than 240 000 named organisms, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects. Most submissions are made using the web-based BankIt or standalone Sequin programs and accession numbers are assigned by GenBank staff upon receipt. Daily data exchange with the EMBL Data Library in Europe and the DNA Data Bank of Japan ensures worldwide coverage. GenBank is accessible through NCBI's retrieval system, Entrez, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, begin at the NCBI Homepage (www.ncbi.nlm.nih.gov).

  13. GenBank.

    PubMed

    Benson, Dennis A; Karsch-Mizrachi, Ilene; Lipman, David J; Ostell, James; Wheeler, David L

    2005-01-01

    GenBank is a comprehensive database that contains publicly available DNA sequences for more than 165,000 named organisms, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects. Most submissions are made using the web-based BankIt or standalone Sequin programs and accession numbers are assigned by GenBank staff upon receipt. Daily data exchange with the EMBL Data Library in the UK and the DNA Data Bank of Japan helps to ensure worldwide coverage. GenBank is accessible through NCBI's retrieval system, Entrez, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, go to the NCBI Homepage at http://www.ncbi.nlm.nih.gov.

  14. GenBank.

    PubMed

    Benson, Dennis A; Karsch-Mizrachi, Ilene; Lipman, David J; Ostell, James; Wheeler, David L

    2006-01-01

    GenBank (R) is a comprehensive database that contains publicly available DNA sequences for more than 205 000 named organisms, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects. Most submissions are made using the Web-based BankIt or standalone Sequin programs and accession numbers are assigned by GenBank staff upon receipt. Daily data exchange with the EMBL Data Library in Europe and the DNA Data Bank of Japan ensures worldwide coverage. GenBank is accessible through NCBI's retrieval system, Entrez, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, go to the NCBI Homepage at www.ncbi.nlm.nih.gov.

  15. Image query and indexing for digital x rays

    NASA Astrophysics Data System (ADS)

    Long, L. Rodney; Thoma, George R.

    1998-12-01

    The web-based medical information retrieval system (WebMIRS) allows interned access to databases containing 17,000 digitized x-ray spine images and associated text data from National Health and Nutrition Examination Surveys (NHANES). WebMIRS allows SQL query of the text, and viewing of the returned text records and images using a standard browser. We are now working (1) to determine utility of data directly derived from the images in our databases, and (2) to investigate the feasibility of computer-assisted or automated indexing of the images to support image retrieval of images of interest to biomedical researchers in the field of osteoarthritis. To build an initial database based on image data, we are manually segmenting a subset of the vertebrae, using techniques from vertebral morphometry. From this, we will derive and add to the database vertebral features. This image-derived data will enhance the user's data access capability by enabling the creation of combined SQL/image-content queries.

  16. 78 FR 19243 - Privacy Act of 1974; System of Records

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-03-29

    ... applicant; or stored in searchable database and retrievable by patent number. Safeguards: Buildings employ... DEPARTMENT OF COMMERCE United States Patent and Trademark Office Privacy Act of 1974; System of Records AGENCY: United States Patent and Trademark Office, Commerce. ACTION: Notice of amendment of...

  17. Web-based OPACs: Between Tradition and Innovation.

    ERIC Educational Resources Information Center

    Moscoso, Purificacion; Ortiz-Repiso, Virginia

    1999-01-01

    Analyzes the change that Internet-based OPACs (Online Public Access Catalogs) have represented to the structure, administration, and maintenance of the catalogs, retrieval systems, and user interfaces. Examines the structure of databases and traditional principles that have governed systems development. Discusses repercussions of the application…

  18. Maintenance Manual for the Automated Airdrop Information Retrieval System; Human Factors Database

    DTIC Science & Technology

    1994-09-01

    Sensorimotor Abilities Loss of Cognitive/Perceptual Abilities Treatment drug therapy physical therapy cognitive therapy biofeedback therapy 63 9...Device (AOD) Oxygen System oxygen mask oxygen hose oxygen cylinders on/off valve prebreather Floatation Devices life preserver Scuba Gear Ankle Braces

  19. A MySQL Based EPICS Archiver

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Christopher Slominski

    2009-10-01

    Archiving a large fraction of the EPICS signals within the Jefferson Lab (JLAB) Accelerator control system is vital for postmortem and real-time analysis of the accelerator performance. This analysis is performed on a daily basis by scientists, operators, engineers, technicians, and software developers. Archiving poses unique challenges due to the magnitude of the control system. A MySQL Archiving system (Mya) was developed to scale to the needs of the control system; currently archiving 58,000 EPICS variables, updating at a rate of 11,000 events per second. In addition to the large collection rate, retrieval of the archived data must also bemore » fast and robust. Archived data retrieval clients obtain data at a rate over 100,000 data points per second. Managing the data in a relational database provides a number of benefits. This paper describes an archiving solution that uses an open source database and standard off the shelf hardware to reach high performance archiving needs. Mya has been in production at Jefferson Lab since February of 2007.« less

  20. GenBank.

    PubMed

    Benson, Dennis A; Karsch-Mizrachi, Ilene; Lipman, David J; Ostell, James; Sayers, Eric W

    2010-01-01

    GenBank is a comprehensive database that contains publicly available nucleotide sequences for more than 300,000 organisms named at the genus level or lower, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects, including whole genome shotgun (WGS) and environmental sampling projects. Most submissions are made using the web-based BankIt or standalone Sequin programs, and accession numbers are assigned by GenBank staff upon receipt. Daily data exchange with the European Molecular Biology Laboratory Nucleotide Sequence Database in Europe and the DNA Data Bank of Japan ensures worldwide coverage. GenBank is accessible through the NCBI Entrez retrieval system, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bi-monthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, begin at the NCBI homepage: www.ncbi.nlm.nih.gov.

  1. GenBank.

    PubMed

    Benson, Dennis A; Karsch-Mizrachi, Ilene; Lipman, David J; Ostell, James; Sayers, Eric W

    2009-01-01

    GenBank is a comprehensive database that contains publicly available nucleotide sequences for more than 300,000 organisms named at the genus level or lower, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects. Most submissions are made using the web-based BankIt or standalone Sequin programs, and accession numbers are assigned by GenBank(R) staff upon receipt. Daily data exchange with the European Molecular Biology Laboratory Nucleotide Sequence Database in Europe and the DNA Data Bank of Japan ensures worldwide coverage. GenBank is accessible through the National Center for Biotechnology Information (NCBI) Entrez retrieval system, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, begin at the NCBI Homepage: www.ncbi.nlm.nih.gov.

  2. Semantic-based surveillance video retrieval.

    PubMed

    Hu, Weiming; Xie, Dan; Fu, Zhouyu; Zeng, Wenrong; Maybank, Steve

    2007-04-01

    Visual surveillance produces large amounts of video data. Effective indexing and retrieval from surveillance video databases are very important. Although there are many ways to represent the content of video clips in current video retrieval algorithms, there still exists a semantic gap between users and retrieval systems. Visual surveillance systems supply a platform for investigating semantic-based video retrieval. In this paper, a semantic-based video retrieval framework for visual surveillance is proposed. A cluster-based tracking algorithm is developed to acquire motion trajectories. The trajectories are then clustered hierarchically using the spatial and temporal information, to learn activity models. A hierarchical structure of semantic indexing and retrieval of object activities, where each individual activity automatically inherits all the semantic descriptions of the activity model to which it belongs, is proposed for accessing video clips and individual objects at the semantic level. The proposed retrieval framework supports various queries including queries by keywords, multiple object queries, and queries by sketch. For multiple object queries, succession and simultaneity restrictions, together with depth and breadth first orders, are considered. For sketch-based queries, a method for matching trajectories drawn by users to spatial trajectories is proposed. The effectiveness and efficiency of our framework are tested in a crowded traffic scene.

  3. Evaluation of contents-based image retrieval methods for a database of logos on drug tablets

    NASA Astrophysics Data System (ADS)

    Geradts, Zeno J.; Hardy, Huub; Poortman, Anneke; Bijhold, Jurrien

    2001-02-01

    In this research an evaluation has been made of the different ways of contents based image retrieval of logos of drug tablets. On a database of 432 illicitly produced tablets (mostly containing MDMA), we have compared different retrieval methods. Two of these methods were available from commercial packages, QBIC and Imatch, where the implementation of the contents based image retrieval methods are not exactly known. We compared the results for this database with the MPEG-7 shape comparison methods, which are the contour-shape, bounding box and region-based shape methods. In addition, we have tested the log polar method that is available from our own research.

  4. Performance appraisal of online MEDLINE access routes.

    PubMed Central

    Walker, C. J.; McKibbon, K. A.; Haynes, R. B.; Johnston, M. E.

    1992-01-01

    OBJECTIVE: To compare the performance and cost of 11 online MEDLINE systems with MEDLINE at Elhill. DESIGN: Comparative study. SYSTEMS: Eleven online daytime systems commercially available in North America offering the MEDLINE database. MEASURES: Number of relevant citations, number of irrelevant citations, proportion of searches producing no relevant citations and cost per relevant citation were analyzed for each system. Relevance and cost for each system were compared with direct searching of MEDLINE through NLM for librarian and clinician search strategies for 18 clinical questions. The citations retrieved by both strategies were pooled and rated for relevance on a 7-point scale. RESULTS: Numbers of relevant and irrelevant citations and cost per relevant citation were higher for clinician searches than librarian searches, reflecting the higher total number of citations retrieved by the clinician approaches. A lower proportion of clinician searches produced no relevant citations than librarian searches. CONCLUSIONS: Eleven daytime MEDLINE systems performed similarly in terms of retrieval and cost within similar searching groups. Clinicians, however, tended to capture larger overall retrievals resulting in higher numbers of relevant and irrelevant citations than librarians. PMID:1482922

  5. Evaluation of information-theoretic similarity measures for content-based retrieval and detection of masses in mammograms

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tourassi, Georgia D.; Harrawood, Brian; Singh, Swatee

    The purpose of this study was to evaluate image similarity measures employed in an information-theoretic computer-assisted detection (IT-CAD) scheme. The scheme was developed for content-based retrieval and detection of masses in screening mammograms. The study is aimed toward an interactive clinical paradigm where physicians query the proposed IT-CAD scheme on mammographic locations that are either visually suspicious or indicated as suspicious by other cuing CAD systems. The IT-CAD scheme provides an evidence-based, second opinion for query mammographic locations using a knowledge database of mass and normal cases. In this study, eight entropy-based similarity measures were compared with respect to retrievalmore » precision and detection accuracy using a database of 1820 mammographic regions of interest. The IT-CAD scheme was then validated on a separate database for false positive reduction of progressively more challenging visual cues generated by an existing, in-house mass detection system. The study showed that the image similarity measures fall into one of two categories; one category is better suited to the retrieval of semantically similar cases while the second is more effective with knowledge-based decisions regarding the presence of a true mass in the query location. In addition, the IT-CAD scheme yielded a substantial reduction in false-positive detections while maintaining high detection rate for malignant masses.« less

  6. GeoGIS : phase III.

    DOT National Transportation Integrated Search

    2011-08-01

    GeoGIS is a web-based geotechnical database management system that is being developed for the Alabama : Department of Transportation (ALDOT). The purpose of GeoGIS is to facilitate the efficient storage and retrieval of : geotechnical documents for A...

  7. An intelligent interactive visual database management system for Space Shuttle closeout image management

    NASA Technical Reports Server (NTRS)

    Ragusa, James M.; Orwig, Gary; Gilliam, Michael; Blacklock, David; Shaykhian, Ali

    1994-01-01

    Status is given of an applications investigation on the potential for using an expert system shell for classification and retrieval of high resolution, digital, color space shuttle closeout photography. This NASA funded activity has focused on the use of integrated information technologies to intelligently classify and retrieve still imagery from a large, electronically stored collection. A space shuttle processing problem is identified, a working prototype system is described, and commercial applications are identified. A conclusion reached is that the developed system has distinct advantages over the present manual system and cost efficiencies will result as the system is implemented. Further, commercial potential exists for this integrated technology.

  8. Techniques to Access Databases and Integrate Data for Hydrologic Modeling

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Whelan, Gene; Tenney, Nathan D.; Pelton, Mitchell A.

    2009-06-17

    This document addresses techniques to access and integrate data for defining site-specific conditions and behaviors associated with ground-water and surface-water radionuclide transport applicable to U.S. Nuclear Regulatory Commission reviews. Environmental models typically require input data from multiple internal and external sources that may include, but are not limited to, stream and rainfall gage data, meteorological data, hydrogeological data, habitat data, and biological data. These data may be retrieved from a variety of organizations (e.g., federal, state, and regional) and source types (e.g., HTTP, FTP, and databases). Available data sources relevant to hydrologic analyses for reactor licensing are identified and reviewed.more » The data sources described can be useful to define model inputs and parameters, including site features (e.g., watershed boundaries, stream locations, reservoirs, site topography), site properties (e.g., surface conditions, subsurface hydraulic properties, water quality), and site boundary conditions, input forcings, and extreme events (e.g., stream discharge, lake levels, precipitation, recharge, flood and drought characteristics). Available software tools for accessing established databases, retrieving the data, and integrating it with models were identified and reviewed. The emphasis in this review was on existing software products with minimal required modifications to enable their use with the FRAMES modeling framework. The ability of four of these tools to access and retrieve the identified data sources was reviewed. These four software tools were the Hydrologic Data Acquisition and Processing System (HDAPS), Integrated Water Resources Modeling System (IWRMS) External Data Harvester, Data for Environmental Modeling Environmental Data Download Tool (D4EM EDDT), and the FRAMES Internet Database Tools. The IWRMS External Data Harvester and the D4EM EDDT were identified as the most promising tools based on their ability to access and retrieve the required data, and their ability to integrate the data into environmental models using the FRAMES environment.« less

  9. LAILAPS: the plant science search engine.

    PubMed

    Esch, Maria; Chen, Jinbo; Colmsee, Christian; Klapperstück, Matthias; Grafahrend-Belau, Eva; Scholz, Uwe; Lange, Matthias

    2015-01-01

    With the number of sequenced plant genomes growing, the number of predicted genes and functional annotations is also increasing. The association between genes and phenotypic traits is currently of great interest. Unfortunately, the information available today is widely scattered over a number of different databases. Information retrieval (IR) has become an all-encompassing bioinformatics methodology for extracting knowledge from complex, heterogeneous and distributed databases, and therefore can be a useful tool for obtaining a comprehensive view of plant genomics, from genes to traits. Here we describe LAILAPS (http://lailaps.ipk-gatersleben.de), an IR system designed to link plant genomic data in the context of phenotypic attributes for a detailed forward genetic research. LAILAPS comprises around 65 million indexed documents, encompassing >13 major life science databases with around 80 million links to plant genomic resources. The LAILAPS search engine allows fuzzy querying for candidate genes linked to specific traits over a loosely integrated system of indexed and interlinked genome databases. Query assistance and an evidence-based annotation system enable time-efficient and comprehensive information retrieval. An artificial neural network incorporating user feedback and behavior tracking allows relevance sorting of results. We fully describe LAILAPS's functionality and capabilities by comparing this system's performance with other widely used systems and by reporting both a validation in maize and a knowledge discovery use-case focusing on candidate genes in barley. © The Author 2014. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists.

  10. Interactive Querying Techniques for an Office Filing Facility.

    ERIC Educational Resources Information Center

    Morrissey, J. M.; And Others

    1986-01-01

    Proposes a "Model of Querying" for users of office filing facilities and discusses its motivation, aspects, attributes, and advantages. A review of current information systems and attempts to combine information retrieval, artificial intelligence, and database management techniques leads to conclusion that no resultant system is adequate…

  11. VAPEPS user's reference manual, version 5.0

    NASA Technical Reports Server (NTRS)

    Park, D. M.

    1988-01-01

    This is the reference manual for the VibroAcoustic Payload Environment Prediction System (VAPEPS). The system consists of a computer program and a vibroacoustic database. The purpose of the system is to collect measurements of vibroacoustic data taken from flight events and ground tests, and to retrieve this data and provide a means of using the data to predict future payload environments. This manual describes the operating language of the program. Topics covered include database commands, Statistical Energy Analysis (SEA) prediction commands, stress prediction command, and general computational commands.

  12. A Data Analysis Expert System For Large Established Distributed Databases

    NASA Astrophysics Data System (ADS)

    Gnacek, Anne-Marie; An, Y. Kim; Ryan, J. Patrick

    1987-05-01

    The purpose of this work is to analyze the applicability of artificial intelligence techniques for developing a user-friendly, parallel interface to large isolated, incompatible NASA databases for the purpose of assisting the management decision process. To carry out this work, a survey was conducted to establish the data access requirements of several key NASA user groups. In addition, current NASA database access methods were evaluated. The results of this work are presented in the form of a design for a natural language database interface system, called the Deductively Augmented NASA Management Decision Support System (DANMDS). This design is feasible principally because of recently announced commercial hardware and software product developments which allow cross-vendor compatibility. The goal of the DANMDS system is commensurate with the central dilemma confronting most large companies and institutions in America, the retrieval of information from large, established, incompatible database systems. The DANMDS system implementation would represent a significant first step toward this problem's resolution.

  13. DICOM-compliant PACS with CD-based image archival

    NASA Astrophysics Data System (ADS)

    Cox, Robert D.; Henri, Christopher J.; Rubin, Richard K.; Bret, Patrice M.

    1998-07-01

    This paper describes the design and implementation of a low- cost PACS conforming to the DICOM 3.0 standard. The goal was to provide an efficient image archival and management solution on a heterogeneous hospital network as a basis for filmless radiology. The system follows a distributed, client/server model and was implemented at a fraction of the cost of a commercial PACS. It provides reliable archiving on recordable CD and allows access to digital images throughout the hospital and on the Internet. Dedicated servers have been designed for short-term storage, CD-based archival, data retrieval and remote data access or teleradiology. The short-term storage devices provide DICOM storage and query/retrieve services to scanners and workstations and approximately twelve weeks of 'on-line' image data. The CD-based archival and data retrieval processes are fully automated with the exception of CD loading and unloading. The system employs lossless compression on both short- and long-term storage devices. All servers communicate via the DICOM protocol in conjunction with both local and 'master' SQL-patient databases. Records are transferred from the local to the master database independently, ensuring that storage devices will still function if the master database server cannot be reached. The system features rules-based work-flow management and WWW servers to provide multi-platform remote data access. The WWW server system is distributed on the storage, retrieval and teleradiology servers allowing viewing of locally stored image data directly in a WWW browser without the need for data transfer to a central WWW server. An independent system monitors disk usage, processes, network and CPU load on each server and reports errors to the image management team via email. The PACS was implemented using a combination of off-the-shelf hardware, freely available software and applications developed in-house. The system has enabled filmless operation in CT, MR and ultrasound within the radiology department and throughout the hospital. The use of WWW technology has enabled the development of an intuitive we- based teleradiology and image management solution that provides complete access to image data.

  14. High speed clinical data retrieval system with event time sequence feature: with 10 years of clinical data of Hamamatsu University Hospital CPOE.

    PubMed

    Kimura, M; Tani, S; Watanabe, H; Naito, Y; Sakusabe, T; Watanabe, H; Nakaya, J; Sasaki, F; Numano, T; Furuta, T; Furuta, T

    2008-01-01

    This paper illustrates a high speed clinical data retrieving system, from 10 years of data of operating hospital information system for the purposes of research, evidence creation, patient safety, etc., even incorporating time sequence of causal relations. Total of 73,709,298 records of 10 years at Hamamatsu University Hospital (as of June 2008) are sent from HIS to retrieval system in HL7 v2.5 format. Hierarchical variable length database is used to install them. A search for "listing patients who were prescribed Pravastatin (Mevalotin and generic drugs, any titer)" took 1.92 seconds. "Pravastatin (any) prescribed and recorded AST >150 within two weeks" took 112.22 seconds. Searching conditions can be set to be more complex, connected by Boolean operator and/or. This system called D*D is in operation at Hamamatsu University Hospital since August 2002. It is used for 48,518 times (monthly average of 703 searches). Neither searching, nor background export of data from HIS caused delay of routine operating CPOE. Search database outside of routine operating CPOE, with daily export of order data in HL7 v2.5 format, is proved to provide excellent search environment without causing trouble. Hierarchical representation gives high-speed search response, especially with time sequence of events.

  15. Evaluating the Cassandra NoSQL Database Approach for Genomic Data Persistency.

    PubMed

    Aniceto, Rodrigo; Xavier, Rene; Guimarães, Valeria; Hondo, Fernanda; Holanda, Maristela; Walter, Maria Emilia; Lifschitz, Sérgio

    2015-01-01

    Rapid advances in high-throughput sequencing techniques have created interesting computational challenges in bioinformatics. One of them refers to management of massive amounts of data generated by automatic sequencers. We need to deal with the persistency of genomic data, particularly storing and analyzing these large-scale processed data. To find an alternative to the frequently considered relational database model becomes a compelling task. Other data models may be more effective when dealing with a very large amount of nonconventional data, especially for writing and retrieving operations. In this paper, we discuss the Cassandra NoSQL database approach for storing genomic data. We perform an analysis of persistency and I/O operations with real data, using the Cassandra database system. We also compare the results obtained with a classical relational database system and another NoSQL database approach, MongoDB.

  16. A normative database and determinants of lexical retrieval for 186 Arabic nouns: effects of psycholinguistic and morpho-syntactic variables on naming latency.

    PubMed

    Khwaileh, Tariq; Body, Richard; Herbert, Ruth

    2014-12-01

    Research into lexical retrieval requires pictorial stimuli standardised for key psycholinguistic variables. Such databases exist in a number of languages but not in Arabic. In addition there are few studies of the effects of psycholinguistic and morpho-syntactic variables on Arabic lexical retrieval. The current study identified a set of culturally and linguistically appropriate concept labels, and corresponding photographic representations for Levantine Arabic. The set included masculine and feminine nouns, nouns from both types of plural formation (sound and broken), and both rational and irrational nouns. Levantine Arabic speakers provided norms for visual complexity, imageability, age of acquisition, naming latency and name agreement. This delivered a normative database for a set of 186 Arabic nouns. The effects of the morpho-syntactic and the psycholinguistic variables on lexical retrieval were explored using the database. Imageability and age of acquisition were the only significant determinants of successful lexical retrieval in Arabic. None of the other variables, including all the linguistic variables, had any effect on production time. The normative database is available for the use of clinicians and researchers in the Arab world in the domains of speech and language pathology, neurolinguistics and psycholinguistics. The database and the photographic representations will be soon available for free download from the first author's personal webpage or via email.

  17. Content Based Image Retrieval based on Wavelet Transform coefficients distribution

    PubMed Central

    Lamard, Mathieu; Cazuguel, Guy; Quellec, Gwénolé; Bekri, Lynda; Roux, Christian; Cochener, Béatrice

    2007-01-01

    In this paper we propose a content based image retrieval method for diagnosis aid in medical fields. We characterize images without extracting significant features by using distribution of coefficients obtained by building signatures from the distribution of wavelet transform. The research is carried out by computing signature distances between the query and database images. Several signatures are proposed; they use a model of wavelet coefficient distribution. To enhance results, a weighted distance between signatures is used and an adapted wavelet base is proposed. Retrieval efficiency is given for different databases including a diabetic retinopathy, a mammography and a face database. Results are promising: the retrieval efficiency is higher than 95% for some cases using an optimization process. PMID:18003013

  18. A Normative Database and Determinants of Lexical Retrieval for 186 Arabic Nouns: Effects of Psycholinguistic and Morpho-Syntactic Variables on Naming Latency

    ERIC Educational Resources Information Center

    Khwaileh, Tariq; Body, Richard; Herbert, Ruth

    2014-01-01

    Research into lexical retrieval requires pictorial stimuli standardised for key psycholinguistic variables. Such databases exist in a number of languages but not in Arabic. In addition there are few studies of the effects of psycholinguistic and morpho-syntactic variables on Arabic lexical retrieval. The current study identified a set of…

  19. Update on Genomic Databases and Resources at the National Center for Biotechnology Information.

    PubMed

    Tatusova, Tatiana

    2016-01-01

    The National Center for Biotechnology Information (NCBI), as a primary public repository of genomic sequence data, collects and maintains enormous amounts of heterogeneous data. Data for genomes, genes, gene expressions, gene variation, gene families, proteins, and protein domains are integrated with the analytical, search, and retrieval resources through the NCBI website, text-based search and retrieval system, provides a fast and easy way to navigate across diverse biological databases.Comparative genome analysis tools lead to further understanding of evolution processes quickening the pace of discovery. Recent technological innovations have ignited an explosion in genome sequencing that has fundamentally changed our understanding of the biology of living organisms. This huge increase in DNA sequence data presents new challenges for the information management system and the visualization tools. New strategies have been designed to bring an order to this genome sequence shockwave and improve the usability of associated data.

  20. Content-based video indexing and searching with wavelet transformation

    NASA Astrophysics Data System (ADS)

    Stumpf, Florian; Al-Jawad, Naseer; Du, Hongbo; Jassim, Sabah

    2006-05-01

    Biometric databases form an essential tool in the fight against international terrorism, organised crime and fraud. Various government and law enforcement agencies have their own biometric databases consisting of combination of fingerprints, Iris codes, face images/videos and speech records for an increasing number of persons. In many cases personal data linked to biometric records are incomplete and/or inaccurate. Besides, biometric data in different databases for the same individual may be recorded with different personal details. Following the recent terrorist atrocities, law enforcing agencies collaborate more than before and have greater reliance on database sharing. In such an environment, reliable biometric-based identification must not only determine who you are but also who else you are. In this paper we propose a compact content-based video signature and indexing scheme that can facilitate retrieval of multiple records in face biometric databases that belong to the same person even if their associated personal data are inconsistent. We shall assess the performance of our system using a benchmark audio visual face biometric database that has multiple videos for each subject but with different identity claims. We shall demonstrate that retrieval of relatively small number of videos that are nearest, in terms of the proposed index, to any video in the database results in significant proportion of that individual biometric data.

  1. Databases for LDEF results

    NASA Technical Reports Server (NTRS)

    Bohnhoff-Hlavacek, Gail

    1992-01-01

    One of the objectives of the team supporting the LDEF Systems and Materials Special Investigative Groups is to develop databases of experimental findings. These databases identify the hardware flown, summarize results and conclusions, and provide a system for acknowledging investigators, tracing sources of data, and future design suggestions. To date, databases covering the optical experiments, and thermal control materials (chromic acid anodized aluminum, silverized Teflon blankets, and paints) have been developed at Boeing. We used the Filemaker Pro software, the database manager for the Macintosh computer produced by the Claris Corporation. It is a flat, text-retrievable database that provides access to the data via an intuitive user interface, without tedious programming. Though this software is available only for the Macintosh computer at this time, copies of the databases can be saved to a format that is readable on a personal computer as well. Further, the data can be exported to more powerful relational databases, capabilities, and use of the LDEF databases and describe how to get copies of the database for your own research.

  2. Cosmos: An Information Retrieval System that Works.

    ERIC Educational Resources Information Center

    Clay, Katherine; Grossman, Alvin

    1980-01-01

    Briefly described is the County of San Mateo Online System (COSMOS) which was developed and is used by the San Mateo Educational Resources Center (SMERC) to access the Educational Resources Information Center (ERIC) and Fugitive Information Data Organizer (FIDO) databases as well as the curriculum guides housed at SMERC. (TG)

  3. Converting the H. W. Wilson Company Indexes to an Automated System: A Functional Analysis.

    ERIC Educational Resources Information Center

    Regazzi, John J.

    1984-01-01

    Description of the computerized information system that supports the editorial and manufacturing processes involved in creation of Wilson's subject indexes and catalogs includes the major subsystems--online data entry, batch input processing, validation and release, file generation and database management, online and offline retrieval, publication…

  4. Bibliographic Post-Processing with the TIS Intelligent Gateway: Analytical and Communication Capabilities.

    ERIC Educational Resources Information Center

    Burton, Hilary D.

    TIS (Technology Information System) is an intelligent gateway system capable of performing quantitative evaluation and analysis of bibliographic citations using a set of Process functions. Originally developed by Lawrence Livermore National Laboratory (LLNL) to analyze information retrieved from three major federal databases, DOE/RECON,…

  5. Methods and Software for Building Bibliographic Data Bases.

    ERIC Educational Resources Information Center

    Daehn, Ralph M.

    1985-01-01

    This in-depth look at database management systems (DBMS) for microcomputers covers data entry, information retrieval, security, DBMS software and design, and downloading of literature search results. The advantages of in-house systems versus online search vendors are discussed, and specifications of three software packages and 14 sources are…

  6. 75 FR 52797 - State-68, Office of the Coordinator for Reconstruction and Stabilization Records

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-08-27

    ...-military exercises. CATEGORIES OF RECORDS IN THE SYSTEM: Name; Social Security number; date of birth..., RETAINING AND DISPOSING OF RECORDS IN THE SYSTEM: STORAGE: Electronic media. RETRIEVABILITY: Individual's... International Development (USAID) personnel will be allowed to access the CRC database in order to run the...

  7. A Computerized Interactive Vocabulary Development System for Advanced Learners.

    ERIC Educational Resources Information Center

    Kukulska-Hulme, Agnes

    1988-01-01

    Argues that the process of recording newly encountered vocabulary items in a typical language learning situation can be improved through a computerized system of vocabulary storage based on database management software that improves the discovery and recording of meaning, subsequent retrieval of items for productive use, and memory retention.…

  8. Improved Information Retrieval Performance on SQL Database Using Data Adapter

    NASA Astrophysics Data System (ADS)

    Husni, M.; Djanali, S.; Ciptaningtyas, H. T.; Wicaksana, I. G. N. A.

    2018-02-01

    The NoSQL databases, short for Not Only SQL, are increasingly being used as the number of big data applications increases. Most systems still use relational databases (RDBs), but as the number of data increases each year, the system handles big data with NoSQL databases to analyze and access data more quickly. NoSQL emerged as a result of the exponential growth of the internet and the development of web applications. The query syntax in the NoSQL database differs from the SQL database, therefore requiring code changes in the application. Data adapter allow applications to not change their SQL query syntax. Data adapters provide methods that can synchronize SQL databases with NotSQL databases. In addition, the data adapter provides an interface which is application can access to run SQL queries. Hence, this research applied data adapter system to synchronize data between MySQL database and Apache HBase using direct access query approach, where system allows application to accept query while synchronization process in progress. From the test performed using data adapter, the results obtained that the data adapter can synchronize between SQL databases, MySQL, and NoSQL database, Apache HBase. This system spends the percentage of memory resources in the range of 40% to 60%, and the percentage of processor moving from 10% to 90%. In addition, from this system also obtained the performance of database NoSQL better than SQL database.

  9. Compressed-domain video indexing techniques using DCT and motion vector information in MPEG video

    NASA Astrophysics Data System (ADS)

    Kobla, Vikrant; Doermann, David S.; Lin, King-Ip; Faloutsos, Christos

    1997-01-01

    Development of various multimedia applications hinges on the availability of fast and efficient storage, browsing, indexing, and retrieval techniques. Given that video is typically stored efficiently in a compressed format, if we can analyze the compressed representation directly, we can avoid the costly overhead of decompressing and operating at the pixel level. Compressed domain parsing of video has been presented in earlier work where a video clip is divided into shots, subshots, and scenes. In this paper, we describe key frame selection, feature extraction, and indexing and retrieval techniques that are directly applicable to MPEG compressed video. We develop a frame-type independent representation of the various types of frames present in an MPEG video in which al frames can be considered equivalent. Features are derived from the available DCT, macroblock, and motion vector information and mapped to a low-dimensional space where they can be accessed with standard database techniques. The spatial information is used as primary index while the temporal information is used to enhance the robustness of the system during the retrieval process. The techniques presented enable fast archiving, indexing, and retrieval of video. Our operational prototype typically takes a fraction of a second to retrieve similar video scenes from our database, with over 95% success.

  10. Rasdaman for Big Spatial Raster Data

    NASA Astrophysics Data System (ADS)

    Hu, F.; Huang, Q.; Scheele, C. J.; Yang, C. P.; Yu, M.; Liu, K.

    2015-12-01

    Spatial raster data have grown exponentially over the past decade. Recent advancements on data acquisition technology, such as remote sensing, have allowed us to collect massive observation data of various spatial resolution and domain coverage. The volume, velocity, and variety of such spatial data, along with the computational intensive nature of spatial queries, pose grand challenge to the storage technologies for effective big data management. While high performance computing platforms (e.g., cloud computing) can be used to solve the computing-intensive issues in big data analysis, data has to be managed in a way that is suitable for distributed parallel processing. Recently, rasdaman (raster data manager) has emerged as a scalable and cost-effective database solution to store and retrieve massive multi-dimensional arrays, such as sensor, image, and statistics data. Within this paper, the pros and cons of using rasdaman to manage and query spatial raster data will be examined and compared with other common approaches, including file-based systems, relational databases (e.g., PostgreSQL/PostGIS), and NoSQL databases (e.g., MongoDB and Hive). Earth Observing System (EOS) data collected from NASA's Atmospheric Scientific Data Center (ASDC) will be used and stored in these selected database systems, and a set of spatial and non-spatial queries will be designed to benchmark their performance on retrieving large-scale, multi-dimensional arrays of EOS data. Lessons learnt from using rasdaman will be discussed as well.

  11. A Data Management System for International Space Station Simulation Tools

    NASA Technical Reports Server (NTRS)

    Betts, Bradley J.; DelMundo, Rommel; Elcott, Sharif; McIntosh, Dawn; Niehaus, Brian; Papasin, Richard; Mah, Robert W.; Clancy, Daniel (Technical Monitor)

    2002-01-01

    Groups associated with the design, operational, and training aspects of the International Space Station make extensive use of modeling and simulation tools. Users of these tools often need to access and manipulate large quantities of data associated with the station, ranging from design documents to wiring diagrams. Retrieving and manipulating this data directly within the simulation and modeling environment can provide substantial benefit to users. An approach for providing these kinds of data management services, including a database schema and class structure, is presented. Implementation details are also provided as a data management system is integrated into the Intelligent Virtual Station, a modeling and simulation tool developed by the NASA Ames Smart Systems Research Laboratory. One use of the Intelligent Virtual Station is generating station-related training procedures in a virtual environment, The data management component allows users to quickly and easily retrieve information related to objects on the station, enhancing their ability to generate accurate procedures. Users can associate new information with objects and have that information stored in a database.

  12. PHASS99: A software program for retrieving and decoding the radiometric ages of igneous rocks from the international database IGBADAT

    NASA Astrophysics Data System (ADS)

    Al-Mishwat, Ali T.

    2016-05-01

    PHASS99 is a FORTRAN program designed to retrieve and decode radiometric and other physical age information of igneous rocks contained in the international database IGBADAT (Igneous Base Data File). In the database, ages are stored in a proprietary format using mnemonic representations. The program can handle up to 99 ages in an igneous rock specimen and caters to forty radiometric age systems. The radiometric age alphanumeric strings assigned to each specimen description in the database consist of four components: the numeric age and its exponential modifier, a four-character mnemonic method identification, a two-character mnemonic name of analysed material, and the reference number in the rock group bibliography vector. For each specimen, the program searches for radiometric age strings, extracts them, parses them, decodes the different age components, and converts them to high-level English equivalents. IGBADAT and similarly-structured files are used for input. The output includes three files: a flat raw ASCII text file containing retrieved radiometric age information, a generic spreadsheet-compatible file for data import to spreadsheets, and an error file. PHASS99 builds on the old program TSTPHA (Test Physical Age) decoder program and expands greatly its capabilities. PHASS99 is simple, user friendly, fast, efficient, and does not require users to have knowledge of programing.

  13. Reference System of DNA and Protein Sequences on CD-ROM

    NASA Astrophysics Data System (ADS)

    Nasu, Hisanori; Ito, Toshiaki

    DNASIS-DBREF31 is a database for DNA and Protein sequences in the form of optical Compact Disk (CD) ROM, developed and commercialized by Hitachi Software Engineering Co., Ltd. Both nucleic acid base sequences and protein amino acid sequences can be retrieved from a single CD-ROM. Existing database is offered in the form of on-line service, floppy disks, or magnetic tape, all of which have some problems or other, such as usability or storage capacity. DNASIS-DBREF31 newly adopt a CD-ROM as a database device to realize a mass storage and personal use of the database.

  14. Compilation of the data-base of the star catalogue by ADABAS.

    NASA Astrophysics Data System (ADS)

    Ishikawa, T.

    A data-base of the FK4 Star Catalogue is compiled by using HITAC M-280H in the Computer Center of Tokyo University and a commercial data-base management system (DBMS) ADABAS. The purpose of this attempt is to examine whether the ADABAS, which could be regarded as a representative of the currently available DBMS's developed majorly for business and information retrieval purposes, proves itself useful for handling mass numerical data like the star catalogue data. It is concluded that the data-base could really be a convenient way for storing and utilizing the star catalogue data.

  15. Psychophysical studies of the performance of an image database retrieval system

    NASA Astrophysics Data System (ADS)

    Papathomas, Thomas V.; Conway, Tiffany E.; Cox, Ingemar J.; Ghosn, Joumana; Miller, Matt L.; Minka, Thomas P.; Yianilos, Peter N.

    1998-07-01

    We describe psychophysical experiments conducted to study PicHunter, a content-based image retrieval (CBIR) system. Experiment 1 studies the importance of using (a) semantic information, (2) memory of earlier input and (3) relative, rather than absolute, judgements of image similarity. The target testing paradigm is used in which a user must search for an image identical to a target. We find that the best performance comes from a version of PicHunter that uses only semantic cues, with memory and relative similarity judgements. Second best is use of both pictorial and semantic cues, with memory and relative similarity judgements. Most reports of CBIR systems provide only qualitative measures of performance based on how similar retrieved images are to a target. Experiment 2 puts PicHunter into this context with a more rigorous test. We first establish a baseline for our database by measuring the time required to find an image that is similar to a target when the images are presented in random order. Although PicHunter's performance is measurably better than this, the test is weak because even random presentation of images yields reasonably short search times. This casts doubt on the strength of results given in other reports where no baseline is established.

  16. Nosql for Storage and Retrieval of Large LIDAR Data Collections

    NASA Astrophysics Data System (ADS)

    Boehm, J.; Liu, K.

    2015-08-01

    Developments in LiDAR technology over the past decades have made LiDAR to become a mature and widely accepted source of geospatial information. This in turn has led to an enormous growth in data volume. The central idea for a file-centric storage of LiDAR point clouds is the observation that large collections of LiDAR data are typically delivered as large collections of files, rather than single files of terabyte size. This split of the dataset, commonly referred to as tiling, was usually done to accommodate a specific processing pipeline. It makes therefore sense to preserve this split. A document oriented NoSQL database can easily emulate this data partitioning, by representing each tile (file) in a separate document. The document stores the metadata of the tile. The actual files are stored in a distributed file system emulated by the NoSQL database. We demonstrate the use of MongoDB a highly scalable document oriented NoSQL database for storing large LiDAR files. MongoDB like any NoSQL database allows for queries on the attributes of the document. As a specialty MongoDB also allows spatial queries. Hence we can perform spatial queries on the bounding boxes of the LiDAR tiles. Inserting and retrieving files on a cloud-based database is compared to native file system and cloud storage transfer speed.

  17. Hospital nurses' information retrieval behaviours in relation to evidence based nursing: a literature review.

    PubMed

    Alving, Berit Elisabeth; Christensen, Janne Buck; Thrysøe, Lars

    2018-03-01

    The purpose of this literature review is to provide an overview of the information retrieval behaviour of clinical nurses, in terms of the use of databases and other information resources and their frequency of use. Systematic searches carried out in five databases and handsearching were used to identify the studies from 2010 to 2016, with a populations, exposures and outcomes (PEO) search strategy, focusing on the question: In which databases or other information resources do hospital nurses search for evidence based information, and how often? Of 5272 titles retrieved based on the search strategy, only nine studies fulfilled the criteria for inclusion. The studies are from the United States, Canada, Taiwan and Nigeria. The results show that hospital nurses' primary choice of source for evidence based information is Google and peers, while bibliographic databases such as PubMed are secondary choices. Data on frequency are only included in four of the studies, and data are heterogenous. The reasons for choosing Google and peers are primarily lack of time; lack of information; lack of retrieval skills; or lack of training in database searching. Only a few studies are published on clinical nurses' retrieval behaviours, and more studies are needed from Europe and Australia. © 2018 Health Libraries Group.

  18. Intelligent Information Retrieval for a Multimedia Database Using Captions

    DTIC Science & Technology

    1992-07-23

    The user was allowed to retrieve any of several multimedia types depending on the descriptors entered. An example mentioned was the assembly of a...statistics showed some performance improvements over a keyword search. Similar type work was described by Wong eL al (1987) where a vector space representation...keyword) lists for searching the lexicon (a syntactic parser is not used); a type hierarchy of terms was used in the process. The system then checked the

  19. Environmental Factor{trademark} system: RCRA hazardous waste handler information

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    NONE

    1999-03-01

    Environmental Factor{trademark} RCRA Hazardous Waste Handler Information on CD-ROM unleashes the invaluable information found in two key EPA data sources on hazardous waste handlers and offers cradle-to-grave waste tracking. It`s easy to search and display: (1) Permit status, design capacity and compliance history for facilities found in the EPA Resource Conservation and Recovery Information System (RCRIS) program tracking database; (2) Detailed information on hazardous wastes generation, management and minimization by companies who are large quantity generators, and (3) Data on the waste management practices of treatment, storage and disposal (TSD) facilities from the EPA Biennial Reporting System which is collectedmore » every other year. Environmental Factor`s powerful database retrieval system lets you: (1) Search for RCRA facilities by permit type, SIC code, waste codes, corrective action or violation information, TSD status, generator and transporter status and more; (2) View compliance information -- dates of evaluation, violation, enforcement and corrective action; (3) Lookup facilities by waste processing categories of marketing, transporting, processing and energy recovery; (4) Use owner/operator information and names, titles and telephone numbers of project managers for prospecting; and (5) Browse detailed data on TSD facility and large quantity generators` activities such as onsite waste treatment, disposal, or recycling, offsite waste received, and waste generation and management. The product contains databases, search and retrieval software on two CD-ROMs, an installation diskette and User`s Guide. Environmental Factor has online context-sensitive help from any screen and a printed User`s Guide describing installation and step-by-step procedures for searching, retrieving and exporting. Hotline support is also available for no additional charge.« less

  20. How to Handle the Avalanche of Online Documentation.

    ERIC Educational Resources Information Center

    Nolan, Maureen P.

    1981-01-01

    The method of handling the printed documentation associated with online information retrieval, which is described, involves the use of a series of separate but related files: database files, system files, network files, index sheets, and equipment files. (FM)

  1. An improved real time image detection system for elephant intrusion along the forest border areas.

    PubMed

    Sugumar, S J; Jayaparvathy, R

    2014-01-01

    Human-elephant conflict is a major problem leading to crop damage, human death and injuries caused by elephants, and elephants being killed by humans. In this paper, we propose an automated unsupervised elephant image detection system (EIDS) as a solution to human-elephant conflict in the context of elephant conservation. The elephant's image is captured in the forest border areas and is sent to a base station via an RF network. The received image is decomposed using Haar wavelet to obtain multilevel wavelet coefficients, with which we perform image feature extraction and similarity match between the elephant query image and the database image using image vision algorithms. A GSM message is sent to the forest officials indicating that an elephant has been detected in the forest border and is approaching human habitat. We propose an optimized distance metric to improve the image retrieval time from the database. We compare the optimized distance metric with the popular Euclidean and Manhattan distance methods. The proposed optimized distance metric retrieves more images with lesser retrieval time than the other distance metrics which makes the optimized distance method more efficient and reliable.

  2. Query by forms: User-oriented relational database retrieving system and its application in analysis of experiment data

    NASA Astrophysics Data System (ADS)

    Skotniczny, Zbigniew

    1989-12-01

    The Query by Forms (QbF) system is a user-oriented interactive tool for querying large relational database with minimal queries difinition cost. The system was worked out under the assumption that user's time and effort for defining needed queries is the most severe bottleneck. The system may be applied in any Rdb/VMS databases system and is recommended for specific information systems of any project where end-user queries cannot be foreseen. The tool is dedicated to specialist of an application domain who have to analyze data maintained in database from any needed point of view, who do not need to know commercial databases languages. The paper presents the system developed as a compromise between its functionality and usability. User-system communication via a menu-driven "tree-like" structure of screen-forms which produces a query difinition and execution is discussed in detail. Output of query results (printed reports and graphics) is also discussed. Finally the paper shows one application of QbF to a HERA-project.

  3. Computer aided systems human engineering: A hypermedia tool

    NASA Technical Reports Server (NTRS)

    Boff, Kenneth R.; Monk, Donald L.; Cody, William J.

    1992-01-01

    The Computer Aided Systems Human Engineering (CASHE) system, Version 1.0, is a multimedia ergonomics database on CD-ROM for the Apple Macintosh II computer, being developed for use by human system designers, educators, and researchers. It will initially be available on CD-ROM and will allow users to access ergonomics data and models stored electronically as text, graphics, and audio. The CASHE CD-ROM, Version 1.0 will contain the Boff and Lincoln (1988) Engineering Data Compendium, MIL-STD-1472D and a unique, interactive simulation capability, the Perception and Performance Prototyper. Its features also include a specialized data retrieval, scaling, and analysis capability and the state of the art in information retrieval, browsing, and navigation.

  4. Image/text automatic indexing and retrieval system using context vector approach

    NASA Astrophysics Data System (ADS)

    Qing, Kent P.; Caid, William R.; Ren, Clara Z.; McCabe, Patrick

    1995-11-01

    Thousands of documents and images are generated daily both on and off line on the information superhighway and other media. Storage technology has improved rapidly to handle these data but indexing this information is becoming very costly. HNC Software Inc. has developed a technology for automatic indexing and retrieval of free text and images. This technique is demonstrated and is based on the concept of `context vectors' which encode a succinct representation of the associated text and features of sub-image. In this paper, we will describe the Automated Librarian System which was designed for free text indexing and the Image Content Addressable Retrieval System (ICARS) which extends the technique from the text domain into the image domain. Both systems have the ability to automatically assign indices for a new document and/or image based on the content similarities in the database. ICARS also has the capability to retrieve images based on similarity of content using index terms, text description, and user-generated images as a query without performing segmentation or object recognition.

  5. Database resources of the National Center for Biotechnology Information.

    PubMed

    2016-01-04

    The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank(®) nucleic acid sequence database and the PubMed database of citations and abstracts for published life science journals. Additional NCBI resources focus on literature (PubMed Central (PMC), Bookshelf and PubReader), health (ClinVar, dbGaP, dbMHC, the Genetic Testing Registry, HIV-1/Human Protein Interaction Database and MedGen), genomes (BioProject, Assembly, Genome, BioSample, dbSNP, dbVar, Epigenomics, the Map Viewer, Nucleotide, Probe, RefSeq, Sequence Read Archive, the Taxonomy Browser and the Trace Archive), genes (Gene, Gene Expression Omnibus (GEO), HomoloGene, PopSet and UniGene), proteins (Protein, the Conserved Domain Database (CDD), COBALT, Conserved Domain Architecture Retrieval Tool (CDART), the Molecular Modeling Database (MMDB) and Protein Clusters) and chemicals (Biosystems and the PubChem suite of small molecule databases). The Entrez system provides search and retrieval operations for most of these databases. Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized datasets. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov. Published by Oxford University Press on behalf of Nucleic Acids Research 2015. This work is written by (a) US Government employee(s) and is in the public domain in the US.

  6. Database resources of the National Center for Biotechnology Information.

    PubMed

    2015-01-01

    The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank(®) nucleic acid sequence database and the PubMed database of citations and abstracts for published life science journals. Additional NCBI resources focus on literature (Bookshelf, PubMed Central (PMC) and PubReader); medical genetics (ClinVar, dbMHC, the Genetic Testing Registry, HIV-1/Human Protein Interaction Database and MedGen); genes and genomics (BioProject, BioSample, dbSNP, dbVar, Epigenomics, Gene, Gene Expression Omnibus (GEO), Genome, HomoloGene, the Map Viewer, Nucleotide, PopSet, Probe, RefSeq, Sequence Read Archive, the Taxonomy Browser, Trace Archive and UniGene); and proteins and chemicals (Biosystems, COBALT, the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART), the Molecular Modeling Database (MMDB), Protein Clusters, Protein and the PubChem suite of small molecule databases). The Entrez system provides search and retrieval operations for many of these databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page at http://www.ncbi.nlm.nih.gov. Published by Oxford University Press on behalf of Nucleic Acids Research 2014. This work is written by (a) US Government employee(s) and is in the public domain in the US.

  7. The Protein Information Resource: an integrated public resource of functional annotation of proteins

    PubMed Central

    Wu, Cathy H.; Huang, Hongzhan; Arminski, Leslie; Castro-Alvear, Jorge; Chen, Yongxing; Hu, Zhang-Zhi; Ledley, Robert S.; Lewis, Kali C.; Mewes, Hans-Werner; Orcutt, Bruce C.; Suzek, Baris E.; Tsugita, Akira; Vinayaka, C. R.; Yeh, Lai-Su L.; Zhang, Jian; Barker, Winona C.

    2002-01-01

    The Protein Information Resource (PIR) serves as an integrated public resource of functional annotation of protein data to support genomic/proteomic research and scientific discovery. The PIR, in collaboration with the Munich Information Center for Protein Sequences (MIPS) and the Japan International Protein Information Database (JIPID), produces the PIR-International Protein Sequence Database (PSD), the major annotated protein sequence database in the public domain, containing about 250 000 proteins. To improve protein annotation and the coverage of experimentally validated data, a bibliography submission system is developed for scientists to submit, categorize and retrieve literature information. Comprehensive protein information is available from iProClass, which includes family classification at the superfamily, domain and motif levels, structural and functional features of proteins, as well as cross-references to over 40 biological databases. To provide timely and comprehensive protein data with source attribution, we have introduced a non-redundant reference protein database, PIR-NREF. The database consists of about 800 000 proteins collected from PIR-PSD, SWISS-PROT, TrEMBL, GenPept, RefSeq and PDB, with composite protein names and literature data. To promote database interoperability, we provide XML data distribution and open database schema, and adopt common ontologies. The PIR web site (http://pir.georgetown.edu/) features data mining and sequence analysis tools for information retrieval and functional identification of proteins based on both sequence and annotation information. The PIR databases and other files are also available by FTP (ftp://nbrfa.georgetown.edu/pir_databases). PMID:11752247

  8. A new method for the automatic retrieval of medical cases based on the RadLex ontology.

    PubMed

    Spanier, A B; Cohen, D; Joskowicz, L

    2017-03-01

    The goal of medical case-based image retrieval (M-CBIR) is to assist radiologists in the clinical decision-making process by finding medical cases in large archives that most resemble a given case. Cases are described by radiology reports comprised of radiological images and textual information on the anatomy and pathology findings. The textual information, when available in standardized terminology, e.g., the RadLex ontology, and used in conjunction with the radiological images, provides a substantial advantage for M-CBIR systems. We present a new method for incorporating textual radiological findings from medical case reports in M-CBIR. The input is a database of medical cases, a query case, and the number of desired relevant cases. The output is an ordered list of the most relevant cases in the database. The method is based on a new case formulation, the Augmented RadLex Graph and an Anatomy-Pathology List. It uses a new case relatedness metric [Formula: see text] that prioritizes more specific medical terms in the RadLex tree over less specific ones and that incorporates the length of the query case. An experimental study on 8 CT queries from the 2015 VISCERAL 3D Case Retrieval Challenge database consisting of 1497 volumetric CT scans shows that our method has accuracy rates of 82 and 70% on the first 10 and 30 most relevant cases, respectively, thereby outperforming six other methods. The increasing amount of medical imaging data acquired in clinical practice constitutes a vast database of untapped diagnostically relevant information. This paper presents a new hybrid approach to retrieving the most relevant medical cases based on textual and image information.

  9. ACMES: fast multiple-genome searches for short repeat sequences with concurrent cross-species information retrieval

    PubMed Central

    Reneker, Jeff; Shyu, Chi-Ren; Zeng, Peiyu; Polacco, Joseph C.; Gassmann, Walter

    2004-01-01

    We have developed a web server for the life sciences community to use to search for short repeats of DNA sequence of length between 3 and 10 000 bases within multiple species. This search employs a unique and fast hash function approach. Our system also applies information retrieval algorithms to discover knowledge of cross-species conservation of repeat sequences. Furthermore, we have incorporated a part of the Gene Ontology database into our information retrieval algorithms to broaden the coverage of the search. Our web server and tutorial can be found at http://acmes.rnet.missouri.edu. PMID:15215469

  10. A statistical retrieval of cloud parameters for the millimeter wave Ice Cloud Imager on board MetOp-SG

    NASA Astrophysics Data System (ADS)

    Prigent, Catherine; Wang, Die; Aires, Filipe; Jimenez, Carlos

    2017-04-01

    The meteorological observations from satellites in the microwave domain are currently limited to below 190 GHz. However, the next generation of European Organisation for the Exploitation of Meteorological Satellites (EUMETSAT) Polar System-Second Generation-EPS-SG will carry an instrument, the Ice Cloud Imager (ICI), with frequencies up to 664 GHz, to improve the characterization of the cloud frozen phase. In this paper, a statistical retrieval of cloud parameters for ICI is developed, trained on a synthetic database derived from the coupling of a mesoscale cloud model and radiative transfer calculations. The hydrometeor profiles simulated with the Weather Research and Forecasting model (WRF) for twelve diverse European mid-latitude situations are used to simulate the brightness temperatures with the Atmospheric Radiative Transfer Simulator (ARTS) to prepare the retrieval database. The WRF+ARTS simulations have been compared to the Special Sensor Microwave Imager/Sounder (SSMIS) observations up to 190 GHz: this successful evaluation gives us confidence in the simulations at the ICI channels from 183 to 664 GHz. Statistical analyses have been performed on this simulated retrieval database, showing that it is not only physically realistic but also statistically satisfactory for retrieval purposes. A first Neural Network (NN) classifier is used to detect the cloud presence. A second NN is developed to retrieve the liquid and ice integrated cloud quantities over sea and land separately. The detection and retrieval of the hydrometeor quantities (i.e., ice, snow, graupel, rain, and liquid cloud) are performed with ICI-only, and with ICI combined with observations from the MicroWave Imager (MWI, with frequencies from 19 to 190 GHz, also on board MetOp-SG). The ICI channels have been optimized for the detection and quantification of the cloud frozen phases: adding the MWI channels improves the performance of the vertically integrated hydrometeor contents, especially for the cloud liquid phases. The relative error for the retrieved integrated frozen water content (FWP, i.e., ice+snow+graupel) is below 40% for 0.1kg/m2 < FWP < 0.5kg/m2 and below 20% for FWP > 0.5 kg/m2.

  11. Development of a database system for operational use in the selection of titanium alloys

    NASA Astrophysics Data System (ADS)

    Han, Yuan-Fei; Zeng, Wei-Dong; Sun, Yu; Zhao, Yong-Qing

    2011-08-01

    The selection of titanium alloys has become a complex decision-making task due to the growing number of creation and utilization for titanium alloys, with each having its own characteristics, advantages, and limitations. In choosing the most appropriate titanium alloys, it is very essential to offer a reasonable and intelligent service for technical engineers. One possible solution of this problem is to develop a database system (DS) to help retrieve rational proposals from different databases and information sources and analyze them to provide useful and explicit information. For this purpose, a design strategy of the fuzzy set theory is proposed, and a distributed database system is developed. Through ranking of the candidate titanium alloys, the most suitable material is determined. It is found that the selection results are in good agreement with the practical situation.

  12. External access to ALICE controls conditions data

    NASA Astrophysics Data System (ADS)

    Jadlovský, J.; Jadlovská, A.; Sarnovský, J.; Jajčišin, Š.; Čopík, M.; Jadlovská, S.; Papcun, P.; Bielek, R.; Čerkala, J.; Kopčík, M.; Chochula, P.; Augustinus, A.

    2014-06-01

    ALICE Controls data produced by commercial SCADA system WINCCOA is stored in ORACLE database on the private experiment network. The SCADA system allows for basic access and processing of the historical data. More advanced analysis requires tools like ROOT and needs therefore a separate access method to the archives. The present scenario expects that detector experts create simple WINCCOA scripts, which retrieves and stores data in a form usable for further studies. This relatively simple procedure generates a lot of administrative overhead - users have to request the data, experts needed to run the script, the results have to be exported outside of the experiment network. The new mechanism profits from database replica, which is running on the CERN campus network. Access to this database is not restricted and there is no risk of generating a heavy load affecting the operation of the experiment. The developed tools presented in this paper allow for access to this data. The users can use web-based tools to generate the requests, consisting of the data identifiers and period of time of interest. The administrators maintain full control over the data - an authorization and authentication mechanism helps to assign privileges to selected users and restrict access to certain groups of data. Advanced caching mechanism allows the user to profit from the presence of already processed data sets. This feature significantly reduces the time required for debugging as the retrieval of raw data can last tens of minutes. A highly configurable client allows for information retrieval bypassing the interactive interface. This method is for example used by ALICE Offline to extract operational conditions after a run is completed. Last but not least, the software can be easily adopted to any underlying database structure and is therefore not limited to WINCCOA.

  13. MPEG-7 audio-visual indexing test-bed for video retrieval

    NASA Astrophysics Data System (ADS)

    Gagnon, Langis; Foucher, Samuel; Gouaillier, Valerie; Brun, Christelle; Brousseau, Julie; Boulianne, Gilles; Osterrath, Frederic; Chapdelaine, Claude; Dutrisac, Julie; St-Onge, Francis; Champagne, Benoit; Lu, Xiaojian

    2003-12-01

    This paper reports on the development status of a Multimedia Asset Management (MAM) test-bed for content-based indexing and retrieval of audio-visual documents within the MPEG-7 standard. The project, called "MPEG-7 Audio-Visual Document Indexing System" (MADIS), specifically targets the indexing and retrieval of video shots and key frames from documentary film archives, based on audio-visual content like face recognition, motion activity, speech recognition and semantic clustering. The MPEG-7/XML encoding of the film database is done off-line. The description decomposition is based on a temporal decomposition into visual segments (shots), key frames and audio/speech sub-segments. The visible outcome will be a web site that allows video retrieval using a proprietary XQuery-based search engine and accessible to members at the Canadian National Film Board (NFB) Cineroute site. For example, end-user will be able to ask to point on movie shots in the database that have been produced in a specific year, that contain the face of a specific actor who tells a specific word and in which there is no motion activity. Video streaming is performed over the high bandwidth CA*net network deployed by CANARIE, a public Canadian Internet development organization.

  14. Evaluating the Cassandra NoSQL Database Approach for Genomic Data Persistency

    PubMed Central

    Aniceto, Rodrigo; Xavier, Rene; Guimarães, Valeria; Hondo, Fernanda; Holanda, Maristela; Walter, Maria Emilia; Lifschitz, Sérgio

    2015-01-01

    Rapid advances in high-throughput sequencing techniques have created interesting computational challenges in bioinformatics. One of them refers to management of massive amounts of data generated by automatic sequencers. We need to deal with the persistency of genomic data, particularly storing and analyzing these large-scale processed data. To find an alternative to the frequently considered relational database model becomes a compelling task. Other data models may be more effective when dealing with a very large amount of nonconventional data, especially for writing and retrieving operations. In this paper, we discuss the Cassandra NoSQL database approach for storing genomic data. We perform an analysis of persistency and I/O operations with real data, using the Cassandra database system. We also compare the results obtained with a classical relational database system and another NoSQL database approach, MongoDB. PMID:26558254

  15. Network Configuration of Oracle and Database Programming Using SQL

    NASA Technical Reports Server (NTRS)

    Davis, Melton; Abdurrashid, Jibril; Diaz, Philip; Harris, W. C.

    2000-01-01

    A database can be defined as a collection of information organized in such a way that it can be retrieved and used. A database management system (DBMS) can further be defined as the tool that enables us to manage and interact with the database. The Oracle 8 Server is a state-of-the-art information management environment. It is a repository for very large amounts of data, and gives users rapid access to that data. The Oracle 8 Server allows for sharing of data between applications; the information is stored in one place and used by many systems. My research will focus primarily on SQL (Structured Query Language) programming. SQL is the way you define and manipulate data in Oracle's relational database. SQL is the industry standard adopted by all database vendors. When programming with SQL, you work on sets of data (i.e., information is not processed one record at a time).

  16. Interactive Exploration for Continuously Expanding Neuron Databases.

    PubMed

    Li, Zhongyu; Metaxas, Dimitris N; Lu, Aidong; Zhang, Shaoting

    2017-02-15

    This paper proposes a novel framework to help biologists explore and analyze neurons based on retrieval of data from neuron morphological databases. In recent years, the continuously expanding neuron databases provide a rich source of information to associate neuronal morphologies with their functional properties. We design a coarse-to-fine framework for efficient and effective data retrieval from large-scale neuron databases. In the coarse-level, for efficiency in large-scale, we employ a binary coding method to compress morphological features into binary codes of tens of bits. Short binary codes allow for real-time similarity searching in Hamming space. Because the neuron databases are continuously expanding, it is inefficient to re-train the binary coding model from scratch when adding new neurons. To solve this problem, we extend binary coding with online updating schemes, which only considers the newly added neurons and update the model on-the-fly, without accessing the whole neuron databases. In the fine-grained level, we introduce domain experts/users in the framework, which can give relevance feedback for the binary coding based retrieval results. This interactive strategy can improve the retrieval performance through re-ranking the above coarse results, where we design a new similarity measure and take the feedback into account. Our framework is validated on more than 17,000 neuron cells, showing promising retrieval accuracy and efficiency. Moreover, we demonstrate its use case in assisting biologists to identify and explore unknown neurons. Copyright © 2017 Elsevier Inc. All rights reserved.

  17. Using GenBank.

    PubMed

    Wheeler, David

    2007-01-01

    GenBank(R) is a comprehensive database of publicly available DNA sequences for more than 205,000 named organisms and for more than 60,000 within the embryophyta, obtained through submissions from individual laboratories and batch submissions from large-scale sequencing projects. Daily data exchange with the European Molecular Biology Laboratory (EMBL) in Europe and the DNA Data Bank of Japan ensures worldwide coverage. GenBank is accessible through the National Center for Biotechnology Information (NCBI) retrieval system, Entrez, which integrates data from the major DNA and protein sequence databases with taxonomy, genome, mapping, protein structure, and domain information and the biomedical journal literature through PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available through FTP. GenBank usage scenarios ranging from local analyses of the data available through FTP to online analyses supported by the NCBI Web-based tools are discussed. To access GenBank and its related retrieval and analysis services, go to the NCBI Homepage at http://www.ncbi.nlm.nih.gov.

  18. GenBank.

    PubMed

    Benson, Dennis A; Karsch-Mizrachi, Ilene; Lipman, David J; Ostell, James; Sayers, Eric W

    2011-01-01

    GenBank® is a comprehensive database that contains publicly available nucleotide sequences for more than 380,000 organisms named at the genus level or lower, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects, including whole genome shotgun (WGS) and environmental sampling projects. Most submissions are made using the web-based BankIt or standalone Sequin programs, and accession numbers are assigned by GenBank staff upon receipt. Daily data exchange with the European Nucleotide Archive (ENA) and the DNA Data Bank of Japan (DDBJ) ensures worldwide coverage. GenBank is accessible through the NCBI Entrez retrieval system that integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, begin at the NCBI Homepage: www.ncbi.nlm.nih.gov.

  19. Total Bregman Divergence and its Applications to Shape Retrieval.

    PubMed

    Liu, Meizhu; Vemuri, Baba C; Amari, Shun-Ichi; Nielsen, Frank

    2010-01-01

    Shape database search is ubiquitous in the world of biometric systems, CAD systems etc. Shape data in these domains is experiencing an explosive growth and usually requires search of whole shape databases to retrieve the best matches with accuracy and efficiency for a variety of tasks. In this paper, we present a novel divergence measure between any two given points in [Formula: see text] or two distribution functions. This divergence measures the orthogonal distance between the tangent to the convex function (used in the definition of the divergence) at one of its input arguments and its second argument. This is in contrast to the ordinate distance taken in the usual definition of the Bregman class of divergences [4]. We use this orthogonal distance to redefine the Bregman class of divergences and develop a new theory for estimating the center of a set of vectors as well as probability distribution functions. The new class of divergences are dubbed the total Bregman divergence (TBD). We present the l 1 -norm based TBD center that is dubbed the t-center which is then used as a cluster center of a class of shapes The t-center is weighted mean and this weight is small for noise and outliers. We present a shape retrieval scheme using TBD and the t-center for representing the classes of shapes from the MPEG-7 database and compare the results with other state-of-the-art methods in literature.

  20. Impact of line parameter database, continuum absorption, full grind configuration, and L1B update on GOSAT TIR methane retrieval

    NASA Astrophysics Data System (ADS)

    Yamada, A.; Saitoh, N.; Nonogaki, R.; Imasu, R.; Shiomi, K.; Kuze, A.

    2016-12-01

    The thermal infrared (TIR) band of Thermal and Near-infrared Sensor for Carbon Observation Fourier Transform Spectrometer (TANSO-FTS) onboard Greenhouse Gases Observing Satellite (GOSAT) observes CH4 profile at wavenumber range from 1210 cm-1 to 1360 cm-1 including CH4 ν4 band. The current retrieval algorithm (V1.0) uses LBLRTM V12.1 with AER V3.1 line database to calculate optical depth. LBLRTM V12.1 include MT_CKD 2.5.2 model to calculate continuum absorption. The continuum absorption has large uncertainty, especially temperature dependent coefficient, between BPS model and MT_CKD model in the wavenumber region of 1210-1250 cm-1(Paynter and Ramaswamy, 2014). The purpose of this study is to assess the impact on CH4 retrieval from the line parameter databases and the uncertainty of continuum absorption. We used AER v1.0 database, HITRAN2004 database, HITRAN2008 database, AER V3.2 database, and HITRAN2012 database (Rothman et al. 2005, 2009, and 2013. Clough et al., 2005). AER V1.0 database is based on HITRAN2000. The CH4 line parameters of AER V3.1 and V3.2 databases are developed from HITRAN2008 including updates until May 2009 with line mixing parameters. We compared the retrieved CH4 with the HIPPO CH4 observation (Wofsy et al., 2012). The difference of AER V3.2 was the smallest and 24.1 ± 45.9 ppbv. The differences of AER V1.0, HITRAN2004, HITRAN2008, and HITRAN2012 were 35.6 ± 46.5 ppbv, 37.6 ± 46.3 ppbv, 32.1 ± 46.1 ppbv, and 35.2 ± 46.0 ppbv, respectively. Compare AER V3.2 case to HITRAN2008 case, the line coupling effect reduced difference by 8.0 ppbv. Median values of Residual difference from HITRAN2008 to AER V1.0, HITRAN2004, AER V3.2, and HITRAN2012 were 0.6 K, 0.1 K, -0.08 K, and 0.08 K, respectively, while median values of transmittance difference were less than 0.0003 and transmittance differences have small wavenumber dependence. We also discuss the retrieval error from the uncertainty of the continuum absorption, the test of full grid configuration for retrieval, and the retrieval results using GOSAT TIR L1B V203203, which are sample products to evaluate the next level 1B algorithm.

  1. Review of Research on Student-Facing Learning Analytics Dashboards and Educational Recommender Systems

    ERIC Educational Resources Information Center

    Bodily, Robert; Verbert, Katrien

    2017-01-01

    This article is a comprehensive literature review of student-facing learning analytics reporting systems that track learning analytics data and report it directly to students. This literature review builds on four previously conducted literature reviews in similar domains. Out of the 945 articles retrieved from databases and journals, 93 articles…

  2. Global Precipitation Estimates from Cross-Track Passive Microwave Observations Using a Physically-Based Retrieval Scheme

    NASA Technical Reports Server (NTRS)

    Kidd, Chris; Matsui, Toshi; Chern, Jiundar; Mohr, Karen; Kummerow, Christian; Randel, Dave

    2015-01-01

    The estimation of precipitation across the globe from satellite sensors provides a key resource in the observation and understanding of our climate system. Estimates from all pertinent satellite observations are critical in providing the necessary temporal sampling. However, consistency in these estimates from instruments with different frequencies and resolutions is critical. This paper details the physically based retrieval scheme to estimate precipitation from cross-track (XT) passive microwave (PM) sensors on board the constellation satellites of the Global Precipitation Measurement (GPM) mission. Here the Goddard profiling algorithm (GPROF), a physically based Bayesian scheme developed for conically scanning (CS) sensors, is adapted for use with XT PM sensors. The present XT GPROF scheme utilizes a model-generated database to overcome issues encountered with an observational database as used by the CS scheme. The model database ensures greater consistency across meteorological regimes and surface types by providing a more comprehensive set of precipitation profiles. The database is corrected for bias against the CS database to ensure consistency in the final product. Statistical comparisons over western Europe and the United States show that the XT GPROF estimates are comparable with those from the CS scheme. Indeed, the XT estimates have higher correlations against surface radar data, while maintaining similar root-mean-square errors. Latitudinal profiles of precipitation show the XT estimates are generally comparable with the CS estimates, although in the southern midlatitudes the peak precipitation is shifted equatorward while over the Arctic large differences are seen between the XT and the CS retrievals.

  3. An Integrated Korean Biodiversity and Genetic Information Retrieval System

    PubMed Central

    Lim, Jeongheui; Bhak, Jong; Oh, Hee-Mock; Kim, Chang-Bae; Park, Yong-Ha; Paek, Woon Kee

    2008-01-01

    Background On-line biodiversity information databases are growing quickly and being integrated into general bioinformatics systems due to the advances of fast gene sequencing technologies and the Internet. These can reduce the cost and effort of performing biodiversity surveys and genetic searches, which allows scientists to spend more time researching and less time collecting and maintaining data. This will cause an increased rate of knowledge build-up and improve conservations. The biodiversity databases in Korea have been scattered among several institutes and local natural history museums with incompatible data types. Therefore, a comprehensive database and a nation wide web portal for biodiversity information is necessary in order to integrate diverse information resources, including molecular and genomic databases. Results The Korean Natural History Research Information System (NARIS) was built and serviced as the central biodiversity information system to collect and integrate the biodiversity data of various institutes and natural history museums in Korea. This database aims to be an integrated resource that contains additional biological information, such as genome sequences and molecular level diversity. Currently, twelve institutes and museums in Korea are integrated by the DiGIR (Distributed Generic Information Retrieval) protocol, with Darwin Core2.0 format as its metadata standard for data exchange. Data quality control and statistical analysis functions have been implemented. In particular, integrating molecular and genetic information from the National Center for Biotechnology Information (NCBI) databases with NARIS was recently accomplished. NARIS can also be extended to accommodate other institutes abroad, and the whole system can be exported to establish local biodiversity management servers. Conclusion A Korean data portal, NARIS, has been developed to efficiently manage and utilize biodiversity data, which includes genetic resources. NARIS aims to be integral in maximizing bio-resource utilization for conservation, management, research, education, industrial applications, and integration with other bioinformation data resources. It can be found at . PMID:19091024

  4. High-speed data search

    NASA Technical Reports Server (NTRS)

    Driscoll, James N.

    1994-01-01

    The high-speed data search system developed for KSC incorporates existing and emerging information retrieval technology to help a user intelligently and rapidly locate information found in large textual databases. This technology includes: natural language input; statistical ranking of retrieved information; an artificial intelligence concept called semantics, where 'surface level' knowledge found in text is used to improve the ranking of retrieved information; and relevance feedback, where user judgements about viewed information are used to automatically modify the search for further information. Semantics and relevance feedback are features of the system which are not available commercially. The system further demonstrates focus on paragraphs of information to decide relevance; and it can be used (without modification) to intelligently search all kinds of document collections, such as collections of legal documents medical documents, news stories, patents, and so forth. The purpose of this paper is to demonstrate the usefulness of statistical ranking, our semantic improvement, and relevance feedback.

  5. Multimedia Database at National Museum of Ethnology

    NASA Astrophysics Data System (ADS)

    Sugita, Shigeharu

    This paper describes the information management system at National Museum of Ethnology, Osaka, Japan. This museum is a kind of research center for cultural anthropology, and has many computer systems such as IBM 3090, VAX11/780, Fujitu M340R, etc. With these computers, distributed multimedia databases are constructed in which not only bibliographic data but also artifact image, slide image, book page image, etc. are stored. The number of data is now about 1.3 million items. These data can be retrieved and displayed on the multimedia workstation which has several displays.

  6. A database system to support image algorithm evaluation

    NASA Technical Reports Server (NTRS)

    Lien, Y. E.

    1977-01-01

    The design is given of an interactive image database system IMDB, which allows the user to create, retrieve, store, display, and manipulate images through the facility of a high-level, interactive image query (IQ) language. The query language IQ permits the user to define false color functions, pixel value transformations, overlay functions, zoom functions, and windows. The user manipulates the images through generic functions. The user can direct images to display devices for visual and qualitative analysis. Image histograms and pixel value distributions can also be computed to obtain a quantitative analysis of images.

  7. Expert Search Strategies: The Information Retrieval Practices of Healthcare Information Professionals.

    PubMed

    Russell-Rose, Tony; Chamberlain, Jon

    2017-10-02

    Healthcare information professionals play a key role in closing the knowledge gap between medical research and clinical practice. Their work involves meticulous searching of literature databases using complex search strategies that can consist of hundreds of keywords, operators, and ontology terms. This process is prone to error and can lead to inefficiency and bias if performed incorrectly. The aim of this study was to investigate the search behavior of healthcare information professionals, uncovering their needs, goals, and requirements for information retrieval systems. A survey was distributed to healthcare information professionals via professional association email discussion lists. It investigated the search tasks they undertake, their techniques for search strategy formulation, their approaches to evaluating search results, and their preferred functionality for searching library-style databases. The popular literature search system PubMed was then evaluated to determine the extent to which their needs were met. The 107 respondents indicated that their information retrieval process relied on the use of complex, repeatable, and transparent search strategies. On average it took 60 minutes to formulate a search strategy, with a search task taking 4 hours and consisting of 15 strategy lines. Respondents reviewed a median of 175 results per search task, far more than they would ideally like (100). The most desired features of a search system were merging search queries and combining search results. Healthcare information professionals routinely address some of the most challenging information retrieval problems of any profession. However, their needs are not fully supported by current literature search systems and there is demand for improved functionality, in particular regarding the development and management of search strategies. ©Tony Russell-Rose, Jon Chamberlain. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 02.10.2017.

  8. Expert Search Strategies: The Information Retrieval Practices of Healthcare Information Professionals

    PubMed Central

    2017-01-01

    Background Healthcare information professionals play a key role in closing the knowledge gap between medical research and clinical practice. Their work involves meticulous searching of literature databases using complex search strategies that can consist of hundreds of keywords, operators, and ontology terms. This process is prone to error and can lead to inefficiency and bias if performed incorrectly. Objective The aim of this study was to investigate the search behavior of healthcare information professionals, uncovering their needs, goals, and requirements for information retrieval systems. Methods A survey was distributed to healthcare information professionals via professional association email discussion lists. It investigated the search tasks they undertake, their techniques for search strategy formulation, their approaches to evaluating search results, and their preferred functionality for searching library-style databases. The popular literature search system PubMed was then evaluated to determine the extent to which their needs were met. Results The 107 respondents indicated that their information retrieval process relied on the use of complex, repeatable, and transparent search strategies. On average it took 60 minutes to formulate a search strategy, with a search task taking 4 hours and consisting of 15 strategy lines. Respondents reviewed a median of 175 results per search task, far more than they would ideally like (100). The most desired features of a search system were merging search queries and combining search results. Conclusions Healthcare information professionals routinely address some of the most challenging information retrieval problems of any profession. However, their needs are not fully supported by current literature search systems and there is demand for improved functionality, in particular regarding the development and management of search strategies. PMID:28970190

  9. Development of an Integrated Hydrologic Modeling System for Rainfall-Runoff Simulation

    NASA Astrophysics Data System (ADS)

    Lu, B.; Piasecki, M.

    2008-12-01

    This paper aims to present the development of an integrated hydrological model which involves functionalities of digital watershed processing, online data retrieval, hydrologic simulation and post-event analysis. The proposed system is intended to work as a back end to the CUAHSI HIS cyberinfrastructure developments. As a first step into developing this system, a physics-based distributed hydrologic model PIHM (Penn State Integrated Hydrologic Model) is wrapped into OpenMI(Open Modeling Interface and Environment ) environment so as to seamlessly interact with OpenMI compliant meteorological models. The graphical user interface is being developed from the openGIS application called MapWindows which permits functionality expansion through the addition of plug-ins. . Modules required to set up through the GUI workboard include those for retrieving meteorological data from existing database or meteorological prediction models, obtaining geospatial data from the output of digital watershed processing, and importing initial condition and boundary condition. They are connected to the OpenMI compliant PIHM to simulate rainfall-runoff processes and includes a module for automatically displaying output after the simulation. Online databases are accessed through the WaterOneFlow web services, and the retrieved data are either stored in an observation database(OD) following the schema of Observation Data Model(ODM) in case for time series support, or a grid based storage facility which may be a format like netCDF or a grid-based-data database schema . Specific development steps include the creation of a bridge to overcome interoperability issue between PIHM and the ODM, as well as the embedding of TauDEM (Terrain Analysis Using Digital Elevation Models) into the model. This module is responsible for developing watershed and stream network using digital elevation models. Visualizing and editing geospatial data is achieved by the usage of MapWinGIS, an ActiveX control developed by MapWindow team. After applying to the practical watershed, the performance of the model can be tested by the post-event analysis module.

  10. MRML: an extensible communication protocol for interoperability and benchmarking of multimedia information retrieval systems

    NASA Astrophysics Data System (ADS)

    Mueller, Wolfgang; Mueller, Henning; Marchand-Maillet, Stephane; Pun, Thierry; Squire, David M.; Pecenovic, Zoran; Giess, Christoph; de Vries, Arjen P.

    2000-10-01

    While in the area of relational databases interoperability is ensured by common communication protocols (e.g. ODBC/JDBC using SQL), Content Based Image Retrieval Systems (CBIRS) and other multimedia retrieval systems are lacking both a common query language and a common communication protocol. Besides its obvious short term convenience, interoperability of systems is crucial for the exchange and analysis of user data. In this paper, we present and describe an extensible XML-based query markup language, called MRML (Multimedia Retrieval markup Language). MRML is primarily designed so as to ensure interoperability between different content-based multimedia retrieval systems. Further, MRML allows researchers to preserve their freedom in extending their system as needed. MRML encapsulates multimedia queries in a way that enable multimedia (MM) query languages, MM content descriptions, MM query engines, and MM user interfaces to grow independently from each other, reaching a maximum of interoperability while ensuring a maximum of freedom for the developer. For benefitting from this, only a few simple design principles have to be respected when extending MRML for one's fprivate needs. The design of extensions withing the MRML framework will be described in detail in the paper. MRML has been implemented and tested for the CBIRS Viper, using the user interface Snake Charmer. Both are part of the GNU project and can be downloaded at our site.

  11. Natural language information retrieval in digital libraries

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Strzalkowski, T.; Perez-Carballo, J.; Marinescu, M.

    In this paper we report on some recent developments in joint NYU and GE natural language information retrieval system. The main characteristic of this system is the use of advanced natural language processing to enhance the effectiveness of term-based document retrieval. The system is designed around a traditional statistical backbone consisting of the indexer module, which builds inverted index files from pre-processed documents, and a retrieval engine which searches and ranks the documents in response to user queries. Natural language processing is used to (1) preprocess the documents in order to extract content-carrying terms, (2) discover inter-term dependencies and buildmore » a conceptual hierarchy specific to the database domain, and (3) process user`s natural language requests into effective search queries. This system has been used in NIST-sponsored Text Retrieval Conferences (TREC), where we worked with approximately 3.3 GBytes of text articles including material from the Wall Street Journal, the Associated Press newswire, the Federal Register, Ziff Communications`s Computer Library, Department of Energy abstracts, U.S. Patents and the San Jose Mercury News, totaling more than 500 million words of English. The system have been designed to facilitate its scalability to deal with ever increasing amounts of data. In particular, a randomized index-splitting mechanism has been installed which allows the system to create a number of smaller indexes that can be independently and efficiently searched.« less

  12. Classifying environmental pollutants: Part 3. External validation of the classification system.

    PubMed

    Verhaar, H J; Solbé, J; Speksnijder, J; van Leeuwen, C J; Hermens, J L

    2000-04-01

    In order to validate a classification system for the prediction of the toxic effect concentrations of organic environmental pollutants to fish, all available fish acute toxicity data were retrieved from the ECETOC database, a database of quality-evaluated aquatic toxicity measurements created and maintained by the European Centre for the Ecotoxicology and Toxicology of Chemicals. The individual chemicals for which these data were available were classified according to the rulebase under consideration and predictions of effect concentrations or ranges of possible effect concentrations were generated. These predictions were compared to the actual toxicity data retrieved from the database. The results of this comparison show that generally, the classification system provides adequate predictions of either the aquatic toxicity (class 1) or the possible range of toxicity (other classes) of organic compounds. A slight underestimation of effect concentrations occurs for some highly water soluble, reactive chemicals with low log K(ow) values. On the other end of the scale, some compounds that are classified as belonging to a relatively toxic class appear to belong to the so-called baseline toxicity compounds. For some of these, additional classification rules are proposed. Furthermore, some groups of compounds cannot be classified, although they should be amenable to predictions. For these compounds additional research as to class membership and associated prediction rules is proposed.

  13. EnsMart: A Generic System for Fast and Flexible Access to Biological Data

    PubMed Central

    Kasprzyk, Arek; Keefe, Damian; Smedley, Damian; London, Darin; Spooner, William; Melsopp, Craig; Hammond, Martin; Rocca-Serra, Philippe; Cox, Tony; Birney, Ewan

    2004-01-01

    The EnsMart system (www.ensembl.org/EnsMart) provides a generic data warehousing solution for fast and flexible querying of large biological data sets and integration with third-party data and tools. The system consists of a query-optimized database and interactive, user-friendly interfaces. EnsMart has been applied to Ensembl, where it extends its genomic browser capabilities, facilitating rapid retrieval of customized data sets. A wide variety of complex queries, on various types of annotations, for numerous species are supported. These can be applied to many research problems, ranging from SNP selection for candidate gene screening, through cross-species evolutionary comparisons, to microarray annotation. Users can group and refine biological data according to many criteria, including cross-species analyses, disease links, sequence variations, and expression patterns. Both tabulated list data and biological sequence output can be generated dynamically, in HTML, text, Microsoft Excel, and compressed formats. A wide range of sequence types, such as cDNA, peptides, coding regions, UTRs, and exons, with additional upstream and downstream regions, can be retrieved. The EnsMart database can be accessed via a public Web site, or through a Java application suite. Both implementations and the database are freely available for local installation, and can be extended or adapted to `non-Ensembl' data sets. PMID:14707178

  14. Extending the data dictionary for data/knowledge management

    NASA Technical Reports Server (NTRS)

    Hydrick, Cecile L.; Graves, Sara J.

    1988-01-01

    Current relational database technology provides the means for efficiently storing and retrieving large amounts of data. By combining techniques learned from the field of artificial intelligence with this technology, it is possible to expand the capabilities of such systems. This paper suggests using the expanded domain concept, an object-oriented organization, and the storing of knowledge rules within the relational database as a solution to the unique problems associated with CAD/CAM and engineering data.

  15. Interactive access to forest inventory data for the South Central United States

    Treesearch

    William H. McWilliams

    1990-01-01

    On-line access to USDA, Forest Service successive forest inventory data for the South Central United States is provided by two computer systems. The Easy Access to Forest Inventory and Analysis Tables program (EZTAB) produces a set of tables for specific geographic areas. The Interactive Graphics and Retrieval System (INGRES) is a database management system that...

  16. Brain CT image similarity retrieval method based on uncertain location graph.

    PubMed

    Pan, Haiwei; Li, Pengyuan; Li, Qing; Han, Qilong; Feng, Xiaoning; Gao, Linlin

    2014-03-01

    A number of brain computed tomography (CT) images stored in hospitals that contain valuable information should be shared to support computer-aided diagnosis systems. Finding the similar brain CT images from the brain CT image database can effectively help doctors diagnose based on the earlier cases. However, the similarity retrieval for brain CT images requires much higher accuracy than the general images. In this paper, a new model of uncertain location graph (ULG) is presented for brain CT image modeling and similarity retrieval. According to the characteristics of brain CT image, we propose a novel method to model brain CT image to ULG based on brain CT image texture. Then, a scheme for ULG similarity retrieval is introduced. Furthermore, an effective index structure is applied to reduce the searching time. Experimental results reveal that our method functions well on brain CT images similarity retrieval with higher accuracy and efficiency.

  17. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Devarakonda, Ranjeet; Giansiracusa, Michael T.; Kumar, Jitendra

    Information connectivity and retrieval has a role in our daily lives. The most pervasive source of online information is databases. The amount of data is growing at rapid rate and database technology is improving and having a profound effect. Almost all online applications are storing and retrieving information from databases. One challenge in supplying the public with wider access to informational databases is the need for knowledge of database languages like Structured Query Language (SQL). Although the SQL language has been published in many forms, not everybody is able to write SQL queries. Another challenge is that it may notmore » be practical to make the public aware of the structure of the database. There is a need for novice users to query relational databases using their natural language. To solve this problem, many natural language interfaces to structured databases have been developed. The goal is to provide more intuitive method for generating database queries and delivering responses. Social media makes it possible to interact with a wide section of the population. Through this medium, and with the help of Natural Language Processing (NLP) we can make the data of the Atmospheric Radiation Measurement Data Center (ADC) more accessible to the public. We propose an architecture for using Apache Lucene/Solr [1], OpenML [2,3], and Kafka [4] to generate an automated query/response system with inputs from Twitter5, our Cassandra DB, and our log database. Using the Twitter API and NLP we can give the public the ability to ask questions of our database and get automated responses.« less

  18. MEDLINE versus EMBASE and CINAHL for telemedicine searches.

    PubMed

    Bahaadinbeigy, Kambiz; Yogesan, Kanagasingam; Wootton, Richard

    2010-10-01

    Researchers in the domain of telemedicine throughout the world tend to search multiple bibliographic databases to retrieve the highest possible number of publications when conducting review projects. Medical Literature Analysis and Retrieval System Online (MEDLINE), Excerpta Medica Database (EMBASE), and Cumulative Index to Nursing and Allied Health Literature (CINAHL) are three popular databases in the discipline of biomedicine that are used for conducting reviews. Access to the MEDLINE database is free and easy, whereas EMBASE and CINAHL are not free and sometimes not easy to access for researchers in small research centers. This project sought to compare MEDLINE with EMBASE and CINAHL to estimate what proportion of potentially relevant publications would be missed when only MEDLINE is used in a review project, in comparison to when EMBASE and CINAHL are also used. Twelve simple keywords relevant to 12 different telemedicine applications were searched using all three databases, and the results were compared. About 9%-18% of potentially relevant articles would have been missed if MEDLINE had been the only database used. It is preferable if all three or more databases are used when conducting a review in telemedicine. Researchers from developing countries or small research institutions could rely on only MEDLINE, but they would loose 9%-18% of the potentially relevant publications. Searching MEDLINE alone is not ideal, but in a resource-constrained situation, it is definitely better than nothing.

  19. A Description of the Text Data Base System TDBS. Stockholm Papers in Library and Information Science.

    ERIC Educational Resources Information Center

    Lofstrom, Mats

    Because experience with large information retrieval (IR) and database management (DBM) systems has shown that they are not adequate for the handling of textual material, two Swedish companies--Paralog and AU-System Network--have joined in a venture to develop a software package which combines features from IR and DMB systems to form a Text Data…

  20. Diseno y desarrollo de una base de datos bibliograficos (Design and Development of a Bibliographic Database).

    ERIC Educational Resources Information Center

    Mattenella, L. E.; Velazco, J. W.

    1992-01-01

    This article briefly describes the development of bibliographic retrieval systems in the Instituto de Beneficio di Minerales (IN BE MI) in Salta, Argentina, using the Mini-micro CDS/ISIS software developed by Unesco. (LRW)

  1. Multitasking Information Seeking and Searching Processes.

    ERIC Educational Resources Information Center

    Spink, Amanda; Ozmutlu, H. Cenk; Ozmutlu, Seda

    2002-01-01

    Presents findings from four studies of the prevalence of multitasking information seeking and searching by Web (via the Excite search engine), information retrieval system (mediated online database searching), and academic library users. Highlights include human information coordinating behavior (HICB); and implications for models of information…

  2. An approach in building a chemical compound search engine in oracle database.

    PubMed

    Wang, H; Volarath, P; Harrison, R

    2005-01-01

    A searching or identifying of chemical compounds is an important process in drug design and in chemistry research. An efficient search engine involves a close coupling of the search algorithm and database implementation. The database must process chemical structures, which demands the approaches to represent, store, and retrieve structures in a database system. In this paper, a general database framework for working as a chemical compound search engine in Oracle database is described. The framework is devoted to eliminate data type constrains for potential search algorithms, which is a crucial step toward building a domain specific query language on top of SQL. A search engine implementation based on the database framework is also demonstrated. The convenience of the implementation emphasizes the efficiency and simplicity of the framework.

  3. BIOSPIDA: A Relational Database Translator for NCBI.

    PubMed

    Hagen, Matthew S; Lee, Eva K

    2010-11-13

    As the volume and availability of biological databases continue widespread growth, it has become increasingly difficult for research scientists to identify all relevant information for biological entities of interest. Details of nucleotide sequences, gene expression, molecular interactions, and three-dimensional structures are maintained across many different databases. To retrieve all necessary information requires an integrated system that can query multiple databases with minimized overhead. This paper introduces a universal parser and relational schema translator that can be utilized for all NCBI databases in Abstract Syntax Notation (ASN.1). The data models for OMIM, Entrez-Gene, Pubmed, MMDB and GenBank have been successfully converted into relational databases and all are easily linkable helping to answer complex biological questions. These tools facilitate research scientists to locally integrate databases from NCBI without significant workload or development time.

  4. A prototype feature system for feature retrieval using relationships

    USGS Publications Warehouse

    Choi, J.; Usery, E.L.

    2009-01-01

    Using a feature data model, geographic phenomena can be represented effectively by integrating space, theme, and time. This paper extends and implements a feature data model that supports query and visualization of geographic features using their non-spatial and temporal relationships. A prototype feature-oriented geographic information system (FOGIS) is then developed and storage of features named Feature Database is designed. Buildings from the U.S. Marine Corps Base, Camp Lejeune, North Carolina and subways in Chicago, Illinois are used to test the developed system. The results of the applications show the strength of the feature data model and the developed system 'FOGIS' when they utilize non-spatial and temporal relationships in order to retrieve and visualize individual features.

  5. Graph Databases for Large-Scale Healthcare Systems: A Framework for Efficient Data Management and Data Services

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Park, Yubin; Shankar, Mallikarjun; Park, Byung H.

    Designing a database system for both efficient data management and data services has been one of the enduring challenges in the healthcare domain. In many healthcare systems, data services and data management are often viewed as two orthogonal tasks; data services refer to retrieval and analytic queries such as search, joins, statistical data extraction, and simple data mining algorithms, while data management refers to building error-tolerant and non-redundant database systems. The gap between service and management has resulted in rigid database systems and schemas that do not support effective analytics. We compose a rich graph structure from an abstracted healthcaremore » RDBMS to illustrate how we can fill this gap in practice. We show how a healthcare graph can be automatically constructed from a normalized relational database using the proposed 3NF Equivalent Graph (3EG) transformation.We discuss a set of real world graph queries such as finding self-referrals, shared providers, and collaborative filtering, and evaluate their performance over a relational database and its 3EG-transformed graph. Experimental results show that the graph representation serves as multiple de-normalized tables, thus reducing complexity in a database and enhancing data accessibility of users. Based on this finding, we propose an ensemble framework of databases for healthcare applications.« less

  6. Indexing and Retrieval in ERIC: The 20th Year.

    ERIC Educational Resources Information Center

    Barnett, Lynn

    This brief review of the Educational Resources Information Center (ERIC) system is intended to make users more aware of (1) the system as a whole, (2) the process of indexing educational literature for the database, and (3) the role of the Thesaurus of ERIC Descriptors in the overall information dissemination process. An overview of the ERIC…

  7. Image retrieval for identifying house plants

    NASA Astrophysics Data System (ADS)

    Kebapci, Hanife; Yanikoglu, Berrin; Unal, Gozde

    2010-02-01

    We present a content-based image retrieval system for plant identification which is intended for providing users with a simple method to locate information about their house plants. A plant image consists of a collection of overlapping leaves and possibly flowers, which makes the problem challenging. We studied the suitability of various well-known color, texture and shape features for this problem, as well as introducing some new ones. The features are extracted from the general plant region that is segmented from the background using the max-flow min-cut technique. Results on a database of 132 different plant images show promise (in about 72% of the queries, the correct plant image is retrieved among the top-15 results).

  8. BIRS - Bioterrorism Information Retrieval System.

    PubMed

    Tewari, Ashish Kumar; Rashi; Wadhwa, Gulshan; Sharma, Sanjeev Kumar; Jain, Chakresh Kumar

    2013-01-01

    Bioterrorism is the intended use of pathogenic strains of microbes to widen terror in a population. There is a definite need to promote research for development of vaccines, therapeutics and diagnostic methods as a part of preparedness to any bioterror attack in the future. BIRS is an open-access database of collective information on the organisms related to bioterrorism. The architecture of database utilizes the current open-source technology viz PHP ver 5.3.19, MySQL and IIS server under windows platform for database designing. Database stores information on literature, generic- information and unique pathways of about 10 microorganisms involved in bioterrorism. This may serve as a collective repository to accelerate the drug discovery and vaccines designing process against such bioterrorist agents (microbes). The available data has been validated from various online resources and literature mining in order to provide the user with a comprehensive information system. The database is freely available at http://www.bioterrorism.biowaves.org.

  9. Online literature-retrieval systems: how to get started.

    PubMed

    Tousignaut, D R

    1983-02-01

    Basic information describing online literature-retrieval systems is presented; the power of online searching is also discussed. The equipment, expense involved, and training necessary to perform online searching efficiently is described. An individual searcher needs only a computer terminal and a telephone; by telephone, the searcher connects with an online vendor's computer at another location. The four major U.S. vendors (Dialog, Bibliographic Retrieval Services, Systems Development Corporation, and the National Library of Medicine) are compared. A step-by-step procedure of logging in and searching is presented. Using the International Pharmaceutical Abstracts database as an example, 17 access points to locating an article via an online system are compared with only two (the subject and author index entry) of a printed service. By searching online, one can search the published literature on a specific topic in a matter of minutes. An online search is very useful when limited information is available or the search question contains a term that is not in a printed index.

  10. Development of a personalized training system using the Lung Image Database Consortium and Image Database resource Initiative Database.

    PubMed

    Lin, Hongli; Wang, Weisheng; Luo, Jiawei; Yang, Xuedong

    2014-12-01

    The aim of this study was to develop a personalized training system using the Lung Image Database Consortium (LIDC) and Image Database resource Initiative (IDRI) Database, because collecting, annotating, and marking a large number of appropriate computed tomography (CT) scans, and providing the capability of dynamically selecting suitable training cases based on the performance levels of trainees and the characteristics of cases are critical for developing a efficient training system. A novel approach is proposed to develop a personalized radiology training system for the interpretation of lung nodules in CT scans using the Lung Image Database Consortium (LIDC) and Image Database Resource Initiative (IDRI) database, which provides a Content-Boosted Collaborative Filtering (CBCF) algorithm for predicting the difficulty level of each case of each trainee when selecting suitable cases to meet individual needs, and a diagnostic simulation tool to enable trainees to analyze and diagnose lung nodules with the help of an image processing tool and a nodule retrieval tool. Preliminary evaluation of the system shows that developing a personalized training system for interpretation of lung nodules is needed and useful to enhance the professional skills of trainees. The approach of developing personalized training systems using the LIDC/IDRL database is a feasible solution to the challenges of constructing specific training program in terms of cost and training efficiency. Copyright © 2014 AUR. Published by Elsevier Inc. All rights reserved.

  11. Searching LOGIN, the Local Government Information Network.

    ERIC Educational Resources Information Center

    Jack, Robert F.

    1984-01-01

    Describes a computer-based information retrieval and electronic messaging system produced by Control Data Corporation now being used by government agencies and other organizations. Background of Local Government Information Network (LOGIN), database structure, types of LOGIN units, searching LOGIN (intersect, display, and list commands), and how…

  12. A Survey in Indexing and Searching XML Documents.

    ERIC Educational Resources Information Center

    Luk, Robert W. P.; Leong, H. V.; Dillon, Tharam S.; Chan, Alvin T. S.; Croft, W. Bruce; Allan, James

    2002-01-01

    Discussion of XML focuses on indexing techniques for XML documents, grouping them into flat-file, semistructured, and structured indexing paradigms. Highlights include searching techniques, including full text search and multistage search; search result presentations; database and information retrieval system integration; XML query languages; and…

  13. The Arabidopsis Information Resource (TAIR): a comprehensive database and web-based information retrieval, analysis, and visualization system for a model plant

    PubMed Central

    Huala, Eva; Dickerman, Allan W.; Garcia-Hernandez, Margarita; Weems, Danforth; Reiser, Leonore; LaFond, Frank; Hanley, David; Kiphart, Donald; Zhuang, Mingzhe; Huang, Wen; Mueller, Lukas A.; Bhattacharyya, Debika; Bhaya, Devaki; Sobral, Bruno W.; Beavis, William; Meinke, David W.; Town, Christopher D.; Somerville, Chris; Rhee, Seung Yon

    2001-01-01

    Arabidopsis thaliana, a small annual plant belonging to the mustard family, is the subject of study by an estimated 7000 researchers around the world. In addition to the large body of genetic, physiological and biochemical data gathered for this plant, it will be the first higher plant genome to be completely sequenced, with completion expected at the end of the year 2000. The sequencing effort has been coordinated by an international collaboration, the Arabidopsis Genome Initiative (AGI). The rationale for intensive investigation of Arabidopsis is that it is an excellent model for higher plants. In order to maximize use of the knowledge gained about this plant, there is a need for a comprehensive database and information retrieval and analysis system that will provide user-friendly access to Arabidopsis information. This paper describes the initial steps we have taken toward realizing these goals in a project called The Arabidopsis Information Resource (TAIR) (www.arabidopsis.org). PMID:11125061

  14. Systematically Retrieving Research: A Case Study Evaluating Seven Databases

    ERIC Educational Resources Information Center

    Taylor, Brian; Wylie, Emma; Dempster, Martin; Donnelly, Michael

    2007-01-01

    Objective: Developing the scientific underpinnings of social welfare requires effective and efficient methods of retrieving relevant items from the increasing volume of research. Method: We compared seven databases by running the nearest equivalent search on each. The search topic was chosen for relevance to social work practice with older people.…

  15. Design of a graphical user interface for an intelligent multimedia information system for radiology research

    NASA Astrophysics Data System (ADS)

    Taira, Ricky K.; Wong, Clement; Johnson, David; Bhushan, Vikas; Rivera, Monica; Huang, Lu J.; Aberle, Denise R.; Cardenas, Alfonso F.; Chu, Wesley W.

    1995-05-01

    With the increase in the volume and distribution of images and text available in PACS and medical electronic health-care environments it becomes increasingly important to maintain indexes that summarize the content of these multi-media documents. Such indices are necessary to quickly locate relevant patient cases for research, patient management, and teaching. The goal of this project is to develop an intelligent document retrieval system that allows researchers to request for patient cases based on document content. Thus we wish to retrieve patient cases from electronic information archives that could include a combined specification of patient demographics, low level radiologic findings (size, shape, number), intermediate-level radiologic findings (e.g., atelectasis, infiltrates, etc.) and/or high-level pathology constraints (e.g., well-differentiated small cell carcinoma). The cases could be distributed among multiple heterogeneous databases such as PACS, RIS, and HIS. Content- based retrieval systems go beyond the capabilities of simple key-word or string-based retrieval matching systems. These systems require a knowledge base to comprehend the generality/specificity of a concept (thus knowing the subclasses or related concepts to a given concept) and knowledge of the various string representations for each concept (i.e., synonyms, lexical variants, etc.). We have previously reported on a data integration mediation layer that allows transparent access to multiple heterogeneous distributed medical databases (HIS, RIS, and PACS). The data access layer of our architecture currently has limited query processing capabilities. Given a patient hospital identification number, the access mediation layer collects all documents in RIS and HIS and returns this information to a specified workstation location. In this paper we report on our efforts to extend the query processing capabilities of the system by creation of custom query interfaces, an intelligent query processing engine, and a document-content index that can be generated automatically (i.e., no manual authoring or changes to the normal clinical protocols).

  16. Spanish personal name variations in national and international biomedical databases: implications for information retrieval and bibliometric studies

    PubMed Central

    Ruiz-Pérez, R.; López-Cózar, E. Delgado; Jiménez-Contreras, E.

    2002-01-01

    Objectives: The study sought to investigate how Spanish names are handled by national and international databases and to identify mistakes that can undermine the usefulness of these databases for locating and retrieving works by Spanish authors. Methods: The authors sampled 172 articles published by authors from the University of Granada Medical School between 1987 and 1996 and analyzed the variations in how each of their names was indexed in Science Citation Index (SCI), MEDLINE, and Índice Médico Español (IME). The number and types of variants that appeared for each author's name were recorded and compared across databases to identify inconsistencies in indexing practices. We analyzed the relationship between variability (number of variants of an author's name) and productivity (number of items the name was associated with as an author), the consequences for retrieval of information, and the most frequent indexing structures used for Spanish names. Results: The proportion of authors who appeared under more then one name was 48.1% in SCI, 50.7% in MEDLINE, and 69.0% in IME. Productivity correlated directly with variability: more than 50% of the authors listed on five to ten items appeared under more than one name in any given database, and close to 100% of the authors listed on more than ten items appeared under two or more variants. Productivity correlated inversely with retrievability: as the number of variants for a name increased, the number of items retrieved under each variant decreased. For the most highly productive authors, the number of items retrieved under each variant tended toward one. The most frequent indexing methods varied between databases. In MEDLINE and IME, names were indexed correctly as “first surname second surname, first name initial middle name initial” (if present) in 41.7% and 49.5% of the records, respectively. However, in SCI, the most frequent method was “first surname, first name initial second name initial” (48.0% of the records) and first surname and second surname run together, first name initial (18.3%). Conclusions: Retrievability on the basis of author's name was poor in all three databases. Each database uses accurate indexing methods, but these methods fail to result in consistency or coherence for specific entries. The likely causes of inconsistency are: (1) use by authors of variants of their names during their publication careers, (2) lack of authority control in all three databases, (3) the use of an inappropriate indexing method for Spanish names in SCI, (4) authors' inconsistent behaviors, and (5) possible editorial interventions by some journals. We offer some suggestions as to how to avert the proliferation of author name variants in the databases. PMID:12398248

  17. Working with Data: Discovering Knowledge through Mining and Analysis; Systematic Knowledge Management and Knowledge Discovery; Text Mining; Methodological Approach in Discovering User Search Patterns through Web Log Analysis; Knowledge Discovery in Databases Using Formal Concept Analysis; Knowledge Discovery with a Little Perspective.

    ERIC Educational Resources Information Center

    Qin, Jian; Jurisica, Igor; Liddy, Elizabeth D.; Jansen, Bernard J; Spink, Amanda; Priss, Uta; Norton, Melanie J.

    2000-01-01

    These six articles discuss knowledge discovery in databases (KDD). Topics include data mining; knowledge management systems; applications of knowledge discovery; text and Web mining; text mining and information retrieval; user search patterns through Web log analysis; concept analysis; data collection; and data structure inconsistency. (LRW)

  18. 41. DISCOVERY, SEARCH, AND COMMUNICATION OF TEXTUAL KNOWLEDGE RESOURCES IN DISTRIBUTED SYSTEMS a. Discovering and Utilizing Knowledge Sources for Metasearch Knowledge Systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zamora, Antonio

    Advanced Natural Language Processing Tools for Web Information Retrieval, Content Analysis, and Synthesis. The goal of this SBIR was to implement and evaluate several advanced Natural Language Processing (NLP) tools and techniques to enhance the precision and relevance of search results by analyzing and augmenting search queries and by helping to organize the search output obtained from heterogeneous databases and web pages containing textual information of interest to DOE and the scientific-technical user communities in general. The SBIR investigated 1) the incorporation of spelling checkers in search applications, 2) identification of significant phrases and concepts using a combination of linguisticmore » and statistical techniques, and 3) enhancement of the query interface and search retrieval results through the use of semantic resources, such as thesauri. A search program with a flexible query interface was developed to search reference databases with the objective of enhancing search results from web queries or queries of specialized search systems such as DOE's Information Bridge. The DOE ETDE/INIS Joint Thesaurus was processed to create a searchable database. Term frequencies and term co-occurrences were used to enhance the web information retrieval by providing algorithmically-derived objective criteria to organize relevant documents into clusters containing significant terms. A thesaurus provides an authoritative overview and classification of a field of knowledge. By organizing the results of a search using the thesaurus terminology, the output is more meaningful than when the results are just organized based on the terms that co-occur in the retrieved documents, some of which may not be significant. An attempt was made to take advantage of the hierarchy provided by broader and narrower terms, as well as other field-specific information in the thesauri. The search program uses linguistic morphological routines to find relevant entries regardless of whether terms are stored in singular or plural form. Implementation of additional inflectional morphology processes for verbs can enhance retrieval further, but this has to be balanced by the possibility of broadening the results too much. In addition to the DOE energy thesaurus, other sources of specialized organized knowledge such as the Medical Subject Headings (MeSH), the Unified Medical Language System (UMLS), and Wikipedia were investigated. The supporting role of the NLP thesaurus search program was enhanced by incorporating spelling aid and a part-of-speech tagger to cope with misspellings in the queries and to determine the grammatical roles of the query words and identify nouns for special processing. To improve precision, multiple modes of searching were implemented including Boolean operators, and field-specific searches. Programs to convert a thesaurus or reference file into searchable support files can be deployed easily, and the resulting files are immediately searchable to produce relevance-ranked results with builtin spelling aid, morphological processing, and advanced search logic. Demonstration systems were built for several databases, including the DOE energy thesaurus.« less

  19. Three-dimensional spatiotemporal features for fast content-based retrieval of focal liver lesions.

    PubMed

    Roy, Sharmili; Chi, Yanling; Liu, Jimin; Venkatesh, Sudhakar K; Brown, Michael S

    2014-11-01

    Content-based image retrieval systems for 3-D medical datasets still largely rely on 2-D image-based features extracted from a few representative slices of the image stack. Most 2 -D features that are currently used in the literature not only model a 3-D tumor incompletely but are also highly expensive in terms of computation time, especially for high-resolution datasets. Radiologist-specified semantic labels are sometimes used along with image-based 2-D features to improve the retrieval performance. Since radiological labels show large interuser variability, are often unstructured, and require user interaction, their use as lesion characterizing features is highly subjective, tedious, and slow. In this paper, we propose a 3-D image-based spatiotemporal feature extraction framework for fast content-based retrieval of focal liver lesions. All the features are computer generated and are extracted from four-phase abdominal CT images. Retrieval performance and query processing times for the proposed framework is evaluated on a database of 44 hepatic lesions comprising of five pathological types. Bull's eye percentage score above 85% is achieved for three out of the five lesion pathologies and for 98% of query lesions, at least one same type of lesion is ranked among the top two retrieved results. Experiments show that the proposed system's query processing is more than 20 times faster than other already published systems that use 2-D features. With fast computation time and high retrieval accuracy, the proposed system has the potential to be used as an assistant to radiologists for routine hepatic tumor diagnosis.

  20. Composite Materials Design Database and Data Retrieval System Requirements

    DTIC Science & Technology

    1991-08-01

    the present time, the majority of expert systems are stand-alone systems, and environments for effectively coupling heuristic data management with...nonheuristic data management remain to be developed. The only available recourse is to resort to traditional DBMS development and use, and to service...Organization for Data Management . Academic Press, 1986. Glaeser, P. S. (ed). " Data for Science and Technology." Proceedings of the Seventh

  1. The PMDB Protein Model Database

    PubMed Central

    Castrignanò, Tiziana; De Meo, Paolo D'Onorio; Cozzetto, Domenico; Talamo, Ivano Giuseppe; Tramontano, Anna

    2006-01-01

    The Protein Model Database (PMDB) is a public resource aimed at storing manually built 3D models of proteins. The database is designed to provide access to models published in the scientific literature, together with validating experimental data. It is a relational database and it currently contains >74 000 models for ∼240 proteins. The system is accessible at and allows predictors to submit models along with related supporting evidence and users to download them through a simple and intuitive interface. Users can navigate in the database and retrieve models referring to the same target protein or to different regions of the same protein. Each model is assigned a unique identifier that allows interested users to directly access the data. PMID:16381873

  2. Method for the reduction of image content redundancy in large image databases

    DOEpatents

    Tobin, Kenneth William; Karnowski, Thomas P.

    2010-03-02

    A method of increasing information content for content-based image retrieval (CBIR) systems includes the steps of providing a CBIR database, the database having an index for a plurality of stored digital images using a plurality of feature vectors, the feature vectors corresponding to distinct descriptive characteristics of the images. A visual similarity parameter value is calculated based on a degree of visual similarity between features vectors of an incoming image being considered for entry into the database and feature vectors associated with a most similar of the stored images. Based on said visual similarity parameter value it is determined whether to store or how long to store the feature vectors associated with the incoming image in the database.

  3. Multimedia explorer: image database, image proxy-server and search-engine.

    PubMed Central

    Frankewitsch, T.; Prokosch, U.

    1999-01-01

    Multimedia plays a major role in medicine. Databases containing images, movies or other types of multimedia objects are increasing in number, especially on the WWW. However, no good retrieval mechanism or search engine currently exists to efficiently track down such multimedia sources in the vast of information provided by the WWW. Secondly, the tools for searching databases are usually not adapted to the properties of images. HTML pages do not allow complex searches. Therefore establishing a more comfortable retrieval involves the use of a higher programming level like JAVA. With this platform independent language it is possible to create extensions to commonly used web browsers. These applets offer a graphical user interface for high level navigation. We implemented a database using JAVA objects as the primary storage container which are then stored by a JAVA controlled ORACLE8 database. Navigation depends on a structured vocabulary enhanced by a semantic network. With this approach multimedia objects can be encapsulated within a logical module for quick data retrieval. PMID:10566463

  4. Multimedia explorer: image database, image proxy-server and search-engine.

    PubMed

    Frankewitsch, T; Prokosch, U

    1999-01-01

    Multimedia plays a major role in medicine. Databases containing images, movies or other types of multimedia objects are increasing in number, especially on the WWW. However, no good retrieval mechanism or search engine currently exists to efficiently track down such multimedia sources in the vast of information provided by the WWW. Secondly, the tools for searching databases are usually not adapted to the properties of images. HTML pages do not allow complex searches. Therefore establishing a more comfortable retrieval involves the use of a higher programming level like JAVA. With this platform independent language it is possible to create extensions to commonly used web browsers. These applets offer a graphical user interface for high level navigation. We implemented a database using JAVA objects as the primary storage container which are then stored by a JAVA controlled ORACLE8 database. Navigation depends on a structured vocabulary enhanced by a semantic network. With this approach multimedia objects can be encapsulated within a logical module for quick data retrieval.

  5. Performing private database queries in a real-world environment using a quantum protocol.

    PubMed

    Chan, Philip; Lucio-Martinez, Itzel; Mo, Xiaofan; Simon, Christoph; Tittel, Wolfgang

    2014-06-10

    In the well-studied cryptographic primitive 1-out-of-N oblivious transfer, a user retrieves a single element from a database of size N without the database learning which element was retrieved. While it has previously been shown that a secure implementation of 1-out-of-N oblivious transfer is impossible against arbitrarily powerful adversaries, recent research has revealed an interesting class of private query protocols based on quantum mechanics in a cheat sensitive model. Specifically, a practical protocol does not need to guarantee that the database provider cannot learn what element was retrieved if doing so carries the risk of detection. The latter is sufficient motivation to keep a database provider honest. However, none of the previously proposed protocols could cope with noisy channels. Here we present a fault-tolerant private query protocol, in which the novel error correction procedure is integral to the security of the protocol. Furthermore, we present a proof-of-concept demonstration of the protocol over a deployed fibre.

  6. Performing private database queries in a real-world environment using a quantum protocol

    PubMed Central

    Chan, Philip; Lucio-Martinez, Itzel; Mo, Xiaofan; Simon, Christoph; Tittel, Wolfgang

    2014-01-01

    In the well-studied cryptographic primitive 1-out-of-N oblivious transfer, a user retrieves a single element from a database of size N without the database learning which element was retrieved. While it has previously been shown that a secure implementation of 1-out-of-N oblivious transfer is impossible against arbitrarily powerful adversaries, recent research has revealed an interesting class of private query protocols based on quantum mechanics in a cheat sensitive model. Specifically, a practical protocol does not need to guarantee that the database provider cannot learn what element was retrieved if doing so carries the risk of detection. The latter is sufficient motivation to keep a database provider honest. However, none of the previously proposed protocols could cope with noisy channels. Here we present a fault-tolerant private query protocol, in which the novel error correction procedure is integral to the security of the protocol. Furthermore, we present a proof-of-concept demonstration of the protocol over a deployed fibre. PMID:24913129

  7. Columnar water vapor retrievals from multifilter rotating shadowband radiometer data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Alexandrov, Mikhail; Schmid, Beat; Turner, David D.

    2009-01-26

    The Multi-Filter Rotating Shadowband Radiometer (MFRSR) measures direct and diffuse irradiances in the visible and near IR spectral range. In addition to characteristics of atmospheric aerosols, MFRSR data also allow retrieval of precipitable water vapor (PWV) column amounts, which are determined from the direct normal irradiances in the 940 nm spectral channel. The HITRAN 2004 spectral database was used in our retrievals to model the water vapor absorption. We present a detailed error analysis describing the influence of uncertainties in instrument calibration and spectral response, as well as those in available spectral databases, on the retrieval results. The results ofmore » our PWV retrievals from the Southern Great Plains (SGP) site operated by the DOE Atmospheric Radiation Measurement (ARM) Program were compared with correlative standard measurements by Microwave Radiometers (MWRs) and a Global Positioning System (GPS) water vapor sensor, as well as with retrievals from other solar radiometers (AERONET’s CIMEL, AATS-6). Some of these data are routinely available at the SGP’s Central Facility, however, we also used measurements from a wider array of instrumentation deployed at this site during the Water Vapor Intensive Observation Period (WVIOP2000) in September – October 2000. The WVIOP data show better agreement between different solar radiometers or between different microwave radiometers (both groups showing relative biases within 4%) than between these two groups of instruments, with MWRs values being consistently higher (up to 14%) than those from solar instruments. We also demonstrate the feasibility of using MFRSR network data for creation of 2D datasets comparable with the MODIS satellite water vapor product.« less

  8. A structural informatics approach to mine kinase knowledge bases.

    PubMed

    Brooijmans, Natasja; Mobilio, Dominick; Walker, Gary; Nilakantan, Ramaswamy; Denny, Rajiah A; Feyfant, Eric; Diller, David; Bikker, Jack; Humblet, Christine

    2010-03-01

    In this paper, we describe a combination of structural informatics approaches developed to mine data extracted from existing structure knowledge bases (Protein Data Bank and the GVK database) with a focus on kinase ATP-binding site data. In contrast to existing systems that retrieve and analyze protein structures, our techniques are centered on a database of ligand-bound geometries in relation to residues lining the binding site and transparent access to ligand-based SAR data. We illustrate the systems in the context of the Abelson kinase and related inhibitor structures. 2009 Elsevier Ltd. All rights reserved.

  9. Newspapers and Electronic Databases: Present Technology.

    ERIC Educational Resources Information Center

    Newcombe, Barbara; Trivedi, Harish

    1984-01-01

    Discusses technology used to preserve, control, index, and retrieve information in newspapers, highlighting ways to record analyses of news stories, storage/indexing systems based on computers, information as salable commodity, preparation of news for electronic storage, answering in-house queries, questions of copyright and invasion of privacy,…

  10. University of Iowa at TREC 2008 Legal and Relevance Feedback Tracks

    DTIC Science & Technology

    2008-11-01

    Fellbaum, C, [ed.]. Wordnet: An Electronic Lexical Database. Cambridge : MIT Press, 1998. [3] Salton , G. (ed) (1971), The SMART Retrieval System...learning tools and techniques. 2nd Edition. San Francisco : Morgan Kaufmann, 2005. [5] Platt, J . Machines using Sequential Minimal Optimization. [ed.] B

  11. Development of SRS.php, a Simple Object Access Protocol-based library for data acquisition from integrated biological databases.

    PubMed

    Barbosa-Silva, A; Pafilis, E; Ortega, J M; Schneider, R

    2007-12-11

    Data integration has become an important task for biological database providers. The current model for data exchange among different sources simplifies the manner that distinct information is accessed by users. The evolution of data representation from HTML to XML enabled programs, instead of humans, to interact with biological databases. We present here SRS.php, a PHP library that can interact with the data integration Sequence Retrieval System (SRS). The library has been written using SOAP definitions, and permits the programmatic communication through webservices with the SRS. The interactions are possible by invoking the methods described in WSDL by exchanging XML messages. The current functions available in the library have been built to access specific data stored in any of the 90 different databases (such as UNIPROT, KEGG and GO) using the same query syntax format. The inclusion of the described functions in the source of scripts written in PHP enables them as webservice clients to the SRS server. The functions permit one to query the whole content of any SRS database, to list specific records in these databases, to get specific fields from the records, and to link any record among any pair of linked databases. The case study presented exemplifies the library usage to retrieve information regarding registries of a Plant Defense Mechanisms database. The Plant Defense Mechanisms database is currently being developed, and the proposal of SRS.php library usage is to enable the data acquisition for the further warehousing tasks related to its setup and maintenance.

  12. BIOSPIDA: A Relational Database Translator for NCBI

    PubMed Central

    Hagen, Matthew S.; Lee, Eva K.

    2010-01-01

    As the volume and availability of biological databases continue widespread growth, it has become increasingly difficult for research scientists to identify all relevant information for biological entities of interest. Details of nucleotide sequences, gene expression, molecular interactions, and three-dimensional structures are maintained across many different databases. To retrieve all necessary information requires an integrated system that can query multiple databases with minimized overhead. This paper introduces a universal parser and relational schema translator that can be utilized for all NCBI databases in Abstract Syntax Notation (ASN.1). The data models for OMIM, Entrez-Gene, Pubmed, MMDB and GenBank have been successfully converted into relational databases and all are easily linkable helping to answer complex biological questions. These tools facilitate research scientists to locally integrate databases from NCBI without significant workload or development time. PMID:21347013

  13. Overview of Nuclear Physics Data: Databases, Web Applications and Teaching Tools

    NASA Astrophysics Data System (ADS)

    McCutchan, Elizabeth

    2017-01-01

    The mission of the United States Nuclear Data Program (USNDP) is to provide current, accurate, and authoritative data for use in pure and applied areas of nuclear science and engineering. This is accomplished by compiling, evaluating, and disseminating extensive datasets. Our main products include the Evaluated Nuclear Structure File (ENSDF) containing information on nuclear structure and decay properties and the Evaluated Nuclear Data File (ENDF) containing information on neutron-induced reactions. The National Nuclear Data Center (NNDC), through the website www.nndc.bnl.gov, provides web-based retrieval systems for these and many other databases. In addition, the NNDC hosts several on-line physics tools, useful for calculating various quantities relating to basic nuclear physics. In this talk, I will first introduce the quantities which are evaluated and recommended in our databases. I will then outline the searching capabilities which allow one to quickly and efficiently retrieve data. Finally, I will demonstrate how the database searches and web applications can provide effective teaching tools concerning the structure of nuclei and how they interact. Work supported by the Office of Nuclear Physics, Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-98CH10886.

  14. Kinetic parameters of cholinesterase interactions with organophosphates: retrieval and comparison tools available through ESTHER database: ESTerases, alpha/beta Hydrolase Enzymes and Relatives.

    PubMed

    Chatonnet, A; Hotelier, T; Cousin, X

    1999-05-14

    Cholinesterases are targets for organophosphorus compounds which are used as insecticides, chemical warfare agents and drugs for the treatment of disease such as glaucoma, or parasitic infections. The widespread use of these chemicals explains the growing of this area of research and the ever increasing number of sequences, structures, or biochemical data available. Future advances will depend upon effective management of existing information as well as upon creation of new knowledge. The ESTHER database goal is to facilitate retrieval and comparison of data about structure and function of proteins presenting the alpha/beta hydrolase fold. Protein engineering and in vitro production of enzymes allow direct comparison of biochemical parameters. Kinetic parameters of enzymatic reactions are now included in the database. These parameters can be searched and compared with a table construction tool. ESTHER can be reached through internet (http://www.ensam.inra.fr/cholinesterase). The full database or the specialised X-window Client-server system can be downloaded from our ftp server (ftp://ftp.toulouse.inra.fr./pub/esther). Forms can be used to send updates or corrections directly from the web.

  15. BioModels.net Web Services, a free and integrated toolkit for computational modelling software.

    PubMed

    Li, Chen; Courtot, Mélanie; Le Novère, Nicolas; Laibe, Camille

    2010-05-01

    Exchanging and sharing scientific results are essential for researchers in the field of computational modelling. BioModels.net defines agreed-upon standards for model curation. A fundamental one, MIRIAM (Minimum Information Requested in the Annotation of Models), standardises the annotation and curation process of quantitative models in biology. To support this standard, MIRIAM Resources maintains a set of standard data types for annotating models, and provides services for manipulating these annotations. Furthermore, BioModels.net creates controlled vocabularies, such as SBO (Systems Biology Ontology) which strictly indexes, defines and links terms used in Systems Biology. Finally, BioModels Database provides a free, centralised, publicly accessible database for storing, searching and retrieving curated and annotated computational models. Each resource provides a web interface to submit, search, retrieve and display its data. In addition, the BioModels.net team provides a set of Web Services which allows the community to programmatically access the resources. A user is then able to perform remote queries, such as retrieving a model and resolving all its MIRIAM Annotations, as well as getting the details about the associated SBO terms. These web services use established standards. Communications rely on SOAP (Simple Object Access Protocol) messages and the available queries are described in a WSDL (Web Services Description Language) file. Several libraries are provided in order to simplify the development of client software. BioModels.net Web Services make one step further for the researchers to simulate and understand the entirety of a biological system, by allowing them to retrieve biological models in their own tool, combine queries in workflows and efficiently analyse models.

  16. Retrieving clinically relevant diabetic retinopathy images using a multi-class multiple-instance framework

    NASA Astrophysics Data System (ADS)

    Chandakkar, Parag S.; Venkatesan, Ragav; Li, Baoxin

    2013-02-01

    Diabetic retinopathy (DR) is a vision-threatening complication from diabetes mellitus, a medical condition that is rising globally. Unfortunately, many patients are unaware of this complication because of absence of symptoms. Regular screening of DR is necessary to detect the condition for timely treatment. Content-based image retrieval, using archived and diagnosed fundus (retinal) camera DR images can improve screening efficiency of DR. This content-based image retrieval study focuses on two DR clinical findings, microaneurysm and neovascularization, which are clinical signs of non-proliferative and proliferative diabetic retinopathy. The authors propose a multi-class multiple-instance image retrieval framework which deploys a modified color correlogram and statistics of steerable Gaussian Filter responses, for retrieving clinically relevant images from a database of DR fundus image database.

  17. Construction of In-house Databases in a Corporation

    NASA Astrophysics Data System (ADS)

    Tamura, Haruki; Mezaki, Koji

    This paper describes fundamental idea of technical information management in Mitsubishi Heavy Industries, Ltd., and present status of the activities. Then it introduces the background and history of the development, problems and countermeasures against them regarding Mitsubishi Heavy Industries Technical Information Retrieval System (called MARON) which started its service in May, 1985. The system deals with databases which cover information common to the whole company (in-house research and technical reports, holding information of books, journals and so on), and local information held in each business division or department. Anybody from any division can access to these databases through the company-wide network. The in-house interlibrary loan subsystem called Orderentry is available, which supports acquiring operation of original materials.

  18. Data base management system for lymphatic filariasis--a neglected tropical disease.

    PubMed

    Upadhyayula, Suryanaryana Murty; Mutheneni, Srinivasa Rao; Kadiri, Madhusudhan Rao; Kumaraswamy, Sriram; Nelaturu, Sarat Chandra Babu

    2012-01-01

    Researchers working in the area of Public Health are being confronted with large volumes of data on various aspects of entomology and epidemiology. To obtain the relevant information out of these data requires particular database management system. In this paper, we have described about the usages of our developed database on lymphatic filariasis. This database application is developed using Model View Controller (MVC) architecture, with MySQL as database and a web based interface. We have collected and incorporated the data on filariasis in the database from Karimnagar, Chittoor, East and West Godavari districts of Andhra Pradesh, India. The importance of this database is to store the collected data, retrieve the information and produce various combinational reports on filarial aspects which in turn will help the public health officials to understand the burden of disease in a particular locality. This information is likely to have an imperative role on decision making for effective control of filarial disease and integrated vector management operations.

  19. Impact of line parameter database and continuum absorption on GOSAT TIR methane retrieval

    NASA Astrophysics Data System (ADS)

    Yamada, A.; Saitoh, N.; Nonogaki, R.; Imasu, R.; Shiomi, K.; Kuze, A.

    2017-12-01

    The current methane retrieval algorithm (V1) at wavenumber range from 1210 cm-1 to 1360 cm-1 including CH4 ν 4 band from the thermal infrared (TIR) band of Thermal and Near-infrared Sensor for Carbon Observation Fourier Transform Spectrometer (TANSO-FTS) onboard Greenhouse Gases Observing Satellite (GOSAT) uses LBLRTM V12.1 with AER V3.1 line database and MT CKD 2.5.2 continuum absorption model to calculate optical depth. Since line parameter databases have been updated and the continuum absorption may have large uncertainty, the purpose of this study is to assess the impact on {CH}4 retrieval from the choice of line parameter databases and the uncertainty of continuum absorption. We retrieved {CH}4 profiles with replacement of line parameter database from AER V3.1 to AER v1.0, HITRAN 2004, HITRAN 2008, AER V3.2, or HITRAN 2012 (Rothman et al. 2005, 2009, and 2013. Clough et al., 2005), we assumed 10% larger continuum absorption coefficients and 50% larger temperature dependent coefficient of continuum absorption based on the report by Paynter and Ramaswamy (2014). We compared the retrieved CH4 with the HIPPO CH4 observation (Wofsy et al., 2012). The difference from HIPPO observation of AER V3.2 was the smallest and 24.1 ± 45.9 ppbv. The differences of AER V1.0, HITRAN 2004, HITRAN 2008, and HITRAN 2012 were 35.6 ± 46.5 ppbv, 37.6 ± 46.3 ppbv, 32.1 ± 46.1 ppbv, and 35.2 ± 46.0 ppbv, respectively. Maximum {CH}4 retrieval differences were -0.4 ppbv at the layer of 314 hPa when we used 10% larger absorption coefficients of {H}2O foreign continuum. Comparing AER V3.2 case to HITRAN 2008 case, the line coupling effect reduced difference by 8.0 ppbv. Line coupling effects were important for GOSAT TIR {CH}4 retrieval. Effects from the uncertainty of continuum absorption were negligible small for GOSAT TIR CH4 retrieval.

  20. The EMBL nucleotide sequence database

    PubMed Central

    Stoesser, Guenter; Baker, Wendy; van den Broek, Alexandra; Camon, Evelyn; Garcia-Pastor, Maria; Kanz, Carola; Kulikova, Tamara; Lombard, Vincent; Lopez, Rodrigo; Parkinson, Helen; Redaschi, Nicole; Sterk, Peter; Stoehr, Peter; Tuli, Mary Ann

    2001-01-01

    The EMBL Nucleotide Sequence Database (http://www.ebi.ac.uk/embl/) is maintained at the European Bioinformatics Institute (EBI) in an international collaboration with the DNA Data Bank of Japan (DDBJ) and GenBank at the NCBI (USA). Data is exchanged amongst the collaborating databases on a daily basis. The major contributors to the EMBL database are individual authors and genome project groups. Webin is the preferred web-based submission system for individual submitters, whilst automatic procedures allow incorporation of sequence data from large-scale genome sequencing centres and from the European Patent Office (EPO). Database releases are produced quarterly. Network services allow free access to the most up-to-date data collection via ftp, email and World Wide Web interfaces. EBI’s Sequence Retrieval System (SRS), a network browser for databanks in molecular biology, integrates and links the main nucleotide and protein databases plus many specialized databases. For sequence similarity searching a variety of tools (e.g. Blitz, Fasta, BLAST) are available which allow external users to compare their own sequences against the latest data in the EMBL Nucleotide Sequence Database and SWISS-PROT. PMID:11125039

  1. Controlled and Uncontrolled Subject Descriptions in the CF Database: A Comparison of Optimal Cluster-Based Retrieval Results.

    ERIC Educational Resources Information Center

    Shaw, W. M., Jr.

    1993-01-01

    Describes a study conducted on the cystic fibrosis (CF) database, a subset of MEDLINE, that investigated clustering structure and the effectiveness of cluster-based retrieval as a function of the exhaustivity of the uncontrolled subject descriptions. Results are compared to calculations for controlled descriptions based on Medical Subject Headings…

  2. Information Retrieval Center: A Proposal for the Implementation of CD-ROM Database Technology at Memphis State University Libraries.

    ERIC Educational Resources Information Center

    Evans, John; Park, Betsy

    This planning proposal recommends that Memphis State University Libraries make information on CD-ROM (compact disc--read only memory) available in the Reference Department by establishing an Information Retrieval Center (IRC). Following a brief introduction and statement of purpose, the library's databases, users, staffing, facilities, and…

  3. Fault-tolerant symmetrically-private information retrieval

    NASA Astrophysics Data System (ADS)

    Wang, Tian-Yin; Cai, Xiao-Qiu; Zhang, Rui-Ling

    2016-08-01

    We propose two symmetrically-private information retrieval protocols based on quantum key distribution, which provide a good degree of database and user privacy while being flexible, loss-resistant and easily generalized to a large database similar to the precedent works. Furthermore, one protocol is robust to a collective-dephasing noise, and the other is robust to a collective-rotation noise.

  4. Design of Content Based Image Retrieval Scheme for Diabetic Retinopathy Images using Harmony Search Algorithm.

    PubMed

    Sivakamasundari, J; Natarajan, V

    2015-01-01

    Diabetic Retinopathy (DR) is a disorder that affects the structure of retinal blood vessels due to long-standing diabetes mellitus. Automated segmentation of blood vessel is vital for periodic screening and timely diagnosis. An attempt has been made to generate continuous retinal vasculature for the design of Content Based Image Retrieval (CBIR) application. The typical normal and abnormal retinal images are preprocessed to improve the vessel contrast. The blood vessels are segmented using evolutionary based Harmony Search Algorithm (HSA) combined with Otsu Multilevel Thresholding (MLT) method by best objective functions. The segmentation results are validated with corresponding ground truth images using binary similarity measures. The statistical, textural and structural features are obtained from the segmented images of normal and DR affected retina and are analyzed. CBIR in medical image retrieval applications are used to assist physicians in clinical decision-support techniques and research fields. A CBIR system is developed using HSA based Otsu MLT segmentation technique and the features obtained from the segmented images. Similarity matching is carried out between the features of query and database images using Euclidean Distance measure. Similar images are ranked and retrieved. The retrieval performance of CBIR system is evaluated in terms of precision and recall. The CBIR systems developed using HSA based Otsu MLT and conventional Otsu MLT methods are compared. The retrieval performance such as precision and recall are found to be 96% and 58% for CBIR system using HSA based Otsu MLT segmentation. This automated CBIR system could be recommended for use in computer assisted diagnosis for diabetic retinopathy screening.

  5. An Intelligent Information System for forest management: NED/FVS integration

    Treesearch

    J. Wang; W.D. Potter; D. Nute; F. Maier; H. Michael Rauscher; M.J. Twery; S. Thomasma; P. Knopp

    2002-01-01

    An Intelligent Information System (IIS) is viewed as composed of a unified knowledge base, database, and model base. This allows an IIS to provide responses to user queries regardless of whether the query process involves a data retrieval, an inference, a computational method, a problem solving module, or some combination of these. NED-2 is a full-featured intelligent...

  6. Computer Assisted Diagnostic Prescriptive Program in Reading and Mathematics. An Exemplary Micro-Computer Program and a Developer/Demonstrator Project, National Diffusion Network.

    ERIC Educational Resources Information Center

    Roberson, E. Wayne; Glowinski, Debra J.

    The Computer Assisted Diagnostic Prescriptive Program (CADPP) is a customized databased curriculum management system which permits the user to load the following into a filing/retrieval software system: (1) learning characteristics of individual students (e.g., age, instructional level, learning modality); (2) skill-oriented characteristics of…

  7. Baseline and extensions approach to information retrieval of complex medical data: Poznan's approach to the bioCADDIE 2016

    PubMed Central

    Cieslewicz, Artur; Dutkiewicz, Jakub; Jedrzejek, Czeslaw

    2018-01-01

    Abstract Information retrieval from biomedical repositories has become a challenging task because of their increasing size and complexity. To facilitate the research aimed at improving the search for relevant documents, various information retrieval challenges have been launched. In this article, we present the improved medical information retrieval systems designed by Poznan University of Technology and Poznan University of Medical Sciences as a contribution to the bioCADDIE 2016 challenge—a task focusing on information retrieval from a collection of 794 992 datasets generated from 20 biomedical repositories. The system developed by our team utilizes the Terrier 4.2 search platform enhanced by a query expansion method using word embeddings. This approach, after post-challenge modifications and improvements (with particular regard to assigning proper weights for original and expanded terms), allowed us achieving the second best infNDCG measure (0.4539) compared with the challenge results and infAP 0.3978. This demonstrates that proper utilization of word embeddings can be a valuable addition to the information retrieval process. Some analysis is provided on related work involving other bioCADDIE contributions. We discuss the possibility of improving our results by using better word embedding schemes to find candidates for query expansion. Database URL: https://biocaddie.org/benchmark-data PMID:29688372

  8. A digital library for medical imaging activities

    NASA Astrophysics Data System (ADS)

    dos Santos, Marcelo; Furuie, Sérgio S.

    2007-03-01

    This work presents the development of an electronic infrastructure to make available a free, online, multipurpose and multimodality medical image database. The proposed infrastructure implements a distributed architecture for medical image database, authoring tools, and a repository for multimedia documents. Also it includes a peer-reviewed model that assures quality of dataset. This public repository provides a single point of access for medical images and related information to facilitate retrieval tasks. The proposed approach has been used as an electronic teaching system in Radiology as well.

  9. Case-based fracture image retrieval.

    PubMed

    Zhou, Xin; Stern, Richard; Müller, Henning

    2012-05-01

    Case-based fracture image retrieval can assist surgeons in decisions regarding new cases by supplying visually similar past cases. This tool may guide fracture fixation and management through comparison of long-term outcomes in similar cases. A fracture image database collected over 10 years at the orthopedic service of the University Hospitals of Geneva was used. This database contains 2,690 fracture cases associated with 43 classes (based on the AO/OTA classification). A case-based retrieval engine was developed and evaluated using retrieval precision as a performance metric. Only cases in the same class as the query case are considered as relevant. The scale-invariant feature transform (SIFT) is used for image analysis. Performance evaluation was computed in terms of mean average precision (MAP) and early precision (P10, P30). Retrieval results produced with the GNU image finding tool (GIFT) were used as a baseline. Two sampling strategies were evaluated. One used a dense 40 × 40 pixel grid sampling, and the second one used the standard SIFT features. Based on dense pixel grid sampling, three unsupervised feature selection strategies were introduced to further improve retrieval performance. With dense pixel grid sampling, the image is divided into 1,600 (40 × 40) square blocks. The goal is to emphasize the salient regions (blocks) and ignore irrelevant regions. Regions are considered as important when a high variance of the visual features is found. The first strategy is to calculate the variance of all descriptors on the global database. The second strategy is to calculate the variance of all descriptors for each case. A third strategy is to perform a thumbnail image clustering in a first step and then to calculate the variance for each cluster. Finally, a fusion between a SIFT-based system and GIFT is performed. A first comparison on the selection of sampling strategies using SIFT features shows that dense sampling using a pixel grid (MAP = 0.18) outperformed the SIFT detector-based sampling approach (MAP = 0.10). In a second step, three unsupervised feature selection strategies were evaluated. A grid parameter search is applied to optimize parameters for feature selection and clustering. Results show that using half of the regions (700 or 800) obtains the best performance for all three strategies. Increasing the number of clusters in clustering can also improve the retrieval performance. The SIFT descriptor variance in each case gave the best indication of saliency for the regions (MAP = 0.23), better than the other two strategies (MAP = 0.20 and 0.21). Combining GIFT (MAP = 0.23) and the best SIFT strategy (MAP = 0.23) produced significantly better results (MAP = 0.27) than each system alone. A case-based fracture retrieval engine was developed and is available for online demonstration. SIFT is used to extract local features, and three feature selection strategies were introduced and evaluated. A baseline using the GIFT system was used to evaluate the salient point-based approaches. Without supervised learning, SIFT-based systems with optimized parameters slightly outperformed the GIFT system. A fusion of the two approaches shows that the information contained in the two approaches is complementary. Supervised learning on the feature space is foreseen as the next step of this study.

  10. KA-SB: from data integration to large scale reasoning

    PubMed Central

    Roldán-García, María del Mar; Navas-Delgado, Ismael; Kerzazi, Amine; Chniber, Othmane; Molina-Castro, Joaquín; Aldana-Montes, José F

    2009-01-01

    Background The analysis of information in the biological domain is usually focused on the analysis of data from single on-line data sources. Unfortunately, studying a biological process requires having access to disperse, heterogeneous, autonomous data sources. In this context, an analysis of the information is not possible without the integration of such data. Methods KA-SB is a querying and analysis system for final users based on combining a data integration solution with a reasoner. Thus, the tool has been created with a process divided into two steps: 1) KOMF, the Khaos Ontology-based Mediator Framework, is used to retrieve information from heterogeneous and distributed databases; 2) the integrated information is crystallized in a (persistent and high performance) reasoner (DBOWL). This information could be further analyzed later (by means of querying and reasoning). Results In this paper we present a novel system that combines the use of a mediation system with the reasoning capabilities of a large scale reasoner to provide a way of finding new knowledge and of analyzing the integrated information from different databases, which is retrieved as a set of ontology instances. This tool uses a graphical query interface to build user queries easily, which shows a graphical representation of the ontology and allows users o build queries by clicking on the ontology concepts. Conclusion These kinds of systems (based on KOMF) will provide users with very large amounts of information (interpreted as ontology instances once retrieved), which cannot be managed using traditional main memory-based reasoners. We propose a process for creating persistent and scalable knowledgebases from sets of OWL instances obtained by integrating heterogeneous data sources with KOMF. This process has been applied to develop a demo tool , which uses the BioPax Level 3 ontology as the integration schema, and integrates UNIPROT, KEGG, CHEBI, BRENDA and SABIORK databases. PMID:19796402

  11. WE-E-BRB-11: Riview a Web-Based Viewer for Radiotherapy.

    PubMed

    Apte, A; Wang, Y; Deasy, J

    2012-06-01

    Collaborations involving radiotherapy data collection, such as the recently proposed international radiogenomics consortium, require robust, web-based tools to facilitate reviewing treatment planning information. We present the architecture and prototype characteristics for a web-based radiotherapy viewer. The web-based environment developed in this work consists of the following components: 1) Import of DICOM/RTOG data: CERR was leveraged to import DICOM/RTOG data and to convert to database friendly RT objects. 2) Extraction and Storage of RT objects: The scan and dose distributions were stored as .png files per slice and view plane. The file locations were written to the MySQL database. Structure contours and DVH curves were written to the database as numeric data. 3) Web interfaces to query, retrieve and visualize the RT objects: The Web application was developed using HTML 5 and Ruby on Rails (RoR) technology following the MVC philosophy. The open source ImageMagick library was utilized to overlay scan, dose and structures. The application allows users to (i) QA the treatment plans associated with a study, (ii) Query and Retrieve patients matching anonymized ID and study, (iii) Review up to 4 plans simultaneously in 4 window panes (iv) Plot DVH curves for the selected structures and dose distributions. A subset of data for lung cancer patients was used to prototype the system. Five user accounts were created to have access to this study. The scans, doses, structures and DVHs for 10 patients were made available via the web application. A web-based system to facilitate QA, and support Query, Retrieve and the Visualization of RT data was prototyped. The RIVIEW system was developed using open source and free technology like MySQL and RoR. We plan to extend the RIVIEW system further to be useful in clinical trial data collection, outcomes research, cohort plan review and evaluation. © 2012 American Association of Physicists in Medicine.

  12. Data model and relational database design for the New Jersey Water-Transfer Data System (NJWaTr)

    USGS Publications Warehouse

    Tessler, Steven

    2003-01-01

    The New Jersey Water-Transfer Data System (NJWaTr) is a database design for the storage and retrieval of water-use data. NJWaTr can manage data encompassing many facets of water use, including (1) the tracking of various types of water-use activities (withdrawals, returns, transfers, distributions, consumptive-use, wastewater collection, and treatment); (2) the storage of descriptions, classifications and locations of places and organizations involved in water-use activities; (3) the storage of details about measured or estimated volumes of water associated with water-use activities; and (4) the storage of information about data sources and water resources associated with water use. In NJWaTr, each water transfer occurs unidirectionally between two site objects, and the sites and conveyances form a water network. The core entities in the NJWaTr model are site, conveyance, transfer/volume, location, and owner. Other important entities include water resource (used for withdrawals and returns), data source, permit, and alias. Multiple water-exchange estimates based on different methods or data sources can be stored for individual transfers. Storage of user-defined details is accommodated for several of the main entities. Many tables contain classification terms to facilitate the detailed description of data items and can be used for routine or custom data summarization. NJWaTr accommodates single-user and aggregate-user water-use data, can be used for large or small water-network projects, and is available as a stand-alone Microsoft? Access database. Data stored in the NJWaTr structure can be retrieved in user-defined combinations to serve visualization and analytical applications. Users can customize and extend the database, link it to other databases, or implement the design in other relational database applications.

  13. The EXOSAT database and archive

    NASA Technical Reports Server (NTRS)

    Reynolds, A. P.; Parmar, A. N.

    1992-01-01

    The EXOSAT database provides on-line access to the results and data products (spectra, images, and lightcurves) from the EXOSAT mission as well as access to data and logs from a number of other missions (such as EINSTEIN, COS-B, ROSAT, and IRAS). In addition, a number of familiar optical, infrared, and x ray catalogs, including the Hubble Space Telescope (HST) guide star catalog are available. The complete database is located at the EXOSAT observatory at ESTEC in the Netherlands and is accessible remotely via a captive account. The database management system was specifically developed to efficiently access the database and to allow the user to perform statistical studies on large samples of astronomical objects as well as to retrieve scientific and bibliographic information on single sources. The system was designed to be mission independent and includes timing, image processing, and spectral analysis packages as well as software to allow the easy transfer of analysis results and products to the user's own institute. The archive at ESTEC comprises a subset of the EXOSAT observations, stored on magnetic tape. Observations of particular interest were copied in compressed format to an optical jukebox, allowing users to retrieve and analyze selected raw data entirely from their terminals. Such analysis may be necessary if the user's needs are not accommodated by the products contained in the database (in terms of time resolution, spectral range, and the finesse of the background subtraction, for instance). Long-term archiving of the full final observation data is taking place at ESRIN in Italy as part of the ESIS program, again using optical media, and ESRIN have now assumed responsibility for distributing the data to the community. Tests showed that raw observational data (typically several tens of megabytes for a single target) can be transferred via the existing networks in reasonable time.

  14. CHERNOLITTM. Chernobyl Bibliographic Search System

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Caff, F., Jr.; Kennedy, R.A.; Mahaffey, J.A.

    1992-03-02

    The Chernobyl Bibliographic Search System (Chernolit TM) provides bibliographic data in a usable format for research studies relating to the Chernobyl nuclear accident that occurred in the former Ukrainian Republic of the USSR in 1986. Chernolit TM is a portable and easy to use product. The bibliographic data is provided under the control of a graphical user interface so that the user may quickly and easily retrieve pertinent information from the large database. The user may search the database for occurrences of words, names, or phrases; view bibliographic references on screen; and obtain reports of selected references. Reports may bemore » viewed on the screen, printed, or accumulated in a folder that is written to a disk file when the user exits the software. Chernolit TM provides a cost-effective alternative to multiple, independent literature searches. Forty-five hundred references concerning the accident, including abstracts, are distributed with Chernolit TM. The data contained in the database were obtained from electronic literature searches and from requested donations from individuals and organizations. These literature searches interrogated the Energy Science and Technology database (formerly DOE ENERGY) of the DIALOG Information Retrieval Service. Energy Science and Technology, provided by the U.S. DOE, Washington, D.C., is a multi-disciplinary database containing references to the world`s scientific and technical literature on energy. All unclassified information processed at the Office of Scientific and Technical Information (OSTI) of the U.S. DOE is included in the database. In addition, information on many documents has been manually added to Chernolit TM. Most of this information was obtained in response to requests for data sent to people and/or organizations throughout the world.« less

  15. Chernobyl Bibliographic Search System

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Carr, Jr, F.; Kennedy, R. A.; Mahaffey, J. A.

    1992-05-11

    The Chernobyl Bibliographic Search System (Chernolit TM) provides bibliographic data in a usable format for research studies relating to the Chernobyl nuclear accident that occurred in the former Ukrainian Republic of the USSR in 1986. Chernolit TM is a portable and easy to use product. The bibliographic data is provided under the control of a graphical user interface so that the user may quickly and easily retrieve pertinent information from the large database. The user may search the database for occurrences of words, names, or phrases; view bibliographic references on screen; and obtain reports of selected references. Reports may bemore » viewed on the screen, printed, or accumulated in a folder that is written to a disk file when the user exits the software. Chernolit TM provides a cost-effective alternative to multiple, independent literature searches. Forty-five hundred references concerning the accident, including abstracts, are distributed with Chernolit TM. The data contained in the database were obtained from electronic literature searches and from requested donations from individuals and organizations. These literature searches interrogated the Energy Science and Technology database (formerly DOE ENERGY) of the DIALOG Information Retrieval Service. Energy Science and Technology, provided by the U.S. DOE, Washington, D.C., is a multi-disciplinary database containing references to the world''s scientific and technical literature on energy. All unclassified information processed at the Office of Scientific and Technical Information (OSTI) of the U.S. DOE is included in the database. In addition, information on many documents has been manually added to Chernolit TM. Most of this information was obtained in response to requests for data sent to people and/or organizations throughout the world.« less

  16. Computerization of Library and Information Services in Mainland China.

    ERIC Educational Resources Information Center

    Lin, Sharon Chien

    1994-01-01

    Describes two phases of the automation of library and information services in mainland China. From 1974-86, much effort was concentrated on developing computer systems, databases, online retrieval, and networking. From 1986 to the present, practical progress became possible largely because of CD-ROM technology; and large scale networking for…

  17. Semi-Automatic Determination of Citation Relevancy: User Evaluation.

    ERIC Educational Resources Information Center

    Huffman, G. David

    1990-01-01

    Discussion of online bibliographic database searches focuses on a software system, SORT-AID/SABRE, that ranks retrieved citations in terms of relevance. Results of a comprehensive user evaluation of the relevance ranking procedure to determine its effectiveness are presented, and implications for future work are suggested. (10 references) (LRW)

  18. First Toronto Conference on Database Users. Systems that Enhance User Performance.

    ERIC Educational Resources Information Center

    Doszkocs, Tamas E.; Toliver, David

    1987-01-01

    The first of two papers discusses natural language searching as a user performance enhancement tool, focusing on artificial intelligence applications for information retrieval and problems with natural language processing. The second presents a conceptual framework for further development and future design of front ends to online bibliographic…

  19. Integrating a local database into the StarView distributed user interface

    NASA Technical Reports Server (NTRS)

    Silberberg, D. P.

    1992-01-01

    A distributed user interface to the Space Telescope Data Archive and Distribution Service (DADS) known as StarView is being developed. The DADS architecture consists of the data archive as well as a relational database catalog describing the archive. StarView is a client/server system in which the user interface is the front-end client to the DADS catalog and archive servers. Users query the DADS catalog from the StarView interface. Query commands are transmitted via a network and evaluated by the database. The results are returned via the network and are displayed on StarView forms. Based on the results, users decide which data sets to retrieve from the DADS archive. Archive requests are packaged by StarView and sent to DADS, which returns the requested data sets to the users. The advantages of distributed client/server user interfaces over traditional one-machine systems are well known. Since users run software on machines separate from the database, the overall client response time is much faster. Also, since the server is free to process only database requests, the database response time is much faster. Disadvantages inherent in this architecture are slow overall database access time due to the network delays, lack of a 'get previous row' command, and that refinements of a previously issued query must be submitted to the database server, even though the domain of values have already been returned by the previous query. This architecture also does not allow users to cross correlate DADS catalog data with other catalogs. Clearly, a distributed user interface would be more powerful if it overcame these disadvantages. A local database is being integrated into StarView to overcome these disadvantages. When a query is made through a StarView form, which is often composed of fields from multiple tables, it is translated to an SQL query and issued to the DADS catalog. At the same time, a local database table is created to contain the resulting rows of the query. The returned rows are displayed on the form as well as inserted into the local database table. Identical results are produced by reissuing the query to either the DADS catalog or to the local table. Relational databases do not provide a 'get previous row' function because of the inherent complexity of retrieving previous rows of multiple-table joins. However, since this function is easily implemented on a single table, StarView uses the local table to retrieve the previous row. Also, StarView issues subsequent query refinements to the local table instead of the DADS catalog, eliminating the network transmission overhead. Finally, other catalogs can be imported into the local database for cross correlation with local tables. Overall, it is believe that this is a more powerful architecture for distributed, database user interfaces.

  20. The design and implementation of image query system based on color feature

    NASA Astrophysics Data System (ADS)

    Yao, Xu-Dong; Jia, Da-Chun; Li, Lin

    2013-07-01

    ASP.NET technology was used to construct the B/S mode image query system. The theory and technology of database design, color feature extraction from image, index and retrieval in the construction of the image repository were researched. The campus LAN and WAN environment were used to test the system. From the test results, the needs of user queries about related resources were achieved by system architecture design.

  1. Sparse Representation of Multimodality Sensing Databases for Data Mining and Retrieval

    DTIC Science & Technology

    2015-04-09

    Savarese. Estimating the Aspect Layout of Object Categories, EEE Conference on Computer Vision and Pattern Recognition (CVPR). 19-JUN-12...Time Equivalent (FTE) support provided by this agreement, and total for each category): (a) Graduate Students Liang Mei, 50% FTE, EE : systems...PhD candidate Min Sun, 50% FTE, EE : systems, PhD candidate Yu Xiang, 50% FTE, EE : systems, PhD candidate Dae Yon Jung, 50% FTE, EE : systems, PhD

  2. SORTEZ: a relational translator for NCBI's ASN.1 database.

    PubMed

    Hart, K W; Searls, D B; Overton, G C

    1994-07-01

    The National Center for Biotechnology Information (NCBI) has created a database collection that includes several protein and nucleic acid sequence databases, a biosequence-specific subset of MEDLINE, as well as value-added information such as links between similar sequences. Information in the NCBI database is modeled in Abstract Syntax Notation 1 (ASN.1) an Open Systems Interconnection protocol designed for the purpose of exchanging structured data between software applications rather than as a data model for database systems. While the NCBI database is distributed with an easy-to-use information retrieval system, ENTREZ, the ASN.1 data model currently lacks an ad hoc query language for general-purpose data access. For that reason, we have developed a software package, SORTEZ, that transforms the ASN.1 database (or other databases with nested data structures) to a relational data model and subsequently to a relational database management system (Sybase) where information can be accessed through the relational query language, SQL. Because the need to transform data from one data model and schema to another arises naturally in several important contexts, including efficient execution of specific applications, access to multiple databases and adaptation to database evolution this work also serves as a practical study of the issues involved in the various stages of database transformation. We show that transformation from the ASN.1 data model to a relational data model can be largely automated, but that schema transformation and data conversion require considerable domain expertise and would greatly benefit from additional support tools.

  3. Multimedia human brain database system for surgical candidacy determination in temporal lobe epilepsy with content-based image retrieval

    NASA Astrophysics Data System (ADS)

    Siadat, Mohammad-Reza; Soltanian-Zadeh, Hamid; Fotouhi, Farshad A.; Elisevich, Kost

    2003-01-01

    This paper presents the development of a human brain multimedia database for surgical candidacy determination in temporal lobe epilepsy. The focus of the paper is on content-based image management, navigation and retrieval. Several medical image-processing methods including our newly developed segmentation method are utilized for information extraction/correlation and indexing. The input data includes T1-, T2-Weighted MRI and FLAIR MRI and ictal and interictal SPECT modalities with associated clinical data and EEG data analysis. The database can answer queries regarding issues such as the correlation between the attribute X of the entity Y and the outcome of a temporal lobe epilepsy surgery. The entity Y can be a brain anatomical structure such as the hippocampus. The attribute X can be either a functionality feature of the anatomical structure Y, calculated with SPECT modalities, such as signal average, or a volumetric/morphological feature of the entity Y such as volume or average curvature. The outcome of the surgery can be any surgery assessment such as memory quotient. A determination is made regarding surgical candidacy by analysis of both textual and image data. The current database system suggests a surgical determination for the cases with relatively small hippocampus and high signal intensity average on FLAIR images within the hippocampus. This indication pretty much fits with the surgeons" expectations/observations. Moreover, as the database gets more populated with patient profiles and individual surgical outcomes, using data mining methods one may discover partially invisible correlations between the contents of different modalities of data and the outcome of the surgery.

  4. Unsupervised symmetrical trademark image retrieval in soccer telecast using wavelet energy and quadtree decomposition

    NASA Astrophysics Data System (ADS)

    Ong, Swee Khai; Lim, Wee Keong; Soo, Wooi King

    2013-04-01

    Trademark, a distinctive symbol, is used to distinguish products or services provided by a particular person, group or organization from other similar entries. As trademark represents the reputation and credit standing of the owner, it is important to differentiate one trademark from another. Many methods have been proposed to identify, classify and retrieve trademarks. However, most methods required features database and sample sets for training prior to recognition and retrieval process. In this paper, a new feature on wavelet coefficients, the localized wavelet energy, is introduced to extract features of trademarks. With this, unsupervised content-based symmetrical trademark image retrieval is proposed without the database and prior training set. The feature analysis is done by an integration of the proposed localized wavelet energy and quadtree decomposed regional symmetrical vector. The proposed framework eradicates the dependence on query database and human participation during the retrieval process. In this paper, trademarks for soccer games sponsors are the intended trademark category. Video frames from soccer telecast are extracted and processed for this study. Reasonably good localization and retrieval results on certain categories of trademarks are achieved. A distinctive symbol is used to distinguish products or services provided by a particular person, group or organization from other similar entries.

  5. Fast, axis-agnostic, dynamically summarized storage and retrieval for mass spectrometry data.

    PubMed

    Handy, Kyle; Rosen, Jebediah; Gillan, André; Smith, Rob

    2017-01-01

    Mass spectrometry, a popular technique for elucidating the molecular contents of experimental samples, creates data sets comprised of millions of three-dimensional (m/z, retention time, intensity) data points that correspond to the types and quantities of analyzed molecules. Open and commercial MS data formats are arranged by retention time, creating latency when accessing data across multiple m/z. Existing MS storage and retrieval methods have been developed to overcome the limitations of retention time-based data formats, but do not provide certain features such as dynamic summarization and storage and retrieval of point meta-data (such as signal cluster membership), precluding efficient viewing applications and certain data-processing approaches. This manuscript describes MzTree, a spatial database designed to provide real-time storage and retrieval of dynamically summarized standard and augmented MS data with fast performance in both m/z and RT directions. Performance is reported on real data with comparisons against related published retrieval systems.

  6. SIMS: addressing the problem of heterogeneity in databases

    NASA Astrophysics Data System (ADS)

    Arens, Yigal

    1997-02-01

    The heterogeneity of remotely accessible databases -- with respect to contents, query language, semantics, organization, etc. -- presents serious obstacles to convenient querying. The SIMS (single interface to multiple sources) system addresses this global integration problem. It does so by defining a single language for describing the domain about which information is stored in the databases and using this language as the query language. Each database to which SIMS is to provide access is modeled using this language. The model describes a database's contents, organization, and other relevant features. SIMS uses these models, together with a planning system drawing on techniques from artificial intelligence, to decompose a given user's high-level query into a series of queries against the databases and other data manipulation steps. The retrieval plan is constructed so as to minimize data movement over the network and maximize parallelism to increase execution speed. SIMS can recover from network failures during plan execution by obtaining data from alternate sources, when possible. SIMS has been demonstrated in the domains of medical informatics and logistics, using real databases.

  7. CliniWeb: managing clinical information on the World Wide Web.

    PubMed

    Hersh, W R; Brown, K E; Donohoe, L C; Campbell, E M; Horacek, A E

    1996-01-01

    The World Wide Web is a powerful new way to deliver on-line clinical information, but several problems limit its value to health care professionals: content is highly distributed and difficult to find, clinical information is not separated from non-clinical information, and the current Web technology is unable to support some advanced retrieval capabilities. A system called CliniWeb has been developed to address these problems. CliniWeb is an index to clinical information on the World Wide Web, providing a browsing and searching interface to clinical content at the level of the health care student or provider. Its database contains a list of clinical information resources on the Web that are indexed by terms from the Medical Subject Headings disease tree and retrieved with the assistance of SAPHIRE. Limitations of the processes used to build the database are discussed, together with directions for future research.

  8. Document creation, linking, and maintenance system

    DOEpatents

    Claghorn, Ronald [Pasco, WA

    2011-02-15

    A document creation and citation system designed to maintain a database of reference documents. The content of a selected document may be automatically scanned and indexed by the system. The selected documents may also be manually indexed by a user prior to the upload. The indexed documents may be uploaded and stored within a database for later use. The system allows a user to generate new documents by selecting content within the reference documents stored within the database and inserting the selected content into a new document. The system allows the user to customize and augment the content of the new document. The system also generates citations to the selected content retrieved from the reference documents. The citations may be inserted into the new document in the appropriate location and format, as directed by the user. The new document may be uploaded into the database and included with the other reference documents. The system also maintains the database of reference documents so that when changes are made to a reference document, the author of a document referencing the changed document will be alerted to make appropriate changes to his document. The system also allows visual comparison of documents so that the user may see differences in the text of the documents.

  9. Use of a Relational Database to Support Clinical Research: Application in a Diabetes Program

    PubMed Central

    Lomatch, Diane; Truax, Terry; Savage, Peter

    1981-01-01

    A database has been established to support conduct of clinical research and monitor delivery of medical care for 1200 diabetic patients as part of the Michigan Diabetes Research and Training Center (MDRTC). Use of an intelligent microcomputer to enter and retrieve the data and use of a relational database management system (DBMS) to store and manage data have provided a flexible, efficient method of achieving both support of small projects and monitoring overall activity of the Diabetes Center Unit (DCU). Simplicity of access to data, efficiency in providing data for unanticipated requests, ease of manipulations of relations, security and “logical data independence” were important factors in choosing a relational DBMS. The ability to interface with an interactive statistical program and a graphics program is a major advantage of this system. Out database currently provides support for the operation and analysis of several ongoing research projects.

  10. Coordinating Council. First Meeting: NASA/RECON database

    NASA Technical Reports Server (NTRS)

    1990-01-01

    A Council of NASA Headquarters, American Institute of Aeronautics and Astronautics (AIAA), and the NASA Scientific and Technical Information (STI) Facility management met (1) to review and discuss issues of NASA concern, and (2) to promote new and better ways to collect and disseminate scientific and technical information. Topics mentioned for study and discussion at subsequent meetings included the pros and cons of transferring the NASA/RECON database to the commercial sector, the quality of the database, and developing ways to increase foreign acquisitions. The input systems at AIAA and the STI Facility were described. Also discussed were the proposed RECON II retrieval system, the transmittal of document orders received by the Facility and sent to AIAA, and the handling of multimedia input by the Departments of Defense and Commerce. A second meeting was scheduled for six weeks later to discuss database quality and international foreign input.

  11. SPARCCS - Smartphone-Assisted Readiness, Command and Control System

    DTIC Science & Technology

    2012-06-01

    and database needs. By doing this SPARCCS takes advantage of all the capabilities cloud computing has to offer, especially that of disbursed data...40092829/ Microsoft. (2011). Cloud Computing . Retrieved September 24, 2011, http ://www.microsoft.com/industry/government/guides/cloud_computing/2...Command, and Control System) to address these issues. We use smartphones in conjunction with cloud computing to extend the benefits of collaborative

  12. SPLICE: A program to assemble partial query solutions from three-dimensional database searches into novel ligands

    NASA Astrophysics Data System (ADS)

    Ho, Chris M. W.; Marshall, Garland R.

    1993-12-01

    SPLICE is a program that processes partial query solutions retrieved from 3D, structural databases to generate novel, aggregate ligands. It is designed to interface with the database searching program FOUNDATION, which retrieves fragments containing any combination of a user-specified minimum number of matching query elements. SPLICE eliminates aspects of structures that are physically incapable of binding within the active site. Then, a systematic rule-based procedure is performed upon the remaining fragments to ensure receptor complementarity. All modifications are automated and remain transparent to the user. Ligands are then assembled by linking components into composite structures through overlapping bonds. As a control experiment, FOUNDATION and SPLICE were used to reconstruct a know HIV-1 protease inhibitor after it had been fragmented, reoriented, and added to a sham database of fifty different small molecules. To illustrate the capabilities of this program, a 3D search query containing the pharmacophoric elements of an aspartic proteinase-inhibitor crystal complex was searched using FOUNDATION against a subset of the Cambridge Structural Database. One hundred thirty-one compounds were retrieved, each containing any combination of at least four query elements. Compounds were automatically screened and edited for receptor complementarity. Numerous combinations of fragments were discovered that could be linked to form novel structures, containing a greater number of pharmacophoric elements than any single retrieved fragment.

  13. Image-guided decision support system for pulmonary nodule classification in 3D thoracic CT images

    NASA Astrophysics Data System (ADS)

    Kawata, Yoshiki; Niki, Noboru; Ohmatsu, Hironobu; Kusumoto, Masahiro; Kakinuma, Ryutaro; Mori, Kiyoshi; Yamada, Kozo; Nishiyama, Hiroyuki; Eguchi, Kenji; Kaneko, Masahiro; Moriyama, Noriyuki

    2004-05-01

    The purpose of this study is to develop an image-guided decision support system that assists decision-making in clinical differential diagnosis of pulmonary nodules. This approach retrieves and displays nodules that exhibit morphological and internal profiles consistent to the nodule in question. It uses a three-dimensional (3-D) CT image database of pulmonary nodules for which diagnosis is known. In order to build the system, there are following issues that should be solved: 1) to categorize the nodule database with respect to morphological and internal features, 2) to quickly search nodule images similar to an indeterminate nodule from a large database, and 3) to reveal malignancy likelihood computed by using similar nodule images. Especially, the first problem influences the design of other issues. The successful categorization of nodule pattern might lead physicians to find important cues that characterize benign and malignant nodules. This paper focuses on an approach to categorize the nodule database with respect to nodule shape and CT density patterns inside nodule.

  14. Visualization of historical data for the ATLAS detector controls - DDV

    NASA Astrophysics Data System (ADS)

    Maciejewski, J.; Schlenker, S.

    2017-10-01

    The ATLAS experiment is one of four detectors located on the Large Hardon Collider (LHC) based at CERN. Its detector control system (DCS) stores the slow control data acquired within the back-end of distributed WinCC OA applications, which enables the data to be retrieved for future analysis, debugging and detector development in an Oracle relational database. The ATLAS DCS Data Viewer (DDV) is a client-server application providing access to the historical data outside of the experiment network. The server builds optimized SQL queries, retrieves the data from the database and serves it to the clients via HTTP connections. The server also implements protection methods to prevent malicious use of the database. The client is an AJAX-type web application based on the Vaadin (framework build around the Google Web Toolkit (GWT)) which gives users the possibility to access the data with ease. The DCS metadata can be selected using a column-tree navigation or a search engine supporting regular expressions. The data is visualized by a selection of output modules such as a java script value-over time plots or a lazy loading table widget. Additional plugins give the users the possibility to retrieve the data in ROOT format or as an ASCII file. Control system alarms can also be visualized in a dedicated table if necessary. Python mock-up scripts can be generated by the client, allowing the user to query the pythonic DDV server directly, such that the users can embed the scripts into more complex analysis programs. Users are also able to store searches and output configurations as XML on the server to share with others via URL or to embed in HTML.

  15. An intelligent framework for medical image retrieval using MDCT and multi SVM.

    PubMed

    Balan, J A Alex Rajju; Rajan, S Edward

    2014-01-01

    Volumes of medical images are rapidly generated in medical field and to manage them effectively has become a great challenge. This paper studies the development of innovative medical image retrieval based on texture features and accuracy. The objective of the paper is to analyze the image retrieval based on diagnosis of healthcare management systems. This paper traces the development of innovative medical image retrieval to estimate both the image texture features and accuracy. The texture features of medical images are extracted using MDCT and multi SVM. Both the theoretical approach and the simulation results revealed interesting observations and they were corroborated using MDCT coefficients and SVM methodology. All attempts to extract the data about the image in response to the query has been computed successfully and perfect image retrieval performance has been obtained. Experimental results on a database of 100 trademark medical images show that an integrated texture feature representation results in 98% of the images being retrieved using MDCT and multi SVM. Thus we have studied a multiclassification technique based on SVM which is prior suitable for medical images. The results show the retrieval accuracy of 98%, 99% for different sets of medical images with respect to the class of image.

  16. TRENDS: A flight test relational database user's guide and reference manual

    NASA Technical Reports Server (NTRS)

    Bondi, M. J.; Bjorkman, W. S.; Cross, J. L.

    1994-01-01

    This report is designed to be a user's guide and reference manual for users intending to access rotocraft test data via TRENDS, the relational database system which was developed as a tool for the aeronautical engineer with no programming background. This report has been written to assist novice and experienced TRENDS users. TRENDS is a complete system for retrieving, searching, and analyzing both numerical and narrative data, and for displaying time history and statistical data in graphical and numerical formats. This manual provides a 'guided tour' and a 'user's guide' for the new and intermediate-skilled users. Examples for the use of each menu item within TRENDS is provided in the Menu Reference section of the manual, including full coverage for TIMEHIST, one of the key tools. This manual is written around the XV-15 Tilt Rotor database, but does include an appendix on the UH-60 Blackhawk database. This user's guide and reference manual establishes a referrable source for the research community and augments NASA TM-101025, TRENDS: The Aeronautical Post-Test, Database Management System, Jan. 1990, written by the same authors.

  17. [Preparation of the database and the homepage on chemical accidents relating to health hazard].

    PubMed

    Yamamoto, M; Morita, M; Kaminuma, T

    1998-01-01

    We collected the data on accidents due to chemicals occurred in Japan, and prepared the database. We also set up the World Wide Web homepage containing the explanation on accidents due to chemicals and the retrieval page for the database. We designed the retrieval page so that users can search the data from keywords such as chemicals (e.g. chlorine gas, hydrogen sulfide, pesticides), places (e.g. home, factory, vehicles, tank), causes (e.g. reaction, leakage, exhaust gas) and others (e.g. cleaning, painting, transportation).

  18. User's Manual for the New England Water-Use Data System (NEWUDS)

    USGS Publications Warehouse

    Horn, Marilee A.

    2003-01-01

    Water is used in a variety of ways that need to be understood for effective management of water resources. Water-use activities need to be categorized and included in a database management system to understand current water uses and to provide information to water-resource management policy decisionmakers. The New England Water-Use Data System (NEWUDS) is a complex database developed to store water-use information that allows water to be tracked from a point of water-use activity (called a 'Site'), such as withdrawal from a resource (reservoir or aquifer), to a second Site, such as distribution to a user (business or irrigator). NEWUDS conceptual model consists of 10 core entities: system, owner, address, location, site, data source, resource, conveyance, transaction/rate, and alias, with tables available to store user-defined details. Three components--site (with both a From Site and a To Site), a conveyance that connects them, and a transaction/rate associated with the movement of water over a specific time interval form the core of the basic NEWUDS network model. The most important step in correctly translating real-world water-use activities into a storable format in NEWUDS depends on choosing the appropriate sites and linking them correctly in a network to model the flow of water from the initial From Site to the final To Site. Ten water-use networks representing real-world activities are described--three withdrawal networks, three return networks, two user networks, two complex community-system networks. Ten case studies of water use, one for each network, also are included in this manual to illustrate how to compile, store, and retrieve the appropriate data. The sequence of data entry into tables is critical because there are many foreign keys. The recommended core entity sequence is (1) system, (2) owner, (3) address, (4) location, (5) site, (6) data source, (7) resource, (8) conveyance, (9) transaction, and (10) rate; with (11) alias and (12) user-defined detail subject areas populated as needed. After each step in data entry, quality-assurance queries should be run to ensure the data are correctly entered so that it can be retrieved accurately. The point of data storage is retrieval. Several retrieval queries that focus on retrieving only relevant data to specific questions are presented in this manual as examples for the NEWUDS user.

  19. R2 Water Quality Portal Monitoring Stations

    EPA Pesticide Factsheets

    The Water Quality Data Portal (WQP) provides an easy way to access data stored in various large water quality databases. The WQP provides various input parameters on the form including location, site, sampling, and date parameters to filter and customize the returned results. The The Water Quality Portal (WQP) is a cooperative service sponsored by the United States Geological Survey (USGS), the Environmental Protection Agency (EPA) and the National Water Quality Monitoring Council (NWQMC) that integrates publicly available water quality data from the USGS National Water Information System (NWIS) the EPA STOrage and RETrieval (STORET) Data Warehouse, and the USDA ARS Sustaining The Earth??s Watersheds - Agricultural Research Database System (STEWARDS).

  20. BioMart Central Portal: an open database network for the biological community

    PubMed Central

    Guberman, Jonathan M.; Ai, J.; Arnaiz, O.; Baran, Joachim; Blake, Andrew; Baldock, Richard; Chelala, Claude; Croft, David; Cros, Anthony; Cutts, Rosalind J.; Di Génova, A.; Forbes, Simon; Fujisawa, T.; Gadaleta, E.; Goodstein, D. M.; Gundem, Gunes; Haggarty, Bernard; Haider, Syed; Hall, Matthew; Harris, Todd; Haw, Robin; Hu, S.; Hubbard, Simon; Hsu, Jack; Iyer, Vivek; Jones, Philip; Katayama, Toshiaki; Kinsella, R.; Kong, Lei; Lawson, Daniel; Liang, Yong; Lopez-Bigas, Nuria; Luo, J.; Lush, Michael; Mason, Jeremy; Moreews, Francois; Ndegwa, Nelson; Oakley, Darren; Perez-Llamas, Christian; Primig, Michael; Rivkin, Elena; Rosanoff, S.; Shepherd, Rebecca; Simon, Reinhard; Skarnes, B.; Smedley, Damian; Sperling, Linda; Spooner, William; Stevenson, Peter; Stone, Kevin; Teague, J.; Wang, Jun; Wang, Jianxin; Whitty, Brett; Wong, D. T.; Wong-Erasmus, Marie; Yao, L.; Youens-Clark, Ken; Yung, Christina; Zhang, Junjun; Kasprzyk, Arek

    2011-01-01

    BioMart Central Portal is a first of its kind, community-driven effort to provide unified access to dozens of biological databases spanning genomics, proteomics, model organisms, cancer data, ontology information and more. Anybody can contribute an independently maintained resource to the Central Portal, allowing it to be exposed to and shared with the research community, and linking it with the other resources in the portal. Users can take advantage of the common interface to quickly utilize different sources without learning a new system for each. The system also simplifies cross-database searches that might otherwise require several complicated steps. Several integrated tools streamline common tasks, such as converting between ID formats and retrieving sequences. The combination of a wide variety of databases, an easy-to-use interface, robust programmatic access and the array of tools make Central Portal a one-stop shop for biological data querying. Here, we describe the structure of Central Portal and show example queries to demonstrate its capabilities. Database URL: http://central.biomart.org. PMID:21930507

  1. BIRS – Bioterrorism Information Retrieval System

    PubMed Central

    Tewari, Ashish Kumar; Rashi; Wadhwa, Gulshan; Sharma, Sanjeev Kumar; Jain, Chakresh Kumar

    2013-01-01

    Bioterrorism is the intended use of pathogenic strains of microbes to widen terror in a population. There is a definite need to promote research for development of vaccines, therapeutics and diagnostic methods as a part of preparedness to any bioterror attack in the future. BIRS is an open-access database of collective information on the organisms related to bioterrorism. The architecture of database utilizes the current open-source technology viz PHP ver 5.3.19, MySQL and IIS server under windows platform for database designing. Database stores information on literature, generic- information and unique pathways of about 10 microorganisms involved in bioterrorism. This may serve as a collective repository to accelerate the drug discovery and vaccines designing process against such bioterrorist agents (microbes). The available data has been validated from various online resources and literature mining in order to provide the user with a comprehensive information system. Availability The database is freely available at http://www.bioterrorism.biowaves.org PMID:23390356

  2. [Development and evaluation of the medical imaging distribution system with dynamic web application and clustering technology].

    PubMed

    Yokohama, Noriya; Tsuchimoto, Tadashi; Oishi, Masamichi; Itou, Katsuya

    2007-01-20

    It has been noted that the downtime of medical informatics systems is often long. Many systems encounter downtimes of hours or even days, which can have a critical effect on daily operations. Such systems remain especially weak in the areas of database and medical imaging data. The scheme design shows the three-layer architecture of the system: application, database, and storage layers. The application layer uses the DICOM protocol (Digital Imaging and Communication in Medicine) and HTTP (Hyper Text Transport Protocol) with AJAX (Asynchronous JavaScript+XML). The database is designed to decentralize in parallel using cluster technology. Consequently, restoration of the database can be done not only with ease but also with improved retrieval speed. In the storage layer, a network RAID (Redundant Array of Independent Disks) system, it is possible to construct exabyte-scale parallel file systems that exploit storage spread. Development and evaluation of the test-bed has been successful in medical information data backup and recovery in a network environment. This paper presents a schematic design of the new medical informatics system that can be accommodated from a recovery and the dynamic Web application for medical imaging distribution using AJAX.

  3. CD-ROM-aided Databases

    NASA Astrophysics Data System (ADS)

    Keiji, Ogawa

    Toppan Printing Co., Ltd. has played a pioneering role in developing CTS (Computerized Typesetting System) for these twenty years, and has accumulated a great deal of technical know-how. The company intends to integrate accumulated information into multimedia. As for CD-ROM, it has been aggressively striven to develop, from planning to data-input and data-processing. Recently, under the guidance of Research group on molecular design, It has developed a CD-ROM system to support research and development in the field of organic chemistry. This system is constructed mainly of the data in “Organic Syntheses”, a bible among organic chemists. The outline of the structure of files, and that of indexes which is a key point in retrieval, the flow chart of the retrieval process, and editing processes, etc. are described in this paper.

  4. DRUMS: a human disease related unique gene mutation search engine.

    PubMed

    Li, Zuofeng; Liu, Xingnan; Wen, Jingran; Xu, Ye; Zhao, Xin; Li, Xuan; Liu, Lei; Zhang, Xiaoyan

    2011-10-01

    With the completion of the human genome project and the development of new methods for gene variant detection, the integration of mutation data and its phenotypic consequences has become more important than ever. Among all available resources, locus-specific databases (LSDBs) curate one or more specific genes' mutation data along with high-quality phenotypes. Although some genotype-phenotype data from LSDB have been integrated into central databases little effort has been made to integrate all these data by a search engine approach. In this work, we have developed disease related unique gene mutation search engine (DRUMS), a search engine for human disease related unique gene mutation as a convenient tool for biologists or physicians to retrieve gene variant and related phenotype information. Gene variant and phenotype information were stored in a gene-centred relational database. Moreover, the relationships between mutations and diseases were indexed by the uniform resource identifier from LSDB, or another central database. By querying DRUMS, users can access the most popular mutation databases under one interface. DRUMS could be treated as a domain specific search engine. By using web crawling, indexing, and searching technologies, it provides a competitively efficient interface for searching and retrieving mutation data and their relationships to diseases. The present system is freely accessible at http://www.scbit.org/glif/new/drums/index.html. © 2011 Wiley-Liss, Inc.

  5. CoReCG: a comprehensive database of genes associated with colon-rectal cancer

    PubMed Central

    Agarwal, Rahul; Kumar, Binayak; Jayadev, Msk; Raghav, Dhwani; Singh, Ashutosh

    2016-01-01

    Cancer of large intestine is commonly referred as colorectal cancer, which is also the third most frequently prevailing neoplasm across the globe. Though, much of work is being carried out to understand the mechanism of carcinogenesis and advancement of this disease but, fewer studies has been performed to collate the scattered information of alterations in tumorigenic cells like genes, mutations, expression changes, epigenetic alteration or post translation modification, genetic heterogeneity. Earlier findings were mostly focused on understanding etiology of colorectal carcinogenesis but less emphasis were given for the comprehensive review of the existing findings of individual studies which can provide better diagnostics based on the suggested markers in discrete studies. Colon Rectal Cancer Gene Database (CoReCG), contains 2056 colon-rectal cancer genes information involved in distinct colorectal cancer stages sourced from published literature with an effective knowledge based information retrieval system. Additionally, interactive web interface enriched with various browsing sections, augmented with advance search facility for querying the database is provided for user friendly browsing, online tools for sequence similarity searches and knowledge based schema ensures a researcher friendly information retrieval mechanism. Colorectal cancer gene database (CoReCG) is expected to be a single point source for identification of colorectal cancer-related genes, thereby helping with the improvement of classification, diagnosis and treatment of human cancers. Database URL: lms.snu.edu.in/corecg PMID:27114494

  6. A new method of content based medical image retrieval and its applications to CT imaging sign retrieval.

    PubMed

    Ma, Ling; Liu, Xiabi; Gao, Yan; Zhao, Yanfeng; Zhao, Xinming; Zhou, Chunwu

    2017-02-01

    This paper proposes a new method of content based medical image retrieval through considering fused, context-sensitive similarity. Firstly, we fuse the semantic and visual similarities between the query image and each image in the database as their pairwise similarities. Then, we construct a weighted graph whose nodes represent the images and edges measure their pairwise similarities. By using the shortest path algorithm over the weighted graph, we obtain a new similarity measure, context-sensitive similarity measure, between the query image and each database image to complete the retrieval process. Actually, we use the fused pairwise similarity to narrow down the semantic gap for obtaining a more accurate pairwise similarity measure, and spread it on the intrinsic data manifold to achieve the context-sensitive similarity for a better retrieval performance. The proposed method has been evaluated on the retrieval of the Common CT Imaging Signs of Lung Diseases (CISLs) and achieved not only better retrieval results but also the satisfactory computation efficiency. Copyright © 2017 Elsevier Inc. All rights reserved.

  7. TBIdoc: 3D content-based CT image retrieval system for traumatic brain injury

    NASA Astrophysics Data System (ADS)

    Li, Shimiao; Gong, Tianxia; Wang, Jie; Liu, Ruizhe; Tan, Chew Lim; Leong, Tze Yun; Pang, Boon Chuan; Lim, C. C. Tchoyoson; Lee, Cheng Kiang; Tian, Qi; Zhang, Zhuo

    2010-03-01

    Traumatic brain injury (TBI) is a major cause of death and disability. Computed Tomography (CT) scan is widely used in the diagnosis of TBI. Nowadays, large amount of TBI CT data is stacked in the hospital radiology department. Such data and the associated patient information contain valuable information for clinical diagnosis and outcome prediction. However, current hospital database system does not provide an efficient and intuitive tool for doctors to search out cases relevant to the current study case. In this paper, we present the TBIdoc system: a content-based image retrieval (CBIR) system which works on the TBI CT images. In this web-based system, user can query by uploading CT image slices from one study, retrieval result is a list of TBI cases ranked according to their 3D visual similarity to the query case. Specifically, cases of TBI CT images often present diffuse or focal lesions. In TBIdoc system, these pathological image features are represented as bin-based binary feature vectors. We use the Jaccard-Needham measure as the similarity measurement. Based on these, we propose a 3D similarity measure for computing the similarity score between two series of CT slices. nDCG is used to evaluate the system performance, which shows the system produces satisfactory retrieval results. The system is expected to improve the current hospital data management in TBI and to give better support for the clinical decision-making process. It may also contribute to the computer-aided education in TBI.

  8. Machine learning techniques for breast cancer computer aided diagnosis using different image modalities: A systematic review.

    PubMed

    Yassin, Nisreen I R; Omran, Shaimaa; El Houby, Enas M F; Allam, Hemat

    2018-03-01

    The high incidence of breast cancer in women has increased significantly in the recent years. Physician experience of diagnosing and detecting breast cancer can be assisted by using some computerized features extraction and classification algorithms. This paper presents the conduction and results of a systematic review (SR) that aims to investigate the state of the art regarding the computer aided diagnosis/detection (CAD) systems for breast cancer. The SR was conducted using a comprehensive selection of scientific databases as reference sources, allowing access to diverse publications in the field. The scientific databases used are Springer Link (SL), Science Direct (SD), IEEE Xplore Digital Library, and PubMed. Inclusion and exclusion criteria were defined and applied to each retrieved work to select those of interest. From 320 studies retrieved, 154 studies were included. However, the scope of this research is limited to scientific and academic works and excludes commercial interests. This survey provides a general analysis of the current status of CAD systems according to the used image modalities and the machine learning based classifiers. Potential research studies have been discussed to create a more objective and efficient CAD systems. Copyright © 2017 Elsevier B.V. All rights reserved.

  9. Extended Subject Access to Hypertext Online Documentation. Part III: The Document-Boundaries Problem.

    ERIC Educational Resources Information Center

    Girill, T. R.

    1991-01-01

    This article continues the description of DFT (Document, Find, Theseus), an online documentation system that provides computer-managed on-demand printing of software manuals as well as the interactive retrieval of reference passages. Document boundaries in the hypertext database are discussed, search vocabulary complexities are described, and text…

  10. PASCAL Data Base: File Description and On Line Access on ESA/IRS.

    ERIC Educational Resources Information Center

    Pelissier, Denise

    This report describes the PASCAL database, a machine readable version of the French abstract journal Bulletin Signaletique, which allows use of the file for (1) batch and online retrieval of information, (2) selective dissemination of information, and (3) publishing of the 50 sections of Bulletin Signaletique. The system, which covers nine…

  11. An Electronic Finding Aid Using Extensible Markup Language (XML) and Encoded Archival Description (EAD).

    ERIC Educational Resources Information Center

    Chang, May

    2000-01-01

    Describes the development of electronic finding aids for archives at the University of Illinois, Urbana-Champaign that used XML (extensible markup language) and EAD (encoded archival description) to enable more flexible information management and retrieval than using MARC or a relational database management system. EAD template is appended.…

  12. MScanner: a classifier for retrieving Medline citations

    PubMed Central

    Poulter, Graham L; Rubin, Daniel L; Altman, Russ B; Seoighe, Cathal

    2008-01-01

    Background Keyword searching through PubMed and other systems is the standard means of retrieving information from Medline. However, ad-hoc retrieval systems do not meet all of the needs of databases that curate information from literature, or of text miners developing a corpus on a topic that has many terms indicative of relevance. Several databases have developed supervised learning methods that operate on a filtered subset of Medline, to classify Medline records so that fewer articles have to be manually reviewed for relevance. A few studies have considered generalisation of Medline classification to operate on the entire Medline database in a non-domain-specific manner, but existing applications lack speed, available implementations, or a means to measure performance in new domains. Results MScanner is an implementation of a Bayesian classifier that provides a simple web interface for submitting a corpus of relevant training examples in the form of PubMed IDs and returning results ranked by decreasing probability of relevance. For maximum speed it uses the Medical Subject Headings (MeSH) and journal of publication as a concise document representation, and takes roughly 90 seconds to return results against the 16 million records in Medline. The web interface provides interactive exploration of the results, and cross validated performance evaluation on the relevant input against a random subset of Medline. We describe the classifier implementation, cross validate it on three domain-specific topics, and compare its performance to that of an expert PubMed query for a complex topic. In cross validation on the three sample topics against 100,000 random articles, the classifier achieved excellent separation of relevant and irrelevant article score distributions, ROC areas between 0.97 and 0.99, and averaged precision between 0.69 and 0.92. Conclusion MScanner is an effective non-domain-specific classifier that operates on the entire Medline database, and is suited to retrieving topics for which many features may indicate relevance. Its web interface simplifies the task of classifying Medline citations, compared to building a pre-filter and classifier specific to the topic. The data sets and open source code used to obtain the results in this paper are available on-line and as supplementary material, and the web interface may be accessed at . PMID:18284683

  13. [Exploration and construction of the full-text database of acupuncture literature in the Republic of China].

    PubMed

    Fei, Lin; Zhao, Jing; Leng, Jiahao; Zhang, Shujian

    2017-10-12

    The ALIPORC full-text database is targeted at a specific full-text database of acupuncture literature in the Republic of China. Starting in 2015, till now, the database has been getting completed, focusing on books relevant with acupuncture, articles and advertising documents, accomplished or published in the Republic of China. The construction of this database aims to achieve the source sharing of acupuncture medical literature in the Republic of China through the retrieval approaches to diversity and accurate content presentation, contributes to the exchange of scholars, reduces the paper damage caused by paging and simplify the retrieval of the rare literature. The writers have made the explanation of the database in light of sources, characteristics and current situation of construction; and have discussed on improving the efficiency and integrity of the database and deepening the development of acupuncture literature in the Republic of China.

  14. Proposal for Re-Usable TODO Knowledge Management System RESTER

    NASA Astrophysics Data System (ADS)

    Saga, Ryosuke; Kageyama, Akinori; Tsuji, Hiroshi

    This paper describes how to reuse a series of ad-hoc tasks such as special meeting arrangement and equipment procurement. Our RESTER (Reusable TODO Synthesizer) allows a group to reuse a series of tasks which are recorded in case database. Given a specific event, RESTER repairs the retrieved similar case by the ontology which describes the relationship of concept in the organization. A user has chance to check the modified case and to update it if he finds that there are incorrect repair because of deficient ontology. The user is also requested to judge if the retrieved case works or not. If he judges it is useful, the case becomes to be reused more frequently. Thus, RESTER works under the premise of human-computer collaboration. Based on the presented framework, this paper has identified several desirable attributes: (1) RESTER allows a group to externalize its experience on jobs, (2) Externalized experience are connected in case database, (3) A case is internalized by other group when it is retrieved and repaired for a new event, (4) New job generated from the previous similar job of one group is socialized by the other group.

  15. NVST Data Archiving System Based On FastBit NoSQL Database

    NASA Astrophysics Data System (ADS)

    Liu, Ying-bo; Wang, Feng; Ji, Kai-fan; Deng, Hui; Dai, Wei; Liang, Bo

    2014-06-01

    The New Vacuum Solar Telescope (NVST) is a 1-meter vacuum solar telescope that aims to observe the fine structures of active regions on the Sun. The main tasks of the NVST are high resolution imaging and spectral observations, including the measurements of the solar magnetic field. The NVST has been collecting more than 20 million FITS files since it began routine observations in 2012 and produces a maximum observational records of 120 thousand files in a day. Given the large amount of files, the effective archiving and retrieval of files becomes a critical and urgent problem. In this study, we implement a new data archiving system for the NVST based on the Fastbit Not Only Structured Query Language (NoSQL) database. Comparing to the relational database (i.e., MySQL; My Structured Query Language), the Fastbit database manifests distinctive advantages on indexing and querying performance. In a large scale database of 40 million records, the multi-field combined query response time of Fastbit database is about 15 times faster and fully meets the requirements of the NVST. Our study brings a new idea for massive astronomical data archiving and would contribute to the design of data management systems for other astronomical telescopes.

  16. MetNetAPI: A flexible method to access and manipulate biological network data from MetNet

    PubMed Central

    2010-01-01

    Background Convenient programmatic access to different biological databases allows automated integration of scientific knowledge. Many databases support a function to download files or data snapshots, or a webservice that offers "live" data. However, the functionality that a database offers cannot be represented in a static data download file, and webservices may consume considerable computational resources from the host server. Results MetNetAPI is a versatile Application Programming Interface (API) to the MetNetDB database. It abstracts, captures and retains operations away from a biological network repository and website. A range of database functions, previously only available online, can be immediately (and independently from the website) applied to a dataset of interest. Data is available in four layers: molecular entities, localized entities (linked to a specific organelle), interactions, and pathways. Navigation between these layers is intuitive (e.g. one can request the molecular entities in a pathway, as well as request in what pathways a specific entity participates). Data retrieval can be customized: Network objects allow the construction of new and integration of existing pathways and interactions, which can be uploaded back to our server. In contrast to webservices, the computational demand on the host server is limited to processing data-related queries only. Conclusions An API provides several advantages to a systems biology software platform. MetNetAPI illustrates an interface with a central repository of data that represents the complex interrelationships of a metabolic and regulatory network. As an alternative to data-dumps and webservices, it allows access to a current and "live" database and exposes analytical functions to application developers. Yet it only requires limited resources on the server-side (thin server/fat client setup). The API is available for Java, Microsoft.NET and R programming environments and offers flexible query and broad data- retrieval methods. Data retrieval can be customized to client needs and the API offers a framework to construct and manipulate user-defined networks. The design principles can be used as a template to build programmable interfaces for other biological databases. The API software and tutorials are available at http://www.metnetonline.org/api. PMID:21083943

  17. Content based image retrieval using local binary pattern operator and data mining techniques.

    PubMed

    Vatamanu, Oana Astrid; Frandeş, Mirela; Lungeanu, Diana; Mihalaş, Gheorghe-Ioan

    2015-01-01

    Content based image retrieval (CBIR) concerns the retrieval of similar images from image databases, using feature vectors extracted from images. These feature vectors globally define the visual content present in an image, defined by e.g., texture, colour, shape, and spatial relations between vectors. Herein, we propose the definition of feature vectors using the Local Binary Pattern (LBP) operator. A study was performed in order to determine the optimum LBP variant for the general definition of image feature vectors. The chosen LBP variant is then subsequently used to build an ultrasound image database, and a database with images obtained from Wireless Capsule Endoscopy. The image indexing process is optimized using data clustering techniques for images belonging to the same class. Finally, the proposed indexing method is compared to the classical indexing technique, which is nowadays widely used.

  18. Texture-based approach to palmprint retrieval for personal identification

    NASA Astrophysics Data System (ADS)

    Li, Wenxin; Zhang, David; Xu, Z.; You, J.

    2000-12-01

    This paper presents a new approach to palmprint retrieval for personal identification. Three key issues in image retrieval are considered - feature selection, similarity measures and dynamic search for the best matching of the sample in the image database. We propose a texture-based method for palmprint feature representation. The concept of texture energy is introduced to define a palm print's global and local features, which are characterized with high convergence of inner-palm similarities and good dispersion of inter-palm discrimination. The search is carried out in a layered fashion: first global features are used to guide the fast selection of a small set of similar candidates from the database from the database and then local features are used to decide the final output within the candidate set. The experimental results demonstrate the effectiveness and accuracy of the proposed method.

  19. Texture-based approach to palmprint retrieval for personal identification

    NASA Astrophysics Data System (ADS)

    Li, Wenxin; Zhang, David; Xu, Z.; You, J.

    2001-01-01

    This paper presents a new approach to palmprint retrieval for personal identification. Three key issues in image retrieval are considered - feature selection, similarity measures and dynamic search for the best matching of the sample in the image database. We propose a texture-based method for palmprint feature representation. The concept of texture energy is introduced to define a palm print's global and local features, which are characterized with high convergence of inner-palm similarities and good dispersion of inter-palm discrimination. The search is carried out in a layered fashion: first global features are used to guide the fast selection of a small set of similar candidates from the database from the database and then local features are used to decide the final output within the candidate set. The experimental results demonstrate the effectiveness and accuracy of the proposed method.

  20. Ontological interpretation of biomedical database content.

    PubMed

    Santana da Silva, Filipe; Jansen, Ludger; Freitas, Fred; Schulz, Stefan

    2017-06-26

    Biological databases store data about laboratory experiments, together with semantic annotations, in order to support data aggregation and retrieval. The exact meaning of such annotations in the context of a database record is often ambiguous. We address this problem by grounding implicit and explicit database content in a formal-ontological framework. By using a typical extract from the databases UniProt and Ensembl, annotated with content from GO, PR, ChEBI and NCBI Taxonomy, we created four ontological models (in OWL), which generate explicit, distinct interpretations under the BioTopLite2 (BTL2) upper-level ontology. The first three models interpret database entries as individuals (IND), defined classes (SUBC), and classes with dispositions (DISP), respectively; the fourth model (HYBR) is a combination of SUBC and DISP. For the evaluation of these four models, we consider (i) database content retrieval, using ontologies as query vocabulary; (ii) information completeness; and, (iii) DL complexity and decidability. The models were tested under these criteria against four competency questions (CQs). IND does not raise any ontological claim, besides asserting the existence of sample individuals and relations among them. Modelling patterns have to be created for each type of annotation referent. SUBC is interpreted regarding maximally fine-grained defined subclasses under the classes referred to by the data. DISP attempts to extract truly ontological statements from the database records, claiming the existence of dispositions. HYBR is a hybrid of SUBC and DISP and is more parsimonious regarding expressiveness and query answering complexity. For each of the four models, the four CQs were submitted as DL queries. This shows the ability to retrieve individuals with IND, and classes in SUBC and HYBR. DISP does not retrieve anything because the axioms with disposition are embedded in General Class Inclusion (GCI) statements. Ambiguity of biological database content is addressed by a method that identifies implicit knowledge behind semantic annotations in biological databases and grounds it in an expressive upper-level ontology. The result is a seamless representation of database structure, content and annotations as OWL models.

  1. INFOMAT: The international materials assessment and application centre's internet gateway

    NASA Astrophysics Data System (ADS)

    Branquinho, Carmen Lucia; Colodete, Leandro Tavares

    2004-08-01

    INFOMAT is an electronic directory structured to facilitate the search and retrieval of materials science and technology information sources. Linked to the homepage of the International Materials Assessment and Application Centre, INFOMAT presents descriptions of 392 proprietary databases with links to their host systems as well as direct links to over 180 public domain databases and over 2,400 web sites. Among the web sites are associations/unions, governmental and non-governmental institutions, industries, library holdings, market statistics, news services, on-line publications, standardization and intellectual property organizations, and universities/research groups.

  2. Organizing a breast cancer database: data management.

    PubMed

    Yi, Min; Hunt, Kelly K

    2016-06-01

    Developing and organizing a breast cancer database can provide data and serve as valuable research tools for those interested in the etiology, diagnosis, and treatment of cancer. Depending on the research setting, the quality of the data can be a major issue. Assuring that the data collection process does not contribute inaccuracies can help to assure the overall quality of subsequent analyses. Data management is work that involves the planning, development, implementation, and administration of systems for the acquisition, storage, and retrieval of data while protecting it by implementing high security levels. A properly designed database provides you with access to up-to-date, accurate information. Database design is an important component of application design. If you take the time to design your databases properly, you'll be rewarded with a solid application foundation on which you can build the rest of your application.

  3. Web-based Hyper Suprime-Cam Data Providing System

    NASA Astrophysics Data System (ADS)

    Koike, M.; Furusawa, H.; Takata, T.; Price, P.; Okura, Y.; Yamada, Y.; Yamanoi, H.; Yasuda, N.; Bickerton, S.; Katayama, N.; Mineo, S.; Lupton, R.; Bosch, J.; Loomis, C.

    2014-05-01

    We describe a web-based user interface to retrieve Hyper Suprime-Cam data products, including images and. Users can access data directly from a graphical user interface or by writing a database SQL query. The system provides raw images, reduced images and stacked images (from multiple individual exposures), with previews available. Catalog queries can be executed in preview or queue mode, allowing for both exploratory and comprehensive investigations.

  4. Optimization of Extended Relational Database Systems

    DTIC Science & Technology

    1986-07-23

    control functions are integrated into a single system in a homogeneoua way. As a first exam - ple, consider previous work in supporting various semantic...sizes are reduced and, wnk? quently, the number of materializations that will be needed is aba lower. For exam - pie, in the above query tuple...retrieve (EMP.name) where EMP hobbies instrument = ’ violin ’ When the various entries in the hobbies field are materialized, only those queries that

  5. Inter-comparison of the EUMETSAT H-SAF and NASA PPS precipitation products over Western Europe.

    NASA Astrophysics Data System (ADS)

    Kidd, Chris; Panegrossi, Giulia; Ringerud, Sarah; Stocker, Erich

    2017-04-01

    The development of precipitation retrieval techniques utilising passive microwave satellite observations has achieved a good degree of maturity through the use of physically-based schemes. The DMSP Special Sensor Microwave Imager/Sounder (SSMIS) has been the mainstay of passive microwave observations over the last 13 years forming the basis of many satellite precipitation products, including NASA's Precipitation Processing System (PPS) and EUMETSAT's Hydrological Satellite Application Facility (H-SAF). The NASA PPS product utilises the Goddard Profiling (GPROF; currently 2014v2-0) retrieval scheme that provides a physically consistent retrieval scheme through the use of coincident active/passive microwave retrievals from the Global Precipitation Measurement (GPM) mission core satellite. The GPM combined algorithm retrieves hydrometeor profiles optimized for consistency with both Dual-frequency Precipitation Radar (DPR) and GPM Microwave Imager (GMI); these profiles form the basis of the GPROF database which can be utilized for any constellation radiometer within the framework a Bayesian retrieval scheme. The H-SAF product (PR-OBS-1 v1.7) is based on a physically-based Bayesian technique where the a priori information is provided by a Cloud Dynamic Radiation Database (CDRD). Meteorological parameter constraints, derived from synthetic dynamical-thermodynamical-hydrological meteorological profile variables, are used in conjunction with multi-hydrometeor microphysical profiles and multispectral PMW brightness temperature vectors into a specialized a priori knowledge database underpinning and guiding the algorithm's Bayesian retrieval solver. This paper will present the results of an inter-comparison of the NASA PPS GPROF and EUMETSAT H-SAF PR-OBS-1 products over Western Europe for the period from 1 January 2015 through 31 December 2016. Surface radar is derived from the UKMO-derived Nimrod European radar product, available at 15 minute/5 km resolution. Initial results show that overall the correlations between the two satellite precipitation products and surface radar precipitation estimates are similar, particularly for cases where there is extensive precipitation; however, the H-SAF tends to have poorer correlations in situations where rain is light or limited in extent. Similarly, RMSEs for the GPROF scheme tend to a smaller than those of the H-SAF retrievals. The difference in the performance can be traced to the identification of precipitation; the GPROF2014v2-0 scheme overestimates the occurrence and extent of the precipitation, generating a significant amount of light precipitation. The H-SAF scheme has a lower precipitation threshold of about 0.25 mmh-1 while overestimating moderate and higher precipitation intensities.

  6. Using Conceptual Categories of Questions To Measure Differences in Retrieval Performance.

    ERIC Educational Resources Information Center

    Keyes, John G.

    1996-01-01

    To investigate the relationship between the retrieval mechanism and the level of question elaboration, this study divided 100 questions from the cystic fibrosis database into five conceptual categories based on their semantic representations. Two retrieval methods were chosen to investigate potential differences in outcomes across conceptual…

  7. KARL: A Knowledge-Assisted Retrieval Language. Presentation visuals. M.S. Thesis Final Report, 1 Jul. 1985 - 31 Dec. 1987

    NASA Technical Reports Server (NTRS)

    Dominick, Wayne D. (Editor); Triantafyllopoulos, Spiros

    1985-01-01

    A collection of presentation visuals associated with the companion report entitled KARL: A Knowledge-Assisted Retrieval Language, is presented. Information is given on data retrieval, natural language database front ends, generic design objectives, processing capababilities and the query processing cycle.

  8. Generating region proposals for histopathological whole slide image retrieval.

    PubMed

    Ma, Yibing; Jiang, Zhiguo; Zhang, Haopeng; Xie, Fengying; Zheng, Yushan; Shi, Huaqiang; Zhao, Yu; Shi, Jun

    2018-06-01

    Content-based image retrieval is an effective method for histopathological image analysis. However, given a database of huge whole slide images (WSIs), acquiring appropriate region-of-interests (ROIs) for training is significant and difficult. Moreover, histopathological images can only be annotated by pathologists, resulting in the lack of labeling information. Therefore, it is an important and challenging task to generate ROIs from WSI and retrieve image with few labels. This paper presents a novel unsupervised region proposing method for histopathological WSI based on Selective Search. Specifically, the WSI is over-segmented into regions which are hierarchically merged until the WSI becomes a single region. Nucleus-oriented similarity measures for region mergence and Nucleus-Cytoplasm color space for histopathological image are specially defined to generate accurate region proposals. Additionally, we propose a new semi-supervised hashing method for image retrieval. The semantic features of images are extracted with Latent Dirichlet Allocation and transformed into binary hashing codes with Supervised Hashing. The methods are tested on a large-scale multi-class database of breast histopathological WSIs. The results demonstrate that for one WSI, our region proposing method can generate 7.3 thousand contoured regions which fit well with 95.8% of the ROIs annotated by pathologists. The proposed hashing method can retrieve a query image among 136 thousand images in 0.29 s and reach precision of 91% with only 10% of images labeled. The unsupervised region proposing method can generate regions as predictions of lesions in histopathological WSI. The region proposals can also serve as the training samples to train machine-learning models for image retrieval. The proposed hashing method can achieve fast and precise image retrieval with small amount of labels. Furthermore, the proposed methods can be potentially applied in online computer-aided-diagnosis systems. Copyright © 2018 Elsevier B.V. All rights reserved.

  9. The Design and Implementation of the Ariel Active Database Rule System

    DTIC Science & Technology

    1991-10-01

    but only as a main-memory prototype. The POSTGRES rule system (PRS) [SHP88, SRH90] and the Starburst rule system (SRS) [WCL91, HCL+90] have been...query language of POSTGRES for specifying data definition commands, queries and updates [SRH90]. POSTQUEL commands retrieve, append, delete, and replace...placed on an arbitrary attribute (e.g., one without an index) ( POSTGRES rule system [SHP88, SHP89, SR1I90], HiPAC [C+891, DIPS [SLR89], Alert [SPAM91

  10. A database system for characterization of munitions items in conventional ammunition demilitarization stockpiles

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chun, K.C.; Chiu, S.Y.; Ditmars, J.D.

    1994-05-01

    The MIDAS (Munition Items Disposition Action System) database system is an electronic data management system capable of storage and retrieval of information on the detailed structures and material compositions of munitions items designated for demilitarization. The types of such munitions range from bulk propellants and small arms to projectiles and cluster bombs. The database system is also capable of processing data on the quantities of inert, PEP (propellant, explosives and pyrotechnics) and packaging materials associated with munitions, components, or parts, and the quantities of chemical compounds associated with parts made of PEP materials. Development of the MIDAS database system hasmore » been undertaken by the US Army to support disposition of unwanted ammunition stockpiles. The inventory of such stockpiles currently includes several thousand items, which total tens of thousands of tons, and is still growing. Providing systematic procedures for disposing of all unwanted conventional munitions is the mission of the MIDAS Demilitarization Program. To carry out this mission, all munitions listed in the Single Manager for Conventional Ammunition inventory must be characterized, and alternatives for resource recovery and recycling and/or disposal of munitions in the demilitarization inventory must be identified.« less

  11. Retrievable Inferior Vena Cava Filters: Factors that Affect Retrieval Success

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Geisbuesch, Philipp, E-mail: philippgeisbuesch@gmx.de; Benenati, James F.; Pena, Constantino S.

    Purpose: To report and analyze the indications, procedural success, and complications of retrievable inferior vena cava filters (rIVCF) placement and to identify parameters that influence retrieval attempt and failure. Methods: Between January 2005 and December 2010, a total of 200 patients (80 men, median age 67 years, range 11-95 years) received a rIVCF with the clinical possibility that it could be removed. All patients with rIVCF were prospectively entered into a database and followed until retrieval or a decision not to retrieve the filter was made. A retrospective analysis of this database was performed. Results: Sixty-one percent of patients hadmore » an accepted indication for filter placement; 39% of patients had a relative indication. There was a tendency toward a higher retrieval rate in patients with relative indications (40% vs. 55%, P = 0.076). Filter placement was technically successful in all patients, with no procedure-related mortality. The retrieval rate was 53%. Patient age of >80 years (odds ratio [OR] 0.056, P > 0.0001) and presence of malignancy (OR 0.303, P = 0.003) was associated with a significantly reduced probability for attempted retrieval. Retrieval failure occurred in 7% (6 of 91) of all retrieval attempts. A time interval of > 90 days between implantation and attempted retrieval was associated with retrieval failure (OR 19.8, P = 0.009). Conclusions: Patient age >80 years and a history of malignancy are predictors of a reduced probability for retrieval attempt. The rate of retrieval failure is low and seems to be associated with a time interval of >90 days between filter placement and retrieval.« less

  12. A new natural hazards data-base for volcanic ash and SO2 from global satellite remote sensing measurements

    NASA Astrophysics Data System (ADS)

    Stebel, Kerstin; Prata, Fred; Theys, Nicolas; Tampellini, Lucia; Kamstra, Martijn; Zehner, Claus

    2014-05-01

    Over the last few years there has been a recognition of the utility of satellite measurements to identify and track volcanic emissions that present a natural hazard to human populations. Mitigation of the volcanic hazard to life and the environment requires understanding of the properties of volcanic emissions, identifying the hazard in near real-time and being able to provide timely and accurate forecasts to affected areas. Amongst the many ways to measure volcanic emissions, satellite remote sensing is capable of providing global quantitative retrievals of important microphysical parameters such as ash mass loading, ash particle effective radius, infrared optical depth, SO2 partial and total column abundance, plume altitude, aerosol optical depth and aerosol absorbing index. The eruption of Eyjafjallajökull in April May, 2010 led to increased research and measurement programs to better characterize properties of volcanic ash and the need to establish a data-base in which to store and access these data was confirmed. The European Space Agency (ESA) has recognized the importance of having a quality controlled data-base of satellite retrievals and has funded an activity called Volcanic Ash Strategic Initiative Team VAST (vast.nilu.no) to develop novel remote sensing retrieval schemes and a data-base, initially focused on several recent hazardous volcanic eruptions. In addition, the data-base will host satellite and validation data sets provided from the ESA projects Support to Aviation Control Service SACS (sacs.aeronomie.be) and Study on an end-to-end system for volcanic ash plume monitoring and prediction SMASH. Starting with data for the eruptions of Eyjafjallajökull, Grímsvötn, and Kasatochi, satellite retrievals for Puyhue-Cordon Caulle, Nabro, Merapi, Okmok, Kasatochi and Sarychev Peak will eventually be ingested. Dispersion model simulations are also being included in the data-base. Several atmospheric dispersion models (FLEXPART, SILAM and WRF-Chem) are used in VAST to simulate the dispersion of volcanic ash and SO2 emitted during an eruption. Source terms and dispersion model results will be given. In time, data from conventional in situ sampling instruments, airborne and ground-based remote sensing platforms and other meta-data (bulk ash and gas properties, volcanic setting, volcanic eruption chronologies, potential impacts etc.) will be added. Important applications of the data-base are illustrated related to the ash/aviation problem and to estimating SO2 fluxes from active volcanoes-as a means to diagnose future unrest. The data-base has the potential to provide the natural hazards community with a dynamic atmospheric volcanic hazards map and will be a valuable tool particularly for aviation.

  13. Probabilistic and machine learning-based retrieval approaches for biomedical dataset retrieval

    PubMed Central

    Karisani, Payam; Qin, Zhaohui S; Agichtein, Eugene

    2018-01-01

    Abstract The bioCADDIE dataset retrieval challenge brought together different approaches to retrieval of biomedical datasets relevant to a user’s query, expressed as a text description of a needed dataset. We describe experiments in applying a data-driven, machine learning-based approach to biomedical dataset retrieval as part of this challenge. We report on a series of experiments carried out to evaluate the performance of both probabilistic and machine learning-driven techniques from information retrieval, as applied to this challenge. Our experiments with probabilistic information retrieval methods, such as query term weight optimization, automatic query expansion and simulated user relevance feedback, demonstrate that automatically boosting the weights of important keywords in a verbose query is more effective than other methods. We also show that although there is a rich space of potential representations and features available in this domain, machine learning-based re-ranking models are not able to improve on probabilistic information retrieval techniques with the currently available training data. The models and algorithms presented in this paper can serve as a viable implementation of a search engine to provide access to biomedical datasets. The retrieval performance is expected to be further improved by using additional training data that is created by expert annotation, or gathered through usage logs, clicks and other processes during natural operation of the system. Database URL: https://github.com/emory-irlab/biocaddie PMID:29688379

  14. Alternative treatment technology information center computer database system

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sullivan, D.

    1995-10-01

    The Alternative Treatment Technology Information Center (ATTIC) computer database system was developed pursuant to the 1986 Superfund law amendments. It provides up-to-date information on innovative treatment technologies to clean up hazardous waste sites. ATTIC v2.0 provides access to several independent databases as well as a mechanism for retrieving full-text documents of key literature. It can be accessed with a personal computer and modem 24 hours a day, and there are no user fees. ATTIC provides {open_quotes}one-stop shopping{close_quotes} for information on alternative treatment options by accessing several databases: (1) treatment technology database; this contains abstracts from the literature on all typesmore » of treatment technologies, including biological, chemical, physical, and thermal methods. The best literature as viewed by experts is highlighted. (2) treatability study database; this provides performance information on technologies to remove contaminants from wastewaters and soils. It is derived from treatability studies. This database is available through ATTIC or separately as a disk that can be mailed to you. (3) underground storage tank database; this presents information on underground storage tank corrective actions, surface spills, emergency response, and remedial actions. (4) oil/chemical spill database; this provides abstracts on treatment and disposal of spilled oil and chemicals. In addition to these separate databases, ATTIC allows immediate access to other disk-based systems such as the Vendor Information System for Innovative Treatment Technologies (VISITT) and the Bioremediation in the Field Search System (BFSS). The user may download these programs to their own PC via a high-speed modem. Also via modem, users are able to download entire documents through the ATTIC system. Currently, about fifty publications are available, including Superfund Innovative Technology Evaluation (SITE) program documents.« less

  15. Multi-source and ontology-based retrieval engine for maize mutant phenotypes

    PubMed Central

    Green, Jason M.; Harnsomburana, Jaturon; Schaeffer, Mary L.; Lawrence, Carolyn J.; Shyu, Chi-Ren

    2011-01-01

    Model Organism Databases, including the various plant genome databases, collect and enable access to massive amounts of heterogeneous information, including sequence data, gene product information, images of mutant phenotypes, etc, as well as textual descriptions of many of these entities. While a variety of basic browsing and search capabilities are available to allow researchers to query and peruse the names and attributes of phenotypic data, next-generation search mechanisms that allow querying and ranking of text descriptions are much less common. In addition, the plant community needs an innovative way to leverage the existing links in these databases to search groups of text descriptions simultaneously. Furthermore, though much time and effort have been afforded to the development of plant-related ontologies, the knowledge embedded in these ontologies remains largely unused in available plant search mechanisms. Addressing these issues, we have developed a unique search engine for mutant phenotypes from MaizeGDB. This advanced search mechanism integrates various text description sources in MaizeGDB to aid a user in retrieving desired mutant phenotype information. Currently, descriptions of mutant phenotypes, loci and gene products are utilized collectively for each search, though expansion of the search mechanism to include other sources is straightforward. The retrieval engine, to our knowledge, is the first engine to exploit the content and structure of available domain ontologies, currently the Plant and Gene Ontologies, to expand and enrich retrieval results in major plant genomic databases. Database URL: http:www.PhenomicsWorld.org/QBTA.php PMID:21558151

  16. Utilizing semantic networks to database and retrieve generalized stochastic colored Petri nets

    NASA Technical Reports Server (NTRS)

    Farah, Jeffrey J.; Kelley, Robert B.

    1992-01-01

    Previous work has introduced the Planning Coordinator (PCOORD), a coordinator functioning within the hierarchy of the Intelligent Machine Mode. Within the structure of the Planning Coordinator resides the Primitive Structure Database (PSDB) functioning to provide the primitive structures utilized by the Planning Coordinator in the establishing of error recovery or on-line path plans. This report further explores the Primitive Structure Database and establishes the potential of utilizing semantic networks as a means of efficiently storing and retrieving the Generalized Stochastic Colored Petri Nets from which the error recovery plans are derived.

  17. Implementation of a WAP-based telemedicine system for patient monitoring.

    PubMed

    Hung, Kevin; Zhang, Yuan-Ting

    2003-06-01

    Many parties have already demonstrated telemedicine applications that use cellular phones and the Internet. A current trend in telecommunication is the convergence of wireless communication and computer network technologies, and the emergence of wireless application protocol (WAP) devices is an example. Since WAP will also be a common feature found in future mobile communication devices, it is worthwhile to investigate its use in telemedicine. This paper describes the implementation and experiences with a WAP-based telemedicine system for patient-monitoring that has been developed in our laboratory. It utilizes WAP devices as mobile access terminals for general inquiry and patient-monitoring services. Authorized users can browse the patients' general data, monitored blood pressure (BP), and electrocardiogram (ECG) on WAP devices in store-and-forward mode. The applications, written in wireless markup language (WML), WMLScript, and Perl, resided in a content server. A MySQL relational database system was set up to store the BP readings, ECG data, patient records, clinic and hospital information, and doctors' appointments with patients. A wireless ECG subsystem was built for recording ambulatory ECG in an indoor environment and for storing ECG data into the database. For testing, a WAP phone compliant with WAP 1.1 was used at GSM 1800 MHz by circuit-switched data (CSD) to connect to the content server through a WAP gateway, which was provided by a mobile phone service provider in Hong Kong. Data were successfully retrieved from the database and displayed on the WAP phone. The system shows how WAP can be feasible in remote patient-monitoring and patient data retrieval.

  18. Video-assisted segmentation of speech and audio track

    NASA Astrophysics Data System (ADS)

    Pandit, Medha; Yusoff, Yusseri; Kittler, Josef; Christmas, William J.; Chilton, E. H. S.

    1999-08-01

    Video database research is commonly concerned with the storage and retrieval of visual information invovling sequence segmentation, shot representation and video clip retrieval. In multimedia applications, video sequences are usually accompanied by a sound track. The sound track contains potential cues to aid shot segmentation such as different speakers, background music, singing and distinctive sounds. These different acoustic categories can be modeled to allow for an effective database retrieval. In this paper, we address the problem of automatic segmentation of audio track of multimedia material. This audio based segmentation can be combined with video scene shot detection in order to achieve partitioning of the multimedia material into semantically significant segments.

  19. A cloud and radiation model-based algorithm for rainfall retrieval from SSM/I multispectral microwave measurements

    NASA Technical Reports Server (NTRS)

    Xiang, Xuwu; Smith, Eric A.; Tripoli, Gregory J.

    1992-01-01

    A hybrid statistical-physical retrieval scheme is explored which combines a statistical approach with an approach based on the development of cloud-radiation models designed to simulate precipitating atmospheres. The algorithm employs the detailed microphysical information from a cloud model as input to a radiative transfer model which generates a cloud-radiation model database. Statistical procedures are then invoked to objectively generate an initial guess composite profile data set from the database. The retrieval algorithm has been tested for a tropical typhoon case using Special Sensor Microwave/Imager (SSM/I) data and has shown satisfactory results.

  20. Inductive monitoring system constructed from nominal system data and its use in real-time system monitoring

    NASA Technical Reports Server (NTRS)

    Iverson, David L. (Inventor)

    2008-01-01

    The present invention relates to an Inductive Monitoring System (IMS), its software implementations, hardware embodiments and applications. Training data is received, typically nominal system data acquired from sensors in normally operating systems or from detailed system simulations. The training data is formed into vectors that are used to generate a knowledge database having clusters of nominal operating regions therein. IMS monitors a system's performance or health by comparing cluster parameters in the knowledge database with incoming sensor data from a monitored-system formed into vectors. Nominal performance is concluded when a monitored-system vector is determined to lie within a nominal operating region cluster or lies sufficiently close to a such a cluster as determined by a threshold value and a distance metric. Some embodiments of IMS include cluster indexing and retrieval methods that increase the execution speed of IMS.

  1. Development of a database for Louisiana highway bridge scour data : technical summary.

    DOT National Transportation Integrated Search

    1999-10-01

    The objectives of the project included: 1) developed a database with manipulation capabilities such as data retrieval, visualization, and update; 2) Input the existing scour data from DOTD files into the database.

  2. Combining semantic technologies with a content-based image retrieval system - Preliminary considerations

    NASA Astrophysics Data System (ADS)

    Chmiel, P.; Ganzha, M.; Jaworska, T.; Paprzycki, M.

    2017-10-01

    Nowadays, as a part of systematic growth of volume, and variety, of information that can be found on the Internet, we observe also dramatic increase in sizes of available image collections. There are many ways to help users browsing / selecting images of interest. One of popular approaches are Content-Based Image Retrieval (CBIR) systems, which allow users to search for images that match their interests, expressed in the form of images (query by example). However, we believe that image search and retrieval could take advantage of semantic technologies. We have decided to test this hypothesis. Specifically, on the basis of knowledge captured in the CBIR, we have developed a domain ontology of residential real estate (detached houses, in particular). This allows us to semantically represent each image (and its constitutive architectural elements) represented within the CBIR. The proposed ontology was extended to capture not only the elements resulting from image segmentation, but also "spatial relations" between them. As a result, a new approach to querying the image database (semantic querying) has materialized, thus extending capabilities of the developed system.

  3. Mass Spectra-Based Framework for Automated Structural Elucidation of Metabolome Data to Explore Phytochemical Diversity

    PubMed Central

    Matsuda, Fumio; Nakabayashi, Ryo; Sawada, Yuji; Suzuki, Makoto; Hirai, Masami Y.; Kanaya, Shigehiko; Saito, Kazuki

    2011-01-01

    A novel framework for automated elucidation of metabolite structures in liquid chromatography–mass spectrometer metabolome data was constructed by integrating databases. High-resolution tandem mass spectra data automatically acquired from each metabolite signal were used for database searches. Three distinct databases, KNApSAcK, ReSpect, and the PRIMe standard compound database, were employed for the structural elucidation. The outputs were retrieved using the CAS metabolite identifier for identification and putative annotation. A simple metabolite ontology system was also introduced to attain putative characterization of the metabolite signals. The automated method was applied for the metabolome data sets obtained from the rosette leaves of 20 Arabidopsis accessions. Phenotypic variations in novel Arabidopsis metabolites among these accessions could be investigated using this method. PMID:22645535

  4. Information resources at the National Center for Biotechnology Information.

    PubMed Central

    Woodsmall, R M; Benson, D A

    1993-01-01

    The National Center for Biotechnology Information (NCBI), part of the National Library of Medicine, was established in 1988 to perform basic research in the field of computational molecular biology as well as build and distribute molecular biology databases. The basic research has led to new algorithms and analysis tools for interpreting genomic data and has been instrumental in the discovery of human disease genes for neurofibromatosis and Kallmann syndrome. The principal database responsibility is the National Institutes of Health (NIH) genetic sequence database, GenBank. NCBI, in collaboration with international partners, builds, distributes, and provides online and CD-ROM access to over 112,000 DNA sequences. Another major program is the integration of multiple sequences databases and related bibliographic information and the development of network-based retrieval systems for Internet access. PMID:8374583

  5. Data-driven information retrieval in heterogeneous collections of transcriptomics data links SIM2s to malignant pleural mesothelioma.

    PubMed

    Caldas, José; Gehlenborg, Nils; Kettunen, Eeva; Faisal, Ali; Rönty, Mikko; Nicholson, Andrew G; Knuutila, Sakari; Brazma, Alvis; Kaski, Samuel

    2012-01-15

    Genome-wide measurement of transcript levels is an ubiquitous tool in biomedical research. As experimental data continues to be deposited in public databases, it is becoming important to develop search engines that enable the retrieval of relevant studies given a query study. While retrieval systems based on meta-data already exist, data-driven approaches that retrieve studies based on similarities in the expression data itself have a greater potential of uncovering novel biological insights. We propose an information retrieval method based on differential expression. Our method deals with arbitrary experimental designs and performs competitively with alternative approaches, while making the search results interpretable in terms of differential expression patterns. We show that our model yields meaningful connections between biological conditions from different studies. Finally, we validate a previously unknown connection between malignant pleural mesothelioma and SIM2s suggested by our method, via real-time polymerase chain reaction in an independent set of mesothelioma samples. Supplementary data and source code are available from http://www.ebi.ac.uk/fg/research/rex.

  6. Information mining in remote sensing imagery

    NASA Astrophysics Data System (ADS)

    Li, Jiang

    The volume of remotely sensed imagery continues to grow at an enormous rate due to the advances in sensor technology, and our capability for collecting and storing images has greatly outpaced our ability to analyze and retrieve information from the images. This motivates us to develop image information mining techniques, which is very much an interdisciplinary endeavor drawing upon expertise in image processing, databases, information retrieval, machine learning, and software design. This dissertation proposes and implements an extensive remote sensing image information mining (ReSIM) system prototype for mining useful information implicitly stored in remote sensing imagery. The system consists of three modules: image processing subsystem, database subsystem, and visualization and graphical user interface (GUI) subsystem. Land cover and land use (LCLU) information corresponding to spectral characteristics is identified by supervised classification based on support vector machines (SVM) with automatic model selection, while textural features that characterize spatial information are extracted using Gabor wavelet coefficients. Within LCLU categories, textural features are clustered using an optimized k-means clustering approach to acquire search efficient space. The clusters are stored in an object-oriented database (OODB) with associated images indexed in an image database (IDB). A k-nearest neighbor search is performed using a query-by-example (QBE) approach. Furthermore, an automatic parametric contour tracing algorithm and an O(n) time piecewise linear polygonal approximation (PLPA) algorithm are developed for shape information mining of interesting objects within the image. A fuzzy object-oriented database based on the fuzzy object-oriented data (FOOD) model is developed to handle the fuzziness and uncertainty. Three specific applications are presented: integrated land cover and texture pattern mining, shape information mining for change detection of lakes, and fuzzy normalized difference vegetation index (NDVI) pattern mining. The study results show the effectiveness of the proposed system prototype and the potentials for other applications in remote sensing.

  7. An SQL query generator for CLIPS

    NASA Technical Reports Server (NTRS)

    Snyder, James; Chirica, Laurian

    1990-01-01

    As expert systems become more widely used, their access to large amounts of external information becomes increasingly important. This information exists in several forms such as statistical, tabular data, knowledge gained by experts and large databases of information maintained by companies. Because many expert systems, including CLIPS, do not provide access to this external information, much of the usefulness of expert systems is left untapped. The scope of this paper is to describe a database extension for the CLIPS expert system shell. The current industry standard database language is SQL. Due to SQL standardization, large amounts of information stored on various computers, potentially at different locations, will be more easily accessible. Expert systems should be able to directly access these existing databases rather than requiring information to be re-entered into the expert system environment. The ORACLE relational database management system (RDBMS) was used to provide a database connection within the CLIPS environment. To facilitate relational database access a query generation system was developed as a CLIPS user function. The queries are entered in a CLlPS-like syntax and are passed to the query generator, which constructs and submits for execution, an SQL query to the ORACLE RDBMS. The query results are asserted as CLIPS facts. The query generator was developed primarily for use within the ICADS project (Intelligent Computer Aided Design System) currently being developed by the CAD Research Unit in the California Polytechnic State University (Cal Poly). In ICADS, there are several parallel or distributed expert systems accessing a common knowledge base of facts. Expert system has a narrow domain of interest and therefore needs only certain portions of the information. The query generator provides a common method of accessing this information and allows the expert system to specify what data is needed without specifying how to retrieve it.

  8. [EXPERIENCE IN THE APPLICATION OF DATABASES ON BLOODSUCKING INSECTS IN ZOOLOGICAL STUDIES].

    PubMed

    Medvedev, S G; Khalikov, R G

    2016-01-01

    The paper summarizes long-term experience of accumulating and summarizing the faunistic information by means of separate databases (DB) and information analytical systems (IAS), and also prospects of its representation by modern multi-user informational systems. The experience obtained during development and practical use of the PARHOST1 IAS for the study of the world flea fauna and work with personal databases created for the study of bloodsucking insects (lice and blackflies) is analyzed. Research collection material on type series of 57 species and subspecies of fleas of the fauna of Russia was approved as a part of multi-user information retrieval system on the web-portal of the Zoological Institute of the Russian Academy of Sciences. According former investigations, the system allows depositing the information in the authentic form and performing its gradual transformation, i. e. its unification and structuring. In order to provide continuity of DB refill, the possibility of work of operators with different degree of competence is provided.

  9. Smartphone home monitoring of ECG

    NASA Astrophysics Data System (ADS)

    Szu, Harold; Hsu, Charles; Moon, Gyu; Landa, Joseph; Nakajima, Hiroshi; Hata, Yutaka

    2012-06-01

    A system of ambulatory, halter, electrocardiography (ECG) monitoring system has already been commercially available for recording and transmitting heartbeats data by the Internet. However, it enjoys the confidence with a reservation and thus a limited market penetration, our system was targeting at aging global villagers having an increasingly biomedical wellness (BMW) homecare needs, not hospital related BMI (biomedical illness). It was designed within SWaP-C (Size, Weight, and Power, Cost) using 3 innovative modules: (i) Smart Electrode (lowpower mixed signal embedded with modern compressive sensing and nanotechnology to improve the electrodes' contact impedance); (ii) Learnable Database (in terms of adaptive wavelets transform QRST feature extraction, Sequential Query Relational database allowing home care monitoring retrievable Aided Target Recognition); (iii) Smartphone (touch screen interface, powerful computation capability, caretaker reporting with GPI, ID, and patient panic button for programmable emergence procedure). It can provide a supplementary home screening system for the post or the pre-diagnosis care at home with a build-in database searchable with the time, the place, and the degree of urgency happened, using in-situ screening.

  10. Improved image retrieval based on fuzzy colour feature vector

    NASA Astrophysics Data System (ADS)

    Ben-Ahmeida, Ahlam M.; Ben Sasi, Ahmed Y.

    2013-03-01

    One of Image indexing techniques is the Content-Based Image Retrieval which is an efficient way for retrieving images from the image database automatically based on their visual contents such as colour, texture, and shape. In this paper will be discuss how using content-based image retrieval (CBIR) method by colour feature extraction and similarity checking. By dividing the query image and all images in the database into pieces and extract the features of each part separately and comparing the corresponding portions in order to increase the accuracy in the retrieval. The proposed approach is based on the use of fuzzy sets, to overcome the problem of curse of dimensionality. The contribution of colour of each pixel is associated to all the bins in the histogram using fuzzy-set membership functions. As a result, the Fuzzy Colour Histogram (FCH), outperformed the Conventional Colour Histogram (CCH) in image retrieving, due to its speedy results, where were images represented as signatures that took less size of memory, depending on the number of divisions. The results also showed that FCH is less sensitive and more robust to brightness changes than the CCH with better retrieval recall values.

  11. A new natural hazards data-base for volcanic ash and SO2 from global satellite remote sensing measurements

    NASA Astrophysics Data System (ADS)

    Prata, F.; Stebel, K.

    2013-12-01

    Over the last few years there has been a recognition of the utility of satellite measurements to identify and track volcanic emissions that present a natural hazard to human populations. Mitigation of the volcanic hazard to life and the environment requires understanding of the properties of volcanic emissions, identifying the hazard in near real-time and being able to provide timely and accurate forecasts to affected areas. Amongst the many ways to measure volcanic emissions, satellite remote sensing is capable of providing global quantitative retrievals of important microphysical parameters such as ash mass loading, ash particle effective radius, infrared optical depth, SO2 partial and total column abundance, plume altitude, aerosol optical depth and aerosol absorbing index. The eruption of Eyjafjallajokull in April-May, 2010 led to increased research and measurement programs to better characterize properties of volcanic ash and the need to establish a data-base in which to store and access these data was confirmed. The European Space Agency (ESA) has recognized the importance of having a quality controlled data-base of satellite retrievals and has funded an activity (VAST) to develop novel remote sensing retrieval schemes and a data-base, initially focused on several recent hazardous volcanic eruptions. As a first step, satellite retrievals for the eruptions of Eyjafjallajokull, Grimsvotn, Puyhue-Cordon Caulle, Nabro, Merapi, Okmok, Kasatochi and Sarychev Peak are being considered. Here we describe the data, retrievals and methods being developed for the data-base. Three important applications of the data-base are illustrated related to the ash/aviation problem, to the impact of the Merapi volcanic eruption on the local population, and to estimate SO2 fluxes from active volcanoes-as a means to diagnose future unrest. Dispersion model simulations are also being included in the data-base. In time, data from conventional in situ sampling instruments, airborne and ground-based remote sensing platforms and other meta-data (bulk ash and gas properties, volcanic setting, volcanic eruption chronologies, hazards and impacts etc.) will be added. The data-base has the potential to provide the natural hazards community with the first dynamic atmospheric volcanic hazards map and will be a valuable tool particularly for global transport.

  12. The STP (Solar-Terrestrial Physics) Semantic Web based on the RSS1.0 and the RDF

    NASA Astrophysics Data System (ADS)

    Kubo, T.; Murata, K. T.; Kimura, E.; Ishikura, S.; Shinohara, I.; Kasaba, Y.; Watari, S.; Matsuoka, D.

    2006-12-01

    In the Solar-Terrestrial Physics (STP), it is pointed out that circulation and utilization of observation data among researchers are insufficient. To archive interdisciplinary researches, we need to overcome this circulation and utilization problems. Under such a background, authors' group has developed a world-wide database that manages meta-data of satellite and ground-based observation data files. It is noted that retrieving meta-data from the observation data and registering them to database have been carried out by hand so far. Our goal is to establish the STP Semantic Web. The Semantic Web provides a common framework that allows a variety of data shared and reused across applications, enterprises, and communities. We also expect that the secondary information related with observations, such as event information and associated news, are also shared over the networks. The most fundamental issue on the establishment is who generates, manages and provides meta-data in the Semantic Web. We developed an automatic meta-data collection system for the observation data using the RSS (RDF Site Summary) 1.0. The RSS1.0 is one of the XML-based markup languages based on the RDF (Resource Description Framework), which is designed for syndicating news and contents of news-like sites. The RSS1.0 is used to describe the STP meta-data, such as data file name, file server address and observation date. To describe the meta-data of the STP beyond RSS1.0 vocabulary, we defined original vocabularies for the STP resources using the RDF Schema. The RDF describes technical terms on the STP along with the Dublin Core Metadata Element Set, which is standard for cross-domain information resource descriptions. Researchers' information on the STP by FOAF, which is known as an RDF/XML vocabulary, creates a machine-readable metadata describing people. Using the RSS1.0 as a meta-data distribution method, the workflow from retrieving meta-data to registering them into the database is automated. This technique is applied for several database systems, such as the DARTS database system and NICT Space Weather Report Service. The DARTS is a science database managed by ISAS/JAXA in Japan. We succeeded in generating and collecting the meta-data automatically for the CDF (Common data Format) data, such as Reimei satellite data, provided by the DARTS. We also create an RDF service for space weather report and real-time global MHD simulation 3D data provided by the NICT. Our Semantic Web system works as follows: The RSS1.0 documents generated on the data sites (ISAS and NICT) are automatically collected by a meta-data collection agent. The RDF documents are registered and the agent extracts meta-data to store them in the Sesame, which is an open source RDF database with support for RDF Schema inferencing and querying. The RDF database provides advanced retrieval processing that has considered property and relation. Finally, the STP Semantic Web provides automatic processing or high level search for the data which are not only for observation data but for space weather news, physical events, technical terms and researches information related to the STP.

  13. A neotropical Miocene pollen database employing image-based search and semantic modeling.

    PubMed

    Han, Jing Ginger; Cao, Hongfei; Barb, Adrian; Punyasena, Surangi W; Jaramillo, Carlos; Shyu, Chi-Ren

    2014-08-01

    Digital microscopic pollen images are being generated with increasing speed and volume, producing opportunities to develop new computational methods that increase the consistency and efficiency of pollen analysis and provide the palynological community a computational framework for information sharing and knowledge transfer. • Mathematical methods were used to assign trait semantics (abstract morphological representations) of the images of neotropical Miocene pollen and spores. Advanced database-indexing structures were built to compare and retrieve similar images based on their visual content. A Web-based system was developed to provide novel tools for automatic trait semantic annotation and image retrieval by trait semantics and visual content. • Mathematical models that map visual features to trait semantics can be used to annotate images with morphology semantics and to search image databases with improved reliability and productivity. Images can also be searched by visual content, providing users with customized emphases on traits such as color, shape, and texture. • Content- and semantic-based image searches provide a powerful computational platform for pollen and spore identification. The infrastructure outlined provides a framework for building a community-wide palynological resource, streamlining the process of manual identification, analysis, and species discovery.

  14. The MIPAS2D: 2-D analysis of MIPAS observations of ESA target molecules and minor species

    NASA Astrophysics Data System (ADS)

    Arnone, E.; Brizzi, G.; Carlotti, M.; Dinelli, B. M.; Magnani, L.; Papandrea, E.; Ridolfi, M.

    2008-12-01

    Measurements from the MIPAS instrument onboard the ENVISAT satellite were analyzed with the Geofit Multi- Target Retrieval (GMTR) system to obtain 2-dimensional fields of pressure, temperature and volume mixing ratios of H2O, O3, HNO3, CH4, N2O, and NO2. Secondary target species relevant to stratospheric chemistry were also analysed and robust mixing ratios of N2O5, ClONO2, F11, F12, F14 and F22 were obtained. Other minor species with high uncertainties were not included in the database and will be the object of further studies. The analysis covers the original nominal observation mode from July 2002 to March 2004 and it is currently being extended to the ongoing reduced resolution mission. The GMTR algorithm was operated on a fixed 5 degrees latitudinal grid in order to ease the comparison with model calculations and climatological datasets. The generated database of atmospheric fields can be directly used for analyses based on averaging processes with no need of further interpolation. Samples of the obtained products are presented and discussed. The database of the retrieved quantities is made available to the scientific community.

  15. Searching the ASRS Database Using QUORUM Keyword Search, Phrase Search, Phrase Generation, and Phrase Discovery

    NASA Technical Reports Server (NTRS)

    McGreevy, Michael W.; Connors, Mary M. (Technical Monitor)

    2001-01-01

    To support Search Requests and Quick Responses at the Aviation Safety Reporting System (ASRS), four new QUORUM methods have been developed: keyword search, phrase search, phrase generation, and phrase discovery. These methods build upon the core QUORUM methods of text analysis, modeling, and relevance-ranking. QUORUM keyword search retrieves ASRS incident narratives that contain one or more user-specified keywords in typical or selected contexts, and ranks the narratives on their relevance to the keywords in context. QUORUM phrase search retrieves narratives that contain one or more user-specified phrases, and ranks the narratives on their relevance to the phrases. QUORUM phrase generation produces a list of phrases from the ASRS database that contain a user-specified word or phrase. QUORUM phrase discovery finds phrases that are related to topics of interest. Phrase generation and phrase discovery are particularly useful for finding query phrases for input to QUORUM phrase search. The presentation of the new QUORUM methods includes: a brief review of the underlying core QUORUM methods; an overview of the new methods; numerous, concrete examples of ASRS database searches using the new methods; discussion of related methods; and, in the appendices, detailed descriptions of the new methods.

  16. Cry-Bt identifier: a biological database for PCR detection of Cry genes present in transgenic plants.

    PubMed

    Singh, Vinay Kumar; Ambwani, Sonu; Marla, Soma; Kumar, Anil

    2009-10-23

    We describe the development of a user friendly tool that would assist in the retrieval of information relating to Cry genes in transgenic crops. The tool also helps in detection of transformed Cry genes from Bacillus thuringiensis present in transgenic plants by providing suitable designed primers for PCR identification of these genes. The tool designed based on relational database model enables easy retrieval of information from the database with simple user queries. The tool also enables users to access related information about Cry genes present in various databases by interacting with different sources (nucleotide sequences, protein sequence, sequence comparison tools, published literature, conserved domains, evolutionary and structural data). http://insilicogenomics.in/Cry-btIdentifier/welcome.html.

  17. Techniques for Soundscape Retrieval and Synthesis

    NASA Astrophysics Data System (ADS)

    Mechtley, Brandon Michael

    The study of acoustic ecology is concerned with the manner in which life interacts with its environment as mediated through sound. As such, a central focus is that of the soundscape: the acoustic environment as perceived by a listener. This dissertation examines the application of several computational tools in the realms of digital signal processing, multimedia information retrieval, and computer music synthesis to the analysis of the soundscape. Namely, these tools include a) an open source software library, Sirens, which can be used for the segmentation of long environmental field recordings into individual sonic events and compare these events in terms of acoustic content, b) a graph-based retrieval system that can use these measures of acoustic similarity and measures of semantic similarity using the lexical database WordNet to perform both text-based retrieval and automatic annotation of environmental sounds, and c) new techniques for the dynamic, realtime parametric morphing of multiple field recordings, informed by the geographic paths along which they were recorded.

  18. Fast large-scale object retrieval with binary quantization

    NASA Astrophysics Data System (ADS)

    Zhou, Shifu; Zeng, Dan; Shen, Wei; Zhang, Zhijiang; Tian, Qi

    2015-11-01

    The objective of large-scale object retrieval systems is to search for images that contain the target object in an image database. Where state-of-the-art approaches rely on global image representations to conduct searches, we consider many boxes per image as candidates to search locally in a picture. In this paper, a feature quantization algorithm called binary quantization is proposed. In binary quantization, a scale-invariant feature transform (SIFT) feature is quantized into a descriptive and discriminative bit-vector, which allows itself to adapt to the classic inverted file structure for box indexing. The inverted file, which stores the bit-vector and box ID where the SIFT feature is located inside, is compact and can be loaded into the main memory for efficient box indexing. We evaluate our approach on available object retrieval datasets. Experimental results demonstrate that the proposed approach is fast and achieves excellent search quality. Therefore, the proposed approach is an improvement over state-of-the-art approaches for object retrieval.

  19. 3D model retrieval method based on mesh segmentation

    NASA Astrophysics Data System (ADS)

    Gan, Yuanchao; Tang, Yan; Zhang, Qingchen

    2012-04-01

    In the process of feature description and extraction, current 3D model retrieval algorithms focus on the global features of 3D models but ignore the combination of global and local features of the model. For this reason, they show less effective performance to the models with similar global shape and different local shape. This paper proposes a novel algorithm for 3D model retrieval based on mesh segmentation. The key idea is to exact the structure feature and the local shape feature of 3D models, and then to compares the similarities of the two characteristics and the total similarity between the models. A system that realizes this approach was built and tested on a database of 200 objects and achieves expected results. The results show that the proposed algorithm improves the precision and the recall rate effectively.

  20. Data shopping in an open marketplace: Introducing the Ontogrator web application for marking up data using ontologies and browsing using facets.

    PubMed

    Morrison, Norman; Hancock, David; Hirschman, Lynette; Dawyndt, Peter; Verslyppe, Bert; Kyrpides, Nikos; Kottmann, Renzo; Yilmaz, Pelin; Glöckner, Frank Oliver; Grethe, Jeff; Booth, Tim; Sterk, Peter; Nenadic, Goran; Field, Dawn

    2011-04-29

    In the future, we hope to see an open and thriving data market in which users can find and select data from a wide range of data providers. In such an open access market, data are products that must be packaged accordingly. Increasingly, eCommerce sellers present heterogeneous product lines to buyers using faceted browsing. Using this approach we have developed the Ontogrator platform, which allows for rapid retrieval of data in a way that would be familiar to any online shopper. Using Knowledge Organization Systems (KOS), especially ontologies, Ontogrator uses text mining to mark up data and faceted browsing to help users navigate, query and retrieve data. Ontogrator offers the potential to impact scientific research in two major ways: 1) by significantly improving the retrieval of relevant information; and 2) by significantly reducing the time required to compose standard database queries and assemble information for further research. Here we present a pilot implementation developed in collaboration with the Genomic Standards Consortium (GSC) that includes content from the StrainInfo, GOLD, CAMERA, Silva and Pubmed databases. This implementation demonstrates the power of ontogration and highlights that the usefulness of this approach is fully dependent on both the quality of data and the KOS (ontologies) used. Ideally, the use and further expansion of this collaborative system will help to surface issues associated with the underlying quality of annotation and could lead to a systematic means for accessing integrated data resources.

  1. Data shopping in an open marketplace: Introducing the Ontogrator web application for marking up data using ontologies and browsing using facets

    PubMed Central

    Morrison, Norman; Hancock, David; Hirschman, Lynette; Dawyndt, Peter; Verslyppe, Bert; Kyrpides, Nikos; Kottmann, Renzo; Yilmaz, Pelin; Glöckner, Frank Oliver; Grethe, Jeff; Booth, Tim; Sterk, Peter; Nenadic, Goran; Field, Dawn

    2011-01-01

    In the future, we hope to see an open and thriving data market in which users can find and select data from a wide range of data providers. In such an open access market, data are products that must be packaged accordingly. Increasingly, eCommerce sellers present heterogeneous product lines to buyers using faceted browsing. Using this approach we have developed the Ontogrator platform, which allows for rapid retrieval of data in a way that would be familiar to any online shopper. Using Knowledge Organization Systems (KOS), especially ontologies, Ontogrator uses text mining to mark up data and faceted browsing to help users navigate, query and retrieve data. Ontogrator offers the potential to impact scientific research in two major ways: 1) by significantly improving the retrieval of relevant information; and 2) by significantly reducing the time required to compose standard database queries and assemble information for further research. Here we present a pilot implementation developed in collaboration with the Genomic Standards Consortium (GSC) that includes content from the StrainInfo, GOLD, CAMERA, Silva and Pubmed databases. This implementation demonstrates the power of ontogration and highlights that the usefulness of this approach is fully dependent on both the quality of data and the KOS (ontologies) used. Ideally, the use and further expansion of this collaborative system will help to surface issues associated with the underlying quality of annotation and could lead to a systematic means for accessing integrated data resources. PMID:21677865

  2. A content-based retrieval of mammographic masses using the curvelet descriptor

    NASA Astrophysics Data System (ADS)

    Narváez, Fabian; Díaz, Gloria; Gómez, Francisco; Romero, Eduardo

    2012-03-01

    Computer-aided diagnosis (CAD) that uses content based image retrieval (CBIR) strategies has became an important research area. This paper presents a retrieval strategy that automatically recovers mammography masses from a virtual repository of mammographies. Unlike other approaches, we do not attempt to segment masses but instead we characterize the regions previously selected by an expert. These regions are firstly curvelet transformed and further characterized by approximating the marginal curvelet subband distribution with a generalized gaussian density (GGD). The content based retrieval strategy searches similar regions in a database using the Kullback-Leibler divergence as the similarity measure between distributions. The effectiveness of the proposed descriptor was assessed by comparing the automatically assigned label with a ground truth available in the DDSM database.1 A total of 380 masses with different shapes, sizes and margins were used for evaluation, resulting in a mean average precision rate of 89.3% and recall rate of 75.2% for the retrieval task.

  3. Land Surface Microwave Emissivities Derived from AMSR-E and MODIS Measurements with Advanced Quality Control

    NASA Technical Reports Server (NTRS)

    Moncet, Jean-Luc; Liang, Pan; Galantowicz, John F.; Lipton, Alan E.; Uymin, Gennady; Prigent, Catherine; Grassotti, Christopher

    2011-01-01

    A microwave emissivity database has been developed with data from the Advanced Microwave Scanning Radiometer-EOS (AMSR-E) and with ancillary land surface temperature (LST) data from the Moderate Resolution Imaging Spectroradiometer (MODIS) on the same Aqua spacecraft. The primary intended application of the database is to provide surface emissivity constraints in atmospheric and surface property retrieval or assimilation. An additional application is to serve as a dynamic indicator of land surface properties relevant to climate change monitoring. The precision of the emissivity data is estimated to be significantly better than in prior databases from other sensors due to the precise collocation with high-quality MODIS LST data and due to the quality control features of our data analysis system. The accuracy of the emissivities in deserts and semi-arid regions is enhanced by applying, in those regions, a version of the emissivity retrieval algorithm that accounts for the penetration of microwave radiation through dry soil with diurnally varying vertical temperature gradients. These results suggest that this penetration effect is more widespread and more significant to interpretation of passive microwave measurements than had been previously established. Emissivity coverage in areas where persistent cloudiness interferes with the availability of MODIS LST data is achieved using a classification-based method to spread emissivity data from less-cloudy areas that have similar microwave surface properties. Evaluations and analyses of the emissivity products over homogeneous snow-free areas are presented, including application to retrieval of soil temperature profiles. Spatial inhomogeneities are the largest in the vicinity of large water bodies due to the large water/land emissivity contrast and give rise to large apparent temporal variability in the retrieved emissivities when satellite footprint locations vary over time. This issue will be dealt with in the future by including a water fraction correction. Also note that current reliance on the MODIS day-night algorithm as a source of LST limits the coverage of the database in the Polar Regions. We will consider relaxing the current restriction as part of future development.

  4. Digitizing Olin Eggen's Card Database

    NASA Astrophysics Data System (ADS)

    Crast, J.; Silvis, G.

    2017-06-01

    The goal of the Eggen Card Database Project is to recover as many of the photometric observations from Olin Eggen's Card Database as possible and preserve these observations, in digital forms that are accessible by anyone. Any observations of interest to the AAVSO will be added to the AAVSO International Database (AID). Given to the AAVSO on long-term loan by the Cerro Tololo Inter-American Observatory, the database is a collection of over 78,000 index cards holding all Eggen's observations made between 1960 and 1990. The cards were electronically scanned and the resulting 108,000 card images have been published as a series of 2,216 PDF files, which are available from the AAVSO web site. The same images are also stored in an AAVSO online database where they are indexed by star name and card content. These images can be viewed using the eggen card portal online tool. Eggen made observations using filter bands from five different photometric systems. He documented these observations using 15 different data recording formats. Each format represents a combination of filter magnitudes and color indexes. These observations are being transcribed onto spreadsheets, from which observations of value to the AAVSO are added to the AID. A total of 506 U, B, V, R, and I observations were added to the AID for the variable stars S Car and l Car. We would like the reader to search through the card database using the eggen card portal for stars of particular interest. If such stars are found and retrieval of the observations is desired, e-mail the authors, and we will be happy to help retrieve those data for the reader.

  5. Dynamic taxonomies applied to a web-based relational database for geo-hydrological risk mitigation

    NASA Astrophysics Data System (ADS)

    Sacco, G. M.; Nigrelli, G.; Bosio, A.; Chiarle, M.; Luino, F.

    2012-02-01

    In its 40 years of activity, the Research Institute for Geo-hydrological Protection of the Italian National Research Council has amassed a vast and varied collection of historical documentation on landslides, muddy-debris flows, and floods in northern Italy from 1600 to the present. Since 2008, the archive resources have been maintained through a relational database management system. The database is used for routine study and research purposes as well as for providing support during geo-hydrological emergencies, when data need to be quickly and accurately retrieved. Retrieval speed and accuracy are the main objectives of an implementation based on a dynamic taxonomies model. Dynamic taxonomies are a general knowledge management model for configuring complex, heterogeneous information bases that support exploratory searching. At each stage of the process, the user can explore or browse the database in a guided yet unconstrained way by selecting the alternatives suggested for further refining the search. Dynamic taxonomies have been successfully applied to such diverse and apparently unrelated domains as e-commerce and medical diagnosis. Here, we describe the application of dynamic taxonomies to our database and compare it to traditional relational database query methods. The dynamic taxonomy interface, essentially a point-and-click interface, is considerably faster and less error-prone than traditional form-based query interfaces that require the user to remember and type in the "right" search keywords. Finally, dynamic taxonomy users have confirmed that one of the principal benefits of this approach is the confidence of having considered all the relevant information. Dynamic taxonomies and relational databases work in synergy to provide fast and precise searching: one of the most important factors in timely response to emergencies.

  6. Computer systems and methods for the query and visualization multidimensional databases

    DOEpatents

    Stolte, Chris; Tang, Diane L.; Hanrahan, Patrick

    2017-04-25

    A method of generating a data visualization is performed at a computer having a display, one or more processors, and memory. The memory stores one or more programs for execution by the one or more processors. The process receives user specification of a plurality of characteristics of a data visualization. The data visualization is based on data from a multidimensional database. The characteristics specify at least x-position and y-position of data marks corresponding to tuples of data retrieved from the database. The process generates a data visualization according to the specified plurality of characteristics. The data visualization has an x-axis defined based on data for one or more first fields from the database that specify x-position of the data marks and the data visualization has a y-axis defined based on data for one or more second fields from the database that specify y-position of the data marks.

  7. An XML-based Generic Tool for Information Retrieval in Solar Databases

    NASA Astrophysics Data System (ADS)

    Scholl, Isabelle F.; Legay, Eric; Linsolas, Romain

    This paper presents the current architecture of the `Solar Web Project' now in its development phase. This tool will provide scientists interested in solar data with a single web-based interface for browsing distributed and heterogeneous catalogs of solar observations. The main goal is to have a generic application that can be easily extended to new sets of data or to new missions with a low level of maintenance. It is developed with Java and XML is used as a powerful configuration language. The server, independent of any database scheme, can communicate with a client (the user interface) and several local or remote archive access systems (such as existing web pages, ftp sites or SQL databases). Archive access systems are externally described in XML files. The user interface is also dynamically generated from an XML file containing the window building rules and a simplified database description. This project is developed at MEDOC (Multi-Experiment Data and Operations Centre), located at the Institut d'Astrophysique Spatiale (Orsay, France). Successful tests have been conducted with other solar archive access systems.

  8. BioMart Central Portal: an open database network for the biological community.

    PubMed

    Guberman, Jonathan M; Ai, J; Arnaiz, O; Baran, Joachim; Blake, Andrew; Baldock, Richard; Chelala, Claude; Croft, David; Cros, Anthony; Cutts, Rosalind J; Di Génova, A; Forbes, Simon; Fujisawa, T; Gadaleta, E; Goodstein, D M; Gundem, Gunes; Haggarty, Bernard; Haider, Syed; Hall, Matthew; Harris, Todd; Haw, Robin; Hu, S; Hubbard, Simon; Hsu, Jack; Iyer, Vivek; Jones, Philip; Katayama, Toshiaki; Kinsella, R; Kong, Lei; Lawson, Daniel; Liang, Yong; Lopez-Bigas, Nuria; Luo, J; Lush, Michael; Mason, Jeremy; Moreews, Francois; Ndegwa, Nelson; Oakley, Darren; Perez-Llamas, Christian; Primig, Michael; Rivkin, Elena; Rosanoff, S; Shepherd, Rebecca; Simon, Reinhard; Skarnes, B; Smedley, Damian; Sperling, Linda; Spooner, William; Stevenson, Peter; Stone, Kevin; Teague, J; Wang, Jun; Wang, Jianxin; Whitty, Brett; Wong, D T; Wong-Erasmus, Marie; Yao, L; Youens-Clark, Ken; Yung, Christina; Zhang, Junjun; Kasprzyk, Arek

    2011-01-01

    BioMart Central Portal is a first of its kind, community-driven effort to provide unified access to dozens of biological databases spanning genomics, proteomics, model organisms, cancer data, ontology information and more. Anybody can contribute an independently maintained resource to the Central Portal, allowing it to be exposed to and shared with the research community, and linking it with the other resources in the portal. Users can take advantage of the common interface to quickly utilize different sources without learning a new system for each. The system also simplifies cross-database searches that might otherwise require several complicated steps. Several integrated tools streamline common tasks, such as converting between ID formats and retrieving sequences. The combination of a wide variety of databases, an easy-to-use interface, robust programmatic access and the array of tools make Central Portal a one-stop shop for biological data querying. Here, we describe the structure of Central Portal and show example queries to demonstrate its capabilities.

  9. The Quality Control of Data in a Clinical Database System—The Patient Identification Problem *

    PubMed Central

    Lai, J. Chi-Sang; Covvey, H.D.; Sevcik, K.C.; Wigle, E.D.

    1981-01-01

    Ensuring the accuracy of patient identification and the linkage of records with the appropriate patient owner is the first level of quality control of data in a clinical database system. Without a unique patient identifier, the fact that patient identity may be recorded at different places and times means that multiple identities may be associated with a given patient and new records associated with any of these identities. Even when a unique patient identifier is utilized, errors introduced in the data handling process can result in the same problems. The outcome is that the retrieval request for a given record may fail, or an erroneously identified record may be retrieved. We have studied each of the ways this fundamental problem occurs and propose a solution based on record linkage techniques to detect errors of this type. Specifically, we propose a patient identification scheme for the situation where no unique health identifier is available and detail a method to find patient records with erroneous identifiers.

  10. A Firefly Algorithm-based Approach for Pseudo-Relevance Feedback: Application to Medical Database.

    PubMed

    Khennak, Ilyes; Drias, Habiba

    2016-11-01

    The difficulty of disambiguating the sense of the incomplete and imprecise keywords that are extensively used in the search queries has caused the failure of search systems to retrieve the desired information. One of the most powerful and promising method to overcome this shortcoming and improve the performance of search engines is Query Expansion, whereby the user's original query is augmented by new keywords that best characterize the user's information needs and produce more useful query. In this paper, a new Firefly Algorithm-based approach is proposed to enhance the retrieval effectiveness of query expansion while maintaining low computational complexity. In contrast to the existing literature, the proposed approach uses a Firefly Algorithm to find the best expanded query among a set of expanded query candidates. Moreover, this new approach allows the determination of the length of the expanded query empirically. Experimental results on MEDLINE, the on-line medical information database, show that our proposed approach is more effective and efficient compared to the state-of-the-art.

  11. Is More Always Better?

    ERIC Educational Resources Information Center

    Bell, Steven J.

    2003-01-01

    Discusses full-text databases and whether existing aggregator databases are meeting user needs. Topics include the need for better search interfaces; concepts of quality research and information retrieval; information overload; full text in electronic journal collections versus aggregator databases; underrepresentation of certain disciplines; and…

  12. Tests of methods for evaluating bibliographic databases: an analysis of the National Library of Medicine's handling of literatures in the medical behavioral sciences.

    PubMed

    Griffith, B C; White, H D; Drott, M C; Saye, J D

    1986-07-01

    This article reports on five separate studies designed for the National Library of Medicine (NLM) to develop and test methodologies for evaluating the products of large databases. The methodologies were tested on literatures of the medical behavioral sciences (MBS). One of these studies examined how well NLM covered MBS monographic literature using CATLINE and OCLC. Another examined MBS journal and serial literature coverage in MEDLINE and other MBS-related databases available through DIALOG. These two studies used 1010 items derived from the reference lists of sixty-one journals, and tested for gaps and overlaps in coverage in the various databases. A third study examined the quality of the indexing NLM provides to MBS literatures and developed a measure of indexing as a system component. The final two studies explored how well MEDLINE retrieved documents on topics submitted by MBS professionals and how online searchers viewed MEDLINE (and other systems and databases) in handling MBS topics. The five studies yielded both broad research outcomes and specific recommendations to NLM.

  13. Optimality of the basic colour categories for classification

    PubMed Central

    Griffin, Lewis D

    2005-01-01

    Categorization of colour has been widely studied as a window into human language and cognition, and quite separately has been used pragmatically in image-database retrieval systems. This suggests the hypothesis that the best category system for pragmatic purposes coincides with human categories (i.e. the basic colours). We have tested this hypothesis by assessing the performance of different category systems in a machine-vision task. The task was the identification of the odd-one-out from triples of images obtained using a web-based image-search service. In each triple, two of the images had been retrieved using the same search term, the other a different term. The terms were simple concrete nouns. The results were as follows: (i) the odd-one-out task can be performed better than chance using colour alone; (ii) basic colour categorization performs better than random systems of categories; (iii) a category system that performs better than the basic colours could not be found; and (iv) it is not just the general layout of the basic colours that is important, but also the detail. We conclude that (i) the results support the plausibility of an explanation for the basic colours as a result of a pressure-to-optimality and (ii) the basic colours are good categories for machine vision image-retrieval systems. PMID:16849219

  14. Incomplete evidence: the inadequacy of databases in tracing published adverse drug reactions in clinical trials

    PubMed Central

    Derry, Sheena; Kong Loke, Yoon; Aronson, Jeffrey K

    2001-01-01

    Background We would expect information on adverse drug reactions in randomised clinical trials to be easily retrievable from specific searches of electronic databases. However, complete retrieval of such information may not be straightforward, for two reasons. First, not all clinical drug trials provide data on the frequency of adverse effects. Secondly, not all electronic records of trials include terms in the abstract or indexing fields that enable us to select those with adverse effects data. We have determined how often automated search methods, using indexing terms and/or textwords in the title or abstract, would fail to retrieve trials with adverse effects data. Methods We used a sample set of 107 trials known to report frequencies of adverse drug effects, and measured the proportion that (i) were not assigned the appropriate adverse effects indexing terms in the electronic databases, and (ii) did not contain identifiable adverse effects textwords in the title or abstract. Results Of the 81 trials with records on both MEDLINE and EMBASE, 25 were not indexed for adverse effects in either database. Twenty-six trials were indexed in one database but not the other. Only 66 of the 107 trials reporting adverse effects data mentioned this in the abstract or title of the paper. Simultaneous use of textword and indexing terms retrieved only 82/107 (77%) papers. Conclusions Specific search strategies based on adverse effects textwords and indexing terms will fail to identify nearly a quarter of trials that report on the rate of drug adverse effects. PMID:11591220

  15. A Holistic, Similarity-Based Approach for Personalized Ranking in Web Databases

    ERIC Educational Resources Information Center

    Telang, Aditya

    2011-01-01

    With the advent of the Web, the notion of "information retrieval" has acquired a completely new connotation and currently encompasses several disciplines ranging from traditional forms of text and data retrieval in unstructured and structured repositories to retrieval of static and dynamic information from the contents of the surface and deep Web.…

  16. [A survey of the best bibliographic searching system in occupational medicine and discussion of its implementation].

    PubMed

    Inoue, J

    1991-12-01

    When occupational health personnel, especially occupational physicians search bibliographies, they usually have to search bibliographies by themselves. Also, if a library is not available because of the location of their work place, they might have to rely on online databases. Although there are many commercial databases in the world, people who seldom use them, will have problems with on-line searching, such as user-computer interface, keywords, and so on. The present study surveyed the best bibliographic searching system in the field of occupational medicine by questionnaire through the use of DIALOG OnDisc MEDLINE as a commercial database. In order to ascertain the problems involved in determining the best bibliographic searching system, a prototype bibliographic searching system was constructed and then evaluated. Finally, solutions for the problems were discussed. These led to the following conclusions: to construct the best bibliographic searching system at the present time, 1) a concept of micro-to-mainframe links (MML) is needed for the computer hardware network; 2) multi-lingual font standards and an excellent common user-computer interface are needed for the computer software; 3) a short course and education of database management systems, and support of personal information processing for retrieved data are necessary for the practical use of the system.

  17. High-performance Negative Database for Massive Data Management System of The Mingantu Spectral Radioheliograph

    NASA Astrophysics Data System (ADS)

    Shi, Congming; Wang, Feng; Deng, Hui; Liu, Yingbo; Liu, Cuiyin; Wei, Shoulin

    2017-08-01

    As a dedicated synthetic aperture radio interferometer in China, the MingantU SpEctral Radioheliograph (MUSER), initially known as the Chinese Spectral RadioHeliograph (CSRH), has entered the stage of routine observation. More than 23 million data records per day need to be effectively managed to provide high-performance data query and retrieval for scientific data reduction. In light of these massive amounts of data generated by the MUSER, in this paper, a novel data management technique called the negative database (ND) is proposed and used to implement a data management system for the MUSER. Based on the key-value database, the ND technique makes complete utilization of the complement set of observational data to derive the requisite information. Experimental results showed that the proposed ND can significantly reduce storage volume in comparison with a relational database management system (RDBMS). Even when considering the time needed to derive records that were absent, its overall performance, including querying and deriving the data of the ND, is comparable with that of a relational database management system (RDBMS). The ND technique effectively solves the problem of massive data storage for the MUSER and is a valuable reference for the massive data management required in next-generation telescopes.

  18. User assumptions about information retrieval systems: Ethical concerns

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Froehlich, T.J.

    Information professionals, whether designers, intermediaries, database producers or vendors, bear some responsibility for the information that they make available to users of information systems. The users of such systems may tend to make many assumptions about the information that a system provides, such as believing: that the data are comprehensive, current and accurate, that the information resources or databases have same degree of quality and consistency of indexing; that the abstracts, if they exist, correctly and adequate reflect the content of the article; that there is consistency informs of author names or journal titles or indexing within and across databases;more » that there is standardization in and across databases; that once errors are detected, they are corrected; that appropriate choices of databases or information resources are a relatively easy matter, etc. The truth is that few of these assumptions are valid in commercia or corporate or organizational databases. However, given these beliefs and assumptions by many users, often promoted by information providers, information professionals, impossible, should intervene to warn users about the limitations and constraints of the databases they are using. With the growth of the Internet and end-user products (e.g., CD-ROMs), such interventions have significantly declined. In such cases, information should be provided on start-up or through interface screens, indicating to users, the constraints and orientation of the system they are using. The principle of {open_quotes}caveat emptor{close_quotes} is naive and socially irresponsible: information professionals or systems have an obligation to provide some framework or context for the information that users are accessing.« less

  19. History Places: A Case Study for Relational Database and Information Retrieval System Design

    ERIC Educational Resources Information Center

    Hendry, David G.

    2007-01-01

    This article presents a project-based case study that was developed for students with diverse backgrounds and varied inclinations for engaging technical topics. The project, called History Places, requires that student teams develop a vision for a kind of digital library, propose a conceptual model, and use the model to derive a logical model and…

  20. Automated Airdrop Information Retrieval System-Human Fact ors Database (AAIRS-HFD) (Users Manual)

    DTIC Science & Technology

    1994-09-01

    creeps, or chokes) Pressure Change Disorders Loss of Sensorimotor Abilities Loss of Cognitive/Perceptual Abilities Treatment drug therapy ...physical therapy cognitive therapy biofeedback therapy 73 9. Psychological Factors Situational Awareness altitude awareness Visual/Spatial...on/off valve prebreather Floatation Devices life preserver Scuba Gear Ankle Braces Knee Braces/Pads 82 7. Cargo/Resupply Parachute Assembly

  1. UniGene Tabulator: a full parser for the UniGene format.

    PubMed

    Lenzi, Luca; Frabetti, Flavia; Facchin, Federica; Casadei, Raffaella; Vitale, Lorenza; Canaider, Silvia; Carinci, Paolo; Zannotti, Maria; Strippoli, Pierluigi

    2006-10-15

    UniGene Tabulator 1.0 provides a solution for full parsing of UniGene flat file format; it implements a structured graphical representation of each data field present in UniGene following import into a common database managing system usable in a personal computer. This database includes related tables for sequence, protein similarity, sequence-tagged site (STS) and transcript map interval (TXMAP) data, plus a summary table where each record represents a UniGene cluster. UniGene Tabulator enables full local management of UniGene data, allowing parsing, querying, indexing, retrieving, exporting and analysis of UniGene data in a relational database form, usable on Macintosh (OS X 10.3.9 or later) and Windows (2000, with service pack 4, XP, with service pack 2 or later) operating systems-based computers. The current release, including both the FileMaker runtime applications, is freely available at http://apollo11.isto.unibo.it/software/

  2. The Cystic Fibrosis Database: Content and Research Opportunities.

    ERIC Educational Resources Information Center

    Shaw, William M., Jr.; And Others

    1991-01-01

    Describes the files contained in the Cystic Fibrosis (CF) database and discusses educational and research opportunities using this database. Topics discussed include queries, evaluating the relevance of items retrieved, and use of the database in an online searching course in the School of Information and Library Science at the University of North…

  3. Scientific Communication of Geochemical Data and the Use of Computer Databases.

    ERIC Educational Resources Information Center

    Le Bas, M. J.; Durham, J.

    1989-01-01

    Describes a scheme in the United Kingdom that coordinates geochemistry publications with a computerized geochemistry database. The database comprises not only data published in the journals but also the remainder of the pertinent data set. The discussion covers the database design; collection, storage and retrieval of data; and plans for future…

  4. Knowledge Discovery in Databases.

    ERIC Educational Resources Information Center

    Norton, M. Jay

    1999-01-01

    Knowledge discovery in databases (KDD) revolves around the investigation and creation of knowledge, processes, algorithms, and mechanisms for retrieving knowledge from data collections. The article is an introductory overview of KDD. The rationale and environment of its development and applications are discussed. Issues related to database design…

  5. Development of a web database portfolio system with PACS connectivity for undergraduate health education and continuing professional development.

    PubMed

    Ng, Curtise K C; White, Peter; McKay, Janice C

    2009-04-01

    Increasingly, the use of web database portfolio systems is noted in medical and health education, and for continuing professional development (CPD). However, the functions of existing systems are not always aligned with the corresponding pedagogy and hence reflection is often lost. This paper presents the development of a tailored web database portfolio system with Picture Archiving and Communication System (PACS) connectivity, which is based on the portfolio pedagogy. Following a pre-determined portfolio framework, a system model with the components of web, database and mail servers, server side scripts, and a Query/Retrieve (Q/R) broker for conversion between Hypertext Transfer Protocol (HTTP) requests and Q/R service class of Digital Imaging and Communication in Medicine (DICOM) standard, is proposed. The system was piloted with seventy-seven volunteers. A tailored web database portfolio system (http://radep.hti.polyu.edu.hk) was developed. Technological arrangements for reinforcing portfolio pedagogy include popup windows (reminders) with guidelines and probing questions of 'collect', 'select' and 'reflect' on evidence of development/experience, limitation in the number of files (evidence) to be uploaded, the 'Evidence Insertion' functionality to link the individual uploaded artifacts with reflective writing, capability to accommodate diversity of contents and convenient interfaces for reviewing portfolios and communication. Evidence to date suggests the system supports users to build their portfolios with sound hypertext reflection under a facilitator's guidance, and with reviewers to monitor students' progress providing feedback and comments online in a programme-wide situation.

  6. Sagace: A web-based search engine for biomedical databases in Japan

    PubMed Central

    2012-01-01

    Background In the big data era, biomedical research continues to generate a large amount of data, and the generated information is often stored in a database and made publicly available. Although combining data from multiple databases should accelerate further studies, the current number of life sciences databases is too large to grasp features and contents of each database. Findings We have developed Sagace, a web-based search engine that enables users to retrieve information from a range of biological databases (such as gene expression profiles and proteomics data) and biological resource banks (such as mouse models of disease and cell lines). With Sagace, users can search more than 300 databases in Japan. Sagace offers features tailored to biomedical research, including manually tuned ranking, a faceted navigation to refine search results, and rich snippets constructed with retrieved metadata for each database entry. Conclusions Sagace will be valuable for experts who are involved in biomedical research and drug development in both academia and industry. Sagace is freely available at http://sagace.nibio.go.jp/en/. PMID:23110816

  7. The ESIS query environment pilot project

    NASA Technical Reports Server (NTRS)

    Fuchs, Jens J.; Ciarlo, Alessandro; Benso, Stefano

    1993-01-01

    The European Space Information System (ESIS) was originally conceived to provide the European space science community with simple and efficient access to space data archives, facilities with which to examine and analyze the retrieved data, and general information services. To achieve that ESIS will provide the scientists with a discipline specific environment for querying in a uniform and transparent manner data stored in geographically dispersed archives. Furthermore it will provide discipline specific tools for displaying and analyzing the retrieved data. The central concept of ESIS is to achieve a more efficient and wider usage of space scientific data, while maintaining the physical archives at the institutions which created them, and has the best background for ensuring and maintaining the scientific validity and interest of the data. In addition to coping with the physical distribution of data, ESIS is to manage also the heterogenity of the individual archives' data models, formats and data base management systems. Thus the ESIS system shall appear to the user as a single database, while it does in fact consist of a collection of dispersed and locally managed databases and data archives. The work reported in this paper is one of the results of the ESIS Pilot Project which is to be completed in 1993. More specifically it presents the pilot ESIS Query Environment (ESIS QE) system which forms the data retrieval and data dissemination axis of the ESIS system. The others are formed by the ESIS Correlation Environment (ESIS CE) and the ESIS Information Services. The ESIS QE Pilot Project is carried out for the European Space Agency's Research and Information center, ESRIN, by a Consortium consisting of Computer Resources International, Denmark, CISET S.p.a, Italy, the University of Strasbourg, France and the Rutherford Appleton Laboratories in the U.K. Furthermore numerous scientists both within ESA and space science community in Europe have been involved in defining the core concepts of the ESIS system.

  8. A MYSQL-BASED DATA ARCHIVER: PRELIMINARY RESULTS

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Matthew Bickley; Christopher Slominski

    2008-01-23

    Following an evaluation of the archival requirements of the Jefferson Laboratory accelerator’s user community, a prototyping effort was executed to determine if an archiver based on MySQL had sufficient functionality to meet those requirements. This approach was chosen because an archiver based on a relational database enables the development effort to focus on data acquisition and management, letting the database take care of storage, indexing and data consistency. It was clear from the prototype effort that there were no performance impediments to successful implementation of a final system. With our performance concerns addressed, the lab undertook the design and developmentmore » of an operational system. The system is in its operational testing phase now. This paper discusses the archiver system requirements, some of the design choices and their rationale, and presents the acquisition, storage and retrieval performance.« less

  9. Citation searching: a systematic review case study of multiple risk behaviour interventions

    PubMed Central

    2014-01-01

    Background The value of citation searches as part of the systematic review process is currently unknown. While the major guides to conducting systematic reviews state that citation searching should be carried out in addition to searching bibliographic databases there are still few studies in the literature that support this view. Rather than using a predefined search strategy to retrieve studies, citation searching uses known relevant papers to identify further papers. Methods We describe a case study about the effectiveness of using the citation sources Google Scholar, Scopus, Web of Science and OVIDSP MEDLINE to identify records for inclusion in a systematic review. We used the 40 included studies identified by traditional database searches from one systematic review of interventions for multiple risk behaviours. We searched for each of the included studies in the four citation sources to retrieve the details of all papers that have cited these studies. We carried out two analyses; the first was to examine the overlap between the four citation sources to identify which citation tool was the most useful; the second was to investigate whether the citation searches identified any relevant records in addition to those retrieved by the original database searches. Results The highest number of citations was retrieved from Google Scholar (1680), followed by Scopus (1173), then Web of Science (1095) and lastly OVIDSP (213). To retrieve all the records identified by the citation tracking searching all four resources was required. Google Scholar identified the highest number of unique citations. The citation tracking identified 9 studies that met the review’s inclusion criteria. Eight of these had already been identified by the traditional databases searches and identified in the screening process while the ninth was not available in any of the databases when the original searches were carried out. It would, however, have been identified by two of the database search strategies if searches had been carried out later. Conclusions Based on the results from this investigation, citation searching as a supplementary search method for systematic reviews may not be the best use of valuable time and resources. It would be useful to verify these findings in other reviews. PMID:24893958

  10. Enhanced Deep Blue Aerosol Retrieval Algorithm: The Second Generation

    NASA Technical Reports Server (NTRS)

    Hsu, N. C.; Jeong, M.-J.; Bettenhausen, C.; Sayer, A. M.; Hansell, R.; Seftor, C. S.; Huang, J.; Tsay, S.-C.

    2013-01-01

    The aerosol products retrieved using the MODIS collection 5.1 Deep Blue algorithm have provided useful information about aerosol properties over bright-reflecting land surfaces, such as desert, semi-arid, and urban regions. However, many components of the C5.1 retrieval algorithm needed to be improved; for example, the use of a static surface database to estimate surface reflectances. This is particularly important over regions of mixed vegetated and non- vegetated surfaces, which may undergo strong seasonal changes in land cover. In order to address this issue, we develop a hybrid approach, which takes advantage of the combination of pre-calculated surface reflectance database and normalized difference vegetation index in determining the surface reflectance for aerosol retrievals. As a result, the spatial coverage of aerosol data generated by the enhanced Deep Blue algorithm has been extended from the arid and semi-arid regions to the entire land areas.

  11. Imaged document information location and extraction using an optical correlator

    NASA Astrophysics Data System (ADS)

    Stalcup, Bruce W.; Dennis, Phillip W.; Dydyk, Robert B.

    1999-12-01

    Today, the paper document is fast becoming a thing of the past. With the rapid development of fast, inexpensive computing and storage devices, many government and private organizations are archiving their documents in electronic form (e.g., personnel records, medical records, patents, etc.). Many of these organizations are converting their paper archives to electronic images, which are then stored in a computer database. Because of this, there is a need to efficiently organize this data into comprehensive and accessible information resources and provide for rapid access to the information contained within these imaged documents. To meet this need, Litton PRC and Litton Data Systems Division are developing a system, the Imaged Document Optical Correlation and Conversion System (IDOCCS), to provide a total solution to the problem of managing and retrieving textual and graphic information from imaged document archives. At the heart of IDOCCS, optical correlation technology provide a means for the search and retrieval of information from imaged documents. IDOCCS can be used to rapidly search for key words or phrases within the imaged document archives and has the potential to determine the types of languages contained within a document. In addition, IDOCCS can automatically compare an input document with the archived database to determine if it is a duplicate, thereby reducing the overall resources required to maintain and access the document database. Embedded graphics on imaged pages can also be exploited, e.g., imaged documents containing an agency's seal or logo can be singled out. In this paper, we present a description of IDOCCS as well as preliminary performance results and theoretical projections.

  12. Maintaining Multimedia Data in a Geospatial Database

    DTIC Science & Technology

    2012-09-01

    at PostgreSQL and MySQL as spatial databases was offered. Given their results, as each database produced result sets from zero to 100,000, it was...excelled given multiple conditions. A different look at PostgreSQL and MySQL as spatial databases was offered. Given their results, as each database... MySQL ................................................................................................14  B.  BENCHMARKING DATA RETRIEVED FROM TABLE

  13. Plant Genome Resources at the National Center for Biotechnology Information

    PubMed Central

    Wheeler, David L.; Smith-White, Brian; Chetvernin, Vyacheslav; Resenchuk, Sergei; Dombrowski, Susan M.; Pechous, Steven W.; Tatusova, Tatiana; Ostell, James

    2005-01-01

    The National Center for Biotechnology Information (NCBI) integrates data from more than 20 biological databases through a flexible search and retrieval system called Entrez. A core Entrez database, Entrez Nucleotide, includes GenBank and is tightly linked to the NCBI Taxonomy database, the Entrez Protein database, and the scientific literature in PubMed. A suite of more specialized databases for genomes, genes, gene families, gene expression, gene variation, and protein domains dovetails with the core databases to make Entrez a powerful system for genomic research. Linked to the full range of Entrez databases is the NCBI Map Viewer, which displays aligned genetic, physical, and sequence maps for eukaryotic genomes including those of many plants. A specialized plant query page allow maps from all plant genomes covered by the Map Viewer to be searched in tandem to produce a display of aligned maps from several species. PlantBLAST searches against the sequences shown in the Map Viewer allow BLAST alignments to be viewed within a genomic context. In addition, precomputed sequence similarities, such as those for proteins offered by BLAST Link, enable fluid navigation from unannotated to annotated sequences, quickening the pace of discovery. NCBI Web pages for plants, such as Plant Genome Central, complete the system by providing centralized access to NCBI's genomic resources as well as links to organism-specific Web pages beyond NCBI. PMID:16010002

  14. High-level specification of a proposed information architecture for support of a bioterrorism early-warning system.

    PubMed

    Berkowitz, Murray R

    2013-01-01

    Current information systems for use in detecting bioterrorist attacks lack a consistent, overarching information architecture. An overview of the use of biological agents as weapons during a bioterrorist attack is presented. Proposed are the design, development, and implementation of a medical informatics system to mine pertinent databases, retrieve relevant data, invoke appropriate biostatistical and epidemiological software packages, and automatically analyze these data. The top-level information architecture is presented. Systems requirements and functional specifications for this level are presented. Finally, future studies are identified.

  15. Enamel color changes following orthodontic treatment.

    PubMed

    Pandian, Akshaya; Ranganathan, Sukanya; Padmanabhan, Sridevi

    2017-01-01

    To evaluate and compare the effect of various orthodontic bonding systems and clean up procedures on quantitative enamel colour change. A literature search was done to identify the studies that assessed the quantitative enamel colour change associated with the various bonding systems and cleanup procedures. Electronic database (Pub Med, Cochrane and Google Scholar) were searched. First stage screening was performed and the abstracts were selected according to the initial selection criteria. Full text articles were retrieved and analyzed during second stage screening. The bibliographies were reviewed to identify additional relevant studies. Sixteen full text articles were retrieved. Six were rejected because the methodology was different. There was significant enamel colour change following orthodontic bonding, debonding and clean up procedures. Self-etching primers produce less enamel colour change compared to conventional etching. Resin Modified GIC produces least colour change compared to other light cure and chemical cure systems. Polishing following the clean-up procedure reduces the colour change of the enamel.

  16. An object-oriented programming system for the integration of internet-based bioinformatics resources.

    PubMed

    Beveridge, Allan

    2006-01-01

    The Internet consists of a vast inhomogeneous reservoir of data. Developing software that can integrate a wide variety of different data sources is a major challenge that must be addressed for the realisation of the full potential of the Internet as a scientific research tool. This article presents a semi-automated object-oriented programming system for integrating web-based resources. We demonstrate that the current Internet standards (HTML, CGI [common gateway interface], Java, etc.) can be exploited to develop a data retrieval system that scans existing web interfaces and then uses a set of rules to generate new Java code that can automatically retrieve data from the Web. The validity of the software has been demonstrated by testing it on several biological databases. We also examine the current limitations of the Internet and discuss the need for the development of universal standards for web-based data.

  17. An Abstraction-Based Data Model for Information Retrieval

    NASA Astrophysics Data System (ADS)

    McAllister, Richard A.; Angryk, Rafal A.

    Language ontologies provide an avenue for automated lexical analysis that may be used to supplement existing information retrieval methods. This paper presents a method of information retrieval that takes advantage of WordNet, a lexical database, to generate paths of abstraction, and uses them as the basis for an inverted index structure to be used in the retrieval of documents from an indexed corpus. We present this method as a entree to a line of research on using ontologies to perform word-sense disambiguation and improve the precision of existing information retrieval techniques.

  18. Generating PubMed Chemical Queries for Consumer Health Literature

    PubMed Central

    Loo, Jeffery; Chang, Hua Florence; Hochstein, Colette; Sun, Ying

    2005-01-01

    Two popular NLM resources that provide information for consumers about chemicals and their safety are the Household Products Database and Haz-Map. Search queries to PubMed via web links were generated from these databases. The query retrieves consumer health-oriented literature about adverse effects of chemicals. The retrieval was limited to a manageable set of 20 to 60 citations, achieved by successively applying increasing limits to the search until the desired number of references was reached. PMID:16779322

  19. The study of co-citation analysis and knowledge structure on healthcare domain

    NASA Astrophysics Data System (ADS)

    Chu, Kuo-Chung; Liu, Wen-I.; Tsai, Ming-Yu

    2012-11-01

    With the prevalence of Internet and digital archives, the online e-journal database facilitates scholars to search literature in a research domain, or to cross-search an inter-disciplined field; the key literature can be efficiently traced out. This study intends to build a Web-based citation analysis system, which consists of four modules, they are: 1) literature search module; (2) statistics module; (3) articles analysis module; and (4) co-citation analysis module. The system focuses on PubMed Central dataset that has 170,000 records. In a research domain, a specific keyword searches in terms of authors, journals, and core issues. In addition, we use data mining techniques for co-citation analysis. The results assist researchers with in-depth understanding of the domain knowledge. Having an automated system for co-citation analysis, it helps to understand changes, trends, and knowledge structure of research domain. For the best of our knowledge, the proposed system differentiates from existing online electronic retrieval database analysis function. Perhaps, the proposed system is going to be a value-added database of healthcare domain, and hope to contribute the researchers.

  20. PubSearch and PubFetch: a simple management system for semiautomated retrieval and annotation of biological information from the literature.

    PubMed

    Yoo, Danny; Xu, Iris; Berardini, Tanya Z; Rhee, Seung Yon; Narayanasamy, Vijay; Twigger, Simon

    2006-03-01

    For most systems in biology, a large body of literature exists that describes the complexity of the system based on experimental results. Manual review of this literature to extract targeted information into biological databases is difficult and time consuming. To address this problem, we developed PubSearch and PubFetch, which store literature, keyword, and gene information in a relational database, index the literature with keywords and gene names, and provide a Web user interface for annotating the genes from experimental data found in the associated literature. A set of protocols is provided in this unit for installing, populating, running, and using PubSearch and PubFetch. In addition, we provide support protocols for performing controlled vocabulary annotations. Intended users of PubSearch and PubFetch are database curators and biology researchers interested in tracking the literature and capturing information about genes of interest in a more effective way than with conventional spreadsheets and lab notebooks.

  1. Building a Massive Volcano Archive and the Development of a Tool for the Science Community

    NASA Technical Reports Server (NTRS)

    Linick, Justin

    2012-01-01

    The Jet Propulsion Laboratory has traditionally housed one of the world's largest databases of volcanic satellite imagery, the ASTER Volcano Archive (10Tb), making these data accessible online for public and scientific use. However, a series of changes in how satellite imagery is housed by the Earth Observing System (EOS) Data Information System has meant that JPL has been unable to systematically maintain its database for the last several years. We have provided a fast, transparent, machine-to-machine client that has updated JPL's database and will keep it current in near real-time. The development of this client has also given us the capability to retrieve any data provided by NASA's Earth Observing System Clearinghouse (ECHO) that covers a volcanic event reported by U.S. Air Force Weather Agency (AFWA). We will also provide a publicly available tool that interfaces with ECHO that can provide functionality not available in any of ECHO's Earth science discovery tools.

  2. Automatic classification and detection of clinically relevant images for diabetic retinopathy

    NASA Astrophysics Data System (ADS)

    Xu, Xinyu; Li, Baoxin

    2008-03-01

    We proposed a novel approach to automatic classification of Diabetic Retinopathy (DR) images and retrieval of clinically-relevant DR images from a database. Given a query image, our approach first classifies the image into one of the three categories: microaneurysm (MA), neovascularization (NV) and normal, and then it retrieves DR images that are clinically-relevant to the query image from an archival image database. In the classification stage, the query DR images are classified by the Multi-class Multiple-Instance Learning (McMIL) approach, where images are viewed as bags, each of which contains a number of instances corresponding to non-overlapping blocks, and each block is characterized by low-level features including color, texture, histogram of edge directions, and shape. McMIL first learns a collection of instance prototypes for each class that maximizes the Diverse Density function using Expectation- Maximization algorithm. A nonlinear mapping is then defined using the instance prototypes and maps every bag to a point in a new multi-class bag feature space. Finally a multi-class Support Vector Machine is trained in the multi-class bag feature space. In the retrieval stage, we retrieve images from the archival database who bear the same label with the query image, and who are the top K nearest neighbors of the query image in terms of similarity in the multi-class bag feature space. The classification approach achieves high classification accuracy, and the retrieval of clinically-relevant images not only facilitates utilization of the vast amount of hidden diagnostic knowledge in the database, but also improves the efficiency and accuracy of DR lesion diagnosis and assessment.

  3. Database in Artificial Intelligence.

    ERIC Educational Resources Information Center

    Wilkinson, Julia

    1986-01-01

    Describes a specialist bibliographic database of literature in the field of artificial intelligence created by the Turing Institute (Glasgow, Scotland) using the BRS/Search information retrieval software. The subscription method for end-users--i.e., annual fee entitles user to unlimited access to database, document provision, and printed awareness…

  4. Database resources of the National Center for Biotechnology Information: 2002 update

    PubMed Central

    Wheeler, David L.; Church, Deanna M.; Lash, Alex E.; Leipe, Detlef D.; Madden, Thomas L.; Pontius, Joan U.; Schuler, Gregory D.; Schriml, Lynn M.; Tatusova, Tatiana A.; Wagner, Lukas; Rapp, Barbara A.

    2002-01-01

    In addition to maintaining the GenBank nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides data analysis and retrieval resources that operate on the data in GenBank and a variety of other biological data made available through NCBI’s web site. NCBI data retrieval resources include Entrez, PubMed, LocusLink and the Taxonomy Browser. Data analysis resources include BLAST, Electronic PCR, OrfFinder, RefSeq, UniGene, HomoloGene, Database of Single Nucleotide Polymorphisms (dbSNP), Human Genome Sequencing, Human MapViewer, Human¡VMouse Homology Map, Cancer Chromosome Aberration Project (CCAP), Entrez Genomes, Clusters of Orthologous Groups (COGs) database, Retroviral Genotyping Tools, SAGEmap, Gene Expression Omnibus (GEO), Online Mendelian Inheritance in Man (OMIM), the Molecular Modeling Database (MMDB) and the Conserved Domain Database (CDD). Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of the resources can be accessed through the NCBI home page at http://www.ncbi.nlm.nih.gov. PMID:11752242

  5. Novel Algorithm for Classification of Medical Images

    NASA Astrophysics Data System (ADS)

    Bhushan, Bharat; Juneja, Monika

    2010-11-01

    Content-based image retrieval (CBIR) methods in medical image databases have been designed to support specific tasks, such as retrieval of medical images. These methods cannot be transferred to other medical applications since different imaging modalities require different types of processing. To enable content-based queries in diverse collections of medical images, the retrieval system must be familiar with the current Image class prior to the query processing. Further, almost all of them deal with the DICOM imaging format. In this paper a novel algorithm based on energy information obtained from wavelet transform for the classification of medical images according to their modalities is described. For this two types of wavelets have been used and have been shown that energy obtained in either case is quite distinct for each of the body part. This technique can be successfully applied to different image formats. The results are shown for JPEG imaging format.

  6. Estimating Missing Features to Improve Multimedia Information Retrieval

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bagherjeiran, A; Love, N S; Kamath, C

    Retrieval in a multimedia database usually involves combining information from different modalities of data, such as text and images. However, all modalities of the data may not be available to form the query. The retrieval results from such a partial query are often less than satisfactory. In this paper, we present an approach to complete a partial query by estimating the missing features in the query. Our experiments with a database of images and their associated captions show that, with an initial text-only query, our completion method has similar performance to a full query with both image and text features.more » In addition, when we use relevance feedback, our approach outperforms the results obtained using a full query.« less

  7. WAIS Searching of the Current Contents Database

    NASA Astrophysics Data System (ADS)

    Banholzer, P.; Grabenstein, M. E.

    The Homer E. Newell Memorial Library of NASA's Goddard Space Flight Center is developing capabilities to permit Goddard personnel to access electronic resources of the Library via the Internet. The Library's support services contractor, Maxima Corporation, and their subcontractor, SANAD Support Technologies have recently developed a World Wide Web Home Page (http://www-library.gsfc.nasa.gov) to provide the primary means of access. The first searchable database to be made available through the HomePage to Goddard employees is Current Contents, from the Institute for Scientific Information (ISI). The initial implementation includes coverage of articles from the last few months of 1992 to present. These records are augmented with abstracts and references, and often are more robust than equivalent records in bibliographic databases that currently serve the astronomical community. Maxima/SANAD selected Wais Incorporated's WAIS product with which to build the interface to Current Contents. This system allows access from Macintosh, IBM PC, and Unix hosts, which is an important feature for Goddard's multiplatform environment. The forms interface is structured to allow both fielded (author, article title, journal name, id number, keyword, subject term, and citation) and unfielded WAIS searches. The system allows a user to: Retrieve individual journal article records. Retrieve Table of Contents of specific issues of journals. Connect to articles with similar subject terms or keywords. Connect to other issues of the same journal in the same year. Browse journal issues from an alphabetical list of indexed journal names.

  8. A Unified and Coherent Land Surface Emissivity Earth System Data Record

    NASA Astrophysics Data System (ADS)

    Knuteson, R. O.; Borbas, E. E.; Hulley, G. C.; Hook, S. J.; Anderson, M. C.; Pinker, R. T.; Hain, C.; Guillevic, P. C.

    2014-12-01

    Land Surface Temperature and Emissivity (LST&E) data are essential for a wide variety of studies from calculating the evapo-transpiration of plant canopies to retrieving atmospheric water vapor. LST&E products are generated from data acquired by sensors in low Earth orbit (LEO) and by sensors in geostationary Earth orbit (GEO). Although these products represent the same measure, they are produced at different spatial, spectral and temporal resolutions using different algorithms. The different approaches used to retrieve the temperatures and emissivities result in discrepancies and inconsistencies between the different products. NASA has identified a major need to develop long-term, consistent, and calibrated data and products that are valid across multiple missions and satellite sensors. This poster will introduce the land surface emissivity product of the NASA MEASUREs project called A Unified and Coherent Land Surface Temperature and Emissivity (LST&E) Earth System Data Record (ESDR). To develop a unified high spectral resolution emissivity database, the MODIS baseline-fit emissivity database (MODBF) produced at the University of Wisconsin-Madison and the ASTER Global Emissivity Database (ASTER GED) produced at JPL will be merged. The unified Emissivity ESDR will be produced globally at 5km in mean monthly time-steps and for 12 bands from 3.6-14.3 micron and extended to 417 bands using a PC regression approach. The poster will introduce this data product. LST&E is a critical ESDR for a wide variety of studies in particular ecosystem and climate modeling.

  9. SeqHound: biological sequence and structure database as a platform for bioinformatics research

    PubMed Central

    2002-01-01

    Background SeqHound has been developed as an integrated biological sequence, taxonomy, annotation and 3-D structure database system. It provides a high-performance server platform for bioinformatics research in a locally-hosted environment. Results SeqHound is based on the National Center for Biotechnology Information data model and programming tools. It offers daily updated contents of all Entrez sequence databases in addition to 3-D structural data and information about sequence redundancies, sequence neighbours, taxonomy, complete genomes, functional annotation including Gene Ontology terms and literature links to PubMed. SeqHound is accessible via a web server through a Perl, C or C++ remote API or an optimized local API. It provides functionality necessary to retrieve specialized subsets of sequences, structures and structural domains. Sequences may be retrieved in FASTA, GenBank, ASN.1 and XML formats. Structures are available in ASN.1, XML and PDB formats. Emphasis has been placed on complete genomes, taxonomy, domain and functional annotation as well as 3-D structural functionality in the API, while fielded text indexing functionality remains under development. SeqHound also offers a streamlined WWW interface for simple web-user queries. Conclusions The system has proven useful in several published bioinformatics projects such as the BIND database and offers a cost-effective infrastructure for research. SeqHound will continue to develop and be provided as a service of the Blueprint Initiative at the Samuel Lunenfeld Research Institute. The source code and examples are available under the terms of the GNU public license at the Sourceforge site http://sourceforge.net/projects/slritools/ in the SLRI Toolkit. PMID:12401134

  10. Predictors for Perioperative Outcomes following Total Laryngectomy: A University HealthSystem Consortium Discharge Database Study.

    PubMed

    Rutledge, Jonathan W; Spencer, Horace; Moreno, Mauricio A

    2014-07-01

    The University HealthSystem Consortium (UHC) database collects discharge information on patients treated at academic health centers throughout the United States. We sought to use this database to identify outcome predictors for patients undergoing total laryngectomy. A secondary end point was to assess the validity of the UHC's predictive risk mortality model in this cohort of patients. Retrospective review. Academic medical centers (tertiary referral centers) and their affiliate hospitals in the United States. Using the UHC discharge database, we retrieved and analyzed data for 4648 patients undergoing total laryngectomy who were discharged between October 2007 and January 2011 from all of the member institutions. Demographics, comorbidities, institutional data, and outcomes were retrieved. The length of stay and overall costs were significantly higher among female patients (P < .0001), while age was a predictor of intensive care unit stay (P = .014). The overall complication rate was higher among Asians (P = .019) and in patients with anemia and diabetes compared with other comorbidities. The average institutional case load was 1.92 cases/mo; we found an inverse correlation (R = -0.47) between the institutional case load and length of stay (P < .0001). The UHC admit mortality risk estimator was found to be an accurate predictor not only of mortality (P < .0002) but also of intensive care unit admission and complication rate (P < .0001). This study provides an overview of laryngectomy outcomes in a contemporary cohort of patients treated at academic health centers. UHC admit mortality risk is an excellent outcome predictor and a valuable tool for risk stratification in these patients. © American Academy of Otolaryngology—Head and Neck Surgery Foundation 2014.

  11. Image Engine: an object-oriented multimedia database for storing, retrieving and sharing medical images and text.

    PubMed Central

    Lowe, H. J.

    1993-01-01

    This paper describes Image Engine, an object-oriented, microcomputer-based, multimedia database designed to facilitate the storage and retrieval of digitized biomedical still images, video, and text using inexpensive desktop computers. The current prototype runs on Apple Macintosh computers and allows network database access via peer to peer file sharing protocols. Image Engine supports both free text and controlled vocabulary indexing of multimedia objects. The latter is implemented using the TView thesaurus model developed by the author. The current prototype of Image Engine uses the National Library of Medicine's Medical Subject Headings (MeSH) vocabulary (with UMLS Meta-1 extensions) as its indexing thesaurus. PMID:8130596

  12. Information management and analysis system for groundwater data in Thailand

    NASA Astrophysics Data System (ADS)

    Gill, D.; Luckananurung, P.

    1992-01-01

    The Ground Water Division of the Thai Department of Mineral Resources maintains a large archive of groundwater data with information on some 50,000 water wells. Each well file contains information on well location, well completion, borehole geology, water levels, water quality, and pumping tests. In order to enable efficient use of this information a computer-based system for information management and analysis was created. The project was sponsored by the United Nations Development Program and the Thai Department of Mineral Resources. The system was designed to serve users who lack prior training in automated data processing. Access is through a friendly user/system dialogue. Tasks are segmented into a number of logical steps, each of which is managed by a separate screen. Selective retrieval is possible by four different methods of area definition and by compliance with user-specified constraints on any combination of database variables. The main types of outputs are: (1) files of retrieved data, screened according to users' specifications; (2) an assortment of pre-formatted reports; (3) computed geochemical parameters and various diagrams of water chemistry derived therefrom; (4) bivariate scatter diagrams and linear regression analysis; (5) posting of data and computed results on maps; and (6) hydraulic aquifer characteristics as computed from pumping tests. Data are entered directly from formatted screens. Most records can be copied directly from hand-written documents. The database-management program performs data integrity checks in real time, enabling corrections at the time of input. The system software can be grouped into: (1) database administration and maintenance—these functions are carried out by the SIR/DBMS software package; (2) user communication interface for task definition and execution control—the interface is written in the operating system command language (VMS/DCL) and in FORTRAN 77; and (3) scientific data-processing programs, written in FORTRAN 77. The system was implemented on a DEC MicroVAX II computer.

  13. Contingency Contractor Optimization Phase 3 Sustainment Database Design Document - Contingency Contractor Optimization Tool - Prototype

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Frazier, Christopher Rawls; Durfee, Justin David; Bandlow, Alisa

    The Contingency Contractor Optimization Tool – Prototype (CCOT-P) database is used to store input and output data for the linear program model described in [1]. The database allows queries to retrieve this data and updating and inserting new input data.

  14. WholeCellSimDB: a hybrid relational/HDF database for whole-cell model predictions

    PubMed Central

    Karr, Jonathan R.; Phillips, Nolan C.; Covert, Markus W.

    2014-01-01

    Mechanistic ‘whole-cell’ models are needed to develop a complete understanding of cell physiology. However, extracting biological insights from whole-cell models requires running and analyzing large numbers of simulations. We developed WholeCellSimDB, a database for organizing whole-cell simulations. WholeCellSimDB was designed to enable researchers to search simulation metadata to identify simulations for further analysis, and quickly slice and aggregate simulation results data. In addition, WholeCellSimDB enables users to share simulations with the broader research community. The database uses a hybrid relational/hierarchical data format architecture to efficiently store and retrieve both simulation setup metadata and results data. WholeCellSimDB provides a graphical Web-based interface to search, browse, plot and export simulations; a JavaScript Object Notation (JSON) Web service to retrieve data for Web-based visualizations; a command-line interface to deposit simulations; and a Python API to retrieve data for advanced analysis. Overall, we believe WholeCellSimDB will help researchers use whole-cell models to advance basic biological science and bioengineering. Database URL: http://www.wholecellsimdb.org Source code repository URL: http://github.com/CovertLab/WholeCellSimDB PMID:25231498

  15. Evaluation of relational and NoSQL database architectures to manage genomic annotations.

    PubMed

    Schulz, Wade L; Nelson, Brent G; Felker, Donn K; Durant, Thomas J S; Torres, Richard

    2016-12-01

    While the adoption of next generation sequencing has rapidly expanded, the informatics infrastructure used to manage the data generated by this technology has not kept pace. Historically, relational databases have provided much of the framework for data storage and retrieval. Newer technologies based on NoSQL architectures may provide significant advantages in storage and query efficiency, thereby reducing the cost of data management. But their relative advantage when applied to biomedical data sets, such as genetic data, has not been characterized. To this end, we compared the storage, indexing, and query efficiency of a common relational database (MySQL), a document-oriented NoSQL database (MongoDB), and a relational database with NoSQL support (PostgreSQL). When used to store genomic annotations from the dbSNP database, we found the NoSQL architectures to outperform traditional, relational models for speed of data storage, indexing, and query retrieval in nearly every operation. These findings strongly support the use of novel database technologies to improve the efficiency of data management within the biological sciences. Copyright © 2016 Elsevier Inc. All rights reserved.

  16. Active learning methods for interactive image retrieval.

    PubMed

    Gosselin, Philippe Henri; Cord, Matthieu

    2008-07-01

    Active learning methods have been considered with increased interest in the statistical learning community. Initially developed within a classification framework, a lot of extensions are now being proposed to handle multimedia applications. This paper provides algorithms within a statistical framework to extend active learning for online content-based image retrieval (CBIR). The classification framework is presented with experiments to compare several powerful classification techniques in this information retrieval context. Focusing on interactive methods, active learning strategy is then described. The limitations of this approach for CBIR are emphasized before presenting our new active selection process RETIN. First, as any active method is sensitive to the boundary estimation between classes, the RETIN strategy carries out a boundary correction to make the retrieval process more robust. Second, the criterion of generalization error to optimize the active learning selection is modified to better represent the CBIR objective of database ranking. Third, a batch processing of images is proposed. Our strategy leads to a fast and efficient active learning scheme to retrieve sets of online images (query concept). Experiments on large databases show that the RETIN method performs well in comparison to several other active strategies.

  17. Remote data entry and retrieval for law enforcement

    NASA Astrophysics Data System (ADS)

    Kwasowsky, Bohdan R.; Capraro, Gerard T.; Berdan, Gerald B.; Capraro, Christopher T.

    1997-02-01

    Law enforcement personnel need to capture and retrieve quality multimedia data in `real time' while in the field. This is not done today, for the most part. Most law enforcement officers gather data on handwritten forms and retrieve data via voice communications or fax. This approach is time consuming, costly, prone to errors, and may require months before some data are entered into a usable law enforcement database. With advances in the computing and communications industries, it is now possible to communicate with anyone using a laptop computer or personal digital assistant (PDA), given a phone line, an RF modem, or cellular capability. Many law enforcement officers have access to laptop computers within their vehicles and can stay in touch with their command center and/or retrieve data from local, state, or federal databases. However, this same capability is not available once they leave the vehicle or if the officer is on a beat, motorcycle, or horseback. This paper investigates the issues and reviews the state of the art for integrating a PDA into the gathering and retrieving of multimedia data for law enforcement.

  18. Accelerating Information Retrieval from Profile Hidden Markov Model Databases.

    PubMed

    Tamimi, Ahmad; Ashhab, Yaqoub; Tamimi, Hashem

    2016-01-01

    Profile Hidden Markov Model (Profile-HMM) is an efficient statistical approach to represent protein families. Currently, several databases maintain valuable protein sequence information as profile-HMMs. There is an increasing interest to improve the efficiency of searching Profile-HMM databases to detect sequence-profile or profile-profile homology. However, most efforts to enhance searching efficiency have been focusing on improving the alignment algorithms. Although the performance of these algorithms is fairly acceptable, the growing size of these databases, as well as the increasing demand for using batch query searching approach, are strong motivations that call for further enhancement of information retrieval from profile-HMM databases. This work presents a heuristic method to accelerate the current profile-HMM homology searching approaches. The method works by cluster-based remodeling of the database to reduce the search space, rather than focusing on the alignment algorithms. Using different clustering techniques, 4284 TIGRFAMs profiles were clustered based on their similarities. A representative for each cluster was assigned. To enhance sensitivity, we proposed an extended step that allows overlapping among clusters. A validation benchmark of 6000 randomly selected protein sequences was used to query the clustered profiles. To evaluate the efficiency of our approach, speed and recall values were measured and compared with the sequential search approach. Using hierarchical, k-means, and connected component clustering techniques followed by the extended overlapping step, we obtained an average reduction in time of 41%, and an average recall of 96%. Our results demonstrate that representation of profile-HMMs using a clustering-based approach can significantly accelerate data retrieval from profile-HMM databases.

  19. Glance Information System for ATLAS Management

    NASA Astrophysics Data System (ADS)

    Grael, F. F.; Maidantchik, C.; Évora, L. H. R. A.; Karam, K.; Moraes, L. O. F.; Cirilli, M.; Nessi, M.; Pommès, K.; ATLAS Collaboration

    2011-12-01

    ATLAS Experiment is an international collaboration where more than 37 countries, 172 institutes and laboratories, 2900 physicists, engineers, and computer scientists plus 700 students participate. The management of this teamwork involves several aspects such as institute contribution, employment records, members' appointment, authors' list, preparation and publication of papers and speakers nomination. Previously, most of the information was accessible by a limited group and developers had to face problems such as different terminology, diverse data modeling, heterogeneous databases and unlike users needs. Moreover, the systems were not designed to handle new requirements. The maintenance has to be an easy task due to the long lifetime experiment and professionals turnover. The Glance system, a generic mechanism for accessing any database, acts as an intermediate layer isolating the user from the particularities of each database. It retrieves, inserts and updates the database independently of its technology and modeling. Relying on Glance, a group of systems were built to support the ATLAS management and operation aspects: ATLAS Membership, ATLAS Appointments, ATLAS Speakers, ATLAS Analysis Follow-Up, ATLAS Conference Notes, ATLAS Thesis, ATLAS Traceability and DSS Alarms Viewer. This paper presents the overview of the Glance information framework and describes the privilege mechanism developed to grant different level of access for each member and system.

  20. Performances of 27 MEDLINE systems tested by searches with clinical questions.

    PubMed Central

    Haynes, R B; Walker, C J; McKibbon, K A; Johnston, M E; Willan, A R

    1994-01-01

    OBJECTIVE: To compare the performances of online and compact-disc (CD-ROM) versions of the National Library of Medicine's (NLM) MEDLINE database. DESIGN: Analytic survey. INTERVENTION: Clinical questions were drawn from 18 searches originally conducted spontaneously by clinicians from wards and clinics who had used Grateful Med Version 4.0. Clinicians' search strategies were translated to meet the specific requirements of 13 online and 14 CD-ROM MEDLINE systems. A senior librarian and vendors' representatives constructed independent searches from the clinicians' questions. The librarian and clinician searches were run through each system, in command mode for the librarian and menu mode for clinicians, when available. Vendor searches were run through the vendors' own systems only. MAIN MEASUREMENTS: Numbers of relevant and irrelevant citations retrieved, cost (for online systems only), and time. RESULTS: Systems varied substantially for all searches, and for librarian and clinician searches separately, with respect to the numbers of relevant and irrelevant citations retrieved (p < 0.001 for both) and the cost per relevant citation (p = 0.012), but not with respect to the time per search. Based on combined rankings for the highest number of relevant and the lowest number of irrelevant citations retrieved, the SilverPlatter CD-ROM MEDLINE clinical journal subset performed best for librarian searches, while the PaperChase online system worked best for clinician searches. For cost per relevant citation retrieved, Dialog's Knowledge Index performed best for both librarian and clinician searches. CONCLUSIONS: There were substantial differences in the performances of competing MEDLINE systems, and performance was affected by search strategy, which was conceived by a librarian or by clinicians. PMID:7719810

Top