Sample records for XML information retrieval

  1. An Expressive and Efficient Language for XML Information Retrieval.

    ERIC Educational Resources Information Center

    Chinenyanga, Taurai Tapiwa; Kushmerick, Nicholas

    2002-01-01

    Discusses XML and information retrieval and describes a query language, ELIXIR (expressive and efficient language for XML information retrieval), with a textual similarity operator that can be used for similarity joins. Explains the algorithm for answering ELIXIR queries to generate intermediate relational data. (Author/LRW)

  2. Information Retrieval System for Japanese Standard Disease-Code Master Using XML Web Service

    PubMed Central

    Hatano, Kenji; Ohe, Kazuhiko

    2003-01-01

    An information retrieval system for the Japanese Standard Disease-Code Master using XML Web Services has been developed. XML Web Services are a new distributed processing approach built on standard Internet technologies. Through the seamless remote method invocation that XML Web Services provide, users can obtain the latest disease-code master information from rich desktop applications or Internet web sites that refer to this service. PMID:14728364

  3. Information persistence using XML database technology

    NASA Astrophysics Data System (ADS)

    Clark, Thomas A.; Lipa, Brian E. G.; Macera, Anthony R.; Staskevich, Gennady R.

    2005-05-01

    The Joint Battlespace Infosphere (JBI) Information Management (IM) services provide information exchange and persistence capabilities that support tailored, dynamic, and timely access to required information, enabling near real-time planning, control, and execution for DoD decision making. JBI IM services will be built on a substrate of network centric core enterprise services and when transitioned, will establish an interoperable information space that aggregates, integrates, fuses, and intelligently disseminates relevant information to support effective warfighter business processes. This virtual information space provides individual users with information tailored to their specific functional responsibilities and provides a highly tailored repository of, or access to, information that is designed to support a specific Community of Interest (COI), geographic area or mission. Critical to effective operation of JBI IM services is the implementation of repositories, where data, represented as information, is persisted for quick and easy retrieval. This paper will address information representation, persistence and retrieval using existing database technologies to manage structured data in Extensible Markup Language (XML) format as well as unstructured data in an IM services-oriented environment. Three basic categories of database technologies will be compared and contrasted: Relational, XML-Enabled, and Native XML. These technologies have diverse properties such as maturity, performance, query language specifications, indexing, and retrieval methods. We will describe our application of these evolving technologies within the context of a JBI Reference Implementation (RI) by providing some hopefully insightful anecdotes and lessons learned along the way. This paper will also outline future directions, promising technologies and emerging COTS products that can offer more powerful information management representations, better persistence mechanisms and

  4. An XML-based system for the flexible classification and retrieval of clinical practice guidelines.

    PubMed Central

    Ganslandt, T.; Mueller, M. L.; Krieglstein, C. F.; Senninger, N.; Prokosch, H. U.

    2002-01-01

    Beneficial effects of clinical practice guidelines (CPGs) have not yet reached expectations due to limited routine adoption. Electronic distribution and reminder systems have the potential to overcome implementation barriers. Existing electronic CPG repositories like the National Guideline Clearinghouse (NGC) provide individual access but lack standardized computer-readable interfaces necessary for automated guideline retrieval. The aim of this paper was to facilitate automated context-based selection and presentation of CPGs. Using attributes from the NGC classification scheme, an XML-based metadata repository was successfully implemented, providing document storage, classification and retrieval functionality. Semi-automated extraction of attributes was implemented for the import of XML guideline documents using XPath. A hospital information system interface was exemplarily implemented for diagnosis-based guideline invocation. Limitations of the implemented system are discussed and possible future work is outlined. Integration of standardized computer-readable search interfaces into existing CPG repositories is proposed. PMID:12463831
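
    As a rough illustration of the XPath-based attribute extraction described above, the sketch below pulls classification fields out of a small, hand-written guideline document; the element names and the sample document are invented for illustration and are not the NGC classification scheme.

    ```python
    # Minimal sketch: extract classification attributes from a hypothetical
    # XML guideline document with XPath-style queries (ElementTree subset).
    import xml.etree.ElementTree as ET

    GUIDELINE_XML = """
    <guideline id="cpg-001">
      <title>Management of Acute Cholecystitis</title>
      <classification>
        <disease code="K81.0">Acute cholecystitis</disease>
        <specialty>Surgery</specialty>
        <intended-user>Physicians</intended-user>
      </classification>
    </guideline>
    """

    def extract_metadata(xml_text):
        """Return classification attributes for indexing in a metadata repository."""
        root = ET.fromstring(xml_text)
        return {
            "id": root.get("id"),
            "title": root.findtext("title"),
            "disease_codes": [d.get("code") for d in root.findall("classification/disease")],
            "specialties": [s.text for s in root.findall("classification/specialty")],
        }

    if __name__ == "__main__":
        # A diagnosis-based HIS interface could invoke guidelines by disease code.
        print(extract_metadata(GUIDELINE_XML))
    ```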

  5. ScotlandsPlaces XML: Bespoke XML or XML Mapping?

    ERIC Educational Resources Information Center

    Beamer, Ashley; Gillick, Mark

    2010-01-01

    Purpose: The purpose of this paper is to investigate web services (in the form of parameterised URLs), specifically in the context of the ScotlandsPlaces project. This involves cross-domain querying, data retrieval and display via the development of a bespoke XML standard rather than existing XML formats and mapping between them.…

  6. A Survey in Indexing and Searching XML Documents.

    ERIC Educational Resources Information Center

    Luk, Robert W. P.; Leong, H. V.; Dillon, Tharam S.; Chan, Alvin T. S.; Croft, W. Bruce; Allan, James

    2002-01-01

    Discussion of XML focuses on indexing techniques for XML documents, grouping them into flat-file, semistructured, and structured indexing paradigms. Highlights include searching techniques, including full text search and multistage search; search result presentations; database and information retrieval system integration; XML query languages; and…

  7. A Novel Navigation Paradigm for XML Repositories.

    ERIC Educational Resources Information Center

    Azagury, Alain; Factor, Michael E.; Maarek, Yoelle S.; Mandler, Benny

    2002-01-01

    Discusses data exchange over the Internet and describes the architecture and implementation of an XML document repository that promotes a navigation paradigm for XML documents based on content and context. Topics include information retrieval and semistructured documents; and file systems as information storage infrastructure, particularly XMLFS.…

  8. An exponentiation method for XML element retrieval.

    PubMed

    Wichaiwong, Tanakorn

    2014-01-01

    XML documents are now widely used for modelling and storing structured documents. The structure is very rich and carries important information about contents and their relationships, for example in e-Commerce. XML data-centric collections require query terms that allow users to specify constraints on the document structure; mapping structural queries and assigning weights are significant for identifying the set of possibly relevant documents with respect to structural conditions. In this paper, we present an extension to the MEXIR search system that supports the combination of structural and content queries in the form of content-and-structure queries, which we call the Exponentiation function. The structural information has been shown to improve the effectiveness of the search system by up to 52.60% over the BM25 baseline in terms of MAP.
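
    The abstract does not give the exponentiation function itself, so the following is only a schematic guess at how an exponential structural weight might be combined with a per-element BM25 score; every name and constant is illustrative and is not the MEXIR implementation.

    ```python
    # Schematic sketch: scale a content score (e.g. BM25) by an exponential
    # structural weight so that elements matching the structural constraint
    # more closely are boosted. The functional form is illustrative only.

    def structural_weight(match_depth, base=2.0):
        """Hypothetical exponentiation weight: closer/deeper structural
        matches receive exponentially larger weights."""
        return base ** match_depth

    def element_score(bm25_score, match_depth):
        """Combine a content score with the structural weight."""
        return bm25_score * structural_weight(match_depth)

    if __name__ == "__main__":
        # element matched the requested path at depth 3, content score 4.2
        print(element_score(4.2, 3))   # 4.2 * 2**3 = 33.6
    ```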

  9. Querying and Ranking XML Documents.

    ERIC Educational Resources Information Center

    Schlieder, Torsten; Meuss, Holger

    2002-01-01

    Discussion of XML, information retrieval, precision, and recall focuses on a retrieval technique that adopts the similarity measure of the vector space model, incorporates the document structure, and supports structured queries. Topics include a query model based on tree matching; structured queries and term-based ranking; and term frequency and…

  10. An Exponentiation Method for XML Element Retrieval

    PubMed Central

    2014-01-01

    XML documents are now widely used for modelling and storing structured documents. The structure is very rich and carries important information about contents and their relationships, for example in e-Commerce. XML data-centric collections require query terms that allow users to specify constraints on the document structure; mapping structural queries and assigning weights are significant for identifying the set of possibly relevant documents with respect to structural conditions. In this paper, we present an extension to the MEXIR search system that supports the combination of structural and content queries in the form of content-and-structure queries, which we call the Exponentiation function. The structural information has been shown to improve the effectiveness of the search system by up to 52.60% over the BM25 baseline in terms of MAP. PMID:24696643

  11. XML in Libraries.

    ERIC Educational Resources Information Center

    Tennant, Roy, Ed.

    This book presents examples of how libraries are using XML (eXtensible Markup Language) to solve problems, expand services, and improve systems. Part I contains papers on using XML in library catalog records: "Updating MARC Records with XMLMARC" (Kevin S. Clarke, Stanford University) and "Searching and Retrieving XML Records via the…

  12. An XML-based Generic Tool for Information Retrieval in Solar Databases

    NASA Astrophysics Data System (ADS)

    Scholl, Isabelle F.; Legay, Eric; Linsolas, Romain

    This paper presents the current architecture of the `Solar Web Project' now in its development phase. This tool will provide scientists interested in solar data with a single web-based interface for browsing distributed and heterogeneous catalogs of solar observations. The main goal is to have a generic application that can be easily extended to new sets of data or to new missions with a low level of maintenance. It is developed with Java and XML is used as a powerful configuration language. The server, independent of any database scheme, can communicate with a client (the user interface) and several local or remote archive access systems (such as existing web pages, ftp sites or SQL databases). Archive access systems are externally described in XML files. The user interface is also dynamically generated from an XML file containing the window building rules and a simplified database description. This project is developed at MEDOC (Multi-Experiment Data and Operations Centre), located at the Institut d'Astrophysique Spatiale (Orsay, France). Successful tests have been conducted with other solar archive access systems.
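
    A flavour of the configuration-driven design described above can be sketched as follows; the tag names, archive types, and the idea of building client objects directly from the XML description are assumptions made for illustration, not the project's actual configuration format.

    ```python
    # Sketch: describe archive access systems in XML and build clients from it.
    import xml.etree.ElementTree as ET

    CONFIG = """
    <archives>
      <archive name="SOHO-EIT" type="http">
        <endpoint>https://example.org/eit/search</endpoint>
      </archive>
      <archive name="MEDOC-SQL" type="sql">
        <endpoint>postgresql://example.org/medoc</endpoint>
      </archive>
    </archives>
    """

    class HttpArchive:
        def __init__(self, name, endpoint): self.name, self.endpoint = name, endpoint
        def query(self, params): return f"GET {self.endpoint}?{params}"

    class SqlArchive:
        def __init__(self, name, endpoint): self.name, self.endpoint = name, endpoint
        def query(self, params): return f"SELECT ... FROM observations WHERE {params}"

    FACTORIES = {"http": HttpArchive, "sql": SqlArchive}

    def load_archives(xml_text):
        """Instantiate one client object per <archive> element in the config."""
        root = ET.fromstring(xml_text)
        return [FACTORIES[a.get("type")](a.get("name"), a.findtext("endpoint"))
                for a in root.findall("archive")]

    if __name__ == "__main__":
        for archive in load_archives(CONFIG):
            print(archive.name, "->", archive.query("date=1999-05-01"))
    ```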

  13. Information Retrieval and Graph Analysis Approaches for Book Recommendation.

    PubMed

    Benkoussas, Chahinez; Bellot, Patrice

    2015-01-01

    A combination of multiple information retrieval approaches is proposed for the purpose of book recommendation. In this paper, book recommendation is based on complex user queries. We used different theoretical retrieval models: a probabilistic model, InL2 (Divergence from Randomness), and a language model, and tested their interpolated combination. Graph analysis algorithms such as PageRank have been successful in Web environments. We consider the application of this algorithm in a new retrieval approach to a related-document network comprised of social links. We call Directed Graph of Documents (DGD) a network constructed from documents and the social information provided by each of them. Specifically, this work tackles the problem of book recommendation in the context of the INEX (Initiative for the Evaluation of XML retrieval) Social Book Search track. A series of reranking experiments demonstrates that combining retrieval models yields significant improvements in terms of standard ranked retrieval metrics. These results extend the applicability of link analysis algorithms to different environments.

  14. Information Retrieval and Graph Analysis Approaches for Book Recommendation

    PubMed Central

    Benkoussas, Chahinez; Bellot, Patrice

    2015-01-01

    A combination of multiple information retrieval approaches is proposed for the purpose of book recommendation. In this paper, book recommendation is based on complex user queries. We used different theoretical retrieval models: a probabilistic model, InL2 (Divergence from Randomness), and a language model, and tested their interpolated combination. Graph analysis algorithms such as PageRank have been successful in Web environments. We consider the application of this algorithm in a new retrieval approach to a related-document network comprised of social links. We call Directed Graph of Documents (DGD) a network constructed from documents and the social information provided by each of them. Specifically, this work tackles the problem of book recommendation in the context of the INEX (Initiative for the Evaluation of XML retrieval) Social Book Search track. A series of reranking experiments demonstrates that combining retrieval models yields significant improvements in terms of standard ranked retrieval metrics. These results extend the applicability of link analysis algorithms to different environments. PMID:26504899

  15. An XML Data Model for Inverted Image Indexing

    NASA Astrophysics Data System (ADS)

    So, Simon W.; Leung, Clement H. C.; Tse, Philip K. C.

    2003-01-01

    The Internet world makes increasing use of XML-based technologies. In multimedia data indexing and retrieval, the MPEG-7 standard for Multimedia Description Scheme is specified using XML. The flexibility of XML allows users to define other markup semantics for special contexts, construct data-centric XML documents, exchange standardized data between computer systems, and present data in different applications. In this paper, the Inverted Image Indexing paradigm is presented and modeled using XML Schema.

  16. phyloXML: XML for evolutionary biology and comparative genomics

    PubMed Central

    Han, Mira V; Zmasek, Christian M

    2009-01-01

    Background Evolutionary trees are central to a wide range of biological studies. In many of these studies, tree nodes and branches need to be associated (or annotated) with various attributes. For example, in studies concerned with organismal relationships, tree nodes are associated with taxonomic names, whereas tree branches have lengths and oftentimes support values. Gene trees used in comparative genomics or phylogenomics are usually annotated with taxonomic information, genome-related data, such as gene names and functional annotations, as well as events such as gene duplications, speciations, or exon shufflings, combined with information related to the evolutionary tree itself. The data standards currently used for evolutionary trees have limited capacities to incorporate such annotations of different data types. Results We developed an XML language, named phyloXML, for describing evolutionary trees, as well as various associated data items. PhyloXML provides elements for commonly used items, such as branch lengths, support values, taxonomic names, and gene names and identifiers. By using "property" elements, phyloXML can be adapted to novel and unforeseen use cases. We also developed various software tools for reading, writing, conversion, and visualization of phyloXML formatted data. Conclusion PhyloXML is an XML language defined by a complete schema in XSD that allows storing and exchanging the structures of evolutionary trees as well as associated data. More information about phyloXML itself, the XSD schema, as well as tools implementing and supporting phyloXML, is available at . PMID:19860910
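
    As a minimal illustration of reading phyloXML-style data, the sketch below hand-writes a tiny document and walks its clades with the standard library; the element names follow the phyloXML format as recalled here, so the XSD at www.phyloxml.org remains the authoritative reference.

    ```python
    # Sketch: parse a small phyloXML-style document and print leaf clade names
    # with their branch lengths. The sample document is hand-written; consult
    # the official XSD at www.phyloxml.org for the authoritative schema.
    import xml.etree.ElementTree as ET

    PHYLO = """
    <phyloxml xmlns="http://www.phyloxml.org">
      <phylogeny rooted="true">
        <clade>
          <clade><name>A</name><branch_length>0.10</branch_length></clade>
          <clade><name>B</name><branch_length>0.15</branch_length></clade>
        </clade>
      </phylogeny>
    </phyloxml>
    """

    NS = {"p": "http://www.phyloxml.org"}

    def leaves(xml_text):
        """Yield (name, branch_length) for every named clade."""
        root = ET.fromstring(xml_text)
        for clade in root.iter("{http://www.phyloxml.org}clade"):
            name = clade.findtext("p:name", namespaces=NS)
            length = clade.findtext("p:branch_length", namespaces=NS)
            if name is not None:
                yield name, float(length)

    if __name__ == "__main__":
        for name, length in leaves(PHYLO):
            print(name, length)
    ```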

  17. Representing nested semantic information in a linear string of text using XML.

    PubMed

    Krauthammer, Michael; Johnson, Stephen B; Hripcsak, George; Campbell, David A; Friedman, Carol

    2002-01-01

    XML has been widely adopted as an important data interchange language. The structure of XML enables sharing of data elements with variable degrees of nesting as long as the elements are grouped in a strict tree-like fashion. This requirement potentially restricts the usefulness of XML for marking up written text, which often includes features that do not properly nest within other features. We encountered this problem while marking up medical text with structured semantic information from a Natural Language Processor. Traditional approaches to this problem separate the structured information from the actual text mark up. This paper introduces an alternative solution, which tightly integrates the semantic structure with the text. The resulting XML markup preserves the linearity of the medical texts and can therefore be easily expanded with additional types of information.
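
    The nesting problem the authors describe is easy to reproduce: two annotation spans that overlap cannot both be wrapped in well-formed elements. The sketch below shows the conflict and one naive workaround using empty milestone tags; it only illustrates the issue and is not the markup scheme proposed in the paper.

    ```python
    # Overlap problem for the text "mild chest pain radiating to arm":
    # a <finding> span ("mild chest pain") and a <symptom> span ("chest pain
    # radiating to arm") overlap, so they cannot be nested:
    #   <finding>mild <symptom>chest pain</finding> radiating to arm</symptom>  (ill-formed)
    #
    # Naive workaround: empty "milestone" tags that mark span boundaries
    # without requiring nesting (not the scheme proposed in the paper).
    import xml.etree.ElementTree as ET

    milestone_markup = (
        '<doc>'
        '<start ann="finding"/>mild <start ann="symptom"/>chest pain'
        '<end ann="finding"/> radiating to arm<end ann="symptom"/>'
        '</doc>'
    )
    ET.fromstring(milestone_markup)          # parses: the document is well-formed
    print("well-formed:", milestone_markup)
    ```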

  18. Representing nested semantic information in a linear string of text using XML.

    PubMed Central

    Krauthammer, Michael; Johnson, Stephen B.; Hripcsak, George; Campbell, David A.; Friedman, Carol

    2002-01-01

    XML has been widely adopted as an important data interchange language. The structure of XML enables sharing of data elements with variable degrees of nesting as long as the elements are grouped in a strict tree-like fashion. This requirement potentially restricts the usefulness of XML for marking up written text, which often includes features that do not properly nest within other features. We encountered this problem while marking up medical text with structured semantic information from a Natural Language Processor. Traditional approaches to this problem separate the structured information from the actual text mark up. This paper introduces an alternative solution, which tightly integrates the semantic structure with the text. The resulting XML markup preserves the linearity of the medical texts and can therefore be easily expanded with additional types of information. PMID:12463856

  19. XML under the Hood.

    ERIC Educational Resources Information Center

    Scharf, David

    2002-01-01

    Discusses XML (extensible markup language), particularly as it relates to libraries. Topics include organizing information; cataloging; metadata; similarities to HTML; organizations dealing with XML; making XML useful; a history of XML; the semantic Web; related technologies; XML at the Library of Congress; and its role in improving the…

  20. Utilizing the Structure and Content Information for XML Document Clustering

    NASA Astrophysics Data System (ADS)

    Tran, Tien; Kutty, Sangeetha; Nayak, Richi

    This paper reports on the experiments and results of a clustering approach used in the INEX 2008 document mining challenge. The clustering approach utilizes both the structure and content information of the Wikipedia XML document collection. A latent semantic kernel (LSK) is used to measure the semantic similarity between XML documents based on their content features. The construction of a latent semantic kernel involves the computation of a singular value decomposition (SVD). On a large feature space matrix, the computation of SVD is very expensive in terms of time and memory requirements. Thus, in this clustering approach, the dimension of the document space of a term-document matrix is reduced before performing SVD. The document space reduction is based on the common structural information of the Wikipedia XML document collection. The proposed clustering approach has been shown to be effective on the Wikipedia collection in the INEX 2008 document mining challenge.
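
    The kernel construction can be sketched in a few lines of linear algebra; the toy matrix and the number of latent dimensions are illustrative, and the paper's structure-based reduction of the document space is not reproduced here.

    ```python
    # Sketch: latent semantic kernel from a truncated SVD of a term-document
    # matrix, then cosine similarity between documents in the latent space.
    import numpy as np

    # toy term-document matrix (terms x documents)
    A = np.array([[2.0, 0.0, 1.0],
                  [1.0, 1.0, 0.0],
                  [0.0, 3.0, 1.0],
                  [0.0, 1.0, 2.0]])

    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    k = 2                                  # latent dimensions kept
    P = U[:, :k]                           # basis of the latent term space

    def kernel(term_doc, proj):
        """Pairwise cosine similarities between documents in latent space."""
        D = proj.T @ term_doc              # documents projected (k x docs)
        D = D / np.linalg.norm(D, axis=0)  # normalize each document column
        return D.T @ D

    if __name__ == "__main__":
        print(np.round(kernel(A, P), 3))
    ```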

  1. XML Style Guide

    DTIC Science & Technology

    2015-07-01

    Acronyms: ASCII (American Standard Code for Information Interchange), DAU (data acquisition unit), DDML (data display markup language), IHAL ... Transfer Standard, URI (uniform resource identifier), W3C (World Wide Web Consortium), XML (extensible markup language), XSD (XML schema definition). ... XML Style Guide, RCC 125-15, July 2015. Introduction: The next generation of telemetry systems will rely heavily on extensible markup language (XML

  2. Definition of an XML markup language for clinical laboratory procedures and comparison with generic XML markup.

    PubMed

    Saadawi, Gilan M; Harrison, James H

    2006-10-01

    Clinical laboratory procedure manuals are typically maintained as word processor files and are inefficient to store and search, require substantial effort for review and updating, and integrate poorly with other laboratory information. Electronic document management systems could improve procedure management and utility. As a first step toward building such systems, we have developed a prototype electronic format for laboratory procedures using Extensible Markup Language (XML). Representative laboratory procedures were analyzed to identify document structure and data elements. This information was used to create a markup vocabulary, CLP-ML, expressed as an XML Document Type Definition (DTD). To determine whether this markup provided advantages over generic markup, we compared procedures structured with CLP-ML or with the vocabulary of the Health Level Seven, Inc. (HL7) Clinical Document Architecture (CDA) narrative block. CLP-ML includes 124 XML tags and supports a variety of procedure types across different laboratory sections. When compared with a general-purpose markup vocabulary (CDA narrative block), CLP-ML documents were easier to edit and read, less complex structurally, and simpler to traverse for searching and retrieval. In combination with appropriate software, CLP-ML is designed to support electronic authoring, reviewing, distributing, and searching of clinical laboratory procedures from a central repository, decreasing procedure maintenance effort and increasing the utility of procedure information. A standard electronic procedure format could also allow laboratories and vendors to share procedures and procedure layouts, minimizing duplicative word processor editing. Our results suggest that laboratory-specific markup such as CLP-ML will provide greater benefit for such systems than generic markup.
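
    The retrieval advantage of procedure-specific markup over a generic narrative block can be illustrated with a small sketch; the element names below are invented and are not the actual CLP-ML vocabulary.

    ```python
    # Sketch: with section-specific tags, a targeted search ("find reagents
    # containing 'EDTA'") is a single structural query instead of a free-text scan.
    import xml.etree.ElementTree as ET

    PROCEDURE = """
    <procedure name="CBC">
      <specimen>Whole blood, EDTA tube</specimen>
      <reagents>
        <reagent>Diluent</reagent>
        <reagent>EDTA anticoagulant</reagent>
      </reagents>
      <steps>
        <step order="1">Mix specimen by gentle inversion.</step>
        <step order="2">Load onto analyzer.</step>
      </steps>
    </procedure>
    """

    root = ET.fromstring(PROCEDURE)
    hits = [r.text for r in root.findall(".//reagent") if "EDTA" in r.text]
    print(hits)   # ['EDTA anticoagulant'] - the specimen text is not falsely matched
    ```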

  3. An Efficient G-XML Data Management Method using XML Spatial Index for Mobile Devices

    NASA Astrophysics Data System (ADS)

    Tamada, Takashi; Momma, Kei; Seo, Kazuo; Hijikata, Yoshinori; Nishida, Shogo

    This paper presents an efficient G-XML data management method for mobile devices. G-XML is an XML-based encoding for the transport of geographic information. The performance of mobile devices such as PDAs and mobile phones trails that of desktop machines, so special techniques are needed for processing G-XML data on them. In this method, an XML-format spatial index file is used to improve the initial display time of G-XML data. This index file contains an XML pointer to each feature in the G-XML data and classifies these features by multi-dimensional data structures. Experimental results show that this method speeds up the initial display time of G-XML data on mobile devices by a factor of about 3-7.

  4. Indexing Temporal XML Using FIX

    NASA Astrophysics Data System (ADS)

    Zheng, Tiankun; Wang, Xinjun; Zhou, Yingchun

    XML has become an important standard for the description and exchange of information. It is of practical significance to introduce temporal information on this basis, because time has penetrated all walks of life as an important property of information. Such a database can track document history and recover information to its state at any earlier time, and is called a temporal XML database. We propose a new feature vector on the basis of FIX, a feature-based XML index, and build an index on a temporal XML database using a B+ tree, denoted TFIX. We also put forward a new query algorithm upon it for temporal queries. Our experiments prove that this index has better performance than other kinds of XML indexes. The index can satisfy all TXPath queries with depth up to K(>0).

  5. Semantically Interoperable XML Data

    PubMed Central

    Vergara-Niedermayr, Cristobal; Wang, Fusheng; Pan, Tony; Kurc, Tahsin; Saltz, Joel

    2013-01-01

    XML is ubiquitously used as an information exchange platform for web-based applications in healthcare, life sciences, and many other domains. Proliferating XML data are now managed through the latest native XML database technologies. XML data sources conforming to common XML schemas could be shared and integrated with syntactic interoperability. Semantic interoperability can be achieved through semantic annotations of data models using common data elements linked to concepts from ontologies. In this paper, we present a framework and software system to support the development of semantically interoperable XML-based data sources that can be shared through a Grid infrastructure. We also present our work on supporting semantically validated XML data through semantic annotations for XML Schema, semantic validation and semantic authoring of XML data. We demonstrate the use of the system for a biomedical database of medical image annotations and markups. PMID:25298789

  6. The SGML standardization framework and the introduction of XML.

    PubMed

    Fierz, W; Grütter, R

    2000-01-01

    Extensible Markup Language (XML) is on its way to becoming a global standard for the representation, exchange, and presentation of information on the World Wide Web (WWW). More than that, XML is creating a standardization framework, in terms of an open network of meta-standards and mediators that allows for the definition of further conventions and agreements in specific business domains. Such an approach is particularly needed in the healthcare domain; XML promises to especially suit the particularities of patient records and their lifelong storage, retrieval, and exchange. At a time when change rather than steadiness is becoming the faithful feature of our society, standardization frameworks which support a diversified growth of specifications that are appropriate to the actual needs of the users are becoming more and more important; and efforts should be made to encourage this new attempt at standardization to grow in a fruitful direction. Thus, the introduction of XML reflects a standardization process which is neither exclusively based on an acknowledged standardization authority, nor a pure market standard. Instead, a consortium of companies, academic institutions, and public bodies has agreed on a common recommendation based on an existing standardization framework. The consortium's process of agreeing to a standardization framework will doubtlessly be successful in the case of XML, and it is suggested that it should be considered as a generic model for standardization processes in the future.

  7. An Electronic Finding Aid Using Extensible Markup Language (XML) and Encoded Archival Description (EAD).

    ERIC Educational Resources Information Center

    Chang, May

    2000-01-01

    Describes the development of electronic finding aids for archives at the University of Illinois, Urbana-Champaign that used XML (extensible markup language) and EAD (encoded archival description) to enable more flexible information management and retrieval than using MARC or a relational database management system. EAD template is appended.…

  8. XML at the ADC: Steps to a Next Generation Data Archive

    NASA Astrophysics Data System (ADS)

    Shaya, E.; Blackwell, J.; Gass, J.; Oliversen, N.; Schneider, G.; Thomas, B.; Cheung, C.; White, R. A.

    1999-05-01

    The eXtensible Markup Language (XML) is a document markup language that allows users to specify their own tags, to create hierarchical structures to qualify their data, and to support automatic checking of documents for structural validity. It is being intensively supported by nearly every major corporate software developer. Under the funds of a NASA AISRP proposal, the Astronomical Data Center (ADC, http://adc.gsfc.nasa.gov) is developing an infrastructure for importation, enhancement, and distribution of data and metadata using XML as the document markup language. We discuss the preliminary Document Type Definition (DTD, at http://adc.gsfc.nasa.gov/xml) which specifies the elements and their attributes in our metadata documents. This attempts to define both the metadata of an astronomical catalog and the `header' information of an astronomical table. In addition, we give an overview of the planned flow of data through automated pipelines from authors and journal presses into our XML archive and retrieval through the web via the XML-QL Query Language and eXtensible Style Language (XSL) scripts. When completed, the catalogs and journal tables at the ADC will be tightly hyperlinked to enhance data discovery. In addition one will be able to search on fragmentary information. For instance, one could query for a table by entering that the second author is so-and-so or that the third author is at such-and-such institution.

  9. The CMIP5 Model Documentation Questionnaire: Development of a Metadata Retrieval System for the METAFOR Common Information Model

    NASA Astrophysics Data System (ADS)

    Pascoe, Charlotte; Lawrence, Bryan; Moine, Marie-Pierre; Ford, Rupert; Devine, Gerry

    2010-05-01

    The EU METAFOR Project (http://metaforclimate.eu) has created a web-based model documentation questionnaire to collect metadata from the modelling groups that are running simulations in support of the Coupled Model Intercomparison Project - 5 (CMIP5). The CMIP5 model documentation questionnaire will retrieve information about the details of the models used, how the simulations were carried out, how the simulations conformed to the CMIP5 experiment requirements and details of the hardware used to perform the simulations. The metadata collected by the CMIP5 questionnaire will allow CMIP5 data to be compared in a scientifically meaningful way. This paper describes the life-cycle of the CMIP5 questionnaire development which starts with relatively unstructured input from domain specialists and ends with formal XML documents that comply with the METAFOR Common Information Model (CIM). Each development step is associated with a specific tool. (1) Mind maps are used to capture information requirements from domain experts and build a controlled vocabulary, (2) a python parser processes the XML files generated by the mind maps, (3) Django (python) is used to generate the dynamic structure and content of the web based questionnaire from processed xml and the METAFOR CIM, (4) Python parsers ensure that information entered into the CMIP5 questionnaire is output as CIM compliant xml, (5) CIM compliant output allows automatic information capture tools to harvest questionnaire content into databases such as the Earth System Grid (ESG) metadata catalogue. This paper will focus on how Django (python) and XML input files are used to generate the structure and content of the CMIP5 questionnaire. It will also address how the choice of development tools listed above provided a framework that enabled working scientists (who we would never ordinarily get to interact with UML and XML) to be part the iterative development process and ensure that the CMIP5 model documentation questionnaire

  10. The SGML Standardization Framework and the Introduction of XML

    PubMed Central

    Grütter, Rolf

    2000-01-01

    Extensible Markup Language (XML) is on its way to becoming a global standard for the representation, exchange, and presentation of information on the World Wide Web (WWW). More than that, XML is creating a standardization framework, in terms of an open network of meta-standards and mediators that allows for the definition of further conventions and agreements in specific business domains. Such an approach is particularly needed in the healthcare domain; XML promises to especially suit the particularities of patient records and their lifelong storage, retrieval, and exchange. At a time when change rather than steadiness is becoming the faithful feature of our society, standardization frameworks which support a diversified growth of specifications that are appropriate to the actual needs of the users are becoming more and more important; and efforts should be made to encourage this new attempt at standardization to grow in a fruitful direction. Thus, the introduction of XML reflects a standardization process which is neither exclusively based on an acknowledged standardization authority, nor a pure market standard. Instead, a consortium of companies, academic institutions, and public bodies has agreed on a common recommendation based on an existing standardization framework. The consortium's process of agreeing to a standardization framework will doubtlessly be successful in the case of XML, and it is suggested that it should be considered as a generic model for standardization processes in the future. PMID:11720931

  11. COM3/369: Knowledge-based Information Systems: A new approach for the representation and retrieval of medical information

    PubMed Central

    Mann, G; Birkmann, C; Schmidt, T; Schaeffler, V

    1999-01-01

    Introduction Present solutions for the representation and retrieval of medical information from online sources are not very satisfying. Either the retrieval process lacks precision and completeness, or the representation does not support the update and maintenance of the represented information. Most efforts are currently put into improving the combination of search engines and HTML-based documents. However, due to the current shortcomings of methods for natural language understanding, there are clear limitations to this approach. Furthermore, this approach does not solve the maintenance problem. At least medical information exceeding a certain complexity seems to call for approaches that rely on structured knowledge representation and corresponding retrieval mechanisms. Methods Knowledge-based information systems are based on the following fundamental ideas. The representation of information is based on ontologies that define the structure of the domain's concepts and their relations. Views on domain models are defined and represented as retrieval schemata. Retrieval schemata can be interpreted as canonical query types focussing on specific aspects of the provided information (e.g. diagnosis or therapy centred views). Based on these retrieval schemata it can be decided which parts of the information in the domain model must be represented explicitly and formalised to support the retrieval process. Propositional logic is used as the representation language. All other information can be represented in a structured but informal way using text, images etc. Layout schemata are used to assign layout information to retrieved domain concepts. Depending on the target environment, HTML or XML can be used. Results Based on this approach two knowledge-based information systems have been developed. The 'Ophthalmologic Knowledge-based Information System for Diabetic Retinopathy' (OKIS-DR) provides information on diagnoses, findings, examinations, guidelines, and reference images related

  12. XML Flight/Ground Data Dictionary Management

    NASA Technical Reports Server (NTRS)

    Wright, Jesse; Wiklow, Colette

    2007-01-01

    A computer program generates Extensible Markup Language (XML) files that effect coupling between the command- and telemetry-handling software running aboard a spacecraft and the corresponding software running in ground support systems. The XML files are produced by use of information from the flight software and from flight-system engineering. The XML files are converted to legacy ground-system data formats for command and telemetry, transformed into Web-based and printed documentation, and used in developing new ground-system data-handling software. Previously, the information about telemetry and command was scattered in various paper documents that were not synchronized. The process of searching and reading the documents was time-consuming and introduced errors. In contrast, the XML files contain all of the information in one place. XML structures can evolve in such a manner as to enable the addition, to the XML files, of the metadata necessary to track the changes and the associated documentation. The use of this software has reduced the extent of manual operations in developing a ground data system, thereby saving considerable time and removing errors that previously arose in the translation and transcription of software information from the flight to the ground system.
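
    A minimal sketch of the kind of single-source transformation described (one XML dictionary feeding several downstream formats) is given below; the element names and the legacy output layout are invented and do not reflect the actual flight or ground system files.

    ```python
    # Sketch: one XML telemetry dictionary drives several outputs (here a
    # legacy fixed-width table and HTML rows), keeping them synchronized.
    import xml.etree.ElementTree as ET

    DICTIONARY = """
    <telemetry>
      <channel id="THRM-0101" type="float" units="degC">Battery temperature</channel>
      <channel id="PWR-0042" type="uint16" units="mA">Bus current</channel>
    </telemetry>
    """

    def legacy_rows(xml_text):
        """Emit a fixed-width legacy table, one line per channel."""
        root = ET.fromstring(xml_text)
        for ch in root.findall("channel"):
            yield f"{ch.get('id'):<12}{ch.get('type'):<8}{ch.get('units'):<6}{ch.text}"

    def html_rows(xml_text):
        """Emit HTML table rows for web-based documentation."""
        root = ET.fromstring(xml_text)
        for ch in root.findall("channel"):
            yield f"<tr><td>{ch.get('id')}</td><td>{ch.text}</td></tr>"

    if __name__ == "__main__":
        print("\n".join(legacy_rows(DICTIONARY)))
        print("\n".join(html_rows(DICTIONARY)))
    ```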

  13. The Cadmio XML healthcare record.

    PubMed

    Barbera, Francesco; Ferri, Fernando; Ricci, Fabrizio L; Sottile, Pier Angelo

    2002-01-01

    The management of clinical data is a complex task. Patient-related information reported in patient folders is a set of heterogeneous and structured data accessed by different users having different goals (in local or geographical networks). The XML language provides a mechanism for describing, manipulating, and visualising structured data in web-based applications. XML ensures that the structured data is managed in a uniform and transparent manner independently from the applications and their providers, guaranteeing some interoperability. Extracting data from the healthcare record and structuring them according to XML makes the data available through browsers. The MIC/MIE model (Medical Information Category/Medical Information Elements), which allows the definition and management of healthcare records and is used in CADMIO, a HISA-based project, is described in this paper, using XML to allow the data to be visualised through web browsers.

  14. XML Based Markup Languages for Specific Domains

    NASA Astrophysics Data System (ADS)

    Varde, Aparna; Rundensteiner, Elke; Fahrenholz, Sally

    A challenging area in web based support systems is the study of human activities in connection with the web, especially with reference to certain domains. This includes capturing human reasoning in information retrieval, facilitating the exchange of domain-specific knowledge through a common platform and developing tools for the analysis of data on the web from a domain expert's angle. Among the techniques and standards related to such work, we have XML, the eXtensible Markup Language. This serves as a medium of communication for storing and publishing textual, numeric and other forms of data seamlessly. XML tag sets are such that they preserve semantics and simplify the understanding of stored information by users. Often domain-specific markup languages are designed using XML, with a user-centric perspective. Standardization bodies and research communities may extend these to include additional semantics of areas within and related to the domain. This chapter outlines the issues to be considered in developing domain-specific markup languages: the motivation for development, the semantic considerations, the syntactic constraints and other relevant aspects, especially taking into account human factors. Illustrating examples are provided from domains such as Medicine, Finance and Materials Science. Particular emphasis in these examples is on the Materials Markup Language MatML and the semantics of one of its areas, namely, the Heat Treating of Materials. The focus of this chapter, however, is not the design of one particular language but rather the generic issues concerning the development of domain-specific markup languages.

  15. Transformation of HDF-EOS metadata from the ECS model to ISO 19115-based XML

    NASA Astrophysics Data System (ADS)

    Wei, Yaxing; Di, Liping; Zhao, Baohua; Liao, Guangxuan; Chen, Aijun

    2007-02-01

    Nowadays, geographic data, such as NASA's Earth Observation System (EOS) data, are playing an increasing role in many areas, including academic research, government decisions and even in people's everyday lives. As the quantity of geographic data becomes increasingly large, a major problem is how to fully make use of such data in a distributed, heterogeneous network environment. In order for a user to effectively discover and retrieve the specific information that is useful, the geographic metadata should be described and managed properly. Fortunately, the emergence of XML and Web Services technologies greatly promotes information distribution across the Internet. The research effort discussed in this paper presents a method and its implementation for transforming Hierarchical Data Format (HDF)-EOS metadata from the NASA ECS model to ISO 19115-based XML, which will be managed by the Open Geospatial Consortium (OGC) Catalogue Services—Web Profile (CSW). Using XML and international standards rather than domain-specific models to describe the metadata of those HDF-EOS data, and further using CSW to manage the metadata, can allow metadata information to be searched and interchanged more widely and easily, thus promoting the sharing of HDF-EOS data.
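
    The metadata crosswalk itself would normally be expressed as an XSLT stylesheet; the toy example below, using the third-party lxml package with made-up source tags and a drastically simplified target in place of the real ISO 19115 (gmd) encoding, shows the mechanics only.

    ```python
    # Sketch: transform an ECS-style metadata fragment to a heavily simplified
    # ISO 19115-like XML using XSLT via lxml. Tag names are illustrative.
    from lxml import etree  # requires the third-party lxml package

    SOURCE = """<GranuleMetaData>
      <ShortName>MOD021KM</ShortName>
      <RangeBeginningDate>2000-02-24</RangeBeginningDate>
    </GranuleMetaData>"""

    XSL = """<xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:template match="/GranuleMetaData">
        <MD_Metadata>
          <identifier><xsl:value-of select="ShortName"/></identifier>
          <temporalExtent><xsl:value-of select="RangeBeginningDate"/></temporalExtent>
        </MD_Metadata>
      </xsl:template>
    </xsl:stylesheet>"""

    transform = etree.XSLT(etree.fromstring(XSL.encode()))
    result = transform(etree.fromstring(SOURCE.encode()))
    print(etree.tostring(result, pretty_print=True).decode())
    ```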

  16. Application of XML in DICOM

    NASA Astrophysics Data System (ADS)

    You, Xiaozhen; Yao, Zhihong

    2005-04-01

    As a standard for the communication and storage of medical digital images, DICOM has been playing a very important role in the integration of hospital information. In DICOM, tags are expressed by numbers, and only standard data elements can be shared by looking them up in the Data Dictionary, while private tags cannot. As such, a DICOM file's readability and extensibility are limited. In addition, reading DICOM files requires special software. In our research, we introduced XML into DICOM, defining an XML-based DICOM special transfer format, XML-DCM, and a DICOM storage format, X-DCM, as well as developing a program package to realize format interchange among DICOM, XML-DCM, and X-DCM. XML-DCM is based on the DICOM structure while replacing numeric tags with accessible XML character-string tags. The merits are as follows: a) every character-string tag of XML-DCM has an explicit meaning, so users can understand standard data elements and private data elements easily without looking up the Data Dictionary; in this way, the readability and data sharing of DICOM files are greatly improved; b) according to their requirements, users can define new character-string tags with explicit meanings for their own systems to extend the set of data elements; c) users can read the medical image and associated information conveniently through IE, ultimately enlarging the scope of data sharing. The application of the storage format X-DCM will reduce data redundancy and save storage memory. The results of practical application show that XML-DCM favors the integration and sharing of medical image data among different systems and devices.
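
    The core idea, replacing numeric DICOM tags with self-describing XML element names, can be sketched as follows; the two tag mappings are standard DICOM attributes, but the XML layout is invented and is not the paper's XML-DCM format.

    ```python
    # Sketch: render numeric DICOM tags as readable XML element names so the
    # file is understandable without the Data Dictionary. Layout is illustrative.
    import xml.etree.ElementTree as ET

    # subset of the DICOM data dictionary: (group, element) -> readable name
    DICT = {
        (0x0010, 0x0010): "PatientName",
        (0x0008, 0x0060): "Modality",
    }

    def to_xml(dataset):
        """dataset: mapping of (group, element) -> value."""
        root = ET.Element("dicom")
        for tag, value in dataset.items():
            name = DICT.get(tag, f"Private_{tag[0]:04X}_{tag[1]:04X}")
            ET.SubElement(root, name).text = str(value)
        return ET.tostring(root, encoding="unicode")

    if __name__ == "__main__":
        print(to_xml({(0x0010, 0x0010): "DOE^JOHN", (0x0008, 0x0060): "CT"}))
    ```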

  17. δ-dependency for privacy-preserving XML data publishing.

    PubMed

    Landberg, Anders H; Nguyen, Kinh; Pardede, Eric; Rahayu, J Wenny

    2014-08-01

    An ever increasing amount of medical data such as electronic health records, is being collected, stored, shared and managed in large online health information systems and electronic medical record systems (EMR) (Williams et al., 2001; Virtanen, 2009; Huang and Liou, 2007) [1-3]. From such rich collections, data is often published in the form of census and statistical data sets for the purpose of knowledge sharing and enabling medical research. This brings with it an increasing need for protecting individual people privacy, and it becomes an issue of great importance especially when information about patients is exposed to the public. While the concept of data privacy has been comprehensively studied for relational data, models and algorithms addressing the distinct differences and complex structure of XML data are yet to be explored. Currently, the common compromise method is to convert private XML data into relational data for publication. This ad hoc approach results in significant loss of useful semantic information previously carried in the private XML data. Health data often has very complex structure, which is best expressed in XML. In fact, XML is the standard format for exchanging (e.g. HL7 version 3(1)) and publishing health information. Lack of means to deal directly with data in XML format is inevitably a serious drawback. In this paper we propose a novel privacy protection model for XML, and an algorithm for implementing this model. We provide general rules, both for transforming a private XML schema into a published XML schema, and for mapping private XML data to the new privacy-protected published XML data. In addition, we propose a new privacy property, δ-dependency, which can be applied to both relational and XML data, and that takes into consideration the hierarchical nature of sensitive data (as opposed to "quasi-identifiers"). Lastly, we provide an implementation of our model, algorithm and privacy property, and perform an experimental analysis

  18. Distribution of immunodeficiency fact files with XML--from Web to WAP.

    PubMed

    Väliaho, Jouni; Riikonen, Pentti; Vihinen, Mauno

    2005-06-26

    Although biomedical information is growing rapidly, it is difficult to find and retrieve validated data, especially for rare hereditary diseases. There is an increased need for services capable of integrating and validating information as well as providing it in a logically organized structure. An XML-based language enables the creation of open source databases for storage, maintenance and delivery for different platforms. Here we present a new data model called the fact file and an XML-based specification, the Inherited Disease Markup Language (IDML), which were developed to facilitate disease information integration, storage and exchange. The data model was applied to primary immunodeficiencies, but it can be used for any hereditary disease. Fact files integrate biomedical, genetic and clinical information related to hereditary diseases. IDML and fact files were used to build a comprehensive Web and WAP accessible knowledge base, ImmunoDeficiency Resource (IDR), available at http://bioinf.uta.fi/idr/. A fact file is a user-oriented interface which serves as a starting point to explore information on hereditary diseases. IDML enables the seamless integration and presentation of genetic and disease information resources on the Internet. IDML can be used to build information services for all kinds of inherited diseases. The open source specification and related programs are available at http://bioinf.uta.fi/idml/.

  19. Development of the Plate Tectonics and Seismology markup languages with XML

    NASA Astrophysics Data System (ADS)

    Babaie, H.; Babaei, A.

    2003-04-01

    The Extensible Markup Language (XML) and its specifications such as the XSD Schema, allow geologists to design discipline-specific vocabularies such as Seismology Markup Language (SeismML) or Plate Tectonics Markup Language (TectML). These languages make it possible to store and interchange structured geological information over the Web. Development of a geological markup language requires mapping geological concepts, such as "Earthquake" or "Plate" into a UML object model, applying a modeling and design environment. We have selected four inter-related geological concepts: earthquake, fault, plate, and orogeny, and developed four XML Schema Definitions (XSD), that define the relationships, cardinalities, hierarchies, and semantics of these concepts. In such a geological concept model, the UML object "Earthquake" is related to one or more "Wave" objects, each arriving to a seismic station at a specific "DateTime", and relating to a specific "Epicenter" object that lies at a unique "Location". The "Earthquake" object occurs along a "Segment" of a "Fault" object, which is related to a specific "Plate" object. The "Fault" has its own associations with such things as "Bend", "Step", and "Segment", and could be of any kind (e.g., "Thrust", "Transform'). The "Plate" is related to many other objects such as "MOR", "Subduction", and "Forearc", and is associated with an "Orogeny" object that relates to "Deformation" and "Strain" and several other objects. These UML objects were mapped into XML Metadata Interchange (XMI) formats, which were then converted into four XSD Schemas. The schemas were used to create and validate the XML instance documents, and to create a relational database hosting the plate tectonics and seismological data in the Microsoft Access format. The SeismML and TectML allow seismologists and structural geologists, among others, to submit and retrieve structured geological data on the Internet. A seismologist, for example, can submit peer-reviewed and

  20. "The Wonder Years" of XML.

    ERIC Educational Resources Information Center

    Gazan, Rich

    2000-01-01

    Surveys the current state of Extensible Markup Language (XML), a metalanguage for creating structured documents that describe their own content, and its implications for information professionals. Predicts that XML will become the common language underlying Web, word processing, and database formats. Also discusses Extensible Stylesheet Language…

  1. Connectionist Interaction Information Retrieval.

    ERIC Educational Resources Information Center

    Dominich, Sandor

    2003-01-01

    Discussion of connectionist views for adaptive clustering in information retrieval focuses on a connectionist clustering technique and activation spreading-based information retrieval model using the interaction information retrieval method. Presents theoretical as well as simulation results as regards computational complexity and includes…

  2. How Does XML Help Libraries?

    ERIC Educational Resources Information Center

    Banerjee, Kyle

    2002-01-01

    Discusses XML, how it has transformed the way information is managed and delivered, and its impact on libraries. Topics include how XML differs from other markup languages; the document object model (DOM); style sheets; practical applications for archival materials, interlibrary loans, digital collections, and MARC data; and future possibilities.…

  3. A Space Surveillance Ontology: Captured in an XML Schema

    DTIC Science & Technology

    2000-10-01

    characterization in a way most appropriate to a sub-domain. 6. The commercial market is embracing XML, and the military can take advantage of this significant ... the space surveillance ontology effort to two key efforts: the Defense Information Infrastructure Common Operating Environment (DII COE) XML ... strongly believe XML schemas will supplant them. Some of the advantages that XML schemas provide over DTDs include: strong data typing: the XML Schema

  4. Enhanced Information Retrieval Using AJAX

    NASA Astrophysics Data System (ADS)

    Kachhwaha, Rajendra; Rajvanshi, Nitin

    2010-11-01

    Information retrieval deals with the representation, storage, and organization of, and access to, information items. The representation and organization of information items should provide the user with easy access to the information. With the rapid development of the Internet, large amounts of digitally stored information are readily available on the World Wide Web. This information is so vast that it becomes increasingly difficult and time-consuming for users to find the information relevant to their needs. The explosive growth of information on the Internet has greatly increased the need for information retrieval systems. However, most search engines use conventional information retrieval systems. An information system needs to implement sophisticated pattern-matching tools to determine content at a faster rate. AJAX has recently emerged as a new tool with which the information retrieval process can become fast, so that information reaches the user at a faster pace than with conventional retrieval systems.

  5. Mathematics and Information Retrieval.

    ERIC Educational Resources Information Center

    Salton, Gerald

    1979-01-01

    Examines the main mathematical approaches to information retrieval, including both algebraic and probabilistic models, and describes difficulties which impede formalization of information retrieval processes. A number of developments are covered where new theoretical understandings have directly led to improved retrieval techniques and operations.…

  6. Value of XML in the implementation of clinical practice guidelines--the issue of content retrieval and presentation.

    PubMed

    Hoelzer, S; Schweiger, R K; Boettcher, H A; Tafazzoli, A G; Dudeck, J

    2001-01-01

    that preserves the original cohesiveness. The lack of structure limits the automatic identification and extraction of the information contained in these resources. For this reason, we have chosen a document-based approach using eXtensible Markup Language (XML) with its schema definition and related technologies. XML empowers the applications for in-context searching. In addition it allows the same content to be represented in different ways. Our XML reference clinical data model for guidelines has been realized with the XML schema definition. The schema is used for structuring new text-based guidelines and updating existing documents. It is also used to establish search strategies on the document base. We hypothesize that enabling the physicians to query the available CPGs easily, and to get access to selected and specific information at the point of care will foster increased use. Based on current evidence we are confident that it will have substantial impact on the care provided, and will improve health outcomes.

  7. Shuttle-Data-Tape XML Translator

    NASA Technical Reports Server (NTRS)

    Barry, Matthew R.; Osborne, Richard N.

    2005-01-01

    JSDTImport is a computer program for translating native Shuttle Data Tape (SDT) files from American Standard Code for Information Interchange (ASCII) format into databases in other formats. JSDTImport solves the problem of organizing the SDT content, affording flexibility to enable users to choose how to store the information in a database to better support client and server applications. JSDTImport can be dynamically configured by use of a simple Extensible Markup Language (XML) file. JSDTImport uses this XML file to define how each record and field will be parsed, its layout and definition, and how the resulting database will be structured. JSDTImport also includes a client application programming interface (API) layer that provides abstraction for the data-querying process. The API enables a user to specify the search criteria to apply in gathering all the data relevant to a query. The API can be used to organize the SDT content and translate into a native XML database. The XML format is structured into efficient sections, enabling excellent query performance by use of the XPath query language. Optionally, the content can be translated into a Structured Query Language (SQL) database for fast, reliable SQL queries on standard database server computers.
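
    How an XML file can define the way each record and field is parsed is easy to sketch; the configuration vocabulary and the sample record below are invented and are not the actual JSDTImport or SDT format.

    ```python
    # Sketch: an XML configuration describes fixed-width fields; the parser
    # slices each record accordingly. All names and offsets are illustrative.
    import xml.etree.ElementTree as ET

    LAYOUT = """
    <record name="measurement">
      <field name="msid"  start="0"  length="8"/>
      <field name="value" start="8"  length="10"/>
      <field name="units" start="18" length="4"/>
    </record>
    """

    def make_parser(layout_xml):
        """Build a record parser from the XML layout description."""
        fields = [(f.get("name"), int(f.get("start")), int(f.get("length")))
                  for f in ET.fromstring(layout_xml).findall("field")]
        def parse(line):
            return {name: line[start:start + length].strip()
                    for name, start, length in fields}
        return parse

    if __name__ == "__main__":
        parse = make_parser(LAYOUT)
        print(parse("V1234567  12.5     psi "))
    ```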

  8. XML Files

    MedlinePlus

    MedlinePlus produces XML data sets that you are welcome to download and use. If you have questions about the MedlinePlus XML files, please contact us. For additional sources of MedlinePlus data in XML format, visit our Web service page.

  9. XML syntax for clinical laboratory procedure manuals.

    PubMed

    Saadawi, Gilan; Harrison, James H

    2003-01-01

    We have developed a document type definition (DTD) in Extensible Markup Language (XML) for clinical laboratory procedures. Our XML syntax can adequately structure a variety of procedure types across different laboratories and is compatible with current procedure standards. The combination of this format with an XML content management system and appropriate style sheets will allow efficient procedure maintenance, distributed access, customized display and effective searching across a large body of test information.

  10. ADASS Web Database XML Project

    NASA Astrophysics Data System (ADS)

    Barg, M. I.; Stobie, E. B.; Ferro, A. J.; O'Neil, E. J.

    In the spring of 2000, at the request of the ADASS Program Organizing Committee (POC), we began organizing information from previous ADASS conferences in an effort to create a centralized database. The beginnings of this database originated from data (invited speakers, participants, papers, etc.) extracted from HyperText Markup Language (HTML) documents from past ADASS host sites. Unfortunately, not all HTML documents are well formed and parsing them proved to be an iterative process. It was evident at the beginning that if these Web documents were organized in a standardized way, such as XML (Extensible Markup Language), the processing of this information across the Web could be automated, more efficient, and less error prone. This paper will briefly review the many programming tools available for processing XML, including Java, Perl and Python, and will explore the mapping of relational data from our MySQL database to XML.
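
    The relational-to-XML mapping mentioned at the end can be sketched with the standard library (sqlite3 standing in for MySQL); the table layout is invented for illustration.

    ```python
    # Sketch: export rows from a relational table as XML (sqlite3 used as a
    # stand-in for MySQL; the conference/paper schema is illustrative only).
    import sqlite3
    import xml.etree.ElementTree as ET

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE papers (year INTEGER, author TEXT, title TEXT)")
    conn.execute("INSERT INTO papers VALUES (2000, 'Barg', 'ADASS Web Database XML Project')")

    root = ET.Element("adass")
    for year, author, title in conn.execute("SELECT year, author, title FROM papers"):
        paper = ET.SubElement(root, "paper", year=str(year))
        ET.SubElement(paper, "author").text = author
        ET.SubElement(paper, "title").text = title

    print(ET.tostring(root, encoding="unicode"))
    ```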

  11. Genome Information Broker (GIB): data retrieval and comparative analysis system for completed microbial genomes and more

    PubMed Central

    Fumoto, Masaki; Miyazaki, Satoru; Sugawara, Hideaki

    2002-01-01

    Genome Information Broker (GIB) is a powerful tool for the study of comparative genomics. GIB allows users to retrieve and display partial and/or whole genome sequences together with the relevant biological annotation. GIB has accumulated all the completed microbial genomes and has recently been expanded to include Arabidopsis thaliana genome data from DDBJ/EMBL/GenBank. In the near future, hundreds of genome sequences will be determined. In order to handle such huge amounts of data, we have enhanced the GIB architecture by using XML, CORBA and distributed RDBs. We introduce the new GIB here. GIB is freely accessible at http://gib.genes.nig.ac.jp/. PMID:11752256

  12. Topological Aspects of Information Retrieval.

    ERIC Educational Resources Information Center

    Egghe, Leo; Rousseau, Ronald

    1998-01-01

    Discusses topological aspects of theoretical information retrieval, including retrieval topology; similarity topology; pseudo-metric topology; document spaces as topological spaces; Boolean information retrieval as a subsystem of any topological system; and proofs of theorems. (LRW)

  13. Activities of information retrieval in Daicel Corporation : The roles and efforts of information retrieval team

    NASA Astrophysics Data System (ADS)

    Yamazaki, Towako

    In order to stabilize and improve the quality of its information retrieval service, the information retrieval team of Daicel Corporation has put effort into standard operating procedures, an interview sheet for information retrieval requests, a structured format for search reports, and search expressions for some of Daicel's technological fields. These activities and efforts also promote skill sharing and the handing down of skills between searchers. In addition, skill improvement is needed not only for each searcher individually, but also for the information retrieval team as a whole as it takes on new searcher roles.

  14. Building VoiceXML-Based Applications

    DTIC Science & Technology

    2002-01-01

    basketball games. The Busline systems were primarily developed using an early implementation of VoiceXML; the NBA Update Line was developed using VoiceXML... traveling in and out of Pittsburgh's university neighborhood. The second project is the NBA Update Line, which provides callers with real-time information on the NBA ... NBA UPDATE LINE The target user of this system is a fairly knowledgeable basketball fan; the system must therefore be able to provide detailed

  15. XML — an opportunity for data standards in the geosciences

    NASA Astrophysics Data System (ADS)

    Houlding, Simon W.

    2001-08-01

    Extensible markup language (XML) is a recently introduced meta-language standard on the Web. It provides the rules for development of metadata (markup) standards for information transfer in specific fields. XML allows development of markup languages that describe what information is rather than how it should be presented. This allows computer applications to process the information in intelligent ways. In contrast, hypertext markup language (HTML), which fuelled the initial growth of the Web, is a metadata standard concerned exclusively with presentation of information. Besides its potential for revolutionizing Web activities, XML provides an opportunity for development of meaningful data standards in specific application fields. The rapid endorsement of XML by science, industry and e-commerce has already spawned new metadata standards in such fields as mathematics, chemistry, astronomy, multi-media and Web micro-payments. Development of XML-based data standards in the geosciences would significantly reduce the effort currently wasted on manipulating and reformatting data between different computer platforms and applications and would ensure compatibility with the new generation of Web browsers. This paper explores the evolution, benefits and status of XML and related standards in the more general context of Web activities and uses this as a platform for discussion of its potential for development of data standards in the geosciences. Some of the advantages of XML are illustrated by a simple, browser-compatible demonstration of XML functionality applied to a borehole log dataset. The XML dataset and the associated stylesheet and schema declarations are available for FTP download.
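
    The borehole-log example itself is not reproduced in the abstract, but a comparable XML fragment and a simple consumer might look like this Python sketch (the element names are guesses for illustration only, not the paper's markup):

        import xml.etree.ElementTree as ET

        # Guessed markup for a borehole log interval description.
        log = ET.fromstring("""
        <borehole id="BH-01" collar-elevation="352.4">
          <interval from="0.0"  to="3.2"  lithology="clay"/>
          <interval from="3.2"  to="12.7" lithology="sandstone"/>
          <interval from="12.7" to="18.0" lithology="shale"/>
        </borehole>
        """)

        # Because the markup says what the data *is*, applications can reason about it,
        # e.g. compute the total logged thickness per lithology.
        thickness = {}
        for iv in log.findall("interval"):
            t = float(iv.get("to")) - float(iv.get("from"))
            thickness[iv.get("lithology")] = thickness.get(iv.get("lithology"), 0.0) + t
        print(thickness)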

  16. Compressing Aviation Data in XML Format

    NASA Technical Reports Server (NTRS)

    Patel, Hemil; Lau, Derek; Kulkarni, Deepak

    2003-01-01

    Design, operations and maintenance activities in aviation involve analysis of a variety of aviation data. This data is typically in disparate formats, making it difficult to use with different software packages. Use of a self-describing and extensible standard called XML provides a solution to this interoperability problem. XML provides a standardized language for describing the contents of an information stream, performing the same kind of definitional role for Web content as a database schema performs for relational databases. XML data can be easily customized for display using Extensible Style Sheets (XSL). While the self-describing nature of XML makes it easy to reuse, it also increases the size of data significantly. Therefore, transferring a dataset in XML form can decrease throughput and increase data transfer time significantly. It also increases storage requirements significantly. A natural solution to the problem is to compress the data using a suitable algorithm and transfer it in the compressed form. We found that XML-specific compressors such as Xmill and XMLPPM generally outperform traditional compressors. However, optimal use of Xmill requires discovery of the optimal options to use while running Xmill. This, in turn, depends on the nature of the data used. Manual discovery of optimal settings can require an engineer to experiment for weeks. We have devised an XML compression advisory tool that can analyze sample data files and recommend what compression tool would work best for this data and what are the optimal settings to be used with an XML compression tool.
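
    The advisory idea (try candidate compressors and settings on a sample file, then recommend one) can be sketched with general-purpose compressors from the Python standard library; Xmill and XMLPPM themselves are external tools and are not exercised here:

        import bz2
        import lzma
        import zlib

        def best_compressor(xml_bytes):
            """Try several general-purpose compressors/settings on a sample
            and report the one giving the smallest output (illustrative only)."""
            candidates = {
                "zlib level 6": zlib.compress(xml_bytes, 6),
                "zlib level 9": zlib.compress(xml_bytes, 9),
                "bz2 level 9": bz2.compress(xml_bytes, 9),
                "lzma (xz)": lzma.compress(xml_bytes),
            }
            name, data = min(candidates.items(), key=lambda kv: len(kv[1]))
            return name, len(data) / len(xml_bytes)

        sample = b"<flights>" + b"<flight id='1' alt='35000'/>" * 1000 + b"</flights>"
        print(best_compressor(sample))   # prints the best candidate and its ratio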

  17. Prototype Development: Context-Driven Dynamic XML Ophthalmologic Data Capture Application

    PubMed Central

    Schwei, Kelsey M; Kadolph, Christopher; Finamore, Joseph; Cancel, Efrain; McCarty, Catherine A; Okorie, Asha; Thomas, Kate L; Allen Pacheco, Jennifer; Pathak, Jyotishman; Ellis, Stephen B; Denny, Joshua C; Rasmussen, Luke V; Tromp, Gerard; Williams, Marc S; Vrabec, Tamara R; Brilliant, Murray H

    2017-01-01

    Background: The capture and integration of structured ophthalmologic data into electronic health records (EHRs) has historically been a challenge. However, the importance of this activity for patient care and research is critical. Objective: The purpose of this study was to develop a prototype of a context-driven dynamic extensible markup language (XML) ophthalmologic data capture application for research and clinical care that could be easily integrated into an EHR system. Methods: Stakeholders in the medical, research, and informatics fields were interviewed and surveyed to determine data and system requirements for ophthalmologic data capture. On the basis of these requirements, an ophthalmology data capture application was developed to collect and store discrete data elements with important graphical information. Results: The context-driven data entry application supports several features, including ink-over drawing capability for documenting eye abnormalities, context-based Web controls that guide data entry based on preestablished dependencies, and an adaptable database or XML schema that stores Web form specifications and allows for immediate changes in form layout or content. The application utilizes Web services to enable data integration with a variety of EHRs for retrieval and storage of patient data. Conclusions: This paper describes the development process used to create a context-driven dynamic XML data capture application for optometry and ophthalmology. The list of ophthalmologic data elements identified as important for care and research can be used as a baseline list for future ophthalmologic data collection activities. PMID:28903894

  18. Knowledge-Based Information Retrieval.

    ERIC Educational Resources Information Center

    Ford, Nigel

    1991-01-01

    Discussion of information retrieval focuses on theoretical and empirical advances in knowledge-based information retrieval. Topics discussed include the use of natural language for queries; the use of expert systems; intelligent tutoring systems; user modeling; the need for evaluation of system effectiveness; and examples of systems, including…

  19. Get It Together: Integrating Data with XML.

    ERIC Educational Resources Information Center

    Miller, Ron

    2003-01-01

    Discusses the use of XML for data integration to move data across different platforms, including across the Internet, from a variety of sources. Topics include flexibility; standards; organizing databases; unstructured data and the use of meta tags to encode it with XML information; cost effectiveness; and eliminating client software licenses.…

  20. Flight Dynamic Model Exchange using XML

    NASA Technical Reports Server (NTRS)

    Jackson, E. Bruce; Hildreth, Bruce L.

    2002-01-01

    The AIAA Modeling and Simulation Technical Committee has worked for several years to develop a standard by which the information needed to develop physics-based models of aircraft can be specified. The purpose of this standard is to provide a well-defined set of information, definitions, data tables and axis systems so that cooperating organizations can transfer a model from one simulation facility to another with maximum efficiency. This paper proposes using an application of the eXtensible Markup Language (XML) to implement the AIAA simulation standard. The motivation and justification for using a standard such as XML is discussed. Necessary data elements to be supported are outlined. An example of an aerodynamic model as an XML file is given. This example includes definition of independent and dependent variables for function tables, definition of key variables used to define the model, and axis systems used. The final steps necessary for implementation of the standard are presented. Software to take an XML-defined model and import/export it to/from a given simulation facility is discussed, but not demonstrated. That would be the next step in final implementation of standards for physics-based aircraft dynamic models.
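
    The paper's own XML example is not included in the abstract; as a stand-in, the sketch below shows how a one-dimensional function table encoded in XML (invented element names, not the AIAA markup) could be read and linearly interpolated in Python:

        import xml.etree.ElementTree as ET

        # Invented markup for a 1-D table: lift coefficient vs. angle of attack (deg).
        table_xml = ET.fromstring("""
        <functionTable name="CL_basic" independent="alpha_deg" dependent="CL">
          <point x="-4" y="-0.2"/>
          <point x="0"  y="0.2"/>
          <point x="4"  y="0.6"/>
          <point x="8"  y="0.9"/>
        </functionTable>
        """)

        points = [(float(p.get("x")), float(p.get("y"))) for p in table_xml.findall("point")]

        def lookup(x):
            """Piecewise-linear interpolation over the table breakpoints."""
            for (x0, y0), (x1, y1) in zip(points, points[1:]):
                if x0 <= x <= x1:
                    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
            raise ValueError("alpha outside table range")

        print(lookup(2.0))   # 0.4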

  1. XML Reconstruction View Selection in XML Databases: Complexity Analysis and Approximation Scheme

    NASA Astrophysics Data System (ADS)

    Chebotko, Artem; Fu, Bin

    Query evaluation in an XML database requires reconstructing XML subtrees rooted at nodes found by an XML query. Since XML subtree reconstruction can be expensive, one approach to improve query response time is to use reconstruction views - materialized XML subtrees of an XML document, whose nodes are frequently accessed by XML queries. For this approach to be efficient, the principal requirement is a framework for view selection. In this work, we are the first to formalize and study the problem of XML reconstruction view selection. The input is a tree $T$, in which every node $i$ has a size $c_i$ and profit $p_i$, and a size limitation $C$. The target is to find a subset of subtrees rooted at nodes $i_1, \ldots, i_k$ respectively such that $c_{i_1} + \cdots + c_{i_k} \le C$ and $p_{i_1} + \cdots + p_{i_k}$ is maximal. Furthermore, there is no overlap between any two subtrees selected in the solution. We prove that this problem is NP-hard and present a fully polynomial-time approximation scheme (FPTAS) as a solution.
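
    For a small instance the optimization problem can be checked by brute force, enumerating disjoint sets of subtree roots; the Python sketch below does exactly that on a toy tree (it is exponential and only meant to make the problem statement concrete, not to reproduce the FPTAS from the paper):

        from itertools import combinations

        # Toy tree: node -> parent (None for the root), with per-node subtree cost/profit.
        parent = {"a": None, "b": "a", "c": "a", "d": "b", "e": "b"}
        cost   = {"a": 10, "b": 5, "c": 3, "d": 2, "e": 3}
        profit = {"a": 9,  "b": 7, "c": 4, "d": 3, "e": 5}
        C = 8   # size limitation

        def ancestors(n):
            out = set()
            while parent[n] is not None:
                n = parent[n]
                out.add(n)
            return out

        def disjoint(roots):
            # Subtrees overlap iff one chosen root is an ancestor of another.
            return all(not (ancestors(r) & set(roots)) for r in roots)

        best = max(
            (s for k in range(len(cost) + 1) for s in combinations(cost, k)
             if disjoint(s) and sum(cost[r] for r in s) <= C),
            key=lambda s: sum(profit[r] for r in s),
        )
        print(best, sum(profit[r] for r in best))   # ('c', 'd', 'e') with total profit 12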

  2. Evaluation of Efficient XML Interchange (EXI) for Large Datasets and as an Alternative to Binary JSON Encodings

    DTIC Science & Technology

    2015-03-01

    fall in the lossy category (Gonzalez, Woods, & Eddins, 2009, p. 420). For the textual or numeric data in XML, however, lossy compression is... Professional Notes: Being Efficient with Bandwidth, by Lieutenant Commander Steve Debich, Lieutenant Bruce Hill, Captain Scot Miller (Retired)... (2005). XML Binary Characterization. Retrieved from http://www.w3.org/TR/xbc-characterization/ Gonzalez, R., Woods, R., & Eddins, S. (2009

  3. Application of Rough Sets to Information Retrieval.

    ERIC Educational Resources Information Center

    Miyamoto, Sadaaki

    1998-01-01

    Develops a method of rough retrieval, an application of the rough set theory to information retrieval. The aim is to: (1) show that rough sets are naturally applied to information retrieval in which categorized information structure is used; and (2) show that a fuzzy retrieval scheme is induced from the rough retrieval. (AEF)

  4. XML, Ontologies, and Their Clinical Applications.

    PubMed

    Yu, Chunjiang; Shen, Bairong

    2016-01-01

    The development of information technology has resulted in its penetration into every area of clinical research. Various clinical systems have been developed, which produce increasing volumes of clinical data. However, saving, exchanging, querying, and exploiting these data are challenging issues. The development of Extensible Markup Language (XML) has allowed the generation of flexible information formats to facilitate the electronic sharing of structured data via networks, and it has been used widely for clinical data processing. In particular, XML is very useful in the fields of data standardization, data exchange, and data integration. Moreover, ontologies have been attracting increased attention in various clinical fields in recent years. An ontology is the basic level of a knowledge representation scheme, and various ontology repositories have been developed, such as Gene Ontology and BioPortal. The creation of these standardized repositories greatly facilitates clinical research in related fields. In this chapter, we discuss the basic concepts of XML and ontologies, as well as their clinical applications.

  5. Web information retrieval based on ontology

    NASA Astrophysics Data System (ADS)

    Zhang, Jian

    2013-03-01

    The purpose of Information Retrieval (IR) is to find a set of documents that are relevant to a specific information need of a user. The traditional information retrieval model commonly used in commercial search engines is based on keyword indexing and Boolean logic queries. One major drawback of traditional information retrieval is that it typically retrieves information without an explicitly defined domain of interest, so many irrelevant results are returned and users are burdened with picking out useful answers from them. In order to tackle this issue, many semantic web information retrieval models have been proposed recently. The main advantage of the Semantic Web is that it enhances search mechanisms through the use of ontologies. In this paper, we present our approach to personalizing a web search engine based on ontology. In addition, key techniques are also discussed in our paper. Compared to previous research, our work concentrates on semantic similarity and the whole process, including query submission and information annotation.

  6. Applying Analogical Reasoning Techniques for Teaching XML Document Querying Skills in Database Classes

    ERIC Educational Resources Information Center

    Mitri, Michel

    2012-01-01

    XML has become the most ubiquitous format for exchange of data between applications running on the Internet. Most Web Services provide their information to clients in the form of XML. The ability to process complex XML documents in order to extract relevant information is becoming as important a skill for IS students to master as querying…

  7. Intelligent Information Retrieval: An Introduction.

    ERIC Educational Resources Information Center

    Gauch, Susan

    1992-01-01

    Discusses the application of artificial intelligence to online information retrieval systems and describes several systems: (1) CANSEARCH, from MEDLINE; (2) Intelligent Interface for Information Retrieval (I3R); (3) Gauch's Query Reformulation; (4) Environmental Pollution Expert (EP-X); (5) PLEXUS (gardening); and (6) SCISOR (corporate…

  8. Hypothesis-confirming information search strategies and computerized information-retrieval systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jacobs, S.M.

    A recent trend in information-retrieval systems technology is the development of on-line information retrieval systems. One objective of these systems has been to attempt to enhance decision effectiveness by allowing users to preferentially seek information, thereby facilitating the reduction or elimination of information overload. These systems do not necessarily lead to more-effective decision making, however. Recent research in information-search strategy suggests that when users are seeking information subsequent to forming initial beliefs, they may preferentially seek information to confirm these beliefs. It seems that effective computer-based decision support requires an information retrieval system capable of: (a) retrieving a subset of all available information, in order to reduce information overload, and (b) supporting an information search strategy that considers all relevant information, rather than merely hypothesis-confirming information. An information retrieval system with an expert component (i.e., a knowledge-based DSS) should be able to provide these capabilities. Results of this study are inconclusive; there was neither strong confirmatory evidence nor strong disconfirmatory evidence regarding the effectiveness of the KBDSS.

  9. A Unified Mathematical Definition of Classical Information Retrieval.

    ERIC Educational Resources Information Center

    Dominich, Sandor

    2000-01-01

    Presents a unified mathematical definition for the classical models of information retrieval and identifies a mathematical structure behind relevance feedback. Highlights include vector information retrieval; probabilistic information retrieval; and similarity information retrieval. (Contains 118 references.) (Author/LRW)

  10. Using XML to Separate Content from the Presentation Software in eLearning Applications

    ERIC Educational Resources Information Center

    Merrill, Paul F.

    2005-01-01

    This paper has shown how XML (extensible Markup Language) can be used to mark up content. Since XML documents, with meaningful tags, can be interpreted easily by humans as well as computers, they are ideal for the interchange of information. Because XML tags can be defined by an individual or organization, XML documents have proven useful in a…

  11. Framework and prototype for a secure XML-based electronic health records system.

    PubMed

    Steele, Robert; Gardner, William; Chandra, Darius; Dillon, Tharam S

    2007-01-01

    Security of personal medical information has always been a challenge for the advancement of Electronic Health Records (EHRs) initiatives. eXtensible Markup Language (XML) is rapidly becoming the key standard for data representation and transportation. The widespread use of XML and the prospect of its use in the Electronic Health (e-health) domain highlight the need for flexible access control models for XML data and documents. This paper presents a declarative access control model for XML data repositories that utilises an expressive XML role control model. The operational semantics of this model are illustrated by Xplorer, a user interface generation engine which supports search-browse-navigate activities on XML repositories.

  12. Application of XML to Journal Table Archiving

    NASA Astrophysics Data System (ADS)

    Shaya, E. J.; Blackwell, J. H.; Gass, J. E.; Kargatis, V. E.; Schneider, G. L.; Weiland, J. L.; Borne, K. D.; White, R. A.; Cheung, C. Y.

    1998-12-01

    The Astronomical Data Center (ADC) at the NASA Goddard Space Flight Center is a major archive for machine-readable astronomical data tables. Many ADC tables are derived from published journal articles. Article tables are reformatted to be machine-readable and documentation is crafted to facilitate proper reuse by researchers. The recent switch of journals to web based electronic format has resulted in the generation of large amounts of tabular data that could be captured into machine-readable archive format at fairly low cost. The large data flow of the tables from all major North American astronomical journals (a factor of 100 greater than the present rate at the ADC) necessitates the development of rigorous standards for the exchange of data between researchers, publishers, and the archives. We have selected a suitable markup language that can fully describe the large variety of astronomical information contained in ADC tables. The eXtensible Markup Language XML is a powerful internet-ready documentation format for data. It provides a precise and clear data description language that is both machine- and human-readable. It is rapidly becoming the standard format for business and information transactions on the internet and it is an ideal common metadata exchange format. By labelling, or "marking up", all elements of the information content, documents are created that computers can easily parse. An XML archive can easily and automatically be maintained, ingested into standard databases or custom software, and even totally restructured whenever necessary. Structuring astronomical data into XML format will enable efficient and focused search capabilities via off-the-shelf software. The ADC is investigating XML's expanded hyperlinking power to enhance connectivity within the ADC data/metadata and developing XSL display scripts to enhance display of astronomical data. The ADC XML Document Type Definition (DTD) can be viewed at http://messier.gsfc.nasa.gov/dtdhtml/DTD-TREE.html

  13. Graph-Based Interactive Bibliographic Information Retrieval Systems

    ERIC Educational Resources Information Center

    Zhu, Yongjun

    2017-01-01

    In the big data era, we have witnessed the explosion of scholarly literature. This explosion has imposed challenges to the retrieval of bibliographic information. Retrieval of intended bibliographic information has become challenging due to the overwhelming search results returned by bibliographic information retrieval systems for given input…

  14. XML Schema Guide for Primary CDR Submissions

    EPA Pesticide Factsheets

    This document presents the extensible markup language (XML) schema guide for the Office of Pollution Prevention and Toxics’ (OPPT) e-CDRweb tool. E-CDRweb is the electronic, web-based tool provided by Environmental Protection Agency (EPA) for the submission of Chemical Data Reporting (CDR) information. This document provides the user with tips and guidance on correctly using the version 1.7 XML schema. Please note that the order of the elements must match the schema.

  15. Using Induction to Refine Information Retrieval Strategies

    NASA Technical Reports Server (NTRS)

    Baudin, Catherine; Pell, Barney; Kedar, Smadar

    1994-01-01

    Conceptual information retrieval systems use structured document indices, domain knowledge and a set of heuristic retrieval strategies to match user queries with a set of indices describing the document's content. Such retrieval strategies increase the set of relevant documents retrieved (increase recall), but at the expense of returning additional irrelevant documents (decrease precision). Usually in conceptual information retrieval systems this tradeoff is managed by hand and with difficulty. This paper discusses ways of managing this tradeoff by the application of standard induction algorithms to refine the retrieval strategies in an engineering design domain. We gathered examples of query/retrieval pairs during the system's operation using feedback from a user on the retrieved information. We then fed these examples to the induction algorithm and generated decision trees that refine the existing set of retrieval strategies. We found that (1) induction improved the precision on a set of queries generated by another user, without a significant loss in recall, and (2) in an interactive mode, the decision trees pointed out flaws in the retrieval and indexing knowledge and suggested ways to refine the retrieval strategies.
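
    In spirit, the refinement step resembles fitting a classifier to logged query/retrieval examples labeled by user feedback; a small hedged sketch using scikit-learn (the features below are invented, not the ones used in the paper):

        from sklearn.tree import DecisionTreeClassifier, export_text

        # Invented features for each (query, document) pair gathered during operation:
        # [index terms shared, same design subsystem (0/1), retrieval-strategy depth]
        X = [
            [3, 1, 1], [0, 1, 2], [2, 0, 3], [4, 1, 1],
            [1, 0, 3], [0, 0, 2], [2, 1, 2], [1, 1, 3],
        ]
        # User feedback: 1 = relevant, 0 = irrelevant.
        y = [1, 0, 0, 1, 0, 0, 1, 1]

        tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
        print(export_text(tree, feature_names=["shared_terms", "same_subsystem", "depth"]))

        # The induced rules could then be folded back into the retrieval strategies,
        # e.g. only fire a deep heuristic when shared_terms is high.
        print(tree.predict([[3, 1, 2]]))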

  16. Information Retrieval: A Sequential Learning Process.

    ERIC Educational Resources Information Center

    Bookstein, Abraham

    1983-01-01

    Presents decision-theoretic models which intrinsically include retrieval of multiple documents whereby system responds to request by presenting documents to patron in sequence, gathering feedback, and using information to modify future retrievals. Document independence model, set retrieval model, sequential retrieval model, learning model,…

  17. Querying XML Data with SPARQL

    NASA Astrophysics Data System (ADS)

    Bikakis, Nikos; Gioldasis, Nektarios; Tsinaraki, Chrisa; Christodoulakis, Stavros

    SPARQL is today the standard access language for Semantic Web data. In the recent years XML databases have also acquired industrial importance due to the widespread applicability of XML in the Web. In this paper we present a framework that bridges the heterogeneity gap and creates an interoperable environment where SPARQL queries are used to access XML databases. Our approach assumes that fairly generic mappings between ontology constructs and XML Schema constructs have been automatically derived or manually specified. The mappings are used to automatically translate SPARQL queries to semantically equivalent XQuery queries which are used to access the XML databases. We present the algorithms and the implementation of SPARQL2XQuery framework, which is used for answering SPARQL queries over XML databases.
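
    The abstract does not spell out the translation rules, but the flavor of mapping a single SPARQL triple pattern onto an XQuery FLWOR expression, given a manually specified ontology-to-XML mapping, might be sketched as follows (the mapping table and query are invented; this is not the SPARQL2XQuery algorithm itself):

        # Invented mapping: ontology property -> XPath location step in the XML schema.
        property_to_path = {
            "dc:title":   "title/text()",
            "dc:creator": "author/name/text()",
        }

        def triple_to_xquery(subject_path, predicate, variable):
            """Translate one triple pattern (?s <predicate> ?var) into a FLWOR query,
            assuming the subject is bound to nodes matched by subject_path."""
            step = property_to_path[predicate]
            return (
                f"for $s in doc('books.xml')//{subject_path}\n"
                f"return <binding name='{variable}'>{{ $s/{step} }}</binding>"
            )

        # SPARQL pattern:  ?book dc:title ?title   (over an assumed /catalog/book schema)
        print(triple_to_xquery("catalog/book", "dc:title", "title"))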

  18. XML-based information system for planetary sciences

    NASA Astrophysics Data System (ADS)

    Carraro, F.; Fonte, S.; Turrini, D.

    2009-04-01

    EuroPlaNet (EPN in the following) has been developed by the planetological community under the "Sixth Framework Programme" (FP6 in the following), the European programme devoted to the improvement of European research efforts through the creation of an internal market for science and technology. The goal of the EPN programme is the creation of a European network aimed at the diffusion of data produced by space missions dedicated to the study of the Solar System. A special place within the EPN programme is that of I.D.I.S. (Integrated and Distributed Information Service). The main goal of IDIS is to offer the planetary science community user-friendly access to the data and information produced by the various types of research activities, i.e. Earth-based observations, space observations, modeling, theory and laboratory experiments. During the FP6 programme, IDIS development consisted in the creation of a series of thematic nodes, each of them specialized in a specific scientific domain, and a technical coordination node. The four thematic nodes are the Atmosphere node, the Plasma node, the Interiors & Surfaces node and the Small Bodies & Dust node. The main task of the nodes has been the building up of selected scientific cases related to the scientific domain of each node. The second task of the EPN nodes has been the creation of a catalogue of resources related to their main scientific theme. Both these efforts have been used as the basis for the development of the main IDIS goal, i.e. the integrated distributed service. An XML-based data model has been developed to describe resources using meta-data and to store the meta-data within an XML-based database called eXist. A search engine has then been developed in order to allow users to search for resources within the database. Users can select the resource type and can insert one or more values or can choose a value among those present in a list, depending on the selected resource. The system searches for all

  19. Mobile medical visual information retrieval.

    PubMed

    Depeursinge, Adrien; Duc, Samuel; Eggel, Ivan; Müller, Henning

    2012-01-01

    In this paper, we propose mobile access to peer-reviewed medical information based on textual search and content-based visual image retrieval. Web-based interfaces designed for limited screen space were developed to query via web services a medical information retrieval engine optimizing the amount of data to be transferred in wireless form. Visual and textual retrieval engines with state-of-the-art performance were integrated. Results obtained show a good usability of the software. Future use in clinical environments has the potential of increasing quality of patient care through bedside access to the medical literature in context.

  20. Data Manipulation in an XML-Based Digital Image Library

    ERIC Educational Resources Information Center

    Chang, Naicheng

    2005-01-01

    Purpose: To help to clarify the role of XML tools and standards in supporting transition and migration towards a fully XML-based environment for managing access to information. Design/methodology/approach: The Ching Digital Image Library, built on a three-tier architecture, is used as a source of examples to illustrate a number of methods of data…

  1. XWeB: The XML Warehouse Benchmark

    NASA Astrophysics Data System (ADS)

    Mahboubi, Hadj; Darmont, Jérôme

    With the emergence of XML as a standard for representing business data, new decision support applications are being developed. These XML data warehouses aim at supporting On-Line Analytical Processing (OLAP) operations that manipulate irregular XML data. To ensure feasibility of these new tools, important performance issues must be addressed. Performance is customarily assessed with the help of benchmarks. However, decision support benchmarks do not currently support XML features. In this paper, we introduce the XML Warehouse Benchmark (XWeB), which aims at filling this gap. XWeB derives from the relational decision support benchmark TPC-H. It is mainly composed of a test data warehouse that is based on a unified reference model for XML warehouses and that features XML-specific structures, together with its associated XQuery decision support workload. XWeB's usage is illustrated by experiments on several XML database management systems.

  2. XML Schema Guide for Secondary CDR Submissions

    EPA Pesticide Factsheets

    This document presents the extensible markup language (XML) schema guide for the Office of Pollution Prevention and Toxics’ (OPPT) e-CDRweb tool. E-CDRweb is the electronic, web-based tool provided by Environmental Protection Agency (EPA) for the submission of Chemical Data Reporting (CDR) information. This document provides the user with tips and guidance on correctly using the version 1.1 XML schema for the Joint Submission Form. Please note that the order of the elements must match the schema.

  3. 46 CFR 520.6 - Retrieval of information.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... 46 Shipping 9 2012-10-01 2012-10-01 false Retrieval of information. 520.6 Section 520.6 Shipping FEDERAL MARITIME COMMISSION REGULATIONS AFFECTING OCEAN SHIPPING IN FOREIGN COMMERCE CARRIER AUTOMATED TARIFFS § 520.6 Retrieval of information. (a) General. Tariffs systems shall present retrievers with the...

  4. 46 CFR 520.6 - Retrieval of information.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... 46 Shipping 9 2010-10-01 2010-10-01 false Retrieval of information. 520.6 Section 520.6 Shipping FEDERAL MARITIME COMMISSION REGULATIONS AFFECTING OCEAN SHIPPING IN FOREIGN COMMERCE CARRIER AUTOMATED TARIFFS § 520.6 Retrieval of information. (a) General. Tariffs systems shall present retrievers with the...

  5. 46 CFR 520.6 - Retrieval of information.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... 46 Shipping 9 2014-10-01 2014-10-01 false Retrieval of information. 520.6 Section 520.6 Shipping FEDERAL MARITIME COMMISSION REGULATIONS AFFECTING OCEAN SHIPPING IN FOREIGN COMMERCE CARRIER AUTOMATED TARIFFS § 520.6 Retrieval of information. (a) General. Tariffs systems shall present retrievers with the...

  6. 46 CFR 520.6 - Retrieval of information.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... 46 Shipping 9 2013-10-01 2013-10-01 false Retrieval of information. 520.6 Section 520.6 Shipping FEDERAL MARITIME COMMISSION REGULATIONS AFFECTING OCEAN SHIPPING IN FOREIGN COMMERCE CARRIER AUTOMATED TARIFFS § 520.6 Retrieval of information. (a) General. Tariffs systems shall present retrievers with the...

  7. 46 CFR 520.6 - Retrieval of information.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... 46 Shipping 9 2011-10-01 2011-10-01 false Retrieval of information. 520.6 Section 520.6 Shipping FEDERAL MARITIME COMMISSION REGULATIONS AFFECTING OCEAN SHIPPING IN FOREIGN COMMERCE CARRIER AUTOMATED TARIFFS § 520.6 Retrieval of information. (a) General. Tariffs systems shall present retrievers with the...

  8. Prototype Development: Context-Driven Dynamic XML Ophthalmologic Data Capture Application.

    PubMed

    Peissig, Peggy; Schwei, Kelsey M; Kadolph, Christopher; Finamore, Joseph; Cancel, Efrain; McCarty, Catherine A; Okorie, Asha; Thomas, Kate L; Allen Pacheco, Jennifer; Pathak, Jyotishman; Ellis, Stephen B; Denny, Joshua C; Rasmussen, Luke V; Tromp, Gerard; Williams, Marc S; Vrabec, Tamara R; Brilliant, Murray H

    2017-09-13

    The capture and integration of structured ophthalmologic data into electronic health records (EHRs) has historically been a challenge. However, the importance of this activity for patient care and research is critical. The purpose of this study was to develop a prototype of a context-driven dynamic extensible markup language (XML) ophthalmologic data capture application for research and clinical care that could be easily integrated into an EHR system. Stakeholders in the medical, research, and informatics fields were interviewed and surveyed to determine data and system requirements for ophthalmologic data capture. On the basis of these requirements, an ophthalmology data capture application was developed to collect and store discrete data elements with important graphical information. The context-driven data entry application supports several features, including ink-over drawing capability for documenting eye abnormalities, context-based Web controls that guide data entry based on preestablished dependencies, and an adaptable database or XML schema that stores Web form specifications and allows for immediate changes in form layout or content. The application utilizes Web services to enable data integration with a variety of EHRs for retrieval and storage of patient data. This paper describes the development process used to create a context-driven dynamic XML data capture application for optometry and ophthalmology. The list of ophthalmologic data elements identified as important for care and research can be used as a baseline list for future ophthalmologic data collection activities.

  9. An XML-Based Knowledge Management System of Port Information for U.S. Coast Guard Cutters

    DTIC Science & Technology

    2003-03-01

    using DTDs was not chosen. XML Schema performs many of the same functions as SQL-type schemas, but differs due to the unique structure of XML documents... to access data from content files within the developed system. XPath is not equivalent to SQL. While XPath is very powerful at reaching into an XML... document and finding nodes or node sets, it is not a complete query language. For operations like joins, unions, intersections, etc., SQL is far

  10. An XML-Based Mission Command Language for Autonomous Underwater Vehicles (AUVs)

    DTIC Science & Technology

    2003-06-01

    P. XML: How To Program. Prentice Hall, Inc., Upper Saddle River, New Jersey, 2001. Digital Signature Activity Statement, W3C, www.w3.org/Signature... languages because it does not directly specify how information is to be presented, but rather defines the structure (and thus semantics) of the... command and control (C2) aspects of using XML to increase the utility of AUVs. XML programming will be addressed. Current mine warfare doctrine will be

  11. XML-Based Generator of C++ Code for Integration With GUIs

    NASA Technical Reports Server (NTRS)

    Hua, Hook; Oyafuso, Fabiano; Klimeck, Gerhard

    2003-01-01

    An open source computer program has been developed to satisfy a need for simplified organization of structured input data for scientific simulation programs. Typically, such input data are parsed in from a flat American Standard Code for Information Interchange (ASCII) text file into computational data structures. Also typically, when a graphical user interface (GUI) is used, there is a need to completely duplicate the input information while providing it to a user in a more structured form. Heretofore, the duplication of the input information has entailed duplication of software efforts and increases in susceptibility to software errors because of the concomitant need to maintain two independent input-handling mechanisms. The present program implements a method in which the input data for a simulation program are completely specified in an Extensible Markup Language (XML)-based text file. The key benefit for XML is storing input data in a structured manner. More importantly, XML allows not just storing of data but also describing what each of the data items are. That XML file contains information useful for rendering the data by other applications. It also then generates data structures in the C++ language that are to be used in the simulation program. In this method, all input data are specified in one place only, and it is easy to integrate the data structures into both the simulation program and the GUI. XML-to-C is useful in two ways: 1. As an executable, it generates the corresponding C++ classes and 2. As a library, it automatically fills the objects with the input data values.
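
    The generator's core idea (turn an XML description of the input parameters into C++ data-structure code) can be imitated in a few lines of Python; the spec format and the emitted class below are simplified placeholders, not the actual XML-to-C schema or output:

        import xml.etree.ElementTree as ET

        # Simplified input specification (placeholder format, not the real XML-to-C schema).
        spec = ET.fromstring("""
        <struct name="SimulationInput">
          <param name="temperature" type="double" default="300.0"/>
          <param name="num_steps"   type="int"    default="1000"/>
          <param name="label"       type="std::string" default="&quot;run1&quot;"/>
        </struct>
        """)

        def emit_cpp(struct):
            """Emit a C++ struct definition with defaulted members from the XML spec."""
            lines = [f"struct {struct.get('name')} {{"]
            for p in struct.findall("param"):
                lines.append(f"    {p.get('type')} {p.get('name')} = {p.get('default')};")
            lines.append("};")
            return "\n".join(lines)

        print(emit_cpp(spec))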

  12. Collaborative Information Retrieval Method among Personal Repositories

    NASA Astrophysics Data System (ADS)

    Kamei, Koji; Yukawa, Takashi; Yoshida, Sen; Kuwabara, Kazuhiro

    In this paper, we describe a collaborative information retrieval method among personal repositories and an implementation of the method on a personal agent framework. We propose a framework for personal agents that aims to enable the sharing and exchange of information resources that are distributed unevenly among individuals. The kernel of a personal agent framework is an RDF (resource description framework)-based information repository for storing, retrieving and manipulating privately collected information, such as documents the user read and/or wrote, email he/she exchanged, web pages he/she browsed, etc. The repository also collects annotations to information resources that describe relationships among information resources and records of interaction between the user and information resources. Since the information resources in a personal repository and their structure are personalized, information retrieval from other users' repositories is an important application of the personal agent. A vector space model with a personalized concept-base is employed as an information retrieval mechanism in a personal repository. Since a personalized concept-base is constructed from information resources in a personal repository, it reflects its user's knowledge and interests. On the other hand, it leads to another problem while querying other users' personal repositories; that is, simply transferring query requests does not provide desirable results. To solve this problem, we propose a query equalization scheme based on a relevance feedback method for collaborative information retrieval between personalized concept-bases. In this paper, we describe an implementation of the collaborative information retrieval method and its user interface on the personal agent framework.

  13. Next-Generation Search Engines for Information Retrieval

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Devarakonda, Ranjeet; Hook, Leslie A; Palanisamy, Giri

    In recent years, there have been significant advancements in the areas of scientific data management and retrieval techniques, particularly in terms of standards and protocols for archiving data and metadata. Scientific data is rich, and spread across different places. In order to integrate these pieces together, a data archive and associated metadata should be generated. Data should be stored in a format that can be retrieved and, more importantly, it should be in a format that will continue to be accessible as technology changes, such as XML. While general-purpose search engines (such as Google or Bing) are useful for finding many things on the Internet, they are often of limited usefulness for locating Earth Science data relevant (for example) to a specific spatiotemporal extent. By contrast, tools that search repositories of structured metadata can locate relevant datasets with fairly high precision, but the search is limited to that particular repository. Federated searches (such as Z39.50) have been used, but can be slow and the comprehensiveness can be limited by downtime in any search partner. An alternative approach to improve comprehensiveness is for a repository to harvest metadata from other repositories, possibly with limits based on subject matter or access permissions. Searches through harvested metadata can be extremely responsive, and the search tool can be customized with semantic augmentation appropriate to the community of practice being served. One such system is Mercury, a metadata harvesting, data discovery, and access system built for researchers to search for, share and obtain spatiotemporal data used across a range of climate and ecological sciences. Mercury is an open-source toolset; its backend is built on Java, and its search capability is supported by popular open-source search libraries such as SOLR and LUCENE. Mercury harvests the structured metadata and key data from several data providing servers around the world and builds a

  14. Multimedia Information Retrieval Literature Review

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wong, Pak C.; Bohn, Shawn J.; Payne, Deborah A.

    This survey paper highlights some of the recent, influential work in multimedia information retrieval (MIR). MIR is a branch area of multimedia (MM). The young and fast-growing area has received strong industrial and academic support in the United States and around the world (see Section 7 for a list of major conferences and journals of the community). The term "information retrieval" may be misleading to those with different computer science or information technology backgrounds. As shown in our discussion later, it indeed includes topics from user interaction, data analytics, machine learning, feature extraction, information visualization, and more.

  15. An XML-based method for astronomy software designing

    NASA Astrophysics Data System (ADS)

    Liao, Mingxue; Aili, Yusupu; Zhang, Jin

    An XML-based method for the standardization of software design is introduced, analyzed, and successfully applied to renovating the hardware and software of the digital clock at Urumqi Astronomical Station. The basic strategy for eliciting time information from the new digital clock, FT206, in the antenna control program is introduced. With FT206, the need to compute how many centuries have passed since a certain day using sophisticated formulas is eliminated, and it is no longer necessary to set the correct UT time on the computer controlling the antenna, because the information about year, month and day is all deduced from the Julian day held in FT206 rather than from the computer time. With an XML-based method and standard for software design, various existing design methods are unified, communication and collaboration between developers are facilitated, and an Internet-based mode of software development becomes possible. The trend of development of the XML-based design method is predicted.

  16. XML Based Scientific Data Management Facility

    NASA Technical Reports Server (NTRS)

    Mehrotra, P.; Zubair, M.; Bushnell, Dennis M. (Technical Monitor)

    2002-01-01

    The World Wide Web Consortium has developed the Extensible Markup Language (XML) to support the building of better information management infrastructures. The scientific computing community, realizing the benefits of XML, has designed markup languages for scientific data. In this paper, we propose an XML-based scientific data management facility, XDMF. The project is motivated by the fact that even though a lot of scientific data is being generated, it is not being shared because of a lack of standards and infrastructure support for discovering and transforming the data. The proposed data management facility can be used to discover the scientific data itself, the transformation functions, and also for applying the required transformations. We have built a prototype system of the proposed data management facility that can work on different platforms. We have implemented the system using Java and the Apache XSLT engine, Xalan. To support remote data and transformation functions, we had to extend the XSLT specification and the Xalan package.
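
    As a small, hedged illustration of the transformation side (using the third-party lxml binding rather than Xalan, and made-up element names), applying an XSLT stylesheet to an XML dataset in Python looks like this:

        from lxml import etree

        data = etree.XML("<dataset><run id='1' cl='0.42'/><run id='2' cl='0.55'/></dataset>")

        # Made-up stylesheet: extract the 'cl' values into a flat text list.
        stylesheet = etree.XML("""
        <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
          <xsl:output method="text"/>
          <xsl:template match="/dataset">
            <xsl:for-each select="run">
              <xsl:value-of select="@cl"/><xsl:text>&#10;</xsl:text>
            </xsl:for-each>
          </xsl:template>
        </xsl:stylesheet>
        """)

        transform = etree.XSLT(stylesheet)
        print(str(transform(data)))   # 0.42 and 0.55, one per line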

  17. Advanced Feedback Methods in Information Retrieval.

    ERIC Educational Resources Information Center

    Salton, G.; And Others

    1985-01-01

    In this study, automatic feedback techniques are applied to Boolean query statements in online information retrieval to generate improved query statements based on information contained in previously retrieved documents. Feedback operations are carried out using conventional Boolean logic and extended logic. Experimental output is included to…

  18. Information retrieval for ecological syntheses.

    PubMed

    Bayliss, Helen R; Beyer, Fiona R

    2015-06-01

    Research syntheses are increasingly being conducted within the fields of ecology and environmental management. Information retrieval is crucial in any synthesis in identifying data for inclusion whilst potentially reducing biases in the dataset gathered, yet the nature of ecological information provides several challenges when compared with medicine that should be considered when planning and undertaking searches. We present ten recommendations for anyone considering undertaking information retrieval for ecological research syntheses that highlight the main differences with medicine and, if adopted, may help reduce biases in the dataset retrieved, increase search efficiency and improve reporting standards. They are as follows: (1) plan for information retrieval at an early stage, (2) identify and use sources of help, (3) clearly define the question to be addressed, (4) ensure that provisions for managing, recording and reporting the search are in place, (5) select an appropriate search type, (6) identify sources to be used, (7) identify limitations of the sources, (8) ensure that the search vocabulary is appropriate, (9) identify limits and filters that can help direct the search, and (10) test the strategy to ensure that it is realistic and manageable. These recommendations may be of value for other disciplines where search infrastructures are not yet sufficiently well developed. Copyright © 2014 John Wiley & Sons, Ltd.

  19. User-Centric Multi-Criteria Information Retrieval

    NASA Technical Reports Server (NTRS)

    Wolfe, Shawn R.; Zhang, Yi

    2009-01-01

    Information retrieval models usually represent content only, and not other considerations, such as authority, cost, and recency. How could multiple criteria be utilized in information retrieval, and how would it affect the results? In our experiments, using multiple user-centric criteria always produced better results than a single criteria.

  20. A Logic Basis for Information Retrieval.

    ERIC Educational Resources Information Center

    Watters, C. R.; Shepherd, M. A.

    1987-01-01

    Discusses the potential of recent work in artificial intelligence, especially expert systems, for the development of more effective information retrieval systems. Highlights include the role of an expert bibliographic retrieval system and a prototype expert retrieval system, PROBIB-2, that uses MicroProlog to provide deductive reasoning…

  1. Compression of Probabilistic XML Documents

    NASA Astrophysics Data System (ADS)

    Veldman, Irma; de Keijzer, Ander; van Keulen, Maurice

    Database techniques to store, query and manipulate data that contains uncertainty receive increasing research interest. Such UDBMSs can be classified according to their underlying data model: relational, XML, or RDF. We focus on uncertain XML DBMSs, with as representative example the Probabilistic XML model (PXML) of [10,9]. The size of a PXML document is obviously a factor in performance. There are PXML-specific techniques to reduce the size, such as a push-down mechanism that produces equivalent but more compact PXML documents. It can only be applied, however, where possibilities are dependent. For normal XML documents there also exist several techniques for compressing a document. Since Probabilistic XML is (a special form of) normal XML, it might benefit from these methods even more. In this paper, we show that existing compression mechanisms can be combined with PXML-specific compression techniques. We also show that the best compression rates are obtained with a combination of a PXML-specific technique with a rather simple generic DAG-compression technique.
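
    Generic DAG-compression of an XML tree amounts to sharing identical subtrees; a toy Python sketch of that idea (not the PXML-specific push-down mechanism from the paper):

        import xml.etree.ElementTree as ET

        doc = ET.fromstring(
            "<people>"
            "<person><city>Enschede</city><country>NL</country></person>"
            "<person><city>Enschede</city><country>NL</country></person>"
            "</people>"
        )

        def signature(elem):
            """Canonical string for a subtree, so identical subtrees hash alike."""
            attrs = "".join(f"{k}={v}" for k, v in sorted(elem.attrib.items()))
            kids = "".join(signature(c) for c in elem)
            return f"<{elem.tag}|{attrs}|{(elem.text or '').strip()}|{kids}>"

        def count_nodes(elem):
            return 1 + sum(count_nodes(c) for c in elem)

        unique = {}
        def dag_size(elem):
            """Nodes needed when identical subtrees are stored only once."""
            sig = signature(elem)
            if sig in unique:
                return 1                      # a shared reference to an existing subtree
            unique[sig] = elem
            return 1 + sum(dag_size(c) for c in elem)

        print(count_nodes(doc), "tree nodes vs", dag_size(doc), "DAG nodes")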

  2. Information retrieval system

    NASA Technical Reports Server (NTRS)

    Berg, R. F.; Holcomb, J. E.; Kelroy, E. A.; Levine, D. A.; Mee, C., III

    1970-01-01

    Generalized information storage and retrieval system capable of generating and maintaining a file, gathering statistics, sorting output, and generating final reports for output is reviewed. File generation and file maintenance programs written for the system are general purpose routines.

  3. Context-sensitive medical information retrieval.

    PubMed

    Auerbuch, Mordechai; Karson, Tom H; Ben-Ami, Benjamin; Maimon, Oded; Rokach, Lior

    2004-01-01

    Substantial medical data, such as pathology reports, operative reports, discharge summaries, and radiology reports, are stored in textual form. Databases containing free-text medical narratives often need to be searched to find relevant information for clinical and research purposes. Terms that appear in these documents tend to appear in different contexts. The context of negation, a negative finding, is of special importance, since many of the most frequently described findings are those denied by the patient or subsequently "ruled out." Hence, when searching free-text narratives for patients with a certain medical condition, if negation is not taken into account, many of the retrieved documents will be irrelevant. The purpose of this work is to develop a methodology for automated learning of negative context patterns in medical narratives and to test the effect of context identification on the performance of medical information retrieval. The algorithm presented significantly improves the performance of information retrieval done on medical narratives. The precision improves from about 60%, when using context-insensitive retrieval, to nearly 100%. The impact on recall is only minor. In addition, context-sensitive queries enable the user to search for terms in ways not otherwise available.
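
    A much simpler, rule-based stand-in for the learned negation patterns (a few hand-written triggers, nothing like the learning algorithm in the paper) conveys what context-sensitive matching changes:

        import re

        # Hand-written negation triggers; the paper *learns* such patterns instead.
        NEGATION = re.compile(r"\b(no|denies|denied|without|ruled out|negative for)\b", re.I)

        def mentions(term, sentence):
            """True if the term appears in a sentence that is not negated."""
            return term.lower() in sentence.lower() and not NEGATION.search(sentence)

        report = [
            "The patient denies chest pain on exertion.",
            "Chest pain began two hours before admission.",
            "No evidence of pneumothorax.",
        ]
        hits = [s for s in report if mentions("chest pain", s)]
        print(hits)   # only the affirmative mention survives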

  4. XSemantic: An Extension of LCA Based XML Semantic Search

    NASA Astrophysics Data System (ADS)

    Supasitthimethee, Umaporn; Shimizu, Toshiyuki; Yoshikawa, Masatoshi; Porkaew, Kriengkrai

    One of the most convenient ways to query XML data is a keyword search, because it does not require any knowledge of XML structure or learning a new user interface. However, keyword search is ambiguous. Users may use different terms to search for the same information. Furthermore, it is difficult for a system to decide which node is likely to be chosen as a return node and how much information should be included in the result. To address these challenges, we propose an XML semantic search based on keywords called XSemantic. On the one hand, we give three definitions to make the search complete in terms of semantics. First, through semantic term expansion, our system is robust against ambiguous keywords by using a domain ontology. Second, to return semantically meaningful answers, we automatically infer the return information from the user queries and take advantage of the shortest path to return meaningful connections between keywords. Third, we present a semantic ranking that reflects the degree of similarity as well as the semantic relationship, so that search results with higher relevance are presented to the users first. On the other hand, in the LCA and the proximity search approaches, we investigated the problem of how much information should be included in the search results. Therefore, we introduce the notion of the Lowest Common Element Ancestor (LCEA) and define our simple rule without any requirement on schema information such as the DTD or XML Schema. The first experiment indicated that XSemantic not only properly infers the return information but also generates compact meaningful results. Additionally, the benefits of our proposed semantics are demonstrated by the second experiment.

  5. Using XML to encode TMA DES metadata.

    PubMed

    Lyttleton, Oliver; Wright, Alexander; Treanor, Darren; Lewis, Paul

    2011-01-01

    The Tissue Microarray Data Exchange Specification (TMA DES) is an XML specification for encoding TMA experiment data. While TMA DES data is encoded in XML, the files that describe its syntax, structure, and semantics are not. The DTD format is used to describe the syntax and structure of TMA DES, and the ISO 11179 format is used to define the semantics of TMA DES. However, XML Schema can be used in place of DTDs, and another XML encoded format, RDF, can be used in place of ISO 11179. Encoding all TMA DES data and metadata in XML would simplify the development and usage of programs which validate and parse TMA DES data. XML Schema has advantages over DTDs such as support for data types, and a more powerful means of specifying constraints on data values. An advantage of RDF encoded in XML over ISO 11179 is that XML defines rules for encoding data, whereas ISO 11179 does not. We created an XML Schema version of the TMA DES DTD. We wrote a program that converted ISO 11179 definitions to RDF encoded in XML, and used it to convert the TMA DES ISO 11179 definitions to RDF. We validated a sample TMA DES XML file that was supplied with the publication that originally specified TMA DES using our XML Schema. We successfully validated the RDF produced by our ISO 11179 converter with the W3C RDF validation service. All TMA DES data could be encoded using XML, which simplifies its processing. XML Schema allows datatypes and valid value ranges to be specified for CDEs, which enables a wider range of error checking to be performed using XML Schemas than could be performed using DTDs.

  6. Using XML to encode TMA DES metadata

    PubMed Central

    Lyttleton, Oliver; Wright, Alexander; Treanor, Darren; Lewis, Paul

    2011-01-01

    Background: The Tissue Microarray Data Exchange Specification (TMA DES) is an XML specification for encoding TMA experiment data. While TMA DES data is encoded in XML, the files that describe its syntax, structure, and semantics are not. The DTD format is used to describe the syntax and structure of TMA DES, and the ISO 11179 format is used to define the semantics of TMA DES. However, XML Schema can be used in place of DTDs, and another XML encoded format, RDF, can be used in place of ISO 11179. Encoding all TMA DES data and metadata in XML would simplify the development and usage of programs which validate and parse TMA DES data. XML Schema has advantages over DTDs such as support for data types, and a more powerful means of specifying constraints on data values. An advantage of RDF encoded in XML over ISO 11179 is that XML defines rules for encoding data, whereas ISO 11179 does not. Materials and Methods: We created an XML Schema version of the TMA DES DTD. We wrote a program that converted ISO 11179 definitions to RDF encoded in XML, and used it to convert the TMA DES ISO 11179 definitions to RDF. Results: We validated a sample TMA DES XML file that was supplied with the publication that originally specified TMA DES using our XML Schema. We successfully validated the RDF produced by our ISO 11179 converter with the W3C RDF validation service. Conclusions: All TMA DES data could be encoded using XML, which simplifies its processing. XML Schema allows datatypes and valid value ranges to be specified for CDEs, which enables a wider range of error checking to be performed using XML Schemas than could be performed using DTDs. PMID:21969921

  7. Exploiting salient semantic analysis for information retrieval

    NASA Astrophysics Data System (ADS)

    Luo, Jing; Meng, Bo; Quan, Changqin; Tu, Xinhui

    2016-11-01

    Recently, many Wikipedia-based methods have been proposed to improve the performance of different natural language processing (NLP) tasks, such as semantic relatedness computation, text classification and information retrieval. Among these methods, salient semantic analysis (SSA) has proven to be an effective way to generate conceptual representations for words or documents. However, its feasibility and effectiveness in information retrieval are mostly unknown. In this paper, we study how to efficiently use SSA to improve information retrieval performance, and propose an SSA-based retrieval method under the language model framework. First, the SSA model is adopted to build conceptual representations for documents and queries. Then, these conceptual representations and the bag-of-words (BOW) representations are used in combination to estimate the language models of queries and documents. The proposed method is evaluated on several standard Text REtrieval Conference (TREC) collections, and the experimental results show that the proposed models consistently outperform existing Wikipedia-based retrieval methods.
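
    A rough sketch of the interpolation idea described above: a bag-of-words language model and a concept-level model are mixed to score a document against a query. The smoothing, the mixture weight and the toy concept labels are assumptions for illustration; the actual SSA concept extraction from Wikipedia is not reproduced here.

      # Toy query-likelihood scoring that interpolates a word model and a concept model,
      # in the spirit of combining BOW and SSA-style representations.
      import math
      from collections import Counter

      def unigram_score(query_terms, doc_terms, coll_terms, lam=0.5):
          """Jelinek-Mercer smoothed log query likelihood."""
          doc, coll = Counter(doc_terms), Counter(coll_terms)
          dlen, clen = max(len(doc_terms), 1), max(len(coll_terms), 1)
          score = 0.0
          for t in query_terms:
              p = lam * doc[t] / dlen + (1 - lam) * coll[t] / clen
              score += math.log(p) if p > 0 else math.log(1e-12)
          return score

      def combined_score(q_words, q_concepts, d_words, d_concepts,
                         coll_words, coll_concepts, alpha=0.7):
          """Interpolate word-level and concept-level query likelihoods."""
          return (alpha * unigram_score(q_words, d_words, coll_words)
                  + (1 - alpha) * unigram_score(q_concepts, d_concepts, coll_concepts))

      # Hypothetical document: its words plus the concepts a salience analysis might attach.
      doc_words = ["xml", "schema", "validation", "of", "clinical", "data"]
      doc_concepts = ["Markup_language", "Data_validation"]
      coll_words = doc_words + ["retrieval", "model", "wikipedia"]
      coll_concepts = doc_concepts + ["Information_retrieval"]

      print(combined_score(["schema", "validation"], ["Data_validation"],
                           doc_words, doc_concepts, coll_words, coll_concepts))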

  8. Term Relevance Weights in On-Line Information Retrieval

    ERIC Educational Resources Information Center

    Salton, G.; Waldstein, R. K.

    1978-01-01

    Term relevance weighting systems in interactive information retrieval are reviewed. An experiment in which information retrieval users ranked query terms in decreasing order of presumed importance prior to actual search and retrieval is described. (Author/KP)

  9. Incidental retrieval-induced forgetting of location information.

    PubMed

    Gómez-Ariza, Carlos J; Fernandez, Angel; Bajo, M Teresa

    2012-06-01

    Retrieval-induced forgetting (RIF) has been studied with different types of tests and materials. However, RIF has always been tested on the items' central features, and there is no information on whether inhibition also extends to peripheral features of the events in which the items are embedded. In two experiments, we specifically tested the presence of RIF in a task in which recall of peripheral information was required. After a standard retrieval practice task oriented to item identity, participants were cued with colors (Exp. 1) or with the items themselves (Exp. 2) and asked to recall the screen locations where the items had been displayed during the study phase. RIF for locations was observed after retrieval practice, an effect that was not present when participants were asked to read instead of retrieving the items. Our findings provide evidence that peripheral location information associated with an item during study can be also inhibited when the retrieval conditions promote the inhibition of more central, item identity information.

  10. Music Information Retrieval.

    ERIC Educational Resources Information Center

    Downie, J. Stephen

    2003-01-01

    Identifies MIR (Music Information Retrieval) computer system problems, historic influences, current state-of-the-art, and future MIR solutions through an examination of the multidisciplinary approach to MIR. Highlights include pitch; temporal factors; harmonics; tone; editorial, textual, and bibliographic facets; multicultural factors; locating…

  11. Adding XML to the MIS Curriculum: Lessons from the Classroom

    ERIC Educational Resources Information Center

    Wagner, William P.; Pant, Vik; Hilken, Ralph

    2008-01-01

    eXtensible Markup Language (XML) is a new technology that is currently being extolled by many industry experts and software vendors. Potentially it represents a platform independent language for sharing information over networks in a way that is much more seamless than with previous technologies. It is extensible in that XML serves as a "meta"…

  12. An XML-Based Manipulation and Query Language for Rule-Based Information

    NASA Astrophysics Data System (ADS)

    Mansour, Essam; Höpfner, Hagen

    Rules are utilized to assist in the monitoring process that is required in activities such as disease management and customer relationship management. These rules are specified according to the application's best practices. Most research efforts emphasize the specification and execution of these rules; few focus on managing these rules as one object that has a management life-cycle. This paper presents our manipulation and query language, developed to facilitate the maintenance of this object during its life-cycle and to query the information it contains. The language is based on an XML-based model. Furthermore, we evaluate the model and language using a prototype system applied to a clinical case study.

  13. XML technology planning database : lessons learned

    NASA Technical Reports Server (NTRS)

    Some, Raphael R.; Neff, Jon M.

    2005-01-01

    A hierarchical Extensible Markup Language (XML) database called XCALIBR (XML Analysis LIBRary) has been developed by the New Millennium Program to assist in technology return on investment (ROI) analysis and technology portfolio optimization. The database contains mission requirements and technology capabilities, which are related by use of an XML dictionary. The XML dictionary codifies a standardized taxonomy for space missions, systems, subsystems and technologies. In addition to being used for ROI analysis, the database is being examined for use in project planning, tracking and documentation. During the past year, the database has moved from development into alpha testing. This paper describes the lessons learned during construction and testing of the prototype database and the motivation for moving from an XML taxonomy to a standard XML-based ontology.

  14. Memory retrieval of everyday information under stress.

    PubMed

    Stock, Lisa-Marie; Merz, Christian J

    2018-07-01

    Psychosocial stress is known to crucially influence learning and memory processes. Several studies have already shown an impairing effect of elevated cortisol concentrations on memory retrieval. These studies mainly used learning material consisting of stimuli with a limited ecological validity. When using material with a social contextual component or with educationally relevant material, both impairing and enhancing stress effects on memory retrieval could be observed. In line with these latter studies, the present experiment also used material with a higher ecological validity (a coherent text consisting of daily relevant numeric, figural and verbal information). After encoding, retrieval took place 24 h later, after exposure to psychosocial stress or a control procedure (20 healthy men per group). The stress group was further subdivided into cortisol responders and non-responders. Results showed a significantly impaired retrieval of everyday information in non-responders compared to responders and controls. Altogether, the present findings indicate the need for an appropriate cortisol response for the successful memory retrieval of everyday information. Thus, the present findings suggest that cortisol increases - contrary to a stressful experience per se - seem to play a protective role for retrieving everyday information. Additionally, it could be speculated that the previously reported impairing stress effects on memory retrieval might depend on the learning material used. Copyright © 2018 Elsevier Inc. All rights reserved.

  15. Hypertext and Information Retrieval.

    ERIC Educational Resources Information Center

    Smith, Karen E.; And Others

    1988-01-01

    An overview of hypertext and hypermedia is followed by a description of the Intermedia system, and possibilities for using hypertext in the information industry are explored. A sidebar discusses information retrieval in the humanities using hypertext, and a 58-item annotated bibliography on hypertext is presented. (7 references) (MES)

  16. An Abstraction-Based Data Model for Information Retrieval

    NASA Astrophysics Data System (ADS)

    McAllister, Richard A.; Angryk, Rafal A.

    Language ontologies provide an avenue for automated lexical analysis that may be used to supplement existing information retrieval methods. This paper presents a method of information retrieval that takes advantage of WordNet, a lexical database, to generate paths of abstraction, and uses them as the basis for an inverted index structure to be used in the retrieval of documents from an indexed corpus. We present this method as an entree to a line of research on using ontologies to perform word-sense disambiguation and improve the precision of existing information retrieval techniques.
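
    A minimal sketch of the indexing idea: each word contributes not only itself but also the synsets on its WordNet hypernym ("abstraction") paths to an inverted index, so a query on a more abstract term can still reach the document. It assumes NLTK with the WordNet corpus downloaded; the documents are invented.

      # Index documents by words and by the synset names on their hypernym paths.
      # Assumes: import nltk; nltk.download("wordnet") has been run beforehand.
      from collections import defaultdict
      from nltk.corpus import wordnet as wn

      def abstraction_terms(word):
          terms = {word}
          for synset in wn.synsets(word):
              for path in synset.hypernym_paths():
                  terms.update(s.name() for s in path)
          return terms

      def build_index(docs):
          index = defaultdict(set)
          for doc_id, text in docs.items():
              for word in text.lower().split():
                  for term in abstraction_terms(word):
                      index[term].add(doc_id)
          return index

      docs = {1: "retrieval of canine imagery", 2: "feline behaviour studies"}
      index = build_index(docs)
      # Both documents should be reachable via the shared abstract synset 'carnivore.n.01'.
      print(index.get("carnivore.n.01"))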

  17. Collaborative Information Retrieval.

    ERIC Educational Resources Information Center

    Bruce, Harry; Fidel, Raya

    1999-01-01

    Researchers from the University of Washington, Microsoft Research, Boeing, and Risoe National Laboratory in Denmark have embarked on a project to explore the manifestations of Collaborative Information Retrieval (CIR) in work settings and to propose technological innovations and organizational changes that can support, facilitate, and improve CIR.…

  18. Use of information-retrieval languages in automated retrieval of experimental data from long-term storage

    NASA Technical Reports Server (NTRS)

    Khovanskiy, Y. D.; Kremneva, N. I.

    1975-01-01

    Problems and methods of automating information retrieval operations in a data bank used for long-term storage and retrieval of data from scientific experiments are discussed. Existing information retrieval languages are analyzed along with those being developed. The results of studies discussing the application of the descriptive 'Kristall' language used in the 'ASIOR' automated information retrieval system are presented. The development and use of a specialized language of the classification-descriptive type, using universal decimal classification indices as the main descriptors, is described.

  19. Multimodal medical information retrieval with unsupervised rank fusion.

    PubMed

    Mourão, André; Martins, Flávio; Magalhães, João

    2015-01-01

    Modern medical information retrieval systems are paramount to manage the insurmountable quantities of clinical data. These systems empower health care experts in the diagnosis of patients and play an important role in the clinical decision process. However, the ever-growing heterogeneous information generated in medical environments poses several challenges for retrieval systems. We propose a medical information retrieval system with support for multimodal medical case-based retrieval. The system supports medical information discovery by providing multimodal search, through a novel data fusion algorithm, and term suggestions from a medical thesaurus. Our search system compared favorably to other systems in 2013 ImageCLEFMedical. Copyright © 2014 Elsevier Ltd. All rights reserved.
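
    The paper's fusion algorithm is its own contribution and is not spelled out in the abstract; the sketch below shows reciprocal rank fusion, a common unsupervised baseline for merging ranked lists from different modalities, with invented run contents.

      # Unsupervised fusion of ranked lists from different modalities (e.g. a text run
      # and an image run) using reciprocal rank fusion.  This is a generic baseline,
      # not the specific algorithm proposed in the paper.
      from collections import defaultdict

      def reciprocal_rank_fusion(runs, k=60):
          """runs: list of ranked lists of document ids, best first."""
          scores = defaultdict(float)
          for run in runs:
              for rank, doc_id in enumerate(run, start=1):
                  scores[doc_id] += 1.0 / (k + rank)
          return sorted(scores, key=scores.get, reverse=True)

      text_run = ["case12", "case07", "case31"]      # hypothetical textual retrieval run
      image_run = ["case07", "case44", "case12"]     # hypothetical visual retrieval run
      print(reciprocal_rank_fusion([text_run, image_run]))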

  20. Towards the XML schema measurement based on mapping between XML and OO domain

    NASA Astrophysics Data System (ADS)

    Rakić, Gordana; Budimac, Zoran; Heričko, Marjan; Pušnik, Maja

    2017-07-01

    Measuring the quality of IT solutions is a priority in software engineering. Although numerous metrics for measuring object-oriented code already exist, measuring the quality of UML models or XML Schemas is still a developing area. One of the research questions in the overall research guided by the ideas described in this paper is whether already defined object-oriented design metrics can be applied to XML schemas on the basis of predefined mappings. In this paper, basic ideas for this mapping are presented. The mapping is a prerequisite for a future approach to measuring XML schema quality with object-oriented metrics.
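
    A deliberately simplified illustration of such a mapping: each named complexType in a schema is treated as a class and its element declarations as attributes, so simple object-oriented counts can be computed. The mapping rules and the sample schema are assumptions, not the paper's full mapping.

      # Count "classes" (named complexTypes) and their "attributes" (child element
      # declarations) in an XML Schema, as a crude stand-in for OO metrics on a schema.
      import xml.etree.ElementTree as ET

      XS = "{http://www.w3.org/2001/XMLSchema}"

      schema_text = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
        <xs:complexType name="Patient">
          <xs:sequence>
            <xs:element name="name" type="xs:string"/>
            <xs:element name="birthDate" type="xs:date"/>
          </xs:sequence>
        </xs:complexType>
      </xs:schema>"""

      root = ET.fromstring(schema_text)
      for ctype in root.iter(XS + "complexType"):
          attributes = [e.get("name") for e in ctype.iter(XS + "element")]
          print(ctype.get("name"), "->", len(attributes), "attributes:", attributes)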

  1. Integrated Syntactic/Semantic XML Data Validation with a Reusable Software Component

    ERIC Educational Resources Information Center

    Golikov, Steven

    2013-01-01

    Data integration is a critical component of enterprise system integration, and XML data validation is the foundation for sound data integration of XML-based information systems. Since B2B e-commerce relies on data validation as one of the critical components for enterprise integration, it is imperative for financial industries and e-commerce…

  2. The XBabelPhish MAGE-ML and XML translator.

    PubMed

    Maier, Don; Wymore, Farrell; Sherlock, Gavin; Ball, Catherine A

    2008-01-18

    MAGE-ML has been promoted as a standard format for describing microarray experiments and the data they produce. Two characteristics of the MAGE-ML format compromise its use as a universal standard: First, MAGE-ML files are exceptionally large - too large to be easily read by most people, and often too large to be read by most software programs. Second, the MAGE-ML standard permits many ways of representing the same information. As a result, different producers of MAGE-ML create different documents describing the same experiment and its data. Recognizing all the variants is an unwieldy software engineering task, resulting in software packages that can read and process MAGE-ML from some, but not all producers. This Tower of MAGE-ML Babel bars the unencumbered exchange of microarray experiment descriptions couched in MAGE-ML. We have developed XBabelPhish - an XQuery-based technology for translating one MAGE-ML variant into another. XBabelPhish's use is not restricted to translating MAGE-ML documents. It can transform XML files independent of their DTD, XML schema, or semantic content. Moreover, it is designed to work on very large (> 200 Mb.) files, which are common in the world of MAGE-ML. XBabelPhish provides a way to inter-translate MAGE-ML variants for improved interchange of microarray experiment information. More generally, it can be used to transform most XML files, including very large ones that exceed the capacity of most XML tools.

  3. Care episode retrieval: distributional semantic models for information retrieval in the clinical domain.

    PubMed

    Moen, Hans; Ginter, Filip; Marsi, Erwin; Peltonen, Laura-Maria; Salakoski, Tapio; Salanterä, Sanna

    2015-01-01

    Patients' health-related information is stored in electronic health records (EHRs) by health service providers. These records include sequential documentation of care episodes in the form of clinical notes. EHRs are used throughout the health care sector by professionals, administrators and patients, primarily for clinical purposes, but also for secondary purposes such as decision support and research. The vast amounts of information in EHR systems complicate information management and increase the risk of information overload. Therefore, clinicians and researchers need new tools to manage the information stored in the EHRs. A common use case is, given a--possibly unfinished--care episode, to retrieve the most similar care episodes among the records. This paper presents several methods for information retrieval, focusing on care episode retrieval, based on textual similarity, where similarity is measured through domain-specific modelling of the distributional semantics of words. Models include variants of random indexing and the semantic neural network model word2vec. Two novel methods are introduced that utilize the ICD-10 codes attached to care episodes to better induce domain-specificity in the semantic model. We report on experimental evaluation of care episode retrieval that circumvents the lack of human judgements regarding episode relevance. Results suggest that several of the methods proposed outperform a state-of-the-art search engine (Lucene) on the retrieval task.
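
    A toy sketch of the distributional-semantics idea: each episode's text is represented by the average of its word vectors and episodes are ranked by cosine similarity to the query. The tiny hand-made vectors stand in for vectors that word2vec or random indexing would learn from a clinical corpus, and the ICD-10 refinement is not modeled.

      # Rank care episodes by cosine similarity of averaged word vectors.
      # The 3-dimensional vectors below are invented for illustration.
      import numpy as np

      vectors = {
          "chest": np.array([0.9, 0.1, 0.0]), "pain": np.array([0.8, 0.2, 0.1]),
          "fracture": np.array([0.1, 0.9, 0.0]), "wrist": np.array([0.0, 0.8, 0.2]),
      }

      def embed(text):
          vecs = [vectors[w] for w in text.lower().split() if w in vectors]
          return np.mean(vecs, axis=0) if vecs else np.zeros(3)

      def cosine(a, b):
          denom = np.linalg.norm(a) * np.linalg.norm(b)
          return float(a @ b / denom) if denom else 0.0

      episodes = {"ep1": "chest pain on admission", "ep2": "wrist fracture after fall"}
      query = "acute chest pain"
      ranked = sorted(episodes, key=lambda e: cosine(embed(query), embed(episodes[e])),
                      reverse=True)
      print(ranked)   # 'ep1' should rank above 'ep2' for this query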

  4. Care episode retrieval: distributional semantic models for information retrieval in the clinical domain

    PubMed Central

    2015-01-01

    Patients' health-related information is stored in electronic health records (EHRs) by health service providers. These records include sequential documentation of care episodes in the form of clinical notes. EHRs are used throughout the health care sector by professionals, administrators and patients, primarily for clinical purposes, but also for secondary purposes such as decision support and research. The vast amounts of information in EHR systems complicate information management and increase the risk of information overload. Therefore, clinicians and researchers need new tools to manage the information stored in the EHRs. A common use case is, given a - possibly unfinished - care episode, to retrieve the most similar care episodes among the records. This paper presents several methods for information retrieval, focusing on care episode retrieval, based on textual similarity, where similarity is measured through domain-specific modelling of the distributional semantics of words. Models include variants of random indexing and the semantic neural network model word2vec. Two novel methods are introduced that utilize the ICD-10 codes attached to care episodes to better induce domain-specificity in the semantic model. We report on experimental evaluation of care episode retrieval that circumvents the lack of human judgements regarding episode relevance. Results suggest that several of the methods proposed outperform a state-of-the-art search engine (Lucene) on the retrieval task. PMID:26099735

  5. Automated information retrieval using CLIPS

    NASA Technical Reports Server (NTRS)

    Raines, Rodney Doyle, III; Beug, James Lewis

    1991-01-01

    Expert systems have considerable potential to assist computer users in managing the large volume of information available to them. One possible use of an expert system is to model the information retrieval interests of a human user and then make recommendations to the user as to articles of interest. At Cal Poly, a prototype expert system written in the C Language Integrated Production System (CLIPS) serves as an Automated Information Retrieval System (AIRS). AIRS monitors a user's reading preferences, develops a profile of the user, and then evaluates items returned from the information base. When prompted by the user, AIRS returns a list of items of interest to the user. In order to minimize the impact on system resources, AIRS is designed to run in the background during periods of light system use.

  6. Information Retrieval in Biomedical Research: From Articles to Datasets

    ERIC Educational Resources Information Center

    Wei, Wei

    2017-01-01

    Information retrieval techniques have been applied to biomedical research for a variety of purposes, such as textual document retrieval and molecular data retrieval. As biomedical research evolves over time, information retrieval is also constantly facing new challenges, including the growing number of available data, the emerging new data types,…

  7. Retrieving Patent Information Online

    ERIC Educational Resources Information Center

    Kaback, Stuart M.

    1978-01-01

    This paper discusses patent information retrieval from online files in terms of types of questions, file contents, coverage, timeliness, and other file variations. CLAIMS, Derwent, WPI, APIPAT and Chemical Abstracts Service are described. (KP)

  8. A Simple XML Producer-Consumer Protocol

    NASA Technical Reports Server (NTRS)

    Smith, Warren; Gunter, Dan; Quesnel, Darcy

    2000-01-01

    This document describes a simple XML-based protocol that can be used for producers of events to communicate with consumers of events. The protocol described here is not meant to be the most efficient protocol, the most logical protocol, or the best protocol in any way. This protocol was defined quickly and its intent is to give us a reasonable protocol that we can implement relatively easily and then use to gain experience in distributed event services. This experience will help us evaluate proposals for event representations, XML-based encoding of information, and communication protocols. The next section of this document describes how we represent events in this protocol and then defines the two events that we choose to use for our initial experiments. These definitions are made by example so that they are informal and easy to understand. The following section then proceeds to define the producer-consumer protocol we have agreed upon for our initial experiments.
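
    Since the document defines its events only by example, the sketch below uses invented element names to show what producing and consuming one XML-encoded event could look like; it is not the protocol's actual message format.

      # Serialize a hypothetical event to XML on the producer side and parse it back
      # on the consumer side.  Element and attribute names are illustrative only.
      import xml.etree.ElementTree as ET

      def produce_event(source, event_type, value):
          event = ET.Element("event", {"type": event_type})
          ET.SubElement(event, "source").text = source
          ET.SubElement(event, "value").text = str(value)
          return ET.tostring(event, encoding="unicode")

      def consume_event(payload):
          event = ET.fromstring(payload)
          return {"type": event.get("type"),
                  "source": event.findtext("source"),
                  "value": event.findtext("value")}

      message = produce_event("nodeA", "cpu.load", 0.73)
      print(message)
      print(consume_event(message))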

  9. Information Retrieval in Physics.

    ERIC Educational Resources Information Center

    Herschman, Arthur

    Discussed in this paper are the information problems in physics and the current program of the American Institute of Physics (AIP) being conducted in an attempt to develop an information retrieval system. The seriousness of the need is described by means of graphs indicating the exponential rise in the number of physics publications in the last…

  10. Speed up of XML parsers with PHP language implementation

    NASA Astrophysics Data System (ADS)

    Georgiev, Bozhidar; Georgieva, Adriana

    2012-11-01

    In this paper, the authors introduce PHP5's XML implementation and show how to read, parse, and write a short and uncomplicated XML file using SimpleXML in a PHP environment. The possibilities for combining the PHP5 language with the XML standard are described, and the details of the parsing process with SimpleXML are clarified. A practical project, PHP-XML-MySQL, presents the advantages of XML implementation in PHP modules. This approach allows comparatively simple searching of hierarchical XML data by means of PHP software tools. The proposed project includes a database, which can be extended with new data and new XML parsing functions.

  11. KAT: A Flexible XML-based Knowledge Authoring Environment

    PubMed Central

    Hulse, Nathan C.; Rocha, Roberto A.; Del Fiol, Guilherme; Bradshaw, Richard L.; Hanna, Timothy P.; Roemer, Lorrie K.

    2005-01-01

    As part of an enterprise effort to develop new clinical information systems at Intermountain Health Care, the authors have built a knowledge authoring tool that facilitates the development and refinement of medical knowledge content. At present, users of the application can compose order sets and an assortment of other structured clinical knowledge documents based on XML schemas. The flexible nature of the application allows the immediate authoring of new types of documents once an appropriate XML schema and accompanying Web form have been developed and stored in a shared repository. The need for a knowledge acquisition tool stems largely from the desire for medical practitioners to be able to write their own content for use within clinical applications. We hypothesize that medical knowledge content for clinical use can be successfully created and maintained through XML-based document frameworks containing structured and coded knowledge. PMID:15802477

  12. Standardization of XML Database Exchanges and the James Webb Space Telescope Experience

    NASA Technical Reports Server (NTRS)

    Gal-Edd, Jonathan; Detter, Ryan; Jones, Ron; Fatig, Curtis C.

    2007-01-01

    Personnel from the National Aeronautics and Space Administration (NASA) James Webb Space Telescope (JWST) Project have been working with various standards communities such as the Object Management Group (OMG) and the Consultative Committee for Space Data Systems (CCSDS) to assist in the definition of a common eXtensible Markup Language (XML) database exchange format. The CCSDS and OMG standards are intended for the exchange of core command and telemetry information, not for all database information needed to exercise a NASA space mission. The mission-specific database, containing all the information needed for a space mission, is translated from/to the standard using a translator. The standard is meant to provide a system that encompasses 90% of the information needed for command and telemetry processing. This paper will discuss standardization of the XML database exchange format, tools used, and the JWST experience, as well as future work with XML standards groups, both commercial and government.

  13. Setting the Standard: XML on Campus.

    ERIC Educational Resources Information Center

    Rawlins, Mike

    2001-01-01

    Explains what XML (Extensible Markup Language) is; where to find it in a few years (everywhere from Web pages, to database management systems, to common campus applications); issues that will make XML somewhat of an experimental strategy in the near term; and the importance of decision-makers being abreast of XML trends in standards, tools…

  14. XML: A Publisher's Perspective.

    ERIC Educational Resources Information Center

    Andrews, Timothy M.

    1999-01-01

    Explains eXtensible Markup Language (XML) and describes how Dow Jones Interactive is using it to improve the news-gathering and dissemination process through intranets and the World Wide Web. Discusses benefits of using XML, the relationship to HyperText Markup Language (HTML), lack of available software tools and industry support, and future…

  15. Interfering effects of retrieval in learning new information.

    PubMed

    Finn, Bridgid; Roediger, Henry L

    2013-11-01

    In 7 experiments, we explored the role of retrieval in associative updating, that is, in incorporating new information into an associative memory. We tested the hypothesis that retrieval would facilitate incorporating a new contextual detail into a learned association. Participants learned 3 pieces of information-a person's face, name, and profession (in Experiments 1-5). In the 1st phase, participants in all conditions learned faces and names. In the 2nd phase, participants either restudied the face-name pair (the restudy condition) or were given the face and asked to retrieve the name (the test condition). In the 3rd phase, professions were presented for study just after restudy or testing. Our prediction was that the new information (the profession) would be more readily learned following retrieval of the face-name association compared to restudy of the face-name association. However, we found that the act of retrieval generally undermined acquisition of new associations rather than facilitating them. This detrimental effect emerged on both immediate and delayed tests. Further, the effect was not due to selective attention to feedback because we found impairment whether or not feedback was provided after the Phase 2 test. The data are novel in showing that the act of retrieving information can inhibit the ability to learn new information shortly thereafter. The results are difficult to accommodate within current theories that mostly emphasize benefits of retrieval for learning. PsycINFO Database Record (c) 2013 APA, all rights reserved.

  16. Data Visualization in Information Retrieval and Data Mining (SIG VIS).

    ERIC Educational Resources Information Center

    Efthimiadis, Efthimis

    2000-01-01

    Presents abstracts that discuss using data visualization for information retrieval and data mining, including immersive information space and spatial metaphors; spatial data using multi-dimensional matrices with maps; TREC (Text Retrieval Conference) experiments; users' information needs in cartographic information retrieval; and users' relevance…

  17. INFORMATION RETRIEVAL EXPERIMENT. FINAL REPORT.

    ERIC Educational Resources Information Center

    SELYE, HANS

    THIS REPORT IS A BRIEF REVIEW OF RESULTS OF AN EXPERIMENT TO DETERMINE THE INFORMATION RETRIEVAL EFFICIENCY OF A MANUAL SPECIALIZED INFORMATION SYSTEM BASED ON 700,000 DOCUMENTS IN THE FIELDS OF ENDOCRINOLOGY, STRESS, MAST CELLS, AND ANAPHYLACTOID REACTIONS. THE SYSTEM RECEIVES 30,000 PUBLICATIONS ANNUALLY. DETAILED INFORMATION IS REPRESENTED BY…

  18. A Prototype System for Retrieval of Gene Functional Information

    PubMed Central

    Folk, Lillian C.; Patrick, Timothy B.; Pattison, James S.; Wolfinger, Russell D.; Mitchell, Joyce A.

    2003-01-01

    Microarrays allow researchers to gather data about the expression patterns of thousands of genes simultaneously. Statistical analysis can reveal which genes show statistically significant results. Making biological sense of those results requires the retrieval of functional information about the genes thus identified, typically a manual gene-by-gene retrieval of information from various on-line databases. For experiments generating thousands of genes of interest, retrieval of functional information can become a significant bottleneck. To address this issue, we are currently developing a prototype system to automate the process of retrieval of functional information from multiple on-line sources. PMID:14728346

  19. Web information retrieval for health professionals.

    PubMed

    Ting, S L; See-To, Eric W K; Tse, Y K

    2013-06-01

    This paper presents a Web Information Retrieval System (WebIRS), which is designed to assist healthcare professionals in obtaining up-to-date medical knowledge and information via the World Wide Web (WWW). The system leverages document classification and text summarization techniques to deliver highly correlated medical information to physicians. The system architecture of the proposed WebIRS is first discussed; then a case study on an application of the proposed system in a Hong Kong medical organization is presented to illustrate the adoption process, and a questionnaire is administered to collect feedback on the operation and performance of WebIRS in comparison with conventional information retrieval on the WWW. A prototype system has been constructed and implemented on a trial basis in a medical organization. It has proven to be of benefit to healthcare professionals through its automatic functions for classifying and summarizing the medical information that the physicians need. The results of the case study show that with the use of the proposed WebIRS, a significant reduction in searching time and effort, together with retrieval of highly relevant materials, can be attained.

  20. Experiments in Multi-Lingual Information Retrieval.

    ERIC Educational Resources Information Center

    Salton, Gerard

    A comparison was made of the performance in an automatic information retrieval environment of user queries and document abstracts available in natural language form in both English and French. The results obtained indicate that the automatic indexing and retrieval techniques actually used appear equally effective in handling the query and document…

  1. MRML: an extensible communication protocol for interoperability and benchmarking of multimedia information retrieval systems

    NASA Astrophysics Data System (ADS)

    Mueller, Wolfgang; Mueller, Henning; Marchand-Maillet, Stephane; Pun, Thierry; Squire, David M.; Pecenovic, Zoran; Giess, Christoph; de Vries, Arjen P.

    2000-10-01

    While in the area of relational databases interoperability is ensured by common communication protocols (e.g. ODBC/JDBC using SQL), Content Based Image Retrieval Systems (CBIRS) and other multimedia retrieval systems are lacking both a common query language and a common communication protocol. Besides its obvious short-term convenience, interoperability of systems is crucial for the exchange and analysis of user data. In this paper, we present and describe an extensible XML-based query markup language, called MRML (Multimedia Retrieval markup Language). MRML is primarily designed so as to ensure interoperability between different content-based multimedia retrieval systems. Further, MRML allows researchers to preserve their freedom in extending their system as needed. MRML encapsulates multimedia queries in a way that enables multimedia (MM) query languages, MM content descriptions, MM query engines, and MM user interfaces to grow independently from each other, reaching a maximum of interoperability while ensuring a maximum of freedom for the developer. To benefit from this, only a few simple design principles have to be respected when extending MRML for one's private needs. The design of extensions within the MRML framework will be described in detail in the paper. MRML has been implemented and tested for the CBIRS Viper, using the user interface Snake Charmer. Both are part of the GNU project and can be downloaded at our site.

  2. Progress on an implementation of MIFlowCyt in XML

    NASA Astrophysics Data System (ADS)

    Leif, Robert C.; Leif, Stephanie H.

    2015-03-01

    Introduction: The International Society for Advancement of Cytometry (ISAC) Data Standards Task Force (DSTF) has created a standard for the Minimum Information about a Flow Cytometry Experiment (MIFlowCyt 1.0). The CytometryML schemas are based in part upon the Flow Cytometry Standard and Digital Imaging and Communications in Medicine (DICOM) standards. CytometryML has been, and will continue to be, extended and adapted to include MIFlowCyt, as well as to serve as a common standard for flow and image cytometry (digital microscopy). Methods: The MIFlowCyt data-types were created, as was the rest of CytometryML, in the XML Schema Definition Language (XSD1.1). Individual major elements of the MIFlowCyt schema were translated into XML and filled with reasonable data. A small section of the code was formatted with HTML formatting elements. Results: The differences in the amount of detail to be recorded for 1) users of standard techniques including data analysts and 2) others, such as method and device creators, laboratory and other managers, engineers, and regulatory specialists, required that separate data-types be created to describe the instrument configuration and components. A very substantial part of the MIFlowCyt element that describes the Experimental Overview part of MIFlowCyt, and substantial parts of several other major elements, have been developed. Conclusions: The future use of structured XML tags and web technology should facilitate searching of experimental information, its presentation, and inclusion in structured research, clinical, and regulatory documents, as well as demonstrate in publications adherence to the MIFlowCyt standard. The use of CytometryML together with XML technology should also result in the textual and numeric data being published using web technology without any change in composition. Preliminary testing indicates that CytometryML XML pages can be directly formatted with the combination of HTML

  3. Using XML and XSLT for flexible elicitation of mental-health risk knowledge.

    PubMed

    Buckingham, C D; Ahmed, A; Adams, A E

    2007-03-01

    Current tools for assessing risks associated with mental-health problems require assessors to make high-level judgements based on clinical experience. This paper describes how new technologies can enhance qualitative research methods to identify lower-level cues underlying these judgements, which can be collected by people without a specialist mental-health background. Content analysis of interviews with 46 multidisciplinary mental-health experts exposed the cues and their interrelationships, which were represented by a mind map using software that stores maps as XML. All 46 mind maps were integrated into a single XML knowledge structure and analysed by a Lisp program to generate quantitative information about the numbers of experts associated with each part of it. The knowledge was refined by the experts, using software developed in Flash to record their collective views within the XML itself. These views specified how the XML should be transformed by XSLT, a technology for rendering XML, which resulted in a validated hierarchical knowledge structure associating patient cues with risks. Changing knowledge elicitation requirements were accommodated by flexible transformations of XML data using XSLT, which also facilitated generation of multiple data-gathering tools suiting different assessment circumstances and levels of mental-health knowledge.
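
    A compact sketch of the rendering step the approach relies on: an XSLT stylesheet transforms a fragment of XML-encoded knowledge into a flat cue list. The element names are hypothetical and lxml is assumed; the real knowledge structure and stylesheets are far richer.

      # Transform a hypothetical XML knowledge fragment into a flat list of cues
      # with XSLT (element names invented; not the paper's actual structure).
      from lxml import etree

      knowledge = etree.XML(b"""
      <risk name="self-harm">
        <cue experts="12">expresses hopelessness</cue>
        <cue experts="9">recent loss of job</cue>
      </risk>""")

      stylesheet = etree.XML(b"""
      <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
        <xsl:output method="text"/>
        <xsl:template match="/risk">
          <xsl:for-each select="cue">
            <xsl:value-of select="concat(../@name, ': ', ., ' (', @experts, ' experts)&#10;')"/>
          </xsl:for-each>
        </xsl:template>
      </xsl:stylesheet>""")

      transform = etree.XSLT(stylesheet)
      print(str(transform(knowledge)))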

  4. Conservation and retrieval of information

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jensen, M.

    This is a summary of the findings of a Nordic working group formed in 1990 and given the task of establishing a basis for a common Nordic view of the need for information conservation for nuclear waste repositories by investigating the following: (1) the type of information that should be conserved; (2) the form in which the information should be kept; (3) the quality of the information as regards both type and form; and (4) the problems of future retrieval of information, including retrieval after very long periods of time. High-level waste from nuclear power generation will remain radioactive for very long times even though the major part of the radioactivity will have decayed within 1000 yr. Certain information about the waste must be kept for long time periods because future generations may-intentionally or inadvertently-come into contact with the radioactive waste. Current day waste management would benefit from an early identification of documents to be part of an archive for radioactive waste repositories. The same reasoning is valid for repositories for other toxic wastes.

  5. A Survey of Stemming Algorithms in Information Retrieval

    ERIC Educational Resources Information Center

    Moral, Cristian; de Antonio, Angélica; Imbert, Ricardo; Ramírez, Jaime

    2014-01-01

    Background: During the last fifty years, improved information retrieval techniques have become necessary because of the huge amount of information people have available, which continues to increase rapidly due to the use of new technologies and the Internet. Stemming is one of the processes that can improve information retrieval in terms of…

  6. An efficient approach for video information retrieval

    NASA Astrophysics Data System (ADS)

    Dong, Daoguo; Xue, Xiangyang

    2005-01-01

    Today, more and more video information can be accessed through the internet, satellite, etc. Retrieving specific video information from large-scale video databases has become an important and challenging research topic in the area of multimedia information retrieval. In this paper, we introduce a new and efficient index structure, OVA-File, which is a variant of the VA-File. In OVA-File, approximations that are close to each other in data space are stored in nearby positions of the approximation file. The benefit is that only the part of the approximation file close to the query vector needs to be visited to obtain the query result. Both a shot query algorithm and a video clip query algorithm are proposed to support video information retrieval efficiently. The experimental results showed that queries based on OVA-File were much faster than those based on the VA-File, with a small loss of result quality.
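
    A much simplified sketch of the vector-approximation idea behind the VA-File family: each dimension is quantized into a few cells, the cell bounds give a cheap lower bound on the distance, and only candidates that cannot be excluded are checked against the exact vectors. OVA-File's ordered layout of approximations is not modeled here.

      # Simplified VA-File-style filtering: quantize, prune with cell bounds, verify.
      import numpy as np

      def lower_bound(query, lo, hi):
          # Per-dimension distance from the query to the nearest point of each cell.
          gap = np.maximum(np.maximum(lo - query, query - hi), 0.0)
          return np.sqrt((gap ** 2).sum(axis=1))

      def nearest(query, vectors, bits=2):
          cells = 2 ** bits
          approx = np.clip(np.floor(vectors * cells), 0, cells - 1)
          lo, hi = approx / cells, (approx + 1) / cells
          bounds = lower_bound(query, lo, hi)
          best, best_dist = None, np.inf
          for i in np.argsort(bounds):          # visit most promising candidates first
              if bounds[i] >= best_dist:        # remaining candidates cannot win
                  break
              dist = np.linalg.norm(vectors[i] - query)
              if dist < best_dist:
                  best, best_dist = i, dist
          return best, best_dist

      rng = np.random.default_rng(0)
      data = rng.random((1000, 8))              # toy feature vectors in [0, 1)
      query = rng.random(8)
      print(nearest(query, data))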

  7. James Webb Space Telescope XML Database: From the Beginning to Today

    NASA Technical Reports Server (NTRS)

    Gal-Edd, Jonathan; Fatig, Curtis C.

    2005-01-01

    The James Webb Space Telescope (JWST) Project has been defining, developing, and exercising the use of a common eXtensible Markup Language (XML) for the command and telemetry (C&T) database structure. JWST is the first large NASA space mission to use XML for databases. The JWST project started developing the concepts for the C&T database in 2002. The database will need to last at least 20 years since it will be used beginning with flight software development, continuing through Observatory integration and test (I&T) and through operations. Also, a database tool kit has been provided to the 18 various flight software development laboratories located in the United States, Europe, and Canada that allows the local users to create their own databases. Recently the JWST Project has been working with the Jet Propulsion Laboratory (JPL) and Object Management Group (OMG) XML Telemetry and Command Exchange (XTCE) personnel to provide all the information needed by JWST and JPL for exchanging database information using an XML standard structure. The lack of standardization requires custom ingest scripts for each ground system segment, increasing the cost of the total system. Providing a non-proprietary standard for the telemetry and command database definition format will allow dissimilar systems to communicate without the need for expensive mission-specific database tools and testing of the systems after the database translation. The various ground system components that would benefit from a standardized database are the telemetry and command systems, archives, simulators, and trending tools. JWST has exchanged the XML database with the Eclipse, EPOCH, ASIST ground systems, Portable spacecraft simulator (PSS), a front-end system, and Integrated Trending and Plotting System (ITPS) successfully. This paper will discuss how JWST decided to use XML, the barriers to a new concept, experiences utilizing the XML structure, exchanging databases with other users, and issues that have

  8. WaterML: an XML Language for Communicating Water Observations Data

    NASA Astrophysics Data System (ADS)

    Maidment, D. R.; Zaslavsky, I.; Valentine, D.

    2007-12-01

    One of the great impediments to the synthesis of water information is the plethora of formats used to publish such data. Each water agency uses its own approach. XML (eXtensible Markup Language) dialects are generalizations of Hypertext Markup Language used to communicate specific kinds of information via the internet. WaterML is an XML language for water observations data - streamflow, water quality, groundwater levels, climate, precipitation and aquatic biology data, recorded at fixed, point locations as a function of time. The Hydrologic Information System project of the Consortium of Universities for the Advancement of Hydrologic Science, Inc. (CUAHSI) has defined WaterML and prepared a set of web service functions called WaterOneFlow that use WaterML to provide information about observation sites, the variables measured there and the values of those measurements. WaterML has been submitted to the Open GIS Consortium for harmonization with its standards for XML languages. Academic investigators at a number of testbed locations in the WATERS network are providing data in WaterML format using WaterOneFlow web services. The USGS and other federal agencies are also working with CUAHSI to similarly provide access to their data in WaterML through WaterOneFlow services.

  9. Information Retrieval for Ecological Syntheses

    ERIC Educational Resources Information Center

    Bayliss, Helen R.; Beyer, Fiona R.

    2015-01-01

    Research syntheses are increasingly being conducted within the fields of ecology and environmental management. Information retrieval is crucial in any synthesis in identifying data for inclusion whilst potentially reducing biases in the dataset gathered, yet the nature of ecological information provides several challenges when compared with…

  10. An Introduction to the Extensible Markup Language (XML).

    ERIC Educational Resources Information Center

    Bryan, Martin

    1998-01-01

    Describes Extensible Markup Language (XML), a subset of the Standard Generalized Markup Language (SGML) that is designed to make it easy to interchange structured documents over the Internet. Topics include Document Type Definition (DTD), components of XML, the use of XML, text and non-text elements, and uses for XML-coded files. (LRW)

  11. Data Discretization for Novel Relationship Discovery in Information Retrieval.

    ERIC Educational Resources Information Center

    Benoit, G.

    2002-01-01

    Describes an information retrieval, visualization, and manipulation model which offers the user multiple ways to exploit the retrieval set, based on weighted query terms, via an interactive interface. Outlines the mathematical model and describes an information retrieval application built on the model to search structured and full-text files.…

  12. A Low-Storage-Consumption XML Labeling Method for Efficient Structural Information Extraction

    NASA Astrophysics Data System (ADS)

    Liang, Wenxin; Takahashi, Akihiro; Yokota, Haruo

    Recently, labeling methods that extract and reconstruct the structural information of XML data, which is important for many applications such as XPath querying and keyword search, are becoming more attractive. To achieve efficient structural information extraction, in this paper we propose the C-DO-VLEI code, a novel update-friendly bit-vector encoding scheme based on register-length bit operations combined with the properties of Dewey Order numbers, which cannot be implemented in other relevant existing schemes such as ORDPATH. Meanwhile, the proposed method also achieves lower storage consumption because it requires neither a prefix schema nor any reserved codes for node insertion. We performed experiments to evaluate and compare the performance and storage consumption of the proposed method with those of the ORDPATH method. Experimental results show that the execution times for extracting depth information and parent node labels using the C-DO-VLEI code are about 25% and 15% less, respectively, and the average label size using the C-DO-VLEI code is about 24% smaller, compared with ORDPATH.
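
    A simplified sketch of Dewey-order labeling and of how depth and parent labels fall out of the label itself; it uses plain dotted strings, whereas the actual C-DO-VLEI code packs the same information into compact, update-friendly bit vectors.

      # Assign Dewey-order labels to an XML tree and derive structural information
      # (depth, parent label) directly from the labels.  Dotted strings only; the
      # real C-DO-VLEI encoding is a bit-vector scheme and is not reproduced here.
      import xml.etree.ElementTree as ET

      def label_tree(elem, label="1", out=None):
          out = {} if out is None else out
          out[label] = elem.tag
          for i, child in enumerate(elem, start=1):
              label_tree(child, f"{label}.{i}", out)
          return out

      def depth(label):
          return label.count(".") + 1

      def parent(label):
          return label.rsplit(".", 1)[0] if "." in label else None

      doc = ET.fromstring("<a><b><d/></b><c/></a>")
      for lab, tag in label_tree(doc).items():
          print(lab, tag, "depth:", depth(lab), "parent:", parent(lab))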

  13. A Simple XML Producer-Consumer Protocol

    NASA Technical Reports Server (NTRS)

    Smith, Warren; Gunter, Dan; Quesnel, Darcy; Biegel, Bryan (Technical Monitor)

    2001-01-01

    There are many different projects from government, academia, and industry that provide services for delivering events in distributed environments. The problem with these event services is that they are not general enough to support all uses and they speak different protocols so that they cannot interoperate. We require such interoperability when we, for example, wish to analyze the performance of an application in a distributed environment. Such an analysis might require performance information from the application, computer systems, networks, and scientific instruments. In this work we propose and evaluate a standard XML-based protocol for the transmission of events in distributed systems. One recent trend in government and academic research is the development and deployment of computational grids. Computational grids are large-scale distributed systems that typically consist of high-performance compute, storage, and networking resources. Examples of such computational grids are the DOE Science Grid, the NASA Information Power Grid (IPG), and the NSF Partnerships for Advanced Computing Infrastructure (PACIs). The major effort to deploy these grids is in the area of developing the software services to allow users to execute applications on these large and diverse sets of resources. These services include security, execution of remote applications, managing remote data, access to information about resources and services, and so on. There are several toolkits for providing these services such as Globus, Legion, and Condor. As part of these efforts to develop computational grids, the Global Grid Forum is working to standardize the protocols and APIs used by various grid services. This standardization will allow interoperability between the client and server software of the toolkits that are providing the grid services. The goal of the Performance Working Group of the Grid Forum is to standardize protocols and representations related to the storage and distribution of

  14. Representing Information in Patient Reports Using Natural Language Processing and the Extensible Markup Language

    PubMed Central

    Friedman, Carol; Hripcsak, George; Shagina, Lyuda; Liu, Hongfang

    1999-01-01

    Objective: To design a document model that provides reliable and efficient access to clinical information in patient reports for a broad range of clinical applications, and to implement an automated method using natural language processing that maps textual reports to a form consistent with the model. Methods: A document model that encodes structured clinical information in patient reports while retaining the original contents was designed using the extensible markup language (XML), and a document type definition (DTD) was created. An existing natural language processor (NLP) was modified to generate output consistent with the model. Two hundred reports were processed using the modified NLP system, and the XML output that was generated was validated using an XML validating parser. Results: The modified NLP system successfully processed all 200 reports. The output of one report was invalid, and 199 reports were valid XML forms consistent with the DTD. Conclusions: Natural language processing can be used to automatically create an enriched document that contains a structured component whose elements are linked to portions of the original textual report. This integrated document model provides a representation where documents containing specific information can be accurately and efficiently retrieved by querying the structured components. If manual review of the documents is desired, the salient information in the original reports can also be identified and highlighted. Using an XML model of tagging provides an additional benefit in that software tools that manipulate XML documents are readily available. PMID:9925230
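
    A minimal sketch of the validation step reported in the results: a structured report fragment is checked against a DTD with a validating parser. The DTD and element names below are invented, not the paper's document model, and lxml is assumed.

      # Validate a small, invented structured-report fragment against an equally
      # invented DTD, mirroring the validation step described in the abstract.
      from io import StringIO
      from lxml import etree

      dtd = etree.DTD(StringIO("""
      <!ELEMENT report (finding+)>
      <!ELEMENT finding (#PCDATA)>
      <!ATTLIST finding certainty CDATA #REQUIRED>
      """))

      report = etree.XML(
          b"<report><finding certainty='high'>pleural effusion</finding></report>")

      print(dtd.validate(report))              # True if the fragment conforms
      print(dtd.error_log.filter_from_errors())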

  15. Improve Biomedical Information Retrieval using Modified Learning to Rank Methods.

    PubMed

    Xu, Bo; Lin, Hongfei; Lin, Yuan; Ma, Yunlong; Yang, Liang; Wang, Jian; Yang, Zhihao

    2016-06-14

    In recent years, the number of biomedical articles has increased exponentially, making it difficult for biologists to capture all the needed information manually. Information retrieval technologies, as the core of search engines, can deal with the problem automatically, providing users with the needed information. However, it is a great challenge to apply these technologies directly to biomedical retrieval because of the abundance of domain-specific terminologies. To enhance biomedical retrieval, we propose a novel framework based on learning to rank. Learning to rank is a family of state-of-the-art information retrieval techniques, and has proven effective in many information retrieval tasks. In the proposed framework, we attempt to tackle the problem of the abundance of terminologies by constructing ranking models, which focus not only on retrieving the most relevant documents, but also on diversifying the search results to increase the completeness of the resulting list for a given query. For model training, we propose two novel document labeling strategies, and combine several traditional retrieval models as learning features. We also investigate the usefulness of different learning to rank approaches in our framework. Experimental results on TREC Genomics datasets demonstrate the effectiveness of our framework for biomedical information retrieval.
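
    A toy pointwise learning-to-rank sketch in the spirit of the framework: scores from several traditional retrieval models become features and a linear model is fit to relevance labels. The feature values and labels are fabricated, and the paper's labeling strategies and diversification are not reproduced.

      # Pointwise learning-to-rank over hand-made features (e.g. BM25, LM, and a
      # terminology-overlap score per query-document pair).  All values invented.
      import numpy as np
      from sklearn.linear_model import LogisticRegression

      # rows: [bm25, language_model, mesh_term_overlap]
      X = np.array([[2.1, 0.8, 0.9], [1.7, 0.6, 0.1], [0.4, 0.2, 0.0], [2.5, 0.9, 0.7]])
      y = np.array([1, 0, 0, 1])               # fabricated relevance labels

      ranker = LogisticRegression().fit(X, y)

      candidates = {"doc_a": [1.9, 0.7, 0.8], "doc_b": [0.6, 0.3, 0.1]}
      scores = {d: ranker.predict_proba([f])[0, 1] for d, f in candidates.items()}
      print(sorted(scores, key=scores.get, reverse=True))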

  16. An Evaluation of On-Line Information Retrieval System Techniques.

    ERIC Educational Resources Information Center

    Wolfe, Theodore

    This report presents a review and evaluation of three remote access on-line information retrieval systems and some ideas on what the capabilities of an ideal on-line information retrieval system should be. The three systems reviewed are the DDC Remote On-Line Retrieval System, the National Aeronautics and Space Administration RECON System, and the…

  17. XML-based approaches for the integration of heterogeneous bio-molecular data.

    PubMed

    Mesiti, Marco; Jiménez-Ruiz, Ernesto; Sanz, Ismael; Berlanga-Llavori, Rafael; Perlasca, Paolo; Valentini, Giorgio; Manset, David

    2009-10-15

    Today's public database infrastructure spans a very large collection of heterogeneous biological data, opening new opportunities for molecular biology, bio-medical and bioinformatics research, but also raising new problems for their integration and computational processing. In this paper we survey the most interesting and novel approaches for the representation, integration and management of different kinds of biological data by exploiting XML and the related recommendations and approaches. Moreover, we present new and interesting cutting-edge approaches for the appropriate management of heterogeneous biological data represented through XML. XML has succeeded in the integration of heterogeneous biomolecular information, and has established itself as the syntactic glue for biological data sources. Nevertheless, a large variety of XML-based data formats have been proposed, which makes effective integration of bioinformatics data schemes difficult. The adoption of a few semantically rich standard formats is urgently needed to achieve a seamless integration of the current biological resources.

  18. The Ecosystem of Information Retrieval

    ERIC Educational Resources Information Center

    Rodriguez-Munoz, Jose-Vicente; Martinez-Mendez, Francisco-Javier; Pastor-Sanchez, Juan-Antonio

    2012-01-01

    Introduction: This paper presents an initial proposal for a formal framework that, by studying the metric variables involved in information retrieval, can establish the sequence of events involved and how to perform it. Method: A systematic approach from the equations of Shannon and Weaver to establish the decidability of information retrieval…

  19. Clustering XML Documents Using Frequent Subtrees

    NASA Astrophysics Data System (ADS)

    Kutty, Sangeetha; Tran, Tien; Nayak, Richi; Li, Yuefeng

    This paper presents an experimental study conducted over the INEX 2008 Document Mining Challenge corpus using both the structure and the content of XML documents for clustering them. The concise common substructures known as the closed frequent subtrees are generated using the structural information of the XML documents. The closed frequent subtrees are then used to extract the constrained content from the documents. A matrix containing the term distribution of the documents in the dataset is developed using the extracted constrained content. The k-way clustering algorithm is applied to the matrix to obtain the required clusters. In spite of the large number of documents in the INEX 2008 Wikipedia dataset, the proposed frequent subtree-based clustering approach was successful in clustering the documents. This approach significantly reduces the dimensionality of the terms used for clustering without much loss in accuracy.
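
    A condensed sketch of the final clustering stage: a term-distribution matrix built from (already extracted) constrained content is clustered with a k-way algorithm. The documents are invented, and the frequent-subtree mining that selects the constrained content is assumed to have happened upstream.

      # Cluster documents by the term distribution of their (assumed pre-extracted)
      # constrained content.  Subtree mining itself is not shown here.
      from sklearn.cluster import KMeans
      from sklearn.feature_extraction.text import CountVectorizer

      constrained_content = [
          "france capital paris country europe",
          "germany capital berlin country europe",
          "tennis player grand slam tournament",
          "football player world cup tournament",
      ]

      matrix = CountVectorizer().fit_transform(constrained_content)
      labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(matrix)
      print(labels)   # geography documents vs. sport documents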

  20. Electronic publishing and intelligent information retrieval

    NASA Technical Reports Server (NTRS)

    Heck, A.

    1992-01-01

    Europeans are now taking steps to homogenize policies and standardize procedures in electronic publishing (EP) in astronomy and space sciences. This arose from an open meeting organized in Oct. 1991 at Strasbourg Observatory (France) and another business meeting held late Mar. 1992 with the major publishers and journal editors in astronomy and space sciences. The ultimate aim of EP might be considered to be the so-called 'intelligent information retrieval' (IIR), better named 'advanced information retrieval' (AIR), taking advantage of the fact that the material to be published appears at some stage in a machine-readable form. It is obvious that the combination of desktop and electronic publishing with networking and new structuring of knowledge bases will profoundly reshape not only our ways of publishing, but also our procedures of communicating and retrieving information. It should be noted that a world-wide survey among astronomers and space scientists, carried out before the October 1991 colloquium on the various packages and machines used, indicated that TEX-related packages were already in majority use in our community. It has also been stressed at each meeting that the European developments should be carried out in collaboration with what is done in the US (STELLAR project, for instance). American scientists and journal editors actually attended both meetings mentioned above. The paper will offer a review of the status of electronic publishing in astronomy and its possible contribution to advanced information retrieval in this field. It will also report on recent meetings such as the 'Astronomy from Large Databases-2 (ALD-2)' conference dealing with the latest developments in networking, in data, information, and knowledge bases, as well as in the related methodologies.

  1. QuakeML - An XML Schema for Seismology

    NASA Astrophysics Data System (ADS)

    Wyss, A.; Schorlemmer, D.; Maraini, S.; Baer, M.; Wiemer, S.

    2004-12-01

    We propose an extensible format-definition for seismic data (QuakeML). Sharing data and seismic information efficiently is one of the most important issues for research and observational seismology in the future. The eXtensible Markup Language (XML) is playing an increasingly important role in the exchange of a variety of data. Due to its extensible definition capabilities, its wide acceptance and the existing large number of utilities and libraries for XML, a structured representation of various types of seismological data should in our opinion be developed by defining a 'QuakeML' standard. Here we present the QuakeML definitions for parameter databases and further efforts, e.g. a central QuakeML catalog database and a web portal for exchanging codes and stylesheets.

  2. A tutorial on information retrieval: basic terms and concepts

    PubMed Central

    Zhou, Wei; Smalheiser, Neil R; Yu, Clement

    2006-01-01

    This informal tutorial is intended for investigators and students who would like to understand the workings of information retrieval systems, including the most frequently used search engines: PubMed and Google. Having a basic knowledge of the terms and concepts of information retrieval should improve the efficiency and productivity of searches. As well, this knowledge is needed in order to follow current research efforts in biomedical information retrieval and text mining that are developing new systems not only for finding documents on a given topic, but extracting and integrating knowledge across documents. PMID:16722601

  3. Interfering Effects of Retrieval in Learning New Information

    ERIC Educational Resources Information Center

    Finn, Bridgid; Roediger, Henry L., III

    2013-01-01

    In 7 experiments, we explored the role of retrieval in associative updating, that is, in incorporating new information into an associative memory. We tested the hypothesis that retrieval would facilitate incorporating a new contextual detail into a learned association. Participants learned 3 pieces of information--a person's face, name, and…

  4. Software Development Of XML Parser Based On Algebraic Tools

    NASA Astrophysics Data System (ADS)

    Georgiev, Bozhidar; Georgieva, Adriana

    2011-12-01

    This paper presents the software development and implementation of an algebraic method for XML data processing that accelerates XML parsing. The nontraditional approach proposed in this article for fast XML navigation with algebraic tools contributes to ongoing efforts to build an easier, user-friendly API for XML transformations. The proposed software for XML document processing (parser) is easy to use and can manage files with a strictly defined data structure. The purpose of the presented algorithm is to offer a new approach for searching and restructuring hierarchical XML data. This approach permits fast XML document processing, using an algebraic model developed in detail in previous works by the same authors. The proposed parsing mechanism is easily accessible to the web consumer, who can control XML file processing, search for different elements (tags), delete them, and add new XML content. The various tests presented show higher speed and lower resource consumption in comparison with some existing commercial parsers.
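    The algebra-based parser itself is not reproduced here, but the operations the abstract describes (searching for tags, deleting elements, adding new content) can be illustrated with Python's standard ElementTree module; the catalog structure below is invented for the example.

    ```python
    import xml.etree.ElementTree as ET

    doc = ET.fromstring("<catalog><item id='1'>XML</item><item id='2'>IR</item></catalog>")

    # Search for elements (tags) in the document.
    for item in doc.findall("item"):
        print(item.get("id"), item.text)

    # Delete an element selected by attribute value.
    doc.remove(doc.find("item[@id='2']"))

    # Add new XML content.
    new = ET.SubElement(doc, "item", id="3")
    new.text = "Parsing"

    print(ET.tostring(doc, encoding="unicode"))
    ```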

  5. A semantic medical multimedia retrieval approach using ontology information hiding.

    PubMed

    Guo, Kehua; Zhang, Shigeng

    2013-01-01

    Searching useful information from unstructured medical multimedia data has been a difficult problem in information retrieval. This paper reports an effective semantic medical multimedia retrieval approach which can reflect the users' query intent. Firstly, semantic annotations will be given to the multimedia documents in the medical multimedia database. Secondly, the ontology that represented semantic information will be hidden in the head of the multimedia documents. The main innovations of this approach are cross-type retrieval support and semantic information preservation. Experimental results indicate a good precision and efficiency of our approach for medical multimedia retrieval in comparison with some traditional approaches.

  6. A Semantic Medical Multimedia Retrieval Approach Using Ontology Information Hiding

    PubMed Central

    Guo, Kehua; Zhang, Shigeng

    2013-01-01

    Searching useful information from unstructured medical multimedia data has been a difficult problem in information retrieval. This paper reports an effective semantic medical multimedia retrieval approach which can reflect the users' query intent. Firstly, semantic annotations will be given to the multimedia documents in the medical multimedia database. Secondly, the ontology that represented semantic information will be hidden in the head of the multimedia documents. The main innovations of this approach are cross-type retrieval support and semantic information preservation. Experimental results indicate a good precision and efficiency of our approach for medical multimedia retrieval in comparison with some traditional approaches. PMID:24082915

  7. Feedback in Information Retrieval.

    ERIC Educational Resources Information Center

    Spink, Amanda; Losee, Robert M.

    1996-01-01

    As Information Retrieval (IR) has evolved, it has become a highly interactive process, rooted in cognitive and situational contexts. Consequently the traditional cybernetic-based IR model does not suffice for interactive IR or the human approach to IR. Reviews different views of feedback in IR and their relationship to cybernetic and social…

  8. JANE, A new information retrieval system for the Radiation Shielding Information Center

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Trubey, D.K.

    A new information storage and retrieval system has been developed for the Radiation Shielding Information Center (RSIC) at Oak Ridge National Laboratory to replace mainframe systems that have become obsolete. The database contains citations and abstracts of literature which were selected by RSIC analysts and indexed with terms from a controlled vocabulary. The database, begun in 1963, has been maintained continuously since that time. The new system, called JANE, incorporates automatic indexing techniques and on-line retrieval using the RSIC Data General Eclipse MV/4000 minicomputer. Automatic indexing and retrieval techniques based on fuzzy-set theory allow the presentation of results in order of Retrieval Status Value. The fuzzy-set membership function depends on term frequency in the titles and abstracts and on Term Discrimination Values, which indicate the resolving power of the individual terms. These values are determined by the Cover Coefficient method. The use of a commercial database to store and retrieve the indexing information permits rapid retrieval of the stored documents. Comparisons of the new and presently-used systems for actual searches of the literature indicate that it is practical to replace the mainframe systems with a minicomputer system similar to the present version of JANE. 18 refs., 10 figs.
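    A toy sketch of ranking by a Retrieval Status Value is shown below. It is not JANE: the membership function here is just a normalized term frequency, and the Cover Coefficient term-discrimination values are not reproduced.

    ```python
    from collections import Counter

    docs = {
        "d1": "radiation shielding monte carlo transport code",
        "d2": "fuzzy set retrieval of shielding literature",
    }

    def membership(term, text):
        # Crude stand-in for a fuzzy-set membership function: term frequency
        # normalized by the most frequent term in the title/abstract text.
        counts = Counter(text.split())
        return counts[term] / max(counts.values()) if counts else 0.0

    def rsv(query, text):
        # Retrieval Status Value: sum of memberships of the query terms.
        return sum(membership(t, text) for t in query.split())

    query = "radiation shielding"
    for doc_id, text in sorted(docs.items(), key=lambda kv: rsv(query, kv[1]), reverse=True):
        print(doc_id, round(rsv(query, text), 3))
    ```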

  9. TME2/342: The Role of the EXtensible Markup Language (XML) for Future Healthcare Application Development

    PubMed Central

    Noelle, G; Dudeck, J

    1999-01-01

    Two years since the World Wide Web Consortium (W3C) published the first specification of the eXtensible Markup Language (XML), there are now concrete tools and applications for working with XML-based data. In particular, new-generation Web browsers offer great opportunities to develop new kinds of medical, web-based applications. Several data-exchange formats have been established in medicine in recent years: HL-7, DICOM, EDIFACT and, in the case of Germany, xDT. Whereas communication and information exchange become increasingly important, the development of the appropriate and necessary interfaces causes problems and rising costs and effort. It has also been recognised that it is difficult to define a standardised interchange format for one of the major future developments in medical telematics: the electronic patient record (EPR) and its availability on the Internet. Whereas XML, especially in an industrial environment, is celebrated as a generic standard and a solution for all problems concerning e-commerce, only a few applications have been developed in a medical context. Nevertheless, the medical environment is an appropriate area for building XML applications: as information and communication management becomes increasingly important in medical businesses, the role of the Internet changes quickly from an information to a communication medium. The first XML-based applications in healthcare show the advantages of a future engagement of the healthcare industry in XML: such applications are open, easy to extend and cost-effective. Additionally, XML is much more than a simple new data interchange format: many proposals for data query (XQL), data presentation (XSL) and other extensions have been proposed to the W3C and partly realised in medical applications.

  10. Semantic-Based Information Retrieval of Biomedical Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jiao, Yu; Potok, Thomas E; Hurson, Ali R.

    In this paper, we propose to improve the effectiveness of biomedical information retrieval via a medical thesaurus. We analyzed the deficiencies of the existing medical thesauri and reconstructed a new thesaurus, called MEDTHES, which follows the ANSI/NISO Z39.19-2003 standard. MEDTHES also endows the users with fine-grained control of information retrieval by providing functions to calculate the semantic similarity between words. We demonstrate the usage of MEDTHES through an existing data search engine.
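    MEDTHES's own similarity functions are not specified in the abstract; the sketch below shows one common choice, an inverse-path-length measure over broader-term/narrower-term links, on an invented mini-thesaurus.

    ```python
    from collections import deque

    # Invented broader-term/narrower-term links; not the MEDTHES content.
    edges = {
        "heart disease": ["cardiovascular disease", "myocardial infarction"],
        "cardiovascular disease": ["heart disease", "hypertension"],
        "myocardial infarction": ["heart disease"],
        "hypertension": ["cardiovascular disease"],
    }

    def path_length(a, b):
        # Breadth-first search over the thesaurus links.
        seen, frontier = {a}, deque([(a, 0)])
        while frontier:
            term, dist = frontier.popleft()
            if term == b:
                return dist
            for nxt in edges.get(term, []):
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, dist + 1))
        return None

    def similarity(a, b):
        d = path_length(a, b)
        return 0.0 if d is None else 1.0 / (1.0 + d)

    print(similarity("myocardial infarction", "hypertension"))  # 1 / (1 + 3) = 0.25
    ```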

  11. 15 CFR 950.9 - Computerized Environmental Data and Information Retrieval Service.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... Information Retrieval Service. 950.9 Section 950.9 Commerce and Foreign Trade Regulations Relating to Commerce... Computerized Environmental Data and Information Retrieval Service. The Environmental Data Index (ENDEX... computerized, information retrieval service provides a parallel subject-author-abstract referral service. A...

  12. 15 CFR 950.9 - Computerized Environmental Data and Information Retrieval Service.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... Information Retrieval Service. 950.9 Section 950.9 Commerce and Foreign Trade Regulations Relating to Commerce... Computerized Environmental Data and Information Retrieval Service. The Environmental Data Index (ENDEX... computerized, information retrieval service provides a parallel subject-author-abstract referral service. A...

  13. 15 CFR 950.9 - Computerized Environmental Data and Information Retrieval Service.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... Information Retrieval Service. 950.9 Section 950.9 Commerce and Foreign Trade Regulations Relating to Commerce... Computerized Environmental Data and Information Retrieval Service. The Environmental Data Index (ENDEX... computerized, information retrieval service provides a parallel subject-author-abstract referral service. A...

  14. 15 CFR 950.9 - Computerized Environmental Data and Information Retrieval Service.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... Information Retrieval Service. 950.9 Section 950.9 Commerce and Foreign Trade Regulations Relating to Commerce... Computerized Environmental Data and Information Retrieval Service. The Environmental Data Index (ENDEX... computerized, information retrieval service provides a parallel subject-author-abstract referral service. A...

  15. 15 CFR 950.9 - Computerized Environmental Data and Information Retrieval Service.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... Information Retrieval Service. 950.9 Section 950.9 Commerce and Foreign Trade Regulations Relating to Commerce... Computerized Environmental Data and Information Retrieval Service. The Environmental Data Index (ENDEX... computerized, information retrieval service provides a parallel subject-author-abstract referral service. A...

  16. FPGA implementation of sparse matrix algorithm for information retrieval

    NASA Astrophysics Data System (ADS)

    Bojanic, Slobodan; Jevtic, Ruzica; Nieto-Taladriz, Octavio

    2005-06-01

    Information text data retrieval requires a tremendous amount of processing time because of the size of the data and the complexity of information retrieval algorithms. In this paper a solution to this problem is proposed via hardware-supported information retrieval algorithms. Reconfigurable computing accommodates frequent hardware modifications through its tailorable hardware and exploits parallelism for a given application through reconfigurable and flexible hardware units. The degree of parallelism can be tuned to the data. In this work we implemented the standard BLAS (basic linear algebra subprogram) sparse matrix algorithm named Compressed Sparse Row (CSR), which is shown to be more efficient in terms of storage space requirements and query-processing time than other sparse matrix algorithms for information retrieval applications. Although the inverted index algorithm has been treated as the de facto standard for information retrieval for years, an alternative approach that stores the index of a text collection in a sparse matrix structure is gaining more attention. This approach performs query processing using sparse matrix-vector multiplication and, due to parallelization, achieves substantial efficiency gains over the sequential inverted index. The parallel implementations of the information retrieval kernel presented in this work target the Virtex II Field Programmable Gate Array (FPGA) board from Xilinx. A recent development in scientific applications is the use of FPGAs to achieve high-performance results. Computational results are compared to implementations on other platforms. The design achieves a high level of parallelism for the overall function while retaining highly optimised hardware within the processing unit.
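    The query-processing kernel described above is, in software terms, a sparse matrix-vector product over a CSR index; a minimal sketch is shown below (the FPGA-level parallelization is of course not reproduced, and the weights are invented).

    ```python
    # CSR representation of a 3-document x 4-term weight matrix (invented values).
    values  = [1.0, 2.0, 1.0, 3.0, 1.0]   # non-zero term weights
    col_idx = [0,   2,   1,   2,   3  ]   # term index of each non-zero
    row_ptr = [0, 2, 3, 5]                # start of each document's non-zeros

    query = [1.0, 0.0, 1.0, 0.0]          # dense query vector over the 4 terms

    def csr_matvec(values, col_idx, row_ptr, x):
        # Each document's score is the dot product of its sparse row with the query.
        scores = []
        for d in range(len(row_ptr) - 1):
            s = 0.0
            for k in range(row_ptr[d], row_ptr[d + 1]):
                s += values[k] * x[col_idx[k]]
            scores.append(s)
        return scores

    print(csr_matvec(values, col_idx, row_ptr, query))  # [3.0, 0.0, 3.0]
    ```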

  17. XML Schema Languages: Beyond DTD.

    ERIC Educational Resources Information Center

    Ioannides, Demetrios

    2000-01-01

    Discussion of XML (extensible markup language) and the traditional DTD (document type definition) format focuses on efforts of the World Wide Web Consortium's XML schema working group to develop a schema language to replace DTD that will be capable of defining the set of constraints of any possible data resource. (Contains 14 references.) (LRW)

  18. Contextual Information Drives the Reconsolidation-Dependent Updating of Retrieved Fear Memories

    PubMed Central

    Jarome, Timothy J; Ferrara, Nicole C; Kwapis, Janine L; Helmstetter, Fred J

    2015-01-01

    Stored memories enter a temporary state of vulnerability following retrieval known as ‘reconsolidation', a process that can allow memories to be modified to incorporate new information. Although reconsolidation has become an attractive target for treatment of memories related to traumatic past experiences, we still do not know what new information triggers the updating of retrieved memories. Here, we used biochemical markers of synaptic plasticity in combination with a novel behavioral procedure to determine what was learned during memory reconsolidation under normal retrieval conditions. We eliminated new information during retrieval by manipulating animals' training experience and measured changes in proteasome activity and GluR2 expression in the amygdala, two established markers of fear memory lability and reconsolidation. We found that eliminating new contextual information during the retrieval of memories for predictable and unpredictable fear associations prevented changes in proteasome activity and glutamate receptor expression in the amygdala, indicating that this new information drives the reconsolidation of both predictable and unpredictable fear associations on retrieval. Consistent with this, eliminating new contextual information prior to retrieval prevented the memory-impairing effects of protein synthesis inhibitors following retrieval. These results indicate that under normal conditions, reconsolidation updates memories by incorporating new contextual information into the memory trace. Collectively, these results suggest that controlling contextual information present during retrieval may be a useful strategy for improving reconsolidation-based treatments of traumatic memories associated with anxiety disorders such as post-traumatic stress disorder. PMID:26062788

  19. AEROMETRIC INFORMATION RETRIEVAL SYSTEM (AIRS) - GRAPHICS

    EPA Science Inventory

    Aerometric Information Retrieval System (AIRS) is a computer-based repository of information about airborne pollution in the United States and various World Health Organization (WHO) member countries. AIRS is administered by the U.S. Environmental Protection Agency, and runs on t...

  20. NATIONAL PESTICIDE INFORMATION RETRIEVAL SYSTEM (NPIRS)

    EPA Science Inventory

    The National Pesticide Information Retrieval System (NPIRS) is a collection of pesticide-related databases available through subscription to the Center for Environmental and Regulatory Information Systems, CERIS. The following is a summary of data found in the databases, data sou...

  1. Information Retrieval and the Philosophy of Language.

    ERIC Educational Resources Information Center

    Blair, David C.

    2003-01-01

    Provides an overview of some of the main ideas in the philosophy of language that have relevance to the issues of information retrieval, focusing on the description of the intellectual content. Highlights include retrieval problems; recall and precision; words and meanings; context; externalism and the philosophy of language; and scaffolding and…

  2. XML and E-Journals: The State of Play.

    ERIC Educational Resources Information Center

    Wusteman, Judith

    2003-01-01

    Discusses the introduction of the use of XML (Extensible Markup Language) in publishing electronic journals. Topics include standards, including DTDs (Document Type Definition), or document type definitions; aggregator requirements; SGML (Standard Generalized Markup Language); benefits of XML for e-journals; XML metadata; the possibility of…

  3. SPIRES (Stanford Public Information Retrieval System) 1970-71 Annual Report.

    ERIC Educational Resources Information Center

    Parker, Edwin B.

    SPIRES (Stanford Public Information REtrieval System) is a computer information storage and retrieval system being developed at Stanford University with funding from the National Science Foundation. SPIRES has two major goals: to provide a user-oriented, interactive, on-line retrieval system for a variety of researchers at Stanford; and to support…

  4. Visual working memory buffers information retrieved from visual long-term memory.

    PubMed

    Fukuda, Keisuke; Woodman, Geoffrey F

    2017-05-16

    Human memory is thought to consist of long-term storage and short-term storage mechanisms, the latter known as working memory. Although it has long been assumed that information retrieved from long-term memory is represented in working memory, we lack neural evidence for this and need neural measures that allow us to watch this retrieval into working memory unfold with high temporal resolution. Here, we show that human electrophysiology can be used to track information as it is brought back into working memory during retrieval from long-term memory. Specifically, we found that the retrieval of information from long-term memory was limited to just a few simple objects' worth of information at once, and elicited a pattern of neurophysiological activity similar to that observed when people encode new information into working memory. Our findings suggest that working memory is where information is buffered when being retrieved from long-term memory and reconcile current theories of memory retrieval with classic notions about the memory mechanisms involved.

  5. Artificial Intelligence and Information Retrieval.

    ERIC Educational Resources Information Center

    Teodorescu, Ioana

    1987-01-01

    Compares artificial intelligence and information retrieval paradigms for natural language understanding, reviews progress to date, and outlines the applicability of artificial intelligence to question answering systems. A list of principal artificial intelligence software for database front end systems is appended. (CLB)

  6. Computer Information Retrieval for Journalists.

    ERIC Educational Resources Information Center

    Rodewald, Pam

    1989-01-01

    Discusses the use of computer information retrieval (on-line electronic search methods). Examines advantages and disadvantages of on-line searching versus manual searching. Offers questions to help in the decision to purchase and use on-line searching with students. (MS)

  7. Information retrieval algorithms: A survey

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Raghavan, P.

    We give an overview of some algorithmic problems arising in the representation of text/image/multimedia objects in a form amenable to automated searching, and in conducting these searches efficiently. These operations are central to information retrieval and digital library systems.

  8. Information Retrieval and Criticality in Parity-Time-Symmetric Systems.

    PubMed

    Kawabata, Kohei; Ashida, Yuto; Ueda, Masahito

    2017-11-10

    By investigating information flow between a general parity-time (PT-)symmetric non-Hermitian system and an environment, we find that the complete information retrieval from the environment can be achieved in the PT-unbroken phase, whereas no information can be retrieved in the PT-broken phase. The PT-transition point thus marks the reversible-irreversible criticality of information flow, around which many physical quantities such as the recurrence time and the distinguishability between quantum states exhibit power-law behavior. Moreover, by embedding a PT-symmetric system into a larger Hilbert space so that the entire system obeys unitary dynamics, we reveal that behind the information retrieval lies a hidden entangled partner protected by PT symmetry. Possible experimental situations are also discussed.

  9. Information Retrieval and Criticality in Parity-Time-Symmetric Systems

    NASA Astrophysics Data System (ADS)

    Kawabata, Kohei; Ashida, Yuto; Ueda, Masahito

    2017-11-01

    By investigating information flow between a general parity-time (PT-)symmetric non-Hermitian system and an environment, we find that the complete information retrieval from the environment can be achieved in the PT-unbroken phase, whereas no information can be retrieved in the PT-broken phase. The PT-transition point thus marks the reversible-irreversible criticality of information flow, around which many physical quantities such as the recurrence time and the distinguishability between quantum states exhibit power-law behavior. Moreover, by embedding a PT-symmetric system into a larger Hilbert space so that the entire system obeys unitary dynamics, we reveal that behind the information retrieval lies a hidden entangled partner protected by PT symmetry. Possible experimental situations are also discussed.

  10. XML Based Scientific Data Management Facility

    NASA Technical Reports Server (NTRS)

    Mehrotra, Piyush; Zubair, M.; Ziebartt, John (Technical Monitor)

    2001-01-01

    The World Wide Web consortium has developed an Extensible Markup Language (XML) to support the building of better information management infrastructures. The scientific computing community, realizing the benefits of HTML, has designed markup languages for scientific data. In this paper, we propose an XML-based scientific data management facility, XDMF. The project is motivated by the fact that even though a lot of scientific data is being generated, it is not being shared because of a lack of standards and infrastructure support for discovering and transforming the data. The proposed data management facility can be used to discover the scientific data itself, the transformation functions, and also for applying the required transformations. We have built a prototype system of the proposed data management facility that can work on different platforms. We have implemented the system using Java and the Apache XSLT engine Xalan. To support remote data and transformation functions, we had to extend the XSLT specification and the Xalan package.
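    XDMF itself was implemented in Java with Xalan; the sketch below only illustrates the "apply an XSLT transformation to scientific XML data" step using Python's lxml, with invented element names.

    ```python
    from lxml import etree

    data = etree.XML("<dataset><point t='0'>1.0</point><point t='1'>2.5</point></dataset>")

    stylesheet = etree.XML("""
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output method="text"/>
      <xsl:template match="/dataset">
        <xsl:for-each select="point">
          <xsl:value-of select="@t"/>,<xsl:value-of select="."/><xsl:text>&#10;</xsl:text>
        </xsl:for-each>
      </xsl:template>
    </xsl:stylesheet>
    """)

    transform = etree.XSLT(stylesheet)
    print(str(transform(data)))   # CSV-style text: "0,1.0" and "1,2.5"
    ```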

  11. Software Helps Retrieve Information Relevant to the User

    NASA Technical Reports Server (NTRS)

    Mathe, Natalie; Chen, James

    2003-01-01

    The Adaptive Indexing and Retrieval Agent (ARNIE) is a code library, designed to be used by an application program, that assists human users in retrieving desired information in a hypertext setting. Using ARNIE, the program implements a computational model for interactively learning what information each human user considers relevant in context. The model, called a "relevance network," incrementally adapts retrieved information to users' individual profiles on the basis of feedback from the users regarding specific queries. The model also generalizes such knowledge for subsequent derivation of relevant references for similar queries and profiles, thereby assisting users in filtering information by relevance. ARNIE thus enables users to categorize and share information of interest in various contexts. ARNIE encodes the relevance and structure of information in a neural network dynamically configured with a genetic algorithm. ARNIE maintains an internal database, wherein it saves associations, and from which it returns associated items in response to a query. A C++ compiler for a platform on which ARNIE will be utilized is necessary for creating the ARNIE library but is not necessary for the execution of the software.

  12. Retrieving self-vocalized information: An event-related potential (ERP) study on the effect of retrieval orientation.

    PubMed

    Rosburg, Timm; Johansson, Mikael; Sprondel, Volker; Mecklinger, Axel

    2014-11-18

    Retrieval orientation refers to a pre-retrieval process and conceptualizes the specific form of processing that is applied to a retrieval cue. In the current event-related potential (ERP) study, we sought to find evidence for an involvement of the auditory cortex when subjects attempt to retrieve vocalized information, and hypothesized that adopting retrieval orientation would be beneficial for retrieval accuracy. During study, participants saw object words that they subsequently vocalized or visually imagined. At test, participants had to identify object names of one study condition as targets and to reject object names of the second condition together with new items. Target category switched after half of the test trials. Behaviorally, participants responded less accurately and more slowly to targets of the vocalize condition than to targets of the imagine condition. ERPs to new items varied at a single left electrode (T7) between 500 and 800ms, indicating a moderate retrieval orientation effect in the subject group as a whole. However, whereas the effect was strongly pronounced in participants with high retrieval accuracy, it was absent in participants with low retrieval accuracy. A current source density (CSD) mapping of the retrieval orientation effect indicated a source over left temporal regions. Independently from retrieval accuracy, the ERP retrieval orientation effect was surprisingly also modulated by test order. Findings are suggestive for an involvement of the auditory cortex in retrieval attempts of vocalized information and confirm that adopting retrieval orientation is potentially beneficial for retrieval accuracy. The effects of test order on retrieval-related processes might reflect a stronger focus on the newness of items in the more difficult test condition when participants started with this condition. Copyright © 2014 Elsevier Inc. All rights reserved.

  13. Visual working memory buffers information retrieved from visual long-term memory

    PubMed Central

    Fukuda, Keisuke; Woodman, Geoffrey F.

    2017-01-01

    Human memory is thought to consist of long-term storage and short-term storage mechanisms, the latter known as working memory. Although it has long been assumed that information retrieved from long-term memory is represented in working memory, we lack neural evidence for this and need neural measures that allow us to watch this retrieval into working memory unfold with high temporal resolution. Here, we show that human electrophysiology can be used to track information as it is brought back into working memory during retrieval from long-term memory. Specifically, we found that the retrieval of information from long-term memory was limited to just a few simple objects’ worth of information at once, and elicited a pattern of neurophysiological activity similar to that observed when people encode new information into working memory. Our findings suggest that working memory is where information is buffered when being retrieved from long-term memory and reconcile current theories of memory retrieval with classic notions about the memory mechanisms involved. PMID:28461479

  14. A Practical Introduction to the XML, Extensible Markup Language, by Way of Some Useful Examples

    ERIC Educational Resources Information Center

    Snyder, Robin

    2004-01-01

    XML, Extensible Markup Language, is important as a way to represent and encapsulate the structure of underlying data in a portable way that supports data exchange regardless of the physical storage of the data. This paper (and session) introduces some useful and practical aspects of XML technology for sharing information in an educational setting…

  15. 45 CFR 205.35 - Mechanized claims processing and information retrieval systems; definitions.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... claims processing and information retrieval systems; definitions. Section 205.35 through 205.38 contain...: (a) A mechanized claims processing and information retrieval system, hereafter referred to as an automated application processing and information retrieval system (APIRS), or the system, means a system of...

  16. 45 CFR 205.35 - Mechanized claims processing and information retrieval systems; definitions.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... claims processing and information retrieval systems; definitions. Section 205.35 through 205.38 contain...: (a) A mechanized claims processing and information retrieval system, hereafter referred to as an automated application processing and information retrieval system (APIRS), or the system, means a system of...

  17. 45 CFR 205.35 - Mechanized claims processing and information retrieval systems; definitions.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... claims processing and information retrieval systems; definitions. Section 205.35 through 205.38 contain...: (a) A mechanized claims processing and information retrieval system, hereafter referred to as an automated application processing and information retrieval system (APIRS), or the system, means a system of...

  18. Information retrieval pathways for health information exchange in multiple care settings.

    PubMed

    Kierkegaard, Patrick; Kaushal, Rainu; Vest, Joshua R

    2014-11-01

    To determine which health information exchange (HIE) technologies and information retrieval pathways healthcare professionals relied on to meet their information needs in the context of laboratory test results, radiological images and reports, and medication histories. Primary data was collected over a 2-month period across 3 emergency departments, 7 primary care practices, and 2 public health clinics in New York state. Qualitative research methods were used to collect and analyze data from semi-structured interviews and participant observation. The study reveals that healthcare professionals used a complex combination of information retrieval pathways for HIE to obtain clinical information from external organizations. The choice for each approach was setting- and information-specific, but was also highly dynamic across users and their information needs. Our findings about the complex nature of information sharing in healthcare provide insights for informatics professionals about the usage of information; indicate the need for managerial support within each organization; and suggest approaches to improve systems for organizations and agencies working to expand HIE adoption.

  19. Generic Information Can Retrieve Known Biological Associations: Implications for Biomedical Knowledge Discovery

    PubMed Central

    van Haagen, Herman H. H. B. M.; 't Hoen, Peter A. C.; Mons, Barend; Schultes, Erik A.

    2013-01-01

    Motivation: Weighted semantic networks built from text-mined literature can be used to retrieve known protein-protein or gene-disease associations, and have been shown to anticipate associations years before they are explicitly stated in the literature. Our text-mining system recognizes over 640,000 biomedical concepts: some are specific (i.e., names of genes or proteins), others generic (e.g., ‘Homo sapiens’). Generic concepts may play important roles in automated information retrieval, extraction, and inference but may also result in concept overload and confound retrieval and reasoning with low-relevance or even spurious links. Here, we attempted to optimize the retrieval performance for protein-protein interactions (PPI) by filtering generic concepts (node filtering) or links to generic concepts (edge filtering) from a weighted semantic network. First, we defined metrics based on network properties that quantify the specificity of concepts. Then using these metrics, we systematically filtered generic information from the network while monitoring retrieval performance of known protein-protein interactions. We also systematically filtered specific information from the network (inverse filtering), and assessed the retrieval performance of networks composed of generic information alone. Results: Filtering generic or specific information induced a two-phase response in retrieval performance: initially the effects of filtering were minimal but beyond a critical threshold network performance suddenly drops. Contrary to expectations, networks composed exclusively of generic information demonstrated retrieval performance comparable to unfiltered networks that also contain specific concepts. Furthermore, an analysis using individual generic concepts demonstrated that they can effectively support the retrieval of known protein-protein interactions. For instance the concept “binding” is indicative for PPI retrieval and the concept “mutation abnormality” is
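    A toy illustration of node filtering is sketched below using networkx; degree is used as a crude stand-in for the specificity metrics defined in the paper, and the tiny graph is invented.

    ```python
    import networkx as nx

    G = nx.Graph()
    G.add_weighted_edges_from([
        ("BRCA1", "binding", 0.8),
        ("BARD1", "binding", 0.7),
        ("BRCA1", "Homo sapiens", 0.2),
        ("BARD1", "Homo sapiens", 0.2),
        ("TP53",  "Homo sapiens", 0.3),
    ])

    # Node filtering: drop concepts whose degree exceeds a threshold (a crude
    # proxy for "generic"); here only "Homo sapiens" is removed.
    generic = [n for n in G if G.degree(n) > 2]
    G.remove_nodes_from(generic)

    # Is a known protein-protein association still retrievable afterwards?
    print(nx.has_path(G, "BRCA1", "BARD1"))  # True, via the "binding" concept
    ```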

  20. Integration of Information Retrieval and Database Management Systems.

    ERIC Educational Resources Information Center

    Deogun, Jitender S.; Raghavan, Vijay V.

    1988-01-01

    Discusses the motivation for integrating information retrieval and database management systems, and proposes a probabilistic retrieval model in which records in a file may be composed of attributes (formatted data items) and descriptors (content indicators). The details and resolutions of difficulties involved in integrating such systems are…

  1. Combining approaches to on-line handwriting information retrieval

    NASA Astrophysics Data System (ADS)

    Peña Saldarriaga, Sebastián; Viard-Gaudin, Christian; Morin, Emmanuel

    2010-01-01

    In this work, we propose to combine two quite different approaches for retrieving handwritten documents. Our hypothesis is that different retrieval algorithms should retrieve different sets of documents for the same query; therefore, significant improvements in retrieval performance can be expected. The first approach is based on information retrieval techniques carried out on the noisy texts obtained through handwriting recognition, while the second approach is recognition-free, using a word spotting algorithm. Results show that for texts having a word error rate (WER) lower than 23%, the performance obtained with the combined system is close to the performance obtained on clean digital texts. In addition, for poorly recognized texts (WER > 52%), an improvement of nearly 17% can be observed with respect to the best available baseline method.
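    The abstract does not give the exact combination rule, so the sketch below shows a generic weighted score fusion of the two runs (recognition-based retrieval and word spotting) after per-run normalization; the scores are invented.

    ```python
    def normalize(run):
        # Scale scores into [0, 1] so the two runs are comparable.
        hi = max(run.values()) or 1.0
        return {doc: score / hi for doc, score in run.items()}

    def combine(run_a, run_b, alpha=0.5):
        a, b = normalize(run_a), normalize(run_b)
        return {d: alpha * a.get(d, 0.0) + (1 - alpha) * b.get(d, 0.0)
                for d in set(a) | set(b)}

    ir_run       = {"doc1": 2.1, "doc2": 1.4}   # retrieval on recognized (noisy) text
    spotting_run = {"doc2": 0.9, "doc3": 0.7}   # recognition-free word spotting

    fused = combine(ir_run, spotting_run, alpha=0.6)
    print(sorted(fused.items(), key=lambda kv: kv[1], reverse=True))
    ```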

  2. Information Retrieval on social network: An Adaptive Proof

    NASA Astrophysics Data System (ADS)

    Elveny, M.; Syah, R.; Elfida, M.; Nasution, M. K. M.

    2018-01-01

    Information Retrieval has become one of the areas studied in order to obtain trustworthy information, with recall and precision as the measures that represent it. Nevertheless, developments in certain scientific fields make it possible to improve the performance of Information Retrieval, in this case through social networks, where the degree of a social actor plays a role. This is an implication of the query, in which co-occurrence becomes an indication of the social network. We use an adaptive approach that involves this query in sequence with a stand-alone query, and it has proven the relationship between them.

  3. Information Retrieval Systems Retrieved? An Alternative to Present Dial Access Systems

    ERIC Educational Resources Information Center

    Hofmann, Norbert

    1976-01-01

    The expense of a dial access information retrieval system (DIARS) is weighed against its benefits. Problems of usage and efficacy for the student are outlined. A fully automated system is proposed instead, and its cost-saving features are pointed out. (MS)

  4. Information Retrieval in Virtual Universities

    ERIC Educational Resources Information Center

    Puustjärvi, Juha; Pöyry, Päivi

    2006-01-01

    Information retrieval in the context of virtual universities deals with the representation, organization, and access to learning objects. The representation and organization of learning objects should provide the learner with an easy access to the learning objects. In this article, we give an overview of the ONES system, and analyze the relevance…

  5. Distributed retrieval practice promotes superior recall of anatomy information.

    PubMed

    Dobson, John L; Perez, Jose; Linderholm, Tracy

    2017-07-01

    Effortful retrieval produces greater long-term recall of information when compared to studying (i.e., reading), as do learning sessions that are distributed (i.e., spaced apart) when compared to those that are massed together. Although the retrieval and distributed practice effects are well-established in the cognitive science literature, no studies have examined their additive effect with regard to learning anatomy information. The aim of this study was to determine how the benefits of retrieval practice vary with massed versus distributed learning. Participants used the following strategies to learn sets of skeletal muscle anatomy: (1) studying on three different days over a seven day period (SSSS7,2,0), (2) studying and retrieving on three different days over a seven day period (SRSR7,2,0), (3) studying on two different days over a two day period (SSSSSS2,0), (4) studying and retrieving on two separate days over a two day period (SRSRSR2,0), and (5) studying and retrieving on one day (SRx60). All strategies consisted of 12 learning phases and lasted exactly 24 minutes. Muscle information retention was assessed via free recall and using repeated measures ANOVAs. A week after learning, the recall scores were 24.72 ± 3.12, 33.88 ± 3.48, 15.51 ± 2.48, 20.72 ± 2.94, and 12.86 ± 2.05 for the SSSS7,2,0, SRSR7,2,0, SSSSSS2,0, SRSRSR2,0, and SRx60 strategies, respectively. In conclusion, the distributed strategies produced significantly better recall than the massed strategies, the retrieval-based strategies produced significantly better recall than the studying strategies, and the combination of distributed and retrieval practice generated the greatest recall of anatomy information. Anat Sci Educ 10: 339-347. © 2016 American Association of Anatomists.

  6. Music information retrieval in compressed audio files: a survey

    NASA Astrophysics Data System (ADS)

    Zampoglou, Markos; Malamos, Athanasios G.

    2014-07-01

    In this paper, we present an organized survey of the existing literature on music information retrieval systems in which descriptor features are extracted directly from the compressed audio files, without prior decompression to pulse-code modulation format. Avoiding the decompression step and utilizing the readily available compressed-domain information can significantly lighten the computational cost of a music information retrieval system, allowing application to large-scale music databases. We identify a number of systems relying on compressed-domain information and form a systematic classification of the features they extract, the retrieval tasks they tackle and the degree to which they achieve an actual increase in the overall speed, as well as any resulting loss in accuracy. Finally, we discuss recent developments in the field, and the potential research directions they open toward ultra-fast, scalable systems.

  7. A Personalized Health Information Retrieval System

    PubMed Central

    Wang, Yunli; Liu, Zhenkai

    2005-01-01

    Consumers face barriers when seeking health information on the Internet. A Personalized Health Information Retrieval System (PHIRS) is proposed to recommend health information for consumers. The system consists of four modules: (1) User modeling module captures user’s preference and health interests; (2) Automatic quality filtering module identifies high quality health information; (3) Automatic text difficulty rating module classifies health information into professional or patient educational materials; and (4) User profile matching module tailors health information for individuals. The initial results show that PHIRS could assist consumers with simple search strategies. PMID:16779435

  8. Information Retrieval (SPIRES) and Library Automation (BALLOTS) at Stanford University.

    ERIC Educational Resources Information Center

    Ferguson, Douglas, Ed.

    At Stanford University, two major projects have been involved jointly in library automation and information retrieval since 1968: BALLOTS (Bibliographic Automation of Large Library Operations) and SPIRES (Stanford Physics Information Retrieval System). In early 1969, two prototype applications were activated using the jointly developed systems…

  9. Topic Models in Information Retrieval

    DTIC Science & Technology

    2007-08-01

  10. Parallel interactive retrieval of item and associative information from event memory.

    PubMed

    Cox, Gregory E; Criss, Amy H

    2017-09-01

    Memory contains information about individual events (items) and combinations of events (associations). Despite the fundamental importance of this distinction, it remains unclear exactly how these two kinds of information are stored and whether different processes are used to retrieve them. We use both model-independent qualitative properties of response dynamics and quantitative modeling of individuals to address these issues. Item and associative information are not independent and they are retrieved concurrently via interacting processes. During retrieval, matching item and associative information mutually facilitate one another to yield an amplified holistic signal. Modeling of individuals suggests that this kind of facilitation between item and associative retrieval is a ubiquitous feature of human memory. Copyright © 2017 Elsevier Inc. All rights reserved.

  11. A Prototype of an Intelligent System for Information Retrieval: IOTA.

    ERIC Educational Resources Information Center

    Chiaramella, Y.; Defude, B.

    1987-01-01

    Discusses expert systems and their value as components of information retrieval systems related to semantic inference, and describes IOTA, a model of an intelligent information retrieval system which emphasizes natural language query processing. Experimental results are discussed and current and future developments are highlighted. (Author/LRW)

  12. XML and its impact on content and structure in electronic health care documents.

    PubMed Central

    Sokolowski, R.; Dudeck, J.

    1999-01-01

    Worldwide information networks have the requirement that electronic documents must be easily accessible, portable, flexible and system-independent. With the development of XML (eXtensible Markup Language), the future of electronic documents, health care informatics and the Web itself are about to change. The intent of the recently formed ASTM E31.25 subcommittee, "XML DTDs for Health Care", is to develop standard electronic document representations of paper-based health care documents and forms. A goal of the subcommittee is to work together to enhance existing levels of interoperability among the various XML/SGML standardization efforts, products and systems in health care. The ASTM E31.25 subcommittee uses common practices and software standards to develop the implementation recommendations for XML documents in health care. The implementation recommendations are being developed to standardize the many different structures of documents. These recommendations are in the form of a set of standard DTDs, or document type definitions that match the electronic document requirements in the health care industry. This paper discusses recent efforts of the ASTM E31.25 subcommittee. PMID:10566338
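    The ASTM DTDs themselves are not shown here; the sketch below only demonstrates the general mechanism of validating a health-care document against a DTD, using lxml and an invented discharge-summary structure.

    ```python
    from io import StringIO
    from lxml import etree

    # Invented DTD and elements; not the ASTM E31.25 definitions.
    dtd = etree.DTD(StringIO("""
    <!ELEMENT summary (patient, diagnosis)>
    <!ELEMENT patient (#PCDATA)>
    <!ELEMENT diagnosis (#PCDATA)>
    """))

    doc = etree.XML("<summary><patient>Jane Doe</patient><diagnosis>stable</diagnosis></summary>")
    print(dtd.validate(doc))   # True if the document matches the declared structure
    ```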

  13. Prototyping a Distributed Information Retrieval System That Uses Statistical Ranking.

    ERIC Educational Resources Information Center

    Harman, Donna; And Others

    1991-01-01

    Built using a distributed architecture, this prototype distributed information retrieval system uses statistical ranking techniques to provide better service to the end user. Distributed architecture was shown to be a feasible alternative to centralized or CD-ROM information retrieval, and user testing of the ranking methodology showed both…

  14. Query-Time Optimization Techniques for Structured Queries in Information Retrieval

    ERIC Educational Resources Information Center

    Cartright, Marc-Allen

    2013-01-01

    The use of information retrieval (IR) systems is evolving towards larger, more complicated queries. Both the IR industrial and research communities have generated significant evidence indicating that in order to continue improving retrieval effectiveness, increases in retrieval model complexity may be unavoidable. From an operational perspective,…

  15. Improving biomedical information retrieval by linear combinations of different query expansion techniques.

    PubMed

    Abdulla, Ahmed AbdoAziz Ahmed; Lin, Hongfei; Xu, Bo; Banbhrani, Santosh Kumar

    2016-07-25

    Biomedical literature retrieval is becoming increasingly complex, and there is a fundamental need for advanced information retrieval systems. Information Retrieval (IR) programs scour unstructured materials such as text documents in large reserves of data that are usually stored on computers. IR is related to the representation, storage, and organization of information items, as well as to access. In IR, one of the main problems is to determine which documents are relevant to the user's needs and which are not. Under the current regime, users cannot construct queries precisely enough to retrieve particular pieces of data from large reserves of data. Basic information retrieval systems are producing low-quality search results. In the system proposed in this paper, we present a new technique to refine Information Retrieval searches so that they better represent the user's information need and thereby enhance retrieval performance, using different query expansion techniques and applying linear combinations between them, where each combination linearly merges two expansion results at a time. Query expansions expand the search query, for example, by finding synonyms and reweighting original terms. They provide significantly more focused, particularized search results than do basic search queries. The retrieval performance is measured by some variants of MAP (Mean Average Precision), and according to our experimental results, the combination of the best query expansion results enhanced the retrieved documents and outperformed our baseline by 21.06 %, and even outperformed a previous study by 7.12 %. We propose several query expansion techniques and their linear combinations to make user queries more comprehensible to search engines and to produce higher-quality search results.
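    The sketch below shows the general form of a linear combination of two query-expansion term sets (for example, thesaurus synonyms and pseudo-relevance-feedback terms); the terms, weights, and mixing parameter are illustrative, not the paper's.

    ```python
    def linear_combination(exp_a, exp_b, lam=0.5):
        # Weighted merge of two expanded term sets into a single weighted query.
        terms = set(exp_a) | set(exp_b)
        return {t: lam * exp_a.get(t, 0.0) + (1 - lam) * exp_b.get(t, 0.0) for t in terms}

    synonym_expansion = {"myocardial infarction": 1.0, "heart attack": 0.8}
    prf_expansion     = {"myocardial infarction": 0.9, "troponin": 0.6}

    expanded_query = linear_combination(synonym_expansion, prf_expansion, lam=0.6)
    print(expanded_query)
    ```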

  16. Information Model Translation to Support a Wider Science Community

    NASA Astrophysics Data System (ADS)

    Hughes, John S.; Crichton, Daniel; Ritschel, Bernd; Hardman, Sean; Joyner, Ronald

    2014-05-01

    The Planetary Data System (PDS), NASA's long-term archive for solar system exploration data, has just released PDS4, a modernization of the PDS architecture, data standards, and technical infrastructure. This next generation system positions the PDS to meet the demands of the coming decade, including big data, international cooperation, distributed nodes, and multiple ways of analysing and interpreting data. It also addresses three fundamental project goals: providing more efficient data delivery by data providers to the PDS, enabling a stable, long-term usable planetary science data archive, and enabling services for the data consumer to find, access, and use the data they require in contemporary data formats. The PDS4 information architecture is used to describe all PDS data using a common model. Captured in an ontology modeling tool it supports a hierarchy of data dictionaries built to the ISO/IEC 11179 standard and is designed to increase flexibility, enable complex searches at the product level, and to promote interoperability that facilitates data sharing both nationally and internationally. A PDS4 information architecture design requirement stipulates that the content of the information model must be translatable to external data definition languages such as XML Schema, XMI/XML, and RDF/XML. To support the semantic Web standards we are now in the process of mapping the contents into RDF/XML to support SPARQL capable databases. We are also building a terminological ontology to support virtually unified data retrieval and access. This paper will provide an overview of the PDS4 information architecture focusing on its domain information model and how the translation and mapping are being accomplished.
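    A small illustration of the "translate to RDF/XML and query with SPARQL" idea is given below using rdflib; the namespace, class, and property names are placeholders, not the actual PDS4 information model.

    ```python
    from rdflib import Graph, Literal, Namespace, RDF

    PDS = Namespace("http://example.org/pds4#")   # placeholder namespace
    g = Graph()
    g.add((PDS.obs1, RDF.type, PDS.Product_Observational))
    g.add((PDS.obs1, PDS.targetName, Literal("Mars")))

    print(g.serialize(format="xml"))              # RDF/XML rendering of the graph

    # Product-level search via SPARQL.
    results = g.query("""
        PREFIX pds: <http://example.org/pds4#>
        SELECT ?p WHERE { ?p a pds:Product_Observational ; pds:targetName "Mars" . }
    """)
    for row in results:
        print(row.p)
    ```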

  17. XML Content Finally Arrives on the Web!

    ERIC Educational Resources Information Center

    Funke, Susan

    1998-01-01

    Explains extensible markup language (XML) and how it differs from hypertext markup language (HTML) and standard generalized markup language (SGML). Highlights include features of XML, including better formatting of documents, better searching capabilities, multiple uses for hyperlinking, and an increase in Web applications; Web browsers; and what…

  18. Improving life sciences information retrieval using semantic web technology.

    PubMed

    Quan, Dennis

    2007-05-01

    The ability to retrieve relevant information is at the heart of every aspect of research and development in the life sciences industry. Information is often distributed across multiple systems and recorded in a way that makes it difficult to piece together the complete picture. Differences in data formats, naming schemes and network protocols amongst information sources, both public and private, must be overcome, and user interfaces not only need to be able to tap into these diverse information sources but must also assist users in filtering out extraneous information and highlighting the key relationships hidden within an aggregated set of information. The Semantic Web community has made great strides in proposing solutions to these problems, and many efforts are underway to apply Semantic Web techniques to the problem of information retrieval in the life sciences space. This article gives an overview of the principles underlying a Semantic Web-enabled information retrieval system: creating a unified abstraction for knowledge using the RDF semantic network model; designing semantic lenses that extract contextually relevant subsets of information; and assembling semantic lenses into powerful information displays. Furthermore, concrete examples of how these principles can be applied to life science problems including a scenario involving a drug discovery dashboard prototype called BioDash are provided.

  19. Current Research into Chemical and Textual Information Retrieval at the Department of Information Studies, University of Sheffield.

    ERIC Educational Resources Information Center

    Lynch, Michael F.; Willett, Peter

    1987-01-01

    Discusses research into chemical information and document retrieval systems at the University of Sheffield. Highlights include the use of cluster analysis methods for document retrieval and drug design, representation and searching of files of generic chemical structures, and the application of parallel computer hardware to information retrieval.…

  20. Problems and challenges in patient information retrieval: a descriptive study.

    PubMed Central

    Kogan, S.; Zeng, Q.; Ash, N.; Greenes, R. A.

    2001-01-01

    Many patients now turn to the Web for health care information. However, a lack of domain knowledge and unfamiliarity with medical vocabulary and concepts restrict their ability to successfully obtain information they seek. The purpose of this descriptive study was to identify and classify the problems a patient encounters while performing information retrieval tasks on the Web, and the challenges it poses to informatics research. In this study, we observed patients performing various retrieval tasks, and measured the effectiveness of, satisfaction with, and usefulness of the results. Our study showed that patient information retrieval often failed to produce successful results due to a variety of problems. We propose a classification of patient IR problems based on our observations. PMID:11825205

  1. Roogle: an information retrieval engine for clinical data warehouse.

    PubMed

    Cuggia, Marc; Garcelon, Nicolas; Campillo-Gimenez, Boris; Bernicot, Thomas; Laurent, Jean-François; Garin, Etienne; Happe, André; Duvauferrier, Régis

    2011-01-01

    A large amount of relevant information is contained in the reports stored in electronic patient records and in their associated metadata. Roogle is a project aiming to develop information retrieval engines adapted to these reports and designed for clinicians. The system consists of a data warehouse (full-text reports and structured data) imported from two different hospital information systems. Information retrieval is performed using metadata-based semantic search and full-text search methods (as in Google). Applications include biomarker identification in a translational approach, searching for specific cases, constitution of cohorts, professional practice evaluation, and quality control assessment.

  2. Speech-recognition interfaces for music information retrieval

    NASA Astrophysics Data System (ADS)

    Goto, Masataka

    2005-09-01

    This paper describes two hands-free music information retrieval (MIR) systems that enable a user to retrieve and play back a musical piece by saying its title or the artist's name. Although various interfaces for MIR have been proposed, speech-recognition interfaces suitable for retrieving musical pieces have not been studied. Our MIR-based jukebox systems employ two different speech-recognition interfaces for MIR, speech completion and speech spotter, which exploit intentionally controlled nonverbal speech information in original ways. The first is a music retrieval system with the speech-completion interface that is suitable for music stores and car-driving situations. When a user only remembers part of the name of a musical piece or an artist and utters only a remembered fragment, the system helps the user recall and enter the name by completing the fragment. The second is a background-music playback system with the speech-spotter interface that can enrich human-human conversation. When a user is talking to another person, the system allows the user to enter voice commands for music playback control by spotting a special voice-command utterance in face-to-face or telephone conversations. Experimental results from use of these systems have demonstrated the effectiveness of the speech-completion and speech-spotter interfaces. (Video clips: http://staff.aist.go.jp/m.goto/MIR/speech-if.html)
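    The speech recognition itself is beyond a short example, but the completion step has a simple core: matching a recognized fragment against a catalog of titles and artist names. The toy stand-in below is not the authors' system.

    ```python
    catalog = ["Symphony No. 9", "Rhapsody in Blue", "Blue in Green"]

    def complete(fragment, titles):
        # Return every catalog entry containing the recognized fragment.
        frag = fragment.lower()
        return [t for t in titles if frag in t.lower()]

    print(complete("blue", catalog))   # ['Rhapsody in Blue', 'Blue in Green']
    ```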

  3. Guidelines for Teachers of Online Information Retrieval.

    ERIC Educational Resources Information Center

    Wood, F. E.

    These guidelines are intended to provide realistic and practical guidance about the options available to teachers and planners of education and training programs wholly or partly concerned with online information retrieval, particularly those in academic departments of library and information studies. Seven sections address: (1) the aims and scope…

  4. An XML-Based Protocol for Distributed Event Services

    NASA Technical Reports Server (NTRS)

    Smith, Warren; Gunter, Dan; Quesnel, Darcy; Biegel, Bryan (Technical Monitor)

    2001-01-01

    A recent trend in distributed computing is the construction of high-performance distributed systems called computational grids. One difficulty we have encountered is that there is no standard format for the representation of performance information and no standard protocol for transmitting this information. This limits the types of performance analysis that can be undertaken in complex distributed systems. To address this problem, we present an XML-based protocol for transmitting performance events in distributed systems and evaluate the performance of this protocol.
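
    A minimal sketch of what an XML-encoded performance event might look like, using Python's standard library; the element and attribute names below are illustrative assumptions, not the vocabulary defined by the protocol in the paper.

      # Build and parse a hypothetical XML performance event.
      # Element/attribute names (event, source, time, name, value) are assumed.
      import xml.etree.ElementTree as ET

      def make_event(source: str, name: str, value: float, timestamp: str) -> bytes:
          event = ET.Element("event", attrib={"source": source, "time": timestamp})
          ET.SubElement(event, "name").text = name
          ET.SubElement(event, "value").text = str(value)
          return ET.tostring(event, encoding="utf-8")

      payload = make_event("node42.grid.example", "cpu.load", 0.73, "2001-01-01T12:00:00Z")
      parsed = ET.fromstring(payload)
      print(parsed.get("source"), parsed.findtext("name"), parsed.findtext("value"))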

  5. Specifics on a XML Data Format for Scientific Data

    NASA Astrophysics Data System (ADS)

    Shaya, E.; Thomas, B.; Cheung, C.

    An XML-based data format for interchange and archiving of scientific data would benefit in many ways from the features standardized in XML. Foremost of these features is the world-wide acceptance and adoption of XML. Applications, such as browsers, XQL and XSQL advanced query, XML editing, or CSS or XSLT transformation, that are coming out of industry and academia can be easily adopted and provide startling new benefits and features. We have designed a prototype of a core format for holding, in a very general way, parameters, tables, scalar and vector fields, atlases, animations and complex combinations of these. This eXtensible Data Format (XDF) makes use of XML functionalities such as: self-validation of document structure, default values for attributes, XLink hyperlinks, entity replacements, internal referencing, inheritance, and XSLT transformation. An API is available to aid in detailed assembly, extraction, and manipulation. Conversion tools to and from FITS and other existing data formats are under development. In the future, we hope to provide object oriented interfaces to C++, Java, Python, IDL, Mathematica, Maple, and various databases. http://xml.gsfc.nasa.gov/XDF
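
    To illustrate the kind of content such a format is meant to carry, the sketch below parses a small XDF-like document holding one parameter and one table; the tag names are simplified assumptions for illustration, not the official XDF schema.

      # Parse an XDF-like document and pull out a parameter and a small table.
      # Tag names are assumed, not taken from the XDF specification.
      import xml.etree.ElementTree as ET

      doc = """<XDF name="observation">
        <parameter name="exposure" units="s">120</parameter>
        <table name="fluxes">
          <row><cell>1.2</cell><cell>3.4</cell></row>
          <row><cell>5.6</cell><cell>7.8</cell></row>
        </table>
      </XDF>"""

      root = ET.fromstring(doc)
      exposure = float(root.find("parameter[@name='exposure']").text)
      rows = [[float(c.text) for c in row.findall("cell")]
              for row in root.find("table").findall("row")]
      print(exposure, rows)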

  6. A comparison of database systems for XML-type data.

    PubMed

    Risse, Judith E; Leunissen, Jack A M

    2010-01-01

    In the field of bioinformatics interchangeable data formats based on XML are widely used. XML-type data is also at the core of most web services. With the increasing amount of data stored in XML comes the need for storing and accessing the data. In this paper we analyse the suitability of different database systems for storing and querying large datasets in general and Medline in particular. All reviewed database systems perform well when tested with small to medium sized datasets, however when the full Medline dataset is queried a large variation in query times is observed. There is not one system that is vastly superior to the others in this comparison and, depending on the database size and the query requirements, different systems are most suitable. The best all-round solution is the Oracle 11g database system using the new binary storage option. Alias-i's Lingpipe is a more lightweight, customizable and sufficiently fast solution. It does, however, require more initial configuration steps. For data with a changing XML structure Sedna and BaseX as native XML database systems or MySQL with an XML-type column are suitable.
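
    The queries benchmarked in such comparisons are typically path expressions over Medline citation records. The sketch below runs one such query in memory with ElementTree's limited XPath support, purely to show the query shape; the actual study issues comparable queries against full database systems, and the fragment here is a toy abbreviation of the Medline record structure.

      # Find titles mentioning "XML" in a tiny Medline-like fragment.
      import xml.etree.ElementTree as ET

      records = ET.fromstring("""<MedlineCitationSet>
        <MedlineCitation><PMID>1</PMID>
          <Article><ArticleTitle>XML storage benchmarks</ArticleTitle></Article>
        </MedlineCitation>
        <MedlineCitation><PMID>2</PMID>
          <Article><ArticleTitle>Protein folding</ArticleTitle></Article>
        </MedlineCitation>
      </MedlineCitationSet>""")

      titles = [t.text for t in records.findall(".//ArticleTitle") if "XML" in t.text]
      print(titles)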

  7. Expert Search Strategies: The Information Retrieval Practices of Healthcare Information Professionals

    PubMed Central

    2017-01-01

    Background Healthcare information professionals play a key role in closing the knowledge gap between medical research and clinical practice. Their work involves meticulous searching of literature databases using complex search strategies that can consist of hundreds of keywords, operators, and ontology terms. This process is prone to error and can lead to inefficiency and bias if performed incorrectly. Objective The aim of this study was to investigate the search behavior of healthcare information professionals, uncovering their needs, goals, and requirements for information retrieval systems. Methods A survey was distributed to healthcare information professionals via professional association email discussion lists. It investigated the search tasks they undertake, their techniques for search strategy formulation, their approaches to evaluating search results, and their preferred functionality for searching library-style databases. The popular literature search system PubMed was then evaluated to determine the extent to which their needs were met. Results The 107 respondents indicated that their information retrieval process relied on the use of complex, repeatable, and transparent search strategies. On average it took 60 minutes to formulate a search strategy, with a search task taking 4 hours and consisting of 15 strategy lines. Respondents reviewed a median of 175 results per search task, far more than they would ideally like (100). The most desired features of a search system were merging search queries and combining search results. Conclusions Healthcare information professionals routinely address some of the most challenging information retrieval problems of any profession. However, their needs are not fully supported by current literature search systems and there is demand for improved functionality, in particular regarding the development and management of search strategies. PMID:28970190

  8. Modeling the Arden Syntax for medical decisions in XML.

    PubMed

    Kim, Sukil; Haug, Peter J; Rocha, Roberto A; Choi, Inyoung

    2008-10-01

    A new model expressing the Arden Syntax with the eXtensible Markup Language (XML) was developed to increase its portability. Every example was manually parsed and reviewed until the schema and the style sheet were considered to be optimized. When the first schema was finished, several MLMs in Arden Syntax Markup Language (ArdenML) were validated against the schema. They were then transformed to HTML format with the style sheet, during which they were compared to the original text versions of the MLMs. When faults were found in a transformed MLM, the schema and/or style sheet was fixed. This cycle continued until all the examples were encoded into XML documents. The original MLMs were encoded in XML according to the proposed XML schema, and reverse-parsed MLMs in ArdenML were checked using a public-domain Arden Syntax checker. Two hundred seventy-seven examples of MLMs were successfully transformed into XML documents using the model, and the reverse parse yielded the original text versions of the MLMs. Two hundred sixty-five of the 277 MLMs showed the same error patterns before and after transformation, and all 11 errors related to statement structure were resolved in the XML version. The model uses two syntax-checking mechanisms: first, an XML validation process, and second, a syntax check using an XSL style sheet. Now that we have a schema for ArdenML, we can also begin the development of style sheets for transforming ArdenML into other languages.
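
    A minimal sketch of the two checking mechanisms described above, using the lxml library: schema validation of an ArdenML-encoded MLM followed by an XSLT transformation to HTML. The file names are hypothetical placeholders.

      # Two-step check: XSD validation, then XSLT rendering of an ArdenML MLM.
      # File names are placeholders for the schema, style sheet, and MLM document.
      from lxml import etree

      schema = etree.XMLSchema(etree.parse("ardenml.xsd"))
      stylesheet = etree.XSLT(etree.parse("ardenml_to_html.xsl"))

      mlm = etree.parse("my_mlm.xml")
      if schema.validate(mlm):            # first mechanism: XML validation
          html = stylesheet(mlm)          # second mechanism: XSLT style sheet
          print(str(html))
      else:
          print(schema.error_log)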

  9. A Comparison of Three Online Information Retrieval Services.

    ERIC Educational Resources Information Center

    Zais, Harriet W.

    Three firms which offer online information retrieval are compared. The firms are Lockheed Information Service, System Development Corporation and the Western Research Application Center. Comparison tables provide information such as hours accessible, coverage, file update, search elements and cost figures for 15 data bases. In addition, general…

  10. XML Translator for Interface Descriptions

    NASA Technical Reports Server (NTRS)

    Boroson, Elizabeth R.

    2009-01-01

    A computer program defines an XML schema for specifying the interface to a generic FPGA from the perspective of the software that will interact with the device. This XML interface description is then translated into header files for C, Verilog, and VHDL. User interface definition input is checked via both the provided XML schema and the translator module to ensure consistency and accuracy. Currently, the programming used on the two sides of an interface is often inconsistent, which makes errors hard to find and fix. By using a common schema, both sides are forced to use the same structure, framework, and toolset. This makes problems easy to identify, which leads to the ability to formulate a solution. The toolset contains constants that allow a programmer to use each register and to access each field in the register. Once programming is complete, the translator is run as part of the make process, which ensures that whenever an interface is changed, all of the code that uses the header files describing it is recompiled.
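
    A rough sketch of the translation step for the C side: read a register map from an XML interface description and emit #define constants. The XML layout shown here is an assumption for illustration, not the schema defined by the program.

      # Translate a hypothetical XML register map into C #define constants.
      import xml.etree.ElementTree as ET

      desc = """<interface name="FPGA0">
        <register name="CTRL" offset="0x00">
          <field name="ENABLE" bit="0"/>
          <field name="RESET"  bit="1"/>
        </register>
      </interface>"""

      root = ET.fromstring(desc)
      lines = []
      for reg in root.findall("register"):
          lines.append(f"#define {root.get('name')}_{reg.get('name')}_OFFSET {reg.get('offset')}")
          for field in reg.findall("field"):
              lines.append(f"#define {root.get('name')}_{reg.get('name')}_{field.get('name')}_BIT {field.get('bit')}")
      print("\n".join(lines))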

  11. A comparison of Boolean-based retrieval to the WAIS system for retrieval of aeronautical information

    NASA Technical Reports Server (NTRS)

    Marchionini, Gary; Barlow, Diane

    1994-01-01

    An evaluation of an information retrieval system using a Boolean-based retrieval engine and inverted file architecture and WAIS, which uses a vector-based engine, was conducted. Four research questions in aeronautical engineering were used to retrieve sets of citations from the NASA Aerospace Database which was mounted on a WAIS server and available through Dialog File 108 which served as the Boolean-based system (BBS). High recall and high precision searches were done in the BBS and terse and verbose queries were used in the WAIS condition. Precision values for the WAIS searches were consistently above the precision values for high recall BBS searches and consistently below the precision values for high precision BBS searches. Terse WAIS queries gave somewhat better precision performance than verbose WAIS queries. In every case, a small number of relevant documents retrieved by one system were not retrieved by the other, indicating the incomplete nature of the results from either retrieval system. Relevant documents in the WAIS searches were found to be randomly distributed in the retrieved sets rather than distributed by ranks. Advantages and limitations of both types of systems are discussed.

  12. Multilevel resistive information storage and retrieval

    DOEpatents

    Lohn, Andrew; Mickel, Patrick R.

    2016-08-09

    The present invention relates to resistive random-access memory (RRAM or ReRAM) systems, as well as methods of employing multiple state variables to form degenerate states in such memory systems. The methods herein allow for precise write and read steps to form multiple state variables, and these steps can be performed electrically. Such an approach allows for multilevel, high density memory systems with enhanced information storage capacity and simplified information retrieval.

  13. Health consumer-oriented information retrieval.

    PubMed

    Claveau, Vincent; Hamon, Thierry; Le Maguer, Sébastien; Grabar, Natalia

    2015-01-01

    While patients can freely access their Electronic Health Records or online health information, they may not be able to correctly understand the content of these documents. One of the challenges is related to the difference between expert and non-expert languages. We propose to investigate this issue within the Information Retrieval field. The patient queries have to be associated with the corresponding expert documents, which provide trustworthy information. Our approach relies on a state-of-the-art IR system called Indri and on semantic resources. Different query expansion strategies are explored. Our system shows up to 0.6740 P@10, up to 0.7610 R@10, and up to 0.6793 NDCG@10.
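
    One way to picture an expansion strategy of this kind is to map lay terms in the patient query to expert vocabulary before the query is submitted to the IR engine. The sketch below uses a two-entry toy lexicon; the study itself relies on much richer semantic resources and on the Indri system.

      # Toy query expansion: append expert synonyms for lay terms found in the query.
      # The lexicon is illustrative, not the resources used in the study.
      LAY_TO_EXPERT = {
          "heart attack": ["myocardial infarction"],
          "high blood pressure": ["hypertension"],
      }

      def expand(query: str) -> str:
          terms = [query]
          for lay, expert in LAY_TO_EXPERT.items():
              if lay in query.lower():
                  terms.extend(expert)
          return " ".join(terms)

      print(expand("treatment after a heart attack"))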

  14. Teaching Information Retrieval Using Telediscussion Techniques.

    ERIC Educational Resources Information Center

    Heiliger, Edward M.

    This paper concerns an experiment in teaching a graduate seminar on Information Retrieval using telediscussion techniques. Outstanding persons from Project INTREX, MEDLARS, Chemical Abstracts, the University of Georgia, the SUNY biomedical Network, AEC, NASA, and DDC gave hour-long telelectures. A Conference Telephone Set was used with success.…

  15. Adding Hierarchical Objects to Relational Database General-Purpose XML-Based Information Managements

    NASA Technical Reports Server (NTRS)

    Lin, Shu-Chun; Knight, Chris; La, Tracy; Maluf, David; Bell, David; Tran, Khai Peter; Gawdiak, Yuri

    2006-01-01

    NETMARK is a flexible, high-throughput software system for managing, storing, and rapidly searching unstructured and semi-structured documents. NETMARK transforms such documents from their original highly complex, constantly changing, heterogeneous data formats into well-structured, common data formats using Hypertext Markup Language (HTML) and/or Extensible Markup Language (XML). The software implements an object-relational database system that combines the best practices of the relational model utilizing Structured Query Language (SQL) with those of the object-oriented, semantic database model for creating complex data. In particular, NETMARK takes advantage of the Oracle 8i object-relational database model, using physical-address data types for very efficient keyword searches of records across both context and content. NETMARK also supports multiple international standards such as WebDAV for drag-and-drop file management and SOAP for integrated information management using Web services. The document-organization and -searching capabilities afforded by NETMARK are likely to make this software attractive for use in disciplines as diverse as science, auditing, and law enforcement.

  16. Web-based multimedia information retrieval for clinical application research

    NASA Astrophysics Data System (ADS)

    Cao, Xinhua; Hoo, Kent S., Jr.; Zhang, Hong; Ching, Wan; Zhang, Ming; Wong, Stephen T. C.

    2001-08-01

    We described a web-based data warehousing method for retrieving and analyzing neurological multimedia information. The web-based method supports convenient access, effective search and retrieval of clinical textual and image data, and on-line analysis. To improve the flexibility and efficiency of multimedia information query and analysis, a three-tier, multimedia data warehouse for epilepsy research has been built. The data warehouse integrates clinical multimedia data related to epilepsy from disparate sources and archives them into a well-defined data model.

  17. The Oklahoma Geographic Information Retrieval System

    NASA Technical Reports Server (NTRS)

    Blanchard, W. A.

    1982-01-01

    The Oklahoma Geographic Information Retrieval System (OGIRS) is a highly interactive data entry, storage, manipulation, and display software system for use with geographically referenced data. Although originally developed for a project concerned with coal strip mine reclamation, OGIRS is capable of handling any geographically referenced data for a variety of natural resource management applications. A special effort has been made to integrate remotely sensed data into the information system. The timeliness and synoptic coverage of satellite data are particularly useful attributes for inclusion into the geographic information system.

  18. Information Storage and Retrieval. Reports on Analysis, Search, and Iterative Retrieval.

    ERIC Educational Resources Information Center

    Salton, Gerard

    As the fourteenth report in a series describing research in automatic information storage and retrieval, this document covers work carried out on the SMART project for approximately one year (summer 1967 to summer 1968). The document is divided into four main parts: (1) SMART systems design, (2) analysis and search experiments, (3) user feedback…

  19. Distributed Retrieval Practice Promotes Superior Recall of Anatomy Information

    ERIC Educational Resources Information Center

    Dobson, John L.; Perez, Jose; Linderholm, Tracy

    2017-01-01

    Effortful retrieval produces greater long-term recall of information when compared to studying (i.e., reading), as do learning sessions that are distributed (i.e., spaced apart) when compared to those that are massed together. Although the retrieval and distributed practice effects are well-established in the cognitive science literature, no…

  20. Historical Note: The Past Thirty Years in Information Retrieval.

    ERIC Educational Resources Information Center

    Salton, Gerard

    1987-01-01

    Briefly reviews early work in documentation and text processing, and predictions that were made about the creative role of computers in information retrieval. An attempt is made to explain why these predictions were not fulfilled and conclusions are drawn regarding the limits of computer power in text retrieval applications. (Author/CLB)

  1. Expert Search Strategies: The Information Retrieval Practices of Healthcare Information Professionals.

    PubMed

    Russell-Rose, Tony; Chamberlain, Jon

    2017-10-02

    Healthcare information professionals play a key role in closing the knowledge gap between medical research and clinical practice. Their work involves meticulous searching of literature databases using complex search strategies that can consist of hundreds of keywords, operators, and ontology terms. This process is prone to error and can lead to inefficiency and bias if performed incorrectly. The aim of this study was to investigate the search behavior of healthcare information professionals, uncovering their needs, goals, and requirements for information retrieval systems. A survey was distributed to healthcare information professionals via professional association email discussion lists. It investigated the search tasks they undertake, their techniques for search strategy formulation, their approaches to evaluating search results, and their preferred functionality for searching library-style databases. The popular literature search system PubMed was then evaluated to determine the extent to which their needs were met. The 107 respondents indicated that their information retrieval process relied on the use of complex, repeatable, and transparent search strategies. On average it took 60 minutes to formulate a search strategy, with a search task taking 4 hours and consisting of 15 strategy lines. Respondents reviewed a median of 175 results per search task, far more than they would ideally like (100). The most desired features of a search system were merging search queries and combining search results. Healthcare information professionals routinely address some of the most challenging information retrieval problems of any profession. However, their needs are not fully supported by current literature search systems and there is demand for improved functionality, in particular regarding the development and management of search strategies. ©Tony Russell-Rose, Jon Chamberlain. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 02.10.2017.

  2. A new approach to the concept of "relevance" in information retrieval (IR).

    PubMed

    Kagolovsky, Y; Möhr, J R

    2001-01-01

    The concept of "relevance" is the fundamental concept of information science in general and of information retrieval in particular. Although "relevance" is extensively used in the evaluation of information retrieval, there are considerable problems associated with reaching agreement on its definition, meaning, evaluation, and application in information retrieval. There are a number of different views on "relevance" and its use for evaluation. Based on a review of the literature, the main problems associated with the concept of "relevance" in information retrieval are identified. The authors argue that a solution to these problems can be based on a conceptual IR framework built using a systems-analytic approach to IR. Using this framework, different kinds of "relevance" relationships in the IR process are identified, and a methodology for evaluating "relevance" based on methods for capturing and comparing semantics is proposed.

  3. Human Information Behaviour and Design, Development and Evaluation of Information Retrieval Systems

    ERIC Educational Resources Information Center

    Keshavarz, Hamid

    2008-01-01

    Purpose: The purpose of this paper is to introduce the concept of human information behaviour and to explore the relationship between information behaviour of users and the existing approaches dominating design and evaluation of information retrieval (IR) systems and also to describe briefly new design and evaluation methods in which extensive…

  4. SPIRES (Stanford Public Information REtrieval System). Annual Report (2d, 1968).

    ERIC Educational Resources Information Center

    Parker, Edwin B.; And Others

    During 1968 the name of the project was changed from "Stanford Physics Information Retrieval System" to "Stanford Public Information Retrieval System" to reflect the broadening of perspective and goals due to formal collaboration with Project BALLOTS (Bibliographic Automation of Large Library Operations using a Time-Sharing System).…

  5. Case retrieval in medical databases by fusing heterogeneous information.

    PubMed

    Quellec, Gwénolé; Lamard, Mathieu; Cazuguel, Guy; Roux, Christian; Cochener, Béatrice

    2011-01-01

    A novel content-based heterogeneous information retrieval framework, particularly well suited to browsing medical databases and supporting new-generation computer-aided diagnosis (CADx) systems, is presented in this paper. It was designed to retrieve possibly incomplete documents, consisting of several images and semantic information, from a database; more complex data types such as videos can also be included in the framework. The proposed retrieval method relies on image processing, in order to characterize each individual image in a document by its digital content, and on information fusion. Once the available images in a query document are characterized, a degree of match between the query document and each reference document stored in the database is defined for each attribute (an image feature or a metadata item). A Bayesian network is used to recover missing information if need be. Finally, two novel information fusion methods are proposed to combine these degrees of match, in order to rank the reference documents by decreasing relevance for the query. In the first method, the degrees of match are fused by the Bayesian network itself. In the second method, they are fused by the Dezert-Smarandache theory: the second approach lets us model our confidence in each source of information (i.e., each attribute) and take it into account in the fusion process for better retrieval performance. The proposed methods were applied to two heterogeneous medical databases, a diabetic retinopathy database and a mammography screening database, for computer-aided diagnosis. Precisions at five of 0.809 ± 0.158 and 0.821 ± 0.177, respectively, were obtained for these two databases, which is very promising.
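
    The overall flow of the fusion step can be sketched as combining per-attribute degrees of match into a single relevance score and ranking reference documents by it. The weighted average below is a deliberate simplification standing in for the Bayesian-network and Dezert-Smarandache fusion methods actually proposed; the attribute names, weights, and scores are invented for illustration.

      # Simplified fusion of per-attribute degrees of match into one ranking score.
      def fuse(degrees: dict, weights: dict) -> float:
          # Ignore attributes missing from an incomplete document.
          usable = [a for a in degrees if degrees[a] is not None]
          total = sum(weights[a] for a in usable)
          return sum(weights[a] * degrees[a] for a in usable) / total if total else 0.0

      weights = {"image_feature": 0.6, "age": 0.2, "history": 0.2}
      reference_docs = {
          "doc1": {"image_feature": 0.9, "age": 0.4, "history": None},
          "doc2": {"image_feature": 0.5, "age": 0.8, "history": 0.7},
      }
      ranked = sorted(reference_docs, key=lambda d: fuse(reference_docs[d], weights), reverse=True)
      print(ranked)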

  6. Infectious Cognition: Risk Perception Affects Socially Shared Retrieval-Induced Forgetting of Medical Information.

    PubMed

    Coman, Alin; Berry, Jessica N

    2015-12-01

    When speakers selectively retrieve previously learned information, listeners often concurrently, and covertly, retrieve their memories of that information. This concurrent retrieval typically enhances memory for mentioned information (the rehearsal effect) and impairs memory for unmentioned but related information (socially shared retrieval-induced forgetting, SSRIF), relative to memory for unmentioned and unrelated information. Building on research showing that anxiety leads to increased attention to threat-relevant information, we explored whether concurrent retrieval is facilitated in high-anxiety real-world contexts. Participants first learned category-exemplar facts about meningococcal disease. Following a manipulation of perceived risk of infection (low vs. high risk), they listened to a mock radio show in which some of the facts were selectively practiced. Final recall tests showed that the rehearsal effect was equivalent between the two risk conditions, but SSRIF was significantly larger in the high-risk than in the low-risk condition. Thus, the tendency to exaggerate consequences of news events was found to have deleterious consequences. © The Author(s) 2015.

  7. Can We Retrieve the Information Which Was Intentionally Forgotten? Electrophysiological Correlates of Strategic Retrieval in Directed Forgetting.

    PubMed

    Mao, Xinrui; Tian, Mengxi; Liu, Yi; Li, Bingcan; Jin, Yan; Wu, Yanhong; Guo, Chunyan

    2017-01-01

    The retrieval inhibition hypothesis of directed forgetting effects assumes that TBF (to-be-forgotten) items are not retrieved intentionally, while the selective rehearsal hypothesis assumes that the memory representation of retrieved TBF items is weaker than that of TBR (to-be-remembered) items. Previous studies indicated that directed forgetting effects in the item-cueing method result from selective rehearsal at encoding, but the mechanism of retrieval inhibition that affects directed forgetting of TBF items was not clear. Strategic retrieval is a control process allowing the selective retrieval of target information, which includes retrieval orientation and strategic recollection. Retrieval orientation, assessed via the comparison of tasks, refers to the specific form of processing resulting from retrieval efforts. Strategic recollection is the set of strategies used to recollect studied items for the retrieval success of targets. Using a "directed forgetting" paradigm combined with a memory exclusion task, our investigation of strategic retrieval in directed forgetting helped explore how retrieval inhibition plays a role in directed forgetting effects. When TBF items were targeted, retrieval orientation showed more positive ERPs to new items, indicating that TBF items demanded more retrieval effort. The results of strategic recollection indicated that: (a) when TBR items were retrieval targets, late parietal old/new effects were evoked only by TBR items and not by TBF items, indicating retrieval inhibition of TBF items; (b) when TBF items were retrieval targets, late parietal old/new effects were evoked by both TBR items and TBF items, indicating that strategic retrieval could overcome the retrieval inhibition of TBF items. These findings suggest that strategic retrieval modulates the retrieval inhibition of directed forgetting, supporting the view that directed forgetting effects are caused not only by selective rehearsal but also by retrieval inhibition.

  8. Can We Retrieve the Information Which Was Intentionally Forgotten? Electrophysiological Correlates of Strategic Retrieval in Directed Forgetting

    PubMed Central

    Mao, Xinrui; Tian, Mengxi; Liu, Yi; Li, Bingcan; Jin, Yan; Wu, Yanhong; Guo, Chunyan

    2017-01-01

    The retrieval inhibition hypothesis of directed forgetting effects assumes that TBF (to-be-forgotten) items are not retrieved intentionally, while the selective rehearsal hypothesis assumes that the memory representation of retrieved TBF items is weaker than that of TBR (to-be-remembered) items. Previous studies indicated that directed forgetting effects in the item-cueing method result from selective rehearsal at encoding, but the mechanism of retrieval inhibition that affects directed forgetting of TBF items was not clear. Strategic retrieval is a control process allowing the selective retrieval of target information, which includes retrieval orientation and strategic recollection. Retrieval orientation, assessed via the comparison of tasks, refers to the specific form of processing resulting from retrieval efforts. Strategic recollection is the set of strategies used to recollect studied items for the retrieval success of targets. Using a “directed forgetting” paradigm combined with a memory exclusion task, our investigation of strategic retrieval in directed forgetting helped explore how retrieval inhibition plays a role in directed forgetting effects. When TBF items were targeted, retrieval orientation showed more positive ERPs to new items, indicating that TBF items demanded more retrieval effort. The results of strategic recollection indicated that: (a) when TBR items were retrieval targets, late parietal old/new effects were evoked only by TBR items and not by TBF items, indicating retrieval inhibition of TBF items; (b) when TBF items were retrieval targets, late parietal old/new effects were evoked by both TBR items and TBF items, indicating that strategic retrieval could overcome the retrieval inhibition of TBF items. These findings suggest that strategic retrieval modulates the retrieval inhibition of directed forgetting, supporting the view that directed forgetting effects are caused not only by selective rehearsal but also by retrieval inhibition. PMID

  9. User centered and ontology based information retrieval system for life sciences.

    PubMed

    Sy, Mohameth-François; Ranwez, Sylvie; Montmain, Jacky; Regnault, Armelle; Crampes, Michel; Ranwez, Vincent

    2012-01-25

    Because of the increasing number of electronic resources, designing efficient tools to retrieve and exploit them is a major challenge. Some improvements have been offered by semantic Web technologies and applications based on domain ontologies. In life science, for instance, the Gene Ontology is widely exploited in genomic applications and the Medical Subject Headings is the basis of the biomedical publication indexation and information retrieval process proposed by PubMed. However, current search engines suffer from two main drawbacks: there is limited user interaction with the list of retrieved resources, and no explanation for their adequacy to the query is provided. Users may thus be confused by the selection and have no idea how to adapt their queries so that the results match their expectations. This paper describes an information retrieval system that relies on a domain ontology to widen the set of relevant documents that is retrieved and that uses a graphical rendering of query results to favor user interaction. Semantic proximities between ontology concepts and aggregating models are used to assess document adequacy with respect to a query. The selection of documents is displayed in a semantic map that provides graphical indications of the extent to which they match the user's query; this man/machine interface favors a more interactive and iterative exploration of the data corpus by facilitating query concept weighting and visual explanation. We illustrate the benefit of using this information retrieval system on two case studies, one of which aims at collecting human genes related to transcription factors involved in the hemopoiesis pathway. The ontology-based information retrieval system described in this paper (OBIRS) is freely available at: http://www.ontotoolkit.mines-ales.fr/ObirsClient/. This environment is a first step towards a user-centred application in which the system highlights relevant information to provide decision support.

  10. User centered and ontology based information retrieval system for life sciences

    PubMed Central

    2012-01-01

    Background Because of the increasing number of electronic resources, designing efficient tools to retrieve and exploit them is a major challenge. Some improvements have been offered by semantic Web technologies and applications based on domain ontologies. In life science, for instance, the Gene Ontology is widely exploited in genomic applications and the Medical Subject Headings is the basis of the biomedical publication indexation and information retrieval process proposed by PubMed. However, current search engines suffer from two main drawbacks: there is limited user interaction with the list of retrieved resources, and no explanation for their adequacy to the query is provided. Users may thus be confused by the selection and have no idea how to adapt their queries so that the results match their expectations. Results This paper describes an information retrieval system that relies on a domain ontology to widen the set of relevant documents that is retrieved and that uses a graphical rendering of query results to favor user interaction. Semantic proximities between ontology concepts and aggregating models are used to assess document adequacy with respect to a query. The selection of documents is displayed in a semantic map that provides graphical indications of the extent to which they match the user's query; this man/machine interface favors a more interactive and iterative exploration of the data corpus by facilitating query concept weighting and visual explanation. We illustrate the benefit of using this information retrieval system on two case studies, one of which aims at collecting human genes related to transcription factors involved in the hemopoiesis pathway. Conclusions The ontology-based information retrieval system described in this paper (OBIRS) is freely available at: http://www.ontotoolkit.mines-ales.fr/ObirsClient/. This environment is a first step towards a user-centred application in which the system highlights relevant information to provide

  11. XML: An Introduction.

    ERIC Educational Resources Information Center

    Lewis, John D.

    1998-01-01

    Describes XML (extensible markup language), a new language classification submitted to the World Wide Web Consortium that is defined in terms of both SGML (Standard Generalized Markup Language) and HTML (Hypertext Markup Language), specifically designed for the Internet. Limitations of PDF (Portable Document Format) files for electronic journals…

  12. Strong Similarity Measures for Ordered Sets of Documents in Information Retrieval.

    ERIC Educational Resources Information Center

    Egghe, L.; Michel, Christine

    2002-01-01

    Presents a general method to construct ordered similarity measures in information retrieval based on classical similarity measures for ordinary sets. Describes a test of some of these measures in an information retrieval system that extracted ranked document sets and discuses the practical usability of the ordered similarity measures. (Author/LRW)

  13. A LDA-based approach to promoting ranking diversity for genomics information retrieval.

    PubMed

    Chen, Yan; Yin, Xiaoshi; Li, Zhoujun; Hu, Xiaohua; Huang, Jimmy Xiangji

    2012-06-11

    In the biomedical domain, there are immense data and a tremendous increase in genomics and biomedically relevant publications. This wealth of information has led to an increasing amount of interest in and need for applying information retrieval techniques to access the scientific literature in genomics and related biomedical disciplines. In many cases, the desired answer to a query asked by biologists is a list of a certain type of entities covering different aspects that are related to the question, such as cells, genes, diseases, proteins, mutations, etc. Hence, it is important for a biomedical IR system to be able to provide relevant and diverse answers to fulfill biologists' information needs. However, the traditional IR model is concerned only with the relevance between retrieved documents and the user query, and does not take redundancy among retrieved documents into account. This leads to high redundancy and low diversity in the ranked retrieval lists. In this paper, we propose an approach that employs a topic generative model called Latent Dirichlet Allocation (LDA) to promote ranking diversity for biomedical information retrieval. Different from other approaches or models that consider aspects at the word level, our approach assumes that aspects should be identified from the topics of retrieved documents. We use the LDA model to discover the topic distribution of retrieved passages and the word distribution of each topic dimension, and then re-rank the retrieval results using the topic distribution similarity between passages based on an N-size sliding window. We evaluate our approach on the TREC 2007 Genomics collection and two distinctive IR baseline runs, achieving an 8% improvement over the highest Aspect MAP reported in the TREC 2007 Genomics track. The proposed method is the first study to adopt a topic model for genomics information retrieval, and it demonstrates its effectiveness in promoting ranking diversity as well as in improving the relevance of ranked lists of genomics search
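
    The core idea can be sketched as follows: infer a topic distribution for each retrieved passage with LDA, then re-rank so that newly selected passages are topically dissimilar from those already chosen. This sketch uses scikit-learn and a plain greedy diversification pass rather than the authors' N-size sliding-window method; the passages are toy data.

      # Greedy topic-based diversification of a small set of retrieved passages.
      import numpy as np
      from sklearn.feature_extraction.text import CountVectorizer
      from sklearn.decomposition import LatentDirichletAllocation

      passages = ["gene expression in cancer cells", "protein mutation and disease",
                  "cancer cell gene regulation", "viral protein structure"]
      counts = CountVectorizer().fit_transform(passages)
      topics = LatentDirichletAllocation(n_components=2, random_state=0).fit_transform(counts)

      selected, remaining = [0], list(range(1, len(passages)))   # keep the top hit first
      while remaining:
          # Pick the passage whose topic vector is least similar to those already selected.
          def max_sim(i):
              return max(float(np.dot(topics[i], topics[j])) /
                         (np.linalg.norm(topics[i]) * np.linalg.norm(topics[j]))
                         for j in selected)
          best = min(remaining, key=max_sim)
          selected.append(best)
          remaining.remove(best)
      print(selected)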

  14. Disposal of Information Seeking and Retrieval Research: Replacement with a Radical Proposition

    ERIC Educational Resources Information Center

    Budd, John M.; Anstaett, Ashley

    2013-01-01

    Introduction: Research and theory on the topics of information seeking and retrieval have been plagued by some fundamental problems for several decades. Many of the difficulties spring from mechanistic and instrumental thinking and modelling. Method: Existing models of information retrieval and information seeking are examined for efficacy in a…

  15. Semantic concept-enriched dependence model for medical information retrieval.

    PubMed

    Choi, Sungbin; Choi, Jinwook; Yoo, Sooyoung; Kim, Heechun; Lee, Youngho

    2014-02-01

    In medical information retrieval research, semantic resources have been mostly used by expanding the original query terms or estimating the concept importance weight. However, implicit term-dependency information contained in semantic concept terms has been overlooked or at least underused in most previous studies. In this study, we incorporate a semantic concept-based term-dependence feature into a formal retrieval model to improve its ranking performance. Standardized medical concept terms used by medical professionals were assumed to have implicit dependency within the same concept. We hypothesized that, by elaborately revising the ranking algorithms to favor documents that preserve those implicit dependencies, the ranking performance could be improved. The implicit dependence features are harvested from the original query using MetaMap. These semantic concept-based dependence features were incorporated into a semantic concept-enriched dependence model (SCDM). We designed four different variants of the model, with each variant having distinct characteristics in the feature formulation method. We performed leave-one-out cross validations on both a clinical document corpus (TREC Medical records track) and a medical literature corpus (OHSUMED), which are representative test collections in medical information retrieval research. Our semantic concept-enriched dependence model consistently outperformed other state-of-the-art retrieval methods. Analysis shows that the performance gain has occurred independently of the concept's explicit importance in the query. By capturing implicit knowledge with regard to the query term relationships and incorporating them into a ranking model, we could build a more robust and effective retrieval model, independent of the concept importance. Copyright © 2013 Elsevier Inc. All rights reserved.

  16. Users guide for information retrieval using APL

    NASA Technical Reports Server (NTRS)

    Shapiro, A.

    1974-01-01

    A Programming Language (APL) is a precise, concise, and powerful computer programming language. Several features make APL useful to managers and other potential computer users. APL is interactive; therefore, the user can communicate with his program or data base in near real time. This, coupled with the fact that APL has excellent debugging features, reduces program checkout time to minutes or hours rather than days or months. Of particular importance is the fact that APL can be utilized as a management science tool using such techniques as operations research, statistical analysis, and forecasting. The gap between the scientist and the manager could be narrowed by showing how APL can be used to do what the scientist and the manager each need to do: retrieve information. Sometimes the information needs to be retrieved rapidly, and APL is ideally suited to this challenge.

  17. Task Oriented Tools for Information Retrieval

    ERIC Educational Resources Information Center

    Yang, Peilin

    2017-01-01

    Information Retrieval (IR) is one of the most evolving research fields and has drawn extensive attention in recent years. Because of its empirical nature, the advance of the IR field is closely related to the development of various toolkits. While the traditional IR toolkit mainly provides a platform to evaluate the effectiveness of retrieval…

  18. Intelligent Information Retrieval: Diagnosing Information Need. Part I. The Theoretical Framework for Developing an Intelligent IR Tool.

    ERIC Educational Resources Information Center

    Cole, Charles

    1998-01-01

    Suggests that the principles underlying the procedure used by doctors to diagnose a patient's disease are useful in the design of intelligent information-retrieval systems because the task of the doctor is conceptually similar to the computer or human intermediary's task in information retrieval: to draw out the user's query/information need.…

  19. Automatic query formulations in information retrieval.

    PubMed

    Salton, G; Buckley, C; Fox, E A

    1983-07-01

    Modern information retrieval systems are designed to supply relevant information in response to requests received from the user population. In most retrieval environments the search requests consist of keywords, or index terms, interrelated by appropriate Boolean operators. Since it is difficult for untrained users to generate effective Boolean search requests, trained search intermediaries are normally used to translate original statements of user need into useful Boolean search formulations. Methods are introduced in this study which reduce the role of the search intermediaries by making it possible to generate Boolean search formulations completely automatically from natural language statements provided by the system patrons. Frequency considerations are used automatically to generate appropriate term combinations as well as the Boolean connectives relating the terms. Methods are covered to produce automatic query formulations both in a standard Boolean logic system and in an extended Boolean system in which the strict interpretation of the connectives is relaxed. Experimental results are supplied to evaluate the effectiveness of the automatic query formulation process, and methods are described for applying the automatic query formulation process in practice.
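
    A rough sketch of frequency-guided Boolean query construction under simplifying assumptions: rare (selective) terms are conjoined with AND, while frequent terms are grouped with OR. The collection frequencies, stopword list, and cutoff below are illustrative, not the rules developed in the paper.

      # Build a Boolean query from a natural-language request using term frequencies.
      import re

      COLLECTION_FREQ = {"retrieval": 900, "boolean": 120, "reformulation": 15,
                         "automatic": 400, "query": 800}
      STOPWORDS = {"of", "in", "for", "the", "a"}

      def formulate(request: str, rare_cutoff: int = 100) -> str:
          terms = [t for t in re.findall(r"[a-z]+", request.lower()) if t not in STOPWORDS]
          rare = [t for t in terms if COLLECTION_FREQ.get(t, 0) < rare_cutoff]
          common = [t for t in terms if t not in rare]
          parts = rare + (["(" + " OR ".join(common) + ")"] if common else [])
          return " AND ".join(parts)

      print(formulate("automatic reformulation of boolean query"))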

  20. Federated Access to Heterogeneous Information Resources in the Neuroscience Information Framework (NIF)

    PubMed Central

    Gupta, Amarnath; Bug, William; Marenco, Luis; Qian, Xufei; Condit, Christopher; Rangarajan, Arun; Müller, Hans Michael; Miller, Perry L.; Sanders, Brian; Grethe, Jeffrey S.; Astakhov, Vadim; Shepherd, Gordon; Sternberg, Paul W.; Martone, Maryann E.

    2009-01-01

    The overarching goal of the NIF (Neuroscience Information Framework) project is to be a one-stop shop for Neuroscience. This paper provides a technical overview of how the system is designed. The technical goal of the first version of the NIF system was to develop an information system that a neuroscientist can use to locate relevant information from a wide variety of information sources by simple keyword queries. Although the user would provide only keywords to retrieve information, the NIF system is designed to treat them as concepts whose meanings are interpreted by the system. Thus, a search for a term should find records containing synonyms of that term. The system is targeted at finding information from web pages, publications, databases, web sites built upon databases, XML documents, and any other modality in which such information may be published. We have designed a system to achieve this functionality. A central element in the system is an ontology called NIFSTD (for NIF Standard), constructed by amalgamating a number of known and newly developed ontologies. NIFSTD is used by our ontology management module, called OntoQuest, to perform ontology-based search over data sources. The NIF architecture currently provides three different mechanisms for searching heterogeneous data sources, including relational databases, web sites, XML documents, and the full text of publications. Version 1.0 of the NIF system is currently in beta test and may be accessed through http://nif.nih.gov. PMID:18958629

  1. Federated access to heterogeneous information resources in the Neuroscience Information Framework (NIF).

    PubMed

    Gupta, Amarnath; Bug, William; Marenco, Luis; Qian, Xufei; Condit, Christopher; Rangarajan, Arun; Müller, Hans Michael; Miller, Perry L; Sanders, Brian; Grethe, Jeffrey S; Astakhov, Vadim; Shepherd, Gordon; Sternberg, Paul W; Martone, Maryann E

    2008-09-01

    The overarching goal of the NIF (Neuroscience Information Framework) project is to be a one-stop shop for Neuroscience. This paper provides a technical overview of how the system is designed. The technical goal of the first version of the NIF system was to develop an information system that a neuroscientist can use to locate relevant information from a wide variety of information sources by simple keyword queries. Although the user would provide only keywords to retrieve information, the NIF system is designed to treat them as concepts whose meanings are interpreted by the system. Thus, a search for a term should find records containing synonyms of that term. The system is targeted at finding information from web pages, publications, databases, web sites built upon databases, XML documents, and any other modality in which such information may be published. We have designed a system to achieve this functionality. A central element in the system is an ontology called NIFSTD (for NIF Standard), constructed by amalgamating a number of known and newly developed ontologies. NIFSTD is used by our ontology management module, called OntoQuest, to perform ontology-based search over data sources. The NIF architecture currently provides three different mechanisms for searching heterogeneous data sources, including relational databases, web sites, XML documents, and the full text of publications. Version 1.0 of the NIF system is currently in beta test and may be accessed through http://nif.nih.gov.

  2. Adult Age Differences in Accessing and Retrieving Information from Long-Term Memory.

    ERIC Educational Resources Information Center

    Petros, Thomas V.; And Others

    1983-01-01

    Investigated adult age differences in accessing and retrieving information from long-term memory. Results showed that older adults (N=26) were slower than younger adults (N=35) at feature extraction, lexical access, and accessing category information. The age deficit was proportionally greater when retrieval of category information was required.…

  3. Biomedical information retrieval across languages.

    PubMed

    Daumke, Philipp; Markó, Kornél; Poprat, Michael; Schulz, Stefan; Klar, Rüdiger

    2007-06-01

    This work presents a new dictionary-based approach to biomedical cross-language information retrieval (CLIR) that addresses many of the general and domain-specific challenges in current CLIR research. Our method is based on a multilingual lexicon that was generated partly manually and partly automatically, and currently covers six European languages. It contains morphologically meaningful word fragments, termed subwords. Using subwords instead of entire words significantly reduces the number of lexical entries necessary to sufficiently cover a specific language and domain. Mediation between queries and documents is based on these subwords as well as on lists of word-n-grams that are generated from large monolingual corpora and constitute possible translation units. The translations are then sent to a standard Internet search engine. This process makes our approach an effective tool for searching the biomedical content of the World Wide Web in different languages. We evaluate this approach using the OHSUMED corpus, a large medical document collection, within a cross-language retrieval setting.
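
    The subword idea can be illustrated with a greedy longest-match segmenter over a tiny German-to-English subword lexicon; the two-entry lexicon below is a toy, whereas the real lexicon covers six languages and many morphologically meaningful fragments.

      # Segment a query word into lexicon subwords (greedy longest match),
      # then map each subword to its target-language counterpart.
      SUBWORDS_DE_EN = {"herz": "cardi", "infarkt": "infarct"}

      def segment(word: str, lexicon: dict) -> list:
          pieces, i = [], 0
          while i < len(word):
              for j in range(len(word), i, -1):      # try the longest match first
                  if word[i:j] in lexicon:
                      pieces.append(word[i:j])
                      i = j
                      break
              else:
                  i += 1                             # skip characters with no match
          return pieces

      query = "herzinfarkt"
      translated = [SUBWORDS_DE_EN[s] for s in segment(query, SUBWORDS_DE_EN)]
      print(translated)   # ['cardi', 'infarct']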

  4. Information retrieval for a document writing assistance program

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Corral, M.L.; Simon, A.; Julien, C.

    This paper presents an Information Retrieval mechanism to facilitate the writing of technical documents in the space domain. To address the need for document exchange between partners in a given project, documents are standardized. The writing of a new document requires the re-use of existing documents or parts thereof. These parts can be identified by "tagging" the logical structure of documents and restored by means of a purpose-built Information Retrieval System (I.R.S.). The I.R.S. implemented in our writing assistance tool uses natural language queries and is based on a statistical linguistic approach which is enhanced by the use of a document structure module.

  5. CytometryML, an XML format based on DICOM and FCS for analytical cytology data.

    PubMed

    Leif, Robert C; Leif, Suzanne B; Leif, Stephanie H

    2003-07-01

    Flow Cytometry Standard (FCS) was initially created to standardize the software researchers use to analyze, transmit, and store data produced by flow cytometers and sorters. Because of the clinical utility of flow cytometry, it is necessary to have a standard consistent with the requirements of medical regulatory agencies. We extended the existing mapping of FCS to the Digital Imaging and Communications in Medicine (DICOM) standard to include list-mode data produced by flow cytometry, laser scanning cytometry, and microscopic image cytometry. FCS list-mode was mapped to the DICOM Waveform Information Object. We created a collection of Extensible Markup Language (XML) schemas to express the DICOM analytical cytologic text-based data types except for large binary objects. We also developed a cytometry markup language, CytometryML, in an open environment subject to continuous peer review. The feasibility of expressing the data contained in FCS, including list-mode in DICOM, was demonstrated; and a preliminary mapping for list-mode data in the form of XML schemas and documents was completed. DICOM permitted the creation of indices that can be used to rapidly locate in a list-mode file the cells that are members of a subset. DICOM and its coding schemes for other medical standards can be represented by XML schemas, which can be combined with other relevant XML applications, such as Mathematical Markup Language (MathML). The use of XML format based on DICOM for analytical cytology met most of the previously specified requirements and appears capable of meeting the others; therefore, the present FCS should be retired and replaced by an open, XML-based, standard CytometryML. Copyright 2003 Wiley-Liss, Inc.

  6. An XML-based interchange format for genotype-phenotype data.

    PubMed

    Whirl-Carrillo, M; Woon, M; Thorn, C F; Klein, T E; Altman, R B

    2008-02-01

    Recent advances in high-throughput genotyping and phenotyping have accelerated the creation of pharmacogenomic data. Consequently, the community requires standard formats to exchange large amounts of diverse information. To facilitate the transfer of pharmacogenomics data between databases and analysis packages, we have created a standard XML (eXtensible Markup Language) schema that describes both genotype and phenotype data as well as associated metadata. The schema accommodates information regarding genes, drugs, diseases, experimental methods, genomic/RNA/protein sequences, subjects, subject groups, and literature. The Pharmacogenetics and Pharmacogenomics Knowledge Base (PharmGKB; www.pharmgkb.org) has used this XML schema for more than 5 years to accept and process submissions containing more than 1,814,139 SNPs on 20,797 subjects using 8,975 assays. Although developed in the context of pharmacogenomics, the schema is of general utility for exchange of genotype and phenotype data. We have written syntactic and semantic validators to check documents using this format. The schema and code for validation is available to the community at http://www.pharmgkb.org/schema/index.html (last accessed: 8 October 2007). (c) 2007 Wiley-Liss, Inc.

  7. The XSD-Builder Specification Language—Toward a Semantic View of XML Schema Definition

    NASA Astrophysics Data System (ADS)

    Fong, Joseph; Cheung, San Kuen

    In the present database market, the XML database model is a main structure for the forthcoming database systems in the Internet environment. As a conceptual schema of an XML database, the XML model has limitations in presenting its data semantics. System analysts have no toolset for modeling and analyzing an XML system. We apply the XML Tree Model (shown in Figure 2) as a conceptual schema of an XML database to model and analyze the structure of an XML database. It is important not only for visualizing, specifying, and documenting structural models, but also for constructing executable systems. The tree model represents the inter-relationships among elements inside different logical schemas such as XML Schema Definition (XSD), DTD, Schematron, XDR, SOX, and DSD (shown in Figure 1; an explanation of the terms in the figure is given in Table 1). The XSD-Builder consists of the XML Tree Model, a source language, a translator, and XSD. The source language, called XSD-Source, is mainly intended to provide a user-friendly environment for writing an XSD. The source language is subsequently translated by the XSD-Translator. The output of the XSD-Translator is an XSD, which is our target and is called the object language.

  8. Improving data management and dissemination in web based information systems by semantic enrichment of descriptive data aspects

    NASA Astrophysics Data System (ADS)

    Gebhardt, Steffen; Wehrmann, Thilo; Klinger, Verena; Schettler, Ingo; Huth, Juliane; Künzer, Claudia; Dech, Stefan

    2010-10-01

    The German-Vietnamese water-related information system for the Mekong Delta (WISDOM) project supports business processes in Integrated Water Resources Management in Vietnam. Multiple disciplines bring together earth- and ground-based observation themes, such as environmental monitoring, water management, demographics, economy, information technology, and infrastructural systems. This paper introduces the components of the web-based WISDOM system, including the data, logic, and presentation tiers. It focuses on the data models upon which the database management system is built, including techniques for tagging or linking metadata with the stored information. The model also uses ordered groupings of spatial, thematic and temporal reference objects to semantically tag datasets to enable fast data retrieval, such as finding all data in a specific administrative unit belonging to a specific theme. A spatial database extension is employed with the PostgreSQL database. This object-oriented database was chosen over a relational database to tag spatial objects to tabular data, improving the retrieval of census and observational data at the regional, provincial, and local levels. While the spatial database hinders processing raster data, a "work-around" was built into WISDOM to permit efficient management of both raster and vector data. The data model also incorporates styling aspects of the spatial datasets through styled layer descriptions (SLD) and web mapping service (WMS) layer specifications, allowing retrieval of rendered maps. Metadata elements of the spatial data are based on the ISO19115 standard. XML-structured information for the SLD and metadata is stored in an XML database. The data models and the data management system are robust for managing the large quantity of spatial objects, sensor observations, census and document data. The operational WISDOM information system prototype contains modules for data management, automatic data integration, and web services for data

  9. XML: A Language To Manage the World Wide Web. ERIC Digest.

    ERIC Educational Resources Information Center

    Davis-Tanous, Jennifer R.

    This digest provides an overview of XML (Extensible Markup Language), a markup language used to construct World Wide Web pages. Topics addressed include: (1) definition of a markup language, including comparison of XML with SGML (Standard Generalized Markup Language) and HTML (HyperText Markup Language); (2) how XML works, including sample tags,…

  10. Informative Top-k Retrieval for Advanced Skill Management

    NASA Astrophysics Data System (ADS)

    Colucci, Simona; di Noia, Tommaso; Ragone, Azzurra; Ruta, Michele; Straccia, Umberto; Tinelli, Eufemia

    The paper presents a knowledge-based framework for skills and talent management based on advanced matchmaking between candidate profiles and available job positions. Interestingly, the informative content of top-k retrieval is enriched through semantic capabilities. The proposed approach makes it possible to: (1) express a requested profile in terms of both hard and soft constraints; (2) provide a ranking function that also takes qualitative attributes of a profile into account; (3) explain the resulting outcomes (given a job request, a motivation for the obtained score of each selected profile is provided). Top-k retrieval selects the most promising candidates according to an ontology formalizing the domain knowledge. This knowledge is further exploited to provide a semantic-based explanation of missing or conflicting features in retrieved profiles. These explanations also indicate additional profile characteristics, emerging from the retrieval procedure, for further request refinement. A concrete case study followed by an exhaustive experimental campaign is reported to demonstrate the effectiveness of the approach.
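
    A much-simplified sketch of the hard/soft-constraint matchmaking idea (without the ontology-based reasoning the paper actually relies on) might look as follows; the profiles, skills and weights are invented.

    ```python
    # Simplified sketch (not the paper's ontology-based machinery): candidates
    # must satisfy all hard constraints; soft constraints contribute weighted
    # scores, and missing ones are reported as a simple explanation.
    profiles = {
        "alice": {"java", "sql", "english"},
        "bob":   {"java", "english", "team-lead"},
        "carol": {"sql", "english"},
    }
    hard = {"english"}                                   # mandatory requirements
    soft = {"java": 0.5, "sql": 0.3, "team-lead": 0.2}   # weighted preferences

    def rank(profiles, hard, soft, k=2):
        results = []
        for name, skills in profiles.items():
            if not hard <= skills:   # discard candidates failing hard constraints
                continue
            score = sum(w for s, w in soft.items() if s in skills)
            missing = sorted(set(soft) - skills)
            results.append((score, name, missing))
        return sorted(results, reverse=True)[:k]

    for score, name, missing in rank(profiles, hard, soft):
        print(f"{name}: score={score:.2f}, missing={missing}")
    ```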

  11. Mathematical, Logical, and Formal Methods in Information Retrieval: An Introduction to the Special Issue.

    ERIC Educational Resources Information Center

    Crestani, Fabio; Dominich, Sandor; Lalmas, Mounia; van Rijsbergen, Cornelis Joost

    2003-01-01

    Discusses the importance of research on the use of mathematical, logical, and formal methods in information retrieval to help enhance retrieval effectiveness and clarify underlying concepts of information retrieval. Highlights include logic; probability; spaces; and future research needs. (Author/LRW)

  12. Fault-tolerant symmetrically-private information retrieval

    NASA Astrophysics Data System (ADS)

    Wang, Tian-Yin; Cai, Xiao-Qiu; Zhang, Rui-Ling

    2016-08-01

    We propose two symmetrically-private information retrieval protocols based on quantum key distribution, which provide a good degree of database and user privacy while being flexible, loss-resistant and easily generalized to a large database, as in previous works. Furthermore, one protocol is robust to collective-dephasing noise, and the other is robust to collective-rotation noise.

  13. Information Retrieval in Telemedicine: a Comparative Study on Bibliographic Databases

    PubMed Central

    Ahmadi, Maryam; Sarabi, Roghayeh Ershad; Orak, Roohangiz Jamshidi; Bahaadinbeigy, Kambiz

    2015-01-01

    Background and Aims: The first step in each systematic review is selection of the most valid database, i.e., the one that can provide the highest number of relevant references. This study was carried out to determine the most suitable database for information retrieval in the telemedicine field. Methods: The CINAHL, PubMed, Web of Science and Scopus databases were searched for telemedicine combined with education, cost benefit and patient satisfaction. After analysis of the obtained results, the accuracy coefficient, sensitivity, uniqueness and overlap of the databases were calculated. Results: The studied databases differed in the number of retrieved articles. PubMed was identified as the most suitable database for retrieving information on the selected topics, with accuracy and sensitivity ratios of 50.7% and 61.4% respectively. The uniqueness percentage of retrieved articles ranged from 38% for PubMed to 3.0% for CINAHL. The highest overlap rate (18.6%) was found between PubMed and Web of Science. Less than 1% of articles were indexed in all searched databases. Conclusion: PubMed is suggested as the most suitable database for starting a search in telemedicine; after PubMed, Scopus and Web of Science can retrieve about 90% of the relevant articles. PMID:26236086

  14. Information Retrieval in Telemedicine: a Comparative Study on Bibliographic Databases.

    PubMed

    Ahmadi, Maryam; Sarabi, Roghayeh Ershad; Orak, Roohangiz Jamshidi; Bahaadinbeigy, Kambiz

    2015-06-01

    The first step in each systematic review is selection of the most valid database, i.e., the one that can provide the highest number of relevant references. This study was carried out to determine the most suitable database for information retrieval in the telemedicine field. The CINAHL, PubMed, Web of Science and Scopus databases were searched for telemedicine combined with education, cost benefit and patient satisfaction. After analysis of the obtained results, the accuracy coefficient, sensitivity, uniqueness and overlap of the databases were calculated. The studied databases differed in the number of retrieved articles. PubMed was identified as the most suitable database for retrieving information on the selected topics, with accuracy and sensitivity ratios of 50.7% and 61.4% respectively. The uniqueness percentage of retrieved articles ranged from 38% for PubMed to 3.0% for CINAHL. The highest overlap rate (18.6%) was found between PubMed and Web of Science. Less than 1% of articles were indexed in all searched databases. PubMed is suggested as the most suitable database for starting a search in telemedicine; after PubMed, Scopus and Web of Science can retrieve about 90% of the relevant articles.
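
    The overlap and uniqueness figures reported in this record are straightforward set computations over the retrieved result lists; the toy Python example below recomputes such percentages on invented record identifiers, purely to illustrate the measures.

    ```python
    # Toy illustration of how overlap and uniqueness between databases can be
    # computed from sets of retrieved records; the record IDs are invented and
    # do not reproduce the study's figures.
    retrieved = {
        "PubMed":         {"a1", "a2", "a3", "a4"},
        "Web of Science": {"a2", "a3", "b1"},
        "Scopus":         {"a3", "b1", "b2"},
    }
    all_records = set().union(*retrieved.values())

    for db, recs in retrieved.items():
        others = set().union(*(r for d, r in retrieved.items() if d != db))
        unique = recs - others
        print(f"{db}: uniqueness = {100 * len(unique) / len(all_records):.1f}%")

    overlap = retrieved["PubMed"] & retrieved["Web of Science"]
    print(f"PubMed/WoS overlap = {100 * len(overlap) / len(all_records):.1f}%")
    ```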

  15. Adaptive Hypermedia Educational System Based on XML Technologies.

    ERIC Educational Resources Information Center

    Baek, Yeongtae; Wang, Changjong; Lee, Sehoon

    This paper proposes an adaptive hypermedia educational system using XML technologies, such as XML, XSL, XSLT, and XLink. Adaptive systems are capable of altering the presentation of the content of the hypermedia on the basis of a dynamic understanding of the individual user. The user profile can be collected in a user model, while the knowledge…
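
    One common way to realize this kind of adaptation, shown here only as a generic illustration (the stylesheet and the "level" parameter are assumptions, not the system described in the digest), is an XSLT transform driven by a user-model parameter.

    ```python
    # Minimal sketch of presentation adaptation with XSLT: a user-model
    # parameter controls which parts of the content are rendered. The
    # stylesheet and the "level" parameter are illustrative assumptions.
    from lxml import etree

    content = etree.XML("""
    <lesson>
      <basic>XML documents are trees of elements.</basic>
      <advanced>XLink lets elements reference remote resources.</advanced>
    </lesson>""")

    stylesheet = etree.XML("""
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:param name="level" select="'novice'"/>
      <xsl:template match="/lesson">
        <html><body>
          <p><xsl:value-of select="basic"/></p>
          <xsl:if test="$level = 'expert'">
            <p><xsl:value-of select="advanced"/></p>
          </xsl:if>
        </body></html>
      </xsl:template>
    </xsl:stylesheet>""")

    transform = etree.XSLT(stylesheet)
    page = transform(content, level=etree.XSLT.strparam("expert"))
    print(etree.tostring(page, pretty_print=True).decode())
    ```

    In a full adaptive system the parameter value would presumably come from the stored user model rather than being passed as a literal, as it is here.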

  16. 42 CFR 433.116 - FFP for operation of mechanized claims processing and information retrieval systems.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... and information retrieval systems. 433.116 Section 433.116 Public Health CENTERS FOR MEDICARE... FISCAL ADMINISTRATION Mechanized Claims Processing and Information Retrieval Systems § 433.116 FFP for operation of mechanized claims processing and information retrieval systems. (a) Subject to paragraph (j) of...

  17. 7 CFR 277.18 - Establishment of an Automated Data Processing (ADP) and Information Retrieval System.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ...) and Information Retrieval System. 277.18 Section 277.18 Agriculture Regulations of the Department of... Data Processing (ADP) and Information Retrieval System. (a) Scope and application. This section... costs of planning, design, development or installation of ADP and information retrieval systems if the...

  18. 42 CFR 433.116 - FFP for operation of mechanized claims processing and information retrieval systems.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... and information retrieval systems. 433.116 Section 433.116 Public Health CENTERS FOR MEDICARE... FISCAL ADMINISTRATION Mechanized Claims Processing and Information Retrieval Systems § 433.116 FFP for operation of mechanized claims processing and information retrieval systems. (a) Subject to paragraph (j) of...

  19. 7 CFR 277.18 - Establishment of an Automated Data Processing (ADP) and Information Retrieval System.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ...) and Information Retrieval System. 277.18 Section 277.18 Agriculture Regulations of the Department of... Data Processing (ADP) and Information Retrieval System. (a) Scope and application. This section... costs of planning, design, development or installation of ADP and information retrieval systems if the...

  20. 42 CFR 433.116 - FFP for operation of mechanized claims processing and information retrieval systems.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... and information retrieval systems. 433.116 Section 433.116 Public Health CENTERS FOR MEDICARE... FISCAL ADMINISTRATION Mechanized Claims Processing and Information Retrieval Systems § 433.116 FFP for operation of mechanized claims processing and information retrieval systems. (a) Subject to paragraph (j) of...

  1. 7 CFR 277.18 - Establishment of an Automated Data Processing (ADP) and Information Retrieval System.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ...) and Information Retrieval System. 277.18 Section 277.18 Agriculture Regulations of the Department of... Data Processing (ADP) and Information Retrieval System. (a) Scope and application. This section... costs of planning, design, development or installation of ADP and information retrieval systems if the...

  2. 42 CFR 433.116 - FFP for operation of mechanized claims processing and information retrieval systems.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... and information retrieval systems. 433.116 Section 433.116 Public Health CENTERS FOR MEDICARE... FISCAL ADMINISTRATION Mechanized Claims Processing and Information Retrieval Systems § 433.116 FFP for operation of mechanized claims processing and information retrieval systems. (a) Subject to paragraph (j) of...

  3. 7 CFR 277.18 - Establishment of an Automated Data Processing (ADP) and Information Retrieval System.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ...) and Information Retrieval System. 277.18 Section 277.18 Agriculture Regulations of the Department of... Data Processing (ADP) and Information Retrieval System. (a) Scope and application. This section... costs of planning, design, development or installation of ADP and information retrieval systems if the...

  4. A Survey and Analysis of Access Control Architectures for XML Data

    DTIC Science & Technology

    2006-03-01

    13 4. XML Query Engines ...castle and the drawbridge over the moat. Extending beyond the visual analogy, there are many key components to the protection of information and...technology. While XML’s original intent was to enable large-scale electronic publishing over the internet, its functionality is firmly rooted in its

  5. XML and Bibliographic Data: The TVS (Transport, Validation and Services) Model.

    ERIC Educational Resources Information Center

    de Carvalho, Joaquim; Cordeiro, Maria Ines

    This paper discusses the role of XML in library information systems at three major levels: as a representation language that enables the transport of bibliographic data in a way that is technologically independent and universally understood across systems and domains; as a language that enables the specification of complex validation rules…

  6. Automating data acquisition into ontologies from pharmacogenetics relational data sources using declarative object definitions and XML.

    PubMed

    Rubin, Daniel L; Hewett, Micheal; Oliver, Diane E; Klein, Teri E; Altman, Russ B

    2002-01-01

    Ontologies are useful for organizing large numbers of concepts having complex relationships, such as the breadth of genetic and clinical knowledge in pharmacogenomics. But because ontologies change and knowledge evolves, it is time consuming to maintain stable mappings to external data sources that are in relational format. We propose a method for interfacing ontology models with data acquisition from external relational data sources. This method uses a declarative interface between the ontology and the data source, and this interface is modeled in the ontology and implemented using XML schema. Data is imported from the relational source into the ontology using XML, and data integrity is checked by validating the XML submission with an XML schema. We have implemented this approach in PharmGKB (http://www.pharmgkb.org/), a pharmacogenetics knowledge base. Our goals were to (1) import genetic sequence data, collected in relational format, into the pharmacogenetics ontology, and (2) automate the process of updating the links between the ontology and data acquisition when the ontology changes. We tested our approach by linking PharmGKB with data acquisition from a relational model of genetic sequence information. The ontology subsequently evolved, and we were able to rapidly update our interface with the external data and continue acquiring the data. Similar approaches may be helpful for integrating other heterogeneous information sources in order to make the diversity of pharmacogenetics data amenable to computational analysis.
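
    The validation step described above, checking an XML submission against an XML Schema before import, can be illustrated generically with lxml; the schema and submission below are invented examples rather than PharmGKB data.

    ```python
    # Generic illustration of the validation step described above: an incoming
    # XML submission is checked against an XML Schema before being imported.
    # The schema and the submission are invented examples, not PharmGKB data.
    from lxml import etree

    schema_doc = etree.XML("""
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
      <xs:element name="sequence">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="gene" type="xs:string"/>
            <xs:element name="length" type="xs:positiveInteger"/>
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    </xs:schema>""")
    schema = etree.XMLSchema(schema_doc)

    submission = etree.XML(
        "<sequence><gene>CYP2D6</gene><length>4409</length></sequence>")
    if schema.validate(submission):
        print("submission accepted for import")
    else:
        print(schema.error_log)
    ```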

  7. Cytometry metadata in XML

    NASA Astrophysics Data System (ADS)

    Leif, Robert C.; Leif, Stephanie H.

    2016-04-01

    Introduction: The International Society for Advancement of Cytometry (ISAC) has created a standard for the Minimum Information about a Flow Cytometry Experiment (MIFlowCyt 1.0). CytometryML will serve as a common metadata standard for flow and image cytometry (digital microscopy). Methods: The MIFlowCyt data-types were created, as is the rest of CytometryML, in the XML Schema Definition Language (XSD1.1). The datatypes are primarily based on the Flow Cytometry and the Digital Imaging and Communications in Medicine (DICOM) standards. A small section of the code was formatted with standard HTML formatting elements (p, h1, h2, etc.). Results: 1) The part of MIFlowCyt that describes the Experimental Overview, including the specimen and substantial parts of several other major elements, has been implemented as CytometryML XML schemas (www.cytometryml.org). 2) The feasibility of using MIFlowCyt to provide the combination of an overview, table of contents, and/or an index of a scientific paper or a report has been demonstrated. Previously, a sample electronic publication, EPUB, was created that could contain both MIFlowCyt metadata as well as the binary data. Conclusions: The use of CytometryML technology together with XHTML5 and CSS permits the metadata to be directly formatted and, together with the binary data, to be stored in an EPUB container. This will facilitate formatting, data-mining, presentation, data verification, and inclusion in structured research, clinical, and regulatory documents, as well as demonstrate a publication's adherence to the MIFlowCyt standard and promote interoperability, and should also result in the textual and numeric data being published using web technology without any change in composition.

  8. 42 CFR 433.116 - FFP for operation of mechanized claims processing and information retrieval systems.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... and information retrieval systems. 433.116 Section 433.116 Public Health CENTERS FOR MEDICARE... FISCAL ADMINISTRATION Mechanized Claims Processing and Information Retrieval Systems § 433.116 FFP for operation of mechanized claims processing and information retrieval systems. (a) Subject to 42 CFR 433.113(c...

  9. Comparing the quality of accessing medical literature using content-based visual and textual information retrieval

    NASA Astrophysics Data System (ADS)

    Müller, Henning; Kalpathy-Cramer, Jayashree; Kahn, Charles E., Jr.; Hersh, William

    2009-02-01

    Content-based visual information (or image) retrieval (CBIR) has been an extremely active research domain within medical imaging over the past ten years, with the goal of improving the management of visual medical information. Many technical solutions have been proposed, and application scenarios for image retrieval as well as image classification have been set up. However, in contrast to medical information retrieval using textual methods, visual retrieval has only rarely been applied in clinical practice. This is despite the large amount and variety of visual information produced in hospitals every day. This information overload imposes a significant burden upon clinicians, and CBIR technologies have the potential to help the situation. However, in order for CBIR to become an accepted clinical tool, it must demonstrate a higher level of technical maturity than it has to date. Since 2004, the ImageCLEF benchmark has included a task for the comparison of visual information retrieval algorithms for medical applications. In 2005, a task for medical image classification was introduced and both tasks have been run successfully for the past four years. These benchmarks allow an annual comparison of visual retrieval techniques based on the same data sets and the same query tasks, enabling the meaningful comparison of various retrieval techniques. The datasets used from 2004-2007 contained images and annotations from medical teaching files. In 2008, however, the dataset used was made up of 67,000 images (along with their associated figure captions and the full text of their corresponding articles) from two Radiological Society of North America (RSNA) scientific journals. This article describes the results of the medical image retrieval task of the ImageCLEF 2008 evaluation campaign. We compare the retrieval results of both visual and textual information retrieval systems from 15 research groups on the aforementioned data set. The results show clearly that, currently

  10. Millennial Students' Mental Models of Information Retrieval

    ERIC Educational Resources Information Center

    Holman, Lucy

    2009-01-01

    This qualitative study examines first-year college students' online search habits in order to identify patterns in millennials' mental models of information retrieval. The study employed a combination of modified contextual inquiry and concept mapping methodologies to elicit students' mental models. The researcher confirmed previously observed…

  11. On Information Retrieval (IR) Systems: Revisiting Their Development, Evaluation Methodologies, and Assumptions (SIGs LAN, ED).

    ERIC Educational Resources Information Center

    Stirling, Keith

    2000-01-01

    Describes a session on information retrieval systems that planned to discuss relevance measures with Web-based information retrieval; retrieval system performance and evaluation; probabilistic independence of index terms; vector-based models; metalanguages and digital objects; how users assess the reliability, timeliness and bias of information;…

  12. Dissociations within human hippocampal subregions during encoding and retrieval of spatial information.

    PubMed

    Suthana, Nanthia; Ekstrom, Arne; Moshirvaziri, Saba; Knowlton, Barbara; Bookheimer, Susan

    2011-07-01

    Although the hippocampus is critical for the formation and retrieval of spatial memories, it is unclear how subregions are differentially involved in these processes. Previous high-resolution functional magnetic resonance imaging (fMRI) studies have shown that CA2, CA3, and dentate gyrus (CA23DG) regions support the encoding of novel associations, whereas the subicular cortices support the retrieval of these learned associations. Whether these subregions are used in humans during encoding and retrieval of spatial information has yet to be explored. Using high-resolution fMRI (1.6 mm × 1.6-mm in-plane), we found that activity within the right CA23DG increased during encoding compared to retrieval. Conversely, right subicular activity increased during retrieval compared to encoding of spatial associations. These results are consistent with the previous studies illustrating dissociations within human hippocampal subregions and further suggest that these regions are similarly involved during the encoding and retrieval of spatial information. Copyright © 2010 Wiley-Liss, Inc.

  13. Multispectral information for gas and aerosol retrieval from TANSO-FTS instrument

    NASA Astrophysics Data System (ADS)

    Herbin, H.; Labonnote, L. C.; Dubuisson, P.

    2012-11-01

    The Greenhouse gases Observing SATellite (GOSAT) mission, and in particular the TANSO-FTS instrument, has the advantage of measuring the same field of view simultaneously in different spectral ranges with high spectral resolution. These features are promising for improving not only gaseous retrievals in clear-sky or scattering atmospheres, but also for retrieving aerosol parameters. Therefore, this paper is dedicated to an Information Content (IC) analysis of the potential synergy between the thermal infrared, shortwave infrared and visible, in order to obtain a more accurate retrieval of gas and aerosol. The analysis is based on Shannon theory and uses a sophisticated radiative transfer algorithm developed at the "Laboratoire d'Optique Atmosphérique", dealing with multiple scattering. This forward model can be coupled to an optimal estimation method, which allows gas profiles and aerosol granulometry and concentration to be retrieved simultaneously. The analysis of the information provided by the spectral synergy is based on a climatology of dust, volcanic ash and biomass burning aerosols. This work was conducted in order to develop a powerful tool that retrieves not only the gas concentrations but also the aerosol characteristics by selecting the so-called "best channels", i.e. the channels that carry most of the information concerning gas and aerosol. The methodology developed in this paper could also be used to define the specifications of future high spectral resolution missions to reach a given accuracy on retrieved parameters.

  14. Factors Influencing Successful Use of Information Retrieval Systems by Nurse Practitioner Students

    PubMed Central

    Rose, Linda; Crabtree, Katherine; Hersh, William

    1998-01-01

    This study examined whether a relationship exists between selected Nurse Practitioner students' attributes and successful information retrieval as demonstrated by correctly answering clinical questions using an information retrieval system (Medline). One predictor variable, attitude toward current computer technology, was significantly correlated (r =0.43, p ≤ .05) with successful literature searching.

  15. Issues and solutions for storage, retrieval, and searching of MPEG-7 documents

    NASA Astrophysics Data System (ADS)

    Chang, Yuan-Chi; Lo, Ming-Ling; Smith, John R.

    2000-10-01

    The ongoing MPEG-7 standardization activity aims at creating a standard for describing multimedia content in order to facilitate the interpretation of the associated information content. Attempting to address a broad range of applications, MPEG-7 has defined a flexible framework consisting of Descriptors, Description Schemes, and Description Definition Language. Descriptors and Description Schemes describe features, structure and semantics of multimedia objects. They are written in the Description Definition Language (DDL). In the most recent revision, DDL applies XML (Extensible Markup Language) Schema with MPEG-7 extensions. DDL has constructs that support inclusion, inheritance, reference, enumeration, choice, sequence, and abstract types of Description Schemes and Descriptors. In order to enable multimedia systems to use MPEG-7, a number of important problems in storing, retrieving and searching MPEG-7 documents need to be solved. This paper reports on initial findings on issues and solutions for storing and accessing MPEG-7 documents. In particular, we discuss the benefits of using a virtual document management framework based on XML Access Server (XAS) in order to bridge MPEG-7 multimedia applications and database systems. The need arises partly because MPEG-7 descriptions need customized storage schemas, indexing and search engines. We also discuss issues arising in managing dependence and cross-description scheme search.

  16. Requirements for SPIRES II. An External Specification for the Stanford Public Information Retrieval System.

    ERIC Educational Resources Information Center

    Parker, Edwin B.

    SPIRES (Stanford Public Information Retrieval System) is a computerized information storage and retrieval system intended for use by students and faculty members who have little knowledge of computers but who need rapid and sophisticated retrieval and analysis. The functions and capabilities of the system from the user's point of view are…

  17. GOSAT CO2 retrieval results using TANSO-CAI aerosol information over East Asia

    NASA Astrophysics Data System (ADS)

    KIM, M.; Kim, W.; Jung, Y.; Lee, S.; Kim, J.; Lee, H.; Boesch, H.; Goo, T. Y.

    2015-12-01

    In the satellite remote sensing of CO2, incorrect aerosol information can induce large errors, as previous studies have suggested. Many factors, such as aerosol type, the wavelength dependence of AOD, and aerosol polarization effects, have been identified as main error sources. Due to these aerosol effects, a large number of retrievals are screened out in quality control, or retrieval errors tend to increase if they are not screened out, especially in East Asia where aerosol concentrations are fairly high. To reduce these aerosol-induced errors, a CO2 retrieval algorithm using simultaneous TANSO-CAI aerosol information is developed. This algorithm adopts AOD and aerosol type information as a priori information from the CAI aerosol retrieval algorithm. The CO2 retrieval algorithm is based on the optimal estimation method and VLIDORT, a vector discrete ordinate radiative transfer model. The CO2 algorithm, developed with various state vectors to find accurate CO2 concentrations, shows reasonable results when compared with other datasets. This study concentrates on the validation of retrieved results against ground-based TCCON measurements in East Asia and the comparison with previous retrievals from ACOS, NIES, and UoL. Although the retrieved CO2 concentrations are lower than previous results by a few ppm, they show a similar trend and high correlation with previous results. Retrieved data and TCCON measurements are compared at the three stations of Tsukuba, Saga, and Anmyeondo in East Asia, with collocation criteria of ±2° in latitude/longitude and ±1 hour of GOSAT passing time. The compared results also show a similar trend with good correlation. Based on the TCCON comparison results, a bias correction equation is calculated and applied to the East Asia data.
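
    The collocation criteria quoted above translate directly into a simple filter; the sketch below uses invented coordinates and times and only illustrates the ±2°/±1 hour matching rule.

    ```python
    # Straightforward implementation of the collocation criteria quoted above:
    # a satellite sounding is matched to a TCCON site if it falls within
    # +/-2 deg in latitude/longitude and +/-1 hour of the overpass time.
    # The coordinates and times below are invented.
    from datetime import datetime, timedelta

    def collocated(sat, site, max_deg=2.0, max_dt=timedelta(hours=1)):
        """sat and site are (lat, lon, time) tuples."""
        return (abs(sat[0] - site[0]) <= max_deg
                and abs(sat[1] - site[1]) <= max_deg
                and abs(sat[2] - site[2]) <= max_dt)

    tsukuba = (36.05, 140.12, datetime(2013, 7, 1, 3, 30))
    sounding = (36.80, 139.50, datetime(2013, 7, 1, 4, 10))
    print(collocated(sounding, tsukuba))   # True: within 2 deg and 1 hour
    ```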

  18. Autocorrelation and Regularization of Query-Based Information Retrieval Scores

    DTIC Science & Technology

    2008-02-01

    of the most general information retrieval models [Salton, 1968]. By treating a query as a very short document, documents and queries can be rep... Salton, 1971]. In the context of single link hierarchical clustering, Jardine and van Rijsbergen showed that ranking all k clusters and retrieving a...a document about “dogs”, then the system will always miss this document when a user queries “dog”. Salton recognized that a document’s representation

  19. Information Retrieval and Text Mining Technologies for Chemistry.

    PubMed

    Krallinger, Martin; Rabal, Obdulia; Lourenço, Anália; Oyarzabal, Julen; Valencia, Alfonso

    2017-06-28

    Efficient access to chemical information contained in scientific literature, patents, technical reports, or the web is a pressing need shared by researchers and patent attorneys from different chemical disciplines. Retrieval of important chemical information in most cases starts with finding relevant documents for a particular chemical compound or family. Targeted retrieval of chemical documents is closely connected to the automatic recognition of chemical entities in the text, which commonly involves the extraction of the entire list of chemicals mentioned in a document, including any associated information. In this Review, we provide a comprehensive and in-depth description of fundamental concepts, technical implementations, and current technologies for meeting these information demands. A strong focus is placed on community challenges addressing systems performance, more particularly CHEMDNER and CHEMDNER patents tasks of BioCreative IV and V, respectively. Considering the growing interest in the construction of automatically annotated chemical knowledge bases that integrate chemical information and biological data, cheminformatics approaches for mapping the extracted chemical names into chemical structures and their subsequent annotation together with text mining applications for linking chemistry with biological information are also presented. Finally, future trends and current challenges are highlighted as a roadmap proposal for research in this emerging field.

  20. Interactive Information Seeking and Retrieving: A Third Feedback Framework.

    ERIC Educational Resources Information Center

    Spink, Amanda

    1996-01-01

    Presents an overview of feedback within the cybernetics and social frameworks. These feedback concepts are then compared with the interactive feedback concept evolving within the framework of information seeking and retrieving, based on their conceptualization of the feedback loop and notion of information. (Author/AEF)

  1. Utilization of ontology look-up services in information retrieval for biomedical literature.

    PubMed

    Vishnyakova, Dina; Pasche, Emilie; Lovis, Christian; Ruch, Patrick

    2013-01-01

    With the vast amount of biomedical data, there is a clear need to improve information retrieval processes in the biomedical domain. The use of biomedical ontologies has facilitated the combination of various data sources (e.g. scientific literature, clinical data repositories) by increasing the quality of information retrieval and reducing maintenance efforts. In this context, we developed Ontology Look-up services (OLS) based on the NEWT and MeSH vocabularies. Our services were applied in information retrieval tasks such as gene/disease normalization. The implementation of the OLS services significantly accelerated the extraction of particular biomedical facts by structuring and enriching the data context. Precision in the normalization tasks was improved by about 20%.

  2. Information Retrieval Using UMLS-based Structured Queries

    PubMed Central

    Fagan, Lawrence M.; Berrios, Daniel C.; Chan, Albert; Cucina, Russell; Datta, Anupam; Shah, Maulik; Surendran, Sujith

    2001-01-01

    During the last three years, we have developed and described components of ELBook, a semantically based information-retrieval system [1-4]. Using these components, domain experts can specify a query model, indexers can use the query model to index documents, and end-users can search these documents for instances of indexed queries.

  3. Coordinating Council. Tenth Meeting: Information retrieval: The role of controlled vocabularies

    NASA Technical Reports Server (NTRS)

    1993-01-01

    The theme of this NASA Scientific and Technical Information Program Coordinating Council meeting was the role of controlled vocabularies (thesauri) in information retrieval. Included are summaries of the presentations and the accompanying visuals. Dr. Raya Fidel addressed 'Retrieval: Free Text, Full Text, and Controlled Vocabularies.' Dr. Bella Hass Weinberg spoke on 'Controlled Vocabularies and Thesaurus Standards.' The presentations were followed by a panel discussion with participation from NASA, the National Library of Medicine, the Defense Technical Information Center, and the Department of Energy; this discussion, however, is not summarized in any detail in this document.

  4. Transfer and distortion of atmospheric information in the satellite temperature retrieval problem

    NASA Technical Reports Server (NTRS)

    Thompson, O. E.

    1981-01-01

    A systematic approach to investigating the transfer of basic ambient temperature information and its distortion by satellite systems and subsequent analysis algorithms is discussed. The retrieval analysis cycle is derived, the variance spectrum of information is examined as it takes different forms in that process, and the quality and quantity of information existing at each step is compared with the initial ambient temperature information. Temperature retrieval algorithms can smooth, add, or further distort information, depending on how stable the algorithm is and how heavily it is influenced by a priori data.

  5. Nonmaterialized Relations and the Support of Information Retrieval Applications by Relational Database Systems.

    ERIC Educational Resources Information Center

    Lynch, Clifford A.

    1991-01-01

    Describes several aspects of the problem of supporting information retrieval system query requirements in the relational database management system (RDBMS) environment and proposes an extension to query processing called nonmaterialized relations. User interactions with information retrieval systems are discussed, and nonmaterialized relations are…

  6. An fMRI Study of Episodic Memory: Retrieval of Object, Spatial, and Temporal Information

    PubMed Central

    Hayes, Scott M.; Ryan, Lee; Schnyer, David M.; Nadel, Lynn

    2011-01-01

    Sixteen participants viewed a videotaped tour of 4 houses, highlighting a series of objects and their spatial locations. Participants were tested for memory of object, spatial, and temporal order information while undergoing functional Magnetic Resonance Imaging. Preferential activation was observed in right parahippocampal gyrus during the retrieval of spatial location information. Retrieval of contextual information (spatial location and temporal order) was associated with activation in right dorsolateral prefrontal cortex. In bilateral posterior parietal regions, greater activation was associated with processing of visual scenes, regardless of the memory judgment. These findings support current theories positing roles for frontal and medial temporal regions during episodic retrieval and suggest a specific role for the hippocampal complex in the retrieval of spatial location information. PMID:15506871

  7. Loss of Retrieval Information in Prose Recall.

    ERIC Educational Resources Information Center

    Sehulster, Jerome R.; And Others

    The purpose of this research was to experimentally manipulate input and output orders of information and separate storage and retrieval components of prose free recall. The cued partial recall method, used in word list recall, was adapted to a prose learning task. Four short biographical stories of about 55 words each were systematically combined…

  8. NeXML: rich, extensible, and verifiable representation of comparative data and metadata.

    PubMed

    Vos, Rutger A; Balhoff, James P; Caravas, Jason A; Holder, Mark T; Lapp, Hilmar; Maddison, Wayne P; Midford, Peter E; Priyam, Anurag; Sukumaran, Jeet; Xia, Xuhua; Stoltzfus, Arlin

    2012-07-01

    In scientific research, integration and synthesis require a common understanding of where data come from, how much they can be trusted, and what they may be used for. To make such an understanding computer-accessible requires standards for exchanging richly annotated data. The challenges of conveying reusable data are particularly acute in regard to evolutionary comparative analysis, which comprises an ever-expanding list of data types, methods, research aims, and subdisciplines. To facilitate interoperability in evolutionary comparative analysis, we present NeXML, an XML standard (inspired by the current standard, NEXUS) that supports exchange of richly annotated comparative data. NeXML defines syntax for operational taxonomic units, character-state matrices, and phylogenetic trees and networks. Documents can be validated unambiguously. Importantly, any data element can be annotated, to an arbitrary degree of richness, using a system that is both flexible and rigorous. We describe how the use of NeXML by the TreeBASE and Phenoscape projects satisfies user needs that cannot be satisfied with other available file formats. By relying on XML Schema Definition, the design of NeXML facilitates the development and deployment of software for processing, transforming, and querying documents. The adoption of NeXML for practical use is facilitated by the availability of (1) an online manual with code samples and a reference to all defined elements and attributes, (2) programming toolkits in most of the languages used commonly in evolutionary informatics, and (3) input-output support in several widely used software applications. An active, open, community-based development process enables future revision and expansion of NeXML.

  9. A Parallel Relational Database Management System Approach to Relevance Feedback in Information Retrieval.

    ERIC Educational Resources Information Center

    Lundquist, Carol; Frieder, Ophir; Holmes, David O.; Grossman, David

    1999-01-01

    Describes a scalable, parallel, relational database-driven information retrieval engine. To support portability across a wide range of execution environments, all algorithms adhere to the SQL-92 standard. By incorporating relevance feedback algorithms, accuracy is enhanced over prior database-driven information retrieval efforts. Presents…
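
    As a generic, self-contained illustration of SQL-driven retrieval with relevance weights (not the parallel engine described in the record), a single SQL-92-style aggregate join over a postings table can rank documents; the schema and data are invented.

    ```python
    # Minimal illustration of relational, SQL-driven retrieval: documents are
    # ranked by summing term weights through a join over a postings table.
    # SQLite and an invented schema are used here for self-containment.
    import sqlite3

    con = sqlite3.connect(":memory:")
    con.executescript("""
    CREATE TABLE posting (doc_id INTEGER, term TEXT, tf INTEGER);
    CREATE TABLE query   (term TEXT, weight REAL);
    INSERT INTO posting VALUES (1,'xml',3),(1,'retrieval',1),(2,'retrieval',4),(2,'database',2);
    INSERT INTO query   VALUES ('xml',1.0),('retrieval',0.5);
    """)

    rows = con.execute("""
        SELECT p.doc_id, SUM(p.tf * q.weight) AS score
        FROM posting p JOIN query q ON p.term = q.term
        GROUP BY p.doc_id
        ORDER BY score DESC
    """).fetchall()
    print(rows)   # e.g. [(1, 3.5), (2, 2.0)]
    ```

    Keeping the ranking expressible in standard SQL is what makes this style of engine portable across relational back ends, which appears to be the motivation for the SQL-92 restriction mentioned above.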

  10. Modeling and mining term association for improving biomedical information retrieval performance.

    PubMed

    Hu, Qinmin; Huang, Jimmy Xiangji; Hu, Xiaohua

    2012-06-11

    The growth of biomedical information requires most information retrieval systems to provide short and specific answers in response to complex user queries. Semantic information in the form of free text is structured in a way that makes it straightforward for humans to read but difficult for computers to interpret automatically and search efficiently. One of the reasons is that most traditional information retrieval models assume terms are conditionally independent given a document/passage. Therefore, we are motivated to consider term associations within different contexts to help the models understand semantic information and use it for improving biomedical information retrieval performance. We propose a term association approach to discover associations among the keywords of a query. The experiments are conducted on the TREC 2004-2007 Genomics data sets and the TREC 2004 HARD data set. The proposed approach is promising and achieves superiority over the baselines and the GSP results. The parameter settings and different indices are investigated: the sentence-based index produces the best results at the document level, the word-based index the best results at the passage level, and the paragraph-based index the best results at the passage2 level. Furthermore, the best term association results always come from the best baseline. The tuning number k in the proposed recursive re-ranking algorithm is discussed and locally optimized to be 10. First, modelling term association with factor analysis for improving biomedical information retrieval is one of the major contributions of our work. Second, the experiments confirm that term association considering co-occurrence and dependency among the keywords can produce better results than baselines treating the keywords independently. Third, the baselines are re-ranked according to the importance and reliance of latent factors behind term associations. These latent
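
    A minimal illustration of treating query keywords as associated rather than independent is a simple co-occurrence count over a corpus; the paper itself derives associations with factor analysis and a recursive re-ranking step, neither of which is reproduced in this invented toy example.

    ```python
    # Minimal illustration of term association via co-occurrence counts; the
    # paper's factor analysis and recursive re-ranking are not reproduced.
    # The corpus and query are invented.
    from itertools import combinations
    from collections import Counter

    docs = [
        "gene expression profiling in breast cancer",
        "breast cancer gene mutation analysis",
        "protein expression and cell signalling",
    ]
    query = ["gene", "expression", "cancer"]

    pair_counts = Counter()
    for doc in docs:
        terms = set(doc.split())
        for a, b in combinations(sorted(query), 2):
            if a in terms and b in terms:
                pair_counts[(a, b)] += 1

    # Pairs that co-occur often are treated as associated, not independent.
    for pair, count in pair_counts.most_common():
        print(pair, count)
    ```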

  11. Modeling and mining term association for improving biomedical information retrieval performance

    PubMed Central

    2012-01-01

    Background The growth of biomedical information requires most information retrieval systems to provide short and specific answers in response to complex user queries. Semantic information in the form of free text is structured in a way that makes it straightforward for humans to read but difficult for computers to interpret automatically and search efficiently. One of the reasons is that most traditional information retrieval models assume terms are conditionally independent given a document/passage. Therefore, we are motivated to consider term associations within different contexts to help the models understand semantic information and use it for improving biomedical information retrieval performance. Results We propose a term association approach to discover associations among the keywords of a query. The experiments are conducted on the TREC 2004-2007 Genomics data sets and the TREC 2004 HARD data set. The proposed approach is promising and achieves superiority over the baselines and the GSP results. The parameter settings and different indices are investigated: the sentence-based index produces the best results at the document level, the word-based index the best results at the passage level, and the paragraph-based index the best results at the passage2 level. Furthermore, the best term association results always come from the best baseline. The tuning number k in the proposed recursive re-ranking algorithm is discussed and locally optimized to be 10. Conclusions First, modelling term association with factor analysis for improving biomedical information retrieval is one of the major contributions of our work. Second, the experiments confirm that term association considering co-occurrence and dependency among the keywords can produce better results than baselines treating the keywords independently. Third, the baselines are re-ranked according to the importance and reliance of latent factors behind

  12. Information Retrieval as Hypermedia: An Outline of InterBrowse.

    ERIC Educational Resources Information Center

    Kahn, Paul

    InterBrowse, a uniform interface information retrieval application for several different databases, is designed to be used in Intermedia, a hypermedia environment currently under development at Brown University's Institute for Research Information and Scholarship. This application arose out of the recognized need for an interface that can be used…

  13. Developmental Differences in the Use of Retrieval Cues to Describe Episodic Information in Memory.

    ERIC Educational Resources Information Center

    Ackerman, Brian P.; Rathburn, Jill

    1984-01-01

    Examines reasons why second and fourth grade students use cues relatively ineffectively to retrieve episodic information. Four experiments tested the hypothesis that retrieval cue effectiveness varies with the extent to which cue information describes event information in memory. Results showed that problems of discriminability and…

  14. Towards health care process description framework: an XML DTD design.

    PubMed Central

    Staccini, P.; Joubert, M.; Quaranta, J. F.; Aymard, S.; Fieschi, D.; Fieschi, M.

    2001-01-01

    The development of health care and hospital information systems has to meet users' needs as well as requirements such as the tracking of all care activities and the support of quality improvement. The use of process-oriented analysis is of value to provide analysts with: (i) a systematic description of activities; (ii) the elicitation of the useful data to perform and record care tasks; (iii) the selection of relevant decision-making support. But paper-based tools are not a very suitable way to manage and share the documentation produced during this step. The purpose of this work is to propose a method to implement the results of process analysis according to XML techniques (eXtensible Markup Language). It is based on the IDEF0 activity modeling language (Integration DEfinition for Function modeling). A hierarchical description of a process and its components has been defined through a flat XML file with a grammar of proper metadata tags. Perspectives of this method are discussed. PMID:11825265
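
    A hypothetical sketch of such a hierarchical, IDEF0-style process description serialized as XML is shown below; the element and attribute names are illustrative assumptions, not the DTD proposed by the authors.

    ```python
    # Hypothetical sketch of a hierarchical, IDEF0-style activity description
    # serialized as XML; the element and attribute names are assumptions, not
    # the DTD proposed by the authors.
    import xml.etree.ElementTree as ET

    process = ET.Element("process", name="drug prescription")
    a1 = ET.SubElement(process, "activity", id="A1", name="order drug")
    ET.SubElement(a1, "input", name="patient record")
    ET.SubElement(a1, "output", name="prescription")
    a11 = ET.SubElement(a1, "activity", id="A11", name="check allergies")
    ET.SubElement(a11, "control", name="allergy guideline")

    ET.indent(process)            # Python 3.9+: pretty-print the tree
    print(ET.tostring(process, encoding="unicode"))
    ```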

  15. XBRL: Beyond Basic XML

    ERIC Educational Resources Information Center

    VanLengen, Craig Alan

    2010-01-01

    The Securities and Exchange Commission (SEC) has recently announced a proposal that will require all public companies to report their financial data in Extensible Business Reporting Language (XBRL). XBRL is an extension of Extensible Markup Language (XML). Moving to a standard reporting format makes it easier for organizations to report the…

  16. Development of a full-text information retrieval system

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Keizo Oyama; AKira Miyazawa, Atsuhiro Takasu; Kouji Shibano

    The authors have executed a project to realize a full-text information retrieval system. The system is designed to deal with a document database comprising full text of a large number of documents such as academic papers. The document structures are utilized in searching and extracting appropriate information. The concept of structure handling and the configuration of the system are described in this paper.

  17. EquiX-A Search and Query Language for XML.

    ERIC Educational Resources Information Center

    Cohen, Sara; Kanza, Yaron; Kogan, Yakov; Sagiv, Yehoshua; Nutt, Werner; Serebrenik, Alexander

    2002-01-01

    Describes EquiX, a search language for XML that combines querying with searching to query the data and the meta-data content of Web pages. Topics include search engines; a data model for XML documents; search query syntax; search query semantics; an algorithm for evaluating a query on a document; and indexing EquiX queries. (LRW)

  18. The Development of Online Information Retrieval Services in the People's Republic of China.

    ERIC Educational Resources Information Center

    Xiaocun, Lu

    1986-01-01

    Assesses the promotion and development of online information retrieval in China. Highlights include opening of the first online retrieval center at China Overseas Building Development Company Limited; establishment and activities of a cooperative network; online retrieval seminars; telecommunication lines and terminal installations; and problems…

  19. Information-rich spectral channels for simulated retrievals of partial column-averaged methane

    NASA Astrophysics Data System (ADS)

    Su, Zhan; Xi, Xi; Natraj, Vijay; Li, King-Fai; Shia, Run-Lie; Miller, Charles E.; Yung, Yuk L.

    2016-01-01

    Space-based remote sensing of the column-averaged methane dry air mole fraction (XCH4) has greatly increased our understanding of the spatiotemporal patterns in the global methane cycle. The potential to retrieve multiple pieces of vertical profile information would further improve the quantification of CH4 across space-time scales. We conduct information analysis for channel selection and evaluate the prospects of retrieving multiple pieces of information as well as total column CH4 from both ground-based and space-based near-infrared remote sensing spectra. We analyze the degrees of freedom of signal (DOF) in the CH4 absorption bands near 2.3 μm and 1.6 μm and select ~1% of the channels that contain >95% of the information about the CH4 profile. The DOF is around 4 for fine ground-based spectra (resolution = 0.01 cm-1) and 3 for coarse space-based spectra (resolution = 0.20 cm-1) based on channel selection and a signal-to-noise ratio (SNR) of 300. The DOF varies from 2.2 to 3.2 when SNR is between 100 and 300, and spectral resolution is 0.20 cm-1. Simulated retrieval tests in clear-sky conditions using the selected channels reveal that the retrieved partial column-averaged CH4 values are not sensitive to the a priori profiles and can reflect local enhancements of CH4 in different partial air columns. Both the total and partial column-averaged retrieval errors in all tests are within 1% of the true state. These simulated tests highlight the possibility of retrieving up to three to four pieces of information about the vertical distribution of CH4 in reality.
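
    For background, the degrees of freedom of signal used above is commonly defined in the standard optimal-estimation framework as the trace of the averaging kernel matrix; the formulation below is general textbook material and is not quoted from the paper itself.

    ```latex
    % General optimal-estimation background (not quoted from the paper): the
    % degrees of freedom of signal d_s is the trace of the averaging kernel
    % matrix A, built from the Jacobian K, the measurement noise covariance
    % S_epsilon and the a priori covariance S_a.
    \[
      d_s = \operatorname{tr}(\mathbf{A}), \qquad
      \mathbf{A} = \left(\mathbf{K}^{\mathsf{T}}\mathbf{S}_{\epsilon}^{-1}\mathbf{K}
                   + \mathbf{S}_{a}^{-1}\right)^{-1}
                   \mathbf{K}^{\mathsf{T}}\mathbf{S}_{\epsilon}^{-1}\mathbf{K}
    \]
    ```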

  20. Mutual information based feature selection for medical image retrieval

    NASA Astrophysics Data System (ADS)

    Zhi, Lijia; Zhang, Shaomin; Li, Yan

    2018-04-01

    In this paper, the authors propose a mutual information based method for lung CT image retrieval. The method is designed to adapt to different datasets and different retrieval tasks. For practical applicability, the method avoids using a large amount of training data; instead, with a well-designed training process and robust fundamental features and measurements, it achieves promising performance while keeping the training computation economical. Experimental results show that the method has potential practical value for routine clinical application.
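
    As a generic illustration of mutual-information-based feature selection (not the authors' specific features or training process), scikit-learn can rank candidate features by their mutual information with the class labels.

    ```python
    # Generic illustration of mutual-information-based feature selection using
    # scikit-learn; the features and labels are random stand-ins, not the lung
    # CT descriptors used in the paper.
    import numpy as np
    from sklearn.feature_selection import mutual_info_classif

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 10))          # 200 images x 10 candidate features
    y = (X[:, 3] + 0.1 * rng.normal(size=200) > 0).astype(int)  # label tied to feature 3

    mi = mutual_info_classif(X, y, random_state=0)
    top = np.argsort(mi)[::-1][:3]          # keep the 3 most informative features
    print("MI scores:", np.round(mi, 3))
    print("selected feature indices:", top)
    ```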

  1. A passage retrieval method based on probabilistic information retrieval model and UMLS concepts in biomedical question answering.

    PubMed

    Sarrouti, Mourad; Ouatik El Alaoui, Said

    2017-04-01

    Passage retrieval, the identification of top-ranked passages that may contain the answer for a given biomedical question, is a crucial component for any biomedical question answering (QA) system. Passage retrieval in open-domain QA is a longstanding challenge widely studied over the last decades. However, it still requires further efforts in biomedical QA. In this paper, we present a new biomedical passage retrieval method based on Stanford CoreNLP sentence/passage length, a probabilistic information retrieval (IR) model and UMLS concepts. In the proposed method, we first use our document retrieval system, based on the PubMed search engine and UMLS similarity, to retrieve documents relevant to a given biomedical question. We then take the abstracts from the retrieved documents and use the Stanford CoreNLP sentence splitter to obtain a set of sentences, i.e., candidate passages. Using stemmed words and UMLS concepts as features for the BM25 model, we finally compute the similarity scores between the biomedical question and each of the candidate passages and keep the N top-ranked ones. Experimental evaluations performed on large standard datasets, provided by the BioASQ challenge, show that the proposed method achieves good performance compared with the current state-of-the-art methods. The proposed method significantly outperforms the current state-of-the-art methods by an average of 6.84% in terms of mean average precision (MAP). We have proposed an efficient passage retrieval method which can be used to retrieve relevant passages in biomedical QA systems with high mean average precision. Copyright © 2017 Elsevier Inc. All rights reserved.
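
    The core scoring step, BM25 over candidate passages, can be sketched compactly in plain Python; this version uses raw tokens only and omits the UMLS concept features, stemming and CoreNLP splitting used in the paper.

    ```python
    # Compact BM25 scoring of candidate passages (raw tokens only; the paper
    # also adds stemming and UMLS concept identifiers as features).
    import math
    from collections import Counter

    def bm25_scores(query, passages, k1=1.2, b=0.75):
        tokenized = [p.lower().split() for p in passages]
        n = len(tokenized)
        avgdl = sum(len(t) for t in tokenized) / n
        df = Counter()
        for toks in tokenized:
            df.update(set(toks))
        scores = []
        for toks in tokenized:
            tf = Counter(toks)
            s = 0.0
            for term in query.lower().split():
                if term not in tf:
                    continue
                idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
                s += idf * tf[term] * (k1 + 1) / (
                    tf[term] + k1 * (1 - b + b * len(toks) / avgdl))
            scores.append(s)
        return scores

    passages = [
        "metformin is a first line treatment for type 2 diabetes",
        "insulin resistance is a hallmark of type 2 diabetes",
        "aspirin reduces the risk of myocardial infarction",
    ]
    print(bm25_scores("treatment of type 2 diabetes", passages))
    ```

    In the method described above, the candidate passages would come from CoreNLP sentence splitting of retrieved abstracts, and the token streams would presumably include UMLS concept identifiers alongside stemmed words.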

  2. An XML transfer schema for exchange of genomic and genetic mapping data: implementation as a web service in a Taverna workflow.

    PubMed

    Paterson, Trevor; Law, Andy

    2009-08-14

    Genomic analysis, particularly for less well-characterized organisms, is greatly assisted by performing comparative analyses between different types of genome maps and across species boundaries. Various providers publish a plethora of on-line resources collating genome mapping data from a multitude of species. Datasources range in scale and scope from small bespoke resources for particular organisms, through larger web-resources containing data from multiple species, to large-scale bioinformatics resources providing access to data derived from genome projects for model and non-model organisms. The heterogeneity of information held in these resources reflects both the technologies used to generate the data and the target users of each resource. Currently there is no common information exchange standard or protocol to enable access and integration of these disparate resources. Consequently data integration and comparison must be performed in an ad hoc manner. We have developed a simple generic XML schema (GenomicMappingData.xsd - GMD) to allow export and exchange of mapping data in a common lightweight XML document format. This schema represents the various types of data objects commonly described across mapping datasources and provides a mechanism for recording relationships between data objects. The schema is sufficiently generic to allow representation of any map type (for example genetic linkage maps, radiation hybrid maps, sequence maps and physical maps). It also provides mechanisms for recording data provenance and for cross referencing external datasources (including for example ENSEMBL, PubMed and Genbank.). The schema is extensible via the inclusion of additional datatypes, which can be achieved by importing further schemas, e.g. a schema defining relationship types. We have built demonstration web services that export data from our ArkDB database according to the GMD schema, facilitating the integration of data retrieval into Taverna workflows. The data
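
    A purely hypothetical instance document in the spirit of such a map-exchange format is shown below; the element names are not the GMD schema's actual vocabulary and only illustrate parsing a document retrieved from such a web service.

    ```python
    # Purely hypothetical instance document in the spirit of a genetic-map
    # exchange format; the element names below are NOT the GMD schema's real
    # vocabulary, they only illustrate parsing a retrieved document.
    import xml.etree.ElementTree as ET

    doc = ET.fromstring("""
    <mapset species="Gallus gallus">
      <map name="chr1-linkage" type="genetic">
        <marker name="MCW0010" position="12.5"/>
        <marker name="ADL0328" position="47.2"/>
      </map>
    </mapset>""")

    for m in doc.iter("marker"):
        print(m.get("name"), float(m.get("position")))
    ```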

  3. An XML transfer schema for exchange of genomic and genetic mapping data: implementation as a web service in a Taverna workflow

    PubMed Central

    Paterson, Trevor; Law, Andy

    2009-01-01

    Background Genomic analysis, particularly for less well-characterized organisms, is greatly assisted by performing comparative analyses between different types of genome maps and across species boundaries. Various providers publish a plethora of on-line resources collating genome mapping data from a multitude of species. Datasources range in scale and scope from small bespoke resources for particular organisms, through larger web-resources containing data from multiple species, to large-scale bioinformatics resources providing access to data derived from genome projects for model and non-model organisms. The heterogeneity of information held in these resources reflects both the technologies used to generate the data and the target users of each resource. Currently there is no common information exchange standard or protocol to enable access and integration of these disparate resources. Consequently data integration and comparison must be performed in an ad hoc manner. Results We have developed a simple generic XML schema (GenomicMappingData.xsd – GMD) to allow export and exchange of mapping data in a common lightweight XML document format. This schema represents the various types of data objects commonly described across mapping datasources and provides a mechanism for recording relationships between data objects. The schema is sufficiently generic to allow representation of any map type (for example genetic linkage maps, radiation hybrid maps, sequence maps and physical maps). It also provides mechanisms for recording data provenance and for cross referencing external datasources (including for example ENSEMBL, PubMed and Genbank.). The schema is extensible via the inclusion of additional datatypes, which can be achieved by importing further schemas, e.g. a schema defining relationship types. We have built demonstration web services that export data from our ArkDB database according to the GMD schema, facilitating the integration of data retrieval into Taverna

  4. XML technologies for the Omaha System: a data model, a Java tool and several case studies supporting home healthcare.

    PubMed

    Vittorini, Pierpaolo; Tarquinio, Antonietta; di Orio, Ferdinando

    2009-03-01

    The eXtensible Markup Language (XML) is a metalanguage that is useful for representing and exchanging data between heterogeneous systems. XML may enable healthcare practitioners to document, monitor, evaluate, and archive medical information and services in distributed computer environments. Therefore, the most recent proposals on electronic health records (EHRs) are usually based on XML documents. Since none of the existing nomenclatures were specifically developed for use in automated clinical information systems, but were adapted to such use, numerous current EHRs are organized as a sequence of events, each represented through codes taken from international classification systems. In nursing, a hierarchically organized problem-solving approach is followed, which couples poorly with the sequential organization of such EHRs. Therefore, the paper presents an XML data model for the Omaha System taxonomy, which is one of the most important international nomenclatures used in the home healthcare nursing context. This data model represents the formal definition of EHRs specifically developed for nursing practice. Furthermore, the paper describes a Java application prototype that is able to manage such documents, shows how such documents can be transformed into readable web pages, and reports several case studies, one currently managed by the home care service of a Health Center in Central Italy.

  5. NeXML: Rich, Extensible, and Verifiable Representation of Comparative Data and Metadata

    PubMed Central

    Vos, Rutger A.; Balhoff, James P.; Caravas, Jason A.; Holder, Mark T.; Lapp, Hilmar; Maddison, Wayne P.; Midford, Peter E.; Priyam, Anurag; Sukumaran, Jeet; Xia, Xuhua; Stoltzfus, Arlin

    2012-01-01

    Abstract In scientific research, integration and synthesis require a common understanding of where data come from, how much they can be trusted, and what they may be used for. To make such an understanding computer-accessible requires standards for exchanging richly annotated data. The challenges of conveying reusable data are particularly acute in regard to evolutionary comparative analysis, which comprises an ever-expanding list of data types, methods, research aims, and subdisciplines. To facilitate interoperability in evolutionary comparative analysis, we present NeXML, an XML standard (inspired by the current standard, NEXUS) that supports exchange of richly annotated comparative data. NeXML defines syntax for operational taxonomic units, character-state matrices, and phylogenetic trees and networks. Documents can be validated unambiguously. Importantly, any data element can be annotated, to an arbitrary degree of richness, using a system that is both flexible and rigorous. We describe how the use of NeXML by the TreeBASE and Phenoscape projects satisfies user needs that cannot be satisfied with other available file formats. By relying on XML Schema Definition, the design of NeXML facilitates the development and deployment of software for processing, transforming, and querying documents. The adoption of NeXML for practical use is facilitated by the availability of (1) an online manual with code samples and a reference to all defined elements and attributes, (2) programming toolkits in most of the languages used commonly in evolutionary informatics, and (3) input–output support in several widely used software applications. An active, open, community-based development process enables future revision and expansion of NeXML. PMID:22357728

  6. Modeling the Time Course of Feature Perception and Feature Information Retrieval

    ERIC Educational Resources Information Center

    Kent, Christopher; Lamberts, Koen

    2006-01-01

    Three experiments investigated whether retrieval of information about different dimensions of a visual object varies as a function of the perceptual properties of those dimensions. The experiments involved two perception-based matching tasks and two retrieval-based matching tasks. A signal-to-respond methodology was used in all tasks. A stochastic…

  7. Natural language information retrieval in digital libraries

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Strzalkowski, T.; Perez-Carballo, J.; Marinescu, M.

    In this paper we report on some recent developments in the joint NYU and GE natural language information retrieval system. The main characteristic of this system is the use of advanced natural language processing to enhance the effectiveness of term-based document retrieval. The system is designed around a traditional statistical backbone consisting of the indexer module, which builds inverted index files from pre-processed documents, and a retrieval engine which searches and ranks the documents in response to user queries. Natural language processing is used to (1) preprocess the documents in order to extract content-carrying terms, (2) discover inter-term dependencies and build a conceptual hierarchy specific to the database domain, and (3) process users' natural language requests into effective search queries. This system has been used in NIST-sponsored Text Retrieval Conferences (TREC), where we worked with approximately 3.3 GBytes of text articles including material from the Wall Street Journal, the Associated Press newswire, the Federal Register, Ziff Communications's Computer Library, Department of Energy abstracts, U.S. Patents and the San Jose Mercury News, totaling more than 500 million words of English. The system has been designed to facilitate its scalability to deal with ever-increasing amounts of data. In particular, a randomized index-splitting mechanism has been installed which allows the system to create a number of smaller indexes that can be independently and efficiently searched.

  8. On-Demand Associative Cross-Language Information Retrieval

    NASA Astrophysics Data System (ADS)

    Geraldo, André Pinto; Moreira, Viviane P.; Gonçalves, Marcos A.

    This paper proposes the use of algorithms for mining association rules as an approach for Cross-Language Information Retrieval. These algorithms have been widely used to analyse market basket data. The idea is to map the problem of finding associations between sales items to the problem of finding term translations over a parallel corpus. The proposal was validated by means of experiments using queries in two distinct languages: Portuguese and Finnish to retrieve documents in English. The results show that the performance of our proposed approach is comparable to the performance of the monolingual baseline and to query translation via machine translation, even though these systems employ more complex Natural Language Processing techniques. The combination between machine translation and our approach yielded the best results, even outperforming the monolingual baseline.
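
    A toy sketch of the idea (not the authors' implementation): each aligned sentence pair is treated as a market-basket transaction, and candidate translations of a query term are ranked by the confidence of the rule source_term -> target_term.

      from collections import Counter

      # Toy "parallel corpus": (source-language tokens, target-language tokens).
      PARALLEL = [
          ({"casa", "grande"}, {"big", "house"}),
          ({"casa", "azul"}, {"blue", "house"}),
          ({"carro", "azul"}, {"blue", "car"}),
      ]

      def translations(term, corpus, min_confidence=0.6):
          """Rank target terms by confidence of the rule term -> target."""
          support = sum(1 for src, _ in corpus if term in src)
          cooccur = Counter(t for src, tgt in corpus if term in src for t in tgt)
          return {t: c / support for t, c in cooccur.items() if c / support >= min_confidence}

      print(translations("casa", PARALLEL))   # {'house': 1.0}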

  9. The Dilemma of the Subjective in Information Organisation and Retrieval.

    ERIC Educational Resources Information Center

    Neill, S. D.

    1987-01-01

    This review of the literature discusses theories of information with emphasis on the views of Dervin and Popper on subjectivity and objectivity as they relate to information use. Classification and indexing as applied in information retrieval systems and libraries are also discussed, and 78 references are provided. (LRW)

  10. Information storage medium and method of recording and retrieving information thereon

    DOEpatents

    Marchant, D. D.; Begej, Stefan

    1986-01-01

    Information storage medium comprising a semiconductor doped with first and second impurities or dopants. Preferably, one of the impurities is introduced by ion implantation. Conductive electrodes are photolithographically formed on the surface of the medium. Information is recorded on the medium by selectively applying a focused laser beam to discrete regions of the medium surface so as to anneal discrete regions of the medium containing lattice defects introduced by the ion-implanted impurity. Information is retrieved from the storage medium by applying a focused laser beam to annealed and non-annealed regions so as to produce a photovoltaic signal at each region.

  11. DESIGN PRINCIPLES FOR AN ON-LINE INFORMATION RETRIEVAL SYSTEM. TECHNICAL REPORT.

    ERIC Educational Resources Information Center

    LOWE, THOMAS C.

    AREAS INVESTIGATED INCLUDE SLOW MEMORY DATA STORAGE, THE PROBLEM OF DECODING FROM AN INDEX TO A SLOW MEMORY ADDRESS, THE STRUCTURE OF DATA LISTS AND DATA LIST OPERATORS, COMMUNICATIONS BETWEEN THE HUMAN USER AND THE SYSTEM, PROCESSING OF RETRIEVAL REQUESTS, AND THE USER'S CONTROL OVER THE RETURN OF INFORMATION RETRIEVED. LINEAR, LINKED AND…

  12. Finding Information on the World Wide Web: The Retrieval Effectiveness of Search Engines.

    ERIC Educational Resources Information Center

    Pathak, Praveen; Gordon, Michael

    1999-01-01

    Describes a study that examined the effectiveness of eight search engines for the World Wide Web. Calculated traditional information-retrieval measures of recall and precision at varying numbers of retrieved documents to use as the bases for statistical comparisons of retrieval effectiveness. Also examined the overlap between search engines.…

  13. Kid's Catalog: An Information Retrieval System for Children.

    ERIC Educational Resources Information Center

    Busey, Paula; Doerr, Tom

    1993-01-01

    Describes an online public access catalog for children, called the Kid's Catalog. Design objectives include eliminating the barriers to information retrieval outlined in the research literature; being fun, interactive, and respectful of children's intelligence and creativity; motivating children with an expansive range of subjects and search…

  14. The Development and Implementation of a Management Information System for an Education Information Retrieval Center.

    ERIC Educational Resources Information Center

    Jegi, John

    A management information system was developed for the Contra Costa County, California, Department of Education's Educational Information Retrieval Center. The system was designed to determine needed operational changes, to measure the effects of these changes, to monitor the center's operation, and to obtain information for dissemination. Data…

  15. An adaptable XML based approach for scientific data management and integration

    NASA Astrophysics Data System (ADS)

    Wang, Fusheng; Thiel, Florian; Furrer, Daniel; Vergara-Niedermayr, Cristobal; Qin, Chen; Hackenberg, Georg; Bourgue, Pierre-Emmanuel; Kaltschmidt, David; Wang, Mo

    2008-03-01

    Increased complexity of scientific research poses new challenges to scientific data management. Meanwhile, scientific collaboration is becoming increasingly important and relies on integrating and sharing data from distributed institutions. We develop SciPort, a Web-based platform for scientific data management and integration built on a central-server-based distributed architecture, where researchers can easily collect, publish, and share their complex scientific data across multiple institutions. SciPort provides an XML-based general approach to modeling complex scientific data by representing them as XML documents. The documents capture not only hierarchically structured data, but also images and raw data through references. In addition, SciPort provides an XML-based hierarchical organization of the overall data space to make quick browsing convenient. To provide generality, schemas and hierarchies are customizable with XML-based definitions, so the system can be quickly adapted to different applications. While each institution can manage documents on a Local SciPort Server independently, selected documents can be published to a Central Server to form a global view of the shared data across all sites. By storing documents in a native XML database, SciPort provides high schema extensibility and supports comprehensive queries through XQuery. By providing a unified and effective means for data modeling, data access, and customization with XML, SciPort offers a flexible and powerful platform for sharing scientific data among research communities, and has been successfully used in both biomedical research and clinical trials.

  16. An Adaptable XML Based Approach for Scientific Data Management and Integration.

    PubMed

    Wang, Fusheng; Thiel, Florian; Furrer, Daniel; Vergara-Niedermayr, Cristobal; Qin, Chen; Hackenberg, Georg; Bourgue, Pierre-Emmanuel; Kaltschmidt, David; Wang, Mo

    2008-02-20

    Increased complexity of scientific research poses new challenges to scientific data management. Meanwhile, scientific collaboration is becoming increasingly important and relies on integrating and sharing data from distributed institutions. We develop SciPort, a Web-based platform for scientific data management and integration built on a central-server-based distributed architecture, where researchers can easily collect, publish, and share their complex scientific data across multiple institutions. SciPort provides an XML-based general approach to modeling complex scientific data by representing them as XML documents. The documents capture not only hierarchically structured data, but also images and raw data through references. In addition, SciPort provides an XML-based hierarchical organization of the overall data space to make quick browsing convenient. To provide generality, schemas and hierarchies are customizable with XML-based definitions, so the system can be quickly adapted to different applications. While each institution can manage documents on a Local SciPort Server independently, selected documents can be published to a Central Server to form a global view of the shared data across all sites. By storing documents in a native XML database, SciPort provides high schema extensibility and supports comprehensive queries through XQuery. By providing a unified and effective means for data modeling, data access, and customization with XML, SciPort offers a flexible and powerful platform for sharing scientific data among research communities, and has been successfully used in both biomedical research and clinical trials.
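
    The two SciPort records above both mention querying a native XML database with XQuery. The fragment below sketches the same kind of selection with XPath and Python's standard library over an invented document structure; it is only an analogy for what an XQuery expression against SciPort's actual schema would do.

      import xml.etree.ElementTree as ET

      # Invented structure, not SciPort's real schema.
      doc = ET.fromstring("""
      <experiment site="SiteA">
        <sample id="S1"><assay type="MRI" status="complete"/></sample>
        <sample id="S2"><assay type="CT" status="pending"/></sample>
      </experiment>""")

      # XQuery analogue: for $s in //sample[assay/@status='complete'] return data($s/@id)
      for sample in doc.findall(".//sample"):
          assay = sample.find("assay")
          if assay is not None and assay.get("status") == "complete":
              print(sample.get("id"))          # -> S1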

  17. An information-processing model of three cortical regions: evidence in episodic memory retrieval.

    PubMed

    Sohn, Myeong-Ho; Goode, Adam; Stenger, V Andrew; Jung, Kwan-Jin; Carter, Cameron S; Anderson, John R

    2005-03-01

    ACT-R (Anderson, J.R., et al., 2003. An information-processing model of the BOLD response in symbol manipulation tasks. Psychon. Bull. Rev. 10, 241-261) relates the inferior dorso-lateral prefrontal cortex to a retrieval buffer that holds information retrieved from memory and the posterior parietal cortex to an imaginal buffer that holds problem representations. Because the number of changes in a problem representation is not necessarily correlated with retrieval difficulties, it is possible to dissociate prefrontal-parietal activations. In two fMRI experiments, we examined this dissociation using the fan effect paradigm. Experiment 1 compared a recognition task, in which representation requirement remains the same regardless of retrieval difficulty, with a recall task, in which both representation and retrieval loads increase with retrieval difficulty. In the recognition task, the prefrontal activation revealed a fan effect but not the parietal activation. In the recall task, both regions revealed fan effects. In Experiment 2, we compared visually presented stimuli and aurally presented stimuli using the recognition task. While only the prefrontal region revealed the fan effect, the activation patterns in the prefrontal and the parietal region did not differ by stimulus presentation modality. In general, these results provide support for the prefrontal-parietal dissociation in terms of retrieval and representation and the modality-independent nature of the information processed by these regions. Using ACT-R, we also provide computational models that explain patterns of fMRI responses in these two areas during recognition and recall.

  18. An overview of selected information storage and retrieval issues in computerized document processing

    NASA Technical Reports Server (NTRS)

    Dominick, Wayne D. (Editor); Ihebuzor, Valentine U.

    1984-01-01

    The rapid development of computerized information storage and retrieval techniques has introduced the possibility of extending the word processing concept to document processing. A major advantage of computerized document processing is the relief of the tedious task of manual editing and composition usually encountered by traditional publishers through the immense speed and storage capacity of computers. Furthermore, computerized document processing provides an author with centralized control, the lack of which is a handicap of the traditional publishing operation. A survey of some computerized document processing techniques is presented with emphasis on related information storage and retrieval issues. String matching algorithms are considered central to document information storage and retrieval and are also discussed.

  19. LOGISTIC MANAGEMENT INFORMATION SYSTEM - MANUAL DATA STORAGE AND RETRIEVAL SYSTEM.

    DTIC Science & Technology

    Logistics Management Information System. The procedures are applicable to manual storage and retrieval of all data used in the Logistics Management Information System (LMIS) and include the following: (1) Action Officer data source file. (2) Action Officer presentation format file. (3) LMI Coordination

  20. Strategic Help in User Interfaces for Information Retrieval.

    ERIC Educational Resources Information Center

    Brajnik, Giorgio; Mizzaro, Stefano; Tasso, Carlo; Venuti, Fabio

    2002-01-01

    Discussion of search strategy in information retrieval by end users focuses on the role played by strategic reasoning and design principles for user interfaces. Highlights include strategic help based on collaborative coaching; a conceptual model for strategic help; and a prototype knowledge-based system named FIRE. (Author/LRW)

  1. Cross-Language Information Retrieval: An Analysis of Errors.

    ERIC Educational Resources Information Center

    Ruiz, Miguel E.; Srinivasan, Padmini

    1998-01-01

    Investigates an automatic method for Cross Language Information Retrieval (CLIR) that utilizes the multilingual Unified Medical Language System (UMLS) Metathesaurus to translate Spanish natural-language queries into English. Results indicate that for Spanish, the UMLS Metathesaurus-based CLIR method is at least equivalent to if not better than…

  2. Information Retrieval Diary of an Expert Technical Translator.

    ERIC Educational Resources Information Center

    Cremmins, Edward T.

    1984-01-01

    Recommends use of entries from the information retrieval diary of Ted Crump, expert technical translator at the National Institute of Health, in the construction of computer models showing how expert translators solve problems of ambiguity in language. Expert and inexpert translation systems, eponyms, abbreviations, and alphabetic solutions are…

  3. SLIMMER--A UNIX System-Based Information Retrieval System.

    ERIC Educational Resources Information Center

    Waldstein, Robert K.

    1988-01-01

    Describes an information retrieval system developed at Bell Laboratories to create and maintain a variety of different but interrelated databases, and to provide controlled access to these databases. The components discussed include the interfaces, indexing rules, display languages, response time, and updating procedures of the system. (6 notes…

  4. Interoperability, Data Control and Battlespace Visualization using XML, XSLT and X3D

    DTIC Science & Technology

    2003-09-01

    26 Rosenthal, Arnon, Seligman, Len and Costello, Roger, XML, Databases, and Interoperability, Federal Database Colloquium, AFCEA, San Diego...79 Rosenthal, Arnon, Seligman, Len and Costello, Roger, "XML, Databases, and Interoperability", Federal Database Colloquium, AFCEA, San Diego, 1999... Linda, Mastering XML, Premium Edition, SYBEX, 2001 Wooldridge, Michael, An Introduction to MultiAgent Systems, Wiley, 2002 PAPERS Abernathy, M

  5. Efficient Caption-Based Retrieval of Multimedia Information

    DTIC Science & Technology

    1993-10-09

    in the design of transportable natural language interfaces. Artificial Intelligence, 32 (1987), 173-243. [10] Jones, M. and Eisner, J. A...systems for multimedia data. They exploit captions on the data and perform natural-language processing of them and English retrieval requests. Some...content analysis of the data is also performed to obtain additional descriptive information. The key to getting this approach to work is sufficiently

  6. Retrieval of the atmospheric compounds using a spectral optical thickness information

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ioltukhovski, A.A.

    A spectral inversion technique for retrieval of atmospheric gas and aerosol contents is proposed. The technique is based on the preliminary measurement or retrieval of the spectral optical thickness. A priori information about the spectral cross sections of some of the atmospheric components makes it possible to retrieve the relative contents of these components in the atmosphere. A smooth-filtration method makes it possible to estimate the contents of atmospheric aerosols with known cross sections and to filter out other aerosols, independently of their relative contribution to the optical thickness.

  7. Episodic retrieval involves early and sustained effects of reactivating information from encoding.

    PubMed

    Johnson, Jeffrey D; Price, Mason H; Leiker, Emily K

    2015-02-01

    Several fMRI studies have shown a correspondence between the brain regions activated during encoding and retrieval, consistent with the view that memory retrieval involves hippocampally-mediated reinstatement of cortical activity. With the limited temporal resolution of fMRI, the precise timing of such reactivation is unclear, calling into question the functional significance of these effects. Whereas reactivation influencing retrieval should emerge with neural correlates of retrieval success, that signifying post-retrieval monitoring would trail retrieval. The present study employed EEG to provide a temporal landmark of retrieval success from which we could investigate the sub-trial time course of reactivation. Pattern-classification analyses revealed that early-onsetting reactivation differentiated the outcome of recognition-memory judgments and was associated with individual differences in behavioral accuracy, while reactivation was also evident in a sustained form later in the trial. The EEG findings suggest that, whereas prior fMRI findings could be interpreted as reflecting the contribution of reinstatement to retrieval success, they could also indicate the maintenance of episodic information in service of post-retrieval evaluation. Copyright © 2014 Elsevier Inc. All rights reserved.

  8. Determinants to trigger memory reconsolidation: The role of retrieval and updating information.

    PubMed

    Rodriguez-Ortiz, Carlos J; Bermúdez-Rattoni, Federico

    2017-07-01

    Long-term memories can undergo destabilization/restabilization processes, collectively called reconsolidation. However, the parameters that trigger memory reconsolidation are poorly understood and are a matter of intense investigation. Particularly, memory retrieval is widely held as requisite to initiate reconsolidation. This assumption makes sense since only relevant cues will induce reconsolidation of a specific memory. However, recent studies show that pharmacological inhibition of retrieval does not avoid memory from undergoing reconsolidation, indicating that memory reconsolidation occurs through a process that can be dissociated from retrieval. We propose that retrieval is not a unitary process but has two dissociable components; one leading to the expression of memory and the other to reconsolidation, referred herein as executer and integrator respectively. The executer would lead to the behavioral expression of the memory. This component would be the one disrupted on the studies that show reconsolidation independence from retrieval. The integrator would deal with reconsolidation. This component of retrieval would lead to long-term memory destabilization when specific conditions are met. We think that an important number of reports are consistent with the hypothesis that reconsolidation is only initiated when updating information is acquired. We suggest that the integrator would initiate reconsolidation to integrate updating information into long-term memory. Copyright © 2016 Elsevier Inc. All rights reserved.

  9. Episodic Memory Retrieval Functionally Relies on Very Rapid Reactivation of Sensory Information.

    PubMed

    Waldhauser, Gerd T; Braun, Verena; Hanslmayr, Simon

    2016-01-06

    Episodic memory retrieval is assumed to rely on the rapid reactivation of sensory information that was present during encoding, a process termed "ecphory." We investigated the functional relevance of this scarcely understood process in two experiments in human participants. We presented stimuli to the left or right of fixation at encoding, followed by an episodic memory test with centrally presented retrieval cues. This allowed us to track the reactivation of lateralized sensory memory traces during retrieval. Successful episodic retrieval led to a very early (∼100-200 ms) reactivation of lateralized alpha/beta (10-25 Hz) electroencephalographic (EEG) power decreases in the visual cortex contralateral to the visual field at encoding. Applying rhythmic transcranial magnetic stimulation to interfere with early retrieval processing in the visual cortex led to decreased episodic memory performance specifically for items encoded in the visual field contralateral to the site of stimulation. These results demonstrate, for the first time, that episodic memory functionally relies on very rapid reactivation of sensory information. Remembering personal experiences requires a "mental time travel" to revisit sensory information perceived in the past. This process is typically described as a controlled, relatively slow process. However, by using electroencephalography to measure neural activity with a high time resolution, we show that such episodic retrieval entails a very rapid reactivation of sensory brain areas. Using transcranial magnetic stimulation to alter brain function during retrieval revealed that this early sensory reactivation is causally relevant for conscious remembering. These results give first neural evidence for a functional, preconscious component of episodic remembering. This provides new insight into the nature of human memory and may help in the understanding of psychiatric conditions that involve the automatic intrusion of unwanted memories.

  10. Structuring Legacy Pathology Reports by openEHR Archetypes to Enable Semantic Querying.

    PubMed

    Kropf, Stefan; Krücken, Peter; Mueller, Wolf; Denecke, Kerstin

    2017-05-18

    Clinical information is often stored as free text, e.g. in discharge summaries or pathology reports. These documents are semi-structured using section headers, numbered lists, items and classification strings. However, it is still challenging to retrieve relevant documents since keyword searches applied on complete unstructured documents result in many false positive retrieval results. We are concentrating on the processing of pathology reports as an example for unstructured clinical documents. The objective is to transform reports semi-automatically into an information structure that enables an improved access and retrieval of relevant data. The data is expected to be stored in a standardized, structured way to make it accessible for queries that are applied to specific sections of a document (section-sensitive queries) and for information reuse. Our processing pipeline comprises information modelling, section boundary detection and section-sensitive queries. For enabling a focused search in unstructured data, documents are automatically structured and transformed into a patient information model specified through openEHR archetypes. The resulting XML-based pathology electronic health records (PEHRs) are queried by XQuery and visualized by XSLT in HTML. Pathology reports (PRs) can be reliably structured into sections by a keyword-based approach. The information modelling using openEHR allows saving time in the modelling process since many archetypes can be reused. The resulting standardized, structured PEHRs allow accessing relevant data by retrieving data matching user queries. Mapping unstructured reports into a standardized information model is a practical solution for a better access to data. Archetype-based XML enables section-sensitive retrieval and visualisation by well-established XML techniques. Focussing the retrieval to particular sections has the potential of saving retrieval time and improving the accuracy of the retrieval.
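
    A minimal sketch of the keyword-based section splitting and section-sensitive querying described above (the header names and report text are invented examples; the actual pipeline uses openEHR archetypes, XQuery, and XSLT):

      import re

      REPORT = """MACROSCOPY: One tan tissue fragment, 1.2 cm.
      MICROSCOPY: Moderately differentiated adenocarcinoma.
      DIAGNOSIS: Adenocarcinoma of the sigmoid colon."""

      HEADERS = ("MACROSCOPY", "MICROSCOPY", "DIAGNOSIS")

      def split_sections(text):
          """Split a semi-structured report into sections at known header keywords."""
          sections, current = {}, None
          for line in text.splitlines():
              match = re.match(r"\s*({}):\s*".format("|".join(HEADERS)), line)
              if match:
                  current = match.group(1)
                  sections[current] = line[match.end():]
              elif current:
                  sections[current] += " " + line.strip()
          return sections

      sections = split_sections(REPORT)
      # Section-sensitive query: look for the term only inside the DIAGNOSIS section.
      print("adenocarcinoma" in sections["DIAGNOSIS"].lower())   # True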

  11. Hybrid ontology for semantic information retrieval model using keyword matching indexing system.

    PubMed

    Uthayan, K R; Mala, G S Anandha

    2015-01-01

    Ontology is the process of growth and elucidation of the concepts of an information domain that are common to a group of users. Incorporating ontology into information retrieval is a natural way to improve the retrieval of the relevant information users require. Matching keywords against a historical or domain-specific information store is central to current approaches for finding the best match to a given input query. This research presents an improved querying mechanism for information retrieval which integrates ontology queries with keyword search. The ontology-based query is converted into a first-order predicate logic representation which is used for routing the query to the appropriate servers. Matching algorithms represent an active area of research in computer science and artificial intelligence. In text matching, it is more reliable to study the semantic model and the query under the conditions of semantic matching. This research develops semantic matching between input queries and information in the ontology field. The contributed algorithm is a hybrid method based on matching instances extracted from the queries and the information field. The queries and the information domain are focused on semantic matching, to discover the best match and to improve the retrieval process. In conclusion, the hybrid ontology in the semantic web is sufficient to retrieve the documents when compared to standard ontology.

  12. Hybrid Ontology for Semantic Information Retrieval Model Using Keyword Matching Indexing System

    PubMed Central

    Uthayan, K. R.; Anandha Mala, G. S.

    2015-01-01

    Ontology is the process of growth and elucidation of the concepts of an information domain that are common to a group of users. Incorporating ontology into information retrieval is a natural way to improve the retrieval of the relevant information users require. Matching keywords against a historical or domain-specific information store is central to current approaches for finding the best match to a given input query. This research presents an improved querying mechanism for information retrieval which integrates ontology queries with keyword search. The ontology-based query is converted into a first-order predicate logic representation which is used for routing the query to the appropriate servers. Matching algorithms represent an active area of research in computer science and artificial intelligence. In text matching, it is more reliable to study the semantic model and the query under the conditions of semantic matching. This research develops semantic matching between input queries and information in the ontology field. The contributed algorithm is a hybrid method based on matching instances extracted from the queries and the information field. The queries and the information domain are focused on semantic matching, to discover the best match and to improve the retrieval process. In conclusion, the hybrid ontology in the semantic web is sufficient to retrieve the documents when compared to standard ontology. PMID:25922851

  13. Optically secured information retrieval using two authenticated phase-only masks.

    PubMed

    Wang, Xiaogang; Chen, Wen; Mei, Shengtao; Chen, Xudong

    2015-10-23

    We propose an algorithm for jointly designing two phase-only masks (POMs) that allow for the encryption and noise-free retrieval of triple images. The images required for optical retrieval are first stored in quick-response (QR) codes for noise-free retrieval and flexible readout. Two sparse POMs are respectively calculated from two different images used as references for authentication, based on a modified Gerchberg-Saxton algorithm (GSA) and pixel extraction, and are then used as support constraints in a modified double-phase retrieval algorithm (MPRA), together with the above-mentioned QR codes. No visible information about the target images or the reference images can be obtained from either of these authenticated POMs. This approach allows users to authenticate the two POMs used for image reconstruction without visual observation of the reference images. It also allows users convenient access and readout with mobile devices.

  14. Optically secured information retrieval using two authenticated phase-only masks

    PubMed Central

    Wang, Xiaogang; Chen, Wen; Mei, Shengtao; Chen, Xudong

    2015-01-01

    We propose an algorithm for jointly designing two phase-only masks (POMs) that allow for the encryption and noise-free retrieval of triple images. The images required for optical retrieval are first stored in quick-response (QR) codes for noise-free retrieval and flexible readout. Two sparse POMs are respectively calculated from two different images used as references for authentication, based on a modified Gerchberg-Saxton algorithm (GSA) and pixel extraction, and are then used as support constraints in a modified double-phase retrieval algorithm (MPRA), together with the above-mentioned QR codes. No visible information about the target images or the reference images can be obtained from either of these authenticated POMs. This approach allows users to authenticate the two POMs used for image reconstruction without visual observation of the reference images. It also allows users convenient access and readout with mobile devices. PMID:26494213

  15. Optically secured information retrieval using two authenticated phase-only masks

    NASA Astrophysics Data System (ADS)

    Wang, Xiaogang; Chen, Wen; Mei, Shengtao; Chen, Xudong

    2015-10-01

    We propose an algorithm for jointly designing two phase-only masks (POMs) that allow for the encryption and noise-free retrieval of triple images. The images required for optical retrieval are first stored in quick-response (QR) codes for noise-free retrieval and flexible readout. Two sparse POMs are respectively calculated from two different images used as references for authentication, based on a modified Gerchberg-Saxton algorithm (GSA) and pixel extraction, and are then used as support constraints in a modified double-phase retrieval algorithm (MPRA), together with the above-mentioned QR codes. No visible information about the target images or the reference images can be obtained from either of these authenticated POMs. This approach allows users to authenticate the two POMs used for image reconstruction without visual observation of the reference images. It also allows users convenient access and readout with mobile devices.
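
    The three records above all rest on Gerchberg-Saxton-type phase retrieval. The numpy fragment below shows a textbook Gerchberg-Saxton iteration for a phase-only mask, only to illustrate the alternating amplitude constraints; it is not the authors' modified GSA/MPRA with QR codes and authentication masks.

      import numpy as np

      rng = np.random.default_rng(0)
      target_amp = np.abs(np.fft.fft2(rng.random((64, 64))))   # prescribed Fourier amplitude
      phase = np.exp(1j * 2 * np.pi * rng.random((64, 64)))    # initial phase-only field

      for _ in range(50):
          spectrum = np.fft.fft2(phase)
          spectrum = target_amp * np.exp(1j * np.angle(spectrum))  # impose Fourier amplitude
          field = np.fft.ifft2(spectrum)
          phase = np.exp(1j * np.angle(field))                     # keep phase only (POM)

      error = np.mean((np.abs(np.fft.fft2(phase)) - target_amp) ** 2)
      print(f"mean squared amplitude error after 50 iterations: {error:.3e}")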

  16. Data compression and information retrieval via symbolization

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tang, X.Z.; Tracy, E.R.

    Converting a continuous signal into a multisymbol stream is a simple method of data compression which preserves much of the dynamical information present in the original signal. The retrieval of selected types of information from symbolic data involves binary operations and is therefore optimal for digital computers. For example, correlation time scales can be easily recovered, even at high noise levels, by varying the time delay for symbolization. Also, the presence of periodicity in the signal can be reliably detected even if it is weak and masked by a dominant chaotic/stochastic background. © 1998 American Institute of Physics.
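
    An illustrative sketch of the technique (toy signal, one-bit symbolization): threshold a noisy sine wave at its median and recover the period from the symbol stream with a simple agreement score at each candidate lag.

      import math, random

      random.seed(1)
      N = 2000
      signal = [math.sin(2 * math.pi * t / 50) + random.gauss(0, 1.0) for t in range(N)]
      median = sorted(signal)[N // 2]
      symbols = [1 if x > median else 0 for x in signal]      # one bit per sample

      def agreement(lag):
          """Fraction of symbols that repeat after the given lag."""
          matches = sum(a == b for a, b in zip(symbols, symbols[lag:]))
          return matches / (N - lag)

      best = max(range(2, 200), key=agreement)
      print(best, round(agreement(best), 3))   # winning lag is a multiple of the true period (50)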

  17. Combining Passive Microwave Sounders with CYGNSS information for improved retrievals: Observations during Hurricane Harvey

    NASA Astrophysics Data System (ADS)

    Schreier, M. M.

    2017-12-01

    The launch of CYGNSS (Cyclone Global Navigation Satellite System) has added an interesting component to satellite observations: it can provide wind speeds in the tropics with a high repetition rate. Passive microwave sounders overpassing the same region can benefit from this information when retrieving temperature or water profiles: uncertainty about wind speed has a strong impact on emissivity and reflectivity calculations with respect to surface temperature, and therefore on the uncertainty of the retrieved temperature and water content, especially under extreme weather conditions. Adding CYGNSS information to the retrieval can help reduce errors and provide a significantly better sounder retrieval. Based on observations during Hurricane Harvey, we show the impact of CYGNSS data on retrievals from passive microwave sensors. We will show examples of the impact on retrievals from polar-orbiting instruments, such as the Advanced Technology Microwave Sounder (ATMS) and AMSU-A/B on NOAA-18 and 19. In addition, we will show the impact on retrievals from HAMSR (High Altitude MMIC Sounding Radiometer), which flew on the Global Hawk during the EPOCH campaign. We will compare the results with other observations and estimate the impact of additional CYGNSS information on the microwave retrieval, especially on error and uncertainty reduction. We think that a synergistic use of these different data sources could significantly help to produce better products for forecast assimilation.

  18. Retrieval monitoring is influenced by information value: the interplay between importance and confidence on false memory.

    PubMed

    McDonough, Ian M; Bui, Dung C; Friedman, Michael C; Castel, Alan D

    2015-10-01

    The perceived value of information can influence one's motivation to successfully remember that information. This study investigated how information value can affect memory search and evaluation processes (i.e., retrieval monitoring). In Experiment 1, participants studied unrelated words associated with low, medium, or high values. Subsequent memory tests required participants to selectively monitor retrieval for different values. False memory effects were smaller when searching memory for high-value than low-value words, suggesting that people more effectively monitored more important information. In Experiment 2, participants studied semantically-related words, and the need for retrieval monitoring was reduced at test by using inclusion instructions (i.e., endorsement of any word related to the studied words) compared with standard instructions. Inclusion instructions led to increases in false recognition for low-value, but not for high-value words, suggesting that under standard-instruction conditions retrieval monitoring was less likely to occur for important information. Experiment 3 showed that words retrieved with lower confidence were associated with more effective retrieval monitoring, suggesting that the quality of the retrieved memory influenced the degree and effectiveness of monitoring processes. Ironically, unless encouraged to do so, people were less likely to carefully monitor important information, even though people want to remember important memories most accurately. Copyright © 2015 Elsevier B.V. All rights reserved.

  19. "BRAIN": Baruch Retrieval of Automated Information for Negotiations.

    ERIC Educational Resources Information Center

    Levenstein, Aaron, Ed.

    1981-01-01

    A data processing program that can be used as a research and collective bargaining aid for colleges is briefly described and the fields of the system are outlined. The system, known as BRAIN (Baruch Retrieval of Automated Information for Negotiations), is designed primarily as an instrument for quantitative and qualitative analysis. BRAIN consists…

  20. Retrieval practice enhances the ability to evaluate complex physiology information.

    PubMed

    Dobson, John; Linderholm, Tracy; Perez, Jose

    2018-05-01

    Many investigations have shown that retrieval practice enhances the recall of different types of information, including both medical and physiological, but the effects of the strategy on higher-order thinking, such as evaluation, are less clear. The primary aim of this study was to compare how effectively retrieval practice and repeated studying (i.e. reading) strategies facilitated the evaluation of two research articles that advocated dissimilar conclusions. A secondary aim was to determine if that comparison was affected by using those same strategies to first learn important contextual information about the articles. Participants were randomly assigned to learn three texts that provided background information about the research articles either by studying them four consecutive times (Text-S) or by studying and then retrieving them two consecutive times (Text-R). Half of both the Text-S and Text-R groups were then randomly assigned to learn two physiology research articles by studying them four consecutive times (Article-S) and the other half learned them by studying and then retrieving them two consecutive times (Article-R). Participants then completed two assessments: the first tested their ability to critique the research articles and the second tested their recall of the background texts. On the article critique assessment, the Article-R groups' mean scores of 33.7 ± 4.7% and 35.4 ± 4.5% (Text-R then Article-R group and Text-S then Article-R group, respectively) were both significantly (p < 0.05) higher than the two Article-S mean scores of 19.5 ± 4.4% and 21.7 ± 2.9% (Text-S then Article-S group and Text-R then Article-S group, respectively). There was no difference between the two Article-R groups on the article critique assessment, indicating those scores weren't affected by the different contextual learning strategies. Retrieval practice promoted superior critical evaluation of the research articles, and the results also indicated the

  1. Scalability of Findability: Decentralized Search and Retrieval in Large Information Networks

    ERIC Educational Resources Information Center

    Ke, Weimao

    2010-01-01

    Amid the rapid growth of information today is the increasing challenge for people to survive and navigate its magnitude. Dynamics and heterogeneity of large information spaces such as the Web challenge information retrieval in these environments. Collection of information in advance and centralization of IR operations are hardly possible because…

  2. XML in an Adaptive Framework for Instrument Control

    NASA Technical Reports Server (NTRS)

    Ames, Troy J.

    2004-01-01

    NASA Goddard Space Flight Center is developing an extensible framework for instrument command and control, known as Instrument Remote Control (IRC), that combines the platform independent processing capabilities of Java with the power of the Extensible Markup Language (XML). A key aspect of the architecture is software that is driven by an instrument description, written using the Instrument Markup Language (IML). IML is an XML dialect used to describe interfaces to control and monitor the instrument, command sets and command formats, data streams, communication mechanisms, and data processing algorithms.

  3. Estimating Missing Features to Improve Multimedia Information Retrieval

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bagherjeiran, A; Love, N S; Kamath, C

    Retrieval in a multimedia database usually involves combining information from different modalities of data, such as text and images. However, all modalities of the data may not be available to form the query. The retrieval results from such a partial query are often less than satisfactory. In this paper, we present an approach to complete a partial query by estimating the missing features in the query. Our experiments with a database of images and their associated captions show that, with an initial text-only query, our completion method has similar performance to a full query with both image and text features. In addition, when we use relevance feedback, our approach outperforms the results obtained using a full query.

  4. An introduction to information retrieval: applications in genomics

    PubMed Central

    Nadkarni, P M

    2011-01-01

    Information retrieval (IR) is the field of computer science that deals with the processing of documents containing free text, so that they can be rapidly retrieved based on keywords specified in a user’s query. IR technology is the basis of Web-based search engines, and plays a vital role in biomedical research, because it is the foundation of software that supports literature search. Documents can be indexed by both the words they contain, as well as the concepts that can be matched to domain-specific thesauri; concept matching, however, poses several practical difficulties that make it unsuitable for use by itself. This article provides an introduction to IR and summarizes various applications of IR and related technologies to genomics. PMID:12049181
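
    A compact sketch of the keyword indexing that the article describes as the basis of IR systems: build an inverted index mapping each term to the documents that contain it, then answer a query by intersecting posting lists.

      from collections import defaultdict

      DOCS = {
          1: "gene expression in yeast",
          2: "yeast two hybrid screening",
          3: "expression profiling of cancer genes",
      }

      index = defaultdict(set)
      for doc_id, text in DOCS.items():
          for term in text.lower().split():
              index[term].add(doc_id)                 # posting list per term

      def search(query):
          postings = [index[t] for t in query.lower().split()]
          return set.intersection(*postings) if postings else set()

      print(search("yeast expression"))               # {1}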

  5. Analyzing Document Retrievability in Patent Retrieval Settings

    NASA Astrophysics Data System (ADS)

    Bashir, Shariq; Rauber, Andreas

    Most information retrieval settings, such as web search, are typically precision-oriented, i.e. they focus on retrieving a small number of highly relevant documents. However, in specific domains, such as patent retrieval or law, recall becomes more relevant than precision: in these cases the goal is to find all relevant documents, requiring algorithms to be tuned more towards recall at the cost of precision. This raises important questions with respect to retrievability and search engine bias: depending on how the similarity between a query and documents is measured, certain documents may be more or less retrievable in certain systems, up to some documents not being retrievable at all within common threshold settings. Biases may be oriented towards popularity of documents (increasing weight of references), towards length of documents, favour the use of rare or common words; rely on structural information such as metadata or headings, etc. Existing accessibility measurement techniques are limited as they measure retrievability with respect to all possible queries. In this paper, we improve accessibility measurement by considering sets of relevant and irrelevant queries for each document. This simulates how recall oriented users create their queries when searching for relevant information. We evaluate retrievability scores using a corpus of patents from US Patent and Trademark Office.
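
    A simplified sketch of a retrievability score in the spirit of the measurement described above: every document ranked within a rank cutoff for some query in the query set earns credit, and a document's retrievability is its accumulated credit. (The paper's refinement restricts the query set per document; this sketch uses the basic form.)

      from collections import Counter

      def retrievability(rankings, cutoff=2):
          """rankings: one ranked list of document ids per simulated query."""
          score = Counter()
          for ranking in rankings:
              for doc_id in ranking[:cutoff]:
                  score[doc_id] += 1                  # credit for being findable
          return score

      runs = [["d1", "d2", "d3"], ["d2", "d1", "d4"], ["d2", "d4", "d1"]]
      print(retrievability(runs))   # d2 is most retrievable; d3 never appears within the cutoff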

  6. Third Annual Symposium on Document Analysis and Information Retrieval

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Not Available

    This document presents papers of the Third Annual Symposium on Document Analysis and Information Retrieval at the Information Science Research Institute at the University of Nevada, Las Vegas (UNLV/ISRI). Of the 60 papers submitted, 25 were accepted for oral presentation and 9 as poster papers. Both oral presentations and poster papers are included in these Proceedings. The individual papers have been cataloged separately.

  7. Experiments and Analysis on a Computer Interface to an Information-Retrieval Network.

    ERIC Educational Resources Information Center

    Marcus, Richard S.; Reintjes, J. Francis

    A primary goal of this project was to develop an interface that would provide direct access for inexperienced users to existing online bibliographic information retrieval networks. The experiment tested the concept of a virtual-system mode of access to a network of heterogeneous interactive retrieval systems and databases. An experimental…

  8. Multimodal retrieval of autobiographical memories: sensory information contributes differently to the recollection of events.

    PubMed

    Willander, Johan; Sikström, Sverker; Karlsson, Kristina

    2015-01-01

    Previous studies on autobiographical memory have focused on unimodal retrieval cues (i.e., cues pertaining to one modality). However, from an ecological perspective multimodal cues (i.e., cues pertaining to several modalities) are highly important to investigate. In the present study we investigated age distributions and experiential ratings of autobiographical memories retrieved with unimodal and multimodal cues. Sixty-two participants were randomized to one of four cue-conditions: visual, olfactory, auditory, or multimodal. The results showed that the peak of the distributions depends on the modality of the retrieval cue. The results indicated that multimodal retrieval seemed to be driven by visual and auditory information to a larger extent and to a lesser extent by olfactory information. Finally, no differences were observed in the number of retrieved memories or experiential ratings across the four cue-conditions.

  9. Combinatorial Fusion Analysis for Meta Search Information Retrieval

    NASA Astrophysics Data System (ADS)

    Hsu, D. Frank; Taksa, Isak

    Leading commercial search engines are built as single event systems. In response to a particular search query, the search engine returns a single list of ranked search results. To find more relevant results the user must frequently try several other search engines. A meta search engine was developed to enhance the process of multi-engine querying. The meta search engine queries several engines at the same time and fuses individual engine results into a single search results list. The fusion of multiple search results has been shown (mostly experimentally) to be highly effective. However, the question of why and how the fusion should be done still remains largely unanswered. In this chapter, we utilize the combinatorial fusion analysis proposed by Hsu et al. to analyze combination and fusion of multiple sources of information. A rank/score function is used in the design and analysis of our framework. The framework provides a better understanding of the fusion phenomenon in information retrieval. For example, to improve the performance of the combined multiple scoring systems, it is necessary that each of the individual scoring systems has relatively high performance and the individual scoring systems are diverse. Additionally, we illustrate various applications of the framework using two examples from the information retrieval domain.
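
    A small sketch of rank-based fusion in the spirit of the framework described above: convert each system's scores to ranks and re-rank documents by average rank, with documents missing from a system given a worst-case rank. The rank/score function machinery of combinatorial fusion analysis is richer than this.

      def to_ranks(scores):
          """Map document -> rank (1 = best) given a document -> score dict."""
          ordered = sorted(scores, key=scores.get, reverse=True)
          return {doc: rank for rank, doc in enumerate(ordered, start=1)}

      def fuse(*systems):
          ranks = [to_ranks(s) for s in systems]
          docs = set().union(*systems)
          worst = len(docs) + 1
          avg = {d: sum(r.get(d, worst) for r in ranks) / len(ranks) for d in docs}
          return sorted(docs, key=lambda d: avg[d])   # smaller average rank is better

      engine_a = {"d1": 0.9, "d2": 0.5, "d3": 0.4}
      engine_b = {"d2": 0.8, "d1": 0.7, "d4": 0.6}
      print(fuse(engine_a, engine_b))                 # d1 and d2 lead; d3 and d4 trail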

  10. Diffused holographic information storage and retrieval using photorefractive optical materials

    NASA Astrophysics Data System (ADS)

    McMillen, Deanna Kay

    Holography offers a tremendous opportunity for dense information storage, theoretically one bit per cubic wavelength of material volume, with rapid retrieval, of up to thousands of pages of information simultaneously. However, many factors prevent the theoretical storage limit from being reached, including dynamic range problems and imperfections in recording materials. This research explores new ways of moving closer to practical holographic information storage and retrieval by altering the recording materials, in this case, photorefractive crystals, and by increasing the current storage capacity while improving the information retrieved. As an experimental example of the techniques developed, the information retrieved is the correlation peak from an optical recognition architecture, but the materials and methods developed are applicable to many other holographic information storage systems. Optical correlators can potentially solve any signal or image recognition problem. Military surveillance, fingerprint identification for law enforcement or employee identification, and video games are but a few examples of applications. A major obstacle keeping optical correlators from being universally accepted is the lack of a high quality, thick (high capacity) holographic recording material that operates with red or infrared wavelengths which are available from inexpensive diode lasers. This research addresses the problems from two positions: find a better material for use with diode lasers, and reduce the requirements placed on the material while maintaining an efficient and effective system. This research found that the solutions are new dopants introduced into photorefractive lithium niobate to improve wavelength sensitivities and the use of a novel inexpensive diffuser that reduces the dynamic range and optical element quality requirements (which reduces the cost) while improving performance. A uniquely doped set of 12 lithium niobate crystals was specified and

  11. 42 CFR 433.127 - Termination of FFP for failure to provide access to claims processing and information retrieval...

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... claims processing and information retrieval systems. 433.127 Section 433.127 Public Health CENTERS FOR... PROGRAMS STATE FISCAL ADMINISTRATION Mechanized Claims Processing and Information Retrieval Systems § 433.127 Termination of FFP for failure to provide access to claims processing and information retrieval...

  12. 42 CFR 433.127 - Termination of FFP for failure to provide access to claims processing and information retrieval...

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... claims processing and information retrieval systems. 433.127 Section 433.127 Public Health CENTERS FOR... PROGRAMS STATE FISCAL ADMINISTRATION Mechanized Claims Processing and Information Retrieval Systems § 433.127 Termination of FFP for failure to provide access to claims processing and information retrieval...

  13. 42 CFR 433.127 - Termination of FFP for failure to provide access to claims processing and information retrieval...

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... claims processing and information retrieval systems. 433.127 Section 433.127 Public Health CENTERS FOR... PROGRAMS STATE FISCAL ADMINISTRATION Mechanized Claims Processing and Information Retrieval Systems § 433.127 Termination of FFP for failure to provide access to claims processing and information retrieval...

  14. 42 CFR 433.127 - Termination of FFP for failure to provide access to claims processing and information retrieval...

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... claims processing and information retrieval systems. 433.127 Section 433.127 Public Health CENTERS FOR... PROGRAMS STATE FISCAL ADMINISTRATION Mechanized Claims Processing and Information Retrieval Systems § 433.127 Termination of FFP for failure to provide access to claims processing and information retrieval...

  15. 42 CFR 433.127 - Termination of FFP for failure to provide access to claims processing and information retrieval...

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... claims processing and information retrieval systems. 433.127 Section 433.127 Public Health CENTERS FOR... PROGRAMS STATE FISCAL ADMINISTRATION Mechanized Claims Processing and Information Retrieval Systems § 433.127 Termination of FFP for failure to provide access to claims processing and information retrieval...

  16. Using the Weighted Keyword Model to Improve Information Retrieval for Answering Biomedical Questions

    PubMed Central

    Yu, Hong; Cao, Yong-gang

    2009-01-01

    Physicians ask many complex questions during the patient encounter. Information retrieval systems that can provide immediate and relevant answers to these questions can be invaluable aids to the practice of evidence-based medicine. In this study, we first automatically identify topic keywords from ad hoc clinical questions with a Conditional Random Field model that is trained over thousands of manually annotated clinical questions. We then report on a linear model that assigns query weights based on their automatically identified semantic roles: topic keywords, domain specific terms, and their synonyms. Our evaluation shows that this weighted keyword model improves information retrieval from the Text Retrieval Conference Genomics track data. PMID:21347188

  17. Using the weighted keyword model to improve information retrieval for answering biomedical questions.

    PubMed

    Yu, Hong; Cao, Yong-Gang

    2009-03-01

    Physicians ask many complex questions during the patient encounter. Information retrieval systems that can provide immediate and relevant answers to these questions can be invaluable aids to the practice of evidence-based medicine. In this study, we first automatically identify topic keywords from ad hoc clinical questions with a Conditional Random Field model that is trained over thousands of manually annotated clinical questions. We then report on a linear model that assigns query weights based on their automatically identified semantic roles: topic keywords, domain specific terms, and their synonyms. Our evaluation shows that this weighted keyword model improves information retrieval from the Text Retrieval Conference Genomics track data.
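
    A hedged sketch of the weighted-keyword idea in the two records above: query terms receive weights according to their identified role (topic keyword, domain-specific term, synonym), and a document is scored by the total weight of the query terms it contains. The role labels and weights here are invented, not the values learned in the paper.

      ROLE_WEIGHTS = {"topic": 3.0, "domain": 2.0, "synonym": 1.0}   # illustrative weights

      query = {
          "latent": "topic",
          "tuberculosis": "topic",
          "treatment": "domain",
          "therapy": "synonym",
      }

      def score(document, query):
          """Sum the role weights of the query terms present in the document."""
          words = set(document.lower().split())
          return sum(ROLE_WEIGHTS[role] for term, role in query.items() if term in words)

      doc = "isoniazid therapy for latent tuberculosis infection"
      print(score(doc, query))   # 3.0 + 3.0 + 1.0 = 7.0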

  18. Theory of retrieving orientation-resolved molecular information using time-domain rotational coherence spectroscopy

    NASA Astrophysics Data System (ADS)

    Wang, Xu; Le, Anh-Thu; Zhou, Zhaoyan; Wei, Hui; Lin, C. D.

    2017-08-01

    We provide a unified theoretical framework for recently emerging experiments that retrieve fixed-in-space molecular information through time-domain rotational coherence spectroscopy. Unlike a previous approach by Makhija et al. (V. Makhija et al., arXiv:1611.06476), our method can be applied to the retrieval of both real-valued (e.g., ionization yield) and complex-valued (e.g., induced dipole moment) molecular response information. It is also a direct retrieval method without using iterations. We also demonstrate that experimental parameters, such as the fluence of the aligning laser pulse and the rotational temperature of the molecular ensemble, can be quite accurately determined using a statistical method.

  19. Beyond Information Retrieval: Ways To Provide Content in Context.

    ERIC Educational Resources Information Center

    Wiley, Deborah Lynne

    1998-01-01

    Provides an overview of information retrieval from mainframe systems to Web search engines; discusses collaborative filtering, data extraction, data visualization, agent technology, pattern recognition, classification and clustering, and virtual communities. Argues that rather than huge data-storage centers and proprietary software, we need…

  20. Introducing Chemistry Undergraduate Students to Online Chemical Information Retrieval.

    ERIC Educational Resources Information Center

    Wolman, Yecheskel

    1985-01-01

    The results of manual and online searching are compared during a unit on online chemical information retrieval taught at Hebrew University. Strategies and results obtained are provided for student searches on the synthesis of vitamin K(3) from 2-methylnaphthalene and polywater. (JN)

  1. Support Vector Machines: Relevance Feedback and Information Retrieval.

    ERIC Educational Resources Information Center

    Drucker, Harris; Shahrary, Behzad; Gibbon, David C.

    2002-01-01

    Compares support vector machines (SVMs) to Rocchio, Ide regular and Ide dec-hi algorithms in information retrieval (IR) of text documents using relevancy feedback. If the preliminary search is so poor that one has to search through many documents to find at least one relevant document, then SVM is preferred. Includes nine tables. (Contains 24…

  2. Interactive information retrieval systems with minimalist representation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Domeshek, E.; Kedar, S.; Gordon, A.

    Almost any information you might want is becoming available on-line. The problem is how to find what you need. One strategy to improve access to existing information sources is intelligent information agents - an approach based on extensive representation and inference. Another alternative is to simply concentrate on better information organization and indexing. Our systems use a form of conceptual indexing sensitive to users' task-specific information needs. We aim for minimalist representation, coding only select aspects of stored items. Rather than supporting reliable automated inference, the primary purpose of our representations is to provide sufficient discrimination and guidance to a user for a given domain and task. This paper argues, using case studies, that minimal representations can make strong contributions to the usefulness and usability of interactive information systems, while minimizing knowledge engineering effort. We demonstrate this approach in several broad spectrum applications including video retrieval and advisory systems.

  3. Personalizing Information Retrieval Using Task Features, Topic Knowledge, and Task Products

    ERIC Educational Resources Information Center

    Liu, Jingjing

    2010-01-01

    Personalization of information retrieval tailors search towards individual users to meet their particular information needs by taking into account information about users and their contexts, often through implicit sources of evidence such as user behaviors and contextual factors. The current study looks particularly at users' dwelling behavior,…

  4. Personalizing Information Retrieval Using Interaction Behaviors in Search Sessions in Different Types of Tasks

    ERIC Educational Resources Information Center

    Liu, Chang

    2012-01-01

    When using information retrieval (IR) systems, users often pose short and ambiguous query terms. It is critical for IR systems to obtain more accurate representation of users' information need, their document preferences, and the context they are working in, and then incorporate them into the design of the systems to tailor retrieval to…

  5. All-Union Conference on Information Retrieval Systems and Automatic Processing of Scientific and Technical Information, 3rd, Moscow, 1967, Transactions. (Selected Articles).

    ERIC Educational Resources Information Center

    Air Force Systems Command, Wright-Patterson AFB, OH. Foreign Technology Div.

    The role and place of the machine in scientific and technical information is explored including: basic trends in the development of information retrieval systems; preparation of engineering and scientific cadres with respect to mechanization and automation of information works; the logic of descriptor retrieval systems; the 'SETKA-3' automated…

  6. Knowledge-Sparse and Knowledge-Rich Learning in Information Retrieval.

    ERIC Educational Resources Information Center

    Rada, Roy

    1987-01-01

    Reviews aspects of the relationship between machine learning and information retrieval. Highlights include learning programs that extend from knowledge-sparse learning to knowledge-rich learning; the role of the thesaurus; knowledge bases; artificial intelligence; weighting documents; word frequency; and merging classification structures. (78…

  7. New NED XML/VOtable Services and Client Interface Applications

    NASA Astrophysics Data System (ADS)

    Pevunova, O.; Good, J.; Mazzarella, J.; Berriman, G. B.; Madore, B.

    2005-12-01

    The NASA/IPAC Extragalactic Database (NED) provides data and cross-identifications for over 7 million extragalactic objects fused from thousands of survey catalogs and journal articles. The data cover all frequencies from radio through gamma rays and include positions, redshifts, photometry and spectral energy distributions (SEDs), sizes, and images. NED services have traditionally supplied data in HTML format for connections from Web browsers, and a custom ASCII data structure for connections by remote computer programs written in the C programming language. We describe new services that provide responses from NED queries in XML documents compliant with the international virtual observatory VOtable protocol. The XML/VOtable services support cone searches, all-sky searches based on object attributes (survey names, cross-IDs, redshifts, flux densities), and requests for detailed object data. Initial services have been inserted into the NVO registry, and others will follow soon. The first client application is a Style Sheet specification for rendering NED VOtable query results in Web browsers that support XML. The second prototype application is a Java applet that allows users to compare multiple SEDs. The new XML/VOtable output mode will also simplify the integration of data from NED into visualization and analysis packages, software agents, and other virtual observatory applications. We show an example SED from NED plotted using VOPlot. The NED website is: http://nedwww.ipac.caltech.edu.
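
    For illustration, extracting column names and rows from a simplified VOTable-style response with Python's standard library might look like the sketch below; real NED VOTable documents carry an XML namespace and richer metadata than shown here.

```python
# Sketch: pull column names and table rows out of a simplified VOTable-style
# XML response such as a cone-search result. Real VOTable documents include
# a namespace and additional metadata elements.
import xml.etree.ElementTree as ET

votable = """<VOTABLE><RESOURCE><TABLE>
  <FIELD name="Object_Name"/><FIELD name="RA(deg)"/><FIELD name="DEC(deg)"/>
  <DATA><TABLEDATA>
    <TR><TD>MESSIER 031</TD><TD>10.6847</TD><TD>41.2690</TD></TR>
    <TR><TD>MESSIER 033</TD><TD>23.4621</TD><TD>30.6599</TD></TR>
  </TABLEDATA></DATA>
</TABLE></RESOURCE></VOTABLE>"""

root = ET.fromstring(votable)
columns = [f.get("name") for f in root.iter("FIELD")]
rows = [[td.text for td in tr] for tr in root.iter("TR")]
for row in rows:
    print(dict(zip(columns, row)))
```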

  8. AEROMETRIC INFORMATION RETRIEVAL SYSTEM (AIRS) -GEOGRAPHIC, COMMON, AND MAINTENANCE SUBSYSTEM (GCS)

    EPA Science Inventory

    Aerometric Information Retrieval System (AIRS) is a computer-based repository of information about airborne pollution in the United States and various World Health Organization (WHO) member countries. AIRS is administered by the U.S. Environmental Protection Agency, and runs on t...

  9. The validation of the Yonsei CArbon Retrieval algorithm with improved aerosol information using GOSAT measurements

    NASA Astrophysics Data System (ADS)

    Jung, Yeonjin; Kim, Jhoon; Kim, Woogyung; Boesch, Hartmut; Goo, Tae-Young; Cho, Chunho

    2017-04-01

    Although several CO2 retrieval algorithms have been developed to improve our understanding of the carbon cycle, limitations in spatial coverage and uncertainties due to aerosols and thin cirrus clouds remain a problem for monitoring CO2 concentrations globally. Based on an optimal estimation method, the Yonsei CArbon Retrieval (YCAR) algorithm was developed to retrieve the column-averaged dry-air mole fraction of carbon dioxide (XCO2) from Greenhouse Gases Observing SATellite (GOSAT) measurements with optimized a priori CO2 profiles and aerosol models over East Asia. In previous studies, aerosol optical properties (AOPs) were the most important factors in CO2 retrievals, since AOPs were assumed to be fixed parameters during the retrieval process, resulting in XCO2 retrieval errors of up to 2.5 ppm. In this study, to reduce these errors caused by inaccurate aerosol optical information, the YCAR algorithm was improved by taking into account aerosol optical properties as well as the aerosol vertical distribution simultaneously. CO2 retrievals with the two different aerosol approaches have been analyzed using GOSAT spectra and evaluated through comparison with collocated ground-based observations at several Total Carbon Column Observing Network (TCCON) sites. The improved YCAR algorithm has biases of 0.59±0.48 ppm and 2.16±0.87 ppm at the Saga and Tsukuba sites, respectively, with smaller biases and higher correlation coefficients compared to the GOSAT operational algorithm. In addition, the XCO2 retrievals will be validated at other TCCON sites and an error analysis will be performed. These results reveal that using better aerosol information can improve the accuracy of a CO2 retrieval algorithm and provide more useful XCO2 information with reduced uncertainties. This study is expected to provide useful information for estimating carbon sources and sinks.

  10. HepML, an XML-based format for describing simulated data in high energy physics

    NASA Astrophysics Data System (ADS)

    Belov, S.; Dudko, L.; Kekelidze, D.; Sherstnev, A.

    2010-10-01

    In this paper we describe the HepML format and a corresponding C++ library developed for keeping a complete description of parton level events in a unified and flexible form. HepML tags contain enough information to understand what kind of physics the simulated events describe and how the events have been prepared. A HepML block can be included into event files in the LHEF format. The structure of the HepML block is described by means of several XML Schemas. The Schemas define necessary information for the HepML block and how this information should be located within the block. The library libhepml is a C++ library intended for parsing and serialization of HepML tags, and representing the HepML block in computer memory. The library is an API for external software. For example, Matrix Element Monte Carlo event generators can use the library for preparing and writing a header of an LHEF file in the form of HepML tags. In turn, Showering and Hadronization event generators can parse the HepML header and get the information in the form of C++ classes. libhepml can be used in C++, C, and Fortran programs. All necessary parts of HepML have been prepared and we present the project to the HEP community. Program summary: Program title: libhepml Catalogue identifier: AEGL_v1_0 Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEGL_v1_0.html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland Licensing provisions: GNU GPLv3 No. of lines in distributed program, including test data, etc.: 138 866 No. of bytes in distributed program, including test data, etc.: 613 122 Distribution format: tar.gz Programming language: C++, C Computer: PCs and workstations Operating system: Scientific Linux CERN 4/5, Ubuntu 9.10 RAM: 1 073 741 824 bytes (1 Gb) Classification: 6.2, 11.1, 11.2 External routines: Xerces XML library (http://xerces.apache.org/xerces-c/), Expat XML Parser (http://expat.sourceforge.net/) Nature of problem: Monte Carlo simulation in high
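
    As a conceptual illustration only (libhepml itself is a C++ library), reading a HepML-like XML header with Python might look as follows; the element names and values are invented for the example and are not the actual HepML schema.

```python
# Conceptual sketch: parse a hypothetical HepML-style XML header embedded in
# an LHEF-like file. Element names and values below are illustrative
# assumptions, not the real HepML schema handled by libhepml (C++).
import xml.etree.ElementTree as ET

header = """<hepml>
  <samples>
    <generator name="CompHEP" version="4.5"/>
    <process>p p -&gt; t t~</process>
    <cross_section unit="pb">245.8</cross_section>
  </samples>
</hepml>"""

root = ET.fromstring(header)
gen = root.find("samples/generator")
xsec = root.find("samples/cross_section")
print(gen.get("name"), gen.get("version"))
print(float(xsec.text), xsec.get("unit"))
```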

  11. Hospital nurses' information retrieval behaviours in relation to evidence based nursing: a literature review.

    PubMed

    Alving, Berit Elisabeth; Christensen, Janne Buck; Thrysøe, Lars

    2018-03-01

    The purpose of this literature review is to provide an overview of the information retrieval behaviour of clinical nurses, in terms of the use of databases and other information resources and their frequency of use. Systematic searches carried out in five databases and handsearching were used to identify the studies from 2010 to 2016, with a populations, exposures and outcomes (PEO) search strategy, focusing on the question: In which databases or other information resources do hospital nurses search for evidence based information, and how often? Of 5272 titles retrieved based on the search strategy, only nine studies fulfilled the criteria for inclusion. The studies are from the United States, Canada, Taiwan and Nigeria. The results show that hospital nurses' primary choice of source for evidence based information is Google and peers, while bibliographic databases such as PubMed are secondary choices. Data on frequency are only included in four of the studies, and data are heterogeneous. The reasons for choosing Google and peers are primarily lack of time; lack of information; lack of retrieval skills; or lack of training in database searching. Only a few studies are published on clinical nurses' retrieval behaviours, and more studies are needed from Europe and Australia. © 2018 Health Libraries Group.

  12. Semantics-driven modelling of user preferences for information retrieval in the biomedical domain.

    PubMed

    Gladun, Anatoly; Rogushina, Julia; Valencia-García, Rafael; Béjar, Rodrigo Martínez

    2013-03-01

    A large amount of biomedical and genomic data are currently available on the Internet. However, data are distributed into heterogeneous biological information sources, with little or even no organization. Semantic technologies provide a consistent and reliable basis with which to confront the challenges involved in the organization, manipulation and visualization of data and knowledge. One of the knowledge representation techniques used in semantic processing is the ontology, which is commonly defined as a formal and explicit specification of a shared conceptualization of a domain of interest. The work presented here introduces a set of interoperable algorithms that can use domain and ontological information to improve information-retrieval processes. This work presents an ontology-based information-retrieval system for the biomedical domain. This system, with which some experiments have been carried out that are described in this paper, is based on the use of domain ontologies for the creation and normalization of lightweight ontologies that represent user preferences in a determined domain in order to improve information-retrieval processes.

  13. Improving information retrieval with multiple health terminologies in a quality-controlled gateway.

    PubMed

    Soualmia, Lina F; Sakji, Saoussen; Letord, Catherine; Rollin, Laetitia; Massari, Philippe; Darmoni, Stéfan J

    2013-01-01

    The Catalog and Index of French-language Health Internet resources (CISMeF) is a quality-controlled health gateway, primarily for Web resources in French (n=89,751). Recently, we achieved a major improvement in the structure of the catalogue by setting up multiple terminologies, based on twelve health terminologies available in French, to overcome the potential weakness of the MeSH thesaurus, which has been the main and pivotal terminology used for indexing and retrieval since 1995. The main aim of this study was to estimate the added value of exploiting several terminologies and their semantic relationships to improve Web resource indexing and retrieval in CISMeF, in order to provide additional health resources which meet the users' expectations. Twelve terminologies were integrated into the CISMeF information system to set up multiple-terminologies indexing and retrieval. The same sets of thirty queries were run: (i) by exploiting the hierarchical structure of the MeSH, and (ii) by exploiting the additional twelve terminologies and their semantic links. The two search modes were evaluated and compared. The overall coverage of the multiple-terminologies search mode was improved by comparison to the coverage of using the MeSH (16,283 vs. 14,159) (+15%). These additional findings were estimated at 56.6% relevant results, 24.7% intermediate results and 18.7% irrelevant. The multiple-terminologies approach improved information retrieval. These results suggest that integrating additional health terminologies was able to improve recall. Since performing the study, 21 other terminologies have been added which should enable us to make broader studies in multiple-terminologies information retrieval.
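
    For intuition, a toy sketch of the multiple-terminologies idea — expanding a query term with synonyms drawn from additional terminologies before retrieval — might look like the following; the mappings are hypothetical, not CISMeF data.

```python
# Sketch of terminology-based query expansion: add synonyms from other
# vocabularies to the original query terms before searching. The mappings
# below are hypothetical examples, not entries from the twelve CISMeF
# terminologies.
SYNONYMS = {
    "myocardial infarction": {"heart attack", "MI", "cardiac infarction"},
    "hypertension": {"high blood pressure", "HTN"},
}

def expand(query_terms):
    """Return the original query terms plus any synonyms found in the mappings."""
    expanded = set(query_terms)
    for term in query_terms:
        expanded |= SYNONYMS.get(term.lower(), set())
    return expanded

print(expand(["Myocardial infarction"]))
```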

  14. Latent morpho-semantic analysis : multilingual information retrieval with character n-grams and mutual information.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bader, Brett William; Chew, Peter A.; Abdelali, Ahmed

    We describe an entirely statistics-based, unsupervised, and language-independent approach to multilingual information retrieval, which we call Latent Morpho-Semantic Analysis (LMSA). LMSA overcomes some of the shortcomings of related previous approaches such as Latent Semantic Analysis (LSA). LMSA has an important theoretical advantage over LSA: it combines well-known techniques in a novel way to break the terms of LSA down into units which correspond more closely to morphemes. Thus, it has a particular appeal for use with morphologically complex languages such as Arabic. We show through empirical results that the theoretical advantages of LMSA can translate into significant gains in precision in multilingual information retrieval tests. These gains are not matched either when a standard stemmer is used with LSA, or when terms are indiscriminately broken down into n-grams.
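
    A rough sketch of the character n-gram idea underlying such sub-word indexing, using off-the-shelf scikit-learn components; this shows the general technique, not the authors' exact LMSA pipeline.

```python
# Sketch: index overlapping character n-grams instead of whole words, then
# reduce dimensionality with a truncated SVD as in LSA. Generic technique
# only; the cited LMSA work uses its own n-gram selection and weighting.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD

docs = [
    "retrieval of multilingual documents",
    "retrieving documents across languages",
    "rainfall statistics for arid regions",
]

vectorizer = TfidfVectorizer(analyzer="char_wb", ngram_range=(3, 4))
X = vectorizer.fit_transform(docs)          # documents x character n-grams
lsa = TruncatedSVD(n_components=2, random_state=0)
Z = lsa.fit_transform(X)                    # documents in a low-rank space
print(Z.shape)                              # (3, 2)
```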

  15. Age of acquisition affects the retrieval of grammatical category information.

    PubMed

    Bai, Lili; Ma, Tengfei; Dunlap, Susan; Chen, Baoguo

    2013-01-01

    This study investigated age of acquisition (AoA) effects on processing grammatical category information of Chinese single-character words. In Experiment 1, nouns and verbs that were acquired at different ages were used as materials in a grammatical category decision task. Results showed that the grammatical category information of earlier acquired nouns and verbs was easier to retrieve. In Experiment 2, AoA and predictability from orthography to grammatical category were manipulated in a grammatical category decision task. Results showed larger AoA effects under lower predictability conditions. In Experiment 3, a semantic category decision task was used with the same materials as those in Experiment 2. Different results were found from Experiment 2, suggesting that the grammatical category decision task is not merely the same as the semantic category decision task, but rather involves additional processing of grammatical category information. Therefore the conclusions of Experiments 1 and 2 were strengthened. In summary, it was found for the first time that AoA affects the retrieval of grammatical category information, thus providing new evidence in support of the arbitrary mapping hypothesis.

  16. XML-BSPM: an XML format for storing Body Surface Potential Map recordings.

    PubMed

    Bond, Raymond R; Finlay, Dewar D; Nugent, Chris D; Moore, George

    2010-05-14

    The Body Surface Potential Map (BSPM) is an electrocardiographic method, for recording and displaying the electrical activity of the heart, from a spatial perspective. The BSPM has been deemed more accurate for assessing certain cardiac pathologies when compared to the 12-lead ECG. Nevertheless, the 12-lead ECG remains the most popular ECG acquisition method for non-invasively assessing the electrical activity of the heart. Although data from the 12-lead ECG can be stored and shared using open formats such as SCP-ECG, no open formats currently exist for storing and sharing the BSPM. As a result, an innovative format for storing BSPM datasets has been developed within this study. The XML vocabulary was chosen for implementation, as opposed to binary for the purpose of human readability. There are currently no standards to dictate the number of electrodes and electrode positions for recording a BSPM. In fact, there are at least 11 different BSPM electrode configurations in use today. Therefore, in order to support these BSPM variants, the XML-BSPM format was made versatile. Hence, the format supports the storage of custom torso diagrams using SVG graphics. This diagram can then be used in a 2D coordinate system for retaining electrode positions. This XML-BSPM format has been successfully used to store the Kornreich-117 BSPM dataset and the Lux-192 BSPM dataset. The resulting file sizes were in the region of 277 kilobytes for each BSPM recording and can be deemed suitable for example, for use with any telemonitoring application. Moreover, there is potential for file sizes to be further reduced using basic compression algorithms, i.e. the deflate algorithm. Finally, these BSPM files have been parsed and visualised within a convenient time period using a web based BSPM viewer. This format, if widely adopted could promote BSPM interoperability, knowledge sharing and data mining. This work could also be used to provide conceptual solutions and inspire existing formats
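
    A minimal sketch of what an XML-BSPM-style record could look like when built with Python's standard library; the element and attribute names are illustrative assumptions rather than the published vocabulary.

```python
# Sketch: a BSPM-like XML record with electrode positions in a 2D torso
# coordinate system plus per-electrode samples. Element/attribute names are
# illustrative, not the actual XML-BSPM schema.
import xml.etree.ElementTree as ET

bspm = ET.Element("bspm", config="Lux-192")
electrodes = ET.SubElement(bspm, "electrodes")
for name, x, y in [("E001", 12.5, 40.0), ("E002", 17.5, 40.0)]:
    ET.SubElement(electrodes, "electrode", name=name, x=str(x), y=str(y))

signals = ET.SubElement(bspm, "signals", units="mV", sampling_rate_hz="500")
ET.SubElement(signals, "lead", electrode="E001").text = "0.01 0.03 0.12"

print(ET.tostring(bspm, encoding="unicode"))
```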

  17. Exploring Encoding and Retrieval Effects of Background Information on Text Memory

    ERIC Educational Resources Information Center

    Rawson, Katherine A.; Kintsch, Walter

    2004-01-01

    Two experiments were conducted (a) to evaluate how providing background information at test may benefit retrieval and (b) to further examine how providing background information prior to study influences encoding. Half of the participants read background information prior to study, and the other half did not. In each group, half were presented…

  18. Secure quantum private information retrieval using phase-encoded queries

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Olejnik, Lukasz

    We propose a quantum solution to the classical private information retrieval (PIR) problem, which allows one to query a database in a private manner. The protocol offers privacy thresholds and allows the user to obtain information from a database in a way that offers the potential adversary, in this model the database owner, no possibility of deterministically establishing the query contents. This protocol may also be viewed as a solution to the symmetrically private information retrieval problem in that it can offer database security (inability for a querying user to steal its contents). Compared to classical solutions, the protocol offers substantial improvement in terms of communication complexity. In comparison with the recent quantum private queries [Phys. Rev. Lett. 100, 230502 (2008)] protocol, it is more efficient in terms of communication complexity and the number of rounds, while offering a clear privacy parameter. We discuss the security of the protocol and analyze its strengths and conclude that using this technique makes it challenging to obtain the unconditional (in the information-theoretic sense) privacy degree; nevertheless, in addition to being simple, the protocol still offers a privacy level. The oracle used in the protocol is inspired both by the classical computational PIR solutions as well as the Deutsch-Jozsa oracle.

  19. Bottom-Up Evaluation of Twig Join Pattern Queries in XML Document Databases

    NASA Astrophysics Data System (ADS)

    Chen, Yangjun

    Since the extensible markup language XML emerged as a new standard for information representation and exchange on the Internet, the problem of storing, indexing, and querying XML documents has been among the major issues of database research. In this paper, we study twig pattern matching and discuss a new algorithm for processing ordered twig pattern queries. The time complexity of the algorithm is bounded by O(|D|·|Q| + |T|·leaf_Q) and its space overhead by O(leaf_T·leaf_Q), where T stands for a document tree, Q for a twig pattern, and D is the largest data stream associated with a node q of Q, which contains the database nodes that match the node predicate at q. leaf_T (leaf_Q) denotes the number of leaf nodes of T (resp. Q). In addition, the algorithm can be adapted to an indexing environment in which XB-trees are used.
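
    The bounds above concern a streaming, stack-based evaluation; for intuition only, a naive recursive twig matcher over a small XML fragment might look like the following sketch, which does not reflect the paper's algorithm or its complexity guarantees.

```python
# Naive illustration of twig pattern matching: check whether an XML element
# matches a small ancestor/descendant pattern. Brute-force intuition only;
# not the paper's stream-based algorithm and not within its stated bounds.
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<book><title>XML</title><authors><author>Chen</author></authors></book>")

# Twig pattern: a 'book' with a 'title' descendant and an 'author' descendant.
twig = {"tag": "book",
        "children": [{"tag": "title", "children": []},
                     {"tag": "author", "children": []}]}

def matches(elem, pattern):
    """True if elem's tag matches and each sub-pattern matches some descendant."""
    if elem.tag != pattern["tag"]:
        return False
    descendants = list(elem.iter())[1:]   # proper descendants, document order
    return all(any(matches(d, sub) for d in descendants)
               for sub in pattern["children"])

print(matches(doc, twig))  # True
```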

  20. Health Professionals' Use of Online Information Retrieval Systems and Online Evidence.

    PubMed

    Lialiou, Paschalina; Pavlopoulou, Ioanna; Mantas, John

    2016-01-01

    A cross-sectional survey was designed to determine health professionals' awareness and usage of online evidence retrieval systems in clinical practice. A questionnaire was used to measure professionals' behavior and utilization of online evidence, as well as reasons for and barriers to information retrieval. The sample comprised 439 nurses and physicians from public and private hospitals in Greece. The two most common reasons for using online information systems were writing scientific manuscripts and filling a knowledge gap. A positive correlation was found between postgraduate studies and information system usage. The majority of participants (90.6%) believe that online information systems improve patient care, and 67.6% of them had their own experience of this. More support is needed for nurses and physicians to use online evidence and, as a result, to improve the care and practices provided.

  1. Common Problems of Documentary Information Transfer, Storage and Retrieval in Industrial Organizations.

    ERIC Educational Resources Information Center

    Vickers, P. H.

    1983-01-01

    Examination of management information systems of three manufacturing firms highlights principal characteristics, document types and functions, main information flows, storage and retrieval systems, and common problems (corporate memory failure, records management, management information systems, general management). A literature review and…

  2. The semantic representation of event information depends on the cue modality: an instance of meaning-based retrieval.

    PubMed

    Karlsson, Kristina; Sikström, Sverker; Willander, Johan

    2013-01-01

    The semantic content, or the meaning, is the essence of autobiographical memories. In comparison to previous research, which has mainly focused on the phenomenological experience and the age distribution of retrieved events, the present study provides a novel view on the retrieval of event information by quantifying the information as semantic representations. We investigated the semantic representation of sensory cued autobiographical events and studied the modality hierarchy within the multimodal retrieval cues. The experiment comprised a cued recall task, where the participants were presented with visual, auditory, olfactory or multimodal retrieval cues and asked to recall autobiographical events. The results indicated that the three different unimodal retrieval cues generate significantly different semantic representations. Further, the auditory and the visual modalities contributed the most to the semantic representation of the multimodally retrieved events. Finally, the semantic representation of the multimodal condition could be described as a combination of the three unimodal conditions. In conclusion, these results suggest that the meaning of the retrieved event information depends on the modality of the retrieval cues.

  3. The Semantic Representation of Event Information Depends on the Cue Modality: An Instance of Meaning-Based Retrieval

    PubMed Central

    Karlsson, Kristina; Sikström, Sverker; Willander, Johan

    2013-01-01

    The semantic content, or the meaning, is the essence of autobiographical memories. In comparison to previous research, which has mainly focused on the phenomenological experience and the age distribution of retrieved events, the present study provides a novel view on the retrieval of event information by quantifying the information as semantic representations. We investigated the semantic representation of sensory cued autobiographical events and studied the modality hierarchy within the multimodal retrieval cues. The experiment comprised a cued recall task, where the participants were presented with visual, auditory, olfactory or multimodal retrieval cues and asked to recall autobiographical events. The results indicated that the three different unimodal retrieval cues generate significantly different semantic representations. Further, the auditory and the visual modalities contributed the most to the semantic representation of the multimodally retrieved events. Finally, the semantic representation of the multimodal condition could be described as a combination of the three unimodal conditions. In conclusion, these results suggest that the meaning of the retrieved event information depends on the modality of the retrieval cues. PMID:24204561

  4. Representation and alignment of sung queries for music information retrieval

    NASA Astrophysics Data System (ADS)

    Adams, Norman H.; Wakefield, Gregory H.

    2005-09-01

    The pursuit of robust and rapid query-by-humming systems, which search melodic databases using sung queries, is a common theme in music information retrieval. The retrieval aspect of this database problem has received considerable attention, whereas the front-end processing of sung queries and the data structure to represent melodies has been based on musical intuition and historical momentum. The present work explores three time series representations for sung queries: a sequence of notes, a ``smooth'' pitch contour, and a sequence of pitch histograms. The performance of the three representations is compared using a collection of naturally sung queries. It is found that the most robust performance is achieved by the representation with highest dimension, the smooth pitch contour, but that this representation presents a formidable computational burden. For all three representations, it is necessary to align the query and target in order to achieve robust performance. The computational cost of the alignment is quadratic, hence it is necessary to keep the dimension small for rapid retrieval. Accordingly, iterative deepening is employed to achieve both robust performance and rapid retrieval. Finally, the conventional iterative framework is expanded to adapt the alignment constraints based on previous iterations, further expediting retrieval without degrading performance.
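
    The quadratic-cost alignment mentioned above is typically a dynamic-programming computation; a minimal DTW-style sketch follows, illustrating why the cost grows with the product of the sequence lengths. This is a generic alignment, not necessarily the exact procedure used in the cited study.

```python
# Minimal dynamic-programming alignment of two pitch contours, showing the
# O(n*m) table fill that makes alignment cost quadratic. Generic DTW sketch,
# not the specific alignment or iterative-deepening scheme of the study.
def dtw(query, target):
    """Return the minimum cumulative alignment cost between two sequences."""
    n, m = len(query), len(target)
    INF = float("inf")
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):              # O(n*m) table fill
            d = abs(query[i - 1] - target[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # insertion
                                 cost[i][j - 1],      # deletion
                                 cost[i - 1][j - 1])  # match
    return cost[n][m]

print(dtw([60, 62, 64, 64], [60, 62, 62, 64]))  # 0.0: contours align under warping
```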

  5. Automating Disk Forensic Processing with SleuthKit, XML and Python

    DTIC Science & Technology

    2009-05-01

    Automating Disk Forensic Processing with SleuthKit, XML and Python. Simson L. Garfinkel. Abstract: We have developed a program called fiwalk which...files themselves. We show how it is relatively simple to create automated disk forensic applications using a Python module we have written that reads...software that the portable device may contain. Keywords: Computer Forensics; XML; Sleuth Kit; Python. I. INTRODUCTION: In recent years we have found many
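
    For illustration, reading fiwalk-style XML output with Python's standard library might look like the following; the element names only approximate fiwalk's DFXML output and should be treated as assumptions.

```python
# Sketch: iterate over file objects in fiwalk-style XML output. The element
# names below approximate the DFXML schema and are assumptions, not a
# faithful rendering of real fiwalk output.
import xml.etree.ElementTree as ET

dfxml = """<dfxml>
  <fileobject><filename>report.docx</filename><filesize>20480</filesize></fileobject>
  <fileobject><filename>photo.jpg</filename><filesize>310000</filesize></fileobject>
</dfxml>"""

for fo in ET.fromstring(dfxml).iter("fileobject"):
    name = fo.findtext("filename")
    size = int(fo.findtext("filesize"))
    print(f"{name}: {size} bytes")
```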

  6. Space shuttle program information control and retrieval system feasibility study report

    NASA Technical Reports Server (NTRS)

    Lingle, C. P.

    1973-01-01

    The feasibility of having a common information management network for space shuttle data is studied. Identified are the information types required, sources and users of the information, and existing techniques for acquiring, storing and retrieving the data. The study concluded that a decentralized system is feasible, and described a recommended development plan for it.

  7. Indexing and Metatag Schemes for Web-Based Information Retrieval.

    ERIC Educational Resources Information Center

    Torok, Andrew G.

    This paper reviews indexing theory and suggests that information retrieval can be significantly improved by applying basic indexing criteria. Indexing practices are described, including the three main types of indexes: pre-coordinate, post-coordinate, and variants of both. Design features of indexes are summarized, including accuracy, consistency,…

  8. On Inference Rules of Logic-Based Information Retrieval Systems.

    ERIC Educational Resources Information Center

    Chen, Patrick Shicheng

    1994-01-01

    Discussion of relevance and the needs of the users in information retrieval focuses on a deductive object-oriented approach and suggests eight inference rules for the deduction. Highlights include characteristics of a deductive object-oriented system, database and data modeling language, implementation, and user interface. (Contains 24…

  9. Intelligent Information Retrieval: Diagnosing Information Need. Part II. Uncertainty Expansion in a Prototype of a Diagnostic IR Tool.

    ERIC Educational Resources Information Center

    Cole, Charles; Cantero, Pablo; Sauve, Diane

    1998-01-01

    Outlines a prototype of an intelligent information-retrieval tool to facilitate information access for an undergraduate seeking information for a term paper. Topics include diagnosing the information need, Kuhlthau's information-search-process model, Shannon's mathematical theory of communication, and principles of uncertainty expansion and…

  10. Understanding vaccination resistance: vaccine search term selection bias and the valence of retrieved information.

    PubMed

    Ruiz, Jeanette B; Bell, Robert A

    2014-10-07

    Dubious vaccination-related information on the Internet leads some parents to opt out of vaccinating their children. The objective was to determine whether negative, neutral, and positive search terms retrieve vaccination information that differs in valence and confirms searchers' assumptions about vaccination. A content analysis of first-page Google search results was conducted using three negative, three neutral, and three positive search terms for the concepts "vaccine," "vaccination," and "MMR"; 84 of the 90 websites retrieved met inclusion requirements. Two coders independently and reliably coded for the presence or absence of each of 15 myths about vaccination (e.g., "vaccines cause autism"), statements that countered these myths, and recommendations for or against vaccination. Data were analyzed using descriptive statistics. Across all websites, at least one myth was perpetuated on 16.7% of websites and at least one myth was countered on 64.3% of websites. The mean number of myths perpetuated on websites retrieved with negative, neutral, and positive search terms, respectively, was 1.93, 0.53, and 0.40. The mean number of myths countered on websites retrieved with negative, neutral, and positive search terms, respectively, was 3.0, 3.27, and 2.87. Explicit recommendations regarding vaccination were offered on 22.6% of websites. A recommendation against vaccination was more often made on websites retrieved with negative search terms (37.5% of recommendations) than on websites retrieved with neutral (12.5%) or positive (0%) search terms. The concerned parent who seeks information about the risks of childhood immunizations will find more websites that perpetuate vaccine myths and recommend against vaccination than the parent who seeks information about the benefits of vaccination. This suggests that search term valence can lead to online information that supports concerned parents' misconceptions about vaccines. Copyright © 2014 Elsevier Ltd. All rights reserved.

  11. Knowledge Acquisition of Generic Queries for Information Retrieval

    PubMed Central

    Seol, Yoon-Ho; Johnson, Stephen B.; Cimino, James J.

    2002-01-01

    Several studies have identified clinical questions posed by health care professionals to understand the nature of information needs during clinical practice. To support access to digital information sources, it is necessary to integrate the information needs with a computer system. We have developed a conceptual guidance approach in information retrieval, based on a knowledge base that contains the patterns of information needs. The knowledge base uses a formal representation of clinical questions based on the UMLS knowledge sources, called the Generic Query model. To improve the coverage of the knowledge base, we investigated a method for extracting plausible clinical questions from the medical literature. This poster presents the Generic Query model, shows how it is used to represent the patterns of clinical questions, and describes the framework used to extract knowledge from the medical literature.

  12. Dissociable parietal regions facilitate successful retrieval of recently learned and personally familiar information.

    PubMed

    Elman, Jeremy A; Cohn-Sheehy, Brendan I; Shimamura, Arthur P

    2013-03-01

    In fMRI analyses, the posterior parietal cortex (PPC) is particularly active during the successful retrieval of episodic memory. To delineate the neural correlates of episodic retrieval more succinctly, we compared retrieval of recently learned spatial locations (photographs of buildings) with retrieval of previously familiar locations (photographs of familiar campus buildings). Episodic retrieval of recently learned locations activated a circumscribed region within the ventral PPC (anterior angular gyrus and adjacent regions in the supramarginal gyrus) as well as medial PPC regions (posterior cingulate gyrus and posterior precuneus). Retrieval of familiar locations activated more posterior regions in the ventral PPC (posterior angular gyrus, LOC) and more anterior regions in the medial PPC (anterior precuneus and retrosplenial cortex). These dissociable effects define more precisely PPC regions involved in the retrieval of recent, contextually bound information as opposed to regions involved in other processes, such as visual imagery, scene reconstruction, and self-referential processing. Copyright © 2012 Elsevier Ltd. All rights reserved.

  13. Developing a comprehensive system for content-based retrieval of image and text data from a national survey

    NASA Astrophysics Data System (ADS)

    Antani, Sameer K.; Natarajan, Mukil; Long, Jonathan L.; Long, L. Rodney; Thoma, George R.

    2005-04-01

    The article describes the status of our ongoing R&D at the U.S. National Library of Medicine (NLM) towards the development of an advanced multimedia database biomedical information system that supports content-based image retrieval (CBIR). NLM maintains a collection of 17,000 digitized spinal X-rays along with text survey data from the Second National Health and Nutritional Examination Survey (NHANES II). These data serve as a rich data source for epidemiologists and researchers of osteoarthritis and musculoskeletal diseases. It is currently possible to access these through text keyword queries using our Web-based Medical Information Retrieval System (WebMIRS). CBIR methods developed specifically for biomedical images could offer direct visual searching of these images by means of example image or user sketch. We are building a system which supports hybrid queries that have text and image-content components. R&D goals include developing algorithms for robust image segmentation for localizing and identifying relevant anatomy, labeling the segmented anatomy based on its pathology, developing suitable indexing and similarity matching methods for images and image features, and associating the survey text information for query and retrieval along with the image data. Some highlights of the system developed in MATLAB and Java are: use of a networked or local centralized database for text and image data; flexibility to incorporate new research work; provides a means to control access to system components under development; and use of XML for structured reporting. The article details the design, features, and algorithms in this third revision of this prototype system, CBIR3.

  14. Information content of thermal infrared and microwave bands for simultaneous retrieval of cirrus ice water path and particle effective diameter

    NASA Astrophysics Data System (ADS)

    Bell, A.; Tang, G.; Yang, P.; Wu, D.

    2017-12-01

    Due to their high spatial and temporal coverage, cirrus clouds have a profound role in regulating the Earth's energy budget. Variability of their radiative, geometric, and microphysical properties can pose significant uncertainties in global climate model simulations if not adequately constrained. Thus, the development of retrieval methodologies able to accurately retrieve ice cloud properties and present associated uncertainties is essential. The effectiveness of cirrus cloud retrievals relies on accurate a priori understanding of ice radiative properties, as well as the current state of the atmosphere. Current studies have implemented information content theory analyses prior to retrievals to quantify the amount of information that should be expected on parameters to be retrieved, as well as the relative contribution of information provided by certain measurement channels. Through this analysis, retrieval algorithms can be designed in a way to maximize the information in measurements, and therefore ensure enough information is present to retrieve ice cloud properties. In this study, we present such an information content analysis to quantify the amount of information to be expected in retrievals of cirrus ice water path and particle effective diameter using sub-millimeter and thermal infrared radiometry. Preliminary results show these bands to be sensitive to changes in ice water path and effective diameter, and thus lend confidence in their ability to simultaneously retrieve these parameters. Further quantification of sensitivity and the information provided by these bands can then be used to design an optimal retrieval scheme. While this information content analysis is employed on a theoretical retrieval combining simulated radiance measurements, the methodology could in general be applicable to any instrument or retrieval approach.

  15. Information retrieval for systematic reviews in food and feed topics: A narrative review.

    PubMed

    Wood, Hannah; O'Connor, Annette; Sargeant, Jan; Glanville, Julie

    2018-01-09

    Systematic review methods are now being used for reviews of food production, food safety and security, plant health, and animal health and welfare. Information retrieval methods in this context have been informed by human health-care approaches and ideally should be based on relevant research and experience. This narrative review seeks to identify and summarize current research-based evidence and experience on information retrieval for systematic reviews in food and feed topics. MEDLINE (Ovid), Science Citation Index (Web of Science), and ScienceDirect (http://www.sciencedirect.com/) were searched in 2012 and 2016. We also contacted topic experts and undertook citation searches. We selected and summarized studies reporting research on information retrieval, as well as published guidance and experience. There is little published evidence on the most efficient way to conduct searches for food and feed topics. There are few available study design search filters, and their use may be problematic given poor or inconsistent reporting of study methods. Food and feed research makes use of a wide range of study designs so it might be best to focus strategy development on capturing study populations, although this also has challenges. There is limited guidance on which resources should be searched and whether publication bias in disciplines relevant to food and feed necessitates extensive searching of the gray literature. There is some limited evidence on information retrieval approaches, but more research is required to inform effective and efficient approaches to searching to populate food and feed reviews. Copyright © 2018 John Wiley & Sons, Ltd.

  16. ForConX: A forcefield conversion tool based on XML.

    PubMed

    Lesch, Volker; Diddens, Diddo; Bernardes, Carlos E S; Golub, Benjamin; Dequidt, Alain; Zeindlhofer, Veronika; Sega, Marcello; Schröder, Christian

    2017-04-05

    Force field conversion from one MD program to another is exhausting and error-prone. Although individual conversion tools from one MD program to another exist, not every combination, nor both directions of conversion, is available for the popular MD programs Amber, Charmm, Dl-Poly, Gromacs, and Lammps. We present here a general tool for force field conversion on the basis of an XML document. The force field is converted to and from this XML structure, facilitating the implementation of new MD programs for the conversion. Furthermore, the XML structure is human readable and can be manipulated before continuing the conversion. We report, as test cases, the conversions of topologies for acetonitrile, dimethylformamide, and 1-ethyl-3-methylimidazolium trifluoromethanesulfonate, comprising also Urey-Bradley and Ryckaert-Bellemans potentials. © 2017 Wiley Periodicals, Inc.
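
    A conceptual sketch of the conversion-via-XML idea: represent a force-field term in a neutral XML document so that per-program readers and writers only need to handle the XML. ForConX defines its own schema; the element names and the LAMMPS-style output below are illustrative assumptions.

```python
# Sketch: a program-agnostic harmonic bond stored as XML, then emitted in a
# LAMMPS-flavoured coefficient line. Element names are illustrative, not the
# actual ForConX schema; parameter values are generic example numbers.
import xml.etree.ElementTree as ET

ff = ET.Element("forcefield")
bond = ET.SubElement(ff, "bond", type1="CT", type2="HC",
                     k="340.0", r0="1.090", units="kcal/mol/A^2, A")

def to_lammps_bond_coeff(bond_elem, index=1):
    """Emit a LAMMPS-style 'bond_coeff' line from the XML description."""
    return f"bond_coeff {index} {bond_elem.get('k')} {bond_elem.get('r0')}"

print(to_lammps_bond_coeff(bond))  # bond_coeff 1 340.0 1.090
```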

  17. Information Storage and Retrieval Scientific Report No. ISR-22.

    ERIC Educational Resources Information Center

    Salton, Gerard

    The twenty-second in a series, this report describes research in information organization and retrieval conducted by the Department of Computer Science at Cornell University. The report covers work carried out during the period summer 1972 through summer 1974 and is divided into four parts: indexing theory, automatic content analysis, feedback…

  18. Tetrahydrocannabinol (THC) impairs encoding but not retrieval of verbal information.

    PubMed

    Ranganathan, Mohini; Radhakrishnan, Rajiv; Addy, Peter H; Schnakenberg-Martin, Ashley M; Williams, Ashley H; Carbuto, Michelle; Elander, Jacqueline; Pittman, Brian; Andrew Sewell, R; Skosnik, Patrick D; D'Souza, Deepak Cyril

    2017-10-03

    Cannabis and agonists of the brain cannabinoid receptor (CB1R) produce acute memory impairments in humans. However, the extent to which cannabinoids impair the component processes of encoding and retrieval has not been established in humans. The objective of this analysis was to determine whether the administration of Δ9-tetrahydrocannabinol (THC), the principal psychoactive constituent of cannabis, impairs encoding and/or retrieval of verbal information. Healthy subjects were recruited from the community. Subjects were administered the Rey-Auditory Verbal Learning Test (RAVLT) either before administration of THC (experiment #1) (n=38) or while under the influence of THC (experiment #2) (n=57). Immediate and delayed recall on the RAVLT was compared. Subjects received intravenous THC, in a placebo-controlled, double-blind, randomized manner at doses known to produce behavioral and subjective effects consistent with cannabis intoxication. Total immediate recall, short delayed recall, and long delayed recall were reduced in a statistically significant manner only when the RAVLT was administered to subjects while they were under the influence of THC (experiment #2) and not when the RAVLT was administered prior. THC acutely interferes with encoding of verbal memory without interfering with retrieval. These data suggest that learning information prior to the use of cannabis or cannabinoids is not likely to disrupt recall of that information. Future studies will be necessary to determine whether THC impairs encoding of non-verbal information, to what extent THC impairs memory consolidation, and the role of other cannabinoids in the memory-impairing effects of cannabis. Cannabinoids, Neural Synchrony, and Information Processing (THC-Gamma) http://clinicaltrials.gov/ct2/show/study/NCT00708994 NCT00708994 Pharmacogenetics of Cannabinoid Response http://clinicaltrials.gov/ct2/show/NCT00678730 NCT00678730. Copyright © 2017. Published by Elsevier Inc.

  19. Effects of Information Access Cost and Accountability on Medical Residents' Information Retrieval Strategy and Performance During Prehandover Preparation: Evidence From Interview and Simulation Study.

    PubMed

    Yang, X Jessie; Wickens, Christopher D; Park, Taezoon; Fong, Liesel; Siah, Kewin T H

    2015-12-01

    We aimed to examine the effects of information access cost and accountability on medical residents' information retrieval strategy and performance during prehandover preparation. Prior studies observing doctors' prehandover practices witnessed the use of memory-intensive strategies when retrieving patient information. These strategies impose potential threats to patient safety as human memory is prone to errors. Of interest in this work are the underlying determinants of information retrieval strategy and the potential impacts on medical residents' information preparation performance. A two-step research approach was adopted, consisting of semistructured interviews with 21 medical residents and a simulation-based experiment with 32 medical residents. The semistructured interviews revealed that a substantial portion of medical residents (38%) relied largely on memory for preparing handover information. The simulation-based experiment showed that higher information access cost reduced information access attempts and access duration on patient documents and harmed information preparation performance. Higher accountability led to marginally longer access to patient documents. It is important to understand the underlying determinants of medical residents' information retrieval strategy and performance during prehandover preparation. We noted the criticality of easy access to patient documents in prehandover preparation. In addition, accountability marginally influenced medical residents' information retrieval strategy. Findings from this research suggested that the cost of accessing information sources should be minimized in developing handover preparation tools. © 2015, Human Factors and Ergonomics Society.

  20. Distinct regions of prefrontal cortex are associated with the controlled retrieval and selection of social information.

    PubMed

    Satpute, Ajay B; Badre, David; Ochsner, Kevin N

    2014-05-01

    Research in social neuroscience has uncovered a social knowledge network that is particularly attuned to making social judgments. However, the processes that are being performed by both regions within this network and those outside of this network that are nevertheless engaged in the service of making a social judgment remain unclear. To help address this, we drew upon research in semantic memory, which suggests that making a semantic judgment engages 2 distinct control processes: A controlled retrieval process, which aids in bringing goal-relevant information to mind from long-term stores, and a selection process, which aids in selecting the information that is goal-relevant from the information retrieved. In a neuroimaging study, we investigated whether controlled retrieval and selection for social information engage distinct portions of both the social knowledge network and regions outside this network. Controlled retrieval for social information engaged an anterior ventrolateral portion of the prefrontal cortex, whereas selection engaged both the dorsomedial prefrontal cortex and temporoparietal junction within the social knowledge network. These results suggest that the social knowledge network may be more involved with the selection of social information than the controlled retrieval of it and incorporates lateral prefrontal regions in accessing memory for making social judgments.

  1. Cross-language information retrieval using PARAFAC2.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bader, Brett William; Chew, Peter; Abdelali, Ahmed

    A standard approach to cross-language information retrieval (CLIR) uses Latent Semantic Analysis (LSA) in conjunction with a multilingual parallel aligned corpus. This approach has been shown to be successful in identifying similar documents across languages - or more precisely, retrieving the most similar document in one language to a query in another language. However, the approach has severe drawbacks when applied to a related task, that of clustering documents 'language-independently', so that documents about similar topics end up closest to one another in the semantic space regardless of their language. The problem is that documents are generally more similar to other documents in the same language than they are to documents in a different language, but on the same topic. As a result, when using multilingual LSA, documents will in practice cluster by language, not by topic. We propose a novel application of PARAFAC2 (which is a variant of PARAFAC, a multi-way generalization of the singular value decomposition [SVD]) to overcome this problem. Instead of forming a single multilingual term-by-document matrix which, under LSA, is subjected to SVD, we form an irregular three-way array, each slice of which is a separate term-by-document matrix for a single language in the parallel corpus. The goal is to compute an SVD for each language such that V (the matrix of right singular vectors) is the same across all languages. Effectively, PARAFAC2 imposes the constraint, not present in standard LSA, that the 'concepts' in all documents in the parallel corpus are the same regardless of language. Intuitively, this constraint makes sense, since the whole purpose of using a parallel corpus is that exactly the same concepts are expressed in the translations. We tested this approach by comparing the performance of PARAFAC2 with standard LSA in solving a particular CLIR problem. From our results, we conclude that PARAFAC2 offers a very promising alternative to LSA not
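
    For intuition only, the shared right-singular-vector (V) goal can be illustrated with a plain stacked SVD in NumPy. Note that stacking the slices and taking one SVD is essentially the standard multilingual-LSA baseline the abstract contrasts with; PARAFAC2 instead factorizes the per-language slices separately under a shared-V constraint, which is not implemented here.

```python
# Simplified illustration (NOT PARAFAC2): per-language term-by-document
# slices over an aligned parallel corpus, stacked so that a single SVD
# yields one set of right singular vectors V shared by all languages.
import numpy as np

rng = np.random.default_rng(0)
docs = 6                                   # aligned documents across languages
X_en = rng.random((40, docs))              # English terms x documents
X_es = rng.random((55, docs))              # Spanish terms x documents

stacked = np.vstack([X_en, X_es])          # (40 + 55) x docs
U, s, Vt = np.linalg.svd(stacked, full_matrices=False)
V_shared = Vt.T[:, :3]                     # documents in a shared 3-dim space
print(V_shared.shape)                      # (6, 3)
```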

  2. The Effect of Bilingual Term List Size on Dictionary-Based Cross-Language Information Retrieval

    DTIC Science & Technology

    2006-01-01

    The Effect of Bilingual Term List Size on Dictionary-Based Cross-Language Information Retrieval. Dina Demner-Fushman, Department of Computer Science... dictionary-based Cross-Language Information Retrieval (CLIR), in which the goal is to find documents written in one natural language based on queries that...in which the documents are written. In dictionary-based CLIR techniques, the principal source of translation knowledge is a translation lexicon

  3. Information content of ozone retrieval algorithms

    NASA Technical Reports Server (NTRS)

    Rodgers, C.; Bhartia, P. K.; Chu, W. P.; Curran, R.; Deluisi, J.; Gille, J. C.; Hudson, R.; Mateer, C.; Rusch, D.; Thomas, R. J.

    1989-01-01

    The algorithms used for production processing by the major suppliers of ozone data are characterized to show quantitatively: how the retrieved profile is related to the actual profile (this characterizes the altitude range and vertical resolution of the data); the nature of systematic errors in the retrieved profiles, including their vertical structure and relation to uncertain instrumental parameters; how trends in the real ozone are reflected in trends in the retrieved ozone profile; and how trends in other quantities (both instrumental and atmospheric) might appear as trends in the ozone profile. No serious deficiencies were found in the algorithms used in generating the major available ozone data sets. As the measurements are all indirect in some way, and the retrieved profiles have different characteristics, data from different instruments are not directly comparable.

  4. RUBE: an XML-based architecture for 3D process modeling and model fusion

    NASA Astrophysics Data System (ADS)

    Fishwick, Paul A.

    2002-07-01

    Information fusion is a critical problem for science and engineering. There is a need to fuse information content specified as either data or model. We frame our work in terms of fusing dynamic and geometric models, to create an immersive environment where these models can be juxtaposed in 3D, within the same interface. The method by which this is accomplished fits well into other eXtensible Markup Language (XML) approaches to fusion in general. The task of modeling lies at the heart of the human-computer interface, joining the human to the system under study through a variety of sensory modalities. I overview modeling as a key concern for the Defense Department and the Air Force, and then follow with a discussion of past, current, and future work. Past work began with a package written in C and has progressed, in current work, to an implementation in XML. Our current work is defined within the RUBE architecture, which is detailed in subsequent papers devoted to key components. We have built RUBE as a next generation modeling framework using our prior software, with research opportunities in immersive 3D and tangible user interfaces.

  5. A novel architecture for information retrieval system based on semantic web

    NASA Astrophysics Data System (ADS)

    Zhang, Hui

    2011-12-01

    Nowadays, the web has enabled an explosive growth of information sharing (there are currently over 4 billion pages covering most areas of human endeavor), so that the web now faces a new challenge of information overload. The challenge before us is not only to help people locate relevant information precisely but also to access and aggregate a variety of information from different resources automatically. Current web documents are in human-oriented formats; they are suitable for presentation, but machines cannot understand the meaning of the documents. To address this issue, Berners-Lee proposed the concept of the semantic web. With semantic web technology, web information can be understood and processed by machines. It provides new possibilities for automatic web information processing. A main problem of semantic web information retrieval is that when there is not enough knowledge in such an information retrieval system, the system will return a large number of meaningless results to users due to the huge volume of information retrieved. In this paper, we present the architecture of an information retrieval system based on the semantic web. In addition, our system employs an inference engine to check whether a query should be posed to the keyword-based search engine or to the semantic search engine.

  6. The Effectiveness of the Thesaurus Method in Automatic Information Retrieval. Technical Report No. 75-261.

    ERIC Educational Resources Information Center

    Yu, C. T.; Salton, G.

    Formal proofs are given of the effectiveness under well-defined conditions of the thesaurus method in information retrieval. It is shown, in particular, that when certain semantically related terms are added to the information queries originally submitted by the user population, a superior retrieval system is obtained in the sense that for every…

  7. Information content of visible and midinfrared radiances for retrieving tropical ice cloud properties

    NASA Astrophysics Data System (ADS)

    Chang, Kai-Wei; L'Ecuyer, Tristan S.; Kahn, Brian H.; Natraj, Vijay

    2017-05-01

    Hyperspectral instruments such as Atmospheric Infrared Sounder (AIRS) have spectrally dense observations effective for ice cloud retrievals. However, due to the large number of channels, only a small subset is typically used. It is crucial that this subset of channels be chosen to contain the maximum possible information about the retrieved variables. This study describes an information content analysis designed to select optimal channels for ice cloud retrievals. To account for variations in ice cloud properties, we perform channel selection over an ensemble of cloud regimes, extracted with a clustering algorithm, from a multiyear database at a tropical Atmospheric Radiation Measurement site. Multiple satellite viewing angles over land and ocean surfaces are considered to simulate the variations in observation scenarios. The results suggest that AIRS channels near wavelengths of 14, 10.4, 4.2, and 3.8 μm contain the most information. With an eye toward developing a joint AIRS-MODIS (Moderate Resolution Imaging Spectroradiometer) retrieval, the analysis is also applied to combined measurements from both instruments. While application of this method to MODIS yields results consistent with previous channel sensitivity studies, the analysis shows that this combination may yield substantial improvement in cloud retrievals. MODIS provides most information on optical thickness and particle size, aided by a better constraint on cloud vertical placement from AIRS. An alternate scenario where cloud top boundaries are supplied by the active sensors in the A-train is also explored. The more robust cloud placement afforded by active sensors shifts the optimal channels toward the window region and shortwave infrared, further constraining optical thickness and particle size.
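
    A hedged sketch of the kind of information-content channel selection described above, using Rodgers-style optimal-estimation metrics; the Jacobian, prior covariance, and noise values below are synthetic placeholders rather than AIRS quantities:

```python
import numpy as np

rng = np.random.default_rng(0)
n_channels, n_state = 200, 3                # e.g. optical thickness, particle size, cloud top
K = rng.normal(size=(n_channels, n_state))  # toy Jacobian d(radiance)/d(state)
Sa = np.diag([1.0, 0.5, 2.0])               # a priori covariance of the state
noise = 0.1 * np.ones(n_channels)           # per-channel measurement noise (1-sigma)

def shannon_info(selected):
    """H = 0.5 * [ln det(Sa) - ln det(S_posterior)] for a channel subset."""
    Ks = K[selected]
    Se_inv = np.diag(1.0 / noise[selected] ** 2)
    S_post = np.linalg.inv(np.linalg.inv(Sa) + Ks.T @ Se_inv @ Ks)
    _, logdet_post = np.linalg.slogdet(S_post)
    _, logdet_prior = np.linalg.slogdet(Sa)
    return 0.5 * (logdet_prior - logdet_post)

# Greedy selection: repeatedly add the channel that gains the most information.
selected, remaining = [], list(range(n_channels))
for _ in range(10):
    best = max(remaining, key=lambda c: shannon_info(selected + [c]))
    selected.append(best)
    remaining.remove(best)

print("selected channels:", selected)
print("total Shannon information (nats):", round(shannon_info(selected), 2))
```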

  8. Automatic Content Analysis; Part I of Scientific Report No. ISR-18, Information Storage and Retrieval...

    ERIC Educational Resources Information Center

    Cornell Univ., Ithaca, NY. Dept. of Computer Science.

    Four papers are included in Part One of the eighteenth report on Salton's Magical Automatic Retriever of Texts (SMART) project. The first paper: "Content Analysis in Information Retrieval" by S. F. Weiss presents the results of experiments aimed at determining the conditions under which content analysis improves retrieval results as well…

  9. [Design and implementation of medical instrument standard information retrieval system based on APS.NET].

    PubMed

    Yu, Kaijun

    2010-07-01

    This paper analyzes the design goals of a medical instrumentation standard information retrieval system. Based on the B/S structure, we established a medical instrumentation standard retrieval system in the .NET environment with the ASP.NET C# programming language, IIS Web server, and SQL Server 2000 database. The paper also introduces the system structure, the retrieval system modules, the system development environment, and the detailed design of the system.

  10. XML DTD and Schemas for HDF-EOS

    NASA Technical Reports Server (NTRS)

    Ullman, Richard; Yang, Jingli

    2008-01-01

    An Extensible Markup Language (XML) document type definition (DTD) standard for the structure and contents of HDF-EOS files, and an equivalent standard in the form of schemas, have been developed.

  11. Conjunctive patches subspace learning with side information for collaborative image retrieval.

    PubMed

    Zhang, Lining; Wang, Lipo; Lin, Weisi

    2012-08-01

    Content-Based Image Retrieval (CBIR) has attracted substantial attention during the past few years for its potential practical applications to image management. A variety of Relevance Feedback (RF) schemes have been designed to bridge the semantic gap between the low-level visual features and the high-level semantic concepts for an image retrieval task. Various Collaborative Image Retrieval (CIR) schemes aim to utilize the user historical feedback log data with similar and dissimilar pairwise constraints to improve the performance of a CBIR system. However, existing subspace learning approaches with explicit label information cannot be applied for a CIR task, although the subspace learning techniques play a key role in various computer vision tasks, e.g., face recognition and image classification. In this paper, we propose a novel subspace learning framework, i.e., Conjunctive Patches Subspace Learning (CPSL) with side information, for learning an effective semantic subspace by exploiting the user historical feedback log data for a CIR task. The CPSL can effectively integrate the discriminative information of labeled log images, the geometrical information of labeled log images and the weakly similar information of unlabeled images together to learn a reliable subspace. We formally formulate this problem into a constrained optimization problem and then present a new subspace learning technique to exploit the user historical feedback log data. Extensive experiments on both synthetic data sets and a real-world image database demonstrate the effectiveness of the proposed scheme in improving the performance of a CBIR system by exploiting the user historical feedback log data.

  12. Information object definition-based unified modeling language representation of DICOM structured reporting: a case study of transcoding DICOM to XML.

    PubMed

    Tirado-Ramos, Alfredo; Hu, Jingkun; Lee, K P

    2002-01-01

    Supplement 23 to DICOM (Digital Imaging and Communications for Medicine), Structured Reporting, is a specification that supports a semantically rich representation of image and waveform content, enabling experts to share image and related patient information. DICOM SR supports the representation of textual and coded data linked to images and waveforms. Nevertheless, the medical information technology community needs models that work as bridges between the DICOM relational model and open object-oriented technologies. The authors assert that representations of the DICOM Structured Reporting standard, using object-oriented modeling languages such as the Unified Modeling Language, can provide a high-level reference view of the semantically rich framework of DICOM and its complex structures. They have produced an object-oriented model to represent the DICOM SR standard and have derived XML-exchangeable representations of this model using World Wide Web Consortium specifications. They expect the model to benefit developers and system architects who are interested in developing applications that are compliant with the DICOM SR specification.

  13. Secure quantum private information retrieval using phase-encoded queries

    NASA Astrophysics Data System (ADS)

    Olejnik, Lukasz

    2011-08-01

    We propose a quantum solution to the classical private information retrieval (PIR) problem, which allows one to query a database in a private manner. The protocol offers privacy thresholds and allows the user to obtain information from a database in a way that offers the potential adversary, in this model the database owner, no possibility of deterministically establishing the query contents. This protocol may also be viewed as a solution to the symmetrically private information retrieval problem in that it can offer database security (inability for a querying user to steal its contents). Compared to classical solutions, the protocol offers substantial improvement in terms of communication complexity. In comparison with the recent quantum private queries protocol [Phys. Rev. Lett. 100, 230502 (2008)], it is more efficient in terms of communication complexity and the number of rounds, while offering a clear privacy parameter. We discuss the security of the protocol, analyze its strengths, and conclude that using this technique makes it challenging to obtain the unconditional (in the information-theoretic sense) degree of privacy; nevertheless, in addition to being simple, the protocol still offers a level of privacy. The oracle used in the protocol is inspired both by the classical computational PIR solutions as well as the Deutsch-Jozsa oracle.

  14. Internet-based data interchange with XML

    NASA Astrophysics Data System (ADS)

    Fuerst, Karl; Schmidt, Thomas

    2000-12-01

    In this paper, a complete concept for Internet Electronic Data Interchange (EDI) - a well-known buzzword in the area of logistics and supply chain management to enable the automation of the interactions between companies and their partners - using XML (eXtensible Markup Language) will be proposed. This approach is based on Internet and XML, because the implementation of traditional EDI (e.g. EDIFACT, ANSI X.12) is mostly too costly for small and medium sized enterprises, which want to integrate their suppliers and customers in a supply chain. The paper will also present the results of the implementation of a prototype for such a system, which has been developed for an industrial partner to improve the current situation of parts delivery. The main functions of this system are an early warning system to detect problems during the parts delivery process as early as possible, and a transport following system to pursue the transportation.

  15. Information Content and Sensitivity of the 3β+2α Lidar Measurement System for Microphysical Retrievals

    NASA Astrophysics Data System (ADS)

    Burton, S. P.; Liu, X.; Chemyakin, E.; Hostetler, C. A.; Stamnes, S.; Moore, R.; Sawamura, P.; Ferrare, R. A.; Knobelspiesse, K. D.

    2015-12-01

    There is considerable interest in retrieving aerosol effective radius, number concentration and refractive index from lidar measurements of extinction and backscatter at several wavelengths. The 3 backscatter + 2 extinction (3β+2α) combination is particularly important since the planned NASA Aerosol-Clouds-Ecosystem (ACE) mission recommends this combination of measurements. The 2nd-generation NASA Langley airborne High Spectral Resolution Lidar (HSRL-2) has been making 3β+2α measurements since 2012. Here we develop a deeper understanding of the information content and sensitivities of the 3β+2α system in terms of aerosol microphysical parameters of interest. We determine best case results using a retrieval-free methodology. We calculate information content and uncertainty metrics from Optimal Estimation techniques using only a simplified forward model look-up table, with no explicit inversion. Simplifications include spherical particles, mono-modal log-normal size distributions, and wavelength-independent refractive indices. Since we only use the forward model with no retrieval, our results are applicable as a best case for all existing retrievals. Retrieval-dependent errors due to mismatch between the assumptions and true atmospheric aerosols are not included. The sensitivity metrics allow for identifying (1) information content of the measurements versus a priori information; (2) best-case error bars on the retrieved parameters; and (3) potential sources of cross-talk or "compensating" errors wherein different retrieval parameters are not independently captured by the measurements. These results suggest that even in the best case, this retrieval system is underdetermined. Recommendations are given for addressing cross-talk between effective radius and number concentration. A potential solution to the under-determination problem is a combined active (lidar) and passive (polarimeter) retrieval, which is the subject of a new funded NASA project by our team.
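
    A small numpy sketch, in the same retrieval-free spirit, of the optimal-estimation metrics mentioned above (averaging kernel, degrees of freedom, best-case error bars, and cross-talk); the Jacobian and covariances are made-up numbers, not the HSRL-2 look-up table:

```python
import numpy as np

params = ["effective_radius", "number_concentration", "refractive_index"]
K = np.array([[0.8, 0.1, 0.3],   # 3 backscatter channels (toy sensitivities)
              [0.7, 0.2, 0.4],
              [0.6, 0.2, 0.5],
              [0.3, 0.9, 0.2],   # 2 extinction channels
              [0.2, 0.8, 0.3]])
Sa = np.eye(3)                                     # a priori covariance (normalized units)
Se_inv = np.linalg.inv(np.diag([0.05] * 5) ** 2)   # inverse measurement error covariance

S_post = np.linalg.inv(np.linalg.inv(Sa) + K.T @ Se_inv @ K)
A = S_post @ K.T @ Se_inv @ K                      # averaging kernel

print("degrees of freedom for signal:", round(np.trace(A), 2))
for name, var in zip(params, np.diag(S_post)):
    print(f"best-case 1-sigma uncertainty for {name}: {np.sqrt(var):.3f}")

# Large off-diagonal posterior correlations flag cross-talk, i.e. parameters the
# measurements cannot constrain independently of one another.
corr = S_post / np.sqrt(np.outer(np.diag(S_post), np.diag(S_post)))
print(np.round(corr, 2))
```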

  16. Information retrieval from black holes

    NASA Astrophysics Data System (ADS)

    Lochan, Kinjalk; Chakraborty, Sumanta; Padmanabhan, T.

    2016-08-01

    It is generally believed that, when matter collapses to form a black hole, the complete information about the initial state of the matter cannot be retrieved by future asymptotic observers, through local measurements. This is contrary to the expectation from a unitary evolution in quantum theory and leads to (a version of) the black hole information paradox. Classically, nothing else, apart from mass, charge, and angular momentum is expected to be revealed to such asymptotic observers after the formation of a black hole. Semiclassically, black holes evaporate after their formation through the Hawking radiation. The dominant part of the radiation is expected to be thermal and hence one cannot know anything about the initial data from the resultant radiation. However, there can be sources of distortions which make the radiation nonthermal. Although the distortions are not strong enough to make the evolution unitary, these distortions carry some part of information regarding the in-state. In this work, we show how one can decipher the information about the in-state of the field from these distortions. We show that the distortions of a particular kind—which we call nonvacuum distortions—can be used to fully reconstruct the initial data. The asymptotic observer can do this operationally by measuring certain well-defined observables of the quantum field at late times. We demonstrate that a general class of in-states encode all their information content in the correlation of late time out-going modes. Further, using a 1 +1 dimensional dilatonic black hole model to accommodate backreaction self-consistently, we show that observers can also infer and track the information content about the initial data, during the course of evaporation, unambiguously. Implications of such information extraction are discussed.

  17. Rock.XML - Towards a library of rock physics models

    NASA Astrophysics Data System (ADS)

    Jensen, Erling Hugo; Hauge, Ragnar; Ulvmoen, Marit; Johansen, Tor Arne; Drottning, Åsmund

    2016-08-01

    Rock physics modelling provides tools for correlating physical properties of rocks and their constituents to the geophysical observations we measure on a larger scale. Many different theoretical and empirical models exist, to cover the range of different types of rocks. However, upon reviewing these, we see that they are all built around a few main concepts. Based on this observation, we propose a format for digitally storing the specifications for rock physics models which we have named Rock.XML. It does not only contain data about the various constituents, but also the theories and how they are used to combine these building blocks to make a representative model for a particular rock. The format is based on the Extensible Markup Language XML, making it flexible enough to handle complex models as well as scalable towards extending it with new theories and models. This technology has great advantages as far as documenting and exchanging models in an unambiguous way between people and between software. Rock.XML can become a platform for creating a library of rock physics models; making them more accessible to everyone.
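
    As a sketch of what such a model description could look like, the snippet below builds a small XML document in the spirit of Rock.XML; the element and attribute names are illustrative assumptions, not the published schema:

```python
import xml.etree.ElementTree as ET

rock = ET.Element("rock", name="shaly-sandstone-example")

constituents = ET.SubElement(rock, "constituents")
ET.SubElement(constituents, "mineral", name="quartz", bulkModulus="36.6",
              shearModulus="45.0", density="2.65", fraction="0.8")
ET.SubElement(constituents, "mineral", name="clay", bulkModulus="21.0",
              shearModulus="7.0", density="2.58", fraction="0.2")
ET.SubElement(constituents, "fluid", name="brine", bulkModulus="2.8", density="1.09")

# The theory element records how the building blocks are combined, so that
# software can rebuild the same effective-medium model unambiguously.
theory = ET.SubElement(rock, "theory", name="hertz-mindlin-plus-gassmann")
ET.SubElement(theory, "parameter", name="porosity", value="0.25")
ET.SubElement(theory, "parameter", name="effectivePressure", value="20e6")

ET.indent(rock)                       # pretty-print (Python 3.9+)
print(ET.tostring(rock, encoding="unicode"))
```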

  18. Information retrieval for the Cochrane systematic reviews: the case of breast cancer surgery.

    PubMed

    Cognetti, Gaetana; Grossi, Laura; Lucon, Antonio; Solimini, Renata

    2015-01-01

    Systematic reviews are fundamental sources of knowledge on the state-of-the-art interventions for various clinical problems. One of the essential components in carrying out a systematic review is that of developing a comprehensive literature search. Three Cochrane systematic reviews published in 2012 were retrieved using the MeSH descriptor breast neoplasms/surgery, and analyzed with respect to the information sources used and the search strategies adopted. In March 2014, an update of one of the reviews retrieved was also considered in the study. The number of databases queried for each review ranged between three and seven. All the reviews reported the search strategies adopted, though some only partially. All the reviews explicitly claimed that the searches applied no language restriction, although sources such as the free database Lilacs (in Spanish and Portuguese) were not consulted. To improve quality, it is necessary to apply standards in carrying out systematic reviews (as laid down in the MECIR project). To meet these standards concerning literature searching, professional information retrieval specialist staff should be involved. The peer review committee in charge of evaluating the publication of a systematic review should also include specialists in information retrieval for assessing the quality of the literature search.

  19. The man/machine interface in information retrieval: Providing access to the casual user

    NASA Technical Reports Server (NTRS)

    Dominick, Wayne D. (Editor); Granier, Martin

    1984-01-01

    This study is concerned with the difficulties encountered by casual users wishing to employ Information Storage and Retrieval Systems. A casual user is defined as a professional who has neither time nor desire to pursue in depth the study of the numerous and varied retrieval systems. His needs for on-line search are only occasional, and not limited to any particular system. The paper takes a close look at the state of the art of research concerned with aiding casual users of Information Storage and Retrieval Systems. Current experiments such as LEXIS, CONIT, IIDA, CITE, and CCL are presented and discussed. Comments and proposals are offered, specifically in the areas of training, learning and cost as experienced by the casual user. An extensive bibliography of recent works on the subject follows the text.

  20. Retrieval of land cover information under thin fog in Landsat TM image

    NASA Astrophysics Data System (ADS)

    Wei, Yuchun

    2008-04-01

    Thin fog, which often appears in remote sensing images of subtropical climate regions, results in low image quality and poor image mapping. It is therefore necessary to develop image processing methods that retrieve land cover information under thin fog. In this paper, a Landsat TM image near Taihu Lake, in the subtropical climate zone of China, was used as an example, and a workflow and method for retrieving land cover information under thin fog were built based on ENVI software and a single TM image. The basic workflow covers three parts: 1) isolating the thin-fog area in the image according to the spectral differences between bands; 2) retrieving the visible-band information of different land cover types under thin fog from the near-infrared bands, according to the relationships between near-infrared and visible bands for the same land cover types in the fog-free area; 3) image post-processing. The results showed that the method is simple and practical, and can be used to improve the quality of TM image mapping effectively.
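
    A schematic numpy version of that three-part workflow, with synthetic arrays standing in for the TM bands; the band relationships, thresholds, and the simple linear model are assumptions for illustration only:

```python
import numpy as np

rng = np.random.default_rng(1)
shape = (100, 100)
surface = rng.uniform(0.05, 0.30, shape)            # "true" surface reflectance
nir = 0.8 * surface + rng.normal(0, 0.005, shape)   # NIR band: little affected by thin fog
red = surface + rng.normal(0, 0.005, shape)         # visible band without fog
fog = np.zeros(shape)
fog[30:60, 20:70] = 0.12                            # thin fog brightens the visible band
red_observed = red + fog

# 1) Isolate the thin-fog area from the spectral difference between bands.
fog_mask = (red_observed - nir) > 0.10

# 2) Fit the NIR-to-visible relationship on fog-free pixels only.
slope, intercept = np.polyfit(nir[~fog_mask], red_observed[~fog_mask], deg=1)

# 3) Predict the visible-band land cover signal under the fog from the NIR band.
red_restored = red_observed.copy()
red_restored[fog_mask] = slope * nir[fog_mask] + intercept

err = np.abs(red_restored[fog_mask] - red[fog_mask]).mean()
print(f"fog pixels: {fog_mask.sum()}, mean absolute restoration error: {err:.4f}")
```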

  1. Image retrieval by information fusion based on scalable vocabulary tree and robust Hausdorff distance

    NASA Astrophysics Data System (ADS)

    Che, Chang; Yu, Xiaoyang; Sun, Xiaoming; Yu, Boyang

    2017-12-01

    In recent years, Scalable Vocabulary Tree (SVT) has been shown to be effective in image retrieval. However, for general images where the foreground is the object to be recognized while the background is cluttered, the performance of the current SVT framework is restricted. In this paper, a new image retrieval framework that incorporates a robust distance metric and information fusion is proposed, which improves the retrieval performance relative to the baseline SVT approach. First, the visual words that represent the background are diminished by using a robust Hausdorff distance between different images. Second, image matching results based on three image signature representations are fused, which enhances the retrieval precision. We conducted intensive experiments on small-scale to large-scale image datasets: Corel-9, Corel-48, and PKU-198, where the proposed Hausdorff metric and information fusion outperforms the state-of-the-art methods by about 13, 15, and 15%, respectively.
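
    A short sketch of the "robust Hausdorff" idea: replace the classical maximum of nearest-neighbour distances, which cluttered background points can dominate, with a lower quantile. The quantile value and the synthetic point sets are assumptions, not the paper's settings:

```python
import numpy as np

def partial_hausdorff(A: np.ndarray, B: np.ndarray, quantile: float = 0.7) -> float:
    """Directed (partial) Hausdorff distance from point set A to point set B."""
    # nearest-neighbour distance in B for every point of A
    d = np.sqrt(((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)).min(axis=1)
    return float(np.quantile(d, quantile))

rng = np.random.default_rng(0)
A = rng.normal(size=(50, 2))                               # features of the query object
B = A + rng.normal(scale=0.05, size=(50, 2))               # matching object, slightly perturbed
B_cluttered = np.vstack([B, rng.uniform(-5, 5, (20, 2))])  # plus background clutter

print("classical (max) distance :", partial_hausdorff(B_cluttered, A, quantile=1.0))
print("robust (0.7-quantile)    :", partial_hausdorff(B_cluttered, A, quantile=0.7))
```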

  2. Using Metadata To Improve Organization and Information Retrieval on the WWW.

    ERIC Educational Resources Information Center

    Doan, Bich-Lien; Beigbeder, Michel; Girardot, Jean-Jacques; Jaillon, Philippe

    The growing volume of heterogeneous and distributed information on the World Wide Web has made it increasingly difficult for existing tools to retrieve relevant information. To improve the performance of these tools, this paper suggests how to handle two aspects of the problem. The first aspect concerns a better representation and description of…

  3. Information Retrieval for Education: Making Search Engines Language Aware

    ERIC Educational Resources Information Center

    Ott, Niels; Meurers, Detmar

    2010-01-01

    Search engines have been a major factor in making the web the successful and widely used information source it is today. Generally speaking, they make it possible to retrieve web pages on a topic specified by the keywords entered by the user. Yet web searching currently does not take into account which of the search results are comprehensible for…

  4. Controlled Retrieval of Specific Context Information in Children and Adults.

    PubMed

    Lorsbach, Thomas C; Friehe, Mary J; Teten, Amy Fair; Reimer, Jason F; Armendarez, Joseph J

    2015-01-01

    This study adapted a procedure used by Luo and Craik (2009) to examine whether developmental differences exist in the ability to use controlled retrieval processes to access the contextual details of memory representations. Participants from 3 age groups (mean ages 9, 12, and 25 years) were presented with words in 3 study contexts: with a black-and-white picture, with a color picture, or alone without a picture. Six recognition tests were then presented that varied in the demands (high or low) placed on the retrieval of specific contextual information. Each test consisted of a mixture of words that were old targets from 1 study context, distractors (i.e., previously studied words from a different context), and completely new words. A high-specificity and a low-specificity test list were paired with each test question, with high and low specificity being determined by the nature of the distractors used in a test list. High-specificity tests contained words that were studied in similar contexts: old targets (e.g., words studied with black-and-white pictures) and distractors (e.g., words studied with color pictures). In contrast, low-specificity tests contained words that were studied in dissimilar contexts: old targets (e.g., words studied with black-and-white pictures) and distractors (e.g., words previously studied without a picture). Relative to low-specificity tests, the retrieval conditions of high-specificity tests were assumed to place greater demands on the controlled access of specific contextual information. Analysis of recollection scores revealed that age differences were present on high- but not low-specificity tests, with the performance of 9-year-olds disproportionately affected by the retrieval demands of high-specificity tests.

  5. Relevance similarity: an alternative means to monitor information retrieval systems

    PubMed Central

    Dong, Peng; Loh, Marie; Mondry, Adrian

    2005-01-01

    Background Relevance assessment is a major problem in the evaluation of information retrieval systems. The work presented here introduces a new parameter, "Relevance Similarity", for the measurement of the variation of relevance assessment. In a situation where individual assessment can be compared with a gold standard, this parameter is used to study the effect of such variation on the performance of a medical information retrieval system. In such a setting, Relevance Similarity is the ratio of assessors who rank a given document same as the gold standard over the total number of assessors in the group. Methods The study was carried out on a collection of Critically Appraised Topics (CATs). Twelve volunteers were divided into two groups of people according to their domain knowledge. They assessed the relevance of retrieved topics obtained by querying a meta-search engine with ten keywords related to medical science. Their assessments were compared to the gold standard assessment, and Relevance Similarities were calculated as the ratio of positive concordance with the gold standard for each topic. Results The similarity comparison among groups showed that a higher degree of agreements exists among evaluators with more subject knowledge. The performance of the retrieval system was not significantly different as a result of the variations in relevance assessment in this particular query set. Conclusion In assessment situations where evaluators can be compared to a gold standard, Relevance Similarity provides an alternative evaluation technique to the commonly used kappa scores, which may give paradoxically low scores in highly biased situations such as document repositories containing large quantities of relevant data. PMID:16029513
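
    The parameter itself is a simple ratio; a direct transcription of the definition above (with made-up judgements) might look like this:

```python
def relevance_similarity(assessor_judgements, gold_standard):
    """Fraction of assessors whose relevance judgement matches the gold standard."""
    if not assessor_judgements:
        return 0.0
    return sum(j == gold_standard for j in assessor_judgements) / len(assessor_judgements)

# Twelve assessors judging one Critically Appraised Topic (True = relevant):
judgements = [True, True, False, True, True, True, False, True, True, True, True, False]
print(relevance_similarity(judgements, gold_standard=True))   # 0.75
```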

  6. Retrieval and Sleep Both Counteract the Forgetting of Spatial Information

    ERIC Educational Resources Information Center

    Antony, James W.; Paller, Ken A.

    2018-01-01

    Repeatedly studying information is a good way to strengthen memory storage. Nevertheless, testing recall often produces superior long-term retention. Demonstrations of this testing effect, typically with verbal stimuli, have shown that repeated retrieval through testing reduces forgetting. Sleep also benefits memory storage, perhaps through…

  7. Tomcat, Oracle & XML Web Archive

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Cothren, D. C.

    2008-01-01

    The TOX (Tomcat Oracle & XML) web archive is a foundation for development of HTTP-based applications using Tomcat (or some other servlet container) and an Oracle RDBMS. Use of TOX requires coding primarily in PL/SQL, JavaScript, and XSLT, but also in HTML, CSS and potentially Java. Coded in Java and PL/SQL itself, TOX provides the foundation for more complex applications to be built.

  8. Retrieval practice is an efficient method of enhancing the retention of anatomy and physiology information.

    PubMed

    Dobson, John L

    2013-06-01

    Although a great deal of empirical evidence has indicated that retrieval practice is an effective means of promoting learning and memory, very few studies have investigated the strategy in the context of an actual class. The primary purpose of this study was to determine if a series of very brief retrieval quizzes could significantly improve the retention of previously tested information throughout an anatomy and physiology course. A second purpose was to determine if there were any significant differences between expanding and uniform patterns of retrieval that followed a standardized initial retrieval delay. Anatomy and physiology students were assigned to either a control group or groups that were repeatedly prompted to retrieve a subset of previously tested course information via a series of quizzes that were administered on either an expanding or a uniform schedule. Each retrieval group completed a total of 10 retrieval quizzes, and the series of quizzes required (only) a total of 2 h to complete. Final retention of the exam subset material was assessed during the last week of the semester. There were no significant differences between the expanding and uniform retrieval groups, but both retained an average of 41% more of the subset material than did the control group (ANOVA, F = 129.8, P = 0.00, ηp² = 0.36). In conclusion, retrieval practice is a highly efficient and effective strategy for enhancing the retention of anatomy and physiology material.

  9. The carbohydrate sequence markup language (CabosML): an XML description of carbohydrate structures.

    PubMed

    Kikuchi, Norihiro; Kameyama, Akihiko; Nakaya, Shuuichi; Ito, Hiromi; Sato, Takashi; Shikanai, Toshihide; Takahashi, Yoriko; Narimatsu, Hisashi

    2005-04-15

    Bioinformatics resources for glycomics are very poor as compared with those for genomics and proteomics. The complexity of carbohydrate sequences makes it difficult to define a common language to represent them, and the development of bioinformatics tools for glycomics has not progressed. In this study, we developed a carbohydrate sequence markup language (CabosML), an XML description of carbohydrate structures. The language definition (XML Schema) and an experimental database of carbohydrate structures using an XML database management system are available at http://www.phoenix.hydra.mki.co.jp/CabosDemo.html kikuchi@hydra.mki.co.jp.

  10. Information Retrieval from SAGE II and MFRSR Multi-Spectral Extinction Measurements

    NASA Technical Reports Server (NTRS)

    Lacis, Andrew A.; Hansen, James E. (Technical Monitor)

    2001-01-01

    Direct beam spectral extinction measurements of solar radiation contain important information on atmospheric composition in a form that is essentially free from multiple scattering contributions that otherwise tend to complicate the data analysis and information retrieval. Such direct beam extinction measurements are available from the solar occultation satellite-based measurements made by the Stratospheric and Aerosol Gas Experiment (SAGE II) instrument and by ground-based Multi-Filter Shadowband Radiometers (MFRSRs). The SAGE II data provide cross-sectional slices of the atmosphere twice per orbit at seven wavelengths between 385 and 1020 nm with approximately 1 km vertical resolution, while the MFRSR data provide atmospheric column measurements at six wavelengths between 415 and 940 nm but at one minute time intervals. We apply the same retrieval technique of simultaneous least-squares fit to the observed spectral extinctions to retrieve aerosol optical depth, effective radius and variance, and ozone, nitrogen dioxide, and water vapor amounts from the SAGE II and MFRSR measurements. The retrieval technique utilizes a physical model approach based on laboratory measurements of ozone and nitrogen dioxide extinction, line-by-line and numerical k-distribution calculations for water vapor absorption, and Mie scattering constraints on aerosol spectral extinction properties. The SAGE II measurements have the advantage of being self-calibrating in that deep space provides an effective zero point for the relative spectral extinctions. The MFRSR measurements require periodic clear-day Langley regression calibration events to maintain accurate knowledge of instrument calibration.
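
    A schematic of a simultaneous spectral least-squares fit in the same spirit; the toy forward model (an Angstrom-law aerosol term plus one gas cross-section) and all numbers below are placeholders for the full physical model applied to SAGE II and MFRSR data:

```python
import numpy as np
from scipy.optimize import least_squares

wavelengths = np.array([385., 448., 525., 600., 940., 1020.])   # nm (illustrative)
gas_xsec = np.array([0.02, 0.30, 0.80, 0.60, 0.05, 0.01])        # toy gas cross-sections

def forward(state):
    tau_550, angstrom, gas_amount = state
    return tau_550 * (wavelengths / 550.0) ** (-angstrom) + gas_amount * gas_xsec

# Synthetic "observed" extinctions from a known true state plus noise.
rng = np.random.default_rng(2)
observed = forward([0.12, 1.3, 0.30]) + rng.normal(0, 0.002, wavelengths.size)

fit = least_squares(lambda s: forward(s) - observed, x0=[0.05, 1.0, 0.1],
                    bounds=([0, 0, 0], [5, 4, 5]))
print("retrieved [tau_550, Angstrom exponent, gas amount]:", np.round(fit.x, 3))
```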

  11. Overcoming Terminology Barrier Using Web Resources for Cross-Language Medical Information Retrieval

    PubMed Central

    Lu, Wen-Hsiang; Lin, Ray Shih-Jui; Chan, Yi-Che; Chen, Kuan-Hsi

    2006-01-01

    A number of authoritative medical websites, such as PubMed and MedlinePlus, provide consumers with the most up-to-date health information. However, non-English speakers often encounter not only language barriers (from other languages to English) but also terminology barriers (from laypersons’ terms to professional medical terms) when retrieving information from these websites. Our previous work addresses language barriers by developing a multilingual medical thesaurus, Chinese-English MeSH, while this study presents an approach to overcome terminology barriers based on Web resources. Two techniques were utilized in our approach: monolingual concept mapping using approximate string matching and crosslingual concept mapping using Web resources. The evaluation shows that our approach can significantly improve the performance on MeSH concept mapping and cross-language medical information retrieval. PMID:17238395
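
    A hedged sketch of the first technique (monolingual concept mapping by approximate string matching); the tiny vocabulary and the fallback behaviour are illustrative only, and the real system additionally consults Web resources for the cross-lingual step:

```python
from difflib import SequenceMatcher

MESH_LIKE_TERMS = ["Myocardial Infarction", "Hypertension", "Diabetes Mellitus",
                   "Cerebrovascular Accident", "Gastroesophageal Reflux"]

def best_concept(lay_term, vocabulary=MESH_LIKE_TERMS):
    """Return (term, similarity) for the vocabulary entry closest to the input term."""
    scored = [(t, SequenceMatcher(None, lay_term.lower(), t.lower()).ratio())
              for t in vocabulary]
    return max(scored, key=lambda pair: pair[1])

print(best_concept("myocardial infarct"))   # strong match to "Myocardial Infarction"
print(best_concept("heart attack"))         # weak match -> fall back to Web resources
```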

  12. A Location Aware Middleware Framework for Collaborative Visual Information Discovery and Retrieval

    DTIC Science & Technology

    2017-09-14

    Air Force Institute of Technology thesis by Andrew J. M. Compton (available via AFIT Scholar, https://scholar.afit.edu/etd).

  13. Do Online Information Retrieval Systems Help Experienced Clinicians Answer Clinical Questions?

    PubMed Central

    Westbrook, Johanna I.; Coiera, Enrico W.; Gosling, A. Sophie

    2005-01-01

    Objective: To assess the impact of clinicians' use of an online information retrieval system on their performance in answering clinical questions. Design: Pre-/post-intervention experimental design. Measurements: In a computer laboratory, 75 clinicians (26 hospital-based doctors, 18 family practitioners, and 31 clinical nurse consultants) provided 600 answers to eight clinical scenarios before and after the use of an online information retrieval system. We examined the proportion of correct answers pre- and post-intervention, direction of change in answers, and differences between professional groups. Results: System use resulted in a 21% improvement in clinicians' answers, from 29% (95% confidence interval [CI] 25.4–32.6) correct pre- to 50% (95% CI 46.0–54.0) post-system use. In 33% (95% CI 29.1–36.9) answers were changed from incorrect to correct. In 21% (95% CI 17.1–23.9) correct pre-test answers were supported by evidence found using the system, and in 7% (95% CI 4.9–9.1) correct pre-test answers were changed incorrectly. For 40% (35.4–43.6) of scenarios, incorrect pre-test answers were not rectified following system use. Despite significant differences in professional groups' pre-test scores [family practitioners: 41% (95% CI 33.0–49.0), hospital doctors: 35% (95% CI 28.5–41.2), and clinical nurse consultants: 17% (95% CI 12.3–21.7; χ2 = 29.0, df = 2, p < 0.01)], there was no difference in post-test scores. (χ2 = 2.6, df = 2, p = 0.73). Conclusions: The use of an online information retrieval system was associated with a significant improvement in the quality of answers provided by clinicians to typical clinical problems. In a small proportion of cases, use of the system produced errors. While there was variation in the performance of clinical groups when answering questions unaided, performance did not differ significantly following system use. Online information retrieval systems can be an effective tool in improving the accuracy of

  14. An overview of the National Space Science data Center Standard Information Retrieval System (SIRS)

    NASA Technical Reports Server (NTRS)

    Shapiro, A.; Blecher, S.; Verson, E. E.; King, M. L. (Editor)

    1974-01-01

    A general overview is given of the National Space Science Data Center (NSSDC) Standard Information Retrieval System (SIRS). Described, in general terms, are the information system that contains the data files and the software system that processes and manipulates the files maintained at the Data Center. Emphasis is placed on providing users with an overview of the capabilities and uses of SIRS. Examples given are taken from the files at the Data Center. Detailed information about NSSDC data files is documented in a set of File Users Guides, with one user's guide prepared for each file processed by SIRS. Detailed information about SIRS is presented in the SIRS Users Guide.

  15. Competitive retrieval is not a prerequisite for forgetting in the retrieval practice paradigm.

    PubMed

    Camp, Gino; Dalm, Sander

    2016-09-01

    Retrieving information from memory can lead to forgetting of other, related information. The inhibition account of this retrieval-induced forgetting effect predicts that this form of forgetting occurs when competition arises between the practiced information and the related information, leading to inhibition of the related information. In the standard retrieval practice paradigm, a retrieval practice task is used in which participants retrieve the items based on a category-plus-stem cue (e.g., FRUIT-or___). In the current experiment, participants instead generated the target based on a cue in which the first 2 letters of the target were transposed (e.g., FRUIT-roange). This noncompetitive task also induced forgetting of unpracticed items from practiced categories. This finding is inconsistent with the inhibition account, which asserts that the forgetting effect depends on competitive retrieval. We argue that interference-based accounts of forgetting and the context-based account of retrieval-induced forgetting can account for this result. (PsycINFO Database Record (c) 2016 APA, all rights reserved).

  16. An integrated information retrieval and document management system

    NASA Technical Reports Server (NTRS)

    Coles, L. Stephen; Alvarez, J. Fernando; Chen, James; Chen, William; Cheung, Lai-Mei; Clancy, Susan; Wong, Alexis

    1993-01-01

    This paper describes the requirements and prototype development for an intelligent document management and information retrieval system that will be capable of handling millions of pages of text or other data. Technologies for scanning, Optical Character Recognition (OCR), magneto-optical storage, and multiplatform retrieval using the Structured Query Language (SQL) will be discussed. The semantic ambiguity inherent in the English language is somewhat compensated for through the use of coefficients or weighting factors for partial synonyms. Such coefficients are used both for defining structured query trees for routine queries and for establishing long-term interest profiles that can be used on a regular basis to alert individual users to the presence of relevant documents that may have just arrived from an external source, such as a news wire service. Although this attempt at evidential reasoning is limited in comparison with the latest developments in AI Expert Systems technology, it has the advantage of being commercially available.

  17. The Effect of Bilingual Term List Size on Dictionary-Based Cross-Language Information Retrieval

    DTIC Science & Technology

    2003-02-01

    Bilingual term lists are extensively used as a resource for dictionary-based Cross-Language Information Retrieval (CLIR), in which the goal is to find documents written…

  18. Converting from XML to HDF-EOS

    NASA Technical Reports Server (NTRS)

    Ullman, Richard; Bane, Bob; Yang, Jingli

    2008-01-01

    A computer program recreates an HDF-EOS file from an Extensible Markup Language (XML) representation of the contents of that file. This program is one of two programs written to enable testing of the schemas described in the immediately preceding article to determine whether the schemas capture all details of HDF-EOS files.

  19. XTCE. XML Telemetry and Command Exchange Tutorial

    NASA Technical Reports Server (NTRS)

    Rice, Kevin; Kizzort, Brad; Simon, Jerry

    2010-01-01

    An XML Telemetry and Command Exchange (XTCE) tutorial oriented towards packets or minor frames is presented. The contents include: 1) The Basics; 2) Describing Telemetry; 3) Describing the Telemetry Format; 4) Commanding; 5) Forgotten Elements; 6) Implementing XTCE; and 7) GovSat.

  20. Information Visualization and Proposing New Interface for Movie Retrieval System (IMDB)

    ERIC Educational Resources Information Center

    Etemadpour, Ronak; Masood, Mona; Belaton, Bahari

    2010-01-01

    This research studies the development of a new prototype of visualization in support of movie retrieval. The goal of information visualization is unveiling of large amounts of data or abstract data set using visual presentation. With this knowledge the main goal is to develop a 2D presentation of information on movies from the IMDB (Internet Movie…

  1. Monetary incentives at retrieval promote recognition of involuntarily learned emotional information.

    PubMed

    Yan, Chunping; Li, Yunyun; Zhang, Qin; Cui, Lixia

    2018-03-07

    Previous studies have suggested that the effects of reward on memory processes are affected by certain factors, but it remains unclear whether the effects of reward at retrieval on recognition processes are influenced by emotion. The event-related potential was used to investigate the combined effect of reward and emotion on memory retrieval and its neural mechanism. The behavioral results indicated that the reward at retrieval improved recognition performance under positive and negative emotional conditions. The event-related potential results indicated that there were significant interactions between the reward and emotion in the average amplitude during recognition, and the significant reward effects from the frontal to parietal brain areas appeared at 130-800 ms for positive pictures and at 190-800 ms for negative pictures, but there were no significant reward effects of neutral pictures; the reward effect of positive items appeared relatively earlier, starting at 130 ms, and that of negative pictures began at 190 ms. These results indicate that monetary incentives at retrieval promote recognition of involuntarily learned emotional information.

  2. Interactive, Secure Web-enabled Aircraft Engine Simulation Using XML Databinding Integration

    NASA Technical Reports Server (NTRS)

    Lin, Risheng; Afjeh, Abdollah A.

    2003-01-01

    This paper discusses the detailed design of an XML databinding framework for aircraft engine simulation. The framework provides an object interface to access and use engine data, while at the same time preserving the meaning of the original data. The language-independent representation of engine component data enables users to move XML data through disparate networks using HTTP. The application of this framework is demonstrated via a web-based turbofan propulsion system simulation using the World Wide Web (WWW). A Java Servlet based web component architecture is used for rendering XML engine data into HTML format and dealing with input events from the user, which allows users to interact with simulation data from a web browser. The simulation data can also be saved to a local disk for archiving or to restart the simulation at a later time.

  3. A vector-product information retrieval system adapted to heterogeneous, distributed computing environments

    NASA Technical Reports Server (NTRS)

    Rorvig, Mark E.

    1991-01-01

    Vector-product information retrieval (IR) systems produce retrieval results superior to all other searching methods but presently have no commercial implementations beyond the personal computer environment. The NASA Electronic Library Systems (NELS) provides a ranked list of the most likely relevant objects in collections in response to a natural language query. Additionally, the system is constructed using standards and tools (Unix, X-Windows, Motif, and TCP/IP) that permit its operation in organizations that possess many different hosts, workstations, and platforms. There are no known commercial equivalents to this product at this time. The product has applications in all corporate management environments, particularly those that are information intensive, such as finance, manufacturing, biotechnology, and research and development.
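
    A minimal vector-product (vector space) ranking sketch of the kind of retrieval described; the documents are placeholders rather than the NELS collections, and TF-IDF weighting is an assumption about the term weights:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

documents = [
    "thermal protection tiles for orbiter re-entry testing",
    "budget planning for the research and development division",
    "materials testing of heat shield tiles under re-entry conditions",
    "biotechnology laboratory information management procedures",
]
query = "heat shield tile testing for re-entry"

vectorizer = TfidfVectorizer()
doc_vectors = vectorizer.fit_transform(documents)
query_vector = vectorizer.transform([query])

# Rank documents by the inner product (cosine similarity) with the query vector.
scores = cosine_similarity(query_vector, doc_vectors).ravel()
for rank, idx in enumerate(scores.argsort()[::-1], start=1):
    print(f"{rank}. score={scores[idx]:.2f}  {documents[idx]}")
```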

  4. INFOL for the CDC 6400 Information Storage and Retrieval System. Reference Manual.

    ERIC Educational Resources Information Center

    Mittman, B.; And Others

    INFOL for the CDC 6400 is a rewrite in FORTRAN IV of the CDC 3600/3800 INFOL (Information Oriented Language), a generalized information storage and retrieval system developed by the Control Data Corporation for the CDC 3600/3800 computer. With INFOL, selected pieces of information are extracted from a file and presented to the user quickly and…

  5. Linear information retrieval method in X-ray grating-based phase contrast imaging and its interchangeability with tomographic reconstruction

    NASA Astrophysics Data System (ADS)

    Wu, Z.; Gao, K.; Wang, Z. L.; Shao, Q. G.; Hu, R. F.; Wei, C. X.; Zan, G. B.; Wali, F.; Luo, R. H.; Zhu, P. P.; Tian, Y. C.

    2017-06-01

    In X-ray grating-based phase contrast imaging, information retrieval is necessary for quantitative research, especially for phase tomography. However, numerous and repetitive processes have to be performed for tomographic reconstruction. In this paper, we report a novel information retrieval method, which enables retrieving phase and absorption information by means of a linear combination of two mutually conjugate images. Thanks to the distributive law of multiplication and the commutative and associative laws of addition, the information retrieval can be performed after tomographic reconstruction, thus simplifying the information retrieval procedure dramatically. The theoretical model of this method is established in both parallel-beam geometry for the Talbot interferometer and fan-beam geometry for the Talbot-Lau interferometer. Numerical experiments are also performed to confirm the feasibility and validity of the proposed method. In addition, we discuss its applicability in cone-beam geometry and its advantages compared with other methods. Moreover, this method can also be employed in other differential phase contrast imaging methods, such as diffraction enhanced imaging, non-interferometric imaging, and edge illumination.

  6. XML Schema Versioning Policies Version 1.0

    NASA Astrophysics Data System (ADS)

    Harrison, Paul; Demleitner, Markus; Major, Brian; Dowler, Pat; Harrison, Paul

    2018-05-01

    This note describes the recommended practice for the evolution of IVOA standard XML schemata that are associated with IVOA standards. The criteria for deciding what might be considered major and minor changes and the policies for dealing with each case are described.

  7. voevent-parse: Parse, manipulate, and generate VOEvent XML packets

    NASA Astrophysics Data System (ADS)

    Staley, Tim D.

    2014-11-01

    voevent-parse, written in Python, parses, manipulates, and generates VOEvent XML packets; it is built atop lxml.objectify. Details of transients detected by many projects, including Fermi, Swift, and the Catalina Sky Survey, are currently made available as VOEvents, which is also the standard alert format by future facilities such as LSST and SKA. However, working with XML and adhering to the sometimes lengthy VOEvent schema can be a tricky process. voevent-parse provides convenience routines for common tasks, while allowing the user to utilise the full power of the lxml library when required. An earlier version of voevent-parse was part of the pysovo (ascl:1411.002) library.
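
    Since voevent-parse sits on top of lxml.objectify, the attribute-style access it relies on can be illustrated directly with objectify on a cut-down, made-up packet (namespaces and most mandatory VOEvent elements are omitted; this is not the library's own API):

```python
from lxml import objectify

xml_bytes = b"""<VOEvent ivorn="ivo://example/transients#0001" role="test" version="2.0">
  <Who><AuthorIVORN>ivo://example/authors</AuthorIVORN></Who>
  <What><Param name="mag" value="18.3" unit="mag"/></What>
</VOEvent>"""

packet = objectify.fromstring(xml_bytes)
print(packet.get("ivorn"))                 # ivo://example/transients#0001
print(packet.Who.AuthorIVORN)              # ivo://example/authors
print(packet.What.Param.get("value"))      # 18.3
```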

  8. Recommending Education Materials for Diabetic Questions Using Information Retrieval Approaches.

    PubMed

    Zeng, Yuqun; Liu, Xusheng; Wang, Yanshan; Shen, Feichen; Liu, Sijia; Rastegar-Mojarad, Majid; Wang, Liwei; Liu, Hongfang

    2017-10-16

    Self-management is crucial to diabetes care and providing expert-vetted content for answering patients' questions is crucial in facilitating patient self-management. The aim is to investigate the use of information retrieval techniques in recommending patient education materials for diabetic questions of patients. We compared two retrieval algorithms, one based on Latent Dirichlet Allocation topic modeling (topic modeling-based model) and one based on semantic group (semantic group-based model), with the baseline retrieval models, vector space model (VSM), in recommending diabetic patient education materials to diabetic questions posted on the TuDiabetes forum. The evaluation was based on a gold standard dataset consisting of 50 randomly selected diabetic questions where the relevancy of diabetic education materials to the questions was manually assigned by two experts. The performance was assessed using precision of top-ranked documents. We retrieved 7510 diabetic questions on the forum and 144 diabetic patient educational materials from the patient education database at Mayo Clinic. The mapping rate of words in each corpus mapped to the Unified Medical Language System (UMLS) was significantly different (P<.001). The topic modeling-based model outperformed the other retrieval algorithms. For example, for the top-retrieved document, the precision of the topic modeling-based, semantic group-based, and VSM models was 67.0%, 62.8%, and 54.3%, respectively. This study demonstrated that topic modeling can mitigate the vocabulary difference and it achieved the best performance in recommending education materials for answering patients' questions. One direction for future work is to assess the generalizability of our findings and to extend our study to other disease areas, other patient education material resources, and online forums. ©Yuqun Zeng, Xusheng Liu, Yanshan Wang, Feichen Shen, Sijia Liu, Majid Rastegar Mojarad, Liwei Wang, Hongfang Liu. Originally published in
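
    A hedged sketch of the topic-modeling-based ranking idea: fit LDA over a corpus, embed the question in topic space, and rank materials by topic-distribution similarity. The documents, question, and settings are toy stand-ins for the Mayo Clinic materials and TuDiabetes questions used in the study:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.metrics.pairwise import cosine_similarity

materials = [
    "how insulin therapy works and how to adjust your insulin dose",
    "carbohydrate counting and meal planning for blood glucose control",
    "foot care and preventing complications of diabetic neuropathy",
    "exercise guidelines and monitoring blood sugar during activity",
]
question = "what should I eat to keep my blood sugar stable after meals"

vectorizer = CountVectorizer(stop_words="english")
counts = vectorizer.fit_transform(materials + [question])

lda = LatentDirichletAllocation(n_components=3, random_state=0)
topic_dist = lda.fit_transform(counts)          # rows: 4 materials + 1 question

# Rank education materials by similarity to the question in topic space.
scores = cosine_similarity(topic_dist[-1:], topic_dist[:-1]).ravel()
for idx in scores.argsort()[::-1]:
    print(f"score={scores[idx]:.2f}  {materials[idx]}")
```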

  9. Recommending Education Materials for Diabetic Questions Using Information Retrieval Approaches

    PubMed Central

    Wang, Yanshan; Shen, Feichen; Liu, Sijia; Rastegar-Mojarad, Majid; Wang, Liwei

    2017-01-01

    Background Self-management is crucial to diabetes care and providing expert-vetted content for answering patients’ questions is crucial in facilitating patient self-management. Objective The aim is to investigate the use of information retrieval techniques in recommending patient education materials for diabetic questions of patients. Methods We compared two retrieval algorithms, one based on Latent Dirichlet Allocation topic modeling (topic modeling-based model) and one based on semantic group (semantic group-based model), with the baseline retrieval models, vector space model (VSM), in recommending diabetic patient education materials to diabetic questions posted on the TuDiabetes forum. The evaluation was based on a gold standard dataset consisting of 50 randomly selected diabetic questions where the relevancy of diabetic education materials to the questions was manually assigned by two experts. The performance was assessed using precision of top-ranked documents. Results We retrieved 7510 diabetic questions on the forum and 144 diabetic patient educational materials from the patient education database at Mayo Clinic. The mapping rate of words in each corpus mapped to the Unified Medical Language System (UMLS) was significantly different (P<.001). The topic modeling-based model outperformed the other retrieval algorithms. For example, for the top-retrieved document, the precision of the topic modeling-based, semantic group-based, and VSM models was 67.0%, 62.8%, and 54.3%, respectively. Conclusions This study demonstrated that topic modeling can mitigate the vocabulary difference and it achieved the best performance in recommending education materials for answering patients’ questions. One direction for future work is to assess the generalizability of our findings and to extend our study to other disease areas, other patient education material resources, and online forums. PMID:29038097

  10. SPIRES (Stanford Physics Information REtrieval System) 1969-70 Annual Report to the National Science Foundation (Office of Science Information Service).

    ERIC Educational Resources Information Center

    Parker, Edwin B.

    The third annual report (covering the 18-month period from January 1969 to June 1970) of the Stanford Physics Information REtrieval System (SPIRES) project, which is developing an augmented bibliographic retrieval capability, is presented in this document. A first section describes the background of the project and its association with Project…

  11. What versus where: Investigating how autobiographical memory retrieval differs when accessed with thematic versus spatial information.

    PubMed

    Sheldon, Signy; Chu, Sonja

    2017-09-01

    Autobiographical memory research has investigated how cueing distinct aspects of a past event can trigger different recollective experiences. This research has stimulated theories about how autobiographical knowledge is accessed and organized. Here, we test the idea that thematic information organizes multiple autobiographical events whereas spatial information organizes individual past episodes by investigating how retrieval guided by these two forms of information differs. We used a novel autobiographical fluency task in which participants accessed multiple memory exemplars to event theme and spatial (location) cues followed by a narrative description task in which they described the memories generated to these cues. Participants recalled significantly more memory exemplars to event theme than to spatial cues; however, spatial cues prompted faster access to past memories. Results from the narrative description task revealed that memories retrieved via event theme cues compared to spatial cues had a higher number of overall details, but those recalled to the spatial cues were recollected with a greater concentration on episodic details than those retrieved via event theme cues. These results provide evidence that thematic information organizes and integrates multiple memories whereas spatial information prompts the retrieval of specific episodic content from a past event.

  12. Information Content of Aerosol Retrievals in the Sunglint Region

    NASA Technical Reports Server (NTRS)

    Ottaviani, M.; Knobelspiesse, K.; Cairns, B.; Mishchenko, M.

    2013-01-01

    We exploit quantitative metrics to investigate the information content in retrievals of atmospheric aerosol parameters (with a focus on single-scattering albedo), contained in multi-angle and multi-spectral measurements with sufficient dynamical range in the sunglint region. The simulations are performed for two classes of maritime aerosols with optical and microphysical properties compiled from measurements of the Aerosol Robotic Network. The information content is assessed using the inverse formalism and is compared to that deriving from observations not affected by sunglint. We find that there indeed is additional information in measurements containing sunglint, not just for single-scattering albedo, but also for aerosol optical thickness and the complex refractive index of the fine aerosol size mode, although the amount of additional information varies with aerosol type.

  13. Catalog Descriptions Using VOTable Files

    NASA Astrophysics Data System (ADS)

    Thompson, R.; Levay, K.; Kimball, T.; White, R.

    2008-08-01

    Additional information is frequently required to describe database table contents and make them understandable to users. For this reason, the Multimission Archive at Space Telescope (MAST) creates "description files" for each table/catalog. After trying various XML and CSV formats, we finally chose VOTable. These files are easy to update via an HTML form, easily read using an XML parser such as (in our case) the PHP5 SimpleXML extension, and have found multiple uses in our data access/retrieval process.

  14. Fast and Efficient XML Data Access for Next-Generation Mass Spectrometry.

    PubMed

    Röst, Hannes L; Schmitt, Uwe; Aebersold, Ruedi; Malmström, Lars

    2015-01-01

    In mass spectrometry-based proteomics, XML formats such as mzML and mzXML provide an open and standardized way to store and exchange the raw data (spectra and chromatograms) of mass spectrometric experiments. These file formats are being used by a multitude of open-source and cross-platform tools which allow the proteomics community to access algorithms in a vendor-independent fashion and perform transparent and reproducible data analysis. Recent improvements in mass spectrometry instrumentation have increased the data size produced in a single LC-MS/MS measurement and put substantial strain on open-source tools, particularly those that are not equipped to deal with XML data files that reach dozens of gigabytes in size. Here we present a fast and versatile parsing library for mass spectrometric XML formats available in C++ and Python, based on the mature OpenMS software framework. Our library implements an API for obtaining spectra and chromatograms under memory constraints using random access or sequential access functions, allowing users to process datasets that are much larger than system memory. For fast access to the raw data structures, small XML files can also be completely loaded into memory. In addition, we have improved the parsing speed of the core mzML module by over 4-fold (compared to OpenMS 1.11), making our library suitable for a wide variety of algorithms that need fast access to dozens of gigabytes of raw mass spectrometric data. Our C++ and Python implementations are available for the Linux, Mac, and Windows operating systems. All proposed modifications to the OpenMS code have been merged into the OpenMS mainline codebase and are available to the community at https://github.com/OpenMS/OpenMS.
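
    The access pattern (sequential iteration that never holds the whole file in memory) can be illustrated with the standard library's iterparse on a toy spectra file; this is not the OpenMS parser or the mzML schema, only the general idea:

```python
import xml.etree.ElementTree as ET

def iter_spectra(path):
    """Yield one <spectrum> element at a time, releasing memory as we go."""
    for _, elem in ET.iterparse(path, events=("end",)):
        if elem.tag == "spectrum":
            yield elem.get("id"), len(elem.findall("peak"))
            elem.clear()                     # drop the processed subtree

# Write a tiny example file, then stream over it without loading it whole.
with open("run.xml", "w") as fh:
    fh.write("<run>")
    for i in range(3):
        fh.write(f'<spectrum id="s{i}"><peak mz="100.1"/><peak mz="200.2"/></spectrum>')
    fh.write("</run>")

for spectrum_id, n_peaks in iter_spectra("run.xml"):
    print(spectrum_id, n_peaks)
```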

  15. Fast and Efficient XML Data Access for Next-Generation Mass Spectrometry

    PubMed Central

    Röst, Hannes L.; Schmitt, Uwe; Aebersold, Ruedi; Malmström, Lars

    2015-01-01

    Motivation In mass spectrometry-based proteomics, XML formats such as mzML and mzXML provide an open and standardized way to store and exchange the raw data (spectra and chromatograms) of mass spectrometric experiments. These file formats are being used by a multitude of open-source and cross-platform tools which allow the proteomics community to access algorithms in a vendor-independent fashion and perform transparent and reproducible data analysis. Recent improvements in mass spectrometry instrumentation have increased the data size produced in a single LC-MS/MS measurement and put substantial strain on open-source tools, particularly those that are not equipped to deal with XML data files that reach dozens of gigabytes in size. Results Here we present a fast and versatile parsing library for mass spectrometric XML formats available in C++ and Python, based on the mature OpenMS software framework. Our library implements an API for obtaining spectra and chromatograms under memory constraints using random access or sequential access functions, allowing users to process datasets that are much larger than system memory. For fast access to the raw data structures, small XML files can also be completely loaded into memory. In addition, we have improved the parsing speed of the core mzML module by over 4-fold (compared to OpenMS 1.11), making our library suitable for a wide variety of algorithms that need fast access to dozens of gigabytes of raw mass spectrometric data. Availability Our C++ and Python implementations are available for the Linux, Mac, and Windows operating systems. All proposed modifications to the OpenMS code have been merged into the OpenMS mainline codebase and are available to the community at https://github.com/OpenMS/OpenMS. PMID:25927999

  16. Secure Retrieval of FFTF Testing, Design, and Operating Information

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Butner, R. Scott; Wootan, David W.; Omberg, Ronald P.

    One of the goals of the Advanced Fuel Cycle Initiative (AFCI) is to preserve the knowledge that has been gained in the United States on Liquid Metal Reactors (LMR). In addition, preserving LMR information and knowledge is part of a larger international collaborative activity conducted under the auspices of the International Atomic Energy Agency (IAEA). A similar program is being conducted for EBR-II at the Idaho Nuclear Laboratory (INL) and international programs are also in progress. Knowledge preservation at the FFTF is focused on the areas of design, construction, startup, and operation of the reactor. As the primary function of the FFTF was testing, the focus is also on preserving information obtained from irradiation testing of fuels and materials. This information will be invaluable when, at a later date, international decisions are made to pursue new LMRs. In the interim, this information may be of potential use for international exchanges with other LMR programs around the world. At least as important in the United States, which is emphasizing large-scale computer simulation and modeling, this information provides the basis for creating benchmarks for validating and testing these large scale computer programs. Although the preservation activity with respect to FFTF information as discussed below is still underway, the team of authors above is currently retrieving and providing experimental and design information to the LMR modeling and simulation efforts for use in validating their computer models. On the Hanford Site, the FFTF reactor plant is one of the facilities intended for decontamination and decommissioning consistent with the cleanup mission on this site. The reactor facility has been deactivated and is being maintained in a cold and dark minimal surveillance and maintenance mode until final decommissioning is pursued. In order to ensure protection of information at risk, the program to date has focused on sequestering and secure retrieval

  17. Words, concepts, or both: optimal indexing units for automated information retrieval.

    PubMed Central

    Hersh, W. R.; Hickam, D. H.; Leone, T. J.

    1992-01-01

    What is the best way to represent the content of documents in an information retrieval system? This study compares the retrieval effectiveness of five different methods for automated (machine-assigned) indexing using three test collections. The consistently best methods are those that use indexing based on the words that occur in the available text of each document. Methods used to map text into concepts from a controlled vocabulary showed no advantage over the word-based methods. This study also looked at an approach to relevance feedback which showed benefit for both word-based and concept-based methods. PMID:1482951
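
    A minimal sketch of the word-based indexing approach the study found most effective is shown below: tokenize each document's available text and build an inverted index from words to document identifiers. The documents and the tokenization rule are invented for illustration.

```python
# Word-based indexing sketch: build an inverted index from the words that
# occur in each document's text. The documents here are invented; the study's
# actual test collections are not reproduced.
import re
from collections import defaultdict

docs = {
    "d1": "Aspirin reduces fever and mild pain.",
    "d2": "Fever in children may respond to acetaminophen.",
}

index = defaultdict(set)
for doc_id, text in docs.items():
    for word in re.findall(r"[a-z]+", text.lower()):
        index[word].add(doc_id)

print(sorted(index["fever"]))  # -> ['d1', 'd2']
```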

  18. Practical Side of the Bibliographic Information Retrieval System in the National Museum of Ethnology

    NASA Astrophysics Data System (ADS)

    Kondo, Katsuichi

    The information retrieval system of the National Museum of Ethnology made its debut in 1979 and now enables searches for books not only in the Museum but also elsewhere in Japan and abroad by means of JAPAN MARC & LC MARC. The author briefly outlines the development of the information management system and then describes practical cases of using the retrieval system. Problems to be solved in future development are also mentioned.

  19. Using a Combination of UML, C2RM, XML, and Metadata Registries to Support Long-Term Development/Engineering

    DTIC Science & Technology

    2003-01-01

    Keyword listing of XML specifications referenced in the report (recovered from the original figure layout): authentication (XCBF); authorization (XACML, SAML); privacy (P3P); digital rights management (XrML); content management (DASL, WebDAV); content syndication; registry/repository; BPSS; eCommerce XML/EDI; Universal Business Language (UBL); human resources (HR-XML).

  20. Interactions among emotional attention, encoding, and retrieval of ambiguous information: An eye-tracking study.

    PubMed

    Everaert, Jonas; Koster, Ernst H W

    2015-10-01

    Emotional biases in attention modulate encoding of emotional material into long-term memory, but little is known about the role of such attentional biases during emotional memory retrieval. The present study investigated how emotional biases in memory are related to attentional allocation during retrieval. Forty-nine individuals encoded emotionally positive and negative meanings derived from ambiguous information and then searched their memory for encoded meanings in response to a set of retrieval cues. The remember/know/new procedure was used to classify memories as recollection-based or familiarity-based, and gaze behavior was monitored throughout the task to measure attentional allocation. We found that a bias in sustained attention during recollection-based, but not familiarity-based, retrieval predicted subsequent memory bias toward positive versus negative material following encoding. Thus, during emotional memory retrieval, attention affects controlled forms of retrieval (i.e., recollection) but does not modulate relatively automatic, familiarity-based retrieval. These findings enhance understanding of how distinct components of attention regulate the emotional content of memories. Implications for theoretical models and emotion regulation are discussed. (c) 2015 APA, all rights reserved.

  1. Fluency heuristic: a model of how the mind exploits a by-product of information retrieval.

    PubMed

    Hertwig, Ralph; Herzog, Stefan M; Schooler, Lael J; Reimer, Torsten

    2008-09-01

    Boundedly rational heuristics for inference can be surprisingly accurate and frugal for several reasons. They can exploit environmental structures, co-opt complex capacities, and elude effortful search by exploiting information that automatically arrives on the mental stage. The fluency heuristic is a prime example of a heuristic that makes the most of an automatic by-product of retrieval from memory, namely, retrieval fluency. In 4 experiments, the authors show that retrieval fluency can be a proxy for real-world quantities, that people can discriminate between two objects' retrieval fluencies, and that people's inferences are in line with the fluency heuristic (in particular fast inferences) and with experimentally manipulated fluency. The authors conclude that the fluency heuristic may be one tool in the mind's repertoire of strategies that artfully probes memory for encapsulated frequency information that can veridically reflect statistical regularities in the world. (c) 2008 APA, all rights reserved.

  2. Towards an Intelligent Possibilistic Web Information Retrieval Using Multiagent System

    ERIC Educational Resources Information Center

    Elayeb, Bilel; Evrard, Fabrice; Zaghdoud, Montaceur; Ahmed, Mohamed Ben

    2009-01-01

    Purpose: The purpose of this paper is to make a scientific contribution to web information retrieval (IR). Design/methodology/approach: A multiagent system for web IR is proposed based on new technologies: Hierarchical Small-Worlds (HSW) and Possibilistic Networks (PN). This system is based on a possibilistic qualitative approach which extends the…

  3. Protein Annotators' Assistant: A Novel Application of Information Retrieval Techniques.

    ERIC Educational Resources Information Center

    Wise, Michael J.

    2000-01-01

    Protein Annotators' Assistant (PAA) is a software system which assists protein annotators in assigning functions to newly sequenced proteins. PAA employs a number of information retrieval techniques in a novel setting and is thus related to text categorization, where multiple categories may be suggested, except that in this case none of the…

  4. SPIRES (Stanford Physics Information REtrieval System) 1969-70 Annual Report.

    ERIC Educational Resources Information Center

    Stanford Univ., CA. Inst. for Communication Research.

    For those unfamiliar with the Stanford Physics Information Retrieval System (SPIRES) an introduction and background section is provided in this 1969-70 annual report. This is followed by: (1) the SPIRES I prototype, (2) developing a production system--SPIRES II and (3) system scope and requirements analysis. The appendices present: (1) Stanford…

  5. Fringe pattern information retrieval using wavelets

    NASA Astrophysics Data System (ADS)

    Sciammarella, Cesar A.; Patimo, Caterina; Manicone, Pasquale D.; Lamberti, Luciano

    2005-08-01

    Two-dimensional phase modulation is currently the basic model used to interpret fringe patterns that contain displacement information (moiré, holographic interferometry, speckle techniques). Another way to look at these two-dimensional signals is to consider them as frequency-modulated signals. This alternative interpretation has practical implications similar to those that exist in radio engineering for handling frequency-modulated signals. Using this model, it is possible to obtain frequency information through the energy approach introduced by Ville in 1944. A natural complementary tool for this process is the wavelet methodology. The use of wavelets makes it possible to obtain the local values of the frequency in a one- or two-dimensional domain without prior phase retrieval and differentiation. Furthermore, from the properties of wavelets it is also possible to obtain the phase of the signal at the same time, with the advantages of better noise removal and simpler phase-unwrapping algorithms, since the derivative of the phase is directly available.
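
    The following sketch illustrates the idea in one dimension, under simplifying assumptions: a synthetic linear-chirp fringe signal is correlated with Morlet wavelets over a range of scales, and the scale of maximum energy at each sample gives the local frequency directly, without prior phase retrieval or differentiation. All signal parameters are invented; this is not the authors' implementation.

```python
import numpy as np

# Synthetic fringe signal: a linear chirp whose local (spatial) frequency rises
# from 0.02 to 0.10 cycles/sample; the numbers are illustrative only.
n = np.arange(2048)
inst_freq = 0.02 + (0.08 / 2048.0) * n
phase = 2.0 * np.pi * np.cumsum(inst_freq)
signal = np.cos(phase)

omega0 = 6.0                             # Morlet central frequency (rad)
freqs = np.linspace(0.01, 0.12, 60)      # candidate local frequencies (cycles/sample)
scales = omega0 / (2.0 * np.pi * freqs)

power = np.empty((len(scales), len(signal)))
for i, s in enumerate(scales):
    t = np.arange(-int(4 * s), int(4 * s) + 1)
    wavelet = np.exp(-0.5 * (t / s) ** 2) * np.exp(1j * omega0 * t / s) / np.sqrt(s)
    coeff = np.convolve(signal, wavelet, mode="same")
    power[i] = np.abs(coeff) ** 2

# The ridge of maximum wavelet energy tracks the local fringe frequency with no
# prior phase retrieval, unwrapping, or differentiation.
estimated = freqs[np.argmax(power, axis=0)]
print(estimated[1024], inst_freq[1024])  # estimated vs. true local frequency
```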

  6. Natural Language Query System Design for Interactive Information Storage and Retrieval Systems. M.S. Thesis

    NASA Technical Reports Server (NTRS)

    Dominick, Wayne D. (Editor); Liu, I-Hsiung

    1985-01-01

    The currently developed multi-level language interfaces of information systems are generally designed for experienced users. These interfaces commonly ignore the nature and needs of the largest user group, i.e., casual users. This research identifies the importance of natural language query system research within information storage and retrieval system development; addresses the topics of developing such a query system; and finally, proposes a framework for the development of natural language query systems in order to facilitate the communication between casual users and information storage and retrieval systems.

  7. Toward an Episodic Context Account of Retrieval-Based Learning: Dissociating Retrieval Practice and Elaboration

    ERIC Educational Resources Information Center

    Lehman, Melissa; Smith, Megan A.; Karpicke, Jeffrey D.

    2014-01-01

    We tested the predictions of 2 explanations for retrieval-based learning; while the elaborative retrieval hypothesis assumes that the retrieval of studied information promotes the generation of semantically related information, which aids in later retrieval (Carpenter, 2009), the episodic context account proposed by Karpicke, Lehman, and Aue (in…

  8. Aerosol Retrievals from Proposed Satellite Bistatic Lidar Observations: Algorithm and Information Content

    NASA Astrophysics Data System (ADS)

    Alexandrov, M. D.; Mishchenko, M. I.

    2017-12-01

    Accurate aerosol retrievals from space remain quite challenging and typically involve solving a severely ill-posed inverse scattering problem. We have suggested addressing this ill-posedness by flying a bistatic lidar system. Such a system would consist of a formation-flying constellation: a primary satellite equipped with a conventional monostatic (backscattering) lidar and an additional platform hosting a receiver of the scattered laser light. If successfully implemented, this concept would combine the measurement capabilities of a passive multi-angle, multi-spectral polarimeter with the vertical profiling capability of a lidar. Thus, bistatic lidar observations would be free of the deficiencies affecting both monostatic lidar measurements (caused by their highly limited information content) and passive photopolarimetric measurements (caused by vertical integration and surface reflection). We present a preliminary aerosol retrieval algorithm for a bistatic lidar system consisting of a high spectral resolution lidar (HSRL) and an additional receiver flown in formation with it at a scattering angle of 165 degrees. This algorithm was applied to synthetic data generated using Mie-theory computations. The model/retrieval parameters in our tests were the effective radius and variance of the aerosol size distribution, the complex refractive index of the particles, and their number concentration. Both mono- and bimodal aerosol mixtures were considered. Our algorithm allowed for definitive evaluation of error propagation from measurements to retrievals using a Monte Carlo technique, which involves random distortion of the observations and statistical characterization of the resulting retrieval errors. Our tests demonstrated that supplementing a conventional monostatic HSRL with an additional receiver dramatically increases the information content of the measurements and allows for a sufficiently accurate characterization of tropospheric aerosols.
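
    The Monte Carlo error-propagation step described above can be sketched as follows: perturb synthetic observations with assumed measurement noise, repeat the retrieval on each distorted copy, and characterize the spread of the retrieved parameters. The forward model below is a made-up linear toy standing in for the Mie-theory computations, and a simple least-squares fit stands in for the actual retrieval algorithm.

```python
# Hedged sketch of Monte Carlo error propagation: distort synthetic
# observations with noise, redo a stand-in retrieval each time, and
# characterize the spread of the retrieved parameters. All numbers are toy
# values, not the paper's aerosol models.
import numpy as np

rng = np.random.default_rng(0)

n_obs, n_par = 40, 3
K = rng.normal(size=(n_obs, n_par))        # toy forward-model Jacobian
x_true = np.array([0.15, 1.45, 0.01])      # e.g. reff, Re(m), Im(m) (illustrative)
y_clean = K @ x_true
noise_sd = 0.02 * np.abs(y_clean) + 1e-3   # assumed measurement noise level

retrievals = []
for _ in range(2000):
    y = y_clean + rng.normal(0.0, noise_sd)            # distorted observation
    x_hat, *_ = np.linalg.lstsq(K, y, rcond=None)      # stand-in retrieval
    retrievals.append(x_hat)

retrievals = np.array(retrievals)
print("bias:", retrievals.mean(axis=0) - x_true)
print("1-sigma retrieval error:", retrievals.std(axis=0))
```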

  9. ORFer--retrieval of protein sequences and open reading frames from GenBank and storage into relational databases or text files.

    PubMed

    Büssow, Konrad; Hoffmann, Steve; Sievert, Volker

    2002-12-19

    Functional genomics involves parallel experimentation with large sets of proteins. This requires management of large sets of open reading frames as a prerequisite for the cloning and recombinant expression of these proteins. A Java program was developed for retrieval of protein and nucleic acid sequences and annotations from NCBI GenBank, using the XML sequence format. Annotations retrieved by ORFer include the sequence name, the organism, and the completeness of the sequence. The program has a graphical user interface, although it can also be used in a non-interactive mode. For protein sequences, the program also extracts the open reading frame sequence, if available, and checks its correct translation. ORFer accepts user input in the form of single GenBank GI identifiers or accession numbers, or lists of them. It can be used to extract complete sets of open reading frames and protein sequences from any kind of GenBank sequence entry, including complete genomes or chromosomes. Sequences are either stored with their features in a relational database or exported as text files in Fasta or tab-delimited format. The ORFer program is freely available at http://www.proteinstrukturfabrik.de/orfer. ORFer allows fast retrieval of DNA sequences, protein sequences, their open reading frames, and sequence annotations from GenBank. Furthermore, storage of sequences and features in a relational database is supported. Such a database can supplement a laboratory information management system (LIMS) with appropriate sequence information.
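
    As a conceptual illustration of the translation check mentioned above (ORFer itself is a Java program), the sketch below translates a hypothetical open reading frame with Biopython and compares it against the annotated protein sequence; the sequences are invented.

```python
# Conceptual sketch only -- not ORFer's Java code. It mimics the "check its
# correct translation" step: translate the extracted ORF and compare it with
# the protein sequence declared in the record. Sequences are invented.
from Bio.Seq import Seq

orf = Seq("ATGGCTGCTAAATAA")          # hypothetical open reading frame
declared_protein = "MAAK"            # protein sequence reported in the record

translated = str(orf.translate(to_stop=True))
if translated == declared_protein:
    print("ORF translation matches the annotated protein.")
else:
    print(f"Mismatch: translated {translated!r}, annotated {declared_protein!r}")
```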

  10. Machine Translation-Supported Cross-Language Information Retrieval for a Consumer Health Resource

    PubMed Central

    Rosemblat, Graciela; Gemoets, Darren; Browne, Allen C.; Tse, Tony

    2003-01-01

    The U.S. National Institutes of Health, through its National Library of Medicine, developed ClinicalTrials.gov to provide the public with easy access to information on clinical trials on a wide range of conditions or diseases. Only English language information retrieval is currently supported. Given the growing number of Spanish speakers in the U.S. and their increasing use of the Web, we anticipate a significant increase in Spanish-speaking users. This study compares the effectiveness of two common cross-language information retrieval methods using machine translation, query translation versus document translation, using a subset of genuine user queries from ClinicalTrials.gov. Preliminary results conducted with the ClinicalTrials.gov search engine show that in our environment, query translation is statistically significantly better than document translation. We discuss possible reasons for this result and we conclude with suggestions for future work. PMID:14728236
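
    The two strategies compared in the study can be sketched as follows, using a toy keyword matcher and a stand-in translate() function (neither reflects the actual ClinicalTrials.gov machinery): query translation converts the Spanish query to English and searches the English documents, while document translation converts the documents to Spanish and searches them with the original query.

```python
# Toy contrast of the two machine-translation-based CLIR strategies. The
# translate() lexicon and the keyword "search engine" are hypothetical
# stand-ins for illustration only.
def translate(text, source, target):
    lexicon = {
        ("es", "en"): {"embarazo": "pregnancy", "diabetes": "diabetes"},
        ("en", "es"): {"pregnancy": "embarazo", "diabetes": "diabetes"},
    }
    table = lexicon[(source, target)]
    return " ".join(table.get(word, word) for word in text.lower().split())

def search(query, documents):
    terms = set(query.lower().split())
    return [doc for doc in documents if terms & set(doc.lower().split())]

english_docs = ["Trial of insulin therapy in diabetes",
                "Nutrition counseling during pregnancy"]
spanish_query = "embarazo"

# Query translation: translate the Spanish query into English, then search the
# original English documents.
hits_qt = search(translate(spanish_query, "es", "en"), english_docs)

# Document translation: translate the documents into Spanish, then search them
# with the original Spanish query.
hits_dt = search(spanish_query, [translate(d, "en", "es") for d in english_docs])

print(hits_qt)
print(hits_dt)
```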

  11. Encoding and retrieval processes involved in the access of source information in the absence of item memory.

    PubMed

    Ball, B Hunter; DeWitt, Michael R; Knight, Justin B; Hicks, Jason L

    2014-09-01

    The current study sought to examine the relative contributions of encoding and retrieval processes in accessing contextual information in the absence of item memory using an extralist cuing procedure in which the retrieval cues used to query memory for contextual information were related to the target item but never actually studied. In Experiments 1 and 2, participants studied 1 category member (e.g., onion) from a variety of different categories and at test were presented with an unstudied category label (e.g., vegetable) to probe memory for item and source information. In Experiments 3 and 4, 1 member of unidirectional (e.g., credit or card) or bidirectional (e.g., salt or pepper) associates was studied, whereas the other unstudied member served as a test probe. When recall failed, source information was accessible only when items were processed deeply during encoding (Experiments 1 and 2) and when there was strong forward associative strength between the retrieval cue and target (Experiments 3 and 4). These findings suggest that a retrieval probe diagnostic of semantically related item information reinstantiates information bound in memory during encoding that results in reactivation of associated contextual information, contingent upon sufficient learning of the item itself and the association between the item and its context information.

  12. Construction of a nasopharyngeal carcinoma 2D/MS repository with Open Source XML database--Xindice.

    PubMed

    Li, Feng; Li, Maoyu; Xiao, Zhiqiang; Zhang, Pengfei; Li, Jianling; Chen, Zhuchu

    2006-01-11

    Many proteomics initiatives require integration of all information under uniform criteria, from the collection of samples and data display to the publication of experimental results. Integrating and exchanging data of different formats and structures poses a great challenge. XML technology shows promise in handling this task owing to its simplicity and flexibility. Nasopharyngeal carcinoma (NPC) is one of the most common cancers in southern China and Southeast Asia, with marked geographic and racial differences in incidence. Although some cancer proteome databases now exist, there is still no NPC proteome database. The raw NPC proteome experiment data were captured into one XML document with the Human Proteome Markup Language (HUP-ML) editor and imported into the native XML database Xindice. The 2D/MS repository of the NPC proteome was constructed with Apache, PHP and Xindice to provide access to the database via the Internet. On our website, two methods, keyword query and click query, are provided to access the entries of the NPC proteome database. Our 2D/MS repository can be used to share the raw NPC proteomics data generated from gel-based proteomics experiments. The database, as well as the PHP source code for constructing users' own proteome repositories, can be accessed at http://www.xyproteomics.org/.
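
    As a language-neutral illustration of the keyword-query idea (the repository itself uses Xindice queried from PHP), the sketch below runs a simple keyword search over a small XML fragment; the element names are invented and do not reproduce the HUP-ML schema.

```python
# Illustration only: a keyword lookup over a toy XML proteome record using the
# Python standard library. Element names are invented, not the HUP-ML schema,
# and this does not use Xindice or PHP as the actual repository does.
import xml.etree.ElementTree as ET

record_xml = """
<experiment>
  <spot id="s001"><protein>Keratin 8</protein><mass>53700</mass></spot>
  <spot id="s002"><protein>Annexin A1</protein><mass>38700</mass></spot>
</experiment>
"""

root = ET.fromstring(record_xml)
keyword = "annexin"
for spot in root.findall("./spot"):
    name = spot.findtext("protein", default="")
    if keyword.lower() in name.lower():
        print(spot.get("id"), name, spot.findtext("mass"))
```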

  13. Distinguishable memory retrieval networks for collaboratively and non-collaboratively learned information.

    PubMed

    Vanlangendonck, Flora; Takashima, Atsuko; Willems, Roel M; Hagoort, Peter

    2018-03-01

    Learning often occurs in communicative and collaborative settings, yet almost all research into the neural basis of memory relies on participants encoding and retrieving information on their own. We investigated whether learning linguistic labels in a collaborative context at least partly relies on cognitively and neurally distinct representations, as compared to learning in an individual context. Healthy human participants learned labels for sets of abstract shapes in three different tasks. They came up with labels with another person in a collaborative communication task (collaborative condition), by themselves (individual condition), or were given pre-determined unrelated labels to learn by themselves (arbitrary condition). Immediately after learning, participants retrieved and produced the labels aloud during a communicative task in the MRI scanner. The fMRI results show that the retrieval of collaboratively generated labels as compared to individually learned labels engages brain regions involved in understanding others (mentalizing or theory of mind) and autobiographical memory, including the medial prefrontal cortex, the right temporoparietal junction and the precuneus. This study is the first to show that collaboration during encoding affects the neural networks involved in retrieval. Copyright © 2018 The Authors. Published by Elsevier Ltd. All rights reserved.

  14. Information content and sensitivity of the 3β + 2α lidar measurement system for aerosol microphysical retrievals

    NASA Astrophysics Data System (ADS)

    Burton, Sharon P.; Chemyakin, Eduard; Liu, Xu; Knobelspiesse, Kirk; Stamnes, Snorre; Sawamura, Patricia; Moore, Richard H.; Hostetler, Chris A.; Ferrare, Richard A.

    2016-11-01

    There is considerable interest in retrieving profiles of aerosol effective radius, total number concentration, and complex refractive index from lidar measurements of extinction and backscatter at several wavelengths. The combination of three backscatter channels plus two extinction channels (3β + 2α) is particularly important since it is believed to be the minimum configuration necessary for the retrieval of aerosol microphysical properties and because the technological readiness of lidar systems permits this configuration on both an airborne and future spaceborne instrument. The second-generation NASA Langley airborne High Spectral Resolution Lidar (HSRL-2) has been making 3β + 2α measurements since 2012. The planned NASA Aerosol/Clouds/Ecosystems (ACE) satellite mission also recommends the 3β + 2α combination. Here we develop a deeper understanding of the information content and sensitivities of the 3β + 2α system in terms of aerosol microphysical parameters of interest. We use a retrieval-free methodology to determine the basic sensitivities of the measurements independent of retrieval assumptions and constraints. We calculate information content and uncertainty metrics using tools borrowed from the optimal estimation methodology based on Bayes' theorem, using a simplified forward model look-up table, with no explicit inversion. The forward model is simplified to represent spherical particles, monomodal log-normal size distributions, and wavelength-independent refractive indices. Since we only use the forward model with no retrieval, the given simplified aerosol scenario is applicable as a best case for all existing retrievals in the absence of additional constraints. Retrieval-dependent errors due to mismatch between retrieval assumptions and true atmospheric aerosols are not included in this sensitivity study, and neither are retrieval errors that may be introduced in the inversion process. The choice of a simplified model adds clarity to the
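
    The optimal-estimation information-content metrics referred to above can be sketched as follows: given a measurement Jacobian K, a measurement-error covariance Se and an a priori covariance Sa, compute the posterior covariance, the averaging kernel, and the degrees of freedom for signal. The Jacobian and covariances below are invented toy numbers, not values from the paper's look-up table.

```python
# Hedged sketch of Rodgers-style information-content metrics: averaging kernel
# A = (K^T Se^-1 K + Sa^-1)^-1 K^T Se^-1 K and degrees of freedom for signal
# = trace(A). The Jacobian and covariances are toy values for illustration.
import numpy as np

n_meas, n_state = 5, 4              # 3 backscatter + 2 extinction channels;
                                    # e.g. reff, variance, Re(m), number conc.
rng = np.random.default_rng(1)
K = rng.normal(size=(n_meas, n_state))      # toy Jacobian d(measurement)/d(state)
Se = np.diag(np.full(n_meas, 0.05 ** 2))    # assumed measurement-error covariance
Sa = np.diag(np.full(n_state, 1.0))         # broad a priori variances

Se_inv = np.linalg.inv(Se)
Sa_inv = np.linalg.inv(Sa)

# Posterior covariance and averaging kernel (Gauss-linear formulas).
S_post = np.linalg.inv(K.T @ Se_inv @ K + Sa_inv)
A = S_post @ K.T @ Se_inv @ K

print("degrees of freedom for signal:", np.trace(A))
print("posterior / prior error ratio:", np.sqrt(np.diag(S_post) / np.diag(Sa)))
```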

  15. TRANSACTIONS OF THE ALL-UNION CONFERENCE ON INFORMATION RETRIEVAL SYSTEMS AND AUTOMATED PROCESSING OF SCIENTIFIC AND TECHNICAL INFORMATION (3rd): PREFACE

    DTIC Science & Technology

    The research and development of information services within the USSR, reported at the 3rd All-Union Conference on information retrieval systems and automated processing of scientific and technical information, is discussed.

  16. Managing and Querying Image Annotation and Markup in XML.

    PubMed

    Wang, Fusheng; Pan, Tony; Sharma, Ashish; Saltz, Joel

    2010-01-01

    Proprietary approaches to representing annotations and image markup are serious barriers to researchers sharing image data and knowledge. The Annotation and Image Markup (AIM) project is developing a standards-based information model for image annotation and markup in health care and clinical trial environments. The complex hierarchical structures of the AIM data model pose new challenges for managing such data in terms of performance and support for complex queries. In this paper, we present our work on managing AIM data through a native XML approach and on supporting complex image and annotation queries through a native extension of the XQuery language. Through integration with xService, AIM databases can now be conveniently shared through caGrid.
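
    As a simplified, stand-alone illustration of querying hierarchical annotation data stored as XML (the AIM work itself extends XQuery in a native XML database), the sketch below applies an XPath predicate to a toy annotation document; the element and attribute names are invented, not the real AIM model.

```python
# Illustration only: an XPath predicate query over a toy annotation document.
# Element and attribute names are invented and do not reproduce the AIM model,
# nor the native XML database and XQuery extension used in the paper.
import xml.etree.ElementTree as ET

doc = ET.fromstring("""
<annotations>
  <annotation modality="CT" patient="p1"><finding>nodule</finding></annotation>
  <annotation modality="MR" patient="p2"><finding>lesion</finding></annotation>
  <annotation modality="CT" patient="p3"><finding>mass</finding></annotation>
</annotations>
""")

for a in doc.findall(".//annotation[@modality='CT']"):
    print(a.get("patient"), a.findtext("finding"))
```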

  17. Managing and Querying Image Annotation and Markup in XML

    PubMed Central

    Wang, Fusheng; Pan, Tony; Sharma, Ashish; Saltz, Joel

    2010-01-01

    Proprietary approaches to representing annotations and image markup are serious barriers to researchers sharing image data and knowledge. The Annotation and Image Markup (AIM) project is developing a standards-based information model for image annotation and markup in health care and clinical trial environments. The complex hierarchical structures of the AIM data model pose new challenges for managing such data in terms of performance and support for complex queries. In this paper, we present our work on managing AIM data through a native XML approach and on supporting complex image and annotation queries through a native extension of the XQuery language. Through integration with xService, AIM databases can now be conveniently shared through caGrid. PMID:21218167

  18. Impact of a priori information on IASI ozone retrievals and trends

    NASA Astrophysics Data System (ADS)

    Barret, B.; Peiro, H.; Emili, E.; Le Flocgmoën, E.

    2017-12-01

    The IASI sensor has documented atmospheric water vapor, temperature and composition since 2007. The Software for a Fast Retrieval of IASI Data (SOFRID) has been developed to retrieve O3 and CO profiles from IASI in near-real time on a global scale. Information content analyses have shown that IASI enables the quantification of O3 independently in the troposphere, the UTLS and the stratosphere. Validation studies have demonstrated that the daily to seasonal variability of tropospheric and UTLS O3 is well captured by IASI, especially in the tropics. IASI-SOFRID retrievals have also been used to document the tropospheric composition during the Asian monsoon and contributed to determining the O3 evolution over the 2008-2016 period in the framework of the TOAR project. Nevertheless, IASI-SOFRID O3 is biased high in the UTLS and in the tropical troposphere, and the 8-year O3 trends from the different IASI products differ significantly from the O3 trends from UV-Vis satellite sensors (e.g. OMI). SOFRID is based on the Optimal Estimation Method, which requires a priori information to complement the information provided by the measured thermal infrared radiances. In SOFRID-O3 v1.5, used in TOAR, the a priori consists of a single O3 profile and associated covariance matrix based on global O3 radiosoundings. Such a global a priori is characterized by a very large variability and does not represent our best knowledge of the O3 profile at a given time and location. Furthermore, it is biased towards northern hemisphere middle latitudes. We have therefore implemented the possibility of using dynamical a priori data in SOFRID and performed experiments using O3 climatological data and MLS O3 analyses. We will present O3 distributions and comparisons with O3 radiosoundings from the different SOFRID-O3 retrievals. In particular, we will assess the impact of different a priori data upon the O3 biases and trends during the IASI period.
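
    A toy example of why the retrieved state depends on the chosen a priori in an optimal-estimation scheme is sketched below (this is not the SOFRID code): the same synthetic measurement is inverted twice with different a priori states, yielding different retrievals. All numbers are invented.

```python
# Toy illustration (not SOFRID) of the a priori dependence of an
# optimal-estimation retrieval: x_hat = xa + (K^T Se^-1 K + Sa^-1)^-1
# K^T Se^-1 (y - K xa). The same measurement y is inverted with two
# different a priori states xa; all numbers are invented.
import numpy as np

K = np.array([[1.0, 0.3],
              [0.2, 0.9]])                               # toy Jacobian
Se_inv = np.linalg.inv(np.diag([0.2 ** 2, 0.2 ** 2]))    # measurement noise
Sa_inv = np.linalg.inv(np.diag([0.5 ** 2, 0.5 ** 2]))    # a priori covariance

x_true = np.array([1.0, 2.0])
y = K @ x_true                                           # noise-free synthetic measurement

def retrieve(xa):
    gain = np.linalg.inv(K.T @ Se_inv @ K + Sa_inv) @ K.T @ Se_inv
    return xa + gain @ (y - K @ xa)

print(retrieve(np.array([1.0, 2.0])))   # a priori equal to the truth
print(retrieve(np.array([2.0, 1.0])))   # biased a priori -> biased retrieval
```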

  19. Retrieval of profile information from airborne multiaxis UV-visible skylight absorption measurements.

    PubMed

    Bruns, Marco; Buehler, Stefan A; Burrows, John P; Heue, Klaus-Peter; Platt, Ulrich; Pundt, Irene; Richter, Andreas; Rozanov, Alexej; Wagner, Thomas; Wang, Ping

    2004-08-01

    A recent development in ground-based remote sensing of atmospheric constituents by UV-visible absorption measurements of scattered light is the simultaneous use of several horizon viewing directions in addition to the traditional zenith-sky pointing. The different light paths through the atmosphere enable the vertical distribution of some atmospheric absorbers, such as NO2, BrO, or O3, to be retrieved. This approach has recently been implemented on an airborne platform. This novel instrument, the airborne multiaxis differential optical absorption spectrometer (AMAXDOAS), has been flown for the first time. In this study, the amount of profile information that can be retrieved from such measurements is investigated for the trace gas NO2. Sensitivity studies on synthetic data are performed for a variety of representative measurement conditions including two wavelengths, one in the UV and one in the visible, two different surface spectral reflectances, various lines of sight (LOSs), and for two different flight altitudes. The results demonstrate that the AMAXDOAS measurements contain useful profile information, mainly at flight altitude and below the aircraft. Depending on wavelength and LOS used, the vertical resolution of the retrieved profiles is as good as 2 km near flight altitude. Above 14 km the profile information content of AMAXDOAS measurements is sparse. Airborne multiaxis measurements are thus a promising tool for atmospheric studies in the troposphere and the upper troposphere and lower stratosphere region.

  20. Retrieval and sleep both counteract the forgetting of spatial information.

    PubMed

    Antony, James W; Paller, Ken A

    2018-06-01

    Repeatedly studying information is a good way to strengthen memory storage. Nevertheless, testing recall often produces superior long-term retention. Demonstrations of this testing effect, typically with verbal stimuli, have shown that repeated retrieval through testing reduces forgetting. Sleep also benefits memory storage, perhaps through repeated retrieval as well. That is, memories may generally be subject to forgetting that can be counteracted when memories become reactivated, and there are several types of reactivation: (i) via intentional restudying, (ii) via testing, (iii) without provocation during wake, or (iv) during sleep. We thus measured forgetting for spatial material subjected to repeated study or repeated testing followed by retention intervals with sleep versus wake. Four groups of subjects learned a set of visual object-location associations and either restudied the associations or recalled locations given the objects as cues. We found the advantage for restudied over retested information was greater in the PM than AM group. Additional groups tested at 5-min and 1-wk retention intervals confirmed previous findings of greater relative benefits for restudying in the short-term and for retesting in the long-term. Results overall support the conclusion that repeated reactivation through testing or sleeping stabilizes information against forgetting. © 2018 Antony and Paller; Published by Cold Spring Harbor Laboratory Press.