Data Compression in Full-Text Retrieval Systems.
ERIC Educational Resources Information Center
Bell, Timothy C.; And Others
1993-01-01
Describes compression methods for components of full-text systems such as text databases on CD-ROM. Topics discussed include storage media; structures for full-text retrieval, including indexes, inverted files, and bitmaps; compression tools; memory requirements during retrieval; and ranking and information retrieval. (Contains 53 references.)…
STATUS/IQ: A Semi-Intelligent Information Retrieval System.
ERIC Educational Resources Information Center
Pearsall, Jayne
1990-01-01
Provides background on the problems of traditional text retrieval systems and describes STATUS/IQ, an advanced text retrieval system that incorporates a natural language front-end and an advanced relevance ranking facility. The principles, capabilities, and benefits of the system are discussed, and an example of a STATUS/IQ session is presented…
The Impact of Text Browsing on Text Retrieval Performance.
ERIC Educational Resources Information Center
Bodner, Richard C.; Chignell, Mark H.; Charoenkitkarn, Nipon; Golovchinsky, Gene; Kopak, Richard W.
2001-01-01
Compares empirical results from three experiments using Text Retrieval Conference (TREC) data and search topics that involved three different user interfaces. Results show that marking Boolean queries on text, which encourages browsing, and hypertext interfaces to text retrieval systems can benefit recall and can also benefit novice users.…
BROWSER: An Automatic Indexing On-Line Text Retrieval System. Annual Progress Report.
ERIC Educational Resources Information Center
Williams, J. H., Jr.
The development and testing of the Browsing On-line With Selective Retrieval (BROWSER) text retrieval system allowing a natural language query statement and providing on-line browsing capabilities through an IBM 2260 display terminal is described. The prototype system contains data bases of 25,000 German language patent abstracts, 9,000 English…
Testing of a Natural Language Retrieval System for a Full Text Knowledge Base.
ERIC Educational Resources Information Center
Bernstein, Lionel M.; Williamson, Robert E.
1984-01-01
The Hepatitis Knowledge Base (text of prototype information system) was used for modifying and testing "A Navigator of Natural Language Organized (Textual) Data" (ANNOD), a retrieval system which combines probabilistic, linguistic, and empirical means to rank individual paragraphs of full text for similarity to natural language queries…
ERIC Educational Resources Information Center
Cornell Univ., Ithaca, NY. Dept. of Computer Science.
On-line retrieval system design is discussed in the two papers which make up Part Five of this report on Salton's Magical Automatic Retriever of Texts (SMART) project. The first paper, "A Prototype On-Line Document Retrieval System" by D. Williamson and R. Williamson, outlines a design for a SMART on-line document retrieval system…
Development of a full-text information retrieval system
DOE Office of Scientific and Technical Information (OSTI.GOV)
Keizo Oyama; Akira Miyazawa; Atsuhiro Takasu; Kouji Shibano
The authors have executed a project to realize a full-text information retrieval system. The system is designed to deal with a document database comprising full text of a large number of documents such as academic papers. The document structures are utilized in searching and extracting appropriate information. The concept of structure handling and the configuration of the system are described in this paper.
ERIC Educational Resources Information Center
Tauchert, Wolfgang; And Others
1991-01-01
Describes the PADOK-II project in Germany, which was designed to give information on the effects of linguistic algorithms on retrieval in a full-text database, the German Patent Information System (GPI). Relevance assessments are discussed, statistical evaluations are described, and searches are compared for the full-text section versus the…
Image/text automatic indexing and retrieval system using context vector approach
NASA Astrophysics Data System (ADS)
Qing, Kent P.; Caid, William R.; Ren, Clara Z.; McCabe, Patrick
1995-11-01
Thousands of documents and images are generated daily both on and off line on the information superhighway and other media. Storage technology has improved rapidly to handle these data but indexing this information is becoming very costly. HNC Software Inc. has developed a technology for automatic indexing and retrieval of free text and images. This technique is demonstrated and is based on the concept of 'context vectors' which encode a succinct representation of the associated text and features of sub-images. In this paper, we will describe the Automated Librarian System which was designed for free text indexing and the Image Content Addressable Retrieval System (ICARS) which extends the technique from the text domain into the image domain. Both systems have the ability to automatically assign indices for a new document and/or image based on the content similarities in the database. ICARS also has the capability to retrieve images based on similarity of content using index terms, text description, and user-generated images as a query without performing segmentation or object recognition.
Creating and indexing teaching files from free-text patient reports.
Johnson, D. B.; Chu, W. W.; Dionisio, J. D.; Taira, R. K.; Kangarloo, H.
1999-01-01
Teaching files based on real patient data can enhance the education of students, staff and other colleagues. Although information retrieval systems can index free-text documents using keywords, these systems do not work well where content-bearing terms (e.g., anatomy descriptions) frequently appear. This paper describes a system that uses multi-word indexing terms to provide access to free-text patient reports. The utilization of multi-word indexing allows better modeling of the content of medical reports, thus improving retrieval performance. The method used to select indexing terms as well as early evaluation of retrieval performance is discussed. PMID:10566473
Combining approaches to on-line handwriting information retrieval
NASA Astrophysics Data System (ADS)
Peña Saldarriaga, Sebastián; Viard-Gaudin, Christian; Morin, Emmanuel
2010-01-01
In this work, we propose to combine two quite different approaches for retrieving handwritten documents. Our hypothesis is that different retrieval algorithms should retrieve different sets of documents for the same query. Therefore, significant improvements in retrieval performance can be expected. The first approach is based on information retrieval techniques carried out on the noisy texts obtained through handwriting recognition, while the second approach is recognition-free and uses a word spotting algorithm. Results show that for texts having a word error rate (WER) lower than 23%, the performance obtained with the combined system is close to the performance obtained on clean digital texts. In addition, for poorly recognized texts (WER > 52%), an improvement of nearly 17% can be observed with respect to the best available baseline method.
Documents Similarity Measurement Using Field Association Terms.
ERIC Educational Resources Information Center
Atlam, El-Sayed; Fuketa, M.; Morita, K.; Aoe, Jun-ichi
2003-01-01
Discussion of text analysis and information retrieval and measurement of document similarity focuses on a new text manipulation system called FA (field association)-Sim that is useful for retrieving information in large heterogeneous texts and for recognizing content similarity in text excerpts. Discusses recall and precision, automatic indexing…
World Wide Web Based Image Search Engine Using Text and Image Content Features
NASA Astrophysics Data System (ADS)
Luo, Bo; Wang, Xiaogang; Tang, Xiaoou
2003-01-01
Using both text and image content features, a hybrid image retrieval system for the World Wide Web is developed in this paper. We first use a text-based image meta-search engine to retrieve images from the Web based on the text information on the image host pages to provide an initial image set. Because of the high-speed and low-cost nature of the text-based approach, we can easily retrieve a broad coverage of images with a high recall rate and a relatively low precision. An image content based ordering is then performed on the initial image set. All the images are clustered into different folders based on the image content features. In addition, the images can be re-ranked by the content features according to the user feedback. Such a design makes it truly practical to use both text and image content for image retrieval over the Internet. Experimental results confirm the efficiency of the system.
Term-Weighting Approaches in Automatic Text Retrieval.
ERIC Educational Resources Information Center
Salton, Gerard; Buckley, Christopher
1988-01-01
Summarizes the experimental evidence that indicates that text indexing systems based on the assignment of appropriately weighted single terms produce retrieval results superior to those obtained with more elaborate text representations, and provides baseline single term indexing models with which more elaborate content analysis procedures can be…
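As an illustration of the weighted single-term indexing this abstract refers to, the following is a minimal sketch of tf-idf weighting with cosine normalization. The function names, the whitespace tokenizer, and the specific weighting variant (raw term frequency times log(N/df)) are assumptions chosen for illustration; they are not taken from the article.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, cosine-normalized."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Assumed variant: raw term frequency times log inverse document frequency.
        weights = {t: tf[t] * math.log(n_docs / df[t]) for t in tf}
        norm = math.sqrt(sum(w * w for w in weights.values()))
        vectors.append({t: w / norm for t, w in weights.items()} if norm else weights)
    return vectors

def cosine(u, v):
    """Dot product of two already-normalized sparse vectors."""
    return sum(w * v.get(t, 0.0) for t, w in u.items())
```

With vectors built this way, documents that share more of their distinctive terms receive a higher cosine score than documents that share fewer.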
Augmenting Oracle Text with the UMLS for enhanced searching of free-text medical reports.
Ding, Jing; Erdal, Selnur; Dhaval, Rakesh; Kamal, Jyoti
2007-10-11
The intrinsic complexity of free-text medical reports imposes great challenges for information retrieval systems. We have developed a prototype search engine for retrieving clinical reports that leverages the powerful indexing and querying capabilities of Oracle Text, and the rich biomedical domain knowledge and semantic structures that are captured in the UMLS Metathesaurus.
The State of Retrieval System Evaluation.
ERIC Educational Resources Information Center
Salton, Gerard
1992-01-01
The current state of information retrieval (IR) evaluation is reviewed with criticisms directed at the available test collections and the research and evaluation methodologies used, including precision and recall rates for online searches and laboratory tests not including real users. Automatic text retrieval systems are also discussed. (32…
Learned Vector-Space Models for Document Retrieval.
ERIC Educational Resources Information Center
Caid, William R.; And Others
1995-01-01
The Latent Semantic Indexing and MatchPlus systems examine similar contexts in which words appear and create representational models that capture the similarity of meaning of terms and then use the representation for retrieval. Text Retrieval Conference experiments using these systems demonstrate the computational feasibility of using…
Intelligent retrieval of medical images from the Internet
NASA Astrophysics Data System (ADS)
Tang, Yau-Kuo; Chiang, Ted T.
1996-05-01
The object of this study is to use Internet resources to provide a cost-effective, user-friendly method to access the medical image archive system and to provide an easy method for the user to identify the images required. This paper describes the prototype system architecture, the implementation, and results. In the study, we prototype the Intelligent Medical Image Retrieval (IMIR) system as a Hypertext Transfer Protocol server and provide Hypertext Markup Language forms through which a user, as an Internet client, enters image retrieval criteria for review using a browser. We are developing the intelligent retrieval engine, with the capability to map the free text search criteria to the standard terminology used for medical image identification. We evaluate retrieved records based on the number of the free text entries matched and their relevance level to the standard terminology. We are in the integration and testing phase. We have collected only a few different types of images for testing and have trained a few phrases to map the free text to the standard medical terminology. Nevertheless, we are able to demonstrate the IMIR's ability to search, retrieve, and review medical images from the archives using a general Internet browser. The prototype also uncovered potential problems in performance, security, and accuracy. Additional studies and enhancements will make the system clinically operational.
NASA Astrophysics Data System (ADS)
Antani, Sameer K.; Natarajan, Mukil; Long, Jonathan L.; Long, L. Rodney; Thoma, George R.
2005-04-01
The article describes the status of our ongoing R&D at the U.S. National Library of Medicine (NLM) towards the development of an advanced multimedia database biomedical information system that supports content-based image retrieval (CBIR). NLM maintains a collection of 17,000 digitized spinal X-rays along with text survey data from the Second National Health and Nutritional Examination Survey (NHANES II). These data serve as a rich data source for epidemiologists and researchers of osteoarthritis and musculoskeletal diseases. It is currently possible to access these through text keyword queries using our Web-based Medical Information Retrieval System (WebMIRS). CBIR methods developed specifically for biomedical images could offer direct visual searching of these images by means of example image or user sketch. We are building a system which supports hybrid queries that have text and image-content components. R&D goals include developing algorithms for robust image segmentation for localizing and identifying relevant anatomy, labeling the segmented anatomy based on its pathology, developing suitable indexing and similarity matching methods for images and image features, and associating the survey text information for query and retrieval along with the image data. Some highlights of the system developed in MATLAB and Java are: use of a networked or local centralized database for text and image data; flexibility to incorporate new research work; provides a means to control access to system components under development; and use of XML for structured reporting. The article details the design, features, and algorithms in this third revision of this prototype system, CBIR3.
A User Interface for Multiple Retrieval Systems.
ERIC Educational Resources Information Center
Teskey, Niall; And Others
1987-01-01
Reviews current systems designed to help end-users search online databases without the assistance of an intermediary and describes a prototype system which emulates the Deco (the text storage and retrieval system used by Unilever) interface on Dialog and Data-Star. Initial trials of the prototype system are reported. (15 references) (MES)
A content-based news video retrieval system: NVRS
NASA Astrophysics Data System (ADS)
Liu, Huayong; He, Tingting
2009-10-01
This paper focuses on TV news programs and designs a content-based news video browsing and retrieval system, NVRS, which lets users quickly browse and retrieve news video by categories such as politics, finance, and amusement. Combining audiovisual features and caption text information, the system automatically segments a complete news program into separate news stories. NVRS supports keyword-based news story retrieval and category-based news story browsing, and generates a key-frame-based video abstract for each story. Experiments show that the method of story segmentation is effective and the retrieval is also efficient.
ERIC Educational Resources Information Center
Proceedings of the ASIS Annual Meeting, 1993
1993-01-01
Presents abstracts of 34 special interest group (SIG) sessions. Highlights include humanities scholars and electronic texts; information retrieval and indexing systems design; automated indexing; domain analysis; query expansion in document retrieval systems; thesauri; business intelligence; Americans with Disabilities Act; management;…
Roogle: an information retrieval engine for clinical data warehouse.
Cuggia, Marc; Garcelon, Nicolas; Campillo-Gimenez, Boris; Bernicot, Thomas; Laurent, Jean-François; Garin, Etienne; Happe, André; Duvauferrier, Régis
2011-01-01
A large amount of relevant information is contained in reports stored in the electronic patient records and associated metadata. R-oogle is a project aiming at developing information retrieval engines adapted to these reports and designed for clinicians. The system consists of a data warehouse (full-text reports and structured data) imported from two different hospital information systems. Information retrieval is performed using metadata-based semantic and full-text search methods (as in Google). Applications may include biomarker identification in a translational approach, search for specific cases, constitution of cohorts, professional practice evaluation, and quality control assessment.
Text Retrieval Systems in 1991: The Market, Vendors, Evaluators and Users.
ERIC Educational Resources Information Center
Koulopoulos, Thomas M.
1992-01-01
Reports results of a 1991 survey of vendors, users, and evaluators of text retrieval products. Data are presented and discussed on product revenue trends, sales by industry, individuals responsible for evaluation and purchase decisions, product installations by platform, relative importance of key benefits and functionalities, return on…
Lin, Jimmy
2008-01-01
Background: Graph analysis algorithms such as PageRank and HITS have been successful in Web environments because they are able to extract important inter-document relationships from manually-created hyperlinks. We consider the application of these techniques to biomedical text retrieval. In the current PubMed® search interface, a MEDLINE® citation is connected to a number of related citations, which are in turn connected to other citations. Thus, a MEDLINE record represents a node in a vast content-similarity network. This article explores the hypothesis that these networks can be exploited for text retrieval, in the same manner as hyperlink graphs on the Web. Results: We conducted a number of reranking experiments using the TREC 2005 genomics track test collection in which scores extracted from PageRank and HITS analysis were combined with scores returned by an off-the-shelf retrieval engine. Experiments demonstrate that incorporating PageRank scores yields significant improvements in terms of standard ranked-retrieval metrics. Conclusion: The link structure of content-similarity networks can be exploited to improve the effectiveness of information retrieval systems. These results generalize the applicability of graph analysis algorithms to text retrieval in the biomedical domain. PMID:18538027
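The reranking idea in this abstract, combining graph-derived scores with retrieval-engine scores, can be sketched as follows. The abstract does not say how the scores were combined, so the linear interpolation, the max-normalization, the `weight` parameter, and all function names here are assumptions for illustration only. The sketch expects every document in `retrieval_scores` and every link target to appear as a key in `graph`.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank over {node: [neighbor, ...]}."""
    nodes = list(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = graph[node] or nodes  # dangling node: spread evenly
            share = damping * rank[node] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank

def rerank(retrieval_scores, graph, weight=0.8):
    """Interpolate max-normalized retrieval and PageRank scores (assumed scheme)."""
    pr = pagerank(graph)
    max_ret = max(retrieval_scores.values())
    max_pr = max(pr.values())
    combined = {
        doc: weight * score / max_ret + (1.0 - weight) * pr[doc] / max_pr
        for doc, score in retrieval_scores.items()
    }
    return sorted(combined, key=combined.get, reverse=True)
```

In this sketch a document that is slightly behind on retrieval score but central in the similarity graph can move ahead of the top-scored document; the size of that effect is controlled by the hypothetical `weight` parameter.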
Comparing the Document Representations of Two IR-Systems: CLARIT and TOPIC.
ERIC Educational Resources Information Center
Paijmans, Hans
1993-01-01
Compares two information retrieval systems, CLARIT and TOPIC, in terms of assigned versus derived and precoordinate versus postcoordinate indexing. Models of information retrieval systems are discussed, and a test of the systems using a demonstration database of full-text articles from the "Wall Street Journal" is described. (Contains 21…
The Video PATSEARCH System: An Interview with Peter Urbach.
ERIC Educational Resources Information Center
Videodisc/Videotext, 1982
1982-01-01
The Video PATSEARCH system consists of a microcomputer with a special keyboard and two display screens which accesses the PATSEARCH database of United States government patents on the Bibliographic Retrieval Services (BRS) search system. The microcomputer retrieves text from BRS and matching graphics from an analog optical videodisc. (Author/JJD)
The TREC Interactive Track: An Annotated Bibliography.
ERIC Educational Resources Information Center
Over, Paul
2001-01-01
Discussion of the study of interactive information retrieval (IR) at the Text Retrieval Conferences (TREC) focuses on summaries of the Interactive Track at each conference. Describes evolution of the track, which has changed from comparing human-machine systems with fully automatic systems to comparing interactive systems that focus on the search…
Hypertext Image Retrieval: The Evolution of an Application.
ERIC Educational Resources Information Center
Roberts, G. Louis; Kenney, Carol E.
1991-01-01
Describes the development and implementation of a full-text image retrieval system at the Boeing Commercial Airplane Group. The conversion of card formats to a microcomputer-based system using HyperCard is described; the online system architecture is explained; and future plans are discussed, including conversion to digital images. (LRW)
An integrated content and metadata based retrieval system for art.
Lewis, Paul H; Martinez, Kirk; Abas, Fazly Salleh; Fauzi, Mohammad Faizal Ahmad; Chan, Stephen C Y; Addis, Matthew J; Boniface, Mike J; Grimwood, Paul; Stevenson, Alison; Lahanier, Christian; Stevenson, James
2004-03-01
A new approach to image retrieval is presented in the domain of museum and gallery image collections. Specialist algorithms, developed to address specific retrieval tasks, are combined with more conventional content and metadata retrieval approaches, and implemented within a distributed architecture to provide cross-collection searching and navigation in a seamless way. External systems can access the different collections using interoperability protocols and open standards, which were extended to accommodate content based as well as text based retrieval paradigms. After a brief overview of the complete system, we describe the novel design and evaluation of some of the specialist image analysis algorithms including a method for image retrieval based on sub-image queries, retrievals based on very low quality images and retrieval using canvas crack patterns. We show how effective retrieval results can be achieved by real end-users consisting of major museums and galleries, accessing the distributed but integrated digital collections.
Laboratory Experiments with Okapi: Participation in the TREC Programme.
ERIC Educational Resources Information Center
Robertson, S. E.; And Others
1997-01-01
Summarizes the development of information retrieval evaluation ideas, describes the design of the TREC (Text Retrieval Conference) experiments, and discusses the Okapi team's participation in TREC. Highlights include the Cranfield projects that tested the principles of information retrieval system design, test collections, weighting functions,…
TES: A Text Extraction System.
ERIC Educational Resources Information Center
Goh, A.; Hui, S. C.
1996-01-01
Describes how TES, a text extraction system, is able to electronically retrieve a set of sentences from a document to form an indicative abstract. Discusses various text abstraction techniques and related work in the area, provides an overview of the TES system, and compares system results against manually produced abstracts. (LAM)
The Effects of Noisy Data on Text Retrieval.
ERIC Educational Resources Information Center
Taghva, Kazem; And Others
1994-01-01
Discusses the use of optical character recognition (OCR) for inputting documents in an information retrieval system and describes a study that used an OCR-generated database and its corresponding corrected version to examine query evaluation in the presence of noisy data. Scanning technology, recognition technology, and retrieval technology are…
Task-Oriented Access to Data Files: An Evaluation.
ERIC Educational Resources Information Center
Watters, Carolyn; And Others
1994-01-01
Discussion of information retrieval highlights DalText, a prototype information retrieval system that provides access to nonindexed textual data files where the mode of access is determined by the user based on the task at hand. A user study is described that was conducted at Dalhousie University (Nova Scotia) to test DalText. (Contains 23…
Evaluating Combinations of Ranked Lists and Visualizations of Inter-Document Similarity.
ERIC Educational Resources Information Center
Allan, James; Leuski, Anton; Swan, Russell; Byrd, Donald
2001-01-01
Considers how ideas from document clustering can be used to improve retrieval accuracy of ranked lists in interactive systems and how to evaluate system effectiveness. Describes a TREC (Text Retrieval Conference) study that constructed and evaluated systems that present the user with ranked lists and a visualization of inter-document similarities.…
An evaluation of information retrieval accuracy with simulated OCR output
DOE Office of Scientific and Technical Information (OSTI.GOV)
Croft, W.B.; Harding, S.M.; Taghva, K.
Optical Character Recognition (OCR) is a critical part of many text-based applications. Although some commercial systems use the output from OCR devices to index documents without editing, there is very little quantitative data on the impact of OCR errors on the accuracy of a text retrieval system. Because of the difficulty of constructing test collections to obtain this data, we have carried out evaluation using simulated OCR output on a variety of databases. The results show that high quality OCR devices have little effect on the accuracy of retrieval, but low quality devices used with databases of short documents can result in significant degradation.
A tutorial on information retrieval: basic terms and concepts
Zhou, Wei; Smalheiser, Neil R; Yu, Clement
2006-01-01
This informal tutorial is intended for investigators and students who would like to understand the workings of information retrieval systems, including the most frequently used search engines: PubMed and Google. Having a basic knowledge of the terms and concepts of information retrieval should improve the efficiency and productivity of searches. As well, this knowledge is needed in order to follow current research efforts in biomedical information retrieval and text mining that are developing new systems not only for finding documents on a given topic, but extracting and integrating knowledge across documents. PMID:16722601
What Friends Are For: Collaborative Intelligence Analysis and Search
2014-06-01
14. SUBJECT TERMS Intelligence Community, information retrieval, recommender systems, search engines, social networks, user profiling, Lucene...improvements over existing search systems. The improvements are shown to be robust to high levels of human error and low similarity between users ...precision NOLH nearly orthogonal Latin hypercubes P@ precision at documents RS recommender systems TREC Text REtrieval Conference USM user
Microsoft Research at TREC 2009. Web and Relevance Feedback Tracks
2009-11-01
Information Processing Systems, pages 193–200, 2006. [2] J. M. Kleinberg. Authoritative sources in a hyperlinked environment. In Proc. of the 9th...Walker, S. Jones, M. Hancock-Beaulieu, and M. Gatford. Okapi at TREC-3. In Proc. of the 3rd Text REtrieval Conference, 1994. [8] J. J. Rocchio. Relevance...feedback in information retrieval. In Gerard Salton, editor, The SMART Retrieval System - Experiments in Automatic Document Processing. Prentice Hall
NASA Technical Reports Server (NTRS)
Ambur, Manjula Y.; Adams, David L.; Trinidad, P. Paul
1997-01-01
NASA Langley Technical Library has been involved in developing systems for full-text information delivery of NACA/NASA technical reports since 1991. This paper will describe the two prototypes it has developed and the present production system configuration. The prototype systems are a NACA CD-ROM of thirty-three classic paper NACA reports and a network-based Full-text Electronic Reports Documents System (FEDS) constructed from both paper and electronic formats of NACA and NASA reports. The production system is the DigiDoc System (DIGItal Documents) presently being developed based on the experiences gained from the two prototypes. DigiDoc configuration integrates the on-line catalog database World Wide Web interface and PDF technology to provide a powerful and flexible search and retrieval system. It describes in detail significant achievements and lessons learned in terms of data conversion, storage technologies, full-text searching and retrieval, and image databases. The conclusions from the experiences of digitization and full-text access and future plans for DigiDoc system implementation are discussed.
Robust keyword retrieval method for OCRed text
NASA Astrophysics Data System (ADS)
Fujii, Yusaku; Takebe, Hiroaki; Tanaka, Hiroshi; Hotta, Yoshinobu
2011-01-01
Document management systems have become important because of the growing popularity of electronic filing of documents and scanning of books, magazines, manuals, etc., through a scanner or a digital camera, for storage or reading on a PC or an electronic book. Text information acquired by optical character recognition (OCR) is usually added to the electronic documents for document retrieval. Since texts generated by OCR generally include character recognition errors, robust retrieval methods have been introduced to overcome this problem. In this paper, we propose a retrieval method that is robust against both character segmentation and recognition errors. In the proposed method, the insertion of noise characters and dropping of characters in the keyword retrieval enables robustness against character segmentation errors, and character substitution in the keyword of the recognition candidate for each character in OCR or any other character enables robustness against character recognition errors. The recall rate of the proposed method was 15% higher than that of the conventional method. However, the precision rate was 64% lower.
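Keyword search that tolerates inserted noise characters, dropped characters, and substituted characters, as described in this abstract, can be approximated with standard approximate substring matching. The sketch below is not the authors' method: it swaps in plain edit-distance matching with uniform costs and does not use OCR recognition candidates. The function names and the `max_errors` threshold are assumptions.

```python
def best_match_distance(keyword, text):
    """Smallest edit distance between keyword and any substring of text."""
    # previous[j]: fewest edits to align keyword[:j] with text ending at the
    # previous position; before any text is read that is j dropped characters.
    previous = list(range(len(keyword) + 1))
    best = previous[-1]
    for ch in text:
        current = [0]  # a match may start at any text position at no cost
        for j, kch in enumerate(keyword, start=1):
            current.append(min(
                previous[j] + 1,                # noise character inserted in text
                current[j - 1] + 1,             # keyword character dropped from text
                previous[j - 1] + (ch != kch),  # exact match or substitution
            ))
        previous = current
        best = min(best, previous[-1])
    return best

def contains_keyword(keyword, text, max_errors=1):
    """True if keyword occurs in text with at most max_errors edits."""
    return best_match_distance(keyword, text) <= max_errors
```

Raising the hypothetical `max_errors` threshold admits more damaged occurrences but also more false matches, which is the same recall-versus-precision trade-off the abstract reports.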
Medical Language Processing for Knowledge Representation and Retrievals
Lyman, Margaret; Sager, Naomi; Chi, Emile C.; Tick, Leo J.; Nhan, Ngo Thanh; Su, Yun; Borst, Francois; Scherrer, Jean-Raoul
1989-01-01
The Linguistic String Project-Medical Language Processor, a system for computer analysis of narrative patient documents in English, is being adapted for French Lettres de Sortie. The system converts the free-text input to a semantic representation which is then mapped into a relational database. Retrievals of clinical data from the database are described.
An Optical Disk-Based Information Retrieval System.
ERIC Educational Resources Information Center
Bender, Avi
1988-01-01
Discusses a pilot project by the Nuclear Regulatory Commission to apply optical disk technology to the storage and retrieval of documents related to its high level waste management program. Components and features of the microcomputer-based system which provides full-text and image access to documents are described. A sample search is included.…
Dugan, J M; Berrios, D C; Liu, X; Kim, D K; Kaizer, H; Fagan, L M
1999-01-01
Our group has built an information retrieval system based on a complex semantic markup of medical textbooks. We describe the construction of a set of web-based knowledge-acquisition tools that expedites the collection and maintenance of the concepts required for text markup and the search interface required for information retrieval from the marked text. In the text markup system, domain experts (DEs) identify sections of text that contain one or more elements from a finite set of concepts. End users can then query the text using a predefined set of questions, each of which identifies a subset of complementary concepts. The search process matches that subset of concepts to relevant points in the text. The current process requires that the DE invest significant time to generate the required concepts and questions. We propose a new system--called ACQUIRE (Acquisition of Concepts and Queries in an Integrated Retrieval Environment)--that assists a DE in two essential tasks in the text-markup process. First, it helps her to develop, edit, and maintain the concept model: the set of concepts with which she marks the text. Second, ACQUIRE helps her to develop a query model: the set of specific questions that end users can later use to search the marked text. The DE incorporates concepts from the concept model when she creates the questions in the query model. The major benefit of the ACQUIRE system is a reduction in the time and effort required for the text-markup process. We compared the process of concept- and query-model creation using ACQUIRE to the process used in previous work by rebuilding two existing models that we previously constructed manually. We observed a significant decrease in the time required to build and maintain the concept and query models.
Computer retrieval of bibliographies using an editing program
Brethauer, G.E.; Brokaw, V.L.
1979-01-01
A simple program permits use of the text editor 'qedx,' part of many computer systems, to input bibliographic entries and to retrieve specific entries which contain keywords of interest. Multiple keywords may be used sequentially to find specific entries.
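The sequential keyword narrowing described in this abstract can be illustrated with a minimal Python sketch. It is not the qedx-based program itself: the function name `find_entries` and the sample entries are hypothetical, and case-insensitive substring matching is an assumption.

```python
def find_entries(entries, keywords):
    """Return the entries that contain every keyword, narrowing one keyword at a time."""
    matches = list(entries)
    for keyword in keywords:
        needle = keyword.lower()
        # Each pass keeps only the entries that also contain this keyword.
        matches = [entry for entry in matches if needle in entry.lower()]
    return matches


# Hypothetical bibliographic entries, made up for illustration only.
entries = [
    "Smith, J. Seismic monitoring of test sites. 1975.",
    "Jones, A. Groundwater flow near test sites. 1977.",
    "Lee, K. Seismic hazard mapping. 1978.",
]
result = find_entries(entries, ["seismic", "test sites"])
```

Applying the keywords one after another mirrors the "used sequentially" wording: each keyword searches only what survived the previous one.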
Information retrieval algorithms: A survey
DOE Office of Scientific and Technical Information (OSTI.GOV)
Raghavan, P.
We give an overview of some algorithmic problems arising in the representation of text/image/multimedia objects in a form amenable to automated searching, and in conducting these searches efficiently. These operations are central to information retrieval and digital library systems.
Natural language information retrieval in digital libraries
DOE Office of Scientific and Technical Information (OSTI.GOV)
Strzalkowski, T.; Perez-Carballo, J.; Marinescu, M.
In this paper we report on some recent developments in the joint NYU and GE natural language information retrieval system. The main characteristic of this system is the use of advanced natural language processing to enhance the effectiveness of term-based document retrieval. The system is designed around a traditional statistical backbone consisting of the indexer module, which builds inverted index files from pre-processed documents, and a retrieval engine which searches and ranks the documents in response to user queries. Natural language processing is used to (1) preprocess the documents in order to extract content-carrying terms, (2) discover inter-term dependencies and build a conceptual hierarchy specific to the database domain, and (3) process user's natural language requests into effective search queries. This system has been used in NIST-sponsored Text Retrieval Conferences (TREC), where we worked with approximately 3.3 GBytes of text articles including material from the Wall Street Journal, the Associated Press newswire, the Federal Register, Ziff Communications's Computer Library, Department of Energy abstracts, U.S. Patents and the San Jose Mercury News, totaling more than 500 million words of English. The system has been designed to facilitate its scalability to deal with ever increasing amounts of data. In particular, a randomized index-splitting mechanism has been installed which allows the system to create a number of smaller indexes that can be independently and efficiently searched.
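The idea of an inverted index split at random into smaller, independently searchable indexes can be sketched as follows. This is a toy illustration under stated assumptions, not the NYU/GE implementation: the function names, the whitespace tokenizer, and the summed term-frequency score are all hypothetical stand-ins for the system's actual indexing and ranking.

```python
import random
from collections import Counter, defaultdict


def build_index(docs):
    """Build an inverted index: term -> {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, tf in Counter(text.lower().split()).items():
            index[term][doc_id] = tf
    return index


def split_and_index(docs, n_parts, seed=0):
    """Assign each document to a random sub-collection, then index each part separately."""
    rng = random.Random(seed)
    parts = [dict() for _ in range(n_parts)]
    for doc_id, text in docs.items():
        parts[rng.randrange(n_parts)][doc_id] = text
    return [build_index(part) for part in parts]


def search(indexes, query):
    """Search every sub-index independently and merge the scores into one ranking."""
    scores = Counter()
    for index in indexes:
        for term in query.lower().split():
            for doc_id, tf in index.get(term, {}).items():
                scores[doc_id] += tf  # assumed score: summed term frequency
    # Highest score first; ties broken by document id for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical mini-collection, made up for illustration only.
docs = {
    "d1": "patent filing for index compression",
    "d2": "newswire report on energy policy",
    "d3": "energy patent energy abstract",
}
ranked = search(split_and_index(docs, n_parts=2), "energy patent")
```

Because each document lives in exactly one sub-index and this toy score uses no collection-wide statistics, the merged ranking matches what a single large index would return.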
Words, concepts, or both: optimal indexing units for automated information retrieval.
Hersh, W. R.; Hickam, D. H.; Leone, T. J.
1992-01-01
What is the best way to represent the content of documents in an information retrieval system? This study compares the retrieval effectiveness of five different methods for automated (machine-assigned) indexing using three test collections. The consistently best methods are those that use indexing based on the words that occur in the available text of each document. Methods used to map text into concepts from a controlled vocabulary showed no advantage over the word-based methods. This study also looked at an approach to relevance feedback which showed benefit for both word-based and concept-based methods. PMID:1482951
Comprehension and retrieval of failure cases in airborne observatories
NASA Technical Reports Server (NTRS)
Alvarado, Sergio J.; Mock, Kenrick J.
1995-01-01
This paper describes research dealing with the computational problem of analyzing and repairing failures of electronic and mechanical systems of telescopes in NASA's airborne observatories, such as KAO (Kuiper Airborne Observatory) and SOFIA (Stratospheric Observatory for Infrared Astronomy). The research has resulted in the development of an experimental system that acquires knowledge of failure analysis from input text, and answers questions regarding failure detection and correction. The system's design builds upon previous work on text comprehension and question answering, including: knowledge representation for conceptual analysis of failure descriptions, strategies for mapping natural language into conceptual representations, case-based reasoning strategies for memory organization and indexing, and strategies for memory search and retrieval. These techniques have been combined into a model that accounts for: (a) how to build a knowledge base of system failures and repair procedures from descriptions that appear in telescope-operators' logbooks and FMEA (failure modes and effects analysis) manuals; and (b) how to use that knowledge base to search and retrieve answers to questions about causes and effects of failures, as well as diagnosis and repair procedures. This model has been implemented in FANSYS (Failure ANalysis SYStem), a prototype text comprehension and question answering program for failure analysis.
Comprehension and retrieval of failure cases in airborne observatories
NASA Astrophysics Data System (ADS)
Alvarado, Sergio J.; Mock, Kenrick J.
1995-05-01
This paper describes research dealing with the computational problem of analyzing and repairing failures of electronic and mechanical systems of telescopes in NASA's airborne observatories, such as KAO (Kuiper Airborne Observatory) and SOFIA (Stratospheric Observatory for Infrared Astronomy). The research has resulted in the development of an experimental system that acquires knowledge of failure analysis from input text, and answers questions regarding failure detection and correction. The system's design builds upon previous work on text comprehension and question answering, including: knowledge representation for conceptual analysis of failure descriptions, strategies for mapping natural language into conceptual representations, case-based reasoning strategies for memory organization and indexing, and strategies for memory search and retrieval. These techniques have been combined into a model that accounts for: (a) how to build a knowledge base of system failures and repair procedures from descriptions that appear in telescope-operators' logbooks and FMEA (failure modes and effects analysis) manuals; and (b) how to use that knowledge base to search and retrieve answers to questions about causes and effects of failures, as well as diagnosis and repair procedures. This model has been implemented in FANSYS (Failure ANalysis SYStem), a prototype text comprehension and question answering program for failure analysis.
Designing a Syntax-Based Retrieval System for Supporting Language Learning
ERIC Educational Resources Information Center
Tsao, Nai-Lung; Kuo, Chin-Hwa; Wible, David; Hung, Tsung-Fu
2009-01-01
In this paper, we propose a syntax-based text retrieval system for on-line language learning and use a fast regular expression search engine as its main component. Regular expression searches provide more scalable querying and search results than keyword-based searches. However, without a well-designed index scheme, the execution time of regular…
Recent Experiments with INQUERY
1995-11-01
were conducted with version of the INQUERY information retrieval system. INQUERY is based on the Bayesian inference network retrieval model. It is...corpus based query expansion. For TREC a subset of the adhoc document set was used to build the InFinder database. None of the...experiments that showed significant improvements in retrieval effectiveness when document rankings based on the entire document text are combined with
ERIC Educational Resources Information Center
Melton, Jessica S.
Objectives of this project were to develop and test a method for automatically processing the text of abstracts for a document retrieval system. The test corpus consisted of 768 abstracts from the metallurgical section of Chemical Abstracts (CA). The system, based on a subject indexing rationale, had two components: (1) a stored dictionary of words…
A Usability Case Study Using TREC and ZPRISE.
ERIC Educational Resources Information Center
Downey, Laura L.; Tice, Dawn M.
1999-01-01
Examines the challenges involved in conducting an informal usability case study based on the introduction of a new information-retrieval system to experienced users. Identifies problems users were having with TREC (Text Retrieval Conference) and examines the usability of the new ZPRISE interface. (Author/LRW)
An architecture for diversity-aware search for medical web content.
Denecke, K
2012-01-01
The Web provides a huge source of information, including on medical and health-related issues. In particular the content of medical social media data can be diverse due to the background of an author, the source or the topic. Diversity in this context means that a document covers different aspects of a topic or a topic is described in different ways. In this paper, we introduce an approach that makes it possible to consider the diverse aspects of a search query when providing retrieval results to a user. We introduce a system architecture for a diversity-aware search engine that allows retrieving medical information from the web. The diversity of retrieval results is assessed by calculating diversity measures that rely upon semantic information derived from a mapping to concepts of a medical terminology. Considering these measures, the result set is diversified by ranking more diverse texts higher. The methods and system architecture are implemented in a retrieval engine for medical web content. The diversity measures reflect the diversity of aspects considered in a text and its type of information content. They are used for result presentation, filtering and ranking. In a user evaluation we assess the user satisfaction with an ordering of retrieval results that considers the diversity measures. It is shown through the evaluation that diversity-aware retrieval considering diversity measures in ranking could increase the user satisfaction with retrieval results.
ERIC Educational Resources Information Center
National Library of Australia, Canberra.
This manual is designed to provide an introduction and basic guide to the use of IBM's Advanced Text Management System (ATMS), the text processing system to be used for the creation of Australian data bases within AUSINET. Instructions are provided for using the system to enter, store, retrieve, and modify data, which may then be displayed at the…
The Ecological Approach to Text Visualization.
ERIC Educational Resources Information Center
Wise, James A.
1999-01-01
Presents both theoretical and technical bases on which to build a "science of text visualization." The Spatial Paradigm for Information Retrieval and Exploration (SPIRE) text-visualization system, which images information from free-text documents as natural terrains, serves as an example of the "ecological approach" in its visual metaphor, its…
Dugan, J. M.; Berrios, D. C.; Liu, X.; Kim, D. K.; Kaizer, H.; Fagan, L. M.
1999-01-01
Our group has built an information retrieval system based on a complex semantic markup of medical textbooks. We describe the construction of a set of web-based knowledge-acquisition tools that expedites the collection and maintenance of the concepts required for text markup and the search interface required for information retrieval from the marked text. In the text markup system, domain experts (DEs) identify sections of text that contain one or more elements from a finite set of concepts. End users can then query the text using a predefined set of questions, each of which identifies a subset of complementary concepts. The search process matches that subset of concepts to relevant points in the text. The current process requires that the DE invest significant time to generate the required concepts and questions. We propose a new system--called ACQUIRE (Acquisition of Concepts and Queries in an Integrated Retrieval Environment)--that assists a DE in two essential tasks in the text-markup process. First, it helps her to develop, edit, and maintain the concept model: the set of concepts with which she marks the text. Second, ACQUIRE helps her to develop a query model: the set of specific questions that end users can later use to search the marked text. The DE incorporates concepts from the concept model when she creates the questions in the query model. The major benefit of the ACQUIRE system is a reduction in the time and effort required for the text-markup process. We compared the process of concept- and query-model creation using ACQUIRE to the process used in previous work by rebuilding two existing models that we previously constructed manually. We observed a significant decrease in the time required to build and maintain the concept and query models. PMID:10566457
NASA Technical Reports Server (NTRS)
2002-01-01
A system that retrieves problem reports from a NASA database is described. The database is queried with natural language questions. Part-of-speech tags are first assigned to each word in the question using a rule based tagger. A partial parse of the question is then produced with independent sets of deterministic finite state automata. Using partial parse information, a look up strategy searches the database for problem reports relevant to the question. A bigram stemmer and irregular verb conjugates have been incorporated into the system to improve accuracy. The system is evaluated by a set of fifty five questions posed by NASA engineers. A discussion of future research is also presented.
The man/machine interface in information retrieval: Providing access to the casual user
NASA Technical Reports Server (NTRS)
Dominick, Wayne D. (Editor); Granier, Martin
1984-01-01
This study is concerned with the difficulties encountered by casual users wishing to employ Information Storage and Retrieval Systems. A casual user is defined as a professional who has neither time nor desire to pursue in depth the study of the numerous and varied retrieval systems. His needs for on-line search are only occasional, and not limited to any particular system. The paper takes a close look at the state of the art of research concerned with aiding casual users of Information Storage and Retrieval Systems. Current experiments such as LEXIS, CONIT, IIDA, CITE, and CCL are presented and discussed. Comments and proposals are offered, specifically in the areas of training, learning and cost as experienced by the casual user. An extensive bibliography of recent works on the subject follows the text.
Developing a Large Lexical Database for Information Retrieval, Parsing, and Text Generation Systems.
ERIC Educational Resources Information Center
Conlon, Sumali Pin-Ngern; And Others
1993-01-01
Important characteristics of lexical databases and their applications in information retrieval and natural language processing are explained. An ongoing project using various machine-readable sources to build a lexical database is described, and detailed designs of individual entries with examples are included. (Contains 66 references.) (EAM)
ERIC Educational Resources Information Center
Belkin, N. J.; Cool, C.; Kelly, D.; Lin, S. -J.; Park, S. Y.; Perez-Carballo, J.; Sikora, C.
2001-01-01
Reports on the progressive investigation of techniques for supporting interactive query reformulation in the TREC (Text Retrieval Conference) Interactive Track. Highlights include methods of term suggestion; interface design to support different system functionalities; an overview of each year's TREC investigation; and relevance to the development…
World Key Information Service System Designed For EPCOT Center
NASA Astrophysics Data System (ADS)
Kelsey, J. A.
1984-03-01
An advanced Bell Laboratories and Western Electric designed electronic information retrieval system utilizing the latest Information Age technologies, and a fiber optic transmission system is featured at the Walt Disney World Resort's newest theme park - The Experimental Prototype Community of Tomorrow (EPCOT Center). The project is an interactive audio, video and text information system that is deployed at key locations within the park. The touch sensitive terminals utilizing the ARIEL (Automatic Retrieval of Information Electronically) System are interconnected by a Western Electric designed and manufactured lightwave transmission system.
Protein Annotators' Assistant: A Novel Application of Information Retrieval Techniques.
ERIC Educational Resources Information Center
Wise, Michael J.
2000-01-01
Protein Annotators' Assistant (PAA) is a software system which assists protein annotators in assigning functions to newly sequenced proteins. PAA employs a number of information retrieval techniques in a novel setting and is thus related to text categorization, where multiple categories may be suggested, except that in this case none of the…
Desktop Social Science: Coming of Age.
ERIC Educational Resources Information Center
Dwyer, David C.; And Others
Beginning in 1985, Apple Computer, Inc. and several school districts began a collaboration to examine the impact of intensive computer use on instruction and learning in K-12 classrooms. This paper follows the development of a Macintosh II-based management and retrieval system for text data undertaken to store and retrieve oral reflections of…
Automated Text Markup for Information Retrieval from an Electronic Textbook of Infectious Disease
Berrios, Daniel C.; Kehler, Andrew; Kim, David K.; Yu, Victor L.; Fagan, Lawrence M.
1998-01-01
The information needs of practicing clinicians frequently require textbook or journal searches. Making these sources available in electronic form improves the speed of these searches, but precision (i.e., the fraction of relevant to total documents retrieved) remains low. Improving the traditional keyword search by transforming search terms into canonical concepts does not improve search precision greatly. Kim et al. have designed and built a prototype system (MYCIN II) for computer-based information retrieval from a forthcoming electronic textbook of infectious disease. The system requires manual indexing by experts in the form of complex text markup. However, this mark-up process is time consuming (about 3 person-hours to generate, review, and transcribe the index for each of 218 chapters). We have designed and implemented a system to semiautomate the markup process. The system, information extraction for semiautomated indexing of documents (ISAID), uses query models and existing information-extraction tools to provide support for any user, including the author of the source material, to mark up tertiary information sources quickly and accurately.
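The precision measure defined in this abstract (the fraction of retrieved documents that are relevant) can be made concrete with a short sketch. The function name and the chapter identifiers below are hypothetical, and returning 0.0 for an empty result set is an assumed convention, not something the abstract specifies.

```python
def precision(retrieved, relevant):
    """Fraction of the retrieved documents that are relevant."""
    retrieved = set(retrieved)
    if not retrieved:
        return 0.0  # assumed convention when nothing is retrieved
    return len(retrieved & set(relevant)) / len(retrieved)


# Hypothetical search: four chapters retrieved, two of them relevant.
retrieved = ["ch3", "ch7", "ch12", "ch40"]
relevant = {"ch7", "ch40", "ch55"}
p = precision(retrieved, relevant)
```

Here two of the four retrieved chapters are relevant, so the precision is 0.5; the relevant chapter that was never retrieved ("ch55") affects recall, not precision.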
ERIC Educational Resources Information Center
Lofstrom, Mats
Because experience with large information retrieval (IR) and database management (DBM) systems has shown that they are not adequate for the handling of textual material, two Swedish companies--Paralog and AU-System Network--have joined in a venture to develop a software package which combines features from IR and DBM systems to form a Text Data…
ERIC Educational Resources Information Center
Selig, Judith A.; And Others
This report, summarizing the activities of the Vision Information Center (VIC) in the field of computer-assisted instruction from December, 1966 to August, 1967, describes the methodology used to load a large body of information--a programed text on basic ophthalmology--onto a computer for subsequent information retrieval and computer-assisted…
The Effectiveness of Stemming for Natural-Language Access to Slovene Textual Data.
ERIC Educational Resources Information Center
Popovic, Mirko; Willett, Peter
1992-01-01
Reports on the use of stemming for Slovene language documents and queries in free-text retrieval systems and demonstrates that an appropriate stemming algorithm results in an increase in retrieval effectiveness when compared with nonstemming processing. A comparison is made with stemming of English versions of the same documents and queries. (24…
Content-based TV sports video retrieval using multimodal analysis
NASA Astrophysics Data System (ADS)
Yu, Yiqing; Liu, Huayong; Wang, Hongbin; Zhou, Dongru
2003-09-01
In this paper, we propose content-based video retrieval, which is a kind of retrieval based on semantic content. Because video data is composed of multimodal information streams such as video, auditory and textual streams, we describe a strategy of using multimodal analysis for automatically parsing sports video. The paper first defines the basic structure of a sports video database system, and then introduces a new approach that integrates visual stream analysis, speech recognition, speech signal processing and text extraction to realize video retrieval. The experimental results for TV sports video of football games indicate that the multimodal analysis is effective for video retrieval by quickly browsing tree-like video clips or inputting keywords within a predefined domain.
Markó, K; Schulz, S; Hahn, U
2005-01-01
We propose an interlingua-based indexing approach to account for the particular challenges that arise in the design and implementation of cross-language document retrieval systems for the medical domain. Documents, as well as queries, are mapped to a language-independent conceptual layer on which retrieval operations are performed. We contrast this approach with the direct translation of German queries to English ones which, subsequently, are matched against English documents. We evaluate both approaches, interlingua-based and direct translation, on a large medical document collection, the OHSUMED corpus. A substantial benefit for interlingua-based document retrieval using German queries on English texts is found, which amounts to 93% of the (monolingual) English baseline. Most state-of-the-art cross-language information retrieval systems translate user queries to the language(s) of the target documents. In contra-distinction to this approach, translating both documents and user queries into a language-independent, concept-like representation format is more beneficial to enhance cross-language retrieval performance.
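The interlingua idea in this abstract, mapping both documents and queries to language-independent concepts and matching on that layer, can be sketched with a toy example. Everything here is an assumption for illustration: the lexicon, the concept identifiers, and the whole-word lookup are hypothetical and do not reproduce the authors' indexing method.

```python
# Hypothetical lexicon: German and English terms share one concept identifier.
LEXICON = {
    "heart": "C_HEART", "herz": "C_HEART",
    "inflammation": "C_INFLAM", "entzündung": "C_INFLAM",
    "liver": "C_LIVER", "leber": "C_LIVER",
}


def to_concepts(text):
    """Map a text in either language to its set of language-independent concepts."""
    return {LEXICON[token] for token in text.lower().split() if token in LEXICON}


def retrieve(query, docs):
    """Rank documents by the number of concepts they share with the query."""
    query_concepts = to_concepts(query)
    scored = [(len(query_concepts & to_concepts(text)), doc_id)
              for doc_id, text in docs.items()]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for overlap, doc_id in ranked if overlap > 0]


# Hypothetical English documents queried in German.
docs = {
    "e1": "chronic inflammation of the heart",
    "e2": "liver function tests",
    "e3": "heart surgery outcomes",
}
hits = retrieve("Entzündung Herz", docs)
```

The German query never gets translated into English words; it meets the English documents only at the concept layer, which is the contrast with the direct-translation approach described above.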
Classroom Laboratory Report: Using an Image Database System in Engineering Education.
ERIC Educational Resources Information Center
Alam, Javed; And Others
1991-01-01
Describes an image database system assembled using separate computer components that was developed to overcome text-only computer hardware storage and retrieval limitations for a pavement design class. (JJK)
Keyless Entry: Building a Text Database Using OCR Technology.
ERIC Educational Resources Information Center
Grotophorst, Clyde W.
1989-01-01
Discusses the use of optical character recognition (OCR) technology to produce an ASCII text database. A tutorial on digital scanning and OCR is provided, and a systems integration project which used the Calera CDP-3000XF scanner and text retrieval software to construct a database of dissertations at George Mason University is described. (four…
Selecting Data-Base Management Software for Microcomputers in Libraries and Information Units.
ERIC Educational Resources Information Center
Pieska, K. A. O.
1986-01-01
Presents a model for the evaluation of database management systems software from the viewpoint of librarians and information specialists. The properties of data management systems, database management systems, and text retrieval systems are outlined and compared. (10 references) (CLB)
A Personalized Health Information Retrieval System
Wang, Yunli; Liu, Zhenkai
2005-01-01
Consumers face barriers when seeking health information on the Internet. A Personalized Health Information Retrieval System (PHIRS) is proposed to recommend health information for consumers. The system consists of four modules: (1) User modeling module captures user’s preference and health interests; (2) Automatic quality filtering module identifies high quality health information; (3) Automatic text difficulty rating module classifies health information into professional or patient educational materials; and (4) User profile matching module tailors health information for individuals. The initial results show that PHIRS could assist consumers with simple search strategies. PMID:16779435
Scalable ranked retrieval using document images
NASA Astrophysics Data System (ADS)
Jain, Rajiv; Oard, Douglas W.; Doermann, David
2013-12-01
Despite the explosion of text on the Internet, hard copy documents that have been scanned as images still play a significant role for some tasks. The best method to perform ranked retrieval on a large corpus of document images, however, remains an open research question. The most common approach has been to perform text retrieval using terms generated by optical character recognition. This paper, by contrast, examines whether a scalable segmentation-free image retrieval algorithm, which matches sub-images containing text or graphical objects, can provide additional benefit in satisfying a user's information needs on a large, real world dataset. Results on 7 million scanned pages from the CDIP v1.0 test collection show that content based image retrieval finds a substantial number of documents that text retrieval misses, and that when used as a basis for relevance feedback can yield improvements in retrieval effectiveness.
Computer-Assisted Search Of Large Textual Data Bases
NASA Technical Reports Server (NTRS)
Driscoll, James R.
1995-01-01
"QA" denotes high-speed computer system for searching diverse collections of documents including (but not limited to) technical reference manuals, legal documents, medical documents, news releases, and patents. Incorporates previously available and emerging information-retrieval technology to help user intelligently and rapidly locate information found in large textual data bases. Technology includes provision for inquiries in natural language; statistical ranking of retrieved information; artificial-intelligence implementation of semantics, in which "surface level" knowledge found in text used to improve ranking of retrieved information; and relevance feedback, in which user's judgements of relevance of some retrieved documents used automatically to modify search for further information.
Document Delivery from Full-Text Online Files: A Pilot Project.
ERIC Educational Resources Information Center
Gillikin, David P.
1990-01-01
Describes the Electronic Journal Retrieval Project (EJRP) developed at the University of Tennessee, Knoxville Libraries, to provide full-text journal articles from online systems. Highlights include costs of various search strategies; implications for library services; collection development and interlibrary loan considerations; and suggestions…
Content-Based Medical Image Retrieval
NASA Astrophysics Data System (ADS)
Müller, Henning; Deserno, Thomas M.
This chapter details the necessity for alternative access concepts to the currently mainly text-based methods in medical information retrieval. This need is partly due to the large amount of visual data produced, the increasing variety of medical imaging data and changing user patterns. The stored visual data contain large amounts of unused information that, if well exploited, can help diagnosis, teaching and research. The chapter briefly reviews the history of image retrieval and its general methods before focusing on technologies that have been developed in the medical domain. We also discuss evaluation of medical content-based image retrieval (CBIR) systems and conclude by pointing out their strengths, gaps, and further developments. As examples, the MedGIFT project and the Image Retrieval in Medical Applications (IRMA) framework are presented.
Edinger, Tracy; Cohen, Aaron M.; Bedrick, Steven; Ambert, Kyle; Hersh, William
2012-01-01
Objective: Secondary use of electronic health record (EHR) data relies on the ability to retrieve accurate and complete information about desired patient populations. The Text Retrieval Conference (TREC) 2011 Medical Records Track was a challenge evaluation allowing comparison of systems and algorithms to retrieve patients eligible for clinical studies from a corpus of de-identified medical records, grouped by patient visit. Participants retrieved cohorts of patients relevant to 35 different clinical topics, and visits were judged for relevance to each topic. This study identified the most common barriers to identifying specific clinic populations in the test collection. Methods: Using the runs from track participants and judged visits, we analyzed the five non-relevant visits most often retrieved and the five relevant visits most often overlooked. Categories were developed iteratively to group the reasons for incorrect retrieval for each of the 35 topics. Results: Reasons fell into nine categories for non-relevant visits and five categories for relevant visits. Non-relevant visits were most often retrieved because they contained a non-relevant reference to the topic terms. Relevant visits were most often infrequently retrieved because they used a synonym for a topic term. Conclusions: This failure analysis provides insight into areas for future improvement in EHR-based retrieval with techniques such as more widespread and complete use of standardized terminology in retrieval and data entry systems. PMID:23304287
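The selection step in the Methods, finding the non-relevant visits most often retrieved and the relevant visits most often overlooked across participants' runs, can be sketched as a simple count. The data layout (one list of visit ids per run, for a single topic) and the function names are assumptions; the track's actual run format is not described in the abstract.

```python
from collections import Counter


def most_retrieved_nonrelevant(runs, relevant, k=5):
    """Non-relevant visits that appear in the largest number of runs."""
    counts = Counter(v for run in runs for v in set(run) if v not in relevant)
    return sorted(counts, key=lambda v: (-counts[v], v))[:k]


def most_overlooked_relevant(runs, relevant, k=5):
    """Relevant visits that appear in the smallest number of runs."""
    counts = {v: sum(v in run for run in runs) for v in relevant}
    return sorted(counts, key=lambda v: (counts[v], v))[:k]


# Hypothetical runs for one topic: each inner list is one system's retrieved visits.
runs = [["v1", "v2", "v3"], ["v2", "v4"], ["v2", "v3", "v5"]]
relevant = {"v1", "v5", "v6"}
false_hits = most_retrieved_nonrelevant(runs, relevant, k=2)
missed = most_overlooked_relevant(runs, relevant, k=2)
```

Visits selected this way are only the input to the study's manual step; grouping the reasons for each failure into categories still requires human review.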
Kushniruk, Andre W; Kan, Min-Yem; McKeown, Kathleen; Klavans, Judith; Jordan, Desmond; LaFlamme, Mark; Patel, Vimla L
2002-01-01
This paper describes the comparative evaluation of an experimental automated text summarization system, Centrifuser and three conventional search engines - Google, Yahoo and About.com. Centrifuser provides information to patients and families relevant to their questions about specific health conditions. It then produces a multidocument summary of articles retrieved by a standard search engine, tailored to the user's question. Subjects, consisting of friends or family of hospitalized patients, were asked to "think aloud" as they interacted with the four systems. The evaluation involved audio- and video recording of subject interactions with the interfaces in situ at a hospital. Results of the evaluation show that subjects found Centrifuser's summarization capability useful and easy to understand. In comparing Centrifuser to the three search engines, subjects' ratings varied; however, specific interface features were deemed useful across interfaces. We conclude with a discussion of the implications for engineering Web-based retrieval systems.
Complex Event Processing for Content-Based Text, Image, and Video Retrieval
2016-06-01
NY): Wiley-Interscience; 2000. Feldman R, Sanger J. The text mining handbook: advanced approaches in analyzing unstructured data. New York (NY...ARL-TR-7705 ● JUNE 2016 US Army Research Laboratory Complex Event Processing for Content-Based Text, Image, and Video Retrieval
User-oriented evaluation of a medical image retrieval system for radiologists.
Markonis, Dimitrios; Holzer, Markus; Baroz, Frederic; De Castaneda, Rafael Luis Ruiz; Boyer, Célia; Langs, Georg; Müller, Henning
2015-10-01
This article reports the user-oriented evaluation of a text- and content-based medical image retrieval system. User tests with radiologists using a search system for images in the medical literature are presented. The goal of the tests is to assess the usability of the system and to identify system and interface aspects that need improvement as well as useful additions. Another objective is to investigate the system's added value to radiology information retrieval. The study provides an insight into required specifications and potential shortcomings of medical image retrieval systems through a concrete methodology for conducting user tests. User tests with a working image retrieval system of images from the biomedical literature were performed in an iterative manner, where in each iteration the participants performed radiology information seeking tasks, after which the system as well as the user study design itself were refined. During these tasks the interaction of the users with the system was monitored, usability aspects were measured, retrieval success rates were recorded and feedback was collected through survey forms. In total, 16 radiologists participated in the user tests. The success rates in finding relevant information were on average 87% and 78% for image and case retrieval tasks, respectively. The average time for a successful search was below 3 min in both cases. Users quickly felt comfortable with the novel techniques and tools (after 5 to 15 min), such as content-based image retrieval and relevance feedback. User satisfaction measures show a very positive attitude toward the system's functionalities, while the user feedback helped identify the system's weak points. The participants proposed several potentially useful new functionalities, such as filtering by imaging modality and search for articles using image examples. The iterative character of the evaluation helped to obtain diverse and detailed feedback on all system aspects.
Radiologists quickly become familiar with the functionalities but have several comments on desired functionalities. The analysis of the results can potentially assist system refinement for future medical information retrieval systems. Moreover, the methodology presented as well as the discussion on the limitations and challenges of such studies can be useful for user-oriented medical image retrieval evaluation, as user-oriented evaluation of interactive systems is still only rarely performed. Such interactive evaluations can be limited in effort if done iteratively and can give many insights for developing better systems. Copyright © 2015. Published by Elsevier Ireland Ltd.
Web information retrieval for health professionals.
Ting, S L; See-To, Eric W K; Tse, Y K
2013-06-01
This paper presents a Web Information Retrieval System (WebIRS), which is designed to assist healthcare professionals in obtaining up-to-date medical knowledge and information via the World Wide Web (WWW). The system leverages document classification and text summarization techniques to deliver highly correlated medical information to physicians. The system architecture of the proposed WebIRS is first discussed. A case study on an application of the proposed system in a Hong Kong medical organization is then presented to illustrate the adoption process, and a questionnaire is administered to collect feedback on the operation and performance of WebIRS in comparison with conventional information retrieval on the WWW. A prototype system has been constructed and implemented on a trial basis in a medical organization. It has proven to be of benefit to healthcare professionals through its automatic functions for classifying and summarizing the medical information that physicians need and are interested in. The results of the case study show that with the use of the proposed WebIRS, a significant reduction of searching time and effort, together with retrieval of highly relevant materials, can be attained.
Mobile medical image retrieval
NASA Astrophysics Data System (ADS)
Duc, Samuel; Depeursinge, Adrien; Eggel, Ivan; Müller, Henning
2011-03-01
Images are an integral part of medical practice for diagnosis, treatment planning and teaching. Image retrieval has gained in importance mainly as a research domain over the past 20 years. Both textual and visual retrieval of images are essential. In the process of mobile devices becoming reliable and having a functionality equaling that of formerly desktop clients, mobile computing has gained ground and many applications have been explored. This creates a new field of mobile information search & access and in this context images can play an important role as they often allow understanding complex scenarios much quicker and easier than free text. Mobile information retrieval in general has skyrocketed over the past year with many new applications and tools being developed and all sorts of interfaces being adapted to mobile clients. This article describes constraints of an information retrieval system including visual and textual information retrieval from the medical literature of BioMedCentral and of the RSNA journals Radiology and Radiographics. Solutions for mobile data access with an example on an iPhone in a web-based environment are presented as iPhones are frequently used and the operating system is bound to become the most frequent smartphone operating system in 2011. A web-based scenario was chosen to allow for a use by other smart phone platforms such as Android as well. Constraints of small screens and navigation with touch screens are taken into account in the development of the application. A hybrid choice had to be taken to allow for taking pictures with the cell phone camera and upload them for visual similarity search as most producers of smart phones block this functionality to web applications. Mobile information access and in particular access to images can be surprisingly efficient and effective on smaller screens. 
Images can be read on screen much faster and relevance of documents can be identified quickly through the use of images contained in the text. Problems with the many, often incompatible mobile platforms were discovered and are listed in the text. Mobile information access is a quickly growing domain and the constraints of mobile access also need to be taken into account for image retrieval. The demonstrated access to the medical literature is most relevant as the medical literature and their images are clearly the largest knowledge source in the medical field.
TREC Initiative with Cheshire II.
ERIC Educational Resources Information Center
Larson, Ray R.
2001-01-01
Describes the University of California at Berkeley's participation in the TREC (Text Retrieval Conference) interactive track experiments. Highlights include results of searches on two systems, Cheshire II and ZPRISE; system design goals and implementation; precision and recall results; search questions by topic and system; and results of…
Using the Weighted Keyword Model to Improve Information Retrieval for Answering Biomedical Questions
Yu, Hong; Cao, Yong-gang
2009-01-01
Physicians ask many complex questions during the patient encounter. Information retrieval systems that can provide immediate and relevant answers to these questions can be invaluable aids to the practice of evidence-based medicine. In this study, we first automatically identify topic keywords from ad hoc clinical questions with a Conditional Random Field model that is trained over thousands of manually annotated clinical questions. We then report on a linear model that assigns query weights based on their automatically identified semantic roles: topic keywords, domain specific terms, and their synonyms. Our evaluation shows that this weighted keyword model improves information retrieval from the Text Retrieval Conference Genomics track data. PMID:21347188
Presentation video retrieval using automatically recovered slide and spoken text
NASA Astrophysics Data System (ADS)
Cooper, Matthew
2013-03-01
Video is becoming a prevalent medium for e-learning. Lecture videos contain text information in both the presentation slides and lecturer's speech. This paper examines the relative utility of automatically recovered text from these sources for lecture video retrieval. To extract the visual information, we automatically detect slides within the videos and apply optical character recognition to obtain their text. Automatic speech recognition is used similarly to extract spoken text from the recorded audio. We perform controlled experiments with manually created ground truth for both the slide and spoken text from more than 60 hours of lecture video. We compare the automatically extracted slide and spoken text in terms of accuracy relative to ground truth, overlap with one another, and utility for video retrieval. Results reveal that automatically recovered slide text and spoken text contain different content with varying error profiles. Experiments demonstrate that automatically extracted slide text enables higher precision video retrieval than automatically recovered spoken text.
NASA Technical Reports Server (NTRS)
Stocker, Erich Franz; Kelley, O.; Kummerow, C.; Huffman, G.; Olson, W.; Kwiatkowski, J.
2015-01-01
In February 2015, the Global Precipitation Measurement (GPM) mission core satellite will complete its first year in space. The core satellite carries a conically scanning microwave imager called the GPM Microwave Imager (GMI), which also has 166 GHz and 183 GHz frequency channels. The GPM core satellite also carries a dual frequency radar (DPR) which operates at Ku frequency, similar to the Tropical Rainfall Measuring Mission (TRMM) Precipitation Radar, and a new Ka frequency. The precipitation processing system (PPS) is producing swath-based instantaneous precipitation retrievals from GMI, both radars including a dual-frequency product, and a combined GMIDPR precipitation retrieval. These level 2 products are written in the HDF5 format and have many additional parameters beyond surface precipitation that are organized into appropriate groups. While these retrieval algorithms were developed prior to launch and are not optimal, these algorithms are producing very creditable retrievals. It is appropriate for a wide group of users to have access to the GPM retrievals. However, for researchers requiring only surface precipitation, these L2 swath products can appear to be very intimidating and they certainly do contain many more variables than the average researcher needs. Some researchers desire only surface retrievals stored in a simple easily accessible format. In response, PPS has begun to produce gridded text based products that contain just the most widely used variables for each instrument (surface rainfall rate, fraction liquid, fraction convective) in a single line for each grid box that contains one or more observations. This paper will describe the gridded data products that are being produced and provide an overview of their content.
Currently two types of gridded products are being produced: (1) surface precipitation retrievals from the core satellite instruments GMI, DPR, and combined GMIDPR (2) surface precipitation retrievals for the partner constellation satellites. Both of these gridded products are generated for a 0.25 degree x 0.25 degree hourly grid, which are packaged into daily ASCII (American Standard Code for Information Interchange) files that can be downloaded from the PPS FTP (File Transfer Protocol) site. To reduce the download size, the files are compressed using the gzip utility. This paper will focus on presenting high-level details about the gridded text product being generated from the instruments on the GPM core satellite. But summary information will also be presented about the partner radiometer gridded product. All retrievals for the partner radiometer are done using the GPROF2014 algorithm using as input the PPS generated inter-calibrated 1C product for the radiometer.
Text mining and its potential applications in systems biology.
Ananiadou, Sophia; Kell, Douglas B; Tsujii, Jun-ichi
2006-12-01
With biomedical literature increasing at a rate of several thousand papers per week, it is impossible to keep abreast of all developments; therefore, automated means to manage the information overload are required. Text mining techniques, which involve the processes of information retrieval, information extraction and data mining, provide a means of solving this. By adding meaning to text, these techniques produce a more structured analysis of textual knowledge than simple word searches, and can provide powerful tools for the production and analysis of systems biology models.
An Expert System for Searching in Full-Text
1989-12-01
An Expert System for Searching in Full-Text TR89-043 - December, 1989 The University of North Carolina at...Full-Text by Susan Evalyn Gauch A dissertation submitted to the faculty of The University of North Carolina at Chapel Hill in partial fulfillment of...MICROARRAS, the retrieval software. MICROARRAS developed at the University of North Carolina under the direction of John B. Smith and Stephen Weiss [Smith et
An overview of the education and training component of RICIS
NASA Technical Reports Server (NTRS)
Freedman, Glenn B.
1987-01-01
Research in education and training according to RICIS (Research Institute for Computing and Information Systems) program focuses on means to disseminate knowledge, skills, and technological advances rapidly, accurately, and effectively. A range of areas for study include: artificial intelligence, hypermedia and full-text retrieval strategies, use of mass storage and retrieval options such as CD-ROM and laser disks, and interactive video and interactive media presentations.
Enamel color changes following orthodontic treatment.
Pandian, Akshaya; Ranganathan, Sukanya; Padmanabhan, Sridevi
2017-01-01
To evaluate and compare the effect of various orthodontic bonding systems and clean-up procedures on quantitative enamel colour change. A literature search was done to identify the studies that assessed the quantitative enamel colour change associated with the various bonding systems and clean-up procedures. Electronic databases (PubMed, Cochrane and Google Scholar) were searched. First stage screening was performed and the abstracts were selected according to the initial selection criteria. Full text articles were retrieved and analyzed during second stage screening. The bibliographies were reviewed to identify additional relevant studies. Sixteen full text articles were retrieved. Six were rejected because the methodology was different. There was significant enamel colour change following orthodontic bonding, debonding and clean-up procedures. Self-etching primers produce less enamel colour change compared to conventional etching. Resin Modified GIC produces the least colour change compared to other light cure and chemical cure systems. Polishing following the clean-up procedure reduces the colour change of the enamel.
van Haagen, Herman H. H. B. M.; 't Hoen, Peter A. C.; Mons, Barend; Schultes, Erik A.
2013-01-01
Motivation: Weighted semantic networks built from text-mined literature can be used to retrieve known protein-protein or gene-disease associations, and have been shown to anticipate associations years before they are explicitly stated in the literature. Our text-mining system recognizes over 640,000 biomedical concepts: some are specific (i.e., names of genes or proteins), others generic (e.g., ‘Homo sapiens’). Generic concepts may play important roles in automated information retrieval, extraction, and inference but may also result in concept overload and confound retrieval and reasoning with low-relevance or even spurious links. Here, we attempted to optimize the retrieval performance for protein-protein interactions (PPI) by filtering generic concepts (node filtering) or links to generic concepts (edge filtering) from a weighted semantic network. First, we defined metrics based on network properties that quantify the specificity of concepts. Then using these metrics, we systematically filtered generic information from the network while monitoring retrieval performance of known protein-protein interactions. We also systematically filtered specific information from the network (inverse filtering), and assessed the retrieval performance of networks composed of generic information alone. Results: Filtering generic or specific information induced a two-phase response in retrieval performance: initially the effects of filtering were minimal, but beyond a critical threshold network performance suddenly dropped. Contrary to expectations, networks composed exclusively of generic information demonstrated retrieval performance comparable to unfiltered networks that also contain specific concepts. Furthermore, an analysis using individual generic concepts demonstrated that they can effectively support the retrieval of known protein-protein interactions.
For instance, the concept “binding” is indicative for PPI retrieval and the concept “mutation abnormality” is indicative for gene-disease associations. Conclusion: Generic concepts are important for information retrieval and cannot be removed from semantic networks without negative impact on retrieval performance. PMID:24260124
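The node-filtering step described in this abstract can be illustrated with a minimal sketch. The abstract does not say which network metrics were used to quantify specificity, so the sketch below assumes node degree as a stand-in (a concept linked to many others is treated as generic). The function names, the `max_degree` threshold, and the toy network are all hypothetical and are not taken from the paper.

```python
from typing import Dict, Set, Tuple

Edge = Tuple[str, str]


def degrees(edges: Dict[Edge, float]) -> Dict[str, int]:
    """Count how many weighted links touch each concept."""
    deg: Dict[str, int] = {}
    for a, b in edges:
        deg[a] = deg.get(a, 0) + 1
        deg[b] = deg.get(b, 0) + 1
    return deg


def generic_concepts(edges: Dict[Edge, float], max_degree: int) -> Set[str]:
    """Assumed specificity proxy: concepts with degree above max_degree are generic."""
    return {node for node, d in degrees(edges).items() if d > max_degree}


def node_filter(edges: Dict[Edge, float], generic: Set[str]) -> Dict[Edge, float]:
    """Remove generic concepts, which also removes every link that touches them."""
    return {
        (a, b): w
        for (a, b), w in edges.items()
        if a not in generic and b not in generic
    }


# Hypothetical toy network: three proteins and one generic concept.
network = {
    ("p1", "binding"): 0.9,
    ("p2", "binding"): 0.8,
    ("p3", "binding"): 0.7,
    ("p1", "p2"): 0.4,
}
generic = generic_concepts(network, max_degree=2)
filtered = node_filter(network, generic)
```

In this toy network the indirect association between `p3` and the other proteins exists only through the generic concept, so filtering it removes that path, which is the kind of loss the abstract reports for real networks.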
Semi-Automated Methods for Refining a Domain-Specific Terminology Base
2011-02-01
only as a resource for written and oral translation, but also for Natural Language Processing (NLP) applications, text retrieval, document indexing...Natural Language Processing (NLP) applications, text retrieval, document indexing, and other knowledge management tasks. The objective of this...also for Natural Language Processing (NLP) applications, text retrieval (1), document indexing, and other knowledge management tasks. The National
NASA Astrophysics Data System (ADS)
Müller, Henning; Kalpathy-Cramer, Jayashree; Kahn, Charles E., Jr.; Hersh, William
2009-02-01
Content-based visual information (or image) retrieval (CBIR) has been an extremely active research domain within medical imaging over the past ten years, with the goal of improving the management of visual medical information. Many technical solutions have been proposed, and application scenarios for image retrieval as well as image classification have been set up. However, in contrast to medical information retrieval using textual methods, visual retrieval has only rarely been applied in clinical practice. This is despite the large amount and variety of visual information produced in hospitals every day. This information overload imposes a significant burden upon clinicians, and CBIR technologies have the potential to help the situation. However, in order for CBIR to become an accepted clinical tool, it must demonstrate a higher level of technical maturity than it has to date. Since 2004, the ImageCLEF benchmark has included a task for the comparison of visual information retrieval algorithms for medical applications. In 2005, a task for medical image classification was introduced and both tasks have been run successfully for the past four years. These benchmarks allow an annual comparison of visual retrieval techniques based on the same data sets and the same query tasks, enabling the meaningful comparison of various retrieval techniques. The datasets used from 2004-2007 contained images and annotations from medical teaching files. In 2008, however, the dataset used was made up of 67,000 images (along with their associated figure captions and the full text of their corresponding articles) from two Radiological Society of North America (RSNA) scientific journals. This article describes the results of the medical image retrieval task of the ImageCLEF 2008 evaluation campaign. We compare the retrieval results of both visual and textual information retrieval systems from 15 research groups on the aforementioned data set. 
The results show clearly that, currently, visual retrieval alone does not achieve the performance necessary for real-world clinical applications. Most of the common visual retrieval techniques have a MAP (Mean Average Precision) of around 2-3%, which is much lower than that achieved using textual retrieval (MAP=29%). Advanced machine learning techniques, together with good training data, have been shown to improve the performance of visual retrieval systems in the past. Multimodal retrieval (basing retrieval on both visual and textual information) can achieve better results than purely visual, but only when carefully applied. In many cases, multimodal retrieval systems performed even worse than purely textual retrieval systems. On the other hand, some multimodal retrieval systems demonstrated significantly increased early precision, which has been shown to be a desirable behavior in real-world systems.
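The MAP figures quoted in this abstract follow the standard definition of Mean Average Precision. As a minimal sketch of that definition (not the evaluation code used in ImageCLEF), the following computes average precision per query and averages it over queries. The document identifiers and query names in the example are hypothetical.

```python
from typing import Dict, List, Set


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Mean of precision@k taken at each rank k where a relevant item appears.

    Relevant items that are never retrieved contribute zero, so the sum is
    divided by the total number of relevant items.
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(
    runs: Dict[str, List[str]], qrels: Dict[str, Set[str]]
) -> float:
    """Average of per-query average precision over all judged queries."""
    if not qrels:
        return 0.0
    total = sum(average_precision(runs.get(q, []), rel) for q, rel in qrels.items())
    return total / len(qrels)


# Hypothetical run with two queries.
runs = {"q1": ["a", "b", "c", "d"], "q2": ["x", "y"]}
qrels = {"q1": {"a", "c"}, "q2": {"y"}}
map_score = mean_average_precision(runs, qrels)
```

Because precision is sampled only at the ranks of relevant items, a system that places relevant images early scores well even if it retrieves few of them, which is why MAP is usually read alongside early-precision measures.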
Wei, Wei; Ji, Zhanglong; He, Yupeng; Zhang, Kai; Ha, Yuanchi; Li, Qi; Ohno-Machado, Lucila
2018-01-01
The number and diversity of biomedical datasets have grown rapidly in the last decade. A large number of datasets are stored in various repositories, with different formats. Existing dataset retrieval systems lack the capability of cross-repository search. As a result, users spend time searching datasets in known repositories, and they typically do not find new repositories. The biomedical and healthcare data discovery index ecosystem (bioCADDIE) team organized a challenge to solicit new indexing and searching strategies for retrieving biomedical datasets across repositories. We describe the work of one team that built a retrieval pipeline and examined its performance. The pipeline used online resources to supplement dataset metadata, automatically generated queries from users’ free-text questions, produced high-quality retrieval results and achieved the highest inferred Normalized Discounted Cumulative Gain among competitors. The results showed that it is a promising solution for cross-database, cross-domain and cross-repository biomedical dataset retrieval. Database URL: https://github.com/w2wei/dataset_retrieval_pipeline PMID:29688374
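For readers unfamiliar with the measure named in this abstract, the sketch below computes plain Normalized Discounted Cumulative Gain. It is not the inferred variant used in the challenge, which estimates the measure from sampled relevance judgments. The graded relevance values in the example are hypothetical.

```python
import math
from typing import List, Optional


def dcg(gains: List[float]) -> float:
    """Discounted cumulative gain with a log2(rank + 1) discount."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))


def ndcg(
    ranked_gains: List[float],
    judged_gains: List[float],
    k: Optional[int] = None,
) -> float:
    """DCG of the system ranking divided by the DCG of the ideal ranking.

    ranked_gains: graded relevance of the returned items, in rank order.
    judged_gains: graded relevance of every judged item for the query.
    k: optional rank cutoff.
    """
    if k is not None:
        ranked_gains = ranked_gains[:k]
    ideal = sorted(judged_gains, reverse=True)
    if k is not None:
        ideal = ideal[:k]
    ideal_dcg = dcg(ideal)
    return dcg(ranked_gains) / ideal_dcg if ideal_dcg > 0 else 0.0


# Hypothetical query: the system ranks a partially relevant dataset too low.
score = ndcg([2.0, 0.0, 1.0], [2.0, 1.0, 0.0])
```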
Abdulla, Ahmed AbdoAziz Ahmed; Lin, Hongfei; Xu, Bo; Banbhrani, Santosh Kumar
2016-07-25
Biomedical literature retrieval is becoming increasingly complex, and there is a fundamental need for advanced information retrieval systems. Information Retrieval (IR) programs search unstructured materials such as text documents in large reserves of data that are usually stored on computers. IR is concerned with the representation, storage, and organization of information items, as well as access to them. One of the main problems in IR is to determine which documents are relevant to the user's needs and which are not. At present, users cannot easily construct queries precise enough to retrieve particular pieces of data from large reserves of data, and basic information retrieval systems produce low-quality search results. In this paper we present a new technique to refine IR searches so that they better represent the user's information need. It enhances retrieval performance by using different query expansion techniques and applying linear combinations between them, where each combination is formed linearly between two expansion results at a time. Query expansion expands the search query, for example, by finding synonyms and reweighting original terms, and provides significantly more focused, particularized search results than basic search queries do. Retrieval performance is measured by some variants of MAP (Mean Average Precision). According to our experimental results, the combination of the best query expansion results improves the retrieved documents and outperforms our baseline by 21.06%; it even outperforms a previous study by 7.12%. We propose several query expansion techniques and their linear combinations to make user queries more cognizable to search engines and to produce higher-quality search results.
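A pairwise linear combination of two expansion results can be sketched as follows. The abstract does not state whether the combination is applied to term weights or to document scores, nor how the mixing weight is chosen, so this sketch assumes document scores from two expanded-query runs and a hypothetical mixing weight `alpha`. All names and values are illustrative only.

```python
from typing import Dict, List, Tuple


def combine_linear(
    scores_a: Dict[str, float],
    scores_b: Dict[str, float],
    alpha: float = 0.5,
) -> List[Tuple[str, float]]:
    """Rank documents by alpha * score_a + (1 - alpha) * score_b.

    A document missing from one run is given a score of 0.0 in that run
    (an assumption of this sketch). Ties are broken by document id.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    doc_ids = set(scores_a) | set(scores_b)
    combined = {
        d: alpha * scores_a.get(d, 0.0) + (1.0 - alpha) * scores_b.get(d, 0.0)
        for d in doc_ids
    }
    return sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical scores from two different query expansion techniques.
run_synonyms = {"d1": 1.0, "d2": 0.5}
run_reweighted = {"d2": 1.0, "d3": 0.8}
ranking = combine_linear(run_synonyms, run_reweighted, alpha=0.5)
```

In practice the two score lists would normally be normalized to a common range before mixing, since raw scores from different expansion methods are not directly comparable.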
Haux, R; Grothe, W; Runkel, M; Schackert, H K; Windeler, H J; Winter, A; Wirtz, R; Herfarth, C; Kunze, S
1996-04-01
We report on a prospective, prolective observational study, supplying information on how physicians and other health care professionals retrieve medical knowledge on-line within the Heidelberg University Hospital information system. Within this hospital information system, on-line access to medical knowledge has been realised by installing a medical knowledge server of about 24 GB and by providing access to it from health care professional workstations in wards, physicians' rooms, etc. During the study, we observed about 96 accesses per working day. The main groups of health care professionals retrieving medical knowledge were physicians and medical students. Primary reasons for its utilisation were identified as support for the users' scientific work (50%), own clinical cases (19%), general medical problems (14%) and current clinical problems (13%). Health care professionals accessed medical knowledge bases such as MEDLINE (79%) and drug bases ('Rote Liste', 6%), as well as electronic text books and knowledge base systems. Sixty-five percent of accesses to medical knowledge were judged to be successful. In our opinion, medical knowledge retrieval can serve as a first step towards knowledge processing in medicine. We point out the consequences for the management of hospital information systems in order to provide the prerequisites for such a type of knowledge retrieval.
Automatic Processing of Current Affairs Queries
ERIC Educational Resources Information Center
Salton, G.
1973-01-01
The SMART system is used for the analysis, search and retrieval of news stories appearing in "Time" magazine. A comparison is made between the automatic text processing methods incorporated into the SMART system and a manual search using the classified index to "Time." (14 references) (Author)
ERIC Educational Resources Information Center
Chowdhury, Gobinda G.
2003-01-01
Discusses issues related to natural language processing, including theoretical developments; natural language understanding; tools and techniques; natural language text processing systems; abstracting; information extraction; information retrieval; interfaces; software; Internet, Web, and digital library applications; machine translation for…
Computer aided systems human engineering: A hypermedia tool
NASA Technical Reports Server (NTRS)
Boff, Kenneth R.; Monk, Donald L.; Cody, William J.
1992-01-01
The Computer Aided Systems Human Engineering (CASHE) system, Version 1.0, is a multimedia ergonomics database on CD-ROM for the Apple Macintosh II computer, being developed for use by human system designers, educators, and researchers. It will initially be available on CD-ROM and will allow users to access ergonomics data and models stored electronically as text, graphics, and audio. The CASHE CD-ROM, Version 1.0 will contain the Boff and Lincoln (1988) Engineering Data Compendium, MIL-STD-1472D and a unique, interactive simulation capability, the Perception and Performance Prototyper. Its features also include a specialized data retrieval, scaling, and analysis capability and the state of the art in information retrieval, browsing, and navigation.
Expert system for automatically correcting OCR output
NASA Astrophysics Data System (ADS)
Taghva, Kazem; Borsack, Julie; Condit, Allen
1994-03-01
This paper describes a new expert system for automatically correcting errors made by optical character recognition (OCR) devices. The system, which we call the post-processing system, is designed to improve the quality of text produced by an OCR device in preparation for subsequent retrieval from an information system. The system is composed of numerous parts: an information retrieval system, an English dictionary, a domain-specific dictionary, and a collection of algorithms and heuristics designed to correct as many OCR errors as possible. For the remaining errors that cannot be corrected, the system passes them on to a user-level editing program. This post-processing system can be viewed as part of a larger system that would streamline the steps of taking a document from its hard copy form to its usable electronic form, or it can be considered a stand alone system for OCR error correction. An earlier version of this system has been used to process approximately 10,000 pages of OCR generated text. Among the OCR errors discovered by this version, about 87% were corrected. We implement numerous new parts of the system, test this new version, and present the results.
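The dictionary-driven part of such a post-processing step can be illustrated with a minimal sketch. It does not reproduce the authors' algorithms or heuristics, which the abstract does not detail. Instead it swaps in approximate string matching from Python's standard `difflib` module, and it passes tokens it cannot resolve to a list for manual editing, mirroring the hand-off to a user-level editor described above. The dictionary contents, the `cutoff` value, and the function name are assumptions.

```python
import difflib
from typing import Iterable, List, Set, Tuple


def correct_tokens(
    tokens: Iterable[str],
    dictionary: Set[str],
    cutoff: float = 0.75,
) -> Tuple[List[str], List[str]]:
    """Replace unknown tokens with their closest dictionary word.

    Returns (corrected_tokens, unresolved_tokens). A token is unresolved when
    no dictionary word reaches the similarity cutoff; it is left unchanged and
    reported so that a human editor can review it.
    """
    vocabulary = sorted(dictionary)
    corrected: List[str] = []
    unresolved: List[str] = []
    for token in tokens:
        if token.lower() in dictionary:
            corrected.append(token)
            continue
        matches = difflib.get_close_matches(
            token.lower(), vocabulary, n=1, cutoff=cutoff
        )
        if matches:
            corrected.append(matches[0])
        else:
            corrected.append(token)
            unresolved.append(token)
    return corrected, unresolved


# Hypothetical dictionary and OCR output ("1" read for "l", "rn" read for "m").
words = {"retrieval", "system", "text"}
fixed, manual = correct_tokens(["retrieva1", "systern", "text", "qzx"], words)
```

A real system would combine a general dictionary with a domain-specific one, as the abstract describes, and would weigh typical OCR confusions rather than treat all character differences equally.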
An integrated information retrieval and document management system
NASA Technical Reports Server (NTRS)
Coles, L. Stephen; Alvarez, J. Fernando; Chen, James; Chen, William; Cheung, Lai-Mei; Clancy, Susan; Wong, Alexis
1993-01-01
This paper describes the requirements and prototype development for an intelligent document management and information retrieval system that will be capable of handling millions of pages of text or other data. Technologies for scanning, Optical Character Recognition (OCR), magneto-optical storage, and multiplatform retrieval using Structured Query Language (SQL) will be discussed. The semantic ambiguity inherent in the English language is somewhat compensated for through the use of coefficients or weighting factors for partial synonyms. Such coefficients are used both for defining structured query trees for routine queries and for establishing long-term interest profiles that can be used on a regular basis to alert individual users to the presence of relevant documents that may have just arrived from an external source, such as a news wire service. Although this attempt at evidential reasoning is limited in comparison with the latest developments in AI Expert Systems technology, it has the advantage of being commercially available.
NASA Technical Reports Server (NTRS)
Driscoll, James N.
1994-01-01
The high-speed data search system developed for KSC incorporates existing and emerging information retrieval technology to help a user intelligently and rapidly locate information found in large textual databases. This technology includes: natural language input; statistical ranking of retrieved information; an artificial intelligence concept called semantics, where 'surface level' knowledge found in text is used to improve the ranking of retrieved information; and relevance feedback, where user judgements about viewed information are used to automatically modify the search for further information. Semantics and relevance feedback are features of the system which are not available commercially. The system further demonstrates focus on paragraphs of information to decide relevance; and it can be used (without modification) to intelligently search all kinds of document collections, such as collections of legal documents, medical documents, news stories, patents, and so forth. The purpose of this paper is to demonstrate the usefulness of statistical ranking, our semantic improvement, and relevance feedback.
Instance-Based Question Answering
2006-12-01
answer clustering, composition, and scoring. Moreover, with the effort dedicated to improving monolingual system performance, system parameters are...text collections: document type, manual or automatic annotations (if any), and stylistic and notational differences in technical terms. Monolingual ...forum in which cross language retrieval systems and question answering systems are tested for various European languages. The CLEF QA monolingual task
Defense Technical Information Center Free Text Experiment - Management Data Bases.
1981-10-01
that can be retrieved directly from the inverted file. In other online systems, such as the National Library of Medicine/MEDLARS or the Systems...should be considered during search strategy formulation. EXAMPLE: Marijuana, Marihuana, Pot, Grass, Weed, Mary Jane 8. Foreign spellings should be
Mann, G; Birkmann, C; Schmidt, T; Schaeffler, V
1999-01-01
Introduction: Present solutions for the representation and retrieval of medical information from online sources are not very satisfying. Either the retrieval process lacks precision and completeness, or the representation does not support the update and maintenance of the represented information. Most efforts are currently put into improving the combination of search engines and HTML based documents. However, due to the current shortcomings of methods for natural language understanding there are clear limitations to this approach. Furthermore, this approach does not solve the maintenance problem. At least medical information exceeding a certain complexity seems to require approaches that rely on structured knowledge representation and corresponding retrieval mechanisms. Methods: Knowledge-based information systems are based on the following fundamental ideas. The representation of information is based on ontologies that define the structure of the domain's concepts and their relations. Views on domain models are defined and represented as retrieval schemata. Retrieval schemata can be interpreted as canonical query types focussing on specific aspects of the provided information (e.g. diagnosis or therapy centred views). Based on these retrieval schemata it can be decided which parts of the information in the domain model must be represented explicitly and formalised to support the retrieval process. Propositional logic is used as the representation language. All other information can be represented in a structured but informal way using text, images etc. Layout schemata are used to assign layout information to retrieved domain concepts. Depending on the target environment HTML or XML can be used. Results: Based on this approach two knowledge-based information systems have been developed. The 'Ophthalmologic Knowledge-based Information System for Diabetic Retinopathy' (OKIS-DR) provides information on diagnoses, findings, examinations, guidelines, and reference images related to diabetic retinopathy. OKIS-DR uses combinations of findings to specify the information that must be retrieved. The second system focuses on nutrition related allergies and intolerances. Information on the allergies and intolerances of a patient is used to retrieve general information on the specified combination of allergies and intolerances. As a special feature the system generates tables showing food types and products that are tolerated or not tolerated by patients. Evaluation by external experts and user groups showed that the described approach of knowledge-based information systems increases the precision and completeness of knowledge retrieval. Due to the structured and non-redundant representation of information the maintenance and update of the information can be simplified. Both systems are available as WWW based online knowledge bases and CD-ROMs (cf. http://mta.gsf.de topic: products).
The influence of retrieval practice on memory and comprehension of science texts
NASA Astrophysics Data System (ADS)
Hinze, Scott R.
The testing effect, where retrieval practice aids performance on later tests, may be a powerful tool for improving learning and retention. Three experiments test the potentials and limitations of retrieval practice for retention and comprehension of the content of science texts. Experiment 1 demonstrated that cued recall of paragraphs, but not fill-in-the-blank tests, improved performance on new memory items. Experiment 2 manipulated test expectancy and extended cued recall benefits to inference items. Test expectancies established prior to retrieval altered processing to either be ineffective (when expecting a memory test) or effective (when expecting an inference test). In Experiment 3, the processing task engaged in during retrieval practice was manipulated. Explanation during retrieval practice led to more effective transfer than free recall instructions, especially when participants were compliant and effective in their explanations. These experiments demonstrate that some, but not all, processing during retrieval practice can influence both memory and understanding of science texts.
The Influence of Retrieval Practice on Memory and Comprehension of Science Texts
ERIC Educational Resources Information Center
Hinze, Scott R.
2010-01-01
The testing effect, where retrieval practice aids performance on later tests, may be a powerful tool for improving learning and retention. Three experiments test the potentials and limitations of retrieval practice for retention and comprehension of the content of science texts. Experiment 1 demonstrated that cued recall of paragraphs, but not…
Historical Note: The Past Thirty Years in Information Retrieval.
ERIC Educational Resources Information Center
Salton, Gerard
1987-01-01
Briefly reviews early work in documentation and text processing, and predictions that were made about the creative role of computers in information retrieval. An attempt is made to explain why these predictions were not fulfilled and conclusions are drawn regarding the limits of computer power in text retrieval applications. (Author/CLB)
A Detailed Examination of the GPM Core Satellite Gridded Text Product
NASA Technical Reports Server (NTRS)
Stocker, Erich Franz; Kelley, Owen A.; Kummerow, C.; Huffman, George; Olson, William S.; Kwiatowski, John M.
2015-01-01
The Global Precipitation Measurement (GPM) mission quarter-degree gridded-text product has a similar file format and a similar purpose as the Tropical Rainfall Measuring Mission (TRMM) 3G68 quarter-degree product. The GPM text-grid format is an hourly summary of surface precipitation retrievals from various GPM instruments and combinations of GPM instruments. The GMI Goddard Profiling (GPROF) retrieval provides the widest swath (800 km) and does the retrieval using the GPM Microwave Imager (GMI). The Ku radar provides the widest radar swath (250 km swath) and also provides continuity with the TRMM Ku Precipitation Radar. GPM's Ku+Ka band matched swath (125 km swath) provides a dual-frequency precipitation retrieval. The "combined" retrieval (125 km swath) provides a multi-instrument precipitation retrieval based on the GMI, the DPR Ku radar, and the DPR Ka radar. While the data are reported in hourly grids, all hours for a day are packaged into a single text file that is gzipped to reduce file size and to speed up downloading. The data are reported on a 0.25° × 0.25° grid.
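As a rough illustration of working with a gzipped, quarter-degree gridded text file of the kind described above, the sketch below reads a gzipped text file line by line and maps a latitude/longitude pair to a cell on a 0.25° grid. The function names, the global extent, and the grid origin (row 0 at 90°S, column 0 at 180°W) are assumptions made for the example; the abstract does not specify the product's column layout or origin, so no line parsing is attempted.

```python
import gzip

CELL_DEG = 0.25           # cell size in degrees, as stated in the abstract
N_ROWS = int(180 / CELL_DEG)   # 720 rows, assuming a global grid (hypothetical)
N_COLS = int(360 / CELL_DEG)   # 1440 columns, assuming a global grid (hypothetical)


def grid_cell(lat, lon):
    """Map latitude/longitude in degrees to (row, col) on a 0.25-degree grid.

    Assumes row 0 starts at 90S and column 0 at 180W; the real product
    may use a different origin or extent.
    """
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        raise ValueError("latitude or longitude out of range")
    # Clamp so that the upper edges (90N, 180E) fall in the last cell.
    row = min(int((lat + 90.0) / CELL_DEG), N_ROWS - 1)
    col = min(int((lon + 180.0) / CELL_DEG), N_COLS - 1)
    return row, col


def iter_lines(path):
    """Yield the lines of a gzipped text file without the trailing newline."""
    with gzip.open(path, "rt", encoding="ascii") as handle:
        for line in handle:
            yield line.rstrip("\n")
```

Reading through `gzip.open` in text mode avoids unpacking the daily file to disk first; each yielded line would still need to be split according to the product's own documentation.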
Is searching full text more effective than searching abstracts?
Lin, Jimmy
2009-01-01
Background: With the growing availability of full-text articles online, scientists and other consumers of the life sciences literature now have the ability to go beyond searching bibliographic records (title, abstract, metadata) to directly access full-text content. Motivated by this emerging trend, I posed the following question: is searching full text more effective than searching abstracts? This question is answered by comparing text retrieval algorithms on MEDLINE® abstracts, full-text articles, and spans (paragraphs) within full-text articles using data from the TREC 2007 genomics track evaluation. Two retrieval models are examined: BM25 and the ranking algorithm implemented in the open-source Lucene search engine. Results: Experiments show that treating an entire article as an indexing unit does not consistently yield higher effectiveness compared to abstract-only search. However, retrieval based on spans, or paragraph-sized segments of full-text articles, consistently outperforms abstract-only search. Results suggest that highest overall effectiveness may be achieved by combining evidence from spans and full articles. Conclusion: Users searching full text are more likely to find relevant articles than those searching only abstracts. This finding affirms the value of full text collections for text retrieval and provides a starting point for future work in exploring algorithms that take advantage of rapidly-growing digital archives. Experimental results also highlight the need to develop distributed text retrieval algorithms, since full-text articles are significantly longer than abstracts and may require the computational resources of multiple machines in a cluster. The MapReduce programming model provides a convenient framework for organizing such computations. PMID:19192280
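For readers unfamiliar with BM25, one of the two retrieval models named in the abstract, the following is a minimal sketch of the scoring function over pre-tokenized documents. It is not the study's implementation: the function name, the parameter defaults (k1 = 1.2, b = 0.75), and the non-negative IDF variant are assumptions chosen for illustration. The same function could be applied to abstracts, full articles, or spans simply by changing what counts as a "document".

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against the tokenized `query`.

    k1 and b are commonly used defaults, not values taken from the study.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # A common IDF variant that never goes negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization: longer-than-average documents are penalized.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    ["gene", "expression", "in", "yeast"],
    ["protein", "binding", "gene", "gene"],
    ["clinical", "trial", "results"],
]
scores = bm25_scores(["gene", "expression"], docs)
```

In this toy collection the first document matches both query terms and scores highest, the second matches only the more common term, and the third scores zero.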
Electronic Document Management Systems: Where Are They Today?
ERIC Educational Resources Information Center
Koulopoulos, Thomas M.; Frappaolo, Carl
1993-01-01
Discusses developments in document management systems based on a survey of over 400 corporations and government agencies. Text retrieval and imaging markets, architecture and integration, purchasing plans, and vendor market leaders are covered. Five graphs present data on user preferences for improvements. A sidebar article reviews the development…
Emotional memory can be persistently weakened by suppressing cortisol during retrieval.
Rimmele, Ulrike; Besedovsky, Luciana; Lange, Tanja; Born, Jan
2015-03-01
Cortisol's effects on memory follow an inverted U-shaped function such that memory retrieval is impaired with very low concentrations, presumably due to insufficient activation of high-affine mineralocorticoid receptors (MR), or with very high concentrations, due to predominant low-affine glucocorticoid receptor (GR) activation. Through corresponding changes in re-encoding, the retrieval effect of cortisol might translate into a persistent change of the retrieved memory. We tested whether partial suppression of morning cortisol synthesis by metyrapone, leading to intermediate, circadian nadir-like levels with presumed predominant MR activation, improves retrieval, particularly of emotional memory, and persistently changes the memory. In a randomized, placebo-controlled, double-blind, within-subject cross-over design, 18 men were orally administered metyrapone (1g) vs. placebo at 4:00 AM to suppress the morning cortisol rise. Retrieval of emotional and neutral texts and pictures (learned 3 days earlier) was assessed 4h after substance administration and a second time one week later. Metyrapone suppressed endogenous cortisol release to circadian nadir-equivalent levels at the time of retrieval testing. Contrary to our expectations, metyrapone significantly impaired free recall of emotional texts (p<.05), whereas retrieval of neutral texts or pictures remained unaffected. One week later, participants still showed lower memory for emotional texts in the metyrapone than placebo condition (p<.05). Our finding that suppressing morning cortisol to nadir-like concentrations not only impairs acute retrieval, but also persistently weakens emotional memories corroborates the concept that retrieval effects of cortisol produce persistent memory changes, possibly by affecting re-encoding. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
Retrieval feedback in MEDLINE.
Srinivasan, P
1996-01-01
OBJECTIVE: To investigate a new approach for query expansion based on retrieval feedback. The first objective in this study was to examine alternative query-expansion methods within the same retrieval-feedback framework. The three alternatives proposed are: expansion on the MeSH query field alone, expansion on the free-text field alone, and expansion on both the MeSH and the free-text fields. The second objective was to gain further understanding of retrieval feedback by examining possible dependencies on relevant documents during the feedback cycle. DESIGN: Comparative study of retrieval effectiveness using the original unexpanded and the alternative expanded user queries on a MEDLINE test collection of 75 queries and 2,334 MEDLINE citations. MEASUREMENTS: The retrieval effectiveness of the original unexpanded and the alternative expanded queries was compared using 11-point-average precision scores (11-AvgP). These are averages of precision scores obtained at 11 standard recall points. RESULTS: All three expansion strategies significantly improved the original queries in terms of retrieval effectiveness. Expansion on MeSH alone was equivalent to expansion on both MeSH and the free-text fields. Expansion on the free-text field alone improved the queries significantly less than did the other two strategies. The second part of the study indicated that retrieval-feedback-based expansion yields significant performance improvements independent of the availability of relevant documents for feedback information. CONCLUSIONS: Retrieval feedback offers a robust procedure for query expansion that is most effective for MEDLINE when applied to the MeSH field. PMID:8653452
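The 11-point-average precision measure mentioned under MEASUREMENTS can be sketched for a single query as follows. This assumes the conventional interpolated form, in which precision at each of the recall levels 0.0, 0.1, ..., 1.0 is the highest precision observed at that recall or beyond; the abstract does not say which interpolation the study used, and the function name is hypothetical.

```python
def eleven_point_average_precision(ranked_relevance, total_relevant):
    """Average interpolated precision over recall levels 0.0, 0.1, ..., 1.0.

    ranked_relevance: list of booleans, one per ranked result (True = relevant).
    total_relevant:   number of relevant documents in the whole collection.
    """
    if total_relevant == 0:
        return 0.0
    # (recall, precision) observed after each rank position.
    points = []
    hits = 0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
        points.append((hits / total_relevant, hits / rank))
    total = 0.0
    for i in range(11):
        level = i / 10
        # Interpolated precision: best precision at any recall >= level,
        # or 0.0 if that recall level is never reached.
        candidates = [p for r, p in points if r >= level]
        total += max(candidates, default=0.0)
    return total / 11


# Two relevant documents exist; they are retrieved at ranks 1 and 3.
example = eleven_point_average_precision([True, False, True, False], 2)
```

A collection-level score such as the 11-AvgP reported in the abstract would then be the mean of this value over all test queries.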
Enhancing biomedical text summarization using semantic relation extraction.
Shang, Yue; Li, Yanpeng; Lin, Hongfei; Yang, Zhihao
2011-01-01
Automatic text summarization for a biomedical concept can help researchers to get the key points of a certain topic from a large amount of biomedical literature efficiently. In this paper, we present a method for generating a text summary for a given biomedical concept, e.g., H1N1 disease, from multiple documents based on semantic relation extraction. Our approach includes three stages: 1) We extract semantic relations in each sentence using the semantic knowledge representation tool SemRep. 2) We develop a relation-level retrieval method to select the relations most relevant to each query concept and visualize them in a graphic representation. 3) For relations in the relevant set, we extract informative sentences that can interpret them from the document collection to generate a text summary using an information retrieval based method. Our major focus in this work is to investigate the contribution of semantic relation extraction to the task of biomedical text summarization. The experimental results on summarization for a set of diseases show that the introduction of semantic knowledge improves performance, and our results are better than those of the MEAD system, a well-known tool for text summarization.
Hanauer, David A; Mei, Qiaozhu; Law, James; Khanna, Ritu; Zheng, Kai
2015-06-01
This paper describes the University of Michigan's nine-year experience in developing and using a full-text search engine designed to facilitate information retrieval (IR) from narrative documents stored in electronic health records (EHRs). The system, called the Electronic Medical Record Search Engine (EMERSE), functions similar to Google but is equipped with special functionalities for handling challenges unique to retrieving information from medical text. Key features that distinguish EMERSE from general-purpose search engines are discussed, with an emphasis on functions crucial to (1) improving medical IR performance and (2) assuring search quality and results consistency regardless of users' medical background, stage of training, or level of technical expertise. Since its initial deployment, EMERSE has been enthusiastically embraced by clinicians, administrators, and clinical and translational researchers. To date, the system has been used in supporting more than 750 research projects yielding 80 peer-reviewed publications. In several evaluation studies, EMERSE demonstrated very high levels of sensitivity and specificity in addition to greatly improved chart review efficiency. Increased availability of electronic data in healthcare does not automatically warrant increased availability of information. The success of EMERSE at our institution illustrates that free-text EHR search engines can be a valuable tool to help practitioners and researchers retrieve information from EHRs more effectively and efficiently, enabling critical tasks such as patient case synthesis and research data abstraction. EMERSE, available free of charge for academic use, represents a state-of-the-art medical IR tool with proven effectiveness and user acceptance. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
NASA automatic subject analysis technique for extracting retrievable multi-terms (NASA TERM) system
NASA Technical Reports Server (NTRS)
Kirschbaum, J.; Williamson, R. E.
1978-01-01
Current methods for information processing and retrieval used at the NASA Scientific and Technical Information Facility are reviewed. A more cost effective computer aided indexing system is proposed which automatically generates print terms (phrases) from the natural text. Satisfactory print terms can be generated in a primarily automatic manner to produce a thesaurus (NASA TERMS) which extends all the mappings presently applied by indexers, specifies the worth of each posting term in the thesaurus, and indicates the areas of use of the thesaurus entry phrase. These print terms enable the computer to determine which of several terms in a hierarchy is desirable and to differentiate ambiguous terms. Steps in the NASA TERMS algorithm are discussed and the processing of surrogate entry phrases is demonstrated using four previously manually indexed STAR abstracts for comparison. The simulation shows phrase isolation, text phrase reduction, NASA terms selection, and RECON display.
TRECVID: the utility of a content-based video retrieval evaluation
NASA Astrophysics Data System (ADS)
Hauptmann, Alexander G.
2006-01-01
TRECVID, an annual retrieval evaluation benchmark organized by NIST, encourages research in information retrieval from digital video. TRECVID benchmarking covers both interactive and manual searching by end users, as well as the benchmarking of some supporting technologies including shot boundary detection, extraction of semantic features, and the automatic segmentation of TV news broadcasts. Evaluations done in the context of the TRECVID benchmarks show that generally, speech transcripts and annotations provide the single most important clue for successful retrieval. However, automatically finding the individual images is still a tremendous and unsolved challenge. The evaluations repeatedly found that none of the multimedia analysis and retrieval techniques provide a significant benefit over retrieval using only textual information such as from automatic speech recognition transcripts or closed captions. In interactive systems, we do find significant differences among the top systems, indicating that interfaces can make a huge difference for effective video/image search. For interactive tasks efficient interfaces require few key clicks, but display large numbers of images for visual inspection by the user. The text search finds the right context region in the video in general, but to select specific relevant images we need good interfaces to easily browse the storyboard pictures. In general, TRECVID has motivated the video retrieval community to be honest about what we don't know how to do well (sometimes through painful failures), and has focused us to work on the actual task of video retrieval, as opposed to flashy demos based on technological capabilities.
Information Retrieval and Text Mining Technologies for Chemistry.
Krallinger, Martin; Rabal, Obdulia; Lourenço, Anália; Oyarzabal, Julen; Valencia, Alfonso
2017-06-28
Efficient access to chemical information contained in scientific literature, patents, technical reports, or the web is a pressing need shared by researchers and patent attorneys from different chemical disciplines. Retrieval of important chemical information in most cases starts with finding relevant documents for a particular chemical compound or family. Targeted retrieval of chemical documents is closely connected to the automatic recognition of chemical entities in the text, which commonly involves the extraction of the entire list of chemicals mentioned in a document, including any associated information. In this Review, we provide a comprehensive and in-depth description of fundamental concepts, technical implementations, and current technologies for meeting these information demands. A strong focus is placed on community challenges addressing systems performance, more particularly CHEMDNER and CHEMDNER patents tasks of BioCreative IV and V, respectively. Considering the growing interest in the construction of automatically annotated chemical knowledge bases that integrate chemical information and biological data, cheminformatics approaches for mapping the extracted chemical names into chemical structures and their subsequent annotation together with text mining applications for linking chemistry with biological information are also presented. Finally, future trends and current challenges are highlighted as a roadmap proposal for research in this emerging field.
QCS : a system for querying, clustering, and summarizing documents.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dunlavy, Daniel M.
2006-08-01
Information retrieval systems consist of many complicated components. Research and development of such systems is often hampered by the difficulty in evaluating how each particular component would behave across multiple systems. We present a novel hybrid information retrieval system--the Query, Cluster, Summarize (QCS) system--which is portable, modular, and permits experimentation with different instantiations of each of the constituent text analysis components. Most importantly, the combination of the three types of components in the QCS design improves retrievals by providing users more focused information organized by topic. We demonstrate the improved performance by a series of experiments using standard test sets from the Document Understanding Conferences (DUC) along with the best known automatic metric for summarization system evaluation, ROUGE. Although the DUC data and evaluations were originally designed to test multidocument summarization, we developed a framework to extend it to the task of evaluation for each of the three components: query, clustering, and summarization. Under this framework, we then demonstrate that the QCS system (end-to-end) achieves performance as good as or better than the best summarization engines. Given a query, QCS retrieves relevant documents, separates the retrieved documents into topic clusters, and creates a single summary for each cluster. In the current implementation, Latent Semantic Indexing is used for retrieval, generalized spherical k-means is used for the document clustering, and a method coupling sentence "trimming", and a hidden Markov model, followed by a pivoted QR decomposition, is used to create a single extract summary for each cluster. The user interface is designed to provide access to detailed information in a compact and useful format. Our system demonstrates the feasibility of assembling an effective IR system from existing software libraries, the usefulness of the modularity of the design, and the value of this particular combination of modules.
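To make the clustering stage more concrete, here is a minimal sketch of plain spherical k-means, which groups document vectors by cosine similarity. It is a simplified stand-in, not the generalized variant used in QCS, and the function names, the caller-supplied initial centroids, and the fixed iteration count are all assumptions made for the example.

```python
import math


def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def spherical_kmeans(vectors, init_centroids, iterations=10):
    """Cluster vectors by cosine similarity; returns (labels, centroids)."""
    points = [normalize(v) for v in vectors]
    centroids = [normalize(c) for c in init_centroids]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: on unit vectors the dot product is the cosine,
        # so each point goes to the centroid with the largest dot product.
        labels = [
            max(range(len(centroids)), key=lambda j: dot(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid becomes the normalized sum of its members.
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                summed = [sum(col) for col in zip(*members)]
                centroids[j] = normalize(summed)
    return labels, centroids


# Four toy "document vectors" in a two-term space, two initial centroids.
labels, centroids = spherical_kmeans(
    [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]],
    [[1.0, 0.0], [0.0, 1.0]],
)
```

In a pipeline like the one described, the input vectors would be the retrieved documents' term vectors, and each resulting cluster would be passed on to the summarizer.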
QCS: a system for querying, clustering and summarizing documents.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dunlavy, Daniel M.; Schlesinger, Judith D.; O'Leary, Dianne P.
2006-10-01
Information retrieval systems consist of many complicated components. Research and development of such systems is often hampered by the difficulty in evaluating how each particular component would behave across multiple systems. We present a novel hybrid information retrieval system--the Query, Cluster, Summarize (QCS) system--which is portable, modular, and permits experimentation with different instantiations of each of the constituent text analysis components. Most importantly, the combination of the three types of components in the QCS design improves retrievals by providing users more focused information organized by topic. We demonstrate the improved performance by a series of experiments using standard test sets from the Document Understanding Conferences (DUC) along with the best known automatic metric for summarization system evaluation, ROUGE. Although the DUC data and evaluations were originally designed to test multidocument summarization, we developed a framework to extend it to the task of evaluation for each of the three components: query, clustering, and summarization. Under this framework, we then demonstrate that the QCS system (end-to-end) achieves performance as good as or better than the best summarization engines. Given a query, QCS retrieves relevant documents, separates the retrieved documents into topic clusters, and creates a single summary for each cluster. In the current implementation, Latent Semantic Indexing is used for retrieval, generalized spherical k-means is used for the document clustering, and a method coupling sentence 'trimming', and a hidden Markov model, followed by a pivoted QR decomposition, is used to create a single extract summary for each cluster. The user interface is designed to provide access to detailed information in a compact and useful format. Our system demonstrates the feasibility of assembling an effective IR system from existing software libraries, the usefulness of the modularity of the design, and the value of this particular combination of modules.
Essie: A Concept-based Search Engine for Structured Biomedical Text
Ide, Nicholas C.; Loane, Russell F.; Demner-Fushman, Dina
2007-01-01
This article describes the algorithms implemented in the Essie search engine that is currently serving several Web sites at the National Library of Medicine. Essie is a phrase-based search engine with term and concept query expansion and probabilistic relevancy ranking. Essie’s design is motivated by an observation that query terms are often conceptually related to terms in a document, without actually occurring in the document text. Essie’s performance was evaluated using data and standard evaluation methods from the 2003 and 2006 Text REtrieval Conference (TREC) Genomics track. Essie was the best-performing search engine in the 2003 TREC Genomics track and achieved results comparable to those of the highest-ranking systems on the 2006 TREC Genomics track task. Essie shows that a judicious combination of exploiting document structure, phrase searching, and concept based query expansion is a useful approach for information retrieval in the biomedical domain. PMID:17329729
Are written and spoken recall of text equivalent?
Kellogg, Ronald T
2007-01-01
Writing is less practiced than speaking, graphemic codes are activated only in writing, and the retrieved representations of the text must be maintained in working memory longer because handwritten output is slower than speech. These extra demands on working memory could result in less effort being given to retrieval during written compared with spoken text recall. To test this hypothesis, college students read or heard Bartlett's "War of the Ghosts" and then recalled the text in writing or speech. Spoken recall produced more accurately recalled propositions and more major distortions (e.g., inferences) than written recall. The results suggest that writing reduces the retrieval effort given to reconstructing the propositions of a text.
ERIC Educational Resources Information Center
Qin, Jian; Jurisica, Igor; Liddy, Elizabeth D.; Jansen, Bernard J; Spink, Amanda; Priss, Uta; Norton, Melanie J.
2000-01-01
These six articles discuss knowledge discovery in databases (KDD). Topics include data mining; knowledge management systems; applications of knowledge discovery; text and Web mining; text mining and information retrieval; user search patterns through Web log analysis; concept analysis; data collection; and data structure inconsistency. (LRW)
Retrieval practice enhances the ability to evaluate complex physiology information.
Dobson, John; Linderholm, Tracy; Perez, Jose
2018-05-01
Many investigations have shown that retrieval practice enhances the recall of different types of information, including both medical and physiological, but the effects of the strategy on higher-order thinking, such as evaluation, are less clear. The primary aim of this study was to compare how effectively retrieval practice and repeated studying (i.e. reading) strategies facilitated the evaluation of two research articles that advocated dissimilar conclusions. A secondary aim was to determine if that comparison was affected by using those same strategies to first learn important contextual information about the articles. Participants were randomly assigned to learn three texts that provided background information about the research articles either by studying them four consecutive times (Text-S) or by studying and then retrieving them two consecutive times (Text-R). Half of both the Text-S and Text-R groups were then randomly assigned to learn two physiology research articles by studying them four consecutive times (Article-S) and the other half learned them by studying and then retrieving them two consecutive times (Article-R). Participants then completed two assessments: the first tested their ability to critique the research articles and the second tested their recall of the background texts. On the article critique assessment, the Article-R groups' mean scores of 33.7 ± 4.7% and 35.4 ± 4.5% (Text-R then Article-R group and Text-S then Article-R group, respectively) were both significantly (p < 0.05) higher than the two Article-S mean scores of 19.5 ± 4.4% and 21.7 ± 2.9% (Text-S then Article-S group and Text-R then Article-S group, respectively). There was no difference between the two Article-R groups on the article critique assessment, indicating those scores weren't affected by the different contextual learning strategies. 
Retrieval practice promoted superior critical evaluation of the research articles, and the results also indicated the strategy enhanced the recall of background information. © 2018 John Wiley & Sons Ltd and The Association for the Study of Medical Education.
A novel methodology for querying web images
NASA Astrophysics Data System (ADS)
Prabhakara, Rashmi; Lee, Ching Cheng
2005-01-01
Ever since the advent of the Internet, there has been an immense growth in the amount of image data that is available on the World Wide Web. With such a magnitude of image availability, an efficient and effective image retrieval system is required to make use of this information. This research presents an effective image matching and indexing technique that improves on existing integrated image retrieval methods. The proposed technique follows a two-phase approach, integrating query by topic and query by example specification methods. The first phase consists of topic-based image retrieval using an improved text information retrieval (IR) technique that makes use of the structured format of HTML documents. It consists of a focused crawler that lets the user enter not only the keyword for the topic-based search but also the scope in which the user wants to find the images. The second phase uses the query by example specification to perform a low-level content-based image match that retrieves a smaller set of results more closely matching the example image. Information related to the image feature is automatically extracted from the query image by the image processing system. A computationally inexpensive technique based on color features is used to perform content-based matching of images. The main goal is to develop a functional image search and indexing system and to demonstrate that better retrieval results can be achieved with this proposed hybrid search technique.
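One inexpensive way to match images on color, of the general kind the second phase describes, is to compare coarse color histograms. The sketch below is an assumption about what such a technique could look like, not the authors' method: the function names, the choice of 4 bins per RGB channel, and histogram intersection as the similarity measure are all illustrative.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Build a normalized RGB histogram from (r, g, b) tuples with 0-255 values."""
    hist = [0.0] * (bins_per_channel ** 3)
    count = 0
    for r, g, b in pixels:
        # Quantize each channel into bins_per_channel buckets (always in range).
        rb = r * bins_per_channel // 256
        gb = g * bins_per_channel // 256
        bb = b * bins_per_channel // 256
        hist[(rb * bins_per_channel + gb) * bins_per_channel + bb] += 1
        count += 1
    # Normalize so that images of different sizes are comparable.
    return [h / count for h in hist] if count else hist


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: 1.0 for identical color distributions."""
    return sum(min(a, b) for a, b in zip(h1, h2))


# Toy "images" given as short pixel lists.
query = color_histogram([(255, 0, 0)] * 4)
similar = color_histogram([(250, 10, 5)] * 3 + [(0, 0, 255)])
different = color_histogram([(0, 0, 255)] * 4)
```

Candidate images returned by the topic-based phase could then be ranked by `histogram_intersection` against the query image's histogram, which costs only one pass over each image's pixels plus a short vector comparison.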
Experiments on Interfaces To Support Query Expansion.
ERIC Educational Resources Information Center
Beaulieu, M.
1997-01-01
Focuses on the user and human-computer interaction aspects of the research based on the Okapi text retrieval system. Three experiments implementing different approaches to query expansion are described, including the use of graphical user interfaces with different windowing techniques. (Author/LRW)
Multi-Character Tries for Text Searching.
ERIC Educational Resources Information Center
Cooper, Lorraine K. D.; Tharp, Alan L.
1993-01-01
Introduces the multicharacter trie as an index structure that can improve the time needed for retrieving full-text materials stored on CD-ROMs. The advantages of this structure compared to other structures are described, and experimental results comparing it to the widely used B+ tree and other structures used for full-text retrieval are…
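The abstract does not give the layout of the multicharacter trie. The sketch below shows only the general idea, on the assumption that each edge consumes a fixed-size chunk of `k` characters instead of one character, so a lookup visits fewer nodes. The class names, the `k` parameter, and the posting sets of document ids are hypothetical.

```python
class _Node:
    """One trie node: child edges keyed by character chunks, plus postings."""

    def __init__(self):
        self.children = {}
        self.docs = set()


class MultiCharTrie:
    """Exact-match index whose edges carry up to k characters (assumed design)."""

    def __init__(self, k=2):
        if k < 1:
            raise ValueError("k must be at least 1")
        self.k = k
        self.root = _Node()

    def _chunks(self, word):
        # Split the word into consecutive pieces of length k (the last may be shorter).
        return [word[i:i + self.k] for i in range(0, len(word), self.k)]

    def insert(self, word, doc_id):
        node = self.root
        for chunk in self._chunks(word):
            node = node.children.setdefault(chunk, _Node())
        node.docs.add(doc_id)

    def lookup(self, word):
        """Return the set of document ids recorded for exactly this word."""
        node = self.root
        for chunk in self._chunks(word):
            node = node.children.get(chunk)
            if node is None:
                return set()
        return set(node.docs)
```

With `k=2`, a nine-letter word is resolved in five node visits rather than nine, which is the kind of saving that matters when each visit may cost a slow CD-ROM read.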
Spatial Paradigm for Information Retrieval and Exploration
DOE Office of Scientific and Technical Information (OSTI.GOV)
The SPIRE system consists of software for visual analysis of primarily text based information sources. This technology enables the content analysis of text documents without reading all the documents. It employs several algorithms for text and word proximity analysis. It identifies the key themes within the text documents. From this analysis, it projects the results onto a visual spatial proximity display (Galaxies or Themescape) where items (documents and/or themes) visually close to each other are known to have content which is close to each other. Innovative interaction techniques then allow for dynamic visual analysis of large text based information spaces.
SPIRE1.03. Spatial Paradigm for Information Retrieval and Exploration
DOE Office of Scientific and Technical Information (OSTI.GOV)
Adams, K.J.; Bohn, S.; Crow, V.
The SPIRE system consists of software for visual analysis of primarily text based information sources. This technology enables the content analysis of text documents without reading all the documents. It employs several algorithms for text and word proximity analysis. It identifies the key themes within the text documents. From this analysis, it projects the results onto a visual spatial proximity display (Galaxies or Themescape) where items (documents and/or themes) visually close to each other are known to have content which is close to each other. Innovative interaction techniques then allow for dynamic visual analysis of large text based information spaces.
On the Application of Syntactic Methodologies in Automatic Text Analysis.
ERIC Educational Resources Information Center
Salton, Gerard; And Others
1990-01-01
Summarizes various linguistic approaches proposed for document analysis in information retrieval environments. Topics discussed include syntactic analysis; use of machine-readable dictionary information; knowledge base construction; the PLNLP English Grammar (PEG) system; phrase normalization; and statistical and syntactic phrase evaluation used…
GPM Mission Gridded Text Products Providing Surface Precipitation Retrievals
NASA Astrophysics Data System (ADS)
Stocker, Erich Franz; Kelley, Owen; Huffman, George; Kummerow, Christian
2015-04-01
In February 2015, the Global Precipitation Measurement (GPM) mission core satellite will complete its first year in space. The core satellite carries a conically scanning microwave imager called the GPM Microwave Imager (GMI), which also has 166 GHz and 183 GHz frequency channels. The GPM core satellite also carries a dual frequency radar (DPR), which operates at the Ku frequency (similar to the Tropical Rainfall Measuring Mission (TRMM) Precipitation Radar) and a new Ka frequency. The precipitation processing system (PPS) is producing swath-based instantaneous precipitation retrievals from GMI, both radars including a dual-frequency product, and a combined GMI/DPR precipitation retrieval. These level 2 products are written in the HDF5 format and have many additional parameters beyond surface precipitation that are organized into appropriate groups. While these retrieval algorithms were developed prior to launch and are not optimal, these algorithms are producing very creditable retrievals. It is appropriate for a wide group of users to have access to the GPM retrievals. However, for researchers requiring only surface precipitation, these L2 swath products can appear very intimidating, and they certainly contain many more variables than the average researcher needs. Some researchers desire only surface retrievals stored in a simple, easily accessible format. In response, PPS has begun to produce gridded text-based products that contain just the most widely used variables for each instrument (surface rainfall rate, fraction liquid, fraction convective) in a single line for each grid box that contains one or more observations. This paper will describe the gridded data products that are being produced and provide an overview of their content.
Currently two types of gridded products are being produced: (1) surface precipitation retrievals from the core satellite instruments (GMI, DPR, and combined GMI/DPR) and (2) surface precipitation retrievals for the partner constellation satellites. Both of these gridded products are generated for a 0.25 degree x 0.25 degree hourly grid and are packaged into daily ASCII files that can be downloaded from the PPS FTP site. To reduce the download size, the files are compressed using the gzip utility. This paper will focus on presenting high-level details about the gridded text product being generated from the instruments on the GPM core satellite. Summary information will also be presented about the partner radiometer gridded product. All retrievals for the partner radiometer are done using the GPROF2014 algorithm, which takes as input the PPS-generated inter-calibrated 1C product for the radiometer.
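To make the grid geometry concrete, the sketch below maps a latitude/longitude pair to the quarter-degree box that contains it. Only the 0.25 degree spacing comes from the abstract. The grid origin (rows counted north from -90, columns east from -180), the zero-based indexing, and the function name are assumptions for illustration; the real PPS file layout may differ.

```python
import math

CELL_DEG = 0.25  # grid spacing stated in the abstract


def grid_box(lat, lon, cell=CELL_DEG):
    """Return the (row, col) of the grid box containing (lat, lon).

    Assumed convention: row 0 starts at latitude -90 and col 0 at
    longitude -180; points on the top or right edge fall in the last box.
    """
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        raise ValueError("latitude or longitude out of range")
    n_rows = round(180 / cell)  # 720 rows at 0.25 degrees
    n_cols = round(360 / cell)  # 1440 columns at 0.25 degrees
    row = min(int(math.floor((lat + 90.0) / cell)), n_rows - 1)
    col = min(int(math.floor((lon + 180.0) / cell)), n_cols - 1)
    return row, col
```

At this spacing a global grid has 720 x 1440 = 1,036,800 boxes per hour, which suggests why writing a line only for boxes that contain observations keeps the daily files manageable.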
Enhancing Biomedical Text Summarization Using Semantic Relation Extraction
Shang, Yue; Li, Yanpeng; Lin, Hongfei; Yang, Zhihao
2011-01-01
Automatic text summarization for a biomedical concept can help researchers efficiently get the key points of a certain topic from a large amount of biomedical literature. In this paper, we present a method for generating a text summary for a given biomedical concept, e.g., H1N1 disease, from multiple documents based on semantic relation extraction. Our approach includes three stages: 1) We extract semantic relations in each sentence using the semantic knowledge representation tool SemRep. 2) We develop a relation-level retrieval method to select the relations most relevant to each query concept and visualize them in a graphic representation. 3) For relations in the relevant set, we extract informative sentences that can interpret them from the document collection to generate a text summary using an information retrieval based method. Our major focus in this work is to investigate the contribution of semantic relation extraction to the task of biomedical text summarization. The experimental results on summarization for a set of diseases show that the introduction of semantic knowledge improves the performance and that our results are better than those of the MEAD system, a well-known tool for text summarization. PMID:21887336
A Study of Composition/Correction System with Corpus Retrieval Function
ERIC Educational Resources Information Center
Liu, Song; Liu, Peng; Urano, Yoshiyori
2013-01-01
Practice and research in composition education that uses computers and networks have become increasingly active. Through an online composition system, a large amount of written text produced by students and teachers can be collected. This kind of information is called a learner corpus, which is important in second language education because the…
Managing biomedical image metadata for search and retrieval of similar images.
Korenblum, Daniel; Rubin, Daniel; Napel, Sandy; Rodriguez, Cesar; Beaulieu, Chris
2011-08-01
Radiology images are generally disconnected from the metadata describing their contents, such as imaging observations ("semantic" metadata), which are usually described in text reports that are not directly linked to the images. We developed a system, the Biomedical Image Metadata Manager (BIMM) to (1) address the problem of managing biomedical image metadata and (2) facilitate the retrieval of similar images using semantic feature metadata. Our approach allows radiologists, researchers, and students to take advantage of the vast and growing repositories of medical image data by explicitly linking images to their associated metadata in a relational database that is globally accessible through a Web application. BIMM receives input in the form of standard-based metadata files using Web service and parses and stores the metadata in a relational database allowing efficient data query and maintenance capabilities. Upon querying BIMM for images, 2D regions of interest (ROIs) stored as metadata are automatically rendered onto preview images included in search results. The system's "match observations" function retrieves images with similar ROIs based on specific semantic features describing imaging observation characteristics (IOCs). We demonstrate that the system, using IOCs alone, can accurately retrieve images with diagnoses matching the query images, and we evaluate its performance on a set of annotated liver lesion images. BIMM has several potential applications, e.g., computer-aided detection and diagnosis, content-based image retrieval, automating medical analysis protocols, and gathering population statistics like disease prevalences. The system provides a framework for decision support systems, potentially improving their diagnostic accuracy and selection of appropriate therapies.
Exploiting salient semantic analysis for information retrieval
NASA Astrophysics Data System (ADS)
Luo, Jing; Meng, Bo; Quan, Changqin; Tu, Xinhui
2016-11-01
Recently, many Wikipedia-based methods have been proposed to improve the performance of different natural language processing (NLP) tasks, such as semantic relatedness computation, text classification and information retrieval. Among these methods, salient semantic analysis (SSA) has been proven to be an effective way to generate conceptual representations for words or documents. However, its feasibility and effectiveness in information retrieval are mostly unknown. In this paper, we study how to efficiently use SSA to improve information retrieval performance, and propose an SSA-based retrieval method under the language model framework. First, the SSA model is adopted to build conceptual representations for documents and queries. Then, these conceptual representations and the bag-of-words (BOW) representations are used in combination to estimate the language models of queries and documents. The proposed method is evaluated on several standard Text REtrieval Conference (TREC) collections. Experimental results show that the proposed models consistently outperform the existing Wikipedia-based retrieval methods.
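The abstract does not state how the conceptual and bag-of-words representations are combined. One common pattern in the language model framework is to score the query under each representation and interpolate the two log-likelihoods. The sketch below follows that pattern with Jelinek-Mercer smoothing; the smoothing method, the weights `smooth` and `lam`, and both function names are assumptions for illustration and are not the authors' model.

```python
import math
from collections import Counter


def jm_log_likelihood(query_terms, doc_terms, collection_terms, smooth=0.5):
    """Query log-likelihood under a Jelinek-Mercer smoothed document model.

    Terms absent from the whole collection are skipped, so every remaining
    term has non-zero probability as long as 0 < smooth <= 1.
    """
    doc, coll = Counter(doc_terms), Counter(collection_terms)
    doc_len, coll_len = sum(doc.values()), sum(coll.values())
    score = 0.0
    for term in query_terms:
        if coll[term] == 0:
            continue  # out-of-collection term carries no evidence
        p_doc = doc[term] / doc_len if doc_len else 0.0
        p_coll = coll[term] / coll_len
        score += math.log((1.0 - smooth) * p_doc + smooth * p_coll)
    return score


def combined_score(q_words, d_words, c_words,
                   q_concepts, d_concepts, c_concepts, lam=0.7):
    """Interpolate word-level and concept-level scores (hypothetical weighting)."""
    bow = jm_log_likelihood(q_words, d_words, c_words)
    concept = jm_log_likelihood(q_concepts, d_concepts, c_concepts)
    return lam * bow + (1.0 - lam) * concept
```

Here the `*_concepts` arguments stand in for whatever concept identifiers an SSA-style annotator would emit; producing them is outside the scope of this sketch.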
Architecture for biomedical multimedia information delivery on the World Wide Web
NASA Astrophysics Data System (ADS)
Long, L. Rodney; Goh, Gin-Hua; Neve, Leif; Thoma, George R.
1997-10-01
Research engineers at the National Library of Medicine are building a prototype system for the delivery of multimedia biomedical information on the World Wide Web. This paper discusses the architecture and design considerations for the system, which will be used initially to make images and text from the third National Health and Nutrition Examination Survey (NHANES) publicly available. We categorized our analysis as follows: (1) fundamental software tools: we analyzed trade-offs among use of conventional HTML/CGI, X Window Broadway, and Java; (2) image delivery: we examined the use of unconventional TCP transmission methods; (3) database manager and database design: we discuss the capabilities and planned use of the Informix object-relational database manager and the planned schema for the NHANES database; (4) storage requirements for our Sun server; (5) user interface considerations; (6) the compatibility of the system with other standard research and analysis tools; (7) image display: we discuss considerations for consistent image display for end users. Finally, we discuss the scalability of the system in terms of incorporating larger or more databases of similar data, and the extendibility of the system for supporting content-based retrieval of biomedical images. The system prototype is called the Web-based Medical Information Retrieval System. An early version was built as a Java applet and tested on Unix, PC, and Macintosh platforms. This prototype used the MiniSQL database manager to do text queries on a small database of records of participants in the second NHANES survey. The full records and associated x-ray images were retrievable and displayable on a standard Web browser. A second version has now been built, also a Java applet, using the MySQL database manager.
NASA Technical Reports Server (NTRS)
Stocker, Erich Franz; Kelley, Owen
2017-01-01
This presentation will summarize the changes in the products for the GPM V05 reprocessing cycle. It will concentrate on discussing the gridded text product from the core satellite retrievals. However, all aspects of the GPROF GMI changes in this product are equally appropriate to the other two gridded text products. The GPM mission reprocessed its products in May of 2017 as part of a continuing improvement of precipitation retrievals. This led to important improvements in the retrievals and therefore also necessitated reprocessing the gridded text products. The V05 GPROF changes not only improved the retrievals but also substantially altered the format, and this compelled changes to the gridded text products. Especially important in this regard is the change in GPROF2017 (used in V05) away from reporting the fraction of the total precipitation rate that occurs as convection or in the liquid phase. Instead, GPROF2017, and therefore the V05 gridded text products, report the rate of convective precipitation in mm/hr. The GPROF2017 algorithm now reports the frozen precipitation rate in mm/hr rather than the fraction of total precipitation that is liquid. Because the aim of the gridded text product is to remain simple, the radar and combined results will also change in V05 to reflect this change in the GMI retrieval. The presentation provides an analysis of these changes as well as a comparison with the swath products from which the hourly text grids were derived.
AI User Support System for SAP ERP
NASA Astrophysics Data System (ADS)
Vlasov, Vladimir; Chebotareva, Victoria; Rakhimov, Marat; Kruglikov, Sergey
2017-10-01
An intelligent system for SAP ERP user support is proposed in this paper. It enables automatic replies to users’ requests for support, saving time for problem analysis and resolution and improving responsiveness for end users. The system is based on an ensemble of machine learning algorithms for multiclass text classification, providing efficient question understanding, and a special framework for evidence retrieval, providing the best answer derivation.
Moving beyond Text Highlights: Inferring Users' Interests to Improve the Relevance of Retrieval
ERIC Educational Resources Information Center
Balakrishnan, Vimala; Mehmood, Yasir; Nagappan, Yoganathan
2016-01-01
Introduction: Studies have indicated that users' text highlighting behaviour can be further manipulated to improve the relevance of retrieved results. This article reports on a study that examined users' text highlight frequency, length and users' copy-paste actions. Method: A binary voting mechanism was employed to determine the weights for the…
Data Visualization in Information Retrieval and Data Mining (SIG VIS).
ERIC Educational Resources Information Center
Efthimiadis, Efthimis
2000-01-01
Presents abstracts that discuss using data visualization for information retrieval and data mining, including immersive information space and spatial metaphors; spatial data using multi-dimensional matrices with maps; TREC (Text Retrieval Conference) experiments; users' information needs in cartographic information retrieval; and users' relevance…
Challenges for automatically extracting molecular interactions from full-text articles.
McIntosh, Tara; Curran, James R
2009-09-24
The increasing availability of full-text biomedical articles will allow more biomedical knowledge to be extracted automatically with greater reliability. However, most Information Retrieval (IR) and Extraction (IE) tools currently process only abstracts. The lack of corpora has limited the development of tools that are capable of exploiting the knowledge in full-text articles. As a result, there has been little investigation into the advantages of full-text document structure, and the challenges developers will face in processing full-text articles. We manually annotated passages from full-text articles that describe interactions summarised in a Molecular Interaction Map (MIM). Our corpus tracks the process of identifying facts to form the MIM summaries and captures any factual dependencies that must be resolved to extract the fact completely. For example, a fact in the results section may require a synonym defined in the introduction. The passages are also annotated with negated and coreference expressions that must be resolved. We describe the guidelines for identifying relevant passages and possible dependencies. The corpus includes 2162 sentences from 78 full-text articles. Our corpus analysis demonstrates the necessity of full-text processing; identifies the article sections where interactions are most commonly stated; and quantifies the proportion of interaction statements requiring coherent dependencies. Further, it allows us to report on the relative importance of identifying synonyms and resolving negated expressions. We also experiment with an oracle sentence retrieval system using the corpus as a gold-standard evaluation set. We introduce the MIM corpus, a unique resource that maps interaction facts in a MIM to annotated passages within full-text articles. It is an invaluable case study providing guidance to developers of biomedical IR and IE systems, and can be used as a gold-standard evaluation set for full-text IR tasks.
Multilingual Information Discovery and AccesS (MIDAS): A Joint ACM DL'99/ ACM SIGIR'99 Workshop.
ERIC Educational Resources Information Center
Oard, Douglas; Peters, Carol; Ruiz, Miguel; Frederking, Robert; Klavans, Judith; Sheridan, Paraic
1999-01-01
Discusses a multidisciplinary workshop that addressed issues concerning internationally distributed information networks. Highlights include multilingual information access in media other than character-coded text; cross-language information retrieval and multilingual metadata; and evaluation of multilingual systems. (LRW)
A Survey in Indexing and Searching XML Documents.
ERIC Educational Resources Information Center
Luk, Robert W. P.; Leong, H. V.; Dillon, Tharam S.; Chan, Alvin T. S.; Croft, W. Bruce; Allan, James
2002-01-01
Discussion of XML focuses on indexing techniques for XML documents, grouping them into flat-file, semistructured, and structured indexing paradigms. Highlights include searching techniques, including full text search and multistage search; search result presentations; database and information retrieval system integration; XML query languages; and…
Beyond Information Retrieval—Medical Question Answering
Lee, Minsuk; Cimino, James; Zhu, Hai Ran; Sable, Carl; Shanker, Vijay; Ely, John; Yu, Hong
2006-01-01
Physicians have many questions when caring for patients, and frequently need to seek answers for their questions. Information retrieval systems (e.g., PubMed) typically return a list of documents in response to a user’s query. Frequently the number of returned documents is large and makes physicians’ information seeking “practical only ‘after hours’ and not in the clinical settings”. Question answering techniques are based on automatically analyzing thousands of electronic documents to generate short-text answers in response to clinical questions that are posed by physicians. The authors address physicians’ information needs and describe the design, implementation, and evaluation of the medical question answering system (MedQA). Although our long term goal is to enable MedQA to answer all types of medical questions, we currently implement MedQA to integrate information retrieval, extraction, and summarization techniques to automatically generate paragraph-level text for definitional questions (i.e., “What is X?”). MedQA can be accessed at http://www.dbmi.columbia.edu/~yuh9001/research/MedQA.html. PMID:17238385
Probabilistic and machine learning-based retrieval approaches for biomedical dataset retrieval
Karisani, Payam; Qin, Zhaohui S; Agichtein, Eugene
2018-01-01
Abstract The bioCADDIE dataset retrieval challenge brought together different approaches to retrieval of biomedical datasets relevant to a user’s query, expressed as a text description of a needed dataset. We describe experiments in applying a data-driven, machine learning-based approach to biomedical dataset retrieval as part of this challenge. We report on a series of experiments carried out to evaluate the performance of both probabilistic and machine learning-driven techniques from information retrieval, as applied to this challenge. Our experiments with probabilistic information retrieval methods, such as query term weight optimization, automatic query expansion and simulated user relevance feedback, demonstrate that automatically boosting the weights of important keywords in a verbose query is more effective than other methods. We also show that although there is a rich space of potential representations and features available in this domain, machine learning-based re-ranking models are not able to improve on probabilistic information retrieval techniques with the currently available training data. The models and algorithms presented in this paper can serve as a viable implementation of a search engine to provide access to biomedical datasets. The retrieval performance is expected to be further improved by using additional training data that is created by expert annotation, or gathered through usage logs, clicks and other processes during natural operation of the system. Database URL: https://github.com/emory-irlab/biocaddie PMID:29688379
Comparison of Effects of Different Forms of Presentation on the Recall and Retrieval of Information.
ERIC Educational Resources Information Center
Jonassen, David H.; Pace, Ann Jaffe
A study compared the relative effects of typographically cued or mapped text, intact text with signaling, and intact text without signaling on the recall and retrieval of information from prose passages. (Signaling, a noncontent aspect of prose, emphasizes certain aspects of the semantic content or points out aspects of the structure of content.)…
The testing effect and analogical problem-solving.
Peterson, Daniel J; Wissman, Kathryn T
2018-06-25
Researchers generally agree that retrieval practice of previously learned material facilitates subsequent recall of the same material, a phenomenon known as the testing effect. There is debate, however, about when such benefits transfer to related (though not identical) material. The current study examines the phenomenon of transfer in the domain of analogical problem-solving. In Experiments 1 and 2, learners read a source text describing a problem and its solution, which they subsequently either restudied or recalled. Following a short (Experiment 1) or long (Experiment 2) delay, learners were given a new target text and asked to solve a problem. The two texts shared a common structure such that the provided solution for the source text could be applied to solve the problem in the target text. In a combined analysis of both experiments, learners in the retrieval practice condition were more successful at solving the problem than those in the restudy condition. Experiment 3 explored the degree to which retrieval practice promotes cued versus spontaneous transfer by manipulating whether participants were provided with an explicit hint that the source and target texts were related. Results revealed no effect of retrieval practice.
Data Discretization for Novel Relationship Discovery in Information Retrieval.
ERIC Educational Resources Information Center
Benoit, G.
2002-01-01
Describes an information retrieval, visualization, and manipulation model which offers the user multiple ways to exploit the retrieval set, based on weighted query terms, via an interactive interface. Outlines the mathematical model and describes an information retrieval application built on the model to search structured and full-text files.…
Variation in Relevance Judgments and the Measurement of Retrieval Effectiveness.
ERIC Educational Resources Information Center
Voorhees, Ellen M.
2000-01-01
Discusses the test collections developed in the TREC (Text REtrieval Conference) workshops for information retrieval research and describes a study by NIST (National Institute of Standards and Technology) that verified their reliability by investigating the effect changes in the relevance assessments have on the evaluation of retrieval results.…
Natural Language Processing: Toward Large-Scale, Robust Systems.
ERIC Educational Resources Information Center
Haas, Stephanie W.
1996-01-01
Natural language processing (NLP) is concerned with getting computers to do useful things with natural language. Major applications include machine translation, text generation, information retrieval, and natural language interfaces. Reviews important developments since 1987 that have led to advances in NLP; current NLP applications; and problems…
Image query and indexing for digital x rays
NASA Astrophysics Data System (ADS)
Long, L. Rodney; Thoma, George R.
1998-12-01
The web-based medical information retrieval system (WebMIRS) allows Internet access to databases containing 17,000 digitized x-ray spine images and associated text data from National Health and Nutrition Examination Surveys (NHANES). WebMIRS allows SQL query of the text, and viewing of the returned text records and images using a standard browser. We are now working (1) to determine the utility of data directly derived from the images in our databases, and (2) to investigate the feasibility of computer-assisted or automated indexing of the images to support retrieval of images of interest to biomedical researchers in the field of osteoarthritis. To build an initial database based on image data, we are manually segmenting a subset of the vertebrae, using techniques from vertebral morphometry. From this, we will derive vertebral features and add them to the database. This image-derived data will enhance the user's data access capability by enabling the creation of combined SQL/image-content queries.
NASA directives master list and index
NASA Technical Reports Server (NTRS)
1993-01-01
This Handbook sets forth in two parts the information for the guidance of users of the NASA Management Directives System. Complementary to this Handbook is the NASA Online Directives Information System (NODIS), an electronic computer text retrieval system. The first part contains the Master List of Management Directives in force as of 30 Sep. 1993. The second part contains an Index to NASA Management Directives in force as of 30 Sep. 1993.
Bio-TDS: bioscience query tool discovery system.
Gnimpieba, Etienne Z; VanDiermen, Menno S; Gustafson, Shayla M; Conn, Bill; Lushbough, Carol M
2017-01-04
Bioinformatics and computational biology play a critical role in bioscience and biomedical research. As researchers design their experimental projects, one major challenge is to find the most relevant bioinformatics toolkits that will lead to new knowledge discovery from their data. The Bio-TDS (Bioscience Query Tool Discovery Systems, http://biotds.org/) has been developed to assist researchers in retrieving the most applicable analytic tools by allowing them to formulate their questions as free text. The Bio-TDS is a flexible retrieval system that affords users from multiple bioscience domains (e.g. genomic, proteomic, bio-imaging) the ability to query over 12 000 analytic tool descriptions integrated from well-established community repositories. One of the primary components of the Bio-TDS is the ontology and natural language processing workflow for annotation, curation, query processing, and evaluation. The Bio-TDS's scientific impact was evaluated using sample questions posed by researchers retrieved from Biostars, a site focusing on biological data analysis. The Bio-TDS was compared to five similar bioscience analytic tool retrieval systems, with the Bio-TDS outperforming the others in terms of relevance and completeness. The Bio-TDS offers researchers the capacity to associate their bioscience question with the most relevant computational toolsets required for the data analysis in their knowledge discovery process. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Mining biomedical images towards valuable information retrieval in biomedical and life sciences
Ahmed, Zeeshan; Zeeshan, Saman; Dandekar, Thomas
2016-01-01
Biomedical images are helpful sources for scientists and practitioners in drawing significant hypotheses, exemplifying approaches and describing experimental results in published biomedical literature. In recent decades, there has been an enormous increase in the production and publication of heterogeneous biomedical images, which creates a need for bioimaging platforms that extract and analyze features, text and content in biomedical images in order to implement effective information retrieval systems. In this review, we summarize technologies related to data mining of figures. We describe and compare the potential of different approaches in terms of their developmental aspects, used methodologies, produced results, achieved accuracies and limitations. Our comparative conclusions include current challenges for bioimaging software with selective image mining, embedded text extraction and processing of complex natural language queries. PMID:27538578
Knowledge Retrieval Solutions.
ERIC Educational Resources Information Center
Khan, Kamran
1998-01-01
Excalibur RetrievalWare offers true knowledge retrieval solutions. Its fundamental technologies, Adaptive Pattern Recognition Processing and Semantic Networks, have capabilities for knowledge discovery and knowledge management of full-text, structured and visual information. The software delivers a combination of accuracy, extensibility,…
ERIC Educational Resources Information Center
Chan, Jason C. K.
2009-01-01
Retrieval practice can enhance long-term retention of the tested material (the testing effect), but it can also impair later recall of the nontested material--a phenomenon known as retrieval-induced forgetting (Anderson, M. C., Bjork, R. A., & Bjork, E. L. (1994). "Remembering can cause forgetting: retrieval dynamics in long-term memory." "Journal…
Lossef, S V; Schwartz, L H
1990-09-01
A computerized reference system for radiology journal articles was developed by using an IBM-compatible personal computer with a hand-held optical scanner and optical character recognition software. This allows direct entry of scanned text from printed material into word processing or data-base files. Additionally, line diagrams and photographs of radiographs can be incorporated into these files. A text search and retrieval software program enables rapid searching for keywords in scanned documents. The hand scanner and software programs are commercially available, relatively inexpensive, and easily used. This permits construction of a personalized radiology literature file of readily accessible text and images requiring minimal typing or keystroke entry.
Yu, Hong; Agarwal, Shashank; Johnston, Mark; Cohen, Aaron
2009-01-06
Biomedical scientists need to access figures to validate research facts and to formulate or to test novel research hypotheses. However, figures are difficult to comprehend without associated text (e.g., figure legend and other reference text). We are developing automated systems to extract the relevant explanatory information along with figures extracted from full text articles. Such systems could be very useful in improving figure retrieval and in reducing the workload of biomedical scientists, who otherwise have to retrieve and read the entire full-text journal article to determine which figures are relevant to their research. As a crucial step, we studied the importance of associated text in biomedical figure comprehension. Twenty subjects evaluated three figure-text combinations: figure+legend, figure+legend+title+abstract, and figure+full-text. Using a Likert scale, each subject scored each figure+text according to the extent to which the subject thought he/she understood the meaning of the figure and the confidence in providing the assigned score. Additionally, each subject entered a free text summary for each figure-text. We identified missing information using indicator words present within the text summaries. Both the Likert scores and the missing information were statistically analyzed for differences among the figure-text types. We also evaluated the quality of text summaries with the text-summarization evaluation method the ROUGE score. Our results showed statistically significant differences in figure comprehension when varying levels of text were provided. When the full-text article is not available, presenting just the figure+legend left biomedical researchers lacking 39-68% of the information about a figure as compared to having complete figure comprehension; adding the title and abstract improved the situation, but still left biomedical researchers missing 30% of the information. 
When the full-text article is available, figure comprehension increased to 86-97%; this indicates that researchers felt that only 3-14% of the necessary information for full figure comprehension was missing when full text was available to them. Clearly there is information in the abstract and in the full text that biomedical scientists deem important for understanding the figures that appear in full-text biomedical articles. We conclude that the texts that appear in full-text biomedical articles are useful for understanding the meaning of a figure, and an effective figure-mining system needs to unlock the information beyond figure legend. Our work provides important guidance to the figure mining systems that extract information only from figure and figure legend. PMID:19126221
Experiments on Linguistically-Based Term Associations.
ERIC Educational Resources Information Center
Ruge, Gerda
1992-01-01
Describes the hyperterm system REALIST (Retrieval Aids by Linguistics and Statistics) with emphasis on its semantic component, which generates term relations from free-text input. Experiments with various similarity measures are discussed, and the quality of the associated terms is evaluated using term recall and term precision measures. (22…
A Semi-Automatic Approach to Construct Vietnamese Ontology from Online Text
ERIC Educational Resources Information Center
Nguyen, Bao-An; Yang, Don-Lin
2012-01-01
An ontology is an effective formal representation of knowledge used commonly in artificial intelligence, semantic web, software engineering, and information retrieval. In open and distance learning, ontologies are used as knowledge bases for e-learning supplements, educational recommenders, and question answering systems that support students with…
Tutorial on Generalized Programming Languages and Systems. Instructor Edition.
ERIC Educational Resources Information Center
Fasana, Paul J., Ed.; Shank, Russell, Ed.
This instructor's manual is a comparative analysis and review of the various computer programing languages currently available and their capabilities for performing text manipulation, information storage, and data retrieval tasks. Based on materials presented at the 1967 Convention of the American Society for Information Science, the manual…
Low-Speed Fingerprint Image Capture System User's Guide, June 1, 1993
DOE Office of Scientific and Technical Information (OSTI.GOV)
Whitus, B.R.; Goddard, J.S.; Jatko, W.B.
1993-06-01
The Low-Speed Fingerprint Image Capture System (LS-FICS) uses a Sun workstation controlling a Lenzar ElectroOptics Opacity 1000 imaging system to digitize fingerprint card images to support the Federal Bureau of Investigation's (FBI's) Automated Fingerprint Identification System (AFIS) program. The system also supports the operations performed by the Oak Ridge National Laboratory- (ORNL-) developed Image Transmission Network (ITN) prototype card scanning system. The input to the system is a single FBI fingerprint card of the agreed-upon standard format and a user-specified identification number. The output is a file formatted to be compatible with the National Institute of Standards and Technology (NIST) draft standard for fingerprint data exchange dated June 10, 1992. These NIST compatible files contain the required print and text images. The LS-FICS is designed to provide the FBI with the capability of scanning fingerprint cards into a digital format. The FBI will replicate the system to generate a data base of test images. The Host Workstation contains the image data paths and the compression algorithm. A local area network interface, disk storage, and tape drive are used for the image storage and retrieval, and the Lenzar Opacity 1000 scanner is used to acquire the image. The scanner is capable of resolving 500 pixels/in. in both x and y directions. The print images are maintained in full 8-bit gray scale and compressed with an FBI-approved wavelet-based compression algorithm. The text fields are downsampled to 250 pixels/in. and 2-bit gray scale. The text images are then compressed using a lossless Huffman coding scheme. The text fields retrieved from the output files are easily interpreted when displayed on the screen. Detailed procedures are provided for system calibration and operation. Software tools are provided to verify proper system operation.
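The text-field pipeline described above (reduction to 2-bit gray scale followed by lossless Huffman coding) can be illustrated with a minimal Python sketch. This is not the LS-FICS implementation: the function names, the bit-shift quantizer, and the sample pixel values are all assumptions made for illustration, and the spatial downsampling to 250 pixels/in. is not modeled.

```python
import heapq
from collections import Counter


def quantize_2bit(pixels):
    """Map 8-bit gray values (0..255) to four levels (0..3). Assumed quantizer."""
    return [p >> 6 for p in pixels]


def huffman_code(symbols):
    """Build a prefix code (symbol -> bit string) from symbol frequencies."""
    freq = Counter(symbols)
    if len(freq) == 1:
        # A single distinct symbol still needs a one-bit code word.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tiebreak, {symbol: code so far}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        # Merge the two lightest subtrees, prefixing their codes with 0 and 1.
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]


def encode(symbols, code):
    return "".join(code[s] for s in symbols)


def decode(bits, code):
    """Invert a prefix code by scanning bits until a code word matches."""
    inverse = {c: s for s, c in code.items()}
    out, current = [], ""
    for b in bits:
        current += b
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return out


# Hypothetical scan line: mostly white background with a few dark pixels.
pixels = [255] * 12 + [0] * 2 + [128] + [64]
levels = quantize_2bit(pixels)
code = huffman_code(levels)
bits = encode(levels, code)
```

Because scanned text is dominated by background pixels, the most frequent level receives the shortest code word, so the encoded stream is shorter than the fixed 2 bits per pixel while decoding recovers the levels exactly.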
ERIC Educational Resources Information Center
Chen, Hsinchun
2003-01-01
Discusses information retrieval techniques used on the World Wide Web. Topics include machine learning in information extraction; relevance feedback; information filtering and recommendation; text classification and text clustering; Web mining, based on data mining techniques; hyperlink structure; and Web size. (LRW)
n-Gram-Based Indexing for Korean Text Retrieval.
ERIC Educational Resources Information Center
Lee, Joon Ho; Cho, Hyun Yang; Park, Hyouk Ro
1999-01-01
Discusses indexing methods in Korean text retrieval and proposes a new indexing method based on n-grams which can handle compound nouns effectively without dictionaries and complex linguistic knowledge. Experimental results show that n-gram-based indexing is considerably faster than morpheme-based indexing, and also provides better retrieval…
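The idea of indexing by n-grams rather than by dictionary-derived morphemes can be sketched in a few lines of Python. This is only an illustration, not the method evaluated in the paper: the bigram size, the overlap-count ranking, and the two sample documents are assumptions.

```python
from collections import defaultdict


def char_ngrams(text, n=2):
    """Return overlapping character n-grams of each whitespace-delimited token."""
    grams = []
    for token in text.split():
        if len(token) < n:
            grams.append(token)  # keep very short tokens whole
        else:
            grams.extend(token[i:i + n] for i in range(len(token) - n + 1))
    return grams


def build_index(docs, n=2):
    """Build an inverted index: n-gram -> set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for gram in char_ngrams(text, n):
            index[gram].add(doc_id)
    return index


def search(index, query, n=2):
    """Rank documents by how many distinct query n-grams they contain."""
    scores = defaultdict(int)
    for gram in set(char_ngrams(query, n)):
        for doc_id in index.get(gram, ()):
            scores[doc_id] += 1
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical documents: 1 is an unspaced compound noun
# ("information retrieval system"), 2 means "document management".
docs = {1: "정보검색시스템", 2: "문서관리"}
index = build_index(docs)
hits = search(index, "검색")  # "retrieval", embedded inside the compound
```

The query matches document 1 even though the compound noun is written without spaces, which is the property that lets n-gram indexing avoid dictionaries and morphological analysis.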
Techniques of Document Management: A Review of Text Retrieval and Related Technologies.
ERIC Educational Resources Information Center
Veal, D. C.
2001-01-01
Reviews present and possible future developments in the techniques of electronic document management, the major ones being text retrieval and scanning and OCR (optical character recognition). Also addresses document acquisition, indexing and thesauri, publishing and dissemination standards, impact of the Internet, and the document management…
Understanding Student Article Retrieval Behaviors: Instructional Implications
ERIC Educational Resources Information Center
Cook-Cottone, Catherine P.; Dutt-Doner, Karen; Schoen, David
2005-01-01
This study evaluates the use of full-text databases amongst 425 undergraduate and graduate students in western New York. A review of literature implicated convenience, time issues, article retrieval option knowledge, and the appreciation and understanding of research article quality as potential predictors of full-text reliance. These variables…
Subject Retrieval from Full-Text Databases in the Humanities
ERIC Educational Resources Information Center
East, John W.
2007-01-01
This paper examines the problems involved in subject retrieval from full-text databases of secondary materials in the humanities. Ten such databases were studied and their search functionality evaluated, focusing on factors such as Boolean operators, document surrogates, limiting by subject area, proximity operators, phrase searching, wildcards,…
A new method for text detection and recognition in indoor scene for assisting blind people
NASA Astrophysics Data System (ADS)
Jabnoun, Hanen; Benzarti, Faouzi; Amiri, Hamid
2017-03-01
Developing assistive systems for handicapped persons has become a challenging task in research projects. Recently, a variety of tools have been designed to help visually impaired or blind people by acting as visual substitution systems. The majority of these tools are based on the conversion of input information into auditory or tactile sensory information. Furthermore, object recognition and text retrieval are exploited in visual substitution systems. Text detection and recognition provide a description of the surrounding environment, so that a blind person can readily recognize the scene. In this work, we introduce a method for detecting and recognizing text in indoor scenes. The process consists of detecting the regions of interest that should contain text using connected components. Text detection is then performed using image correlation. This component of an assistive system for blind people should be simple, so that users can obtain the most informative feedback in the shortest time.
Depeursinge, Adrien; Vargas, Alejandro; Gaillard, Frédéric; Platon, Alexandra; Geissbuhler, Antoine; Poletti, Pierre-Alexandre; Müller, Henning
2012-01-01
Clinical workflows and user interfaces of image-based computer-aided diagnosis (CAD) for interstitial lung diseases in high-resolution computed tomography are introduced and discussed. Three use cases are implemented to assist students, radiologists, and physicians in the diagnosis workup of interstitial lung diseases. In a first step, the proposed system shows a three-dimensional map of categorized lung tissue patterns with quantification of the diseases based on texture analysis of the lung parenchyma. Then, based on the proportions of abnormal and normal lung tissue as well as clinical data of the patients, retrieval of similar cases is enabled using a multimodal distance aggregating content-based image retrieval (CBIR) and text-based information search. The global system leads to a hybrid detection-CBIR-based CAD, where detection-based and CBIR-based CAD prove to be complementary both on the user's side and on the algorithmic side. The proposed approach is in accordance with the classical workflow of clinicians searching for similar cases in textbooks and personal collections. The developed system enables objective and customizable inter-case similarity assessment, and the performance measures obtained with a leave-one-patient-out cross-validation (LOPO CV) are representative of a clinical usage of the system.
K.M. Reynolds; H.M. Rauscher; C.V. Worth
1995-01-01
The hypermedia system, ForestEM, was developed in HyperWriter for use in Microsoft Windows. ForestEM version 1.0 includes text and figures from the FEMAT report and the Record of Decision and Standards and Guidelines. Hypermedia introduces two fundamental changes to knowledge management. The first is the capability to interactively store and retrieve large amounts of...
ERIC Educational Resources Information Center
Library of Congress, Washington, DC.
The Program Session of the April 1984 meeting of the Library of Congress Network Advisory Committee (NAC) was devoted to discussion of electronic information delivery systems. Recent developments in six areas were covered: (1) electronic manuscript generation and transmission; (2) online full-text searching and retrieval; (3) online database…
[The computer assisted pacemaker clinic at the regional hospital of Udine (author's transl)].
Feruglio, G A; Lestuzzi, L; Carminati, D
1978-01-01
For a close follow-up of large groups of pacemaker patients and for evaluation of long term pacing on a reliable statistical basis, many pacemaker centers in the world are now using computer systems. A patient data system with structured display records, designed to give complete, comprehensive and surveyable information that is immediately retrievable 24 hours a day, on display or as printed sets, seems to offer an ideal solution. The pacemaker clinic at the Regional Hospital of Udine has adopted this type of system. The clinic is linked to a live, on-line patient data system (G/3, Informatica Friuli-Venezia Giulia). The input and retrieval of information are made through a conventional keyboard. The input formats have fixed headings with coded alternatives and a limited space for comments in free text. The computer edits the coded information into surveyable reviews. Searches can be made on coded information and data of interest.
ERIC Educational Resources Information Center
Cornell Univ., Ithaca, NY. Dept. of Computer Science.
Four papers are included in Part One of the eighteenth report on Salton's Magical Automatic Retriever of Texts (SMART) project. The first paper: "Content Analysis in Information Retrieval" by S. F. Weiss presents the results of experiments aimed at determining the conditions under which content analysis improves retrieval results as well…
Techniques for Soundscape Retrieval and Synthesis
NASA Astrophysics Data System (ADS)
Mechtley, Brandon Michael
The study of acoustic ecology is concerned with the manner in which life interacts with its environment as mediated through sound. As such, a central focus is that of the soundscape: the acoustic environment as perceived by a listener. This dissertation examines the application of several computational tools in the realms of digital signal processing, multimedia information retrieval, and computer music synthesis to the analysis of the soundscape. Namely, these tools include a) an open source software library, Sirens, which can be used to segment long environmental field recordings into individual sonic events and to compare these events in terms of acoustic content, b) a graph-based retrieval system that can use these measures of acoustic similarity and measures of semantic similarity using the lexical database WordNet to perform both text-based retrieval and automatic annotation of environmental sounds, and c) new techniques for the dynamic, realtime parametric morphing of multiple field recordings, informed by the geographic paths along which they were recorded.
Diversification of visual media retrieval results using saliency detection
NASA Astrophysics Data System (ADS)
Muratov, Oleg; Boato, Giulia; De Natale, Francesco G. B.
2013-03-01
Diversification of retrieval results allows for better and faster search. Recently, different methods have been proposed for diversifying image retrieval results, mainly utilizing text information and techniques imported from the natural language processing domain. However, images contain visual information that is impossible to describe in text, and the use of visual features is inevitable. Visual saliency is information about the main object of an image, implicitly included by humans while creating visual content. For this reason, it is natural to exploit this information for the task of diversifying the content. In this work we study whether visual saliency can be used for the task of diversification and propose a method for re-ranking image retrieval results using saliency. The evaluation has shown that the use of saliency information results in higher diversity of retrieval results.
ERIC Educational Resources Information Center
Garfield, Eugene
2001-01-01
Traces the development of information retrieval/services and suggests that the creation of large digital libraries seems inevitable. Examines possibilities for increasing electronic access and the role of artificial intelligence. Highlights include: searching full text; sending full texts; selective dissemination of information (SDI) profiling and…
Automatic Cataloguing and Searching for Retrospective Data by Use of OCR Text.
ERIC Educational Resources Information Center
Tseng, Yuen-Hsien
2001-01-01
Describes efforts in supporting information retrieval from OCR (optical character recognition) degraded text. Reports on approaches used in an automatic cataloging and searching contest for books in multiple languages, including a vector space retrieval model, an n-gram indexing method, and a weighting scheme; and discusses problems of Asian…
2013-01-01
Background Open metadata registries are a fundamental tool for researchers in the Life Sciences trying to locate resources. While most current registries assume that resources are annotated with well-structured metadata, evidence shows that most of the resource annotations simply consist of informal free text. This reality must be taken into account in order to develop effective techniques for resource discovery in Life Sciences. Results BioUSeR is a semantic-based tool aimed at retrieving Life Sciences resources described in free text. The retrieval process is driven by the user requirements, which consist of a target task and a set of facets of interest, both expressed in free text. BioUSeR is able to effectively exploit the available textual descriptions to find relevant resources by using semantic-aware techniques. Conclusions BioUSeR overcomes the limitations of the current registries thanks to: (i) rich specification of user information needs, (ii) use of semantics to manage textual descriptions, (iii) retrieval and ranking of resources based on user requirements. PMID:23635042
Shatkay, Hagit; Pan, Fengxia; Rzhetsky, Andrey; Wilbur, W. John
2008-01-01
Motivation: Much current research in biomedical text mining is concerned with serving biologists by extracting certain information from scientific text. We note that there is no ‘average biologist’ client; different users have distinct needs. For instance, as noted in past evaluation efforts (BioCreative, TREC, KDD) database curators are often interested in sentences showing experimental evidence and methods. Conversely, lab scientists searching for known information about a protein may seek facts, typically stated with high confidence. Text-mining systems can target specific end-users and become more effective, if the system can first identify text regions rich in the type of scientific content that is of interest to the user, retrieve documents that have many such regions, and focus on fact extraction from these regions. Here, we study the ability to characterize and classify such text automatically. We have recently introduced a multi-dimensional categorization and annotation scheme, developed to be applicable to a wide variety of biomedical documents and scientific statements, while intended to support specific biomedical retrieval and extraction tasks. Results: The annotation scheme was applied to a large corpus in a controlled effort by eight independent annotators, where three individual annotators independently tagged each sentence. We then trained and tested machine learning classifiers to automatically categorize sentence fragments based on the annotation. We discuss here the issues involved in this task, and present an overview of the results. The latter strongly suggest that automatic annotation along most of the dimensions is highly feasible, and that this new framework for scientific sentence categorization is applicable in practice. Contact: shatkay@cs.queensu.ca PMID:18718948
Categorizing document by fuzzy C-Means and K-nearest neighbors approach
NASA Astrophysics Data System (ADS)
Priandini, Novita; Zaman, Badrus; Purwanti, Endah
2017-08-01
Advances in technology have made document categorization important, because the number of documents keeps growing. Managing documents by categorizing them is an application of information retrieval, because the process involves text mining. Categorization can be done with either the Fuzzy C-Means (FCM) or the K-Nearest Neighbors (KNN) method; this experiment combines both. The aim of the experiment is to improve the performance of document categorization. First, FCM is used to cluster the training documents. Second, KNN is used to categorize the testing documents until the categorization output is shown. In the experiment, 14 of the 20 testing documents were assigned to the correct category, while 6 were assigned to an incorrect one. The system evaluation shows that both precision and recall are 0.7.
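The reported figures are consistent with micro-averaged precision and recall for single-label categorization, where each misclassified document counts once as a false positive (for the predicted category) and once as a false negative (for the true category). The abstract does not state which averaging was used, so the Python sketch below is an assumption, and the category labels are hypothetical placeholders.

```python
def micro_precision_recall(true_labels, predicted_labels):
    """Micro-averaged precision and recall for single-label classification."""
    tp = sum(t == p for t, p in zip(true_labels, predicted_labels))
    fp = len(predicted_labels) - tp  # wrong predictions, counted per predicted class
    fn = len(true_labels) - tp       # the same documents, counted per true class
    return tp / (tp + fp), tp / (tp + fn)


# Hypothetical labels reproducing the reported counts: 14 correct, 6 incorrect.
true_labels = ["A"] * 20
predicted_labels = ["A"] * 14 + ["B"] * 6
precision, recall = micro_precision_recall(true_labels, predicted_labels)
```

Under this assumption both measures reduce to 14/20 = 0.7, matching the values in the abstract.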
Let Documents Talk to Each Other: A Computer Model for Connection of Short Documents.
ERIC Educational Resources Information Center
Chen, Z.
1993-01-01
Discusses the integration of scientific texts through the connection of documents and describes a computer model that can connect short documents. Information retrieval and artificial intelligence are discussed; a prototype system of the model is explained; and the model is compared to other computer models. (17 references) (LRW)
ERIC Educational Resources Information Center
Wall, C. Edward; And Others
1995-01-01
Discusses the integration of Standard Generalized Markup Language, Hypertext Markup Language, and MARC format to parse classified analytical bibliographies. Use of the resulting electronic knowledge constructs in local library systems as maps of a specified subset of resources is discussed, and an example is included. (LRW)
Handwritten-word spotting using biologically inspired features.
van der Zant, Tijn; Schomaker, Lambert; Haak, Koen
2008-11-01
For quick access to new handwritten collections, current handwriting recognition methods are too cumbersome. They cannot deal with the lack of labeled data and would require extensive laboratory training for each individual script, style, language and collection. We propose a biologically inspired whole-word recognition method which is used to incrementally elicit word labels in a live, web-based annotation system, named Monk. Since human labor should be minimized given the massive amount of image data, it becomes important to rely on robust perceptual mechanisms in the machine. Recent computational models of the neuro-physiology of vision are applied to isolated word classification. A primate cortex-like mechanism makes it possible to classify text-images that have a low frequency of occurrence. Typically these images are the most difficult to retrieve; they often contain named entities and are regarded as the most important to people. Standard pattern-recognition technology usually cannot deal with these text-images if there are not enough labeled instances. The results of this retrieval system are compared to normalized word-image matching and appear to be very promising.
NELS 2.0 - A general system for enterprise wide information management
NASA Technical Reports Server (NTRS)
Smith, Stephanie L.
1993-01-01
NELS, the NASA Electronic Library System, is an information management tool for creating distributed repositories of documents, drawings, and code for use and reuse by the aerospace community. The NELS retrieval engine can load metadata and source files of full text objects, perform natural language queries to retrieve ranked objects, and create links to connect user interfaces. For flexibility, the NELS architecture has layered interfaces between the application program and the stored library information. The session manager provides the interface functions for development of NELS applications. The data manager is an interface between the session manager and the structured data system. The center of the structured data system is the Wide Area Information Server. This system architecture provides access to information across heterogeneous platforms in a distributed environment. There are presently three user interfaces that connect to the NELS engine: an X-Windows interface, an ASCII interface, and the Spatial Data Management System. This paper describes the design and operation of NELS as an information management tool and repository.
Retrieval practice with short-answer, multiple-choice, and hybrid tests.
Smith, Megan A; Karpicke, Jeffrey D
2014-01-01
Retrieval practice improves meaningful learning, and the most frequent way of implementing retrieval practice in classrooms is to have students answer questions. In four experiments (N=372) we investigated the effects of different question formats on learning. Students read educational texts and practised retrieval by answering short-answer, multiple-choice, or hybrid questions. In hybrid conditions students first attempted to recall answers in short-answer format, then identified answers in multiple-choice format. We measured learning 1 week later using a final assessment with two types of questions: those that could be answered by recalling information verbatim from the texts and those that required inferences. Practising retrieval in all format conditions enhanced retention, relative to a study-only control condition, on both verbatim and inference questions. However, there were little or no advantages of answering short-answer or hybrid format questions over multiple-choice questions in three experiments. In Experiment 4, when retrieval success was improved under initial short-answer conditions, there was an advantage of answering short-answer or hybrid questions over multiple-choice questions. The results challenge the simple conclusion that short-answer questions always produce the best learning, due to increased retrieval effort or difficulty, and demonstrate the importance of retrieval success for retrieval-based learning activities.
TRMM .25 deg x .25 deg Gridded Precipitation Text Product
NASA Technical Reports Server (NTRS)
Stocker, Erich; Kelley, Owen
2009-01-01
Since the launch of the Tropical Rainfall Measuring Mission (TRMM), the Precipitation Measurement Missions science team has endeavored to provide TRMM precipitation retrievals in a variety of formats that are more easily usable by the broad science community than the standard Hierarchical Data Format (HDF) in which TRMM data is produced and archived. At the request of users, the Precipitation Processing System (PPS) has developed a .25 x .25 gridded product in an easily used ASCII text format. The entire TRMM mission data has been made available in this format. The paper provides the details of this new precipitation product, which carries the TRMM designator 3G68.25. The format is packaged into daily files. It provides hourly precipitation information from the TRMM microwave imager (TMI), precipitation radar (PR), and TMI/PR combined rain retrievals. A major advantage of this approach is the inclusion only of rain data, compression when a particular grid has no rain from the PR or combined, and its direct ASCII text format. For those interested only in rain retrievals and whether rain is convective or stratiform, these products provide a huge reduction in the data volume inherent in the standard TRMM products. This paper provides examples of the 3G68 data products and their uses. It also provides information about C tools that can be used to aggregate daily files into larger time samples. In addition, it describes the possibilities inherent in the spatial sampling, which allows resampling to a coarser spatial grid. The paper concludes with information about downloading the gridded text data products.
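Resampling a 0.25 degree grid to a coarser one can be sketched as block averaging. The Python example below is a simplified illustration under stated assumptions: it uses a dense in-memory grid rather than the 3G68 file layout (which is not described here), the function name and the factor of 4 (0.25 degree to 1 degree) are hypothetical, and it takes a plain mean without area or sample-count weighting.

```python
def coarsen(grid, factor=4):
    """Average non-overlapping factor x factor blocks of a 2-D grid.

    `grid` is a list of equal-length rows of rain values; with 0.25 degree
    cells and factor=4, each output cell covers 1 degree x 1 degree.
    """
    rows, cols = len(grid), len(grid[0])
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be multiples of factor")
    out = []
    for r in range(0, rows, factor):
        out_row = []
        for c in range(0, cols, factor):
            block = [grid[i][j]
                     for i in range(r, r + factor)
                     for j in range(c, c + factor)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


# Hypothetical 1 degree x 2 degree patch: rain in the western half only.
fine = [[1.0] * 4 + [0.0] * 4 for _ in range(4)]
coarse = coarsen(fine)
```

A production version would likely weight each fine cell by its pixel counts, since cells with no rain may be omitted from the text product rather than stored as zeros.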
Web image retrieval using an effective topic and content-based technique
NASA Astrophysics Data System (ADS)
Lee, Ching-Cheng; Prabhakara, Rashmi
2005-03-01
There has been exponential growth in the amount of image data available on the World Wide Web since the early development of the Internet. With so much information and so many images available, an effective image retrieval system is greatly needed. In this paper, we present an effective approach with both image matching and indexing techniques that improve on existing integrated image retrieval methods. This technique follows a two-phase approach, integrating query by topic and query by example specification methods. In the first phase, topic-based image retrieval is performed using an improved text information retrieval (IR) technique that makes use of the structured format of HTML documents. This technique consists of a focused crawler that lets the user enter not only the keyword for the topic-based search but also the scope in which the user wants to find the images. In the second phase, we use query by example specification to perform a low-level content-based image match in order to retrieve a smaller set of results that are relatively closer to the example image. From this, information related to the image features is automatically extracted from the query image. The main objective of our approach is to develop a functional image search and indexing technique and to demonstrate that better retrieval results can be achieved.
ERIC Educational Resources Information Center
Voorhees, Ellen M., Ed.; Harman, Donna K., Ed.
This report constitutes the proceedings of the 2001 Text REtrieval Conference (TREC 2001). The conference was co-sponsored by the National Institute of Standards and Technology (NIST), the Defense Advanced Research Projects Agency (DARPA), and the Advanced Research and Development Agency (ARDA). Approximately 175 people attended the conference,…
ERIC Educational Resources Information Center
Voorhees, Ellen M., Ed.; Harman, Donna K., Ed.
This report constitutes the proceedings of the ninth Text REtrieval Conference (TREC-9). The conference was co-sponsored by the National Institute of Standards and Technology (NIST), the Defense Advanced Research Projects Agency (DARPA), and the Advanced Research and Development Agency (ARDA). Approximately 175 people attended the conference,…
Estimating Missing Features to Improve Multimedia Information Retrieval
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bagherjeiran, A; Love, N S; Kamath, C
Retrieval in a multimedia database usually involves combining information from different modalities of data, such as text and images. However, all modalities of the data may not be available to form the query. The retrieval results from such a partial query are often less than satisfactory. In this paper, we present an approach to complete a partial query by estimating the missing features in the query. Our experiments with a database of images and their associated captions show that, with an initial text-only query, our completion method has similar performance to a full query with both image and text features. In addition, when we use relevance feedback, our approach outperforms the results obtained using a full query.
Text Mining in Biomedical Domain with Emphasis on Document Clustering.
Renganathan, Vinaitheerthan
2017-07-01
With the exponential increase in the number of articles published every year in the biomedical domain, there is a need to build automated systems to extract unknown information from the articles published. Text mining techniques enable the extraction of unknown knowledge from unstructured documents. This paper reviews text mining processes in detail and the software tools available to carry out text mining. It also reviews the roles and applications of text mining in the biomedical domain. Text mining processes, such as search and retrieval of documents, pre-processing of documents, natural language processing, methods for text clustering, and methods for text classification are described in detail. Text mining techniques can facilitate the mining of vast amounts of knowledge on a given topic from published biomedical research articles and draw meaningful conclusions that are not possible otherwise.
Context-sensitive medical information retrieval.
Auerbuch, Mordechai; Karson, Tom H; Ben-Ami, Benjamin; Maimon, Oded; Rokach, Lior
2004-01-01
Substantial medical data such as pathology reports, operative reports, discharge summaries, and radiology reports are stored in textual form. Databases containing free-text medical narratives often need to be searched to find relevant information for clinical and research purposes. Terms that appear in these documents tend to appear in different contexts. The context of negation, a negative finding, is of special importance, since many of the most frequently described findings are those denied by the patient or subsequently "ruled out." Hence, when searching free-text narratives for patients with a certain medical condition, if negation is not taken into account, many of the retrieved documents will be irrelevant. The purpose of this work is to develop a methodology for automated learning of negative context patterns in medical narratives and test the effect of context identification on the performance of medical information retrieval. The algorithm presented significantly improves the performance of information retrieval done on medical narratives. The precision improves from about 60%, when using context-insensitive retrieval, to nearly 100%. The impact on recall is only minor. In addition, context-sensitive queries enable the user to search for terms in ways not otherwise available.
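The difference between context-insensitive and context-sensitive retrieval described above can be illustrated with a minimal sketch. It does not reproduce the authors' method: the abstract describes automated learning of negation patterns, whereas this sketch swaps in a small hand-written cue list (`NEGATION_CUES`, a hypothetical name, as are the sample reports and function names) and treats any sentence containing a cue as negated.

```python
import re

# Hypothetical, hand-written negation cues; the paper learns such patterns
# automatically, which this sketch does not attempt.
NEGATION_CUES = ("no", "not", "denies", "denied", "without", "negative for", "ruled out")
CUE_RE = re.compile(r"\b(?:" + "|".join(re.escape(c) for c in NEGATION_CUES) + r")\b")


def is_affirmed(text, term):
    """Return True if `term` occurs in at least one sentence with no negation cue."""
    term_re = re.compile(r"\b" + re.escape(term.lower()) + r"\b")
    for sentence in re.split(r"[.!?;]+", text.lower()):
        if term_re.search(sentence) and not CUE_RE.search(sentence):
            return True
    return False


def naive_search(docs, term):
    """Context-insensitive retrieval: any mention of the term counts."""
    return sorted(doc_id for doc_id, text in docs.items() if term.lower() in text.lower())


def context_sensitive_search(docs, term):
    """Context-sensitive retrieval: only non-negated mentions count."""
    return sorted(doc_id for doc_id, text in docs.items() if is_affirmed(text, term))


# Made-up example narratives, for illustration only.
DOCS = {
    "a": "Patient denies chest pain.",
    "b": "Patient reports chest pain since Monday.",
    "c": "Pneumonia was ruled out. No fever.",
}

if __name__ == "__main__":
    print(naive_search(DOCS, "chest pain"))
    print(context_sensitive_search(DOCS, "chest pain"))
```

In this toy collection the context-insensitive search returns both reports that mention chest pain, while the context-sensitive search keeps only the report in which the finding is affirmed, which is the kind of precision gain the abstract reports.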
On the use of the singular value decomposition for text retrieval
DOE Office of Scientific and Technical Information (OSTI.GOV)
Husbands, P.; Simon, H.D.; Ding, C.
2000-12-04
The use of the Singular Value Decomposition (SVD) has been proposed for text retrieval in several recent works. This technique uses the SVD to project very high dimensional document and query vectors into a low dimensional space. In this new space it is hoped that the underlying structure of the collection is revealed thus enhancing retrieval performance. Theoretical results have provided some evidence for this claim and to some extent experiments have confirmed this. However, these studies have mostly used small test collections and simplified document models. In this work we investigate the use of the SVD on large document collections. We show that, if interpreted as a mechanism for representing the terms of the collection, this technique alone is insufficient for dealing with the variability in term occurrence. Section 2 introduces the text retrieval concepts necessary for our work. A short description of our experimental architecture is presented in Section 3. Section 4 describes how term occurrence variability affects the SVD and then shows how the decomposition influences retrieval performance. A possible way of improving SVD-based techniques is presented in Section 5 and concluded in Section 6.
Suppressing the Morning Rise in Cortisol Impairs Free Recall
ERIC Educational Resources Information Center
Rimmele, Ulrike; Meier, Flurina; Lange, Tanja; Born, Jan
2010-01-01
Elevated glucocorticoid levels impair memory retrieval. We investigated whether retrieval under naturally elevated glucocorticoid levels, i.e., during the morning rise in cortisol can be improved by suppressing cortisol. In a crossover study 16 men retrieved emotional and neutral texts and pictures (learned 3 d earlier) 30 min after morning…
Semantic Annotation of Complex Text Structures in Problem Reports
NASA Technical Reports Server (NTRS)
Malin, Jane T.; Throop, David R.; Fleming, Land D.
2011-01-01
Text analysis is important for effective information retrieval from databases where the critical information is embedded in text fields. Aerospace safety depends on effective retrieval of relevant and related problem reports for the purpose of trend analysis. The complex text syntax in problem descriptions has limited statistical text mining of problem reports. The presentation describes an intelligent tagging approach that applies syntactic and then semantic analysis to overcome this problem. The tags identify types of problems and equipment that are embedded in the text descriptions. The power of these tags is illustrated in a faceted searching and browsing interface for problem report trending that combines automatically generated tags with database code fields and temporal information.
Implementation of the common phrase index method on the phrase query for information retrieval
NASA Astrophysics Data System (ADS)
Fatmawati, Triyah; Zaman, Badrus; Werdiningsih, Indah
2017-08-01
With the development of technology, finding information in news text has become easy, because news text is distributed not only in print media, such as newspapers, but also in electronic media that can be accessed using a search engine. When searching for relevant documents with a search engine, a phrase is often used as the query. The number of words that make up the phrase query and their positions affect the relevance of the documents returned and, as a result, the accuracy of the information obtained. Based on this problem, the purpose of this research was to analyze the implementation of the common phrase index method in information retrieval. The research was conducted on English news text and implemented in a prototype to determine the relevance level of the documents produced. The system is built in stages: pre-processing, indexing, term weighting calculation, and cosine similarity calculation. The system then displays the document search results in sequence, ordered by cosine similarity. System testing was conducted using 100 documents and 20 queries, and the results were used for the evaluation stage. First, the relevant documents were determined using the kappa statistic. Second, the system success rate was determined using precision, recall, and F-measure. The kappa statistic was 0.71, so the relevance judgments are eligible for the system evaluation. The evaluation yielded a precision of 0.37, a recall of 0.50, and an F-measure of 0.43. From this result it can be said that the success rate of the system in producing relevant documents is low.
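Two of the calculations named in this abstract, ranking by cosine similarity and the F-measure, can be sketched briefly. The sketch makes several assumptions: raw term counts stand in for the paper's term weighting (which the abstract does not specify), the common phrase index itself is not modeled, and the three sample documents and all function names are made up for illustration. The F-measure is taken to be the usual harmonic mean of precision and recall, which is consistent with the reported figures.

```python
import math
from collections import Counter


def term_vector(text):
    """Raw term-frequency vector; a stand-in for the paper's term weighting."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-weight vectors."""
    dot = sum(weight * b.get(term, 0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query, docs):
    """Return document ids ordered by decreasing cosine similarity to the query."""
    q = term_vector(query)
    scored = [(cosine_similarity(q, term_vector(text)), doc_id) for doc_id, text in docs.items()]
    return [doc_id for score, doc_id in sorted(scored, key=lambda s: (-s[0], s[1]))]


def f_measure(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up documents, for illustration only.
DOCS = {
    "d1": "stock market falls",
    "d2": "market rally lifts stock prices",
    "d3": "football season opens",
}

if __name__ == "__main__":
    print(rank("stock market", DOCS))
    # Precision 0.37 and recall 0.50 are the values reported in the abstract.
    print(round(f_measure(0.37, 0.50), 2))  # 0.43
```

Plugging the reported precision (0.37) and recall (0.50) into the harmonic mean gives 0.37 / 0.87 ≈ 0.43, which matches the F-measure stated in the abstract.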
Comparing the performance of two CBIRS indexing schemes
NASA Astrophysics Data System (ADS)
Mueller, Wolfgang; Robbert, Guenter; Henrich, Andreas
2003-01-01
Content based image retrieval (CBIR) as it is known today has to deal with a number of challenges. Quickly summarized, the main challenges are firstly, to bridge the semantic gap between high-level concepts and low-level features using feedback, secondly to provide performance under adverse conditions. High-dimensional spaces, as well as a demanding machine learning task make the right way of indexing an important issue. When indexing multimedia data, most groups opt for extraction of high-dimensional feature vectors from the data, followed by dimensionality reduction like PCA (Principal Components Analysis) or LSI (Latent Semantic Indexing). The resulting vectors are indexed using spatial indexing structures such as kd-trees or R-trees, for example. Other projects, such as MARS and Viper propose the adaptation of text indexing techniques, notably the inverted file. Here, the Viper system is the most direct adaptation of text retrieval techniques to quantized vectors. However, while the Viper query engine provides decent performance together with impressive user-feedback behavior, as well as the possibility for easy integration of long-term learning algorithms, and support for potentially infinite feature vectors, there has been no comparison of vector-based methods and inverted-file-based methods under similar conditions. In this publication, we compare a CBIR query engine that uses inverted files (Bothrops, a rewrite of the Viper query engine based on a relational database), and a CBIR query engine based on LSD (Local Split Decision) trees for spatial indexing using the same feature sets. The Benchathlon initiative works on providing a set of images and ground truth for simulating image queries by example and corresponding user feedback. 
When performing the Benchathlon benchmark on a CBIR system (the System Under Test, SUT), a benchmarking harness connects over the Internet to the SUT, performing a number of queries using an agreed-upon protocol, the multimedia retrieval markup language (MRML). Using this benchmark one can measure the quality of retrieval, as well as the overall (speed) performance of the benchmarked system. Our benchmarks will draw on the Benchathlon's work for documenting the retrieval performance of both inverted-file-based and LSD-tree-based techniques. However, in addition to these results, we will present statistics that can be obtained only inside the system under test. These statistics will include the number of complex mathematical operations, as well as the amount of data that has to be read from disk during operation of a query.
Inferring Higher Functional Information for RIKEN Mouse Full-Length cDNA Clones With FACTS
Nagashima, Takeshi; Silva, Diego G.; Petrovsky, Nikolai; Socha, Luis A.; Suzuki, Harukazu; Saito, Rintaro; Kasukawa, Takeya; Kurochkin, Igor V.; Konagaya, Akihiko; Schönbach, Christian
2003-01-01
FACTS (Functional Association/Annotation of cDNA Clones from Text/Sequence Sources) is a semiautomated knowledge discovery and annotation system that integrates molecular function information derived from sequence analysis results (sequence inferred) with functional information extracted from text. Text-inferred information was extracted from keyword-based retrievals of MEDLINE abstracts and by matching of gene or protein names to OMIM, BIND, and DIP database entries. Using FACTS, we found that 47.5% of the 60,770 RIKEN mouse cDNA FANTOM2 clone annotations were informative for text searches. MEDLINE queries yielded molecular interaction-containing sentences for 23.1% of the clones. When disease MeSH and GO terms were matched with retrieved abstracts, 22.7% of clones were associated with potential diseases, and 32.5% with GO identifiers. A significant number (23.5%) of disease MeSH-associated clones were also found to have a hereditary disease association (OMIM Morbidmap). Inferred neoplastic and nervous system disease represented 49.6% and 36.0% of disease MeSH-associated clones, respectively. A comparison of sequence-based GO assignments with informative text-based GO assignments revealed that for 78.2% of clones, identical GO assignments were provided for that clone by either method, whereas for 21.8% of clones, the assignments differed. In contrast, for OMIM assignments, only 28.5% of clones had identical sequence-based and text-based OMIM assignments. Sequence, sentence, and term-based functional associations are included in the FACTS database (http://facts.gsc.riken.go.jp/), which permits results to be annotated and explored through web-accessible keyword and sequence search interfaces. The FACTS database will be a critical tool for investigating the functional complexity of the mouse transcriptome, cDNA-inferred interactome (molecular interactions), and pathome (pathologies). PMID:12819151
Coordinating Council. Tenth Meeting: Information retrieval: The role of controlled vocabularies
NASA Technical Reports Server (NTRS)
1993-01-01
The theme of this NASA Scientific and Technical Information Program Coordinating Council meeting was the role of controlled vocabularies (thesauri) in information retrieval. Included are summaries of the presentations and the accompanying visuals. Dr. Raya Fidel addressed 'Retrieval: Free Text, Full Text, and Controlled Vocabularies.' Dr. Bella Hass Weinberg spoke on 'Controlled Vocabularies and Thesaurus Standards.' The presentations were followed by a panel discussion with participation from NASA, the National Library of Medicine, the Defense Technical Information Center, and the Department of Energy; this discussion, however, is not summarized in any detail in this document.
Büssow, Konrad; Hoffmann, Steve; Sievert, Volker
2002-12-19
Functional genomics involves the parallel experimentation with large sets of proteins. This requires management of large sets of open reading frames as a prerequisite of the cloning and recombinant expression of these proteins. A Java program was developed for retrieval of protein and nucleic acid sequences and annotations from NCBI GenBank, using the XML sequence format. Annotations retrieved by ORFer include sequence name, organism and also the completeness of the sequence. The program has a graphical user interface, although it can be used in a non-interactive mode. For protein sequences, the program also extracts the open reading frame sequence, if available, and checks its correct translation. ORFer accepts user input in the form of single or lists of GenBank GI identifiers or accession numbers. It can be used to extract complete sets of open reading frames and protein sequences from any kind of GenBank sequence entry, including complete genomes or chromosomes. Sequences are either stored with their features in a relational database or can be exported as text files in Fasta or tabulator delimited format. The ORFer program is freely available at http://www.proteinstrukturfabrik.de/orfer. The ORFer program allows for fast retrieval of DNA sequences, protein sequences and their open reading frames and sequence annotations from GenBank. Furthermore, storage of sequences and features in a relational database is supported. Such a database can supplement a laboratory information system (LIMS) with appropriate sequence information.
Müller, H-M; Van Auken, K M; Li, Y; Sternberg, P W
2018-03-09
The biomedical literature continues to grow at a rapid pace, making the challenge of knowledge retrieval and extraction ever greater. Tools that provide a means to search and mine the full text of literature thus represent an important way by which the efficiency of these processes can be improved. We describe the next generation of the Textpresso information retrieval system, Textpresso Central (TPC). TPC builds on the strengths of the original system by expanding the full text corpus to include the PubMed Central Open Access Subset (PMC OA), as well as the WormBase C. elegans bibliography. In addition, TPC allows users to create a customized corpus by uploading and processing documents of their choosing. TPC is UIMA compliant, to facilitate compatibility with external processing modules, and takes advantage of Lucene indexing and search technology for efficient handling of millions of full text documents. Like Textpresso, TPC searches can be performed using keywords and/or categories (semantically related groups of terms), but to provide better context for interpreting and validating queries, search results may now be viewed as highlighted passages in the context of full text. To facilitate biocuration efforts, TPC also allows users to select text spans from the full text and annotate them, create customized curation forms for any data type, and send resulting annotations to external curation databases. As an example of such a curation form, we describe integration of TPC with the Noctua curation tool developed by the Gene Ontology (GO) Consortium. Textpresso Central is an online literature search and curation platform that enables biocurators and biomedical researchers to search and mine the full text of literature by integrating keyword and category searches with viewing search results in the context of the full text. 
It also allows users to create customized curation interfaces, use those interfaces to make annotations linked to supporting evidence statements, and then send those annotations to any database in the world. Textpresso Central URL: http://www.textpresso.org/tpc.
BELTracker: evidence sentence retrieval for BEL statements
Rastegar-Mojarad, Majid; Komandur Elayavilli, Ravikumar; Liu, Hongfang
2016-01-01
Biological expression language (BEL) is one of the main formal representation models of biological networks. The primary source of information for curating biological networks in BEL representation has been literature. It remains a challenge to identify relevant articles and the corresponding evidence statements for curating and validating BEL statements. In this paper, we describe BELTracker, a tool used to retrieve and rank evidence sentences from PubMed abstracts and full-text articles for a given BEL statement (per the 2015 task requirements of BioCreative V BEL Task). The system is comprised of three main components, (i) translation of a given BEL statement to an information retrieval (IR) query, (ii) retrieval of relevant PubMed citations and (iii) finding and ranking the evidence sentences in those citations. BELTracker uses a combination of multiple approaches based on traditional IR, machine learning, and heuristics to accomplish the task. The system identified and ranked at least one fully relevant evidence sentence in the top 10 retrieved sentences for 72 out of 97 BEL statements in the test set. BELTracker achieved a precision of 0.392, 0.532 and 0.615 when evaluated with three criteria, namely full, relaxed and context criteria, respectively, by the task organizers. Our team at Mayo Clinic was the only participant in this task. BELTracker is available as a RESTful API and is available for public use. Database URL: http://www.openbionlp.org:8080/BelTracker/finder/Given_BEL_Statement PMID:27173525
Figure mining for biomedical research.
Rodriguez-Esteban, Raul; Iossifov, Ivan
2009-08-15
Figures from biomedical articles contain valuable information difficult to reach without specialized tools. Currently, there is no search engine that can retrieve specific figure types. This study describes a retrieval method that takes advantage of principles in image understanding, text mining and optical character recognition (OCR) to retrieve figure types defined conceptually. A search engine was developed to retrieve tables and figure types to aid computational and experimental research. http://iossifovlab.cshl.edu/figurome/.
Rimmele, Ulrike; Besedovsky, Luciana; Lange, Tanja; Born, Jan
2013-01-01
Memory retrieval is impaired at very low as well as very high cortisol levels, but not at intermediate levels. This inverted-U-shaped relationship between cortisol levels and memory retrieval may originate from different roles of the mineralocorticoid (MR) and glucocorticoid receptor (GR) that bind cortisol with distinctly different affinity. Here, we examined the role of MRs and GRs in human memory retrieval using specific receptor antagonists. In two double-blind within-subject, cross-over designed studies, young healthy men were asked to retrieve emotional and neutral texts and pictures (learnt 3 days earlier) between 0745 and 0915 hours in the morning, either after administration of 400 mg of the MR blocker spironolactone vs placebo (200 mg at 2300 hours and 200 mg at 0400 hours, Study I) or after administration of the GR blocker mifepristone vs placebo (200 mg at 2300 hours, Study II). Blockade of MRs impaired free recall of both texts and pictures particularly for emotional material. In contrast, blockade of GRs resulted in better memory retrieval for pictures, with the effect being more pronounced for neutral than emotional materials. These findings indicate indeed opposing roles of MRs and GRs in memory retrieval, with optimal retrieval at intermediate cortisol levels likely mediated by high MR but concurrently low GR activation. PMID:23303058
Rimmele, Ulrike; Besedovsky, Luciana; Lange, Tanja; Born, Jan
2013-04-01
Memory retrieval is impaired at very low as well as very high cortisol levels, but not at intermediate levels. This inverted-U-shaped relationship between cortisol levels and memory retrieval may originate from different roles of the mineralocorticoid (MR) and glucocorticoid receptor (GR) that bind cortisol with distinctly different affinity. Here, we examined the role of MRs and GRs in human memory retrieval using specific receptor antagonists. In two double-blind within-subject, cross-over designed studies, young healthy men were asked to retrieve emotional and neutral texts and pictures (learnt 3 days earlier) between 0745 and 0915 hours in the morning, either after administration of 400 mg of the MR blocker spironolactone vs placebo (200 mg at 2300 hours and 200 mg at 0400 hours, Study I) or after administration of the GR blocker mifepristone vs placebo (200 mg at 2300 hours, Study II). Blockade of MRs impaired free recall of both texts and pictures particularly for emotional material. In contrast, blockade of GRs resulted in better memory retrieval for pictures, with the effect being more pronounced for neutral than emotional materials. These findings indicate indeed opposing roles of MRs and GRs in memory retrieval, with optimal retrieval at intermediate cortisol levels likely mediated by high MR but concurrently low GR activation.
Crowley, Rebecca S; Castine, Melissa; Mitchell, Kevin; Chavan, Girish; McSherry, Tara; Feldman, Michael
2010-01-01
The authors report on the development of the Cancer Tissue Information Extraction System (caTIES)--an application that supports collaborative tissue banking and text mining by leveraging existing natural language processing methods and algorithms, grid communication and security frameworks, and query visualization methods. The system fills an important need for text-derived clinical data in translational research such as tissue-banking and clinical trials. The design of caTIES addresses three critical issues for informatics support of translational research: (1) federation of research data sources derived from clinical systems; (2) expressive graphical interfaces for concept-based text mining; and (3) regulatory and security model for supporting multi-center collaborative research. Implementation of the system at several Cancer Centers across the country is creating a potential network of caTIES repositories that could provide millions of de-identified clinical reports to users. The system provides an end-to-end application of medical natural language processing to support multi-institutional translational research programs.
NASA Astrophysics Data System (ADS)
Li, Jia; Tian, Yonghong; Gao, Wen
2008-01-01
In recent years, the amount of streaming video has grown rapidly on the Web. Retrieving these streaming videos often poses the challenge of indexing and analyzing the media in real time, because the streams must be treated as effectively infinite in length, thus precluding offline processing. Generally speaking, captions are important semantic clues for video indexing and retrieval. However, existing caption detection methods often have difficulty performing real-time detection for streaming video, and few of them address the differentiation of captions from scene texts and scrolling texts. In general, these texts have different roles in streaming video retrieval. To overcome these difficulties, this paper proposes a novel approach which explores inter-frame correlation analysis and wavelet-domain modeling for real-time caption detection in streaming video. In our approach, the inter-frame correlation information is used to distinguish caption texts from scene texts and scrolling texts. Moreover, wavelet-domain Generalized Gaussian Models (GGMs) are utilized to automatically remove non-text regions from each frame and keep only caption regions for further processing. Experimental results show that our approach is able to offer real-time caption detection with high recall and a low false alarm rate, and can also effectively discern caption texts from the other texts even at low resolutions.
Biron, P; Metzger, M H; Pezet, C; Sebban, C; Barthuet, E; Durand, T
2014-01-01
A full-text search tool was introduced into the daily practice of Léon Bérard Center (France), a health care facility devoted to treatment of cancer. This tool was integrated into the hospital information system by the IT department, which had been granted full autonomy to improve the system. The objective was to describe the development and various uses of a tool for full-text search of computerized patient records. The technology is based on Solr, an open-source search engine. It is a web-based application that processes HTTP requests and returns HTTP responses. A data processing pipeline that retrieves data from different repositories, normalizes, cleans and publishes it to Solr was integrated into the information system of the Léon Bérard Center. The IT department also developed user interfaces to allow users to access the search engine within the computerized medical record of the patient. From January to May 2013, 500 queries were launched per month by an average of 140 different users. Several usages of the tool were described, as follows: medical management of patients, medical research, and improving the traceability of medical care in medical records. The sensitivity of the tool for detecting the medical records of patients diagnosed with both breast cancer and diabetes was 83.0%, and its positive predictive value was 48.7% (gold standard: manual screening by a clinical research assistant). The project demonstrates that the introduction of full-text-search tools allowed practitioners to use unstructured medical information for various purposes.
INFORMATION STORAGE AND RETRIEVAL, REPORTS ON EVALUATION, CLUSTERING, AND FEEDBACK.
ERIC Educational Resources Information Center
SALTON, GERALD
THE TWELFTH IN A SERIES COVERING RESEARCH IN AUTOMATIC STORAGE AND RETRIEVAL, THIS REPORT IS DIVIDED INTO THREE PARTS TITLED EVALUATION, CLUSTER SEARCHING, AND USER FEEDBACK METHODS, RESPECTIVELY. THE FIRST PART, EVALUATION, CONTAINS A COMPLETE SUMMARY OF THE RETRIEVAL RESULTS DERIVED FROM SOME SIXTY DIFFERENT TEXT ANALYSIS EXPERIMENTS. IN EACH…
The present status and problems in document retrieval system : document input type retrieval system
NASA Astrophysics Data System (ADS)
Inagaki, Hirohito
Office automation (OA) has brought many changes. Many documents have begun to be maintained in electronic filing systems. It is therefore necessary to establish an efficient document retrieval system to extract useful information. Current document retrieval systems use simple word matching, syntactic matching, and semantic matching to obtain high retrieval efficiency. In addition, document retrieval systems using special hardware devices, such as ISSP, have been developed to achieve high-speed retrieval. Since these systems accept a single sentence or keywords as input, it is difficult for searchers to express their requests. We demonstrated a document input type retrieval system, which can directly accept a document as input and search for similar documents in a document database.
Névéol, Aurélie; Pereira, Suzanne; Kerdelhué, Gaetan; Dahamna, Badisse; Joubert, Michel; Darmoni, Stéfan J
2007-01-01
The growing number of resources to be indexed in the catalogue of online health resources in French (CISMeF) calls for curating strategies involving automatic indexing tools while maintaining the catalogue's high indexing quality standards. The objective was to develop a simple automatic tool that retrieves MeSH descriptors from document titles. In parallel to research on advanced indexing methods, a bag-of-words tool was developed for timely inclusion in CISMeF's maintenance system. An evaluation was carried out on a corpus of 99 documents. The indexing sets retrieved by the automatic tool were compared to manual indexing based on the title and on the full text of resources. 58% of the major main headings were retrieved by the bag-of-words algorithm and the precision on main heading retrieval was 69%. Bag-of-words indexing has effectively been used on selected resources to be included in CISMeF since August 2006. Meanwhile, ongoing work aims at improving the current version of the tool.
Math expression retrieval using an inverted index over symbol pairs
NASA Astrophysics Data System (ADS)
Stalnaker, David; Zanibbi, Richard
2015-01-01
We introduce a new method for indexing and retrieving mathematical expressions, and a new protocol for evaluating math formula retrieval systems. The Tangent search engine uses an inverted index over pairs of symbols in math expressions. Each key in the index is a pair of symbols along with their relative distance and vertical displacement within an expression. Matched expressions are ranked by the harmonic mean of the percentage of symbol pairs matched in the query, and the percentage of symbol pairs matched in the candidate expression. We have found that our method is fast enough for use in real time and finds partial matches well, such as when subexpressions are re-arranged (e.g. expressions moved from the left to the right of an equals sign) or when individual symbols (e.g. variables) differ from a query expression. In an experiment using expressions from English Wikipedia, student and faculty participants (N=20) found expressions returned by Tangent significantly more similar than those from a text-based retrieval system (Lucene) adapted for mathematical expressions. Participants provided similarity ratings using a 5-point Likert scale, evaluating expressions from both algorithms one-at-a-time in a randomized order to avoid bias from the position of hits in search result lists. For the Lucene-based system, precision for the top 1 and 10 hits averaged 60% and 39% across queries respectively, while for Tangent mean precision at 1 and 10 were 99% and 60%. A demonstration and source code are publicly available.
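The indexing and ranking scheme described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the Tangent implementation: expressions are treated as flat symbol sequences, each index key is a (left symbol, right symbol, distance) triple, and vertical displacement is omitted. All function names are hypothetical.

```python
from collections import defaultdict

def symbol_pairs(symbols):
    """Enumerate (left, right, distance) triples from a flat symbol sequence."""
    return {(symbols[i], symbols[j], j - i)
            for i in range(len(symbols))
            for j in range(i + 1, len(symbols))}

def build_index(expressions):
    """Inverted index: symbol-pair key -> set of expression ids containing it."""
    index = defaultdict(set)
    for expr_id, symbols in expressions.items():
        for pair in symbol_pairs(symbols):
            index[pair].add(expr_id)
    return index

def search(query, expressions, index):
    """Rank candidates by the harmonic mean of the two matched-pair fractions."""
    q_pairs = symbol_pairs(query)
    match_counts = defaultdict(int)
    for pair in q_pairs:
        for expr_id in index.get(pair, ()):
            match_counts[expr_id] += 1
    scores = {}
    for expr_id, matched in match_counts.items():
        in_query = matched / len(q_pairs)                               # share of query pairs matched
        in_candidate = matched / len(symbol_pairs(expressions[expr_id]))  # share of candidate pairs matched
        scores[expr_id] = 2 * in_query * in_candidate / (in_query + in_candidate)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Because only candidates sharing at least one pair with the query are ever scored, partial matches surface naturally and unrelated expressions are never touched.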
NASA Astrophysics Data System (ADS)
Borsdorff, Tobias; aan de Brugh, Joost; Hu, Haili; Nédélec, Philippe; Aben, Ilse; Landgraf, Jochen
2017-05-01
We discuss the retrieval of carbon monoxide (CO) vertical column densities from clear-sky and cloud contaminated 2311-2338 nm reflectance spectra measured by the Scanning Imaging Absorption Spectrometer for Atmospheric Chartography (SCIAMACHY) from January 2003 until the end of the mission in April 2012. These data were processed with the Shortwave Infrared CO Retrieval algorithm (SICOR) that we developed for the operational data processing of the Tropospheric Monitoring Instrument (TROPOMI) that will be launched on ESA's Sentinel-5 Precursor (S5P) mission. This study complements previous work that was limited to clear-sky observations over land. Over the oceans, CO is estimated from cloudy-sky measurements only, which is an important addition to the SCIAMACHY clear-sky CO data set as shown by NDACC and TCCON measurements at coastal sites. For Ny-Ålesund, Lauder, Mauna Loa and Reunion, a validation of SCIAMACHY clear-sky retrievals is not meaningful because of the high retrieval noise and the few collocations at these sites. The situation improves significantly when considering cloudy-sky observations, where we find a low mean bias b = ±6.0 ppb and a strong correlation between the validation and the SCIAMACHY results with a mean Pearson correlation coefficient r = 0.7. Also for land observations, cloudy-sky CO retrievals present an interesting complement to the clear-sky data set. For example, at the cities Tehran and Beijing the agreement of SCIAMACHY clear-sky CO observations with MOZAIC/IAGOS airborne measurements is poor with a mean bias of b = 171.2 ppb and 57.9 ppb because of local CO pollution, which cannot be captured by SCIAMACHY. For cloudy-sky retrievals, the validation improves significantly. Here the retrieved column is mainly sensitive to CO above the cloud and so not affected by the strong local surface emissions. Adjusting the MOZAIC/IAGOS measurements to the vertical sensitivity of the retrieval, the mean bias adds up to b = 52.3 ppb and 5.0 ppb for Tehran and Beijing. At the less urbanised region around the airport Windhoek, local CO pollution is less prominent and so MOZAIC/IAGOS measurements agree well with SCIAMACHY clear-sky retrievals with a mean bias of b = 15.5 ppb, but can be even further improved for cloudy SCIAMACHY observations with a mean bias of b = 0.2 ppb. Overall the cloudy-sky CO retrievals from SCIAMACHY short-wave infrared measurements present a major extension of the clear-sky-only data set, which more than triples the amount of data and adds unique observations over the oceans. Moreover, the study represents the first application of the S5P algorithm for operational CO data processing on cloudy observations prior to the launch of the S5P mission.
Text Mining in Biomedical Domain with Emphasis on Document Clustering
2017-01-01
Objectives: With the exponential increase in the number of articles published every year in the biomedical domain, there is a need to build automated systems to extract unknown information from the articles published. Text mining techniques enable the extraction of unknown knowledge from unstructured documents. Methods: This paper reviews text mining processes in detail and the software tools available to carry out text mining. It also reviews the roles and applications of text mining in the biomedical domain. Results: Text mining processes, such as search and retrieval of documents, pre-processing of documents, natural language processing, methods for text clustering, and methods for text classification are described in detail. Conclusions: Text mining techniques can facilitate the mining of vast amounts of knowledge on a given topic from published biomedical research articles and draw meaningful conclusions that are not possible otherwise. PMID:28875048
Landmark Image Retrieval by Jointing Feature Refinement and Multimodal Classifier Learning.
Zhang, Xiaoming; Wang, Senzhang; Li, Zhoujun; Ma, Shuai
2018-06-01
Landmark retrieval is to return a set of images with their landmarks similar to those of the query images. Existing studies on landmark retrieval focus on exploiting the geometries of landmarks for visual similarity matches. However, the visual content of social images is of large diversity in many landmarks, and also some images share common patterns over different landmarks. On the other side, it has been observed that social images usually contain multimodal contents, i.e., visual content and text tags, and each landmark has the unique characteristic of both visual content and text content. Therefore, the approaches based on similarity matching may not be effective in this environment. In this paper, we investigate whether the geographical correlation among the visual content and the text content could be exploited for landmark retrieval. In particular, we propose an effective multimodal landmark classification paradigm to leverage the multimodal contents of social image for landmark retrieval, which integrates feature refinement and landmark classifier with multimodal contents by a joint model. The geo-tagged images are automatically labeled for classifier learning. Visual features are refined based on low rank matrix recovery, and multimodal classification combined with group sparse is learned from the automatically labeled images. Finally, candidate images are ranked by combining classification result and semantic consistence measuring between the visual content and text content. Experiments on real-world datasets demonstrate the superiority of the proposed approach as compared to existing methods.
Integrating query of relational and textual data in clinical databases: a case study.
Fisk, John M; Mutalik, Pradeep; Levin, Forrest W; Erdos, Joseph; Taylor, Caroline; Nadkarni, Prakash
2003-01-01
The authors designed and implemented a clinical data mart composed of an integrated information retrieval (IR) and relational database management system (RDBMS). Using commodity software, which supports interactive, attribute-centric text and relational searches, the mart houses 2.8 million documents that span a five-year period and supports basic IR features such as Boolean searches, stemming, and proximity and fuzzy searching. Results are relevance-ranked using either "total documents per patient" or "report type weighting." Non-curated medical text has a significant degree of malformation with respect to spelling and punctuation, which creates difficulties for text indexing and searching. Presently, the IR facilities of RDBMS packages lack the features necessary to handle such malformed text adequately. A robust IR+RDBMS system can be developed, but it requires integrating RDBMSs with third-party IR software. RDBMS vendors need to make their IR offerings more accessible to non-programmers.
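To make the point about malformed text concrete, the sketch below shows one simple form of fuzzy searching: mapping a misspelled token to its closest entry in a controlled vocabulary. It uses `difflib` from the Python standard library. The vocabulary, the 0.8 cutoff, and the function name are assumptions made for the example and are unrelated to the commodity software used in the study.

```python
import difflib

# Hypothetical controlled vocabulary; a real system would use a much larger term list.
VOCABULARY = ["pneumonia", "pneumothorax", "hypertension", "diabetes"]

def fuzzy_lookup(token, vocabulary=VOCABULARY, cutoff=0.8):
    """Return the closest vocabulary term for a possibly misspelled token, or None."""
    matches = difflib.get_close_matches(token.lower(), vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

A lower cutoff tolerates heavier misspellings at the cost of more false matches, which is the usual trade-off when indexing non-curated clinical text.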
Hackl, W O; Ganslandt, T
2017-08-01
Objective: To summarize recent research and to propose a selection of best papers published in 2016 in the field of Clinical Information Systems (CIS). Method: The query used to retrieve the articles for the CIS section of the 2016 edition of the IMIA Yearbook of Medical Informatics was reused. It again aimed at identifying relevant publications in the field of CIS from PubMed and Web of Science and comprised search terms from the Medical Subject Headings (MeSH) catalog as well as additional free text search terms. The retrieved articles were categorized in a multi-pass review carried out by the two section editors. The final selection of candidate papers was then peer-reviewed by Yearbook editors and external reviewers. Based on the review results, the best papers were then chosen at the selection meeting with the IMIA Yearbook editorial board. Text mining, term co-occurrence mapping, and topic modelling techniques were used to get an overview on the content of the retrieved articles. Results: The query was carried out in mid-January 2017, yielding a consolidated result set of 2,190 articles published in 921 different journals. Out of them, 14 papers were nominated as candidate best papers and three of them were finally selected as the best papers of the CIS field. The content analysis of the articles revealed the broad spectrum of topics covered by CIS research. Conclusions: The CIS field is multi-dimensional and complex. It is hard to draw a well-defined outline between CIS and other domains or other sections of the IMIA Yearbook. The trends observed in the previous years are progressing. Clinical information systems are more than just sociotechnical systems for data collection, processing, exchange, presentation, and archiving. They are the backbone of a complex, trans-institutional information logistics process. Georg Thieme Verlag KG Stuttgart.
ERIC Educational Resources Information Center
Cornell Univ., Ithaca, NY. Dept. of Computer Science.
Part Two of the eighteenth report on Salton's Magical Automatic Retriever of Texts (SMART) project is composed of three papers: The first: "The Effect of Common Words and Synonyms on Retrieval Performance" by D. Bergmark discloses that removal of common words from the query and document vectors significantly increases precision and that…
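The common-word removal mentioned above can be illustrated with a short Python sketch that builds term-frequency vectors with a stop list applied and compares a query with a document by inner product. The stop list and function names are illustrative assumptions and do not reproduce the SMART system's own word list or weighting.

```python
from collections import Counter

# Small illustrative stop list; real systems use lists of several hundred words.
STOP_WORDS = {"the", "of", "and", "a", "in", "on", "to", "is"}

def term_vector(text, stop_words=STOP_WORDS):
    """Build a term-frequency vector, dropping common words."""
    tokens = text.lower().split()
    return Counter(t for t in tokens if t not in stop_words)

def overlap_score(query_vec, doc_vec):
    """Inner product of two term-frequency vectors."""
    return sum(count * doc_vec[term] for term, count in query_vec.items())
```

With the stop list applied, a match on words such as "the" or "of" no longer contributes to the score, so only content-bearing terms influence the ranking.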
A Holistic, Similarity-Based Approach for Personalized Ranking in Web Databases
ERIC Educational Resources Information Center
Telang, Aditya
2011-01-01
With the advent of the Web, the notion of "information retrieval" has acquired a completely new connotation and currently encompasses several disciplines ranging from traditional forms of text and data retrieval in unstructured and structured repositories to retrieval of static and dynamic information from the contents of the surface and deep Web.…
NASA Astrophysics Data System (ADS)
Shah, Sweta; Tuinder, Olaf N. E.; van Peet, Jacob C. A.; de Laat, Adrianus T. J.; Stammes, Piet
2018-04-01
Ozone profile retrieval from nadir-viewing satellite instruments operating in the ultraviolet-visible range requires accurate calibration of Level-1 (L1) radiance data. Here we study the effects of calibration on the derived Level-2 (L2) ozone profiles for three versions of SCanning Imaging Absorption spectroMeter for Atmospheric ChartograpHY (SCIAMACHY) L1 data: version 7 (v7), version 7 with m-factors (v7mfac) and version 8 (v8). We retrieve nadir ozone profiles from the SCIAMACHY instrument that flew on board Envisat using the Ozone ProfilE Retrieval Algorithm (OPERA) developed at KNMI with a focus on stratospheric ozone. We study and assess the quality of these profiles and compare retrieved L2 products from L1 SCIAMACHY data versions from the years 2003 to 2011 without further radiometric correction. From validation of the profiles against ozone sonde measurements, we find that the v8 performs better than v7 and v7mfac due to correction for the scan-angle dependency of the instrument's optical degradation. Validation for the years 2003 and 2009 with ozone sondes shows deviations of SCIAMACHY ozone profiles of 0.8-15 % in the stratosphere (corresponding to pressure range ˜ 100-10 hPa) and 2.5-100 % in the troposphere (corresponding to pressure range ˜ 1000-100 hPa), depending on the latitude and the L1 version used. Using L1 v8 for the years 2003-2011 leads to deviations of ˜ 1-11 % in stratospheric ozone and ˜ 1-45 % in tropospheric ozone. The SCIAMACHY L1 v8 data can still be improved upon in the 265-330 nm range used for ozone profile retrieval. The slit function can be improved with a spectral shift and squeeze, which leads to a few percent residue reduction compared to reference solar irradiance spectra. Furthermore, studies of the ratio of measured to simulated reflectance spectra show that a bias correction in the reflectance for wavelengths below 300 nm appears to be necessary.
NASA Astrophysics Data System (ADS)
You, Daekeun; Simpson, Matthew; Antani, Sameer; Demner-Fushman, Dina; Thoma, George R.
2013-01-01
Pointers (arrows and symbols) are frequently used in biomedical images to highlight specific image regions of interest (ROIs) that are mentioned in figure captions and/or text discussion. Detection of pointers is the first step toward extracting relevant visual features from ROIs and combining them with textual descriptions for a multimodal (text and image) biomedical article retrieval system. Recently we developed a pointer recognition algorithm based on an edge-based pointer segmentation method, and subsequently reported improvements made on our initial approach involving the use of Active Shape Models (ASM) for pointer recognition and a region growing-based method for pointer segmentation. These methods contributed to improving the recall of pointer recognition but not much to the precision. The method discussed in this article is our recent effort to improve the precision rate. Evaluation performed on two datasets and compared with other pointer segmentation methods shows significantly improved precision and the highest F1 score.
Neural networks for data mining electronic text collections
NASA Astrophysics Data System (ADS)
Walker, Nicholas; Truman, Gregory
1997-04-01
The use of neural networks in information retrieval and text analysis has primarily suffered from the issues of adequate document representation, the ability to scale to very large collections, dynamism in the face of new information and the practical difficulties of basing the design on the use of supervised training sets. Perhaps the most important approach to begin solving these problems is the use of `intermediate entities' which reduce the dimensionality of document representations and the size of documents collections to manageable levels coupled with the use of unsupervised neural network paradigms. This paper describes the issues, a fully configured neural network-based text analysis system--dataHARVEST--aimed at data mining text collections which begins this process, along with the remaining difficulties and potential ways forward.
BIRAM: a content-based image retrieval framework for medical images
NASA Astrophysics Data System (ADS)
Moreno, Ramon A.; Furuie, Sergio S.
2006-03-01
In the medical field, digital images are becoming more and more important for the diagnosis and therapy of patients. At the same time, the development of new technologies has increased the amount of image data produced in a hospital. This creates a demand for access methods that offer more than text-based queries for retrieval of the information. This paper proposes a framework for the retrieval of medical images that allows the use of different algorithms for searching medical images by similarity. The framework also enables the search for textual information from an associated medical report and DICOM header information. The proposed system can be used for support of clinical decision making and is intended to be integrated with an open source picture archiving and communication system (PACS). BIRAM has the following advantages: (i) it can receive several types of algorithms for image similarity search; (ii) it allows the codification of the report according to a medical dictionary, improving the indexing and retrieval of the information; (iii) the algorithms can be selectively applied to images with the appropriate characteristics, for instance, only to magnetic resonance images. The framework was implemented in the Java language using an MS Access 97 database. The proposed framework can still be improved by the use of regions of interest (ROI), indexing with slim-trees, and integration with a PACS server.
45 CFR 205.35 - Mechanized claims processing and information retrieval systems; definitions.
Code of Federal Regulations, 2012 CFR
2012-10-01
... claims processing and information retrieval systems; definitions. Section 205.35 through 205.38 contain...: (a) A mechanized claims processing and information retrieval system, hereafter referred to as an automated application processing and information retrieval system (APIRS), or the system, means a system of...
45 CFR 205.35 - Mechanized claims processing and information retrieval systems; definitions.
Code of Federal Regulations, 2013 CFR
2013-10-01
... claims processing and information retrieval systems; definitions. Section 205.35 through 205.38 contain...: (a) A mechanized claims processing and information retrieval system, hereafter referred to as an automated application processing and information retrieval system (APIRS), or the system, means a system of...
45 CFR 205.35 - Mechanized claims processing and information retrieval systems; definitions.
Code of Federal Regulations, 2014 CFR
2014-10-01
... claims processing and information retrieval systems; definitions. Section 205.35 through 205.38 contain...: (a) A mechanized claims processing and information retrieval system, hereafter referred to as an automated application processing and information retrieval system (APIRS), or the system, means a system of...
Converting information from paper to optical media
NASA Technical Reports Server (NTRS)
Deaton, Timothy N.; Tiller, Bruce K.
1990-01-01
The technology of converting large amounts of paper into electronic form is described for use in information management systems based on optical disk storage. The space savings and photographic nature of microfiche are combined in these systems with the advantages of computerized data (fast and flexible retrieval of graphics and text, simultaneous instant access for multiple users, and easy manipulation of data). It is noted that electronic imaging systems offer a unique opportunity to dramatically increase the productivity and profitability of information systems. Particular attention is given to the CALS (Computer-aided Acquisition and Logistic Support) system.
The effects of text messaging on young drivers.
Hosking, Simon G; Young, Kristie L; Regan, Michael A
2009-08-01
This study investigated the effects of using a cell phone to retrieve and send text messages on the driving performance of young novice drivers. Young drivers are particularly susceptible to driver distraction and have an increased risk of distraction-related crashes. Distractions from in-vehicle devices, particularly those that require manual input, are known to cause decrements in driving performance. Twenty young novice drivers used a cell phone to retrieve and send text messages while driving a simulator. The amount of time that drivers spent not looking at the road when text messaging was up to approximately 400% greater than that recorded in baseline (no-text-messaging) conditions. Furthermore, drivers' variability in lane position increased up to approximately 50%, and missed lane changes increased 140%. There was also an increase of up to approximately 150% in drivers' variability in following distances to lead vehicles. Previous research has shown that the risk of crashing while dialing a handheld device, such as when text messaging and driving, is more than double that of conversing on a cell phone. The present study has identified the detrimental effects of text messaging on driving performance that may underlie such increased crash risk. More effective road safety measures are needed to prevent and mitigate the adverse effects on driving performance of using cell phones to retrieve and send text messages.
Rotation-invariant features for multi-oriented text detection in natural images.
Yao, Cong; Zhang, Xin; Bai, Xiang; Liu, Wenyu; Ma, Yi; Tu, Zhuowen
2013-01-01
Texts in natural scenes carry rich semantic information, which can be used to assist a wide range of applications, such as object recognition, image/video retrieval, mapping/navigation, and human computer interaction. However, most existing systems are designed to detect and recognize horizontal (or near-horizontal) texts. Due to the increasing popularity of mobile-computing devices and applications, detecting texts of varying orientations from natural images under less controlled conditions has become an important but challenging task. In this paper, we propose a new algorithm to detect texts of varying orientations. Our algorithm is based on a two-level classification scheme and two sets of features specially designed for capturing the intrinsic characteristics of texts. To better evaluate the proposed method and compare it with the competing algorithms, we generate a comprehensive dataset with various types of texts in diverse real-world scenes. We also propose a new evaluation protocol, which is more suitable for benchmarking algorithms for detecting texts in varying orientations. Experiments on benchmark datasets demonstrate that our system compares favorably with the state-of-the-art algorithms when handling horizontal texts and achieves significantly enhanced performance on variant texts in complex natural scenes.
Krieger, Mary M; Richter, Randy R; Austin, Tricia M
2008-10-01
The research sought to determine (1) how use of the PubMed free full-text (FFT) limit affects citation retrieval and (2) how use of the FFT limit impacts the types of articles and levels of evidence retrieved. Four clinical questions based on a research agenda for physical therapy were searched in PubMed both with and without the use of the FFT limit. Retrieved citations were examined for relevancy to each question. Abstracts of relevant citations were reviewed to determine the types of articles and levels of evidence. Descriptive analysis was used to compare the total number of citations, number of relevant citations, types of articles, and levels of evidence both with and without the use of the FFT limit. Across all 4 questions, the FFT limit reduced the number of citations to 11.1% of the total number of citations retrieved without the FFT limit. Additionally, high-quality evidence such as systematic reviews and randomized controlled trials were missed when the FFT limit was used. Health sciences librarians play a key role in educating users about the potential impact the FFT limit has on the number of citations, types of articles, and levels of evidence retrieved.
Update on Genomic Databases and Resources at the National Center for Biotechnology Information.
Tatusova, Tatiana
2016-01-01
The National Center for Biotechnology Information (NCBI), as a primary public repository of genomic sequence data, collects and maintains enormous amounts of heterogeneous data. Data for genomes, genes, gene expressions, gene variation, gene families, proteins, and protein domains are integrated with the analytical, search, and retrieval resources through the NCBI website. A text-based search and retrieval system provides a fast and easy way to navigate across diverse biological databases. Comparative genome analysis tools lead to further understanding of evolution processes, quickening the pace of discovery. Recent technological innovations have ignited an explosion in genome sequencing that has fundamentally changed our understanding of the biology of living organisms. This huge increase in DNA sequence data presents new challenges for the information management system and the visualization tools. New strategies have been designed to bring order to this genome sequence shockwave and improve the usability of associated data.
Jeong, Hyeonjeong; Sugiura, Motoaki; Sassa, Yuko; Wakusawa, Keisuke; Horie, Kaoru; Sato, Shigeru; Kawashima, Ryuta
2010-04-01
Second language (L2) acquisition necessitates learning and retrieving new words in different modes. In this study, we attempted to investigate the cortical representation of an L2 vocabulary acquired in different learning modes and in cross-modal transfer between learning and retrieval. Healthy participants learned new L2 words either by written translations (text-based learning) or in real-life situations (situation-based learning). Brain activity was then measured during subsequent retrieval of these words. The right supramarginal gyrus and left middle frontal gyrus were involved in situation-based learning and text-based learning, respectively, whereas the left inferior frontal gyrus was activated when learners used L2 knowledge in a mode different from the learning mode. Our findings indicate that the brain regions that mediate L2 memory differ according to how L2 words are learned and used. Copyright 2009 Elsevier Inc. All rights reserved.
Duchrow, Timo; Shtatland, Timur; Guettler, Daniel; Pivovarov, Misha; Kramer, Stefan; Weissleder, Ralph
2009-01-01
Background: The breadth of biological databases and their information content continues to increase exponentially. Unfortunately, our ability to query such sources is still often suboptimal. Here, we introduce and apply community voting, database-driven text classification, and visual aids as a means to incorporate distributed expert knowledge, to automatically classify database entries and to efficiently retrieve them. Results: Using a previously developed peptide database as an example, we compared several machine learning algorithms in their ability to classify abstracts of published literature results into categories relevant to peptide research, such as related or not related to cancer, angiogenesis, molecular imaging, etc. Ensembles of bagged decision trees met the requirements of our application best. No other algorithm consistently performed better in comparative testing. Moreover, we show that the algorithm produces meaningful class probability estimates, which can be used to visualize the confidence of automatic classification during the retrieval process. To allow viewing long lists of search results enriched by automatic classifications, we added a dynamic heat map to the web interface. We take advantage of community knowledge by enabling users to cast votes in Web 2.0 style in order to correct automated classification errors, which triggers reclassification of all entries. We used a novel framework in which the database "drives" the entire vote aggregation and reclassification process to increase speed while conserving computational resources and keeping the method scalable. In our experiments, we simulate community voting by adding various levels of noise to nearly perfectly labelled instances, and show that, under such conditions, classification can be improved significantly. Conclusion: Using PepBank as a model database, we show how to build a classification-aided retrieval system that gathers training data from the community, is completely controlled by the database, scales well with concurrent change events, and can be adapted to add text classification capability to other biomedical databases. The system can be accessed at . PMID:19799796
Data-Base Software For Tracking Technological Developments
NASA Technical Reports Server (NTRS)
Aliberti, James A.; Wright, Simon; Monteith, Steve K.
1996-01-01
Technology Tracking System (TechTracS) computer program developed for use in storing and retrieving information on technology and related patent information developed under auspices of NASA Headquarters and NASA's field centers. Contents of data base include multiple scanned still images and quick-time movies as well as text. TechTracS includes word-processing, report-editing, chart-and-graph-editing, and search-editing subprograms. Extensive keyword searching capabilities enable rapid location of technologies, innovators, and companies. System performs routine functions automatically and serves multiple users.
Recruit--An Ontology Based Information Retrieval System for Clinical Trials Recruitment.
Patrão, Diogo F C; Oleynik, Michel; Massicano, Felipe; Morassi Sasso, Ariane
2015-01-01
Clinical trials are studies designed to assess whether a new intervention is better than the current alternatives. However, most of them fail to recruit participants on schedule. It is hard to use Electronic Health Record (EHR) data to find eligible patients; studies therefore rely on manual assessment, which is time consuming, inefficient and requires specialized training. In this work we describe the design and development of an information retrieval system with the objective of finding eligible patients for cancer trials. The Recruit system has been in use at A. C. Camargo Cancer Center since August 2014 and contains data from more than 500,000 patients and 9 databases. It uses ontologies to integrate data from several sources and represent medical knowledge, which helps enhance results. One can search both in structured data and inside free text reports. The preliminary quality assessments show excellent recall rates. Recruit proved to be a useful tool for researchers and its modular design could be applied to other clinical conditions and hospitals.
Automatic evidence retrieval for systematic reviews.
Choong, Miew Keen; Galgani, Filippo; Dunn, Adam G; Tsafnat, Guy
2014-10-01
Snowballing involves recursively pursuing relevant references cited in the retrieved literature and adding them to the search results. Snowballing is an alternative approach to discover additional evidence that was not retrieved through conventional search. Snowballing's effectiveness makes it best practice in systematic reviews despite being time-consuming and tedious. Our goal was to evaluate an automatic method for citation snowballing's capacity to identify and retrieve the full text and/or abstracts of cited articles. Using 20 review articles that contained 949 citations to journal or conference articles, we manually searched Microsoft Academic Search (MAS) and identified 78.0% (740/949) of the cited articles that were present in the database. We compared the performance of the automatic citation snowballing method against the results of this manual search, measuring precision, recall, and F1 score. The automatic method was able to correctly identify 633 (as proportion of included citations: recall=66.7%, F1 score=79.3%; as proportion of citations in MAS: recall=85.5%, F1 score=91.2%) of citations with high precision (97.7%), and retrieved the full text or abstract for 490 (recall=82.9%, precision=92.1%, F1 score=87.3%) of the 633 correctly retrieved citations. The proposed method for automatic citation snowballing is accurate and is capable of obtaining the full texts or abstracts for a substantial proportion of the scholarly citations in review articles. By automating the process of citation snowballing, it may be possible to reduce the time and effort of common evidence surveillance tasks such as keeping trial registries up to date and conducting systematic reviews.
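The F1 scores reported above follow from the stated precision and recall values, since F1 is their harmonic mean. The short Python check below recomputes them from the counts and percentages given in the abstract; the function and variable names are illustrative.

```python
def f1_score(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

PRECISION = 0.977  # precision of citation identification, as reported

# 633 citations correctly identified out of 949 included citations,
# or out of the 740 citations that were present in the database.
recall_all = 633 / 949
recall_in_db = 633 / 740

f1_all = f1_score(PRECISION, recall_all)
f1_in_db = f1_score(PRECISION, recall_in_db)

# Full-text or abstract retrieval step, using the reported precision and recall.
f1_fulltext = f1_score(0.921, 0.829)
```

Rounded to one decimal place, these reproduce the reported recall values of 66.7% and 85.5% and the F1 scores of 79.3%, 91.2%, and 87.3%.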
A Logic Basis for Information Retrieval.
ERIC Educational Resources Information Center
Watters, C. R.; Shepherd, M. A.
1987-01-01
Discusses the potential of recent work in artificial intelligence, especially expert systems, for the development of more effective information retrieval systems. Highlights include the role of an expert bibliographic retrieval system and a prototype expert retrieval system, PROBIB-2, that uses MicroProlog to provide deductive reasoning…
An Evaluation of On-Line Information Retrieval System Techniques.
ERIC Educational Resources Information Center
Wolfe, Theodore
This report presents a review and evaluation of three remote access on-line information retrieval systems and some ideas on what the capabilities of an ideal on-line information retrieval system should be. The three systems reviewed are the DDC Remote On-Line Retrieval System, the National Aeronautics and Space Administration RECON System, and the…
Assessing semantic similarity of texts - Methods and algorithms
NASA Astrophysics Data System (ADS)
Rozeva, Anna; Zerkova, Silvia
2017-12-01
Assessing the semantic similarity of texts is an important part of different text-related applications like educational systems, information retrieval, text summarization, etc. This task is performed by sophisticated analysis, which implements text-mining techniques. Text mining involves several pre-processing steps, which provide for obtaining structured representative model of the documents in a corpus by means of extracting and selecting the features, characterizing their content. Generally the model is vector-based and enables further analysis with knowledge discovery approaches. Algorithms and measures are used for assessing texts at syntactical and semantic level. An important text-mining method and similarity measure is latent semantic analysis (LSA). It provides for reducing the dimensionality of the document vector space and better capturing the text semantics. The mathematical background of LSA for deriving the meaning of the words in a given text by exploring their co-occurrence is examined. The algorithm for obtaining the vector representation of words and their corresponding latent concepts in a reduced multidimensional space as well as similarity calculation are presented.
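The dimensionality reduction described in this abstract can be illustrated with a truncated singular value decomposition. This is a sketch only, assuming NumPy and a toy term-document count matrix invented for illustration; the abstract does not give the authors' weighting scheme or data, and the function names are hypothetical.

```python
import numpy as np


def lsa_document_vectors(term_doc: np.ndarray, k: int) -> np.ndarray:
    """Return one k-dimensional latent vector per document (one row each)."""
    # Thin SVD: term_doc = u @ diag(s) @ vt, singular values in descending order.
    u, s, vt = np.linalg.svd(term_doc, full_matrices=False)
    # Keep the k strongest latent concepts and scale by their singular values.
    return (s[:k, None] * vt[:k]).T


def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Toy counts (rows = terms, columns = documents); purely illustrative.
X = np.array([
    [1, 0, 0, 0],  # car
    [0, 1, 0, 0],  # automobile
    [1, 1, 0, 0],  # engine
    [0, 0, 1, 1],  # flower
    [0, 0, 1, 1],  # petal
], dtype=float)

docs = lsa_document_vectors(X, k=2)
raw_sim = cosine(X[:, 0], X[:, 1])       # similarity on raw term counts
latent_sim = cosine(docs[0], docs[1])    # similarity in the latent space
print(raw_sim, latent_sim)
```

Documents 0 and 1 use different words for the same thing and overlap only on "engine", so their raw similarity is moderate; in the two-dimensional latent space they fall on the same concept axis.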
Automatic indexing of compound words based on mutual information for Korean text retrieval
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pan Koo Kim; Yoo Kun Cho
In this paper, we present an automatic indexing technique for compound words suitable for an agglutinative language, specifically Korean. First, we present the construction conditions for composing compound words as indexing terms. We also present the decomposition rules applicable to consecutive nouns to extract all contents of the text. Finally, we propose a measure for estimating the usefulness of a term, mutual information, which calculates the degree of word association of compound words based on the information-theoretic notion. By applying this method, our system has raised the precision rate of compound words from 72% to 87%.
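The word-association measure mentioned in this abstract can be sketched with the standard pointwise mutual information formula, log2(P(x, y) / (P(x) P(y))), over adjacent noun pairs. The abstract does not give the authors' exact estimator or thresholds, so the function below and its toy input (English nouns standing in for Korean ones) are assumptions for illustration.

```python
import math
from collections import Counter


def pmi_scores(noun_sequences):
    """Pointwise mutual information for each adjacent noun pair."""
    unigrams = Counter()
    bigrams = Counter()
    for nouns in noun_sequences:
        unigrams.update(nouns)
        bigrams.update(zip(nouns, nouns[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    scores = {}
    for (x, y), count in bigrams.items():
        p_xy = count / n_bi
        p_x = unigrams[x] / n_uni
        p_y = unigrams[y] / n_uni
        scores[(x, y)] = math.log2(p_xy / (p_x * p_y))
    return scores


# Hypothetical sequences of consecutive nouns extracted from text.
sequences = [
    ["information", "retrieval", "system"],
    ["information", "retrieval", "model"],
    ["text", "system"],
    ["text", "model"],
    ["text", "retrieval"],
]
scores = pmi_scores(sequences)
for pair, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(pair, round(score, 2))
```

A pair that co-occurs more often than its parts would by chance, such as ("information", "retrieval") here, scores higher and is a stronger candidate for a compound indexing term.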
Regional information guidance system based on hypermedia concept
NASA Astrophysics Data System (ADS)
Matoba, Hiroshi; Hara, Yoshinori; Kasahara, Yutako
1990-08-01
A regional information guidance system has been developed on an image workstation. The two main features of this system are a hypermedia data structure and a friendly visual interface realized by a full-color frame memory system. Because the hypermedia data structure manages regional information such as maps, pictures and explanations of points of interest, users can retrieve these items one after another as their interests change. For example, users can retrieve the explanation of a picture through the link between pictures and text explanations. Users can also traverse from one document to another by using keywords as cross-reference indices. The second feature is the use of a full-color, high-resolution, wide-space frame memory for visual interface design. This frame memory system enables real-time operation on image data and natural scene representation. The system also provides a half-tone representation function which enables fade-in/out presentations. This fade-in/out function, used in displaying and erasing menus and image data, makes the visual interface easy on the eyes. The system we have developed is a typical example of multimedia applications. We expect the image workstation will play an important role as a platform for multimedia applications.
Automatic Text Structuring and Summarization.
ERIC Educational Resources Information Center
Salton, Gerard; And Others
1997-01-01
Discussion of the use of information retrieval techniques for automatic generation of semantic hypertext links focuses on automatic text summarization. Topics include World Wide Web links, text segmentation, and evaluation of text summarization by comparing automatically generated abstracts with manually prepared abstracts. (Author/LRW)
Code of Federal Regulations, 2011 CFR
2011-10-01
... and information retrieval systems. 433.116 Section 433.116 Public Health CENTERS FOR MEDICARE... FISCAL ADMINISTRATION Mechanized Claims Processing and Information Retrieval Systems § 433.116 FFP for operation of mechanized claims processing and information retrieval systems. (a) Subject to paragraph (j) of...
7 CFR 277.18 - Establishment of an Automated Data Processing (ADP) and Information Retrieval System.
Code of Federal Regulations, 2012 CFR
2012-01-01
...) and Information Retrieval System. 277.18 Section 277.18 Agriculture Regulations of the Department of... Data Processing (ADP) and Information Retrieval System. (a) Scope and application. This section... costs of planning, design, development or installation of ADP and information retrieval systems if the...
Code of Federal Regulations, 2013 CFR
2013-10-01
... and information retrieval systems. 433.116 Section 433.116 Public Health CENTERS FOR MEDICARE... FISCAL ADMINISTRATION Mechanized Claims Processing and Information Retrieval Systems § 433.116 FFP for operation of mechanized claims processing and information retrieval systems. (a) Subject to paragraph (j) of...
7 CFR 277.18 - Establishment of an Automated Data Processing (ADP) and Information Retrieval System.
Code of Federal Regulations, 2014 CFR
2014-01-01
...) and Information Retrieval System. 277.18 Section 277.18 Agriculture Regulations of the Department of... Data Processing (ADP) and Information Retrieval System. (a) Scope and application. This section... costs of planning, design, development or installation of ADP and information retrieval systems if the...
Code of Federal Regulations, 2014 CFR
2014-10-01
... and information retrieval systems. 433.116 Section 433.116 Public Health CENTERS FOR MEDICARE... FISCAL ADMINISTRATION Mechanized Claims Processing and Information Retrieval Systems § 433.116 FFP for operation of mechanized claims processing and information retrieval systems. (a) Subject to paragraph (j) of...
7 CFR 277.18 - Establishment of an Automated Data Processing (ADP) and Information Retrieval System.
Code of Federal Regulations, 2011 CFR
2011-01-01
...) and Information Retrieval System. 277.18 Section 277.18 Agriculture Regulations of the Department of... Data Processing (ADP) and Information Retrieval System. (a) Scope and application. This section... costs of planning, design, development or installation of ADP and information retrieval systems if the...
Code of Federal Regulations, 2012 CFR
2012-10-01
... and information retrieval systems. 433.116 Section 433.116 Public Health CENTERS FOR MEDICARE... FISCAL ADMINISTRATION Mechanized Claims Processing and Information Retrieval Systems § 433.116 FFP for operation of mechanized claims processing and information retrieval systems. (a) Subject to paragraph (j) of...
7 CFR 277.18 - Establishment of an Automated Data Processing (ADP) and Information Retrieval System.
Code of Federal Regulations, 2013 CFR
2013-01-01
...) and Information Retrieval System. 277.18 Section 277.18 Agriculture Regulations of the Department of... Data Processing (ADP) and Information Retrieval System. (a) Scope and application. This section... costs of planning, design, development or installation of ADP and information retrieval systems if the...
The power and limits of a rule-based morpho-semantic parser.
Baud, R. H.; Rassinoux, A. M.; Ruch, P.; Lovis, C.; Scherrer, J. R.
1999-01-01
The advent of the Electronic Patient Record (EPR) implies an increasing amount of medical texts readily available for processing, as soon as convenient tools are made available. The chief application is text analysis, from which one can derive other disciplines like indexing for retrieval, knowledge representation, translation and inferencing for medical intelligent systems. Prerequisites for a convenient analyzer of medical texts are: building the lexicon, developing semantic representation of the domain, having a large corpus of texts available for statistical analysis, and finally mastering robust and powerful parsing techniques in order to satisfy the constraints of the medical domain. This article aims at presenting an easy-to-use parser ready to be adapted in different settings. It describes its power together with its practical limitations as experienced by the authors. PMID:10566313
Improving Retrieval Performance by Relevance Feedback.
ERIC Educational Resources Information Center
Salton, Gerard; Buckley, Chris
1990-01-01
Briefly describes the principal relevance feedback methods that have been introduced over the years and evaluates the effectiveness of the methods in producing improved query formulations. Prescriptions are given for conducting text retrieval operations iteratively using relevance feedback. (24 references) (Author/CLB)
NASA Astrophysics Data System (ADS)
Taira, Ricky K.; Wong, Clement; Johnson, David; Bhushan, Vikas; Rivera, Monica; Huang, Lu J.; Aberle, Denise R.; Cardenas, Alfonso F.; Chu, Wesley W.
1995-05-01
With the increase in the volume and distribution of images and text available in PACS and medical electronic health-care environments, it becomes increasingly important to maintain indexes that summarize the content of these multi-media documents. Such indexes are necessary to quickly locate relevant patient cases for research, patient management, and teaching. The goal of this project is to develop an intelligent document retrieval system that allows researchers to request patient cases based on document content. Thus we wish to retrieve patient cases from electronic information archives using a combined specification of patient demographics, low-level radiologic findings (size, shape, number), intermediate-level radiologic findings (e.g., atelectasis, infiltrates, etc.) and/or high-level pathology constraints (e.g., well-differentiated small cell carcinoma). The cases could be distributed among multiple heterogeneous databases such as PACS, RIS, and HIS. Content-based retrieval systems go beyond the capabilities of simple keyword or string-based retrieval matching systems. These systems require a knowledge base to comprehend the generality/specificity of a concept (thus knowing the subclasses or related concepts of a given concept) and knowledge of the various string representations for each concept (i.e., synonyms, lexical variants, etc.). We have previously reported on a data integration mediation layer that allows transparent access to multiple heterogeneous distributed medical databases (HIS, RIS, and PACS). The data access layer of our architecture currently has limited query processing capabilities. Given a patient hospital identification number, the access mediation layer collects all documents in RIS and HIS and returns this information to a specified workstation location.
In this paper we report on our efforts to extend the query processing capabilities of the system by creation of custom query interfaces, an intelligent query processing engine, and a document-content index that can be generated automatically (i.e., no manual authoring or changes to the normal clinical protocols).
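The role of the knowledge base described in this abstract (subclasses of a concept plus the string variants of each concept) can be sketched as a simple query-expansion step. The tiny hierarchy and variant lists below are hypothetical placeholders for illustration, not a clinical vocabulary and not the authors' knowledge base.

```python
def expand_concept(concept, subclasses, variants):
    """Return every string form of `concept` and of all its descendants."""
    terms = set()
    seen = set()
    stack = [concept]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        terms.add(current)
        terms.update(variants.get(current, []))    # synonyms, lexical variants
        stack.extend(subclasses.get(current, []))  # more specific concepts
    return terms


def matches(report_text, terms):
    """True if any expanded term occurs in the report (case-insensitive)."""
    text = report_text.lower()
    return any(term in text for term in terms)


# Hypothetical toy knowledge base.
subclasses = {
    "malignant neoplasm": ["carcinoma", "sarcoma"],
    "carcinoma": ["small cell carcinoma"],
}
variants = {
    "malignant neoplasm": ["cancer"],
    "small cell carcinoma": ["oat cell carcinoma"],
}

expanded = expand_concept("malignant neoplasm", subclasses, variants)
report = "Biopsy shows sarcoma of the chest wall."
print(matches(report, {"malignant neoplasm"}))  # plain string matching
print(matches(report, expanded))                # content-based matching
```

The plain string query misses the report because the literal phrase never appears, while the expanded term set finds it through the subclass link.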
Shahar, Yuval; Young, Ohad; Shalom, Erez; Mayaffit, Alon; Moskovitch, Robert; Hessing, Alon; Galperin, Maya
2004-01-01
We propose to present a poster (and potentially also a demonstration of the implemented system) summarizing the current state of our work on a hybrid, multiple-format representation of clinical guidelines that facilitates conversion of guidelines from free text to a formal representation. We describe a distributed Web-based architecture (DeGeL) and a set of tools using the hybrid representation. The tools enable performing tasks such as guideline specification, semantic markup, search, retrieval, visualization, eligibility determination, runtime application and retrospective quality assessment. The representation includes four parallel formats: Free text (one or more original sources); semistructured text (labeled by the target guideline-ontology semantic labels); semiformal text (which includes some control specification); and a formal, machine-executable representation. The specification, indexing, search, retrieval, and browsing tools are essentially independent of the ontology chosen for guideline representation, but editing the semi-formal and formal formats requires ontology-specific tools, which we have developed in the case of the Asbru guideline-specification language. The four formats support increasingly sophisticated computational tasks. The hybrid guidelines are stored in a Web-based library. All tools, such as for runtime guideline application or retrospective quality assessment, are designed to operate on all representations. We demonstrate the hybrid framework by providing examples from the semantic markup and search tools.
Automatic natural acquisition of a semantic network for information retrieval systems
NASA Astrophysics Data System (ADS)
Enguehard, Chantal; Malvache, Pierre; Trigano, Philippe
1992-03-01
The amount of information is becoming greater and greater. In industries where complex processes are performed, it is becoming increasingly difficult to profit from all the documents produced when fresh knowledge becomes available (reports, experiments, findings). This situation causes a considerable and expensive waste of precious time lost searching for documents or, quite simply, results in repeating what has already been done. One solution is to transform all paper information into computerized information. We might imagine that we are in a science-fiction world and that we have the perfect computer. We tell it everything we know, we make it read all the books, and if we ask it any question, it will find the response if that response exists. But unfortunately, we are in the real world and the last four decades have taught us to minimize our expectations of computers. During the 1960s, information retrieval systems appeared. Their purpose is to provide access to any desired documents, in response to a question about a subject, even if it is not known to exist. Here we focus on the problem of selecting items to index the documents. In 1966, Salton identified this problem as crucial when he saw that his system, Medlars, did not find a relevant text because of the wrong indexation. Faced with this problem, he imagined a guide to help authors choose the correct indexation, but he anticipated the automation of this operation with the SMART system. It was stated previously that a manual language analysis of information items by subject experts is likely to prove impractical in the long run. After a brief survey of the existing responses to the index choice problem, we shall present the automatic natural acquisition (ANA) system, which chooses items to index texts by using as little knowledge as possible, just by learning the language. This system does not use any grammar or lexicon, so the selected indexes will be very close to the field concerned in the texts.
Krieger, Mary M.; Richter, Randy R.; Austin, Tricia M.
2008-01-01
Objective: The research sought to determine (1) how use of the PubMed free full-text (FFT) limit affects citation retrieval and (2) how use of the FFT limit impacts the types of articles and levels of evidence retrieved. Methods: Four clinical questions based on a research agenda for physical therapy were searched in PubMed both with and without the use of the FFT limit. Retrieved citations were examined for relevancy to each question. Abstracts of relevant citations were reviewed to determine the types of articles and levels of evidence. Descriptive analysis was used to compare the total number of citations, number of relevant citations, types of articles, and levels of evidence both with and without the use of the FFT limit. Results: Across all 4 questions, the FFT limit reduced the number of citations to 11.1% of the total number of citations retrieved without the FFT limit. Additionally, high-quality evidence such as systematic reviews and randomized controlled trials were missed when the FFT limit was used. Conclusions: Health sciences librarians play a key role in educating users about the potential impact the FFT limit has on the number of citations, types of articles, and levels of evidence retrieved. PMID:18974812
Yu, Kaijun
2010-07-01
This paper analyzes the design goals of a medical instrumentation standard information retrieval system. Based on the B/S structure, we established a medical instrumentation standard retrieval system using the ASP.NET C# programming language, an IIS Web server, and a SQL Server 2000 database in the .NET environment. The paper also introduces the system structure, retrieval system modules, system development environment and detailed design of the system.
Code of Federal Regulations, 2010 CFR
2010-10-01
... and information retrieval systems. 433.116 Section 433.116 Public Health CENTERS FOR MEDICARE... FISCAL ADMINISTRATION Mechanized Claims Processing and Information Retrieval Systems § 433.116 FFP for operation of mechanized claims processing and information retrieval systems. (a) Subject to 42 CFR 433.113(c...
SPIRES (Stanford Public Information Retrieval System) 1970-71 Annual Report.
ERIC Educational Resources Information Center
Parker, Edwin B.
SPIRES (Stanford Public Information REtrieval System) is a computer information storage and retrieval system being developed at Stanford University with funding from the National Science Foundation. SPIRES has two major goals: to provide a user-oriented, interactive, on-line retrieval system for a variety of researchers at Stanford; and to support…
Integrating Query of Relational and Textual Data in Clinical Databases: A Case Study
Fisk, John M.; Mutalik, Pradeep; Levin, Forrest W.; Erdos, Joseph; Taylor, Caroline; Nadkarni, Prakash
2003-01-01
Objectives: The authors designed and implemented a clinical data mart composed of an integrated information retrieval (IR) and relational database management system (RDBMS). Design: Using commodity software, which supports interactive, attribute-centric text and relational searches, the mart houses 2.8 million documents that span a five-year period and supports basic IR features such as Boolean searches, stemming, and proximity and fuzzy searching. Measurements: Results are relevance-ranked using either “total documents per patient” or “report type weighting.” Results: Non-curated medical text has a significant degree of malformation with respect to spelling and punctuation, which creates difficulties for text indexing and searching. Presently, the IR facilities of RDBMS packages lack the features necessary to handle such malformed text adequately. Conclusion: A robust IR+RDBMS system can be developed, but it requires integrating RDBMSs with third-party IR software. RDBMS vendors need to make their IR offerings more accessible to non-programmers. PMID:12509355
Reinforced Concrete Wall Form Design Program
1992-08-01
criteria is an absolute limit. You have the choice of 1/8 or 1/16 of an inch total deflection in a span. Once these limits are set here, then they are...Calls GET-INFO-TEXT - Calls ZERO-PLY - If the response to GET-INFO-TEXT is "Values retrieved by computer", then the following procedures are executed...like to enter their own values. ZERO-PLY - Re-initializes all PLY-VEC values to "?". GET-PLY-CLASS - Retrieves from the user the grade of plyform to be
NASA Technical Reports Server (NTRS)
Jamsek, Damir A.
1993-01-01
A brief example of the use of formal methods techniques in the specification of a software system is presented. The report is part of a larger effort targeted at defining a formal methods pilot project for NASA. One possible application domain that may be used to demonstrate the effective use of formal methods techniques within the NASA environment is presented. It is not intended to provide a tutorial on either formal methods techniques or the application being addressed. It should, however, provide an indication that the application being considered is suitable for a formal methods approach by showing how such a task may be started. The particular system being addressed is the Structured File Services (SFS), which is a part of the Data Storage and Retrieval Subsystem (DSAR), which in turn is part of the Data Management System (DMS) onboard Space Station Freedom. This is a software system that is currently under development for NASA. An informal mathematical development is presented. Section 3 contains the same development using Penelope (23), an Ada specification and verification system. The complete text of the English version Software Requirements Specification (SRS) is reproduced in Appendix A.
Validating a Geographical Image Retrieval System.
ERIC Educational Resources Information Center
Zhu, Bin; Chen, Hsinchun
2000-01-01
Summarizes a prototype geographical image retrieval system that demonstrates how to integrate image processing and information analysis techniques to support large-scale content-based image retrieval. Describes an experiment to validate the performance of this image retrieval system against that of human subjects by examining similarity analysis…
Automatic Evidence Retrieval for Systematic Reviews
Choong, Miew Keen; Galgani, Filippo; Dunn, Adam G
2014-01-01
Background Snowballing involves recursively pursuing relevant references cited in the retrieved literature and adding them to the search results. Snowballing is an alternative approach to discover additional evidence that was not retrieved through conventional search. Snowballing’s effectiveness makes it best practice in systematic reviews despite being time-consuming and tedious. Objective Our goal was to evaluate an automatic method for citation snowballing’s capacity to identify and retrieve the full text and/or abstracts of cited articles. Methods Using 20 review articles that contained 949 citations to journal or conference articles, we manually searched Microsoft Academic Search (MAS) and identified 78.0% (740/949) of the cited articles that were present in the database. We compared the performance of the automatic citation snowballing method against the results of this manual search, measuring precision, recall, and F1 score. Results The automatic method was able to correctly identify 633 (as proportion of included citations: recall=66.7%, F1 score=79.3%; as proportion of citations in MAS: recall=85.5%, F1 score=91.2%) of citations with high precision (97.7%), and retrieved the full text or abstract for 490 (recall=82.9%, precision=92.1%, F1 score=87.3%) of the 633 correctly retrieved citations. Conclusions The proposed method for automatic citation snowballing is accurate and is capable of obtaining the full texts or abstracts for a substantial proportion of the scholarly citations in review articles. By automating the process of citation snowballing, it may be possible to reduce the time and effort of common evidence surveillance tasks such as keeping trial registries up to date and conducting systematic reviews. PMID:25274020
Feature extraction for document text using Latent Dirichlet Allocation
NASA Astrophysics Data System (ADS)
Prihatini, P. M.; Suryawan, I. K.; Mandia, IN
2018-01-01
Feature extraction is one of the stages in an information retrieval system and is used to extract the unique feature values of a text document. Feature extraction can be done by several methods, one of which is Latent Dirichlet Allocation. However, research on text feature extraction using the Latent Dirichlet Allocation method is rarely found for Indonesian text. Therefore, in this research, text feature extraction is implemented for Indonesian text. The research method consists of data acquisition, text pre-processing, initialization, topic sampling and evaluation. The evaluation is done by comparing Precision, Recall and F-Measure values between Latent Dirichlet Allocation and Term Frequency Inverse Document Frequency KMeans, which is commonly used for feature extraction. The evaluation results show that the Precision, Recall and F-Measure values of the Latent Dirichlet Allocation method are higher than those of the Term Frequency Inverse Document Frequency KMeans method. This shows that the Latent Dirichlet Allocation method is able to extract features and cluster Indonesian text better than the Term Frequency Inverse Document Frequency KMeans method.
[Technologies for Complex Intelligent Clinical Data Analysis].
Baranov, A A; Namazova-Baranova, L S; Smirnov, I V; Devyatkin, D A; Shelmanov, A O; Vishneva, E A; Antonova, E V; Smirnov, V I
2016-01-01
The paper presents a system for intelligent analysis of clinical information. The authors describe methods implemented in the system for clinical information retrieval, intelligent diagnostics of chronic diseases, assessment of the importance of patient features, and detection of hidden dependencies between features. Results of the experimental evaluation of these methods are also presented. Healthcare facilities generate a large flow of both structured and unstructured data which contain important information about patients. Test results are usually retained as structured data, but some data are retained in the form of natural language texts (medical history, the results of physical examination, and the results of other examinations, such as ultrasound, ECG or X-ray studies). Many tasks arising in clinical practice can be automated by applying methods for intelligent analysis of the accumulated structured and unstructured data, which leads to improvement of healthcare quality. The aim is the creation of a complex system for intelligent data analysis in a multidisciplinary pediatric center. The authors propose methods for information extraction from clinical texts in Russian. The methods are based on deep linguistic analysis. They retrieve terms for diseases, symptoms, areas of the body and drugs. The methods can recognize additional attributes such as "negation" (indicates that the disease is absent), "no patient" (indicates that the disease refers to the patient's family member, but not to the patient), "severity of illness", "disease course", and "body region to which the disease refers". The authors use a set of hand-drawn templates and various techniques based on machine learning to retrieve information using a medical thesaurus. The extracted information is used to solve the problem of automatic diagnosis of chronic diseases.
A machine learning method for classification of patients with similar nosology and a method for determining the most informative patient features are also proposed. The authors processed anonymized health records from the pediatric center to evaluate the proposed methods. The results show the applicability of the information extracted from the texts for solving practical problems. The records of patients with allergic, glomerular and rheumatic diseases were used for experimental assessment of the automatic diagnosis method. The authors also determined the most appropriate machine learning methods for classification of patients for each group of diseases, as well as the most informative disease signs. It was found that using additional information extracted from clinical texts together with structured data helps to improve the quality of diagnosis of chronic diseases. The authors also obtained pattern combinations of disease signs. The proposed methods have been implemented in the intelligent data processing system for a multidisciplinary pediatric center. The experimental results show that the system can improve the quality of pediatric healthcare.
NASA Astrophysics Data System (ADS)
Nomori, Koji; Kitamura, Koji; Motomura, Yoichi; Nishida, Yoshifumi; Yamanaka, Tatsuhiro; Komatsubara, Akinori
In Japan, childhood injury prevention is an urgent issue. Safety measures based on knowledge created from injury data are essential for preventing childhood injuries. In particular, injury prevention through product modification is very important. Risk assessment is one of the most fundamental methods for designing safe products. Conventional risk assessment has been carried out subjectively because product makers have poor data on injuries. This paper deals with evidence-based risk assessment, in which artificial intelligence technologies are strongly needed. It describes a new method of foreseeing the usage of products, which is the first step of evidence-based risk assessment, and presents a retrieval system for injury data. The system enables a product designer to foresee how children use a product and which types of injuries occur due to the product in the daily environment. The developed system consists of large-scale injury data, text mining technology and probabilistic modeling technology. Large-scale text data on childhood injuries were collected from medical institutions by an injury surveillance system. Types of behaviors toward a product were derived from the injury text data using text mining technology. The relationships among products, types of behaviors, types of injuries and characteristics of children were modeled by a Bayesian network. The fundamental functions of the developed system and examples of new findings obtained by the system are reported in this paper.
ERIC Educational Resources Information Center
Parker, Edwin B.
SPIRES (Stanford Public Information Retrieval System) is a computerized information storage and retrieval system intended for use by students and faculty members who have little knowledge of computers but who need rapid and sophisticated retrieval and analysis. The functions and capabilities of the system from the user's point of view are…
Information system to manage anatomical knowledge and image data about brain
NASA Astrophysics Data System (ADS)
Barillot, Christian; Gibaud, Bernard; Montabord, E.; Garlatti, S.; Gauthier, N.; Kanellos, I.
1994-09-01
This paper reports first results obtained in a project aiming at developing a computerized system to manage knowledge about brain anatomy. The emphasis is put on the design of a knowledge base which includes a symbolic model of cerebral anatomical structures (grey nuclei, cortical structures such as gyri and sulci, ventricles, vessels, etc.) and of hypermedia facilities for retrieving and displaying information associated with the objects (texts, drawings, images). Atlas plates digitized from a stereotactic atlas are also used to provide natural and effective communication means between the user and the system.
RIM as the data base management system for a material properties data base
NASA Technical Reports Server (NTRS)
Karr, P. H.; Wilson, D. J.
1984-01-01
Relational Information Management (RIM) was selected as the data base management system for a prototype engineering materials data base. The data base provides a central repository for engineering material properties data, which facilitates their control. Numerous RIM capabilities are exploited to satisfy prototype data base requirements. Numerical, text, tabular, and graphical data and references are being stored for five material types. Data retrieval will be accomplished both interactively and through a FORTRAN interface. The experience gained in creating and exercising the prototype will be used in specifying requirements for a production system.
Effective Web and Desktop Retrieval with Enhanced Semantic Spaces
NASA Astrophysics Data System (ADS)
Daoud, Amjad M.
We describe the design and implementation of the NETBOOK prototype system for collecting, structuring and efficiently creating semantic vectors for concepts, noun phrases, and documents from a corpus of free full text ebooks available on the World Wide Web. Automatic generation of concept maps from correlated index terms and extracted noun phrases is used to build a powerful conceptual index of individual pages. To ensure scalability of our system, dimension reduction is performed using Random Projection [13]. Furthermore, we present a complete evaluation of the relative effectiveness of the NETBOOK system versus the Google Desktop [8].
Interoperability Policy Roadmap
2010-01-01
Retrieval – SMART The technique developed by Dr. Gerard Salton for automated information retrieval and text analysis is called the vector-space... Salton, G., Wong, A., Yang, C.S., “A Vector Space Model for Automatic Indexing”, Communications of the ACM, 18, 613-620. [10] Salton, G., McGill
Multimodal medical information retrieval with unsupervised rank fusion.
Mourão, André; Martins, Flávio; Magalhães, João
2015-01-01
Modern medical information retrieval systems are paramount to manage the insurmountable quantities of clinical data. These systems empower health care experts in the diagnosis of patients and play an important role in the clinical decision process. However, the ever-growing heterogeneous information generated in medical environments poses several challenges for retrieval systems. We propose a medical information retrieval system with support for multimodal medical case-based retrieval. The system supports medical information discovery by providing multimodal search, through a novel data fusion algorithm, and term suggestions from a medical thesaurus. Our search system compared favorably to other systems in 2013 ImageCLEFMedical.
García-Remesal, M; Maojo, V; Billhardt, H; Crespo, J
2010-01-01
Bringing together structured and text-based sources is an exciting challenge for biomedical informaticians, since most relevant biomedical sources belong to one of these categories. In this paper we evaluate the feasibility of integrating relational and text-based biomedical sources using: i) an original logical schema acquisition method for textual databases developed by the authors, and ii) OntoFusion, a system originally designed by the authors for the integration of relational sources. We conducted an integration experiment involving a test set of seven differently structured sources covering the domain of genetic diseases. We used our logical schema acquisition method to generate schemas for all textual sources. The sources were integrated using the methods and tools provided by OntoFusion. The integration was validated using a test set of 500 queries. A panel of experts answered a questionnaire to evaluate i) the quality of the extracted schemas, ii) the query processing performance of the integrated set of sources, and iii) the relevance of the retrieved results. The results of the survey show that our method extracts coherent and representative logical schemas. Experts' feedback on the performance of the integrated system and the relevance of the retrieved results was also positive. Regarding the validation of the integration, the system successfully provided correct results for all queries in the test set. The results of the experiment suggest that text-based sources including a logical schema can be regarded as equivalent to structured databases. Using our method, previous research and existing tools designed for the integration of structured databases can be reused - possibly subject to minor modifications - to integrate differently structured sources.
A similarity learning approach to content-based image retrieval: application to digital mammography.
El-Naqa, Issam; Yang, Yongyi; Galatsanos, Nikolas P; Nishikawa, Robert M; Wernick, Miles N
2004-10-01
In this paper, we describe an approach to content-based retrieval of medical images from a database, and provide a preliminary demonstration of our approach as applied to retrieval of digital mammograms. Content-based image retrieval (CBIR) refers to the retrieval of images from a database using information derived from the images themselves, rather than solely from accompanying text indices. In the medical-imaging context, the ultimate aim of CBIR is to provide radiologists with a diagnostic aid in the form of a display of relevant past cases, along with proven pathology and other suitable information. CBIR may also be useful as a training tool for medical students and residents. The goal of information retrieval is to recall from a database information that is relevant to the user's query. The most challenging aspect of CBIR is the definition of relevance (similarity), which is used to guide the retrieval machine. In this paper, we pursue a new approach, in which similarity is learned from training examples provided by human observers. Specifically, we explore the use of neural networks and support vector machines to predict the user's notion of similarity. Within this framework we propose using a hierarchical learning approach, which consists of a cascade of a binary classifier and a regression module to optimize retrieval effectiveness and efficiency. We also explore how to incorporate online human interaction to achieve relevance feedback in this learning framework. Our experiments are based on a database consisting of 76 mammograms, all of which contain clustered microcalcifications (MCs). Our goal is to retrieve mammogram images containing MC clusters similar to that in a query. The performance of the retrieval system is evaluated using precision-recall curves computed using a cross-validation procedure.
Our experimental results demonstrate that: 1) the learning framework can accurately predict the perceptual similarity reported by human observers, thereby serving as a basis for CBIR; 2) the learning-based framework can significantly outperform a simple distance-based similarity metric; 3) the use of the hierarchical two-stage network can improve retrieval performance; 4) relevance feedback can be effectively incorporated into this learning framework to achieve improvement in retrieval precision based on online interaction with users; and 5) the images retrieved by the network can have predictive value for the disease condition of the query.
FPGA implementation of sparse matrix algorithm for information retrieval
NASA Astrophysics Data System (ADS)
Bojanic, Slobodan; Jevtic, Ruzica; Nieto-Taladriz, Octavio
2005-06-01
Retrieval of textual information requires a tremendous amount of processing time because of the size of the data and the complexity of information retrieval algorithms. In this paper a solution to this problem is proposed via hardware-supported information retrieval algorithms. Reconfigurable computing can accommodate frequent hardware modifications through its tailorable hardware and exploits parallelism for a given application through reconfigurable and flexible hardware units. The degree of parallelism can be tuned to the data. In this work we implemented a standard BLAS (basic linear algebra subprograms) sparse matrix algorithm named Compressed Sparse Row (CSR), which has been shown to be more efficient in terms of storage space requirements and query-processing time than other sparse matrix algorithms for information retrieval applications. Although the inverted index algorithm has been treated as the de facto standard for information retrieval for years, an alternative approach that stores the index of a text collection in a sparse matrix structure is gaining more attention. This approach performs query processing using sparse matrix-vector multiplication and, due to parallelization, achieves a substantial efficiency gain over the sequential inverted index. The parallel implementations of the information retrieval kernel presented in this work target the Virtex II Field Programmable Gate Array (FPGA) board from Xilinx. A recent development in scientific applications is the use of FPGAs to achieve high-performance results. Computational results are compared to implementations on other platforms. The design achieves a high level of parallelism for the overall function while retaining highly optimised hardware within the processing unit.
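The query-processing idea in the abstract above (a document-term index stored in CSR form and scored with a sparse matrix-vector product) can be illustrated in software. The following is a minimal sequential sketch, not the authors' FPGA design; the function names `to_csr` and `csr_matvec`, the toy matrix, and the query vector are all hypothetical and chosen only for illustration.

```python
def to_csr(dense):
    """Convert a dense row-major matrix (list of lists) into CSR arrays.

    Returns (values, col_idx, row_ptr): the nonzero entries, their column
    indices, and the offsets at which each row starts in `values`.
    """
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))  # end of this row == start of the next
    return values, col_idx, row_ptr


def csr_matvec(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a CSR matrix A, touching only nonzeros."""
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


# Hypothetical toy index: rows are documents, columns are terms,
# entries are term weights (assumed here to be raw counts).
doc_term = [
    [2, 0, 1, 0],
    [0, 3, 0, 0],
    [1, 0, 0, 4],
]
query = [1, 0, 1, 0]  # the query mentions terms 0 and 2

values, col_idx, row_ptr = to_csr(doc_term)
scores = csr_matvec(values, col_idx, row_ptr, query)
print(scores)  # one relevance score per document
```

Each row's dot product is independent of the others, which is the property a parallel implementation can exploit; the sketch itself runs the rows one after another.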
A comparison of Boolean-based retrieval to the WAIS system for retrieval of aeronautical information
NASA Technical Reports Server (NTRS)
Marchionini, Gary; Barlow, Diane
1994-01-01
An evaluation of an information retrieval system using a Boolean-based retrieval engine and inverted file architecture and WAIS, which uses a vector-based engine, was conducted. Four research questions in aeronautical engineering were used to retrieve sets of citations from the NASA Aerospace Database which was mounted on a WAIS server and available through Dialog File 108 which served as the Boolean-based system (BBS). High recall and high precision searches were done in the BBS and terse and verbose queries were used in the WAIS condition. Precision values for the WAIS searches were consistently above the precision values for high recall BBS searches and consistently below the precision values for high precision BBS searches. Terse WAIS queries gave somewhat better precision performance than verbose WAIS queries. In every case, a small number of relevant documents retrieved by one system were not retrieved by the other, indicating the incomplete nature of the results from either retrieval system. Relevant documents in the WAIS searches were found to be randomly distributed in the retrieved sets rather than distributed by ranks. Advantages and limitations of both types of systems are discussed.
Program Helps Generate And Manage Graphics
NASA Technical Reports Server (NTRS)
Truong, L. V.
1994-01-01
Living Color Frame Maker (LCFM) computer program generates computer-graphics frames. Graphical frames saved as text files, in readable and disclosed format, easily retrieved and manipulated by user programs for wide range of real-time visual information applications. LCFM implemented in frame-based expert system for visual aids in management of systems. Monitoring, diagnosis, and/or control, diagrams of circuits or systems brought to "life" by use of designated video colors and intensities to symbolize status of hardware components (via real-time feedback from sensors). Status of systems can be displayed. Written in C++ using Borland C++ 2.0 compiler for IBM PC-series computers and compatible computers running MS-DOS.
Cognitive Overhead in Hypertext Learning Reexamined: Overcoming the Myths
ERIC Educational Resources Information Center
Zumbach, Joerg
2006-01-01
In hypertext learning, comparative research is mostly dedicated to differences in text-hypertext information retrieval and processing and to optimization of nonlinear information retrieval. Most of these investigations are conducted within the context of applied research. The theoretical background of information acquisition from linear and…
Web Mining for Web Image Retrieval.
ERIC Educational Resources Information Center
Chen, Zheng; Wenyin, Liu; Zhang, Feng; Li, Mingjing; Zhang, Hongjiang
2001-01-01
Presents a prototype system for image retrieval from the Internet using Web mining. Discusses the architecture of the Web image retrieval prototype; document space modeling; user log mining; and image retrieval experiments to evaluate the proposed system. (AEF)
Tanaka, M; Nakazono, S; Matsuno, H; Tsujimoto, H; Kitamura, Y; Miyano, S
2000-01-01
We have implemented a system for assisting experts in selecting MEDLINE records for database construction purposes. This system has two specific features: The first is a learning mechanism which extracts characteristics in the abstracts of MEDLINE records of interest as patterns. These patterns reflect selection decisions by experts and are used for screening the records. The second is a keyword recommendation system which assists and supplements experts' knowledge in unexpected cases. Combined with a conventional keyword-based information retrieval system, this system may provide an efficient and comfortable environment for MEDLINE record selection by experts. Some computational experiments are provided to prove that this idea is useful.
Cognitive Process as a Basis for Intelligent Retrieval Systems Design.
ERIC Educational Resources Information Center
Chen, Hsinchun; Dhar, Vasant
1991-01-01
Two studies of the cognitive processes involved in online document-based information retrieval were conducted. These studies led to the development of five computational models of online document retrieval which were incorporated into the design of an "intelligent" document-based retrieval system. Both the system and the broader implications of…
ERIC Educational Resources Information Center
Stirling, Keith
2000-01-01
Describes a session on information retrieval systems that planned to discuss relevance measures with Web-based information retrieval; retrieval system performance and evaluation; probabilistic independence of index terms; vector-based models; metalanguages and digital objects; how users assess the reliability, timeliness and bias of information;…
Scene text detection by leveraging multi-channel information and local context
NASA Astrophysics Data System (ADS)
Wang, Runmin; Qian, Shengyou; Yang, Jianfeng; Gao, Changxin
2018-03-01
As an important information carrier, text plays a significant role in many applications. However, text detection in unconstrained scenes is a challenging problem due to cluttered backgrounds, varied appearances, uneven illumination, etc. In this paper, an approach based on multi-channel information and local context is proposed to detect text in natural scenes. Because character candidate detection plays a vital role in a text detection system, Maximally Stable Extremal Regions (MSERs) and a graph-cut based method are integrated to obtain the character candidates by leveraging the multi-channel image information. A cascaded false positive elimination mechanism is constructed from the perspective of the character and the text line, respectively. Since local context information is very valuable, this information is utilized to retrieve missing characters and boost text detection performance. Experimental results on two benchmark datasets, i.e., the ICDAR 2011 dataset and the ICDAR 2013 dataset, demonstrate that the proposed method achieves state-of-the-art performance.
Bringing text display digital radio to consumers with hearing loss.
Sheffield, Ellyn G; Starling, Michael; Schwab, Daniel
2011-01-01
Radio is migrating to digital transmission, expanding its offerings to include captioning for individuals with hearing loss. Text display radio requires a large amount of word throughput with minimal screen display area, making good user interface design crucial to its success. In two experiments, we presented hearing, hard-of-hearing, and deaf consumers with National Public Radio stories converted to text and examined their preferences for and reactions to midsized and small radio text displays. We focused on physical display attributes such as text color, font style, line length, and scrolling type as well as emergency alert messages and emergency prompts for drivers, announcer identification schemes, and synchronization of audio and text. Results suggest that midsized, Global Positioning System (GPS)-style displays were well liked, synchronization of audio and text was important to comprehension and retrieval of story details, identification of announcers was served best with a combination of name change in parenthesis and color change, and a mixture of color and flashing symbols was preferred for emergency alerting.
Retrieval System for Calcined Waste for the Idaho Cleanup Project - 12104
DOE Office of Scientific and Technical Information (OSTI.GOV)
Eastman, Randy L.; Johnston, Beau A.; Lower, Danielle E.
This paper describes the conceptual approach to retrieve radioactive calcine waste, hereafter called calcine, from stainless steel storage bins contained within concrete vaults. The retrieval system will allow evacuation of the granular solids (calcine) from the storage bins through the use of stationary vacuum nozzles. The nozzles will use air jets for calcine fluidization and will be able to rotate and direct the fluidization or displacement of the calcine within the bin. Each bin will have a single retrieval system installed prior to operation to prevent worker exposure to the high radiation fields. The addition of an articulated camera arm will allow for operations monitoring and will be equipped with contingency tools to aid in calcine removal. Possible challenges (calcine bridging and rat-holing) associated with calcine retrieval and transport, including potential solutions for bin pressurization, calcine fluidization and waste confinement, are also addressed. The Calcine Disposition Project has the responsibility to retrieve, treat, and package HLW calcine. The calcine retrieval system has been designed to incorporate the functions and technical characteristics as established by the retrieval system functional analysis. By adequately implementing the highest ranking technical characteristics into the design of the retrieval system, the system will be able to satisfy the functional requirements. The retrieval system conceptual design provides the means for removing bulk calcine from the bins of the CSSF vaults. Top-down vacuum retrieval coupled with an articulating camera arm will allow for a robust, contained process capable of evacuating bulk calcine from bins and transporting it to the processing facility. The system is designed to fluidize, vacuum, transport and direct the calcine from its current location to the CSSF roof-top transport lines.
An articulating camera arm, deployed through an adjacent access riser, will work in conjunction with the retrieval nozzle to aid in calcine fluidization, remote viewing, clumped calcine breaking and recovery from off-normal conditions. As the design of the retrieval system progresses from conceptual to preliminary, increasing attention will be directed toward detailed design and proof-of-concept testing. (authors)
Blanc, Xavier; Collet, Tinh-Hai; Auer, Reto; Iriarte, Pablo; Krause, Jan; Légaré, France; Cornuz, Jacques; Clair, Carole
2015-04-07
Full-text searches of articles increase the recall, defined by the proportion of relevant publications that are retrieved. However, this method is rarely used in medical research due to resource constraints. For the purpose of a systematic review of publications addressing shared decision making, a full-text search method was required to retrieve publications where shared decision making does not appear in the title or abstract. The objective of our study was to assess the efficiency and reliability of full-text searches in major medical journals for identifying shared decision making publications. A full-text search was performed on the websites of 15 high-impact journals in general internal medicine to look up publications of any type from 1996-2011 containing the phrase "shared decision making". The search method was compared with a PubMed search of titles and abstracts only. The full-text search was further validated by requesting all publications from the same time period from the individual journal publishers and searching through the collected dataset. The full-text search for "shared decision making" on journal websites identified 1286 publications in 15 journals compared to 119 through the PubMed search. The search within the publisher-provided publications of 6 journals identified 613 publications compared to 646 with the full-text search on the respective journal websites. The concordance rate was 94.3% between both full-text searches. Full-text searching on medical journal websites is an efficient and reliable way to identify relevant articles in the field of shared decision making for review or other purposes. It may be more widely used in biomedical research in other fields in the future, with the collaboration of publishers and journals toward open-access data.
Ensemble of classifiers for ontology enrichment
NASA Astrophysics Data System (ADS)
Semenova, A. V.; Kureichik, V. M.
2018-05-01
A classifier is the basis of ontology learning systems. Classification of text documents is used in many applications, such as information retrieval, information extraction, and spam detection. A new ensemble of classifiers based on SVM (support vector machine), LSTM (a neural network) and word embeddings is suggested. An experiment was conducted on open data, which allows us to conclude that the proposed classification method is promising. The proposed classifier is implemented in Matlab using the functions of the Text Analytics Toolbox. The principal distinguishing feature of the proposed ensemble of classifiers is its high classification quality at acceptable time costs.
Building a common pipeline for rule-based document classification.
Patterson, Olga V; Ginter, Thomas; DuVall, Scott L
2013-01-01
Instance-based classification of clinical text is a widely used natural language processing task employed as a step for patient classification, document retrieval, or information extraction. Rule-based approaches rely on concept identification and context analysis in order to determine the appropriate class. We propose a five-step process that enables even small research teams to develop simple but powerful rule-based NLP systems by taking advantage of a common UIMA AS based pipeline for classification. Our proposed methodology coupled with the general-purpose solution provides researchers with access to the data locked in clinical text in cases of limited human resources and compact timelines.
Overlap in the functional neural systems involved in semantic and episodic memory retrieval.
Rajah, M N; McIntosh, A R
2005-03-01
Neuroimaging and neuropsychological data suggest that episodic and semantic memory may be mediated by distinct neural systems. However, an alternative perspective is that episodic and semantic memory represent different modes of processing within a single declarative memory system. To examine whether the multiple or the unitary system view better represents the data we conducted a network analysis using multivariate partial least squares (PLS) activation analysis followed by covariance structural equation modeling (SEM) of positron emission tomography data obtained while healthy adults performed episodic and semantic verbal retrieval tasks. It is argued that if performance of episodic and semantic retrieval tasks is mediated by different memory systems, then there should be differences in both regional activations and interregional correlations related to each type of retrieval task, respectively. The PLS results identified brain regions that were differentially active during episodic retrieval versus semantic retrieval. Regions that showed maximal differences in regional activity between episodic retrieval tasks were used to construct separate functional models for episodic and semantic retrieval. Omnibus tests of these functional models failed to find a significant difference across tasks for both functional models. The pattern of path coefficients for the episodic retrieval model was not different across tasks, nor were the path coefficients for the semantic retrieval model. The SEM results suggest that the same memory network/system was engaged across tasks, given the similarities in path coefficients. Therefore, activation differences between episodic and semantic retrieval may reflect variation along a continuum of processing during task performance within the context of a single memory system.
Using Induction to Refine Information Retrieval Strategies
NASA Technical Reports Server (NTRS)
Baudin, Catherine; Pell, Barney; Kedar, Smadar
1994-01-01
Conceptual information retrieval systems use structured document indices, domain knowledge and a set of heuristic retrieval strategies to match user queries with a set of indices describing the document's content. Such retrieval strategies increase the set of relevant documents retrieved (increase recall), but at the expense of returning additional irrelevant documents (decrease precision). Usually in conceptual information retrieval systems this tradeoff is managed by hand and with difficulty. This paper discusses ways of managing this tradeoff by the application of standard induction algorithms to refine the retrieval strategies in an engineering design domain. We gathered examples of query/retrieval pairs during the system's operation using feedback from a user on the retrieved information. We then fed these examples to the induction algorithm and generated decision trees that refine the existing set of retrieval strategies. We found that (1) induction improved the precision on a set of queries generated by another user, without a significant loss in recall, and (2) in an interactive mode, the decision trees pointed out flaws in the retrieval and indexing knowledge and suggested ways to refine the retrieval strategies.
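The recall/precision tradeoff described in the abstract above can be made concrete with a small calculation. This is a minimal sketch using the standard set-based definitions; the function name `precision_recall` and the document identifiers are hypothetical and are not taken from the paper.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = relevant documents retrieved / all documents retrieved
    recall    = relevant documents retrieved / all relevant documents
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical relevance judgments for a single query.
relevant = {"d1", "d2", "d3", "d4"}

# A conservative strategy returns few documents, all of them relevant.
strict = ["d1", "d2"]
# A broader strategy finds more relevant documents but adds irrelevant ones.
broad = ["d1", "d2", "d3", "d5", "d6", "d7"]

print(precision_recall(strict, relevant))
print(precision_recall(broad, relevant))
```

In this toy case the broader strategy raises recall while lowering precision, which is the kind of tradeoff the induced decision trees are meant to manage.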
42 CFR 433.110 - Basis, purpose, and applicability.
Code of Federal Regulations, 2010 CFR
2010-10-01
... and Information Retrieval Systems § 433.110 Basis, purpose, and applicability. (a) This subpart... information retrieval systems and for the operation of certain systems. Additional HHS regulations and CMS... mechanized claims processing and information retrieval system or if the system fails to meet certain...
Hybrid Ontology for Semantic Information Retrieval Model Using Keyword Matching Indexing System
Uthayan, K. R.; Anandha Mala, G. S.
2015-01-01
Ontology is the process of growth and elucidation of concepts of an information domain common to a group of users. Introducing ontology into information retrieval is a common way to improve searches for the relevant information users require. Keyword matching against a historical or information domain is significant in recent work for finding the best match for specific input queries. This research presents a better querying mechanism for information retrieval which integrates ontology queries with keyword search. The ontology-based query is changed into a primary order to predicate logic uncertainty which is used for routing the query to the appropriate servers. Matching algorithms are an active area of research in computer science and artificial intelligence. In text matching, it is more dependable to study the semantics model and query for conditions of semantic matching. This research develops semantic matching between input queries and information in the ontology field. The contributed algorithm is a hybrid method that is based on matching extracted instances from the queries and the information field. The queries and information domain are focused on semantic matching, to discover the best match and to improve the execution process. In conclusion, the hybrid ontology in the semantic web is sufficient to retrieve the documents when compared to standard ontology. PMID:25922851
Retrieving the unretrievable in electronic imaging systems: emotions, themes, and stories
NASA Astrophysics Data System (ADS)
Joergensen, Corinne
1999-05-01
New paradigms such as 'affective computing' and user-based research are extending the realm of facets traditionally addressed in IR systems. This paper builds on previous research reported to the electronic imaging community concerning the need to provide access to more abstract attributes of images than those currently amenable to a variety of content-based and text-based indexing techniques. Empirical research suggests that, for visual materials, in addition to standard bibliographic data and broad subject, and in addition to such visually perceptual attributes as color, texture, shape, and position or focal point, additional access points such as themes, abstract concepts, emotions, stories, and 'people-related' information such as social status would be useful in image retrieval. More recent research demonstrates that similar results are also obtained with 'fine arts' images, which generally have no access provided for these types of attributes. Current efforts to match image attributes as revealed in empirical research with those addressed both in current textual and content-based indexing systems are discussed, as well as the need for new representations for image attributes and for collaboration among diverse communities of researchers.
Lowe, H. J.
1993-01-01
This paper describes Image Engine, an object-oriented, microcomputer-based, multimedia database designed to facilitate the storage and retrieval of digitized biomedical still images, video, and text using inexpensive desktop computers. The current prototype runs on Apple Macintosh computers and allows network database access via peer to peer file sharing protocols. Image Engine supports both free text and controlled vocabulary indexing of multimedia objects. The latter is implemented using the TView thesaurus model developed by the author. The current prototype of Image Engine uses the National Library of Medicine's Medical Subject Headings (MeSH) vocabulary (with UMLS Meta-1 extensions) as its indexing thesaurus. PMID:8130596
DORS: DDC Online Retrieval System.
ERIC Educational Resources Information Center
Liu, Songqiao; Svenonius, Elaine
1991-01-01
Describes the Dewey Online Retrieval System (DORS), which was developed at the University of California, Los Angeles (UCLA), to experiment with classification-based search strategies in online catalogs. Classification structures in automated information retrieval are discussed; and specifications for a classification retrieval interface are…
42 CFR 433.110 - Basis, purpose, and applicability.
Code of Federal Regulations, 2011 CFR
2011-10-01
... and Information Retrieval Systems § 433.110 Basis, purpose, and applicability. (a) This subpart... information retrieval systems and for the operation of certain systems. Additional HHS regulations and CMS... conditions on mechanized claims processing and information retrieval systems (including eligibility...
42 CFR 433.110 - Basis, purpose, and applicability.
Code of Federal Regulations, 2012 CFR
2012-10-01
... and Information Retrieval Systems § 433.110 Basis, purpose, and applicability. (a) This subpart... information retrieval systems and for the operation of certain systems. Additional HHS regulations and CMS... conditions on mechanized claims processing and information retrieval systems (including eligibility...
42 CFR 433.110 - Basis, purpose, and applicability.
Code of Federal Regulations, 2014 CFR
2014-10-01
... and Information Retrieval Systems § 433.110 Basis, purpose, and applicability. (a) This subpart... information retrieval systems and for the operation of certain systems. Additional HHS regulations and CMS... conditions on mechanized claims processing and information retrieval systems (including eligibility...
42 CFR 433.110 - Basis, purpose, and applicability.
Code of Federal Regulations, 2013 CFR
2013-10-01
... and Information Retrieval Systems § 433.110 Basis, purpose, and applicability. (a) This subpart... information retrieval systems and for the operation of certain systems. Additional HHS regulations and CMS... conditions on mechanized claims processing and information retrieval systems (including eligibility...
Medical image retrieval system using multiple features from 3D ROIs
NASA Astrophysics Data System (ADS)
Lu, Hongbing; Wang, Weiwei; Liao, Qimei; Zhang, Guopeng; Zhou, Zhiming
2012-02-01
Compared to a retrieval using global image features, features extracted from regions of interest (ROIs) that reflect distribution patterns of abnormalities would benefit more for content-based medical image retrieval (CBMIR) systems. Currently, most CBMIR systems have been designed for 2D ROIs, which cannot reflect 3D anatomical features and region distribution of lesions comprehensively. To further improve the accuracy of image retrieval, we proposed a retrieval method with 3D features including both geometric features such as Shape Index (SI) and Curvedness (CV) and texture features derived from 3D Gray Level Co-occurrence Matrix, which were extracted from 3D ROIs, based on our previous 2D medical images retrieval system. The system was evaluated with 20 volume CT datasets for colon polyp detection. Preliminary experiments indicated that the integration of morphological features with texture features could improve retrieval performance greatly. The retrieval result using features extracted from 3D ROIs accorded better with the diagnosis from optical colonoscopy than that based on features from 2D ROIs. With the test database of images, the average accuracy rate for 3D retrieval method was 76.6%, indicating its potential value in clinical application.
Project W-211 Initial Tank Retrieval Systems (ITRS) Description of Operations for 241-AZ-102
DOE Office of Scientific and Technical Information (OSTI.GOV)
BRIGGS, S.R.
2000-02-25
The primary purpose of the Initial Tank Retrieval Systems (ITRS) is to provide systems for retrieval of radioactive wastes stored in underground double-shell tanks (DSTs) for transfer to alternate storage, evaporation, pretreatment or treatment, while concurrently reducing risks associated with safety watch list and other DSTs. This Description of Operation (DOO) defines the control philosophy for the waste retrieval system for Tank 241-AZ-102 (AZ-102). This DOO provides a basis for the detailed design of the Project W-211 Retrieval Control System (RCS) for AZ-102 and also establishes test criteria for the RCS.
Wagland, Richard; Recio-Saucedo, Alejandra; Simon, Michael; Bracher, Michael; Hunt, Katherine; Foster, Claire; Downing, Amy; Glaser, Adam; Corner, Jessica
2016-08-01
Quality of cancer care may greatly impact on patients' health-related quality of life (HRQoL). Free-text responses to patient-reported outcome measures (PROMs) provide rich data but analysis is time and resource-intensive. This study developed and tested a learning-based text-mining approach to facilitate analysis of patients' experiences of care and develop an explanatory model illustrating impact on HRQoL. Respondents to a population-based survey of colorectal cancer survivors provided free-text comments regarding their experience of living with and beyond cancer. An existing coding framework was tested and adapted, which informed learning-based text mining of the data. Machine-learning algorithms were trained to identify comments relating to patients' specific experiences of service quality, which were verified by manual qualitative analysis. Comparisons between coded retrieved comments and a HRQoL measure (EQ5D) were explored. The survey response rate was 63.3% (21 802/34 467), of which 25.8% (n=5634) participants provided free-text comments. Of retrieved comments on experiences of care (n=1688), over half (n=1045, 62%) described positive care experiences. Most negative experiences concerned a lack of post-treatment care (n=191, 11% of retrieved comments) and insufficient information concerning self-management strategies (n=135, 8%) or treatment side effects (n=160, 9%). Associations existed between HRQoL scores and coded algorithm-retrieved comments. Analysis indicated that the mechanism by which service quality impacted on HRQoL was the extent to which services prevented or alleviated challenges associated with disease and treatment burdens. Learning-based text mining techniques were found useful and practical tools to identify specific free-text comments within a large dataset, facilitating resource-efficient qualitative analysis. This method should be considered for future PROM analysis to inform policy and practice. 
Study findings indicated that perceived care quality directly impacts on HRQoL. Published by the BMJ Publishing Group Limited.
Transparent Information Systems through Gateways, Front Ends, Intermediaries, and Interfaces.
ERIC Educational Resources Information Center
Williams, Martha E.
1986-01-01
Provides overview of design requirements for transparent information retrieval (implies that user sees through complexity of retrieval activities sequence). Highlights include need for transparent systems; history of transparent retrieval research; information retrieval functions (automated converters, routers, selectors, evaluators/analyzers);…
Knowledge-Based Information Retrieval.
ERIC Educational Resources Information Center
Ford, Nigel
1991-01-01
Discussion of information retrieval focuses on theoretical and empirical advances in knowledge-based information retrieval. Topics discussed include the use of natural language for queries; the use of expert systems; intelligent tutoring systems; user modeling; the need for evaluation of system effectiveness; and examples of systems, including…
Code of Federal Regulations, 2014 CFR
2014-10-01
..., development or installation of a statewide automated application processing and information retrieval system.... (2) The system is compatible with the claims processing and information retrieval systems used in the... in the title IV-A (AFDC) Automated Application Processing and Information Retrieval System Guide...
Code of Federal Regulations, 2013 CFR
2013-10-01
..., development or installation of a statewide automated application processing and information retrieval system.... (2) The system is compatible with the claims processing and information retrieval systems used in the... in the title IV-A (AFDC) Automated Application Processing and Information Retrieval System Guide...
Code of Federal Regulations, 2012 CFR
2012-10-01
..., development or installation of a statewide automated application processing and information retrieval system.... (2) The system is compatible with the claims processing and information retrieval systems used in the... in the title IV-A (AFDC) Automated Application Processing and Information Retrieval System Guide...
Code of Federal Regulations, 2011 CFR
2011-10-01
..., development or installation of a statewide automated application processing and information retrieval system.... (2) The system is compatible with the claims processing and information retrieval systems used in the... in the title IV-A (AFDC) Automated Application Processing and Information Retrieval System Guide...
Code of Federal Regulations, 2010 CFR
2010-10-01
..., development or installation of a statewide automated application processing and information retrieval system.... (2) The system is compatible with the claims processing and information retrieval systems used in the... in the title IV-A (AFDC) Automated Application Processing and Information Retrieval System Guide...
Image selection system. [computerized data storage and retrieval system
NASA Technical Reports Server (NTRS)
Knutson, M. A.; Hurd, D.; Hubble, L.; Kroeck, R. M.
1974-01-01
An image selection system (ISS) was developed for the NASA-Ames Research Center Earth Resources Aircraft Project. The ISS is an interactive, graphics-oriented computer retrieval system for aerial imagery. An analysis of user coverage requests and retrieval strategies is presented, followed by a complete system description. Data base structure, retrieval processors, command language, interactive display options, file structures, and the system's capability to manage sets of selected imagery are described. A detailed example of an area coverage request is graphically presented.
Experimental evaluation of ontology-based HIV/AIDS frequently asked question retrieval system.
Ayalew, Yirsaw; Moeng, Barbara; Mosweunyane, Gontlafetse
2018-05-01
This study presents the results of experimental evaluations of an ontology-based frequently asked question retrieval system in the domain of HIV and AIDS. The main purpose of the system is to provide answers to questions on HIV/AIDS using ontology. To evaluate the effectiveness of the frequently asked question retrieval system, we conducted two experiments. The first experiment focused on the evaluation of the quality of the ontology we developed using the OQuaRE evaluation framework which is based on software quality metrics and metrics designed for ontology quality evaluation. The second experiment focused on evaluating the effectiveness of the ontology in retrieving relevant answers. For this we used an open-source information retrieval platform, Terrier, with retrieval models BM25 and PL2. For the measurement of performance, we used the measures mean average precision, mean reciprocal rank, and precision at 5. The results suggest that frequently asked question retrieval with ontology is more effective than frequently asked question retrieval without ontology in the domain of HIV/AIDS.
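The three measures named in the abstract above (mean average precision, mean reciprocal rank, precision at 5) can be sketched as follows. This is a minimal illustration, not the evaluation code used in the study: the function names, the document identifiers, and the two toy ranked lists are all assumptions made for the example.

```python
def precision_at_k(ranked, relevant, k=5):
    """Fraction of the top-k ranked items that are relevant."""
    return sum(1 for doc in ranked[:k] if doc in relevant) / k


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant item, or 0.0 if none is retrieved."""
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0


def average_precision(ranked, relevant):
    """Mean of the precision values taken at each relevant item's rank."""
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def mean(values):
    values = list(values)
    return sum(values) / len(values) if values else 0.0


# Hypothetical runs: (ranked answer ids, set of relevant answer ids) per question.
runs = [
    (["a", "b", "c", "d", "e"], {"a", "c"}),
    (["x", "y", "z", "w", "v"], {"y"}),
]

map_score = mean(average_precision(r, rel) for r, rel in runs)
mrr_score = mean(reciprocal_rank(r, rel) for r, rel in runs)
p5_score = mean(precision_at_k(r, rel, k=5) for r, rel in runs)
```

Averaging per-question scores, rather than pooling all questions, keeps each question equally weighted regardless of how many relevant answers it has.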
Information Retrieval Systems Retrieved? An Alternative to Present Dial Access Systems
ERIC Educational Resources Information Center
Hofmann, Norbert
1976-01-01
The expense of a dial access information retrieval system (DIARS) is weighed against its benefits. Problems of usage and efficacy for the student are outlined. A fully automated system is proposed instead, and its cost-saving features are pointed out. (MS)
Suppressing the morning rise in cortisol impairs free recall.
Rimmele, Ulrike; Meier, Flurina; Lange, Tanja; Born, Jan
2010-04-01
Elevated glucocorticoid levels impair memory retrieval. We investigated whether retrieval under naturally elevated glucocorticoid levels, i.e., during the morning rise in cortisol can be improved by suppressing cortisol. In a crossover study 16 men retrieved emotional and neutral texts and pictures (learned 3 d earlier) 30 min after morning awakening, following administration of the cortisol synthesis inhibitor metyrapone or placebo. Unexpectedly, the metyrapone-induced cortisol suppression significantly impaired free recall of both materials. Recognition remained unaffected. Thus, not only high, but also very low glucocorticoid levels impair retrieval, with the latter effect possibly reflecting insufficient occupation of hippocampal/amygdalar mineralocorticoid receptors (MRs).
The Flip Sides of Full-Text: Superindex and the Harvard Business Review/Online.
ERIC Educational Resources Information Center
Dadlez, Eva M.
1984-01-01
This article illustrates similarities between two different types of full-text databases--Superindex, Harvard Business Review/Online--and uses them as arena to demonstrate search and display applications of full-text. The selection of logical operators, full-text search strategies, and keywords and Bibliographic Retrieval Service's Occurrence…
Bohne-Lang, Andreas; Lang, Elke; Taube, Anke
2005-06-27
Web-based searching is the accepted contemporary mode of retrieving relevant literature, and retrieving as many full text articles as possible is a typical prerequisite for research success. In most cases only a proportion of references will be directly accessible as digital reprints through displayed links. A large number of references, however, have to be verified in library catalogues and, depending on their availability, are accessible as print holdings or by interlibrary loan request. The problem of verifying local print holdings from an initial retrieval set of citations can be solved using Z39.50, an ANSI protocol for interactively querying library information systems. Numerous systems include Z39.50 interfaces and therefore can process Z39.50 interactive requests. However, the programmed query interaction command structure is non-intuitive and inaccessible to the average biomedical researcher. For the typical user, it is necessary to implement the protocol within a tool that hides and handles Z39.50 syntax, presenting a comfortable user interface. PMD2HD is a web tool implementing Z39.50 to provide an appropriately functional and usable interface to integrate into the typical workflow that follows an initial PubMed literature search, providing users with an immediate asset to assist in the most tedious step in literature retrieval, checking for subscription holdings against a local online catalogue. PMD2HD can facilitate literature access considerably with respect to the time and cost of manual comparisons of search results with local catalogue holdings. The example presented in this article is related to the library system and collections of the German Cancer Research Centre. However, the PMD2HD software architecture and use of common Z39.50 protocol commands allow for transfer to a broad range of scientific libraries using Z39.50-compatible library information systems.
Enhanced Information Retrieval Using AJAX
NASA Astrophysics Data System (ADS)
Kachhwaha, Rajendra; Rajvanshi, Nitin
2010-11-01
Information Retrieval deals with the representation, storage, organization of, and access to information items. The representation and organization of information items should provide the user with easy access to the information. With the rapid development of the Internet, large amounts of digitally stored information are readily available on the World Wide Web. This information is so vast that it becomes increasingly difficult and time-consuming for users to find the information relevant to their needs. The explosive growth of information on the Internet has greatly increased the need for information retrieval systems. However, most search engines use conventional information retrieval systems. An information system needs to implement sophisticated pattern matching tools to determine contents at a faster rate. AJAX has recently emerged as a new tool with which the information retrieval process can become fast, so that information reaches the user at a faster pace than with conventional retrieval systems.
Graph-Based Interactive Bibliographic Information Retrieval Systems
ERIC Educational Resources Information Center
Zhu, Yongjun
2017-01-01
In the big data era, we have witnessed the explosion of scholarly literature. This explosion has imposed challenges to the retrieval of bibliographic information. Retrieval of intended bibliographic information has become challenging due to the overwhelming search results returned by bibliographic information retrieval systems for given input…
Rinaldi, Fabio; Ellendorff, Tilia Renate; Madan, Sumit; Clematide, Simon; van der Lek, Adrian; Mevissen, Theo; Fluck, Juliane
2016-01-01
Automatic extraction of biological network information is one of the most desired and most complex tasks in biological and medical text mining. Track 4 at BioCreative V attempts to approach this complexity using fragments of large-scale manually curated biological networks, represented in Biological Expression Language (BEL), as training and test data. BEL is an advanced knowledge representation format which has been designed to be both human readable and machine processable. The specific goal of track 4 was to evaluate text mining systems capable of automatically constructing BEL statements from given evidence text, and of retrieving evidence text for given BEL statements. Given the complexity of the task, we designed an evaluation methodology which gives credit to partially correct statements. We identified various levels of information expressed by BEL statements, such as entities, functions, relations, and introduced an evaluation framework which rewards systems capable of delivering useful BEL fragments at each of these levels. The aim of this evaluation method is to help identify the characteristics of the systems which, if combined, would be most useful for achieving the overall goal of automatically constructing causal biological networks from text. © The Author(s) 2016. Published by Oxford University Press.
Query Expansion for Noisy Legal Documents
2008-11-01
[9] G. Salton (ed). The SMART retrieval system experiments in automatic document processing. 1971. [10] H. Schutze and J. Pedersen. A cooccurrence... Language Modeling and Information Retrieval. http://www.lemurproject.org. [2] J. Baron, D. Lewis, and D. Oard. TREC 2006 legal track overview. In... Retrieval, 1993. [8] J. Rocchio. Relevance feedback in information retrieval. In The SMART retrieval system experiments in automatic document processing, 1971
EARS: An Online Bibliographic Search and Retrieval System Based on Ordered Explosion.
ERIC Educational Resources Information Center
Ramesh, R.; Drury, Colin G.
1987-01-01
Provides overview of Ergonomics Abstracts Retrieval System (EARS), an online bibliographic search and retrieval system in the area of human factors engineering. Other online systems are described, the design of EARS based on inverted file organization is explained, and system expansions including a thesaurus are discussed. (Author/LRW)
Vaccine adverse event text mining system for extracting features from vaccine safety reports.
Botsis, Taxiarchis; Buttolph, Thomas; Nguyen, Michael D; Winiecki, Scott; Woo, Emily Jane; Ball, Robert
2012-01-01
To develop and evaluate a text mining system for extracting key clinical features from vaccine adverse event reporting system (VAERS) narratives to aid in the automated review of adverse event reports. Based upon clinical significance to VAERS reviewing physicians, we defined the primary (diagnosis and cause of death) and secondary features (eg, symptoms) for extraction. We built a novel vaccine adverse event text mining (VaeTM) system based on a semantic text mining strategy. The performance of VaeTM was evaluated using a total of 300 VAERS reports in three sequential evaluations of 100 reports each. Moreover, we evaluated the VaeTM contribution to case classification; an information retrieval-based approach was used for the identification of anaphylaxis cases in a set of reports and was compared with two other methods: a dedicated text classifier and an online tool. The performance metrics of VaeTM were text mining metrics: recall, precision and F-measure. We also conducted a qualitative difference analysis and calculated sensitivity and specificity for classification of anaphylaxis cases based on the above three approaches. VaeTM performed best in extracting diagnosis, second level diagnosis, drug, vaccine, and lot number features (lenient F-measure in the third evaluation: 0.897, 0.817, 0.858, 0.874, and 0.914, respectively). In terms of case classification, high sensitivity was achieved (83.1%); this was equal and better compared to the text classifier (83.1%) and the online tool (40.7%), respectively. Our VaeTM implementation of a semantic text mining strategy shows promise in providing accurate and efficient extraction of key features from VAERS narratives.
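The metrics reported in the abstract above (recall, precision, F-measure, sensitivity, specificity) all derive from the same four confusion-matrix counts. The sketch below shows the standard definitions; the counts used are hypothetical and are not taken from the VAERS evaluation.

```python
def precision(tp, fp):
    """Share of extracted (or flagged) items that are correct."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Share of true items that were found; identical to sensitivity."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def specificity(tn, fp):
    """Share of true negatives that were correctly left unflagged."""
    return tn / (tn + fp) if (tn + fp) else 0.0


def f_measure(p, r, beta=1.0):
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    denom = beta * beta * p + r
    return (1 + beta * beta) * p * r / denom if denom else 0.0


# Hypothetical counts for a 100-report evaluation set (assumed, for illustration).
tp, fp, fn, tn = 8, 2, 2, 88

p = precision(tp, fp)
r = recall(tp, fn)
f1 = f_measure(p, r)
spec = specificity(tn, fp)
```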
ERIC Educational Resources Information Center
Forrest, Charles
1988-01-01
Reviews technological developments centered around microcomputers that have led to the design of integrated workstations. Topics discussed include methods of information storage, information retrieval, telecommunications networks, word processing, data management, graphics, interactive video, sound, interfaces, artificial intelligence, hypermedia,…
Precise and Efficient Retrieval of Captioned Images: The MARIE Project.
ERIC Educational Resources Information Center
Rowe, Neil C.
1999-01-01
The MARIE project explores knowledge-based information retrieval of captioned images of the kind found in picture libraries and on the Internet. MARIE's five-part approach exploits the idea that images are easier to understand with context, especially descriptive text near them, but it also does image analysis. Experiments show MARIE prototypes…
Exploring Encoding and Retrieval Effects of Background Information on Text Memory
ERIC Educational Resources Information Center
Rawson, Katherine A.; Kintsch, Walter
2004-01-01
Two experiments were conducted (a) to evaluate how providing background information at test may benefit retrieval and (b) to further examine how providing background information prior to study influences encoding. Half of the participants read background information prior to study, and the other half did not. In each group, half were presented…
NASA Technical Reports Server (NTRS)
Lindley, Craig A.
1995-01-01
This paper presents an architecture for satellites regarded as intercommunicating agents. The architecture is based upon a postmodern paradigm of artificial intelligence in which represented knowledge is regarded as text, inference procedures are regarded as social discourse and decision making conventions and the semantics of representations are grounded in the situated behaviour and activity of agents. A particular protocol is described for agent participation in distributed search and retrieval operations conducted as joint activities.
Do, Bao H; Wu, Andrew; Biswal, Sandip; Kamaya, Aya; Rubin, Daniel L
2010-11-01
Storing and retrieving radiology cases is an important activity for education and clinical research, but this process can be time-consuming. In the process of structuring reports and images into organized teaching files, incidental pathologic conditions not pertinent to the primary teaching point can be omitted, as when a user saves images of an aortic dissection case but disregards the incidental osteoid osteoma. An alternate strategy for identifying teaching cases is text search of reports in radiology information systems (RIS), but retrieved reports are unstructured, teaching-related content is not highlighted, and patient identifying information is not removed. Furthermore, searching unstructured reports requires sophisticated retrieval methods to achieve useful results. An open-source, RadLex(®)-compatible teaching file solution called RADTF, which uses natural language processing (NLP) methods to process radiology reports, was developed to create a searchable teaching resource from the RIS and the picture archiving and communication system (PACS). The NLP system extracts and de-identifies teaching-relevant statements from full reports to generate a stand-alone database, thus converting existing RIS archives into an on-demand source of teaching material. Using RADTF, the authors generated a semantic search-enabled, Web-based radiology archive containing over 700,000 cases with millions of images. RADTF combines a compact representation of the teaching-relevant content in radiology reports and a versatile search engine with the scale of the entire RIS-PACS collection of case material. ©RSNA, 2010
Visual Based Retrieval Systems and Web Mining--Introduction.
ERIC Educational Resources Information Center
Iyengar, S. S.
2001-01-01
Briefly discusses Web mining and image retrieval techniques, and then presents a summary of articles in this special issue. Articles focus on Web content mining, artificial neural networks as tools for image retrieval, content-based image retrieval systems, and personalizing the Web browsing experience using media agents. (AEF)
ERIC Educational Resources Information Center
Lynch, Clifford A.
1991-01-01
Describes several aspects of the problem of supporting information retrieval system query requirements in the relational database management system (RDBMS) environment and proposes an extension to query processing called nonmaterialized relations. User interactions with information retrieval systems are discussed, and nonmaterialized relations are…
SPIRES (Stanford Public Information REtrieval System). Annual Report (2d, 1968).
ERIC Educational Resources Information Center
Parker, Edwin B.; And Others
During 1968 the name of the project was changed from "Stanford Physics Information Retrieval System" to "Stanford Public Information Retrieval System" to reflect the broadening of perspective and goals due to formal collaboration with Project BALLOTS (Bibliographic Automation of Large Library Operations using a Time-Sharing System).…
A Prototype of an Intelligent System for Information Retrieval: IOTA.
ERIC Educational Resources Information Center
Chiaramella, Y.; Defude, B.
1987-01-01
Discusses expert systems and their value as components of information retrieval systems related to semantic inference, and describes IOTA, a model of an intelligent information retrieval system which emphasizes natural language query processing. Experimental results are discussed and current and future developments are highlighted. (Author/LRW)
Crawler Acquisition and Testing Demonstration Project Management Plan
DOE Office of Scientific and Technical Information (OSTI.GOV)
DEFIGH-PRICE, C.
2000-10-23
If the crawler-based retrieval system is selected, this project management plan identifies the path forward for acquiring a crawler/track pump waste retrieval system, and completing sufficient testing to support deploying the crawler as part of a retrieval technology demonstration for Tank 241-C-104. In the balance of the document, these activities will be referred to as the Crawler Acquisition and Testing Demonstration. During recent Tri-Party Agreement negotiations, TPA milestones were proposed for a sludge/hard heel waste retrieval demonstration in tank C-104. Specifically one of the proposed milestones requires completion of a cold demonstration of sufficient scale to support final design and testing of the equipment (M-45-03G) by 6/30/2004. A crawler-based retrieval system was one of the two options evaluated during the pre-conceptual engineering for C-104 retrieval (RPP-6843 Rev. 0). The alternative technology procurement initiated by the Hanford Tanks Initiative (HTI) project, combined with the pre-conceptual engineering for C-104 retrieval, provides an opportunity to achieve compliance with the proposed TPA milestone M-45-03H. This Crawler Acquisition and Testing Demonstration project management plan identifies the plans, organizational interfaces and responsibilities, management control systems, reporting systems, timeline and requirements for the acquisition and testing of the crawler-based retrieval system. This project management plan is complementary to and supportive of the Project Management Plan for Retrieval of C-104 (RPP-6557). This project management plan focuses on utilizing and completing the efforts initiated under the Hanford Tanks Initiative (HTI) to acquire and cold test a commercial crawler-based retrieval system. The crawler-based retrieval system will be purchased on a schedule to support design of the waste retrieval from tank C-104 (project W-523) and to meet the requirement of proposed TPA milestone M-45-03H.
This Crawler Acquisition and Testing Demonstration project management plan includes the following: (1) Identification of acquisition strategy and plan to obtain a crawler-based retrieval system; (2) Plan for sufficient cold testing to make a decision for W-523 and to comply with TPA Milestone M-45-03H; (3) Cost and schedule for path forward; (4) Responsibilities of the participants; and (5) The plan is supported by updated Level 1 logics, a Relative Order of Magnitude cost estimate and preliminary project schedule.
Sensitivity Analysis for Atmospheric Infrared Sounder (AIRS) CO2 Retrieval
NASA Technical Reports Server (NTRS)
Gat, Ilana
2012-01-01
The Atmospheric Infrared Sounder (AIRS) is a thermal infrared sensor able to retrieve the daily atmospheric state globally for clear as well as partially cloudy fields of view. The AIRS spectrometer has 2378 channels sensing from 15.4 micrometers to 3.7 micrometers, of which a small subset in the 15 micrometers region has been selected, to date, for CO2 retrieval. To improve upon the current retrieval method, we extended the retrieval calculations to include a prior estimate component and developed a channel ranking system to optimize the channels and number of channels used. The channel ranking system uses a mathematical formalism to rapidly process and assess the retrieval potential of large numbers of channels. Implementing this system, we identified a larger optimized subset of AIRS channels that can decrease retrieval errors and minimize the overall sensitivity to other iridescent contributors, such as water vapor, ozone, and atmospheric temperature. This methodology selects channels globally by accounting for the latitudinal, longitudinal, and seasonal dependencies of the subset. The new methodology increases accuracy in AIRS CO2 as well as other retrievals and enables the extension of retrieved CO2 vertical profiles to altitudes ranging from the lower troposphere to upper stratosphere. The extended retrieval method estimates CO2 vertical profiles using a maximum-likelihood estimation method. We use model data to demonstrate the beneficial impact of the extended retrieval method using the new channel ranking system on CO2 retrieval.
Using Bitmap Indexing Technology for Combined Numerical and Text Queries
DOE Office of Scientific and Technical Information (OSTI.GOV)
Stockinger, Kurt; Cieslewicz, John; Wu, Kesheng
2006-10-16
In this paper, we describe a strategy of using compressed bitmap indices to speed up queries on both numerical data and text documents. By using an efficient compression algorithm, these compressed bitmap indices are compact even for indices with millions of distinct terms. Moreover, bitmap indices can be used very efficiently to answer Boolean queries over text documents involving multiple query terms. Existing inverted indices for text searches are usually inefficient for corpora with a very large number of terms as well as for queries involving a large number of hits. We demonstrate that our compressed bitmap index technology overcomes both of those shortcomings. In a performance comparison against a commonly used database system, our indices answer queries 30 times faster on average. To provide full SQL support, we integrated our indexing software, called FastBit, with MonetDB. The integrated system MonetDB/FastBit provides not only efficient searches on a single table as FastBit does, but also answers join queries efficiently. Furthermore, MonetDB/FastBit also provides a very efficient retrieval mechanism of result records.
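To illustrate how a bitmap index answers Boolean queries over text, the sketch below keeps one bitmap per term, with bit i set when document i contains the term, so that AND, OR, and AND NOT queries reduce to bitwise operations. It is a toy under stated assumptions: plain Python integers stand in for bitmaps, no compression is applied (the paper's indices are compressed), and none of the names here correspond to the FastBit or MonetDB APIs.

```python
def build_bitmap_index(docs):
    """Map each term to an integer whose bit i is set if docs[i] contains it."""
    index = {}
    for doc_id, text in enumerate(docs):
        for term in set(text.lower().split()):
            index[term] = index.get(term, 0) | (1 << doc_id)
    return index


def bitmap_to_ids(bitmap):
    """Decode a non-negative bitmap into a sorted list of document ids."""
    ids = []
    position = 0
    while bitmap:
        if bitmap & 1:
            ids.append(position)
        bitmap >>= 1
        position += 1
    return ids


# Hypothetical three-document corpus.
docs = [
    "bitmap index for text",
    "inverted index for text search",
    "numerical range query",
]
index = build_bitmap_index(docs)

# Boolean queries become bitwise operations on the per-term bitmaps.
both = bitmap_to_ids(index.get("index", 0) & index.get("text", 0))
either = bitmap_to_ids(index.get("bitmap", 0) | index.get("query", 0))
without = bitmap_to_ids(index.get("index", 0) & ~index.get("search", 0))
```

A multi-term query touches one bitmap per term and combines them word by word, which is why the cost grows with the number of query terms rather than with the length of each posting list.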
Computer program and user documentation medical data tape retrieval system
NASA Technical Reports Server (NTRS)
Anderson, J.
1971-01-01
This volume provides several levels of documentation for the program module of the NASA medical directorate mini-computer storage and retrieval system. A biomedical information system overview describes some of the reasons for the development of the mini-computer storage and retrieval system. It briefly outlines all of the program modules which constitute the system.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Berglin, E.J.
1996-09-17
Westinghouse Hanford Company (WHC) is exploring commercial methods for retrieving waste from the underground storage tanks at the Hanford site in south central Washington state. WHC needs data on commercial retrieval systems equipment in order to make programmatic decisions for waste retrieval. Full system testing of retrieval processes is to be demonstrated in phases through September 1997 in support of programs aimed to Acquire Commercial Technology for Retrieval (ACTR) and at the Hanford Tanks Initiative (HTI). One of the important parts of the integrated testing will be the deployment of retrieval tools using manipulator-based systems. WHC requires an assessment of a number of commercial deployment systems that have been identified by the ACTR program as good candidates to be included in an integrated testing effort. Included in this assessment should be an independent evaluation of manipulator tests performed to date, so that WHC can construct an integrated test based on these systems. The objectives of this document are to provide a description of the need, requirements, and constraints for a manipulator-based retrieval system; to evaluate manipulator-based concepts and testing performed to date by a number of commercial organizations; and to identify issues to be resolved through testing and/or analysis for each concept.
The Effect of Indexing Exhaustivity on Retrieval Performance.
ERIC Educational Resources Information Center
Burgin, Robert
1991-01-01
Describes results of a study that investigated the effect of variations in indexing exhaustivity on retrieval performance in a vector space retrieval system. The test collection of documents in the National Library of Medicine's Medline file indexed under cystic fibrosis is described, and use of the SMART information retrieval system is discussed.…
Code of Federal Regulations, 2014 CFR
2014-10-01
... claims processing and information retrieval systems. 433.127 Section 433.127 Public Health CENTERS FOR... PROGRAMS STATE FISCAL ADMINISTRATION Mechanized Claims Processing and Information Retrieval Systems § 433.127 Termination of FFP for failure to provide access to claims processing and information retrieval...
Code of Federal Regulations, 2011 CFR
2011-10-01
... claims processing and information retrieval systems. 433.127 Section 433.127 Public Health CENTERS FOR... PROGRAMS STATE FISCAL ADMINISTRATION Mechanized Claims Processing and Information Retrieval Systems § 433.127 Termination of FFP for failure to provide access to claims processing and information retrieval...
Code of Federal Regulations, 2010 CFR
2010-10-01
... claims processing and information retrieval systems. 433.127 Section 433.127 Public Health CENTERS FOR... PROGRAMS STATE FISCAL ADMINISTRATION Mechanized Claims Processing and Information Retrieval Systems § 433.127 Termination of FFP for failure to provide access to claims processing and information retrieval...
Code of Federal Regulations, 2013 CFR
2013-10-01
... claims processing and information retrieval systems. 433.127 Section 433.127 Public Health CENTERS FOR... PROGRAMS STATE FISCAL ADMINISTRATION Mechanized Claims Processing and Information Retrieval Systems § 433.127 Termination of FFP for failure to provide access to claims processing and information retrieval...
Code of Federal Regulations, 2012 CFR
2012-10-01
... claims processing and information retrieval systems. 433.127 Section 433.127 Public Health CENTERS FOR... PROGRAMS STATE FISCAL ADMINISTRATION Mechanized Claims Processing and Information Retrieval Systems § 433.127 Termination of FFP for failure to provide access to claims processing and information retrieval...
INFORMATION STORAGE AND RETRIEVAL, REPORTS ON EVALUATION PROCEDURES AND RESULTS 1965-1967.
ERIC Educational Resources Information Center
SALTON, GERALD
A DETAILED ANALYSIS OF THE RETRIEVAL EVALUATION RESULTS OBTAINED WITH THE AUTOMATIC SMART DOCUMENT RETRIEVAL SYSTEM FOR DOCUMENT COLLECTIONS IN THE FIELDS OF AERODYNAMICS, COMPUTER SCIENCE, AND DOCUMENTATION IS GIVEN IN THIS REPORT. THE VARIOUS COMPONENTS OF FULLY AUTOMATIC DOCUMENT RETRIEVAL SYSTEMS ARE DISCUSSED IN DETAIL, INCLUDING THE FORMS OF…
Patterns of Hierarchical Structure in the Medical Lexicon
Michael, Patricia A.; Cole, William G.; Stewart, James; Blois, Marsden S.
1987-01-01
Concepts in basic and clinical medical science cover a wide range of levels of description, from the subatomic level to the level of the patient as a whole. Medical language may have usage regularities consistent with this hierarchical nature of medical knowledge. Preliminary studies of word occurrence in abstracts drawn from three medical journals representing three broadly defined levels of description (chemical system, physiologic system, and patient as a whole) demonstrated a nonuniform word usage, with many words unique to one or another journal. In this present study, word occurrence was examined in an expanded pool of medical text consisting of sixteen textbooks representing ten different levels of description: atom/ion, micromolecule, macromolecule, organelle, cell, tissue, organ, physiologic system, major body part (or multiple physiologic systems) and patient as a whole. Word usage was found to be nonuniform, with many words unique to specific levels. The presence of such usage regularities may provide a basis for facilitating the automatic classification and retrieval of medical text.
HUC--A User Designed System for All Recorded Knowledge and Information.
ERIC Educational Resources Information Center
Hilton, Howard J.
This paper proposes a user designed system, HUC, intended to provide a single index and retrieval system covering all recorded knowledge and information capable of being retrieved from all modes of storage, from manual to the most sophisticated retrieval system. The concept integrates terminal hardware, software, and database structure to allow…
Automatic medical image annotation and keyword-based image retrieval using relevance feedback.
Ko, Byoung Chul; Lee, JiHyeon; Nam, Jae-Yeal
2012-08-01
This paper presents a novel multiple-keyword annotation method for medical images, keyword-based medical image retrieval, and a relevance feedback method for enhancing image retrieval performance. For semantic keyword annotation, this study proposes a novel medical image classification method combining local wavelet-based center symmetric-local binary patterns with random forests. For keyword-based image retrieval, our retrieval system uses the confidence score that is assigned to each annotated keyword by combining probabilities of random forests with a predefined body relation graph. To overcome the limitation of keyword-based image retrieval, we combine our image retrieval system with a relevance feedback mechanism based on visual features and a pattern classifier. Compared with other annotation and relevance feedback algorithms, the proposed method shows both improved annotation performance and accurate retrieval results.
Searching Harvard Business Review Online. . . Lessons in Searching a Full Text Database.
ERIC Educational Resources Information Center
Tenopir, Carol
1985-01-01
This article examines the Harvard Business Review Online (HBRO) database (bibliographic description fields, abstracts, extracted information, full text, subject descriptors) and reports on 31 sample HBRO searches conducted in Bibliographic Retrieval Services to test differences between searching full text and searching bibliographic record. Sample…
Elaboration over a Discourse Facilitates Retrieval in Sentence Processing.
Troyer, Melissa; Hofmeister, Philip; Kutas, Marta
2016-01-01
Language comprehension requires access to stored knowledge and the ability to combine knowledge in new, meaningful ways. Previous work has shown that processing linguistically more complex expressions ('Texas cattle rancher' vs. 'rancher') leads to slow-downs in reading during initial processing, possibly reflecting effort in combining information. Conversely, when this information must subsequently be retrieved (as in filler-gap constructions), processing is facilitated for more complex expressions, possibly because more semantic cues are available during retrieval. To follow up on this hypothesis, we tested whether information distributed across a short discourse can similarly provide effective cues for retrieval. Participants read texts introducing two referents (e.g., two senators), one of whom was described in greater detail than the other (e.g., 'The Democrat had voted for one of the senators, and the Republican had voted for the other, a man from Ohio who was running for president'). The final sentence (e.g., 'The senator who the {Republican/Democrat} had voted for…') contained a relative clause picking out either the Many-Cue referent (with 'Republican') or the One-Cue referent (with 'Democrat'). We predicted facilitated retrieval (faster reading times) for the Many-Cue condition at the verb region ('had voted for'), where readers could understand that 'The senator' is the object of the verb. As predicted, this pattern was observed at the retrieval region and continued throughout the rest of the sentence. Participants also completed the Author/Magazine Recognition Tests (ART/MRT; Stanovich and West, 1989), providing a proxy for world knowledge. Since higher ART/MRT scores may index (a) greater experience accessing relevant knowledge and/or (b) richer/more highly structured representations in semantic memory, we predicted it would be positively associated with effects of elaboration on retrieval.
We did not observe the predicted interaction between ART/MRT scores and Cue condition at the retrieval region, though ART/MRT interacted with Cue condition in other locations in the sentence. In sum, we found that providing more elaborative information over the course of a text can facilitate retrieval for referents, consistent with a framework in which referential elaboration over a discourse and not just local linguistic information directly impacts information retrieval during sentence processing.
Not on the Same Page: Undergraduates' Information Retrieval in Electronic and Print Books
ERIC Educational Resources Information Center
Berg, Selinda Adelle; Hoffmann, Kristin; Dawson, Diane
2010-01-01
Academic libraries are increasingly collecting e-books, but little research has investigated how students use e-books compared to print texts. This study used a prompted think-aloud method to gain an understanding of the information retrieval behavior of students in both formats. Qualitative analysis identified themes that will inform instruction…
Support Vector Machines: Relevance Feedback and Information Retrieval.
ERIC Educational Resources Information Center
Drucker, Harris; Shahrary, Behzad; Gibbon, David C.
2002-01-01
Compares support vector machines (SVMs) to Rocchio, Ide regular and Ide dec-hi algorithms in information retrieval (IR) of text documents using relevancy feedback. If the preliminary search is so poor that one has to search through many documents to find at least one relevant document, then SVM is preferred. Includes nine tables. (Contains 24…
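For readers unfamiliar with the Rocchio baseline named above, the standard textbook form of the update moves the query vector toward the centroid of the documents judged relevant and away from the centroid of those judged non-relevant. The sketch below shows that general form only. The weights `alpha`, `beta`, and `gamma` are commonly quoted defaults and the clipping of negative weights is a common convention; neither is taken from this article, and the sample vectors are hypothetical.

```python
def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Return alpha*q + beta*centroid(relevant) - gamma*centroid(nonrelevant).

    All vectors are equal-length lists of term weights. The default weights
    are illustrative assumptions, not values reported in the article.
    """
    dims = len(query)

    def centroid(vectors):
        # Mean vector; an empty feedback set contributes nothing.
        if not vectors:
            return [0.0] * dims
        return [sum(v[i] for v in vectors) / len(vectors) for i in range(dims)]

    rel_c = centroid(relevant)
    non_c = centroid(nonrelevant)
    updated = [alpha * q + beta * r - gamma * n
               for q, r, n in zip(query, rel_c, non_c)]
    # Assumed convention: negative term weights are clipped to zero.
    return [max(0.0, w) for w in updated]


# Hypothetical three-term vocabulary and feedback judgments.
new_query = rocchio(
    query=[1.0, 0.0, 0.0],
    relevant=[[1.0, 1.0, 0.0], [1.0, 0.0, 0.0]],
    nonrelevant=[[0.0, 0.0, 1.0]],
)
print(new_query)
```

The Ide variants differ mainly in dropping the centroid normalization (Ide regular) and in subtracting only the highest-ranked non-relevant document (Ide dec-hi).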
Clustering Methods; Part IV of Scientific Report No. ISR-18, Information Storage and Retrieval...
ERIC Educational Resources Information Center
Cornell Univ., Ithaca, NY. Dept. of Computer Science.
Two papers are included as Part Four of this report on Salton's Magical Automatic Retriever of Texts (SMART) project report. The first paper: "A Controlled Single Pass Classification Algorithm with Application to Multilevel Clustering" by D. B. Johnson and J. M. Laferente presents a single pass clustering method which compares favorably…
ERIC Educational Resources Information Center
Cornell Univ., Ithaca, NY. Dept. of Computer Science.
Part Three of this five part report on Salton's Magical Automatic Retriever of Texts (SMART) project contains four papers. The first: "Variations on the Query Splitting Technique with Relevance Feedback" by T. P. Baker discusses some experiments in relevance feedback performed with variations on the technique of query splitting. The…
Hearns, S; Shirley, P J
2006-01-01
Retrieval and transfer of critically ill and injured patients is a high risk activity. Risk can be minimised with robust safety and clinical governance systems in place. This article describes the various governance systems that can be employed to optimise safety and efficiency in retrieval services. These include operating procedure development, equipment management, communications procedures, crew resource management, significant event analysis, audit and training. PMID:17130608
A proposal of fuzzy connective with learning function and its application to fuzzy retrieval system
NASA Technical Reports Server (NTRS)
Hayashi, Isao; Naito, Eiichi; Ozawa, Jun; Wakami, Noboru
1993-01-01
A new fuzzy connective and a network structure built from fuzzy connectives are proposed to overcome a drawback of conventional fuzzy retrieval systems. This network represents a retrieval query, and the fuzzy connectives in the network have a learning function that adjusts their parameters using data from a database and outputs of a user. Fuzzy retrieval systems employing this network are also constructed. Users can retrieve results even with a query whose attributes do not exist in a database schema and, through the learning function, can get satisfactory results for a variety of ways of thinking.
ERIC Educational Resources Information Center
Vasarhelyi, Paul
The new data retrieval system for the social sciences which has recently been installed in the UNESCO Secretariat in Paris is described in this comprehensive report. The computerized system is designed to facilitate the existing storage systems in the circulation of information, data retrieval, and indexing services. Basically, this report…
Identification of histone modifications in biomedical text for supporting epigenomic research
Kolářik, Corinna; Klinger, Roman; Hofmann-Apitius, Martin
2009-01-01
Background: Posttranslational modifications of histones influence the structure of chromatin and in such a way take part in the regulation of gene expression. Certain histone modification patterns, distributed over the genome, are connected to cell as well as tissue differentiation and to the adaptation of organisms to their environment. Abnormal changes instead influence the development of disease states like cancer. The regulation mechanisms for modifying histones and their functionalities are the subject of epigenomics investigation and are still not completely understood. Text provides a rich resource of knowledge on epigenomics and modifications of histones in particular. It contains information about experimental studies, the conditions used, and results. To our knowledge, no approach has been published so far for identifying histone modifications in text. Results: We have developed an approach for identifying histone modifications in biomedical literature with Conditional Random Fields (CRF) and for resolving the recognized histone modification term variants by term standardization. For the term identification, F1 measures of 0.84 by 10-fold cross-validation on the training corpus and 0.81 on an independent test corpus have been obtained. The standardization enabled the correct transformation of 96% of the terms from the training corpus and 98% from the test corpus. Due to the lack of terminologies exhaustively covering specific histone modification types, we developed a histone modification term hierarchy for use in a semantic text retrieval system. Conclusion: The developed approach highly improves the retrieval of articles describing histone modifications. Since text contains context information about performed studies and experiments, the identification of histone modifications is the basis for supporting literature-based knowledge discovery and hypothesis generation to accelerate epigenomic research. PMID:19208128
Identification of histone modifications in biomedical text for supporting epigenomic research.
Kolárik, Corinna; Klinger, Roman; Hofmann-Apitius, Martin
2009-01-30
Posttranslational modifications of histones influence the structure of chromatin and in such a way take part in the regulation of gene expression. Certain histone modification patterns, distributed over the genome, are connected to cell as well as tissue differentiation and to the adaptation of organisms to their environment. Abnormal changes instead influence the development of disease states like cancer. The regulation mechanisms for modifying histones and their functionalities are the subject of epigenomics investigation and are still not completely understood. Text provides a rich resource of knowledge on epigenomics and modifications of histones in particular. It contains information about experimental studies, the conditions used, and results. To our knowledge, no approach has been published so far for identifying histone modifications in text. We have developed an approach for identifying histone modifications in biomedical literature with Conditional Random Fields (CRF) and for resolving the recognized histone modification term variants by term standardization. For the term identification, F1 measures of 0.84 by 10-fold cross-validation on the training corpus and 0.81 on an independent test corpus have been obtained. The standardization enabled the correct transformation of 96% of the terms from the training corpus and 98% from the test corpus. Due to the lack of terminologies exhaustively covering specific histone modification types, we developed a histone modification term hierarchy for use in a semantic text retrieval system. The developed approach highly improves the retrieval of articles describing histone modifications. Since text contains context information about performed studies and experiments, the identification of histone modifications is the basis for supporting literature-based knowledge discovery and hypothesis generation to accelerate epigenomic research.
Atsak, Piray; Hauer, Daniela; Campolongo, Patrizia; Schelling, Gustav; McGaugh, James L.; Roozendaal, Benno
2012-01-01
There is extensive evidence that glucocorticoid hormones impair the retrieval of memory of emotionally arousing experiences. Although it is known that glucocorticoid effects on memory retrieval impairment depend on rapid interactions with arousal-induced noradrenergic activity, the exact mechanism underlying this presumably nongenomically mediated glucocorticoid action remains to be elucidated. Here, we show that the hippocampal endocannabinoid system, a rapidly activated retrograde messenger system, is involved in mediating glucocorticoid effects on retrieval of contextual fear memory. Systemic administration of corticosterone (0.3–3 mg/kg) to male Sprague–Dawley rats 1 h before retention testing impaired the retrieval of contextual fear memory without impairing the retrieval of auditory fear memory or directly affecting the expression of freezing behavior. Importantly, a blockade of hippocampal CB1 receptors with AM251 prevented the impairing effect of corticosterone on retrieval of contextual fear memory, whereas the same impairing dose of corticosterone increased hippocampal levels of the endocannabinoid 2-arachidonoylglycerol. We also found that antagonism of hippocampal β-adrenoceptor activity with local infusions of propranolol blocked the memory retrieval impairment induced by the CB receptor agonist WIN55,212–2. Thus, these findings strongly suggest that the endocannabinoid system plays an intermediary role in regulating rapid glucocorticoid effects on noradrenergic activity in impairing memory retrieval of emotionally arousing experiences. PMID:22331883
An Intelligent System for Document Retrieval in Distributed Office Environments.
ERIC Educational Resources Information Center
Mukhopadhyay, Uttam; And Others
1986-01-01
MINDS (Multiple Intelligent Node Document Servers) is a distributed system of knowledge-based query engines for efficiently retrieving multimedia documents in an office environment of distributed workstations. By learning document distribution patterns and user interests and preferences during system usage, it customizes document retrievals for…
42 CFR 433.138 - Identifying liable third parties.
Code of Federal Regulations, 2013 CFR
2013-10-01
... processing and information retrieval system. Basic requirement—Development of an action plan. (1) If a State has a mechanized claims processing and information retrieval system approved by CMS under subpart C of... plan must be integrated with the mechanized claims processing and information retrieval system. (2) The...
42 CFR 433.138 - Identifying liable third parties.
Code of Federal Regulations, 2014 CFR
2014-10-01
... processing and information retrieval system. Basic requirement—Development of an action plan. (1) If a State has a mechanized claims processing and information retrieval system approved by CMS under subpart C of... plan must be integrated with the mechanized claims processing and information retrieval system. (2) The...
42 CFR 433.138 - Identifying liable third parties.
Code of Federal Regulations, 2012 CFR
2012-10-01
... processing and information retrieval system. Basic requirement—Development of an action plan. (1) If a State has a mechanized claims processing and information retrieval system approved by CMS under subpart C of... plan must be integrated with the mechanized claims processing and information retrieval system. (2) The...
Integration of Information Retrieval and Database Management Systems.
ERIC Educational Resources Information Center
Deogun, Jitender S.; Raghavan, Vijay V.
1988-01-01
Discusses the motivation for integrating information retrieval and database management systems, and proposes a probabilistic retrieval model in which records in a file may be composed of attributes (formatted data items) and descriptors (content indicators). The details and resolutions of difficulties involved in integrating such systems are…
Hyperspectral remote sensing image retrieval system using spectral and texture features.
Zhang, Jing; Geng, Wenhao; Liang, Xi; Li, Jiafeng; Zhuo, Li; Zhou, Qianlan
2017-06-01
Although many content-based image retrieval systems have been developed, few studies have focused on hyperspectral remote sensing images. In this paper, a hyperspectral remote sensing image retrieval system based on spectral and texture features is proposed. The main contributions are fourfold: (1) considering the "mixed pixel" in the hyperspectral image, endmembers as spectral features are extracted by an improved automatic pixel purity index algorithm, then the texture features are extracted with the gray level co-occurrence matrix; (2) similarity measurement is designed for the hyperspectral remote sensing image retrieval system, in which the similarity of spectral features is measured with the spectral information divergence and spectral angle match mixed measurement and in which the similarity of textural features is measured with Euclidean distance; (3) considering the limited ability of the human visual system, the retrieval results are returned after synthesizing true color images based on the hyperspectral image characteristics; (4) the retrieval results are optimized by adjusting the feature weights of similarity measurements according to the user's relevance feedback. The experimental results on NASA data sets show that our system can achieve retrieval performance comparable or superior to existing hyperspectral analysis schemes.
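The similarity measures named in contribution (2) have standard definitions, sketched below under stated assumptions. The spectral angle is the angle between two spectra treated as vectors, and the spectral information divergence is the symmetric relative entropy between spectra normalized to sum to one. The abstract does not give the exact form of the "mixed measurement"; the product SID × tan(SAM) used in `sid_sam` is one published combination and is an assumption here. All function names are hypothetical.

```python
import math


def spectral_angle(x, y):
    """Angle in radians between two spectra treated as vectors (SAM)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    # Clamp to [-1, 1] to guard against floating-point overshoot.
    cosine = max(-1.0, min(1.0, dot / (norm_x * norm_y)))
    return math.acos(cosine)


def spectral_information_divergence(x, y):
    """Symmetric relative entropy between normalized spectra (SID).

    Assumes every band value is strictly positive.
    """
    total_x, total_y = sum(x), sum(y)
    p = [a / total_x for a in x]
    q = [b / total_y for b in y]
    return sum(pi * math.log(pi / qi) + qi * math.log(qi / pi)
               for pi, qi in zip(p, q))


def sid_sam(x, y):
    """Assumed mixed measure: SID multiplied by tan(SAM)."""
    return spectral_information_divergence(x, y) * math.tan(spectral_angle(x, y))


def euclidean(u, v):
    """Euclidean distance, as used for the texture feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Hypothetical four-band spectra.
a = [0.2, 0.4, 0.3, 0.1]
b = [0.1, 0.3, 0.4, 0.2]
print(spectral_angle(a, b), spectral_information_divergence(a, b), sid_sam(a, b))
```

Both spectral measures are insensitive to a uniform scaling of brightness, which is why they are often preferred over plain Euclidean distance for comparing spectra.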
Polygons of global undersea features for geographic searches
Hartwell, Stephen R.; Wingfield, Dana K.; Allwardt, Alan O.; Lightsom, Frances L.; Wong, Florence L.
2018-01-01
A shapefile of 311 undersea features from all major oceans and seas has been created as an aid for retrieving georeferenced information resources. Geospatial information systems with the capability to search user-defined, polygonal geographic areas will be able to utilize this shapefile or secondary products derived from it, such as linked data based on well-known text representations of the individual polygons within the shapefile. Version 1.1 of this report also includes a linked data representation of 299 of these features and their spatial extents.
Hypothesis-confirming information search strategies and computerized information-retrieval systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jacobs, S.M.
A recent trend in information-retrieval systems technology is the development of on-line information retrieval systems. One objective of these systems has been to attempt to enhance decision effectiveness by allowing users to preferentially seek information, thereby facilitating the reduction or elimination of information overload. These systems do not necessarily lead to more-effective decision making, however. Recent research in information-search strategy suggests that when users are seeking information subsequent to forming initial beliefs, they may preferentially seek information to confirm these beliefs. It seems that effective computer-based decision support requires an information retrieval system capable of: (a) retrieving a subset of all available information, in order to reduce information overload, and (b) supporting an information search strategy that considers all relevant information, rather than merely hypothesis-confirming information. An information retrieval system with an expert component (i.e., a knowledge-based DSS) should be able to provide these capabilities. Results of this study are inconclusive; there was neither strong confirmatory evidence nor strong disconfirmatory evidence regarding the effectiveness of the KBDSS.
The development of the Medical Literature Analysis and Retrieval System (MEDLARS)*
Dee, Cheryl Rae
2007-01-01
Objective: The research provides a chronology of the US National Library of Medicine's (NLM's) contribution to access to the world's biomedical literature through its computerization of biomedical indexes, particularly the Medical Literature Analysis and Retrieval System (MEDLARS). Method: Using material gathered from NLM's archives and from personal interviews with people associated with developing MEDLARS and its associated systems, the author discusses key events in the history of MEDLARS. Discussion: From the development of the early mechanized bibliographic retrieval systems of the 1940s to the beginnings of online, interactive computerized bibliographic search systems of the early 1970s chronicled here, NLM's contributions to automation and bibliographic retrieval have been extensive. Conclusion: As NLM's technological experience and expertise grew, innovative bibliographic storage and retrieval systems emerged. NLM's accomplishments regarding MEDLARS were cutting edge, placing the library at the forefront of incorporating mechanization and technologies into medical information systems. PMID:17971889
42 CFR 432.50 - FFP: Staffing and training costs.
Code of Federal Regulations, 2014 CFR
2014-10-01
... directly in the operation of mechanized claims processing and information retrieval systems, the rate is 75... processing and information retrieval systems, the rate is 50 percent for training and 90 percent for all... information retrieval systems (paragraphs (b)(2) and (3) of this section) are applicable only if the design...
42 CFR 432.50 - FFP: Staffing and training costs.
Code of Federal Regulations, 2013 CFR
2013-10-01
... directly in the operation of mechanized claims processing and information retrieval systems, the rate is 75... processing and information retrieval systems, the rate is 50 percent for training and 90 percent for all... information retrieval systems (paragraphs (b)(2) and (3) of this section) are applicable only if the design...
42 CFR 432.50 - FFP: Staffing and training costs.
Code of Federal Regulations, 2010 CFR
2010-10-01
... directly in the operation of mechanized claims processing and information retrieval systems, the rate is 75... processing and information retrieval systems, the rate is 50 percent for training and 90 percent for all... information retrieval systems (paragraphs (b)(2) and (3) of this section) are applicable only if the design...
42 CFR 432.50 - FFP: Staffing and training costs.
Code of Federal Regulations, 2011 CFR
2011-10-01
... directly in the operation of mechanized claims processing and information retrieval systems, the rate is 75... processing and information retrieval systems, the rate is 50 percent for training and 90 percent for all... information retrieval systems (paragraphs (b)(2) and (3) of this section) are applicable only if the design...
42 CFR 432.50 - FFP: Staffing and training costs.
Code of Federal Regulations, 2012 CFR
2012-10-01
... directly in the operation of mechanized claims processing and information retrieval systems, the rate is 75... processing and information retrieval systems, the rate is 50 percent for training and 90 percent for all... information retrieval systems (paragraphs (b)(2) and (3) of this section) are applicable only if the design...
Topology of Document Retrieval Systems.
ERIC Educational Resources Information Center
Everett, Daniel M.; Cater, Steven C.
1992-01-01
Explains the use of a topological structure to examine the closeness between documents in retrieval systems and analyzes the topological structure of a vector-space model, a fuzzy-set model, an extended Boolean model, a probabilistic model, and a TIRS (Topological Information Retrieval System) model. Proofs for the results are appended. (17…
Web Image Retrieval Using Self-Organizing Feature Map.
ERIC Educational Resources Information Center
Wu, Qishi; Iyengar, S. Sitharama; Zhu, Mengxia
2001-01-01
Provides an overview of current image retrieval systems. Describes the architecture of the SOFM (Self Organizing Feature Maps) based image retrieval system, discussing the system architecture and features. Introduces the Kohonen model, and describes the implementation details of SOFM computation and its learning algorithm. Presents a test example…
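The Kohonen learning rule mentioned above can be sketched in its generic textbook form: find the unit whose weight vector is closest to the input (the best matching unit), then pull that unit and its map neighbours toward the input. The one-dimensional map, the Gaussian neighbourhood, and the names `best_matching_unit` and `train_step` are assumptions for illustration, not details taken from the article.

```python
import math


def best_matching_unit(weights, x):
    """Index of the unit whose weight vector is closest to input x."""
    def squared_distance(w):
        return sum((wi - xi) ** 2 for wi, xi in zip(w, x))
    return min(range(len(weights)), key=lambda j: squared_distance(weights[j]))


def train_step(weights, x, learning_rate, sigma):
    """One Kohonen update on a 1-D map; returns the new weight vectors.

    Unit j moves toward x by learning_rate * h(j), where h is a Gaussian of
    the map distance between unit j and the best matching unit.
    """
    bmu = best_matching_unit(weights, x)
    updated = []
    for j, w in enumerate(weights):
        h = math.exp(-((j - bmu) ** 2) / (2 * sigma ** 2))
        updated.append([wi + learning_rate * h * (xi - wi)
                        for wi, xi in zip(w, x)])
    return updated


# Hypothetical two-unit map and one two-dimensional feature vector.
weights = [[0.0, 0.0], [1.0, 1.0]]
weights = train_step(weights, [0.9, 0.9], learning_rate=0.5, sigma=1.0)
print(weights)
```

In practice the learning rate and neighbourhood width are decreased over many passes through the feature vectors, so that the map first orders itself globally and then fine-tunes locally.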
A computer system for the storage and retrieval of gravity data, Kingdom of Saudi Arabia
Godson, Richard H.; Andreasen, Gordon H.
1974-01-01
A computer system has been developed for the systematic storage and retrieval of gravity data. All pertinent facts relating to gravity station measurements and computed Bouguer values may be retrieved either by project name or by geographical coordinates. Features of the system include visual display in the form of printer listings of gravity data and printer plots of station locations. The retrieved data format interfaces with the format of GEOPAC, a system of computer programs designed for the analysis of geophysical data.
DOE Office of Scientific and Technical Information (OSTI.GOV)
RIECK, C.A.
1999-02-25
The primary purpose of the Initial Tank Retrieval Systems (ITRS) is to provide systems for retrieval of radioactive wastes stored in underground double-shell tanks (DSTs) for transfer to alternate storage, evaporation, pretreatment or treatment, while concurrently reducing risks associated with safety watch list and other DSTs. This Description of Operations (DOO) defines the control philosophy for the waste retrieval system for tanks 241-AP-102 (AP-102) and 241-AP-104 (AP-104). This DOO will provide a basis for the detailed design of the Retrieval Control System (RCS) for AP-102 and AP-104 and establishes test criteria for the RCS. The test criteria will be used during qualification testing and acceptance testing to verify operability.
JANE, A new information retrieval system for the Radiation Shielding Information Center
DOE Office of Scientific and Technical Information (OSTI.GOV)
Trubey, D.K.
A new information storage and retrieval system has been developed for the Radiation Shielding Information Center (RSIC) at Oak Ridge National Laboratory to replace mainframe systems that have become obsolete. The database contains citations and abstracts of literature which were selected by RSIC analysts and indexed with terms from a controlled vocabulary. The database, begun in 1963, has been maintained continuously since that time. The new system, called JANE, incorporates automatic indexing techniques and on-line retrieval using the RSIC Data General Eclipse MV/4000 minicomputer. Automatic indexing and retrieval techniques based on fuzzy-set theory allow the presentation of results in order of Retrieval Status Value. The fuzzy-set membership function depends on term frequency in the titles and abstracts and on Term Discrimination Values which indicate the resolving power of the individual terms. These values are determined by the Cover Coefficient method. The use of a commercial database to store and retrieve the indexing information permits rapid retrieval of the stored documents. Comparisons of the new and presently-used systems for actual searches of the literature indicate that it is practical to replace the mainframe systems with a minicomputer system similar to the present version of JANE. 18 refs., 10 figs.
Query Log Analysis of an Electronic Health Record Search Engine
Yang, Lei; Mei, Qiaozhu; Zheng, Kai; Hanauer, David A.
2011-01-01
We analyzed a longitudinal collection of query logs of a full-text search engine designed to facilitate information retrieval in electronic health records (EHR). The collection, 202,905 queries and 35,928 user sessions recorded over a course of 4 years, represents the information-seeking behavior of 533 medical professionals, including frontline practitioners, coding personnel, patient safety officers, and biomedical researchers for patient data stored in EHR systems. In this paper, we present descriptive statistics of the queries, a categorization of information needs manifested through the queries, as well as temporal patterns of the users’ information-seeking behavior. The results suggest that information needs in the medical domain are substantially more sophisticated than those that general-purpose web search engines need to accommodate. Therefore, we envision that there exists a significant challenge, along with significant opportunities, to provide intelligent query recommendations to facilitate information retrieval in EHR. PMID:22195150
Automatic Analysis of Critical Incident Reports: Requirements and Use Cases.
Denecke, Kerstin
2016-01-01
Increasingly, critical incident reports are used as a means to increase patient safety and quality of care. The entire potential of these sources of experiential knowledge remains often unconsidered since retrieval and analysis is difficult and time-consuming, and the reporting systems often do not provide support for these tasks. The objective of this paper is to identify potential use cases for automatic methods that analyse critical incident reports. In more detail, we will describe how faceted search could offer an intuitive retrieval of critical incident reports and how text mining could support in analysing relations among events. To realise an automated analysis, natural language processing needs to be applied. Therefore, we analyse the language of critical incident reports and derive requirements towards automatic processing methods. We learned that there is a huge potential for an automatic analysis of incident reports, but there are still challenges to be solved.
The Weaknesses of Full-Text Searching
ERIC Educational Resources Information Center
Beall, Jeffrey
2008-01-01
This paper provides a theoretical critique of the deficiencies of full-text searching in academic library databases. Because full-text searching relies on matching words in a search query with words in online resources, it is an inefficient method of finding information in a database. This matching fails to retrieve synonyms, and it also retrieves…
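The synonym problem described in this abstract can be seen in a minimal sketch of exact-word matching over an inverted index. The three sample documents and the function names are hypothetical and are not taken from the paper.

```python
def build_index(docs):
    """Map each lowercased word to the set of document ids containing it."""
    index = {}
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(doc_id)
    return index

def search(index, query):
    """Return ids of documents containing every query word (exact match only)."""
    words = query.lower().split()
    if not words:
        return set()
    result = None
    for word in words:
        postings = index.get(word, set())
        result = set(postings) if result is None else result & postings
    return result

# Hypothetical documents: 1 and 2 describe the same concept in different words.
docs = {
    1: "heart attack risk factors",
    2: "myocardial infarction risk factors",
    3: "library catalog subject headings",
}
index = build_index(docs)
hits = search(index, "heart attack")
```

Here `hits` contains only document 1. Document 2 covers the same topic but shares no query word, so a purely word-matching search never returns it.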
Multi-source and ontology-based retrieval engine for maize mutant phenotypes
Green, Jason M.; Harnsomburana, Jaturon; Schaeffer, Mary L.; Lawrence, Carolyn J.; Shyu, Chi-Ren
2011-01-01
Model Organism Databases, including the various plant genome databases, collect and enable access to massive amounts of heterogeneous information, including sequence data, gene product information, images of mutant phenotypes, etc., as well as textual descriptions of many of these entities. While a variety of basic browsing and search capabilities are available to allow researchers to query and peruse the names and attributes of phenotypic data, next-generation search mechanisms that allow querying and ranking of text descriptions are much less common. In addition, the plant community needs an innovative way to leverage the existing links in these databases to search groups of text descriptions simultaneously. Furthermore, though much time and effort have been afforded to the development of plant-related ontologies, the knowledge embedded in these ontologies remains largely unused in available plant search mechanisms. Addressing these issues, we have developed a unique search engine for mutant phenotypes from MaizeGDB. This advanced search mechanism integrates various text description sources in MaizeGDB to aid a user in retrieving desired mutant phenotype information. Currently, descriptions of mutant phenotypes, loci and gene products are utilized collectively for each search, though expansion of the search mechanism to include other sources is straightforward. The retrieval engine, to our knowledge, is the first engine to exploit the content and structure of available domain ontologies, currently the Plant and Gene Ontologies, to expand and enrich retrieval results in major plant genomic databases. Database URL: http://www.PhenomicsWorld.org/QBTA.php PMID:21558151
The Impact Of Optical Storage Technology On Image Processing Systems
NASA Astrophysics Data System (ADS)
Garges, Daniel T.; Durbin, Gerald T.
1984-09-01
The recent announcement of commercially available high density optical storage devices will have a profound impact on the information processing industry. Just as the initial introduction of random access storage created entirely new processing strategies, optical technology will allow dramatic changes in the storage, retrieval, and dissemination of engineering drawings and other pictorial or text-based documents. Storage Technology Corporation has assumed a leading role in this arena with the introduction of the 7600 Optical Storage Subsystem, and the formation of StorageTek Integrated Systems, a subsidiary chartered to incorporate this new technology into deliverable total systems. This paper explores the impact of optical storage technology from the perspective of a leading-edge manufacturer and integrator.
Understanding natural language for spacecraft sequencing
NASA Technical Reports Server (NTRS)
Katz, Boris; Brooks, Robert N., Jr.
1987-01-01
The paper describes a natural language understanding system, START, that translates English text into a knowledge base. The understanding and the generating modules of START share a Grammar which is built upon reversible transformations. Users can retrieve information by querying the knowledge base in English; the system then produces an English response. START can be easily adapted to many different domains. One such domain is spacecraft sequencing. A high-level overview of sequencing as it is practiced at JPL is presented in the paper, and three areas within this activity are identified for potential application of the START system. Examples are given of an actual dialog with START based on simulated data for the Mars Observer mission.
Content Based Lecture Video Retrieval Using Speech and Video Text Information
ERIC Educational Resources Information Center
Yang, Haojin; Meinel, Christoph
2014-01-01
In the last decade e-lecturing has become more and more popular. The amount of lecture video data on the "World Wide Web" (WWW) is growing rapidly. Therefore, a more efficient method for video retrieval in WWW or within large lecture video archives is urgently needed. This paper presents an approach for automated video indexing and video…
ERIC Educational Resources Information Center
Liu, Ming-Chi; Huang, Yueh-Min; Kinshuk; Wen, Dunwei
2013-01-01
It is critical that students learn how to retrieve useful information in hypermedia environments, a task that is often especially difficult when it comes to image retrieval, as little text feedback is given that allows them to reformulate keywords they need to use. This situation may make students feel disorientated while attempting image…
ERIC Educational Resources Information Center
Salton, Gerald; And Others
The present report is the twenty-first in a series describing research in information storage and retrieval conducted by the Department of Computer Science at Cornell University. The report covering work carried out by the SMART project for approximately two years (summer 1970 to summer 1972) is separated into five parts: automatic content…
BIBLIO: A Computerized Retrieval System for Communication Education.
ERIC Educational Resources Information Center
Williams, M. Lee; Edwards, Renee
1983-01-01
Describes BIBLIO, a computer program created for the storage and retrieval of articles in the 1970-80 issues of "Communication Education." Tells how articles were coded, method used to retrieve information, and advantages and uses of the system. (PD)
NASA develops teleoperator retrieval system
NASA Technical Reports Server (NTRS)
1978-01-01
The teleoperator retrieval system vehicle was designed to reboost and/or deorbit the Skylab; however, usefulness in survey, stabilization, retrieval and delivery was examined. Thrusters, designed for cold gas propulsion, were adapted to hydrazine propulsion. Design specifications and cost analysis are given.
Passage-Based Bibliographic Coupling: An Inter-Article Similarity Measure for Biomedical Articles
Liu, Rey-Long
2015-01-01
Biomedical literature is an essential source of biomedical evidence. To translate the evidence for biomedicine study, researchers often need to carefully read multiple articles about specific biomedical issues. These articles thus need to be highly related to each other. They should share similar core contents, including research goals, methods, and findings. However, given an article r, it is challenging for search engines to retrieve highly related articles for r. In this paper, we present a technique PBC (Passage-based Bibliographic Coupling) that estimates inter-article similarity by seamlessly integrating bibliographic coupling with the information collected from context passages around important out-link citations (references) in each article. Empirical evaluation shows that PBC can significantly improve the retrieval of those articles that biomedical experts believe to be highly related to specific articles about gene-disease associations. PBC can thus be used to improve search engines in retrieving the highly related articles for any given article r, even when r is cited by very few (or even no) articles. The contribution is essential for those researchers and text mining systems that aim at cross-validating the evidence about specific gene-disease associations. PMID:26440794
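For orientation, the following is a minimal sketch of plain bibliographic coupling, the baseline that PBC builds on. It scores two articles by the references they share and does not include the passage-based weighting the abstract describes. The Jaccard normalisation, the article ids, and the reference ids are assumptions made for illustration.

```python
def coupling_strength(refs_a, refs_b):
    """Classic bibliographic coupling: number of references cited by both articles."""
    return len(set(refs_a) & set(refs_b))

def coupling_similarity(refs_a, refs_b):
    """Coupling strength normalised by the size of the combined reference list
    (Jaccard index; the choice of normalisation is an assumption)."""
    a, b = set(refs_a), set(refs_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical reference lists keyed by article id.
references = {
    "r": ["ref1", "ref2", "ref3", "ref4"],
    "a": ["ref2", "ref3", "ref5"],
    "b": ["ref6", "ref7"],
}

# Rank candidate articles by similarity to the target article r.
ranked = sorted(
    (doc for doc in references if doc != "r"),
    key=lambda doc: coupling_similarity(references["r"], references[doc]),
    reverse=True,
)
```

Because the score depends only on an article's own reference list, it can be computed even when the target article has not yet been cited by any other article.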
A general UNIX interface for biocomputing and network information retrieval software.
Kiong, B K; Tan, T W
1993-10-01
We describe a UNIX program, HYBROW, which can integrate without modification a wide range of UNIX biocomputing and network information retrieval software. HYBROW works in conjunction with a separate set of ASCII files containing embedded hypertext-like links. The program operates like a hypertext browser featuring five basic links: file link, execute-only link, execute-display link, directory-browse link and field-filling link. Useful features of the interface may be developed using combinations of these links with simple shell scripts and examples of these are briefly described. The system manager who supports biocomputing users should find the program easy to maintain, and useful in assisting new and infrequent users; it is also simple to incorporate new programs. Moreover, the individual user can customize the interface, create dynamic menus, hypertext a document, invoke shell scripts and new programs simply with a basic understanding of the UNIX operating system and any text editor. This program was written in C language and uses the UNIX curses and termcap libraries. It is freely available as a tar compressed file (by anonymous FTP from nuscc.nus.sg).
32 CFR 701.116 - PA systems of records notices overview.
Code of Federal Regulations, 2012 CFR
2012-07-01
.... (b) Retrieval practices. How a record is retrieved determines whether or not it qualifies to be a... birth, etc.) to qualify as a system of records. Accordingly, a record that contains information about an... system of records. The requirement is retrieval by a name or personal identifier.) Should a business...
32 CFR 701.116 - PA systems of records notices overview.
Code of Federal Regulations, 2013 CFR
2013-07-01
.... (b) Retrieval practices. How a record is retrieved determines whether or not it qualifies to be a... birth, etc.) to qualify as a system of records. Accordingly, a record that contains information about an... system of records. The requirement is retrieval by a name or personal identifier.) Should a business...
32 CFR 701.116 - PA systems of records notices overview.
Code of Federal Regulations, 2014 CFR
2014-07-01
.... (b) Retrieval practices. How a record is retrieved determines whether or not it qualifies to be a... birth, etc.) to qualify as a system of records. Accordingly, a record that contains information about an... system of records. The requirement is retrieval by a name or personal identifier.) Should a business...
32 CFR 701.116 - PA systems of records notices overview.
Code of Federal Regulations, 2011 CFR
2011-07-01
.... (b) Retrieval practices. How a record is retrieved determines whether or not it qualifies to be a... birth, etc.) to qualify as a system of records. Accordingly, a record that contains information about an... system of records. The requirement is retrieval by a name or personal identifier.) Should a business...
32 CFR 701.116 - PA systems of records notices overview.
Code of Federal Regulations, 2010 CFR
2010-07-01
.... (b) Retrieval practices. How a record is retrieved determines whether or not it qualifies to be a... birth, etc.) to qualify as a system of records. Accordingly, a record that contains information about an... system of records. The requirement is retrieval by a name or personal identifier.) Should a business...
A data storage, retrieval and analysis system for endocrine research. [for Skylab
NASA Technical Reports Server (NTRS)
Newton, L. E.; Johnston, D. A.
1975-01-01
This retrieval system builds, updates, retrieves, and performs basic statistical analyses on blood, urine, and diet parameters for the M071 and M073 Skylab and Apollo experiments. This system permits data entry from cards to build an indexed sequential file. Programs are easily modified for specialized analyses.
Information Retrieval: A Sequential Learning Process.
ERIC Educational Resources Information Center
Bookstein, Abraham
1983-01-01
Presents decision-theoretic models which intrinsically include retrieval of multiple documents whereby system responds to request by presenting documents to patron in sequence, gathering feedback, and using information to modify future retrievals. Document independence model, set retrieval model, sequential retrieval model, learning model,…
Chen, Hongyu; Martin, Bronwen; Daimon, Caitlin M; Maudsley, Stuart
2013-01-01
Text mining is rapidly becoming an essential technique for the annotation and analysis of large biological data sets. Biomedical literature currently increases at a rate of several thousand papers per week, making automated information retrieval methods the only feasible method of managing this expanding corpus. With the increasing prevalence of open-access journals and constant growth of publicly-available repositories of biomedical literature, literature mining has become much more effective with respect to the extraction of biomedically-relevant data. In recent years, text mining of popular databases such as MEDLINE has evolved from basic term-searches to more sophisticated natural language processing techniques, indexing and retrieval methods, structural analysis and integration of literature with associated metadata. In this review, we will focus on Latent Semantic Indexing (LSI), a computational linguistics technique increasingly used for a variety of biological purposes. It is noted for its ability to consistently outperform benchmark Boolean text searches and co-occurrence models at information retrieval and its power to extract indirect relationships within a data set. LSI has been used successfully to formulate new hypotheses, generate novel connections from existing data, and validate empirical data.
The Electronic Documentation Project in the NASA mission control center environment
NASA Technical Reports Server (NTRS)
Wang, Lui; Leigh, Albert
1994-01-01
NASA's space programs, like many other technical programs of their magnitude, are supported by a large volume of technical documents. These documents are not only diverse but also abundant. Management, maintenance, and retrieval of these documents is a challenging problem by itself; but relating and cross-referencing this wealth of information when it is all on paper is an even greater challenge. The Electronic Documentation Project (EDP) aims to provide an electronic system capable of developing, distributing and controlling changes for crew/ground controller procedures and related documents. There are two primary motives for the solution. The first is to reduce the cost of maintaining the current paper-based method of operations by replacing paper documents with electronic information storage and retrieval. The other is to improve efficiency and provide enhanced flexibility in document usage. Initially, the current paper-based system will be faithfully reproduced in an electronic format to be used in the document viewing system. In addition, this metaphor will have hypertext extensions. Hypertext features support basic functions such as full text searches, key word searches, data retrieval, and traversal between nodes of information, as well as speeding up the data access rate. They enable related but separate documents to have relationships, and allow the user to explore information naturally through non-linear link traversals. The basic operational requirements of the document viewing system are to: provide an electronic corollary to the current method of paper-based document usage; supplement and ultimately replace paper-based documents; remain focused on control center operations such as Flight Data File, Flight Rules and Console Handbook viewing; and be available NASA-wide.
Retrieval with Clustering in a Case-Based Reasoning System for Radiotherapy Treatment Planning
NASA Astrophysics Data System (ADS)
Khussainova, Gulmira; Petrovic, Sanja; Jagannathan, Rupa
2015-05-01
Radiotherapy treatment planning aims to deliver a sufficient radiation dose to cancerous tumour cells while sparing healthy organs in the tumour surrounding area. This is a trial and error process highly dependent on the medical staff's experience and knowledge. Case-Based Reasoning (CBR) is an artificial intelligence tool that uses past experiences to solve new problems. A CBR system has been developed to facilitate radiotherapy treatment planning for brain cancer. Given a new patient case the existing CBR system retrieves a similar case from an archive of successfully treated patient cases with the suggested treatment plan. The next step requires adaptation of the retrieved treatment plan to meet the specific demands of the new case. The CBR system was tested by medical physicists for the new patient cases. It was discovered that some of the retrieved cases were not suitable and could not be adapted for the new cases. This motivated us to revise the retrieval mechanism of the existing CBR system by adding a clustering stage that clusters cases based on their tumour positions. A number of well-known clustering methods were investigated and employed in the retrieval mechanism. Results using real world brain cancer patient cases have shown that the success rate of the new CBR retrieval is higher than that of the original system.
Retrieval-travel-time model for free-fall-flow-rack automated storage and retrieval system
NASA Astrophysics Data System (ADS)
Metahri, Dhiyaeddine; Hachemi, Khalid
2018-03-01
Automated storage and retrieval systems (AS/RSs) are material handling systems that are frequently used in manufacturing and distribution centers. The modelling of the retrieval-travel time of an AS/RS (expected product delivery time) is practically important, because it allows us to evaluate and improve the system throughput. The free-fall-flow-rack AS/RS has emerged as a new technology for drug distribution. This system is a new variation of flow-rack AS/RS that uses an operator or a single machine for storage operations, and uses a combination between the free-fall movement and a transport conveyor for retrieval operations. The main contribution of this paper is to develop an analytical model of the expected retrieval-travel time for the free-fall flow-rack under a dedicated storage assignment policy. The proposed model, which is based on a continuous approach, is compared for accuracy, via simulation, with discrete model. The obtained results show that the maximum deviation between the continuous model and the simulation is less than 5%, which shows the accuracy of our model to estimate the retrieval time. The analytical model is useful to optimise the dimensions of the rack, assess the system throughput, and evaluate different storage policies.
Biomedical data mining in clinical routine: expanding the impact of hospital information systems.
Müller, Marcel; Markó, Kornel; Daumke, Philipp; Paetzold, Jan; Roesner, Arnold; Klar, Rüdiger
2007-01-01
In this paper we want to describe how the promising technology of biomedical data mining can improve the use of hospital information systems: a large set of unstructured, narrative clinical data from a dermatological university hospital like discharge letters or other dermatological reports were processed through a morpho-semantic text retrieval engine ("MorphoSaurus") and integrated with other clinical data using a web-based interface and brought into daily clinical routine. The user evaluation showed a very high user acceptance - this system seems to meet the clinicians' requirements for a vertical data mining in the electronic patient records. What emerges is the need for integration of biomedical data mining into hospital information systems for clinical, scientific, educational and economic reasons.
Automated semantic indexing of figure captions to improve radiology image retrieval.
Kahn, Charles E; Rubin, Daniel L
2009-01-01
We explored automated concept-based indexing of unstructured figure captions to improve retrieval of images from radiology journals. The MetaMap Transfer program (MMTx) was used to map the text of 84,846 figure captions from 9,004 peer-reviewed, English-language articles to concepts in three controlled vocabularies from the UMLS Metathesaurus, version 2006AA. Sampling procedures were used to estimate the standard information-retrieval metrics of precision and recall, and to evaluate the degree to which concept-based retrieval improved image retrieval. Precision was estimated based on a sample of 250 concepts. Recall was estimated based on a sample of 40 concepts. The authors measured the impact of concept-based retrieval to improve upon keyword-based retrieval in a random sample of 10,000 search queries issued by users of a radiology image search engine. Estimated precision was 0.897 (95% confidence interval, 0.857-0.937). Estimated recall was 0.930 (95% confidence interval, 0.838-1.000). In 5,535 of 10,000 search queries (55%), concept-based retrieval found results not identified by simple keyword matching; in 2,086 searches (21%), more than 75% of the results were found by concept-based search alone. Concept-based indexing of radiology journal figure captions achieved very high precision and recall, and significantly improved image retrieval.
ERIC Educational Resources Information Center
Bell, Steven J.
2003-01-01
Discusses full-text databases and whether existing aggregator databases are meeting user needs. Topics include the need for better search interfaces; concepts of quality research and information retrieval; information overload; full text in electronic journal collections versus aggregator databases; underrepresentation of certain disciplines; and…
Text-mining and information-retrieval services for molecular biology
Krallinger, Martin; Valencia, Alfonso
2005-01-01
Text-mining in molecular biology - defined as the automatic extraction of information about genes, proteins and their functional relationships from text documents - has emerged as a hybrid discipline on the edges of the fields of information science, bioinformatics and computational linguistics. A range of text-mining applications have been developed recently that will improve access to knowledge for biologists and database annotators. PMID:15998455
Topological Aspects of Information Retrieval.
ERIC Educational Resources Information Center
Egghe, Leo; Rousseau, Ronald
1998-01-01
Discusses topological aspects of theoretical information retrieval, including retrieval topology; similarity topology; pseudo-metric topology; document spaces as topological spaces; Boolean information retrieval as a subsystem of any topological system; and proofs of theorems. (LRW)
Huang, Mingbo; Hu, Ding; Yu, Donglan; Zheng, Zhensheng; Wang, Kuijian
2011-12-01
Enhanced extracorporeal counterpulsation (EECP) information consists of both text and hemodynamic waveform data. At present, EECP text information has been successfully managed through a Web browser, while the management and sharing of hemodynamic waveform data through the Internet has not yet been solved. In order to manage EECP information completely, based on an in-depth analysis of the EECP hemodynamic waveform file in digital imaging and communications in medicine (DICOM) format and its disadvantages for Internet sharing, we proposed the use of the extensible markup language (XML), currently a popular Internet data exchange standard, as the storage specification for the sharing of EECP waveform data. Then we designed a web-based sharing system for EECP hemodynamic waveform data on the ASP.NET 2.0 platform. We also introduced the four main system function modules and their implementation methods: the DICOM to XML conversion module, the EECP waveform data management module, the EECP waveform retrieval and display module, and the security mechanism of the system.
Code of Federal Regulations, 2014 CFR
2014-10-01
... enhancement of mechanized claims processing and information retrieval systems. 433.112 Section 433.112 Public... processing and information retrieval systems. (a) Subject to paragraph (c) of this section, FFP is available... enhancement of a mechanized claims processing and information retrieval system only if the APD is approved by...
Code of Federal Regulations, 2012 CFR
2012-10-01
... enhancement of mechanized claims processing and information retrieval systems. 433.112 Section 433.112 Public... processing and information retrieval systems. (a) Subject to paragraph (c) of this section, FFP is available... enhancement of a mechanized claims processing and information retrieval system only if the APD is approved by...
Code of Federal Regulations, 2011 CFR
2011-10-01
... enhancement of mechanized claims processing and information retrieval systems. 433.112 Section 433.112 Public... processing and information retrieval systems. (a) Subject to paragraph (c) of this section, FFP is available... enhancement of a mechanized claims processing and information retrieval system only if the APD is approved by...
Code of Federal Regulations, 2010 CFR
2010-10-01
... enhancement of mechanized claims processing and information retrieval systems. 433.112 Section 433.112 Public... processing and information retrieval systems. (a) FFP is available at the 90 percent rate in State... information retrieval system only if the APD is approved by CMS prior to the State's expenditure of funds for...
Code of Federal Regulations, 2013 CFR
2013-10-01
... enhancement of mechanized claims processing and information retrieval systems. 433.112 Section 433.112 Public... processing and information retrieval systems. (a) Subject to paragraph (c) of this section, FFP is available... enhancement of a mechanized claims processing and information retrieval system only if the APD is approved by...
NASA Technical Reports Server (NTRS)
1973-01-01
The retrieval command subsystem reference manual for the NASA Aerospace Safety Information System (NASIS) is presented. The command subsystem may be operated conversationally or in the batch mode. Retrieval commands are categorized into search-oriented and output-oriented commands. The characteristics of ancillary commands and their application are reported.
The South Australian Department of Mines and Energy Bibliography Retrieval System.
ERIC Educational Resources Information Center
Mannik, Maire
1980-01-01
Described is the South Australian Department of Mines and Energy Bibliography Retrieval System which is a repository for a large amount of geological and related information. Instructions for retrieval are outlined, and the coding information procedures are given. (DS)
Intelligent Information Retrieval: An Introduction.
ERIC Educational Resources Information Center
Gauch, Susan
1992-01-01
Discusses the application of artificial intelligence to online information retrieval systems and describes several systems: (1) CANSEARCH, from MEDLINE; (2) Intelligent Interface for Information Retrieval (I3R); (3) Gausch's Query Reformulation; (4) Environmental Pollution Expert (EP-X); (5) PLEXUS (gardening); and (6) SCISOR (corporate…
Content-based image retrieval with ontological ranking
NASA Astrophysics Data System (ADS)
Tsai, Shen-Fu; Tsai, Min-Hsuan; Huang, Thomas S.
2010-02-01
Images are a much more powerful medium of expression than text, as the adage says: "One picture is worth a thousand words." This is because, compared with text consisting of an array of words, an image has more degrees of freedom and therefore a more complicated structure. However, the less constrained structure of images presents researchers in the computer vision community with the tough task of teaching machines to understand and organize images, especially when only a limited number of learning examples and limited background knowledge are given. The advance of internet and web technology in the past decade has changed the way humans gain knowledge. People can now exchange knowledge with others by discussing and contributing information on the web. As a result, the web pages on the internet have become a living and growing source of information. One is therefore tempted to wonder whether machines can learn from the web knowledge base as well. Indeed, it is possible to make computers learn from the internet and provide humans with more meaningful knowledge. In this work, we explore this novel possibility for image understanding applied to semantic image search. We exploit web resources to obtain links from images to keywords and a semantic ontology constituting general human knowledge. The former maps visual content to related text, in contrast to the traditional way of associating images with surrounding text; the latter provides relations between concepts so that machines can understand to what extent and in what sense an image is close to the image search query. With the aid of these two tools, the resulting image search system is content-based and, moreover, organized. The returned images are ranked and organized such that semantically similar images are grouped together and given a rank based on their semantic closeness to the input query. The novelty of the system is twofold: first, images are retrieved based not only on text cues but on their actual contents as well; second, the grouping is different from pure visual similarity clustering. More specifically, the inferred concepts of each image in the group are examined in the context of a huge concept ontology to determine their true relations with what people have in mind when doing image search.
Task-Driven Dynamic Text Summarization
ERIC Educational Resources Information Center
Workman, Terri Elizabeth
2011-01-01
The objective of this work is to examine the efficacy of natural language processing (NLP) in summarizing bibliographic text for multiple purposes. Researchers have noted the accelerating growth of bibliographic databases. Information seekers using traditional information retrieval techniques when searching large bibliographic databases are often…
Finding Relevant Data in a Sea of Languages
2016-04-26
full machine-translated text, unbiased word clouds, query-biased word clouds, and query-biased sentence...and information retrieval to automate language processing tasks so that the limited number of linguists available for analyzing text and spoken...the crime (stock market). The Cross-LAnguage Search Engine (CLASE) has already preprocessed the documents, extracting text to identify the language
Dynamics, control and sensor issues pertinent to robotic hands for the EVA retriever system
NASA Technical Reports Server (NTRS)
Mclauchlan, Robert A.
1987-01-01
Basic dynamics, sensor, control, and related artificial intelligence issues pertinent to smart robotic hands for the Extra Vehicular Activity (EVA) Retriever system are summarized and discussed. These smart hands are to be used as end effectors on arms attached to manned maneuvering units (MMU). The Retriever robotic systems, comprising the MMU, arm, and smart hands, are being developed to aid crewmen in the performance of routine EVA tasks, including tool and object retrieval. The ultimate goal is to enhance the effectiveness of EVA crewmen.
Seminal nanotechnology literature: a review.
Kostoff, Ronald N; Koytcheff, Raymond G; Lau, Clifford G Y
2009-11-01
This paper uses complementary text mining techniques to identify and retrieve the high impact (seminal) nanotechnology literature over a span of time. Following a brief scientometric analysis of the seminal articles retrieved, these seminal articles are then used as a basis for a comprehensive literature survey of nanoscience and nanotechnology. The paper ends with a global analysis of the relation of seminal nanotechnology document production to total nanotechnology document production.
A new visual navigation system for exploring biomedical Open Educational Resource (OER) videos
Zhao, Baoquan; Xu, Songhua; Lin, Shujin; Luo, Xiaonan; Duan, Lian
2016-01-01
Objective: Biomedical videos as open educational resources (OERs) are increasingly proliferating on the Internet. Unfortunately, seeking personally valuable content from among the vast corpus of quality yet diverse OER videos is nontrivial due to limitations of today’s keyword- and content-based video retrieval techniques. To address this need, this study introduces a novel visual navigation system that facilitates users’ information seeking from biomedical OER videos in mass quantity by interactively offering visual and textual navigational clues that are both semantically revealing and user-friendly. Materials and Methods: The authors collected and processed around 25 000 YouTube videos, which collectively last for a total length of about 4000 h, in the broad field of biomedical sciences for our experiment. For each video, its semantic clues are first extracted automatically through computationally analyzing audio and visual signals, as well as text either accompanying or embedded in the video. These extracted clues are subsequently stored in a metadata database and indexed by a high-performance text search engine. During the online retrieval stage, the system renders video search results as dynamic web pages using a JavaScript library that allows users to interactively and intuitively explore video content both efficiently and effectively. Results: The authors produced a prototype implementation of the proposed system, which is publicly accessible at https://patentq.njit.edu/oer. To examine the overall advantage of the proposed system for exploring biomedical OER videos, the authors further conducted a user study of a modest scale. The study results encouragingly demonstrate the functional effectiveness and user-friendliness of the new system for facilitating information seeking from and content exploration among massive biomedical OER videos. Conclusion: Using the proposed tool, users can efficiently and effectively find videos of interest, precisely locate video segments delivering personally valuable information, as well as intuitively and conveniently preview essential content of a single or a collection of videos. PMID:26335986
42 CFR 433.138 - Identifying liable third parties.
Code of Federal Regulations, 2010 CFR
2010-10-01
...) Integration with the State mechanized claims processing and information retrieval system. Basic requirement—Development of an action plan. (1) If a State has a mechanized claims processing and information retrieval... processing and information retrieval system. (2) The action plan must describe the actions and methodologies...
42 CFR 433.138 - Identifying liable third parties.
Code of Federal Regulations, 2011 CFR
2011-10-01
...) Integration with the State mechanized claims processing and information retrieval system. Basic requirement—Development of an action plan. (1) If a State has a mechanized claims processing and information retrieval... processing and information retrieval system. (2) The action plan must describe the actions and methodologies...
Elaboration over a Discourse Facilitates Retrieval in Sentence Processing
Troyer, Melissa; Hofmeister, Philip; Kutas, Marta
2016-01-01
Language comprehension requires access to stored knowledge and the ability to combine knowledge in new, meaningful ways. Previous work has shown that processing linguistically more complex expressions (‘Texas cattle rancher’ vs. ‘rancher’) leads to slow-downs in reading during initial processing, possibly reflecting effort in combining information. Conversely, when this information must subsequently be retrieved (as in filler-gap constructions), processing is facilitated for more complex expressions, possibly because more semantic cues are available during retrieval. To follow up on this hypothesis, we tested whether information distributed across a short discourse can similarly provide effective cues for retrieval. Participants read texts introducing two referents (e.g., two senators), one of whom was described in greater detail than the other (e.g., ‘The Democrat had voted for one of the senators, and the Republican had voted for the other, a man from Ohio who was running for president’). The final sentence (e.g., ‘The senator who the {Republican/Democrat} had voted for…’) contained a relative clause picking out either the Many-Cue referent (with ‘Republican’) or the One-Cue referent (with ‘Democrat’). We predicted facilitated retrieval (faster reading times) for the Many-Cue condition at the verb region (‘had voted for’), where readers could understand that ‘The senator’ is the object of the verb. As predicted, this pattern was observed at the retrieval region and continued throughout the rest of the sentence. Participants also completed the Author/Magazine Recognition Tests (ART/MRT; Stanovich and West, 1989), providing a proxy for world knowledge. Since higher ART/MRT scores may index (a) greater experience accessing relevant knowledge and/or (b) richer/more highly structured representations in semantic memory, we predicted it would be positively associated with effects of elaboration on retrieval. We did not observe the predicted interaction between ART/MRT scores and Cue condition at the retrieval region, though ART/MRT interacted with Cue condition in other locations in the sentence. In sum, we found that providing more elaborative information over the course of a text can facilitate retrieval for referents, consistent with a framework in which referential elaboration over a discourse and not just local linguistic information directly impacts information retrieval during sentence processing. PMID:27014172
A cloud-based framework for large-scale traditional Chinese medical record retrieval.
Liu, Lijun; Liu, Li; Fu, Xiaodong; Huang, Qingsong; Zhang, Xianwen; Zhang, Yin
2018-01-01
Electronic medical records are increasingly common in medical practice. The secondary use of medical records has become increasingly important. It relies on the ability to retrieve complete information about desired patient populations. How to effectively and accurately retrieve relevant medical records from large-scale medical big data is becoming a major challenge. Therefore, we propose an efficient and robust cloud-based framework for large-scale Traditional Chinese Medical Records (TCMRs) retrieval. First, we propose a parallel index building method and build a distributed search cluster; the former is used to improve the performance of index building, and the latter is used to provide highly concurrent online TCMRs retrieval. Second, a real-time multi-indexing model is proposed to ensure the latest relevant TCMRs are indexed and retrieved in real-time, and a semantics-based query expansion method and a multi-factor ranking model are proposed to improve retrieval quality. Third, we implement a template-based visualization method for displaying medical reports. The proposed parallel indexing method and distributed search cluster can improve the performance of index building and provide highly concurrent online TCMRs retrieval. The multi-indexing model can ensure the latest relevant TCMRs are indexed and retrieved in real-time. The semantics expansion method and the multi-factor ranking model can enhance retrieval quality. The template-based visualization method can enhance availability and universality, with the medical reports displayed via a friendly web interface. In conclusion, compared with current medical record retrieval systems, our system provides some advantages that are useful in improving the secondary use of large-scale traditional Chinese medical records in a cloud environment. The proposed system is more easily integrated with existing clinical systems and can be used in various scenarios. Copyright © 2017. Published by Elsevier Inc.
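To make the indexing and query-expansion ideas in the abstract above more concrete, here is a minimal single-process Python sketch. It is not the authors' framework: the function names, the sample records, and the synonym table are all hypothetical, and the parallel index building, distributed cluster, and multi-factor ranking described in the abstract are deliberately left out.

```python
from collections import defaultdict


def build_index(records):
    """Build a simple inverted index: token -> set of record ids."""
    index = defaultdict(set)
    for record_id, text in records.items():
        for token in text.lower().split():
            index[token].add(record_id)
    return index


def expand_query(terms, synonyms):
    """Add known synonyms to the query terms (hypothetical synonym table)."""
    expanded = set()
    for term in terms:
        term = term.lower()
        expanded.add(term)
        expanded.update(synonyms.get(term, ()))
    return expanded


def search(index, terms, synonyms):
    """Return the sorted ids of records matching any expanded query term."""
    hits = set()
    for term in expand_query(terms, synonyms):
        hits |= index.get(term, set())
    return sorted(hits)


# Hypothetical sample data, for illustration only.
records = {
    "r1": "patient reports cough and fever",
    "r2": "pyrexia treated with herbal decoction",
    "r3": "routine follow-up visit",
}
synonyms = {"fever": ["pyrexia"]}

index = build_index(records)
results = search(index, ["fever"], synonyms)
```

In this toy example the expanded query also reaches the record that mentions only the synonym, which is the kind of recall gain that query expansion is generally meant to provide.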
Information retrieval and terminology extraction in online resources for patients with diabetes.
Seljan, Sanja; Baretić, Maja; Kucis, Vlasta
2014-06-01
Terminology use, as a means of information retrieval or document indexing, plays an important role in health literacy. Specific types of users, e.g., patients with diabetes, need access to various online resources (in a foreign and/or native language) when searching for information on self-education in basic diabetic knowledge, on self-care activities regarding the importance of dietetic food, medications, and physical exercise, and on self-management of insulin pumps. Automatic extraction of corpus-based terminology from online texts, manuals or professional papers can help in building terminology lists or lists of "browsing phrases" useful in information retrieval or in document indexing. Specific terminology lists represent an intermediate step between free text search and controlled vocabulary, and between users' demands and existing online resources in native and foreign languages. The research, which aims to detect the role of terminology in online resources, is conducted on English and Croatian manuals and Croatian online texts, and is divided into three interrelated parts: i) comparison of professional and popular terminology use; ii) evaluation of automatic statistically-based terminology extraction on English and Croatian texts; iii) comparison and evaluation of terminology extracted from an English manual using statistical and hybrid approaches. Extracted terminology candidates are evaluated by comparison with three types of reference lists: a list created by a medical professional, a list of highly professional vocabulary contained in MeSH, and a list created by non-medical persons, made as the intersection of 15 lists. Results report on the use of popular and professional terminology in online diabetes resources, on the evaluation of automatically extracted terminology candidates in English and Croatian texts, and on the comparison of statistical and hybrid extraction methods in English text. Evaluation of automatic and semi-automatic terminology extraction methods is performed by recall, precision and f-measure.
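As an illustration of the evaluation measures named in this abstract, the following minimal Python sketch computes precision, recall and F-measure for a list of extracted term candidates against a reference list. The function name and the sample terms are hypothetical and are not taken from the study; the balanced F-measure (F1) is assumed, since the abstract does not say which variant was used.

```python
def evaluate_terms(extracted, reference):
    """Return (precision, recall, f_measure) for extracted terms vs. a reference list."""
    extracted_set = {term.lower() for term in extracted}
    reference_set = {term.lower() for term in reference}
    true_positives = len(extracted_set & reference_set)

    # Precision: share of extracted candidates that appear in the reference list.
    precision = true_positives / len(extracted_set) if extracted_set else 0.0
    # Recall: share of reference terms that were actually extracted.
    recall = true_positives / len(reference_set) if reference_set else 0.0

    # Balanced F-measure (harmonic mean of precision and recall); assumed variant.
    if precision + recall == 0:
        f_measure = 0.0
    else:
        f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical example data, for illustration only.
extracted = ["insulin pump", "blood glucose", "diet", "exercise"]
reference = ["insulin pump", "blood glucose", "hypoglycemia"]
precision, recall, f_measure = evaluate_terms(extracted, reference)
```

Here two of the four extracted candidates match the three reference terms, so precision is 2/4, recall is 2/3, and the balanced F-measure is 4/7.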
NASA Technical Reports Server (NTRS)
Dominick, Wayne D. (Editor); Liu, I-Hsiung
1985-01-01
The currently developed multi-level language interfaces of information systems are generally designed for experienced users. These interfaces commonly ignore the nature and needs of the largest user group, i.e., casual users. This research identifies the importance of natural language query system research within information storage and retrieval system development; addresses the topics of developing such a query system; and finally, proposes a framework for the development of natural language query systems in order to facilitate the communication between casual users and information storage and retrieval systems.
RobotReviewer: evaluation of a system for automatically assessing bias in clinical trials.
Marshall, Iain J; Kuiper, Joël; Wallace, Byron C
2016-01-01
To develop and evaluate RobotReviewer, a machine learning (ML) system that automatically assesses bias in clinical trials. From a (PDF-formatted) trial report, the system should determine risks of bias for the domains defined by the Cochrane Risk of Bias (RoB) tool, and extract supporting text for these judgments. We algorithmically annotated 12,808 trial PDFs using data from the Cochrane Database of Systematic Reviews (CDSR). Trials were labeled as being at low or high/unclear risk of bias for each domain, and sentences were labeled as being informative or not. This dataset was used to train a multi-task ML model. We estimated the accuracy of ML judgments versus humans by comparing trials with two or more independent RoB assessments in the CDSR. Twenty blinded experienced reviewers rated the relevance of supporting text, comparing ML output with equivalent (human-extracted) text from the CDSR. By retrieving the top 3 candidate sentences per document (top3 recall), the best ML text was rated more relevant than text from the CDSR, but not significantly (60.4% ML text rated 'highly relevant' v 56.5% of text from reviews; difference +3.9%, [-3.2% to +10.9%]). Model RoB judgments were less accurate than those from published reviews, though the difference was <10% (overall accuracy 71.0% with ML v 78.3% with CDSR). Risk of bias assessment may be automated with reasonable accuracy. Automatically identified text supporting bias assessment is of equal quality to the manually identified text in the CDSR. This technology could substantially reduce reviewer workload and expedite evidence syntheses. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association.
Retrieval Demands Adaptively Change Striatal Old/New Signals and Boost Subsequent Long-Term Memory.
Herweg, Nora A; Sommer, Tobias; Bunzeck, Nico
2018-01-17
The striatum is a central part of the dopaminergic mesolimbic system and contributes both to the encoding and retrieval of long-term memories. In this regard, the co-occurrence of striatal novelty and retrieval success effects in independent studies underlines the structure's double duty and suggests dynamic contextual adaptation. To test this hypothesis and further investigate the underlying mechanisms of encoding and retrieval dynamics, human subjects viewed pre-familiarized scene images intermixed with new scenes and classified them as indoor versus outdoor (encoding task) or old versus new (retrieval task), while fMRI and eye tracking data were recorded. Subsequently, subjects performed a final recognition task. As hypothesized, striatal activity and pupil size reflected task-conditional salience of old and new stimuli, but, unexpectedly, this effect was not reflected in the substantia nigra and ventral tegmental area (SN/VTA), medial temporal lobe, or subsequent memory performance. Instead, subsequent memory generally benefitted from retrieval, an effect possibly driven by task difficulty and activity in a network including different parts of the striatum and SN/VTA. Our findings extend memory models of encoding and retrieval dynamics by pinpointing a specific contextual factor that differentially modulates the functional properties of the mesolimbic system. SIGNIFICANCE STATEMENT The mesolimbic system is involved in the encoding and retrieval of information but it is unclear how these two processes are achieved within the same network of brain regions. In particular, memory retrieval and novelty encoding were considered in independent studies, implying that novelty (new > old) and retrieval success (old > new) effects may co-occur in the striatum. Here, we used a common framework implicating the striatum, but not other parts of the mesolimbic system, in tracking context-dependent salience of old and new information. 
The current study, therefore, paves the way for a more comprehensive understanding of the functional properties of the mesolimbic system during memory encoding and retrieval. Copyright © 2018 the authors.
Describe yourself to improve your autobiographical memory: A study in Alzheimer's disease.
El Haj, Mohamad; Antoine, Pascal
2017-03-01
This study investigated whether retrieval of information related to conceptual self (i.e., self-images that encompass general factual and evaluative knowledge of one's identity) would improve autobiographical memory in Alzheimer's disease (AD). Participants with AD and controls were asked to retrieve autobiographical memories after providing statements to the question "Who am I?" and after a control condition consisting of reading a general text. Autobiographical recall was analyzed with respect to specificity (general vs specific event), context recall (information describing the "when, where, and who" as well as affective states), and reliving (the subjective experience of recall). AD participants showed higher specificity, context recall and reliving after the "Who am I?" statements than after the text reading, and controls showed higher context recall after the former than after the latter condition. These findings highlight the relationship between self and autobiographical memory in AD and demonstrate how retrieval of information related to conceptual self may influence autobiographical memory in the disease. Copyright © 2017 Elsevier Ltd. All rights reserved.
Improving e-book access via a library-developed full-text search tool.
Foust, Jill E; Bergen, Phillip; Maxeiner, Gretchen L; Pawlowski, Peter N
2007-01-01
This paper reports on the development of a tool for searching the contents of licensed full-text electronic book (e-book) collections. The Health Sciences Library System (HSLS) provides services to the University of Pittsburgh's medical programs and large academic health system. The HSLS has developed an innovative tool for federated searching of its e-book collections. Built using the XML-based Vivísimo development environment, the tool enables a user to perform a full-text search of over 2,500 titles from the library's seven most highly used e-book collections. From a single "Google-style" query, results are returned as an integrated set of links pointing directly to relevant sections of the full text. Results are also grouped into categories that enable more precise retrieval without reformulation of the search. A heuristic evaluation demonstrated the usability of the tool and a web server log analysis indicated an acceptable level of usage. Based on its success, there are plans to increase the number of online book collections searched. This library's first foray into federated searching has produced an effective tool for searching across large collections of full-text e-books and has provided a good foundation for the development of other library-based federated searching products.
Improving e-book access via a library-developed full-text search tool
Foust, Jill E.; Bergen, Phillip; Maxeiner, Gretchen L.; Pawlowski, Peter N.
2007-01-01
Purpose: This paper reports on the development of a tool for searching the contents of licensed full-text electronic book (e-book) collections. Setting: The Health Sciences Library System (HSLS) provides services to the University of Pittsburgh's medical programs and large academic health system. Brief Description: The HSLS has developed an innovative tool for federated searching of its e-book collections. Built using the XML-based Vivísimo development environment, the tool enables a user to perform a full-text search of over 2,500 titles from the library's seven most highly used e-book collections. From a single “Google-style” query, results are returned as an integrated set of links pointing directly to relevant sections of the full text. Results are also grouped into categories that enable more precise retrieval without reformulation of the search. Results/Evaluation: A heuristic evaluation demonstrated the usability of the tool and a web server log analysis indicated an acceptable level of usage. Based on its success, there are plans to increase the number of online book collections searched. Conclusion: This library's first foray into federated searching has produced an effective tool for searching across large collections of full-text e-books and has provided a good foundation for the development of other library-based federated searching products. PMID:17252065
Similarity study on chloride corrosion of prestressed concrete in marine atmosphere
NASA Astrophysics Data System (ADS)
Li, Congqi; Wang, Ruojun; Liu, Ronggui
2018-02-01
Experimental investigation of the burning of mixed and synthetic fuel counterflow burner module
NASA Astrophysics Data System (ADS)
Kononova, V. V.; Gur'yanov, A. I.
2017-11-01
Term Relevance Weights in On-Line Information Retrieval
ERIC Educational Resources Information Center
Salton, G.; Waldstein, R. K.
1978-01-01
Term relevance weighting systems in interactive information retrieval are reviewed. An experiment in which information retrieval users ranked query terms in decreasing order of presumed importance prior to actual search and retrieval is described. (Author/KP)
AP-102/104 Retrieval control system qualification test procedure
DOE Office of Scientific and Technical Information (OSTI.GOV)
RIECK, C.A.
1999-05-18
This Qualification Test Procedure documents the results of the qualification testing that was performed on the Project W-211, "Initial Tank Retrieval Systems," retrieval control system (RCS) for tanks 241-AP-102 and 241-AP-104. The results confirm that the RCS has been programmed correctly and that the two related hardware enclosures have been assembled in accordance with the design documents.
ERIC Educational Resources Information Center
Liu, Chang
2012-01-01
When using information retrieval (IR) systems, users often pose short and ambiguous query terms. It is critical for IR systems to obtain more accurate representation of users' information need, their document preferences, and the context they are working in, and then incorporate them into the design of the systems to tailor retrieval to…
ERIC Educational Resources Information Center
Air Force Systems Command, Wright-Patterson AFB, OH. Foreign Technology Div.
The role and place of the machine in scientific and technical information is explored including: basic trends in the development of information retrieval systems; preparation of engineering and scientific cadres with respect to mechanization and automation of information works; the logic of descriptor retrieval systems; the 'SETKA-3' automated…
A Framework for Evaluation and Optimization of Relevance and Novelty-Based Retrieval
ERIC Educational Resources Information Center
Lad, Abhimanyu
2011-01-01
There has been growing interest in building and optimizing retrieval systems with respect to relevance and novelty of information, which together more realistically reflect the usefulness of a system as perceived by the user. How to combine these criteria into a single metric that can be used to measure as well as optimize retrieval systems is an…
Kingfisher: a system for remote sensing image database management
NASA Astrophysics Data System (ADS)
Bruzzo, Michele; Giordano, Ferdinando; Dellepiane, Silvana G.
2003-04-01
At present, retrieval methods in remote sensing image databases are mainly based on spatial-temporal information. The increasing number of images to be collected by the ground stations of earth observing systems emphasizes the need for database management with intelligent data retrieval capabilities. The purpose of the proposed method is to realize a new content-based retrieval system for remote sensing image databases with an innovative search tool based on image similarity. This methodology is quite innovative for this application: at present many systems exist for photographic images, for example QBIC and IKONA, but they are not able to properly extract and describe remote image content. The target database consists of an archive of images originating from an X-SAR sensor (spaceborne mission, 1994). The best content descriptors, mainly texture parameters, guarantee high retrieval performance and can be extracted without losses independently of image resolution. The latter property allows the DBMS (Database Management System) to process a small amount of information, as in the case of quick-look images, improving time performance and memory access without reducing retrieval accuracy. The matching technique has been designed to enable image management (database population and retrieval) independently of dimensions (width and height). Local and global content descriptors are compared with the query image during the retrieval phase, and results seem to be very encouraging.
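The similarity-based search described in this abstract can be illustrated with a minimal Python sketch that ranks stored images by the distance between their content descriptors and those of a query image. Everything here is an assumption made for illustration: the abstract does not state which matching metric or descriptor layout the Kingfisher system uses, so plain Euclidean distance over short, made-up descriptor vectors stands in for it.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_by_similarity(query_descriptor, database):
    """Return image ids sorted from most to least similar to the query."""
    return sorted(
        database,
        key=lambda image_id: euclidean(query_descriptor, database[image_id]),
    )


# Hypothetical texture descriptors, for illustration only.
database = {
    "scene_a": [0.10, 0.80, 0.30],
    "scene_b": [0.90, 0.20, 0.50],
    "scene_c": [0.15, 0.75, 0.35],
}
ranking = rank_by_similarity([0.12, 0.78, 0.32], database)
```

Because only fixed-length descriptor vectors are compared, a ranking of this kind does not depend on the width or height of the underlying images, which is consistent with the dimension-independent matching the abstract describes.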
System engineering approach to GPM retrieval algorithms
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rose, C. R.; Chandrasekar, V.
2004-01-01
System engineering principles and methods are very useful in large-scale complex systems for developing the engineering requirements from end-user needs. Integrating research into system engineering is a challenging task. The proposed Global Precipitation Mission (GPM) satellite will use a dual-wavelength precipitation radar to measure and map global precipitation with unprecedented accuracy, resolution and areal coverage. The satellite vehicle, precipitation radars, retrieval algorithms, and ground validation (GV) functions are all critical subsystems of the overall GPM system and each contributes to the success of the mission. Errors in the radar measurements and models can adversely affect the retrieved output values. Groundmore » validation (GV) systems are intended to provide timely feedback to the satellite and retrieval algorithms based on measured data. These GV sites will consist of radars and DSD measurement systems and also have intrinsic constraints. One of the retrieval algorithms being studied for use with GPM is the dual-wavelength DSD algorithm that does not use the surface reference technique (SRT). The underlying microphysics of precipitation structures and drop-size distributions (DSDs) dictate the types of models and retrieval algorithms that can be used to estimate precipitation. Many types of dual-wavelength algorithms have been studied. Meneghini (2002) analyzed the performance of single-pass dual-wavelength surface-reference-technique (SRT) based algorithms. Mardiana (2003) demonstrated that a dual-wavelength retrieval algorithm could be successfully used without the use of the SRT. It uses an iterative approach based on measured reflectivities at both wavelengths and complex microphysical models to estimate both No and Do at each range bin. More recently, Liao (2004) proposed a solution to the Do ambiguity problem in rain within the dual-wavelength algorithm and showed a possible melting layer model based on stratified spheres. 
With the No and Do calculated at each bin, the rain rate can then be calculated based on a suitable rain-rate model. This paper develops a system engineering interface to the retrieval algorithms while remaining cognizant of system engineering issues so that it can be used to bridge the divide between algorithm physics and overall mission requirements. Additionally, in line with the systems approach, a methodology is developed such that the measurement requirements pass through the retrieval model and other subsystems and manifest themselves as measurement and other system constraints. A systems model has been developed for the retrieval algorithm that can be evaluated through system-analysis tools such as MATLAB/Simulink.
NASA Technical Reports Server (NTRS)
Wind, Galina; Riedi, Jerome; Platnick, Steven; Heidinger, Andrew
2014-01-01
The Cross-platform HIgh resolution Multi-instrument AtmosphEric Retrieval Algorithms (CHIMAERA) system allows us to perform MODIS-like cloud top, optical and microphysical properties retrievals on any sensor that possesses a minimum set of common spectral channels. The CHIMAERA system uses a shared-core architecture that takes the retrieval method out of the equation when intercomparisons are made. Here we show an example of such a retrieval and a comparison of simultaneous retrievals done using the SEVIRI, MODIS and VIIRS sensors. All sensor retrievals are performed using the CLAVR-x (or CLAVR-x based) cloud top properties algorithm. SEVIRI uses the SAF_NWC cloud mask. MODIS and VIIRS use the IFF-based cloud mask, which is an algorithm shared between MODIS and VIIRS. The MODIS and VIIRS retrievals are performed using a VIIRS branch of CHIMAERA that limits the available MODIS channel set. Even though certain MODIS products, such as the multilayer cloud map, are not available in that mode, the cloud retrieval remains fully equivalent to operational Data Collection 6.
Morrison, Norman; Hancock, David; Hirschman, Lynette; Dawyndt, Peter; Verslyppe, Bert; Kyrpides, Nikos; Kottmann, Renzo; Yilmaz, Pelin; Glöckner, Frank Oliver; Grethe, Jeff; Booth, Tim; Sterk, Peter; Nenadic, Goran; Field, Dawn
2011-04-29
In the future, we hope to see an open and thriving data market in which users can find and select data from a wide range of data providers. In such an open access market, data are products that must be packaged accordingly. Increasingly, eCommerce sellers present heterogeneous product lines to buyers using faceted browsing. Using this approach we have developed the Ontogrator platform, which allows for rapid retrieval of data in a way that would be familiar to any online shopper. Using Knowledge Organization Systems (KOS), especially ontologies, Ontogrator uses text mining to mark up data and faceted browsing to help users navigate, query and retrieve data. Ontogrator offers the potential to impact scientific research in two major ways: 1) by significantly improving the retrieval of relevant information; and 2) by significantly reducing the time required to compose standard database queries and assemble information for further research. Here we present a pilot implementation developed in collaboration with the Genomic Standards Consortium (GSC) that includes content from the StrainInfo, GOLD, CAMERA, Silva and Pubmed databases. This implementation demonstrates the power of ontogration and highlights that the usefulness of this approach is fully dependent on both the quality of data and the KOS (ontologies) used. Ideally, the use and further expansion of this collaborative system will help to surface issues associated with the underlying quality of annotation and could lead to a systematic means for accessing integrated data resources.
Morrison, Norman; Hancock, David; Hirschman, Lynette; Dawyndt, Peter; Verslyppe, Bert; Kyrpides, Nikos; Kottmann, Renzo; Yilmaz, Pelin; Glöckner, Frank Oliver; Grethe, Jeff; Booth, Tim; Sterk, Peter; Nenadic, Goran; Field, Dawn
2011-01-01
In the future, we hope to see an open and thriving data market in which users can find and select data from a wide range of data providers. In such an open access market, data are products that must be packaged accordingly. Increasingly, eCommerce sellers present heterogeneous product lines to buyers using faceted browsing. Using this approach we have developed the Ontogrator platform, which allows for rapid retrieval of data in a way that would be familiar to any online shopper. Using Knowledge Organization Systems (KOS), especially ontologies, Ontogrator uses text mining to mark up data and faceted browsing to help users navigate, query and retrieve data. Ontogrator offers the potential to impact scientific research in two major ways: 1) by significantly improving the retrieval of relevant information; and 2) by significantly reducing the time required to compose standard database queries and assemble information for further research. Here we present a pilot implementation developed in collaboration with the Genomic Standards Consortium (GSC) that includes content from the StrainInfo, GOLD, CAMERA, Silva and Pubmed databases. This implementation demonstrates the power of ontogration and highlights that the usefulness of this approach is fully dependent on both the quality of data and the KOS (ontologies) used. Ideally, the use and further expansion of this collaborative system will help to surface issues associated with the underlying quality of annotation and could lead to a systematic means for accessing integrated data resources. PMID:21677865
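The faceted browsing described above can be pictured with a small sketch. This is not the Ontogrator implementation: the record fields, facet names, and values below are hypothetical and chosen only to show how selecting one facet value narrows the result set and updates the counts offered for the remaining facets.

```python
from collections import Counter

# Hypothetical records: field names and values are invented for illustration
# and do not describe real entries in any of the named databases.
RECORDS = [
    {"source": "GOLD", "habitat": "marine", "organism_type": "bacteria"},
    {"source": "GOLD", "habitat": "soil", "organism_type": "bacteria"},
    {"source": "CAMERA", "habitat": "marine", "organism_type": "archaea"},
    {"source": "StrainInfo", "habitat": "marine", "organism_type": "bacteria"},
]

def select(records, **chosen):
    """Keep only the records that match every chosen facet value."""
    return [r for r in records if all(r.get(f) == v for f, v in chosen.items())]

def facet_counts(records, facet):
    """Count how many of the given records fall under each value of a facet."""
    return Counter(r[facet] for r in records if facet in r)

# A user clicks "habitat: marine"; the remaining facets are recounted.
marine = select(RECORDS, habitat="marine")
print(len(marine))
print(facet_counts(marine, "organism_type"))
```

Each further click would simply add another keyword argument to `select`, which is why this style of navigation needs no hand-written database query.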
Total Bregman Divergence and its Applications to Shape Retrieval.
Liu, Meizhu; Vemuri, Baba C; Amari, Shun-Ichi; Nielsen, Frank
2010-01-01
Shape database search is ubiquitous in the world of biometric systems, CAD systems etc. Shape data in these domains is experiencing an explosive growth and usually requires search of whole shape databases to retrieve the best matches with accuracy and efficiency for a variety of tasks. In this paper, we present a novel divergence measure between any two given points in [Formula: see text] or two distribution functions. This divergence measures the orthogonal distance between the tangent to the convex function (used in the definition of the divergence) at one of its input arguments and its second argument. This is in contrast to the ordinate distance taken in the usual definition of the Bregman class of divergences [4]. We use this orthogonal distance to redefine the Bregman class of divergences and develop a new theory for estimating the center of a set of vectors as well as probability distribution functions. The new class of divergences is dubbed the total Bregman divergence (TBD). We present the l1-norm based TBD center, dubbed the t-center, which is then used as a cluster center of a class of shapes. The t-center is a weighted mean, and this weight is small for noise and outliers. We present a shape retrieval scheme using TBD and the t-center for representing the classes of shapes from the MPEG-7 database and compare the results with other state-of-the-art methods in the literature.
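To make the contrast between the ordinate and orthogonal distances concrete, here is a minimal sketch for one convex generator, f(x) = ||x||^2. The choice of generator and the function names are assumptions made for illustration; the normalization by sqrt(1 + ||grad f(y)||^2) follows the usual statement of the total Bregman divergence rather than anything spelled out in this abstract.

```python
import math

def f(x):
    """Convex generator assumed for this sketch: squared Euclidean norm."""
    return sum(v * v for v in x)

def grad_f(x):
    """Gradient of the assumed generator."""
    return [2.0 * v for v in x]

def bregman(x, y):
    """Ordinary Bregman divergence: vertical (ordinate) gap between f(x)
    and the tangent plane to f taken at y."""
    g = grad_f(y)
    return f(x) - f(y) - sum((a - b) * gi for a, b, gi in zip(x, y, g))

def total_bregman(x, y):
    """Total Bregman divergence: the same gap measured orthogonally to the
    tangent plane, i.e. divided by sqrt(1 + ||grad f(y)||^2)."""
    g = grad_f(y)
    return bregman(x, y) / math.sqrt(1.0 + sum(gi * gi for gi in g))

print(bregman([0.0, 0.0], [3.0, 4.0]))
print(total_bregman([0.0, 0.0], [3.0, 4.0]))
```

The denominator grows with the gradient at the second argument, which gives some intuition for why points far from the bulk of the data receive small weights in the t-center.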
Text grouping in patent analysis using adaptive K-means clustering algorithm
NASA Astrophysics Data System (ADS)
Shanie, Tiara; Suprijadi, Jadi; Zulhanif
2017-03-01
Patents are a form of intellectual property. Analyzing patents is one requirement for understanding how technology is developing in each country and worldwide. This study uses patent documents about green tea obtained from the Espacenet server. Patent documents related to tea technology are still widely scattered, which makes information retrieval (IR) difficult for users. It is therefore necessary to categorize the documents into specific groups according to the related terms they contain. This study applies statistical text mining to the titles of green tea patents in two phases: data preparation and data analysis. The data preparation phase uses text mining methods, and the data analysis phase uses statistics. The statistical analysis is a cluster analysis using the Adaptive K-Means Clustering Algorithm. Based on the maximum silhouette value, the results yield 87 clusters associated with fifteen terms that can be used in the information retrieval process.
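Choosing the number of clusters by the maximum silhouette value can be sketched as follows. This does not reproduce the study's Adaptive K-Means algorithm: the points and the two candidate labelings are hypothetical toy data, and the sketch only shows how a mean silhouette width is computed and used to pick between candidate clusterings.

```python
import math

def mean_silhouette(points, labels):
    """Mean silhouette width over all points, using Euclidean distance.
    Points in singleton clusters contribute 0, a common convention."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    total = 0.0
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

# Hypothetical toy data: two tight groups far apart.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
candidates = {2: [0, 0, 1, 1], 3: [0, 0, 1, 2]}  # k -> labels from some clusterer

best_k = max(candidates, key=lambda k: mean_silhouette(points, candidates[k]))
print(best_k)
```

In practice the candidate labelings would come from running the clustering algorithm once per candidate k, and the k with the highest mean silhouette would be kept.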
Multilingual Information Retrieval in Thoracic Radiology: Feasibility Study
Castilla, André Coutinho; Furuie, Sérgio Shiguemi; Mendonça, Eneida A.
2014-01-01
Most of the essential information contained in electronic medical records is stored as text, which imposes several difficulties on automated data extraction and retrieval. Natural language processing (NLP) is an approach that can unlock clinical information from free text. The proposed methodology uses the specialized natural language processor MEDLEE, developed for the English language. To use this processor on Portuguese medical texts, chest x-ray reports were machine translated (MT) into English. The result of the serial coupling of MT and NLP is tagged text, which needs further investigation for extracting clinical findings. The objective of this experiment was to investigate normal reports and reports with device description on a set of 165 chest x-ray reports. We obtained sensitivity and specificity of 1 and 0.71 for the first condition and 0.97 and 0.97 for the second, respectively. The reference was formed by the opinion of two radiologists. The results of this experiment indicate the viability of extracting clinical findings from chest x-ray reports through coupling MT and NLP. PMID:17911745
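For readers less familiar with the two reported measures, the sketch below shows how sensitivity and specificity are obtained by comparing system output with a reference standard. The labels are invented toy values, not the study's data, and the function name is an assumption.

```python
def sensitivity_specificity(predicted, reference):
    """Return (sensitivity, specificity) for paired boolean labels.
    Assumes the reference contains at least one positive and one negative."""
    tp = sum(1 for p, r in zip(predicted, reference) if p and r)
    fn = sum(1 for p, r in zip(predicted, reference) if not p and r)
    tn = sum(1 for p, r in zip(predicted, reference) if not p and not r)
    fp = sum(1 for p, r in zip(predicted, reference) if p and not r)
    return tp / (tp + fn), tn / (tn + fp)

# Invented toy labels: True means "report flagged as normal".
system = [True, True, True, False, False, True]
radiologists = [True, True, False, False, False, False]

sens, spec = sensitivity_specificity(system, radiologists)
print(sens, spec)
```

A sensitivity of 1 paired with a lower specificity, as reported for the first condition, corresponds to a system that misses no true cases but raises some false alarms.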
42 CFR 433.117 - Initial approval of replacement systems.
Code of Federal Regulations, 2011 CFR
2011-10-01
... and Information Retrieval Systems § 433.117 Initial approval of replacement systems. (a) A replacement... information retrieval system. (b) The agency must submit a APD that includes— (1) The date the replacement...
42 CFR 433.117 - Initial approval of replacement systems.
Code of Federal Regulations, 2012 CFR
2012-10-01
... and Information Retrieval Systems § 433.117 Initial approval of replacement systems. (a) A replacement... information retrieval system. (b) The agency must submit a APD that includes— (1) The date the replacement...
42 CFR 433.117 - Initial approval of replacement systems.
Code of Federal Regulations, 2013 CFR
2013-10-01
... and Information Retrieval Systems § 433.117 Initial approval of replacement systems. (a) A replacement... information retrieval system. (b) The agency must submit a APD that includes— (1) The date the replacement...
42 CFR 433.117 - Initial approval of replacement systems.
Code of Federal Regulations, 2014 CFR
2014-10-01
... and Information Retrieval Systems § 433.117 Initial approval of replacement systems. (a) A replacement... information retrieval system. (b) The agency must submit a APD that includes— (1) The date the replacement...
WITHDRAWN: Resorbable versus titanium plates for facial fractures.
Dorri, Mojtaba; Oliver, Richard
2018-05-23
Rigid internal fixation of the jaw bones is a routine procedure for the management of facial fractures. Titanium plates and screws are routinely used for this purpose. The limitations of this system have led to the development of plates manufactured from bioresorbable materials which, in some cases, omit the necessity for a second surgery. However, concerns remain about the stability of fixation, the length of time required for their degradation, and the possibility of foreign body reactions. To compare the effectiveness of bioresorbable fixation systems with titanium systems for the management of facial fractures. We searched the following databases: The Cochrane Oral Health Group's Trials Register (to 20th August 2008), the Cochrane Central Register of Controlled Trials (CENTRAL) (The Cochrane Library 2008, Issue 3), MEDLINE (1950 to 20th August 2008), EMBASE (from 1980 to 20th August 2008), http://www.clinicaltrials.gov/ and http://www.controlled-trials.com (to 20th August 2008). Randomised controlled trials comparing resorbable versus titanium fixation systems used for facial fractures. Retrieved studies were independently screened by two review authors. Results were to be expressed as random-effects models using mean differences for continuous outcomes and risk ratios for dichotomous outcomes with 95% confidence intervals. Heterogeneity was to be investigated including both clinical and methodological factors. The search strategy retrieved 53 potentially eligible studies. None of the retrieved studies met our inclusion criteria and all were excluded from this review. One study is awaiting classification as we failed to obtain the full text copy. Three ongoing trials were retrieved, two of which were stopped before recruiting the planned number of participants. In one study, the excess complications in the resorbable arm were declared as the reason for stopping the trial.
This review illustrates that there are no published randomised controlled clinical trials relevant to this review question. There is currently insufficient evidence for the effectiveness of resorbable fixation systems compared with conventional titanium systems for facial fractures. The findings of this review, based on the results of the aborted trials, do not suggest that resorbable plates are as effective as titanium plates. In future, the results of ongoing clinical trials may provide high level reliable evidence for assisting clinicians and patients for decision making. Trialists should design their studies accurately and comprehensively to meet the aims and objectives defined for the study.
Development of a Search Strategy for an Evidence Based Retrieval Service
Ho, Gah Juan; Liew, Su May; Ng, Chirk Jenn; Hisham Shunmugam, Ranita; Glasziou, Paul
2016-01-01
Background Physicians are often encouraged to locate answers for their clinical queries via an evidence-based literature search approach. The methods used are often not clearly specified. Inappropriate search strategies, time constraints and contradictory information complicate evidence retrieval. Aims Our study aimed to develop a search strategy to answer clinical queries among physicians in a primary care setting. Methods Six clinical questions on different medical conditions seen in primary care were formulated. A series of experimental searches to answer each question was conducted on 3 commonly advocated medical databases. We compared search results from a PICO (patients, intervention, comparison, outcome) framework for questions using different combinations of PICO elements. We also compared outcomes from doing searches using text words, Medical Subject Headings (MeSH), or a combination of both. All searches were documented using screenshots and saved search strategies. Results Answers to all 6 questions using the PICO framework were found. A higher number of systematic reviews were obtained using a 2 PICO element search compared to a 4 element search. A more optimal choice of search is a combination of both text words and MeSH terms. Despite searching using the Systematic Review filter, many non-systematic reviews or narrative reviews were found in PubMed. There was poor overlap between outcomes of searches using different databases. The duration of search and screening for the 6 questions ranged from 1 to 4 hours. Conclusion This strategy has been shown to be feasible and can provide evidence to doctors’ clinical questions. It has the potential to be incorporated into an interventional study to determine the impact of an online evidence retrieval system. PMID:27935993
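The combination of text words and MeSH terms across PICO elements can be illustrated by assembling a PubMed-style query string. The clinical terms below are hypothetical examples, not the study's six questions, and using the same string as both the MeSH heading and the text word is a simplification, since a real heading often differs from the free-text wording.

```python
def pico_block(term):
    """One PICO element searched both as a MeSH heading and as a text word."""
    return f'("{term}"[MeSH Terms] OR "{term}"[Title/Abstract])'

def build_query(elements):
    """AND together one block per PICO element supplied."""
    return " AND ".join(pico_block(term) for term in elements)

# Hypothetical two-element search (patient condition + intervention).
two_element = build_query(["hypertension", "amlodipine"])
# Hypothetical four-element search adds a comparison and an outcome.
four_element = build_query(["hypertension", "amlodipine", "placebo", "stroke"])

print(two_element)
print(four_element)
```

Every added element is another AND clause, so a four-element query can only return a subset of what the two-element query returns, which fits the finding that the 2 element search yielded more systematic reviews.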
Memory retrieval of everyday information under stress.
Stock, Lisa-Marie; Merz, Christian J
2018-07-01
Psychosocial stress is known to crucially influence learning and memory processes. Several studies have already shown an impairing effect of elevated cortisol concentrations on memory retrieval. These studies mainly used learning material consisting of stimuli with a limited ecological validity. When using material with a social contextual component or educationally relevant material, both impairing and enhancing stress effects on memory retrieval could be observed. In line with these latter studies, the present experiment also used material with a higher ecological validity (a coherent text consisting of daily relevant numeric, figural and verbal information). After encoding, retrieval took place 24 h later after exposure to psychosocial stress or a control procedure (20 healthy men per group). The stress group was further subdivided into cortisol responders and non-responders. Results showed a significantly impaired retrieval of everyday information in non-responders compared to responders and controls. Altogether, the present findings indicate the need for an appropriate cortisol response for the successful memory retrieval of everyday information. Thus, the present findings suggest that cortisol increases, contrary to a stressful experience per se, seem to play a protective role for retrieving everyday information. Additionally, it could be speculated that the previously reported impairing stress effects on memory retrieval might depend on the learning material used. Copyright © 2018 Elsevier Inc. All rights reserved.
Exploiting the Maximum Entropy Principle to Increase Retrieval Effectiveness.
ERIC Educational Resources Information Center
Cooper, William S.
1983-01-01
Presents information retrieval design approach in which queries of computer-based system consist of sets of terms, either unweighted or weighted with subjective term precision estimates, and retrieval outputs ranked by probability of usefulness estimated by "maximum entropy principle." Boolean and weighted request systems are discussed.…
Prototyping a Distributed Information Retrieval System That Uses Statistical Ranking.
ERIC Educational Resources Information Center
Harman, Donna; And Others
1991-01-01
Built using a distributed architecture, this prototype distributed information retrieval system uses statistical ranking techniques to provide better service to the end user. Distributed architecture was shown to be a feasible alternative to centralized or CD-ROM information retrieval, and user testing of the ranking methodology showed both…
Predicting Document Retrieval System Performance: An Expected Precision Measure.
ERIC Educational Resources Information Center
Losee, Robert M., Jr.
1987-01-01
Describes an expected precision (EP) measure designed to predict document retrieval performance. Highlights include decision theoretic models; precision and recall as measures of system performance; EP graphs; relevance feedback; and computing the retrieval status value of a document for two models, the Binary Independent Model and the Two Poisson…
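As a rough illustration of a retrieval status value, the sketch below uses the log-odds term weighting commonly associated with the binary independence model. Whether the article uses exactly this form is an assumption, and the term probabilities are invented: p[t] stands for the probability that term t occurs in a relevant document and q[t] for the probability that it occurs in a non-relevant one.

```python
import math

def retrieval_status_value(doc_terms, query_terms, p, q):
    """Sum of log-odds weights over query terms present in the document.
    p[t]: assumed P(term present | relevant); q[t]: assumed P(term present | non-relevant)."""
    return sum(
        math.log(p[t] * (1.0 - q[t]) / (q[t] * (1.0 - p[t])))
        for t in query_terms
        if t in doc_terms
    )

# Invented probabilities for two hypothetical query terms.
p = {"retrieval": 0.8, "system": 0.5}
q = {"retrieval": 0.2, "system": 0.5}

doc_a = {"retrieval", "system", "evaluation"}
doc_b = {"system", "design"}

print(retrieval_status_value(doc_a, ["retrieval", "system"], p, q))
print(retrieval_status_value(doc_b, ["retrieval", "system"], p, q))
```

Documents are then ranked by this value; a term that is equally likely in relevant and non-relevant documents (here "system") contributes nothing to the score.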
77 FR 67203 - Privacy Act of 1974; Republication of Systems of Records Notices
Federal Register 2010, 2011, 2012, 2013, 2014
2012-11-08
... file folders and on electronic media. RETRIEVABILITY: Accessed by name, tag number, and/or permit... DISPOSING OF RECORDS IN THE SYSTEM: STORAGE: Records are maintained on electronic media. RETRIEVABILITY... electronic media. RETRIEVABILITY: Records are accessed by individual action file number or by the name of the...
Medical image archive node simulation and architecture
NASA Astrophysics Data System (ADS)
Chiang, Ted T.; Tang, Yau-Kuo
1996-05-01
It is a well known fact that managed care and new treatment technologies are revolutionizing the health care provider world. Community Health Information Network and Computer-based Patient Record projects are underway throughout the United States. More and more hospitals are installing digital, `filmless' radiology (and other imagery) systems. They generate a staggering amount of information around the clock. For example, a typical 500-bed hospital might accumulate more than 5 terabytes of image data in a period of 30 years for conventional x-ray images and digital images such as Magnetic Resonance Imaging and Computed Tomography images. With several hospitals contributing to the archive, the storage required will be in the hundreds of terabytes. Systems for reliable, secure, and inexpensive storage and retrieval of digital medical information do not exist today. In this paper, we present a Medical Image Archive and Distribution Service (MIADS) concept. MIADS is a system shared by individual and community hospitals, laboratories, and doctors' offices that need to store and retrieve medical images. Due to the large volume and complexity of the data, as well as the diversified user access requirements, implementation of the MIADS will be a complex procedure. One of the key challenges to implementing a MIADS is to select a cost-effective, scalable system architecture to meet the ingest/retrieval performance requirements. We have performed an in-depth system engineering study, and developed a sophisticated simulation model to address this key challenge. This paper describes the overall system architecture based on our system engineering study and simulation results. In particular, we will emphasize system scalability and upgradability issues. Furthermore, we will discuss our simulation results in detail.
The simulations study the ingest/retrieval performance requirements based on different system configurations and architectures for variables such as workload, tape access time, number of drives, number of exams per patient, number of Central Processing Units, patient grouping, and priority impacts. The MIADS, which could be a key component of a broader data repository system, will be able to communicate with and obtain data from existing hospital information systems. We will discuss the external interfaces enabling MIADS to communicate with and obtain data from existing Radiology Information Systems such as the Picture Archiving and Communication System (PACS). Our system design encompasses the broader aspects of the archive node, which could include multimedia data such as image, audio, video, and free text data. This system is designed to be integrated with current hospital PACS through a Digital Imaging and Communications in Medicine interface. However, the system can also be accessed through the Internet using Hypertext Transfer Protocol or Simple File Transfer Protocol. Our design and simulation work will be key to implementing a successful, scalable medical image archive and distribution system.
Modeling and mining term association for improving biomedical information retrieval performance.
Hu, Qinmin; Huang, Jimmy Xiangji; Hu, Xiaohua
2012-06-11
The growth of biomedical information requires most information retrieval systems to provide short and specific answers in response to complex user queries. Semantic information in the form of free text is structured in a way that makes it straightforward for humans to read but more difficult for computers to interpret automatically and search efficiently. One of the reasons is that most traditional information retrieval models assume terms are conditionally independent given a document/passage. Therefore, we are motivated to consider term associations within different contexts to help the models understand semantic information and use it for improving biomedical information retrieval performance. We propose a term association approach to discover term associations among the keywords from a query. The experiments are conducted on the TREC 2004-2007 Genomics data sets and the TREC 2004 HARD data set. The proposed approach is promising and achieves superiority over the baselines and the GSP results. The parameter settings and different indices are investigated: the sentence-based index produces the best results at the document level, the word-based index at the passage level, and the paragraph-based index at the passage2 level. Furthermore, the best term association results always come from the best baseline. The tuning number k in the proposed recursive re-ranking algorithm is discussed and locally optimized to be 10. First, modelling term association for improving biomedical information retrieval using factor analysis is one of the major contributions in our work. Second, the experiments confirm that term association considering co-occurrence and dependency among the keywords can produce better results than the baselines treating the keywords independently. Third, the baselines are re-ranked according to the importance and reliance of latent factors behind term associations.
These latent factors are decided by the proposed model and their term appearances in the first round retrieved passages.
Modeling and mining term association for improving biomedical information retrieval performance
2012-01-01
Background The growth of biomedical information requires most information retrieval systems to provide short and specific answers in response to complex user queries. Semantic information in the form of free text is structured in a way that makes it straightforward for humans to read but more difficult for computers to interpret automatically and search efficiently. One of the reasons is that most traditional information retrieval models assume terms are conditionally independent given a document/passage. Therefore, we are motivated to consider term associations within different contexts to help the models understand semantic information and use it for improving biomedical information retrieval performance. Results We propose a term association approach to discover term associations among the keywords from a query. The experiments are conducted on the TREC 2004-2007 Genomics data sets and the TREC 2004 HARD data set. The proposed approach is promising and achieves superiority over the baselines and the GSP results. The parameter settings and different indices are investigated: the sentence-based index produces the best results at the document level, the word-based index at the passage level, and the paragraph-based index at the passage2 level. Furthermore, the best term association results always come from the best baseline. The tuning number k in the proposed recursive re-ranking algorithm is discussed and locally optimized to be 10. Conclusions First, modelling term association for improving biomedical information retrieval using factor analysis is one of the major contributions in our work. Second, the experiments confirm that term association considering co-occurrence and dependency among the keywords can produce better results than the baselines treating the keywords independently.
Third, the baselines are re-ranked according to the importance and reliance of latent factors behind term associations. These latent factors are decided by the proposed model and their term appearances in the first round retrieved passages. PMID:22901087
A Prototype System for Retrieval of Gene Functional Information
Folk, Lillian C.; Patrick, Timothy B.; Pattison, James S.; Wolfinger, Russell D.; Mitchell, Joyce A.
2003-01-01
Microarrays allow researchers to gather data about the expression patterns of thousands of genes simultaneously. Statistical analysis can reveal which genes show statistically significant results. Making biological sense of those results requires the retrieval of functional information about the genes thus identified, typically a manual gene-by-gene retrieval of information from various on-line databases. For experiments generating thousands of genes of interest, retrieval of functional information can become a significant bottleneck. To address this issue, we are currently developing a prototype system to automate the process of retrieval of functional information from multiple on-line sources. PMID:14728346
Automated Semantic Indexing of Figure Captions to Improve Radiology Image Retrieval
Kahn, Charles E.; Rubin, Daniel L.
2009-01-01
Objective We explored automated concept-based indexing of unstructured figure captions to improve retrieval of images from radiology journals. Design The MetaMap Transfer program (MMTx) was used to map the text of 84,846 figure captions from 9,004 peer-reviewed, English-language articles to concepts in three controlled vocabularies from the UMLS Metathesaurus, version 2006AA. Sampling procedures were used to estimate the standard information-retrieval metrics of precision and recall, and to evaluate the degree to which concept-based retrieval improved image retrieval. Measurements Precision was estimated based on a sample of 250 concepts. Recall was estimated based on a sample of 40 concepts. The authors measured the impact of concept-based retrieval to improve upon keyword-based retrieval in a random sample of 10,000 search queries issued by users of a radiology image search engine. Results Estimated precision was 0.897 (95% confidence interval, 0.857–0.937). Estimated recall was 0.930 (95% confidence interval, 0.838–1.000). In 5,535 of 10,000 search queries (55%), concept-based retrieval found results not identified by simple keyword matching; in 2,086 searches (21%), more than 75% of the results were found by concept-based search alone. Conclusion Concept-based indexing of radiology journal figure captions achieved very high precision and recall, and significantly improved image retrieval. PMID:19261938
Van Landeghem, Sofie; De Bodt, Stefanie; Drebert, Zuzanna J; Inzé, Dirk; Van de Peer, Yves
2013-03-01
Despite the availability of various data repositories for plant research, a wealth of information currently remains hidden within the biomolecular literature. Text mining provides the necessary means to retrieve these data through automated processing of texts. However, only recently has advanced text mining methodology been implemented with sufficient computational power to process texts at a large scale. In this study, we assess the potential of large-scale text mining for plant biology research in general and for network biology in particular using a state-of-the-art text mining system applied to all PubMed abstracts and PubMed Central full texts. We present extensive evaluation of the textual data for Arabidopsis thaliana, assessing the overall accuracy of this new resource for usage in plant network analyses. Furthermore, we combine text mining information with both protein-protein and regulatory interactions from experimental databases. Clusters of tightly connected genes are delineated from the resulting network, illustrating how such an integrative approach is essential to grasp the current knowledge available for Arabidopsis and to uncover gene information through guilt by association. All large-scale data sets, as well as the manually curated textual data, are made publicly available, hereby stimulating the application of text mining data in future plant biology studies.
BIOSSES: a semantic sentence similarity estimation system for the biomedical domain.
Sogancioglu, Gizem; Öztürk, Hakime; Özgür, Arzucan
2017-07-15
The amount of information available in textual format is rapidly increasing in the biomedical domain. Therefore, natural language processing (NLP) applications are becoming increasingly important to facilitate the retrieval and analysis of these data. Computing the semantic similarity between sentences is an important component in many NLP tasks including text retrieval and summarization. A number of approaches have been proposed for semantic sentence similarity estimation for generic English. However, our experiments showed that such approaches do not effectively cover biomedical knowledge and produce poor results for biomedical text. We propose several approaches for sentence-level semantic similarity computation in the biomedical domain, including string similarity measures and measures based on the distributed vector representations of sentences learned in an unsupervised manner from a large biomedical corpus. In addition, ontology-based approaches are presented that utilize general and domain-specific ontologies. Finally, a supervised regression based model is developed that effectively combines the different similarity computation metrics. A benchmark data set consisting of 100 sentence pairs from the biomedical literature is manually annotated by five human experts and used for evaluating the proposed methods. The experiments showed that the supervised semantic sentence similarity computation approach obtained the best performance (0.836 correlation with gold standard human annotations) and improved over the state-of-the-art domain-independent systems by up to 42.6% in terms of the Pearson correlation metric. A web-based system for biomedical semantic sentence similarity computation, the source code, and the annotated benchmark data set are available at: http://tabilab.cmpe.boun.edu.tr/BIOSSES/. Contact: gizemsogancioglu@gmail.com or arzucan.ozgur@boun.edu.tr. © The Author 2017. Published by Oxford University Press. All rights reserved.
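The evaluation described in the abstract above, scoring sentence pairs and correlating the scores with human judgments, can be illustrated with a minimal sketch. It assumes whitespace tokenization and a token-level Jaccard measure as a stand-in for the string similarity measures; the function names are hypothetical and this is not the BIOSSES implementation.

```python
import math


def jaccard_similarity(sentence_a, sentence_b):
    """Token-overlap similarity in [0, 1] (assumed stand-in for a string measure)."""
    tokens_a = set(sentence_a.lower().split())
    tokens_b = set(sentence_b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0  # two empty sentences are treated as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def pearson(xs, ys):
    """Pearson correlation between system scores and human annotations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)
```

In use, each sentence pair would be scored with one or more such measures, and `pearson` would then compare the resulting scores against the averaged expert ratings.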
Liu, Yifeng; Liang, Yongjie; Wishart, David
2015-07-01
PolySearch2 (http://polysearch.ca) is an online text-mining system for identifying relationships between biomedical entities such as human diseases, genes, SNPs, proteins, drugs, metabolites, toxins, metabolic pathways, organs, tissues, subcellular organelles, positive health effects, negative health effects, drug actions, Gene Ontology terms, MeSH terms, ICD-10 medical codes, biological taxonomies and chemical taxonomies. PolySearch2 supports a generalized 'Given X, find all associated Ys' query, where X and Y can be selected from the aforementioned biomedical entities. An example query might be: 'Find all diseases associated with Bisphenol A'. To find its answers, PolySearch2 searches for associations against comprehensive free-text collections, including local versions of MEDLINE abstracts, PubMed Central full-text articles, Wikipedia full-text articles and US Patent application abstracts. PolySearch2 also searches 14 widely used, text-rich biological databases such as UniProt, DrugBank and Human Metabolome Database to improve its accuracy and coverage. PolySearch2 maintains an extensive thesaurus of biological terms and exploits the latest search engine technology to rapidly retrieve relevant articles and database records. PolySearch2 also generates, ranks and annotates associative candidates and presents results with relevancy statistics and highlighted key sentences to facilitate user interpretation. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Liu, Yifeng; Liang, Yongjie; Wishart, David
2015-01-01
PolySearch2 (http://polysearch.ca) is an online text-mining system for identifying relationships between biomedical entities such as human diseases, genes, SNPs, proteins, drugs, metabolites, toxins, metabolic pathways, organs, tissues, subcellular organelles, positive health effects, negative health effects, drug actions, Gene Ontology terms, MeSH terms, ICD-10 medical codes, biological taxonomies and chemical taxonomies. PolySearch2 supports a generalized ‘Given X, find all associated Ys’ query, where X and Y can be selected from the aforementioned biomedical entities. An example query might be: ‘Find all diseases associated with Bisphenol A’. To find its answers, PolySearch2 searches for associations against comprehensive free-text collections, including local versions of MEDLINE abstracts, PubMed Central full-text articles, Wikipedia full-text articles and US Patent application abstracts. PolySearch2 also searches 14 widely used, text-rich biological databases such as UniProt, DrugBank and Human Metabolome Database to improve its accuracy and coverage. PolySearch2 maintains an extensive thesaurus of biological terms and exploits the latest search engine technology to rapidly retrieve relevant articles and database records. PolySearch2 also generates, ranks and annotates associative candidates and presents results with relevancy statistics and highlighted key sentences to facilitate user interpretation. PMID:25925572
[A retrieval method of drug molecules based on graph collapsing].
Qu, J W; Lv, X Q; Liu, Z M; Liao, Y; Sun, P H; Wang, B; Tang, Z
2018-04-18
To establish a compact and efficient hypergraph representation and a graph-similarity-based retrieval method of molecules to achieve effective and efficient medicine information retrieval. Chemical structural formula (CSF) was a primary search target as a unique and precise identifier for each compound at the molecular level in the research field of medicine information retrieval. To retrieve medicine information effectively and efficiently, a complete workflow of the graph-based CSF retrieval system was introduced. This system accepted the photos taken from smartphones and the sketches drawn on tablet personal computers as CSF inputs, and formalized the CSFs with the corresponding graphs. Then this paper proposed a compact and efficient hypergraph representation for molecules on the basis of analyzing factors that directly affected the efficiency of graph matching. According to the characteristics of CSFs, a hierarchical collapsing method combining graph isomorphism and frequent subgraph mining was adopted. There was yet a fundamental challenge, subgraph overlapping during the collapsing procedure, which hindered the method from establishing the correct compact hypergraph of an original CSF graph. Therefore, a graph-isomorphism-based algorithm was proposed to select dominant acyclic subgraphs on the basis of overlapping analysis. Finally, the spatial similarity among graphical CSFs was evaluated by multi-dimensional measures of similarity. To evaluate the performance of the proposed method, the proposed system was firstly compared with Wikipedia Chemical Structure Explorer (WCSE), the state-of-the-art system that allowed CSF similarity searching within Wikipedia molecules dataset, on retrieval accuracy. The system achieved higher values on mean average precision, discounted cumulative gain, rank-biased precision, and expected reciprocal rank than WCSE from the top-2 to the top-10 retrieved results. 
Specifically, the system achieved 10%, 1.41, 6.42%, and 1.32% higher than WCSE on these metrics for top-10 retrieval results, respectively. Moreover, several retrieval cases were presented to intuitively compare with WCSE. The results of the above comparative study demonstrated that the proposed method outperformed the existing method with regard to accuracy and effectiveness. This paper proposes a graph-similarity-based retrieval approach for medicine information. To obtain satisfactory retrieval results, an isomorphism-based algorithm is proposed for dominant subgraph selection based on the subgraph overlapping analysis, as well as an effective and efficient hypergraph representation of molecules. Experiment results demonstrate the effectiveness of the proposed approach.
Comparison between two lidar methods to retrieve microphysical properties of liquid-water clouds
NASA Astrophysics Data System (ADS)
Jimenez, Cristofer; Ansmann, Albert; Donovan, David; Engelmann, Ronny; Schmidt, Jörg; Wandinger, Ulla
2018-04-01
Since 2010, the Raman dual-FOV lidar system has permitted the retrieval of microphysical properties of liquid-water clouds during nighttime. A new robust lidar depolarization approach was recently introduced, which permits the retrieval of these properties as well, with high temporal resolution and during daytime. To implement this approach, the lidar system was upgraded by adding a three-channel depolarization receiver. The first preliminary retrieval results and a comparison between both methods are presented.
Decision support environment for medical product safety surveillance.
Botsis, Taxiarchis; Jankosky, Christopher; Arya, Deepa; Kreimeyer, Kory; Foster, Matthew; Pandey, Abhishek; Wang, Wei; Zhang, Guangfan; Forshee, Richard; Goud, Ravi; Menschik, David; Walderhaug, Mark; Woo, Emily Jane; Scott, John
2016-12-01
We have developed a Decision Support Environment (DSE) for medical experts at the US Food and Drug Administration (FDA). The DSE contains two integrated systems: The Event-based Text-mining of Health Electronic Records (ETHER) and the Pattern-based and Advanced Network Analyzer for Clinical Evaluation and Assessment (PANACEA). These systems assist medical experts in reviewing reports submitted to the Vaccine Adverse Event Reporting System (VAERS) and the FDA Adverse Event Reporting System (FAERS). In this manuscript, we describe the DSE architecture and key functionalities, and examine its potential contributions to the signal management process by focusing on four use cases: the identification of missing cases from a case series, the identification of duplicate case reports, retrieving cases for a case series analysis, and community detection for signal identification and characterization. Published by Elsevier Inc.
Social Image Tag Ranking by Two-View Learning
NASA Astrophysics Data System (ADS)
Zhuang, Jinfeng; Hoi, Steven C. H.
Tags play a central role in text-based social image retrieval and browsing. However, the tags annotated by web users could be noisy, irrelevant, and often incomplete for describing the image contents, which may severely deteriorate the performance of text-based image retrieval models. In order to solve this problem, researchers have proposed techniques to rank the annotated tags of a social image according to their relevance to the visual content of the image. In this paper, we aim to overcome the challenge of social image tag ranking for a corpus of social images with rich user-generated tags by proposing a novel two-view learning approach. It can effectively exploit both textual and visual contents of social images to discover the complicated relationship between tags and images. Unlike conventional learning approaches, which usually assume some parametric model, our method is completely data-driven and makes no assumption about the underlying models, making the proposed solution practically more effective. We formulate our method as an optimization task and present an efficient algorithm to solve it. To evaluate the efficacy of our method, we conducted an extensive set of experiments by applying our technique to both text-based social image retrieval and automatic image annotation tasks. Our empirical results showed that the proposed method can be more effective than the conventional approaches.
Extracting similar terms from multiple EMR-based semantic embeddings to support chart reviews.
Ye, Cheng; Fabbri, Daniel
2018-05-21
Word embeddings project semantically similar terms into nearby points in a vector space. When trained on clinical text, these embeddings can be leveraged to improve keyword search and text highlighting. In this paper, we present methods to refine the selection process of similar terms from multiple EMR-based word embeddings, and evaluate their performance quantitatively and qualitatively across multiple chart review tasks. Word embeddings were trained on each clinical note type in an EMR. These embeddings were then combined, weighted, and truncated to select a refined set of similar terms to be used in keyword search and text highlighting. To evaluate their quality, we measured the similar terms' information retrieval (IR) performance using precision-at-K (P@5, P@10). Additionally, a user study evaluated users' search term preferences, while a timing study measured the time to answer a question from a clinical chart. The refined terms outperformed the baseline method's information retrieval performance (e.g., increasing the average P@5 from 0.48 to 0.60). Additionally, the refined terms were preferred by most users, and reduced the average time to answer a question. Clinical information can be more quickly retrieved and synthesized when using semantically similar terms from multiple embeddings. Copyright © 2018. Published by Elsevier Inc.
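For readers unfamiliar with the precision-at-K metric reported above, the following is a minimal sketch of how P@K is commonly computed. The function name and the example terms are hypothetical, and the study's exact relevance-judgment protocol is not given in the abstract.

```python
def precision_at_k(ranked_terms, relevant_terms, k):
    """Fraction of the top-k ranked terms that were judged relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_terms[:k]
    hits = sum(1 for term in top_k if term in relevant_terms)
    # Dividing by k (not len(top_k)) penalizes rankings shorter than k.
    return hits / k


# Hypothetical similar terms suggested for the search keyword "heart attack".
ranked = ["mi", "myocardial infarction", "stroke", "infarction", "fever"]
relevant = {"mi", "myocardial infarction", "infarction"}
p_at_5 = precision_at_k(ranked, relevant, 5)
```

Here three of the five suggested terms are relevant, so P@5 is 0.6. Averaging this value over many query terms gives figures comparable in form to the averages quoted in the abstract.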
Fei, Lin; Zhao, Jing; Leng, Jiahao; Zhang, Shujian
2017-10-12
ALIPORC is a specialized full-text database of acupuncture literature in the Republic of China. Under construction since 2015, the database focuses on acupuncture-related books, articles and advertising documents written or published in the Republic of China. Its construction aims to enable the sharing of acupuncture medical literature in the Republic of China through diverse retrieval approaches and accurate content presentation, to contribute to exchange among scholars, to reduce the paper damage caused by paging, and to simplify the retrieval of rare literature. The authors describe the database in terms of its sources, characteristics and current state of construction, and discuss improving the efficiency and integrity of the database and deepening the development of acupuncture literature in the Republic of China.
New model for distributed multimedia databases and its application to networking of museums
NASA Astrophysics Data System (ADS)
Kuroda, Kazuhide; Komatsu, Naohisa; Komiya, Kazumi; Ikeda, Hiroaki
1998-02-01
This paper proposes a new distributed multimedia database system in which databases storing MPEG-2 videos and/or super-high-definition images are connected through B-ISDNs, and also describes an example of the networking of museums on the basis of the proposed database system. The proposed database system introduces a new concept of the 'retrieval manager', which functions as an intelligent controller so that the user can recognize a set of image databases as one logical database. A user terminal issues a request to retrieve contents to the retrieval manager located nearest to the user terminal on the network. Then, the retrieved contents are sent directly through the B-ISDNs to the user terminal from the server which stores the designated contents. In this case, the designated logical database dynamically generates the best combination of retrieving parameters, such as the data transfer path, by referring to directory data and the environment of the system. The generated retrieving parameters are then executed to select the most suitable data transfer path on the network. Therefore, the best combination of these parameters fits the distributed multimedia database system.
Learning of Multimodal Representations With Random Walks on the Click Graph.
Wu, Fei; Lu, Xinyan; Song, Jun; Yan, Shuicheng; Zhang, Zhongfei Mark; Rui, Yong; Zhuang, Yueting
2016-02-01
In multimedia information retrieval, most classic approaches tend to represent different modalities of media in the same feature space. With the click data collected from the users' searching behavior, existing approaches take either one-to-one paired data (text-image pairs) or ranking examples (text-query-image and/or image-query-text ranking lists) as training examples, which do not make full use of the click data, particularly the implicit connections among the data objects. In this paper, we treat the click data as a large click graph, in which vertices are images/text queries and edges indicate the clicks between an image and a query. We consider learning a multimodal representation from the perspective of encoding the explicit/implicit relevance relationship between the vertices in the click graph. By minimizing both the truncated random walk loss as well as the distance between the learned representation of vertices and their corresponding deep neural network output, the proposed model which is named multimodal random walk neural network (MRW-NN) can be applied to not only learn robust representation of the existing multimodal data in the click graph, but also deal with the unseen queries and images to support cross-modal retrieval. We evaluate the latent representation learned by MRW-NN on a public large-scale click log data set Clickture and further show that MRW-NN achieves much better cross-modal retrieval performance on the unseen queries/images than the other state-of-the-art methods.
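The click-graph construction and the truncated random walks mentioned above can be sketched as follows. This only illustrates the sampling step under simplifying assumptions (unweighted edges, uniform neighbor choice, hypothetical function names and node labels); it does not reproduce the MRW-NN loss or its neural network component.

```python
import random
from collections import defaultdict


def build_click_graph(click_pairs):
    """Build an undirected bipartite graph from (query, image) click pairs."""
    graph = defaultdict(list)
    for query, image in click_pairs:
        graph[query].append(image)
        graph[image].append(query)
    return graph


def truncated_random_walk(graph, start, walk_length, rng):
    """Sample a walk of at most `walk_length` vertices starting at `start`."""
    walk = [start]
    current = start
    for _ in range(walk_length - 1):
        neighbors = graph.get(current, [])
        if not neighbors:
            break  # isolated vertex: the walk ends early
        current = rng.choice(neighbors)
        walk.append(current)
    return walk


# Hypothetical click log: text queries (q:) clicked on images (img:).
clicks = [("q:cat", "img:1"), ("q:cat", "img:2"), ("q:kitten", "img:2")]
graph = build_click_graph(clicks)
walk = truncated_random_walk(graph, "q:cat", 5, random.Random(0))
```

Because the graph is bipartite, each walk alternates between queries and images, so vertices that co-occur in walks (for example `q:cat` and `q:kitten` via `img:2`) expose the implicit connections that one-to-one pairs alone would miss.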
New frontiers for intelligent content-based retrieval
NASA Astrophysics Data System (ADS)
Benitez, Ana B.; Smith, John R.
2001-01-01
In this paper, we examine emerging frontiers in the evolution of content-based retrieval systems that rely on an intelligent infrastructure. Here, we refer to intelligence as the capabilities of the systems to build and maintain situational or world models, utilize dynamic knowledge representation, exploit context, and leverage advanced reasoning and learning capabilities. We argue that these elements are essential to producing effective systems for retrieving audio-visual content at semantic levels matching those of human perception and cognition. In this paper, we review relevant research on the understanding of human intelligence and construction of intelligent systems in the fields of cognitive psychology, artificial intelligence, semiotics, and computer vision. We also discuss how some of the principal ideas from these fields lead to new opportunities and capabilities for content-based retrieval systems. Finally, we describe some of our efforts in these directions. In particular, we present MediaNet, a multimedia knowledge presentation framework, and some MPEG-7 description tools that facilitate and enable intelligent content-based retrieval.
New frontiers for intelligent content-based retrieval
NASA Astrophysics Data System (ADS)
Benitez, Ana B.; Smith, John R.
2000-12-01
In this paper, we examine emerging frontiers in the evolution of content-based retrieval systems that rely on an intelligent infrastructure. Here, we refer to intelligence as the capabilities of the systems to build and maintain situational or world models, utilize dynamic knowledge representation, exploit context, and leverage advanced reasoning and learning capabilities. We argue that these elements are essential to producing effective systems for retrieving audio-visual content at semantic levels matching those of human perception and cognition. In this paper, we review relevant research on the understanding of human intelligence and construction of intelligent systems in the fields of cognitive psychology, artificial intelligence, semiotics, and computer vision. We also discuss how some of the principal ideas from these fields lead to new opportunities and capabilities for content-based retrieval systems. Finally, we describe some of our efforts in these directions. In particular, we present MediaNet, a multimedia knowledge presentation framework, and some MPEG-7 description tools that facilitate and enable intelligent content-based retrieval.
Experiments and Analysis on a Computer Interface to an Information-Retrieval Network.
ERIC Educational Resources Information Center
Marcus, Richard S.; Reintjes, J. Francis
A primary goal of this project was to develop an interface that would provide direct access for inexperienced users to existing online bibliographic information retrieval networks. The experiment tested the concept of a virtual-system mode of access to a network of heterogeneous interactive retrieval systems and databases. An experimental…
Factors Influencing Successful Use of Information Retrieval Systems by Nurse Practitioner Students
Rose, Linda; Crabtree, Katherine; Hersh, William
1998-01-01
This study examined whether a relationship exists between selected Nurse Practitioner students' attributes and successful information retrieval as demonstrated by correctly answering clinical questions using an information retrieval system (Medline). One predictor variable, attitude toward current computer technology, was significantly correlated (r = 0.43, p ≤ .05) with successful literature searching.
Sarrouti, Mourad; Ouatik El Alaoui, Said
2017-04-01
Passage retrieval, the identification of top-ranked passages that may contain the answer for a given biomedical question, is a crucial component for any biomedical question answering (QA) system. Passage retrieval in open-domain QA is a longstanding challenge widely studied over the last decades. However, it still requires further efforts in biomedical QA. In this paper, we present a new biomedical passage retrieval method based on Stanford CoreNLP sentence/passage length, a probabilistic information retrieval (IR) model and UMLS concepts. In the proposed method, we first use our document retrieval system based on the PubMed search engine and UMLS similarity to retrieve documents relevant to a given biomedical question. We then take the abstracts from the retrieved documents and use the Stanford CoreNLP sentence splitter to produce a set of sentences, i.e., candidate passages. Using stemmed words and UMLS concepts as features for the BM25 model, we finally compute the similarity scores between the biomedical question and each of the candidate passages and keep the N top-ranked ones. Experimental evaluations performed on large standard datasets, provided by the BioASQ challenge, show that the proposed method achieves good performance compared with the current state-of-the-art methods. The proposed method significantly outperforms the current state-of-the-art methods by an average of 6.84% in terms of mean average precision (MAP). We have proposed an efficient passage retrieval method which can be used to retrieve relevant passages in biomedical QA systems with high mean average precision. Copyright © 2017 Elsevier Inc. All rights reserved.
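The final ranking step described above, scoring candidate passages against the question with BM25 and keeping the N top-ranked ones, can be sketched as follows. This is a minimal illustration under stated assumptions: plain lowercase tokens stand in for the stemmed words and UMLS concepts, the parameters `k1` and `b` are conventional defaults rather than values from the paper, the IDF uses a common smoothed variant, and the function names are hypothetical.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, passages, k1=1.2, b=0.75):
    """Score each tokenized passage against the query with Okapi BM25."""
    if not passages:
        return []
    n = len(passages)
    avg_len = sum(len(p) for p in passages) / n or 1.0
    doc_freq = Counter()
    for passage in passages:
        doc_freq.update(set(passage))  # count passages containing each term
    scores = []
    for passage in passages:
        term_freq = Counter(passage)
        score = 0.0
        for term in query_tokens:
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue
            # Smoothed IDF (assumed variant); always non-negative.
            idf = math.log(1 + (n - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(passage) / avg_len)
            score += idf * tf * (k1 + 1) / (tf + length_norm)
        scores.append(score)
    return scores


def top_n_passages(query_tokens, passages, n):
    """Return indices of the n highest-scoring passages, best first."""
    scores = bm25_scores(query_tokens, passages)
    ranked = sorted(range(len(passages)), key=lambda i: scores[i], reverse=True)
    return ranked[:n]


# Hypothetical candidate passages, already split into tokens.
candidates = [
    ["aspirin", "reduces", "fever"],
    ["statins", "lower", "cholesterol"],
    ["aspirin", "aspirin", "pain"],
]
best = top_n_passages(["aspirin", "fever"], candidates, 2)
```

In this toy example the first passage ranks highest because it matches both query terms, and the third ranks second on the strength of its repeated match. In a fuller pipeline, the token lists would be replaced by the stemmed words and concept identifiers extracted from each sentence.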
Medical student, resident, and faculty use of a computerized literature searching system.
Markert, R J; Parisi, A J; Barnes, H V; Cohen, S; Goldenberg, K; Mieczkowski, L E; Dunn, M; Siervogel, R M
1989-04-01
The experiences of medical students, residents, and faculty with a computerized literature searching system were evaluated. Third-year medical students, internal medicine and family practice residents, and full-time and voluntary faculty at one medical school had the opportunity to use a full-text and bibliographic medical literature retrieval system free of charge for an eleven-month period. Subjects conducted nearly nine thousand literature searches over a period of 942 system hours. Questionnaire data showed that participants could learn to use and would use an electronic information system, felt capable of using the system, utilized the system for a variety of purposes and in a number of different ways, and viewed the system as a valuable tool in searching the medical literature. The results are discussed in the context of the educational needs of the four user-groups and medical education planning by institutions.
EnsMart: A Generic System for Fast and Flexible Access to Biological Data
Kasprzyk, Arek; Keefe, Damian; Smedley, Damian; London, Darin; Spooner, William; Melsopp, Craig; Hammond, Martin; Rocca-Serra, Philippe; Cox, Tony; Birney, Ewan
2004-01-01
The EnsMart system (www.ensembl.org/EnsMart) provides a generic data warehousing solution for fast and flexible querying of large biological data sets and integration with third-party data and tools. The system consists of a query-optimized database and interactive, user-friendly interfaces. EnsMart has been applied to Ensembl, where it extends its genomic browser capabilities, facilitating rapid retrieval of customized data sets. A wide variety of complex queries, on various types of annotations, for numerous species are supported. These can be applied to many research problems, ranging from SNP selection for candidate gene screening, through cross-species evolutionary comparisons, to microarray annotation. Users can group and refine biological data according to many criteria, including cross-species analyses, disease links, sequence variations, and expression patterns. Both tabulated list data and biological sequence output can be generated dynamically, in HTML, text, Microsoft Excel, and compressed formats. A wide range of sequence types, such as cDNA, peptides, coding regions, UTRs, and exons, with additional upstream and downstream regions, can be retrieved. The EnsMart database can be accessed via a public Web site, or through a Java application suite. Both implementations and the database are freely available for local installation, and can be extended or adapted to 'non-Ensembl' data sets. PMID:14707178
Abstracts versus Full Texts and Patents: A Quantitative Analysis of Biomedical Entities
NASA Astrophysics Data System (ADS)
Müller, Bernd; Klinger, Roman; Gurulingappa, Harsha; Mevissen, Heinz-Theodor; Hofmann-Apitius, Martin; Fluck, Juliane; Friedrich, Christoph M.
In information retrieval, named entity recognition gives the opportunity to apply semantic search in domain specific corpora. Recently, more full text patents and journal articles became freely available. As the information distribution amongst the different sections is unknown, an analysis of the diversity is of interest.
Publication Index and Retrieval System.
1980-04-01
A0DA1087 279, NERNER AND CO WASHINGTON D C F/6 13/2 PUBLICATION INDEX AND RETRIEVAL SYSTEM. (U) APR 80 DAC3977-C0081 UNCLASSIFIED WES-OS-78-2 ...Engineers. The objective was to develop an information index and retrieval system for the publications of the DMRP. The report was prepared by Herner...Development of the system and preparation and review of the report were under the supervision of Dr. R. T. Saucier, Special Assistant for Dredged
Broadband Phase Retrieval for Image-Based Wavefront Sensing
NASA Technical Reports Server (NTRS)
Dean, Bruce H.
2007-01-01
A focus-diverse phase-retrieval algorithm has been shown to perform adequately for the purpose of image-based wavefront sensing when (1) broadband light (typically spanning the visible spectrum) is used in forming the images by use of an optical system under test and (2) the assumption of monochromaticity is applied to the broadband image data. Heretofore, it had been assumed that in order to obtain adequate performance, it is necessary to use narrowband or monochromatic light. Some background information, including definitions of terms and a brief description of pertinent aspects of image-based phase retrieval, is prerequisite to a meaningful summary of the present development. Phase retrieval is a general term used in optics to denote estimation of optical imperfections or aberrations of an optical system under test. The term image-based wavefront sensing refers to a general class of algorithms that recover optical phase information, and phase-retrieval algorithms constitute a subset of this class. In phase retrieval, one utilizes the measured response of the optical system under test to produce a phase estimate. The optical response of the system is defined as the image of a point-source object, which could be a star or a laboratory point source. The phase-retrieval problem is characterized as image-based in the sense that a charge-coupled-device camera, preferably of scientific imaging quality, is used to collect image data where the optical system would normally form an image. In a variant of phase retrieval, denoted phase-diverse phase retrieval [which can include focus-diverse phase retrieval (in which various defocus planes are used)], an additional known aberration (or an equivalent diversity function) is superimposed as an aid in estimating unknown aberrations by use of an image-based wavefront-sensing algorithm. 
Image-based phase retrieval differs from other wavefront-sensing methods, such as interferometry, shearing interferometry, curvature wavefront sensing, and Shack-Hartmann sensing, all of which entail disadvantages in comparison with image-based methods. The main disadvantages of these non-image-based methods are complexity of test equipment and the need for a wavefront reference.
Application of new type of distributed multimedia databases to networked electronic museum
NASA Astrophysics Data System (ADS)
Kuroda, Kazuhide; Komatsu, Naohisa; Komiya, Kazumi; Ikeda, Hiroaki
1999-01-01
Recently, various kinds of multimedia application systems have actively been developed based on the achievement of advanced high-speed communication networks, computer processing technologies, and digital contents-handling technologies. Against this background, this paper proposes a new distributed multimedia database system which can effectively perform a new function of cooperative retrieval among distributed databases. The proposed system introduces a new concept of 'Retrieval manager' which functions as an intelligent controller so that the user can recognize a set of distributed databases as one logical database. The logical database dynamically generates and performs a preferred combination of retrieving parameters on the basis of both directory data and the system environment. Moreover, a concept of 'domain' is defined in the system as a managing unit of retrieval. The retrieval can effectively be performed by cooperation of processing among multiple domains. Communication language and protocols are also defined in the system. These are used in every action for communications in the system. A language interpreter in each machine translates a communication language into an internal language used in each machine. Because of the language interpreter, internal modules such as the DBMS and user interface modules can be freely selected. A concept of 'content-set' is also introduced. A content-set is defined as a package of contents. Contents in the content-set are related to each other. The system handles a content-set as one object. The user terminal can effectively control the displaying of retrieved contents, referring to data indicating the relation of the contents in the content-set. In order to verify the function of the proposed system, a networked electronic museum was experimentally built. 
The results of this experiment indicate that the proposed system can effectively retrieve the objective contents under the control of a number of distributed domains. The results also indicate that the system can effectively work even if the system becomes large.
Utilization of a multimedia PACS workstation for surgical planning of epilepsy
NASA Astrophysics Data System (ADS)
Soo Hoo, Kent; Wong, Stephen T.; Hawkins, Randall A.; Knowlton, Robert C.; Laxer, Kenneth D.; Rowley, Howard A.
1997-05-01
Surgical treatment of temporal lobe epilepsy requires the localization of the epileptogenic zone for surgical resection. Currently, clinicians utilize electroencephalography, various neuroimaging modalities, and psychological tests together to determine the location of this zone. We investigate how a multimedia neuroimaging workstation built on top of the UCSF Picture Archiving and Communication System can be used to aid surgical planning of epilepsy and related brain diseases. This usage demonstrates the ability of the workstation to retrieve image and textual data from PACS and other image sources, register multimodality images, visualize and render 3D data sets, analyze images, generate new image and text data from the analysis, and organize all data in a relational database management system.
Data Archival and Retrieval Enhancement (DARE) Metadata Modeling and Its User Interface
NASA Technical Reports Server (NTRS)
Hyon, Jason J.; Borgen, Rosana B.
1996-01-01
The Defense Nuclear Agency (DNA) has acquired terabytes of valuable data which need to be archived and effectively distributed to the entire nuclear weapons effects community and others...This paper describes the DARE (Data Archival and Retrieval Enhancement) metadata model and explains how it is used as a source for generating HyperText Markup Language (HTML) or Standard Generalized Markup Language (SGML) documents for access through web browsers such as Netscape.
Using Stream Features for Instant Document Filtering
2012-11-01
expansion and quality indicators in searching microblog posts. Advances in Information Retrieval, pages 362–367, 2011. [12] N. Naveed, T. Gottron, J ...16] G Salton and C Buckley. Term-weighting approaches in automatic text retrieval. Information Processing & Management, 24(5):513–523, 1988. [17...Overview of the TREC-2012 Microblog Track. In trec.nist.gov. NIST. [19] Michael J Welch, Uri Schonfeld, Dan He, and Junghoo Cho. Topical semantics of