Autism, Context/Noncontext Information Processing, and Atypical Development
Skoyles, John R.
2011-01-01
Autism has been attributed to a deficit in contextual information processing. Attempts to understand autism in terms of such a defect, however, do not include more recent computational work on context. This work has identified that context information processing depends upon the extraction and use of the information hidden in higher-order (or indirect) associations. Higher-order associations underlie the cognition of context rather than that of situations. This paper starts by examining the differences between higher-order and first-order (or direct) associations. Higher-order associations link entities not directly (as with first-order ones) but indirectly through all the connections they have via other entities. Extracting this information requires the processing of past episodes as a totality. As a result, this extraction depends upon specialised extraction processes separate from cognition. This information is then consolidated. Due to this difference, the extraction/consolidation of higher-order information can be impaired whilst cognition remains intact. Although not directly impaired, cognition will be indirectly impaired by knock-on effects, such as cognition compensating for absent higher-order information with information extracted from first-order associations. This paper discusses the implications of this for the inflexible, literal/immediate, and inappropriate information processing of autistic individuals. PMID:22937255
Information extraction during simultaneous motion processing.
Rideaux, Reuben; Edwards, Mark
2014-02-01
When confronted with multiple moving objects the visual system can process them in two stages: an initial stage in which a limited number of signals are processed in parallel (i.e. simultaneously) followed by a sequential stage. We previously demonstrated that during the simultaneous stage, observers could discriminate between presentations containing up to 5 vs. 6 spatially localized motion signals (Edwards & Rideaux, 2013). Here we investigate what information is actually extracted during the simultaneous stage and whether the simultaneous limit varies with the detail of information extracted. This was achieved by measuring the ability of observers to extract varied information from low detail, i.e. the number of signals presented, to high detail, i.e. the actual directions present and the direction of a specific element, during the simultaneous stage. The results indicate that the resolution of simultaneous processing varies as a function of the information which is extracted, i.e. as the information extraction becomes more detailed, from the number of moving elements to the direction of a specific element, the capacity to process multiple signals is reduced. Thus, when assigning a capacity to simultaneous motion processing, this must be qualified by designating the degree of information extraction. Crown Copyright © 2013. Published by Elsevier Ltd. All rights reserved.
Extraction of Data from a Hospital Information System to Perform Process Mining.
Neira, Ricardo Alfredo Quintano; de Vries, Gert-Jan; Caffarel, Jennifer; Stretton, Erin
2017-01-01
The aim of this work is to share our experience in relevant data extraction from a hospital information system in preparation for a research study using process mining techniques. The steps performed were: research definition, mapping the normative processes, identification of the table and field names of the database, and extraction of data. We then offer lessons learned during the data extraction phase. Any errors made in the extraction phase will propagate and have implications on subsequent analyses. Thus, it is essential to take the time needed and devote sufficient attention to detail to perform all activities with the goal of ensuring high quality of the extracted data. We hope this work will be informative for other researchers planning and executing data extraction for process mining research studies.
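As an illustration of the kind of extraction the authors describe (identifying tables and fields and pulling them into an analysable form), the following is a minimal sketch that assembles a process-mining event log from hypothetical hospital-system tables; the table and column names are invented for the example, not taken from the study.

```python
# Hypothetical sketch: assembling a process-mining event log (case id,
# activity, timestamp) from extracted hospital-system tables. Table and
# column names are illustrative, not from the study.
import pandas as pd

admissions = pd.DataFrame({
    "patient_id": [1, 2],
    "admitted_at": ["2016-01-02 08:00", "2016-01-03 09:30"],
})
lab_orders = pd.DataFrame({
    "patient_id": [1, 1, 2],
    "ordered_at": ["2016-01-02 09:15", "2016-01-02 14:40", "2016-01-03 10:05"],
})

def to_events(df, time_col, activity):
    # Map one source table onto the common event-log schema.
    out = df.rename(columns={"patient_id": "case_id", time_col: "timestamp"})
    out["activity"] = activity
    return out[["case_id", "activity", "timestamp"]]

event_log = pd.concat([
    to_events(admissions, "admitted_at", "Admission"),
    to_events(lab_orders, "ordered_at", "Lab order"),
])
event_log["timestamp"] = pd.to_datetime(event_log["timestamp"])
event_log = event_log.sort_values(["case_id", "timestamp"]).reset_index(drop=True)
print(event_log)
```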
New Method for Knowledge Management Focused on Communication Pattern in Product Development
NASA Astrophysics Data System (ADS)
Noguchi, Takashi; Shiba, Hajime
In the field of manufacturing, the importance of utilizing knowledge and know-how has been growing. Against this background, new methods are needed to efficiently accumulate and extract effective knowledge and know-how. To facilitate the extraction of knowledge and know-how needed by engineers, we first defined business process information, which includes schedule/progress information, document data, information about communication among the parties concerned, and information corresponding to these three types of information. Based on our definitions, we proposed an IT system (FlexPIM: Flexible and collaborative Process Information Management) to register and accumulate business process information with the least effort. In order to efficiently extract effective information from huge volumes of accumulated business process information, focusing attention on “actions” and communication patterns, we propose a new extraction method using communication patterns. The validity of this method has been verified for several communication patterns.
Considerations on the Optimal and Efficient Processing of Information-Bearing Signals
ERIC Educational Resources Information Center
Harms, Herbert Andrew
2013-01-01
Noise is a fundamental hurdle that impedes the processing of information-bearing signals, specifically the extraction of salient information. Processing that is both optimal and efficient is desired; optimality ensures the extracted information has the highest fidelity allowed by the noise, while efficiency ensures limited resource usage. Optimal…
Information Extraction Using Controlled English to Support Knowledge-Sharing and Decision-Making
2012-06-01
or language variants. CE-based information extraction will greatly facilitate the processes in the cognitive and social domains that enable forces... The processor is run to turn the atomic CE into a more "stylistically felicitous" CE, using techniques such as aggregating all information about an entity
Translations on USSR Science and Technology, Physical Sciences and Technology, Number 45
1978-08-14
hundred pages. [Question] In the public's perception a cosmonaut seems to be, in terms of physical and mental fitness, something like a superman. How ... indicate how the original information was processed. Where no processing indicator is given, the information was summarized or extracted. Unfamiliar names ...
Thinking graphically: Connecting vision and cognition during graph comprehension.
Ratwani, Raj M; Trafton, J Gregory; Boehm-Davis, Deborah A
2008-03-01
Task analytic theories of graph comprehension account for the perceptual and conceptual processes required to extract specific information from graphs. Comparatively, the processes underlying information integration have received less attention. We propose a new framework for information integration that highlights visual integration and cognitive integration. During visual integration, pattern recognition processes are used to form visual clusters of information; these visual clusters are then used to reason about the graph during cognitive integration. In 3 experiments, the processes required to extract specific information and to integrate information were examined by collecting verbal protocol and eye movement data. Results supported the task analytic theories for specific information extraction and the processes of visual and cognitive integration for integrative questions. Further, the integrative processes scaled up as graph complexity increased, highlighting the importance of these processes for integration in more complex graphs. Finally, based on this framework, design principles to improve both visual and cognitive integration are described. PsycINFO Database Record (c) 2008 APA, all rights reserved
Lexical and sublexical semantic preview benefits in Chinese reading.
Yan, Ming; Zhou, Wei; Shu, Hua; Kliegl, Reinhold
2012-07-01
Semantic processing from parafoveal words is an elusive phenomenon in alphabetic languages, but it has been demonstrated only for a restricted set of noncompound Chinese characters. Using the gaze-contingent boundary paradigm, this experiment examined whether parafoveal lexical and sublexical semantic information was extracted from compound preview characters. Results generalized parafoveal semantic processing to this representative set of Chinese characters and extended the parafoveal processing to radical (sublexical) level semantic information extraction. Implications for notions of parafoveal information extraction during Chinese reading are discussed. 2012 APA, all rights reserved
Joint Extraction of Entities and Relations Using Reinforcement Learning and Deep Learning.
Feng, Yuntian; Zhang, Hongjun; Hao, Wenning; Chen, Gang
2017-01-01
We use both reinforcement learning and deep learning to simultaneously extract entities and relations from unstructured texts. For reinforcement learning, we model the task as a two-step decision process. Deep learning is used to automatically capture the most important information from unstructured texts, which represent the state in the decision process. By designing the reward function per step, our proposed method can pass the information of entity extraction to relation extraction and obtain feedback in order to extract entities and relations simultaneously. Firstly, we use bidirectional LSTM to model the context information, which realizes preliminary entity extraction. On the basis of the extraction results, attention based method can represent the sentences that include target entity pair to generate the initial state in the decision process. Then we use Tree-LSTM to represent relation mentions to generate the transition state in the decision process. Finally, we employ Q -Learning algorithm to get control policy π in the two-step decision process. Experiments on ACE2005 demonstrate that our method attains better performance than the state-of-the-art method and gets a 2.4% increase in recall-score.
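The full pipeline combines BiLSTM, attention and Tree-LSTM representations with Q-learning; as a minimal sketch of only the decision-making part, the toy example below runs tabular Q-learning over a two-step decision process (keep or drop a candidate entity pair, then label the relation). The states, actions, rewards and relation labels are invented stand-ins, not the authors' model.

```python
# Minimal sketch (not the authors' implementation): tabular Q-learning over a
# toy two-step decision process -- step 0 decides whether a candidate entity
# pair is valid, step 1 assigns a relation label. Discrete states stand in
# for the LSTM/Tree-LSTM representations used in the paper.
import random

ACTIONS = {0: ["keep_pair", "drop_pair"], 1: ["PHYS", "ORG-AFF", "no_relation"]}
Q = {}                      # (step, state, action) -> value
alpha, gamma, eps = 0.1, 0.9, 0.2

def q(step, state, action):
    return Q.get((step, state, action), 0.0)

def choose(step, state):
    # epsilon-greedy policy over the step's action set
    if random.random() < eps:
        return random.choice(ACTIONS[step])
    return max(ACTIONS[step], key=lambda a: q(step, state, a))

def episode(sample):
    # sample: (state_step0, state_step1, gold_keep, gold_relation)
    s0, s1, gold_keep, gold_rel = sample
    a0 = choose(0, s0)
    r0 = 1.0 if (a0 == "keep_pair") == gold_keep else -1.0
    a1 = choose(1, s1)
    r1 = 1.0 if a1 == gold_rel else -1.0
    # Q-learning backups for the two steps
    best_next = max(q(1, s1, a) for a in ACTIONS[1])
    Q[(0, s0, a0)] = q(0, s0, a0) + alpha * (r0 + gamma * best_next - q(0, s0, a0))
    Q[(1, s1, a1)] = q(1, s1, a1) + alpha * (r1 - q(1, s1, a1))

data = [("pair_near", "verb_in", True, "PHYS"),
        ("pair_far", "no_verb", False, "no_relation")]
for _ in range(500):
    episode(random.choice(data))
print({k: round(v, 2) for k, v in Q.items() if k[0] == 1})
```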
Wang, Zhifei; Xie, Yanming; Wang, Yongyan
2011-10-01
Computerized extraction of information from Chinese medicine literature is more convenient than hand searching: it simplifies the search process and improves accuracy. Among the many computerized auto-extraction methods now in use, regular expressions are particularly efficient for extracting useful information in research. This article focuses on applying regular expressions to extract information from Chinese medicine literature. Two practical examples are reported, in which regular expressions were used to extract "case number" (non-terminology) and "efficacy rate" (subgroups for related information identification), illustrating how information can be extracted from Chinese medicine literature with this research method.
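A minimal sketch of the kind of regular expressions the article discusses, applied to an invented Chinese sentence; the patterns below are illustrative, not the article's actual expressions.

```python
# Illustrative sketch only: regular expressions for pulling "case number" and
# "efficacy rate" out of a Chinese-medicine style sentence. The patterns and
# the example sentence are invented.
import re

text = "本组共纳入患者120例，治疗组总有效率为93.3%，对照组总有效率为81.7%。"

case_number = re.search(r"(\d+)\s*例", text)            # e.g. "120例" -> 120 cases
efficacy_rates = re.findall(r"有效率[为是]?\s*([\d.]+)\s*%", text)

print(case_number.group(1))   # '120'
print(efficacy_rates)         # ['93.3', '81.7']
```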
Knowledge Discovery and Data Mining: An Overview
NASA Technical Reports Server (NTRS)
Fayyad, U.
1995-01-01
The process of knowledge discovery and data mining is the process of information extraction from very large databases. Its importance is described along with several techniques and considerations for selecting the most appropriate technique for extracting information from a particular data set.
Novel method of extracting motion from natural movies.
Suzuki, Wataru; Ichinohe, Noritaka; Tani, Toshiki; Hayami, Taku; Miyakawa, Naohisa; Watanabe, Satoshi; Takeichi, Hiroshige
2017-11-01
The visual system in primates can be segregated into motion and shape pathways. Interaction occurs at multiple stages along these pathways. Processing of shape-from-motion and biological motion is considered to be a higher-order integration process involving motion and shape information. However, relatively limited types of stimuli have been used in previous studies on these integration processes. We propose a new algorithm to extract object motion information from natural movies and to move random dots in accordance with the information. The object motion information is extracted by estimating the dynamics of local normal vectors of the image intensity projected onto the x-y plane of the movie. An electrophysiological experiment on two adult common marmoset monkeys (Callithrix jacchus) showed that the natural and random dot movies generated with this new algorithm yielded comparable neural responses in the middle temporal visual area. In principle, this algorithm provided random dot motion stimuli containing shape information for arbitrary natural movies. This new method is expected to expand the neurophysiological and psychophysical experimental protocols to elucidate the integration processing of motion and shape information in biological systems. The novel algorithm proposed here was effective in extracting object motion information from natural movies and provided new motion stimuli to investigate higher-order motion information processing. Copyright © 2017 The Author(s). Published by Elsevier B.V. All rights reserved.
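The exact algorithm (tracking the dynamics of local intensity normal vectors projected onto the x-y plane) is not reproduced here; the sketch below uses a standard normal-flow approximation as a rough analogue on synthetic frames, to show how such a flow field could be used to displace random dots.

```python
# Hedged sketch: a normal-flow style approximation of tracking local intensity
# normal-vector dynamics between two frames, then displacing random dots by
# the recovered flow. This is an illustration, not the authors' algorithm.
import numpy as np

def normal_flow(frame0, frame1, eps=1e-6):
    Iy, Ix = np.gradient(frame0.astype(float))        # spatial gradient (normal direction in x-y)
    It = frame1.astype(float) - frame0.astype(float)   # temporal derivative
    mag2 = Ix**2 + Iy**2 + eps
    u = -It * Ix / mag2                                 # flow component along the local normal
    v = -It * Iy / mag2
    return u, v

rng = np.random.default_rng(0)
frame0 = rng.random((64, 64))
frame1 = np.roll(frame0, shift=1, axis=1)               # scene shifted 1 px to the right
u, v = normal_flow(frame0, frame1)

# displace random dots by the flow sampled at their positions
dots = rng.integers(1, 63, size=(100, 2))                # (row, col)
dots_next = dots + np.stack([v[dots[:, 0], dots[:, 1]],
                             u[dots[:, 0], dots[:, 1]]], axis=1)
print(dots_next[:3])
```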
Tagline: Information Extraction for Semi-Structured Text Elements in Medical Progress Notes
ERIC Educational Resources Information Center
Finch, Dezon Kile
2012-01-01
Text analysis has become an important research activity in the Department of Veterans Affairs (VA). Statistical text mining and natural language processing have been shown to be very effective for extracting useful information from medical documents. However, neither of these techniques is effective at extracting the information stored in…
Wang, Hui; Zhang, Weide; Zeng, Qiang; Li, Zuofeng; Feng, Kaiyan; Liu, Lei
2014-04-01
Extracting information from unstructured clinical narratives is valuable for many clinical applications. Although natural Language Processing (NLP) methods have been profoundly studied in electronic medical records (EMR), few studies have explored NLP in extracting information from Chinese clinical narratives. In this study, we report the development and evaluation of extracting tumor-related information from operation notes of hepatic carcinomas which were written in Chinese. Using 86 operation notes manually annotated by physicians as the training set, we explored both rule-based and supervised machine-learning approaches. Evaluating on unseen 29 operation notes, our best approach yielded 69.6% in precision, 58.3% in recall and 63.5% F-score. Copyright © 2014 Elsevier Inc. All rights reserved.
Zhu, Hongchun; Cai, Lijie; Liu, Haiying; Huang, Wei
2016-01-01
Multi-scale image segmentation and the selection of optimal segmentation parameters are the key processes in the object-oriented information extraction of high-resolution remote sensing images. The accuracy of remote sensing special subject information depends on this extraction. On the basis of WorldView-2 high-resolution data, and focusing on the optimal segmentation parameter method for object-oriented image segmentation and high-resolution image information extraction, the following processes were conducted in this study. Firstly, the best combination of bands and weights was determined for the information extraction of the high-resolution remote sensing image. An improved weighted mean-variance method was proposed and used to calculate the optimal segmentation scale. Thereafter, the best shape factor parameter and compact factor parameters were computed with the use of control variables and the combination of heterogeneity and homogeneity indexes. Different types of image segmentation parameters were obtained according to the surface features. The high-resolution remote sensing images were multi-scale segmented with the optimal segmentation parameters. A hierarchical network structure was established by setting the information extraction rules to achieve object-oriented information extraction. This study presents an effective and practical method that can explain expert input judgment by reproducible quantitative measurements. Furthermore, the results of this procedure may be incorporated into a classification scheme. PMID:27362762
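The paper's improved weighted mean-variance formulation is not given in the abstract; as a loose, assumption-laden sketch of the general idea, the snippet below scores candidate segmentation scales by an area-weighted variance computed from hypothetical per-segment statistics.

```python
# Hedged sketch: a generic area-weighted variance score, in the spirit of a
# weighted mean-variance criterion for choosing a segmentation scale. The
# exact "improved" formulation in the paper is not reproduced here; the
# per-scale segment statistics below are illustrative placeholders.
import numpy as np

def weighted_variance(segment_areas, segment_variances):
    areas = np.asarray(segment_areas, dtype=float)
    variances = np.asarray(segment_variances, dtype=float)
    return float(np.sum(areas * variances) / np.sum(areas))

# hypothetical per-scale statistics from trial segmentations
candidate_scales = [25, 50, 75, 100]
stats = {
    25:  ([12, 30, 8, 50], [4.1, 3.8, 5.0, 3.2]),
    50:  ([40, 60, 45],    [6.0, 5.5, 6.4]),
    75:  ([90, 55],        [8.9, 7.7]),
    100: ([145],           [11.2]),
}

scores = {s: weighted_variance(*stats[s]) for s in candidate_scales}
best_scale = min(scores, key=scores.get)   # here: smallest weighted variance
print(scores, "->", best_scale)
```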
Ontology-Based Information Extraction for Business Intelligence
NASA Astrophysics Data System (ADS)
Saggion, Horacio; Funk, Adam; Maynard, Diana; Bontcheva, Kalina
Business Intelligence (BI) requires the acquisition and aggregation of key pieces of knowledge from multiple sources in order to provide valuable information to customers or feed statistical BI models and tools. The massive amount of information available to business analysts makes information extraction and other natural language processing tools key enablers for the acquisition and use of that semantic information. We describe the application of ontology-based extraction and merging in the context of a practical e-business application for the EU MUSING Project where the goal is to gather international company intelligence and country/region information. The results of our experiments so far are very promising and we are now in the process of building a complete end-to-end solution.
Evaluation of Ultrasonic Fiber Structure Extraction Technique Using Autopsy Specimens of Liver
NASA Astrophysics Data System (ADS)
Yamaguchi, Tadashi; Hirai, Kazuki; Yamada, Hiroyuki; Ebara, Masaaki; Hachiya, Hiroyuki
2005-06-01
It is very important to diagnose liver cirrhosis noninvasively and correctly. In our previous studies, we proposed a processing technique to detect changes in liver tissue in vivo. In this paper, we propose the evaluation of the relationship between liver disease and echo information using autopsy specimens of a human liver in vitro. It is possible to verify the function of a processing parameter clearly and to compare the processing result and the actual human liver tissue structure by in vitro experiment. In the results of our processing technique, information that did not obey a Rayleigh distribution from the echo signal of the autopsy liver specimens was extracted depending on changes in a particular processing parameter. The fiber tissue structure of the same specimen was extracted from a number of histological images of stained tissue. We constructed 3D structures using the information extracted from the echo signal and the fiber structure of the stained tissue and compared the two. By comparing the 3D structures, it is possible to evaluate the relationship between the information that does not obey a Rayleigh distribution of the echo signal and the fibrosis structure.
The Agent of extracting Internet Information with Lead Order
NASA Astrophysics Data System (ADS)
Mo, Zan; Huang, Chuliang; Liu, Aijun
In order to carry out e-commerce better, advanced technologies for accessing business information are urgently needed. An agent is described to deal with the problems of extracting Internet information caused by the non-standard and inconsistent structure of Chinese websites. The agent includes three modules, each corresponding to a step of the extraction process. An HTTP-tree method and a Lead algorithm are proposed to generate a lead order, with which the required web pages can be retrieved easily. How to transform the extracted natural-language information into a structured form is also discussed.
Two-dimensional thermal video analysis of offshore bird and bat flight
Matzner, Shari; Cullinan, Valerie I.; Duberstein, Corey A.
2015-09-11
Thermal infrared video can provide essential information about bird and bat presence and activity for risk assessment studies, but the analysis of recorded video can be time-consuming and may not extract all of the available information. Automated processing makes continuous monitoring over extended periods of time feasible, and maximizes the information provided by video. This is especially important for collecting data in remote locations that are difficult for human observers to access, such as proposed offshore wind turbine sites. We present guidelines for selecting an appropriate thermal camera based on environmental conditions and the physical characteristics of the target animals. We developed new video image processing algorithms that automate the extraction of bird and bat flight tracks from thermal video, and that characterize the extracted tracks to support animal identification and behavior inference. The algorithms use a video peak store process followed by background masking and perceptual grouping to extract flight tracks. The extracted tracks are automatically quantified in terms that could then be used to infer animal type and possibly behavior. The developed automated processing generates results that are reproducible and verifiable, and reduces the total amount of video data that must be retained and reviewed by human experts. Finally, we suggest models for interpreting thermal imaging information.
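A minimal sketch of the described pipeline on synthetic frames: a video peak-store image, median-background masking, and connected-component labelling as a simple stand-in for perceptual grouping. The thresholds and the labelling step are assumptions, not the published algorithm.

```python
# Sketch of a peak-store / background-mask / grouping pipeline on synthetic
# thermal frames. Thresholds and grouping method are illustrative only.
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(1)
video = rng.normal(20.0, 0.5, size=(200, 64, 64))      # synthetic thermal frames
for t in range(60, 90):                                 # a warm "bird" crossing the frame
    video[t, 32, t - 30] += 8.0

peak_store = video.max(axis=0)                          # per-pixel maximum over time
background = np.median(video, axis=0)
mask = (peak_store - background) > 3.0                  # background masking

# stand-in for perceptual grouping: group neighbouring hot pixels into tracks
labels, n_tracks = ndimage.label(mask)
for i in range(1, n_tracks + 1):
    rows, cols = np.nonzero(labels == i)
    length = np.hypot(rows.max() - rows.min(), cols.max() - cols.min())
    print(f"track {i}: {rows.size} pixels, extent {length:.1f} px")
```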
An Ontology-Based Approach to Incorporate User-Generated Geo-Content Into Sdi
NASA Astrophysics Data System (ADS)
Deng, D.-P.; Lemmens, R.
2011-08-01
The Web is changing the way people share and communicate information because of the emergence of various Web technologies, which enable people to contribute information on the Web. User-Generated Geo-Content (UGGC) is a potential resource of geographic information. Due to the different production methods, UGGC often cannot fit into a geographic information model; there is a semantic gap between UGGC and formal geographic information. To integrate UGGC into geographic information, this study conducts an ontology-based process to bridge this semantic gap. This ontology-based process includes five steps: Collection, Extraction, Formalization, Mapping, and Deployment. In addition, this study applies the process to Twitter messages relevant to the Japan earthquake disaster. Using this process, we extract disaster relief information from Twitter messages and develop a knowledge base for GeoSPARQL queries on disaster relief information.
NASA Astrophysics Data System (ADS)
Brook, A.; Cristofani, E.; Vandewal, M.; Matheis, C.; Jonuscheit, J.; Beigang, R.
2012-05-01
The present study proposes a fully integrated, semi-automatic and near real-time mode-operated image processing methodology developed for Frequency-Modulated Continuous-Wave (FMCW) THz images with center frequencies around 100 GHz and 300 GHz. The quality control of aeronautics composite multi-layered materials and structures using Non-Destructive Testing is the main focus of this work. Image processing is applied to the 3-D images to extract useful information. The data is processed by extracting areas of interest. The detected areas are subjected to image analysis for more detailed investigation managed by a spatial model. Finally, the post-processing stage examines and evaluates the spatial accuracy of the extracted information.
Optimal protocol for maximum work extraction in a feedback process with a time-varying potential
NASA Astrophysics Data System (ADS)
Kwon, Chulan
2017-12-01
The nonequilibrium nature of information thermodynamics is characterized by the inequality or non-negativity of the total entropy change of the system, memory, and reservoir. Mutual information change plays a crucial role in the inequality, in particular if work is extracted and the paradox of Maxwell's demon is raised. We consider the Brownian information engine where the protocol set of the harmonic potential is initially chosen by the measurement and varies in time. We confirm the inequality of the total entropy change by calculating, in detail, the entropic terms including the mutual information change. We rigorously find the optimal values of the time-dependent protocol for maximum extraction of work both for the finite-time and the quasi-static process.
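For reference, a generic statement of the measurement-feedback bound the abstract alludes to, in standard notation assumed here rather than quoted from the paper:

```latex
% Generic statement of the measurement-feedback (Maxwell's demon) bound:
% the average extracted work cannot exceed the free-energy decrease plus
% k_B T times the mutual information I acquired by the measurement.
% Notation is assumed, not taken from the paper.
\[
  \langle W_{\mathrm{ext}} \rangle \;\le\; -\,\Delta F \;+\; k_{B} T \, I
\]
```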
A Comparative Analysis of Extract, Transformation and Loading (ETL) Process
NASA Astrophysics Data System (ADS)
Runtuwene, J. P. A.; Tangkawarow, I. R. H. T.; Manoppo, C. T. M.; Salaki, R. J.
2018-02-01
The current growth of data and information occurs rapidly, in varying amounts and media. This type of development will eventually produce very large volumes of data, better known as Big Data. Business Intelligence (BI) utilizes large amounts of data and information for analysis so that one can obtain important information. This type of information can be used to support the decision-making process. In practice, a process integrating existing data and information into a data warehouse is needed. This data integration process is known as Extract, Transformation and Loading (ETL). In practice, many applications have been developed to carry out the ETL process, but selecting which applications are more time-, cost- and power-effective and efficient can be a challenge. Therefore, the objective of the study was to provide a comparative analysis between the ETL process using Microsoft SQL Server Integration Service (SSIS) and one using Pentaho Data Integration (PDI).
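As a tool-agnostic illustration of the three ETL stages being compared (the study itself compares SSIS and PDI rather than hand-written code), a minimal Python sketch:

```python
# Minimal, tool-agnostic ETL sketch (illustrative only): extract raw rows,
# transform types and codes, load into a warehouse table.
import pandas as pd
import sqlite3

# Extract: pull raw rows from a hypothetical source
source = pd.DataFrame({
    "order_id": [1, 2, 3],
    "amount":   ["10.5", "7.0", "12.25"],
    "country":  ["id", "ID", "sg"],
})

# Transform: clean types and normalise codes
transformed = source.assign(
    amount=source["amount"].astype(float),
    country=source["country"].str.upper(),
)

# Load: write into a data-warehouse table and query it back
warehouse = sqlite3.connect(":memory:")
transformed.to_sql("fact_sales", warehouse, index=False, if_exists="replace")
print(pd.read_sql(
    "SELECT country, SUM(amount) AS total FROM fact_sales GROUP BY country",
    warehouse,
))
```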
ERIC Educational Resources Information Center
Chowdhury, Gobinda G.
2003-01-01
Discusses issues related to natural language processing, including theoretical developments; natural language understanding; tools and techniques; natural language text processing systems; abstracting; information extraction; information retrieval; interfaces; software; Internet, Web, and digital library applications; machine translation for…
Quantum-entanglement storage and extraction in quantum network node
NASA Astrophysics Data System (ADS)
Shan, Zhuoyu; Zhang, Yong
Quantum computing and quantum communication have become highly popular research topics. Nitrogen-vacancy (NV) centers in diamond have been shown to offer great advantages for implementing quantum information processing. The generation of entanglement between NV centers represents a fundamental prerequisite for all quantum information technologies. In this paper, we propose a scheme to realize high-fidelity storage and extraction of quantum entanglement information based on NV centers at room temperature. We store the entangled information of a pair of entangled photons in the Bell state into the nuclear spins of two NV centers, which makes these two NV centers entangled. We then illustrate how to extract the entangled information from the NV centers to prepare on-demand entangled states for optical quantum information processing. The strategy of engineering entanglement demonstrated here may pave the way towards an NV-center-based quantum network.
Information Fusion for Feature Extraction and the Development of Geospatial Information
2004-07-01
of automated processing. 2. Requirements for Geospatial Information. Accurate, timely geospatial information is critical for many military...this evaluation illustrates some of the difficulties in comparing manual and automated processing results (figure 5). The automated delineation of
Video image processing to create a speed sensor
DOT National Transportation Integrated Search
1999-11-01
Image processing has been applied to traffic analysis in recent years, with different goals. In the report, a new approach is presented for extracting vehicular speed information, given a sequence of real-time traffic images. We extract moving edges ...
Buckley, Julliette M; Coopey, Suzanne B; Sharko, John; Polubriaginof, Fernanda; Drohan, Brian; Belli, Ahmet K; Kim, Elizabeth M H; Garber, Judy E; Smith, Barbara L; Gadd, Michele A; Specht, Michelle C; Roche, Constance A; Gudewicz, Thomas M; Hughes, Kevin S
2012-01-01
The opportunity to integrate clinical decision support systems into clinical practice is limited due to the lack of structured, machine readable data in the current format of the electronic health record. Natural language processing has been designed to convert free text into machine readable data. The aim of the current study was to ascertain the feasibility of using natural language processing to extract clinical information from >76,000 breast pathology reports. APPROACH AND PROCEDURE: Breast pathology reports from three institutions were analyzed using natural language processing software (Clearforest, Waltham, MA) to extract information on a variety of pathologic diagnoses of interest. Data tables were created from the extracted information according to date of surgery, side of surgery, and medical record number. The variety of ways in which each diagnosis could be represented was recorded, as a means of demonstrating the complexity of machine interpretation of free text. There was widespread variation in how pathologists reported common pathologic diagnoses. We report, for example, 124 ways of saying invasive ductal carcinoma and 95 ways of saying invasive lobular carcinoma. There were >4000 ways of saying invasive ductal carcinoma was not present. Natural language processor sensitivity and specificity were 99.1% and 96.5% when compared to expert human coders. We have demonstrated how a large body of free text medical information such as seen in breast pathology reports, can be converted to a machine readable format using natural language processing, and described the inherent complexities of the task.
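To illustrate why free-text variants and negation make this hard, the sketch below matches a couple of invented surface forms of one diagnosis and applies a crude negation check; it is not the ClearForest pipeline used in the study.

```python
# Illustrative sketch of the underlying difficulty: many surface variants map
# to one concept, and negation must be handled. Variants, negation cues and
# reports below are invented; this is not the study's NLP system.
import re

IDC_VARIANTS = r"(invasive|infiltrating)\s+ductal\s+(carcinoma|ca)\b"
NEGATION = r"\b(no|negative for|without evidence of)\b[^.]{0,40}"

def has_idc(report):
    text = report.lower()
    if re.search(NEGATION + IDC_VARIANTS, text):
        return False          # diagnosis mentioned only in a negated context
    return re.search(IDC_VARIANTS, text) is not None

reports = [
    "Final diagnosis: Infiltrating ductal CA, grade 2.",
    "Sections show DCIS; no invasive ductal carcinoma identified.",
]
print([has_idc(r) for r in reports])   # [True, False]
```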
Automatic extraction of pavement markings on streets from point cloud data of mobile LiDAR
NASA Astrophysics Data System (ADS)
Gao, Yang; Zhong, Ruofei; Tang, Tao; Wang, Liuzhao; Liu, Xianlin
2017-08-01
Pavement markings provide an important foundation as they help to keep roads users safe. Accurate and comprehensive information about pavement markings assists the road regulators and is useful in developing driverless technology. Mobile light detection and ranging (LiDAR) systems offer new opportunities to collect and process accurate pavement markings’ information. Mobile LiDAR systems can directly obtain the three-dimensional (3D) coordinates of an object, thus defining spatial data and the intensity of (3D) objects in a fast and efficient way. The RGB attribute information of data points can be obtained based on the panoramic camera in the system. In this paper, we present a novel method process to automatically extract pavement markings using multiple attribute information of the laser scanning point cloud from the mobile LiDAR data. This method process utilizes a differential grayscale of RGB color, laser pulse reflection intensity, and the differential intensity to identify and extract pavement markings. We utilized point cloud density to remove the noise and used morphological operations to eliminate the errors. In the application, we tested our method process on different sections of roads in Beijing, China, and Buffalo, NY, USA. The results indicated that both correctness (p) and completeness (r) were higher than 90%. The method process of this research can be applied to extract pavement markings from huge point cloud data produced by mobile LiDAR.
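A hedged sketch of the multi-attribute filtering idea (an intensity threshold, an RGB-derived grey difference, then a point-density filter to drop sparse noise); field names, thresholds and the synthetic points are illustrative, not the paper's values.

```python
# Hedged sketch of multi-attribute filtering for pavement-marking candidates.
# Thresholds and data are illustrative; the paper's exact differential
# grayscale/intensity method and morphological clean-up are not reproduced.
import numpy as np

rng = np.random.default_rng(2)
n = 10_000
points = {
    "xyz":       rng.uniform(0, 50, size=(n, 3)),
    "intensity": rng.uniform(0, 255, size=n),
    "rgb":       rng.uniform(0, 255, size=(n, 3)),
}

grey = points["rgb"].mean(axis=1)
road_grey = np.median(grey)                         # rough road-surface grey level
candidate = (points["intensity"] > 180) & (grey - road_grey > 40)

# density filter: keep candidates with enough candidate neighbours in 2-D
xy = points["xyz"][candidate, :2]
keep = np.zeros(len(xy), dtype=bool)
for i, p in enumerate(xy):
    keep[i] = np.sum(np.linalg.norm(xy - p, axis=1) < 0.5) >= 5
markings = xy[keep]
print(candidate.sum(), "candidates,", len(markings), "after density filtering")
```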
Electronic processing of informed consents in a global pharmaceutical company environment.
Vishnyakova, Dina; Gobeill, Julien; Oezdemir-Zaech, Fatma; Kreim, Olivier; Vachon, Therese; Clade, Thierry; Haenning, Xavier; Mikhailov, Dmitri; Ruch, Patrick
2014-01-01
We present an electronic capture tool to process informed consents, which must be recorded when running a clinical trial. This tool aims at the extraction of information expressing the duration of the consent given by the patient to authorize the exploitation of biomarker-related information collected during clinical trials. The system integrates a language detection module (LDM) to route a document to the appropriate information extraction module (IEM). The IEM is based on language-specific sets of linguistic rules for the identification of relevant textual facts. The achieved accuracy of both the LDM and IEM is 99%. The architecture of the system is described in detail.
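A minimal sketch of the LDM-to-IEM routing described above, with invented keyword-based language detection and toy consent-duration rules; the production rule sets and classifier are not reproduced here.

```python
# Sketch of LDM -> IEM routing. The language detector, rules and example
# sentences are invented stand-ins for the system's language-specific rules.
import re

def detect_language(text):
    # toy LDM: keyword-based routing (a real system would use a classifier)
    return "de" if re.search(r"\b(Einwilligung|Jahre)\b", text) else "en"

IEM_RULES = {
    "en": re.compile(r"consent.{0,60}?for\s+(?P<years>\d+)\s+years", re.I | re.S),
    "de": re.compile(r"Einwilligung.{0,60}?für\s+(?P<years>\d+)\s+Jahre", re.I | re.S),
}

def extract_consent_duration(text):
    lang = detect_language(text)
    match = IEM_RULES[lang].search(text)
    return (lang, int(match.group("years"))) if match else (lang, None)

print(extract_consent_duration("The patient gives consent to store samples for 15 years."))
print(extract_consent_duration("Der Patient erteilt die Einwilligung für 10 Jahre."))
```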
An Overview of Biomolecular Event Extraction from Scientific Documents
Vanegas, Jorge A.; Matos, Sérgio; González, Fabio; Oliveira, José L.
2015-01-01
This paper presents a review of state-of-the-art approaches to automatic extraction of biomolecular events from scientific texts. Events involving biomolecules such as genes, transcription factors, or enzymes, for example, have a central role in biological processes and functions and provide valuable information for describing physiological and pathogenesis mechanisms. Event extraction from biomedical literature has a broad range of applications, including support for information retrieval, knowledge summarization, and information extraction and discovery. However, automatic event extraction is a challenging task due to the ambiguity and diversity of natural language and higher-level linguistic phenomena, such as speculations and negations, which occur in biological texts and can lead to misunderstanding or incorrect interpretation. Many strategies have been proposed in the last decade, originating from different research areas such as natural language processing, machine learning, and statistics. This review summarizes the most representative approaches in biomolecular event extraction and presents an analysis of the current state of the art and of commonly used methods, features, and tools. Finally, current research trends and future perspectives are also discussed. PMID:26587051
An information extraction framework for cohort identification using electronic health records.
Liu, Hongfang; Bielinski, Suzette J; Sohn, Sunghwan; Murphy, Sean; Wagholikar, Kavishwar B; Jonnalagadda, Siddhartha R; Ravikumar, K E; Wu, Stephen T; Kullo, Iftikhar J; Chute, Christopher G
2013-01-01
Information extraction (IE), a natural language processing (NLP) task that automatically extracts structured or semi-structured information from free text, has become popular in the clinical domain for supporting automated systems at point-of-care and enabling secondary use of electronic health records (EHRs) for clinical and translational research. However, a high performance IE system can be very challenging to construct due to the complexity and dynamic nature of human language. In this paper, we report an IE framework for cohort identification using EHRs that is a knowledge-driven framework developed under the Unstructured Information Management Architecture (UIMA). A system to extract specific information can be developed by subject matter experts through expert knowledge engineering of the externalized knowledge resources used in the framework.
Bao, X Y; Huang, W J; Zhang, K; Jin, M; Li, Y; Niu, C Z
2018-04-18
There is a huge amount of diagnostic and treatment information in electronic medical records (EMR), which is a concrete manifestation of clinicians' actual diagnosis and treatment details. Many episodes in EMRs, such as complaints, present illness, past history, differential diagnosis, diagnostic imaging and surgical records, reflect details of diagnosis and treatment in the clinical process and are written as Chinese natural-language narrative. How to extract effective information from these Chinese narrative text data, and organize it into tabular form for medical research analysis so that real-world clinical data can be put to practical use, is a difficult problem in Chinese medical data processing. Based on EMR narrative text data from a tertiary hospital in China, a method of customized information-extraction rule learning and rule-based information extraction is proposed. The overall method consists of three steps. (1) Step 1: a random sample of 600 copies of electronic medical record data (including history of present illness, past history, personal history, family history, etc.) was extracted as the raw corpus. Using our Chinese clinical narrative text annotation platform, trained clinicians and nurses marked the tokens and phrases in the corpus that were to be extracted (with history of diabetes as an example). (2) Step 2: based on the annotated clinical text corpus, extraction templates were first summarized and induced. These templates were then rewritten as extraction rules using regular expressions in the Perl programming language. Using these extraction rules as the basic knowledge base, we developed extraction packages in Perl for extracting data from the EMR text. Finally, the extracted data items were organized in tabular format for later use in clinical research or hospital surveillance. (3) As the final step, the proposed methods were evaluated and validated in the National Clinical Service Data Integration Platform, and we checked the extraction results with combined manual and automated verification, which demonstrated the effectiveness of the method. For all patients with diabetes as the diagnosed disease in the Department of Endocrinology of the hospital, 1,436 of whom were discharged in 2015, extraction of diabetes history from the medical history episodes achieved a recall rate of 87.6%, an accuracy rate of 99.5%, and an F-score of 0.93. For a 10% sample of patients with diabetes (1,223 patients in total) discharged by August 2017 in the same department, the extracted diabetes history achieved a recall rate of 89.2%, an accuracy rate of 99.2%, and an F-score of 0.94. This study mainly adopts a combination of natural language processing and rule-based information extraction, and designs and implements an algorithm for extracting customized information from unstructured Chinese electronic medical record text data. It achieves better results than existing work.
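A small sketch in the spirit of the described rule-based extraction (the pattern and example sentence are invented, and the study's rules are written in Perl rather than Python), together with a check that the reported recall and accuracy figures reproduce the stated F-scores when the accuracy rate is treated as precision.

```python
# Hedged sketch: a regular-expression rule of the kind described (pattern and
# example sentence are invented, not the study's rule base), plus an F-score
# consistency check using the reported figures, treating the accuracy rate
# as precision.
import re

history_text = "患者既往有2型糖尿病病史5年，口服二甲双胍治疗。"
diabetes_rule = re.compile(r"(糖尿病)(病史)?\s*(?P<years>\d+)?\s*年?")
m = diabetes_rule.search(history_text)
print(m.group(1), m.group("years"))        # 糖尿病 5

def f_score(precision, recall):
    return 2 * precision * recall / (precision + recall)

print(round(f_score(0.995, 0.876), 2))     # 0.93, matching the 2015 cohort
print(round(f_score(0.992, 0.892), 2))     # 0.94, matching the 2017 sample
```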
Naviglio, Daniele; Formato, Andrea; Gallo, Monica
2014-09-01
The purpose of this study is to compare the extraction process for the production of China elixir starting from the same vegetable mixture, as performed by conventional maceration or a cyclically pressurized extraction process (rapid solid-liquid dynamic extraction) using the Naviglio Extractor. Dry residue was used as a marker for the kinetics of the extraction process because it was proportional to the amount of active principles extracted and, therefore, to their total concentration in the solution. UV spectra of the hydroalcoholic extracts allowed for the identification of the predominant chemical species in the extracts, while the organoleptic tests carried out on the final product provided an indication of the acceptance of the beverage and highlighted features that were not detectable by instrumental analytical techniques. In addition, a numerical simulation of the process has been performed, obtaining useful information about the timing of the process (time history) as well as its mathematical description. © 2014 Institute of Food Technologists®
A Semantic Approach for Geospatial Information Extraction from Unstructured Documents
NASA Astrophysics Data System (ADS)
Sallaberry, Christian; Gaio, Mauro; Lesbegueries, Julien; Loustau, Pierre
Local cultural heritage document collections are characterized by their content, which is strongly attached to a territory and its land history (i.e., geographical references). Our contribution aims at making the content retrieval process more efficient whenever a query includes geographic criteria. We propose a core model for a formal representation of geographic information. It takes into account characteristics of different modes of expression, such as written language, captures of drawings, maps, photographs, etc. We have developed a prototype that fully implements geographic information extraction (IE) and geographic information retrieval (IR) processes. All PIV prototype processing resources are designed as Web Services. We propose a geographic IE process based on semantic treatment as a supplement to classical IE approaches. We implement geographic IR by using intersection computing algorithms that seek out any intersection between formal geocoded representations of geographic information in a user query and similar representations in document collection indexes.
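A minimal sketch of the intersection computation step, using axis-aligned bounding boxes as a simplified stand-in for the PIV system's formal geocoded representations; the document footprints and query are invented.

```python
# Sketch: intersect a query footprint with geocoded index footprints.
# Bounding boxes are a simplification; the actual system uses richer
# geographic representations.
def intersects(a, b):
    # boxes as (min_lon, min_lat, max_lon, max_lat)
    return not (a[2] < b[0] or b[2] < a[0] or a[3] < b[1] or b[3] < a[1])

index = {
    "doc_pau_valley": (-0.45, 43.25, -0.25, 43.40),
    "doc_ossau":      (-0.50, 42.80, -0.30, 43.05),
}
query_footprint = (-0.40, 43.20, -0.20, 43.35)   # e.g. a query "around Pau"

hits = [doc for doc, box in index.items() if intersects(box, query_footprint)]
print(hits)    # ['doc_pau_valley']
```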
Glycerol extracting dealcoholization for the biodiesel separation process.
Ye, Jianchu; Sha, Yong; Zhang, Yun; Yuan, Yunlong; Wu, Housheng
2011-04-01
Using sunflower oil and Jatropha oil as raw oils, biodiesel transesterification production and multi-stage extracting separation were carried out experimentally. Results indicate that dealcoholized crude glycerol can be utilized as the extracting agent to achieve effective separation of methanol from the methyl ester phase, and the glycerol content in the dealcoholized methyl esters is as low as 0.02 wt.%. For the biodiesel separation process utilizing glycerol extracting dealcoholization, its technical and equipment information was acquired through rigorous process simulation and compared with the traditional biodiesel distillation separation process; the results show that its energy consumption decreases by about 35% relative to that of the distillation separation process. Glycerol extracting dealcoholization is sufficiently feasible and advantageous for the biodiesel separation process. Copyright © 2011 Elsevier Ltd. All rights reserved.
KneeTex: an ontology-driven system for information extraction from MRI reports.
Spasić, Irena; Zhao, Bo; Jones, Christopher B; Button, Kate
2015-01-01
In the realm of knee pathology, magnetic resonance imaging (MRI) has the advantage of visualising all structures within the knee joint, which makes it a valuable tool for increasing diagnostic accuracy and planning surgical treatments. Therefore, clinical narratives found in MRI reports convey valuable diagnostic information. A range of studies have proven the feasibility of natural language processing for information extraction from clinical narratives. However, no study focused specifically on MRI reports in relation to knee pathology, possibly due to the complexity of knee anatomy and a wide range of conditions that may be associated with different anatomical entities. In this paper we describe KneeTex, an information extraction system that operates in this domain. As an ontology-driven information extraction system, KneeTex makes active use of an ontology to strongly guide and constrain text analysis. We used automatic term recognition to facilitate the development of a domain-specific ontology with sufficient detail and coverage for text mining applications. In combination with the ontology, high regularity of the sublanguage used in knee MRI reports allowed us to model its processing by a set of sophisticated lexico-semantic rules with minimal syntactic analysis. The main processing steps involve named entity recognition combined with coordination, enumeration, ambiguity and co-reference resolution, followed by text segmentation. Ontology-based semantic typing is then used to drive the template filling process. We adopted an existing ontology, TRAK (Taxonomy for RehAbilitation of Knee conditions), for use within KneeTex. The original TRAK ontology expanded from 1,292 concepts, 1,720 synonyms and 518 relationship instances to 1,621 concepts, 2,550 synonyms and 560 relationship instances. This provided KneeTex with a very fine-grained lexico-semantic knowledge base, which is highly attuned to the given sublanguage. Information extraction results were evaluated on a test set of 100 MRI reports. A gold standard consisted of 1,259 filled template records with the following slots: finding, finding qualifier, negation, certainty, anatomy and anatomy qualifier. KneeTex extracted information with precision of 98.00 %, recall of 97.63 % and F-measure of 97.81 %, the values of which are in line with human-like performance. KneeTex is an open-source, stand-alone application for information extraction from narrative reports that describe an MRI scan of the knee. Given an MRI report as input, the system outputs the corresponding clinical findings in the form of JavaScript Object Notation objects. The extracted information is mapped onto TRAK, an ontology that formally models knowledge relevant for the rehabilitation of knee conditions. As a result, formally structured and coded information allows for complex searches to be conducted efficiently over the original MRI reports, thereby effectively supporting epidemiologic studies of knee conditions.
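An illustrative sketch of the output format described (one filled template record per finding, emitted as a JSON object); the toy matching rule and slot values are invented, not KneeTex's lexico-semantic rules.

```python
# Sketch of the described template-filling output: one JSON object per
# finding with the slots listed in the abstract. The matching logic here is
# a toy stand-in, not the system's ontology-driven rules.
import json
import re

sentence = "There is a complete tear of the anterior cruciate ligament."

record = {
    "finding": "tear",
    "finding_qualifier": "complete" if re.search(r"\bcomplete\b", sentence, re.I) else None,
    "negation": bool(re.search(r"\bno\b|\bwithout\b", sentence, re.I)),
    "certainty": "definite",
    "anatomy": "anterior cruciate ligament",
    "anatomy_qualifier": None,
}
print(json.dumps(record, indent=2))
```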
Limited Role of Contextual Information in Adult Word Recognition. Technical Report No. 411.
ERIC Educational Resources Information Center
Durgunoglu, Aydin Y.
Recognizing a word in a meaningful text involves processes that combine information from many different sources, and both bottom-up processes (such as feature extraction and letter recognition) and top-down processes (contextual information) are thought to interact when skilled readers recognize words. Two similar experiments investigated word…
Integrating Information Extraction Agents into a Tourism Recommender System
NASA Astrophysics Data System (ADS)
Esparcia, Sergio; Sánchez-Anguix, Víctor; Argente, Estefanía; García-Fornes, Ana; Julián, Vicente
Recommender systems face some problems. On the one hand, information needs to be kept up to date, which can be a costly task if it is not performed automatically. On the other hand, it may be interesting to include third-party services in the recommendation, since they improve its quality. In this paper, we present an add-on for the Social-Net Tourism Recommender System that uses information extraction and natural language processing techniques to automatically extract and classify information from the Web. Its goal is to keep the system updated and to obtain information about third-party services that are not offered by service providers inside the system.
Valdez, Joshua; Rueschman, Michael; Kim, Matthew; Redline, Susan; Sahoo, Satya S
2016-10-01
Extraction of structured information from biomedical literature is a complex and challenging problem due to the complexity of biomedical domain and lack of appropriate natural language processing (NLP) techniques. High quality domain ontologies model both data and metadata information at a fine level of granularity, which can be effectively used to accurately extract structured information from biomedical text. Extraction of provenance metadata, which describes the history or source of information, from published articles is an important task to support scientific reproducibility. Reproducibility of results reported by previous research studies is a foundational component of scientific advancement. This is highlighted by the recent initiative by the US National Institutes of Health called "Principles of Rigor and Reproducibility". In this paper, we describe an effective approach to extract provenance metadata from published biomedical research literature using an ontology-enabled NLP platform as part of the Provenance for Clinical and Healthcare Research (ProvCaRe). The ProvCaRe-NLP tool extends the clinical Text Analysis and Knowledge Extraction System (cTAKES) platform using both provenance and biomedical domain ontologies. We demonstrate the effectiveness of ProvCaRe-NLP tool using a corpus of 20 peer-reviewed publications. The results of our evaluation demonstrate that the ProvCaRe-NLP tool has significantly higher recall in extracting provenance metadata as compared to existing NLP pipelines such as MetaMap.
Challenges in Extracting Information From Large Hydrogeophysical-monitoring Datasets
NASA Astrophysics Data System (ADS)
Day-Lewis, F. D.; Slater, L. D.; Johnson, T.
2012-12-01
Over the last decade, new automated geophysical data-acquisition systems have enabled collection of increasingly large and information-rich geophysical datasets. Concurrent advances in field instrumentation, web services, and high-performance computing have made real-time processing, inversion, and visualization of large three-dimensional tomographic datasets practical. Geophysical-monitoring datasets have provided high-resolution insights into diverse hydrologic processes including groundwater/surface-water exchange, infiltration, solute transport, and bioremediation. Despite the high information content of such datasets, extraction of quantitative or diagnostic hydrologic information is challenging. Visual inspection and interpretation for specific hydrologic processes is difficult for datasets that are large, complex, and (or) affected by forcings (e.g., seasonal variations) unrelated to the target hydrologic process. New strategies are needed to identify salient features in spatially distributed time-series data and to relate temporal changes in geophysical properties to hydrologic processes of interest while effectively filtering unrelated changes. Here, we review recent work using time-series and digital-signal-processing approaches in hydrogeophysics. Examples include applications of cross-correlation, spectral, and time-frequency (e.g., wavelet and Stockwell transforms) approaches to (1) identify salient features in large geophysical time series; (2) examine correlation or coherence between geophysical and hydrologic signals, even in the presence of non-stationarity; and (3) condense large datasets while preserving information of interest. Examples demonstrate analysis of large time-lapse electrical tomography and fiber-optic temperature datasets to extract information about groundwater/surface-water exchange and contaminant transport.
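As a short example of the first of these approaches, the sketch below cross-correlates a synthetic geophysical series with a lagged hydrologic forcing to recover the lag; real data would of course need detrending and careful preprocessing.

```python
# Sketch of one listed approach: cross-correlating a geophysical time series
# with a hydrologic forcing to find a lag. Series are synthetic.
import numpy as np

rng = np.random.default_rng(3)
t = np.arange(500)
stage = np.sin(2 * np.pi * t / 50) + 0.1 * rng.standard_normal(500)   # river stage
resistivity = np.roll(stage, 7) + 0.1 * rng.standard_normal(500)       # lagged response

a = (stage - stage.mean()) / stage.std()
b = (resistivity - resistivity.mean()) / resistivity.std()
xcorr = np.correlate(b, a, mode="full") / len(a)
lags = np.arange(-len(a) + 1, len(a))
print("best lag:", lags[np.argmax(xcorr)])    # ~7 samples
```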
Brain Dynamics Sustaining Rapid Rule Extraction from Speech
ERIC Educational Resources Information Center
de Diego-Balaguer, Ruth; Fuentemilla, Lluis; Rodriguez-Fornells, Antoni
2011-01-01
Language acquisition is a complex process that requires the synergic involvement of different cognitive functions, which include extracting and storing the words of the language and their embedded rules for progressive acquisition of grammatical information. As has been shown in other fields that study learning processes, synchronization…
Parallel Visualization Co-Processing of Overnight CFD Propulsion Applications
NASA Technical Reports Server (NTRS)
Edwards, David E.; Haimes, Robert
1999-01-01
An interactive visualization system pV3 is being developed for the investigation of advanced computational methodologies employing visualization and parallel processing for the extraction of information contained in large-scale transient engineering simulations. Visual techniques for extracting information from the data in terms of cutting planes, iso-surfaces, particle tracing and vector fields are included in this system. This paper discusses improvements to the pV3 system developed under NASA's Affordable High Performance Computing project.
Semantic extraction and processing of medical records for patient-oriented visual index
NASA Astrophysics Data System (ADS)
Zheng, Weilin; Dong, Wenjie; Chen, Xiangjiao; Zhang, Jianguo
2012-02-01
To gain a comprehensive and complete understanding of a patient's healthcare status, doctors need to search the patient's medical records across different healthcare information systems, such as PACS, RIS, HIS and USIS, as a reference for diagnosis and treatment decisions. However, these procedures are time-consuming and tedious. To address this problem, we developed a patient-oriented visual index system (VIS) that uses visual technology to show health status and to retrieve the patient's examination information stored in each system through a 3D human model. In this presentation, we describe a new approach for extracting semantic and characteristic information from medical record systems such as RIS/USIS to create the 3D visual index. The approach includes the following steps: (1) building a medical characteristic semantic knowledge base; (2) developing a natural language processing (NLP) engine to perform semantic analysis and logical judgment on text-based medical records; (3) applying the knowledge base and NLP engine to medical records to extract medical characteristics (e.g., positive focus information), and then mapping the extracted information to the related organs/parts of the 3D human model to create the visual index. We tested the procedure on 559 radiological reports containing 853 focuses and successfully extracted information for 828 of them, a focus-extraction success rate of about 97.1%.
Thinking Graphically: Connecting Vision and Cognition during Graph Comprehension
ERIC Educational Resources Information Center
Ratwani, Raj M.; Trafton, J. Gregory; Boehm-Davis, Deborah A.
2008-01-01
Task analytic theories of graph comprehension account for the perceptual and conceptual processes required to extract specific information from graphs. Comparatively, the processes underlying information integration have received less attention. We propose a new framework for information integration that highlights visual integration and cognitive…
NASA Technical Reports Server (NTRS)
Jones, J. R.; Bodenheimer, R. E.
1976-01-01
A simple programmable Tse processor organization and arithmetic operations necessary for extraction of the desired topological information are described. Hardware additions to this organization are discussed along with trade-offs peculiar to the tse computing concept. An improved organization is presented along with the complementary software for the various arithmetic operations. The performance of the two organizations is compared in terms of speed, power, and cost. Software routines developed to extract the desired information from an image are included.
Ford, Lauren; Henderson, Robert L; Rayner, Christopher M; Blackburn, Richard S
2017-03-03
Madder (Rubia tinctorum L.) has been widely used as a red dye throughout history. Acid-sensitive colorants present in madder, such as glycosides (lucidin primeveroside, ruberythric acid, galiosin) and sensitive aglycons (lucidin), are degraded in the textile back extraction process; in previous literature these sensitive molecules are either absent or present in only low concentrations due to the use of acid in typical textile back extraction processes. The anthraquinone aglycons alizarin and purpurin are usually identified in analyses following harsh back extraction methods, such as those using solvent mixtures with concentrated hydrochloric acid at high temperatures. Softer extraction techniques potentially allow dye components present in madder to be extracted without degradation, providing more information about the original dye profile, which varies significantly between madder varieties, species, and dyeing techniques. Herein, a softer extraction method involving aqueous glucose solution was developed and compared to other back extraction techniques on wool dyed with root extract from different varieties of Rubia tinctorum. Efficiencies of the extraction methods were analysed by HPLC coupled with diode array detection. Acidic literature methods were evaluated, and they generally caused hydrolysis and degradation of the dye components, with alizarin, lucidin, and purpurin being the main compounds extracted. In contrast, extraction in aqueous glucose solution provides a highly effective method for extraction of madder-dyed wool and is shown to efficiently extract lucidin primeveroside and ruberythric acid without causing hydrolysis, and also to extract aglycons that are present due to hydrolysis during processing of the plant material. Glucose solution is a favourable extraction medium due to its ability to form extensive hydrogen bonds with the glycosides present in madder and to displace them from the fibre. This new glucose method offers an efficient process that preserves these sensitive molecules and is a step-change in the analysis of madder-dyed textiles, as it can provide further information about historical dye preparation and dyeing processes that current methods cannot. The method also efficiently extracts glycosides in artificially aged samples, making it applicable to museum textile artefacts. Copyright © 2017 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Skersys, Tomas; Butleris, Rimantas; Kapocius, Kestutis
2013-10-01
Approaches for the analysis and specification of business vocabularies and rules are highly relevant topics in both the Business Process Management and Information Systems Development disciplines. However, in common Information Systems Development practice, business modeling activities are still largely empirical in nature. In this paper, the basic aspects of an approach for the semi-automated extraction of business vocabularies from business process models are presented. The approach is based on the business modeling-level OMG standards "Business Process Model and Notation" (BPMN) and "Semantics of Business Vocabulary and Business Rules" (SBVR), thus contributing to OMG's vision of Model-Driven Architecture (MDA) and to model-driven development in general.
Optical hiding with visual cryptography
NASA Astrophysics Data System (ADS)
Shi, Yishi; Yang, Xiubo
2017-11-01
We propose an optical hiding method based on visual cryptography. In the hiding process, we convert the secret information into a set of fabricated phase-keys that are completely independent of each other, proof against intensity detection, and covered by images, leading to high security. During the extraction process, the covered phase-keys are illuminated with laser beams and then incoherently superimposed to reveal the hidden information directly to human vision, without complicated optical implementations or any additional computation, making extraction convenient. The phase-keys are also manufactured as diffractive optical elements that are robust to attacks such as blocking and phase noise. Optical experiments verify that high security, easy extraction, and strong robustness are all obtainable in visual-cryptography-based optical hiding.
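The optical phase-key construction is not detailed in the abstract; conceptually it builds on classical visual cryptography, and a minimal sketch of digital (2, 2) share generation, where stacking the two shares reveals the secret, is shown below (a software illustration only, not the authors' optical implementation).

```python
import numpy as np

# (2, 2) visual cryptography: each secret pixel becomes a 2x2 block per share.
# Each pattern has two opaque subpixels; complementary patterns encode black.
PATTERNS = np.array([
    [[1, 0], [0, 1]],
    [[0, 1], [1, 0]],
    [[1, 1], [0, 0]],
    [[0, 0], [1, 1]],
    [[1, 0], [1, 0]],
    [[0, 1], [0, 1]],
])  # 1 = opaque (black) subpixel

def make_shares(secret, rng=None):
    """secret: 2-D array of 0 (white) / 1 (black). Returns two share images."""
    rng = np.random.default_rng(rng)
    h, w = secret.shape
    share1 = np.zeros((2 * h, 2 * w), dtype=np.uint8)
    share2 = np.zeros_like(share1)
    for i in range(h):
        for j in range(w):
            p = PATTERNS[rng.integers(len(PATTERNS))]
            share1[2*i:2*i+2, 2*j:2*j+2] = p
            # White pixel: identical blocks; black pixel: complementary blocks.
            share2[2*i:2*i+2, 2*j:2*j+2] = p if secret[i, j] == 0 else 1 - p
    return share1, share2

if __name__ == "__main__":
    secret = np.array([[0, 1], [1, 0]], dtype=np.uint8)
    s1, s2 = make_shares(secret, rng=0)
    stacked = np.maximum(s1, s2)   # optical superposition ~ pixel-wise OR
    # Black secret pixels give 4 opaque subpixels per block, white give only 2.
    print(stacked.reshape(2, 2, 2, 2).sum(axis=(1, 3)))
```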
Information Retrieval Using Hadoop Big Data Analysis
NASA Astrophysics Data System (ADS)
Motwani, Deepak; Madan, Madan Lal
This paper concerns big data analysis, the cognitive operation of probing huge amounts of information in an attempt to uncover unseen patterns. Through big data analytics applications, organizations in both the public and private sectors have made a strategic decision to turn big data into competitive advantage. Extracting value from big data relies on a process for pulling information from multiple different sources, known as extract, transform, and load (ETL). The approach in this paper extracts information from log files and research papers, reducing the effort needed for pattern finding and document summarization from several sources. The work helps to better understand basic Hadoop concepts and improves the user experience for research. We propose an approach for analysing log files with Hadoop to find concise, useful information while saving time. The proposed approach is then applied to research papers in a specific domain to obtain summarized content for further improvement and the creation of new content.
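The paper's actual Hadoop pipeline is not specified; the underlying map/reduce idea for condensing log files can be illustrated with a minimal sketch in plain Python (simulating the two phases locally rather than on a Hadoop cluster; the log format and the position of the status-code field are assumptions).

```python
from collections import Counter
from itertools import chain

def map_phase(log_line):
    """Map step: emit (status_code, 1) pairs from a hypothetical access-log line."""
    parts = log_line.split()
    if len(parts) >= 2:
        yield parts[-2], 1          # assume the status code is the second-to-last field

def reduce_phase(pairs):
    """Reduce step: aggregate counts per key."""
    counts = Counter()
    for key, value in pairs:
        counts[key] += value
    return counts

if __name__ == "__main__":
    logs = [
        "GET /index.html 200 1043",
        "GET /missing.html 404 512",
        "POST /login 200 221",
    ]
    summary = reduce_phase(chain.from_iterable(map_phase(line) for line in logs))
    print(summary)   # Counter({'200': 2, '404': 1})
```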
Chen, Yili; Fu, Jixiang; Chu, Dawei; Li, Rongmao; Xie, Yaoqin
2017-11-27
A retinal prosthesis is designed to help the blind obtain some sight. It consists of an external part and an internal part. The external part is made up of a camera, an image processor and an RF transmitter; the internal part is made up of an RF receiver, an implant chip and microelectrodes. Currently, the number of microelectrodes is in the hundreds, and the mechanism by which an electrode stimulates the optic nerve is not known. A simple hypothesis is that the pixels in an image correspond to the electrodes, so the images captured by the camera should be processed by suitable strategies to drive electrode stimulation. The question is therefore how to obtain the important information from the captured image. Here, we use a region-of-interest (ROI) extraction algorithm to retain the important information and remove the redundant information. This paper explains the details of the principles and functions of the ROI approach. Because we are targeting a real-time system, a fast ROI extraction algorithm is needed; we therefore simplified the ROI algorithm and used it in the external image-processing digital signal processor (DSP) system of the retinal prosthesis. The results show that our image-processing strategies are suitable for a real-time retinal prosthesis, eliminating redundant information and providing useful information for presentation in a small image.
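The exact ROI algorithm run on the DSP is not given in the abstract; a crude software sketch of the idea, cropping to the brightest region and downsampling to an electrode-sized grid, might look like the following (the threshold rule and grid size are assumptions).

```python
import numpy as np

def extract_roi(image, threshold=None):
    """Crop the image to the bounding box of high-intensity (salient) pixels."""
    if threshold is None:
        threshold = image.mean() + image.std()      # simple global threshold (assumption)
    mask = image > threshold
    if not mask.any():
        return image
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    return image[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]

def to_electrode_grid(roi, grid=(10, 10)):
    """Downsample the ROI to a coarse grid, e.g. one value per stimulating electrode."""
    rows = np.array_split(np.arange(roi.shape[0]), grid[0])
    cols = np.array_split(np.arange(roi.shape[1]), grid[1])
    return np.array([[roi[np.ix_(r, c)].mean() for c in cols] for r in rows])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frame = rng.integers(0, 40, size=(120, 160)).astype(float)
    frame[40:80, 60:120] += 120                     # a bright object of interest
    grid = to_electrode_grid(extract_roi(frame), grid=(8, 8))
    print(grid.shape)                               # (8, 8) values to drive electrodes
```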
Extraction of actionable information from crowdsourced disaster data.
Kiatpanont, Rungsun; Tanlamai, Uthai; Chongstitvatana, Prabhas
Natural disasters cause enormous damage to countries all over the world. To deal with these common problems, different activities are required for disaster management at each phase of the crisis. There are three groups of activities as follows: (1) make sense of the situation and determine how best to deal with it, (2) deploy the necessary resources, and (3) harmonize as many parties as possible, using the most effective communication channels. Current technological improvements and developments now enable people to act as real-time information sources. As a result, inundation with crowdsourced data poses a real challenge for a disaster manager. The problem is how to extract the valuable information from a gigantic data pool in the shortest possible time so that the information is still useful and actionable. This research proposed an actionable-data-extraction process to deal with the challenge. Twitter was selected as a test case because messages posted on Twitter are publicly available. Hashtag, an easy and very efficient technique, was also used to differentiate information. A quantitative approach to extract useful information from the tweets was supported and verified by interviews with disaster managers from many leading organizations in Thailand to understand their missions. The information classifications extracted from the collected tweets were first performed manually, and then the tweets were used to train a machine learning algorithm to classify future tweets. One particularly useful, significant, and primary section was the request for help category. The support vector machine algorithm was used to validate the results from the extraction process of 13,696 sample tweets, with over 74 percent accuracy. The results confirmed that the machine learning technique could significantly and practically assist with disaster management by dealing with crowdsourced data.
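The study's feature set and preprocessing are not described in detail; a minimal sketch of the general approach, a TF-IDF plus linear SVM tweet classifier built with scikit-learn on made-up example tweets, is shown below.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Tiny, made-up training set; a real study would use thousands of labelled tweets.
tweets = [
    "Need drinking water and food at Bang Sue shelter #ThaiFlood",
    "Trapped on the roof, water still rising, please send a boat",
    "Thoughts and prayers for everyone affected by the flood",
    "Traffic is terrible downtown today",
]
labels = ["request_help", "request_help", "other", "other"]

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
model.fit(tweets, labels)

print(model.predict(["we urgently need food and clean water at the school"]))
```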
2018-01-18
processing. Specifically, the method described herein uses wgrib2 commands along with a Python script or program to produce tabular text files that in...It makes use of software that is readily available and can be implemented on many computer systems combined with relatively modest additional...example), extracts appropriate information, and lists the extracted information in a readable tabular form. The Python script used here is described in
eGARD: Extracting associations between genomic anomalies and drug responses from text
Rao, Shruti; McGarvey, Peter; Wu, Cathy; Madhavan, Subha; Vijay-Shanker, K.
2017-01-01
Tumor molecular profiling plays an integral role in identifying genomic anomalies which may help in personalizing cancer treatments, improving patient outcomes and minimizing risks associated with different therapies. However, critical information regarding the evidence of clinical utility of such anomalies is largely buried in biomedical literature. It is becoming prohibitive for biocurators, clinical researchers and oncologists to keep up with the rapidly growing volume and breadth of information, especially those that describe therapeutic implications of biomarkers and therefore relevant for treatment selection. In an effort to improve and speed up the process of manually reviewing and extracting relevant information from literature, we have developed a natural language processing (NLP)-based text mining (TM) system called eGARD (extracting Genomic Anomalies association with Response to Drugs). This system relies on the syntactic nature of sentences coupled with various textual features to extract relations between genomic anomalies and drug response from MEDLINE abstracts. Our system achieved high precision, recall and F-measure of up to 0.95, 0.86 and 0.90, respectively, on annotated evaluation datasets created in-house and obtained externally from PharmGKB. Additionally, the system extracted information that helps determine the confidence level of extraction to support prioritization of curation. Such a system will enable clinical researchers to explore the use of published markers to stratify patients upfront for ‘best-fit’ therapies and readily generate hypotheses for new clinical trials. PMID:29261751
FEX: A Knowledge-Based System For Planimetric Feature Extraction
NASA Astrophysics Data System (ADS)
Zelek, John S.
1988-10-01
Topographical planimetric features include natural surfaces (rivers, lakes) and man-made surfaces (roads, railways, bridges). In conventional planimetric feature extraction, a photointerpreter manually interprets and extracts features from imagery on a stereoplotter. Visual planimetric feature extraction is a very labour intensive operation. The advantages of automating feature extraction include: time and labour savings; accuracy improvements; and planimetric data consistency. FEX (Feature EXtraction) combines techniques from image processing, remote sensing and artificial intelligence for automatic feature extraction. The feature extraction process co-ordinates the information and knowledge in a hierarchical data structure. The system simulates the reasoning of a photointerpreter in determining the planimetric features. Present efforts have concentrated on the extraction of road-like features in SPOT imagery. Keywords: Remote Sensing, Artificial Intelligence (AI), SPOT, image understanding, knowledge base, apars.
Using Process Redesign and Information Technology to Improve Procurement
1994-04-01
contractor. Many large-volume contractors have automated order processing tied to accounting, manufacturing, and shipping subsystems. Currently...the contractor must receive the mailed order, analyze it, extract pertinent information, and enter that information into the automated order ... processing system. Almost all orders for small purchases are unilateral documents that do not require acceptance or acknowledgment by the contractor. For
A Real-Time System for Lane Detection Based on FPGA and DSP
NASA Astrophysics Data System (ADS)
Xiao, Jing; Li, Shutao; Sun, Bin
2016-12-01
This paper presents a real-time lane detection system comprising an edge detection and improved Hough Transform based lane detection algorithm and its hardware implementation with a field programmable gate array (FPGA) and a digital signal processor (DSP). First, gradient amplitude and direction information are combined to extract lane edge information. Then, this information is used to determine the region of interest. Finally, the lanes are extracted using an improved Hough Transform. The image processing module of the system consists of the FPGA and the DSP. In particular, the algorithms implemented in the FPGA are pipelined and process data in parallel so that the system can run in real time. In addition, the DSP realizes the lane line extraction and display functions with an improved Hough Transform. The experimental results show that the proposed system is able to detect lanes under different road situations efficiently and effectively.
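The FPGA/DSP implementation is hardware-specific, but the same edge-plus-Hough idea can be prototyped in software; a minimal OpenCV sketch is shown below (the Canny thresholds, region-of-interest geometry, and Hough parameters are assumptions, not the paper's values).

```python
import cv2
import numpy as np

def detect_lanes(bgr_frame):
    """Edge detection + ROI masking + probabilistic Hough transform (software prototype)."""
    gray = cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150)                 # gradient magnitude/direction inside Canny

    # Keep only the lower half of the image as the region of interest (assumption).
    mask = np.zeros_like(edges)
    mask[edges.shape[0] // 2:, :] = 255
    roi_edges = cv2.bitwise_and(edges, mask)

    lines = cv2.HoughLinesP(roi_edges, 1, np.pi / 180, 50,
                            minLineLength=40, maxLineGap=20)
    return [] if lines is None else [l[0] for l in lines]   # each l[0] = (x1, y1, x2, y2)

if __name__ == "__main__":
    frame = cv2.imread("road.jpg")                   # hypothetical input frame
    if frame is not None:
        for x1, y1, x2, y2 in detect_lanes(frame):
            cv2.line(frame, (int(x1), int(y1)), (int(x2), int(y2)), (0, 0, 255), 2)
        cv2.imwrite("road_lanes.jpg", frame)
```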
NASA Astrophysics Data System (ADS)
Rüther, Heinz; Martine, Hagai M.; Mtalo, E. G.
This paper presents a novel approach to semiautomatic building extraction in informal settlement areas from aerial photographs. The proposed approach uses a strategy of delineating buildings by optimising their approximate building contour position. Approximate building contours are derived automatically by locating elevation blobs in digital surface models. Building extraction is then effected by means of the snakes algorithm and the dynamic programming optimisation technique. With dynamic programming, the building contour optimisation problem is realized through a discrete multistage process and solved by the "time-delayed" algorithm, as developed in this work. The proposed building extraction approach is a semiautomatic process, with user-controlled operations linking fully automated subprocesses. Inputs into the proposed building extraction system are ortho-images and digital surface models, the latter being generated through image matching techniques. Buildings are modeled as "lumps" or elevation blobs in digital surface models, which are derived by altimetric thresholding of digital surface models. Initial windows for building extraction are provided by projecting the elevation blobs centre points onto an ortho-image. In the next step, approximate building contours are extracted from the ortho-image by region growing constrained by edges. Approximate building contours thus derived are inputs into the dynamic programming optimisation process in which final building contours are established. The proposed system is tested on two study areas: Marconi Beam in Cape Town, South Africa, and Manzese in Dar es Salaam, Tanzania. Sixty percent of buildings in the study areas have been extracted and verified and it is concluded that the proposed approach contributes meaningfully to the extraction of buildings in moderately complex and crowded informal settlement areas.
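As a rough software sketch of the first stage described above, altimetric thresholding of a digital surface model and localisation of elevation blobs as building seed points could be written as follows (the height and size thresholds are assumptions, not the study's values).

```python
import numpy as np
from scipy import ndimage

def building_seed_points(dsm, dtm, min_height=2.5, min_pixels=20):
    """Locate elevation blobs (candidate buildings) in a digital surface model.

    dsm, dtm : 2-D arrays of surface and terrain elevations on the same grid.
    Returns centre points (row, col) of blobs rising more than min_height above ground.
    """
    above_ground = (dsm - dtm) > min_height           # altimetric thresholding
    labels, n = ndimage.label(above_ground)            # connected elevation blobs
    sizes = ndimage.sum(above_ground, labels, index=range(1, n + 1))
    keep = [i + 1 for i, s in enumerate(sizes) if s >= min_pixels]
    return ndimage.center_of_mass(above_ground, labels, keep)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    dtm = np.zeros((100, 100))
    dsm = dtm + rng.normal(0, 0.2, dtm.shape)
    dsm[20:35, 40:60] += 4.0                           # a synthetic 4 m high "building"
    print(building_seed_points(dsm, dtm))              # approximately [(27.0, 49.5)]
```

These centre points would then be projected onto the ortho-image to initialise the region growing and snake/dynamic-programming contour refinement described in the paper.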
PDF text classification to leverage information extraction from publication reports.
Bui, Duy Duc An; Del Fiol, Guilherme; Jonnalagadda, Siddhartha
2016-06-01
Data extraction from original study reports is a time-consuming, error-prone process in systematic review development. Information extraction (IE) systems have the potential to assist humans in the extraction task; however, the majority of IE systems were not designed to work on Portable Document Format (PDF) documents, an important and common extraction source for systematic reviews. In a PDF document, narrative content is often mixed with publication metadata or semi-structured text, which adds challenges for the underlying natural language processing algorithm. Our goal is to categorize PDF texts for strategic use by IE systems. We used an open-source tool to extract raw texts from a PDF document and developed a text classification algorithm that follows a multi-pass sieve framework to automatically classify PDF text snippets (for brevity, texts) into TITLE, ABSTRACT, BODYTEXT, SEMISTRUCTURE, and METADATA categories. To validate the algorithm, we developed a gold standard of PDF reports that were included in the development of previous systematic reviews by the Cochrane Collaboration. In a two-step procedure, we evaluated (1) classification performance, compared with a machine learning classifier, and (2) the effects of the algorithm on an IE system that extracts clinical outcome mentions. The multi-pass sieve algorithm achieved an accuracy of 92.6%, which was 9.7% (p<0.001) higher than the best performing machine learning classifier, which used a logistic regression algorithm. F-measure improvements were observed in the classification of TITLE (+15.6%), ABSTRACT (+54.2%), BODYTEXT (+3.7%), SEMISTRUCTURE (+34%), and METADATA (+14.2%). In addition, use of the algorithm to filter semi-structured texts and publication metadata improved performance of the outcome extraction system (F-measure +4.1%, p=0.002). It also reduced the number of sentences to be processed by 44.9% (p<0.001), which corresponds to a processing time reduction of 50% (p=0.005). The rule-based multi-pass sieve framework can be used effectively in categorizing texts extracted from PDF documents. Text classification is an important prerequisite step to leverage information extraction from PDF documents. Copyright © 2016 Elsevier Inc. All rights reserved.
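The published sieve rules are more elaborate; the control flow of a multi-pass sieve, where higher-precision passes run first and the first pass that fires assigns the category, can be illustrated with simplified, made-up rules:

```python
import re

def sieve_title(text, position):
    if position == 0 and len(text.split()) <= 25:
        return "TITLE"

def sieve_metadata(text, position):
    if re.search(r"\bdoi:|©|\bvol\.\s*\d+|\bpp\.\s*\d+", text, re.IGNORECASE):
        return "METADATA"

def sieve_abstract(text, position):
    if text.lower().lstrip().startswith(("abstract", "summary")):
        return "ABSTRACT"

def sieve_semistructure(text, position):
    # Very short fragments are likely table/figure/heading debris.
    if len(text.split()) < 6:
        return "SEMISTRUCTURE"

SIEVES = [sieve_title, sieve_metadata, sieve_abstract, sieve_semistructure]

def classify(text, position):
    """Run the passes in precision order; the first sieve that fires wins."""
    for sieve in SIEVES:
        label = sieve(text, position)
        if label:
            return label
    return "BODYTEXT"                                  # default category

if __name__ == "__main__":
    snippets = ["Randomised trial of drug X in heart failure",
                "Abstract: We enrolled 240 patients ...",
                "Table 2",
                "Patients in the intervention arm received the study drug daily."]
    print([classify(t, i) for i, t in enumerate(snippets)])
    # ['TITLE', 'ABSTRACT', 'SEMISTRUCTURE', 'BODYTEXT']
```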
ABM Drag_Pass Report Generator
NASA Technical Reports Server (NTRS)
Fisher, Forest; Gladden, Roy; Khanampornpan, Teerapat
2008-01-01
The dragREPORT software was developed in parallel with abmREPORT, which is described in the preceding article. Both programs were built on the capabilities created during that process. This tool generates a drag_pass report that summarizes vital information from the MRO aerobraking drag_pass build process, both to facilitate sequence reviews and to provide a high-level summarization of the sequence for mission management. The script extracts information from the ENV, SSF, FRF, SCMFmax, and OPTG files, presenting it in a single, easy-to-check report that provides the majority of parameters needed for cross-check and verification as part of the sequence review process. Prior to dragREPORT, all the needed information was spread across a number of different files, each in a different format. The software is a Perl script that extracts vital summarization information and build-process details from a number of source files into a single, concise report used to aid the MPST sequence review process and to provide a high-level summarization of the sequence for mission management reference. This software could be adapted for future aerobraking missions to provide similar report, review, and summarization information.
Construction of Green Tide Monitoring System and Research on its Key Techniques
NASA Astrophysics Data System (ADS)
Xing, B.; Li, J.; Zhu, H.; Wei, P.; Zhao, Y.
2018-04-01
Green tide, a kind of marine natural disaster, has appeared every year along the Qingdao coast since the large-scale bloom in 2008, bringing great losses to the region. It is therefore of great value to obtain real-time dynamic information about green tide distribution. In this study, optical and microwave remote sensing methods are employed for green tide monitoring. A specific remote sensing data processing flow and a green tide information extraction algorithm are designed according to the different characteristics of the optical and microwave data. For the extraction of green tide spatial distribution information, an automatic extraction algorithm for green tide distribution boundaries is designed based on the principle of mathematical morphology dilation/erosion. Key issues in information extraction, including the division of green tide regions, the derivation of basic distributions, the delimitation of distribution boundaries, and the elimination of islands, are solved. The automatic generation of green tide distribution boundaries from the results of remote sensing information extraction is thus realized. Finally, a green tide monitoring system is built through IDL/GIS secondary development in an integrated RS and GIS environment, achieving the integration of RS monitoring and information extraction.
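The system's exact morphological operators are not listed in the abstract; a minimal sketch of the dilation/erosion idea, deriving a distribution boundary from a binary green-tide detection mask while removing small islands, could look like this (the closing iterations and minimum region size are assumptions).

```python
import numpy as np
from scipy import ndimage

def distribution_boundary(mask, close_iterations=2, min_region=25):
    """Derive a green-tide distribution boundary from a binary detection mask.

    mask : 2-D boolean array (True where algae were detected in the image).
    """
    # Morphological closing (dilation then erosion) merges nearby patches into regions.
    merged = ndimage.binary_closing(mask, iterations=close_iterations)

    # Drop small isolated patches ("islands") below the minimum region size.
    labels, n = ndimage.label(merged)
    sizes = ndimage.sum(merged, labels, index=range(1, n + 1))
    cleaned = np.isin(labels, [i + 1 for i, s in enumerate(sizes) if s >= min_region])

    # Boundary = region minus its erosion, i.e. the one-pixel outline of each region.
    return cleaned & ~ndimage.binary_erosion(cleaned)

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    mask = rng.random((80, 80)) > 0.995               # scattered false detections
    mask[30:50, 20:55] = rng.random((20, 35)) > 0.3   # a patchy green-tide area
    print(distribution_boundary(mask).sum(), "boundary pixels")
```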
The extraction and integration framework: a two-process account of statistical learning.
Thiessen, Erik D; Kronstein, Alexandra T; Hufnagle, Daniel G
2013-07-01
The term statistical learning in infancy research originally referred to sensitivity to transitional probabilities. Subsequent research has demonstrated that statistical learning contributes to infant development in a wide array of domains. The range of statistical learning phenomena necessitates a broader view of the processes underlying statistical learning. Learners are sensitive to a much wider range of statistical information than the conditional relations indexed by transitional probabilities, including distributional and cue-based statistics. We propose a novel framework that unifies learning about all of these kinds of statistical structure. From our perspective, learning about conditional relations outputs discrete representations (such as words). Integration across these discrete representations yields sensitivity to cues and distributional information. To achieve sensitivity to all of these kinds of statistical structure, our framework combines processes that extract segments of the input with processes that compare across these extracted items. In this framework, the items extracted from the input serve as exemplars in long-term memory. The similarity structure of those exemplars in long-term memory leads to the discovery of cues and categorical structure, which guides subsequent extraction. The extraction and integration framework provides a way to explain sensitivity to both conditional statistical structure (such as transitional probabilities) and distributional statistical structure (such as item frequency and variability), and also a framework for thinking about how these different aspects of statistical learning influence each other. 2013 APA, all rights reserved
Botsis, Taxiarchis; Foster, Matthew; Arya, Nina; Kreimeyer, Kory; Pandey, Abhishek; Arya, Deepa
2017-04-26
To evaluate the feasibility of automated dose and adverse event information retrieval in supporting the identification of safety patterns. We extracted all rabbit Anti-Thymocyte Globulin (rATG) reports submitted to the United States Food and Drug Administration Adverse Event Reporting System (FAERS) from the product's initial licensure in April 16, 1984 through February 8, 2016. We processed the narratives using the Medication Extraction (MedEx) and the Event-based Text-mining of Health Electronic Records (ETHER) systems and retrieved the appropriate medication, clinical, and temporal information. When necessary, the extracted information was manually curated. This process resulted in a high quality dataset that was analyzed with the Pattern-based and Advanced Network Analyzer for Clinical Evaluation and Assessment (PANACEA) to explore the association of rATG dosing with post-transplant lymphoproliferative disorder (PTLD). Although manual curation was necessary to improve the data quality, MedEx and ETHER supported the extraction of the appropriate information. We created a final dataset of 1,380 cases with complete information for rATG dosing and date of administration. Analysis in PANACEA found that PTLD was associated with cumulative doses of rATG >8 mg/kg, even in periods where most of the submissions to FAERS reported low doses of rATG. We demonstrated the feasibility of investigating a dose-related safety pattern for a particular product in FAERS using a set of automated tools.
Support patient search on pathology reports with interactive online learning based data extraction.
Zheng, Shuai; Lu, James J; Appin, Christina; Brat, Daniel; Wang, Fusheng
2015-01-01
Structural reporting enables semantic understanding and prompt retrieval of clinical findings about patients. While synoptic pathology reporting provides templates for data entries, information in pathology reports remains primarily in narrative free text form. Extracting data of interest from narrative pathology reports could significantly improve the representation of the information and enable complex structured queries. However, manual extraction is tedious and error-prone, and automated tools are often constructed with a fixed training dataset and are not easily adaptable. Our goal is to extract data from pathology reports to support advanced patient search with a highly adaptable semi-automated data extraction system, which can adjust and self-improve by learning from a user's interaction with minimal human effort. We have developed an online machine learning based information extraction system called IDEAL-X. With its graphical user interface, the system's data extraction engine automatically annotates values for users to review upon loading each report text. The system analyzes users' corrections regarding these annotations with online machine learning, and incrementally enhances and refines the learning model as reports are processed. The system also takes advantage of customized controlled vocabularies, which can be adaptively refined during the online learning process to further assist the data extraction. As the accuracy of automatic annotation improves over time, the effort of human annotation is gradually reduced. After all reports are processed, a built-in query engine can be applied to conveniently define queries based on the extracted structured data. We have evaluated the system with a dataset of anatomic pathology reports from 50 patients. Extracted data elements include demographic data, diagnosis, genetic marker, and procedure. The system achieves F-1 scores of around 95% for the majority of tests. Extracting data from pathology reports could enable more accurate knowledge to support biomedical research and clinical diagnosis. IDEAL-X provides a bridge that takes advantage of online machine learning based data extraction and the knowledge from humans' feedback. By combining iterative online learning and adaptive controlled vocabularies, IDEAL-X can deliver highly adaptive and accurate data extraction to support patient search.
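IDEAL-X itself is not distributed with the abstract; the core loop of predicting a value, letting the reviewer confirm or correct it, and updating the model immediately can be mimicked with scikit-learn's partial_fit, as in the following sketch (the extracted field, labels, and report sentences are made up for illustration).

```python
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import SGDClassifier

CLASSES = ["left", "right", "not_stated"]
vectorizer = HashingVectorizer(n_features=2**16, alternate_sign=False)
model = SGDClassifier()

def predict(text):
    try:
        return model.predict(vectorizer.transform([text]))[0]
    except Exception:                     # model not yet fitted on the first documents
        return "not_stated"

def learn_from_correction(text, confirmed_label):
    """Update the model online with the value the reviewer confirmed or corrected."""
    model.partial_fit(vectorizer.transform([text]), [confirmed_label], classes=CLASSES)

if __name__ == "__main__":
    # Simulated review session: (report sentence, label confirmed by the user).
    session = [
        ("Specimen received from the left breast, lumpectomy.", "left"),
        ("Right kidney, partial nephrectomy specimen.", "right"),
        ("Sections of the left lobe of the thyroid.", "left"),
    ]
    for text, gold in session:
        suggestion = predict(text)         # shown to the user for review
        learn_from_correction(text, gold)  # feedback refines the model immediately
    print(predict("Left breast core biopsy."))  # suggestion for the next report
```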
Bioactive phytochemicals in wheat: Extraction, analysis, processing, and functional properties
USDA-ARS?s Scientific Manuscript database
Whole wheat provides a rich source of bioactive phytochemicals namely, phenolic acids, carotenoids, tocopherols, alkylresorcinols, arabinoxylans, benzoxazinoids, phytosterols, and lignans. This review provides information on the distribution, extractability, analysis, and nutraceutical properties of...
Digital image processing for information extraction.
NASA Technical Reports Server (NTRS)
Billingsley, F. C.
1973-01-01
The modern digital computer has made practical image processing techniques for handling nonlinear operations in both the geometrical and the intensity domains, various types of nonuniform noise cleanup, and the numerical analysis of pictures. An initial requirement is that a number of anomalies caused by the camera (e.g., geometric distortion, MTF roll-off, vignetting, and nonuniform intensity response) must be taken into account or removed to avoid their interference with the information extraction process. Examples illustrating these operations are discussed along with computer techniques used to emphasize details, perform analyses, classify materials by multivariate analysis, detect temporal differences, and aid in human interpretation of photos.
Sub-Saharan Africa Report, No. 2869.
1983-11-10
FBIS FOREIGN BROADCAST INFORMATION SERVICE. REPRODUCED BY NATIONAL TECHNICAL INFORMATION SERVICE, U.S. DEPARTMENT OF COMMERCE, SPRINGFIELD, VA. 22161. NOTE: JPRS publications contain information primarily from foreign newspapers, periodicals and books, but also from...indicate how the original information was processed. Where no processing indicator is given, the information was summarized or extracted. Unfamiliar
NASA Astrophysics Data System (ADS)
Guo, H., II
2016-12-01
Spatial distribution information on settlement places in mountainous areas is of great significance to earthquake emergency work because most of the key earthquake-hazardous areas of China are located in mountainous terrain. Remote sensing has the advantages of large coverage and low cost, making it an important way to obtain such spatial distribution information. Fully considering geometric, spectral and texture information, most studies to date have applied object-oriented methods to extract settlement place information; in this article, semantic constraints are added on top of the object-oriented methods. The experimental data are one scene from a domestic high-resolution satellite (GF-1) with a resolution of 2 meters. The main processing consists of three steps: the first is pretreatment, including orthorectification and image fusion; the second is object-oriented information extraction, including image segmentation and information extraction; the last step is removing erroneous elements under semantic constraints. To formulate these semantic constraints, the distribution characteristics of mountainous settlement places must be analyzed and the spatial logic relations between settlement places and other objects must be considered. The accuracy assessment shows that the extraction accuracy of the object-oriented method is 49% and rises to 86% after the use of semantic constraints. As can be seen from these figures, the extraction method under semantic constraints can effectively improve the accuracy of mountainous settlement place information extraction. The result shows that it is feasible to extract mountainous settlement place information from GF-1 imagery, so the article demonstrates that domestic high-resolution optical remote sensing imagery has practical value in earthquake emergency preparedness.
This data set consists of Census Designated Place and Federal Information Processing Standard (FIPS) Populated Place boundaries for the State of Arizona which were extracted from the 1992 U.S. Census Bureau TIGER line files.
Friesen, Melissa C.; Locke, Sarah J.; Tornow, Carina; Chen, Yu-Cheng; Koh, Dong-Hee; Stewart, Patricia A.; Purdue, Mark; Colt, Joanne S.
2014-01-01
Objectives: Lifetime occupational history (OH) questionnaires often use open-ended questions to capture detailed information about study participants’ jobs. Exposure assessors use this information, along with responses to job- and industry-specific questionnaires, to assign exposure estimates on a job-by-job basis. An alternative approach is to use information from the OH responses and the job- and industry-specific questionnaires to develop programmable decision rules for assigning exposures. As a first step in this process, we developed a systematic approach to extract the free-text OH responses and convert them into standardized variables that represented exposure scenarios. Methods: Our study population comprised 2408 subjects, reporting 11991 jobs, from a case–control study of renal cell carcinoma. Each subject completed a lifetime OH questionnaire that included verbatim responses, for each job, to open-ended questions including job title, main tasks and activities (task), tools and equipment used (tools), and chemicals and materials handled (chemicals). Based on a review of the literature, we identified exposure scenarios (occupations, industries, tasks/tools/chemicals) expected to involve possible exposure to chlorinated solvents, trichloroethylene (TCE) in particular, lead, and cadmium. We then used a SAS macro to review the information reported by study participants to identify jobs associated with each exposure scenario; this was done using previously coded standardized occupation and industry classification codes, and a priori lists of associated key words and phrases related to possibly exposed tasks, tools, and chemicals. Exposure variables representing the occupation, industry, and task/tool/chemicals exposure scenarios were added to the work history records of the study respondents. Our identification of possibly TCE-exposed scenarios in the OH responses was compared to an expert’s independently assigned probability ratings to evaluate whether we missed identifying possibly exposed jobs. Results: Our process added exposure variables for 52 occupation groups, 43 industry groups, and 46 task/tool/chemical scenarios to the data set of OH responses. Across all four agents, we identified possibly exposed task/tool/chemical exposure scenarios in 44–51% of the jobs in possibly exposed occupations. Possibly exposed task/tool/chemical exposure scenarios were found in a nontrivial 9–14% of the jobs not in possibly exposed occupations, suggesting that our process identified important information that would not be captured using occupation alone. Our extraction process was sensitive: for jobs where our extraction of OH responses identified no exposure scenarios and for which the sole source of information was the OH responses, only 0.1% were assessed as possibly exposed to TCE by the expert. Conclusions: Our systematic extraction of OH information found useful information in the task/chemicals/tools responses that was relatively easy to extract and that was not available from the occupational or industry information. The extracted variables can be used as inputs in the development of decision rules, especially for jobs where no additional information, such as job- and industry-specific questionnaires, is available. PMID:24590110
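The original implementation was a SAS macro driven by much larger a-priori keyword lists; the same flagging idea can be sketched in a few lines of Python, with a single hypothetical exposure scenario and illustrative patterns:

```python
import re

# Hypothetical a-priori keyword lists for one exposure scenario (illustrative only).
SCENARIOS = {
    "tce_degreasing": {
        "task":      [r"\bdegreas\w*", r"\bvapor degreas\w*", r"\bmetal clean\w*"],
        "tools":     [r"\bdegreaser\b", r"\bparts washer\b"],
        "chemicals": [r"\btrichloroethylene\b", r"\bTCE\b", r"\bsolvent\w*"],
    },
}

def flag_scenarios(job):
    """job: dict with free-text 'task', 'tools', 'chemicals' responses from the OH."""
    flags = {}
    for scenario, fields in SCENARIOS.items():
        hit = any(re.search(p, job.get(field, ""), re.IGNORECASE)
                  for field, patterns in fields.items() for p in patterns)
        flags[scenario] = int(hit)          # becomes a 0/1 variable on the job record
    return flags

if __name__ == "__main__":
    job = {"task": "cleaned metal parts before painting",
           "tools": "vapor degreaser",
           "chemicals": "TCE and mineral spirits"}
    print(flag_scenarios(job))              # {'tce_degreasing': 1}
```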
A Geographic Data Gathering System for Image Geolocalization Refining
NASA Astrophysics Data System (ADS)
Semaan, B.; Servières, M.; Moreau, G.; Chebaro, B.
2017-09-01
Image geolocalization has become an important research field during the last decade. The field is divided into two main branches. The first is image geolocalization used to find out which country, region or city an image belongs to. The second is refining image localization for uses that require more accuracy, such as augmented reality and three-dimensional environment reconstruction from images. In this paper we present a processing chain that gathers geographic data from several sources in order to deliver a better geolocalization of an image than the GPS one, together with precise camera pose parameters. To do so, we use multiple types of data. Some of this information is visible in the image and is extracted using image processing; other data can be extracted from image file headers or from information related to online image sharing platforms. Extracted information elements are not expressive enough if they remain disconnected. We show that grouping these information elements helps find the best geolocalization of the image.
Business Intelligence Applied to the ALMA Software Integration Process
NASA Astrophysics Data System (ADS)
Zambrano, M.; Recabarren, C.; González, V.; Hoffstadt, A.; Soto, R.; Shen, T.-C.
2012-09-01
Software quality assurance and planning for an astronomy project is a complex task, especially for a distributed collaborative project such as ALMA, where the development centers are spread across the globe. When executing a software project there is much valuable information about the process itself that can be collected. One way to receive this input is via an issue tracking system that gathers problem reports on software bugs captured during testing of the software, during the integration of the different components or, even worse, problems that occur during production. Usually little time is spent on analyzing these reports, but with some multidimensional processing valuable information can be extracted from them that may help with long-term planning and resource allocation. We present an analysis of the information collected at ALMA from a collection of key unbiased indicators. We describe the extraction, transformation and load process and how the data were processed. The main goal is to assess a software process and gain insights from this information.
1975-08-01
image analysis and processing tasks such as information extraction, image enhancement and restoration, coding, etc. The ultimate objective of this research is to form a basis for the development of technology relevant to military applications of machine extraction of information from aircraft and satellite imagery of the earth’s surface. This report discusses research activities during the three month period February 1 - April 30,
Terrain Extraction by Integrating Terrestrial Laser Scanner Data and Spectral Information
NASA Astrophysics Data System (ADS)
Lau, C. L.; Halim, S.; Zulkepli, M.; Azwan, A. M.; Tang, W. L.; Chong, A. K.
2015-10-01
The extraction of true terrain points from unstructured laser point cloud data is an important process for producing an accurate digital terrain model (DTM). However, most spatial filtering methods utilize only the geometrical data to discriminate terrain points from non-terrain points. Point cloud filtering can also be improved by using the spectral information available with some scanners. The objective of this study is therefore to investigate the effectiveness of using the three channels (red, green and blue) of the colour image captured by the built-in digital camera available in some Terrestrial Laser Scanners (TLS) for terrain extraction. Data acquisition was conducted at a miniature replica landscape at the Universiti Teknologi Malaysia (UTM), Skudai campus, using a Leica ScanStation C10. The spectral information of the coloured point clouds from selected sample classes was extracted for spectral analysis. Coloured points falling within the corresponding preset spectral thresholds are identified as belonging to that specific feature class in the dataset. This terrain extraction process was implemented in purpose-written Matlab code. The results demonstrate that a passive image of higher spectral resolution is required to improve the output, because the low quality of the colour images captured by the sensor contributes to low separability in spectral reflectance. In conclusion, this study shows that spectral information can be used as a parameter for terrain extraction.
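The published spectral thresholds come from the authors' analysis of sample classes; the filtering step itself reduces to a per-point RGB window test, e.g. the following sketch (the threshold values below are placeholders, not the study's, and the original was implemented in Matlab rather than Python):

```python
import numpy as np

def filter_terrain_by_colour(points, rgb, thresholds):
    """Keep points whose colour falls inside the preset spectral window.

    points : (N, 3) array of x, y, z coordinates from the TLS.
    rgb    : (N, 3) array of 0-255 colour values from the built-in camera.
    thresholds : dict of (low, high) per channel, e.g. derived from training samples.
    """
    keep = np.ones(len(points), dtype=bool)
    for i, channel in enumerate(("red", "green", "blue")):
        low, high = thresholds[channel]
        keep &= (rgb[:, i] >= low) & (rgb[:, i] <= high)
    return points[keep]

if __name__ == "__main__":
    rng = np.random.default_rng(3)
    pts = rng.random((1000, 3)) * 10
    col = rng.integers(0, 256, (1000, 3))
    # Hypothetical window for bare soil derived from sample classes (assumption).
    soil = {"red": (90, 180), "green": (70, 150), "blue": (50, 120)}
    print(len(filter_terrain_by_colour(pts, col, soil)), "candidate terrain points")
```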
Natural language processing and visualization in the molecular imaging domain.
Tulipano, P Karina; Tao, Ying; Millar, William S; Zanzonico, Pat; Kolbert, Katherine; Xu, Hua; Yu, Hong; Chen, Lifeng; Lussier, Yves A; Friedman, Carol
2007-06-01
Molecular imaging is at the crossroads of genomic sciences and medical imaging. Information within the molecular imaging literature could be used to link to genomic and imaging information resources and to organize and index images in a way that is potentially useful to researchers. A number of natural language processing (NLP) systems are available to automatically extract information from genomic literature. One existing NLP system, known as BioMedLEE, automatically extracts biological information consisting of biomolecular substances and phenotypic data. This paper focuses on the adaptation, evaluation, and application of BioMedLEE to the molecular imaging domain. In order to adapt BioMedLEE for this domain, we extend an existing molecular imaging terminology and incorporate it into BioMedLEE. BioMedLEE's performance is assessed with a formal evaluation study. The system's performance, measured as recall and precision, is 0.74 (95% CI: [.70-.76]) and 0.70 (95% CI [.63-.76]), respectively. We adapt a JAVA viewer known as PGviewer for the simultaneous visualization of images with NLP extracted information.
Structuring and extracting knowledge for the support of hypothesis generation in molecular biology
Roos, Marco; Marshall, M Scott; Gibson, Andrew P; Schuemie, Martijn; Meij, Edgar; Katrenko, Sophia; van Hage, Willem Robert; Krommydas, Konstantinos; Adriaans, Pieter W
2009-01-01
Background Hypothesis generation in molecular and cellular biology is an empirical process in which knowledge derived from prior experiments is distilled into a comprehensible model. The requirement of automated support is exemplified by the difficulty of considering all relevant facts that are contained in the millions of documents available from PubMed. Semantic Web provides tools for sharing prior knowledge, while information retrieval and information extraction techniques enable its extraction from literature. Their combination makes prior knowledge available for computational analysis and inference. While some tools provide complete solutions that limit the control over the modeling and extraction processes, we seek a methodology that supports control by the experimenter over these critical processes. Results We describe progress towards automated support for the generation of biomolecular hypotheses. Semantic Web technologies are used to structure and store knowledge, while a workflow extracts knowledge from text. We designed minimal proto-ontologies in OWL for capturing different aspects of a text mining experiment: the biological hypothesis, text and documents, text mining, and workflow provenance. The models fit a methodology that allows focus on the requirements of a single experiment while supporting reuse and posterior analysis of extracted knowledge from multiple experiments. Our workflow is composed of services from the 'Adaptive Information Disclosure Application' (AIDA) toolkit as well as a few others. The output is a semantic model with putative biological relations, with each relation linked to the corresponding evidence. Conclusion We demonstrated a 'do-it-yourself' approach for structuring and extracting knowledge in the context of experimental research on biomolecular mechanisms. The methodology can be used to bootstrap the construction of semantically rich biological models using the results of knowledge extraction processes. Models specific to particular experiments can be constructed that, in turn, link with other semantic models, creating a web of knowledge that spans experiments. Mapping mechanisms can link to other knowledge resources such as OBO ontologies or SKOS vocabularies. AIDA Web Services can be used to design personalized knowledge extraction procedures. In our example experiment, we found three proteins (NF-Kappa B, p21, and Bax) potentially playing a role in the interplay between nutrients and epigenetic gene regulation. PMID:19796406
Atmospheric and Oceanographic Information Processing System (AOIPS) system description
NASA Technical Reports Server (NTRS)
Bracken, P. A.; Dalton, J. T.; Billingsley, J. B.; Quann, J. J.
1977-01-01
The development of hardware and software for an interactive, minicomputer based processing and display system for atmospheric and oceanographic information extraction and image data analysis is described. The major applications of the system are discussed as well as enhancements planned for the future.
Extracting Information from Folds in Rocks.
ERIC Educational Resources Information Center
Hudleston, Peter John
1986-01-01
Describes the three processes of folding in rocks: buckling, bending, and passive folding. Discusses how geometrical properties and strain distributions help to identify which processes produce natural folds, and also provides information about the mechanical properties of rocks, and the sense of shear in shear zones. (TW)
Zheng, Shuai; Ghasemzadeh, Nima; Hayek, Salim S; Quyyumi, Arshed A
2017-01-01
Background Extracting structured data from narrated medical reports is challenged by the complexity of heterogeneous structures and vocabularies and often requires significant manual effort. Traditional machine-based approaches lack the capability to take user feedbacks for improving the extraction algorithm in real time. Objective Our goal was to provide a generic information extraction framework that can support diverse clinical reports and enables a dynamic interaction between a human and a machine that produces highly accurate results. Methods A clinical information extraction system IDEAL-X has been built on top of online machine learning. It processes one document at a time, and user interactions are recorded as feedbacks to update the learning model in real time. The updated model is used to predict values for extraction in subsequent documents. Once prediction accuracy reaches a user-acceptable threshold, the remaining documents may be batch processed. A customizable controlled vocabulary may be used to support extraction. Results Three datasets were used for experiments based on report styles: 100 cardiac catheterization procedure reports, 100 coronary angiographic reports, and 100 integrated reports—each combines history and physical report, discharge summary, outpatient clinic notes, outpatient clinic letter, and inpatient discharge medication report. Data extraction was performed by 3 methods: online machine learning, controlled vocabularies, and a combination of these. The system delivers results with F1 scores greater than 95%. Conclusions IDEAL-X adopts a unique online machine learning–based approach combined with controlled vocabularies to support data extraction for clinical reports. The system can quickly learn and improve, thus it is highly adaptable. PMID:28487265
Fast and accurate edge orientation processing during object manipulation
Flanagan, J Randall; Johansson, Roland S
2018-01-01
Quickly and accurately extracting information about a touched object’s orientation is a critical aspect of dexterous object manipulation. However, the speed and acuity of tactile edge orientation processing with respect to the fingertips as reported in previous perceptual studies appear inadequate in these respects. Here we directly establish the tactile system’s capacity to process edge-orientation information during dexterous manipulation. Participants extracted tactile information about edge orientation very quickly, using it within 200 ms of first touching the object. Participants were also strikingly accurate. With edges spanning the entire fingertip, edge-orientation resolution was better than 3° in our object manipulation task, which is several times better than reported in previous perceptual studies. Performance remained impressive even with edges as short as 2 mm, consistent with our ability to precisely manipulate very small objects. Taken together, our results radically redefine the spatial processing capacity of the tactile system. PMID:29611804
Conception of Self-Construction Production Scheduling System
NASA Astrophysics Data System (ADS)
Xue, Hai; Zhang, Xuerui; Shimizu, Yasuhiro; Fujimura, Shigeru
With the rapid innovation of information technology, many production scheduling systems have been developed. However, substantial customization to each individual production environment is required, making a large investment in development and maintenance indispensable. The direction taken in constructing scheduling systems should therefore be changed. The final objective of this research is to develop a system that builds itself by extracting the scheduling technique automatically from daily production scheduling work, so that the investment is reduced. This extraction mechanism should be applicable to various production processes to ensure interoperability. Using the master information extracted by the system, production scheduling operators can be supported to carry out the scheduling work quickly and accurately, without any restriction on scheduling operations. By installing this extraction mechanism, a scheduling system can be introduced easily and without a large expense for customization. In this paper, a model for expressing a scheduling problem is first proposed. Then a guideline for extracting the scheduling information and using the extracted information is presented, and some applied functions based on it are also proposed.
Shao, Dongyan; Atungulu, Griffiths G; Pan, Zhongli; Yue, Tianli; Zhang, Ang; Li, Xuan
2012-08-01
Value of tomato seed has not been fully recognized. The objectives of this research were to establish suitable processing conditions for extracting oil from tomato seed by using solvent, determine the impact of processing conditions on yield and antioxidant activity of extracted oil, and elucidate kinetics of the oil extraction process. Four processing parameters, including time, temperature, solvent-to-solid ratio and particle size were studied. A second order model was established to describe the oil extraction process. Based on the results, increasing temperature, solvent-to-solid ratio, and extraction time increased oil yield. In contrast, larger particle size reduced the oil yield. The recommended oil extraction conditions were 8 min of extraction time at temperature of 25 °C, solvent-to-solids ratio of 5/1 (v/w) and particle size of 0.38 mm, which gave oil yield of 20.32% with recovery rate of 78.56%. The DPPH scavenging activity of extracted oil was not significantly affected by the extraction parameters. The inhibitory concentration (IC(50) ) of tomato seed oil was 8.67 mg/mL which was notably low compared to most vegetable oils. A 2nd order model successfully described the kinetics of tomato oil extraction process and parameters of extraction kinetics including initial extraction rate (h), equilibrium concentration of oil (C(s) ), and the extraction rate constant (k) could be precisely predicted with R(2) of at least 0.957. The study revealed that tomato seed which is typically treated as a low value byproduct of tomato processing has great potential in producing oil with high antioxidant capability. The impact of processing conditions including time, temperature, solvent-to-solid ratio and particle size on yield, and antioxidant activity of extracted tomato seed oil are reported. Optimal conditions and models which describe the extraction process are recommended. The information is vital for determining the extraction processing conditions for industrial production of high quality tomato seed oil. Journal of Food Science © 2012 Institute of Food Technologists® No claim to original US government works.
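The abstract does not reproduce the equations; a commonly used form of the second-order solid-liquid extraction model, consistent with the parameters named (initial extraction rate h, equilibrium concentration Cs, rate constant k), is shown below, assuming this is the form the authors fitted:

```latex
\frac{dC_t}{dt} = k\,(C_s - C_t)^2
\quad\Longrightarrow\quad
\frac{t}{C_t} = \frac{1}{k\,C_s^{2}} + \frac{t}{C_s},
\qquad h = k\,C_s^{2}
```

Here C_t is the oil concentration in the solvent at time t; plotting t/C_t against t then yields Cs from the slope and h (and hence k) from the intercept.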
NASA Astrophysics Data System (ADS)
Oschepkova, Elena; Vasinskaya, Irina; Sockoluck, Irina
2017-11-01
In view of the changing educational paradigm (the adoption of the two-tier higher education concept of undergraduate and graduate programs), there is a need to use modern learning and information and communications technologies that put learner-centered approaches into practice when training highly qualified specialists for enterprises that extract and process solid commercial minerals. Given unstable market demand and a changeable institutional environment on the one hand, and the need to balance work, supply conditions and product quality as mining-and-geological parameters change on the other, mining enterprises have to introduce and develop an integrated management process for product, information and logistic flows under a unified management system. One of the main limitations holding back this development at Russian mining enterprises is staff incompetence at all levels of logistics management. Under present-day conditions, enterprises extracting and processing solid commercial minerals need highly qualified specialists who can conduct self-directed research and develop new, and improve existing, technologies for organizing, planning and managing the technical operation and commercial exploitation of transport, transportation and processing facilities on the basis of logistics. A learner-centered approach and the individualization of the learning process necessitate the design of an individual learning route (ILR), which can help students realize their professional abilities in accordance with the requirements for specialists at enterprises extracting and processing solid commercial minerals.
NASA Technical Reports Server (NTRS)
Haralick, R. H. (Principal Investigator); Bosley, R. J.
1974-01-01
The author has identified the following significant results. A procedure was developed to extract cross-band textural features from ERTS MSS imagery. Evolving from a single image texture extraction procedure which uses spatial dependence matrices to measure relative co-occurrence of nearest neighbor grey tones, the cross-band texture procedure uses the distribution of neighboring grey tone N-tuple differences to measure the spatial interrelationships, or co-occurrences, of the grey tone N-tuples present in a texture pattern. In both procedures, texture is characterized in such a way as to be invariant under linear grey tone transformations. However, the cross-band procedure complements the single image procedure by extracting texture information and spectral information contained in ERTS multi-images. Classification experiments show that when used alone, without spectral processing, the cross-band texture procedure extracts more information than the single image texture analysis. Results show an improvement in average correct classification from 86.2% to 88.8% for ERTS image no. 1021-16333 with the cross-band texture procedure. However, when used together with spectral features, the single image texture plus spectral features perform better than the cross-band texture plus spectral features, with an average correct classification of 93.8% and 91.6%, respectively.
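The report predates modern texture libraries, but the grey-tone spatial-dependence (co-occurrence) matrix at the core of the single-image procedure is straightforward to compute directly; a minimal sketch for one nearest-neighbour offset, plus one derived texture feature, is shown below (the quantisation level and offset are illustrative, and this shows only the single-image case, not the cross-band N-tuple extension).

```python
import numpy as np

def cooccurrence_matrix(image, levels=8, offset=(0, 1)):
    """Symmetric grey-tone spatial-dependence matrix for one neighbour offset.

    image  : 2-D array of grey tones already quantised to `levels` values (0..levels-1).
    offset : (row, col) displacement defining the neighbour relation.
    """
    glcm = np.zeros((levels, levels), dtype=np.int64)
    dr, dc = offset
    rows, cols = image.shape
    for r in range(rows):
        for c in range(cols):
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols:
                glcm[image[r, c], image[rr, cc]] += 1
                glcm[image[rr, cc], image[r, c]] += 1   # count both directions (symmetric)
    return glcm

def contrast(glcm):
    """One of the classic texture features: grey-tone contrast."""
    p = glcm / glcm.sum()
    i, j = np.indices(p.shape)
    return float(((i - j) ** 2 * p).sum())

if __name__ == "__main__":
    rng = np.random.default_rng(4)
    img = rng.integers(0, 8, size=(64, 64))
    print(contrast(cooccurrence_matrix(img)))
```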
The extraction characteristic of Au-Ag from Au concentrate by thiourea solution
NASA Astrophysics Data System (ADS)
Kim, Bongju; Cho, Kanghee; On, Hyunsung; Choi, Nagchoul; Park, Cheonyoung
2013-04-01
Although the cyanidation process has been used commercially for the past 100 years, there are ores that are not amenable to treatment by cyanide. Interest in alternative lixiviants, such as thiourea, halogens, thiosulfate and malononitrile, has been revived as a result of a major increase in the gold price, which has stimulated new developments in extraction technology, combined with environmental concern. The Au extraction process using the thiourea solvent has many advantages over the cyanidation process, including higher leaching rates, faster extraction times and lower toxicity. The purpose of this study was to investigate the extraction characteristics of Au-Ag from two different Au concentrates (sulfuric acid washed and roasted) under various experimental conditions (thiourea concentration, pH of solvent, temperature) using a thiourea solvent. The extraction experiments showed that Au-Ag extraction was a fast process, reaching equilibrium (maximum extraction rate) within 30 min. The Au-Ag extraction rate was higher for the roasted concentrate than for the sulfuric acid washed one. The highest Au-Ag extraction rates from the roasted concentrate (Au: 70.87%, Ag: 98.12%) were found as thiourea concentration increased, pH decreased and extraction temperature increased. This study provides basic knowledge on the extraction method for studies of thiourea as a possible eco-friendly and economic route for Au-Ag resource utilization, including hydrometallurgy.
Information extraction from multi-institutional radiology reports.
Hassanpour, Saeed; Langlotz, Curtis P
2016-01-01
The radiology report is the most important source of clinical imaging information. It documents critical information about the patient's health and the radiologist's interpretation of medical findings. It also communicates information to the referring physicians and records that information for future clinical and research use. Although efforts to structure some radiology report information through predefined templates are beginning to bear fruit, a large portion of radiology report information is entered in free text. The free text format is a major obstacle for rapid extraction and subsequent use of information by clinicians, researchers, and healthcare information systems. This difficulty is due to the ambiguity and subtlety of natural language, complexity of described images, and variations among different radiologists and healthcare organizations. As a result, radiology reports are used only once by the clinician who ordered the study and rarely are used again for research and data mining. In this work, machine learning techniques and a large multi-institutional radiology report repository are used to extract the semantics of the radiology report and overcome the barriers to the re-use of radiology report information in clinical research and other healthcare applications. We describe a machine learning system to annotate radiology reports and extract report contents according to an information model. This information model covers the majority of clinically significant contents in radiology reports and is applicable to a wide variety of radiology study types. Our automated approach uses discriminative sequence classifiers for named-entity recognition to extract and organize clinically significant terms and phrases consistent with the information model. We evaluated our information extraction system on 150 radiology reports from three major healthcare organizations and compared its results to a commonly used non-machine learning information extraction method. We also evaluated the generalizability of our approach across different organizations by training and testing our system on data from different organizations. Our results show the efficacy of our machine learning approach in extracting the information model's elements (10-fold cross-validation average performance: precision: 87%, recall: 84%, F1 score: 85%) and its superiority and generalizability compared to the common non-machine learning approach (p-value<0.05). Our machine learning information extraction approach provides an effective automatic method to annotate and extract clinically significant information from a large collection of free text radiology reports. This information extraction system can help clinicians better understand the radiology reports and prioritize their review process. In addition, the extracted information can be used by researchers to link radiology reports to information from other data sources such as electronic health records and the patient's genome. Extracted information also can facilitate disease surveillance, real-time clinical decision support for the radiologist, and content-based image retrieval. Copyright © 2015 Elsevier B.V. All rights reserved.
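As a loose sketch of the kind of discriminative sequence classifier described (not the authors' system, features, or labels, and assuming the sklearn-crfsuite package), token-level tagging with a linear-chain CRF might be set up as follows:

```python
# Sketch: token-level named-entity tagging with a linear-chain CRF.
# Hypothetical toy data, features and labels; not the published system.
import sklearn_crfsuite

def token_features(tokens, i):
    t = tokens[i]
    return {
        "lower": t.lower(),
        "is_digit": t.isdigit(),
        "suffix3": t[-3:],
        "prev": tokens[i - 1].lower() if i > 0 else "<S>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "</S>",
    }

sentences = [["No", "acute", "intracranial", "hemorrhage", "."]]
labels = [["O", "O", "B-ANATOMY", "B-OBSERVATION", "O"]]   # hypothetical tag set

X = [[token_features(s, i) for i in range(len(s))] for s in sentences]
crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c1=0.1, c2=0.1, max_iterations=50)
crf.fit(X, labels)
print(crf.predict(X))
```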
Representation control increases task efficiency in complex graphical representations.
Moritz, Julia; Meyerhoff, Hauke S; Meyer-Dernbecher, Claudia; Schwan, Stephan
2018-01-01
In complex graphical representations, the relevant information for a specific task is often distributed across multiple spatial locations. In such situations, understanding the representation requires internal transformation processes in order to extract the relevant information. However, digital technology enables observers to alter the spatial arrangement of depicted information and therefore to offload the transformation processes. The objective of this study was to investigate the use of such a representation control (i.e. the users' option to decide how information should be displayed) in order to accomplish an information extraction task in terms of solution time and accuracy. In the representation control condition, the participants were allowed to reorganize the graphical representation and reduce information density. In the control condition, no interactive features were offered. We observed that participants in the representation control condition solved tasks that required reorganization of the maps faster and more accurately than participants without representation control. The present findings demonstrate how processes of cognitive offloading, spatial contiguity, and information coherence interact in knowledge media intended for broad and diverse groups of recipients.
Representation control increases task efficiency in complex graphical representations
Meyerhoff, Hauke S.; Meyer-Dernbecher, Claudia; Schwan, Stephan
2018-01-01
In complex graphical representations, the relevant information for a specific task is often distributed across multiple spatial locations. In such situations, understanding the representation requires internal transformation processes in order to extract the relevant information. However, digital technology enables observers to alter the spatial arrangement of depicted information and therefore to offload the transformation processes. The objective of this study was to investigate the use of such a representation control (i.e. the users' option to decide how information should be displayed) in order to accomplish an information extraction task in terms of solution time and accuracy. In the representation control condition, the participants were allowed to reorganize the graphical representation and reduce information density. In the control condition, no interactive features were offered. We observed that participants in the representation control condition solved tasks that required reorganization of the maps faster and more accurately than participants without representation control. The present findings demonstrate how processes of cognitive offloading, spatial contiguity, and information coherence interact in knowledge media intended for broad and diverse groups of recipients. PMID:29698443
NASA Astrophysics Data System (ADS)
Mouloudakis, K.; Kominis, I. K.
2017-02-01
Radical-ion-pair reactions, central for understanding the avian magnetic compass and spin transport in photosynthetic reaction centers, were recently shown to be a fruitful paradigm of the new synthesis of quantum information science with biological processes. We show here that the master equation so far constituting the theoretical foundation of spin chemistry violates fundamental bounds for the entropy of quantum systems, in particular the Ozawa bound. In contrast, a recently developed theory based on quantum measurements, quantum coherence measures, and quantum retrodiction, thus exemplifying the paradigm of quantum biology, satisfies the Ozawa bound as well as the Lanford-Robinson bound on information extraction. By considering Groenewold's information, the quantum information extracted during the reaction, we reproduce the known and unravel other magnetic-field effects not conveyed by reaction yields.
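For reference, a standard textbook form of Groenewold's information for a projective measurement (the general definition, not the radical-pair-specific quantities of this paper) is:

```latex
% General definition of Groenewold's information gain for a projective
% measurement with projectors \Pi_m acting on a state \rho; S is the
% von Neumann entropy S(\rho) = -\mathrm{Tr}(\rho \ln \rho).
I_G = S(\rho) - \sum_m p_m \, S(\rho_m),
\qquad
p_m = \mathrm{Tr}(\Pi_m \rho \Pi_m),
\qquad
\rho_m = \frac{\Pi_m \rho \Pi_m}{p_m}.
```

Ozawa's theorem guarantees that this quantity is non-negative for efficient measurements, which appears to be the bound referred to above.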
Multichannel Doppler Processing for an Experimental Low-Angle Tracking System
1990-05-01
... estimation techniques at sea. Because of clutter and noise, it is necessary to use a number of different processing algorithms to extract the required information. Consequently, the ELAT radar system is composed of multiple ... corresponding to RF frequencies f1 and f2. For mode 3, the ambiguities occur at vb1 = 15.186 knots and vb2 = 16.96 knots. The sea clutter, with a spectrum ...
Friesen, Melissa C; Locke, Sarah J; Tornow, Carina; Chen, Yu-Cheng; Koh, Dong-Hee; Stewart, Patricia A; Purdue, Mark; Colt, Joanne S
2014-06-01
Lifetime occupational history (OH) questionnaires often use open-ended questions to capture detailed information about study participants' jobs. Exposure assessors use this information, along with responses to job- and industry-specific questionnaires, to assign exposure estimates on a job-by-job basis. An alternative approach is to use information from the OH responses and the job- and industry-specific questionnaires to develop programmable decision rules for assigning exposures. As a first step in this process, we developed a systematic approach to extract the free-text OH responses and convert them into standardized variables that represented exposure scenarios. Our study population comprised 2408 subjects, reporting 11991 jobs, from a case-control study of renal cell carcinoma. Each subject completed a lifetime OH questionnaire that included verbatim responses, for each job, to open-ended questions including job title, main tasks and activities (task), tools and equipment used (tools), and chemicals and materials handled (chemicals). Based on a review of the literature, we identified exposure scenarios (occupations, industries, tasks/tools/chemicals) expected to involve possible exposure to chlorinated solvents, trichloroethylene (TCE) in particular, lead, and cadmium. We then used a SAS macro to review the information reported by study participants to identify jobs associated with each exposure scenario; this was done using previously coded standardized occupation and industry classification codes, and a priori lists of associated key words and phrases related to possibly exposed tasks, tools, and chemicals. Exposure variables representing the occupation, industry, and task/tool/chemicals exposure scenarios were added to the work history records of the study respondents. Our identification of possibly TCE-exposed scenarios in the OH responses was compared to an expert's independently assigned probability ratings to evaluate whether we missed identifying possibly exposed jobs. Our process added exposure variables for 52 occupation groups, 43 industry groups, and 46 task/tool/chemical scenarios to the data set of OH responses. Across all four agents, we identified possibly exposed task/tool/chemical exposure scenarios in 44-51% of the jobs in possibly exposed occupations. Possibly exposed task/tool/chemical exposure scenarios were found in a nontrivial 9-14% of the jobs not in possibly exposed occupations, suggesting that our process identified important information that would not be captured using occupation alone. Our extraction process was sensitive: for jobs where our extraction of OH responses identified no exposure scenarios and for which the sole source of information was the OH responses, only 0.1% were assessed as possibly exposed to TCE by the expert. Our systematic extraction of OH information found useful information in the task/chemicals/tools responses that was relatively easy to extract and that was not available from the occupational or industry information. The extracted variables can be used as inputs in the development of decision rules, especially for jobs where no additional information, such as job- and industry-specific questionnaires, is available. Published by Oxford University Press on behalf of the British Occupational Hygiene Society 2014.
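The decision-rule inputs described here were generated with a SAS macro; purely as an illustration of the same keyword-and-phrase matching idea (with invented keyword lists and a hypothetical job record, not the study's actual lists), a Python sketch might look like this:

```python
# Sketch: flag possibly exposed jobs by matching a priori keyword lists
# against free-text occupational-history responses. The scenario names,
# patterns, and job record below are hypothetical.
import re

TCE_SCENARIOS = {
    "degreasing": [r"\bdegreas\w*", r"vapor\s+degrease"],
    "dry_cleaning": [r"\bdry[- ]?clean\w*", r"\bperchlor\w*"],
}

def flag_scenarios(text, scenario_patterns):
    """Return the set of exposure scenarios whose patterns match the text."""
    text = text.lower()
    hits = set()
    for scenario, patterns in scenario_patterns.items():
        if any(re.search(p, text) for p in patterns):
            hits.add(scenario)
    return hits

job = {"title": "machinist",
       "task": "cleaned metal parts with a vapor degreaser",
       "chemicals": "solvent, cutting oil"}
print(flag_scenarios(" ".join(job.values()), TCE_SCENARIOS))   # {'degreasing'}
```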
Extracting laboratory test information from biomedical text
Kang, Yanna Shen; Kayaalp, Mehmet
2013-01-01
Background: No previous study reported the efficacy of current natural language processing (NLP) methods for extracting laboratory test information from narrative documents. This study investigates the pathology informatics question of how accurately such information can be extracted from text with the current tools and techniques, especially machine learning and symbolic NLP methods. The study data came from a text corpus maintained by the U.S. Food and Drug Administration, containing a rich set of information on laboratory tests and test devices. Methods: The authors developed a symbolic information extraction (SIE) system to extract device and test specific information about four types of laboratory test entities: Specimens, analytes, units of measures and detection limits. They compared the performance of SIE and three prominent machine learning based NLP systems, LingPipe, GATE and BANNER, each implementing a distinct supervised machine learning method, hidden Markov models, support vector machines and conditional random fields, respectively. Results: Machine learning systems recognized laboratory test entities with moderately high recall, but low precision rates. Their recall rates were relatively higher when the number of distinct entity values (e.g., the spectrum of specimens) was very limited or when lexical morphology of the entity was distinctive (as in units of measures), yet SIE outperformed them with statistically significant margins on extracting specimen, analyte and detection limit information in both precision and F-measure. Its high recall performance was statistically significant on analyte information extraction. Conclusions: Despite its shortcomings against machine learning methods, a well-tailored symbolic system may better discern relevancy among a pile of information of the same type and may outperform a machine learning system by tapping into lexically non-local contextual information such as the document structure. PMID:24083058
ERIC Educational Resources Information Center
Zhou, Mingming
2013-01-01
The ability to search, process, extract, evaluate and integrate information for learning purposes has clearly become a basic skill of the twenty-first century. Although this process is often taken as a cognitive process, research has shown a strong connection between emotion and cognition. Recent research has suggested that positive emotions…
Zheng, Shuai; Lu, James J; Ghasemzadeh, Nima; Hayek, Salim S; Quyyumi, Arshed A; Wang, Fusheng
2017-05-09
Extracting structured data from narrated medical reports is challenged by the complexity of heterogeneous structures and vocabularies and often requires significant manual effort. Traditional machine-based approaches lack the capability to take user feedback for improving the extraction algorithm in real time. Our goal was to provide a generic information extraction framework that can support diverse clinical reports and enables a dynamic interaction between a human and a machine that produces highly accurate results. A clinical information extraction system, IDEAL-X, has been built on top of online machine learning. It processes one document at a time, and user interactions are recorded as feedback to update the learning model in real time. The updated model is used to predict values for extraction in subsequent documents. Once prediction accuracy reaches a user-acceptable threshold, the remaining documents may be batch processed. A customizable controlled vocabulary may be used to support extraction. Three datasets were used for experiments based on report styles: 100 cardiac catheterization procedure reports, 100 coronary angiographic reports, and 100 integrated reports, each combining a history and physical report, discharge summary, outpatient clinic notes, outpatient clinic letter, and inpatient discharge medication report. Data extraction was performed by 3 methods: online machine learning, controlled vocabularies, and a combination of these. The system delivers results with F1 scores greater than 95%. IDEAL-X adopts a unique online machine learning-based approach combined with controlled vocabularies to support data extraction for clinical reports. The system can quickly learn and improve, thus it is highly adaptable. ©Shuai Zheng, James J Lu, Nima Ghasemzadeh, Salim S Hayek, Arshed A Quyyumi, Fusheng Wang. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 09.05.2017.
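The abstract does not specify IDEAL-X's learner; as a generic sketch of the predict-confirm-update loop it describes (assuming scikit-learn's incremental SGDClassifier and a hashing vectorizer, with hypothetical documents and field values):

```python
# Sketch: online learning loop with per-document user feedback.
# Generic illustration of the interaction pattern, not the IDEAL-X implementation.
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import SGDClassifier

vectorizer = HashingVectorizer(n_features=2 ** 16)
clf = SGDClassifier()
classes = ["normal", "abnormal"]                # hypothetical field values

documents = ["ejection fraction 55 percent", "severe stenosis of the LAD"]
user_feedback = ["normal", "abnormal"]          # values the reviewing user confirms

for doc, feedback in zip(documents, user_feedback):
    x = vectorizer.transform([doc])
    if hasattr(clf, "coef_"):                   # model initialized: suggest first
        print("suggested:", clf.predict(x)[0], "| user confirmed:", feedback)
    clf.partial_fit(x, [feedback], classes=classes)   # update with the feedback
```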
Map Design for Computer Processing: Literature Review and DMA Product Critique.
1985-01-01
... requirements can be separated from user preference ... extracting relief information using contour lines (vegetation shown by iconic symbols) versus extracting relief information using only contour lines (vegetation shown by tints); extracting vegetation information using iconic symbols (relief shown by elevation ...); ... in the case of point symbols, iconic forms ...; ... the symbols on a white background) in timing the performance of tasks ...
ERIC Educational Resources Information Center
Ramsey-Klee, Diane M.; Richman, Vivian
The purpose of this research is to develop content analytic techniques capable of extracting the differentiating information in narrative performance evaluations for enlisted personnel in order to aid in the process of selecting personnel for advancement, duty assignment, training, or quality retention. Four tasks were performed. The first task…
Modern Techniques in Acoustical Signal and Image Processing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Candy, J V
2002-04-04
Acoustical signal processing problems can lead to some complex and intricate techniques to extract the desired information from noisy, sometimes inadequate, measurements. The challenge is to formulate a meaningful strategy that is aimed at performing the processing required even in the face of uncertainties. This strategy can be as simple as a transformation of the measured data to another domain for analysis or as complex as embedding a full-scale propagation model into the processor. The aims of both approaches are the same: to extract the desired information and reject the extraneous, that is, to develop a signal processing scheme to achieve this goal. In this paper, we briefly discuss this underlying philosophy from a "bottom-up" approach enabling the problem to dictate the solution rather than vice versa.
Document Exploration and Automatic Knowledge Extraction for Unstructured Biomedical Text
NASA Astrophysics Data System (ADS)
Chu, S.; Totaro, G.; Doshi, N.; Thapar, S.; Mattmann, C. A.; Ramirez, P.
2015-12-01
We describe our work on building a web-browser based document reader with built-in exploration tool and automatic concept extraction of medical entities for biomedical text. Vast amounts of biomedical information are offered in unstructured text form through scientific publications and R&D reports. Utilizing text mining can help us to mine information and extract relevant knowledge from a plethora of biomedical text. The ability to employ such technologies to aid researchers in coping with information overload is greatly desirable. In recent years, there has been an increased interest in automatic biomedical concept extraction [1, 2] and intelligent PDF reader tools with the ability to search on content and find related articles [3]. Such reader tools are typically desktop applications and are limited to specific platforms. Our goal is to provide researchers with a simple tool to aid them in finding, reading, and exploring documents. Thus, we propose a web-based document explorer, which we called Shangri-Docs, which combines a document reader with automatic concept extraction and highlighting of relevant terms. Shangri-Docs also provides the ability to evaluate a wide variety of document formats (e.g. PDF, Words, PPT, text, etc.) and to exploit the linked nature of the Web and personal content by performing searches on content from public sites (e.g. Wikipedia, PubMed) and private cataloged databases simultaneously. Shangri-Docs utilizes Apache cTAKES (clinical Text Analysis and Knowledge Extraction System) [4] and Unified Medical Language System (UMLS) to automatically identify and highlight terms and concepts, such as specific symptoms, diseases, drugs, and anatomical sites, mentioned in the text. cTAKES was originally designed specially to extract information from clinical medical records. Our investigation leads us to extend the automatic knowledge extraction process of cTAKES for biomedical research domain by improving the ontology guided information extraction process. We will describe our experience and implementation of our system and share lessons learned from our development. We will also discuss ways in which this could be adapted to other science fields. [1] Funk et al., 2014. [2] Kang et al., 2014. [3] Utopia Documents, http://utopiadocs.com [4] Apache cTAKES, http://ctakes.apache.org
Hyperspectral image processing methods
USDA-ARS?s Scientific Manuscript database
Hyperspectral image processing refers to the use of computer algorithms to extract, store and manipulate both spatial and spectral information contained in hyperspectral images across the visible and near-infrared portion of the electromagnetic spectrum. A typical hyperspectral image processing work...
Zheng, Xiaodong; Zhu, Fengtao; Wu, Maoyu; Yan, Xinhuan; Meng, Xiaomeng; Song, Ye
2015-11-01
Carotenoid content analysis in wolfberry processed products has mainly focused on the determination of zeaxanthin or zeaxanthin dipalmitate, which cannot indicate the total carotenoid content (TCC) in wolfberries. We have exploited an effective approach for rapid extraction of carotenoid from wolfberry juice and determined TCC using UV-visible spectrophotometry. Several solvent mixtures, absorption wavelengths of carotenoid extracts and extraction procedures were investigated. The optimal solvent mixture with broad spectrum polarity was hexane-ethanol-acetone (2:1:1) and the optimal wavelength was 456 nm. There was no significant difference in TCC in wolfberry juice between direct extraction and saponification extraction. The developed method for assessment of TCC has been successfully employed in quality evaluation of wolfberry juice under different processing conditions. This measurement approach has inherent advantages (simplicity, rapidity, effectiveness) that make it appropriate for obtaining on-site information on TCC in wolfberry juice during processing. © 2014 Society of Chemical Industry.
The algorithm of fast image stitching based on multi-feature extraction
NASA Astrophysics Data System (ADS)
Yang, Chunde; Wu, Ge; Shi, Jing
2018-05-01
This paper proposed an improved image registration method combining Hu-based invariant moment contour information and feature point detection, aiming to solve problems in traditional image stitching algorithms such as the time-consuming feature point extraction process, redundant invalid information, and inefficiency. First, the neighborhood of pixels is used to extract contour information, employing the Hu invariant moments as the similarity measure to extract SIFT feature points in those similar regions. Then the Euclidean distance is replaced with the Hellinger kernel function to improve the initial matching efficiency and obtain fewer mismatched points, and the affine transformation matrix between the images is estimated. Finally, a local color mapping method is adopted to address uneven exposure, and the improved multiresolution fusion algorithm is used to fuse the mosaic images and realize seamless stitching. Experimental results confirm the high accuracy and efficiency of the method proposed in this paper.
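As a minimal sketch of the Hu-moment similarity idea in the first step (assuming OpenCV 4 and placeholder images; the SIFT/Hellinger matching and fusion stages are omitted):

```python
# Sketch: compare two contours' Hu invariant moments as a similarity measure.
# Placeholder images; not the paper's full stitching pipeline. Assumes OpenCV >= 4.
import cv2
import numpy as np

def hu_signature(gray):
    """Log-scaled Hu moments of the largest contour in a binarized image."""
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    largest = max(contours, key=cv2.contourArea)
    hu = cv2.HuMoments(cv2.moments(largest)).flatten()
    return -np.sign(hu) * np.log10(np.abs(hu) + 1e-30)   # compress dynamic range

img_a = np.zeros((100, 100), np.uint8); cv2.circle(img_a, (50, 50), 30, 255, -1)
img_b = np.zeros((100, 100), np.uint8); cv2.circle(img_b, (48, 52), 31, 255, -1)

distance = np.linalg.norm(hu_signature(img_a) - hu_signature(img_b))
print("Hu-moment distance:", distance)   # small distance -> similar regions
```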
An introduction to the Marshall information retrieval and display system
NASA Technical Reports Server (NTRS)
1974-01-01
An on-line terminal oriented data storage and retrieval system is presented which allows a user to extract and process information from stored data bases. The use of on-line terminals for extracting and displaying data from the data bases provides a fast and responsive method for obtaining needed information. The system consists of general purpose computer programs that provide the overall capabilities of the total system. The system can process any number of data files via a Dictionary (one for each file) which describes the data format to the system. New files may be added to the system at any time, and reprogramming is not required. Illustrations of the system are shown, and sample inquiries and responses are given.
Artificial retina model for the retinally blind based on wavelet transform
NASA Astrophysics Data System (ADS)
Zeng, Yan-an; Song, Xin-qiang; Jiang, Fa-gang; Chang, Da-ding
2007-01-01
An artificial retina is aimed at stimulating the remaining retinal neurons in patients with degenerated photoreceptors. Microelectrode arrays have been developed for this as part of the stimulator. Designing such microelectrode arrays first requires a suitable mathematical method for human retinal information processing. In this paper, a flexible and adjustable model for extracting human visual information is presented, based on the wavelet transform. Given the flexibility of the wavelet transform for image information processing and its consistency with human visual information extraction, wavelet transform theory is applied to the artificial retina model for the retinally blind. The response of the model to a synthetic image is shown. The simulated experiment demonstrates that the model behaves in a manner qualitatively similar to biological retinas and thus may serve as a basis for the development of an artificial retina.
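A minimal sketch of the underlying 2D wavelet decomposition (assuming the PyWavelets package and a random placeholder input; this is only the transform, not the retina model itself):

```python
# Sketch: one-level 2D discrete wavelet transform of an image patch,
# separating a coarse approximation from oriented detail channels.
# Illustrative only; not the artificial retina model itself.
import numpy as np
import pywt

image = np.random.rand(64, 64)                 # placeholder "retinal" input
cA, (cH, cV, cD) = pywt.dwt2(image, "haar")    # approximation + H/V/D details

print("approximation:", cA.shape)              # coarse luminance structure
print("detail bands:", cH.shape, cV.shape, cD.shape)

reconstructed = pywt.idwt2((cA, (cH, cV, cD)), "haar")
print("max reconstruction error:", np.abs(reconstructed - image).max())
```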
Aerobraking Maneuver (ABM) Report Generator
NASA Technical Reports Server (NTRS)
Fisher, Forrest; Gladden, Roy; Khanampornpan, Teerapat
2008-01-01
abmREPORT Version 3.1 is a Perl script that extracts vital summarization information from the Mars Reconnaissance Orbiter (MRO) aerobraking ABM build process. This information facilitates sequence reviews, and provides a high-level summarization of the sequence for mission management. The script extracts information from the ENV, SSF, FRF, SCMFmax, and OPTG files and burn magnitude configuration files and presents them in a single, easy-to-check report that provides the majority of the parameters necessary for cross check and verification during the sequence review process. This means that needed information, formerly spread across a number of different files and each in a different format, is all available in this one application. This program is built on the capabilities developed in dragReport and then the scripts evolved as the two tools continued to be developed in parallel.
Wilson, Richard A.; Chapman, Wendy W.; DeFries, Shawn J.; Becich, Michael J.; Chapman, Brian E.
2010-01-01
Background: Clinical records are often unstructured, free-text documents that create information extraction challenges and costs. Healthcare delivery and research organizations, such as the National Mesothelioma Virtual Bank, require the aggregation of both structured and unstructured data types. Natural language processing offers techniques for automatically extracting information from unstructured, free-text documents. Methods: Five hundred and eight history and physical reports from mesothelioma patients were split into development (208) and test sets (300). A reference standard was developed and each report was annotated by experts with regard to the patient's personal history of ancillary cancer and family history of any cancer. The Hx application was developed to process reports, extract relevant features, perform reference resolution and classify them with regard to cancer history. Two methods, Dynamic-Window and ConText, for extracting information were evaluated. Hx's classification responses using each of the two methods were measured against the reference standard. The average Cohen's weighted kappa served as the human benchmark in evaluating the system. Results: Hx had a high overall accuracy, with each method, scoring 96.2%. F-measures using the Dynamic-Window and ConText methods were 91.8% and 91.6%, which were comparable to the human benchmark of 92.8%. For the personal history classification, Dynamic-Window scored highest with 89.2% and for the family history classification, ConText scored highest with 97.6%, in which both methods were comparable to the human benchmark of 88.3% and 97.2%, respectively. Conclusion: We evaluated an automated application's performance in classifying a mesothelioma patient's personal and family history of cancer from clinical reports. To do so, the Hx application must process reports, identify cancer concepts, distinguish the known mesothelioma from ancillary cancers, recognize negation, perform reference resolution and determine the experiencer. Results indicated that both information extraction methods tested were dependent on the domain-specific lexicon and negation extraction. We showed that the more general method, ConText, performed as well as our task-specific method. Although Dynamic-Window could be modified to retrieve other concepts, ConText is more robust and performs better on inconclusive concepts. Hx could greatly improve and expedite the process of extracting data from free-text, clinical records for a variety of research or healthcare delivery organizations. PMID:21031012
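ConText has its own published rule set; as a loose, simplified illustration of the trigger-and-window idea behind such negation handling (invented trigger list and window size, not the Hx lexicon or the actual ConText algorithm):

```python
# Sketch: crude trigger-and-window negation check, loosely in the spirit of
# ConText/NegEx. Trigger list and window size are illustrative assumptions.
import re

NEGATION_TRIGGERS = [r"\bno\b", r"\bdenies\b", r"\bwithout\b", r"\bnegative for\b"]
WINDOW = 6   # number of tokens a trigger's scope is assumed to extend forward

def is_negated(tokens, concept_index):
    """Return True if a negation trigger precedes the concept within WINDOW tokens."""
    start = max(0, concept_index - WINDOW)
    preceding = " ".join(tokens[start:concept_index]).lower()
    return any(re.search(t, preceding) for t in NEGATION_TRIGGERS)

tokens = "Patient denies any family history of mesothelioma .".split()
print(is_negated(tokens, tokens.index("mesothelioma")))   # True
```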
NASA Astrophysics Data System (ADS)
Zhou, Tingting; Gu, Lingjia; Ren, Ruizhi; Cao, Qiong
2016-09-01
With the rapid development of remote sensing technology, the spatial and temporal resolution of satellite imagery has also increased greatly. Meanwhile, high-spatial-resolution images are becoming increasingly popular for commercial applications. Remote sensing image technology has broad application prospects in intelligent traffic. Compared with traditional traffic information collection methods, vehicle information extraction using high-resolution remote sensing imagery has the advantages of high resolution and wide coverage. This has great guiding significance for urban planning, transportation management, travel route choice and so on. Firstly, this paper preprocessed the acquired high-resolution multi-spectral and panchromatic remote sensing images. After that, on the one hand, histogram equalization and linear enhancement technologies were applied to the preprocessing results in order to obtain the optimal threshold for image segmentation. On the other hand, considering the distribution characteristics of roads, the normalized difference vegetation index (NDVI) and normalized difference water index (NDWI) were used to suppress water and vegetation information in the preprocessing results. Then, the above two processing results were combined. Finally, geometric characteristics were used to complete road information extraction. The extracted road vector was used to limit the target vehicle area. Target vehicle extraction was divided into bright vehicle extraction and dark vehicle extraction. Eventually, the extraction results of the two kinds of vehicles were combined to obtain the final results. The experimental results demonstrated that the proposed algorithm has high precision for vehicle information extraction from different high-resolution remote sensing images. Among these results, the average fault detection rate was about 5.36%, the average residual rate was about 13.60% and the average accuracy was approximately 91.26%.
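The NDVI and NDWI masks used to suppress vegetation and water are standard band ratios; a small sketch with placeholder band arrays and assumed thresholds (the paper's actual thresholds are not given in the abstract):

```python
# Sketch: NDVI/NDWI masks to suppress vegetation and water before road/vehicle
# extraction. Band arrays and thresholds are placeholders.
import numpy as np

def normalized_difference(a, b):
    return (a - b) / (a + b + 1e-9)            # avoid division by zero

green = np.random.rand(512, 512)               # placeholder multispectral bands
red = np.random.rand(512, 512)
nir = np.random.rand(512, 512)

ndvi = normalized_difference(nir, red)         # high over vegetation
ndwi = normalized_difference(green, nir)       # high over water

candidate_mask = (ndvi < 0.2) & (ndwi < 0.0)   # keep likely road/built-up pixels
print("candidate fraction:", candidate_mask.mean())
```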
Volume and Value of Big Healthcare Data.
Dinov, Ivo D
Modern scientific inquiries require significant data-driven evidence and trans-disciplinary expertise to extract valuable information and gain actionable knowledge about natural processes. Effective evidence-based decisions require collection, processing and interpretation of vast amounts of complex data. The Moore's and Kryder's laws of exponential increase of computational power and information storage, respectively, dictate the need for rapid trans-disciplinary advances, technological innovation and effective mechanisms for managing and interrogating Big Healthcare Data. In this article, we review important aspects of Big Data analytics and discuss important questions like: What are the challenges and opportunities associated with this biomedical, social, and healthcare data avalanche? Are there innovative statistical computing strategies to represent, model, analyze and interpret Big heterogeneous data? We present the foundation of a new compressive big data analytics (CBDA) framework for representation, modeling and inference of large, complex and heterogeneous datasets. Finally, we consider specific directions likely to impact the process of extracting information from Big healthcare data, translating that information to knowledge, and deriving appropriate actions.
Volume and Value of Big Healthcare Data
Dinov, Ivo D.
2016-01-01
Modern scientific inquiries require significant data-driven evidence and trans-disciplinary expertise to extract valuable information and gain actionable knowledge about natural processes. Effective evidence-based decisions require collection, processing and interpretation of vast amounts of complex data. The Moore's and Kryder's laws of exponential increase of computational power and information storage, respectively, dictate the need for rapid trans-disciplinary advances, technological innovation and effective mechanisms for managing and interrogating Big Healthcare Data. In this article, we review important aspects of Big Data analytics and discuss important questions like: What are the challenges and opportunities associated with this biomedical, social, and healthcare data avalanche? Are there innovative statistical computing strategies to represent, model, analyze and interpret Big heterogeneous data? We present the foundation of a new compressive big data analytics (CBDA) framework for representation, modeling and inference of large, complex and heterogeneous datasets. Finally, we consider specific directions likely to impact the process of extracting information from Big healthcare data, translating that information to knowledge, and deriving appropriate actions. PMID:26998309
Rajaei, Karim; Khaligh-Razavi, Seyed-Mahdi; Ghodrati, Masoud; Ebrahimpour, Reza; Shiri Ahmad Abadi, Mohammad Ebrahim
2012-01-01
The brain mechanism of extracting visual features for recognizing various objects has consistently been a controversial issue in computational models of object recognition. To extract visual features, we introduce a new, biologically motivated model for facial categorization, which is an extension of the Hubel and Wiesel simple-to-complex cell hierarchy. To address the synaptic stability versus plasticity dilemma, we apply the Adaptive Resonance Theory (ART) for extracting informative intermediate level visual features during the learning process, which also makes this model stable against the destruction of previously learned information while learning new information. Such a mechanism has been suggested to be embedded within known laminar microcircuits of the cerebral cortex. To reveal the strength of the proposed visual feature learning mechanism, we show that when we use this mechanism in the training process of a well-known biologically motivated object recognition model (the HMAX model), it performs better than the HMAX model in face/non-face classification tasks. Furthermore, we demonstrate that our proposed mechanism is capable of following similar trends in performance as humans in a psychophysical experiment using a face versus non-face rapid categorization task.
Spineli, Loukia M
2017-12-01
To report challenges encountered during the extraction process from Cochrane reviews in mental health and Campbell reviews and to indicate their implications for the empirical performance of different methods to handle missingness. We used a collection of meta-analyses on binary outcomes collated from a previous work on missing outcome data. To evaluate the accuracy of their extraction, we developed specific criteria pertaining to the reporting of missing outcome data in systematic reviews. Using the most popular methods to handle missing binary outcome data, we investigated the implications of the accuracy of the extracted meta-analysis on the random-effects meta-analysis results. Of 113 meta-analyses from Cochrane reviews, 60 (53%) were judged as "unclearly" extracted (ie, no information on the outcome of completers but available information on how missing participants were handled) and 42 (37%) as "unacceptably" extracted (ie, no information on the outcome of completers as well as no information on how missing participants were handled). For the remaining meta-analyses, it was judged that data were "acceptably" extracted (ie, information on the completers' outcome was provided for all trials). Overall, "unclear" extraction overestimated the magnitude of the summary odds ratio and the between-study variance and additionally inflated the uncertainty of both meta-analytical parameters. The only eligible Campbell review was judged as "unclear." Depending on the extent of missingness, the reporting quality of the systematic reviews can greatly affect the accuracy of the extracted meta-analyses and by extent, the empirical performance of different methods to handle missingness. Copyright © 2017 John Wiley & Sons, Ltd.
Motion processing with two eyes in three dimensions.
Rokers, Bas; Czuba, Thaddeus B; Cormack, Lawrence K; Huk, Alexander C
2011-02-11
The movement of an object toward or away from the head is perhaps the most critical piece of information an organism can extract from its environment. Such 3D motion produces horizontally opposite motions on the two retinae. Little is known about how or where the visual system combines these two retinal motion signals, relative to the wealth of knowledge about the neural hierarchies involved in 2D motion processing and binocular vision. Canonical conceptions of primate visual processing assert that neurons early in the visual system combine monocular inputs into a single cyclopean stream (lacking eye-of-origin information) and extract 1D ("component") motions; later stages then extract 2D pattern motion from the cyclopean output of the earlier stage. Here, however, we show that 3D motion perception is in fact affected by the comparison of opposite 2D pattern motions between the two eyes. Three-dimensional motion sensitivity depends systematically on pattern motion direction when dichoptically viewing gratings and plaids, and a novel "dichoptic pseudoplaid" stimulus provides strong support for use of interocular pattern motion differences by precluding potential contributions from conventional disparity-based mechanisms. These results imply the existence of eye-of-origin information in later stages of motion processing and therefore motivate the incorporation of such eye-specific pattern-motion signals in models of motion processing and binocular integration.
Quantity and unit extraction for scientific and technical intelligence analysis
NASA Astrophysics Data System (ADS)
David, Peter; Hawes, Timothy
2017-05-01
Scientific and Technical (S and T) intelligence analysts consume huge amounts of data to understand how scientific progress and engineering efforts affect current and future military capabilities. One of the most important types of information S and T analysts exploit is the quantities discussed in their source material. Frequencies, ranges, size, weight, power, and numerous other properties and measurements describing the performance characteristics of systems and the engineering constraints that define them must be culled from source documents before quantified analysis can begin. Automating the process of finding and extracting the relevant quantities from a wide range of S and T documents is difficult because information about quantities and their units is often contained in unstructured text with ad hoc conventions used to convey their meaning. Currently, even simple tasks, such as searching for documents discussing RF frequencies in a band of interest, are labor intensive and error prone. This research addresses the challenges facing development of a document processing capability that extracts quantities and units from S and T data, and how Natural Language Processing algorithms can be used to overcome these challenges.
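As a minimal sketch of the value-plus-unit matching and normalization such a capability needs (a tiny invented unit table and pattern, not the described system):

```python
# Sketch: pull (value, unit) pairs from free text and normalize the unit.
# The unit table and regex are illustrative, not the described system.
import re

UNIT_ALIASES = {
    "ghz": ("hertz", 1e9), "mhz": ("hertz", 1e6), "khz": ("hertz", 1e3),
    "kg": ("gram", 1e3), "g": ("gram", 1.0), "km": ("meter", 1e3), "m": ("meter", 1.0),
}
QUANTITY_RE = re.compile(r"(\d+(?:\.\d+)?)\s*(GHz|MHz|kHz|kg|km|g|m)\b", re.IGNORECASE)

def extract_quantities(text):
    results = []
    for value, unit in QUANTITY_RE.findall(text):
        base_unit, factor = UNIT_ALIASES[unit.lower()]
        results.append((float(value) * factor, base_unit, value + " " + unit))
    return results

text = "The radar operates between 8.5 GHz and 10.2 GHz with a 12 kg antenna."
for normalized, unit, raw in extract_quantities(text):
    print(f"{raw:>10} -> {normalized:g} {unit}")
```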
Rethinking the 2000 ACRL Standards: Some Things to Consider
ERIC Educational Resources Information Center
Kuhlthau, Carol C.
2013-01-01
I propose three "rethinks" to consider in recasting the ACRL Standards for information literacy for the coming decades. First, rethink the concept of information need. Second, rethink the notion that information literacy is composed of a set of abilities for "extracting information." Third, rethink the holistic process of…
Hunter, Lawrence; Lu, Zhiyong; Firby, James; Baumgartner, William A; Johnson, Helen L; Ogren, Philip V; Cohen, K Bretonnel
2008-01-01
Background: Information extraction (IE) efforts are widely acknowledged to be important in harnessing the rapid advance of biomedical knowledge, particularly in areas where important factual information is published in a diverse literature. Here we report on the design, implementation and several evaluations of OpenDMAP, an ontology-driven, integrated concept analysis system. It significantly advances the state of the art in information extraction by leveraging knowledge in ontological resources, integrating diverse text processing applications, and using an expanded pattern language that allows the mixing of syntactic and semantic elements and variable ordering. Results: OpenDMAP information extraction systems were produced for extracting protein transport assertions (transport), protein-protein interaction assertions (interaction) and assertions that a gene is expressed in a cell type (expression). Evaluations were performed on each system, resulting in F-scores ranging from .26 – .72 (precision .39 – .85, recall .16 – .85). Additionally, each of these systems was run over all abstracts in MEDLINE, producing a total of 72,460 transport instances, 265,795 interaction instances and 176,153 expression instances. Conclusion: OpenDMAP advances the performance standards for extracting protein-protein interaction predications from the full texts of biomedical research articles. Furthermore, this level of performance appears to generalize to other information extraction tasks, including extracting information about predicates of more than two arguments. The output of the information extraction system is always constructed from elements of an ontology, ensuring that the knowledge representation is grounded with respect to a carefully constructed model of reality. The results of these efforts can be used to increase the efficiency of manual curation efforts and to provide additional features in systems that integrate multiple sources for information extraction. The open source OpenDMAP code library is freely available at PMID:18237434
NASA Technical Reports Server (NTRS)
Thorley, G. A.; Draeger, W. C.; Lauer, D. T.; Lent, J.; Roberts, E.
1971-01-01
The four problem areas being investigated are: (1) determination of the feasibility of providing the resource manager with operationally useful information through the use of remote sensing techniques; (2) definition of the spectral characteristics of earth resources and the optimum procedures for calibrating tone and color characteristics of multispectral imagery; (3) determination of the extent to which humans can extract useful earth resource information through remote sensing imagery; (4) determination of the extent to which automatic classification and data processing can extract useful information from remote sensing data.
Zhang, Yanjun; Liu, Wen-zhe; Fu, Xing-hu; Bi, Wei-hong
2016-02-01
Given that traditional signal processing methods cannot effectively distinguish different vibration intrusion signals, a feature extraction and recognition method for vibration information is proposed based on EMD-AWPP and HOSA-SVM, for high-precision signal recognition in distributed fiber optic intrusion detection systems. When dealing with different types of vibration, the method first utilizes an adaptive wavelet processing algorithm based on empirical mode decomposition to reduce the influence of abnormal values in the sensing signal and improve the accuracy of signal feature extraction. Not only is the low-frequency part of the signal decomposed, but the details in the high-frequency part are also better handled through time-frequency localization. Second, the bispectrum and bicoherence spectrum are used to accurately extract the feature vector that characterizes the different types of intrusion vibration. Finally, with the BPNN as a reference model, the recognition parameters of the SVM, tuned by particle swarm optimization, can distinguish signals of different intrusion vibrations, which endows the identification model with stronger adaptive and self-learning ability and overcomes shortcomings such as easily falling into local optima. The simulation experiment results showed that this new method can effectively extract the feature vector of the sensing information, eliminate the influence of random noise and reduce the effects of outliers for different types of intrusion source. The predicted category agrees with the output category and the accuracy of vibration identification can reach above 95%, so the method outperforms the BPNN recognition algorithm and effectively improves the accuracy of the information analysis.
AAlAbdulsalam, Abdulrahman K.; Garvin, Jennifer H.; Redd, Andrew; Carter, Marjorie E.; Sweeny, Carol; Meystre, Stephane M.
2018-01-01
Cancer stage is one of the most important prognostic parameters in most cancer subtypes. The American Joint Committee on Cancer (AJCC) specifies criteria for staging each cancer type based on tumor characteristics (T), lymph node involvement (N), and tumor metastasis (M) known as TNM staging system. Information related to cancer stage is typically recorded in clinical narrative text notes and other informal means of communication in the Electronic Health Record (EHR). As a result, human chart-abstractors (known as certified tumor registrars) have to search through voluminous amounts of text to extract accurate stage information and resolve discordance between different data sources. This study proposes novel applications of natural language processing and machine learning to automatically extract and classify TNM stage mentions from records at the Utah Cancer Registry. Our results indicate that TNM stages can be extracted and classified automatically with high accuracy (extraction sensitivity: 95.5%–98.4% and classification sensitivity: 83.5%–87%). PMID:29888032
AAlAbdulsalam, Abdulrahman K; Garvin, Jennifer H; Redd, Andrew; Carter, Marjorie E; Sweeny, Carol; Meystre, Stephane M
2018-01-01
Cancer stage is one of the most important prognostic parameters in most cancer subtypes. The American Joint Committee on Cancer (AJCC) specifies criteria for staging each cancer type based on tumor characteristics (T), lymph node involvement (N), and tumor metastasis (M) known as TNM staging system. Information related to cancer stage is typically recorded in clinical narrative text notes and other informal means of communication in the Electronic Health Record (EHR). As a result, human chart-abstractors (known as certified tumor registrars) have to search through voluminous amounts of text to extract accurate stage information and resolve discordance between different data sources. This study proposes novel applications of natural language processing and machine learning to automatically extract and classify TNM stage mentions from records at the Utah Cancer Registry. Our results indicate that TNM stages can be extracted and classified automatically with high accuracy (extraction sensitivity: 95.5%-98.4% and classification sensitivity: 83.5%-87%).
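The study above used NLP and machine learning classifiers; purely to illustrate what a TNM mention looks like to an extractor, a naive rule-based sketch (an assumed pattern, far from AJCC-complete) could be:

```python
# Sketch: naive regex spotting of TNM stage mentions in narrative text.
# Illustrative only; the cited study used NLP + machine learning classifiers.
import re

TNM_RE = re.compile(
    r"\b[ypc]{0,2}T(?:[0-4][a-d]?|is|x)\s*"   # tumor, e.g. pT2a, Tis, Tx
    r"N(?:[0-3][a-c]?|x)\s*"                  # nodes, e.g. N1, Nx
    r"M(?:[01][a-c]?|x)\b",                   # metastasis, e.g. M0, M1a
    re.IGNORECASE,
)

note = "Final pathologic stage pT2a N1 M0; prior imaging suggested cT3NxM0."
for match in TNM_RE.finditer(note):
    print(match.group(0))
```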
The comparison and analysis of extracting video key frame
NASA Astrophysics Data System (ADS)
Ouyang, S. Z.; Zhong, L.; Luo, R. Q.
2018-05-01
Video key frame extraction is an important part of large-scale data processing. Building on previous work in key frame extraction, we summarize four important key frame extraction algorithms; these methods are largely based on comparing the differences between pairs of frames, and if the difference exceeds a threshold value, the corresponding frames are taken as two different key frames. Following this review, a key frame extraction method based on the amount of mutual information is proposed: information entropy is introduced, appropriate threshold values are selected to form the initial classes, and finally frames with similar mean mutual information are taken as candidate key frames. In this paper, several algorithms are used to extract the key frames of tunnel traffic videos. Then, through analysis of the experimental results and comparison of the pros and cons of these algorithms, a sound basis for practical applications is provided.
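A small sketch of the shared frame-difference idea, using mutual information between consecutive frames' grey-level histograms as the similarity measure (synthetic frames and an assumed threshold, not the paper's algorithm):

```python
# Sketch: pick key frames where mutual information between consecutive frames
# drops below a threshold. Synthetic frames; illustrative only.
import numpy as np

def mutual_information(frame_a, frame_b, bins=32):
    """Mutual information of the joint grey-level histogram of two frames."""
    joint, _, _ = np.histogram2d(frame_a.ravel(), frame_b.ravel(), bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nonzero = pxy > 0
    return float(np.sum(pxy[nonzero] * np.log(pxy[nonzero] / (px @ py)[nonzero])))

rng = np.random.default_rng(0)
scene_a = rng.integers(0, 256, (120, 160)).astype(float)
scene_b = rng.integers(0, 256, (120, 160)).astype(float)
frames = [scene_a + rng.normal(0, 5, scene_a.shape) for _ in range(3)]
frames += [scene_b + rng.normal(0, 5, scene_b.shape) for _ in range(2)]

THRESHOLD = 0.5   # assumed; a drop in MI between consecutive frames marks a cut
key_frames = [0] + [i for i in range(1, len(frames))
                    if mutual_information(frames[i - 1], frames[i]) < THRESHOLD]
print("key frame indices:", key_frames)   # expect a new key frame at the scene change
```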
Hand and goods judgment algorithm based on depth information
NASA Astrophysics Data System (ADS)
Li, Mingzhu; Zhang, Jinsong; Yan, Dan; Wang, Qin; Zhang, Ruiqi; Han, Jing
2016-03-01
A tablet computer with a depth camera and a color camera is mounted on a traditional shopping cart. Information about the inside of the shopping cart is obtained by the two cameras. In shopping cart monitoring, it is very important to determine whether the customer's hand is holding goods when moving into or out of the shopping cart. This paper establishes a basic framework for judging whether the hand is empty; it includes a hand extraction process based on depth information, a skin color model built with WPCA (Weighted Principal Component Analysis), an algorithm for judging handheld products based on motion and skin color information, and a statistical process. Within this framework, the first step ensures the integrity of the hand information and effectively avoids the influence of sleeves and other debris; the second step accurately extracts skin color and eliminates interference from similar colors, is little affected by lighting, and has the advantages of fast computation and high efficiency; and the third step greatly reduces noise interference and improves accuracy.
Aniba, Mohamed Radhouene; Siguenza, Sophie; Friedrich, Anne; Plewniak, Frédéric; Poch, Olivier; Marchler-Bauer, Aron; Thompson, Julie Dawn
2009-01-01
The traditional approach to bioinformatics analyses relies on independent task-specific services and applications, using different input and output formats, often idiosyncratic, and frequently not designed to inter-operate. In general, such analyses were performed by experts who manually verified the results obtained at each step in the process. Today, the amount of bioinformatics information continuously being produced means that handling the various applications used to study this information presents a major data management and analysis challenge to researchers. It is now impossible to manually analyse all this information and new approaches are needed that are capable of processing the large-scale heterogeneous data in order to extract the pertinent information. We review the recent use of integrated expert systems aimed at providing more efficient knowledge extraction for bioinformatics research. A general methodology for building knowledge-based expert systems is described, focusing on the unstructured information management architecture, UIMA, which provides facilities for both data and process management. A case study involving a multiple alignment expert system prototype called AlexSys is also presented.
Aniba, Mohamed Radhouene; Siguenza, Sophie; Friedrich, Anne; Plewniak, Frédéric; Poch, Olivier; Marchler-Bauer, Aron
2009-01-01
The traditional approach to bioinformatics analyses relies on independent task-specific services and applications, using different input and output formats, often idiosyncratic, and frequently not designed to inter-operate. In general, such analyses were performed by experts who manually verified the results obtained at each step in the process. Today, the amount of bioinformatics information continuously being produced means that handling the various applications used to study this information presents a major data management and analysis challenge to researchers. It is now impossible to manually analyse all this information and new approaches are needed that are capable of processing the large-scale heterogeneous data in order to extract the pertinent information. We review the recent use of integrated expert systems aimed at providing more efficient knowledge extraction for bioinformatics research. A general methodology for building knowledge-based expert systems is described, focusing on the unstructured information management architecture, UIMA, which provides facilities for both data and process management. A case study involving a multiple alignment expert system prototype called AlexSys is also presented. PMID:18971242
A novel speech-processing strategy incorporating tonal information for cochlear implants.
Lan, N; Nie, K B; Gao, S K; Zeng, F G
2004-05-01
Good performance in cochlear implant users depends in large part on the ability of a speech processor to effectively decompose speech signals into multiple channels of narrow-band electrical pulses for stimulation of the auditory nerve. Speech processors that extract only envelopes of the narrow-band signals (e.g., the continuous interleaved sampling (CIS) processor) may not provide sufficient information to encode the tonal cues in languages such as Chinese. To improve the performance in cochlear implant users who speak tonal languages, we proposed and developed a novel speech-processing strategy, which extracted both the envelopes of the narrow-band signals and the fundamental frequency (F0) of the speech signal, and used them to modulate both the amplitude and the frequency of the electrical pulses delivered to stimulation electrodes. We developed an algorithm to extract the fundamental frequency and identified the general patterns of pitch variations of four typical tones in Chinese speech. The effectiveness of the extraction algorithm was verified with an artificial neural network that recognized the tonal patterns from the extracted F0 information. We then compared the novel strategy with the envelope-extraction CIS strategy in human subjects with normal hearing. The novel strategy produced significant improvement in perception of Chinese tones, phrases, and sentences. This novel processor with dynamic modulation of both frequency and amplitude is encouraging for the design of a cochlear implant device for sensorineurally deaf patients who speak tonal languages.
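The abstract does not give the F0 extraction algorithm; as a generic sketch of one common approach (short-frame autocorrelation on a synthetic two-harmonic signal, with an assumed sampling rate), not the strategy actually implemented:

```python
# Sketch: estimate fundamental frequency (F0) of a speech frame by
# autocorrelation. Generic illustration, not the paper's algorithm.
import numpy as np

FS = 16000                                    # sampling rate in Hz (assumed)

def estimate_f0(frame, fs=FS, fmin=60.0, fmax=400.0):
    """Return the lag-domain autocorrelation peak converted to a frequency."""
    frame = frame - frame.mean()
    corr = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lag_min, lag_max = int(fs / fmax), int(fs / fmin)
    best_lag = lag_min + int(np.argmax(corr[lag_min:lag_max]))
    return fs / best_lag

t = np.arange(0, 0.04, 1.0 / FS)              # 40 ms analysis frame
signal = np.sin(2 * np.pi * 220 * t) + 0.3 * np.sin(2 * np.pi * 440 * t)
print(f"estimated F0: {estimate_f0(signal):.1f} Hz")   # about 220 Hz
```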
Automated extraction of family history information from clinical notes.
Bill, Robert; Pakhomov, Serguei; Chen, Elizabeth S; Winden, Tamara J; Carter, Elizabeth W; Melton, Genevieve B
2014-01-01
Despite increased functionality for obtaining family history in a structured format within electronic health record systems, clinical notes often still contain this information. We developed and evaluated an Unstructured Information Management Application (UIMA)-based natural language processing (NLP) module for automated extraction of family history information with functionality for identifying statements, observations (e.g., disease or procedure), relative or side of family with attributes (i.e., vital status, age of diagnosis, certainty, and negation), and predication ("indicator phrases"), the latter of which was used to establish relationships between observations and family member. The family history NLP system demonstrated F-scores of 66.9, 92.4, 82.9, 57.3, 97.7, and 61.9 for detection of family history statements, family member identification, observation identification, negation identification, vital status, and overall extraction of the predications between family members and observations, respectively. While the system performed well for detection of family history statements and predication constituents, further work is needed to improve extraction of certainty and temporal modifications.
Automated Extraction of Family History Information from Clinical Notes
Bill, Robert; Pakhomov, Serguei; Chen, Elizabeth S.; Winden, Tamara J.; Carter, Elizabeth W.; Melton, Genevieve B.
2014-01-01
Despite increased functionality for obtaining family history in a structured format within electronic health record systems, clinical notes often still contain this information. We developed and evaluated an Unstructured Information Management Application (UIMA)-based natural language processing (NLP) module for automated extraction of family history information with functionality for identifying statements, observations (e.g., disease or procedure), relative or side of family with attributes (i.e., vital status, age of diagnosis, certainty, and negation), and predication (“indicator phrases”), the latter of which was used to establish relationships between observations and family member. The family history NLP system demonstrated F-scores of 66.9, 92.4, 82.9, 57.3, 97.7, and 61.9 for detection of family history statements, family member identification, observation identification, negation identification, vital status, and overall extraction of the predications between family members and observations, respectively. While the system performed well for detection of family history statements and predication constituents, further work is needed to improve extraction of certainty and temporal modifications. PMID:25954443
Information Extraction for System-Software Safety Analysis: Calendar Year 2008 Year-End Report
NASA Technical Reports Server (NTRS)
Malin, Jane T.
2009-01-01
This annual report describes work to integrate a set of tools to support early model-based analysis of failures and hazards due to system-software interactions. The tools perform and assist analysts in the following tasks: 1) extract model parts from text for architecture and safety/hazard models; 2) combine the parts with library information to develop the models for visualization and analysis; 3) perform graph analysis and simulation to identify and evaluate possible paths from hazard sources to vulnerable entities and functions, in nominal and anomalous system-software configurations and scenarios; and 4) identify resulting candidate scenarios for software integration testing. There has been significant technical progress in model extraction from Orion program text sources, architecture model derivation (components and connections) and documentation of extraction sources. Models have been derived from Internal Interface Requirements Documents (IIRDs) and FMEA documents. Linguistic text processing is used to extract model parts and relationships, and the Aerospace Ontology also aids automated model development from the extracted information. Visualizations of these models assist analysts in requirements overview and in checking consistency and completeness.
Multidimensional biochemical information processing of dynamical patterns
NASA Astrophysics Data System (ADS)
Hasegawa, Yoshihiko
2018-02-01
Cells receive signaling molecules by receptors and relay information via sensory networks so that they can respond properly depending on the type of signal. Recent studies have shown that cells can extract multidimensional information from dynamical concentration patterns of signaling molecules. We herein study how biochemical systems can process multidimensional information embedded in dynamical patterns. We model the decoding networks by linear response functions, and optimize the functions with the calculus of variations to maximize the mutual information between patterns and output. We find that, when the noise intensity is lower, decoders with different linear response functions, i.e., distinct decoders, can extract much information. However, when the noise intensity is higher, distinct decoders do not provide the maximum amount of information. This indicates that, when transmitting information by dynamical patterns, embedding information in multiple patterns is not optimal when the noise intensity is very large. Furthermore, we explore the biochemical implementations of these decoders using control theory and demonstrate that these decoders can be implemented biochemically through the modification of cascade-type networks, which are prevalent in actual signaling pathways.
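To make the information-theoretic criterion concrete, here is a minimal numerical sketch of my own construction, not the paper's variational optimization: a hidden "pattern parameter" is read out by a noisy linear response, and the mutual information between parameter and output is estimated with a simple two-dimensional histogram. The gain, noise levels, and bin count are arbitrary assumptions.

```python
import numpy as np

def mutual_information(x, y, bins=30):
    """Histogram (plug-in) estimate of I(X;Y) in bits."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = pxy / pxy.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))

rng = np.random.default_rng(0)
n = 50000
theta = rng.uniform(0.5, 2.0, size=n)          # hidden parameter of the input pattern
for noise in (0.05, 0.5, 2.0):                 # increasing noise intensity
    output = 1.3 * theta + rng.normal(0.0, noise, size=n)   # linear readout plus noise
    print(noise, round(mutual_information(theta, output), 3))  # information drops as noise grows
```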
A research of road centerline extraction algorithm from high resolution remote sensing images
NASA Astrophysics Data System (ADS)
Zhang, Yushan; Xu, Tingfa
2017-09-01
Satellite remote sensing technology has become one of the most effective methods for land surface monitoring in recent years, due to advantages such as its short revisit period, large coverage and rich information. Meanwhile, road extraction is an important field in the applications of high resolution remote sensing images. An intelligent and automatic road extraction algorithm with high precision has great significance for transportation, road network updating and urban planning. Fuzzy c-means (FCM) clustering segmentation algorithms have been used in road extraction, but the traditional algorithms do not consider spatial information. An improved fuzzy C-means clustering algorithm combined with spatial information (SFCM) is proposed in this paper and shown to be effective for noisy image segmentation. Firstly, the image is segmented using the SFCM. Secondly, the segmentation result is processed by mathematical morphology to remove the joint regions. Thirdly, the road centerlines are extracted by morphological thinning and burr trimming. The average integrity of the centerline extraction algorithm is 97.98%, the average accuracy is 95.36% and the average quality is 93.59%. Experimental results show that the proposed method is effective for road centerline extraction.
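The pipeline described here, cluster, clean up with morphology, then thin to a centerline, can be sketched as follows. The fuzzy C-means step is a bare-bones NumPy implementation without the paper's spatial term, and skeletonization stands in for the thinning stage, so treat this as an illustrative assumption rather than the authors' SFCM.

```python
import numpy as np
from skimage.morphology import remove_small_objects, skeletonize

def fuzzy_cmeans(pixels, c=2, m=2.0, iters=50, seed=0):
    """Plain fuzzy C-means on a 1-D intensity vector; returns cluster centers and memberships."""
    rng = np.random.default_rng(seed)
    u = rng.random((c, pixels.size))
    u /= u.sum(axis=0)
    for _ in range(iters):
        um = u ** m
        centers = (um @ pixels) / um.sum(axis=1)
        dist = np.abs(pixels[None, :] - centers[:, None]) + 1e-9
        u = 1.0 / (dist ** (2.0 / (m - 1)))
        u /= u.sum(axis=0)
    return centers, u

# Toy "image": bright road pixels on a dark background.
image = np.zeros((64, 64))
image[30:34, :] = 1.0 + 0.05 * np.random.default_rng(1).random((4, 64))
centers, u = fuzzy_cmeans(image.ravel())
road = (u[np.argmax(centers)] > 0.5).reshape(image.shape)   # pixels in the bright cluster
road = remove_small_objects(road, min_size=20)              # crude morphological clean-up
centerline = skeletonize(road)                              # thinning yields the centerline
print(centerline.sum(), "centerline pixels")
```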
NASA Astrophysics Data System (ADS)
Zhu, Yunqiang; Zhu, Huazhong; Lu, Heli; Ni, Jianguang; Zhu, Shaoxia
2005-10-01
Remote sensing dynamic monitoring of land use can detect land use change information and update the current land use map, which is important for the rational utilization and scientific management of land resources. This paper discusses the technological procedure of remote sensing dynamic monitoring of land use, including the processing of remote sensing images, the extraction of annual land use change information, field survey, indoor post-processing and accuracy assessment. In particular, we emphasize comparative research on the choice of remote sensing rectification models, image fusion algorithms and accuracy assessment methods. Taking Anning district in Lanzhou as an example, we extract the land use change information of the district during 2002-2003, assess the monitoring accuracy and analyze the reasons for land use change.
Data Processing and Text Mining Technologies on Electronic Medical Records: A Review
Sun, Wencheng; Li, Yangyang; Liu, Fang; Fang, Shengqun; Wang, Guoyan
2018-01-01
Currently, medical institutes generally use EMRs to record patients' conditions, including diagnostic information, procedures performed, and treatment results. The EMR has been recognized as a valuable resource for large-scale analysis. However, EMR data are characterized by diversity, incompleteness, redundancy, and privacy concerns, which make it difficult to carry out data mining and analysis directly. Therefore, it is necessary to preprocess the source data in order to improve data quality and, in turn, the data mining results. Different types of data require different processing technologies. Most structured data need classic preprocessing technologies, including data cleansing, data integration, data transformation, and data reduction. Semistructured or unstructured data, such as medical text, contain richer health information and require more complex and challenging processing methods. The task of information extraction for medical texts mainly includes NER (named-entity recognition) and RE (relation extraction). This paper focuses on the EMR processing pipeline and analyzes the key techniques in detail. In addition, we make an in-depth study of applications developed on the basis of text mining, together with the open challenges and research issues for future work. PMID:29849998
Federal Register 2010, 2011, 2012, 2013, 2014
2013-07-12
... beginning of the scoping process to solicit public comments regarding issues and resource information for... Lander Field Office intends to prepare an EIS to inform decision-making regarding the proposed Lower Gas... of the proposed Project is to explore for and identify mining reserves and extract and process...
ERIC Educational Resources Information Center
Onwuegbuzie, Anthony J.; Weinbaum, Rebecca K.
2016-01-01
Recently, several authors have attempted to make the literature review process more transparent by providing a step-by-step guide to conducting literature reviews. However, although these works are very informative, none of them delineate how to display information extracted from literature reviews in a reader-friendly and visually appealing…
Extracting and standardizing medication information in clinical text - the MedEx-UIMA system.
Jiang, Min; Wu, Yonghui; Shah, Anushi; Priyanka, Priyanka; Denny, Joshua C; Xu, Hua
2014-01-01
Extraction of medication information embedded in clinical text is important for research using electronic health records (EHRs). However, most current medication information extraction systems identify drug and signature entities without mapping them to a standard representation. In this study, we introduced the open source Java implementation of MedEx, an existing high-performance medication information extraction system, based on the Unstructured Information Management Architecture (UIMA) framework. In addition, we developed new encoding modules in the MedEx-UIMA system, which mapped an extracted drug name/dose/form to both generalized and specific RxNorm concepts and translated drug frequency information to the ISO standard. We processed 826 documents with both systems and verified that MedEx-UIMA and MedEx (the Python version) performed similarly by comparing both results. Using two manually annotated test sets that contained 300 drug entries from medication lists and 300 drug entries from narrative reports, the MedEx-UIMA system achieved F-measures of 98.5% and 97.5% respectively for encoding drug names to corresponding RxNorm generic drug ingredients, and F-measures of 85.4% and 88.1% respectively for mapping drug names/dose/form to the most specific RxNorm concepts. It also achieved an F-measure of 90.4% for normalizing frequency information to the ISO standard. The open source MedEx-UIMA system is freely available online at http://code.google.com/p/medex-uima/.
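A toy version of the signature-parsing step, pulling a drug name, strength, route, and frequency out of a free-text medication line, might look like the regular-expression sketch below. The patterns and the tiny frequency-to-ISO mapping are illustrative assumptions, not MedEx-UIMA's actual grammar or its RxNorm encoding modules.

```python
import re

# Hypothetical, highly simplified medication-line parser.
SIG_PATTERN = re.compile(
    r"(?P<name>[A-Za-z]+)\s+"
    r"(?P<strength>\d+(\.\d+)?\s?(mg|mcg|g|ml))\s*"
    r"(?P<route>po|iv|im|sc)?\s*"
    r"(?P<freq>qd|bid|tid|qid|q\d+h)?",
    re.IGNORECASE,
)

# Illustrative mapping of a few frequency abbreviations to ISO 8601-style periods.
FREQ_TO_ISO = {"qd": "R/P1D", "bid": "R/PT12H", "tid": "R/PT8H", "qid": "R/PT6H"}

def parse_medication(line):
    m = SIG_PATTERN.search(line)
    if not m:
        return None
    rec = {k: (v.lower() if v else None) for k, v in m.groupdict().items()}
    rec["freq_iso"] = FREQ_TO_ISO.get(rec["freq"]) if rec["freq"] else None
    return rec

print(parse_medication("Metformin 500 mg po bid"))
# {'name': 'metformin', 'strength': '500 mg', 'route': 'po', 'freq': 'bid', 'freq_iso': 'R/PT12H'}
```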
Feature extraction for document text using Latent Dirichlet Allocation
NASA Astrophysics Data System (ADS)
Prihatini, P. M.; Suryawan, I. K.; Mandia, IN
2018-01-01
Feature extraction is one of the stages in an information retrieval system that is used to extract the unique feature values of a text document. Feature extraction can be done by several methods, one of which is Latent Dirichlet Allocation. However, studies of text feature extraction using the Latent Dirichlet Allocation method are rarely found for Indonesian text. Therefore, this research implements text feature extraction for Indonesian text. The research method consists of data acquisition, text pre-processing, initialization, topic sampling and evaluation. The evaluation is done by comparing the Precision, Recall and F-Measure values of Latent Dirichlet Allocation and of Term Frequency Inverse Document Frequency KMeans, which is commonly used for feature extraction. The evaluation results show that the Precision, Recall and F-Measure values of the Latent Dirichlet Allocation method are higher than those of the Term Frequency Inverse Document Frequency KMeans method. This shows that the Latent Dirichlet Allocation method is able to extract features from and cluster Indonesian text better than the Term Frequency Inverse Document Frequency KMeans method.
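For readers unfamiliar with the two methods being compared, the sketch below contrasts LDA topic features with TF-IDF features clustered by k-means on a toy English corpus using scikit-learn. It is a generic illustration under assumed parameters, not the preprocessing or sampling setup used in the paper.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.cluster import KMeans

docs = [
    "road extraction from satellite images",
    "satellite images and road networks",
    "medication information in clinical notes",
    "clinical notes describe medication and dose",
]

# LDA features: each document becomes a distribution over latent topics.
counts = CountVectorizer().fit_transform(docs)
lda_features = LatentDirichletAllocation(n_components=2, random_state=0).fit_transform(counts)

# Baseline: TF-IDF vectors clustered with k-means.
tfidf = TfidfVectorizer().fit_transform(docs)
kmeans_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(tfidf)

print(lda_features.round(2))   # topic mixture per document
print(kmeans_labels)           # cluster id per document
```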
Zhou, Li; Friedman, Carol; Parsons, Simon; Hripcsak, George
2005-01-01
Exploring temporal information in narrative Electronic Medical Records (EMRs) is essential and challenging. We propose an architecture for an integrated approach to process temporal information in clinical narrative reports. The goal is to initiate and build a foundation that supports applications which assist healthcare practice and research by including the ability to determine the time of clinical events (e.g., past vs. present). Key components include: (1) a temporal constraint structure for temporal expressions and the development of an associated tagger; (2) a Natural Language Processing (NLP) system for encoding and extracting medical events and associating them with formalized temporal data; (3) a post-processor, with a knowledge-based subsystem to help discover implicit information, that resolves temporal expressions and deals with issues such as granularity and vagueness; and (4) a reasoning mechanism which models clinical reports as Simple Temporal Problems (STPs). PMID:16779164
Audio feature extraction using probability distribution function
NASA Astrophysics Data System (ADS)
Suhaib, A.; Wan, Khairunizam; Aziz, Azri A.; Hazry, D.; Razlan, Zuradzman M.; Shahriman A., B.
2015-05-01
Voice recognition has been one of the popular applications in the robotics field. It has also recently been used in biometric and multimedia information retrieval systems. This technology builds on successive research into audio feature extraction and analysis. The probability distribution function (PDF) is a statistical method that is usually used as one of the steps within complex feature extraction methods such as GMM and PCA. In this paper, a new method for audio feature extraction is proposed that uses the PDF alone as the feature extraction method for speech analysis. Certain pre-processing techniques are performed prior to the proposed feature extraction method. Subsequently, the PDF values for each frame of the sampled voice signals obtained from a number of individuals are plotted. The experimental results show visually that each individual's voice has comparable PDF values and shapes.
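One minimal interpretation of "PDF as the feature itself" is to histogram the amplitude values of each analysis frame, as sketched below; the frame length and bin count are arbitrary assumptions made for illustration, not the authors' settings.

```python
import numpy as np

def frame_pdf_features(signal, frame_len=400, bins=32, amp_range=(-1.0, 1.0)):
    """Per-frame amplitude histograms, normalized so each frame's bins sum to 1."""
    n_frames = len(signal) // frame_len
    feats = np.empty((n_frames, bins))
    for i in range(n_frames):
        frame = signal[i * frame_len:(i + 1) * frame_len]
        hist, _ = np.histogram(frame, bins=bins, range=amp_range)
        feats[i] = hist / hist.sum()
    return feats

rng = np.random.default_rng(0)
voice_like = 0.5 * np.sin(np.linspace(0, 200 * np.pi, 8000)) + 0.05 * rng.normal(size=8000)
features = frame_pdf_features(voice_like)
print(features.shape)   # (20, 32): one 32-bin PDF per 400-sample frame
```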
On-line object feature extraction for multispectral scene representation
NASA Technical Reports Server (NTRS)
Ghassemian, Hassan; Landgrebe, David
1988-01-01
A new on-line unsupervised object-feature extraction method is presented that reduces the complexity and costs associated with the analysis of multispectral image data and with data transmission, storage, archival and distribution. The ambiguity in the object detection process can be reduced if the spatial dependencies that exist among adjacent pixels are intelligently incorporated into the decision-making process. A unity relation that must exist among the pixels of an object was defined. The Automatic Multispectral Image Compaction Algorithm (AMICA) uses the within-object pixel-feature gradient vector as valuable contextual information to construct object features that preserve the class separability information within the data. For on-line object extraction, the path hypothesis and the basic mathematical tools for its realization are introduced in terms of a specific similarity measure and adjacency relation. AMICA is applied to several sets of real image data, and the performance and reliability of the features are evaluated.
Kinetics from Replica Exchange Molecular Dynamics Simulations.
Stelzl, Lukas S; Hummer, Gerhard
2017-08-08
Transitions between metastable states govern many fundamental processes in physics, chemistry and biology, from nucleation events in phase transitions to the folding of proteins. The free energy surfaces underlying these processes can be obtained from simulations using enhanced sampling methods. However, their altered dynamics makes kinetic and mechanistic information difficult or impossible to extract. Here, we show that, with replica exchange molecular dynamics (REMD), one can not only sample equilibrium properties but also extract kinetic information. For systems that strictly obey first-order kinetics, the procedure to extract rates is rigorous. For actual molecular systems whose long-time dynamics are captured by kinetic rate models, accurate rate coefficients can be determined from the statistics of the transitions between the metastable states at each replica temperature. We demonstrate the practical applicability of the procedure by constructing master equation (Markov state) models of peptide and RNA folding from REMD simulations.
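The rate-extraction idea, counting transitions between metastable states and turning them into rate coefficients under a first-order-kinetics assumption, can be illustrated with the toy two-state sketch below. This is a generic estimator, not the authors' full master-equation (Markov state model) construction over replica temperatures.

```python
import numpy as np

def two_state_rates(traj, dt):
    """Estimate k_AB and k_BA from a discrete trajectory of 0/1 state labels.

    Assumes first-order kinetics: rate = (number of exits) / (time spent in the state).
    """
    traj = np.asarray(traj)
    exits_01 = np.sum((traj[:-1] == 0) & (traj[1:] == 1))
    exits_10 = np.sum((traj[:-1] == 1) & (traj[1:] == 0))
    time_in_0 = np.sum(traj[:-1] == 0) * dt
    time_in_1 = np.sum(traj[:-1] == 1) * dt
    return exits_01 / time_in_0, exits_10 / time_in_1

# Synthetic trajectory with known rates, to check the estimator.
rng = np.random.default_rng(0)
dt, k_ab, k_ba = 0.001, 5.0, 2.0           # time step and true rate coefficients
state, traj = 0, []
for _ in range(200000):
    traj.append(state)
    p_switch = (k_ab if state == 0 else k_ba) * dt
    if rng.random() < p_switch:
        state = 1 - state
print(two_state_rates(traj, dt))           # should be close to (5.0, 2.0)
```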
Fine-grained information extraction from German transthoracic echocardiography reports.
Toepfer, Martin; Corovic, Hamo; Fette, Georg; Klügl, Peter; Störk, Stefan; Puppe, Frank
2015-11-12
Information extraction techniques that get structured representations out of unstructured data make a large amount of clinically relevant information about patients accessible for semantic applications. These methods typically rely on standardized terminologies that guide this process. Many languages and clinical domains, however, lack appropriate resources and tools, as well as evaluations of their applications, especially if detailed conceptualizations of the domain are required. For instance, German transthoracic echocardiography reports have not been targeted sufficiently before, despite their importance for clinical trials. This work therefore aimed at the development and evaluation of an information extraction component with a fine-grained terminology that enables the recognition of almost all relevant information stated in German transthoracic echocardiography reports at the University Hospital of Würzburg. A domain expert validated and iteratively refined an automatically inferred base terminology. The terminology was used by an ontology-driven information extraction system that outputs attribute-value pairs. The final component has been mapped to the central elements of a standardized terminology, and it has been evaluated on documents with different layouts. The final system achieved state-of-the-art precision (micro average .996) and recall (micro average .961) on 100 test documents that represent more than 90 % of all reports. In particular, principal aspects as defined in a standardized external terminology were recognized with F1 = .989 (micro average) and F1 = .963 (macro average). As a result of keyword matching and restrained concept extraction, the system obtained high precision also on unstructured or exceptionally short documents, and on documents with uncommon layout. The developed terminology and the proposed information extraction system allow fine-grained information to be extracted from German semi-structured transthoracic echocardiography reports with very high precision and high recall on the majority of documents at the University Hospital of Würzburg. Extracted results populate a clinical data warehouse which supports clinical research.
Highway extraction from high resolution aerial photography using a geometric active contour model
NASA Astrophysics Data System (ADS)
Niu, Xutong
Highway extraction and vehicle detection are two of the most important steps in traffic-flow analysis from multi-frame aerial photographs. The traditional method of deriving traffic flow trajectories relies on manual vehicle counting from a sequence of aerial photographs, which is tedious and time-consuming. This research presents a new framework for semi-automatic highway extraction. The basis of the new framework is an improved geometric active contour (GAC) model. This novel model seeks to minimize an objective function that transforms a problem of propagation of regular curves into an optimization problem. The implementation of curve propagation is based on level set theory. By using an implicit representation of a two-dimensional curve, a level set approach can be used to deal with topological changes naturally, and the output is unaffected by different initial positions of the curve. However, the original GAC model, on which the new model is based, only incorporates boundary information into the curve propagation process. An error-producing phenomenon called leakage is inevitable wherever there is an uncertain weak edge. In this research, region-based information is added as a constraint into the original GAC model, thereby giving the proposed method the ability to integrate both boundary and region-based information during the curve propagation. Adding the region-based constraint eliminates the leakage problem. This dissertation applies the proposed augmented GAC model to the problem of highway extraction from high-resolution aerial photography. First, an optimized stopping criterion is designed and used in the implementation of the GAC model. It effectively saves processing time and computations. Second, a seed point propagation framework is designed and implemented. This framework incorporates highway extraction, tracking, and linking into one procedure. A seed point is usually placed at an end node of a highway segment close to the boundary of the image or at a position where possible blocking may occur, such as at an overpass bridge or near vehicle crowds. These seed points can be automatically propagated throughout the entire highway network. During the process, road center points are also extracted, which introduces a search direction for solving possible blocking problems. This new framework has been successfully applied to highway network extraction from a large orthophoto mosaic. In the process, vehicles on the highway extracted from the mosaic were detected with an 83% success rate.
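As a rough, runnable stand-in for the level-set machinery described above (not the dissertation's region-augmented model), scikit-image's morphological geodesic active contour can propagate a curve from a seed region toward the edges of a road-like structure:

```python
import numpy as np
from skimage.segmentation import (inverse_gaussian_gradient,
                                  morphological_geodesic_active_contour)

# Toy image: a bright horizontal "road" band on a dark background.
image = np.zeros((80, 80))
image[30:50, :] = 1.0

# Edge-stopping function: small near strong gradients, close to 1 in flat regions.
gimage = inverse_gaussian_gradient(image)

# Initial level set: a small disk seed placed on the road.
yy, xx = np.mgrid[:80, :80]
init_ls = ((yy - 40) ** 2 + (xx - 10) ** 2 < 5 ** 2).astype(np.int8)

# Propagate the curve; a positive balloon force pushes it outward along the road.
segmentation = morphological_geodesic_active_contour(
    gimage, 230, init_level_set=init_ls, smoothing=1, balloon=1)
print(segmentation.sum(), "pixels inside the final contour")
```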
Natural Language Processing in Radiology: A Systematic Review.
Pons, Ewoud; Braun, Loes M M; Hunink, M G Myriam; Kors, Jan A
2016-05-01
Radiological reporting has generated large quantities of digital content within the electronic health record, which is potentially a valuable source of information for improving clinical care and supporting research. Although radiology reports are stored for communication and documentation of diagnostic imaging, harnessing their potential requires efficient and automated information extraction: they exist mainly as free-text clinical narrative, from which it is a major challenge to obtain structured data. Natural language processing (NLP) provides techniques that aid the conversion of text into a structured representation, and thus enables computers to derive meaning from human (ie, natural language) input. Used on radiology reports, NLP techniques enable automatic identification and extraction of information. By exploring the various purposes for their use, this review examines how radiology benefits from NLP. A systematic literature search identified 67 relevant publications describing NLP methods that support practical applications in radiology. This review takes a close look at the individual studies in terms of tasks (ie, the extracted information), the NLP methodology and tools used, and their application purpose and performance results. Additionally, limitations, future challenges, and requirements for advancing NLP in radiology will be discussed. (©) RSNA, 2016 Online supplemental material is available for this article.
Digital image processing for photo-reconnaissance applications
NASA Technical Reports Server (NTRS)
Billingsley, F. C.
1972-01-01
Digital image-processing techniques developed for processing pictures from NASA space vehicles are analyzed in terms of enhancement, quantitative restoration, and information extraction. Digital filtering and the action of a high-frequency filter in the real and Fourier domains are discussed, along with color and brightness.
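The action of a high-frequency (high-pass) filter in the Fourier domain mentioned above can be illustrated with a few lines of NumPy; the cutoff radius and test image are arbitrary assumptions.

```python
import numpy as np

def highpass_fourier(image, cutoff=10):
    """Zero out low spatial frequencies within `cutoff` pixels of the spectrum center."""
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    rows, cols = image.shape
    yy, xx = np.mgrid[:rows, :cols]
    mask = (yy - rows // 2) ** 2 + (xx - cols // 2) ** 2 > cutoff ** 2
    return np.real(np.fft.ifft2(np.fft.ifftshift(spectrum * mask)))

# Smooth gradient plus a sharp edge: the filter suppresses the gradient and keeps the edge.
img = np.tile(np.linspace(0, 1, 128), (128, 1))
img[:, 64:] += 1.0
enhanced = highpass_fourier(img)
print(enhanced.std(), img.std())
```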
Knowledge representation and management: transforming textual information into useful knowledge.
Rassinoux, A-M
2010-01-01
To summarize current outstanding research in the field of knowledge representation and management. Synopsis of the articles selected for the IMIA Yearbook 2010. Four interesting papers, dealing with structured knowledge, have been selected for the section knowledge representation and management. Combining the newest techniques in computational linguistics and natural language processing with the latest methods in statistical data analysis, machine learning and text mining has proved to be efficient for turning unstructured textual information into meaningful knowledge. Three of the four selected papers for the section knowledge representation and management corroborate this approach and depict various experiments conducted to extract meaningful knowledge from unstructured free texts such as extracting cancer disease characteristics from pathology reports, or extracting protein-protein interactions from biomedical papers, as well as extracting knowledge for the support of hypothesis generation in molecular biology from the Medline literature. Finally, the last paper addresses the level of formally representing and structuring information within clinical terminologies in order to render such information easily available and shareable among the health informatics community. Delivering common powerful tools able to automatically extract meaningful information from the huge amount of electronically unstructured free texts is an essential step towards promoting sharing and reusability across applications, domains, and institutions, thus contributing to building capacities worldwide.
Automated extraction of radiation dose information for CT examinations.
Cook, Tessa S; Zimmerman, Stefan; Maidment, Andrew D A; Kim, Woojin; Boonn, William W
2010-11-01
Exposure to radiation as a result of medical imaging is currently in the spotlight, receiving attention from Congress as well as the lay press. Although scanner manufacturers are moving toward including effective dose information in the Digital Imaging and Communications in Medicine headers of imaging studies, there is a vast repository of retrospective CT data at every imaging center that stores dose information in an image-based dose sheet. As such, it is difficult for imaging centers to participate in the ACR's Dose Index Registry. The authors have designed an automated extraction system to query their PACS archive and parse CT examinations to extract the dose information stored in each dose sheet. First, an open-source optical character recognition program processes each dose sheet and converts the information to American Standard Code for Information Interchange (ASCII) text. Each text file is parsed, and radiation dose information is extracted and stored in a database which can be queried using an existing pathology and radiology enterprise search tool. Using this automated extraction pipeline, it is possible to perform dose analysis on the >800,000 CT examinations in the PACS archive and generate dose reports for all of these patients. It is also possible to more effectively educate technologists, radiologists, and referring physicians about exposure to radiation from CT by generating report cards for interpreted and performed studies. The automated extraction pipeline enables compliance with the ACR's reporting guidelines and greater awareness of radiation dose to patients, thus resulting in improved patient care and management. Copyright © 2010 American College of Radiology. Published by Elsevier Inc. All rights reserved.
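A compressed sketch of that pipeline, OCR the dose-sheet image and then parse dose quantities out of the recognized text, might look like the following. The regular expression, the field names (CTDIvol, DLP), and the file name are illustrative assumptions, and pytesseract simply stands in for whichever open-source OCR engine was actually used.

```python
import re
from PIL import Image
import pytesseract

DOSE_LINE = re.compile(
    r"(?P<series>.+?)\s+CTDIvol\s*[:=]?\s*(?P<ctdi>\d+(\.\d+)?)\s*mGy"
    r".*?DLP\s*[:=]?\s*(?P<dlp>\d+(\.\d+)?)",
    re.IGNORECASE,
)

def extract_dose_records(dose_sheet_path):
    """OCR a CT dose-sheet image and return per-series dose values."""
    text = pytesseract.image_to_string(Image.open(dose_sheet_path))
    records = []
    for line in text.splitlines():
        m = DOSE_LINE.search(line)
        if m:
            records.append({"series": m.group("series").strip(),
                            "ctdi_vol_mgy": float(m.group("ctdi")),
                            "dlp_mgy_cm": float(m.group("dlp"))})
    return records

# Example usage with a hypothetical file; each record would then be stored in a database.
# print(extract_dose_records("ct_dose_sheet.png"))
```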
Concept recognition for extracting protein interaction relations from biomedical text
Baumgartner, William A; Lu, Zhiyong; Johnson, Helen L; Caporaso, J Gregory; Paquette, Jesse; Lindemann, Anna; White, Elizabeth K; Medvedeva, Olga; Cohen, K Bretonnel; Hunter, Lawrence
2008-01-01
Background: Reliable information extraction applications have been a long-sought goal of the biomedical text mining community, a goal that, if reached, would provide valuable tools to benchside biologists in their increasingly difficult task of assimilating the knowledge contained in the biomedical literature. We present an integrated approach to concept recognition in biomedical text. Concept recognition provides key information that has been largely missing from previous biomedical information extraction efforts, namely direct links to well-defined knowledge resources that explicitly cement the concept's semantics. The BioCreative II tasks discussed in this special issue have provided a unique opportunity to demonstrate the effectiveness of concept recognition in the field of biomedical language processing. Results: Through the modular construction of a protein interaction relation extraction system, we present several use cases of concept recognition in biomedical text, and relate these use cases to potential uses by the benchside biologist. Conclusion: Current information extraction technologies are approaching performance standards at which concept recognition can begin to deliver high quality data to the benchside biologist. Our system is available as part of the BioCreative Meta-Server project and on the internet. PMID:18834500
Activities of the Remote Sensing Information Sciences Research Group
NASA Technical Reports Server (NTRS)
Estes, J. E.; Botkin, D.; Peuquet, D.; Smith, T.; Star, J. L. (Principal Investigator)
1984-01-01
Topics on the analysis and processing of remotely sensed data in the areas of vegetation analysis and modelling, georeferenced information systems, machine assisted information extraction from image data, and artificial intelligence are investigated. Discussions on support field data and specific applications of the proposed technologies are also included.
[Application of hyper-spectral remote sensing technology in environmental protection].
Zhao, Shao-Hua; Zhang, Feng; Wang, Qiao; Yao, Yun-Jun; Wang, Zhong-Ting; You, Dai-An
2013-12-01
Hyper-spectral remote sensing (RS) technology has been widely used in environmental protection. The present work introduces its recent applications in the RS monitoring of polluting gases, greenhouse gases, algal blooms, the water quality of catchment water environments, the safety of drinking water sources, biodiversity, vegetation classification, soil pollution, and so on. Finally, issues such as the scarcity of hyper-spectral satellites and the limits of data processing and information extraction are discussed. Some proposals are also presented, including developing follow-on satellites to the HJ-1 satellite equipped with differential optical absorption spectroscopy, greenhouse gas spectroscopy and a hyper-spectral imager; strengthening the study of hyper-spectral data processing and information extraction; and promoting the construction of an environmental application system.
Method for extracting long-equivalent wavelength interferometric information
NASA Technical Reports Server (NTRS)
Hochberg, Eric B. (Inventor)
1991-01-01
A process for extracting long-equivalent wavelength interferometric information from a two-wavelength polychromatic or achromatic interferometer. The process comprises the steps of simultaneously recording a non-linear sum of two different frequency visible light interferograms on a high resolution film and then placing the developed film in an optical train for Fourier transformation, low pass spatial filtering and inverse transformation of the film image to produce low spatial frequency fringes corresponding to a long-equivalent wavelength interferogram. The recorded non-linear sum irradiance derived from the two-wavelength interferometer is obtained by controlling the exposure so that the average interferogram irradiance is set at either the noise level threshold or the saturation level threshold of the film.
NASA Astrophysics Data System (ADS)
Abdelzaher, Tarek; Roy, Heather; Wang, Shiguang; Giridhar, Prasanna; Al Amin, Md. Tanvir; Bowman, Elizabeth K.; Kolodny, Michael A.
2016-05-01
Signal processing techniques such as filtering, detection, estimation and frequency domain analysis have long been applied to extract information from noisy sensor data. This paper describes the exploitation of these signal processing techniques to extract information from social networks, such as Twitter and Instagram. Specifically, we view social networks as noisy sensors that report events in the physical world. We then present a data processing stack for detection, localization, tracking, and veracity analysis of reported events using social network data. We show using a controlled experiment that the behavior of social sources as information relays varies dramatically depending on context. In benign contexts, there is general agreement on events, whereas in conflict scenarios, a significant amount of collective filtering is introduced by conflicted groups, creating a large data distortion. We describe signal processing techniques that mitigate such distortion, resulting in meaningful approximations of actual ground truth, given noisy reported observations. Finally, we briefly present an implementation of the aforementioned social network data processing stack in a sensor network analysis toolkit, called Apollo. Experiences with Apollo show that our techniques are successful at identifying and tracking credible events in the physical world.
Modelling and representation issues in automated feature extraction from aerial and satellite images
NASA Astrophysics Data System (ADS)
Sowmya, Arcot; Trinder, John
New digital systems for the processing of photogrammetric and remote sensing images have led to new approaches to information extraction for mapping and Geographic Information System (GIS) applications, with the expectation that data can become more readily available at a lower cost and with greater currency. Demands for mapping and GIS data are increasing as well for environmental assessment and monitoring. Hence, researchers from the fields of photogrammetry and remote sensing, as well as computer vision and artificial intelligence, are bringing together their particular skills for automating these tasks of information extraction. The paper will review some of the approaches used in knowledge representation and modelling for machine vision, and give examples of their applications in research for image understanding of aerial and satellite imagery.
Applying high resolution remote sensing image and DEM to falling boulder hazard assessment
NASA Astrophysics Data System (ADS)
Huang, Changqing; Shi, Wenzhong; Ng, K. C.
2005-10-01
Assessing boulder fall hazards generally requires obtaining boulder information. The conventional approach of extensive mapping and surveying fieldwork is time-consuming, laborious and dangerous. This paper therefore proposes applying image processing technology to extract boulders and assess boulder fall hazards from high resolution remote sensing images. The method can replace the conventional approach and extract boulder information with high accuracy, including boulder size, shape and height and the slope and aspect of its position. With this boulder information, boulder fall hazards can be assessed, prevented and mitigated.
Kreuzmair, Christina; Siegrist, Michael; Keller, Carmen
2017-03-01
Researchers recommend the use of pictographs in medical risk communication to improve people's risk comprehension and decision making. However, it is not yet clear whether the iconicity used in pictographs to convey risk information influences individuals' information processing and comprehension. In an eye-tracking experiment with participants from the general population (N = 188), we examined whether specific types of pictograph icons influence the processing strategy viewers use to extract numerical information. In addition, we examined the effect of iconicity and numeracy on probability estimation, recall, and icon liking. This experiment used a 2 (iconicity: blocks vs. restroom icons) × 2 (scenario: medical vs. nonmedical) between-subject design. Numeracy had a significant effect on information processing strategy, but we found no effect of iconicity or scenario. Results indicated that both icon types enabled high and low numerates to use their default way of processing and extracting the gist of the message from the pictorial risk communication format: high numerates counted icons, whereas low numerates used large-area processing. There was no effect of iconicity in the probability estimation. However, people who saw restroom icons had a higher probability of correctly recalling the exact risk level. Iconicity had no effect on icon liking. Although the effects are small, our findings suggest that person-like restroom icons in pictographs seem to have some advantages for risk communication. Specifically, in nonpersonalized prevention brochures, person-like restroom icons may maintain reader motivation for processing the risk information. © 2016 Society for Risk Analysis.
Modeling ECM fiber formation: structure information extracted by analysis of 2D and 3D image sets
NASA Astrophysics Data System (ADS)
Wu, Jun; Voytik-Harbin, Sherry L.; Filmer, David L.; Hoffman, Christoph M.; Yuan, Bo; Chiang, Ching-Shoei; Sturgis, Jennis; Robinson, Joseph P.
2002-05-01
Recent evidence supports the notion that the biological functions of the extracellular matrix (ECM) are highly correlated with its structure. Understanding this fibrous structure is crucial in tissue engineering for developing the next generation of biomaterials for the restoration of tissues and organs. In this paper, we integrate confocal microscopy imaging and image-processing techniques to analyze the structural properties of the ECM. We describe a 2D fiber middle-line tracing algorithm and apply it via Euclidean distance maps (EDM) to extract accurate fibrous structure information, such as fiber diameter, length, orientation, and density, from single slices. Building on the 2D tracing algorithm, we extend our analysis to 3D tracing via Euclidean distance maps to extract 3D fibrous structure information. We use computer simulation to construct 3D fibrous structures, which are subsequently used to test our tracing algorithms. After further image processing, these models are then applied to a variety of ECM constructs, from which the results of 2D and 3D traces are statistically analyzed.
The Remote Maxwell Demon as Energy Down-Converter
NASA Astrophysics Data System (ADS)
Hossenfelder, S.
2016-04-01
It is demonstrated that Maxwell's demon can be used to allow a machine to extract energy from a heat bath by use of information that is processed by the demon at a remote location. The model proposed here effectively replaces transmission of energy by transmission of information. For that we use a feedback protocol that enables a net gain by stimulating emission in selected fluctuations around thermal equilibrium. We estimate the down conversion rate and the efficiency of energy extraction from the heat bath.
Jagannathan, V; Mullett, Charles J; Arbogast, James G; Halbritter, Kevin A; Yellapragada, Deepthi; Regulapati, Sushmitha; Bandaru, Pavani
2009-04-01
We assessed the current state of commercial natural language processing (NLP) engines for their ability to extract medication information from textual clinical documents. Two thousand de-identified discharge summaries and family practice notes were submitted to four commercial NLP engines with the request to extract all medication information. The four sets of returned results were combined to create a comparison standard which was validated against a manual, physician-derived gold standard created from a subset of 100 reports. Once validated, the individual vendor results for medication names, strengths, route, and frequency were compared against this automated standard with precision, recall, and F measures calculated. Compared with the manual, physician-derived gold standard, the automated standard was successful at accurately capturing medication names (F measure=93.2%), but performed less well with strength (85.3%) and route (80.3%), and relatively poorly with dosing frequency (48.3%). Moderate variability was seen in the strengths of the four vendors. The vendors performed better with the structured discharge summaries than with the clinic notes in an analysis comparing the two document types. Although automated extraction may serve as the foundation for a manual review process, it is not ready to automate medication lists without human intervention.
ID card number detection algorithm based on convolutional neural network
NASA Astrophysics Data System (ADS)
Zhu, Jian; Ma, Hanjie; Feng, Jie; Dai, Leiyan
2018-04-01
In this paper, a new detection algorithm based on a Convolutional Neural Network is presented in order to realize fast and convenient ID information extraction in multiple scenarios. The algorithm uses a mobile device equipped with the Android operating system to locate and extract the ID number. It uses the distinctive color distribution of the ID card to select the appropriate channel component; applies image threshold segmentation, noise processing and morphological processing to binarize the image; and uses image rotation and projection for horizontal correction when the image is tilted. Finally, single characters are extracted by the projection method and recognized using a Convolutional Neural Network. Tests show that a single ID number image takes about 80 ms from extraction to identification, with an accuracy of about 99%, so the method can be applied in real production and living environments.
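The final recognition stage, classifying each segmented character crop with a small CNN, could be structured roughly as below. This is a generic PyTorch sketch with assumed input size and layer widths, not the authors' network.

```python
import torch
import torch.nn as nn

class DigitCNN(nn.Module):
    """Tiny CNN that classifies a 28x28 grayscale character crop into the digits 0-9."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 28 -> 14
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 14 -> 7
        )
        self.classifier = nn.Linear(32 * 7 * 7, 10)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(start_dim=1))

# One forward pass on a dummy batch of segmented character images.
model = DigitCNN()
dummy = torch.rand(8, 1, 28, 28)
logits = model(dummy)
print(logits.shape)            # torch.Size([8, 10])
print(logits.argmax(dim=1))    # predicted digit per crop (untrained, so arbitrary)
```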
User's guide to the SEPHIS computer code for calculating the Thorex solvent extraction system
DOE Office of Scientific and Technical Information (OSTI.GOV)
Watson, S.B.; Rainey, R.H.
1979-05-01
The SEPHIS computer program was developed to simulate the countercurrent solvent extraction process. The code has now been adapted to model the Acid Thorex flow sheet. This report represents a practical user's guide to SEPHIS - Thorex containing a program description, user information, program listing, and sample input and output.
DBpedia and the Live Extraction of Structured Data from Wikipedia
ERIC Educational Resources Information Center
Morsey, Mohamed; Lehmann, Jens; Auer, Soren; Stadler, Claus; Hellmann, Sebastian
2012-01-01
Purpose: DBpedia extracts structured information from Wikipedia, interlinks it with other knowledge bases and freely publishes the results on the web using Linked Data and SPARQL. However, the DBpedia release process is heavyweight and releases are sometimes based on several months old data. DBpedia-Live solves this problem by providing a live…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Trease, Lynn L.; Trease, Harold E.; Fowler, John
2007-03-15
One of the critical steps toward performing computational biology simulations, using mesh based integration methods, is in using topologically faithful geometry derived from experimental digital image data as the basis for generating the computational meshes. Digital image data representations contain both the topology of the geometric features and experimental field data distributions. The geometric features that need to be captured from the digital image data are three-dimensional; therefore, the process and tools we have developed work with volumetric image data represented as data-cubes. This allows us to take advantage of 2D curvature information during the segmentation and feature extraction process. The process is basically: 1) segmenting to isolate and enhance the contrast of the features that we wish to extract and reconstruct, 2) extracting the geometry of the features in an isosurfacing technique, and 3) building the computational mesh using the extracted feature geometry. "Quantitative" image reconstruction and feature extraction is done for the purpose of generating computational meshes, not just for producing graphics "screen" quality images. For example, the surface geometry that we extract must represent a closed water-tight surface.
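Step 2 of that pipeline, extracting feature geometry from a segmented data-cube with an isosurfacing technique, can be sketched with scikit-image's marching cubes. The synthetic volume and iso-level below are assumptions for illustration, and a real workflow would go on to build a computational mesh from the extracted surface.

```python
import numpy as np
from skimage.measure import marching_cubes

# Synthetic segmented data-cube: a solid sphere embedded in a 64^3 volume.
zz, yy, xx = np.mgrid[:64, :64, :64]
volume = ((zz - 32) ** 2 + (yy - 32) ** 2 + (xx - 32) ** 2 < 20 ** 2).astype(np.float32)

# Isosurface at level 0.5 gives a closed, watertight triangulated boundary of the feature.
verts, faces, normals, values = marching_cubes(volume, level=0.5)
print(verts.shape, faces.shape)   # vertex coordinates and triangle indices for meshing
```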
Staccini, Pascal; Joubert, Michel; Quaranta, Jean-François; Fieschi, Marius
2005-03-01
Today, the economic and regulatory environment, involving activity-based and prospective payment systems, healthcare quality and risk analysis, traceability of the acts performed and evaluation of care practices, accounts for the current interest in clinical and hospital information systems. The structured gathering of information relative to users' needs and system requirements is fundamental when installing such systems. This stage takes time and is generally misconstrued by caregivers and is of limited efficacy to analysts. We used a modelling technique designed for manufacturing processes (IDEF0/SADT). We enhanced the basic model of an activity with descriptors extracted from the Ishikawa cause-and-effect diagram (methods, men, materials, machines, and environment). We proposed an object data model of a process and its components, and programmed a web-based tool in an object-oriented environment. This tool makes it possible to extract the data dictionary of a given process from the description of its elements and to locate documents (procedures, recommendations, instructions) according to each activity or role. Aimed at structuring needs and storing information provided by directly involved teams regarding the workings of an institution (or at least part of it), the process-mapping approach has an important contribution to make in the analysis of clinical information systems.
Sugarcane Crop Extraction Using Object-Oriented Method from ZY-3 High Resolution Satellite Tlc Image
NASA Astrophysics Data System (ADS)
Luo, H.; Ling, Z. Y.; Shao, G. Z.; Huang, Y.; He, Y. Q.; Ning, W. Y.; Zhong, Z.
2018-04-01
Sugarcane is one of the most important crops in Guangxi, China. With the development of satellite remote sensing technology, more remotely sensed images can be used for monitoring the sugarcane crop. With its Three Line Camera (TLC) images, wide coverage and stereoscopic mapping ability, the Chinese ZY-3 high resolution stereoscopic mapping satellite is useful for attaining more information for sugarcane crop monitoring, such as the spectral, shape and texture differences between the forward, nadir and backward images. The digital surface model (DSM) derived from ZY-3 TLC images is also able to provide height information for the sugarcane crop. In this study, we attempt to extract the sugarcane crop from ZY-3 images acquired in the harvest period. Ortho-rectified TLC images, the fused image and the DSM are processed for the extraction. An object-oriented method is then used for image segmentation, example collection and feature extraction. The results of our study show that, with the help of ZY-3 TLC images, information on the sugarcane crop at harvest time can be automatically extracted, with an overall accuracy of about 85.3 %.
NASA Astrophysics Data System (ADS)
Cao, Qiong; Gu, Lingjia; Ren, Ruizhi; Wang, Lang
2016-09-01
Building extraction is currently important in the application of high-resolution remote sensing imagery. At present, quite a few algorithms are available for detecting building information; however, most of them still have obvious disadvantages, such as ignoring spectral information and the trade-off between extraction rate and extraction accuracy. The purpose of this research is to develop an effective method to detect building information in Chinese GF-1 data. Firstly, image preprocessing is used to normalize the image and image enhancement is used to highlight the useful information in the image. Secondly, multi-spectral information is analyzed. Subsequently, an improved morphological building index (IMBI) based on remote sensing imagery is proposed to obtain candidate building objects. Furthermore, in order to refine the building objects and further remove false objects, post-processing (e.g., shape features, the vegetation index and the water index) is employed. To validate the effectiveness of the proposed algorithm, the omission errors (OE), commission errors (CE), overall accuracy (OA) and Kappa are used. The proposed method can not only effectively use spectral information and other basic features, but also avoid extracting excessive interference details from high-resolution remote sensing images. Compared to the original MBI algorithm, the proposed method reduces the OE by 33.14%; at the same time, the Kappa increases by 16.09%. In experiments, IMBI achieved satisfactory results and outperformed other algorithms in terms of both accuracy and visual inspection.
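Morphological building indices are typically built from white top-hat responses at several structuring-element scales; the sketch below is a bare-bones version of that idea using scikit-image. It is my simplification for illustration, not the exact IMBI definition, and the scales and threshold are assumed values.

```python
import numpy as np
from skimage.morphology import white_tophat, disk

def simple_building_index(brightness, scales=(3, 6, 9, 12)):
    """Average white top-hat response across scales; bright compact structures score high."""
    responses = [white_tophat(brightness, disk(s)) for s in scales]
    return np.mean(responses, axis=0)

# Toy scene: a bright 10x10 "building" on a darker background.
scene = 0.2 * np.ones((60, 60))
scene[20:30, 25:35] = 0.9
index = simple_building_index(scene)
candidates = index > 0.3          # threshold chosen arbitrarily for the toy example
print(candidates.sum(), "candidate building pixels")
```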
Natural language processing-based COTS software and related technologies survey.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Stickland, Michael G.; Conrad, Gregory N.; Eaton, Shelley M.
Natural language processing-based knowledge management software, traditionally developed for security organizations, is now becoming commercially available. An informal survey was conducted to discover and examine current NLP and related technologies and potential applications for information retrieval, information extraction, summarization, categorization, terminology management, link analysis, and visualization for possible implementation at Sandia National Laboratories. This report documents our current understanding of the technologies, lists software vendors and their products, and identifies potential applications of these technologies.
KAM (Knowledge Acquisition Module): A tool to simplify the knowledge acquisition process
NASA Technical Reports Server (NTRS)
Gettig, Gary A.
1988-01-01
Analysts, knowledge engineers and information specialists are faced with increasing volumes of time-sensitive data in text form, either as free text or highly structured text records. Rapid access to the relevant data in these sources is essential. However, due to the volume and organization of the contents, and limitations of human memory and association, frequently: (1) important information is not located in time; (2) reams of irrelevant data are searched; and (3) interesting or critical associations are missed due to physical or temporal gaps involved in working with large files. The Knowledge Acquisition Module (KAM) is a microcomputer-based expert system designed to assist knowledge engineers, analysts, and other specialists in extracting useful knowledge from large volumes of digitized text and text-based files. KAM formulates non-explicit, ambiguous, or vague relations, rules, and facts into a manageable and consistent formal code. A library of system rules or heuristics is maintained to control the extraction of rules, relations, assertions, and other patterns from the text. These heuristics can be added, deleted or customized by the user. The user can further control the extraction process with optional topic specifications. This allows the user to cluster extracts based on specific topics. Because KAM formalizes diverse knowledge, it can be used by a variety of expert systems and automated reasoning applications. KAM can also perform important roles in computer-assisted training and skill development. Current research efforts include the applicability of neural networks to aid in the extraction process and the conversion of these extracts into standard formats.
Extracting biomedical events from pairs of text entities
2015-01-01
Background Huge amounts of electronic biomedical documents, such as molecular biology reports or genomic papers are generated daily. Nowadays, these documents are mainly available in the form of unstructured free texts, which require heavy processing for their registration into organized databases. This organization is instrumental for information retrieval, enabling to answer the advanced queries of researchers and practitioners in biology, medicine, and related fields. Hence, the massive data flow calls for efficient automatic methods of text-mining that extract high-level information, such as biomedical events, from biomedical text. The usual computational tools of Natural Language Processing cannot be readily applied to extract these biomedical events, due to the peculiarities of the domain. Indeed, biomedical documents contain highly domain-specific jargon and syntax. These documents also describe distinctive dependencies, making text-mining in molecular biology a specific discipline. Results We address biomedical event extraction as the classification of pairs of text entities into the classes corresponding to event types. The candidate pairs of text entities are recursively provided to a multiclass classifier relying on Support Vector Machines. This recursive process extracts events involving other events as arguments. Compared to joint models based on Markov Random Fields, our model simplifies inference and hence requires shorter training and prediction times along with lower memory capacity. Compared to usual pipeline approaches, our model passes over a complex intermediate problem, while making a more extensive usage of sophisticated joint features between text entities. Our method focuses on the core event extraction of the Genia task of BioNLP challenges yielding the best result reported so far on the 2013 edition. PMID:26201478
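The core modelling choice, casting event extraction as multiclass classification of candidate text-entity pairs with a Support Vector Machine, can be sketched generically with scikit-learn. The toy features and labels below are invented for illustration and bear no relation to the Genia data or the authors' feature set.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline

# Hypothetical candidate pairs: (trigger word, argument entity) with simple joint features.
pairs = [
    {"trigger": "phosphorylation", "arg_type": "protein", "dep_path_len": 2},
    {"trigger": "expression",      "arg_type": "gene",    "dep_path_len": 1},
    {"trigger": "binds",           "arg_type": "protein", "dep_path_len": 3},
    {"trigger": "said",            "arg_type": "protein", "dep_path_len": 5},
]
labels = ["Phosphorylation", "Gene_expression", "Binding", "NO_EVENT"]

# Multiclass SVM over vectorized pair features.
model = make_pipeline(DictVectorizer(), SVC(kernel="linear"))
model.fit(pairs, labels)
print(model.predict([{"trigger": "phosphorylation", "arg_type": "protein", "dep_path_len": 2}]))
```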
Towards an intelligent framework for multimodal affective data analysis.
Poria, Soujanya; Cambria, Erik; Hussain, Amir; Huang, Guang-Bin
2015-03-01
An increasingly large amount of multimodal content is posted on social media websites such as YouTube and Facebook every day. In order to cope with the growth of such multimodal data, there is an urgent need to develop an intelligent multi-modal analysis framework that can effectively extract information from multiple modalities. In this paper, we propose a novel multimodal information extraction agent, which infers and aggregates the semantic and affective information associated with user-generated multimodal data in contexts such as e-learning, e-health, automatic video content tagging and human-computer interaction. In particular, the developed intelligent agent adopts an ensemble feature extraction approach by exploiting the joint use of tri-modal (text, audio and video) features to enhance the multimodal information extraction process. In preliminary experiments using the eNTERFACE dataset, our proposed multi-modal system is shown to achieve an accuracy of 87.95%, outperforming the best state-of-the-art system by more than 10%, or in relative terms, a 56% reduction in error rate. Copyright © 2014 Elsevier Ltd. All rights reserved.
From data to information and knowledge for geospatial applications
NASA Astrophysics Data System (ADS)
Schenk, T.; Csatho, B.; Yoon, T.
2006-12-01
An ever-increasing number of airborne and spaceborne data-acquisition missions with various sensors produce a glut of data. Sensory data rarely contain information in an explicit form that an application can use directly. The processing and analyzing of data constitutes a real bottleneck; therefore, automating the processes of gaining useful information and knowledge from the raw data is of paramount interest. This presentation is concerned with the transition from data to information and knowledge. By data we refer to the sensor output, and we note that data very rarely provide direct answers for applications. For example, a pixel in a digital image or a laser point from a LIDAR system (data) has no direct relationship with elevation changes of topographic surfaces or the velocity of a glacier (information, knowledge). We propose to employ the computer vision paradigm to extract information and knowledge as it pertains to a wide range of geoscience applications. After introducing the paradigm we describe the major steps to be undertaken for extracting information and knowledge from sensory input data. Features play an important role in this process. Thus we focus on extracting features and their perceptual organization into higher-order constructs. We demonstrate these concepts with imaging data and laser point clouds. The second part of the presentation addresses the problem of combining data obtained by different sensors. An absolute prerequisite for successful fusion is to establish a common reference frame. We elaborate on the concept of sensor-invariant features that allow the registration of such disparate data sets as aerial/satellite imagery, 3D laser point clouds, and multi/hyperspectral imagery. Fusion takes place on the data level (sensor registration) and on the information level. We show how fusion increases the degree of automation for reconstructing topographic surfaces. Moreover, fused information gained from the three sensors results in a more abstract surface representation with a rich set of explicit surface information that can be readily used by an analyst for applications such as change detection.
Truly work-like work extraction via a single-shot analysis.
Aberg, Johan
2013-01-01
The work content of non-equilibrium systems in relation to a heat bath is often analysed in terms of expectation values of an underlying random work variable. However, when optimizing the expectation value of the extracted work, the resulting extraction process is subject to intrinsic fluctuations, uniquely determined by the Hamiltonian and the initial distribution of the system. These fluctuations can be of the same order as the expected work content per se, in which case the extracted energy is unpredictable, and thus intuitively more heat-like than work-like. This raises the question of the 'truly' work-like energy that can be extracted. Here we consider an alternative that corresponds to an essentially fluctuation-free extraction. We show that this quantity can be expressed in terms of a one-shot relative entropy measure introduced in information theory. This suggests that the relations between information theory and statistical mechanics, as illustrated by concepts like Maxwell's demon, Szilard engines and Landauer's principle, extend to the single-shot regime.
NASA Astrophysics Data System (ADS)
Liu, Shuang; Liu, Fei; Hu, Shaohua; Yin, Zhenbiao
The major power information of the main transmission system in machine tools (MTSMT) during the machining process includes the effective output power (i.e. cutting power), the input power and power loss of the mechanical transmission system, and the main motor power loss. This information is easy to obtain in the laboratory but difficult to evaluate in a manufacturing process. To solve this problem, a separation method is proposed here to extract the MTSMT power information during the machining process. In this method, the energy flow and the mathematical models of the major power information of the MTSMT during the machining process are set up first. Based on the mathematical models and the basic data tables obtained from experiments, the above-mentioned power information can then be separated during machining simply by measuring the real-time total input power of the spindle motor. The operation program of this method is also given.
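The separation idea can be sketched in a few lines: with an experimentally obtained table of idle transmission losses and a motor-loss model, the cutting power follows from the measured total spindle input power. All coefficients and table values in the sketch are illustrative placeholders, not values from the paper.

```python
# Minimal sketch of the separation idea: given a basic-data table of idle
# (no-load) transmission losses measured in the lab and a simple main-motor
# loss model, the cutting power is separated from the measured real-time
# spindle input power. Numbers below are placeholders.

# Idle mechanical transmission loss (kW) per spindle speed, from experiments
UNLOAD_TRANSMISSION_LOSS_KW = {1000: 0.30, 1500: 0.42, 2000: 0.55}

def motor_loss_kw(p_input_kw, a0=0.12, a1=0.05):
    """Hypothetical main-motor loss model: constant plus load-dependent term."""
    return a0 + a1 * p_input_kw

def separate_power(p_input_kw, spindle_speed_rpm):
    """Split the measured total input power into its major components."""
    p_motor = motor_loss_kw(p_input_kw)
    p_mech = UNLOAD_TRANSMISSION_LOSS_KW[spindle_speed_rpm]
    p_cut = p_input_kw - p_motor - p_mech      # effective output (cutting) power
    return {"input": p_input_kw, "motor_loss": round(p_motor, 3),
            "transmission_loss": p_mech, "cutting": round(p_cut, 3)}

print(separate_power(p_input_kw=4.2, spindle_speed_rpm=2000))
```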
Stanislawski, Larry V.; Falgout, Jeff T.; Buttenfield, Barbara P.
2015-01-01
Hydrographic networks form an important data foundation for cartographic base mapping and for hydrologic analysis. Drainage density patterns for these networks can be derived to characterize local landscape, bedrock and climate conditions, and further inform hydrologic and geomorphological analysis by indicating areas where too few headwater channels have been extracted. But natural drainage density patterns are not consistently available in existing hydrographic data for the United States because compilation and capture criteria historically varied, along with climate, during the period of data collection over the various terrain types throughout the country. This paper demonstrates an automated workflow that is being tested in a high-performance computing environment by the U.S. Geological Survey (USGS) to map natural drainage density patterns at the 1:24,000-scale (24K) for the conterminous United States. Hydrographic network drainage patterns may be extracted from elevation data to guide corrections for existing hydrographic network data. The paper describes three stages in this workflow including data pre-processing, natural channel extraction, and generation of drainage density patterns from extracted channels. The workflow is concurrently implemented by executing procedures on multiple subbasin watersheds within the U.S. National Hydrography Dataset (NHD). Pre-processing defines parameters that are needed for the extraction process. Extraction proceeds in standard fashion: filling sinks, developing flow direction and weighted flow accumulation rasters. Drainage channels with assigned Strahler stream order are extracted within a subbasin and simplified. Drainage density patterns are then estimated with 100-meter resolution and subsequently smoothed with a low-pass filter. The extraction process is found to be of better quality in higher slope terrains. Concurrent processing through the high performance computing environment is shown to facilitate and refine the choice of drainage density extraction parameters and more readily improve extraction procedures than conventional processing.
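The channel-extraction core of such a workflow, D8 flow routing followed by accumulation thresholding, can be sketched compactly. The toy example below assumes a small depression-free DEM and an arbitrary channel threshold; sink filling, Strahler ordering, the density surface and the USGS parameter choices are all omitted.

```python
# Minimal sketch of channel extraction: D8 flow accumulation on a small
# depression-free DEM, then a channel mask by thresholding accumulation.
import numpy as np

def d8_accumulation(dem):
    rows, cols = dem.shape
    acc = np.ones_like(dem, dtype=float)        # each cell contributes itself
    order = np.argsort(dem, axis=None)[::-1]    # process highest cells first
    neighbors = [(-1,-1),(-1,0),(-1,1),(0,-1),(0,1),(1,-1),(1,0),(1,1)]
    for idx in order:
        r, c = divmod(idx, cols)
        best, target = 0.0, None
        for dr, dc in neighbors:
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols:
                drop = (dem[r, c] - dem[rr, cc]) / np.hypot(dr, dc)
                if drop > best:
                    best, target = drop, (rr, cc)
        if target is not None:                  # pass flow to steepest neighbor
            acc[target] += acc[r, c]
    return acc

dem = np.array([[9, 8, 7, 6],
                [8, 7, 5, 4],
                [7, 6, 4, 2],
                [6, 5, 3, 1]], dtype=float)
acc = d8_accumulation(dem)
channels = acc >= 4                             # illustrative channel threshold
print(acc)
print(channels.astype(int))
```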
Extracting Information from Narratives: An Application to Aviation Safety Reports
DOE Office of Scientific and Technical Information (OSTI.GOV)
Posse, Christian; Matzke, Brett D.; Anderson, Catherine M.
2005-05-12
Aviation safety reports are the best available source of information about why a flight incident happened. However, the stream-of-consciousness style of the narratives makes it difficult to automate the information extraction task. We propose an approach and infrastructure based on a common pattern specification language to capture relevant information via normalized template expression matching in context. Template expression matching handles variants of multi-word expressions. Normalization improves the likelihood of correct hits by standardizing and cleaning the vocabulary used in narratives. Checking for the presence of negative modifiers in the proximity of a potential hit reduces the chance of false hits. We present the above approach in the context of a specific application, which is the extraction of human performance factors from NASA ASRS reports. While knowledge infusion from experts plays a critical role during the learning phase, early results show that in a production mode, the automated process provides information that is consistent with analyses by human subjects.
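A minimal sketch of normalized template-expression matching with a negation check in the surrounding context is given below; the vocabulary map, templates and context window are invented placeholders rather than the actual pattern specification language.

```python
# Minimal sketch: normalize the narrative vocabulary, match template
# expressions for human-performance factors, and reject hits that have a
# negative modifier in the preceding context window.
import re

NORMALIZE = {"fatigued": "fatigue", "tired": "fatigue", "comms": "communication"}
TEMPLATES = {
    "fatigue": r"\bfatigue\b",
    "communication breakdown": r"\bcommunication\s+(breakdown|problem)\b",
}
NEGATIONS = re.compile(r"\b(no|not|denied|without)\b", re.IGNORECASE)

def normalize(text):
    tokens = [NORMALIZE.get(t.lower(), t.lower()) for t in re.findall(r"\w+", text)]
    return " ".join(tokens)

def extract_factors(narrative, window=4):
    text = normalize(narrative)
    hits = []
    for factor, pattern in TEMPLATES.items():
        for m in re.finditer(pattern, text):
            left = " ".join(text[:m.start()].split()[-window:])
            if not NEGATIONS.search(left):      # reject negated mentions
                hits.append(factor)
    return hits

print(extract_factors("Captain was fatigued, but there was no communication problem."))
```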
Semi-automatic building extraction in informal settlements from high-resolution satellite imagery
NASA Astrophysics Data System (ADS)
Mayunga, Selassie David
The extraction of man-made features from digital remotely sensed images is considered an important step underpinning the management of human settlements in any country. Man-made features, and buildings in particular, are required for a variety of applications such as urban planning, the creation of geographical information system (GIS) databases and urban city models. Traditional man-made feature extraction methods are very expensive in terms of equipment, are labour intensive, need well-trained personnel and cannot cope with changing environments, particularly in dense urban settlement areas. This research presents an approach for extracting buildings in dense informal settlement areas using high-resolution satellite imagery. The proposed system uses a novel strategy of extracting a building by measuring a single point at the approximate centre of the building. The fine measurement of the building outline is then effected using a modified snake model. The original snake model on which this framework is based incorporates an external constraint energy term tailored to preserving the convergence properties of the snake; applying it to unstructured objects negatively affects their recovered shapes. The external constraint energy term was therefore removed from the original snake model formulation, giving the model the ability to cope with the high variability of building shapes in informal settlement areas. The proposed building extraction system was tested on two areas with different conditions. The first area was Tungi in Dar es Salaam, Tanzania, where three sites were tested. This area is characterized by informal settlements, which are established illegally within the city boundaries. The second area was Oromocto in New Brunswick, Canada, where two sites were tested. The Oromocto area is mostly flat and the buildings are constructed using similar materials. Qualitative and quantitative measures were employed to evaluate the accuracy of the results as well as the performance of the system; they were based on visual inspection and on comparing the measured coordinates to reference data, respectively. In the course of this process, a mean area coverage of 98% was achieved for the Dar es Salaam test sites, which indicated overall that the extracted building polygons were close to the ground-truth data. Furthermore, the proposed system reduced the time needed to extract a single building by 32%. Although the extracted building polygons are within the perimeter of the ground-truth data, some of them were visually somewhat distorted. This implies that an interactive post-editing process is necessary for cartographic representation.
Kumar, Shiu; Sharma, Alok; Tsunoda, Tatsuhiko
2017-12-28
Common spatial pattern (CSP) has been an effective technique for feature extraction in electroencephalography (EEG) based brain computer interfaces (BCIs). However, motor imagery EEG signal feature extraction using CSP generally depends to a great extent on the selection of the frequency bands. In this study, we propose a mutual information based frequency band selection approach. The idea of the proposed method is to utilize the information from all the available channels for effectively selecting the most discriminative filter banks. CSP features are extracted from multiple overlapping sub-bands. An additional sub-band has been introduced that covers the wide frequency band (7-30 Hz), and two different types of features are extracted using CSP and common spatio-spectral pattern techniques, respectively. Mutual information is then computed from the extracted features of each of these bands and the top filter banks are selected for further processing. Linear discriminant analysis is applied to the features extracted from each of the filter banks. The scores are fused together, and classification is done using a support vector machine. The proposed method is evaluated using BCI Competition III dataset IVa, BCI Competition IV dataset I and BCI Competition IV dataset IIb, and it outperformed all other competing methods, achieving the lowest misclassification rate and the highest kappa coefficient on all three datasets. By introducing a wide sub-band and using mutual information to select the most discriminative sub-bands, the proposed method improves motor imagery EEG signal classification.
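The band-selection step can be sketched as follows, assuming NumPy, SciPy and scikit-learn and using random stand-in EEG trials: CSP log-variance features are computed per sub-band, each sub-band is scored by the mutual information between its features and the labels, and the top-scoring bands are kept. The CSP here is a simplified two-filter version, not the paper's full pipeline (no CSSP features, LDA scoring or SVM fusion).

```python
# Minimal sketch of mutual-information-based sub-band selection for CSP.
import numpy as np
from scipy.signal import butter, filtfilt
from scipy.linalg import eigh
from sklearn.feature_selection import mutual_info_classif

def csp_logvar(trials, labels, n_filters=2):
    """Trials: (n_trials, n_channels, n_samples). Returns log-variance features."""
    c0 = np.mean([np.cov(t) for t, y in zip(trials, labels) if y == 0], axis=0)
    c1 = np.mean([np.cov(t) for t, y in zip(trials, labels) if y == 1], axis=0)
    _, vecs = eigh(c0, c0 + c1)                      # generalized eigenproblem
    w = np.hstack([vecs[:, :n_filters // 2], vecs[:, -n_filters // 2:]])
    return np.array([np.log(np.var(w.T @ t, axis=1)) for t in trials])

rng = np.random.default_rng(0)
fs = 250
trials = rng.normal(size=(40, 8, 2 * fs))            # 40 random stand-in trials
labels = rng.integers(0, 2, size=40)
bands = [(7, 13), (10, 16), (13, 19), (16, 22), (19, 25), (22, 30), (7, 30)]

scores = []
for lo, hi in bands:
    b, a = butter(4, [lo / (fs / 2), hi / (fs / 2)], btype="bandpass")
    filtered = filtfilt(b, a, trials, axis=-1)
    feats = csp_logvar(filtered, labels)
    scores.append(mutual_info_classif(feats, labels).sum())

top = np.argsort(scores)[::-1][:3]
print("selected sub-bands:", [bands[i] for i in top])
```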
Model of experts for decision support in the diagnosis of leukemia patients.
Corchado, Juan M; De Paz, Juan F; Rodríguez, Sara; Bajo, Javier
2009-07-01
Recent advances in the field of biomedicine, specifically in the field of genomics, have led to an increase in the information available for conducting expression analysis. Expression analysis is a technique used in transcriptomics, a branch of genomics that deals with the study of messenger ribonucleic acid (mRNA) and the extraction of information contained in the genes. This increase in information is reflected in the exon arrays, which require the use of new techniques in order to extract the information. The purpose of this study is to provide a tool based on a mixture of experts model that allows the analysis of the information contained in the exon arrays, from which automatic classifications for decision support in diagnoses of leukemia patients can be made. The proposed model integrates several cooperative algorithms characterized by their efficiency in data processing, filtering, classification and knowledge extraction. The Cancer Institute of the University of Salamanca is making an effort to develop tools to automate the evaluation of data and to facilitate the analysis of information. This proposal is a step forward in this direction and the first step toward the development of a mixture of experts tool that integrates different cognitive and statistical approaches to deal with the analysis of exon arrays. The mixture of experts model presented within this work provides great capacities for learning and adaptation to the characteristics of the problem in consideration, using novel algorithms in each of the stages of the analysis process that can be easily configured and combined, and provides results that notably improve those provided by the existing methods for exon array analysis. The material used consists of data from exon arrays provided by the Cancer Institute that contain samples from leukemia patients. The methodology used consists of a system based on a mixture of experts. Each one of the experts incorporates novel artificial intelligence techniques that improve the process of carrying out various tasks such as pre-processing, filtering, classification and extraction of knowledge. This article details the manner in which the individual experts are combined so that together they generate a system capable of extracting knowledge, thus permitting patients to be classified in an automatic and efficient manner that is also comprehensible for medical personnel. The system has been tested in a real setting and has been used for classifying patients who suffer from different forms of leukemia at various stages. Personnel from the Cancer Institute supervised and participated throughout the testing period. Preliminary results are promising, notably improving the results obtained with previously used tools. The medical staff from the Cancer Institute considers the tools that have been developed to be positive and very useful in a supporting capacity for carrying out their daily tasks. Additionally, the mixture of experts supplies a tool for extracting the information needed to explain, in simple terms, the associations that have been made. That is, it permits the extraction of knowledge for each classification made, which can be generalized for use in subsequent classifications. This allows for a large amount of learning and adaptation within the proposed system.
Biological network extraction from scientific literature: state of the art and challenges.
Li, Chen; Liakata, Maria; Rebholz-Schuhmann, Dietrich
2014-09-01
Networks of molecular interactions explain complex biological processes, and all known information on molecular events is contained in a number of public repositories including the scientific literature. Metabolic and signalling pathways are often viewed separately, even though both types are composed of interactions involving proteins and other chemical entities. It is necessary to be able to combine data from all available resources to judge the functionality, complexity and completeness of any given network overall; the full integration of relevant information from the scientific literature, in particular, remains an ongoing and complex task. Currently, the text-mining research community is steadily moving towards processing the full body of the scientific literature by making use of rich linguistic features such as full-text parsing to extract biological interactions. The next step will be to combine these with information from scientific databases to support hypothesis generation for the discovery of new knowledge and the extension of biological networks. The generation of comprehensive networks requires technologies such as entity grounding, coordination resolution and co-reference resolution, which are not fully solved and are required to further improve the quality of results. Here, we analyse the state of the art for the extraction of network information from the scientific literature and the evaluation of extraction methods against reference corpora, discuss challenges involved and identify directions for future research. © The Author 2013. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.
PASTE: patient-centered SMS text tagging in a medication management system.
Stenner, Shane P; Johnson, Kevin B; Denny, Joshua C
2012-01-01
To evaluate the performance of a system that extracts medication information and administration-related actions from patient short message service (SMS) messages. Mobile technologies provide a platform for electronic patient-centered medication management. MyMediHealth (MMH) is a medication management system that includes a medication scheduler, a medication administration record, and a reminder engine that sends text messages to cell phones. The object of this work was to extend MMH to allow two-way interaction using mobile phone-based SMS technology. Unprompted text-message communication with patients using natural language could engage patients in their healthcare, but presents unique natural language processing challenges. The authors developed a new functional component of MMH, the Patient-centered Automated SMS Tagging Engine (PASTE). The PASTE web service uses natural language processing methods, custom lexicons, and existing knowledge sources to extract and tag medication information from patient text messages. A pilot evaluation of PASTE was completed using 130 medication messages anonymously submitted by 16 volunteers via a website. System output was compared with manually tagged messages. Verified medication names, medication terms, and action terms reached high F-measures of 91.3%, 94.7%, and 90.4%, respectively. The overall medication name F-measure was 79.8%, and the medication action term F-measure was 90%. Other studies have demonstrated systems that successfully extract medication information from clinical documents using semantic tagging, regular expression-based approaches, or a combination of both approaches. This evaluation demonstrates the feasibility of extracting medication information from patient-generated medication messages.
The Gender Puzzle: Toddlers' Use of Articles to Access Noun Information
ERIC Educational Resources Information Center
Arias-Trejo, Natalia; Falcon, Alberto; Alva-Canto, Elda A.
2013-01-01
Grammatical gender embedded in determiners, nouns and adjectives allows indirect and more rapid processing of the referents implied in sentences. However in a language such as Spanish, this useful information cannot be reliably retrieved from a single source of information. Instead, noun gender may be extracted either from phono-morphological,…
Real time system design of motor imagery brain-computer interface based on multi band CSP and SVM
NASA Astrophysics Data System (ADS)
Zhao, Li; Li, Xiaoqin; Bian, Yan
2018-04-01
Motor imagery (MI) is an effective method to promote the recovery of limbs in patients after stroke. An online MI brain-computer interface (BCI) system, which applies MI, can enhance the patient's participation and accelerate the recovery process. The traditional method deals with the electroencephalogram (EEG) induced by MI using the common spatial pattern (CSP), which extracts information from a single frequency band. In order to further improve the classification accuracy of the system, information from two characteristic frequency bands is extracted. The effectiveness of the proposed feature extraction method is verified by off-line analysis of competition data and by analysis of the online system.
Automated Text Markup for Information Retrieval from an Electronic Textbook of Infectious Disease
Berrios, Daniel C.; Kehler, Andrew; Kim, David K.; Yu, Victor L.; Fagan, Lawrence M.
1998-01-01
The information needs of practicing clinicians frequently require textbook or journal searches. Making these sources available in electronic form improves the speed of these searches, but precision (i.e., the fraction of relevant to total documents retrieved) remains low. Improving the traditional keyword search by transforming search terms into canonical concepts does not improve search precision greatly. Kim et al. have designed and built a prototype system (MYCIN II) for computer-based information retrieval from a forthcoming electronic textbook of infectious disease. The system requires manual indexing by experts in the form of complex text markup. However, this mark-up process is time consuming (about 3 person-hours to generate, review, and transcribe the index for each of 218 chapters). We have designed and implemented a system to semiautomate the markup process. The system, information extraction for semiautomated indexing of documents (ISAID), uses query models and existing information-extraction tools to provide support for any user, including the author of the source material, to mark up tertiary information sources quickly and accurately.
Real-Time Digital Signal Processing Based on FPGAs for Electronic Skin Implementation †
Ibrahim, Ali; Gastaldo, Paolo; Chible, Hussein; Valle, Maurizio
2017-01-01
Enabling touch-sensing capability would help appliances understand interaction behaviors with their surroundings. Many recent studies are focusing on the development of electronic skin because of its necessity in various application domains, namely autonomous artificial intelligence (e.g., robots), biomedical instrumentation, and replacement prosthetic devices. An essential task of the electronic skin system is to locally process the tactile data and send structured information either to mimic human skin or to respond to the application demands. The electronic skin must be fabricated together with an embedded electronic system which has the role of acquiring the tactile data, processing, and extracting structured information. On the other hand, processing tactile data requires efficient methods to extract meaningful information from raw sensor data. Machine learning represents an effective method for data analysis in many domains: it has recently demonstrated its effectiveness in processing tactile sensor data. In this framework, this paper presents the implementation of digital signal processing based on FPGAs for tactile data processing. It provides the implementation of a tensorial kernel function for a machine learning approach. Implementation results are assessed by highlighting the FPGA resource utilization and power consumption. Results demonstrate the feasibility of the proposed implementation when real-time classification of input touch modalities is targeted. PMID:28287448
Opinion mining on book review using CNN-L2-SVM algorithm
NASA Astrophysics Data System (ADS)
Rozi, M. F.; Mukhlash, I.; Soetrisno; Kimura, M.
2018-03-01
A review of a product can reflect the quality of the product itself. Extracting information from such a review can be used to determine the sentiment of the opinion it expresses. The process of extracting useful information from user reviews is called opinion mining. A review-extraction approach that is advancing rapidly nowadays is the deep learning model, which has been used by many researchers to obtain excellent performance on natural language processing tasks. In this research, one deep learning model, the convolutional neural network (CNN), is used for feature extraction, and an L2 support vector machine (SVM) is used as the classifier. These methods are applied to determine the sentiment of book review data. The results show state-of-the-art performance, with 83.23% accuracy in the training phase and 64.6% in the testing phase.
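A rough sketch of the CNN-feature-plus-L2-SVM pipeline is shown below, assuming PyTorch and scikit-learn; the tiny text CNN is left untrained and the vocabulary, review tensors and labels are random placeholders, so this only illustrates the data flow, not the paper's trained model.

```python
# Minimal sketch: a small text CNN produces fixed-length review features,
# which are classified by a linear SVM with squared hinge (L2) loss.
import torch
import torch.nn as nn
from sklearn.svm import LinearSVC

class TextCNN(nn.Module):
    def __init__(self, vocab_size=1000, emb_dim=32, n_filters=16, kernel=3):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.conv = nn.Conv1d(emb_dim, n_filters, kernel)
        self.pool = nn.AdaptiveMaxPool1d(1)

    def forward(self, token_ids):                   # (batch, seq_len)
        x = self.emb(token_ids).transpose(1, 2)     # (batch, emb_dim, seq_len)
        x = torch.relu(self.conv(x))
        return self.pool(x).squeeze(-1)             # (batch, n_filters)

torch.manual_seed(0)
cnn = TextCNN()
reviews = torch.randint(0, 1000, (20, 40))          # 20 toy reviews, 40 tokens each
labels = torch.randint(0, 2, (20,)).numpy()         # positive / negative sentiment

with torch.no_grad():
    features = cnn(reviews).numpy()                 # CNN acts as feature extractor

svm = LinearSVC(penalty="l2", loss="squared_hinge") # the "L2-SVM" classifier
svm.fit(features, labels)
print(svm.predict(features[:5]))
```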
Stimulus encoding and feature extraction by multiple sensory neurons.
Krahe, Rüdiger; Kreiman, Gabriel; Gabbiani, Fabrizio; Koch, Christof; Metzner, Walter
2002-03-15
Neighboring cells in topographical sensory maps may transmit similar information to the next higher level of processing. How information transmission by groups of nearby neurons compares with the performance of single cells is a very important question for understanding the functioning of the nervous system. To tackle this problem, we quantified stimulus-encoding and feature extraction performance by pairs of simultaneously recorded electrosensory pyramidal cells in the hindbrain of weakly electric fish. These cells constitute the output neurons of the first central nervous stage of electrosensory processing. Using random amplitude modulations (RAMs) of a mimic of the fish's own electric field within behaviorally relevant frequency bands, we found that pyramidal cells with overlapping receptive fields exhibit strong stimulus-induced correlations. To quantify the encoding of the RAM time course, we estimated the stimuli from simultaneously recorded spike trains and found significant improvements over single spike trains. The quality of stimulus reconstruction, however, was still inferior to the one measured for single primary sensory afferents. In an analysis of feature extraction, we found that spikes of pyramidal cell pairs coinciding within a time window of a few milliseconds performed significantly better at detecting upstrokes and downstrokes of the stimulus compared with isolated spikes and even spike bursts of single cells. Coincident spikes can thus be considered "distributed bursts." Our results suggest that stimulus encoding by primary sensory afferents is transformed into feature extraction at the next processing stage. There, stimulus-induced coincident activity can improve the extraction of behaviorally relevant features from the stimulus.
Rouinfar, Amy; Agra, Elise; Larson, Adam M.; Rebello, N. Sanjay; Loschky, Lester C.
2014-01-01
This study investigated links between visual attention processes and conceptual problem solving. This was done by overlaying visual cues on conceptual physics problem diagrams to direct participants’ attention to relevant areas to facilitate problem solving. Participants (N = 80) individually worked through four problem sets, each containing a diagram, while their eye movements were recorded. Each diagram contained regions that were relevant to solving the problem correctly and separate regions related to common incorrect responses. Problem sets contained an initial problem, six isomorphic training problems, and a transfer problem. The cued condition saw visual cues overlaid on the training problems. Participants’ verbal responses were used to determine their accuracy. This study produced two major findings. First, short-duration visual cues, which draw attention to solution-relevant information and aid in organizing and integrating it, facilitate both immediate problem solving and generalization of that ability to new problems. Thus, visual cues can facilitate re-representing a problem and overcoming impasse, enabling a correct solution. Importantly, these cueing effects on problem solving did not involve the solvers’ attention necessarily embodying the solution to the problem, but were instead caused by solvers attending to and integrating relevant information in the problems into a solution path. Second, this study demonstrates that when such cues are used across multiple problems, solvers can automatize the extraction of problem-relevant information. These results suggest that low-level attentional selection processes provide a necessary gateway for relevant information to be used in problem solving, but are generally not sufficient for correct problem solving. Instead, factors that lead a solver to an impasse and to organize and integrate problem information also greatly facilitate arriving at correct solutions. PMID:25324804
Profiling of poorly stratified smoky atmospheres with scanning lidar
Vladimir Kovalev; Cyle Wold; Alexander Petkov; Wei Min Hao
2012-01-01
The multiangle data processing technique considered here uses the signal measured at zenith (or close to zenith) as the core source for extracting information about the vertical atmospheric aerosol loading. The multiangle signals are used as auxiliary data to extract the vertical transmittance profile from the zenith signal. Simulated and experimental...
PKDE4J: Entity and relation extraction for public knowledge discovery.
Song, Min; Kim, Won Chul; Lee, Dahee; Heo, Go Eun; Kang, Keun Young
2015-10-01
Due to an enormous number of scientific publications that cannot be handled manually, there is a rising interest in text-mining techniques for automated information extraction, especially in the biomedical field. Such techniques provide effective means of information search, knowledge discovery, and hypothesis generation. Most previous studies have primarily focused on the design and performance improvement of either named entity recognition or relation extraction. In this paper, we present PKDE4J, a comprehensive text-mining system that integrates dictionary-based entity extraction and rule-based relation extraction in a highly flexible and extensible framework. Starting with the Stanford CoreNLP, we developed the system to cope with multiple types of entities and relations. The system also has fairly good performance in terms of accuracy as well as the ability to configure text-processing components. We demonstrate its competitive performance by evaluating it on many corpora and found that it surpasses existing systems with average F-measures of 85% for entity extraction and 81% for relation extraction. Copyright © 2015 Elsevier Inc. All rights reserved.
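The two-stage combination of dictionary-based entity extraction and rule-based relation extraction can be illustrated with a toy sketch; the dictionary, relation rules and example sentence below are invented and do not reflect PKDE4J's actual resources or its Stanford CoreNLP integration.

```python
# Minimal sketch of the two stages the system combines: dictionary-based
# entity extraction followed by rule-based relation extraction over sentences.
import re

ENTITY_DICT = {
    "TP53": "Gene", "p53": "Gene", "MDM2": "Gene", "apoptosis": "Process",
}
RELATION_RULES = [
    (re.compile(r"(\w+)\s+(?:activates|induces)\s+(\w+)", re.I), "activates"),
    (re.compile(r"(\w+)\s+(?:inhibits|suppresses)\s+(\w+)", re.I), "inhibits"),
]

def extract_entities(sentence):
    return {tok: ENTITY_DICT[tok] for tok in re.findall(r"\w+", sentence)
            if tok in ENTITY_DICT}

def extract_relations(sentence, entities):
    triples = []
    for pattern, label in RELATION_RULES:
        for m in pattern.finditer(sentence):
            subj, obj = m.group(1), m.group(2)
            if subj in entities and obj in entities:   # only link known entities
                triples.append((subj, label, obj))
    return triples

sentence = "MDM2 inhibits p53, and p53 induces apoptosis."
ents = extract_entities(sentence)
print(ents)
print(extract_relations(sentence, ents))
```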
Addressing Information Proliferation: Applications of Information Extraction and Text Mining
ERIC Educational Resources Information Center
Li, Jingjing
2013-01-01
The advent of the Internet and the ever-increasing capacity of storage media have made it easy to store, deliver, and share enormous volumes of data, leading to a proliferation of information on the Web, in online libraries, on news wires, and almost everywhere in our daily lives. Since our ability to process and absorb this information remains…
Effects of band selection on endmember extraction for forestry applications
NASA Astrophysics Data System (ADS)
Karathanassi, Vassilia; Andreou, Charoula; Andronis, Vassilis; Kolokoussis, Polychronis
2014-10-01
In spectral unmixing theory, data reduction techniques play an important role as hyperspectral imagery contains an immense amount of data, posing many challenging problems such as data storage, computational efficiency, and the so-called "curse of dimensionality". Feature extraction and feature selection are the two main approaches for dimensionality reduction. Feature extraction techniques are used for reducing the dimensionality of the hyperspectral data by applying transforms on hyperspectral data. Feature selection techniques retain the physical meaning of the data by selecting a set of bands from the input hyperspectral dataset, which mainly contain the information needed for spectral unmixing. Although feature selection techniques are well known for their dimensionality reduction potential, they are rarely used in the unmixing process. The majority of the existing state-of-the-art dimensionality reduction methods set criteria on the spectral information, which is derived from the whole wavelength range, in order to define the optimum spectral subspace. These criteria are not associated with any particular application but with the data statistics, such as correlation and entropy values. However, each application is associated with specific land cover materials, whose spectral characteristics present variations in specific wavelengths. In forestry, for example, many applications focus on tree leaves, in which specific pigments such as chlorophyll, xanthophyll, etc. determine the wavelengths where tree species, diseases, etc., can be detected. For such applications, when the unmixing process is applied, the tree species, diseases, etc., are considered as the endmembers of interest. This paper focuses on investigating the effects of band selection on the endmember extraction by exploiting the information of the vegetation absorbance spectral zones. More precisely, it is explored whether endmember extraction can be optimized when specific sets of initial bands related to leaf spectral characteristics are selected. Experiments comprise the application of well-known signal subspace estimation and endmember extraction methods on hyperspectral imagery of a forest area. Evaluation of the extracted endmembers showed that more forest species can be extracted as endmembers using selected bands.
Zhang, Hanyuan; Tian, Xuemin; Deng, Xiaogang; Cao, Yuping
2018-05-16
As an attractive nonlinear dynamic data analysis tool, global preserving kernel slow feature analysis (GKSFA) has achieved great success in extracting the high nonlinearity and inherently time-varying dynamics of batch processes. However, GKSFA is an unsupervised feature extraction method and lacks the ability to utilize batch process class label information, which may not offer the most effective means for dealing with batch process monitoring. To overcome this problem, we propose a novel batch process monitoring method based on the modified GKSFA, referred to as discriminant global preserving kernel slow feature analysis (DGKSFA), by closely integrating discriminant analysis and GKSFA. The proposed DGKSFA method can extract discriminant features of batch processes as well as preserve global and local geometrical structure information of observed data. For the purpose of fault detection, a monitoring statistic is constructed based on the distance between the optimal kernel feature vectors of test data and normal data. To tackle the challenging issue of nonlinear fault variable identification, a new nonlinear contribution plot method is also developed to help identify the fault variable after a fault is detected, which is derived from the idea of variable pseudo-sample trajectory projection in the DGKSFA nonlinear biplot. Simulation results conducted on a numerical nonlinear dynamic system and the benchmark fed-batch penicillin fermentation process demonstrate that the proposed process monitoring and fault diagnosis approach can effectively detect faults and distinguish fault variables from normal variables. Copyright © 2018 ISA. Published by Elsevier Ltd. All rights reserved.
Tool Wear Feature Extraction Based on Hilbert Marginal Spectrum
NASA Astrophysics Data System (ADS)
Guan, Shan; Song, Weijie; Pang, Hongyang
2017-09-01
In the metal cutting process, the signal contains a wealth of tool wear state information. A tool wear signal analysis and feature extraction method based on the Hilbert marginal spectrum is proposed. Firstly, the tool wear signal was decomposed by the empirical mode decomposition algorithm, and the intrinsic mode functions containing the main information were selected using the correlation coefficient and the variance contribution rate. Secondly, the Hilbert transform was performed on the main intrinsic mode functions; the Hilbert time-frequency spectrum and the Hilbert marginal spectrum were obtained from this transform. Finally, amplitude-domain indexes were extracted on the basis of the Hilbert marginal spectrum, and they form the recognition feature vector of the tool wear state. The research results show that the extracted features can effectively characterize the different wear states of the tool, which provides a basis for monitoring tool wear condition.
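The signal-processing chain can be sketched as below, assuming the third-party PyEMD package for empirical mode decomposition and SciPy for the Hilbert transform; the synthetic signal, IMF-selection threshold and frequency binning are illustrative placeholders.

```python
# Minimal sketch: EMD decomposition, IMF selection by correlation with the raw
# signal, Hilbert transform of the selected IMFs, and a marginal spectrum built
# by accumulating instantaneous amplitude over instantaneous-frequency bins.
import numpy as np
from scipy.signal import hilbert
from PyEMD import EMD   # third-party package, assumed available

fs = 1000
t = np.arange(0, 1.0, 1 / fs)
signal = np.sin(2 * np.pi * 60 * t) + 0.5 * np.sin(2 * np.pi * 180 * t) \
         + 0.1 * np.random.default_rng(0).normal(size=t.size)

imfs = EMD().emd(signal)

# Keep IMFs that correlate strongly with the original signal
selected = [imf for imf in imfs
            if abs(np.corrcoef(imf, signal)[0, 1]) > 0.2]

# Hilbert marginal spectrum: sum instantaneous amplitude per frequency bin
freq_bins = np.arange(0, fs / 2, 5.0)
marginal = np.zeros(freq_bins.size)
for imf in selected:
    analytic = hilbert(imf)
    amp = np.abs(analytic)
    inst_freq = np.diff(np.unwrap(np.angle(analytic))) * fs / (2 * np.pi)
    idx = np.clip(np.digitize(inst_freq, freq_bins) - 1, 0, freq_bins.size - 1)
    np.add.at(marginal, idx, amp[1:])

# Amplitude-domain indexes from the marginal spectrum as candidate features
features = {"peak": marginal.max(), "mean": marginal.mean(),
            "rms": np.sqrt((marginal ** 2).mean())}
print(features)
```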
SAR matrices: automated extraction of information-rich SAR tables from large compound data sets.
Wassermann, Anne Mai; Haebel, Peter; Weskamp, Nils; Bajorath, Jürgen
2012-07-23
We introduce the SAR matrix data structure that is designed to elucidate SAR patterns produced by groups of structurally related active compounds, which are extracted from large data sets. SAR matrices are systematically generated and sorted on the basis of SAR information content. Matrix generation is computationally efficient and enables processing of large compound sets. The matrix format is reminiscent of SAR tables, and SAR patterns revealed by different categories of matrices are easily interpretable. The structural organization underlying matrix formation is more flexible than standard R-group decomposition schemes. Hence, the resulting matrices capture SAR information in a comprehensive manner.
HEDEA: A Python Tool for Extracting and Analysing Semi-structured Information from Medical Records
Aggarwal, Anshul; Garhwal, Sunita
2018-01-01
Objectives: One of the most important functions for a medical practitioner while treating a patient is to study the patient's complete medical history by going through all records, from test results to doctor's notes. With the increasing use of technology in medicine, these records are mostly digital, alleviating the problem of looking through a stack of papers, which are easily misplaced, but some of these are in an unstructured form. Large parts of clinical reports are in written text form and are tedious to use directly without appropriate pre-processing. In medical research, such health records may be a good, convenient source of medical data; however, lack of structure means that the data is unfit for statistical evaluation. In this paper, we introduce a system to extract, store, retrieve, and analyse information from health records, with a focus on the Indian healthcare scene. Methods: A Python-based tool, Healthcare Data Extraction and Analysis (HEDEA), has been designed to extract structured information from various medical records using a regular expression-based approach. Results: The HEDEA system is working, covering a large set of formats, to extract and analyse health information. Conclusions: This tool can be used to generate analysis reports and charts using the central database. This information is only provided after prior approval has been received from the patient for medical research purposes. PMID:29770248
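A minimal sketch of regular-expression-based field extraction from a semi-structured record is given below; the field names, patterns and sample report are hypothetical placeholders rather than HEDEA's actual rule set.

```python
# Minimal sketch: extract structured fields from a semi-structured medical
# record using regular expressions. Patterns and the sample text are invented.
import re

FIELD_PATTERNS = {
    "patient_name": re.compile(r"Name\s*[:\-]\s*(.+)", re.I),
    "age":          re.compile(r"Age\s*[:\-]\s*(\d+)", re.I),
    "haemoglobin":  re.compile(r"(?:Hb|Haemoglobin)\s*[:\-]?\s*([\d.]+)\s*g/dl", re.I),
    "diagnosis":    re.compile(r"(?:Impression|Diagnosis)\s*[:\-]\s*(.+)", re.I),
}

def extract_record(text):
    record = {}
    for field, pattern in FIELD_PATTERNS.items():
        m = pattern.search(text)
        record[field] = m.group(1).strip() if m else None
    return record

report = """Name: Anita Sharma
Age: 54   Sex: F
Investigations: Hb 10.2 g/dL, WBC 8.4
Impression: Iron deficiency anaemia"""

print(extract_record(report))
```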
Information extraction from Italian medical reports: An ontology-driven approach.
Viani, Natalia; Larizza, Cristiana; Tibollo, Valentina; Napolitano, Carlo; Priori, Silvia G; Bellazzi, Riccardo; Sacchi, Lucia
2018-03-01
In this work, we propose an ontology-driven approach to identify events and their attributes from episodes of care included in medical reports written in Italian. For this language, shared resources for clinical information extraction are not easily accessible. The corpus considered in this work includes 5432 non-annotated medical reports belonging to patients with rare arrhythmias. To guide the information extraction process, we built a domain-specific ontology that includes the events and the attributes to be extracted, with related regular expressions. The ontology and the annotation system were constructed on a development set, while the performance was evaluated on an independent test set. As a gold standard, we considered a manually curated hospital database named TRIAD, which stores most of the information written in reports. The proposed approach performs well on the considered Italian medical corpus, with a percentage of correct annotations above 90% for most considered clinical events. We also assessed the possibility to adapt the system to the analysis of another language (i.e., English), with promising results. Our annotation system relies on a domain ontology to extract and link information in clinical text. We developed an ontology that can be easily enriched and translated, and the system performs well on the considered task. In the future, it could be successfully used to automatically populate the TRIAD database. Copyright © 2017 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Su, Zuqiang; Xiao, Hong; Zhang, Yi; Tang, Baoping; Jiang, Yonghua
2017-04-01
Extraction of sensitive features is a challenging but key task in data-driven machinery running state identification. Aiming to solve this problem, a method for machinery running state identification that applies discriminant semi-supervised local tangent space alignment (DSS-LTSA) for feature fusion and extraction is proposed. Firstly, in order to extract more distinct features, the vibration signals are decomposed by wavelet packet decomposition (WPD), and a mixed-domain feature set consisting of statistical features, autoregressive (AR) model coefficients, instantaneous amplitude Shannon entropy and the WPD energy spectrum is extracted to comprehensively characterize the properties of the machinery running state(s). Then, the mixed-domain feature set is input into DSS-LTSA for feature fusion and extraction to eliminate redundant information and interference noise. The proposed DSS-LTSA can extract intrinsic structure information of both labeled and unlabeled state samples, and as a result the over-fitting problem of supervised manifold learning and the blindness problem of unsupervised manifold learning are overcome. Simultaneously, class discrimination information is integrated within the dimension reduction process in a semi-supervised manner to improve the sensitivity of the extracted fusion features. Lastly, the extracted fusion features are input into a pattern recognition algorithm to achieve the running state identification. The effectiveness of the proposed method is verified by a running state identification case in a gearbox, and the results confirm the improved accuracy of the running state identification.
Uncovering the essential links in online commercial networks
NASA Astrophysics Data System (ADS)
Zeng, Wei; Fang, Meiling; Shao, Junming; Shang, Mingsheng
2016-09-01
Recommender systems are designed to effectively support individuals' decision-making process on various web sites. Such a system can be naturally represented by a user-object bipartite network, where a link indicates that a user has collected an object. Recently, research on the information backbone has attracted researchers' interest; the backbone is a sub-network with fewer nodes and links that nevertheless carries most of the relevant information. With the backbone, a system can generate satisfactory recommendations while saving considerable computing resources. In this paper, we propose an enhanced topology-aware method to extract the information backbone in the bipartite network, based mainly on the information of neighboring users and objects. Our backbone extraction method enables recommender systems to achieve more than 90% of the accuracy of the top-L recommendation while consuming only 20% of the links. The experimental results show that our method outperforms the alternative backbone extraction methods. Moreover, the structure of the information backbone is studied in detail. Finally, we highlight that the information backbone is one of the most important properties of the bipartite network, with which one can significantly improve the efficiency of the recommender system.
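The general idea of scoring links and keeping only the highest-scoring fraction can be sketched as below; the scoring rule is a simple neighbourhood-overlap stand-in, not the enhanced topology-aware method proposed in the paper, and the toy network is invented.

```python
# Minimal sketch of backbone extraction in a user-object bipartite network:
# score each link by how well the object's other users cover the user's other
# objects, then keep only the top-scoring fraction of links.
from collections import defaultdict

links = [("u1", "o1"), ("u1", "o2"), ("u2", "o1"), ("u2", "o3"),
         ("u3", "o2"), ("u3", "o3"), ("u3", "o4"), ("u4", "o4")]

user_items = defaultdict(set)
item_users = defaultdict(set)
for u, o in links:
    user_items[u].add(o)
    item_users[o].add(u)

def link_score(u, o):
    """Fraction of u's other objects reachable through o's other users."""
    others = user_items[u] - {o}
    if not others:
        return 0.0
    covered = {x for v in item_users[o] - {u} for x in user_items[v]}
    return len(others & covered) / len(others)

scored = sorted(links, key=lambda e: link_score(*e), reverse=True)
backbone = scored[: int(0.5 * len(links))]     # keep e.g. the top 50% of links
print(backbone)
```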
Robust watermark technique using masking and Hermite transform.
Coronel, Sandra L Gomez; Ramírez, Boris Escalante; Mosqueda, Marco A Acevedo
2016-01-01
This paper evaluates a watermarking algorithm designed for digital images that uses a perceptual mask and a normalization process, thus preventing detection by the human eye while ensuring robustness against common processing and geometric attacks. The Hermite transform is employed because it allows a perfect reconstruction of the image while incorporating human visual system properties; moreover, it is based on derivatives of Gaussian functions. The embedded watermark carries information about the proprietor of the digital image. The extraction process is blind, because it does not require the original image. The following techniques were utilized in the evaluation of the algorithm: peak signal-to-noise ratio, the structural similarity index average, the normalized cross-correlation, and bit error rate. Several watermark extraction tests were performed against geometric and common processing attacks. This allowed us to identify how many bits in the watermark can be modified while still permitting adequate extraction.
Extraction of UMLS® Concepts Using Apache cTAKES™ for German Language.
Becker, Matthias; Böckmann, Britta
2016-01-01
Automatic extraction of medical concepts from medical reports and their classification with semantic standards are useful for standardization and for clinical research. This paper presents an approach for UMLS concept extraction with a customized natural language processing pipeline for German clinical notes using Apache cTAKES. The objective is to test whether the natural language processing tool is suitable for identifying UMLS concepts in German-language text and mapping them to SNOMED-CT. The German UMLS database and German OpenNLP models extended the natural language processing pipeline, so that the pipeline can normalize to domain ontologies such as SNOMED-CT using the German concepts. For testing, the ShARe/CLEF eHealth 2013 training dataset translated into German was used. The implemented algorithms were tested with a set of 199 German reports, obtaining an average F1 measure of 0.36 without German stemming or pre- and post-processing of the reports.
Text Information Extraction System (TIES) | Informatics Technology for Cancer Research (ITCR)
TIES is a service based software system for acquiring, deidentifying, and processing clinical text reports using natural language processing, and also for querying, sharing and using this data to foster tissue and image based research, within and between institutions.
Industrial application of semantic process mining
NASA Astrophysics Data System (ADS)
Espen Ingvaldsen, Jon; Atle Gulla, Jon
2012-05-01
Process mining relates to the extraction of non-trivial and useful information from information system event logs. It is a new research discipline that has evolved significantly since the early work on idealistic process logs. Over the last years, process mining prototypes have incorporated elements from semantics and data mining and targeted visualisation techniques that are more user-friendly to business experts and process owners. In this article, we present a framework for evaluating different aspects of enterprise process flows and address practical challenges of state-of-the-art industrial process mining. We also explore the inherent strengths of the technology for more efficient process optimisation.
Evans, D. A.; Brownlow, N. D.; Hersh, W. R.; Campbell, E. M.
1996-01-01
We discuss the development and evaluation of an automated procedure for extracting drug-dosage information from clinical narratives. The process was developed rapidly using existing technology and resources, including categories of terms from UMLS96. Evaluations over a large training set and a smaller test set of medical records demonstrate an approximately 80% rate of exact and partial matches on target phrases, with few false positives and a modest rate of false negatives. The results suggest a strategy for automating general concept identification in electronic medical records. PMID:8947694
Effect of two doses of ginkgo biloba extract (EGb 761) on the dual-coding test in elderly subjects.
Allain, H; Raoul, P; Lieury, A; LeCoz, F; Gandon, J M; d'Arbigny, P
1993-01-01
The subjects of this double-blind study were 18 elderly men and women (mean age, 69.3 years) with slight age-related memory impairment. In a crossover-study design, each subject received placebo or an extract of Ginkgo biloba (EGb 761) (320 mg or 600 mg) 1 hour before performing a dual-coding test that measures the speed of information processing; the test consists of several coding series of drawings and words presented at decreasing times of 1920, 960, 480, 240, and 120 ms. The dual-coding phenomenon (a break point between coding verbal material and images) was demonstrated in all the tests. After placebo, the break point was observed at 960 ms and dual coding beginning at 1920 ms. After each dose of the ginkgo extract, the break point (at 480 ms) and dual coding (at 960 ms) were significantly shifted toward a shorter presentation time, indicating an improvement in the speed of information processing.
Quantifying Human Visible Color Variation from High Definition Digital Images of Orb Web Spiders.
Tapia-McClung, Horacio; Ajuria Ibarra, Helena; Rao, Dinesh
2016-01-01
Digital processing and analysis of high resolution images of 30 individuals of the orb web spider Verrucosa arenata were performed to extract and quantify human visible colors present on the dorsal abdomen of this species. Color extraction was performed with minimal user intervention using an unsupervised algorithm to determine groups of colors on each individual spider, which was then analyzed in order to quantify and classify the colors obtained, both spatially and using energy and entropy measures of the digital images. Analysis shows that the colors cover a small region of the visible spectrum, are not spatially homogeneously distributed over the patterns and from an entropic point of view, colors that cover a smaller region on the whole pattern carry more information than colors covering a larger region. This study demonstrates the use of processing tools to create automatic systems to extract valuable information from digital images that are precise, efficient and helpful for the understanding of the underlying biology.
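As a rough sketch of the pipeline, the example below clusters the pixels of a (synthetic) image into a few colour groups and reports the area share and coverage entropy per colour; K-means stands in for the unsupervised colour-grouping algorithm actually used.

```python
# Minimal sketch: group image pixels into a small number of colours, then
# quantify the area share of each colour and the entropy of that distribution.
import numpy as np
from sklearn.cluster import KMeans
from scipy.stats import entropy

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(64, 64, 3)).astype(float)  # placeholder RGB image
pixels = image.reshape(-1, 3)

k = 4
kmeans = KMeans(n_clusters=k, n_init=10, random_state=0).fit(pixels)
labels = kmeans.labels_

coverage = np.bincount(labels, minlength=k) / labels.size     # area share per colour
print("dominant colours (RGB):", kmeans.cluster_centers_.round(1))
print("area share per colour:", coverage.round(3))
print("coverage entropy (bits):", round(float(entropy(coverage, base=2)), 3))
```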
Quantifying Human Visible Color Variation from High Definition Digital Images of Orb Web Spiders
Ajuria Ibarra, Helena; Rao, Dinesh
2016-01-01
Digital processing and analysis of high resolution images of 30 individuals of the orb web spider Verrucosa arenata were performed to extract and quantify human visible colors present on the dorsal abdomen of this species. Color extraction was performed with minimal user intervention using an unsupervised algorithm to determine groups of colors on each individual spider, which was then analyzed in order to quantify and classify the colors obtained, both spatially and using energy and entropy measures of the digital images. Analysis shows that the colors cover a small region of the visible spectrum, are not spatially homogeneously distributed over the patterns and from an entropic point of view, colors that cover a smaller region on the whole pattern carry more information than colors covering a larger region. This study demonstrates the use of processing tools to create automatic systems to extract valuable information from digital images that are precise, efficient and helpful for the understanding of the underlying biology. PMID:27902724
Sadeh, Talya; Maril, Anat; Bitan, Tali; Goshen-Gottstein, Yonatan
2012-03-01
A remarkable act of memory entails binding different forms of information. We focus on the timeless question of how the bound engram is accessed such that its component features-item and context-are extracted. To shed light on this question, we investigate the dynamics between brain structures that together mediate the binding and extraction of item and context. Converging evidence has implicated the Parahippocampal cortex (PHc) in contextual processing, the Perirhinal cortex (PRc) in item processing, and the hippocampus in item-context binding. Effective connectivity analysis was conducted on fMRI data gathered during retrieval on tests that differ with regard to the to-be-extracted information. Results revealed that recall is initiated by context-related PHc activity, followed by hippocampal item-context engram activation, and completed with retrieval of the study-item by the PRc. The reverse path was found for recognition. We thus provide novel evidence for dissociative patterns of item-context unbinding during retrieval. Copyright © 2011 Elsevier Inc. All rights reserved.
Comparison of Three Information Sources for Smoking Information in Electronic Health Records
Wang, Liwei; Ruan, Xiaoyang; Yang, Ping; Liu, Hongfang
2016-01-01
OBJECTIVE The primary aim was to compare independent and joint performance of retrieving smoking status through different sources, including narrative text processed by natural language processing (NLP), patient-provided information (PPI), and diagnosis codes (ie, International Classification of Diseases, Ninth Revision [ICD-9]). We also compared the performance of retrieving smoking strength information (ie, heavy/light smoker) from narrative text and PPI. MATERIALS AND METHODS Our study leveraged an existing lung cancer cohort for smoking status, amount, and strength information, which was manually chart-reviewed. On the NLP side, smoking-related electronic medical record (EMR) data were retrieved first. A pattern-based smoking information extraction module was then implemented to extract smoking-related information. After that, heuristic rules were used to obtain smoking status-related information. Smoking information was also obtained from structured data sources based on diagnosis codes and PPI. Sensitivity, specificity, and accuracy were measured using patients with coverage (ie, the proportion of patients whose smoking status/strength can be effectively determined). RESULTS NLP alone has the best overall performance for smoking status extraction (patient coverage: 0.88; sensitivity: 0.97; specificity: 0.70; accuracy: 0.88); combining PPI with NLP further improved patient coverage to 0.96. ICD-9 does not provide additional improvement to NLP and its combination with PPI. For smoking strength, combining NLP with PPI has slight improvement over NLP alone. CONCLUSION These findings suggest that narrative text could serve as a more reliable and comprehensive source for obtaining smoking-related information than structured data sources. PPI, the readily available structured data, could be used as a complementary source for more comprehensive patient coverage. PMID:27980387
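As a rough illustration of the metrics reported above (not the authors' code), the sketch below computes patient coverage, sensitivity, specificity, and accuracy for a binary smoking-status extractor; the labels and records are invented.

```python
# Illustrative sketch: evaluating a smoking-status extractor.
# Labels: 1 = smoker, 0 = non-smoker, None = status could not be determined.

def evaluate(predicted, gold):
    covered = [(p, g) for p, g in zip(predicted, gold) if p is not None]
    coverage = len(covered) / len(gold) if gold else 0.0

    tp = sum(1 for p, g in covered if p == 1 and g == 1)
    tn = sum(1 for p, g in covered if p == 0 and g == 0)
    fp = sum(1 for p, g in covered if p == 1 and g == 0)
    fn = sum(1 for p, g in covered if p == 0 and g == 1)

    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    accuracy = (tp + tn) / len(covered) if covered else 0.0
    return coverage, sensitivity, specificity, accuracy

# Example: the extractor says smoker/non-smoker/undetermined for five patients.
print(evaluate([1, 0, None, 1, 0], [1, 0, 1, 0, 0]))
```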
Chen, Yen-Lin; Liang, Wen-Yew; Chiang, Chuan-Yen; Hsieh, Tung-Ju; Lee, Da-Cheng; Yuan, Shyan-Ming; Chang, Yang-Lang
2011-01-01
This study presents efficient vision-based finger detection, tracking, and event identification techniques and a low-cost hardware framework for multi-touch sensing and display applications. The proposed approach uses a fast bright-blob segmentation process based on automatic multilevel histogram thresholding to extract the pixels of touch blobs obtained from scattered infrared lights captured by a video camera. The advantage of this automatic multilevel thresholding approach is its robustness and adaptability when dealing with various ambient lighting conditions and spurious infrared noises. To extract the connected components of these touch blobs, a connected-component analysis procedure is applied to the bright pixels acquired by the previous stage. After extracting the touch blobs from each of the captured image frames, a blob tracking and event recognition process analyzes the spatial and temporal information of these touch blobs from consecutive frames to determine the possible touch events and actions performed by users. This process also refines the detection results and corrects for errors and occlusions caused by noise and errors during the blob extraction process. The proposed blob tracking and touch event recognition process includes two phases. First, the phase of blob tracking associates the motion correspondence of blobs in succeeding frames by analyzing their spatial and temporal features. The touch event recognition process can identify meaningful touch events based on the motion information of touch blobs, such as finger moving, rotating, pressing, hovering, and clicking actions. Experimental results demonstrate that the proposed vision-based finger detection, tracking, and event identification system is feasible and effective for multi-touch sensing applications in various operational environments and conditions. PMID:22163990
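A minimal sketch of the blob-segmentation and connected-component stages described above, assuming an 8-bit grayscale infrared frame; OpenCV's Otsu threshold stands in for the paper's automatic multilevel histogram thresholding, and the blob-area bounds are assumptions.

```python
import cv2
import numpy as np

# Synthetic frame standing in for a captured infrared image.
frame = np.zeros((240, 320), np.uint8)
cv2.circle(frame, (160, 120), 8, 255, -1)  # one bright blob as a fingertip stand-in

# 1) Bright-blob segmentation: keep pixels brighter than an automatic threshold.
_, bright = cv2.threshold(frame, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# 2) Connected-component analysis on the bright pixels.
n_labels, labels, stats, centroids = cv2.connectedComponentsWithStats(bright)

# 3) Keep plausibly finger-sized blobs; their centroids feed the tracking stage.
touch_points = [
    tuple(centroids[i])
    for i in range(1, n_labels)                 # label 0 is the background
    if 20 < stats[i, cv2.CC_STAT_AREA] < 2000   # area bounds are assumptions
]
print(touch_points)
```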
Qi, Xiubin; Crooke, Emma; Ross, Andrew; Bastow, Trevor P; Stalvies, Charlotte
2011-09-21
This paper presents a system and method developed to identify a source oil's characteristic properties by testing the oil's dissolved components in water. Through close examination of the oil dissolution process in water, we hypothesise that when oil is in contact with water, the resulting oil-water extract, a complex hydrocarbon mixture, carries the signature property information of the parent oil. If the dominating differences in compositions between such extracts of different oils can be identified, this information could guide the selection of various sensors, capable of capturing such chemical variations. When used as an array, such a sensor system can be used to determine parent oil information from the oil-water extract. To test this hypothesis, 22 oils' water extracts were prepared and selected dominant hydrocarbons analyzed with Gas Chromatography-Mass Spectrometry (GC-MS); the subsequent Principal Component Analysis (PCA) indicates that the major difference between the extract solutions is the relative concentration between the volatile mono-aromatics and fluorescent polyaromatics. An integrated sensor array system that is composed of 3 volatile hydrocarbon sensors and 2 polyaromatic hydrocarbon sensors was built accordingly to capture the major and subtle differences of these extracts. It was tested by exposure to a total of 110 water extract solutions diluted from the 22 extracts. The sensor response data collected from the testing were processed with two multivariate analysis tools to reveal information retained in the response patterns of the arrayed sensors: by conducting PCA, we were able to demonstrate the ability to qualitatively identify and distinguish different oil samples from their sensor array response patterns. When a supervised method, Linear Discriminant Analysis (LDA), was applied, even quantitative classification could be achieved: the multivariate model generated from the LDA achieved 89.7% successful classification of the oil sample types. By grouping the samples based on the level of viscosity and density we were able to reveal the correlation between the oil extracts' sensor array responses and their original oils' feature properties. The equipment and method developed in this study have promising potential to be readily applied in field studies and marine surveys for oil exploration or oil spill monitoring.
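A hedged sketch of the multivariate steps described above: PCA to visualise sensor-array response patterns, then LDA for supervised oil-type classification. The 5-sensor responses and oil labels here are simulated placeholders, not the study's data.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_oils, n_dilutions, n_sensors = 22, 5, 5
X = rng.normal(size=(n_oils * n_dilutions, n_sensors))   # simulated sensor responses
y = np.repeat(np.arange(n_oils), n_dilutions)             # parent-oil labels

scores = PCA(n_components=2).fit_transform(X)              # qualitative grouping of samples
lda = LinearDiscriminantAnalysis()
acc = cross_val_score(lda, X, y, cv=5).mean()               # quantitative classification rate
print(scores[:3], acc)
```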
Usability Evaluation of NLP-PIER: A Clinical Document Search Engine for Researchers.
Hultman, Gretchen; McEwan, Reed; Pakhomov, Serguei; Lindemann, Elizabeth; Skube, Steven; Melton, Genevieve B
2017-01-01
NLP-PIER (Natural Language Processing - Patient Information Extraction for Research) is a self-service platform with a search engine for clinical researchers to perform natural language processing (NLP) queries using clinical notes. We conducted user-centered testing of NLP-PIER's usability to inform future design decisions. Quantitative and qualitative data were analyzed. Our findings will be used to improve the usability of NLP-PIER.
Pathak, Jyotishman; Bailey, Kent R; Beebe, Calvin E; Bethard, Steven; Carrell, David S; Chen, Pei J; Dligach, Dmitriy; Endle, Cory M; Hart, Lacey A; Haug, Peter J; Huff, Stanley M; Kaggal, Vinod C; Li, Dingcheng; Liu, Hongfang; Marchant, Kyle; Masanz, James; Miller, Timothy; Oniki, Thomas A; Palmer, Martha; Peterson, Kevin J; Rea, Susan; Savova, Guergana K; Stancl, Craig R; Sohn, Sunghwan; Solbrig, Harold R; Suesse, Dale B; Tao, Cui; Taylor, David P; Westberg, Les; Wu, Stephen; Zhuo, Ning; Chute, Christopher G
2013-01-01
Research objective To develop scalable informatics infrastructure for normalization of both structured and unstructured electronic health record (EHR) data into a unified, concept-based model for high-throughput phenotype extraction. Materials and methods Software tools and applications were developed to extract information from EHRs. Representative and convenience samples of both structured and unstructured data from two EHR systems—Mayo Clinic and Intermountain Healthcare—were used for development and validation. Extracted information was standardized and normalized to meaningful use (MU) conformant terminology and value set standards using Clinical Element Models (CEMs). These resources were used to demonstrate semi-automatic execution of MU clinical-quality measures modeled using the Quality Data Model (QDM) and an open-source rules engine. Results Using CEMs and open-source natural language processing and terminology services engines—namely, Apache clinical Text Analysis and Knowledge Extraction System (cTAKES) and Common Terminology Services (CTS2)—we developed a data-normalization platform that ensures data security, end-to-end connectivity, and reliable data flow within and across institutions. We demonstrated the applicability of this platform by executing a QDM-based MU quality measure that determines the percentage of patients between 18 and 75 years with diabetes whose most recent low-density lipoprotein cholesterol test result during the measurement year was <100 mg/dL on a randomly selected cohort of 273 Mayo Clinic patients. The platform identified 21 and 18 patients for the denominator and numerator of the quality measure, respectively. Validation results indicate that all identified patients meet the QDM-based criteria. Conclusions End-to-end automated systems for extracting clinical information from diverse EHR systems require extensive use of standardized vocabularies and terminologies, as well as robust information models for storing, discovering, and processing that information. This study demonstrates the application of modular and open-source resources for enabling secondary use of EHR data through normalization into standards-based, comparable, and consistent format for high-throughput phenotyping to identify patient cohorts. PMID:24190931
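The quality-measure logic described above (patients aged 18-75 with diabetes whose most recent LDL result in the measurement year was below 100 mg/dL) can be illustrated with the following sketch over an assumed, simplified record layout; it is not the authors' CEM/QDM implementation.

```python
from datetime import date

patients = [  # hypothetical normalized records
    {"age": 64, "diabetes": True,
     "ldl": [(date(2012, 3, 1), 120), (date(2012, 9, 5), 92)]},
    {"age": 71, "diabetes": True,
     "ldl": [(date(2012, 6, 2), 130)]},
    {"age": 40, "diabetes": False, "ldl": []},
]

denominator = [p for p in patients if p["diabetes"] and 18 <= p["age"] <= 75]
numerator = [
    p for p in denominator
    if p["ldl"] and max(p["ldl"])[1] < 100   # most recent result during the year
]
print(len(denominator), len(numerator))      # here: 2 and 1
```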
Davis, Brian N.; Werpy, Jason; Friesz, Aaron M.; Impecoven, Kevin; Quenzer, Robert; Maiersperger, Tom; Meyer, David J.
2015-01-01
Current methods of searching for and retrieving data from satellite land remote sensing archives do not allow for interactive information extraction. Instead, Earth science data users are required to download files over low-bandwidth networks to local workstations and process data before science questions can be addressed. New methods of extracting information from data archives need to become more interactive to meet user demands for deriving increasingly complex information from rapidly expanding archives. Moving the tools required for processing data to computer systems of data providers, and away from systems of the data consumer, can improve turnaround times for data processing workflows. The implementation of middleware services was used to provide interactive access to archive data. The goal of this middleware services development is to enable Earth science data users to access remote sensing archives for immediate answers to science questions instead of links to large volumes of data to download and process. Exposing data and metadata to web-based services enables machine-driven queries and data interaction. Also, product quality information can be integrated to enable additional filtering and sub-setting. Only the reduced content required to complete an analysis is then transferred to the user.
Wiese, Holger; Schweinberger, Stefan R; Neumann, Markus F
2008-11-01
We used repetition priming to investigate implicit and explicit processes of unfamiliar face categorization. During prime and test phases, participants categorized unfamiliar faces according to either age or gender. Faces presented at test were either new or primed in a task-congruent (same task during priming and test) or incongruent (different tasks) condition. During age categorization, reaction times revealed significant priming for both priming conditions, and event-related potentials yielded an increased N170 over the left hemisphere as a result of priming. During gender categorization, congruent faces elicited priming and a latency decrease in the right N170. Accordingly, information about age is extracted irrespective of processing demands, and priming facilitates the extraction of feature information reflected in the left N170 effect. By contrast, priming of gender categorization may depend on whether the task at initial presentation requires configural processing.
DEXTER: Disease-Expression Relation Extraction from Text.
Gupta, Samir; Dingerdissen, Hayley; Ross, Karen E; Hu, Yu; Wu, Cathy H; Mazumder, Raja; Vijay-Shanker, K
2018-01-01
Gene expression levels affect biological processes and play a key role in many diseases. Characterizing expression profiles is useful for clinical research, and diagnostics and prognostics of diseases. There are currently several high-quality databases that capture gene expression information, obtained mostly from large-scale studies, such as microarray and next-generation sequencing technologies, in the context of disease. The scientific literature is another rich source of information on gene expression-disease relationships that not only have been captured from large-scale studies but have also been observed in thousands of small-scale studies. Expression information obtained from literature through manual curation can extend expression databases. While many of the existing databases include information from literature, they are limited by the time-consuming nature of manual curation and have difficulty keeping up with the explosion of publications in the biomedical field. In this work, we describe an automated text-mining tool, Disease-Expression Relation Extraction from Text (DEXTER), to extract information from literature on gene and microRNA expression in the context of disease. One of the motivations in developing DEXTER was to extend the BioXpress database, a cancer-focused gene expression database that includes data derived from large-scale experiments and manual curation of publications. The literature-based portion of BioXpress lags behind significantly compared to expression information obtained from large-scale studies and can benefit from our text-mined results. We have conducted two different evaluations to measure the accuracy of our text-mining tool and achieved average F-scores of 88.51% and 81.81% for the two evaluations, respectively. Also, to demonstrate the ability to extract rich expression information in different disease-related scenarios, we used DEXTER to extract differential expression information for 2024 genes in lung cancer, 115 glycosyltransferases in 62 cancers, and 826 microRNAs in 171 cancers. All extractions using DEXTER are integrated in the literature-based portion of BioXpress. Database URL: http://biotm.cis.udel.edu/DEXTER.
Extracting genetic alteration information for personalized cancer therapy from ClinicalTrials.gov
Xu, Jun; Lee, Hee-Jin; Zeng, Jia; Wu, Yonghui; Zhang, Yaoyun; Huang, Liang-Chin; Johnson, Amber; Holla, Vijaykumar; Bailey, Ann M; Cohen, Trevor; Meric-Bernstam, Funda; Bernstam, Elmer V
2016-01-01
Objective: Clinical trials investigating drugs that target specific genetic alterations in tumors are important for promoting personalized cancer therapy. The goal of this project is to create a knowledge base of cancer treatment trials with annotations about genetic alterations from ClinicalTrials.gov. Methods: We developed a semi-automatic framework that combines advanced text-processing techniques with manual review to curate genetic alteration information in cancer trials. The framework consists of a document classification system to identify cancer treatment trials from ClinicalTrials.gov and an information extraction system to extract gene and alteration pairs from the Title and Eligibility Criteria sections of clinical trials. By applying the framework to trials at ClinicalTrials.gov, we created a knowledge base of cancer treatment trials with genetic alteration annotations. We then evaluated each component of the framework against manually reviewed sets of clinical trials and generated descriptive statistics of the knowledge base. Results and Discussion: The automated cancer treatment trial identification system achieved a high precision of 0.9944. Together with the manual review process, it identified 20 193 cancer treatment trials from ClinicalTrials.gov. The automated gene-alteration extraction system achieved a precision of 0.8300 and a recall of 0.6803. After validation by manual review, we generated a knowledge base of 2024 cancer trials that are labeled with specific genetic alteration information. Analysis of the knowledge base revealed the trend of increased use of targeted therapy for cancer, as well as top frequent gene-alteration pairs of interest. We expect this knowledge base to be a valuable resource for physicians and patients who are seeking information about personalized cancer therapy. PMID:27013523
Extracting DEM from airborne X-band data based on PolInSAR
NASA Astrophysics Data System (ADS)
Hou, X. X.; Huang, G. M.; Zhao, Z.
2015-06-01
Polarimetric Interferometric Synthetic Aperture Radar (PolInSAR) is a developing SAR remote sensing technology that combines polarimetric multichannel information with interferometric information. It is of great significance for extracting DEMs in regions where DEM precision is otherwise low, such as vegetated areas and built-up areas. In this paper we describe our experiments with high-resolution X-band fully polarimetric SAR data acquired by a dual-baseline interferometric airborne SAR system over an area of Danling in southern China. The Pauli algorithm is used to generate the dual-polarimetric interferometric data, and Singular Value Decomposition (SVD), Numerical Radius (NR), and Phase Diversity (PD) methods are used to generate the fully polarimetric interferometric data. The polarimetric interferometric information is then used to extract the DEM through pre-filtering, image registration, image resampling, coherence optimization, multilook processing, flat-earth removal, interferogram filtering, phase unwrapping, parameter calibration, height derivation, and geo-coding. The processing system, named SARPlore, was developed in VC++ under the lead of the Chinese Academy of Surveying and Mapping. Finally, comparing the optimized results with single-polarimetric interferometry, it was observed that the optimization methods reduce interferometric noise and phase-unwrapping residuals and improve DEM precision. Fully polarimetric interferometry performs better than dual-polarimetric interferometry, and the degree of improvement varies with terrain.
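For reference, the Pauli decomposition named above is conventionally defined by the following scattering vector formed per full-polarimetric pixel (standard definition, not a transcription of this system's implementation):

```latex
% Standard Pauli scattering vector (monostatic, reciprocal case with S_HV = S_VH):
\[
\mathbf{k}_{P} = \frac{1}{\sqrt{2}}
\begin{bmatrix}
S_{HH} + S_{VV} \\
S_{HH} - S_{VV} \\
2\,S_{HV}
\end{bmatrix}
\]
% Its components are commonly associated with surface, double-bounce, and volume
% scattering; interferograms are then formed channel by channel.
```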
Kellman, Philip J; Massey, Christine M; Son, Ji Y
2010-04-01
Learning in educational settings emphasizes declarative and procedural knowledge. Studies of expertise, however, point to other crucial components of learning, especially improvements produced by experience in the extraction of information: perceptual learning (PL). We suggest that such improvements characterize both simple sensory and complex cognitive, even symbolic, tasks through common processes of discovery and selection. We apply these ideas in the form of perceptual learning modules (PLMs) to mathematics learning. We tested three PLMs, each emphasizing different aspects of complex task performance, in middle and high school mathematics. In the MultiRep PLM, practice in matching function information across multiple representations improved students' abilities to generate correct graphs and equations from word problems. In the Algebraic Transformations PLM, practice in seeing equation structure across transformations (but not solving equations) led to dramatic improvements in the speed of equation solving. In the Linear Measurement PLM, interactive trials involving extraction of information about units and lengths produced successful transfer to novel measurement problems and fraction problem solving. Taken together, these results suggest (a) that PL techniques have the potential to address crucial, neglected dimensions of learning, including discovery and fluent processing of relations; (b) PL effects apply even to complex tasks that involve symbolic processing; and (c) appropriately designed PL technology can produce rapid and enduring advances in learning. Copyright © 2009 Cognitive Science Society, Inc.
ERIC Educational Resources Information Center
Mangina, Eleni; Kilbride, John
2008-01-01
The research presented in this paper is an examination of the applicability of IUI techniques in an online e-learning environment. In particular we make use of user modeling techniques, information retrieval and extraction mechanisms and collaborative filtering methods. The domains of e-learning, web-based training and instruction and intelligent…
A business process modeling experience in a complex information system re-engineering.
Bernonville, Stéphanie; Vantourout, Corinne; Fendeler, Geneviève; Beuscart, Régis
2013-01-01
This article aims to share a business process modeling experience in a re-engineering project of a medical records department in a 2,965-bed hospital. It presents the modeling strategy, an extract of the results and the feedback experience.
PLAN2L: a web tool for integrated text mining and literature-based bioentity relation extraction.
Krallinger, Martin; Rodriguez-Penagos, Carlos; Tendulkar, Ashish; Valencia, Alfonso
2009-07-01
There is an increasing interest in using literature mining techniques to complement information extracted from annotation databases or generated by bioinformatics applications. Here we present PLAN2L, a web-based online search system that integrates text mining and information extraction techniques to access systematically information useful for analyzing genetic, cellular and molecular aspects of the plant model organism Arabidopsis thaliana. Our system facilitates a more efficient retrieval of information relevant to heterogeneous biological topics, from implications in biological relationships at the level of protein interactions and gene regulation, to sub-cellular locations of gene products and associations to cellular and developmental processes, i.e. cell cycle, flowering, root, leaf and seed development. Beyond single entities, also predefined pairs of entities can be provided as queries for which literature-derived relations together with textual evidences are returned. PLAN2L does not require registration and is freely accessible at http://zope.bioinfo.cnio.es/plan2l.
Nikfarjam, Azadeh; Sarker, Abeed; O'Connor, Karen; Ginn, Rachel; Gonzalez, Graciela
2015-05-01
Social media is becoming increasingly popular as a platform for sharing personal health-related information. This information can be utilized for public health monitoring tasks, particularly for pharmacovigilance, via the use of natural language processing (NLP) techniques. However, the language in social media is highly informal, and user-expressed medical concepts are often nontechnical, descriptive, and challenging to extract. There has been limited progress in addressing these challenges, and thus far, advanced machine learning-based NLP techniques have been underutilized. Our objective is to design a machine learning-based approach to extract mentions of adverse drug reactions (ADRs) from highly informal text in social media. We introduce ADRMine, a machine learning-based concept extraction system that uses conditional random fields (CRFs). ADRMine utilizes a variety of features, including a novel feature for modeling words' semantic similarities. The similarities are modeled by clustering words based on unsupervised, pretrained word representation vectors (embeddings) generated from unlabeled user posts in social media using a deep learning technique. ADRMine outperforms several strong baseline systems in the ADR extraction task by achieving an F-measure of 0.82. Feature analysis demonstrates that the proposed word cluster features significantly improve extraction performance. It is possible to extract complex medical concepts, with relatively high performance, from informal, user-generated content. Our approach is particularly scalable, suitable for social media mining, as it relies on large volumes of unlabeled data, thus diminishing the need for large, annotated training data sets. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association.
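The word-cluster feature idea described above can be sketched as follows, under the assumption that pretrained embeddings are available as a token-to-vector dictionary; the cluster id of each token becomes an extra CRF feature (the CRF training itself, e.g. with sklearn-crfsuite, is omitted, and the vectors below are stand-ins).

```python
import numpy as np
from sklearn.cluster import KMeans

# Stand-in embeddings; in practice these come from unlabeled social media posts.
rng = np.random.default_rng(0)
vocab = ["stomach", "pain", "took", "seroquel", "dizzy"]
embeddings = {w: rng.normal(size=50) for w in vocab}

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(
    np.vstack([embeddings[w] for w in vocab]))
cluster_of = dict(zip(vocab, kmeans.labels_))

def token_features(sentence, i):
    w = sentence[i].lower()
    return {
        "word": w,
        "suffix3": w[-3:],
        "cluster": str(cluster_of.get(w, "UNK")),  # semantic-similarity feature
    }

print(token_features("Seroquel made me dizzy".split(), 3))
```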
Multisensor multiresolution data fusion for improvement in classification
NASA Astrophysics Data System (ADS)
Rubeena, V.; Tiwari, K. C.
2016-04-01
Rapid advancements in technology have made multisensor and multiresolution remote sensing data readily available. Such data contain complementary information, and their fusion may yield application-dependent information that would otherwise remain trapped within the individual datasets. The present work aims at improving classification by fusing features of coarse-resolution (1 m) hyperspectral LWIR and fine-resolution (20 cm) RGB data. The classification map comprises eight classes: Road, Trees, Red Roof, Grey Roof, Concrete Roof, Vegetation, Bare Soil, and Unclassified. The processing methodology for the hyperspectral LWIR data comprises dimensionality reduction, resampling by interpolation to register the two images at the same spatial resolution, and extraction of spatial features to improve classification accuracy. For the fine-resolution RGB data, a vegetation index is computed to classify vegetation and a morphological building index is calculated for buildings. Textural features are extracted from all three RGB bands using occurrence and co-occurrence statistics. After feature extraction, Support Vector Machines (SVMs) are used for training and classification. To increase classification accuracy, post-processing steps remove spurious (e.g., salt-and-pepper) noise, followed by majority-voting filtering within objects for better object classification.
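A rough sketch of the texture-feature step described above: grey-level co-occurrence statistics per RGB band, stacked into a feature vector for an SVM. It uses scikit-image (recent versions name these functions graycomatrix/graycoprops); real data loading and the LWIR spectral/spatial branch are omitted, and the toy patches are random.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops
from sklearn.svm import SVC

def glcm_features(band_patch):
    # Co-occurrence matrix at distance 1 for two directions, then summary statistics.
    glcm = graycomatrix(band_patch, distances=[1], angles=[0, np.pi / 2],
                        levels=256, symmetric=True, normed=True)
    return [graycoprops(glcm, prop).mean()
            for prop in ("contrast", "homogeneity", "energy", "correlation")]

rng = np.random.default_rng(0)
patches = rng.integers(0, 256, size=(40, 3, 16, 16), dtype=np.uint8)  # toy RGB patches
labels = rng.integers(0, 8, size=40)                                  # 8 land-cover classes

X = np.array([sum((glcm_features(p[b]) for b in range(3)), []) for p in patches])
clf = SVC(kernel="rbf").fit(X, labels)
print(clf.predict(X[:5]))
```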
Neuroanatomic organization of sound memory in humans.
Kraut, Michael A; Pitcock, Jeffery A; Calhoun, Vince; Li, Juan; Freeman, Thomas; Hart, John
2006-11-01
The neural interface between sensory perception and memory is a central issue in neuroscience, particularly initial memory organization following perceptual analyses. We used functional magnetic resonance imaging to identify anatomic regions extracting initial auditory semantic memory information related to environmental sounds. Two distinct anatomic foci were detected in the right superior temporal gyrus when subjects identified sounds representing either animals or threatening items. Threatening animal stimuli elicited signal changes in both foci, suggesting a distributed neural representation. Our results demonstrate both category- and feature-specific responses to nonverbal sounds in early stages of extracting semantic memory information from these sounds. This organization allows for these category-feature detection nodes to extract early, semantic memory information for efficient processing of transient sound stimuli. Neural regions selective for threatening sounds are similar to those of nonhuman primates, demonstrating that semantic memory organization for basic biological/survival primitives is present across species.
Mao, Jin; Moore, Lisa R; Blank, Carrine E; Wu, Elvis Hsin-Hui; Ackerman, Marcia; Ranade, Sonali; Cui, Hong
2016-12-13
The large-scale analysis of phenomic data (i.e., full phenotypic traits of an organism, such as shape, metabolic substrates, and growth conditions) in microbial bioinformatics has been hampered by the lack of tools to rapidly and accurately extract phenotypic data from existing legacy text in the field of microbiology. To quickly obtain knowledge on the distribution and evolution of microbial traits, an information extraction system needed to be developed to extract phenotypic characters from large numbers of taxonomic descriptions so they can be used as input to existing phylogenetic analysis software packages. We report the development and evaluation of Microbial Phenomics Information Extractor (MicroPIE, version 0.1.0). MicroPIE is a natural language processing application that uses a robust supervised classification algorithm (Support Vector Machine) to identify characters from sentences in prokaryotic taxonomic descriptions, followed by a combination of algorithms applying linguistic rules with groups of known terms to extract characters as well as character states. The input to MicroPIE is a set of taxonomic descriptions (clean text). The output is a taxon-by-character matrix, with taxa in the rows and a set of 42 pre-defined characters (e.g., optimum growth temperature) in the columns. The performance of MicroPIE was evaluated against a gold standard matrix and another student-made matrix. Results show that, compared to the gold standard, MicroPIE extracted 21 characters (50%) with a Relaxed F1 score > 0.80 and 16 characters (38%) with Relaxed F1 scores ranging between 0.50 and 0.80. Inclusion of a character prediction component (SVM) improved the overall performance of MicroPIE, notably the precision. Evaluated against the same gold standard, MicroPIE performed significantly better than the undergraduate students. MicroPIE is a promising new tool for the rapid and efficient extraction of phenotypic character information from prokaryotic taxonomic descriptions. However, further development, including incorporation of ontologies, will be necessary to improve the performance of the extraction for some character types.
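The supervised sentence-classification step described above can be illustrated with a minimal sketch (toy sentences and assumed character labels, not MicroPIE's training data): a linear SVM over TF-IDF features assigns each sentence of a taxonomic description to a character such as optimum growth temperature.

```python
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

sentences = [
    "Optimal growth occurs at 37 degrees C.",
    "Growth is optimal at 28-30 degrees C.",
    "Cells are rod-shaped, 0.5 by 2.0 micrometers.",
    "Cells are curved rods with rounded ends.",
]
characters = ["optimum_growth_temperature", "optimum_growth_temperature",
              "cell_shape", "cell_shape"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
clf.fit(sentences, characters)
print(clf.predict(["Good growth is observed at 42 degrees C."]))
```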
Automation for System Safety Analysis
NASA Technical Reports Server (NTRS)
Malin, Jane T.; Fleming, Land; Throop, David; Thronesbery, Carroll; Flores, Joshua; Bennett, Ted; Wennberg, Paul
2009-01-01
This presentation describes work to integrate a set of tools to support early model-based analysis of failures and hazards due to system-software interactions. The tools perform and assist analysts in the following tasks: 1) extract model parts from text for architecture and safety/hazard models; 2) combine the parts with library information to develop the models for visualization and analysis; 3) perform graph analysis and simulation to identify and evaluate possible paths from hazard sources to vulnerable entities and functions, in nominal and anomalous system-software configurations and scenarios; and 4) identify resulting candidate scenarios for software integration testing. There has been significant technical progress in model extraction from Orion program text sources, architecture model derivation (components and connections) and documentation of extraction sources. Models have been derived from Internal Interface Requirements Documents (IIRDs) and FMEA documents. Linguistic text processing is used to extract model parts and relationships, and the Aerospace Ontology also aids automated model development from the extracted information. Visualizations of these models assist analysts in requirements overview and in checking consistency and completeness.
Visualization of DNA in highly processed botanical materials.
Lu, Zhengfei; Rubinsky, Maria; Babajanian, Silva; Zhang, Yanjun; Chang, Peter; Swanson, Gary
2018-04-15
DNA-based methods have been gaining recognition as a tool for botanical authentication in herbal medicine; however, their application in processed botanical materials is challenging due to the low quality and quantity of DNA left after extensive manufacturing processes. The low amount of DNA recovered from processed materials, especially extracts, is "invisible" to current technology, which has cast doubt on the presence of amplifiable botanical DNA. A method using adapter-ligation and PCR amplification was successfully applied to visualize the "invisible" DNA in botanical extracts. The size of the "invisible" DNA fragments in botanical extracts was around 20-220 bp compared to fragments of around 600 bp for the more easily visualized DNA in botanical powders. This technique is the first to allow characterization and visualization of small fragments of DNA in processed botanical materials and will provide key information to guide the development of appropriate DNA-based botanical authentication methods in the future. Copyright © 2017 Elsevier Ltd. All rights reserved.
Structural health monitoring feature design by genetic programming
NASA Astrophysics Data System (ADS)
Harvey, Dustin Y.; Todd, Michael D.
2014-09-01
Structural health monitoring (SHM) systems provide real-time damage and performance information for civil, aerospace, and other high-capital or life-safety critical structures. Conventional data processing involves pre-processing and extraction of low-dimensional features from in situ time series measurements. The features are then input to a statistical pattern recognition algorithm to perform the relevant classification or regression task necessary to facilitate decisions by the SHM system. Traditional design of signal processing and feature extraction algorithms can be an expensive and time-consuming process requiring extensive system knowledge and domain expertise. Genetic programming, a heuristic program search method from evolutionary computation, was recently adapted by the authors to perform automated, data-driven design of signal processing and feature extraction algorithms for statistical pattern recognition applications. The proposed method, called Autofead, is particularly suitable to handle the challenges inherent in algorithm design for SHM problems where the manifestation of damage in structural response measurements is often unclear or unknown. Autofead mines a training database of response measurements to discover information-rich features specific to the problem at hand. This study provides experimental validation on three SHM applications including ultrasonic damage detection, bearing damage classification for rotating machinery, and vibration-based structural health monitoring. Performance comparisons with common feature choices for each problem area are provided demonstrating the versatility of Autofead to produce significant algorithm improvements on a wide range of problems.
Signal processing methods for MFE plasma diagnostics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Candy, J.V.; Casper, T.; Kane, R.
1985-02-01
The application of various signal processing methods to extract energy storage information from plasma diamagnetism sensors during physics experiments on the Tandem Mirror Experiment-Upgrade (TMX-U) is discussed. We show how these processing techniques can be used to decrease the uncertainty in the corresponding sensor measurements. The algorithms suggested are implemented using SIG, an interactive signal processing package developed at LLNL.
PASTE: patient-centered SMS text tagging in a medication management system
Johnson, Kevin B; Denny, Joshua C
2011-01-01
Objective To evaluate the performance of a system that extracts medication information and administration-related actions from patient short message service (SMS) messages. Design Mobile technologies provide a platform for electronic patient-centered medication management. MyMediHealth (MMH) is a medication management system that includes a medication scheduler, a medication administration record, and a reminder engine that sends text messages to cell phones. The object of this work was to extend MMH to allow two-way interaction using mobile phone-based SMS technology. Unprompted text-message communication with patients using natural language could engage patients in their healthcare, but presents unique natural language processing challenges. The authors developed a new functional component of MMH, the Patient-centered Automated SMS Tagging Engine (PASTE). The PASTE web service uses natural language processing methods, custom lexicons, and existing knowledge sources to extract and tag medication information from patient text messages. Measurements A pilot evaluation of PASTE was completed using 130 medication messages anonymously submitted by 16 volunteers via a website. System output was compared with manually tagged messages. Results Verified medication names, medication terms, and action terms reached high F-measures of 91.3%, 94.7%, and 90.4%, respectively. The overall medication name F-measure was 79.8%, and the medication action term F-measure was 90%. Conclusion Other studies have demonstrated systems that successfully extract medication information from clinical documents using semantic tagging, regular expression-based approaches, or a combination of both approaches. This evaluation demonstrates the feasibility of extracting medication information from patient-generated medication messages. PMID:21984605
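For reference, the F-measures quoted in this and neighbouring records are the balanced harmonic mean of precision and recall over the extracted terms:

```latex
\[
P = \frac{TP}{TP + FP}, \qquad
R = \frac{TP}{TP + FN}, \qquad
F_{1} = \frac{2PR}{P + R}
\]
```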
DOE Office of Scientific and Technical Information (OSTI.GOV)
Adams, C.; et al.
The single-phase liquid argon time projection chamber (LArTPC) provides a large amount of detailed information in the form of fine-grained drifted ionization charge from particle traces. To fully utilize this information, the deposited charge must be accurately extracted from the raw digitized waveforms via a robust signal processing chain. Enabled by the ultra-low noise levels associated with cryogenic electronics in the MicroBooNE detector, the precise extraction of ionization charge from the induction wire planes in a single-phase LArTPC is qualitatively demonstrated on MicroBooNE data with event display images, and quantitatively demonstrated via waveform-level and track-level metrics. Improved performance of induction plane calorimetry is demonstrated through the agreement of extracted ionization charge measurements across different wire planes for various event topologies. In addition to the comprehensive waveform-level comparison of data and simulation, a calibration of the cryogenic electronics response is presented and solutions to various MicroBooNE-specific TPC issues are discussed. This work presents an important improvement in LArTPC signal processing, the foundation of reconstruction and therefore physics analyses in MicroBooNE.
Masanz, James J; Ogren, Philip V; Zheng, Jiaping; Sohn, Sunghwan; Kipper-Schuler, Karin C; Chute, Christopher G
2010-01-01
We aim to build and evaluate an open-source natural language processing system for information extraction from electronic medical record clinical free-text. We describe and evaluate our system, the clinical Text Analysis and Knowledge Extraction System (cTAKES), released open-source at http://www.ohnlp.org. The cTAKES builds on existing open-source technologies—the Unstructured Information Management Architecture framework and OpenNLP natural language processing toolkit. Its components, specifically trained for the clinical domain, create rich linguistic and semantic annotations. Performance of individual components: sentence boundary detector accuracy=0.949; tokenizer accuracy=0.949; part-of-speech tagger accuracy=0.936; shallow parser F-score=0.924; named entity recognizer and system-level evaluation F-score=0.715 for exact and 0.824 for overlapping spans, and accuracy for concept mapping, negation, and status attributes for exact and overlapping spans of 0.957, 0.943, 0.859, and 0.580, 0.939, and 0.839, respectively. Overall performance is discussed against five applications. The cTAKES annotations are the foundation for methods and modules for higher-level semantic processing of clinical free-text. PMID:20819853
Common Ground: An Interactive Visual Exploration and Discovery for Complex Health Data
2015-04-01
working with Intermountain Healthcare on a new rich dataset extracted directly from medical notes using natural language processing (NLP) algorithms ... probabilities based on state-of-the-art NLP classifiers. At that stage the data did not include geographic information or temporal information but we
Wavelets and their applications past and future
NASA Astrophysics Data System (ADS)
Coifman, Ronald R.
2009-04-01
As this is a conference on mathematical tools for defense, I would like to dedicate this talk to the memory of Louis Auslander, who, through his insights and visionary leadership, brought powerful new mathematics into DARPA; he provided the main impetus for the development and insertion of wavelet-based processing in defense. My goal here is to describe the evolution of a stream of ideas in Harmonic Analysis, ideas which in the past have mostly been applied to the analysis and extraction of information from physical data, and which are now increasingly applied to organize and extract information and knowledge from any set of digital documents, from text to music to questionnaires. This form of signal processing on digital data is part of the future of wavelet analysis.
MedEx/J: A One-Scan Simple and Fast NLP Tool for Japanese Clinical Texts.
Aramaki, Eiji; Yano, Ken; Wakamiya, Shoko
2017-01-01
Because of recent replacement of physical documents with electronic medical records (EMR), the importance of information processing in the medical field has increased. In light of this trend, we have been developing MedEx/J, which retrieves important Japanese language information from medical reports. MedEx/J executes two tasks simultaneously: (1) term extraction, and (2) positive and negative event classification. We designate this approach as a one-scan approach, providing simplicity of systems and reasonable accuracy. MedEx/J performance on the two tasks is described herein: (1) term extraction (F
Image processing and analysis using neural networks for optometry area
NASA Astrophysics Data System (ADS)
Netto, Antonio V.; Ferreira de Oliveira, Maria C.
2002-11-01
In this work we describe the framework of a functional system for processing and analyzing images of the human eye acquired by the Hartmann-Shack technique (HS), in order to extract information to formulate a diagnosis of eye refractive errors (astigmatism, hypermetropia and myopia). The analysis is to be carried out using an Artificial Intelligence system based on Neural Nets, Fuzzy Logic and Classifier Combination. The major goal is to establish the basis of a new technology to effectively measure ocular refractive errors that is based on methods alternative to those adopted in current patented systems. Moreover, analysis of images acquired with the Hartmann-Shack technique may enable the extraction of additional information on the health of an eye under exam from the same image used to detect refraction errors.
How extractive industries affect health: Political economy underpinnings and pathways.
Schrecker, Ted; Birn, Anne-Emanuelle; Aguilera, Mariajosé
2018-06-07
A systematic and theoretically informed analysis of how extractive industries affect health outcomes and health inequities is overdue. Informed by the work of Saskia Sassen on "logics of extraction," we adopt an expansive definition of extractive industries to include (for example) large-scale foreign acquisitions of agricultural land for export production. To ground our analysis in concrete place-based evidence, we begin with a brief review of four case examples of major extractive activities. We then analyze the political economy of extractivism, focusing on the societal structures, processes, and relationships of power that drive and enable extraction. Next, we examine how this global order shapes and interacts with politics, institutions, and policies at the state/national level contextualizing extractive activity. Having provided necessary context, we posit a set of pathways that link the global political economy and national politics and institutional practices surrounding extraction to health outcomes and their distribution. These pathways involve both direct health effects, such as toxic work and environmental exposures and assassination of activists, and indirect effects, including sustained impoverishment, water insecurity, and stress-related ailments. We conclude with some reflections on the need for future research on the health and health equity implications of the global extractive order. Copyright © 2018 The Authors. Published by Elsevier Ltd. All rights reserved.
A Robust Gradient Based Method for Building Extraction from LiDAR and Photogrammetric Imagery.
Siddiqui, Fasahat Ullah; Teng, Shyh Wei; Awrangjeb, Mohammad; Lu, Guojun
2016-07-19
Existing automatic building extraction methods are not effective in extracting buildings which are small in size and have transparent roofs. The application of large area threshold prohibits detection of small buildings and the use of ground points in generating the building mask prevents detection of transparent buildings. In addition, the existing methods use numerous parameters to extract buildings in complex environments, e.g., hilly area and high vegetation. However, the empirical tuning of large number of parameters reduces the robustness of building extraction methods. This paper proposes a novel Gradient-based Building Extraction (GBE) method to address these limitations. The proposed method transforms the Light Detection And Ranging (LiDAR) height information into intensity image without interpolation of point heights and then analyses the gradient information in the image. Generally, building roof planes have a constant height change along the slope of a roof plane whereas trees have a random height change. With such an analysis, buildings of a greater range of sizes with a transparent or opaque roof can be extracted. In addition, a local colour matching approach is introduced as a post-processing stage to eliminate trees. This stage of our proposed method does not require any manual setting and all parameters are set automatically from the data. The other post processing stages including variance, point density and shadow elimination are also applied to verify the extracted buildings, where comparatively fewer empirically set parameters are used. The performance of the proposed GBE method is evaluated on two benchmark data sets by using the object and pixel based metrics (completeness, correctness and quality). Our experimental results show the effectiveness of the proposed method in eliminating trees, extracting buildings of all sizes, and extracting buildings with and without transparent roof. When compared with current state-of-the-art building extraction methods, the proposed method outperforms the existing methods in various evaluation metrics.
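A conceptual sketch of the gradient analysis described above: LiDAR heights are binned into a grid "intensity" image (no interpolation of point heights), and roof planes are found where the height gradient is locally near-constant, unlike the noisy gradients of vegetation. All thresholds below are assumptions, not the paper's automatically derived parameters.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def height_image(points, cell=0.5):
    """points: (N, 3) array of x, y, z; returns a max-height grid per cell."""
    ij = ((points[:, :2] - points[:, :2].min(axis=0)) / cell).astype(int)
    grid = np.zeros(ij.max(axis=0) + 1)
    for (i, j), z in zip(ij, points[:, 2]):
        grid[i, j] = max(grid[i, j], z)
    return grid

def building_mask(grid, win=3):
    gy, gx = np.gradient(grid)
    mag = np.hypot(gx, gy)
    local_mean = uniform_filter(mag, win)
    local_var = uniform_filter(mag ** 2, win) - local_mean ** 2
    # Keep elevated cells whose gradient magnitude is locally near-constant.
    return (grid > 2.0) & (local_var < 0.05)

pts = np.random.default_rng(1).uniform(0, 10, size=(1000, 3))  # toy point cloud
print(building_mask(height_image(pts)).sum())
```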
DNA extraction for streamlined metagenomics of diverse environmental samples.
Marotz, Clarisse; Amir, Amnon; Humphrey, Greg; Gaffney, James; Gogul, Grant; Knight, Rob
2017-06-01
A major bottleneck for metagenomic sequencing is rapid and efficient DNA extraction. Here, we compare the extraction efficiencies of three magnetic bead-based platforms (KingFisher, epMotion, and Tecan) to a standardized column-based extraction platform across a variety of sample types, including feces, oral, skin, soil, and water. Replicate sample plates were extracted and prepared for 16S rRNA gene amplicon sequencing in parallel to assess extraction bias and DNA quality. The data demonstrate that any effect of extraction method on sequencing results was small compared with the variability across samples; however, the KingFisher platform produced the largest number of high-quality reads in the shortest amount of time. Based on these results, we have identified an extraction pipeline that dramatically reduces sample processing time without sacrificing bacterial taxonomic or abundance information.
USSR Report, Consumer Goods and Domestic Trade, No. 65
1983-06-08
types of oil-bearing raw materials — grape seeds, fruit and tree-fruit pits, corn germs and others. In recent years the ties of the workers of oils and ... brief, indicate how the original information was processed. Where no processing indicator is given, the information was summarized or extracted ... good crop. More than 2 million tons of grapes were harvested for the first time. Public education, science and culture underwent further development
Quantifying hydrogen-deuterium exchange of meteoritic dicarboxylic acids during aqueous extraction
NASA Astrophysics Data System (ADS)
Fuller, M.; Huang, Y.
2003-03-01
Hydrogen isotope ratios of organic compounds in carbonaceous chondrites provide critical information about their origins and evolutionary history. However, because many of these compounds are obtained by aqueous extraction, the degree of hydrogen-deuterium (H/D) exchange that occurs during the process needs to be quantitatively evaluated. This study uses compound-specific hydrogen isotopic analysis to quantify the H/D exchange during aqueous extraction. Three common meteoritic dicarboxylic acids (succinic, glutaric, and 2-methyl glutaric acids) were refluxed under conditions simulating the extraction process. Changes in δD values of the dicarboxylic acids were measured following the reflux experiments. A pseudo-first order rate law was used to model the H/D exchange rates which were then used to calculate the isotope exchange resulting from aqueous extraction. The degree of H/D exchange varies as a result of differences in molecular structure, the alkalinity of the extraction solution and presence/absence of meteorite powder. However, our model indicates that succinic, glutaric, and 2-methyl glutaric acids with a δD of 1800‰ would experience isotope changes of 38‰, 10‰, and 6‰, respectively, during the extraction process. Therefore, the overall change in δD values of the dicarboxylic acids during the aqueous extraction process is negligible. We also demonstrate that H/D exchange occurs on the chiral α-carbon in 2-methyl glutaric acid. The results suggest that the racemic mixture of 2-methyl glutaric acid in the Tagish Lake meteorite could result from post-synthesis aqueous alteration. The approach employed in this study can also be used to quantify H/D exchange for other important meteoritic compounds such as amino acids.
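One common pseudo-first-order form for isotope exchange toward equilibrium, offered here as a plausible reading of the model described above rather than a transcription of it, is:

```latex
\[
\frac{\mathrm{d}\,\delta D}{\mathrm{d}t} = -k\,\bigl(\delta D - \delta D_{eq}\bigr)
\quad\Longrightarrow\quad
\delta D(t) = \delta D_{eq} + \bigl(\delta D_{0} - \delta D_{eq}\bigr)\,e^{-kt}
\]
% The rate constant k fitted from the reflux experiments then predicts the shift
% in delta-D expected over the duration of an aqueous extraction.
```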
NASA Astrophysics Data System (ADS)
Ye, Fa-wang; Liu, De-chang
2008-12-01
Recent practice in sandstone-type uranium exploration in China indicates that uranium mineralization alteration information is of great importance for selecting new uranium targets or prospecting in the periphery of known uranium ore districts. Taking the BASHIBULAKE uranium ore district as a case study, this paper presents the technical approach and methods for extracting the alteration information produced by oil-and-gas reduction in the BASHIBULAKE ore district using ASTER data. First, the regional geological setting and the state of study of the BASHIBULAKE uranium ore district are briefly introduced. Then, the spectral characteristics of altered and unaltered sandstone in the BASHIBULAKE ore district are analyzed in detail. Based on this spectral analysis, two technical approaches to extracting the reduced alteration information from remote sensing are proposed, and a spectral un-mixing method is introduced to process the ASTER data and extract the reduced alteration information in the BASHIBULAKE ore district. From the enhanced images, three remote sensing anomaly zones are discovered, and their geological and prospecting significance is further confirmed by taking advantage of the multiple SWIR bands of ASTER data. Finally, the distribution and intensity of the reduced alteration information in the Cretaceous system and its relationship with the genesis of the uranium deposit are discussed, and specific suggestions for uranium prospecting in the periphery of the BASHIBULAKE ore district are proposed.
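A hedged sketch of linear spectral unmixing, one common way to implement the un-mixing step mentioned above: each pixel spectrum is modelled as a non-negative mixture of endmember spectra (for example, altered versus unaltered sandstone), and the altered-sandstone abundance is mapped. The endmember reflectances below are invented placeholders, not ASTER measurements.

```python
import numpy as np
from scipy.optimize import nnls

endmembers = np.array([      # columns: altered sandstone, unaltered sandstone, vegetation
    [0.21, 0.30, 0.05],
    [0.24, 0.34, 0.08],
    [0.26, 0.35, 0.30],
    [0.28, 0.37, 0.25],
    [0.25, 0.40, 0.22],
    [0.23, 0.42, 0.20],
])                           # six bands by three endmembers (placeholder values)

pixel = np.array([0.24, 0.27, 0.29, 0.31, 0.30, 0.29])   # one pixel spectrum
abundances, residual = nnls(endmembers, pixel)            # non-negative least squares
print(dict(zip(["altered", "unaltered", "vegetation"], abundances.round(3))))
```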
Gao, Yingbin; Kong, Xiangyu; Zhang, Huihui; Hou, Li'an
2017-05-01
The minor component (MC) plays an important role in signal processing and data analysis, so developing MC extraction algorithms is valuable work. Based on the concepts of weighted subspace and optimum theory, a weighted information criterion is proposed for searching for the optimum solution of a linear neural network. This information criterion exhibits a unique global minimum that is attained if and only if the state matrix is composed of the desired MCs of the autocorrelation matrix of an input signal. Using a gradient ascent method and a recursive least squares (RLS) method, two algorithms are developed for extracting multiple MCs. The global convergence of the proposed algorithms is also analyzed by the Lyapunov method. The proposed algorithms can extract multiple MCs in parallel and have an advantage in dealing with high-dimensional matrices. Since the weighting matrix does not require an accurate value, it facilitates the system design of the proposed algorithms for practical applications. The speed and computational advantages of the proposed algorithms are verified through simulations. Copyright © 2017 Elsevier Ltd. All rights reserved.
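As a point of reference for what such algorithms converge to, the minor subspace of a signal is spanned by the eigenvectors of its autocorrelation matrix with the smallest eigenvalues. The sketch below computes this by plain batch eigendecomposition in NumPy; it is not the adaptive gradient-ascent or RLS learning rule proposed in the paper, only a hedged illustration of the target quantity.

```python
import numpy as np

def minor_components(x, n_components):
    """Return the eigenvectors of the sample autocorrelation matrix
    associated with its smallest eigenvalues (the minor subspace).

    x : (n_samples, n_dims) data matrix
    """
    R = x.T @ x / x.shape[0]              # sample autocorrelation matrix
    eigvals, eigvecs = np.linalg.eigh(R)  # eigh returns eigenvalues in ascending order
    return eigvecs[:, :n_components]      # columns spanning the minor subspace

# Toy data: the last coordinate carries almost no variance, so it is the minor direction.
rng = np.random.default_rng(0)
x = rng.standard_normal((1000, 5)) @ np.diag([5, 4, 3, 2, 0.1])
print(minor_components(x, 2).shape)       # (5, 2)
```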
NASA Astrophysics Data System (ADS)
Poux, F.; Neuville, R.; Billen, R.
2017-08-01
Reasoning from information extracted by point cloud data mining allows contextual adaptation and fast decision making. However, to achieve this perceptive level, a point cloud must be semantically rich, retaining the information relevant to the end user. This paper presents an automatic knowledge-based method for pre-processing multi-sensory data and classifying a hybrid point cloud derived from both terrestrial laser scanning and dense image matching. Using 18 features, including sensor-biased data, each tessera in the high-density point cloud from the 3D capture of the complex mosaics of Germigny-des-prés (France) is segmented via a colour multi-scale abstraction-based feature extraction exploiting connectivity. A 2D surface and outline polygon of each tessera is generated by RANSAC plane extraction and convex hull fitting. Knowledge is then used to classify every tessera based on its size, surface, shape, material properties and its neighbours' classes. The detection and semantic enrichment method shows promising results of 94% correct semantization, a first step toward the creation of an archaeological smart point cloud.
A study on building data warehouse of hospital information system.
Li, Ping; Wu, Tao; Chen, Mu; Zhou, Bin; Xu, Wei-guo
2011-08-01
Existing hospital information systems with simple statistical functions cannot meet current management needs. It is well known that hospital resources are distributed among hospitals with private property rights, as in the case of the regional coordination of medical services. In this study, to integrate and make full use of medical data effectively, we propose a data warehouse modeling method for the hospital information system. The method can also be employed for a distributed-hospital medical service system. To ensure that hospital information supports the diverse needs of health care, the framework of the hospital information system has three layers: a data-center layer, a system-function layer, and a user-interface layer. This paper discusses the role of a data warehouse management system in handling hospital information, from the establishment of the data themes to the design of a data model and the establishment of the data warehouse. Online analytical processing tools support user-friendly multidimensional analysis from a number of different angles to extract the required data and information. Use of the data warehouse improves online analytical processing and mitigates deficiencies in the decision support system. The hospital information system based on a data warehouse effectively employs statistical analysis and data mining technology to handle massive quantities of historical data, and summarizes clinical and hospital information for decision making. This paper proposes the use of a data warehouse for a hospital information system, specifically a data warehouse organized around hospital information themes, with the corresponding dimensions, modeling and so on. The processing of patient information is given as an example that demonstrates the usefulness of this method for hospital information management. Data warehouse technology is an evolving technology, and further research is required on decision support information extracted by data mining and decision-making technology.
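The multidimensional (OLAP-style) analysis mentioned above amounts to aggregating facts along several dimensions. A minimal, hypothetical sketch with pandas is shown below; the table, column names and measures are invented for illustration and do not come from the described hospital system.

```python
import pandas as pd

# Hypothetical fact table, as might be extracted from a hospital data warehouse:
visits = pd.DataFrame({
    "department": ["cardiology", "cardiology", "oncology", "oncology"],
    "year":       [2010, 2011, 2010, 2011],
    "cost":       [1200.0, 1350.0, 2100.0, 1980.0],
    "patients":   [30, 34, 21, 19],
})

# OLAP-style roll-up: aggregate the measures along two dimensions.
cube = pd.pivot_table(visits, values=["cost", "patients"],
                      index="department", columns="year", aggfunc="sum")
print(cube)
```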
Jenke, Dennis
2007-01-01
Leaching of plastic materials, packaging, or containment systems by finished drug products and/or their related solutions can happen when contact occurs between such materials, systems, and products. While the drug product vendor has the regulatory/legal responsibility to demonstrate that such leaching does not affect the safety, efficacy, and/or compliance of the finished drug product, the plastic's supplier can facilitate that demonstration by providing the drug product vendor with appropriate and relevant information. Although it is a reasonable expectation that suppliers would possess and share such facilitating information, it is not reasonable for vendors to expect suppliers to (1) reveal confidential information without appropriate safeguards and (2) possess information specific to the vendor's finished drug product. Any potential conflict between the vendor's desire for information and the supplier's willingness to either generate or supply such information can be mitigated if the roles and responsibilities of these two stakeholders are established up front. The vendor of the finished drug product is responsible for supplying regulators with a full and complete leachables assessment for its finished drug product. To facilitate (but not take the place of) the vendor's leachables assessment, suppliers of the materials, components, or systems can provide the vendor with a full and complete extractables assessment for their material/system. The vendor and supplier share the responsibility for reconciling or correlating the extractables and leachables data. While this document establishes the components of a full and complete extractables assessment, specifying the detailed process by which a full and complete extractables assessment is performed is beyond its scope.
NASA Astrophysics Data System (ADS)
Li, Da; Cheung, Chifai; Zhao, Xing; Ren, Mingjun; Zhang, Juan; Zhou, Liqiu
2016-10-01
Autostereoscopy-based three-dimensional (3D) digital reconstruction has been widely applied in the fields of medical science, entertainment, design, industrial manufacture, precision measurement and many other areas. The 3D digital model of a target can be reconstructed from the series of two-dimensional (2D) information acquired by the autostereoscopic system, which consists of multiple lenses and can provide information about the target from multiple angles. This paper presents a generalized and precise autostereoscopic three-dimensional (3D) digital reconstruction method based on Direct Extraction of Disparity Information (DEDI), which can be applied to any autostereoscopic system and provides accurate 3D reconstruction results through an error-elimination process based on statistical analysis. The feasibility of the DEDI method has been successfully verified through a series of optical 3D digital reconstruction experiments on different autostereoscopic systems; the method efficiently performs direct full 3D digital model construction through a tomography-like operation on every depth plane while excluding defocused information. With the absolutely focused information processed by the DEDI method, the 3D digital model of the target can be directly and precisely formed along the axial direction with the depth information.
Overview of machine vision methods in x-ray imaging and microtomography
NASA Astrophysics Data System (ADS)
Buzmakov, Alexey; Zolotov, Denis; Chukalina, Marina; Nikolaev, Dmitry; Gladkov, Andrey; Ingacheva, Anastasia; Yakimchuk, Ivan; Asadchikov, Victor
2018-04-01
Digital X-ray imaging has become widely used in science, medicine, and non-destructive testing. This allows modern digital image analysis to be used for automatic information extraction and interpretation. We give a short review of scientific applications of machine vision in X-ray imaging and microtomography, including image processing, feature detection and extraction, image compression to increase camera throughput, microtomography reconstruction, visualization and setup adjustment.
ChemEngine: harvesting 3D chemical structures of supplementary data from PDF files.
Karthikeyan, Muthukumarasamy; Vyas, Renu
2016-01-01
Digital access to chemical journals has resulted in a vast array of molecular information that is now available in supplementary material files in PDF format. However, extracting this molecular information, generally from a PDF document, is a daunting task. Here we present an approach to harvest 3D molecular data from the supporting information of scientific research articles that are normally available from publishers' resources. In order to demonstrate the feasibility of extracting truly computable molecules from PDF file formats in a fast and efficient manner, we have developed a Java-based application, namely ChemEngine. This program recognizes textual patterns from the supplementary data and generates standard molecular structure data (bond matrix, atomic coordinates) that can be subjected to a multitude of computational processes automatically. The methodology has been demonstrated via several case studies on different formats of coordinate data stored in supplementary information files, wherein ChemEngine selectively harvested the atomic coordinates and interpreted them as molecules with high accuracy. The reusability of the extracted molecular coordinate data was demonstrated by computing single point energies that were in close agreement with the original computed data provided with the articles. It is envisaged that the methodology will enable large-scale conversion of molecular information from supplementary files available in PDF format into a collection of ready-to-compute molecular data, creating an automated workflow for advanced computational processes. The software, along with source code and instructions, is available at https://sourceforge.net/projects/chemengine/files/?source=navbar
Introducing meta-services for biomedical information extraction
Leitner, Florian; Krallinger, Martin; Rodriguez-Penagos, Carlos; Hakenberg, Jörg; Plake, Conrad; Kuo, Cheng-Ju; Hsu, Chun-Nan; Tsai, Richard Tzong-Han; Hung, Hsi-Chuan; Lau, William W; Johnson, Calvin A; Sætre, Rune; Yoshida, Kazuhiro; Chen, Yan Hua; Kim, Sun; Shin, Soo-Yong; Zhang, Byoung-Tak; Baumgartner, William A; Hunter, Lawrence; Haddow, Barry; Matthews, Michael; Wang, Xinglong; Ruch, Patrick; Ehrler, Frédéric; Özgür, Arzucan; Erkan, Güneş; Radev, Dragomir R; Krauthammer, Michael; Luong, ThaiBinh; Hoffmann, Robert; Sander, Chris; Valencia, Alfonso
2008-01-01
We introduce the first meta-service for information extraction in molecular biology, the BioCreative MetaServer (BCMS; ). This prototype platform is a joint effort of 13 research groups and provides automatically generated annotations for PubMed/Medline abstracts. Annotation types cover gene names, gene IDs, species, and protein-protein interactions. The annotations are distributed by the meta-server in both human and machine readable formats (HTML/XML). This service is intended to be used by biomedical researchers and database annotators, and in biomedical language processing. The platform allows direct comparison, unified access, and result aggregation of the annotations. PMID:18834497
Passive Polarimetric Information Processing for Target Classification
NASA Astrophysics Data System (ADS)
Sadjadi, Firooz; Sadjadi, Farzad
Polarimetric sensing is an area of active research in a variety of applications. In particular, the use of polarization diversity has been shown to improve performance in automatic target detection and recognition. Within the diverse scope of polarimetric sensing, the field of passive polarimetric sensing is of particular interest. This chapter presents several new methods for gathering information using such passive techniques. One method extracts three-dimensional (3D) information and surface properties using one or more sensors. Another method extracts scene-specific algebraic expressions that remain unchanged under polarization transformations (such as along the transmission path to the sensor).
Information-driven trade and price-volume relationship in artificial stock markets
NASA Astrophysics Data System (ADS)
Liu, Xinghua; Liu, Xin; Liang, Xiaobei
2015-07-01
The positive relation between stock price changes and trading volume (price-volume relationship) as a stylized fact has attracted significant interest among finance researchers and investment practitioners. However, until now, consensus has not been reached regarding the causes of the relationship based on real market data because extracting valuable variables (such as information-driven trade volume) from real data is difficult. This lack of general consensus motivates us to develop a simple agent-based computational artificial stock market where extracting the necessary variables is easy. Based on this model and its artificial data, our tests have found that the aggressive trading style of informed agents can produce a price-volume relationship. Therefore, the information spreading process is not a necessary condition for producing price-volume relationship.
Japan Report, Science and Technology.
1987-04-03
...are supplied by JPRS. Processing indicators such as [Text] or [Excerpt] in the first line of each item, or following the last line of a brief, indicate how the original information was processed. Where no processing indicator is given, the information was summarized or extracted... 000th of a second radiated by permanent stars during their evolutionary process, the institute said. The satellite's astronomical survey will
Information extraction with object based support vector machines and vegetation indices
NASA Astrophysics Data System (ADS)
Ustuner, Mustafa; Abdikan, Saygin; Balik Sanli, Fusun
2016-07-01
Information extraction from remote sensing data is important for policy and decision makers, as the extracted information provides base layers for many real-world applications. Classification of remotely sensed data is one of the most common methods of extracting information; however, it is still a challenging issue because several factors affect the accuracy of the classification. The resolution of the imagery, the number and homogeneity of land cover classes, the purity of the training data and the characteristics of the adopted classifiers are just some of these challenging factors. Object-based image classification has some superiority over pixel-based classification for high-resolution images, since it uses geometry and structure information in addition to spectral information. Vegetation indices are also commonly used in the classification process, since they provide additional spectral information for vegetation, forestry and agricultural areas. In this study, the impacts of the Normalized Difference Vegetation Index (NDVI) and the Normalized Difference Red Edge Index (NDRE) on the classification accuracy of RapidEye imagery were investigated. Object-based Support Vector Machines were implemented for the classification of crop types for a study area located in the Aegean region of Turkey. Results demonstrated that the incorporation of NDRE increased the overall classification accuracy from 79.96% to 86.80%, whereas NDVI decreased it from 79.96% to 78.90%. Moreover, it was shown that object-based classification with RapidEye data gives promising results for crop type mapping and analysis.
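For reference, both indices are simple band ratios. The sketch below computes NDVI and NDRE with NumPy; the reflectance values are made up, and the choice of which RapidEye band serves as the red edge is an assumption for illustration only.

```python
import numpy as np

def ndvi(nir, red):
    """Normalized Difference Vegetation Index."""
    return (nir - red) / (nir + red + 1e-12)

def ndre(nir, red_edge):
    """Normalized Difference Red Edge index."""
    return (nir - red_edge) / (nir + red_edge + 1e-12)

# Hypothetical reflectance values for a few pixels:
nir = np.array([0.45, 0.50])
red = np.array([0.08, 0.10])
red_edge = np.array([0.20, 0.22])
print(ndvi(nir, red), ndre(nir, red_edge))
```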
An R-Shiny Based Phenology Analysis System and Case Study Using Digital Camera Dataset
NASA Astrophysics Data System (ADS)
Zhou, Y. K.
2018-05-01
Accurate extraction of vegetation phenology information plays an important role in exploring the effects of climate change on vegetation. Repeated photographs from digital cameras are a useful and very large data source for phenological analysis, but processing and mining these data remain a major challenge: there is no single tool or universal solution for big data processing and visualization in the field of phenology extraction. In this paper, we propose an R Shiny based web application for extracting and analysing vegetation phenological parameters. Its main functions include visualization of phenological site distributions, ROI (Region of Interest) selection, vegetation index calculation and visualization, data filtering, growth trajectory fitting, and phenology parameter extraction. As an example, the long-term observation photography data from the Freemanwood site in 2013 were processed with this system. The results show that (1) the system is capable of analysing large data volumes using a distributed framework, and (2) the combination of multiple parameter extraction and growth curve fitting methods can effectively extract the key phenology parameters; moreover, different combinations of methods give discrepant results in particular study areas. Vegetation with a single growth peak is well suited to fitting the growth trajectory with the double logistic model, while vegetation with multiple growth peaks is better handled with the spline method.
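A minimal sketch of the double logistic fitting step is shown below, using SciPy's curve_fit on a synthetic greenness (GCC-like) series; the model form, starting values and data are illustrative assumptions, not the application's actual implementation (which is built in R/Shiny).

```python
import numpy as np
from scipy.optimize import curve_fit

def double_logistic(t, base, amp, k1, t1, k2, t2):
    """Double-logistic greenness model: rise around t1 (green-up), fall around t2 (senescence)."""
    return base + amp * (1.0 / (1.0 + np.exp(-k1 * (t - t1)))
                         - 1.0 / (1.0 + np.exp(-k2 * (t - t2))))

# Hypothetical green chromatic coordinate time series over one growing season:
doy = np.arange(1, 366, 5, dtype=float)
gcc = double_logistic(doy, 0.33, 0.10, 0.08, 120, 0.06, 280) \
      + np.random.default_rng(1).normal(0, 0.005, doy.size)

params, _ = curve_fit(double_logistic, doy, gcc,
                      p0=[0.33, 0.1, 0.05, 100, 0.05, 300], maxfev=10000)
print(params)  # fitted phenology parameters; t1 and t2 approximate start/end of season
```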
Extraction of Profile Information from Cloud Contaminated Radiances. Appendixes 2
NASA Technical Reports Server (NTRS)
Smith, W. L.; Zhou, D. K.; Huang, H.-L.; Li, Jun; Liu, X.; Larar, A. M.
2003-01-01
Clouds act to reduce the signal level and may introduce noise, depending on the complexity of the cloud properties and the manner in which they are treated in the profile retrieval process. There are essentially three ways to extract profile information from cloud contaminated radiances: (1) cloud-clearing using spatially adjacent cloud contaminated radiance measurements, (2) retrieval based upon the assumption of opaque cloud conditions, and (3) retrieval or radiance assimilation using a physically correct cloud radiative transfer model which accounts for the absorption and scattering of the radiance observed. Cloud clearing extracts the radiance arising from the clear-air portion of partly clouded fields of view, permitting soundings to the surface or the assimilation of radiances as in the clear field-of-view case. However, the accuracy of the clear-air radiance signal depends upon the uniformity of cloud height and optical properties across the two fields of view used in the cloud-clearing process. The assumption of opaque clouds within the field of view permits relatively accurate profiles to be retrieved down to near cloud-top levels, the accuracy near the cloud top being dependent upon the actual microphysical properties of the cloud. The use of a physically correct cloud radiative transfer model enables accurate retrievals down to cloud-top levels and below semi-transparent cloud layers (e.g., cirrus). It should also be possible to assimilate cloudy radiances directly into the model given a physically correct cloud radiative transfer model, using geometric and microphysical cloud parameters retrieved from the radiance spectra as initial cloud variables in the radiance assimilation process. This presentation reviews these three ways to extract profile information from cloud contaminated radiances. NPOESS Airborne Sounder Testbed-Interferometer radiance spectra and Aqua satellite AIRS radiance spectra are used to illustrate how cloudy radiances can be used in the profile retrieval process.
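As context for option (1), classical two-field-of-view cloud clearing is often written as an extrapolation between a less cloudy and a more cloudy field of view. The snippet below is only a hedged sketch of that textbook form with invented radiances; the value of eta and the way it would be estimated are assumptions, not details from this presentation.

```python
import numpy as np

def cloud_cleared_radiance(r1, r2, eta):
    """Classical two-field-of-view cloud clearing:
    R_clear = R1 + eta * (R1 - R2),
    where eta is estimated from channels whose clear radiance is known
    (e.g., window channels constrained by surface temperature)."""
    return r1 + eta * (r1 - r2)

# Hypothetical radiances from two adjacent fields of view with different cloud fractions:
r1 = np.array([82.0, 75.0, 60.0])   # less cloudy field of view
r2 = np.array([70.0, 65.0, 50.0])   # more cloudy field of view
print(cloud_cleared_radiance(r1, r2, eta=0.4))
```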
Extracting Coherent Information from Noise Based Correlation Processing
2015-09-30
from deep-water experiments using eigenrays. Jit Sarkar, Christopher M. Verlinden, Jefferey D. Tippmann, William S. Hodgkiss, and W. A. Kuperman, ASA abstract in Jackson, Nov 2015, and manuscript in preparation.
TRENCADIS--a WSRF grid MiddleWare for managing DICOM structured reporting objects.
Blanquer, Ignacio; Hernandez, Vicente; Segrelles, Damià
2006-01-01
The adoption of digital processing of medical data, especially in radiology, has led to the availability of millions of records (images and reports). However, this information is mainly used at the patient level and is organised according to administrative criteria, which makes the extraction of knowledge difficult. Moreover, legal constraints make the direct integration of information systems complex or even impossible. On the other side, the widespread adoption of the DICOM format has led to the inclusion of information other than just radiological images. The possibility of coding radiology reports in a structured form, adding semantic information about the data contained in the DICOM objects, eases the process of structuring images according to content. DICOM Structured Reporting (DICOM-SR) is a specification of tags and sections to code and integrate radiology reports, with seamless references to findings and regions of interest in the associated images, movies, waveforms, signals, etc. The work presented in this paper aims at developing a framework to efficiently and securely share medical images and radiology reports, as well as to provide high-throughput processing services. This system is based on an architecture previously developed in the framework of the TRENCADIS project, and uses other components such as the security system and the Grid processing service developed in previous activities. The work presented here introduces a semantic structuring and an ontology framework to organise medical images according to standard terminology and disease coding formats (SNOMED, ICD-9, LOINC, ...).
WRIS: a resource information system for wildland management
Robert M. Russell; David A. Sharpnack; Elliot Amidon
1975-01-01
WRIS (Wildland Resource Information System) is a computer system for processing, storing, retrieving, updating, and displaying geographic data. The polygon, representing a land area boundary, forms the building block of WRIS. Polygons form a map. Maps are digitized manually or by automatic scanning. Computer programs can extract and produce polygon maps and can overlay...
A Review of Feature Extraction Software for Microarray Gene Expression Data
Tan, Ching Siang; Ting, Wai Soon; Mohamad, Mohd Saberi; Chan, Weng Howe; Deris, Safaai; Ali Shah, Zuraini
2014-01-01
When gene expression data are too large to be processed, they are transformed into a reduced representation set of genes. Transforming large-scale gene expression data into a set of genes is called feature extraction. If the genes extracted are carefully chosen, this gene set can extract the relevant information from the large-scale gene expression data, allowing further analysis by using this reduced representation instead of the full size data. In this paper, we review numerous software applications that can be used for feature extraction. The software reviewed is mainly for Principal Component Analysis (PCA), Independent Component Analysis (ICA), Partial Least Squares (PLS), and Local Linear Embedding (LLE). A summary and sources of the software are provided in the last section for each feature extraction method. PMID:25250315
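As a simple illustration of the feature extraction idea reviewed here, the sketch below reduces a hypothetical gene expression matrix with scikit-learn's PCA; the data are random placeholders and the snippet does not correspond to any specific package covered in the review.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical expression matrix: 50 samples x 2000 genes.
rng = np.random.default_rng(0)
expression = rng.standard_normal((50, 2000))

pca = PCA(n_components=10)            # reduced representation of 10 components
features = pca.fit_transform(expression)
print(features.shape)                 # (50, 10)
print(pca.explained_variance_ratio_.round(3))
```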
v3NLP Framework: Tools to Build Applications for Extracting Concepts from Clinical Text
Divita, Guy; Carter, Marjorie E.; Tran, Le-Thuy; Redd, Doug; Zeng, Qing T; Duvall, Scott; Samore, Matthew H.; Gundlapalli, Adi V.
2016-01-01
Introduction: Substantial amounts of clinically significant information are contained only within the narrative of the clinical notes in electronic medical records. The v3NLP Framework is a set of “best-of-breed” functionalities developed to transform this information into structured data for use in quality improvement, research, population health surveillance, and decision support. Background: MetaMap, cTAKES and similar well-known natural language processing (NLP) tools do not have sufficient scalability out of the box. The v3NLP Framework evolved out of the necessity to scale these tools up and provide a framework to customize and tune techniques that fit a variety of tasks, including document classification, tuned concept extraction for specific conditions, patient classification, and information retrieval. Innovation: Beyond scalability, several v3NLP Framework-developed projects have been efficacy tested and benchmarked. While v3NLP Framework includes annotators, pipelines and applications, its functionalities enable developers to create novel annotators and to place annotators into pipelines and scaled applications. Discussion: The v3NLP Framework has been successfully utilized in many projects including general concept extraction, risk factors for homelessness among veterans, and identification of mentions of the presence of an indwelling urinary catheter. Projects as diverse as predicting colonization with methicillin-resistant Staphylococcus aureus and extracting references to military sexual trauma are being built using v3NLP Framework components. Conclusion: The v3NLP Framework is a set of functionalities and components that provide Java developers with the ability to create novel annotators and to place those annotators into pipelines and applications to extract concepts from clinical text. There are scale-up and scale-out functionalities to process large numbers of records. PMID:27683667
NASA Astrophysics Data System (ADS)
Wang, Haixia; Suo, Tongchuan; Wu, Xiaolin; Zhang, Yue; Wang, Chunhua; Yu, Heshui; Li, Zheng
2018-03-01
The control of batch-to-batch quality variations remains a challenging task for pharmaceutical industries, e.g., traditional Chinese medicine (TCM) manufacturing. One difficult problem is to produce pharmaceutical products with consistent quality from raw material of large quality variations. In this paper, an integrated methodology combining near infrared spectroscopy (NIRS) and dynamic predictive modeling is developed for the monitoring and control of the batch extraction process of licorice. With the spectral data in hand, the initial state of the process is first estimated with a state-space model to construct a process monitoring strategy for the early detection of variations induced by the initial process inputs such as raw materials. Secondly, the quality property of the end product is predicted at mid-course during the extraction process with a partial least squares (PLS) model. The batch-end-time (BET) is then adjusted accordingly to minimize the quality variations. In conclusion, our study shows that, with the help of dynamic predictive modeling, NIRS can offer past and future information about the process, which enables more accurate monitoring and control of process performance and product quality.
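A hedged sketch of the mid-course quality prediction step is given below using scikit-learn's PLSRegression on synthetic spectra; the array shapes, wavelengths and quality attribute are invented, and the state-space monitoring part is not shown.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(0)
# Hypothetical mid-course NIR spectra (40 batches x 700 wavelengths) and
# the corresponding end-of-batch quality attribute (e.g., a marker concentration).
spectra = rng.standard_normal((40, 700))
quality = spectra[:, :5].sum(axis=1) + rng.normal(0, 0.1, 40)

pls = PLSRegression(n_components=3)
pls.fit(spectra, quality)

predicted = pls.predict(spectra[:1])   # predict end quality for one batch mid-course
print(predicted)
```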
Concept maps: A tool for knowledge management and synthesis in web-based conversational learning.
Joshi, Ankur; Singh, Satendra; Jaswal, Shivani; Badyal, Dinesh Kumar; Singh, Tejinder
2016-01-01
Web-based conversational learning provides an opportunity for shared knowledge base creation through collaboration and collective wisdom extraction. Usually, the amount of information generated in such forums is very large and multidimensional (in alignment with the desirable preconditions for constructivist knowledge creation), and sometimes the nature of the expected new information cannot be anticipated in advance. Thus, concept maps (crafted from the constructed data) as "process summary" tools may be a solution to improve critical thinking and learning by making connections between the facts or knowledge shared by the participants during online discussion. This exploratory paper begins with a description of this innovation, tried on a web-based interaction platform (email list management software), FAIMER-Listserv, and of the qualitative evidence generated through peer feedback. This process description is further supported by a theoretical construct which shows how social constructivism (inclusive of autonomy and complexity) affects conversational learning. The paper rationalizes the use of concept maps as a mid-summary tool for extracting information and making further sense out of this apparent intricacy.
Zhou, Li; Plasek, Joseph M; Mahoney, Lisa M; Karipineni, Neelima; Chang, Frank; Yan, Xuemin; Chang, Fenny; Dimaggio, Dana; Goldman, Debora S.; Rocha, Roberto A.
2011-01-01
Clinical information is often coded using different terminologies, and therefore is not interoperable. Our goal is to develop a general natural language processing (NLP) system, called Medical Text Extraction, Reasoning and Mapping System (MTERMS), which encodes clinical text using different terminologies and simultaneously establishes dynamic mappings between them. MTERMS applies a modular, pipeline approach flowing from a preprocessor, semantic tagger, terminology mapper, context analyzer, and parser to structure inputted clinical notes. Evaluators manually reviewed 30 free-text and 10 structured outpatient clinical notes compared to MTERMS output. MTERMS achieved an overall F-measure of 90.6 and 94.0 for free-text and structured notes respectively for medication and temporal information. The local medication terminology had 83.0% coverage compared to RxNorm’s 98.0% coverage for free-text notes. 61.6% of mappings between the terminologies are exact match. Capture of duration was significantly improved (91.7% vs. 52.5%) from systems in the third i2b2 challenge. PMID:22195230
Worldwide Report, Nuclear Development and Proliferation
1984-03-05
transmissions and broadcasts. Materials from foreign-language sources are translated; those from English-language sources are transcribed or reprinted, with... Processing indicators such as [Text] or [Excerpt] in the first line of each item, or following the last line of a brief, indicate how the original... information was processed. Where no processing indicator is given, the information was summarized or extracted. Unfamiliar names rendered
Latin America Report, No. 2712
1983-07-26
other characteristics retained. Headlines, editorial reports, and material enclosed in brackets [] are supplied by JPRS. Processing indicators such... as [Text] or [Excerpt] in the first line of each item, or following the last line of a brief, indicate how the original information was processed... Where no processing indicator is given, the information was summarized or extracted. Unfamiliar names rendered phonetically or transliterated are
Balabanova, Dobrinka A.; Paunov, Momchil; Goltsev, Vasillij; Cuypers, Ann; Vangronsveld, Jaco; Vassilev, Andon
2016-01-01
The herbicide imazamox may provoke temporary yellowing and growth retardation in IMI-R sunflower hybrids, more often under stressful environmental conditions. Although photosynthetic processes are not the primary sites of imazamox action, they might be influenced; therefore, more information about the photosynthetic performance of herbicide-treated plants could be valuable for a further improvement of the Clearfield technology. Plant biostimulants have been shown to ameliorate damage caused by different stress factors on plants, but very limited information exists about their effects on herbicide-stressed plants. In order to characterize the photosynthetic performance of imazamox-treated sunflower IMI-R plants, we carried out experiments including both single and combined treatments with imazamox and a plant biostimulant containing an amino acid extract. We found that imazamox application at a rate of 132 μg per plant (equivalent to 40 g active ingredient ha−1) induced negative effects on both the light-dependent photosynthetic redox reactions and leaf gas exchange processes, which were much less pronounced after the combined application of imazamox and the amino acid extract. PMID:27826304
Rinaldi, Fabio; Schneider, Gerold; Kaljurand, Kaarel; Hess, Michael; Andronis, Christos; Konstandi, Ourania; Persidis, Andreas
2007-02-01
The amount of new discoveries (as published in the scientific literature) in the biomedical area is growing at an exponential rate. This growth makes it very difficult to filter the most relevant results, and thus the extraction of the core information becomes very expensive. Therefore, there is a growing interest in text processing approaches that can deliver selected information from scientific publications, which can limit the amount of human intervention normally needed to gather those results. This paper presents and evaluates an approach aimed at automating the process of extracting functional relations (e.g. interactions between genes and proteins) from scientific literature in the biomedical domain. The approach, using a novel dependency-based parser, is based on a complete syntactic analysis of the corpus. We have implemented a state-of-the-art text mining system for biomedical literature, based on a deep-linguistic, full-parsing approach. The results are validated on two different corpora: the manually annotated genomics information access (GENIA) corpus and the automatically annotated Arabidopsis thaliana circadian rhythms (ATCR) corpus. We show how a deep-linguistic approach (contrary to common belief) can be used in a real world text mining application, offering high-precision relation extraction, while at the same time retaining a sufficient recall.
Extraction and purification methods in downstream processing of plant-based recombinant proteins.
Łojewska, Ewelina; Kowalczyk, Tomasz; Olejniczak, Szymon; Sakowicz, Tomasz
2016-04-01
During the last two decades, the production of recombinant proteins in plant systems has been receiving increased attention. Currently, proteins are considered as the most important biopharmaceuticals. However, high costs and problems with scaling up the purification and isolation processes make the production of plant-based recombinant proteins a challenging task. This paper presents a summary of the information regarding the downstream processing in plant systems and provides a comprehensible overview of its key steps, such as extraction and purification. To highlight the recent progress, mainly new developments in the downstream technology have been chosen. Furthermore, besides most popular techniques, alternative methods have been described. Copyright © 2015 Elsevier Inc. All rights reserved.
Random bits, true and unbiased, from atmospheric turbulence
Marangon, Davide G.; Vallone, Giuseppe; Villoresi, Paolo
2014-01-01
Random numbers are a fundamental ingredient for secure communications and numerical simulation, as well as for games and, in general, for Information Science. Physical processes with intrinsic unpredictability may be exploited to generate genuine random numbers. The optical propagation in strong atmospheric turbulence is used here for this purpose, by observing a laser beam after a 143 km free-space path. In addition, we developed an algorithm to extract the randomness of the beam images at the receiver without post-processing. The numbers passed very selective randomness tests for qualification as genuine random numbers. The extracting algorithm can be easily generalized to random images generated by different physical processes. PMID:24976499
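For context only: a classical way to turn biased physical bits into unbiased ones is von Neumann debiasing, sketched below. This is not the authors' extraction algorithm (which works on beam images and avoids post-processing); it simply illustrates what a randomness extractor does.

```python
def von_neumann_extract(bits):
    """Classical von Neumann debiasing: map the pair 01 -> 0, 10 -> 1, drop 00 and 11."""
    out = []
    for b0, b1 in zip(bits[0::2], bits[1::2]):
        if b0 != b1:
            out.append(b0)
    return out

raw = [1, 1, 0, 1, 1, 0, 0, 0, 0, 1]   # hypothetical biased raw bits
print(von_neumann_extract(raw))        # -> [0, 1, 0]
```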
Aqueous biphasic systems in the separation of food colorants.
Santos, João H P M; Capela, Emanuel V; Boal-Palheiros, Isabel; Coutinho, João A P; Freire, Mara G; Ventura, Sónia P M
2018-04-25
Aqueous biphasic systems (ABS) composed of polypropylene glycol and carbohydrates, two benign substances, are proposed to separate two food colorants (E122 and E133). ABS are promising extractive platforms, particularly for biomolecules, due to their aqueous and mild nature (pH and temperature), reduced environmental impact and low processing costs. Another major aspect, particularly useful in downstream processing, is the ability to "tune" the extraction and purification performance of these systems by a proper choice of the ABS components. In this work, our intention is to present the concept of ABS as an alternative, volatile-organic-solvent-free tool to separate two different biomolecules in a simple way, so simple that teachers can effectively adopt it in their classes to explain the concept of bioseparation processes. Informative documents and general information about the preparation of binodal curves and their use in the partition of biomolecules are provided in this work for teachers to use in their classes. In this sense, the students use different carbohydrates to build ABS and then study the partition of two food colour dyes (of synthetic origin), thus evaluating their ability to separate the two food colorants. Through these experiments, the students become acquainted with ABS, learn how to determine solubility curves and perform extraction procedures using food colorant additives, which can also be applied to the extraction of various (bio)molecules. © 2018 The International Union of Biochemistry and Molecular Biology.
Space Research Data Management in the National Aeronautics and Space Administration
NASA Technical Reports Server (NTRS)
Ludwig, G. H.
1986-01-01
Space related scientific research has passed through a natural evolutionary process. The task of extracting the meaningful information from the raw data is highly involved and will require data processing capabilities that do not exist today. The results of a three-year examination of this subject are presented, using an earlier report as a starting point. The general conclusion is that there are areas in which NASA's data management practices can be improved, and specific actions are recommended. These actions will enhance NASA's ability to extract more of the potential data and to capitalize on future opportunities.
Modelling spatiotemporal change using multidimensional arrays
NASA Astrophysics Data System (ADS)
Lu, Meng; Appel, Marius; Pebesma, Edzer
2017-04-01
The large variety of remote sensors, model simulations, and in-situ records provides great opportunities to model environmental change. The massive amount of high-dimensional data calls for methods to integrate data from various sources and to analyse spatiotemporal and thematic information jointly. An array is a collection of elements ordered and indexed in arbitrary dimensions, which naturally represents spatiotemporal phenomena identified by their geographic locations and recording times. In addition, array regridding (e.g., resampling, down-/up-scaling), dimension reduction, and spatiotemporal statistical algorithms are readily applicable to arrays. However, the role of arrays in big geoscientific data analysis has not been systematically studied: How can arrays discretise continuous spatiotemporal phenomena? How can arrays facilitate the extraction of multidimensional information? How can arrays provide a clean, scalable and reproducible change modelling process that is communicable between mathematicians, computer scientists, Earth system scientists and stakeholders? This study emphasises detecting spatiotemporal change using satellite image time series. Current change detection methods using satellite image time series commonly analyse data in separate steps: (1) forming a vegetation index, (2) conducting time series analysis on each pixel, and (3) post-processing and mapping the time series analysis results; this does not consider spatiotemporal correlations and ignores much of the spectral information. Multidimensional information can be better extracted by jointly considering spatial, spectral, and temporal information. To approach this goal, we use principal component analysis to extract multispectral information and spatial autoregressive models to account for spatial correlation in residual-based time series structural change modelling. We also discuss the potential of multivariate non-parametric time series structural change methods, hierarchical modelling, and extreme event detection methods to model spatiotemporal change. We show how array operations can facilitate expressing these methods, and how the open-source array data management and analytics software SciDB and R can be used to scale the process and make it easily reproducible.
Fast Coding of Orientation in Primary Visual Cortex
Shriki, Oren; Kohn, Adam; Shamir, Maoz
2012-01-01
Understanding how populations of neurons encode sensory information is a major goal of systems neuroscience. Attempts to answer this question have focused on responses measured over several hundred milliseconds, a duration much longer than that frequently used by animals to make decisions about the environment. How reliably sensory information is encoded on briefer time scales, and how best to extract this information, is unknown. Although it has been proposed that neuronal response latency provides a major cue for fast decisions in the visual system, this hypothesis has not been tested systematically and in a quantitative manner. Here we use a simple ‘race to threshold’ readout mechanism to quantify the information content of spike time latency of primary visual (V1) cortical cells to stimulus orientation. We find that many V1 cells show pronounced tuning of their spike latency to stimulus orientation and that almost as much information can be extracted from spike latencies as from firing rates measured over much longer durations. To extract this information, stimulus onset must be estimated accurately. We show that the responses of cells with weak tuning of spike latency can provide a reliable onset detector. We find that spike latency information can be pooled from a large neuronal population, provided that the decision threshold is scaled linearly with the population size, yielding a processing time of the order of a few tens of milliseconds. Our results provide a novel mechanism for extracting information from neuronal populations over the very brief time scales in which behavioral judgments must sometimes be made. PMID:22719237
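A toy version of the 'race to threshold' readout can be written in a few lines: neurons are ordered by first-spike latency, and the first preferred-orientation pool to accumulate a fixed number of spikes wins. The latencies, pool structure and threshold below are invented for illustration and do not reproduce the paper's V1 data or analysis.

```python
import numpy as np

def race_to_threshold(latencies, labels, threshold):
    """Decide stimulus identity from spike latencies: the first label whose
    accumulated spike count reaches `threshold` wins.

    latencies : array of first-spike times, one per neuron
    labels    : preferred orientation label of each neuron
    threshold : number of spikes required to trigger a decision
    """
    order = np.argsort(latencies)          # neurons ranked by firing time
    counts = {}
    for idx in order:
        lab = labels[idx]
        counts[lab] = counts.get(lab, 0) + 1
        if counts[lab] >= threshold:
            return lab, latencies[idx]     # decision and decision time
    return None, np.inf

rng = np.random.default_rng(0)
labels = np.repeat([0, 45, 90, 135], 25)   # 100 neurons in 4 orientation pools
latencies = rng.exponential(0.05, 100)
latencies[labels == 45] *= 0.5             # the "45 deg" pool fires earlier
print(race_to_threshold(latencies, labels, threshold=10))
```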
Pahl, Ina; Dorey, Samuel; Barbaroux, Magali; Lagrange, Bertille; Frankl, Heike
2014-01-01
This paper describes an approach of extractables determination and gives information on extractables profiles for gamma-sterilized single-use bags with polyethylene inner contact surfaces from five different suppliers. Four extraction solvents were chosen to capture a broad spectrum of extractables. An 80% ethanol extraction was used to extract compounds that represent the bag resin and the organic additives used to stabilize or process the polymer films which would not normally be water-soluble. Extractions with 1 M HCl, 1 M NaOH, and 1% polysorbate 80 were used to bracket potential leachables in biopharmaceutical process fluids. The objective of this study was to obtain extractables data from different bags under identical test conditions. All the bags had a nominal capacity of 5 L, were gamma-irradiated prior to testing, and were tested without modification except that connectors, if any, were removed prior to filling. They were extracted at 40 °C for 30 days. Extractables from all bag extracts were identified and the concentration estimated using headspace gas chromatography-mass spectrometry and flame ionization detection for volatile and semi-volatile compounds, and liquid chromatography-mass spectrometry for targeted compounds. Metals and other elements were detected and quantified by inductively coupled plasma mass spectrometry analysis. The results showed a variety of extractables, some of which are not related to the inner polyethylene contact layer. Detected organic compounds included oligomers from polyolefins, additives and their degradation products, and oligomers from the fill tubing. The concentrations of extractables were in the range of parts-per-billion to parts-per-million per bag under the applied extraction conditions. Toxicological effects of the extractables are not addressed in this paper. Extractables and leachables characterization supports the validation and the use of single-use bags in the biopharmaceutical manufacturing process. This paper describes an approach for the identification and quantification of extractable substances for five commercially available single-use bags from different suppliers under identical analytical conditions. Four test formulations were used for the extraction, and extractables were analyzed with appropriately qualified analytical techniques, allowing for the detection of a broad range of released chemical compounds. Polymer additives such as antioxidants and processing aids and their degradation products were found to be the source of most of the extracted compounds. The concentration of extractables ranged from parts-per-billion to parts-per-million under the applied extraction conditions. © PDA, Inc. 2014.
Super-pixel extraction based on multi-channel pulse coupled neural network
NASA Astrophysics Data System (ADS)
Xu, GuangZhu; Hu, Song; Zhang, Liu; Zhao, JingJing; Fu, YunXia; Lei, BangJun
2018-04-01
Super-pixel extraction techniques group pixels into over-segmented image blocks according to the similarity among pixels. Compared with traditional pixel-based methods, image description based on super-pixels requires less computation and is easier to perceive, and it has been widely used in image processing and computer vision applications. The pulse coupled neural network (PCNN) is a biologically inspired model which stems from the phenomenon of synchronous pulse release in the visual cortex of cats. Each PCNN neuron can correspond to a pixel of an input image, and the dynamic firing pattern of each neuron contains both the pixel feature information and its spatial context. In this paper, a new color super-pixel extraction algorithm based on a multi-channel pulse coupled neural network (MPCNN) is proposed. The algorithm adopts the block-dividing idea of the SLIC algorithm: the image is first divided into blocks of the same size; then, for each image block, the pixels adjacent to each seed with similar color are classified as a group, named a super-pixel; finally, post-processing is applied to those pixels or pixel blocks which have not been grouped. Experiments show that the proposed method can adjust the number of super-pixels and the segmentation precision by setting parameters, and it has good potential for super-pixel extraction.
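For comparison with the MPCNN approach, the standard SLIC algorithm whose block-dividing idea is borrowed here is available off the shelf. The sketch below uses scikit-image on a sample image; the choice of library and the parameter values are assumptions for illustration, not part of the paper.

```python
from skimage import data, segmentation

image = data.astronaut()                       # sample RGB image shipped with scikit-image
labels = segmentation.slic(image, n_segments=200, compactness=10)
print(labels.shape, labels.max())              # one integer super-pixel label per pixel
```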
NASA Astrophysics Data System (ADS)
Liu, X.; Zhang, J. X.; Zhao, Z.; Ma, A. D.
2015-06-01
Synthetic aperture radar is becoming more and more widely used in remote sensing because of its all-time and all-weather operation, and feature extraction from high-resolution SAR images has become a topic of intense interest. In particular, with the continuous improvement of airborne SAR image resolution, image texture information has become more abundant, which is of great significance for classification and extraction. In this paper, a novel method for built-up area extraction using both statistical and structural features is proposed, based on the texture characteristics of built-up areas. First, statistical texture features and structural features are extracted with the classical gray level co-occurrence matrix and the variogram function, respectively, with directional information taken into account. Next, feature weights are calculated according to the Bhattacharyya distance, and all features are fused using these weights. Finally, the fused image is classified with the K-means method and the built-up areas are extracted after post-classification processing. The proposed method has been tested on domestic airborne P-band polarimetric SAR images; at the same time, two groups of experiments based on statistical texture alone and on structural texture alone were carried out. In addition to qualitative analysis, quantitative analysis based on manually selected built-up areas was performed: in the relatively simple experimental area the detection rate is more than 90%, and in the relatively complex experimental area the detection rate is also higher than that of the other two methods. The results for the study area show that this method can effectively and accurately extract built-up areas from high-resolution airborne SAR imagery.
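The statistical texture branch above relies on gray level co-occurrence statistics. The snippet below is a bare-bones NumPy sketch of a GLCM and one derived feature (contrast) for a single displacement; the random patch stands in for a SAR image block, and the weighting, variogram and K-means steps of the method are not reproduced.

```python
import numpy as np

def glcm(image, dx=1, dy=0, levels=16):
    """Gray level co-occurrence matrix for one displacement (dx, dy)."""
    q = (image.astype(float) / image.max() * (levels - 1)).astype(int)  # quantize
    m = np.zeros((levels, levels))
    h, w = q.shape
    for y in range(h - dy):
        for x in range(w - dx):
            m[q[y, x], q[y + dy, x + dx]] += 1
    return m / m.sum()

def contrast(p):
    """One common GLCM statistic: intensity contrast."""
    i, j = np.indices(p.shape)
    return np.sum(p * (i - j) ** 2)

rng = np.random.default_rng(0)
patch = rng.integers(0, 255, (64, 64))   # stand-in for one SAR image patch
print(contrast(glcm(patch)))
```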
Semantic Location Extraction from Crowdsourced Data
NASA Astrophysics Data System (ADS)
Koswatte, S.; Mcdougall, K.; Liu, X.
2016-06-01
Crowdsourced Data (CSD) has recently received increased attention in many application areas, including disaster management. Convenience of production and use, data currency and abundance are some of the key reasons for this high interest. Conversely, quality issues like incompleteness, credibility and relevancy prevent the direct use of such data in important applications like disaster management. Moreover, the availability of location information in CSD is problematic, as it remains very low in many crowdsourced platforms such as Twitter. In addition, the recorded location mostly relates to the mobile device or user location and often does not represent the event location. In CSD, the event location is discussed descriptively in the comments, in addition to the recorded location (which is generated by means of the mobile device's GPS or the mobile communication network). This study attempts to semantically extract the CSD location information with the help of an ontological gazetteer and other available resources. Tweets from the 2011 Queensland flood and Ushahidi Crowd Map data were semantically analysed to extract the location information with the support of the Queensland Gazetteer, which was converted to an ontological gazetteer, and a global gazetteer. Preliminary results show that the use of ontologies and semantics can improve the accuracy of place name identification in CSD and the process of location information extraction.
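At its simplest, gazetteer-based location extraction is a lookup of candidate tokens against known place names. The sketch below is a deliberately naive Python illustration with an invented tweet and a three-entry toy gazetteer (coordinates approximate); the actual study uses an ontological gazetteer and semantic matching rather than plain string comparison.

```python
# Toy gazetteer: place name -> (lat, lon), approximate values for illustration.
gazetteer = {
    "toowoomba": (-27.56, 151.95),
    "ipswich":   (-27.61, 152.76),
    "brisbane":  (-27.47, 153.03),
}

tweet = "Flood water rising fast near Ipswich, roads to Brisbane already cut."

# Strip simple punctuation, lowercase, and look each token up in the gazetteer.
tokens = tweet.lower().replace(",", " ").replace(".", " ").split()
mentions = [(t, gazetteer[t]) for t in tokens if t in gazetteer]
print(mentions)   # [('ipswich', ...), ('brisbane', ...)]
```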
Experimental extraction of an entangled photon pair from two identically decohered pairs.
Yamamoto, Takashi; Koashi, Masato; Ozdemir, Sahin Kaya; Imoto, Nobuyuki
2003-01-23
Entanglement is considered to be one of the most important resources in quantum information processing schemes, including teleportation, dense coding and entanglement-based quantum key distribution. Because entanglement cannot be generated by classical communication between distant parties, distribution of entangled particles between them is necessary. During the distribution process, entanglement between the particles is degraded by the decoherence and dissipation processes that result from unavoidable coupling with the environment. Entanglement distillation and concentration schemes are therefore needed to extract pairs with a higher degree of entanglement from these less-entangled pairs; this is accomplished using local operations and classical communication. Here we report an experimental demonstration of extraction of a polarization-entangled photon pair from two decohered photon pairs. Two polarization-entangled photon pairs are generated by spontaneous parametric down-conversion and then distributed through a channel that induces identical phase fluctuations to both pairs; this ensures that no entanglement is available as long as each pair is manipulated individually. Then, through collective local operations and classical communication we extract from the two decohered pairs a photon pair that is observed to be polarization-entangled.
Feature Extraction of Electronic Nose Signals Using QPSO-Based Multiple KFDA Signal Processing
Wen, Tailai; Huang, Daoyu; Lu, Kun; Deng, Changjian; Zeng, Tanyue; Yu, Song; He, Zhiyi
2018-01-01
The aim of this research was to enhance the classification accuracy of an electronic nose (E-nose) in different detecting applications. During the learning process of the E-nose to predict the types of different odors, the prediction accuracy was not quite satisfying because the raw features extracted from sensors’ responses were regarded as the input of a classifier without any feature extraction processing. Therefore, in order to obtain more useful information and improve the E-nose’s classification accuracy, in this paper, a Weighted Kernels Fisher Discriminant Analysis (WKFDA) combined with Quantum-behaved Particle Swarm Optimization (QPSO), i.e., QWKFDA, was presented to reprocess the original feature matrix. In addition, we have also compared the proposed method with quite a few previously existing ones including Principal Component Analysis (PCA), Locality Preserving Projections (LPP), Fisher Discriminant Analysis (FDA) and Kernels Fisher Discriminant Analysis (KFDA). Experimental results proved that QWKFDA is an effective feature extraction method for E-nose in predicting the types of wound infection and inflammable gases, which shared much higher classification accuracy than those of the contrast methods. PMID:29382146
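As a baseline for the discriminant-analysis family the paper builds on, plain Fisher Discriminant Analysis (LDA) can be run in a few lines with scikit-learn. The sketch below uses synthetic "E-nose-like" features and is not the weighted-kernel QWKFDA method itself; the array shapes and class structure are invented for illustration.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
# Hypothetical E-nose feature matrix: 60 measurements x 12 sensor features, 3 gas classes.
X = rng.standard_normal((60, 12)) + np.repeat(np.arange(3), 20)[:, None]
y = np.repeat(np.arange(3), 20)

fda = LinearDiscriminantAnalysis(n_components=2)   # plain FDA, no kernels or weighting
Z = fda.fit_transform(X, y)
print(Z.shape)                                     # (60, 2) discriminant features
```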
Macedo, Alessandra A; Pessotti, Hugo C; Almansa, Luciana F; Felipe, Joaquim C; Kimura, Edna T
2016-07-01
The analyses provided by several medical-imaging processing systems typically support the extraction of image attributes but do not comprise some of the information that characterizes images. For example, morphometry can be applied to find new information about the visual content of an image. This extension of information may result in knowledge. Subsequently, the results of these mappings can be applied to recognize exam patterns, thus improving the accuracy of image retrieval and allowing a better interpretation of exam results. Although successfully applied to breast lesion images, the morphometric approach is still poorly explored for thyroid lesions due to the high subjectivity of thyroid examinations. This paper presents a theoretical-practical study, considering Computer Aided Diagnosis (CAD) and morphometry, to reduce the semantic discontinuity between medical image features and human interpretation of image content. The proposed method aggregates the content of microscopic images characterized by morphometric information and other image attributes extracted by traditional object extraction algorithms. This method carries out segmentation, feature extraction, image labeling and classification. Morphometric analysis was included as an object extraction method in order to verify the improvement in accuracy for the automatic classification of microscopic images. To validate this proposal and verify the utility of morphometric information for characterizing thyroid images, a CAD system was created to classify real thyroid image exams into Papillary Cancer, Goiter and Non-Cancer. Results showed that morphometric information can improve the accuracy and precision of image retrieval and the interpretation of results in computer-aided diagnosis. For example, in the scenario where all the extractors are combined with the morphometric information, the CAD system had its best performance (70% precision in Papillary cases). The results indicate that morphometric information from images can be used to reduce the semantic discontinuity between human interpretation and image characterization. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Patterson, Olga V; Freiberg, Matthew S; Skanderson, Melissa; J Fodeh, Samah; Brandt, Cynthia A; DuVall, Scott L
2017-06-12
In order to investigate the mechanisms of cardiovascular disease in HIV infected and uninfected patients, an analysis of echocardiogram reports is required for a large longitudinal multi-center study. A natural language processing system using a dictionary lookup, rules, and patterns was developed to extract heart function measurements that are typically recorded in echocardiogram reports as measurement-value pairs. Curated semantic bootstrapping was used to create a custom dictionary that extends existing terminologies based on terms that actually appear in the medical record. A novel disambiguation method based on semantic constraints was created to identify and discard erroneous alternative definitions of the measurement terms. The system was built on a scalable framework, making it available for processing large datasets. The system was developed for and validated on notes from three sources: general clinic notes, echocardiogram reports, and radiology reports. The system achieved F-scores of 0.872, 0.844, and 0.877, with precision of 0.936, 0.982, and 0.969 for each dataset, respectively, averaged across all extracted values. Left ventricular ejection fraction (LVEF) is the most frequently extracted measurement. The precision of extraction of the LVEF measure ranged from 0.968 to 1.0 across different document types. This system illustrates the feasibility and effectiveness of large-scale information extraction on clinical data. New clinical questions can be addressed in the domain of heart failure using retrospective clinical data analysis because key heart function measurements can be successfully extracted using natural language processing.
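As a simplified illustration of the measurement-value pairing described above, the toy regular expression below pulls LVEF values out of a sample sentence; the pattern, synonym list and example text are invented for illustration and are far simpler than the authors' dictionary- and rule-based system.

    # Toy measurement-value pair extraction for LVEF from free text.
    # The synonyms, pattern and sample sentence are invented placeholders.
    import re

    LVEF_TERMS = r"(?:LVEF|left ventricular ejection fraction|ejection fraction|EF)"
    PATTERN = re.compile(LVEF_TERMS + r"\D{0,20}?(\d{1,2})\s*(?:-|to)?\s*(\d{1,2})?\s*%",
                         re.IGNORECASE)

    text = "Left ventricular ejection fraction is estimated at 55-60%. EF 45 % on prior study."
    for match in PATTERN.finditer(text):
        low, high = match.group(1), match.group(2)
        print("LVEF:", f"{low}-{high}%" if high else f"{low}%")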
Video indexing based on image and sound
NASA Astrophysics Data System (ADS)
Faudemay, Pascal; Montacie, Claude; Caraty, Marie-Jose
1997-10-01
Video indexing is a major challenge for both scientific and economic reasons. Information extraction can sometimes be easier from the sound channel than from the image channel. We first present a multi-channel and multi-modal query interface, to query sound, image and script through 'pull' and 'push' queries. We then summarize the segmentation phase, which needs information from the image channel. Detection of critical segments is proposed; it should speed up both automatic and manual indexing. We then present an overview of the information extraction phase. Information can be extracted from the sound channel through speaker recognition, vocal dictation with unconstrained vocabularies, and script alignment with speech. We present experimental results for these various techniques. Speaker recognition methods were tested on the TIMIT and NTIMIT databases. Vocal dictation was tested on newspaper sentences spoken by several speakers. Script alignment was tested on part of a cartoon movie, 'Ivanhoe'. For good-quality sound segments, error rates are low enough for use in indexing applications. Major issues are the processing of sound segments with noise or music, and performance improvement through the use of appropriate, low-cost architectures or networks of workstations.
Extracting nursing practice patterns from structured labor and delivery data sets.
Hall, Eric S; Thornton, Sidney N
2007-10-11
This study was designed to demonstrate the feasibility of a computerized care process model that provides real-time case profiling and outcome forecasting. A methodology was defined for extracting nursing practice patterns from structured point-of-care data collected using the labor and delivery information system at Intermountain Healthcare. Data collected during January 2006 were retrieved from Intermountain Healthcare's enterprise data warehouse for use in the study. The knowledge discovery in databases process provided a framework for data analysis including data selection, preprocessing, data-mining, and evaluation. Development of an interactive data-mining tool and construction of a data model for stratification of patient records into profiles supported the goals of the study. Five benefits of the practice pattern extraction capability, which extend to other clinical domains, are listed with supporting examples.
NASA Astrophysics Data System (ADS)
Zoratti, Paul K.; Gilbert, R. Kent; Majewski, Ronald; Ference, Jack
1995-12-01
Development of automotive collision warning systems has progressed rapidly over the past several years. A key enabling technology for these systems is millimeter-wave radar. This paper addresses a very critical millimeter-wave radar sensing issue for automotive radar, namely the scattering characteristics of common roadway objects such as vehicles, road signs, and bridge overpass structures. The data presented in this paper were collected on ERIM's Fine Resolution Radar Imaging Rotary Platform Facility and processed with ERIM's image processing tools. The value of this approach is that it provides system developers with a 2D radar image from which information about individual point scatterers 'within a single target' can be extracted. This information on scattering characteristics will be utilized to refine threat assessment processing algorithms and automotive radar hardware configurations. (1) By evaluating the scattering characteristics identified in the radar image, radar signatures as a function of aspect angle can be established for common roadway objects. These signatures will aid in the refinement of threat assessment processing algorithms. (2) Utilizing ERIM's image manipulation tools, total RCS and RCS as a function of range and azimuth can be extracted from the radar image data. This RCS information will be essential in defining the operational envelope (e.g. dynamic range) within which any radar sensor hardware must be designed.
Semantic and syntactic interoperability in online processing of big Earth observation data.
Sudmanns, Martin; Tiede, Dirk; Lang, Stefan; Baraldi, Andrea
2018-01-01
The challenge of enabling syntactic and semantic interoperability for comprehensive and reproducible online processing of big Earth observation (EO) data is still unsolved. Supporting both types of interoperability is one of the requirements for efficiently extracting valuable information from the large amount of available multi-temporal gridded data sets. The proposed system wraps world models (semantic interoperability) into OGC Web Processing Services (syntactic interoperability) for semantic online analyses. World models describe spatio-temporal entities and their relationships in a formal way. The proposed system serves as an enabler for (1) technical interoperability, using a standardised interface that can be used by all types of clients, and (2) allowing experts from different domains to develop complex analyses together as a collaborative effort. Users connect the world models online to the data, which are maintained in centralised storage as 3D spatio-temporal data cubes. The system also allows non-experts to extract valuable information from EO data because data management, low-level interactions and specific software issues can be ignored. We discuss the concept of the proposed system, provide a technical implementation example, describe three use cases for extracting changes from EO images, and demonstrate the usability also for non-EO, gridded, multi-temporal data sets (CORINE land cover).
Acquiring geographical data with web harvesting
NASA Astrophysics Data System (ADS)
Dramowicz, K.
2016-04-01
Many websites contain attractive and up-to-date geographical information. This information can be extracted, stored, analyzed and mapped using web harvesting techniques. Poorly organized data from websites are transformed with web harvesting into a more structured format, which can be stored in a database and analyzed. Almost 25% of web traffic is related to web harvesting, mostly through the use of search engines. This paper presents how to harvest geographic information from web documents using the free tool Beautiful Soup, one of the most commonly used Python libraries for pulling data from HTML and XML files. It is a relatively easy task to process one static HTML table. The more challenging task is to extract and save information from tables located on multiple and poorly organized websites. Legal and ethical aspects of web harvesting are discussed as well. The paper demonstrates two case studies. The first one shows how to extract various types of information about the Good Country Index from multiple web pages, load it into one attribute table and map the results. The second case study shows how script tools and GIS can be used to extract information from one hundred thirty-six websites about Nova Scotia wines. In a little more than three minutes, a database containing one hundred and six liquor stores selling these wines is created. Then the availability and spatial distribution of various types of wines (by grape type, by winery, and by liquor store) are mapped and analyzed.
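As a minimal sketch of the single static-table case mentioned above, the snippet below pulls one HTML table into rows with requests and Beautiful Soup; the URL is a hypothetical placeholder, and real harvesting should respect the legal and ethical constraints the paper discusses.

    # Minimal web-harvesting sketch: read one static HTML table into a list of rows.
    # The URL is a placeholder; check a site's terms of use before harvesting it.
    import requests
    from bs4 import BeautifulSoup

    url = "https://example.org/good-country-index.html"   # hypothetical page with one table
    soup = BeautifulSoup(requests.get(url, timeout=30).text, "html.parser")

    rows = []
    for tr in soup.find("table").find_all("tr"):
        cells = [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
        if cells:
            rows.append(cells)

    header, data = rows[0], rows[1:]
    print(header)
    print(len(data), "data rows harvested")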
Multilingual Information Retrieval in Thoracic Radiology: Feasibility Study
Castilla, André Coutinho; Furuie, Sérgio Shiguemi; Mendonça, Eneida A.
2014-01-01
Most of the essential information contained in the Electronic Medical Record is stored as text, imposing several difficulties on automated data extraction and retrieval. Natural language processing is an approach that can unlock clinical information from free texts. The proposed methodology uses the specialized natural language processor MEDLEE, developed for the English language. To use this processor on Portuguese medical texts, chest x-ray reports were machine translated into English. The result of serially coupling MT and NLP is tagged text, which needs further processing to extract clinical findings. The objective of this experiment was to investigate normal reports and reports with device descriptions in a set of 165 chest x-ray reports. We obtained sensitivity and specificity of 1 and 0.71 for the first condition and 0.97 and 0.97 for the second, respectively. The reference standard was formed by the opinion of two radiologists. The results of this experiment indicate the viability of extracting clinical findings from chest x-ray reports through the coupling of MT and NLP. PMID:17911745
DOE Office of Scientific and Technical Information (OSTI.GOV)
Petrie, G.M.; Perry, E.M.; Kirkham, R.R.
1997-09-01
This report describes the work performed at the Pacific Northwest National Laboratory (PNNL) for the U.S. Department of Energy's Office of Nonproliferation and National Security, Office of Research and Development (NN-20). The work supports the NN-20 Broad Area Search and Analysis, a program initiated by NN-20 to improve the detection and classification of undeclared weapons facilities. Ongoing PNNL research activities are described in three main components: image collection, information processing, and change analysis. The Multispectral Airborne Imaging System, which was developed to collect georeferenced imagery in the visible through infrared regions of the spectrum, and flown on a light aircraft platform, will supply current land use conditions. The image information extraction software (dynamic clustering and end-member extraction) uses imagery, like the multispectral data collected by the PNNL multispectral system, to efficiently generate landcover information. The advanced change detection uses a priori (benchmark) information, current landcover conditions, and user-supplied rules to rank suspect areas by probable risk of undeclared facilities or proliferation activities. These components, both separately and combined, provide important tools for improving the detection of undeclared facilities.
An Active Learning Framework for Hyperspectral Image Classification Using Hierarchical Segmentation
NASA Technical Reports Server (NTRS)
Zhang, Zhou; Pasolli, Edoardo; Crawford, Melba M.; Tilton, James C.
2015-01-01
Augmenting spectral data with spatial information for image classification has recently gained significant attention, as classification accuracy can often be improved by extracting spatial information from neighboring pixels. In this paper, we propose a new framework in which active learning (AL) and hierarchical segmentation (HSeg) are combined for spectral-spatial classification of hyperspectral images. The spatial information is extracted from a best segmentation obtained by pruning the HSeg tree using a new supervised strategy. The best segmentation is updated at each iteration of the AL process, thus taking advantage of informative labeled samples provided by the user. The proposed strategy incorporates spatial information in two ways: 1) concatenating the extracted spatial features and the original spectral features into a stacked vector and 2) extending the training set using a self-learning-based semi-supervised learning (SSL) approach. Finally, the two strategies are combined within an AL framework. The proposed framework is validated with two benchmark hyperspectral datasets. Higher classification accuracies are obtained by the proposed framework with respect to five other state-of-the-art spectral-spatial classification approaches. Moreover, the effectiveness of the proposed pruning strategy is also demonstrated relative to the approaches based on a fixed segmentation.
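A small numpy sketch of the first strategy (stacking segment-level spatial features with the original spectra) is given below; the segment labels are arbitrary placeholders rather than output of the supervised HSeg pruning described in the paper.

    # Sketch of strategy (1): concatenate each pixel's spectrum with the mean spectrum of
    # its segment. Segment labels are random placeholders, not an HSeg segmentation.
    import numpy as np

    rng = np.random.default_rng(1)
    n_pixels, n_bands = 500, 50
    spectra = rng.normal(size=(n_pixels, n_bands))    # original spectral features
    segments = rng.integers(0, 20, size=n_pixels)     # placeholder segment label per pixel

    segment_means = np.zeros_like(spectra)
    for seg_id in np.unique(segments):
        mask = segments == seg_id
        segment_means[mask] = spectra[mask].mean(axis=0)   # spatial feature: segment mean

    stacked = np.hstack([spectra, segment_means])          # stacked spectral-spatial vector
    print(stacked.shape)                                    # (500, 100)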
NASA Astrophysics Data System (ADS)
Soilán, M.; Riveiro, B.; Sánchez-Rodríguez, A.; González-deSantos, L. M.
2018-05-01
During the last few years, there has been huge methodological development regarding the automatic processing of 3D point cloud data acquired by both terrestrial and aerial mobile mapping systems, motivated by the improvement of surveying technologies and hardware performance. This paper presents a methodology that first extracts geometric and semantic information regarding the road markings within the surveyed area from Mobile Laser Scanning (MLS) data, and then employs it to isolate street areas where pedestrian crossings are found and, therefore, pedestrians are more likely to cross the road. Different safety-related features can then be extracted in order to offer information about the adequacy of the pedestrian crossing regarding its safety, which can be displayed in a Geographical Information System (GIS) layer. These features are defined in four different processing modules: accessibility analysis, traffic lights classification, traffic signs classification, and visibility analysis. The validation of the proposed methodology has been carried out in two different cities in the northwest of Spain, obtaining both quantitative and qualitative results for pedestrian crossing classification and for each processing module of the safety assessment on pedestrian crossing environments.
Thermodynamical detection of entanglement by Maxwell's demons
DOE Office of Scientific and Technical Information (OSTI.GOV)
Maruyama, Koji; Vedral, Vlatko; Morikoshi, Fumiaki
2005-01-01
Quantum correlation, or entanglement, is now believed to be an indispensable physical resource for certain tasks in quantum information processing, for which classically correlated states cannot be useful. Besides information processing, what kind of physical processes can exploit entanglement? In this paper, we show that there is indeed a more basic relationship between entanglement and its usefulness in thermodynamics. We derive an inequality showing that we can extract more work out of a heat bath via entangled systems than via classically correlated ones. We also analyze the work balance of the process as a heat engine, in connection with the second law of thermodynamics.
Developing Data Systems To Support the Analysis and Development of Large-Scale, On-Line Assessment.
ERIC Educational Resources Information Center
Yu, Chong Ho
Today many data warehousing systems are data rich, but information poor. Extracting useful information from an ocean of data to support administrative, policy, and instructional decisions becomes a major challenge to both database designers and measurement specialists. This paper focuses on the development of a data processing system that…
D'Avolio, Leonard W; Nguyen, Thien M; Goryachev, Sergey; Fiore, Louis D
2011-01-01
Despite at least 40 years of promising empirical performance, very few clinical natural language processing (NLP) or information extraction systems currently contribute to medical science or care. The authors address this gap by reducing the need for custom software and rules development with a graphical user interface-driven, highly generalizable approach to concept-level retrieval. A 'learn by example' approach combines features derived from open-source NLP pipelines with open-source machine learning classifiers to automatically and iteratively evaluate top-performing configurations. The Fourth i2b2/VA Shared Task Challenge's concept extraction task provided the data sets and metrics used to evaluate performance. Top F-measure scores for each of the tasks were medical problems (0.83), treatments (0.82), and tests (0.83). Recall lagged precision in all experiments. Precision was near or above 0.90 in all tasks. With no customization for the tasks and less than 5 min of end-user time to configure and launch each experiment, the average F-measure was 0.83, one point behind the mean F-measure of the 22 entrants in the competition. Strong precision scores indicate the potential of applying the approach to more specific clinical information extraction tasks. There was not one best configuration, supporting an iterative approach to model creation. Acceptable levels of performance can be achieved using fully automated and generalizable approaches to concept-level information extraction. The described implementation and related documentation is available for download.
Raja, Kalpana; Natarajan, Jeyakumar
2018-07-01
Extraction of protein phosphorylation information from biomedical literature has gained much attention because of its importance in numerous biological processes. In this study, we propose a text mining methodology which consists of two phases, NLP parsing and SVM classification, to extract phosphorylation information from the literature. First, using NLP parsing we divide the data into three base-forms depending on the biomedical entities related to phosphorylation and further classify them into ten sub-forms based on their distribution with the phosphorylation keyword. Next, we extract the phosphorylation entity singles/pairs/triplets and apply SVM to classify the extracted singles/pairs/triplets using a set of features applicable to each sub-form. The performance of our methodology was evaluated on three corpora, namely PLC, iProLink and the hPP corpus. We obtained promising results of >85% F-score on ten sub-forms of training datasets in cross-validation tests. Our system achieved an overall F-score of 93.0% on iProLink and 96.3% on hPP corpus test datasets. Furthermore, our proposed system achieved the best performance in cross-corpus evaluation and outperformed the existing system with a recall of 90.1%. The performance analysis of our system on three corpora reveals that it extracts protein phosphorylation information efficiently from both non-organism-specific general datasets such as PLC and iProLink, and human-specific datasets such as the hPP corpus. Copyright © 2018 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Dogon-yaro, M. A.; Kumar, P.; Rahman, A. Abdul; Buyuksalih, G.
2016-10-01
Timely and accurate acquisition of information on the condition and structural changes of urban trees serves as a tool for decision makers to better appreciate urban ecosystems and their numerous values, which are critical to building up strategies for sustainable development. The conventional techniques used for extracting tree features include ground surveying and interpretation of aerial photography. However, these techniques are associated with constraints such as labour-intensive field work, high financial requirements, and the influence of weather conditions and topographic cover, which can be overcome by means of integrated airborne-based LiDAR and very high resolution digital image datasets. This study presents a semi-automated approach for extracting urban trees from integrated airborne-based LiDAR and multispectral digital image datasets over the city of Istanbul, Turkey. The scheme includes detection and extraction of shadow-free vegetation features based on spectral properties of digital images using shadow index and NDVI techniques, and automated extraction of 3D information about vegetation features from the integrated processing of the shadow-free vegetation image and LiDAR point cloud datasets. The performance of the developed algorithms shows promising results as an automated and cost-effective approach to estimating and delineating 3D information of urban trees. The research also proved that integrated datasets are a suitable technology and a viable source of information for city managers to use in urban tree management.
Photon extraction and conversion for scalable ion-trap quantum computing
NASA Astrophysics Data System (ADS)
Clark, Susan; Benito, Francisco; McGuinness, Hayden; Stick, Daniel
2014-03-01
Trapped ions represent one of the most mature and promising systems for quantum information processing. They have high-fidelity one- and two-qubit gates, long coherence times, and their qubit states can be reliably prepared and detected. Taking advantage of these inherent qualities in a system with many ions requires a means of entangling spatially separated ion qubits. One architecture achieves this entanglement through the use of emitted photons to distribute quantum information - a favorable strategy if photon extraction can be made efficient and reliable. Here I present results for photon extraction from an ion in a cavity formed by integrated optics on a surface trap, as well as results in frequency converting extracted photons for long distance transmission or interfering with photons from other types of optically active qubits. Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U. S. Department of Energy's National Nuclear Security Administration under contract DE-AC04-94AL85000.
Scholarly Information Extraction Is Going to Make a Quantum Leap with PubMed Central (PMC).
Matthies, Franz; Hahn, Udo
2017-01-01
With the increasing availability of complete full texts (journal articles), rather than their surrogates (titles, abstracts), as resources for text analytics, entirely new opportunities arise for information extraction and text mining from scholarly publications. Yet, we gathered evidence that a range of problems are encountered for full-text processing when biomedical text analytics simply reuse existing NLP pipelines which were developed on the basis of abstracts (rather than full texts). We conducted experiments with four different relation extraction engines, all of which were top performers in previous BioNLP Event Extraction Challenges. We found that abstract-trained engines lose up to 6.6 F-score points when run on full-text data. Hence, the reuse of existing abstract-based NLP software in a full-text scenario is considered harmful because of heavy performance losses. Given the current lack of annotated full-text resources to train on, our study quantifies the price paid for this shortcut.
Jonnalagadda, Siddhartha; Gonzalez, Graciela
2010-11-13
BioSimplify is an open source tool written in Java that introduces and facilitates the use of a novel model for sentence simplification tuned for automatic discourse analysis and information extraction (as opposed to sentence simplification for improving human readability). The model is based on a "shot-gun" approach that produces many different (simpler) versions of the original sentence by combining variants of its constituent elements. This tool is optimized for processing biomedical scientific literature such as the abstracts indexed in PubMed. We tested our tool on its impact to the task of PPI extraction and it improved the f-score of the PPI tool by around 7%, with an improvement in recall of around 20%. The BioSimplify tool and test corpus can be downloaded from https://biosimplify.sourceforge.net.
Directional dual-tree rational-dilation complex wavelet transform.
Serbes, Gorkem; Gulcur, Halil Ozcan; Aydin, Nizamettin
2014-01-01
Dyadic discrete wavelet transform (DWT) has been used successfully in processing signals having non-oscillatory transient behaviour. However, due to the low Q-factor property of their wavelet atoms, the dyadic DWT is less effective in processing oscillatory signals such as embolic signals (ESs). ESs are extracted from quadrature Doppler signals, which are the output of Doppler ultrasound systems. In order to process ESs, firstly, a pre-processing operation known as phase filtering for obtaining directional signals from quadrature Doppler signals must be employed. Only then, wavelet based methods can be applied to these directional signals for further analysis. In this study, a directional dual-tree rational-dilation complex wavelet transform, which can be applied directly to quadrature signals and has the ability of extracting directional information during analysis, is introduced.
Fernandez-Ricaud, Luciano; Kourtchenko, Olga; Zackrisson, Martin; Warringer, Jonas; Blomberg, Anders
2016-06-23
Phenomics is a field in functional genomics that records variation in organismal phenotypes in the genetic, epigenetic or environmental context at a massive scale. For microbes, the key phenotype is the growth in population size because it contains information that is directly linked to fitness. Due to technical innovations and extensive automation our capacity to record complex and dynamic microbial growth data is rapidly outpacing our capacity to dissect and visualize this data and extract the fitness components it contains, hampering progress in all fields of microbiology. To automate visualization, analysis and exploration of complex and highly resolved microbial growth data as well as standardized extraction of the fitness components it contains, we developed the software PRECOG (PREsentation and Characterization Of Growth-data). PRECOG allows the user to quality control, interact with and evaluate microbial growth data with ease, speed and accuracy, also in cases of non-standard growth dynamics. Quality indices filter high- from low-quality growth experiments, reducing false positives. The pre-processing filters in PRECOG are computationally inexpensive and yet functionally comparable to more complex neural network procedures. We provide examples where data calibration, project design and feature extraction methodologies have a clear impact on the estimated growth traits, emphasising the need for proper standardization in data analysis. PRECOG is a tool that streamlines growth data pre-processing, phenotypic trait extraction, visualization, distribution and the creation of vast and informative phenomics databases.
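A tiny numpy sketch of one such fitness component, the maximum specific growth rate estimated as the steepest log-linear slope over a sliding window, is shown below; the optical-density series is synthetic and PRECOG's calibration and quality-control steps are omitted.

    # Estimate maximum specific growth rate from a synthetic OD time series by fitting
    # log(OD) over a sliding window and taking the steepest slope. Not PRECOG itself.
    import numpy as np

    t = np.arange(0, 24, 0.5)                                                # hours
    od = 0.05 * np.exp(0.4 * t) / (1 + 0.05 * (np.exp(0.4 * t) - 1) / 1.2)   # logistic-like curve

    log_od = np.log(od)
    window = 6                                                               # points per fit window
    slopes = [np.polyfit(t[i:i + window], log_od[i:i + window], 1)[0]
              for i in range(len(t) - window)]
    print("max specific growth rate approx.", round(max(slopes), 2), "per hour")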
Big Data Mining and Adverse Event Pattern Analysis in Clinical Drug Trials.
Federer, Callie; Yoo, Minjae; Tan, Aik Choon
2016-12-01
Drug adverse events (AEs) are a major health threat to patients seeking medical treatment and a significant barrier in drug discovery and development. AEs are now required to be submitted during clinical trials and can be extracted from ClinicalTrials.gov (https://clinicaltrials.gov/), a database of clinical studies around the world. By extracting drug and AE information from ClinicalTrials.gov and structuring it into a database, drug-AE relationships could be established for future drug development and repositioning. To our knowledge, current AE databases contain mainly U.S. Food and Drug Administration (FDA)-approved drugs. However, our database contains both FDA-approved and experimental compounds extracted from ClinicalTrials.gov. Our database contains 8,161 clinical trials of 3,102,675 patients and 713,103 reported AEs. We extracted the information from ClinicalTrials.gov using a set of Python scripts, and then used regular expressions and a drug dictionary to process and structure relevant information into a relational database. We performed data mining and pattern analysis of drug-AEs in our database. Our database can serve as a tool to assist researchers to discover drug-AE relationships for developing, repositioning, and repurposing drugs.
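A compressed sketch of that pipeline, matching dictionary drugs and reported AEs in a single trial record and loading the pairs into a relational table, is shown below; the record text, drug dictionary and regular expressions are invented placeholders, and the real scripts handle the full ClinicalTrials.gov structure.

    # Sketch: pair dictionary drugs with reported AEs in one trial record, store in SQLite.
    # The record text, drug dictionary and patterns are invented placeholders.
    import re
    import sqlite3

    drug_dictionary = {"aspirin", "metformin", "atorvastatin"}
    record = ("NCT00000001: Participants received metformin 500 mg. "
              "Reported adverse events: nausea (12 patients), headache (4 patients).")

    trial_id = re.match(r"(NCT\d{8})", record).group(1)
    drugs = [d for d in drug_dictionary if re.search(rf"\b{d}\b", record, re.IGNORECASE)]
    aes = re.findall(r"([a-z ]+?)\s*\((\d+) patients?\)", record, re.IGNORECASE)

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE drug_ae (trial_id TEXT, drug TEXT, ae TEXT, n_patients INT)")
    for drug in drugs:
        for ae, n in aes:
            conn.execute("INSERT INTO drug_ae VALUES (?, ?, ?, ?)",
                         (trial_id, drug, ae.strip(), int(n)))
    print(conn.execute("SELECT * FROM drug_ae").fetchall())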
The utility of an automated electronic system to monitor and audit transfusion practice.
Grey, D E; Smith, V; Villanueva, G; Richards, B; Augustson, B; Erber, W N
2006-05-01
Transfusion laboratories with transfusion committees have a responsibility to monitor transfusion practice and generate improvements in clinical decision-making and red cell usage. However, this can be problematic and expensive because data cannot be readily extracted from most laboratory information systems. To overcome this problem, we developed and introduced a system to electronically extract and collate extensive amounts of data from two laboratory information systems and to link it with ICD10 clinical codes in a new database using standard information technology. Three data files were generated from two laboratory information systems, ULTRA (version 3.2) and TM, using standard information technology scripts. These were patient pre- and post-transfusion haemoglobin, blood group and antibody screen, and cross-match and transfusion data. These data, together with ICD10 codes for surgical cases, were imported into an MS Access database and linked by means of a unique laboratory number. Queries were then run to extract the relevant information, which was processed in Microsoft Excel for graphical presentation. We assessed the utility of this data extraction system to audit transfusion practice in a 600-bed adult tertiary hospital over an 18-month period. A total of 52 MB of data were extracted from the two laboratory information systems for the 18-month period and, together with 2.0 MB of theatre ICD10 data, enabled case-specific transfusion information to be generated. The audit evaluated 15,992 blood group and antibody screens, 25,344 cross-matched red cell units and 15,455 transfused red cell units. Data evaluated included cross-match to transfusion ratios and pre- and post-transfusion haemoglobin levels for a range of clinical diagnoses. Data showed significant differences between clinical units and by ICD10 code. This method of electronically extracting large amounts of data and linking it with clinical databases has provided a powerful and sustainable tool for monitoring transfusion practice. It has been successfully used to identify areas requiring education, training and clinical guidance and allows for comparison with national haemoglobin-based transfusion guidelines.
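A small pandas sketch of the linkage step and one derived indicator (a cross-match to transfusion ratio per ICD10 code) follows; the table contents are invented placeholders rather than ULTRA or TM extracts.

    # Sketch: link lab extracts to ICD10 codes via the laboratory number, then compute a
    # cross-matched-to-transfused (C:T) ratio per ICD10 code. Rows are invented placeholders.
    import pandas as pd

    crossmatch = pd.DataFrame({
        "lab_no": [101, 102, 103, 104],
        "units_crossmatched": [4, 2, 6, 2],
        "units_transfused": [2, 2, 2, 0],
    })
    icd10 = pd.DataFrame({
        "lab_no": [101, 102, 103, 104],
        "icd10": ["C18.9", "O82", "C18.9", "O82"],
    })

    linked = crossmatch.merge(icd10, on="lab_no")
    by_code = linked.groupby("icd10")[["units_crossmatched", "units_transfused"]].sum()
    by_code["ct_ratio"] = by_code["units_crossmatched"] / by_code["units_transfused"]
    print(by_code)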
IMAGE 100: The interactive multispectral image processing system
NASA Technical Reports Server (NTRS)
Schaller, E. S.; Towles, R. W.
1975-01-01
The need for rapid, cost-effective extraction of useful information from vast quantities of multispectral imagery available from aircraft or spacecraft has resulted in the design, implementation and application of a state-of-the-art processing system known as IMAGE 100. Operating on the general principle that all objects or materials possess unique spectral characteristics or signatures, the system uses this signature uniqueness to identify similar features in an image by simultaneously analyzing signatures in multiple frequency bands. Pseudo-colors, or themes, are assigned to features having identical spectral characteristics. These themes are displayed on a color CRT, and may be recorded on tape, film, or other media. The system was designed to incorporate key features such as interactive operation, user-oriented displays and controls, and rapid-response machine processing. Owing to these features, the user can readily control and/or modify the analysis process based on his knowledge of the input imagery. Effective use can be made of conventional photographic interpretation skills and state-of-the-art machine analysis techniques in the extraction of useful information from multispectral imagery. This approach results in highly accurate multitheme classification of imagery in seconds or minutes rather than the hours often involved in processing using other means.
Aqueous Chloride Operations Overview: Plutonium and Americium Purification/Recovery
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gardner, Kyle Shelton; Kimball, David Bryan; Skidmore, Bradley Evan
These are a set of slides intended for an information session as part of recruiting activities at Brigham Young University. They give an overview of aqueous chloride operations, specifically plutonium and americium purification/recovery. The presentation details the steps taken to perform these processes, from plutonium size reduction, dissolution, solvent extraction, and oxalate precipitation to calcination. For americium recovery, it details the CLEAR (chloride extraction and actinide recovery) Line, oxalate precipitation and calcination.
Intelligent Vision On The SM9O Mini-Computer Basis And Applications
NASA Astrophysics Data System (ADS)
Hawryszkiw, J.
1985-02-01
A distinction has to be made between image processing and vision. Image processing finds its roots in the strong tradition of linear signal processing and promotes geometrical transform techniques, such as filtering, compression, and restoration. Its purpose is to transform an image so that a human observer can easily extract information significant to him, for example edges after a gradient operator, or a specific direction after a directional filtering operation. Image processing consists in fact in a set of local or global space-time transforms; the interpretation of the final image is done by the human observer. The purpose of vision is to extract the semantic content of the image. The machine can then understand that content and run a decision process, which turns into an action. Thus, intelligent vision depends on image processing, pattern recognition, and artificial intelligence.
White, Brittany L; Howard, Luke R; Prior, Ronald L
2011-05-11
Juice is the most common form in which cranberries are consumed; however there is limited information on the changes of polyphenolic content of the berries during juice processing. This study investigated the effects of three different pretreatments (grinding plus blanching; only grinding; only blanching) for cranberry juice processing on the concentrations of anthocyanins, flavonols, and procyanidins throughout processing. Flavonols and procyanidins were retained in the juice to a greater extent than anthocyanins, and pressing resulted in the most significant losses in polyphenolics due to removal of the seeds and skins. Flavonol aglycones were formed during processing as a result of heat treatment. Drying of cranberry pomace resulted in increased extraction of flavonols and procyanidin oligomers but lower extraction of polymeric procyanidins. The results indicate that cranberry polyphenolics are relatively stable during processing compared to other berries; however, more work is needed to determine their fate during storage of juices.
NASA Astrophysics Data System (ADS)
Lu, Mujie; Shang, Wenjie; Ji, Xinkai; Hua, Mingzhuang; Cheng, Kuo
2015-12-01
Nowadays, intelligent transportation systems (ITS) have become the new direction of transportation development. Traffic data, as a fundamental part of intelligent transportation systems, plays an increasingly crucial role. In recent years, video observation technology has been widely used in the field of traffic information collection. Traffic flow information contained in video data has many advantages: it is comprehensive and can be stored for a long time. However, there are still many problems, such as low precision and high cost, in the process of collecting this information. Aiming at these problems, this paper proposes a traffic target detection method with broad applicability. Based on three different ways of getting video data, namely aerial photography, fixed camera and handheld camera, we develop intelligent analysis software which can be used to extract the macroscopic and microscopic traffic flow information in the video, and the information can be used for traffic analysis and transportation planning. For road intersections, the system uses the frame difference method to extract traffic information; for freeway sections, the system uses the optical flow method to track vehicles. The system was applied in Nanjing, Jiangsu province, and the application shows that the system extracts different types of traffic flow information with high accuracy; it can meet the needs of traffic engineering observations and has good application prospects.
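A bare-bones OpenCV frame-difference sketch of the intersection case is given below; the video path, threshold and area cut-off are placeholders, and the actual software adds tracking and traffic-parameter extraction on top of this step.

    # Bare-bones frame-difference detection of moving objects in an intersection video.
    # The video path, threshold and minimum contour area are placeholder values.
    import cv2

    cap = cv2.VideoCapture("intersection.mp4")          # hypothetical input video
    ok, prev = cap.read()
    prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        diff = cv2.absdiff(gray, prev_gray)              # frame difference
        _, mask = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        moving = [c for c in contours if cv2.contourArea(c) > 200]   # drop small noise
        print(len(moving), "moving objects in this frame")
        prev_gray = gray

    cap.release()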
Federal Register 2010, 2011, 2012, 2013, 2014
2013-12-24
...: http://www.epa.gov/dockets . Abstract: The sources subject to this rule (i.e., extraction plants, ceramic plants, foundries, incinerators, propellant plants, and machine shops which process beryllium and...
Beetz, M Jerome; Hechavarría, Julio C; Kössl, Manfred
2016-06-30
Precise temporal coding is necessary for proper acoustic analysis. However, at cortical level, forward suppression appears to limit the ability of neurons to extract temporal information from natural sound sequences. Here we studied how temporal processing can be maintained in the bats' cortex in the presence of suppression evoked by natural echolocation streams that are relevant to the bats' behavior. We show that cortical neurons tuned to target-distance actually profit from forward suppression induced by natural echolocation sequences. These neurons can more precisely extract target distance information when they are stimulated with natural echolocation sequences than during stimulation with isolated call-echo pairs. We conclude that forward suppression does for time domain tuning what lateral inhibition does for selectivity forms such as auditory frequency tuning and visual orientation tuning. When talking about cortical processing, suppression should be seen as a mechanistic tool rather than a limiting element.
3D electron tomography of pretreated biomass informs atomic modeling of cellulose microfibrils.
Ciesielski, Peter N; Matthews, James F; Tucker, Melvin P; Beckham, Gregg T; Crowley, Michael F; Himmel, Michael E; Donohoe, Bryon S
2013-09-24
Fundamental insights into the macromolecular architecture of plant cell walls will elucidate new structure-property relationships and facilitate optimization of catalytic processes that produce fuels and chemicals from biomass. Here we introduce computational methodology to extract nanoscale geometry of cellulose microfibrils within thermochemically treated biomass directly from electron tomographic data sets. We quantitatively compare the cell wall nanostructure in corn stover following two leading pretreatment strategies: dilute acid with iron sulfate co-catalyst and ammonia fiber expansion (AFEX). Computational analysis of the tomographic data is used to extract mathematical descriptions for longitudinal axes of cellulose microfibrils from which we calculate their nanoscale curvature. These nanostructural measurements are used to inform the construction of atomistic models that exhibit features of cellulose within real, process-relevant biomass. By computational evaluation of these atomic models, we propose relationships between the crystal structure of cellulose Iβ and the nanoscale geometry of cellulose microfibrils.
A mask quality control tool for the OSIRIS multi-object spectrograph
NASA Astrophysics Data System (ADS)
López-Ruiz, J. C.; Vaz Cedillo, Jacinto Javier; Ederoclite, Alessandro; Bongiovanni, Ángel; González Escalera, Víctor
2012-09-01
The OSIRIS multi-object spectrograph uses a set of user-customised masks, which are manufactured on demand. The manufacturing process consists of drilling the specified slits on the mask with the required accuracy. Ensuring that slits are in the right place when observing is of vital importance. We present a tool for checking the quality of the mask manufacturing process, based on analyzing the instrument images obtained with the manufactured masks in place. The tool extracts the slit information from these images, relates the specifications with the extracted slit information, and finally tells the operator whether the manufactured mask fulfills the expectations of the mask designer. The proposed tool has been built using scripting languages and standard libraries such as opencv, pyraf and scipy. The software architecture, advantages and limits of this tool in the lifecycle of a multi-object acquisition are presented.
Exploring patterns of epigenetic information with data mining techniques.
Aguiar-Pulido, Vanessa; Seoane, José A; Gestal, Marcos; Dorado, Julián
2013-01-01
Data mining, a part of the Knowledge Discovery in Databases process (KDD), is the process of extracting patterns from large data sets by combining methods from statistics and artificial intelligence with database management. Analyses of epigenetic data have evolved towards genome-wide and high-throughput approaches, thus generating great amounts of data for which data mining is essential. Part of these data may contain patterns of epigenetic information which are mitotically and/or meiotically heritable determining gene expression and cellular differentiation, as well as cellular fate. Epigenetic lesions and genetic mutations are acquired by individuals during their life and accumulate with ageing. Both defects, either together or individually, can result in losing control over cell growth and, thus, causing cancer development. Data mining techniques could be then used to extract the previous patterns. This work reviews some of the most important applications of data mining to epigenetics.
Using decision-tree classifier systems to extract knowledge from databases
NASA Technical Reports Server (NTRS)
St.clair, D. C.; Sabharwal, C. L.; Hacke, Keith; Bond, W. E.
1990-01-01
One difficulty in applying artificial intelligence techniques to the solution of real world problems is that the development and maintenance of many AI systems, such as those used in diagnostics, require large amounts of human resources. At the same time, databases frequently exist which contain information about the process(es) of interest. Recently, efforts to reduce development and maintenance costs of AI systems have focused on using machine learning techniques to extract knowledge from existing databases. Research is described in the area of knowledge extraction using a class of machine learning techniques called decision-tree classifier systems. Results of this research suggest ways of performing knowledge extraction which may be applied in numerous situations. In addition, a measurement called the concept strength metric (CSM) is described which can be used to determine how well the resulting decision tree can differentiate between the concepts it has learned. The CSM can be used to determine whether or not additional knowledge needs to be extracted from the database. An experiment involving real world data is presented to illustrate the concepts described.
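The compact scikit-learn sketch below illustrates the general idea of inducing a decision tree from database records and reading off the learned rules; the records are invented and the paper's concept strength metric is not implemented here.

    # Sketch: induce a decision tree from tabular diagnostic records and print its rules.
    # The records are invented placeholders; the concept strength metric is not shown.
    from sklearn.tree import DecisionTreeClassifier, export_text

    # feature columns: temperature_high, vibration_high, pressure_low
    X = [[1, 0, 0], [1, 1, 0], [0, 0, 1], [0, 1, 1], [0, 0, 0], [1, 1, 1]]
    y = ["overheat", "overheat", "leak", "leak", "normal", "overheat"]

    tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
    print(export_text(tree, feature_names=["temperature_high", "vibration_high",
                                           "pressure_low"]))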
Using texts in science education: cognitive processes and knowledge representation.
van den Broek, Paul
2010-04-23
Texts form a powerful tool in teaching concepts and principles in science. How do readers extract information from a text, and what are the limitations in this process? Central to comprehension of and learning from a text is the construction of a coherent mental representation that integrates the textual information and relevant background knowledge. This representation engenders learning if it expands the reader's existing knowledge base or if it corrects misconceptions in this knowledge base. The Landscape Model captures the reading process and the influences of reader characteristics (such as working-memory capacity, reading goal, prior knowledge, and inferential skills) and text characteristics (such as content/structure of presented information, processing demands, and textual cues). The model suggests factors that can optimize--or jeopardize--learning science from text.
Enzymatic vegetable organic extracts as soil biochemical biostimulants and atrazine extenders.
García-Martínez, Ana María; Tejada, Manuel; Díaz, Ana Isabel; Rodríguez-Morgado, Bruno; Bautista, Juan; Parrado, Juan
2010-09-08
The purpose of this study was to gather information on the potential effects of organic biostimulants on soil activity and atrazine biodegradation. Carob germ enzymatic extract (CGEE) and wheat condensed distiller solubles enzymatic extract (WCDS-EE) have been obtained using an enzymatic process; their main organic components are soluble carbohydrates and proteins in the form of peptides and free amino acids. Their application to soil results in high biostimulation, rapidly increased dehydrogenase, phosphatase and glucosidase activities, and an observed atrazine extender capacity due to inhibition of its mineralization. The extender capacity of both extracts is proportional to the protein/carbohydrate ratio content. As a result, these enzymatic extracts are highly microbially available, leading to two independent phenomena, fertility and an atrazine persistence that is linked to increased soil activity.
Dillahunt-Aspillaga, Christina; Finch, Dezon; Massengale, Jill; Kretzmer, Tracy; Luther, Stephen L.; McCart, James A.
2014-01-01
Objective The purpose of this pilot study is 1) to develop an annotation schema and a training set of annotated notes to support the future development of a natural language processing (NLP) system to automatically extract employment information, and 2) to determine if information about employment status, goals and work-related challenges reported by service members and Veterans with mild traumatic brain injury (mTBI) and post-deployment stress can be identified in the Electronic Health Record (EHR). Design Retrospective cohort study using data from selected progress notes stored in the EHR. Setting Post-deployment Rehabilitation and Evaluation Program (PREP), an in-patient rehabilitation program for Veterans with TBI at the James A. Haley Veterans' Hospital in Tampa, Florida. Participants Service members and Veterans with TBI who participated in the PREP program (N = 60). Main Outcome Measures Documentation of employment status, goals, and work-related challenges reported by service members and recorded in the EHR. Results Two hundred notes were examined and unique vocational information was found indicating a variety of self-reported employment challenges. Current employment status and future vocational goals along with information about cognitive, physical, and behavioral symptoms that may affect return-to-work were extracted from the EHR. The annotation schema developed for this study provides an excellent tool upon which NLP studies can be developed. Conclusions Information related to employment status and vocational history is stored in text notes in the EHR system. Information stored in text does not lend itself to easy extraction or summarization for research and rehabilitation planning purposes. Development of NLP systems to automatically extract text-based employment information provides data that may improve the understanding and measurement of employment in this important cohort. PMID:25541956
Ortega-Martorell, Sandra; Ruiz, Héctor; Vellido, Alfredo; Olier, Iván; Romero, Enrique; Julià-Sapé, Margarida; Martín, José D.; Jarman, Ian H.; Arús, Carles; Lisboa, Paulo J. G.
2013-01-01
Background The clinical investigation of human brain tumors often starts with a non-invasive imaging study, providing information about the tumor extent and location, but little insight into the biochemistry of the analyzed tissue. Magnetic Resonance Spectroscopy can complement imaging by supplying a metabolic fingerprint of the tissue. This study analyzes single-voxel magnetic resonance spectra, which represent signal information in the frequency domain. Given that a single voxel may contain a heterogeneous mix of tissues, signal source identification is a relevant challenge for the problem of tumor type classification from the spectroscopic signal. Methodology/Principal Findings Non-negative matrix factorization techniques have recently shown their potential for the identification of meaningful sources from brain tissue spectroscopy data. In this study, we use a convex variant of these methods that is capable of handling negatively-valued data and generating sources that can be interpreted as tumor class prototypes. A novel approach to convex non-negative matrix factorization is proposed, in which prior knowledge about class information is utilized in model optimization. Class-specific information is integrated into this semi-supervised process by setting the metric of a latent variable space where the matrix factorization is carried out. The reported experimental study comprises 196 cases from different tumor types drawn from two international, multi-center databases. The results indicate that the proposed approach outperforms a purely unsupervised process by achieving near perfect correlation of the extracted sources with the mean spectra of the tumor types. It also improves tissue type classification. Conclusions/Significance We show that source extraction by unsupervised matrix factorization benefits from the integration of the available class information, so operating in a semi-supervised learning manner, for discriminative source identification and brain tumor labeling from single-voxel spectroscopy data. We are confident that the proposed methodology has wider applicability for biomedical signal processing. PMID:24376744
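As a simplified illustration of matrix-factorization source extraction, the snippet below runs scikit-learn's standard NMF on synthetic non-negative "spectra"; note that the paper's method is a convex, semi-supervised variant that also handles negative values, which plain NMF does not.

    # Simplified source extraction with standard NMF on synthetic non-negative mixtures.
    # The paper uses a convex, semi-supervised variant; plain NMF requires X >= 0.
    import numpy as np
    from sklearn.decomposition import NMF

    rng = np.random.default_rng(2)
    sources = rng.random((3, 200))        # 3 hidden "tissue type" spectra (placeholders)
    mixing = rng.random((60, 3))          # 60 voxels as mixtures of the three sources
    X = mixing @ sources + 0.01 * rng.random((60, 200))

    model = NMF(n_components=3, init="nndsvda", max_iter=500, random_state=0)
    W = model.fit_transform(X)            # per-voxel source contributions, shape (60, 3)
    H = model.components_                 # extracted source spectra, shape (3, 200)
    print(W.shape, H.shape)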
Electrochemical Probing through a Redox Capacitor To Acquire Chemical Information on Biothiols.
Liu, Zhengchun; Liu, Yi; Kim, Eunkyoung; Bentley, William E; Payne, Gregory F
2016-07-19
The acquisition of chemical information is a critical need for medical diagnostics, food/environmental monitoring, and national security. Here, we report an electrochemical information processing approach that integrates (i) complex electrical inputs/outputs, (ii) mediators to transduce the electrical I/O into redox signals that can actively probe the chemical environment, and (iii) a redox capacitor that manipulates signals for information extraction. We demonstrate the capabilities of this chemical information processing strategy using biothiols because of the emerging importance of these molecules in medicine and because their distinct chemical properties allow evaluation of hypothesis-driven information probing. We show that input sequences can be tailored to probe for chemical information both qualitatively (step inputs probe for thiol-specific signatures) and quantitatively. Specifically, we observed picomolar limits of detection and linear responses to concentrations over 5 orders of magnitude (1 pM-0.1 μM). This approach allows the capabilities of signal processing to be extended for rapid, robust, and on-site analysis of chemical information.
Automatic definition of the oncologic EHR data elements from NCIT in OWL.
Cuggia, Marc; Bourdé, Annabel; Turlin, Bruno; Vincendeau, Sebastien; Bertaud, Valerie; Bohec, Catherine; Duvauferrier, Régis
2011-01-01
Semantic interoperability based on ontologies allows systems to combine their information and process it automatically. The ability to extract meaningful fragments from an ontology is key for ontology re-use, and the construction of a subset will help to structure clinical data entries. The aim of this work is to provide a method for extracting a set of concepts for a specific domain, in order to help define the data elements of an oncologic EHR. A generic extraction algorithm was developed to extract from the NCIT, for a specific disease (i.e. prostate neoplasm), all the concepts of interest into a sub-ontology. We compared all the extracted concepts to the manually encoded concepts contained in the multi-disciplinary meeting report form (MDMRF). We extracted two sub-ontologies: sub-ontology 1 by using a single key concept and sub-ontology 2 by using 5 additional keywords. The coverage of the MDMRF concepts by sub-ontology 2 was 51%. The low rate of coverage is due to the lack of definition or misclassification of the NCIT concepts. By providing a subset of concepts focused on a particular domain, this extraction method helps optimize the binding process of data elements and maintain and enrich a domain ontology.
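A generic sketch of the sub-ontology extraction step, collecting all concepts reachable by subclass links from a set of key concepts, is shown below; the tiny concept graph is an invented stand-in for the NCIT hierarchy.

    # Generic sketch: collect a sub-ontology by walking subclass links from key concepts.
    # The toy parent-to-children map is an invented stand-in for the NCIT hierarchy.
    from collections import deque

    subclasses = {
        "Prostate Neoplasm": ["Prostate Carcinoma", "Benign Prostate Neoplasm"],
        "Prostate Carcinoma": ["Prostate Adenocarcinoma"],
        "Benign Prostate Neoplasm": [],
        "Prostate Adenocarcinoma": [],
    }

    def extract_subontology(key_concepts, subclasses):
        selected, queue = set(), deque(key_concepts)
        while queue:
            concept = queue.popleft()
            if concept not in selected:
                selected.add(concept)
                queue.extend(subclasses.get(concept, []))
        return selected

    print(extract_subontology(["Prostate Neoplasm"], subclasses))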
The Role of Mother in Informing Girls About Puberty: A Meta-Analysis Study
Sooki, Zahra; Shariati, Mohammad; Chaman, Reza; Khosravi, Ahmad; Effatpanah, Mohammad; Keramat, Afsaneh
2016-01-01
Context Family, especially the mother, has the most important role in the education, transfer of information, and health behaviors of girls in order for them to have a healthy transition from the critical stage of puberty, but there are different views in this regard. Objectives Considering the various findings about the source of information about puberty, a meta-analysis study was conducted to investigate the extent of the mother's role in informing girls about puberty. Data Sources This meta-analysis study was based on English articles published from 2000 to February 2015 in the Scopus, PubMed, and ScienceDirect databases and on Persian articles in the SID, Magiran, and Iran Medex databases, with predetermined keywords and their MeSH equivalents. Study Selection Quantitative cross-sectional articles were extracted by two independent researchers and finally 46 articles were selected based on the inclusion criteria. The STROBE checklist was used to evaluate the studies. Data Extraction The percentage of mothers as the current and preferred source of information about the process of puberty, menarche, and menstruation from the perspective of adolescent girls was extracted from the articles. The results of the studies were analyzed using meta-analysis (random effects model) and the studies' heterogeneity was analyzed using the I2 index. Variance between studies was analyzed using tau squared (Tau2) and Review Manager 5 software. Results The results showed that, from the perspective of teenage girls in Iran and other countries, in 56% of cases the mother was the current source of information about the process of puberty, menarche, and menstruation. The preferred source of information about the process of puberty, menarche, and menstruation was the mother in all studies at 60% (Iran 57% and other countries 66%). Conclusions According to the findings of this study, it is essential that health professionals and officials of the ministry of health train mothers about the time, trends, and factors affecting the start of puberty using a multi-dimensional approach that involves religious organizations, community groups, and peer groups. PMID:27331056
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nilsson, Mikael
This 3-year project was a collaboration between University of California Irvine (UC Irvine), Pacific Northwest National Laboratory (PNNL), Idaho National Laboratory (INL), Argonne National Laboratory (ANL) and an international collaborator at ForschungZentrum Jülich (FZJ). The project was led from UC Irvine under the direction of Profs. Mikael Nilsson and Hung Nguyen. The leads at PNNL, INL, ANL and FZJ were Dr. Liem Dang, Dr. Peter Zalupski, Dr. Nathaniel Hoyt and Dr. Giuseppe Modolo, respectively. Involved in this project at UC Irvine were three full-time PhD graduate students, Tro Babikian, Ted Yoo, and Quynh Vo, and one MS student, Alba Font Bosch. The overall objective of this project was to study how the kinetics and thermodynamics of metal ion extraction can be described by molecular dynamics (MD) simulations and how the simulations can be validated by experimental data. Furthermore, the project included applied separation work, testing the extraction systems in a single-stage annular centrifugal contactor and coupling the experimental data with computational fluid dynamics (CFD) simulations. Specific objectives of the proposed research were: study and establish a rigorous connection between MD simulations based on polarizable force fields and extraction thermodynamic and kinetic data; compare and validate CFD simulations of extraction processes for An/Ln separation using different sizes (and types) of annular centrifugal contactors; and provide a theoretical/simulation and experimental base for scale-up of batch-wise extraction to continuous contactors. We approached objectives 1 and 2 in parallel. For objective 1 we started by studying a well established extraction system with a relatively simple extraction mechanism, namely tributyl phosphate. What we found was that well optimized simulations can inform experiments, and new information on TBP behavior was presented in this project, as will be discussed below. The second objective proved a larger challenge and most of the efforts were devoted to experimental studies.
NASA Astrophysics Data System (ADS)
Susanto, Arif; Mulyono, Nur Budi
2018-02-01
The change of the environmental management system standard to its latest version, i.e. ISO 14001:2015, may change the data and information needed for decision making and for achieving the objectives within the organization's scope. Information management is the organization's responsibility to ensure effectiveness and efficiency from the creation, storage, and processing of information through to its distribution, in order to support operations and effective decision making in environmental performance management. The objective of this research was to set up an information management program at the PTFI Concentrating Division and to adopt technology as the supporting component of the program, so that it is in line with the organization's objectives in environmental management based on the ISO 14001:2015 environmental management system standard. The materials and methods covered technical aspects of information management, i.e. web-based application development using usage-centered design. The results showed that the use of Single Sign On makes it easier for users to interact with the environmental management system. The web-based application was developed by creating an entity relationship diagram (ERD) and by conducting information extraction focused on attributes, keys, and the determination of constraints; the ERD was derived from the relational database schemas of a number of environmental performance databases in the Concentrating Division.
ERIC Educational Resources Information Center
Mitri, Michel
2012-01-01
XML has become the most ubiquitous format for exchange of data between applications running on the Internet. Most Web Services provide their information to clients in the form of XML. The ability to process complex XML documents in order to extract relevant information is becoming as important a skill for IS students to master as querying…
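As a small illustration of the kind of XML information extraction such coursework targets, the sketch below uses only the Python standard library; the element and attribute names are invented for the example:

    import xml.etree.ElementTree as ET

    xml_payload = """<orders>
      <order id="1001"><customer>Acme</customer><total>250.00</total></order>
      <order id="1002"><customer>Globex</customer><total>99.50</total></order>
    </orders>"""

    root = ET.fromstring(xml_payload)
    # Pull only the fields of interest out of each record
    for order in root.findall("order"):
        print(order.get("id"), order.findtext("customer"), float(order.findtext("total")))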
The Effects of Using Corpora on Revision Tasks in L2 Writing with Coded Error Feedback
ERIC Educational Resources Information Center
Tono, Yukio; Satake, Yoshiho; Miura, Aika
2014-01-01
This study reports on the results of classroom research investigating the effects of corpus use in the process of revising compositions in English as a foreign language. Our primary aim was to investigate the relationship between the information extracted from corpus data and how that information actually helped in revising different types of…
ERIC Educational Resources Information Center
Baadte, Christiane; Rasch, Thorsten; Honstein, Helena
2015-01-01
The ability to flexibly allocate attention to goal-relevant information is pivotal for the completion of high-level cognitive processes. For instance, in comprehending illustrated texts, the reader permanently has to switch the attentional focus between the text and the corresponding picture in order to extract relevant information from both…
Local and global aspects of biological motion perception in children born at very low birth weight
Williamson, K. E.; Jakobson, L. S.; Saunders, D. R.; Troje, N. F.
2015-01-01
Biological motion perception can be assessed using a variety of tasks. In the present study, 8- to 11-year-old children born prematurely at very low birth weight (<1500 g) and matched, full-term controls completed tasks that required the extraction of local motion cues, the ability to perceptually group these cues to extract information about body structure, and the ability to carry out higher order processes required for action recognition and person identification. Preterm children exhibited difficulties in all 4 aspects of biological motion perception. However, intercorrelations between test scores were weak in both full-term and preterm children—a finding that supports the view that these processes are relatively independent. Preterm children also displayed more autistic-like traits than full-term peers. In preterm (but not full-term) children, these traits were negatively correlated with performance in the task requiring structure-from-motion processing, r(30) = −.36, p < .05, but positively correlated with the ability to extract identity, r(30) = .45, p < .05. These findings extend previous reports of vulnerability in systems involved in processing dynamic cues in preterm children and suggest that a core deficit in social perception/cognition may contribute to the development of social and behavioral difficulties even in members of this population who are functioning within the normal range intellectually. The results could inform the development of screening, diagnostic, and intervention tools. PMID:25103588
Raisutis, Renaldas; Samaitis, Vykintas
2017-01-01
This work proposes a novel hybrid signal processing technique to extract information on disbond-type defects from a single B-scan in the process of non-destructive testing (NDT) of glass fiber reinforced plastic (GFRP) material using ultrasonic guided waves (GW). The selected GFRP sample was a segment of a wind turbine blade with an aerodynamic shape. Two disbond-type defects having diameters of 15 mm and 25 mm were artificially constructed on its trailing edge. The experiment was performed using the low-frequency ultrasonic system developed at the Ultrasound Institute of Kaunas University of Technology, with access to only one side of the sample. A special configuration of the transmitting and receiving transducers fixed on a movable panel with a separation distance of 50 mm was proposed for recording the ultrasonic guided wave signals at each one-millimeter step along the scanning distance of up to 500 mm. Finally, the hybrid signal processing technique comprising the valuable features of the three most promising signal processing techniques, cross-correlation, wavelet transform, and Hilbert–Huang transform, was applied to the received signals for the extraction of defect information from a single B-scan image. The wavelet transform and cross-correlation techniques were combined in order to extract the approximate size and location of the defects and to measure time delays. Thereafter, the Hilbert–Huang transform was applied to the wavelet-transformed signal to compare the variation of instantaneous frequencies and instantaneous amplitudes of the defect-free and defective signals. PMID:29232845
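A simplified sketch of the three ingredients named above (cross-correlation for time delay, a wavelet decomposition for denoising, and a Hilbert transform for instantaneous amplitude), assuming NumPy, SciPy, and PyWavelets are available; it is not the authors' full hybrid B-scan procedure:

    import numpy as np
    import pywt
    from scipy.signal import correlate, hilbert

    def analyse_ascan(reference, received, fs, wavelet="db4", level=4):
        # Time delay of the received guided wave relative to the reference pulse
        lag = np.argmax(correlate(received, reference, mode="full")) - (len(reference) - 1)
        delay_s = lag / fs
        # Crude wavelet denoising: keep the approximation band, zero the detail bands
        coeffs = pywt.wavedec(received, wavelet, level=level)
        coeffs[1:] = [np.zeros_like(c) for c in coeffs[1:]]
        denoised = pywt.waverec(coeffs, wavelet)[: len(received)]
        # Instantaneous amplitude (envelope) from the analytic signal
        envelope = np.abs(hilbert(denoised))
        return delay_s, envelope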
A mobile unit for memory retrieval in daily life based on image and sensor processing
NASA Astrophysics Data System (ADS)
Takesumi, Ryuji; Ueda, Yasuhiro; Nakanishi, Hidenobu; Nakamura, Atsuyoshi; Kakimori, Nobuaki
2003-10-01
We developed a Mobile Unit whose purpose is to support memory retrieval in daily life. In this paper, we describe two characteristic components of this unit: (1) behavior classification with an acceleration sensor, and (2) extraction of environmental differences with image processing. In (1), by analyzing the power and frequency of an acceleration sensor aligned with the direction of gravity, the user's activities can be classified into walking, staying still, and so on. In (2), by extracting the difference between the beginning and ending scenes of a stay period with image processing, the outcome of the user's actions is recognized as a change in the environment. Using these two techniques, specific scenes of daily life can be extracted, and important information at scene changes can be recorded. In particular, we describe the effect of supporting the retrieval of important items, such as a thing left behind or a task left half finished.
NASA Astrophysics Data System (ADS)
Tsagaan, Baigalmaa; Abe, Keiichi; Goto, Masahiro; Yamamoto, Seiji; Terakawa, Susumu
2006-03-01
This paper presents a segmentation method of brain tissues from MR images, invented for our image-guided neurosurgery system under development. Our goal is to segment brain tissues for creating biomechanical model. The proposed segmentation method is based on 3-D region growing and outperforms conventional approaches by stepwise usage of intensity similarities between voxels in conjunction with edge information. Since the intensity and the edge information are complementary to each other in the region-based segmentation, we use them twice by performing a coarse-to-fine extraction. First, the edge information in an appropriate neighborhood of the voxel being considered is examined to constrain the region growing. The expanded region of the first extraction result is then used as the domain for the next processing. The intensity and the edge information of the current voxel only are utilized in the final extraction. Before segmentation, the intensity parameters of the brain tissues as well as partial volume effect are estimated by using expectation-maximization (EM) algorithm in order to provide an accurate data interpretation into the extraction. We tested the proposed method on T1-weighted MR images of brain and evaluated the segmentation effectiveness comparing the results with ground truths. Also, the generated meshes from the segmented brain volume by using mesh generating software are shown in this paper.
NASA Astrophysics Data System (ADS)
Jarihani, B.
2015-12-01
Digital Elevation Models (DEMs) that accurately replicate both landscape form and processes are critical to support modeling of environmental processes. Pre-processing analysis of DEMs and extraction of watershed characteristics (e.g., stream networks, catchment delineation, surface and subsurface flow paths) are essential for hydrological and geomorphic analysis and sediment transport. This study investigates the status of current remotely sensed DEMs in providing advanced morphometric information on drainage basins, particularly in data-sparse regions. Here we assess the accuracy of three available DEMs: (i) the hydrologically corrected "H-DEM" of Geoscience Australia derived from the Shuttle Radar Topography Mission (SRTM) data; (ii) the Advanced Spaceborne Thermal Emission and Reflection Radiometer Global Digital Elevation Model (ASTER GDEM) version 2 1-arc-second (~30 m) data; and (iii) the 9-arc-second national GEODATA DEM-9S ver3 from Geoscience Australia and the Australian National University. We used ESRI's geospatial data model, Arc Hydro and HEC-GeoHMS, designed for building hydrologic information systems to synthesize geospatial and temporal water resources data that support hydrologic modeling and analysis. A coastal catchment in northeast Australia was selected as the study site, where very high resolution LiDAR data are available for parts of the area as reference data to assess the accuracy of the other, lower resolution datasets. This study provides morphometric information for drainage basins as part of broader research on sediment flux from coastal basins to the Great Barrier Reef, Australia. After applying geo-referencing and elevation corrections, streams and sub-basins were delineated for each DEM. Then physical characteristics for streams (i.e., length, upstream and downstream elevation, and slope) and sub-basins (i.e., longest flow lengths, area, relief and slopes) were extracted and compared with reference datasets from LiDAR. Results showed that, in the absence of high-precision and high resolution DEM data, ASTER GDEM or SRTM DEM can be used to extract common morphometric relationships which are widely used for hydrological and geomorphological modelling.
An automatic rat brain extraction method based on a deformable surface model.
Li, Jiehua; Liu, Xiaofeng; Zhuo, Jiachen; Gullapalli, Rao P; Zara, Jason M
2013-08-15
The extraction of the brain from the skull in medical images is a necessary first step before image registration or segmentation. While pre-clinical MR imaging studies on small animals, such as rats, are increasing, fully automatic image processing techniques specific to small animal studies remain lacking. In this paper, we present an automatic rat brain extraction method, the Rat Brain Deformable model method (RBD), which adapts the popular human brain extraction tool (BET) through the incorporation of information on the brain geometry and MR image characteristics of the rat brain. The robustness of the method was demonstrated on T2-weighted MR images of 64 rats and compared with other brain extraction methods (BET, PCNN, PCNN-3D). The results demonstrate that RBD reliably extracts the rat brain with high accuracy (>92% volume overlap) and is robust against signal inhomogeneity in the images. Copyright © 2013 Elsevier B.V. All rights reserved.
Retrieval of Radiology Reports Citing Critical Findings with Disease-Specific Customization
Lacson, Ronilda; Sugarbaker, Nathanael; Prevedello, Luciano M; Ivan, IP; Mar, Wendy; Andriole, Katherine P; Khorasani, Ramin
2012-01-01
Background: Communication of critical results from diagnostic procedures between caregivers is a Joint Commission national patient safety goal. Evaluating critical result communication often requires manual analysis of voluminous data, especially when reviewing unstructured textual results of radiologic findings. Information retrieval (IR) tools can facilitate this process by enabling automated retrieval of radiology reports that cite critical imaging findings. However, IR tools that have been developed for one disease or imaging modality often need substantial reconfiguration before they can be utilized for another disease entity. Purpose: This paper: 1) describes the process of customizing two Natural Language Processing (NLP) and Information Retrieval/Extraction applications – an open-source toolkit, A Nearly New Information Extraction system (ANNIE), and an application developed in-house, Information for Searching Content with an Ontology-Utilizing Toolkit (iSCOUT) – to illustrate the varying levels of customization required for different disease entities; and 2) evaluates each application's performance in identifying and retrieving radiology reports citing critical imaging findings for three distinct diseases: pulmonary nodule, pneumothorax, and pulmonary embolus. Results: Both applications can be utilized for retrieval. iSCOUT and ANNIE had precision values between 0.90 and 0.98 and recall values between 0.79 and 0.94. ANNIE had consistently higher precision but required more customization. Conclusion: Understanding the customizations involved in utilizing NLP applications for various diseases will enable users to select the most suitable tool for specific tasks. PMID:22934127
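For readers unfamiliar with how such retrieval tools are scored, the sketch below retrieves reports citing a critical finding with a simple keyword pattern and computes precision and recall; the phrase list is illustrative and far cruder than either iSCOUT or ANNIE:

    import re

    CRITICAL = [r"\bpneumothorax\b", r"\bpulmonary embol(us|ism)\b", r"\bnew .{0,20}nodule\b"]
    pattern = re.compile("|".join(CRITICAL), re.IGNORECASE)

    def retrieve(reports):
        """Return the ids of reports that mention any critical finding."""
        return {rid for rid, text in reports.items() if pattern.search(text)}

    def precision_recall(retrieved, relevant):
        tp = len(retrieved & relevant)
        precision = tp / len(retrieved) if retrieved else 0.0
        recall = tp / len(relevant) if relevant else 0.0
        return precision, recall

    reports = {"r1": "New 8 mm pulmonary nodule in the right upper lobe.",
               "r2": "No acute cardiopulmonary abnormality."}
    hits = retrieve(reports)
    print(hits, precision_recall(hits, relevant={"r1"}))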
2016-01-01
A novel method of extracting heart rate and oxygen saturation from a video-based biosignal is described. The method comprises a novel modular continuous wavelet transform approach which includes: performing the transform, undertaking running wavelet archetyping to enhance the pulse information, extraction of the pulse ridge time–frequency information [and thus a heart rate (HRvid) signal], creation of a wavelet ratio surface, projection of the pulse ridge onto the ratio surface to determine the ratio of ratios from which a saturation trending signal is derived, and calibrating this signal to provide an absolute saturation signal (SvidO2). The method is illustrated through its application to a video photoplethysmogram acquired during a porcine model of acute desaturation. The modular continuous wavelet transform-based approach is advocated by the author as a powerful methodology to deal with noisy, non-stationary biosignals in general. PMID:27382479
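The 'ratio of ratios' step mentioned above has a simple classical time-domain counterpart; the sketch below derives a saturation trending value from two colour-channel photoplethysmograms (the calibration constants are placeholders, and the wavelet ratio surface of the paper is not reproduced):

    import numpy as np

    def saturation_trend(red, blue, cal_a=110.0, cal_b=25.0):
        """Classical ratio-of-ratios estimate from two colour-channel waveforms."""
        def ac_dc(x):
            x = np.asarray(x, dtype=float)
            return x.max() - x.min(), x.mean()        # pulsatile and baseline components
        ac_r, dc_r = ac_dc(red)
        ac_b, dc_b = ac_dc(blue)
        ratio_of_ratios = (ac_r / dc_r) / (ac_b / dc_b)
        return cal_a - cal_b * ratio_of_ratios        # linear calibration, placeholder constants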
Latent Dirichlet Allocation (LDA) Model and kNN Algorithm to Classify Research Project Selection
NASA Astrophysics Data System (ADS)
Safi’ie, M. A.; Utami, E.; Fatta, H. A.
2018-03-01
Universitas Sebelas Maret has a teaching staff of more than 1500 people, and one of its tasks is to carry out research. On the other hand, funding support for research and community service is limited, so submissions of research and community service (P2M) proposals need to be evaluated and selected. At the selection stage, research proposal documents are collected as unstructured data, and the volume of stored data is very large. Extracting the information contained in these documents requires text mining technology, which is applied to gain knowledge from the documents by automating information extraction. In this article we apply Latent Dirichlet Allocation (LDA) to the documents as a model in the feature extraction process, to obtain terms that represent each document. We then use the k-Nearest Neighbour (kNN) algorithm to classify the documents based on these terms.
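A compact sketch of the LDA-plus-kNN pipeline described above, using scikit-learn; the toy proposals, the number of topics, and k are placeholders:

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.decomposition import LatentDirichletAllocation
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.pipeline import make_pipeline

    docs = [
        "proposal on renewable energy materials research",
        "machine learning research proposal for disease detection",
        "community service program on village sanitation",
        "student community empowerment program for health education",
    ]
    labels = ["research", "research", "service", "service"]

    model = make_pipeline(
        CountVectorizer(stop_words="english"),
        LatentDirichletAllocation(n_components=5, random_state=0),   # topic proportions as features
        KNeighborsClassifier(n_neighbors=3),
    )
    model.fit(docs, labels)
    print(model.predict(["community health education program in rural schools"]))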
NASA Astrophysics Data System (ADS)
Shauly, Eitan N.; Levi, Shimon; Schwarzband, Ishai; Adan, Ofer; Latinsky, Sergey
2015-04-01
A fully automated silicon-based methodology for systematic analysis of electrical features is shown. The system was developed for process monitoring and electrical variability reduction. A mapping step was created by dedicated structures such as a static-random-access-memory (SRAM) array or a standard cell library, or by using a simple design rule checking run-set. The resulting database was then used as an input for choosing locations for critical dimension scanning electron microscope images and for specific layout parameter extraction, which was then input to SPICE compact modeling simulation. Based on the experimental data, we identified two items that must be checked and monitored using the method described here: the transistor's sensitivity to the distance between the poly end cap and the edge of the active area (AA) due to AA rounding, and SRAM leakage due to insufficient N-well to P-well spacing. Based on this example, for process monitoring and variability analyses, we extensively used this method to analyze transistor gates having different shapes. In addition, analysis of a large area of a high density standard cell library was done. Another set of monitoring focused on a high density SRAM array is also presented. These examples provided information on the poly and AA layers, using transistor parameters such as leakage current and drive current. We successfully defined "robust" and "less-robust" transistor configurations included in the library and identified unsymmetrical transistors in the SRAM bit-cells. These data were compared to data extracted from the same devices at the end of the line. Another set of analyses was done on samples after Cu M1 etch. Process monitoring information on M1 enclosed contact was extracted based on contact resistance as a feedback. Guidelines for the optimal M1 space for different layout configurations were also extracted. All these data showed the successful in-field implementation of our methodology as a useful process monitoring method.
Second Iteration of Photogrammetric Pipeline to Enhance the Accuracy of Image Pose Estimation
NASA Astrophysics Data System (ADS)
Nguyen, T. G.; Pierrot-Deseilligny, M.; Muller, J.-M.; Thom, C.
2017-05-01
In the classical photogrammetric processing pipeline, automatic tie point extraction plays a key role in the quality of the achieved results. The image tie points are crucial to pose estimation and have a significant influence on the precision of the calculated orientation parameters. Therefore, both the relative and absolute orientations of the 3D model can be affected. By improving the precision of image tie point measurement, one can enhance the quality of image orientation. The quality of image tie points is influenced by several factors such as multiplicity, measurement precision, and distribution in the 2D images as well as in the 3D scene. In complex acquisition scenarios such as indoor applications and oblique aerial images, tie point extraction is limited when only image information can be exploited. Hence, we propose a method which improves the precision of pose estimation in complex scenarios by adding a second iteration to the classical processing pipeline. The result of the first iteration is used as a priori information to guide the extraction of new tie points with better quality. Evaluated with multiple case studies, the proposed method shows its validity and its high potential for precision improvement.
Dilated contour extraction and component labeling algorithm for object vector representation
NASA Astrophysics Data System (ADS)
Skourikhine, Alexei N.
2005-08-01
Object boundary extraction from binary images is important for many applications, e.g., image vectorization, automatic interpretation of images containing segmentation results, printed and handwritten documents and drawings, maps, and AutoCAD drawings. Efficient and reliable contour extraction is also important for pattern recognition due to its impact on shape-based object characterization and recognition. The presented contour tracing and component labeling algorithm produces dilated (sub-pixel) contours associated with corresponding regions. The algorithm has the following features: (1) it always produces non-intersecting, non-degenerate contours, including the case of one-pixel wide objects; (2) it associates the outer and inner (i.e., around hole) contours with the corresponding regions during the process of contour tracing in a single pass over the image; (3) it maintains desired connectivity of object regions as specified by 8-neighbor or 4-neighbor connectivity of adjacent pixels; (4) it avoids degenerate regions in both background and foreground; (5) it allows an easy augmentation that will provide information about the containment relations among regions; (6) it has a time complexity that is dominantly linear in the number of contour points. This early component labeling (contour-region association) enables subsequent efficient object-based processing of the image information.
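As a point of comparison, the sketch below produces the same kind of output (labelled regions with associated outer and inner sub-pixel contours) using off-the-shelf scikit-image routines, which do not share the single-pass, dilated-contour properties claimed for the algorithm above:

    import numpy as np
    from skimage import measure

    binary = np.zeros((64, 64), dtype=np.uint8)
    binary[10:30, 10:30] = 1           # a square object
    binary[15:25, 15:25] = 0           # with a hole, so it has an inner contour

    labels = measure.label(binary, connectivity=2)        # 8-connected component labelling
    for region in measure.regionprops(labels):
        mask = (labels == region.label).astype(float)
        contours = measure.find_contours(mask, 0.5)       # outer and inner sub-pixel contours
        print(region.label, region.area, len(contours))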
Nagy, Paul G; Warnock, Max J; Daly, Mark; Toland, Christopher; Meenan, Christopher D; Mezrich, Reuben S
2009-11-01
Radiology departments today are faced with many challenges to improve operational efficiency, performance, and quality. Many organizations rely on antiquated, paper-based methods to review their historical performance and understand their operations. With increased workloads, geographically dispersed image acquisition and reading sites, and rapidly changing technologies, this approach is increasingly untenable. A Web-based dashboard was constructed to automate the extraction, processing, and display of indicators and thereby provide useful and current data for twice-monthly departmental operational meetings. The feasibility of extracting specific metrics from clinical information systems was evaluated as part of a longer-term effort to build a radiology business intelligence architecture. Operational data were extracted from clinical information systems and stored in a centralized data warehouse. Higher-level analytics were performed on the centralized data, a process that generated indicators in a dynamic Web-based graphical environment that proved valuable in discussion and root cause analysis. Results aggregated over a 24-month period since implementation suggest that this operational business intelligence reporting system has provided significant data for driving more effective management decisions to improve productivity, performance, and quality of service in the department.
Text Mining in Biomedical Domain with Emphasis on Document Clustering.
Renganathan, Vinaitheerthan
2017-07-01
With the exponential increase in the number of articles published every year in the biomedical domain, there is a need to build automated systems to extract unknown information from the articles published. Text mining techniques enable the extraction of unknown knowledge from unstructured documents. This paper reviews text mining processes in detail and the software tools available to carry out text mining. It also reviews the roles and applications of text mining in the biomedical domain. Text mining processes, such as search and retrieval of documents, pre-processing of documents, natural language processing, methods for text clustering, and methods for text classification are described in detail. Text mining techniques can facilitate the mining of vast amounts of knowledge on a given topic from published biomedical research articles and draw meaningful conclusions that are not possible otherwise.
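A minimal document-clustering sketch along the lines reviewed above (TF-IDF features followed by k-means), with a toy corpus standing in for biomedical abstracts:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.cluster import KMeans

    abstracts = [
        "gene expression profiling in breast cancer",
        "tumour suppressor genes and cancer risk",
        "randomised trial of hypertension treatment",
        "blood pressure lowering drugs in clinical trials",
    ]
    X = TfidfVectorizer(stop_words="english").fit_transform(abstracts)
    km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
    print(km.labels_)      # cluster assignment for each abstract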
Clinical records anonymisation and text extraction (CRATE): an open-source software system.
Cardinal, Rudolf N
2017-04-26
Electronic medical records contain information of value for research, but contain identifiable and often highly sensitive confidential information. Patient-identifiable information cannot in general be shared outside clinical care teams without explicit consent, but anonymisation/de-identification allows research uses of clinical data without explicit consent. This article presents CRATE (Clinical Records Anonymisation and Text Extraction), an open-source software system with separable functions: (1) it anonymises or de-identifies arbitrary relational databases, with sensitivity and precision similar to previous comparable systems; (2) it uses public secure cryptographic methods to map patient identifiers to research identifiers (pseudonyms); (3) it connects relational databases to external tools for natural language processing; (4) it provides a web front end for research and administrative functions; and (5) it supports a specific model through which patients may consent to be contacted about research. Creation and management of a research database from sensitive clinical records with secure pseudonym generation, full-text indexing, and a consent-to-contact process is possible and practical using entirely free and open-source software.
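The pseudonym mapping in point (2) can be illustrated with a keyed hash: the same patient identifier always yields the same research identifier, but the mapping cannot be reversed without the secret key. This is a generic sketch, not CRATE's actual implementation:

    import hmac
    import hashlib

    SECRET_KEY = b"replace-with-a-long-random-key-held-by-the-data-controller"

    def research_id(patient_id: str, length: int = 16) -> str:
        """Deterministic, non-reversible pseudonym for linking records across tables."""
        digest = hmac.new(SECRET_KEY, patient_id.encode("utf-8"), hashlib.sha256).hexdigest()
        return digest[:length].upper()

    print(research_id("NHS-1234567890"))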
Characterization of autoregressive processes using entropic quantifiers
NASA Astrophysics Data System (ADS)
Traversaro, Francisco; Redelico, Francisco O.
2018-01-01
The aim of this contribution is to introduce a novel information plane, the causal-amplitude informational plane. As previous works seem to indicate, the Bandt and Pompe methodology for estimating entropy does not allow one to distinguish between probability distributions, which could be fundamental for simulation or for probability analysis purposes. Once a time series is identified as stochastic by the causal complexity-entropy informational plane, the novel causal-amplitude plane gives a deeper understanding of the time series, quantifying both the autocorrelation strength and the probability distribution of the data extracted from the generating processes. Two examples are presented, one from a climate change model and the other from financial markets.
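For readers unfamiliar with the Bandt and Pompe methodology mentioned above, a minimal permutation-entropy estimator is sketched below; the order and delay values are the usual free parameters, and this is only the entropy half of a complexity-entropy (or causal-amplitude) plane:

    import math
    from collections import Counter
    import numpy as np

    def permutation_entropy(x, order=3, delay=1):
        """Normalised Bandt-Pompe permutation entropy in [0, 1]."""
        x = np.asarray(x, dtype=float)
        patterns = Counter()
        for i in range(len(x) - (order - 1) * delay):
            window = x[i : i + order * delay : delay]
            patterns[tuple(np.argsort(window))] += 1      # ordinal pattern of the window
        p = np.array(list(patterns.values()), dtype=float)
        p /= p.sum()
        return -np.sum(p * np.log2(p)) / math.log2(math.factorial(order))

    rng = np.random.default_rng(0)
    print(permutation_entropy(rng.normal(size=2000)))              # close to 1 for white noise
    print(permutation_entropy(np.sin(np.linspace(0, 60, 2000))))   # much lower for a regular signal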
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nilsson, Mikael
Advanced nuclear fuel cycles rely on successful chemical separation of various elements in the used fuel. Numerous solvent extraction (SX) processes have been developed for the recovery and purification of metal ions from this used material. However, the predictability of process operations has been challenged by the lack of a fundamental understanding of the chemical interactions in several of these separation systems. For example, gaps in the thermodynamic description of the mechanism and the complexes formed will make predictions very challenging. Recent studies of certain extraction systems under development and a number of more established SX processes have suggested that aggregate formation in the organic phase results in a transformation of its selectivity and efficiency. Aggregation phenomena have consistently been interfering in SX process development, and have, over the years, become synonymous with an undesirable effect that must be prevented. This multiyear, multicollaborative research effort was carried out to study solvation and self-organization in non-aqueous solutions at conditions promoting aggregation phenomena. Our approach to this challenging topic was to investigate extraction systems comprising more than one extraction reagent where synergy of the metal ion could be observed. These systems were probed for the existence of stable microemulsions in the organic phase, and a number of high-end characterization tools were employed to elucidate the role of the aggregates in metal ion extraction. The ultimate goal was to find connections between synergy of metal ion extraction and reverse micellar formation. Our main accomplishment for this project was the expansion of the understanding of metal ion complexation in the extraction system combining tributyl phosphate (TBP) and dibutyl phosphoric acid (HDBP). We have found that for this system no direct correlation exists between metal ion extraction and the formation of aggregates, meaning that the metal ion is not solubilized in a reverse micelle core. Rather, we have found solid evidence that the metal ions are extracted and coordinated by the organic ligands as suggested by classic SX theories. However, we have challenged the existence of mixed complexes that have been suggested to exist in this particular extraction system. Most importantly, we have generated a wealth of information, trained students on important lab techniques, and strengthened the collaboration between the DOE national laboratories and the US educational institution involved in this work.
Daemonic ergotropy: enhanced work extraction from quantum correlations
NASA Astrophysics Data System (ADS)
Francica, Gianluca; Goold, John; Plastina, Francesco; Paternostro, Mauro
2017-03-01
We investigate how the presence of quantum correlations can influence work extraction in closed quantum systems, establishing a new link between the field of quantum non-equilibrium thermodynamics and the one of quantum information theory. We consider a bipartite quantum system and we show that it is possible to optimize the process of work extraction, thanks to the correlations between the two parts of the system, by using an appropriate feedback protocol based on the concept of ergotropy. We prove that the maximum gain in the extracted work is related to the existence of quantum correlations between the two parts, quantified by either quantum discord or, for pure states, entanglement. We then illustrate our general findings on a simple physical situation consisting of a qubit system.
TNT and RDX degradation and extraction from contaminated soil using subcritical water.
Islam, Mohammad Nazrul; Shin, Moon-Su; Jo, Young-Tae; Park, Jeong-Hun
2015-01-01
The use of explosives for industrial or military operations has resulted in environmental pollution that poses ecological and health hazards. In this work, a laboratory-scale subcritical water extraction (SCWE) process was used at varying water temperatures (100-175 °C) and flow rates (0.5-1.5 mL min(-1)) to treat 2,4,6-trinitrotoluene (TNT) and hexahydro-1,3,5-trinitro-1,3,5-triazine (RDX) contaminated soil, to reveal information on the removal of the explosives (based on analyses of the soil residue after extraction) and on the degradation performance of this process (based on analyses of the water extracts). Continuous-flow subcritical water was considered for removal of the explosives to avoid the repartitioning of non-degraded compounds to the soil upon cooling, which usually occurs in batch systems. In the SCWE experiments, near complete degradation of both TNT and RDX was observed at 175 °C based on analysis of the water extracts and soil. Test results also indicated that >99% TNT removal and complete RDX removal were achieved by this process when the operating conditions were a flow rate of 1 mL min(-1) and a treatment time of 20 min after the temperature reached 175 °C. HPLC-UV and ion chromatography analysis confirmed that the explosives underwent degradation. The low concentration of explosives found in the process wastewater indicates that water recycling to treat additional soil may be viable. Our results have shown the effectiveness of the continuous-flow SCWE process for the remediation of explosives-contaminated soil. Copyright © 2014 Elsevier Ltd. All rights reserved.
Walker Ranch 3D seismic images
Robert J. Mellors
2016-03-01
Amplitude images (both vertical and depth slices) extracted from a 3D seismic reflection survey over the Walker Ranch area (adjacent to Raft River). Crossline spacing is 660 feet and inline spacing 165 feet, using a Vibroseis source. Processing included depth migration. Micro-earthquake hypocenters are shown on the images. Stratigraphic information and nearby well tracks were added to the images. Images are embedded in a Microsoft Word document with additional information. Exact location and depth are restricted for proprietary reasons. Data collection and processing were funded by Agua Caliente. Original data remain the property of Agua Caliente.
Video monitoring of oxygen saturation during controlled episodes of acute hypoxia.
Addison, Paul S; Foo, David M H; Jacquel, Dominique; Borg, Ulf
2016-08-01
A method for extracting video photoplethysmographic information from an RGB video stream is tested on data acquired during a porcine model of acute hypoxia. Cardiac pulsatile information was extracted from the acquired signals and processed to determine a continuously reported oxygen saturation (SvidO2). A high degree of correlation was found to exist between the video and a reference from a pulse oximeter. The calculated mean bias and accuracy across all eight desaturation episodes were -0.03% (range: -0.21% to 0.24%) and 4.90% (range: 3.80% to 6.19%), respectively. The results support the hypothesis that oxygen saturation trending can be evaluated accurately from a video system during acute hypoxia.
Ward, Brodie J; Thornton, Ashleigh; Lay, Brendan; Rosenberg, Michael
2017-01-01
Fundamental movement skill (FMS) assessment remains an important tool in classifying individuals' level of FMS proficiency. The collection of FMS performances for assessment and monitoring has remained unchanged over the last few decades, but new motion capture technologies offer opportunities to automate this process. To achieve this, a greater understanding of the human process of movement skill assessment is required. The authors present the rationale and protocols of a project in which they aim to investigate the visual search patterns and information extraction employed by human assessors during FMS assessment, as well as the implementation of the Kinect system for FMS capture.
FPGA-based real time processing of the Plenoptic Wavefront Sensor
NASA Astrophysics Data System (ADS)
Rodríguez-Ramos, L. F.; Marín, Y.; Díaz, J. J.; Piqueras, J.; García-Jiménez, J.; Rodríguez-Ramos, J. M.
The plenoptic wavefront sensor combines measurements at the pupil and image planes in order to obtain wavefront information simultaneously from different points of view, and is capable of sampling the volume above the telescope to extract tomographic information about the atmospheric turbulence. The advantages of this sensor are presented elsewhere at this conference (José M. Rodríguez-Ramos et al). This paper concentrates on the processing required for pupil plane phase recovery and its computation in real time using FPGAs (Field Programmable Gate Arrays). This technology eases the implementation of massively parallel processing and allows tailoring the system to the requirements, maintaining flexibility, speed and cost figures.
The effects of solar incidence angle over digital processing of LANDSAT data
NASA Technical Reports Server (NTRS)
Parada, N. D. J. (Principal Investigator); Novo, E. M. L. M.
1983-01-01
A technique to extract the topographic modulation component from digital data is described. The enhancement process is based on the fact that the pixel contains two types of information: (1) reflectance variation due to the target; (2) reflectance variation due to the topography. In order to enhance the signal variation due to topography, the technique recommends extracting from the original LANDSAT data the component resulting from target reflectance. Considering that the contribution of topographic modulation to the pixel information varies with solar incidence angle, the results of this digital processing technique will differ from one season to another, mainly in highly dissected topography. In this context, the effects of solar incidence angle on the topographic modulation technique were evaluated. Two sets of MSS/LANDSAT data, with solar elevation angles varying from 22 to 41 deg, were selected to implement the digital processing on the Image-100 System. A secondary watershed (Rio Bocaina) draining into Rio Paraiba do Sul (Sao Paulo State) was selected as a test site. The results showed that the technique was more appropriate for MSS data acquired under higher Sun elevation angles. Topographic modulation components derived at low Sun elevation angles lessen rather than enhance topography.
Design and process aspects of laboratory scale SCF particle formation systems.
Vemavarapu, Chandra; Mollan, Matthew J; Lodaya, Mayur; Needham, Thomas E
2005-03-23
Consistent production of solid drug materials of desired particle and crystallographic morphologies under cGMP conditions is a frequent challenge to pharmaceutical researchers. Supercritical fluid (SCF) technology gained significant attention in pharmaceutical research by not only showing a promise in this regard but also accommodating the principles of green chemistry. Given that this technology attained commercialization in coffee decaffeination and in the extraction of hops and other essential oils, a majority of the off-the-shelf SCF instrumentation is designed for extraction purposes. Only a selective few vendors appear to be in the early stages of manufacturing equipment designed for particle formation. The scarcity of information on the design and process engineering of laboratory scale equipment is recognized as a significant shortcoming to the technological progress. The purpose of this article is therefore to provide the information and resources necessary for startup research involving particle formation using supercritical fluids. The various stages of particle formation by supercritical fluid processing can be broadly classified into delivery, reaction, pre-expansion, expansion and collection. The importance of each of these processes in tailoring the particle morphology is discussed in this article along with presenting various alternatives to perform these operations.
End-User Evaluations of Semantic Web Technologies
DOE Office of Scientific and Technical Information (OSTI.GOV)
McCool, Rob; Cowell, Andrew J.; Thurman, David A.
Stanford University's Knowledge Systems Laboratory (KSL) is working in partnership with Battelle Memorial Institute and IBM Watson Research Center to develop a suite of technologies for information extraction, knowledge representation & reasoning, and human-information interaction, collectively entitled 'Knowledge Associates for Novel Intelligence' (KANI). We have developed an integrated analytic environment composed of a collection of analyst associates, software components that aid the user at different stages of the information analysis process. An important part of our participatory design process has been to ensure our technologies and designs are tightly integrated with the needs and requirements of our end users. To this end, we perform a sequence of evaluations towards the end of the development process that ensure the technologies are both functional and usable. This paper reports on that process.
NASA Astrophysics Data System (ADS)
Setiyoko, A.; Dharma, I. G. W. S.; Haryanto, T.
2017-01-01
Multispectral and hyperspectral data acquired from satellite sensors have the ability to detect various objects on the earth, ranging from low-scale to high-scale modeling. These data are increasingly being used to produce geospatial information for rapid analysis by running feature extraction or classification processes. Applying the most suitable model for this data mining is still challenging because there are issues regarding accuracy and computational cost. The aim of this research is to develop a better understanding of object feature extraction and classification applied to satellite images by systematically reviewing related recent research projects. The method used in this research is based on the PRISMA statement. After deriving important points from trusted sources, pixel-based and texture-based feature extraction techniques emerge as promising techniques for further analysis in recent developments of feature extraction and classification.
On the creation of a clinical gold standard corpus in Spanish: Mining adverse drug reactions.
Oronoz, Maite; Gojenola, Koldo; Pérez, Alicia; de Ilarraza, Arantza Díaz; Casillas, Arantza
2015-08-01
The advances achieved in Natural Language Processing make it possible to automatically mine information from electronically created documents. Many Natural Language Processing methods that extract information from texts make use of annotated corpora, but these are scarce in the clinical domain due to legal and ethical issues. In this paper we present the creation of the IxaMed-GS gold standard composed of real electronic health records written in Spanish and manually annotated by experts in pharmacology and pharmacovigilance. The experts mainly annotated entities related to diseases and drugs, but also relationships between entities indicating adverse drug reaction events. To help the experts in the annotation task, we adapted a general corpus linguistic analyzer to the medical domain. The quality of the annotation process in the IxaMed-GS corpus has been assessed by measuring the inter-annotator agreement, which was 90.53% for entities and 82.86% for events. In addition, the corpus has been used for the automatic extraction of adverse drug reaction events using machine learning. Copyright © 2015 Elsevier Inc. All rights reserved.
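Inter-annotator agreement of the kind reported above is often summarised with a chance-corrected statistic such as Cohen's kappa alongside raw percentage agreement; a small sketch with scikit-learn, using invented label sequences:

    from sklearn.metrics import cohen_kappa_score

    annotator_a = ["drug", "disease", "drug", "O", "disease", "drug"]
    annotator_b = ["drug", "disease", "O",    "O", "disease", "drug"]

    raw_agreement = sum(a == b for a, b in zip(annotator_a, annotator_b)) / len(annotator_a)
    kappa = cohen_kappa_score(annotator_a, annotator_b)   # agreement corrected for chance
    print(f"agreement={raw_agreement:.2f}, kappa={kappa:.2f}")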
Van Dijck, Gert; Van Hulle, Marc M.
2011-01-01
The damage caused by corrosion in chemical process installations can lead to unexpected plant shutdowns and the leakage of potentially toxic chemicals into the environment. When subjected to corrosion, structural changes in the material occur, leading to energy releases as acoustic waves. This acoustic activity can in turn be used for corrosion monitoring, and even for predicting the type of corrosion. Here we apply wavelet packet decomposition to extract features from acoustic emission signals. We then use the extracted wavelet packet coefficients for distinguishing between the most important types of corrosion processes in the chemical process industry: uniform corrosion, pitting and stress corrosion cracking. The local discriminant basis selection algorithm can be considered as a standard for the selection of the most discriminative wavelet coefficients. However, it does not take the statistical dependencies between wavelet coefficients into account. We show that, when these dependencies are ignored, a lower accuracy is obtained in predicting the corrosion type. We compare several mutual information filters to take these dependencies into account in order to arrive at a more accurate prediction. PMID:22163921
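A compressed sketch of the feature pipeline described above: wavelet-packet energy per terminal node as the feature vector, then a mutual-information score per feature. It assumes PyWavelets and scikit-learn and uses random stand-in signals; the dependency-aware filters compared in the paper are not reproduced here:

    import numpy as np
    import pywt
    from sklearn.feature_selection import mutual_info_classif

    def wp_energies(signal, wavelet="db4", level=4):
        wp = pywt.WaveletPacket(signal, wavelet=wavelet, maxlevel=level)
        nodes = wp.get_level(level, order="freq")
        return np.array([np.sum(np.square(n.data)) for n in nodes])   # energy per sub-band

    rng = np.random.default_rng(0)
    signals = rng.normal(size=(40, 1024))        # stand-in for acoustic emission hits
    labels = rng.integers(0, 3, size=40)         # toy labels: uniform / pitting / SCC
    X = np.vstack([wp_energies(s) for s in signals])
    scores = mutual_info_classif(X, labels, random_state=0)
    print(np.argsort(scores)[::-1][:5])          # indices of the most informative sub-bands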
A color fusion method of infrared and low-light-level images based on visual perception
NASA Astrophysics Data System (ADS)
Han, Jing; Yan, Minmin; Zhang, Yi; Bai, Lianfa
2014-11-01
Color fusion images can be obtained through the fusion of infrared and low-light-level images and contain the information of both. The fusion images can help observers to understand the multichannel images comprehensively. However, simple fusion may lose target information because targets are inconspicuous in long-distance infrared and low-light-level images; and if target extraction is adopted blindly, the perception of scene information will be seriously affected. To solve this problem, a new fusion method based on visual perception is proposed in this paper. The extraction of visual targets ("what" information) and a parallel processing mechanism are applied to traditional color fusion methods. The infrared and low-light-level color fusion images are obtained based on efficient learning of typical targets. Experimental results show the effectiveness of the proposed method. The fusion images produced by our algorithm not only improve the detection rate of targets, but also retain rich natural scene information.
The Design of Case Products’ Shape Form Information Database Based on NURBS Surface
NASA Astrophysics Data System (ADS)
Liu, Xing; Liu, Guo-zhong; Xu, Nuo-qi; Zhang, Wei-she
2017-07-01
In order to improve the computer-aided design of product shapes, applying Non-Uniform Rational B-Spline (NURBS) curves and surfaces to the representation of the product shape helps designers to design the product effectively. On the basis of contour extraction from typical product images and the use of Pro/Engineer (Pro/E) to extract the geometric features of a scanned mold, and in order to structure an information database of value points, control points, and knot vector parameters, this paper puts forward a unified method of using NURBS curves and surfaces to describe products' geometric shapes, and of using MATLAB for simulation, when products have the same or similar function. A case study of an electric vehicle's front cover illustrates the process of accessing the geometric shape information of a case product. This method can not only greatly reduce the volume of information that must be stored, but also improve the effectiveness of computer-aided geometric innovation modeling.
Paavilainen, P; Simola, J; Jaramillo, M; Näätänen, R; Winkler, I
2001-03-01
Brain mechanisms extracting invariant information from varying auditory inputs were studied using the mismatch-negativity (MMN) brain response. We wished to determine whether the preattentive sound-analysis mechanisms, reflected by MMN, are capable of extracting invariant relationships based on abstract conjunctions between two sound features. The standard stimuli varied over a large range in frequency and intensity dimensions following the rule that the higher the frequency, the louder the intensity. The occasional deviant stimuli violated this frequency-intensity relationship and elicited an MMN. The results demonstrate that preattentive processing of auditory stimuli extends to unexpectedly complex relationships between the stimulus features.
Wildey, R.L.
1980-01-01
An economical method of digitally extracting sea-wave spectra from synthetic-aperture radar-signal records, which can be performed routinely in real or near-real time with the reception of telemetry from Seasat satellites, would be of value to a variety of scientific disciplines. This paper explores techniques for such data extraction and concludes that the mere fact that the desired result is devoid of phase information does not, of itself, lead to a simplification in data processing because of the nature of the modulation performed on the radar pulse by the backscattering surface. -from Author
Bouaouine, Omar; Bourven, Isabelle; Khalil, Fouad; Baudu, Michel
2018-04-01
Opuntia ficus-indica, which belongs to the Cactaceae family and to the genus Opuntia, has received increasing research interest for wastewater treatment by flocculation. The objectives of this study were (i) to provide more information regarding the active constituents of Opuntia spp. and (ii) to improve the conditions for extracting and using the flocculant molecules for water treatment. A classic approach by jar test experiments was used with raw material and with material extracted by solubilization and precipitation. The surface properties of the solid material were characterized by FTIR, SEM, zeta potential measurement, and surface titration. Splitting the material based on its solubility as a function of pH and titrating its functional groups completed the method. The optimal pH value for a coagulation-flocculation process using cactus solid material (CSM) was 10.0, with a dosage of 35 mg L-1. The alkaline pH of flocculation suggests an adsorption mechanism with a bridging effect between particles by water-soluble extracted molecules. To validate this mechanism, an extraction in water was carried out at pH = 10 (the optimum for flocculation) and the solution was then acidified (pH = 7) to allow precipitation of the presumed active flocculant molecules. The strong flocculant property of this extract was verified, and titration of this solution showed at least one specific pKa of 9.0 ± 0.6. This pKa corresponds to phenol groups, which could be assigned to lignin and tannin.
The edge detection method of the infrared imagery of the laser spot
NASA Astrophysics Data System (ADS)
Che, Jinxi; Zhang, Jinchun; Li, Zhongmin
2016-01-01
In jamming-effectiveness experiments in which a thermal infrared imager is jammed by a CO2 laser, evaluating the jamming effect requires analysis of the infrared imagery of the laser spot obtained from the imager. Because the laser spot images are irregular, edge detection is an important processing step. The image edge is one of the most basic characteristics of an image and carries much of its information. Generally, because of thermal equilibrium, local temperature differences on an object are small, so the ability of infrared imagery to reflect local detail is obviously weak. At the same time, the heat distribution information in a thermal image becomes much more valuable when it is combined with basic information about the target, such as object size, relative position in the field of view, shape, and outline. Hence, extracting the edge of the target from infrared imagery is an important step; it is a key part of the image processing procedure and the premise of much subsequent processing. In order to extract the outline of the target from the original thermal imagery, and to overcome disadvantages such as low image contrast and serious noise interference, the edges of the thermal imagery need to be detected and processed. The principles of the Roberts, Sobel, Prewitt, and Canny operators are analyzed, and the operators are then used to perform edge detection on thermal images of laser spots obtained from experiments in which a CO2 laser jammed a thermal infrared imager. Their performance is compared on the basis of the detection results. Finally, the characteristics of the operators are summarized, providing a reference for the choice of edge detection operators in future thermal imagery processing.
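A short sketch of two of the four operators discussed above (Sobel and Canny) applied with OpenCV; the laser-spot frame is synthesised here, whereas in the experiments it would come from the jammed imager:

    import cv2
    import numpy as np

    # Synthetic stand-in for a laser-spot thermal frame: a bright blob plus noise
    img = np.zeros((256, 256), dtype=np.uint8)
    cv2.circle(img, (128, 128), 40, 255, -1)
    img = cv2.GaussianBlur(img, (15, 15), 0)
    img = cv2.add(img, np.random.randint(0, 20, img.shape, dtype=np.uint8))

    sobel_x = cv2.Sobel(img, cv2.CV_64F, 1, 0, ksize=3)
    sobel_y = cv2.Sobel(img, cv2.CV_64F, 0, 1, ksize=3)
    sobel_mag = cv2.magnitude(sobel_x, sobel_y)       # gradient-magnitude edge map
    canny = cv2.Canny(img, 50, 150)                   # hysteresis-thresholded edge map
    print(sobel_mag.max(), int(canny.sum() / 255))    # crude summaries for comparison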
Multiplexed Sequence Encoding: A Framework for DNA Communication.
Zakeri, Bijan; Carr, Peter A; Lu, Timothy K
2016-01-01
Synthetic DNA has great propensity for efficiently and stably storing non-biological information. With DNA writing and reading technologies rapidly advancing, new applications for synthetic DNA are emerging in data storage and communication. Traditionally, DNA communication has focused on the encoding and transfer of complete sets of information. Here, we explore the use of DNA for the communication of short messages that are fragmented across multiple distinct DNA molecules. We identified three pivotal points in a communication-data encoding, data transfer & data extraction-and developed novel tools to enable communication via molecules of DNA. To address data encoding, we designed DNA-based individualized keyboards (iKeys) to convert plaintext into DNA, while reducing the occurrence of DNA homopolymers to improve synthesis and sequencing processes. To address data transfer, we implemented a secret-sharing system-Multiplexed Sequence Encoding (MuSE)-that conceals messages between multiple distinct DNA molecules, requiring a combination key to reveal messages. To address data extraction, we achieved the first instance of chromatogram patterning through multiplexed sequencing, thereby enabling a new method for data extraction. We envision these approaches will enable more widespread communication of information via DNA.
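A toy sketch of the homopolymer-avoidance idea behind an iKey-style encoding (the base alphabet, digits-per-character and rotation rule are illustrative assumptions, not the published MuSE/iKey scheme): each character is written as base-3 digits, and each digit selects among the three bases that differ from the previously emitted base, so identical bases never appear consecutively.

BASES = "ACGT"

def encode_char(ch: str, prev_base: str) -> str:
    """Encode one character as six bases while avoiding repeats of prev_base."""
    value = ord(ch)
    out = []
    for _ in range(6):                        # 6 base-3 digits cover codes up to 728
        value, digit = divmod(value, 3)
        candidates = [b for b in BASES if b != prev_base]  # three allowed bases
        base = candidates[digit]
        out.append(base)
        prev_base = base
    return "".join(out)

def encode_message(text: str) -> str:
    prev, dna = "T", []                       # arbitrary starting convention
    for ch in text:
        chunk = encode_char(ch, prev)
        dna.append(chunk)
        prev = chunk[-1]
    return "".join(dna)

print(encode_message("HELLO"))                # no two identical bases in a row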
Estimation of the Scatterer Distribution of the Cirrhotic Liver using Ultrasonic Image
NASA Astrophysics Data System (ADS)
Yamaguchi, Tadashi; Hachiya, Hiroyuki
1998-05-01
In the B-mode image of the liver obtained by an ultrasonic imaging system, the speckled pattern changes with the progression of disease such as liver cirrhosis. In this paper we present the statistical characteristics of the echo envelope of the liver, and a technique to extract information on the scatterer distribution from normal and cirrhotic liver images using constant false alarm rate (CFAR) processing. We analyze the relationship between the extracted scatterer distribution and the stage of liver cirrhosis. The ratio of the area in which the amplitude of the processed signal exceeds the threshold to the entire processed image area is related quantitatively to the stage of liver cirrhosis. It is found that the proposed technique is valid for the quantitative diagnosis of liver cirrhosis.
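A minimal sketch of cell-averaging CFAR-style thresholding of an echo-envelope image and the area ratio used as a quantitative measure (NumPy/SciPy; guard cells are omitted and the window size and scale factor are illustrative, not the study's parameters):

import numpy as np
from scipy.ndimage import uniform_filter

def cfar_area_ratio(envelope: np.ndarray, window: int = 15, k: float = 1.8) -> float:
    """Fraction of pixels whose amplitude exceeds a locally adaptive threshold.

    envelope : 2-D echo-envelope image (B-mode amplitude).
    window   : side length of the local averaging window (pixels).
    k        : scale factor applied to the local mean to form the threshold.
    """
    local_mean = uniform_filter(envelope.astype(float), size=window)
    detections = envelope > k * local_mean
    return detections.mean()        # area above threshold / total processed area

# usage with a synthetic speckle-like image
rng = np.random.default_rng(0)
img = rng.rayleigh(scale=1.0, size=(256, 256))
print(cfar_area_ratio(img))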
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yang, Zhaoqing; Wang, Taiping; Copping, Andrea E.
Understanding and providing proactive information on the potential for tidal energy projects to cause changes to the physical system and to key water quality constituents in tidal waters is a necessary and cost-effective means to avoid costly regulatory involvement and late-stage surprises in the permitting process. This paper presents a modeling study for evaluating tidal energy extraction and its potential impacts on the marine environment at a real-world site - Tacoma Narrows of Puget Sound, Washington State, USA. An unstructured-grid coastal ocean model, fitted with a module that simulates tidal energy devices, was applied to simulate the tidal energy extracted by different turbine array configurations and the potential effects of the extraction at local and system-wide scales in Tacoma Narrows and South Puget Sound. Model results demonstrated the advantage of an unstructured-grid model for simulating the far-field effects of tidal energy extraction in a large model domain, as well as assessing the near-field effect using a fine grid resolution near the tidal turbines. The outcome shows that a realistic near-term deployment scenario extracts a very small fraction of the total tidal energy in the system and that system-wide environmental effects are not likely; however, near-field effects on the flow field and bed shear stress in the area of the tidal turbine farm are more likely. Model results also indicate that, from a practical standpoint, hydrodynamic or water quality effects are not likely to be the limiting factor for development of large commercial-scale tidal farms. Results indicate that very high numbers of turbines are required to significantly alter the tidal system; limitations on marine space or other environmental concerns are likely to be reached before reaching these deployment levels. These findings show that important information obtained from numerical modeling can be used to inform regulatory and policy processes for tidal energy development.
Medem, Anna V; Seidling, Hanna M; Eichler, Hans-Georg; Kaltschmidt, Jens; Metzner, Michael; Hubert, Carina M; Czock, David; Haefeli, Walter E
2017-05-01
Electronic clinical decision support systems (CDSS) require drug information that can be processed by computers. The goal of this project was to determine and evaluate a compilation of variables that comprehensively capture the information contained in the summary of product characteristics (SmPC) and unequivocally describe the drug, its dosage options, and clinical pharmacokinetics. An expert panel defined and structured a set of variables and drafted a guideline to extract and enter information on dosage and clinical pharmacokinetics from textual SmPCs as published by the European Medicines Agency (EMA). The set of variables was iteratively revised and evaluated by data extraction and variable allocation for roughly 7% of all centrally approved drugs. The information contained in the SmPC was allocated to three information clusters consisting of 260 variables. The cluster "drug characterization" specifies the nature of the drug. The cluster "dosage" provides information on approved drug dosages and defines corresponding specific conditions. The cluster "clinical pharmacokinetics" includes pharmacokinetic parameters of relevance for dosing in clinical practice. A first evaluation demonstrated that, despite the complexity of the current free-text SmPCs, dosage and pharmacokinetic information can be reliably extracted from the SmPCs and comprehensively described by a limited set of variables. By proposing a compilation of variables that describe drug dosage and clinical pharmacokinetics well, the project represents a step forward towards the development of a comprehensive database system serving as an information source for sophisticated CDSS.
Extracting features of Gaussian self-similar stochastic processes via the Bandt-Pompe approach.
Rosso, O A; Zunino, L; Pérez, D G; Figliola, A; Larrondo, H A; Garavaglia, M; Martín, M T; Plastino, A
2007-12-01
By recourse to appropriate information theory quantifiers (normalized Shannon entropy and the Martín-Plastino-Rosso intensive statistical complexity measure), we revisit the characterization of Gaussian self-similar stochastic processes from a Bandt-Pompe viewpoint. We show that the ensuing approach exhibits considerable advantages with respect to other treatments. In particular, clear gaps in the quantifiers are found in the transition between the continuous processes and their associated noises.
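A minimal sketch of the Bandt-Pompe procedure underlying the quantifiers above: ordinal patterns are counted along the series and the normalized Shannon entropy of their distribution is computed (NumPy; the embedding dimension and delay are illustrative choices, not the paper's settings):

import numpy as np
from math import factorial, log

def permutation_entropy(x: np.ndarray, d: int = 4, tau: int = 1) -> float:
    """Normalized Shannon entropy of Bandt-Pompe ordinal patterns."""
    n_patterns = len(x) - (d - 1) * tau
    counts = {}
    for i in range(n_patterns):
        window = x[i:i + d * tau:tau]
        pattern = tuple(np.argsort(window))      # ordinal pattern of the window
        counts[pattern] = counts.get(pattern, 0) + 1
    probs = np.array(list(counts.values()), dtype=float) / n_patterns
    H = -np.sum(probs * np.log(probs))
    return H / log(factorial(d))                 # normalize by log(d!)

# usage: white noise should give a value close to 1
rng = np.random.default_rng(1)
print(permutation_entropy(rng.normal(size=5000)))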
Medical document anonymization with a semantic lexicon.
Ruch, P.; Baud, R. H.; Rassinoux, A. M.; Bouillon, P.; Robert, G.
2000-01-01
We present an original system for locating and removing personally-identifying information in patient records. In this experiment, anonymization is seen as a particular case of knowledge extraction. We use natural language processing tools provided by the MEDTAG framework: a semantic lexicon specialized in medicine, and a toolkit for word-sense and morpho-syntactic tagging. The system finds 98-99% of all personally-identifying information. PMID:11079980
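A toy sketch of the locate-and-replace step (pure Python regular expressions; the patterns and placeholder tokens are illustrative and far simpler than the MEDTAG lexicon-and-tagger approach described above):

import re

# Illustrative patterns for a few classes of personally-identifying information.
PATTERNS = {
    "DATE":  re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    "NAME":  re.compile(r"\b(?:Mr|Mrs|Ms|Dr)\.\s+[A-Z][a-z]+\b"),
}

def anonymize(text: str) -> str:
    """Replace each matched span with its class label in brackets."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

note = "Dr. Smith saw the patient on 03/14/1999; callback 555-867-5309."
print(anonymize(note))
# -> "[NAME] saw the patient on [DATE]; callback [PHONE]."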
Kress, Daniel; Egelhaaf, Martin
2014-01-01
During locomotion animals rely heavily on visual cues gained from the environment to guide their behavior. Examples are basic behaviors like collision avoidance or the approach to a goal. The saccadic gaze strategy of flying flies, which separates translational from rotational phases of locomotion, has been suggested to facilitate the extraction of environmental information, because only image flow evoked by translational self-motion contains relevant distance information about the surrounding world. In contrast to the translational phases of flight during which gaze direction is kept largely constant, walking flies experience continuous rotational image flow that is coupled to their stride-cycle. The consequences of these self-produced image shifts for the extraction of environmental information are still unclear. To assess the impact of stride-coupled image shifts on visual information processing, we performed electrophysiological recordings from the HSE cell, a motion sensitive wide-field neuron in the blowfly visual system. This cell has been concluded to play a key role in mediating optomotor behavior, self-motion estimation and spatial information processing. We used visual stimuli that were based on the visual input experienced by walking blowflies while approaching a black vertical bar. The response of HSE to these stimuli was dominated by periodic membrane potential fluctuations evoked by stride-coupled image shifts. Nevertheless, during the approach the cell’s response contained information about the bar and its background. The response components evoked by the bar were larger than the responses to its background, especially during the last phase of the approach. However, as revealed by targeted modifications of the visual input during walking, the extraction of distance information on the basis of HSE responses is much impaired by stride-coupled retinal image shifts. Possible mechanisms that may cope with these stride-coupled responses are discussed. PMID:25309362
Analysis of a municipal wastewater treatment plant using a neural network-based pattern analysis
Hong, Y.-S.T.; Rosen, Michael R.; Bhamidimarri, R.
2003-01-01
This paper addresses the problem of how to capture the complex relationships that exist between process variables and to diagnose the dynamic behaviour of a municipal wastewater treatment plant (WTP). Because of the complex biological reaction mechanisms and the highly time-varying, multivariable nature of a real WTP, diagnosis of the WTP is still difficult in practice. The application of intelligent techniques, which can analyse multi-dimensional process data using a sophisticated visualisation technique, can be useful for analysing and diagnosing the activated-sludge WTP. In this paper, the Kohonen Self-Organising Feature Map (KSOFM) neural network is applied to analyse the multi-dimensional process data and to diagnose the inter-relationships of the process variables in a real activated-sludge WTP. By using component planes, some detailed local relationships between the process variables, e.g., responses of the process variables under different operating conditions, are discovered in addition to the global information. The operating conditions and the inter-relationships among the process variables in the WTP have been diagnosed and extracted from the information obtained by clustering analysis of the maps. It is concluded that the KSOFM technique provides an effective analysing and diagnosing tool for understanding the system behaviour and extracting knowledge contained in multi-dimensional data of a large-scale WTP. © 2003 Elsevier Science Ltd. All rights reserved.
Automatic generation of Web mining environments
NASA Astrophysics Data System (ADS)
Cibelli, Maurizio; Costagliola, Gennaro
1999-02-01
The main problem related to the retrieval of information from the world wide web is the enormous number of unstructured documents and resources, i.e., the difficulty of locating and tracking appropriate sources. This paper presents a web mining environment (WME), which is capable of finding, extracting and structuring information related to a particular domain from web documents, using general purpose indices. The WME architecture includes a web engine filter (WEF), to sort and reduce the answer set returned by a web engine, a data source pre-processor (DSP), which processes html layout cues in order to collect and qualify page segments, and a heuristic-based information extraction system (HIES), to finally retrieve the required data. Furthermore, we present a web mining environment generator, WMEG, that allows naive users to generate a WME specific to a given domain by providing a set of specifications.
NASA Astrophysics Data System (ADS)
Lucciani, Roberto; Laneve, Giovanni; Jahjah, Munzer; Mito, Collins
2016-08-01
The crop growth stage represents essential information for the management of agricultural areas. In this study we investigate the feasibility of a tool, based on remotely sensed satellite (Landsat 8) imagery, capable of automatically classifying crop fields, and we assess how far resolution enhancement based on pan-sharpening techniques and the extraction of phenological information, used to create decision rules for assigning a semantic class to an object, can effectively support the classification process. Moreover, we investigate the opportunity to extract vegetation health status information from a remotely sensed assessment of the equivalent water thickness (EWT). Our case study is Kenya's Great Rift Valley; in this area a ground truth campaign was conducted during August 2015 in order to collect GPS measurements of crop fields, leaf area index (LAI) and chlorophyll samples.
An Experimental Study of an Ultra-Mobile Vehicle for Off-Road Transportation.
1983-07-01
2.2.3 Image Processing Algorithms. The ultimate goal of a vision system is to understand the content of a scene and to extract useful information from it. Four existing robot-vision systems, the General Motors CONSIGHT system, the UNIVISION system, the Westinghouse... [remaining OCR fragments of the report, including Eqs. (5.48) and (5.49), are not recoverable]
Challenges in Managing Information Extraction
ERIC Educational Resources Information Center
Shen, Warren H.
2009-01-01
This dissertation studies information extraction (IE), the problem of extracting structured information from unstructured data. Example IE tasks include extracting person names from news articles, product information from e-commerce Web pages, street addresses from emails, and names of emerging music bands from blogs. IE is an increasingly…
Spaceborne synthetic aperture radar signal processing using FPGAs
NASA Astrophysics Data System (ADS)
Sugimoto, Yohei; Ozawa, Satoru; Inaba, Noriyasu
2017-10-01
Synthetic Aperture Radar (SAR) imagery requires image reproduction through successive signal processing of the received data before images can be browsed and information extracted. The received signal data records of the ALOS-2/PALSAR-2 are stored in the onboard mission data storage and transmitted to the ground. In order to balance the storage usage against the capacity of the mission data communication networks, the operation duty of the PALSAR-2 is limited. This balance relies strongly on network availability. The observation operations of present spaceborne SAR systems are rigorously planned by simulating the mission data balance, given conflicting user demands. This problem should be solved so that the operations and the potential of next-generation spaceborne SAR systems are not compromised. One of the solutions is to compress the SAR data through onboard image reproduction and information extraction from the reproduced images. This is also beneficial for fast delivery of information products and for event-driven observations by a constellation. The Emergence Studio (Sōhatsu kōbō in Japanese), together with the Japan Aerospace Exploration Agency, is developing evaluation models of an FPGA-based signal processing system for onboard SAR image reproduction. The model developed in 2016, the "Fast L1 Processor (FLIP)", can reproduce a 10 m resolution single look complex image (Level 1.1) from ALOS/PALSAR raw signal data (Level 1.0). The processing speed of the FLIP at 200 MHz is twice that of CPU-based computing at 3.7 GHz. The image processed by the FLIP is in no way inferior to the image processed with 32-bit computing in MATLAB.
Automatic Extraction of Road Markings from Mobile Laser-Point Cloud Using Intensity Data
NASA Astrophysics Data System (ADS)
Yao, L.; Chen, Q.; Qin, C.; Wu, H.; Zhang, S.
2018-04-01
With the development of intelligent transportation, high-precision road information has been widely applied in many fields. This paper proposes a concise and practical way to extract road marking information from point cloud data collected by a mobile mapping system (MMS). The method contains three steps. Firstly, the road surface is segmented through edge detection on the scan lines. Then an intensity image is generated by inverse distance weighted (IDW) interpolation, and the road markings are extracted using adaptive threshold segmentation based on an integral image, without intensity calibration. Moreover, noise is reduced by removing small patches of pixels from the binary image. Finally, the point cloud mapped from the binary image is clustered into marking objects according to Euclidean distance, and a series of algorithms including template matching and feature attribute filtering is used for the classification of linear markings, arrow markings and guidelines. Through processing the point cloud data collected by a RIEGL VUX-1 in the case area, the results show that the F-score of marking extraction is 0.83, and the average classification rate is 0.9.
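A minimal sketch of the adaptive threshold step using an integral image, so that the local mean around each pixel of the intensity raster is available in constant time (NumPy; the window size and offset are illustrative, not the paper's parameters):

import numpy as np

def adaptive_threshold(intensity: np.ndarray, win: int = 25, offset: float = 10.0) -> np.ndarray:
    """Binary marking mask: a pixel is 'marking' if it exceeds its local mean + offset."""
    img = intensity.astype(float)
    # integral image with a zero row/column prepended for easy window sums
    ii = np.pad(img, ((1, 0), (1, 0))).cumsum(axis=0).cumsum(axis=1)
    h, w = img.shape
    r = win // 2
    ys, xs = np.mgrid[0:h, 0:w]
    y0, y1 = np.clip(ys - r, 0, h), np.clip(ys + r + 1, 0, h)
    x0, x1 = np.clip(xs - r, 0, w), np.clip(xs + r + 1, 0, w)
    window_sum = ii[y1, x1] - ii[y0, x1] - ii[y1, x0] + ii[y0, x0]
    window_area = (y1 - y0) * (x1 - x0)
    local_mean = window_sum / window_area
    return img > local_mean + offset

# usage on a synthetic intensity raster with a bright stripe
rng = np.random.default_rng(0)
raster = rng.normal(50, 5, (200, 200))
raster[:, 95:105] += 60
print(adaptive_threshold(raster).sum())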
Comparison of Two Simplification Methods for Shoreline Extraction from Digital Orthophoto Images
NASA Astrophysics Data System (ADS)
Bayram, B.; Sen, A.; Selbesoglu, M. O.; Vārna, I.; Petersons, P.; Aykut, N. O.; Seker, D. Z.
2017-11-01
Coastal ecosystems are very sensitive to external influences. Coastal resources such as sand dunes, coral reefs and mangroves have vital importance in preventing coastal erosion. Human-induced effects also threaten coastal areas. Therefore, changes in coastal areas should be monitored. Up-to-date, accurate shoreline information is indispensable for coastal managers and decision makers. Remote sensing and image processing techniques offer a great opportunity to obtain reliable shoreline information. In the presented study, the NIR bands of seven 1:5000-scale digital orthophoto images of Riga Bay, Latvia, have been used. The object-oriented Simple Linear Clustering method has been utilized to extract the shoreline of Riga Bay. The Bend and Douglas-Peucker methods have been used to simplify the extracted shoreline in order to test the effect of both methods. A photogrammetrically digitized shoreline has been taken as reference data to compare the obtained results. The accuracy assessment has been carried out with the Digital Shoreline Analysis tool. As a result, the shoreline achieved by the Bend method has been found to be closer to the shoreline extracted with the Simple Linear Clustering method.
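A minimal sketch of the Douglas-Peucker simplification compared above: the vertex farthest from the start-end chord is kept if its offset exceeds a tolerance, and the two halves are simplified recursively (pure Python/NumPy; the tolerance value and synthetic curve are illustrative):

import numpy as np

def douglas_peucker(points: np.ndarray, tol: float) -> np.ndarray:
    """Simplify a polyline given as an (N, 2) array of vertices."""
    start, end = points[0], points[-1]
    chord = end - start
    chord_len = np.hypot(*chord)
    if chord_len == 0:
        dists = np.hypot(*(points - start).T)
    else:
        # perpendicular distance of every vertex from the start-end chord
        dists = np.abs(chord[0] * (points[:, 1] - start[1])
                       - chord[1] * (points[:, 0] - start[0])) / chord_len
    idx = int(np.argmax(dists))
    if dists[idx] > tol:
        left = douglas_peucker(points[:idx + 1], tol)
        right = douglas_peucker(points[idx:], tol)
        return np.vstack([left[:-1], right])       # avoid duplicating the split vertex
    return np.vstack([start, end])

# usage: a noisy shoreline-like curve simplified with an illustrative tolerance
x = np.linspace(0, 100, 200)
rng = np.random.default_rng(2)
line = np.column_stack([x, np.sin(x / 10) + rng.normal(0, 0.1, x.size)])
print(len(douglas_peucker(line, 0.5)))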
Deep SOMs for automated feature extraction and classification from big data streaming
NASA Astrophysics Data System (ADS)
Sakkari, Mohamed; Ejbali, Ridha; Zaied, Mourad
2017-03-01
In this paper, we propose a deep self-organizing map model (Deep-SOM) for automated feature extraction and learning from big data streams, benefiting from the Spark framework for real-time streams and highly parallel data processing. The deep SOM architecture is based on the notion of abstraction (patterns are automatically extracted from the raw data, from less to more abstract). The proposed model consists of three hidden self-organizing layers, an input and an output layer. Each layer is made up of a multitude of SOMs, each map focusing only on a local sub-region of the input image. Each layer then trains the local information to generate more global information in the higher layer. The proposed Deep-SOM model is unique in terms of the layer architecture, the SOM sampling method and the learning. During the learning stage we use a set of unsupervised SOMs for feature extraction. We validate the effectiveness of our approach on large data sets such as the Leukemia dataset and SRBCT. Comparative results have shown that the Deep-SOM model performs better than many existing algorithms for image classification.
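A minimal sketch of a single self-organizing map layer, the basic building block that a Deep-SOM architecture stacks per sub-region (pure NumPy; the grid size, learning rate and neighbourhood width are illustrative, and this is not the paper's Spark-based implementation):

import numpy as np

def train_som(data: np.ndarray, rows: int = 8, cols: int = 8,
              epochs: int = 20, lr0: float = 0.5, sigma0: float = 3.0) -> np.ndarray:
    """Train a (rows x cols) SOM on data of shape (n_samples, n_features)."""
    rng = np.random.default_rng(0)
    weights = rng.normal(size=(rows, cols, data.shape[1]))
    grid_y, grid_x = np.mgrid[0:rows, 0:cols]
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.permutation(data):
            frac = step / n_steps
            lr = lr0 * (1 - frac)                 # decaying learning rate
            sigma = sigma0 * (1 - frac) + 0.5     # decaying neighbourhood width
            # best matching unit
            d = np.linalg.norm(weights - x, axis=2)
            by, bx = np.unravel_index(np.argmin(d), d.shape)
            # Gaussian neighbourhood around the BMU on the map grid
            g = np.exp(-((grid_y - by) ** 2 + (grid_x - bx) ** 2) / (2 * sigma ** 2))
            weights += lr * g[..., None] * (x - weights)
            step += 1
    return weights

# usage: organise 3-D points drawn from two Gaussian blobs
rng = np.random.default_rng(1)
pts = np.vstack([rng.normal(0, 1, (200, 3)), rng.normal(5, 1, (200, 3))])
codebook = train_som(pts)
print(codebook.shape)  # (8, 8, 3)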
Knowledge Acquisition and Management for the NASA Earth Exchange (NEX)
NASA Astrophysics Data System (ADS)
Votava, P.; Michaelis, A.; Nemani, R. R.
2013-12-01
NASA Earth Exchange (NEX) is a data, computing and knowledge collaboratory that houses NASA satellite, climate and ancillary data, where a focused community can come together to share modeling and analysis codes, scientific results, knowledge and expertise on a centralized platform with access to large supercomputing resources. As more and more projects are executed on NEX, we are increasingly focusing on capturing the knowledge of the NEX users and providing mechanisms for sharing it with the community in order to facilitate reuse and accelerate research. There are many possible knowledge contributions to NEX: it can be a wiki entry on the NEX portal contributed by a developer, information extracted from a publication in an automated way, or a workflow captured during code execution on the supercomputing platform. The goal of the NEX knowledge platform is to capture and organize this information and make it easily accessible to the NEX community and beyond. The knowledge acquisition process consists of three main facets - data and metadata, workflows and processes, and web-based information. Once the knowledge is acquired, it is processed in a number of ways, ranging from custom metadata parsers to entity extraction using natural language processing techniques. The processed information is linked with existing taxonomies and aligned with an internal ontology (which heavily reuses a number of external ontologies). This forms a knowledge graph that can then be used to improve users' search query results as well as provide additional analytics capabilities to the NEX system. Such a knowledge graph will be an important building block in creating a dynamic knowledge base for the NEX community where knowledge is both generated and easily shared.
PANTHER. Pattern ANalytics To support High-performance Exploitation and Reasoning.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Czuchlewski, Kristina Rodriguez; Hart, William E.
Sandia has approached the analysis of big datasets with an integrated methodology that uses computer science, image processing, and human factors to exploit critical patterns and relationships in large datasets despite the variety and rapidity of information. The work is part of a three-year LDRD Grand Challenge called PANTHER (Pattern ANalytics To support High-performance Exploitation and Reasoning). To maximize data analysis capability, Sandia pursued scientific advances across three key technical domains: (1) geospatial-temporal feature extraction via image segmentation and classification; (2) geospatial-temporal analysis capabilities tailored to identify and process new signatures more efficiently; and (3) domain-relevant models of human perception and cognition informing the design of analytic systems. Our integrated results include advances in geographical information systems (GIS) in which we discover activity patterns in noisy, spatial-temporal datasets using geospatial-temporal semantic graphs. We employed computational geometry and machine learning to allow us to extract and predict spatial-temporal patterns and outliers from large aircraft and maritime trajectory datasets. We automatically extracted static and ephemeral features from real, noisy synthetic aperture radar imagery for ingestion into a geospatial-temporal semantic graph. We worked with analysts and investigated analytic workflows to (1) determine how experiential knowledge evolves and is deployed in high-demand, high-throughput visual search workflows, and (2) better understand visual search performance and attention. Through PANTHER, Sandia's fundamental rethinking of key aspects of geospatial data analysis permits the extraction of much richer information from large amounts of data. The project results enable analysts to examine mountains of historical and current data that would otherwise go untouched, while also gaining meaningful, measurable, and defensible insights into overlooked relationships and patterns. The capability is directly relevant to the nation's nonproliferation remote-sensing activities and has broad national security applications for military and intelligence-gathering organizations.
ERIC Educational Resources Information Center
Benoit, Gerald
2002-01-01
Discusses data mining (DM) and knowledge discovery in databases (KDD), taking the view that KDD is the larger view of the entire process, with DM emphasizing the cleaning, warehousing, mining, and visualization of knowledge discovery in databases. Highlights include algorithms; users; the Internet; text mining; and information extraction.…
Asteroids: Does Space Weathering Matter?
NASA Technical Reports Server (NTRS)
Gaffey, Michael J.
2001-01-01
The interpretive calibrations and methodologies used to extract mineralogy from asteroidal spectra appear to remain valid until the space weathering process is advanced to a degree which appears to be rare or absent on asteroid surfaces. Additional information is contained in the original extended abstract.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Boing, L.E.; Miller, R.L.
1983-10-01
This document presents, in summary form, generic conceptual information relevant to the decommissioning of a reference test reactor (RTR). All of the data presented were extracted from NUREG/CR-1756 and arranged in a form that will provide a basis for future comparison studies for the Evaluation of Nuclear Facility Decommissioning Projects (ENFDP) program. During the data extraction process no attempt was made to challenge any of the assumptions used in the original studies nor was any attempt made to update assumed methods or processes to state-of-the-art decommissioning techniques. In a few instances obvious errors were corrected after consultation with the study author.
Landsat surface reflectance quality assurance extraction (version 1.7)
Jones, J.W.; Starbuck, M.J.; Jenkerson, Calli B.
2013-01-01
The U.S. Geological Survey (USGS) Land Remote Sensing Program is developing an operational capability to produce Climate Data Records (CDRs) and Essential Climate Variables (ECVs) from the Landsat Archive to support a wide variety of science and resource management activities from regional to global scale. The USGS Earth Resources Observation and Science (EROS) Center is charged with prototyping systems and software to generate these high-level data products. Various USGS Geographic Science Centers are charged with particular ECV algorithm development and (or) selection as well as the evaluation and application demonstration of various USGS CDRs and ECVs. Because it is a foundation for many other ECVs, the first CDR in development is the Landsat Surface Reflectance Product (LSRP). The LSRP incorporates data quality information in a bit-packed structure that is not readily accessible without postprocessing services performed by the user. This document describes two general methods of LSRP quality-data extraction for use in image processing systems. Helpful hints for the installation and use of software originally developed for manipulation of Hierarchical Data Format (HDF) produced through the National Aeronautics and Space Administration (NASA) Earth Observing System are first provided for users who wish to extract quality data into separate HDF files. Next, steps follow to incorporate these extracted data into an image processing system. Finally, an alternative example is illustrated in which the data are extracted within a particular image processing system.
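A minimal sketch of the kind of post-processing the document refers to: unpacking individual quality flags from a bit-packed quality band (NumPy; the bit positions below are illustrative placeholders, not the actual LSRP bit layout):

import numpy as np

def unpack_bit(qa_band: np.ndarray, bit: int) -> np.ndarray:
    """Return a boolean mask for a single flag stored at position `bit`."""
    return (qa_band >> bit) & 1 == 1

# hypothetical example: bit 1 = cloud, bit 3 = water (placeholder positions)
qa = np.array([[0b0000, 0b0010],
               [0b1010, 0b1000]], dtype=np.uint16)
cloud_mask = unpack_bit(qa, 1)
water_mask = unpack_bit(qa, 3)
print(cloud_mask)
print(water_mask)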
Overview of image processing tools to extract physical information from JET videos
NASA Astrophysics Data System (ADS)
Craciunescu, T.; Murari, A.; Gelfusa, M.; Tiseanu, I.; Zoita, V.; EFDA Contributors, JET
2014-11-01
In magnetic confinement nuclear fusion devices such as JET, the last few years have witnessed a significant increase in the use of digital imagery, not only for the surveying and control of experiments, but also for the physical interpretation of results. More than 25 cameras are routinely used for imaging on JET in the infrared (IR) and visible spectral regions. These cameras can produce up to tens of Gbytes per shot and their information content can be very different, depending on the experimental conditions. However, the relevant information about the underlying physical processes is generally of much reduced dimensionality compared to the recorded data. The extraction of this information, which allows full exploitation of these diagnostics, is a challenging task. The image analysis consists, in most cases, of inverse problems which are typically ill-posed mathematically. The typology of objects to be analysed is very wide, and usually the images are affected by noise, low levels of contrast, low grey-level in-depth resolution, reshaping of moving objects, etc. Moreover, the plasma events have time constants of ms or tens of ms, which imposes tough conditions for real-time applications. On JET, in the last few years new tools and methods have been developed for physical information retrieval. The methodology of optical flow has allowed, under certain assumptions, the derivation of information about the dynamics of video objects associated with different physical phenomena, such as instabilities, pellets and filaments. The approach has been extended in order to approximate the optical flow within the MPEG compressed domain, allowing the manipulation of the large JET video databases and, in specific cases, even real-time data processing. The fast visible camera may provide new information that is potentially useful for disruption prediction. A set of methods, based on the extraction of structural information from the visual scene, have been developed for the automatic detection of MARFE (multifaceted asymmetric radiation from the edge) occurrences, which precede disruptions in density limit discharges. An original spot detection method has been developed for large surveys of videos in JET, and for the assessment of the long term trends in their evolution. The analysis of JET IR videos, recorded during JET operation with the ITER-like wall, allows the retrieval of data and hence correlation of the evolution of spots properties with macroscopic events, in particular series of intentional disruptions.
Summary of water body extraction methods based on ZY-3 satellite
NASA Astrophysics Data System (ADS)
Zhu, Yu; Sun, Li Jian; Zhang, Chuan Yin
2017-12-01
Extracting water bodies from remote sensing images is one of the main means of water information extraction. Because of their spectral characteristics, many methods cannot be applied to ZY-3 satellite imagery. To solve this problem, we summarize the extraction methods applicable to ZY-3 and analyze the extraction results of existing methods. Based on the characteristics of these results, a method combining a water index (WI) with a single-band threshold and a method of texture filtering based on probability statistics are explored. In addition, the advantages and disadvantages of all methods are compared, which provides a reference for research on water extraction from images. The conclusions obtained are as follows. 1) The NIR band has higher sensitivity to water; consequently, when the surface reflectance in the study area is less similar to that of water, a single-band threshold method or a multi-band operation can obtain the desired effect. 2) Compared with the water index and HIS optimal index methods, rule-based object extraction, which takes into account not only the spectral information of the water but also spatial and textural constraints, can obtain a better extraction effect, yet the image segmentation process is time-consuming and the definition of the rules requires certain prior knowledge. 3) The combination of spectral relationships with a water index can eliminate the interference of shadows to a certain extent. When there are few small water bodies, or small water bodies are not considered in further study, texture filtering based on probability statistics can effectively reduce the noise in the result and, to a certain extent, avoid mixing shadows or paddy fields with water.
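A minimal sketch of the water-index plus single-band-threshold combination discussed above, here using an NDWI-style index computed from green and NIR reflectance (NumPy; the band choice and thresholds are illustrative, not values from the study):

import numpy as np

def water_mask(green: np.ndarray, nir: np.ndarray,
               wi_threshold: float = 0.0, nir_threshold: float = 0.1) -> np.ndarray:
    """Combine a water index with a single-band NIR threshold."""
    eps = 1e-6
    ndwi = (green - nir) / (green + nir + eps)   # water tends to give NDWI > 0
    # NIR reflectance of water is low, which helps reject bright shadows and soil
    return (ndwi > wi_threshold) & (nir < nir_threshold)

# usage on small synthetic reflectance arrays
green = np.array([[0.10, 0.30], [0.05, 0.20]])
nir = np.array([[0.25, 0.05], [0.30, 0.02]])
print(water_mask(green, nir))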
NASA Astrophysics Data System (ADS)
Bredfeldt, Jeremy S.; Liu, Yuming; Pehlke, Carolyn A.; Conklin, Matthew W.; Szulczewski, Joseph M.; Inman, David R.; Keely, Patricia J.; Nowak, Robert D.; Mackie, Thomas R.; Eliceiri, Kevin W.
2014-01-01
Second-harmonic generation (SHG) imaging can help reveal interactions between collagen fibers and cancer cells. Quantitative analysis of SHG images of collagen fibers is challenged by the heterogeneity of collagen structures and low signal-to-noise ratio often found while imaging collagen in tissue. The role of collagen in breast cancer progression can be assessed post acquisition via enhanced computation. To facilitate this, we have implemented and evaluated four algorithms for extracting fiber information, such as number, length, and curvature, from a variety of SHG images of collagen in breast tissue. The image-processing algorithms included a Gaussian filter, SPIRAL-TV filter, Tubeness filter, and curvelet-denoising filter. Fibers are then extracted using an automated tracking algorithm called fiber extraction (FIRE). We evaluated the algorithm performance by comparing length, angle and position of the automatically extracted fibers with those of manually extracted fibers in twenty-five SHG images of breast cancer. We found that the curvelet-denoising filter followed by FIRE, a process we call CT-FIRE, outperforms the other algorithms under investigation. CT-FIRE was then successfully applied to track collagen fiber shape changes over time in an in vivo mouse model for breast cancer.
NASA Astrophysics Data System (ADS)
Tai, Do Chiem; Hai, Dam Thi Thanh; Vinh, Nguyen Hanh; Phung, Le Thi Kim
2016-06-01
In this research, the fatty acids of isolated microalgae were extracted by several techniques, such as maceration, Soxhlet extraction, ultrasonic-assisted extraction and supercritical fluid extraction, and analyzed for biodiesel production using GC-MS. This work deals with the extraction of microalgae oil from dry biomass using the supercritical fluid extraction method. A complete laboratory-scale study of the influence of several parameters on the extraction kinetics and yields, and on the composition of the oil in terms of lipid classes and profiles, is proposed. Two types of microalgae were studied: Chlorella sp. and Spirulina sp. For the extraction of oil from microalgae, supercritical CO2 (SC-CO2) is regarded with interest, being safer than n-hexane and offering a negligible environmental impact, a short extraction time and a high-quality final product. Whilst some experimental papers are available on the supercritical fluid extraction (SFE) of oil from microalgae, only limited information exists on the kinetics of the process. These results demonstrate that supercritical CO2 extraction is an efficient method for the complete recovery of the neutral lipid phase.
a Probability-Based Statistical Method to Extract Water Body of TM Images with Missing Information
NASA Astrophysics Data System (ADS)
Lian, Shizhong; Chen, Jiangping; Luo, Minghai
2016-06-01
Water information cannot be accurately extracted from TM images in which true information is lost because of cloud cover and missing data stripes. Since water is continuously distributed under natural conditions, this paper proposes a new method of water body extraction based on probability statistics to improve the accuracy of water information extraction from TM images with missing information. Different disturbances from clouds and missing data stripes are simulated. Water information is extracted using global histogram matching, local histogram matching, and the probability-based statistical method on the simulated images. Experiments show that a smaller Areal Error and a higher Boundary Recall can be obtained using this method compared with the conventional methods.
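A minimal sketch of global histogram matching, one of the baseline steps compared above: the cumulative distribution of the damaged image is mapped onto that of a reference image (NumPy; the synthetic images are illustrative stand-ins):

import numpy as np

def match_histogram(source: np.ndarray, reference: np.ndarray) -> np.ndarray:
    """Remap source grey levels so that its CDF follows the reference image's CDF."""
    s_values, s_idx, s_counts = np.unique(source.ravel(),
                                          return_inverse=True, return_counts=True)
    r_values, r_counts = np.unique(reference.ravel(), return_counts=True)
    s_cdf = np.cumsum(s_counts) / source.size
    r_cdf = np.cumsum(r_counts) / reference.size
    # for each source grey level, find the reference level with the closest CDF value
    matched_values = np.interp(s_cdf, r_cdf, r_values)
    return matched_values[s_idx].reshape(source.shape)

rng = np.random.default_rng(3)
damaged = rng.integers(0, 128, (64, 64)).astype(float)    # darker, clipped image
reference = rng.integers(0, 256, (64, 64)).astype(float)
print(match_histogram(damaged, reference).mean())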
NASA Astrophysics Data System (ADS)
Jenerowicz, Małgorzata; Kemper, Thomas
2016-10-01
Every year thousands of people are displaced by conflicts or natural disasters and often gather in large camps. Knowing how many people have gathered is crucial for an efficient relief operation. However, it is often difficult to collect exact information on the total population. This paper presents an improved morphological methodology for the estimation of dwelling structures located in several Internally Displaced Persons (IDP) camps, based on Very High Resolution (VHR) multispectral satellite imagery with pixel sizes of 1 meter or less, including GeoEye-1, WorldView-2, QuickBird-2, Ikonos-2, Pléiades-A and Pléiades-B. The main topics of this paper are the enhancement of the approach through the selection of the feature extraction algorithm, and the improvement and automation of pre-processing and results verification. For the extraction of informal and temporary dwellings, high data quality has to be ensured. The pre-processing has been extended by including the assignment of an input data hierarchy level and the selection and evaluation of a data fusion method. The feature extraction algorithm follows the procedure presented in Jenerowicz, M., Kemper, T., 2011. Optical data are analysed in a cyclic approach comprising image segmentation and geometrical, textural and spectral class modelling aimed at camp area identification. The successive steps of morphological processing have been combined into one stand-alone application for automatic dwelling detection and enumeration. Actively implemented, these approaches can provide reliable and consistent results, independent of the imaging satellite type and of different study site locations, providing decision support in emergency response for the humanitarian community such as the United Nations, the European Union and non-governmental relief organizations.
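A minimal sketch of the final morphological step: cleaning a candidate dwelling mask with opening and closing and enumerating the remaining connected components (SciPy; the binary mask, structuring-element size and area bounds are illustrative assumptions, not the paper's parameters):

import numpy as np
from scipy import ndimage

def count_dwellings(mask: np.ndarray, min_px: int = 4, max_px: int = 400) -> int:
    """Count connected bright objects in a binary mask within a plausible size range."""
    structure = np.ones((3, 3), dtype=bool)
    cleaned = ndimage.binary_opening(mask, structure=structure)    # remove speckle
    cleaned = ndimage.binary_closing(cleaned, structure=structure) # fill small gaps
    labels, n = ndimage.label(cleaned)
    sizes = ndimage.sum(cleaned, labels, index=np.arange(1, n + 1))
    return int(np.sum((sizes >= min_px) & (sizes <= max_px)))

rng = np.random.default_rng(4)
mask = rng.random((200, 200)) > 0.97   # stand-in for a bright-roof classification
print(count_dwellings(mask))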
Madkour, Mohcine; Benhaddou, Driss; Tao, Cui
2016-01-01
Background and Objective: We live our lives by the calendar and the clock, but time is also an abstraction, even an illusion. The sense of time can be both domain-specific and complex, and is often left implicit, requiring significant domain knowledge to accurately recognize and harness. In the clinical domain, the momentum gained from recent advances in infrastructure and governance practices has enabled the collection of a tremendous amount of data at each moment in time. Electronic Health Records (EHRs) have paved the way to making these data available for practitioners and researchers. However, temporal data representation, normalization, extraction and reasoning are very important in order to mine such massive data and therefore to construct the clinical timeline. The objective of this work is to provide an overview of the problem of constructing a timeline at the clinical point of care and to summarize the state of the art in processing temporal information in clinical narratives. Methods: This review surveys the methods used in three important areas: modeling and representing time, medical NLP methods for extracting time, and methods of temporal reasoning and processing. The review emphasizes the current gap between present methods and semantic web technologies and examines possible combinations. Results: The main finding of this review is the importance of time processing not only in constructing timelines and clinical decision support systems but also as a vital component of EHR data models and operations. Conclusions: Extracting temporal information from clinical narratives is a challenging task. The inclusion of ontologies and the semantic web will lead to better assessment of the annotation task and, together with medical NLP techniques, will help resolve granularity and co-reference resolution problems. PMID:27040831
Knowledge-driven information mining in remote-sensing image archives
NASA Astrophysics Data System (ADS)
Datcu, M.; Seidel, K.; D'Elia, S.; Marchetti, P. G.
2002-05-01
Users in all domains require information or information-related services that are focused, concise, reliable, low cost and timely and which are provided in forms and formats compatible with the user's own activities. In the current Earth Observation (EO) scenario, the archiving centres generally only offer data, images and other "low level" products. The user's needs are being only partially satisfied by a number of, usually small, value-adding companies applying time-consuming (mostly manual) and expensive processes relying on the knowledge of experts to extract information from those data or images.
Text Mining in Biomedical Domain with Emphasis on Document Clustering
2017-01-01
Objectives With the exponential increase in the number of articles published every year in the biomedical domain, there is a need to build automated systems to extract unknown information from the articles published. Text mining techniques enable the extraction of unknown knowledge from unstructured documents. Methods This paper reviews text mining processes in detail and the software tools available to carry out text mining. It also reviews the roles and applications of text mining in the biomedical domain. Results Text mining processes, such as search and retrieval of documents, pre-processing of documents, natural language processing, methods for text clustering, and methods for text classification are described in detail. Conclusions Text mining techniques can facilitate the mining of vast amounts of knowledge on a given topic from published biomedical research articles and draw meaningful conclusions that are not possible otherwise. PMID:28875048
NASA Astrophysics Data System (ADS)
Oby, Emily R.; Perel, Sagi; Sadtler, Patrick T.; Ruff, Douglas A.; Mischel, Jessica L.; Montez, David F.; Cohen, Marlene R.; Batista, Aaron P.; Chase, Steven M.
2016-06-01
Objective. A traditional goal of neural recording with extracellular electrodes is to isolate action potential waveforms of an individual neuron. Recently, in brain-computer interfaces (BCIs), it has been recognized that threshold crossing events of the voltage waveform also convey rich information. To date, the threshold for detecting threshold crossings has been selected to preserve single-neuron isolation. However, the optimal threshold for single-neuron identification is not necessarily the optimal threshold for information extraction. Here we introduce a procedure to determine the best threshold for extracting information from extracellular recordings. We apply this procedure in two distinct contexts: the encoding of kinematic parameters from neural activity in primary motor cortex (M1), and visual stimulus parameters from neural activity in primary visual cortex (V1). Approach. We record extracellularly from multi-electrode arrays implanted in M1 or V1 in monkeys. Then, we systematically sweep the voltage detection threshold and quantify the information conveyed by the corresponding threshold crossings. Main Results. The optimal threshold depends on the desired information. In M1, velocity is optimally encoded at higher thresholds than speed; in both cases the optimal thresholds are lower than are typically used in BCI applications. In V1, information about the orientation of a visual stimulus is optimally encoded at higher thresholds than is visual contrast. A conceptual model explains these results as a consequence of cortical topography. Significance. How neural signals are processed impacts the information that can be extracted from them. Both the type and quality of information contained in threshold crossings depend on the threshold setting. There is more information available in these signals than is typically extracted. Adjusting the detection threshold to the parameter of interest in a BCI context should improve our ability to decode motor intent, and thus enhance BCI control. Further, by sweeping the detection threshold, one can gain insights into the topographic organization of the nearby neural tissue.
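A minimal sketch of the sweep described above: count threshold crossings at a range of detection thresholds and quantify, for each threshold, the mutual information between the crossing counts and the stimulus parameter (NumPy; the voltage traces, stimulus labels, thresholds and bin edges are synthetic stand-ins, not the study's data):

import numpy as np

def mutual_information(x: np.ndarray, y: np.ndarray) -> float:
    """Mutual information (bits) between two discrete label arrays."""
    joint = np.histogram2d(x, y, bins=(np.unique(x).size, np.unique(y).size))[0]
    pxy = joint / joint.sum()
    px, py = pxy.sum(1, keepdims=True), pxy.sum(0, keepdims=True)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))

rng = np.random.default_rng(5)
n_trials, n_samples = 400, 1000
stimulus = rng.integers(0, 4, n_trials)                       # 4 stimulus conditions
# synthetic voltage: noise plus stimulus-dependent negative deflections
voltage = rng.normal(0, 1, (n_trials, n_samples))
voltage -= (stimulus[:, None] + 1) * (rng.random((n_trials, n_samples)) < 0.01) * 3

for thr in [-2.0, -3.0, -4.0, -5.0]:
    counts = (voltage < thr).sum(axis=1)                      # threshold-crossing counts
    binned = np.digitize(counts, np.quantile(counts, [0.25, 0.5, 0.75]))
    print(thr, round(mutual_information(binned, stimulus), 3))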
Yang, Heejung; Lee, Dong Young; Kang, Kyo Bin; Kim, Jeom Yong; Kim, Sun Ok; Yoo, Young Hyo; Sung, Sang Hyun
2015-05-10
A dry purified extract of Panax ginseng (PEG) was prepared using a manufacturing process that includes column chromatography, acid hydrolysis, and an enzyme reaction. During the manufacturing process, the more polar ginsenosides were converted into less polar forms via cleavage of their sugar chains and structural modifications of the aglycones, such as hydroxylation and dehydroxylation. The structural changes of the ginsenosides during the intermediate steps from dried ginseng extract (DGE) to PEG were monitored by ultra-performance liquid chromatography coupled with quadrupole time-of-flight mass spectrometry (UPLC-QTOF/MS). Twenty-two ginsenosides isolated from PEG were used as reference standards for identifying unknown ginsenosides and for suggesting metabolic markers. The elution order of the 22 ginsenosides, which depends on the type of aglycone and on the location and number of sugar chains, can be used for the structural elucidation of unknown ginsenosides. This information can be used in a dereplication process for quick and efficient identification of ginsenoside derivatives in ginseng preparations. The dereplication approach helped to identify the metabolic markers in the UPLC-QTOF/MS chromatograms during the conversion process, using multivariate analyses including principal component analysis (PCA) and orthogonal partial least squares discriminant analysis (OPLS-DA) plots. These metabolic markers were identified by comparison with the dereplication information of the 22 reference ginsenosides, or they were assigned using the pattern of the MS/MS fragment ions. Consequently, the developed metabolic profiling approach using UPLC-QTOF/MS and multivariate analysis represents a new method for quality control as well as useful criteria for a similarity evaluation of the manufacturing process of ginseng preparations. Copyright © 2015 Elsevier B.V. All rights reserved.
Lemmond, Tracy D; Hanley, William G; Guensche, Joseph Wendell; Perry, Nathan C; Nitao, John J; Kidwell, Paul Brandon; Boakye, Kofi Agyeman; Glaser, Ron E; Prenger, Ryan James
2014-05-13
An information extraction system and methods of operating the system are provided. In particular, an information extraction system for performing meta-extraction of named entities of people, organizations, and locations as well as relationships and events from text documents are described herein.
Contextual information and perceptual-cognitive expertise in a dynamic, temporally-constrained task.
Murphy, Colm P; Jackson, Robin C; Cooke, Karl; Roca, André; Benguigui, Nicolas; Williams, A Mark
2016-12-01
Skilled performers extract and process postural information from an opponent during anticipation more effectively than their less-skilled counterparts. In contrast, the role and importance of contextual information in anticipation has received only minimal attention. We evaluate the importance of contextual information in anticipation and examine the underlying perceptual-cognitive processes. We presented skilled and less-skilled tennis players with either normal video or animated footage of the same rallies. In the animated condition, sequences were created using player movement and ball trajectory data, and postural information from the players was removed, constraining participants to anticipate based on contextual information alone. Participants judged the ball bounce location of the opponent's final, occluded shot. Both groups were more accurate than chance in both display conditions, with skilled participants being more accurate than less-skilled participants (Exp. 1). When anticipating based on contextual information alone, skilled participants employed different gaze behaviors to their less-skilled counterparts and provided verbal reports of thoughts indicative of a more thorough evaluation of contextual information (Exp. 2). Findings highlight the importance of both postural and contextual information in anticipation and indicate that perceptual-cognitive expertise is underpinned by processes that facilitate more effective processing of contextual information in the absence of postural information. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Text mining and its potential applications in systems biology.
Ananiadou, Sophia; Kell, Douglas B; Tsujii, Jun-ichi
2006-12-01
With biomedical literature increasing at a rate of several thousand papers per week, it is impossible to keep abreast of all developments; therefore, automated means to manage the information overload are required. Text mining techniques, which involve the processes of information retrieval, information extraction and data mining, provide a means of solving this. By adding meaning to text, these techniques produce a more structured analysis of textual knowledge than simple word searches, and can provide powerful tools for the production and analysis of systems biology models.
NASA Technical Reports Server (NTRS)
1974-01-01
User models, defined as any explicit process or procedure used to transform information extracted from remotely sensed data into a form useful as an input to resource management information, are discussed. The role of user models as information, technological, and operational interfaces between the TERSSE and the resource managers is emphasized. It is recommended that guidelines and management strategies be developed for a systems approach to user model development.
1985-07-08
the original information was processed. Where no processing indicator is given, the information was summarized or extracted. Unfamiliar names... "continuous stream of property acquisitions with the Government owing citizens as much as $11 million; and it was also a period when property owners... reduction in property taxes." Dr. Haynes said many buildings erected in Bridgetown in the past five years were now in a situation where they had...
Hemispheric association and dissociation of voice and speech information processing in stroke.
Jones, Anna B; Farrall, Andrew J; Belin, Pascal; Pernet, Cyril R
2015-10-01
As we listen to someone speaking, we extract both linguistic and non-linguistic information. Knowing how these two sets of information are processed in the brain is fundamental for the general understanding of social communication, speech recognition and the therapy of language impairments. We investigated the pattern of performance in phoneme versus gender categorization in left and right hemisphere stroke patients, and found an anatomo-functional dissociation in the right frontal cortex, establishing a new syndrome in voice discrimination abilities. In addition, phoneme and gender performances were more often associated than dissociated in the left hemisphere patients, suggesting common neural underpinnings. Copyright © 2015 Elsevier Ltd. All rights reserved.
Remote Sensing Extraction of Stopes and Tailings Ponds in AN Ultra-Low Iron Mining Area
NASA Astrophysics Data System (ADS)
Ma, B.; Chen, Y.; Li, X.; Wu, L.
2018-04-01
With the development of the economy, global demand for steel has accelerated since 2000, and mining activities for iron ore have accordingly become intensive. Ultra-low-grade iron ore has been extracted by open-pit mining and processed on a massive scale since 2001 in Kuancheng County, Hebei Province. There are large-scale stopes and tailings ponds in this area. It is important to extract their spatial distribution information for environmental protection and disaster prevention. A remote sensing method for extracting stopes and tailings ponds is studied based on spectral characteristics, using Landsat 8 OLI imagery and ground spectral data. The overall accuracy of extraction is 95.06 %. In addition, tailings ponds are distinguished from stopes based on thermal characteristics using a temperature image. The results could provide decision support for environmental protection, disaster prevention, and ecological restoration in the ultra-low-grade iron ore mining area.
15 CFR Supplement No. 2 to Part 710 - Definitions of Production
Code of Federal Regulations, 2010 CFR
2010-01-01
... reaction Produced by synthesis* Formation through chemical synthesis. Processing to extract and isolate... (Continued) BUREAU OF INDUSTRY AND SECURITY, DEPARTMENT OF COMMERCE CHEMICAL WEAPONS CONVENTION REGULATIONS GENERAL INFORMATION AND OVERVIEW OF THE CHEMICAL WEAPONS CONVENTION REGULATIONS (CWCR) Pt. 710, Supp. 2...
Refraction-based X-ray Computed Tomography for Biomedical Purpose Using Dark Field Imaging Method
NASA Astrophysics Data System (ADS)
Sunaguchi, Naoki; Yuasa, Tetsuya; Huo, Qingkai; Ichihara, Shu; Ando, Masami
We have proposed a tomographic x-ray imaging system using DFI (dark field imaging) optics along with a data-processing method to extract information on refraction from the measured intensities, and a reconstruction algorithm to reconstruct a refractive-index field from the projections generated from the extracted refraction information. The DFI imaging system consists of a tandem optical system of Bragg- and Laue-case crystals, a positioning device system for a sample, and two CCD (charge coupled device) cameras. Then, we developed a software code to simulate the data-acquisition, data-processing, and reconstruction methods to investigate the feasibility of the proposed methods. Finally, in order to demonstrate its efficacy, we imaged a sample with DCIS (ductal carcinoma in situ) excised from a breast cancer patient using a system constructed at the vertical wiggler beamline BL-14C in KEK-PF. Its CT images depicted a variety of fine histological structures, such as milk ducts, duct walls, secretions, adipose and fibrous tissue. They correlate well with histological sections.
Image processing of metal surface with structured light
NASA Astrophysics Data System (ADS)
Luo, Cong; Feng, Chang; Wang, Congzheng
2014-09-01
In a structured light vision measurement system, the ideal image of the structured light stripe contains, apart from the black background, only the gray-level information at the position of the stripe. However, the actual image contains image noise, complex background and other content that does not belong to the stripe and interferes with the useful information. To extract the stripe center on a metal surface accurately, a new processing method is presented. Adaptive median filtering gives a preliminary removal of noise, and the noise introduced by the CCD camera and the measurement environment can be further removed with a difference image method. To highlight fine details and enhance the blurred regions between the stripe and the noise, a sharpening algorithm is used which combines the best features of the Laplacian and Sobel operators. Morphological opening and closing operations are used to compensate for the loss of information. Experimental results show that this method is effective in image processing, not only suppressing noise but also heightening contrast, which is beneficial for subsequent processing.
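A rough sketch of such a stripe-extraction pipeline is given below, assuming grayscale uint8 images of the scene with and without the projected stripe. The fixed-window median filter stands in for the adaptive filter mentioned in the abstract, and the sharpening weights, kernel sizes and centroid step are illustrative choices rather than the authors' exact method.

```python
import cv2
import numpy as np

def stripe_centers(frame: np.ndarray, background: np.ndarray) -> np.ndarray:
    """Return (x, y) stripe centres, one per image column (grayscale uint8 input)."""
    # 1) median filtering suppresses impulse noise
    f = cv2.medianBlur(frame, 5)
    b = cv2.medianBlur(background, 5)

    # 2) difference image removes static background and sensor artefacts
    diff = cv2.absdiff(f, b)

    # 3) sharpen: combine Laplacian (fine detail) and Sobel (edge strength)
    lap = cv2.Laplacian(diff, cv2.CV_32F)
    sob = cv2.magnitude(cv2.Sobel(diff, cv2.CV_32F, 1, 0),
                        cv2.Sobel(diff, cv2.CV_32F, 0, 1))
    sharp = cv2.convertScaleAbs(diff.astype(np.float32) + 0.5 * lap + 0.5 * sob)

    # 4) morphological opening/closing cleans speckle and closes small gaps
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
    mask = cv2.morphologyEx(sharp, cv2.MORPH_OPEN, kernel)
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)

    # 5) grey-level centroid per column gives the stripe centre line
    cols = np.arange(mask.shape[1])
    weights = mask.astype(np.float32)
    col_sum = weights.sum(axis=0)
    ys = np.where(col_sum > 0,
                  (weights * np.arange(mask.shape[0])[:, None]).sum(axis=0)
                  / np.maximum(col_sum, 1e-6),
                  -1)
    return np.column_stack([cols, ys])
```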
Zhang, Yin; Diao, Tianxi; Wang, Lei
2014-12-01
Designed to advance the two-way translational process between basic research and clinical practice, translational medicine has become one of the most important areas in biomedicine. The quantitative evaluation of translational medicine is valuable for decision making in global translational medical research and funding. Using scientometric analysis and information extraction techniques, this study quantitatively analyzed the scientific articles on translational medicine. The results showed that translational medicine had significant scientific output and impact, a specific core field and institute, and outstanding academic status and benefit. Although not considered in this study, patent data are another important indicator that should be integrated into related research in the future. © 2014 Wiley Periodicals, Inc.
Breast cancer mitosis detection in histopathological images with spatial feature extraction
NASA Astrophysics Data System (ADS)
Albayrak, Abdülkadir; Bilgin, Gökhan
2013-12-01
In this work, cellular mitosis detection in histopathological images has been investigated. Mitosis detection is a very expensive and time-consuming process. The development of digital imaging in pathology has enabled a reasonable and effective solution to this problem. Segmentation of digital images provides easier analysis of cell structures in histopathological data. To differentiate normal and mitotic cells in histopathological images, the feature extraction step is crucial for system accuracy. A mitotic cell shows more distinctive textural dissimilarities than other, normal cells. Hence, it is important to incorporate spatial information in feature extraction or in post-processing steps. As a main part of this study, the Haralick texture descriptor has been proposed with different spatial window sizes in the RGB and L*a*b* color spaces. In this way, the spatial dependencies of normal and mitotic cellular pixels can be evaluated within different pixel neighborhoods. Extracted features are compared with various sample sizes by Support Vector Machines using the k-fold cross-validation method. According to the presented results, the separation accuracy for mitotic and non-mitotic cellular pixels improves with increasing spatial window size.
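A minimal sketch of this kind of pipeline, Haralick (GLCM) texture features followed by an SVM with k-fold cross-validation, is shown below. The patch data and labels are random placeholders, the property list and window size are illustrative, and the function names are spelled greycomatrix/greycoprops in scikit-image releases before 0.19.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops  # 'greycomatrix' in older scikit-image
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def haralick_features(patch: np.ndarray) -> np.ndarray:
    """GLCM-based texture descriptor for one grey-level uint8 image patch."""
    glcm = graycomatrix(patch, distances=[1, 2], angles=[0, np.pi / 2],
                        levels=256, symmetric=True, normed=True)
    props = ["contrast", "homogeneity", "energy", "correlation"]
    return np.hstack([graycoprops(glcm, p).ravel() for p in props])

# patches: windows centred on candidate cells; labels: 1 = mitotic, 0 = non-mitotic
# (random placeholders here, only to show the data shapes involved)
patches = [np.random.randint(0, 256, (32, 32), dtype=np.uint8) for _ in range(40)]
labels = np.random.randint(0, 2, 40)

X = np.vstack([haralick_features(p) for p in patches])
scores = cross_val_score(SVC(kernel="rbf", C=1.0), X, labels, cv=5)
print("k-fold accuracy:", scores.mean())
```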
Patil, Maheshkumar Prakash; Kim, Gun-Do
2017-01-01
This review covers general information about the eco-friendly process for the synthesis of silver nanoparticles (AgNP) and gold nanoparticles (AuNP) and focuses on the mechanisms of the antibacterial activity of AgNPs and the anticancer activity of AuNPs. Biomolecules in the plant extract are involved in the reduction of metal ions to nanoparticles in a one-step, eco-friendly synthesis process. Natural plant extracts contain a wide range of metabolites including carbohydrates, alkaloids, terpenoids, phenolic compounds, and enzymes. A variety of plant species and plant parts have been successfully extracted and utilized for AgNP and AuNP syntheses. Green-synthesized nanoparticles eliminate the need for a stabilizing and capping agent and show shape- and size-dependent biological activities. Here, we describe some of the plant extracts involved in nanoparticle synthesis, characterization methods, and biological applications. Nanoparticles are important in the field of pharmaceuticals for their strong antibacterial and anticancer activity. Considering the importance and uniqueness of this concept, the synthesis, characterization, and application of AgNPs and AuNPs are discussed in this review.
Automated detection and location of indications in eddy current signals
Brudnoy, David M.; Oppenlander, Jane E.; Levy, Arthur J.
2000-01-01
A computer implemented information extraction process that locates and identifies eddy current signal features in digital point-ordered signals, signals representing data from inspection of test materials, by enhancing the signal features relative to signal noise, detecting features of the signals, verifying the location of the signal features that can be known in advance, and outputting information about the identity and location of all detected signal features.
ATtRACT-a database of RNA-binding proteins and associated motifs.
Giudice, Girolamo; Sánchez-Cabo, Fátima; Torroja, Carlos; Lara-Pezzi, Enrique
2016-01-01
RNA-binding proteins (RBPs) play a crucial role in key cellular processes, including RNA transport, splicing, polyadenylation and stability. Understanding the interaction between RBPs and RNA is key to improving our knowledge of RNA processing, localization and regulation in a global manner. Despite advances in recent years, a unified non-redundant resource that includes information on experimentally validated motifs, RBPs and integrated tools to exploit this information is lacking. Here, we developed a database named ATtRACT (available at http://attract.cnic.es) that compiles information on 370 RBPs and 1583 RBP consensus binding motifs, 192 of which are not present in any other database. To populate ATtRACT we (i) extracted and hand-curated experimentally validated data from the CISBP-RNA, SpliceAid-F and RBPDB databases, (ii) integrated and updated the unavailable ASD database and (iii) extracted information from protein-RNA complexes present in the Protein Data Bank through computational analyses. ATtRACT also provides efficient algorithms to search for a specific motif and scan one or more RNA sequences at a time. It also allows discovering de novo motifs enriched in a set of related sequences and comparing them with the motifs included in the database. Database URL: http://attract.cnic.es. © The Author(s) 2016. Published by Oxford University Press.
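The sketch below shows the simplest form of motif scanning that such a resource enables: matching a degenerate consensus against an RNA sequence. The IUPAC degeneracy table is standard, but the consensus and transcript fragment are illustrative, and the actual ATtRACT tools use position weight matrices and richer scoring rather than exact consensus matching.

```python
import re

# Standard IUPAC nucleotide codes (RNA alphabet) for degenerate positions.
IUPAC = {"A": "A", "C": "C", "G": "G", "U": "U", "R": "[AG]", "Y": "[CU]",
         "S": "[GC]", "W": "[AU]", "K": "[GU]", "M": "[AC]", "N": "[ACGU]"}

def scan(sequence: str, consensus: str):
    """Yield (position, match) for every occurrence of a consensus motif."""
    pattern = "".join(IUPAC[base] for base in consensus.upper())
    for m in re.finditer(pattern, sequence.upper().replace("T", "U")):
        yield m.start(), m.group()

# Illustrative consensus and transcript fragment only.
for pos, hit in scan("AUGGCUUGUAAAUAUUUAUUGCC", "UGUANAUA"):
    print(pos, hit)
```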
Automatic Extraction of High-Resolution Rainfall Series from Rainfall Strip Charts
NASA Astrophysics Data System (ADS)
Saa-Requejo, Antonio; Valencia, Jose Luis; Garrido, Alberto; Tarquis, Ana M.
2015-04-01
Soil erosion is a complex phenomenon involving the detachment and transport of soil particles, storage and runoff of rainwater, and infiltration. The relative magnitude and importance of these processes depend on a host of factors, including climate, soil, topography, and cropping and land management practices, among others. Most models for soil erosion or hydrological processes need an accurate storm characterization. However, these data are not always available, and in some cases indirect models are generated to fill this gap. In Spain, rain intensity data for time periods of less than 24 hours date back only to 1924, and many studies are limited by this. In many cases these data are stored on rainfall strip charts at the meteorological stations but have not been transferred into numerical form. To overcome this deficiency in the raw data, a process of information extraction from large numbers of rainfall strip charts has been implemented in computer software. The method, based on van Piggelen et al. (2011), largely automates the labour-intensive extraction work. The method consists of the following five basic steps: 1) scanning the charts to high-resolution digital images, 2) manually and visually registering relevant meta information from charts and pre-processing, 3) applying automatic curve extraction software in a batch process to determine the coordinates of cumulative rainfall lines on the images (main step), 4) post-processing the curves that were not correctly determined in step 3, and 5) aggregating the cumulative rainfall in pixel coordinates to the desired time resolution. A colour detection procedure is introduced that automatically separates the background of the charts and rolls from the grid and subsequently the rainfall curve. The rainfall curve is detected by minimization of a cost function. Some utilities have been added to improve on the previous work and automate some auxiliary processes: readjusting the bands properly, merging bands that have been scanned in two parts, and detecting and cutting the borders of unused bands (required by the software). Variations in the colour system used, based on hue or RGB, have also been included. By applying this digitization to rainfall strip charts, 209 station-years from three locations in the centre of Spain have been transformed into long-term rainfall time series with 5-min resolution. Reference: van Piggelen, H.E., T. Brandsma, H. Manders, and J. F. Lichtenauer, 2011: Automatic Curve Extraction for Digitizing Rainfall Strip Charts. J. Atmos. Oceanic Technol., 28, 891-906. Acknowledgements: Financial support for this research by the DURERO Project (Env.C1.3913442) is greatly appreciated.
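The colour-based separation and per-column curve reading (roughly steps 2-3 above) can be sketched as follows. The HSV thresholds and the assumption that the pen trace is dark while the printed grid is a saturated colour are illustrative and would need tuning per chart series; the cost-function minimisation used in the actual method is not reproduced here.

```python
import cv2
import numpy as np

def rainfall_curve(chart_bgr: np.ndarray) -> np.ndarray:
    """Return one curve ordinate (row index) per chart column, or -1 where
    no trace pixel was found."""
    hsv = cv2.cvtColor(chart_bgr, cv2.COLOR_BGR2HSV)

    # Printed grid assumed to be a saturated (coloured) ink: mask it out.
    grid = cv2.inRange(hsv, (0, 80, 0), (180, 255, 255))

    # Pen trace assumed to be the darkest ink that is not part of the grid.
    dark = cv2.inRange(hsv, (0, 0, 0), (180, 255, 90))
    trace = cv2.bitwise_and(dark, cv2.bitwise_not(grid))

    _, w = trace.shape
    ys = np.full(w, -1.0)
    rows, cols = np.nonzero(trace)
    for c in range(w):
        hits = rows[cols == c]
        if hits.size:
            ys[c] = hits.mean()   # centre of the trace in this column (pixels)
    return ys
```

The pixel ordinates would then be converted to cumulative rainfall and aggregated to the desired time resolution, as in step 5.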
Tracking transcriptional activities with high-content epifluorescent imaging
NASA Astrophysics Data System (ADS)
Hua, Jianping; Sima, Chao; Cypert, Milana; Gooden, Gerald C.; Shack, Sonsoles; Alla, Lalitamba; Smith, Edward A.; Trent, Jeffrey M.; Dougherty, Edward R.; Bittner, Michael L.
2012-04-01
High-content cell imaging based on fluorescent protein reporters has recently been used to track the transcriptional activities of multiple genes under different external stimuli for extended periods. This technology enhances our ability to discover treatment-induced regulatory mechanisms, temporally order their onsets and recognize their relationships. To fully realize these possibilities and explore their potential in biological and pharmaceutical applications, we introduce a new data processing procedure to extract information about the dynamics of cell processes based on this technology. The proposed procedure contains two parts: (1) image processing, where the fluorescent images are processed to identify individual cells and allow their transcriptional activity levels to be quantified; and (2) data representation, where the extracted time course data are summarized and represented in a way that facilitates efficient evaluation. Experiments show that the proposed procedure achieves fast and robust image segmentation with sufficient accuracy. The extracted cellular dynamics are highly reproducible and sensitive enough to detect subtle activity differences and identify mechanisms responding to selected perturbations. This method should be able to help biologists identify the alterations of cellular mechanisms that allow drug candidates to change cell behavior and thereby improve the efficiency of drug discovery and treatment design.
NASA Astrophysics Data System (ADS)
Römer, H.; Kiefl, R.; Henkel, F.; Wenxi, C.; Nippold, R.; Kurz, F.; Kippnich, U.
2016-06-01
Enhancing situational awareness in real-time (RT) civil protection and emergency response scenarios requires the development of comprehensive monitoring concepts combining classical remote sensing disciplines with geospatial information science. In the VABENE++ project of the German Aerospace Center (DLR), monitoring tools are being developed in which innovative data acquisition approaches are combined with information extraction as well as the generation and dissemination of information products to a specific user. DLR's 3K and 4k camera systems, which allow for RT acquisition and pre-processing of high resolution aerial imagery, are applied in two application examples conducted with end users: a civil protection exercise with humanitarian relief organisations and a large open-air music festival in cooperation with a festival organising company. This study discusses how airborne remote sensing can significantly contribute to both situational assessment and awareness, focussing on the downstream processes required for extracting information from imagery and for visualising and disseminating imagery in combination with other geospatial information. Valuable user feedback and impetus for further developments have been obtained from both applications, referring to innovations in thematic image analysis (supporting festival site management) and product dissemination (editable web services). Thus, this study emphasises the important role of user involvement in application-related research, i.e. by aligning it more closely with users' requirements.
Hsieh, Chi-Hsuan; Chiu, Yu-Fang; Shen, Yi-Hsiang; Chu, Ta-Shun; Huang, Yuan-Hao
2016-02-01
This paper presents an ultra-wideband (UWB) impulse-radio radar signal processing platform used to analyze human respiratory features. Conventional radar systems used in human detection only analyze human respiration rates or the response of a target. However, additional respiratory signal information is available that has not been explored using radar detection. The authors previously proposed a modified raised cosine waveform (MRCW) respiration model and an iterative correlation search algorithm that could acquire additional respiratory features such as the inspiration and expiration speeds, respiration intensity, and respiration holding ratio. To realize real-time respiratory feature extraction by using the proposed UWB signal processing platform, this paper proposes a new four-segment linear waveform (FSLW) respiration model. This model offers a superior fit to the measured respiration signal compared with the MRCW model and decreases the computational complexity of feature extraction. In addition, an early-terminated iterative correlation search algorithm is presented, substantially decreasing the computational complexity and yielding negligible performance degradation. These extracted features can be considered the compressed signals used to decrease the amount of data storage required for use in long-term medical monitoring systems and can also be used in clinical diagnosis. The proposed respiratory feature extraction algorithm was designed and implemented using the proposed UWB radar signal processing platform including a radar front-end chip and an FPGA chip. The proposed radar system can detect human respiration rates at 0.1 to 1 Hz and facilitates the real-time analysis of the respiratory features of each respiration period.
The detection and analysis of point processes in biological signals
NASA Technical Reports Server (NTRS)
Anderson, D. J.; Correia, M. J.
1977-01-01
A pragmatic approach to the detection and analysis of discrete events in biomedical signals is taken. Examples from both clinical and basic research are provided. Introductory sections discuss not only discrete events which are easily extracted from recordings by conventional threshold detectors but also events embedded in other information carrying signals. The primary considerations are factors governing event-time resolution and the effects limits to this resolution have on the subsequent analysis of the underlying process. The analysis portion describes tests for qualifying the records as stationary point processes and procedures for providing meaningful information about the biological signals under investigation. All of these procedures are designed to be implemented on laboratory computers of modest computational capacity.
Smart Networked Elements in Support of ISHM
NASA Technical Reports Server (NTRS)
Oostdyk, Rebecca; Mata, Carlos; Perotti, Jose M.
2008-01-01
At the core of ISHM is the ability to extract information and knowledge from raw data. Conventional data acquisition systems sample and convert physical measurements to engineering units, which higher-level systems use to derive health and information about processes and systems. Although health management is essential at the top level, there are considerable advantages to implementing health-related functions at the sensor level. The distribution of processing to lower levels reduces bandwidth requirements, enhances data fusion, and improves the resolution for detection and isolation of failures in a system, subsystem, component, or process. The Smart Networked Element (SNE) has been developed to implement intelligent functions and algorithms at the sensor level in support of ISHM.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wardle, Kent E.; Frey, Kurt; Pereira, Candido
2014-02-02
This task is aimed at predictive modeling of solvent extraction processes in typical extraction equipment through multiple simulation methods at various scales of resolution. We have conducted detailed continuum fluid dynamics simulation on the process unit level as well as simulations of the molecular-level physical interactions which govern extraction chemistry. Through a combination of information gained through simulations at each of these two tiers, along with advanced techniques such as the Lattice Boltzmann Method (LBM) which can bridge these two scales, we can develop the tools to work towards predictive simulation for solvent extraction on the equipment scale (Figure 1). The goal of such a tool, along with enabling optimized design and operation of extraction units, would be to allow prediction of stage extraction efficiency under specified conditions. Simulation efforts on each of the two scales will be described below. As the initial application of FELBM in the work performed during FY10 has been on annular mixing, it will be discussed in the context of the continuum scale. In the future, however, it is anticipated that the real value of FELBM will be in its use as a tool for sub-grid model development through highly refined DNS-like multiphase simulations, facilitating exploration and development of droplet models including breakup and coalescence which will be needed for the large-scale simulations where droplet-level physics cannot be resolved. In this area, it can have a significant advantage over traditional CFD methods as its high computational efficiency allows exploration of significantly greater physical detail, especially as computational resources increase in the future.
Deep Learning for Automated Extraction of Primary Sites From Cancer Pathology Reports.
Qiu, John X; Yoon, Hong-Jun; Fearn, Paul A; Tourassi, Georgia D
2018-01-01
Pathology reports are a primary source of information for cancer registries which process high volumes of free-text reports annually. Information extraction and coding is a manual, labor-intensive process. In this study, we investigated deep learning and a convolutional neural network (CNN), for extracting ICD-O-3 topographic codes from a corpus of breast and lung cancer pathology reports. We performed two experiments, using a CNN and a more conventional term frequency vector approach, to assess the effects of class prevalence and inter-class transfer learning. The experiments were based on a set of 942 pathology reports with human expert annotations as the gold standard. CNN performance was compared against a more conventional term frequency vector space approach. We observed that the deep learning models consistently outperformed the conventional approaches in the class prevalence experiment, resulting in micro- and macro-F score increases of up to 0.132 and 0.226, respectively, when class labels were well populated. Specifically, the best performing CNN achieved a micro-F score of 0.722 over 12 ICD-O-3 topography codes. Transfer learning provided a consistent but modest performance boost for the deep learning methods but trends were contingent on the CNN method and cancer site. These encouraging results demonstrate the potential of deep learning for automated abstraction of pathology reports.
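A minimal sketch of a text CNN of the general kind used for this task is given below. The vocabulary size, sequence length, filter settings and number of codes are placeholders rather than the study's actual configuration, and the training data shown are random integers standing in for tokenised reports and their ICD-O-3 topography labels.

```python
import numpy as np
from tensorflow.keras import layers, models

# Placeholder sizes; real values come from the report corpus and code set.
VOCAB, MAXLEN, N_CODES = 20000, 1500, 12

model = models.Sequential([
    layers.Embedding(VOCAB, 128),          # learned word representations
    layers.Conv1D(128, 5, activation="relu"),  # n-gram-like convolutional filters
    layers.GlobalMaxPooling1D(),           # strongest filter response per report
    layers.Dense(64, activation="relu"),
    layers.Dense(N_CODES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# x: integer-encoded report tokens, y: topography-code indices (toy data).
x = np.random.randint(0, VOCAB, size=(8, MAXLEN))
y = np.random.randint(0, N_CODES, size=(8,))
model.fit(x, y, epochs=1, verbose=0)
```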
Itatani, Tomoya; Nagata, Kyoko; Yanagihara, Kiyoko; Tabuchi, Noriko
2017-08-22
The importance of active learning has continued to increase in Japan. The authors conducted classes for first-year students who entered the nursing program using the problem-based learning method, a kind of active learning. Students discussed social topics in the classes. The purposes of this study were to analyze the post-class essays and to describe logical and critical thinking after students attended a Problem-Based Learning (PBL) course. The authors used Mayring's methodology for qualitative content analysis and text mining. In the descriptions of the skills required to resolve social issues, seven categories were extracted: (recognition of diverse social issues), (attitudes about resolving social issues), (discerning the root cause), (multi-lateral information processing skills), (making a path to resolve issues), (processivity in dealing with issues), and (reflecting). In the descriptions of communication, five categories were extracted: (simple statement), (robust theories), (respecting the opponent), (communication skills), and (attractive presentations). As a result of the text mining, words extracted more than 100 times included "issue," "society," "resolve," "myself," "ability," "opinion," and "information." Education using PBL could be an effective means of improving the skills that students described, and communication in general. Some students felt difficulty communicating owing to characteristics of the Japanese language.
Supporting the Growing Needs of the GIS Industry
NASA Technical Reports Server (NTRS)
2003-01-01
Visual Learning Systems, Inc. (VLS), of Missoula, Montana, has developed a commercial software application called Feature Analyst. Feature Analyst was conceived under a Small Business Innovation Research (SBIR) contract with NASA's Stennis Space Center, and through the Montana State University TechLink Center, an organization funded by NASA and the U.S. Department of Defense to link regional companies with Federal laboratories for joint research and technology transfer. The software provides a paradigm shift to automated feature extraction, as it utilizes spectral, spatial, temporal, and ancillary information to model the feature extraction process; presents the ability to remove clutter; incorporates advanced machine learning techniques to supply unparalleled levels of accuracy; and includes an exceedingly simple interface for feature extraction.
Extracting Inter-business Relationship from World Wide Web
NASA Astrophysics Data System (ADS)
Jin, Yingzi; Matsuo, Yutaka; Ishizuka, Mitsuru
Social relations play an important role in a real community. Interaction patterns reveal relations among actors (such as persons, groups, and companies), which can be merged into valuable information as a network structure. In this paper, we propose a new approach to extracting inter-business relationships from the Web. Extraction of the relation between a pair of companies is realized by using a search engine and text processing. Since names of companies can co-appear coincidentally on the Web, we propose an advanced algorithm characterized by the addition of keywords (which we call relation words) to a query. The relation words are obtained from either an annotated corpus or the Web. We show some examples and comprehensive evaluations of our approach.
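The core idea, querying a search engine for a company pair together with a relation word and scoring the co-occurrence, can be sketched as below. The hit_count wrapper is a hypothetical placeholder for a real search-engine client, and the overlap-coefficient score and threshold are one plausible instantiation rather than the paper's exact formula.

```python
from itertools import combinations

def hit_count(query: str) -> int:
    """Hypothetical wrapper around a search-engine API that returns the number
    of pages matching the query; replace with a real client."""
    raise NotImplementedError

def relation_score(company_a: str, company_b: str, relation_word: str) -> float:
    """Co-occurrence score of a company pair conditioned on a relation word."""
    both = hit_count(f'"{company_a}" "{company_b}" {relation_word}')
    a = hit_count(f'"{company_a}" {relation_word}')
    b = hit_count(f'"{company_b}" {relation_word}')
    return 0.0 if min(a, b) == 0 else both / min(a, b)  # overlap coefficient

def build_network(companies, relation_word="alliance", threshold=0.05):
    """Keep the company pairs whose relation score exceeds a chosen threshold."""
    return [(a, b) for a, b in combinations(companies, 2)
            if relation_score(a, b, relation_word) > threshold]
```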
Extraction and fusion of spectral parameters for face recognition
NASA Astrophysics Data System (ADS)
Boisier, B.; Billiot, B.; Abdessalem, Z.; Gouton, P.; Hardeberg, J. Y.
2011-03-01
Many methods have been developed in image processing for face recognition, especially in recent years with the increase of biometric technologies. However, most of these techniques are used on grayscale images acquired in the visible range of the electromagnetic spectrum. The aims of our study are to improve existing tools and to develop new methods for face recognition. The techniques used take advantage of the different spectral ranges, the visible, optical infrared and thermal infrared, by either combining them or analyzing them separately in order to extract the most appropriate information for face recognition. We also verify the consistency of several keypoints extraction techniques in the Near Infrared (NIR) and in the Visible Spectrum.
Bayesian learning of visual chunks by human observers
Orbán, Gergő; Fiser, József; Aslin, Richard N.; Lengyel, Máté
2008-01-01
Efficient and versatile processing of any hierarchically structured information requires a learning mechanism that combines lower-level features into higher-level chunks. We investigated this chunking mechanism in humans with a visual pattern-learning paradigm. We developed an ideal learner based on Bayesian model comparison that extracts and stores only those chunks of information that are minimally sufficient to encode a set of visual scenes. Our ideal Bayesian chunk learner not only reproduced the results of a large set of previous empirical findings in the domain of human pattern learning but also made a key prediction that we confirmed experimentally. In accordance with Bayesian learning but contrary to associative learning, human performance was well above chance when pair-wise statistics in the exemplars contained no relevant information. Thus, humans extract chunks from complex visual patterns by generating accurate yet economical representations and not by encoding the full correlational structure of the input. PMID:18268353
NASA Astrophysics Data System (ADS)
Wu, T.; Li, T.; Li, J.; Wang, G.
2017-12-01
Improved drainage network extraction can be achieved by flow enforcement, whereby information from known river maps is imposed on the flow-path modeling process. However, the common elevation-based stream burning method can sometimes cause unintended topological errors and misinterpret the overall drainage pattern. We present an enhanced flow enforcement method to facilitate an accurate and efficient drainage network extraction process. Both the topology of the mapped hydrography and the initial landscape of the DEM are well preserved and fully utilized in the proposed method. An improved stream rasterization is achieved here, yielding a continuous, unambiguous and stream-collision-free raster equivalent of the stream vectors for flow enforcement. By imposing priority-based enforcement with a complementary flow direction enhancement procedure, the drainage patterns of the mapped hydrography are fully represented in the derived results. The proposed method was tested over the Rogue River Basin, using DEMs with various resolutions. As indicated by the visual and statistical analyses, the proposed method has three major advantages: (1) it significantly reduces the occurrence of topological errors, yielding very accurate watershed partition and channel delineation, (2) it ensures scale-consistent performance for DEMs of various resolutions, and (3) the entire extraction process is well designed to achieve great computational efficiency.
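For context, the classical elevation-based stream burning that the abstract identifies as error-prone amounts to lowering DEM cells along the rasterised river network so that flow routing follows the mapped hydrography, as in the sketch below. The proposed priority-based enforcement is not reproduced here; the drop value and toy grid are illustrative.

```python
import numpy as np

def burn_streams(dem: np.ndarray, stream_mask: np.ndarray,
                 drop: float = 10.0) -> np.ndarray:
    """Classical elevation-based stream burning: lower every DEM cell crossed
    by a rasterised river segment by a fixed amount so that subsequent flow
    routing follows the mapped hydrography."""
    burned = dem.astype(float).copy()
    burned[stream_mask.astype(bool)] -= drop
    return burned

# Toy 5x5 DEM with a mapped stream running down the middle column.
dem = np.arange(25, dtype=float).reshape(5, 5)
streams = np.zeros((5, 5), dtype=bool)
streams[:, 2] = True
print(burn_streams(dem, streams))
```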
Structural study of complexes formed by acidic and neutral organophosphorus reagents
Braatz, Alexander D.; Antonio, Mark R.; Nilsson, Mikael
2016-12-23
The coordination of the trivalent 4f ions, Ln = La3+, Dy3+, and Lu3+, with neutral and acidic organophosphorus reagents, both individually and combined, was studied by use of X-ray absorption spectroscopy. These studies provide metrical information about the interatomic interactions between these cations and the ligands tri-n-butyl phosphate (TBP) and di-n-butyl phosphoric acid (HDBP), whose behavior is of practical importance to chemical separation processes that are currently used on an industrial scale. Previous studies have suggested the existence of complexes involving a mixture of ligands, accounting for extraction synergy. Through systematic variation of the aqueous phase acidity and extractant concentration and combination, we have found that complexes with Ln and TBP : HDBP at any mixture and HDBP alone involve direct Ln–O interactions involving 6 oxygen atoms and distant Ln–P interactions involving on average 3–5 phosphorus atoms per Ln ion. It was also found that Ln complexes formed by TBP alone seem to favor eight oxygen coordination, though we were unable to obtain metrical results regarding the distant Ln–P interactions due to the low signal attributed to a lower concentration of Ln ions in the organic phases. Our study does not support the existence of mixed Ln–TBP–HDBP complexes but, rather, indicates that the lanthanides are extracted as either Ln–HDBP complexes or Ln–TBP complexes and that these complexes exist in different ratios depending on the conditions of the extraction system. Furthermore, this fundamental structural information offers insight into the solvent extraction processes that are taking place and is of particular importance to issues arising from the separation and disposal of radioactive materials from used nuclear fuel.
NASA Astrophysics Data System (ADS)
Xue, Q.; Tang, J., Sr.; Chen, H.
2017-12-01
High concentrations of ammonium sulfate, often used in the in-situ mining process, can result in a decrease of pH in the environment and dissolution of rare earth metals. Ammonium sulfate can also cause desorption of toxic heavy metals, leading to environmental and human health implications. In this study, the desorption behavior and fraction changes of lead in ion-adsorbed rare earth ore were studied using batch desorption experiments and column leaching tests. Results from the batch desorption experiments showed that the desorption process of lead included fast and slow stages and followed an Elovich model well. The desorption rate and the proportion of lead in solution relative to the total lead in the soil were observed to increase with a decrease in the initial pH of the ammonium sulfate solution. The lead in soil included an acid extractable fraction, a reducible fraction, an oxidizable fraction, and a residual fraction, with the predominant fractions being the reducible and acid extractable fractions. 96% of the extractable fraction in soil was desorbed into solution at pH=3.0, and the content of the reducible fraction was observed to initially increase (when pH>4.0) and then decrease (when pH<4.0) with a decrease in pH. Column leaching tests indicated that the content of lead in the different fractions of soil followed the trend of reducible fraction > oxidizable fraction > acid extractable fraction > residual fraction after the simulated leaching mining process. The change in pH was also found to have a larger influence on the acid extractable and reducible fractions than on the other two fractions. The proportion of the extractable fraction being leached was ca. 86%, and the reducible fraction was enriched along the migration direction of the leaching liquid. These results suggest that certain lead fractions may desorb again and contaminate the environment via acid rain, which provides significant information for environmental assessment and remediation after the mining process.
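The Elovich description of the desorption kinetics can be fitted as sketched below, using the standard integrated form q(t) = (1/beta) ln(1 + alpha*beta*t). The time and desorbed-lead values are invented for illustration and are not data from the study.

```python
import numpy as np
from scipy.optimize import curve_fit

def elovich(t, alpha, beta):
    """Integrated Elovich equation, a common form for sorption/desorption kinetics."""
    return np.log1p(alpha * beta * t) / beta

# Illustrative time (min) and desorbed-lead (mg/kg) data, not from the study.
t = np.array([5, 10, 20, 40, 80, 160, 320], dtype=float)
q = np.array([2.1, 3.0, 3.8, 4.5, 5.3, 6.0, 6.6])

(alpha, beta), _ = curve_fit(elovich, t, q, p0=(1.0, 1.0), maxfev=10000)
print(f"alpha={alpha:.3g}, beta={beta:.3g}")
```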
Text feature extraction based on deep learning: a review.
Liang, Hong; Sun, Xiao; Sun, Yunlei; Gao, Yuan
2017-01-01
Selection of text feature items is a basic and important task for text mining and information retrieval. Traditional methods of feature extraction require handcrafted features. Hand-designing an effective feature is a lengthy process, but for new applications deep learning makes it possible to acquire effective feature representations from training data. As a new feature extraction method, deep learning has made achievements in text mining. The major difference between deep learning and conventional methods is that deep learning automatically learns features from big data, instead of adopting handcrafted features, which depend mainly on the prior knowledge of designers and cannot take full advantage of big data. Deep learning can automatically learn feature representations from big data, with models that include millions of parameters. This review first outlines the common methods used in text feature extraction, then describes frequently used deep learning methods for text feature extraction and their applications, and finally forecasts the application of deep learning in feature extraction.
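The contrast drawn here can be made concrete with a small example of the handcrafted/statistical end of the spectrum: TF-IDF assigns term weights by a fixed formula rather than learning them. The documents and vectoriser settings below are illustrative; a learned alternative would replace this fixed weighting with trained embeddings and convolutional or recurrent layers.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

docs = [
    "deep learning automatically learns feature representations from data",
    "traditional text mining relies on handcrafted features and prior knowledge",
    "feature extraction is central to information retrieval",
]

# Handcrafted/statistical baseline: each document becomes a sparse vector of
# unigram and bigram weights chosen by the TF-IDF formula, not learned from data.
vectorizer = TfidfVectorizer(ngram_range=(1, 2), min_df=1)
X = vectorizer.fit_transform(docs)
print(X.shape, "non-zero weights:", X.nnz)
```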
The use of analytical sedimentation velocity to extract thermodynamic linkage.
Cole, James L; Correia, John J; Stafford, Walter F
2011-11-01
For 25 years, the Gibbs Conference on Biothermodynamics has focused on the use of thermodynamics to extract information about the mechanism and regulation of biological processes. This includes the determination of equilibrium constants for macromolecular interactions by high precision physical measurements. These approaches further reveal thermodynamic linkages to ligand binding events. Analytical ultracentrifugation has been a fundamental technique in the determination of macromolecular reaction stoichiometry and energetics for 85 years. This approach is highly amenable to the extraction of thermodynamic couplings to small molecule binding in the overall reaction pathway. In the 1980s this approach was extended to the use of sedimentation velocity techniques, primarily by the analysis of tubulin-drug interactions by Na and Timasheff. This transport method necessarily incorporates the complexity of both hydrodynamic and thermodynamic nonideality. The advent of modern computational methods in the last 20 years has subsequently made the analysis of sedimentation velocity data for interacting systems more robust and rigorous. Here we review three examples where sedimentation velocity has been useful at extracting thermodynamic information about reaction stoichiometry and energetics. Approaches to extract linkage to small molecule binding and the influence of hydrodynamic nonideality are emphasized. These methods are shown to also apply to the collection of fluorescence data with the new Aviv FDS. Copyright © 2011 Elsevier B.V. All rights reserved.
Extraction of edge-based and region-based features for object recognition
NASA Astrophysics Data System (ADS)
Coutts, Benjamin; Ravi, Srinivas; Hu, Gongzhu; Shrikhande, Neelima
1993-08-01
One of the central problems of computer vision is object recognition. A catalogue of model objects is described as a set of features such as edges and surfaces. The same features are extracted from the scene and matched against the models for object recognition. Edges and surfaces extracted from the scenes are often noisy and imperfect. In this paper algorithms are described for improving low level edge and surface features. Existing edge extraction algorithms are applied to the intensity image to obtain edge features. Initial edges are traced by following directions of the current contour. These are improved by using corresponding depth and intensity information for decision making at branch points. Surface fitting routines are applied to the range image to obtain planar surface patches. An algorithm of region growing is developed that starts with a coarse segmentation and uses quadric surface fitting to iteratively merge adjacent regions into quadric surfaces based on approximate orthogonal distance regression. Surface information obtained is returned to the edge extraction routine to detect and remove fake edges. This process repeats until no more merging or edge improvement can take place. Both synthetic (with Gaussian noise) and real images containing multiple object scenes have been tested using the merging criteria. Results appeared quite encouraging.
Road extraction from aerial images using a region competition algorithm.
Amo, Miriam; Martínez, Fernando; Torre, Margarita
2006-05-01
In this paper, we present a user-guided method based on the region competition algorithm to extract roads, and therefore we also provide some clues concerning the placement of the points required by the algorithm. The initial points are analyzed in order to find out whether it is necessary to add more initial points, and this process will be based on image information. Not only is the algorithm able to obtain the road centerline, but it also recovers the road sides. An initial simple model is deformed by using region growing techniques to obtain a rough road approximation. This model will be refined by region competition. The result of this approach is that it delivers the simplest output vector information, fully recovering the road details as they are on the image, without performing any kind of symbolization. Therefore, we tried to refine a general road model by using a reliable method to detect transitions between regions. This method is proposed in order to obtain information for feeding large-scale Geographic Information System.
MedEx: a medication information extraction system for clinical narratives
Stenner, Shane P; Doan, Son; Johnson, Kevin B; Waitman, Lemuel R; Denny, Joshua C
2010-01-01
Medication information is one of the most important types of clinical data in electronic medical records. It is critical for healthcare safety and quality, as well as for clinical research that uses electronic medical record data. However, medication data are often recorded in clinical notes as free-text. As such, they are not accessible to other computerized applications that rely on coded data. We describe a new natural language processing system (MedEx), which extracts medication information from clinical notes. MedEx was initially developed using discharge summaries. An evaluation using a data set of 50 discharge summaries showed it performed well on identifying not only drug names (F-measure 93.2%), but also signature information, such as strength, route, and frequency, with F-measures of 94.5%, 93.9%, and 96.0% respectively. We then applied MedEx unchanged to outpatient clinic visit notes. It performed similarly with F-measures over 90% on a set of 25 clinic visit notes. PMID:20064797
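To illustrate the kind of signature information involved (drug name, strength, route, frequency), the sketch below uses a single regular expression over a toy note. This is not MedEx itself, whose lexicon and grammar are far richer; the patterns, abbreviations and example sentence are illustrative only.

```python
import re

# Illustrative patterns only; a production system needs a drug lexicon and
# far broader coverage of dose forms, routes and frequency expressions.
SIG = re.compile(
    r"(?P<drug>[A-Za-z]+)\s+"
    r"(?P<strength>\d+(?:\.\d+)?\s?(?:mg|mcg|g|units))\s+"
    r"(?P<route>po|iv|im|subcut|topical)\s+"
    r"(?P<freq>daily|bid|tid|qid|q\d+h|prn)",
    re.IGNORECASE,
)

note = "Continue lisinopril 10 mg po daily and metformin 500 mg po bid."
for m in SIG.finditer(note):
    print(m.groupdict())
```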
Isolating contour information from arbitrary images
NASA Technical Reports Server (NTRS)
Jobson, Daniel J.
1989-01-01
Aspects of natural vision (physiological and perceptual) serve as a basis for attempting the development of a general processing scheme for contour extraction. Contour information is assumed to be central to visual recognition skills. While the scheme must be regarded as highly preliminary, initial results do compare favorably with the visual perception of structure. The scheme pays special attention to the construction of a smallest scale circular difference-of-Gaussian (DOG) convolution, calibration of multiscale edge detection thresholds with the visual perception of grayscale boundaries, and contour/texture discrimination methods derived from fundamental assumptions of connectivity and the characteristics of printed text. Contour information is required to fall between a minimum connectivity limit and a maximum regional spatial density limit at each scale. Results support the idea that contour information, in images possessing good image quality, is carried by the higher spatial frequency channels (centered at about 10 cyc/deg and 30 cyc/deg). Further, lower spatial frequency channels appear to play a major role only in contour extraction from images with serious global image defects.
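A loose sketch of this scheme is given below: a circular difference-of-Gaussian band-pass followed by a minimum-connectivity test and a crude regional-density limit. The sigma ratio, thresholds and density criterion are illustrative stand-ins for the calibrated values described in the abstract, and the input is assumed to be a grayscale uint8 image.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, label

def dog_contours(image: np.ndarray, sigma: float = 1.0, k: float = 1.6,
                 thresh: float = 0.02, min_pixels: int = 20) -> np.ndarray:
    """Band-pass the image with a circular DOG, then keep only responses that
    pass a connectivity limit and fall below a regional density limit."""
    img = image.astype(float) / 255.0
    dog = gaussian_filter(img, sigma) - gaussian_filter(img, k * sigma)
    edges = np.abs(dog) > thresh

    # Connectivity limit: drop connected components smaller than min_pixels.
    labels, _ = label(edges)
    sizes = np.bincount(labels.ravel())
    keep = sizes >= min_pixels
    keep[0] = False
    edges = keep[labels]

    # Density limit: discard regions where edge pixels dominate locally,
    # a rough stand-in for the contour/texture discrimination step.
    density = gaussian_filter(edges.astype(float), 4 * sigma)
    return edges & (density < 0.5)
```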
Extracting Objects for Aerial Manipulation on UAVs Using Low Cost Stereo Sensors
Ramon Soria, Pablo; Bevec, Robert; Arrue, Begoña C.; Ude, Aleš; Ollero, Aníbal
2016-01-01
Giving unmanned aerial vehicles (UAVs) the possibility to manipulate objects vastly extends the range of possible applications. This applies to rotary wing UAVs in particular, where their capability of hovering enables a suitable position for in-flight manipulation. Their manipulation skills must be suitable for primarily natural, partially known environments, where UAVs mostly operate. We have developed an on-board object extraction method that calculates information necessary for autonomous grasping of objects, without the need to provide the model of the object’s shape. A local map of the work-zone is generated using depth information, where object candidates are extracted by detecting areas different to our floor model. Their image projections are then evaluated using support vector machine (SVM) classification to recognize specific objects or reject bad candidates. Our method builds a sparse cloud representation of each object and calculates the object’s centroid and the dominant axis. This information is then passed to a grasping module. Our method works under the assumption that objects are static and not clustered, have visual features and the floor shape of the work-zone area is known. We used low cost cameras for creating depth information that cause noisy point clouds, but our method has proved robust enough to process this data and return accurate results. PMID:27187413
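The two quantities handed to the grasping module, the object's centroid and dominant axis, can be computed from the sparse point cloud as sketched below. This PCA-style computation is one plausible implementation, not necessarily the authors' exact method, and the toy cloud is generated rather than taken from sensor data.

```python
import numpy as np

def grasp_geometry(points: np.ndarray):
    """Centroid and dominant axis of an object's sparse point cloud (N x 3)."""
    centroid = points.mean(axis=0)
    centred = points - centroid
    # Dominant axis = direction of largest variance (first right singular vector).
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    dominant_axis = vt[0] / np.linalg.norm(vt[0])
    return centroid, dominant_axis

# Toy elongated cloud roughly aligned with the x axis.
rng = np.random.default_rng(0)
cloud = rng.normal(0, [0.2, 0.02, 0.02], size=(200, 3)) + [1.0, 0.5, 0.3]
c, axis = grasp_geometry(cloud)
print("centroid:", c, "axis:", axis)
```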
The ISES: A non-intrusive medium for in-space experiments in on-board information extraction
NASA Technical Reports Server (NTRS)
Murray, Nicholas D.; Katzberg, Stephen J.; Nealy, Mike
1990-01-01
The Information Science Experiment System (ISES) represents a new approach in applying advanced systems technology and techniques to on-board information extraction in the space environment. Basically, what is proposed is a 'black box' attached to the spacecraft data bus or local area network. To the spacecraft, the 'black box' appears to be just another payload requiring power, heat rejection, interfaces, adding weight, and requiring time on the data management and communication system. In reality, the 'black box' is a programmable computational resource which eavesdrops on the data network, taking and producing selectable, real-time science data back on the network. This paper will present a brief overview of the ISES concept and will discuss issues related to applying the ISES to the polar platform and Space Station Freedom. Critical to the operation of ISES is the viability of a payload-like interface to the spacecraft data bus or local area network. Study results that address this question will be reviewed vis-a-vis the polar platform and the core space station. Also, initial results of processing science and other requirements for onboard, real-time information extraction will be presented with particular emphasis on the polar platform. Opportunities for a broader range of applications on the core space station will also be discussed.
Shimazaki, Kei-ichi; Kushida, Tatsuya
2010-06-01
Lactoferrin is a multi-functional metal-binding glycoprotein that exhibits many biological functions of interest to many researchers from the fields of clinical medicine, dentistry, pharmacology, veterinary medicine, nutrition and milk science. To date, a number of academic reports concerning the biological activities of lactoferrin have been published and are easily accessible through public data repositories. However, as the literature is expanding daily, this presents challenges in understanding the larger picture of lactoferrin function and mechanisms. In order to overcome the "analysis paralysis" associated with lactoferrin information, we attempted to apply a text mining method to the accumulated lactoferrin literature. To this end, we used the information extraction system GENPAC (provided by Nalapro Technologies Inc., Tokyo). This information extraction system uses natural language processing and text mining technology. This system analyzes the sentences and titles from abstracts stored in the PubMed database, and can automatically extract binary relations that consist of interactions between genes/proteins, chemicals and diseases/functions. We expect that such information visualization analysis will be useful in determining novel relationships among a multitude of lactoferrin functions and mechanisms. We have demonstrated the utilization of this method to find pathways of lactoferrin participation in neovascularization, Helicobacter pylori attack on gastric mucosa, atopic dermatitis and lipid metabolism.
Review of Extracting Information From the Social Web for Health Personalization
Karlsen, Randi; Bonander, Jason
2011-01-01
In recent years the Web has come into its own as a social platform where health consumers are actively creating and consuming Web content. Moreover, as the Web matures, consumers are gaining access to personalized applications adapted to their health needs and interests. The creation of personalized Web applications relies on extracted information about the users and the content to personalize. The Social Web itself provides many sources of information that can be used to extract information for personalization apart from traditional Web forms and questionnaires. This paper provides a review of different approaches for extracting information from the Social Web for health personalization. We reviewed research literature across different fields addressing the disclosure of health information in the Social Web, techniques to extract that information, and examples of personalized health applications. In addition, the paper includes a discussion of technical and socioethical challenges related to the extraction of information for health personalization. PMID:21278049
Background information for the Leaching Environmental Assessment Framework (LEAF) test methods
The U.S. Environmental Protection Agency Office of Resource Conservation and Recovery has initiated the review and validation process for four leaching tests under consideration for inclusion into SW-846: Method 1313 "Liquid-Solid Partitioning as a Function of Extract pH for Co...
Lexical and Sublexical Semantic Preview Benefits in Chinese Reading
ERIC Educational Resources Information Center
Yan, Ming; Zhou, Wei; Shu, Hua; Kliegl, Reinhold
2012-01-01
Semantic processing from parafoveal words is an elusive phenomenon in alphabetic languages, but it has been demonstrated only for a restricted set of noncompound Chinese characters. Using the gaze-contingent boundary paradigm, this experiment examined whether parafoveal lexical and sublexical semantic information was extracted from compound…
Infrared Spectroscopic Imaging: The Next Generation
Bhargava, Rohit
2013-01-01
Infrared (IR) spectroscopic imaging seemingly matured as a technology in the mid-2000s, with commercially successful instrumentation and reports in numerous applications. Recent developments, however, have transformed our understanding of the recorded data, provided capability for new instrumentation, and greatly enhanced the ability to extract more useful information in less time. These developments are summarized here in three broad areas— data recording, interpretation of recorded data, and information extraction—and their critical review is employed to project emerging trends. Overall, the convergence of selected components from hardware, theory, algorithms, and applications is one trend. Instead of similar, general-purpose instrumentation, another trend is likely to be diverse and application-targeted designs of instrumentation driven by emerging component technologies. The recent renaissance in both fundamental science and instrumentation will likely spur investigations at the confluence of conventional spectroscopic analyses and optical physics for improved data interpretation. While chemometrics has dominated data processing, a trend will likely lie in the development of signal processing algorithms to optimally extract spectral and spatial information prior to conventional chemometric analyses. Finally, the sum of these recent advances is likely to provide unprecedented capability in measurement and scientific insight, which will present new opportunities for the applied spectroscopist. PMID:23031693
Topological properties of flat electroencephalography's state space
NASA Astrophysics Data System (ADS)
Ken, Tan Lit; Ahmad, Tahir bin; Mohd, Mohd Sham bin; Ngien, Su Kong; Suwa, Tohru; Meng, Ong Sie
2016-02-01
Neuroinverse problems are often associated with complex neuronal activity and involve locating problematic cells, which is highly challenging. While epileptic foci localization is possible with the aid of EEG signals, it relies greatly on the ability to extract hidden information or patterns within EEG signals. Flat EEG, an enhancement of EEG, is a way of viewing the electroencephalograph on the real plane. From the perspective of dynamical systems, Flat EEG is equivalent to an epileptic seizure, making it a useful platform for studying epileptic seizures. Throughout the years, various mathematical tools have been applied to Flat EEG to extract hidden information that is hardly noticeable by traditional visual inspection. While these tools have given worthwhile results, a complete understanding of the seizure process has yet to be achieved. Since the underlying structure of Flat EEG is dynamic and is deemed to contain rich information regarding the brainstorm, it is appealing to explore its structures in depth. To better understand the complex seizure process, this paper studies the event of epileptic seizure via Flat EEG in a more general framework by means of topology, particularly on the state space where the Flat EEG event lies.
Extraction of stability and control derivatives from orbiter flight data
NASA Technical Reports Server (NTRS)
Iliff, Kenneth W.; Shafer, Mary F.
1993-01-01
The Space Shuttle Orbiter has provided unique and important information on aircraft flight dynamics. This information has provided the opportunity to assess the flight-derived stability and control derivatives for maneuvering flight in the hypersonic regime. In the case of the Space Shuttle Orbiter, these derivatives are required to determine if certain configuration placards (limitations on the flight envelope) can be modified. These placards were determined on the basis of preflight predictions and the associated uncertainties. As flight-determined derivatives are obtained, the placards are reassessed, and some of them are removed or modified. Extraction of the stability and control derivatives was justified by operational considerations and not by research considerations. Using flight results to update the predicted database of the orbiter is one of the most completely documented processes for a flight vehicle. This process followed from the requirement for analysis of flight data for control system updates and for expansion of the operational flight envelope. These results show significant changes in many important stability and control derivatives from the preflight database. This paper presents some of the stability and control derivative results obtained from Space Shuttle flights. Some of the limitations of this information are also examined.
Saini, Ramesh Kumar; Moon, So Hyun; Keum, Young-Soo
2018-06-01
Globally, the amount of food processing waste has become a major concern for environmental sustainability. The valorization of these waste materials can solve the problems of their disposal. Notably, tomato pomace and crustacean processing waste present enormous opportunities for the extraction of the commercially vital carotenoids lycopene and astaxanthin, which have diverse applications in the food, feed, pharmaceutical, and cosmetic industries. Moreover, such waste can generate surplus revenue, which can significantly improve the economics of food production and processing. Considering these aspects, many reports have been published on the efficient use of tomato and crustacean processing waste to recover lycopene and astaxanthin. The current review provides up-to-date information on the chemistry of lycopene and astaxanthin and on their extraction methods, which use environmentally friendly green solvents to minimize the impact of toxic chemical solvents on health and the environment. Future research challenges in this context are also identified. Copyright © 2018 Elsevier Ltd. All rights reserved.
CDC WONDER: a cooperative processing architecture for public health.
Friede, A; Rosen, D H; Reid, J A
1994-01-01
CDC WONDER is an information management architecture designed for public health. It provides access to information and communications without the user's needing to know the location of data or communication pathways and mechanisms. CDC WONDER users have access to extractions from some 40 databases; electronic mail (e-mail); and surveillance data processing. System components include the Remote Client, the Communications Server, the Queue Managers, and Data Servers and Process Servers. The Remote Client software resides in the user's machine; other components are at the Centers for Disease Control and Prevention (CDC). The Remote Client, the Communications Server, and the Applications Server provide access to the information and functions in the Data Servers and Process Servers. The system architecture is based on cooperative processing, and components are coupled via pure message passing, using several protocols. This architecture allows flexibility in the choice of hardware and software. One system limitation is that final results from some subsystems are obtained slowly. Although designed for public health, CDC WONDER could be useful for other disciplines that need flexible, integrated information exchange. PMID:7719813
Volumetric Forest Change Detection Through Vhr Satellite Imagery
NASA Astrophysics Data System (ADS)
Akca, Devrim; Stylianidis, Efstratios; Smagas, Konstantinos; Hofer, Martin; Poli, Daniela; Gruen, Armin; Sanchez Martin, Victor; Altan, Orhan; Walli, Andreas; Jimeno, Elisa; Garcia, Alejandro
2016-06-01
Quick and economical ways of detecting planimetric and volumetric changes of forest areas are in high demand. A research platform, called FORSAT (A satellite processing platform for high resolution forest assessment), was developed for the extraction of 3D geometric information from VHR (very-high resolution) imagery from satellite optical sensors and for automatic change detection. This 3D forest information solution was developed during a Eurostars project. FORSAT includes two main units. The first one is dedicated to the geometric and radiometric processing of satellite optical imagery and 2D/3D information extraction. This includes: image radiometric pre-processing, image and ground point measurement, improvement of geometric sensor orientation, quasi-epipolar image generation for stereo measurements, digital surface model (DSM) extraction by using a precise and robust image matching approach specially designed for VHR satellite imagery, generation of orthoimages, and 3D measurements in single images using mono-plotting and in stereo images as well as triplets. FORSAT supports most of the VHR optical imagery commonly used for civil applications: IKONOS, OrbView-3, SPOT-5 HRS, SPOT-5 HRG, QuickBird, GeoEye-1, WorldView-1/2, Pléiades 1A/1B, SPOT 6/7, and sensors of similar type to be expected in the future. The second unit of FORSAT is dedicated to 3D surface comparison for change detection. It allows users to import digital elevation models (DEMs), align them using an advanced 3D surface matching approach and calculate the 3D differences and volume changes between epochs. To this end our 3D surface matching method LS3D is used. FORSAT is a single-source and flexible forest information solution with a very competitive price/quality ratio, allowing expert and non-expert remote sensing users to monitor forests in three and four dimensions from VHR optical imagery for many forest information needs. The capacity and benefits of FORSAT have been tested in six case studies located in Austria, Cyprus, Spain, Switzerland and Turkey, using optical data from different sensors and with the purpose of monitoring forests with different geometric characteristics. The validation run on the Cyprus dataset is reported and commented on.
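As an illustration of the second unit's surface-comparison step, the sketch below computes per-cell height differences and a net volume change between two co-registered DSM epochs; the array names, grid assumptions, and noise threshold are hypothetical stand-ins and do not reproduce FORSAT's LS3D-based workflow.

```python
import numpy as np

def volumetric_change(dsm_t1, dsm_t2, cell_size, min_dz=0.5):
    """Per-cell height difference and net volume change between two epochs.

    dsm_t1, dsm_t2 : co-registered DSM arrays on the same grid (metres)
    cell_size      : ground sampling distance of one cell (metres)
    min_dz         : differences smaller than this are treated as noise
    """
    dz = dsm_t2 - dsm_t1
    significant = np.abs(dz) > min_dz
    volume = np.nansum(np.where(significant, dz, 0.0)) * cell_size ** 2
    return dz, volume   # volume in cubic metres, losses negative
```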
Ni, Yan; Su, Mingming; Qiu, Yunping; Jia, Wei
2017-01-01
ADAP-GC is an automated computational pipeline for untargeted, GC-MS-based metabolomics studies. It takes raw mass spectrometry data as input and carries out a sequence of data processing steps including construction of extracted ion chromatograms, detection of chromatographic peak features, deconvolution of co-eluting compounds, and alignment of compounds across samples. Despite the increased accuracy from the original version to version 2.0 in terms of extracting metabolite information for identification and quantitation, ADAP-GC 2.0 requires appropriate specification of a number of parameters and has difficulty in extracting information on compounds that are in low concentration. To overcome these two limitations, ADAP-GC 3.0 was developed to improve both the robustness and sensitivity of compound detection. In this paper, we report how these goals were achieved and compare ADAP-GC 3.0 against three other software tools including ChromaTOF, AnalyzerPro, and AMDIS that are widely used in the metabolomics community. PMID:27461032
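To make the first two pipeline steps concrete, the following minimal Python sketch builds an extracted ion chromatogram (EIC) from centroided scans and picks candidate chromatographic peaks; the `scans` structure, m/z tolerance, and intensity threshold are illustrative assumptions and do not reflect ADAP-GC's actual data model or algorithms.

```python
import numpy as np
from scipy.signal import find_peaks

def extract_eic(scans, target_mz, tol=0.5):
    """Build an EIC: sum intensities within an m/z window for each scan.

    scans : iterable of (retention_time, mz_array, intensity_array) tuples (hypothetical format)
    """
    rts, intens = [], []
    for rt, mz, inten in scans:
        sel = np.abs(mz - target_mz) <= tol
        rts.append(rt)
        intens.append(inten[sel].sum())
    return np.asarray(rts), np.asarray(intens)

def pick_peaks(rts, intens, min_height=1e4):
    """Locate candidate chromatographic peak apexes in the EIC."""
    idx, _ = find_peaks(intens, height=min_height)
    return rts[idx], intens[idx]
```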
Ni, Yan; Su, Mingming; Qiu, Yunping; Jia, Wei; Du, Xiuxia
2016-09-06
ADAP-GC is an automated computational pipeline for untargeted, GC/MS-based metabolomics studies. It takes raw mass spectrometry data as input and carries out a sequence of data processing steps including construction of extracted ion chromatograms, detection of chromatographic peak features, deconvolution of coeluting compounds, and alignment of compounds across samples. Despite the increased accuracy from the original version to version 2.0 in terms of extracting metabolite information for identification and quantitation, ADAP-GC 2.0 requires appropriate specification of a number of parameters and has difficulty in extracting information on compounds that are in low concentration. To overcome these two limitations, ADAP-GC 3.0 was developed to improve both the robustness and sensitivity of compound detection. In this paper, we report how these goals were achieved and compare ADAP-GC 3.0 against three other software tools including ChromaTOF, AnalyzerPro, and AMDIS that are widely used in the metabolomics community.
Automated Extraction of Substance Use Information from Clinical Texts.
Wang, Yan; Chen, Elizabeth S; Pakhomov, Serguei; Arsoniadis, Elliot; Carter, Elizabeth W; Lindemann, Elizabeth; Sarkar, Indra Neil; Melton, Genevieve B
2015-01-01
Within clinical discourse, social history (SH) includes important information about substance use (alcohol, drug, and nicotine use) as key risk factors for disease, disability, and mortality. In this study, we developed and evaluated a natural language processing (NLP) system for automated detection of substance use statements and extraction of substance use attributes (e.g., temporal and status) based on Stanford Typed Dependencies. The developed NLP system leveraged linguistic resources and domain knowledge from a multi-site social history study, Propbank and the MiPACQ corpus. The system attained F-scores of 89.8, 84.6 and 89.4 respectively for alcohol, drug, and nicotine use statement detection, as well as average F-scores of 82.1, 90.3, 80.8, 88.7, 96.6, and 74.5 respectively for extraction of attributes. Our results suggest that NLP systems can achieve good performance when augmented with linguistic resources and domain knowledge when applied to a wide breadth of substance use free text clinical notes.
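As a rough illustration of dependency-based attribute extraction, the sketch below uses spaCy dependencies as a stand-in for the Stanford Typed Dependencies used in the study; the substance lexicon and status cues are invented placeholders, not the paper's resources, and the small English model must be installed.

```python
import spacy

nlp = spacy.load("en_core_web_sm")  # assumes the model has been downloaded
SUBSTANCES = {"alcohol", "tobacco", "nicotine", "cocaine", "marijuana"}
STATUS_CUES = {"deny": "negative", "quit": "past", "drink": "current", "smoke": "current"}

def substance_mentions(text):
    """Detect substance mentions and infer a status attribute from governing verbs."""
    doc = nlp(text)
    found = []
    for tok in doc:
        if tok.lemma_.lower() in SUBSTANCES:
            status = "unknown"
            for anc in tok.ancestors:          # walk up the dependency tree
                if anc.lemma_.lower() in STATUS_CUES:
                    status = STATUS_CUES[anc.lemma_.lower()]
                    break
            found.append({"substance": tok.text, "status": status})
    return found

print(substance_mentions("Patient denies alcohol use but smokes tobacco daily."))
```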
Jackson MSc, Richard G.; Ball, Michael; Patel, Rashmi; Hayes, Richard D.; Dobson, Richard J.B.; Stewart, Robert
2014-01-01
Observational research using data from electronic health records (EHR) is a rapidly growing area, which promises both increased sample size and data richness, and therefore unprecedented study power. However, in many medical domains, large amounts of potentially valuable data are contained within the free text clinical narrative. Manually reviewing free text to obtain desired information is an inefficient use of researcher time and skill. Previous work has demonstrated the feasibility of applying Natural Language Processing (NLP) to extract information. However, in real world research environments, the demand for NLP skills outweighs supply, creating a bottleneck in the secondary exploitation of the EHR. To address this, we present TextHunter, a tool for the creation of training data, construction of concept extraction machine learning models and their application to documents. Using confidence thresholds to ensure high precision (>90%), we achieved recall measurements as high as 99% in real world use cases. PMID:25954379
Aprea, Eugenio; Gika, Helen; Carlin, Silvia; Theodoridis, Georgios; Vrhovsek, Urska; Mattivi, Fulvio
2011-07-15
A headspace SPME GC-TOF-MS method was developed for the acquisition of metabolite profiles of apple volatiles. As a first step, an experimental design was applied to find out the most appropriate conditions for the extraction of apple volatile compounds by SPME. The selected SPME method was applied in profiling of four different apple varieties by GC-EI-TOF-MS. Full scan GC-MS data were processed by MarkerLynx software for peak picking, normalisation, alignment and feature extraction. Advanced chemometric/statistical techniques (PCA and PLS-DA) were used to explore data and extract useful information. Characteristic markers of each variety were successively identified using the NIST library thus providing useful information for variety classification. The developed HS-SPME sampling method is fully automated and proved useful in obtaining the fingerprint of the volatile content of the fruit. The described analytical protocol can aid in further studies of the apple metabolome. Copyright © 2011 Elsevier B.V. All rights reserved.
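For orientation, a minimal scikit-learn sketch of the exploratory step is given below: an aligned peak table (samples x volatile features) is autoscaled and projected with PCA, and a scores plot coloured by variety would reveal clustering; the data are random stand-ins, not MarkerLynx output, and PLS-DA would follow the same pattern with class labels.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
peak_table = rng.lognormal(size=(24, 150))        # 24 samples x 150 aligned features (stand-in)
variety = np.repeat(["A", "B", "C", "D"], 6)      # class labels for colouring / PLS-DA

scores = PCA(n_components=2).fit_transform(StandardScaler().fit_transform(peak_table))
print(scores.shape, variety[:6])                  # (24, 2) score matrix for plotting
```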
Dynamic information processing states revealed through neurocognitive models of object semantics
Clarke, Alex
2015-01-01
Recognising objects relies on highly dynamic, interactive brain networks to process multiple aspects of object information. To fully understand how different forms of information about objects are represented and processed in the brain requires a neurocognitive account of visual object recognition that combines a detailed cognitive model of semantic knowledge with a neurobiological model of visual object processing. Here we ask how specific cognitive factors are instantiated in our mental processes and how they dynamically evolve over time. We suggest that coarse semantic information, based on generic shared semantic knowledge, is rapidly extracted from visual inputs and is sufficient to drive rapid category decisions. Subsequent recurrent neural activity between the anterior temporal lobe and posterior fusiform supports the formation of object-specific semantic representations – a conjunctive process primarily driven by the perirhinal cortex. These object-specific representations require the integration of shared and distinguishing object properties and support the unique recognition of objects. We conclude that a valuable way of understanding the cognitive activity of the brain is through testing the relationship between specific cognitive measures and dynamic neural activity. This kind of approach allows us to move towards uncovering the information processing states of the brain and how they evolve over time. PMID:25745632
NASA Astrophysics Data System (ADS)
Chen, Lei; Li, Dehua; Yang, Jie
2007-12-01
Constructing a virtual international strategy environment needs many kinds of information, such as economics, politics, military affairs, diplomacy, culture, science, etc. It is therefore very important to build a highly efficient system for automatic information extraction, classification, recombination and analysis as the foundation and a component of the military strategy hall. This paper first uses an improved Boost algorithm to classify the obtained initial information, then uses a strategy-intelligence extraction algorithm to extract strategic intelligence from the initial information to help strategists analyze it.
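As a hedged illustration of the first (boosted classification) stage, the sketch below trains an AdaBoost classifier over TF-IDF features; the category labels and training snippets are invented placeholders, and the paper's improved Boost algorithm is not reproduced here.

```python
from sklearn.ensemble import AdaBoostClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

# Hypothetical training snippets and categories for illustration only.
docs = ["quarterly GDP growth and trade balance figures",
        "naval exercise and troop deployment reported",
        "new semiconductor fabrication breakthrough announced",
        "bilateral summit and treaty negotiations continue"]
labels = ["economy", "military", "science", "diplomacy"]

clf = make_pipeline(TfidfVectorizer(), AdaBoostClassifier(n_estimators=100))
clf.fit(docs, labels)
print(clf.predict(["joint military drill near the border"]))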
A Multi-Disciplinary Approach to Remote Sensing through Low-Cost UAVs.
Calvario, Gabriela; Sierra, Basilio; Alarcón, Teresa E; Hernandez, Carmen; Dalmau, Oscar
2017-06-16
The use of Unmanned Aerial Vehicles (UAVs) based on remote sensing has generated low cost monitoring, since the data can be acquired quickly and easily. This paper reports the experience related to agave crop analysis with a low cost UAV. The data were processed by traditional photogrammetric flow and data extraction techniques were applied to extract new layers and separate the agave plants from weeds and other elements of the environment. Our proposal combines elements of photogrammetry, computer vision, data mining, geomatics and computer science. This fusion leads to very interesting results in agave control. This paper aims to demonstrate the potential of UAV monitoring in agave crops and the importance of information processing with reliable data flow.
A Multi-Disciplinary Approach to Remote Sensing through Low-Cost UAVs
Calvario, Gabriela; Sierra, Basilio; Alarcón, Teresa E.; Hernandez, Carmen; Dalmau, Oscar
2017-01-01
The use of Unmanned Aerial Vehicles (UAVs) based on remote sensing has generated low cost monitoring, since the data can be acquired quickly and easily. This paper reports the experience related to agave crop analysis with a low cost UAV. The data were processed by traditional photogrammetric flow and data extraction techniques were applied to extract new layers and separate the agave plants from weeds and other elements of the environment. Our proposal combines elements of photogrammetry, computer vision, data mining, geomatics and computer science. This fusion leads to very interesting results in agave control. This paper aims to demonstrate the potential of UAV monitoring in agave crops and the importance of information processing with reliable data flow. PMID:28621740
High-Resolution Remote Sensing Image Building Extraction Based on Markov Model
NASA Astrophysics Data System (ADS)
Zhao, W.; Yan, L.; Chang, Y.; Gong, L.
2018-04-01
As resolution increases, remote sensing images carry a greater information load, more noise, and more complex feature geometry and texture, which makes the extraction of building information more difficult. To solve this problem, this paper designs a building extraction method for high-resolution remote sensing images based on a Markov model. The method introduces Contourlet-domain map clustering and a Markov model, captures and enhances the contour and texture information of high-resolution remote sensing image features in multiple directions, and further designs a spectral feature index that can characterize "pseudo-buildings" in the building area. Through multi-scale segmentation and extraction of image features, fine extraction from the building area down to the individual building is realized. Experiments show that this method can suppress the noise of high-resolution remote sensing images, reduce the interference of non-target ground texture information, and remove shadow, vegetation and other pseudo-building information; compared with traditional pixel-level image information extraction, it achieves better building extraction precision, accuracy and completeness.
Mercury Phase II Study - Mercury Behavior across the High-Level Waste Evaporator System
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bannochie, C. J.; Crawford, C. L.; Jackson, D. G.
2016-06-17
The Mercury Program team’s effort continues to develop more fundamental information concerning mercury behavior across the liquid waste facilities and unit operations. Previously, the team examined the mercury chemistry across salt processing, including the Actinide Removal Process/Modular Caustic Side Solvent Extraction Unit (ARP/MCU), and the Defense Waste Processing Facility (DWPF) flowsheets. This report documents the data and understanding of mercury across the high level waste 2H and 3H evaporator systems.
Using automatically extracted information from mammography reports for decision-support
Bozkurt, Selen; Gimenez, Francisco; Burnside, Elizabeth S.; Gulkesen, Kemal H.; Rubin, Daniel L.
2016-01-01
Objective To evaluate a system we developed that connects natural language processing (NLP) for information extraction from narrative text mammography reports with a Bayesian network for decision-support about breast cancer diagnosis. The ultimate goal of this system is to provide decision support as part of the workflow of producing the radiology report. Materials and methods We built a system that uses an NLP information extraction system (which extract BI-RADS descriptors and clinical information from mammography reports) to provide the necessary inputs to a Bayesian network (BN) decision support system (DSS) that estimates lesion malignancy from BI-RADS descriptors. We used this integrated system to predict diagnosis of breast cancer from radiology text reports and evaluated it with a reference standard of 300 mammography reports. We collected two different outputs from the DSS: (1) the probability of malignancy and (2) the BI-RADS final assessment category. Since NLP may produce imperfect inputs to the DSS, we compared the difference between using perfect (“reference standard”) structured inputs to the DSS (“RS-DSS”) vs NLP-derived inputs (“NLP-DSS”) on the output of the DSS using the concordance correlation coefficient. We measured the classification accuracy of the BI-RADS final assessment category when using NLP-DSS, compared with the ground truth category established by the radiologist. Results The NLP-DSS and RS-DSS had closely matched probabilities, with a mean paired difference of 0.004 ± 0.025. The concordance correlation of these paired measures was 0.95. The accuracy of the NLP-DSS to predict the correct BI-RADS final assessment category was 97.58%. Conclusion The accuracy of the information extracted from mammography reports using the NLP system was sufficient to provide accurate DSS results. We believe our system could ultimately reduce the variation in practice in mammography related to assessment of malignant lesions and improve management decisions. PMID:27388877
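For reference, the concordance correlation coefficient used to compare the paired probability outputs can be computed as in the short sketch below; the example probabilities are hypothetical, not the study data.

```python
import numpy as np

def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient between two sets of paired estimates
    (e.g. NLP-derived vs. reference-standard DSS probabilities)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()                 # population variances
    cov = np.mean((x - mx) * (y - my))        # population covariance
    return 2 * cov / (vx + vy + (mx - my) ** 2)

# hypothetical paired malignancy probabilities from the two pipelines
p_rs  = [0.12, 0.55, 0.80, 0.33, 0.91]
p_nlp = [0.10, 0.57, 0.78, 0.35, 0.90]
print(round(concordance_correlation(p_rs, p_nlp), 3))
```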
Automatic Extraction of Destinations, Origins and Route Parts from Human Generated Route Directions
NASA Astrophysics Data System (ADS)
Zhang, Xiao; Mitra, Prasenjit; Klippel, Alexander; Maceachren, Alan
Researchers from the cognitive and spatial sciences are studying text descriptions of movement patterns in order to examine how humans communicate and understand spatial information. In particular, route directions offer a rich source of information on how cognitive systems conceptualize movement patterns by segmenting them into meaningful parts. Route directions are composed using a plethora of cognitive spatial organization principles: changing levels of granularity, hierarchical organization, incorporation of cognitively and perceptually salient elements, and so forth. Identifying such information in text documents automatically is crucial for enabling machine-understanding of human spatial language. The benefits are: a) creating opportunities for large-scale studies of human linguistic behavior; b) extracting and georeferencing salient entities (landmarks) that are used by human route direction providers; c) developing methods to translate route directions to sketches and maps; and d) enabling queries on large corpora of crawled/analyzed movement data. In this paper, we introduce our approach and implementations that bring us closer to the goal of automatically processing linguistic route directions. We report on research directed at one part of the larger problem, that is, extracting the three most critical parts of route directions and movement patterns in general: origin, destination, and route parts. We use machine-learning based algorithms to extract these parts of routes, including, for example, destination names and types. We prove the effectiveness of our approach in several experiments using hand-tagged corpora.
Generalized parton distributions from deep virtual compton scattering at CLAS
Guidal, M.
2010-04-24
Here, we have analyzed the beam spin asymmetry and the longitudinally polarized target spin asymmetry of the Deep Virtual Compton Scattering process, recently measured by the Jefferson Lab CLAS collaboration. Our aim is to extract information about the Generalized Parton Distributions of the proton. By fitting these data, in a largely model-independent procedure, we are able to extract numerical values for the two Compton Form Factors $H_{Im}$ and $\tilde{H}_{Im}$ with uncertainties, on average, of the order of 30%.
Procedure for extraction of disparate data from maps into computerized data bases
NASA Technical Reports Server (NTRS)
Junkin, B. G.
1979-01-01
A procedure is presented for extracting disparate sources of data from geographic maps and for the conversion of these data into a suitable format for processing on a computer-oriented information system. Several graphic digitizing considerations are included and related to the NASA Earth Resources Laboratory's Digitizer System. Current operating procedures for the Digitizer System are given in a simplified and logical manner. The report serves as a guide to those organizations interested in converting map-based data by using a comparable map digitizing system.
The Evolution of Holistic Processing of Faces
Burke, Darren; Sulikowski, Danielle
2013-01-01
In this paper we examine the holistic processing of faces from an evolutionary perspective, clarifying what such an approach entails, and evaluating the extent to which the evidence currently available permits any strong conclusions. While it seems clear that the holistic processing of faces depends on mechanisms evolved to perform that task, our review of the comparative literature reveals that there is currently insufficient evidence (or sometimes insufficiently compelling evidence) to decide when in our evolutionary past such processing may have arisen. It is also difficult to assess what kinds of selection pressures may have led to evolution of such a mechanism, or even what kinds of information holistic processing may have originally evolved to extract, given that many sources of socially relevant face-based information other than identity depend on integrating information across different regions of the face – judgments of expression, behavioral intent, attractiveness, sex, age, etc. We suggest some directions for future research that would help to answer these important questions. PMID:23382721
Context-based automated defect classification system using multiple morphological masks
Gleason, Shaun S.; Hunt, Martin A.; Sari-Sarraf, Hamed
2002-01-01
Automatic detection of defects during the fabrication of semiconductor wafers is largely automated, but the classification of those defects is still performed manually by technicians. This invention includes novel digital image analysis techniques that generate unique feature vector descriptions of semiconductor defects as well as classifiers that use these descriptions to automatically categorize the defects into one of a set of pre-defined classes. Feature extraction techniques based on multiple-focus images, multiple-defect mask images, and segmented semiconductor wafer images are used to create unique feature-based descriptions of the semiconductor defects. These feature-based defect descriptions are subsequently classified by a defect classifier into categories that depend on defect characteristics and defect contextual information, that is, the semiconductor process layer(s) with which the defect comes in contact. At the heart of the system is a knowledge database that stores and distributes historical semiconductor wafer and defect data to guide the feature extraction and classification processes. In summary, this invention takes as its input a set of images containing semiconductor defect information, and generates as its output a classification for the defect that describes not only the defect itself, but also the location of that defect with respect to the semiconductor process layers.
Non-Discretionary Access Control for Decentralized Computing Systems
1977-05-01
Semaphores are inherently read-write objects to all users. Reed and Kanodia <Reed77> propose a scheme for process synchronization using... intruder could not follow the rapid exchange of messages and would be unable to extract information. Farber and Larsen describe synchronization and...
Extraction of quantitative surface characteristics from AIRSAR data for Death Valley, California
NASA Technical Reports Server (NTRS)
Kierein-Young, K. S.; Kruse, F. A.
1992-01-01
Polarimetric Airborne Synthetic Aperture Radar (AIRSAR) data were collected for the Geologic Remote Sensing Field Experiment (GRSFE) over Death Valley, California, USA, in Sep. 1989. AIRSAR is a four-look, quad-polarization, three frequency instrument. It collects measurements at C-band (5.66 cm), L-band (23.98 cm), and P-band (68.13 cm), and has a GIFOV of 10 meters and a swath width of 12 kilometers. Because the radar measures at three wavelengths, different scales of surface roughness are measured. Also, dielectric constants can be calculated from the data. The AIRSAR data were calibrated using in-scene trihedral corner reflectors to remove cross-talk; and to calibrate the phase, amplitude, and co-channel gain imbalance. The calibration allows for the extraction of accurate values of rms surface roughness, dielectric constants, sigma(sub 0) backscatter, and polarization information. The radar data sets allow quantitative characterization of small scale surface structure of geologic units, providing information about the physical and chemical processes that control the surface morphology. Combining the quantitative information extracted from the radar data with other remotely sensed data sets allows discrimination, identification and mapping of geologic units that may be difficult to discern using conventional techniques.
A semantic model for multimodal data mining in healthcare information systems.
Iakovidis, Dimitris; Smailis, Christos
2012-01-01
Electronic health records (EHRs) are representative examples of multimodal/multisource data collections, including measurements, images and free texts. The diversity of such information sources and the increasing amounts of medical data produced by healthcare institutes annually pose significant challenges in data mining. In this paper we present a novel semantic model that describes knowledge extracted from the lowest level of a data mining process, where information is represented by multiple features, i.e. measurements or numerical descriptors extracted from measurements, images, texts or other medical data, forming multidimensional feature spaces. Knowledge collected by manual annotation or extracted by unsupervised data mining from one or more feature spaces is modeled through generalized qualitative spatial semantics. This model enables a unified representation of knowledge across multimodal data repositories. It contributes to bridging the semantic gap by enabling direct links between low-level features and higher-level concepts, e.g. those describing body parts, anatomies and pathological findings. The proposed model has been developed in the Web Ontology Language based on description logics (OWL-DL) and can be applied to a variety of data mining tasks in medical informatics. Its utility is demonstrated for automatic annotation of medical data.
Decoding rule search domain in the left inferior frontal gyrus
Babcock, Laura; Vallesi, Antonino
2018-01-01
Traditionally, the left hemisphere has been thought to extract mainly verbal patterns of information, but recent evidence has shown that the left Inferior Frontal Gyrus (IFG) is active during inductive reasoning in both the verbal and spatial domains. We aimed to understand whether the left IFG supports inductive reasoning in a domain-specific or domain-general fashion. To do this we used Multi-Voxel Pattern Analysis to decode the representation of domain during a rule search task. Thirteen participants were asked to extract the rule underlying streams of letters presented in different spatial locations. Each rule was either verbal (letters forming words) or spatial (positions forming geometric figures). Our results show that domain was decodable in the left prefrontal cortex, suggesting that this region represents domain-specific information, rather than processes common to the two domains. A replication study with the same participants tested two years later confirmed these findings, though the individual representations changed, providing evidence for the flexible nature of representations. This study extends our knowledge on the neural basis of goal-directed behaviors and on how information relevant for rule extraction is flexibly mapped in the prefrontal cortex. PMID:29547623
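A minimal sketch of the decoding logic is given below: a cross-validated linear classifier applied to trial-by-voxel patterns, where (with real data) above-chance accuracy indicates that domain is decodable; the arrays here are random stand-ins, not the actual left-IFG patterns or the MVPA pipeline of the study.

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.standard_normal((80, 500))        # 80 trials x 500 voxels (hypothetical ROI patterns)
y = np.repeat([0, 1], 40)                 # 0 = verbal rule, 1 = spatial rule

clf = make_pipeline(StandardScaler(), LinearSVC(max_iter=5000))
scores = cross_val_score(clf, X, y, cv=5)
print(scores.mean())                      # compare against the 0.5 chance level
```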
NASA Astrophysics Data System (ADS)
Sacchi, Elisa; Michelot, Jean-Luc; Pitsch, Helmut; Lalieux, Philippe; Aranyossy, Jean-François
2001-01-01
This paper summarises the results of a comprehensive critical review, initiated by the OECD/NEA "Clay Club," of the extraction techniques available to obtain water and solutes from argillaceous rocks. The paper focuses on the mechanisms involved in the extraction processes, the consequences on the isotopic and chemical composition of the extracted pore water and the attempts made to reconstruct its original composition. Finally, it provides some examples of reliable techniques and information, as a function of the purpose of the geochemical study.
Biomedical question answering using semantic relations.
Hristovski, Dimitar; Dinevski, Dejan; Kastrin, Andrej; Rindflesch, Thomas C
2015-01-16
The proliferation of the scientific literature in the field of biomedicine makes it difficult to keep abreast of current knowledge, even for domain experts. While general Web search engines and specialized information retrieval (IR) systems have made important strides in recent decades, the problem of accurate knowledge extraction from the biomedical literature is far from solved. Classical IR systems usually return a list of documents that have to be read by the user to extract relevant information. This tedious and time-consuming work can be lessened with automatic Question Answering (QA) systems, which aim to provide users with direct and precise answers to their questions. In this work we propose a novel methodology for QA based on semantic relations extracted from the biomedical literature. We extracted semantic relations with the SemRep natural language processing system from 122,421,765 sentences, which came from 21,014,382 MEDLINE citations (i.e., the complete MEDLINE distribution up to the end of 2012). A total of 58,879,300 semantic relation instances were extracted and organized in a relational database. The QA process is implemented as a search in this database, which is accessed through a Web-based application, called SemBT (available at http://sembt.mf.uni-lj.si ). We conducted an extensive evaluation of the proposed methodology in order to estimate the accuracy of extracting a particular semantic relation from a particular sentence. Evaluation was performed by 80 domain experts. In total 7,510 semantic relation instances belonging to 2,675 distinct relations were evaluated 12,083 times. The instances were evaluated as correct 8,228 times (68%). In this work we propose an innovative methodology for biomedical QA. The system is implemented as a Web-based application that is able to provide precise answers to a wide range of questions. A typical question is answered within a few seconds. The tool has some extensions that make it especially useful for interpretation of DNA microarray results.
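Conceptually, answering a question then reduces to a constrained lookup in the relation store, as in the minimal sqlite sketch below; the table schema and the example relation are hypothetical and do not reflect SemBT's actual database design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE relations
                (subject TEXT, predicate TEXT, object TEXT, pmid TEXT)""")
conn.execute("INSERT INTO relations VALUES ('metformin', 'TREATS', 'type 2 diabetes', '12345')")

# "What does metformin treat?" becomes a constrained query over extracted relations:
rows = conn.execute(
    "SELECT object, pmid FROM relations WHERE subject = ? AND predicate = ?",
    ("metformin", "TREATS")).fetchall()
print(rows)
```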
Cognition-Based Approaches for High-Precision Text Mining
ERIC Educational Resources Information Center
Shannon, George John
2017-01-01
This research improves the precision of information extraction from free-form text via the use of cognitive-based approaches to natural language processing (NLP). Cognitive-based approaches are an important, and relatively new, area of research in NLP and search, as well as linguistics. Cognitive approaches enable significant improvements in both…
Modelling Mathematics Problem Solving Item Responses Using a Multidimensional IRT Model
ERIC Educational Resources Information Center
Wu, Margaret; Adams, Raymond
2006-01-01
This research examined students' responses to mathematics problem-solving tasks and applied a general multidimensional IRT model at the response category level. In doing so, cognitive processes were identified and modelled through item response modelling to extract more information than would be provided using conventional practices in scoring…
Association Rule Mining from an Intelligent Tutor
ERIC Educational Resources Information Center
Dogan, Buket; Camurcu, A. Yilmaz
2008-01-01
Educational data mining is a very novel research area, offering fertile ground for many interesting data mining applications. Educational data mining can extract useful information from educational activities for better understanding and assessment of the student learning process. In this way, it is possible to explore how students learn topics in…
Word Recognition: Theoretical Issues and Instructional Hints.
ERIC Educational Resources Information Center
Smith, Edward E.; Kleiman, Glenn M.
Research on adult readers' word recognition skills is used in this paper to develop a general information processing model of reading. Stages of the model include feature extraction, interpretation, lexical access, working memory, and integration. Of those stages, particular attention is given to the units of interpretation, speech recoding and…
Multivariate EMD and full spectrum based condition monitoring for rotating machinery
NASA Astrophysics Data System (ADS)
Zhao, Xiaomin; Patel, Tejas H.; Zuo, Ming J.
2012-02-01
Early assessment of machinery health condition is of paramount importance today. A sensor network with sensors in multiple directions and locations is usually employed for monitoring the condition of rotating machinery. Extraction of health condition information from these sensors for effective fault detection and fault tracking is always challenging. Empirical mode decomposition (EMD) is an advanced signal processing technology that has been widely used for this purpose. Standard EMD has the limitation in that it works only for a single real-valued signal. When dealing with data from multiple sensors and multiple health conditions, standard EMD faces two problems. First, because of the local and self-adaptive nature of standard EMD, the decomposition of signals from different sources may not match in either number or frequency content. Second, it may not be possible to express the joint information between different sensors. The present study proposes a method of extracting fault information by employing multivariate EMD and full spectrum. Multivariate EMD can overcome the limitations of standard EMD when dealing with data from multiple sources. It is used to extract the intrinsic mode functions (IMFs) embedded in raw multivariate signals. A criterion based on mutual information is proposed for selecting a sensitive IMF. A full spectral feature is then extracted from the selected fault-sensitive IMF to capture the joint information between signals measured from two orthogonal directions. The proposed method is first explained using simple simulated data, and then is tested for the condition monitoring of rotating machinery applications. The effectiveness of the proposed method is demonstrated through monitoring damage on the vane trailing edge of an impeller and rotor-stator rub in an experimental rotor rig.
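A minimal sketch of the two extraction steps follows, assuming the multivariate IMFs have already been computed by some multivariate EMD implementation: mutual information selects the fault-sensitive IMF, and the full spectrum is taken as the FFT of the complex signal formed from the two orthogonal channels. Function and variable names are illustrative only.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_regression

def select_sensitive_imf(imfs, reference):
    """Pick the IMF sharing the most mutual information with a reference signal
    (e.g. the raw measurement); `imfs` is a list of 1-D arrays from a prior MEMD step."""
    scores = [mutual_info_regression(imf.reshape(-1, 1), reference)[0] for imf in imfs]
    return int(np.argmax(scores))

def full_spectrum(x, y, fs):
    """Full spectrum of two orthogonal channels: FFT of the complex signal x + iy,
    keeping both positive (forward) and negative (backward) frequency components."""
    z = x + 1j * y
    spec = np.fft.fftshift(np.fft.fft(z)) / len(z)
    freqs = np.fft.fftshift(np.fft.fftfreq(len(z), d=1 / fs))
    return freqs, np.abs(spec)
```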
High-Efficiency Nitride-Base Photonic Crystal Light Sources
DOE Office of Scientific and Technical Information (OSTI.GOV)
James Speck; Evelyn Hu; Claude Weisbuch
2010-01-31
The research activities performed in the framework of this project represent a major breakthrough in the demonstration of Photonic Crystals (PhC) as a competitive technology for LEDs with high light extraction efficiency. The goals of the project were to explore the viable approaches to manufacturability of PhC LEDs through proven standard industrial processes, establish the limits of light extraction by various concepts of PhC LEDs, and determine the possible advantages of PhC LEDs over current and forthcoming LED extraction concepts. We have developed three very different geometries for PhC light extraction in LEDs. In addition, we have demonstrated reliable methods for their in-depth analysis allowing the extraction of important parameters such as light extraction efficiency, modal extraction length, directionality, internal and external quantum efficiency. The information gained allows better understanding of the physical processes and the effect of the design parameters on the light directionality and extraction efficiency. As a result, we produced LEDs with controllable emission directionality and a state of the art extraction efficiency that goes up to 94%. Those devices are based on embedded air-gap PhC - a novel technology concept developed in the framework of this project. They rely on a simple and planar fabrication process that is very interesting for industrial implementation due to its robustness and scalability. In fact, besides the additional patterning and regrowth steps, the process is identical to that for standard industrially used p-side-up LEDs. The final devices exhibit the same good electrical characteristics and high process yield as a series of test standard LEDs obtained in comparable conditions. Finally, the technology of embedded air-gap patterns (PhC) has significant potential in other related fields such as: increasing the optical mode interaction with the active region in semiconductor lasers; increasing the coupling of the incident light into the active region of solar cells; increasing the efficiency of the phosphor light conversion in white light LEDs, etc. In addition to the technology of embedded PhC LEDs, we demonstrate a technique for improvement of the light extraction and emission directionality for existing flip-chip microcavity (thin) LEDs by introducing a PhC grating into the top n-contact. Although the performances of these devices in terms of increase of the extraction efficiency are not significantly superior compared to those obtained by other techniques like surface roughening, the use of PhC offers some significant advantages such as improved and controllable emission directionality and a process that is directly applicable to any material system. The PhC microcavity LEDs also have potential for industrial implementation as the fabrication process has only minor differences to that already used for flip-chip thin LEDs. Finally, we have demonstrated that achieving good electrical properties and high fabrication yield for these devices is straightforward.
Chen, Elizabeth S.; Maloney, Francine L.; Shilmayster, Eugene; Goldberg, Howard S.
2009-01-01
A systematic and standard process for capturing information within free-text clinical documents could facilitate opportunities for improving quality and safety of patient care, enhancing decision support, and advancing data warehousing across an enterprise setting. At Partners HealthCare System, the Medical Language Processing (MLP) services project was initiated to establish a component-based architectural model and processes to facilitate putting MLP functionality into production for enterprise consumption, promote sharing of components, and encourage reuse. Key objectives included exploring the use of an open-source framework called the Unstructured Information Management Architecture (UIMA) and leveraging existing MLP-related efforts, terminology, and document standards. This paper describes early experiences in defining the infrastructure and standards for extracting, encoding, and structuring clinical observations from a variety of clinical documents to serve enterprise-wide needs. PMID:20351830
Chen, Elizabeth S; Maloney, Francine L; Shilmayster, Eugene; Goldberg, Howard S
2009-11-14
A systematic and standard process for capturing information within free-text clinical documents could facilitate opportunities for improving quality and safety of patient care, enhancing decision support, and advancing data warehousing across an enterprise setting. At Partners HealthCare System, the Medical Language Processing (MLP) services project was initiated to establish a component-based architectural model and processes to facilitate putting MLP functionality into production for enterprise consumption, promote sharing of components, and encourage reuse. Key objectives included exploring the use of an open-source framework called the Unstructured Information Management Architecture (UIMA) and leveraging existing MLP-related efforts, terminology, and document standards. This paper describes early experiences in defining the infrastructure and standards for extracting, encoding, and structuring clinical observations from a variety of clinical documents to serve enterprise-wide needs.
Mode extraction on wind turbine blades via phase-based video motion estimation
NASA Astrophysics Data System (ADS)
Sarrafi, Aral; Poozesh, Peyman; Niezrecki, Christopher; Mao, Zhu
2017-04-01
In recent years, image processing techniques are being applied more often for structural dynamics identification, characterization, and structural health monitoring. Although it is a non-contact and full-field measurement method, image processing still has a long way to go to outperform conventional sensing instruments (i.e., accelerometers, strain gauges, laser vibrometers, etc.). However, the technologies associated with image processing are developing rapidly and gaining more attention in a variety of engineering applications including structural dynamics identification and modal analysis. Among numerous motion estimation and image-processing methods, phase-based video motion estimation is considered one of the most efficient in terms of computational cost and noise robustness. In this paper, phase-based video motion estimation is adopted for structural dynamics characterization on a 2.3-meter long Skystream wind turbine blade, and the modal parameters (natural frequencies, operating deflection shapes) are extracted. Phase-based video processing adopted in this paper provides reliable full-field 2-D motion information, which is beneficial for manufacturing certification and model updating at the design stage. The phase-based video motion estimation approach is demonstrated through processing data on a full-scale commercial structure (i.e. a wind turbine blade) with complex geometry and properties, and the results obtained have a good correlation with the modal parameters extracted from accelerometer measurements, especially for the first four bending modes, which have significant importance in blade characterization.
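Once per-pixel displacement histories have been estimated from the video phase, natural frequencies can be read off the displacement spectrum; the sketch below shows only this last step on a synthetic two-mode signal, with the frame rate and mode frequencies as assumed placeholders rather than the blade's measured values.

```python
import numpy as np
from scipy.signal import find_peaks

fs = 240.0                               # camera frame rate (Hz), assumed
t = np.arange(0, 10, 1 / fs)
# hypothetical stand-in for an extracted displacement history: two modes plus noise
disp = (1.0 * np.sin(2 * np.pi * 3.2 * t)
        + 0.4 * np.sin(2 * np.pi * 11.7 * t)
        + 0.05 * np.random.randn(t.size))

spectrum = np.abs(np.fft.rfft(disp * np.hanning(disp.size)))
freqs = np.fft.rfftfreq(disp.size, d=1 / fs)

peaks, _ = find_peaks(spectrum, height=spectrum.max() * 0.1)
print(freqs[peaks])                      # expected near 3.2 Hz and 11.7 Hz
```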
Insight into the CH3NH3PbI3/C interface in hole-conductor-free mesoscopic perovskite solar cells
NASA Astrophysics Data System (ADS)
Li, Jiangwei; Niu, Guangda; Li, Wenzhe; Cao, Kun; Wang, Mingkui; Wang, Liduo
2016-07-01
Perovskite solar cells (PSCs) with hole-conductor-free mesoscopic architecture have shown superb stability and great potential in practical application. The printable carbon counter electrodes take full responsibility for extracting holes from the active CH3NH3PbI3 absorbers. However, an in-depth study of the CH3NH3PbI3/C interface properties, such as the structural formation process and the effect of interfacial conditions on hole extraction, is still lacking. Herein, we present, for the first time, an insight into the spatial confinement induced CH3NH3PbI3/C interface formation by in situ photoluminescence observations during the crystallization process of CH3NH3PbI3. The derived reaction kinetics allows a quantitative description of the perovskite formation process. In addition, we found that the interfacial contact between carbon and perovskite was dominant for hole extraction efficiency and associated with the photovoltaic parameter of short circuit current density (JSC). Consequently, we conducted a solvent vapor assisted process of PbI2 diffusion to carefully control the CH3NH3PbI3/C interface with less unreacted PbI2 barrier. The improvement of interface conditions thereby contributes to high hole extraction, as proved by the charge extraction resistance and PL lifetime changes, resulting in the increased JSC value.
NASA Astrophysics Data System (ADS)
Tian, J.; Krauß, T.; d'Angelo, P.
2017-05-01
Automatic rooftop extraction is one of the most challenging problems in remote sensing image analysis. Classical 2D image processing techniques are expensive due to the high number of features required to locate buildings. This problem can be avoided when 3D information is available. In this paper, we show how to fuse the spectral and height information of stereo imagery to achieve efficient and robust rooftop extraction. In the first step, the digital terrain model (DTM), and in turn the normalized digital surface model (nDSM), is generated using a new step-edge approach. In the second step, the initial building locations and rooftop boundaries are derived by removing low-level pixels and those high-level pixels that are more likely to be trees or shadows. This boundary then serves as the initial level-set function, which is further refined to fit the best possible boundaries through distance-regularized level-set curve evolution. During the fitting procedure, an edge-based active contour model is adopted and implemented using the edge indicators extracted from the panchromatic image. The performance of the proposed approach is tested using WorldView-2 satellite data captured over Munich.
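The first-step screening can be summarised by the small sketch below, which forms the nDSM and masks out ground and likely-vegetation pixels; array names, thresholds, and the optional NDVI input are assumptions, and the subsequent shadow removal and level-set refinement are not shown.

```python
import numpy as np

def initial_rooftop_mask(dsm, dtm, min_height=2.5, ndvi=None, ndvi_max=0.3):
    """Normalised DSM and a simple height/vegetation screen for candidate rooftops.

    dsm, dtm : co-registered rasters in metres (e.g. loaded with rasterio); hypothetical names
    """
    ndsm = dsm - dtm                      # height above terrain
    mask = ndsm > min_height              # drop low-level (ground) pixels
    if ndvi is not None:                  # drop pixels likely to be vegetation
        mask &= ndvi < ndvi_max
    return ndsm, mask
```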
An annotated corpus with nanomedicine and pharmacokinetic parameters
Lewinski, Nastassja A; Jimenez, Ivan; McInnes, Bridget T
2017-01-01
A vast amount of data on nanomedicines is being generated and published, and natural language processing (NLP) approaches can automate the extraction of unstructured text-based data. Annotated corpora are a key resource for NLP and information extraction methods which employ machine learning. Although corpora are available for pharmaceuticals, resources for nanomedicines and nanotechnology are still limited. To foster nanotechnology text mining (NanoNLP) efforts, we have constructed a corpus of annotated drug product inserts taken from the US Food and Drug Administration’s Drugs@FDA online database. In this work, we present the development of the Engineered Nanomedicine Database corpus to support the evaluation of nanomedicine entity extraction. The data were manually annotated for 21 entity mentions consisting of nanomedicine physicochemical characterization, exposure, and biologic response information of 41 Food and Drug Administration-approved nanomedicines. We evaluate the reliability of the manual annotations and demonstrate the use of the corpus by evaluating two state-of-the-art named entity extraction systems, OpenNLP and Stanford NER. The annotated corpus is available open source and, based on these results, guidelines and suggestions for future development of additional nanomedicine corpora are provided. PMID:29066897
Tabei, Yasuo; Pauwels, Edouard; Stoven, Véronique; Takemoto, Kazuhiro; Yamanishi, Yoshihiro
2012-01-01
Motivation: Drug effects are mainly caused by the interactions between drug molecules and their target proteins, including primary targets and off-targets. Identification of the molecular mechanisms behind overall drug–target interactions is crucial in the drug design process. Results: We develop a classifier-based approach to identify chemogenomic features (the underlying associations between drug chemical substructures and protein domains) that are involved in drug–target interaction networks. We propose a novel algorithm for extracting informative chemogenomic features by using L1 regularized classifiers over the tensor product space of possible drug–target pairs. It is shown that the proposed method can extract a very limited number of chemogenomic features without losing the performance of predicting drug–target interactions and that the extracted features are biologically meaningful. The extracted substructure–domain association network enables us to suggest ligand chemical fragments specific for each protein domain and ligand core substructures important for a wide range of protein families. Availability: Software is available at the supplementary website. Contact: yamanishi@bioreg.kyushu-u.ac.jp Supplementary Information: Datasets and all results are available at http://cbio.ensmp.fr/~yyamanishi/l1binary/ . PMID:22962471
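As a simplified illustration of the idea, the sketch below builds tensor-product (substructure x domain) features for drug–target pairs and fits an L1-penalised logistic regression so that the surviving non-zero weights mark candidate chemogenomic features; the fingerprints and labels are random stand-ins, and the paper's actual algorithm over the tensor product space is not reproduced.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def pair_features(drug_fp, prot_fp):
    """Outer (tensor) product of a drug substructure fingerprint and a protein
    domain fingerprint, flattened so each entry is one candidate association."""
    return np.outer(drug_fp, prot_fp).ravel()

rng = np.random.default_rng(0)
drugs = rng.integers(0, 2, size=(20, 8))      # hypothetical substructure fingerprints
prots = rng.integers(0, 2, size=(20, 5))      # hypothetical domain fingerprints
X = np.array([pair_features(d, p) for d, p in zip(drugs, prots)])
y = rng.integers(0, 2, size=20)               # 1 if the drug-target pair interacts

# The L1 penalty drives most substructure-domain weights to zero,
# leaving a sparse set of informative chemogenomic features.
clf = LogisticRegression(penalty="l1", solver="liblinear", C=1.0).fit(X, y)
print(len(np.flatnonzero(clf.coef_[0])), "nonzero substructure-domain associations")
```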
Interferometer with Continuously Varying Path Length Measured in Wavelengths to the Reference Mirror
NASA Technical Reports Server (NTRS)
Ohara, Tetsuo (Inventor)
2016-01-01
An interferometer in which the path length of the reference beam, measured in wavelengths, is continuously changing in sinusoidal fashion and the interference signal created by combining the measurement beam and the reference beam is processed in real time to obtain the physical distance along the measurement beam between the measured surface and a spatial reference frame such as the beam splitter. The processing involves analyzing the Fourier series of the intensity signal at one or more optical detectors in real time and using the time-domain multi-frequency harmonic signals to extract the phase information independently at each pixel position of one or more optical detectors and converting the phase information to distance information.
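A minimal numerical sketch of sinusoidal phase-modulation recovery is given below, assuming an intensity model I(t) = A + B cos(phi + z sin(omega t)) with known modulation depth z; the sample rate, modulation frequency, and depth are invented parameters, and the patented real-time multi-frequency processing is not reproduced.

```python
import numpy as np
from scipy.special import jv   # Bessel functions of the first kind

fs = 100_000.0                 # sample rate (Hz), assumed
f_mod = 1_000.0                # reference-mirror modulation frequency (Hz), assumed
z = 2.0                        # phase-modulation depth (radians), assumed known
phi_true = 1.234               # measurement-arm phase to recover (radians)

t = np.arange(0, 0.01, 1.0 / fs)   # exactly ten modulation periods
intensity = 5.0 + 3.0 * np.cos(phi_true + z * np.sin(2 * np.pi * f_mod * t))

# Lock-in style extraction of the first (sin) and second (cos) harmonics:
# s1 = -2*B*sin(phi)*J1(z) and c2 = 2*B*cos(phi)*J2(z) by the Jacobi-Anger expansion.
s1 = 2.0 * np.mean(intensity * np.sin(2 * np.pi * f_mod * t))
c2 = 2.0 * np.mean(intensity * np.cos(2 * 2 * np.pi * f_mod * t))

phi_est = np.arctan2(-s1 / jv(1, z), c2 / jv(2, z))
print(phi_est)                 # ~1.234, the recovered phase at this pixel
```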
Thalamic and cortical pathways supporting auditory processing
Lee, Charles C.
2012-01-01
The neural processing of auditory information engages pathways that begin initially at the cochlea and that eventually reach forebrain structures. At these higher levels, the computations necessary for extracting auditory source and identity information rely on the neuroanatomical connections between the thalamus and cortex. Here, the general organization of these connections in the medial geniculate body (thalamus) and the auditory cortex is reviewed. In addition, we consider two models organizing the thalamocortical pathways of the non-tonotopic and multimodal auditory nuclei. Overall, the transfer of information to the cortex via the thalamocortical pathways is complemented by the numerous intracortical and corticocortical pathways. Although interrelated, the convergent interactions among thalamocortical, corticocortical, and commissural pathways enable the computations necessary for the emergence of higher auditory perception. PMID:22728130