Sample records for "extract important information"

  1. Optimal Information Extraction of Laser Scanning Dataset by Scale-Adaptive Reduction

    NASA Astrophysics Data System (ADS)

    Zang, Y.; Yang, B.

    2018-04-01

    3D laser scanning is widely used to collect the surface information of objects. For many applications, we need to extract a point cloud of good perceptual quality from the scanned points. Most existing methods extract important points at a single fixed scale, yet the geometric features of a 3D object arise at various geometric scales. We propose a multi-scale construction method based on radial basis functions. At each scale, important points are extracted from the point cloud according to their importance. We apply the Just-Noticeable-Difference perception metric to measure the degradation of each geometric scale. Finally, scale-adaptive optimal information extraction is realized. Experiments undertaken to evaluate the effectiveness of the proposed method suggest a reliable solution for optimal information extraction from objects.

  2. Research of information classification and strategy intelligence extract algorithm based on military strategy hall

    NASA Astrophysics Data System (ADS)

    Chen, Lei; Li, Dehua; Yang, Jie

    2007-12-01

    Constructing a virtual international strategy environment requires many kinds of information, such as economics, politics, military affairs, diplomacy, culture, and science. It is therefore very important to build a highly efficient system for automatic information extraction, classification, recombination, and analysis as the foundation and a component of the military strategy hall. This paper first uses an improved Boost algorithm to classify the obtained initial information, and then uses a strategy intelligence extraction algorithm to extract strategy intelligence from that information to help strategists analyze it.

  3. An Effective Approach to Biomedical Information Extraction with Limited Training Data

    ERIC Educational Resources Information Center

    Jonnalagadda, Siddhartha

    2011-01-01

    In the current millennium, the extensive use of computers and the internet has caused an exponential increase in information. Few research areas are as important as information extraction, which primarily involves extracting concepts and the relations between them from free text. Limitations in the size of training data, lack of lexicons and lack of…

  4. Tagline: Information Extraction for Semi-Structured Text Elements in Medical Progress Notes

    ERIC Educational Resources Information Center

    Finch, Dezon Kile

    2012-01-01

    Text analysis has become an important research activity in the Department of Veterans Affairs (VA). Statistical text mining and natural language processing have been shown to be very effective for extracting useful information from medical documents. However, neither of these techniques is effective at extracting the information stored in…

  5. [Study on Information Extraction of Clinic Expert Information from Hospital Portals].

    PubMed

    Zhang, Yuanpeng; Dong, Jiancheng; Qian, Danmin; Geng, Xingyun; Wu, Huiqun; Wang, Li

    2015-12-01

    Clinic expert information provides important references for residents in need of hospital care. Usually, such information is hidden in the deep web and cannot be directly indexed by search engines. To extract clinic expert information from the deep web, the first challenge is to make a judgment on forms. This paper proposes a novel method based on a domain model, which is a tree structure constructed from the attributes of search interfaces. With this model, search interfaces can be classified into a domain and filled in with domain keywords. Another challenge is to extract information from the returned web pages indexed by search interfaces. To filter the noise information on a web page, a block importance model is proposed. The experimental results indicated that the domain model yielded a precision 10.83% higher than that of the rule-based method, whereas the block importance model yielded an F₁ measure 10.5% higher than that of the XPath method.

  6. Information Extraction from Unstructured Text for the Biodefense Knowledge Center

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Samatova, N F; Park, B; Krishnamurthy, R

    2005-04-29

    The Bio-Encyclopedia at the Biodefense Knowledge Center (BKC) is being constructed to allow early detection of emerging biological threats to homeland security. It requires highly structured information extracted from a variety of data sources. However, the quantity of new and vital information available from everyday sources cannot be assimilated by hand, so reliable high-throughput information extraction techniques are much needed. In support of the BKC, Lawrence Livermore National Laboratory and Oak Ridge National Laboratory, together with the University of Utah, are developing an information extraction system built around the bioterrorism domain. This paper reports two important pieces of our effort integrated in the system: key phrase extraction and semantic tagging. Whereas the two key phrase extraction technologies developed during the course of the project help identify relevant texts, our state-of-the-art semantic tagging system can pinpoint phrases related to emerging biological threats. We are also enhancing and tailoring the Bio-Encyclopedia by augmenting semantic dictionaries and extracting details of important events, such as suspected disease outbreaks. Some of these technologies have already been applied to large corpora of free text sources vital to the BKC mission, including ProMED-mail, PubMed abstracts, and the DHS's Information Analysis and Infrastructure Protection (IAIP) news clippings. In order to address the challenges involved in incorporating such large amounts of unstructured text, the overall system is focused on precise extraction of the most relevant information for inclusion in the BKC.

  7. Clinic expert information extraction based on domain model and block importance model.

    PubMed

    Zhang, Yuanpeng; Wang, Li; Qian, Danmin; Geng, Xingyun; Yao, Dengfu; Dong, Jiancheng

    2015-11-01

    To extract expert clinic information from the deep web, two challenges must be faced. The first is to make a judgment on forms. A novel method based on a domain model, which is a tree structure constructed from the attributes of query interfaces, is proposed. With this model, query interfaces can be classified into a domain and filled in with domain keywords. The second challenge is to extract information from the response web pages indexed by query interfaces. To filter the noisy information on a web page, a block importance model is proposed in which both content and spatial features are taken into account. The experimental results indicate that the domain model yields a precision 4.89% higher than that of the rule-based method, whereas the block importance model yields an F1 measure 10.5% higher than that of the XPath method. Copyright © 2015 Elsevier Ltd. All rights reserved.
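
    A minimal sketch of what a block-importance score combining content and spatial features might look like, per the description above. The feature definitions, field names, and weights are illustrative assumptions, not the paper's actual model:

      # Toy block-importance scoring: a content feature (link density) plus
      # a spatial feature (closeness to the page center). Weights are assumed.
      def block_importance(block, page_w, page_h, w_content=0.6, w_spatial=0.4):
          text_len = max(len(block["text"]), 1)
          link_ratio = len(block["anchor_text"]) / text_len
          content_score = 1.0 - link_ratio          # link-heavy blocks are noise
          cx = block["x"] + block["w"] / 2.0
          cy = block["y"] + block["h"] / 2.0
          dx = abs(cx - page_w / 2.0) / (page_w / 2.0)
          dy = abs(cy - page_h / 2.0) / (page_h / 2.0)
          spatial_score = 1.0 - (dx + dy) / 2.0     # central blocks score higher
          return w_content * content_score + w_spatial * spatial_score

      block = {"text": "Dr. Li, chief physician, hepatobiliary surgery",
               "anchor_text": "", "x": 200, "y": 150, "w": 400, "h": 300}
      print(block_importance(block, 1024, 768))     # about 0.91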

  8. Knowledge Discovery and Data Mining: An Overview

    NASA Technical Reports Server (NTRS)

    Fayyad, U.

    1995-01-01

    The process of knowledge discovery and data mining is the process of information extraction from very large databases. Its importance is described along with several techniques and considerations for selecting the most appropriate technique for extracting information from a particular data set.

  9. OpenDMAP: An open source, ontology-driven concept analysis engine, with applications to capturing knowledge regarding protein transport, protein interactions and cell-type-specific gene expression

    PubMed Central

    Hunter, Lawrence; Lu, Zhiyong; Firby, James; Baumgartner, William A; Johnson, Helen L; Ogren, Philip V; Cohen, K Bretonnel

    2008-01-01

    Background Information extraction (IE) efforts are widely acknowledged to be important in harnessing the rapid advance of biomedical knowledge, particularly in areas where important factual information is published in a diverse literature. Here we report on the design, implementation and several evaluations of OpenDMAP, an ontology-driven, integrated concept analysis system. It significantly advances the state of the art in information extraction by leveraging knowledge in ontological resources, integrating diverse text processing applications, and using an expanded pattern language that allows the mixing of syntactic and semantic elements and variable ordering. Results OpenDMAP information extraction systems were produced for extracting protein transport assertions (transport), protein-protein interaction assertions (interaction) and assertions that a gene is expressed in a cell type (expression). Evaluations were performed on each system, resulting in F-scores ranging from .26 – .72 (precision .39 – .85, recall .16 – .85). Additionally, each of these systems was run over all abstracts in MEDLINE, producing a total of 72,460 transport instances, 265,795 interaction instances and 176,153 expression instances. Conclusion OpenDMAP advances the performance standards for extracting protein-protein interaction predications from the full texts of biomedical research articles. Furthermore, this level of performance appears to generalize to other information extraction tasks, including extracting information about predicates of more than two arguments. The output of the information extraction system is always constructed from elements of an ontology, ensuring that the knowledge representation is grounded with respect to a carefully constructed model of reality. The results of these efforts can be used to increase the efficiency of manual curation efforts and to provide additional features in systems that integrate multiple sources for information extraction. The open source OpenDMAP code library is freely available. PMID:18237434

  10. The comparison and analysis of extracting video key frame

    NASA Astrophysics Data System (ADS)

    Ouyang, S. Z.; Zhong, L.; Luo, R. Q.

    2018-05-01

    Video key frame extraction is an important part of large-scale video data processing. Building on previous work in key frame extraction, we summarize four important key frame extraction algorithms; these methods are largely built on comparing the difference between each pair of consecutive frames: if the difference exceeds a threshold value, the corresponding frames are taken as distinct key frames. We then propose a key frame extraction method based on mutual information, introducing information entropy, selecting appropriate threshold values to form initial classes, and finally taking frames whose mutual information is close to the class mean as candidate key frames. In this paper, these algorithms are used to extract the key frames of tunnel traffic videos. The analysis of the experimental results and the comparison of the pros and cons of these algorithms provide a sound basis for practical applications.
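
    A minimal sketch of the threshold-based scheme these surveyed methods share, assuming OpenCV is available; the threshold value and the video file name are placeholders to be tuned per video:

      # Select a frame as a key frame when the mean absolute difference
      # from the previous frame exceeds a threshold.
      import cv2
      import numpy as np

      def extract_key_frames(path, threshold=30.0):
          cap = cv2.VideoCapture(path)
          key_frames, prev, idx = [], None, 0
          while True:
              ok, frame = cap.read()
              if not ok:
                  break
              gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
              if prev is None or np.mean(cv2.absdiff(gray, prev)) > threshold:
                  key_frames.append(idx)
              prev = gray
              idx += 1
          cap.release()
          return key_frames

      print(extract_key_frames("tunnel_traffic.mp4"))  # hypothetical input file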

  11. Semantic Information Extraction of Lanes Based on Onboard Camera Videos

    NASA Astrophysics Data System (ADS)

    Tang, L.; Deng, T.; Ren, C.

    2018-04-01

    In the field of autonomous driving, semantic information about lanes is very important. This paper proposes a method for automatically detecting lanes and extracting their semantic information from onboard camera videos. The proposed method first detects the edges of lanes from the grayscale gradient direction and fits them with an improved probabilistic Hough transform; it then uses the vanishing-point principle to calculate the geometric position of each lane, and uses lane characteristics to extract lane semantic information through decision-tree classification. In the experiment, 216 road video images captured by a camera mounted on a moving vehicle were used to detect lanes and extract lane semantic information. The results show that the proposed method can accurately identify lane semantics from video images.
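
    A rough sketch of the edge-detection and line-fitting stage, assuming OpenCV; the parameter values and input file are placeholders, and the paper's improved Hough variant and vanishing-point geometry are not reproduced here:

      # Gradient-based edges followed by a probabilistic Hough transform.
      import cv2
      import numpy as np

      frame = cv2.imread("road_frame.png")       # one onboard video frame (assumed)
      gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
      edges = cv2.Canny(gray, 50, 150)           # gradient-magnitude edge map
      lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=50,
                              minLineLength=40, maxLineGap=20)
      if lines is not None:
          for x1, y1, x2, y2 in lines[:, 0]:     # draw fitted lane segments
              cv2.line(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)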

  12. New Method for Knowledge Management Focused on Communication Pattern in Product Development

    NASA Astrophysics Data System (ADS)

    Noguchi, Takashi; Shiba, Hajime

    In the field of manufacturing, the importance of utilizing knowledge and know-how has been growing. Against this background, new methods are needed to efficiently accumulate and extract effective knowledge and know-how. To facilitate the extraction of the knowledge and know-how needed by engineers, we first defined business process information, which includes schedule/progress information, document data, information about communication among the parties concerned, and information that relates these three types to one another. Based on these definitions, we proposed an IT system (FlexPIM: Flexible and collaborative Process Information Management) to register and accumulate business process information with the least effort. To efficiently extract effective information from huge volumes of accumulated business process information, focusing attention on "actions" and communication patterns, we propose a new extraction method using communication patterns. The validity of this method has been verified for several communication patterns.

  13. A rapid extraction of landslide disaster information research based on GF-1 image

    NASA Astrophysics Data System (ADS)

    Wang, Sai; Xu, Suning; Peng, Ling; Wang, Zhiyi; Wang, Na

    2015-08-01

    In recent years, landslide disasters have occurred frequently as a result of seismic activity, bringing great harm to people's lives and drawing high attention from the state and extensive concern from society. In the field of geological disasters, landslide information extraction based on remote sensing has been controversial, but high-resolution remote sensing imagery, with its rich texture and geometric information, can effectively improve the accuracy of information extraction. It is therefore feasible to extract information on earthquake-triggered landslides that cause serious surface damage on a large scale. Taking Wenchuan county as the study area, this paper uses a multi-scale segmentation method to extract landslide image objects from domestic GF-1 images and DEM data, using the estimation of scale parameter tool to determine the optimal segmentation scale. After comprehensively analyzing the characteristics of landslides in high-resolution imagery and selecting spectral, texture, geometric, and landform features of the image, we establish extraction rules to extract landslide disaster information. The extraction results show 20 landslides with a total area of 521279.31. Compared with visual interpretation results, the extraction accuracy is 72.22%. This study indicates that it is efficient and feasible to extract earthquake landslide disaster information based on high-resolution remote sensing, and it provides important technical support for post-disaster emergency investigation and disaster assessment.

  14. [Extraction of buildings three-dimensional information from high-resolution satellite imagery based on Barista software].

    PubMed

    Zhang, Pei-feng; Hu, Yuan-man; He, Hong-shi

    2010-05-01

    The demand for accurate and up-to-date spatial information on urban buildings is becoming more and more important for urban planning, environmental protection, and other applications. Today's commercial high-resolution satellite imagery offers the potential to extract the three-dimensional information of urban buildings. This paper extracted the three-dimensional information of urban buildings from QuickBird imagery and validated the precision of the extraction using Barista software. It was shown that extracting three-dimensional building information from high-resolution satellite imagery with Barista software has the advantages of low demands on professional expertise, broad applicability, simple operation, and high precision. Point positioning and height determination accuracy at the one-pixel level could be achieved if the digital elevation model (DEM) and sensor orientation model were of sufficiently high precision and the off-nadir view angle was suitable.

  15. Analysis of Technique to Extract Data from the Web for Improved Performance

    NASA Astrophysics Data System (ADS)

    Gupta, Neena; Singh, Manish

    2010-11-01

    The World Wide Web is rapidly leading the world into an amazing new electronic era, in which anyone can publish anything in electronic form and extract almost any information. Extraction of information from semi-structured or unstructured documents, such as web pages, is a useful yet complex task. Data extraction, which is important for many applications, automatically extracts records from HTML files. Ontologies can achieve a high degree of accuracy in data extraction. We analyze OBDE (Ontology-Based Data Extraction), a method that automatically extracts query result records from the web with the help of agents. OBDE first constructs an ontology for a domain according to information matching between the query interfaces and query result pages from different web sites within the same domain. Then, the constructed domain ontology is used during data extraction to identify the query result section in a query result page and to align and label the data values in the extracted records. The ontology-assisted data extraction method is fully automatic and overcomes many of the deficiencies of current automatic data extraction methods.

  16. Extracting important information from Chinese Operation Notes with natural language processing methods.

    PubMed

    Wang, Hui; Zhang, Weide; Zeng, Qiang; Li, Zuofeng; Feng, Kaiyan; Liu, Lei

    2014-04-01

    Extracting information from unstructured clinical narratives is valuable for many clinical applications. Although natural language processing (NLP) methods have been studied in depth for electronic medical records (EMR), few studies have explored NLP for extracting information from Chinese clinical narratives. In this study, we report the development and evaluation of a system for extracting tumor-related information from operation notes of hepatic carcinomas written in Chinese. Using 86 operation notes manually annotated by physicians as the training set, we explored both rule-based and supervised machine-learning approaches. Evaluated on 29 unseen operation notes, our best approach yielded 69.6% precision, 58.3% recall and a 63.5% F-score. Copyright © 2014 Elsevier Inc. All rights reserved.

  17. Automatic updating and 3D modeling of airport information from high resolution images using GIS and LIDAR data

    NASA Astrophysics Data System (ADS)

    Lv, Zheng; Sui, Haigang; Zhang, Xilin; Huang, Xianfeng

    2007-11-01

    As one of the most important geospatial objects and military establishments, an airport is always a key target in the fields of transportation and military affairs. Therefore, automatic recognition and extraction of airports from remote sensing images is very important and urgent for updating civil aviation data and for military applications. In this paper, a new multi-source data fusion approach to automatic airport information extraction, updating and 3D modeling is addressed. The corresponding key technologies are discussed in detail, including feature extraction of airport information based on a modified Otsu algorithm, automatic change detection based on a new parallel-lines-based buffer detection algorithm, 3D modeling based on a gradual elimination of non-building points algorithm, 3D change detection between the old airport model and LIDAR data, and the import of typical CAD models. Finally, based on these technologies, we developed a prototype system, and the results show that our method achieves good results.

  18. A Statistical Texture Feature for Building Collapse Information Extraction of SAR Image

    NASA Astrophysics Data System (ADS)

    Li, L.; Yang, H.; Chen, Q.; Liu, X.

    2018-04-01

    Synthetic Aperture Radar (SAR) has become one of the most important ways to extract post-disaster collapsed building information, due to its extreme versatility and almost all-weather, day-and-night working capability. In view of the fact that the inherent statistical distribution of speckle in SAR images has not been used to extract collapsed building information, this paper proposes a novel texture feature based on statistical models of SAR images to extract collapsed buildings. In the proposed feature, the texture parameter of the G0 distribution of SAR images is used to reflect the uniformity of the target and thereby extract collapsed buildings. This feature not only considers the statistical distribution of SAR images, providing a more accurate description of object texture, but also applies to collapsed building information extraction from single-, dual- or full-polarization SAR data. RADARSAT-2 data of the Yushu earthquake, acquired on April 21, 2010, are used to present and analyze the performance of the proposed method. In addition, the applicability of this feature to SAR data with different polarizations is analysed, which provides decision support for data selection in collapsed building information extraction.

  19. A method for automatically extracting infectious disease-related primers and probes from the literature

    PubMed Central

    2010-01-01

    Background Primer and probe sequences are the main components of nucleic acid-based detection systems. Biologists use primers and probes for different tasks, some related to the diagnosis and prescription of infectious diseases. The biological literature is the main information source for empirically validated primer and probe sequences. Therefore, it is becoming increasingly important for researchers to navigate this important information. In this paper, we present a four-phase method for extracting and annotating primer/probe sequences from the literature. These phases are: (1) convert each document into a tree of paper sections, (2) detect the candidate sequences using a set of finite state machine-based recognizers, (3) refine problem sequences using a rule-based expert system, and (4) annotate the extracted sequences with their related organism/gene information. Results We tested our approach using a test set composed of 297 manuscripts. The extracted sequences and their organism/gene annotations were manually evaluated by a panel of molecular biologists. The results of the evaluation show that our approach is suitable for automatically extracting DNA sequences, achieving precision/recall rates of 97.98% and 95.77%, respectively. In addition, 76.66% of the detected sequences were correctly annotated with their organism name. The system also provided correct gene-related information for 46.18% of the sequences assigned a correct organism name. Conclusions We believe that the proposed method can facilitate routine tasks for biomedical researchers using molecular methods to diagnose and prescribe different infectious diseases. In addition, the proposed method can be expanded to detect and extract other biological sequences from the literature. The extracted information can also be used to readily update available primer/probe databases or to create new databases from scratch. PMID:20682041
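
    The candidate-detection phase can be approximated with a single regular expression, a compact way to express the kind of finite-state recognizer the paper describes; the IUPAC nucleotide alphabet is standard, but the minimum length of 15 is an assumption:

      # Detect runs of nucleotide codes (A, C, G, T plus IUPAC ambiguity
      # codes) long enough to be primer/probe candidates.
      import re

      PRIMER_RE = re.compile(r"\b[ACGTRYSWKMBDHVN]{15,}\b", re.IGNORECASE)

      text = ("The forward primer 5'-AGGCTTACGGATCCAGACTT-3' was used "
              "for amplification.")
      cleaned = text.replace("5'-", "").replace("-3'", "")
      print(PRIMER_RE.findall(cleaned))   # ['AGGCTTACGGATCCAGACTT']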

  20. Optimal tuning of a confined Brownian information engine.

    PubMed

    Park, Jong-Min; Lee, Jae Sung; Noh, Jae Dong

    2016-03-01

    A Brownian information engine is a device that extracts mechanical work from a single heat bath by exploiting information on the state of a Brownian particle immersed in the bath. As with any engine, it is important to find the optimal operating condition that yields the maximum extracted work or power. The optimal condition for a Brownian information engine with a finite cycle time τ has rarely been studied because of the difficulty in finding the nonequilibrium steady state. In this study, we introduce a model for the Brownian information engine and develop an analytic formalism for its steady-state distribution for any τ. We find that the extracted work per engine cycle is maximum when τ approaches infinity, while the power is maximum when τ approaches zero.

  1. An extraction method for mountainous area settlement place information from GF-1 high resolution optical remote sensing images under semantic constraints

    NASA Astrophysics Data System (ADS)

    Guo, H., II

    2016-12-01

    Spatial distribution information on mountainous settlement places is of great significance for earthquake emergency work, because most of the key earthquake-hazardous areas of China are located in mountainous areas. Remote sensing has the advantages of large coverage and low cost, making it an important way to obtain the spatial distribution of mountainous settlement places. At present, most studies apply object-oriented methods that fully consider geometric, spectral and texture information to extract settlement place information. In this article, semantic constraints are added on top of the object-oriented methods. The experimental data are one scene from a domestic high-resolution satellite (GF-1), with a resolution of 2 meters. The processing consists of three steps: the first is pretreatment, including orthorectification and image fusion; the second is object-oriented information extraction, including image segmentation and information extraction; the last is removing erroneous elements under semantic constraints. To formulate these semantic constraints, the distribution characteristics of mountainous settlement places must be analyzed and the spatial logic relations between settlement places and other objects must be considered. The accuracy assessment shows that the extraction accuracy of the object-oriented method is 49% and rises to 86% once semantic constraints are applied. As can be seen from these figures, the extraction method under semantic constraints can effectively improve the accuracy of mountainous settlement place information extraction. The results show that it is feasible to extract mountainous settlement place information from GF-1 images, so the article demonstrates the practicality of using domestic high-resolution optical remote sensing imagery in earthquake emergency preparedness.

  2. Information retrieval and terminology extraction in online resources for patients with diabetes.

    PubMed

    Seljan, Sanja; Baretić, Maja; Kucis, Vlasta

    2014-06-01

    Terminology use, as a means of information retrieval or document indexing, plays an important role in health literacy. Specific types of users, i.e. patients with diabetes, need access to various online resources (in foreign and/or native languages) when searching for information on self-education in basic diabetic knowledge, on self-care activities regarding the importance of dietetic food, medications and physical exercise, and on self-management of insulin pumps. Automatic extraction of corpus-based terminology from online texts, manuals or professional papers can help in building terminology lists or lists of "browsing phrases" useful in information retrieval or document indexing. Specific terminology lists represent an intermediate step between free-text search and a controlled vocabulary, between users' demands and existing online resources in native and foreign languages. The research, aiming to detect the role of terminology in online resources, is conducted on English and Croatian manuals and Croatian online texts, and divided into three interrelated parts: i) comparison of professional and popular terminology use; ii) evaluation of automatic statistically-based terminology extraction on English and Croatian texts; iii) comparison and evaluation of extracted terminology performed on an English manual using statistical and hybrid approaches. Extracted terminology candidates are evaluated by comparison with three types of reference lists: a list created by a medical professional, a list of highly professional vocabulary contained in MeSH, and a list created by non-medical persons, made as the intersection of 15 lists. Results report on the use of popular and professional terminology in online diabetes resources, on the evaluation of automatically extracted terminology candidates in English and Croatian texts, and on the comparison of statistical and hybrid extraction methods on English text. Evaluation of the automatic and semi-automatic terminology extraction methods is performed by recall, precision and F-measure.
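
    A minimal sketch of the evaluation loop described above, comparing extracted term candidates against a reference list; the term lists here are invented examples:

      # Precision, recall and F-measure of an extracted term list
      # against a gold-standard reference list.
      def evaluate(extracted, reference):
          extracted, reference = set(extracted), set(reference)
          tp = len(extracted & reference)
          precision = tp / len(extracted) if extracted else 0.0
          recall = tp / len(reference) if reference else 0.0
          f1 = (2 * precision * recall / (precision + recall)
                if precision + recall else 0.0)
          return precision, recall, f1

      candidates = ["insulin pump", "blood glucose", "next page"]
      gold = ["insulin pump", "blood glucose", "glycemic control"]
      print(evaluate(candidates, gold))   # approx (0.667, 0.667, 0.667)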

  3. Integrating semantic information into multiple kernels for protein-protein interaction extraction from biomedical literatures.

    PubMed

    Li, Lishuang; Zhang, Panpan; Zheng, Tianfu; Zhang, Hongying; Jiang, Zhenchao; Huang, Degen

    2014-01-01

    Protein-Protein Interaction (PPI) extraction is an important task in the biomedical information extraction. Presently, many machine learning methods for PPI extraction have achieved promising results. However, the performance is still not satisfactory. One reason is that the semantic resources were basically ignored. In this paper, we propose a multiple-kernel learning-based approach to extract PPIs, combining the feature-based kernel, tree kernel and semantic kernel. Particularly, we extend the shortest path-enclosed tree kernel (SPT) by a dynamic extended strategy to retrieve the richer syntactic information. Our semantic kernel calculates the protein-protein pair similarity and the context similarity based on two semantic resources: WordNet and Medical Subject Heading (MeSH). We evaluate our method with Support Vector Machine (SVM) and achieve an F-score of 69.40% and an AUC of 92.00%, which show that our method outperforms most of the state-of-the-art systems by integrating semantic information.
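
    A sketch of the kernel-combination idea under stated assumptions: a weighted sum of Gram matrices remains a valid kernel and can be fed to a precomputed-kernel SVM. The three matrices and mixing weights below are placeholders; the paper's tree and semantic kernels require parse trees and WordNet/MeSH similarities not shown here:

      import numpy as np
      from sklearn.svm import SVC

      rng = np.random.default_rng(0)
      X = rng.normal(size=(40, 5))            # stand-in feature vectors
      y = (X[:, 0] > 0).astype(int)

      K_feat = X @ X.T                        # linear feature kernel
      K_tree = (X @ X.T + 1.0) ** 2           # stand-in for a tree kernel
      D = np.linalg.norm(X[:, None] - X[None], axis=2)
      K_sem = np.exp(-0.5 * D ** 2)           # stand-in for a semantic kernel

      K = 0.5 * K_feat + 0.3 * K_tree + 0.2 * K_sem   # assumed weights
      clf = SVC(kernel="precomputed").fit(K, y)
      print(clf.score(K, y))                  # training accuracy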

  4. User-centered evaluation of Arizona BioPathway: an information extraction, integration, and visualization system.

    PubMed

    Quiñones, Karin D; Su, Hua; Marshall, Byron; Eggers, Shauna; Chen, Hsinchun

    2007-09-01

    Explosive growth in biomedical research has made automated information extraction, knowledge integration, and visualization increasingly important and critically needed. The Arizona BioPathway (ABP) system extracts and displays biological regulatory pathway information from the abstracts of journal articles. This study uses relations extracted from more than 200 PubMed abstracts presented in a tabular and graphical user interface with built-in search and aggregation functionality. This paper presents a task-centered assessment of the usefulness and usability of the ABP system focusing on its relation aggregation and visualization functionalities. Results suggest that our graph-based visualization is more efficient in supporting pathway analysis tasks and is perceived as more useful and easier to use as compared to a text-based literature-viewing method. Relation aggregation significantly contributes to knowledge-acquisition efficiency. Together, the graphic and tabular views in the ABP Visualizer provide a flexible and effective interface for pathway relation browsing and analysis. Our study contributes to pathway-related research and biological information extraction by assessing the value of a multiview, relation-based interface that supports user-controlled exploration of pathway information across multiple granularities.

  5. Information extraction from multi-institutional radiology reports.

    PubMed

    Hassanpour, Saeed; Langlotz, Curtis P

    2016-01-01

    The radiology report is the most important source of clinical imaging information. It documents critical information about the patient's health and the radiologist's interpretation of medical findings. It also communicates information to the referring physicians and records that information for future clinical and research use. Although efforts to structure some radiology report information through predefined templates are beginning to bear fruit, a large portion of radiology report information is entered in free text. The free text format is a major obstacle for rapid extraction and subsequent use of information by clinicians, researchers, and healthcare information systems. This difficulty is due to the ambiguity and subtlety of natural language, complexity of described images, and variations among different radiologists and healthcare organizations. As a result, radiology reports are used only once by the clinician who ordered the study and rarely are used again for research and data mining. In this work, machine learning techniques and a large multi-institutional radiology report repository are used to extract the semantics of the radiology report and overcome the barriers to the re-use of radiology report information in clinical research and other healthcare applications. We describe a machine learning system to annotate radiology reports and extract report contents according to an information model. This information model covers the majority of clinically significant contents in radiology reports and is applicable to a wide variety of radiology study types. Our automated approach uses discriminative sequence classifiers for named-entity recognition to extract and organize clinically significant terms and phrases consistent with the information model. We evaluated our information extraction system on 150 radiology reports from three major healthcare organizations and compared its results to a commonly used non-machine learning information extraction method. We also evaluated the generalizability of our approach across different organizations by training and testing our system on data from different organizations. Our results show the efficacy of our machine learning approach in extracting the information model's elements (10-fold cross-validation average performance: precision: 87%, recall: 84%, F1 score: 85%) and its superiority and generalizability compared to the common non-machine learning approach (p-value<0.05). Our machine learning information extraction approach provides an effective automatic method to annotate and extract clinically significant information from a large collection of free text radiology reports. This information extraction system can help clinicians better understand the radiology reports and prioritize their review process. In addition, the extracted information can be used by researchers to link radiology reports to information from other data sources such as electronic health records and the patient's genome. Extracted information also can facilitate disease surveillance, real-time clinical decision support for the radiologist, and content-based image retrieval. Copyright © 2015 Elsevier B.V. All rights reserved.

  6. From Specific Information Extraction to Inferences: A Hierarchical Framework of Graph Comprehension

    DTIC Science & Technology

    2004-09-01

    The skill to interpret the information displayed in graphs is so important that the National Council of Teachers of Mathematics has created...guidelines to ensure that students learn these skills (NCTM: Standards for Mathematics, 2003). These guidelines are based primarily on the extraction of...

  7. Health Education in Mauritius.

    ERIC Educational Resources Information Center

    Mamet, Linda

    1983-01-01

    Presents some extracts from a survey given to new mothers to determine the approach of a multimedia campaign on mother and child health and the importance of breastfeeding in Mauritius. These extracts include information on socioeconomic characteristics, housing conditions, pregnancy and childbirth habits, and breastfeeding. (Author/MBR)

  8. Parallel Key Frame Extraction for Surveillance Video Service in a Smart City.

    PubMed

    Zheng, Ran; Yao, Chuanwei; Jin, Hai; Zhu, Lei; Zhang, Qin; Deng, Wei

    2015-01-01

    Surveillance video service (SVS) is one of the most important services provided in a smart city. For the utilization of SVS, it is very important to design efficient surveillance video analysis techniques. Key frame extraction is a simple yet effective technique to achieve this goal. In surveillance video applications, key frames are typically used to summarize important video content, so it is essential to extract them accurately and efficiently. A novel approach is proposed to extract key frames from traffic surveillance videos on GPUs (graphics processing units) to ensure high efficiency and accuracy. For the determination of key frames, motion is the more salient feature for presenting actions or events, especially in surveillance videos. The motion feature is extracted on the GPU to reduce running time. It is then smoothed to reduce noise, and the frames with local maxima of motion information are selected as the final key frames. The experimental results show that this approach extracts key frames more accurately and efficiently than several other methods.
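
    A sketch of the motion-based selection step described above: a per-frame motion signal is smoothed and its local maxima become key frames. The GPU acceleration is omitted, and the motion values are an invented example array:

      import numpy as np
      from scipy.signal import argrelextrema

      motion = np.array([0.1, 0.2, 1.0, 0.3, 0.1, 0.8, 1.5, 0.4, 0.2, 0.1])
      kernel = np.ones(3) / 3.0                   # moving-average smoothing
      smoothed = np.convolve(motion, kernel, mode="same")
      key_frames = argrelextrema(smoothed, np.greater)[0]
      print(key_frames)                           # local motion peaks: [2 6]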

  9. An effective image classification method with the fusion of invariant feature and a new color descriptor

    NASA Astrophysics Data System (ADS)

    Mansourian, Leila; Taufik Abdullah, Muhamad; Nurliyana Abdullah, Lili; Azman, Azreen; Mustaffa, Mas Rina

    2017-02-01

    Pyramid Histogram of Words (PHOW) combines Bag of Visual Words (BoVW) with spatial pyramid matching (SPM) in order to add location information to the extracted features. However, the PHOW variants extracted from various color spaces do not capture color information individually; that is, they discard color information, an important characteristic of any image that is motivated by human vision. This article concatenates the PHOW Multi-Scale Dense Scale Invariant Feature Transform (MSDSIFT) histogram with a proposed color histogram to improve the performance of existing image classification algorithms. Performance evaluation on several datasets shows that the new approach outperforms other existing, state-of-the-art methods.

  10. Sentiment Analysis Using Common-Sense and Context Information

    PubMed Central

    Mittal, Namita; Bansal, Pooja; Garg, Sonal

    2015-01-01

    Sentiment analysis research has been increasing tremendously in recent times due to the wide range of business and social applications. Sentiment analysis from unstructured natural language text has recently received considerable attention from the research community. In this paper, we propose a novel sentiment analysis model based on common-sense knowledge extracted from ConceptNet based ontology and context information. ConceptNet based ontology is used to determine the domain specific concepts which in turn produced the domain specific important features. Further, the polarities of the extracted concepts are determined using the contextual polarity lexicon which we developed by considering the context information of a word. Finally, semantic orientations of domain specific features of the review document are aggregated based on the importance of a feature with respect to the domain. The importance of the feature is determined by the depth of the feature in the ontology. Experimental results show the effectiveness of the proposed methods. PMID:25866505

  11. Sentiment analysis using common-sense and context information.

    PubMed

    Agarwal, Basant; Mittal, Namita; Bansal, Pooja; Garg, Sonal

    2015-01-01

    Sentiment analysis research has been increasing tremendously in recent times due to the wide range of business and social applications. Sentiment analysis from unstructured natural language text has recently received considerable attention from the research community. In this paper, we propose a novel sentiment analysis model based on common-sense knowledge extracted from ConceptNet based ontology and context information. ConceptNet based ontology is used to determine the domain specific concepts which in turn produced the domain specific important features. Further, the polarities of the extracted concepts are determined using the contextual polarity lexicon which we developed by considering the context information of a word. Finally, semantic orientations of domain specific features of the review document are aggregated based on the importance of a feature with respect to the domain. The importance of the feature is determined by the depth of the feature in the ontology. Experimental results show the effectiveness of the proposed methods.

  12. Combined rule extraction and feature elimination in supervised classification.

    PubMed

    Liu, Sheng; Patel, Ronak Y; Daga, Pankaj R; Liu, Haining; Fu, Gang; Doerksen, Robert J; Chen, Yixin; Wilkins, Dawn E

    2012-09-01

    There are a vast number of biology related research problems involving a combination of multiple sources of data to achieve a better understanding of the underlying problems. It is important to select and interpret the most important information from these sources. Thus it will be beneficial to have a good algorithm to simultaneously extract rules and select features for better interpretation of the predictive model. We propose an efficient algorithm, Combined Rule Extraction and Feature Elimination (CRF), based on 1-norm regularized random forests. CRF simultaneously extracts a small number of rules generated by random forests and selects important features. We applied CRF to several drug activity prediction and microarray data sets. CRF is capable of producing performance comparable with state-of-the-art prediction algorithms using a small number of decision rules. Some of the decision rules are biologically significant.
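
    A sketch of the idea under stated assumptions: leaf-membership indicators from a small random forest stand in for rule activations, and a 1-norm (L1) penalized linear model keeps only a few of them. This mirrors the regularized selection step, not the paper's exact algorithm:

      import numpy as np
      from sklearn.datasets import make_classification
      from sklearn.ensemble import RandomForestClassifier
      from sklearn.linear_model import LogisticRegression

      X, y = make_classification(n_samples=200, n_features=10, random_state=0)
      forest = RandomForestClassifier(n_estimators=5, max_depth=3,
                                      random_state=0).fit(X, y)

      # Leaf-membership indicators act as binary rule activations.
      leaves = forest.apply(X)                 # shape: (n_samples, n_trees)
      rules = np.hstack([(leaves[:, t:t + 1] ==
                          np.unique(leaves[:, t])).astype(int)
                         for t in range(leaves.shape[1])])

      # L1 penalty drives most rule weights to exactly zero.
      selector = LogisticRegression(penalty="l1", solver="liblinear",
                                    C=0.1).fit(rules, y)
      print("rules kept:", int(np.sum(selector.coef_ != 0)))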

  13. Automated Extraction and Classification of Cancer Stage Mentions from Unstructured Text Fields in a Central Cancer Registry

    PubMed Central

    AAlAbdulsalam, Abdulrahman K.; Garvin, Jennifer H.; Redd, Andrew; Carter, Marjorie E.; Sweeny, Carol; Meystre, Stephane M.

    2018-01-01

    Cancer stage is one of the most important prognostic parameters in most cancer subtypes. The American Joint Committee on Cancer (AJCC) specifies criteria for staging each cancer type based on tumor characteristics (T), lymph node involvement (N), and tumor metastasis (M), known as the TNM staging system. Information related to cancer stage is typically recorded in clinical narrative text notes and other informal means of communication in the Electronic Health Record (EHR). As a result, human chart-abstractors (known as certified tumor registrars) have to search through voluminous amounts of text to extract accurate stage information and resolve discordance between different data sources. This study proposes novel applications of natural language processing and machine learning to automatically extract and classify TNM stage mentions from records at the Utah Cancer Registry. Our results indicate that TNM stages can be extracted and classified automatically with high accuracy (extraction sensitivity: 95.5%–98.4% and classification sensitivity: 83.5%–87%). PMID:29888032

  14. Automated Extraction and Classification of Cancer Stage Mentions from Unstructured Text Fields in a Central Cancer Registry.

    PubMed

    AAlAbdulsalam, Abdulrahman K; Garvin, Jennifer H; Redd, Andrew; Carter, Marjorie E; Sweeny, Carol; Meystre, Stephane M

    2018-01-01

    Cancer stage is one of the most important prognostic parameters in most cancer subtypes. The American Joint Committee on Cancer (AJCC) specifies criteria for staging each cancer type based on tumor characteristics (T), lymph node involvement (N), and tumor metastasis (M), known as the TNM staging system. Information related to cancer stage is typically recorded in clinical narrative text notes and other informal means of communication in the Electronic Health Record (EHR). As a result, human chart-abstractors (known as certified tumor registrars) have to search through voluminous amounts of text to extract accurate stage information and resolve discordance between different data sources. This study proposes novel applications of natural language processing and machine learning to automatically extract and classify TNM stage mentions from records at the Utah Cancer Registry. Our results indicate that TNM stages can be extracted and classified automatically with high accuracy (extraction sensitivity: 95.5%-98.4% and classification sensitivity: 83.5%-87%).
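
    An illustrative sketch of spotting TNM mentions with a regular expression; the pattern is an assumption covering common surface forms such as "pT2N0M0" or "T3 N1 M0", whereas the system described above combines NLP with machine-learned classification:

      import re

      # Optional prefixes (p, c, y, r), then T, N and M components.
      TNM_RE = re.compile(
          r"\b[pcyr]{0,2}T(?:[0-4][a-d]?|is|x)\s*"
          r"N(?:[0-3][a-c]?|x)\s*"
          r"M(?:[01][a-c]?|x)\b", re.IGNORECASE)

      note = "Pathology consistent with pT2N0M0 adenocarcinoma of the colon."
      print(TNM_RE.findall(note))   # ['pT2N0M0']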

  15. A novel murmur-based heart sound feature extraction technique using envelope-morphological analysis

    NASA Astrophysics Data System (ADS)

    Yao, Hao-Dong; Ma, Jia-Li; Fu, Bin-Bin; Wang, Hai-Yang; Dong, Ming-Chui

    2015-07-01

    Auscultation of heart sound (HS) signals has served as an important primary approach to diagnosing cardiovascular diseases (CVDs) for centuries. Confronting the intrinsic drawbacks of traditional HS auscultation, computer-aided automatic HS auscultation based on feature extraction techniques has witnessed explosive development. Yet most existing HS feature extraction methods adopt acoustic or time-frequency features that exhibit a poor relationship with diagnostic information, restricting the performance of further interpretation and analysis. Tackling this bottleneck, this paper proposes a novel murmur-based HS feature extraction method, since murmurs contain massive pathological information and are regarded as the first indications of pathological occurrences at the heart valves. Adopting the discrete wavelet transform (DWT) and the Shannon envelope, the envelope-morphological characteristics of murmurs are obtained and three features are extracted accordingly. Validated by discriminating normal HS from five kinds of abnormal HS signals with the extracted features, the proposed method provides an attractive candidate for automatic HS auscultation.
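
    A minimal sketch of the Shannon envelope step mentioned above: for a normalized signal x, the Shannon energy -x^2*log(x^2) emphasizes medium-intensity components such as murmurs. The DWT preprocessing and the three morphological features are not shown, and the test signal is synthetic:

      import numpy as np

      def shannon_envelope(x, win=32):
          x = x / (np.max(np.abs(x)) + 1e-12)    # normalize to [-1, 1]
          energy = -x**2 * np.log(x**2 + 1e-12)  # per-sample Shannon energy
          kernel = np.ones(win) / win            # moving-average smoothing
          return np.convolve(energy, kernel, mode="same")

      t = np.linspace(0.0, 1.0, 2000)
      hs = np.sin(2 * np.pi * 30 * t) * np.exp(-5 * t)  # toy heart-sound burst
      print(shannon_envelope(hs)[:5])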

  16. Using UMLS to construct a generalized hierarchical concept-based dictionary of brain functions for information extraction from the fMRI literature.

    PubMed

    Hsiao, Mei-Yu; Chen, Chien-Chung; Chen, Jyh-Horng

    2009-10-01

    With a rapid progress in the field, a great many fMRI studies are published every year, to the extent that it is now becoming difficult for researchers to keep up with the literature, since reading papers is extremely time-consuming and labor-intensive. Thus, automatic information extraction has become an important issue. In this study, we used the Unified Medical Language System (UMLS) to construct a hierarchical concept-based dictionary of brain functions. To the best of our knowledge, this is the first generalized dictionary of this kind. We also developed an information extraction system for recognizing, mapping and classifying terms relevant to human brain study. The precision and recall of our system was on a par with that of human experts in term recognition, term mapping and term classification. Our approach presented in this paper presents an alternative to the more laborious, manual entry approach to information extraction.

  17. A Low-Storage-Consumption XML Labeling Method for Efficient Structural Information Extraction

    NASA Astrophysics Data System (ADS)

    Liang, Wenxin; Takahashi, Akihiro; Yokota, Haruo

    Recently, labeling methods that extract and reconstruct the structural information of XML data, which is important for many applications such as XPath query and keyword search, have become more attractive. To achieve efficient structural information extraction, in this paper we propose the C-DO-VLEI code, a novel update-friendly bit-vector encoding scheme based on register-length bit operations combined with the properties of Dewey Order numbers, which cannot be implemented in other relevant existing schemes such as ORDPATH. The proposed method also achieves lower storage consumption because it requires neither a prefix schema nor any reserved codes for node insertion. We performed experiments to evaluate and compare the performance and storage consumption of the proposed method with those of the ORDPATH method. Experimental results show that the execution times for extracting depth information and parent node labels using the C-DO-VLEI code are about 25% and 15% less, respectively, and the average label size using the C-DO-VLEI code is about 24% smaller, compared with ORDPATH.
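
    A sketch of the structural information a Dewey-style label must preserve: node depth and the parent label are simple prefix operations, which is what makes such labels attractive for extraction. The C-DO-VLEI bit-vector encoding itself is not reproduced here:

      def depth(label: str) -> int:
          return label.count(".") + 1

      def parent(label: str) -> str:
          return label.rsplit(".", 1)[0] if "." in label else ""

      label = "1.3.2.7"        # a fourth-level node
      print(depth(label))      # 4
      print(parent(label))     # 1.3.2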

  18. Thinking graphically: Connecting vision and cognition during graph comprehension.

    PubMed

    Ratwani, Raj M; Trafton, J Gregory; Boehm-Davis, Deborah A

    2008-03-01

    Task analytic theories of graph comprehension account for the perceptual and conceptual processes required to extract specific information from graphs. Comparatively, the processes underlying information integration have received less attention. We propose a new framework for information integration that highlights visual integration and cognitive integration. During visual integration, pattern recognition processes are used to form visual clusters of information; these visual clusters are then used to reason about the graph during cognitive integration. In 3 experiments, the processes required to extract specific information and to integrate information were examined by collecting verbal protocol and eye movement data. Results supported the task analytic theories for specific information extraction and the processes of visual and cognitive integration for integrative questions. Further, the integrative processes scaled up as graph complexity increased, highlighting the importance of these processes for integration in more complex graphs. Finally, based on this framework, design principles to improve both visual and cognitive integration are described. PsycINFO Database Record (c) 2008 APA, all rights reserved

  19. Scorebox extraction from mobile sports videos using Support Vector Machines

    NASA Astrophysics Data System (ADS)

    Kim, Wonjun; Park, Jimin; Kim, Changick

    2008-08-01

    The scorebox plays an important role in understanding the contents of sports videos. However, the tiny scorebox may make it uncomfortable for small-display viewers to grasp the game situation. In this paper, we propose a novel framework to extract the scorebox from sports video frames. We first extract candidates by using accumulated intensity and edge information after a short learning period. Since various types of scoreboxes are inserted in sports videos, multiple attributes need to be used for efficient extraction. Based on those attributes, the optimal information gain is computed, and the top three attributes ranked by information gain are selected as a three-dimensional feature vector for Support Vector Machines (SVM) to distinguish the scorebox from other candidates, such as logos and advertisement boards. The proposed method is tested on various videos of sports games, and the experimental results show the efficiency and robustness of our proposed method.

  20. A Novel Deep Convolutional Neural Network for Spectral-Spatial Classification of Hyperspectral Data

    NASA Astrophysics Data System (ADS)

    Li, N.; Wang, C.; Zhao, H.; Gong, X.; Wang, D.

    2018-04-01

    Spatial and spectral information are obtained simultaneously by hyperspectral remote sensing, and joint extraction of this information is one of the most important approaches to hyperspectral image classification. In this paper, a novel deep convolutional neural network (CNN) is proposed that effectively extracts the spectral-spatial information of hyperspectral images. The proposed model not only learns sufficient knowledge from a limited number of samples, but also has powerful generalization ability. The proposed framework, based on three-dimensional convolution, can extract the spectral-spatial features of labeled samples effectively. Although CNNs have shown robustness to distortion, they cannot extract features of different scales through the traditional pooling layer, which has only one size of pooling window. Hence, spatial pyramid pooling (SPP) is introduced into the three-dimensional local convolutional filters for hyperspectral classification. Experimental results with a widely used hyperspectral remote sensing dataset show that the proposed model provides competitive performance.
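
    A sketch of a three-dimensional convolutional extractor with spatial pyramid pooling, assuming PyTorch; the layer sizes and the 20-band 7x7 patch shape are invented. Adaptive max pooling at several output sizes plays the role of SPP, yielding a fixed-length vector for any patch size:

      import torch
      import torch.nn as nn
      import torch.nn.functional as F

      class SPP3DNet(nn.Module):
          def __init__(self, n_classes=9):
              super().__init__()
              self.features = nn.Sequential(
                  nn.Conv3d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
                  nn.Conv3d(8, 16, kernel_size=3, padding=1), nn.ReLU())
              self.levels = [1, 2]                    # pyramid pooling levels
              pooled = sum(l ** 3 for l in self.levels) * 16
              self.classifier = nn.Linear(pooled, n_classes)

          def forward(self, x):
              x = self.features(x)
              parts = [F.adaptive_max_pool3d(x, l).flatten(1)
                       for l in self.levels]          # multi-scale pooling
              return self.classifier(torch.cat(parts, dim=1))

      # A batch of four 7x7 spatial patches with 20 spectral bands.
      patch = torch.randn(4, 1, 20, 7, 7)
      print(SPP3DNet()(patch).shape)                  # torch.Size([4, 9])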

  1. A research of road centerline extraction algorithm from high resolution remote sensing images

    NASA Astrophysics Data System (ADS)

    Zhang, Yushan; Xu, Tingfa

    2017-09-01

    Satellite remote sensing technology has become one of the most effective methods for land surface monitoring in recent years, due to advantages such as a short revisit period, large scale and rich information. Meanwhile, road extraction is an important field in the applications of high resolution remote sensing images. An intelligent, automatic road extraction algorithm with high precision has great significance for transportation, road network updating and urban planning. Fuzzy c-means (FCM) clustering segmentation algorithms have been used in road extraction, but the traditional algorithms do not consider spatial information. An improved fuzzy c-means clustering algorithm combined with spatial information (SFCM) is proposed in this paper, which is shown to be effective for noisy image segmentation. First, the image is segmented using the SFCM. Second, the segmentation result is processed by mathematical morphology to remove the joint regions. Third, the road centerlines are extracted by morphological thinning and burr trimming. The average integrity of the centerline extraction algorithm is 97.98%, the average accuracy is 95.36% and the average quality is 93.59%. Experimental results show that the proposed method is effective for road centerline extraction.
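
    A minimal numpy sketch of the fuzzy c-means updates underlying SFCM: memberships and centers are alternated until convergence. The paper's spatial term, which additionally smooths memberships over pixel neighborhoods, is omitted here:

      import numpy as np

      def fcm(x, c=2, m=2.0, iters=50, seed=0):
          rng = np.random.default_rng(seed)
          u = rng.random((len(x), c))
          u /= u.sum(axis=1, keepdims=True)          # fuzzy memberships
          for _ in range(iters):
              um = u ** m
              centers = (um.T @ x) / um.sum(axis=0)  # weighted centroids
              d = np.abs(x[:, None] - centers[None]) + 1e-12
              u = 1.0 / d ** (2.0 / (m - 1.0))
              u /= u.sum(axis=1, keepdims=True)      # membership update
          return centers, u

      gray = np.array([10.0, 12.0, 11.0, 200.0, 198.0, 205.0])  # toy pixels
      centers, u = fcm(gray)
      print(np.sort(np.round(centers, 1)))           # approx [ 11. 201.]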

  2. Joint Extraction of Entities and Relations Using Reinforcement Learning and Deep Learning.

    PubMed

    Feng, Yuntian; Zhang, Hongjun; Hao, Wenning; Chen, Gang

    2017-01-01

    We use both reinforcement learning and deep learning to simultaneously extract entities and relations from unstructured texts. For reinforcement learning, we model the task as a two-step decision process. Deep learning is used to automatically capture the most important information from unstructured texts, which represents the state in the decision process. By designing the reward function per step, our proposed method can pass the information of entity extraction to relation extraction and obtain feedback so as to extract entities and relations simultaneously. Firstly, we use a bidirectional LSTM to model the context information, which realizes preliminary entity extraction. On the basis of the extraction results, an attention-based method represents the sentences that include the target entity pair to generate the initial state in the decision process. Then we use a Tree-LSTM to represent relation mentions to generate the transition state in the decision process. Finally, we employ the Q-Learning algorithm to obtain the control policy π in the two-step decision process. Experiments on ACE2005 demonstrate that our method attains better performance than the state-of-the-art method and gets a 2.4% increase in recall score.
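
    A generic tabular Q-learning sketch illustrating the control-policy step of the two-step decision process; the states, actions, and reward function below are invented placeholders, whereas the paper derives states from LSTM/Tree-LSTM representations:

      import random
      from collections import defaultdict

      Q = defaultdict(float)
      alpha, gamma, epsilon = 0.1, 0.9, 0.2
      actions = ["extract_entity", "extract_relation"]

      def step(state, action):
          # Hypothetical environment: reward 1 when the action matches the state.
          reward = 1.0 if action.endswith(state) else 0.0
          return reward, random.choice(["entity", "relation"])

      state = "entity"
      for _ in range(1000):
          a = (random.choice(actions) if random.random() < epsilon
               else max(actions, key=lambda b: Q[(state, b)]))
          r, s2 = step(state, a)
          best_next = max(Q[(s2, b)] for b in actions)
          Q[(state, a)] += alpha * (r + gamma * best_next - Q[(state, a)])
          state = s2

      print(max(actions, key=lambda b: Q[("entity", b)]))  # extract_entity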

  3. Joint Extraction of Entities and Relations Using Reinforcement Learning and Deep Learning

    PubMed Central

    Zhang, Hongjun; Chen, Gang

    2017-01-01

    We use both reinforcement learning and deep learning to simultaneously extract entities and relations from unstructured texts. For reinforcement learning, we model the task as a two-step decision process. Deep learning is used to automatically capture the most important information from unstructured texts, which represents the state in the decision process. By designing the reward function per step, our proposed method can pass the information of entity extraction to relation extraction and obtain feedback so as to extract entities and relations simultaneously. Firstly, we use a bidirectional LSTM to model the context information, which realizes preliminary entity extraction. On the basis of the extraction results, an attention-based method represents the sentences that include the target entity pair to generate the initial state in the decision process. Then we use a Tree-LSTM to represent relation mentions to generate the transition state in the decision process. Finally, we employ the Q-Learning algorithm to obtain the control policy π in the two-step decision process. Experiments on ACE2005 demonstrate that our method attains better performance than the state-of-the-art method and gets a 2.4% increase in recall score. PMID:28894463

  4. Use of Information--LMC Connection

    ERIC Educational Resources Information Center

    Darrow, Rob

    2005-01-01

    Note taking plays an important part in correctly extracting information from reference sources. The "Cornell Note Taking Method," initially developed as a method of taking notes during a lecture, is well suited for taking notes from print sources and is one of the best "Use of Information" methods.

  5. Extracting and standardizing medication information in clinical text - the MedEx-UIMA system.

    PubMed

    Jiang, Min; Wu, Yonghui; Shah, Anushi; Priyanka, Priyanka; Denny, Joshua C; Xu, Hua

    2014-01-01

    Extraction of medication information embedded in clinical text is important for research using electronic health records (EHRs). However, most current medication information extraction systems identify drug and signature entities without mapping them to a standard representation. In this study, we introduced the open source Java implementation of MedEx, an existing high-performance medication information extraction system, based on the Unstructured Information Management Architecture (UIMA) framework. In addition, we developed new encoding modules in the MedEx-UIMA system, which mapped an extracted drug name/dose/form to both generalized and specific RxNorm concepts and translated drug frequency information to the ISO standard. We processed 826 documents with both systems and verified that MedEx-UIMA and MedEx (the Python version) performed similarly by comparing the results. Using two manually annotated test sets that contained 300 drug entries from medication lists and 300 drug entries from narrative reports, the MedEx-UIMA system achieved F-measures of 98.5% and 97.5% respectively for encoding drug names to corresponding RxNorm generic drug ingredients, and F-measures of 85.4% and 88.1% respectively for mapping drug names/dose/form to the most specific RxNorm concepts. It also achieved an F-measure of 90.4% for normalizing frequency information to the ISO standard. The open source MedEx-UIMA system is freely available online at http://code.google.com/p/medex-uima/.
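
    As a flavor of the frequency-normalization step, the sketch below maps free-text frequency expressions onto ISO 8601-style repeating intervals. The lookup table is a hypothetical fragment for illustration, not MedEx-UIMA's actual rule set.

    ```python
    import re

    FREQ_TO_ISO = {
        "qd":  "R/P1D",     # once daily
        "bid": "R/PT12H",   # twice daily
        "tid": "R/PT8H",    # three times daily
        "q8h": "R/PT8H",    # every 8 hours
    }

    def normalize_frequency(text):
        token = re.sub(r"[.\s]", "", text.lower())   # "B.I.D." -> "bid"
        return FREQ_TO_ISO.get(token)                # None if unmapped

    print(normalize_frequency("B.I.D."))             # R/PT12H
    ```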

  6. An image-processing strategy to extract important information suitable for a low-size stimulus pattern in a retinal prosthesis.

    PubMed

    Chen, Yili; Fu, Jixiang; Chu, Dawei; Li, Rongmao; Xie, Yaoqin

    2017-11-27

    A retinal prosthesis is designed to help the blind obtain some sight. It consists of an external part and an internal part. The external part is made up of a camera, an image processor and an RF transmitter. The internal part is made up of an RF receiver, an implant chip and microelectrodes. Currently, the number of microelectrodes is in the hundreds, and the mechanism by which an electrode stimulates the optic nerve is not fully understood. A simple hypothesis is that the pixels in an image correspond to the electrodes. The images captured by the camera should therefore be processed by suitable strategies to correspond to the electrode stimulation pattern. Thus, the question is how to obtain the important information from the captured image. Here, we use a region-of-interest (ROI) extraction algorithm to retain the important information and to remove the redundant information. This paper explains the details of the principles and functions of the ROI. Because we are investigating a real-time system, we need a fast ROI extraction algorithm. Thus, we simplified the ROI algorithm and used it in the external image-processing digital signal processing (DSP) system of the retinal prosthesis. The results show that our image-processing strategies are suitable for a real-time retinal prosthesis and can eliminate redundant information while providing useful information for expression in a low-size image.
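
    A very rough sketch of ROI extraction by intensity thresholding and bounding-box cropping is shown below; the real pipeline runs on a DSP, and the salience threshold here is purely illustrative.

    ```python
    import numpy as np

    def extract_roi(gray):
        mask = gray > gray.mean() + gray.std()   # keep the salient (bright) pixels
        ys, xs = np.nonzero(mask)
        if len(xs) == 0:
            return gray                          # nothing salient: keep the full frame
        return gray[ys.min():ys.max() + 1, xs.min():xs.max() + 1]

    frame = np.random.default_rng(0).random((120, 160))
    print(extract_roi(frame).shape)              # cropped, low-size stimulus image
    ```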

  7. Tags Extraction from Spatial Documents in Search Engines

    NASA Astrophysics Data System (ADS)

    Borhaninejad, S.; Hakimpour, F.; Hamzei, E.

    2015-12-01

    Nowadays, selective access to information on the Web is provided by search engines, but when the data includes spatial information the search task becomes more complex and search engines require special capabilities. The purpose of this study is to extract the information which lies in spatial documents. To that end, we implement and evaluate information extraction from GML documents and a retrieval method in an integrated approach. Our proposed system consists of three components: a crawler, a database and a user interface. In the crawler component, GML documents are discovered and their text is parsed for information extraction and storage. The database component is responsible for indexing the information collected by the crawlers. Finally, the user interface component provides the interaction between the system and the user. We have implemented this system as a pilot on an application server as a simulation of the Web. As a spatial search engine, our system provides search capability throughout GML documents, an important step toward improving the efficiency of search engines.
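
    A minimal sketch of the crawler's parsing step might look like the following, pulling names and coordinates out of a GML document with the standard library; the tag names and namespace are illustrative, since real GML varies by application schema.

    ```python
    import xml.etree.ElementTree as ET

    GML_NS = "{http://www.opengis.net/gml}"

    def extract_tags(gml_file):
        records = []
        for _, elem in ET.iterparse(gml_file):       # stream through closing tags
            if elem.tag == GML_NS + "name" and elem.text:
                records.append(("name", elem.text.strip()))
            elif elem.tag == GML_NS + "coordinates" and elem.text:
                records.append(("coordinates", elem.text.strip()))
        return records                               # (tag, value) pairs for the index
    ```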

  8. Automated Information Extraction on Treatment and Prognosis for Non-Small Cell Lung Cancer Radiotherapy Patients: Clinical Study.

    PubMed

    Zheng, Shuai; Jabbour, Salma K; O'Reilly, Shannon E; Lu, James J; Dong, Lihua; Ding, Lijuan; Xiao, Ying; Yue, Ning; Wang, Fusheng; Zou, Wei

    2018-02-01

    In outcome studies of oncology patients undergoing radiation, researchers extract valuable information from medical records generated before, during, and after radiotherapy visits, such as survival data, toxicities, and complications. Clinical studies rely heavily on these data to correlate the treatment regimen with the prognosis to develop evidence-based radiation therapy paradigms. These data are available mainly in the form of narrative texts or tables with heterogeneous vocabularies. Manual extraction of the related information from these data can be time consuming and labor intensive, which is not ideal for large studies. The objective of this study was to adapt the interactive information extraction platform Information and Data Extraction using Adaptive Learning (IDEAL-X) to extract treatment and prognosis data for patients with locally advanced or inoperable non-small cell lung cancer (NSCLC). We transformed patient treatment and prognosis documents into normalized structured forms using the IDEAL-X system for easy data navigation. Adaptive learning and user-customized controlled toxicity vocabularies were applied to extract categorized treatment and prognosis data and generate structured output. In total, we extracted data from 261 treatment and prognosis documents relating to 50 patients, with overall precision and recall of more than 93% and 83%, respectively. For toxicity information extraction, which is important for studying patients' posttreatment side effects and quality of life, the precision and recall reached 95.7% and 94.5%, respectively. The IDEAL-X system is capable of extracting study data regarding NSCLC chemoradiation patients with significant accuracy and effectiveness, and therefore can be used in large-scale radiotherapy clinical data studies. ©Shuai Zheng, Salma K Jabbour, Shannon E O'Reilly, James J Lu, Lihua Dong, Lijuan Ding, Ying Xiao, Ning Yue, Fusheng Wang, Wei Zou. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 01.02.2018.

  9. Fine-grained information extraction from German transthoracic echocardiography reports.

    PubMed

    Toepfer, Martin; Corovic, Hamo; Fette, Georg; Klügl, Peter; Störk, Stefan; Puppe, Frank

    2015-11-12

    Information extraction techniques that get structured representations out of unstructured data make a large amount of clinically relevant information about patients accessible for semantic applications. These methods typically rely on standardized terminologies that guide this process. Many languages and clinical domains, however, lack appropriate resources and tools, as well as evaluations of their applications, especially if detailed conceptualizations of the domain are required. For instance, German transthoracic echocardiography reports have not been targeted sufficiently before, despite their importance for clinical trials. This work therefore aimed at the development and evaluation of an information extraction component with a fine-grained terminology that enables recognition of almost all relevant information stated in German transthoracic echocardiography reports at the University Hospital of Würzburg. A domain expert validated and iteratively refined an automatically inferred base terminology. The terminology was used by an ontology-driven information extraction system that outputs attribute-value pairs. The final component has been mapped to the central elements of a standardized terminology, and it has been evaluated on documents with different layouts. The final system achieved state-of-the-art precision (micro average .996) and recall (micro average .961) on 100 test documents that represent more than 90% of all reports. In particular, principal aspects as defined in a standardized external terminology were recognized with F1 = .989 (micro average) and F1 = .963 (macro average). As a result of keyword matching and restrained concept extraction, the system also obtained high precision on unstructured or exceptionally short documents, and on documents with uncommon layouts. The developed terminology and the proposed information extraction system allow the extraction of fine-grained information from German semi-structured transthoracic echocardiography reports with very high precision and high recall on the majority of documents at the University Hospital of Würzburg. Extracted results populate a clinical data warehouse which supports clinical research.

  10. An Ontology-Enabled Natural Language Processing Pipeline for Provenance Metadata Extraction from Biomedical Text (Short Paper).

    PubMed

    Valdez, Joshua; Rueschman, Michael; Kim, Matthew; Redline, Susan; Sahoo, Satya S

    2016-10-01

    Extraction of structured information from biomedical literature is a complex and challenging problem due to the complexity of the biomedical domain and the lack of appropriate natural language processing (NLP) techniques. High quality domain ontologies model both data and metadata information at a fine level of granularity, which can be effectively used to accurately extract structured information from biomedical text. Extraction of provenance metadata, which describes the history or source of information, from published articles is an important task to support scientific reproducibility. Reproducibility of results reported by previous research studies is a foundational component of scientific advancement. This is highlighted by the recent initiative by the US National Institutes of Health called "Principles of Rigor and Reproducibility". In this paper, we describe an effective approach to extract provenance metadata from published biomedical research literature using an ontology-enabled NLP platform as part of the Provenance for Clinical and Healthcare Research (ProvCaRe) project. The ProvCaRe-NLP tool extends the clinical Text Analysis and Knowledge Extraction System (cTAKES) platform using both provenance and biomedical domain ontologies. We demonstrate the effectiveness of the ProvCaRe-NLP tool using a corpus of 20 peer-reviewed publications. The results of our evaluation demonstrate that the ProvCaRe-NLP tool has significantly higher recall in extracting provenance metadata as compared to existing NLP pipelines such as MetaMap.

  11. FIR: An Effective Scheme for Extracting Useful Metadata from Social Media.

    PubMed

    Chen, Long-Sheng; Lin, Zue-Cheng; Chang, Jing-Rong

    2015-11-01

    Recently, the use of social media for health information exchange is expanding among patients, physicians, and other health care professionals. In medical areas, social media allows non-experts to access, interpret, and generate medical information for their own care and the care of others. Researchers have paid much attention to social media in medical education, patient-pharmacist communication, adverse drug reaction detection, the impact of social media on medicine and healthcare, and so on. However, relatively few papers discuss how to effectively extract useful knowledge from the huge amount of textual comments in social media. Therefore, this study proposes a Fuzzy adaptive resonance theory network based Information Retrieval (FIR) scheme that combines a Fuzzy adaptive resonance theory (ART) network, Latent Semantic Indexing (LSI), and association rule (AR) discovery to extract knowledge from social media. In our FIR scheme, a Fuzzy ART network is first employed to segment comments. Next, for each customer segment, we use the LSI technique to retrieve important keywords. Then, in order to make the extracted keywords understandable, association rule mining is applied to organize these extracted keywords into metadata. These extracted voices of customers are then transformed into design needs using Quality Function Deployment (QFD) for further decision making. Unlike conventional information retrieval techniques, which acquire too many keywords to get key points, our FIR scheme can extract understandable metadata from social media.
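
    The LSI step can be sketched with a TF-IDF matrix followed by truncated SVD, ranking terms by weight in each latent dimension. The toy comments and dimensionality below are assumptions for illustration only.

    ```python
    import numpy as np
    from sklearn.decomposition import TruncatedSVD
    from sklearn.feature_extraction.text import TfidfVectorizer

    comments = ["battery drains fast", "battery life is short",
                "great camera quality", "camera photos look great"]
    vec = TfidfVectorizer()
    X = vec.fit_transform(comments)                  # comment-term TF-IDF matrix
    svd = TruncatedSVD(n_components=2, random_state=0).fit(X)
    terms = np.array(vec.get_feature_names_out())
    for k, comp in enumerate(svd.components_):
        top = terms[np.argsort(comp)[::-1][:3]]      # top keywords per latent topic
        print(f"topic {k}: {', '.join(top)}")
    ```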

  12. [Road Extraction in Remote Sensing Images Based on Spectral and Edge Analysis].

    PubMed

    Zhao, Wen-zhi; Luo, Li-qun; Guo, Zhou; Yue, Jun; Yu, Xue-ying; Liu, Hui; Wei, Jing

    2015-10-01

    Roads are typically man-made objects in urban areas. Road extraction from high-resolution images has important applications for urban planning and transportation development. However, due to the confusion of spectral characteristics, it is difficult to distinguish roads from other objects merely by using traditional classification methods that depend mainly on spectral information. Edges are an important feature for the identification of linear objects (e.g., roads), and the distribution patterns of edges vary greatly among different objects. It is therefore crucial to combine edge statistics with spectral information. In this study, a new method that combines spectral information and edge statistical features is proposed. First, edge detection is conducted using a self-adaptive mean-shift algorithm on the panchromatic band, which greatly reduces pseudo-edges and noise effects. Then, edge statistical features are obtained from the edge statistical model, which measures the length and angle distributions of edges. Finally, by integrating the spectral and edge statistical features, an SVM algorithm is used to classify the image and roads are ultimately extracted. A series of experiments shows that the overall accuracy of the proposed method is 93%, compared with only 78% for the traditional method. The results demonstrate that the proposed method is efficient and valuable for road extraction, especially on high-resolution images.
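
    The final classification step might be sketched as follows: per-pixel spectral values are concatenated with edge-statistics features and fed to an SVM. The feature arrays are random stand-ins for the real spectral bands and edge statistics.

    ```python
    import numpy as np
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)
    n = 500
    spectral = rng.random((n, 4))            # e.g., four spectral bands per pixel
    edge_stats = rng.random((n, 2))          # e.g., edge length and angle statistics
    X = np.hstack([spectral, edge_stats])    # combined feature vector
    y = rng.integers(0, 2, n)                # toy labels: 1 = road, 0 = non-road

    clf = SVC(kernel="rbf").fit(X, y)
    print(clf.predict(X[:5]))
    ```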

  13. The Science and Art of Eyebrow Transplantation by Follicular Unit Extraction

    PubMed Central

    Gupta, Jyoti; Kumar, Amrendra; Chouhan, Kavish; Ariganesh, C; Nandal, Vinay

    2017-01-01

    Eyebrows constitute a very important and prominent feature of the face. With growing awareness, eyebrow transplantation has become a popular procedure. However, though the treated area is small, it requires a great deal of precision and knowledge regarding anatomy, brow design, and extraction and implantation techniques. This article gives a comprehensive view of eyebrow transplantation, with special emphasis on the follicular unit extraction technique, which has become the most popular technique. PMID:28852290

  14. Distant supervision for neural relation extraction integrated with word attention and property features.

    PubMed

    Qu, Jianfeng; Ouyang, Dantong; Hua, Wen; Ye, Yuxin; Li, Ximing

    2018-04-01

    Distant supervision for neural relation extraction is an efficient approach to extracting massive relations from plain texts. However, existing neural methods fail to capture the critical words during sentence encoding and lack useful sentence information for some positive training instances. To address these issues, we propose a novel neural relation extraction model. First, we develop a word-level attention mechanism to distinguish the importance of each individual word in a sentence, increasing the attention weights for critical words. Second, we investigate the semantic information in the word embeddings of target entities, which can serve as a supplementary feature for the extractor. Experimental results show that our model outperforms previous state-of-the-art baselines. Copyright © 2018 Elsevier Ltd. All rights reserved.
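
    A minimal numpy sketch of word-level attention follows: each word embedding is scored against a query vector and the sentence encoding is the attention-weighted sum, so critical words receive larger weights. Dimensions and the random inputs are illustrative.

    ```python
    import numpy as np

    def word_attention(word_embs, query):
        scores = word_embs @ query                 # one relevance score per word
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()                   # softmax attention weights
        return weights, weights @ word_embs        # weighted sentence vector

    rng = np.random.default_rng(0)
    E = rng.normal(size=(6, 50))                   # 6 words, 50-dim embeddings
    q = rng.normal(size=50)                        # trainable "importance" query
    w, sentence_vec = word_attention(E, q)
    print(np.round(w, 3))                          # critical words get higher weight
    ```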

  15. Strategies for the extraction and analysis of non-extractable polyphenols from plants.

    PubMed

    Domínguez-Rodríguez, Gloria; Marina, María Luisa; Plaza, Merichel

    2017-09-08

    Most studies of phenolic compounds from plants focus on the extractable fraction derived from an aqueous or aqueous-organic extraction. However, an important fraction of polyphenols is ignored because it remains retained in the extraction residue. These are the so-called non-extractable polyphenols (NEPs), which are high-molecular-weight polymeric polyphenols or individual low-molecular-weight phenolics associated with macromolecules. The scarce information available about NEPs shows that these compounds possess interesting biological activities, which is why interest in these compounds has increased in recent years. Furthermore, the extraction and characterization of NEPs are considered a challenge because the developed analytical methodologies present some limitations. Thus, the present literature review summarizes current knowledge of NEPs and the different methodologies for the extraction of these compounds, with a particular focus on hydrolysis treatments. Besides, this review provides information on the most recent developments in the purification, separation, identification and quantification of NEPs from plants. Copyright © 2017 Elsevier B.V. All rights reserved.

  16. Efficient feature extraction from wide-area motion imagery by MapReduce in Hadoop

    NASA Astrophysics Data System (ADS)

    Cheng, Erkang; Ma, Liya; Blaisse, Adam; Blasch, Erik; Sheaff, Carolyn; Chen, Genshe; Wu, Jie; Ling, Haibin

    2014-06-01

    Wide-Area Motion Imagery (WAMI) feature extraction is important for applications such as target tracking, traffic management and accident discovery. With the increasing amount of WAMI collections and of features extracted from the data, a scalable framework is needed to handle the large amount of information. Cloud computing is one of the approaches recently applied to large-scale and big-data problems. In this paper, MapReduce in Hadoop is investigated for large-scale feature extraction tasks on WAMI. Specifically, a large dataset of WAMI images is divided into several splits, each containing a small subset of the images. The feature extraction for the WAMI images in each split is distributed to slave nodes in the Hadoop system, and feature extraction for each image is performed individually on its assigned slave node. Finally, the feature extraction results are sent to the Hadoop File System (HDFS) to aggregate the feature information over the collected imagery. Experiments with and without MapReduce are conducted to illustrate the effectiveness of the proposed Cloud-Enabled WAMI Exploitation (CAWE) approach.
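
    In a Hadoop Streaming formulation, a mapper like the sketch below would process one image path per input line and emit key-value feature pairs, which a reducer (or the HDFS sink) then aggregates across the collection. `extract_features` is a stand-in for the real WAMI feature extractor.

    ```python
    # mapper.py - one WAMI image path per stdin line
    import sys

    def extract_features(path):
        # Placeholder: a real implementation would load the image here
        # and compute its features.
        return [f"{path}:feat0", f"{path}:feat1"]

    for line in sys.stdin:
        path = line.strip()
        for feat in extract_features(path):
            print(f"{path}\t{feat}")       # key = image, value = one feature
    ```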

  17. Impact of JPEG2000 compression on spatial-spectral endmember extraction from hyperspectral data

    NASA Astrophysics Data System (ADS)

    Martín, Gabriel; Ruiz, V. G.; Plaza, Antonio; Ortiz, Juan P.; García, Inmaculada

    2009-08-01

    Hyperspectral image compression has received considerable interest in recent years. However, an important issue that has not been investigated in the past is the impact of lossy compression on spectral mixture analysis applications, which characterize mixed pixels in terms of a suitable combination of spectrally pure spectral substances (called endmembers) weighted by their estimated fractional abundances. In this paper, we specifically investigate the impact of JPEG2000 compression of hyperspectral images on the quality of the endmembers extracted by algorithms that incorporate both the spectral and the spatial information (useful for incorporating contextual information in the spectral endmember search). The two considered algorithms are the automatic morphological endmember extraction (AMEE) and the spatial spectral endmember extraction (SSEE) techniques. Experimental results are conducted using a well-known data set collected by AVIRIS over the Cuprite mining district in Nevada and with detailed ground-truth information available from the U.S. Geological Survey. Our experiments reveal some interesting findings that may be useful to specialists applying spatial-spectral endmember extraction algorithms to compressed hyperspectral imagery.

  18. Sugarcane Crop Extraction Using Object-Oriented Method from ZY-3 High Resolution Satellite Tlc Image

    NASA Astrophysics Data System (ADS)

    Luo, H.; Ling, Z. Y.; Shao, G. Z.; Huang, Y.; He, Y. Q.; Ning, W. Y.; Zhong, Z.

    2018-04-01

    Sugarcane is one of the most important crops in Guangxi, China. With the development of satellite remote sensing technology, more remotely sensed images can be used for monitoring the sugarcane crop. With its Three Line Camera (TLC) images, wide coverage and stereoscopic mapping ability, the Chinese ZY-3 high resolution stereoscopic mapping satellite is useful for attaining more information for sugarcane crop monitoring, such as the spectral, shape and texture differences between the forward, nadir and backward images. The digital surface model (DSM) derived from ZY-3 TLC images is also able to provide height information for the sugarcane crop. In this study, we attempt to extract the sugarcane crop from ZY-3 images acquired in the harvest period. Ortho-rectified TLC images, the fused image and the DSM are processed for the extraction, and an object-oriented method is used for image segmentation, example collection and feature extraction. The results of our study show that with the help of ZY-3 TLC images, sugarcane crop information at harvest time can be automatically extracted, with an overall accuracy of about 85.3%.

  19. A combination of feature extraction methods with an ensemble of different classifiers for protein structural class prediction problem.

    PubMed

    Dehzangi, Abdollah; Paliwal, Kuldip; Sharma, Alok; Dehzangi, Omid; Sattar, Abdul

    2013-01-01

    A better understanding of the structural class of a given protein reveals important information about its overall folding type and its domain. It can also be directly used to provide critical information on the general tertiary structure of a protein, which has a profound impact on protein function determination and drug design. Despite tremendous enhancements made by pattern recognition-based approaches to solve this problem, it remains an unsolved issue in bioinformatics that demands more attention and exploration. In this study, we propose a novel feature extraction model that incorporates physicochemical and evolutionary-based information simultaneously. We also propose overlapped segmented distribution and autocorrelation-based feature extraction methods to provide more local and global discriminatory information. The proposed feature extraction methods are explored for the 15 most promising attributes selected from a wide range of physicochemical-based attributes. Finally, by applying an ensemble of different classifiers, namely Adaboost.M1, LogitBoost, naive Bayes, multilayer perceptron (MLP), and support vector machine (SVM), we show enhancement of the protein structural class prediction accuracy for four popular benchmarks.
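
    The ensemble step can be sketched with scikit-learn's soft-voting combination of heterogeneous classifiers; LogitBoost has no direct scikit-learn analogue, so the sketch substitutes available classifiers, and the feature matrix is a random stand-in for the extracted attributes.

    ```python
    import numpy as np
    from sklearn.ensemble import AdaBoostClassifier, VotingClassifier
    from sklearn.naive_bayes import GaussianNB
    from sklearn.neural_network import MLPClassifier
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)
    X, y = rng.random((200, 40)), rng.integers(0, 4, 200)  # 4 structural classes

    ensemble = VotingClassifier([
        ("ada", AdaBoostClassifier()),
        ("nb",  GaussianNB()),
        ("mlp", MLPClassifier(max_iter=500)),
        ("svm", SVC(probability=True)),      # probabilities needed for soft voting
    ], voting="soft")
    ensemble.fit(X, y)
    print(ensemble.predict(X[:5]))
    ```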

  20. Extraction of urban vegetation with Pleiades multiangular images

    NASA Astrophysics Data System (ADS)

    Lefebvre, Antoine; Nabucet, Jean; Corpetti, Thomas; Courty, Nicolas; Hubert-Moy, Laurence

    2016-10-01

    Vegetation is essential in urban environments since it provides significant services in terms of health, heat mitigation, property value and ecology. As part of the European Union Biodiversity Strategy Plan for 2020, the protection and development of green infrastructures is being strengthened in urban areas. In order to evaluate and monitor the quality of green infrastructures, this article investigates the contribution of Pléiades multi-angular images to extracting and characterizing low and high urban vegetation. From such images one can extract both spectral and elevation information. Our method is composed of three main steps: (1) computation of a normalized Digital Surface Model from the multi-angular images; (2) extraction of spectral and contextual features; (3) classification of vegetation classes (tree and grass) with a random forest classifier. Results for the city of Rennes, France, show the ability of multi-angular images to yield an elevation model in urban areas despite building height. They also highlight the importance of elevation information and its complementarity with contextual information for extracting urban vegetation.

  1. Research of building information extraction and evaluation based on high-resolution remote-sensing imagery

    NASA Astrophysics Data System (ADS)

    Cao, Qiong; Gu, Lingjia; Ren, Ruizhi; Wang, Lang

    2016-09-01

    Building extraction is currently important in applications of high-resolution remote sensing imagery. At present, quite a few algorithms are available for detecting building information; however, most of them still have obvious disadvantages, such as ignoring spectral information or trading extraction rate against extraction accuracy. The purpose of this research is to develop an effective method to detect building information in Chinese GF-1 data. Firstly, image preprocessing is used to normalize the image and image enhancement is used to highlight the useful information. Secondly, multi-spectral information is analyzed. Subsequently, an improved morphological building index (IMBI) based on remote sensing imagery is proposed to obtain the candidate building objects. Furthermore, in order to refine the building objects and further remove false objects, post-processing (e.g., shape features, the vegetation index and the water index) is employed. To validate the effectiveness of the proposed algorithm, the omission error (OE), commission error (CE), overall accuracy (OA) and Kappa are used in the final evaluation. The proposed method can not only effectively use spectral information and other basic features, but also avoid extracting excessive interference details from high-resolution remote sensing images. Compared to the original MBI algorithm, the proposed method reduces the OE by 33.14%; at the same time, the Kappa increases by 16.09%. In the experiments, IMBI achieved satisfactory results and outperformed other algorithms in terms of both accuracy and visual inspection.
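
    As a generic illustration of the morphological-building-index idea (not the paper's improved IMBI), the sketch below builds a differential profile of white top-hat responses across scales; bright, compact structures such as buildings respond strongly.

    ```python
    import numpy as np
    from scipy import ndimage

    def building_index(band, scales=(3, 7, 11, 15)):
        # White top-hat highlights bright structures smaller than the window.
        profile = [ndimage.white_tophat(band, size=(s, s)) for s in scales]
        # The differential profile averaged over scales flags building-like objects.
        diffs = [np.abs(profile[i + 1] - profile[i]) for i in range(len(profile) - 1)]
        return np.mean(diffs, axis=0)

    img = np.random.default_rng(0).random((64, 64))   # stand-in for a GF-1 band
    print(building_index(img).shape)
    ```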

  2. Extracting and standardizing medication information in clinical text – the MedEx-UIMA system

    PubMed Central

    Jiang, Min; Wu, Yonghui; Shah, Anushi; Priyanka, Priyanka; Denny, Joshua C.; Xu, Hua

    2014-01-01

    Extraction of medication information embedded in clinical text is important for research using electronic health records (EHRs). However, most current medication information extraction systems identify drug and signature entities without mapping them to a standard representation. In this study, we introduced the open source Java implementation of MedEx, an existing high-performance medication information extraction system, based on the Unstructured Information Management Architecture (UIMA) framework. In addition, we developed new encoding modules in the MedEx-UIMA system, which mapped an extracted drug name/dose/form to both generalized and specific RxNorm concepts and translated drug frequency information to the ISO standard. We processed 826 documents with both systems and verified that MedEx-UIMA and MedEx (the Python version) performed similarly by comparing the results. Using two manually annotated test sets that contained 300 drug entries from medication lists and 300 drug entries from narrative reports, the MedEx-UIMA system achieved F-measures of 98.5% and 97.5% respectively for encoding drug names to corresponding RxNorm generic drug ingredients, and F-measures of 85.4% and 88.1% respectively for mapping drug names/dose/form to the most specific RxNorm concepts. It also achieved an F-measure of 90.4% for normalizing frequency information to the ISO standard. The open source MedEx-UIMA system is freely available online at http://code.google.com/p/medex-uima/. PMID:25954575

  3. System for definition of the central-chest vasculature

    NASA Astrophysics Data System (ADS)

    Taeprasartsit, Pinyo; Higgins, William E.

    2009-02-01

    Accurate definition of the central-chest vasculature from three-dimensional (3D) multi-detector CT (MDCT) images is important for pulmonary applications. For instance, the aorta and pulmonary artery help in automatic definition of the Mountain lymph-node stations for lung-cancer staging. This work presents a system for defining major vascular structures in the central chest. The system provides automatic methods for extracting the aorta and pulmonary artery and semi-automatic methods for extracting the other major central chest arteries/veins, such as the superior vena cava and azygos vein. Automatic aorta and pulmonary artery extraction are performed by model fitting and selection. The system also extracts certain vascular structure information to validate outputs. A semi-automatic method extracts vasculature by finding the medial axes between provided important sites. Results of the system are applied to lymph-node station definition and guidance of bronchoscopic biopsy.

  4. Quantitative evaluation of translational medicine based on scientometric analysis and information extraction.

    PubMed

    Zhang, Yin; Diao, Tianxi; Wang, Lei

    2014-12-01

    Designed to advance the two-way translational process between basic research and clinical practice, translational medicine has become one of the most important areas in biomedicine. The quantitative evaluation of translational medicine is valuable for decision making in global translational medical research and funding. Using scientometric analysis and information extraction techniques, this study quantitatively analyzed the scientific articles on translational medicine. The results showed that translational medicine had significant scientific output and impact, a specific core field and institutes, and outstanding academic status and benefit. While not considered in this study, patent data are another important indicator that should be integrated into the relevant research in the future. © 2014 Wiley Periodicals, Inc.

  5. Quantification of network structural dissimilarities.

    PubMed

    Schieber, Tiago A; Carpi, Laura; Díaz-Guilera, Albert; Pardalos, Panos M; Masoller, Cristina; Ravetti, Martín G

    2017-01-09

    Identifying and quantifying dissimilarities among graphs is a fundamental and challenging problem of practical importance in many fields of science. Current methods of network comparison are limited to extracting only partial information or are computationally very demanding. Here we propose an efficient and precise measure for network comparison, based on quantifying differences among the distance probability distributions extracted from the networks. Extensive experiments on synthetic and real-world networks show that this measure returns non-zero values only when the graphs are non-isomorphic. Most importantly, the proposed measure can identify and quantify structural topological differences that have a practical impact on the information flow through the network, such as the presence or absence of critical links that connect or disconnect connected components.
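
    The core ingredient, comparing distance (shortest-path) probability distributions, can be sketched as below; the Jensen-Shannon divergence is used here as one reasonable distribution distance, not necessarily the paper's exact formula.

    ```python
    import networkx as nx
    import numpy as np
    from scipy.spatial.distance import jensenshannon

    def distance_distribution(G, max_d=10):
        counts = np.zeros(max_d + 1)
        for _, dists in nx.shortest_path_length(G):   # all-pairs shortest paths
            for d in dists.values():
                counts[min(d, max_d)] += 1
        return counts / counts.sum()                  # distance probability distribution

    G1 = nx.erdos_renyi_graph(50, 0.1, seed=1)
    G2 = nx.barabasi_albert_graph(50, 3, seed=1)
    p, q = distance_distribution(G1), distance_distribution(G2)
    print(jensenshannon(p, q))   # 0 only for identical distance distributions
    ```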

  6. Using input feature information to improve ultraviolet retrieval in neural networks

    NASA Astrophysics Data System (ADS)

    Sun, Zhibin; Chang, Ni-Bin; Gao, Wei; Chen, Maosi; Zempila, Melina

    2017-09-01

    In neural networks, training/prediction accuracy and algorithm efficiency can be improved significantly via accurate input feature extraction. In this study, spatial features of several important factors in retrieving surface ultraviolet (UV) radiation are extracted. An extreme learning machine (ELM) is used to retrieve the 2014 surface UV over the continental United States using the extracted features. The results show that more input weights can improve the learning capacity of neural networks.
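
    An extreme learning machine is simple enough to sketch in full: random, untrained hidden weights followed by a closed-form least-squares output layer. The inputs below are random stand-ins for the study's spatial UV-retrieval features.

    ```python
    import numpy as np

    def elm_fit(X, y, hidden=100, seed=0):
        rng = np.random.default_rng(seed)
        W = rng.normal(size=(X.shape[1], hidden))      # random, never trained
        b = rng.normal(size=hidden)
        H = np.tanh(X @ W + b)                         # hidden-layer activations
        beta, *_ = np.linalg.lstsq(H, y, rcond=None)   # closed-form output weights
        return W, b, beta

    def elm_predict(X, W, b, beta):
        return np.tanh(X @ W + b) @ beta

    rng = np.random.default_rng(1)
    X, y = rng.random((300, 5)), rng.random(300)       # toy features / surface UV
    print(elm_predict(X[:3], *elm_fit(X, y)))
    ```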

  7. Volume and Value of Big Healthcare Data.

    PubMed

    Dinov, Ivo D

    Modern scientific inquiries require significant data-driven evidence and trans-disciplinary expertise to extract valuable information and gain actionable knowledge about natural processes. Effective evidence-based decisions require the collection, processing and interpretation of vast amounts of complex data. Moore's and Kryder's laws of exponential increase in computational power and information storage, respectively, dictate the need for rapid trans-disciplinary advances, technological innovation and effective mechanisms for managing and interrogating Big Healthcare Data. In this article, we review important aspects of Big Data analytics and discuss important questions such as: What are the challenges and opportunities associated with this biomedical, social, and healthcare data avalanche? Are there innovative statistical computing strategies to represent, model, analyze and interpret Big heterogeneous data? We present the foundation of a new compressive big data analytics (CBDA) framework for representation, modeling and inference of large, complex and heterogeneous datasets. Finally, we consider specific directions likely to impact the process of extracting information from Big healthcare data, translating that information to knowledge, and deriving appropriate actions.

  8. Volume and Value of Big Healthcare Data

    PubMed Central

    Dinov, Ivo D.

    2016-01-01

    Modern scientific inquiries require significant data-driven evidence and trans-disciplinary expertise to extract valuable information and gain actionable knowledge about natural processes. Effective evidence-based decisions require the collection, processing and interpretation of vast amounts of complex data. Moore's and Kryder's laws of exponential increase in computational power and information storage, respectively, dictate the need for rapid trans-disciplinary advances, technological innovation and effective mechanisms for managing and interrogating Big Healthcare Data. In this article, we review important aspects of Big Data analytics and discuss important questions such as: What are the challenges and opportunities associated with this biomedical, social, and healthcare data avalanche? Are there innovative statistical computing strategies to represent, model, analyze and interpret Big heterogeneous data? We present the foundation of a new compressive big data analytics (CBDA) framework for representation, modeling and inference of large, complex and heterogeneous datasets. Finally, we consider specific directions likely to impact the process of extracting information from Big healthcare data, translating that information to knowledge, and deriving appropriate actions. PMID:26998309

  9. Text extraction method for historical Tibetan document images based on block projections

    NASA Astrophysics Data System (ADS)

    Duan, Li-juan; Zhang, Xi-qun; Ma, Long-long; Wu, Jian

    2017-11-01

    Text extraction is an important initial step in digitizing historical documents. In this paper, we present a text extraction method for historical Tibetan document images based on block projections. The task of text extraction is treated as a text-area detection and location problem. The images are divided equally into blocks, and the blocks are filtered using the categories of their connected components and their corner-point density. By analyzing the projections of the filtered blocks, the approximate text areas can be located and the text regions extracted. Experiments on a dataset of historical Tibetan documents demonstrate the effectiveness of the proposed method.
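
    The block-projection idea can be sketched as follows: the page is divided into equal blocks, and a block is kept as a text candidate when its projection profile shows enough ink. The block grid and threshold are illustrative assumptions.

    ```python
    import numpy as np

    def text_blocks(binary_img, rows=4, cols=4, min_ink=0.02):
        h, w = binary_img.shape
        bh, bw = h // rows, w // cols
        keep = []
        for r in range(rows):
            for c in range(cols):
                block = binary_img[r * bh:(r + 1) * bh, c * bw:(c + 1) * bw]
                proj = block.sum(axis=1)            # horizontal projection profile
                if proj.mean() / bw > min_ink:      # enough ink => candidate text block
                    keep.append((r, c))
        return keep

    page = (np.random.default_rng(0).random((200, 200)) > 0.9).astype(int)
    print(text_blocks(page))
    ```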

  10. PDF text classification to leverage information extraction from publication reports.

    PubMed

    Bui, Duy Duc An; Del Fiol, Guilherme; Jonnalagadda, Siddhartha

    2016-06-01

    Data extraction from original study reports is a time-consuming, error-prone process in systematic review development. Information extraction (IE) systems have the potential to assist humans in the extraction task; however, the majority of IE systems were not designed to work on Portable Document Format (PDF) documents, an important and common extraction source for systematic reviews. In a PDF document, narrative content is often mixed with publication metadata or semi-structured text, which adds challenges for the underlying natural language processing algorithm. Our goal is to categorize PDF texts for strategic use by IE systems. We used an open-source tool to extract raw texts from a PDF document and developed a text classification algorithm that follows a multi-pass sieve framework to automatically classify PDF text snippets (for brevity, texts) into TITLE, ABSTRACT, BODYTEXT, SEMISTRUCTURE, and METADATA categories. To validate the algorithm, we developed a gold standard of PDF reports that were included in the development of previous systematic reviews by the Cochrane Collaboration. In a two-step procedure, we evaluated (1) classification performance, compared with a machine learning classifier, and (2) the effects of the algorithm on an IE system that extracts clinical outcome mentions. The multi-pass sieve algorithm achieved an accuracy of 92.6%, which was 9.7% (p<0.001) higher than the best performing machine learning classifier, which used a logistic regression algorithm. F-measure improvements were observed in the classification of TITLE (+15.6%), ABSTRACT (+54.2%), BODYTEXT (+3.7%), SEMISTRUCTURE (+34%), and METADATA (+14.2%). In addition, use of the algorithm to filter semi-structured texts and publication metadata improved the performance of the outcome extraction system (F-measure +4.1%, p=0.002). It also reduced the number of sentences to be processed by 44.9% (p<0.001), which corresponds to a processing time reduction of 50% (p=0.005). The rule-based multi-pass sieve framework can be used effectively in categorizing texts extracted from PDF documents. Text classification is an important prerequisite step to leverage information extraction from PDF documents. Copyright © 2016 Elsevier Inc. All rights reserved.
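
    A multi-pass sieve is an ordered cascade of high-precision rules, each claiming the texts it can classify and passing the rest downstream. The rules below are simplified stand-ins for the paper's actual passes.

    ```python
    import re

    def sieve_classify(text):
        if len(text) < 60 and text.isupper():                     # pass 1
            return "TITLE"
        if re.match(r"\s*(abstract|summary)\b", text, re.I):      # pass 2
            return "ABSTRACT"
        if re.search(r"doi:|©|received.*accepted", text, re.I):   # pass 3
            return "METADATA"
        if text.count("\t") > 2 or re.search(r"table \d", text, re.I):  # pass 4
            return "SEMISTRUCTURE"
        return "BODYTEXT"                                         # final catch-all pass

    for t in ["RANDOMIZED TRIAL OF X", "Abstract: We studied...",
              "doi:10.1000/xyz  © 2015"]:
        print(sieve_classify(t))
    ```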

  11. Technical design and system implementation of region-line primitive association framework

    NASA Astrophysics Data System (ADS)

    Wang, Min; Xing, Jinjin; Wang, Jie; Lv, Guonian

    2017-08-01

    Apart from regions, image edge lines are an important information source, and they deserve more attention in object-based image analysis (OBIA) than they currently receive. In the region-line primitive association framework (RLPAF), we promote straight-edge lines to line primitives to achieve more powerful OBIA. Along with regions, straight lines become basic units for the subsequent extraction and analysis of OBIA features. This study develops a new software system called remote-sensing knowledge finder (RSFinder) to implement RLPAF for engineering application purposes. This paper introduces the extended technical framework, a comprehensively designed feature set, the key technologies, and the software implementation. To our knowledge, RSFinder is the world's first OBIA system based on two types of primitives, namely regions and lines. It is fundamentally different from other well-known region-only-based OBIA systems, such as eCognition and the ENVI feature extraction module. This paper provides an important reference for the development of similarly structured OBIA systems and line-involved remote sensing information extraction algorithms.

  12. Information-Based Analysis of Data Assimilation (Invited)

    NASA Astrophysics Data System (ADS)

    Nearing, G. S.; Gupta, H. V.; Crow, W. T.; Gong, W.

    2013-12-01

    Data assimilation is defined as the Bayesian conditioning of uncertain model simulations on observations for the purpose of reducing uncertainty about model states. Practical data assimilation methods make the application of Bayes' law tractable either by employing assumptions about the prior, posterior and likelihood distributions (e.g., the Kalman family of filters) or by using resampling methods (e.g., the bootstrap filter). We propose to quantify the efficiency of these approximations in an OSSE setting using information theory and, in an OSSE or real-world validation setting, to measure the amount, and more importantly the quality, of information extracted from observations during data assimilation. To analyze DA assumptions, uncertainty is quantified as the Shannon-type entropy of a discretized probability distribution. The maximum amount of information that can be extracted from observations about model states is the mutual information between states and observations, which is equal to the reduction in entropy in our estimate of the state due to Bayesian filtering. The difference between this potential and the actual reduction in entropy due to Kalman (or other types of) filtering measures the inefficiency of the filter assumptions. Residual uncertainty in DA posterior state estimates can be attributed to three sources: (i) non-injectivity of the observation operator, (ii) noise in the observations, and (iii) filter approximations. The contribution of each of these sources is measurable in an OSSE setting. The amount of information extracted from observations by data assimilation (or system identification, including parameter estimation) can also be measured by Shannon's theory. Since practical filters are approximations of Bayes' law, it is important to know whether the information that is extracted from observations by a filter is reliable. We define information as either good or bad, and propose to measure these two types of information using partial Kullback-Leibler divergences. Defined this way, good and bad information sum to the total information. This segregation of information into good and bad components requires a validation target distribution; in a DA OSSE setting, this can be the true Bayesian posterior, but in a real-world setting the validation target might be determined by a set of in situ observations.
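
    The notion of information extracted as entropy reduction is easy to make concrete for a discrete state, as in the toy sketch below; the prior and posterior distributions are invented for illustration.

    ```python
    import numpy as np

    def entropy(p):
        p = p[p > 0]
        return -(p * np.log2(p)).sum()     # Shannon entropy in bits

    prior = np.array([0.25, 0.25, 0.25, 0.25])       # state estimate before assimilation
    posterior = np.array([0.70, 0.20, 0.05, 0.05])   # after conditioning on the observation
    print(entropy(prior) - entropy(posterior), "bits extracted")
    ```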

  13. HEDEA: A Python Tool for Extracting and Analysing Semi-structured Information from Medical Records

    PubMed Central

    Aggarwal, Anshul; Garhwal, Sunita

    2018-01-01

    Objectives One of the most important functions for a medical practitioner while treating a patient is to study the patient's complete medical history by going through all records, from test results to doctor's notes. With the increasing use of technology in medicine, these records are mostly digital, alleviating the problem of looking through a stack of papers, which are easily misplaced, but some of them are in an unstructured form. Large parts of clinical reports are in written text form and are tedious to use directly without appropriate pre-processing. In medical research, such health records may be a good, convenient source of medical data; however, lack of structure means that the data is unfit for statistical evaluation. In this paper, we introduce a system to extract, store, retrieve, and analyse information from health records, with a focus on the Indian healthcare scene. Methods A Python-based tool, Healthcare Data Extraction and Analysis (HEDEA), has been designed to extract structured information from various medical records using a regular expression-based approach. Results The HEDEA system works across a large set of formats to extract and analyse health information. Conclusions This tool can be used to generate analysis reports and charts using the central database. This information is only provided after prior approval has been received from the patient for medical research purposes. PMID:29770248

  14. HEDEA: A Python Tool for Extracting and Analysing Semi-structured Information from Medical Records.

    PubMed

    Aggarwal, Anshul; Garhwal, Sunita; Kumar, Ajay

    2018-04-01

    One of the most important functions for a medical practitioner while treating a patient is to study the patient's complete medical history by going through all records, from test results to doctor's notes. With the increasing use of technology in medicine, these records are mostly digital, alleviating the problem of looking through a stack of papers, which are easily misplaced, but some of them are in an unstructured form. Large parts of clinical reports are in written text form and are tedious to use directly without appropriate pre-processing. In medical research, such health records may be a good, convenient source of medical data; however, lack of structure means that the data is unfit for statistical evaluation. In this paper, we introduce a system to extract, store, retrieve, and analyse information from health records, with a focus on the Indian healthcare scene. A Python-based tool, Healthcare Data Extraction and Analysis (HEDEA), has been designed to extract structured information from various medical records using a regular expression-based approach. The HEDEA system works across a large set of formats to extract and analyse health information. This tool can be used to generate analysis reports and charts using the central database. This information is only provided after prior approval has been received from the patient for medical research purposes.
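
    An illustrative regular-expression pass in the spirit of the tool is sketched below; the field names and patterns are hypothetical, not HEDEA's actual rule set.

    ```python
    import re

    RECORD = "Patient Name: A. Kumar | Age: 54 | Hb: 11.2 g/dL | BP: 130/85 mmHg"

    PATTERNS = {
        "name": r"Patient Name:\s*([^|]+)",
        "age":  r"Age:\s*(\d+)",
        "hb":   r"Hb:\s*([\d.]+)\s*g/dL",
        "bp":   r"BP:\s*(\d+/\d+)\s*mmHg",
    }

    extracted = {field: m.group(1).strip()
                 for field, pat in PATTERNS.items()
                 if (m := re.search(pat, RECORD))}
    print(extracted)  # {'name': 'A. Kumar', 'age': '54', 'hb': '11.2', 'bp': '130/85'}
    ```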

  15. Uncovering the essential links in online commercial networks

    NASA Astrophysics Data System (ADS)

    Zeng, Wei; Fang, Meiling; Shao, Junming; Shang, Mingsheng

    2016-09-01

    Recommender systems are designed to effectively support individuals' decision-making processes on various web sites. A recommender system can be naturally represented as a user-object bipartite network, where a link indicates that a user has collected an object. Recently, research on the information backbone has attracted researchers' interest; the backbone is a sub-network with fewer nodes and links that nevertheless carries most of the relevant information. With the backbone, a system can generate satisfactory recommendations while saving much computing resource. In this paper, we propose an enhanced topology-aware method to extract the information backbone in the bipartite network, based mainly on the information of neighboring users and objects. Our backbone extraction method enables recommender systems to achieve more than 90% of the accuracy of the top-L recommendation while consuming only 20% of the links. The experimental results show that our method outperforms alternative backbone extraction methods. Moreover, the structure of the information backbone is studied in detail. Finally, we highlight that the information backbone is one of the most important properties of the bipartite network, with which one can significantly improve the efficiency of a recommender system.

  16. A Method for Extracting Road Boundary Information from Crowdsourcing Vehicle GPS Trajectories.

    PubMed

    Yang, Wei; Ai, Tinghua; Lu, Wei

    2018-04-19

    Crowdsourced trajectory data is an important source for accessing and updating road information. In this paper, we present a novel approach for extracting road boundary information from crowdsourced vehicle traces based on Delaunay triangulation (DT). First, an optimization and interpolation method is proposed to filter abnormal trace segments from raw global positioning system (GPS) traces and to interpolate the optimized segments adaptively, ensuring there are enough tracking points. Second, the DT and the Voronoi diagram are constructed within the interpolated tracking lines to calculate road boundary descriptors using the areas of the Voronoi cells and the lengths of the triangle edges; the road boundary detection model is then established by integrating the boundary descriptors and trajectory movement features (e.g., direction) with the DT. Third, the boundary detection model is used to detect road boundaries from the DT constructed from the trajectory lines, and a region-growing method based on seed polygons is proposed to extract the road boundary. Experiments were conducted using the GPS traces of taxis in Beijing, China, and the results show that the proposed method is suitable for extracting road boundaries from low-frequency GPS traces, multi-type road structures, and different time intervals. Compared with two existing methods, the automatically extracted boundary information proved to be of higher quality.
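
    The triangulation step can be sketched with scipy: a Delaunay triangulation is built over the GPS points, and unusually long edges, which tend to span beyond the road surface, are flagged as boundary candidates. The point cloud and length threshold are illustrative assumptions.

    ```python
    import numpy as np
    from scipy.spatial import Delaunay

    rng = np.random.default_rng(0)
    pts = rng.random((200, 2)) * [100, 8]        # toy traces along a road strip
    tri = Delaunay(pts)

    edges = set()
    for simplex in tri.simplices:                # collect unique triangle edges
        for i in range(3):
            a, b = sorted((simplex[i], simplex[(i + 1) % 3]))
            edges.add((a, b))
    edges = sorted(edges)

    lengths = np.array([np.linalg.norm(pts[a] - pts[b]) for a, b in edges])
    threshold = lengths.mean() + 2 * lengths.std()
    boundary = [e for e, l in zip(edges, lengths) if l > threshold]
    print(len(boundary), "candidate boundary edges")
    ```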

  17. A Method for Extracting Road Boundary Information from Crowdsourcing Vehicle GPS Trajectories

    PubMed Central

    Yang, Wei

    2018-01-01

    Crowdsourced trajectory data is an important source for accessing and updating road information. In this paper, we present a novel approach for extracting road boundary information from crowdsourced vehicle traces based on Delaunay triangulation (DT). First, an optimization and interpolation method is proposed to filter abnormal trace segments from raw global positioning system (GPS) traces and to interpolate the optimized segments adaptively, ensuring there are enough tracking points. Second, the DT and the Voronoi diagram are constructed within the interpolated tracking lines to calculate road boundary descriptors using the areas of the Voronoi cells and the lengths of the triangle edges; the road boundary detection model is then established by integrating the boundary descriptors and trajectory movement features (e.g., direction) with the DT. Third, the boundary detection model is used to detect road boundaries from the DT constructed from the trajectory lines, and a region-growing method based on seed polygons is proposed to extract the road boundary. Experiments were conducted using the GPS traces of taxis in Beijing, China, and the results show that the proposed method is suitable for extracting road boundaries from low-frequency GPS traces, multi-type road structures, and different time intervals. Compared with two existing methods, the automatically extracted boundary information proved to be of higher quality. PMID:29671792

  18. Two-dimensional thermal video analysis of offshore bird and bat flight

    DOE PAGES

    Matzner, Shari; Cullinan, Valerie I.; Duberstein, Corey A.

    2015-09-11

    Thermal infrared video can provide essential information about bird and bat presence and activity for risk assessment studies, but the analysis of recorded video can be time-consuming and may not extract all of the available information. Automated processing makes continuous monitoring over extended periods of time feasible, and maximizes the information provided by video. This is especially important for collecting data in remote locations that are difficult for human observers to access, such as proposed offshore wind turbine sites. We present guidelines for selecting an appropriate thermal camera based on environmental conditions and the physical characteristics of the target animals. We developed new video image processing algorithms that automate the extraction of bird and bat flight tracks from thermal video, and that characterize the extracted tracks to support animal identification and behavior inference. The algorithms use a video peak store process followed by background masking and perceptual grouping to extract flight tracks. The extracted tracks are automatically quantified in terms that could then be used to infer animal type and possibly behavior. The developed automated processing generates results that are reproducible and verifiable, and reduces the total amount of video data that must be retained and reviewed by human experts. Finally, we suggest models for interpreting thermal imaging information.

  19. Two-dimensional thermal video analysis of offshore bird and bat flight

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Matzner, Shari; Cullinan, Valerie I.; Duberstein, Corey A.

    Thermal infrared video can provide essential information about bird and bat presence and activity for risk assessment studies, but the analysis of recorded video can be time-consuming and may not extract all of the available information. Automated processing makes continuous monitoring over extended periods of time feasible, and maximizes the information provided by video. This is especially important for collecting data in remote locations that are difficult for human observers to access, such as proposed offshore wind turbine sites. We present guidelines for selecting an appropriate thermal camera based on environmental conditions and the physical characteristics of the target animals. We developed new video image processing algorithms that automate the extraction of bird and bat flight tracks from thermal video, and that characterize the extracted tracks to support animal identification and behavior inference. The algorithms use a video peak store process followed by background masking and perceptual grouping to extract flight tracks. The extracted tracks are automatically quantified in terms that could then be used to infer animal type and possibly behavior. The developed automated processing generates results that are reproducible and verifiable, and reduces the total amount of video data that must be retained and reviewed by human experts. Finally, we suggest models for interpreting thermal imaging information.

  20. Linguistic feature analysis for protein interaction extraction

    PubMed Central

    2009-01-01

    Background The rapid growth in the amount of publicly available reports on biomedical experimental results has recently caused a boost of text mining approaches for protein interaction extraction. Most approaches rely implicitly or explicitly on linguistic, i.e., lexical and syntactic, data extracted from text. However, only a few attempts have been made to evaluate the contribution of the different feature types. In this work, we contribute to this evaluation by studying the relative importance of deep syntactic features, i.e., grammatical relations, shallow syntactic features (part-of-speech information) and lexical features. For this purpose, we use a recently proposed approach that uses support vector machines with structured kernels. Results Our results reveal that the contribution of the different feature types varies for the different data sets on which the experiments were conducted. The smaller the training corpus compared to the test data, the more important the role of grammatical relations becomes. Moreover, classifiers based on deep syntactic information prove to be more robust on heterogeneous texts where no or only limited common vocabulary is shared. Conclusion Our findings suggest that grammatical relations play an important role in the interaction extraction task. Moreover, the net advantage of adding lexical and shallow syntactic features is small relative to the number of added features. This implies that efficient classifiers can be built by using only a small fraction of the features that are typically used in recent approaches. PMID:19909518

  1. Gstruct: a system for extracting schemas from GML documents

    NASA Astrophysics Data System (ADS)

    Chen, Hui; Zhu, Fubao; Guan, Jihong; Zhou, Shuigeng

    2008-10-01

    Geography Markup Language (GML) has become the de facto standard for geographic information representation on the internet. A GML schema provides a way to define the structure, content, and semantics of GML documents. It contains useful structural information about GML documents and plays an important role in storing, querying and analyzing GML data. However, a GML schema is not mandatory, and it is common for a GML document to contain no schema. In this paper, we present Gstruct, a tool for GML schema extraction. Gstruct finds the features in the input GML documents, identifies geometry datatypes as well as simple datatypes, then integrates all these features and eliminates improper components to output the optimal schema. Experiments demonstrate that Gstruct is effective in extracting semantically meaningful schemas from GML documents.

  2. Improving the Accuracy of Attribute Extraction using the Relatedness between Attribute Values

    NASA Astrophysics Data System (ADS)

    Bollegala, Danushka; Tani, Naoki; Ishizuka, Mitsuru

    Extracting attribute-values related to entities from web texts is an important step in numerous web-related tasks such as information retrieval, information extraction, and entity disambiguation (namesake disambiguation). For example, for a search query that contains a personal name, we can not only return documents that contain that personal name, but, if we have attribute-values such as the organization for which that person works, we can also suggest documents that contain information related to that organization, thereby improving the user's search experience. Despite numerous potential applications of attribute extraction, it remains a challenging task due to the inherent noise in web data -- often a single web page contains multiple entities and attributes. We propose a graph-based approach to select the correct attribute-values from a set of candidate attribute-values extracted for a particular entity. First, we build an undirected weighted graph in which attribute-values are represented by nodes, and each edge connecting two nodes represents the degree of relatedness between the corresponding attribute-values. Next, we find the maximum spanning tree of this graph that connects exactly one attribute-value for each attribute-type. The proposed method outperforms previously proposed attribute extraction methods on a dataset that contains 5000 web pages.
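
    The graph step lends itself to a short sketch. Below, candidate attribute-values are nodes, pairwise relatedness scores (placeholders here; the paper derives them from web data) are edge weights, and networkx computes a maximum spanning tree; the paper's additional constraint of exactly one value per attribute-type would require a further selection step not shown.

        import networkx as nx

        # Hypothetical candidate attribute-values for one person entity.
        G = nx.Graph()
        G.add_edge(("employer", "Univ. of Tokyo"), ("title", "professor"), weight=0.9)
        G.add_edge(("employer", "Hitachi"), ("title", "professor"), weight=0.4)
        G.add_edge(("employer", "Univ. of Tokyo"), ("employer", "Hitachi"), weight=0.1)

        # Keep the tree of maximum total relatedness.
        mst = nx.maximum_spanning_tree(G)
        print(sorted(mst.edges(data=True)))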

  3. a Geographic Data Gathering System for Image Geolocalization Refining

    NASA Astrophysics Data System (ADS)

    Semaan, B.; Servières, M.; Moreau, G.; Chebaro, B.

    2017-09-01

    Image geolocalization has become an important research field during the last decade. The field is divided into two main branches. The first is coarse image geolocalization, used to find out which country, region or city an image belongs to. The second is the refinement of image localization for uses that require more accuracy, such as augmented reality and three-dimensional environment reconstruction from images. In this paper we present a processing chain that gathers geographic data from several sources in order to deliver a geolocalization better than the GPS fix of an image, together with precise camera pose parameters. To do so, we use multiple types of data. Some of this information is visible in the image and is extracted using image processing; other data can be extracted from image file headers or from related information on online image-sharing platforms. Extracted information elements are not expressive enough if they remain disconnected. We show that grouping these information elements helps find the best geolocalization of the image.

  4. Visual content highlighting via automatic extraction of embedded captions on MPEG compressed video

    NASA Astrophysics Data System (ADS)

    Yeo, Boon-Lock; Liu, Bede

    1996-03-01

    Embedded captions in TV programs such as news broadcasts, documentaries and coverage of sports events provide important information on the underlying events. In digital video libraries, such captions represent a highly condensed form of key information on the contents of the video. In this paper we propose a scheme to automatically detect the presence of captions embedded in video frames. The proposed method operates on reduced image sequences which are efficiently reconstructed from compressed MPEG video and thus does not require full frame decompression. The detection, extraction and analysis of embedded captions help to capture the highlights of visual contents in video documents for better organization of video, to present succinctly the important messages embedded in the images, and to facilitate browsing, searching and retrieval of relevant clips.

  5. MedEx/J: A One-Scan Simple and Fast NLP Tool for Japanese Clinical Texts.

    PubMed

    Aramaki, Eiji; Yano, Ken; Wakamiya, Shoko

    2017-01-01

    Because of the recent replacement of physical documents with electronic medical records (EMR), the importance of information processing in the medical field has increased. In light of this trend, we have been developing MedEx/J, which retrieves important information from Japanese-language medical reports. MedEx/J executes two tasks simultaneously: (1) term extraction, and (2) positive and negative event classification. We designate this approach as a one-scan approach, providing simplicity of the system and reasonable accuracy. MedEx/J performance on the two tasks is as follows: (1) term extraction (F(β=1) = 0.87) and (2) positive-negative classification (F(β=1) = 0.63). This paper also presents discussion and explains remaining issues in the medical natural language processing field.

  6. Evaluation of Ultrasonic Fiber Structure Extraction Technique Using Autopsy Specimens of Liver

    NASA Astrophysics Data System (ADS)

    Yamaguchi, Tadashi; Hirai, Kazuki; Yamada, Hiroyuki; Ebara, Masaaki; Hachiya, Hiroyuki

    2005-06-01

    It is very important to diagnose liver cirrhosis noninvasively and correctly. In our previous studies, we proposed a processing technique to detect changes in liver tissue in vivo. In this paper, we evaluate the relationship between liver disease and echo information using autopsy specimens of a human liver in vitro. In vitro experiments make it possible to verify clearly the effect of each processing parameter and to compare the processing results with the actual human liver tissue structure. Using our processing technique, information that does not obey a Rayleigh distribution was extracted from the echo signals of the autopsy liver specimens, depending on the setting of a particular processing parameter. The fiber tissue structure of the same specimen was extracted from a number of histological images of stained tissue. We constructed 3D structures from the information extracted from the echo signal and from the fiber structure of the stained tissue, and compared the two. By comparing the 3D structures, it is possible to evaluate the relationship between the information that does not obey a Rayleigh distribution in the echo signal and the fibrosis structure.

  7. Terrain Extraction by Integrating Terrestrial Laser Scanner Data and Spectral Information

    NASA Astrophysics Data System (ADS)

    Lau, C. L.; Halim, S.; Zulkepli, M.; Azwan, A. M.; Tang, W. L.; Chong, A. K.

    2015-10-01

    The extraction of true terrain points from unstructured laser point cloud data is an important process for producing an accurate digital terrain model (DTM). However, most spatial filtering methods utilize only the geometric data to discriminate terrain points from non-terrain points. Point cloud filtering can also be improved by using the spectral information available with some scanners. Therefore, the objective of this study is to investigate the effectiveness of using the three channels (red, green and blue) of the colour imagery captured by the built-in digital camera available in some Terrestrial Laser Scanners (TLS) for terrain extraction. In this study, data acquisition was conducted at a mini replica landscape in Universiti Teknologi Malaysia (UTM), Skudai campus using a Leica ScanStation C10. The spectral information of the coloured point clouds from selected sample classes was extracted for spectral analysis. Coloured points that fall within the corresponding preset spectral thresholds are identified as points of that specific feature class. This terrain extraction process was implemented in Matlab. Results demonstrate that a passive image of higher spectral resolution is required to improve the output, because the low quality of the colour images captured by the sensor leads to low separability in spectral reflectance. In conclusion, this study shows that spectral information can be used as a parameter for terrain extraction.
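
    The per-class spectral thresholding reduces to a simple range test on the RGB attributes of each coloured point. A minimal sketch in Python (the study used Matlab; the synthetic cloud and threshold values below are placeholders):

        import numpy as np

        rng = np.random.default_rng(0)
        # Stand-in for a coloured TLS cloud: columns X, Y, Z, R, G, B.
        cloud = rng.random((10000, 6)) * [50, 50, 5, 255, 255, 255]

        rgb = cloud[:, 3:6]
        low = np.array([90, 60, 40])      # hypothetical lower bounds for the terrain class
        high = np.array([180, 140, 110])  # hypothetical upper bounds
        in_range = np.all((rgb >= low) & (rgb <= high), axis=1)
        terrain_points = cloud[in_range]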

  8. Extraction of reduced alteration information based on Aster data: a case study of the Bashibulake uranium ore district

    NASA Astrophysics Data System (ADS)

    Ye, Fa-wang; Liu, De-chang

    2008-12-01

    Practices of sandstone-type uranium exploration in recent years in China indicate that uranium mineralization alteration information is of great importance for selecting a new uranium target or prospecting in the outer area of a known uranium ore district. Taking the BASHIBULAKE uranium ore district as a case study, this paper presents the technical approach and methods for extracting the reduced alteration information caused by oil and gas in the BASHIBULAKE ore district using ASTER data. First, the regional geological setting and research status of the BASHIBULAKE uranium ore district are briefly introduced. Then, the spectral characteristics of altered and unaltered sandstone in the ore district are analyzed in depth. Based on the spectral analysis, two technical approaches to extracting the remotely sensed reduced alteration information are proposed, and an un-mixing method is introduced to process the ASTER data. From the enhanced images, three remote sensing anomaly zones are discovered, and their geological and prospecting significance is further confirmed by taking advantage of the multiple SWIR bands of ASTER data. Finally, the distribution and intensity of the reduced alteration information in the Cretaceous system and its relationship with the genesis of the uranium deposit are discussed, and specific suggestions for uranium prospecting in the outer area of the BASHIBULAKE ore district are proposed.

  9. A weighted information criterion for multiple minor components and its adaptive extraction algorithms.

    PubMed

    Gao, Yingbin; Kong, Xiangyu; Zhang, Huihui; Hou, Li'an

    2017-05-01

    Minor components (MCs) play an important role in signal processing and data analysis, so developing MC extraction algorithms is valuable work. Based on the concepts of weighted subspaces and optimization theory, a weighted information criterion is proposed for searching for the optimum solution of a linear neural network. This information criterion exhibits a unique global minimum, attained if and only if the state matrix is composed of the desired MCs of the autocorrelation matrix of the input signal. By using the gradient ascent method and the recursive least squares (RLS) method, two algorithms are developed for extracting multiple MCs. The global convergence of the proposed algorithms is also analyzed by the Lyapunov method. The proposed algorithms can extract multiple MCs in parallel and have advantages in dealing with high-dimensional matrices. Since the weighting matrix does not require accurate values, it facilitates the system design of the proposed algorithms for practical applications. The speed and computation advantages of the proposed algorithms are verified through simulations. Copyright © 2017 Elsevier Ltd. All rights reserved.
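
    For reference, the fixed point the adaptive algorithms are designed to reach can be computed in batch form: the minor components are the eigenvectors of the autocorrelation matrix with the smallest eigenvalues. This sketch uses a plain eigendecomposition rather than the paper's gradient/RLS updates:

        import numpy as np

        def minor_components(x, k):
            """x: (n_samples, dim) data matrix; returns the k minor components."""
            R = (x.T @ x) / len(x)                # sample autocorrelation matrix
            eigvals, eigvecs = np.linalg.eigh(R)  # eigenvalues in ascending order
            return eigvecs[:, :k]                 # columns spanning the minor subspace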

  10. aCGH-MAS: Analysis of aCGH by means of Multiagent System

    PubMed Central

    Benito, Rocío; Bajo, Javier; Rodríguez, Ana Eugenia; Abáigar, María

    2015-01-01

    There are currently different techniques, such as CGH arrays, to study genetic variations in patients. CGH arrays analyze gains and losses in different regions of the chromosomes. Regions with gains or losses in pathologies are important for selecting relevant genes or CNVs (copy-number variations) associated with the variations detected within chromosomes. Information about mutations, genes, proteins, variations, CNVs, and diseases can be found in different databases, and it would be of interest to integrate these different sources to extract relevant information. This work proposes a multiagent system to manage the information of aCGH arrays, with the aim of providing an intuitive and extensible system to analyze and interpret the results. The agent roles integrate statistical techniques to select relevant variations, visualization techniques for the interpretation of the final results, and a CBR system to extract relevant information from different information sources. PMID:25874203

  11. Information extraction with object based support vector machines and vegetation indices

    NASA Astrophysics Data System (ADS)

    Ustuner, Mustafa; Abdikan, Saygin; Balik Sanli, Fusun

    2016-07-01

    Information extraction from remote sensing data is important for policy and decision makers, as the extracted information provides base layers for many real-world applications. Classification of remotely sensed data is one of the most common methods of extracting information; however, it remains challenging because several factors affect the accuracy of classification. The resolution of the imagery, the number and homogeneity of land cover classes, the purity of the training data and the characteristics of the adopted classifiers are just some of these factors. Object-based image classification has some advantages over pixel-based classification for high-resolution images, since it uses geometry and structure information in addition to spectral information. Vegetation indices are also commonly used in the classification process, since they provide additional spectral information for vegetation, forestry and agricultural areas. In this study, the impacts of the Normalized Difference Vegetation Index (NDVI) and Normalized Difference Red Edge Index (NDRE) on the classification accuracy of RapidEye imagery were investigated. Object-based Support Vector Machines were implemented for the classification of crop types in the study area, located in the Aegean region of Turkey. Results demonstrated that the incorporation of NDRE increased the overall classification accuracy from 79.96% to 86.80%, whereas NDVI decreased it from 79.96% to 78.90%. Moreover, object-based classification with RapidEye data gives promising results for crop type mapping and analysis.
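
    The two indices are simple band ratios: NDVI = (NIR - Red)/(NIR + Red) and NDRE = (NIR - RedEdge)/(NIR + RedEdge). A sketch of the arithmetic, assuming the RapidEye near-infrared, red and red-edge bands are available as floating-point arrays:

        import numpy as np

        def ndvi(nir, red):
            return (nir - red) / (nir + red + 1e-9)            # epsilon avoids divide-by-zero

        def ndre(nir, red_edge):
            return (nir - red_edge) / (nir + red_edge + 1e-9)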

  12. Isolation and characterization of DNA from archaeological bone.

    PubMed

    Hagelberg, E; Clegg, J B

    1991-04-22

    DNA was extracted from human and animal bones recovered from archaeological sites and mitochondrial DNA sequences were amplified from the extracts using the polymerase chain reaction. Evidence is presented that the amplified sequences are authentic and do not represent contamination by extraneous DNA. The results show that significant amounts of genetic information can survive for long periods in bone, and have important implications for evolutionary genetics, anthropology and forensic science.

  13. How the variance of some extraction variables may affect the quality of espresso coffees served in coffee shops.

    PubMed

    Severini, Carla; Derossi, Antonio; Fiore, Anna G; De Pilli, Teresa; Alessandrino, Ofelia; Del Mastro, Arcangela

    2016-07-01

    To improve the quality of espresso coffee, the variables under the control of the barista, such as grinding grade, coffee quantity and the pressure applied to the coffee cake, as well as their variance, are of great importance. Nonlinear mixed-effect modeling was used to obtain information on the changes in the chemical attributes of espresso coffee (EC) as a function of the variability of extraction conditions. During extraction, the changes in volume were well described by a logistic model, whereas the chemical attributes were better fit by first-order kinetics. The major source of information was contained in the grinding grade, which accounted for 87-96% of the variance of the experimental data. The variability of the grinding produced changes in caffeine content in the range of 80.03-130.36 mg even at a constant grinding grade of 6.5. The variability in volume and chemical attributes of EC is large. Grinding had the most important effect, as the variability in particle size distribution observed for each grinding level had a profound effect on the quality of EC. Standardization of grinding would be of crucial importance for obtaining consistently high-quality espresso coffees. © 2015 Society of Chemical Industry.
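
    The two models named in the abstract are easy to fit with standard tools. A sketch using SciPy, with hypothetical time/caffeine data standing in for the measured extractions:

        import numpy as np
        from scipy.optimize import curve_fit

        def logistic(t, vmax, k, t0):                  # cumulative volume model
            return vmax / (1.0 + np.exp(-k * (t - t0)))

        def first_order(t, c_inf, k):                  # chemical attribute model
            return c_inf * (1.0 - np.exp(-k * t))

        t = np.array([5.0, 10, 15, 20, 25])            # extraction time, s (placeholder)
        caffeine = np.array([45.0, 75, 95, 108, 115])  # mg extracted (placeholder)
        (c_inf, k), _ = curve_fit(first_order, t, caffeine, p0=(120.0, 0.1))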

  14. Detecting the red tide based on remote sensing data in optically complex East China Sea

    NASA Astrophysics Data System (ADS)

    Xu, Xiaohui; Pan, Delu; Mao, Zhihua; Tao, Bangyi; Liu, Qiong

    2012-09-01

    Red tides not only damage marine fishery production, deteriorate the marine environment and affect the coastal tourist industry, but can also poison, or even kill, people who eat toxic seafood contaminated by red tide organisms. Remote sensing technology offers large-scale, synchronized and rapid monitoring, making it one of the most important and effective means of red tide monitoring. This paper takes the high-frequency red tide areas of the East China Sea as the study area and MODIS/Aqua L2 data as the data source, and analyzes and compares the spectral differences between red tide and non-red tide water bodies over many historical events. Based on these spectral differences, the paper develops the algorithm Rrs555/Rrs488 > 1.5 to extract red tide information. Applying the algorithm to a red tide event that occurred in the East China Sea on May 28, 2009 showed that the method can effectively determine the location of the red tide occurrence; there is good correspondence between the red tide extraction result and the remotely sensed chlorophyll-a concentration, showing that the algorithm can effectively locate and extract red tide information.
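
    The detection rule itself is a one-line band ratio test. A sketch, assuming the two MODIS/Aqua remote sensing reflectance bands are loaded as 2-D arrays:

        import numpy as np

        def red_tide_mask(rrs_555, rrs_488, ratio=1.5):
            # Flag pixels where Rrs(555)/Rrs(488) exceeds the threshold.
            with np.errstate(divide="ignore", invalid="ignore"):
                r = np.where(rrs_488 > 0, rrs_555 / rrs_488, np.nan)
            return r > ratio   # boolean mask of suspected red tide pixels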

  15. KAM (Knowledge Acquisition Module): A tool to simplify the knowledge acquisition process

    NASA Technical Reports Server (NTRS)

    Gettig, Gary A.

    1988-01-01

    Analysts, knowledge engineers and information specialists are faced with increasing volumes of time-sensitive data in text form, either as free text or highly structured text records. Rapid access to the relevant data in these sources is essential. However, due to the volume and organization of the contents, and limitations of human memory and association, frequently: (1) important information is not located in time; (2) reams of irrelevant data are searched; and (3) interesting or critical associations are missed due to physical or temporal gaps involved in working with large files. The Knowledge Acquisition Module (KAM) is a microcomputer-based expert system designed to assist knowledge engineers, analysts, and other specialists in extracting useful knowledge from large volumes of digitized text and text-based files. KAM formulates non-explicit, ambiguous, or vague relations, rules, and facts into a manageable and consistent formal code. A library of system rules or heuristics is maintained to control the extraction of rules, relations, assertions, and other patterns from the text. These heuristics can be added, deleted or customized by the user. The user can further control the extraction process with optional topic specifications. This allows the user to cluster extracts based on specific topics. Because KAM formalizes diverse knowledge, it can be used by a variety of expert systems and automated reasoning applications. KAM can also perform important roles in computer-assisted training and skill development. Current research efforts include the applicability of neural networks to aid in the extraction process and the conversion of these extracts into standard formats.

  16. A Comparative Analysis of Extract, Transformation and Loading (ETL) Process

    NASA Astrophysics Data System (ADS)

    Runtuwene, J. P. A.; Tangkawarow, I. R. H. T.; Manoppo, C. T. M.; Salaki, R. J.

    2018-02-01

    The current growth of data and information occurs rapidly, in varying amounts and media. This development will eventually produce large volumes of data, better known as Big Data. Business Intelligence (BI) utilizes large amounts of data and information for analysis so that important information can be obtained and used to support decision-making processes. In practice, a process integrating existing data and information into a data warehouse is needed. This data integration process is known as Extract, Transformation and Loading (ETL). Many applications have been developed to carry out the ETL process, but selecting which application is most effective and efficient in terms of time, cost and effort can be a challenge. Therefore, the objective of the study was to provide a comparative analysis of the ETL process using Microsoft SQL Server Integration Services (SSIS) and Pentaho Data Integration (PDI).

  17. Semantic Location Extraction from Crowdsourced Data

    NASA Astrophysics Data System (ADS)

    Koswatte, S.; Mcdougall, K.; Liu, X.

    2016-06-01

    Crowdsourced Data (CSD) has recently received increased attention in many application areas, including disaster management. Convenience of production and use, data currency and abundance are some of the key reasons for this high interest. Conversely, quality issues like incompleteness, credibility and relevancy prevent the direct use of such data in important applications like disaster management. Moreover, the availability of location information in CSD is problematic, as it remains very low on many crowdsourced platforms such as Twitter. Also, the recorded location mostly relates to the mobile device or user location and often does not represent the event location. In CSD, the event location is discussed descriptively in the comments, in addition to the recorded location (which is generated by the mobile device's GPS or the mobile communication network). This study attempts to semantically extract CSD location information with the help of an ontological gazetteer and other available resources. 2011 Queensland flood tweets and Ushahidi Crowd Map data were semantically analysed to extract location information with the support of the Queensland Gazetteer, which was converted into an ontological gazetteer, and a global gazetteer. Preliminary results show that the use of ontologies and semantics can improve the accuracy of place name identification in CSD and the process of location information extraction.

  18. Main Road Extraction from ZY-3 Grayscale Imagery Based on Directional Mathematical Morphology and VGI Prior Knowledge in Urban Areas

    PubMed Central

    Liu, Bo; Wu, Huayi; Wang, Yandong; Liu, Wenming

    2015-01-01

    Main road features extracted from remotely sensed imagery play an important role in many civilian and military applications, such as updating Geographic Information System (GIS) databases, urban structure analysis, spatial data matching and road navigation. Current methods for road feature extraction from high-resolution imagery are typically based on threshold value segmentation. It is difficult, however, to completely separate road features from the background. We present a new method for extracting main roads from high-resolution grayscale imagery based on directional mathematical morphology and prior knowledge obtained from the Volunteered Geographic Information found in the OpenStreetMap. The two salient steps in this strategy are: (1) using directional mathematical morphology to enhance the contrast between roads and non-roads; (2) using OpenStreetMap roads as prior knowledge to segment the remotely sensed imagery. Experiments were conducted on two ZiYuan-3 images and one QuickBird high-resolution grayscale image to compare our proposed method to other commonly used techniques for road feature extraction. The results demonstrated the validity and better performance of the proposed method for urban main road feature extraction. PMID:26397832
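
    One way to realize the directional-morphology step is to open the grayscale image with a thin line structuring element at several orientations and keep the maximum response, which enhances elongated road-like structures. The sketch below (OpenCV, with illustrative parameters) is one plausible reading of that step, not the authors' exact code:

        import cv2
        import numpy as np

        def directional_opening(gray, length=21, angles=(0, 45, 90, 135)):
            responses = []
            for a in angles:
                # Horizontal line kernel, then rotate to the target angle.
                kernel = np.zeros((length, length), np.uint8)
                cv2.line(kernel, (0, length // 2), (length - 1, length // 2), 1, 1)
                M = cv2.getRotationMatrix2D((length / 2, length / 2), a, 1.0)
                kernel = cv2.warpAffine(kernel, M, (length, length))
                responses.append(cv2.morphologyEx(gray, cv2.MORPH_OPEN, kernel))
            return np.max(responses, axis=0)   # best response over all directions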

  19. Using Open Web APIs in Teaching Web Mining

    ERIC Educational Resources Information Center

    Chen, Hsinchun; Li, Xin; Chau, M.; Ho, Yi-Jen; Tseng, Chunju

    2009-01-01

    With the advent of the World Wide Web, many business applications that utilize data mining and text mining techniques to extract useful business information on the Web have evolved from Web searching to Web mining. It is important for students to acquire knowledge and hands-on experience in Web mining during their education in information systems…

  20. An effective biometric discretization approach to extract highly discriminative, informative, and privacy-protective binary representation

    NASA Astrophysics Data System (ADS)

    Lim, Meng-Hui; Teoh, Andrew Beng Jin

    2011-12-01

    Biometric discretization derives a binary string for each user based on an ordered set of biometric features. This representative string ought to be discriminative, informative, and privacy protective when it is employed as a cryptographic key in various security applications upon error correction. However, it is commonly believed that satisfying the first and the second criteria simultaneously is not feasible, and that a tradeoff between them is inevitable. In this article, we propose an effective fixed bit allocation-based discretization approach which involves discriminative feature extraction, discriminative feature selection, unsupervised quantization (quantization that does not utilize class information), and linearly separable subcode (LSSC)-based encoding to fulfill all the ideal properties of a binary representation extracted for cryptographic applications. In addition, we examine a number of discriminative feature-selection measures for discretization and identify the proper way of setting an important feature-selection parameter. Encouraging experimental results vindicate the feasibility of our approach.

  1. Chemical named entities recognition: a review on approaches and applications

    PubMed Central

    2014-01-01

    The rapid increase in the flow rate of published digital information in all disciplines has resulted in a pressing need for techniques that can simplify the use of this information. The chemistry literature is very rich with information about chemical entities. Extracting molecules and their related properties and activities from the scientific literature to “text mine” these extracted data and determine contextual relationships helps research scientists, particularly those in drug development. One of the most important challenges in chemical text mining is the recognition of chemical entities mentioned in the texts. In this review, the authors briefly introduce the fundamental concepts of chemical literature mining, the textual contents of chemical documents, and the methods of naming chemicals in documents. We sketch out dictionary-based, rule-based and machine learning, as well as hybrid chemical named entity recognition approaches with their applied solutions. We end with an outlook on the pros and cons of these approaches and the types of chemical entities extracted. PMID:24834132

  2. Chemical named entities recognition: a review on approaches and applications.

    PubMed

    Eltyeb, Safaa; Salim, Naomie

    2014-01-01

    The rapid increase in the flow rate of published digital information in all disciplines has resulted in a pressing need for techniques that can simplify the use of this information. The chemistry literature is very rich with information about chemical entities. Extracting molecules and their related properties and activities from the scientific literature to "text mine" these extracted data and determine contextual relationships helps research scientists, particularly those in drug development. One of the most important challenges in chemical text mining is the recognition of chemical entities mentioned in the texts. In this review, the authors briefly introduce the fundamental concepts of chemical literature mining, the textual contents of chemical documents, and the methods of naming chemicals in documents. We sketch out dictionary-based, rule-based and machine learning, as well as hybrid chemical named entity recognition approaches with their applied solutions. We end with an outlook on the pros and cons of these approaches and the types of chemical entities extracted.

  3. Event extraction of bacteria biotopes: a knowledge-intensive NLP-based approach

    PubMed Central

    2012-01-01

    Background Bacteria biotopes cover a wide range of diverse habitats including animal and plant hosts, natural, medical and industrial environments. The high volume of publications in the microbiology domain provides a rich source of up-to-date information on bacteria biotopes. This information, as found in scientific articles, is expressed in natural language and is rarely available in a structured format, such as a database. This information is of great importance for fundamental research and microbiology applications (e.g., medicine, agronomy, food, bioenergy). The automatic extraction of this information from texts will provide a great benefit to the field. Methods We present a new method for extracting relationships between bacteria and their locations using the Alvis framework. Recognition of bacteria and their locations was achieved using a pattern-based approach and domain lexical resources. For the detection of environment locations, we propose a new approach that combines lexical information and the syntactic-semantic analysis of corpus terms to overcome the incompleteness of lexical resources. Bacteria location relations extend over sentence borders, and we developed domain-specific rules for dealing with bacteria anaphors. Results We participated in the BioNLP 2011 Bacteria Biotope (BB) task with the Alvis system. Official evaluation results show that it achieves the best performance of participating systems. New developments since then have increased the F-score by 4.1 points. Conclusions We have shown that the combination of semantic analysis and domain-adapted resources is both effective and efficient for event information extraction in the bacteria biotope domain. We plan to adapt the method to deal with a larger set of location types and a large-scale scientific article corpus to enable microbiologists to integrate and use the extracted knowledge in combination with experimental data. PMID:22759462

  4. Event extraction of bacteria biotopes: a knowledge-intensive NLP-based approach.

    PubMed

    Ratkovic, Zorana; Golik, Wiktoria; Warnier, Pierre

    2012-06-26

    Bacteria biotopes cover a wide range of diverse habitats including animal and plant hosts, natural, medical and industrial environments. The high volume of publications in the microbiology domain provides a rich source of up-to-date information on bacteria biotopes. This information, as found in scientific articles, is expressed in natural language and is rarely available in a structured format, such as a database. This information is of great importance for fundamental research and microbiology applications (e.g., medicine, agronomy, food, bioenergy). The automatic extraction of this information from texts will provide a great benefit to the field. We present a new method for extracting relationships between bacteria and their locations using the Alvis framework. Recognition of bacteria and their locations was achieved using a pattern-based approach and domain lexical resources. For the detection of environment locations, we propose a new approach that combines lexical information and the syntactic-semantic analysis of corpus terms to overcome the incompleteness of lexical resources. Bacteria location relations extend over sentence borders, and we developed domain-specific rules for dealing with bacteria anaphors. We participated in the BioNLP 2011 Bacteria Biotope (BB) task with the Alvis system. Official evaluation results show that it achieves the best performance of participating systems. New developments since then have increased the F-score by 4.1 points. We have shown that the combination of semantic analysis and domain-adapted resources is both effective and efficient for event information extraction in the bacteria biotope domain. We plan to adapt the method to deal with a larger set of location types and a large-scale scientific article corpus to enable microbiologists to integrate and use the extracted knowledge in combination with experimental data.

  5. Advanced image collection, information extraction, and change detection in support of NN-20 broad area search and analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Petrie, G.M.; Perry, E.M.; Kirkham, R.R.

    1997-09-01

    This report describes the work performed at the Pacific Northwest National Laboratory (PNNL) for the U.S. Department of Energy's Office of Nonproliferation and National Security, Office of Research and Development (NN-20). The work supports the NN-20 Broad Area Search and Analysis, a program initiated by NN-20 to improve the detection and classification of undeclared weapons facilities. Ongoing PNNL research activities are described in three main components: image collection, information processing, and change analysis. The Multispectral Airborne Imaging System, which was developed to collect georeferenced imagery in the visible through infrared regions of the spectrum, and flown on a light aircraft platform, will supply current land use conditions. The image information extraction software (dynamic clustering and end-member extraction) uses imagery, like the multispectral data collected by the PNNL multispectral system, to efficiently generate landcover information. The advanced change detection uses a priori (benchmark) information, current landcover conditions, and user-supplied rules to rank suspect areas by probable risk of undeclared facilities or proliferation activities. These components, both separately and combined, provide important tools for improving the detection of undeclared facilities.

  6. Natural radioactivity in commercial granites extracted near old uranium mines: scientific, economic and social impact of disinformation.

    NASA Astrophysics Data System (ADS)

    Pereira, Dolores; Pereira, Alcides; Neves, Luis

    2015-04-01

    The study of radioactivity in natural stones is a subject of great interest from different points of view: scientific, social and economic. Several previous studies have demonstrated that the radioactivity is dependent, not only on the uranium content, but also on the structures, textures, minerals containing the uranium and degree of weathering of the natural stone. Villavieja granite is extracted in a village where uranium mining was an important activity during the 20th century. Today the mine is closed but the granite is still extracted. Incorrect information about natural radioactivity given to natural stone users, policy makers, construction managers and the general public has caused turmoil in the media for many years. This paper considers problems associated with the communication of reliable information, as well as uncertainties, on natural radioactivity to these audiences.

  7. CMedTEX: A Rule-based Temporal Expression Extraction and Normalization System for Chinese Clinical Notes.

    PubMed

    Liu, Zengjian; Tang, Buzhou; Wang, Xiaolong; Chen, Qingcai; Li, Haodi; Bu, Junzhao; Jiang, Jingzhi; Deng, Qiwen; Zhu, Suisong

    2016-01-01

    Time is an important aspect of information and is very useful for information utilization. The goal of this study was to analyze the challenges of temporal expression (TE) extraction and normalization in Chinese clinical notes by assessing the performance of a rule-based system we developed on a manually annotated corpus (including 1,778 clinical notes of 281 hospitalized patients). To develop the system conveniently, we divided TEs into three categories: direct, indirect and uncertain TEs, and designed different rules for each category. Evaluation on the independent test set shows that our system achieves an F-score of 93.40% on TE extraction, and an accuracy of 92.58% on TE normalization under the "exact-match" criterion. Compared with HeidelTime for Chinese newswire text, our system is much better, indicating that it is necessary to develop a TE extraction and normalization system specific to Chinese clinical notes because of domain differences.
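
    A toy version of the rule-based idea for the "direct" category: one regex rule per expression type. The two patterns are invented for illustration and are far simpler than the CMedTEX rule set:

        import re

        RULES = {
            "date": re.compile(r"\d{4}年\d{1,2}月\d{1,2}日"),    # e.g. 2016年1月5日
            "relative": re.compile(r"(?:入院|术)后\d+[天周月]"),  # e.g. 术后3天
        }

        def extract_tes(text):
            # Return (category, matched text, span) for every rule hit.
            return [(name, m.group(), m.span())
                    for name, rx in RULES.items() for m in rx.finditer(text)]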

  8. Mining protein phosphorylation information from biomedical literature using NLP parsing and Support Vector Machines.

    PubMed

    Raja, Kalpana; Natarajan, Jeyakumar

    2018-07-01

    Extraction of protein phosphorylation information from the biomedical literature has gained much attention because of its importance in numerous biological processes. In this study, we propose a text mining methodology consisting of two phases, NLP parsing and SVM classification, to extract phosphorylation information from the literature. First, using NLP parsing we divide the data into three base-forms depending on the biomedical entities related to phosphorylation, and further classify them into ten sub-forms based on their distribution with the phosphorylation keyword. Next, we extract the phosphorylation entity singles/pairs/triplets and apply SVM to classify them using a set of features applicable to each sub-form. The performance of our methodology was evaluated on three corpora, namely the PLC, iProLink and hPP corpora. We obtained promising results of >85% F-score on the ten sub-forms of the training datasets in cross-validation tests. Our system achieved an overall F-score of 93.0% on the iProLink and 96.3% on the hPP corpus test datasets. Furthermore, our proposed system achieved the best performance in cross-corpus evaluation and outperformed the existing system with a recall of 90.1%. The performance analysis of our system on the three corpora reveals that it extracts protein phosphorylation information efficiently in both non-organism-specific general datasets, such as PLC and iProLink, and a human-specific dataset such as the hPP corpus. Copyright © 2018 Elsevier B.V. All rights reserved.

  9. SA-Mot: a web server for the identification of motifs of interest extracted from protein loops

    PubMed Central

    Regad, Leslie; Saladin, Adrien; Maupetit, Julien; Geneix, Colette; Camproux, Anne-Claude

    2011-01-01

    The detection of functional motifs is an important step for the determination of protein functions. We present here a new web server SA-Mot (Structural Alphabet Motif) for the extraction and location of structural motifs of interest from protein loops. Contrary to other methods, SA-Mot does not focus only on functional motifs, but it extracts recurrent and conserved structural motifs involved in structural redundancy of loops. SA-Mot uses the structural word notion to extract all structural motifs from uni-dimensional sequences corresponding to loop structures. Then, SA-Mot provides a description of these structural motifs using statistics computed in the loop data set and in SCOP superfamily, sequence and structural parameters. SA-Mot results correspond to an interactive table listing all structural motifs extracted from a target structure and their associated descriptors. Using this information, the users can easily locate loop regions that are important for the protein folding and function. The SA-Mot web server is available at http://sa-mot.mti.univ-paris-diderot.fr. PMID:21665924

  10. SA-Mot: a web server for the identification of motifs of interest extracted from protein loops.

    PubMed

    Regad, Leslie; Saladin, Adrien; Maupetit, Julien; Geneix, Colette; Camproux, Anne-Claude

    2011-07-01

    The detection of functional motifs is an important step for the determination of protein functions. We present here a new web server SA-Mot (Structural Alphabet Motif) for the extraction and location of structural motifs of interest from protein loops. Contrary to other methods, SA-Mot does not focus only on functional motifs, but it extracts recurrent and conserved structural motifs involved in structural redundancy of loops. SA-Mot uses the structural word notion to extract all structural motifs from uni-dimensional sequences corresponding to loop structures. Then, SA-Mot provides a description of these structural motifs using statistics computed in the loop data set and in SCOP superfamily, sequence and structural parameters. SA-Mot results correspond to an interactive table listing all structural motifs extracted from a target structure and their associated descriptors. Using this information, the users can easily locate loop regions that are important for the protein folding and function. The SA-Mot web server is available at http://sa-mot.mti.univ-paris-diderot.fr.

  11. The utility of an automated electronic system to monitor and audit transfusion practice.

    PubMed

    Grey, D E; Smith, V; Villanueva, G; Richards, B; Augustson, B; Erber, W N

    2006-05-01

    Transfusion laboratories with transfusion committees have a responsibility to monitor transfusion practice and generate improvements in clinical decision-making and red cell usage. However, this can be problematic and expensive because data cannot be readily extracted from most laboratory information systems. To overcome this problem, we developed and introduced a system to electronically extract and collate extensive amounts of data from two laboratory information systems and to link it with ICD10 clinical codes in a new database using standard information technology. Three data files were generated from two laboratory information systems, ULTRA (version 3.2) and TM, using standard information technology scripts. These were patient pre- and post-transfusion haemoglobin, blood group and antibody screen, and cross match and transfusion data. These data together with ICD10 codes for surgical cases were imported into an MS ACCESS database and linked by means of a unique laboratory number. Queries were then run to extract the relevant information and processed in Microsoft Excel for graphical presentation. We assessed the utility of this data extraction system to audit transfusion practice in a 600-bed adult tertiary hospital over an 18-month period. A total of 52 MB of data were extracted from the two laboratory information systems for the 18-month period and together with 2.0 MB theatre ICD10 data enabled case-specific transfusion information to be generated. The audit evaluated 15,992 blood group and antibody screens, 25,344 cross-matched red cell units and 15,455 transfused red cell units. Data evaluated included cross-matched to transfusion ratios and pre- and post-transfusion haemoglobin levels for a range of clinical diagnoses. Data showed significant differences between clinical units and by ICD10 code. This method to electronically extract large amounts of data and linkage with clinical databases has provided a powerful and sustainable tool for monitoring transfusion practice. It has been successfully used to identify areas requiring education, training and clinical guidance and allows for comparison with national haemoglobin-based transfusion guidelines.

  12. Application research on land use remote sensing dynamic monitoring: A case study of Anning district, Lanzhou

    NASA Astrophysics Data System (ADS)

    Zhu, Yunqiang; Zhu, Huazhong; Lu, Heli; Ni, Jianguang; Zhu, Shaoxia

    2005-10-01

    Remote sensing dynamic monitoring of land use can detect changes in land use and update the current land use map, which is important for the rational utilization and scientific management of land resources. This paper discusses the technical procedure of remote sensing dynamic monitoring of land use, including the processing of remote sensing images, the extraction of annual land use change information, field survey, indoor post-processing and accuracy assessment. In particular, we emphasize comparative research on the choice of remote sensing rectification models, image fusion algorithms and accuracy assessment methods. Taking Anning district in Lanzhou as an example, we extract the land use change information of the district during 2002-2003, assess the monitoring accuracy and analyze the reasons for land use change.

  13. Feature extraction via KPCA for classification of gait patterns.

    PubMed

    Wu, Jianning; Wang, Jue; Liu, Li

    2007-06-01

    Automated recognition of gait pattern change is important in medical diagnostics as well as in the early identification of at-risk gait in the elderly. We evaluated the use of Kernel-based Principal Component Analysis (KPCA) to extract more gait features (i.e., to obtain more significant amounts of information about human movement) and thus to improve the classification of gait patterns. 3D gait data of 24 young and 24 elderly participants were acquired using an OPTOTRAK 3020 motion analysis system during normal walking, and a total of 36 gait spatio-temporal and kinematic variables were extracted from the recorded data. KPCA was used first for nonlinear feature extraction to then evaluate its effect on a subsequent classification in combination with learning algorithms such as support vector machines (SVMs). Cross-validation test results indicated that the proposed technique could allow spreading the information about the gait's kinematic structure into more nonlinear principal components, thus providing additional discriminatory information for the improvement of gait classification performance. The feature extraction ability of KPCA was only slightly affected by the choice of kernel function, such as the polynomial and radial basis function kernels. The combination of KPCA and SVM could identify young-elderly gait patterns with 91% accuracy, resulting in a markedly improved performance compared to the combination of PCA and SVM. These results suggest that nonlinear feature extraction by KPCA improves the classification of young-elderly gait patterns, and holds considerable potential for future applications in direct dimensionality reduction and interpretation of multiple gait signals.
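
    The evaluated pipeline maps directly onto standard tooling. A minimal sketch with scikit-learn, using synthetic data in place of the 48-participant, 36-variable gait matrix:

        from sklearn.datasets import make_classification
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import StandardScaler
        from sklearn.decomposition import KernelPCA
        from sklearn.svm import SVC
        from sklearn.model_selection import cross_val_score

        # Placeholder data shaped like the study's: 48 subjects x 36 variables.
        X, y = make_classification(n_samples=48, n_features=36, random_state=0)
        clf = make_pipeline(
            StandardScaler(),
            KernelPCA(n_components=10, kernel="rbf", gamma=0.1),  # nonlinear feature extraction
            SVC(kernel="rbf"),                                    # gait-pattern classifier
        )
        print(cross_val_score(clf, X, y, cv=5).mean())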

  14. Applying Analogical Reasoning Techniques for Teaching XML Document Querying Skills in Database Classes

    ERIC Educational Resources Information Center

    Mitri, Michel

    2012-01-01

    XML has become the most ubiquitous format for exchange of data between applications running on the Internet. Most Web Services provide their information to clients in the form of XML. The ability to process complex XML documents in order to extract relevant information is becoming as important a skill for IS students to master as querying…

  15. Using Information from the Electronic Health Record to Improve Measurement of Unemployment in Service Members and Veterans with mTBI and Post-Deployment Stress

    PubMed Central

    Dillahunt-Aspillaga, Christina; Finch, Dezon; Massengale, Jill; Kretzmer, Tracy; Luther, Stephen L.; McCart, James A.

    2014-01-01

    Objective The purpose of this pilot study is 1) to develop an annotation schema and a training set of annotated notes to support the future development of a natural language processing (NLP) system to automatically extract employment information, and 2) to determine if information about employment status, goals and work-related challenges reported by service members and Veterans with mild traumatic brain injury (mTBI) and post-deployment stress can be identified in the Electronic Health Record (EHR). Design Retrospective cohort study using data from selected progress notes stored in the EHR. Setting Post-deployment Rehabilitation and Evaluation Program (PREP), an in-patient rehabilitation program for Veterans with TBI at the James A. Haley Veterans' Hospital in Tampa, Florida. Participants Service members and Veterans with TBI who participated in the PREP program (N = 60). Main Outcome Measures Documentation of employment status, goals, and work-related challenges reported by service members and recorded in the EHR. Results Two hundred notes were examined and unique vocational information was found indicating a variety of self-reported employment challenges. Current employment status and future vocational goals along with information about cognitive, physical, and behavioral symptoms that may affect return-to-work were extracted from the EHR. The annotation schema developed for this study provides an excellent tool upon which NLP studies can be developed. Conclusions Information related to employment status and vocational history is stored in text notes in the EHR system. Information stored in text does not lend itself to easy extraction or summarization for research and rehabilitation planning purposes. Development of NLP systems to automatically extract text-based employment information provides data that may improve the understanding and measurement of employment in this important cohort. PMID:25541956

  16. Consumerism as a branding opportunity.

    PubMed

    Treash, M; Adams, R

    1998-01-01

    Managing a customer portfolio at the individual level is the most difficult and most promising endeavor. An individual level consumer portfolio does not mean creating marketing materials and advertising campaigns customized for every member of your health plan. What it does mean is developing segmentation models based on consumer preferences extracted directly from your members, not socioeconomic or other demographic models. The most important information to extract is perceptions on how much and what kind of value members want from the organization.

  17. Conventional and Accelerated-Solvent Extractions of Green Tea (Camellia sinensis) for Metabolomics-based Chemometrics

    PubMed Central

    Kellogg, Joshua J.; Wallace, Emily D.; Graf, Tyler N.; Oberlies, Nicholas H.; Cech, Nadja B.

    2018-01-01

    Metabolomics has emerged as an important analytical technique for multiple applications. The value of information obtained from metabolomics analysis depends on the degree to which the entire metabolome is present and the reliability of sample treatment to ensure reproducibility across the study. The purpose of this study was to compare methods of preparing complex botanical extract samples prior to metabolomics profiling. Two extraction methodologies, accelerated solvent extraction and a conventional solvent maceration, were compared using commercial green tea [Camellia sinensis (L.) Kuntze (Theaceae)] products as a test case. The accelerated solvent protocol was first evaluated to ascertain critical factors influencing extraction using a D-optimal experimental design study. The accelerated solvent and conventional extraction methods yielded similar metabolite profiles for the green tea samples studied. The accelerated solvent extraction yielded higher total amounts of extracted catechins, was more reproducible, and required less active bench time to prepare the samples. This study demonstrates the effectiveness of accelerated solvent as an efficient methodology for metabolomics studies. PMID:28787673

  18. The information extraction of Gannan citrus orchard based on the GF-1 remote sensing image

    NASA Astrophysics Data System (ADS)

    Wang, S.; Chen, Y. L.

    2017-02-01

    Gannan has the largest orange production in China and occupies an important position in the world. Extracting citrus orchards quickly and effectively is of great significance for fruit pathogen defense, fruit production and industrial planning. The traditional pixel-based spectral extraction of citrus orchards has low classification accuracy and finds it difficult to avoid the "pepper" phenomenon; under the influence of noise, the phenomenon of different objects having the same spectrum is serious. Taking the citrus planting area of Xunwu County, Ganzhou as the research object, and aiming at the low accuracy of the traditional pixel-based classification method, a decision tree classification method based on an object-oriented rule set is proposed. Firstly, multi-scale segmentation is performed on the GF-1 remote sensing image data of the study area. Subsequently, sample objects are selected for statistical analysis of spectral and geometric features. Finally, combining the concept of decision tree classification, a variety of empirical thresholds on single bands, NDVI, band combinations and object geometry are applied hierarchically to extract information for the research area, implementing multi-scale segmentation with hierarchical decision tree classification. The classification results are verified with a confusion matrix, and the overall Kappa index is 87.91%.
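
    The hierarchical rule set can be pictured as a cascade of threshold tests on each segmented object's attributes. A hedged sketch (all attribute names and threshold values are invented for illustration):

        def classify_object(obj):
            # obj: dict of per-object attributes from multi-scale segmentation.
            if obj["ndvi"] < 0.2:
                return "built-up/bare"          # rule out low-vegetation objects first
            if obj["mean_nir"] < 0.15:
                return "water/shadow"
            if obj["ndvi"] > 0.5 and obj["compactness"] > 0.4:
                return "citrus orchard"         # regular, strongly vegetated patches
            return "other vegetation"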

  19. Dilated contour extraction and component labeling algorithm for object vector representation

    NASA Astrophysics Data System (ADS)

    Skourikhine, Alexei N.

    2005-08-01

    Object boundary extraction from binary images is important for many applications, e.g., image vectorization, automatic interpretation of images containing segmentation results, printed and handwritten documents and drawings, maps, and AutoCAD drawings. Efficient and reliable contour extraction is also important for pattern recognition due to its impact on shape-based object characterization and recognition. The presented contour tracing and component labeling algorithm produces dilated (sub-pixel) contours associated with corresponding regions. The algorithm has the following features: (1) it always produces non-intersecting, non-degenerate contours, including the case of one-pixel wide objects; (2) it associates the outer and inner (i.e., around hole) contours with the corresponding regions during the process of contour tracing in a single pass over the image; (3) it maintains desired connectivity of object regions as specified by 8-neighbor or 4-neighbor connectivity of adjacent pixels; (4) it avoids degenerate regions in both background and foreground; (5) it allows an easy augmentation that will provide information about the containment relations among regions; (6) it has a time complexity that is dominantly linear in the number of contour points. This early component labeling (contour-region association) enables subsequent efficient object-based processing of the image information.

  20. Long-Term Marine Traffic Monitoring for Environmental Safety in the Aegean Sea

    NASA Astrophysics Data System (ADS)

    Giannakopoulos, T.; Gyftakis, S.; Charou, E.; Perantonis, S.; Nivolianitou, Z.; Koromila, I.; Makrygiorgos, A.

    2015-04-01

    The Aegean Sea is characterized by an extremely high marine safety risk, mainly due to the significant increase of the traffic of tankers from and to the Black Sea that pass through narrow straits formed by the 1600 Greek islands. Reducing the risk of a ship accident is therefore vital to all socio-economic and environmental sectors. This paper presents an online long-term marine traffic monitoring work-flow that focuses on extracting aggregated vessel risks using spatiotemporal analysis of multilayer information: vessel trajectories, vessel data, meteorological data, bathymetric / hydrographic data as well as information regarding environmentally important areas (e.g. protected high-risk areas, etc.). A web interface that enables user-friendly spatiotemporal queries is implemented at the frontend, while a series of data mining functionalities extracts aggregated statistics regarding: (a) marine risks and accident probabilities for particular areas (b) trajectories clustering information (c) general marine statistics (cargo types, etc.) and (d) correlation between spatial environmental importance and marine traffic risk. Towards this end, a set of data clustering and probabilistic graphical modelling techniques has been adopted.

  1. Automatic extraction of pavement markings on streets from point cloud data of mobile LiDAR

    NASA Astrophysics Data System (ADS)

    Gao, Yang; Zhong, Ruofei; Tang, Tao; Wang, Liuzhao; Liu, Xianlin

    2017-08-01

    Pavement markings provide an important foundation as they help to keep road users safe. Accurate and comprehensive information about pavement markings assists road regulators and is useful in developing driverless technology. Mobile light detection and ranging (LiDAR) systems offer new opportunities to collect and process accurate pavement marking information. Mobile LiDAR systems can directly obtain the three-dimensional (3D) coordinates of an object, thus capturing the spatial data and intensity of 3D objects in a fast and efficient way. The RGB attribute information of the data points can be obtained from the panoramic camera in the system. In this paper, we present a novel processing method to automatically extract pavement markings using multiple attributes of the laser scanning point cloud from mobile LiDAR data. The method uses the differential grayscale of the RGB colour, the laser pulse reflection intensity, and the differential intensity to identify and extract pavement markings. We utilized point cloud density to remove noise and used morphological operations to eliminate errors. In the application, we tested our method on different sections of roads in Beijing, China, and Buffalo, NY, USA. The results indicated that both correctness (p) and completeness (r) were higher than 90%. The method of this research can be applied to extract pavement markings from the huge point cloud data produced by mobile LiDAR.
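
    The attribute-threshold stage of such a pipeline can be sketched as a joint test on reflectance intensity and grayscale derived from RGB; the synthetic cloud, column layout and thresholds below are assumptions, and the density filtering and morphological clean-up are omitted:

        import numpy as np

        rng = np.random.default_rng(0)
        cloud = rng.random((100000, 7))    # stand-in for X, Y, Z, intensity, R, G, B
        cloud[:, 4:7] *= 255

        intensity = cloud[:, 3]
        gray = cloud[:, 4:7].mean(axis=1)  # grayscale from the RGB attributes
        marking = (intensity > np.percentile(intensity, 90)) & (gray > 150)
        marking_points = cloud[marking]    # candidate pavement-marking points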

  2. Towards a Relation Extraction Framework for Cyber-Security Concepts

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jones, Corinne L; Bridges, Robert A; Huffer, Kelly M

In order to assist security analysts in obtaining information pertaining to their network, such as novel vulnerabilities, exploits, or patches, information retrieval methods tailored to the security domain are needed. As labeled text data is scarce and expensive, we follow developments in semi-supervised NLP and implement a bootstrapping algorithm for extracting security entities and their relationships from text. The algorithm requires little input data, specifically, a few relations or patterns (heuristics for identifying relations), and incorporates an active learning component which queries the user on the most important decisions to prevent drift away from the desired relations. Preliminary testing on a small corpus shows promising results, obtaining a precision of 0.82.
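
    The record above describes a bootstrapping extractor seeded with a few relations or patterns. A minimal, self-contained Python sketch of that idea follows; the corpus, the seed pattern, and the pattern-induction step are invented for illustration, and the active-learning component is omitted.

        import re

        # Hypothetical mini-corpus of security-related sentences.
        corpus = [
            "CVE-2014-0160 affects OpenSSL",
            "CVE-2017-0144 affects SMBv1",
            "OpenSSL is patched by openssl-1.0.1g",
        ]

        # Seed patterns: each regex's two groups capture a (vulnerability, software) pair.
        patterns = {r"(CVE-\d{4}-\d{4,}) affects (\S+)"}
        relations = set()

        for _ in range(2):  # a couple of bootstrap rounds
            # Apply every known pattern to harvest entity pairs.
            for sent in corpus:
                for pat in list(patterns):
                    for m in re.finditer(pat, sent):
                        relations.add(m.groups())
            # Induce new patterns from sentences mentioning a harvested pair.
            for e1, e2 in list(relations):
                for sent in corpus:
                    if e1 in sent and e2 in sent:
                        patterns.add(sent.replace(e1, r"(\S+)").replace(e2, r"(\S+)"))

        print(sorted(relations))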

  3. Drug-drug interaction extraction from the literature using a recursive neural network

    PubMed Central

    Lim, Sangrak; Lee, Kyubum

    2018-01-01

Detecting drug-drug interactions (DDIs) is important because information on DDIs can help prevent adverse effects from drug combinations. Since many new DDI-related papers are published in the biomedical domain, manually extracting DDI information from the literature is a laborious task; text mining, however, can be used to find DDIs in the biomedical literature automatically. Among recently developed neural networks, we use a recursive neural network to improve the performance of DDI extraction. Our recursive neural network model uses a position feature, a subtree containment feature, and an ensemble method to improve the performance of DDI extraction. Compared with the state-of-the-art models, the DDI detection and type classifiers of our model performed 4.4% and 2.8% better, respectively, on the DDIExtraction Challenge’13 test data. We also validated our model on the PK DDI corpus, which consists of two types of DDI data: in vivo DDIs and in vitro DDIs. Compared with the existing model, our detection classifier performed 2.3% and 6.7% better on in vivo and in vitro data, respectively. The results of our validation demonstrate that our model can automatically extract DDIs better than existing models. PMID:29373599

  4. SD-MSAEs: Promoter recognition in human genome based on deep feature extraction.

    PubMed

    Xu, Wenxuan; Zhang, Li; Lu, Yaping

    2016-06-01

The prediction and recognition of promoters in the human genome play an important role in DNA sequence analysis. Entropy, in the Shannon sense, has multiple uses in bioinformatics analysis. Relative entropy estimator methods based on statistical divergence (SD) are used to extract meaningful features that distinguish different regions of DNA sequences. In this paper, we choose context features and use a set of SD methods to select the most effective n-mers for distinguishing promoter regions from other DNA regions in the human genome. From the total possible combinations of n-mers, we obtain four sparse distributions based on promoter and non-promoter training samples. The informative n-mers are selected by optimizing the differentiating extents of these distributions. Specifically, we combine the advantages of statistical divergence and multiple sparse auto-encoders (MSAEs) in deep learning to extract deep features for promoter recognition. We then apply multiple SVMs and a decision model to construct a human promoter recognition method called SD-MSAEs. The framework is flexible in that it can freely integrate new feature extraction or classification models. Experimental results show that our method has high sensitivity and specificity. Copyright © 2016 Elsevier Inc. All rights reserved.

  5. Recent development of feature extraction and classification of multispectral/hyperspectral images: a systematic literature review

    NASA Astrophysics Data System (ADS)

    Setiyoko, A.; Dharma, I. G. W. S.; Haryanto, T.

    2017-01-01

Multispectral and hyperspectral data acquired from satellite sensors have the ability to detect various objects on the earth, ranging from low-scale to high-scale modeling. These data are increasingly being used to produce geospatial information for rapid analysis by running feature extraction or classification processes. Applying the most suitable model for this data mining is still challenging because there are issues regarding accuracy and computational cost. The aim of this research is to develop a better understanding of object feature extraction and classification applied to satellite imagery by systematically reviewing related recent research projects. The method used in this research is based on the PRISMA statement. After deriving important points from trusted sources, pixel-based and texture-based feature extraction emerge as promising techniques to be analyzed further in the recent development of feature extraction and classification.

  6. Uncovering the spatial structure of mobility networks

    NASA Astrophysics Data System (ADS)

    Louail, Thomas; Lenormand, Maxime; Picornell, Miguel; García Cantú, Oliva; Herranz, Ricardo; Frias-Martinez, Enrique; Ramasco, José J.; Barthelemy, Marc

    2015-01-01

The extraction of a clear and simple footprint of the structure of large, weighted and directed networks is a general problem that has relevance for many applications. An important example is seen in origin-destination matrices, which contain the complete information on commuting flows, but are difficult to analyze and compare. We propose here a versatile method, which extracts a coarse-grained signature of mobility networks, in the form of a 2 × 2 matrix that separates the flows into four categories. We apply this method to origin-destination matrices extracted from mobile phone data recorded in 31 Spanish cities. We show that these cities essentially differ by their proportion of two types of flows: integrated (between residential and employment hotspots) and random flows, whose importance increases with city size. Finally, the method allows the determination of categories of networks, and in the mobility case, the classification of cities according to their commuting structure.

  7. Fusion of LBP and SWLD using spatio-spectral information for hyperspectral face recognition

    NASA Astrophysics Data System (ADS)

    Xie, Zhihua; Jiang, Peng; Zhang, Shuai; Xiong, Jinquan

    2018-01-01

Hyperspectral imaging, which records intrinsic spectral information of the skin across different spectral bands, has become important for robust face recognition. The main challenges for hyperspectral face recognition, however, are high data dimensionality, low signal-to-noise ratio, and inter-band misalignment. In this paper, hyperspectral face recognition based on LBP (local binary pattern) and SWLD (simplified Weber local descriptor) is proposed to extract discriminative local features from spatio-spectral fusion information. Firstly, a spatio-spectral fusion strategy based on statistical information is used to obtain discriminative features of hyperspectral face images. Secondly, LBP is applied to extract the orientation of the fused face edges. Thirdly, SWLD is proposed to encode the intensity information in hyperspectral images. Finally, we adopt a symmetric Kullback-Leibler distance to compare the encoded face images. The approach is tested on the Hong Kong Polytechnic University Hyperspectral Face database (PolyUHSFD). Experimental results show that the proposed method has a higher recognition rate (92.8%) than state-of-the-art hyperspectral face recognition algorithms.
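
    The symmetric Kullback-Leibler distance used in the final matching step has a compact closed form. A minimal sketch, with toy 8-bin histograms standing in for the LBP/SWLD descriptors of two encoded face images:

        import numpy as np

        def symmetric_kl(p, q, eps=1e-10):
            """Symmetric Kullback-Leibler distance between two histograms."""
            p = p / p.sum()
            q = q / q.sum()
            p = np.clip(p, eps, None)  # avoid log(0)
            q = np.clip(q, eps, None)
            return 0.5 * (np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))

        # Toy 8-bin LBP-style histograms for two encoded face images.
        h1 = np.array([12, 40, 8, 5, 20, 9, 4, 2], dtype=float)
        h2 = np.array([10, 35, 12, 7, 18, 10, 6, 2], dtype=float)
        print(symmetric_kl(h1, h2))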

  8. Comparison of manual and automated nucleic acid extraction methods from clinical specimens for microbial diagnosis purposes.

    PubMed

    Wozniak, Aniela; Geoffroy, Enrique; Miranda, Carolina; Castillo, Claudia; Sanhueza, Francia; García, Patricia

    2016-11-01

The choice of nucleic acid (NA) extraction method for molecular diagnosis in microbiology is of major importance because of low microbial loads and the differing natures of microorganisms and clinical specimens. The NA yield of different extraction methods has been mostly studied using spiked samples; information from real human clinical specimens is scarce. The purpose of this study was to compare the performance of a manual low-cost extraction method (Qiagen kit or salting-out extraction method) with the automated high-cost MagNAPure Compact method. According to cycle threshold values for different pathogens, MagNAPure is as efficient as Qiagen for NA extraction from noncomplex clinical specimens (nasopharyngeal swab, skin swab, plasma, respiratory specimens). In contrast, according to cycle threshold values for RNAseP, the MagNAPure method may not be an appropriate method for NA extraction from blood. We believe that the MagNAPure's versatility, reduced risk of cross-contamination, and reduced hands-on time compensate for its high cost. Copyright © 2016 Elsevier Inc. All rights reserved.

  9. Comparison of Two Simplification Methods for Shoreline Extraction from Digital Orthophoto Images

    NASA Astrophysics Data System (ADS)

    Bayram, B.; Sen, A.; Selbesoglu, M. O.; Vārna, I.; Petersons, P.; Aykut, N. O.; Seker, D. Z.

    2017-11-01

Coastal ecosystems are very sensitive to external influences. Coastal resources such as sand dunes, coral reefs and mangroves have vital importance in preventing coastal erosion. Human activities also threaten coastal areas. Therefore, changes in coastal areas should be monitored. Up-to-date, accurate shoreline information is indispensable for coastal managers and decision makers. Remote sensing and image processing techniques offer a great opportunity to obtain reliable shoreline information. In the presented study, NIR bands of seven 1:5000 scaled digital orthophoto images of Riga Bay, Latvia have been used. The object-oriented Simple Linear Clustering method has been utilized to extract the shoreline of Riga Bay. The Bend and Douglas-Peucker methods have been used to simplify the extracted shoreline to test the effect of both methods. A photogrammetrically digitized shoreline has been taken as reference data for comparing the obtained results. The accuracy assessment has been carried out with the Digital Shoreline Analysis tool. As a result, the shoreline simplified by the Bend method was found to be closer to the shoreline extracted with the Simple Linear Clustering method.
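
    Of the two simplification methods compared, Douglas-Peucker is the classic recursive, tolerance-based algorithm; the following minimal Python sketch (toy coordinates, arbitrary tolerance) illustrates the idea.

        import math

        def douglas_peucker(points, tol):
            """Recursively simplify a polyline, keeping points whose perpendicular
            distance from the anchor-floater segment exceeds tol."""
            if len(points) < 3:
                return points
            (x1, y1), (x2, y2) = points[0], points[-1]
            seg = math.hypot(x2 - x1, y2 - y1) or 1e-12
            # Find the interior point farthest from the segment.
            dmax, imax = 0.0, 0
            for i, (x, y) in enumerate(points[1:-1], start=1):
                d = abs((x2 - x1) * (y1 - y) - (x1 - x) * (y2 - y1)) / seg
                if d > dmax:
                    dmax, imax = d, i
            if dmax <= tol:
                return [points[0], points[-1]]
            left = douglas_peucker(points[:imax + 1], tol)
            right = douglas_peucker(points[imax:], tol)
            return left[:-1] + right

        shoreline = [(0, 0), (1, 0.1), (2, -0.1), (3, 5), (4, 6), (5, 7)]
        print(douglas_peucker(shoreline, tol=1.0))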

  10. Spectral Regression Based Fault Feature Extraction for Bearing Accelerometer Sensor Signals

    PubMed Central

    Xia, Zhanguo; Xia, Shixiong; Wan, Ling; Cai, Shiyu

    2012-01-01

    Bearings are not only the most important element but also a common source of failures in rotary machinery. Bearing fault prognosis technology has been receiving more and more attention recently, in particular because it plays an increasingly important role in avoiding the occurrence of accidents. Therein, fault feature extraction (FFE) of bearing accelerometer sensor signals is essential to highlight representative features of bearing conditions for machinery fault diagnosis and prognosis. This paper proposes a spectral regression (SR)-based approach for fault feature extraction from original features including time, frequency and time-frequency domain features of bearing accelerometer sensor signals. SR is a novel regression framework for efficient regularized subspace learning and feature extraction technology, and it uses the least squares method to obtain the best projection direction, rather than computing the density matrix of features, so it also has the advantage in dimensionality reduction. The effectiveness of the SR-based method is validated experimentally by applying the acquired vibration signals data to bearings. The experimental results indicate that SR can reduce the computation cost and preserve more structure information about different bearing faults and severities, and it is demonstrated that the proposed feature extraction scheme has an advantage over other similar approaches. PMID:23202017

  11. Temporal data representation, normalization, extraction, and reasoning: A review from clinical domain

    PubMed Central

    Madkour, Mohcine; Benhaddou, Driss; Tao, Cui

    2016-01-01

Background and Objective We live our lives by the calendar and the clock, but time is also an abstraction, even an illusion. The sense of time can be both domain-specific and complex, and is often left implicit, requiring significant domain knowledge to accurately recognize and harness. In the clinical domain, the momentum gained from recent advances in infrastructure and governance practices has enabled the collection of a tremendous amount of data at each moment in time. Electronic Health Records (EHRs) have paved the way to making these data available for practitioners and researchers. However, temporal data representation, normalization, extraction and reasoning are very important for mining such massive data and therefore for constructing the clinical timeline. The objective of this work is to provide an overview of the problem of constructing a timeline at the clinical point of care and to summarize the state-of-the-art in processing temporal information in clinical narratives. Methods This review surveys the methods used in three important areas: modeling and representation of time, medical NLP methods for extracting time, and methods of time reasoning and processing. The review emphasizes the gap between present methods and semantic web technologies and identifies possible combinations of the two. Results The main finding of this review is the importance of time processing not only in constructing timelines and clinical decision support systems but also as a vital component of EHR data models and operations. Conclusions Extracting temporal information from clinical narratives is a challenging task. The inclusion of ontologies and the semantic web will lead to better assessment of the annotation task and, together with medical NLP techniques, will help resolve granularity and co-reference resolution problems. PMID:27040831

  12. [Application of regular expression in extracting key information from Chinese medicine literatures about re-evaluation of post-marketing surveillance].

    PubMed

    Wang, Zhifei; Xie, Yanming; Wang, Yongyan

    2011-10-01

Computerized extraction of information from the Chinese medicine literature is more convenient than hand searching: it can simplify the search process and improve accuracy. Among the many computerized extraction methods now in use, regular expressions are particularly efficient for extracting useful information in research. This article focuses on applying regular expressions to extract information from the Chinese medicine literature. Two practical examples are reported, using regular expressions to extract "case number (non-terminology)" and "efficacy rate (subgroups for related information identification)", which illustrate how to extract information from the Chinese medicine literature by this method.
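
    As an illustration of the technique, the sketch below uses Python regular expressions to pull case numbers and an efficacy rate out of a toy abstract sentence; the sentence and patterns are invented (and in English, whereas the cited study targeted Chinese text), but the mechanism is the same.

        import re

        # Hypothetical abstract sentence (translated); the cited study worked on
        # Chinese-language text, but the principle is identical.
        text = ("A total of 120 cases were enrolled in the treatment group "
                "and 60 cases in the control group; the total efficacy rate "
                "was 91.7% versus 78.3% in controls.")

        case_numbers = re.findall(r"(\d+)\s+cases", text)
        efficacy_rates = re.findall(r"efficacy rate\s+was\s+([\d.]+)%", text)

        print(case_numbers)    # ['120', '60']
        print(efficacy_rates)  # ['91.7']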

  13. Comparative sensitivity of Centroptilum triangulifer, Ceriodaphnia dubia and Daphnia magna to standard salt and copper reference toxicants

    EPA Science Inventory

    Development of methods for assessing exposure and effects of produced waters from energy and mineral resource extraction operations on stream invertebrate species is important in order to elucidate environmentally relevant information. Centroptilum triangulifer is a parthenogene...

  14. Application of Monte Carlo algorithms to the Bayesian analysis of the Cosmic Microwave Background

    NASA Technical Reports Server (NTRS)

    Jewell, J.; Levin, S.; Anderson, C. H.

    2004-01-01

    Power spectrum estimation and evaluation of associated errors in the presence of incomplete sky coverage; nonhomogeneous, correlated instrumental noise; and foreground emission are problems of central importance for the extraction of cosmological information from the cosmic microwave background (CMB).

  15. Semi-automatic building extraction in informal settlements from high-resolution satellite imagery

    NASA Astrophysics Data System (ADS)

    Mayunga, Selassie David

The extraction of man-made features from digital remotely sensed images is considered an important step underpinning management of human settlements in any country. Man-made features, and buildings in particular, are required for a variety of applications such as urban planning, creation of geographical information system (GIS) databases, and urban city models. Traditional man-made feature extraction methods are expensive in terms of equipment, are labour intensive, need well-trained personnel, and cannot cope with changing environments, particularly in dense urban settlement areas. This research presents an approach for extracting buildings in dense informal settlement areas using high-resolution satellite imagery. The proposed system uses a novel strategy of extracting a building by measuring a single point at the approximate centre of the building. The fine measurement of the building outline is then effected using a modified snake model. The original snake model on which this framework is based incorporates an external constraint energy term tailored to preserving the convergence properties of the snake model; its application to unstructured objects negatively affects their actual shapes. The external constraint energy term was removed from the original snake model formulation, thereby giving the model the ability to cope with the high variability of building shapes in informal settlement areas. The proposed building extraction system was tested on two areas with different situations. The first area was Tungi in Dar Es Salaam, Tanzania, where three sites were tested. This area is characterized by informal settlements, which are illegally established within the city boundaries. The second area was Oromocto in New Brunswick, Canada, where two sites were tested. The Oromocto area is mostly flat and the buildings are constructed using similar materials. Qualitative and quantitative measures were employed to evaluate the accuracy of the results as well as the performance of the system; these were based on visual inspection and on comparison of the measured coordinates with reference data, respectively. In the course of this process, a mean area coverage of 98% was achieved for the Dar Es Salaam test sites, which globally indicates that the extracted building polygons were close to the ground truth data. Furthermore, the proposed system reduced the time needed to extract a single building by 32%. Although the extracted building polygons are within the perimeter of the ground truth data, some of the extracted building polygons were visually somewhat distorted. This implies that an interactive post-editing process is necessary for cartographic representation.

  16. Domain-independent information extraction in unstructured text

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Irwin, N.H.

Extracting information from unstructured text has become an important research area in recent years due to the large amount of text now electronically available. This status report describes the findings and work done during the second year of a two-year Laboratory Directed Research and Development project. Building on the first year's work of identifying important entities, this report details techniques used to group words into semantic categories and to output templates containing selective document content. Using word profiles and category clustering derived during a training run, the time-consuming knowledge-building task can be avoided. Though the output still lacks completeness when compared to systems with domain-specific knowledge bases, the results look promising. The two approaches are compatible and could complement each other within the same system. Domain-independent approaches retain appeal because a system that adapts and learns will soon outpace a system with any amount of a priori knowledge.

  17. A hybrid sales forecasting scheme by combining independent component analysis with K-means clustering and support vector regression.

    PubMed

    Lu, Chi-Jie; Chang, Chi-Chang

    2014-01-01

Sales forecasting plays an important role in operating a business since it can be used to determine the required inventory level to meet consumer demand and avoid the problem of under/overstocking. Improving the accuracy of sales forecasting has become an important issue in operating a business. This study proposes a hybrid sales forecasting scheme by combining independent component analysis (ICA) with K-means clustering and support vector regression (SVR). The proposed scheme first uses ICA to extract hidden information from the observed sales data. The extracted features are then applied to the K-means algorithm for clustering the sales data into several disjoint clusters. Finally, SVR forecasting models are applied to each group to generate the final forecasting results. Experimental results from information technology (IT) product agent sales data reveal that the proposed sales forecasting scheme outperforms the three comparison models and hence provides an efficient alternative for sales forecasting.
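
    A minimal sketch of the three-stage pipeline using scikit-learn, with randomly generated data standing in for the sales series; the component and cluster counts are arbitrary choices for illustration.

        import numpy as np
        from sklearn.decomposition import FastICA
        from sklearn.cluster import KMeans
        from sklearn.svm import SVR

        rng = np.random.default_rng(0)
        X = rng.uniform(-1, 1, size=(200, 6))      # toy lagged sales features
        y = 2 * X[:, 0] + rng.normal(0, 0.1, 200)  # toy sales target

        # 1) ICA extracts hidden independent components from the observed data.
        S = FastICA(n_components=3, random_state=0).fit_transform(X)

        # 2) K-means groups the samples into disjoint clusters in the ICA space.
        labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(S)

        # 3) One SVR per cluster produces the final (here in-sample) forecasts.
        preds = np.empty_like(y)
        for c in np.unique(labels):
            idx = labels == c
            preds[idx] = SVR().fit(S[idx], y[idx]).predict(S[idx])

        print("correlation with target:", np.corrcoef(preds, y)[0, 1])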

  18. A Hybrid Sales Forecasting Scheme by Combining Independent Component Analysis with K-Means Clustering and Support Vector Regression

    PubMed Central

    2014-01-01

Sales forecasting plays an important role in operating a business since it can be used to determine the required inventory level to meet consumer demand and avoid the problem of under/overstocking. Improving the accuracy of sales forecasting has become an important issue in operating a business. This study proposes a hybrid sales forecasting scheme by combining independent component analysis (ICA) with K-means clustering and support vector regression (SVR). The proposed scheme first uses ICA to extract hidden information from the observed sales data. The extracted features are then applied to the K-means algorithm for clustering the sales data into several disjoint clusters. Finally, SVR forecasting models are applied to each group to generate the final forecasting results. Experimental results from information technology (IT) product agent sales data reveal that the proposed sales forecasting scheme outperforms the three comparison models and hence provides an efficient alternative for sales forecasting. PMID:25045738

  19. Feature extraction inspired by V1 in visual cortex

    NASA Astrophysics Data System (ADS)

    Lv, Chao; Xu, Yuelei; Zhang, Xulei; Ma, Shiping; Li, Shuai; Xin, Peng; Zhu, Mingning; Ma, Hongqiang

    2018-04-01

Target feature extraction plays an important role in pattern recognition. It is the most complicated activity in the brain mechanism of biological vision. Inspired by the strong ability of the primary visual cortex (V1) to extract dynamic and static features, a visual perception model is proposed. Firstly, 28 spatio-temporal filters with different orientations, a half-squaring operation and divisive normalization are adopted to obtain the responses of V1 simple cells; then, an adjustable parameter is added to the output weight so that the response of complex cells is obtained. Experimental results indicate that the proposed V1 model can perceive motion information well and has a good edge detection capability. The model inspired by V1 performs well in feature extraction and effectively combines brain-inspired intelligence with computer vision.
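
    The two nonlinearities named in the record, half-squaring and divisive normalization, are easy to state concretely. A minimal numpy sketch, with random numbers standing in for the outputs of the 28 oriented spatio-temporal filters and an assumed normalization constant:

        import numpy as np

        def simple_cell_responses(filter_outputs, sigma=0.1):
            """Half-squaring followed by divisive normalization, the two
            nonlinearities applied to the spatio-temporal filter outputs."""
            # Half-squaring: rectify, then square.
            half_squared = np.maximum(filter_outputs, 0.0) ** 2
            # Divisive normalization: each response is divided by the pooled
            # activity of all filters plus a stabilizing constant.
            return half_squared / (sigma ** 2 + half_squared.sum(axis=0, keepdims=True))

        # Toy outputs of 28 oriented spatio-temporal filters at 5 image locations.
        outputs = np.random.default_rng(1).normal(size=(28, 5))
        responses = simple_cell_responses(outputs)
        print(responses.shape)  # (28, 5)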

  20. Waterbodies Extraction from LANDSAT8-OLI Imagery Using a Water Index-Guided Stochastic Fully-Connected Conditional Random Field Model and the Support Vector Machine

    NASA Astrophysics Data System (ADS)

    Wang, X.; Xu, L.

    2018-04-01

One of the most important applications of remote sensing classification is water extraction. The water index (WI) based on Landsat images is one of the most common ways to distinguish water bodies from other land surface features. However, conventional WI methods take into account spectral information from only a limited number of bands, so their accuracy may be constrained in areas covered with snow/ice, clouds, etc. An accurate and robust water extraction method is therefore essential. The support vector machine (SVM), which uses the spectral information of all bands, can reduce these classification errors to some extent. Nevertheless, SVM, which barely considers spatial information, is relatively sensitive to noise in local regions. The conditional random field (CRF), which considers both spatial and spectral information, has proven able to compensate for these limitations. Hence, in this paper, we develop a systematic water extraction method that takes advantage of the complementarity between the SVM and a water index-guided stochastic fully-connected conditional random field (SVM-WIGSFCRF) to address the above issues. In addition, we comprehensively evaluate the reliability and accuracy of the proposed method using Landsat-8 Operational Land Imager (OLI) images of one test site. We assess the method's performance with the following accuracy metrics: Omission Error (OE), Commission Error (CE), Kappa coefficient (KP), and Total Error (TE). Experimental results show that the new method can improve target detection accuracy under complex and changeable environments.
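
    To make the complementarity concrete, the sketch below computes a standard water index (NDWI from the green and NIR bands) and fuses it with an SVM posterior. The data are random toy values and the fusion is a naive average; the paper's actual fusion is the water index-guided stochastic fully-connected CRF, which is not reproduced here.

        import numpy as np
        from sklearn.svm import SVC

        rng = np.random.default_rng(2)
        # Toy "pixels": 7 OLI band reflectances per pixel (bands 1-7).
        pixels = rng.uniform(0.0, 0.6, size=(1000, 7))
        labels = rng.integers(0, 2, size=1000)  # 1 = water, 0 = non-water (toy)

        # Water index: NDWI = (green - NIR) / (green + NIR); for Landsat-8 OLI
        # green is band 3 and NIR is band 5 (indices 2 and 4 here).
        ndwi = (pixels[:, 2] - pixels[:, 4]) / (pixels[:, 2] + pixels[:, 4] + 1e-9)

        # An SVM trained on all bands gives a second, spectrally richer opinion.
        svm_prob = SVC(probability=True).fit(pixels, labels).predict_proba(pixels)[:, 1]

        # Naive fusion: average the water-index evidence with the SVM probability.
        water_mask = (0.5 * (ndwi > 0).astype(float) + 0.5 * svm_prob) > 0.5
        print(water_mask.sum(), "pixels classified as water")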

  1. Eco-friendly approach for nanoparticles synthesis and mechanism behind antibacterial activity of silver and anticancer activity of gold nanoparticles.

    PubMed

    Patil, Maheshkumar Prakash; Kim, Gun-Do

    2017-01-01

This review covers general information about eco-friendly processes for the synthesis of silver nanoparticles (AgNPs) and gold nanoparticles (AuNPs) and focuses on the mechanism of the antibacterial activity of AgNPs and the anticancer activity of AuNPs. Biomolecules in plant extracts are involved in the reduction of metal ions to nanoparticles in a one-step, eco-friendly synthesis process. Natural plant extracts contain a wide range of metabolites, including carbohydrates, alkaloids, terpenoids, phenolic compounds, and enzymes. A variety of plant species and plant parts have been successfully extracted and utilized for AgNP and AuNP syntheses. Green-synthesized nanoparticles eliminate the need for a stabilizing and capping agent and show shape- and size-dependent biological activities. Here, we describe some of the plant extracts involved in nanoparticle synthesis, characterization methods, and biological applications. Nanoparticles are important in the field of pharmaceuticals for their strong antibacterial and anticancer activity. Considering the importance and uniqueness of this concept, the synthesis, characterization, and application of AgNPs and AuNPs are discussed in this review.

  2. Challenges in Managing Information Extraction

    ERIC Educational Resources Information Center

    Shen, Warren H.

    2009-01-01

This dissertation studies information extraction (IE), the problem of extracting structured information from unstructured data. Example IE tasks include extracting person names from news articles, product information from e-commerce Web pages, street addresses from emails, and names of emerging music bands from blogs. IE is an increasingly…

  3. Remote Sensing Extraction of Stopes and Tailings Ponds in an Ultra-Low-Grade Iron Mining Area

    NASA Astrophysics Data System (ADS)

    Ma, B.; Chen, Y.; Li, X.; Wu, L.

    2018-04-01

With the development of the economy, global demand for steel has accelerated since 2000, and mining of iron ore has intensified accordingly. Ultra-low-grade iron ore has been extracted by open-pit mining and processed on a massive scale since 2001 in Kuancheng County, Hebei Province. There are large-scale stopes and tailings ponds in this area, and it is important to extract their spatial distribution for environmental protection and disaster prevention. A remote sensing method for extracting stopes and tailings ponds based on spectral characteristics is studied, using Landsat 8 OLI imagery and ground spectral data. The overall accuracy of extraction is 95.06 %. In addition, tailings ponds are distinguished from stopes based on thermal characteristics by use of a temperature image. The results could provide decision support for environmental protection, disaster prevention, and ecological restoration in the ultra-low-grade iron ore mining area.

  4. Time-dependent analysis of dosage delivery information for patient-controlled analgesia services.

    PubMed

    Kuo, I-Ting; Chang, Kuang-Yi; Juan, De-Fong; Hsu, Steen J; Chan, Chia-Tai; Tsou, Mei-Yung

    2018-01-01

Pain relief is an essential part of perioperative care and plays an important role in medical quality improvement. Patient-controlled analgesia (PCA) is a method that allows a patient to self-administer small boluses of analgesic to relieve subjective pain. PCA logs from the infusion pump consist of text messages that record all events during therapy. Dosage information can be extracted from PCA logs to provide easily understood features, and analyzing this information over time helps characterize the course of a patient's pain relief. To explore the trend of pain relief requirements, we developed a PCA dosage information generator (PCA DIG) to extract meaningful messages from PCA logs during the first 48 hours of therapy. PCA dosage information, including consumption, delivery, infusion rate, and the ratio between demand and delivery, is presented as corresponding values in four successive time frames. Time-dependent statistical analysis demonstrated that analgesia requirements decreased gradually over time. These findings are compatible with clinical observations and provide valuable information about strategies to customize postoperative pain management.

  5. Conventional and accelerated-solvent extractions of green tea (camellia sinensis) for metabolomics-based chemometrics.

    PubMed

    Kellogg, Joshua J; Wallace, Emily D; Graf, Tyler N; Oberlies, Nicholas H; Cech, Nadja B

    2017-10-25

    Metabolomics has emerged as an important analytical technique for multiple applications. The value of information obtained from metabolomics analysis depends on the degree to which the entire metabolome is present and the reliability of sample treatment to ensure reproducibility across the study. The purpose of this study was to compare methods of preparing complex botanical extract samples prior to metabolomics profiling. Two extraction methodologies, accelerated solvent extraction and a conventional solvent maceration, were compared using commercial green tea [Camellia sinensis (L.) Kuntze (Theaceae)] products as a test case. The accelerated solvent protocol was first evaluated to ascertain critical factors influencing extraction using a D-optimal experimental design study. The accelerated solvent and conventional extraction methods yielded similar metabolite profiles for the green tea samples studied. The accelerated solvent extraction yielded higher total amounts of extracted catechins, was more reproducible, and required less active bench time to prepare the samples. This study demonstrates the effectiveness of accelerated solvent as an efficient methodology for metabolomics studies. Copyright © 2017. Published by Elsevier B.V.

  6. Using Airborne Remote Sensing to Increase Situational Awareness in Civil Protection and Humanitarian Relief - the Importance of User Involvement

    NASA Astrophysics Data System (ADS)

    Römer, H.; Kiefl, R.; Henkel, F.; Wenxi, C.; Nippold, R.; Kurz, F.; Kippnich, U.

    2016-06-01

Enhancing situational awareness in real-time (RT) civil protection and emergency response scenarios requires the development of comprehensive monitoring concepts combining classical remote sensing disciplines with geospatial information science. In the VABENE++ project of the German Aerospace Center (DLR), monitoring tools are being developed in which innovative data acquisition approaches are combined with information extraction as well as the generation and dissemination of information products to specific users. DLR's 3K and 4k camera systems, which allow for RT acquisition and pre-processing of high-resolution aerial imagery, are applied in two application examples conducted with end users: a civil protection exercise with humanitarian relief organisations and a large open-air music festival in cooperation with a festival organising company. This study discusses how airborne remote sensing can significantly contribute to both situational assessment and awareness, focusing on the downstream processes required for extracting information from imagery and for visualising and disseminating imagery in combination with other geospatial information. Valuable user feedback and impetus for further developments have been obtained from both applications, referring to innovations in thematic image analysis (supporting festival site management) and product dissemination (editable web services). Thus, this study emphasises the important role of user involvement in application-related research, i.e. aligning it closely with users' requirements.

  7. Structural study of complexes formed by acidic and neutral organophosphorus reagents

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Braatz, Alexander D.; Antonio, Mark R.; Nilsson, Mikael

The coordination of the trivalent 4f ions, Ln = La 3+, Dy 3+, and Lu 3+, with neutral and acidic organophosphorus reagents, both individually and combined, was studied by use of X-ray absorption spectroscopy. These studies provide metrical information about the interatomic interactions between these cations and the ligands tri- n-butyl phosphate (TBP) and di- n-butyl phosphoric acid (HDBP), whose behavior is of practical importance to chemical separation processes that are currently used on an industrial scale. Previous studies have suggested the existence of complexes involving a mixture of ligands, accounting for extraction synergy. Through systematic variation of the aqueous phase acidity and extractant concentration and combination, we have found that complexes with Ln and TBP : HDBP at any mixture and HDBP alone involve direct Ln–O interactions involving 6 oxygen atoms and distant Ln–P interactions involving on average 3–5 phosphorus atoms per Ln ion. It was also found that Ln complexes formed by TBP alone seem to favor eight oxygen coordination, though we were unable to obtain metrical results regarding the distant Ln–P interactions due to the low signal attributed to a lower concentration of Ln ions in the organic phases. Our study does not support the existence of mixed Ln–TBP–HDBP complexes but, rather, indicates that the lanthanides are extracted as either Ln–HDBP complexes or Ln–TBP complexes and that these complexes exist in different ratios depending on the conditions of the extraction system. Furthermore, this fundamental structural information offers insight into the solvent extraction processes that are taking place and is of particular importance to issues arising from the separation and disposal of radioactive materials from used nuclear fuel.

  8. Structural study of complexes formed by acidic and neutral organophosphorus reagents

    DOE PAGES

    Braatz, Alexander D.; Antonio, Mark R.; Nilsson, Mikael

    2016-12-23

The coordination of the trivalent 4f ions, Ln = La 3+, Dy 3+, and Lu 3+, with neutral and acidic organophosphorus reagents, both individually and combined, was studied by use of X-ray absorption spectroscopy. These studies provide metrical information about the interatomic interactions between these cations and the ligands tri- n-butyl phosphate (TBP) and di- n-butyl phosphoric acid (HDBP), whose behavior is of practical importance to chemical separation processes that are currently used on an industrial scale. Previous studies have suggested the existence of complexes involving a mixture of ligands, accounting for extraction synergy. Through systematic variation of the aqueous phase acidity and extractant concentration and combination, we have found that complexes with Ln and TBP : HDBP at any mixture and HDBP alone involve direct Ln–O interactions involving 6 oxygen atoms and distant Ln–P interactions involving on average 3–5 phosphorus atoms per Ln ion. It was also found that Ln complexes formed by TBP alone seem to favor eight oxygen coordination, though we were unable to obtain metrical results regarding the distant Ln–P interactions due to the low signal attributed to a lower concentration of Ln ions in the organic phases. Our study does not support the existence of mixed Ln–TBP–HDBP complexes but, rather, indicates that the lanthanides are extracted as either Ln–HDBP complexes or Ln–TBP complexes and that these complexes exist in different ratios depending on the conditions of the extraction system. Furthermore, this fundamental structural information offers insight into the solvent extraction processes that are taking place and is of particular importance to issues arising from the separation and disposal of radioactive materials from used nuclear fuel.

  9. Extracting Inter-business Relationship from World Wide Web

    NASA Astrophysics Data System (ADS)

    Jin, Yingzi; Matsuo, Yutaka; Ishizuka, Mitsuru

Social relations play an important role in a real community. Interaction patterns reveal relations among actors (such as persons, groups, and companies), which can be merged into valuable information in the form of a network structure. In this paper, we propose a new approach to extract inter-business relationships from the Web. Extraction of the relation between a pair of companies is realized by using a search engine and text processing. Since names of companies may co-appear coincidentally on the Web, we propose an advanced algorithm characterized by the addition of keywords (which we call relation words) to a query. The relation words are obtained from either an annotated corpus or the Web. We show some examples and a comprehensive evaluation of our approach.
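
    A minimal sketch of the co-occurrence idea: compare plain co-appearance counts with counts conditioned on a relation word. The company names, counts, and the hit_count function are hypothetical placeholders for a real search-engine API.

        # Hypothetical stand-in for a web search API returning page counts.
        def hit_count(query: str) -> int:
            fake_index = {
                '"Acme" "Globex"': 12000,
                '"Acme" "Globex" alliance': 850,
                '"Acme" "Globex" lawsuit': 40,
            }
            return fake_index.get(query, 0)

        def relation_score(a: str, b: str, relation_word: str) -> float:
            """Fraction of pages co-mentioning two companies that also contain
            the relation word; coincidental co-appearance scores near zero."""
            co = hit_count(f'"{a}" "{b}"')
            if co == 0:
                return 0.0
            return hit_count(f'"{a}" "{b}" {relation_word}') / co

        for word in ("alliance", "lawsuit"):
            print(word, relation_score("Acme", "Globex", word))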

  10. Applicability of Visual Analytics to Defence and Security Operations

    DTIC Science & Technology

    2011-06-01

It shows the events' importance in the news over time. Topics are extracted from fused video, audio and closed captions. Since viewing video...Detection of Anomalous Maritime Behavior, In Banissi, E. et al. (Eds.) Proceedings of the 12th IEEE International Conference on Information Visualisation

  11. Structural analysis of pyrolytic lignins isolated from switchgrass fast pyrolysis oil

    USDA-ARS?s Scientific Manuscript database

    Structural characterization of lignin extracted from the bio-oil produced by fast pyrolysis of switchgrass (Panicum virgatum) is reported. This new information is important to understanding the utility of lignin as a chemical feedstock in a pyrolysis based biorefinery. Pyrolysis induces a variety of...

  12. A Probability-Based Statistical Method to Extract Water Bodies from TM Images with Missing Information

    NASA Astrophysics Data System (ADS)

    Lian, Shizhong; Chen, Jiangping; Luo, Minghai

    2016-06-01

Water information cannot be accurately extracted from TM images in which true information is lost because of blocking clouds and missing data stripes. Since water is continuously distributed under natural conditions, this paper proposes a new water body extraction method based on probability statistics to improve the accuracy of water information extraction from TM images with missing information. Different kinds of disturbance from clouds and missing data stripes are simulated, and water information is extracted using global histogram matching, local histogram matching, and the probability-based statistical method on the simulated images. Experiments show that a smaller Areal Error and higher Boundary Recall can be obtained using this method compared with the conventional methods.
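
    Global histogram matching, one of the baseline techniques the record compares against, can be written compactly with numpy; the sketch below matches a toy source image to a reference via their cumulative distributions.

        import numpy as np

        def match_histogram(source, reference):
            """Remap source pixel values so their histogram matches the
            reference's, making the two images radiometrically comparable."""
            s_vals, s_counts = np.unique(source.ravel(), return_counts=True)
            r_vals, r_counts = np.unique(reference.ravel(), return_counts=True)
            s_cdf = np.cumsum(s_counts) / source.size
            r_cdf = np.cumsum(r_counts) / reference.size
            # For each source value, find the reference value at the same CDF level.
            mapped = np.interp(s_cdf, r_cdf, r_vals)
            return mapped[np.searchsorted(s_vals, source.ravel())].reshape(source.shape)

        rng = np.random.default_rng(3)
        src = rng.normal(100, 20, size=(64, 64))
        ref = rng.normal(120, 10, size=(64, 64))
        print(match_histogram(src, ref).mean())  # close to the reference mean (~120)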

  13. Defect-Repairable Latent Feature Extraction of Driving Behavior via a Deep Sparse Autoencoder

    PubMed Central

    Taniguchi, Tadahiro; Takenaka, Kazuhito; Bando, Takashi

    2018-01-01

    Data representing driving behavior, as measured by various sensors installed in a vehicle, are collected as multi-dimensional sensor time-series data. These data often include redundant information, e.g., both the speed of wheels and the engine speed represent the velocity of the vehicle. Redundant information can be expected to complicate the data analysis, e.g., more factors need to be analyzed; even varying the levels of redundancy can influence the results of the analysis. We assume that the measured multi-dimensional sensor time-series data of driving behavior are generated from low-dimensional data shared by the many types of one-dimensional data of which multi-dimensional time-series data are composed. Meanwhile, sensor time-series data may be defective because of sensor failure. Therefore, another important function is to reduce the negative effect of defective data when extracting low-dimensional time-series data. This study proposes a defect-repairable feature extraction method based on a deep sparse autoencoder (DSAE) to extract low-dimensional time-series data. In the experiments, we show that DSAE provides high-performance latent feature extraction for driving behavior, even for defective sensor time-series data. In addition, we show that the negative effect of defects on the driving behavior segmentation task could be reduced using the latent features extracted by DSAE. PMID:29462931
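
    A minimal single-layer sketch of the sparse-autoencoder idea in PyTorch, with an L1 penalty on the hidden code standing in for the paper's sparsity mechanism; the layer sizes, penalty weight, and random "sensor" data are illustrative assumptions.

        import torch
        import torch.nn as nn

        class SparseAutoencoder(nn.Module):
            """One layer of a deep sparse autoencoder: an L1 penalty on the
            hidden activations encourages a sparse, low-dimensional code."""
            def __init__(self, n_in=32, n_hidden=4):
                super().__init__()
                self.encoder = nn.Sequential(nn.Linear(n_in, n_hidden), nn.Sigmoid())
                self.decoder = nn.Linear(n_hidden, n_in)

            def forward(self, x):
                z = self.encoder(x)
                return self.decoder(z), z

        model = SparseAutoencoder()
        opt = torch.optim.Adam(model.parameters(), lr=1e-3)
        x = torch.randn(256, 32)  # toy multi-dimensional sensor windows

        for step in range(200):
            recon, z = model(x)
            # Reconstruction loss plus sparsity penalty on the latent code.
            loss = nn.functional.mse_loss(recon, x) + 1e-3 * z.abs().mean()
            opt.zero_grad()
            loss.backward()
            opt.step()
        print(loss.item())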

  14. Extraction of stability and control derivatives from orbiter flight data

    NASA Technical Reports Server (NTRS)

    Iliff, Kenneth W.; Shafer, Mary F.

    1993-01-01

    The Space Shuttle Orbiter has provided unique and important information on aircraft flight dynamics. This information has provided the opportunity to assess the flight-derived stability and control derivatives for maneuvering flight in the hypersonic regime. In the case of the Space Shuttle Orbiter, these derivatives are required to determine if certain configuration placards (limitations on the flight envelope) can be modified. These placards were determined on the basis of preflight predictions and the associated uncertainties. As flight-determined derivatives are obtained, the placards are reassessed, and some of them are removed or modified. Extraction of the stability and control derivatives was justified by operational considerations and not by research considerations. Using flight results to update the predicted database of the orbiter is one of the most completely documented processes for a flight vehicle. This process followed from the requirement for analysis of flight data for control system updates and for expansion of the operational flight envelope. These results show significant changes in many important stability and control derivatives from the preflight database. This paper presents some of the stability and control derivative results obtained from Space Shuttle flights. Some of the limitations of this information are also examined.

  15. Effects of band selection on endmember extraction for forestry applications

    NASA Astrophysics Data System (ADS)

    Karathanassi, Vassilia; Andreou, Charoula; Andronis, Vassilis; Kolokoussis, Polychronis

    2014-10-01

In spectral unmixing theory, data reduction techniques play an important role, as hyperspectral imagery contains an immense amount of data, posing many challenging problems such as data storage, computational efficiency, and the so-called "curse of dimensionality". Feature extraction and feature selection are the two main approaches for dimensionality reduction. Feature extraction techniques reduce the dimensionality of hyperspectral data by applying transforms to the data. Feature selection techniques retain the physical meaning of the data by selecting, from the input hyperspectral dataset, a set of bands that contains the information needed for spectral unmixing. Although feature selection techniques are well known for their dimensionality reduction potential, they are rarely used in the unmixing process. The majority of existing state-of-the-art dimensionality reduction methods set criteria on the spectral information derived from the whole wavelength range in order to define the optimum spectral subspace. These criteria are not associated with any particular application but with data statistics, such as correlation and entropy values. However, each application is associated with specific land cover materials, whose spectral characteristics vary in specific wavelengths. In forestry, for example, many applications focus on tree leaves, in which specific pigments such as chlorophyll and xanthophyll determine the wavelengths where tree species, diseases, etc., can be detected. For such applications, when the unmixing process is applied, the tree species, diseases, etc., are considered the endmembers of interest. This paper investigates the effects of band selection on endmember extraction by exploiting information from the vegetation absorbance spectral zones. More precisely, it explores whether endmember extraction can be optimized when specific sets of initial bands related to leaf spectral characteristics are selected. Experiments comprise the application of well-known signal subspace estimation and endmember extraction methods to hyperspectral imagery of a forest area. Evaluation of the extracted endmembers showed that more forest species can be extracted as endmembers when selected bands are used.

  16. Text feature extraction based on deep learning: a review.

    PubMed

    Liang, Hong; Sun, Xiao; Sun, Yunlei; Gao, Yuan

    2017-01-01

Selection of text feature items is a basic and important task for text mining and information retrieval. Traditional methods of feature extraction require handcrafted features, and hand-designing an effective feature is a lengthy process; for new applications, deep learning can instead acquire effective feature representations from training data. As a new feature extraction method, deep learning has made achievements in text mining. The major difference between deep learning and conventional methods is that deep learning automatically learns features from big data instead of adopting handcrafted features, which depend mainly on the designer's prior knowledge and cannot fully exploit big data. Deep learning can automatically learn feature representations from big data, using models with millions of parameters. This paper first outlines the common methods used in text feature extraction, then surveys the deep learning methods frequently used in text feature extraction and their applications, and finally forecasts the application of deep learning in feature extraction.

  17. MedEx: a medication information extraction system for clinical narratives

    PubMed Central

    Stenner, Shane P; Doan, Son; Johnson, Kevin B; Waitman, Lemuel R; Denny, Joshua C

    2010-01-01

    Medication information is one of the most important types of clinical data in electronic medical records. It is critical for healthcare safety and quality, as well as for clinical research that uses electronic medical record data. However, medication data are often recorded in clinical notes as free-text. As such, they are not accessible to other computerized applications that rely on coded data. We describe a new natural language processing system (MedEx), which extracts medication information from clinical notes. MedEx was initially developed using discharge summaries. An evaluation using a data set of 50 discharge summaries showed it performed well on identifying not only drug names (F-measure 93.2%), but also signature information, such as strength, route, and frequency, with F-measures of 94.5%, 93.9%, and 96.0% respectively. We then applied MedEx unchanged to outpatient clinic visit notes. It performed similarly with F-measures over 90% on a set of 25 clinic visit notes. PMID:20064797

  18. Information extraction system

    DOEpatents

    Lemmond, Tracy D; Hanley, William G; Guensche, Joseph Wendell; Perry, Nathan C; Nitao, John J; Kidwell, Paul Brandon; Boakye, Kofi Agyeman; Glaser, Ron E; Prenger, Ryan James

    2014-05-13

An information extraction system and methods of operating the system are provided. In particular, an information extraction system for performing meta-extraction of named entities of people, organizations, and locations, as well as relationships and events, from text documents is described herein.

  19. Automatically identifying, counting, and describing wild animals in camera-trap images with deep learning.

    PubMed

    Norouzzadeh, Mohammad Sadegh; Nguyen, Anh; Kosmala, Margaret; Swanson, Alexandra; Palmer, Meredith S; Packer, Craig; Clune, Jeff

    2018-06-19

Having accurate, detailed, and up-to-date information about the location and behavior of animals in the wild would improve our ability to study and conserve ecosystems. We investigate the ability to automatically, accurately, and inexpensively collect such data, which could help catalyze the transformation of many fields of ecology, wildlife biology, zoology, conservation biology, and animal behavior into "big data" sciences. Motion-sensor "camera traps" enable collecting wildlife pictures inexpensively, unobtrusively, and frequently. However, extracting information from these pictures remains an expensive, time-consuming, manual task. We demonstrate that such information can be automatically extracted by deep learning, a cutting-edge type of artificial intelligence. We train deep convolutional neural networks to identify, count, and describe the behaviors of 48 species in the 3.2 million-image Snapshot Serengeti dataset. Our deep neural networks automatically identify animals with >93.8% accuracy, and we expect that number to improve rapidly in years to come. More importantly, if our system classifies only images it is confident about, it can automate animal identification for 99.3% of the data while still performing at the same 96.6% accuracy as that of crowdsourced teams of human volunteers, saving >8.4 y (i.e., >17,000 h at 40 h/wk) of human labeling effort on this 3.2 million-image dataset. These efficiency gains highlight the importance of using deep neural networks to automate data extraction from camera-trap images, reducing a roadblock for this widely used technology. Our results suggest that deep learning could enable the inexpensive, unobtrusive, high-volume, and even real-time collection of a wealth of information about vast numbers of animals in the wild. Copyright © 2018 the Author(s). Published by PNAS.
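
    The confidence-gating step described above (automate only the images the network is sure about) reduces to thresholding the top softmax probability; a toy numpy sketch with an assumed threshold:

        import numpy as np

        def route_by_confidence(softmax_probs, threshold=0.95):
            """Split predictions into auto-labeled and human-review sets based
            on the network's top softmax probability."""
            top = softmax_probs.max(axis=1)
            auto = np.where(top >= threshold)[0]   # accepted automatically
            manual = np.where(top < threshold)[0]  # sent to human volunteers
            return auto, manual

        # Toy softmax outputs for 5 images over 3 species.
        probs = np.array([[0.98, 0.01, 0.01],
                          [0.40, 0.35, 0.25],
                          [0.96, 0.02, 0.02],
                          [0.70, 0.20, 0.10],
                          [0.99, 0.005, 0.005]])
        auto, manual = route_by_confidence(probs)
        print(len(auto), "auto-labeled;", len(manual), "for human review")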

  20. Detecting diffusion-diffraction patterns in size distribution phantoms using double-pulsed field gradient NMR: Theory and experiments.

    PubMed

    Shemesh, Noam; Ozarslan, Evren; Basser, Peter J; Cohen, Yoram

    2010-01-21

NMR-observable nuclei undergoing restricted diffusion within confining pores are important reporters for microstructural features of porous media including, inter alia, biological tissues, emulsions and rocks. Diffusion NMR, and especially the single-pulsed field gradient (s-PFG) methodology, is one of the most important noninvasive tools for studying such opaque samples, enabling extraction of important microstructural information from diffusion-diffraction phenomena. However, when the pores are not monodisperse and are characterized by a size distribution, the diffusion-diffraction patterns disappear from the signal decay, and the relevant microstructural information is mostly lost. A recent theoretical study predicted that the diffusion-diffraction patterns in double-PFG (d-PFG) experiments have unique characteristics, such as zero-crossings, that make them more robust with respect to size distributions. In this study, we theoretically compared the signal decay arising from diffusion in isolated cylindrical pores characterized by lognormal size distributions in both s-PFG and d-PFG methodologies using a recently presented general framework for treating diffusion in NMR experiments. We showed the gradual loss of diffusion-diffraction patterns in broadening size distributions in s-PFG and the robustness of the zero-crossings in d-PFG even for very large standard deviations of the size distribution. We then performed s-PFG and d-PFG experiments on well-controlled size distribution phantoms in which the ground truth is known a priori. We showed that the microstructural information, as manifested in the diffusion-diffraction patterns, is lost in the s-PFG experiments, whereas in d-PFG experiments the zero-crossings of the signal persist, from which relevant microstructural information can be extracted. This study provides a proof of concept that d-PFG may be useful in obtaining important microstructural features in samples characterized by size distributions.

  1. Automated Extraction of Substance Use Information from Clinical Texts.

    PubMed

    Wang, Yan; Chen, Elizabeth S; Pakhomov, Serguei; Arsoniadis, Elliot; Carter, Elizabeth W; Lindemann, Elizabeth; Sarkar, Indra Neil; Melton, Genevieve B

    2015-01-01

    Within clinical discourse, social history (SH) includes important information about substance use (alcohol, drug, and nicotine use) as key risk factors for disease, disability, and mortality. In this study, we developed and evaluated a natural language processing (NLP) system for automated detection of substance use statements and extraction of substance use attributes (e.g., temporal and status) based on Stanford Typed Dependencies. The developed NLP system leveraged linguistic resources and domain knowledge from a multi-site social history study, Propbank and the MiPACQ corpus. The system attained F-scores of 89.8, 84.6 and 89.4 respectively for alcohol, drug, and nicotine use statement detection, as well as average F-scores of 82.1, 90.3, 80.8, 88.7, 96.6, and 74.5 respectively for extraction of attributes. Our results suggest that NLP systems can achieve good performance when augmented with linguistic resources and domain knowledge when applied to a wide breadth of substance use free text clinical notes.

  2. Are figure legends sufficient? Evaluating the contribution of associated text to biomedical figure comprehension.

    PubMed

    Yu, Hong; Agarwal, Shashank; Johnston, Mark; Cohen, Aaron

    2009-01-06

Biomedical scientists need to access figures to validate research facts and to formulate or to test novel research hypotheses. However, figures are difficult to comprehend without associated text (e.g., figure legend and other reference text). We are developing automated systems to extract the relevant explanatory information along with figures extracted from full text articles. Such systems could be very useful in improving figure retrieval and in reducing the workload of biomedical scientists, who otherwise have to retrieve and read the entire full-text journal article to determine which figures are relevant to their research. As a crucial step, we studied the importance of associated text in biomedical figure comprehension. Twenty subjects evaluated three figure-text combinations: figure+legend, figure+legend+title+abstract, and figure+full-text. Using a Likert scale, each subject scored each figure+text according to the extent to which the subject thought he/she understood the meaning of the figure, and indicated his/her confidence in the assigned score. Additionally, each subject entered a free text summary for each figure-text. We identified missing information using indicator words present within the text summaries. Both the Likert scores and the missing information were statistically analyzed for differences among the figure-text types. We also evaluated the quality of the text summaries with the ROUGE text-summarization evaluation metric. Our results showed statistically significant differences in figure comprehension when varying levels of text were provided. When the full-text article is not available, presenting just the figure+legend left biomedical researchers lacking 39-68% of the information about a figure as compared to having complete figure comprehension; adding the title and abstract improved the situation, but still left biomedical researchers missing 30% of the information. When the full-text article is available, figure comprehension increased to 86-97%; this indicates that researchers felt that only 3-14% of the necessary information for full figure comprehension was missing when full text was available to them. Clearly there is information in the abstract and in the full text that biomedical scientists deem important for understanding the figures that appear in full-text biomedical articles. We conclude that the texts that appear in full-text biomedical articles are useful for understanding the meaning of a figure, and an effective figure-mining system needs to unlock the information beyond the figure legend. Our work provides important guidance to figure mining systems that extract information only from the figure and figure legend.
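
    ROUGE-1 recall, the variant most often reported for short summaries, is just clipped unigram overlap divided by the reference length; a small self-contained sketch (toy sentences, lowercase whitespace tokenization):

        from collections import Counter

        def rouge_1_recall(candidate: str, reference: str) -> float:
            """ROUGE-1 recall: fraction of reference unigrams covered by the
            candidate summary (per-word counts clipped to the reference)."""
            cand = Counter(candidate.lower().split())
            ref = Counter(reference.lower().split())
            overlap = sum(min(cand[w], n) for w, n in ref.items())
            return overlap / max(sum(ref.values()), 1)

        ref = "the figure shows protein expression increases after treatment"
        cand = "protein expression increases after drug treatment"
        print(round(rouge_1_recall(cand, ref), 3))  # 6 of 8 reference words -> 0.75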

  3. Are figure legends sufficient? Evaluating the contribution of associated text to biomedical figure comprehension

    PubMed Central

    2009-01-01

    Background: Biomedical scientists need to access figures to validate research facts and to formulate or to test novel research hypotheses. However, figures are difficult to comprehend without associated text (e.g., figure legend and other reference text). We are developing automated systems to extract the relevant explanatory information along with figures extracted from full text articles. Such systems could be very useful in improving figure retrieval and in reducing the workload of biomedical scientists, who otherwise have to retrieve and read the entire full-text journal article to determine which figures are relevant to their research. As a crucial step, we studied the importance of associated text in biomedical figure comprehension. Methods: Twenty subjects evaluated three figure-text combinations: figure+legend, figure+legend+title+abstract, and figure+full-text. Using a Likert scale, each subject scored each figure-text combination according to the extent to which the subject thought he/she understood the meaning of the figure, and indicated his/her confidence in the assigned score. Additionally, each subject entered a free-text summary for each figure-text combination. We identified missing information using indicator words present within the text summaries. Both the Likert scores and the missing information were statistically analyzed for differences among the figure-text types. We also evaluated the quality of the text summaries with the ROUGE text-summarization metric. Results: Our results showed statistically significant differences in figure comprehension when varying levels of text were provided. When the full-text article is not available, presenting just the figure+legend left biomedical researchers lacking 39–68% of the information about a figure as compared to having complete figure comprehension; adding the title and abstract improved the situation, but still left biomedical researchers missing 30% of the information. When the full-text article is available, figure comprehension increased to 86–97%; this indicates that researchers felt that only 3–14% of the necessary information for full figure comprehension was missing when full text was available to them. Clearly there is information in the abstract and in the full text that biomedical scientists deem important for understanding the figures that appear in full-text biomedical articles. Conclusion: We conclude that the texts that appear in full-text biomedical articles are useful for understanding the meaning of a figure, and an effective figure-mining system needs to unlock the information beyond the figure legend. Our work provides important guidance to figure-mining systems that extract information only from figures and their legends. PMID:19126221

  4. Numerical linear algebra in data mining

    NASA Astrophysics Data System (ADS)

    Eldén, Lars

    Ideas and algorithms from numerical linear algebra are important in several areas of data mining. We give an overview of linear algebra methods in text mining (information retrieval), pattern recognition (classification of handwritten digits), and PageRank computations for web search engines. The emphasis is on rank reduction as a method of extracting information from a data matrix, low-rank approximation of matrices using the singular value decomposition and clustering, and on eigenvalue methods for network analysis.
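
    As a concrete instance of the rank-reduction theme above, the sketch below computes the best rank-k approximation of a toy term-document matrix via the singular value decomposition; the matrix is random stand-in data, not an example from the overview.

    ```python
    # Rank-k approximation via the SVD (Eckart-Young): the basic operation
    # behind latent semantic analysis of a term-document matrix.
    import numpy as np

    A = np.random.default_rng(0).random((100, 30))      # toy term-document matrix
    U, s, Vt = np.linalg.svd(A, full_matrices=False)

    k = 5                                               # retained rank
    A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]         # best rank-k approximation

    print(np.linalg.norm(A - A_k) / np.linalg.norm(A))  # relative approximation error
    ```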

  5. Cognition-Based Approaches for High-Precision Text Mining

    ERIC Educational Resources Information Center

    Shannon, George John

    2017-01-01

    This research improves the precision of information extraction from free-form text via the use of cognitive-based approaches to natural language processing (NLP). Cognitive-based approaches are an important, and relatively new, area of research in NLP and search, as well as linguistics. Cognitive approaches enable significant improvements in both…

  6. How to Assess Your Training Needs.

    ERIC Educational Resources Information Center

    Ceramics, Glass, and Mineral Products Industry Training Board, Harrow (England).

    In discussing a method for assessing training needs, this paper deals with various phases of training and points out the importance of outside specialists, the recording of information, and the use of alternative methods. Then five case studies are presented, illustrating each of the industrial groups within the Board's scope: extractives, cement…

  7. Relation extraction for biological pathway construction using node2vec.

    PubMed

    Kim, Munui; Baek, Seung Han; Song, Min

    2018-06-13

    Systems biology is an important field for understanding whole biological mechanisms composed of interactions between biological components. One approach for understanding complex and diverse mechanisms is to analyze biological pathways. However, because these pathways consist of important interactions and information on these interactions is disseminated in a large number of biomedical reports, text-mining techniques are essential for extracting these relationships automatically. In this study, we applied node2vec, an algorithmic framework for feature learning in networks, for relationship extraction. To this end, we extracted genes from paper abstracts using pkde4j, a text-mining tool for detecting entities and relationships. Using the extracted genes, a co-occurrence network was constructed and node2vec was used with the network to generate a latent representation. To demonstrate the efficacy of node2vec in extracting relationships between genes, performance was evaluated for gene-gene interactions involved in a type 2 diabetes pathway. Moreover, we compared the results of node2vec to those of baseline methods such as co-occurrence and DeepWalk. Node2vec outperformed existing methods in detecting relationships in the type 2 diabetes pathway, demonstrating that this method is appropriate for capturing the relatedness between pairs of biological entities involved in biological pathways. The results demonstrated that node2vec is useful for automatic pathway construction.
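
    A minimal sketch of the embedding step is given below, assuming the open-source node2vec package and networkx; the gene names and co-occurrence edges are toy placeholders rather than the study's extracted network.

    ```python
    # Hedged sketch: learn node2vec embeddings over a gene co-occurrence graph
    # and use embedding similarity as a proxy for pathway relatedness.
    import networkx as nx
    from node2vec import Node2Vec

    G = nx.Graph()
    G.add_edges_from([("INS", "IRS1"), ("IRS1", "AKT2"), ("AKT2", "SLC2A4"),
                      ("INS", "INSR"), ("INSR", "IRS1")])   # toy co-occurrence edges

    n2v = Node2Vec(G, dimensions=32, walk_length=10, num_walks=50, workers=1)
    model = n2v.fit(window=5, min_count=1)      # trains a gensim Word2Vec model

    print(model.wv.similarity("INS", "IRS1"))   # relatedness of a candidate gene pair
    ```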

  8. Quantification method for the appearance of melanin pigmentation using independent component analysis

    NASA Astrophysics Data System (ADS)

    Ojima, Nobutoshi; Okiyama, Natsuko; Okaguchi, Saya; Tsumura, Norimichi; Nakaguchi, Toshiya; Hori, Kimihiko; Miyake, Yoichi

    2005-04-01

    In the cosmetics industry, skin color is very important because skin color gives a direct impression of the face. In particular, many people suffer from melanin pigmentation such as liver spots and freckles. However, it is very difficult to evaluate melanin pigmentation using conventional colorimetric values because these values contain information on various skin chromophores simultaneously. Therefore, it is necessary to extract the density information of each skin chromophore independently. The isolation of the melanin component image from a single skin image based on independent component analysis (ICA) was reported in 2003; however, that work did not provide a quantification method for melanin pigmentation. This paper introduces a quantification method based on the ICA of a skin color image to isolate melanin pigmentation. The image acquisition system we used consists of commercially available equipment such as digital cameras and lighting sources with polarized light. The images taken were analyzed using ICA to extract the melanin component images, and a Laplacian-of-Gaussian (LoG) filter was applied to extract the pigmented area. The method worked well for skin images, including those showing melanin pigmentation and acne. Finally, the total extracted area corresponded strongly with subjective ratings of the appearance of pigmentation. Further analysis is needed to characterize the appearance of pigmentation in terms of the size of the pigmented area and its spatial gradation.
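
    The isolation-then-filter pipeline might be sketched as follows with scikit-learn's FastICA and a LoG filter; the image, the log-space conversion, and the blob threshold are assumed details for illustration, not the paper's calibrated system.

    ```python
    # Hedged sketch: separate chromophore-like components from RGB pixels with
    # FastICA, then pick out pigmented spots with a Laplacian-of-Gaussian filter.
    import numpy as np
    from scipy.ndimage import gaussian_laplace
    from sklearn.decomposition import FastICA

    rng = np.random.default_rng(1)
    image = rng.random((64, 64, 3)) + 1e-6          # toy skin image, RGB in (0, 1]
    density = -np.log(image.reshape(-1, 3))         # pixel-wise optical density

    ica = FastICA(n_components=2, random_state=0)
    components = ica.fit_transform(density)         # melanin- and hemoglobin-like axes

    melanin = components[:, 0].reshape(64, 64)
    spots = gaussian_laplace(melanin, sigma=2) < -0.01   # LoG blob response (toy cutoff)
    print("pigmented pixels:", int(spots.sum()))
    ```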

  9. The Effects of Age and Set Size on the Fast Extraction of Egocentric Distance

    PubMed Central

    Gajewski, Daniel A.; Wallin, Courtney P.; Philbeck, John W.

    2016-01-01

    Angular direction is a source of information about the distance to floor-level objects that can be extracted from brief glimpses (near one's threshold for detection). Age and set size are two factors known to impact the viewing time needed to directionally localize an object, and these were posited to similarly govern the extraction of distance. The question here was whether viewing durations sufficient to support object detection (controlled for age and set size) would also be sufficient to support well-constrained judgments of distance. Regardless of viewing duration, distance judgments were more accurate (less biased towards underestimation) when multiple potential targets were presented, suggesting that the relative angular declinations between the objects are an additional source of useful information. Distance judgments were more precise with additional viewing time, but the benefit did not depend on set size and accuracy did not improve with longer viewing durations. The overall pattern suggests that distance can be efficiently derived from direction for floor-level objects. Controlling for age-related differences in the viewing time needed to support detection was sufficient to support distal localization but only when brief and longer glimpse trials were interspersed. Information extracted from longer glimpse trials presumably supported performance on subsequent trials when viewing time was more limited. This outcome suggests a particularly important role for prior visual experience in distance judgments for older observers. PMID:27398065

  10. Information Retrieval and Text Mining Technologies for Chemistry.

    PubMed

    Krallinger, Martin; Rabal, Obdulia; Lourenço, Anália; Oyarzabal, Julen; Valencia, Alfonso

    2017-06-28

    Efficient access to chemical information contained in scientific literature, patents, technical reports, or the web is a pressing need shared by researchers and patent attorneys from different chemical disciplines. Retrieval of important chemical information in most cases starts with finding relevant documents for a particular chemical compound or family. Targeted retrieval of chemical documents is closely connected to the automatic recognition of chemical entities in the text, which commonly involves the extraction of the entire list of chemicals mentioned in a document, including any associated information. In this Review, we provide a comprehensive and in-depth description of fundamental concepts, technical implementations, and current technologies for meeting these information demands. A strong focus is placed on community challenges addressing systems performance, more particularly CHEMDNER and CHEMDNER patents tasks of BioCreative IV and V, respectively. Considering the growing interest in the construction of automatically annotated chemical knowledge bases that integrate chemical information and biological data, cheminformatics approaches for mapping the extracted chemical names into chemical structures and their subsequent annotation together with text mining applications for linking chemistry with biological information are also presented. Finally, future trends and current challenges are highlighted as a roadmap proposal for research in this emerging field.

  11. Hand and goods judgment algorithm based on depth information

    NASA Astrophysics Data System (ADS)

    Li, Mingzhu; Zhang, Jinsong; Yan, Dan; Wang, Qin; Zhang, Ruiqi; Han, Jing

    2016-03-01

    A tablet computer with a depth camera and a color camera is mounted on a traditional shopping cart, and the two cameras capture the interior of the cart. In shopping cart monitoring, it is very important to determine whether a customer's hand moves goods into or out of the cart. This paper establishes a basic framework for judging whether a hand is empty; it includes hand extraction based on depth information, skin color model building based on WPCA (Weighted Principal Component Analysis), an algorithm for judging handheld products based on motion and skin color information, and a statistical process. The first step ensures the integrity of the hand information and effectively avoids interference from sleeves and other clutter; the second step accurately extracts skin color and eliminates similar-color interference, is robust to lighting, and has the advantages of fast computation and high efficiency; and the third step greatly reduces noise interference and improves accuracy.
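
    The depth-based hand extraction step might look like the following minimal sketch; the depth band and frame values are toy assumptions, not the paper's calibration.

    ```python
    # Hedged sketch: keep only pixels whose depth falls in the band where a
    # hand reaching into the cart is expected, as a first segmentation pass.
    import numpy as np

    def extract_hand_mask(depth_mm, near=400, far=800):
        """Keep pixels whose depth lies in the [near, far] band (values in mm)."""
        return (depth_mm >= near) & (depth_mm <= far)

    frame = np.random.default_rng(3).integers(300, 1500, size=(240, 320))  # toy depth frame
    mask = extract_hand_mask(frame)
    print("candidate hand pixels:", int(mask.sum()))
    ```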

  12. An effective self-assessment based on concept map extraction from test-sheet for personalized learning

    NASA Astrophysics Data System (ADS)

    Liew, Keng-Hou; Lin, Yu-Shih; Chang, Yi-Chun; Chu, Chih-Ping

    2013-12-01

    Examination is a traditional way to assess learners' learning status, progress, and performance after a learning activity. Beyond the test grade, a test sheet carries implicit information such as the test concepts, their relationships, importance, and prerequisites. This implicit information can be extracted to construct a concept map, on the premises that (1) test concepts covered by the same question are strongly related, and (2) test concepts appearing on the same test sheet are related. Concept maps have been successfully employed in many studies to help instructors and learners organize relationships among concepts. However, concept map construction depends on experts, who must spend considerable time and effort organizing the domain knowledge. In addition, previous research on automatic concept map construction considers all learners of a class as a whole and does not address personalized learning. To cope with this problem, this paper proposes a new approach to automatically extract and construct a concept map from the implicit information in a test sheet. The proposed approach can also help learners with self-assessment and self-diagnosis. Finally, an example is given to show the effectiveness of the proposed approach.
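
    The co-occurrence premise above can be sketched directly: link concepts that appear in the same question and weight edges by how often they co-occur. The question-to-concept mapping below is a hypothetical placeholder, not data from the paper.

    ```python
    # Hedged sketch: build a concept graph from question-level co-occurrence.
    from itertools import combinations
    import networkx as nx

    questions = {
        "Q1": ["fractions", "multiplication"],
        "Q2": ["fractions", "division"],
        "Q3": ["division", "remainders"],
    }

    G = nx.Graph()
    for concepts in questions.values():
        for a, b in combinations(concepts, 2):       # same question -> strong relation
            if G.has_edge(a, b):
                G[a][b]["weight"] += 1
            else:
                G.add_edge(a, b, weight=1)

    print(sorted(G.edges(data=True)))
    ```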

  13. Systematically Extracting Metal- and Solvent-Related Occupational Information from Free-Text Responses to Lifetime Occupational History Questionnaires

    PubMed Central

    Friesen, Melissa C.; Locke, Sarah J.; Tornow, Carina; Chen, Yu-Cheng; Koh, Dong-Hee; Stewart, Patricia A.; Purdue, Mark; Colt, Joanne S.

    2014-01-01

    Objectives: Lifetime occupational history (OH) questionnaires often use open-ended questions to capture detailed information about study participants’ jobs. Exposure assessors use this information, along with responses to job- and industry-specific questionnaires, to assign exposure estimates on a job-by-job basis. An alternative approach is to use information from the OH responses and the job- and industry-specific questionnaires to develop programmable decision rules for assigning exposures. As a first step in this process, we developed a systematic approach to extract the free-text OH responses and convert them into standardized variables that represented exposure scenarios. Methods: Our study population comprised 2408 subjects, reporting 11991 jobs, from a case–control study of renal cell carcinoma. Each subject completed a lifetime OH questionnaire that included verbatim responses, for each job, to open-ended questions including job title, main tasks and activities (task), tools and equipment used (tools), and chemicals and materials handled (chemicals). Based on a review of the literature, we identified exposure scenarios (occupations, industries, tasks/tools/chemicals) expected to involve possible exposure to chlorinated solvents, trichloroethylene (TCE) in particular, lead, and cadmium. We then used a SAS macro to review the information reported by study participants to identify jobs associated with each exposure scenario; this was done using previously coded standardized occupation and industry classification codes, and a priori lists of associated key words and phrases related to possibly exposed tasks, tools, and chemicals. Exposure variables representing the occupation, industry, and task/tool/chemicals exposure scenarios were added to the work history records of the study respondents. Our identification of possibly TCE-exposed scenarios in the OH responses was compared to an expert’s independently assigned probability ratings to evaluate whether we missed identifying possibly exposed jobs. Results: Our process added exposure variables for 52 occupation groups, 43 industry groups, and 46 task/tool/chemical scenarios to the data set of OH responses. Across all four agents, we identified possibly exposed task/tool/chemical exposure scenarios in 44–51% of the jobs in possibly exposed occupations. Possibly exposed task/tool/chemical exposure scenarios were found in a nontrivial 9–14% of the jobs not in possibly exposed occupations, suggesting that our process identified important information that would not be captured using occupation alone. Our extraction process was sensitive: for jobs where our extraction of OH responses identified no exposure scenarios and for which the sole source of information was the OH responses, only 0.1% were assessed as possibly exposed to TCE by the expert. Conclusions: Our systematic extraction of OH information found useful information in the task/chemicals/tools responses that was relatively easy to extract and that was not available from the occupational or industry information. The extracted variables can be used as inputs in the development of decision rules, especially for jobs where no additional information, such as job- and industry-specific questionnaires, is available. PMID:24590110
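
    The keyword-based flagging step can be sketched as simple substring matching over the free-text fields; the phrases and job record below are invented stand-ins for the study's a priori lists and data.

    ```python
    # Hedged sketch: flag a job record with a TCE exposure scenario when any
    # a-priori key phrase appears in its task/tools/chemicals responses.
    TCE_PHRASES = ["degreas", "vapor degrease", "trichloroethylene", "metal clean"]

    def flag_tce(job):
        text = " ".join(job.get(f, "") for f in ("task", "tools", "chemicals")).lower()
        return any(p in text for p in TCE_PHRASES)

    job = {"task": "Cleaned metal parts", "tools": "vapor degreaser", "chemicals": ""}
    print(flag_tce(job))  # True: "degreas" matches "degreaser"
    ```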

  14. A mobile unit for memory retrieval in daily life based on image and sensor processing

    NASA Astrophysics Data System (ADS)

    Takesumi, Ryuji; Ueda, Yasuhiro; Nakanishi, Hidenobu; Nakamura, Atsuyoshi; Kakimori, Nobuaki

    2003-10-01

    We developed a mobile unit whose purpose is to support memory retrieval in daily life. In this paper, we describe the unit's two characteristic components: (1) behavior classification with an acceleration sensor, and (2) extraction of environmental differences with image processing technology. In (1), by analyzing the power and frequency of an acceleration sensor aligned with the direction of gravity, the user's activities can be classified into walking, staying, and so on. In (2), by extracting the difference between the beginning and ending scenes of a stay with image processing, the result of the user's actions is recognized as a change in the environment. Using these two techniques, specific scenes of daily life can be extracted, and important information at scene changes can be recorded. In particular, we describe how the unit supports retrieval of important items, such as things left behind or work left half-finished.

  15. Review of Extracting Information From the Social Web for Health Personalization

    PubMed Central

    Karlsen, Randi; Bonander, Jason

    2011-01-01

    In recent years the Web has come into its own as a social platform where health consumers are actively creating and consuming Web content. Moreover, as the Web matures, consumers are gaining access to personalized applications adapted to their health needs and interests. The creation of personalized Web applications relies on extracted information about the users and the content to personalize. The Social Web itself provides many sources of information that can be used to extract information for personalization apart from traditional Web forms and questionnaires. This paper provides a review of different approaches for extracting information from the Social Web for health personalization. We reviewed research literature across different fields addressing the disclosure of health information in the Social Web, techniques to extract that information, and examples of personalized health applications. In addition, the paper includes a discussion of technical and socioethical challenges related to the extraction of information for health personalization. PMID:21278049

  16. Modal-Power-Based Haptic Motion Recognition

    NASA Astrophysics Data System (ADS)

    Kasahara, Yusuke; Shimono, Tomoyuki; Kuwahara, Hiroaki; Sato, Masataka; Ohnishi, Kouhei

    Motion recognition based on sensory information is important for enabling robots to assist humans. Several studies have been carried out on motion recognition based on image information. However, human contact with an object cannot be evaluated precisely by image-based recognition alone, because force information is essential for describing contact motion. In this paper, modal-power-based haptic motion recognition is proposed; modal power reveals information on both position and force and is considered one of the defining features of human motion. A motion recognition algorithm based on linear discriminant analysis is proposed to distinguish between similar motions. Haptic information is extracted using a bilateral master-slave system, and the observed motion is decomposed in terms of primitive functions in a modal space. The experimental results show the effectiveness of the proposed method.
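
    A minimal sketch of the classification stage is shown below, using scikit-learn's linear discriminant analysis on random stand-in features; the paper's actual inputs are modal-power features extracted from the bilateral master-slave system.

    ```python
    # Hedged sketch: classify motions with LDA over modal-power-like features.
    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

    rng = np.random.default_rng(0)
    X = rng.normal(size=(120, 8))            # 120 trials x 8 modal-power features (toy)
    y = rng.integers(0, 3, size=120)         # three motion classes

    clf = LinearDiscriminantAnalysis().fit(X[:100], y[:100])
    print("held-out accuracy:", clf.score(X[100:], y[100:]))
    ```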

  17. Airway extraction from 3D chest CT volumes based on iterative extension of VOI enhanced by cavity enhancement filter

    NASA Astrophysics Data System (ADS)

    Meng, Qier; Kitasaka, Takayuki; Oda, Masahiro; Mori, Kensaku

    2017-03-01

    Airway segmentation is an important step in analyzing chest CT volumes for computerized lung cancer detection, emphysema diagnosis, asthma diagnosis, and pre- and intra-operative bronchoscope navigation. However, obtaining an integrated 3-D airway tree structure from a CT volume is quite a challenging task. This paper presents a novel airway segmentation method based on intensity structure analysis and bronchus shape structure analysis in volumes of interest (VOIs). The method segments the bronchial regions by applying a cavity enhancement filter (CEF) to trace the bronchial tree structure from the trachea. It uses the CEF in each VOI to segment each branch and to predict the positions of the VOIs enveloping the bronchial regions at the next level. At the same time, leakage detection is performed by analyzing the pixel and shape information of airway candidate regions extracted in the VOI. Bronchial regions are finally obtained by unifying the extracted airway regions. The experimental results showed that the proposed method can extract most of the bronchial region in each VOI and yielded good airway segmentation results.

  18. Identification of chemogenomic features from drug–target interaction networks using interpretable classifiers

    PubMed Central

    Tabei, Yasuo; Pauwels, Edouard; Stoven, Véronique; Takemoto, Kazuhiro; Yamanishi, Yoshihiro

    2012-01-01

    Motivation: Drug effects are mainly caused by the interactions between drug molecules and their target proteins including primary targets and off-targets. Identification of the molecular mechanisms behind overall drug–target interactions is crucial in the drug design process. Results: We develop a classifier-based approach to identify chemogenomic features (the underlying associations between drug chemical substructures and protein domains) that are involved in drug–target interaction networks. We propose a novel algorithm for extracting informative chemogenomic features by using L1 regularized classifiers over the tensor product space of possible drug–target pairs. It is shown that the proposed method can extract a very limited number of chemogenomic features without losing drug–target interaction prediction performance, and that the extracted features are biologically meaningful. The extracted substructure–domain association network enables us to suggest ligand chemical fragments specific for each protein domain and ligand core substructures important for a wide range of protein families. Availability: Software is available at the supplemental website. Contact: yamanishi@bioreg.kyushu-u.ac.jp Supplementary Information: Datasets and all results are available at http://cbio.ensmp.fr/~yyamanishi/l1binary/ . PMID:22962471
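
    The core idea can be sketched as follows: represent each drug-target pair by the tensor (outer) product of a substructure fingerprint and a domain profile, then fit an L1-regularized classifier so that the surviving nonzero weights mark informative substructure-domain associations. All data below are random placeholders, not the paper's datasets.

    ```python
    # Hedged sketch: L1-regularized classification over tensor-product features.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    n_pairs, n_sub, n_dom = 200, 20, 15
    drugs = rng.integers(0, 2, size=(n_pairs, n_sub))     # chemical substructures
    prots = rng.integers(0, 2, size=(n_pairs, n_dom))     # protein domains
    X = np.einsum("ij,ik->ijk", drugs, prots).reshape(n_pairs, -1)  # tensor product
    y = rng.integers(0, 2, size=n_pairs)                  # interacts or not

    clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.1).fit(X, y)
    nonzero = np.flatnonzero(clf.coef_)                   # informative associations
    print(f"{nonzero.size} of {X.shape[1]} substructure-domain features retained")
    ```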

  19. Extracting genetic alteration information for personalized cancer therapy from ClinicalTrials.gov

    PubMed Central

    Xu, Jun; Lee, Hee-Jin; Zeng, Jia; Wu, Yonghui; Zhang, Yaoyun; Huang, Liang-Chin; Johnson, Amber; Holla, Vijaykumar; Bailey, Ann M; Cohen, Trevor; Meric-Bernstam, Funda; Bernstam, Elmer V

    2016-01-01

    Objective: Clinical trials investigating drugs that target specific genetic alterations in tumors are important for promoting personalized cancer therapy. The goal of this project is to create a knowledge base of cancer treatment trials with annotations about genetic alterations from ClinicalTrials.gov. Methods: We developed a semi-automatic framework that combines advanced text-processing techniques with manual review to curate genetic alteration information in cancer trials. The framework consists of a document classification system to identify cancer treatment trials from ClinicalTrials.gov and an information extraction system to extract gene and alteration pairs from the Title and Eligibility Criteria sections of clinical trials. By applying the framework to trials at ClinicalTrials.gov, we created a knowledge base of cancer treatment trials with genetic alteration annotations. We then evaluated each component of the framework against manually reviewed sets of clinical trials and generated descriptive statistics of the knowledge base. Results and Discussion: The automated cancer treatment trial identification system achieved a high precision of 0.9944. Together with the manual review process, it identified 20 193 cancer treatment trials from ClinicalTrials.gov. The automated gene-alteration extraction system achieved a precision of 0.8300 and a recall of 0.6803. After validation by manual review, we generated a knowledge base of 2024 cancer trials that are labeled with specific genetic alteration information. Analysis of the knowledge base revealed the trend of increased use of targeted therapy for cancer, as well as top frequent gene-alteration pairs of interest. We expect this knowledge base to be a valuable resource for physicians and patients who are seeking information about personalized cancer therapy. PMID:27013523
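
    A document classifier of the kind described might be sketched as a TF-IDF pipeline with a linear SVM, as below; this is not the authors' system, and the training snippets are invented.

    ```python
    # Hedged sketch: separating cancer treatment trials from other registry
    # entries with a TF-IDF + linear SVM pipeline.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.pipeline import make_pipeline
    from sklearn.svm import LinearSVC

    train_texts = ["phase II study of erlotinib in EGFR-mutant NSCLC",
                   "observational registry of seasonal influenza vaccination"]
    train_labels = [1, 0]                       # 1 = cancer treatment trial

    clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
    clf.fit(train_texts, train_labels)
    print(clf.predict(["randomized trial of BRAF inhibitor in melanoma"]))
    ```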

  20. Meta-Generalis: A Novel Method for Structuring Information from Radiology Reports

    PubMed Central

    Barbosa, Flavio; Traina, Agma Jucci

    2016-01-01

    Background: A structured report for imaging exams aims at increasing the precision of information retrieval and communication between physicians. However, it is more concise than free text and may limit specialists' descriptions of important findings not covered by pre-defined structures. A computational ontological structure derived from free texts designed by specialists may be a solution for this problem. Therefore, the goal of our study was to develop a methodology for structuring information in radiology reports covering specifications required for the Brazilian Portuguese language, including the terminology to be used. Methods: We gathered 1,701 radiological reports of magnetic resonance imaging (MRI) studies of the lumbosacral spine from three different institutions. Techniques of text mining and ontological conceptualization of the extracted lexical units were used to structure the information. Ten radiologists, specialists in lumbosacral MRI, evaluated the textual superstructure and extracted terminology using an electronic questionnaire. Results: The established methodology consists of six steps: 1) collection of radiology reports of a specific MRI examination; 2) textual decomposition; 3) normalization of lexical units; 4) identification of textual superstructures; 5) conceptualization of candidate terms; and 6) evaluation of superstructures and extracted terminology by experts using an electronic questionnaire. Three different textual superstructures were identified, with terminological variations in the names of their textual categories. The number of candidate terms conceptualized was 4,183, yielding 727 concepts. There were a total of 13,963 relationships between candidate terms and concepts and 789 relationships among concepts. Conclusions: The proposed methodology allowed information to be structured in a more intuitive and practical way. Identification of three textual superstructures, extraction of lexical units, and their normalization and ontological conceptualization were achieved while maintaining references to their respective categories and to the free-text radiology reports. PMID:27580980

  1. Meta-generalis: A novel method for structuring information from radiology reports.

    PubMed

    Barbosa, Flavio; Traina, Agma Jucci; Muglia, Valdair Francisco

    2016-08-24

    A structured report for imaging exams aims at increasing the precision of information retrieval and communication between physicians. However, it is more concise than free text and may limit specialists' descriptions of important findings not covered by pre-defined structures. A computational ontological structure derived from free texts designed by specialists may be a solution for this problem. Therefore, the goal of our study was to develop a methodology for structuring information in radiology reports covering specifications required for the Brazilian Portuguese language, including the terminology to be used. We gathered 1,701 radiological reports of magnetic resonance imaging (MRI) studies of the lumbosacral spine from three different institutions. Techniques of text mining and ontological conceptualization of the extracted lexical units were used to structure the information. Ten radiologists, specialists in lumbosacral MRI, evaluated the textual superstructure and extracted terminology using an electronic questionnaire. The established methodology consists of six steps: 1) collection of radiology reports of a specific MRI examination; 2) textual decomposition; 3) normalization of lexical units; 4) identification of textual superstructures; 5) conceptualization of candidate terms; and 6) evaluation of superstructures and extracted terminology by experts using an electronic questionnaire. Three different textual superstructures were identified, with terminological variations in the names of their textual categories. The number of candidate terms conceptualized was 4,183, yielding 727 concepts. There were a total of 13,963 relationships between candidate terms and concepts and 789 relationships among concepts. The proposed methodology allowed information to be structured in a more intuitive and practical way. Identification of three textual superstructures, extraction of lexical units, and their normalization and ontological conceptualization were achieved while maintaining references to their respective categories and to the free-text radiology reports.

  2. Extracting genetic alteration information for personalized cancer therapy from ClinicalTrials.gov.

    PubMed

    Xu, Jun; Lee, Hee-Jin; Zeng, Jia; Wu, Yonghui; Zhang, Yaoyun; Huang, Liang-Chin; Johnson, Amber; Holla, Vijaykumar; Bailey, Ann M; Cohen, Trevor; Meric-Bernstam, Funda; Bernstam, Elmer V; Xu, Hua

    2016-07-01

    Clinical trials investigating drugs that target specific genetic alterations in tumors are important for promoting personalized cancer therapy. The goal of this project is to create a knowledge base of cancer treatment trials with annotations about genetic alterations from ClinicalTrials.gov. We developed a semi-automatic framework that combines advanced text-processing techniques with manual review to curate genetic alteration information in cancer trials. The framework consists of a document classification system to identify cancer treatment trials from ClinicalTrials.gov and an information extraction system to extract gene and alteration pairs from the Title and Eligibility Criteria sections of clinical trials. By applying the framework to trials at ClinicalTrials.gov, we created a knowledge base of cancer treatment trials with genetic alteration annotations. We then evaluated each component of the framework against manually reviewed sets of clinical trials and generated descriptive statistics of the knowledge base. The automated cancer treatment trial identification system achieved a high precision of 0.9944. Together with the manual review process, it identified 20 193 cancer treatment trials from ClinicalTrials.gov. The automated gene-alteration extraction system achieved a precision of 0.8300 and a recall of 0.6803. After validation by manual review, we generated a knowledge base of 2024 cancer trials that are labeled with specific genetic alteration information. Analysis of the knowledge base revealed the trend of increased use of targeted therapy for cancer, as well as top frequent gene-alteration pairs of interest. We expect this knowledge base to be a valuable resource for physicians and patients who are seeking information about personalized cancer therapy.

  3. Multivariate EMD and full spectrum based condition monitoring for rotating machinery

    NASA Astrophysics Data System (ADS)

    Zhao, Xiaomin; Patel, Tejas H.; Zuo, Ming J.

    2012-02-01

    Early assessment of machinery health condition is of paramount importance today. A sensor network with sensors in multiple directions and locations is usually employed for monitoring the condition of rotating machinery. Extraction of health condition information from these sensors for effective fault detection and fault tracking is always challenging. Empirical mode decomposition (EMD) is an advanced signal processing technology that has been widely used for this purpose. Standard EMD has the limitation in that it works only for a single real-valued signal. When dealing with data from multiple sensors and multiple health conditions, standard EMD faces two problems. First, because of the local and self-adaptive nature of standard EMD, the decomposition of signals from different sources may not match in either number or frequency content. Second, it may not be possible to express the joint information between different sensors. The present study proposes a method of extracting fault information by employing multivariate EMD and full spectrum. Multivariate EMD can overcome the limitations of standard EMD when dealing with data from multiple sources. It is used to extract the intrinsic mode functions (IMFs) embedded in raw multivariate signals. A criterion based on mutual information is proposed for selecting a sensitive IMF. A full spectral feature is then extracted from the selected fault-sensitive IMF to capture the joint information between signals measured from two orthogonal directions. The proposed method is first explained using simple simulated data, and then is tested for the condition monitoring of rotating machinery applications. The effectiveness of the proposed method is demonstrated through monitoring damage on the vane trailing edge of an impeller and rotor-stator rub in an experimental rotor rig.
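
    The IMF-selection step can be sketched with standard (univariate) EMD for brevity, assuming the PyEMD package (EMD-signal) is installed; the paper's method uses the multivariate extension, and the signals below are toy sinusoids rather than machinery data.

    ```python
    # Hedged sketch: decompose a signal with EMD and rank IMFs by mutual
    # information with a reference channel, keeping the most sensitive one.
    import numpy as np
    from PyEMD import EMD
    from sklearn.feature_selection import mutual_info_regression

    t = np.linspace(0, 1, 1024)
    signal = np.sin(40 * np.pi * t) + 0.5 * np.sin(4 * np.pi * t)
    reference = np.sin(40 * np.pi * t)          # stand-in for a second channel

    imfs = EMD()(signal)                        # rows are intrinsic mode functions
    scores = [mutual_info_regression(imf.reshape(-1, 1), reference)[0] for imf in imfs]
    print("most sensitive IMF:", int(np.argmax(scores)))
    ```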

  4. Application of wavelet techniques for cancer diagnosis using ultrasound images: A Review.

    PubMed

    Sudarshan, Vidya K; Mookiah, Muthu Rama Krishnan; Acharya, U Rajendra; Chandran, Vinod; Molinari, Filippo; Fujita, Hamido; Ng, Kwan Hoong

    2016-02-01

    Ultrasound is an important and low-cost imaging modality used to study the internal organs of the human body and blood flow through blood vessels. It uses high-frequency sound waves to acquire images of internal organs and is used to screen normal, benign and malignant tissues of various organs. Healthy and malignant tissues generate different ultrasound echoes; hence, ultrasound provides useful information about potential tumor tissues that can be analyzed for diagnostic purposes before therapeutic procedures. Ultrasound images are affected by speckle noise due to the air gap between the transducer probe and the body. The challenge is to design and develop robust image preprocessing, segmentation and feature extraction algorithms that locate the tumor region and extract subtle information from the isolated tumor region for diagnosis. This information can be revealed using a scale-space technique such as the Discrete Wavelet Transform (DWT), which decomposes an image into images at different scales using low-pass and high-pass filters. These filters help to identify detail, i.e., sudden changes in intensity in the image, which are reflected in the wavelet coefficients. Various texture, statistical and image-based features can be extracted from these coefficients, and the extracted features are subjected to statistical analysis to identify the significant features for discriminating normal and malignant ultrasound images using supervised classifiers. This paper presents a review of wavelet techniques used for preprocessing, segmentation and feature extraction of breast, thyroid, ovarian and prostate cancer using ultrasound images.
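
    A one-level 2-D DWT with simple statistics of the detail sub-bands, of the kind fed to classifiers in the reviewed work, might be sketched as follows with PyWavelets; the image patch is random stand-in data.

    ```python
    # Hedged sketch: 2-D DWT of a toy patch, then basic sub-band statistics.
    import numpy as np
    import pywt

    patch = np.random.default_rng(0).random((128, 128))   # toy image patch
    cA, (cH, cV, cD) = pywt.dwt2(patch, "db4")            # one DWT level

    features = {name: (band.mean(), band.std())
                for name, band in {"H": cH, "V": cV, "D": cD}.items()}
    print(features)
    ```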

  5. High-Resolution Remote Sensing Image Building Extraction Based on Markov Model

    NASA Astrophysics Data System (ADS)

    Zhao, W.; Yan, L.; Chang, Y.; Gong, L.

    2018-04-01

    As resolution increases, remote sensing images carry a heavier information load, more noise, and more complex geometric and texture information, which makes the extraction of building information more difficult. To solve this problem, this paper designs a building extraction method for high-resolution remote sensing images based on a Markov model. The method introduces Contourlet-domain map clustering and a Markov model, capturing and enhancing the contour and texture information of high-resolution remote sensing image features in multiple directions, and further designs a spectral feature index that characterizes "pseudo-buildings" in the built-up area. Through multi-scale segmentation and extraction of image features, fine extraction from built-up areas down to individual buildings is realized. Experiments show that the method suppresses noise in high-resolution remote sensing images, reduces interference from non-target ground textures, and removes shadows, vegetation, and other pseudo-building information; compared with traditional pixel-level information extraction, it performs better in building extraction precision, accuracy, and completeness.

  6. The potential of satellite data to study individual wildfire events

    NASA Astrophysics Data System (ADS)

    Benali, Akli; López-Saldana, Gerardo; Russo, Ana; Sá, Ana C. L.; Pinto, Renata M. S.; Nikos, Koutsias; Owen, Price; Pereira, Jose M. C.

    2014-05-01

    Large wildfires have important social, economic and environmental impacts. In order to minimize their impacts, understand their main drivers and study their dynamics, different approaches have been used. The reconstruction of individual wildfire events is usually done by collecting field data, conducting interviews and implementing fire spread simulations. All these methods have clear limitations in terms of spatial and temporal coverage, accuracy, subjectivity of the collected information and lack of objective, independent validation information. In this sense, remote sensing is a promising tool with the potential to provide relevant information for stakeholders and the research community, by complementing or filling gaps in existing information and providing independent, accurate, quantitative information. In this work we show the potential of satellite data to provide relevant information regarding the dynamics of individual large wildfire events, filling an important gap in wildfire research. We show how MODIS active-fire data, acquired up to four times per day, and satellite-derived burnt perimeters can be combined to extract relevant information about wildfire events, describing the methods involved and presenting results for four regions of the world: Portugal, Greece, SE Australia and California. The information that can be retrieved encompasses the start and end date of a wildfire event and its ignition area. We evaluate the retrieved information by comparing the satellite-derived parameters with national databases, highlighting the strengths and weaknesses of both and showing how the former can complement the latter, leading to more complete and accurate datasets. We also show how the spatio-temporal distribution of wildfire spread dynamics can be reconstructed using satellite-derived active fires and how relevant descriptors can be extracted. Applying graph theory to satellite active-fire data, we define the major fire spread paths, which yield information about the major spatial corridors through which fires spread and their relative importance in the full fire event. These major fire paths are then used to extract relevant descriptors, such as the distribution of fire spread direction, rate of spread and fire intensity (i.e., energy emitted). The reconstruction of fire spread is shown for several case studies in Portugal and is also compared with fire progressions obtained by airborne sensors for SE Australia. The approach shows solid results, providing a valuable tool for reconstructing individual fire events and understanding their complex spread patterns and main drivers of propagation. The major fire paths and the spatio-temporal distribution of active fires are currently being combined with fire spread simulations within the scope of the FIRE-MODSAT project, to provide useful information to support and improve fire suppression strategies.

  7. [Technologies for Complex Intelligent Clinical Data Analysis].

    PubMed

    Baranov, A A; Namazova-Baranova, L S; Smirnov, I V; Devyatkin, D A; Shelmanov, A O; Vishneva, E A; Antonova, E V; Smirnov, V I

    2016-01-01

    The paper presents a system for intelligent analysis of clinical information. The authors describe methods implemented in the system for clinical information retrieval, intelligent diagnosis of chronic diseases, ranking the importance of patients' features, and detection of hidden dependencies between features. Results of the experimental evaluation of these methods are also presented. Healthcare facilities generate a large flow of both structured and unstructured data which contain important information about patients. Test results are usually retained as structured data, but some data are retained in the form of natural language texts (medical history, the results of physical examination, and the results of other examinations, such as ultrasound, ECG or X-ray studies). Many tasks arising in clinical practice can be automated by applying methods for intelligent analysis of the accumulated structured and unstructured data, which leads to improvement of healthcare quality. The aim of the work was the creation of a complex system for intelligent data analysis in a multi-disciplinary pediatric center. The authors propose methods for information extraction from clinical texts in Russian, carried out on the basis of deep linguistic analysis. The methods retrieve terms for diseases, symptoms, body regions and drugs, and can recognize additional attributes such as "negation" (indicating that the disease is absent), "no patient" (indicating that the disease refers to a family member rather than the patient), "severity of illness", "disease course", and "body region to which the disease refers". The authors use a set of handcrafted templates and various machine-learning techniques to retrieve information using a medical thesaurus. The extracted information is used to solve the problem of automatic diagnosis of chronic diseases. A machine learning method for classification of patients with similar nosology and a method for determining the most informative patient features are also proposed. The authors processed anonymized health records from the pediatric center to evaluate the proposed methods; the results show that the information extracted from the texts is applicable to practical problems. The records of patients with allergic, glomerular and rheumatic diseases were used for the experimental assessment of the automatic diagnosis method. The authors also determined the most appropriate machine learning methods for classifying patients in each group of diseases, as well as the most informative disease signs. It was found that using additional information extracted from clinical texts together with structured data helps to improve the quality of diagnosis of chronic diseases. The authors also obtained characteristic combinations of disease signs. The proposed methods have been implemented in the intelligent data processing system of a multidisciplinary pediatric center, and the experimental results show the system's ability to improve the quality of pediatric healthcare.

  8. Contextual information and perceptual-cognitive expertise in a dynamic, temporally-constrained task.

    PubMed

    Murphy, Colm P; Jackson, Robin C; Cooke, Karl; Roca, André; Benguigui, Nicolas; Williams, A Mark

    2016-12-01

    Skilled performers extract and process postural information from an opponent during anticipation more effectively than their less-skilled counterparts. In contrast, the role and importance of contextual information in anticipation has received only minimal attention. We evaluate the importance of contextual information in anticipation and examine the underlying perceptual-cognitive processes. We presented skilled and less-skilled tennis players with normal video or animated footage of the same rallies. In the animated condition, sequences were created using player movement and ball trajectory data, and postural information from the players was removed, constraining participants to anticipate based on contextual information alone. Participants judged the ball bounce location of the opponent's final, occluded shot. Both groups were more accurate than chance in both display conditions, with skilled participants more accurate than less-skilled ones (Exp. 1). When anticipating based on contextual information alone, skilled participants employed different gaze behaviors from their less-skilled counterparts and provided verbal reports of thoughts indicative of a more thorough evaluation of contextual information (Exp. 2). The findings highlight the importance of both postural and contextual information in anticipation and indicate that perceptual-cognitive expertise is underpinned by processes that facilitate more effective processing of contextual information in the absence of postural information.

  9. Characteristics of hemolytic activity induced by the aqueous extract of the Mexican fire coral Millepora complanata.

    PubMed

    García-Arredondo, Alejandro; Murillo-Esquivel, Luis J; Rojas, Alejandra; Sanchez-Rodriguez, Judith

    2014-01-01

    Millepora complanata is a plate-like fire coral common throughout the Caribbean. Contact with this species usually provokes burning pain, erythema and urticariform lesions. Our previous study suggested that the aqueous extract of M. complanata contains non-protein hemolysins that are soluble in water and ethanol. In general, the local damage induced by cnidarian venoms has been associated with hemolysins. The characterization of the effects of these components is important for the understanding of the defense mechanisms of fire corals. In addition, this information could lead to better care for victims of envenomation accidents. An ethanolic extract from the lyophilized aqueous extract was prepared and its hemolytic activity was compared with the hemolysis induced by the denatured aqueous extract. Based on the finding that ethanol failed to induce nematocyst discharge, ethanolic extracts were prepared from artificially bleached and normal M. complanata fragments and their hemolytic activity was tested in order to obtain information about the source of the heat-stable hemolysins. Rodent erythrocytes were more susceptible to the aqueous extract than chicken and human erythrocytes. Hemolytic activity started at ten minutes of incubation and was relatively stable within the range of 28-50°C. When the aqueous extract was preincubated at temperatures over 60°C, hemolytic activity was significantly reduced. The denatured extract induced a slow hemolytic activity (HU50 = 1,050.00 ± 45.85 μg/mL), detectable four hours after incubation, which was similar to that induced by the ethanolic extract prepared from the aqueous extract (HU50 = 1,167.00 ± 54.95 μg/mL). No significant differences were observed between hemolysis induced by ethanolic extracts from bleached and normal fragments, although both activities were more potent than hemolysis induced by the denatured extract. The results showed that the aqueous extract of M. complanata possesses one or more powerful heat-labile hemolytic proteins that are slightly more resistant to temperature than jellyfish venoms. This extract also contains slow thermostable hemolysins highly soluble in ethanol that are probably derived from the body tissues of the hydrozoan.

  10. The Importance of Data Quality in Using Health Information Exchange (HIE) Networks to Improve Health Outcomes: Case Study of a HIE Extracted Dataset of Patients with Congestive Heart Failure Participating in a Regional HIE

    ERIC Educational Resources Information Center

    Cartron-Mizeracki, Marie-Astrid

    2016-01-01

    Expenditures on health information technology (HIT) for healthcare organizations are growing exponentially and the value of it is the subject of criticism and skepticism. Because HIT is viewed as capable of improving major health care indicators, the government offers incentives to health care providers and organizations to implement solutions.…

  11. Cancer patients on Twitter: a novel patient community on social media.

    PubMed

    Sugawara, Yuya; Narimatsu, Hiroto; Hozawa, Atsushi; Shao, Li; Otani, Katsumi; Fukao, Akira

    2012-12-27

    Patients increasingly turn to the Internet for information on medical conditions, including clinical news and treatment options. In recent years, an online patient community has arisen alongside the rapidly expanding world of social media, or "Web 2.0." Twitter provides real-time dissemination of news, information, personal accounts and other details via a highly interactive form of social media, and has become an important online tool for patients. This medium is now considered to play an important role in the modern social community of online, "wired" cancer patients. Fifty-one highly influential "power accounts" belonging to cancer patients were extracted from a dataset of 731 Twitter accounts with cancer terminology in their profiles. In accordance with previously established methodology, "power accounts" were defined as those Twitter accounts with 500 or more followers. We extracted data on the cancer patient (female) with the most followers to study the specific relationships that existed between the user and her followers, and found that the majority of the examined tweets focused on greetings, treatment discussions, and other instances of psychological support. These findings went against our hypothesis that cancer patients' tweets would be centered on the dissemination of medical information and similar "newsy" details. At present, there exists a rapidly evolving network of cancer patients engaged in information exchange via Twitter. This network is valuable in the sharing of psychological support among the cancer community.

  12. Local binary pattern variants-based adaptive texture features analysis for posed and nonposed facial expression recognition

    NASA Astrophysics Data System (ADS)

    Sultana, Maryam; Bhatti, Naeem; Javed, Sajid; Jung, Soon Ki

    2017-09-01

    Facial expression recognition (FER) is an important task for various computer vision applications. The task becomes challenging when it requires the detection and encoding of macro- and micropatterns of facial expressions. We present a two-stage texture feature extraction framework based on local binary pattern (LBP) variants and evaluate its significance in recognizing posed and nonposed facial expressions. We focus on the parametric limitations of the LBP variants and investigate their effects on optimal FER. The size of the local neighborhood is an important parameter of the LBP technique for its extraction in images. To make the LBP adaptive, we exploit the granulometric information of the facial images to find the local neighborhood size for the extraction of center-symmetric LBP (CS-LBP) features. Our two-stage texture representations consist of an LBP variant and the adaptive CS-LBP features. Among the presented two-stage texture feature extractions, the binarized statistical image features combined with the adaptive CS-LBP features were found to give high FER rates. Evaluation of the adaptive texture features shows performance competitive with the nonadaptive features and higher than other state-of-the-art approaches, respectively.
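
    Extracting an LBP texture histogram can be sketched with scikit-image as below; the neighborhood radius is fixed here, whereas the paper's contribution is to adapt it (for CS-LBP) using granulometric information, and the patch is random stand-in data.

    ```python
    # Hedged sketch: LBP codes and their histogram as a texture descriptor.
    import numpy as np
    from skimage.feature import local_binary_pattern

    rng = np.random.default_rng(0)
    patch = (rng.random((64, 64)) * 255).astype(np.uint8)   # toy grayscale patch
    P, R = 8, 1                                             # sampling points, radius
    codes = local_binary_pattern(patch, P, R, method="uniform")

    # "uniform" codes take values 0..P+1, so P+2 histogram bins cover them all
    hist, _ = np.histogram(codes, bins=P + 2, range=(0, P + 2), density=True)
    print(hist.round(3))
    ```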

  13. Characterization of green zero-valent iron nanoparticles produced with tree leaf extracts.

    PubMed

    Machado, S; Pacheco, J G; Nouws, H P A; Albergaria, J T; Delerue-Matos, C

    2015-11-15

    In recent decades nanotechnology has become increasingly important because it offers indisputable advantages to almost every area of expertise, including environmental remediation. In this area the synthesis of highly reactive nanomaterials (e.g., zero-valent iron nanoparticles, nZVI) is gaining the attention of the scientific community, service providers and other stakeholders. The synthesis of nZVI by the recently developed green bottom-up method is extremely promising. However, the lack of information about the characteristics of the synthesized particles hinders wider and more extensive application. This work aims to evaluate the characteristics of nZVI synthesized through the green method using leaves from different trees. Considering the requirements of a product for environmental remediation, the following characteristics were studied: size, shape, reactivity and agglomeration tendency. The mulberry and pomegranate leaf extracts produced the smallest nZVIs (5-10 nm); the peach, pear and vine leaf extracts produced the most reactive nZVIs; and the particles produced with passion fruit, medlar and cherry extracts did not settle at high nZVI concentrations (931 and 266 ppm). Considering all tests, the nZVIs obtained from medlar and vine leaf extracts are the ones that could perform best in environmental remediation. The information gathered in this paper will be useful for choosing the most appropriate leaf extracts and operational conditions for the application of green nZVIs in environmental remediation.

  14. Multi-scale image segmentation method with visual saliency constraints and its application

    NASA Astrophysics Data System (ADS)

    Chen, Yan; Yu, Jie; Sun, Kaimin

    2018-03-01

    Object-based image analysis has many advantages over pixel-based methods, so it is a current research hotspot. Obtaining image objects through multi-scale segmentation is an essential prerequisite for object-based image analysis. The currently popular segmentation methods mainly share the bottom-up segmentation principle, which is simple to implement and yields accurate object boundaries. However, such methods struggle to take the macro-scale statistical characteristics of image regions into account, and fragmented (over-segmented) results are difficult to avoid. In addition, when it comes to information extraction, target recognition and other applications, image targets are not equally important: some specific targets, or target groups with particular features, deserve more attention than the others. To avoid over-segmentation and to highlight the targets of interest, this paper proposes a multi-scale image segmentation method with visual saliency constraints. Visual saliency theory and typical feature extraction methods are adopted to obtain the visual saliency information, especially the macroscopic information to be analyzed. The visual saliency information is used as a distribution map of homogeneity weight, where each pixel is given a weight that acts as one of the merging constraints in the multi-scale segmentation. As a result, pixels that macroscopically belong to the same object but differ locally are more likely to be assigned to the same object. In addition, the visual saliency constraint allows the balance between local and macroscopic characteristics to be controlled for different objects during the segmentation process. These controls improve the completeness of visually salient areas in the segmentation results while diluting the constraint for non-salient background areas. Experiments show that this method works better for texture image segmentation than traditional multi-scale segmentation methods and gives priority to the salient objects of interest. The method has been used in image quality evaluation, scattered residential area extraction, sparse forest extraction and other applications to verify its validity; all applications showed good results.

  15. The Collaborative Lecture Annotation System (CLAS): A New TOOL for Distributed Learning

    ERIC Educational Resources Information Center

    Risko, E. F.; Foulsham, T.; Dawson, S.; Kingstone, A.

    2013-01-01

    In the context of a lecture, the capacity to readily recognize and synthesize key concepts is crucial for comprehension and overall educational performance. In this paper, we introduce a tool, the Collaborative Lecture Annotation System (CLAS), which has been developed to make the extraction of important information a more collaborative and…

  16. Predictors of Verb-Mediated Anticipatory Eye Movements in the Visual World

    ERIC Educational Resources Information Center

    Hintz, Florian; Meyer, Antje S.; Huettig, Falk

    2017-01-01

    Many studies have demonstrated that listeners use information extracted from verbs to guide anticipatory eye movements to objects in the visual context that satisfy the selection restrictions of the verb. An important question is what underlies such verb-mediated anticipatory eye gaze. Based on empirical and theoretical suggestions, we…

  17. An Alternative Way to Model Population Ability Distributions in Large-Scale Educational Surveys

    ERIC Educational Resources Information Center

    Wetzel, Eunike; Xu, Xueli; von Davier, Matthias

    2015-01-01

    In large-scale educational surveys, a latent regression model is used to compensate for the shortage of cognitive information. Conventionally, the covariates in the latent regression model are principal components extracted from background data. This operational method has several important disadvantages, such as the handling of missing data and…

  18. Energy and minerals industries in national, regional, and state economies

    Treesearch

    D. J. Shields; S. A. Winter; G. S. Alward; K. L. Hartung

    1996-01-01

    This report presents information on the contribution of the extractive industries to the domestic economy at different geopolitical scales. Areas where resource production is important to gross state or regional product, employment, or income are highlighted. Output, employment, value added, and personal and total income multipliers are reported for the energy and...

  19. Spatiotemporal conceptual platform for querying archaeological information systems

    NASA Astrophysics Data System (ADS)

    Partsinevelos, Panagiotis; Sartzetaki, Mary; Sarris, Apostolos

    2015-04-01

    Spatial and temporal distribution of archaeological sites has been shown to associate with several attributes including marine, water, mineral and food resources, climate conditions, geomorphological features, etc. In this study, archaeological settlement attributes are evaluated under various associations in order to provide a specialized query platform in a geographic information system (GIS). Towards this end, a spatial database is designed to include a series of archaeological findings for a secluded geographic area of Crete in Greece. The key categories of the geodatabase include the archaeological type (palace, burial site, village, etc.), temporal information on the habitation/usage period (pre-Minoan, Minoan, Byzantine, etc.), and the extracted geographical attributes of the sites (distance to sea, altitude, resources, etc.). Most of the related spatial attributes are extracted with readily available GIS tools. Additionally, a series of conceptual data attributes are estimated, including: the temporal relation of an era to a future one in terms of alteration of the archaeological type, topologic relations of various types and attributes, and spatial proximity relations between various types. These complex spatiotemporal relational measures reveal new attributes towards better understanding of site selection for prehistoric and/or historic cultures, yet their potential combinations can become numerous. Therefore, after the quantification of the above-mentioned attributes, they are classified according to their importance for archaeological site location modeling. Under this new classification scheme, the user may select a geographic area of interest and extract only the important attributes for a specific archaeological type. These extracted attributes may then be queried against the entire spatial database to provide a location map of possible new archaeological sites. This novel type of querying is robust since the user does not have to type a standard SQL query but can graphically select an area of interest instead. In addition, according to the application at hand, novel spatiotemporal attributes and relations can be supported, towards the understanding of historical settlement patterns.
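
    A toy sketch of how such an attribute-ranked, area-based query might look in code; the rows, importance scores and function names below are invented for illustration, and the actual platform operates inside a GIS rather than plain Python.

    ```python
    # Hypothetical geodatabase rows: type, period and derived spatial attributes.
    sites = [
        {"type": "palace", "period": "Minoan", "dist_to_sea_km": 2.1,
         "altitude_m": 120, "x": 25.1, "y": 35.2},
        {"type": "burial site", "period": "Byzantine", "dist_to_sea_km": 8.4,
         "altitude_m": 310, "x": 25.3, "y": 35.1},
    ]

    # Attribute importance per archaeological type (illustrative values only).
    importance = {"palace": {"dist_to_sea_km": 0.9, "altitude_m": 0.4}}

    def query_region(sites, bbox, site_type, top_k=1):
        # Stand-in for graphical selection: keep sites inside the chosen
        # bounding box, then return only the attributes ranked important
        # for the requested type, mimicking the classification step.
        xmin, ymin, xmax, ymax = bbox
        hits = [s for s in sites
                if s["type"] == site_type
                and xmin <= s["x"] <= xmax and ymin <= s["y"] <= ymax]
        ranked = sorted(importance[site_type],
                        key=importance[site_type].get, reverse=True)
        return [{k: s[k] for k in ranked[:top_k]} for s in hits]

    print(query_region(sites, (25.0, 35.0, 25.2, 35.3), "palace"))
    ```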

  20. An integrated method for cancer classification and rule extraction from microarray data

    PubMed Central

    Huang, Liang-Tsung

    2009-01-01

    Different microarray techniques have recently been used successfully to investigate useful information for cancer diagnosis at the gene expression level due to their ability to measure thousands of gene expression levels in a massively parallel way. One important issue is to improve the classification performance of microarray data. However, it would be ideal if influential genes and even interpretable rules could be explored at the same time to offer biological insight. Introducing the concepts of system design in software engineering, this paper has presented an integrated and effective method (named X-AI) for accurate cancer classification and the acquisition of knowledge from DNA microarray data. This method included a feature selector to systematically extract the relatively important genes so as to reduce the dimensionality while retaining as much of the class-discriminatory information as possible. Next, diagonal quadratic discriminant analysis (DQDA) was combined to classify tumors, and generalized rule induction (GRI) was integrated to establish association rules which can give an understanding of the relationships between cancer classes and related genes. Two non-redundant datasets of acute leukemia were used to validate the proposed X-AI, showing significantly high accuracy for discriminating different classes. I have also presented the abilities of X-AI to extract relevant genes, as well as to develop interpretable rules. Further, a web server has been established for cancer classification and it is freely available at . PMID:19272192
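
    Of the components named above, DQDA is the most self-contained; the sketch below implements diagonal quadratic discriminant analysis in its standard textbook form (per-class means, per-feature variances, Gaussian log-likelihood). It is not the authors' X-AI code, and the toy expression matrix is invented.

    ```python
    import numpy as np

    class DiagonalQDA:
        # Diagonal QDA: each class is a Gaussian with its own mean vector and
        # a diagonal covariance estimated feature-by-feature.
        def fit(self, X, y):
            self.classes_ = np.unique(y)
            self.means_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
            self.vars_ = np.array([X[y == c].var(axis=0) + 1e-6 for c in self.classes_])
            self.priors_ = np.array([np.mean(y == c) for c in self.classes_])
            return self

        def predict(self, X):
            # Per-class Gaussian log-likelihood (summed over features) + log prior.
            ll = np.stack([
                -0.5 * (np.log(2 * np.pi * v) + (X - m) ** 2 / v).sum(axis=1) + np.log(p)
                for m, v, p in zip(self.means_, self.vars_, self.priors_)
            ], axis=1)
            return self.classes_[ll.argmax(axis=1)]

    # Toy expression matrix: 6 samples x 3 genes, two tumor classes.
    X = np.array([[5., 1., 0.], [6., 1., 0.], [5., 2., 1.],
                  [1., 5., 4.], [0., 6., 5.], [1., 5., 4.]])
    y = np.array([0, 0, 0, 1, 1, 1])
    print(DiagonalQDA().fit(X, y).predict(X))   # -> [0 0 0 1 1 1]
    ```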

  1. Extracted facial feature of racial closely related faces

    NASA Astrophysics Data System (ADS)

    Liewchavalit, Chalothorn; Akiba, Masakazu; Kanno, Tsuneo; Nagao, Tomoharu

    2010-02-01

    Human faces contain a lot of demographic information such as identity, gender, age, race and emotion. Human beings can perceive these pieces of information and use them as important clues in social interaction with other people. Race perception is considered one of the most delicate and sensitive parts of face perception. There is much research concerning image-based race recognition, but most of it focuses on major race groups such as Caucasoid, Negroid and Mongoloid. This paper focuses on how people classify the race of racially closely related groups. As a sample of a racially closely related group, we chose Japanese and Thai faces to represent the difference between Northern and Southern Mongoloid. Three psychological experiments were performed to study the strategies of face perception in race classification. The results of the psychological experiments suggest that race perception is an ability that can be learned. Eyes and eyebrows attract the most attention, and the eyes are a significant factor in race perception. Principal Component Analysis (PCA) was performed to extract facial features of the sample race groups. Extracted race features of texture and shape were used to synthesize faces. The results suggest that racial features rely on detailed texture rather than shape. This research is indispensable fundamental research on race perception, which is essential for the establishment of a human-like race recognition system.
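
    The PCA step on flattened face vectors reduces to the classic eigenface recipe, sketched below; the random data and dimensions are placeholders, and the study's actual preprocessing is not reproduced.

    ```python
    import numpy as np

    def pca_features(faces, n_components=2):
        # Centre the face vectors and take the leading right-singular vectors;
        # projections onto them are the extracted texture/shape features.
        X = faces - faces.mean(axis=0)
        U, S, Vt = np.linalg.svd(X, full_matrices=False)
        components = Vt[:n_components]           # principal axes
        return X @ components.T, components      # features, basis

    rng = np.random.default_rng(0)
    faces = rng.normal(size=(10, 64))            # 10 flattened 8x8 "faces"
    feats, basis = pca_features(faces)
    print(feats.shape, basis.shape)              # (10, 2) (2, 64)
    ```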

  2. Multi-Filter String Matching and Human-Centric Entity Matching for Information Extraction

    ERIC Educational Resources Information Center

    Sun, Chong

    2012-01-01

    More and more information is being generated in text documents, such as Web pages, emails and blogs. To effectively manage this unstructured information, one broadly used approach includes locating relevant content in documents, extracting structured information and integrating the extracted information for querying, mining or further analysis. In…

  3. Neural network explanation using inversion.

    PubMed

    Saad, Emad W; Wunsch, Donald C

    2007-01-01

    An important drawback of many artificial neural networks (ANN) is their lack of explanation capability [Andrews, R., Diederich, J., & Tickle, A. B. (1996). A survey and critique of techniques for extracting rules from trained artificial neural networks. Knowledge-Based Systems, 8, 373-389]. This paper starts with a survey of algorithms which attempt to explain the ANN output. We then present HYPINV, a new explanation algorithm which relies on network inversion; i.e. calculating the ANN input which produces a desired output. HYPINV is a pedagogical algorithm that extracts rules in the form of hyperplanes. It is able to generate rules with arbitrarily desired fidelity, maintaining a fidelity-complexity tradeoff. To our knowledge, HYPINV is the only pedagogical rule extraction method that extracts hyperplane rules from continuous or binary attribute neural networks. Different network inversion techniques, involving gradient descent as well as an evolutionary algorithm, are presented. An information-theoretic treatment of rule extraction is presented. HYPINV is applied to example synthetic problems, to a real aerospace problem, and compared with similar algorithms using benchmark problems.
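
    The operation HYPINV builds on, inverting a trained network by gradient descent on its input, can be sketched in a few lines. The tiny hand-set network, target and learning rate below are illustrative; this is the generic inversion step, not the published algorithm.

    ```python
    import numpy as np

    # Tiny fixed network: input x (2,) -> sigmoid hidden layer -> sigmoid output.
    W1 = np.array([[1.5, -2.0], [0.5, 1.0]]); b1 = np.zeros(2)
    W2 = np.array([2.0, -1.0]);               b2 = 0.0
    sig = lambda z: 1 / (1 + np.exp(-z))

    def forward(x):
        h = sig(W1 @ x + b1)
        return sig(W2 @ h + b2), h

    def invert(target, steps=500, lr=0.5):
        # Adjust the INPUT (not the weights) until the output matches the
        # desired value, backpropagating the squared error to x.
        x = np.zeros(2)
        for _ in range(steps):
            y, h = forward(x)
            dy = 2 * (y - target) * y * (1 - y)   # error grad at the output
            dh = dy * W2 * h * (1 - h)            # ... through the hidden layer
            x -= lr * (W1.T @ dh)                 # ... down to the input
        return x

    x_star = invert(0.8)
    print(x_star, forward(x_star)[0])   # an input that drives the output to ~0.8
    ```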

  4. Extracting Useful Semantic Information from Large Scale Corpora of Text

    ERIC Educational Resources Information Center

    Mendoza, Ray Padilla, Jr.

    2012-01-01

    Extracting and representing semantic information from large scale corpora is at the crux of computer-assisted knowledge generation. Semantic information depends on collocation extraction methods, mathematical models used to represent distributional information, and weighting functions which transform the space. This dissertation provides a…

  5. Generalized Likelihood Uncertainty Estimation (GLUE) methodology for optimization of extraction in natural products.

    PubMed

    Maulidiani; Rudiyanto; Abas, Faridah; Ismail, Intan Safinar; Lajis, Nordin H

    2018-06-01

    The optimization process is an important aspect of natural product extraction. Herein, an alternative approach to optimization in extraction is proposed, namely, Generalized Likelihood Uncertainty Estimation (GLUE). The approach combines Latin hypercube sampling, the feasible range of independent variables, Monte Carlo simulation, and threshold criteria for response variables. The GLUE method is tested on three different techniques, including ultrasound-, microwave-, and supercritical CO2-assisted extractions, utilizing data from previously published reports. The study found that this method can: provide more information on the combined effects of the independent variables on the response variables in the dotty plots; deal with an unlimited number of independent and response variables; consider combined multiple threshold criteria, which are subjective depending on the target of the investigation for response variables; and provide a range of values with their distribution for the optimization. Copyright © 2018 Elsevier Ltd. All rights reserved.
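
    A minimal sketch of that workflow: Latin hypercube sampling over the feasible parameter ranges, Monte Carlo evaluation, and retention of the parameter sets that meet a response threshold. The yield model, ranges and threshold below are invented stand-ins for a real extraction experiment.

    ```python
    import numpy as np

    def glue(model, bounds, threshold, n=10_000, seed=0):
        rng = np.random.default_rng(seed)
        d = len(bounds)
        # Latin hypercube: one stratified sample per interval, shuffled per
        # dimension so the pairing across dimensions is random.
        strata = np.arange(n).reshape(n, 1).repeat(d, axis=1)
        u = (rng.permuted(strata, axis=0) + rng.uniform(size=(n, d))) / n
        lo, hi = np.array(bounds).T
        params = lo + u * (hi - lo)
        response = np.apply_along_axis(model, 1, params)      # Monte Carlo runs
        return params[response >= threshold], response        # behavioural sets

    # Toy model: temperature (deg C) and time (min) -> extraction yield (%).
    yield_model = lambda p: 80 - 0.02 * (p[0] - 55) ** 2 - 0.05 * (p[1] - 30) ** 2
    kept, resp = glue(yield_model, bounds=[(30, 80), (5, 60)], threshold=75)
    print(len(kept), "behavioural parameter sets, e.g.", kept[:1])
    ```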

  6. Attention-Based Recurrent Temporal Restricted Boltzmann Machine for Radar High Resolution Range Profile Sequence Recognition.

    PubMed

    Zhang, Yifan; Gao, Xunzhang; Peng, Xuan; Ye, Jiaqi; Li, Xiang

    2018-05-16

    High Resolution Range Profile (HRRP) recognition has attracted great interest in the field of Radar Automatic Target Recognition (RATR). However, traditional HRRP recognition methods fail to model high-dimensional sequential data efficiently and are not robust to noise. To deal with these problems, a novel stochastic neural network model named Attention-based Recurrent Temporal Restricted Boltzmann Machine (ARTRBM) is proposed in this paper. The RTRBM is utilized to extract discriminative features and the attention mechanism is adopted to select major features. The RTRBM models high-dimensional HRRP sequences efficiently because it can extract the temporal and spatial correlation between adjacent HRRPs. The attention mechanism has been used in sequential data recognition tasks, including machine translation and relation classification, to make a model pay more attention to the major features for recognition. Therefore, the combination of the RTRBM and the attention mechanism makes our model effective at extracting internally related features and choosing the important parts of the extracted features. Additionally, the model performs well with noise-corrupted HRRP data. Experimental results on the Moving and Stationary Target Acquisition and Recognition (MSTAR) dataset show that our proposed model outperforms other traditional methods, which indicates that ARTRBM extracts, selects, and utilizes the correlation information between adjacent HRRPs effectively and is suitable for high-dimensional or noise-corrupted data.
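
    The attention half of the model can be illustrated generically: score each per-profile feature vector, softmax the scores, and pool. This is a plain soft-attention sketch over random stand-in features, not the ARTRBM architecture itself.

    ```python
    import numpy as np

    def attend(features, w):
        # features: (T, d) feature vectors, one per HRRP in the sequence;
        # w: a learned scoring vector (random stand-in here).
        scores = features @ w                  # relevance score per time step
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()               # softmax attention weights
        return weights @ features, weights     # pooled context vector, weights

    rng = np.random.default_rng(1)
    seq = rng.normal(size=(8, 16))             # 8 consecutive HRRPs, 16-dim features
    w = rng.normal(size=16)
    context, attn = attend(seq, w)
    print(context.shape, attn.round(2))        # (16,) and weights summing to 1
    ```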

  7. Road Extraction from AVIRIS Using Spectral Mixture and Q-Tree Filter Techniques

    NASA Technical Reports Server (NTRS)

    Gardner, Margaret E.; Roberts, Dar A.; Funk, Chris; Noronha, Val

    2001-01-01

    Accurate road location and condition information are of primary importance in road infrastructure management. Additionally, spatially accurate and up-to-date road networks are essential in ambulance and rescue dispatch in emergency situations. However, accurate road infrastructure databases do not exist for vast areas, particularly in areas with rapid expansion. Currently, the US Department of Transportation (USDOT) expends great effort in field Global Positioning System (GPS) mapping and condition assessment to meet these informational needs. This methodology, though effective, is both time-consuming and costly, because every road within a DOT's jurisdiction must be field-visited to obtain accurate information. Therefore, the USDOT is interested in identifying new technologies that could help meet road infrastructure informational needs more effectively. Remote sensing provides one means by which large areas may be mapped with a high standard of accuracy and is a technology with great potential in infrastructure mapping. The goal of our research is to develop accurate road extraction techniques using high spatial resolution, fine spectral resolution imagery. Additionally, our research will explore the use of hyperspectral data in assessing road quality. Finally, this research aims to define the spatial and spectral requirements for remote sensing data to be used successfully for road feature extraction and road quality mapping. Our findings will help the USDOT assess remote sensing as a new resource in infrastructure studies.

  8. Extraction of endoscopic images for biomedical figure classification

    NASA Astrophysics Data System (ADS)

    Xue, Zhiyun; You, Daekeun; Chachra, Suchet; Antani, Sameer; Long, L. R.; Demner-Fushman, Dina; Thoma, George R.

    2015-03-01

    Modality filtering is an important feature in biomedical image searching systems and may significantly improve the retrieval performance of the system. This paper presents a new method for extracting endoscopic image figures from photograph images in biomedical literature, which are found to have highly diverse content and large variability in appearance. Our proposed method consists of three main stages: tissue image extraction, endoscopic image candidate extraction, and ophthalmic image filtering. For tissue image extraction we use image patch level clustering and MRF relabeling to detect images containing skin/tissue regions. Next, we find candidate endoscopic images by exploiting the round shape characteristics that commonly appear in these images. However, this step needs to compensate for images where endoscopic regions are not entirely round. In the third step we filter out the ophthalmic images which have shape characteristics very similar to the endoscopic images. We do this by using text information, specifically, anatomy terms, extracted from the figure caption. We tested and evaluated our method on a dataset of 115,370 photograph figures, and achieved promising precision and recall rates of 87% and 84%, respectively.
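
    The round-shape stage lends itself to a standard Hough-circle sketch, shown below on a synthetic image; the OpenCV parameters are guesses, and the paper's tuned pipeline (which also compensates for non-round endoscopic regions) is not reproduced.

    ```python
    import cv2
    import numpy as np

    # Synthetic stand-in for a figure panel: one bright disc on a dark background.
    img = np.zeros((200, 200), np.uint8)
    cv2.circle(img, (100, 100), 60, 255, -1)

    blur = cv2.GaussianBlur(img, (9, 9), 2)
    # Exploit the round-shape prior: circular regions become endoscopic-image
    # candidates (second stage only; parameter values are illustrative).
    circles = cv2.HoughCircles(blur, cv2.HOUGH_GRADIENT, dp=1, minDist=50,
                               param1=100, param2=20, minRadius=30, maxRadius=120)
    if circles is not None:
        for x, y, r in np.round(circles[0]).astype(int):
            print(f"candidate endoscopic region at ({x}, {y}), radius {r}")
    ```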

  9. PREDOSE: A Semantic Web Platform for Drug Abuse Epidemiology using Social Media

    PubMed Central

    Cameron, Delroy; Smith, Gary A.; Daniulaityte, Raminta; Sheth, Amit P.; Dave, Drashti; Chen, Lu; Anand, Gaurish; Carlson, Robert; Watkins, Kera Z.; Falck, Russel

    2013-01-01

    Objectives The role of social media in biomedical knowledge mining, including clinical, medical and healthcare informatics, prescription drug abuse epidemiology and drug pharmacology, has become increasingly significant in recent years. Social media offers opportunities for people to share opinions and experiences freely in online communities, which may contribute information beyond the knowledge of domain professionals. This paper describes the development of a novel Semantic Web platform called PREDOSE (PREscription Drug abuse Online Surveillance and Epidemiology), which is designed to facilitate the epidemiologic study of prescription (and related) drug abuse practices using social media. PREDOSE uses web forum posts and domain knowledge, modeled in a manually created Drug Abuse Ontology (DAO) (pronounced dow), to facilitate the extraction of semantic information from User Generated Content (UGC). A combination of lexical, pattern-based and semantics-based techniques is used together with the domain knowledge to extract fine-grained semantic information from UGC. In a previous study, PREDOSE was used to obtain the datasets from which new knowledge in drug abuse research was derived. Here, we report on various platform enhancements, including an updated DAO, new components for relationship and triple extraction, and tools for content analysis, trend detection and emerging patterns exploration, which enhance the capabilities of the PREDOSE platform. Given these enhancements, PREDOSE is now more equipped to impact drug abuse research by alleviating traditional labor-intensive content analysis tasks. Methods Using custom web crawlers that scrape UGC from publicly available web forums, PREDOSE first automates the collection of web-based social media content for subsequent semantic annotation. The annotation scheme is modeled in the DAO, and includes domain specific knowledge such as prescription (and related) drugs, methods of preparation, side effects, routes of administration, etc. The DAO is also used to help recognize three types of data, namely: 1) entities, 2) relationships and 3) triples. PREDOSE then uses a combination of lexical and semantic-based techniques to extract entities and relationships from the scraped content, and a top-down approach for triple extraction that uses patterns expressed in the DAO. In addition, PREDOSE uses publicly available lexicons to identify initial sentiment expressions in text, and then a probabilistic optimization algorithm (from related research) to extract the final sentiment expressions. Together, these techniques enable the capture of fine-grained semantic information from UGC, and querying, search, trend analysis and overall content analysis of social media related to prescription drug abuse. Moreover, extracted data are also made available to domain experts for the creation of training and test sets for use in evaluation and refinements in information extraction techniques. Results A recent evaluation of the information extraction techniques applied in the PREDOSE platform indicates 85% precision and 72% recall in entity identification, on a manually created gold standard dataset. In another study, PREDOSE achieved 36% precision in relationship identification and 33% precision in triple extraction, through manual evaluation by domain experts. Given the complexity of the relationship and triple extraction tasks and the abstruse nature of social media texts, we interpret these as favorable initial results. 
Extracted semantic information is currently in use in an online discovery support system, by prescription drug abuse researchers at the Center for Interventions, Treatment and Addictions Research (CITAR) at Wright State University. Conclusion A comprehensive platform for entity, relationship, triple and sentiment extraction from such abstruse texts has never been developed for drug abuse research. PREDOSE has already demonstrated the importance of mining social media by providing data from which new findings in drug abuse research were uncovered. Given the recent platform enhancements, including the refined DAO, components for relationship and triple extraction, and tools for content, trend and emerging pattern analysis, it is expected that PREDOSE will play a significant role in advancing drug abuse epidemiology in future. PMID:23892295

  10. Social Network Analysis of Elders' Health Literacy and their Use of Online Health Information.

    PubMed

    Jang, Haeran; An, Ji-Young

    2014-07-01

    Utilizing social network analysis, this study aimed to analyze the main keywords in the literature regarding the health literacy of and the use of online health information by aged persons over 65. Medical Subject Heading keywords were extracted from articles on the PubMed database of the National Library of Medicine. For health literacy, 110 articles out of 361 were initially extracted. Seventy-one keywords out of 1,021 were finally selected after removing repeated keywords and applying pruning. Regarding the use of online health information, 19 articles out of 26 were selected. One hundred forty-four keywords were initially extracted. After removing the repeated keywords, 74 keywords were finally selected. Health literacy was found to be strongly connected with 'Health knowledge, attitudes, practices' and 'Patient education as topic.' 'Computer literacy' had strong connections with 'Internet' and 'Attitude towards computers.' 'Computer literacy' was connected to 'Health literacy,' and was studied according to the parameters 'Attitude towards health' and 'Patient education as topic.' The use of online health information was strongly connected with 'Health knowledge, attitudes, practices,' 'Consumer health information,' 'Patient education as topic,' etc. In the network, 'Computer literacy' was connected with 'Health education,' 'Patient satisfaction,' 'Self-efficacy,' 'Attitude to computer,' etc. Research on older citizens' health literacy and their use of online health information was conducted together with study of computer literacy, patient education, attitude towards health, health education, patient satisfaction, etc. In particular, self-efficacy was noted as an important keyword. Further research should be conducted to identify the effective outcomes of self-efficacy in the area of interest.

  11. The Information System at CeSAM

    NASA Astrophysics Data System (ADS)

    Agneray, F.; Gimenez, S.; Moreau, C.; Roehlly, Y.

    2012-09-01

    Modern large observational programmes produce large amounts of data from various origins and need high-level quality control, fast data access via easy-to-use graphic interfaces, and the ability to cross-correlate information coming from different observations. The Centre de donnéeS Astrophysique de Marseille (CeSAM) offers web access to VO-compliant Information Systems to access data of different projects (VVDS, HeDAM, EXODAT, HST-COSMOS,…), including ancillary data obtained outside Laboratoire d'Astrophysique de Marseille (LAM) control. The CeSAM Information Systems provide catalogue downloads and additional services such as searching, extracting and displaying imaging and spectroscopic data via multi-criteria and Cone Search interfaces.

  12. Evaluation of nuclear-facility decommissioning projects. Summary report: Ames Laboratory Research Reactor

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Link, B.W.; Miller, R.L.

    1983-07-01

    This document summarizes the available information concerning the decommissioning of the Ames Laboratory Research Reactor (ALRR), a five-megawatt heavy water moderated and cooled research reactor. The data were placed in a computerized information retrieval/manipulation system which permits its future utilization for purposes of comparative analysis. This information is presented both in detail in its computer output form and also as a manually assembled summarization which highlights the more important aspects of the decommissioning program. Some comparative information with reference to generic decommissioning data extracted from NUREG/CR 1756, Technology, Safety and Costs of Decommissioning Nuclear Research and Test Reactors, is included.

  13. Video Skimming and Characterization through the Combination of Image and Language Understanding Techniques

    NASA Technical Reports Server (NTRS)

    Smith, Michael A.; Kanade, Takeo

    1997-01-01

    Digital video is rapidly becoming important for education, entertainment, and a host of multimedia applications. With the size of the video collections growing to thousands of hours, technology is needed to effectively browse segments in a short time without losing the content of the video. We propose a method to extract the significant audio and video information and create a "skim" video which represents a very short synopsis of the original. The goal of this work is to show the utility of integrating language and image understanding techniques for video skimming by extraction of significant information, such as specific objects, audio keywords and relevant video structure. The resulting skim video is much shorter, where compaction is as high as 20:1, and yet retains the essential content of the original segment.

  14. Synthesising quantitative and qualitative research in evidence-based patient information.

    PubMed

    Goldsmith, Megan R; Bankhead, Clare R; Austoker, Joan

    2007-03-01

    Systematic reviews have, in the past, focused on quantitative studies and clinical effectiveness, while excluding qualitative evidence. Qualitative research can inform evidence-based practice independently of other research methodologies but methods for the synthesis of such data are currently evolving. Synthesising quantitative and qualitative research in a single review is an important methodological challenge. This paper describes the review methods developed and the difficulties encountered during the process of updating a systematic review of evidence to inform guidelines for the content of patient information related to cervical screening. Systematic searches of 12 electronic databases (January 1996 to July 2004) were conducted. Studies that evaluated the content of information provided to women about cervical screening or that addressed women's information needs were assessed for inclusion. A data extraction form and quality assessment criteria were developed from published resources. A non-quantitative synthesis was conducted and a tabular evidence profile for each important outcome (eg "explain what the test involves") was prepared. The overall quality of evidence for each outcome was then assessed using an approach published by the GRADE working group, which was adapted to suit the review questions and modified to include qualitative research evidence. Quantitative and qualitative studies were considered separately for every outcome. 32 papers were included in the systematic review following data extraction and assessment of methodological quality. The review questions were best answered by evidence from a range of data sources. The inclusion of qualitative research, which was often highly relevant and specific to many components of the screening information materials, enabled the production of a set of recommendations that will directly affect policy within the NHS Cervical Screening Programme. A practical example is provided of how quantitative and qualitative data sources might successfully be brought together and considered in one review.

  15. Kinase Pathway Database: An Integrated Protein-Kinase and NLP-Based Protein-Interaction Resource

    PubMed Central

    Koike, Asako; Kobayashi, Yoshiyuki; Takagi, Toshihisa

    2003-01-01

    Protein kinases play a crucial role in the regulation of cellular functions. Various kinds of information about these molecules are important for understanding signaling pathways and organism characteristics. We have developed the Kinase Pathway Database, an integrated database involving major completely sequenced eukaryotes. It contains the classification of protein kinases and their functional conservation, ortholog tables among species, protein–protein, protein–gene, and protein–compound interaction data, domain information, and structural information. It also provides an automatic pathway graphic image interface. The protein, gene, and compound interactions are automatically extracted from abstracts for all genes and proteins by natural-language processing (NLP). The method of automatic extraction uses phrase patterns and the GENA protein, gene, and compound name dictionary, which was developed by our group. With this database, pathways are easily compared among species using data with more than 47,000 protein interactions and protein kinase ortholog tables. The database is available for querying and browsing at http://kinasedb.ontology.ims.u-tokyo.ac.jp/. PMID:12799355

  16. Systematically extracting metal- and solvent-related occupational information from free-text responses to lifetime occupational history questionnaires.

    PubMed

    Friesen, Melissa C; Locke, Sarah J; Tornow, Carina; Chen, Yu-Cheng; Koh, Dong-Hee; Stewart, Patricia A; Purdue, Mark; Colt, Joanne S

    2014-06-01

    Lifetime occupational history (OH) questionnaires often use open-ended questions to capture detailed information about study participants' jobs. Exposure assessors use this information, along with responses to job- and industry-specific questionnaires, to assign exposure estimates on a job-by-job basis. An alternative approach is to use information from the OH responses and the job- and industry-specific questionnaires to develop programmable decision rules for assigning exposures. As a first step in this process, we developed a systematic approach to extract the free-text OH responses and convert them into standardized variables that represented exposure scenarios. Our study population comprised 2408 subjects, reporting 11991 jobs, from a case-control study of renal cell carcinoma. Each subject completed a lifetime OH questionnaire that included verbatim responses, for each job, to open-ended questions including job title, main tasks and activities (task), tools and equipment used (tools), and chemicals and materials handled (chemicals). Based on a review of the literature, we identified exposure scenarios (occupations, industries, tasks/tools/chemicals) expected to involve possible exposure to chlorinated solvents, trichloroethylene (TCE) in particular, lead, and cadmium. We then used a SAS macro to review the information reported by study participants to identify jobs associated with each exposure scenario; this was done using previously coded standardized occupation and industry classification codes, and a priori lists of associated key words and phrases related to possibly exposed tasks, tools, and chemicals. Exposure variables representing the occupation, industry, and task/tool/chemicals exposure scenarios were added to the work history records of the study respondents. Our identification of possibly TCE-exposed scenarios in the OH responses was compared to an expert's independently assigned probability ratings to evaluate whether we missed identifying possibly exposed jobs. Our process added exposure variables for 52 occupation groups, 43 industry groups, and 46 task/tool/chemical scenarios to the data set of OH responses. Across all four agents, we identified possibly exposed task/tool/chemical exposure scenarios in 44-51% of the jobs in possibly exposed occupations. Possibly exposed task/tool/chemical exposure scenarios were found in a nontrivial 9-14% of the jobs not in possibly exposed occupations, suggesting that our process identified important information that would not be captured using occupation alone. Our extraction process was sensitive: for jobs where our extraction of OH responses identified no exposure scenarios and for which the sole source of information was the OH responses, only 0.1% were assessed as possibly exposed to TCE by the expert. Our systematic extraction of OH information found useful information in the task/chemicals/tools responses that was relatively easy to extract and that was not available from the occupational or industry information. The extracted variables can be used as inputs in the development of decision rules, especially for jobs where no additional information, such as job- and industry-specific questionnaires, is available. Published by Oxford University Press on behalf of the British Occupational Hygiene Society 2014.
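
    A simplified Python analogue of that keyword-scanning step is sketched below; the original implementation was a SAS macro, and the scenario names and patterns here are invented, far smaller than the study's a priori lists.

    ```python
    import re

    # Illustrative a priori keyword lists keyed by exposure scenario.
    SCENARIOS = {
        "TCE_degreasing": [r"\btrichloroethylene\b", r"\bdegreas", r"\bvapou?r degreaser"],
        "lead_soldering": [r"\bsolder", r"\blead\b"],
    }

    def flag_scenarios(task, tools, chemicals):
        # Scan the free-text OH responses and return binary scenario variables
        # that can be appended to the work history record.
        text = " ".join([task, tools, chemicals]).lower()
        return {name: int(any(re.search(p, text) for p in pats))
                for name, pats in SCENARIOS.items()}

    print(flag_scenarios(task="cleaned metal parts",
                         tools="vapor degreaser",
                         chemicals="trichloroethylene"))
    # {'TCE_degreasing': 1, 'lead_soldering': 0}
    ```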

  17. ExaCT: automatic extraction of clinical trial characteristics from journal publications

    PubMed Central

    2010-01-01

    Background Clinical trials are one of the most important sources of evidence for guiding evidence-based practice and the design of new trials. However, most of this information is available only in free text - e.g., in journal publications - which is labour intensive to process for systematic reviews, meta-analyses, and other evidence synthesis studies. This paper presents an automatic information extraction system, called ExaCT, that assists users with locating and extracting key trial characteristics (e.g., eligibility criteria, sample size, drug dosage, primary outcomes) from full-text journal articles reporting on randomized controlled trials (RCTs). Methods ExaCT consists of two parts: an information extraction (IE) engine that searches the article for text fragments that best describe the trial characteristics, and a web browser-based user interface that allows human reviewers to assess and modify the suggested selections. The IE engine uses a statistical text classifier to locate those sentences that have the highest probability of describing a trial characteristic. Then, the IE engine's second stage applies simple rules to these sentences to extract text fragments containing the target answer. The same approach is used for all 21 trial characteristics selected for this study. Results We evaluated ExaCT using 50 previously unseen articles describing RCTs. The text classifier (first stage) was able to recover 88% of relevant sentences among its top five candidates (top5 recall) with the topmost candidate being relevant in 80% of cases (top1 precision). Precision and recall of the extraction rules (second stage) were 93% and 91%, respectively. Together, the two stages of the extraction engine were able to provide (partially) correct solutions in 992 out of 1050 test tasks (94%), with a majority of these (696) representing fully correct and complete answers. Conclusions Our experiments confirmed the applicability and efficacy of ExaCT. Furthermore, they demonstrated that combining a statistical method with 'weak' extraction rules can identify a variety of study characteristics. The system is flexible and can be extended to handle other characteristics and document types (e.g., study protocols). PMID:20920176
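
    The two-stage design can be sketched with a generic sentence classifier plus a 'weak' regular-expression rule, shown below for a sample-size characteristic; the toy training sentences, features and pattern are illustrative, not ExaCT's actual models.

    ```python
    import re
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression

    # Stage 1: statistical classifier ranks sentences likely to state sample size.
    train = ["A total of 120 patients were randomized.",
             "We enrolled 45 participants in the trial.",
             "The study was approved by the ethics board.",
             "Patients gave written informed consent."]
    labels = [1, 1, 0, 0]
    vec = TfidfVectorizer().fit(train)
    clf = LogisticRegression().fit(vec.transform(train), labels)

    # Stage 2: a simple rule extracts the exact fragment from the top sentence.
    SAMPLE_SIZE = re.compile(r"\b(\d+)\s+(?:patients|participants)\b")

    def extract_sample_size(sentences):
        probs = clf.predict_proba(vec.transform(sentences))[:, 1]
        best = sentences[int(probs.argmax())]
        m = SAMPLE_SIZE.search(best)
        return best, (m.group(1) if m else None)

    doc = ["Baseline characteristics were balanced.",
           "A total of 200 patients were randomized to two arms."]
    print(extract_sample_size(doc))   # ('A total of 200 patients ...', '200')
    ```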

  18. Text mining in livestock animal science: introducing the potential of text mining to animal sciences.

    PubMed

    Sahadevan, S; Hofmann-Apitius, M; Schellander, K; Tesfaye, D; Fluck, J; Friedrich, C M

    2012-10-01

    In biological research, establishing the prior art by searching and collecting information already present in the domain is as important as the experiments themselves. To obtain a complete overview of the relevant knowledge, researchers mainly rely on 2 major information sources: i) various biological databases and ii) scientific publications in the field. The major difference between the 2 information sources is that information from databases is available, typically well structured and condensed. The information content in scientific literature is vastly unstructured; that is, dispersed among the many different sections of scientific text. The traditional method of information extraction from scientific literature occurs by generating a list of relevant publications in the field of interest and manually scanning these texts for relevant information, which is very time consuming. It is more than likely that in using this "classical" approach the researcher misses some relevant information mentioned in the literature or has to go through biological databases to extract further information. Text mining and named entity recognition methods have already been used in human genomics and related fields as a solution to this problem. These methods can process and extract information from large volumes of scientific text. Text mining is defined as the automatic extraction of previously unknown and potentially useful information from text. Named entity recognition (NER) is defined as the method of identifying named entities (names of real world objects; for example, gene/protein names, drugs, enzymes) in text. In animal sciences, text mining and related methods have been briefly used in murine genomics and associated fields, leaving behind other fields of animal sciences, such as livestock genomics. The aim of this work was to develop an information retrieval platform in the livestock domain focusing on livestock publications and the recognition of relevant data from cattle and pigs. For this purpose, the rather noncomprehensive resources of pig and cattle gene and protein terminologies were enriched with orthologue synonyms, integrated in the NER platform, ProMiner, which is successfully used in the human genomics domain. Based on the performance tests done, the present system achieved a fair performance with a precision of 0.64, a recall of 0.74, and an F1 measure of 0.69 in a test scenario based on cattle literature.
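
    Dictionary-based NER of the kind described, matching an orthologue-enriched terminology against text, reduces to a lookup sketch like the one below; the lexicon entries are invented, and ProMiner itself performs far more disambiguation.

    ```python
    import re

    # Toy terminology: gene symbols with orthologue-enriched synonyms.
    LEXICON = {
        "MSTN": ["myostatin", "gdf-8", "gdf8"],
        "LEP":  ["leptin", "obese protein"],
    }

    def tag_entities(text):
        # Case-insensitive lookup of every synonym; returns (span, symbol) pairs.
        hits = []
        for symbol, synonyms in LEXICON.items():
            for syn in sorted(synonyms, key=len, reverse=True):
                for m in re.finditer(re.escape(syn), text, re.IGNORECASE):
                    hits.append((m.span(), symbol))
        return sorted(hits)

    print(tag_entities("Myostatin (GDF8) and leptin levels differed in cattle."))
    # [((0, 9), 'MSTN'), ((11, 15), 'MSTN'), ((21, 27), 'LEP')]
    ```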

  19. Susceptibility of Porphyromonas gingivalis and Streptococcus mutans to Antibacterial Effect from Mammea americana

    PubMed Central

    Herrera Herrera, Alejandra; Franco Ospina, Luis; Fang, Luis; Díaz Caballero, Antonio

    2014-01-01

    The development of periodontal disease and dental caries is influenced by several factors, such as microorganisms of the bacterial biofilm or commensal bacteria in the mouth. These microorganisms trigger inflammatory and immune responses in the host. Currently, medicinal plants are treatment options for these oral diseases. Mammea americana extracts have reported antimicrobial effects against several microorganisms. Nevertheless, this effect is unknown against oral bacteria. Therefore, the aim of this study was to evaluate the antibacterial effect of M. americana extract against Porphyromonas gingivalis and Streptococcus mutans. For this, an experimental study was conducted. An ethanolic extract was obtained from seeds of M. americana (one oil phase and one ethanolic phase). The strains of Porphyromonas gingivalis ATCC 33277 and Streptococcus mutans ATCC 25175 were exposed to this extract to evaluate its antibacterial effect. Antibacterial activity was observed with the two phases of the M. americana extract on P. gingivalis and S. mutans, with low MICs (minimum inhibitory concentrations). Also, bactericidal and bacteriostatic activity was detected against S. mutans, depending on the concentration of the extract, while the M. americana extract presented only bacteriostatic activity against P. gingivalis. These findings provide important and promising information allowing for further exploration in the future. PMID:24864137

  20. Quantity and unit extraction for scientific and technical intelligence analysis

    NASA Astrophysics Data System (ADS)

    David, Peter; Hawes, Timothy

    2017-05-01

    Scientific and Technical (S and T) intelligence analysts consume huge amounts of data to understand how scientific progress and engineering efforts affect current and future military capabilities. One of the most important types of information S and T analysts exploit is the quantities discussed in their source material. Frequencies, ranges, size, weight, power, and numerous other properties and measurements describing the performance characteristics of systems and the engineering constraints that define them must be culled from source documents before quantified analysis can begin. Automating the process of finding and extracting the relevant quantities from a wide range of S and T documents is difficult because information about quantities and their units is often contained in unstructured text with ad hoc conventions used to convey their meaning. Currently, even simple tasks, such as searching for documents discussing RF frequencies in a band of interest, are labor intensive and error prone. This research addresses the challenges facing the development of a document processing capability that extracts quantities and units from S and T data, and how Natural Language Processing algorithms can be used to overcome these challenges.
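
    At its simplest, the core step is a pattern over a number followed by a unit token, as sketched below; real S and T documents need far richer unit grammars and convention handling than this illustrative subset.

    ```python
    import re

    # Signed/decimal/exponent number followed by a unit from a small, ordered
    # list (longest variants first so 'GHz' wins over 'Hz').
    UNITS = r"(GHz|MHz|kHz|Hz|km|cm|m|kg|g|kW|W|dB)"
    QUANTITY = re.compile(
        rf"(?P<value>[-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)\s*(?P<unit>{UNITS})\b")

    def extract_quantities(text):
        # Normalized (value, unit) pairs, e.g. to search a band of interest.
        return [(float(m["value"]), m["unit"]) for m in QUANTITY.finditer(text)]

    text = "The radar operates between 8.5 GHz and 10.2 GHz with 25 kW peak power."
    print(extract_quantities(text))
    # [(8.5, 'GHz'), (10.2, 'GHz'), (25.0, 'kW')]
    ```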

  1. Autism, Context/Noncontext Information Processing, and Atypical Development

    PubMed Central

    Skoyles, John R.

    2011-01-01

    Autism has been attributed to a deficit in contextual information processing. Attempts to understand autism in terms of such a defect, however, do not include more recent computational work on context. This work has identified that context information processing depends upon the extraction and use of the information hidden in higher-order (or indirect) associations. Higher-order associations underlie the cognition of context rather than that of situations. This paper starts by examining the differences between higher-order and first-order (or direct) associations. Higher-order associations link entities not directly (as with first-order ones) but indirectly through all the connections they have via other entities. Extracting this information requires the processing of past episodes as a totality. As a result, this extraction depends upon specialised extraction processes separate from cognition. This information is then consolidated. Due to this difference, the extraction/consolidation of higher-order information can be impaired whilst cognition remains intact. Although not directly impaired, cognition will be indirectly impaired by knock-on effects, such as cognition compensating for absent higher-order information with information extracted from first-order associations. This paper discusses the implications of this for the inflexible, literal/immediate, and inappropriate information processing of autistic individuals. PMID:22937255

  2. The identification of clinically important elements within medical journal abstracts: Patient-Population-Problem, Exposure-Intervention, Comparison, Outcome, Duration and Results (PECODR).

    PubMed

    Dawes, Martin; Pluye, Pierre; Shea, Laura; Grad, Roland; Greenberg, Arlene; Nie, Jian-Yun

    2007-01-01

    Information retrieval in primary care is becoming more difficult as the volume of medical information held in electronic databases expands. The lexical structure of this information might permit automatic indexing and improved retrieval. To determine the possibility of identifying the key elements of clinical studies, namely Patient-Population-Problem, Exposure-Intervention, Comparison, Outcome, Duration and Results (PECODR), from abstracts of medical journals. We used a convenience sample of 20 synopses from the journal Evidence-Based Medicine (EBM) and their matching original journal article abstracts obtained from PubMed. Three independent primary care professionals identified PECODR-related extracts of text. Rules were developed to define each PECODR element and the selection process of characters, words, phrases and sentences. From the extracts of text related to PECODR elements, potential lexical patterns that might help identify those elements were proposed and assessed using NVivo software. A total of 835 PECODR-related text extracts containing 41,263 individual text characters were identified from 20 EBM journal synopses. There were 759 extracts in the corresponding PubMed abstracts containing 31,947 characters. PECODR elements were found in nearly all abstracts and synopses with the exception of duration. There was agreement on 86.6% of the extracts from the 20 EBM synopses and 85.0% on the corresponding PubMed abstracts. After consensus this rose to 98.4% and 96.9% respectively. We found potential text patterns in the Comparison, Outcome and Results elements of both EBM synopses and PubMed abstracts. Some phrases and words are used frequently and are specific for these elements in both synopses and abstracts. Results suggest a PECODR-related structure exists in medical abstracts and that there might be lexical patterns specific to these elements. More sophisticated computer-assisted lexical-semantic analysis might refine these results, and pave the way to automating PECODR indexing, and improve information retrieval in primary care.

  3. Pattern Mining for Extraction of mentions of Adverse Drug Reactions from User Comments

    PubMed Central

    Nikfarjam, Azadeh; Gonzalez, Graciela H.

    2011-01-01

    Rapid growth of online health social networks has enabled patients to communicate more easily with each other. This exchange of opinions and experiences has provided a rich source of information about drugs, their effectiveness and, more importantly, their possible adverse reactions. We developed a system to automatically extract mentions of Adverse Drug Reactions (ADRs) from user reviews about drugs on social network websites by mining a set of language patterns. The system applied association rule mining on a set of annotated comments to extract the underlying patterns of colloquial expressions about adverse effects. The patterns were tested on a set of unseen comments to evaluate their performance. We reached a precision of 70.01%, a recall of 66.32% and an F-measure of 67.96%. PMID:22195162

  4. Identifying the Critical Time Period for Information Extraction when Recognizing Sequences of Play

    ERIC Educational Resources Information Center

    North, Jamie S.; Williams, A. Mark

    2008-01-01

    The authors attempted to determine the critical time period for information extraction when recognizing play sequences in soccer. Although efforts have been made to identify the perceptual information underpinning such decisions, no researchers have attempted to determine "when" this information may be extracted from the display. The authors…

  5. Can we replace curation with information extraction software?

    PubMed

    Karp, Peter D

    2016-01-01

    Can we use programs for automated or semi-automated information extraction from scientific texts as practical alternatives to professional curation? I show that error rates of current information extraction programs are too high to replace professional curation today. Furthermore, current IEP programs extract single narrow slivers of information, such as individual protein interactions; they cannot extract the large breadth of information extracted by professional curators for databases such as EcoCyc. They also cannot arbitrate among conflicting statements in the literature as curators can. Therefore, funding agencies should not hobble the curation efforts of existing databases on the assumption that a problem that has stymied Artificial Intelligence researchers for more than 60 years will be solved tomorrow. Semi-automated extraction techniques appear to have significantly more potential based on a review of recent tools that enhance curator productivity. But a full cost-benefit analysis for these tools is lacking. Without such analysis it is possible to expend significant effort developing information-extraction tools that automate small parts of the overall curation workflow without achieving a significant decrease in curation costs.Database URL. © The Author(s) 2016. Published by Oxford University Press.

  6. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Vatsavai, Raju; Cheriyadat, Anil M; Bhaduri, Budhendra L

    The high rate of urbanization, political conflicts and ensuing internal displacement of population, and increased poverty in the 20th century have resulted in a rapid increase of informal settlements. These unplanned, unauthorized, and/or unstructured homes, known as informal settlements, shantytowns, barrios, or slums, pose several challenges to nations, as these settlements are often located in the most hazardous regions and lack basic services. Though several World Bank and United Nations sponsored studies stress the importance of poverty maps in designing better policies and interventions, mapping the slums of the world is a daunting and challenging task. In this paper, we summarize our ongoing research on settlement mapping through the utilization of very high resolution (VHR) remote sensing imagery. Most existing approaches used to classify VHR images are single instance (or pixel-based) learning algorithms, which are inadequate for analyzing VHR imagery, as single pixels do not contain sufficient contextual information (see Figure 1). However, much needed spatial contextual information can be captured via feature extraction and/or through newer machine learning algorithms in order to extract complex spatial patterns that distinguish informal settlements from formal ones. In recent years, we made significant progress in advancing the state of the art in both directions. This paper summarizes these results.

  7. Automatic lip reading by using multimodal visual features

    NASA Astrophysics Data System (ADS)

    Takahashi, Shohei; Ohya, Jun

    2013-12-01

    Speech recognition has been researched for a long time, but it does not work well in noisy places such as in the car or on the train. In addition, people who are hearing-impaired or have difficulty hearing cannot receive benefits from speech recognition. To recognize speech automatically, visual information is also important. People understand speech from not only audio information but also visual information, such as temporal changes in the lip shape. A vision-based speech recognition method could work well in noisy places and could also be useful for people with hearing disabilities. In this paper, we propose an automatic lip-reading method for recognizing speech by using multimodal visual information, without using any audio information. First, the ASM (Active Shape Model) is used to track and detect the face and lip in a video sequence. Second, the shape, optical flow and spatial frequencies of the lip features are extracted from the lip detected by ASM. Next, the extracted multimodal features are ordered chronologically and a Support Vector Machine is used to learn and classify the spoken words. Experiments for classifying several words show promising results for this proposed method.
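
    The final classification step can be sketched with scikit-learn's SVC on chronologically ordered feature vectors; the random features below stand in for the ASM shape, optical-flow and spatial-frequency descriptors, and the kernel choice is a guess rather than the paper's setting.

    ```python
    import numpy as np
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)

    # Stand-in for the multimodal features of one utterance: shape, optical-flow
    # and spatial-frequency descriptors concatenated into fixed-length vectors.
    def fake_features(word_id, n=20):
        return rng.normal(loc=word_id, size=(n, 30))

    X = np.vstack([fake_features(w) for w in range(3)])  # 3 spoken words
    y = np.repeat([0, 1, 2], 20)

    clf = SVC(kernel="rbf", C=1.0).fit(X, y)   # learn and classify spoken words
    print("training accuracy:", clf.score(X, y))
    ```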

  8. Context Oriented Information Integration

    NASA Astrophysics Data System (ADS)

    Mohania, Mukesh; Bhide, Manish; Roy, Prasan; Chakaravarthy, Venkatesan T.; Gupta, Himanshu

    Faced with growing knowledge management needs, enterprises are increasingly realizing the importance of seamlessly integrating critical business information distributed across both structured and unstructured data sources. Academics have focused on this problem, but many obstacles to its widespread use in practice remain. One of the key problems is the absence of schema in unstructured text. In this paper we present a new paradigm for integrating information which overcomes this problem - that of Context Oriented Information Integration. The goal is to integrate unstructured data with the structured data present in the enterprise and use the extracted information to generate actionable insights for the enterprise. We present two techniques which enable context oriented information integration and show how they can be used for solving real world problems.

  9. Semi-Automated Approach for Mapping Urban Trees from Integrated Aerial LiDAR Point Cloud and Digital Imagery Datasets

    NASA Astrophysics Data System (ADS)

    Dogon-Yaro, M. A.; Kumar, P.; Rahman, A. Abdul; Buyuksalih, G.

    2016-09-01

    Mapping of trees plays an important role in modern urban spatial data management, as many benefits and applications derive from this detailed, up-to-date data source. Timely and accurate acquisition of information on the condition of urban trees serves as a tool for decision makers to better appreciate urban ecosystems and their numerous values, which are critical to building up strategies for sustainable development. The conventional techniques used for extracting trees include ground surveying and interpretation of aerial photography. However, these techniques are associated with constraints, such as labour-intensive field work and high financial cost, which can be overcome by means of integrated LiDAR and digital image datasets. Compared to predominant studies on tree extraction, mainly in purely forested areas, this study concentrates on urban areas, which have a high structural complexity with a multitude of different objects. This paper presents a workflow for a semi-automated approach to extracting urban trees from integrated processing of airborne LiDAR point cloud and multispectral digital image datasets over the city of Istanbul, Turkey. The paper reveals that the integrated datasets are a suitable technology and a viable source of information for urban tree management. In conclusion, the extracted information provides a snapshot of the location, composition and extent of trees in the study area, useful to city planners and other decision makers in order to understand how much canopy cover exists, identify new planting, removal or reforestation opportunities, and determine which locations have the greatest need or potential to maximize the benefits of return on investment. It can also help track trends or changes to the urban trees over time and inform future management decisions.

  10. Sieve-based relation extraction of gene regulatory networks from biological literature

    PubMed Central

    2015-01-01

    Background Relation extraction is an essential procedure in literature mining. It focuses on extracting semantic relations between parts of text, called mentions. Biomedical literature includes an enormous amount of textual descriptions of biological entities, their interactions and results of related experiments. To extract them in an explicit, computer readable format, these relations were at first extracted manually from databases. Manual curation was later replaced with automatic or semi-automatic tools with natural language processing capabilities. The current challenge is the development of information extraction procedures that can directly infer more complex relational structures, such as gene regulatory networks. Results We develop a computational approach for extraction of gene regulatory networks from textual data. Our method is designed as a sieve-based system and uses linear-chain conditional random fields and rules for relation extraction. With this method we successfully extracted the sporulation gene regulation network in the bacterium Bacillus subtilis for the information extraction challenge at the BioNLP 2013 conference. To enable extraction of distant relations using first-order models, we transform the data into skip-mention sequences. We infer multiple models, each of which is able to extract different relationship types. Following the shared task, we conducted additional analysis using different system settings that resulted in reducing the reconstruction error of bacterial sporulation network from 0.73 to 0.68, measured as the slot error rate between the predicted and the reference network. We observe that all relation extraction sieves contribute to the predictive performance of the proposed approach. Also, features constructed by considering mention words and their prefixes and suffixes are the most important features for higher accuracy of extraction. Analysis of distances between different mention types in the text shows that our choice of transforming data into skip-mention sequences is appropriate for detecting relations between distant mentions. Conclusions Linear-chain conditional random fields, along with appropriate data transformations, can be efficiently used to extract relations. The sieve-based architecture simplifies the system as new sieves can be easily added or removed and each sieve can utilize the results of previous ones. Furthermore, sieves with conditional random fields can be trained on arbitrary text data and hence are applicable to broad range of relation extraction tasks and data domains. PMID:26551454
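
    The skip-mention transformation, which lets a first-order linear-chain model see relations between distant mentions, can be illustrated with a small sketch; this is a simplified reading of the idea, and the function and example mentions are invented.

    ```python
    def skip_mention_sequences(mentions, max_skip=2):
        # Besides adjacent pairs, emit pairs that skip up to max_skip
        # intervening mentions, so distant relations become "adjacent"
        # in some transformed sequence.
        pairs = []
        for step in range(1, max_skip + 2):
            pairs += [(mentions[i], mentions[i + step], step - 1)
                      for i in range(len(mentions) - step)]
        return pairs

    mentions = ["sigK", "activates", "gerE", "during", "sporulation"]
    for a, b, skipped in skip_mention_sequences(mentions):
        print(f"{a} -> {b}  (skipping {skipped} mention(s))")
    ```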

  11. Sieve-based relation extraction of gene regulatory networks from biological literature.

    PubMed

    Žitnik, Slavko; Žitnik, Marinka; Zupan, Blaž; Bajec, Marko

    2015-01-01

    Relation extraction is an essential procedure in literature mining. It focuses on extracting semantic relations between parts of text, called mentions. Biomedical literature includes an enormous amount of textual descriptions of biological entities, their interactions and results of related experiments. To extract them in an explicit, computer-readable format, these relations were at first extracted manually from databases. Manual curation was later replaced with automatic or semi-automatic tools with natural language processing capabilities. The current challenge is the development of information extraction procedures that can directly infer more complex relational structures, such as gene regulatory networks. We develop a computational approach for extraction of gene regulatory networks from textual data. Our method is designed as a sieve-based system and uses linear-chain conditional random fields and rules for relation extraction. With this method we successfully extracted the sporulation gene regulation network in the bacterium Bacillus subtilis for the information extraction challenge at the BioNLP 2013 conference. To enable extraction of distant relations using first-order models, we transform the data into skip-mention sequences. We infer multiple models, each of which is able to extract different relationship types. Following the shared task, we conducted additional analysis using different system settings that reduced the reconstruction error of the bacterial sporulation network from 0.73 to 0.68, measured as the slot error rate between the predicted and the reference network. We observe that all relation extraction sieves contribute to the predictive performance of the proposed approach. Also, features constructed by considering mention words and their prefixes and suffixes are the most important features for higher accuracy of extraction. Analysis of distances between different mention types in the text shows that our choice of transforming data into skip-mention sequences is appropriate for detecting relations between distant mentions. Linear-chain conditional random fields, along with appropriate data transformations, can be efficiently used to extract relations. The sieve-based architecture simplifies the system, as new sieves can be easily added or removed and each sieve can utilize the results of previous ones. Furthermore, sieves with conditional random fields can be trained on arbitrary text data and hence are applicable to a broad range of relation extraction tasks and data domains.
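
    The key data transformation here, skip-mention sequences, lets a first-order (linear-chain) model see pairs of mentions that are far apart by building, for each skip distance k, a sequence that keeps only every (k+1)-th mention. The sketch below is a minimal illustration of that construction under our reading of the idea; representing mentions as plain strings and the function name are our assumptions, not the authors' implementation.

        def skip_mention_sequences(mentions, max_skip=2):
            """Build skip-mention sequences: for skip k, keep every (k+1)-th mention,
            starting from each possible offset, so distant mention pairs become
            adjacent and visible to a first-order linear-chain model (e.g. a CRF)."""
            sequences = []
            for k in range(max_skip + 1):
                step = k + 1
                for offset in range(step):
                    seq = mentions[offset::step]
                    if len(seq) > 1:
                        sequences.append((k, seq))
            return sequences

        mentions = ["sigF", "regulates", "spoIIR", "which", "activates", "sigE"]
        for k, seq in skip_mention_sequences(mentions):
            print(k, seq)
        # skip 0 keeps adjacent mentions; skip 2 makes e.g. ["spoIIR", "sigE"]
        # adjacent, so a relation between distant mentions can be modeled first-order.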

  12. Extraction of CT dose information from DICOM metadata: automated Matlab-based approach.

    PubMed

    Dave, Jaydev K; Gingold, Eric L

    2013-01-01

    The purpose of this study was to extract exposure parameters and dose-relevant indexes of CT examinations from information embedded in DICOM metadata. DICOM dose report files were identified and retrieved from a PACS. An automated software program was used to extract, from these files, the information in the structured elements of the DICOM metadata that is relevant to exposure. Extracting information from DICOM metadata eliminated potential errors inherent in techniques based on optical character recognition, yielding 100% accuracy.
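
    The paper describes a Matlab-based tool; as a rough analogue, the sketch below walks the content tree of a DICOM dose structured report with the Python pydicom library and prints any numeric measurements (e.g. CTDIvol, DLP) it finds. The file name is a placeholder, and real dose SRs nest these items more deeply, so treat this as a minimal sketch of the idea rather than the authors' implementation.

        import pydicom

        def walk_sr_content(item, depth=0):
            """Recursively print coded concepts and numeric values from an SR tree."""
            name = ""
            if "ConceptNameCodeSequence" in item:
                name = item.ConceptNameCodeSequence[0].CodeMeaning
            if "MeasuredValueSequence" in item:  # numeric content items (e.g. CTDIvol)
                for mv in item.MeasuredValueSequence:
                    print("  " * depth + f"{name}: {mv.NumericValue}")
            if "ContentSequence" in item:        # descend into nested containers
                for child in item.ContentSequence:
                    walk_sr_content(child, depth + 1)

        ds = pydicom.dcmread("ct_dose_report.dcm")  # placeholder path
        walk_sr_content(ds)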

  13. Accurate facade feature extraction method for buildings from three-dimensional point cloud data considering structural information

    NASA Astrophysics Data System (ADS)

    Wang, Yongzhi; Ma, Yuqing; Zhu, A.-xing; Zhao, Hui; Liao, Lixia

    2018-05-01

    Facade features represent segmentations of building surfaces and can serve as a building framework. Extracting facade features from three-dimensional (3D) point cloud data (3D PCD) is an efficient method for 3D building modeling. By combining the advantages of 3D PCD and two-dimensional optical images, this study describes the creation of a highly accurate building facade feature extraction method from 3D PCD with a focus on structural information. The new extraction method involves three major steps: image feature extraction, exploration of the mapping method between the image features and 3D PCD, and optimization of the initial 3D PCD facade features considering structural information. Results show that the new method can extract the 3D PCD facade features of buildings more accurately and continuously. The new method is validated using a case study. In addition, the effectiveness of the new method is demonstrated by comparing it with the range image-extraction method and the optical image-extraction method in the absence of structural information. The 3D PCD facade features extracted by the new method can be applied in many fields, such as 3D building modeling and building information modeling.

  14. Integrated feature extraction and selection for neuroimage classification

    NASA Astrophysics Data System (ADS)

    Fan, Yong; Shen, Dinggang

    2009-02-01

    Feature extraction and selection are of great importance in neuroimage classification for identifying informative features and reducing feature dimensionality, which are generally implemented as two separate steps. This paper presents an integrated feature extraction and selection algorithm with two iterative steps: constrained subspace learning based feature extraction and support vector machine (SVM) based feature selection. The subspace learning based feature extraction focuses on the brain regions with higher possibility of being affected by the disease under study, while the possibility of brain regions being affected by disease is estimated by the SVM based feature selection, in conjunction with SVM classification. This algorithm can not only take into account the inter-correlation among different brain regions, but also overcome the limitation of traditional subspace learning based feature extraction methods. To achieve robust performance and optimal selection of parameters involved in feature extraction, selection, and classification, a bootstrapping strategy is used to generate multiple versions of training and testing sets for parameter optimization, according to the classification performance measured by the area under the ROC (receiver operating characteristic) curve. The integrated feature extraction and selection method is applied to a structural MR image based Alzheimer's disease (AD) study with 98 non-demented and 100 demented subjects. Cross-validation results indicate that the proposed algorithm can improve performance of the traditional subspace learning based classification.
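
    The abstract couples subspace-learning feature extraction with SVM-based feature selection. As a simplified stand-in for the selection step, the sketch below uses recursive feature elimination with a linear SVM from scikit-learn on synthetic data; this is our illustrative substitute, not the authors' integrated algorithm (which iterates selection with constrained subspace learning and bootstrapped parameter optimization).

        from sklearn.datasets import make_classification
        from sklearn.feature_selection import RFE
        from sklearn.svm import LinearSVC

        # Synthetic stand-in for region-wise neuroimaging features.
        X, y = make_classification(n_samples=200, n_features=50,
                                   n_informative=5, random_state=0)

        # Rank features by repeatedly fitting a linear SVM and dropping the
        # features with the smallest absolute weights.
        selector = RFE(LinearSVC(C=1.0, dual=False, max_iter=5000),
                       n_features_to_select=5, step=5)
        selector.fit(X, y)
        print("selected feature indices:",
              [i for i, kept in enumerate(selector.support_) if kept])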

  15. Urban Boundary Extraction and Urban Sprawl Measurement Using High-Resolution Remote Sensing Images: a Case Study of China's Provincial

    NASA Astrophysics Data System (ADS)

    Wang, H.; Ning, X.; Zhang, H.; Liu, Y.; Yu, F.

    2018-04-01

    Urban boundaries are an important indicator for urban sprawl analysis. However, methods of urban boundary extraction have been inconsistent, and construction land or urban impervious surfaces were usually used to represent urban areas in coarse-resolution images, resulting in lower precision and incomparable urban boundary products. To solve these problems, a semi-automatic method of urban boundary extraction is proposed that uses high-resolution imagery and geographic information data. Urban landscape and form characteristics and geographical knowledge were combined to generate a series of standardized rules for urban boundary extraction. Urban boundaries of China's 31 provincial capitals in the years 2000, 2005, 2010 and 2015 were extracted with this method. Compared with two other open urban boundary products, the boundaries extracted in this study were the most accurate. The urban boundaries, together with other thematic data, were integrated to measure and analyse urban sprawl. Results showed that China's provincial capitals underwent rapid urbanization from 2000 to 2015, with their combined area growing from 6,520 square kilometres to 12,398 square kilometres. Urban areas of the provincial capitals showed remarkable regional differences and a high degree of concentration. Urban land use became more intensive in general. Urban sprawl rates were out of step with population growth rates. About sixty percent of the new urban areas came from cultivated land. The paper provides a consistent method of urban boundary extraction and urban sprawl measurement using high-resolution remote sensing images. The urban sprawl results for China's provincial capitals provide valuable urbanization information for governments and the public.

  16. From data to information and knowledge for geospatial applications

    NASA Astrophysics Data System (ADS)

    Schenk, T.; Csatho, B.; Yoon, T.

    2006-12-01

    An ever-increasing number of airborne and spaceborne data-acquisition missions with various sensors produce a glut of data. Sensory data rarely contain information in an explicit form that an application can use directly. The processing and analysis of data constitutes a real bottleneck; therefore, automating the processes of gaining useful information and knowledge from the raw data is of paramount interest. This presentation is concerned with the transition from data to information and knowledge. By data we refer to the sensor output, and we note that data very rarely provide direct answers for applications. For example, a pixel in a digital image or a laser point from a LIDAR system (data) has no direct relationship with elevation changes of topographic surfaces or the velocity of a glacier (information, knowledge). We propose to employ the computer vision paradigm to extract information and knowledge as it pertains to a wide range of geoscience applications. After introducing the paradigm we describe the major steps to be undertaken for extracting information and knowledge from sensory input data. Features play an important role in this process. Thus we focus on extracting features and their perceptual organization into higher-order constructs. We demonstrate these concepts with imaging data and laser point clouds. The second part of the presentation addresses the problem of combining data obtained by different sensors. An absolute prerequisite for successful fusion is to establish a common reference frame. We elaborate on the concept of sensor-invariant features that allow the registration of such disparate data sets as aerial/satellite imagery, 3D laser point clouds, and multi/hyperspectral imagery. Fusion takes place on the data level (sensor registration) and on the information level. We show how fusion increases the degree of automation for reconstructing topographic surfaces. Moreover, fused information gained from the three sensors results in a more abstract surface representation with a rich set of explicit surface information that can be readily used by an analyst for applications such as change detection.

  17. The Agent of extracting Internet Information with Lead Order

    NASA Astrophysics Data System (ADS)

    Mo, Zan; Huang, Chuliang; Liu, Aijun

    In order to carry out e-commerce better, advanced technologies for accessing business information are urgently needed. An agent is described that deals with the problems of extracting Internet information caused by the non-standard and disorganized structure of Chinese websites. The agent includes three modules, each responsible for one stage of the extraction process. An HTTP-tree method and a Lead algorithm are proposed to generate a lead order, with which the required web pages can be retrieved easily. How to transform the extracted natural-language information into a structured form is also discussed.

  18. Automatic segmentation of the bone and extraction of the bone cartilage interface from magnetic resonance images of the knee

    NASA Astrophysics Data System (ADS)

    Fripp, Jurgen; Crozier, Stuart; Warfield, Simon K.; Ourselin, Sébastien

    2007-03-01

    The accurate segmentation of the articular cartilages from magnetic resonance (MR) images of the knee is important for clinical studies and drug trials into conditions like osteoarthritis. Currently, segmentations are obtained using time-consuming manual or semi-automatic algorithms which have high inter- and intra-observer variabilities. This paper presents an important step towards obtaining automatic and accurate segmentations of the cartilages, namely an approach to automatically segment the bones and extract the bone-cartilage interfaces (BCI) in the knee. The segmentation is performed using three-dimensional active shape models, which are initialized using an affine registration to an atlas. The BCI are then extracted using image information and prior knowledge about the likelihood of each point belonging to the interface. The accuracy and robustness of the approach was experimentally validated using an MR database of fat-suppressed spoiled gradient-recalled images. The (femur, tibia, patella) bone segmentation had a median Dice similarity coefficient of (0.96, 0.96, 0.89) and an average point-to-surface error of 0.16 mm on the BCI. The extracted BCI had a median surface overlap of 0.94 with the real interface, demonstrating its usefulness for subsequent cartilage segmentation or quantitative analysis.
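
    The headline numbers here are Dice similarity coefficients. For reference, the Dice coefficient of two binary segmentations A and B is 2|A∩B|/(|A|+|B|), as in this small numpy sketch (the toy masks are ours):

        import numpy as np

        def dice(a, b):
            """Dice similarity coefficient of two boolean masks: 2|A∩B|/(|A|+|B|)."""
            a = a.astype(bool)
            b = b.astype(bool)
            denom = a.sum() + b.sum()
            return 2.0 * np.logical_and(a, b).sum() / denom if denom else 1.0

        auto = np.array([[1, 1, 0], [0, 1, 0]])
        manual = np.array([[1, 1, 0], [0, 0, 1]])
        print(dice(auto, manual))  # 2*2 / (3+3) ≈ 0.667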

  19. Geographic Information System (GIS) capabilities in traffic accident information management: a qualitative approach.

    PubMed

    Ahmadi, Maryam; Valinejadi, Ali; Goodarzi, Afshin; Safari, Ameneh; Hemmat, Morteza; Majdabadi, Hesamedin Askari; Mohammadi, Ali

    2017-06-01

    Traffic accidents are one of the more important national and international issues, and their consequences matter at the political, economic, and social levels of a country. Management of traffic accident information requires information systems with analytical capabilities and access to spatial and descriptive data. The aim of this study was to determine the capabilities of a Geographic Information System (GIS) in the management of traffic accident information. This qualitative cross-sectional study was performed in 2016. In the first step, GIS capabilities were identified via literature retrieved from the Internet and based on the inclusion criteria. Review of the literature was performed until data saturation was reached; a form was used to extract the capabilities. In the second step, the study population comprised hospital managers, police, emergency personnel, statisticians, and IT experts in trauma, emergency and police centers. Sampling was purposive. Data were collected using a questionnaire based on the first-step data; validity and reliability were established by content validity and a Cronbach's alpha of 0.75. Data were analyzed using the decision Delphi technique. GIS capabilities were identified in ten categories and 64 sub-categories. Importing and processing spatial and descriptive data, and the analysis of these data, were the most important capabilities of GIS in traffic accident information management. Storing and retrieving descriptive and spatial data; providing statistical analysis in table, chart and zoning formats; managing ill-structured issues; determining the cost-effectiveness of decisions; and prioritizing their implementation were the most important capabilities of GIS that can be efficient in the management of traffic accident information.

  20. Longitudinal Analysis of New Information Types in Clinical Notes

    PubMed Central

    Zhang, Rui; Pakhomov, Serguei; Melton, Genevieve B.

    2014-01-01

    It is increasingly recognized that redundant information in clinical notes within electronic health record (EHR) systems is ubiquitous, significant, and may negatively impact the secondary use of these notes for research and patient care. We investigated several automated methods to identify redundant versus relevant new information in clinical reports. These methods may provide a valuable approach to extract clinically pertinent information and further improve the accuracy of clinical information extraction systems. In this study, we used UMLS semantic types to extract several types of new information, including problems, medications, and laboratory information. Automatically identified new information highly correlated with manual reference standard annotations. Methods to identify different types of new information can potentially help to build up more robust information extraction systems for clinical researchers as well as aid clinicians and researchers in navigating clinical notes more effectively and quickly identify information pertaining to changes in health states. PMID:25717418

  1. Stimulus encoding and feature extraction by multiple sensory neurons.

    PubMed

    Krahe, Rüdiger; Kreiman, Gabriel; Gabbiani, Fabrizio; Koch, Christof; Metzner, Walter

    2002-03-15

    Neighboring cells in topographical sensory maps may transmit similar information to the next higher level of processing. How information transmission by groups of nearby neurons compares with the performance of single cells is a very important question for understanding the functioning of the nervous system. To tackle this problem, we quantified stimulus-encoding and feature extraction performance by pairs of simultaneously recorded electrosensory pyramidal cells in the hindbrain of weakly electric fish. These cells constitute the output neurons of the first central nervous stage of electrosensory processing. Using random amplitude modulations (RAMs) of a mimic of the fish's own electric field within behaviorally relevant frequency bands, we found that pyramidal cells with overlapping receptive fields exhibit strong stimulus-induced correlations. To quantify the encoding of the RAM time course, we estimated the stimuli from simultaneously recorded spike trains and found significant improvements over single spike trains. The quality of stimulus reconstruction, however, was still inferior to the one measured for single primary sensory afferents. In an analysis of feature extraction, we found that spikes of pyramidal cell pairs coinciding within a time window of a few milliseconds performed significantly better at detecting upstrokes and downstrokes of the stimulus compared with isolated spikes and even spike bursts of single cells. Coincident spikes can thus be considered "distributed bursts." Our results suggest that stimulus encoding by primary sensory afferents is transformed into feature extraction at the next processing stage. There, stimulus-induced coincident activity can improve the extraction of behaviorally relevant features from the stimulus.

  2. Extracting remanent magnetization from magnetic data inversion

    NASA Astrophysics Data System (ADS)

    Liu, S.; Fedi, M.; Baniamerian, J.; Hu, X.

    2017-12-01

    Remanent magnetization is an important vector property of rock and ore magnetism; it is related to the intensity and direction of the primary geomagnetic field in past geological periods and hence provides critical evidence of tectonic movement and sedimentary evolution. We extract the remanence information from the distribution of the inverted magnetization vector. Firstly, the direction of the total magnetization vector is estimated from the reduced-to-pole anomaly (max-min algorithm) and from its correlations with other magnitude magnetic transforms, such as the magnitude magnetic anomaly and the normalized source strength. We then invert the data for the magnetization intensity, and finally the intensity and direction of the remanent magnetization are separated from the total magnetization vector with a generalized formula for the apparent susceptibility, based on a priori information on the Koenigsberger ratio. Our approach is used to investigate targeted resources and geologic processes in mining areas of China.
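
    The separation step rests on the decomposition J_total = J_induced + J_remanent with the Koenigsberger ratio Q = |J_remanent|/|J_induced| given a priori. If the induced component lies along a known field direction u, then |J_total - a*u| = Q*a for the unknown induced intensity a, a quadratic that the sketch below solves with numpy. This is our reading of the standard decomposition, not the authors' generalized apparent-susceptibility formula.

        import numpy as np

        def separate_remanence(j_total, u, Q):
            """Split a total magnetization vector into induced and remanent parts,
            given the induced direction u (unit vector) and Koenigsberger ratio Q.
            Solves |j_total - a*u| = Q*a for the induced intensity a > 0:
                (1 - Q^2) a^2 - 2 (j_total . u) a + |j_total|^2 = 0
            """
            c1 = 1.0 - Q**2
            c2 = -2.0 * np.dot(j_total, u)
            c3 = np.dot(j_total, j_total)
            roots = np.roots([c1, c2, c3])  # np.roots drops a leading zero if Q = 1
            a = min(r.real for r in roots if abs(r.imag) < 1e-9 and r.real > 0)
            j_ind = a * u
            return j_ind, j_total - j_ind

        u = np.array([0.0, 0.0, 1.0])    # assumed inducing-field direction
        j_t = np.array([0.5, 0.0, 1.5])  # measured total magnetization
        j_i, j_r = separate_remanence(j_t, u, Q=1.0)
        print(j_i, j_r, np.linalg.norm(j_r) / np.linalg.norm(j_i))  # ratio ≈ Q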

  3. Extraction of decision rules via imprecise probabilities

    NASA Astrophysics Data System (ADS)

    Abellán, Joaquín; López, Griselda; Garach, Laura; Castellano, Javier G.

    2017-05-01

    Data analysis techniques can be applied to discover important relations among features. This is the main objective of the Information Root Node Variation (IRNV) technique, a new method to extract knowledge from data via decision trees. The decision trees used by the original method were built using classic split criteria. The performance of new split criteria based on imprecise probabilities and uncertainty measures, called credal split criteria, differs significantly from the performance obtained using the classic criteria. This paper extends the IRNV method using two credal split criteria: one based on a mathematical parametric model, and the other based on a non-parametric model. The performance of the method is analyzed using a case study of traffic accident data to identify patterns related to the severity of an accident. We found that a larger number of rules is generated, significantly supplementing the information obtained using the classic split criteria.
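
    A common parametric credal criterion uses the maximum entropy of the credal set induced by the imprecise Dirichlet model (IDM): class counts n_i define probability intervals [n_i/(N+s), (n_i+s)/(N+s)], and the split score uses the distribution of maximum entropy inside that set, found by water-filling the extra mass s onto the smallest counts. The sketch below is a minimal illustration under that reading; the function and parameter names are ours.

        import math

        def max_entropy_idm(counts, s=1.0):
            """Upper entropy of the IDM credal set for class counts: the extra
            mass s is water-filled onto the smallest counts, which yields the
            maximum-entropy distribution compatible with the probability
            intervals [n_i/(N+s), (n_i+s)/(N+s)]."""
            x = sorted(float(c) for c in counts)
            remaining = s
            while remaining > 1e-12:
                low = x[0]
                k = sum(1 for v in x if v == low)          # size of the lowest group
                target = x[k] if k < len(x) else math.inf  # next level up
                add = min(remaining, (target - low) * k)   # level the group up
                for i in range(k):
                    x[i] += add / k
                remaining -= add
            total = sum(x)
            return -sum(v / total * math.log2(v / total) for v in x if v > 0)

        # With counts (8, 2) and s = 1, all extra mass goes to the minority class,
        # giving the entropy of the distribution (8/11, 3/11).
        print(round(max_entropy_idm([8, 2], s=1.0), 3))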

  4. A mask quality control tool for the OSIRIS multi-object spectrograph

    NASA Astrophysics Data System (ADS)

    López-Ruiz, J. C.; Vaz Cedillo, Jacinto Javier; Ederoclite, Alessandro; Bongiovanni, Ángel; González Escalera, Víctor

    2012-09-01

    OSIRIS multi-object spectrograph uses a set of user-customised masks, which are manufactured on demand. The manufacturing process consists of drilling the specified slits in the mask with the required accuracy. Ensuring that the slits are in the right place when observing is of vital importance. We present a tool for checking the quality of the mask manufacturing process, based on analyzing the instrument images obtained with the manufactured masks in place. The tool extracts the slit information from these images, relates the specifications to the extracted slit information, and finally reports to the operator whether the manufactured mask fulfills the expectations of the mask designer. The proposed tool has been built using scripting languages and standard libraries such as opencv, pyraf and scipy. The software architecture, advantages and limits of this tool in the lifecycle of a multi-object acquisition are presented.

  5. Decoding memory features from hippocampal spiking activities using sparse classification models.

    PubMed

    Dong Song; Hampson, Robert E; Robinson, Brian S; Marmarelis, Vasilis Z; Deadwyler, Sam A; Berger, Theodore W

    2016-08-01

    To understand how memory information is encoded in the hippocampus, we build classification models to decode memory features from hippocampal CA3 and CA1 spatio-temporal patterns of spikes recorded from epilepsy patients performing a memory-dependent delayed match-to-sample task. The classification model consists of a set of B-spline basis functions for extracting memory features from the spike patterns, and a sparse logistic regression classifier for generating binary categorical output of memory features. Results show that the classification models can extract a significant amount of memory information with respect to the types of memory tasks and the categories of sample images used in the task, despite the high level of variability in prediction accuracy due to the small sample size. These results support the hypothesis that memories are encoded in hippocampal activities and have important implications for the development of hippocampal memory prostheses.
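
    As a simplified stand-in for this pipeline, the sketch below expands a spike-count time series in a B-spline basis (scikit-learn's SplineTransformer) and fits an L1-penalized ("sparse") logistic regression; the synthetic data and all parameter values are assumptions for illustration, not the authors' settings.

        import numpy as np
        from sklearn.linear_model import LogisticRegression
        from sklearn.preprocessing import SplineTransformer

        rng = np.random.default_rng(0)
        # Synthetic trials: 100 time bins of spike counts per trial, binary label.
        n_trials, n_bins = 120, 100
        X_raw = rng.poisson(2.0, size=(n_trials, n_bins)).astype(float)
        y = rng.integers(0, 2, size=n_trials)
        X_raw[y == 1, 40:60] += 1.5  # class-dependent firing bump

        # Project each trial onto a small B-spline basis over time: this smooths
        # the spike pattern and reduces dimensionality before classification.
        t = np.arange(n_bins).reshape(-1, 1)
        B = SplineTransformer(n_knots=8, degree=3).fit(t).transform(t)
        X = X_raw @ B  # per-trial spline coefficients

        # Sparse (L1) logistic regression keeps only the informative coefficients.
        clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.1).fit(X, y)
        print("nonzero weights:", np.sum(clf.coef_ != 0), "of", clf.coef_.size)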

  6. Exploring patterns of epigenetic information with data mining techniques.

    PubMed

    Aguiar-Pulido, Vanessa; Seoane, José A; Gestal, Marcos; Dorado, Julián

    2013-01-01

    Data mining, a part of the Knowledge Discovery in Databases (KDD) process, is the process of extracting patterns from large data sets by combining methods from statistics and artificial intelligence with database management. Analyses of epigenetic data have evolved towards genome-wide and high-throughput approaches, thus generating great amounts of data for which data mining is essential. Part of these data may contain patterns of epigenetic information which are mitotically and/or meiotically heritable, determining gene expression and cellular differentiation, as well as cellular fate. Epigenetic lesions and genetic mutations are acquired by individuals during their life and accumulate with ageing. Both defects, either together or individually, can result in losing control over cell growth and, thus, causing cancer development. Data mining techniques could then be used to extract these patterns. This work reviews some of the most important applications of data mining to epigenetics.

  7. Spatial Uncertainty Modeling of Fuzzy Information in Images for Pattern Classification

    PubMed Central

    Pham, Tuan D.

    2014-01-01

    The modeling of the spatial distribution of image properties is important for many pattern recognition problems in science and engineering. Mathematical methods are needed to quantify the variability of this spatial distribution based on which a decision of classification can be made in an optimal sense. However, image properties are often subject to uncertainty due to both incomplete and imprecise information. This paper presents an integrated approach for estimating the spatial uncertainty of vagueness in images using the theory of geostatistics and the calculus of probability measures of fuzzy events. Such a model for the quantification of spatial uncertainty is utilized as a new image feature extraction method, based on which classifiers can be trained to perform the task of pattern recognition. Applications of the proposed algorithm to the classification of various types of image data suggest the usefulness of the proposed uncertainty modeling technique for texture feature extraction. PMID:25157744

  8. Mutual information, neural networks and the renormalization group

    NASA Astrophysics Data System (ADS)

    Koch-Janusz, Maciej; Ringel, Zohar

    2018-06-01

    Physical systems differing in their microscopic details often display strikingly similar behaviour when probed at macroscopic scales. Those universal properties, largely determining their physical characteristics, are revealed by the powerful renormalization group (RG) procedure, which systematically retains `slow' degrees of freedom and integrates out the rest. However, the important degrees of freedom may be difficult to identify. Here we demonstrate a machine-learning algorithm capable of identifying the relevant degrees of freedom and executing RG steps iteratively without any prior knowledge about the system. We introduce an artificial neural network based on a model-independent, information-theoretic characterization of a real-space RG procedure, which performs this task. We apply the algorithm to classical statistical physics problems in one and two dimensions. We demonstrate RG flow and extract the Ising critical exponent. Our results demonstrate that machine-learning techniques can extract abstract physical concepts and consequently become an integral part of theory- and model-building.

  9. A novel quantum steganography scheme for color images

    NASA Astrophysics Data System (ADS)

    Li, Panchi; Liu, Xiande

    In quantum image steganography, embedding capacity and security are two important issues. This paper presents a novel quantum steganography scheme using color images as cover images. First, the secret information is divided into 3-bit segments, and then each 3-bit segment is embedded into the LSB of one color pixel in the cover image according to its own value and using Gray code mapping rules. Extraction is the inverse of embedding. We designed the quantum circuits that implement the embedding and extraction processes. The simulation results on a classical computer show that the proposed scheme outperforms several other existing schemes in terms of embedding capacity and security.
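
    The classical analogue of this embedding is easy to sketch: each 3-bit secret segment is converted to its Gray code (g = v XOR (v >> 1)) and its three bits replace the least significant bits of the R, G and B channels of one pixel; extraction reads the LSBs back and inverts the Gray mapping. The numpy sketch below illustrates that scheme; the exact mapping rules of the paper's quantum circuits may differ.

        import numpy as np

        def embed(pixels, segments):
            """Hide one 3-bit value per RGB pixel, Gray-coded, in channel LSBs."""
            out = pixels.copy()
            for i, v in enumerate(segments):
                g = v ^ (v >> 1)                      # binary -> Gray code
                for c in range(3):                    # one bit per channel (R, G, B)
                    bit = (g >> (2 - c)) & 1
                    out[i, c] = (out[i, c] & 0xFE) | bit
            return out

        def extract(pixels, n):
            """Recover n Gray-coded 3-bit values from channel LSBs."""
            values = []
            for i in range(n):
                g = sum((int(pixels[i, c]) & 1) << (2 - c) for c in range(3))
                v, shift = g, 1
                while g >> shift:                     # Gray -> binary
                    v ^= g >> shift
                    shift += 1
                values.append(v)
            return values

        cover = np.array([[120, 33, 200], [10, 11, 12]], dtype=np.uint8)
        stego = embed(cover, [5, 3])
        print(extract(stego, 2))  # [5, 3]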

  10. Linking attentional processes and conceptual problem solving: visual cues facilitate the automaticity of extracting relevant information from diagrams

    PubMed Central

    Rouinfar, Amy; Agra, Elise; Larson, Adam M.; Rebello, N. Sanjay; Loschky, Lester C.

    2014-01-01

    This study investigated links between visual attention processes and conceptual problem solving. This was done by overlaying visual cues on conceptual physics problem diagrams to direct participants’ attention to relevant areas to facilitate problem solving. Participants (N = 80) individually worked through four problem sets, each containing a diagram, while their eye movements were recorded. Each diagram contained regions that were relevant to solving the problem correctly and separate regions related to common incorrect responses. Problem sets contained an initial problem, six isomorphic training problems, and a transfer problem. The cued condition saw visual cues overlaid on the training problems. Participants’ verbal responses were used to determine their accuracy. This study produced two major findings. First, short duration visual cues which draw attention to solution-relevant information and aid in the organizing and integrating of it, facilitate both immediate problem solving and generalization of that ability to new problems. Thus, visual cues can facilitate re-representing a problem and overcoming impasse, enabling a correct solution. Importantly, these cueing effects on problem solving did not involve the solvers’ attention necessarily embodying the solution to the problem, but were instead caused by solvers attending to and integrating relevant information in the problems into a solution path. Second, this study demonstrates that when such cues are used across multiple problems, solvers can automatize the extraction of problem-relevant information. These results suggest that low-level attentional selection processes provide a necessary gateway for relevant information to be used in problem solving, but are generally not sufficient for correct problem solving. Instead, factors that lead a solver to an impasse and to organize and integrate problem information also greatly facilitate arriving at correct solutions. PMID:25324804

  11. Social Network Analysis of Elders' Health Literacy and their Use of Online Health Information

    PubMed Central

    Jang, Haeran

    2014-01-01

    Objectives Utilizing social network analysis, this study aimed to analyze the main keywords in the literature regarding the health literacy of and the use of online health information by aged persons over 65. Methods Medical Subject Heading keywords were extracted from articles on the PubMed database of the National Library of Medicine. For health literacy, 110 articles out of 361 were initially extracted. Seventy-one keywords out of 1,021 were finally selected after removing repeated keywords and applying pruning. Regarding the use of online health information, 19 articles out of 26 were selected. One hundred forty-four keywords were initially extracted. After removing the repeated keywords, 74 keywords were finally selected. Results Health literacy was found to be strongly connected with 'Health knowledge, attitudes, practices' and 'Patient education as topic.' 'Computer literacy' had strong connections with 'Internet' and 'Attitude towards computers.' 'Computer literacy' was connected to 'Health literacy,' and was studied according to the parameters 'Attitude towards health' and 'Patient education as topic.' The use of online health information was strongly connected with 'Health knowledge, attitudes, practices,' 'Consumer health information,' 'Patient education as topic,' etc. In the network, 'Computer literacy' was connected with 'Health education,' 'Patient satisfaction,' 'Self-efficacy,' 'Attitude to computer,' etc. Conclusions Research on older citizens' health literacy and their use of online health information was conducted together with study of computer literacy, patient education, attitude towards health, health education, patient satisfaction, etc. In particular, self-efficacy was noted as an important keyword. Further research should be conducted to identify the effective outcomes of self-efficacy in the area of interest. PMID:25152835

  12. Linking attentional processes and conceptual problem solving: visual cues facilitate the automaticity of extracting relevant information from diagrams.

    PubMed

    Rouinfar, Amy; Agra, Elise; Larson, Adam M; Rebello, N Sanjay; Loschky, Lester C

    2014-01-01

    This study investigated links between visual attention processes and conceptual problem solving. This was done by overlaying visual cues on conceptual physics problem diagrams to direct participants' attention to relevant areas to facilitate problem solving. Participants (N = 80) individually worked through four problem sets, each containing a diagram, while their eye movements were recorded. Each diagram contained regions that were relevant to solving the problem correctly and separate regions related to common incorrect responses. Problem sets contained an initial problem, six isomorphic training problems, and a transfer problem. The cued condition saw visual cues overlaid on the training problems. Participants' verbal responses were used to determine their accuracy. This study produced two major findings. First, short duration visual cues which draw attention to solution-relevant information and aid in the organizing and integrating of it, facilitate both immediate problem solving and generalization of that ability to new problems. Thus, visual cues can facilitate re-representing a problem and overcoming impasse, enabling a correct solution. Importantly, these cueing effects on problem solving did not involve the solvers' attention necessarily embodying the solution to the problem, but were instead caused by solvers attending to and integrating relevant information in the problems into a solution path. Second, this study demonstrates that when such cues are used across multiple problems, solvers can automatize the extraction of problem-relevant information. These results suggest that low-level attentional selection processes provide a necessary gateway for relevant information to be used in problem solving, but are generally not sufficient for correct problem solving. Instead, factors that lead a solver to an impasse and to organize and integrate problem information also greatly facilitate arriving at correct solutions.

  13. Educational Data Mining Application for Estimating Students Performance in Weka Environment

    NASA Astrophysics Data System (ADS)

    Gowri, G. Shiyamala; Thulasiram, Ramasamy; Amit Baburao, Mahindra

    2017-11-01

    Educational data mining (EDM) is a multi-disciplinary research area that combines artificial intelligence, statistical modeling and data mining with the data generated by an educational institution. EDM applies computational methods to educational data in order to investigate educational questions. For an education system to stand out among those of other nations, it must undergo a major transition by redesigning its framework. Hidden patterns can be extracted from various information repositories by adopting data mining techniques. To summarize the performance of students along with their credentials, we examine the application of data mining in academics. The Apriori algorithm is applied to the student database to mine association rules, extracting similar patterns and their associations across various sets of records; K-means clustering is applied to the same database to group students into categories. The parameters used in this study give more importance to psychological traits than to academic features. Undesirable student conduct can be clearly identified with such data mining frameworks, and the algorithms prove effective for profiling students in any educational environment. The ultimate objective of the study is to predict whether a student is prone to violence.
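
    As a minimal illustration of the two techniques named here, the sketch below mines association rules with the Apriori implementation from the mlxtend library and clusters numeric student features with scikit-learn's KMeans; the toy data, thresholds, and the assumption of mlxtend's API are ours, not taken from the paper.

        import pandas as pd
        from mlxtend.frequent_patterns import apriori, association_rules
        from sklearn.cluster import KMeans

        # One-hot records: which traits/outcomes co-occur per student.
        records = pd.DataFrame({
            "low_attendance": [1, 1, 0, 1, 0, 1],
            "high_stress":    [1, 1, 0, 1, 1, 1],
            "low_grades":     [1, 1, 0, 1, 0, 0],
        }).astype(bool)

        itemsets = apriori(records, min_support=0.5, use_colnames=True)
        rules = association_rules(itemsets, metric="confidence", min_threshold=0.8)
        print(rules[["antecedents", "consequents", "support", "confidence"]])

        # K-means on numeric features (e.g. grade average, absences) groups students.
        X = [[85, 2], [90, 1], [55, 12], [60, 10], [88, 3], [58, 14]]
        labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
        print(labels)  # two groups: high-performing vs at-risk (toy data)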

  14. Information Extraction of High Resolution Remote Sensing Images Based on the Calculation of Optimal Segmentation Parameters

    PubMed Central

    Zhu, Hongchun; Cai, Lijie; Liu, Haiying; Huang, Wei

    2016-01-01

    Multi-scale image segmentation and the selection of optimal segmentation parameters are the key processes in the object-oriented information extraction of high-resolution remote sensing images. The accuracy of remote sensing thematic (special subject) information depends on this extraction. On the basis of WorldView-2 high-resolution data, an optimal-segmentation-parameter method for object-oriented image segmentation and high-resolution image information extraction was developed through the following processes. Firstly, the best combination of bands and weights was determined for the information extraction of the high-resolution remote sensing image. An improved weighted mean-variance method was proposed and used to calculate the optimal segmentation scale. Thereafter, the best shape factor and compactness factor parameters were computed with the use of control variables and a combination of the heterogeneity and homogeneity indexes. Different types of image segmentation parameters were obtained according to the surface features. The high-resolution remote sensing images were multi-scale segmented with the optimal segmentation parameters. A hierarchical network structure was established by setting the information extraction rules to achieve object-oriented information extraction. This study presents an effective and practical method that can explain expert input judgment by reproducible quantitative measurements. Furthermore, the results of this procedure may be incorporated into a classification scheme. PMID:27362762

  15. Information extraction during simultaneous motion processing.

    PubMed

    Rideaux, Reuben; Edwards, Mark

    2014-02-01

    When confronted with multiple moving objects the visual system can process them in two stages: an initial stage in which a limited number of signals are processed in parallel (i.e. simultaneously) followed by a sequential stage. We previously demonstrated that during the simultaneous stage, observers could discriminate between presentations containing up to 5 vs. 6 spatially localized motion signals (Edwards & Rideaux, 2013). Here we investigate what information is actually extracted during the simultaneous stage and whether the simultaneous limit varies with the detail of information extracted. This was achieved by measuring the ability of observers to extract varied information from low detail, i.e. the number of signals presented, to high detail, i.e. the actual directions present and the direction of a specific element, during the simultaneous stage. The results indicate that the resolution of simultaneous processing varies as a function of the information which is extracted, i.e. as the information extraction becomes more detailed, from the number of moving elements to the direction of a specific element, the capacity to process multiple signals is reduced. Thus, when assigning a capacity to simultaneous motion processing, this must be qualified by designating the degree of information extraction.

  16. Information Extraction of High Resolution Remote Sensing Images Based on the Calculation of Optimal Segmentation Parameters.

    PubMed

    Zhu, Hongchun; Cai, Lijie; Liu, Haiying; Huang, Wei

    2016-01-01

    Multi-scale image segmentation and the selection of optimal segmentation parameters are the key processes in the object-oriented information extraction of high-resolution remote sensing images. The accuracy of remote sensing thematic (special subject) information depends on this extraction. On the basis of WorldView-2 high-resolution data, an optimal-segmentation-parameter method for object-oriented image segmentation and high-resolution image information extraction was developed through the following processes. Firstly, the best combination of bands and weights was determined for the information extraction of the high-resolution remote sensing image. An improved weighted mean-variance method was proposed and used to calculate the optimal segmentation scale. Thereafter, the best shape factor and compactness factor parameters were computed with the use of control variables and a combination of the heterogeneity and homogeneity indexes. Different types of image segmentation parameters were obtained according to the surface features. The high-resolution remote sensing images were multi-scale segmented with the optimal segmentation parameters. A hierarchical network structure was established by setting the information extraction rules to achieve object-oriented information extraction. This study presents an effective and practical method that can explain expert input judgment by reproducible quantitative measurements. Furthermore, the results of this procedure may be incorporated into a classification scheme.

  17. Study on the extraction method of tidal flat area in northern Jiangsu Province based on remote sensing waterlines

    NASA Astrophysics Data System (ADS)

    Zhang, Yuanyuan; Gao, Zhiqiang; Liu, Xiangyang; Xu, Ning; Liu, Chaoshun; Gao, Wei

    2016-09-01

    Reclamation has caused significant dynamic change in the coastal zone; the tidal flat is an unstable reserve land resource, and its study is of considerable significance. In order to realize efficient extraction of tidal flat area information, this paper takes Rudong County in Jiangsu Province as the research area and uses HJ1A/1B images as the data source. On the basis of previous research experience and a literature review, object-oriented classification is chosen as a semi-automatic extraction method to generate waterlines. The waterlines are then analyzed with the DSAS software to obtain tide points, and the outer boundary points are automatically extracted using Python to determine the extent of the tidal flats of Rudong County in 2014; the extracted area was 55,182 hm2. A confusion matrix is used to verify the accuracy, and the result shows a kappa coefficient of 0.945. The method addresses deficiencies of previous studies, and the free availability of its tools on the Internet supports its generalization.
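
    For reference, the kappa coefficient reported here is computed from a confusion matrix as kappa = (p_o - p_e)/(1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance; the numpy sketch below shows the calculation on a toy 2x2 matrix (the counts are ours, not from the study).

        import numpy as np

        def cohens_kappa(cm):
            """Kappa from a confusion matrix: (p_o - p_e) / (1 - p_e)."""
            cm = np.asarray(cm, dtype=float)
            n = cm.sum()
            p_o = np.trace(cm) / n                                # observed agreement
            p_e = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / n**2  # chance agreement
            return (p_o - p_e) / (1.0 - p_e)

        # Toy accuracy assessment: rows = reference, columns = classification.
        cm = [[90, 5],    # tidal flat
              [5, 100]]   # non tidal flat
        print(round(cohens_kappa(cm), 3))  # ≈ 0.9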

  18. The Extraction of One-Dimensional Flow Properties from Multi-Dimensional Data Sets

    NASA Technical Reports Server (NTRS)

    Baurle, Robert A.; Gaffney, Richard L., Jr.

    2007-01-01

    The engineering design and analysis of air-breathing propulsion systems relies heavily on zero- or one-dimensional properties (e.g. thrust, total pressure recovery, mixing and combustion efficiency, etc.) for figures of merit. The extraction of these parameters from experimental data sets and/or multi-dimensional computational data sets is therefore an important aspect of the design process. A variety of methods exist for extracting performance measures from multi-dimensional data sets. Some of the information contained in the multi-dimensional flow is inevitably lost when any one-dimensionalization technique is applied. Hence, the unique assumptions associated with a given approach may result in one-dimensional properties that are significantly different than those extracted using alternative approaches. The purpose of this effort is to examine some of the more popular methods used for the extraction of performance measures from multi-dimensional data sets, reveal the strengths and weaknesses of each approach, and highlight various numerical issues that result when mapping data from a multi-dimensional space to a space of one dimension.
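
    Two of the more common one-dimensionalization choices are area-weighted and mass-flux-weighted averaging, which generally disagree for non-uniform profiles. The sketch below contrasts them for a synthetic stagnation-temperature profile; it is a hedged illustration of the general issue, not one of the specific methods the paper examines.

        import numpy as np

        # Synthetic radial profiles across a circular duct cross-section.
        r = np.linspace(0.0, 1.0, 201)   # radius
        rho = np.full_like(r, 1.2)       # density (uniform here)
        u = 1.0 - r**2                   # axial velocity (parabolic)
        T0 = 300.0 + 50.0 * (1.0 - r)    # stagnation temperature profile

        dA = 2.0 * np.pi * r             # annular area weighting (per dr)
        area_avg = np.trapz(T0 * dA, r) / np.trapz(dA, r)
        mass_avg = np.trapz(T0 * rho * u * dA, r) / np.trapz(rho * u * dA, r)

        # The two averages differ because the mass flux concentrates near the
        # axis, where this profile is hottest.
        print(f"area-weighted:      {area_avg:.2f} K")
        print(f"mass-flux-weighted: {mass_avg:.2f} K")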

  19. The Art of Extracting One-Dimensional Flow Properties from Multi-Dimensional Data Sets

    NASA Technical Reports Server (NTRS)

    Baurle, R. A.; Gaffney, R. L.

    2007-01-01

    The engineering design and analysis of air-breathing propulsion systems relies heavily on zero- or one-dimensional properties (e.g. thrust, total pressure recovery, mixing and combustion efficiency, etc.) for figures of merit. The extraction of these parameters from experimental data sets and/or multi-dimensional computational data sets is therefore an important aspect of the design process. A variety of methods exist for extracting performance measures from multi-dimensional data sets. Some of the information contained in the multi-dimensional flow is inevitably lost when any one-dimensionalization technique is applied. Hence, the unique assumptions associated with a given approach may result in one-dimensional properties that are significantly different than those extracted using alternative approaches. The purpose of this effort is to examine some of the more popular methods used for the extraction of performance measures from multi-dimensional data sets, reveal the strengths and weaknesses of each approach, and highlight various numerical issues that result when mapping data from a multi-dimensional space to a space of one dimension.

  20. Long-term response of yellow-poplar to thinning in the southern Appalachian Mountains

    Treesearch

    Tara L. Keyser; Peter M. Brown

    2014-01-01

    As the focus of forest management on many public lands shifts away from timber production and extraction to habitat, restoration, and diversity-related objectives, it is important to understand the long-term effects that previous management activities have on structure and composition to better inform current management decisions. In this paper, we analyzed 40 years of...

  1. Fiches pratiques: "Comme ils disent..."; Trop d'enfants; Touche pas a mon pote!; Import/export (Practical Ideas: "As They Say..."; Too Many Children; Don't Touch My Pal!).

    ERIC Educational Resources Information Center

    Bourdet, Jean-Francois; And Others

    1993-01-01

    Four classroom activities for French instruction are described, including an exercise in contextual grammar, lessons in interpretation of charts and graphs, an exercise in extracting cultural information from text, and practice in calculating in French and applying basic economic concepts. (MSE)

  2. Solvent Recycling for Shipyards

    DTIC Science & Technology

    1993-05-01

    (Survey results are included in Section 5.) Survey manufacturers and compile information on available equipment and features. (Data is summarized in Section...) ...should be placed on safety features. Important safety features include explosion-proof electricals and grounding protection, overpressure relief valves... Solvent can dissolve a polymer plastic liner, or extract water from a clay liner, resulting in liner leakage. The threat is compounded by the ability

  3. Hierarchical graph-based segmentation for extracting road networks from high-resolution satellite images

    NASA Astrophysics Data System (ADS)

    Alshehhi, Rasha; Marpu, Prashanth Reddy

    2017-04-01

    Extraction of road networks in urban areas from remotely sensed imagery plays an important role in many urban applications (e.g. road navigation, geometric correction of urban remote sensing images, updating geographic information systems, etc.). It is normally difficult to accurately differentiate roads from their background due to the complex geometry of the buildings and the acquisition geometry of the sensor. In this paper, we present a new method for extracting roads from high-resolution imagery based on hierarchical graph-based image segmentation. The proposed method consists of: 1. Extracting features (e.g., using Gabor and morphological filtering) to enhance the contrast between road and non-road pixels, 2. Graph-based segmentation consisting of (i) Constructing a graph representation of the image based on initial segmentation and (ii) Hierarchical merging and splitting of image segments based on color and shape features, and 3. Post-processing to remove irregularities in the extracted road segments. Experiments are conducted on three challenging datasets of high-resolution images to demonstrate the proposed method and compare it with other similar approaches. The results demonstrate the validity and superior performance of the proposed method for road extraction in urban areas.
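
    The pipeline's middle stage is a graph-based segmentation; a convenient open implementation of efficient graph-based segmentation is scikit-image's felzenszwalb, shown below on a Gabor-enhanced image as a rough analogue of steps 1-2. The filter frequency and segmentation parameters are illustrative assumptions, and the hierarchical merge/split stage of the paper is not reproduced here.

        import numpy as np
        from skimage.data import camera
        from skimage.filters import gabor
        from skimage.segmentation import felzenszwalb

        image = camera().astype(float) / 255.0

        # Step 1 (analogue): Gabor filtering enhances elongated, road-like structures.
        real, imag = gabor(image, frequency=0.1)
        enhanced = np.hypot(real, imag)

        # Step 2 (analogue): efficient graph-based segmentation into regions.
        segments = felzenszwalb(enhanced, scale=100, sigma=0.8, min_size=50)
        print("number of segments:", segments.max() + 1)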

  4. Research on Optimal Observation Scale for Damaged Buildings after Earthquake Based on Optimal Feature Space

    NASA Astrophysics Data System (ADS)

    Chen, J.; Chen, W.; Dou, A.; Li, W.; Sun, Y.

    2018-04-01

    A new information extraction method for damaged buildings, rooted in an optimal feature space, is put forward on the basis of the traditional object-oriented method. In this new method, the ESP (estimate of scale parameter) tool is used to optimize the segmentation of the image. Then the distance matrix and minimum separation distance of all kinds of surface features are calculated through sample selection to find the optimal feature space, which is finally applied to extract damaged buildings from post-earthquake imagery. The overall extraction accuracy reaches 83.1 %, with a kappa coefficient of 0.813. Compared with the traditional object-oriented method, the new method greatly improves extraction accuracy and efficiency, and holds good potential for wider use in the information extraction of damaged buildings. In addition, the new method can be applied to images of damaged buildings at different resolutions and used to seek the optimal observation scale of damaged buildings through accuracy evaluation. The optimal observation scale of damaged buildings is estimated to be between 1 m and 1.2 m, which provides a reference for future information extraction of damaged buildings.

  5. The edge detection method of the infrared imagery of the laser spot

    NASA Astrophysics Data System (ADS)

    Che, Jinxi; Zhang, Jinchun; Li, Zhongmin

    2016-01-01

    In jamming-effectiveness experiments in which a thermal infrared imager was jammed by a CO2 laser, the obtained infrared imagery of the laser spot must be analysed in order to evaluate the jamming effect. Because the laser spot images obtained from the thermal infrared imager are irregular, edge detection is an important processing step. The image edge is one of the most basic characteristics of an image and carries much of its information. Generally, because of thermal equilibrium, local temperature differences across an object are small, so the ability of infrared imagery to reflect local detail is clearly weak. At the same time, the heat-distribution information of the thermal imagery becomes far more valuable when combined with basic information about the target, such as its size, relative position in the field of view, shape and outline. Hence, extracting the object's edges from infrared imagery is an important step in image processing and the premise of much subsequent processing. To extract the outline of the target from the original thermal imagery, and to overcome disadvantages such as low image contrast and serious noise interference, the edges of the thermal imagery must be detected and processed. The principles of the Roberts, Sobel, Prewitt and Canny operators were analysed, and the operators were then applied to edge detection on thermal images of laser spots obtained from jamming experiments in which a CO2 laser jammed a thermal infrared imager. Their performance was compared on the basis of the detection results. Finally, the characteristics of the operators were summarized, providing a reference for the choice of edge detection operators in future thermal imagery processing.
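
    Three of the four operators named here are available directly in OpenCV; the sketch below applies Sobel gradients and the Canny detector to a synthetic blurred "spot" image so the two outputs can be compared. This is a minimal sketch: the real experiments used recorded laser-spot thermograms, and the threshold values are assumptions.

        import cv2
        import numpy as np

        # Synthetic stand-in for a laser-spot thermogram: a blurred bright disk.
        img = np.zeros((128, 128), dtype=np.uint8)
        cv2.circle(img, (64, 64), 30, 255, -1)
        img = cv2.GaussianBlur(img, (15, 15), 5)

        # Sobel: first-order gradients in x and y, combined into a magnitude map.
        gx = cv2.Sobel(img, cv2.CV_32F, 1, 0, ksize=3)
        gy = cv2.Sobel(img, cv2.CV_32F, 0, 1, ksize=3)
        sobel_mag = cv2.magnitude(gx, gy)

        # Canny: gradient + non-maximum suppression + hysteresis thresholding,
        # which usually gives thinner, cleaner spot contours on noisy imagery.
        canny_edges = cv2.Canny(img, 50, 150)

        print("max Sobel magnitude:", float(sobel_mag.max()))
        print("Canny edge pixels:", int(np.count_nonzero(canny_edges)))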

  6. Ensemble methods with simple features for document zone classification

    NASA Astrophysics Data System (ADS)

    Obafemi-Ajayi, Tayo; Agam, Gady; Xie, Bingqing

    2012-01-01

    Document layout analysis is of fundamental importance for document image understanding and information retrieval. It requires the identification of blocks extracted from a document image via feature extraction and block classification. In this paper, we focus on the classification of the extracted blocks into five classes: text (machine printed), handwriting, graphics, images, and noise. We propose a new set of features for efficient classification of these blocks. We present a comparative evaluation of three ensemble based classification algorithms (boosting, bagging, and combined model trees) in addition to other known learning algorithms. Experimental results are demonstrated for a set of 36503 zones extracted from 416 document images which were randomly selected from the tobacco legacy document collection. The results obtained verify the robustness and effectiveness of the proposed set of features in comparison to the commonly used Ocropus recognition features. When used in conjunction with the Ocropus feature set, we further improve the performance of the block classification system to obtain a classification accuracy of 99.21%.

  7. LexValueSets: An Approach for Context-Driven Value Sets Extraction

    PubMed Central

    Pathak, Jyotishman; Jiang, Guoqian; Dwarkanath, Sridhar O.; Buntrock, James D.; Chute, Christopher G.

    2008-01-01

    The ability to model, share and re-use value sets across multiple medical information systems is an important requirement. However, generating value sets semi-automatically from a terminology service is still an unresolved issue, in part due to the lack of linkage to the clinical context patterns that provide the constraints for defining a concept domain and invoking value set extraction. Towards this goal, we develop and evaluate an approach for context-driven automatic value set extraction based on a formal terminology model. The crux of the technique is to identify and define the context patterns from various domains of discourse and leverage them for value set extraction using two complementary ideas based on (i) local terms provided by the Subject Matter Experts (extensional) and (ii) the semantic definition of the concepts in coding schemes (intensional). A prototype was implemented based on SNOMED CT rendered in the LexGrid terminology model, and a preliminary evaluation is presented. PMID:18998955

  8. A randomized control trial comparing the visual and verbal communication methods for reducing fear and anxiety during tooth extraction.

    PubMed

    Gazal, Giath; Tola, Ahmed W; Fareed, Wamiq M; Alnazzawi, Ahmad A; Zafar, Muhammad S

    2016-04-01

    To evaluate the value of visual information for reducing dental fear and anxiety in patients undergoing tooth extraction under local anesthesia (LA). A total of 64 patients were randomly allocated to one of the study groups after reading the information sheet and signing the formal consent. Patients in the control group received only verbal information and routine warnings; patients in the study group were shown a tooth extraction video. The level of dental fear and anxiety was reported by the patients on standard 100 mm visual analogue scales (VAS), ranging from "no dental fear and anxiety" (0 mm) to "severe dental fear and anxiety" (100 mm). Dental fear and anxiety were assessed pre-operatively, after the visual/verbal information, and post-extraction. There was a significant difference between the mean dental fear and anxiety scores of the two groups post-extraction (p-value < 0.05). Patients in the tooth extraction video group were more comfortable after dental extraction than those in the verbal information and routine warning group. For the tooth extraction video group there were major decreases in dental fear and anxiety scores between the pre-operative scores and either the post-video-information or the postoperative scores (p-values < 0.05). Younger patients recorded higher dental fear and anxiety scores than older ones (P < 0.05). Dental fear and anxiety associated with dental extractions under local anesthesia can be reduced by showing patients a tooth extraction video preoperatively.

  9. Region of interest extraction based on multiscale visual saliency analysis for remote sensing images

    NASA Astrophysics Data System (ADS)

    Zhang, Yinggang; Zhang, Libao; Yu, Xianchuan

    2015-01-01

    Region of interest (ROI) extraction is an important component of remote sensing image processing. However, traditional ROI extraction methods are usually prior knowledge-based and depend on classification, segmentation, and a global searching solution, which are time-consuming and computationally complex. We propose a more efficient ROI extraction model for remote sensing images based on multiscale visual saliency analysis (MVS), implemented in the CIE L*a*b* color space, which is similar to visual perception of the human eye. We first extract the intensity, orientation, and color features of the image using different methods: the visual attention mechanism is used to extract the intensity feature using a difference of Gaussian template; the integer wavelet transform is used to extract the orientation feature; and color information content analysis is used to obtain the color feature. Then, a new feature-competition method is proposed that addresses the different contributions of each feature map to calculate the weight of each feature image for combining them into the final saliency map. Qualitative and quantitative experimental results of the MVS model as compared with those of other models show that it is more effective and provides more accurate ROI extraction results with fewer holes inside the ROI.
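
    As a hedged illustration of one ingredient named above, the sketch below computes a center-surround intensity map with a difference-of-Gaussians template; the scales and the stand-in lightness channel are assumptions, not the authors' settings.

```python
# Minimal difference-of-Gaussians (DoG) intensity feature map on an
# assumed L* (lightness) channel; scales chosen for illustration only.
import numpy as np
from scipy.ndimage import gaussian_filter

def dog_intensity_map(lightness, sigma_fine=2.0, sigma_coarse=8.0):
    """Center-surround response on the lightness channel."""
    fine = gaussian_filter(lightness, sigma_fine)
    coarse = gaussian_filter(lightness, sigma_coarse)
    response = np.abs(fine - coarse)
    return response / (response.max() + 1e-12)  # normalize to [0, 1]

image_l = np.random.rand(256, 256)  # stand-in for the L* channel
saliency = dog_intensity_map(image_l)
print(saliency.shape, saliency.max())
```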

  10. Rare tradition of the folk medicinal use of Aconitum spp. is kept alive in Solčavsko, Slovenia.

    PubMed

    Povšnar, Marija; Koželj, Gordana; Kreft, Samo; Lumpert, Mateja

    2017-08-08

    Aconitum species are poisonous plants that have been used in Western medicine for centuries. In the nineteenth century, these plants were part of official and folk medicine in the Slovenian territory. According to current ethnobotanical studies, folk use of Aconitum species is rarely reported in Europe. The purpose of this study was to research the folk medicinal use of Aconitum species in Solčavsko, Slovenia; to collect recipes for the preparation of Aconitum spp., indications for use, and dosing; and to investigate whether the folk use of aconite was connected to poisoning incidents. In Solčavsko, a remote alpine area in northern Slovenia, we performed semi-structured interviews with 19 informants in Solčavsko, 3 informants in Luče, and two retired physicians who worked in that area. Three samples of homemade ethanolic extracts were obtained from informants, and the concentration of aconitine was measured. In addition, four extracts were prepared according to reported recipes. All 22 informants knew of Aconitum spp. and their therapeutic use, and 5 of them provided a detailed description of the preparation and use of "voukuc", an ethanolic extract made from aconite roots. Seven informants were unable to describe the preparation in detail, since they knew of the extract only from the narration of others or they remembered it from childhood. Most likely, the roots of Aconitum tauricum and Aconitum napellus were used for the preparation of the extract, and the solvent was homemade spirits. Four informants kept the extract at home; two extracts were prepared recently (1998 and 2015). Three extracts were analyzed, and 2 contained aconitine. Informants reported many indications for the use of the extract; it was used internally and, in some cases, externally as well. The extract was also used in animals. The extract was measured in drops, but the number of drops differed among the informants. The informants reported nine poisonings with Aconitum spp., but none of them occurred as a result of medicinal use of the extract. In this study, we determined that folk knowledge of the medicinal use of Aconitum spp. is still present in Solčavsko, but Aconitum preparations are used only infrequently.

  11. Advances in Spectral-Spatial Classification of Hyperspectral Images

    NASA Technical Reports Server (NTRS)

    Fauvel, Mathieu; Tarabalka, Yuliya; Benediktsson, Jon Atli; Chanussot, Jocelyn; Tilton, James C.

    2012-01-01

    Recent advances in spectral-spatial classification of hyperspectral images are presented in this paper. Several techniques are investigated for combining both spatial and spectral information. Spatial information is extracted at the object (set of pixels) level rather than at the conventional pixel level. Mathematical morphology is first used to derive the morphological profile of the image, which includes characteristics about the size, orientation and contrast of the spatial structures present in the image. Then the morphological neighborhood is defined and used to derive additional features for classification. Classification is performed with support vector machines using the available spectral information and the extracted spatial information. Spatial post-processing is next investigated to build more homogeneous and spatially consistent thematic maps. To that end, three presegmentation techniques are applied to define regions that are used to regularize the preliminary pixel-wise thematic map. Finally, a multiple classifier system is defined to produce relevant markers that are exploited to segment the hyperspectral image with the minimum spanning forest algorithm. Experimental results conducted on three real hyperspectral images with different spatial and spectral resolutions and corresponding to various contexts are presented. They highlight the importance of spectral-spatial strategies for the accurate classification of hyperspectral images and validate the proposed methods.

  12. PREDOSE: a semantic web platform for drug abuse epidemiology using social media.

    PubMed

    Cameron, Delroy; Smith, Gary A; Daniulaityte, Raminta; Sheth, Amit P; Dave, Drashti; Chen, Lu; Anand, Gaurish; Carlson, Robert; Watkins, Kera Z; Falck, Russel

    2013-12-01

    The role of social media in biomedical knowledge mining, including clinical, medical and healthcare informatics, prescription drug abuse epidemiology and drug pharmacology, has become increasingly significant in recent years. Social media offers opportunities for people to share opinions and experiences freely in online communities, which may contribute information beyond the knowledge of domain professionals. This paper describes the development of a novel semantic web platform called PREDOSE (PREscription Drug abuse Online Surveillance and Epidemiology), which is designed to facilitate the epidemiologic study of prescription (and related) drug abuse practices using social media. PREDOSE uses web forum posts and domain knowledge, modeled in a manually created Drug Abuse Ontology (DAO--pronounced dow), to facilitate the extraction of semantic information from User Generated Content (UGC), through combination of lexical, pattern-based and semantics-based techniques. In a previous study, PREDOSE was used to obtain the datasets from which new knowledge in drug abuse research was derived. Here, we report on various platform enhancements, including an updated DAO, new components for relationship and triple extraction, and tools for content analysis, trend detection and emerging patterns exploration, which enhance the capabilities of the PREDOSE platform. Given these enhancements, PREDOSE is now more equipped to impact drug abuse research by alleviating traditional labor-intensive content analysis tasks. Using custom web crawlers that scrape UGC from publicly available web forums, PREDOSE first automates the collection of web-based social media content for subsequent semantic annotation. The annotation scheme is modeled in the DAO, and includes domain specific knowledge such as prescription (and related) drugs, methods of preparation, side effects, and routes of administration. The DAO is also used to help recognize three types of data, namely: (1) entities, (2) relationships and (3) triples. PREDOSE then uses a combination of lexical and semantic-based techniques to extract entities and relationships from the scraped content, and a top-down approach for triple extraction that uses patterns expressed in the DAO. In addition, PREDOSE uses publicly available lexicons to identify initial sentiment expressions in text, and then a probabilistic optimization algorithm (from related research) to extract the final sentiment expressions. Together, these techniques enable the capture of fine-grained semantic information, which facilitate search, trend analysis and overall content analysis using social media on prescription drug abuse. Moreover, extracted data are also made available to domain experts for the creation of training and test sets for use in evaluation and refinements in information extraction techniques. A recent evaluation of the information extraction techniques applied in the PREDOSE platform indicates 85% precision and 72% recall in entity identification, on a manually created gold standard dataset. In another study, PREDOSE achieved 36% precision in relationship identification and 33% precision in triple extraction, through manual evaluation by domain experts. Given the complexity of the relationship and triple extraction tasks and the abstruse nature of social media texts, we interpret these as favorable initial results. 
Extracted semantic information is currently in use in an online discovery support system, by prescription drug abuse researchers at the Center for Interventions, Treatment and Addictions Research (CITAR) at Wright State University. A comprehensive platform for entity, relationship, triple and sentiment extraction from such abstruse texts has never been developed for drug abuse research. PREDOSE has already demonstrated the importance of mining social media by providing data from which new findings in drug abuse research were uncovered. Given the recent platform enhancements, including the refined DAO, components for relationship and triple extraction, and tools for content, trend and emerging pattern analysis, it is expected that PREDOSE will play a significant role in advancing drug abuse epidemiology in future. Copyright © 2013 Elsevier Inc. All rights reserved.
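
    A toy illustration of the lexicon-plus-pattern idea described above; the drug lexicon, route terms, and the single surface pattern are invented stand-ins for the Drug Abuse Ontology, not PREDOSE code.

```python
# Toy lexicon + pattern triple extraction over forum-style text.
import re

DRUGS = {"loperamide", "buprenorphine"}          # invented mini-lexicon
ROUTES = {"oral", "insufflation", "intravenous"}

PATTERN = re.compile(r"\b(\w+)\s+(?:taken|used)\s+by\s+(\w+)\b")

def extract_triples(post):
    """Return (drug, 'administered_via', route) triples found in a post."""
    triples = []
    for drug, route in PATTERN.findall(post.lower()):
        if drug in DRUGS and route in ROUTES:
            triples.append((drug, "administered_via", route))
    return triples

print(extract_triples("Heard of loperamide taken by oral dosing daily."))
```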

  13. Information Extraction Using Controlled English to Support Knowledge-Sharing and Decision-Making

    DTIC Science & Technology

    2012-06-01

    …terminology or language variants. CE-based information extraction will greatly facilitate the processes in the cognitive and social domains that enable forces… A processor is run to turn the atomic CE into a more "stylistically felicitous" CE, using techniques such as aggregating all information about an entity…

  14. Real-Time Information Extraction from Big Data

    DTIC Science & Technology

    2015-10-01

    Institute for Defense Analyses. Real-Time Information Extraction from Big Data. Jagdeep Shah, Robert M. Rolfe, Francisco L. Loaiza-Lemos. October 7, 2015. Abstract: We are drowning under the 3 Vs (volume, velocity and variety) of big data. Real-time information extraction from big…

  15. The H0 function, a new index for detecting structural/topological complexity information in undirected graphs

    NASA Astrophysics Data System (ADS)

    Buscema, Massimo; Asadi-Zeydabadi, Masoud; Lodwick, Weldon; Breda, Marco

    2016-04-01

    Significant applications, such as differentiating Alzheimer's disease from dementia, mining social media data, or extracting information about the structural composition of drug cartels, are often modeled as graphs. The structural or topological complexity of a graph, or the lack of it, is often useful in understanding and, more importantly, resolving the problem. We propose a new index, which we call the H0 function, to measure the structural/topological complexity of a graph. To do this, we introduce the concept of graph pruning and its associated algorithm, which is used in the development of our measure. We illustrate the behavior of the H0 function through different examples found in the appendix. These examples indicate that the H0 function captures useful and important characteristics of a graph. Here, we restrict ourselves to undirected graphs.
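
    The abstract does not give the H0 formula, so the sketch below only illustrates the graph-pruning operation it builds on, reporting how many pruning rounds a graph survives as a rough complexity signal; this proxy is our assumption, not the authors' index.

```python
# Iterative pruning of degree<=1 nodes; a tree is pruned away in a few
# rounds, while a cycle has nothing to prune.
import networkx as nx

def pruning_rounds(g):
    g = g.copy()
    rounds = 0
    while True:
        leaves = [n for n in g if g.degree(n) <= 1]
        if not leaves:
            return rounds
        g.remove_nodes_from(leaves)
        rounds += 1

tree = nx.balanced_tree(2, 5)   # pruned away level by level
ring = nx.cycle_graph(20)       # no degree-1 nodes to prune
print(pruning_rounds(tree), pruning_rounds(ring))
```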

  16. Extraction of Data from a Hospital Information System to Perform Process Mining.

    PubMed

    Neira, Ricardo Alfredo Quintano; de Vries, Gert-Jan; Caffarel, Jennifer; Stretton, Erin

    2017-01-01

    The aim of this work is to share our experience in relevant data extraction from a hospital information system in preparation for a research study using process mining techniques. The steps performed were: research definition, mapping the normative processes, identification of tables and fields names of the database, and extraction of data. We then offer lessons learned during data extraction phase. Any errors made in the extraction phase will propagate and have implications on subsequent analyses. Thus, it is essential to take the time needed and devote sufficient attention to detail to perform all activities with the goal of ensuring high quality of the extracted data. We hope this work will be informative for other researchers to plan and execute extraction of data for process mining research studies.

  17. A generalizable NLP framework for fast development of pattern-based biomedical relation extraction systems.

    PubMed

    Peng, Yifan; Torii, Manabu; Wu, Cathy H; Vijay-Shanker, K

    2014-08-23

    Text mining is increasingly used in the biomedical domain because of its ability to automatically gather information from large amounts of scientific articles. One important task in biomedical text mining is relation extraction, which aims to identify designated relations among biological entities reported in the literature. A relation extraction system achieving high performance is expensive to develop because of the substantial time and effort required for its design and implementation. Here, we report a novel framework to facilitate the development of a pattern-based biomedical relation extraction system. It has several unique design features: (1) leveraging syntactic variations possible in a language and automatically generating extraction patterns in a systematic manner, (2) applying sentence simplification to improve the coverage of extraction patterns, and (3) identifying referential relations between a syntactic argument of a predicate and the actual target expected in the relation extraction task. A relation extraction system derived using the proposed framework achieved overall F-scores of 72.66% for the Simple events and 55.57% for the Binding events on the BioNLP-ST 2011 GE test set, comparing favorably with the top performing systems that participated in the BioNLP-ST 2011 GE task. We obtained similar results on the BioNLP-ST 2013 GE test set (80.07% and 60.58%, respectively). We conducted additional experiments on the training and development sets to provide a more detailed analysis of the system and its individual modules. This analysis indicates that without increasing the number of patterns, simplification and referential relation linking play a key role in the effective extraction of biomedical relations. In this paper, we present a novel framework for fast development of relation extraction systems. The framework requires only a list of triggers as input, and does not need information from an annotated corpus. Thus, we reduce the involvement of domain experts, who would otherwise have to provide manual annotations and help with the design of hand-crafted patterns. We demonstrate how our framework is used to develop a system that achieves state-of-the-art performance on a public benchmark corpus.
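
    A hedged sketch of trigger-driven pattern extraction in the spirit of the framework above: the only input is a trigger list, and patterns are instantiated from it. The real system operates on parse trees; this toy version uses surface patterns only, and the trigger words and entities are invented.

```python
# Toy trigger-to-pattern relation extraction over one sentence.
import re

TRIGGERS = {"binds": "Binding", "phosphorylates": "Phosphorylation"}

def extract_relations(sentence):
    relations = []
    for trigger, rel_type in TRIGGERS.items():
        # Template "<ENTITY1> <trigger> <ENTITY2>" over capitalized tokens.
        pat = re.compile(rf"\b([A-Z]\w+)\s+{trigger}\s+([A-Z]\w+)\b")
        for e1, e2 in pat.findall(sentence):
            relations.append((rel_type, e1, e2))
    return relations

print(extract_relations("Grb2 binds Sos1 in stimulated cells."))
```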

  18. Development of an economic model to assess the cost-effectiveness of hawthorn extract as an adjunct treatment for heart failure in Australia

    PubMed Central

    Ford, Emily; Adams, Jon; Graves, Nicholas

    2012-01-01

    Objective: An economic model was developed to evaluate the cost-effectiveness of hawthorn extract as an adjunctive treatment for heart failure in Australia. Methods: A Markov model of chronic heart failure was developed to compare the costs and outcomes of standard treatment and standard treatment with hawthorn extract. Health states were defined by the New York Heart Association (NYHA) classification system and death. For any given cycle, patients could remain in the same NYHA class, experience an improvement or deterioration in NYHA class, be hospitalised or die. Model inputs were derived from the published medical literature, and the output was quality-adjusted life years (QALYs). Probabilistic sensitivity analysis was conducted. The expected value of perfect information (EVPI) and the expected value of partial perfect information (EVPPI) were computed to establish the value of further research and the ideal target for such research. Results: Hawthorn extract increased costs by $1866.78 and resulted in a gain of 0.02 QALYs. The incremental cost-effectiveness ratio was $85 160.33 per QALY. The cost-effectiveness acceptability curve indicated that at a threshold of $40 000 the new treatment had a 0.29 probability of being cost-effective. The average incremental net monetary benefit (NMB) was −$1791.64, the average NMB for the standard treatment was $92 067.49, and for hawthorn extract $90 275.84. Additional research is potentially cost-effective if the research is not proposed to cost more than $325 million. Utilities form the most important target parameter group for further research. Conclusions: Hawthorn extract is not currently considered to be cost-effective as an adjunctive treatment for heart failure in Australia. Further research in the area of utilities is warranted. PMID:22942231

  19. Development of an economic model to assess the cost-effectiveness of hawthorn extract as an adjunct treatment for heart failure in Australia.

    PubMed

    Ford, Emily; Adams, Jon; Graves, Nicholas

    2012-01-01

    An economic model was developed to evaluate the cost-effectiveness of hawthorn extract as an adjunctive treatment for heart failure in Australia. A Markov model of chronic heart failure was developed to compare the costs and outcomes of standard treatment and standard treatment with hawthorn extract. Health states were defined by the New York Heart Association (NYHA) classification system and death. For any given cycle, patients could remain in the same NYHA class, experience an improvement or deterioration in NYHA class, be hospitalised or die. Model inputs were derived from the published medical literature, and the output was quality-adjusted life years (QALYs). Probabilistic sensitivity analysis was conducted. The expected value of perfect information (EVPI) and the expected value of partial perfect information (EVPPI) were computed to establish the value of further research and the ideal target for such research. Hawthorn extract increased costs by $1866.78 and resulted in a gain of 0.02 QALYs. The incremental cost-effectiveness ratio was $85 160.33 per QALY. The cost-effectiveness acceptability curve indicated that at a threshold of $40 000 the new treatment had a 0.29 probability of being cost-effective. The average incremental net monetary benefit (NMB) was -$1791.64, the average NMB for the standard treatment was $92 067.49, and for hawthorn extract $90 275.84. Additional research is potentially cost-effective if the research is not proposed to cost more than $325 million. Utilities form the most important target parameter group for further research. Hawthorn extract is not currently considered to be cost-effective as an adjunctive treatment for heart failure in Australia. Further research in the area of utilities is warranted.
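
    The mechanics of such a Markov cohort comparison can be sketched as follows; all transition probabilities, costs, and utilities below are invented placeholders, not the published model inputs, and only the cycle updates, QALY totals, and ICER computation follow the abstract.

```python
# Minimal Markov cohort sketch: cycle the cohort through NYHA states,
# accumulate costs and QALYs, and compute an ICER between two arms.
import numpy as np

STATES = ["NYHA_I", "NYHA_II", "NYHA_III", "NYHA_IV", "dead"]
UTILITY = np.array([0.85, 0.75, 0.60, 0.40, 0.0])   # per-cycle QALY weights

def run_model(transition, cost_per_cycle, cycles=20):
    cohort = np.array([0.25, 0.25, 0.25, 0.25, 0.0])  # starting distribution
    qalys = cost = 0.0
    for _ in range(cycles):
        cohort = cohort @ transition
        qalys += float(cohort @ UTILITY)
        cost += float(cohort[:4].sum()) * cost_per_cycle  # alive states only
    return cost, qalys

base = np.array([[0.70, 0.20, 0.05, 0.02, 0.03],
                 [0.10, 0.65, 0.15, 0.05, 0.05],
                 [0.02, 0.10, 0.60, 0.18, 0.10],
                 [0.00, 0.02, 0.10, 0.68, 0.20],
                 [0.00, 0.00, 0.00, 0.00, 1.00]])
# Hypothetical adjunct arm: slightly better transitions, higher cycle cost.
adjunct = base.copy(); adjunct[1, 0] += 0.05; adjunct[1, 2] -= 0.05

c0, q0 = run_model(base, cost_per_cycle=1000.0)
c1, q1 = run_model(adjunct, cost_per_cycle=1150.0)
print(f"ICER = {(c1 - c0) / (q1 - q0):,.0f} per QALY")
```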

  20. Place in Perspective: Extracting Online Information about Points of Interest

    NASA Astrophysics Data System (ADS)

    Alves, Ana O.; Pereira, Francisco C.; Rodrigues, Filipe; Oliveirinha, João

    During the last few years, the amount of online descriptive information about places has reached reasonable dimensions for many cities in the world. Since such information is mostly natural language text, Information Extraction techniques are needed to obtain the meaning of places that underlies these massive amounts of commonsense and user-made sources. In this article, we show how we automatically label places using Information Extraction techniques applied to online resources such as Wikipedia, Yellow Pages and Yahoo!.

  1. An automated procedure for detection of IDP's dwellings using VHR satellite imagery

    NASA Astrophysics Data System (ADS)

    Jenerowicz, Malgorzata; Kemper, Thomas; Soille, Pierre

    2011-11-01

    This paper presents results for the estimation of dwelling structures in the Al Salam IDP camp, Southern Darfur, based on very high resolution multispectral satellite images and obtained by applying mathematical morphology analysis. A series of image processing procedures, feature extraction methods and textural analyses have been applied in order to provide reliable information about dwelling structures. One of the issues in this context relates to the similarity of the spectral response of thatched dwelling roofs and their surroundings in IDP camps, where the exploitation of multispectral information is crucial. This study shows the advantage of the automatic extraction approach and highlights the importance of detailed spatial and spectral information analysis based on a multi-temporal dataset. The additional fusion of the high-resolution panchromatic band with the lower-resolution multispectral bands of the WorldView-2 satellite has a positive influence on the results and can thereby be useful for humanitarian aid agencies, supporting decisions and population estimates, especially in situations where frequent revisits by space imaging systems are the only possibility for continued monitoring.

  2. DeTEXT: A Database for Evaluating Text Extraction from Biomedical Literature Figures

    PubMed Central

    Yin, Xu-Cheng; Yang, Chun; Pei, Wei-Yi; Man, Haixia; Zhang, Jun; Learned-Miller, Erik; Yu, Hong

    2015-01-01

    Hundreds of millions of figures are available in biomedical literature, representing important biomedical experimental evidence. Since text is a rich source of information in figures, automatically extracting such text may assist in the task of mining figure information. A high-quality ground truth standard can greatly facilitate the development of an automated system. This article describes DeTEXT: A database for evaluating text extraction from biomedical literature figures. It is the first publicly available, human-annotated, high quality, and large-scale figure-text dataset with 288 full-text articles, 500 biomedical figures, and 9308 text regions. This article describes how figures were selected from open-access full-text biomedical articles and how annotation guidelines and annotation tools were developed. We also discuss the inter-annotator agreement and the reliability of the annotations. We summarize the statistics of the DeTEXT data and make available evaluation protocols for DeTEXT. Finally we lay out challenges we observed in the automated detection and recognition of figure text and discuss research directions in this area. DeTEXT is publicly available for downloading at http://prir.ustb.edu.cn/DeTEXT/. PMID:25951377

  3. Ionization Electron Signal Processing in Single Phase LArTPCs II. Data/Simulation Comparison and Performance in MicroBooNE

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Adams, C.; et al.

    The single-phase liquid argon time projection chamber (LArTPC) provides a large amount of detailed information in the form of fine-grained drifted ionization charge from particle traces. To fully utilize this information, the deposited charge must be accurately extracted from the raw digitized waveforms via a robust signal processing chain. Enabled by the ultra-low noise levels associated with cryogenic electronics in the MicroBooNE detector, the precise extraction of ionization charge from the induction wire planes in a single-phase LArTPC is qualitatively demonstrated on MicroBooNE data with event display images, and quantitatively demonstrated via waveform-level and track-level metrics. Improved performance of induction plane calorimetry is demonstrated through the agreement of extracted ionization charge measurements across different wire planes for various event topologies. In addition to the comprehensive waveform-level comparison of data and simulation, a calibration of the cryogenic electronics response is presented and solutions to various MicroBooNE-specific TPC issues are discussed. This work presents an important improvement in LArTPC signal processing, the foundation of reconstruction and therefore physics analyses in MicroBooNE.

  4. Extracting Temporal and Spatial Distribution Information about Algal Blooms Based on Multitemporal MODIS

    NASA Astrophysics Data System (ADS)

    Chunguang, L.; Qingjiu, T.

    2012-07-01

    Based on MODIS remote sensing data, a method for extracting the temporal and spatial distribution of algal blooms is studied and established. With this method, the spatio-temporal dynamics of blooms in Taihu Lake from 2009 to 2011 are obtained, and the variation of cyanobacterial blooms in the lake is analyzed and discussed. The algae bloom frequency index (AFI) and algae bloom sustainability index (ASI) are important criteria that capture the interannual and inter-monthly variation over the whole area or subregions of Taihu Lake. Using the AFI and ASI from 2009 to 2011, several phenomena were found: bloom frequency decreased from the north and west toward the east and south of Taihu Lake, and the monthly variation of the AFI shows twin peaks, strong fluctuations at a high level, and a general lagging trend. In the subregion statistics, the IBD and ASI in 2011 show an abnormal condition at the border between Gongshan Bay and the Central Lake: the date is clearly earlier than in previous years for the same subregion and earlier than for other subregions in the same year.
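
    The abstract does not give the AFI formula, so the sketch below uses one plausible reading, offered as an assumption: the per-pixel fraction of observations flagged as bloom across a stack of per-date bloom masks derived from MODIS scenes.

```python
# Per-pixel bloom frequency from a stack of binary bloom masks.
import numpy as np

# 36 monthly observations over a 100x100 grid; 1 = bloom detected.
rng = np.random.default_rng(0)
bloom_masks = (rng.random((36, 100, 100)) > 0.8).astype(np.uint8)

afi = bloom_masks.mean(axis=0)   # bloom frequency per pixel, in [0, 1]
print("max AFI:", afi.max(), "mean AFI:", round(float(afi.mean()), 3))
```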

  5. Development of Mobile Mapping System for 3D Road Asset Inventory.

    PubMed

    Sairam, Nivedita; Nagarajan, Sudhagar; Ornitz, Scott

    2016-03-12

    Asset management is an important component of an infrastructure project. A significant cost is involved in maintaining and updating asset information, and data collection is the most time-consuming task in the development of an asset management system. In order to reduce the time and cost involved in data collection, this paper proposes a low-cost Mobile Mapping System equipped with a laser scanner and cameras. First, the feasibility of low-cost sensors for 3D asset inventory is discussed by deriving appropriate sensor models. Then, through calibration procedures, the respective alignments of the laser scanner, cameras, Inertial Measurement Unit and GPS (Global Positioning System) antenna are determined. The efficiency of this Mobile Mapping System is evaluated by mounting it on a truck and a golf cart. Using the derived sensor models, geo-referenced images and 3D point clouds are derived. After validating the quality of the derived data, the paper provides a framework to extract road assets both automatically and manually using techniques implementing RANSAC plane fitting and edge extraction algorithms. Finally, the scope of such extraction techniques along with a sample GIS (Geographic Information System) database structure for a unified 3D asset inventory is discussed.

  6. Development of Mobile Mapping System for 3D Road Asset Inventory

    PubMed Central

    Sairam, Nivedita; Nagarajan, Sudhagar; Ornitz, Scott

    2016-01-01

    Asset management is an important component of an infrastructure project. A significant cost is involved in maintaining and updating asset information, and data collection is the most time-consuming task in the development of an asset management system. In order to reduce the time and cost involved in data collection, this paper proposes a low-cost Mobile Mapping System equipped with a laser scanner and cameras. First, the feasibility of low-cost sensors for 3D asset inventory is discussed by deriving appropriate sensor models. Then, through calibration procedures, the respective alignments of the laser scanner, cameras, Inertial Measurement Unit and GPS (Global Positioning System) antenna are determined. The efficiency of this Mobile Mapping System is evaluated by mounting it on a truck and a golf cart. Using the derived sensor models, geo-referenced images and 3D point clouds are derived. After validating the quality of the derived data, the paper provides a framework to extract road assets both automatically and manually using techniques implementing RANSAC plane fitting and edge extraction algorithms. Finally, the scope of such extraction techniques along with a sample GIS (Geographic Information System) database structure for a unified 3D asset inventory is discussed. PMID:26985897
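
    A toy RANSAC plane fit of the kind the framework above applies to road surfaces; the point cloud, iteration count, and inlier tolerance are synthetic stand-ins.

```python
# Fit a dominant plane with RANSAC on a synthetic road-like point cloud.
import numpy as np

rng = np.random.default_rng(1)
ground = np.column_stack([rng.uniform(0, 50, 900), rng.uniform(0, 50, 900),
                          rng.normal(0.0, 0.02, 900)])   # near z = 0
clutter = rng.uniform(0, 50, (100, 3))                   # off-plane points
cloud = np.vstack([ground, clutter])

def ransac_plane(pts, iters=200, tol=0.05):
    best_inliers = np.zeros(len(pts), dtype=bool)
    for _ in range(iters):
        p1, p2, p3 = pts[rng.choice(len(pts), 3, replace=False)]
        normal = np.cross(p2 - p1, p3 - p1)
        norm = np.linalg.norm(normal)
        if norm < 1e-9:
            continue  # degenerate (collinear) sample
        normal /= norm
        dist = np.abs((pts - p1) @ normal)   # point-to-plane distances
        inliers = dist < tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return best_inliers

inliers = ransac_plane(cloud)
print("plane inliers:", int(inliers.sum()), "of", len(cloud))
```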

  7. Semi-Automatic Terminology Generation for Information Extraction from German Chest X-Ray Reports.

    PubMed

    Krebs, Jonathan; Corovic, Hamo; Dietrich, Georg; Ertl, Max; Fette, Georg; Kaspar, Mathias; Krug, Markus; Stoerk, Stefan; Puppe, Frank

    2017-01-01

    Extraction of structured data from textual reports is an important subtask for building medical data warehouses for research and care. Many medical and most radiology reports are written in a telegraphic style, with a concatenation of noun phrases describing the presence or absence of findings. Therefore a lexico-syntactical approach is promising, where key terms and their relations are recognized and mapped onto a predefined standard terminology (ontology). We propose a two-phase algorithm for terminology matching: in the first pass, a local terminology for recognition is derived as close as possible to the terms used in the radiology reports; in the second pass, the local terminology is mapped to a standard terminology. In this paper, we report on an algorithm for the first step of semi-automatic generation of the local terminology and evaluate the algorithm with radiology reports of chest X-ray examinations from Würzburg university hospital. With an effort of about 20 hours of work by a radiologist acting as domain expert and 10 hours of meetings, a local terminology with about 250 attributes and various value patterns was built. In an evaluation with 100 randomly chosen reports it achieved an F1-score of about 95% for information extraction.

  8. Residual and Destroyed Accessible Information after Measurements

    NASA Astrophysics Data System (ADS)

    Han, Rui; Leuchs, Gerd; Grassl, Markus

    2018-04-01

    When quantum states are used to send classical information, the receiver performs a measurement on the signal states. The amount of information extracted is often not optimal due to the receiver's measurement scheme and experimental apparatus. For quantum nondemolition measurements, there is potentially some residual information in the postmeasurement state, while part of the information has been extracted and the rest is destroyed. Here, we propose a framework to characterize a quantum measurement by how much information it extracts and destroys, and how much information it leaves in the residual postmeasurement state. The concept is illustrated for several receivers discriminating coherent states.

  9. Question analysis for Indonesian comparative question

    NASA Astrophysics Data System (ADS)

    Saelan, A.; Purwarianti, A.; Widyantoro, D. H.

    2017-01-01

    Information seeking is one of today's human needs. Comparing things using a search engine surely takes more time than searching for a single thing. In this paper, we analyze comparative questions for a comparative question answering system. A comparative question is a question that compares two or more entities. We grouped comparative questions into 5 types: selection between mentioned entities, selection between unmentioned entities, selection between any entities, comparison, and yes-or-no questions. We then extracted 4 types of information from comparative questions: entity, aspect, comparison, and constraint. We built classifiers for the classification task and the information extraction task. The features used for the classification task are bag of words; for information extraction, we used the lexical forms of the token and of the 2 previous and following words, and the previous label, as features. We tried 2 scenarios: classification first and extraction first. For classification first, we used the classification result as a feature for extraction; conversely, for extraction first, we used the extraction results as features for classification. We found that the result is better if we do extraction first before classification. For the extraction task, classification using SMO gave the best result (88.78%), while for classification it is better to use naïve Bayes (82.35%).
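
    A hedged sketch of the "extraction first" scenario the abstract found to work best: token-level extraction output is folded into features for the question-type classifier. The tiny training set, labels, and marker heuristics below are invented for illustration.

```python
# Extraction-first features feeding a question-type classifier.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

questions = ["which phone is cheaper, A or B",
             "is product A better than product B",
             "compare battery life of A and B",
             "what laptop under 500 is best"]
labels = ["selection_mentioned", "yes_no", "comparison",
          "selection_unmentioned"]

def add_extraction_features(q):
    # Stand-in for the extraction step: append crude marker tokens for
    # the information types (entities/comparison) found in the question.
    markers = []
    if " or " in q or " and " in q:
        markers.append("HAS_TWO_ENTITIES")
    if any(w in q for w in ("cheaper", "better", "best")):
        markers.append("HAS_COMPARISON")
    return q + " " + " ".join(markers)

clf = make_pipeline(CountVectorizer(), MultinomialNB())
clf.fit([add_extraction_features(q) for q in questions], labels)
print(clf.predict([add_extraction_features("which tablet is better, X or Y")]))
```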

  10. Anticandidal, antibacterial, cytotoxic and antioxidant activities of Calendula arvensis flowers.

    PubMed

    Abudunia, A-M; Marmouzi, I; Faouzi, M E A; Ramli, Y; Taoufik, J; El Madani, N; Essassi, E M; Salama, A; Khedid, K; Ansar, M; Ibrahimi, A

    2017-03-01

    Calendula arvensis (CA) is one of the important plants used in traditional medicine in Morocco, owing to its interesting chemical composition. The present study aimed to determine the anticandidal, antioxidant and antibacterial activities of extracts of CA flowers, as well as their effects on the growth of myeloid cancer cells, and to characterize the chemical composition of the plant. Flowers of CA were collected based on ethnopharmacological information from villages around the Rabat-Khemisset region, Morocco. The hexane and methanol extracts were obtained by Soxhlet extraction, while the aqueous extract was obtained by maceration in cold water. CA extracts were assessed for antioxidant activity using four different methods (DPPH, FRAP, TEAC, and the β-carotene bleaching test). Furthermore, the phenolic and flavonoid contents were measured, and the antimicrobial activity was evaluated by the well diffusion method using several bacterial and fungal strains. Finally, extract cytotoxicity was assessed using the MTT test. Phytochemical quantification of the methanolic and aqueous extracts revealed that they were rich in flavonoid and phenolic content and possessed considerable antioxidant activities. MIC values of the methanolic extracts were 12.5-25 μg/mL, while MIC values of the hexane extracts were between 6.25 and 12.5 μg/mL; the hexane extracts were bacteriostatic for all bacteria, while the methanolic and aqueous extracts were bactericidal. In addition, the extracts exhibited no activity on Candida species, except the methanolic extract, which showed antifungal activity on Candida tropicalis 1 and Candida famata 1. The methanolic and aqueous extracts also exhibited antimyeloid cancer activity (IC50 of 31 μg/mL). We conclude that the methanolic and aqueous extracts are a promising source of antioxidant, antimicrobial and cytotoxic agents. Copyright © 2016 Elsevier Masson SAS. All rights reserved.

  11. Highway extraction from high resolution aerial photography using a geometric active contour model

    NASA Astrophysics Data System (ADS)

    Niu, Xutong

    Highway extraction and vehicle detection are two of the most important steps in traffic-flow analysis from multi-frame aerial photographs. The traditional method of deriving traffic flow trajectories relies on manual vehicle counting from a sequence of aerial photographs, which is tedious and time-consuming. This research presents a new framework for semi-automatic highway extraction. The basis of the new framework is an improved geometric active contour (GAC) model. This novel model seeks to minimize an objective function that transforms a problem of propagation of regular curves into an optimization problem. The implementation of curve propagation is based on level set theory. By using an implicit representation of a two-dimensional curve, a level set approach can be used to deal with topological changes naturally, and the output is unaffected by different initial positions of the curve. However, the original GAC model, on which the new model is based, only incorporates boundary information into the curve propagation process. An error-producing phenomenon called leakage is inevitable wherever there is an uncertain weak edge. In this research, region-based information is added as a constraint into the original GAC model, thereby, giving this proposed method the ability of integrating both boundary and region-based information during the curve propagation. Adding the region-based constraint eliminates the leakage problem. This dissertation applies the proposed augmented GAC model to the problem of highway extraction from high-resolution aerial photography. First, an optimized stopping criterion is designed and used in the implementation of the GAC model. It effectively saves processing time and computations. Second, a seed point propagation framework is designed and implemented. This framework incorporates highway extraction, tracking, and linking into one procedure. A seed point is usually placed at an end node of highway segments close to the boundary of the image or at a position where possible blocking may occur, such as at an overpass bridge or near vehicle crowds. These seed points can be automatically propagated throughout the entire highway network. During the process, road center points are also extracted, which introduces a search direction for solving possible blocking problems. This new framework has been successfully applied to highway network extraction from a large orthophoto mosaic. In the process, vehicles on the highway extracted from mosaic were detected with an 83% success rate.
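
    A much-reduced sketch of a level-set update with a region term, in the spirit of adding region-based information to an edge-based model; this is a Chan-Vese-style illustration on a toy image, not the dissertation's augmented GAC model.

```python
# Level-set evolution driven by a region force: the interface moves toward
# whichever region mean (inside/outside) each pixel matches better.
import numpy as np

def level_set_step(phi, image, dt=0.2, mu=1.0):
    inside, outside = phi < 0, phi >= 0
    c_in = image[inside].mean() if inside.any() else 0.0
    c_out = image[outside].mean() if outside.any() else 0.0
    force = (image - c_out) ** 2 - (image - c_in) ** 2
    return phi - dt * mu * force   # positive force pulls pixels inside

image = np.zeros((64, 64)); image[20:44, 20:44] = 1.0   # bright square
yy, xx = np.mgrid[:64, :64]
phi = np.hypot(yy - 32, xx - 32) - 10.0                 # initial circle
for _ in range(50):
    phi = level_set_step(phi, image)
print("segmented pixels:", int((phi < 0).sum()))
```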

  12. Extracting information in spike time patterns with wavelets and information theory.

    PubMed

    Lopes-dos-Santos, Vítor; Panzeri, Stefano; Kayser, Christoph; Diamond, Mathew E; Quian Quiroga, Rodrigo

    2015-02-01

    We present a new method to assess the information carried by temporal patterns in spike trains. The method first performs a wavelet decomposition of the spike trains, then uses Shannon information to select a subset of coefficients carrying information, and finally assesses timing information in terms of decoding performance: the ability to identify the presented stimuli from spike train patterns. We show that the method allows: 1) a robust assessment of the information carried by spike time patterns even when this is distributed across multiple time scales and time points; 2) an effective denoising of the raster plots that improves the estimate of stimulus tuning of spike trains; and 3) an assessment of the information carried by temporally coordinated spikes across neurons. Using simulated data, we demonstrate that the Wavelet-Information (WI) method performs better and is more robust to spike time-jitter, background noise, and sample size than well-established approaches, such as principal component analysis, direct estimates of information from digitized spike trains, or a metric-based method. Furthermore, when applied to real spike trains from monkey auditory cortex and from rat barrel cortex, the WI method allows extracting larger amounts of spike timing information. Importantly, the fact that the WI method incorporates multiple time scales makes it robust to the choice of partly arbitrary parameters such as temporal resolution, response window length, number of response features considered, and the number of available trials. These results highlight the potential of the proposed method for accurate and objective assessments of how spike timing encodes information. Copyright © 2015 the American Physiological Society.
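
    A rough sketch of the pipeline described above: wavelet-decompose binned spike trains, keep the most informative coefficients, and decode the stimulus. Coefficient selection here uses a simple F-score instead of the paper's Shannon-information criterion; that substitution, and the synthetic spike trains, are ours.

```python
# Wavelet features from spike trains -> coefficient selection -> decoding.
import numpy as np
import pywt
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
n_trials, n_bins = 200, 64
stimulus = rng.integers(0, 2, n_trials)
rate = np.full((n_trials, n_bins), 0.05)   # baseline firing probability
rate[stimulus == 1, :32] = 0.3             # stimulus 1: early spikes
rate[stimulus == 0, 32:] = 0.3             # stimulus 0: late spikes
spikes = (rng.random((n_trials, n_bins)) < rate).astype(float)

def wavelet_features(trials):
    return np.array([np.concatenate(pywt.wavedec(t, "haar")) for t in trials])

X = wavelet_features(spikes)
decoder = make_pipeline(SelectKBest(f_classif, k=8), LogisticRegression())
print("decoding accuracy:", cross_val_score(decoder, X, stimulus, cv=5).mean())
```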

  13. Semantic Preview Benefit in English: Individual Differences in the Extraction and Use of Parafoveal Semantic Information

    ERIC Educational Resources Information Center

    Veldre, Aaron; Andrews, Sally

    2016-01-01

    Although there is robust evidence that skilled readers of English extract and use orthographic and phonological information from the parafovea to facilitate word identification, semantic preview benefits have been elusive. We sought to establish whether individual differences in the extraction and/or use of parafoveal semantic information could…

  14. Selective Separation of Trivalent Actinides from Lanthanides by Aqueous Processing with Introduction of Soft Donor Atoms

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kenneth L. Nash

    2009-09-22

    Implementation of a closed loop nuclear fuel cycle requires the utilization of Pu-containing MOX fuels with the important side effect of increased production of the transplutonium actinides, most importantly isotopes of Am and Cm. Because the presence of these isotopes significantly impacts the long-term radiotoxicity of high level waste, it is important that effective methods for their isolation and/or transmutation be developed. Furthermore, since transmutation is most efficiently done in the absence of lanthanide fission products (high yield species with large thermal neutron absorption cross sections) it is important to have efficient procedures for the mutual separation of Am and Cm from the lanthanides. The chemistries of these elements are nearly identical, differing only in the slightly stronger strength of interaction of trivalent actinides with ligand donor atoms softer than O (N, Cl-, S). Research being conducted around the world has led to the development of new reagents and processes with considerable potential for this task. However, pilot scale testing of these reagents and processes has demonstrated the susceptibility of the new classes of reagents to radiolytic and hydrolytic degradation. In this project, separations of trivalent actinides from fission product lanthanides have been investigated in studies of 1) the extraction and chemical stability properties of a class of soft-donor extractants that are adapted from water-soluble analogs, 2) the application of water soluble soft-donor complexing agents in tandem with conventional extractant molecules emphasizing fundamental studies of the TALSPEAK Process. This research was conducted principally in radiochemistry laboratories at Washington State University. Collaborators at the Radiological Processing Laboratory (RPL) at the Pacific Northwest National Laboratory (PNNL) have contributed their unique facilities and capabilities, and have supported student internships at PNNL to broaden their academic experience. New information has been developed to qualify the extraction potential of a class of pyridine-functionalized tetraaza complexants indicating potential single contact Am-Nd separation factors of about 40. The methodology developed for characterization will find further application in our continuing efforts to synthesize and characterize new reagents for this separation. Significant new insights into the performance envelope and supporting information on the TALSPEAK process has also been developed.

  15. Parent experiences and information needs relating to procedural pain in children: a systematic review protocol.

    PubMed

    Gates, Allison; Shave, Kassi; Featherstone, Robin; Buckreus, Kelli; Ali, Samina; Scott, Shannon; Hartling, Lisa

    2017-06-06

    There exist many evidence-based interventions available to manage procedural pain in children and neonates, yet they are severely underutilized. Parents play an important role in the management of their child's pain; however, many do not possess adequate knowledge of how to effectively do so. The purpose of the planned study is to systematically review and synthesize current knowledge of the experiences and information needs of parents with regard to the management of their child's pain and distress related to medical procedures in the emergency department. We will conduct a systematic review using rigorous methods and reporting based on the PRISMA statement. We will conduct a comprehensive search of literature published between 2000 and 2016 reporting on parents' experiences and information needs with regard to helping their child manage procedural pain and distress. Ovid MEDLINE, Ovid PsycINFO, CINAHL, and PubMed will be searched. We will also search reference lists of key studies and gray literature sources. Two reviewers will screen the articles following inclusion criteria defined a priori. One reviewer will then extract the data from each article following a data extraction form developed by the study team. The second reviewer will check the data extraction for accuracy and completeness. Any disagreements with regard to study inclusion or data extraction will be resolved via discussion. Data from qualitative studies will be summarized thematically, while those from quantitative studies will be summarized narratively. The second reviewer will confirm the overarching themes resulting from the qualitative and quantitative data syntheses. The Critical Appraisal Skills Programme Qualitative Research Checklist and the Quality Assessment Tool for Quantitative Studies will be used to assess the quality of the evidence from each included study. To our knowledge, no published review exists that comprehensively reports on the experiences and information needs of parents related to the management of their child's procedural pain and distress. A systematic review of parents' experiences and information needs will help to inform strategies to empower them with the knowledge necessary to ensure their child's comfort during a painful procedure. PROSPERO CRD42016043698.

  16. Associating Human-Centered Concepts with Social Networks Using Fuzzy Sets

    NASA Astrophysics Data System (ADS)

    Yager, Ronald R.

    The rapidly growing global interconnectivity, brought about to a large extent by the Internet, has dramatically increased the importance and diversity of social networks. Modern social networks cut across a spectrum from benign recreation-focused websites such as Facebook, to occupationally oriented websites such as LinkedIn, to criminally focused groups such as drug cartels, to devastation- and terror-focused groups such as Al-Qaeda. Many organizations are interested in analyzing and extracting information related to these social networks. Among these are governmental police and security agencies as well as marketing and sales organizations. To aid these organizations there is a need for technologies to model social networks and intelligently extract information from these models. While established technologies exist for the modeling of relational networks [1-7], few technologies exist to extract information from them in a manner compatible with human perception and understanding. Databases are an example of a technology in which we have tools for representing our information as well as tools for querying and extracting the information contained. Our goal is in some sense analogous. We want to use the relational network model to represent information, in this case about relationships and interconnections, and then be able to query the social network using intelligent human-centered concepts. To extend our capabilities to interact with social relational networks we need to associate with these networks human concepts and ideas. Since human beings predominantly use linguistic terms in which to reason and understand, we need to build bridges between human conceptualization and the formal mathematical representation of the social network. Consider for example a concept such as "leader". An analyst may be able to express, in linguistic terms, using a network-relevant vocabulary, properties of a leader. Our task is to translate this linguistic description into a mathematical formalism that allows us to determine how true it is that a particular node is a leader. In this work we look at the use of fuzzy set methodologies [8-10] to provide a bridge between the human analyst and the formal model of the network.

  17. Geographic Information System (GIS) capabilities in traffic accident information management: a qualitative approach

    PubMed Central

    Ahmadi, Maryam; Valinejadi, Ali; Goodarzi, Afshin; Safari, Ameneh; Hemmat, Morteza; Majdabadi, Hesamedin Askari; Mohammadi, Ali

    2017-01-01

    Background: Traffic accidents are one of the more important national and international issues, and their consequences are important at the political, economic, and social levels in a country. Management of traffic accident information requires information systems with analytical capabilities and access to spatial and descriptive data. Objective: The aim of this study was to determine the capabilities of a Geographic Information System (GIS) in the management of traffic accident information. Methods: This qualitative cross-sectional study was performed in 2016. In the first step, GIS capabilities were identified from literature retrieved from the Internet and based on the inclusion criteria. Review of the literature was performed until data saturation was reached; a form was used to extract the capabilities. In the second step, the study population consisted of hospital managers, police, emergency, statisticians, and IT experts in trauma, emergency, and police centers. Sampling was purposive. Data were collected using a questionnaire based on the first-step data; validity and reliability were established by content validity and a Cronbach's alpha of 75%. Data were analyzed using the decision Delphi technique. Results: GIS capabilities were identified in ten categories and 64 sub-categories. Import and processing of spatial and descriptive data, as well as analysis of these data, were the most important capabilities of GIS in traffic accident information management. Conclusion: Storing and retrieving descriptive and spatial data; providing statistical analyses in table, chart, and zoning formats; managing ill-structured issues; determining the cost-effectiveness of decisions; and prioritizing their implementation were the most important capabilities of GIS, which can be efficient in the management of traffic accident information. PMID:28848627

  18. Extracting laboratory test information from biomedical text

    PubMed Central

    Kang, Yanna Shen; Kayaalp, Mehmet

    2013-01-01

    Background: No previous study reported the efficacy of current natural language processing (NLP) methods for extracting laboratory test information from narrative documents. This study investigates the pathology informatics question of how accurately such information can be extracted from text with the current tools and techniques, especially machine learning and symbolic NLP methods. The study data came from a text corpus maintained by the U.S. Food and Drug Administration, containing a rich set of information on laboratory tests and test devices. Methods: The authors developed a symbolic information extraction (SIE) system to extract device and test specific information about four types of laboratory test entities: Specimens, analytes, units of measures and detection limits. They compared the performance of SIE and three prominent machine learning based NLP systems, LingPipe, GATE and BANNER, each implementing a distinct supervised machine learning method, hidden Markov models, support vector machines and conditional random fields, respectively. Results: Machine learning systems recognized laboratory test entities with moderately high recall, but low precision rates. Their recall rates were relatively higher when the number of distinct entity values (e.g., the spectrum of specimens) was very limited or when lexical morphology of the entity was distinctive (as in units of measures), yet SIE outperformed them with statistically significant margins on extracting specimen, analyte and detection limit information in both precision and F-measure. Its high recall performance was statistically significant on analyte information extraction. Conclusions: Despite its shortcomings against machine learning methods, a well-tailored symbolic system may better discern relevancy among a pile of information of the same type and may outperform a machine learning system by tapping into lexically non-local contextual information such as the document structure. PMID:24083058
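
    A tiny symbolic-extraction sketch in the spirit of the SIE system described above, covering two of the four entity types (analytes and units of measure); the lexicon and patterns are illustrative only, not the study's rules.

```python
# Hand-written lexicon and pattern rules for analytes and measurements.
import re

ANALYTES = {"glucose", "creatinine", "hemoglobin"}
UNIT_PATTERN = re.compile(r"\b(\d+(?:\.\d+)?)\s*(mg/dL|mmol/L|g/dL)\b")

def extract(text):
    found = {"analytes": [], "measurements": []}
    for word in re.findall(r"[A-Za-z]+", text.lower()):
        if word in ANALYTES:
            found["analytes"].append(word)
    found["measurements"] = UNIT_PATTERN.findall(text)
    return found

print(extract("Serum glucose was 95 mg/dL; creatinine 1.1 mg/dL."))
```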

  19. Client-side Skype forensics: an overview

    NASA Astrophysics Data System (ADS)

    Meißner, Tina; Kröger, Knut; Creutzburg, Reiner

    2013-03-01

    IT security and computer forensics are important components of information technology. In the present study, a client-side forensic analysis of Skype is performed. It is designed to explain which kinds of user data are stored on a computer and which tools allow the extraction of those data for a forensic investigation. Both methods are described: a manual analysis and an analysis with (mainly) open-source tools.

  20. Evaluation of Antioxidant Properties, Phenolic Compounds, Anthelmintic, and Cytotoxic Activities of Various Extracts Isolated from Nepeta cadmea: An Endemic Plant for Turkey.

    PubMed

    Kaska, Arzu; Deniz, Nahide; Çiçek, Mehmet; Mammadov, Ramazan

    2018-05-10

    Nepeta cadmea Boiss. is a species endemic to Turkey that belongs to the Nepeta genus. Several species of this genus are used in folk medicine. This study was designed to investigate the phenolic compounds and the antioxidant, anthelmintic, and cytotoxic activities of various extracts (ethanol, methanol, acetone, and water) of N. cadmea. The antioxidant activities of these extracts were analyzed using scavenging methods (DPPH, ABTS, and H2O2 scavenging activity), the β-carotene/linoleic acid test system, the phosphomolybdenum method, and metal chelating activity. Among the 4 different extracts of N. cadmea that were evaluated, the water extract showed the highest radical scavenging (DPPH, 25.54 μg/mL and ABTS, 14.51 μg/mL) and antioxidant (β-carotene, 86.91%) activities. In the metal chelating and H2O2 scavenging activities, the acetone extract was statistically different from the other extracts. For the phosphomolybdenum method, the antioxidant capacity of the extracts was in the range of 8.15 to 80.40 μg/mg. The phenolic content of the ethanol extract was examined using HPLC, which identified several phenolics: epicatechin and chlorogenic and caffeic acids. With regard to the anthelmintic properties, dose-dependent activity was observed in each of the extracts of N. cadmea. All the extracts exhibited high cytotoxic activities. The results will provide additional information for further studies on the biological activities of N. cadmea, while also helping us to understand the importance of this species. Furthermore, based on the results obtained, N. cadmea may be considered a potentially useful supplement for the human diet, as well as a natural antioxidant for medicinal applications. The plants of the Nepeta genus have been extensively used as traditional herbal medicines. Nepeta cadmea Boiss., one of the species belonging to the Nepeta genus, is endemic to Turkey. In our study, we demonstrated the antioxidant capacities; total phenolic, flavonoid, and tannin content; and anthelmintic and cytotoxic activities of various extracts of Nepeta cadmea. The present study could well supply valuable data for future investigations and further information on the potential use of this endemic plant for humans, in both dietary and pharmacological applications. © 2018 Institute of Food Technologists®.

  1. The many faces of research on face perception.

    PubMed

    Little, Anthony C; Jones, Benedict C; DeBruine, Lisa M

    2011-06-12

    Face perception is fundamental to human social interaction. Many different types of important information are visible in faces and the processes and mechanisms involved in extracting this information are complex and can be highly specialized. The importance of faces has long been recognized by a wide range of scientists. Importantly, the range of perspectives and techniques that this breadth has brought to face perception research has, in recent years, led to many important advances in our understanding of face processing. The articles in this issue on face perception each review a particular arena of interest in face perception, variously focusing on (i) the social aspects of face perception (attraction, recognition and emotion), (ii) the neural mechanisms underlying face perception (using brain scanning, patient data, direct stimulation of the brain, visual adaptation and single-cell recording), and (iii) comparative aspects of face perception (comparing adult human abilities with those of chimpanzees and children). Here, we introduce the central themes of the issue and present an overview of the articles.

  2. Engineering analysis of ERTS data for rice in the Philippines

    NASA Technical Reports Server (NTRS)

    Mcnair, A. J. (Principal Investigator); Heydt, H. L.

    1973-01-01

    The author has identified the following significant results. Rice is an important food worldwide. Worthwhile goals, particularly for developing nations, are the capability to recognize from satellite imagery: (1) areas where rice is grown, and (2) growth status (irrigation, vigor, yield). A two-step procedure to achieve this is being investigated. Ground truth and ERTS-1 imagery (four passes) covering 80% of a rice growth cycle for some Philippine sites have been analyzed. One-D and three-D signature extraction, and synthesis of an initial site recognition/status algorithm, have been performed. Results are encouraging, but additional passes and sites must be analyzed. Good position information for extracted data is a must.

  3. Extraction and purification methods in downstream processing of plant-based recombinant proteins.

    PubMed

    Łojewska, Ewelina; Kowalczyk, Tomasz; Olejniczak, Szymon; Sakowicz, Tomasz

    2016-04-01

    During the last two decades, the production of recombinant proteins in plant systems has been receiving increased attention. Currently, proteins are considered the most important biopharmaceuticals. However, high costs and problems with scaling up the purification and isolation processes make the production of plant-based recombinant proteins a challenging task. This paper presents a summary of the information regarding downstream processing in plant systems and provides a comprehensible overview of its key steps, such as extraction and purification. To highlight the recent progress, mainly new developments in downstream technology have been chosen. Furthermore, besides the most popular techniques, alternative methods have been described. Copyright © 2015 Elsevier Inc. All rights reserved.

  4. SIDECACHE: Information access, management and dissemination framework for web services.

    PubMed

    Doderer, Mark S; Burkhardt, Cory; Robbins, Kay A

    2011-06-14

    Many bioinformatics algorithms and data sets are deployed using web services so that the results can be explored via the Internet and easily integrated into other tools and services. These services often include data from other sites that is accessed either dynamically or through file downloads. Developers of these services face several problems because of the dynamic nature of the information from the upstream services. Many publicly available repositories of bioinformatics data frequently update their information. When such an update occurs, the developers of the downstream service may also need to update. For file downloads, this process is typically performed manually, followed by a web service restart. Requests for information obtained by dynamic access of upstream sources are sometimes subject to rate restrictions. SideCache provides a framework for deploying web services that integrate information extracted from other databases and from web sources that are periodically updated. This situation occurs frequently in biotechnology, where new information is being continuously generated and the latest information is important. SideCache provides several types of services including proxy access and rate control, local caching, and automatic web service updating. We have used the SideCache framework to automate the deployment and updating of a number of bioinformatics web services and tools that extract information from remote primary sources such as NCBI, NCIBI, and Ensembl. The SideCache framework also has been used to share research results through the use of a SideCache-derived web service.

  5. Ore grade decrease as life cycle impact indicator for metal scarcity: the case of copper.

    PubMed

    Vieira, Marisa D M; Goedkoop, Mark J; Storm, Per; Huijbregts, Mark A J

    2012-12-04

    In the life cycle assessment (LCA) of products, the increasing scarcity of metal resources is currently addressed in a preliminary way. Here, we propose a new method on the basis of global ore grade information to assess the importance of the extraction of metal resources in the life cycle of products. It is shown how characterization factors (CFs), reflecting the decrease in ore grade due to an increase in metal extraction, can be derived from cumulative ore grade-tonnage relationships. CFs were derived for three different types of copper deposits (porphyry, sediment-hosted, and volcanogenic massive sulfide). We tested the influence of the CF model (marginal vs average), mathematical distribution (loglogistic vs loglinear), and reserve estimate (ultimate reserve vs reserve base). For the marginal CFs, the choice of statistical distribution and the estimate of the copper reserves introduce differences of a factor of 1.0-5.0 and a factor of 1.2-1.7, respectively. For the average CFs, the differences for these two choices are larger, i.e., a factor of 5.7-43 and a factor of 2.1-3.8, respectively. Comparing the marginal CFs with the average CFs, the differences are higher still (a factor of 1.7-94). This paper demonstrates that cumulative grade-tonnage relationships for metal extraction can be used in LCA to assess the relative importance of metal extractions.
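
    A hedged numerical sketch of the idea, under the assumption of a toy log-linear grade-tonnage curve (not the paper's fitted distributions): the marginal CF is read as the local grade decrease per unit of additional extraction, and the average CF as the grade drop spread over the remaining reserve.

    ```python
    # Toy grade-tonnage curve: marginal vs. average characterization factors.
    import numpy as np

    T = np.linspace(1e6, 1e9, 1000)   # cumulative metal extracted (toy units)
    g = 2.5 - 0.25 * np.log10(T)      # average ore grade in %, toy log-linear fit

    T_now = 5e8                       # assumed current extraction level
    slope = np.diff(g) / np.diff(T)   # dg/dT between grid points
    marginal_cf = -np.interp(T_now, T[:-1], slope)   # local grade decrease
    average_cf = (np.interp(T_now, T, g) - g[-1]) / (T[-1] - T_now)  # remaining reserve
    print(marginal_cf, average_cf)
    ```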

  6. Rapid discrimination and characterization of vanilla bean extracts by attenuated total reflection infrared spectroscopy and selected ion flow tube mass spectrometry.

    PubMed

    Sharp, Michael D; Kocaoglu-Vurma, Nurdan A; Langford, Vaughan; Rodriguez-Saona, Luis E; Harper, W James

    2012-03-01

    Vanilla beans have been shown to contain over 200 compounds, which can vary in concentration depending on the region where the beans are harvested. Several compounds including vanillin, p-hydroxybenzaldehyde, guaiacol, and anise alcohol have been found to be important for the aroma profile of vanilla. Our objective was to evaluate the performance of selected ion flow tube mass spectrometry (SIFT-MS) and Fourier-transform infrared (FTIR) spectroscopy for rapid discrimination and characterization of vanilla bean extracts. Vanilla extracts were obtained from different countries including Uganda, Indonesia, Papua New Guinea, Madagascar, and India. Multivariate data analysis (soft independent modeling of class analogy, SIMCA) was utilized to determine the clustering patterns between samples. Both methods provided differentiation between samples for all vanilla bean extracts. FTIR differentiated on the basis of functional groups, whereas the SIFT-MS method provided more specific information about the chemical basis of the differentiation. SIMCA's discriminating power showed that the most important compounds responsible for the differentiation between samples by SIFT-MS were vanillin, anise alcohol, 4-methylguaiacol, p-hydroxybenzaldehyde/trimethylpyrazine, p-cresol/anisole, guaiacol, isovaleric acid, and acetic acid. ATR-IR spectroscopy analysis showed that the classification of samples was related to major bands at 1523, 1573, 1516, 1292, 1774, 1670, 1608, and 1431 cm⁻¹, associated with vanillin and vanillin derivatives. © 2012 Institute of Food Technologists®

  7. Overestimation of organic phosphorus in wetland soils by alkaline extraction and molybdate colorimetry.

    PubMed

    Turner, Benjamin L; Newman, Susan; Reddy, K Ramesh

    2006-05-15

    Accurate information on the chemical nature of soil phosphorus is essential for understanding its bioavailability and fate in wetland ecosystems. Solution phosphorus-31 nuclear magnetic resonance (31P NMR) spectroscopy was used to assess the conventional colorimetric procedure for phosphorus speciation in alkaline extracts of organic soils from the Florida Everglades. Molybdate colorimetry markedly overestimated organic phosphorus, by between 30 and 54% compared to NMR spectroscopy. This was due in large part to the association of inorganic phosphate with organic matter, although the error was exacerbated in some samples by the presence of pyrophosphate, an inorganic polyphosphate that is not detected by colorimetry. The results have important implications for our understanding of phosphorus biogeochemistry in wetlands and suggest that alkaline extraction and solution 31P NMR spectroscopy is the only accurate method for quantifying organic phosphorus in wetland soils.

  8. A model for indexing medical documents combining statistical and symbolic knowledge.

    PubMed

    Avillach, Paul; Joubert, Michel; Fieschi, Marius

    2007-10-11

    To develop and evaluate an information processing method based on terminologies, in order to index medical documents in any given documentary context. We designed a model using both symbolic general knowledge extracted from the Unified Medical Language System (UMLS) and statistical knowledge extracted from a domain of application. Using statistical knowledge allowed us to contextualize the general knowledge for every particular situation. For each document studied, the extracted terms are ranked to highlight the most significant ones. The model was tested on a set of 17,079 French standardized discharge summaries (SDSs). The most important ICD-10 term of each SDS was ranked 1st or 2nd by the method in nearly 90% of the cases. The use of several terminologies leads to more precise indexing. The improvement achieved in the model's implementation performance as a result of using semantic relationships is encouraging.

  9. A Model for Indexing Medical Documents Combining Statistical and Symbolic Knowledge.

    PubMed Central

    Avillach, Paul; Joubert, Michel; Fieschi, Marius

    2007-01-01

    OBJECTIVES: To develop and evaluate an information processing method based on terminologies, in order to index medical documents in any given documentary context. METHODS: We designed a model using both symbolic general knowledge extracted from the Unified Medical Language System (UMLS) and statistical knowledge extracted from a domain of application. Using statistical knowledge allowed us to contextualize the general knowledge for every particular situation. For each document studied, the extracted terms are ranked to highlight the most significant ones. The model was tested on a set of 17,079 French standardized discharge summaries (SDSs). RESULTS: The most important ICD-10 term of each SDS was ranked 1st or 2nd by the method in nearly 90% of the cases. CONCLUSIONS: The use of several terminologies leads to more precise indexing. The improvement achieved in the model’s implementation performances as a result of using semantic relationships is encouraging. PMID:18693792

  10. Extracting DNA words based on the sequence features: non-uniform distribution and integrity.

    PubMed

    Li, Zhi; Cao, Hongyan; Cui, Yuehua; Zhang, Yanbo

    2016-01-25

    DNA sequence can be viewed as an unknown language with words as its functional units. Given that most sequence alignment algorithms, such as motif discovery algorithms, depend on the quality of background information about sequences, it is necessary to develop an ab initio algorithm for extracting the "words" based only on the DNA sequences. We considered non-uniform distribution and integrity to be two important features of a word, based on which we developed an ab initio algorithm to extract "DNA words" that have potential functional meaning. A Kolmogorov-Smirnov test was used to test the uniformity of the distribution of DNA sequences, and integrity was judged by sequence and position alignment. Two random base sequences were adopted as negative controls, and an English book was used as a positive control to verify our algorithm. We applied our algorithm to the genomes of Saccharomyces cerevisiae and 10 strains of Escherichia coli to show the utility of the methods. The results provide strong evidence that the algorithm is a promising tool for building a DNA dictionary ab initio. Our method provides a fast way to screen important DNA elements at large scale and offers potential insights into the understanding of a genome.
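
    A minimal sketch of the first criterion, assuming normalised word positions and a synthetic sequence: occurrences of a candidate word along a genome are tested against a uniform distribution with a Kolmogorov-Smirnov test, and a small p-value flags the word as non-uniformly placed.

    ```python
    # Test positional uniformity of a candidate "DNA word" (synthetic data).
    import numpy as np
    from scipy.stats import kstest

    def word_positions(sequence: str, word: str) -> np.ndarray:
        """Normalised start positions of `word` occurrences in `sequence`."""
        pos = [i for i in range(len(sequence) - len(word) + 1)
               if sequence[i:i + len(word)] == word]
        return np.array(pos) / max(len(sequence) - len(word), 1)

    rng = np.random.default_rng(0)
    genome = "".join(rng.choice(list("ACGT"), size=100_000))
    stat, p_value = kstest(word_positions(genome, "ACGTA"), "uniform")
    # A small p-value would reject uniformity, marking the word as a
    # candidate functional element under the non-uniform-distribution test.
    print(stat, p_value)
    ```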

  11. Extracting remaining information from an inconclusive result in optimal unambiguous state discrimination

    NASA Astrophysics Data System (ADS)

    Zhang, Gang; Yu, Long-Bao; Zhang, Wen-Hai; Cao, Zhuo-Liang

    2014-12-01

    In unambiguous state discrimination, the measurement results consist of error-free results and an inconclusive result; the inconclusive result is conventionally regarded as a useless remainder from which no information about the initial states can be extracted. In this paper, we investigate the problem of extracting the remaining information from an inconclusive result, provided that the optimal total success probability is attained. We present three simple examples. Partial information can be extracted from an inconclusive answer in the first two examples, but not in the third. The initial states in the third example are defined as the highly symmetric states.

  12. PCA Tomography: how to extract information from data cubes

    NASA Astrophysics Data System (ADS)

    Steiner, J. E.; Menezes, R. B.; Ricci, T. V.; Oliveira, A. S.

    2009-05-01

    Astronomy has evolved almost exclusively through the use of spectroscopic and imaging techniques, operated separately. With the development of modern technologies, it is possible to obtain data cubes that combine both techniques simultaneously, producing images with spectral resolution. Extracting information from them can be quite complex, and hence the development of new methods of data analysis is desirable. We present a method for analyzing data cubes (data from single-field observations, containing two spatial and one spectral dimension) that uses Principal Component Analysis (PCA) to express the data in a form of reduced dimensionality, facilitating efficient information extraction from very large data sets. PCA transforms the system of correlated coordinates into a system of uncorrelated coordinates ordered by principal components of decreasing variance. The new coordinates are referred to as eigenvectors, and the projections of the data on to these coordinates produce images we call tomograms. The association of the tomograms (images) with the eigenvectors (spectra) is important for the interpretation of both. The eigenvectors are mutually orthogonal, and this information is fundamental for their handling and interpretation. When the data cube shows objects that present uncorrelated physical phenomena, the eigenvectors' orthogonality may be instrumental in separating and identifying them. By handling eigenvectors and tomograms, one can enhance features, extract noise, compress data, extract spectra, etc. We applied the method, for illustration purposes only, to the central region of the low-ionization nuclear emission region (LINER) galaxy NGC 4736, and demonstrate that it has a type 1 active nucleus, not known before. Furthermore, we show that it is displaced from the centre of its stellar bulge. Based on observations obtained at the Gemini Observatory, which is operated by the Association of Universities for Research in Astronomy, Inc., under a cooperative agreement with the National Science Foundation on behalf of the Gemini partnership: the National Science Foundation (United States), the Science and Technology Facilities Council (United Kingdom), the National Research Council (Canada), CONICYT (Chile), the Australian Research Council (Australia), Ministério da Ciência e Tecnologia (Brazil) and SECYT (Argentina).
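
    A compact sketch of the core operation on a synthetic cube: the cube is flattened so that each spatial pixel becomes a row, PCA is performed via an SVD, and each principal-component projection is reshaped back into an image (a tomogram) paired with its eigenspectrum.

    ```python
    # PCA of a (spatial x spatial x spectral) data cube via SVD (synthetic data).
    import numpy as np

    ny, nx, nl = 32, 32, 200                    # two spatial axes, one spectral
    cube = np.random.default_rng(1).normal(size=(ny, nx, nl))

    X = cube.reshape(ny * nx, nl)               # rows: spaxels, cols: wavelengths
    X -= X.mean(axis=0)                         # centre each spectral channel
    U, S, Vt = np.linalg.svd(X, full_matrices=False)

    eigenspectra = Vt                           # eigenvectors (spectra)
    tomograms = (X @ Vt.T).reshape(ny, nx, -1)  # projections as images
    variance_fraction = S**2 / np.sum(S**2)     # ordering by decreasing variance
    ```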

  13. Construction of Green Tide Monitoring System and Research on its Key Techniques

    NASA Astrophysics Data System (ADS)

    Xing, B.; Li, J.; Zhu, H.; Wei, P.; Zhao, Y.

    2018-04-01

    Green tide, a type of marine natural disaster, has appeared every year along the Qingdao coast since the large-scale bloom in 2008, bringing great losses to the region. It is therefore of great value to obtain real-time dynamic information about green tide distribution. In this study, both optical remote sensing and microwave remote sensing are employed for green tide monitoring. A specific remote sensing data processing flow and a green tide information extraction algorithm are designed according to the different characteristics of the optical and microwave data. For the extraction of green tide spatial distribution information, an automatic extraction algorithm for green tide distribution boundaries is designed based on the principle of mathematical morphology dilation/erosion. Key issues in information extraction, including the division of green tide regions, the derivation of basic distributions, the delimitation of distribution boundaries, and the elimination of islands, have been solved. The automatic generation of green tide distribution boundaries from the results of remote sensing information extraction is realized. Finally, a green tide monitoring system is built based on IDL/GIS secondary development in an integrated RS and GIS environment, achieving the integration of RS monitoring and information extraction.
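
    A minimal sketch of the morphological boundary step, on a synthetic mask: small islands are removed by a binary opening, and the distribution boundary is taken as the difference between the dilated and eroded masks. Structuring elements and iteration counts here are assumptions, not the system's settings.

    ```python
    # Morphological boundary extraction from a binary green-tide mask (toy data).
    import numpy as np
    from scipy import ndimage

    mask = np.zeros((200, 200), dtype=bool)
    mask[60:140, 50:150] = True                       # toy green-tide region

    clean = ndimage.binary_opening(mask, iterations=2)  # eliminate small islands
    # Boundary band: pixels gained by dilation or lost by erosion.
    boundary = ndimage.binary_dilation(clean) ^ ndimage.binary_erosion(clean)
    ```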

  14. Automatic information extraction from unstructured mammography reports using distributed semantics.

    PubMed

    Gupta, Anupama; Banerjee, Imon; Rubin, Daniel L

    2018-02-01

    To date, the methods developed for automated extraction of information from radiology reports are mainly rule-based or dictionary-based and therefore require substantial manual effort to build. Recent efforts to develop automated systems for entity detection have been undertaken, but little work has been done to automatically extract relations and their associated named entities in narrative radiology reports with accuracy comparable to rule-based methods. Our goal is to extract relations in an unsupervised way from radiology reports without specifying prior domain knowledge. We propose a hybrid approach for information extraction that combines a dependency-based parse tree with distributed semantics to generate structured information frames about particular findings/abnormalities from free-text mammography reports. The proposed IE system obtains an F1-score of 0.94 in terms of completeness of the content in the information frames, outperforming a state-of-the-art rule-based system in this domain by a significant margin. The proposed system can be leveraged in a variety of applications, such as decision support and information retrieval, and may also easily scale to other radiology domains, since there is no need to tune the system with hand-crafted information extraction rules. Copyright © 2018 Elsevier Inc. All rights reserved.

  15. Weak characteristic information extraction from early fault of wind turbine generator gearbox

    NASA Astrophysics Data System (ADS)

    Xu, Xiaoli; Liu, Xiuli

    2017-09-01

    Given the weak degradation characteristic information during early fault evolution in the gearbox of a wind turbine generator, traditional singular value decomposition (SVD)-based denoising may result in a loss of useful information. A weak characteristic information extraction method based on μ-SVD and local mean decomposition (LMD) is developed to address this problem. The basic principle of the method is as follows: determine the denoising order based on the cumulative contribution rate, perform signal reconstruction, extract and subject the noisy part of the signal to LMD and μ-SVD denoising, and obtain the denoised signal through superposition. Experimental results show that this method can significantly weaken signal noise, effectively extract the weak characteristic information of early faults, and facilitate early fault warning and dynamic predictive maintenance.
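
    A hedged sketch of one ingredient of the method: choosing an SVD truncation order from the cumulative contribution rate of the singular values of a Hankel embedding, then reconstructing a denoised signal by anti-diagonal averaging. The LMD stage and the μ-SVD weighting described in the paper are not reproduced here.

    ```python
    # SVD denoising with order selection by cumulative contribution rate.
    import numpy as np
    from scipy.linalg import hankel, svd

    def svd_denoise(x: np.ndarray, embed: int = 50, rate: float = 0.9) -> np.ndarray:
        H = hankel(x[:embed], x[embed - 1:])        # trajectory (Hankel) matrix
        U, s, Vt = svd(H, full_matrices=False)
        # Smallest order whose cumulative energy reaches the target rate.
        k = int(np.searchsorted(np.cumsum(s**2) / np.sum(s**2), rate)) + 1
        Hk = (U[:, :k] * s[:k]) @ Vt[:k]
        # Average anti-diagonals to map the rank-k matrix back to a signal.
        n = len(x)
        out, cnt = np.zeros(n), np.zeros(n)
        for i in range(Hk.shape[0]):
            for j in range(Hk.shape[1]):
                out[i + j] += Hk[i, j]
                cnt[i + j] += 1
        return out / cnt

    t = np.linspace(0, 1, 400)
    noisy = np.sin(2 * np.pi * 25 * t) \
        + 0.5 * np.random.default_rng(2).normal(size=t.size)
    denoised = svd_denoise(noisy)
    ```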

  16. An information extraction framework for cohort identification using electronic health records.

    PubMed

    Liu, Hongfang; Bielinski, Suzette J; Sohn, Sunghwan; Murphy, Sean; Wagholikar, Kavishwar B; Jonnalagadda, Siddhartha R; Ravikumar, K E; Wu, Stephen T; Kullo, Iftikhar J; Chute, Christopher G

    2013-01-01

    Information extraction (IE), a natural language processing (NLP) task that automatically extracts structured or semi-structured information from free text, has become popular in the clinical domain for supporting automated systems at point-of-care and enabling secondary use of electronic health records (EHRs) for clinical and translational research. However, a high performance IE system can be very challenging to construct due to the complexity and dynamic nature of human language. In this paper, we report an IE framework for cohort identification using EHRs that is a knowledge-driven framework developed under the Unstructured Information Management Architecture (UIMA). A system to extract specific information can be developed by subject matter experts through expert knowledge engineering of the externalized knowledge resources used in the framework.

  17. Synthesising quantitative and qualitative research in evidence‐based patient information

    PubMed Central

    Goldsmith, Megan R; Bankhead, Clare R; Austoker, Joan

    2007-01-01

    Background Systematic reviews have, in the past, focused on quantitative studies and clinical effectiveness, while excluding qualitative evidence. Qualitative research can inform evidence‐based practice independently of other research methodologies but methods for the synthesis of such data are currently evolving. Synthesising quantitative and qualitative research in a single review is an important methodological challenge. Aims This paper describes the review methods developed and the difficulties encountered during the process of updating a systematic review of evidence to inform guidelines for the content of patient information related to cervical screening. Methods Systematic searches of 12 electronic databases (January 1996 to July 2004) were conducted. Studies that evaluated the content of information provided to women about cervical screening or that addressed women's information needs were assessed for inclusion. A data extraction form and quality assessment criteria were developed from published resources. A non‐quantitative synthesis was conducted and a tabular evidence profile for each important outcome (eg “explain what the test involves”) was prepared. The overall quality of evidence for each outcome was then assessed using an approach published by the GRADE working group, which was adapted to suit the review questions and modified to include qualitative research evidence. Quantitative and qualitative studies were considered separately for every outcome. Results 32 papers were included in the systematic review following data extraction and assessment of methodological quality. The review questions were best answered by evidence from a range of data sources. The inclusion of qualitative research, which was often highly relevant and specific to many components of the screening information materials, enabled the production of a set of recommendations that will directly affect policy within the NHS Cervical Screening Programme. Conclusions A practical example is provided of how quantitative and qualitative data sources might successfully be brought together and considered in one review. PMID:17325406

  18. Collective Influence of Multiple Spreaders Evaluated by Tracing Real Information Flow in Large-Scale Social Networks.

    PubMed

    Teng, Xian; Pei, Sen; Morone, Flaviano; Makse, Hernán A

    2016-10-26

    Identifying the most influential spreaders that maximize information flow is a central question in network theory. Recently, a scalable method called "Collective Influence (CI)" has been put forward for collective influence maximization. In contrast to heuristic methods that evaluate nodes' significance separately, the CI method inspects the collective influence of multiple spreaders. Although CI applies to the influence maximization problem in the percolation model, it is still important to examine its efficacy in realistic information spreading. Here, we examine real-world information flow in various social and scientific platforms including the American Physical Society, Facebook, Twitter, and LiveJournal. Since empirical data cannot be directly mapped to ideal multi-source spreading, we leverage the behavioral patterns of users extracted from the data to construct "virtual" information spreading processes. Our results demonstrate that the set of spreaders selected by CI can induce a larger scale of information propagation. Moreover, local measures such as the number of connections or citations are not necessarily the deterministic factors of a node's importance in realistic information spreading. This result has significance for ranking scientists in scientific networks like the APS, where the commonly used number of citations can be a poor indicator of the collective influence of authors in the community.
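
    For reference, the CI score of a node i at radius ℓ is commonly written as CI_ℓ(i) = (k_i − 1) Σ_{j ∈ ∂Ball(i,ℓ)} (k_j − 1), where the sum runs over the frontier of the ball of radius ℓ around i. The sketch below computes this score on a synthetic graph; the adaptive node-removal loop of the full algorithm is omitted.

    ```python
    # Collective Influence score on a synthetic network.
    import networkx as nx

    def collective_influence(G: nx.Graph, node, radius: int = 2) -> int:
        dist = nx.single_source_shortest_path_length(G, node, cutoff=radius)
        frontier = [j for j, d in dist.items() if d == radius]
        return (G.degree(node) - 1) * sum(G.degree(j) - 1 for j in frontier)

    G = nx.barabasi_albert_graph(1000, 3, seed=0)
    top = sorted(G.nodes, key=lambda n: collective_influence(G, n), reverse=True)[:5]
    print(top)   # candidate influential spreaders by CI
    ```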

  19. Embedded importance watermarking for image verification in radiology

    NASA Astrophysics Data System (ADS)

    Osborne, Domininc; Rogers, D.; Sorell, M.; Abbott, Derek

    2004-03-01

    Digital medical images used in radiology are quite different from everyday continuous-tone images. Radiology images require that all detailed diagnostic information can be extracted, which traditionally constrains digital medical images to be large and stored without loss of information. In order to transmit diagnostic images over a narrowband wireless communication link for remote diagnosis, lossy compression schemes must be used. This involves discarding detailed information and compressing the data, making it more susceptible to error. The loss of image detail and the incidental degradation occurring during transmission have potential legal accountability issues, especially in the case of the null diagnosis of a tumor. The work proposed here investigates techniques for verifying the veracity of medical images, in particular detailing the use of embedded watermarking as an objective means to ensure that important parts of the medical image can be verified. We present a result showing how embedded watermarking can be used to differentiate contextual from detailed information. The types of images used include spiral hairline fractures and small tumors, which contain the essential diagnostic high-spatial-frequency information.
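
    A toy sketch of watermark embedding mechanics only, using least-significant-bit substitution on a synthetic image; the paper's actual scheme for separating contextual from diagnostic detail is not reproduced here.

    ```python
    # Toy LSB watermark embedding and recovery (illustrative only).
    import numpy as np

    def embed_lsb(image: np.ndarray, bits: np.ndarray) -> np.ndarray:
        """Write `bits` into the least-significant bits of the first pixels."""
        flat = image.flatten()                      # flatten() returns a copy
        flat[:bits.size] = (flat[:bits.size] & 0xFE) | bits
        return flat.reshape(image.shape)

    rng = np.random.default_rng(0)
    img = rng.integers(0, 256, (64, 64), dtype=np.uint8)
    payload = rng.integers(0, 2, 128, dtype=np.uint8)
    marked = embed_lsb(img, payload)
    assert np.array_equal(marked.flatten()[:128] & 1, payload)  # verification
    ```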

  20. A Framework for Land Cover Classification Using Discrete Return LiDAR Data: Adopting Pseudo-Waveform and Hierarchical Segmentation

    NASA Technical Reports Server (NTRS)

    Jung, Jinha; Pasolli, Edoardo; Prasad, Saurabh; Tilton, James C.; Crawford, Melba M.

    2014-01-01

    Acquiring current, accurate land-use information is critical for monitoring and understanding the impact of anthropogenic activities on natural environments. Remote sensing technologies are of increasing importance because of their capability to acquire information for large areas in a timely manner, enabling decision makers to be more effective in complex environments. Although optical imagery has been shown to be successful for land cover classification, active sensors, such as light detection and ranging (LiDAR), have distinct capabilities that can be exploited to improve classification results. However, the utilization of LiDAR data for land cover classification has not been fully exploited. Moreover, spatial-spectral classification has recently gained significant attention, since classification accuracy can be improved by extracting additional information from neighboring pixels. Although spatial information has been widely used for spectral data, less attention has been given to LiDAR data. In this work, a new framework for land cover classification using discrete return LiDAR data is proposed. Pseudo-waveforms are generated from the LiDAR data and processed by hierarchical segmentation. Spatial features are extracted in a region-based way using a new unsupervised strategy for multiple pruning of the segmentation hierarchy. The proposed framework is validated experimentally on a real dataset acquired in an urban area. Better classification results are exhibited by the proposed framework compared to the cases in which basic LiDAR products such as the digital surface model and intensity image are used. Moreover, the proposed region-based feature extraction strategy results in improved classification accuracies in comparison with a more traditional window-based approach.
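
    A minimal sketch of pseudo-waveform generation, under assumed bin sizes and inputs: the discrete returns falling in one grid cell are binned by height, optionally weighted by intensity, to form a waveform-like vertical profile.

    ```python
    # Pseudo-waveform from discrete-return LiDAR within one grid cell (toy data).
    import numpy as np

    def pseudo_waveform(heights: np.ndarray, intensities: np.ndarray,
                        z_min: float = 0.0, z_max: float = 30.0,
                        n_bins: int = 60) -> np.ndarray:
        wf, _ = np.histogram(heights, bins=n_bins, range=(z_min, z_max),
                             weights=intensities)
        return wf / wf.sum() if wf.sum() > 0 else wf   # normalise if non-empty

    rng = np.random.default_rng(3)
    wf = pseudo_waveform(rng.uniform(0, 25, 40), rng.uniform(10, 200, 40))
    ```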

  1. The Extraction of Post-Earthquake Building Damage Information Based on Convolutional Neural Network

    NASA Astrophysics Data System (ADS)

    Chen, M.; Wang, X.; Dou, A.; Wu, X.

    2018-04-01

    Building damage information extracted from remote sensing (RS) imagery is meaningful for supporting relief and for the effective reduction of losses caused by earthquakes. Both traditional pixel-based and object-oriented methods have shortcomings in extracting object information: pixel-based methods cannot make full use of the contextual information of objects, while object-oriented methods face the problems that image segmentation is often not ideal and that choosing a feature space is difficult. In this paper, a new strategy is proposed that combines a convolutional neural network (CNN) with image segmentation to extract building damage information from remote sensing imagery. The key idea of this method has two steps: first, the CNN predicts a damage probability for each pixel; second, the probabilities are integrated within each segmentation spot. The method is tested by extracting collapsed and uncollapsed buildings from aerial imagery acquired in Longtoushan Town after the Ms 6.5 Ludian earthquake in Yunnan Province. The results show the effectiveness of the proposed method in extracting post-earthquake building damage information.
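
    A minimal sketch of the second step, assuming per-pixel CNN probabilities and a segment-id map as inputs: probabilities are averaged within each segmentation spot and the spot is labelled by thresholding the mean.

    ```python
    # Integrate per-pixel CNN probabilities within segmentation spots.
    import numpy as np

    def label_segments(prob: np.ndarray, segments: np.ndarray,
                       threshold: float = 0.5) -> dict:
        """prob: HxW CNN collapse probabilities; segments: HxW segment ids."""
        labels = {}
        for seg_id in np.unique(segments):
            mean_p = prob[segments == seg_id].mean()   # integrate within the spot
            labels[seg_id] = "collapsed" if mean_p >= threshold else "intact"
        return labels
    ```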

  2. Modeling of In-stream Tidal Energy Development and its Potential Effects in Tacoma Narrows, Washington, USA

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yang, Zhaoqing; Wang, Taiping; Copping, Andrea E.

    Understanding and providing proactive information on the potential for tidal energy projects to cause changes to the physical system and to key water quality constituents in tidal waters is a necessary and cost-effective means to avoid costly regulatory involvement and late-stage surprises in the permitting process. This paper presents a modeling study for evaluating tidal energy extraction and its potential impacts on the marine environment at a real-world site: Tacoma Narrows of Puget Sound, Washington State, USA. An unstructured-grid coastal ocean model, fitted with a module that simulates tidal energy devices, was applied to simulate the tidal energy extracted by different turbine array configurations and the potential effects of the extraction at local and system-wide scales in Tacoma Narrows and South Puget Sound. Model results demonstrated the advantage of an unstructured-grid model for simulating the far-field effects of tidal energy extraction in a large model domain, as well as assessing the near-field effects using a fine grid resolution near the tidal turbines. The outcome shows that a realistic near-term deployment scenario extracts a very small fraction of the total tidal energy in the system and that system-wide environmental effects are not likely; however, near-field effects on the flow field and bed shear stress in the area of the tidal turbine farm are more likely. Model results also indicate that, from a practical standpoint, hydrodynamic or water quality effects are not likely to be the limiting factor for the development of large commercial-scale tidal farms. Results indicate that very high numbers of turbines are required to significantly alter the tidal system; limitations on marine space or other environmental concerns are likely to be reached before these deployment levels. These findings show that important information obtained from numerical modeling can be used to inform regulatory and policy processes for tidal energy development.

  3. Fuzzy Linguistic Knowledge Based Behavior Extraction for Building Energy Management Systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dumidu Wijayasekara; Milos Manic

    2013-08-01

    A significant portion of world energy production is consumed by building heating, ventilation, and air conditioning (HVAC) units. Thus, along with occupant comfort, energy efficiency is also an important factor in HVAC control. Modern buildings use advanced Multiple Input Multiple Output (MIMO) control schemes to realize these goals. However, since the performance of HVAC units depends on many criteria, including uncertainties in weather, number of occupants, and thermal state, the performance of current state-of-the-art systems is sub-optimal. Furthermore, because of the large number of sensors in buildings and the high frequency of data collection, a large amount of information is available. Therefore, important building behavior that compromises energy efficiency or occupant comfort is difficult to identify. This paper presents an easy-to-use and understandable framework for identifying such behavior. The presented framework uses a human-understandable knowledge base to extract important building behavior and present it to users via a graphical user interface. The presented framework was tested on a building in the Pacific Northwest and was shown to be able to identify important behavior relating to energy efficiency and occupant comfort.

  4. A gradient-based approach for automated crest-line detection and analysis of sand dune patterns on planetary surfaces

    NASA Astrophysics Data System (ADS)

    Lancaster, N.; LeBlanc, D.; Bebis, G.; Nicolescu, M.

    2015-12-01

    Dune-field patterns are believed to behave as self-organizing systems, but what causes the patterns to form is still poorly understood. The most obvious (and in many cases the most significant) aspect of a dune system is the pattern of dune crest lines. Extracting meaningful features such as crest length, orientation, spacing, bifurcations, and merging of crests from image data can reveal important information about specific dune-field morphological properties, development, and response to changes in boundary conditions, but manual methods are labor-intensive and time-consuming. We are developing the capability to recognize and characterize patterns of sand dunes on planetary surfaces. Our goal is to develop a robust methodology and the necessary algorithms for automated or semi-automated extraction of dune morphometric information from image data. Our main approach uses image processing methods to extract gradient information from satellite images of dune fields. Typically, the gradients have a dominant magnitude and orientation. In many cases, the images have two dominant gradient orientations, for the sunny and shaded sides of the dunes. A histogram of the gradient orientations is used to determine the dominant orientation, and a threshold is applied to the image to keep pixels whose gradient orientations agree with the dominant orientation. The contours of the resulting binary image can then be used to determine the dune crest lines, based on pixel intensity values. Once the crest lines have been extracted, the morphological properties can be computed. We have tested our approach on a variety of images of linear and crescentic (transverse) dunes and compared the dune detection algorithms with manually digitized dune crest lines, achieving true positive rates of 0.57-0.99 and false positive rates of 0.30-0.67, indicating that our approach is generally robust.
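
    A sketch of the gradient-orientation step under assumed bin counts and tolerances: Sobel gradients are computed, the dominant orientation is read off a magnitude-weighted histogram, and pixels whose orientation lies near that mode are kept as crest-side candidates. The crest-line tracing itself is omitted.

    ```python
    # Dominant gradient orientation mask for crest-side candidate pixels.
    import numpy as np
    from scipy import ndimage

    def dominant_orientation_mask(img: np.ndarray, tol_deg: float = 20.0):
        gx = ndimage.sobel(img.astype(float), axis=1)
        gy = ndimage.sobel(img.astype(float), axis=0)
        theta = np.degrees(np.arctan2(gy, gx))             # -180..180 degrees
        hist, edges = np.histogram(theta, bins=36, weights=np.hypot(gx, gy))
        mode = 0.5 * (edges[np.argmax(hist)] + edges[np.argmax(hist) + 1])
        # Wrap-aware angular distance to the dominant orientation.
        return np.abs(((theta - mode + 180) % 360) - 180) < tol_deg

    img = np.random.default_rng(5).normal(size=(128, 128))
    mask = dominant_orientation_mask(img)
    ```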

  5. Remote Video Monitor of Vehicles in Cooperative Information Platform

    NASA Astrophysics Data System (ADS)

    Qin, Guofeng; Wang, Xiaoguo; Wang, Li; Li, Yang; Li, Qiyan

    Vehicle detection plays an important role in modern intelligent traffic management, and pattern recognition is a hot issue in computer vision. An auto-recognition system in a cooperative information platform is studied. In the cooperative platform, 3G wireless networks, including GPS, GPRS (CDMA), Internet (Intranet), remote video monitoring, and M-DMB networks, are integrated. Remote video information can be taken from the terminals, sent to the cooperative platform, and then detected by the auto-recognition system. The images are pretreated and segmented, with feature extraction, template matching, and pattern recognition. The system identifies different vehicle models and produces vehicular traffic statistics. Finally, the implementation of the system is introduced.

  6. A semi-supervised learning framework for biomedical event extraction based on hidden topics.

    PubMed

    Zhou, Deyu; Zhong, Dayou

    2015-05-01

    Scientists have devoted decades of effort to understanding the interactions between proteins or RNA production. This information might augment current knowledge on drug reactions or the development of certain diseases. Nevertheless, the lack of explicit structure in life-science literature, one of the most important sources of this information, prevents computer-based systems from accessing it. Therefore, biomedical event extraction, automatically acquiring knowledge of molecular events from research articles, has attracted community-wide efforts recently. Most approaches are based on statistical models and require large-scale annotated corpora to precisely estimate the models' parameters, which are usually difficult to obtain in practice. Therefore, employing un-annotated data through semi-supervised learning for biomedical event extraction is a feasible solution that has attracted growing interest. In this paper, a semi-supervised learning framework based on hidden topics for biomedical event extraction is presented. In this framework, sentences in the un-annotated corpus are elaborately and automatically assigned event annotations based on their distances to sentences in the annotated corpus. More specifically, not only the structures of the sentences but also the hidden topics embedded in the sentences are used to describe the distance. The sentences and newly assigned event annotations, together with the annotated corpus, are employed for training. Experiments were conducted on the multi-level event extraction corpus, a gold-standard corpus. Experimental results show that the proposed framework achieves more than a 2.2% improvement in F-score on biomedical event extraction when compared to the state-of-the-art approach. The results suggest that by incorporating un-annotated data, the proposed framework indeed improves the performance of the state-of-the-art event extraction system, and that the similarity between sentences can be precisely described by the hidden topics and structures of the sentences. Copyright © 2015 Elsevier B.V. All rights reserved.

  7. An R-Shiny Based Phenology Analysis System and Case Study Using a Digital Camera Dataset

    NASA Astrophysics Data System (ADS)

    Zhou, Y. K.

    2018-05-01

    Accurately extracting vegetation phenology information plays an important role in exploring the effects of climate change on vegetation. Repeat photography from digital cameras is a large and useful data source for phenological analysis, but processing and mining these phenological data remain a big challenge: there is no single tool or universal solution for big data processing and visualization in the field of phenology extraction. In this paper, we propose an R-Shiny-based web application for vegetation phenological parameter extraction and analysis. Its main functions include phenological site distribution visualization, ROI (Region of Interest) selection, vegetation index calculation and visualization, data filtering, growth trajectory fitting, phenology parameter extraction, etc. The long-term observation photography data from the Freemanwood site in 2013 are processed by this system as an example. The results show that (1) this system is capable of analyzing large data using a distributed framework, and (2) the combination of multiple parameter extraction and growth curve fitting methods can effectively extract the key phenology parameters. Moreover, there are discrepancies between different combination methods in particular study areas: vegetation with a single growth peak is suitable for fitting the growth trajectory with the double logistic model, while vegetation with multiple growth peaks is better fitted with the spline method.
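
    A hedged sketch of double logistic fitting on a synthetic green-chromatic-coordinate series; the parameterization below is a generic one (baseline, amplitude, green-up and senescence dates and rates), not necessarily the system's own.

    ```python
    # Fit a double logistic growth trajectory to a synthetic camera time series.
    import numpy as np
    from scipy.optimize import curve_fit

    def double_logistic(t, base, amp, sos, k1, eos, k2):
        """Baseline plus amplitude shaped by green-up (sos) and senescence (eos)."""
        return base + amp * (1 / (1 + np.exp(-k1 * (t - sos)))
                             - 1 / (1 + np.exp(-k2 * (t - eos))))

    doy = np.arange(1, 366)
    gcc = double_logistic(doy, 0.33, 0.1, 120, 0.1, 280, 0.08)
    gcc += np.random.default_rng(4).normal(0, 0.005, doy.size)   # observation noise

    p0 = [0.3, 0.1, 100, 0.1, 260, 0.1]          # rough initial guess
    params, _ = curve_fit(double_logistic, doy, gcc, p0=p0)
    sos, eos = params[2], params[4]              # start/end-of-season estimates
    ```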

  8. Extracting foreground ensemble features to detect abnormal crowd behavior in intelligent video-surveillance systems

    NASA Astrophysics Data System (ADS)

    Chan, Yi-Tung; Wang, Shuenn-Jyi; Tsai, Chung-Hsien

    2017-09-01

    Public safety is a matter of national security and people's livelihoods. In recent years, intelligent video-surveillance systems have become important active-protection systems. A surveillance system that provides early detection and threat assessment could protect people from crowd-related disasters and ensure public safety. Image processing is commonly used to extract features, e.g., people, from a surveillance video. However, little research has been conducted on the relationship between foreground detection and feature extraction. Most current video-surveillance research has been developed for restricted environments, in which the extracted features are limited by having information from a single foreground; they do not effectively represent the diversity of crowd behavior. This paper presents a general framework based on extracting ensemble features from the foreground of a surveillance video to analyze a crowd. The proposed method can flexibly integrate different foreground-detection technologies to adapt to various monitored environments. Furthermore, the extractable representative features depend on the heterogeneous foreground data. Finally, a classification algorithm is applied to these features to automatically model crowd behavior and distinguish an abnormal event from normal patterns. The experimental results demonstrate that the proposed method's performance is both comparable to that of state-of-the-art methods and satisfies the requirements of real-time applications.

  9. The research of road and vehicle information extraction algorithm based on high resolution remote sensing image

    NASA Astrophysics Data System (ADS)

    Zhou, Tingting; Gu, Lingjia; Ren, Ruizhi; Cao, Qiong

    2016-09-01

    With the rapid development of remote sensing technology, the spatial and temporal resolutions of satellite imagery have increased greatly, and high-spatial-resolution images are becoming increasingly popular for commercial applications. Remote sensing image technology has broad application prospects in intelligent traffic. Compared with traditional traffic information collection methods, vehicle information extraction using high-resolution remote sensing imagery has the advantages of high resolution and wide coverage, which has great guiding significance for urban planning, transportation management, travel route choice, and so on. Firstly, this paper preprocessed the acquired high-resolution multi-spectral and panchromatic remote sensing images. After that, on the one hand, in order to get the optimal thresholding for image segmentation, histogram equalization and linear enhancement technologies were applied to the preprocessing results. On the other hand, considering the distribution characteristics of roads, the normalized difference vegetation index (NDVI) and normalized difference water index (NDWI) were used to suppress water and vegetation information in the preprocessing results. Then, the above two processing results were combined. Finally, geometric characteristics were used to complete the road information extraction. The extracted road vector was used to limit the target vehicle area. Target vehicle extraction was divided into bright vehicle extraction and dark vehicle extraction, and the extraction results for the two kinds of vehicles were combined to get the final results. The experimental results demonstrate that the proposed algorithm has high precision for vehicle information extraction from different high-resolution remote sensing images. Among these results, the average false detection rate was about 5.36%, the average residual rate was about 13.60%, and the average accuracy was approximately 91.26%.
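
    A minimal sketch of the suppression step, with band order and thresholds as assumptions: NDVI and NDWI are computed from the multispectral bands and thresholded to mask out pixels likely to be vegetation or water, leaving road candidates.

    ```python
    # NDVI/NDWI masking of vegetation and water for road candidate extraction.
    import numpy as np

    def road_candidate_mask(green, red, nir, ndvi_max=0.2, ndwi_max=0.0):
        eps = 1e-9                                  # guard against division by zero
        ndvi = (nir - red) / (nir + red + eps)      # high over vegetation
        ndwi = (green - nir) / (green + nir + eps)  # high over water
        return (ndvi < ndvi_max) & (ndwi < ndwi_max)

    rng = np.random.default_rng(6)
    g, r, n = (rng.uniform(0, 1, (100, 100)) for _ in range(3))
    mask = road_candidate_mask(g, r, n)
    ```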

  10. An ensemble method for extracting adverse drug events from social media.

    PubMed

    Liu, Jing; Zhao, Songzheng; Zhang, Xiaodi

    2016-06-01

    Because adverse drug events (ADEs) are a serious health problem and a leading cause of death, it is of vital importance to identify them correctly and in a timely manner. With the development of Web 2.0, social media has become a large data source for information on ADEs. The objective of this study is to develop a relation extraction system that uses natural language processing techniques to effectively distinguish between ADEs and non-ADEs in informal text on social media. We develop a feature-based approach that utilizes various lexical, syntactic, and semantic features. Information-gain-based feature selection is performed to address the high dimensionality of the features. Then, we evaluate the effectiveness of four well-known kernel-based approaches (i.e., subset tree kernel, tree kernel, shortest dependency path kernel, and all-paths graph kernel) and several ensembles that are generated by adopting different combination methods (i.e., majority voting, weighted averaging, and stacked generalization). All of the approaches are tested using three data sets: two health-related discussion forums and one general social networking site (i.e., Twitter). When investigating the contribution of each feature subset, the feature-based approach attains the best area under the receiver operating characteristic curve (AUC) values, which are 78.6%, 72.2%, and 79.2% on the three data sets. When individual methods are used, we attain the best AUC values of 82.1%, 73.2%, and 77.0% using the subset tree kernel, shortest dependency path kernel, and feature-based approach on the three data sets, respectively. When using classifier ensembles, we achieve the best AUC values of 84.5%, 77.3%, and 84.5% on the three data sets, outperforming the baselines. Our experimental results indicate that ADE extraction from social media can benefit from feature selection. With respect to the effectiveness of different feature subsets, lexical features and semantic features can enhance the ADE extraction capability. Kernel-based approaches, which avoid the feature sparsity issue, are well suited to the ADE extraction problem. Combining different individual classifiers using suitable combination methods can further enhance ADE extraction effectiveness. Copyright © 2016 Elsevier B.V. All rights reserved.
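
    An illustrative sketch of one of the combination methods named above (majority voting), with stock scikit-learn classifiers standing in for the paper's kernel-based learners and synthetic data in place of the forum/Twitter corpora.

    ```python
    # Majority-voting ensemble over heterogeneous classifiers (synthetic data).
    from sklearn.datasets import make_classification
    from sklearn.ensemble import VotingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.svm import SVC
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=500, n_features=20, random_state=0)
    ensemble = VotingClassifier(
        estimators=[("lr", LogisticRegression(max_iter=1000)),
                    ("svm", SVC()),
                    ("tree", DecisionTreeClassifier(max_depth=5))],
        voting="hard")   # majority voting; "soft" would average probabilities
    ensemble.fit(X, y)
    print(ensemble.predict(X[:5]))
    ```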

  11. Simultaneous Determination of Oleanolic Acid and Ursolic Acid by in Vivo Microdialysis via UHPLC-MS/MS Using Magnetic Dispersive Solid Phase Extraction Coupling with Microwave-Assisted Derivatization and Its Application to a Pharmacokinetic Study of Arctium lappa L. Root Extract in Rats.

    PubMed

    Zheng, Zhenjia; Zhao, Xian-En; Zhu, Shuyun; Dang, Jun; Qiao, Xuguang; Qiu, Zhichang; Tao, Yanduo

    2018-04-18

    Simultaneous detection of oleanolic acid and ursolic acid in rat blood by in vivo microdialysis can provide important pharmacokinetic information. Microwave-assisted derivatization coupled with magnetic dispersive solid phase extraction was established for the determination of oleanolic acid and ursolic acid by liquid chromatography tandem mass spectrometry. 2'-Carbonyl-piperazine rhodamine B was first designed and synthesized as the derivatization reagent, which was easily adsorbed onto the surface of Fe3O4/graphene oxide. Simultaneous derivatization and extraction of oleanolic acid and ursolic acid were performed on Fe3O4/graphene oxide. The permanent positive charge of the derivatization reagent significantly improved the ionization efficiencies. The limits of detection were 0.025 and 0.020 ng/mL for oleanolic acid and ursolic acid, respectively. The validated method was shown to be promising for sensitive, accurate, and simultaneous determination of oleanolic acid and ursolic acid. It was used for their pharmacokinetic study in rat blood after oral administration of Arctium lappa L. root extract.

  12. A Comparison of Tissue Spray and Lipid Extract Direct Injection Electrospray Ionization Mass Spectrometry for the Differentiation of Eutopic and Ectopic Endometrial Tissues

    NASA Astrophysics Data System (ADS)

    Chagovets, Vitaliy; Wang, Zhihao; Kononikhin, Alexey; Starodubtseva, Natalia; Borisova, Anna; Salimova, Dinara; Popov, Igor; Kozachenko, Andrey; Chingin, Konstantin; Chen, Huanwen; Frankevich, Vladimir; Adamyan, Leila; Sukhikh, Gennady

    2018-02-01

    Recent research revealed that tissue spray mass spectrometry enables rapid molecular profiling of biological tissues, which is of great importance for the search for disease biomarkers as well as for online surgery control. However, the payback for the high speed of tissue spray analysis is the generally lower chemical sensitivity compared with the traditional approach based on offline chemical extraction and electrospray ionization mass spectrometry detection. In this study, high resolution mass spectrometry analysis of endometrial tissues of different localizations obtained using direct tissue spray mass spectrometry in positive ion mode is compared with the results of electrospray ionization analysis of lipid extracts. The identified features in both cases belong to three lipid classes: phosphatidylcholines, phosphoethanolamines, and sphingomyelins. Lipid coverage is validated by hydrophilic interaction liquid chromatography with mass spectrometry of lipid extracts. Multivariate analysis of data from both methods reveals satisfactory differentiation of eutopic and ectopic endometrial tissues. Overall, our results indicate that the chemical information provided by tissue spray ionization is sufficient to allow differentiation of endometrial tissues by localization with similar reliability but higher speed than the traditional approach relying on offline extraction.

  13. Neutron Polarization Analysis for Biphasic Solvent Extraction Systems

    DOE PAGES

    Motokawa, Ryuhei; Endo, Hitoshi; Nagao, Michihiro; ...

    2016-06-16

    Here we performed neutron polarization analysis (NPA) of extracted organic phases containing complexes, comprised of Zr(NO3)4 and tri-n-butyl phosphate, which enabled decomposition of the intensity distribution of small-angle neutron scattering (SANS) into its coherent and incoherent scattering components. The coherent scattering intensity, containing structural information, and the incoherent scattering compete over a wide range of magnitudes of the scattering vector, q, specifically when q is larger than q* ≈ 1/Rg, where Rg is the radius of gyration of the scatterer. Therefore, it is important to determine the incoherent scattering intensity exactly to perform an accurate structural analysis from SANS data when Rg is small, as for the aforementioned extracted coordination species. Although NPA is the best method for evaluating the incoherent scattering component for accurately determining the coherent scattering in SANS, this method is not used frequently in SANS data analysis because it is technically challenging. In this study, we successfully demonstrated that experimental determination of the incoherent scattering using NPA is suitable for sample systems containing a small scatterer with a weak coherent scattering intensity, such as extracted complexes in biphasic solvent extraction systems.

  14. Applying different independent component analysis algorithms and support vector regression for IT chain store sales forecasting.

    PubMed

    Dai, Wensheng; Wu, Jui-Yu; Lu, Chi-Jie

    2014-01-01

    Sales forecasting is one of the most important issues in managing information technology (IT) chain store sales, since an IT chain store has many branches. Integrating a feature extraction method with a prediction tool, such as support vector regression (SVR), is a useful approach for constructing an effective sales forecasting scheme. Independent component analysis (ICA) is a novel feature extraction technique that has been widely applied to various forecasting problems, but, up to now, only the basic ICA method (i.e., the temporal ICA model) has been applied to the sales forecasting problem. In this paper, we utilize three different ICA methods, spatial ICA (sICA), temporal ICA (tICA), and spatiotemporal ICA (stICA), to extract features from the sales data and compare their performance in sales forecasting for an IT chain store. Experimental results from real sales data show that the sales forecasting scheme integrating stICA and SVR outperforms the comparison models in terms of forecasting error. The stICA is a promising tool for extracting effective features from branch sales data, and the extracted features can improve the prediction performance of SVR for sales forecasting.
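
    A hedged sketch of the ICA-then-SVR pipeline on synthetic multi-branch sales data; plain FastICA stands in here for the spatial, temporal, and spatiotemporal variants compared in the paper, and all variable names are illustrative.

    ```python
    # ICA feature extraction followed by SVR forecasting (synthetic data).
    import numpy as np
    from sklearn.decomposition import FastICA
    from sklearn.svm import SVR

    rng = np.random.default_rng(5)
    sales = rng.normal(size=(300, 8))             # 300 weeks x 8 branches (toy)
    target = sales[:, 0] + rng.normal(0, 0.1, 300)

    features = FastICA(n_components=4, random_state=0).fit_transform(sales)
    model = SVR(kernel="rbf").fit(features[:-30], target[:-30])
    forecast = model.predict(features[-30:])      # hold out the last 30 weeks
    ```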

  15. Applying Different Independent Component Analysis Algorithms and Support Vector Regression for IT Chain Store Sales Forecasting

    PubMed Central

    Dai, Wensheng

    2014-01-01

    Sales forecasting is one of the most important issues in managing information technology (IT) chain store sales, since an IT chain store has many branches. Integrating a feature extraction method with a prediction tool, such as support vector regression (SVR), is a useful approach for constructing an effective sales forecasting scheme. Independent component analysis (ICA) is a novel feature extraction technique that has been widely applied to various forecasting problems, but, up to now, only the basic ICA method (i.e., the temporal ICA model) has been applied to the sales forecasting problem. In this paper, we utilize three different ICA methods, spatial ICA (sICA), temporal ICA (tICA), and spatiotemporal ICA (stICA), to extract features from the sales data and compare their performance in sales forecasting for an IT chain store. Experimental results from real sales data show that the sales forecasting scheme integrating stICA and SVR outperforms the comparison models in terms of forecasting error. The stICA is a promising tool for extracting effective features from branch sales data, and the extracted features can improve the prediction performance of SVR for sales forecasting. PMID:25165740

  16. Research on Crowdsourcing Emergency Information Extraction Based on Event Frames

    NASA Astrophysics Data System (ADS)

    Yang, Bo; Wang, Jizhou; Ma, Weijun; Mao, Xi

    2018-01-01

    At present, common information extraction methods cannot accurately extract structured emergency event information, general information retrieval tools cannot completely identify emergency geographic information, and neither provides an accurate assessment of the extracted results. This paper therefore proposes an emergency information collection technology based on an event framework. The technique addresses the problem of emergency information extraction and mainly includes an emergency information extraction model (EIEM), a complete address recognition method (CARM), and an accuracy evaluation model of emergency information (AEMEI). EIEM extracts emergency information in a structured form and compensates for the lack of network data acquisition in emergency mapping. CARM uses a hierarchical model and the shortest path algorithm to join toponym pieces into a full address. AEMEI analyzes the results for an emergency event and summarizes the advantages and disadvantages of the event framework. Experiments show that the event-frame technology can solve the problem of emergency information extraction and provides reference cases for other applications. When an emergency disaster is about to occur, the relevant departments can query data on emergencies that occurred in the past and make arrangements in advance for defense and disaster reduction. The technology reduces casualties and property damage, which is of great significance to the state and society.

  17. Extractables characterization for five materials of construction representative of packaging systems used for parenteral and ophthalmic drug products.

    PubMed

    Jenke, Dennis; Castner, James; Egert, Thomas; Feinberg, Tom; Hendricker, Alan; Houston, Christopher; Hunt, Desmond G; Lynch, Michael; Shaw, Arthur; Nicholas, Kumudini; Norwood, Daniel L; Paskiet, Diane; Ruberto, Michael; Smith, Edward J; Holcomb, Frank

    2013-01-01

    Polymeric and elastomeric materials are commonly encountered in medical devices and packaging systems used to manufacture, store, deliver, and/or administer drug products. Characterizing extractables from such materials is a necessary step in establishing their suitability for use in these applications. In this study, five individual materials representative of polymers and elastomers commonly used in packaging systems and devices were extracted under conditions and with solvents that are relevant to parenteral and ophthalmic drug products (PODPs). Extraction methods included elevated temperature sealed vessel extraction, sonication, refluxing, and Soxhlet extraction. Extraction solvents included a low-pH (pH = 2.5) salt mixture, a high-pH (pH = 9.5) phosphate buffer, a 1/1 isopropanol/water mixture, isopropanol, and hexane. The resulting extracts were chemically characterized via spectroscopic and chromatographic means to establish the metal/trace element and organic extractables profiles. Additionally, the test articles themselves were tested for volatile organic substances. The results of this testing established the extractables profiles of the test articles, which are reported herein. Trends in the extractables, and their estimated concentrations, as a function of the extraction and testing methodologies are considered in the context of the use of the test article in medical applications and with respect to establishing best demonstrated practices for extractables profiling of materials used in PODP-related packaging systems and devices. Plastic and rubber materials are commonly encountered in medical devices and packaging/delivery systems for drug products. Characterizing the extractables from these materials is an important part of determining that they are suitable for use. In this study, five materials representative of plastics and rubbers used in packaging and medical devices were extracted by several means, and the extracts were analytically characterized to establish each material's profile of extracted organic compounds and trace element/metals. This information was utilized to make generalizations about the appropriateness of the test methods and the appropriate use of the test materials.

  18. Secure alignment of coordinate systems using quantum correlation

    NASA Astrophysics Data System (ADS)

    Rezazadeh, F.; Mani, A.; Karimipour, V.

    2017-08-01

    We show that two parties far apart can use shared entangled states and classical communication to align their coordinate systems with a very high fidelity. Moreover, compared with previous methods proposed for such a task, i.e., sending parallel or antiparallel pairs or groups of spin states, our method has the extra advantages of using single-qubit measurements and also being secure, so that third parties do not extract any information about the aligned coordinate system established between the two parties. The latter property is important in many other quantum information protocols in which measurements inevitably play a significant role.

  19. Visual cues in low-level flight - Implications for pilotage, training, simulation, and enhanced/synthetic vision systems

    NASA Technical Reports Server (NTRS)

    Foyle, David C.; Kaiser, Mary K.; Johnson, Walter W.

    1992-01-01

    This paper reviews some of the sources of visual information that are available in the out-the-window scene and describes how these visual cues are important for routine pilotage and training, as well as the development of simulator visual systems and enhanced or synthetic vision systems for aircraft cockpits. It is shown how these visual cues may change or disappear under environmental or sensor conditions, and how the visual scene can be augmented by advanced displays to capitalize on the pilot's excellent ability to extract visual information from the visual scene.

  20. Integrating machine learning techniques and high-resolution imagery to generate GIS-ready information for urban water consumption studies

    NASA Astrophysics Data System (ADS)

    Wolf, Nils; Hof, Angela

    2012-10-01

    Urban sprawl driven by shifts in tourism development produces new suburban landscapes of water consumption on Mediterranean coasts. Golf courses, ornamental 'Atlantic' gardens, and swimming pools are the most striking artefacts of this transformation, threatening local water supply systems and exacerbating water scarcity. In the face of climate change, urban landscape irrigation is becoming increasingly important from a resource management point of view. This paper adopts urban remote sensing for a targeted mapping approach using machine learning techniques and high-resolution satellite imagery (WorldView-2) to generate GIS-ready information for urban water consumption studies. Swimming pools, vegetation and, as a subgroup of vegetation, turf grass are extracted as important determinants of water consumption. For image analysis, the complex nature of urban environments suggests spatial-spectral classification, i.e. the complementary use of the spectral signature and spatial descriptors. Multiscale image segmentation provides the means to extract the spatial descriptors, namely object feature layers, which can be concatenated at pixel level to the spectral signature. This study assesses the value of object features using different machine learning techniques and amounts of labeled information for learning. The results indicate the benefit of the spatial-spectral approach if combined with appropriate classifiers like tree-based ensembles or support vector machines, which can handle high dimensionality. Finally, a Random Forest classifier was chosen to deliver the classified input data for the estimation of evaporative water loss and net landscape irrigation requirements.
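
    A minimal sketch of the spatial-spectral classification step, assuming synthetic arrays in place of the WorldView-2 bands and segmentation-derived object feature layers:

    ```python
    # Per-pixel spectral signatures are concatenated with object feature
    # layers (here a stand-in for segment statistics) and fed to a Random
    # Forest, as in the paper's final configuration. Labels are toy classes.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    rng = np.random.default_rng(1)
    n_pixels, n_bands = 5000, 8                    # WorldView-2 has 8 bands
    spectral = rng.random((n_pixels, n_bands))
    object_features = rng.random((n_pixels, 4))    # e.g. segment mean/std layers

    X = np.hstack([spectral, object_features])     # spatial-spectral feature vector
    y = rng.integers(0, 3, n_pixels)               # pool / vegetation / other (toy)

    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    clf.fit(X[:4000], y[:4000])
    print("toy accuracy:", clf.score(X[4000:], y[4000:]))
    ```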

  1. Characterization of Melanogenesis Inhibitory Constituents of Morus alba Leaves and Optimization of Extraction Conditions Using Response Surface Methodology.

    PubMed

    Jeong, Ji Yeon; Liu, Qing; Kim, Seon Beom; Jo, Yang Hee; Mo, Eun Jin; Yang, Hyo Hee; Song, Dae Hye; Hwang, Bang Yeon; Lee, Mi Kyeong

    2015-05-14

    Melanin is a natural pigment that plays an important role in the protection of skin; however, hyperpigmentation caused by excessive levels of melanin is associated with several problems. Therefore, melanogenesis-inhibiting natural products have been developed by the cosmetic industry as skin medications. The leaves of Morus alba (Moraceae) have been reported to inhibit melanogenesis; therefore, characterization of the melanogenesis inhibitory constituents of M. alba leaves was attempted in this study. Twenty compounds, including eight benzofurans, ten flavonoids, one stilbenoid and one chalcone, were isolated from M. alba leaves, and these phenolic constituents were shown to significantly inhibit tyrosinase activity and melanin content in B16F10 melanoma cells. To maximize the melanogenesis inhibitory activity and active phenolic contents, optimized M. alba leaf extraction conditions were predicted using response surface methodology as a methanol concentration of 85.2%, an extraction temperature of 53.2 °C, and an extraction time of 2 h. The tyrosinase inhibition and total phenolic content under optimal conditions were found to be 74.8% inhibition and 24.8 μg GAE/mg extract, which matched well with the predicted values of 75.0% inhibition and 23.8 μg GAE/mg extract. These results provide useful information about melanogenesis inhibitory constituents and optimized extracts from M. alba leaves as cosmetic therapeutics to reduce skin hyperpigmentation.
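
    The response-surface step can be illustrated with a toy quadratic model; everything below is synthetic, with a maximum planted near the reported optimum (85% methanol, 53 °C, 2 h), not the authors' data or code:

    ```python
    # Fit a second-order polynomial to (methanol %, temperature, time)
    # observations and locate the predicted optimum, as RSM does.
    import numpy as np
    from sklearn.preprocessing import PolynomialFeatures
    from sklearn.linear_model import LinearRegression
    from scipy.optimize import minimize

    rng = np.random.default_rng(2)
    X = rng.uniform([50, 30, 0.5], [100, 70, 4], size=(60, 3))  # MeOH%, degC, h
    y = (75 - 0.01 * (X[:, 0] - 85) ** 2 - 0.02 * (X[:, 1] - 53) ** 2
         - 2 * (X[:, 2] - 2) ** 2 + rng.normal(0, 0.5, 60))      # toy % inhibition

    poly = PolynomialFeatures(degree=2)
    model = LinearRegression().fit(poly.fit_transform(X), y)

    res = minimize(lambda x: -model.predict(poly.transform([x]))[0],
                   x0=[75, 50, 2], bounds=[(50, 100), (30, 70), (0.5, 4)])
    print("predicted optimum (MeOH%, degC, h):", res.x.round(1))
    ```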

  2. An Information Extraction Framework for Cohort Identification Using Electronic Health Records

    PubMed Central

    Liu, Hongfang; Bielinski, Suzette J.; Sohn, Sunghwan; Murphy, Sean; Wagholikar, Kavishwar B.; Jonnalagadda, Siddhartha R.; Ravikumar, K.E.; Wu, Stephen T.; Kullo, Iftikhar J.; Chute, Christopher G

    Information extraction (IE), a natural language processing (NLP) task that automatically extracts structured or semi-structured information from free text, has become popular in the clinical domain for supporting automated systems at point-of-care and enabling secondary use of electronic health records (EHRs) for clinical and translational research. However, a high performance IE system can be very challenging to construct due to the complexity and dynamic nature of human language. In this paper, we report an IE framework for cohort identification using EHRs that is a knowledge-driven framework developed under the Unstructured Information Management Architecture (UIMA). A system to extract specific information can be developed by subject matter experts through expert knowledge engineering of the externalized knowledge resources used in the framework. PMID:24303255

  3. Automatic 3D Extraction of Buildings, Vegetation and Roads from LIDAR Data

    NASA Astrophysics Data System (ADS)

    Bellakaout, A.; Cherkaoui, M.; Ettarid, M.; Touzani, A.

    2016-06-01

    Aerial topographic surveys using Light Detection and Ranging (LiDAR) technology collect dense and accurate information from the surface or terrain, and LiDAR is becoming one of the most important tools in the geosciences for studying objects and the earth's surface. Classification of LiDAR data for extracting ground, vegetation, and buildings is a very important step in numerous applications such as 3D city modelling, extraction of derived data for geographical information systems (GIS), mapping, and navigation. Regardless of what the scan data will be used for, an automatic process is required to handle the large amount of data collected, because manual processing is time-consuming and very expensive. This paper presents an approach for the automatic classification of aerial LiDAR data into five classes of items: buildings, trees, roads, linear objects, and soil, using single-return LiDAR and processing the point cloud without generating a DEM. Topological relationships and height variation analysis are used to preliminarily segment the entire point cloud into upper and lower contours, uniform surfaces, non-uniform surfaces, linear objects, and others. This primary classification is used, on the one hand, to identify the upper and lower parts of each building in an urban scene, needed to model building façades, and, on the other hand, to extract the point cloud of uniform surfaces, which contains roofs, roads, and ground, used in the second phase of classification. A second algorithm, also based on topological relationships and height variation analysis, segments the uniform surfaces into building roofs, roads, and ground. The proposed approach has been tested on two areas: the first a housing complex and the second a primary school. It led to successful classification of the building, vegetation, and road classes.

  4. Table Extraction from Web Pages Using Conditional Random Fields to Extract Toponym Related Data

    NASA Astrophysics Data System (ADS)

    Luthfi Hanifah, Hayyu'; Akbar, Saiful

    2017-01-01

    Table is one of the ways to visualize information on web pages. The abundant number of web pages that compose the World Wide Web has motivated research on information extraction and information retrieval, including research on table extraction. Besides, there is a need for a system designed specifically to handle location-related information. Based on this background, this research is conducted to provide a way to extract location-related data from web tables so that it can be used in the development of a Geographic Information Retrieval (GIR) system. The location-related data is identified by the toponym (location name). In this research, a rule-based approach with a gazetteer is used to recognize toponyms in web tables. Meanwhile, to extract data from a table, a combination of rule-based and statistical approaches is used. In the statistical approach, a Conditional Random Fields (CRF) model is used to understand the schema of the table. The result of table extraction is presented in JSON format; if a web table contains a toponym, a field is added to the JSON document to store the toponym values. This field can be used to index the table data according to the toponym, which can then be used in the development of the GIR system.
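
    The paper specifies JSON output with an added toponym field but no exact schema; a hypothetical shape of one extracted table might look like this (field names and values are invented):

    ```python
    # Illustrative-only JSON document for one extracted web table.
    import json

    record = {
        "table_schema": ["City", "Population", "Elevation (m)"],
        "rows": [["Bandung", "2500000", "768"], ["Bogor", "1030000", "265"]],
        "toponym": ["Bandung", "Bogor"],   # added only when toponyms are found
    }
    print(json.dumps(record, indent=2))
    ```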

  5. An Overview of the Biological Effects of Some Mediterranean Essential Oils on Human Health

    PubMed Central

    2017-01-01

    Essential oils (EOs), extracted from aromatic plants, are interesting natural products and represent an important part of the traditional pharmacopeia. The use of some EOs as alternative antimicrobial and pharmaceutical agents has attracted considerable interest recently. Most EOs and their single constituents have been reported to inhibit several phytopathogens, human pathogens, and insects, and to have effective uses in the food and pharmaceutical industries. This review discusses the chemical composition and bioactivity of some important EOs extracted from Mediterranean plants and their principal bioactive single constituents. Information is furnished on the mechanisms, modes of action, and factors affecting the bioactivity of some single constituents from different Mediterranean plant EOs. The review gives an insight into some common plant EOs belonging to the Lamiaceae, Apiaceae, Rutaceae, and Verbenaceae families commonly growing in the Mediterranean region. Further information is provided about the medical uses of some EOs for several human diseases, covering their pharmacological effects (anti-inflammatory, antioxidant, and anticarcinogenic). The antimicrobial effects are also considered. Although plant EOs are considered promising natural alternatives to many chemical drugs, they still need more specific research for wide application, especially in the food and pharmaceutical industries. PMID:29230418

  6. Modelling Single Tree Structure with Terrestrial Laser Scanner

    NASA Astrophysics Data System (ADS)

    Yurtseven, H.; Akgül, M.; Gülci, S.

    2017-11-01

    Recent technological developments such as remote sensing tools, which offer reliable accuracy and quality for all engineering works, are widely used in forestry applications. Over the last decade, the sustainable use and management of forest resources have been favorite topics, so the precision of the obtained data plays an important role in evaluating the current status of forest resources. The use of aerial and terrestrial laser technology provides more reliable and effective models to advance appropriate natural resource management. This study investigates the use of terrestrial laser scanner (TLS) technology in forestry and explains the methodological data processing stages for tree volume extraction. A Z+F Imager 5010C TLS system was used to measure single-tree information such as tree height, diameter at breast height, branch volume, and canopy closure. In this context, more detailed and accurate data can be obtained with TLS systems than with conventional inventory sampling in forestry. However, the accuracy of the obtained data depends on the experience of the TLS operator in the field. The number of scan stations and their positions are other important factors for reducing noise effects and achieving accurate 3D modelling. The results indicated that using point cloud data to extract tree information for forestry applications is a promising methodology for precision forestry.

  7. [A customized method for information extraction from unstructured text data in the electronic medical records].

    PubMed

    Bao, X Y; Huang, W J; Zhang, K; Jin, M; Li, Y; Niu, C Z

    2018-04-18

    There is a huge amount of diagnostic and treatment information in electronic medical records (EMRs), which is a concrete manifestation of clinicians' actual diagnosis and treatment details. Many episodes in EMRs, such as complaints, present illness, past history, differential diagnosis, diagnostic imaging, and surgical records, reflecting details of diagnosis and treatment in the clinical process, are written as Chinese natural-language narrative. How to extract effective information from these Chinese narrative text data and organize it in tabular form for medical research analysis, so that clinical data can be practically utilized in the real world, is a difficult problem in Chinese medical data processing. Based on the EMR narrative text data of a tertiary hospital in China, a customized method for learning information extraction rules and performing rule-based information extraction is proposed. The overall method consists of three steps. (1) Step 1: a random sample of 600 records (including history of present illness, past history, personal history, family history, etc.) was extracted from the EMR data as the raw corpus. Using our Chinese clinical narrative text annotation platform, trained clinicians and nurses marked the tokens and phrases in the corpus to be extracted (with history of diabetes as an example). (2) Step 2: based on the annotated clinical text corpus, extraction templates were first summarized and induced. These templates were then rewritten as extraction rules using regular expressions in the Perl programming language. Using these extraction rules as the basic knowledge base, we developed Perl extraction packages for extracting data from the EMR text. Finally, the extracted data items were organized in tabular format for later use in clinical research or hospital surveillance. (3) In the final step, the proposed method was evaluated and validated on the National Clinical Service Data Integration Platform, and the extraction results were checked by combining manual and automated verification, proving the effectiveness of the method. For all patients diagnosed with diabetes in the Department of Endocrinology of the hospital, altogether 1,436 patients were discharged in 2015, and the extraction of diabetes history from their medical records achieved a recall of 87.6%, a precision of 99.5%, and an F-score of 0.93. For the 10% sample of patients (1,223 in total) with diabetes discharged as of August 2017 in the same department, the diabetes history extraction achieved a recall of 89.2%, a precision of 99.2%, and an F-score of 0.94. This study adopts a combination of natural language processing and rule-based information extraction, and designs and implements an algorithm for extracting customized information from unstructured Chinese EMR text data. It achieves better results than existing work.
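
    The paper implements its rules as Perl regular expressions; the sketch below re-creates the idea in Python with hypothetical patterns for a "history of diabetes" item. The real rules were induced from annotated corpora and are far more extensive:

    ```python
    # Rule-based extraction over Chinese narrative text: each rule is a
    # compiled regex paired with the data item it fills. Patterns invented.
    import re

    RULES = [
        (re.compile(r"(糖尿病)(病?史)?(\d+)\s*年"), "diabetes_history_years"),
        (re.compile(r"否认.*糖尿病"), "diabetes_denied"),
    ]

    def extract(text):
        hits = []
        for pattern, label in RULES:
            for m in pattern.finditer(text):
                hits.append((label, m.group(0)))
        return hits

    print(extract("患者糖尿病史10年，否认高血压。"))
    # -> [('diabetes_history_years', '糖尿病史10年')]
    ```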

  8. Extracting semantically enriched events from biomedical literature

    PubMed Central

    2012-01-01

    Background Research into event-based text mining from the biomedical literature has been growing in popularity to facilitate the development of advanced biomedical text mining systems. Such technology permits advanced search, which goes beyond document or sentence-based retrieval. However, existing event-based systems typically ignore additional information within the textual context of events that can determine, amongst other things, whether an event represents a fact, hypothesis, experimental result or analysis of results, whether it describes new or previously reported knowledge, and whether it is speculated or negated. We refer to such contextual information as meta-knowledge. The automatic recognition of such information can permit the training of systems allowing finer-grained searching of events according to the meta-knowledge that is associated with them. Results Based on a corpus of 1,000 MEDLINE abstracts, fully manually annotated with both events and associated meta-knowledge, we have constructed a machine learning-based system that automatically assigns meta-knowledge information to events. This system has been integrated into EventMine, a state-of-the-art event extraction system, in order to create a more advanced system (EventMine-MK) that not only extracts events from text automatically, but also assigns five different types of meta-knowledge to these events. The meta-knowledge assignment module of EventMine-MK performs with macro-averaged F-scores in the range of 57-87% on the BioNLP’09 Shared Task corpus. EventMine-MK has been evaluated on the BioNLP’09 Shared Task subtask of detecting negated and speculated events. Our results show that EventMine-MK can outperform other state-of-the-art systems that participated in this task. Conclusions We have constructed the first practical system that extracts both events and associated, detailed meta-knowledge information from biomedical literature. The automatically assigned meta-knowledge information can be used to refine search systems, in order to provide an extra search layer beyond entities and assertions, dealing with phenomena such as rhetorical intent, speculations, contradictions and negations. This finer grained search functionality can assist in several important tasks, e.g., database curation (by locating new experimental knowledge) and pathway enrichment (by providing information for inference). To allow easy integration into text mining systems, EventMine-MK is provided as a UIMA component that can be used in the interoperable text mining infrastructure, U-Compare. PMID:22621266

  9. Extracting semantically enriched events from biomedical literature.

    PubMed

    Miwa, Makoto; Thompson, Paul; McNaught, John; Kell, Douglas B; Ananiadou, Sophia

    2012-05-23

    Research into event-based text mining from the biomedical literature has been growing in popularity to facilitate the development of advanced biomedical text mining systems. Such technology permits advanced search, which goes beyond document or sentence-based retrieval. However, existing event-based systems typically ignore additional information within the textual context of events that can determine, amongst other things, whether an event represents a fact, hypothesis, experimental result or analysis of results, whether it describes new or previously reported knowledge, and whether it is speculated or negated. We refer to such contextual information as meta-knowledge. The automatic recognition of such information can permit the training of systems allowing finer-grained searching of events according to the meta-knowledge that is associated with them. Based on a corpus of 1,000 MEDLINE abstracts, fully manually annotated with both events and associated meta-knowledge, we have constructed a machine learning-based system that automatically assigns meta-knowledge information to events. This system has been integrated into EventMine, a state-of-the-art event extraction system, in order to create a more advanced system (EventMine-MK) that not only extracts events from text automatically, but also assigns five different types of meta-knowledge to these events. The meta-knowledge assignment module of EventMine-MK performs with macro-averaged F-scores in the range of 57-87% on the BioNLP'09 Shared Task corpus. EventMine-MK has been evaluated on the BioNLP'09 Shared Task subtask of detecting negated and speculated events. Our results show that EventMine-MK can outperform other state-of-the-art systems that participated in this task. We have constructed the first practical system that extracts both events and associated, detailed meta-knowledge information from biomedical literature. The automatically assigned meta-knowledge information can be used to refine search systems, in order to provide an extra search layer beyond entities and assertions, dealing with phenomena such as rhetorical intent, speculations, contradictions and negations. This finer grained search functionality can assist in several important tasks, e.g., database curation (by locating new experimental knowledge) and pathway enrichment (by providing information for inference). To allow easy integration into text mining systems, EventMine-MK is provided as a UIMA component that can be used in the interoperable text mining infrastructure, U-Compare.

  10. [The application of spectral geological profile in the alteration mapping].

    PubMed

    Li, Qing-Ting; Lin, Qi-Zhong; Zhang, Bing; Lu, Lin-Lin

    2012-07-01

    Geological sections can help validate and understand the alteration information extracted from remote sensing images. In this paper, the concept of the spectral geological profile is introduced, based on the principle of the geological section and methods of spectral information extraction. A spectral profile stores and visualizes spectra along a geological profile, but the spectral geological profile includes more information than the spectral profile alone. Its main purpose is to obtain the distribution of alteration types and mineral content along the profile, extracted from spectra measured by a field spectrometer, especially the spatial distribution and mode of alteration associations. The technical method and workflow of alteration information extraction were studied for the spectral geological profile. The spectral geological profile was constructed using ground reflectance spectra, and alteration information was extracted from the remote sensing image with the help of typical spectral geological profiles. Finally, the meaning and effect of the spectral geological profile were discussed.

  11. Recent progress in automatically extracting information from the pharmacogenomic literature

    PubMed Central

    Garten, Yael; Coulet, Adrien; Altman, Russ B

    2011-01-01

    The biomedical literature holds our understanding of pharmacogenomics, but it is dispersed across many journals. In order to integrate our knowledge, connect important facts across publications and generate new hypotheses we must organize and encode the contents of the literature. By creating databases of structured pharmacogenomic knowledge, we can make the value of the literature much greater than the sum of the individual reports. We can, for example, generate candidate gene lists or interpret surprising hits in genome-wide association studies. Text mining automatically adds structure to the unstructured knowledge embedded in millions of publications, and recent years have seen a surge in work on biomedical text mining, some specific to the pharmacogenomics literature. These methods enable extraction of specific types of information and can also provide answers to general, systemic queries. In this article, we describe the main tasks of text mining in the context of pharmacogenomics, summarize recent applications and anticipate the next phase of text mining applications. PMID:21047206

  12. Aircraft Segmentation in SAR Images Based on Improved Active Shape Model

    NASA Astrophysics Data System (ADS)

    Zhang, X.; Xiong, B.; Kuang, G.

    2018-04-01

    In SAR image interpretation, aircraft are important targets that attract much attention. However, it is far from easy to segment an aircraft from the background completely and precisely in SAR images. Because of their complex structure, different kinds of electromagnetic scattering take place on aircraft surfaces. As a result, aircraft targets usually appear inhomogeneous and disconnected. The active shape model (ASM) is well suited to extracting an aircraft target, since the geometric information it encodes controls variations of the shape during contour evolution. However, the linear dimensionality reduction used in the classic ASM makes the model rigid, which causes much trouble when segmenting different types of aircraft. Aiming at this problem, an improved ASM based on ISOMAP is proposed in this paper. The ISOMAP algorithm is used to extract the shape information of the training set and make the model flexible enough to deal with different aircraft. Experiments based on real SAR data show that the proposed method achieves an obvious improvement in accuracy.
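
    A sketch of the nonlinear shape-space idea, assuming synthetic contours: each training shape's landmark coordinates are flattened to a vector and embedded with ISOMAP, in place of the linear reduction of the classic ASM:

    ```python
    # Embed flattened landmark vectors with ISOMAP; the low-dimensional
    # coordinates act as shape parameters. Shapes are toy ellipses.
    import numpy as np
    from sklearn.manifold import Isomap

    rng = np.random.default_rng(3)
    n_shapes, n_landmarks = 40, 32
    t = np.linspace(0, 2 * np.pi, n_landmarks, endpoint=False)
    shapes = np.stack([np.column_stack([np.cos(t), a * np.sin(t)]).ravel()
                       + rng.normal(0, 0.02, 2 * n_landmarks)
                       for a in rng.uniform(0.3, 1.0, n_shapes)])

    embedding = Isomap(n_neighbors=6, n_components=2).fit_transform(shapes)
    print(embedding.shape)   # (40, 2) low-dimensional shape parameters
    ```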

  13. Human fatigue expression recognition through image-based dynamic multi-information and bimodal deep learning

    NASA Astrophysics Data System (ADS)

    Zhao, Lei; Wang, Zengcai; Wang, Xiaojin; Qi, Yazhou; Liu, Qing; Zhang, Guoxin

    2016-09-01

    Human fatigue is an important cause of traffic accidents. To improve the safety of transportation, we propose, in this paper, a framework for fatigue expression recognition using image-based facial dynamic multi-information and a bimodal deep neural network. First, the landmarks of the face region and the texture of the eye region, which complement each other in fatigue expression recognition, are extracted from facial image sequences captured by a single camera. Then, two stacked autoencoder neural networks are trained for landmarks and texture, respectively. Finally, the two trained neural networks are combined by learning a joint layer on top of them to construct a bimodal deep neural network. The model can be used to extract a unified representation that fuses the landmark and texture modalities and to classify fatigue expressions accurately. The proposed system is tested on a human fatigue dataset obtained from an actual driving environment. The experimental results demonstrate that the proposed method performs stably and robustly, and that the average accuracy reaches 96.2%.
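
    A structural sketch of the bimodal fusion, assuming invented feature dimensions: one encoder per modality, a joint layer fused on top, and a binary fatigue output. In the paper each branch is first pretrained as a stacked autoencoder; that pretraining is omitted here:

    ```python
    # Two modality encoders joined by a learned fusion layer (PyTorch).
    import torch
    import torch.nn as nn

    class BimodalNet(nn.Module):
        def __init__(self, landmark_dim=136, texture_dim=256, hidden=64):
            super().__init__()
            self.landmark_enc = nn.Sequential(nn.Linear(landmark_dim, hidden), nn.ReLU())
            self.texture_enc = nn.Sequential(nn.Linear(texture_dim, hidden), nn.ReLU())
            self.joint = nn.Sequential(nn.Linear(2 * hidden, hidden), nn.ReLU())
            self.classifier = nn.Linear(hidden, 2)   # fatigue vs. normal

        def forward(self, landmarks, texture):
            fused = torch.cat([self.landmark_enc(landmarks),
                               self.texture_enc(texture)], dim=1)
            return self.classifier(self.joint(fused))

    net = BimodalNet()
    logits = net(torch.randn(4, 136), torch.randn(4, 256))
    print(logits.shape)   # torch.Size([4, 2])
    ```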

  14. Classification of hepatocellular carcinoma stages from free-text clinical and radiology reports

    PubMed Central

    Yim, Wen-wai; Kwan, Sharon W; Johnson, Guy; Yetisgen, Meliha

    2017-01-01

    Cancer stage information is important for clinical research, but it is not always explicitly noted in electronic medical records. In this paper, we present our work on automatic classification of hepatocellular carcinoma (HCC) stages from free-text clinical and radiology notes. To accomplish this, we defined 11 stage parameters used in the three HCC staging systems: American Joint Committee on Cancer (AJCC), Barcelona Clinic Liver Cancer (BCLC), and Cancer of the Liver Italian Program (CLIP). After aggregating stage parameters to the patient level, the final stage classifications were produced using an expert-created decision logic. Each stage parameter relevant for staging was extracted using several classification methods, e.g. sentence classification and automatic information structuring, to identify and normalize text as cancer stage parameter values. Stage parameter extraction for the test set performed at 0.81 F1. Cancer stage prediction for the AJCC, BCLC, and CLIP stage classifications reached 0.55, 0.50, and 0.43 F1, respectively.

  15. MULTISCALE TENSOR ANISOTROPIC FILTERING OF FLUORESCENCE MICROSCOPY FOR DENOISING MICROVASCULATURE.

    PubMed

    Prasath, V B S; Pelapur, R; Glinskii, O V; Glinsky, V V; Huxley, V H; Palaniappan, K

    2015-04-01

    Fluorescence microscopy images are contaminated by noise, and improving image quality by filtering without blurring vascular structures is an important step in automatic image analysis. The application of interest here is to automatically and accurately extract the structural components of the microvascular system from images acquired by fluorescence microscopy. A robust denoising process is necessary in order to extract accurate vascular morphology information. For this purpose, we propose a multiscale tensor anisotropic diffusion model which progressively and adaptively updates the amount of smoothing while preserving vessel boundaries accurately. Based on a coherency-enhancing flow with a planar confidence measure and fused 3D structure information, our method integrates multiple scales to preserve the microvasculature while removing noise from membrane structures. Experimental results on simulated synthetic images and epifluorescence images show the advantage of our improvement over other related diffusion filters. We further show that the proposed multiscale integration approach improves the denoising accuracy of different tensor diffusion methods to obtain better microvasculature segmentation.
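
    The core edge-preserving diffusion idea can be sketched with a simplified scalar Perona-Malik scheme; this stands in for the paper's multiscale tensor model and is not the authors' method:

    ```python
    # Scalar anisotropic diffusion: smoothing is reduced where gradients
    # (vessel boundaries) are strong. np.roll gives wrap-around borders,
    # which is acceptable for a sketch.
    import numpy as np

    def anisotropic_diffusion(img, n_iter=20, kappa=0.1, gamma=0.2):
        u = img.astype(float).copy()
        for _ in range(n_iter):
            dn = np.roll(u, -1, axis=0) - u    # differences to 4 neighbours
            ds = np.roll(u, 1, axis=0) - u
            de = np.roll(u, -1, axis=1) - u
            dw = np.roll(u, 1, axis=1) - u
            c = lambda d: np.exp(-(d / kappa) ** 2)   # edge-stopping conductance
            u += gamma * (c(dn) * dn + c(ds) * ds + c(de) * de + c(dw) * dw)
        return u

    noisy = np.random.default_rng(7).random((64, 64))
    print(anisotropic_diffusion(noisy).std() < noisy.std())   # smoother output
    ```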

  16. Figure Text Extraction in Biomedical Literature

    PubMed Central

    Kim, Daehyun; Yu, Hong

    2011-01-01

    Background Figures are ubiquitous in biomedical full-text articles, and they represent important biomedical knowledge. However, the sheer volume of biomedical publications has made it necessary to develop computational approaches for accessing figures. Therefore, we are developing the Biomedical Figure Search engine (http://figuresearch.askHERMES.org) to allow bioscientists to access figures efficiently. Since text frequently appears in figures, automatically extracting such text may assist the task of mining information from figures. Little research, however, has been conducted exploring text extraction from biomedical figures. Methodology We first evaluated an off-the-shelf Optical Character Recognition (OCR) tool on its ability to extract text from figures appearing in biomedical full-text articles. We then developed a Figure Text Extraction Tool (FigTExT) to improve the performance of the OCR tool for figure text extraction through the use of three innovative components: image preprocessing, character recognition, and text correction. We first developed image preprocessing to enhance image quality and to improve text localization. Then we adapted the off-the-shelf OCR tool on the improved text localization for character recognition. Finally, we developed and evaluated a novel text correction framework by taking advantage of figure-specific lexicons. Results/Conclusions The evaluation on 382 figures (9,643 figure texts in total) randomly selected from PubMed Central full-text articles shows that FigTExT performed with 84% precision, 98% recall, and 90% F1-score for text localization and with 62.5% precision, 51.0% recall and 56.2% F1-score for figure text extraction. When limiting figure texts to those judged by domain experts to be important content, FigTExT performed with 87.3% precision, 68.8% recall, and 77% F1-score. FigTExT significantly improved the performance of the off-the-shelf OCR tool we used, which on its own performed with 36.6% precision, 19.3% recall, and 25.3% F1-score for text extraction. In addition, our results show that FigTExT can extract texts that do not appear in figure captions or other associated text, further suggesting the potential utility of FigTExT for improving figure search. PMID:21249186

  17. Figure text extraction in biomedical literature.

    PubMed

    Kim, Daehyun; Yu, Hong

    2011-01-13

    Figures are ubiquitous in biomedical full-text articles, and they represent important biomedical knowledge. However, the sheer volume of biomedical publications has made it necessary to develop computational approaches for accessing figures. Therefore, we are developing the Biomedical Figure Search engine (http://figuresearch.askHERMES.org) to allow bioscientists to access figures efficiently. Since text frequently appears in figures, automatically extracting such text may assist the task of mining information from figures. Little research, however, has been conducted exploring text extraction from biomedical figures. We first evaluated an off-the-shelf Optical Character Recognition (OCR) tool on its ability to extract text from figures appearing in biomedical full-text articles. We then developed a Figure Text Extraction Tool (FigTExT) to improve the performance of the OCR tool for figure text extraction through the use of three innovative components: image preprocessing, character recognition, and text correction. We first developed image preprocessing to enhance image quality and to improve text localization. Then we adapted the off-the-shelf OCR tool on the improved text localization for character recognition. Finally, we developed and evaluated a novel text correction framework by taking advantage of figure-specific lexicons. The evaluation on 382 figures (9,643 figure texts in total) randomly selected from PubMed Central full-text articles shows that FigTExT performed with 84% precision, 98% recall, and 90% F1-score for text localization and with 62.5% precision, 51.0% recall and 56.2% F1-score for figure text extraction. When limiting figure texts to those judged by domain experts to be important content, FigTExT performed with 87.3% precision, 68.8% recall, and 77% F1-score. FigTExT significantly improved the performance of the off-the-shelf OCR tool we used, which on its own performed with 36.6% precision, 19.3% recall, and 25.3% F1-score for text extraction. In addition, our results show that FigTExT can extract texts that do not appear in figure captions or other associated text, further suggesting the potential utility of FigTExT for improving figure search.

  18. Considering context: reliable entity networks through contextual relationship extraction

    NASA Astrophysics Data System (ADS)

    David, Peter; Hawes, Timothy; Hansen, Nichole; Nolan, James J.

    2016-05-01

    Existing information extraction techniques can only partially address the problem of exploiting unreadably large amounts of text. When discussion of events and relationships is limited to simple, past-tense, factual descriptions of events, current NLP-based systems can identify events and relationships and extract a limited amount of additional information. But the simple subset of available information that existing tools can extract from text is only useful to a small set of users and problems. Automated systems need to find and separate information based on whether something is threatened or planned to occur, has occurred in the past, or could potentially occur. We address the problem of advanced event and relationship extraction with our event and relationship attribute recognition system, which labels generic, planned, recurring, and potential events. The approach is based on a combination of new machine learning methods, novel linguistic features, and crowd-sourced labeling. The attribute labeler closes the gap between structured event and relationship models and the complicated and nuanced language that people use to describe them. Our operational-quality event and relationship attribute labeler enables Warfighters and analysts to more thoroughly exploit information in unstructured text. This is made possible through (1) more precise event and relationship interpretation, (2) more detailed information about extracted events and relationships, and (3) more reliable and informative entity networks that acknowledge the different attributes of entity-entity relationships.

  19. A Method for Extracting Important Segments from Documents Using Support Vector Machines

    NASA Astrophysics Data System (ADS)

    Suzuki, Daisuke; Utsumi, Akira

    In this paper we propose an extraction-based method for automatic summarization. The proposed method consists of two processes: important segment extraction and sentence compaction. The important segment extraction process classifies each segment in a document as important or not using Support Vector Machines (SVMs). The sentence compaction process then determines grammatically appropriate portions of a sentence for a summary according to its dependency structure and the SVM classification result. To test the performance of our method, we conducted an evaluation experiment using the Text Summarization Challenge (TSC-1) corpus of human-prepared summaries. The result was that our method achieved better performance than a segment-extraction-only method and the Lead method, especially for sentences of which only a part was included in the human summaries. Further analysis of the experimental results suggests that a hybrid method integrating sentence extraction with segment extraction may generate better summaries.
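
    A minimal stand-in for the segment classification step, assuming toy English segments and plain TF-IDF features (the paper uses richer features, including dependency information, on Japanese TSC-1 data):

    ```python
    # Classify text segments as important (1) or not (0) with a linear SVM.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.svm import LinearSVC
    from sklearn.pipeline import make_pipeline

    segments = ["the system achieved 90% accuracy",
                "as mentioned in passing earlier",
                "we propose a new summarization method",
                "for example and so on"]
    labels = [1, 0, 1, 0]    # 1 = important segment

    clf = make_pipeline(TfidfVectorizer(), LinearSVC()).fit(segments, labels)
    print(clf.predict(["our method improves accuracy"]))
    ```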

  20. An occlusion paradigm to assess the importance of the timing of the quiet eye fixation.

    PubMed

    Vine, Samuel J; Lee, Don Hyung; Walters-Symons, Rosanna; Wilson, Mark R

    2017-02-01

    The aim of the study was to explore the significance of the 'timing' of the quiet eye (QE), and the relative importance of late (online control) or early (pre-programming) visual information for accuracy. Twenty-seven skilled golfers completed a putting task using an occlusion paradigm with three conditions: early (prior to backswing), late (during putter stroke), and no (control) occlusion of vision. Performance, QE, and kinematic variables relating to the swing were measured. Results revealed that providing only early visual information (occluding late visual information) had a significant detrimental effect on performance and kinematic measures, compared to the control condition (no occlusion), despite QE durations being maintained. Conversely, providing only late visual information (occluding early visual information) was not significantly detrimental to performance or kinematics, with results similar to those in the control condition. These findings imply that the visual information extracted during movement execution - the late proportion of the QE - is critical when golf putting. The results challenge the predominant view that the QE serves only a pre-programming function. We propose that the different proportions of the QE (before and during movement) may serve different functions in supporting accuracy in golf putting.

  1. Using text mining techniques to extract phenotypic information from the PhenoCHF corpus

    PubMed Central

    2015-01-01

    Background Phenotypic information locked away in unstructured narrative text presents significant barriers to information accessibility, both for clinical practitioners and for computerised applications used for clinical research purposes. Text mining (TM) techniques have previously been applied successfully to extract different types of information from text in the biomedical domain. They have the potential to be extended to allow the extraction of information relating to phenotypes from free text. Methods To stimulate the development of TM systems that are able to extract phenotypic information from text, we have created a new corpus (PhenoCHF) that is annotated by domain experts with several types of phenotypic information relating to congestive heart failure. To ensure that systems developed using the corpus are robust to multiple text types, it integrates text from heterogeneous sources, i.e., electronic health records (EHRs) and scientific articles from the literature. We have developed several different phenotype extraction methods to demonstrate the utility of the corpus, and tested these methods on a further corpus, i.e., ShARe/CLEF 2013. Results Evaluation of our automated methods showed that PhenoCHF can facilitate the training of reliable phenotype extraction systems, which are robust to variations in text type. These results have been reinforced by evaluating our trained systems on the ShARe/CLEF corpus, which contains clinical records of various types. Like other studies within the biomedical domain, we found that solutions based on conditional random fields produced the best results, when coupled with a rich feature set. Conclusions PhenoCHF is the first annotated corpus aimed at encoding detailed phenotypic information. The unique heterogeneous composition of the corpus has been shown to be advantageous in the training of systems that can accurately extract phenotypic information from a range of different text types. Although the scope of our annotation is currently limited to a single disease, the promising results achieved can stimulate further work into the extraction of phenotypic information for other diseases. The PhenoCHF annotation guidelines and annotations are publicly available at https://code.google.com/p/phenochf-corpus. PMID:26099853

  2. Using text mining techniques to extract phenotypic information from the PhenoCHF corpus.

    PubMed

    Alnazzawi, Noha; Thompson, Paul; Batista-Navarro, Riza; Ananiadou, Sophia

    2015-01-01

    Phenotypic information locked away in unstructured narrative text presents significant barriers to information accessibility, both for clinical practitioners and for computerised applications used for clinical research purposes. Text mining (TM) techniques have previously been applied successfully to extract different types of information from text in the biomedical domain. They have the potential to be extended to allow the extraction of information relating to phenotypes from free text. To stimulate the development of TM systems that are able to extract phenotypic information from text, we have created a new corpus (PhenoCHF) that is annotated by domain experts with several types of phenotypic information relating to congestive heart failure. To ensure that systems developed using the corpus are robust to multiple text types, it integrates text from heterogeneous sources, i.e., electronic health records (EHRs) and scientific articles from the literature. We have developed several different phenotype extraction methods to demonstrate the utility of the corpus, and tested these methods on a further corpus, i.e., ShARe/CLEF 2013. Evaluation of our automated methods showed that PhenoCHF can facilitate the training of reliable phenotype extraction systems, which are robust to variations in text type. These results have been reinforced by evaluating our trained systems on the ShARe/CLEF corpus, which contains clinical records of various types. Like other studies within the biomedical domain, we found that solutions based on conditional random fields produced the best results, when coupled with a rich feature set. PhenoCHF is the first annotated corpus aimed at encoding detailed phenotypic information. The unique heterogeneous composition of the corpus has been shown to be advantageous in the training of systems that can accurately extract phenotypic information from a range of different text types. Although the scope of our annotation is currently limited to a single disease, the promising results achieved can stimulate further work into the extraction of phenotypic information for other diseases. The PhenoCHF annotation guidelines and annotations are publicly available at https://code.google.com/p/phenochf-corpus.
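
    The CRF-with-rich-features recipe that both PhenoCHF evaluations favoured can be sketched as below; the sklearn-crfsuite package and the tiny BIO-tagged examples are assumptions for illustration, not the authors' toolchain:

    ```python
    # Token-level phenotype tagging with a linear-chain CRF.
    import sklearn_crfsuite

    def token_features(sent, i):
        w = sent[i]
        return {"lower": w.lower(), "is_title": w.istitle(),
                "suffix3": w[-3:], "prev": sent[i - 1].lower() if i else "<s>"}

    sents = [["Patient", "has", "congestive", "heart", "failure", "."],
             ["No", "signs", "of", "renal", "failure", "."]]
    tags = [["O", "O", "B-PHEN", "I-PHEN", "I-PHEN", "O"],
            ["O", "O", "O", "B-PHEN", "I-PHEN", "O"]]

    X = [[token_features(s, i) for i in range(len(s))] for s in sents]
    crf = sklearn_crfsuite.CRF(algorithm="lbfgs", max_iterations=50)
    crf.fit(X, tags)
    print(crf.predict(X)[0])
    ```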

  3. Experimental extraction of an entangled photon pair from two identically decohered pairs.

    PubMed

    Yamamoto, Takashi; Koashi, Masato; Ozdemir, Sahin Kaya; Imoto, Nobuyuki

    2003-01-23

    Entanglement is considered to be one of the most important resources in quantum information processing schemes, including teleportation, dense coding and entanglement-based quantum key distribution. Because entanglement cannot be generated by classical communication between distant parties, distribution of entangled particles between them is necessary. During the distribution process, entanglement between the particles is degraded by the decoherence and dissipation processes that result from unavoidable coupling with the environment. Entanglement distillation and concentration schemes are therefore needed to extract pairs with a higher degree of entanglement from these less-entangled pairs; this is accomplished using local operations and classical communication. Here we report an experimental demonstration of extraction of a polarization-entangled photon pair from two decohered photon pairs. Two polarization-entangled photon pairs are generated by spontaneous parametric down-conversion and then distributed through a channel that induces identical phase fluctuations to both pairs; this ensures that no entanglement is available as long as each pair is manipulated individually. Then, through collective local operations and classical communication we extract from the two decohered pairs a photon pair that is observed to be polarization-entangled.

  4. Heavy metal extractable forms in sludge from wastewater treatment plants.

    PubMed

    Alvarez, E Alonso; Mochón, M Callejón; Jiménez Sánchez, J C; Ternero Rodríguez, M

    2002-05-01

    The analysis of heavy metals is a very important task in assessing the potential environmental and health risks associated with sludge from wastewater treatment plants (WWTPs). However, it is widely accepted that the determination of total elements does not give an accurate estimation of the potential environmental impact, so sequential extraction techniques must be applied to obtain suitable information about their bioavailability or toxicity. In this paper, a sequential extraction scheme following the BCR guidelines was applied to sludge samples collected from each sludge treatment step of five municipal activated sludge plants. Al, Cd, Co, Cu, Cr, Fe, Mn, Hg, Mo, Ni, Pb, Ti and Zn were determined in the sludge extracts by inductively coupled plasma atomic emission spectrometry. With respect to current international legislation on the use of sludge for agricultural purposes, none of the metal concentrations exceeded the maximum permitted levels. For most of the metals under consideration, the results showed a clear rise along the sludge treatment in the proportion of the two less-available fractions (oxidizable metal and residual metal).

  5. Breast cancer mitosis detection in histopathological images with spatial feature extraction

    NASA Astrophysics Data System (ADS)

    Albayrak, Abdülkadir; Bilgin, Gökhan

    2013-12-01

    In this work, cellular mitosis detection in histopathological images has been investigated. Mitosis detection is a very expensive and time-consuming process. The development of digital imaging in pathology has enabled a reasonable and effective solution to this problem. Segmentation of digital images provides easier analysis of cell structures in histopathological data. To differentiate normal and mitotic cells in histopathological images, the feature extraction step is crucial for system accuracy. A mitotic cell has more distinctive textural dissimilarities than other normal cells, so it is important to incorporate spatial information in the feature extraction or post-processing steps. As the main part of this study, the Haralick texture descriptor is applied with different spatial window sizes in the RGB and La*b* color spaces, so that the spatial dependencies of normal and mitotic cellular pixels can be evaluated within different pixel neighborhoods. Extracted features are compared across various sample sizes by Support Vector Machines using k-fold cross-validation. The results show that the separation accuracy for mitotic and non-mitotic cellular pixels improves with increasing spatial window size.
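
    A sketch of the Haralick-style texture step, assuming random-noise patches in place of real histopathology windows: grey-level co-occurrence features are computed inside a spatial window and fed to an SVM:

    ```python
    # GLCM texture features per window + SVM (toy mitotic/non-mitotic labels).
    import numpy as np
    from skimage.feature import graycomatrix, graycoprops
    from sklearn.svm import SVC

    rng = np.random.default_rng(4)

    def texture_features(patch):
        glcm = graycomatrix(patch, distances=[1], angles=[0, np.pi / 2],
                            levels=256, symmetric=True, normed=True)
        return np.hstack([graycoprops(glcm, p).ravel()
                          for p in ("contrast", "homogeneity", "energy")])

    patches = [rng.integers(0, 256, (31, 31), dtype=np.uint8) for _ in range(20)]
    X = np.array([texture_features(p) for p in patches])
    y = rng.integers(0, 2, 20)

    clf = SVC(kernel="rbf").fit(X, y)
    print("features per window:", X.shape[1])
    ```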

  6. A research framework for pharmacovigilance in health social media: Identification and evaluation of patient adverse drug event reports.

    PubMed

    Liu, Xiao; Chen, Hsinchun

    2015-12-01

    Social media offer insights into patients' medical problems such as drug side effects and treatment failures. Patient reports of adverse drug events from social media have great potential to improve the current practice of pharmacovigilance. However, extracting patient adverse drug event reports from social media continues to be an important challenge for health informatics research. In this study, we develop a research framework with advanced natural language processing techniques for integrated and high-performance extraction of patient-reported adverse drug events. The framework consists of medical entity extraction for recognizing patient discussions of drugs and events; adverse drug event extraction with a shortest-dependency-path kernel based statistical learning method and semantic filtering with information from medical knowledge bases; and report source classification to tease out noise. To evaluate the proposed framework, a series of experiments were conducted on a test bed encompassing postings from major diabetes and heart disease forums in the United States. The results reveal that each component of the framework significantly contributes to its overall effectiveness, and that our framework significantly outperforms prior work. Published by Elsevier Inc.

  7. Irrigation network extraction methodology from LiDAR DTM using Whitebox and ArcGIS

    NASA Astrophysics Data System (ADS)

    Mahor, M. A. P.; De La Cruz, R. M.; Olfindo, N. T.; Perez, A. M. C.

    2016-10-01

    Irrigation networks are important in distributing water resources to areas where rainfall is not enough to sustain agriculture. They are also crucial for redirecting vast amounts of water to decrease the risk of flooding in flat areas, especially near sources of water. Given the lack of studies on extracting irrigation features, which range from wide canals to small ditches, this study presents a method of extracting these features from LiDAR-derived digital terrain models (DTMs) using Geographic Information Systems (GIS) tools such as ArcGIS and Whitebox Geospatial Analysis Tools (Whitebox GAT). High-resolution LiDAR DTMs with 1-meter horizontal and 0.25-meter vertical accuracies were processed to generate a gully depth map. This map was then reclassified, converted to vector, and filtered according to segment length and sinuosity to isolate the irrigation features. Initial results in the test area show that the extraction completeness is greater than 80% when compared with data obtained from the National Irrigation Administration (NIA).
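
    The vector filtering step can be sketched as below, with illustrative thresholds (the study's actual cut-offs are not given here): keep candidate lines whose length and sinuosity match canal-like geometry:

    ```python
    # Filter candidate lines by length and sinuosity
    # (path length / straight-line distance between endpoints).
    from shapely.geometry import LineString, Point

    def sinuosity(line):
        ends = Point(line.coords[0]).distance(Point(line.coords[-1]))
        return line.length / ends if ends > 0 else float("inf")

    candidates = [LineString([(0, 0), (100, 2), (200, 0)]),        # canal-like
                  LineString([(0, 0), (5, 8), (2, 15), (9, 20)])]  # short, winding

    kept = [ln for ln in candidates
            if ln.length > 50 and sinuosity(ln) < 1.2]
    print(len(kept), "segment(s) kept")
    ```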

  8. Extracting Vegetation Coverage in Dry-hot Valley Regions Based on Alternating Angle Minimum Algorithm

    NASA Astrophysics Data System (ADS)

    Y Yang, M.; Wang, J.; Zhang, Q.

    2017-07-01

    Vegetation coverage is one of the most important indicators of ecological environment change, and is also an effective index for the assessment of land degradation and desertification. Dry-hot valley regions have sparse surface vegetation, and the spectral information of the vegetation in such regions is usually weakly represented in remote sensing, so there are considerable limitations in applying the commonly used vegetation index methods to calculate vegetation coverage there. Therefore, in this paper, the Alternating Angle Minimum (AAM) algorithm, a deterministic model, is adopted for endmember selection and pixel unmixing of MODIS imagery in order to extract vegetation coverage, and an accuracy test is carried out using Landsat TM imagery from the same period. The results show that, in dry-hot valley regions with sparse vegetation, the AAM model has high unmixing accuracy and the extracted vegetation coverage is close to the actual situation, so it is promising to apply the AAM model to the extraction of vegetation coverage in dry-hot valley regions.
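
    AAM itself is not a stock library routine, so the sketch below substitutes a generic non-negative linear unmixing step to show the endmember idea: each pixel spectrum is modelled as a mixture of endmember spectra, and the vegetation abundance is read off the solution. Spectra are synthetic:

    ```python
    # Non-negative least squares unmixing of one pixel spectrum.
    import numpy as np
    from scipy.optimize import nnls

    bands = 7
    endmembers = np.abs(np.random.default_rng(5).normal(size=(bands, 3)))  # veg, soil, rock
    true_frac = np.array([0.3, 0.6, 0.1])               # sparse-vegetation pixel
    pixel = endmembers @ true_frac

    frac, _ = nnls(endmembers, pixel)
    frac /= frac.sum()                                  # enforce sum-to-one
    print("vegetation coverage estimate:", round(frac[0], 3))
    ```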

  9. Missing binary data extraction challenges from Cochrane reviews in mental health and Campbell reviews with implications for empirical research.

    PubMed

    Spineli, Loukia M

    2017-12-01

    To report challenges encountered during the extraction process from Cochrane reviews in mental health and Campbell reviews, and to indicate their implications for the empirical performance of different methods to handle missingness. We used a collection of meta-analyses on binary outcomes collated from previous work on missing outcome data. To evaluate the accuracy of their extraction, we developed specific criteria pertaining to the reporting of missing outcome data in systematic reviews. Using the most popular methods to handle missing binary outcome data, we investigated the implications of the accuracy of the extracted meta-analysis on the random-effects meta-analysis results. Of 113 meta-analyses from Cochrane reviews, 60 (53%) were judged as "unclearly" extracted (ie, no information on the outcome of completers but available information on how missing participants were handled) and 42 (37%) as "unacceptably" extracted (ie, no information on the outcome of completers as well as no information on how missing participants were handled). For the remaining meta-analyses, it was judged that data were "acceptably" extracted (ie, information on the completers' outcome was provided for all trials). Overall, "unclear" extraction overestimated the magnitude of the summary odds ratio and the between-study variance, and additionally inflated the uncertainty of both meta-analytical parameters. The only eligible Campbell review was judged as "unclear." Depending on the extent of missingness, the reporting quality of systematic reviews can greatly affect the accuracy of the extracted meta-analyses and, by extension, the empirical performance of different methods to handle missingness. Copyright © 2017 John Wiley & Sons, Ltd.

  10. Evaluation of δ2H and δ18O of water in pores extracted by compression method-effects of closed pores and comparison to direct vapor equilibration and laser spectrometry method

    NASA Astrophysics Data System (ADS)

    Nakata, Kotaro; Hasegawa, Takuma; Oyama, Takahiro; Miyakawa, Kazuya

    2018-06-01

    Stable isotopes of water (δ2H and δ18O) can improve our understanding of the origin, mixing and migration of groundwater. In formations with low permeability, they provide information about ion migration mechanisms such as diffusion and/or advection, and are therefore very important for understanding the migration of water and ions. However, in low-permeability formations it is difficult to obtain groundwater samples as liquid, so the water in pores must be extracted for analysis. Compressing rock is the most common and widely used method of extracting pore water. However, changes in δ2H and δ18O may take place during compression, because changes in ion concentration have been reported in previous studies. In this study, two natural rocks were compressed, and the changes in δ2H and δ18O with compression pressure were investigated. Mechanisms for the changes in water isotopes observed during compression were then discussed. In addition, δ2H and δ18O of pore water were also evaluated by direct vapor equilibration and laser spectrometry (DVE-LS) and compared with those obtained by compression. δ2H was found to change during compression, and part of this change could be explained by the effect of water from closed pores extracted by compression. In addition, water isotopes in both open and closed pores were estimated by combining the results of two kinds of compression experiments. Water isotopes evaluated by compression that were not affected by water from closed pores showed good agreement with those obtained by DVE-LS, indicating that compression reflects mixed information from water in open and closed pores, whereas DVE-LS reflects information from open pores only. Thus, the comparison of water isotopes obtained by compression and DVE-LS can provide information about water isotopes in closed and open pores.

  11. Advances in Spectral-Spatial Classification of Hyperspectral Images

    NASA Technical Reports Server (NTRS)

    Fauvel, Mathieu; Tarabalka, Yuliya; Benediktsson, Jon Atli; Chanussot, Jocelyn; Tilton, James C.

    2012-01-01

    Recent advances in spectral-spatial classification of hyperspectral images are presented in this paper. Several techniques are investigated for combining both spatial and spectral information. Spatial information is extracted at the object (set of pixels) level rather than at the conventional pixel level. Mathematical morphology is first used to derive the morphological profile of the image, which includes characteristics about the size, orientation, and contrast of the spatial structures present in the image. Then, the morphological neighborhood is defined and used to derive additional features for classification. Classification is performed with support vector machines (SVMs) using the available spectral information and the extracted spatial information. Spatial postprocessing is next investigated to build more homogeneous and spatially consistent thematic maps. To that end, three presegmentation techniques are applied to define regions that are used to regularize the preliminary pixel-wise thematic map. Finally, a multiple-classifier (MC) system is defined to produce relevant markers that are exploited to segment the hyperspectral image with the minimum spanning forest algorithm. Experiments conducted on three real hyperspectral images with different spatial and spectral resolutions and corresponding to various contexts are presented. They highlight the importance of spectral-spatial strategies for the accurate classification of hyperspectral images and validate the proposed methods.
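
    As a rough, hedged illustration of the core pattern described here, stacking spectral bands with a morphological profile and classifying with an SVM, the Python sketch below uses scipy and scikit-learn; the structuring-element sizes are invented and the random cube stands in for a real hyperspectral image.

      # Sketch: build a simple morphological profile from one band and classify
      # stacked spectral + spatial features with an SVM. Data are random stand-ins.
      import numpy as np
      from scipy.ndimage import grey_opening, grey_closing
      from sklearn.svm import SVC

      rng = np.random.default_rng(0)
      cube = rng.random((50, 50, 10))             # hypothetical 50x50 image, 10 bands
      labels = rng.integers(0, 3, size=(50, 50))  # hypothetical ground truth

      base = cube[:, :, 0]
      profile = [grey_opening(base, size=s) for s in (3, 5, 7)] + \
                [grey_closing(base, size=s) for s in (3, 5, 7)]
      spatial = np.stack(profile, axis=-1)

      features = np.concatenate([cube, spatial], axis=-1).reshape(-1, 10 + 6)
      clf = SVC(kernel="rbf", gamma="scale").fit(features, labels.ravel())
      predicted_map = clf.predict(features).reshape(50, 50)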

  12. Building change detection via a combination of CNNs using only RGB aerial imageries

    NASA Astrophysics Data System (ADS)

    Nemoto, Keisuke; Hamaguchi, Ryuhei; Sato, Masakazu; Fujita, Aito; Imaizumi, Tomoyuki; Hikosaka, Shuhei

    2017-10-01

    Building change information extracted from remote sensing imagery is important for various applications such as urban management and marketing planning. The goal of this work is to develop a methodology for automatically capturing building changes from remote sensing imagery. Recent studies have addressed this goal by exploiting 3-D information as a proxy for building height. In contrast, because in practice it is expensive or impossible to prepare 3-D information, we do not rely on 3-D data but focus on using only RGB aerial imagery. Instead, we employ deep convolutional neural networks (CNNs) to extract effective features and improve change detection accuracy in RGB remote sensing imagery. We consider two aspects of building change detection: building detection and subsequent change detection. Our proposed methodology was tested on several areas, which differ in characteristics such as dominant building types and brightness values. Across all tested areas, the proposed method provides good results for changed objects, with recall values over 75% under a strict overlap requirement of over 50% intersection-over-union (IoU). When the IoU threshold was relaxed to over 10%, the resulting recall values were over 81%. We conclude that the use of CNNs enables accurate detection of building changes without employing 3-D information.
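
    The IoU overlap criterion quoted above is simple to state in code. A minimal NumPy sketch, with made-up masks, applying the two thresholds from the abstract:

      # Intersection-over-union between a predicted and a reference building mask.
      import numpy as np

      def iou(pred: np.ndarray, ref: np.ndarray) -> float:
          """Both inputs are boolean masks of the same shape."""
          inter = np.logical_and(pred, ref).sum()
          union = np.logical_or(pred, ref).sum()
          return inter / union if union else 0.0

      pred = np.zeros((8, 8), bool); pred[2:6, 2:6] = True
      ref = np.zeros((8, 8), bool); ref[3:7, 3:7] = True
      score = iou(pred, ref)
      detected_strict = score > 0.50   # the strict criterion in the abstract
      detected_relaxed = score > 0.10  # the relaxed criterion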

  13. The role of conflict minerals, artisanal mining, and informal trading networks in African intrastate and regional conflicts

    USGS Publications Warehouse

    Chirico, Peter G.; Malpeli, Katherine C.

    2014-01-01

    The relationship between natural resources and armed conflict gained public and political attention in the 1990s, when it became evident that the mining and trading of diamonds were connected with brutal rebellions in several African nations. Easily extracted resources such as alluvial diamonds and gold have been and continue to be exploited by rebel groups to fund their activities. Artisanal and small-scale miners operating under a quasi-legal status often mine these mineral deposits. While many African countries have legalized artisanal mining and established flow chains through which production is intended to travel, informal trading networks frequently emerge in which miners seek to evade taxes and fees by selling to unauthorized buyers. These networks have the potential to become international in scope, with actors operating in multiple countries. The lack of government control over the artisanal mining sector and the prominence of informal trade networks can have severe social, political, and economic consequences. In the past, mineral extraction fuelled violent civil wars in Sierra Leone, Liberia, and Angola, and it continues to do so today in several other countries. The significant influence of the informal network that surrounds artisanal mining is therefore an important security concern that can extend across borders and have far-reaching impacts.

  14. Mapping care processes within a hospital: from theory to a web-based proposal merging enterprise modelling and ISO normative principles.

    PubMed

    Staccini, Pascal; Joubert, Michel; Quaranta, Jean-François; Fieschi, Marius

    2005-03-01

    Today, the economic and regulatory environment, involving activity-based and prospective payment systems, healthcare quality and risk analysis, traceability of the acts performed and evaluation of care practices, accounts for the current interest in clinical and hospital information systems. The structured gathering of information relative to users' needs and system requirements is fundamental when installing such systems. This stage takes time and is generally misconstrued by caregivers and is of limited efficacy to analysts. We used a modelling technique designed for manufacturing processes (IDEF0/SADT). We enhanced the basic model of an activity with descriptors extracted from the Ishikawa cause-and-effect diagram (methods, men, materials, machines, and environment). We proposed an object data model of a process and its components, and programmed a web-based tool in an object-oriented environment. This tool makes it possible to extract the data dictionary of a given process from the description of its elements and to locate documents (procedures, recommendations, instructions) according to each activity or role. Aimed at structuring needs and storing information provided by directly involved teams regarding the workings of an institution (or at least part of it), the process-mapping approach has an important contribution to make in the analysis of clinical information systems.

  15. Considerations on the Optimal and Efficient Processing of Information-Bearing Signals

    ERIC Educational Resources Information Center

    Harms, Herbert Andrew

    2013-01-01

    Noise is a fundamental hurdle that impedes the processing of information-bearing signals, specifically the extraction of salient information. Processing that is both optimal and efficient is desired; optimality ensures the extracted information has the highest fidelity allowed by the noise, while efficiency ensures limited resource usage. Optimal…

  16. Microemulsions and Aggregation Formation in Extraction Processes for Used Nuclear Fuel: Thermodynamic and Structural Studies

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nilsson, Mikael

    Advanced nuclear fuel cycles rely on successful chemical separation of various elements in the used fuel. Numerous solvent extraction (SX) processes have been developed for the recovery and purification of metal ions from this used material. However, the predictability of process operations has been challenged by the lack of a fundamental understanding of the chemical interactions in several of these separation systems. For example, gaps in the thermodynamic description of the mechanism and the complexes formed will make predictions very challenging. Recent studies of certain extraction systems under development and a number of more established SX processes have suggested that aggregate formation in the organic phase results in a transformation of its selectivity and efficiency. Aggregation phenomena have consistently interfered with SX process development, and have, over the years, become synonymous with an undesirable effect that must be prevented. This multiyear, multicollaborative research effort was carried out to study solvation and self-organization in non-aqueous solutions at conditions promoting aggregation phenomena. Our approach to this challenging topic was to investigate extraction systems comprising more than one extraction reagent where synergy of the metal ion could be observed. These systems were probed for the existence of stable microemulsions in the organic phase, and a number of high-end characterization tools were employed to elucidate the role of the aggregates in metal ion extraction. The ultimate goal was to find connections between synergy of metal ion extraction and reverse micellar formation. Our main accomplishment for this project was the expansion of the understanding of metal ion complexation in the extraction system combining tributyl phosphate (TBP) and dibutyl phosphoric acid (HDBP). We have found that for this system no direct correlation exists between metal ion extraction and the formation of aggregates, meaning that the metal ion is not solubilized in a reverse micelle core. Rather, we have found solid evidence that the metal ions are extracted and coordinated by the organic ligands as suggested by classic SX theories. However, we have challenged the existence of mixed complexes that have been suggested to exist in this particular extraction system. Most importantly, we have generated a wealth of information, trained students on important lab techniques, and strengthened the collaboration between the DOE national laboratories and the US educational institutions involved in this work.

  17. Comparison of the efficacy and safety of pollen allergen extracts using skin prick testing and serum specific IgE as references.

    PubMed

    Visitsunthorn, Nualanong; Visitsuntho, Kittipos; Pacharn, Punchama; Jirapongsananuruk, Orathai; Bunnag, Chaweewan

    2017-12-01

    Allergen extracts may differ due to differences in the dissemination of allergen-containing species across geographical areas. Therefore, we wish to develop our own extracts to ensure the precision and quality of diagnosis. To compare the efficacy and safety of our locally prepared pollen allergen extracts with imported ones, using skin prick testing (SPT) and serum specific IgE (sIgE) as references. This prospective, randomized, double-blinded, self-controlled study was performed in adult volunteers with respiratory allergy who were sensitized to at least one kind of pollen. Each subject was pricked with our Bermuda grass, Johnson grass and careless weed pollen allergen extracts, and also with the imported ones. sIgE levels were measured using ImmunoCAP®. In 68 volunteers, our Bermuda, Johnson and careless weed extracts showed 91.2%, 45.6% and 54.4% positive SPTs, respectively, while the imported ones showed 73.5%, 45.6% and 54.4%, respectively. No adverse reactions were found in any procedure. Concentrations of 10,000 BAU/mL of Bermuda grass, 1 : 20 w/v or 10,000 PNU/mL of Johnson grass, and 1 : 40 w/v or 10,000 PNU/mL of careless weed yielded the most positive SPT results. There was no significant difference in mean wheal diameter (MWD) between local and imported extracts. No significant difference between the SPT results of local and imported pollen allergen extracts was found, and a significant correlation was found between the MWDs of imported pollen extract SPT and serum sIgE levels (p < 0.01).

  18. Electronic health information quality challenges and interventions to improve public health surveillance data and practice.

    PubMed

    Dixon, Brian E; Siegel, Jason A; Oemig, Tanya V; Grannis, Shaun J

    2013-01-01

    We examined completeness, an attribute of data quality, in the context of electronic laboratory reporting (ELR) of notifiable disease information to public health agencies. We extracted more than seven million ELR messages from multiple clinical information systems in two states. We calculated and compared the completeness of various data fields within the messages that were identified to be important to public health reporting processes. We compared unaltered, original messages from source systems with similar messages from another state as well as messages enriched by a health information exchange (HIE). Our analysis focused on calculating completeness (i.e., the number of nonmissing values) for fields deemed important for inclusion in notifiable disease case reports. The completeness of data fields for laboratory transactions varied across clinical information systems and jurisdictions. Fields identifying the patient and test results were usually complete (97%-100%). Fields containing patient demographics, patient contact information, and provider contact information were suboptimal (6%-89%). Transactions enhanced by the HIE were found to be more complete (increases ranged from 2% to 25%) than the original messages. ELR data from clinical information systems can be of suboptimal quality. Public health monitoring of data sources and augmentation of ELR message content using HIE services can improve data quality.
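
    Completeness as defined here, the share of nonmissing values per field, reduces to a one-liner in pandas. A small sketch with invented ELR-style fields:

      # Completeness = fraction of nonmissing values per field, as a percentage.
      import pandas as pd

      messages = pd.DataFrame({
          "patient_id":     ["A1", "A2", "A3", "A4"],
          "test_result":    ["pos", "neg", "pos", "neg"],
          "patient_phone":  [None, "555-0100", None, None],
          "provider_phone": [None, None, "555-0199", None],
      })  # made-up example messages

      completeness = messages.notna().mean() * 100
      print(completeness.round(1))  # e.g. patient_id 100.0, patient_phone 25.0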

  19. Prediction and analysis of essential genes using the enrichments of gene ontology and KEGG pathways.

    PubMed

    Chen, Lei; Zhang, Yu-Hang; Wang, ShaoPeng; Zhang, YunHua; Huang, Tao; Cai, Yu-Dong

    2017-01-01

    Identifying essential genes in a given organism is important for research on their fundamental roles in organism survival. Furthermore, if possible, uncovering the links between core functions or pathways and these essential genes will further help us obtain deep insight into the key roles of these genes. In this study, we investigated the essential and non-essential genes reported in a previous study and extracted gene ontology (GO) terms and biological pathways that are important for the determination of essential genes. Through the enrichment theory of GO and KEGG pathways, we encoded each essential/non-essential gene into a vector in which each component represented the relationship between the gene and one GO term or KEGG pathway. To analyze these relationships, the maximum relevance minimum redundancy (mRMR) method was adopted. Then, incremental feature selection (IFS) and a support vector machine (SVM) were employed to extract important GO terms and KEGG pathways. A prediction model was built simultaneously using the extracted GO terms and KEGG pathways, which yielded nearly perfect performance, with a Matthews correlation coefficient of 0.951, for distinguishing essential and non-essential genes. To fully investigate the key factors influencing the fundamental roles of essential genes, the 21 most important GO terms and three KEGG pathways were analyzed in detail. In addition, several genes predicted to be essential by our prediction model are provided in this study. We suggest that this study provides more functional and pathway information on the essential genes and provides a new way to investigate related problems.
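
    The IFS loop paired with an SVM and the Matthews correlation coefficient can be sketched generically. In the sketch below, features are assumed to be pre-ranked (e.g., by mRMR, which is not reimplemented here), and the data are random placeholders rather than real enrichment scores.

      # Sketch of incremental feature selection (IFS) over an mRMR-style ranking:
      # grow the feature set one ranked feature at a time and keep the best MCC.
      import numpy as np
      from sklearn.svm import SVC
      from sklearn.model_selection import cross_val_predict
      from sklearn.metrics import matthews_corrcoef

      rng = np.random.default_rng(1)
      X = rng.random((200, 30))              # placeholder enrichment-score vectors
      y = rng.integers(0, 2, 200)            # placeholder essential / non-essential
      ranking = list(range(30))              # stand-in for an mRMR ranking

      best_mcc, best_k = -1.0, 0
      for k in range(1, len(ranking) + 1):
          cols = ranking[:k]
          preds = cross_val_predict(SVC(), X[:, cols], y, cv=5)
          mcc = matthews_corrcoef(y, preds)
          if mcc > best_mcc:
              best_mcc, best_k = mcc, k
      print(f"best MCC {best_mcc:.3f} with top {best_k} features")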

  20. Measurement of dielectric constant of organic solvents by indigenously developed dielectric probe

    NASA Astrophysics Data System (ADS)

    Keshari, Ajay Kumar; Rao, J. Prabhakar; Rao, C. V. S. Brahmmananda; Ramakrishnan, R.; Ramanarayanan, R. R.

    2018-04-01

    The extraction, separation and purification of actinides (uranium and plutonium) from various matrices are important steps in the nuclear fuel cycle. One of the separation processes adopted on an industrial scale is liquid-liquid extraction, or solvent extraction. Liquid-liquid extraction uses a specific ligand/extractant in conjunction with a suitable diluent, and involves the partitioning of the solute between two immiscible phases. In most cases, one of the phases is aqueous and the other is an organic solvent. The solvent used in solvent extraction should be selective for the metal of interest, should have an optimal distribution ratio, and should allow the loaded metal to be easily stripped from the organic phase under suitable experimental conditions. Physical properties important for the solvent include density, viscosity, phase separation time, interfacial surface tension and the polarity of the extractant.

  1. Requirement of scientific documentation for the development of Naturopathy.

    PubMed

    Rastogi, Rajiv

    2006-01-01

    The past few decades have witnessed an explosion of knowledge in almost every field. This has resulted not only in the advancement of individual subjects but has also influenced the growth of various allied subjects. The present paper explains the advancement of science through efforts made in specific areas and also through discoveries in different allied fields that have an indirect influence upon the subject proper. In Naturopathy, it seems that although nothing in particular has been added to the basic thoughts or fundamental principles of the subject, the entire understanding of treatment has been revolutionised under the influence of the scientific discoveries of the past few decades. The advent of information technology has further added to the boom of knowledge, and it often seems impossible to utilize this information for the good of human beings because it is not logically arranged in our minds. Against this background, the author tries to define documentation, stating that we have today an ocean of information and knowledge about various things: living or dead; plants, animals or human beings; geographical conditions or changing weather and environment. What needs to be done is to extract the relevant knowledge and information required to enrich the subject. The author compares documentation with the churning of milk to extract butter. Documentation, in fact, is the churning of an ocean of information to extract the specific, most appropriate, relevant and defined information and knowledge related to a particular subject. Besides discussing the definition of documentation, the paper highlights the areas of Naturopathy in urgent need of proper documentation. The paper also discusses the present status of Naturopathy in India, proposes short-term and long-term goals to be achieved, and plans strategies for achieving them. The most important aspect of the paper is a due understanding of the limitations of Naturopathy, combined with a constant effort to improve it through the growth made in the various disciplines of science so far.

  2. CRL/Brandeis: Description of the DIDEROT System as Used for MUC-5

    DTIC Science & Technology

    1993-01-01

    been evaluated in the 4th Message Understanding Conference (MUC-4) where it was required to extract information from 200 texts on South American... Email: jamesp@cs.brandeis.edu Abstract This report describes the major developments over the last six months in completing the Diderot information ...extraction system for the MUC-5 evaluation. Diderot is an information extraction system built at CRL and Brandeis University over the past two

  3. A rule-based named-entity recognition method for knowledge extraction of evidence-based dietary recommendations

    PubMed Central

    2017-01-01

    Evidence-based dietary information represented as unstructured text is crucial information that needs to be accessed in order to help dietitians keep up with the new knowledge that arrives daily in newly published scientific reports. Different named-entity recognition (NER) methods have been introduced previously to extract useful information from the biomedical literature. They are focused on, for example, extracting mentions of genes and proteins, relationships between genes and proteins, chemical concepts, and relationships between drugs and diseases. In this paper, we present a novel NER method, called drNER, for knowledge extraction of evidence-based dietary information. To the best of our knowledge this is the first attempt at extracting dietary concepts. DrNER is a rule-based NER method that consists of two phases. The first involves the detection and determination of entity mentions, and the second involves the selection and extraction of the entities. We evaluate the method using text corpora from heterogeneous sources, including text from several scientifically validated web sites and text from scientific publications. The evaluation showed that drNER gives good results and can be used for knowledge extraction of evidence-based dietary recommendations. PMID:28644863
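
    The drNER system itself is not public in this record, but its two-phase rule-based flavor can be approximated: detect candidate entity mentions with lexicon and pattern rules, then select and extract them from sentences that look like recommendations. The lexicon, patterns and example below are invented for illustration.

      # Toy two-phase rule-based NER in the spirit of a dietary-recommendation
      # extractor: phase 1 detects candidate mentions, phase 2 selects/extracts.
      import re

      NUTRIENT_LEXICON = ("vitamin d", "calcium", "fibre", "sodium")  # invented
      QUANTITY_PATTERN = re.compile(r"\b\d+(?:\.\d+)?\s*(?:mg|g|iu)\b", re.I)

      def detect(text: str):
          """Phase 1: find lexicon hits and quantity patterns."""
          hits = [w for w in NUTRIENT_LEXICON if w in text.lower()]
          hits += QUANTITY_PATTERN.findall(text)
          return hits

      def extract(text: str):
          """Phase 2: keep mentions only from sentences that look like advice."""
          advice = [s for s in re.split(r"(?<=[.!?])\s+", text)
                    if re.search(r"\b(should|recommended|intake)\b", s, re.I)]
          return [m for s in advice for m in detect(s)]

      print(extract("Adults should consume 1000 mg calcium daily. The sky is blue."))
      # -> ['calcium', '1000 mg']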

  4. A Multi-Disciplinary Approach to Remote Sensing through Low-Cost UAVs.

    PubMed

    Calvario, Gabriela; Sierra, Basilio; Alarcón, Teresa E; Hernandez, Carmen; Dalmau, Oscar

    2017-06-16

    The use of Unmanned Aerial Vehicles (UAVs) for remote sensing has enabled low-cost monitoring, since data can be acquired quickly and easily. This paper reports the experience of agave crop analysis with a low-cost UAV. The data were processed by a traditional photogrammetric workflow, and data extraction techniques were applied to derive new layers and separate the agave plants from weeds and other elements of the environment. Our proposal combines elements of photogrammetry, computer vision, data mining, geomatics and computer science. This fusion leads to very interesting results in agave control. This paper aims to demonstrate the potential of UAV monitoring in agave crops and the importance of information processing with a reliable data flow.

  5. Text Line Detection from Rectangle Traffic Panels of Natural Scene

    NASA Astrophysics Data System (ADS)

    Wang, Shiyuan; Huang, Linlin; Hu, Jian

    2018-01-01

    Traffic sign detection and recognition is very important for Intelligent Transportation. Among traffic signs, the traffic panel contains rich information. However, due to low resolution and blur in rectangular traffic panels, it is difficult to extract the characters and symbols. In this paper, we propose a coarse-to-fine method to detect Chinese characters on traffic panels in natural scenes. First, given a traffic panel, color quantization is applied to extract candidate regions of Chinese characters. Second, a multi-stage filter based on learning is applied to discard non-character regions. Third, we aggregate the characters into text lines using a distance metric learning method. Experimental results on real traffic images from Baidu Street View demonstrate the effectiveness of the proposed method.
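
    Color quantization as a first-stage candidate generator can be sketched with k-means over the pixel colors; the cluster count and the synthetic image below are illustrative stand-ins, not the authors' configuration.

      # Sketch: quantize an RGB image to k representative colors with k-means,
      # a common first step for isolating text-colored regions on a panel.
      import numpy as np
      from sklearn.cluster import KMeans

      rng = np.random.default_rng(2)
      image = rng.integers(0, 256, size=(40, 60, 3), dtype=np.uint8)  # stand-in panel

      pixels = image.reshape(-1, 3).astype(float)
      kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(pixels)
      quantized = kmeans.cluster_centers_[kmeans.labels_].reshape(image.shape)
      label_map = kmeans.labels_.reshape(40, 60)  # per-pixel cluster id; each
                                                  # cluster is a candidate layer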

  6. Bilinear Time-frequency Analysis for Lamb Wave Signal Detected by Electromagnetic Acoustic Transducer

    NASA Astrophysics Data System (ADS)

    Sun, Wenxiu; Liu, Guoqiang; Xia, Hui; Xia, Zhengwu

    2018-03-01

    Accurate acquisition of the detection signal travel time plays a very important role in cross-hole tomography. An experimental platform of an aluminum plate under a perpendicular magnetic field is established, and two bilinear time-frequency analysis methods, the Wigner-Ville distribution (WVD) and the pseudo-Wigner-Ville distribution (PWVD), are applied to analyse the Lamb wave signals detected by an electromagnetic acoustic transducer (EMAT). By extracting the component of the time-frequency spectrum at the excitation frequency, the travel time information can be obtained. In comparison with traditional linear time-frequency analysis methods such as the short-time Fourier transform (STFT), the bilinear method PWVD is more appropriate for extracting travel time and recognizing patterns of Lamb waves.
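
    For readers who want the flavor of the PWVD computation, here is a compact NumPy sketch of its discrete form: a windowed instantaneous autocorrelation followed by an FFT over lag. It is a didactic implementation (one-sided lags, uncalibrated frequency axis), not the authors' code.

      # Didactic discrete pseudo-Wigner-Ville distribution: at each time n, window
      # the instantaneous autocorrelation x[n+m] * conj(x[n-m]) and FFT over lag m.
      import numpy as np
      from scipy.signal import hilbert

      def pwvd(x, lag_window=64):
          z = hilbert(x)                      # analytic signal
          N, L = len(z), lag_window
          W = np.zeros((L, N))
          h = np.hanning(2 * L - 1)           # smoothing window over lag
          for n in range(N):
              r = np.zeros(L, dtype=complex)
              mmax = min(L - 1, n, N - 1 - n)
              for m in range(mmax + 1):
                  r[m] = z[n + m] * np.conj(z[n - m]) * h[L - 1 + m]
              W[:, n] = np.real(np.fft.fft(r, L))
          return W                            # rows ~ frequency bins, cols ~ time

      t = np.linspace(0, 1, 512, endpoint=False)
      signal = np.cos(2 * np.pi * 80 * t)     # toy stand-in for a Lamb wave mode
      tfr = pwvd(signal)
      ridge = tfr.argmax(axis=0)              # dominant bin per time step; energy
                                              # at the excitation bin carries the
                                              # travel-time information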

  7. A Multi-Disciplinary Approach to Remote Sensing through Low-Cost UAVs

    PubMed Central

    Calvario, Gabriela; Sierra, Basilio; Alarcón, Teresa E.; Hernandez, Carmen; Dalmau, Oscar

    2017-01-01

    The use of Unmanned Aerial Vehicles (UAVs) based on remote sensing has generated low cost monitoring, since the data can be acquired quickly and easily. This paper reports the experience related to agave crop analysis with a low cost UAV. The data were processed by traditional photogrammetric flow and data extraction techniques were applied to extract new layers and separate the agave plants from weeds and other elements of the environment. Our proposal combines elements of photogrammetry, computer vision, data mining, geomatics and computer science. This fusion leads to very interesting results in agave control. This paper aims to demonstrate the potential of UAV monitoring in agave crops and the importance of information processing with reliable data flow. PMID:28621740

  8. Sulci segmentation using geometric active contours

    NASA Astrophysics Data System (ADS)

    Torkaman, Mahsa; Zhu, Liangjia; Karasev, Peter; Tannenbaum, Allen

    2017-02-01

    Sulci are groove-like regions lying in the depth of the cerebral cortex between gyri, which, together, give the human and mammalian brain its folded appearance. Sulci play an important role in the structural analysis of the brain, morphometry (i.e., the measurement of brain structures), anatomical labeling and landmark-based registration [1]. Moreover, sulcal morphological changes are related to cortical thickness, whose measurement may provide useful information for studying a variety of psychiatric disorders. Manually extracting sulci requires complying with complex protocols, which makes the procedure both tedious and error prone [2]. In this paper, we describe an automatic procedure, employing geometric active contours, that extracts the sulci. Sulcal boundaries are obtained by minimizing an energy functional whose minimum is attained at the boundary of the given sulci.
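
    The abstract does not spell out the functional, but a standard geodesic active contour energy of the kind typically minimized in this setting is, in LaTeX form (g is an edge-stopping function of the image I; the authors' exact functional may differ):

      E(C) = \int_{0}^{1} g\bigl(\lvert \nabla I(C(s)) \rvert\bigr)\,\lvert C'(s) \rvert \, ds,
      \qquad g(r) = \frac{1}{1 + r^{2}}

    The minimizing curve settles where g is small, i.e., along high-gradient sulcal boundaries.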

  9. Benchmarking infrastructure for mutation text mining

    PubMed Central

    2014-01-01

    Background Experimental research on the automatic extraction of information about mutations from texts is greatly hindered by the lack of consensus evaluation infrastructure for the testing and benchmarking of mutation text mining systems. Results We propose a community-oriented annotation and benchmarking infrastructure to support development, testing, benchmarking, and comparison of mutation text mining systems. The design is based on semantic standards, where RDF is used to represent annotations, an OWL ontology provides an extensible schema for the data and SPARQL is used to compute various performance metrics, so that in many cases no programming is needed to analyze results from a text mining system. While large benchmark corpora for biological entity and relation extraction are focused mostly on genes, proteins, diseases, and species, our benchmarking infrastructure fills the gap for mutation information. The core infrastructure comprises (1) an ontology for modelling annotations, (2) SPARQL queries for computing performance metrics, and (3) a sizeable collection of manually curated documents, that can support mutation grounding and mutation impact extraction experiments. Conclusion We have developed the principal infrastructure for the benchmarking of mutation text mining tasks. The use of RDF and OWL as the representation for corpora ensures extensibility. The infrastructure is suitable for out-of-the-box use in several important scenarios and is ready, in its current state, for initial community adoption. PMID:24568600

  10. Content Analysis of Student Essays after Attending a Problem-Based Learning Course: Facilitating the Development of Critical Thinking and Communication Skills in Japanese Nursing Students.

    PubMed

    Itatani, Tomoya; Nagata, Kyoko; Yanagihara, Kiyoko; Tabuchi, Noriko

    2017-08-22

    The importance of active learning has continued to increase in Japan. The authors conducted classes for first-year students entering the nursing program using the problem-based learning (PBL) method, a form of active learning. Students discussed social topics in classes. The purposes of this study were to analyze the post-class essays and to describe students' logical and critical thinking after attending a PBL course. The authors used Mayring's methodology for qualitative content analysis and text mining. In the descriptions of the skills required to resolve social issues, seven categories were extracted: (recognition of diverse social issues), (attitudes about resolving social issues), (discerning the root cause), (multi-lateral information processing skills), (making a path to resolve issues), (processivity in dealing with issues), and (reflecting). In the descriptions of communication, five categories were extracted: (simple statement), (robust theories), (respecting the opponent), (communication skills), and (attractive presentations). As a result of text mining, the words extracted more than 100 times included "issue," "society," "resolve," "myself," "ability," "opinion," and "information." Education using PBL could be an effective means of improving the skills that students described, and communication in general. Some students felt communication difficulties stemming from characteristics of the Japanese language.

  11. Benchmarking infrastructure for mutation text mining.

    PubMed

    Klein, Artjom; Riazanov, Alexandre; Hindle, Matthew M; Baker, Christopher Jo

    2014-02-25

    Experimental research on the automatic extraction of information about mutations from texts is greatly hindered by the lack of consensus evaluation infrastructure for the testing and benchmarking of mutation text mining systems. We propose a community-oriented annotation and benchmarking infrastructure to support development, testing, benchmarking, and comparison of mutation text mining systems. The design is based on semantic standards, where RDF is used to represent annotations, an OWL ontology provides an extensible schema for the data and SPARQL is used to compute various performance metrics, so that in many cases no programming is needed to analyze results from a text mining system. While large benchmark corpora for biological entity and relation extraction are focused mostly on genes, proteins, diseases, and species, our benchmarking infrastructure fills the gap for mutation information. The core infrastructure comprises (1) an ontology for modelling annotations, (2) SPARQL queries for computing performance metrics, and (3) a sizeable collection of manually curated documents, that can support mutation grounding and mutation impact extraction experiments. We have developed the principal infrastructure for the benchmarking of mutation text mining tasks. The use of RDF and OWL as the representation for corpora ensures extensibility. The infrastructure is suitable for out-of-the-box use in several important scenarios and is ready, in its current state, for initial community adoption.

  12. Object-based vegetation classification with high resolution remote sensing imagery

    NASA Astrophysics Data System (ADS)

    Yu, Qian

    Vegetation species are valuable indicators to understand the earth system. Information from mapping of vegetation species and community distribution at large scales provides important insight for studying the phenological (growth) cycles of vegetation and plant physiology. Such information plays an important role in land process modeling including climate, ecosystem and hydrological models. The rapidly growing remote sensing technology has increased its potential in vegetation species mapping. However, extracting information at a species level is still a challenging research topic. I proposed an effective method for extracting vegetation species distribution from remotely sensed data and investigated some ways for accuracy improvement. The study consists of three phases. Firstly, a statistical analysis was conducted to explore the spatial variation and class separability of vegetation as a function of image scale. This analysis aimed to confirm that high resolution imagery contains the information on spatial vegetation variation and that these species classes are potentially separable. The second phase was a major effort in advancing classification by proposing a method for extracting vegetation species from high spatial resolution remote sensing data. The proposed classification employs an object-based approach that integrates GIS and remote sensing data and explores the usefulness of ancillary information. The whole process includes image segmentation, feature generation and selection, and nearest neighbor classification. The third phase introduces a spatial regression model for evaluating the mapping quality from the above vegetation classification results. The effects of six categories of sample characteristics on the classification uncertainty are examined: topography, sample membership, sample density, spatial composition characteristics, training reliability and sample object features. This evaluation analysis answered several interesting scientific questions, such as (1) whether the sample characteristics affect the classification accuracy, and how significantly if they do; and (2) how much of the variance in classification uncertainty can be explained by the above factors. This research is carried out on a hilly peninsular area in a Mediterranean climate, Point Reyes National Seashore (PRNS) in Northern California. The area mainly consists of a heterogeneous, semi-natural broadleaf and conifer woodland, shrub land, and annual grassland. A detailed list of vegetation alliances is used in this study. Research results from the first phase indicate that vegetation spatial variation as reflected by the average local variance (ALV) remains high between 1 m and 4 m resolution. (Abstract shortened by UMI.)
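
    The average local variance statistic used in the first phase can be computed directly: take the variance within a small moving window at each pixel, average it over the image, and repeat after coarsening the image to each target resolution. A hedged NumPy/SciPy sketch follows, using a 3x3 window and block-mean coarsening; the dissertation's exact window size and resampling scheme may differ.

      # Average local variance (ALV): mean over the image of the variance in a
      # small moving window, tracked across coarsened resolutions.
      import numpy as np
      from scipy.ndimage import uniform_filter

      def average_local_variance(img, window=3):
          mean = uniform_filter(img, window)
          mean_sq = uniform_filter(img * img, window)
          return float(np.mean(mean_sq - mean ** 2))  # E[x^2] - E[x]^2 per window

      def coarsen(img, factor):
          h = (img.shape[0] // factor) * factor
          w = (img.shape[1] // factor) * factor
          return img[:h, :w].reshape(h // factor, factor,
                                     w // factor, factor).mean(axis=(1, 3))

      rng = np.random.default_rng(3)
      band = rng.random((256, 256))            # stand-in for a 1 m resolution band
      for factor in (1, 2, 4):                 # 1 m, 2 m, 4 m in the study's terms
          print(factor, average_local_variance(coarsen(band, factor)))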

  13. An exploratory study of air emissions associated with shale gas development and production in the Barnett Shale.

    PubMed

    Rich, Alisa; Grover, James P; Sattler, Melanie L

    2014-01-01

    Information regarding air emissions from shale gas extraction and production is critically important given that production is occurring in highly urbanized areas across the United States. Objectives of this exploratory study were to collect ambient air samples in residential areas within 61 m (200 feet) of shale gas extraction/production and determine whether a "fingerprint" of chemicals can be associated with shale gas activity. Statistical analyses correlating fingerprint chemicals with methane, equipment, and processes of extraction/production were performed. Ambient air sampling in residential areas of shale gas extraction and production was conducted in six counties in the Dallas/Fort Worth (DFW) Metroplex from 2008 to 2010. The 39 locations tested were identified by clients that requested monitoring. Seven sites were sampled on 2 days (typically months later in another season), and two sites were sampled on 3 days, resulting in 50 sets of monitoring data. Twenty-four-hour passive samples were collected using Summa canisters. Gas chromatography/mass spectrometry analysis was used to identify the organic compounds present. Methane was present in concentrations above laboratory detection limits in 49 out of 50 sampling data sets. Most of the areas investigated had atmospheric methane concentrations considerably higher than reported urban background concentrations (1.8-2.0 ppm(v)). Other chemical constituents were found to be correlated with the presence of methane. A principal components analysis (PCA) identified multivariate patterns of concentrations that potentially constitute signatures of emissions from different phases of operation at natural gas sites. The first factor identified through the PCA proved most informative. Extreme negative values were strongly and statistically associated with the presence of compressors at sample sites. The seven chemicals strongly associated with this factor (o-xylene, ethylbenzene, 1,2,4-trimethylbenzene, m- and p-xylene, 1,3,5-trimethylbenzene, toluene, and benzene) thus constitute a potential fingerprint of emissions associated with compression. Information regarding air emissions from shale gas development and production is critically important given that production is now occurring in highly urbanized areas across the United States. Methane, the primary shale gas constituent, contributes substantially to climate change; other natural gas constituents are known to have adverse health effects. This study goes beyond previous Barnett Shale field studies by encompassing a wider variety of production equipment (wells, tanks, compressors, and separators) and a wider geographical region. The principal components analysis, unique to this study, provides valuable information regarding the ability to anticipate associated shale gas chemical constituents.
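
    The fingerprinting step rests on inspecting the loadings of the first principal component of a samples-by-chemicals concentration matrix. A minimal scikit-learn sketch with made-up data and an arbitrary loading threshold (not the study's analysis pipeline):

      # Sketch: PCA on a samples-by-chemicals concentration matrix; chemicals with
      # large-magnitude loadings on the first component form a candidate fingerprint.
      import numpy as np
      from sklearn.decomposition import PCA

      chemicals = ["benzene", "toluene", "o-xylene", "ethylbenzene"]  # abbreviated
      rng = np.random.default_rng(4)
      X = rng.lognormal(size=(50, len(chemicals)))   # made-up concentrations

      pca = PCA(n_components=2).fit(np.log(X))       # log-transform: common choice
      loadings = pca.components_[0]
      fingerprint = [c for c, w in zip(chemicals, loadings) if abs(w) > 0.4]
      scores = pca.transform(np.log(X))[:, 0]        # extreme scores flag sites,
                                                     # e.g. those with compressors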

  14. Smart Extraction and Analysis System for Clinical Research.

    PubMed

    Afzal, Muhammad; Hussain, Maqbool; Khan, Wajahat Ali; Ali, Taqdir; Jamshed, Arif; Lee, Sungyoung

    2017-05-01

    With the increasing use of electronic health records (EHRs), there is a growing need to expand the utilization of EHR data to support clinical research. The key challenge in achieving this goal is the unavailability of smart systems and methods to overcome the issues of data preparation, structuring, and sharing for smooth clinical research. We developed a robust analysis system called the smart extraction and analysis system (SEAS) that consists of two subsystems: (1) the information extraction system (IES), for extracting information from clinical documents, and (2) the survival analysis system (SAS), for descriptive and predictive analysis to compile survival statistics and predict the future chance of survival. The IES subsystem is based on a novel permutation-based pattern recognition method that extracts information from unstructured clinical documents. Similarly, the SAS subsystem is based on a classification and regression tree (CART)-based prediction model for survival analysis. SEAS was evaluated and validated on a real-world case study of head and neck cancer. The overall information extraction accuracy of the system for semistructured text is 99%, and that for unstructured text is 97%. Furthermore, automated, unstructured information extraction has reduced the average time spent on manual data entry by 75%, without compromising the accuracy of the system. Moreover, around 88% of patients were found in a terminal or dead state for the highest clinical stage of disease (level IV). Similarly, there is an ∼36% probability of a patient being alive if at least one of the lifestyle risk factors was positive. We presented our work on the development of SEAS to replace costly and time-consuming manual methods with smart automatic extraction of information and survival prediction methods. SEAS has reduced the time and energy that human resources previously spent on manual tasks.
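
    As a rough sketch of the CART-based survival prediction pattern (not the SEAS implementation), scikit-learn's decision tree on invented head-and-neck-style features:

      # Sketch of a CART-style survivability classifier over extracted fields.
      import numpy as np
      from sklearn.tree import DecisionTreeClassifier

      rng = np.random.default_rng(5)
      # Hypothetical features: [clinical_stage (1-4), smoker (0/1), alcohol (0/1)]
      X = np.column_stack([rng.integers(1, 5, 300),
                           rng.integers(0, 2, 300),
                           rng.integers(0, 2, 300)])
      y = rng.integers(0, 2, 300)            # made-up alive (1) / dead (0) labels

      cart = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
      p_alive = cart.predict_proba([[4, 1, 1]])[0, 1]  # survival chance, stage IV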

  15. Information extraction and knowledge graph construction from geoscience literature

    NASA Astrophysics Data System (ADS)

    Wang, Chengbin; Ma, Xiaogang; Chen, Jianguo; Chen, Jingwen

    2018-03-01

    Geoscience literature published online is an important part of open data, and brings both challenges and opportunities for data analysis. Compared with studies of numerical geoscience data, there are limited works on information extraction and knowledge discovery from textual geoscience data. This paper presents a workflow and a few empirical case studies for that topic, with a focus on documents written in Chinese. First, we set up a hybrid corpus combining the generic and geology terms from geology dictionaries to train Chinese word segmentation rules of the Conditional Random Fields model. Second, we used the word segmentation rules to parse documents into individual words, and removed the stop-words from the segmentation results to get a corpus constituted of content-words. Third, we used a statistical method to analyze the semantic links between content-words, and we selected the chord and bigram graphs to visualize the content-words and their links as nodes and edges in a knowledge graph, respectively. The resulting graph presents a clear overview of key information in an unstructured document. This study proves the usefulness of the designed workflow, and shows the potential of leveraging natural language processing and knowledge graph technologies for geoscience.
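
    After segmentation and stop-word removal, the bigram graph reduces to counting adjacent content-word pairs and treating words as nodes and counts as edge weights. A compact sketch, with English placeholder tokens standing in for segmented Chinese content-words:

      # Sketch: build a weighted bigram co-occurrence graph from content-words.
      from collections import Counter

      docs = [["magmatic", "intrusion", "copper", "deposit"],
              ["copper", "deposit", "alteration", "zone"]]   # placeholder tokens

      edges = Counter()
      for words in docs:
          for a, b in zip(words, words[1:]):
              edges[(a, b)] += 1

      nodes = {w for pair in edges for w in pair}
      # ('copper', 'deposit') appears twice -> the heaviest edge in the graph
      print(edges.most_common(3))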

  16. A comparison of machine learning techniques for detection of drug target articles.

    PubMed

    Danger, Roxana; Segura-Bedmar, Isabel; Martínez, Paloma; Rosso, Paolo

    2010-12-01

    Important progress in treating diseases has been possible thanks to the identification of drug targets. Drug targets are the molecular structures whose abnormal activity, associated with a disease, can be modified by drugs, improving the health of patients. The pharmaceutical industry needs to prioritize their identification and validation in order to reduce long and costly drug development times. In the last two decades, our knowledge about drugs, their mechanisms of action and drug targets has rapidly increased. Nevertheless, most of this knowledge is hidden in millions of medical articles and textbooks. Extracting knowledge from this large amount of unstructured information is a laborious job, even for human experts. The identification of drug target articles, a crucial first step toward the automatic extraction of information from texts, is the aim of this paper. A comparison of several machine learning techniques has been performed in order to obtain a satisfactory classifier for detecting drug target articles using semantic information from biomedical resources such as the Unified Medical Language System. The best result has been achieved by a Fuzzy Lattice Reasoning classifier, which reaches an ROC area of 98%. Copyright © 2010 Elsevier Inc. All rights reserved.

  17. Gist and verbatim communication concerning medication risks/benefits.

    PubMed

    Blalock, Susan J; DeVellis, Robert F; Chewning, Betty; Sleath, Betsy L; Reyna, Valerie F

    2016-06-01

    To describe the information about medication risks/benefits that rheumatologists provide during patient office visits, the gist that patients with rheumatoid arthritis (RA) extract from the information provided, and the relationship between communication and medication satisfaction. Data from 169 RA patients were analyzed. Each participant had up to three visits audiotaped. Four RA patients coded the audiotapes using a Gist Coding Scheme and research assistants coded the audiotapes using a Verbatim Coding Scheme. When extracting gist from the information discussed during visits, patient coders distinguished between discussion concerning the possibility of medication side effects versus expression of significant safety concerns. Among patients in the best health, nearly 80% reported being totally satisfied with their medications when the physician communicated the gist that the medication was effective, compared to approximately 50% when this gist was not communicated. Study findings underscore the multidimensional nature of medication risk communication and the importance of communication concerning medication effectiveness/need. Health care providers should ensure that patients understand that medication self-management practices can minimize potential risks. Communicating simple gist messages may increase patient satisfaction, especially messages about benefits for well-managed patients. Optimal communication also requires shared understanding of desired therapeutic outcomes. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  18. Ultrahigh pressure extraction of bioactive compounds from plants-A review.

    PubMed

    Xi, Jun

    2017-04-13

    Extraction of bioactive compounds from plants is one of the most important research areas for the pharmaceutical and food industries. Conventional extraction techniques are usually associated with longer extraction times, lower yields, more organic solvent consumption, and poor extraction efficiency. A novel extraction technique, ultrahigh pressure extraction, has been developed for the extraction of bioactive compounds from plants, in order to shorten the extraction time, decrease solvent consumption, increase extraction yields, and enhance the quality of extracts. The mild processing temperature of ultrahigh pressure extraction may lead to enhanced extraction of thermolabile bioactive ingredients. A critical review is conducted to introduce the different aspects of ultrahigh pressure extraction of plant bioactive compounds, including principles and mechanisms, the important parameters influencing its performance, a comparison of ultrahigh pressure extraction with other extraction techniques, and its advantages and disadvantages. Future opportunities for ultrahigh pressure extraction are also discussed.

  19. Optimality in Data Assimilation

    NASA Astrophysics Data System (ADS)

    Nearing, Grey; Yatheendradas, Soni

    2016-04-01

    It costs a lot more to develop and launch an earth-observing satellite than it does to build a data assimilation system. As such, we propose that it is important to understand the efficiency of our assimilation algorithms at extracting information from remote sensing retrievals. To address this, we propose that it is necessary to adopt a completely general definition of "optimality" that explicitly acknowledges all differences between the parametric constraints of our assimilation algorithm (e.g., Gaussianity, partial linearity, Markovian updates) and the true nature of the environmental system and observing system. In fact, it is not only possible, but incredibly straightforward, to measure the optimality (in this more general sense) of any data assimilation algorithm as applied to any intended model or natural system. We measure the information content of remote sensing data conditional on the fact that we are already running a model and then measure the actual information extracted by data assimilation. The ratio of the two is an efficiency metric, and optimality is defined as occurring when the data assimilation algorithm is perfectly efficient at extracting information from the retrievals. We measure the information content of the remote sensing data in a way that, unlike triple collocation, does not rely on any a priori presumed relationship (e.g., linear) between the retrieval and the ground truth, but which, like triple collocation, is insensitive to the spatial mismatch between point-based measurements and grid-scale retrievals. This theory and method are therefore suitable for use with both dense and sparse validation networks. Additionally, the method we propose is *constructive* in the sense that it provides guidance on how to improve data assimilation systems. All data assimilation strategies can be reduced to approximations of Bayes' law, and we measure the fractions of total information loss that are due to individual assumptions or approximations in the prior (i.e., the model uncertainty distribution) and in the likelihood (i.e., the observation operator and observation uncertainty distribution). In this way, we can directly identify the parts of a data assimilation algorithm that contribute most to assimilation error in a way that (unlike traditional DA performance metrics) considers nonlinearity in the model and observation and non-optimality in the fit between filter assumptions and the real system. To reiterate, the method we propose is theoretically rigorous but also dead-to-rights simple, and can be implemented in no more than a few hours by a competent programmer. We use this to show that careful applications of the Ensemble Kalman Filter use substantially less than half of the information contained in remote sensing soil moisture retrievals (LPRM, AMSR-E, SMOS, and SMOPS). We propose that this finding may explain some of the results from several recent large-scale experiments that show lower-than-expected value to assimilating soil moisture retrievals into land surface models forced by high-quality precipitation data. Our results have important implications for the SMAP mission because over half of the SMAP-affiliated "early adopters" plan to use the EnKF as their primary method for extracting information from SMAP retrievals.
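
    Our reading of the proposed efficiency metric can be sketched as follows: discretize the variables and compare the information the retrieval carries about the ground truth with the information actually present in the assimilation analysis. The Python sketch below does this with scikit-learn's discrete mutual information on synthetic data; it is an interpretation of the abstract, not the authors' code.

      # Sketch: information-use efficiency as a ratio of discrete mutual
      # informations, efficiency = I(analysis; truth) / I(retrieval; truth).
      import numpy as np
      from sklearn.metrics import mutual_info_score

      rng = np.random.default_rng(6)
      truth = rng.normal(size=5000)
      retrieval = truth + rng.normal(scale=0.5, size=5000)  # noisy observation
      analysis = truth + rng.normal(scale=0.8, size=5000)   # imperfect assimilation

      def mi(a, b, bins=20):
          """Mutual information between two series after equal-width binning."""
          da = np.digitize(a, np.histogram_bin_edges(a, bins))
          db = np.digitize(b, np.histogram_bin_edges(b, bins))
          return mutual_info_score(da, db)

      efficiency = mi(analysis, truth) / mi(retrieval, truth)
      print(f"fraction of available information extracted: {efficiency:.2f}")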

  20. An analysis of methods used to synthesize evidence and grade recommendations in food-based dietary guidelines

    PubMed Central

    Blake, Phillipa; Durão, Solange; Naude, Celeste E; Bero, Lisa

    2018-01-01

    Abstract Evidence-informed guideline development methods underpinned by systematic reviews ensure that guidelines are transparently developed, free from overt bias, and based on the best available evidence. Only recently has the nutrition field begun using these methods to develop public health nutrition guidelines. Given the importance of following an evidence-informed approach and recent advances in related methods, this study sought to describe the methods used to synthesize evidence, rate evidence quality, grade recommendations, and manage conflicts of interest (COIs) in national food-based dietary guidelines (FBDGs). The Food and Agriculture Organization’s FBDGs database was searched to identify the latest versions of FBDGs published from 2010 onward. Relevant data from 32 FBDGs were extracted, and the findings are presented narratively. This study shows that despite advances in evidence-informed methods for developing dietary guidelines, there are variations and deficiencies in methods used to review evidence, rate evidence quality, and grade recommendations. Dietary guidelines should follow systematic and transparent methods and be informed by the best available evidence, while considering important contextual factors and managing conflicts of interest. PMID:29425371

  1. 30 CFR 702.10 - Information collection.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    30 Mineral Resources 3, 2012-07-01. EXEMPTION FOR COAL EXTRACTION INCIDENTAL TO THE EXTRACTION OF OTHER MINERALS, § 702.10 Information collection. The collections of information contained in §§ 702.11, 702.12, 702.13, 702.15 and 702.18 of this part...

  2. 30 CFR 702.10 - Information collection.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    30 Mineral Resources 3, 2011-07-01. EXEMPTION FOR COAL EXTRACTION INCIDENTAL TO THE EXTRACTION OF OTHER MINERALS, § 702.10 Information collection. The collections of information contained in §§ 702.11, 702.12, 702.13, 702.15 and 702.18 of this part...

  3. 30 CFR 702.10 - Information collection.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    30 Mineral Resources 3, 2010-07-01. EXEMPTION FOR COAL EXTRACTION INCIDENTAL TO THE EXTRACTION OF OTHER MINERALS, § 702.10 Information collection. The collections of information contained in §§ 702.11, 702.12, 702.13, 702.15 and 702.18 of this part...

  4. Integrating Information Extraction Agents into a Tourism Recommender System

    NASA Astrophysics Data System (ADS)

    Esparcia, Sergio; Sánchez-Anguix, Víctor; Argente, Estefanía; García-Fornes, Ana; Julián, Vicente

    Recommender systems face some problems. On the one hand, information needs to be kept up to date, which can be a costly task if not performed automatically. On the other hand, it may be interesting to include third-party services in the recommendation, since they improve its quality. In this paper, we present an add-on for the Social-Net Tourism Recommender System that uses information extraction and natural language processing techniques in order to automatically extract and classify information from the Web. Its goal is to keep the system updated and to obtain information about third-party services that are not offered by service providers inside the system.

  5. Development of pair distribution function analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Vondreele, R.; Billinge, S.; Kwei, G.

    1996-09-01

    This is the final report of a 3-year LDRD project at LANL. It has become more and more evident that structural coherence in the CuO2 planes of high-Tc superconducting materials over some intermediate length scale (nm range) is important to superconductivity. In recent years, the pair distribution function (PDF) analysis of powder diffraction data has been developed for extracting structural information on these length scales. This project sought to expand and develop this technique, use it to analyze neutron powder diffraction data, and apply it to problems. In particular, interest is in the area of high-Tc superconductors, although we planned to extend the study to the closely related perovskite ferroelectric materials and other materials where the local structure affects the properties and where detailed knowledge of the local and intermediate-range structure is important. In addition, we planned to carry out single crystal experiments to look for diffuse scattering. This information augments the information from the PDF.

  6. Functional Neuronal Processing of Human Body Odors

    PubMed Central

    Lundström, Johan N.; Olsson, Mats J.

    2013-01-01

    Body odors carry informational cues of great importance for individuals across a wide range of species, and signals hidden within the body odor cocktail are known to regulate several key behaviors in animals. For a long time, the notion that humans may be among these species has been dismissed. We now know, however, that each human has a unique odor signature that carries information related to his or her genetic makeup, as well as information about personal environmental variables, such as diet and hygiene. Although a substantial number of studies have investigated the behavioral effects of body odors, only a handful have studied central processing. Recent studies have, however, demonstrated that the human brain responds to fear signals hidden within the body odor cocktail, is able to extract kin specific signals, and processes body odors differently than other perceptually similar odors. In this chapter, we provide an overview of the current knowledge of how the human brain processes body odors and the potential importance these signals have for us in everyday life. PMID:20831940

  7. The online community based decision making support system for mitigating biased decision making

    NASA Astrophysics Data System (ADS)

    Kang, Sunghyun; Seo, Jiwan; Choi, Seungjin; Kim, Junho; Han, Sangyong

    2016-10-01

    As Internet technology and social media advance, various information and opinions are shared and distributed through online communities. However, implicit and explicit bias in these opinions can influence the outcomes of community decisions. Compared to the importance of mitigating biased information, research in this field is relatively young and leaves many important issues unaddressed. In this paper we propose a novel approach to mitigating biased opinions using conventional machine learning methods. The proposed method extracts useful features, such as the inclination and sentiment of community members, classifies members based on their previous behavior, and characterizes their propensity. This information about each community and its members is very useful and improves the ability to make an unbiased decision. The proposed method is shown to assist optimal, fair and good decision making while also reducing the influence of implicit bias.

  8. Multiple kernel learning in protein-protein interaction extraction from biomedical literature.

    PubMed

    Yang, Zhihao; Tang, Nan; Zhang, Xiao; Lin, Hongfei; Li, Yanpeng; Yang, Zhiwei

    2011-03-01

    Knowledge about protein-protein interactions (PPIs) unveils the molecular mechanisms of biological processes. The volume and content of published biomedical literature on protein interactions is expanding rapidly, making it increasingly difficult for interaction database administrators, responsible for content input and maintenance, to detect and manually update protein interaction information. The objective of this work is to develop an effective approach to the automatic extraction of PPI information from biomedical literature. We present a weighted multiple kernel learning-based approach for automatic PPI extraction from biomedical literature. The approach combines the following kernels: feature-based, tree, graph and part-of-speech (POS) path. In particular, we extend the shortest path-enclosed tree (SPT) and dependency path tree to capture richer contextual information. Our experimental results show that the combination of SPT and dependency path tree extensions contributes to an improvement in performance of almost 0.7 percentage units in F-score and 2 percentage units in area under the receiver operating characteristics curve (AUC). Combining two or more appropriately weighted individual kernels further improves the performance. In both individual-corpus and cross-corpus evaluations, our combined kernel can achieve state-of-the-art performance with respect to comparable evaluations, with 64.41% F-score and 88.46% AUC on the AImed corpus. Because different kernels calculate the similarity between two sentences from different aspects, our combined kernel can reduce the risk of missing important features. More specifically, we use a weighted linear combination of individual kernels instead of assigning the same weight to each individual kernel, thus allowing each kernel to contribute incrementally to the performance improvement. In addition, the SPT and dependency path tree extensions improve performance by including richer context information. Copyright © 2010 Elsevier B.V. All rights reserved.
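
    To make the kernel-combination idea concrete, the sketch below builds a weighted linear combination of several Gram matrices and trains an SVM on the result. The four random stand-in kernels and the weights are illustrative assumptions, not the feature-based, tree, graph and POS-path kernels or the weights from the paper.

      # Minimal sketch: weighted linear combination of precomputed kernels,
      # fed to an SVM with kernel="precomputed". All matrices are stand-ins.
      import numpy as np
      from sklearn.svm import SVC

      rng = np.random.default_rng(0)
      n = 40
      y = rng.integers(0, 2, n)                  # interaction / no interaction

      def random_psd_kernel(n):
          # Stand-in for a real sentence kernel: any Gram matrix X X^T is PSD.
          x = rng.normal(size=(n, 8))
          return x @ x.T

      K_feat, K_tree, K_graph, K_pos = (random_psd_kernel(n) for _ in range(4))

      weights = [0.4, 0.3, 0.2, 0.1]             # illustrative, not from the paper
      K_combined = sum(w * K for w, K in zip(weights,
                                             (K_feat, K_tree, K_graph, K_pos)))

      clf = SVC(kernel="precomputed").fit(K_combined, y)
      print("training accuracy:", clf.score(K_combined, y))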

  9. Conception of Self-Construction Production Scheduling System

    NASA Astrophysics Data System (ADS)

    Xue, Hai; Zhang, Xuerui; Shimizu, Yasuhiro; Fujimura, Shigeru

    With the rapid innovation of information technology, many production scheduling systems have been developed. However, substantial customization to each individual production environment is required, and a large investment in development and maintenance is therefore indispensable. The direction in which scheduling systems are constructed should therefore change. The final objective of this research is to develop a system that builds itself by automatically extracting scheduling techniques from daily production scheduling work, so that the investment is reduced. This extraction mechanism should apply to various production processes for interoperability. Using the master information extracted by the system, production scheduling operators can be supported in performing scheduling work easily and accurately, without any restriction on scheduling operations. With this extraction mechanism in place, a scheduling system can be introduced without great expense for customization. In this paper, a model for expressing a scheduling problem is first proposed. Then a guideline for extracting the scheduling information and using the extracted information is presented, and some applied functions are proposed based on it.

  10. Development of Extraction Tests for Determining the Bioavailability of Metals in Soil

    DTIC Science & Technology

    2005-06-01

    Report excerpt (abbreviations and contents): COV, coefficient of variance; Cr(III), trivalent chromium; Cr(VI), hexavalent chromium; DCB, dithionite citrate bicarbonate. The findings indicated that bioavailability was a less important issue for chromium than understanding the form of chromium (i.e., trivalent or hexavalent) present. The contents cover in vitro testing for chromium and lead, a summary of in vitro testing for wildlife receptors, references, and supplemental materials.

  11. Evaluation of Three Protein-Extraction Methods for Proteome Analysis of Maize Leaf Midrib, a Compound Tissue Rich in Sclerenchyma Cells.

    PubMed

    Wang, Ning; Wu, Xiaolin; Ku, Lixia; Chen, Yanhui; Wang, Wei

    2016-01-01

    Leaf morphology is closely related to the growth and development of maize (Zea mays L.) plants and final kernel production. As an important part of the maize leaf, the midrib holds leaf blades in the aerial position for maximum sunlight capture. Leaf midribs of adult plants contain substantial sclerenchyma cells with heavily thickened and lignified secondary walls and have a high amount of phenolics, making protein extraction and proteome analysis difficult in leaf midrib tissue. In the present study, three protein-extraction methods that are commonly used in plant proteomics, i.e., phenol extraction, TCA/acetone extraction, and TCA/acetone/phenol extraction, were qualitatively and quantitatively evaluated based on 2DE maps and MS/MS analysis using the midribs of the 10th newly expanded leaves of maize plants. Microscopy revealed the existence of substantial amounts of sclerenchyma underneath maize midrib epidermises (particularly abaxial epidermises). The spot-number order obtained via 2DE mapping was as follows: phenol extraction (655) > TCA/acetone extraction (589) > TCA/acetone/phenol extraction (545). MS/MS analysis identified a total of 17 spots that exhibited 2-fold changes in abundance among the three methods (using phenol extraction as a control). Sixteen of the proteins identified were hydrophilic, with GRAVY values ranging from -0.026 to -0.487. For all three methods, we were able to obtain high-quality protein samples and good 2DE maps for the maize leaf midrib. However, phenol extraction produced a better 2DE map with greater resolution between spots, and TCA/acetone extraction produced higher protein yields. Thus, this paper includes a discussion regarding the possible reasons for differential protein extraction among the three methods. This study provides useful information that can be used to select suitable protein extraction methods for the proteome analysis of recalcitrant plant tissues that are rich in sclerenchyma cells.

  12. The extraction of neural information from the surface EMG for the control of upper-limb prostheses: emerging avenues and challenges.

    PubMed

    Farina, Dario; Jiang, Ning; Rehbaum, Hubertus; Holobar, Aleš; Graimann, Bernhard; Dietl, Hans; Aszmann, Oskar C

    2014-07-01

    Despite not recording directly from neural cells, the surface electromyogram (EMG) signal contains information on the neural drive to muscles, i.e., the spike trains of motor neurons. Using this property, myoelectric control consists of recording EMG signals and extracting control signals from them to command external devices, such as hand prostheses. In commercial control systems, the intensity of muscle activity is extracted from the EMG and used for single degree-of-freedom activation (direct control). Over the past 60 years, academic research has progressed to more sophisticated approaches but, surprisingly, none of these academic achievements has been implemented in commercial systems so far. We provide an overview of both commercial and academic myoelectric control systems and analyze their performance with respect to the characteristics of the ideal myocontroller. Classic and relatively novel academic methods are described, including techniques for simultaneous and proportional control of multiple degrees of freedom and the use of individual motor neuron spike trains for direct control. The conclusion is that the gap between industry and academia is due to the relatively small functional improvement in daily situations that academic systems offer, despite promising laboratory results, at the expense of a substantial reduction in robustness. None of the systems proposed in the literature so far fulfills all the important criteria needed for widespread acceptance by patients, i.e., intuitive, closed-loop, adaptive, and robust real-time (<200 ms delay) control, a minimal number of recording electrodes with low sensitivity to repositioning, minimal training, limited complexity and low consumption. Nonetheless, in recent years, important efforts have been invested in matching these criteria, with relevant steps forward.
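
    The commercial "direct control" scheme described above reduces to estimating EMG intensity and mapping it to a single degree of freedom. Below is a minimal sketch of that pipeline (rectification, low-pass envelope, threshold and gain) on a synthetic signal; the sampling rate, cutoff frequency, threshold and gain are assumed values, not parameters from the paper.

      # Minimal sketch: EMG intensity extraction for single-DOF direct control.
      import numpy as np
      from scipy.signal import butter, filtfilt

      fs = 1000.0                                  # sampling rate, Hz (assumed)
      t = np.arange(0, 2.0, 1.0 / fs)
      emg = np.random.randn(t.size) * (0.2 + 0.8 * (t > 1.0))  # synthetic burst

      rectified = np.abs(emg)                      # full-wave rectification
      b, a = butter(4, 5.0 / (fs / 2.0), btype="low")  # 5 Hz envelope filter
      envelope = filtfilt(b, a, rectified)

      # Proportional single-DOF command in [0, 1], after a small noise threshold.
      threshold, gain = 0.1, 2.0
      command = np.clip(gain * (envelope - threshold), 0.0, 1.0)
      print("mean command before/after burst:",
            command[t < 1.0].mean(), command[t > 1.0].mean())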

  13. Relative extraction ratio (RER) for arsenic and heavy metals in soils and tailings from various metal mines, Korea.

    PubMed

    Son, Hye Ok; Jung, Myung Chae

    2011-01-01

    This study focused on the evaluation of the leaching behaviours of arsenic and heavy metals (Cd, Cu, Ni, Pb and Zn) in soils and tailings contaminated by mining activities. Ten representative mine soils were taken at four representative metal mines in Korea. To evaluate the leaching characteristics of the samples, eight extraction methods were adopted, namely 0.1 M HCl, 0.5 M HCl, 1.0 M HCl, 3.0 M HCl, the Korean Standard Leaching Procedure for waste materials (KSLP), the Synthetic Precipitation Leaching Procedure (SPLP), the Toxicity Characteristic Leaching Procedure (TCLP) and aqua regia extraction (AR). In order to compare element concentrations across extraction methods, relative extraction ratios (RERs, %), defined as the element concentration extracted by the individual leaching method divided by that extracted by aqua regia based on USEPA method 3050B, were calculated. Although RER values vary with sample type and element, they increase with increasing ionic strength of the extracting solution. Thus, the RERs for arsenic and heavy metals in the samples increased in the order KSLP < SPLP < TCLP < 0.1 M HCl < 0.5 M HCl < 1.0 M HCl < 3.0 M HCl. For the same extraction method, the RER values for Cd and Zn were relatively higher than those for As, Cu, Ni and Pb. This may be due to differences in the geochemical behaviour of each element, namely the high solubility of Cd and Zn and the low solubility of As, Cu, Ni and Pb in the surface environment. Thus, the extraction results can give important information on the degree and extent of arsenic and heavy metal dispersion in the surface environment.
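
    The RER calculation itself is simple; the sketch below applies the definition above (individual leaching concentration divided by the aqua regia concentration, times 100) to made-up concentrations. All numbers are illustrative, not data from the paper.

      # Minimal sketch of the RER definition with illustrative concentrations.
      conc_aqua_regia = {"Cd": 2.1, "Zn": 310.0, "As": 45.0, "Pb": 120.0}  # mg/kg
      conc_leach = {"Cd": 1.3, "Zn": 170.0, "As": 4.5, "Pb": 18.0}  # hypothetical
                                                                    # 0.1 M HCl data

      rer = {el: 100.0 * conc_leach[el] / conc_aqua_regia[el]
             for el in conc_aqua_regia}
      for el, value in sorted(rer.items(), key=lambda kv: -kv[1]):
          print(f"{el}: RER = {value:.1f} %")
      # Consistent with the paper's observation: Cd and Zn show higher RERs
      # than As and Pb for the same extraction.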

  14. Ovicidal effect of the methanolic extract of ginger (Zingiber officinale) on Fasciola hepatica eggs: an in vitro study.

    PubMed

    Moazeni, Mohammad; Khademolhoseini, Ali Asghar

    2016-09-01

    Fasciolosis is of considerable economic and public health importance worldwide. Little information is available on the ovicidal effects of anthelminthic drugs. The use of ovicidal anthelmintics can be effective in disease control. In this study, the effectiveness of the methanolic extract of ginger (Zingiber officinale) against the eggs of Fasciola hepatica is investigated. Fasciola hepatica eggs were obtained from the gall bladders of naturally infected sheep and kept at 4 °C until use. The eggs were exposed to varying concentrations of ginger extract (1, 5, 10, 25 and 50 mg/mL) for 24, 48 and 72 h. To investigate the effect of the ginger extracts on miracidial formation, the treated eggs were incubated at 28 °C for 14 days. The results indicated that F. hepatica eggs are susceptible to the methanolic extract of Z. officinale. The ovicidal effect of ginger extract at a concentration of 1 mg/mL with 24, 48 and 72 h treatment time was 46.08, 51.53 and 69.09 % respectively (compared with 22.70 % for the control group). The ovicidal effect of ginger extract at a concentration of 5 mg/mL after 24 h was 98.84 %. One hundred percent ovicidal efficacy was obtained through application of ginger extract at concentrations of 5 and 10 mg/mL with a 48 and 24 h treatment time respectively. The in vitro ovicidal effect of the methanolic extract of Z. officinale was satisfactory in this study; however, the in vivo efficacy of this extract remains to be investigated. To the best of our knowledge, this is the first report of an ovicidal effect of Z. officinale against F. hepatica eggs.

  15. Shape information from glucose curves: Functional data analysis compared with traditional summary measures

    PubMed Central

    2013-01-01

    Background: Plasma glucose levels are important measures in medical care and research, and are often obtained from oral glucose tolerance tests (OGTT) with repeated measurements over 2–3 hours. It is common practice to use simple summary measures of OGTT curves. However, different OGTT curves can yield similar summary measures, and information of physiological or clinical interest may be lost. Our main aim was to extract information inherent in the shape of OGTT glucose curves, compare it with the information from simple summary measures, and explore the clinical usefulness of such information. Methods: OGTTs with five glucose measurements over two hours were recorded for 974 healthy pregnant women in their first trimester. For each woman, the five measurements were transformed into smooth OGTT glucose curves by functional data analysis (FDA), a collection of statistical methods developed specifically to analyse curve data. The essential modes of temporal variation between OGTT glucose curves were extracted by functional principal component analysis. The resultant functional principal component (FPC) scores were compared with commonly used simple summary measures: fasting and two-hour (2-h) values, area under the curve (AUC) and simple shape indices (2-h minus 90-min values, or 90-min minus 60-min values). The clinical usefulness of FDA was explored by regression analyses of glucose tolerance later in pregnancy. Results: Over 99% of the variation between individually fitted curves was expressed in the first three FPCs, interpreted physiologically as “general level” (FPC1), “time to peak” (FPC2) and “oscillations” (FPC3). FPC1 scores correlated strongly with AUC (r=0.999), but less with the other simple summary measures (−0.42≤r≤0.79). FPC2 scores gave shape information not captured by simple summary measures (−0.12≤r≤0.40). FPC2 scores, but neither FPC1 scores nor the simple summary measures, discriminated between women who did and did not develop gestational diabetes later in pregnancy. Conclusions: FDA of OGTT glucose curves in early pregnancy extracted shape information that was not identified by commonly used simple summary measures. This information discriminated between women with and without gestational diabetes later in pregnancy. PMID:23327294

  16. Shape information from glucose curves: functional data analysis compared with traditional summary measures.

    PubMed

    Frøslie, Kathrine Frey; Røislien, Jo; Qvigstad, Elisabeth; Godang, Kristin; Bollerslev, Jens; Voldner, Nanna; Henriksen, Tore; Veierød, Marit B

    2013-01-17

    Plasma glucose levels are important measures in medical care and research, and are often obtained from oral glucose tolerance tests (OGTT) with repeated measurements over 2-3 hours. It is common practice to use simple summary measures of OGTT curves. However, different OGTT curves can yield similar summary measures, and information of physiological or clinical interest may be lost. Our main aim was to extract information inherent in the shape of OGTT glucose curves, compare it with the information from simple summary measures, and explore the clinical usefulness of such information. OGTTs with five glucose measurements over two hours were recorded for 974 healthy pregnant women in their first trimester. For each woman, the five measurements were transformed into smooth OGTT glucose curves by functional data analysis (FDA), a collection of statistical methods developed specifically to analyse curve data. The essential modes of temporal variation between OGTT glucose curves were extracted by functional principal component analysis. The resultant functional principal component (FPC) scores were compared with commonly used simple summary measures: fasting and two-hour (2-h) values, area under the curve (AUC) and simple shape indices (2-h minus 90-min values, or 90-min minus 60-min values). The clinical usefulness of FDA was explored by regression analyses of glucose tolerance later in pregnancy. Over 99% of the variation between individually fitted curves was expressed in the first three FPCs, interpreted physiologically as "general level" (FPC1), "time to peak" (FPC2) and "oscillations" (FPC3). FPC1 scores correlated strongly with AUC (r=0.999), but less with the other simple summary measures (-0.42≤r≤0.79). FPC2 scores gave shape information not captured by simple summary measures (-0.12≤r≤0.40). FPC2 scores, but neither FPC1 scores nor the simple summary measures, discriminated between women who did and did not develop gestational diabetes later in pregnancy. FDA of OGTT glucose curves in early pregnancy extracted shape information that was not identified by commonly used simple summary measures. This information discriminated between women with and without gestational diabetes later in pregnancy.
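
    For readers unfamiliar with functional principal component analysis, the sketch below shows its core computation: decompose centered curves by SVD, so each subject gets a score on each mode of variation. It operates on synthetic curves rather than OGTT data, and skips the smoothing step that real FDA would apply to the five raw measurements.

      # Minimal FPCA sketch (not the authors' code): SVD of centered curves.
      import numpy as np

      rng = np.random.default_rng(1)
      n_subjects, n_grid = 200, 50
      t = np.linspace(0, 2, n_grid)                    # hours

      # Synthetic "glucose curves": shared shape plus level and timing variation.
      level = rng.normal(5.5, 0.6, n_subjects)[:, None]
      shift = rng.normal(0.0, 0.1, n_subjects)[:, None]
      curves = level + 2.0 * np.exp(-((t - 0.5 - shift) ** 2) / 0.1)

      mean_curve = curves.mean(axis=0)
      centered = curves - mean_curve
      u, s, vt = np.linalg.svd(centered, full_matrices=False)

      explained = s**2 / np.sum(s**2)
      fpc_scores = u * s                               # one score per subject per FPC
      print("variance explained by first three FPCs:", explained[:3].sum())
      # vt[0] and vt[1] are the first two modes of variation, analogous to the
      # "general level" and "time to peak" components reported above.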

  17. Collective Influence of Multiple Spreaders Evaluated by Tracing Real Information Flow in Large-Scale Social Networks

    NASA Astrophysics Data System (ADS)

    Teng, Xian; Pei, Sen; Morone, Flaviano; Makse, Hernán A.

    2016-10-01

    Identifying the most influential spreaders that maximize information flow is a central question in network theory. Recently, a scalable method called “Collective Influence (CI)” has been put forward for collective influence maximization. In contrast to heuristic methods that evaluate nodes’ significance separately, the CI method inspects the collective influence of multiple spreaders. Although CI applies to the influence maximization problem in the percolation model, it is still important to examine its efficacy in realistic information spreading. Here, we examine real-world information flow in various social and scientific platforms including the American Physical Society, Facebook, Twitter and LiveJournal. Since empirical data cannot be directly mapped to ideal multi-source spreading, we leverage the behavioral patterns of users extracted from the data to construct “virtual” information spreading processes. Our results demonstrate that the set of spreaders selected by CI can induce a larger scale of information propagation. Moreover, local measures such as the number of connections or citations are not necessarily deterministic factors of nodes’ importance in realistic information spreading. This result has significance for ranking scientists in scientific networks like the APS, where the commonly used number of citations can be a poor indicator of the collective influence of authors in the community.

  18. Collective Influence of Multiple Spreaders Evaluated by Tracing Real Information Flow in Large-Scale Social Networks

    PubMed Central

    Teng, Xian; Pei, Sen; Morone, Flaviano; Makse, Hernán A.

    2016-01-01

    Identifying the most influential spreaders that maximize information flow is a central question in network theory. Recently, a scalable method called “Collective Influence (CI)” has been put forward for collective influence maximization. In contrast to heuristic methods that evaluate nodes’ significance separately, the CI method inspects the collective influence of multiple spreaders. Although CI applies to the influence maximization problem in the percolation model, it is still important to examine its efficacy in realistic information spreading. Here, we examine real-world information flow in various social and scientific platforms including the American Physical Society, Facebook, Twitter and LiveJournal. Since empirical data cannot be directly mapped to ideal multi-source spreading, we leverage the behavioral patterns of users extracted from the data to construct “virtual” information spreading processes. Our results demonstrate that the set of spreaders selected by CI can induce a larger scale of information propagation. Moreover, local measures such as the number of connections or citations are not necessarily deterministic factors of nodes’ importance in realistic information spreading. This result has significance for ranking scientists in scientific networks like the APS, where the commonly used number of citations can be a poor indicator of the collective influence of authors in the community. PMID:27782207
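
    Assuming the published definition of Collective Influence, the sketch below computes CI at radius L as (k_i - 1) times the sum of (k_j - 1) over the nodes on the frontier of the ball of radius L around node i, using networkx on a synthetic scale-free graph. The full adaptive algorithm (remove the top node, update, repeat) is only noted in a comment.

      # Minimal sketch of the CI score (assumed standard definition).
      import networkx as nx

      def collective_influence(g, node, radius=2):
          lengths = nx.single_source_shortest_path_length(g, node, cutoff=radius)
          frontier = [j for j, d in lengths.items() if d == radius]
          return (g.degree(node) - 1) * sum(g.degree(j) - 1 for j in frontier)

      g = nx.barabasi_albert_graph(1000, 3, seed=0)    # synthetic scale-free graph
      ci = {i: collective_influence(g, i, radius=2) for i in g}
      top = sorted(ci, key=ci.get, reverse=True)[:5]
      print("top-5 spreader candidates by CI:", top)
      # The full algorithm removes the highest-CI node, recomputes CI values,
      # and repeats, selecting spreaders collectively rather than one by one.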

  19. The Extraction of 3D Shape from Texture and Shading in the Human Brain

    PubMed Central

    Georgieva, Svetlana S.; Todd, James T.; Peeters, Ronald

    2008-01-01

    We used functional magnetic resonance imaging to investigate the human cortical areas involved in processing 3-dimensional (3D) shape from texture (SfT) and shading. The stimuli included monocular images of randomly shaped 3D surfaces and a wide variety of 2-dimensional (2D) controls. The results of both passive and active experiments reveal that the extraction of 3D SfT involves the bilateral caudal inferior temporal gyrus (caudal ITG), lateral occipital sulcus (LOS) and several bilateral sites along the intraparietal sulcus. These areas are largely consistent with those involved in the processing of 3D shape from motion and stereo. The experiments also demonstrate, however, that the analysis of 3D shape from shading is primarily restricted to the caudal ITG areas. Additional results from psychophysical experiments reveal that this difference in neuronal substrate cannot be explained by a difference in strength between the 2 cues. These results underscore the importance of the posterior part of the lateral occipital complex for the extraction of visual 3D shape information from all depth cues, and they suggest strongly that the importance of shading is diminished relative to other cues for the analysis of 3D shape in parietal regions. PMID:18281304

  20. Acquiring 3-D information about thick objects from differential interference contrast images using texture extraction

    NASA Astrophysics Data System (ADS)

    Sierra, Heidy; Brooks, Dana; Dimarzio, Charles

    2010-07-01

    The extraction of 3-D morphological information about thick objects is explored in this work. We extract this information from 3-D differential interference contrast (DIC) images by applying a texture detection method. Texture extraction methods have been successfully used in different applications to study biological samples. A 3-D texture image is obtained by applying a local entropy-based texture extraction method. As an example, we use this method to detect regions of mouse blastocyst embryos, which are used in assisted reproduction techniques such as in vitro fertilization. The results demonstrate the potential of texture detection methods to improve the morphological analysis of thick samples, which is relevant to many biomedical and biological studies. Fluorescence and optical quadrature microscope phase images are used for validation.
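
    A minimal sketch of the local entropy-based texture measure mentioned above, applied to a single 2-D slice with scikit-image: regions with richer texture yield higher local entropy. The window radius and the synthetic image are assumptions for illustration, not the authors' settings.

      # Minimal sketch: local entropy as a per-slice texture measure.
      import numpy as np
      from skimage.filters.rank import entropy
      from skimage.morphology import disk
      from skimage.util import img_as_ubyte

      rng = np.random.default_rng(0)
      slice_2d = rng.random((128, 128))
      slice_2d[32:96, 32:96] += 0.5 * rng.random((64, 64))  # "textured" region
      slice_2d = img_as_ubyte(slice_2d / slice_2d.max())

      texture = entropy(slice_2d, disk(5))   # entropy in a 5-pixel-radius window
      print("mean entropy inside/outside region:",
            texture[32:96, 32:96].mean(), texture[:32, :].mean())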

  1. Study on identifying deciduous forest by the method of feature space transformation

    NASA Astrophysics Data System (ADS)

    Zhang, Xuexia; Wu, Pengfei

    2009-10-01

    Thematic information extraction from remotely sensed data remains one of the persistent challenges facing remote sensing science, and many remote sensing scientists have devoted themselves to this research domain. Methods of thematic information extraction fall into two kinds, visual interpretation and computer interpretation, and their development is moving toward intelligent and modular approaches. This paper develops an intelligent extraction method based on feature space transformation for deciduous forest thematic information in the Changping district of Beijing. China-Brazil Earth Resources Satellite images received in 2005 are used to extract the deciduous forest coverage area by the feature space transformation method and a linear spectral decomposition method, and the remote sensing result is similar to the woodland resource census data published by the Chinese forestry bureau in 2004.
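
    The linear spectral decomposition step can be sketched as constrained unmixing: each pixel spectrum is expressed as a non-negative mixture of endmember spectra. The endmember values below are hypothetical stand-ins, not CBERS measurements.

      # Minimal sketch: linear spectral unmixing via non-negative least squares.
      import numpy as np
      from scipy.optimize import nnls

      # Endmember matrix: one column per land-cover class, one row per band
      # (hypothetical deciduous-forest, soil and urban endmembers).
      endmembers = np.array([
          [0.05, 0.20, 0.30],    # band 1 reflectance
          [0.08, 0.25, 0.28],    # band 2
          [0.45, 0.30, 0.26],    # band 3 (NIR: high for vegetation)
          [0.20, 0.28, 0.25],    # band 4
      ])

      pixel = 0.6 * endmembers[:, 0] + 0.4 * endmembers[:, 1]  # mixed pixel
      fractions, residual = nnls(endmembers, pixel)
      fractions /= fractions.sum()                  # sum-to-one constraint
      print("estimated fractions (forest, soil, urban):", fractions.round(3))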

  2. The segmentation of bones in pelvic CT images based on extraction of key frames.

    PubMed

    Yu, Hui; Wang, Haijun; Shi, Yao; Xu, Ke; Yu, Xuyao; Cao, Yuzhen

    2018-05-22

    Bone segmentation is important in computed tomography (CT) imaging of the pelvis, as it assists physicians in the early diagnosis of pelvic injury, in planning operations, and in evaluating the effects of surgical treatment. This study developed a new algorithm for the accurate, fast, and efficient segmentation of the pelvis. The proposed method consists of two main parts: the extraction of key frames and the segmentation of pelvic CT images. Key frames were extracted based on pixel difference, mutual information and the normalized correlation coefficient. In the pelvis segmentation phase, skeleton extraction from CT images and a marker-based watershed algorithm were combined to segment the pelvis. To meet the requirements of clinical application, a physician's judgment is needed; the proposed methodology is therefore semi-automated. In this paper, 5 sets of CT data were used to test the overlapping area, and 15 CT images were used to determine the average deviation distance. The average overlapping area of the 5 sets was greater than 94%, and the minimum average deviation distance was approximately 0.58 pixels. In addition, the key-frame extraction efficiency and the running time of the proposed method were evaluated on 20 sets of CT data. For each set, approximately 13% of the images were selected as key frames, and the average processing time was approximately 2 min (the time for manual marking not included). The proposed method achieves accurate, fast, and efficient segmentation of pelvic CT image sequences. The segmentation results not only provide an important reference for early diagnosis and decisions regarding surgical procedures, they also offer more accurate data for medical image registration, recognition and 3D reconstruction.
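
    Of the three key-frame criteria, mutual information is the least obvious to implement, so here is a minimal sketch (an assumed implementation, not the authors' code) computing MI between two slices from their joint histogram; a sharp drop in MI between neighbouring slices signals a transition worth keeping as a key frame.

      # Minimal sketch: mutual information between two CT slices.
      import numpy as np

      def mutual_information(a, b, bins=64):
          joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
          pxy = joint / joint.sum()
          px = pxy.sum(axis=1, keepdims=True)
          py = pxy.sum(axis=0, keepdims=True)
          nonzero = pxy > 0
          return float(np.sum(pxy[nonzero] *
                              np.log(pxy[nonzero] / (px @ py)[nonzero])))

      rng = np.random.default_rng(0)
      slice_a = rng.random((256, 256))
      slice_b = slice_a + 0.05 * rng.random((256, 256))  # near-identical neighbour
      slice_c = rng.random((256, 256))                   # unrelated slice
      print("MI(similar):", mutual_information(slice_a, slice_b))
      print("MI(different):", mutual_information(slice_a, slice_c))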

  3. Digital representation of oil and natural gas well pad scars in southwest Wyoming

    USGS Publications Warehouse

    Garman, Steven L.; McBeth, Jamie L.

    2014-01-01

    The recent proliferation of oil and natural gas energy development in southwest Wyoming has stimulated the need to understand wildlife responses to this development. Central to many wildlife assessments is the use of geospatial methods that rely on digital representation of energy infrastructure. Surface disturbance from the well pad scars associated with oil and natural gas extraction has been an important but unavailable infrastructure layer. To provide a digital baseline of this surface disturbance, we extracted visible oil and gas well pad scars from 1-meter National Agriculture Imagery Program (NAIP) imagery acquired in 2009 for a 7.7 million-hectare region of southwest Wyoming. Scars include the pad area where wellheads, pumps, and storage facilities reside, and the surrounding area that was scraped and denuded of vegetation during the establishment of the pad. Scars containing tanks, compressors, stored oil- and gas-related equipment, and produced-water ponds were also collected on occasion. Our extraction method was a two-step process starting with automated extraction followed by manual inspection and clean-up. We used available well-point information to guide manual clean-up and to derive estimates of the year of origin and duration of activity on a pad scar. We also derived estimates of the proportion of non-vegetated area on a scar using a Normalized Difference Vegetation Index derived from 1-meter NAIP imagery. We extracted 16,973 pad scars, of which 15,318 were oil and gas well pads. Digital representation of pad scars, along with time-stamps of activity and estimates of non-vegetated area, provides important baseline (circa 2009) data for assessments of wildlife responses, land-use trends, and disturbance-mediated pattern assessments.
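
    The non-vegetated-area estimate can be sketched with the standard NDVI formula, NDVI = (NIR - Red) / (NIR + Red), counting pixels below a threshold as bare ground. The threshold and the synthetic bands below are illustrative assumptions; the study's actual threshold is not stated here.

      # Minimal sketch: NDVI-based non-vegetated fraction of a pad scar.
      import numpy as np

      def non_vegetated_fraction(nir, red, ndvi_threshold=0.2):
          ndvi = (nir - red) / np.clip(nir + red, 1e-6, None)  # avoid divide-by-zero
          return float((ndvi < ndvi_threshold).mean())

      rng = np.random.default_rng(0)
      nir = rng.uniform(0.1, 0.6, (100, 100))   # synthetic NAIP-like bands
      red = rng.uniform(0.05, 0.4, (100, 100))
      print("non-vegetated fraction of scar:", non_vegetated_fraction(nir, red))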

  4. [Study on infrared spectrum change of Ganoderma lucidum and its extracts].

    PubMed

    Chen, Zao-Xin; Xu, Yong-Qun; Chen, Xiao-Kang; Huang, Dong-Lan; Lu, Wen-Guan

    2013-05-01

    From the determination of the infrared spectra of four substances (original Ganoderma lucidum and its water extract, 95% ethanol extract and petroleum ether extract), it was found that the infrared spectrum carries systematic chemical information and basically reflects the distribution of each component of the analyte. Ganoderma lucidum and its extracts can be distinguished according to the ratio of the absorption peak areas at 3 416-3 279, 1 541 and 723 cm(-1) to that at 2 935-2 852 cm(-1). A method of calculating the information entropy of a sample set from Euclidean distances is proposed, and the relationship between the information entropy and the amount of chemical information carried by the sample set is discussed; the authors conclude that the sample set of original Ganoderma lucidum carries the most abundant chemical information. In hierarchical cluster analysis of the four sample sets, the infrared spectrum set of original Ganoderma lucidum shows the best clustering of Ganoderma atrum, cyan Ganoderma, Ganoderma multiplicatum and Ganoderma lucidum. The results show that the infrared spectrum carries chemical information about the material structure and is closely related to the chemical composition of the system. The higher the information entropy, the richer the chemical information and the greater the benefit for pattern recognition. This study provides guidance for the construction of sample sets in pattern recognition.
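
    A minimal sketch of the hierarchical cluster analysis step with Euclidean distance, using SciPy on synthetic "spectra"; the linkage method and the two-group structure are assumptions for illustration, not the authors' settings.

      # Minimal sketch: hierarchical clustering of spectra by Euclidean distance.
      import numpy as np
      from scipy.cluster.hierarchy import linkage, fcluster
      from scipy.spatial.distance import pdist

      rng = np.random.default_rng(0)
      # Synthetic "spectra": two groups with different baseline absorption.
      spectra = np.vstack([rng.normal(0.3, 0.02, (5, 200)),
                           rng.normal(0.6, 0.02, (5, 200))])

      distances = pdist(spectra, metric="euclidean")
      tree = linkage(distances, method="average")
      labels = fcluster(tree, t=2, criterion="maxclust")
      print("cluster assignments:", labels)   # the two groups should separate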

  5. Research on Remote Sensing Geological Information Extraction Based on Object Oriented Classification

    NASA Astrophysics Data System (ADS)

    Gao, Hui

    2018-04-01

    Northern Tibet belongs to the sub-cold arid climate zone of the plateau. It is rarely visited by people, and geological working conditions are very poor. However, the stratum exposures are good and human interference is very small. Therefore, research on the automatic classification and extraction of remote sensing geological information there is of particular significance and has good application prospects. Based on object-oriented classification in northern Tibet, using Worldview2 high-resolution remote sensing data combined with tectonic information and image enhancement, the lithological spectral features, shape features, spatial locations and topological relations of various geological information are mined. By setting thresholds in a hierarchical classification, eight kinds of geological information were classified and extracted. Accuracy analysis against existing geological maps shows that the overall accuracy reached 87.8561 %, indicating that the object-oriented classification method is effective and feasible for this study area and provides a new idea for the automatic extraction of remote sensing geological information.

  6. Electrochemical Probing through a Redox Capacitor To Acquire Chemical Information on Biothiols

    PubMed Central

    2016-01-01

    The acquisition of chemical information is a critical need for medical diagnostics, food/environmental monitoring, and national security. Here, we report an electrochemical information processing approach that integrates (i) complex electrical inputs/outputs, (ii) mediators to transduce the electrical I/O into redox signals that can actively probe the chemical environment, and (iii) a redox capacitor that manipulates signals for information extraction. We demonstrate the capabilities of this chemical information processing strategy using biothiols because of the emerging importance of these molecules in medicine and because their distinct chemical properties allow evaluation of hypothesis-driven information probing. We show that input sequences can be tailored to probe for chemical information both qualitatively (step inputs probe for thiol-specific signatures) and quantitatively. Specifically, we observed picomolar limits of detection and linear responses to concentrations over 5 orders of magnitude (1 pM–0.1 μM). This approach allows the capabilities of signal processing to be extended for rapid, robust, and on-site analysis of chemical information. PMID:27385047

  7. Electrochemical Probing through a Redox Capacitor To Acquire Chemical Information on Biothiols.

    PubMed

    Liu, Zhengchun; Liu, Yi; Kim, Eunkyoung; Bentley, William E; Payne, Gregory F

    2016-07-19

    The acquisition of chemical information is a critical need for medical diagnostics, food/environmental monitoring, and national security. Here, we report an electrochemical information processing approach that integrates (i) complex electrical inputs/outputs, (ii) mediators to transduce the electrical I/O into redox signals that can actively probe the chemical environment, and (iii) a redox capacitor that manipulates signals for information extraction. We demonstrate the capabilities of this chemical information processing strategy using biothiols because of the emerging importance of these molecules in medicine and because their distinct chemical properties allow evaluation of hypothesis-driven information probing. We show that input sequences can be tailored to probe for chemical information both qualitatively (step inputs probe for thiol-specific signatures) and quantitatively. Specifically, we observed picomolar limits of detection and linear responses to concentrations over 5 orders of magnitude (1 pM-0.1 μM). This approach allows the capabilities of signal processing to be extended for rapid, robust, and on-site analysis of chemical information.

  8. Optimisation of supercritical carbon dioxide extraction of essential oil of flowers of tea (Camellia sinensis L.) plants and its antioxidative activity.

    PubMed

    Chen, Zhenchun; Mei, Xin; Jin, Yuxia; Kim, Eun-Hye; Yang, Ziyin; Tu, Youying

    2014-01-30

    To extract natural volatile compounds from tea (Camellia sinensis) flowers without thermal degradation and residue of organic solvents, supercritical fluid extraction (SFE) using carbon dioxide was employed to prepare essential oil of tea flowers in the present study. Four important parameters--pressure, temperature, static extraction time, and dynamic extraction time--were selected as independent variables in the SFE. The optimum extraction conditions were the pressure of 30 MPa, temperature of 50°C, static time of 10 min, and dynamic time of 90 min. Based on gas chromatography-mass spectrometry analysis, 59 compounds, including alkanes (45.4%), esters (10.5%), ketones (7.1%), aldehydes (3.7%), terpenes (3.7%), acids (2.1%), alcohols (1.6%), ethers (1.3%) and others (10.3%) were identified in the essential oil of tea flowers. Moreover, the essential oil of tea flowers showed relatively stronger DPPH radical scavenging activity than essential oils of geranium and peppermint, although its antioxidative activity was weaker than those of essential oil of clove, ascorbic acid, tert-butylhydroquinone, and butylated hydroxyanisole. Essential oil of tea flowers using SFE contained many types of volatile compounds and showed considerable DPPH scavenging activity. The information will contribute to the future application of tea flowers as raw materials in health-care food and food flavour industries. © 2013 Society of Chemical Industry.

  9. Multispectra CWT-based algorithm (MCWT) in mass spectra for peak extraction.

    PubMed

    Hsueh, Huey-Miin; Kuo, Hsun-Chih; Tsai, Chen-An

    2008-01-01

    An important objective in mass spectrometry (MS) is to identify a set of biomarkers that can be used to distinguish patients between distinct treatments (or conditions) from tens or hundreds of spectra. A common two-step approach involving peak extraction and quantification is employed to identify the features of scientific interest. The selected features are then used for further investigation to understand the underlying biological mechanisms of individual proteins, or for the development of genomic biomarkers for early diagnosis. However, the use of inadequate or ineffective peak detection and peak alignment algorithms in the peak extraction step may lead to a high rate of false positives. It is also crucial to reduce the false positive rate when detecting biomarkers from tens or hundreds of spectra. Here a new procedure is introduced for feature extraction in mass spectrometry data that extends the continuous wavelet transform-based (CWT-based) algorithm to multiple spectra. The proposed multispectra CWT-based algorithm (MCWT) not only performs peak detection for multiple spectra but also carries out peak alignment at the same time. The authors' MCWT algorithm constructs a reference, which integrates information from multiple raw spectra, for feature extraction. The algorithm is applied to a SELDI-TOF mass spectra data set provided by CAMDA 2006 with known polypeptide m/z positions. This new approach is easy to implement and outperforms the existing peak extraction method from the Bioconductor PROcess package.
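
    SciPy ships a single-spectrum CWT peak detector, which is the building block the MCWT approach extends to multiple spectra. The sketch below shows only that single-spectrum step on a synthetic spectrum; the multi-spectra reference construction and peak alignment are not reproduced.

      # Minimal sketch: single-spectrum CWT-based peak detection with SciPy.
      import numpy as np
      from scipy.signal import find_peaks_cwt

      x = np.linspace(0, 100, 2000)                    # pseudo m/z axis
      spectrum = (np.exp(-(x - 30) ** 2 / 2.0) +
                  0.6 * np.exp(-(x - 70) ** 2 / 4.0) +
                  0.05 * np.random.default_rng(0).standard_normal(x.size))

      peak_indices = find_peaks_cwt(spectrum, widths=np.arange(5, 40))
      print("detected peaks near m/z:", x[peak_indices].round(1))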

  10. Biomedical question answering using semantic relations.

    PubMed

    Hristovski, Dimitar; Dinevski, Dejan; Kastrin, Andrej; Rindflesch, Thomas C

    2015-01-16

    The proliferation of the scientific literature in the field of biomedicine makes it difficult to keep abreast of current knowledge, even for domain experts. While general Web search engines and specialized information retrieval (IR) systems have made important strides in recent decades, the problem of accurate knowledge extraction from the biomedical literature is far from solved. Classical IR systems usually return a list of documents that have to be read by the user to extract relevant information. This tedious and time-consuming work can be lessened with automatic Question Answering (QA) systems, which aim to provide users with direct and precise answers to their questions. In this work we propose a novel methodology for QA based on semantic relations extracted from the biomedical literature. We extracted semantic relations with the SemRep natural language processing system from 122,421,765 sentences, which came from 21,014,382 MEDLINE citations (i.e., the complete MEDLINE distribution up to the end of 2012). A total of 58,879,300 semantic relation instances were extracted and organized in a relational database. The QA process is implemented as a search in this database, which is accessed through a Web-based application, called SemBT (available at http://sembt.mf.uni-lj.si ). We conducted an extensive evaluation of the proposed methodology in order to estimate the accuracy of extracting a particular semantic relation from a particular sentence. Evaluation was performed by 80 domain experts. In total 7,510 semantic relation instances belonging to 2,675 distinct relations were evaluated 12,083 times. The instances were evaluated as correct 8,228 times (68%). In this work we propose an innovative methodology for biomedical QA. The system is implemented as a Web-based application that is able to provide precise answers to a wide range of questions. A typical question is answered within a few seconds. The tool has some extensions that make it especially useful for interpretation of DNA microarray results.

  11. Automatic River Network Extraction from LIDAR Data

    NASA Astrophysics Data System (ADS)

    Maderal, E. N.; Valcarcel, N.; Delgado, J.; Sevilla, C.; Ojeda, J. C.

    2016-06-01

    The National Geographic Institute of Spain (IGN-ES) has launched a new production system for automatic river network extraction for the Geospatial Reference Information (GRI) within the hydrography theme. The goal is to obtain an accurate and updated river network, extracted as automatically as possible. For this, IGN-ES has full LiDAR coverage of the whole Spanish territory with a density of 0.5 points per square meter. To implement this work, the technical feasibility was validated and a methodology was developed to automate each production phase: generation of hydrological terrain models with a 2 meter grid size, and river network extraction combining hydrographic criteria (topographic network) and hydrological criteria (flow accumulation river network); finally, production was launched. The key points of this work have been managing a big data environment of more than 160,000 LiDAR data files, including the infrastructure to store (up to 40 Tb between results and intermediate files) and process them using local virtualization and Amazon Web Services (AWS), which allowed this automatic production to be completed within 6 months; the stability of the software (TerraScan-TerraSolid, GlobalMapper-Blue Marble, FME-Safe, ArcGIS-Esri); and, finally, the management of human resources. The result of this production has been an accurate automatic river network extraction for the whole country, with a significant improvement in the altimetric component of the 3D linear vector. This article presents the technical feasibility, the production methodology, the automatic river network extraction production and its advantages over traditional vector extraction systems.

  12. Data Content and Exchange in General Practice: a Review

    PubMed Central

    Kalankesh, Leila R; Farahbakhsh, Mostafa; Rahimi, Niloofar

    2014-01-01

    Background: Efficient communication of data is an inevitable requirement for general practice. Any issue in data content and its exchange among GPs and other related entities hinders the continuity of patient care. Methods: The literature search for this review was conducted on three electronic databases: Medline, Scopus and Science Direct. Results: Through reviewing the papers, we extracted information on GP data content, use cases of GP information exchange, its participants, tools and methods, and incentives and barriers. Conclusion: Considering the importance of data content and exchange for GP systems, more research needs to be conducted toward providing a comprehensive framework for data content and exchange in GP systems. PMID:25648317

  13. Protocols for the Investigation of Information Processing in Human Assessment of Fundamental Movement Skills.

    PubMed

    Ward, Brodie J; Thornton, Ashleigh; Lay, Brendan; Rosenberg, Michael

    2017-01-01

    Fundamental movement skill (FMS) assessment remains an important tool in classifying individuals' level of FMS proficiency. The collection of FMS performances for assessment and monitoring has remained unchanged over the last few decades, but new motion capture technologies offer opportunities to automate this process. To achieve this, a greater understanding of the human process of movement skill assessment is required. The authors present the rationale and protocols of a project in which they aim to investigate the visual search patterns and information extraction employed by human assessors during FMS assessment, as well as the implementation of the Kinect system for FMS capture.

  14. Extraction of Graph Information Based on Image Contents and the Use of Ontology

    ERIC Educational Resources Information Center

    Kanjanawattana, Sarunya; Kimura, Masaomi

    2016-01-01

    A graph is an effective form of data representation used to summarize complex information. Explicit information such as the relationship between the X- and Y-axes can be easily extracted from a graph by applying human intelligence. However, implicit knowledge such as information obtained from other related concepts in an ontology also resides in…

  15. Ontology-Based Information Extraction for Business Intelligence

    NASA Astrophysics Data System (ADS)

    Saggion, Horacio; Funk, Adam; Maynard, Diana; Bontcheva, Kalina

    Business Intelligence (BI) requires the acquisition and aggregation of key pieces of knowledge from multiple sources in order to provide valuable information to customers or feed statistical BI models and tools. The massive amount of information available to business analysts makes information extraction and other natural language processing tools key enablers for the acquisition and use of that semantic information. We describe the application of ontology-based extraction and merging in the context of a practical e-business application for the EU MUSING Project where the goal is to gather international company intelligence and country/region information. The results of our experiments so far are very promising and we are now in the process of building a complete end-to-end solution.

  16. Citizen-sensor-networks to confront government decision-makers: Two lessons from the Netherlands.

    PubMed

    Carton, Linda; Ache, Peter

    2017-07-01

    This paper presents one emerging social-technical innovation: the evolution of citizen-sensor-networks, in which citizens organize themselves from the 'bottom up' for the sake of confronting governance officials with measured information about environmental qualities. We have observed how citizen-sensor-networks have been initiated in the Netherlands in cases where official government monitoring and business organizations leave gaps. The formed citizen-sensor-networks collect information about issues that affect the local community's quality of living. In particular, two community initiatives are described in which the sensed environmental information, on noise pollution and gas-extraction-induced earthquakes respectively, is published through networked geographic information methods. Both community initiatives pioneered an approach that comprises the combined setting-up of sensor data flows, real-time map portals and community organization. Two particular cases are analyzed to trace the emergence and network operation of such 'networked geo-information tools' in practice: (1) the Groningen earthquake monitor, and (2) the Airplane Monitor Schiphol. In both cases, environmental 'externalities' of spatial-economic activities play an important role, having economic dimensions of national importance (e.g. gas extraction and national airport development) while simultaneously affecting the regional community with environmental consequences. The monitoring systems analyzed in this paper are established bottom-up, by citizens for citizens, to serve as 'information power' in dialogue with government institutions. The goal of this paper is to gain insight into how these citizen-sensor-networks come about: how the idea for establishing a sensor network originated; how their value gets recognized and adopted in the overall 'system of governance'; to what extent they bring countervailing power against vested interests and established discourses to the table and influence power-laden conflicts over environmental pressures; and whether or not they achieve (some form of) institutionalization and, ultimately, policy change. We find that the studied citizen-sensor-networks gain strength by uniting efforts and activities in crowdsourcing data, providing factual, 'objectivized data' or 'evidence' of the situation 'on the ground' on a matter of local community-wide concern. By filling an information need of the local community, a process of 'collective sense-making' combined with citizen empowerment could grow, which influenced societal discourse and challenged prevailing truth-claims of public institutions. In both cases similar, 'competing' web-portals were developed in response, by the gas-extraction company and the airport respectively. But with the citizen-sensor-networks alongside, we conclude there is a shift in the power balance between government and affected communities, as the government no longer has an information monopoly on environmental measurements. Copyright © 2017 Elsevier Ltd. All rights reserved.

  17. Gemas: issues from the comparison of aqua regia and X-ray fluorescence results

    NASA Astrophysics Data System (ADS)

    Dinelli, Enrico; Birke, Manfred; Reimann, Clemens; Demetriades, Alecos; DeVivo, Benedetto; Flight, Dee; Ladenberger, Anna; Albanese, Stefano; Cicchella, Domenico; Lima, Annamaria

    2014-05-01

    The comparison of analytical results from aqua regia (AR) and X-ray fluorescence spectroscopy (XRF) can provide information on soil processes controlling element distribution in soil. The GEMAS (GEochemical Mapping of Agricultural and grazing land Soils) agricultural soil database, consisting of 2 x ca. 2100 samples spread evenly over 33 European countries, is used for this comparison. Analyses for the same suite of elements and parameters were carried out in the same laboratory under strict quality control procedures. Sample preparation was conducted at the laboratory of the Geological Survey of the Slovak Republic, AR analyses were carried out at ACME Labs, and XRF analyses at the Federal Institute for Geosciences and Natural Resources, Germany. Element recovery by AR varies widely, ranging from <1% (e.g. Na, Zr) to >80% (e.g. Mn, P, Co). Recovery is controlled by the mineralogy of the parent material, but geographic and climatic factors and the weathering history of the soils are also important. Nonetheless, even the very-low-recovery elements show wide ranges of variation and spatial patterns that are affected by factors other than soil parent material. For many elements soil pH has a clear influence on AR extractability: under acidic soil conditions almost all elements tend to be leached and their extractability is generally low; it progressively increases with increasing pH and is highest in the pH range 7-8. The clay content of the soil is critical: for almost all elements, extractability increases with increasing clay abundance. Other factors, such as the organic matter content of the soil and the occurrence of Fe and Mn, are important for certain elements or in selected areas. This work illustrates that there are significant differences in the extractability of elements from soils and addresses important influencing factors related to soil properties, geology and climate.

  18. CMS-2 Reverse Engineering and ENCORE/MODEL Integration

    DTIC Science & Technology

    1992-05-01

    Automated extraction of design information from an existing software system written in CMS-2 can be used to document that system as-built. The extracted information is provided by a commercially available CASE tool; information describing the software system design is automatically extracted... the displays in Figures 1, 2, and 3...

  19. DTIC (Defense Technical Information Center) Model Action Plan for Incorporating DGIS (DOD Gateway Information System) Capabilities.

    DTIC Science & Technology

    1986-05-01

    The DoD Gateway Information System (DGIS) is being developed to provide the DoD community with a modern tool for accessing diverse databases and extracting information products from them. Since the Defense Technical Information... adjunct to DROLS results. The study, therefore, centered around obtaining background information inside the unit on that unit's users who request DROLS

  20. Disaster Emergency Rapid Assessment Based on Remote Sensing and Background Data

    NASA Astrophysics Data System (ADS)

    Han, X.; Wu, J.

    2018-04-01

    The period from the onset of a disaster to the stabilization of conditions is an important stage of disaster development. In addition to collecting and reporting information on disaster situations, remote sensing images from satellites and drones and monitoring results from disaster-stricken areas should be obtained. Fusing multi-source background data, such as population, geography and topography, with remote sensing monitoring information in geographic information system analysis makes it possible to assess disaster information quickly and objectively. According to the characteristics of different hazards, models and methods driven by the requirements of rapid assessment tasks are tested and screened. Based on remote sensing images, the features of exposures quickly determine disaster-affected areas and intensity levels; key disaster information, such as affected hospitals and schools as well as cultivated land and crops, is extracted to support decision making during emergency response, together with visual assessment results.

  1. Rain/No-Rain Identification from Bispectral Satellite Information using Deep Neural Networks

    NASA Astrophysics Data System (ADS)

    Tao, Y.

    2016-12-01

    Satellite-based precipitation estimation products have the advantage of high resolution and global coverage. However, they still suffer from insufficient accuracy. To accurately estimate precipitation from satellite data, two aspects are most important: sufficient precipitation information in the satellite observations and proper methodologies to extract that information effectively. This study applies state-of-the-art machine learning methodologies to bispectral satellite information for rain/no-rain detection. Specifically, we use deep neural networks to extract features from the infrared and water vapor channels and connect them to precipitation identification. To evaluate the effectiveness of the methodology, we first apply it to the infrared data only (Model DL-IR only), the most commonly used input for satellite-based precipitation estimation. We then incorporate water vapor data (Model DL-IR + WV) to further improve prediction performance. The radar Stage IV dataset is used as the ground measurement for parameter calibration. The operational product, Precipitation Estimation from Remotely Sensed Information Using Artificial Neural Networks Cloud Classification System (PERSIANN-CCS), is used as a reference to compare the performance of both models in both winter and summer seasons. The experiments show significant improvement for both models in precipitation identification. The overall performance gains in the Critical Success Index (CSI) are 21.60% and 43.66% over the verification periods for Model DL-IR only and Model DL-IR + WV, respectively, compared to PERSIANN-CCS. Moreover, specific case studies show that the water vapor channel information and the deep neural networks effectively help recover a large number of missing precipitation pixels under warm clouds while reducing false alarms under cold clouds.
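
    The verification metric is worth making explicit: the Critical Success Index is hits divided by the sum of hits, misses and false alarms. A minimal sketch on synthetic rain/no-rain masks (the rain frequency and agreement rate are arbitrary illustrative values):

      # Minimal sketch: Critical Success Index for binary rain detection.
      import numpy as np

      def critical_success_index(pred, truth):
          hits = np.sum((pred == 1) & (truth == 1))
          misses = np.sum((pred == 0) & (truth == 1))
          false_alarms = np.sum((pred == 1) & (truth == 0))
          return hits / float(hits + misses + false_alarms)

      rng = np.random.default_rng(0)
      truth = (rng.random(10000) < 0.2).astype(int)               # radar rain mask
      pred = np.where(rng.random(10000) < 0.9, truth, 1 - truth)  # 90 % agreement
      print("CSI:", round(critical_success_index(pred, truth), 3))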

  2. Versatile and efficient pore network extraction method using marker-based watershed segmentation

    NASA Astrophysics Data System (ADS)

    Gostick, Jeff T.

    2017-08-01

    Obtaining structural information from tomographic images of porous materials is a critical component of porous media research. Extracting pore networks is particularly valuable since it enables pore network modeling simulations which can be useful for a host of tasks from predicting transport properties to simulating performance of entire devices. This work reports an efficient algorithm for extracting networks using only standard image analysis techniques. The algorithm was applied to several standard porous materials ranging from sandstone to fibrous mats, and in all cases agreed very well with established or known values for pore and throat sizes, capillary pressure curves, and permeability. In the case of sandstone, the present algorithm was compared to the network obtained using the current state-of-the-art algorithm, and very good agreement was achieved. Most importantly, the network extracted from an image of fibrous media correctly predicted the anisotropic permeability tensor, demonstrating the critical ability to detect key structural features. The highly efficient algorithm allows extraction on fairly large images of 500³ voxels in just over 200 s. The ability for one algorithm to match materials as varied as sandstone with 20% porosity and fibrous media with 75% porosity is a significant advancement. The source code for this algorithm is provided.
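
    A generic marker-based watershed segmentation of a pore space can be sketched with standard image-analysis calls, in the spirit of the approach described (distance transform, local maxima as markers, watershed). This 2-D toy version is a sketch under those assumptions, not the paper's full extraction algorithm.

      # Minimal sketch: watershed segmentation of a 2D binary pore space.
      import numpy as np
      from scipy import ndimage as ndi
      from skimage.feature import peak_local_max
      from skimage.segmentation import watershed

      rng = np.random.default_rng(0)
      pores = ndi.gaussian_filter(rng.random((200, 200)), 4) > 0.5  # synthetic pores

      distance = ndi.distance_transform_edt(pores)
      labeled_pores, _ = ndi.label(pores)
      peak_coords = peak_local_max(distance, min_distance=10, labels=labeled_pores)
      markers = np.zeros_like(distance, dtype=int)
      markers[tuple(peak_coords.T)] = np.arange(1, len(peak_coords) + 1)

      regions = watershed(-distance, markers, mask=pores)   # one label per pore
      print("number of pore regions:", regions.max())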

  3. Research on vibration signal analysis and extraction method of gear local fault

    NASA Astrophysics Data System (ADS)

    Yang, X. F.; Wang, D.; Ma, J. F.; Shao, W.

    2018-02-01

    Gears are the main connecting and power-transmission parts in mechanical equipment. If a fault occurs, it directly affects the running state of the whole machine and may even endanger personal safety. Studying the extraction of gear fault signals and gear fault diagnosis therefore has important theoretical significance and practical value. In this paper, taking the local gear fault as the research object, we set up a vibration model of the gear fault mechanism, derive the vibration mechanism of the local gear fault, and analyze the similarities and differences between the vibration signals of healthy gears and gears with local faults. In the MATLAB environment, the wavelet transform algorithm is used to denoise the fault signal, and the Hilbert transform is used to demodulate the fault vibration signal. The results show that the method can denoise strongly noisy mechanical vibration signals and extract local fault feature information from the fault vibration signal.
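
    The denoise-then-demodulate chain described above translates directly into a few lines of Python (PyWavelets and SciPy standing in for the MATLAB implementation; the wavelet, level, threshold rule, and test signal are illustrative):

      import numpy as np
      import pywt
      from scipy.signal import hilbert

      fs = 12_000                                    # hypothetical sampling rate, Hz
      t = np.arange(0, 1, 1 / fs)
      # Stand-in for a gear signal: mesh tone amplitude-modulated at 20 Hz + noise.
      x = np.sin(2 * np.pi * 600 * t) * (1 + 0.5 * np.sin(2 * np.pi * 20 * t))
      x += 0.8 * np.random.randn(x.size)

      # Wavelet denoising: soft-threshold the detail coefficients.
      coeffs = pywt.wavedec(x, "db8", level=5)
      sigma = np.median(np.abs(coeffs[-1])) / 0.6745           # robust noise estimate
      thr = sigma * np.sqrt(2 * np.log(x.size))                # universal threshold
      coeffs[1:] = [pywt.threshold(c, thr, mode="soft") for c in coeffs[1:]]
      x_dn = pywt.waverec(coeffs, "db8")[: x.size]

      # Hilbert demodulation: the envelope spectrum exposes the fault frequency.
      envelope = np.abs(hilbert(x_dn))
      env_spectrum = np.abs(np.fft.rfft(envelope - envelope.mean()))
      freqs = np.fft.rfftfreq(envelope.size, 1 / fs)           # peak expected near 20 Hz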

  4. Accurate airway centerline extraction based on topological thinning using graph-theoretic analysis.

    PubMed

    Bian, Zijian; Tan, Wenjun; Yang, Jinzhu; Liu, Jiren; Zhao, Dazhe

    2014-01-01

    The quantitative analysis of the airway tree is of critical importance in the CT-based diagnosis and treatment of common pulmonary diseases. The extraction of the airway centerline is a precursor to identifying the airway's hierarchical structure, measuring geometrical parameters, and guiding visualized detection. Traditional methods suffer from extra branches and circles caused by incomplete segmentation results, which induce false analysis in applications. This paper proposes an automatic and robust centerline extraction method for the airway tree. First, the centerline is located based on the topological thinning method: border voxels are deleted symmetrically and iteratively so as to preserve topological and geometrical properties. Second, structural information is generated using graph-theoretic analysis. Then, inaccurate circles are removed with a distance weighting strategy, and extra branches are pruned according to clinical anatomic knowledge. The centerline region without false appendices is eventually determined after the described phases. Experimental results show that the proposed method identifies more than 96% of branches, keeps consistency across different cases, and achieves a superior circle-free structure and centrality.
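
    A sketch of the thinning and graph bookkeeping in Python, with scikit-image's standard 3-D skeletonization standing in for the paper's symmetric border-deletion scheme (the mask filename and neighbor rules are illustrative):

      import numpy as np
      from scipy import ndimage as ndi
      from skimage.morphology import skeletonize

      airway = np.load("airway_mask.npy").astype(bool)   # hypothetical binary 3-D mask
      centerline = skeletonize(airway)                   # iterative topological thinning;
                                                         # recent scikit-image handles 3-D
                                                         # (older versions: skeletonize_3d)

      # Graph-theoretic analysis: count the 26-neighbors of each centerline voxel.
      # Endpoints have one neighbor, bifurcations three or more; short spurs and
      # cycles found in this graph are then pruned as described above.
      kernel = np.ones((3, 3, 3), dtype=int)
      nbrs = ndi.convolve(centerline.astype(int), kernel) - 1   # neighbor counts
      endpoints = centerline & (nbrs == 1)
      branchpoints = centerline & (nbrs >= 3)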

  5. Space Subdivision in Indoor Mobile Laser Scanning Point Clouds Based on Scanline Analysis.

    PubMed

    Zheng, Yi; Peter, Michael; Zhong, Ruofei; Oude Elberink, Sander; Zhou, Quan

    2018-06-05

    Indoor space subdivision is an important aspect of scene analysis that provides essential information for many applications, such as indoor navigation and evacuation route planning. Until now, most proposed scene understanding algorithms have been based on whole point clouds, which has led to complicated operations, high computational loads and low processing speed. This paper presents novel methods to efficiently extract the location of openings (e.g., doors and windows) and to subdivide space by analyzing scanlines. An opening detection method is demonstrated that analyses the local geometric regularity in scanlines to refine the extracted openings. Moreover, a space subdivision method based on the extracted openings and the scanning system trajectory is described. Finally, the opening detection and space subdivision results are saved as point cloud labels for further investigation. The method has been tested on a real dataset collected by ZEB-REVO. The experimental results validate the completeness and correctness of the proposed method for different indoor environments and scanning paths.

  6. Uniform Local Binary Pattern Based Texture-Edge Feature for 3D Human Behavior Recognition.

    PubMed

    Ming, Yue; Wang, Guangchao; Fan, Chunxiao

    2015-01-01

    With the rapid development of 3D somatosensory technology, human behavior recognition has become an important research field. Human behavior feature analysis has evolved from traditional 2D features to 3D features. In order to improve the performance of human activity recognition, a human behavior recognition method is proposed that is based on hybrid texture-edge local pattern coding for feature extraction and on the integration of RGB and depth video information. The paper mainly focuses on background subtraction in RGB and depth video sequences of behaviors, extraction and integration of history images of the behavior outlines, and feature extraction and classification. The new method of 3D human behavior recognition achieves rapid and efficient recognition of behavior videos. Extensive experiments show that the proposed method has faster speed and a higher recognition rate. The recognition method is robust to different environmental colors, lighting and other factors. Meanwhile, the hybrid texture-edge uniform local binary pattern feature can be used in most 3D behavior recognition tasks.
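
    The uniform LBP descriptor at the core of the feature is easy to reproduce with scikit-image (a sketch of the texture half only; the paper's hybrid texture-edge coding and RGB-D fusion are omitted):

      import numpy as np
      from skimage.feature import local_binary_pattern

      def ulbp_histogram(gray, P=8, R=1.0):
          """Histogram of uniform LBP codes for one (motion-history) image."""
          codes = local_binary_pattern(gray, P, R, method="uniform")
          n_bins = P + 2                    # P+1 uniform patterns + 1 'other' bin
          hist, _ = np.histogram(codes, bins=n_bins, range=(0, n_bins), density=True)
          return hist

      # Stand-in for a behavior history image extracted from an RGB-D sequence.
      frame = (np.random.rand(120, 160) * 255).astype(np.uint8)
      feature = ulbp_histogram(frame)       # 10-dimensional texture feature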

  7. Lung lobe segmentation based on statistical atlas and graph cuts

    NASA Astrophysics Data System (ADS)

    Nimura, Yukitaka; Kitasaka, Takayuki; Honma, Hirotoshi; Takabatake, Hirotsugu; Mori, Masaki; Natori, Hiroshi; Mori, Kensaku

    2012-03-01

    This paper presents a novel method that extracts lung lobes by utilizing a probability atlas and multilabel graph cuts. Information about pulmonary structures plays a very important role in deciding the treatment strategy and in surgical planning. The human lungs are divided into five anatomical regions, the lung lobes. Precise segmentation and recognition of the lung lobes are indispensable tasks in computer-aided diagnosis and computer-aided surgery systems. Many methods for lung lobe segmentation have been proposed. However, these methods target only normal cases, and therefore cannot extract the lung lobes in abnormal cases, such as COPD cases. To extract lung lobes in abnormal cases, this paper proposes a lung lobe segmentation method based on a probability atlas of lobe location and multilabel graph cuts. The process consists of three components: normalization based on the patient's physique, probability atlas generation, and segmentation based on graph cuts. We applied this method to six cases of chest CT images, including COPD cases. The Jaccard index was 79.1%.

  8. Quantifying hydrogen-deuterium exchange of meteoritic dicarboxylic acids during aqueous extraction

    NASA Astrophysics Data System (ADS)

    Fuller, M.; Huang, Y.

    2003-03-01

    Hydrogen isotope ratios of organic compounds in carbonaceous chondrites provide critical information about their origins and evolutionary history. However, because many of these compounds are obtained by aqueous extraction, the degree of hydrogen-deuterium (H/D) exchange that occurs during the process needs to be quantitatively evaluated. This study uses compound-specific hydrogen isotopic analysis to quantify the H/D exchange during aqueous extraction. Three common meteoritic dicarboxylic acids (succinic, glutaric, and 2-methyl glutaric acids) were refluxed under conditions simulating the extraction process. Changes in δD values of the dicarboxylic acids were measured following the reflux experiments. A pseudo-first-order rate law was used to model the H/D exchange rates, which were then used to calculate the isotope exchange resulting from aqueous extraction. The degree of H/D exchange varies as a result of differences in molecular structure, the alkalinity of the extraction solution and the presence/absence of meteorite powder. However, our model indicates that succinic, glutaric, and 2-methyl glutaric acids with a δD of 1800‰ would experience isotope changes of 38‰, 10‰, and 6‰, respectively, during the extraction process. Therefore, the overall change in δD values of the dicarboxylic acids during the aqueous extraction process is negligible. We also demonstrate that H/D exchange occurs on the chiral α-carbon in 2-methyl glutaric acid. The results suggest that the racemic mixture of 2-methyl glutaric acid in the Tagish Lake meteorite could result from post-synthesis aqueous alteration. The approach employed in this study can also be used to quantify H/D exchange for other important meteoritic compounds such as amino acids.
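
    For reference, the pseudo-first-order rate law invoked above can be written as follows (a minimal sketch with conventional symbols; the paper's exact parameterization may differ). The measured δD relaxes exponentially toward its exchange-equilibrium value:

      \frac{\mathrm{d}\,\delta D}{\mathrm{d}t} = -k\left(\delta D - \delta D_{\mathrm{eq}}\right)
      \quad\Longrightarrow\quad
      \delta D(t) = \delta D_{\mathrm{eq}} + \left(\delta D_{0} - \delta D_{\mathrm{eq}}\right) e^{-kt}

    so the shift accumulated during a reflux of duration $t$ is $(\delta D_{0} - \delta D_{\mathrm{eq}})(1 - e^{-kt})$, which stays small when $kt \ll 1$, consistent with the negligible changes reported above.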

  9. Extracting Social Information from Chemosensory Cues: Consideration of Several Scenarios and Their Functional Implications

    PubMed Central

    Ben-Shaul, Yoram

    2015-01-01

    Across all sensory modalities, stimuli can vary along multiple dimensions. Efficient extraction of information requires sensitivity to those stimulus dimensions that provide behaviorally relevant information. To derive social information from chemosensory cues, sensory systems must embed information about the relationships between behaviorally relevant traits of individuals and the distributions of the chemical cues that are informative about these traits. In simple cases, the mere presence of one particular compound is sufficient to guide appropriate behavior. However, more generally, chemosensory information is conveyed via relative levels of multiple chemical cues, in non-trivial ways. The computations and networks needed to derive information from multi-molecule stimuli are distinct from those required by single molecule cues. Our current knowledge about how socially relevant information is encoded by chemical blends, and how it is extracted by chemosensory systems is very limited. This manuscript explores several scenarios and the neuronal computations required to identify them. PMID:26635515

  10. Proteomic Analysis of Hair Follicles

    NASA Astrophysics Data System (ADS)

    Ishioka, Noriaki; Terada, Masahiro; Yamada, Shin; Seki, Masaya; Takahashi, Rika; Majima, Hideyuki J.; Higashibata, Akira; Mukai, Chiaki

    2013-02-01

    Hair root cells actively divide in a hair follicle, and they sensitively reflect physical conditions. By analyzing human hair, we can assess stress levels on the human body and metabolic conditions caused by the microgravity environment and cosmic radiation. The Japan Aerospace Exploration Agency (JAXA) has initiated a human research study to investigate the effects of long-term space flight on gene expression and mineral metabolism by analyzing hair samples of astronauts who stayed in the International Space Station (ISS) for 6 months. During long-term flights, the physiological effects on astronauts include muscle atrophy and bone calcium loss. Furthermore, radiation and psychological effects are important issues to consider. Therefore, an understanding of the effects of the space environment is important for developing countermeasures against the effects experienced by astronauts. In this experiment, we identify functionally important target proteins by integrating transcriptome, mineral metabolism and proteome profiles from human hair. To compare the protein expression data with the gene expression data from hair roots, we developed a protein processing method. We extracted protein from five strands of hair using ISOGEN reagents. The extracted proteins were then analyzed by LC-MS/MS. The collected profiles will give us useful physiological information for examining the effects of space flight.

  11. Auto-Context Convolutional Neural Network (Auto-Net) for Brain Extraction in Magnetic Resonance Imaging.

    PubMed

    Mohseni Salehi, Seyed Sadegh; Erdogmus, Deniz; Gholipour, Ali

    2017-11-01

    Brain extraction or whole brain segmentation is an important first step in many of the neuroimage analysis pipelines. The accuracy and the robustness of brain extraction, therefore, are crucial for the accuracy of the entire brain analysis process. The state-of-the-art brain extraction techniques rely heavily on the accuracy of alignment or registration between brain atlases and query brain anatomy, and/or make assumptions about the image geometry, and therefore have limited success when these assumptions do not hold or image registration fails. With the aim of designing an accurate, learning-based, geometry-independent, and registration-free brain extraction tool, in this paper, we present a technique based on an auto-context convolutional neural network (CNN), in which intrinsic local and global image features are learned through 2-D patches of different window sizes. We consider two different architectures: 1) a voxelwise approach based on three parallel 2-D convolutional pathways for three different directions (axial, coronal, and sagittal) that implicitly learn 3-D image information without the need for computationally expensive 3-D convolutions and 2) a fully convolutional network based on the U-net architecture. Posterior probability maps generated by the networks are used iteratively as context information along with the original image patches to learn the local shape and connectedness of the brain to extract it from non-brain tissue. The brain extraction results we have obtained from our CNNs are superior to the recently reported results in the literature on two publicly available benchmark data sets, namely, LPBA40 and OASIS, in which we obtained the Dice overlap coefficients of 97.73% and 97.62%, respectively. Significant improvement was achieved via our auto-context algorithm. Furthermore, we evaluated the performance of our algorithm in the challenging problem of extracting arbitrarily oriented fetal brains in reconstructed fetal brain magnetic resonance imaging (MRI) data sets. In this application, our voxelwise auto-context CNN performed much better than the other methods (Dice coefficient: 95.97%), where the other methods performed poorly due to the non-standard orientation and geometry of the fetal brain in MRI. Through training, our method can provide accurate brain extraction in challenging applications. This, in turn, may reduce the problems associated with image registration in segmentation tasks.
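
    The auto-context loop itself is compact; the following PyTorch sketch shows the feedback of posterior maps as an extra input channel (the placeholder network, sizes, and two-step schedule are assumptions, not the paper's U-net or multi-pathway models):

      import torch
      import torch.nn as nn

      def make_cnn(in_channels):
          # Placeholder 2-D network; the paper uses multi-scale patches / U-net.
          return nn.Sequential(
              nn.Conv2d(in_channels, 16, 3, padding=1), nn.ReLU(),
              nn.Conv2d(16, 1, 3, padding=1),
          )

      steps = [make_cnn(1 + 1), make_cnn(1 + 1)]   # image channel + context channel
      image = torch.randn(1, 1, 128, 128)          # dummy MRI slice
      context = torch.full_like(image, 0.5)        # uninformative initial posterior

      for net in steps:                            # auto-context iterations
          logits = net(torch.cat([image, context], dim=1))
          context = torch.sigmoid(logits)          # posterior becomes next context

      brain_mask = context > 0.5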

  12. The reporting of theoretical health risks by the media: Canadian newspaper reporting of potential blood transmission of Creutzfeldt-Jakob disease

    PubMed Central

    Wilson, Kumanan; Code, Catherine; Dornan, Christopher; Ahmad, Nadya; Hébert, Paul; Graham, Ian

    2004-01-01

    Background The media play an important role at the interface of science and policy by communicating scientific information to the public and policy makers. In issues of theoretical risk, in which there is scientific uncertainty, the media's role as disseminators of information is particularly important due to the potential to influence public perception of the severity of the risk. In this article we describe how the Canadian print media reported the theoretical risk of blood transmission of Creutzfeldt-Jakob disease (CJD). Methods We searched 3 newspaper databases for articles published by 6 major Canadian daily newspapers between January 1990 and December 1999. We identified all articles relating to blood transmission of CJD. In duplicate, we extracted information from the articles and entered the information into a qualitative software program. We compared the observations obtained from this content analysis with information obtained from a previous policy analysis examining the Canadian blood system's decision-making concerning the potential transfusion transmission of CJD. Results Our search identified 245 relevant articles. We observed that newspapers in one instance accelerated a policy decision, which had important resource and health implications, by communicating information on risk to the public. We also observed that newspapers primarily relied upon expert opinion (47 articles) as opposed to published medical evidence (28 articles) when communicating risk information. Journalists we interviewed described the challenges of balancing their responsibility to raise awareness of potential health threats with not unnecessarily arousing fear amongst the public. Conclusions Based on our findings, we recommend that journalists report information from both expert opinion sources and from published studies when communicating information on risk. We also recommend that researchers work more closely with journalists to assist them in identifying and appraising relevant scientific information on risk. PMID:14706119

  13. Extracting information from the text of electronic medical records to improve case detection: a systematic review

    PubMed Central

    Carroll, John A; Smith, Helen E; Scott, Donia; Cassell, Jackie A

    2016-01-01

    Background Electronic medical records (EMRs) are revolutionizing health-related research. One key issue for study quality is the accurate identification of patients with the condition of interest. Information in EMRs can be entered as structured codes or unstructured free text. The majority of research studies have used only coded parts of EMRs for case-detection, which may bias findings, miss cases, and reduce study quality. This review examines whether incorporating information from text into case-detection algorithms can improve research quality. Methods A systematic search returned 9659 papers, 67 of which reported on the extraction of information from free text of EMRs with the stated purpose of detecting cases of a named clinical condition. Methods for extracting information from text and the technical accuracy of case-detection algorithms were reviewed. Results Studies mainly used US hospital-based EMRs, and extracted information from text for 41 conditions using keyword searches, rule-based algorithms, and machine learning methods. There was no clear difference in case-detection algorithm accuracy between rule-based and machine learning methods of extraction. Inclusion of information from text resulted in a significant improvement in algorithm sensitivity and area under the receiver operating characteristic in comparison to codes alone (median sensitivity 78% (codes + text) vs 62% (codes), P = .03; median area under the receiver operating characteristic 95% (codes + text) vs 88% (codes), P = .025). Conclusions Text in EMRs is accessible, especially with open source information extraction algorithms, and significantly improves case detection when combined with codes. More harmonization of reporting within EMR studies is needed, particularly standardized reporting of algorithm accuracy metrics like positive predictive value (precision) and sensitivity (recall). PMID:26911811

  14. Automated extraction of radiation dose information for CT examinations.

    PubMed

    Cook, Tessa S; Zimmerman, Stefan; Maidment, Andrew D A; Kim, Woojin; Boonn, William W

    2010-11-01

    Exposure to radiation as a result of medical imaging is currently in the spotlight, receiving attention from Congress as well as the lay press. Although scanner manufacturers are moving toward including effective dose information in the Digital Imaging and Communications in Medicine headers of imaging studies, there is a vast repository of retrospective CT data at every imaging center that stores dose information in an image-based dose sheet. As such, it is difficult for imaging centers to participate in the ACR's Dose Index Registry. The authors have designed an automated extraction system to query their PACS archive and parse CT examinations to extract the dose information stored in each dose sheet. First, an open-source optical character recognition program processes each dose sheet and converts the information to American Standard Code for Information Interchange (ASCII) text. Each text file is parsed, and radiation dose information is extracted and stored in a database which can be queried using an existing pathology and radiology enterprise search tool. Using this automated extraction pipeline, it is possible to perform dose analysis on the >800,000 CT examinations in the PACS archive and generate dose reports for all of these patients. It is also possible to more effectively educate technologists, radiologists, and referring physicians about exposure to radiation from CT by generating report cards for interpreted and performed studies. The automated extraction pipeline enables compliance with the ACR's reporting guidelines and greater awareness of radiation dose to patients, thus resulting in improved patient care and management. Copyright © 2010 American College of Radiology. Published by Elsevier Inc. All rights reserved.
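
    A minimal sketch of the OCR-and-parse stage in Python (pytesseract standing in for the unnamed open-source OCR engine; the field names and regular expression are illustrative, and real dose sheets vary by scanner vendor):

      import re
      import pytesseract
      from PIL import Image

      # OCR the image-based dose sheet into plain ASCII text.
      text = pytesseract.image_to_string(Image.open("dose_sheet.png"))

      # Example pattern for lines such as "CTDIvol: 12.3 mGy ... DLP: 456.7 mGy-cm".
      row = re.search(
          r"CTDIvol\D*(?P<ctdi>\d+(?:\.\d+)?).*?DLP\D*(?P<dlp>\d+(?:\.\d+)?)",
          text, flags=re.IGNORECASE | re.DOTALL)
      if row:
          record = {"ctdivol_mGy": float(row["ctdi"]),
                    "dlp_mGy_cm": float(row["dlp"])}
          # ...insert `record` into the queryable dose database described above.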

  15. Automated generation of individually customized visualizations of diagnosis-specific medical information using novel techniques of information extraction

    NASA Astrophysics Data System (ADS)

    Chen, Andrew A.; Meng, Frank; Morioka, Craig A.; Churchill, Bernard M.; Kangarloo, Hooshang

    2005-04-01

    Managing pediatric patients with neurogenic bladder (NGB) involves regular laboratory, imaging, and physiologic testing. Using input from domain experts and current literature, we identified specific data points from these tests to develop the concept of an electronic disease vector for NGB. An information extraction engine was used to extract the desired data elements from free-text and semi-structured documents retrieved from the patient's medical record. Finally, a Java-based presentation engine created graphical visualizations of the extracted data. After precision, recall, and timing evaluation, we conclude that these tools may enable clinically useful, automatically generated, and diagnosis-specific visualizations of patient data, potentially improving compliance and ultimately, outcomes.

  16. Road Damage Extraction from Post-Earthquake Uav Images Assisted by Vector Data

    NASA Astrophysics Data System (ADS)

    Chen, Z.; Dou, A.

    2018-04-01

    Extraction of road damage information after an earthquake is an urgent task. To collect information about stricken areas, unmanned aerial vehicles can be used to obtain images rapidly. This paper puts forward a novel method to detect road damage and introduces a coefficient to assess road accessibility. With the assistance of vector road data, image data from the Jiuzhaigou Ms 7.0 earthquake are tested. First, the image is clipped according to a buffer around the vector roads. Then, a large-scale segmentation is applied to remove irrelevant objects. Third, statistics of road features are analysed, and damage information is extracted. Combined with the on-field investigation, the extraction result proves effective.
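
    The vector-assisted clipping step might look as follows in Python (geopandas and rasterio are assumed tools, since the paper does not name its software; the filenames and corridor width are illustrative):

      import geopandas as gpd
      import rasterio
      import rasterio.mask

      roads = gpd.read_file("roads.shp")              # vector road data
      with rasterio.open("uav_ortho.tif") as src:
          roads = roads.to_crs(src.crs)               # assumes a projected CRS (m)
          corridor = roads.buffer(15)                 # 15 m corridor around each road
          clipped, transform = rasterio.mask.mask(
              src, corridor, crop=True, nodata=0)
      # `clipped` now holds only road-corridor pixels for the segmentation and
      # damage-statistics steps described above.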

  17. The GRIDView Visualization Package

    NASA Astrophysics Data System (ADS)

    Kent, B. R.

    2011-07-01

    Large three-dimensional data cubes, catalogs, and spectral line archives are increasingly important elements of the data discovery process in astronomy. Visualization of large data volumes is of vital importance for the success of large spectral line surveys. Examples of data reduction utilizing the GRIDView software package are shown. The package allows users to manipulate data cubes, extract spectral profiles, and measure line properties. The package and included graphical user interfaces (GUIs) are designed with pipeline infrastructure in mind. The software has been used with great success analyzing spectral line and continuum data sets obtained from large radio survey collaborations. The tools are also important for multi-wavelength cross-correlation studies and incorporate Virtual Observatory client applications for overlaying database information in real time as cubes are examined by users.

  18. City Core - detecting the anthropocene in urban lake cores

    NASA Astrophysics Data System (ADS)

    Kjaer, K. H.; Ilsøe, P.; Andresen, C. S.; Rasmussen, P.; Andersen, T. J.; Frei, R.; Schreiber, N.; Odgaard, B.; Funder, S.; Holm, J. M.; Andersen, K.

    2011-12-01

    Here we present preliminary results from lake cores taken in ditches associated with the historical fortifications enclosing central Copenhagen, the oldest part of the city, in order to gain new knowledge from sediment deposits related to anthropogenic activities. We have examined sediment cores with X-ray fluorescence (XRF) analyzers to correlate element patterns with urban and industrial emissions. We thus aim to track these patterns back in time, long before regular recording of the atmospheric environment began around 1978. Furthermore, we compare our data to alternative sources of information in order to constrain and expand the temporal dating limits (approximately 1890) achieved from 210Pb activity. From customs reports and statistical sources, information on imported volumes of coal, metal and oil was obtained, and contaminants from these substances were related to the sediment archives. Intriguingly, we find a steep increase in the import of coal and metals matching the exponential increase of lead and zinc counts from XRF recordings of the sediment cores. With this finding, we claim to have constrained the initiation of urban industrialization. In order to confirm the age resolution of the lake cores, DNA was extracted from the sediments (sedaDNA). We thus attempt to trace the planting of well-documented exotic plants, for instance in the Botanical Garden. Through extraction and sampling of sedaDNA from these floral and arboreal specimens, we intend to locate their stratigraphic horizons in the sediment core. These findings may correlate data back to 1872, when the garden was established on the area of the former fortification. With this line of research, we hope to achieve important supplementary knowledge of sedaDNA leaching frequencies within freshwater sediments.

  19. Large-scale extraction of accurate drug-disease treatment pairs from biomedical literature for drug repurposing

    PubMed Central

    2013-01-01

    Background A large-scale, highly accurate, machine-understandable drug-disease treatment relationship knowledge base is important for computational approaches to drug repurposing. The large body of published biomedical research articles and clinical case reports available on MEDLINE is a rich source of FDA-approved drug-disease indications as well as drug-repurposing knowledge that is crucial for applying FDA-approved drugs to new diseases. However, much of this information is buried in free text and not captured in any existing databases. The goal of this study is to extract a large number of accurate drug-disease treatment pairs from published literature. Results In this study, we developed a simple but highly accurate pattern-learning approach to extract treatment-specific drug-disease pairs from 20 million biomedical abstracts available on MEDLINE. We extracted a total of 34,305 unique drug-disease treatment pairs, the majority of which are not included in existing structured databases. Our algorithm achieved a precision of 0.904 and a recall of 0.131 in extracting all pairs, and a precision of 0.904 and a recall of 0.842 in extracting frequent pairs. In addition, we have shown that the extracted pairs strongly correlate with both drug target genes and therapeutic classes, and therefore may have high potential in drug discovery. Conclusions We demonstrated that our simple pattern-learning relationship extraction algorithm is able to accurately extract many drug-disease pairs from the free text of biomedical literature that are not captured in structured databases. The large-scale, accurate, machine-understandable drug-disease treatment knowledge base resulting from our study, in combination with pairs from structured databases, will have high potential in computational drug repurposing tasks. PMID:23742147
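
    In spirit, the learned patterns behave like the following hand-written example (a toy sketch; the study learns many such patterns automatically rather than hard-coding them):

      import re

      abstract = ("Methotrexate in the treatment of rheumatoid arthritis was "
                  "evaluated in a randomized trial.")

      # One illustrative pattern: "<drug> in the treatment of <disease>".
      pattern = re.compile(
          r"(?P<drug>[A-Z][a-z]+(?:\s[a-z]+)?)\s+in the treatment of\s+"
          r"(?P<disease>[a-z][\w\s-]+?)(?=\s+(?:was|were|is|are)\b|[.,])")

      for m in pattern.finditer(abstract):
          print(m.group("drug"), "->", m.group("disease"))
          # prints: Methotrexate -> rheumatoid arthritis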

  20. Oil and Gas Extraction Sector (NAICS 211)

    EPA Pesticide Factsheets

    Environmental regulatory information for the oil and gas extraction sector, including oil and natural gas drilling. Includes information about NESHAPs for RICE and stationary combustion engines, and effluent guidelines for synthetic-based drilling fluids.

  1. Single-Cell Quantitative PCR: Advances and Potential in Cancer Diagnostics.

    PubMed

    Ok, Chi Young; Singh, Rajesh R; Salim, Alaa A

    2016-01-01

    Tissues are heterogeneous in their components. If the cells of interest are a minor population of the collected tissue, it is difficult to obtain genetic or genomic information about that cell population with conventional genomic DNA extraction from the collected tissue. Single-cell DNA analysis is important in the analysis of the genetics of cell clonality, genetic anticipation, and single-cell DNA polymorphisms. Single-cell PCR using the Single Cell Ampligrid/GeXP platform is described in this chapter.

  2. Laboratory Spectroscopy of Ices of Astrophysical Interest

    NASA Technical Reports Server (NTRS)

    Hudson, Reggie; Moore, M. H.

    2011-01-01

    Ongoing and future NASA and ESA astronomy missions need detailed information on the spectra of a variety of molecular ices to help establish the identity and abundances of molecules observed in astronomical data. Examples of condensed-phase molecules already detected on cold surfaces include H2O, CO, CO2, N2, NH3, CH4, SO2, O2, and O3. In addition, strong evidence exists for the solid-phase nitriles HCN, HC3N, and C2N2 in Titan's atmosphere. The wavelength region over which these identifications have been made is roughly 0.5 to 100 micron. Searches for additional features of complex carbon-containing species are in progress. Existing and future observations often impose special requirements on the information that comes from the laboratory. For example, the measurement of spectra, determination of integrated band strengths, and extraction of complex refractive indices of ices (and icy mixtures) in both amorphous and crystalline phases at relevant temperatures are all important tasks. In addition, the determination of the index of refraction of amorphous and crystalline ices in the visible region is essential for the extraction of infrared optical constants. Similarly, the measurement of spectra of ions and molecules embedded in relevant ices is important. This laboratory review will examine some of the existing experimental work and capabilities in these areas along with what more may be needed to meet current and future NASA and ESA planetary needs.

  3. Warburgia: a comprehensive review of the botany, traditional uses and phytochemistry.

    PubMed

    Leonard, Carmen M; Viljoen, Alvaro M

    2015-05-13

    The genus Warburgia (Canellaceae) is represented by several medicinal trees found exclusively on the African continent. Traditionally, extracts and products produced from Warburgia species are regarded as important natural African antibiotics and have been used extensively as part of traditional healing practices for the treatment of fungal, bacterial and protozoal infections in both humans and animals. We here aim to collate and review the fragmented information on the ethnobotany, phytochemistry and biological activities of ethnomedicinally important Warburgia species and present recommendations for future research. Peer-reviewed articles using "Warburgia" as search term ("all fields") were retrieved from Scopus, ScienceDirect, SciFinder and Google Scholar with no specific time frame set for the search. In addition, various books were consulted that contained botanical and ethnopharmacological information. The ethnopharmacology, phytochemistry and biological activity of Warburgia are reviewed. Most of the biological activities are attributed to the drimane sesquiterpenoids, including polygodial, warburganal, muzigadial, mukaadial and ugandensial, flavonoids and miscellaneous compounds present in the various species. In addition to anti-infective properties, Warburgia extracts are also used to treat a wide range of ailments, including stomach aches, fever and headaches, which may also be a manifestation of infections. The need to record anecdotal evidence is emphasised and conservation efforts are highlighted to contribute to the protection and preservation of one of Africa's most coveted botanical resources. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  4. Secure searching of biomarkers through hybrid homomorphic encryption scheme.

    PubMed

    Kim, Miran; Song, Yongsoo; Cheon, Jung Hee

    2017-07-26

    As genome sequencing technology develops rapidly, there has lately been an increasing need to keep genomic data secure even when stored in the cloud and still used for research. We are interested in designing a protocol for the secure outsourced matching problem on encrypted data. We propose an efficient method to securely search for a position matching the query data and to extract some information at that position. After decryption, only a small number of comparisons with the query information need to be performed in the plaintext state. We apply this method to find a set of biomarkers in encrypted genomes. The important feature of our method is to encode a genomic database as a single element of a polynomial ring. Since our method requires a single homomorphic multiplication of the hybrid scheme for query computation, it has an advantage over previous methods in parameter size, computation complexity, and communication cost. In particular, the extraction procedure not only prevents leakage of database information that has not been queried by the user but also reduces the communication cost by half. We evaluate the performance of our method and verify that computation on large-scale personal data can be securely and practically outsourced to a cloud environment during data analysis. It takes about 3.9 s to search and extract the reference and alternate sequences at the queried position in a database of size 4M. Our solution for finding a set of biomarkers in DNA sequences shows the progress of cryptographic techniques in terms of their capability to support real-world genome data analysis in a cloud environment.

  5. Spatial Acuity and Prey Detection in Weakly Electric Fish

    PubMed Central

    Babineau, David; Lewis, John E; Longtin, André

    2007-01-01

    It is well-known that weakly electric fish can exhibit extreme temporal acuity at the behavioral level, discriminating time intervals in the submicrosecond range. However, relatively little is known about the spatial acuity of the electrosense. Here we use a recently developed model of the electric field generated by Apteronotus leptorhynchus to study spatial acuity and small signal extraction. We show that the quality of sensory information available on the lateral body surface is highest for objects close to the fish's midbody, suggesting that spatial acuity should be highest at this location. Overall, however, this information is relatively blurry and the electrosense exhibits relatively poor acuity. Despite this apparent limitation, weakly electric fish are able to extract the minute signals generated by small prey, even in the presence of large background signals. In fact, we show that the fish's poor spatial acuity may actually enhance prey detection under some conditions. This occurs because the electric image produced by a spatially dense background is relatively “blurred” or spatially uniform. Hence, the small spatially localized prey signal “pops out” when fish motion is simulated. This shows explicitly how the back-and-forth swimming, characteristic of these fish, can be used to generate motion cues that, as in other animals, assist in the extraction of sensory information when signal-to-noise ratios are low. Our study also reveals the importance of the structure of complex electrosensory backgrounds. Whereas large-object spacing is favorable for discriminating the individual elements of a scene, small spacing can increase the fish's ability to resolve a single target object against this background. PMID:17335346

  6. Parental views on delivering preventive advice to children referred for treatment of dental caries under general anaesthesia: a qualitative investigation.

    PubMed

    Aljafari, A K; Scambler, S; Gallagher, J E; Hosey, M T

    2014-06-01

    The aims were to (1) explore the opinions of parents of children undergoing caries treatment under general anaesthesia (GA) regarding the delivery of oral health advice; (2) discover current oral health practices and beliefs; and (3) inform further research and action. This was a qualitative study using semi-structured interviews and thematic data analysis, sampling parents of children aged 3-10 years undergoing GA tooth extraction due to dental caries. Twenty-nine parents were interviewed (mean age 38.9 years, range 28-50, sd 6.4). The mean age of their children was seven years (range 3-10, sd 2.1). All children required deciduous tooth extractions (5.1 teeth on average). Those that also required permanent tooth extractions had on average 2.1 permanent teeth extracted. Many parents knew the importance of oral hygiene and sugar limitation, describing it as 'general knowledge' and 'common sense'. However, few understood that fruit juice is potentially cariogenic. Parenting challenges seemed to restrict their ability to control the child's diet and establish oral hygiene. Many reported not previously receiving oral health advice and never having had fluoride varnish applied. There were requests for more caries prevention information and advice via the internet, schools or video games. Parental oral health knowledge, parenting skills, and previous advice received all seem to be issues related to the oral health of these children. Providing advice, especially with respect to fruit juice cariogenicity and the benefits of fluoride application, through a child-friendly website including a video game, as well as the use of school programmes, might be an acceptable approach.

  7. The most frequently encountered volatile contaminants of essential oils and plant extracts introduced during the isolation procedure: fast and easy profiling.

    PubMed

    Radulović, Niko S; Blagojević, Polina D

    2012-01-01

    Unfortunately, contaminants of synthetic/artificial origin are sometimes identified as major constituents of essential oils or plant extracts and considered to be biologically active native plant metabolites. To explore the possibility of early recognition and to create a list of some of the most common semi-volatile contaminants of essential oils and plant extracts. Detailed GC and GC-MS analyses of the evaporation residues of six commercially available diethyl ethers and of a plastic bag hydrodistillate were performed. Average mass scans of the total ion chromatogram profiles of the analysed samples were performed. Almost 200 different compounds, subdivided into two groups, were identified in the analysed samples: (i) compounds that could be only of a synthetic/artificial origin, such as butylated hydroxytoluene and o-phthalic acid esters, i.e. requiring exclusion from the list of identified plant constituents; (ii) compounds possibly of synthetic and/or natural plant origin, i.e. compounds derived from the fatty acid metabolism or products of anaerobic intracellular/microbial fermentation. Average mass scans of the total ion chromatogram profiles provide meaningful and convenient information on uncovering important solvent-derived contamination. A database of the most common semi-volatile contaminants of essential oils and plant extracts has been generated that provides information on the likelihood of rejection or acceptance of contaminants as possible plant constituents. The suggested average mass scan approach enables fast and easy profiling of solvents, allowing even inexperienced researchers to pinpoint contaminants. Copyright © 2011 John Wiley & Sons, Ltd.

  8. Classification of clinically useful sentences in clinical evidence resources.

    PubMed

    Morid, Mohammad Amin; Fiszman, Marcelo; Raja, Kalpana; Jonnalagadda, Siddhartha R; Del Fiol, Guilherme

    2016-04-01

    Most patient care questions raised by clinicians can be answered by online clinical knowledge resources. However, important barriers still challenge the use of these resources at the point of care. To design and assess a method for extracting clinically useful sentences from synthesized online clinical resources that represent the most clinically useful information for directly answering clinicians' information needs. We developed a Kernel-based Bayesian Network classification model based on different domain-specific feature types extracted from sentences in a gold standard composed of 18 UpToDate documents. These features included UMLS concepts and their semantic groups, semantic predications extracted by SemRep, patient population identified by a pattern-based natural language processing (NLP) algorithm, and cue words extracted by a feature selection technique. Algorithm performance was measured in terms of precision, recall, and F-measure. The feature-rich approach yielded an F-measure of 74% versus 37% for a feature co-occurrence method (p<0.001). Excluding predication, population, semantic concept or text-based features reduced the F-measure to 62%, 66%, 58% and 69% respectively (p<0.01). The classifier applied to Medline sentences reached an F-measure of 73%, which is equivalent to the performance of the classifier on UpToDate sentences (p=0.62). The feature-rich approach significantly outperformed general baseline methods. This approach significantly outperformed classifiers based on a single type of feature. Different types of semantic features provided a unique contribution to overall classification performance. The classifier's model and features used for UpToDate generalized well to Medline abstracts. Copyright © 2016 Elsevier Inc. All rights reserved.

  9. Extracting leaf area index using viewing geometry effects-A new perspective on high-resolution unmanned aerial system photography

    NASA Astrophysics Data System (ADS)

    Roth, Lukas; Aasen, Helge; Walter, Achim; Liebisch, Frank

    2018-07-01

    Extraction of leaf area index (LAI) is an important prerequisite in numerous studies related to plant ecology, physiology and breeding. LAI is indicative of the performance of a plant canopy and of its potential for growth and yield. In this study, a novel method to estimate LAI based on RGB images taken by an unmanned aerial system (UAS) is introduced. Soybean was taken as the model crop of investigation. The method integrates viewing geometry information in an approach related to gap fraction theory. A 3-D simulation of virtual canopies helped in developing and verifying the underlying model. In addition, the method includes techniques to extract plot-based data from individual oblique images using image projection, as well as image segmentation applying an active learning approach. Data from a soybean field experiment were used to validate the method. The LAI prediction accuracy measured in this way was comparable with that of a gap fraction-based handheld device (R² of 0.92, RMSE of 0.42 m² m⁻²) and correlated well with destructive LAI measurements (R² of 0.89, RMSE of 0.41 m² m⁻²). These results indicate that, within the range (LAI ≤ 3) for which the method was tested, extracting LAI from UAS-derived RGB images using viewing geometry information represents a valid alternative to destructive and optical handheld-device LAI measurements in soybean. Thereby, we open the door for automated, high-throughput assessment of LAI in plant and crop science.
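
    The gap-fraction relation underlying the approach can be stated compactly (the textbook Beer-Lambert form; the study's multi-view formulation generalizes it across the viewing geometries a UAS flight provides). With $P(\theta)$ the gap fraction observed at view zenith angle $\theta$ and $k(\theta)$ the canopy extinction coefficient:

      P(\theta) = e^{-k(\theta)\,\mathrm{LAI}}
      \qquad\Longrightarrow\qquad
      \mathrm{LAI} = -\frac{\ln P(\theta)}{k(\theta)}

    Observing $P(\theta)$ at many angles, as oblique UAS images naturally do, over-determines LAI and makes the inversion more robust than a single-view estimate.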

  10. Enhancing Comparative Effectiveness Research With Automated Pediatric Pneumonia Detection in a Multi-Institutional Clinical Repository: A PHIS+ Pilot Study.

    PubMed

    Meystre, Stephane; Gouripeddi, Ramkiran; Tieder, Joel; Simmons, Jeffrey; Srivastava, Rajendu; Shah, Samir

    2017-05-15

    Community-acquired pneumonia is a leading cause of pediatric morbidity. Administrative data are often used to conduct comparative effectiveness research (CER) with sufficient sample sizes to enhance detection of important outcomes. However, such studies are prone to misclassification errors because of the variable accuracy of discharge diagnosis codes. The aim of this study was to develop an automated, scalable, and accurate method to determine the presence or absence of pneumonia in children using chest imaging reports. The multi-institutional PHIS+ clinical repository was developed to support pediatric CER by expanding an administrative database of children's hospitals with detailed clinical data. To develop a scalable approach to find patients with bacterial pneumonia more accurately, we developed a Natural Language Processing (NLP) application to extract relevant information from chest diagnostic imaging reports. Domain experts established a reference standard by manually annotating 282 reports to train and then test the NLP application. Findings of pleural effusion, pulmonary infiltrate, and pneumonia were automatically extracted from the reports and then used to automatically classify whether a report was consistent with bacterial pneumonia. Compared with the annotated diagnostic imaging reports reference standard, the most accurate implementation of machine learning algorithms in our NLP application allowed the extraction of relevant findings with a sensitivity of .939 and a positive predictive value of .925. It allowed the classification of reports with a sensitivity of .71, a positive predictive value of .86, and a specificity of .962. When compared with each of the domain experts manually annotating these reports, the NLP application achieved significantly higher sensitivity (.71 vs .527) and similar positive predictive value and specificity. NLP-based pneumonia information extraction of pediatric diagnostic imaging reports performed better than domain experts in this pilot study. NLP is an efficient method to extract information from a large collection of imaging reports to facilitate CER. ©Stephane Meystre, Ramkiran Gouripeddi, Joel Tieder, Jeffrey Simmons, Rajendu Srivastava, Samir Shah. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 15.05.2017.

  11. Construction of a database for published phase II/III drug intervention clinical trials for the period 2009-2014 comprising 2,326 records, 90 disease categories, and 939 drug entities.

    PubMed

    Jeong, Sohyun; Han, Nayoung; Choi, Boyoon; Sohn, Minji; Song, Yun-Kyoung; Chung, Myeon-Woo; Na, Han-Sung; Ji, Eunhee; Kim, Hyunah; Rhew, Ki Yon; Kim, Therasa; Kim, In-Wha; Oh, Jung Mi

    2016-06-01

    To construct a database of published clinical drug trials suitable for use (1) as a research tool for accessing clinical trial information and (2) in evidence-based decision-making by regulatory professionals, clinical research investigators, and medical practitioners. Comprehensive information was obtained from a search of the design elements and results of clinical trials in peer-reviewed journals using PubMed (http://www.ncbi.nlm.ih.gov/pubmed). The methodology to develop a structured database was devised by a panel composed of experts in medicine, pharmacy, and information technology, together with members of the Ministry of Food and Drug Safety (MFDS), using a step-by-step approach. A double-sided system consisting of a user mode and a manager mode served as the framework for the database; elements of interest from each trial were entered via the secure manager mode, enabling the input information to be accessed in a user-friendly manner (user mode). Information regarding the methodology used and the results of drug treatment was extracted as detail elements of each data set and then input into the web-based database system. Comprehensive information comprising 2,326 clinical trial records, 90 disease states, and 939 drug entities, and concerning study objectives, background, methods used, results, and conclusions, could be extracted from published information on phase II/III drug intervention clinical trials appearing in SCI journals within the last 10 years. The extracted data were successfully assembled into a clinical drug trial database with easy access, suitable for use as a research tool. The clinically most important therapeutic categories, i.e., cancer, cardiovascular, respiratory, neurological, metabolic, urogenital, gastrointestinal, psychological, and infectious diseases, are covered by the database. Names of test and control drugs, details on primary and secondary outcomes, and indexed keywords could also be retrieved and built into the database. The construction used in the database enables the user to sort and download targeted information as a Microsoft Excel spreadsheet. Because of the comprehensive and standardized nature of the clinical drug trial database and its ease of access, it should serve as a valuable information repository and research tool for accessing clinical trial information and for making evidence-based decisions by regulatory professionals, clinical research investigators, and medical practitioners.

  12. The Role of Mother in Informing Girls About Puberty: A Meta-Analysis Study

    PubMed Central

    Sooki, Zahra; Shariati, Mohammad; Chaman, Reza; Khosravi, Ahmad; Effatpanah, Mohammad; Keramat, Afsaneh

    2016-01-01

    Context Family, especially the mother, has the most important role in the education, transfer of information, and health behaviors of girls in order for them to have a healthy transition through the critical stage of puberty, but there are different views in this regard. Objectives Considering the various findings about the source of information about puberty, a meta-analysis study was conducted to investigate the extent of the mother's role in informing girls about puberty. Data Sources This meta-analysis study was based on English articles published from 2000 to February 2015 in the Scopus, PubMed, and ScienceDirect databases and on Persian articles in the SID, Magiran, and Iran Medex databases, with determined key words and their MeSH equivalents. Study Selection Quantitative cross-sectional articles were extracted by two independent researchers, and finally 46 articles were selected based on the inclusion criteria. The STROBE checklist was used for evaluation of the studies. Data Extraction The percentage of mothers as the current and preferred source of information about the process of puberty, menarche, and menstruation from the perspective of adolescent girls was extracted from the articles. The results of the studies were analyzed using meta-analysis (random effects model), and the studies' heterogeneity was analyzed using the I² index. Variance between studies was analyzed using tau squared (τ²) and Review Manager 5 software. Results The results showed that, from the perspective of teenage girls in Iran and other countries, in 56% of cases the mother was the current source of information about the process of puberty, menarche, and menstruation. The preferred source of information about the process of puberty, menarche, and menstruation was the mother in all studies at 60% (Iran 57%, other countries 66%). Conclusions According to the findings of this study, it is essential that health professionals and officials of the ministry of health train mothers about the timing, trends, and factors affecting the start of puberty using a multi-dimensional approach that involves religious organizations, community groups, and peer groups. PMID:27331056

  13. Quantitative photothermal phase imaging of red blood cells using digital holographic photothermal microscope.

    PubMed

    Vasudevan, Srivathsan; Chen, George C K; Lin, Zhiping; Ng, Beng Koon

    2015-05-10

    Photothermal microscopy (PTM), a noninvasive pump-probe high-resolution microscopy, has been applied as a bioimaging tool in many biomedical studies. PTM utilizes a conventional phase contrast microscope to obtain highly resolved photothermal images. However, phase information cannot be extracted from these photothermal images, as they are not quantitative. Moreover, the problem of halos inherent in conventional phase contrast microscopy needs to be tackled. Hence, a digital holographic photothermal microscopy technique is proposed as a solution to obtain quantitative phase images. The proposed technique is demonstrated by extracting phase values of red blood cells from their photothermal images. These phase values can potentially be used to determine the temperature distribution of the photothermal images, which is an important study in live cell monitoring applications.

  14. The Determination of Metals in Sediment Pore Waters and in 1N HCl-Extracted Sediments by ICP-MS

    USGS Publications Warehouse

    May, T.W.; Wiedmeyer, Ray H.; Brumbaugh, W.G.; Schmitt, C.J.

    1997-01-01

    Concentrations of metals in sediment interstitial water (pore water) and those extractable from sediment with weak acids can provide important information about the bioavailability and toxicological effects of such contaminants. The highly variable nature of metal concentrations in these matrices requires instrumentation with the detection limit capability of graphite furnace atomic absorption and the wide dynamic linear range capability of ICP-OES. These criteria are satisfied with ICP-MS instrumentation. We investigated the performance of ICP-MS in the determination of certain metals from these matrices. The results for three metals were compared to those determined by graphite furnace atomic absorption spectroscopy. It was concluded that ICP-MS was an excellent instrumental approach for the determination of metals in these matrices.

  15. Building Extraction Based on Openstreetmap Tags and Very High Spatial Resolution Image in Urban Area

    NASA Astrophysics Data System (ADS)

    Kang, L.; Wang, Q.; Yan, H. W.

    2018-04-01

    Deriving the contours of buildings from VHR images is the essential problem in automatic building extraction in urban areas. To solve this problem, OSM data are introduced to offer vector contour information for buildings, which is hard to obtain from VHR images alone. First, we import the OSM data into a database; the line-string data of OSM with tags such as building, amenity, and office are selected and combined into complete contours. Second, the accuracy of the building contours is confirmed by comparison with the real buildings in Google Earth. Third, maximum likelihood classification is conducted with the confirmed building contours, and the result demonstrates that the proposed approach is effective and accurate. The approach offers a new way toward automatic interpretation of VHR images.
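
    For illustration, the tag-based selection can be reproduced with OSMnx (an assumed convenience library, API as of OSMnx ≥ 1.3; the paper imports raw OSM line strings into its own database, and the place name below is hypothetical):

      import osmnx as ox

      tags = {"building": True, "amenity": True, "office": True}
      footprints = ox.features_from_place("Changchun, China", tags=tags)
      # Keep closed contours only; raw line strings would need to be ring-closed
      # first, as described above.
      polygons = footprints[footprints.geometry.geom_type.isin(
          ["Polygon", "MultiPolygon"])]
      polygons.to_file("building_contours.gpkg", driver="GPKG")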

  16. Analysis of spike-wave discharges in rats using discrete wavelet transform.

    PubMed

    Ubeyli, Elif Derya; Ilbay, Gül; Sahin, Deniz; Ateş, Nurbay

    2009-03-01

    A feature is a distinctive or characteristic measurement, transform, or structural component extracted from a segment of a pattern. Features are used to represent patterns with the goal of minimizing the loss of important information. The discrete wavelet transform (DWT) was used as a feature extraction method to represent the spike-wave discharge (SWD) records of Wistar Albino Glaxo/Rijswijk (WAG/Rij) rats. The SWD records of WAG/Rij rats were decomposed into time-frequency representations using the DWT, and statistical features were calculated to depict their distribution. The obtained wavelet coefficients were used to identify characteristics of the signal that were not apparent from the original time-domain signal. The present study demonstrates that the wavelet coefficients are useful in determining the dynamics in the time-frequency domain of SWD records.
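
    A minimal PyWavelets sketch of the feature pipeline described above (the wavelet family, decomposition level, and choice of statistics are illustrative):

      import numpy as np
      import pywt

      def dwt_features(signal, wavelet="db4", level=5):
          """Statistical summary of each sub-band of a DWT decomposition."""
          coeffs = pywt.wavedec(signal, wavelet, level=level)
          feats = []
          for c in coeffs:                  # approximation + detail sub-bands
              feats += [np.mean(np.abs(c)),  # average magnitude
                        np.std(c),           # spread
                        np.mean(c ** 2)]     # sub-band energy
          return np.array(feats)

      eeg_segment = np.random.randn(2048)    # stand-in for one SWD record segment
      features = dwt_features(eeg_segment)   # 18-dimensional feature vector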

  17. Whisking mechanics and active sensing

    PubMed Central

    Bush, Nicholas E; Solla, Sara A

    2017-01-01

    We describe recent advances in quantifying the three-dimensional (3D) geometry and mechanics of whisking. Careful delineation of relevant 3D reference frames reveals important geometric and mechanical distinctions between the localization problem (‘where’ is an object) and the feature extraction problem (‘what’ is an object). Head-centered and resting-whisker reference frames lend themselves to quantifying temporal and kinematic cues used for object localization. The whisking-centered reference frame lends itself to quantifying the contact mechanics likely associated with feature extraction. We offer the ‘windowed sampling’ hypothesis for active sensing: that rats can estimate an object’s spatial features by integrating mechanical information across whiskers during brief (25–60 ms) windows of ‘haptic enclosure’ with the whiskers, a motion that resembles a hand grasp. PMID:27632212

  18. Whisking mechanics and active sensing.

    PubMed

    Bush, Nicholas E; Solla, Sara A; Hartmann, Mitra Jz

    2016-10-01

    We describe recent advances in quantifying the three-dimensional (3D) geometry and mechanics of whisking. Careful delineation of relevant 3D reference frames reveals important geometric and mechanical distinctions between the localization problem ('where' is an object) and the feature extraction problem ('what' is an object). Head-centered and resting-whisker reference frames lend themselves to quantifying temporal and kinematic cues used for object localization. The whisking-centered reference frame lends itself to quantifying the contact mechanics likely associated with feature extraction. We offer the 'windowed sampling' hypothesis for active sensing: that rats can estimate an object's spatial features by integrating mechanical information across whiskers during brief (25-60 ms) windows of 'haptic enclosure' with the whiskers, a motion that resembles a hand grasp. Copyright © 2016. Published by Elsevier Ltd.

  19. Natural language processing systems for capturing and standardizing unstructured clinical information: A systematic review.

    PubMed

    Kreimeyer, Kory; Foster, Matthew; Pandey, Abhishek; Arya, Nina; Halford, Gwendolyn; Jones, Sandra F; Forshee, Richard; Walderhaug, Mark; Botsis, Taxiarchis

    2017-09-01

    We followed a systematic approach based on the Preferred Reporting Items for Systematic Reviews and Meta-Analyses to identify existing clinical natural language processing (NLP) systems that generate structured information from unstructured free text. Seven literature databases were searched with a query combining the concepts of natural language processing and structured data capture. Two reviewers screened all records for relevance during two screening phases, and information about clinical NLP systems was collected from the final set of papers. A total of 7149 records (after removing duplicates) were retrieved and screened, and 86 were determined to fit the review criteria. These papers contained information about 71 different clinical NLP systems, which were then analyzed. The NLP systems address a wide variety of important clinical and research tasks. Certain tasks are well addressed by the existing systems, while others remain as open challenges that only a small number of systems attempt, such as extraction of temporal information or normalization of concepts to standard terminologies. This review has identified many NLP systems capable of processing clinical free text and generating structured output, and the information collected and evaluated here will be important for prioritizing development of new approaches for clinical NLP. Copyright © 2017 Elsevier Inc. All rights reserved.

  20. Systematics of the electric dipole response in stable tin isotopes

    NASA Astrophysics Data System (ADS)

    Bassauer, Sergej; von Neumann-Cosel, Peter; Tamii, Atsushi

    2018-05-01

    The electric dipole response is an important property of heavy nuclei. Precise information on the electric dipole response yields the electric dipole polarisability, which in turn allows one to extract important constraints on the neutron-skin thickness of heavy nuclei and on parameters of the symmetry energy. The tin isotope chain is particularly suited for a systematic study of how the electric dipole response depends on neutron excess, as it provides a wide mass range of accessible isotopes with little change in the underlying structure. Recently, an inelastic proton scattering experiment at forward angles including 0° on 112,116,124Sn was performed at the Research Centre for Nuclear Physics (RCNP), Japan, with a focus on the low-energy dipole strength and the polarisability. First results are presented here. Using data from an earlier proton scattering experiment on 120Sn, the gamma strength function and level density are determined for this nucleus.
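
    For context, the quantity referred to here is commonly written (in the standard convention, not taken from this record) as an inverse-energy-weighted sum of E1 strength, or equivalently an integral over the photoabsorption cross section, which is why the measured dipole response constrains the polarisability:

```latex
% Standard textbook relation for the dipole polarisability:
\alpha_D \;=\; \frac{8\pi}{9}\sum_{n}\frac{B(E1;\,0\to n)}{E_n}
         \;=\; \frac{\hbar c}{2\pi^{2}}\int \frac{\sigma_{\mathrm{abs}}(E)}{E^{2}}\,dE
```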

  1. Fully Convolutional Network Based Shadow Extraction from GF-2 Imagery

    NASA Astrophysics Data System (ADS)

    Li, Z.; Cai, G.; Ren, H.

    2018-04-01

    There are many shadows in high-spatial-resolution satellite images, especially in urban areas. Although shadows on imagery severely hamper the extraction of land cover or land use information, they provide auxiliary information for building extraction, for which image classification alone rarely achieves satisfactory accuracy. This paper focuses on building shadow extraction, designing a fully convolutional network trained with samples collected from GF-2 satellite imagery over the urban region of Changchun city. By means of spatial filtering and the evaluation of adjacency relationships along the sunlight direction, small patches caused by vegetation or bridges were eliminated from the preliminarily extracted shadows. Finally, the building shadows were separated. The building shadow information extracted by the proposed method was compared with the results of traditional object-oriented supervised classification algorithms, showing that the deep learning approach can improve accuracy to a large extent.
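
    A minimal sketch of a fully convolutional segmentation network of this general kind, in PyTorch; the layer sizes, the four-band input, and the untrained forward pass are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class TinyFCN(nn.Module):
    def __init__(self, in_bands=4):
        super().__init__()
        self.encode = nn.Sequential(
            nn.Conv2d(in_bands, 16, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
        )
        # A 1x1 convolution keeps the network fully convolutional (no dense
        # layers), so images of arbitrary size can be scored per pixel.
        self.classify = nn.Conv2d(32, 1, 1)
        self.upsample = nn.Upsample(scale_factor=2, mode="bilinear",
                                    align_corners=False)

    def forward(self, x):
        logits = self.classify(self.encode(x))
        return self.upsample(logits)       # per-pixel shadow logits

patch = torch.randn(1, 4, 64, 64)          # one 64x64 four-band patch
shadow_prob = torch.sigmoid(TinyFCN()(patch))
print(shadow_prob.shape)                   # torch.Size([1, 1, 64, 64])
```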

  2. An efficient and scalable extraction and quantification method for algal derived biofuel.

    PubMed

    Lohman, Egan J; Gardner, Robert D; Halverson, Luke; Macur, Richard E; Peyton, Brent M; Gerlach, Robin

    2013-09-01

    Microalgae are capable of synthesizing a multitude of compounds including biofuel precursors and other high value products such as omega-3-fatty acids. However, accurate analysis of the specific compounds produced by microalgae is important since slight variations in saturation and carbon chain length can affect the quality, and thus the value, of the end product. We present a method that allows for fast and reliable extraction of lipids and similar compounds from a range of algae, followed by their characterization using gas chromatographic analysis with a focus on biodiesel-relevant compounds. This method determines which range of biologically synthesized compounds is likely responsible for each fatty acid methyl ester (FAME) produced; information that is fundamental for identifying preferred microalgae candidates as a biodiesel source. Traditional methods of analyzing these precursor molecules are time intensive and prone to high degrees of variation between species and experimental conditions. Here we detail a new method which uses microwave energy as a reliable, single-step cell disruption technique to extract lipids from live cultures of microalgae. After extractable lipid characterization (including lipid type (free fatty acids, mono-, di- or tri-acylglycerides) and carbon chain length determination) by GC-FID, the same lipid extracts are transesterified into FAMEs and directly compared to total biodiesel potential by GC-MS. This approach provides insight into the fraction of total FAMEs derived from extractable lipids compared to FAMEs derived from the residual fraction (i.e. membrane bound phospholipids, sterols, etc.). This approach can also indicate which extractable lipid compound, based on chain length and relative abundance, is responsible for each FAME. This method was tested on three species of microalgae: the marine diatom Phaeodactylum tricornutum, the model Chlorophyte Chlamydomonas reinhardtii, and the freshwater green alga Chlorella vulgaris. The method is shown to be robust, highly reproducible, and fast, allowing for multiple samples to be analyzed throughout the time course of culturing, thus providing time-resolved information regarding lipid quantity and quality. Total time from harvesting to obtaining analytical results is less than 2 h. © 2013.

  3. Antigenotoxic and free radical scavenging activities of extracts from Moricandia arvensis.

    PubMed

    Skandrani, I; Sghaier, M Ben; Neffati, A; Boubaker, J; Bouhlel, I; Kilani, S; Mahmoud, A; Ghedira, K; Chekir-Ghedira, L

    2007-01-01

    This study evaluates the genotoxic and antigenotoxic effects of extracts from leaves of Moricandia arvensis, which are used in traditional cooking and medicine. The extracts showed no genotoxicity when tested with the SOS Chromotest using E. coli PQ37 and PQ35 strains, except for the total oligomer flavonoid-enriched extract. The petroleum ether and methanol extracts were the most active in reducing nitrofurantoin genotoxicity, whereas the methanol and total oligomer flavonoid-enriched extracts showed the strongest inhibition of H2O2 genotoxicity. In addition, these two extracts showed important free radical scavenging activity toward the DPPH radical, whereas the chloroform extract exhibited the highest TEAC value against the ABTS•+ radical.

  4. Models Extracted from Text for System-Software Safety Analyses

    NASA Technical Reports Server (NTRS)

    Malin, Jane T.

    2010-01-01

    This presentation describes extraction and integration of requirements information and safety information in visualizations to support early review of completeness, correctness, and consistency of lengthy and diverse system safety analyses. Software tools have been developed and extended to perform the following tasks: 1) extract model parts and safety information from text in interface requirements documents, failure modes and effects analyses and hazard reports; 2) map and integrate the information to develop system architecture models and visualizations for safety analysts; and 3) provide model output to support virtual system integration testing. This presentation illustrates the methods and products with a rocket motor initiation case.
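
    As a toy illustration of the text-extraction step only (the sentence templates and the pattern are invented for this sketch, not NASA's tool), a pattern-based pass of the kind such tools automate might pull component/behavior pairs from FMEA-style statements:

```python
import re

# Hypothetical pattern: "<Component> fails to <behavior>" or
# "<Component> shall <behavior>", as found in FMEA and requirements text.
FMEA_PATTERN = re.compile(
    r"(?P<component>[A-Z][\w\- ]+?) (?:fails to|shall) (?P<behavior>[\w\- ]+)"
)

text = ("Igniter fails to fire on command. "
        "Safe-arm device shall prevent inadvertent initiation.")

for m in FMEA_PATTERN.finditer(text):
    print(m.group("component").strip(), "->", m.group("behavior").strip())
```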

  5. Gaze behavior and the perception of egocentric distance

    PubMed Central

    Gajewski, Daniel A.; Wallin, Courtney P.; Philbeck, John W.

    2014-01-01

    The ground plane is thought to be an important reference for localizing objects, particularly when angular declination is informative, as it is for objects seen resting at floor level. A potential role for eye movements has been implicated by the idea that information about the nearby ground is required to localize objects more distant, and by the fact that the time course for the extraction of distance extends beyond the duration of a typical eye fixation. To test this potential role, eye movements were monitored when participants previewed targets. Distance estimates were provided by walking without vision to the remembered target location (blind walking) or by verbal report. We found that a strategy of holding the gaze steady on the object was as frequent as one where the region between the observer and object was fixated. There was no performance advantage associated with making eye movements in an observational study (Experiment 1) or when an eye-movement strategy was manipulated experimentally (Experiment 2). Observers were extracting useful information covertly, however. In Experiments 3 through 5, obscuring the nearby ground plane had a modest impact on performance; obscuring the walls and ceiling was more detrimental. The results suggest that these alternate surfaces provide useful information when judging the distance to objects within indoor environments. Critically, they constrain the role for the nearby ground plane in theories of egocentric distance perception. PMID:24453346
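
    The angular declination cue mentioned above has a simple worked form: for an object resting on the floor viewed from eye height h, a gaze declination of θ below the horizontal implies an egocentric distance d = h / tan(θ). A tiny numeric example (the values are assumed, not from the study):

```python
import math

eye_height_m = 1.6            # assumed standing eye height
declination_deg = 15.0        # gaze angle below the horizontal
distance_m = eye_height_m / math.tan(math.radians(declination_deg))
print(f"{distance_m:.2f} m")  # ~5.97 m to the object on the floor
```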

  6. The Roles for Prior Visual Experience and Age on the Extraction of Egocentric Distance.

    PubMed

    Wallin, Courtney P; Gajewski, Daniel A; Teplitz, Rebeca W; Mihelic Jaidzeka, Sandra; Philbeck, John W

    2017-01-01

    In a well-lit room, observers can generate well-constrained estimates of the distance to an object on the floor even with just a fleeting glimpse. Performance under these conditions is typically characterized by some underestimation but improves when observers have previewed the room. Such evidence suggests that information extracted from longer durations may be stored to contribute to the perception of distance at limited time frames. Here, we examined the possibility that this stored information is used differentially across age. Specifically, we posited that older adults would rely more than younger adults on information gathered and stored at longer glimpses to judge the distance of briefly glimpsed objects. We collected distance judgments from younger and older adults after brief target glimpses. Half of the participants were provided 20-s previews of the testing room in advance; the other half received no preview. Performance benefits were observed for all individuals with prior visual experience, and these were moderately more pronounced for the older adults. The results suggest that observers store contextual information gained from longer viewing durations to aid in the perception of distance at brief glimpses, and that this memory becomes more important with age. © The Author 2016. Published by Oxford University Press on behalf of The Gerontological Society of America. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  7. DDMGD: the database of text-mined associations between genes methylated in diseases from different species.

    PubMed

    Bin Raies, Arwa; Mansour, Hicham; Incitti, Roberto; Bajic, Vladimir B

    2015-01-01

    Gathering information about associations between methylated genes and diseases is important for diseases diagnosis and treatment decisions. Recent advancements in epigenetics research allow for large-scale discoveries of associations of genes methylated in diseases in different species. Searching manually for such information is not easy, as it is scattered across a large number of electronic publications and repositories. Therefore, we developed DDMGD database (http://www.cbrc.kaust.edu.sa/ddmgd/) to provide a comprehensive repository of information related to genes methylated in diseases that can be found through text mining. DDMGD's scope is not limited to a particular group of genes, diseases or species. Using the text mining system DEMGD we developed earlier and additional post-processing, we extracted associations of genes methylated in different diseases from PubMed Central articles and PubMed abstracts. The accuracy of extracted associations is 82% as estimated on 2500 hand-curated entries. DDMGD provides a user-friendly interface facilitating retrieval of these associations ranked according to confidence scores. Submission of new associations to DDMGD is provided. A comparison analysis of DDMGD with several other databases focused on genes methylated in diseases shows that DDMGD is comprehensive and includes most of the recent information on genes methylated in diseases. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  8. Tuberculosis diagnosis support analysis for precarious health information systems.

    PubMed

    Orjuela-Cañón, Alvaro David; Camargo Mendoza, Jorge Eliécer; Awad García, Carlos Enrique; Vergara Vela, Erika Paola

    2018-04-01

    Pulmonary tuberculosis is a world emergency for the World Health Organization. Techniques and new diagnostic tools are important to battle this bacterial infection. There have been many advances in all those fields, but in developing countries such as Colombia, where resources and infrastructure are limited, new, fast and less expensive strategies are increasingly needed. Artificial neural networks are computational intelligence techniques that can be used in this kind of problem and offer additional support in the tuberculosis diagnosis process, providing a tool for medical staff to make decisions about the management of subjects under suspicion of tuberculosis. A database of 105 subjects under suspicion of pulmonary tuberculosis, with precarious information, was used in this study. Data on sex, age, diabetes, homelessness, AIDS status and a variable encoding the clinical knowledge of the medical personnel were used. Models based on artificial neural networks were employed, exploring supervised learning to detect the disease. Unsupervised learning was used to create three risk groups based on the available information. The obtained results are comparable with traditional techniques for the detection of tuberculosis, with advantages such as speed and low implementation cost. A sensitivity of 97% and a specificity of 71% were achieved. The techniques used yielded valuable information that can be useful for physicians who treat the disease in decision-making processes, especially under limited infrastructure and data. Copyright © 2018 Elsevier B.V. All rights reserved.
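
    A hedged sketch of the supervised branch: a small neural network over the six available predictors, scored by sensitivity and specificity. The synthetic arrays merely stand in for the 105-subject dataset, and the architecture is an assumption, not the paper's model.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
X = rng.random((105, 6))          # sex, age, diabetes, homeless, AIDS, clinical score
y = rng.integers(0, 2, size=105)  # 1 = tuberculosis confirmed (synthetic)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)
clf = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=1)
clf.fit(X_tr, y_tr)

tn, fp, fn, tp = confusion_matrix(y_te, clf.predict(X_te), labels=[0, 1]).ravel()
print("sensitivity:", tp / (tp + fn))   # true positive rate
print("specificity:", tn / (tn + fp))   # true negative rate
```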

  9. Real-time bilinear rotation decoupling in absorptive mode J-spectroscopy: Detecting low-intensity metabolite peak close to high-intensity metabolite peak with convenience.

    PubMed

    Verma, Ajay; Baishya, Bikash

    2016-05-01

    "Pure shift" NMR spectra display singlet peak per chemical site. Thus, high resolution is offered at the cost of valuable J-coupling information. In the present work, real-time BIRD (BIlinear Rotation Decoupling) is applied to the absorptive-mode 2D J-spectroscopy to provide pure shift spectrum in the direct dimension and J-coupling information in the indirect dimension. Quite often in metabolomics, proton NMR spectra from complex bio-fluids display tremendous signal overlap. Although conventional J-spectroscopy in principle overcomes this problem by separating the multiplet information from chemical shift information, however, only magnitude mode of the experiment is practical, sacrificing much of the potential high resolution that could be achieved. Few J-spectroscopy methods have been reported so far that produce high-resolution pure shift spectrum along with J-coupling information for crowded spectral regions. In the present work, high-quality J-resolved spectrum from important metabolomic mixture such as tissue extract from rat cortex is demonstrated. Many low-intensity metabolite peaks which are obscured by the broad dispersive tails from high-intensity metabolite peaks in regular magnitude mode J-spectrum can be clearly identified in real-time BIRD J-resolved spectrum. The general practice of removing such spectral overlap is tedious and time-consuming as it involves repeated sample preparation to change the pH of the tissue extract sample and subsequent spectra recording. Copyright © 2016 Elsevier Inc. All rights reserved.

  10. Chlorophyll Fluorescence Analysis of Cyanobacterial Photosynthesis and Acclimation

    PubMed Central

    Campbell, Douglas; Hurry, Vaughan; Clarke, Adrian K.; Gustafsson, Petter; Öquist, Gunnar

    1998-01-01

    Cyanobacteria are ecologically important photosynthetic prokaryotes that also serve as popular model organisms for studies of photosynthesis and gene regulation. Both molecular and ecological studies of cyanobacteria benefit from real-time information on photosynthesis and acclimation. Monitoring in vivo chlorophyll fluorescence can provide noninvasive measures of photosynthetic physiology in a wide range of cyanobacteria and cyanolichens and requires only small samples. Cyanobacterial fluorescence patterns are distinct from those of plants, because of key structural and functional properties of cyanobacteria. These include significant fluorescence emission from the light-harvesting phycobiliproteins; large and rapid changes in fluorescence yield (state transitions) which depend on metabolic and environmental conditions; and flexible, overlapping respiratory and photosynthetic electron transport chains. The fluorescence parameters FV/FM, FV′/FM′, qP, qN, NPQ, and φPSII were originally developed to extract information from the fluorescence signals of higher plants. In this review, we consider how the special properties of cyanobacteria can be accommodated and used to extract biologically useful information from cyanobacterial in vivo chlorophyll fluorescence signals. We describe how the pattern of fluorescence yield versus light intensity can be used to predict the acclimated light level for a cyanobacterial population, giving information valuable for both laboratory and field studies of acclimation processes. The size of the change in fluorescence yield during dark-to-light transitions can provide information on respiration and the iron status of the cyanobacteria. Finally, fluorescence parameters can be used to estimate the electron transport rate at the acclimated growth light intensity. PMID:9729605
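
    For reference, the plant-derived parameters listed above have standard definitions in terms of dark-adapted (F0, FM) and light-adapted (F0′, FM′, steady-state Fs) fluorescence yields; the numbers in the example below are illustrative, not measured values.

```python
def fluorescence_params(F0, FM, F0p, FMp, Fs):
    """Standard chlorophyll fluorescence parameters (textbook definitions)."""
    return {
        "FV/FM":   (FM - F0) / FM,                # maximum PSII quantum yield
        "FV'/FM'": (FMp - F0p) / FMp,             # PSII efficiency in the light
        "qP":      (FMp - Fs) / (FMp - F0p),      # photochemical quenching
        "qN":      1 - (FMp - F0p) / (FM - F0),   # non-photochemical quenching coeff.
        "NPQ":     (FM - FMp) / FMp,              # Stern-Volmer NPQ
        "phiPSII": (FMp - Fs) / FMp,              # operating PSII quantum yield
    }

print(fluorescence_params(F0=0.3, FM=1.0, F0p=0.25, FMp=0.7, Fs=0.45))
```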

  11. Real-time bilinear rotation decoupling in absorptive mode J-spectroscopy: Detecting low-intensity metabolite peak close to high-intensity metabolite peak with convenience

    NASA Astrophysics Data System (ADS)

    Verma, Ajay; Baishya, Bikash

    2016-05-01

    "Pure shift" NMR spectra display a singlet peak per chemical site. Thus, high resolution is offered at the cost of valuable J-coupling information. In the present work, real-time BIRD (BIlinear Rotation Decoupling) is applied to absorptive-mode 2D J-spectroscopy to provide a pure shift spectrum in the direct dimension and J-coupling information in the indirect dimension. Quite often in metabolomics, proton NMR spectra of complex bio-fluids display tremendous signal overlap. Although conventional J-spectroscopy in principle overcomes this problem by separating multiplet information from chemical shift information, only the magnitude mode of the experiment is practical, sacrificing much of the potential high resolution that could be achieved. Few J-spectroscopy methods reported so far produce a high-resolution pure shift spectrum along with J-coupling information for crowded spectral regions. In the present work, a high-quality J-resolved spectrum of an important metabolomic mixture, a tissue extract from rat cortex, is demonstrated. Many low-intensity metabolite peaks that are obscured by the broad dispersive tails of high-intensity metabolite peaks in the regular magnitude-mode J-spectrum can be clearly identified in the real-time BIRD J-resolved spectrum. The general practice of removing such spectral overlap is tedious and time-consuming, as it involves repeated sample preparation to change the pH of the tissue extract sample and subsequent spectra recording.

  12. Kernel-based discriminant feature extraction using a representative dataset

    NASA Astrophysics Data System (ADS)

    Li, Honglin; Sancho Gomez, Jose-Luis; Ahalt, Stanley C.

    2002-07-01

    Discriminant Feature Extraction (DFE) is widely recognized as an important pre-processing step in classification applications. Most DFE algorithms are linear and thus can only explore the linear discriminant information among the different classes. Recently, there have been several promising attempts to develop nonlinear DFE algorithms, among which is Kernel-based Feature Extraction (KFE). The efficacy of KFE has been experimentally verified on both synthetic data and real problems. However, KFE has some known limitations. First, KFE does not work well for strongly overlapped data. Second, KFE employs all of the training set samples during the feature extraction phase, which can result in significant computation when applied to very large datasets. Finally, KFE can result in overfitting. In this paper, we propose a substantial improvement to KFE that overcomes the above limitations by using a representative dataset, which consists of critical points that are generated from data-editing techniques and centroid points that are determined by using the Frequency Sensitive Competitive Learning (FSCL) algorithm. Experiments show that this new KFE algorithm performs well on significantly overlapped datasets, and it also reduces computational complexity. Further, by controlling the number of centroids, the overfitting problem can be effectively alleviated.
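
    A compact sketch of the representative-dataset idea under stated assumptions: k-means centroids stand in for the FSCL centroids (and the data-editing step is omitted); each sample is mapped to RBF-kernel evaluations against the representative set, and discriminant features are then extracted in that space with LDA.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics.pairwise import rbf_kernel

rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0.0, 1.0, (200, 5)),   # class 0
               rng.normal(1.5, 1.0, (200, 5))])  # class 1 (overlapping)
y = np.repeat([0, 1], 200)

# Representative set: a few centroids per class rather than all 400
# training samples, cutting the cost of the kernel evaluations.
reps = np.vstack([
    KMeans(n_clusters=10, n_init=10, random_state=2).fit(X[y == c]).cluster_centers_
    for c in (0, 1)
])

K = rbf_kernel(X, reps, gamma=0.5)   # empirical kernel map, shape (400, 20)
feat = LinearDiscriminantAnalysis(n_components=1).fit_transform(K, y)
print(feat.shape)                    # (400, 1): one nonlinear discriminant feature
```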

  13. Fluorescence Intrinsic Characterization of Excitation-Emission Matrix Using Multi-Dimensional Ensemble Empirical Mode Decomposition

    PubMed Central

    Chang, Chi-Ying; Chang, Chia-Chi; Hsiao, Tzu-Chien

    2013-01-01

    Excitation-emission matrix (EEM) fluorescence spectroscopy is a noninvasive method for tissue diagnosis and has become important in clinical use. However, the intrinsic characterization of EEM fluorescence remains unclear. Photobleaching and the complexity of the chemical compounds make it difficult to distinguish individual compounds due to overlapping features. Conventional studies use principal component analysis (PCA) for EEM fluorescence analysis, and the relationship between the EEM features extracted by PCA and diseases has been examined. The spectral features of different tissue constituents are not fully separable or clearly defined. Recently, a non-stationary method called multi-dimensional ensemble empirical mode decomposition (MEEMD) was introduced; this method can extract the intrinsic oscillations on multiple spatial scales without loss of information. The aim of this study was to propose a fluorescence spectroscopy system for EEM measurements and to describe a method for extracting the intrinsic characteristics of EEM by MEEMD. The results indicate that, although PCA provides the principal factor for the spectral features associated with chemical compounds, MEEMD can provide additional intrinsic features with more reliable mapping of the chemical compounds. MEEMD has the potential to extract intrinsic fluorescence features and improve the detection of biochemical changes. PMID:24240806
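
    As a sketch of the conventional PCA baseline mentioned above (MEEMD itself, with its multi-dimensional sifting, is too long for a snippet): unfold each EEM into a vector and extract the principal spectral features across samples. Sizes and data are illustrative.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(3)
eems = rng.random((30, 40, 50))   # 30 samples, 40 excitation x 50 emission bins

X = eems.reshape(len(eems), -1)   # unfold each EEM into a 2000-vector
scores = PCA(n_components=3).fit_transform(X)
print(scores.shape)               # (30, 3) sample scores on principal EEM features
```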

  14. Multifrequency synthesis and extraction using square wave projection patterns for quantitative tissue imaging.

    PubMed

    Nadeau, Kyle P; Rice, Tyler B; Durkin, Anthony J; Tromberg, Bruce J

    2015-11-01

    We present a method for spatial frequency domain data acquisition utilizing a multifrequency synthesis and extraction (MSE) method and binary square wave projection patterns. By illuminating a sample with square wave patterns, multiple spatial frequency components are simultaneously attenuated and can be extracted to determine optical property and depth information. Additionally, binary patterns are projected faster than sinusoids typically used in spatial frequency domain imaging (SFDI), allowing for short (millisecond or less) camera exposure times, and data acquisition speeds an order of magnitude or more greater than conventional SFDI. In cases where sensitivity to superficial layers or scattering is important, the fundamental component from higher frequency square wave patterns can be used. When probing deeper layers, the fundamental and harmonic components from lower frequency square wave patterns can be used. We compared optical property and depth penetration results extracted using square waves to those obtained using sinusoidal patterns on an in vivo human forearm and absorbing tube phantom, respectively. Absorption and reduced scattering coefficient values agree with conventional SFDI to within 1% using both high frequency (fundamental) and low frequency (fundamental and harmonic) spatial frequencies. Depth penetration reflectance values also agree to within 1% of conventional SFDI.
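
    A short numeric sketch of why one square-wave pattern carries several usable spatial frequencies: the Fourier series of a square wave of frequency f contains only odd harmonics nf with amplitudes 4/(πn), so the fundamental and third harmonic can both be recovered from a single projected pattern. The pattern parameters are illustrative.

```python
import numpy as np

x = np.linspace(0, 1, 1024, endpoint=False)   # normalized position axis
f0 = 8                                        # cycles across the field
square = np.sign(np.sin(2 * np.pi * f0 * x))  # binary projection pattern

# FFT magnitude scaled so a unit sinusoid reads as amplitude 1.
spectrum = np.abs(np.fft.rfft(square)) * 2 / len(x)
print(spectrum[f0], 4 / np.pi)                # fundamental, ~1.273
print(spectrum[3 * f0], 4 / (3 * np.pi))      # 3rd harmonic, ~0.424
```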

  15. Multifrequency synthesis and extraction using square wave projection patterns for quantitative tissue imaging

    PubMed Central

    Nadeau, Kyle P.; Rice, Tyler B.; Durkin, Anthony J.; Tromberg, Bruce J.

    2015-01-01

    We present a method for spatial frequency domain data acquisition utilizing a multifrequency synthesis and extraction (MSE) method and binary square wave projection patterns. By illuminating a sample with square wave patterns, multiple spatial frequency components are simultaneously attenuated and can be extracted to determine optical property and depth information. Additionally, binary patterns are projected faster than sinusoids typically used in spatial frequency domain imaging (SFDI), allowing for short (millisecond or less) camera exposure times, and data acquisition speeds an order of magnitude or more greater than conventional SFDI. In cases where sensitivity to superficial layers or scattering is important, the fundamental component from higher frequency square wave patterns can be used. When probing deeper layers, the fundamental and harmonic components from lower frequency square wave patterns can be used. We compared optical property and depth penetration results extracted using square waves to those obtained using sinusoidal patterns on an in vivo human forearm and absorbing tube phantom, respectively. Absorption and reduced scattering coefficient values agree with conventional SFDI to within 1% using both high frequency (fundamental) and low frequency (fundamental and harmonic) spatial frequencies. Depth penetration reflectance values also agree to within 1% of conventional SFDI. PMID:26524682

  16. Isolation, Separation, and Preconcentration of Biologically Active Compounds from Plant Matrices by Extraction Techniques.

    PubMed

    Raks, Victoria; Al-Suod, Hossam; Buszewski, Bogusław

    2018-01-01

    The development of efficient methods for the isolation and separation of biologically active compounds remains an important challenge for researchers. Designing systems such as organomineral composite materials that allow extraction of a wide range of biologically active compounds, acting as broad-utility solid-phase extraction agents, remains an important and necessary task. Selective sorbents can easily be used for highly selective and reliable extraction of specific components present in complex matrices. Herein, state-of-the-art approaches for the selective isolation, preconcentration, and separation of biologically active compounds from a range of matrices are discussed. Primary focus is given to novel extraction methods for biologically active compounds including cyclic polyols, flavonoids, and oligosaccharides from plants. In addition, the application of silica-, carbon-, and polymer-based solid-phase extraction adsorbents and of membrane extraction for the selective separation of these compounds is discussed. Potential separation-process interactions are discussed; understanding them is of utmost importance for creating optimal conditions to extract biologically active compounds, including those with estrogenic properties.

  17. Pharmacological effects of the phytochemicals of Anethum sowa L. root extracts.

    PubMed

    Saleh-E-In, Md Moshfekus; Sultana, Nasim; Hossain, Md Nur; Hasan, Sayeema; Islam, Md Rabiul

    2016-11-14

    Anethum sowa L. is widely used as an important spice and traditional medicinal plant to treat various ailments. On the basis of scientific ethnobotanical information, this study was undertaken to evaluate the antioxidant, antimicrobial and cytotoxic activity of crude extracts of Anethum sowa L. roots and to identify the classes of phytochemicals present by chemical tests. The antioxidant potential of the extracts was ascertained with the stable organic free radical 2,2-diphenyl-1-picrylhydrazyl (DPPH). The agar well diffusion method was used to determine the susceptibility of bacterial and fungal strains to the crude extracts. The minimum inhibitory concentration (MIC) and minimum bactericidal concentration (MBC) were determined by the microdilution test. Cytotoxic activities were screened using the brine shrimp (Artemia salina) lethality assay. Finally, phytochemicals were profiled using standard procedures. A preliminary phytochemical screening of the crude extracts obtained with methanol, ethyl acetate and chloroform showed the presence of secondary metabolites such as flavonoids, alkaloids, saponins, cardiac glycosides and tannins, while cyanogenetic glycosides were not detected. The methanol, ethyl acetate and chloroform extracts displayed high antioxidant activity (IC50 = 13.08 ± 0.03, 33.48 ± 0.16 and 36.42 ± 0.41 μg/mL, respectively) in the DPPH assay, comparable to that of the standards ascorbic acid and BHT (IC50 = 3.74 ± 0.05 and 11.84 ± 0.29 μg/mL). The crude ethyl acetate and chloroform extracts possessed excellent cytotoxic activity (LC50 = 5.03 ± 0.08, 5.23 ± 0.11 and 17.22 ± 0.14 μg/mL, respectively) against brine shrimp larvae after 24 h of treatment, compared with the standard vincristine sulphate (LC50 = 0.46 ± 0.05 μg/mL). The extracts also showed good antimicrobial activity against both Gram-positive and Gram-negative bacteria when compared with the two standard antibiotics ciprofloxacin and tetracycline. These results show that Anethum sowa root extracts are an important source of antioxidant, antimicrobial and cytotoxic agents. Further research is necessary to isolate and characterize the different phytoconstituents as pharmaceutical drug lead molecules and to verify the traditional uses.
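
    For readers unfamiliar with how an IC50 such as those quoted above is obtained, a minimal sketch: fit a dose-response (Hill) curve to scavenging measurements and read off the concentration giving 50% inhibition. The data points below are invented for illustration only.

```python
import numpy as np
from scipy.optimize import curve_fit

def hill(c, ic50, n):
    """Fraction of DPPH radical scavenged at extract concentration c."""
    return c**n / (ic50**n + c**n)

conc = np.array([2.0, 5.0, 10.0, 20.0, 40.0, 80.0])          # ug/mL (made up)
inhibition = np.array([0.12, 0.28, 0.45, 0.62, 0.80, 0.91])  # fraction (made up)

(ic50, n), _ = curve_fit(hill, conc, inhibition, p0=[15.0, 1.0])
print(f"IC50 = {ic50:.1f} ug/mL, Hill slope = {n:.2f}")
```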

  18. The use of experimental structures to model protein dynamics.

    PubMed

    Katebi, Ataur R; Sankar, Kannan; Jia, Kejue; Jernigan, Robert L

    2015-01-01

    The number of solved protein structures submitted to the Protein Data Bank (PDB) has increased dramatically in recent years. For some specific proteins, this number is very high; for example, there are over 550 solved structures for HIV-1 protease, one protein that is essential for the life cycle of the human immunodeficiency virus (HIV), which causes acquired immunodeficiency syndrome (AIDS) in humans. The large number of structures for the same protein and its variants includes a sample of different conformational states of the protein. A rich set of structures solved experimentally for the same protein has information buried within the dataset that can explain the functional dynamics and structural mechanism of the protein. To extract the dynamics information and functional mechanism from the experimental structures, this chapter focuses on two methods: Principal Component Analysis (PCA) and Elastic Network Models (ENM). PCA is a widely used statistical dimensionality-reduction technique for classifying and visualizing high-dimensional data. ENMs, on the other hand, are a well-established, simple biophysical method for modeling the functionally important global motions of proteins. This chapter covers the basics of these two methods. Moreover, an improved ENM version that utilizes the variations found within a given set of structures for a protein is described. As a practical example, we have extracted the functional dynamics and mechanism of the HIV-1 protease dimeric structure by using a set of 329 PDB structures of this protein. We describe, step by step, how to select a set of protein structures, how to extract the needed information from the PDB files for PCA, how to extract the dynamics information using PCA, how to calculate ENM modes, how to measure the congruency between the dynamics computed from the principal components (PCs) and the ENM modes, and how to compute entropies using the PCs. We provide the computer programs or references to software tools to accomplish each step and show how to use these programs and tools. We also include computer programs to generate movies based on PCs and ENM modes and describe how to visualize them.
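
    A compact sketch of the two methods named here, under stated assumptions: synthetic coordinates stand in for aligned Cα traces from the PDB, the 7.3 Å cutoff is a common Gaussian network model (GNM) choice, and the per-residue overlap below is one simple congruency measure, not the chapter's exact recipe.

```python
import numpy as np

rng = np.random.default_rng(4)
n_struct, n_res = 50, 99
coords = rng.normal(size=(n_struct, n_res, 3))  # aligned C-alpha coordinates

# PCA: SVD of the mean-centered, flattened coordinate ensemble.
X = coords.reshape(n_struct, -1)
X = X - X.mean(axis=0)
_, svals, pcs = np.linalg.svd(X, full_matrices=False)
variances = svals**2 / (n_struct - 1)           # variance captured by each PC

# GNM: Kirchhoff (connectivity) matrix built from one structure.
xyz = coords[0]
dists = np.linalg.norm(xyz[:, None, :] - xyz[None, :, :], axis=-1)
contact = (dists < 7.3) & ~np.eye(n_res, dtype=bool)
kirchhoff = np.diag(contact.sum(axis=1)) - contact.astype(float)
evals, modes = np.linalg.eigh(kirchhoff)        # modes[:, 1] = slowest mode

# Congruency: normalized dot product between per-residue PC1 amplitudes
# and the magnitude of the slowest GNM mode.
pc1_res = np.linalg.norm(pcs[0].reshape(n_res, 3), axis=1)
mode1 = np.abs(modes[:, 1])
overlap = pc1_res @ mode1 / (np.linalg.norm(pc1_res) * np.linalg.norm(mode1))
print(f"PC1 vs GNM mode 1 overlap: {overlap:.2f}")
```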

  19. The Use of Experimental Structures to Model Protein Dynamics

    PubMed Central

    Katebi, Ataur R.; Sankar, Kannan; Jia, Kejue; Jernigan, Robert L.

    2014-01-01

    The number of solved protein structures submitted to the Protein Data Bank (PDB) has increased dramatically in recent years. For some specific proteins, this number is very high – for example, there are over 550 solved structures for HIV-1 protease, one protein that is essential for the life cycle of the human immunodeficiency virus (HIV), which causes acquired immunodeficiency syndrome (AIDS) in humans. The large number of structures for the same protein and its variants includes a sample of different conformational states of the protein. A rich set of structures solved experimentally for the same protein has information buried within the dataset that can explain the functional dynamics and structural mechanism of the protein. To extract the dynamics information and functional mechanism from the experimental structures, this chapter focuses on two methods – Principal Component Analysis (PCA) and Elastic Network Models (ENM). PCA is a widely used statistical dimensionality-reduction technique for classifying and visualizing high-dimensional data. ENMs, on the other hand, are a well-established, simple biophysical method for modeling the functionally important global motions of proteins. This chapter covers the basics of these two methods. Moreover, an improved ENM version that utilizes the variations found within a given set of structures for a protein is described. As a practical example, we have extracted the functional dynamics and mechanism of the HIV-1 protease dimeric structure by using a set of 329 PDB structures of this protein. We describe, step by step, how to select a set of protein structures, how to extract the needed information from the PDB files for PCA, how to extract the dynamics information using PCA, how to calculate ENM modes, how to measure the congruency between the dynamics computed from the principal components (PCs) and the ENM modes, and how to compute entropies using the PCs. We provide the computer programs or references to software tools to accomplish each step and show how to use these programs and tools. We also include computer programs to generate movies based on PCs and ENM modes and describe how to visualize them. PMID:25330965

  20. Investigation of the Impact of Extracting and Exchanging Health Information by Using Internet and Social Networks.

    PubMed

    Pistolis, John; Zimeras, Stelios; Chardalias, Kostas; Roupa, Zoe; Fildisis, George; Diomidous, Marianna

    2016-06-01

    Social networks (1) have been embedded in our daily life for a long time. They constitute a powerful tool used nowadays for both searching for and exchanging information on different issues through Internet search engines (Google, Bing, etc.) and Social Networks (Facebook, Twitter, etc.). This paper presents the results of research on the frequency and type of usage of the Internet and Social Networks by the general public and health professionals. The objectives of the research focused on investigating how frequently both individuals and health practitioners seek and meticulously search for health information in social media. Exchanging information is a procedure that involves issues of the reliability and quality of the information. In this research, advanced statistical techniques are used to investigate the participants' profiles in using social networks for searching and exchanging information on health issues. Based on the answers, 93% of the people use the Internet to find information on health subjects. According to the principal component analysis, the most important health subjects were nutrition (0.719%), respiratory issues (0.79%), cardiological issues (0.777%), psychological issues (0.667%) and total (73.8%). The research results, based on different statistical techniques, revealed that 61.2% of the males and 56.4% of the females intended to use social networks for searching for medical information. Based on the principal component analysis, the most important sources the participants mentioned were the Internet and social networks for exchanging information on health issues. These sources proved to be of paramount importance to the participants of the study. The same holds for the nursing, medical and administrative staff in hospitals.
