Sample records for information extraction tasks

  1. Question analysis for Indonesian comparative question

    NASA Astrophysics Data System (ADS)

    Saelan, A.; Purwarianti, A.; Widyantoro, D. H.

    2017-01-01

    Seeking information is a basic human need today. Comparing things with a search engine takes more time than searching for a single thing. In this paper, we analyze comparative questions for a comparative question answering system. A comparative question is a question that compares two or more entities. We grouped comparative questions into 5 types: selection between mentioned entities, selection between unmentioned entities, selection between any entity, comparison, and yes-or-no question. We then extracted 4 types of information from comparative questions: entity, aspect, comparison, and constraint. We built classifiers for the classification task and the information extraction task. The classification task uses bag-of-words features, whereas the information extraction task uses the lexical form of the current word, the lexical forms of the 2 previous and 2 following words, and the previous label. We tried 2 scenarios: classification first and extraction first. In the classification-first scenario, we used the classification result as a feature for extraction; in the extraction-first scenario, we used the extraction result as features for classification. We found that results were better when extraction was done before classification. For the extraction task, SMO gave the best result (88.78%), while for classification, naïve Bayes performed best (82.35%).
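
    The extraction features described in this record (current word, a window of 2 words on each side, and the previous label) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the `<PAD>` sentinel, the label `O`, and the English example sentence are all assumptions made for the sketch.

```python
from typing import Dict, List


def token_features(tokens: List[str], i: int, prev_label: str) -> Dict[str, str]:
    """Features for token i: the word itself, a +/-2 word window, and the previous label."""
    feats = {"word": tokens[i].lower(), "prev_label": prev_label}
    for offset in (-2, -1, 1, 2):
        j = i + offset
        # Positions outside the sentence get a sentinel value (an assumption of this sketch).
        feats[f"word{offset:+d}"] = tokens[j].lower() if 0 <= j < len(tokens) else "<PAD>"
    return feats


# Hypothetical English example; the paper itself works on Indonesian questions.
tokens = "which is cheaper A or B".split()
features = token_features(tokens, 2, prev_label="O")
```

    Each dictionary could then be fed to any sequence or token classifier; the record does not say how the features were encoded.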

  2. Information based universal feature extraction

    NASA Astrophysics Data System (ADS)

    Amiri, Mohammad; Brause, Rüdiger

    2015-02-01

    In many real-world image-based pattern recognition tasks, the extraction and usage of task-relevant features are the most crucial part of the diagnosis. In the standard approach, they mostly remain task-specific, although humans who perform such a task always use the same image features, trained in early childhood. It seems that universal feature sets exist, but they are not yet systematically found. In our contribution, we tried to find those universal image feature sets that are valuable for most image-related tasks. In our approach, we trained a neural network with natural and non-natural images of objects and background, using a Shannon information-based algorithm and learning constraints. The goal was to extract those features that give the most valuable information for classification of visual objects hand-written digits. This will give a good start and performance increase for all other image learning tasks, implementing a transfer learning approach. As a result, in our case we found that we could indeed extract features which are valid in all three kinds of tasks.

  3. Challenges in Managing Information Extraction

    ERIC Educational Resources Information Center

    Shen, Warren H.

    2009-01-01

    This dissertation studies information extraction (IE), the problem of extracting structured information from unstructured data. Example IE tasks include extracting person names from news articles, product information from e-commerce Web pages, street addresses from emails, and names of emerging music bands from blogs. IE is all increasingly…

  4. Automated concept-level information extraction to reduce the need for custom software and rules development.

    PubMed

    D'Avolio, Leonard W; Nguyen, Thien M; Goryachev, Sergey; Fiore, Louis D

    2011-01-01

    Despite at least 40 years of promising empirical performance, very few clinical natural language processing (NLP) or information extraction systems currently contribute to medical science or care. The authors address this gap by reducing the need for custom software and rules development with a graphical user interface-driven, highly generalizable approach to concept-level retrieval. A 'learn by example' approach combines features derived from open-source NLP pipelines with open-source machine learning classifiers to automatically and iteratively evaluate top-performing configurations. The Fourth i2b2/VA Shared Task Challenge's concept extraction task provided the data sets and metrics used to evaluate performance. Top F-measure scores for each of the tasks were medical problems (0.83), treatments (0.82), and tests (0.83). Recall lagged precision in all experiments. Precision was near or above 0.90 in all tasks. Discussion: With no customization for the tasks and less than 5 min of end-user time to configure and launch each experiment, the average F-measure was 0.83, one point behind the mean F-measure of the 22 entrants in the competition. Strong precision scores indicate the potential of applying the approach for more specific clinical information extraction tasks. There was not one best configuration, supporting an iterative approach to model creation. Acceptable levels of performance can be achieved using fully automated and generalizable approaches to concept-level information extraction. The described implementation and related documentation are available for download.
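
    To make the relationship between the reported precision and F-measure concrete, the sketch below computes the F-measure as the weighted harmonic mean of precision and recall. It assumes the balanced form (beta = 1), which the record does not state explicitly, and the recall value in the example is purely hypothetical since the record reports no recall figures.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (beta = 1 gives the balanced F1)."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Precision of 0.90 matches the level mentioned in the record;
# the recall of 0.77 is a made-up value used only for illustration.
example = f_measure(0.90, 0.77)
```

    With recall below precision, the F-measure falls between the two values and closer to the lower one, which is consistent with an F-measure of about 0.83 alongside precision near 0.90.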

  5. Systematically Extracting Metal- and Solvent-Related Occupational Information from Free-Text Responses to Lifetime Occupational History Questionnaires

    PubMed Central

    Friesen, Melissa C.; Locke, Sarah J.; Tornow, Carina; Chen, Yu-Cheng; Koh, Dong-Hee; Stewart, Patricia A.; Purdue, Mark; Colt, Joanne S.

    2014-01-01

    Objectives: Lifetime occupational history (OH) questionnaires often use open-ended questions to capture detailed information about study participants’ jobs. Exposure assessors use this information, along with responses to job- and industry-specific questionnaires, to assign exposure estimates on a job-by-job basis. An alternative approach is to use information from the OH responses and the job- and industry-specific questionnaires to develop programmable decision rules for assigning exposures. As a first step in this process, we developed a systematic approach to extract the free-text OH responses and convert them into standardized variables that represented exposure scenarios. Methods: Our study population comprised 2408 subjects, reporting 11991 jobs, from a case–control study of renal cell carcinoma. Each subject completed a lifetime OH questionnaire that included verbatim responses, for each job, to open-ended questions including job title, main tasks and activities (task), tools and equipment used (tools), and chemicals and materials handled (chemicals). Based on a review of the literature, we identified exposure scenarios (occupations, industries, tasks/tools/chemicals) expected to involve possible exposure to chlorinated solvents, trichloroethylene (TCE) in particular, lead, and cadmium. We then used a SAS macro to review the information reported by study participants to identify jobs associated with each exposure scenario; this was done using previously coded standardized occupation and industry classification codes, and a priori lists of associated key words and phrases related to possibly exposed tasks, tools, and chemicals. Exposure variables representing the occupation, industry, and task/tool/chemicals exposure scenarios were added to the work history records of the study respondents. 
Our identification of possibly TCE-exposed scenarios in the OH responses was compared to an expert’s independently assigned probability ratings to evaluate whether we missed identifying possibly exposed jobs. Results: Our process added exposure variables for 52 occupation groups, 43 industry groups, and 46 task/tool/chemical scenarios to the data set of OH responses. Across all four agents, we identified possibly exposed task/tool/chemical exposure scenarios in 44–51% of the jobs in possibly exposed occupations. Possibly exposed task/tool/chemical exposure scenarios were found in a nontrivial 9–14% of the jobs not in possibly exposed occupations, suggesting that our process identified important information that would not be captured using occupation alone. Our extraction process was sensitive: for jobs where our extraction of OH responses identified no exposure scenarios and for which the sole source of information was the OH responses, only 0.1% were assessed as possibly exposed to TCE by the expert. Conclusions: Our systematic extraction of OH information found useful information in the task/chemicals/tools responses that was relatively easy to extract and that was not available from the occupational or industry information. The extracted variables can be used as inputs in the development of decision rules, especially for jobs where no additional information, such as job- and industry-specific questionnaires, is available. PMID:24590110
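
    The keyword-driven step described in this record, in which free-text responses are converted to standardized exposure-scenario variables, can be sketched as below. The study used a SAS macro; this Python version is a substitute written for illustration only. The scenario names and keyword stems are hypothetical, because the record does not list the a priori keywords.

```python
import re
from typing import Dict, List

# Hypothetical scenarios and keyword stems; the study's actual lists are not given here.
SCENARIO_KEYWORDS: Dict[str, List[str]] = {
    "degreasing": ["degreas", "vapor degreaser"],
    "soldering": ["solder"],
}


def flag_scenarios(response: str) -> Dict[str, int]:
    """Return a 0/1 indicator per scenario for one free-text response."""
    text = response.lower()
    return {
        scenario: int(any(re.search(r"\b" + re.escape(k), text) for k in keywords))
        for scenario, keywords in SCENARIO_KEYWORDS.items()
    }


flags = flag_scenarios("Cleaned metal parts with a degreaser; some soldering")
```

    Matching on word-initial stems is a design choice of the sketch so that "degreaser" and "degreasing" both hit the same scenario; the resulting indicators play the role of the exposure variables added to each job record.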

  6. Systematically extracting metal- and solvent-related occupational information from free-text responses to lifetime occupational history questionnaires.

    PubMed

    Friesen, Melissa C; Locke, Sarah J; Tornow, Carina; Chen, Yu-Cheng; Koh, Dong-Hee; Stewart, Patricia A; Purdue, Mark; Colt, Joanne S

    2014-06-01

    Lifetime occupational history (OH) questionnaires often use open-ended questions to capture detailed information about study participants' jobs. Exposure assessors use this information, along with responses to job- and industry-specific questionnaires, to assign exposure estimates on a job-by-job basis. An alternative approach is to use information from the OH responses and the job- and industry-specific questionnaires to develop programmable decision rules for assigning exposures. As a first step in this process, we developed a systematic approach to extract the free-text OH responses and convert them into standardized variables that represented exposure scenarios. Our study population comprised 2408 subjects, reporting 11991 jobs, from a case-control study of renal cell carcinoma. Each subject completed a lifetime OH questionnaire that included verbatim responses, for each job, to open-ended questions including job title, main tasks and activities (task), tools and equipment used (tools), and chemicals and materials handled (chemicals). Based on a review of the literature, we identified exposure scenarios (occupations, industries, tasks/tools/chemicals) expected to involve possible exposure to chlorinated solvents, trichloroethylene (TCE) in particular, lead, and cadmium. We then used a SAS macro to review the information reported by study participants to identify jobs associated with each exposure scenario; this was done using previously coded standardized occupation and industry classification codes, and a priori lists of associated key words and phrases related to possibly exposed tasks, tools, and chemicals. Exposure variables representing the occupation, industry, and task/tool/chemicals exposure scenarios were added to the work history records of the study respondents. 
Our identification of possibly TCE-exposed scenarios in the OH responses was compared to an expert's independently assigned probability ratings to evaluate whether we missed identifying possibly exposed jobs. Our process added exposure variables for 52 occupation groups, 43 industry groups, and 46 task/tool/chemical scenarios to the data set of OH responses. Across all four agents, we identified possibly exposed task/tool/chemical exposure scenarios in 44-51% of the jobs in possibly exposed occupations. Possibly exposed task/tool/chemical exposure scenarios were found in a nontrivial 9-14% of the jobs not in possibly exposed occupations, suggesting that our process identified important information that would not be captured using occupation alone. Our extraction process was sensitive: for jobs where our extraction of OH responses identified no exposure scenarios and for which the sole source of information was the OH responses, only 0.1% were assessed as possibly exposed to TCE by the expert. Our systematic extraction of OH information found useful information in the task/chemicals/tools responses that was relatively easy to extract and that was not available from the occupational or industry information. The extracted variables can be used as inputs in the development of decision rules, especially for jobs where no additional information, such as job- and industry-specific questionnaires, is available. Published by Oxford University Press on behalf of the British Occupational Hygiene Society 2014.

  7. Overview of the Cancer Genetics and Pathway Curation tasks of BioNLP Shared Task 2013

    PubMed Central

    2015-01-01

    Background: Since their introduction in 2009, the BioNLP Shared Task events have been instrumental in advancing the development of methods and resources for the automatic extraction of information from the biomedical literature. In this paper, we present the Cancer Genetics (CG) and Pathway Curation (PC) tasks, two event extraction tasks introduced in the BioNLP Shared Task 2013. The CG task focuses on cancer, emphasizing the extraction of physiological and pathological processes at various levels of biological organization, and the PC task targets reactions relevant to the development of biomolecular pathway models, defining its extraction targets on the basis of established pathway representations and ontologies. Results: Six groups participated in the CG task and two groups in the PC task, together applying a wide range of extraction approaches including both established state-of-the-art systems and newly introduced extraction methods. The best-performing systems achieved F-scores of 55% on the CG task and 53% on the PC task, demonstrating a level of performance comparable to the best results achieved in similar previously proposed tasks. Conclusions: The results indicate that existing event extraction technology can generalize to meet the novel challenges represented by the CG and PC task settings, suggesting that extraction methods are capable of supporting the construction of knowledge bases on the molecular mechanisms of cancer and the curation of biomolecular pathway models. The CG and PC tasks continue as open challenges for all interested parties, with data, tools and resources available from the shared task homepage. PMID:26202570

  8. Active learning for ontological event extraction incorporating named entity recognition and unknown word handling.

    PubMed

    Han, Xu; Kim, Jung-jae; Kwoh, Chee Keong

    2016-01-01

    Biomedical text mining may target various kinds of valuable information embedded in the literature, but a critical obstacle to the extension of the mining targets is the cost of manual construction of labeled data, which are required for state-of-the-art supervised learning systems. Active learning is to choose the most informative documents for the supervised learning in order to reduce the amount of required manual annotations. Previous works of active learning, however, focused on the tasks of entity recognition and protein-protein interactions, but not on event extraction tasks for multiple event types. They also did not consider the evidence of event participants, which might be a clue for the presence of events in unlabeled documents. Moreover, the confidence scores of events produced by event extraction systems are not reliable for ranking documents in terms of informativity for supervised learning. We here propose a novel committee-based active learning method that supports multi-event extraction tasks and employs a new statistical method for informativity estimation instead of using the confidence scores from event extraction systems. Our method is based on a committee of two systems as follows: We first employ an event extraction system to filter potential false negatives among unlabeled documents, from which the system does not extract any event. We then develop a statistical method to rank the potential false negatives of unlabeled documents 1) by using a language model that measures the probabilities of the expression of multiple events in documents and 2) by using a named entity recognition system that locates the named entities that can be event arguments (e.g. proteins). The proposed method further deals with unknown words in test data by using word similarity measures. We also apply our active learning method for the task of named entity recognition. 
We evaluate the proposed method against the BioNLP Shared Tasks datasets, and show that our method can achieve better performance than such previous methods as entropy and Gibbs error based methods and a conventional committee-based method. We also show that the incorporation of named entity recognition into the active learning for event extraction and the unknown word handling further improve the active learning method. In addition, the adaptation of the active learning method into named entity recognition tasks also improves the document selection for manual annotation of named entities.

  9. Representation control increases task efficiency in complex graphical representations.

    PubMed

    Moritz, Julia; Meyerhoff, Hauke S; Meyer-Dernbecher, Claudia; Schwan, Stephan

    2018-01-01

    In complex graphical representations, the relevant information for a specific task is often distributed across multiple spatial locations. In such situations, understanding the representation requires internal transformation processes in order to extract the relevant information. However, digital technology enables observers to alter the spatial arrangement of depicted information and therefore to offload the transformation processes. The objective of this study was to investigate the use of such a representation control (i.e. the users' option to decide how information should be displayed) in order to accomplish an information extraction task in terms of solution time and accuracy. In the representation control condition, the participants were allowed to reorganize the graphical representation and reduce information density. In the control condition, no interactive features were offered. We observed that participants in the representation control condition solved tasks that required reorganization of the maps faster and more accurately than participants without representation control. The present findings demonstrate how processes of cognitive offloading, spatial contiguity, and information coherence interact in knowledge media intended for broad and diverse groups of recipients.

  10. Representation control increases task efficiency in complex graphical representations

    PubMed Central

    Meyerhoff, Hauke S.; Meyer-Dernbecher, Claudia; Schwan, Stephan

    2018-01-01

    In complex graphical representations, the relevant information for a specific task is often distributed across multiple spatial locations. In such situations, understanding the representation requires internal transformation processes in order to extract the relevant information. However, digital technology enables observers to alter the spatial arrangement of depicted information and therefore to offload the transformation processes. The objective of this study was to investigate the use of such a representation control (i.e. the users' option to decide how information should be displayed) in order to accomplish an information extraction task in terms of solution time and accuracy. In the representation control condition, the participants were allowed to reorganize the graphical representation and reduce information density. In the control condition, no interactive features were offered. We observed that participants in the representation control condition solved tasks that required reorganization of the maps faster and more accurately than participants without representation control. The present findings demonstrate how processes of cognitive offloading, spatial contiguity, and information coherence interact in knowledge media intended for broad and diverse groups of recipients. PMID:29698443

  11. User-centered evaluation of Arizona BioPathway: an information extraction, integration, and visualization system.

    PubMed

    Quiñones, Karin D; Su, Hua; Marshall, Byron; Eggers, Shauna; Chen, Hsinchun

    2007-09-01

    Explosive growth in biomedical research has made automated information extraction, knowledge integration, and visualization increasingly important and critically needed. The Arizona BioPathway (ABP) system extracts and displays biological regulatory pathway information from the abstracts of journal articles. This study uses relations extracted from more than 200 PubMed abstracts presented in a tabular and graphical user interface with built-in search and aggregation functionality. This paper presents a task-centered assessment of the usefulness and usability of the ABP system focusing on its relation aggregation and visualization functionalities. Results suggest that our graph-based visualization is more efficient in supporting pathway analysis tasks and is perceived as more useful and easier to use as compared to a text-based literature-viewing method. Relation aggregation significantly contributes to knowledge-acquisition efficiency. Together, the graphic and tabular views in the ABP Visualizer provide a flexible and effective interface for pathway relation browsing and analysis. Our study contributes to pathway-related research and biological information extraction by assessing the value of a multiview, relation-based interface that supports user-controlled exploration of pathway information across multiple granularities.

  12. BioNLP Shared Task--The Bacteria Track.

    PubMed

    Bossy, Robert; Jourde, Julien; Manine, Alain-Pierre; Veber, Philippe; Alphonse, Erick; van de Guchte, Maarten; Bessières, Philippe; Nédellec, Claire

    2012-06-26

    We present the BioNLP 2011 Shared Task Bacteria Track, the first Information Extraction challenge entirely dedicated to bacteria. It includes three tasks that cover different levels of biological knowledge. The Bacteria Gene Renaming supporting task is aimed at extracting gene renaming and gene name synonymy in PubMed abstracts. The Bacteria Gene Interaction is a gene/protein interaction extraction task from individual sentences. The interactions have been categorized into ten different sub-types, thus giving a detailed account of genetic regulations at the molecular level. Finally, the Bacteria Biotopes task focuses on the localization and environment of bacteria mentioned in textbook articles. We describe the process of creation for the three corpora, including document acquisition and manual annotation, as well as the metrics used to evaluate the participants' submissions. Three teams submitted to the Bacteria Gene Renaming task; the best team achieved an F-score of 87%. For the Bacteria Gene Interaction task, the only participant's score had reached a global F-score of 77%, although the system efficiency varies significantly from one sub-type to another. Three teams submitted to the Bacteria Biotopes task with very different approaches; the best team achieved an F-score of 45%. However, the detailed study of the participating systems efficiency reveals the strengths and weaknesses of each participating system. The three tasks of the Bacteria Track offer participants a chance to address a wide range of issues in Information Extraction, including entity recognition, semantic typing and coreference resolution. We found common trends in the most efficient systems: the systematic use of syntactic dependencies and machine learning. Nevertheless, the originality of the Bacteria Biotopes task encouraged the use of interesting novel methods and techniques, such as term compositionality, scopes wider than the sentence.

  13. A framework for feature extraction from hospital medical data with applications in risk prediction.

    PubMed

    Tran, Truyen; Luo, Wei; Phung, Dinh; Gupta, Sunil; Rana, Santu; Kennedy, Richard Lee; Larkins, Ann; Venkatesh, Svetha

    2014-12-30

    Feature engineering is a time-consuming component of predictive modeling. We propose a versatile platform to automatically extract features for risk prediction, based on a pre-defined and extensible entity schema. The extraction is independent of disease type or risk prediction task. We contrast auto-extracted features to baselines generated from the Elixhauser comorbidities. Hospital medical records were transformed to event sequences, to which filters were applied to extract feature sets capturing diversity in temporal scales and data types. The features were evaluated on a readmission prediction task, comparing with baseline feature sets generated from the Elixhauser comorbidities. The prediction model was logistic regression with elastic net regularization. Prediction horizons of 1, 2, 3, 6, 12 months were considered for four diverse diseases: diabetes, COPD, mental disorders and pneumonia, with derivation and validation cohorts defined on non-overlapping data-collection periods. For unplanned readmissions, the auto-extracted feature set using socio-demographic information and medical records outperformed baselines derived from the socio-demographic information and Elixhauser comorbidities, over 20 settings (5 prediction horizons over 4 diseases). In particular over 30-day prediction, the AUCs are: COPD-baseline: 0.60 (95% CI: 0.57, 0.63), auto-extracted: 0.67 (0.64, 0.70); diabetes-baseline: 0.60 (0.58, 0.63), auto-extracted: 0.67 (0.64, 0.69); mental disorders-baseline: 0.57 (0.54, 0.60), auto-extracted: 0.69 (0.64, 0.70); pneumonia-baseline: 0.61 (0.59, 0.63), auto-extracted: 0.70 (0.67, 0.72). The advantages of auto-extracted standard features from complex medical records, in a disease- and task-agnostic manner, were demonstrated. Auto-extracted features have good predictive power over multiple time horizons. Such feature sets have potential to form the foundation of complex automated analytic tasks.
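
    One way to picture "filters applied to event sequences" at several temporal scales is to count events per type within look-back windows of different lengths. The sketch below is an assumption-laden illustration, not the platform described in the record: the event representation, the window lengths (30, 90, 365 days), and the feature naming are all hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple

# (days before the prediction point, event code); a hypothetical representation.
Event = Tuple[int, str]


def windowed_counts(events: List[Event], windows=(30, 90, 365)) -> Dict[str, int]:
    """Count events of each code inside each look-back window."""
    feats: Counter = Counter()
    for days_ago, code in events:
        for w in windows:
            if 0 <= days_ago <= w:
                feats[f"{code}_last{w}d"] += 1
    return dict(feats)


events = [(5, "admission"), (40, "admission"), (200, "lab_test")]
feats = windowed_counts(events)
```

    Features of this kind could then be passed to a regularized classifier such as the elastic net logistic regression mentioned in the record.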

  14. Thinking graphically: Connecting vision and cognition during graph comprehension.

    PubMed

    Ratwani, Raj M; Trafton, J Gregory; Boehm-Davis, Deborah A

    2008-03-01

    Task analytic theories of graph comprehension account for the perceptual and conceptual processes required to extract specific information from graphs. Comparatively, the processes underlying information integration have received less attention. We propose a new framework for information integration that highlights visual integration and cognitive integration. During visual integration, pattern recognition processes are used to form visual clusters of information; these visual clusters are then used to reason about the graph during cognitive integration. In 3 experiments, the processes required to extract specific information and to integrate information were examined by collecting verbal protocol and eye movement data. Results supported the task analytic theories for specific information extraction and the processes of visual and cognitive integration for integrative questions. Further, the integrative processes scaled up as graph complexity increased, highlighting the importance of these processes for integration in more complex graphs. Finally, based on this framework, design principles to improve both visual and cognitive integration are described.

  15. Distributed smoothed tree kernel for protein-protein interaction extraction from the biomedical literature

    PubMed Central

    Murugesan, Gurusamy; Abdulkadhar, Sabenabanu; Natarajan, Jeyakumar

    2017-01-01

    Automatic extraction of protein-protein interaction (PPI) pairs from biomedical literature is a widely examined task in biological information extraction. Currently, many kernel-based approaches such as linear kernel, tree kernel, graph kernel and combinations of multiple kernels have achieved promising results in the PPI task. However, most of these kernel methods fail to capture the semantic relation information between two entities. In this paper, we present a special type of tree kernel for PPI extraction, known as the Distributed Smoothed Tree kernel (DSTK), which exploits both syntactic (structural) and semantic vector information. DSTK comprises distributed trees with syntactic information along with distributional semantic vectors representing semantic information of the sentences or phrases. To generate a robust machine learning model, a feature-based kernel and DSTK were combined using an ensemble support vector machine (SVM). Five different corpora (AIMed, BioInfer, HPRD50, IEPA, and LLL) were used for evaluating the performance of our system. Experimental results show that our system achieves a better F-score on the five corpora than other state-of-the-art systems. PMID:29099838

  16. Distributed smoothed tree kernel for protein-protein interaction extraction from the biomedical literature.

    PubMed

    Murugesan, Gurusamy; Abdulkadhar, Sabenabanu; Natarajan, Jeyakumar

    2017-01-01

    Automatic extraction of protein-protein interaction (PPI) pairs from biomedical literature is a widely examined task in biological information extraction. Currently, many kernel-based approaches such as linear kernel, tree kernel, graph kernel and combinations of multiple kernels have achieved promising results in the PPI task. However, most of these kernel methods fail to capture the semantic relation information between two entities. In this paper, we present a special type of tree kernel for PPI extraction, known as the Distributed Smoothed Tree kernel (DSTK), which exploits both syntactic (structural) and semantic vector information. DSTK comprises distributed trees with syntactic information along with distributional semantic vectors representing semantic information of the sentences or phrases. To generate a robust machine learning model, a feature-based kernel and DSTK were combined using an ensemble support vector machine (SVM). Five different corpora (AIMed, BioInfer, HPRD50, IEPA, and LLL) were used for evaluating the performance of our system. Experimental results show that our system achieves a better F-score on the five corpora than other state-of-the-art systems.

  17. Impaired visual recognition of biological motion in schizophrenia.

    PubMed

    Kim, Jejoong; Doop, Mikisha L; Blake, Randolph; Park, Sohee

    2005-09-15

    Motion perception deficits have been suggested to be an important feature of schizophrenia but the behavioral consequences of such deficits are unknown. Biological motion refers to the movements generated by living beings. The human visual system rapidly and effortlessly detects and extracts socially relevant information from biological motion. A deficit in biological motion perception may have significant consequences for detecting and interpreting social information. Schizophrenia patients and matched healthy controls were tested on two visual tasks: recognition of human activity portrayed in point-light animations (biological motion task) and a perceptual control task involving detection of a grouped figure against the background noise (global-form task). Both tasks required detection of a global form against background noise but only the biological motion task required the extraction of motion-related information. Schizophrenia patients performed as well as the controls in the global-form task, but were significantly impaired on the biological motion task. In addition, deficits in biological motion perception correlated with impaired social functioning as measured by the Zigler social competence scale [Zigler, E., Levine, J. (1981). Premorbid competence in schizophrenia: what is being measured? Journal of Consulting and Clinical Psychology, 49, 96-105.]. The deficit in biological motion processing, which may be related to the previously documented deficit in global motion processing, could contribute to abnormal social functioning in schizophrenia.

  18. Government Information Locator Service (GILS). Draft report to the Information Infrastructure Task Force

    NASA Technical Reports Server (NTRS)

    1994-01-01

    This is a draft report on the Government Information Locator Service (GILS) to the National Information Infrastructure (NII) task force. GILS is designed to take advantage of internetworking technology known as client-server architecture which allows information to be distributed among multiple independent information servers. Two appendices are provided -- (1) A glossary of related terminology and (2) extracts from a draft GILS profile for the use of the American National Standard Information Retrieval Application Service Definition and Protocol Specification for Library Applications.

  19. A knowledge engineering approach to recognizing and extracting sequences of nucleic acids from scientific literature.

    PubMed

    García-Remesal, Miguel; Maojo, Victor; Crespo, José

    2010-01-01

    In this paper we present a knowledge engineering approach to automatically recognize and extract genetic sequences from scientific articles. To carry out this task, we use a preliminary recognizer based on a finite state machine to extract all candidate DNA/RNA sequences. The candidates are then fed into a knowledge-based system that automatically discards false positives and refines noisy and incorrectly merged sequences. We created the knowledge base by manually analyzing different manuscripts containing genetic sequences. Our approach was evaluated using a test set of 211 full-text articles in PDF format containing 3134 genetic sequences. On this set, we achieved 87.76% precision and 97.70% recall. This method can facilitate different research tasks, including text mining, information extraction, and information retrieval research dealing with large collections of documents containing genetic sequences.
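
    The candidate-recognition step described above can be illustrated with a minimal sketch of a two-state scanner. It assumes that a candidate is simply a run of nucleotide letters (A, C, G, T, U) of at least `min_len` characters; the function name, the alphabet, and the length threshold are illustrative assumptions rather than the authors' recognizer, and the knowledge-based refinement stage is not modeled.

```python
NUCLEOTIDES = set("ACGTU")  # assumed alphabet for DNA/RNA candidates


def find_candidate_sequences(text, min_len=10):
    """Scan text with a two-state machine: OUTSIDE or INSIDE a nucleotide run."""
    candidates = []
    start = None  # None means the machine is in the OUTSIDE state
    # A trailing space acts as a sentinel so the final run is flushed.
    for i, ch in enumerate(text + " "):
        if ch.upper() in NUCLEOTIDES:
            if start is None:
                start = i  # transition OUTSIDE -> INSIDE
        elif start is not None:
            if i - start >= min_len:
                candidates.append(text[start:i])  # keep sufficiently long runs
            start = None  # transition INSIDE -> OUTSIDE
    return candidates
```

    Short runs such as the "a" in ordinary words are dropped by the length threshold; a real system would still need a later filtering stage, as the abstract notes, to discard false positives.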

  20. Concept recognition for extracting protein interaction relations from biomedical text

    PubMed Central

    Baumgartner, William A; Lu, Zhiyong; Johnson, Helen L; Caporaso, J Gregory; Paquette, Jesse; Lindemann, Anna; White, Elizabeth K; Medvedeva, Olga; Cohen, K Bretonnel; Hunter, Lawrence

    2008-01-01

    Background: Reliable information extraction applications have been a long sought goal of the biomedical text mining community, a goal that if reached would provide valuable tools to benchside biologists in their increasingly difficult task of assimilating the knowledge contained in the biomedical literature. We present an integrated approach to concept recognition in biomedical text. Concept recognition provides key information that has been largely missing from previous biomedical information extraction efforts, namely direct links to well defined knowledge resources that explicitly cement the concept's semantics. The BioCreative II tasks discussed in this special issue have provided a unique opportunity to demonstrate the effectiveness of concept recognition in the field of biomedical language processing. Results: Through the modular construction of a protein interaction relation extraction system, we present several use cases of concept recognition in biomedical text, and relate these use cases to potential uses by the benchside biologist. Conclusion: Current information extraction technologies are approaching performance standards at which concept recognition can begin to deliver high quality data to the benchside biologist. Our system is available as part of the BioCreative Meta-Server project and on the internet . PMID:18834500

  1. Building an automated SOAP classifier for emergency department reports.

    PubMed

    Mowery, Danielle; Wiebe, Janyce; Visweswaran, Shyam; Harkema, Henk; Chapman, Wendy W

    2012-02-01

    Information extraction applications that extract structured event and entity information from unstructured text can leverage knowledge of clinical report structure to improve performance. The Subjective, Objective, Assessment, Plan (SOAP) framework, used to structure progress notes to facilitate problem-specific, clinical decision making by physicians, is one example of a well-known, canonical structure in the medical domain. Although its applicability to structuring data is understood, its contribution to information extraction tasks has not yet been determined. The first step to evaluating the SOAP framework's usefulness for clinical information extraction is to apply the model to clinical narratives and develop an automated SOAP classifier that classifies sentences from clinical reports. In this quantitative study, we applied the SOAP framework to sentences from emergency department reports, and trained and evaluated SOAP classifiers built with various linguistic features. We found the SOAP framework can be applied manually to emergency department reports with high agreement (Cohen's kappa coefficients over 0.70). Using a variety of features, we found classifiers for each SOAP class can be created with moderate to outstanding performance with F(1) scores of 93.9 (subjective), 94.5 (objective), 75.7 (assessment), and 77.0 (plan). We look forward to expanding the framework and applying the SOAP classification to clinical information extraction tasks. Copyright © 2011. Published by Elsevier Inc.
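
    The agreement figure quoted above is Cohen's kappa, defined as (p_o - p_e) / (1 - p_e), where p_o is the observed agreement between two annotators and p_e is the agreement expected by chance from their label frequencies. The sketch below computes it for two label lists; the function name and the toy labels in the usage note are illustrative assumptions, not data from the study.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's marginal label rates.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label
    return (observed - expected) / (1 - expected)
```

    For example, with annotator labels ["S", "S", "O", "O"] and ["S", "S", "O", "A"], the observed agreement is 0.75, the chance agreement is 0.375, and kappa is 0.6.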

  2. Perceiving age and gender in unfamiliar faces: brain potential evidence for implicit and explicit person categorization.

    PubMed

    Wiese, Holger; Schweinberger, Stefan R; Neumann, Markus F

    2008-11-01

    We used repetition priming to investigate implicit and explicit processes of unfamiliar face categorization. During prime and test phases, participants categorized unfamiliar faces according to either age or gender. Faces presented at test were either new or primed in a task-congruent (same task during priming and test) or incongruent (different tasks) condition. During age categorization, reaction times revealed significant priming for both priming conditions, and event-related potentials yielded an increased N170 over the left hemisphere as a result of priming. During gender categorization, congruent faces elicited priming and a latency decrease in the right N170. Accordingly, information about age is extracted irrespective of processing demands, and priming facilitates the extraction of feature information reflected in the left N170 effect. By contrast, priming of gender categorization may depend on whether the task at initial presentation requires configural processing.

  3. Informational primacy of visual dimensions: specialized roles for luminance and chromaticity in figure-ground perception.

    PubMed

    Yamagishi, N; Melara, R D

    2001-07-01

    Three experiments were conducted to examine the distinct contributions of two visual dimensions to figure-ground segregation. In each experiment, pattern identification was assessed by asking observers to judge whether a near-threshold test pattern was the same or different in shape to a high-contrast comparison pattern. A test pattern could differ from its background along one dimension, either luminance (luminance tasks) or chromaticity (chromaticity tasks). In each task, performance in a baseline condition, in which the test pattern was intact, was compared with performance in each of several degradation conditions, in which either the contour or the surface of the figure was degraded, using either partial occlusion (Experiment 1) or ramping (Experiments 2 and 3) of figure-ground differences. In each experiment, performance in luminance tasks was worst when the contour was degraded, whereas performance in chromaticity tasks was worst when the surface was degraded. This interaction was found even when spatial frequencies were fixed across test patterns by low-pass filtering. The results are consistent with a late (postfiltering) dual-mechanism system that processes luminance information to extract boundary representations and chromaticity information to extract surface representations.

  4. An information extraction framework for cohort identification using electronic health records.

    PubMed

    Liu, Hongfang; Bielinski, Suzette J; Sohn, Sunghwan; Murphy, Sean; Wagholikar, Kavishwar B; Jonnalagadda, Siddhartha R; Ravikumar, K E; Wu, Stephen T; Kullo, Iftikhar J; Chute, Christopher G

    2013-01-01

    Information extraction (IE), a natural language processing (NLP) task that automatically extracts structured or semi-structured information from free text, has become popular in the clinical domain for supporting automated systems at point-of-care and enabling secondary use of electronic health records (EHRs) for clinical and translational research. However, a high performance IE system can be very challenging to construct due to the complexity and dynamic nature of human language. In this paper, we report an IE framework for cohort identification using EHRs that is a knowledge-driven framework developed under the Unstructured Information Management Architecture (UIMA). A system to extract specific information can be developed by subject matter experts through expert knowledge engineering of the externalized knowledge resources used in the framework.

  5. Integrating Information Extraction Agents into a Tourism Recommender System

    NASA Astrophysics Data System (ADS)

    Esparcia, Sergio; Sánchez-Anguix, Víctor; Argente, Estefanía; García-Fornes, Ana; Julián, Vicente

    Recommender systems face some problems. On the one hand, information needs to be kept up to date, which can be a costly task if it is not performed automatically. On the other hand, it may be worthwhile to include third-party services in the recommendation, since they improve its quality. In this paper, we present an add-on for the Social-Net Tourism Recommender System that uses information extraction and natural language processing techniques to automatically extract and classify information from the Web. Its goal is to keep the system up to date and to obtain information about third-party services that are not offered by service providers inside the system.

  6. Investigating the feasibility of using partial least squares as a method of extracting salient information for the evaluation of digital breast tomosynthesis

    NASA Astrophysics Data System (ADS)

    Zhang, George Z.; Myers, Kyle J.; Park, Subok

    2013-03-01

    Digital breast tomosynthesis (DBT) has shown promise for improving the detection of breast cancer, but it has not yet been fully optimized due to a large space of system parameters to explore. A task-based statistical approach [1] is a rigorous method for evaluating and optimizing this promising imaging technique with the use of optimal observers such as the Hotelling observer (HO). However, the high data dimensionality found in DBT has been the bottleneck for the use of a task-based approach in DBT evaluation. To reduce data dimensionality while extracting salient information for performing a given task, efficient channels have to be used for the HO. In the past few years, 2D Laguerre-Gauss (LG) channels, which are a complete basis for stationary backgrounds and rotationally symmetric signals, have been utilized for DBT evaluation [2, 3]. But since background and signal statistics from DBT data are neither stationary nor rotationally symmetric, LG channels may not be efficient in providing reliable performance trends as a function of system parameters. Recently, partial least squares (PLS) has been shown to generate efficient channels for the Hotelling observer in detection tasks involving random backgrounds and signals [4]. In this study, we investigate the use of PLS as a method for extracting salient information from DBT in order to better evaluate such systems.

  7. Image Analysis and Modeling

    DTIC Science & Technology

    1975-08-01

    image analysis and processing tasks such as information extraction, image enhancement and restoration, coding, etc. The ultimate objective of this research is to form a basis for the development of technology relevant to military applications of machine extraction of information from aircraft and satellite imagery of the earth’s surface. This report discusses research activities during the three month period February 1 - April 30,

  8. Improving the Accuracy of Attribute Extraction using the Relatedness between Attribute Values

    NASA Astrophysics Data System (ADS)

    Bollegala, Danushka; Tani, Naoki; Ishizuka, Mitsuru

    Extracting attribute-values related to entities from web texts is an important step in numerous web related tasks such as information retrieval, information extraction, and entity disambiguation (namesake disambiguation). For example, for a search query that contains a personal name, we can not only return documents that contain that personal name, but if we have attribute-values such as the organization for which that person works, we can also suggest documents that contain information related to that organization, thereby improving the user's search experience. Despite numerous potential applications of attribute extraction, it remains a challenging task due to the inherent noise in web data -- often a single web page contains multiple entities and attributes. We propose a graph-based approach to select the correct attribute-values from a set of candidate attribute-values extracted for a particular entity. First, we build an undirected weighted graph in which attribute-values are represented by nodes, and the edge that connects two nodes represents the degree of relatedness between the corresponding attribute-values. Next, we find the maximum spanning tree of this graph that connects exactly one attribute-value for each attribute-type. The proposed method outperforms previously proposed attribute extraction methods on a dataset that contains 5000 web pages.
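
    The selection step described above can be sketched as a brute-force search, under stated assumptions: it enumerates every way of choosing one candidate value per attribute type, scores each choice by the weight of the maximum spanning tree over the chosen values, and keeps the best. The function names, the toy candidates, and the relatedness scores are hypothetical, candidate values are assumed to be distinct strings, and the exhaustive enumeration is a stand-in for illustration, not the authors' algorithm.

```python
from itertools import product


def max_spanning_tree_weight(nodes, weight):
    """Total weight of a maximum spanning tree over nodes (Prim's algorithm)."""
    nodes = list(nodes)
    if not nodes:
        return 0.0
    in_tree = {nodes[0]}
    total = 0.0
    while len(in_tree) < len(nodes):
        # Pick the heaviest edge that leaves the current tree.
        w, v = max(
            ((weight(u, x), x) for u in in_tree for x in nodes if x not in in_tree),
            key=lambda pair: pair[0],
        )
        total += w
        in_tree.add(v)
    return total


def select_attribute_values(candidates, weight):
    """candidates maps attribute type -> list of candidate values."""
    types = sorted(candidates)
    best_combo, best_score = None, float("-inf")
    for combo in product(*(candidates[t] for t in types)):
        score = max_spanning_tree_weight(combo, weight)
        if score > best_score:
            best_combo, best_score = combo, score
    return dict(zip(types, best_combo))


# Hypothetical candidates and relatedness scores for one person name.
CANDIDATES = {
    "organization": ["Google", "Univ. of Tokyo"],
    "location": ["Mountain View", "Tokyo"],
    "title": ["engineer", "professor"],
}
RELATEDNESS = {
    frozenset(("Google", "Mountain View")): 0.9,
    frozenset(("Google", "engineer")): 0.8,
    frozenset(("Mountain View", "engineer")): 0.5,
    frozenset(("Univ. of Tokyo", "Tokyo")): 0.6,
    frozenset(("Univ. of Tokyo", "professor")): 0.7,
    frozenset(("Tokyo", "professor")): 0.3,
}


def relatedness(u, v):
    """Symmetric lookup; unlisted pairs are treated as unrelated."""
    return RELATEDNESS.get(frozenset((u, v)), 0.0)
```

    Calling `select_attribute_values(CANDIDATES, relatedness)` on this toy data picks the mutually related triple (Google, Mountain View, engineer). The enumeration grows exponentially with the number of attribute types, so it is only practical for small candidate sets.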

  9. Biological event composition

    PubMed Central

    2012-01-01

    Background In recent years, biological event extraction has emerged as a key natural language processing task, aiming to address the information overload problem in accessing the molecular biology literature. The BioNLP shared task competitions have contributed to this recent interest considerably. The first competition (BioNLP'09) focused on extracting biological events from Medline abstracts from a narrow domain, while the theme of the latest competition (BioNLP-ST'11) was generalization and a wider range of text types, event types, and subject domains were considered. We view event extraction as a building block in larger discourse interpretation and propose a two-phase, linguistically-grounded, rule-based methodology. In the first phase, a general, underspecified semantic interpretation is composed from syntactic dependency relations in a bottom-up manner. The notion of embedding underpins this phase and it is informed by a trigger dictionary and argument identification rules. Coreference resolution is also performed at this step, allowing extraction of inter-sentential relations. The second phase is concerned with constraining the resulting semantic interpretation by shared task specifications. We evaluated our general methodology on core biological event extraction and speculation/negation tasks in three main tracks of BioNLP-ST'11 (GENIA, EPI, and ID). Results We achieved competitive results in GENIA and ID tracks, while our results in the EPI track leave room for improvement. One notable feature of our system is that its performance across abstracts and article bodies is stable. Coreference resolution results in minor improvement in system performance. Due to our interest in discourse-level elements, such as speculation/negation and coreference, we provide a more detailed analysis of our system performance in these subtasks. Conclusions The results demonstrate the viability of a robust, linguistically-oriented methodology, which clearly distinguishes general semantic interpretation from shared task specific aspects, for biological event extraction. Our error analysis pinpoints some shortcomings, which we plan to address in future work within our incremental system development methodology. PMID:22759461

  10. An Information Extraction Framework for Cohort Identification Using Electronic Health Records

    PubMed Central

    Liu, Hongfang; Bielinski, Suzette J.; Sohn, Sunghwan; Murphy, Sean; Wagholikar, Kavishwar B.; Jonnalagadda, Siddhartha R.; Ravikumar, K.E.; Wu, Stephen T.; Kullo, Iftikhar J.; Chute, Christopher G

    Information extraction (IE), a natural language processing (NLP) task that automatically extracts structured or semi-structured information from free text, has become popular in the clinical domain for supporting automated systems at point-of-care and enabling secondary use of electronic health records (EHRs) for clinical and translational research. However, a high performance IE system can be very challenging to construct due to the complexity and dynamic nature of human language. In this paper, we report an IE framework for cohort identification using EHRs that is a knowledge-driven framework developed under the Unstructured Information Management Architecture (UIMA). A system to extract specific information can be developed by subject matter experts through expert knowledge engineering of the externalized knowledge resources used in the framework. PMID:24303255

  11. Models Extracted from Text for System-Software Safety Analyses

    NASA Technical Reports Server (NTRS)

    Malin, Jane T.

    2010-01-01

    This presentation describes extraction and integration of requirements information and safety information in visualizations to support early review of completeness, correctness, and consistency of lengthy and diverse system safety analyses. Software tools have been developed and extended to perform the following tasks: 1) extract model parts and safety information from text in interface requirements documents, failure modes and effects analyses and hazard reports; 2) map and integrate the information to develop system architecture models and visualizations for safety analysts; and 3) provide model output to support virtual system integration testing. This presentation illustrates the methods and products with a rocket motor initiation case.

  12. An Analysis of Students' Mistakes on Routine Slope Tasks

    ERIC Educational Resources Information Center

    Cho, Peter; Nagle, Courtney

    2017-01-01

    This study extends past research on students' understanding of slope by analyzing college students' mistakes on routine tasks involving slope. We conduct quantitative and qualitative analysis of students' mistakes to extract information regarding slope conceptualizations described in prior research. Results delineate procedural proficiencies and…

  13. Pharmacovigilance from social media: mining adverse drug reaction mentions using sequence labeling with word embedding cluster features.

    PubMed

    Nikfarjam, Azadeh; Sarker, Abeed; O'Connor, Karen; Ginn, Rachel; Gonzalez, Graciela

    2015-05-01

    Social media is becoming increasingly popular as a platform for sharing personal health-related information. This information can be utilized for public health monitoring tasks, particularly for pharmacovigilance, via the use of natural language processing (NLP) techniques. However, the language in social media is highly informal, and user-expressed medical concepts are often nontechnical, descriptive, and challenging to extract. There has been limited progress in addressing these challenges, and thus far, advanced machine learning-based NLP techniques have been underutilized. Our objective is to design a machine learning-based approach to extract mentions of adverse drug reactions (ADRs) from highly informal text in social media. We introduce ADRMine, a machine learning-based concept extraction system that uses conditional random fields (CRFs). ADRMine utilizes a variety of features, including a novel feature for modeling words' semantic similarities. The similarities are modeled by clustering words based on unsupervised, pretrained word representation vectors (embeddings) generated from unlabeled user posts in social media using a deep learning technique. ADRMine outperforms several strong baseline systems in the ADR extraction task by achieving an F-measure of 0.82. Feature analysis demonstrates that the proposed word cluster features significantly improve extraction performance. It is possible to extract complex medical concepts, with relatively high performance, from informal, user-generated content. Our approach is particularly scalable, suitable for social media mining, as it relies on large volumes of unlabeled data, thus diminishing the need for large, annotated training data sets. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association.

  14. A method for automatically extracting infectious disease-related primers and probes from the literature

    PubMed Central

    2010-01-01

    Background Primer and probe sequences are the main components of nucleic acid-based detection systems. Biologists use primers and probes for different tasks, some related to the diagnosis and prescription of infectious diseases. The biological literature is the main information source for empirically validated primer and probe sequences. Therefore, it is becoming increasingly important for researchers to navigate this important information. In this paper, we present a four-phase method for extracting and annotating primer/probe sequences from the literature. These phases are: (1) convert each document into a tree of paper sections, (2) detect the candidate sequences using a set of finite state machine-based recognizers, (3) refine problem sequences using a rule-based expert system, and (4) annotate the extracted sequences with their related organism/gene information. Results We tested our approach using a test set composed of 297 manuscripts. The extracted sequences and their organism/gene annotations were manually evaluated by a panel of molecular biologists. The results of the evaluation show that our approach is suitable for automatically extracting DNA sequences, achieving precision/recall rates of 97.98% and 95.77%, respectively. In addition, 76.66% of the detected sequences were correctly annotated with their organism name. The system also provided correct gene-related information for 46.18% of the sequences assigned a correct organism name. Conclusions We believe that the proposed method can facilitate routine tasks for biomedical researchers using molecular methods to diagnose and prescribe different infectious diseases. In addition, the proposed method can be expanded to detect and extract other biological sequences from the literature. The extracted information can also be used to readily update available primer/probe databases or to create new databases from scratch. PMID:20682041

  15. Decoding memory features from hippocampal spiking activities using sparse classification models.

    PubMed

    Dong Song; Hampson, Robert E; Robinson, Brian S; Marmarelis, Vasilis Z; Deadwyler, Sam A; Berger, Theodore W

    2016-08-01

    To understand how memory information is encoded in the hippocampus, we build classification models to decode memory features from hippocampal CA3 and CA1 spatio-temporal patterns of spikes recorded from epilepsy patients performing a memory-dependent delayed match-to-sample task. The classification model consists of a set of B-spline basis functions for extracting memory features from the spike patterns, and a sparse logistic regression classifier for generating binary categorical output of memory features. Results show that classification models can extract a significant amount of memory information with respect to types of memory tasks and categories of sample images used in the task, despite the high level of variability in prediction accuracy due to the small sample size. These results support the hypothesis that memories are encoded in hippocampal activity and have important implications for the development of hippocampal memory prostheses.

  16. Thinking Graphically: Connecting Vision and Cognition during Graph Comprehension

    ERIC Educational Resources Information Center

    Ratwani, Raj M.; Trafton, J. Gregory; Boehm-Davis, Deborah A.

    2008-01-01

    Task analytic theories of graph comprehension account for the perceptual and conceptual processes required to extract specific information from graphs. Comparatively, the processes underlying information integration have received less attention. We propose a new framework for information integration that highlights visual integration and cognitive…

  17. Map Design for Computer Processing: Literature Review and DMA Product Critique.

    DTIC Science & Technology

    1985-01-01

    requirements can be separated contour lines (vegetation shown by iconic symbols) from user preference. versus extracting relief information using only con...tour lines (vegetation shown by tints); 0 extracting vegetation information using iconic sym- PERFORMANCE TESTING bols (relief shown by elevation...show another: trapolating the symbols on a white background) in tim- * in the case of point symbols, iconic forms where ing the performance of tasks

  18. Further Investigations of Content Analytic Techniques for Extracting the Differentiating Information Contained in the Narrative Sections of Performance Evaluations for Navy Enlisted Personnel. Technical Report No. 75-1.

    ERIC Educational Resources Information Center

    Ramsey-Klee, Diane M.; Richman, Vivian

    The purpose of this research is to develop content analytic techniques capable of extracting the differentiating information in narrative performance evaluations for enlisted personnel in order to aid in the process of selecting personnel for advancement, duty assignment, training, or quality retention. Four tasks were performed. The first task…

  19. YAdumper: extracting and translating large information volumes from relational databases to structured flat files.

    PubMed

    Fernández, José M; Valencia, Alfonso

    2004-10-12

    Downloading the information stored in relational databases into XML and other flat formats is a common task in bioinformatics. This periodical dumping of information requires considerable CPU time, disk and memory resources. YAdumper has been developed as a purpose-specific tool to deal with the integral structured information download of relational databases. YAdumper is a Java application that organizes database extraction following an XML template based on an external Document Type Declaration. Compared with other non-native alternatives, YAdumper substantially reduces memory requirements and considerably improves writing performance.

  20. A stable biologically motivated learning mechanism for visual feature extraction to handle facial categorization.

    PubMed

    Rajaei, Karim; Khaligh-Razavi, Seyed-Mahdi; Ghodrati, Masoud; Ebrahimpour, Reza; Shiri Ahmad Abadi, Mohammad Ebrahim

    2012-01-01

    The brain mechanism of extracting visual features for recognizing various objects has consistently been a controversial issue in computational models of object recognition. To extract visual features, we introduce a new, biologically motivated model for facial categorization, which is an extension of the Hubel and Wiesel simple-to-complex cell hierarchy. To address the synaptic stability versus plasticity dilemma, we apply the Adaptive Resonance Theory (ART) for extracting informative intermediate level visual features during the learning process, which also makes this model stable against the destruction of previously learned information while learning new information. Such a mechanism has been suggested to be embedded within known laminar microcircuits of the cerebral cortex. To reveal the strength of the proposed visual feature learning mechanism, we show that when we use this mechanism in the training process of a well-known biologically motivated object recognition model (the HMAX model), it performs better than the HMAX model in face/non-face classification tasks. Furthermore, we demonstrate that our proposed mechanism is capable of following similar trends in performance as humans in a psychophysical experiment using a face versus non-face rapid categorization task.

  1. Analysis of Technique to Extract Data from the Web for Improved Performance

    NASA Astrophysics Data System (ADS)

    Gupta, Neena; Singh, Manish

    2010-11-01

    The World Wide Web is rapidly guiding the world into a new electronic era, in which anyone can publish anything in electronic form and extract almost any information. Extraction of information from semi-structured or unstructured documents, such as web pages, is a useful yet complex task. Data extraction, which is important for many applications, extracts records from HTML files automatically. Ontologies can achieve a high degree of accuracy in data extraction. We analyze a data extraction method, OBDE (Ontology-Based Data Extraction), which automatically extracts the query result records from the web with the help of agents. OBDE first constructs an ontology for a domain according to information matching between the query interfaces and query result pages from different web sites within the same domain. Then, the constructed domain ontology is used during data extraction to identify the query result section in a query result page and to align and label the data values in the extracted records. The ontology-assisted data extraction method is fully automatic and overcomes many of the deficiencies of current automatic data extraction methods.

  2. Position-aware deep multi-task learning for drug-drug interaction extraction.

    PubMed

    Zhou, Deyu; Miao, Lei; He, Yulan

    2018-05-01

    A drug-drug interaction (DDI) is a situation in which a drug affects the activity of another drug synergistically or antagonistically when being administered together. The information of DDIs is crucial for healthcare professionals to prevent adverse drug events. Although some known DDIs can be found in purposely-built databases such as DrugBank, most information is still buried in scientific publications. Therefore, automatically extracting DDIs from biomedical texts is sorely needed. In this paper, we propose a novel position-aware deep multi-task learning approach for extracting DDIs from biomedical texts. In particular, sentences are represented as a sequence of word embeddings and position embeddings. An attention-based bidirectional long short-term memory (BiLSTM) network is used to encode each sentence. The relative position information of words with the target drugs in text is combined with the hidden states of BiLSTM to generate the position-aware attention weights. Moreover, the tasks of predicting whether or not two drugs interact with each other and further distinguishing the types of interactions are learned jointly in multi-task learning framework. The proposed approach has been evaluated on the DDIExtraction challenge 2013 corpus and the results show that with the position-aware attention only, our proposed approach outperforms the state-of-the-art method by 0.99% for binary DDI classification, and with both position-aware attention and multi-task learning, our approach achieves a micro F-score of 72.99% on interaction type identification, outperforming the state-of-the-art approach by 1.51%, which demonstrates the effectiveness of the proposed approach. Copyright © 2018 Elsevier B.V. All rights reserved.
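
    The relative position information mentioned above can be illustrated with a small sketch that computes, for each token, its signed offset from the two target drug mentions. Such offsets are commonly used as indices into a position-embedding table; the function name, the clipping range `max_distance`, and the example sentence are illustrative assumptions, not details taken from the paper.

```python
def relative_positions(tokens, drug1_index, drug2_index, max_distance=10):
    """Return one (offset to drug 1, offset to drug 2) pair per token.

    Offsets are clipped to [-max_distance, max_distance] so that a
    position-embedding table of fixed size can be indexed with them.
    """
    def clip(distance):
        return max(-max_distance, min(max_distance, distance))

    return [
        (clip(i - drug1_index), clip(i - drug2_index))
        for i in range(len(tokens))
    ]
```

    For the tokens of "aspirin may increase the effect of warfarin" with the drugs at indices 0 and 6, the token "increase" (index 2) receives the pair (2, -4). In a full model each offset would be mapped to a learned vector and concatenated with the word embedding.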

  3. SOCIAL MEDIA MINING SHARED TASK WORKSHOP.

    PubMed

    Sarker, Abeed; Nikfarjam, Azadeh; Gonzalez, Graciela

    2016-01-01

    Social media has evolved into a crucial resource for obtaining large volumes of real-time information. The promise of social media has been realized by the public health domain, and recent research has addressed some important challenges in that domain by utilizing social media data. Tasks such as monitoring flu trends, viral disease outbreaks, medication abuse, and adverse drug reactions are some examples of studies where data from social media have been exploited. The focus of this workshop is to explore solutions to three important natural language processing challenges for domain-specific social media text: (i) text classification, (ii) information extraction, and (iii) concept normalization. To explore different approaches to solving these problems on social media data, we designed a shared task which was open to participants globally. We designed three tasks using our in-house annotated Twitter data on adverse drug reactions. Task 1 involved automatic classification of adverse drug reaction assertive user posts; Task 2 focused on extracting specific adverse drug reaction mentions from user posts; and Task 3, which was slightly ill-defined due to the complex nature of the problem, involved normalizing user mentions of adverse drug reactions to standardized concept IDs. A total of 11 teams participated, and a total of 24 (18 for Task 1, and 6 for Task 2) system runs were submitted. Following the evaluation of the systems, and an assessment of their innovation/novelty, we accepted 7 descriptive manuscripts for publication--5 for Task 1 and 2 for Task 2. We provide descriptions of the tasks, data, and participating systems in this paper.

  4. Building gold standard corpora for medical natural language processing tasks.

    PubMed

    Deleger, Louise; Li, Qi; Lingren, Todd; Kaiser, Megan; Molnar, Katalin; Stoutenborough, Laura; Kouril, Michal; Marsolo, Keith; Solti, Imre

    2012-01-01

    We present the construction of three annotated corpora to serve as gold standards for medical natural language processing (NLP) tasks. Clinical notes from the medical record, clinical trial announcements, and FDA drug labels are annotated. We report high inter-annotator agreement (overall F-measures between 0.8467 and 0.9176) for the annotation of Personal Health Information (PHI) elements for a de-identification task and of medications, diseases/disorders, and signs/symptoms for an information extraction (IE) task. The annotated corpora of clinical trials and FDA labels will be publicly released, and, to facilitate translational NLP tasks that require cross-corpora interoperability (e.g., clinical trial eligibility screening), their annotation schemas are aligned with a large-scale, NIH-funded clinical text annotation project.
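
    Inter-annotator agreement reported as an F-measure is usually obtained by treating one annotator as the reference and the other as the response. The sketch below shows one way this could be computed; exact matching on (start, end, label) triples and the example annotations are assumptions of this sketch, and the matching criteria actually used for the corpora may differ.

```python
def agreement_f_measure(annotator_a, annotator_b):
    """F-measure agreement between two sets of (start, end, label) annotations.

    Annotator A is treated as the reference and annotator B as the response;
    only exact span and label matches count (an assumption of this sketch).
    """
    a, b = set(annotator_a), set(annotator_b)
    if not a and not b:
        return 1.0  # nothing annotated by either side: trivial agreement
    matched = len(a & b)
    precision = matched / len(b) if b else 0.0
    recall = matched / len(a) if a else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical annotations of the same note by two annotators.
ann_a = {(0, 7, "MEDICATION"), (15, 23, "DISEASE"), (30, 35, "SYMPTOM")}
ann_b = {(0, 7, "MEDICATION"), (15, 23, "DISEASE"), (40, 44, "SYMPTOM")}
score = agreement_f_measure(ann_a, ann_b)
print(round(score, 4))
```

    With exact matching the measure is symmetric, so swapping the two annotators gives the same value.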

  5. Detection and categorization of bacteria habitats using shallow linguistic analysis

    PubMed Central

    2015-01-01

    Background: Information regarding bacteria biotopes is important for several research areas including health sciences, microbiology, and food processing and preservation. One of the challenges for scientists in these domains is the huge amount of information buried in the text of electronic resources. Developing methods to automatically extract bacteria habitat relations from the text of these electronic resources is crucial for facilitating research in these areas. Methods: We introduce a linguistically motivated rule-based approach for recognizing and normalizing names of bacteria habitats in biomedical text by using an ontology. Our approach is based on the shallow syntactic analysis of the text, which includes sentence segmentation, part-of-speech (POS) tagging, partial parsing, and lemmatization. In addition, we propose two methods for identifying bacteria habitat localization relations. The underlying assumption for the first method is that discourse changes with a new paragraph. Therefore, it operates on a paragraph basis. The second method performs a more fine-grained analysis of the text and operates on a sentence basis. We also develop a novel anaphora resolution method for bacteria coreferences and incorporate it with the sentence-based relation extraction approach. Results: We participated in the Bacteria Biotope (BB) Task of the BioNLP Shared Task 2013. Our system (Boun) achieved the second best performance with 68% Slot Error Rate (SER) in Sub-task 1 (Entity Detection and Categorization), and ranked third with an F-score of 27% in Sub-task 2 (Localization Event Extraction). This paper reports the system that is implemented for the shared task, including the novel methods developed and the improvements obtained after the official evaluation. The extensions include the expansion of the OntoBiotope ontology using the training set for Sub-task 1, and the novel sentence-based relation extraction method incorporated with anaphora resolution for Sub-task 2. These extensions resulted in promising results for Sub-task 1 with a SER of 68%, and state-of-the-art performance for Sub-task 2 with an F-score of 53%. Conclusions: Our results show that a linguistically-oriented approach based on the shallow syntactic analysis of the text is as effective as machine learning approaches for the detection and ontology-based normalization of habitat entities. Furthermore, the newly developed sentence-based relation extraction system with the anaphora resolution module significantly outperforms the paragraph-based one, as well as the other systems that participated in the BB Shared Task 2013. PMID:26201262
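
    For readers unfamiliar with the Slot Error Rate, it is commonly defined as the number of substitutions, deletions, and insertions divided by the number of reference slots, so lower values are better and values above 1 are possible. The sketch below uses that common definition; the shared task may have used a weighted variant, and the counts shown are hypothetical rather than taken from the evaluation.

```python
def slot_error_rate(substitutions, deletions, insertions, reference_slots):
    """Common SER definition: (S + D + I) / N, with N the number of reference slots."""
    if reference_slots <= 0:
        raise ValueError("reference_slots must be positive")
    return (substitutions + deletions + insertions) / reference_slots


# Hypothetical error counts chosen only to illustrate the arithmetic.
ser = slot_error_rate(substitutions=20, deletions=30, insertions=18, reference_slots=100)
print(f"SER = {ser:.2f}")
```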

  6. Too Much Information--Too Much Apprehension

    ERIC Educational Resources Information Center

    Hijazi, Sam

    2004-01-01

    The information age along with the exponential increase in information technology has brought an unexpected amount of information. The endeavor to sort and extract a meaning from the massive amount of data has become a challenging task to many educators and managers. This research is an attempt to collect the most common suggestions to reduce the…

  7. Learning to research in first grade: Bridging the transition from narrative to expository texts and tasks

    NASA Astrophysics Data System (ADS)

    Weise, Richard

    Decades of research indicate that students at all academic grade and performance levels perform poorly with informational texts and tasks and particularly with locating assignment-relevant information in expository texts. Students have little understanding of the individual tasks required, the arc of the activity, the hierarchical structure of the information they seek, or how to reconstitute and interpret the information they extract. Poor performance begins with the introduction of textbooks and research assignments in fourth grade and continues into adulthood. However, to date, neither educators nor researchers have substantially addressed this problem. In this quasi-experimental study, we ask if first-grade children can perform essential tasks in identifying, extracting, and integrating assignment-relevant information and if instruction improves their performance. To answer this question, we conducted a 15-week, teacher-led intervention in two first-grade classrooms in an inner-city Nashville elementary school. We created a computer learning environment (NoteTaker) to facilitate children's creation of a mental model of the research process and a narrative/expository bridge curriculum to support the children's transition from all narrative to all expository texts and tasks. We also created a new scaffolding taxonomy and a reading-to-research model to focus our research. Teachers participated in weekly professional development workshops. The results of this quasi-experimental study indicate that at-risk, first-grade children are able to (a) identify relevant information in an expository text, (b) categorize the information they identify, and (c) justify their choice of category. Children's performance in the first and last tasks significantly improved with instruction, and low-performing readers showed the greatest benefits from instruction. We find that the children's performance in categorizing information depended upon content-specific knowledge that was not taught, as well as on the process knowledge that was taught. We also find that children's narrative reading performance predicted their initial performance for each assessment measure. We argue that first-grade children are developmentally ready to read expository texts and to learn reading-to-research tasks and that primary-school literacy instruction should not be limited to reading and writing stories.

  8. Acquiring geographical data with web harvesting

    NASA Astrophysics Data System (ADS)

    Dramowicz, K.

    2016-04-01

    Many websites contain very attractive and up-to-date geographical information. This information can be extracted, stored, analyzed and mapped using web harvesting techniques. Poorly organized data from websites are transformed with web harvesting into a more structured format, which can be stored in a database and analyzed. Almost 25% of web traffic is related to web harvesting, mostly while using search engines. This paper presents how to harvest geographic information from web documents using the free tool Beautiful Soup, one of the most commonly used Python libraries for pulling data from HTML and XML files. It is a relatively easy task to process one static HTML table. The more challenging task is to extract and save information from tables located in multiple and poorly organized websites. Legal and ethical aspects of web harvesting are discussed as well. The paper demonstrates two case studies. The first one shows how to extract various types of information about the Good Country Index from multiple web pages, load it into one attribute table and map the results. The second case study shows how script tools and GIS can be used to extract information from one hundred thirty-six websites about Nova Scotia wines. In a little more than three minutes, a database containing one hundred and six liquor stores selling these wines is created. Then the availability and spatial distribution of various types of wines (by grape types, by wineries, and by liquor stores) are mapped and analyzed.
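
    The "one static HTML table" case mentioned above can be sketched in a few lines. To keep the example free of third-party packages, it uses Python's standard-library `html.parser` instead of Beautiful Soup, so it is a stand-in for the paper's tool rather than a reproduction of it; the table contents, store names, and coordinates are invented for illustration.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the cell text of every <tr> in an HTML document as a list of rows."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Hypothetical page fragment; real pages would be downloaded first.
html = """
<table>
  <tr><th>Store</th><th>Latitude</th><th>Longitude</th></tr>
  <tr><td>Store A</td><td>44.65</td><td>-63.57</td></tr>
  <tr><td>Store B</td><td>45.08</td><td>-64.36</td></tr>
</table>
"""
parser = TableExtractor()
parser.feed(html)
header, *records = parser.rows
stores = [dict(zip(header, record)) for record in records]
print(stores)
```

    The resulting list of dictionaries can be written to a CSV file or a database table and then joined to a GIS layer; handling many poorly organized pages mostly adds per-site cleaning rules around this core step.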

  9. Characterizing Task-Based OpenMP Programs

    PubMed Central

    Muddukrishna, Ananya; Jonsson, Peter A.; Brorsson, Mats

    2015-01-01

    Programmers struggle to understand the performance of task-based OpenMP programs since profiling tools only report thread-based performance. Performance tuning also requires task-based performance information in order to balance per-task memory hierarchy utilization against exposed task parallelism. We provide a cost-effective method to extract detailed task-based performance information from OpenMP programs. We demonstrate the utility of our method by quickly diagnosing performance problems and characterizing exposed task parallelism and per-task instruction profiles of benchmarks in the widely-used Barcelona OpenMP Tasks Suite. By using our method to characterize task-based performance, programmers can tune performance faster and understand performance tradeoffs more effectively than with existing tools. PMID:25860023

  10. Applied cognitive task analysis (ACTA): a practitioner's toolkit for understanding cognitive task demands.

    PubMed

    Militello, L G; Hutton, R J

    1998-11-01

    Cognitive task analysis (CTA) is a set of methods for identifying cognitive skills, or mental demands, needed to perform a task proficiently. The product of the task analysis can be used to inform the design of interfaces and training systems. However, CTA is resource intensive and has previously been of limited use to design practitioners. A streamlined method of CTA, Applied Cognitive Task Analysis (ACTA), is presented in this paper. ACTA consists of three interview methods that help the practitioner to extract information about the cognitive demands and skills required for a task. ACTA also allows the practitioner to represent this information in a format that will translate more directly into applied products, such as improved training scenarios or interface recommendations. This paper will describe the three methods, an evaluation study conducted to assess the usability and usefulness of the methods, and some directions for future research for making cognitive task analysis accessible to practitioners. ACTA techniques were found to be easy to use, flexible, and to provide clear output. The information and training materials developed based on ACTA interviews were found to be accurate and important for training purposes.

  11. Extracting semantically enriched events from biomedical literature

    PubMed Central

    2012-01-01

    Background: Research into event-based text mining from the biomedical literature has been growing in popularity to facilitate the development of advanced biomedical text mining systems. Such technology permits advanced search, which goes beyond document or sentence-based retrieval. However, existing event-based systems typically ignore additional information within the textual context of events that can determine, amongst other things, whether an event represents a fact, hypothesis, experimental result or analysis of results, whether it describes new or previously reported knowledge, and whether it is speculated or negated. We refer to such contextual information as meta-knowledge. The automatic recognition of such information can permit the training of systems allowing finer-grained searching of events according to the meta-knowledge that is associated with them. Results: Based on a corpus of 1,000 MEDLINE abstracts, fully manually annotated with both events and associated meta-knowledge, we have constructed a machine learning-based system that automatically assigns meta-knowledge information to events. This system has been integrated into EventMine, a state-of-the-art event extraction system, in order to create a more advanced system (EventMine-MK) that not only extracts events from text automatically, but also assigns five different types of meta-knowledge to these events. The meta-knowledge assignment module of EventMine-MK performs with macro-averaged F-scores in the range of 57-87% on the BioNLP'09 Shared Task corpus. EventMine-MK has been evaluated on the BioNLP'09 Shared Task subtask of detecting negated and speculated events. Our results show that EventMine-MK can outperform other state-of-the-art systems that participated in this task. Conclusions: We have constructed the first practical system that extracts both events and associated, detailed meta-knowledge information from biomedical literature. The automatically assigned meta-knowledge information can be used to refine search systems, in order to provide an extra search layer beyond entities and assertions, dealing with phenomena such as rhetorical intent, speculations, contradictions and negations. This finer grained search functionality can assist in several important tasks, e.g., database curation (by locating new experimental knowledge) and pathway enrichment (by providing information for inference). To allow easy integration into text mining systems, EventMine-MK is provided as a UIMA component that can be used in the interoperable text mining infrastructure, U-Compare. PMID:22621266
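
    A macro-averaged F-score, as reported above, is the unweighted mean of the per-class F-scores, so rare classes count as much as frequent ones. The sketch below illustrates the arithmetic; the class names and the (true positive, false positive, false negative) counts are hypothetical and are not results from the paper.

```python
def f1(tp, fp, fn):
    """F1 from raw counts; defined as 0.0 when there is nothing to score."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def macro_f1(counts_per_class):
    """Unweighted mean of per-class F1 scores."""
    scores = [f1(*counts) for counts in counts_per_class.values()]
    return sum(scores) / len(scores)


# Hypothetical (tp, fp, fn) counts for two classes of one meta-knowledge dimension.
counts = {"class_a": (40, 10, 10), "class_b": (15, 5, 25)}
macro = macro_f1(counts)
print(f"macro-averaged F1 = {macro:.2f}")
```

    A micro-averaged score would instead pool the counts over all classes before computing a single F1, which lets frequent classes dominate.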

  12. Extracting semantically enriched events from biomedical literature.

    PubMed

    Miwa, Makoto; Thompson, Paul; McNaught, John; Kell, Douglas B; Ananiadou, Sophia

    2012-05-23

    Research into event-based text mining from the biomedical literature has been growing in popularity to facilitate the development of advanced biomedical text mining systems. Such technology permits advanced search, which goes beyond document or sentence-based retrieval. However, existing event-based systems typically ignore additional information within the textual context of events that can determine, amongst other things, whether an event represents a fact, hypothesis, experimental result or analysis of results, whether it describes new or previously reported knowledge, and whether it is speculated or negated. We refer to such contextual information as meta-knowledge. The automatic recognition of such information can permit the training of systems allowing finer-grained searching of events according to the meta-knowledge that is associated with them. Based on a corpus of 1,000 MEDLINE abstracts, fully manually annotated with both events and associated meta-knowledge, we have constructed a machine learning-based system that automatically assigns meta-knowledge information to events. This system has been integrated into EventMine, a state-of-the-art event extraction system, in order to create a more advanced system (EventMine-MK) that not only extracts events from text automatically, but also assigns five different types of meta-knowledge to these events. The meta-knowledge assignment module of EventMine-MK performs with macro-averaged F-scores in the range of 57-87% on the BioNLP'09 Shared Task corpus. EventMine-MK has been evaluated on the BioNLP'09 Shared Task subtask of detecting negated and speculated events. Our results show that EventMine-MK can outperform other state-of-the-art systems that participated in this task. We have constructed the first practical system that extracts both events and associated, detailed meta-knowledge information from biomedical literature. The automatically assigned meta-knowledge information can be used to refine search systems, in order to provide an extra search layer beyond entities and assertions, dealing with phenomena such as rhetorical intent, speculations, contradictions and negations. This finer grained search functionality can assist in several important tasks, e.g., database curation (by locating new experimental knowledge) and pathway enrichment (by providing information for inference). To allow easy integration into text mining systems, EventMine-MK is provided as a UIMA component that can be used in the interoperable text mining infrastructure, U-Compare.

  13. Human listening studies reveal insights into object features extracted by echolocating dolphins

    NASA Astrophysics Data System (ADS)

    Delong, Caroline M.; Au, Whitlow W. L.; Roitblat, Herbert L.

    2004-05-01

    Echolocating dolphins extract object feature information from the acoustic parameters of object echoes. However, little is known about which object features are salient to dolphins or how they extract those features. To gain insight into how dolphins might be extracting feature information, human listeners were presented with echoes from objects used in a dolphin echoic-visual cross-modal matching task. Human participants performed a task similar to the one the dolphin had performed; however, echoic samples consisting of 23-echo trains were presented via headphones. The participants listened to the echoic sample and then visually selected the correct object from among three alternatives. The participants performed as well as or better than the dolphin (M=88.0% correct), and reported using a combination of acoustic cues to extract object features (e.g., loudness, pitch, timbre). Participants frequently reported using the pattern of aural changes in the echoes across the echo train to identify the shape and structure of the objects (e.g., peaks in loudness or pitch). It is likely that dolphins also attend to the pattern of changes across echoes as objects are echolocated from different angles.

  14. Perceptual learning modules in mathematics: enhancing students' pattern recognition, structure extraction, and fluency.

    PubMed

    Kellman, Philip J; Massey, Christine M; Son, Ji Y

    2010-04-01

    Learning in educational settings emphasizes declarative and procedural knowledge. Studies of expertise, however, point to other crucial components of learning, especially improvements produced by experience in the extraction of information: perceptual learning (PL). We suggest that such improvements characterize both simple sensory and complex cognitive, even symbolic, tasks through common processes of discovery and selection. We apply these ideas in the form of perceptual learning modules (PLMs) to mathematics learning. We tested three PLMs, each emphasizing different aspects of complex task performance, in middle and high school mathematics. In the MultiRep PLM, practice in matching function information across multiple representations improved students' abilities to generate correct graphs and equations from word problems. In the Algebraic Transformations PLM, practice in seeing equation structure across transformations (but not solving equations) led to dramatic improvements in the speed of equation solving. In the Linear Measurement PLM, interactive trials involving extraction of information about units and lengths produced successful transfer to novel measurement problems and fraction problem solving. Taken together, these results suggest (a) that PL techniques have the potential to address crucial, neglected dimensions of learning, including discovery and fluent processing of relations; (b) PL effects apply even to complex tasks that involve symbolic processing; and (c) appropriately designed PL technology can produce rapid and enduring advances in learning. Copyright © 2009 Cognitive Science Society, Inc.

  15. Data mining for personal navigation

    NASA Astrophysics Data System (ADS)

    Hariharan, Gurushyam; Franti, Pasi; Mehta, Sandeep

    2002-03-01

    Relevance is the key in defining what data is to be extracted from the Internet. Traditionally, relevance has been defined mainly by keywords and user profiles. In this paper we discuss a fairly untouched dimension to relevance: location. Any navigational information sought by a user at large on earth is evidently governed by his location. We believe that task oriented data mining of the web amalgamated with location information is the key to providing relevant information for personal navigation. We explore the existential hurdles and propose novel approaches to tackle them. We also present naive, task-oriented data mining based approaches and their implementations in Java, to extract location based information. Ad-hoc pairing of data with coordinates (x, y) is very rare on the web. But if the same co-ordinates are converted to a logical address (state/city/street), a wide spectrum of location-based information base opens up. Hence, given the coordinates (x, y) on the earth, the scheme points to the logical address of the user. Location based information could either be picked up from fixed and known service providers (e.g. Yellow Pages) or from any arbitrary website on the Web. Once the web servers providing information relevant to the logical address are located, task oriented data mining is performed over these sites keeping in mind what information is interesting to the contemporary user. After all this, a simple data stream is provided to the user with information scaled to his convenience. The scheme has been implemented for cities of Finland.

  16. Digital mammographic tumor classification using transfer learning from deep convolutional neural networks.

    PubMed

    Huynh, Benjamin Q; Li, Hui; Giger, Maryellen L

    2016-07-01

    Convolutional neural networks (CNNs) show potential for computer-aided diagnosis (CADx) by learning features directly from the image data instead of using analytically extracted features. However, CNNs are difficult to train from scratch for medical images due to small sample sizes and variations in tumor presentations. Instead, transfer learning can be used to extract tumor information from medical images via CNNs originally pretrained for nonmedical tasks, alleviating the need for large datasets. Our database includes 219 breast lesions (607 full-field digital mammographic images). We compared support vector machine classifiers based on the CNN-extracted image features and our prior computer-extracted tumor features in the task of distinguishing between benign and malignant breast lesions. Five-fold cross validation (by lesion) was conducted with the area under the receiver operating characteristic (ROC) curve as the performance metric. Results show that classifiers based on CNN-extracted features (with transfer learning) perform comparably to those using analytically extracted features [area under the ROC curve [Formula: see text
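
    The performance metric named above, the area under the ROC curve, equals the probability that a randomly chosen malignant case receives a higher classifier score than a randomly chosen benign case, with ties counted as one half. The sketch below computes it directly from that pairwise definition; the labels and scores are invented for illustration and do not come from the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for the positive class (e.g. malignant), 0 for the negative class.
    Ties between a positive and a negative score count as half a correct pair.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    correct = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return correct / (len(positives) * len(negatives))


# Hypothetical classifier outputs for six lesions.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

    In a by-lesion cross-validation, all images of one lesion would be kept in the same fold before scores such as these are pooled, so that no lesion contributes to both training and testing.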

  17. MedEx/J: A One-Scan Simple and Fast NLP Tool for Japanese Clinical Texts.

    PubMed

    Aramaki, Eiji; Yano, Ken; Wakamiya, Shoko

    2017-01-01

    Because of recent replacement of physical documents with electronic medical records (EMR), the importance of information processing in the medical field has increased. In light of this trend, we have been developing MedEx/J, which retrieves important Japanese language information from medical reports. MedEx/J executes two tasks simultaneously: (1) term extraction, and (2) positive and negative event classification. We designate this approach as a one-scan approach, providing simplicity of systems and reasonable accuracy. MedEx/J performance on the two tasks is described herein: (1) term extraction (Fβ=1 = 0.87) and (2) positive-negative classification (Fβ=1 = 0.63). This paper also presents discussion and explains remaining issues in the medical natural language processing field.

  18. Bayesian decoding using unsorted spikes in the rat hippocampus

    PubMed Central

    Layton, Stuart P.; Chen, Zhe; Wilson, Matthew A.

    2013-01-01

    A fundamental task in neuroscience is to understand how neural ensembles represent information. Population decoding is a useful tool to extract information from neuronal populations based on the ensemble spiking activity. We propose a novel Bayesian decoding paradigm to decode unsorted spikes in the rat hippocampus. Our approach uses a direct mapping between spike waveform features and covariates of interest and avoids accumulation of spike sorting errors. Our decoding paradigm is nonparametric, encoding model-free for representing stimuli, and extracts information from all available spikes and their waveform features. We apply the proposed Bayesian decoding algorithm to a position reconstruction task for freely behaving rats based on tetrode recordings of rat hippocampal neuronal activity. Our detailed decoding analyses demonstrate that our approach is efficient and better utilizes the available information in the nonsortable hash than the standard sorting-based decoding algorithm. Our approach can be adapted to an online encoding/decoding framework for applications that require real-time decoding, such as brain-machine interfaces. PMID:24089403
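
    For contrast with the unsorted-spike method described above, the sorting-based baseline is commonly formulated as a Bayesian decoder with Poisson spiking statistics for each sorted unit. The sketch below illustrates that commonly used baseline only and is not the authors' algorithm; the tuning curves, spike counts, and window length are made-up values.

```python
import math


def decode_position(spike_counts, tuning_curves, dt, prior=None):
    """Posterior over position bins from sorted-unit spike counts.

    tuning_curves[u][x] is the firing rate (Hz) of unit u in position bin x,
    spike_counts[u] is the number of spikes of unit u in a window of dt seconds.
    Units are assumed to be independent Poisson spikers.
    """
    n_bins = len(tuning_curves[0])
    prior = prior or [1.0 / n_bins] * n_bins
    log_posterior = []
    for x in range(n_bins):
        log_p = math.log(prior[x])
        for count, curve in zip(spike_counts, tuning_curves):
            expected = max(curve[x] * dt, 1e-12)  # avoid log(0) for silent bins
            # Poisson log-likelihood without the log(count!) term, which is
            # the same for every position bin and cancels on normalization.
            log_p += count * math.log(expected) - expected
        log_posterior.append(log_p)
    peak = max(log_posterior)
    weights = [math.exp(lp - peak) for lp in log_posterior]  # stable exponentiation
    total = sum(weights)
    return [w / total for w in weights]


# Two hypothetical place cells with fields at opposite ends of a 3-bin track.
tuning = [[20.0, 5.0, 1.0], [1.0, 5.0, 20.0]]
posterior = decode_position([0, 4], tuning, dt=0.25)
best_bin = max(range(len(posterior)), key=posterior.__getitem__)
print(best_bin, [round(p, 3) for p in posterior])
```

    The approach in the abstract avoids the unit-assignment step that this baseline depends on by relating spike waveform features directly to position.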

  19. Integrating semantic information into multiple kernels for protein-protein interaction extraction from biomedical literatures.

    PubMed

    Li, Lishuang; Zhang, Panpan; Zheng, Tianfu; Zhang, Hongying; Jiang, Zhenchao; Huang, Degen

    2014-01-01

    Protein-Protein Interaction (PPI) extraction is an important task in biomedical information extraction. Many machine learning methods for PPI extraction have achieved promising results, but their performance is still not satisfactory. One reason is that semantic resources have largely been ignored. In this paper, we propose a multiple-kernel learning-based approach to extract PPIs, combining a feature-based kernel, a tree kernel and a semantic kernel. In particular, we extend the shortest path-enclosed tree kernel (SPT) with a dynamic extension strategy to retrieve richer syntactic information. Our semantic kernel calculates the protein-protein pair similarity and the context similarity based on two semantic resources: WordNet and Medical Subject Headings (MeSH). We evaluate our method with a Support Vector Machine (SVM) and achieve an F-score of 69.40% and an AUC of 92.00%, which shows that, by integrating semantic information, our method outperforms most state-of-the-art systems.
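
    One simple way to combine several kernels is a non-negative weighted sum of their Gram matrices, which is itself a valid kernel and can be passed to an SVM that accepts precomputed kernels. The sketch below shows only this combination step; the weights and the 2-by-2 Gram matrices are hypothetical, and the abstract does not state how the paper weights its three kernels.

```python
def combine_kernels(gram_matrices, weights):
    """Weighted sum of equally sized Gram matrices (lists of lists of floats)."""
    if len(gram_matrices) != len(weights):
        raise ValueError("one weight per kernel is required")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative to keep a valid kernel")
    n = len(gram_matrices[0])
    combined = [[0.0] * n for _ in range(n)]
    for gram, weight in zip(gram_matrices, weights):
        for i in range(n):
            for j in range(n):
                combined[i][j] += weight * gram[i][j]
    return combined


# Hypothetical kernel values for two candidate protein pairs.
feature_kernel = [[1.0, 0.2], [0.2, 1.0]]
tree_kernel = [[1.0, 0.6], [0.6, 1.0]]
semantic_kernel = [[1.0, 0.4], [0.4, 1.0]]
combined = combine_kernels(
    [feature_kernel, tree_kernel, semantic_kernel], weights=[0.5, 0.3, 0.2]
)
print(combined)
```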

  20. Overview of the INEX 2008 Book Track

    NASA Astrophysics Data System (ADS)

    Kazai, Gabriella; Doucet, Antoine; Landoni, Monica

    This paper provides an overview of the INEX 2008 Book Track. Now in its second year, the track aimed at broadening its scope by investigating topics of interest in the fields of information retrieval, human computer interaction, digital libraries, and eBooks. The main topics of investigation were defined around challenges for supporting users in reading, searching, and navigating the full texts of digitized books. Based on these themes, four tasks were defined: 1) The Book Retrieval task aimed at comparing traditional and book-specific retrieval approaches, 2) the Page in Context task aimed at evaluating the value of focused retrieval approaches for searching books, 3) the Structure Extraction task aimed to test automatic techniques for deriving structure from OCR and layout information, and 4) the Active Reading task aimed to explore suitable user interfaces for eBooks enabling reading, annotation, review, and summary across multiple books. We report on the setup and results of each of these tasks.

  1. Task-discriminative space-by-time factorization of muscle activity

    PubMed Central

    Delis, Ioannis; Panzeri, Stefano; Pozzo, Thierry; Berret, Bastien

    2015-01-01

    Movement generation has been hypothesized to rely on a modular organization of muscle activity. Crucial to this hypothesis is the ability to perform reliably a variety of motor tasks by recruiting a limited set of modules and combining them in a task-dependent manner. Thus far, existing algorithms that extract putative modules of muscle activations, such as Non-negative Matrix Factorization (NMF), identify modular decompositions that maximize the reconstruction of the recorded EMG data. Typically, the functional role of the decompositions, i.e., task accomplishment, is only assessed a posteriori. However, as motor actions are defined in task space, we suggest that motor modules should be computed in task space too. In this study, we propose a new module extraction algorithm, named DsNM3F, that uses task information during the module identification process. DsNM3F extends our previous space-by-time decomposition method (the so-called sNM3F algorithm, which could assess task performance only after having computed modules) to identify modules gauging between two complementary objectives: reconstruction of the original data and reliable discrimination of the performed tasks. We show that DsNM3F recovers the task dependence of module activations more accurately than sNM3F. We also apply it to electromyographic signals recorded during performance of a variety of arm pointing tasks and identify spatial and temporal modules of muscle activity that are highly consistent with previous studies. DsNM3F achieves perfect task categorization without significant loss in data approximation when task information is available and generalizes as well as sNM3F when applied to new data. These findings suggest that the space-by-time decomposition of muscle activity finds robust task-discriminating modular representations of muscle activity and that the insertion of task discrimination objectives is useful for describing the task modulation of module recruitment. PMID:26217213
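
    As background for the module-extraction algorithms discussed above, plain Non-negative Matrix Factorization approximates a non-negative data matrix V (for example, samples by muscles) as a product W H of two non-negative factors. The sketch below implements the classic multiplicative update rules for the squared reconstruction error in pure Python; it is ordinary NMF, not sNM3F or DsNM3F, and the small matrix stands in for EMG data purely for illustration.

```python
import random


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Squared reconstruction error ||V - W H||^2."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rwh in zip(v, wh) for x, y in zip(rv, rwh))


def nmf(v, rank, iters=200, seed=0, eps=1e-9):
    """Multiplicative-update NMF; returns non-negative factors W and H."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[x * a / (b + eps) for x, a, b in zip(rh, ra, rb)]
             for rh, ra, rb in zip(h, num, den)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[x * a / (b + eps) for x, a, b in zip(rw, ra, rb)]
             for rw, ra, rb in zip(w, num, den)]
    return w, h


# Hypothetical 4 samples x 3 muscles matrix built from two underlying patterns.
V = [[1.0, 0.0, 2.0], [2.0, 0.0, 4.0], [0.0, 3.0, 1.0], [0.0, 6.0, 2.0]]
W, H = nmf(V, rank=2)
print("reconstruction error:", frobenius_error(V, W, H))
```

    Here the rows of H play the role of spatial modules and the columns of W their activations; the algorithms in the abstract add temporal modules and, for DsNM3F, a task-discrimination objective on top of reconstruction.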

  2. Task-discriminative space-by-time factorization of muscle activity.

    PubMed

    Delis, Ioannis; Panzeri, Stefano; Pozzo, Thierry; Berret, Bastien

    2015-01-01

    Movement generation has been hypothesized to rely on a modular organization of muscle activity. Crucial to this hypothesis is the ability to perform reliably a variety of motor tasks by recruiting a limited set of modules and combining them in a task-dependent manner. Thus far, existing algorithms that extract putative modules of muscle activations, such as Non-negative Matrix Factorization (NMF), identify modular decompositions that maximize the reconstruction of the recorded EMG data. Typically, the functional role of the decompositions, i.e., task accomplishment, is only assessed a posteriori. However, as motor actions are defined in task space, we suggest that motor modules should be computed in task space too. In this study, we propose a new module extraction algorithm, named DsNM3F, that uses task information during the module identification process. DsNM3F extends our previous space-by-time decomposition method (the so-called sNM3F algorithm, which could assess task performance only after having computed modules) to identify modules gauging between two complementary objectives: reconstruction of the original data and reliable discrimination of the performed tasks. We show that DsNM3F recovers the task dependence of module activations more accurately than sNM3F. We also apply it to electromyographic signals recorded during performance of a variety of arm pointing tasks and identify spatial and temporal modules of muscle activity that are highly consistent with previous studies. DsNM3F achieves perfect task categorization without significant loss in data approximation when task information is available and generalizes as well as sNM3F when applied to new data. These findings suggest that the space-by-time decomposition of muscle activity finds robust task-discriminating modular representations of muscle activity and that the insertion of task discrimination objectives is useful for describing the task modulation of module recruitment.

  3. Mobile-Cloud Assisted Video Summarization Framework for Efficient Management of Remote Sensing Data Generated by Wireless Capsule Sensors

    PubMed Central

    Mehmood, Irfan; Sajjad, Muhammad; Baik, Sung Wook

    2014-01-01

    Wireless capsule endoscopy (WCE) has great advantages over traditional endoscopy because it is portable and easy to use, especially in remote monitoring health-services. However, during the WCE process, the large amount of captured video data demands a significant deal of computation to analyze and retrieve informative video frames. In order to facilitate efficient WCE data collection and browsing task, we present a resource- and bandwidth-aware WCE video summarization framework that extracts the representative keyframes of the WCE video contents by removing redundant and non-informative frames. For redundancy elimination, we use Jeffrey-divergence between color histograms and inter-frame Boolean series-based correlation of color channels. To remove non-informative frames, multi-fractal texture features are extracted to assist the classification using an ensemble-based classifier. Owing to the limited WCE resources, it is impossible for the WCE system to perform computationally intensive video summarization tasks. To resolve computational challenges, mobile-cloud architecture is incorporated, which provides resizable computing capacities by adaptively offloading video summarization tasks between the client and the cloud server. The qualitative and quantitative results are encouraging and show that the proposed framework saves information transmission cost and bandwidth, as well as the valuable time of data analysts in browsing remote sensing data. PMID:25225874
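
    As a rough illustration of the redundancy test mentioned above, the sketch below computes a Jeffrey divergence between two color histograms. It assumes the smoothed, symmetric form common in image retrieval, in which each histogram is compared with the bin-wise mean of the two; the abstract does not say which variant the authors use, and the function names and threshold value are hypothetical.

```python
import math


def normalize(hist):
    """Scale a histogram of non-negative counts so that it sums to 1."""
    total = sum(hist)
    if total == 0:
        raise ValueError("histogram is empty")
    return [v / total for v in hist]


def jeffrey_divergence(h, k):
    """Symmetric divergence between two histograms with the same bins.

    Assumed form: sum_i h_i*log(h_i/m_i) + k_i*log(k_i/m_i),
    where m_i = (h_i + k_i) / 2. Empty bins contribute nothing.
    """
    if len(h) != len(k):
        raise ValueError("histograms must have the same number of bins")
    h, k = normalize(h), normalize(k)
    d = 0.0
    for hi, ki in zip(h, k):
        m = (hi + ki) / 2.0
        if hi > 0:
            d += hi * math.log(hi / m)
        if ki > 0:
            d += ki * math.log(ki / m)
    return d


def is_redundant(prev_hist, cur_hist, threshold=0.05):
    """Treat a frame as redundant when it barely differs from the previous one.

    The threshold is an arbitrary illustrative value, not one from the paper.
    """
    return jeffrey_divergence(prev_hist, cur_hist) < threshold
```

    Identical histograms give a divergence of zero, and histograms with no shared bins give the maximum, 2 ln 2, so a small threshold keeps only frames whose color content has changed noticeably.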

  4. Mobile-cloud assisted video summarization framework for efficient management of remote sensing data generated by wireless capsule sensors.

    PubMed

    Mehmood, Irfan; Sajjad, Muhammad; Baik, Sung Wook

    2014-09-15

    Wireless capsule endoscopy (WCE) has great advantages over traditional endoscopy because it is portable and easy to use, especially in remote monitoring health-services. However, during the WCE process, the large amount of captured video data demands a significant deal of computation to analyze and retrieve informative video frames. In order to facilitate efficient WCE data collection and browsing task, we present a resource- and bandwidth-aware WCE video summarization framework that extracts the representative keyframes of the WCE video contents by removing redundant and non-informative frames. For redundancy elimination, we use Jeffrey-divergence between color histograms and inter-frame Boolean series-based correlation of color channels. To remove non-informative frames, multi-fractal texture features are extracted to assist the classification using an ensemble-based classifier. Owing to the limited WCE resources, it is impossible for the WCE system to perform computationally intensive video summarization tasks. To resolve computational challenges, mobile-cloud architecture is incorporated, which provides resizable computing capacities by adaptively offloading video summarization tasks between the client and the cloud server. The qualitative and quantitative results are encouraging and show that the proposed framework saves information transmission cost and bandwidth, as well as the valuable time of data analysts in browsing remote sensing data.

  5. Proprioceptive coordination of movement sequences: role of velocity and position information.

    PubMed

    Cordo, P; Carlton, L; Bevan, L; Carlton, M; Kerr, G K

    1994-05-01

    1. Recent studies have shown that the CNS uses proprioceptive information to coordinate multijoint movement sequences; proprioceptive input related to the kinematics of one joint rotation in a movement sequence can be used to trigger a subsequent joint rotation. In this paper we adopt a broad definition of "proprioception," which includes all somatosensory information related to joint posture and kinematics. This paper addresses how the CNS uses proprioceptive information related to the velocity and position of joints to coordinate multijoint movement sequences. 2. Normal human subjects sat at an experimental apparatus and performed a movement sequence with the right arm without visual feedback. The apparatus passively rotated the right elbow horizontally in the extension direction with either a constant velocity trajectory or an unpredictable velocity trajectory. The subjects' task was to open briskly the right hand when the elbow passed through a prescribed target position, similar to backhand throwing in the horizontal plane. The randomization of elbow velocities and the absence of visual information were used to discourage subjects from using any information other than proprioceptive input to perform the task. 3. Our results indicate that the CNS is able to extract the necessary kinematic information from proprioceptive input to trigger the hand opening at the correct elbow position. We estimated the minimal sensory conduction and processing delay to be 150 ms, and on the basis of this estimate, we predicted the expected performance with different degrees of reduced proprioceptive information. These predictions were compared with the subjects' actual performances, revealing that the CNS was using proprioceptive input related to joint velocity in this motor task. To determine whether position information was also being used, we examined the subjects' performances with unpredictable velocity trajectories. The results from experiments with unpredictable velocity trajectories indicate that the CNS extracts proprioceptive information related to both the velocity and the angular position of the joint to trigger the hand movement in this movement sequence. 4. To determine the generality of proprioceptive triggering in movement sequences, we estimated the minimal movement duration with which proprioceptive information can be used as well as the amount of learning required to use proprioceptive input to perform the task. The temporal limits for proprioceptive processing in this movement task were established by determining the minimal movement time during which the task could be performed. (ABSTRACT TRUNCATED AT 400 WORDS)

  6. Dynamic Decision-Making in Multi-Task Environments: Theory and Experimental Results.

    DTIC Science & Technology

    1981-03-15

    The operator’s primary responsibility in this new role is to extract information from his environment, and to integrate it for action selection and its...of the human operator from one of a controller to one of a supervisory decision-maker. The operator’s primary responsibility in this new role is to...troller to that of a monitor of multiple tasks, or a supervisor of several semi-automated subsystems. The operator’s primary task in these

  7. A generalizable NLP framework for fast development of pattern-based biomedical relation extraction systems.

    PubMed

    Peng, Yifan; Torii, Manabu; Wu, Cathy H; Vijay-Shanker, K

    2014-08-23

    Text mining is increasingly used in the biomedical domain because of its ability to automatically gather information from large numbers of scientific articles. One important task in biomedical text mining is relation extraction, which aims to identify designated relations among biological entities reported in literature. A relation extraction system achieving high performance is expensive to develop because of the substantial time and effort required for its design and implementation. Here, we report a novel framework to facilitate the development of a pattern-based biomedical relation extraction system. It has several unique design features: (1) leveraging syntactic variations possible in a language and automatically generating extraction patterns in a systematic manner, (2) applying sentence simplification to improve the coverage of extraction patterns, and (3) identifying referential relations between a syntactic argument of a predicate and the actual target expected in the relation extraction task. A relation extraction system derived using the proposed framework achieved overall F-scores of 72.66% for the Simple events and 55.57% for the Binding events on the BioNLP-ST 2011 GE test set, comparing favorably with the top performing systems that participated in the BioNLP-ST 2011 GE task. We obtained similar results on the BioNLP-ST 2013 GE test set (80.07% and 60.58%, respectively). We conducted additional experiments on the training and development sets to provide a more detailed analysis of the system and its individual modules. This analysis indicates that without increasing the number of patterns, simplification and referential relation linking play a key role in the effective extraction of biomedical relations. In this paper, we present a novel framework for fast development of relation extraction systems. The framework requires only a list of triggers as input, and does not need information from an annotated corpus. Thus, we reduce the involvement of domain experts, who would otherwise have to provide manual annotations and help with the design of hand-crafted patterns. We demonstrate how our framework is used to develop a system which achieves state-of-the-art performance on a public benchmark corpus.

  8. Development of an information retrieval tool for biomedical patents.

    PubMed

    Alves, Tiago; Rodrigues, Rúben; Costa, Hugo; Rocha, Miguel

    2018-06-01

    The volume of biomedical literature has been increasing in the last years. Patent documents have also followed this trend, being important sources of biomedical knowledge, technical details and curated data, which are put together along the granting process. The field of Biomedical text mining (BioTM) has been creating solutions for the problems posed by the unstructured nature of natural language, which makes the search of information a challenging task. Several BioTM techniques can be applied to patents. From those, Information Retrieval (IR) includes processes where relevant data are obtained from collections of documents. In this work, the main goal was to build a patent pipeline addressing IR tasks over patent repositories to make these documents amenable to BioTM tasks. The pipeline was developed within @Note2, an open-source computational framework for BioTM, adding a number of modules to the core libraries, including patent metadata and full text retrieval, PDF to text conversion and optical character recognition. Also, user interfaces were developed for the main operations materialized in a new @Note2 plug-in. The integration of these tools in @Note2 opens opportunities to run BioTM tools over patent texts, including tasks from Information Extraction, such as Named Entity Recognition or Relation Extraction. We demonstrated the pipeline's main functions with a case study, using an available benchmark dataset from BioCreative challenges. Also, we show the use of the plug-in with a user query related to the production of vanillin. This work makes available all the relevant content from patents to the scientific community, decreasing drastically the time required for this task, and provides graphical interfaces to ease the use of these tools. Copyright © 2018 Elsevier B.V. All rights reserved.

  9. Comparison of continuously acquired resting state and extracted analogues from active tasks.

    PubMed

    Ganger, Sebastian; Hahn, Andreas; Küblböck, Martin; Kranz, Georg S; Spies, Marie; Vanicek, Thomas; Seiger, René; Sladky, Ronald; Windischberger, Christian; Kasper, Siegfried; Lanzenberger, Rupert

    2015-10-01

    Functional connectivity analysis of brain networks has become an important tool for investigation of human brain function. Although functional connectivity computations are usually based on resting-state data, the application to task-specific fMRI has received growing attention. Three major methods for extraction of resting-state data from task-related signal have been proposed: (1) usage of unmanipulated task data for functional connectivity; (2) regression against task effects, subsequently using the residuals; and (3) concatenation of baseline blocks located in-between task blocks. Despite widespread application in current research, consensus on which method best resembles resting-state seems to be missing. We, therefore, evaluated these techniques in a sample of 26 healthy controls measured at 7 Tesla. In addition to continuous resting-state, two different task paradigms were assessed (emotion discrimination and right finger-tapping) and five well-described networks were analyzed (default mode, thalamus, cuneus, sensorimotor, and auditory). Investigating the similarity to continuous resting-state (Dice, Intraclass correlation coefficient (ICC), R²) showed that regression against task effects yields functional connectivity networks most alike to resting-state. However, all methods exhibited significant differences when compared to continuous resting-state and similarity metrics were lower than test-retest of two resting-state scans. Omitting global signal regression did not change these findings. Visually, the networks are highly similar, but through further investigation marked differences can be found. Therefore, our data does not support referring to resting-state when extracting signals from task designs, although functional connectivity computed from task-specific data may indeed yield interesting information. © 2015 The Authors Human Brain Mapping Published by Wiley Periodicals, Inc.
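
    Of the similarity measures listed above, the Dice coefficient is the simplest to state in code. The sketch below assumes each network has already been thresholded into a flat binary mask over the same voxels; the thresholding step, the function name, and the convention for two empty masks are assumptions for the example, not details from the study.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice overlap 2|A and B| / (|A| + |B|) of two equally long binary masks."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must cover the same voxels")
    a = [bool(v) for v in mask_a]
    b = [bool(v) for v in mask_b]
    overlap = sum(1 for x, y in zip(a, b) if x and y)
    size = sum(a) + sum(b)
    if size == 0:
        # Two empty masks: treated here as identical (a convention, not a rule).
        return 1.0
    return 2.0 * overlap / size
```

    A value of 1 means the two thresholded maps coincide and 0 means they share no voxel, which is why the measure suits a comparison between a task-derived network and its resting-state counterpart.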

  10. Utilization of ontology look-up services in information retrieval for biomedical literature.

    PubMed

    Vishnyakova, Dina; Pasche, Emilie; Lovis, Christian; Ruch, Patrick

    2013-01-01

    With the vast amount of biomedical data, we face the necessity to improve information retrieval processes in the biomedical domain. The use of biomedical ontologies has facilitated the combination of various data sources (e.g. scientific literature, clinical data repositories) by increasing the quality of information retrieval and reducing maintenance efforts. In this context, we developed Ontology Look-up services (OLS), based on the NEWT and MeSH vocabularies. Our services were used in information retrieval tasks such as gene/disease normalization. The implementation of OLS services significantly accelerated the extraction of particular biomedical facts by structuring and enriching the data context. Precision in normalization tasks improved by about 20%.

  11. Prediction of isometric motor tasks and effort levels based on high-density EMG in patients with incomplete spinal cord injury

    NASA Astrophysics Data System (ADS)

    Jordanić, Mislav; Rojas-Martínez, Mónica; Mañanas, Miguel Angel; Francesc Alonso, Joan

    2016-08-01

    Objective. The development of modern assistive and rehabilitation devices requires reliable and easy-to-use methods to extract neural information for control of devices. Group-specific pattern recognition identifiers are influenced by inter-subject variability. Based on high-density EMG (HD-EMG) maps, our research group has already shown that inter-subject muscle activation patterns exist in a population of healthy subjects. The aim of this paper is to analyze muscle activation patterns associated with four tasks (flexion/extension of the elbow, and supination/pronation of the forearm) at three different effort levels in a group of patients with incomplete Spinal Cord Injury (iSCI). Approach. Muscle activation patterns were evaluated by the automatic identification of these four isometric tasks along with the identification of levels of voluntary contractions. Two types of classifiers were considered in the identification: linear discriminant analysis and support vector machine. Main results. Results show that performance of classification increases when combining features extracted from intensity and spatial information of HD-EMG maps (accuracy = 97.5%). Moreover, when compared to a population with injuries at different levels, a lower variability between activation maps was obtained within a group of patients with similar injury suggesting stronger task-specific and effort-level-specific co-activation patterns, which enable better prediction results. Significance. Despite the challenge of identifying both the four tasks and the three effort levels in patients with iSCI, promising results were obtained which support the use of HD-EMG features for providing useful information regarding motion and force intention.

  12. The use of digital spaceborne SAR data for the delineation of surface features indicative of malaria vector breeding habitats

    NASA Technical Reports Server (NTRS)

    Imhoff, M. L.; Vermillion, C. H.; Khan, F. A.

    1984-01-01

    An investigation to examine the utility of spaceborne radar image data to malaria vector control programs is described. Specific tasks involve an analysis of radar illumination geometry vs information content, the synergy of radar and multispectral data mergers, and automated information extraction techniques.

  13. Modelling and representation issues in automated feature extraction from aerial and satellite images

    NASA Astrophysics Data System (ADS)

    Sowmya, Arcot; Trinder, John

    New digital systems for the processing of photogrammetric and remote sensing images have led to new approaches to information extraction for mapping and Geographic Information System (GIS) applications, with the expectation that data can become more readily available at a lower cost and with greater currency. Demands for mapping and GIS data are increasing as well for environmental assessment and monitoring. Hence, researchers from the fields of photogrammetry and remote sensing, as well as computer vision and artificial intelligence, are bringing together their particular skills for automating these tasks of information extraction. The paper will review some of the approaches used in knowledge representation and modelling for machine vision, and give examples of their applications in research for image understanding of aerial and satellite imagery.

  14. Application of the EVEX resource to event extraction and network construction: Shared Task entry and result analysis

    PubMed Central

    2015-01-01

    Background. Modern methods for mining biomolecular interactions from literature typically make predictions based solely on the immediate textual context, in effect a single sentence. No prior work has been published on extending this context to the information automatically gathered from the whole biomedical literature. Thus, our motivation for this study is to explore whether mutually supporting evidence, aggregated across several documents can be utilized to improve the performance of the state-of-the-art event extraction systems. In this paper, we describe our participation in the latest BioNLP Shared Task using the large-scale text mining resource EVEX. We participated in the Genia Event Extraction (GE) and Gene Regulation Network (GRN) tasks with two separate systems. In the GE task, we implemented a re-ranking approach to improve the precision of an existing event extraction system, incorporating features from the EVEX resource. In the GRN task, our system relied solely on the EVEX resource and utilized a rule-based conversion algorithm between the EVEX and GRN formats. Results. In the GE task, our re-ranking approach led to a modest performance increase and resulted in the first rank of the official Shared Task results with 50.97% F-score. Additionally, in this paper we explore and evaluate the usage of distributed vector representations for this challenge. In the GRN task, we ranked fifth in the official results with a strict/relaxed SER score of 0.92/0.81 respectively. To try and improve upon these results, we have implemented a novel machine learning based conversion system and benchmarked its performance against the original rule-based system. Conclusions. For the GRN task, we were able to produce a gene regulatory network from the EVEX data, warranting the use of such generic large-scale text mining data in network biology settings. A detailed performance and error analysis provides more insight into the relatively low recall rates. In the GE task we demonstrate that both the re-ranking approach and the word vectors can provide slight performance improvement. A manual evaluation of the re-ranking results pinpoints some of the challenges faced in applying large-scale text mining knowledge to event extraction. PMID:26551766

  15. Automating the generation of lexical patterns for processing free text in clinical documents.

    PubMed

    Meng, Frank; Morioka, Craig

    2015-09-01

    Many tasks in natural language processing utilize lexical pattern-matching techniques, including information extraction (IE), negation identification, and syntactic parsing. However, it is generally difficult to derive patterns that achieve acceptable levels of recall while also remaining highly precise. We present a multiple sequence alignment (MSA)-based technique that automatically generates patterns, thereby leveraging language usage to determine the context of words that influence a given target. MSAs capture the commonalities among word sequences and are able to reveal areas of linguistic stability and variation. In this way, MSAs provide a systemic approach to generating lexical patterns that are generalizable, which will both increase recall levels and maintain high levels of precision. The MSA-generated patterns exhibited consistent F1-, F0.5-, and F2-scores compared to two baseline techniques for IE across four different tasks. Both baseline techniques performed well for some tasks and less well for others, but MSA was found to consistently perform at a high level for all four tasks. The performance of MSA on the four extraction tasks indicates the method's versatility. The results show that the MSA-based patterns are able to handle the extraction of individual data elements as well as relations between two concepts without the need for large amounts of manual intervention. We presented an MSA-based framework for generating lexical patterns that showed consistently high levels of both performance and recall over four different extraction tasks when compared to baseline methods. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved.
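
    The three scores reported above are instances of the general F-beta measure, which weights recall beta times as heavily as precision. A minimal sketch of the standard formula follows; the function names and the zero-denominator conventions are choices made for the example.

```python
def precision_recall(tp, fp, fn):
    """Precision and recall from true-positive, false-positive, false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def f_beta(precision, recall, beta=1.0):
    """F-beta = (1 + beta^2) * P * R / (beta^2 * P + R).

    beta = 1 weights precision and recall equally, beta = 0.5 favors
    precision, and beta = 2 favors recall.
    """
    denom = beta ** 2 * precision + recall
    if denom == 0:
        return 0.0
    return (1 + beta ** 2) * precision * recall / denom
```

    Reporting all three values, as the abstract does, shows whether a method's quality depends on which of precision or recall the application cares about more.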

  16. System for selecting relevant information for decision support.

    PubMed

    Kalina, Jan; Seidl, Libor; Zvára, Karel; Grünfeldová, Hana; Slovák, Dalibor; Zvárová, Jana

    2013-01-01

    We implemented a prototype of a decision support system called SIR which has a form of a web-based classification service for diagnostic decision support. The system has the ability to select the most relevant variables and to learn a classification rule, which is guaranteed to be suitable also for high-dimensional measurements. The classification system can be useful for clinicians in primary care to support their decision-making tasks with relevant information extracted from any available clinical study. The implemented prototype was tested on a sample of patients in a cardiological study and performs an information extraction from a high-dimensional set containing both clinical and gene expression data.

  17. Semi-Supervised Recurrent Neural Network for Adverse Drug Reaction mention extraction.

    PubMed

    Gupta, Shashank; Pawar, Sachin; Ramrakhiyani, Nitin; Palshikar, Girish Keshav; Varma, Vasudeva

    2018-06-13

    Social media is a useful platform to share health-related information due to its vast reach. This makes it a good candidate for public-health monitoring tasks, specifically for pharmacovigilance. We study the problem of extraction of Adverse-Drug-Reaction (ADR) mentions from social media, particularly from Twitter. Medical information extraction from social media is challenging, mainly due to short and highly informal nature of text, as compared to more technical and formal medical reports. Current methods in ADR mention extraction rely on supervised learning methods, which suffer from labeled data scarcity problem. The state-of-the-art method uses deep neural networks, specifically a class of Recurrent Neural Network (RNN) which is Long-Short-Term-Memory network (LSTM). Deep neural networks, due to their large number of free parameters rely heavily on large annotated corpora for learning the end task. But in the real-world, it is hard to get large labeled data, mainly due to the heavy cost associated with the manual annotation. To this end, we propose a novel semi-supervised learning based RNN model, which can leverage unlabeled data also present in abundance on social media. Through experiments we demonstrate the effectiveness of our method, achieving state-of-the-art performance in ADR mention extraction. In this study, we tackle the problem of labeled data scarcity for Adverse Drug Reaction mention extraction from social media and propose a novel semi-supervised learning based method which can leverage large unlabeled corpus available in abundance on the web. Through empirical study, we demonstrate that our proposed method outperforms fully supervised learning based baseline which relies on large manually annotated corpus for a good performance.

  18. Text extraction method for historical Tibetan document images based on block projections

    NASA Astrophysics Data System (ADS)

    Duan, Li-juan; Zhang, Xi-qun; Ma, Long-long; Wu, Jian

    2017-11-01

    Text extraction is an important initial step in digitizing the historical documents. In this paper, we present a text extraction method for historical Tibetan document images based on block projections. The task of text extraction is considered as text area detection and location problem. The images are divided equally into blocks and the blocks are filtered by the information of the categories of connected components and corner point density. By analyzing the filtered blocks' projections, the approximate text areas can be located, and the text regions are extracted. Experiments on the dataset of historical Tibetan documents demonstrate the effectiveness of the proposed method.
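
    The pipeline outlined above (divide into blocks, filter blocks, analyze projections) can be sketched as follows. This is a simplified stand-in, not the authors' method: blocks are filtered here by ink-pixel count alone, whereas the paper filters by connected-component categories and corner point density. The image is assumed to be a binary grid (1 for ink, 0 for background), and all names and thresholds are hypothetical.

```python
def block_grid(image, block_size):
    """Count ink pixels in each block_size x block_size block of a binary image."""
    rows, cols = len(image), len(image[0])
    grid = []
    for top in range(0, rows, block_size):
        grid_row = []
        for left in range(0, cols, block_size):
            ink = sum(image[r][c]
                      for r in range(top, min(top + block_size, rows))
                      for c in range(left, min(left + block_size, cols)))
            grid_row.append(ink)
        grid.append(grid_row)
    return grid


def filter_blocks(grid, min_ink):
    """Keep (1) blocks with enough ink and drop (0) the rest.

    Stand-in for the paper's connected-component and corner-density filter.
    """
    return [[1 if ink >= min_ink else 0 for ink in row] for row in grid]


def row_projection(kept):
    """Project the kept blocks onto the vertical axis: kept blocks per block row."""
    return [sum(row) for row in kept]


def runs_above(profile, threshold):
    """Return (start, end) index pairs of consecutive entries >= threshold."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value >= threshold and start is None:
            start = i
        elif value < threshold and start is not None:
            runs.append((start, i - 1))
            start = None
    if start is not None:
        runs.append((start, len(profile) - 1))
    return runs


# Toy page: 6 x 4 pixels with one band of ink in the middle rows.
page = [[0] * 4, [0] * 4, [1] * 4, [1] * 4, [0] * 4, [0] * 4]
kept = filter_blocks(block_grid(page, block_size=2), min_ink=2)
text_bands = runs_above(row_projection(kept), threshold=1)
```

    Each run in `text_bands` is a range of block rows that likely contains text; applying the same projection along columns would bound the region horizontally.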

  19. Rapid Training of Information Extraction with Local and Global Data Views

    DTIC Science & Technology

    2012-05-01

    4.1 An example of words and their bit string representations. Bold ones are transliterated Arabic words...Natural Language Processing (NLP) community faces new tasks and new domains all the time. Without enough labeled data of a new task or a new domain to...conduct supervised learning, semi-supervised learning is particularly attractive to NLP researchers since it only requires a handful of labeled examples

  20. Information Extraction for System-Software Safety Analysis: Calendar Year 2008 Year-End Report

    NASA Technical Reports Server (NTRS)

    Malin, Jane T.

    2009-01-01

    This annual report describes work to integrate a set of tools to support early model-based analysis of failures and hazards due to system-software interactions. The tools perform and assist analysts in the following tasks: 1) extract model parts from text for architecture and safety/hazard models; 2) combine the parts with library information to develop the models for visualization and analysis; 3) perform graph analysis and simulation to identify and evaluate possible paths from hazard sources to vulnerable entities and functions, in nominal and anomalous system-software configurations and scenarios; and 4) identify resulting candidate scenarios for software integration testing. There has been significant technical progress in model extraction from Orion program text sources, architecture model derivation (components and connections) and documentation of extraction sources. Models have been derived from Internal Interface Requirements Documents (IIRDs) and FMEA documents. Linguistic text processing is used to extract model parts and relationships, and the Aerospace Ontology also aids automated model development from the extracted information. Visualizations of these models assist analysts in requirements overview and in checking consistency and completeness.

  1. An Overview of Biomolecular Event Extraction from Scientific Documents

    PubMed Central

    Vanegas, Jorge A.; Matos, Sérgio; González, Fabio; Oliveira, José L.

    2015-01-01

    This paper presents a review of state-of-the-art approaches to automatic extraction of biomolecular events from scientific texts. Events involving biomolecules such as genes, transcription factors, or enzymes, for example, have a central role in biological processes and functions and provide valuable information for describing physiological and pathogenesis mechanisms. Event extraction from biomedical literature has a broad range of applications, including support for information retrieval, knowledge summarization, and information extraction and discovery. However, automatic event extraction is a challenging task due to the ambiguity and diversity of natural language and higher-level linguistic phenomena, such as speculations and negations, which occur in biological texts and can lead to misunderstanding or incorrect interpretation. Many strategies have been proposed in the last decade, originating from different research areas such as natural language processing, machine learning, and statistics. This review summarizes the most representative approaches in biomolecular event extraction and presents an analysis of the current state of the art and of commonly used methods, features, and tools. Finally, current research trends and future perspectives are also discussed. PMID:26587051

  2. BioCreative V track 4: a shared task for the extraction of causal network information using the Biological Expression Language.

    PubMed

    Rinaldi, Fabio; Ellendorff, Tilia Renate; Madan, Sumit; Clematide, Simon; van der Lek, Adrian; Mevissen, Theo; Fluck, Juliane

    2016-01-01

    Automatic extraction of biological network information is one of the most desired and most complex tasks in biological and medical text mining. Track 4 at BioCreative V attempts to approach this complexity using fragments of large-scale manually curated biological networks, represented in Biological Expression Language (BEL), as training and test data. BEL is an advanced knowledge representation format which has been designed to be both human readable and machine processable. The specific goal of track 4 was to evaluate text mining systems capable of automatically constructing BEL statements from given evidence text, and of retrieving evidence text for given BEL statements. Given the complexity of the task, we designed an evaluation methodology which gives credit to partially correct statements. We identified various levels of information expressed by BEL statements, such as entities, functions, relations, and introduced an evaluation framework which rewards systems capable of delivering useful BEL fragments at each of these levels. The aim of this evaluation method is to help identify the characteristics of the systems which, if combined, would be most useful for achieving the overall goal of automatically constructing causal biological networks from text. © The Author(s) 2016. Published by Oxford University Press.

  3. Extracting duration information in a picture category decoding task using hidden Markov Models

    NASA Astrophysics Data System (ADS)

    Pfeiffer, Tim; Heinze, Nicolai; Frysch, Robert; Deouell, Leon Y.; Schoenfeld, Mircea A.; Knight, Robert T.; Rose, Georg

    2016-04-01

    Objective. Adapting classifiers for the purpose of brain signal decoding is a major challenge in brain-computer-interface (BCI) research. In a previous study we showed in principle that hidden Markov models (HMM) are a suitable alternative to the well-studied static classifiers. However, since we investigated a rather straightforward task, advantages from modeling of the signal could not be assessed. Approach. Here, we investigate a more complex data set in order to find out to what extent HMMs, as a dynamic classifier, can provide useful additional information. We show for a visual decoding problem that besides category information, HMMs can simultaneously decode picture duration without an additional training required. This decoding is based on a strong correlation that we found between picture duration and the behavior of the Viterbi paths. Main results. Decoding accuracies of up to 80% could be obtained for category and duration decoding with a single classifier trained on category information only. Significance. The extraction of multiple types of information using a single classifier enables the processing of more complex problems, while preserving good training results even on small databases. Therefore, it provides a convenient framework for online real-life BCI utilizations.
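
    To make the notion of a Viterbi path concrete, the sketch below decodes the most likely state sequence of a small discrete HMM and then measures how long the path stays in one state. The two-state model, its probabilities, and the use of the longest run as a duration proxy are invented for illustration; the abstract reports only that picture duration correlates with the behavior of the Viterbi paths.

```python
import math


def _log(p):
    """Natural log that maps probability 0 to negative infinity."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(obs, states, start, trans, emit):
    """Most likely hidden-state sequence for a discrete HMM (log-space)."""
    # best[t][s]: log-probability of the best path that ends in state s at time t
    best = [{s: _log(start[s]) + _log(emit[s][obs[0]]) for s in states}]
    back = []
    for o in obs[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] + _log(trans[p][s]))
            scores[s] = best[-1][prev] + _log(trans[prev][s]) + _log(emit[s][o])
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


def longest_run(path, state):
    """Length of the longest stretch the path spends in `state`."""
    best = run = 0
    for s in path:
        run = run + 1 if s == state else 0
        best = max(best, run)
    return best


# Hypothetical two-state model: "stim" tends to emit high-amplitude samples.
states = ("rest", "stim")
start = {"rest": 0.9, "stim": 0.1}
trans = {"rest": {"rest": 0.8, "stim": 0.2}, "stim": {"rest": 0.2, "stim": 0.8}}
emit = {"rest": {"lo": 0.9, "hi": 0.1}, "stim": {"lo": 0.1, "hi": 0.9}}

path = viterbi(["lo", "hi", "hi", "hi", "lo"], states, start, trans, emit)
stim_duration = longest_run(path, "stim")
```

    Because the duration estimate is read off the decoded path, no second classifier needs to be trained, which matches the abstract's point that one model can yield both kinds of information.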

  4. Tags Extraction from Spatial Documents in Search Engines

    NASA Astrophysics Data System (ADS)

    Borhaninejad, S.; Hakimpour, F.; Hamzei, E.

    2015-12-01

    Nowadays, selective access to information on the Web is provided by search engines, but in cases where the data include spatial information, the search task becomes more complex and search engines require special capabilities. The purpose of this study is to extract the information contained in spatial documents. To that end, we implement and evaluate information extraction from GML documents and a retrieval method in an integrated approach. Our proposed system consists of three components: crawler, database and user interface. In the crawler component, GML documents are discovered and their text is parsed for information extraction and storage. The database component is responsible for indexing the information collected by the crawlers. Finally, the user interface component provides the interaction between system and user. We have implemented this system as a pilot on an application server that simulates the Web. As a spatial search engine, our system provides search capability across GML documents, an important step toward improving the efficiency of search engines.

  5. Feature saliency in judging the sex and familiarity of faces.

    PubMed

    Roberts, T; Bruce, V

    1988-01-01

    Two experiments are reported on the effect of feature masking on judgements of the sex and familiarity of faces. In experiment 1 the effect of masking the eyes, nose, or mouth of famous and nonfamous, male and female faces on response times in two tasks was investigated. In the first, recognition, task only masking of the eyes had a significant effect on response times. In the second, sex-judgement, task masking of the nose gave rise to a significant and large increase in response times. In experiment 2 it was found that when facial features were presented in isolation in a sex-judgement task, responses to noses were at chance level, unlike those for eyes or mouths. It appears that visual information available from the nose in isolation from the rest of the face is not sufficient for sex judgement, yet masking of the nose may disrupt the extraction of information about the overall topography of the face, information that may be more useful for sex judgement than for identification of a face.

  6. Spatio-Temporal Information Analysis of Event-Related BOLD Responses

    PubMed Central

    Alpert, Galit Fuhrmann; Handwerker, Dan; Sun, Felice T.; D’Esposito, Mark; Knight, Robert T.

    2009-01-01

    A new approach for analysis of event related fMRI (BOLD) signals is proposed. The technique is based on measures from information theory and is used both for spatial localization of task related activity, as well as for extracting temporal information regarding the task dependent propagation of activation across different brain regions. This approach enables whole brain visualization of voxels (areas) most involved in coding of a specific task condition, the time at which they are most informative about the condition, as well as their average amplitude at that preferred time. The approach does not require prior assumptions about the shape of the hemodynamic response function (HRF), nor about linear relations between BOLD response and presented stimuli (or task conditions). We show that relative delays between different brain regions can also be computed without prior knowledge of the experimental design, suggesting a general method that could be applied for analysis of differential time delays that occur during natural, uncontrolled conditions. Here we analyze BOLD signals recorded during performance of a motor learning task. We show that during motor learning, the BOLD response of unimodal motor cortical areas precedes the response in higher-order multimodal association areas, including posterior parietal cortex. Brain areas found to be associated with reduced activity during motor learning, predominantly in prefrontal brain regions, are informative about the task typically at significantly later times. PMID:17188515

  7. Corpora and Data Preparation for Information Extraction

    DTIC Science & Technology

    1993-09-01

    technical publications in fields such as communications, airline transportation, rubber & plastics, and food marketing. The Japanese-language...types in the U.S., for example, avocado farms, electric popcorn popper sales, management consulting. The template-filling task required that products

  8. Joint Extraction of Entities and Relations Using Reinforcement Learning and Deep Learning.

    PubMed

    Feng, Yuntian; Zhang, Hongjun; Hao, Wenning; Chen, Gang

    2017-01-01

    We use both reinforcement learning and deep learning to simultaneously extract entities and relations from unstructured texts. For reinforcement learning, we model the task as a two-step decision process. Deep learning is used to automatically capture the most important information from unstructured texts, which represent the state in the decision process. By designing the reward function per step, our proposed method can pass the information of entity extraction to relation extraction and obtain feedback in order to extract entities and relations simultaneously. Firstly, we use bidirectional LSTM to model the context information, which realizes preliminary entity extraction. On the basis of the extraction results, attention based method can represent the sentences that include target entity pair to generate the initial state in the decision process. Then we use Tree-LSTM to represent relation mentions to generate the transition state in the decision process. Finally, we employ Q-Learning algorithm to get control policy π in the two-step decision process. Experiments on ACE2005 demonstrate that our method attains better performance than the state-of-the-art method and gets a 2.4% increase in recall-score.
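
    The last step of the method names Q-learning. The standard tabular update rule can be sketched as below. This is a generic illustration and not the authors' implementation: in the paper the states come from LSTM-based representations, whereas the state names, action names and rewards here are hypothetical placeholders.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, next_actions,
             alpha=0.1, gamma=0.9):
    """One Q-learning step:
    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a)).
    An empty `next_actions` marks a terminal step (no future value)."""
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


# Hypothetical two-step episode: decide on an entity pair, then on a relation.
q = defaultdict(float)
q_update(q, "relation_state", "label_part_of", reward=1.0,
         next_state="done", next_actions=[], alpha=0.5)
q_update(q, "entity_state", "accept_pair", reward=0.0,
         next_state="relation_state", next_actions=["label_part_of"],
         alpha=0.5, gamma=0.5)
```

    Updating the second step first lets its reward flow back into the value of the first decision, which mirrors the feedback between relation extraction and entity extraction described above.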

  9. Joint Extraction of Entities and Relations Using Reinforcement Learning and Deep Learning

    PubMed Central

    Zhang, Hongjun; Chen, Gang

    2017-01-01

    We use both reinforcement learning and deep learning to simultaneously extract entities and relations from unstructured texts. For reinforcement learning, we model the task as a two-step decision process. Deep learning is used to automatically capture the most important information from unstructured texts, which represent the state in the decision process. By designing the reward function per step, our proposed method can pass the information of entity extraction to relation extraction and obtain feedback in order to extract entities and relations simultaneously. Firstly, we use bidirectional LSTM to model the context information, which realizes preliminary entity extraction. On the basis of the extraction results, attention based method can represent the sentences that include target entity pair to generate the initial state in the decision process. Then we use Tree-LSTM to represent relation mentions to generate the transition state in the decision process. Finally, we employ Q-Learning algorithm to get control policy π in the two-step decision process. Experiments on ACE2005 demonstrate that our method attains better performance than the state-of-the-art method and gets a 2.4% increase in recall-score. PMID:28894463

  10. Disruptive technologies for Massachusetts Bay Transportation Authority business strategy exploration.

    DOT National Transportation Integrated Search

    2013-04-01

    There are three tasks for this research: 1. Methodology to extract Road Usage Patterns from Phone Data: We combined the most complete record of daily mobility, based on large-scale mobile phone data, with detailed Geographic Information System (...

  11. A neural joint model for entity and relation extraction from biomedical text.

    PubMed

    Li, Fei; Zhang, Meishan; Fu, Guohong; Ji, Donghong

    2017-03-31

    Extracting biomedical entities and their relations from text has important applications in biomedical research. Previous work primarily utilized feature-based pipeline models to process this task. Feature-based models require considerable effort on feature engineering. Moreover, pipeline models may suffer from error propagation and are not able to utilize the interactions between subtasks. Therefore, we propose a neural joint model to extract biomedical entities as well as their relations simultaneously, and it can alleviate the problems above. Our model was evaluated on two tasks, i.e., the task of extracting adverse drug events between drug and disease entities, and the task of extracting resident relations between bacteria and location entities. Compared with the state-of-the-art systems in these tasks, our model improved the F1 scores of the first task by 5.1% in entity recognition and 8.0% in relation extraction, and that of the second task by 9.2% in relation extraction. The proposed model achieves competitive performances with less work on feature engineering. We demonstrate that the model based on neural networks is effective for biomedical entity and relation extraction. In addition, parameter sharing is an alternative method for neural models to jointly process this task. Our work can facilitate the research on biomedical text mining.

  12. Extracting Information from Narratives: An Application to Aviation Safety Reports

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Posse, Christian; Matzke, Brett D.; Anderson, Catherine M.

    2005-05-12

    Aviation safety reports are the best available source of information about why a flight incident happened. However, stream of consciousness permeates the narratives making difficult the automation of the information extraction task. We propose an approach and infrastructure based on a common pattern specification language to capture relevant information via normalized template expression matching in context. Template expression matching handles variants of multi-word expressions. Normalization improves the likelihood of correct hits by standardizing and cleaning the vocabulary used in narratives. Checking for the presence of negative modifiers in the proximity of a potential hit reduces the chance of false hits. We present the above approach in the context of a specific application, which is the extraction of human performance factors from NASA ASRS reports. While knowledge infusion from experts plays a critical role during the learning phase, early results show that in a production mode, the automated process provides information that is consistent with analyses by human subjects.
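
    The three ingredients named above (vocabulary normalization, multi-word template matching, and a check for nearby negative modifiers) can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' pattern specification language: the abbreviation map, the negation list, the window size and the function names are all hypothetical, and the matcher handles only exact token sequences after normalization.

```python
import re

# Hypothetical abbreviation map and negation list, for illustration only.
NORMALIZE = {"a/c": "aircraft", "capt": "captain", "f/o": "first officer"}
NEGATIONS = {"no", "not", "never", "without"}


def normalize(text):
    """Lowercase the text, tokenize it, and expand known abbreviations."""
    tokens = re.findall(r"[a-z0-9/]+", text.lower())
    return [NORMALIZE.get(tok, tok) for tok in tokens]


def find_hits(text, template, window=3):
    """Return (token_position, negated) for each match of a multi-word template.

    A hit is flagged as negated when a negation word occurs within
    `window` tokens before the match.
    """
    tokens = normalize(text)
    pattern = template.split()
    hits = []
    for i in range(len(tokens) - len(pattern) + 1):
        if tokens[i:i + len(pattern)] == pattern:
            context = tokens[max(0, i - window):i]
            negated = any(tok in NEGATIONS for tok in context)
            hits.append((i, negated))
    return hits


plain = find_hits("The capt was fatigued.", "captain was fatigued")
negated = find_hits("There was no crew fatigue reported", "crew fatigue")
```

    A downstream step would keep the first hit and discard the second, since the second is preceded by "no" within the window.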

  13. Efficient feature extraction from wide-area motion imagery by MapReduce in Hadoop

    NASA Astrophysics Data System (ADS)

    Cheng, Erkang; Ma, Liya; Blaisse, Adam; Blasch, Erik; Sheaff, Carolyn; Chen, Genshe; Wu, Jie; Ling, Haibin

    2014-06-01

    Wide-Area Motion Imagery (WAMI) feature extraction is important for applications such as target tracking, traffic management and accident discovery. With the increasing amount of WAMI collections and feature extraction from the data, a scalable framework is needed to handle the large amount of information. Cloud computing is one of the approaches recently applied in large scale or big data. In this paper, MapReduce in Hadoop is investigated for large scale feature extraction tasks for WAMI. Specifically, a large dataset of WAMI images is divided into several splits. Each split has a small subset of WAMI images. The feature extractions of WAMI images in each split are distributed to slave nodes in the Hadoop system. Feature extraction of each image is performed individually in the assigned slave node. Finally, the feature extraction results are sent to the Hadoop File System (HDFS) to aggregate the feature information over the collected imagery. Experiments of feature extraction with and without MapReduce are conducted to illustrate the effectiveness of our proposed Cloud-Enabled WAMI Exploitation (CAWE) approach.
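
    The split, map and aggregate flow described above can be sketched in plain Python. This only imitates the MapReduce pattern in a single process and does not use Hadoop; the feature (mean pixel intensity), the tiny nested-list images and all function names are hypothetical stand-ins for real WAMI feature extractors.

```python
from collections import defaultdict


def make_splits(items, split_size):
    """Divide the dataset into splits of at most `split_size` items."""
    return [items[i:i + split_size] for i in range(0, len(items), split_size)]


def map_extract(image_id, pixels):
    """Map step: extract features from one image, independently of the others.
    The feature here (mean intensity) is a placeholder."""
    flat = [p for row in pixels for p in row]
    yield ("mean_intensity", (image_id, sum(flat) / len(flat)))


def reduce_collect(pairs):
    """Reduce step: aggregate the emitted values by feature name."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def run_job(images, split_size=2):
    pairs = []
    for split in make_splits(sorted(images.items()), split_size):
        # On a cluster, each split would be handed to a separate worker node.
        for image_id, pixels in split:
            pairs.extend(map_extract(image_id, pixels))
    return reduce_collect(pairs)


images = {"a": [[0, 2], [4, 6]], "b": [[1, 1]], "c": [[10]]}
features = run_job(images)
```

    Because each image is processed independently in the map step, the work parallelizes across splits without coordination until the final aggregation.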

  14. Overview of the ID, EPI and REL tasks of BioNLP Shared Task 2011.

    PubMed

    Pyysalo, Sampo; Ohta, Tomoko; Rak, Rafal; Sullivan, Dan; Mao, Chunhong; Wang, Chunxia; Sobral, Bruno; Tsujii, Jun'ichi; Ananiadou, Sophia

    2012-06-26

    We present the preparation, resources, results and analysis of three tasks of the BioNLP Shared Task 2011: the main tasks on Infectious Diseases (ID) and Epigenetics and Post-translational Modifications (EPI), and the supporting task on Entity Relations (REL). The two main tasks represent extensions of the event extraction model introduced in the BioNLP Shared Task 2009 (ST'09) to two new areas of biomedical scientific literature, each motivated by the needs of specific biocuration tasks. The ID task concerns the molecular mechanisms of infection, virulence and resistance, focusing in particular on the functions of a class of signaling systems that are ubiquitous in bacteria. The EPI task is dedicated to the extraction of statements regarding chemical modifications of DNA and proteins, with particular emphasis on changes relating to the epigenetic control of gene expression. By contrast to these two application-oriented main tasks, the REL task seeks to support extraction in general by separating challenges relating to part-of relations into a subproblem that can be addressed by independent systems. Seven groups participated in each of the two main tasks and four groups in the supporting task. The participating systems indicated advances in the capability of event extraction methods and demonstrated generalization in many aspects: from abstracts to full texts, from previously considered subdomains to new ones, and from the ST'09 extraction targets to other entities and events. The highest performance achieved in the supporting task REL, 58% F-score, is broadly comparable with levels reported for other relation extraction tasks. For the ID task, the highest-performing system achieved 56% F-score, comparable to the state-of-the-art performance at the established ST'09 task. 
    In the EPI task, the best result was 53% F-score for the full set of extraction targets and 69% F-score for a reduced set of core extraction targets, approaching a level of performance sufficient for user-facing applications. In this study, we extend on previously reported results and perform further analyses of the outputs of the participating systems. We place specific emphasis on aspects of system performance relating to real-world applicability, considering alternate evaluation metrics and performing additional manual analysis of system outputs. We further demonstrate that the strengths of extraction systems can be combined to improve on the performance achieved by any system in isolation. The manually annotated corpora, supporting resources, and evaluation tools for all tasks are available from http://www.bionlp-st.org and the tasks continue as open challenges for all interested parties.

  15. Overview of the ID, EPI and REL tasks of BioNLP Shared Task 2011

    PubMed Central

    2012-01-01

    We present the preparation, resources, results and analysis of three tasks of the BioNLP Shared Task 2011: the main tasks on Infectious Diseases (ID) and Epigenetics and Post-translational Modifications (EPI), and the supporting task on Entity Relations (REL). The two main tasks represent extensions of the event extraction model introduced in the BioNLP Shared Task 2009 (ST'09) to two new areas of biomedical scientific literature, each motivated by the needs of specific biocuration tasks. The ID task concerns the molecular mechanisms of infection, virulence and resistance, focusing in particular on the functions of a class of signaling systems that are ubiquitous in bacteria. The EPI task is dedicated to the extraction of statements regarding chemical modifications of DNA and proteins, with particular emphasis on changes relating to the epigenetic control of gene expression. By contrast to these two application-oriented main tasks, the REL task seeks to support extraction in general by separating challenges relating to part-of relations into a subproblem that can be addressed by independent systems. Seven groups participated in each of the two main tasks and four groups in the supporting task. The participating systems indicated advances in the capability of event extraction methods and demonstrated generalization in many aspects: from abstracts to full texts, from previously considered subdomains to new ones, and from the ST'09 extraction targets to other entities and events. The highest performance achieved in the supporting task REL, 58% F-score, is broadly comparable with levels reported for other relation extraction tasks. For the ID task, the highest-performing system achieved 56% F-score, comparable to the state-of-the-art performance at the established ST'09 task. 
    In the EPI task, the best result was 53% F-score for the full set of extraction targets and 69% F-score for a reduced set of core extraction targets, approaching a level of performance sufficient for user-facing applications. In this study, we extend on previously reported results and perform further analyses of the outputs of the participating systems. We place specific emphasis on aspects of system performance relating to real-world applicability, considering alternate evaluation metrics and performing additional manual analysis of system outputs. We further demonstrate that the strengths of extraction systems can be combined to improve on the performance achieved by any system in isolation. The manually annotated corpora, supporting resources, and evaluation tools for all tasks are available from http://www.bionlp-st.org and the tasks continue as open challenges for all interested parties. PMID:22759456

  16. Four types of ensemble coding in data visualizations.

    PubMed

    Szafir, Danielle Albers; Haroz, Steve; Gleicher, Michael; Franconeri, Steven

    2016-01-01

    Ensemble coding supports rapid extraction of visual statistics about distributed visual information. Researchers typically study this ability with the goal of drawing conclusions about how such coding extracts information from natural scenes. Here we argue that a second domain can serve as another strong inspiration for understanding ensemble coding: graphs, maps, and other visual presentations of data. Data visualizations allow observers to leverage their ability to perform visual ensemble statistics on distributions of spatial or featural visual information to estimate actual statistics on data. We survey the types of visual statistical tasks that occur within data visualizations across everyday examples, such as scatterplots, and more specialized images, such as weather maps or depictions of patterns in text. We divide these tasks into four categories: identification of sets of values, summarization across those values, segmentation of collections, and estimation of structure. We point to unanswered questions for each category and give examples of such cross-pollination in the current literature. Increased collaboration between the data visualization and perceptual psychology research communities can inspire new solutions to challenges in visualization while simultaneously exposing unsolved problems in perception research.

  17. The Effects of Using Corpora on Revision Tasks in L2 Writing with Coded Error Feedback

    ERIC Educational Resources Information Center

    Tono, Yukio; Satake, Yoshiho; Miura, Aika

    2014-01-01

    This study reports on the results of classroom research investigating the effects of corpus use in the process of revising compositions in English as a foreign language. Our primary aim was to investigate the relationship between the information extracted from corpus data and how that information actually helped in revising different types of…

  18. Comparison of continuously acquired resting state and extracted analogues from active tasks

    PubMed Central

    Ganger, Sebastian; Hahn, Andreas; Küblböck, Martin; Kranz, Georg S.; Spies, Marie; Vanicek, Thomas; Seiger, René; Sladky, Ronald; Windischberger, Christian; Kasper, Siegfried

    2015-01-01

    Functional connectivity analysis of brain networks has become an important tool for investigation of human brain function. Although functional connectivity computations are usually based on resting‐state data, the application to task‐specific fMRI has received growing attention. Three major methods for extraction of resting‐state data from task‐related signal have been proposed: (1) usage of unmanipulated task data for functional connectivity; (2) regression against task effects, subsequently using the residuals; and (3) concatenation of baseline blocks located in‐between task blocks. Despite widespread application in current research, consensus on which method best resembles resting‐state seems to be missing. We, therefore, evaluated these techniques in a sample of 26 healthy controls measured at 7 Tesla. In addition to continuous resting‐state, two different task paradigms were assessed (emotion discrimination and right finger‐tapping) and five well‐described networks were analyzed (default mode, thalamus, cuneus, sensorimotor, and auditory). Investigating the similarity to continuous resting‐state (Dice, intraclass correlation coefficient (ICC), R²) showed that regression against task effects yields functional connectivity networks most alike to resting‐state. However, all methods exhibited significant differences when compared to continuous resting‐state and similarity metrics were lower than test‐retest of two resting‐state scans. Omitting global signal regression did not change these findings. Visually, the networks are highly similar, but through further investigation marked differences can be found. Therefore, our data does not support referring to resting‐state when extracting signals from task designs, although functional connectivity computed from task‐specific data may indeed yield interesting information. Hum Brain Mapp 36:4053–4063, 2015. © 2015 The Authors Human Brain Mapping Published by Wiley Periodicals, Inc. PMID:26178250
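
    Of the similarity measures listed, the Dice coefficient is the simplest to illustrate: for two binary network maps A and B it equals 2|A ∩ B| / (|A| + |B|). The sketch below assumes the maps have already been thresholded and flattened into equal-length sequences of 0/1 values; that preprocessing, the function name and the example masks are assumptions, not details from the study.

```python
def dice(mask_a, mask_b):
    """Dice coefficient between two equal-length binary masks."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same length")
    size_a = sum(1 for x in mask_a if x)
    size_b = sum(1 for y in mask_b if y)
    overlap = sum(1 for x, y in zip(mask_a, mask_b) if x and y)
    if size_a + size_b == 0:
        return 1.0  # convention chosen here: two empty masks count as identical
    return 2 * overlap / (size_a + size_b)


# Hypothetical thresholded voxel masks (1 = voxel belongs to the network).
resting_state = [1, 1, 0, 0]
task_residuals = [1, 0, 1, 0]
similarity = dice(resting_state, task_residuals)
```

    A value of 1 means the two maps cover exactly the same voxels, and 0 means they do not overlap at all.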

  19. Iterative filtering decomposition based on local spectral evolution kernel

    PubMed Central

    Wang, Yang; Wei, Guo-Wei; Yang, Siyang

    2011-01-01

    Synthesizing information, achieving understanding, and deriving insight from increasingly massive, time-varying, noisy and possibly conflicting data sets are some of the most challenging tasks in the present information age. Traditional technologies, such as Fourier transform and wavelet multi-resolution analysis, are inadequate to handle all of the above-mentioned tasks. The empirical mode decomposition (EMD) has emerged as a new powerful tool for resolving many challenging problems in data processing and analysis. Recently, an iterative filtering decomposition (IFD) has been introduced to address the stability and efficiency problems of the EMD. Another data analysis technique is the local spectral evolution kernel (LSEK), which provides a near perfect low pass filter with desirable time-frequency localizations. The present work utilizes the LSEK to further stabilize the IFD, and offers an efficient, flexible and robust scheme for information extraction, complexity reduction, and signal and image understanding. The performance of the present LSEK based IFD is intensively validated over a wide range of data processing tasks, including mode decomposition, analysis of time-varying data, information extraction from nonlinear dynamic systems, etc. The utility, robustness and usefulness of the proposed LSEK based IFD are demonstrated via a large number of applications, such as the analysis of stock market data, the decomposition of ocean wave magnitudes, the understanding of physiologic signals and information recovery from noisy images. The performance of the proposed method is compared with that of existing methods in the literature. Our results indicate that the LSEK based IFD improves both the efficiency and the stability of conventional EMD algorithms. PMID:22350559

  20. Acquiring Information from Wider Scope to Improve Event Extraction

    DTIC Science & Technology

    2012-05-01

    solve all the problems might be hard or even impossible: Word sense disambiguation is already a hard NLP task, and normalizing different expressions...blindfolded woman seen being shot in the head by a hooded militant on a video obtained but not aired by the Arab television station Al-Jazeera. She...imbalance Why are we interested in unsupervised topic features? There is a problem that arises in the evaluation of almost all the tasks in NLP, concerning

  1. Local and global aspects of biological motion perception in children born at very low birth weight

    PubMed Central

    Williamson, K. E.; Jakobson, L. S.; Saunders, D. R.; Troje, N. F.

    2015-01-01

    Biological motion perception can be assessed using a variety of tasks. In the present study, 8- to 11-year-old children born prematurely at very low birth weight (<1500 g) and matched, full-term controls completed tasks that required the extraction of local motion cues, the ability to perceptually group these cues to extract information about body structure, and the ability to carry out higher order processes required for action recognition and person identification. Preterm children exhibited difficulties in all 4 aspects of biological motion perception. However, intercorrelations between test scores were weak in both full-term and preterm children, a finding that supports the view that these processes are relatively independent. Preterm children also displayed more autistic-like traits than full-term peers. In preterm (but not full-term) children, these traits were negatively correlated with performance in the task requiring structure-from-motion processing, r(30) = −.36, p < .05, but positively correlated with the ability to extract identity, r(30) = .45, p < .05. These findings extend previous reports of vulnerability in systems involved in processing dynamic cues in preterm children and suggest that a core deficit in social perception/cognition may contribute to the development of the social and behavioral difficulties even in members of this population who are functioning within the normal range intellectually. The results could inform the development of screening, diagnostic, and intervention tools. PMID:25103588

  2. An Ontology-Enabled Natural Language Processing Pipeline for Provenance Metadata Extraction from Biomedical Text (Short Paper).

    PubMed

    Valdez, Joshua; Rueschman, Michael; Kim, Matthew; Redline, Susan; Sahoo, Satya S

    2016-10-01

    Extraction of structured information from biomedical literature is a complex and challenging problem due to the complexity of the biomedical domain and the lack of appropriate natural language processing (NLP) techniques. High quality domain ontologies model both data and metadata information at a fine level of granularity, which can be effectively used to accurately extract structured information from biomedical text. Extraction of provenance metadata, which describes the history or source of information, from published articles is an important task to support scientific reproducibility. Reproducibility of results reported by previous research studies is a foundational component of scientific advancement. This is highlighted by the recent initiative by the US National Institutes of Health called "Principles of Rigor and Reproducibility". In this paper, we describe an effective approach to extract provenance metadata from published biomedical research literature using an ontology-enabled NLP platform as part of the Provenance for Clinical and Healthcare Research (ProvCaRe). The ProvCaRe-NLP tool extends the clinical Text Analysis and Knowledge Extraction System (cTAKES) platform using both provenance and biomedical domain ontologies. We demonstrate the effectiveness of the ProvCaRe-NLP tool using a corpus of 20 peer-reviewed publications. The results of our evaluation demonstrate that the ProvCaRe-NLP tool has significantly higher recall in extracting provenance metadata as compared to existing NLP pipelines such as MetaMap.

  3. Establishment of Application Guidance for OTC non-Kampo Crude Drug Extract Products in Japan

    PubMed Central

    Somekawa, Layla; Maegawa, Hikoichiro; Tsukada, Shinsuke; Nakamura, Takatoshi

    2017-01-01

    Currently, there are no standardized regulatory systems for herbal medicinal products worldwide. Communication and sharing of knowledge between different regulatory systems will lead to mutual understanding and might help identify topics which deserve further discussion in the establishment of common standards. Regulatory information on traditional herbal medicinal products in Japan has been updated by the establishment of the Application Guidance for over-the-counter non-Kampo Crude Drug Extract Products. Here we report the updated regulatory information in the new Application Guidance. The guidance sets out methods for comparing a Crude Drug Extract formulation with the standard decoction, the criteria for application, and the key points to consider for each criterion. Establishment of the guidance contributes to improvements in public health. We hope that the regulatory information about traditional herbal medicinal products in Japan will contribute to tackling the challenging task of regulating traditional herbal products worldwide. PMID:28894633

  4. Comparing deep learning and concept extraction based methods for patient phenotyping from clinical narratives.

    PubMed

    Gehrmann, Sebastian; Dernoncourt, Franck; Li, Yeran; Carlson, Eric T; Wu, Joy T; Welt, Jonathan; Foote, John; Moseley, Edward T; Grant, David W; Tyler, Patrick D; Celi, Leo A

    2018-01-01

    In secondary analysis of electronic health records, a crucial task consists in correctly identifying the patient cohort under investigation. In many cases, the most valuable and relevant information for an accurate classification of medical conditions exists only in clinical narratives. Therefore, it is necessary to use natural language processing (NLP) techniques to extract and evaluate these narratives. The most commonly used approach to this problem relies on extracting a number of clinician-defined medical concepts from text and using machine learning techniques to identify whether a particular patient has a certain condition. However, recent advances in deep learning and NLP enable models to learn a rich representation of (medical) language. Convolutional neural networks (CNN) for text classification can augment the existing techniques by leveraging the representation of language to learn which phrases in a text are relevant for a given medical condition. In this work, we compare concept extraction based methods with CNNs and other commonly used models in NLP in ten phenotyping tasks using 1,610 discharge summaries from the MIMIC-III database. We show that CNNs outperform concept extraction based methods in almost all of the tasks, with improvements of up to 26 percentage points in F1-score and up to 7 percentage points in area under the ROC curve (AUC). We additionally assess the interpretability of both approaches by presenting and evaluating methods that calculate and extract the most salient phrases for a prediction. The results indicate that CNNs are a valid alternative to existing approaches in patient phenotyping and cohort identification, and should be further investigated. Moreover, the deep learning approach presented in this paper can be used to assist clinicians during chart review or support the extraction of billing codes from text by identifying and highlighting relevant phrases for various medical conditions.

  5. Improving mental task classification by adding high frequency band information.

    PubMed

    Zhang, Li; He, Wei; He, Chuanhong; Wang, Ping

    2010-02-01

    Features extracted from the delta, theta, alpha, beta and gamma bands spanning the low frequency range are commonly used to classify scalp-recorded electroencephalogram (EEG) signals when designing brain-computer interfaces (BCI), and higher frequencies are often neglected as noise. In this paper, we performed an experimental validation to demonstrate that high frequency components could provide helpful information for improving the performance of the mental task based BCI. Electromyography (EMG) and electrooculography (EOG) artifacts were removed by using blind source separation (BSS) techniques. Frequency band powers and asymmetry ratios from the high frequency band (40-100 Hz) together with those from the lower frequency bands were used to represent EEG features. Finally, Fisher discriminant analysis (FDA) combined with Mahalanobis distance was used as the classifier. In this study, four types of classifications were performed using EEG signals recorded from four subjects during five mental tasks. We obtained significantly higher classification accuracy by adding the high frequency band features compared to using the low frequency bands alone, which demonstrated that the information in high frequency components from scalp-recorded EEG is valuable for the mental task based BCI.
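
    The band power and asymmetry ratio features mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything here is an assumption: `band_power` sums squared DFT magnitudes over a frequency range using a naive transform, and `asymmetry_ratio` uses the common (left - right) / (left + right) form. The synthetic 60 Hz signals stand in for two hypothetical electrode channels.

```python
import cmath
import math


def band_power(signal, fs, f_lo, f_hi):
    """Sum of |DFT|^2 / n over bins whose frequency lies in [f_lo, f_hi].

    Uses a naive O(n^2) DFT so the sketch needs only the standard library.
    """
    n = len(signal)
    power = 0.0
    for k in range(n // 2 + 1):
        freq = k * fs / n
        if f_lo <= freq <= f_hi:
            coeff = sum(
                x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(signal)
            )
            power += abs(coeff) ** 2 / n
    return power


def asymmetry_ratio(p_left, p_right):
    """Assumed definition: normalized power difference between two channels."""
    return (p_left - p_right) / (p_left + p_right)


# Hypothetical one-second recordings sampled at 256 Hz: a 60 Hz component
# that is twice as strong on the "left" channel as on the "right" one.
fs = 256
left = [math.sin(2 * math.pi * 60 * t / fs) for t in range(fs)]
right = [0.5 * x for x in left]

high_left = band_power(left, fs, 40, 100)    # high frequency band (40-100 Hz)
high_right = band_power(right, fs, 40, 100)
alpha_left = band_power(left, fs, 8, 13)     # a low frequency band for contrast
ratio = asymmetry_ratio(high_left, high_right)
```

    In this toy signal nearly all power falls in the 40-100 Hz band, so a feature vector built only from the lower bands would miss it.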

  6. Automatic Extraction of JPF Options and Documentation

    NASA Technical Reports Server (NTRS)

    Luks, Wojciech; Tkachuk, Oksana; Buschnell, David

    2011-01-01

    Documenting existing Java PathFinder (JPF) projects or developing new extensions is a challenging task. JPF provides a platform for creating new extensions and relies on key-value properties for their configuration. Keeping track of all possible options and extension mechanisms in JPF can be difficult. This paper presents jpf-autodoc-options, a tool that automatically extracts JPF project options and other documentation-related information, which can greatly help both JPF users and developers of JPF extensions.

  7. Music information retrieval in compressed audio files: a survey

    NASA Astrophysics Data System (ADS)

    Zampoglou, Markos; Malamos, Athanasios G.

    2014-07-01

    In this paper, we present an organized survey of the existing literature on music information retrieval systems in which descriptor features are extracted directly from the compressed audio files, without prior decompression to pulse-code modulation format. Avoiding the decompression step and utilizing the readily available compressed-domain information can significantly lighten the computational cost of a music information retrieval system, allowing application to large-scale music databases. We identify a number of systems relying on compressed-domain information and form a systematic classification of the features they extract, the retrieval tasks they tackle and the degree to which they achieve an actual increase in the overall speed, as well as any resulting loss in accuracy. Finally, we discuss recent developments in the field, and the potential research directions they open toward ultra-fast, scalable systems.

  8. OpenDMAP: An open source, ontology-driven concept analysis engine, with applications to capturing knowledge regarding protein transport, protein interactions and cell-type-specific gene expression

    PubMed Central

    Hunter, Lawrence; Lu, Zhiyong; Firby, James; Baumgartner, William A; Johnson, Helen L; Ogren, Philip V; Cohen, K Bretonnel

    2008-01-01

    Background: Information extraction (IE) efforts are widely acknowledged to be important in harnessing the rapid advance of biomedical knowledge, particularly in areas where important factual information is published in a diverse literature. Here we report on the design, implementation and several evaluations of OpenDMAP, an ontology-driven, integrated concept analysis system. It significantly advances the state of the art in information extraction by leveraging knowledge in ontological resources, integrating diverse text processing applications, and using an expanded pattern language that allows the mixing of syntactic and semantic elements and variable ordering. Results: OpenDMAP information extraction systems were produced for extracting protein transport assertions (transport), protein-protein interaction assertions (interaction) and assertions that a gene is expressed in a cell type (expression). Evaluations were performed on each system, resulting in F-scores ranging from .26 – .72 (precision .39 – .85, recall .16 – .85). Additionally, each of these systems was run over all abstracts in MEDLINE, producing a total of 72,460 transport instances, 265,795 interaction instances and 176,153 expression instances. Conclusion: OpenDMAP advances the performance standards for extracting protein-protein interaction predications from the full texts of biomedical research articles. Furthermore, this level of performance appears to generalize to other information extraction tasks, including extracting information about predicates of more than two arguments. The output of the information extraction system is always constructed from elements of an ontology, ensuring that the knowledge representation is grounded with respect to a carefully constructed model of reality. The results of these efforts can be used to increase the efficiency of manual curation efforts and to provide additional features in systems that integrate multiple sources for information extraction. The open source OpenDMAP code library is freely available at PMID:18237434

  9. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gydesen, S.P.

    The purpose of this letter report is to reconstruct, from available information, the data that can be used to develop a daily reactor operating history for 1960--1964. The information needed for source term calculations (as determined by the Source Terms Task Leader) was extracted and included in this report. The data on the amount of uranium dissolved by the separations plants (expressed both as tons and as MW) are also included in this compilation.

  10. Human-robot skills transfer interfaces for a flexible surgical robot.

    PubMed

    Calinon, Sylvain; Bruno, Danilo; Malekzadeh, Milad S; Nanayakkara, Thrishantha; Caldwell, Darwin G

    2014-09-01

    In minimally invasive surgery, tools go through narrow openings and manipulate soft organs to perform surgical tasks. There are limitations in current robot-assisted surgical systems due to the rigidity of robot tools. The aim of the STIFF-FLOP European project is to develop a soft robotic arm to perform surgical tasks. The flexibility of the robot allows the surgeon to move within organs to reach remote areas inside the body and perform challenging procedures in laparoscopy. This article addresses the problem of designing learning interfaces enabling the transfer of skills from human demonstration. Robot programming by demonstration encompasses a wide range of learning strategies, from simple mimicking of the demonstrator's actions to the higher level imitation of the underlying intent extracted from the demonstrations. By focusing on this last form, we study the problem of extracting an objective function explaining the demonstrations from an over-specified set of candidate reward functions, and using this information for self-refinement of the skill. In contrast to inverse reinforcement learning strategies that attempt to explain the observations with reward functions defined for the entire task (or a set of pre-defined reward profiles active for different parts of the task), the proposed approach is based on context-dependent reward-weighted learning, where the robot can learn the relevance of candidate objective functions with respect to the current phase of the task or encountered situation. The robot then exploits this information for skills refinement in the policy parameters space. The proposed approach is tested in simulation with a cutting task performed by the STIFF-FLOP flexible robot, using kinesthetic demonstrations from a Barrett WAM manipulator. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.

  11. Automation for System Safety Analysis

    NASA Technical Reports Server (NTRS)

    Malin, Jane T.; Fleming, Land; Throop, David; Thronesbery, Carroll; Flores, Joshua; Bennett, Ted; Wennberg, Paul

    2009-01-01

    This presentation describes work to integrate a set of tools to support early model-based analysis of failures and hazards due to system-software interactions. The tools perform and assist analysts in the following tasks: 1) extract model parts from text for architecture and safety/hazard models; 2) combine the parts with library information to develop the models for visualization and analysis; 3) perform graph analysis and simulation to identify and evaluate possible paths from hazard sources to vulnerable entities and functions, in nominal and anomalous system-software configurations and scenarios; and 4) identify resulting candidate scenarios for software integration testing. There has been significant technical progress in model extraction from Orion program text sources, architecture model derivation (components and connections) and documentation of extraction sources. Models have been derived from Internal Interface Requirements Documents (IIRDs) and FMEA documents. Linguistic text processing is used to extract model parts and relationships, and the Aerospace Ontology also aids automated model development from the extracted information. Visualizations of these models assist analysts in requirements overview and in checking consistency and completeness.

  12. Smart Extraction and Analysis System for Clinical Research.

    PubMed

    Afzal, Muhammad; Hussain, Maqbool; Khan, Wajahat Ali; Ali, Taqdir; Jamshed, Arif; Lee, Sungyoung

    2017-05-01

    With the increasing use of electronic health records (EHRs), there is a growing need to expand the utilization of EHR data to support clinical research. The key challenge in achieving this goal is the unavailability of smart systems and methods to overcome the issue of data preparation, structuring, and sharing for smooth clinical research. We developed a robust analysis system called the smart extraction and analysis system (SEAS) that consists of two subsystems: (1) the information extraction system (IES), for extracting information from clinical documents, and (2) the survival analysis system (SAS), for a descriptive and predictive analysis to compile the survival statistics and predict the future chance of survivability. The IES subsystem is based on a novel permutation-based pattern recognition method that extracts information from unstructured clinical documents. Similarly, the SAS subsystem is based on a classification and regression tree (CART)-based prediction model for survival analysis. SEAS is evaluated and validated on a real-world case study of head and neck cancer. The overall information extraction accuracy of the system for semistructured text is recorded at 99%, while that for unstructured text is 97%. Furthermore, the automated, unstructured information extraction has reduced the average time spent on manual data entry by 75%, without compromising the accuracy of the system. Moreover, around 88% of patients are found in a terminal or dead state for the highest clinical stage of disease (level IV). Similarly, there is an ∼36% probability of a patient being alive if at least one of the lifestyle risk factors was positive. We presented our work on the development of SEAS to replace costly and time-consuming manual methods with smart automatic extraction of information and survival prediction methods. SEAS has reduced the time and energy of human resources spent unnecessarily on manual tasks.

  13. Quantum algorithms for topological and geometric analysis of data

    PubMed Central

    Lloyd, Seth; Garnerone, Silvano; Zanardi, Paolo

    2016-01-01

    Extracting useful information from large data sets can be a daunting task. Topological methods for analysing data sets provide a powerful technique for extracting such information. Persistent homology is a sophisticated tool for identifying topological features and for determining how such features persist as the data is viewed at different scales. Here we present quantum machine learning algorithms for calculating Betti numbers—the numbers of connected components, holes and voids—in persistent homology, and for finding eigenvectors and eigenvalues of the combinatorial Laplacian. The algorithms provide an exponential speed-up over the best currently known classical algorithms for topological data analysis. PMID:26806491
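
    To make the Betti numbers in this abstract concrete, here is a small classical sketch (not the quantum algorithm the paper presents). It is restricted to the neighborhood graph built at scale eps, where the zeroth Betti number is the number of connected components and the first is edges - vertices + components. A full persistent homology computation would also fill in higher simplices, which this sketch deliberately ignores; the function name and the sample points are illustrative assumptions.

```python
import math
from itertools import combinations


def graph_betti(points, eps):
    """Return (b0, b1) of the graph joining points at distance <= eps.

    b0 counts connected components (via union-find); b1 counts independent
    cycles of the graph, computed as edges - vertices + components.
    """
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    edges = 0
    for i, j in combinations(range(len(points)), 2):
        if math.dist(points[i], points[j]) <= eps:
            edges += 1
            parent[find(i)] = find(j)

    b0 = len({find(i) for i in range(len(points))})
    b1 = edges - len(points) + b0
    return b0, b1


# Corners of a unit square viewed at two scales.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
fine = graph_betti(square, 0.5)    # no edges yet: four separate components
coarse = graph_betti(square, 1.0)  # four sides connect: one component, one hole
```

    Comparing the two scales shows the kind of feature birth that persistent homology tracks: the single hole appears only once eps reaches the side length.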

  14. Gaze movements and spatial working memory in collision avoidance: a traffic intersection task

    PubMed Central

    Hardiess, Gregor; Hansmann-Roth, Sabrina; Mallot, Hanspeter A.

    2013-01-01

    Street crossing under traffic is an everyday activity including collision detection as well as avoidance of objects in the path of motion. Such tasks demand extraction and representation of spatio-temporal information about relevant obstacles in an optimized format. Relevant task information is extracted visually by the use of gaze movements and represented in spatial working memory. In a virtual reality traffic intersection task, subjects are confronted with a two-lane intersection where cars are appearing with different frequencies, corresponding to high and low traffic densities. Under free observation and exploration of the scenery (using unrestricted eye and head movements) the overall task for the subjects was to predict the potential-of-collision (POC) of the cars or to adjust an adequate driving speed in order to cross the intersection without collision (i.e., to find the free space for crossing). In a series of experiments, gaze movement parameters, task performance, and the representation of car positions within working memory at distinct time points were assessed in normal subjects as well as in neurological patients suffering from homonymous hemianopia. In the following, we review the findings of these experiments together with other studies and provide a new perspective of the role of gaze behavior and spatial memory in collision detection and avoidance, focusing on the following questions: (1) which sensory variables can be identified supporting adequate collision detection? (2) How do gaze movements and working memory contribute to collision avoidance when multiple moving objects are present and (3) how do they correlate with task performance? (4) How do patients with homonymous visual field defects (HVFDs) use gaze movements and working memory to compensate for visual field loss? 
    In conclusion, we extend the theory of collision detection and avoidance in the case of multiple moving objects and provide a new perspective on the combined operation of external (bottom-up) and internal (top-down) cues in a traffic intersection task. PMID:23760667

  15. Perceptual Learning, Cognition, and Expertise

    ERIC Educational Resources Information Center

    Kellman, Philip J.; Massey, Christine M.

    2013-01-01

    Recent research indicates that perceptual learning (PL)--experience-induced changes in the way perceivers extract information--plays a larger role in complex cognitive tasks, including abstract and symbolic domains, than has been understood in theory or implemented in instruction. Here, we describe the involvement of PL in complex cognitive tasks…

  16. Generalized Categorial Grammar for Unbounded Dependencies Recovery

    ERIC Educational Resources Information Center

    Nguyen, Luan Viet

    2014-01-01

    Accurate recovery of predicate-argument dependencies is vital for interpretation tasks like information extraction and question answering, and unbounded dependencies may account for a significant portion of the dependencies in any given text. This thesis describes a Generalized Categorial Grammar (GCG) which, like other categorial grammars,…

  17. Automatic Generation of Conditional Diagnostic Guidelines.

    PubMed

    Baldwin, Tyler; Guo, Yufan; Syeda-Mahmood, Tanveer

    2016-01-01

    The diagnostic workup for many diseases can be extraordinarily nuanced, and as such reference material text often contains extensive information regarding when it is appropriate to have a patient undergo a given procedure. In this work we employ a three task pipeline for the extraction of statements indicating the conditions under which a procedure should be performed, given a suspected diagnosis. First, we identify each instance in the text where a procedure is being recommended. Next we examine the context around these recommendations to extract conditional statements that dictate the conditions under which the recommendation holds. Finally, corefering recommendations across the document are linked to produce a full recommendation summary. Results indicate that each underlying task can be performed with above baseline performance, and the output can be used to produce concise recommendation summaries.

  18. On the performances of computer vision algorithms on mobile platforms

    NASA Astrophysics Data System (ADS)

    Battiato, S.; Farinella, G. M.; Messina, E.; Puglisi, G.; Ravì, D.; Capra, A.; Tomaselli, V.

    2012-01-01

    Computer Vision enables mobile devices to extract the meaning of the observed scene from the information acquired with the onboard sensor cameras. Nowadays, there is a growing interest in Computer Vision algorithms able to work on mobile platforms (e.g., phone cameras, point-and-shoot cameras, etc.). Indeed, bringing Computer Vision capabilities to mobile devices opens new opportunities in different application contexts. The implementation of vision algorithms on mobile devices is still a challenging task since these devices have poor image sensors and optics as well as limited processing power. In this paper we have considered different algorithms covering classic Computer Vision tasks: keypoint extraction, face detection, image segmentation. Several tests have been done to compare the performance of the mobile platforms involved: Nokia N900, LG Optimus One, Samsung Galaxy SII.

  19. Adaptable, high recall, event extraction system with minimal configuration.

    PubMed

    Miwa, Makoto; Ananiadou, Sophia

    2015-01-01

    Biomedical event extraction has been a major focus of biomedical natural language processing (BioNLP) research since the first BioNLP shared task was held in 2009. Accordingly, a large number of event extraction systems have been developed. Most such systems, however, have been developed for specific tasks and/or incorporated task specific settings, making their application to new corpora and tasks problematic without modification of the systems themselves. There is thus a need for event extraction systems that can achieve high levels of accuracy when applied to corpora in new domains, without the need for exhaustive tuning or modification, whilst retaining competitive levels of performance. We have enhanced our state-of-the-art event extraction system, EventMine, to alleviate the need for task-specific tuning. Task-specific details are specified in a configuration file, while extensive task-specific parameter tuning is avoided through the integration of a weighting method, a covariate shift method, and their combination. The task-specific configuration and weighting method have been employed within the context of two different sub-tasks of BioNLP shared task 2013, i.e. Cancer Genetics (CG) and Pathway Curation (PC), removing the need to modify the system specifically for each task. With minimal task specific configuration and tuning, EventMine achieved the 1st place in the PC task, and 2nd in the CG, achieving the highest recall for both tasks. The system has been further enhanced following the shared task by incorporating the covariate shift method and entity generalisations based on the task definitions, leading to further performance improvements. We have shown that it is possible to apply a state-of-the-art event extraction system to new tasks with high levels of performance, without having to modify the system internally. Both covariate shift and weighting methods are useful in facilitating the production of high recall systems. 
    These methods and their combination can adapt a model to the target data with no deep tuning and little manual configuration.

  20. Extracting semantic representations from word co-occurrence statistics: stop-lists, stemming, and SVD.

    PubMed

    Bullinaria, John A; Levy, Joseph P

    2012-09-01

    In a previous article, we presented a systematic computational study of the extraction of semantic representations from the word-word co-occurrence statistics of large text corpora. The conclusion was that semantic vectors of pointwise mutual information values from very small co-occurrence windows, together with a cosine distance measure, consistently resulted in the best representations across a range of psychologically relevant semantic tasks. This article extends that study by investigating the use of three further factors--namely, the application of stop-lists, word stemming, and dimensionality reduction using singular value decomposition (SVD)--that have been used to provide improved performance elsewhere. It also introduces an additional semantic task and explores the advantages of using a much larger corpus. This leads to the discovery and analysis of improved SVD-based methods for generating semantic representations (that provide new state-of-the-art performance on a standard TOEFL task) and the identification and discussion of problems and misleading results that can arise without a full systematic study.
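
    The core representation described in this abstract (vectors of pointwise mutual information values from small co-occurrence windows, compared with a cosine measure) can be sketched as follows. This is an illustrative toy, not the authors' pipeline: the tiny token list, the window of one word on each side, and the choice to store only observed pairs are all assumptions, and stop-lists, stemming and SVD are left out.

```python
import math
from collections import Counter, defaultdict


def pmi_vectors(tokens, window=1):
    """Map each word to a sparse vector {context: PMI(word, context)}."""
    pair_counts = Counter()
    for i, word in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pair_counts[(word, tokens[j])] += 1

    total = sum(pair_counts.values())
    marginal = Counter()
    for (word, _ctx), count in pair_counts.items():
        marginal[word] += count

    vectors = defaultdict(dict)
    for (word, ctx), count in pair_counts.items():
        # PMI = log2( p(word, ctx) / (p(word) * p(ctx)) )
        vectors[word][ctx] = math.log2(count * total / (marginal[word] * marginal[ctx]))
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(
        sum(x * x for x in v.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical miniature corpus in which "cat" and "dog" share all contexts.
tokens = (
    "the cat drinks milk the dog drinks water "
    "a cat eats fish a dog eats meat"
).split()
vectors = pmi_vectors(tokens, window=1)
sim_cat_dog = cosine(vectors["cat"], vectors["dog"])
sim_cat_milk = cosine(vectors["cat"], vectors["milk"])
```

    Words that occur in the same contexts end up with nearly parallel vectors, which is what the semantic tasks in the abstract exploit.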

  1. Development and evaluation of task-specific NLP framework in China.

    PubMed

    Ge, Caixia; Zhang, Yinsheng; Huang, Zhenzhen; Jia, Zheng; Ju, Meizhi; Duan, Huilong; Li, Haomin

    2015-01-01

    Natural language processing (NLP) has been designed to convert narrative text into structured data. Although some general NLP architectures have been developed, a task-specific NLP framework to facilitate the effective use of data is still a challenge in lexical resource limited regions, such as China. The purpose of this study is to design and develop a task-specific NLP framework to extract targeted information from particular documents by adopting dedicated algorithms on current limited lexical resources. In this framework, a shared and evolving ontology mechanism was designed. The result has shown that such a free text driven platform will accelerate the NLP technology acceptance in China.

  2. Modelling Mathematics Problem Solving Item Responses Using a Multidimensional IRT Model

    ERIC Educational Resources Information Center

    Wu, Margaret; Adams, Raymond

    2006-01-01

    This research examined students' responses to mathematics problem-solving tasks and applied a general multidimensional IRT model at the response category level. In doing so, cognitive processes were identified and modelled through item response modelling to extract more information than would be provided using conventional practices in scoring…

  3. EEG based topography analysis in string recognition task

    NASA Astrophysics Data System (ADS)

    Ma, Xiaofei; Huang, Xiaolin; Shen, Yuxiaotong; Qin, Zike; Ge, Yun; Chen, Ying; Ning, Xinbao

    2017-03-01

    Vision perception and recognition is a complex process, during which different parts of the brain are involved depending on the specific modality of the vision target, e.g. face, character, or word. In this study, brain activities in a string recognition task compared with an idle control state are analyzed through topographies based on multiple measurements, i.e. sample entropy, symbolic sample entropy and normalized rhythm power, extracted from simultaneously collected scalp EEG. Our analyses show that, for most subjects, both symbolic sample entropy and normalized gamma power in the string recognition task are significantly higher than those in the idle state, especially at locations P4, O2, T6 and C4. This implies that these regions are highly involved in the string recognition task. Since symbolic sample entropy measures complexity, from the perspective of new information generation, and normalized rhythm power reveals the power distributions in the frequency domain, complementary information about the underlying dynamics can be provided through the two types of indices.
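
    Sample entropy, one of the measurements named in this abstract, can be sketched in a few lines. This is the standard (non-symbolic) definition, SampEn = -ln(A / B), where B counts pairs of length-m templates within tolerance r (Chebyshev distance) and A counts the same for length m + 1. The parameter values, the absolute tolerance, and the test series are assumptions for illustration; the study's own settings are not given in the abstract.

```python
import math


def sample_entropy(series, m=2, r=0.2):
    """Standard sample entropy with absolute tolerance r, self-matches excluded."""
    n = len(series)

    def count_matches(length):
        # Use the same n - m starting points for both template lengths.
        templates = [series[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                dist = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if dist <= r:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")  # undefined in the strict sense; flagged as infinite
    return -math.log(a / b)


# A perfectly regular series versus a chaotic one (logistic map, x -> 4x(1-x)).
regular = [0.0, 1.0] * 20
chaotic = [0.3]
for _ in range(99):
    chaotic.append(4.0 * chaotic[-1] * (1.0 - chaotic[-1]))

se_regular = sample_entropy(regular)
se_chaotic = sample_entropy(chaotic)
```

    The regular series yields zero because every match of length m extends to length m + 1, while the chaotic series produces a larger value, which is the sense in which the measure reflects complexity.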

  4. Information extraction from Italian medical reports: An ontology-driven approach.

    PubMed

    Viani, Natalia; Larizza, Cristiana; Tibollo, Valentina; Napolitano, Carlo; Priori, Silvia G; Bellazzi, Riccardo; Sacchi, Lucia

    2018-03-01

    In this work, we propose an ontology-driven approach to identify events and their attributes from episodes of care included in medical reports written in Italian. For this language, shared resources for clinical information extraction are not easily accessible. The corpus considered in this work includes 5432 non-annotated medical reports belonging to patients with rare arrhythmias. To guide the information extraction process, we built a domain-specific ontology that includes the events and the attributes to be extracted, with related regular expressions. The ontology and the annotation system were constructed on a development set, while the performance was evaluated on an independent test set. As a gold standard, we considered a manually curated hospital database named TRIAD, which stores most of the information written in reports. The proposed approach performs well on the considered Italian medical corpus, with a percentage of correct annotations above 90% for most considered clinical events. We also assessed the possibility to adapt the system to the analysis of another language (i.e., English), with promising results. Our annotation system relies on a domain ontology to extract and link information in clinical text. We developed an ontology that can be easily enriched and translated, and the system performs well on the considered task. In the future, it could be successfully used to automatically populate the TRIAD database. Copyright © 2017 Elsevier B.V. All rights reserved.
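
    The idea of an ontology whose events and attributes carry regular expressions can be sketched with a plain dictionary. Everything below is hypothetical: the event names, the attribute `qtc_ms`, the patterns, and the English sample sentence are invented for illustration and are not taken from the authors' ontology or from the TRIAD database.

```python
import re

# Hypothetical miniature "ontology": each event has a trigger pattern and
# attribute patterns whose first capture group holds the attribute value.
ONTOLOGY = {
    "ECG": {
        "trigger": re.compile(r"\bECG\b", re.IGNORECASE),
        "attributes": {
            "qtc_ms": re.compile(r"\bQTc\s*(?:of\s*)?(\d{3})\s*ms\b", re.IGNORECASE),
        },
    },
    "Holter": {
        "trigger": re.compile(r"\bHolter\b", re.IGNORECASE),
        "attributes": {
            "duration_h": re.compile(r"\b(\d{1,2})\s*-?\s*hours?\b", re.IGNORECASE),
        },
    },
}


def extract_events(sentence):
    """Return one record per ontology event whose trigger occurs in the sentence."""
    events = []
    for name, entry in ONTOLOGY.items():
        if entry["trigger"].search(sentence):
            attributes = {}
            for attr, pattern in entry["attributes"].items():
                match = pattern.search(sentence)
                if match:
                    attributes[attr] = match.group(1)
            events.append({"event": name, "attributes": attributes})
    return events


found = extract_events("ECG showed QTc 480 ms and sinus rhythm.")
```

    Keeping the patterns inside the ontology entries means that adapting the sketch to another language would only require swapping the regular expressions, which mirrors the translation scenario the abstract mentions.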

  5. Information extraction for enhanced access to disease outbreak reports.

    PubMed

    Grishman, Ralph; Huttunen, Silja; Yangarber, Roman

    2002-08-01

    Document search is generally based on individual terms in the document. However, for collections within limited domains it is possible to provide more powerful access tools. This paper describes a system designed for collections of reports of infectious disease outbreaks. The system, Proteus-BIO, automatically creates a table of outbreaks, with each table entry linked to the document describing that outbreak; this makes it possible to use database operations such as selection and sorting to find relevant documents. Proteus-BIO consists of a Web crawler which gathers relevant documents; an information extraction engine which converts the individual outbreak events to a tabular database; and a database browser which provides access to the events and, through them, to the documents. The information extraction engine uses sets of patterns and word classes to extract the information about each event. Preparing these patterns and word classes has been a time-consuming manual operation in the past, but automated discovery tools now make this task significantly easier. A small study comparing the effectiveness of the tabular index with conventional Web search tools demonstrated that users can find substantially more documents in a given time period with Proteus-BIO.

  6. Individual differences in the components of children's and adults' information processing for simple symbolic and non-symbolic numeric decisions.

    PubMed

    Thompson, Clarissa A; Ratcliff, Roger; McKoon, Gail

    2016-10-01

    How do speed and accuracy trade off, and what components of information processing develop as children and adults make simple numeric comparisons? Data from symbolic and non-symbolic number tasks were collected from 19 first graders (Mage=7.12 years), 26 second/third graders (Mage=8.20 years), 27 fourth/fifth graders (Mage=10.46 years), and 19 seventh/eighth graders (Mage=13.22 years). The non-symbolic task asked children to decide whether an array of asterisks had a larger or smaller number than 50, and the symbolic task asked whether a two-digit number was greater than or less than 50. We used a diffusion model analysis to estimate components of processing in tasks from accuracy, correct and error response times, and response time (RT) distributions. Participants who were accurate on one task were accurate on the other task, and participants who made fast decisions on one task made fast decisions on the other task. Older participants extracted a higher quality of information from the stimulus arrays, were more willing to make a decision, and were faster at encoding, transforming the stimulus representation, and executing their responses. Individual participants' accuracy and RTs were uncorrelated. Drift rate and boundary settings were significantly related across tasks, but they were unrelated to each other. Accuracy was mainly determined by drift rate, and RT was mainly determined by boundary separation. We concluded that RT and accuracy operate largely independently. Copyright © 2016 Elsevier Inc. All rights reserved.

  7. Can multilinguality improve Biomedical Word Sense Disambiguation?

    PubMed

    Duque, Andres; Martinez-Romo, Juan; Araujo, Lourdes

    2016-12-01

    Ambiguity in the biomedical domain represents a major issue when performing Natural Language Processing tasks over the huge amount of available information in the field. For this reason, Word Sense Disambiguation is critical for achieving accurate systems able to tackle complex tasks such as information extraction, summarization or document classification. In this work we explore whether multilinguality can help to solve the problem of ambiguity, and the conditions required for a system to improve the results obtained by monolingual approaches. Also, we analyze the best ways to generate those useful multilingual resources, and study different languages and sources of knowledge. The proposed system, based on co-occurrence graphs containing biomedical concepts and textual information, is evaluated on a test dataset frequently used in biomedicine. We can conclude that multilingual resources are able to provide a clear improvement of more than 7% compared to monolingual approaches, for graphs built from a small number of documents. Also, empirical results show that automatically translated resources are a useful source of information for this particular task. Copyright © 2016 Elsevier Inc. All rights reserved.

  8. HOTS: A Hierarchy of Event-Based Time-Surfaces for Pattern Recognition.

    PubMed

    Lagorce, Xavier; Orchard, Garrick; Galluppi, Francesco; Shi, Bertram E; Benosman, Ryad B

    2017-07-01

    This paper describes novel event-based spatio-temporal features called time-surfaces and how they can be used to create a hierarchical event-based pattern recognition architecture. Unlike existing hierarchical architectures for pattern recognition, the presented model relies on a time oriented approach to extract spatio-temporal features from the asynchronously acquired dynamics of a visual scene. These dynamics are acquired using biologically inspired frameless asynchronous event-driven vision sensors. Similarly to cortical structures, subsequent layers in our hierarchy extract increasingly abstract features using increasingly large spatio-temporal windows. The central concept is to use the rich temporal information provided by events to create contexts in the form of time-surfaces which represent the recent temporal activity within a local spatial neighborhood. We demonstrate that this concept can robustly be used at all stages of an event-based hierarchical model. First layer feature units operate on groups of pixels, while subsequent layer feature units operate on the output of lower level feature units. We report results on a previously published 36 class character recognition task and a four class canonical dynamic card pip task, achieving near 100 percent accuracy on each. We introduce a new seven class moving face recognition task, achieving 79 percent accuracy.
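
    As an illustration of the time-surface idea described above, the following is a minimal sketch rather than the authors' implementation. It assumes events arrive as (x, y, t) tuples, ignores event polarity, and uses an exponential decay with a hypothetical time constant `tau`; the function names and the decay kernel are assumptions made for this example.

```python
import math

def time_surface(last_times, x, y, t_now, radius=1, tau=50.0):
    """Return a (2*radius+1) x (2*radius+1) grid of decayed activity around (x, y).

    last_times[row][col] holds the timestamp of the most recent event at that
    pixel, or None if the pixel has not produced an event yet.
    """
    height, width = len(last_times), len(last_times[0])
    surface = []
    for dy in range(-radius, radius + 1):
        row = []
        for dx in range(-radius, radius + 1):
            yy, xx = y + dy, x + dx
            inside = 0 <= yy < height and 0 <= xx < width
            t_last = last_times[yy][xx] if inside else None
            # Assumed kernel: exponential decay, so recent events map to values near 1.
            row.append(0.0 if t_last is None else math.exp(-(t_now - t_last) / tau))
        surface.append(row)
    return surface

def process_events(events, width, height, radius=1, tau=50.0):
    """Yield one time-surface per event; events are (x, y, t) tuples sorted by t."""
    last_times = [[None] * width for _ in range(height)]
    for x, y, t in events:
        last_times[y][x] = t  # remember the latest activity at this pixel
        yield time_surface(last_times, x, y, t, radius, tau)
```

    Each yielded grid summarizes how recently the neighbors of the current event were active; a hierarchical model in the spirit of the abstract would feed such grids to first-layer feature units.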

  9. Sieve-based relation extraction of gene regulatory networks from biological literature

    PubMed Central

    2015-01-01

    Background Relation extraction is an essential procedure in literature mining. It focuses on extracting semantic relations between parts of text, called mentions. Biomedical literature includes an enormous amount of textual descriptions of biological entities, their interactions and results of related experiments. To extract them in an explicit, computer readable format, these relations were at first extracted manually from databases. Manual curation was later replaced with automatic or semi-automatic tools with natural language processing capabilities. The current challenge is the development of information extraction procedures that can directly infer more complex relational structures, such as gene regulatory networks. Results We develop a computational approach for extraction of gene regulatory networks from textual data. Our method is designed as a sieve-based system and uses linear-chain conditional random fields and rules for relation extraction. With this method we successfully extracted the sporulation gene regulation network in the bacterium Bacillus subtilis for the information extraction challenge at the BioNLP 2013 conference. To enable extraction of distant relations using first-order models, we transform the data into skip-mention sequences. We infer multiple models, each of which is able to extract different relationship types. Following the shared task, we conducted additional analysis using different system settings that resulted in reducing the reconstruction error of bacterial sporulation network from 0.73 to 0.68, measured as the slot error rate between the predicted and the reference network. We observe that all relation extraction sieves contribute to the predictive performance of the proposed approach. Also, features constructed by considering mention words and their prefixes and suffixes are the most important features for higher accuracy of extraction. 
Analysis of distances between different mention types in the text shows that our choice of transforming data into skip-mention sequences is appropriate for detecting relations between distant mentions. Conclusions Linear-chain conditional random fields, along with appropriate data transformations, can be efficiently used to extract relations. The sieve-based architecture simplifies the system as new sieves can be easily added or removed and each sieve can utilize the results of previous ones. Furthermore, sieves with conditional random fields can be trained on arbitrary text data and hence are applicable to a broad range of relation extraction tasks and data domains. PMID:26551454
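
    The abstract does not spell out how skip-mention sequences are built. One plausible reading, sketched below as an assumption, is that a sequence with skip k keeps every (k+1)-th mention, so that mentions k+1 positions apart become neighbors for a first-order model. The function name and the placeholder mention labels are hypothetical.

```python
def skip_mention_sequences(mentions, skip):
    """Split an ordered list of mentions into skip+1 interleaved subsequences.

    With skip=0 the original sequence is returned unchanged; with skip=1 every
    second mention is kept, so mentions two positions apart become adjacent.
    """
    if skip < 0:
        raise ValueError("skip must be non-negative")
    step = skip + 1
    # One subsequence per possible starting offset (assumed construction).
    return [mentions[start::step] for start in range(min(step, len(mentions)))]
```

    For example, `skip_mention_sequences(["m1", "m2", "m3", "m4", "m5"], 1)` gives two sequences, one over m1, m3, m5 and one over m2, m4, each of which could then be labeled by a linear-chain model.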

  10. Sieve-based relation extraction of gene regulatory networks from biological literature.

    PubMed

    Žitnik, Slavko; Žitnik, Marinka; Zupan, Blaž; Bajec, Marko

    2015-01-01

    Relation extraction is an essential procedure in literature mining. It focuses on extracting semantic relations between parts of text, called mentions. Biomedical literature includes an enormous amount of textual descriptions of biological entities, their interactions and results of related experiments. To extract them in an explicit, computer readable format, these relations were at first extracted manually from databases. Manual curation was later replaced with automatic or semi-automatic tools with natural language processing capabilities. The current challenge is the development of information extraction procedures that can directly infer more complex relational structures, such as gene regulatory networks. We develop a computational approach for extraction of gene regulatory networks from textual data. Our method is designed as a sieve-based system and uses linear-chain conditional random fields and rules for relation extraction. With this method we successfully extracted the sporulation gene regulation network in the bacterium Bacillus subtilis for the information extraction challenge at the BioNLP 2013 conference. To enable extraction of distant relations using first-order models, we transform the data into skip-mention sequences. We infer multiple models, each of which is able to extract different relationship types. Following the shared task, we conducted additional analysis using different system settings that resulted in reducing the reconstruction error of bacterial sporulation network from 0.73 to 0.68, measured as the slot error rate between the predicted and the reference network. We observe that all relation extraction sieves contribute to the predictive performance of the proposed approach. Also, features constructed by considering mention words and their prefixes and suffixes are the most important features for higher accuracy of extraction. 
Analysis of distances between different mention types in the text shows that our choice of transforming data into skip-mention sequences is appropriate for detecting relations between distant mentions. Linear-chain conditional random fields, along with appropriate data transformations, can be efficiently used to extract relations. The sieve-based architecture simplifies the system as new sieves can be easily added or removed and each sieve can utilize the results of previous ones. Furthermore, sieves with conditional random fields can be trained on arbitrary text data and hence are applicable to a broad range of relation extraction tasks and data domains.

  11. BioSimplify: an open source sentence simplification engine to improve recall in automatic biomedical information extraction.

    PubMed

    Jonnalagadda, Siddhartha; Gonzalez, Graciela

    2010-11-13

    BioSimplify is an open source tool written in Java that introduces and facilitates the use of a novel model for sentence simplification tuned for automatic discourse analysis and information extraction (as opposed to sentence simplification for improving human readability). The model is based on a "shot-gun" approach that produces many different (simpler) versions of the original sentence by combining variants of its constituent elements. This tool is optimized for processing biomedical scientific literature such as the abstracts indexed in PubMed. We tested the tool's impact on the task of protein-protein interaction (PPI) extraction: it improved the F-score of the PPI tool by around 7%, with an improvement in recall of around 20%. The BioSimplify tool and test corpus can be downloaded from https://biosimplify.sourceforge.net.

  12. Automatic classification of animal vocalizations

    NASA Astrophysics Data System (ADS)

    Clemins, Patrick J.

    2005-11-01

    Bioacoustics, the study of animal vocalizations, has begun to use increasingly sophisticated analysis techniques in recent years. Some common tasks in bioacoustics are repertoire determination, call detection, individual identification, stress detection, and behavior correlation. Each research study, however, uses a wide variety of different measured variables, called features, and classification systems to accomplish these tasks. The well-established field of human speech processing has developed a number of different techniques to perform many of the aforementioned bioacoustics tasks. Mel-frequency cepstral coefficients (MFCCs) and perceptual linear prediction (PLP) coefficients are two popular feature sets. The hidden Markov model (HMM), a statistical model similar to a finite automaton, is the most commonly used supervised classification model and is capable of modeling both temporal and spectral variations. This research designs a framework that applies models from human speech processing for bioacoustic analysis tasks. The development of the generalized perceptual linear prediction (gPLP) feature extraction model is one of the more important novel contributions of the framework. Perceptual information from the species under study can be incorporated into the gPLP feature extraction model to represent the vocalizations as the animals might perceive them. By including this perceptual information and modifying parameters of the HMM classification system, this framework can be applied to a wide range of species. The effectiveness of the framework is shown by analyzing African elephant and beluga whale vocalizations. The features extracted from the African elephant data are used as input to a supervised classification system and compared to results from traditional statistical tests. The gPLP features extracted from the beluga whale data are used in an unsupervised classification system and the results are compared to labels assigned by experts. 
The development of a framework from which to build animal vocalization classifiers will provide bioacoustics researchers with a consistent platform to analyze and classify vocalizations. A common framework will also allow studies to compare results across species and institutions. In addition, the use of automated classification techniques can speed analysis and uncover behavioral correlations not readily apparent using traditional techniques.

  13. Machinery running state identification based on discriminant semi-supervised local tangent space alignment for feature fusion and extraction

    NASA Astrophysics Data System (ADS)

    Su, Zuqiang; Xiao, Hong; Zhang, Yi; Tang, Baoping; Jiang, Yonghua

    2017-04-01

    Extraction of sensitive features is a challenging but key task in data-driven machinery running state identification. Aimed at solving this problem, a method for machinery running state identification that applies discriminant semi-supervised local tangent space alignment (DSS-LTSA) for feature fusion and extraction is proposed. Firstly, in order to extract more distinct features, the vibration signals are decomposed by wavelet packet decomposition (WPD), and a mixed-domain feature set consisting of statistical features, autoregressive (AR) model coefficients, instantaneous amplitude Shannon entropy and WPD energy spectrum is extracted to comprehensively characterize the properties of machinery running state(s). Then, the mixed-domain feature set is input into DSS-LTSA for feature fusion and extraction to eliminate redundant information and interference noise. The proposed DSS-LTSA can extract intrinsic structure information of both labeled and unlabeled state samples, and as a result the over-fitting problem of supervised manifold learning and blindness problem of unsupervised manifold learning are overcome. Simultaneously, class discrimination information is integrated within the dimension reduction process in a semi-supervised manner to improve sensitivity of the extracted fusion features. Lastly, the extracted fusion features are input into a pattern recognition algorithm to achieve the running state identification. The effectiveness of the proposed method is verified by a running state identification case in a gearbox, and the results confirm the improved accuracy of the running state identification.

  14. Capturing patient information at nursing shift changes: methodological evaluation of speech recognition and information extraction

    PubMed Central

    Suominen, Hanna; Johnson, Maree; Zhou, Liyuan; Sanchez, Paula; Sirel, Raul; Basilakis, Jim; Hanlen, Leif; Estival, Dominique; Dawson, Linda; Kelly, Barbara

    2015-01-01

    Objective We study the use of speech recognition and information extraction to generate drafts of Australian nursing-handover documents. Methods Speech recognition correctness and clinicians’ preferences were evaluated using 15 recorder–microphone combinations, six documents, three speakers, Dragon Medical 11, and five survey/interview participants. Information extraction correctness evaluation used 260 documents, six-class classification for each word, two annotators, and the CRF++ conditional random field toolkit. Results A noise-cancelling lapel-microphone with a digital voice recorder gave the best correctness (79%). This microphone was also the most preferred option by all but one participant. Although the participants liked the small size of this recorder, their preference was for tablets that can also be used for document proofing and sign-off, among other tasks. Accented speech was harder to recognize than native language and a male speaker was detected better than a female speaker. Information extraction was excellent in filtering out irrelevant text (85% F1) and identifying text relevant to two classes (87% and 70% F1). Similarly to the annotators’ disagreements, there was confusion between the remaining three classes, which explains the modest 62% macro-averaged F1. Discussion We present evidence for the feasibility of speech recognition and information extraction to support clinicians in entering text and to unlock its content for computerized decision-making and surveillance in healthcare. Conclusions The benefits of this automation include storing all information; making the drafts available and accessible almost instantly to everyone with authorized access; and avoiding information loss, delays, and misinterpretations inherent to using a ward clerk or transcription services. PMID:25336589
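
    The macro-averaged F1 reported above is the unweighted mean of the per-class F1 scores, so weak classes pull the average down regardless of how many words they cover. A minimal sketch of that computation follows; the function names and the toy labels in the usage note are hypothetical and unrelated to the study's data.

```python
def f1_score(tp, fp, fn):
    """F1 from true-positive, false-positive and false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def macro_f1(gold, predicted):
    """Unweighted mean of per-class F1 over all labels seen in gold or predicted."""
    labels = sorted(set(gold) | set(predicted))
    if not labels:
        return 0.0
    scores = []
    for label in labels:
        pairs = list(zip(gold, predicted))
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        scores.append(f1_score(tp, fp, fn))
    return sum(scores) / len(scores)
```

    With gold labels `["A", "A", "B", "B"]` and predictions `["A", "B", "B", "B"]`, class A scores 2/3 and class B scores 0.8, so the macro average is 11/15.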

  15. Application of a fast skyline computation algorithm for serendipitous searching problems

    NASA Astrophysics Data System (ADS)

    Koizumi, Kenichi; Hiraki, Kei; Inaba, Mary

    2018-02-01

    Skyline computation is a method of extracting interesting entries from a large population with multiple attributes. These entries, called skyline or Pareto optimal entries, are known to have extreme characteristics that cannot be found by outlier detection methods. Skyline computation is an important task for characterizing large amounts of data and selecting interesting entries with extreme features. When the population changes dynamically, the task of calculating a sequence of skyline sets is called continuous skyline computation. This task is known to be difficult to perform for the following reasons: (1) information of non-skyline entries must be stored since they may join the skyline in the future; (2) the appearance or disappearance of even a single entry can change the skyline drastically; (3) it is difficult to adopt a geometric acceleration algorithm for skyline computation tasks with high-dimensional datasets. Our new algorithm, called jointed rooted-tree (JR-tree), manages entries using a rooted tree structure. JR-tree delays extending the tree to deep levels to accelerate tree construction and traversal. In this study, we presented the difficulties in extracting entries tagged with a rare label in high-dimensional space and the potential of fast skyline computation in low-latency cell identification technology.

  16. Adaptable, high recall, event extraction system with minimal configuration

    PubMed Central

    2015-01-01

    Background Biomedical event extraction has been a major focus of biomedical natural language processing (BioNLP) research since the first BioNLP shared task was held in 2009. Accordingly, a large number of event extraction systems have been developed. Most such systems, however, have been developed for specific tasks and/or incorporated task specific settings, making their application to new corpora and tasks problematic without modification of the systems themselves. There is thus a need for event extraction systems that can achieve high levels of accuracy when applied to corpora in new domains, without the need for exhaustive tuning or modification, whilst retaining competitive levels of performance. Results We have enhanced our state-of-the-art event extraction system, EventMine, to alleviate the need for task-specific tuning. Task-specific details are specified in a configuration file, while extensive task-specific parameter tuning is avoided through the integration of a weighting method, a covariate shift method, and their combination. The task-specific configuration and weighting method have been employed within the context of two different sub-tasks of BioNLP shared task 2013, i.e. Cancer Genetics (CG) and Pathway Curation (PC), removing the need to modify the system specifically for each task. With minimal task specific configuration and tuning, EventMine achieved the 1st place in the PC task, and 2nd in the CG, achieving the highest recall for both tasks. The system has been further enhanced following the shared task by incorporating the covariate shift method and entity generalisations based on the task definitions, leading to further performance improvements. Conclusions We have shown that it is possible to apply a state-of-the-art event extraction system to new tasks with high levels of performance, without having to modify the system internally. 
Both covariate shift and weighting methods are useful in facilitating the production of high recall systems. These methods and their combination can adapt a model to the target data with no deep tuning and little manual configuration. PMID:26201408

  17. Robust multitask learning with three-dimensional empirical mode decomposition-based features for hyperspectral classification

    NASA Astrophysics Data System (ADS)

    He, Zhi; Liu, Lin

    2016-11-01

    Empirical mode decomposition (EMD) and its variants have recently been applied for hyperspectral image (HSI) classification due to their ability to extract useful features from the original HSI. However, it remains a challenging task to effectively exploit the spectral-spatial information by the traditional vector or image-based methods. In this paper, a three-dimensional (3D) extension of EMD (3D-EMD) is proposed to naturally treat the HSI as a cube and decompose the HSI into varying oscillations (i.e. 3D intrinsic mode functions (3D-IMFs)). To achieve fast 3D-EMD implementation, 3D Delaunay triangulation (3D-DT) is utilized to determine the distances of extrema, while separable filters are adopted to generate the envelopes. Taking the extracted 3D-IMFs as features of different tasks, robust multitask learning (RMTL) is further proposed for HSI classification. In RMTL, pairs of low-rank and sparse structures are formulated by trace-norm and l1,2-norm to capture task relatedness and specificity, respectively. Moreover, the optimization problems of RMTL can be efficiently solved by the inexact augmented Lagrangian method (IALM). Compared with several state-of-the-art feature extraction and classification methods, the experimental results conducted on three benchmark data sets demonstrate the superiority of the proposed methods.

  18. The BioExtract Server: a web-based bioinformatic workflow platform

    PubMed Central

    Lushbough, Carol M.; Jennewein, Douglas M.; Brendel, Volker P.

    2011-01-01

    The BioExtract Server (bioextract.org) is an open, web-based system designed to aid researchers in the analysis of genomic data by providing a platform for the creation of bioinformatic workflows. Scientific workflows are created within the system by recording tasks performed by the user. These tasks may include querying multiple, distributed data sources, saving query results as searchable data extracts, and executing local and web-accessible analytic tools. The series of recorded tasks can then be saved as a reproducible, sharable workflow available for subsequent execution with the original or modified inputs and parameter settings. Integrated data resources include interfaces to the National Center for Biotechnology Information (NCBI) nucleotide and protein databases, the European Molecular Biology Laboratory (EMBL-Bank) non-redundant nucleotide database, the Universal Protein Resource (UniProt), and the UniProt Reference Clusters (UniRef) database. The system offers access to numerous preinstalled, curated analytic tools and also provides researchers with the option of selecting computational tools from a large list of web services including the European Molecular Biology Open Software Suite (EMBOSS), BioMoby, and the Kyoto Encyclopedia of Genes and Genomes (KEGG). The system further allows users to integrate local command line tools residing on their own computers through a client-side Java applet. PMID:21546552

  19. Synonym extraction and abbreviation expansion with ensembles of semantic spaces.

    PubMed

    Henriksson, Aron; Moen, Hans; Skeppstedt, Maria; Daudaravičius, Vidas; Duneld, Martin

    2014-02-05

    Terminologies that account for variation in language use by linking synonyms and abbreviations to their corresponding concept are important enablers of high-quality information extraction from medical texts. Due to the use of specialized sub-languages in the medical domain, manual construction of semantic resources that accurately reflect language use is both costly and challenging, often resulting in low coverage. Although models of distributional semantics applied to large corpora provide a potential means of supporting development of such resources, their ability to isolate synonymy from other semantic relations is limited. Their application in the clinical domain has also only recently begun to be explored. Combining distributional models and applying them to different types of corpora may lead to enhanced performance on the tasks of automatically extracting synonyms and abbreviation-expansion pairs. A combination of two distributional models - Random Indexing and Random Permutation - employed in conjunction with a single corpus outperforms using either of the models in isolation. Furthermore, combining semantic spaces induced from different types of corpora - a corpus of clinical text and a corpus of medical journal articles - further improves results, outperforming a combination of semantic spaces induced from a single source, as well as a single semantic space induced from the conjoint corpus. A combination strategy that simply sums the cosine similarity scores of candidate terms is generally the most profitable out of the ones explored. Finally, applying simple post-processing filtering rules yields substantial performance gains on the tasks of extracting abbreviation-expansion pairs, but not synonyms. The best results, measured as recall in a list of ten candidate terms, for the three tasks are: 0.39 for abbreviations to long forms, 0.33 for long forms to abbreviations, and 0.47 for synonyms. 
This study demonstrates that ensembles of semantic spaces can yield improved performance on the tasks of automatically extracting synonyms and abbreviation-expansion pairs. This notion, which merits further exploration, allows different distributional models - with different model parameters - and different types of corpora to be combined, potentially allowing enhanced performance to be obtained on a wide range of natural language processing tasks.
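
    The combination strategy mentioned above, summing cosine similarity scores across semantic spaces, and the recall-in-top-ten measure can be sketched as follows. This is an illustrative assumption about the setup: each semantic space is represented as a plain dictionary from term to vector, and the function names are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def rank_candidates(query, spaces, top_k=10):
    """Rank terms by the sum of their cosine similarity to the query over all spaces."""
    totals = {}
    for space in spaces:
        if query not in space:
            continue  # this space has no vector for the query term
        for term, vector in space.items():
            if term != query:
                totals[term] = totals.get(term, 0.0) + cosine(space[query], vector)
    ranked = sorted(totals, key=lambda term: (-totals[term], term))
    return ranked[:top_k]

def recall_at_k(ranked, gold, k=10):
    """Fraction of gold terms that appear among the first k ranked candidates."""
    if not gold:
        return 0.0
    return len(set(ranked[:k]) & set(gold)) / len(gold)
```

    Because similarities are computed inside each space before being summed, the spaces may have different dimensionalities, which is convenient when they are induced from different corpora.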

  20. Synonym extraction and abbreviation expansion with ensembles of semantic spaces

    PubMed Central

    2014-01-01

    Background Terminologies that account for variation in language use by linking synonyms and abbreviations to their corresponding concept are important enablers of high-quality information extraction from medical texts. Due to the use of specialized sub-languages in the medical domain, manual construction of semantic resources that accurately reflect language use is both costly and challenging, often resulting in low coverage. Although models of distributional semantics applied to large corpora provide a potential means of supporting development of such resources, their ability to isolate synonymy from other semantic relations is limited. Their application in the clinical domain has also only recently begun to be explored. Combining distributional models and applying them to different types of corpora may lead to enhanced performance on the tasks of automatically extracting synonyms and abbreviation-expansion pairs. Results A combination of two distributional models – Random Indexing and Random Permutation – employed in conjunction with a single corpus outperforms using either of the models in isolation. Furthermore, combining semantic spaces induced from different types of corpora – a corpus of clinical text and a corpus of medical journal articles – further improves results, outperforming a combination of semantic spaces induced from a single source, as well as a single semantic space induced from the conjoint corpus. A combination strategy that simply sums the cosine similarity scores of candidate terms is generally the most profitable out of the ones explored. Finally, applying simple post-processing filtering rules yields substantial performance gains on the tasks of extracting abbreviation-expansion pairs, but not synonyms. The best results, measured as recall in a list of ten candidate terms, for the three tasks are: 0.39 for abbreviations to long forms, 0.33 for long forms to abbreviations, and 0.47 for synonyms. 
Conclusions This study demonstrates that ensembles of semantic spaces can yield improved performance on the tasks of automatically extracting synonyms and abbreviation-expansion pairs. This notion, which merits further exploration, allows different distributional models – with different model parameters – and different types of corpora to be combined, potentially allowing enhanced performance to be obtained on a wide range of natural language processing tasks. PMID:24499679

  1. Road marking features extraction using the VIAPIX® system

    NASA Astrophysics Data System (ADS)

    Kaddah, W.; Ouerhani, Y.; Alfalou, A.; Desthieux, M.; Brosseau, C.; Gutierrez, C.

    2016-07-01

    Precise extraction of road marking features is a critical task for autonomous urban driving, augmented driver assistance, and robotics technologies. In this study, we consider an autonomous system that performs lane detection on marked urban roads and analyzes their features. The task is to relate the georeferencing of road markings from images obtained using the VIAPIX® system. Based on inverse perspective mapping and color segmentation to detect all white objects existing on the road, the present algorithm enables us to examine these images automatically and rapidly and also to get information on road marks, their surface conditions, and their georeferencing. The algorithm detects all road markings and identifies some of them using a phase-only correlation filter (POF). We illustrate this algorithm and its robustness by applying it to a variety of relevant scenarios.

  2. An Experimental Investigation of Complexity in Database Query Formulation Tasks

    ERIC Educational Resources Information Center

    Casterella, Gretchen Irwin; Vijayasarathy, Leo

    2013-01-01

    Information Technology professionals and other knowledge workers rely on their ability to extract data from organizational databases to respond to business questions and support decision making. Structured query language (SQL) is the standard programming language for querying data in relational databases, and SQL skills are in high demand and are…

  3. Including Both Time and Accuracy in Defining Text Search Efficiency.

    ERIC Educational Resources Information Center

    Symons, Sonya; Specht, Jacqueline A.

    1994-01-01

    Examines factors related to efficiency in a textbook search task. Finds that time and accuracy involved distinct processes and that accuracy was related to verbal competence. Finds further that measures of planning and extracting information accounted for 59% of the variance in search efficiency. Suggests that both accuracy and rate need to be…

  4. Strong converse theorems using Rényi entropies

    NASA Astrophysics Data System (ADS)

    Leditzky, Felix; Wilde, Mark M.; Datta, Nilanjana

    2016-08-01

    We use a Rényi entropy method to prove strong converse theorems for certain information-theoretic tasks which involve local operations and quantum (or classical) communication between two parties. These include state redistribution, coherent state merging, quantum state splitting, measurement compression with quantum side information, randomness extraction against quantum side information, and data compression with quantum side information. The method we employ in proving these results extends ideas developed by Sharma [preprint arXiv:1404.5940 [quant-ph] (2014)], which he used to give a new proof of the strong converse theorem for state merging. For state redistribution, we prove the strong converse property for the boundary of the entire achievable rate region in the (e, q)-plane, where e and q denote the entanglement cost and quantum communication cost, respectively. In the case of measurement compression with quantum side information, we prove a strong converse theorem for the classical communication cost, which is a new result extending the previously known weak converse. For the remaining tasks, we provide new proofs for strong converse theorems previously established using smooth entropies. For each task, we obtain the strong converse theorem from explicit bounds on the figure of merit of the task in terms of a Rényi generalization of the optimal rate. Hence, we identify candidates for the strong converse exponents for each task discussed in this paper. To prove our results, we establish various new entropic inequalities, which might be of independent interest. These involve conditional entropies and mutual information derived from the sandwiched Rényi divergence. In particular, we obtain novel bounds relating these quantities, as well as the Rényi conditional mutual information, to the fidelity of two quantum states.
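
    For orientation, the classical Rényi entropy of order alpha, which the quantum quantities in this abstract generalize, is H_alpha(p) = log2(sum_i p_i^alpha) / (1 - alpha), with the Shannon entropy recovered in the limit alpha -> 1. The sketch below covers only this classical case; it does not implement the sandwiched Rényi divergence or any quantity specific to the paper, and the function name is an assumption.

```python
import math

def renyi_entropy(probs, alpha):
    """Classical Rényi entropy (in bits) of a probability distribution."""
    if alpha < 0:
        raise ValueError("alpha must be non-negative")
    support = [p for p in probs if p > 0]
    if abs(alpha - 1.0) < 1e-12:
        # Limit alpha -> 1: Shannon entropy.
        return -sum(p * math.log2(p) for p in support)
    if math.isinf(alpha):
        # Limit alpha -> infinity: min-entropy.
        return -math.log2(max(support))
    return math.log2(sum(p ** alpha for p in support)) / (1.0 - alpha)
```

    The value is non-increasing in alpha: for the distribution (0.5, 0.25, 0.25) the order-1/2 entropy exceeds the Shannon entropy of 1.5 bits, which in turn exceeds the order-2 entropy.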

  5. Strong converse theorems using Rényi entropies

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Leditzky, Felix; Datta, Nilanjana; Wilde, Mark M.

    We use a Rényi entropy method to prove strong converse theorems for certain information-theoretic tasks which involve local operations and quantum (or classical) communication between two parties. These include state redistribution, coherent state merging, quantum state splitting, measurement compression with quantum side information, randomness extraction against quantum side information, and data compression with quantum side information. The method we employ in proving these results extends ideas developed by Sharma [preprint http://arxiv.org/abs/1404.5940 [quant-ph] (2014)], which he used to give a new proof of the strong converse theorem for state merging. For state redistribution, we prove the strong converse property for the boundary of the entire achievable rate region in the (e, q)-plane, where e and q denote the entanglement cost and quantum communication cost, respectively. In the case of measurement compression with quantum side information, we prove a strong converse theorem for the classical communication cost, which is a new result extending the previously known weak converse. For the remaining tasks, we provide new proofs for strong converse theorems previously established using smooth entropies. For each task, we obtain the strong converse theorem from explicit bounds on the figure of merit of the task in terms of a Rényi generalization of the optimal rate. Hence, we identify candidates for the strong converse exponents for each task discussed in this paper. To prove our results, we establish various new entropic inequalities, which might be of independent interest. These involve conditional entropies and mutual information derived from the sandwiched Rényi divergence. In particular, we obtain novel bounds relating these quantities, as well as the Rényi conditional mutual information, to the fidelity of two quantum states.

  6. Associations Between Driving Performance and Engaging in Secondary Tasks: A Systematic Review

    PubMed Central

    Ferdinand, Alva O.

    2014-01-01

    We conducted a systematic review and meta-analysis of the literature examining the relationship between driving performance and engaging in secondary tasks. We extracted data from abstracts of 206 empirical articles published between 1968 and 2012 and developed a logistic regression model to identify correlates of a detrimental relationship between secondary tasks and driving performance. Of 350 analyses, 80% reported finding a detrimental relationship. Studies using experimental designs were 37% less likely to report a detrimental relationship (P = .014). Studies examining mobile phone use while driving were 16% more likely to find such a relationship (P = .009). Quasi-experiments can better determine the effects of secondary tasks on driving performance and consequently serve to inform policymakers interested in reducing distracted driving and increasing roadway safety. PMID:24432925

  7. Enhancing biomedical text summarization using semantic relation extraction.

    PubMed

    Shang, Yue; Li, Yanpeng; Lin, Hongfei; Yang, Zhihao

    2011-01-01

    Automatic text summarization for a biomedical concept can help researchers get the key points of a topic efficiently from a large amount of biomedical literature. In this paper, we present a method for generating a text summary for a given biomedical concept, e.g., H1N1 disease, from multiple documents based on semantic relation extraction. Our approach includes three stages: 1) We extract semantic relations in each sentence using the semantic knowledge representation tool SemRep. 2) We develop a relation-level retrieval method to select the relations most relevant to each query concept and visualize them in a graphic representation. 3) For relations in the relevant set, we extract informative sentences that can interpret them from the document collection to generate a text summary using an information retrieval based method. Our major focus in this work is to investigate the contribution of semantic relation extraction to the task of biomedical text summarization. The experimental results on summarization for a set of diseases show that the introduction of semantic knowledge improves the performance and that our results are better than those of the MEAD system, a well-known tool for text summarization.

  8. Chemical name extraction based on automatic training data generation and rich feature set.

    PubMed

    Yan, Su; Spangler, W Scott; Chen, Ying

    2013-01-01

    Automating the extraction of chemical names from text has significant value to biomedical and life science research. A major barrier in this task is the difficulty of obtaining sizable, good-quality data to train a reliable entity extraction model. Another difficulty is the selection of informative features of chemical names, since comprehensive domain knowledge on chemistry nomenclature is required. Leveraging random text generation techniques, we explore the idea of automatically creating training sets for the task of chemical name extraction. Assuming the availability of an incomplete list of chemical names, called a dictionary, we are able to generate well-controlled, random, yet realistic chemical-like training documents. We statistically analyze the construction of chemical names based on the incomplete dictionary, and propose a series of new features, without relying on any domain knowledge. Compared to state-of-the-art models learned from manually labeled data and domain knowledge, our solution shows better or comparable results in annotating real-world data with less human effort. Moreover, we report an interesting observation about the language for chemical names. That is, both the structural and semantic components of chemical names follow a Zipfian distribution, which resembles many natural languages.

  9. Local binary pattern variants-based adaptive texture features analysis for posed and nonposed facial expression recognition

    NASA Astrophysics Data System (ADS)

    Sultana, Maryam; Bhatti, Naeem; Javed, Sajid; Jung, Soon Ki

    2017-09-01

    Facial expression recognition (FER) is an important task for various computer vision applications. The task becomes challenging when it requires the detection and encoding of macro- and micropatterns of facial expressions. We present a two-stage texture feature extraction framework based on the local binary pattern (LBP) variants and evaluate its significance in recognizing posed and nonposed facial expressions. We focus on the parametric limitations of the LBP variants and investigate their effects for optimal FER. The size of the local neighborhood is an important parameter of the LBP technique for its extraction in images. To make the LBP adaptive, we exploit the granulometric information of the facial images to find the local neighborhood size for the extraction of center-symmetric LBP (CS-LBP) features. Our two-stage texture representations consist of an LBP variant and the adaptive CS-LBP features. Among the presented two-stage texture feature extractions, the binarized statistical image features and adaptive CS-LBP features were found to show high FER rates. Evaluation of the adaptive texture features shows competitive and higher performance than the nonadaptive features and other state-of-the-art approaches, respectively.
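
    The center-symmetric LBP code mentioned above can be illustrated with a minimal sketch. It assumes the commonly used CS-LBP definition (compare the four pairs of opposite neighbours and set one bit per pair) with a fixed 3x3 neighbourhood; the adaptive neighbourhood size derived from granulometry in the abstract is not reproduced here. The names `NEIGHBOURS`, `cs_lbp_code`, `cs_lbp_histogram` and the `threshold` parameter are illustrative, not taken from the paper.

```python
# Offsets (row, col) of the 8 neighbours, ordered around the circle so that
# entry i and entry i + 4 lie opposite each other across the centre pixel.
NEIGHBOURS = [(0, 1), (-1, 1), (-1, 0), (-1, -1),
              (0, -1), (1, -1), (1, 0), (1, 1)]


def cs_lbp_code(image, row, col, threshold=0):
    """Return the 4-bit CS-LBP code (0..15) of the pixel at (row, col)."""
    code = 0
    for i in range(4):
        dr1, dc1 = NEIGHBOURS[i]
        dr2, dc2 = NEIGHBOURS[i + 4]
        diff = image[row + dr1][col + dc1] - image[row + dr2][col + dc2]
        if diff > threshold:  # bit i is set when the pair differs enough
            code |= 1 << i
    return code


def cs_lbp_histogram(image, threshold=0):
    """Histogram of CS-LBP codes over all interior pixels (16 bins)."""
    hist = [0] * 16
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[cs_lbp_code(image, r, c, threshold)] += 1
    return hist


# Tiny hypothetical grey-level patch with a single interior pixel.
image = [[10, 20, 30],
         [40, 50, 60],
         [70, 80, 90]]
code = cs_lbp_code(image, 1, 1)
hist = cs_lbp_histogram(image)
print(code, hist)
```

    Because only four comparisons are made per pixel, the histogram has 16 bins, which is much shorter than the 256-bin histogram of the basic 8-neighbour LBP.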

  10. Harnessing Biomedical Natural Language Processing Tools to Identify Medicinal Plant Knowledge from Historical Texts.

    PubMed

    Sharma, Vivekanand; Law, Wayne; Balick, Michael J; Sarkar, Indra Neil

    2017-01-01

    The growing amount of data describing historical medicinal uses of plants from digitization efforts provides the opportunity to develop systematic approaches for identifying potential plant-based therapies. However, cataloguing plant use information from natural language text is a challenging task for ethnobotanists. To date, there has been only limited adoption of informatics approaches for supporting the identification of ethnobotanical information associated with medicinal uses. This study explored the feasibility of using biomedical terminologies and natural language processing approaches for extracting relevant plant-associated therapeutic use information from the historical biodiversity literature collection available from the Biodiversity Heritage Library. The results from this preliminary study suggest that there is potential utility of informatics methods to identify medicinal plant knowledge from digitized resources as well as highlight opportunities for improvement.

  11. Harnessing Biomedical Natural Language Processing Tools to Identify Medicinal Plant Knowledge from Historical Texts

    PubMed Central

    Sharma, Vivekanand; Law, Wayne; Balick, Michael J.; Sarkar, Indra Neil

    2017-01-01

    The growing amount of data describing historical medicinal uses of plants from digitization efforts provides the opportunity to develop systematic approaches for identifying potential plant-based therapies. However, cataloguing plant use information from natural language text is a challenging task for ethnobotanists. To date, there has been only limited adoption of informatics approaches for supporting the identification of ethnobotanical information associated with medicinal uses. This study explored the feasibility of using biomedical terminologies and natural language processing approaches for extracting relevant plant-associated therapeutic use information from the historical biodiversity literature collection available from the Biodiversity Heritage Library. The results from this preliminary study suggest that there is potential utility of informatics methods to identify medicinal plant knowledge from digitized resources as well as highlight opportunities for improvement. PMID:29854223

  12. Recent progress of task-specific ionic liquids in chiral resolution and extraction of biological samples and metal ions.

    PubMed

    Wu, Datong; Cai, Pengfei; Zhao, Xiaoyong; Kong, Yong; Pan, Yuanjiang

    2018-01-01

    Ionic liquids have been functionalized for modern applications. The functional ionic liquids are also called task-specific ionic liquids. Various task-specific ionic liquids with certain groups have been constructed and exploited widely in the field of separation. To take advantage of their properties in separation science, task-specific ionic liquids are generally used in techniques such as liquid-liquid extraction, solid-phase extraction, gas chromatography, high-performance liquid chromatography, and capillary electrophoresis. This review mainly covers original research papers published in the last five years, and we will focus on task-specific ionic liquids as the chiral selectors in chiral resolution and as extractant or sensor for biological samples and metal ion purification. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  13. Does It Really Matter Where You Look When Walking on Stairs? Insights from a Dual-Task Study

    PubMed Central

    Miyasike-daSilva, Veronica; McIlroy, William E.

    2012-01-01

    Although the visual system is known to provide relevant information to guide stair locomotion, there is less understanding of the specific contributions of foveal and peripheral visual field information. The present study investigated the specific role of foveal vision during stair locomotion and ground-stairs transitions by using a dual-task paradigm to influence the ability to rely on foveal vision. Fifteen healthy adults (26.9±3.3 years; 8 females) ascended a 7-step staircase under four conditions: no secondary tasks (CONTROL); gaze fixation on a fixed target located at the end of the pathway (TARGET); visual reaction time task (VRT); and auditory reaction time task (ART). Gaze fixations towards stair features were significantly reduced in TARGET and VRT compared to CONTROL and ART. Despite the reduced fixations, participants were able to successfully ascend stairs and rarely used the handrail. Step time was increased during VRT compared to CONTROL in most stair steps. Navigating on the transition steps did not require more gaze fixations than the middle steps. However, reaction time tended to increase during locomotion on transitions suggesting additional executive demands during this phase. These findings suggest that foveal vision may not be an essential source of visual information regarding stair features to guide stair walking, despite the unique control challenges at transition phases as highlighted by phase-specific challenges in dual-tasking. Instead, the tendency to look at the steps in usual conditions likely provides a stable reference frame for extraction of visual information regarding step features from the entire visual field. PMID:22970297

  14. Systematic review automation technologies

    PubMed Central

    2014-01-01

    Systematic reviews, a cornerstone of evidence-based medicine, are not produced quickly enough to support clinical practice. The cost of production, availability of the requisite expertise and timeliness are often quoted as major contributors for the delay. This detailed survey of the state of the art of information systems designed to support or automate individual tasks in the systematic review, and in particular systematic reviews of randomized controlled clinical trials, reveals trends that see the convergence of several parallel research projects. We surveyed literature describing informatics systems that support or automate the processes of systematic review or each of the tasks of the systematic review. Several projects focus on automating, simplifying and/or streamlining specific tasks of the systematic review. Some tasks are already fully automated while others are still largely manual. In this review, we describe each task and the effect that its automation would have on the entire systematic review process, summarize the existing information system support for each task, and highlight where further research is needed for realizing automation for the task. Integration of the systems that automate systematic review tasks may lead to a revised systematic review workflow. We envisage the optimized workflow will lead to a system in which each systematic review is described as a computer program that automatically retrieves relevant trials, appraises them, extracts and synthesizes data, evaluates the risk of bias, performs meta-analysis calculations, and produces a report in real time. PMID:25005128

  15. A Hybrid Human-Computer Approach to the Extraction of Scientific Facts from the Literature.

    PubMed

    Tchoua, Roselyne B; Chard, Kyle; Audus, Debra; Qin, Jian; de Pablo, Juan; Foster, Ian

    2016-01-01

    A wealth of valuable data is locked within the millions of research articles published each year. Reading and extracting pertinent information from those articles has become an unmanageable task for scientists. This problem hinders scientific progress by making it hard to build on results buried in literature. Moreover, these data are loosely structured, encoded in manuscripts of various formats, embedded in different content types, and are, in general, not machine accessible. We present a hybrid human-computer solution for semi-automatically extracting scientific facts from literature. This solution combines an automated discovery, download, and extraction phase with a semi-expert crowd assembled from students to extract specific scientific facts. To evaluate our approach we apply it to a challenging molecular engineering scenario, extraction of a polymer property: the Flory-Huggins interaction parameter. We demonstrate useful contributions to a comprehensive database of polymer properties.

  16. Latent Dirichlet Allocation (LDA) Model and kNN Algorithm to Classify Research Project Selection

    NASA Astrophysics Data System (ADS)

    Safi’ie, M. A.; Utami, E.; Fatta, H. A.

    2018-03-01

    Universitas Sebelas Maret has a teaching staff of more than 1500 people, and one of their tasks is to carry out research. On the other hand, funding support for research and community service is limited, so submitted research and community service (P2M) proposals need to be evaluated. At the selection stage, research proposal documents are collected as unstructured data, and the volume of stored data is very large. Extracting the information contained in these documents requires text mining technology, which is applied to gain knowledge from the documents by automating information extraction. In this article, we apply Latent Dirichlet Allocation (LDA) to the documents as the model in the feature extraction process, to obtain terms that represent each document. We then use the k-Nearest Neighbour (kNN) algorithm to classify the documents based on these terms.
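
    The classification step described above can be sketched as follows. This is only an illustration of plain kNN with majority voting, assuming each document has already been reduced to a vector of topic proportions by some LDA implementation; the vectors, the category labels and the function name `knn_classify` are hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def knn_classify(query, training, k=3):
    """Label `query` by majority vote among its k nearest training vectors.

    `training` is a list of (vector, label) pairs; distance is Euclidean.
    """
    nearest = sorted(training, key=lambda item: math.dist(query, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical topic-proportion vectors (3 topics) with hypothetical labels.
training = [
    ([0.8, 0.1, 0.1], "health"),
    ([0.7, 0.2, 0.1], "health"),
    ([0.1, 0.8, 0.1], "engineering"),
    ([0.2, 0.7, 0.1], "engineering"),
    ([0.1, 0.1, 0.8], "education"),
]

predicted = knn_classify([0.75, 0.15, 0.10], training, k=3)
print(predicted)
```

    With k=3 the two closest neighbours are the "health" proposals, so the vote resolves to that label even though the third neighbour belongs to another category.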

  17. Primary task event-related potentials related to different aspects of information processing

    NASA Technical Reports Server (NTRS)

    Munson, Robert C.; Horst, Richard L.; Mahaffey, David L.

    1988-01-01

    The results of two studies which investigated the relationships between cognitive processing and components of transient event-related potentials (ERPs) are presented in a task in which mental workload was manipulated. The task involved the monitoring of an array of discrete readouts for values that went out of bounds, and was somewhat analogous to tasks performed in cockpits. The ERPs elicited by the changing readouts varied with the number of readouts being monitored, the number of monitored readouts that were close to going out of bounds, and whether or not the change took a monitored readout out of bounds. Moreover, different regions of the waveform differentially reflected these effects. The results confirm the sensitivity of scalp-recorded ERPs to the cognitive processes affected by mental workload and suggest the possibility of extracting useful ERP indices of primary task performance in a wide range of man-machine settings.

  18. Usability Evaluation of an Unstructured Clinical Document Query Tool for Researchers.

    PubMed

    Hultman, Gretchen; McEwan, Reed; Pakhomov, Serguei; Lindemann, Elizabeth; Skube, Steven; Melton, Genevieve B

    2018-01-01

    Natural Language Processing - Patient Information Extraction for Researchers (NLP-PIER) was developed for clinical researchers for self-service Natural Language Processing (NLP) queries with clinical notes. This study conducted a user-centered analysis with clinical researchers to gain insight into NLP-PIER's usability and to gain an understanding of the needs of clinical researchers when using an application for searching clinical notes. Clinical researcher participants (n=11) completed tasks using the system's two existing search interfaces and completed a set of surveys and an exit interview. Quantitative data including time on task, task completion rate, and survey responses were collected. Interviews were analyzed qualitatively. Survey scores, time on task and task completion proportions varied widely. Qualitative analysis indicated that participants found the system to be useful and usable in specific projects. This study identified several usability challenges and our findings will guide the improvement of NLP-PIER's interfaces.

  19. Kernel-Based Relevance Analysis with Enhanced Interpretability for Detection of Brain Activity Patterns

    PubMed Central

    Alvarez-Meza, Andres M.; Orozco-Gutierrez, Alvaro; Castellanos-Dominguez, German

    2017-01-01

    We introduce Enhanced Kernel-based Relevance Analysis (EKRA) that aims to support the automatic identification of brain activity patterns using electroencephalographic recordings. EKRA is a data-driven strategy that incorporates two kernel functions to take advantage of the available joint information, associating neural responses to a given stimulus condition. Regarding this, a Centered Kernel Alignment functional is adjusted to learn the linear projection that best discriminates the input feature set, optimizing the required free parameters automatically. Our approach is carried out in two scenarios: (i) feature selection by computing a relevance vector from extracted neural features to facilitate the physiological interpretation of a given brain activity task, and (ii) enhanced feature selection to perform an additional transformation of relevant features aiming to improve the overall identification accuracy. Accordingly, we provide an alternative feature relevance analysis strategy that allows improving the system performance while favoring the data interpretability. For the validation purpose, EKRA is tested in two well-known tasks of brain activity: motor imagery discrimination and epileptic seizure detection. The obtained results show that the EKRA approach estimates a relevant representation space extracted from the provided supervised information, emphasizing the salient input features. As a result, our proposal outperforms the state-of-the-art methods regarding brain activity discrimination accuracy with the benefit of enhanced physiological interpretation about the task at hand. PMID:29056897

  20. The cerebellum predicts the temporal consequences of observed motor acts.

    PubMed

    Avanzino, Laura; Bove, Marco; Pelosin, Elisa; Ogliastro, Carla; Lagravinese, Giovanna; Martino, Davide

    2015-01-01

    It is increasingly clear that we extract patterns of temporal regularity between events to optimize information processing. The ability to extract temporal patterns and regularity of events is referred to as temporal expectation. Temporal expectation activates the same cerebral network usually engaged in action selection, including the cerebellum. However, it is unclear whether the cerebellum is directly involved in temporal expectation, when timing information is processed to make predictions on the outcome of a motor act. Healthy volunteers received one session of either active (inhibitory, 1 Hz) or sham repetitive transcranial magnetic stimulation covering the right lateral cerebellum prior to the execution of a temporal expectation task. Subjects were asked to predict the end of a visually perceived human body motion (right hand handwriting) and of an inanimate object motion (a moving circle reaching a target). Videos representing movements were shown in full; the actual tasks consisted of watching the same videos, but interrupted after a variable interval from their onset by a dark interval of variable duration. During the 'dark' interval, subjects were asked to indicate when the movement represented in the video reached its end by clicking on the spacebar of the keyboard. Performance on the timing task was analyzed measuring the absolute value of timing error, the coefficient of variability and the percentage of anticipation responses. The active group exhibited greater absolute timing error compared with the sham group only in the human body motion task. Our findings suggest that the cerebellum is engaged in cognitive and perceptual domains that are strictly connected to motor control.

  1. 3D micro-mapping: Towards assessing the quality of crowdsourcing to support 3D point cloud analysis

    NASA Astrophysics Data System (ADS)

    Herfort, Benjamin; Höfle, Bernhard; Klonner, Carolin

    2018-03-01

    In this paper, we propose a method to crowdsource the task of complex three-dimensional information extraction from 3D point clouds. We design web-based 3D micro tasks tailored to assess segmented LiDAR point clouds of urban trees and investigate the quality of the approach in an empirical user study. Our results for three different experiments with increasing complexity indicate that a single crowdsourcing task can be solved in a very short time of less than five seconds on average. Furthermore, the results of our empirical case study reveal that the accuracy, sensitivity and precision of 3D crowdsourcing are high for most information extraction problems. For our first experiment (binary classification with single answer) we obtain an accuracy of 91%, a sensitivity of 95% and a precision of 92%. For the more complex tasks of the second experiment (multiple answer classification) the accuracy ranges from 65% to 99% depending on the label class. Regarding the third experiment - the determination of the crown base height of individual trees - our study highlights that crowdsourcing can be a tool to obtain values with even higher accuracy in comparison to an automated computer-based approach. Finally, we found that the accuracy of the crowdsourced results for all experiments is hardly influenced by characteristics of the input point cloud data and of the users. Importantly, the results' accuracy can be estimated using agreement among volunteers as an intrinsic indicator, which makes a broad application of 3D micro-mapping very promising.
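
    The quality measures named above (accuracy, sensitivity, precision) and the idea of using agreement among volunteers as an indicator can be sketched as follows. This is a generic illustration, not the study's pipeline: answers per task are aggregated by simple majority vote, agreement is taken as the share of volunteers backing the winning answer, and all votes and reference labels are made up.

```python
from collections import Counter


def majority_vote(answers):
    """Return (winning answer, share of volunteers who gave that answer)."""
    label, count = Counter(answers).most_common(1)[0]
    return label, count / len(answers)


def binary_metrics(truth, predicted):
    """Accuracy, sensitivity and precision for boolean labels."""
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t and p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
    }


# Hypothetical votes of three volunteers on five binary micro tasks.
votes = [
    [True, True, True],
    [True, True, False],
    [False, False, True],
    [False, False, False],
    [True, True, False],
]
truth = [True, True, True, False, False]  # hypothetical reference labels

results = [majority_vote(v) for v in votes]
predicted = [label for label, _ in results]
agreements = [share for _, share in results]
metrics = binary_metrics(truth, predicted)
print(metrics)
print(agreements)
```

    In this toy data the two tasks answered wrongly by the majority are among those with the lower agreement share, which is the kind of relationship that lets agreement serve as a proxy for accuracy when no reference labels exist.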

  2. Reconciling change blindness with long-term memory for objects.

    PubMed

    Wood, Katherine; Simons, Daniel J

    2017-02-01

    How can we reconcile remarkably precise long-term memory for thousands of images with failures to detect changes to similar images? We explored whether people can use detailed, long-term memory to improve change detection performance. Subjects studied a set of images of objects and then performed recognition and change detection tasks with those images. Recognition memory performance exceeded change detection performance, even when a single familiar object in the postchange display consistently indicated the change location. In fact, participants were no better when a familiar object predicted the change location than when the displays consisted of unfamiliar objects. When given an explicit strategy to search for a familiar object as a way to improve performance on the change detection task, they performed no better than in a 6-alternative recognition memory task. Subjects only benefited from the presence of familiar objects in the change detection task when they had more time to view the prechange array before it switched. Once the cost to using the change detection information decreased, subjects made use of it in conjunction with memory to boost performance on the familiar-item change detection task. This suggests that even useful information will go unused if it is sufficiently difficult to extract.

  3. Geotherm: the U.S. geological survey geothermal information system

    USGS Publications Warehouse

    Bliss, J.D.; Rapport, A.

    1983-01-01

    GEOTHERM is a comprehensive system of public databases and software used to store, locate, and evaluate information on the geology, geochemistry, and hydrology of geothermal systems. Three main databases address the general characteristics of geothermal wells and fields, and the chemical properties of geothermal fluids; the last database is currently the most active. System tasks are divided into four areas: (1) data acquisition and entry, involving data entry via word processors and magnetic tape; (2) quality assurance, including the criteria and standards handbook and front-end data-screening programs; (3) operation, involving database backups and information extraction; and (4) user assistance, preparation of such items as application programs, and a quarterly newsletter. The principal task of GEOTHERM is to provide information and research support for the conduct of national geothermal-resource assessments. The principal users of GEOTHERM are those involved with the Geothermal Research Program of the U.S. Geological Survey. Information in the system is available to the public on request. © 1983.

  4. Morphological learning in a novel language: A cross-language comparison.

    PubMed

    Havas, Viktória; Waris, Otto; Vaquero, Lucía; Rodríguez-Fornells, Antoni; Laine, Matti

    2015-01-01

    Being able to extract and interpret the internal structure of complex word forms such as the English word dance+r+s is crucial for successful language learning. We examined whether the ability to extract morphological information during word learning is affected by the morphological features of one's native tongue. Spanish and Finnish adult participants performed a word-picture associative learning task in an artificial language where the target words included a suffix marking the gender of the corresponding animate object. The short exposure phase was followed by a word recognition task and a generalization task for the suffix. The participants' native tongues vary greatly in terms of morphological structure, leading to two opposing hypotheses. On the one hand, Spanish speakers may be more effective in identifying gender in a novel language because this feature is present in Spanish but not in Finnish. On the other hand, Finnish speakers may have an advantage as the abundance of bound morphemes in their language calls for continuous morphological decomposition. The results support the latter alternative, suggesting that lifelong experience on morphological decomposition provides an advantage in novel morphological learning.

  5. Recognition of chemical entities: combining dictionary-based and grammar-based approaches.

    PubMed

    Akhondi, Saber A; Hettne, Kristina M; van der Horst, Eelke; van Mulligen, Erik M; Kors, Jan A

    2015-01-01

    The past decade has seen an upsurge in the number of publications in chemistry. The ever-swelling volume of available documents makes it increasingly hard to extract relevant new information from such unstructured texts. The BioCreative CHEMDNER challenge invites the development of systems for the automatic recognition of chemicals in text (CEM task) and for ranking the recognized compounds at the document level (CDI task). We investigated an ensemble approach where dictionary-based named entity recognition is used along with grammar-based recognizers to extract compounds from text. We assessed the performance of ten different commercial and publicly available lexical resources using an open source indexing system (Peregrine), in combination with three different chemical compound recognizers and a set of regular expressions to recognize chemical database identifiers. The effect of different stop-word lists, case-sensitivity matching, and use of chunking information was also investigated. We focused on lexical resources that provide chemical structure information. To rank the different compounds found in a text, we used a term confidence score based on the normalized ratio of the term frequencies in chemical and non-chemical journals. The use of stop-word lists greatly improved the performance of the dictionary-based recognition, but there was no additional benefit from using chunking information. A combination of ChEBI and HMDB as lexical resources, the LeadMine tool for grammar-based recognition, and the regular expressions, outperformed any of the individual systems. On the test set, the F-scores were 77.8% (recall 71.2%, precision 85.8%) for the CEM task and 77.6% (recall 71.7%, precision 84.6%) for the CDI task. Missed terms were mainly due to tokenization issues, poor recognition of formulas, and term conjunctions. 
    We developed an ensemble system that combines dictionary-based and grammar-based approaches for chemical named entity recognition, outperforming any of the individual systems that we considered. The system is able to provide structure information for most of the compounds that are found. Improved tokenization and better recognition of specific entity types is likely to further improve system performance.
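
    As a quick check of the reported figures, the F-scores can be recomputed from the stated precision and recall. The sketch assumes the F-score is the balanced F1 measure (harmonic mean of precision and recall), which the abstract does not state explicitly but which reproduces the reported values; the function name `f_score` is illustrative.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F-score (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported for the two CHEMDNER subtasks.
cem = f_score(precision=0.858, recall=0.712)
cdi = f_score(precision=0.846, recall=0.717)

print(f"CEM F-score: {cem:.1%}")  # 77.8%
print(f"CDI F-score: {cdi:.1%}")  # 77.6%
```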

  6. Recognition of chemical entities: combining dictionary-based and grammar-based approaches

    PubMed Central

    2015-01-01

    Background The past decade has seen an upsurge in the number of publications in chemistry. The ever-swelling volume of available documents makes it increasingly hard to extract relevant new information from such unstructured texts. The BioCreative CHEMDNER challenge invites the development of systems for the automatic recognition of chemicals in text (CEM task) and for ranking the recognized compounds at the document level (CDI task). We investigated an ensemble approach where dictionary-based named entity recognition is used along with grammar-based recognizers to extract compounds from text. We assessed the performance of ten different commercial and publicly available lexical resources using an open source indexing system (Peregrine), in combination with three different chemical compound recognizers and a set of regular expressions to recognize chemical database identifiers. The effect of different stop-word lists, case-sensitivity matching, and use of chunking information was also investigated. We focused on lexical resources that provide chemical structure information. To rank the different compounds found in a text, we used a term confidence score based on the normalized ratio of the term frequencies in chemical and non-chemical journals. Results The use of stop-word lists greatly improved the performance of the dictionary-based recognition, but there was no additional benefit from using chunking information. A combination of ChEBI and HMDB as lexical resources, the LeadMine tool for grammar-based recognition, and the regular expressions, outperformed any of the individual systems. On the test set, the F-scores were 77.8% (recall 71.2%, precision 85.8%) for the CEM task and 77.6% (recall 71.7%, precision 84.6%) for the CDI task. Missed terms were mainly due to tokenization issues, poor recognition of formulas, and term conjunctions. 
    Conclusions We developed an ensemble system that combines dictionary-based and grammar-based approaches for chemical named entity recognition, outperforming any of the individual systems that we considered. The system is able to provide structure information for most of the compounds that are found. Improved tokenization and better recognition of specific entity types is likely to further improve system performance. PMID:25810767

  7. Information science team

    NASA Technical Reports Server (NTRS)

    Billingsley, F.

    1982-01-01

    Concerns are expressed about the data handling aspects of system design and about enabling technology for data handling and data analysis. The status, contributing factors, critical issues, and recommendations for investigations are listed for data handling, rectification and registration, and information extraction. Potential supports to individual P.I., research tasks, systematic data system design, and to system operation. The need for an airborne spectrometer class instrument for fundamental research in high spectral and spatial resolution is indicated. Geographic information system formatting and labelling techniques, very large scale integration, and methods for providing multitype data sets must also be developed.

  8. Classification of brain signals associated with imagination of hand grasping, opening and reaching by means of wavelet-based common spatial pattern and mutual information.

    PubMed

    Amanpour, Behzad; Erfanian, Abbas

    2013-01-01

    An important issue in designing a practical brain-computer interface (BCI) is the selection of mental tasks to be imagined. Different types of mental tasks have been used in BCI including left, right, foot, and tongue motor imageries. However, the mental tasks are different from the actions to be controlled by the BCI. It is desirable to select a mental task that is consistent with the desired action to be performed by the BCI. In this paper, we investigated detecting the imagination of hand grasping, hand opening, and hand reaching in one hand using electroencephalographic (EEG) signals. The results show that the ERD/ERS patterns associated with the imagination of hand grasping, opening, and reaching are different. For classification of brain signals associated with these mental tasks and feature extraction, a method based on wavelet packet, regularized common spatial pattern (CSP), and mutual information is proposed. The results of an offline analysis on five subjects show that the two-class mental tasks can be classified with an average accuracy of 77.6% using the proposed method. In addition, we examine the proposed method on datasets IVa from BCI Competition III and IIa from BCI Competition IV.
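
The mutual-information step mentioned in this abstract can be illustrated with a small Python sketch that ranks candidate features by their mutual information with the class label. It assumes the features have already been discretized into bins. The wavelet packet and regularized CSP stages are not shown, and the function names and toy data are hypothetical rather than taken from the paper.

```python
import math
from collections import Counter


def mutual_information(feature_bins, labels):
    """I(X;Y) in bits for two equal-length sequences of discrete values."""
    n = len(labels)
    joint = Counter(zip(feature_bins, labels))
    px = Counter(feature_bins)
    py = Counter(labels)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2(p(x,y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


def rank_features(feature_table, labels):
    """Return feature names sorted by decreasing mutual information with labels."""
    scores = {name: mutual_information(vals, labels)
              for name, vals in feature_table.items()}
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    # Toy two-class example: feature "a" separates the classes, "b" does not.
    labels = ["grasp", "grasp", "open", "open"]
    table = {"a": [0, 0, 1, 1], "b": [0, 1, 0, 1]}
    print(rank_features(table, labels))
```

A feature that perfectly separates two balanced classes carries 1 bit of information about the label, while a feature independent of the label carries 0 bits, so the ranking keeps the former first.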

  9. Coarse-to-Fine Encoding of Spatial Frequency Information into Visual Short-Term Memory for Faces but Impartial Decay

    ERIC Educational Resources Information Center

    Gao, Zaifeng; Bentin, Shlomo

    2011-01-01

    Face perception studies investigated how spatial frequencies (SF) are extracted from retinal display while forming a perceptual representation, or their selective use during task-imposed categorization. Here we focused on the order of encoding low-spatial frequencies (LSF) and high-spatial frequencies (HSF) from perceptual representations into…

  10. Young Children's Spontaneous Use of Geometry in Maps

    ERIC Educational Resources Information Center

    Shusterman, Anna; Lee, Sang Ah; Spelke, Elizabeth S.

    2008-01-01

    Two experiments tested whether 4-year-old children extract and use geometric information in simple maps without task instruction or feedback. Children saw maps depicting an arrangement of three containers and were asked to place an object into a container designated on the map. In Experiment 1, one of the three locations on the map and the array…

  11. Auditory Temporal Order Discrimination and Backward Recognition Masking in Adults with Dyslexia

    ERIC Educational Resources Information Center

    Griffiths, Yvonne M.; Hill, Nicholas I.; Bailey, Peter J.; Snowling, Margaret J.

    2003-01-01

    The ability of 20 adult dyslexic readers to extract frequency information from successive tone pairs was compared with that of IQ-matched controls using temporal order discrimination and auditory backward recognition masking (ABRM) tasks. In both paradigms, the interstimulus interval (ISI) between tones in a pair was either short (20 ms) or long…

  12. Finding Relevant Data in a Sea of Languages

    DTIC Science & Technology

    2016-04-26

    full machine-translated text, unbiased word clouds, query-biased word clouds, and query-biased sentence...and information retrieval to automate language processing tasks so that the limited number of linguists available for analyzing text and spoken...the crime (stock market). The Cross-LAnguage Search Engine (CLASE) has already preprocessed the documents, extracting text to identify the language

  13. The effects of 'ecstasy' (MDMA) on visuospatial memory performance: findings from a systematic review with meta-analyses.

    PubMed

    Murphy, Philip N; Bruno, Raimondo; Ryland, Ida; Wareing, Michele; Fisk, John E; Montgomery, Catharine; Hilton, Joanne

    2012-03-01

    To review, with meta-analyses where appropriate, performance differences between ecstasy (3,4-methylenedioxymethamphetamine) users and non-users on a wider range of visuospatial tasks than previously reviewed. Such tasks have been shown to draw upon working memory executive resources. Abstract databases were searched using the United Kingdom National Health Service Evidence Health Information Resource. Inclusion criteria were publication in English language peer-reviewed journals and the reporting of new findings regarding human ecstasy-users' performance on visuospatial tasks. Data extracted included specific task requirements to provide a basis for meta-analyses for categories of tasks with similar requirements. Fifty-two studies were identified for review, although not all were suitable for meta-analysis. Significant weighted mean effect sizes indicating poorer performance by ecstasy users compared with matched controls were found for tasks requiring recall of spatial stimulus elements, recognition of figures and production/reproduction of figures. There was no evidence of a linear relationship between estimated ecstasy consumption and effect sizes. Given the networked nature of processing for spatial and non-spatial visual information, future scanning and imaging studies should focus on brain activation differences between ecstasy users and non-users in the context of specific tasks to facilitate identification of loci of potentially compromised activity in users. Copyright © 2012 John Wiley & Sons, Ltd.

  14. Evaluating the predictive power of multivariate tensor-based morphometry in Alzheimer's disease progression via convex fused sparse group Lasso

    NASA Astrophysics Data System (ADS)

    Tsao, Sinchai; Gajawelli, Niharika; Zhou, Jiayu; Shi, Jie; Ye, Jieping; Wang, Yalin; Lepore, Natasha

    2014-03-01

    Prediction of Alzheimer's disease (AD) progression based on baseline measures allows us to understand disease progression and has implications in decisions concerning treatment strategy. To this end we combine a predictive multi-task machine learning method [1] with a novel MR-based multivariate morphometric surface map of the hippocampus [2] to predict future cognitive scores of patients. Previous work by Zhou et al. [1] has shown that a multi-task learning framework that performs prediction of all future time points (or tasks) simultaneously can be used to encode both sparsity and temporal smoothness. They showed that this can be used in predicting cognitive outcomes of Alzheimer's Disease Neuroimaging Initiative (ADNI) subjects based on FreeSurfer-based baseline MRI features, MMSE score, demographic information and ApoE status. Whilst volumetric information may hold generalized information on brain status, we hypothesized that hippocampus-specific information may be more useful in predictive modeling of AD. To this end, we applied Shi et al.'s [2] recently developed multivariate tensor-based (mTBM) parametric surface analysis method to extract features from the hippocampal surface. We show that by combining the power of the multi-task framework with the sensitivity of mTBM features of the hippocampus surface, we are able to significantly improve predictive performance of ADAS cognitive scores 6, 12, 24, 36 and 48 months from baseline.

  15. Unsupervised user similarity mining in GSM sensor networks.

    PubMed

    Shad, Shafqat Ali; Chen, Enhong

    2013-01-01

    Mobility data has attracted researchers for the past few years because of its rich context and spatiotemporal nature; this information can be used for potential applications like early warning systems, route prediction, traffic management, advertisement, social networking, and community finding. All the mentioned applications are based on mobility profile building and user trend analysis, where mobility profile building is done through significant places extraction, user's actual movement prediction, and context awareness. However, significant places extraction and user's actual movement prediction for mobility profile building are nontrivial tasks. In this paper, we present a user similarity mining-based methodology through user mobility profile building by using the semantic tagging information provided by the user and basic GSM network architecture properties, based on an unsupervised clustering approach. As the mobility information is in low-level raw form, our proposed methodology successfully converts it to high-level meaningful information by using the cell-Id location information rather than previously used location capturing methods like GPS, Infrared, and Wifi for profile mining and user similarity mining.

  16. What is the context of contextual cueing?

    PubMed

    Makovski, Tal

    2016-12-01

    People have a powerful ability to extract regularities from noisy environments and to utilize this knowledge to assist in visual search. Extensive research has shown that this ability, termed contextual cueing (CC), is robust and ubiquitous, but it is still unclear what exactly is the context that is being learned. Researchers have typically focused on how people learn spatial configuration regularities and have hence used simplified, meaningless search stimuli. Here, observers performed visual search tasks using images of real-world objects. The results revealed that, contrary to past findings, the repetition of either arbitrary spatial information or identity information was not sufficient to produce context learning. Instead, learning was found only when both types of information were repeated together. These results were further replicated in hybrid search tasks, in which subjects looked for multiple target templates. Together, these data suggest that CC is more limited than typically assumed, yet this learning is highly robust.

  17. GEOTHERM Data Set

    DOE Data Explorer

    DeAngelo, Jacob

    1983-01-01

    GEOTHERM is a comprehensive system of public databases and software used to store, locate, and evaluate information on the geology, geochemistry, and hydrology of geothermal systems. Three main databases address the general characteristics of geothermal wells and fields, and the chemical properties of geothermal fluids; the last database is currently the most active. System tasks are divided into four areas: (1) data acquisition and entry, involving data entry via word processors and magnetic tape; (2) quality assurance, including the criteria and standards handbook and front-end data-screening programs; (3) operation, involving database backups and information extraction; and (4) user assistance, preparation of such items as application programs, and a quarterly newsletter. The principal task of GEOTHERM is to provide information and research support for the conduct of national geothermal-resource assessments. The principal users of GEOTHERM are those involved with the Geothermal Research Program of the U.S. Geological Survey.

  18. Simulation of a Real-Time Brain Computer Interface for Detecting a Self-Paced Hitting Task.

    PubMed

    Hammad, Sofyan H; Kamavuako, Ernest N; Farina, Dario; Jensen, Winnie

    2016-12-01

    An invasive brain-computer interface (BCI) is a promising neurorehabilitation device for severely disabled patients. Although some systems have been shown to work well in restricted laboratory settings, their utility must be tested in less controlled, real-time environments. Our objective was to investigate whether a specific motor task could be reliably detected from multiunit intracortical signals from freely moving animals in a simulated, real-time setting. Intracortical signals were first obtained from electrodes placed in the primary motor cortex of four rats that were trained to hit a retractable paddle (defined as a "Hit"). In the simulated real-time setting, the signal-to-noise-ratio was first increased by wavelet denoising. Action potentials were detected, and features were extracted (spike count, mean absolute values, entropy, and combination of these features) within pre-defined time windows (200 ms, 300 ms, and 400 ms) to classify the occurrence of a "Hit." We found higher detection accuracy of a "Hit" (73.1%, 73.4%, and 67.9% for the three window sizes, respectively) when the decision was made based on a combination of features rather than on a single feature. However, the duration of the window length was not statistically significant (p = 0.5). Our results showed the feasibility of detecting a motor task in real time in a less restricted environment compared to environments commonly applied within invasive BCI research, and they showed the feasibility of using information extracted from multiunit recordings, thereby avoiding the time-consuming and complex task of extracting and sorting single units. © 2016 International Neuromodulation Society.
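
The windowed feature extraction described in this abstract (spike count, mean absolute value, and entropy within a fixed time window) can be sketched in Python as follows. This is a minimal sketch under stated assumptions: spike detection is taken as already done, entropy is computed as the Shannon entropy of a histogram of rounded amplitudes (the abstract does not define it), and the function name, parameters, and synthetic data are hypothetical.

```python
import math
from collections import Counter


def window_features(samples, spike_times_ms, start_ms, length_ms, fs_hz):
    """Spike count, mean absolute value, and amplitude entropy for one window."""
    end_ms = start_ms + length_ms
    spike_count = sum(1 for t in spike_times_ms if start_ms <= t < end_ms)

    # Convert the window bounds from milliseconds to sample indices.
    first = int(start_ms * fs_hz / 1000)
    last = int(end_ms * fs_hz / 1000)
    segment = samples[first:last]
    if not segment:
        raise ValueError("window contains no samples")
    mav = sum(abs(v) for v in segment) / len(segment)

    # Assumed entropy definition: Shannon entropy (bits) of rounded amplitudes.
    counts = Counter(round(v) for v in segment)
    total = len(segment)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return {"spike_count": spike_count, "mav": mav, "entropy": entropy}


if __name__ == "__main__":
    signal = [1, -1, 2, -2] * 50   # 200 synthetic samples at 1 kHz
    spikes = [10, 150, 250]        # synthetic spike times in ms
    print(window_features(signal, spikes, start_ms=0, length_ms=200, fs_hz=1000))
```

Calling the function with `length_ms` set to 200, 300, or 400 would correspond to the three window sizes compared in the study, and the returned dictionary can feed a classifier either one feature at a time or in combination.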

  19. Distance Metric Learning Using Privileged Information for Face Verification and Person Re-Identification.

    PubMed

    Xu, Xinxing; Li, Wen; Xu, Dong

    2015-12-01

    In this paper, we propose a new approach to improve face verification and person re-identification in the RGB images by leveraging a set of RGB-D data, in which we have additional depth images in the training data captured using depth cameras such as Kinect. In particular, we extract visual features and depth features from the RGB images and depth images, respectively. As the depth features are available only in the training data, we treat the depth features as privileged information, and we formulate this task as a distance metric learning with privileged information problem. Unlike the traditional face verification and person re-identification tasks that only use visual features, we further employ the extra depth features in the training data to improve the learning of distance metric in the training process. Based on the information-theoretic metric learning (ITML) method, we propose a new formulation called ITML with privileged information (ITML+) for this task. We also present an efficient algorithm based on the cyclic projection method for solving the proposed ITML+ formulation. Extensive experiments on the challenging faces data sets EUROCOM and CurtinFaces for face verification as well as the BIWI RGBD-ID data set for person re-identification demonstrate the effectiveness of our proposed approach.

  20. Pathology report data extraction from relational database using R, with extraction from reports on melanoma of skin as an example.

    PubMed

    Ye, Jay J

    2016-01-01

    Different methods have been described for data extraction from pathology reports with varying degrees of success. Here a technique for directly extracting data from a relational database is described. Our department uses synoptic reports modified from College of American Pathologists (CAP) Cancer Protocol Templates to report most of our cancer diagnoses. Choosing the melanoma of skin synoptic report as an example, the R scripting language extended with the RODBC package was used to query the pathology information system database. Reports containing the melanoma of skin synoptic report in the past 4 and a half years were retrieved and individual data elements were extracted. Using the retrieved list of the cases, the database was queried a second time to retrieve/extract the lymph node staging information in the subsequent reports from the same patients. 426 synoptic reports corresponding to unique lesions of melanoma of skin were retrieved, and data elements of interest were extracted into an R data frame. The distribution of Breslow depth of melanomas grouped by year is used as an example of intra-report data extraction and analysis. When the new pN staging information was present in the subsequent reports, 82% (77/94) was precisely retrieved (pN0, pN1, pN2 and pN3). An additional 15% (14/94) was retrieved with certain ambiguity (positive or knowing there was an update). The specificity was 100% for both. The relationship between Breslow depth and lymph node status was graphed as an example of lesion-specific multi-report data extraction and analysis. R extended with the RODBC package is a simple and versatile approach well-suited for the above tasks. The success or failure of the retrieval and extraction depended largely on whether the reports were formatted and whether the contents of the elements were consistently phrased. This approach can be easily modified and adopted for other pathology information systems that use a relational database for data management.
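
The query-then-extract workflow in this abstract can be illustrated with a short Python sketch. The author used R with the RODBC package. Python's built-in `sqlite3` and `re` modules are swapped in here only to keep the example self-contained, and the table schema, column names, and report wording are hypothetical.

```python
import re
import sqlite3

# Hypothetical schema and report text; a real pathology system will differ.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reports (case_id TEXT, signed_year INTEGER, body TEXT)")
conn.executemany(
    "INSERT INTO reports VALUES (?, ?, ?)",
    [
        ("S16-001", 2015,
         "MELANOMA OF SKIN SYNOPTIC REPORT\nBreslow depth: 0.8 mm\nUlceration: Absent"),
        ("S16-002", 2016,
         "MELANOMA OF SKIN SYNOPTIC REPORT\nBreslow depth: 2.35 mm\nUlceration: Present"),
        ("S16-003", 2016, "Benign nevus. No synoptic report."),
    ],
)

# One data element of the synoptic report, captured with a regular expression.
BRESLOW = re.compile(r"Breslow depth:\s*([\d.]+)\s*mm", re.IGNORECASE)


def extract_breslow(connection):
    """Return (case_id, year, depth_mm) for reports containing the synoptic header."""
    rows = connection.execute(
        "SELECT case_id, signed_year, body FROM reports "
        "WHERE body LIKE ? ORDER BY case_id",
        ("%MELANOMA OF SKIN SYNOPTIC REPORT%",),
    )
    results = []
    for case_id, year, body in rows:
        match = BRESLOW.search(body)
        if match:  # skip reports whose element is missing or phrased differently
            results.append((case_id, year, float(match.group(1))))
    return results


if __name__ == "__main__":
    for row in extract_breslow(conn):
        print(row)
```

The regular expression succeeds only when the element is phrased consistently, which is the same dependency on report formatting that the abstract identifies as the main factor in retrieval success.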

  1. Investigation of automated feature extraction using multiple data sources

    NASA Astrophysics Data System (ADS)

    Harvey, Neal R.; Perkins, Simon J.; Pope, Paul A.; Theiler, James P.; David, Nancy A.; Porter, Reid B.

    2003-04-01

    An increasing number and variety of platforms are now capable of collecting remote sensing data over a particular scene. For many applications, the information available from any individual sensor may be incomplete, inconsistent or imprecise. However, other sources may provide complementary and/or additional data. Thus, for an application such as image feature extraction or classification, it may be that fusing the multiple data sources can lead to more consistent and reliable results. Unfortunately, with the increased complexity of the fused data, the search space of feature-extraction or classification algorithms also greatly increases. With a single data source, the determination of a suitable algorithm may be a significant challenge for an image analyst. With the fused data, the search for suitable algorithms can go far beyond the capabilities of a human in a realistic time frame, and becomes the realm of machine learning, where the computational power of modern computers can be harnessed to the task at hand. We describe experiments in which we investigate the ability of a suite of automated feature extraction tools developed at Los Alamos National Laboratory to make use of multiple data sources for various feature extraction tasks. We compare and contrast this software's capabilities on 1) individual data sets from different data sources, 2) fused data sets from multiple data sources, and 3) fusion of results from multiple individual data sources.

  2. Applying natural language processing techniques to develop a task-specific EMR interface for timely stroke thrombolysis: A feasibility study.

    PubMed

    Sung, Sheng-Feng; Chen, Kuanchin; Wu, Darren Philbert; Hung, Ling-Chien; Su, Yu-Hsiang; Hu, Ya-Han

    2018-04-01

    To reduce errors in determining eligibility for intravenous thrombolytic therapy (IVT) in stroke patients through use of an enhanced task-specific electronic medical record (EMR) interface powered by natural language processing (NLP) techniques. The information processing algorithm utilized MetaMap to extract medical concepts from IVT eligibility criteria and expanded the concepts using the Unified Medical Language System Metathesaurus. Concepts identified from clinical notes by MetaMap were compared to those from IVT eligibility criteria. The task-specific EMR interface displays IVT-relevant information by highlighting phrases that contain matched concepts. Clinical usability was assessed with clinicians staffing the acute stroke team by comparing user performance while using the task-specific and the current EMR interfaces. The algorithm identified IVT-relevant concepts with micro-averaged precisions, recalls, and F1 measures of 0.998, 0.812, and 0.895 at the phrase level and of 1, 0.972, and 0.986 at the document level. Users using the task-specific interface achieved a higher accuracy score than those using the current interface (91% versus 80%, p = 0.016) in assessing the IVT eligibility criteria. The completion time between the interfaces was statistically similar (2.46 min versus 1.70 min, p = 0.754). Although the information processing algorithm had room for improvement, the task-specific EMR interface significantly reduced errors in assessing IVT eligibility criteria. The study findings provide evidence to support an NLP enhanced EMR system to facilitate IVT decision-making by presenting meaningful and timely information to clinicians, thereby offering a new avenue for improvements in acute stroke care. Copyright © 2018 Elsevier B.V. All rights reserved.
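
The F1 measures reported in this abstract follow from the stated precisions and recalls, since F1 is the harmonic mean of the two. The short Python sketch below reproduces that arithmetic. The function name is an illustrative choice and is not taken from the study.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Phrase-level and document-level figures quoted in the abstract.
    print(round(f1_score(0.998, 0.812), 3))
    print(round(f1_score(1.0, 0.972), 3))
```

Rounded to three decimals, both values agree with the reported 0.895 and 0.986.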

  3. Exploiting graph kernels for high performance biomedical relation extraction.

    PubMed

    Panyam, Nagesh C; Verspoor, Karin; Cohn, Trevor; Ramamohanarao, Kotagiri

    2018-01-30

    Relation extraction from biomedical publications is an important task in the area of semantic mining of text. Kernel methods for supervised relation extraction are often preferred over manual feature engineering methods, when classifying highly ordered structures such as trees and graphs obtained from syntactic parsing of a sentence. Tree kernels such as the Subset Tree Kernel and Partial Tree Kernel have been shown to be effective for classifying constituency parse trees and basic dependency parse graphs of a sentence. Graph kernels such as the All Path Graph kernel (APG) and Approximate Subgraph Matching (ASM) kernel have been shown to be suitable for classifying general graphs with cycles, such as the enhanced dependency parse graph of a sentence. In this work, we present a high performance Chemical-Induced Disease (CID) relation extraction system. We present a comparative study of kernel methods for the CID task and also extend our study to the Protein-Protein Interaction (PPI) extraction task, an important biomedical relation extraction task. We discuss novel modifications to the ASM kernel to boost its performance and a method to apply graph kernels for extracting relations expressed in multiple sentences. Our system for CID relation extraction attains an F-score of 60%, without using external knowledge sources or task specific heuristic or rules. In comparison, the state of the art Chemical-Disease Relation Extraction system achieves an F-score of 56% using an ensemble of multiple machine learning methods, which is then boosted to 61% with a rule based system employing task specific post processing rules. For the CID task, graph kernels outperform tree kernels substantially, and the best performance is obtained with APG kernel that attains an F-score of 60%, followed by the ASM kernel at 57%. The performance difference between the ASM and APG kernels for CID sentence level relation extraction is not significant. 
In our evaluation of ASM for the PPI task, ASM performed better than APG kernel for the BioInfer dataset, in the Area Under Curve (AUC) measure (74% vs 69%). However, for all the other PPI datasets, namely AIMed, HPRD50, IEPA and LLL, ASM is substantially outperformed by the APG kernel in F-score and AUC measures. We demonstrate a high performance Chemical Induced Disease relation extraction, without employing external knowledge sources or task specific heuristics. Our work shows that graph kernels are effective in extracting relations that are expressed in multiple sentences. We also show that the graph kernels, namely the ASM and APG kernels, substantially outperform the tree kernels. Among the graph kernels, we showed the ASM kernel as effective for biomedical relation extraction, with comparable performance to the APG kernel for datasets such as the CID-sentence level relation extraction and BioInfer in PPI. Overall, the APG kernel is shown to be significantly more accurate than the ASM kernel, achieving better performance on most datasets.

  4. PREDOSE: A Semantic Web Platform for Drug Abuse Epidemiology using Social Media

    PubMed Central

    Cameron, Delroy; Smith, Gary A.; Daniulaityte, Raminta; Sheth, Amit P.; Dave, Drashti; Chen, Lu; Anand, Gaurish; Carlson, Robert; Watkins, Kera Z.; Falck, Russel

    2013-01-01

    Objectives The role of social media in biomedical knowledge mining, including clinical, medical and healthcare informatics, prescription drug abuse epidemiology and drug pharmacology, has become increasingly significant in recent years. Social media offers opportunities for people to share opinions and experiences freely in online communities, which may contribute information beyond the knowledge of domain professionals. This paper describes the development of a novel Semantic Web platform called PREDOSE (PREscription Drug abuse Online Surveillance and Epidemiology), which is designed to facilitate the epidemiologic study of prescription (and related) drug abuse practices using social media. PREDOSE uses web forum posts and domain knowledge, modeled in a manually created Drug Abuse Ontology (DAO) (pronounced dow), to facilitate the extraction of semantic information from User Generated Content (UGC). A combination of lexical, pattern-based and semantics-based techniques is used together with the domain knowledge to extract fine-grained semantic information from UGC. In a previous study, PREDOSE was used to obtain the datasets from which new knowledge in drug abuse research was derived. Here, we report on various platform enhancements, including an updated DAO, new components for relationship and triple extraction, and tools for content analysis, trend detection and emerging patterns exploration, which enhance the capabilities of the PREDOSE platform. Given these enhancements, PREDOSE is now more equipped to impact drug abuse research by alleviating traditional labor-intensive content analysis tasks. Methods Using custom web crawlers that scrape UGC from publicly available web forums, PREDOSE first automates the collection of web-based social media content for subsequent semantic annotation. 
The annotation scheme is modeled in the DAO, and includes domain specific knowledge such as prescription (and related) drugs, methods of preparation, side effects, routes of administration, etc. The DAO is also used to help recognize three types of data, namely: 1) entities, 2) relationships and 3) triples. PREDOSE then uses a combination of lexical and semantic-based techniques to extract entities and relationships from the scraped content, and a top-down approach for triple extraction that uses patterns expressed in the DAO. In addition, PREDOSE uses publicly available lexicons to identify initial sentiment expressions in text, and then a probabilistic optimization algorithm (from related research) to extract the final sentiment expressions. Together, these techniques enable the capture of fine-grained semantic information from UGC, and querying, search, trend analysis and overall content analysis of social media related to prescription drug abuse. Moreover, extracted data are also made available to domain experts for the creation of training and test sets for use in evaluation and refinements in information extraction techniques. Results A recent evaluation of the information extraction techniques applied in the PREDOSE platform indicates 85% precision and 72% recall in entity identification, on a manually created gold standard dataset. In another study, PREDOSE achieved 36% precision in relationship identification and 33% precision in triple extraction, through manual evaluation by domain experts. Given the complexity of the relationship and triple extraction tasks and the abstruse nature of social media texts, we interpret these as favorable initial results. Extracted semantic information is currently in use in an online discovery support system, by prescription drug abuse researchers at the Center for Interventions, Treatment and Addictions Research (CITAR) at Wright State University. 
Conclusion A comprehensive platform for entity, relationship, triple and sentiment extraction from such abstruse texts has never been developed for drug abuse research. PREDOSE has already demonstrated the importance of mining social media by providing data from which new findings in drug abuse research were uncovered. Given the recent platform enhancements, including the refined DAO, components for relationship and triple extraction, and tools for content, trend and emerging pattern analysis, it is expected that PREDOSE will play a significant role in advancing drug abuse epidemiology in future. PMID:23892295

  5. Task-evoked brain functional magnetic susceptibility mapping by independent component analysis (χICA).

    PubMed

    Chen, Zikuan; Calhoun, Vince D

    2016-03-01

    Conventionally, independent component analysis (ICA) is performed on an fMRI magnitude dataset to analyze brain functional mapping (AICA). By solving the inverse problem of fMRI, we can reconstruct the brain magnetic susceptibility (χ) functional states. Upon the reconstructed χ dataspace, we propose an ICA-based brain functional χ mapping method (χICA) to extract task-evoked brain functional map. A complex division algorithm is applied to a timeseries of fMRI phase images to extract temporal phase changes (relative to an OFF-state snapshot). A computed inverse MRI (CIMRI) model is used to reconstruct a 4D brain χ response dataset. χICA is implemented by applying a spatial InfoMax ICA algorithm to the reconstructed 4D χ dataspace. With finger-tapping experiments on a 7T system, the χICA-extracted χ-depicted functional map is similar to the SPM-inferred functional χ map by a spatial correlation of 0.67 ± 0.05. In comparison, the AICA-extracted magnitude-depicted map is correlated with the SPM magnitude map by 0.81 ± 0.05. The understanding of the inferiority of χICA to AICA for task-evoked functional map is an ongoing research topic. For task-evoked brain functional mapping, we compare the data-driven ICA method with the task-correlated SPM method. In particular, we compare χICA with AICA for extracting task-correlated timecourses and functional maps. χICA can extract a χ-depicted task-evoked brain functional map from a reconstructed χ dataspace without the knowledge about brain hemodynamic responses. The χICA-extracted brain functional χ map reveals a bidirectional BOLD response pattern that is unavailable (or different) from AICA. Copyright © 2016 Elsevier B.V. All rights reserved.
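
The complex-division step in this abstract, extracting phase changes relative to an OFF-state snapshot, can be illustrated for a single voxel in Python. Dividing one complex sample by the reference and taking the argument returns the phase difference already wrapped into (-π, π], which direct subtraction of two phase angles does not guarantee. This sketch covers that one step only, not the CIMRI reconstruction or the ICA stage, and the function names and sample values are hypothetical.

```python
import cmath


def phase_change(z_t, z_ref):
    """Phase of z_t relative to z_ref, in (-pi, pi], via complex division."""
    return cmath.phase(z_t / z_ref)


def phase_change_series(voxel_series, ref_index=0):
    """Phase change of every sample relative to one reference snapshot."""
    ref = voxel_series[ref_index]  # must be non-zero, or division will fail
    return [phase_change(z, ref) for z in voxel_series]


if __name__ == "__main__":
    # Reference phase 3.0 rad, later phase -3.0 rad: naive subtraction gives
    # -6.0 rad, while complex division gives the wrapped value 2*pi - 6.
    reference = cmath.rect(1.0, 3.0)
    later = cmath.rect(2.0, -3.0)
    print(phase_change(later, reference))
    print(phase_change_series([reference, later]))
```

The magnitude of each sample cancels out of the argument, so only the phase relationship between the time point and the reference affects the result.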

  6. On the selection and evaluation of visual display symbology: Factors influencing search and identification times

    NASA Technical Reports Server (NTRS)

    Remington, Roger; Williams, Douglas

    1986-01-01

    Three single-target visual search tasks were used to evaluate a set of cathode-ray tube (CRT) symbols for a helicopter situation display. The search tasks were representative of the information extraction required in practice, and reaction time was used to measure the efficiency with which symbols could be located and identified. Familiar numeric symbols were responded to more quickly than graphic symbols. The addition of modifier symbols, such as a nearby flashing dot or surrounding square, had a greater disruptive effect on the graphic symbols than did the numeric characters. The results suggest that a symbol set is, in some respects, like a list that must be learned. Factors that affect the time to identify items in a memory task, such as familiarity and visual discriminability, also affect the time to identify symbols. This analogy has broad implications for the design of symbol sets. An attempt was made to model information access with this class of display.

  7. Comparing two types of engineering visualizations: task-related manipulations matter.

    PubMed

    Cölln, Martin C; Kusch, Kerstin; Helmert, Jens R; Kohler, Petra; Velichkovsky, Boris M; Pannasch, Sebastian

    2012-01-01

    This study focuses on the comparison of traditional engineering drawings with a CAD (computer aided design) visualization in terms of user performance and eye movements in an applied context. Twenty-five students of mechanical engineering completed search tasks for measures in two distinct depictions of a car engine component (engineering drawing vs. CAD model). Besides spatial dimensionality, the display types most notably differed in terms of information layout, access and interaction options. The CAD visualization yielded better performance, if users directly manipulated the object, but was inferior, if employed in a conventional static manner, i.e. inspecting only predefined views. An additional eye movement analysis revealed longer fixation durations and a stronger increase of task-relevant fixations over time when interacting with the CAD visualization. This suggests a more focused extraction and filtering of information. We conclude that the three-dimensional CAD visualization can be advantageous if its ability to manipulate is used. Copyright © 2011 Elsevier Ltd and The Ergonomics Society. All rights reserved.

  8. Natural Language Processing in Radiology: A Systematic Review.

    PubMed

    Pons, Ewoud; Braun, Loes M M; Hunink, M G Myriam; Kors, Jan A

    2016-05-01

    Radiological reporting has generated large quantities of digital content within the electronic health record, which is potentially a valuable source of information for improving clinical care and supporting research. Although radiology reports are stored for communication and documentation of diagnostic imaging, harnessing their potential requires efficient and automated information extraction: they exist mainly as free-text clinical narrative, from which it is a major challenge to obtain structured data. Natural language processing (NLP) provides techniques that aid the conversion of text into a structured representation, and thus enables computers to derive meaning from human (ie, natural language) input. Used on radiology reports, NLP techniques enable automatic identification and extraction of information. By exploring the various purposes for their use, this review examines how radiology benefits from NLP. A systematic literature search identified 67 relevant publications describing NLP methods that support practical applications in radiology. This review takes a close look at the individual studies in terms of tasks (ie, the extracted information), the NLP methodology and tools used, and their application purpose and performance results. Additionally, limitations, future challenges, and requirements for advancing NLP in radiology will be discussed. © RSNA, 2016. Online supplemental material is available for this article.

  9. Event-based text mining for biology and functional genomics

    PubMed Central

    Thompson, Paul; Nawaz, Raheel; McNaught, John; Kell, Douglas B.

    2015-01-01

    The assessment of genome function requires a mapping between genome-derived entities and biochemical reactions, and the biomedical literature represents a rich source of information about reactions between biological components. However, the increasingly rapid growth in the volume of literature provides both a challenge and an opportunity for researchers to isolate information about reactions of interest in a timely and efficient manner. In response, recent text mining research in the biology domain has been largely focused on the identification and extraction of ‘events’, i.e. categorised, structured representations of relationships between biochemical entities, from the literature. Functional genomics analyses necessarily encompass events as so defined. Automatic event extraction systems facilitate the development of sophisticated semantic search applications, allowing researchers to formulate structured queries over extracted events, so as to specify the exact types of reactions to be retrieved. This article provides an overview of recent research into event extraction. We cover annotated corpora on which systems are trained, systems that achieve state-of-the-art performance and details of the community shared tasks that have been instrumental in increasing the quality, coverage and scalability of recent systems. Finally, several concrete applications of event extraction are covered, together with emerging directions of research. PMID:24907365

  10. Drug drug interaction extraction from the literature using a recursive neural network

    PubMed Central

    Lim, Sangrak; Lee, Kyubum

    2018-01-01

    Detecting drug-drug interactions (DDI) is important because information on DDIs can help prevent adverse effects from drug combinations. Since there are many new DDI-related papers published in the biomedical domain, manually extracting DDI information from the literature is a laborious task. However, text mining can be used to find DDIs in the biomedical literature. Among the recently developed neural networks, we use a Recursive Neural Network to improve the performance of DDI extraction. Our recursive neural network model uses a position feature, a subtree containment feature, and an ensemble method to improve the performance of DDI extraction. Compared with the state-of-the-art models, the DDI detection and type classifiers of our model performed 4.4% and 2.8% better, respectively, on the DDIExtraction Challenge’13 test data. We also validated our model on the PK DDI corpus that consists of two types of DDIs data: in vivo DDI and in vitro DDI. Compared with the existing model, our detection classifier performed 2.3% and 6.7% better on in vivo and in vitro data respectively. The results of our validation demonstrate that our model can automatically extract DDIs better than existing models. PMID:29373599

  11. Enhancing Biomedical Text Summarization Using Semantic Relation Extraction

    PubMed Central

    Shang, Yue; Li, Yanpeng; Lin, Hongfei; Yang, Zhihao

    2011-01-01

    Automatic text summarization for a biomedical concept can help researchers to get the key points of a certain topic from large amount of biomedical literature efficiently. In this paper, we present a method for generating text summary for a given biomedical concept, e.g., H1N1 disease, from multiple documents based on semantic relation extraction. Our approach includes three stages: 1) We extract semantic relations in each sentence using the semantic knowledge representation tool SemRep. 2) We develop a relation-level retrieval method to select the relations most relevant to each query concept and visualize them in a graphic representation. 3) For relations in the relevant set, we extract informative sentences that can interpret them from the document collection to generate text summary using an information retrieval based method. Our major focus in this work is to investigate the contribution of semantic relation extraction to the task of biomedical text summarization. The experimental results on summarization for a set of diseases show that the introduction of semantic knowledge improves the performance and our results are better than the MEAD system, a well-known tool for text summarization. PMID:21887336

  12. Multi-task feature learning by using trace norm regularization

    NASA Astrophysics Data System (ADS)

    Jiangmei, Zhang; Binfeng, Yu; Haibo, Ji; Wang, Kunpeng

    2017-11-01

    Multi-task learning can extract the correlation of multiple related machine learning problems to improve performance. This paper considers applying the multi-task learning method to learn a single task. We propose a new learning approach, which employs the mixture of expert model to divide a learning task into several related sub-tasks, and then uses the trace norm regularization to extract common feature representation of these sub-tasks. A nonlinear extension of this approach by using kernel is also provided. Experiments conducted on both simulated and real data sets demonstrate the advantage of the proposed approach.

  13. Fast and accurate edge orientation processing during object manipulation

    PubMed Central

    Flanagan, J Randall; Johansson, Roland S

    2018-01-01

    Quickly and accurately extracting information about a touched object’s orientation is a critical aspect of dexterous object manipulation. However, the speed and acuity of tactile edge orientation processing with respect to the fingertips as reported in previous perceptual studies appear inadequate in these respects. Here we directly establish the tactile system’s capacity to process edge-orientation information during dexterous manipulation. Participants extracted tactile information about edge orientation very quickly, using it within 200 ms of first touching the object. Participants were also strikingly accurate. With edges spanning the entire fingertip, edge-orientation resolution was better than 3° in our object manipulation task, which is several times better than reported in previous perceptual studies. Performance remained impressive even with edges as short as 2 mm, consistent with our ability to precisely manipulate very small objects. Taken together, our results radically redefine the spatial processing capacity of the tactile system. PMID:29611804

  14. Scene text recognition in mobile applications by character descriptor and structure configuration.

    PubMed

    Yi, Chucai; Tian, Yingli

    2014-07-01

    Text characters and strings in natural scene can provide valuable information for many applications. Extracting text directly from natural scene images or videos is a challenging task because of diverse text patterns and variant background interferences. This paper proposes a method of scene text recognition from detected text regions. In text detection, our previously proposed algorithms are applied to obtain text regions from scene image. First, we design a discriminative character descriptor by combining several state-of-the-art feature detectors and descriptors. Second, we model character structure at each character class by designing stroke configuration maps. Our algorithm design is compatible with the application of scene text extraction in smart mobile devices. An Android-based demo system is developed to show the effectiveness of our proposed method on scene text information extraction from nearby objects. The demo system also provides us some insight into algorithm design and performance improvement of scene text extraction. The evaluation results on benchmark data sets demonstrate that our proposed scheme of text recognition is comparable with the best existing methods.

  15. Classifying Cognitive Profiles Using Machine Learning with Privileged Information in Mild Cognitive Impairment.

    PubMed

    Alahmadi, Hanin H; Shen, Yuan; Fouad, Shereen; Luft, Caroline Di B; Bentham, Peter; Kourtzi, Zoe; Tino, Peter

    2016-01-01

    Early diagnosis of dementia is critical for assessing disease progression and potential treatment. State-of-the-art machine learning techniques have been increasingly employed to take on this diagnostic task. In this study, we employed Generalized Matrix Learning Vector Quantization (GMLVQ) classifiers to discriminate patients with Mild Cognitive Impairment (MCI) from healthy controls based on their cognitive skills. Further, we adopted a "Learning with privileged information" approach to combine cognitive and fMRI data for the classification task. The resulting classifier operates solely on the cognitive data while it incorporates the fMRI data as privileged information (PI) during training. This novel classifier is of practical use as the collection of brain imaging data is not always possible with patients and older participants. MCI patients and healthy age-matched controls were trained to extract structure from temporal sequences. We ask whether machine learning classifiers can be used to discriminate patients from controls and whether differences between these groups relate to individual cognitive profiles. To this end, we tested participants in four cognitive tasks: working memory, cognitive inhibition, divided attention, and selective attention. We also collected fMRI data before and after training on a probabilistic sequence learning task and extracted fMRI responses and connectivity as features for machine learning classifiers. Our results show that the PI guided GMLVQ classifiers outperform the baseline classifier that only used the cognitive data. In addition, we found that for the baseline classifier, divided attention is the only relevant cognitive feature. When PI was incorporated, divided attention remained the most relevant feature while cognitive inhibition also became relevant for the task. Interestingly, this analysis for the fMRI GMLVQ classifier suggests that (1) when overall fMRI signal is used as inputs to the classifier, the post-training session is most relevant; and (2) when the graph feature reflecting underlying spatiotemporal fMRI pattern is used, the pre-training session is most relevant. Taken together these results suggest that brain connectivity before training and overall fMRI signal after training are both diagnostic of cognitive skills in MCI.

  16. Assessing the Neural Basis of Uncertainty in Perceptual Category Learning through Varying Levels of Distortion

    ERIC Educational Resources Information Center

    Daniel, Reka; Wagner, Gerd; Koch, Kathrin; Reichenbach, Jurgen R.; Sauer, Heinrich; Schlosser, Ralf G. M.

    2011-01-01

    The formation of new perceptual categories involves learning to extract that information from a wide range of often noisy sensory inputs, which is critical for selecting between a limited number of responses. To identify brain regions involved in visual classification learning under noisy conditions, we developed a task on the basis of the…

  17. Experimental "Microcultures" in Young Children: Identifying Biographic, Cognitive, and Social Predictors of Information Transmission

    ERIC Educational Resources Information Center

    Flynn, Emma; Whiten, Andrew

    2012-01-01

    In one of the first open diffusion experiments with young children, a tool-use task that afforded multiple methods to extract an enclosed reward and a child model habitually using one of these methods were introduced into different playgroups. Eighty-eight children, ranging in age from 2 years 8 months to 4 years 5 months, participated. Measures…

  18. Evaluating the Impact of Depth Cue Salience in Working Three-Dimensional Mental Rotation Tasks by Means of Psychometric Experiments

    ERIC Educational Resources Information Center

    Arendasy, Martin; Sommer, Markus; Hergovich, Andreas; Feldhammer, Martina

    2011-01-01

    The gender difference in three-dimensional mental rotation is well documented in the literature. In this article we combined automatic item generation, (quasi-)experimental research designs and item response theory models of change measurement to evaluate the effect of the ability to extract the depth information conveyed in the two-dimensional…

  19. Space Research Data Management in the National Aeronautics and Space Administration

    NASA Technical Reports Server (NTRS)

    Ludwig, G. H.

    1986-01-01

    Space related scientific research has passed through a natural evolutionary process. The task of extracting the meaningful information from the raw data is highly involved and will require data processing capabilities that do not exist today. The results are presented of a three year examination of this subject, using an earlier report as a starting point. The general conclusion is that there are areas in which NASA's data management practices can be improved and recommends specific actions. These actions will enhance NASA's ability to extract more of the potential data and to capitalize on future opportunities.

  20. Event extraction of bacteria biotopes: a knowledge-intensive NLP-based approach

    PubMed Central

    2012-01-01

    Background: Bacteria biotopes cover a wide range of diverse habitats including animal and plant hosts, natural, medical and industrial environments. The high volume of publications in the microbiology domain provides a rich source of up-to-date information on bacteria biotopes. This information, as found in scientific articles, is expressed in natural language and is rarely available in a structured format, such as a database. This information is of great importance for fundamental research and microbiology applications (e.g., medicine, agronomy, food, bioenergy). The automatic extraction of this information from texts will provide a great benefit to the field. Methods: We present a new method for extracting relationships between bacteria and their locations using the Alvis framework. Recognition of bacteria and their locations was achieved using a pattern-based approach and domain lexical resources. For the detection of environment locations, we propose a new approach that combines lexical information and the syntactic-semantic analysis of corpus terms to overcome the incompleteness of lexical resources. Bacteria location relations extend over sentence borders, and we developed domain-specific rules for dealing with bacteria anaphors. Results: We participated in the BioNLP 2011 Bacteria Biotope (BB) task with the Alvis system. Official evaluation results show that it achieves the best performance of participating systems. New developments since then have increased the F-score by 4.1 points. Conclusions: We have shown that the combination of semantic analysis and domain-adapted resources is both effective and efficient for event information extraction in the bacteria biotope domain. We plan to adapt the method to deal with a larger set of location types and a large-scale scientific article corpus to enable microbiologists to integrate and use the extracted knowledge in combination with experimental data. PMID:22759462

  1. Event extraction of bacteria biotopes: a knowledge-intensive NLP-based approach.

    PubMed

    Ratkovic, Zorana; Golik, Wiktoria; Warnier, Pierre

    2012-06-26

    Bacteria biotopes cover a wide range of diverse habitats including animal and plant hosts, natural, medical and industrial environments. The high volume of publications in the microbiology domain provides a rich source of up-to-date information on bacteria biotopes. This information, as found in scientific articles, is expressed in natural language and is rarely available in a structured format, such as a database. This information is of great importance for fundamental research and microbiology applications (e.g., medicine, agronomy, food, bioenergy). The automatic extraction of this information from texts will provide a great benefit to the field. We present a new method for extracting relationships between bacteria and their locations using the Alvis framework. Recognition of bacteria and their locations was achieved using a pattern-based approach and domain lexical resources. For the detection of environment locations, we propose a new approach that combines lexical information and the syntactic-semantic analysis of corpus terms to overcome the incompleteness of lexical resources. Bacteria location relations extend over sentence borders, and we developed domain-specific rules for dealing with bacteria anaphors. We participated in the BioNLP 2011 Bacteria Biotope (BB) task with the Alvis system. Official evaluation results show that it achieves the best performance of participating systems. New developments since then have increased the F-score by 4.1 points. We have shown that the combination of semantic analysis and domain-adapted resources is both effective and efficient for event information extraction in the bacteria biotope domain. We plan to adapt the method to deal with a larger set of location types and a large-scale scientific article corpus to enable microbiologists to integrate and use the extracted knowledge in combination with experimental data.

  2. Variability extraction and modeling for product variants.

    PubMed

    Linsbauer, Lukas; Lopez-Herrejon, Roberto Erick; Egyed, Alexander

    2017-01-01

    Fast-changing hardware and software technologies in addition to larger and more specialized customer bases demand software tailored to meet very diverse requirements. Software development approaches that aim at capturing this diversity on a single consolidated platform often require large upfront investments, e.g., time or budget. Alternatively, companies resort to developing one variant of a software product at a time by reusing as much as possible from already-existing product variants. However, identifying and extracting the parts to reuse is an error-prone and inefficient task compounded by the typically large number of product variants. Hence, more disciplined and systematic approaches are needed to cope with the complexity of developing and maintaining sets of product variants. Such approaches require detailed information about the product variants, the features they provide and their relations. In this paper, we present an approach to extract such variability information from product variants. It identifies traces from features and feature interactions to their implementation artifacts, and computes their dependencies. This work can be useful in many scenarios ranging from ad hoc development approaches such as clone-and-own to systematic reuse approaches such as software product lines. We applied our variability extraction approach to six case studies and provide a detailed evaluation. The results show that the extracted variability information is consistent with the variability in our six case study systems given by their variability models and available product variants.

  3. Differential impact of continuous theta-burst stimulation over left and right DLPFC on planning.

    PubMed

    Kaller, Christoph P; Heinze, Katharina; Frenkel, Annekathrein; Läppchen, Claus H; Unterrainer, Josef M; Weiller, Cornelius; Lange, Rüdiger; Rahm, Benjamin

    2013-01-01

    Most neuroimaging studies on planning report bilateral activations of the dorsolateral prefrontal cortex (dlPFC). Recently, these concurrent activations of left and right dlPFC have been shown to double dissociate with different cognitive demands imposed by the planning task: Higher demands on the extraction of task-relevant information led to stronger activation in left dlPFC, whereas higher demands on the integration of interdependent information into a coherent action sequence entailed stronger activation of right dlPFC. Here, we used continuous theta-burst stimulation (cTBS) to investigate the supposed causal structure-function mapping underlying this double dissociation. Two groups of healthy subjects (left-lateralized stimulation, n = 26; right-lateralized stimulation, n = 26) were tested within-subject on a variant of the Tower of London task following either real cTBS over dlPFC or sham stimulation over posterior parietal cortex. Results revealed that, irrespective of specific task demands, cTBS over left and right dlPFC was associated with a global decrease and increase, respectively, in initial planning times compared to sham stimulation. Moreover, no interaction between task demands and stimulation type (real vs. sham) and/or stimulation side (left vs. right hemisphere) were found. Together, against expectations from previous neuroimaging data, lateralized cTBS did not lead to planning-parameter specific changes in performance, but instead revealed a global asymmetric pattern of faster versus slower task processing after left versus right cTBS. This global asymmetry in the absence of any task-parameter specific impact of cTBS suggests that different levels of information processing may span colocalized, but independent axes of functional lateralization in the dlPFC. Copyright © 2011 Wiley Periodicals, Inc.

  4. Active learning-based information structure analysis of full scientific articles and two applications for biomedical literature review.

    PubMed

    Guo, Yufan; Silins, Ilona; Stenius, Ulla; Korhonen, Anna

    2013-06-01

    Techniques that are capable of automatically analyzing the information structure of scientific articles could be highly useful for improving information access to biomedical literature. However, most existing approaches rely on supervised machine learning (ML) and substantial labeled data that are expensive to develop and apply to different sub-fields of biomedicine. Recent research shows that minimal supervision is sufficient for fairly accurate information structure analysis of biomedical abstracts. However, is it realistic for full articles given their high linguistic and informational complexity? We introduce and release a novel corpus of 50 biomedical articles annotated according to the Argumentative Zoning (AZ) scheme, and investigate active learning with one of the most widely used ML models, Support Vector Machines (SVM), on this corpus. Additionally, we introduce two novel applications that use AZ to support real-life literature review in biomedicine via question answering and summarization. We show that active learning with SVM trained on 500 labeled sentences (6% of the corpus) performs surprisingly well with the accuracy of 82%, just 2% lower than fully supervised learning. In our question answering task, biomedical researchers find relevant information significantly faster from AZ-annotated than unannotated articles. In the summarization task, sentences extracted from particular zones are significantly more similar to gold standard summaries than those extracted from particular sections of full articles. These results demonstrate that active learning of full articles' information structure is indeed realistic and the accuracy is high enough to support real-life literature review in biomedicine. The annotated corpus, our AZ classifier and the two novel applications are available at http://www.cl.cam.ac.uk/yg244/12bioinfo.html
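
    The abstract does not say which query strategy drives the active learning, so the following Python sketch is only an illustration of one common choice, margin-based uncertainty sampling, reduced to a binary classifier for brevity. The function name, the margin values and the batch size are all hypothetical; in practice the margins would come from an already trained SVM.

```python
def select_queries(margins, batch_size):
    """Return indices of the unlabeled items closest to the decision boundary.

    margins: signed distances to the separating hyperplane, one per unlabeled
             sentence, as produced by some already trained binary classifier.
    """
    # A small absolute margin means the classifier is least certain there,
    # so those sentences are the most informative ones to annotate next.
    ranked = sorted(range(len(margins)), key=lambda i: abs(margins[i]))
    return ranked[:batch_size]


# Hypothetical margins for six unlabeled sentences (illustrative values only).
margins = [1.7, -0.05, 0.9, 0.2, -2.3, -0.4]
to_annotate = select_queries(margins, batch_size=2)
```

    Each round would label the selected sentences, retrain the classifier, and recompute the margins, which is how a small labeled set such as the 500 sentences reported here can be accumulated.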

  5. Occupational exposure to silica in construction workers: a literature-based exposure database.

    PubMed

    Beaudry, Charles; Lavoué, Jérôme; Sauvé, Jean-François; Bégin, Denis; Senhaji Rhazi, Mounia; Perrault, Guy; Dion, Chantal; Gérin, Michel

    2013-01-01

    We created an exposure database of respirable crystalline silica levels in the construction industry from the literature. We extracted silica and dust exposure levels in publications reporting silica exposure levels or quantitative evaluations of control effectiveness published in or after 1990. The database contains 6118 records (2858 of respirable crystalline silica) extracted from 115 sources, summarizing 11,845 measurements. Four hundred and eighty-eight records represent summarized exposure levels instead of individual values. For these records, the reported summary parameters were standardized into a geometric mean and a geometric standard deviation. Each record is associated with 80 characteristics, including information on trade, task, materials, tools, sampling strategy, analytical methods, and control measures. Although the database was constructed in French, 38 essential variables were standardized and translated into English. The data span the period 1974-2009, with 92% of the records corresponding to personal measurements. Thirteen standardized trades and 25 different standardized tasks are associated with at least five individual silica measurements. Trade-specific respirable crystalline silica geometric means vary from 0.01 (plumber) to 0.30 mg/m³ (tunnel construction skilled labor), while tasks vary from 0.01 (six categories, including sanding and electrical maintenance) to 1.59 mg/m³ (abrasive blasting). Despite limitations associated with the use of literature data, this database can be analyzed using meta-analytical and multivariate techniques and currently represents the most important source of exposure information about silica exposure in the construction industry. It is available on request to the research community.
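
    The abstract states that summary parameters were standardized into a geometric mean (GM) and a geometric standard deviation (GSD) but does not give the conversion rules. The Python sketch below shows two textbook calculations under stated assumptions: GM and GSD from individual measurements, and GM and GSD from an arithmetic mean and standard deviation assuming a lognormal distribution. Function names and numbers are hypothetical and are not taken from the database.

```python
import math
import statistics


def gm_gsd_from_samples(values):
    """GM and GSD of positive measurements, computed on the log scale."""
    logs = [math.log(v) for v in values]
    gm = math.exp(statistics.mean(logs))
    gsd = math.exp(statistics.stdev(logs))  # sample SD of the logs
    return gm, gsd


def gm_gsd_from_mean_sd(arith_mean, arith_sd):
    """GM and GSD from an arithmetic mean and SD, ASSUMING lognormal data."""
    cv2 = (arith_sd / arith_mean) ** 2          # squared coefficient of variation
    sigma = math.sqrt(math.log(1.0 + cv2))      # SD of ln(x)
    gm = arith_mean / math.sqrt(1.0 + cv2)      # exp(mean of ln(x))
    return gm, math.exp(sigma)


# Hypothetical exposure values in mg/m^3 (illustrative only).
gm, gsd = gm_gsd_from_samples([0.01, 0.1, 1.0])

# Round trip: build the arithmetic mean/SD of a lognormal with known
# parameters, then check that the conversion returns those parameters.
mu, sigma = math.log(0.05), math.log(2.5)
am = math.exp(mu + sigma ** 2 / 2)
sd = am * math.sqrt(math.exp(sigma ** 2) - 1.0)
gm_back, gsd_back = gm_gsd_from_mean_sd(am, sd)
```

    Working on the log scale is what makes records reported in different summary forms comparable once they are expressed as a GM and GSD pair.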

  6. Large Margin Multi-Modal Multi-Task Feature Extraction for Image Classification.

    PubMed

    Yong Luo; Yonggang Wen; Dacheng Tao; Jie Gui; Chao Xu

    2016-01-01

    The features used in many image analysis-based applications are frequently of very high dimension. Feature extraction offers several advantages in high-dimensional cases, and many recent studies have used multi-task feature extraction approaches, which often outperform single-task feature extraction approaches. However, most of these methods are limited in that they only consider data represented by a single type of feature, even though features usually represent images from multiple modalities. We, therefore, propose a novel large margin multi-modal multi-task feature extraction (LM3FE) framework for handling multi-modal features for image classification. In particular, LM3FE simultaneously learns the feature extraction matrix for each modality and the modality combination coefficients. In this way, LM3FE not only handles correlated and noisy features, but also utilizes the complementarity of different modalities to further help reduce feature redundancy in each modality. The large margin principle employed also helps to extract strongly predictive features, so that they are more suitable for prediction (e.g., classification). An alternating algorithm is developed for problem optimization, and each subproblem can be efficiently solved. Experiments on two challenging real-world image data sets demonstrate the effectiveness and superiority of the proposed method.

  7. The feasibility of using natural language processing to extract clinical information from breast pathology reports.

    PubMed

    Buckley, Julliette M; Coopey, Suzanne B; Sharko, John; Polubriaginof, Fernanda; Drohan, Brian; Belli, Ahmet K; Kim, Elizabeth M H; Garber, Judy E; Smith, Barbara L; Gadd, Michele A; Specht, Michelle C; Roche, Constance A; Gudewicz, Thomas M; Hughes, Kevin S

    2012-01-01

    The opportunity to integrate clinical decision support systems into clinical practice is limited due to the lack of structured, machine readable data in the current format of the electronic health record. Natural language processing has been designed to convert free text into machine readable data. The aim of the current study was to ascertain the feasibility of using natural language processing to extract clinical information from >76,000 breast pathology reports. APPROACH AND PROCEDURE: Breast pathology reports from three institutions were analyzed using natural language processing software (Clearforest, Waltham, MA) to extract information on a variety of pathologic diagnoses of interest. Data tables were created from the extracted information according to date of surgery, side of surgery, and medical record number. The variety of ways in which each diagnosis could be represented was recorded, as a means of demonstrating the complexity of machine interpretation of free text. There was widespread variation in how pathologists reported common pathologic diagnoses. We report, for example, 124 ways of saying invasive ductal carcinoma and 95 ways of saying invasive lobular carcinoma. There were >4000 ways of saying invasive ductal carcinoma was not present. Natural language processor sensitivity and specificity were 99.1% and 96.5% when compared to expert human coders. We have demonstrated how a large body of free text medical information such as seen in breast pathology reports, can be converted to a machine readable format using natural language processing, and described the inherent complexities of the task.
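
    The sensitivity and specificity figures quoted in this abstract follow the usual definitions against a reference standard (here, expert human coders). A minimal Python sketch of the calculation is given below; the function name and the ten example labels are hypothetical and are not data from the study.

```python
def sensitivity_specificity(predicted, reference):
    """Sensitivity and specificity of boolean predictions against a reference.

    predicted: NLP output per report (True = diagnosis reported as present).
    reference: expert-coder label per report, treated as ground truth.
    """
    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p and r)
    fn = sum(1 for p, r in pairs if not p and r)
    tn = sum(1 for p, r in pairs if not p and not r)
    fp = sum(1 for p, r in pairs if p and not r)
    sensitivity = tp / (tp + fn)   # share of true positives that were found
    specificity = tn / (tn + fp)   # share of true negatives that were found
    return sensitivity, specificity


# Hypothetical labels for ten reports (illustrative only).
reference = [True, True, True, True, False, False, False, False, False, False]
predicted = [True, True, True, False, False, False, False, False, False, True]
sens, spec = sensitivity_specificity(predicted, reference)
```

    With these example labels there are 3 true positives, 1 false negative, 5 true negatives and 1 false positive.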

  8. Task-dependent recurrent dynamics in visual cortex

    PubMed Central

    Tajima, Satohiro; Koida, Kowa; Tajima, Chihiro I; Suzuki, Hideyuki; Aihara, Kazuyuki; Komatsu, Hidehiko

    2017-01-01

    The capacity for flexible sensory-action association in animals has been related to context-dependent attractor dynamics outside the sensory cortices. Here, we report a line of evidence that flexibly modulated attractor dynamics during task switching are already present in the higher visual cortex in macaque monkeys. With a nonlinear decoding approach, we can extract the particular aspect of the neural population response that reflects the task-induced emergence of bistable attractor dynamics in a neural population, which could be obscured by standard unsupervised dimensionality reductions such as PCA. The dynamical modulation selectively increases the information relevant to task demands, indicating that such modulation is beneficial for perceptual decisions. A computational model that features nonlinear recurrent interaction among neurons with a task-dependent background input replicates the key properties observed in the experimental data. These results suggest that the context-dependent attractor dynamics involving the sensory cortex can underlie flexible perceptual abilities. DOI: http://dx.doi.org/10.7554/eLife.26868.001 PMID:28737487

  9. Reinforcement learning in computer vision

    NASA Astrophysics Data System (ADS)

    Bernstein, A. V.; Burnaev, E. V.

    2018-04-01

    Machine learning has become one of the basic technologies used in solving various computer vision tasks such as feature detection, image segmentation, object recognition and tracking. In many applications, complex systems such as robots are equipped with visual sensors from which they learn the state of the surrounding environment by solving the corresponding computer vision tasks. Solutions of these tasks are used for making decisions about possible future actions. It is therefore not surprising that, when solving computer vision tasks, we should take into account special aspects of their subsequent application in model-based predictive control. Reinforcement learning is one of the modern machine learning technologies in which learning is carried out through interaction with the environment. In recent years, reinforcement learning has been used both for solving applied tasks such as processing and analysis of visual information, and for solving specific computer vision problems such as filtering, extracting image features, localizing objects in scenes, and many others. The paper briefly describes reinforcement learning and its use for solving computer vision problems.

  10. The BEL information extraction workflow (BELIEF): evaluation in the BioCreative V BEL and IAT track

    PubMed Central

    Madan, Sumit; Hodapp, Sven; Senger, Philipp; Ansari, Sam; Szostak, Justyna; Hoeng, Julia; Peitsch, Manuel; Fluck, Juliane

    2016-01-01

    Network-based approaches have become extremely important in systems biology to achieve a better understanding of biological mechanisms. For network representation, the Biological Expression Language (BEL) is well designed to collate findings from the scientific literature into biological network models. To facilitate encoding and biocuration of such findings in BEL, a BEL Information Extraction Workflow (BELIEF) was developed. BELIEF provides a web-based curation interface, the BELIEF Dashboard, that incorporates text mining techniques to support the biocurator in the generation of BEL networks. The underlying UIMA-based text mining pipeline (BELIEF Pipeline) uses several named entity recognition processes and relationship extraction methods to detect concepts and BEL relationships in literature. The BELIEF Dashboard allows easy curation of the automatically generated BEL statements and their context annotations. Resulting BEL statements and their context annotations can be syntactically and semantically verified to ensure consistency in the BEL network. In summary, the workflow supports experts in different stages of systems biology network building. Based on the BioCreative V BEL track evaluation, we show that the BELIEF Pipeline automatically extracts relationships with an F-score of 36.4% and fully correct statements can be obtained with an F-score of 30.8%. Participation in the BioCreative V Interactive task (IAT) track with BELIEF revealed a systems usability scale (SUS) of 67. Considering the complexity of the task for new users—learning BEL, working with a completely new interface, and performing complex curation—a score so close to the overall SUS average highlights the usability of BELIEF. Database URL: BELIEF is available at http://www.scaiview.com/belief/ PMID:27694210
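
    The SUS value quoted above comes from a ten-item questionnaire answered on a 1 to 5 scale. A minimal sketch of the standard SUS scoring rule follows; the function name and the sample responses are hypothetical and are not data from the BioCreative evaluation:

```python
def sus_score(responses):
    """Score one System Usability Scale questionnaire (ten answers, each 1-5)."""
    if len(responses) != 10:
        raise ValueError("SUS needs exactly ten responses")
    total = 0
    for index, answer in enumerate(responses):
        if index % 2 == 0:
            total += answer - 1   # odd-numbered items (1, 3, 5, 7, 9) are positively worded
        else:
            total += 5 - answer   # even-numbered items (2, 4, 6, 8, 10) are negatively worded
    return total * 2.5            # rescale the 0-40 sum to a 0-100 score


# Hypothetical answers from a single participant.
print(sus_score([4, 2, 4, 2, 4, 2, 4, 2, 3, 3]))
```

    A study-level SUS figure is then usually the mean of the per-participant scores.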

  11. The BEL information extraction workflow (BELIEF): evaluation in the BioCreative V BEL and IAT track.

    PubMed

    Madan, Sumit; Hodapp, Sven; Senger, Philipp; Ansari, Sam; Szostak, Justyna; Hoeng, Julia; Peitsch, Manuel; Fluck, Juliane

    2016-01-01

    Network-based approaches have become extremely important in systems biology to achieve a better understanding of biological mechanisms. For network representation, the Biological Expression Language (BEL) is well designed to collate findings from the scientific literature into biological network models. To facilitate encoding and biocuration of such findings in BEL, a BEL Information Extraction Workflow (BELIEF) was developed. BELIEF provides a web-based curation interface, the BELIEF Dashboard, that incorporates text mining techniques to support the biocurator in the generation of BEL networks. The underlying UIMA-based text mining pipeline (BELIEF Pipeline) uses several named entity recognition processes and relationship extraction methods to detect concepts and BEL relationships in literature. The BELIEF Dashboard allows easy curation of the automatically generated BEL statements and their context annotations. Resulting BEL statements and their context annotations can be syntactically and semantically verified to ensure consistency in the BEL network. In summary, the workflow supports experts in different stages of systems biology network building. Based on the BioCreative V BEL track evaluation, we show that the BELIEF Pipeline automatically extracts relationships with an F-score of 36.4% and fully correct statements can be obtained with an F-score of 30.8%. Participation in the BioCreative V Interactive task (IAT) track with BELIEF revealed a System Usability Scale (SUS) score of 67. Considering the complexity of the task for new users (learning BEL, working with a completely new interface, and performing complex curation), a score so close to the overall SUS average highlights the usability of BELIEF. Database URL: BELIEF is available at http://www.scaiview.com/belief/. © The Author(s) 2016. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  12. Temporal data representation, normalization, extraction, and reasoning: A review from clinical domain

    PubMed Central

    Madkour, Mohcine; Benhaddou, Driss; Tao, Cui

    2016-01-01

    Background and Objective: We live our lives by the calendar and the clock, but time is also an abstraction, even an illusion. The sense of time can be both domain-specific and complex, and is often left implicit, requiring significant domain knowledge to accurately recognize and harness. In the clinical domain, the momentum gained from recent advances in infrastructure and governance practices has enabled the collection of a tremendous amount of data at each moment in time. Electronic Health Records (EHRs) have paved the way to making these data available for practitioners and researchers. However, temporal data representation, normalization, extraction and reasoning are essential for mining such massive data and therefore for constructing the clinical timeline. The objective of this work is to provide an overview of the problem of constructing a timeline at the clinical point of care and to summarize the state of the art in processing temporal information in clinical narratives. Methods: This review surveys the methods used in three important areas: modeling and representing time, medical NLP methods for extracting time, and methods for time reasoning and processing. The review emphasizes the existing gap between present methods and semantic web technologies, and considers possible combinations of the two. Results: The main finding of this review is the importance of time processing not only in constructing timelines and clinical decision support systems but also as a vital component of EHR data models and operations. Conclusions: Extracting temporal information from clinical narratives is a challenging task. The inclusion of ontologies and the semantic web will lead to better assessment of the annotation task and, together with medical NLP techniques, will help resolve granularity and co-reference resolution problems. PMID:27040831

  13. Computer-based evaluation of Alzheimer's disease and mild cognitive impairment patients during a picture description task.

    PubMed

    Hernández-Domínguez, Laura; Ratté, Sylvie; Sierra-Martínez, Gerardo; Roche-Bergua, Andrés

    2018-01-01

    We present a methodology to automatically evaluate the performance of patients during picture description tasks. Transcriptions and audio recordings of the Cookie Theft picture description task were used. With 25 healthy elderly control (HC) samples and an information coverage measure, we automatically generated a population-specific referent. We then assessed 517 transcriptions (257 Alzheimer's disease [AD], 217 HC, and 43 mild cognitively impaired samples) according to their informativeness and pertinence against this referent. We extracted linguistic and phonetic metrics which previous literature correlated to early-stage AD. We trained two learners to distinguish HCs from cognitively impaired individuals. Our measures significantly (P < .001) correlated with the severity of the cognitive impairment and the Mini-Mental State Examination score. The classification sensitivity was 81% (area under the curve of receiver operating characteristics = 0.79) and 85% (area under the curve of receiver operating characteristics = 0.76) between HCs and AD and between HCs and AD and mild cognitively impaired, respectively. An automated assessment of a picture description task could assist clinicians in the detection of early signs of cognitive impairment and AD.

  14. Social Experience Does Not Abolish Cultural Diversity in Eye Movements

    PubMed Central

    Kelly, David J.; Jack, Rachael E.; Miellet, Sébastien; De Luca, Emanuele; Foreman, Kay; Caldara, Roberto

    2011-01-01

    Adults from Eastern (e.g., China) and Western (e.g., USA) cultural groups display pronounced differences in a range of visual processing tasks. For example, the eye movement strategies used for information extraction during a variety of face processing tasks (e.g., identification and categorization of facial expressions of emotion) differ across cultural groups. Many previous studies reporting such differences have asserted that culture itself is responsible for shaping the way we process visual information, yet this has never been directly investigated. In the current study, we assessed the relative contribution of genetic and cultural factors by testing face processing in a population of British Born Chinese adults using face recognition and expression classification tasks. Contrary to predictions made by the cultural differences framework, the majority of British Born Chinese adults deployed “Eastern” eye movement strategies, while approximately 25% of participants displayed “Western” strategies. Furthermore, the cultural eye movement strategies used by individuals were consistent across recognition and expression tasks. These findings suggest that “culture” alone cannot straightforwardly account for diversity in eye movement patterns. Instead, a more complex understanding of how the environment and individual experiences can influence the mechanisms that govern visual processing is required. PMID:21886626

  15. A system for classifying disease comorbidity status from medical discharge summaries using automated hotspot and negated concept detection.

    PubMed

    Ambert, Kyle H; Cohen, Aaron M

    2009-01-01

    OBJECTIVE: Free-text clinical reports serve as an important part of patient care management and clinical documentation of patient disease and treatment status. Free-text notes are commonplace in medical practice, but remain an under-used source of information for clinical and epidemiological research, as well as personalized medicine. The authors explore the challenges associated with automatically extracting information from clinical reports using their submission to the Integrating Informatics with Biology and the Bedside (i2b2) 2008 Natural Language Processing Obesity Challenge Task. DESIGN: A text mining system for classifying patient comorbidity status, based on the information contained in clinical reports. The approach of the authors incorporates a variety of automated techniques, including hot-spot filtering, negated concept identification, zero-vector filtering, weighting by inverse class-frequency, and error-correcting of output codes with linear support vector machines. MEASUREMENTS: Performance was evaluated in terms of the macroaveraged F1 measure. RESULTS: The automated system performed well against manual expert rule-based systems, finishing fifth in the Challenge's intuitive task, and 13th in the textual task. CONCLUSIONS: The system demonstrates that effective comorbidity status classification by an automated system is possible.
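
    The macroaveraged F1 measure named above gives every class the same weight, however rare it is. A minimal sketch of the computation follows; the function names, class labels and sample data are hypothetical and do not come from the challenge:

```python
def f1(tp, fp, fn):
    """F1 for one class; defined as 0.0 when the class has no true positives."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(gold, predicted):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(gold) | set(predicted))
    pairs = list(zip(gold, predicted))
    scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in pairs)
        fp = sum(g != label and p == label for g, p in pairs)
        fn = sum(g == label and p != label for g, p in pairs)
        scores.append(f1(tp, fp, fn))
    return sum(scores) / len(scores)


# Hypothetical labels: Y = present, N = absent, U = unmentioned.
gold = ["Y", "Y", "N", "N", "U"]
pred = ["Y", "N", "N", "N", "U"]
print(round(macro_f1(gold, pred), 4))
```

    Because each class counts equally, a system that ignores a rare class is penalized more heavily than it would be under plain accuracy.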

  16. Predictive classification of self-paced upper-limb analytical movements with EEG.

    PubMed

    Ibáñez, Jaime; Serrano, J I; del Castillo, M D; Minguez, J; Pons, J L

    2015-11-01

    The extent to which the electroencephalographic activity allows the characterization of movements with the upper limb is an open question. This paper describes the design and validation of a classifier of upper-limb analytical movements based on electroencephalographic activity extracted from intervals preceding self-initiated movement tasks. Features selected for the classification are subject specific and associated with the movement tasks. Further tests are performed to reject the hypothesis that other information different from the task-related cortical activity is being used by the classifiers. Six healthy subjects were measured performing self-initiated upper-limb analytical movements. A Bayesian classifier was used to classify among seven different kinds of movements. Features considered covered the alpha and beta bands. A genetic algorithm was used to optimally select a subset of features for the classification. An average accuracy of 62.9 ± 7.5% was reached, which was above the baseline level observed with the proposed methodology (30.2 ± 4.3%). The study shows how the electroencephalography carries information about the type of analytical movement performed with the upper limb and how it can be decoded before the movement begins. In neurorehabilitation environments, this information could be used for monitoring and assisting purposes.

  17. Image Analysis and Modeling

    DTIC Science & Technology

    1976-03-01

    This report summarizes the results of the research program on Image Analysis and Modeling supported by the Defense Advanced Research Projects Agency...The objective is to achieve a better understanding of image structure and to use this knowledge to develop improved image models for use in image ... analysis and processing tasks such as information extraction, image enhancement and restoration, and coding. The ultimate objective of this research is

  18. Machine Reading for Extraction of Bacteria and Habitat Taxonomies

    PubMed Central

    Kordjamshidi, Parisa; Massa, Wouter; Provoost, Thomas; Moens, Marie-Francine

    2015-01-01

    There is a vast amount of scientific literature available from various resources such as the internet. Automating the extraction of knowledge from these resources is very helpful for biologists to easily access this information. This paper presents a system to extract the bacteria and their habitats, as well as the relations between them. We investigate to what extent current techniques are suited for this task and test a variety of models in this regard. We detect entities in a biological text and map the habitats into a given taxonomy. Our model uses a linear chain Conditional Random Field (CRF). For the prediction of relations between the entities, a model based on logistic regression is built. Designing a system upon these techniques, we explore several improvements for both the generation and selection of good candidates. One contribution to this lies in the extended flexibility of our ontology mapper that uses an advanced boundary detection and assigns the taxonomy elements to the detected habitats. Furthermore, we discover value in the combination of several distinct candidate generation rules. Using these techniques, we show results that significantly improve upon the state of the art for the BioNLP Bacteria Biotopes task. PMID:27077141

  19. Brain-computer interface analysis of a dynamic visuo-motor task.

    PubMed

    Logar, Vito; Belič, Aleš

    2011-01-01

    The area of brain-computer interfaces (BCIs) represents one of the more interesting fields in neurophysiological research, since it investigates the development of the machines that perform different transformations of the brain's "thoughts" to certain pre-defined actions. Experimental studies have reported some successful implementations of BCIs; however, much of the field still remains unexplored. According to some recent reports the phase coding of informational content is an important mechanism in the brain's function and cognition, and has the potential to explain various mechanisms of the brain's data transfer, but it has yet to be scrutinized in the context of brain-computer interface. Therefore, if the mechanism of phase coding is plausible, one should be able to extract the phase-coded content, carried by brain signals, using appropriate signal-processing methods. In our previous studies we have shown that by using a phase-demodulation-based signal-processing approach it is possible to decode some relevant information on the current motor action in the brain from electroencephalographic (EEG) data. In this paper the authors would like to present a continuation of their previous work on the brain-information-decoding analysis of visuo-motor (VM) tasks. The present study shows that EEG data measured during more complex, dynamic visuo-motor (dVM) tasks carries enough information about the currently performed motor action to be successfully extracted by using the appropriate signal-processing and identification methods. The aim of this paper is therefore to present a mathematical model, which by means of the EEG measurements as its inputs predicts the course of the wrist movements as applied by each subject during the task in simulated or real time (BCI analysis). However, several modifications to the existing methodology are needed to achieve optimal decoding results and a real-time, data-processing ability. 
The information extracted from the EEG could, therefore, be further used for the development of a closed-loop, non-invasive, brain-computer interface. For the case of this study two types of measurements were performed, i.e., the electroencephalographic (EEG) signals and the wrist movements were measured simultaneously, during the subject's performance of a dynamic visuo-motor task. Wrist-movement predictions were computed by using the EEG data-processing methodology of double brain-rhythm filtering, double phase demodulation and double principal component analyses (PCA), each with a separate set of parameters. For the movement-prediction model a fuzzy inference system was used. The results have shown that the EEG signals measured during the dVM tasks carry enough information about the subjects' wrist movements for them to be successfully decoded using the presented methodology. Reasonably high values of the correlation coefficients suggest that the validation of the proposed approach is satisfactory. Moreover, since the causality of the rhythm filtering and the PCA transformation has been achieved, we have shown that these methods can also be used in a real-time, brain-computer interface. The study revealed that using non-causal, optimized methods yields better prediction results in comparison with the causal, non-optimized methodology; however, taking into account that the causality of these methods allows real-time processing, the minor decrease in prediction quality is acceptable. The study suggests that the methodology that was proposed in our previous studies is also valid for identifying the EEG-coded content during dVM tasks, albeit with various modifications, which allow better prediction results and real-time data processing. The results have shown that wrist movements can be predicted in simulated or real time; however, the results of the non-causal, optimized methodology (simulated) are slightly better. 
Nevertheless, the study has revealed that these methods should be suitable for use in the development of a non-invasive, brain-computer interface. Copyright © 2010 Elsevier B.V. All rights reserved.

  20. Studies of Implicit Prototype Extraction In Patients with Mild Cognitive Impairment and Early Alzheimer’s Disease

    PubMed Central

    Nosofsky, Robert M.; Denton, Stephen E.; Zaki, Safa R.; Murphy-Knudsen, Anne F.; Unverzagt, Frederick W.

    2013-01-01

    Studies of incidental category learning support the hypothesis of an implicit prototype-extraction system which is distinct from explicit memory (Smith, 2008). In those studies, patients with explicit-memory impairments due to damage to the medial-temporal lobe performed normally in implicit categorization tasks (Bozoki, Grossman, & Smith, 2006; Knowlton & Squire, 1993). However, alternative interpretations are that: i) even people with impairments to a single memory system have sufficient resources to succeed on the particular categorization tasks that have been tested (Nosofsky & Zaki, 1998; Zaki & Nosofsky, 2001); and ii) working memory can be used at time of test to learn the categories (Palmeri & Flanery, 1999). In the present experiments, patients with amnestic mild cognitive impairment or early Alzheimer’s disease were tested in prototype-extraction tasks to examine these possibilities. In a categorization task involving discrete-feature stimuli, the majority of subjects relied on memories for exceedingly few features, even when the task structure strongly encouraged reliance on broad-based prototypes. In a dot-pattern categorization task, even the memory-impaired patients were able to use working memory at time of test to extract the category structure (at least for the stimulus set used in past work). We argue that the results weaken the past case made in favor of a separate system of implicit-prototype extraction. PMID:22746953

  1. Model of experts for decision support in the diagnosis of leukemia patients.

    PubMed

    Corchado, Juan M; De Paz, Juan F; Rodríguez, Sara; Bajo, Javier

    2009-07-01

    Recent advances in the field of biomedicine, specifically in the field of genomics, have led to an increase in the information available for conducting expression analysis. Expression analysis is a technique used in transcriptomics, a branch of genomics that deals with the study of messenger ribonucleic acid (mRNA) and the extraction of information contained in the genes. This increase in information is reflected in the exon arrays, which require the use of new techniques in order to extract the information. The purpose of this study is to provide a tool based on a mixture of experts model that allows the analysis of the information contained in the exon arrays, from which automatic classifications for decision support in diagnoses of leukemia patients can be made. The proposed model integrates several cooperative algorithms characterized by their efficiency for data processing, filtering, classification and knowledge extraction. The Cancer Institute of the University of Salamanca is making an effort to develop tools to automate the evaluation of data and to facilitate the analysis of information. This proposal is a step forward in this direction and the first step toward the development of a mixture of experts tool that integrates different cognitive and statistical approaches to deal with the analysis of exon arrays. The mixture of experts model presented within this work provides great capacities for learning and adaptation to the characteristics of the problem in consideration, using novel algorithms in each of the stages of the analysis process that can be easily configured and combined, and provides results that notably improve those provided by the existing methods for exon arrays analysis. The material used consists of data from exon arrays provided by the Cancer Institute that contain samples from leukemia patients. The methodology used consists of a system based on a mixture of experts. 
Each one of the experts incorporates novel artificial intelligence techniques that improve the process of carrying out various tasks such as pre-processing, filtering, classification and extraction of knowledge. This article will detail the manner in which individual experts are combined so that together they generate a system capable of extracting knowledge, thus permitting patients to be classified in an automatic and efficient manner that is also comprehensible for medical personnel. The system has been tested in a real setting and has been used for classifying patients who suffer from different forms of leukemia at various stages. Personnel from the Cancer Institute supervised and participated throughout the testing period. Preliminary results are promising, notably improving the results obtained with previously used tools. The medical staff from the Cancer Institute considers the tools that have been developed to be positive and very useful in a supporting capacity for carrying out their daily tasks. Additionally the mixture of experts supplies a tool for the extraction of necessary information in order to explain the associations that have been made in simple terms. That is, it permits the extraction of knowledge for each classification made and generalized in order to be used in subsequent classifications. This allows for a large amount of learning and adaptation within the proposed system.

  2. Unsupervised User Similarity Mining in GSM Sensor Networks

    PubMed Central

    Shad, Shafqat Ali; Chen, Enhong

    2013-01-01

    Mobility data has attracted the researchers for the past few years because of its rich context and spatiotemporal nature, where this information can be used for potential applications like early warning system, route prediction, traffic management, advertisement, social networking, and community finding. All the mentioned applications are based on mobility profile building and user trend analysis, where mobility profile building is done through significant places extraction, user's actual movement prediction, and context awareness. However, significant places extraction and user's actual movement prediction for mobility profile building are a trivial task. In this paper, we present the user similarity mining-based methodology through user mobility profile building by using the semantic tagging information provided by user and basic GSM network architecture properties based on unsupervised clustering approach. As the mobility information is in low-level raw form, our proposed methodology successfully converts it to a high-level meaningful information by using the cell-Id location information rather than previously used location capturing methods like GPS, Infrared, and Wifi for profile mining and user similarity mining. PMID:23576905

  3. Knowledge-based expert systems and a proof-of-concept case study for multiple sequence alignment construction and analysis.

    PubMed

    Aniba, Mohamed Radhouene; Siguenza, Sophie; Friedrich, Anne; Plewniak, Frédéric; Poch, Olivier; Marchler-Bauer, Aron; Thompson, Julie Dawn

    2009-01-01

    The traditional approach to bioinformatics analyses relies on independent task-specific services and applications, using different input and output formats, often idiosyncratic, and frequently not designed to inter-operate. In general, such analyses were performed by experts who manually verified the results obtained at each step in the process. Today, the amount of bioinformatics information continuously being produced means that handling the various applications used to study this information presents a major data management and analysis challenge to researchers. It is now impossible to manually analyse all this information and new approaches are needed that are capable of processing the large-scale heterogeneous data in order to extract the pertinent information. We review the recent use of integrated expert systems aimed at providing more efficient knowledge extraction for bioinformatics research. A general methodology for building knowledge-based expert systems is described, focusing on the unstructured information management architecture, UIMA, which provides facilities for both data and process management. A case study involving a multiple alignment expert system prototype called AlexSys is also presented.

  4. Knowledge-based expert systems and a proof-of-concept case study for multiple sequence alignment construction and analysis

    PubMed Central

    Aniba, Mohamed Radhouene; Siguenza, Sophie; Friedrich, Anne; Plewniak, Frédéric; Poch, Olivier; Marchler-Bauer, Aron

    2009-01-01

    The traditional approach to bioinformatics analyses relies on independent task-specific services and applications, using different input and output formats, often idiosyncratic, and frequently not designed to inter-operate. In general, such analyses were performed by experts who manually verified the results obtained at each step in the process. Today, the amount of bioinformatics information continuously being produced means that handling the various applications used to study this information presents a major data management and analysis challenge to researchers. It is now impossible to manually analyse all this information and new approaches are needed that are capable of processing the large-scale heterogeneous data in order to extract the pertinent information. We review the recent use of integrated expert systems aimed at providing more efficient knowledge extraction for bioinformatics research. A general methodology for building knowledge-based expert systems is described, focusing on the unstructured information management architecture, UIMA, which provides facilities for both data and process management. A case study involving a multiple alignment expert system prototype called AlexSys is also presented. PMID:18971242

  5. PREDOSE: a semantic web platform for drug abuse epidemiology using social media.

    PubMed

    Cameron, Delroy; Smith, Gary A; Daniulaityte, Raminta; Sheth, Amit P; Dave, Drashti; Chen, Lu; Anand, Gaurish; Carlson, Robert; Watkins, Kera Z; Falck, Russel

    2013-12-01

    The role of social media in biomedical knowledge mining, including clinical, medical and healthcare informatics, prescription drug abuse epidemiology and drug pharmacology, has become increasingly significant in recent years. Social media offers opportunities for people to share opinions and experiences freely in online communities, which may contribute information beyond the knowledge of domain professionals. This paper describes the development of a novel semantic web platform called PREDOSE (PREscription Drug abuse Online Surveillance and Epidemiology), which is designed to facilitate the epidemiologic study of prescription (and related) drug abuse practices using social media. PREDOSE uses web forum posts and domain knowledge, modeled in a manually created Drug Abuse Ontology (DAO--pronounced dow), to facilitate the extraction of semantic information from User Generated Content (UGC), through combination of lexical, pattern-based and semantics-based techniques. In a previous study, PREDOSE was used to obtain the datasets from which new knowledge in drug abuse research was derived. Here, we report on various platform enhancements, including an updated DAO, new components for relationship and triple extraction, and tools for content analysis, trend detection and emerging patterns exploration, which enhance the capabilities of the PREDOSE platform. Given these enhancements, PREDOSE is now more equipped to impact drug abuse research by alleviating traditional labor-intensive content analysis tasks. Using custom web crawlers that scrape UGC from publicly available web forums, PREDOSE first automates the collection of web-based social media content for subsequent semantic annotation. The annotation scheme is modeled in the DAO, and includes domain specific knowledge such as prescription (and related) drugs, methods of preparation, side effects, and routes of administration. 
The DAO is also used to help recognize three types of data, namely: (1) entities, (2) relationships and (3) triples. PREDOSE then uses a combination of lexical and semantic-based techniques to extract entities and relationships from the scraped content, and a top-down approach for triple extraction that uses patterns expressed in the DAO. In addition, PREDOSE uses publicly available lexicons to identify initial sentiment expressions in text, and then a probabilistic optimization algorithm (from related research) to extract the final sentiment expressions. Together, these techniques enable the capture of fine-grained semantic information, which facilitates search, trend analysis and overall content analysis using social media on prescription drug abuse. Moreover, extracted data are also made available to domain experts for the creation of training and test sets for use in the evaluation and refinement of information extraction techniques. A recent evaluation of the information extraction techniques applied in the PREDOSE platform indicates 85% precision and 72% recall in entity identification, on a manually created gold standard dataset. In another study, PREDOSE achieved 36% precision in relationship identification and 33% precision in triple extraction, through manual evaluation by domain experts. Given the complexity of the relationship and triple extraction tasks and the abstruse nature of social media texts, we interpret these as favorable initial results. Extracted semantic information is currently in use in an online discovery support system, by prescription drug abuse researchers at the Center for Interventions, Treatment and Addictions Research (CITAR) at Wright State University. A comprehensive platform for entity, relationship, triple and sentiment extraction from such abstruse texts has never been developed for drug abuse research. 
PREDOSE has already demonstrated the importance of mining social media by providing data from which new findings in drug abuse research were uncovered. Given the recent platform enhancements, including the refined DAO, components for relationship and triple extraction, and tools for content, trend and emerging pattern analysis, it is expected that PREDOSE will play a significant role in advancing drug abuse epidemiology in future. Copyright © 2013 Elsevier Inc. All rights reserved.
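
    As a rough illustration of how entity-identification figures such as the precision and recall reported above are computed, the sketch below compares a set of extracted entities against a gold standard. The function name, the (post, entity) representation and the toy data are assumptions made for illustration; they are not taken from the PREDOSE platform or its evaluation.

```python
def precision_recall(predicted, gold):
    """Return (precision, recall) for two collections of extracted items."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    # Precision: fraction of extracted items that are in the gold standard.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: fraction of gold-standard items that were extracted.
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical (post_id, entity) pairs; not data from the PREDOSE study.
gold = {(1, "oxycodone"), (1, "nausea"), (2, "buprenorphine"), (2, "snorting")}
predicted = {(1, "oxycodone"), (2, "buprenorphine"), (2, "headache")}

precision, recall = precision_recall(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

    Treating each extraction as an exact-match set member is the simplest scoring choice; a real evaluation might instead allow partial span matches or map surface forms to ontology concepts first.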

  6. Extraction and textural characterization of above-ground areas from aerial stereo pairs: a quality assessment

    NASA Astrophysics Data System (ADS)

    Baillard, C.; Dissard, O.; Jamet, O.; Maître, H.

    Above-ground analysis is key to the reconstruction of urban scenes, but it is a difficult task because of the diversity of the objects involved. We propose a new method for above-ground extraction from an aerial stereo pair that does not require any assumption about object shape or nature. A Digital Surface Model is first produced by a stereoscopic matching stage that preserves discontinuities, and is then processed by a region-based Markovian classification algorithm. The resulting above-ground areas are finally characterized as man-made or natural according to grey-level information. The quality of the results is assessed and discussed.

  7. Physical data measurements and mathematical modelling of simple gas bubble experiments in glass melts

    NASA Technical Reports Server (NTRS)

    Weinberg, Michael C.

    1986-01-01

    In this work consideration is given to the problem of the extraction of physical data information from gas bubble dissolution and growth measurements. The discussion is limited to the analysis of the simplest experimental systems consisting of a single, one component gas bubble in a glassmelt. It is observed that if the glassmelt is highly under- (super-) saturated, then surface tension effects may be ignored, simplifying the task of extracting gas diffusivity values from the measurements. If, in addition, the bubble rise velocity is very small (or very large) the ease of obtaining physical property data is enhanced. Illustrations are given for typical cases.

  8. The smooth (tractor) operator: insights of knowledge engineering.

    PubMed

    Cullen, Ralph H; Smarr, Cory-Ann; Serrano-Baquero, Daniel; McBride, Sara E; Beer, Jenay M; Rogers, Wendy A

    2012-11-01

    The design of and training for complex systems requires in-depth understanding of task demands imposed on users. In this project, we used the knowledge engineering approach (Bowles et al., 2004) to assess the task of mowing in a citrus grove. Knowledge engineering is divided into four phases: (1) Establish goals. We defined specific goals based on the stakeholders involved. The main goal was to identify operator demands to support improvement of the system. (2) Create a working model of the system. We reviewed product literature, analyzed the system, and conducted expert interviews. (3) Extract knowledge. We interviewed tractor operators to understand their knowledge base. (4) Structure knowledge. We analyzed and organized operator knowledge to inform project goals. We categorized the information and developed diagrams to display the knowledge effectively. This project illustrates the benefits of knowledge engineering as a qualitative research method to inform technology design and training. Copyright © 2012 Elsevier Ltd and The Ergonomics Society. All rights reserved.

  9. Information Retrieval and Text Mining Technologies for Chemistry.

    PubMed

    Krallinger, Martin; Rabal, Obdulia; Lourenço, Anália; Oyarzabal, Julen; Valencia, Alfonso

    2017-06-28

    Efficient access to chemical information contained in scientific literature, patents, technical reports, or the web is a pressing need shared by researchers and patent attorneys from different chemical disciplines. Retrieval of important chemical information in most cases starts with finding relevant documents for a particular chemical compound or family. Targeted retrieval of chemical documents is closely connected to the automatic recognition of chemical entities in the text, which commonly involves the extraction of the entire list of chemicals mentioned in a document, including any associated information. In this Review, we provide a comprehensive and in-depth description of fundamental concepts, technical implementations, and current technologies for meeting these information demands. A strong focus is placed on community challenges addressing systems performance, more particularly CHEMDNER and CHEMDNER patents tasks of BioCreative IV and V, respectively. Considering the growing interest in the construction of automatically annotated chemical knowledge bases that integrate chemical information and biological data, cheminformatics approaches for mapping the extracted chemical names into chemical structures and their subsequent annotation together with text mining applications for linking chemistry with biological information are also presented. Finally, future trends and current challenges are highlighted as a roadmap proposal for research in this emerging field.

  10. Sophia: A Expedient UMLS Concept Extraction Annotator.

    PubMed

    Divita, Guy; Zeng, Qing T; Gundlapalli, Adi V; Duvall, Scott; Nebeker, Jonathan; Samore, Matthew H

    2014-01-01

    An opportunity exists for meaningful concept extraction and indexing from large corpora of clinical notes in the Veterans Affairs (VA) electronic medical record. Currently available tools such as MetaMap, cTAKES and HITex do not scale up to address this big data need. Sophia, a rapid UMLS concept extraction annotator, was developed to fulfill a mandate and address extraction where high throughput is needed while preserving performance. We report on the development, testing and benchmarking of Sophia against MetaMap and cTAKES. Sophia demonstrated improved performance on recall as compared to cTAKES and MetaMap (0.71 vs 0.66 and 0.38). The overall f-score was similar to cTAKES and an improvement over MetaMap (0.53 vs 0.57 and 0.43). With regard to speed of processing records, we noted Sophia to be several fold faster than cTAKES and the scaled-out MetaMap service. Sophia offers a viable alternative for high-throughput information extraction tasks.

  11. Sophia: A Expedient UMLS Concept Extraction Annotator

    PubMed Central

    Divita, Guy; Zeng, Qing T; Gundlapalli, Adi V.; Duvall, Scott; Nebeker, Jonathan; Samore, Matthew H.

    2014-01-01

    An opportunity exists for meaningful concept extraction and indexing from large corpora of clinical notes in the Veterans Affairs (VA) electronic medical record. Currently available tools such as MetaMap, cTAKES and HITex do not scale up to address this big data need. Sophia, a rapid UMLS concept extraction annotator, was developed to fulfill a mandate and address extraction where high throughput is needed while preserving performance. We report on the development, testing and benchmarking of Sophia against MetaMap and cTAKES. Sophia demonstrated improved performance on recall as compared to cTAKES and MetaMap (0.71 vs 0.66 and 0.38). The overall f-score was similar to cTAKES and an improvement over MetaMap (0.53 vs 0.57 and 0.43). With regard to speed of processing records, we noted Sophia to be several fold faster than cTAKES and the scaled-out MetaMap service. Sophia offers a viable alternative for high-throughput information extraction tasks. PMID:25954351

  12. A semantic model for multimodal data mining in healthcare information systems.

    PubMed

    Iakovidis, Dimitris; Smailis, Christos

    2012-01-01

    Electronic health records (EHRs) are representative examples of multimodal/multisource data collections, including measurements, images and free texts. The diversity of such information sources and the increasing amounts of medical data produced by healthcare institutes annually pose significant challenges in data mining. In this paper we present a novel semantic model that describes knowledge extracted from the lowest level of a data mining process, where information is represented by multiple features, i.e. measurements or numerical descriptors extracted from measurements, images, texts or other medical data, forming multidimensional feature spaces. Knowledge collected by manual annotation or extracted by unsupervised data mining from one or more feature spaces is modeled through generalized qualitative spatial semantics. This model enables a unified representation of knowledge across multimodal data repositories. It contributes to bridging the semantic gap by enabling direct links between low-level features and higher-level concepts, e.g. describing body parts, anatomies and pathological findings. The proposed model has been developed in web ontology language based on description logics (OWL-DL) and can be applied to a variety of data mining tasks in medical informatics. Its utility is demonstrated for automatic annotation of medical data.

  13. Decoding rule search domain in the left inferior frontal gyrus

    PubMed Central

    Babcock, Laura; Vallesi, Antonino

    2018-01-01

    Traditionally, the left hemisphere has been thought to extract mainly verbal patterns of information, but recent evidence has shown that the left Inferior Frontal Gyrus (IFG) is active during inductive reasoning in both the verbal and spatial domains. We aimed to understand whether the left IFG supports inductive reasoning in a domain-specific or domain-general fashion. To do this we used Multi-Voxel Pattern Analysis to decode the representation of domain during a rule search task. Thirteen participants were asked to extract the rule underlying streams of letters presented in different spatial locations. Each rule was either verbal (letters forming words) or spatial (positions forming geometric figures). Our results show that domain was decodable in the left prefrontal cortex, suggesting that this region represents domain-specific information, rather than processes common to the two domains. A replication study with the same participants tested two years later confirmed these findings, though the individual representations changed, providing evidence for the flexible nature of representations. This study extends our knowledge on the neural basis of goal-directed behaviors and on how information relevant for rule extraction is flexibly mapped in the prefrontal cortex. PMID:29547623

  14. Automating curation using a natural language processing pipeline

    PubMed Central

    Alex, Beatrice; Grover, Claire; Haddow, Barry; Kabadjov, Mijail; Klein, Ewan; Matthews, Michael; Tobin, Richard; Wang, Xinglong

    2008-01-01

    Background: The tasks in BioCreative II were designed to approximate some of the laborious work involved in curating biomedical research papers. The approach to these tasks taken by the University of Edinburgh team was to adapt and extend the existing natural language processing (NLP) system that we have developed as part of a commercial curation assistant. Although this paper concentrates on using NLP to assist with curation, the system can be equally employed to extract types of information from the literature that is immediately relevant to biologists in general. Results: Our system was among the highest performing on the interaction subtasks, and competitive performance on the gene mention task was achieved with minimal development effort. For the gene normalization task, a string matching technique that can be quickly applied to new domains was shown to perform close to average. Conclusion: The technologies being developed were shown to be readily adapted to the BioCreative II tasks. Although high performance may be obtained on individual tasks such as gene mention recognition and normalization, and document classification, tasks in which a number of components must be combined, such as detection and normalization of interacting protein pairs, are still challenging for NLP systems. PMID:18834488

  15. Defect-Repairable Latent Feature Extraction of Driving Behavior via a Deep Sparse Autoencoder

    PubMed Central

    Taniguchi, Tadahiro; Takenaka, Kazuhito; Bando, Takashi

    2018-01-01

    Data representing driving behavior, as measured by various sensors installed in a vehicle, are collected as multi-dimensional sensor time-series data. These data often include redundant information, e.g., both the speed of wheels and the engine speed represent the velocity of the vehicle. Redundant information can be expected to complicate the data analysis, e.g., more factors need to be analyzed; even varying the levels of redundancy can influence the results of the analysis. We assume that the measured multi-dimensional sensor time-series data of driving behavior are generated from low-dimensional data shared by the many types of one-dimensional data of which multi-dimensional time-series data are composed. Meanwhile, sensor time-series data may be defective because of sensor failure. Therefore, another important function is to reduce the negative effect of defective data when extracting low-dimensional time-series data. This study proposes a defect-repairable feature extraction method based on a deep sparse autoencoder (DSAE) to extract low-dimensional time-series data. In the experiments, we show that DSAE provides high-performance latent feature extraction for driving behavior, even for defective sensor time-series data. In addition, we show that the negative effect of defects on the driving behavior segmentation task could be reduced using the latent features extracted by DSAE. PMID:29462931

  16. Airway extraction from 3D chest CT volumes based on iterative extension of VOI enhanced by cavity enhancement filter

    NASA Astrophysics Data System (ADS)

    Meng, Qier; Kitasaka, Takayuki; Oda, Masahiro; Mori, Kensaku

    2017-03-01

    Airway segmentation is an important step in analyzing chest CT volumes for computerized lung cancer detection, emphysema diagnosis, asthma diagnosis, and pre- and intra-operative bronchoscope navigation. However, obtaining an integrated 3-D airway tree structure from a CT volume is quite a challenging task. This paper presents a novel airway segmentation method based on intensity structure analysis and bronchi shape structure analysis in a volume of interest (VOI). This method segments the bronchial regions by applying the cavity enhancement filter (CEF) to trace the bronchial tree structure from the trachea. It uses the CEF in each VOI to segment each branch and to predict the positions of the VOIs that envelop the bronchial regions in the next level. At the same time, leakage detection is performed to avoid leakage by analysing the pixel information and the shape information of airway candidate regions extracted in the VOI. Bronchial regions are finally obtained by unifying the extracted airway regions. The experimental results showed that the proposed method can extract most of the bronchial region in each VOI and led to good airway segmentation results.

  17. Biological network extraction from scientific literature: state of the art and challenges.

    PubMed

    Li, Chen; Liakata, Maria; Rebholz-Schuhmann, Dietrich

    2014-09-01

    Networks of molecular interactions explain complex biological processes, and all known information on molecular events is contained in a number of public repositories including the scientific literature. Metabolic and signalling pathways are often viewed separately, even though both types are composed of interactions involving proteins and other chemical entities. It is necessary to be able to combine data from all available resources to judge the functionality, complexity and completeness of any given network overall, but especially the full integration of relevant information from the scientific literature is still an ongoing and complex task. Currently, the text-mining research community is steadily moving towards processing the full body of the scientific literature by making use of rich linguistic features such as full text parsing, to extract biological interactions. The next step will be to combine these with information from scientific databases to support hypothesis generation for the discovery of new knowledge and the extension of biological networks. The generation of comprehensive networks requires technologies such as entity grounding, coordination resolution and co-reference resolution, which are not fully solved and are required to further improve the quality of results. Here, we analyse the state of the art for the extraction of network information from the scientific literature and the evaluation of extraction methods against reference corpora, discuss challenges involved and identify directions for future research. © The Author 2013. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.

  18. Development of Markup Language for Medical Record Charting: A Charting Language.

    PubMed

    Jung, Won-Mo; Chae, Younbyoung; Jang, Bo-Hyoung

    2015-01-01

    Nowadays there are many attempts to collect electronic medical records (EMRs). However, structuring the data format for EMRs is an especially labour-intensive task for practitioners. Here we propose a new mark-up language for medical record charting (called Charting Language), which borrows useful properties from programming languages. Thus, with Charting Language, text data recorded in dynamic situations can easily be used to extract information.

  19. Web-Scale Search-Based Data Extraction and Integration

    DTIC Science & Technology

    2011-10-17

    differently, posing challenges for aggregating this information. For example, for the task of finding population for cities in Benin, we were faced with...merged record. Our GeoMerging algorithm attempts to address various ambiguity challenges : • For name: The name of a hospital is not a unique...departments in the same building. For agent-extractor results from structured sources, our GeoMerging algorithm overcomes these challenges using a two

  20. Web-Based Activity Breaks: Impacts on Energy Expenditure and Time in Off-Task Behavior in Elementary School Children

    ERIC Educational Resources Information Center

    Huddleston, Holly Henry

    2017-01-01

    The purpose of study 1 was to characterize energy expenditure (EE) during academic subjects and activities during an elementary school day. Children in 2nd-4th grades (N = 33) wore the SenseWear Armband (SWA) for five school days to measure EE. Teachers' logs were compared to SWA data to extract information about EE throughout the day. Energy…

  1. Reducing Labeling Effort for Structured Prediction Tasks

    DTIC Science & Technology

    2005-01-01

    correctly annotated for the instance to be of use to the learner. Traditional active learning addresses this problem by optimizing the order in which the...than for others. We propose a new active learning paradigm which reduces not only how many instances the annotator must label, but also how difficult...We validate this active learning framework in an interactive information extraction system, reducing the total number of annotation actions by 22%.

  2. Utilizing gamma band to improve mental task based brain-computer interface design.

    PubMed

    Palaniappan, Ramaswamy

    2006-09-01

    A common method for designing a brain-computer interface (BCI) is to use electroencephalogram (EEG) signals extracted during mental tasks. In these BCI designs, features from EEG such as power and asymmetry ratios from delta, theta, alpha, and beta bands have been used in classifying different mental tasks. In this paper, the performance of the mental task based BCI design is improved by using spectral power and asymmetry ratios from the gamma (24-37 Hz) band in addition to the lower frequency bands. In the experimental study, EEG signals extracted during five mental tasks from four subjects were used. An Elman neural network (ENN) trained by the resilient backpropagation algorithm was used to classify the power and asymmetry ratios from EEG into different combinations of two mental tasks. The results indicated that (1) the classification performance and training time of the BCI design were improved through the use of additional gamma band features and (2) classification performances were nearly invariant to the number of ENN hidden units or the feature extraction method.
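
    The spectral power and asymmetry ratio features mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions: the periodogram estimate, the ratio definition (left minus right power over their sum), the sampling rate and the synthetic signals are all choices made here, not details confirmed by the abstract; only the 24-37 Hz gamma band comes from it.

```python
import cmath
import math


def band_power(signal, fs, low, high):
    """Mean periodogram power of `signal` over DFT bins within [low, high] Hz."""
    n = len(signal)
    powers = []
    for k in range(n // 2 + 1):
        freq = k * fs / n
        if low <= freq <= high:
            # Direct DFT coefficient for bin k (fine for short signals).
            coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                        for i, x in enumerate(signal))
            powers.append(abs(coeff) ** 2 / n)
    if not powers:
        raise ValueError("no DFT bin falls inside the requested band")
    return sum(powers) / len(powers)


def asymmetry_ratio(power_left, power_right):
    """Assumed definition: (left - right) / (left + right)."""
    return (power_left - power_right) / (power_left + power_right)


# Synthetic two-second signals at an assumed 256 Hz sampling rate.
fs = 256
left = [math.sin(2 * math.pi * 30 * i / fs) for i in range(2 * fs)]
right = [0.5 * x for x in left]  # same 30 Hz rhythm at half the amplitude

p_left = band_power(left, fs, 24, 37)
p_right = band_power(right, fs, 24, 37)
ratio = asymmetry_ratio(p_left, p_right)
print(f"gamma asymmetry ratio = {ratio:.2f}")
```

    Halving the amplitude quarters the power, so the ratio here is (1 - 0.25) / (1 + 0.25) = 0.6. In practice one such ratio would be computed per band and per left/right channel pair and passed to the classifier together with the band powers.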

  3. How to Assess Gaming-Induced Benefits on Attention and Working Memory.

    PubMed

    Mishra, Jyoti; Bavelier, Daphne; Gazzaley, Adam

    2012-06-01

    Our daily actions are driven by our goals in the moment, constantly forcing us to choose among various options. Attention and working memory are key enablers of that process. Attention allows for selective processing of goal-relevant information and rejecting task-irrelevant information. Working memory functions to maintain goal-relevant information in memory for brief periods of time for subsequent recall and/or manipulation. Efficient attention and working memory thus support the best extraction and retention of environmental information for optimal task performance. Recent studies have evidenced that attention and working memory abilities can be enhanced by cognitive training games as well as entertainment videogames. Here we review key cognitive paradigms that have been used to evaluate the impact of game-based training on various aspects of attention and working memory. Common use of such methodology within the scientific community will enable direct comparison of the efficacy of different games across age groups and clinical populations. The availability of common assessment tools will ultimately facilitate development of the most effective forms of game-based training for cognitive rehabilitation and education.

  4. How to Assess Gaming-Induced Benefits on Attention and Working Memory

    PubMed Central

    Mishra, Jyoti; Bavelier, Daphne

    2012-01-01

    Our daily actions are driven by our goals in the moment, constantly forcing us to choose among various options. Attention and working memory are key enablers of that process. Attention allows for selective processing of goal-relevant information and rejecting task-irrelevant information. Working memory functions to maintain goal-relevant information in memory for brief periods of time for subsequent recall and/or manipulation. Efficient attention and working memory thus support the best extraction and retention of environmental information for optimal task performance. Recent studies have evidenced that attention and working memory abilities can be enhanced by cognitive training games as well as entertainment videogames. Here we review key cognitive paradigms that have been used to evaluate the impact of game-based training on various aspects of attention and working memory. Common use of such methodology within the scientific community will enable direct comparison of the efficacy of different games across age groups and clinical populations. The availability of common assessment tools will ultimately facilitate development of the most effective forms of game-based training for cognitive rehabilitation and education. PMID:24761314

  5. "Looking-at-nothing" during sequential sensorimotor actions: Long-term memory-based eye scanning of remembered target locations.

    PubMed

    Foerster, Rebecca M

    2018-03-01

    Before acting, humans saccade to a target object to extract relevant visual information. Even when acting on remembered objects, locations previously occupied by relevant objects are fixated during imagery and memory tasks - a phenomenon called "looking-at-nothing". While looking-at-nothing was robustly found in tasks encouraging declarative memory build-up, results are mixed in the case of procedural sensorimotor tasks. Eye-guidance to manual targets in complete darkness was observed in a task practiced for days beforehand, while investigations using only a single session did not find fixations to remembered action targets. Here, it is asked whether looking-at-nothing can be found in a single sensorimotor session and thus independent from sleep consolidation, and how it progresses when visual information is repeatedly unavailable. Eye movements were investigated in a computerized version of the trail making test. Participants clicked on numbered circles in ascending sequence. Fifty trials were performed with the same spatial arrangement of 9 visual targets to enable long-term memory consolidation. During 50 consecutive trials, participants had to click the remembered target sequence on an empty screen. Participants scanned the visual targets and also the empty target locations sequentially with their eyes; however, the latter were scanned less precisely than the former. Over the course of the memory trials, manual and oculomotor sequential target scanning became more similar to the visual trials. Results argue for robust looking-at-nothing during procedural sensorimotor tasks provided that long-term memory information is sufficient. Copyright © 2018 Elsevier Ltd. All rights reserved.

  6. Inducing task-relevant responses to speech in the sleeping brain.

    PubMed

    Kouider, Sid; Andrillon, Thomas; Barbosa, Leonardo S; Goupil, Louise; Bekinschtein, Tristan A

    2014-09-22

    Falling asleep leads to a loss of sensory awareness and to the inability to interact with the environment [1]. While this was traditionally thought as a consequence of the brain shutting down to external inputs, it is now acknowledged that incoming stimuli can still be processed, at least to some extent, during sleep [2]. For instance, sleeping participants can create novel sensory associations between tones and odors [3] or reactivate existing semantic associations, as evidenced by event-related potentials [4-7]. Yet, the extent to which the brain continues to process external stimuli remains largely unknown. In particular, it remains unclear whether sensory information can be processed in a flexible and task-dependent manner by the sleeping brain, all the way up to the preparation of relevant actions. Here, using semantic categorization and lexical decision tasks, we studied task-relevant responses triggered by spoken stimuli in the sleeping brain. Awake participants classified words as either animals or objects (experiment 1) or as either words or pseudowords (experiment 2) by pressing a button with their right or left hand, while transitioning toward sleep. The lateralized readiness potential (LRP), an electrophysiological index of response preparation, revealed that task-specific preparatory responses are preserved during sleep. These findings demonstrate that despite the absence of awareness and behavioral responsiveness, sleepers can still extract task-relevant information from external stimuli and covertly prepare for appropriate motor responses. Copyright © 2014 The Authors. Published by Elsevier Inc. All rights reserved.

  7. A scene-analysis approach to remote sensing. [San Francisco, California]

    NASA Technical Reports Server (NTRS)

    Tenenbaum, J. M. (Principal Investigator); Fischler, M. A.; Wolf, H. C.

    1978-01-01

    The author has identified the following significant results. Geometric correspondence between a sensed image and a symbolic map is established in an initial stage of processing by adjusting parameters of a sensed model so that the image features predicted from the map optimally match corresponding features extracted from the sensed image. Information in the map is then used to constrain where to look in an image, what to look for, and how to interpret what is seen. For simple monitoring tasks involving multispectral classification, these constraints significantly reduce computation, simplify interpretation, and improve the utility of the resulting information. Previously intractable tasks requiring spatial and textural analysis may become straightforward in the context established by the map knowledge. The use of map-guided image analysis in monitoring the volume of water in a reservoir, the number of boxcars in a railyard, and the number of ships in a harbor is demonstrated.

  8. Task-specific feature extraction and classification of fMRI volumes using a deep neural network initialized with a deep belief network: Evaluation using sensorimotor tasks

    PubMed Central

    Jang, Hojin; Plis, Sergey M.; Calhoun, Vince D.; Lee, Jong-Hwan

    2016-01-01

    Feedforward deep neural networks (DNN), artificial neural networks with multiple hidden layers, have recently demonstrated a record-breaking performance in multiple areas of applications in computer vision and speech processing. Following the success, DNNs have been applied to neuroimaging modalities including functional/structural magnetic resonance imaging (MRI) and positron-emission tomography data. However, no study has explicitly applied DNNs to 3D whole-brain fMRI volumes and thereby extracted hidden volumetric representations of fMRI that are discriminative for a task performed as the fMRI volume was acquired. Our study applied fully connected feedforward DNN to fMRI volumes collected in four sensorimotor tasks (i.e., left-hand clenching, right-hand clenching, auditory attention, and visual stimulus) undertaken by 12 healthy participants. Using a leave-one-subject-out cross-validation scheme, a restricted Boltzmann machine-based deep belief network was pretrained and used to initialize weights of the DNN. The pretrained DNN was fine-tuned while systematically controlling weight-sparsity levels across hidden layers. Optimal weight-sparsity levels were determined from a minimum validation error rate of fMRI volume classification. Minimum error rates (mean ± standard deviation; %) of 6.9 (± 3.8) were obtained from the three-layer DNN with the sparsest condition of weights across the three hidden layers. These error rates were even lower than the error rates from the single-layer network (9.4 ± 4.6) and the two-layer network (7.4 ± 4.1). The estimated DNN weights showed spatial patterns that are remarkably task-specific, particularly in the higher layers. The output values of the third hidden layer represented distinct patterns/codes of the 3D whole-brain fMRI volume and encoded the information of the tasks as evaluated from representational similarity analysis. 
Our reported findings show the ability of the DNN to classify a single fMRI volume based on the extraction of hidden representations of fMRI volumes associated with tasks across multiple hidden layers. Our study may be beneficial to the automatic classification/diagnosis of neuropsychiatric and neurological diseases and prediction of disease severity and recovery in (pre-) clinical settings using fMRI volumes without requiring an estimation of activation patterns or ad hoc statistical evaluation. PMID:27079534

  9. Task-specific feature extraction and classification of fMRI volumes using a deep neural network initialized with a deep belief network: Evaluation using sensorimotor tasks.

    PubMed

    Jang, Hojin; Plis, Sergey M; Calhoun, Vince D; Lee, Jong-Hwan

    2017-01-15

    Feedforward deep neural networks (DNNs), artificial neural networks with multiple hidden layers, have recently demonstrated a record-breaking performance in multiple areas of applications in computer vision and speech processing. Following the success, DNNs have been applied to neuroimaging modalities including functional/structural magnetic resonance imaging (MRI) and positron-emission tomography data. However, no study has explicitly applied DNNs to 3D whole-brain fMRI volumes and thereby extracted hidden volumetric representations of fMRI that are discriminative for a task performed as the fMRI volume was acquired. Our study applied fully connected feedforward DNN to fMRI volumes collected in four sensorimotor tasks (i.e., left-hand clenching, right-hand clenching, auditory attention, and visual stimulus) undertaken by 12 healthy participants. Using a leave-one-subject-out cross-validation scheme, a restricted Boltzmann machine-based deep belief network was pretrained and used to initialize weights of the DNN. The pretrained DNN was fine-tuned while systematically controlling weight-sparsity levels across hidden layers. Optimal weight-sparsity levels were determined from a minimum validation error rate of fMRI volume classification. Minimum error rates (mean±standard deviation; %) of 6.9 (±3.8) were obtained from the three-layer DNN with the sparsest condition of weights across the three hidden layers. These error rates were even lower than the error rates from the single-layer network (9.4±4.6) and the two-layer network (7.4±4.1). The estimated DNN weights showed spatial patterns that are remarkably task-specific, particularly in the higher layers. The output values of the third hidden layer represented distinct patterns/codes of the 3D whole-brain fMRI volume and encoded the information of the tasks as evaluated from representational similarity analysis. 
Our reported findings show the ability of the DNN to classify a single fMRI volume based on the extraction of hidden representations of fMRI volumes associated with tasks across multiple hidden layers. Our study may be beneficial to the automatic classification/diagnosis of neuropsychiatric and neurological diseases and prediction of disease severity and recovery in (pre-) clinical settings using fMRI volumes without requiring an estimation of activation patterns or ad hoc statistical evaluation.
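
    The leave-one-subject-out cross-validation scheme mentioned in the abstract can be sketched as follows. This is a minimal illustration only: the function name `leave_one_subject_out` and the toy subject labels are assumptions made here, not part of the study, and the DNN training itself is not shown.

```python
from typing import Iterator, List, Sequence, Tuple


def leave_one_subject_out(
    subject_ids: Sequence[str],
) -> Iterator[Tuple[str, List[int], List[int]]]:
    """Yield (held_out_subject, train_indices, test_indices) for each fold."""
    # Keep subjects in first-seen order so the folds are deterministic.
    subjects = list(dict.fromkeys(subject_ids))
    for held_out in subjects:
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test


# Toy example: three hypothetical subjects with two volumes each.
volume_subjects = ["s01", "s01", "s02", "s02", "s03", "s03"]
folds = list(leave_one_subject_out(volume_subjects))
for held_out, train, test in folds:
    print(held_out, train, test)
```

    Each fold holds out every volume of one subject, so no subject contributes to both the training and the test set of the same fold.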

  10. From gaze cueing to perspective taking: Revisiting the claim that we automatically compute where or what other people are looking at

    PubMed Central

    Bukowski, Henryk; Hietanen, Jari K.; Samson, Dana

    2015-01-01

    Two paradigms have shown that people automatically compute what or where another person is looking at. In the visual perspective-taking paradigm, participants judge how many objects they see; whereas, in the gaze cueing paradigm, participants identify a target. Unlike in the former task, in the latter task, the influence of what or where the other person is looking at is only observed when the other person is presented alone before the task-relevant objects. We show that this discrepancy across the two paradigms is not due to differences in visual settings (Experiment 1) or available time to extract the directional information (Experiment 2), but that it is caused by how attention is deployed in response to task instructions (Experiment 3). Thus, the mere presence of another person in the field of view is not sufficient to compute where/what that person is looking at, which qualifies the claimed automaticity of such computations. PMID:26924936

  11. From gaze cueing to perspective taking: Revisiting the claim that we automatically compute where or what other people are looking at.

    PubMed

    Bukowski, Henryk; Hietanen, Jari K; Samson, Dana

    2015-09-14

    Two paradigms have shown that people automatically compute what or where another person is looking at. In the visual perspective-taking paradigm, participants judge how many objects they see; whereas, in the gaze cueing paradigm, participants identify a target. Unlike in the former task, in the latter task, the influence of what or where the other person is looking at is only observed when the other person is presented alone before the task-relevant objects. We show that this discrepancy across the two paradigms is not due to differences in visual settings (Experiment 1) or available time to extract the directional information (Experiment 2), but that it is caused by how attention is deployed in response to task instructions (Experiment 3). Thus, the mere presence of another person in the field of view is not sufficient to compute where/what that person is looking at, which qualifies the claimed automaticity of such computations.

  12. Business Intelligence Applied to the ALMA Software Integration Process

    NASA Astrophysics Data System (ADS)

    Zambrano, M.; Recabarren, C.; González, V.; Hoffstadt, A.; Soto, R.; Shen, T.-C.

    2012-09-01

    Software quality assurance and planning of an astronomy project is a complex task, especially if it is a distributed collaborative project such as ALMA, where the development centers are spread across the globe. When you execute a software project there is much valuable information about this process itself that you might be able to collect. One of the ways you can receive this input is via an issue tracking system that will gather the problem reports relative to software bugs captured during the testing of the software, during the integration of the different components or, even worse, problems that occurred during production time. Usually, there is little time spent on analyzing them, but with some multidimensional processing you can extract valuable information from them, and it might help you with long-term planning and resource allocation. We present an analysis of the information collected at ALMA from a collection of key unbiased indicators. We describe here the extraction, transformation and load process and how the data was processed. The main goal is to assess a software process and get insights from this information.

  13. Optimizing graph-based patterns to extract biomedical events from the literature

    PubMed Central

    2015-01-01

    In BioNLP-ST 2013: We participated in the BioNLP 2013 shared tasks on event extraction. Our extraction method is based on the search for an approximate subgraph isomorphism between key context dependencies of events and graphs of input sentences. Our system was able to address both the GENIA (GE) task focusing on 13 molecular biology related event types and the Cancer Genetics (CG) task targeting a challenging group of 40 cancer biology related event types with varying arguments concerning 18 kinds of biological entities. In addition to adapting our system to the two tasks, we also attempted to integrate semantics into the graph matching scheme using a distributional similarity model for more events, and evaluated the event extraction impact of using paths of all possible lengths as key context dependencies beyond using only the shortest paths in our system. We achieved a 46.38% F-score in the CG task (ranking 3rd) and a 48.93% F-score in the GE task (ranking 4th). After BioNLP-ST 2013: We explored three ways to further extend our event extraction system in our previously published work: (1) We allow non-essential nodes to be skipped, and incorporated a node skipping penalty into the subgraph distance function of our approximate subgraph matching algorithm. (2) Instead of assigning a unified subgraph distance threshold to all patterns of an event type, we learned a customized threshold for each pattern. (3) We implemented the well-known Empirical Risk Minimization (ERM) principle to optimize the event pattern set by balancing prediction errors on training data against regularization. When evaluated on the official GE task test data, these extensions help to improve the extraction precision from 62% to 65%. However, the overall F-score stays equivalent to the previous performance due to a 1% drop in recall. PMID:26551594

  14. Rapid Automated Sample Preparation for Biological Assays

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shusteff, M

    Our technology utilizes acoustic, thermal, and electric fields to separate out contaminants such as debris or pollen from environmental samples, lyse open cells, and extract the DNA from the lysate. The objective of the project is to optimize the system described for a forensic sample, and demonstrate its performance for integration with downstream assay platforms (e.g. MIT-LL's ANDE). We intend to increase the quantity of DNA recovered from the sample beyond the current ~80% achieved using solid phase extraction methods. Task 1: Develop and test an acoustic filter for cell extraction. Task 2: Develop and test lysis chip. Task 3: Develop and test DNA extraction chip. All chips have been fabricated based on the designs laid out in last month's report.

  15. Real-time Accurate Surface Reconstruction Pipeline for Vision Guided Planetary Exploration Using Unmanned Ground and Aerial Vehicles

    NASA Technical Reports Server (NTRS)

    Almeida, Eduardo DeBrito

    2012-01-01

    This report discusses work completed over the summer at the Jet Propulsion Laboratory (JPL), California Institute of Technology. A system is presented to guide ground or aerial unmanned robots using computer vision. The system performs accurate camera calibration, camera pose refinement and surface extraction from images collected by a camera mounted on the vehicle. The application motivating the research is planetary exploration and the vehicles are typically rovers or unmanned aerial vehicles. The information extracted from imagery is used primarily for navigation, as robot location is the same as the camera location and the surfaces represent the terrain that rovers traverse. The processed information must be very accurate and acquired very fast in order to be useful in practice. The main challenge being addressed by this project is to achieve high estimation accuracy and high computation speed simultaneously, a difficult task due to many technical reasons.

  16. Clinical Assistant Diagnosis for Electronic Medical Record Based on Convolutional Neural Network.

    PubMed

    Yang, Zhongliang; Huang, Yongfeng; Jiang, Yiran; Sun, Yuxi; Zhang, Yu-Jin; Luo, Pengcheng

    2018-04-20

    Automatically extracting useful information from electronic medical records along with conducting disease diagnoses is a promising task for both clinical decision support (CDS) and natural language processing (NLP). Most of the existing systems are based on artificially constructed knowledge bases, and then auxiliary diagnosis is done by rule matching. In this study, we present a clinical intelligent decision approach based on Convolutional Neural Networks (CNN), which can automatically extract high-level semantic information of electronic medical records and then perform automatic diagnosis without artificial construction of rules or knowledge bases. We use 18,590 collected copies of real-world clinical electronic medical records to train and test the proposed model. Experimental results show that the proposed model can achieve 98.67% accuracy and 96.02% recall, which strongly supports that using a convolutional neural network to automatically learn high-level semantic features of electronic medical records and then conduct assisted diagnosis is feasible and effective.

  17. Spatial Uncertainty Modeling of Fuzzy Information in Images for Pattern Classification

    PubMed Central

    Pham, Tuan D.

    2014-01-01

    The modeling of the spatial distribution of image properties is important for many pattern recognition problems in science and engineering. Mathematical methods are needed to quantify the variability of this spatial distribution based on which a decision of classification can be made in an optimal sense. However, image properties are often subject to uncertainty due to both incomplete and imprecise information. This paper presents an integrated approach for estimating the spatial uncertainty of vagueness in images using the theory of geostatistics and the calculus of probability measures of fuzzy events. Such a model for the quantification of spatial uncertainty is utilized as a new image feature extraction method, based on which classifiers can be trained to perform the task of pattern recognition. Applications of the proposed algorithm to the classification of various types of image data suggest the usefulness of the proposed uncertainty modeling technique for texture feature extraction. PMID:25157744

  18. Information Extraction for System-Software Safety Analysis: Calendar Year 2007 Year-End Report

    NASA Technical Reports Server (NTRS)

    Malin, Jane T.

    2008-01-01

    This annual report describes work to integrate a set of tools to support early model-based analysis of failures and hazards due to system-software interactions. The tools perform and assist analysts in the following tasks: 1) extract model parts from text for architecture and safety/hazard models; 2) combine the parts with library information to develop the models for visualization and analysis; 3) perform graph analysis on the models to identify possible paths from hazard sources to vulnerable entities and functions, in nominal and anomalous system-software configurations; 4) perform discrete-time-based simulation on the models to investigate scenarios where these paths may play a role in failures and mishaps; and 5) identify resulting candidate scenarios for software integration testing. This paper describes new challenges in a NASA abort system case, and enhancements made to develop the integrated tool set.

  19. Mutual information, neural networks and the renormalization group

    NASA Astrophysics Data System (ADS)

    Koch-Janusz, Maciej; Ringel, Zohar

    2018-06-01

    Physical systems differing in their microscopic details often display strikingly similar behaviour when probed at macroscopic scales. Those universal properties, largely determining their physical characteristics, are revealed by the powerful renormalization group (RG) procedure, which systematically retains `slow' degrees of freedom and integrates out the rest. However, the important degrees of freedom may be difficult to identify. Here we demonstrate a machine-learning algorithm capable of identifying the relevant degrees of freedom and executing RG steps iteratively without any prior knowledge about the system. We introduce an artificial neural network based on a model-independent, information-theoretic characterization of a real-space RG procedure, which performs this task. We apply the algorithm to classical statistical physics problems in one and two dimensions. We demonstrate RG flow and extract the Ising critical exponent. Our results demonstrate that machine-learning techniques can extract abstract physical concepts and consequently become an integral part of theory- and model-building.

  20. Quantity and unit extraction for scientific and technical intelligence analysis

    NASA Astrophysics Data System (ADS)

    David, Peter; Hawes, Timothy

    2017-05-01

    Scientific and Technical (S&T) intelligence analysts consume huge amounts of data to understand how scientific progress and engineering efforts affect current and future military capabilities. One of the most important types of information S&T analysts exploit is the quantities discussed in their source material. Frequencies, ranges, size, weight, power, and numerous other properties and measurements describing the performance characteristics of systems and the engineering constraints that define them must be culled from source documents before quantified analysis can begin. Automating the process of finding and extracting the relevant quantities from a wide range of S&T documents is difficult because information about quantities and their units is often contained in unstructured text with ad hoc conventions used to convey their meaning. Currently, even simple tasks, such as searching for documents discussing RF frequencies in a band of interest, are labor-intensive and error-prone. This research addresses the challenges facing development of a document processing capability that extracts quantities and units from S&T data, and how Natural Language Processing algorithms can be used to overcome these challenges.
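
    To make the task concrete, here is a minimal regex-based sketch of quantity and unit extraction. It is not the method of the paper above, which is not described in the abstract; the unit list `UNITS`, the function `extract_quantities`, and the sample sentence are all hypothetical, and real documents would need far broader handling (ranges, exponents, spelled-out units, ad hoc conventions).

```python
import re
from typing import List, Tuple

# Hypothetical, deliberately tiny unit inventory for illustration.
UNITS = ["GHz", "MHz", "kHz", "Hz", "km", "kg", "kW", "m", "W"]

# Try longer unit strings first, and require a word boundary after the unit
# so that "5 miles" is not read as "5 m".
_unit_alternatives = "|".join(
    re.escape(u) for u in sorted(UNITS, key=len, reverse=True)
)
QUANTITY_RE = re.compile(
    r"(?<![\w.])(\d+(?:\.\d+)?)\s*(" + _unit_alternatives + r")\b"
)


def extract_quantities(text: str) -> List[Tuple[float, str]]:
    """Return (value, unit) pairs found in the text, in order of appearance."""
    return [(float(m.group(1)), m.group(2)) for m in QUANTITY_RE.finditer(text)]


sample = "The radar operates at 9.4 GHz with a range of 120 km and 25 kW peak power."
print(extract_quantities(sample))
# → [(9.4, 'GHz'), (120.0, 'km'), (25.0, 'kW')]
```

    A pattern like this only covers the easy cases, which is consistent with the abstract's point that ad hoc conventions in unstructured text are what make the problem hard.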

  1. Enriching a document collection by integrating information extraction and PDF annotation

    NASA Astrophysics Data System (ADS)

    Powley, Brett; Dale, Robert; Anisimoff, Ilya

    2009-01-01

    Modern digital libraries offer all the hyperlinking possibilities of the World Wide Web: when a reader finds a citation of interest, in many cases she can now click on a link to be taken to the cited work. This paper presents work aimed at providing the same ease of navigation for legacy PDF document collections that were created before the possibility of integrating hyperlinks into documents was ever considered. To achieve our goal, we need to carry out two tasks: first, we need to identify and link citations and references in the text with high reliability; and second, we need the ability to determine physical PDF page locations for these elements. We demonstrate the use of a high-accuracy citation extraction algorithm which significantly improves on earlier reported techniques, and a technique for integrating PDF processing with a conventional text-stream based information extraction pipeline. We demonstrate these techniques in the context of a particular document collection, this being the ACL Anthology; but the same approach can be applied to other document sets.

  2. Detection of reflecting surfaces by a statistical model

    NASA Astrophysics Data System (ADS)

    He, Qiang; Chu, Chee-Hung H.

    2009-02-01

    Remote sensing is widely used to assess the destruction from natural disasters and to plan relief and recovery operations. How to automatically extract useful features and segment interesting objects from digital images, including remote sensing imagery, becomes a critical task for image understanding. Unfortunately, current research on automated feature extraction is ignorant of contextual information. As a result, the fidelity of populating attributes corresponding to interesting features and objects cannot be satisfied. In this paper, we present an exploration on meaningful object extraction integrating reflecting surfaces. Detection of specular reflecting surfaces can be useful in target identification and then can be applied to environmental monitoring, disaster prediction and analysis, military, and counter-terrorism. Our method is based on a statistical model to capture the statistical properties of specular reflecting surfaces. And then the reflecting surfaces are detected through cluster analysis.

  3. DeepSkeleton: Learning Multi-Task Scale-Associated Deep Side Outputs for Object Skeleton Extraction in Natural Images

    NASA Astrophysics Data System (ADS)

    Shen, Wei; Zhao, Kai; Jiang, Yuan; Wang, Yan; Bai, Xiang; Yuille, Alan

    2017-11-01

    Object skeletons are useful for object representation and object detection. They are complementary to the object contour, and provide extra information, such as how object scale (thickness) varies among object parts. But object skeleton extraction from natural images is very challenging, because it requires the extractor to be able to capture both local and non-local image context in order to determine the scale of each skeleton pixel. In this paper, we present a novel fully convolutional network with multiple scale-associated side outputs to address this problem. By observing the relationship between the receptive field sizes of the different layers in the network and the skeleton scales they can capture, we introduce two scale-associated side outputs to each stage of the network. The network is trained by multi-task learning, where one task is skeleton localization to classify whether a pixel is a skeleton pixel or not, and the other is skeleton scale prediction to regress the scale of each skeleton pixel. Supervision is imposed at different stages by guiding the scale-associated side outputs toward the groundtruth skeletons at the appropriate scales. The responses of the multiple scale-associated side outputs are then fused in a scale-specific way to detect skeleton pixels using multiple scales effectively. Our method achieves promising results on two skeleton extraction datasets, and significantly outperforms other competitors. Additionally, the usefulness of the obtained skeletons and scales (thickness) are verified on two object detection applications: Foreground object segmentation and object proposal detection.

  4. A case-matched study of neurophysiological correlates to attention/working memory in people with somatic hypervigilance.

    PubMed

    Berryman, Carolyn; Wise, Vikki; Stanton, Tasha R; McFarlane, Alexander; Moseley, G Lorimer

    2017-02-01

    Somatic hypervigilance describes a clinical presentation in which people report more, and more intense, bodily sensations than is usual. Most explanations of somatic hypervigilance implicate altered information processing, but strong empirical data are lacking. Attention and working memory are critical for information processing, and we aimed to evaluate brain activity during attention/working memory tasks in people with and without somatic hypervigilance. Data from 173 people with somatic hypervigilance and 173 controls matched for age, gender, handedness, and years of education were analyzed. Event-related potential (ERP) data, extracted from the continuous electroencephalograph recordings obtained during performance of the Auditory Oddball task, and the Two In A Row (TIAR) task, for N1, P2, N2, and P3, were used in the analysis. Between-group differences for P3 amplitude and N2 amplitude and latency were assessed with two-tailed independent t tests. Between-group differences for N1 and P2 amplitude and latency were assessed using mixed, repeated measures analyses of variance (ANOVAs) with group and Group × Site factors. Linear regression analysis investigated the relationship between anxiety and depression and any outcomes of significance. People with somatic hypervigilance showed smaller P3 amplitudes-Auditory Oddball task: t(285) = 2.32, 95% confidence interval, CI [3.48, 4.47], p = .026, d = 0.27; Two-In-A-Row (TIAR) task: t(334) = 2.23, 95% CI [2.20; 3.95], p = .021, d = 0.24-than case-matched controls. N2 amplitude was also smaller in people with somatic hypervigilance-TIAR task: t(318) = 2.58, 95% CI [0.33, 2.47], p = .010, d = 0.29-than in case-matched controls. Neither depression nor anxiety was significantly associated with any outcome. People with somatic hypervigilance demonstrated an event-related potential response to attention/working memory tasks that is consistent with altered information processing.

  5. Think spatial: the representation in mental rotation is nonvisual.

    PubMed

    Liesefeld, Heinrich R; Zimmer, Hubert D

    2013-01-01

    For mental rotation, introspection, theories, and interpretations of experimental results imply a certain type of mental representation, namely, visual mental images. Characteristics of the rotated representation can be examined by measuring the influence of stimulus characteristics on rotational speed. If the amount of a given type of information influences rotational speed, one can infer that it was contained in the rotated representation. In Experiment 1, rotational speed of university students (10 men, 11 women) was found to be influenced exclusively by the amount of represented orientation-dependent spatial-relational information but not by orientation-independent spatial-relational information, visual complexity, or the number of stimulus parts. As information in mental-rotation tasks is initially presented visually, this finding implies that at some point during each trial, orientation-dependent information is extracted from visual information. Searching for more direct evidence for this extraction, we recorded the EEG of another sample of university students (12 men, 12 women) during mental rotation of the same stimuli. In an early time window, the observed working memory load-dependent slow potentials were sensitive to the stimuli's visual complexity. Later, in contrast, slow potentials were sensitive to the amount of orientation-dependent information only. We conclude that only orientation-dependent information is contained in the rotated representation.

  6. Non-negative Matrix Factorization and Co-clustering: A Promising Tool for Multi-tasks Bearing Fault Diagnosis

    NASA Astrophysics Data System (ADS)

    Shen, Fei; Chen, Chao; Yan, Ruqiang

    2017-05-01

    Classical bearing fault diagnosis methods, being designed according to one specific task, always pay attention to the effectiveness of extracted features and the final diagnostic performance. However, most of these approaches suffer from inefficiency when multiple tasks exist, especially in a real-time diagnostic scenario. A fault diagnosis method based on Non-negative Matrix Factorization (NMF) and a Co-clustering strategy is proposed to overcome this limitation. Firstly, some high-dimensional matrices are constructed using the Short-Time Fourier Transform (STFT) features, where the dimension of each matrix equals the number of target tasks. Then, the NMF algorithm is carried out to obtain different components in each dimension direction through optimized matching, such as Euclidean distance and divergence distance. Finally, a Co-clustering technique based on information entropy is utilized to realize classification of each component. To verify the effectiveness of the proposed approach, a series of bearing data sets were analysed in this research. The tests indicated that although the diagnostic performance of a single task is comparable to traditional clustering methods such as the K-means algorithm and the Gaussian Mixture Model, the accuracy and computational efficiency in multi-task fault diagnosis are improved.
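
    As background for the NMF step, the sketch below shows plain NMF with the standard multiplicative updates for the Euclidean (Frobenius) objective. It is a generic illustration in pure Python, not the authors' multi-task construction; the helper names and the toy matrix are assumptions, and the divergence-based variant and the co-clustering stage are not shown.

```python
import random


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def frobenius_error(V, W, H):
    """Euclidean (Frobenius) distance between V and the product W H."""
    WH = matmul(W, H)
    return sum((v - x) ** 2 for vr, xr in zip(V, WH) for v, x in zip(vr, xr)) ** 0.5


def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor non-negative V (n x m) into W (n x rank) and H (rank x m)."""
    rnd = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rnd.random() + eps for _ in range(rank)] for _ in range(n)]
    H = [[rnd.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(hr, ar, br)]
             for hr, ar, br in zip(H, num, den)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(wr, ar, br)]
             for wr, ar, br in zip(W, num, den)]
    return W, H


# Toy non-negative matrix of rank 2 (hypothetical data, not bearing signals).
A = [[1, 0], [0, 1], [1, 1], [2, 1]]
B = [[1, 2, 0, 1, 3], [0, 1, 1, 2, 1]]
V = matmul(A, B)

W0, H0 = nmf(V, rank=2, n_iter=0)    # random initialisation only
W, H = nmf(V, rank=2, n_iter=200)    # after 200 multiplicative updates
print(frobenius_error(V, W0, H0), frobenius_error(V, W, H))
```

    Because the updates only multiply by non-negative ratios, the factors stay non-negative throughout, which is the property that makes the resulting components interpretable as additive parts.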

  7. Mental Task Classification Scheme Utilizing Correlation Coefficient Extracted from Interchannel Intrinsic Mode Function.

    PubMed

    Rahman, Md Mostafizur; Fattah, Shaikh Anowarul

    2017-01-01

    In view of the recent increase in brain computer interface (BCI) based applications, the importance of efficient classification of various mental tasks has increased prodigiously. In order to obtain effective classification, an efficient feature extraction scheme is necessary, for which, in the proposed method, the interchannel relationship among electroencephalogram (EEG) data is utilized. It is expected that the correlation obtained from different combinations of channels will be different for different mental tasks, which can be exploited to extract distinctive features. The empirical mode decomposition (EMD) technique is employed on a test EEG signal obtained from a channel, which provides a number of intrinsic mode functions (IMFs), and the correlation coefficient is extracted from interchannel IMF data. Simultaneously, different statistical features are also obtained from each IMF. Finally, the feature matrix is formed utilizing interchannel correlation features and intrachannel statistical features of the selected IMFs of the EEG signal. Different kernels of the support vector machine (SVM) classifier are used to carry out the classification task. An EEG dataset containing ten different combinations of five different mental tasks is utilized to demonstrate the classification performance, and a very high level of accuracy is achieved by the proposed scheme compared to existing methods.
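
    The interchannel correlation feature described above can be illustrated with a short sketch. It assumes the EMD step has already produced one sequence per channel (for example, a selected IMF); the EMD itself, the statistical features, and the SVM are not shown, and the function names and toy sequences are hypothetical.

```python
from itertools import combinations
from math import sqrt
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences.

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def interchannel_features(channels: Sequence[Sequence[float]]) -> List[float]:
    """One correlation coefficient per unordered pair of channels."""
    return [
        pearson(channels[i], channels[j])
        for i, j in combinations(range(len(channels)), 2)
    ]


# Toy stand-ins for one IMF from each of three channels.
channels = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [4.0, 3.0, 2.0, 1.0],
]
features = interchannel_features(channels)
print(features)  # pairs (0,1), (0,2), (1,2)
```

    With C channels this yields C(C-1)/2 values per IMF, which would then be concatenated with the intrachannel statistics to form the feature vector.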

  8. Secure alignment of coordinate systems using quantum correlation

    NASA Astrophysics Data System (ADS)

    Rezazadeh, F.; Mani, A.; Karimipour, V.

    2017-08-01

    We show that two parties far apart can use shared entangled states and classical communication to align their coordinate systems with a very high fidelity. Moreover, compared with previous methods proposed for such a task, i.e., sending parallel or antiparallel pairs or groups of spin states, our method has the extra advantages of using single-qubit measurements and also being secure, so that third parties do not extract any information about the aligned coordinate system established between the two parties. The latter property is important in many other quantum information protocols in which measurements inevitably play a significant role.

  9. Object-based Encoding in Visual Working Memory: Evidence from Memory-driven Attentional Capture.

    PubMed

    Gao, Zaifeng; Yu, Shixian; Zhu, Chengfeng; Shui, Rende; Weng, Xuchu; Li, Peng; Shen, Mowei

    2016-03-09

    Visual working memory (VWM) adopts a specific manner of object-based encoding (OBE) to extract perceptual information: Whenever one feature-dimension is selected for entry into VWM, the others are also extracted. Currently most studies revealing OBE probed an 'irrelevant-change distracting effect', where changes of irrelevant-features dramatically affected the performance of the target feature. However, the existence of irrelevant-feature change may affect participants' processing manner, leading to a false-positive result. The current study conducted a strict examination of OBE in VWM, by probing whether irrelevant-features guided the deployment of attention in visual search. The participants memorized an object's colour yet ignored shape and concurrently performed a visual-search task. They searched for a target line among distractor lines, each embedded within a different object. One object in the search display could match the shape, colour, or both dimensions of the memory item, but this object never contained the target line. Relative to a neutral baseline, where there was no match between the memory and search displays, search time was significantly prolonged in all match conditions, regardless of whether the memory item was displayed for 100 or 1000 ms. These results suggest that task-irrelevant shape was extracted into VWM, supporting OBE in VWM.

  10. A New Data Representation Based on Training Data Characteristics to Extract Drug Name Entity in Medical Text

    PubMed Central

    Basaruddin, T.

    2016-01-01

    One essential task in information extraction from the medical corpus is drug name recognition. Compared with text sources that come from other domains, medical text mining poses more challenges, for example, more unstructured text, the fast growth of new terms, a wide range of name variation for the same drug, the lack of labeled dataset sources and external knowledge, and the multiple token representations for a single drug name. Although many approaches have been proposed to tackle the task, some problems remain, with poor F-score performance (less than 0.75). This paper presents a new treatment in data representation techniques to overcome some of those challenges. We propose three data representation techniques based on the characteristics of word distribution and word similarities as a result of word embedding training. The first technique is evaluated with the standard NN model, that is, MLP. The second technique involves two deep network classifiers, that is, DBN and SAE. The third technique represents the sentence as a sequence that is evaluated with a recurrent NN model, that is, LSTM. In extracting the drug name entities, the third technique gives the best F-score performance compared to the state of the art, with its average F-score being 0.8645. PMID:27843447

  11. Hyperspectral image classification based on local binary patterns and PCANet

    NASA Astrophysics Data System (ADS)

    Yang, Huizhen; Gao, Feng; Dong, Junyu; Yang, Yang

    2018-04-01

    Hyperspectral image classification has been well acknowledged as one of the challenging tasks of hyperspectral data processing. In this paper, we propose a novel hyperspectral image classification framework based on local binary pattern (LBP) features and PCANet. In the proposed method, linear prediction error (LPE) is first employed to select a subset of informative bands, and LBP is utilized to extract texture features. Then, spectral and texture features are stacked into high-dimensional vectors. Next, the extracted features of a specified position are transformed to a 2-D image. The obtained images of all pixels are fed into PCANet for classification. Experimental results on a real hyperspectral dataset demonstrate the effectiveness of the proposed method.
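
    As a rough illustration of the LBP texture step mentioned above, the following Python sketch computes the basic 3x3 local binary pattern code and a histogram of codes for a small single-band patch. The function names, the bit ordering, and the ">=" comparison are assumptions made for this sketch; the paper's LBP variant, the LPE band selection, and PCANet are not reproduced here.

```python
# Minimal sketch of the basic 8-neighbour LBP operator (assumed variant).
# Hypothetical helper names; not the authors' implementation.

# Neighbour offsets, clockwise from the top-left; each one contributes one bit.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, row, col):
    """Return the 8-bit LBP code of the pixel at (row, col)."""
    center = image[row][col]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        # A neighbour at least as bright as the centre sets its bit.
        if image[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Count LBP codes over all interior pixels (border pixels are skipped)."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist


flat = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]
corner = [[9, 1, 1], [1, 5, 1], [1, 1, 1]]
print(lbp_code(flat, 1, 1), lbp_code(corner, 1, 1))
```

    In a pipeline like the one described, such a histogram would be one candidate texture descriptor to concatenate with the spectral values of a pixel.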

  12. Extraction and purification methods in downstream processing of plant-based recombinant proteins.

    PubMed

    Łojewska, Ewelina; Kowalczyk, Tomasz; Olejniczak, Szymon; Sakowicz, Tomasz

    2016-04-01

    During the last two decades, the production of recombinant proteins in plant systems has been receiving increased attention. Currently, proteins are considered as the most important biopharmaceuticals. However, high costs and problems with scaling up the purification and isolation processes make the production of plant-based recombinant proteins a challenging task. This paper presents a summary of the information regarding the downstream processing in plant systems and provides a comprehensible overview of its key steps, such as extraction and purification. To highlight the recent progress, mainly new developments in the downstream technology have been chosen. Furthermore, besides most popular techniques, alternative methods have been described. Copyright © 2015 Elsevier Inc. All rights reserved.

  13. Zone analysis in biology articles as a basis for information extraction.

    PubMed

    Mizuta, Yoko; Korhonen, Anna; Mullen, Tony; Collier, Nigel

    2006-06-01

    In the field of biomedicine, an overwhelming amount of experimental data has become available as a result of the high throughput of research in this domain. The amount of results reported has now grown beyond the limits of what can be managed by manual means. This makes it increasingly difficult for the researchers in this area to keep up with the latest developments. Information extraction (IE) in the biological domain aims to provide an effective automatic means to dynamically manage the information contained in archived journal articles and abstract collections and thus help researchers in their work. However, while considerable advances have been made in certain areas of IE, pinpointing and organizing factual information (such as experimental results) remains a challenge. In this paper we propose tackling this task by incorporating into IE information about rhetorical zones, i.e. classification of spans of text in terms of argumentation and intellectual attribution. As the first step towards this goal, we introduce a scheme for annotating biological texts for rhetorical zones and provide a qualitative and quantitative analysis of the data annotated according to this scheme. We also discuss our preliminary research on automatic zone analysis, and its incorporation into our IE framework.

  14. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Vatsavai, Raju; Cheriyadat, Anil M; Bhaduri, Budhendra L

    The high rate of urbanization, political conflicts and ensuing internal displacement of population, and increased poverty in the 20th century have resulted in a rapid increase of informal settlements. These unplanned, unauthorized, and/or unstructured homes, known as informal settlements, shantytowns, barrios, or slums, pose several challenges to the nations, as these settlements are often located in the most hazardous regions and lack basic services. Though several World Bank and United Nations sponsored studies stress the importance of poverty maps in designing better policies and interventions, mapping slums of the world is a daunting and challenging task. In this paper, we summarize our ongoing research on settlement mapping through the utilization of very high resolution (VHR) remote sensing imagery. Most existing approaches used to classify VHR images are single instance (or pixel-based) learning algorithms, which are inadequate for analyzing VHR imagery, as single pixels do not contain sufficient contextual information (see Figure 1). However, much needed spatial contextual information can be captured via feature extraction and/or through newer machine learning algorithms in order to extract complex spatial patterns that distinguish informal settlements from formal ones. In recent years, we made significant progress in advancing the state of the art in both directions. This paper summarizes these results.

  15. Attention-Based Recurrent Temporal Restricted Boltzmann Machine for Radar High Resolution Range Profile Sequence Recognition.

    PubMed

    Zhang, Yifan; Gao, Xunzhang; Peng, Xuan; Ye, Jiaqi; Li, Xiang

    2018-05-16

    High Resolution Range Profile (HRRP) recognition has attracted great attention in the field of Radar Automatic Target Recognition (RATR). However, traditional HRRP recognition methods fail to model high dimensional sequential data efficiently and have poor anti-noise ability. To deal with these problems, a novel stochastic neural network model named Attention-based Recurrent Temporal Restricted Boltzmann Machine (ARTRBM) is proposed in this paper. RTRBM is utilized to extract discriminative features and the attention mechanism is adopted to select major features. RTRBM models high dimensional HRRP sequences efficiently because it can extract the information of temporal and spatial correlation between adjacent HRRPs. The attention mechanism is used in sequential data recognition tasks including machine translation and relation classification, and makes the model pay more attention to the major features for recognition. Therefore, the combination of RTRBM and the attention mechanism makes our model effective at extracting more internally related features and choosing the important parts of the extracted features. Additionally, the model performs well with noise corrupted HRRP data. Experimental results on the Moving and Stationary Target Acquisition and Recognition (MSTAR) dataset show that our proposed model outperforms other traditional methods, which indicates that ARTRBM extracts, selects, and utilizes the correlation information between adjacent HRRPs effectively and is suitable for high dimensional data or noise corrupted data.

  16. Smart Shop Assistant - Using Semantic Technologies to Improve Online Shopping

    NASA Astrophysics Data System (ADS)

    Niemann, Magnus; Mochol, Malgorzata; Tolksdorf, Robert

    Internet commerce experiences a rising complexity: Not only more and more products become available online but also the amount of information available on a single product has been constantly increasing. Thanks to the Web 2.0 development it is, in the meantime, quite common to involve customers in the creation of product description and extraction of additional product information by offering customers feedback forms and product review sites, users' weblogs and other social web services. To face this situation, one of the main tasks in a future internet will be to aggregate, sort and evaluate this huge amount of information to aid the customers in choosing the "perfect" product for their needs.

  17. EyeMusic: Introducing a "visual" colorful experience for the blind using auditory sensory substitution.

    PubMed

    Abboud, Sami; Hanassy, Shlomi; Levy-Tzedek, Shelly; Maidenbaum, Shachar; Amedi, Amir

    2014-01-01

    Sensory-substitution devices (SSDs) provide auditory or tactile representations of visual information. These devices often generate unpleasant sensations and mostly lack color information. We present here a novel SSD aimed at addressing these issues. We developed the EyeMusic, a novel visual-to-auditory SSD for the blind, providing both shape and color information. Our design uses musical notes on a pentatonic scale generated by natural instruments to convey the visual information in a pleasant manner. A short behavioral protocol was utilized to train the blind to extract shape and color information, and test their acquired abilities. Finally, we conducted a survey and a comparison task to assess the pleasantness of the generated auditory stimuli. We show that basic shape and color information can be decoded from the generated auditory stimuli. High performance levels were achieved by all participants following as little as 2-3 hours of training. Furthermore, we show that users indeed found the stimuli pleasant and potentially tolerable for prolonged use. The novel EyeMusic algorithm provides an intuitive and relatively pleasant way for the blind to extract shape and color information. We suggest that this might help facilitating visual rehabilitation because of the added functionality and enhanced pleasantness.

  18. Efficacy Evaluation of Different Wavelet Feature Extraction Methods on Brain MRI Tumor Detection

    NASA Astrophysics Data System (ADS)

    Nabizadeh, Nooshin; John, Nigel; Kubat, Miroslav

    2014-03-01

    Automated Magnetic Resonance Imaging brain tumor detection and segmentation is a challenging task. Among different available methods, feature-based methods are very dominant. While many feature extraction techniques have been employed, it is still not quite clear which feature extraction methods should be preferred. To help improve the situation, we present the results of a study in which we evaluate the efficiency of using different wavelet transform feature extraction methods in brain MRI abnormality detection. Using T1-weighted brain images, the Discrete Wavelet Transform (DWT), Discrete Wavelet Packet Transform (DWPT), Dual Tree Complex Wavelet Transform (DTCWT), and Complex Morlet Wavelet Transform (CMWT) methods are applied to construct the feature pool. Three classifiers, namely Support Vector Machine, K Nearest Neighborhood, and Sparse Representation-Based Classifier, are applied and compared for classifying the selected features. The results show that DTCWT and CMWT features classified with SVM result in the highest classification accuracy, demonstrating the capability of wavelet transform features to be informative in this application.

  19. Spatio-Temporal Pattern Mining on Trajectory Data Using Arm

    NASA Astrophysics Data System (ADS)

    Khoshahval, S.; Farnaghi, M.; Taleai, M.

    2017-09-01

    Mobile phones were initially considered devices for making human connections easier. Today, however, their use has evolved into a platform for gaming, web surfing and GPS-enabled applications. Embedding GPS in handheld devices has turned them into significant trajectory data gathering facilities. Raw GPS trajectory data is a series of points that contains hidden information. Revealing the hidden information in traces requires trajectory data analysis. Among the most valuable information concealed in trajectory data are user activity patterns. Each pattern contains multiple stops and moves, which identify the places a user visited and the tasks performed. This paper proposes an approach to discover users' daily activity patterns from GPS trajectories using association rules. Finding user patterns requires extracting the user's visited places from the stops and moves of GPS trajectories. To locate stops and moves, we implemented a place recognition algorithm. After extraction of the visited points, an advanced association rule mining algorithm called Apriori was used to extract user activity patterns. This study shows that each trajectory contains useful patterns that can be extracted from raw GPS data using association rule mining techniques, in order to learn about multiple users' behaviour in a system, and that these patterns can be utilized in various location-based applications.
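
    The Apriori step described above can be sketched in a few lines of Python. This is a textbook version of frequent-itemset mining plus rule confidence, not the authors' implementation; the place names, the `min_support` value, and the function names are hypothetical, and each "transaction" is assumed to be the set of places one user visited in one day.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    # Level 1 candidates: every single item that occurs anywhere.
    current = [frozenset([item]) for item in {i for t in transactions for i in t}]
    frequent = {}
    k = 1
    while current:
        # Support = fraction of transactions that contain the candidate.
        level = {}
        for cand in current:
            support = sum(1 for t in transactions if cand <= t) / n
            if support >= min_support:
                level[cand] = support
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-candidates and prune any candidate
        # that has an infrequent k-subset (the Apriori property).
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        current = [c for c in joined
                   if all(frozenset(s) in level for s in combinations(c, k - 1))]
    return frequent


def confidence(frequent, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    a = frozenset(antecedent)
    return frequent[a | frozenset(consequent)] / frequent[a]


# Hypothetical visited places per day, as might come out of stop detection.
days = [{"home", "work", "gym"}, {"home", "work"},
        {"home", "cafe"}, {"home", "work", "cafe"}]
freq = apriori(days, min_support=0.5)
print(sorted((sorted(s), v) for s, v in freq.items()))
print(confidence(freq, {"work"}, {"home"}))
```

    With these toy data, "gym" and the pair "work, cafe" fall below the support threshold, so the three-place itemset is pruned without being counted.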

  20. Natural language processing systems for capturing and standardizing unstructured clinical information: A systematic review.

    PubMed

    Kreimeyer, Kory; Foster, Matthew; Pandey, Abhishek; Arya, Nina; Halford, Gwendolyn; Jones, Sandra F; Forshee, Richard; Walderhaug, Mark; Botsis, Taxiarchis

    2017-09-01

    We followed a systematic approach based on the Preferred Reporting Items for Systematic Reviews and Meta-Analyses to identify existing clinical natural language processing (NLP) systems that generate structured information from unstructured free text. Seven literature databases were searched with a query combining the concepts of natural language processing and structured data capture. Two reviewers screened all records for relevance during two screening phases, and information about clinical NLP systems was collected from the final set of papers. A total of 7149 records (after removing duplicates) were retrieved and screened, and 86 were determined to fit the review criteria. These papers contained information about 71 different clinical NLP systems, which were then analyzed. The NLP systems address a wide variety of important clinical and research tasks. Certain tasks are well addressed by the existing systems, while others remain as open challenges that only a small number of systems attempt, such as extraction of temporal information or normalization of concepts to standard terminologies. This review has identified many NLP systems capable of processing clinical free text and generating structured output, and the information collected and evaluated here will be important for prioritizing development of new approaches for clinical NLP. Copyright © 2017 Elsevier Inc. All rights reserved.

  1. Thermal-to-visible face recognition using partial least squares.

    PubMed

    Hu, Shuowen; Choi, Jonghyun; Chan, Alex L; Schwartz, William Robson

    2015-03-01

    Although visible face recognition has been an active area of research for several decades, cross-modal face recognition has only been explored by the biometrics community relatively recently. Thermal-to-visible face recognition is one of the most difficult cross-modal face recognition challenges, because of the difference in phenomenology between the thermal and visible imaging modalities. We address the cross-modal recognition problem using a partial least squares (PLS) regression-based approach consisting of preprocessing, feature extraction, and PLS model building. The preprocessing and feature extraction stages are designed to reduce the modality gap between the thermal and visible facial signatures, and facilitate the subsequent one-vs-all PLS-based model building. We incorporate multi-modal information into the PLS model building stage to enhance cross-modal recognition. The performance of the proposed recognition algorithm is evaluated on three challenging datasets containing visible and thermal imagery acquired under different experimental scenarios: time-lapse, physical tasks, mental tasks, and subject-to-camera range. These scenarios represent difficult challenges relevant to real-world applications. We demonstrate that the proposed method performs robustly for the examined scenarios.

  2. ChemEngine: harvesting 3D chemical structures of supplementary data from PDF files.

    PubMed

    Karthikeyan, Muthukumarasamy; Vyas, Renu

    2016-01-01

    Digital access to chemical journals resulted in a vast array of molecular information that is now available in the supplementary material files in PDF format. However, extracting this molecular information, generally from a PDF document format, is a daunting task. Here we present an approach to harvest 3D molecular data from the supporting information of scientific research articles that are normally available from publisher's resources. In order to demonstrate the feasibility of extracting truly computable molecules from PDF file formats in a fast and efficient manner, we have developed a Java based application, namely ChemEngine. This program recognizes textual patterns from the supplementary data and generates standard molecular structure data (bond matrix, atomic coordinates) that can be subjected to a multitude of computational processes automatically. The methodology has been demonstrated via several case studies on different formats of coordinates data stored in supplementary information files, wherein ChemEngine selectively harvested the atomic coordinates and interpreted them as molecules with high accuracy. The reusability of extracted molecular coordinate data was demonstrated by computing Single Point Energies that were in close agreement with the original computed data provided with the articles. It is envisaged that the methodology will enable large scale conversion of molecular information from supplementary files available in the PDF format into a collection of ready-to-compute molecular data to create an automated workflow for advanced computational processes. Software along with source codes and instructions available at https://sourceforge.net/projects/chemengine/files/?source=navbar.
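
    The idea of recognizing textual patterns of atomic coordinates can be illustrated with a short Python sketch. ChemEngine itself is a Java application and its actual patterns are not given in the abstract, so the regular expression, the assumed "symbol x y z" line layout, and the sample text below are all hypothetical.

```python
import re

# Assumed layout: element symbol followed by three decimal numbers on one line.
COORD_LINE = re.compile(
    r"^\s*([A-Z][a-z]?)\s+(-?\d+\.\d+)\s+(-?\d+\.\d+)\s+(-?\d+\.\d+)\s*$"
)


def extract_atoms(text):
    """Return (symbol, x, y, z) tuples for lines that look like coordinates."""
    atoms = []
    for line in text.splitlines():
        match = COORD_LINE.match(line)
        if match:
            symbol, x, y, z = match.groups()
            atoms.append((symbol, float(x), float(y), float(z)))
    return atoms


# Hypothetical text as it might appear after PDF-to-text conversion.
sample = """Table S1. Optimized geometry of water
O   0.000000   0.000000   0.117300
H   0.000000   0.757200  -0.469200
H   0.000000  -0.757200  -0.469200
E = -76.4 hartree
"""
for atom in extract_atoms(sample):
    print(atom)
```

    Caption and energy lines are skipped because they do not match the assumed layout; real supplementary files use many other layouts, which is why a single pattern like this would not be sufficient in practice.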

  3. Person Recognition System Based on a Combination of Body Images from Visible Light and Thermal Cameras.

    PubMed

    Nguyen, Dat Tien; Hong, Hyung Gil; Kim, Ki Wan; Park, Kang Ryoung

    2017-03-16

    The human body contains identity information that can be used for the person recognition (verification/recognition) problem. In this paper, we propose a person recognition method using the information extracted from body images. Our research is novel in the following three ways compared to previous studies. First, we use images of the human body for recognizing individuals. To overcome the limitations of previous studies on body-based person recognition that use only visible light images for recognition, we use human body images captured by two different kinds of camera, including a visible light camera and a thermal camera. The use of two different kinds of body image helps us to reduce the effects of noise, background, and variation in the appearance of a human body. Second, we apply a state-of-the-art method, called convolutional neural network (CNN), among various available methods, for image feature extraction in order to overcome the limitations of traditional hand-designed image feature extraction methods. Finally, with the extracted image features from body images, the recognition task is performed by measuring the distance between the input and enrolled samples. The experimental results show that the proposed method is efficient for enhancing recognition accuracy compared to systems that use only visible light or thermal images of the human body.
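
    The final matching step, measuring the distance between an input sample and enrolled samples, can be sketched as a nearest-neighbour search. The abstract does not state which distance is used, so Euclidean distance is an assumption here, and the feature vectors and identifiers are made-up stand-ins for CNN features.

```python
import math


def identify(query, enrolled):
    """Return (person_id, distance) of the enrolled feature closest to the query."""
    best_id, best_dist = None, math.inf
    for person_id, feature in enrolled.items():
        # Euclidean distance between two equal-length feature vectors (assumed metric).
        dist = math.dist(query, feature)
        if dist < best_dist:
            best_id, best_dist = person_id, dist
    return best_id, best_dist


# Hypothetical 2-D features; real CNN features would have many more dimensions.
enrolled = {"person_a": [0.0, 0.0], "person_b": [3.0, 4.0]}
print(identify([3.0, 3.0], enrolled))
```

    For verification rather than identification, the same distance would instead be compared against a threshold chosen on validation data.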

  4. A new automated spectral feature extraction method and its application in spectral classification and defective spectra recovery

    NASA Astrophysics Data System (ADS)

    Wang, Ke; Guo, Ping; Luo, A.-Li

    2017-03-01

    Spectral feature extraction is a crucial procedure in automated spectral analysis. This procedure starts from the spectral data and produces informative and non-redundant features, facilitating the subsequent automated processing and analysis with machine-learning and data-mining techniques. In this paper, we present a new automated feature extraction method for astronomical spectra, with application in spectral classification and defective spectra recovery. The basic idea of our approach is to train a deep neural network to extract features of spectra with different levels of abstraction in different layers. The deep neural network is trained with a fast layer-wise learning algorithm in an analytical way without any iterative optimization procedure. We evaluate the performance of the proposed scheme on real-world spectral data. The results demonstrate that our method is superior regarding its comprehensive performance, and the computational cost is significantly lower than that for other methods. The proposed method can be regarded as a new valid alternative general-purpose feature extraction method for various tasks in spectral data analysis.

  5. Computer vision for driver assistance systems

    NASA Astrophysics Data System (ADS)

    Handmann, Uwe; Kalinke, Thomas; Tzomakas, Christos; Werner, Martin; von Seelen, Werner

    1998-07-01

    Systems for automated image analysis are useful for a variety of tasks and their importance is still increasing due to technological advances and an increase of social acceptance. Especially in the field of driver assistance systems the progress in science has reached a level of high performance. Fully or partly autonomously guided vehicles, particularly for road-based traffic, pose high demands on the development of reliable algorithms due to the conditions imposed by natural environments. At the Institut für Neuroinformatik, methods for analyzing driving relevant scenes by computer vision are developed in cooperation with several partners from the automobile industry. We introduce a system which extracts the important information from an image taken by a CCD camera installed at the rear view mirror in a car. The approach consists of a sequential and a parallel sensor and information processing. Three main tasks namely the initial segmentation (object detection), the object tracking and the object classification are realized by integration in the sequential branch and by fusion in the parallel branch. The main gain of this approach is given by the integrative coupling of different algorithms providing partly redundant information.

  6. An Investigation of the Relationship Between Automated Machine Translation Evaluation Metrics and User Performance on an Information Extraction Task

    DTIC Science & Technology

    2007-01-01

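    parameter dimension between the two models). 93 were tested. Model 1: $\log\left(\frac{p_{\mathrm{Hits}}}{1 - p_{\mathrm{Hits}}}\right) = \alpha + \beta_1 \cdot \mathrm{MetricScore}$ (6.6) The results for each of the...505.67 oTERavg .357 .13 .007 $\log\left(\frac{p_{\mathrm{Hits}}}{1 - p_{\mathrm{Hits}}}\right)$, that is, log-odds of correct task performance, of 2.79 over the intercept only model. All... $\log\left(\frac{p_{\mathrm{Hits}}}{1 - p_{\mathrm{Hits}}}\right) = -1.15 - 0.418 \cdot I[\mathrm{MT}=2] - 0.527 \cdot I[\mathrm{MT}=3] + 1.78 \cdot \mathrm{METEOR} + 1.28 \cdot \mathrm{METEOR} \cdot I[\mathrm{MT}=2] + 1.86 \cdot \mathrm{METEOR} \cdot I[\mathrm{MT}=3]$ (6.7) Model 3: $\log\left(\frac{p_{\mathrm{Hits}}}{1 - p_{\mathrm{Hits}}}\right)$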
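
    To make the fitted log-odds equation numbered (6.7) in the excerpt above concrete, the short Python sketch below evaluates it for a given METEOR score and MT system and converts the log-odds to a probability with the standard logistic function. The coefficients are copied from the excerpt; the function names and the example inputs are assumptions, and MT system 1 is taken to be the reference level because it has no indicator term.

```python
import math


def log_odds(meteor, mt_system):
    """Log-odds of a correct response according to the equation numbered (6.7)."""
    i2 = 1.0 if mt_system == 2 else 0.0  # indicator I[MT=2]
    i3 = 1.0 if mt_system == 3 else 0.0  # indicator I[MT=3]
    return (-1.15 - 0.418 * i2 - 0.527 * i3
            + 1.78 * meteor
            + 1.28 * meteor * i2
            + 1.86 * meteor * i3)


def probability(meteor, mt_system):
    """Invert the logit: p = 1 / (1 + exp(-log_odds))."""
    return 1.0 / (1.0 + math.exp(-log_odds(meteor, mt_system)))


# Hypothetical METEOR score of 0.5 for each of the three MT systems.
for system in (1, 2, 3):
    print(system, round(log_odds(0.5, system), 3), round(probability(0.5, system), 3))
```

    The interaction terms mean that the slope with respect to METEOR differs by system: 1.78 for system 1, 1.78 + 1.28 for system 2, and 1.78 + 1.86 for system 3.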

  7. Kansas environmental and resource study: A Great Plains model, tasks 1-6

    NASA Technical Reports Server (NTRS)

    Haralick, R. M.; Kanemasu, E. T.; Morain, S. A.; Yarger, H. L. (Principal Investigator); Ulaby, F. T.; Shanmugam, K. S.; Williams, D. L.; Mccauley, J. R.; Mcnaughton, J. L.

    1972-01-01

    There are no author-identified significant results in this report. Environmental and resources investigations in Kansas utilizing ERTS-1 imagery are summarized for the following areas: (1) use of feature extraction techniques for texture context information in ERTS imagery; (2) interpretation and automatic image enhancement; (3) water use, production, and disease detection and predictions for wheat; (4) ERTS-1 agricultural statistics; (5) monitoring fresh water resources; and (6) ground pattern analysis in the Great Plains.

  8. Enhanced Gender Recognition System Using an Improved Histogram of Oriented Gradient (HOG) Feature from Quality Assessment of Visible Light and Thermal Images of the Human Body.

    PubMed

    Nguyen, Dat Tien; Park, Kang Ryoung

    2016-07-21

    With higher demand from users, surveillance systems are currently being designed to provide more information about the observed scene, such as the appearance of objects, types of objects, and other information extracted from detected objects. Although the recognition of gender of an observed human can be easily performed using human perception, it remains a difficult task when using computer vision system images. In this paper, we propose a new human gender recognition method that can be applied to surveillance systems based on quality assessment of human areas in visible light and thermal camera images. Our research is novel in the following two ways: First, we utilize the combination of visible light and thermal images of the human body for a recognition task based on quality assessment. We propose a quality measurement method to assess the quality of image regions so as to remove the effects of background regions in the recognition system. Second, by combining the features extracted using the histogram of oriented gradient (HOG) method and the measured qualities of image regions, we form a new image feature, called the weighted HOG (wHOG), which is used for efficient gender recognition. Experimental results show that our method produces more accurate estimation results than the state-of-the-art recognition method that uses human body images.

  9. Enhanced Gender Recognition System Using an Improved Histogram of Oriented Gradient (HOG) Feature from Quality Assessment of Visible Light and Thermal Images of the Human Body

    PubMed Central

    Nguyen, Dat Tien; Park, Kang Ryoung

    2016-01-01

    With higher demand from users, surveillance systems are currently being designed to provide more information about the observed scene, such as the appearance of objects, types of objects, and other information extracted from detected objects. Although the recognition of gender of an observed human can be easily performed using human perception, it remains a difficult task when using computer vision system images. In this paper, we propose a new human gender recognition method that can be applied to surveillance systems based on quality assessment of human areas in visible light and thermal camera images. Our research is novel in the following two ways: First, we utilize the combination of visible light and thermal images of the human body for a recognition task based on quality assessment. We propose a quality measurement method to assess the quality of image regions so as to remove the effects of background regions in the recognition system. Second, by combining the features extracted using the histogram of oriented gradient (HOG) method and the measured qualities of image regions, we form a new image feature, called the weighted HOG (wHOG), which is used for efficient gender recognition. Experimental results show that our method produces more accurate estimation results than the state-of-the-art recognition method that uses human body images. PMID:27455264

  10. Natural Language Processing Technologies in Radiology Research and Clinical Applications.

    PubMed

    Cai, Tianrun; Giannopoulos, Andreas A; Yu, Sheng; Kelil, Tatiana; Ripley, Beth; Kumamaru, Kanako K; Rybicki, Frank J; Mitsouras, Dimitrios

    2016-01-01

    The migration of imaging reports to electronic medical record systems holds great potential in terms of advancing radiology research and practice by leveraging the large volume of data continuously being updated, integrated, and shared. However, there are significant challenges as well, largely due to the heterogeneity of how these data are formatted. Indeed, although there is movement toward structured reporting in radiology (ie, hierarchically itemized reporting with use of standardized terminology), the majority of radiology reports remain unstructured and use free-form language. To effectively "mine" these large datasets for hypothesis testing, a robust strategy for extracting the necessary information is needed. Manual extraction of information is a time-consuming and often unmanageable task. "Intelligent" search engines that instead rely on natural language processing (NLP), a computer-based approach to analyzing free-form text or speech, can be used to automate this data mining task. The overall goal of NLP is to translate natural human language into a structured format (ie, a fixed collection of elements), each with a standardized set of choices for its value, that is easily manipulated by computer programs to (among other things) order into subcategories or query for the presence or absence of a finding. The authors review the fundamentals of NLP and describe various techniques that constitute NLP in radiology, along with some key applications. ©RSNA, 2016.

  11. Natural Language Processing Technologies in Radiology Research and Clinical Applications

    PubMed Central

    Cai, Tianrun; Giannopoulos, Andreas A.; Yu, Sheng; Kelil, Tatiana; Ripley, Beth; Kumamaru, Kanako K.; Rybicki, Frank J.

    2016-01-01

    The migration of imaging reports to electronic medical record systems holds great potential in terms of advancing radiology research and practice by leveraging the large volume of data continuously being updated, integrated, and shared. However, there are significant challenges as well, largely due to the heterogeneity of how these data are formatted. Indeed, although there is movement toward structured reporting in radiology (ie, hierarchically itemized reporting with use of standardized terminology), the majority of radiology reports remain unstructured and use free-form language. To effectively “mine” these large datasets for hypothesis testing, a robust strategy for extracting the necessary information is needed. Manual extraction of information is a time-consuming and often unmanageable task. “Intelligent” search engines that instead rely on natural language processing (NLP), a computer-based approach to analyzing free-form text or speech, can be used to automate this data mining task. The overall goal of NLP is to translate natural human language into a structured format (ie, a fixed collection of elements), each with a standardized set of choices for its value, that is easily manipulated by computer programs to (among other things) order into subcategories or query for the presence or absence of a finding. The authors review the fundamentals of NLP and describe various techniques that constitute NLP in radiology, along with some key applications. ©RSNA, 2016 PMID:26761536

  12. TEES 2.2: Biomedical Event Extraction for Diverse Corpora

    PubMed Central

    2015-01-01

    Background: The Turku Event Extraction System (TEES) is a text mining program developed for the extraction of events, complex biomedical relationships, from scientific literature. Based on a graph-generation approach, the system detects events with the use of a rich feature set built via dependency parsing. The TEES system has achieved record performance in several of the shared tasks of its domain, and continues to be used in a variety of biomedical text mining tasks. Results: The TEES system was quickly adapted to the BioNLP'13 Shared Task in order to provide a public baseline for derived systems. An automated approach was developed for learning the underlying annotation rules of event type, allowing immediate adaptation to the various subtasks, and leading to a first place in four out of eight tasks. The system for the automated learning of annotation rules is further enhanced in this paper to the point of requiring no manual adaptation to any of the BioNLP'13 tasks. Further, the scikit-learn machine learning library is integrated into the system, bringing a wide variety of machine learning methods usable with TEES in addition to the default SVM. A scikit-learn ensemble method is also used to analyze the importances of the features in the TEES feature sets. Conclusions: The TEES system was introduced for the BioNLP'09 Shared Task and has since then demonstrated good performance in several other shared tasks. By applying the current TEES 2.2 system to multiple corpora from these past shared tasks, an overarching analysis of the most promising methods and possible pitfalls in the evolving field of biomedical event extraction is presented. PMID:26551925

  13. TEES 2.2: Biomedical Event Extraction for Diverse Corpora.

    PubMed

    Björne, Jari; Salakoski, Tapio

    2015-01-01

    The Turku Event Extraction System (TEES) is a text mining program developed for the extraction of events, complex biomedical relationships, from scientific literature. Based on a graph-generation approach, the system detects events with the use of a rich feature set built via dependency parsing. The TEES system has achieved record performance in several of the shared tasks of its domain, and continues to be used in a variety of biomedical text mining tasks. The TEES system was quickly adapted to the BioNLP'13 Shared Task in order to provide a public baseline for derived systems. An automated approach was developed for learning the underlying annotation rules of event type, allowing immediate adaptation to the various subtasks, and leading to a first place in four out of eight tasks. The system for the automated learning of annotation rules is further enhanced in this paper to the point of requiring no manual adaptation to any of the BioNLP'13 tasks. Further, the scikit-learn machine learning library is integrated into the system, bringing a wide variety of machine learning methods usable with TEES in addition to the default SVM. A scikit-learn ensemble method is also used to analyze the importances of the features in the TEES feature sets. The TEES system was introduced for the BioNLP'09 Shared Task and has since then demonstrated good performance in several other shared tasks. By applying the current TEES 2.2 system to multiple corpora from these past shared tasks an overarching analysis of the most promising methods and possible pitfalls in the evolving field of biomedical event extraction are presented.

  14. Start Position Strongly Influences Fixation Patterns during Face Processing: Difficulties with Eye Movements as a Measure of Information Use

    PubMed Central

    Arizpe, Joseph; Kravitz, Dwight J.; Yovel, Galit; Baker, Chris I.

    2012-01-01

    Fixation patterns are thought to reflect cognitive processing and, thus, index the most informative stimulus features for task performance. During face recognition, initial fixations to the center of the nose have been taken to indicate this location is optimal for information extraction. However, the use of fixations as a marker for information use rests on the assumption that fixation patterns are predominantly determined by stimulus and task, despite the fact that fixations are also influenced by visuo-motor factors. Here, we tested the effect of starting position on fixation patterns during a face recognition task with upright and inverted faces. While we observed differences in fixations between upright and inverted faces, likely reflecting differences in cognitive processing, there was also a strong effect of start position. Over the first five saccades, fixation patterns across start positions were only coarsely similar, with most fixations around the eyes. Importantly, however, the precise fixation pattern was highly dependent on start position with a strong tendency toward facial features furthest from the start position. For example, the often-reported tendency toward the left over right eye was reversed for the left starting position. Further, delayed initial saccades for central versus peripheral start positions suggest greater information processing prior to the initial saccade, highlighting the experimental bias introduced by the commonly used center start position. Finally, the precise effect of face inversion on fixation patterns was also dependent on start position. These results demonstrate the importance of a non-stimulus, non-task factor in determining fixation patterns. The patterns observed likely reflect a complex combination of visuo-motor effects and simple sampling strategies as well as cognitive factors. These different factors are very difficult to tease apart and therefore great caution must be applied when interpreting absolute fixation locations as indicative of information use, particularly at a fine spatial scale. PMID:22319606

  15. Studying hemispheric lateralization during a Stroop task through near-infrared spectroscopy-based connectivity

    NASA Astrophysics Data System (ADS)

    Zhang, Lei; Sun, Jinyan; Sun, Bailei; Luo, Qingming; Gong, Hui

    2014-05-01

    Near-infrared spectroscopy (NIRS) is a developing and promising functional brain imaging technology. Developing data analysis methods to effectively extract meaningful information from collected data is the major bottleneck in popularizing this technology. In this study, we measured hemodynamic activity of the prefrontal cortex (PFC) during a color-word matching Stroop task using NIRS. Hemispheric lateralization was examined by employing traditional activation and novel NIRS-based connectivity analyses simultaneously. Wavelet transform coherence was used to assess intrahemispheric functional connectivity. Spearman correlation analysis was used to examine the relationship between behavioral performance and activation/functional connectivity, respectively. In agreement with activation analysis, functional connectivity analysis revealed leftward lateralization for the Stroop effect and correlation with behavioral performance. However, functional connectivity was more sensitive than activation for identifying hemispheric lateralization. Granger causality was used to evaluate the effective connectivity between hemispheres. The results showed increased information flow from the left to the right hemispheres for the incongruent versus the neutral task, indicating a leading role of the left PFC. This study demonstrates that the NIRS-based connectivity can reveal the functional architecture of the brain more comprehensively than traditional activation, helping to better utilize the advantages of NIRS.

  16. Dynamic replanning of 3D automated reconstruction using situation graph trees and illumination adjustment

    NASA Astrophysics Data System (ADS)

    Kohler, Sophie; Far, Aïcha Beya; Hirsch, Ernest

    2007-01-01

    This paper presents an original approach for the optimal 3D reconstruction of manufactured workpieces based on a priori planning of the task, enhanced on-line through dynamic adjustment of the lighting conditions, and built around a cognitive intelligent sensory system using so-called Situation Graph Trees. The system explicitly takes into account structural knowledge related to image acquisition conditions, type of illumination sources, contents of the scene (e.g., CAD models and tolerance information), etc. The principle of the approach relies on two steps. First, a so-called initialization phase, leading to the a priori task plan, collects this structural knowledge. This knowledge is conveniently encoded, as a sub-part, in the Situation Graph Tree that forms the backbone of the planning system and exhaustively specifies the behavior of the application. Second, the image is iteratively evaluated under the control of this Situation Graph Tree. The information describing the quality of the piece to analyze is thus extracted and further exploited for, e.g., inspection tasks. Lastly, the approach enables dynamic adjustment of the Situation Graph Tree, enabling the system to adjust itself to the actual application run-time conditions, thus providing the system with a self-learning capability.

  17. PDF text classification to leverage information extraction from publication reports.

    PubMed

    Bui, Duy Duc An; Del Fiol, Guilherme; Jonnalagadda, Siddhartha

    2016-06-01

    Data extraction from original study reports is a time-consuming, error-prone process in systematic review development. Information extraction (IE) systems have the potential to assist humans in the extraction task; however, the majority of IE systems were not designed to work on Portable Document Format (PDF) documents, an important and common extraction source for systematic reviews. In a PDF document, narrative content is often mixed with publication metadata or semi-structured text, which adds challenges for the underlying natural language processing algorithm. Our goal is to categorize PDF texts for strategic use by IE systems. We used an open-source tool to extract raw texts from a PDF document and developed a text classification algorithm that follows a multi-pass sieve framework to automatically classify PDF text snippets (for brevity, texts) into TITLE, ABSTRACT, BODYTEXT, SEMISTRUCTURE, and METADATA categories. To validate the algorithm, we developed a gold standard of PDF reports that were included in the development of previous systematic reviews by the Cochrane Collaboration. In a two-step procedure, we evaluated (1) classification performance, and compared it with machine learning classifiers, and (2) the effects of the algorithm on an IE system that extracts clinical outcome mentions. The multi-pass sieve algorithm achieved an accuracy of 92.6%, which was 9.7% (p<0.001) higher than the best performing machine learning classifier that used a logistic regression algorithm. F-measure improvements were observed in the classification of TITLE (+15.6%), ABSTRACT (+54.2%), BODYTEXT (+3.7%), SEMISTRUCTURE (+34%), and METADATA (+14.2%). In addition, use of the algorithm to filter semi-structured texts and publication metadata improved performance of the outcome extraction system (F-measure +4.1%, p=0.002). It also reduced the number of sentences to be processed by 44.9% (p<0.001), which corresponds to a processing time reduction of 50% (p=0.005). The rule-based multi-pass sieve framework can be used effectively in categorizing texts extracted from PDF documents. Text classification is an important prerequisite step to leverage information extraction from PDF documents. Copyright © 2016 Elsevier Inc. All rights reserved.
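
    The multi-pass sieve idea described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the rule functions, their regular expressions, and the thresholds below are hypothetical and chosen only to show the structure, in which ordered rules are applied one pass at a time, each pass labels only the snippets that are still unlabeled, and whatever remains falls through to BODYTEXT.

```python
import re

# Hypothetical rules, ordered from most to least precise. None of these
# patterns or thresholds come from the paper; they only show the structure.
def is_metadata(text, position):
    return bool(re.search(r"(doi:|©|copyright|@|https?://)", text, re.IGNORECASE))

def is_title(text, position):
    # Assumption: the first snippet of a document is its title if it is short.
    return position == 0 and len(text.split()) <= 25

def is_abstract(text, position):
    return text.lower().startswith(("abstract", "background", "objective"))

def is_semistructure(text, position):
    # Assumption: table-like text is dominated by numeric tokens.
    tokens = text.split()
    if not tokens:
        return False
    numeric = sum(1 for t in tokens if re.fullmatch(r"[\d.,%()<>=+-]+", t))
    return numeric / len(tokens) > 0.4

SIEVES = [
    ("METADATA", is_metadata),
    ("TITLE", is_title),
    ("ABSTRACT", is_abstract),
    ("SEMISTRUCTURE", is_semistructure),
]

def classify_snippets(snippets):
    """Label each snippet; later sieves never overwrite earlier decisions."""
    labels = [None] * len(snippets)
    for label, rule in SIEVES:  # one pass over the document per sieve
        for i, text in enumerate(snippets):
            if labels[i] is None and rule(text, i):
                labels[i] = label
    # Anything no sieve claimed is treated as narrative body text.
    return [label or "BODYTEXT" for label in labels]

example = [
    "PDF text classification to leverage information extraction",
    "Abstract Data extraction is time-consuming.",
    "Copyright © 2016 Elsevier Inc.",
    "12.5 (3.1) 14.2 (2.9) p<0.001",
    "We used an open-source tool to extract raw texts from a PDF document.",
]
print(classify_snippets(example))
```

    Ordering the sieves by precision means that a confident early rule cannot be undone by a looser later one, which is the main design point of a sieve architecture.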

  18. Biomedical named entity extraction: some issues of corpus compatibilities.

    PubMed

    Ekbal, Asif; Saha, Sriparna; Sikdar, Utpal Kumar

    2013-01-01

    Named Entity (NE) extraction is one of the most fundamental and important tasks in biomedical information extraction. It involves identification of certain entities from text and their classification into some predefined categories. In the biomedical community, there is as yet no general consensus regarding named entity (NE) annotation; thus, it is very difficult to compare the existing systems due to corpus incompatibilities. Because of this problem, we also cannot exploit the advantages of using different corpora together. In our present work we address the issues of corpus compatibilities, and use a single objective optimization (SOO) based classifier ensemble technique that uses the search capability of a genetic algorithm (GA) for NE extraction in biomedicine. We hypothesize that the reliability of predictions of each classifier differs among the various output classes. We use Conditional Random Field (CRF) and Support Vector Machine (SVM) frameworks to build a number of models depending upon the various representations of the set of features and/or feature templates. It is to be noted that we tried to extract the features without using any deep domain knowledge and/or resources. In order to assess the challenges of corpus compatibilities, we experiment with the different benchmark datasets and their various combinations. Comparison with the existing approaches shows the efficacy of the technique used. The GA based ensemble achieves around 2% performance improvement over the individual classifiers. Degradation in performance on the integrated corpus clearly shows the difficulties of the task. In summary, our ensemble based approach attains state-of-the-art performance levels for entity extraction in three different kinds of biomedical datasets. The possible reasons behind the better performance of our approach are (i) use of varied and rich features as described in Subsection "Features for named entity extraction"; and (ii) use of a GA based classifier ensemble technique to combine the outputs of multiple classifiers.

  19. Nonrigid mammogram registration using mutual information

    NASA Astrophysics Data System (ADS)

    Wirth, Michael A.; Narhan, Jay; Gray, Derek W. S.

    2002-05-01

    Of the papers dealing with the task of mammogram registration, the majority deal with the task by matching corresponding control-points derived from anatomical landmark points. One of the caveats encountered when using pure point-matching techniques is their reliance on accurately extracted anatomical features-points. This paper proposes an innovative approach to matching mammograms which combines the use of a similarity-measure and a point-based spatial transformation. Mutual information is a cost-function used to determine the degree of similarity between the two mammograms. An initial rigid registration is performed to remove global differences and bring the mammograms into approximate alignment. The mammograms are then subdivided into smaller regions and each of the corresponding subimages is matched independently using mutual information. The centroids of each of the matched subimages are then used as corresponding control-point pairs in association with the Thin-Plate Spline radial basis function. The resulting spatial transformation generates a nonrigid match of the mammograms. The technique is illustrated by matching mammograms from the MIAS mammogram database. An experimental comparison is made between mutual information incorporating purely rigid behavior, and that incorporating a more nonrigid behavior. The effectiveness of the registration process is evaluated using image differences.
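
    As an illustration of the similarity measure named in this abstract, the following Python sketch estimates mutual information between two images from their joint intensity histogram. It is a generic textbook estimator, not the authors' code: the function name, the flat-list image representation, and the bin count are assumptions made for the example.

```python
import math
from collections import Counter

def mutual_information(image_a, image_b, bins=8, max_value=256):
    """Mutual information (in bits) between two equally sized grayscale
    images given as flat sequences of intensities in [0, max_value)."""
    if len(image_a) != len(image_b) or not image_a:
        raise ValueError("images must be non-empty and the same size")
    width = max_value / bins

    def quantize(value):
        # Map an intensity to a histogram bin, clamping the top edge.
        return min(int(value / width), bins - 1)

    pairs = [(quantize(a), quantize(b)) for a, b in zip(image_a, image_b)]
    n = len(pairs)
    joint = Counter(pairs)                      # joint histogram
    marginal_a = Counter(a for a, _ in pairs)   # histogram of image A
    marginal_b = Counter(b for _, b in pairs)   # histogram of image B

    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        # p_ab / (p_a * p_b) simplifies to count * n / (count_a * count_b).
        mi += p_ab * math.log2(count * n / (marginal_a[a] * marginal_b[b]))
    return mi

aligned = [0, 64, 128, 192] * 4
print(mutual_information(aligned, aligned))
```

    In a registration loop, a value like this would be recomputed for each candidate transformation of one image or subimage, and the transformation giving the highest value would be kept.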

  20. Multi-dimensional classification of biomedical text: Toward automated, practical provision of high-utility text to diverse users

    PubMed Central

    Shatkay, Hagit; Pan, Fengxia; Rzhetsky, Andrey; Wilbur, W. John

    2008-01-01

    Motivation: Much current research in biomedical text mining is concerned with serving biologists by extracting certain information from scientific text. We note that there is no ‘average biologist’ client; different users have distinct needs. For instance, as noted in past evaluation efforts (BioCreative, TREC, KDD) database curators are often interested in sentences showing experimental evidence and methods. Conversely, lab scientists searching for known information about a protein may seek facts, typically stated with high confidence. Text-mining systems can target specific end-users and become more effective, if the system can first identify text regions rich in the type of scientific content that is of interest to the user, retrieve documents that have many such regions, and focus on fact extraction from these regions. Here, we study the ability to characterize and classify such text automatically. We have recently introduced a multi-dimensional categorization and annotation scheme, developed to be applicable to a wide variety of biomedical documents and scientific statements, while intended to support specific biomedical retrieval and extraction tasks. Results: The annotation scheme was applied to a large corpus in a controlled effort by eight independent annotators, where three individual annotators independently tagged each sentence. We then trained and tested machine learning classifiers to automatically categorize sentence fragments based on the annotation. We discuss here the issues involved in this task, and present an overview of the results. The latter strongly suggest that automatic annotation along most of the dimensions is highly feasible, and that this new framework for scientific sentence categorization is applicable in practice. Contact: shatkay@cs.queensu.ca PMID:18718948

  1. How does intentionality of encoding affect memory for episodic information?

    PubMed Central

    Craig, Michael; Butterworth, Karla; Nilsson, Jonna; Hamilton, Colin J.; Gallagher, Peter

    2016-01-01

    Episodic memory enables the detailed and vivid recall of past events, including target and wider contextual information. In this paper, we investigated whether/how encoding intentionality affects the retention of target and contextual episodic information from a novel experience. Healthy adults performed (1) a What-Where-When (WWW) episodic memory task involving the hiding and delayed recall of a number of items (what) in different locations (where) in temporally distinct sessions (when) and (2) unexpected tests probing memory for wider contextual information from the WWW task. Critically, some participants were informed that memory for WWW information would be subsequently probed (intentional group), while this came as a surprise for others (incidental group). The probing of contextual information came as a surprise for all participants. Participants also performed several measures of episodic and nonepisodic cognition from which common episodic and nonepisodic factors were extracted. Memory for target (WWW) and contextual information was superior in the intentional group compared with the incidental group. Memory for target and contextual information was unrelated to factors of nonepisodic cognition, irrespective of encoding intentionality. In addition, memory for target information was unrelated to factors of episodic cognition. However, memory for wider contextual information was related to some factors of episodic cognition, and these relationships differed between the intentional and incidental groups. Our results lead us to propose the hypothesis that intentional encoding of episodic information increases the coherence of the representation of the context in which the episode took place. This hypothesis remains to be tested. PMID:27918286

  2. Task-specific ionic liquid-assisted extraction and separation of astaxanthin from shrimp waste.

    PubMed

    Bi, Wentao; Tian, Minglei; Zhou, Jun; Row, Kyung Ho

    2010-08-15

    Astaxanthin, as an outstanding antioxidant reagent, was successfully extracted from shrimp waste by ionic liquid based ultrasonic-assisted extraction. Seven kinds of imidazolium ionic liquids with different cations and anions were investigated in this work and one task-specific ionic liquid in ethanol at 0.50 mol L(-1) was selected as the solvent. At the optimized ultrasonic extraction conditions, the extraction amount of astaxanthin increased by 98% (92.7 µg g(-1)) compared to the conventional method (46.7 µg g(-1)). Furthermore, the extracted solution was isolated through solid-phase extraction with a molecularly imprinted polymer sorbent. After loading the samples on a molecularly imprinted polymer cartridge, different washing and elution solvents, such as water, methanol, n-hexane, acetone and dichloromethane, were evaluated, and finally, astaxanthin was separated from the shrimp waste extract. Copyright 2010 Elsevier B.V. All rights reserved.

  3. Qualitative review of usability problems in health information systems for radiology.

    PubMed

    Dias, Camila Rodrigues; Pereira, Marluce Rodrigues; Freire, André Pimenta

    2017-12-01

    Radiology processes are commonly supported by Radiology Information System (RIS), Picture Archiving and Communication System (PACS) and other software for radiology. However, these information technologies can present usability problems that affect the performance of radiologists and physicians, especially considering the complexity of the tasks involved. The purpose of this study was to extract, classify and analyze qualitatively the usability problems in PACS, RIS and other software for radiology. A systematic review was performed to extract usability problems reported in empirical usability studies in the literature. The usability problems were categorized as violations of Nielsen and Molich's usability heuristics. The qualitative analysis indicated the causes and the effects of the identified usability problems. From the 431 papers initially identified, 10 met the study criteria. The analysis of the papers identified 90 instances of usability problems, classified into categories corresponding to established usability heuristics. The five heuristics with the highest number of instances of usability problems were "Flexibility and efficiency of use", "Consistency and standards", "Match between system and the real world", "Recognition rather than recall" and "Help and documentation", respectively. These problems can make the interaction time consuming, causing delays in tasks, dissatisfaction, frustration, preventing users from enjoying all the benefits and functionalities of the system, as well as leading to more errors and difficulties in carrying out clinical analyses. Furthermore, the present paper showed a lack of studies performed on systems for radiology, especially usability evaluations using formal methods of evaluation involving the final users. Copyright © 2017 Elsevier Inc. All rights reserved.

  4. Deep Learning from EEG Reports for Inferring Underspecified Information

    PubMed Central

    Goodwin, Travis R.; Harabagiu, Sanda M.

    2017-01-01

    Secondary use of electronic health records (EHRs) often relies on the ability to automatically identify and extract information from EHRs. Unfortunately, EHRs are known to suffer from a variety of idiosyncrasies – most prevalently, they have been shown to often omit or underspecify information. Adapting traditional machine learning methods for inferring underspecified information relies on manually specifying features characterizing the specific information to recover (e.g. particular findings, test results, or physician’s impressions). By contrast, in this paper, we present a method for jointly (1) automatically extracting word- and report-level features and (2) inferring underspecified information from EHRs. Our approach accomplishes these two tasks jointly by combining recent advances in deep neural learning with access to textual data in electroencephalogram (EEG) reports. We evaluate the performance of our model on the problem of inferring the neurologist’s overall impression (normal or abnormal) from electroencephalogram (EEG) reports and report an accuracy of 91.4%, precision of 94.4%, recall of 91.2%, and F1 measure of 92.8% (a 40% improvement over the performance obtained using Doc2Vec). These promising results demonstrate the power of our approach, while error analysis reveals remaining obstacles as well as areas for future improvement. PMID:28815118
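
    The F1 measure reported in this abstract is the harmonic mean of precision and recall, so the three figures can be checked against one another. The short Python sketch below does that arithmetic; the function name is an illustrative choice, and the inputs are simply the precision and recall quoted in the abstract.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Values quoted in the abstract, expressed as fractions.
reported_precision = 0.944
reported_recall = 0.912

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.3f}")  # prints "F1 = 0.928"
```

    The result rounds to 92.8%, which agrees with the F1 measure stated in the abstract.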

  5. Development of techniques for measuring pilot workload

    NASA Technical Reports Server (NTRS)

    Spyker, D. A.; Stackhouse, S. P.; Khalafalla, A. S.; Mclane, R. C.

    1971-01-01

    An objective method of assessing information workload based on physiological measurements was developed. Information workload, or reserve capacity, was measured using a visual discrimination secondary task and subjective rating of task difficulty. The primary task was two-axis (pitch and roll) tracking, and the independent variables in this study were aircraft pitch dynamics and wind gust disturbances. The study was structured to provide: (1) a sensitive, nonloading measure of reserve capacity, and (2) an unencumbering reliable measurement of the psychophysiological state. From these, a measured workload index (MWI) and physiological workload index (PWI) were extracted. An important measure of the success of this study was the degree to which the MWI and PWI agreed across the 243 randomly-presented, four-minute trials (9 subjects X 9 tasks X 3 replications). The electrophysiological data collected included vectorcardiogram, respiration, electromyogram, skin impedance, and electroencephalogram. Special computer programs were created for the analysis of each physiological variable. The digital data base then consisted of 82 physiological features for each of the 243 trials. A prediction of workload based on physiological observations was formulated as a simultaneous least-squares prediction problem. A best subset of 10 features was chosen to predict the three measures of reserve capacity. The canonical correlation coefficient was .754 with a chi squared value of 91.3 which allows rejection of the null hypothesis with p of .995.

  6. The effects of sleep deprivation on item and associative recognition memory.

    PubMed

    Ratcliff, Roger; Van Dongen, Hans P A

    2018-02-01

    Sleep deprivation adversely affects the ability to perform cognitive tasks, but theories range from predicting an overall decline in cognitive functioning because of reduced stability in attentional networks to specific deficits in various cognitive domains or processes. We measured the effects of sleep deprivation on two memory tasks, item recognition ("was this word in the list studied") and associative recognition ("were these two words studied in the same pair"). These tasks test memory for information encoded a few minutes earlier and so do not address effects of sleep deprivation on working memory or consolidation after sleep. A diffusion model was used to decompose accuracy and response time distributions to produce parameter estimates of components of cognitive processing. The model assumes that over time, noisy evidence from the task stimulus is accumulated to one of two decision criteria, and parameters governing this process are extracted and interpreted in terms of distinct cognitive processes. Results showed that sleep deprivation reduces drift rate (evidence used in the decision process), with little effect on the other components of the decision process. These results contrast with the effects of aging, which show little decline in item recognition but large declines in associative recognition. The results suggest that sleep deprivation degrades the quality of information stored in memory and that this may occur through degraded attentional processes. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
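
    The role of drift rate in this abstract can be made concrete with a small simulation. The Python sketch below is a generic two-boundary diffusion process, not the model fitted in the study: it omits non-decision time and across-trial variability, and the boundary, noise, time step, and drift values are arbitrary assumptions chosen for illustration.

```python
import math
import random

def simulate_trial(drift, rng, boundary=1.0, noise=1.0, dt=0.005, max_time=10.0):
    """Accumulate noisy evidence from the midpoint until a boundary is hit.

    Returns (hit_upper_boundary, decision_time). The upper boundary is
    treated as the correct response. A trial that reaches max_time without
    a decision is counted as not hitting the upper boundary.
    """
    evidence = boundary / 2.0
    elapsed = 0.0
    while 0.0 < evidence < boundary and elapsed < max_time:
        # Euler step: deterministic drift plus Gaussian noise scaled by sqrt(dt).
        evidence += drift * dt + noise * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        elapsed += dt
    return evidence >= boundary, elapsed

def summarize(drift, n_trials=500, seed=0):
    """Proportion of correct responses and mean decision time for one drift rate."""
    rng = random.Random(seed)
    results = [simulate_trial(drift, rng) for _ in range(n_trials)]
    accuracy = sum(hit for hit, _ in results) / n_trials
    mean_time = sum(t for _, t in results) / n_trials
    return accuracy, mean_time

for drift in (3.0, 0.5):  # hypothetical "rested" and "reduced" drift rates
    accuracy, mean_time = summarize(drift)
    print(f"drift={drift}: accuracy={accuracy:.2f}, mean decision time={mean_time:.2f} s")
```

    With the boundaries held fixed, lowering the drift rate lowers accuracy, which is the pattern one would expect if poorer-quality evidence enters the decision process.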

  7. A multi-ontology approach to annotate scientific documents based on a modularization technique.

    PubMed

    Gomes, Priscilla Corrêa E Castro; Moura, Ana Maria de Carvalho; Cavalcanti, Maria Cláudia

    2015-12-01

    Scientific text annotation has become an important task for biomedical scientists. Nowadays, there is an increasing need for the development of intelligent systems to support new scientific findings. Public databases available on the Web provide useful data, but much more useful information is only accessible in scientific texts. Text annotation may help as it relies on the use of ontologies to maintain annotations based on a uniform vocabulary. However, it is difficult to use an ontology, especially those that cover a large domain. In addition, since scientific texts explore multiple domains, which are covered by distinct ontologies, it becomes even more difficult to deal with such task. Moreover, there are dozens of ontologies in the biomedical area, and they are usually big in terms of the number of concepts. It is in this context that ontology modularization can be useful. This work presents an approach to annotate scientific documents using modules of different ontologies, which are built according to a module extraction technique. The main idea is to analyze a set of single-ontology annotations on a text to find out the user interests. Based on these annotations a set of modules are extracted from a set of distinct ontologies, and are made available for the user, for complementary annotation. The reduced size and focus of the extracted modules tend to facilitate the annotation task. An experiment was conducted to evaluate this approach, with the participation of a bioinformatician specialist of the Laboratory of Peptides and Proteins of the IOC/Fiocruz, who was interested in discovering new drug targets aiming at the combat of tropical diseases. Copyright © 2015 Elsevier Inc. All rights reserved.

  8. The (In)Effectiveness of Simulated Blur for Depth Perception in Naturalistic Images.

    PubMed

    Maiello, Guido; Chessa, Manuela; Solari, Fabio; Bex, Peter J

    2015-01-01

    We examine depth perception in images of real scenes with naturalistic variation in pictorial depth cues, simulated dioptric blur and binocular disparity. Light field photographs of natural scenes were taken with a Lytro plenoptic camera that simultaneously captures images at up to 12 focal planes. When accommodation at any given plane was simulated, the corresponding defocus blur at other depth planes was extracted from the stack of focal plane images. Depth information from pictorial cues, relative blur and stereoscopic disparity was separately introduced into the images. In 2AFC tasks, observers were required to indicate which of two patches extracted from these images was farther. Depth discrimination sensitivity was highest when geometric and stereoscopic disparity cues were both present. Blur cues impaired sensitivity by reducing the contrast of geometric information at high spatial frequencies. While simulated generic blur may not assist depth perception, it remains possible that dioptric blur from the optics of an observer's own eyes may be used to recover depth information on an individual basis. The implications of our findings for virtual reality rendering technology are discussed.

  9. The (In)Effectiveness of Simulated Blur for Depth Perception in Naturalistic Images

    PubMed Central

    Maiello, Guido; Chessa, Manuela; Solari, Fabio; Bex, Peter J.

    2015-01-01

    We examine depth perception in images of real scenes with naturalistic variation in pictorial depth cues, simulated dioptric blur and binocular disparity. Light field photographs of natural scenes were taken with a Lytro plenoptic camera that simultaneously captures images at up to 12 focal planes. When accommodation at any given plane was simulated, the corresponding defocus blur at other depth planes was extracted from the stack of focal plane images. Depth information from pictorial cues, relative blur and stereoscopic disparity was separately introduced into the images. In 2AFC tasks, observers were required to indicate which of two patches extracted from these images was farther. Depth discrimination sensitivity was highest when geometric and stereoscopic disparity cues were both present. Blur cues impaired sensitivity by reducing the contrast of geometric information at high spatial frequencies. While simulated generic blur may not assist depth perception, it remains possible that dioptric blur from the optics of an observer’s own eyes may be used to recover depth information on an individual basis. The implications of our findings for virtual reality rendering technology are discussed. PMID:26447793

  10. Data Processing and Text Mining Technologies on Electronic Medical Records: A Review

    PubMed Central

    Sun, Wencheng; Li, Yangyang; Liu, Fang; Fang, Shengqun; Wang, Guoyan

    2018-01-01

    Currently, medical institutes generally use EMR to record patient's condition, including diagnostic information, procedures performed, and treatment results. EMR has been recognized as a valuable resource for large-scale analysis. However, EMR has the characteristics of diversity, incompleteness, redundancy, and privacy, which make it difficult to carry out data mining and analysis directly. Therefore, it is necessary to preprocess the source data in order to improve data quality and improve the data mining results. Different types of data require different processing technologies. Most structured data commonly needs classic preprocessing technologies, including data cleansing, data integration, data transformation, and data reduction. For semistructured or unstructured data, such as medical text, containing more health information, it requires more complex and challenging processing methods. The task of information extraction for medical texts mainly includes NER (named-entity recognition) and RE (relation extraction). This paper focuses on the process of EMR processing and emphatically analyzes the key techniques. In addition, we make an in-depth study on the applications developed based on text mining together with the open challenges and research issues for future work. PMID:29849998

  11. ICA-Based Imagined Conceptual Words Classification on EEG Signals.

    PubMed

    Imani, Ehsan; Pourmohammad, Ali; Bagheri, Mahsa; Mobasheri, Vida

    2017-01-01

    Independent component analysis (ICA) has been used for detecting and removing the eye artifacts conventionally. However, in this research, it was used not only for detecting the eye artifacts, but also for detecting the brain-produced signals of two conceptual danger and information category words. In this cross-sectional research, electroencephalography (EEG) signals were recorded using Micromed and 19-channel helmet devices in unipolar mode, wherein Cz electrode was selected as the reference electrode. In the first part of this research, the statistical community test case included four men and four women, who were 25-30 years old. In the designed task, three groups of traffic signs were considered, in which two groups referred to the concept of danger, and the third one referred to the concept of information. In the second part, the three volunteers, two men and one woman, who had the best results, were chosen from among eight participants. In the second designed task, direction arrows (up, down, left, and right) were used. For the 2/8 volunteers in the rest times, very high-power alpha waves were observed from the back of the head; however, in the thinking times, they were different. According to this result, alpha waves for changing the task from thinking to rest condition took at least 3 s for the two volunteers, and it was at most 5 s until they went to the absolute rest condition. For the 7/8 volunteers, the danger and information signals were well classified; these differences for the 5/8 volunteers were observed in the right hemisphere, and, for the other three volunteers, the differences were observed in the left hemisphere. For the second task, simulations showed that the best classification accuracies resulted when the time window was 2.5 s. In addition, it also showed that the features of the autoregressive (AR)-15 model coefficients were the best choices for extracting the features. 
For all the states of neural network except hardlim discriminator function, the classification accuracies were almost the same and not very different. Linear discriminant analysis (LDA) in comparison with the neural network yielded higher classification accuracies. ICA is a suitable algorithm for recognizing of the word's concept and its place in the brain. Achieved results from this experiment were the same compared with the results from other methods such as functional magnetic resonance imaging and methods based on the brain signals (EEG) in the vowel imagination and covert speech. Herein, the highest classification accuracy was obtained by extracting the target signal from the output of the ICA and extracting the features of coefficients AR model with time interval of 2.5 s. Finally, LDA resulted in the highest classification accuracy more than 60%.

  12. Controlling the spotlight of attention: visual span size and flexibility in schizophrenia.

    PubMed

    Elahipanah, Ava; Christensen, Bruce K; Reingold, Eyal M

    2011-10-01

    The current study investigated the size and flexible control of visual span among patients with schizophrenia during visual search performance. Visual span is the region of the visual field from which one extracts information during a single eye fixation, and a larger visual span size is linked to more efficient search performance. Therefore, a reduced visual span may explain patients' impaired performance on search tasks. The gaze-contingent moving window paradigm was used to estimate the visual span size of patients and healthy participants while they performed two different search tasks. In addition, changes in visual span size were measured as a function of two manipulations of task difficulty: target-distractor similarity and stimulus familiarity. Patients with schizophrenia searched more slowly across both tasks and conditions. Patients also demonstrated smaller visual span sizes on the easier search condition in each task. Moreover, healthy controls' visual span size increased as target discriminability or distractor familiarity increased. This modulation of visual span size, however, was reduced or not observed among patients. The implications of the present findings, with regard to previously reported visual search deficits, and other functional and structural abnormalities associated with schizophrenia, are discussed. Copyright © 2011 Elsevier Ltd. All rights reserved.

  13. Differences in arithmetic performance between Chinese and German adults are accompanied by differences in processing of non-symbolic numerical magnitude

    PubMed Central

    Lonnemann, Jan; Li, Su; Zhao, Pei; Li, Peng; Linkersdörfer, Janosch; Lindberg, Sven; Hasselhorn, Marcus; Yan, Song

    2017-01-01

    Human beings are assumed to possess an approximate number system (ANS) dedicated to extracting and representing approximate numerical magnitude information. The ANS is assumed to be fundamental to arithmetic learning and has been shown to be associated with arithmetic performance. It is, however, still a matter of debate whether better arithmetic skills are reflected in the ANS. To address this issue, Chinese and German adults were compared regarding their performance in simple arithmetic tasks and in a non-symbolic numerical magnitude comparison task. Chinese participants showed a better performance in solving simple arithmetic tasks and faster reaction times in the non-symbolic numerical magnitude comparison task without making more errors than their German peers. These differences in performance could not be ascribed to differences in general cognitive abilities. Better arithmetic skills were thus found to be accompanied by a higher speed of retrieving non-symbolic numerical magnitude knowledge but not by a higher precision of non-symbolic numerical magnitude representations. The group difference in the speed of retrieving non-symbolic numerical magnitude knowledge was fully mediated by the performance in arithmetic tasks, suggesting that arithmetic skills shape non-symbolic numerical magnitude processing skills. PMID:28384191

  14. Thermodynamical detection of entanglement by Maxwell's demons

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Maruyama, Koji; Vedral, Vlatko; Morikoshi, Fumiaki

    2005-01-01

    Quantum correlation, or entanglement, is now believed to be an indispensable physical resource for certain tasks in quantum information processing, for which classically correlated states cannot be useful. Besides information processing, what kind of physical processes can exploit entanglement? In this paper, we show that there is indeed a more basic relationship between entanglement and its usefulness in thermodynamics. We derive an inequality showing that we can extract more work out of a heat bath via entangled systems than via classically correlated ones. We also analyze the work balance of the process as a heat engine, in connection with the second law of thermodynamics.

  15. Graph Learning in Knowledge Bases

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Goldberg, Sean; Wang, Daisy Zhe

    The amount of text data has been growing exponentially in recent years, giving rise to automatic information extraction methods that store text annotations in a database. The current state-of-the-art structured prediction methods, however, are likely to contain errors and it’s important to be able to manage the overall uncertainty of the database. On the other hand, the advent of crowdsourcing has enabled humans to aid machine algorithms at scale. As part of this project we introduced pi-CASTLE, a system that optimizes and integrates human and machine computing as applied to a complex structured prediction problem involving conditional random fields (CRFs). We proposed strategies grounded in information theory to select a token subset, formulate questions for the crowd to label, and integrate these labelings back into the database using a method of constrained inference. On both a text segmentation task over academic citations and a named entity recognition task over tweets we showed an order of magnitude improvement in accuracy gain over baseline methods.

  16. Towards an Obesity-Cancer Knowledge Base: Biomedical Entity Identification and Relation Detection

    PubMed Central

    Lossio-Ventura, Juan Antonio; Hogan, William; Modave, François; Hicks, Amanda; Hanna, Josh; Guo, Yi; He, Zhe; Bian, Jiang

    2017-01-01

    Obesity is associated with increased risks of various types of cancer, as well as a wide range of other chronic diseases. On the other hand, access to health information activates patient participation and improves health outcomes. However, existing online information on obesity and its relationship to cancer is heterogeneous, ranging from pre-clinical models and case studies to mere hypothesis-based scientific arguments. A formal knowledge representation (i.e., a semantic knowledge base) would help to better organize and deliver the quality health information related to obesity and cancer that consumers need. Nevertheless, current ontologies describing obesity, cancer and related entities are not designed to guide automatic knowledge base construction from heterogeneous information sources. Thus, in this paper, we present methods for named-entity recognition (NER) to extract biomedical entities from scholarly articles and for detecting if two biomedical entities are related, with the long-term goal of building an obesity-cancer knowledge base. We leverage both linguistic and statistical approaches in the NER task, which supersedes the state-of-the-art results. Further, based on statistical features extracted from the sentences, our method for relation detection obtains an accuracy of 99.3% and an f-measure of 0.993. PMID:28503356
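
    For readers unfamiliar with the two figures quoted for relation detection, the following is a minimal sketch of how accuracy and the f-measure are conventionally computed from a binary confusion matrix. The function names and the counts in the usage note are illustrative assumptions; the abstract does not report the underlying counts or how the authors computed their scores.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from true-positive, false-positive
    and false-negative counts. Empty denominators yield 0.0."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def accuracy(tp, tn, fp, fn):
    """Fraction of all decisions (related / not related) that were correct."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0
```

    With hypothetical counts of 8 true positives, 8 true negatives, 2 false positives and 2 false negatives, both functions give 0.8; accuracy and f-measure coincide here only because the example is symmetric.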

  17. Effects of Mangifera indica fruit extract on cognitive deficits in mice.

    PubMed

    Kumar, Sokindra; Maheshwari, Kamal Kishore; Singh, Vijender

    2009-07-01

    Mangos are a source of bioactive compounds with potential health-promoting activity. The present work was undertaken to evaluate the effect of the ethanolic extract of Mangifera indica L. fruit on cognitive performance. The models used to study the effect on cognitive performance were the step-down passive avoidance task and the elevated plus maze task in mice. Chronic treatment (7 days) with extract and vitamin C significantly (p < 0.05) reversed the aging- and scopolamine-induced memory deficits in both paradigms. Preliminary phytochemical screening revealed the presence of free sugars, saponins, tannins, and flavonoids. The results suggest that the extract contained pharmacologically active principles that are memory-enhancing in nature.

  18. Multichannel Convolutional Neural Network for Biological Relation Extraction.

    PubMed

    Quan, Chanqin; Hua, Lei; Sun, Xiao; Bai, Wenjun

    2016-01-01

    The plethora of biomedical relations which are embedded in medical logs (records) demands researchers' attention. Previous theoretical and practical focuses were restricted to traditional machine learning techniques. However, these methods are susceptible to the issues of "vocabulary gap" and data sparseness and the unattainable automation process in feature extraction. To address the aforementioned issues, in this work, we propose a multichannel convolutional neural network (MCCNN) for automated biomedical relation extraction. The proposed model has the following two contributions: (1) it enables the fusion of multiple (e.g., five) versions in word embeddings; (2) the need for manual feature engineering can be obviated by automated feature learning with convolutional neural network (CNN). We evaluated our model on two biomedical relation extraction tasks: drug-drug interaction (DDI) extraction and protein-protein interaction (PPI) extraction. For the DDI task, our system achieved an overall f-score of 70.2% compared to the standard linear SVM based system (e.g., 67.0%) on the DDIExtraction 2013 challenge dataset. For the PPI task, we evaluated our system on the Aimed and BioInfer PPI corpora; our system exceeded the state-of-the-art ensemble SVM system by 2.7% and 5.6% on f-scores.

  19. v3NLP Framework: Tools to Build Applications for Extracting Concepts from Clinical Text

    PubMed Central

    Divita, Guy; Carter, Marjorie E.; Tran, Le-Thuy; Redd, Doug; Zeng, Qing T; Duvall, Scott; Samore, Matthew H.; Gundlapalli, Adi V.

    2016-01-01

    Introduction: Substantial amounts of clinically significant information are contained only within the narrative of the clinical notes in electronic medical records. The v3NLP Framework is a set of “best-of-breed” functionalities developed to transform this information into structured data for use in quality improvement, research, population health surveillance, and decision support. Background: MetaMap, cTAKES and similar well-known natural language processing (NLP) tools do not have sufficient scalability out of the box. The v3NLP Framework evolved out of the necessity to scale up these tools and provide a framework to customize and tune techniques that fit a variety of tasks, including document classification, tuned concept extraction for specific conditions, patient classification, and information retrieval. Innovation: Beyond scalability, several v3NLP Framework-developed projects have been efficacy tested and benchmarked. While v3NLP Framework includes annotators, pipelines and applications, its functionalities enable developers to create novel annotators and to place annotators into pipelines and scaled applications. Discussion: The v3NLP Framework has been successfully utilized in many projects including general concept extraction, risk factors for homelessness among veterans, and identification of mentions of the presence of an indwelling urinary catheter. Projects as diverse as predicting colonization with methicillin-resistant Staphylococcus aureus and extracting references to military sexual trauma are being built using v3NLP Framework components. Conclusion: The v3NLP Framework is a set of functionalities and components that provide Java developers with the ability to create novel annotators and to place those annotators into pipelines and applications to extract concepts from clinical text. There are scale-up and scale-out functionalities to process large numbers of records. PMID:27683667

  20. Infrared Cephalic-Vein to Assist Blood Extraction Tasks: Automatic Projection and Recognition

    NASA Astrophysics Data System (ADS)

    Lagüela, S.; Gesto, M.; Riveiro, B.; González-Aguilera, D.

    2017-05-01

    The thermal infrared band is not commonly used in photogrammetric and computer vision algorithms, mainly due to the low spatial resolution of this type of imagery. However, this band captures sub-superficial information, increasing the capabilities of visible bands regarding applications. This fact is especially important in biomedicine and biometrics, allowing the geometric characterization of interior organs and pathologies with photogrammetric principles, as well as the automatic identification and labelling using computer vision algorithms. This paper presents advances of close-range photogrammetry and computer vision applied to thermal infrared imagery, with the final application of Augmented Reality in order to widen its application in the biomedical field. In this case, the thermal infrared image of the arm is acquired and simultaneously projected on the arm, together with the identification label of the cephalic vein. This way, blood analysts are assisted in finding the vein for blood extraction, especially in those cases where the identification by the human eye is a complex task. Vein recognition is performed based on the Gaussian temperature distribution in the area of the vein, while the calibration between projector and thermographic camera is developed through feature extraction and pattern recognition. The method is validated through its application to a set of volunteers of different ages and genders, in such a way that different conditions of body temperature and vein depth are covered for the applicability and reproducibility of the method.

  1. Gait Recognition Based on Convolutional Neural Networks

    NASA Astrophysics Data System (ADS)

    Sokolova, A.; Konushin, A.

    2017-05-01

    In this work we investigate the problem of people recognition by their gait. For this task, we implement deep learning approach using the optical flow as the main source of motion information and combine neural feature extraction with the additional embedding of descriptors for representation improvement. In order to find the best heuristics, we compare several deep neural network architectures, learning and classification strategies. The experiments were made on two popular datasets for gait recognition, so we investigate their advantages and disadvantages and the transferability of considered methods.

  2. DeTEXT: A Database for Evaluating Text Extraction from Biomedical Literature Figures

    PubMed Central

    Yin, Xu-Cheng; Yang, Chun; Pei, Wei-Yi; Man, Haixia; Zhang, Jun; Learned-Miller, Erik; Yu, Hong

    2015-01-01

    Hundreds of millions of figures are available in biomedical literature, representing important biomedical experimental evidence. Since text is a rich source of information in figures, automatically extracting such text may assist in the task of mining figure information. A high-quality ground truth standard can greatly facilitate the development of an automated system. This article describes DeTEXT: A database for evaluating text extraction from biomedical literature figures. It is the first publicly available, human-annotated, high quality, and large-scale figure-text dataset with 288 full-text articles, 500 biomedical figures, and 9308 text regions. This article describes how figures were selected from open-access full-text biomedical articles and how annotation guidelines and annotation tools were developed. We also discuss the inter-annotator agreement and the reliability of the annotations. We summarize the statistics of the DeTEXT data and make available evaluation protocols for DeTEXT. Finally we lay out challenges we observed in the automated detection and recognition of figure text and discuss research directions in this area. DeTEXT is publicly available for downloading at http://prir.ustb.edu.cn/DeTEXT/. PMID:25951377

  3. Development of Mobile Mapping System for 3D Road Asset Inventory.

    PubMed

    Sairam, Nivedita; Nagarajan, Sudhagar; Ornitz, Scott

    2016-03-12

    Asset Management is an important component of an infrastructure project. A significant cost is involved in maintaining and updating the asset information. Data collection is the most time-consuming task in the development of an asset management system. In order to reduce the time and cost involved in data collection, this paper proposes a low cost Mobile Mapping System using an equipped laser scanner and cameras. First, the feasibility of low cost sensors for 3D asset inventory is discussed by deriving appropriate sensor models. Then, through calibration procedures, respective alignments of the laser scanner, cameras, Inertial Measurement Unit and GPS (Global Positioning System) antenna are determined. The efficiency of this Mobile Mapping System is experimented by mounting it on a truck and golf cart. By using derived sensor models, geo-referenced images and 3D point clouds are derived. After validating the quality of the derived data, the paper provides a framework to extract road assets both automatically and manually using techniques implementing RANSAC plane fitting and edge extraction algorithms. Then the scope of such extraction techniques along with a sample GIS (Geographic Information System) database structure for unified 3D asset inventory are discussed.

  4. Development of Mobile Mapping System for 3D Road Asset Inventory

    PubMed Central

    Sairam, Nivedita; Nagarajan, Sudhagar; Ornitz, Scott

    2016-01-01

    Asset Management is an important component of an infrastructure project. A significant cost is involved in maintaining and updating the asset information. Data collection is the most time-consuming task in the development of an asset management system. In order to reduce the time and cost involved in data collection, this paper proposes a low cost Mobile Mapping System using an equipped laser scanner and cameras. First, the feasibility of low cost sensors for 3D asset inventory is discussed by deriving appropriate sensor models. Then, through calibration procedures, respective alignments of the laser scanner, cameras, Inertial Measurement Unit and GPS (Global Positioning System) antenna are determined. The efficiency of this Mobile Mapping System is experimented by mounting it on a truck and golf cart. By using derived sensor models, geo-referenced images and 3D point clouds are derived. After validating the quality of the derived data, the paper provides a framework to extract road assets both automatically and manually using techniques implementing RANSAC plane fitting and edge extraction algorithms. Then the scope of such extraction techniques along with a sample GIS (Geographic Information System) database structure for unified 3D asset inventory are discussed. PMID:26985897
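
    The abstract names RANSAC plane fitting as one of the techniques used for road-asset extraction but gives no detail. The following is a generic sketch of that technique on a 3D point cloud, not the authors' implementation; the function names, the distance threshold, the iteration count and the fixed random seed are all assumptions chosen for illustration.

```python
import math
import random


def plane_from_points(p, q, r):
    """Return (a, b, c, d) with unit normal (a, b, c) such that
    a*x + b*y + c*z + d = 0 passes through p, q and r.
    Returns None when the three points are (nearly) collinear."""
    ux, uy, uz = q[0] - p[0], q[1] - p[1], q[2] - p[2]
    vx, vy, vz = r[0] - p[0], r[1] - p[1], r[2] - p[2]
    # The cross product of two in-plane vectors gives the plane normal.
    nx = uy * vz - uz * vy
    ny = uz * vx - ux * vz
    nz = ux * vy - uy * vx
    norm = math.sqrt(nx * nx + ny * ny + nz * nz)
    if norm < 1e-12:
        return None
    nx, ny, nz = nx / norm, ny / norm, nz / norm
    d = -(nx * p[0] + ny * p[1] + nz * p[2])
    return nx, ny, nz, d


def ransac_plane(points, threshold=0.05, iterations=200, seed=0):
    """Repeatedly fit a plane to 3 random points and keep the candidate
    with the most points within `threshold` of it (the inliers)."""
    rng = random.Random(seed)
    best_plane, best_inliers = None, []
    for _ in range(iterations):
        plane = plane_from_points(*rng.sample(points, 3))
        if plane is None:
            continue
        a, b, c, d = plane
        inliers = [pt for pt in points
                   if abs(a * pt[0] + b * pt[1] + c * pt[2] + d) <= threshold]
        if len(inliers) > len(best_inliers):
            best_plane, best_inliers = plane, inliers
    return best_plane, best_inliers
```

    In a road-inventory setting the dominant plane found this way would typically be the road surface, and the remaining points could then be searched for other assets; a production system would usually refit the plane to all inliers by least squares, which this sketch omits.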

  5. Extracting biomedical events from pairs of text entities

    PubMed Central

    2015-01-01

    Background: Huge amounts of electronic biomedical documents, such as molecular biology reports or genomic papers, are generated daily. Nowadays, these documents are mainly available in the form of unstructured free texts, which require heavy processing for their registration into organized databases. This organization is instrumental for information retrieval, making it possible to answer the advanced queries of researchers and practitioners in biology, medicine, and related fields. Hence, the massive data flow calls for efficient automatic methods of text-mining that extract high-level information, such as biomedical events, from biomedical text. The usual computational tools of Natural Language Processing cannot be readily applied to extract these biomedical events, due to the peculiarities of the domain. Indeed, biomedical documents contain highly domain-specific jargon and syntax. These documents also describe distinctive dependencies, making text-mining in molecular biology a specific discipline. Results: We address biomedical event extraction as the classification of pairs of text entities into the classes corresponding to event types. The candidate pairs of text entities are recursively provided to a multiclass classifier relying on Support Vector Machines. This recursive process extracts events involving other events as arguments. Compared to joint models based on Markov Random Fields, our model simplifies inference and hence requires shorter training and prediction times along with lower memory capacity. Compared to usual pipeline approaches, our model passes over a complex intermediate problem, while making a more extensive usage of sophisticated joint features between text entities. Our method focuses on the core event extraction of the Genia task of BioNLP challenges, yielding the best result reported so far on the 2013 edition. PMID:26201478

  6. Methanolic extract of Piper nigrum fruits improves memory impairment by decreasing brain oxidative stress in amyloid beta(1-42) rat model of Alzheimer's disease.

    PubMed

    Hritcu, Lucian; Noumedem, Jaurès A; Cioanca, Oana; Hancianu, Monica; Kuete, Victor; Mihasan, Marius

    2014-04-01

    The present study analyzed the possible memory-enhancing and antioxidant properties of the methanolic extract of Piper nigrum L. fruits (50 and 100 mg/kg, orally, for 21 days) in amyloid beta(1-42) rat model of Alzheimer's disease. The memory-enhancing effects of the plant extract were studied by means of in vivo (Y-maze and radial arm-maze tasks) approaches. Also, the antioxidant activity in the hippocampus was assessed using superoxide dismutase-, catalase-, glutathione peroxidase-specific activities and the total content of reduced glutathione, malondialdehyde, and protein carbonyl levels. The amyloid beta(1-42)-treated rats exhibited the following: decrease of spontaneous alternations percentage within Y-maze task and increase of working memory and reference memory errors within radial arm-maze task. Administration of the plant extract significantly improved memory performance and exhibited antioxidant potential. Our results suggest that the plant extract ameliorates amyloid beta(1-42)-induced spatial memory impairment by attenuation of the oxidative stress in the rat hippocampus.

  7. Functional Connectivity among Spikes in Low Dimensional Space during Working Memory Task in Rat

    PubMed Central

    Tian, Xin

    2014-01-01

    Working memory (WM) is critically important in cognitive tasks. Functional connectivity has been a powerful tool for understanding the mechanism underlying information processing during WM tasks. The aim of this study is to investigate how to effectively characterize the dynamic variations of the functional connectivity in low dimensional space among the principal components (PCs) which were extracted from the instantaneous firing rate series. Spikes were obtained from the medial prefrontal cortex (mPFC) of rats with an implanted microelectrode array and then transformed into continuous series via the instantaneous firing rate method. The Granger causality method is proposed to study the functional connectivity. Then three scalar metrics were applied to identify the changes of the reduced dimensionality functional network during working memory tasks: functional connectivity (GC), global efficiency (E) and causal density (CD). As a comparison, GC, E and CD were also calculated to describe the functional connectivity in the original space. The results showed that these network characteristics dynamically changed during the correct WM tasks. The measure values increased to a maximum, and then decreased, both in the original and in the reduced dimensionality. Besides, the feature values of the reduced dimensionality were significantly higher during the WM tasks than they were in the original space. These findings suggested that functional connectivity among the spikes varied dynamically during the WM tasks and could be described effectively in the low dimensional space. PMID:24658291
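
    Of the three network metrics listed, global efficiency is the simplest to illustrate. The sketch below uses its common graph-theoretic definition, the mean inverse shortest-path length over all ordered node pairs, on an unweighted directed graph. Treating the Granger-causality network as a binary directed graph is an assumption made here for illustration; the abstract does not say how the network was thresholded or weighted, and the function name is hypothetical.

```python
from collections import deque


def global_efficiency(adjacency):
    """Mean of 1/d(i, j) over all ordered pairs of distinct nodes, where
    d is the directed shortest-path length in hops. Unreachable pairs
    contribute 0. `adjacency` maps every node to its successor nodes;
    each node must appear as a key, even if it has no successors."""
    nodes = list(adjacency)
    n = len(nodes)
    if n < 2:
        return 0.0
    total = 0.0
    for source in nodes:
        # Breadth-first search gives hop distances from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adjacency.get(u, ()):
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(1.0 / d for node, d in dist.items() if node != source)
    return total / (n * (n - 1))
```

    A fully connected network scores 1.0 and a network with no edges scores 0.0, so a rise and fall of E during a trial, as the abstract reports, would correspond to the network becoming transiently better connected.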

  8. Search guidance is proportional to the categorical specificity of a target cue.

    PubMed

    Schmidt, Joseph; Zelinsky, Gregory J

    2009-10-01

    Visual search studies typically assume the availability of precise target information to guide search, often a picture of the exact target. However, search targets in the real world are often defined categorically and with varying degrees of visual specificity. In five target preview conditions we manipulated the availability of target visual information in a search task for common real-world objects. Previews were: a picture of the target, an abstract textual description of the target, a precise textual description, an abstract + colour textual description, or a precise + colour textual description. Guidance generally increased as information was added to the target preview. We conclude that the information used for search guidance need not be limited to a picture of the target. Although generally less precise, to the extent that visual information can be extracted from a target label and loaded into working memory, this information too can be used to guide search.

  9. Context Modulates Congruency Effects in Selective Attention to Social Cues.

    PubMed

    Ravagli, Andrea; Marini, Francesco; Marino, Barbara F M; Ricciardelli, Paola

    2018-01-01

    Head and gaze directions are used during social interactions as essential cues to infer where someone attends. When head and gaze are oriented toward opposite directions, we need to extract socially meaningful information despite stimulus conflict. Recently, a cognitive and neural mechanism for filtering-out conflicting stimuli has been identified while performing non-social attention tasks. This mechanism is engaged proactively when conflict is anticipated in a high proportion of trials and reactively when conflict occurs infrequently. Here, we investigated whether a similar mechanism is at play for limiting distraction from conflicting social cues during gaze or head direction discrimination tasks in contexts with different probabilities of conflict. Results showed that, for the gaze direction task only (Experiment 1), inverse efficiency (IE) scores for distractor-absent trials (i.e., faces with averted gaze and centrally oriented head) were larger (indicating worse performance) when these trials were intermixed with congruent/incongruent distractor-present trials (i.e., faces with averted gaze and tilted head in the same/opposite direction) relative to when the same distractor-absent trials were shown in isolation. Moreover, on distractor-present trials, IE scores for congruent (vs. incongruent) head-gaze pairs in blocks with rare conflict were larger than in blocks with frequent conflict, suggesting that adaptation to conflict was more efficient than adaptation to infrequent events. However, when the task required discrimination of head orientation while ignoring gaze direction, performance was not impacted by both block-level and current trial congruency (Experiment 2), unless the cognitive load of the task was increased by adding a concurrent task (Experiment 3). 
Overall, our study demonstrates that during attention to social cues proactive cognitive control mechanisms are modulated by the expectation of conflicting stimulus information at both the block- and trial-sequence level, and by the type of task and cognitive load. This helps to clarify the inherent differences in the distracting potential of head and gaze cues during speeded social attention tasks.
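
    The inverse efficiency (IE) score used as the dependent measure above is not defined in the abstract. Under its conventional definition, mean response time on correct trials divided by the proportion of correct trials, it can be sketched as follows; the trial format and the function name are assumptions for illustration, and the authors' exact preprocessing is not stated.

```python
def inverse_efficiency(trials):
    """Inverse efficiency score for one participant and condition.

    `trials` is a list of (response_time_ms, is_correct) pairs.
    Larger values indicate worse performance: slower responses,
    more errors, or both."""
    if not trials:
        raise ValueError("at least one trial is required")
    correct_rts = [rt for rt, is_correct in trials if is_correct]
    if not correct_rts:
        # No correct responses: efficiency is undefined, report infinity.
        return float("inf")
    mean_correct_rt = sum(correct_rts) / len(correct_rts)
    proportion_correct = len(correct_rts) / len(trials)
    return mean_correct_rt / proportion_correct
```

    For example, three correct trials averaging 500 ms plus one error give 500 / 0.75, roughly 667 ms, which is how a single score can combine speed and accuracy.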

  10. Unified Modeling Language (UML) for hospital-based cancer registration processes.

    PubMed

    Shiki, Naomi; Ohno, Yuko; Fujii, Ayumi; Murata, Taizo; Matsumura, Yasushi

    2008-01-01

    Hospital-based cancer registry involves complex processing steps that span across multiple departments. In addition, management techniques and registration procedures differ depending on each medical facility. Establishing processes for hospital-based cancer registry requires clarifying specific functions and labor needed. In recent years, the business modeling technique, in which management evaluation is done by clearly spelling out processes and functions, has been applied to business process analysis. However, there are few analytical reports describing the applications of these concepts to medical-related work. In this study, we initially sought to model hospital-based cancer registration processes using the Unified Modeling Language (UML), to clarify functions. The object of this study was the cancer registry of Osaka University Hospital. We organized the hospital-based cancer registration processes based on interview and observational surveys, and produced an As-Is model using activity, use-case, and class diagrams. After drafting every UML model, it was fed back to practitioners to check its validity and then improved. We were able to define the workflow for each department using activity diagrams. In addition, by using use-case diagrams we were able to classify each department within the hospital as a system, and thereby specify the core processes and staff that were responsible for each department. The class diagrams were effective in systematically organizing the information to be used for hospital-based cancer registries. Using UML modeling, hospital-based cancer registration processes were broadly classified into three separate processes, namely, registration tasks, quality control, and filing data. An additional 14 functions were also extracted. Many tasks take place within the hospital-based cancer registry office, but the process of providing information spans across multiple departments. Moreover, additional tasks were required in comparison to using a standardized system because the hospital-based cancer registration system was constructed with the pre-existing computer system in Osaka University Hospital. Difficulty in using information useful for cancer registration processes was shown to increase the task workload. By using UML, we were able to clarify functions and extract the typical processes for a hospital-based cancer registry. Modeling can provide a basis of process analysis for establishment of efficient hospital-based cancer registration processes in each institute.

  11. Ontologies in medicinal chemistry: current status and future challenges.

    PubMed

    Gómez-Pérez, Asunción; Martínez-Romero, Marcos; Rodríguez-González, Alejandro; Vázquez, Guillermo; Vázquez-Naya, José M

    2013-01-01

    Recent years have seen a dramatic increase in the amount and availability of data in the diverse areas of medicinal chemistry, making it possible to achieve significant advances in fields such as the design, synthesis and biological evaluation of compounds. However, with this data explosion, the storage, management and analysis of available data to extract relevant information has become an even more complex task that offers challenging research issues to Artificial Intelligence (AI) scientists. Ontologies have emerged in AI as a key tool to formally represent and semantically organize aspects of the real world. Beyond glossaries or thesauri, ontologies facilitate communication between experts and allow the application of computational techniques to extract useful information from available data. In medicinal chemistry, multiple ontologies have been developed in recent years which contain knowledge about chemical compounds and processes of synthesis of pharmaceutical products. This article reviews the principal standards and ontologies in medicinal chemistry, analyzes their main applications and suggests future directions.

  12. Recent progress in automatically extracting information from the pharmacogenomic literature

    PubMed Central

    Garten, Yael; Coulet, Adrien; Altman, Russ B

    2011-01-01

    The biomedical literature holds our understanding of pharmacogenomics, but it is dispersed across many journals. In order to integrate our knowledge, connect important facts across publications and generate new hypotheses we must organize and encode the contents of the literature. By creating databases of structured pharmacogenomic knowledge, we can make the value of the literature much greater than the sum of the individual reports. We can, for example, generate candidate gene lists or interpret surprising hits in genome-wide association studies. Text mining automatically adds structure to the unstructured knowledge embedded in millions of publications, and recent years have seen a surge in work on biomedical text mining, some specific to pharmacogenomics literature. These methods enable extraction of specific types of information and can also provide answers to general, systemic queries. In this article, we describe the main tasks of text mining in the context of pharmacogenomics, summarize recent applications and anticipate the next phase of text mining applications. PMID:21047206

  13. Domain-independent information extraction in unstructured text

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Irwin, N.H.

    Extracting information from unstructured text has become an important research area in recent years due to the large amount of text now electronically available. This status report describes the findings and work done during the second year of a two-year Laboratory Directed Research and Development Project. Building on the first year's work of identifying important entities, this report details techniques used to group words into semantic categories and to output templates containing selective document content. Using word profiles and category clustering derived during a training run, the time-consuming knowledge-building task can be avoided. Though the output still lacks in completeness when compared to systems with domain-specific knowledge bases, the results do look promising. The two approaches are compatible and could complement each other within the same system. Domain-independent approaches retain appeal as a system that adapts and learns will soon outpace a system with any amount of a priori knowledge.

  14. Complex temporal topic evolution modelling using the Kullback-Leibler divergence and the Bhattacharyya distance.

    PubMed

    Andrei, Victor; Arandjelović, Ognjen

    2016-12-01

    The rapidly expanding corpus of medical research literature presents major challenges in the understanding of previous work, the extraction of maximum information from collected data, and the identification of promising research directions. We present a case for the use of advanced machine learning techniques as an aid in this task and introduce a novel methodology that is shown to be capable of extracting meaningful information from large longitudinal corpora and of tracking complex temporal changes within them. Our framework is based on (i) the discretization of time into epochs, (ii) epoch-wise topic discovery using a hierarchical Dirichlet process-based model, and (iii) a temporal similarity graph which allows for the modelling of complex topic changes. More specifically, this is the first work that discusses and distinguishes between two groups of particularly challenging topic evolution phenomena: topic splitting and speciation and topic convergence and merging, in addition to the more widely recognized emergence and disappearance and gradual evolution. The proposed framework is evaluated on a public medical literature corpus.
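
    The two measures named in the title can be illustrated on discrete word distributions. The following is a minimal sketch, assuming each topic is a probability vector over a shared vocabulary and that topics from consecutive epochs are compared pairwise; the abstract does not state how the measures enter the similarity graph, and the example vectors are hypothetical.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """Kullback-Leibler divergence D(p || q) for equal-length discrete distributions.

    eps is a small smoothing constant (an assumption of this sketch) that
    avoids division by zero when q assigns zero mass to a word.
    """
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)

def bhattacharyya_distance(p, q):
    """Negative log of the Bhattacharyya coefficient sum(sqrt(p_i * q_i))."""
    coefficient = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    return -math.log(coefficient) if coefficient > 0 else math.inf

# Hypothetical word distributions of one topic in two consecutive epochs.
topic_epoch_1 = [0.5, 0.3, 0.2, 0.0]
topic_epoch_2 = [0.4, 0.3, 0.2, 0.1]

print(kl_divergence(topic_epoch_1, topic_epoch_2))
print(bhattacharyya_distance(topic_epoch_1, topic_epoch_2))
```

    Note that the Kullback-Leibler divergence is asymmetric while the Bhattacharyya distance is symmetric, so the two can rank the same pair of topics differently.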

  15. Task Force II: Energy and Its Socioeconomic Impacts

    ERIC Educational Resources Information Center

    Appalachia, 1977

    1977-01-01

    Summarizing the Task Force Issues Paper presented at the Appalachian Conference on Balanced Growth and Economic Development (1977), this article presents selected comments by Task Force participants, and Task Force recommendations re: a national severance tax on extraction of nonrenewable energy resources; socioeconomic costs of nuclear energy; a…

  16. Adults with Autism Tend to Undermine the Hidden Environmental Structure: Evidence from a Visual Associative Learning Task.

    PubMed

    Sapey-Triomphe, Laurie-Anne; Sonié, Sandrine; Hénaff, Marie-Anne; Mattout, Jérémie; Schmitz, Christina

    2018-04-13

    The learning-style theory of Autism Spectrum Disorders (ASD) (Qian, Lipkin, Frontiers in Human Neuroscience 5:77, 2011) states that ASD individuals differ from neurotypics in the way they learn and store information about the environment and its structure. ASD would rather adopt a lookup-table strategy (LUT: memorizing each experience), while neurotypics would favor an interpolation style (INT: extracting regularities to generalize). In a series of visual behavioral tasks, we tested this hypothesis in 20 neurotypical and 20 ASD adults. ASD participants had difficulties using the INT style when instructions were hidden but not when instructions were revealed. Rather than an inability to use rules, ASD would be characterized by a disinclination to generalize and infer such rules.

  17. Chemical-induced disease relation extraction with various linguistic features.

    PubMed

    Gu, Jinghang; Qian, Longhua; Zhou, Guodong

    2016-01-01

    Understanding the relations between chemicals and diseases is crucial in various biomedical tasks such as new drug discoveries and new therapy developments. While manually mining these relations from the biomedical literature is costly and time-consuming, such a procedure is often difficult to keep up-to-date. To address these issues, the BioCreative-V community proposed a challenging task of automatic extraction of chemical-induced disease (CID) relations in order to benefit biocuration. This article describes our work on the CID relation extraction task on the BioCreative-V tasks. We built a machine learning based system that utilized simple yet effective linguistic features to extract relations with maximum entropy models. In addition to leveraging various features, the hypernym relations between entity concepts derived from the Medical Subject Headings (MeSH)-controlled vocabulary were also employed during both training and testing stages to obtain more accurate classification models and better extraction performance, respectively. We demoted relation extraction between entities in documents to relation extraction between entity mentions. In our system, pairs of chemical and disease mentions at both intra- and inter-sentence levels were first constructed as relation instances for training and testing, then two classification models at both levels were trained from the training examples and applied to the testing examples. Finally, we merged the classification results from mention level to document level to acquire final relations between chemicals and diseases. Our system achieved promising F-scores of 60.4% on the development dataset and 58.3% on the test dataset using gold-standard entity annotations, respectively. Database URL: https://github.com/JHnlp/BC5CIDTask. © The Author(s) 2016. Published by Oxford University Press.
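
    The final merging step, from mention-level predictions to document-level relations, can be sketched as follows. This is an illustration only: the abstract does not give the merging rule, so the sketch assumes that a chemical-disease pair is reported for a document as soon as any of its mention pairs is classified positive. The tuple layout and identifiers are hypothetical.

```python
from collections import defaultdict

def merge_to_document_level(mention_predictions):
    """Collapse mention-pair predictions into document-level relations.

    mention_predictions: iterable of (doc_id, chemical_id, disease_id, is_positive).
    Assumed rule: one positive mention pair is enough to assert the relation.
    """
    relations = defaultdict(set)
    for doc_id, chemical_id, disease_id, is_positive in mention_predictions:
        if is_positive:
            relations[doc_id].add((chemical_id, disease_id))
    return dict(relations)

# Hypothetical classifier output for two documents.
predictions = [
    ("doc1", "CHEM_1", "DIS_1", False),  # intra-sentence pair, rejected
    ("doc1", "CHEM_1", "DIS_1", True),   # inter-sentence pair, accepted
    ("doc1", "CHEM_2", "DIS_1", False),
    ("doc2", "CHEM_3", "DIS_2", True),
]
print(merge_to_document_level(predictions))
```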

  18. Ad Hoc Information Extraction for Clinical Data Warehouses.

    PubMed

    Dietrich, Georg; Krebs, Jonathan; Fette, Georg; Ertl, Maximilian; Kaspar, Mathias; Störk, Stefan; Puppe, Frank

    2018-05-01

    Clinical Data Warehouses (CDW) reuse Electronic health records (EHR) to make their data retrievable for research purposes or patient recruitment for clinical trials. However, much information is hidden in unstructured data like discharge letters. They can be preprocessed and converted to structured data via information extraction (IE), which is unfortunately a laborious task and therefore usually not available for most of the text data in CDW. The goal of our work is to provide an ad hoc IE service that allows users to query text data ad hoc in a manner similar to querying structured data in a CDW. While search engines just return text snippets, our system also returns frequencies (e.g. how many patients exist with "heart failure" including textual synonyms or how many patients have an LVEF < 45) based on the content of discharge letters or textual reports for special investigations like heart echo. Three subtasks are addressed: (1) To recognize and to exclude negations and their scopes, (2) to extract concepts, i.e. Boolean values and (3) to extract numerical values. We implemented an extended version of the NegEx-algorithm for German texts that detects negations and determines their scope. Furthermore, our document oriented CDW PaDaWaN was extended with query functions, e.g. context sensitive queries and regex queries, and an extraction mode for computing the frequencies for Boolean and numerical values. Evaluations in chest X-ray reports and in discharge letters showed high F1-scores for the three subtasks: Detection of negated concepts in chest X-ray reports with an F1-score of 0.99 and in discharge letters with 0.97; of Boolean values in chest X-ray reports about 0.99, and of numerical values in chest X-ray reports and discharge letters also around 0.99 with the exception of the concept age. The advantages of an ad hoc IE over a standard IE are the low development effort (just entering the concept with its variants), the promptness of the results and the adaptability by the user to his or her particular question. Disadvantages are usually lower accuracy and confidence. This ad hoc information extraction approach is novel and exceeds existing systems: Roogle [1] extracts predefined concepts from texts at preprocessing and makes them retrievable at runtime. Dr. Warehouse [2] applies negation detection and indexes the produced subtexts which include affirmed findings. Our approach combines negation detection and the extraction of concepts. But the extraction does not take place during preprocessing, but at runtime. That provides an ad hoc, dynamic, interactive and adjustable information extraction of random concepts and even their values on the fly at runtime. We developed an ad hoc information extraction query feature for Boolean and numerical values within a CDW with high recall and precision based on a pipeline that detects and removes negations and their scope in clinical texts. Schattauer GmbH.
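
    The three subtasks (negation handling, Boolean concepts, numerical values) can be illustrated with a much simplified sketch. It is not the extended NegEx algorithm or the PaDaWaN query engine described in the abstract: the English trigger list, the fixed five-token scope window, and the LVEF text format are all assumptions made for illustration.

```python
import re

NEGATION_TRIGGERS = ("no", "without", "denies")  # hypothetical, tiny trigger list
SCOPE_WINDOW = 5  # tokens before the concept searched for a trigger (assumed)

def is_negated(sentence, concept):
    """Return True if a negation trigger occurs shortly before the concept."""
    tokens = re.findall(r"\w+", sentence.lower())
    concept_tokens = concept.lower().split()
    n = len(concept_tokens)
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == concept_tokens:
            window = tokens[max(0, i - SCOPE_WINDOW):i]
            if any(token in NEGATION_TRIGGERS for token in window):
                return True
    return False

# Assumed report format: "LVEF", up to 15 non-digit characters, a number, "%".
LVEF_PATTERN = re.compile(r"\bLVEF\b\D{0,15}?(\d{1,3})\s*%", re.IGNORECASE)

def extract_lvef(text):
    """Extract the LVEF percentage as an int, or None if no value is found."""
    match = LVEF_PATTERN.search(text)
    return int(match.group(1)) if match else None

# Hypothetical report snippets, one per patient.
reports = ["LVEF: 40 %", "LVEF of 55%", "No signs of heart failure."]
values = [extract_lvef(r) for r in reports]
print(sum(1 for v in values if v is not None and v < 45))  # patients with LVEF < 45
```

    Running the extraction at query time rather than during preprocessing, as the abstract describes, means a pattern like this can be changed by the user without reindexing the documents.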

  19. Surface EMG signals based motion intent recognition using multi-layer ELM

    NASA Astrophysics Data System (ADS)

    Wang, Jianhui; Qi, Lin; Wang, Xiao

    2017-11-01

    The upper-limb rehabilitation robot is regarded as a useful tool to help patients with hemiplegia to do repetitive exercise. The surface electromyography (sEMG) contains motion information as the electric signals are generated and related to nerve-muscle motion. These sEMG signals, representing human intentions of active motions, are introduced into the rehabilitation robot system to recognize upper-limb movements. Traditionally, the feature extraction is an indispensable part of drawing significant information from original signals, which is a tedious task requiring rich and related experience. This paper employs a deep learning scheme to extract the internal features of the sEMG signals using an advanced Extreme Learning Machine based auto-encoder (ELM-AE). The mathematical information contained in the multi-layer structure of the ELM-AE is used as the high-level representation of the internal features of the sEMG signals, and thus a simple ELM can post-process the extracted features, formulating the entire multi-layer ELM (ML-ELM) algorithm. The method is employed for the sEMG based neural intentions recognition afterwards. The case studies show the adopted deep learning algorithm (ELM-AE) is capable of yielding higher classification accuracy compared to the Principal Component Analysis (PCA) scheme in 5 different types of upper-limb motions. This indicates the effectiveness and the learning capability of the ML-ELM in such motion intent recognition applications.

  20. Assessment of Cognitive Function in the Water Maze Task: Maximizing Data Collection and Analysis in Animal Models of Brain Injury.

    PubMed

    Whiting, Mark D; Kokiko-Cochran, Olga N

    2016-01-01

    Animal models play a critical role in understanding the biomechanical, pathophysiological, and behavioral consequences of traumatic brain injury (TBI). In preclinical studies, cognitive impairment induced by TBI is often assessed using the Morris water maze (MWM). Frequently described as a hippocampally dependent spatial navigation task, the MWM is a highly integrative behavioral task that requires intact functioning in numerous brain regions and involves an interdependent set of mnemonic and non-mnemonic processes. In this chapter, we review the special considerations involved in using the MWM in animal models of TBI, with an emphasis on maximizing the degree of information extracted from performance data. We include a theoretical framework for examining deficits in discrete stages of cognitive function and offer suggestions for how to make inferences regarding the specific nature of TBI-induced cognitive impairment. The ultimate goal is more precise modeling of the animal equivalents of the cognitive deficits seen in human TBI.

  1. Too little or too much? Parafoveal preview benefits and parafoveal load costs in dyslexic adults.

    PubMed

    Silva, Susana; Faísca, Luís; Araújo, Susana; Casaca, Luis; Carvalho, Loide; Petersson, Karl Magnus; Reis, Alexandra

    2016-07-01

    Two different forms of parafoveal dysfunction have been hypothesized as core deficits of dyslexic individuals: reduced parafoveal preview benefits ("too little parafovea") and increased costs of parafoveal load ("too much parafovea"). We tested both hypotheses in a single eye-tracking experiment using a modified serial rapid automatized naming (RAN) task. Comparisons between dyslexic and non-dyslexic adults showed reduced parafoveal preview benefits in dyslexics, without increased costs of parafoveal load. Reduced parafoveal preview benefits were observed in a naming task, but not in a silent letter-finding task, indicating that the parafoveal dysfunction may be consequent to the overload with extracting phonological information from orthographic input. Our results suggest that dyslexics' parafoveal dysfunction is not based on strict visuo-attentional factors, but nevertheless they stress the importance of extra-phonological processing. Furthermore, evidence of reduced parafoveal preview benefits in dyslexia may help understand why serial RAN is an important reading predictor in adulthood.

  2. Adapting existing natural language processing resources for cardiovascular risk factors identification in clinical notes.

    PubMed

    Khalifa, Abdulrahman; Meystre, Stéphane

    2015-12-01

    The 2014 i2b2 natural language processing shared task focused on identifying cardiovascular risk factors such as high blood pressure, high cholesterol levels, obesity and smoking status among other factors found in health records of diabetic patients. In addition, the task involved detecting medications, and time information associated with the extracted data. This paper presents the development and evaluation of a natural language processing (NLP) application conceived for this i2b2 shared task. For increased efficiency, the application's main components were adapted from two existing NLP tools implemented in the Apache UIMA framework: Textractor (for dictionary-based lookup) and cTAKES (for preprocessing and smoking status detection). The application achieved a final (micro-averaged) F1-measure of 87.5% on the final evaluation test set. Our attempt was mostly based on existing tools adapted with minimal changes and allowed for satisfying performance with limited development efforts. Copyright © 2015 Elsevier Inc. All rights reserved.
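
    For readers unfamiliar with the reported metric, a micro-averaged F1-measure pools true positives, false positives and false negatives over all classes before computing precision and recall. The sketch below shows the standard calculation; the class names and counts are hypothetical and are not taken from the paper.

```python
def micro_f1(counts):
    """Micro-averaged F1 from per-class (true_pos, false_pos, false_neg) counts."""
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical per-class counts: (true positives, false positives, false negatives).
example_counts = {"smoking": (8, 2, 2), "obesity": (2, 0, 0)}
print(micro_f1(example_counts))
```

    Because the counts are pooled, frequent classes dominate a micro-averaged score, unlike a macro average that weights every class equally.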

  3. Discriminative Nonlinear Analysis Operator Learning: When Cosparse Model Meets Image Classification.

    PubMed

    Wen, Zaidao; Hou, Biao; Jiao, Licheng

    2017-05-03

    The linear synthesis model based dictionary learning framework has achieved remarkable performance in image classification in the last decade. Behaving as a generative feature model, it however suffers from some intrinsic deficiencies. In this paper, we propose a novel parametric nonlinear analysis cosparse model (NACM) with which a unique feature vector will be much more efficiently extracted. Additionally, we derive a deep insight to demonstrate that NACM is capable of simultaneously learning the task adapted feature transformation and regularization to encode our preferences, domain prior knowledge and task oriented supervised information into the features. The proposed NACM is devoted to the classification task as a discriminative feature model and yields a novel discriminative nonlinear analysis operator learning framework (DNAOL). The theoretical analysis and experimental performances clearly demonstrate that DNAOL will not only achieve better or at least competitive classification accuracies than the state-of-the-art algorithms but it can also dramatically reduce the time complexities in both training and testing phases.

  4. Game theoretic approach for cooperative feature extraction in camera networks

    NASA Astrophysics Data System (ADS)

    Redondi, Alessandro E. C.; Baroffio, Luca; Cesana, Matteo; Tagliasacchi, Marco

    2016-07-01

    Visual sensor networks (VSNs) consist of several camera nodes with wireless communication capabilities that can perform visual analysis tasks such as object identification, recognition, and tracking. Often, VSN deployments result in many camera nodes with overlapping fields of view. In the past, such redundancy has been exploited in two different ways: (1) to improve the accuracy/quality of the visual analysis task by exploiting multiview information or (2) to reduce the energy consumed for performing the visual task, by applying temporal scheduling techniques among the cameras. We propose a game theoretic framework based on the Nash bargaining solution to bridge the gap between the two aforementioned approaches. The key tenet of the proposed framework is for cameras to reduce the consumed energy in the analysis process by exploiting the redundancy in the reciprocal fields of view. Experimental results in both simulated and real-life scenarios confirm that the proposed scheme is able to increase the network lifetime, with a negligible loss in terms of visual analysis accuracy.

  5. Linguistic feature analysis for protein interaction extraction

    PubMed Central

    2009-01-01

    Background: The rapid growth of the amount of publicly available reports on biomedical experimental results has recently caused a boost of text mining approaches for protein interaction extraction. Most approaches rely implicitly or explicitly on linguistic, i.e., lexical and syntactic, data extracted from text. However, only a few attempts have been made to evaluate the contribution of the different feature types. In this work, we contribute to this evaluation by studying the relative importance of deep syntactic features, i.e., grammatical relations, shallow syntactic features (part-of-speech information) and lexical features. For this purpose, we use a recently proposed approach that uses support vector machines with structured kernels. Results: Our results reveal that the contribution of the different feature types varies for the different data sets on which the experiments were conducted. The smaller the training corpus compared to the test data, the more important the role of grammatical relations becomes. Moreover, deep syntactic information based classifiers prove to be more robust on heterogeneous texts where no or only limited common vocabulary is shared. Conclusion: Our findings suggest that grammatical relations play an important role in the interaction extraction task. Moreover, the net advantage of adding lexical and shallow syntactic features is small relative to the number of added features. This implies that efficient classifiers can be built by using only a small fraction of the features that are typically being used in recent approaches. PMID:19909518

  6. Person Recognition System Based on a Combination of Body Images from Visible Light and Thermal Cameras

    PubMed Central

    Nguyen, Dat Tien; Hong, Hyung Gil; Kim, Ki Wan; Park, Kang Ryoung

    2017-01-01

    The human body contains identity information that can be used for the person recognition (verification/recognition) problem. In this paper, we propose a person recognition method using the information extracted from body images. Our research is novel in the following three ways compared to previous studies. First, we use the images of the human body for recognizing individuals. To overcome the limitations of previous studies on body-based person recognition that use only visible light images for recognition, we use human body images captured by two different kinds of camera, including a visible light camera and a thermal camera. The use of two different kinds of body image helps us to reduce the effects of noise, background, and variation in the appearance of a human body. Second, we apply a state-of-the-art method, called convolutional neural network (CNN) among various available methods, for image feature extraction in order to overcome the limitations of traditional hand-designed image feature extraction methods. Finally, with the extracted image features from body images, the recognition task is performed by measuring the distance between the input and enrolled samples. The experimental results show that the proposed method is efficient for enhancing recognition accuracy compared to systems that use only visible light or thermal images of the human body. PMID:28300783
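
    The last step, matching by distance between the input and enrolled samples, can be sketched as a nearest-neighbour search over feature vectors. The abstract does not name the distance measure or any acceptance threshold, so the Euclidean distance, the `max_distance` parameter and the two-dimensional example features below are assumptions made for illustration.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def identify(query, enrolled, max_distance):
    """Return the enrolled identity closest to the query, or None if too far.

    enrolled: dict mapping person id -> feature vector (e.g. CNN features).
    max_distance: hypothetical acceptance threshold.
    """
    best_id, best_distance = None, math.inf
    for person_id, features in enrolled.items():
        distance = euclidean(query, features)
        if distance < best_distance:
            best_id, best_distance = person_id, distance
    return best_id if best_distance <= max_distance else None

# Hypothetical enrolled feature vectors (real CNN features would be much longer).
enrolled_samples = {"person_A": [0.0, 0.0], "person_B": [3.0, 4.0]}
print(identify([0.1, 0.0], enrolled_samples, max_distance=1.0))
```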

  7. How does intentionality of encoding affect memory for episodic information?

    PubMed

    Craig, Michael; Butterworth, Karla; Nilsson, Jonna; Hamilton, Colin J; Gallagher, Peter; Smulders, Tom V

    2016-11-01

    Episodic memory enables the detailed and vivid recall of past events, including target and wider contextual information. In this paper, we investigated whether/how encoding intentionality affects the retention of target and contextual episodic information from a novel experience. Healthy adults performed (1) a What-Where-When (WWW) episodic memory task involving the hiding and delayed recall of a number of items (what) in different locations (where) in temporally distinct sessions (when) and (2) unexpected tests probing memory for wider contextual information from the WWW task. Critically, some participants were informed that memory for WWW information would be subsequently probed (intentional group), while this came as a surprise for others (incidental group). The probing of contextual information came as a surprise for all participants. Participants also performed several measures of episodic and nonepisodic cognition from which common episodic and nonepisodic factors were extracted. Memory for target (WWW) and contextual information was superior in the intentional group compared with the incidental group. Memory for target and contextual information was unrelated to factors of nonepisodic cognition, irrespective of encoding intentionality. In addition, memory for target information was unrelated to factors of episodic cognition. However, memory for wider contextual information was related to some factors of episodic cognition, and these relationships differed between the intentional and incidental groups. Our results lead us to propose the hypothesis that intentional encoding of episodic information increases the coherence of the representation of the context in which the episode took place. This hypothesis remains to be tested. © 2016 Craig et al.; Published by Cold Spring Harbor Laboratory Press.

  8. A New Multi-Spectral Threshold Normalized Difference Water Index (MST-NDWI) Water Extraction Method - A Case Study in Yanhe Watershed

    NASA Astrophysics Data System (ADS)

    Zhou, Y.; Zhao, H.; Hao, H.; Wang, C.

    2018-05-01

    Accurate remote sensing water extraction is one of the primary tasks of watershed ecological environment study. The Yanhe water system has the typical characteristics of a small water volume and narrow river channels, which makes water extraction difficult for conventional methods such as the Normalized Difference Water Index (NDWI). A new Multi-Spectral Threshold segmentation of the NDWI (MST-NDWI) water extraction method is proposed to achieve accurate water extraction in Yanhe watershed. In the MST-NDWI method, the spectral characteristics of water bodies and typical backgrounds on the Landsat/TM images have been evaluated in Yanhe watershed. The multi-spectral thresholds (TM1, TM4, TM5) based on maximum-likelihood have been utilized before NDWI water extraction to realize segmentation for a division of built-up lands and small linear rivers. With the proposed method, a water map is extracted from the Landsat/TM images in 2010 in China. An accuracy assessment is conducted to compare the proposed method with the conventional water indexes such as NDWI, Modified NDWI (MNDWI), Enhanced Water Index (EWI), and Automated Water Extraction Index (AWEI). The result shows that the MST-NDWI method generates better water extraction accuracy in Yanhe watershed and can effectively diminish the confusing background objects compared to the conventional water indexes. The MST-NDWI method integrates NDWI and Multi-Spectral Threshold segmentation algorithms, with richer valuable information and remarkable results in accurate water extraction in Yanhe watershed.
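
    As an illustration of the pipeline described above, the sketch below applies band thresholds before the standard NDWI, (Green - NIR) / (Green + NIR), which for Landsat TM uses bands TM2 and TM4. The abstract does not give the threshold values or the direction of the comparisons, so the rule "reject a pixel whose TM1, TM4 or TM5 value exceeds its threshold", the threshold numbers and the NDWI cutoff of 0 are all hypothetical.

```python
def ndwi(green, nir):
    """Normalized Difference Water Index: (green - nir) / (green + nir)."""
    total = green + nir
    return (green - nir) / total if total != 0 else 0.0

def is_water(pixel, thresholds, ndwi_cutoff=0.0):
    """Classify one pixel given band reflectances keyed 'TM1', 'TM2', 'TM4', 'TM5'.

    Step 1 (assumed form): multi-spectral pre-screening on TM1, TM4 and TM5.
    Step 2: NDWI on the green (TM2) and near-infrared (TM4) bands.
    """
    for band in ("TM1", "TM4", "TM5"):
        if pixel[band] > thresholds[band]:
            return False
    return ndwi(pixel["TM2"], pixel["TM4"]) > ndwi_cutoff

# Hypothetical thresholds and reflectance values.
band_thresholds = {"TM1": 0.15, "TM4": 0.10, "TM5": 0.08}
river_pixel = {"TM1": 0.06, "TM2": 0.08, "TM4": 0.03, "TM5": 0.01}
built_up_pixel = {"TM1": 0.20, "TM2": 0.22, "TM4": 0.18, "TM5": 0.20}

print(is_water(river_pixel, band_thresholds), is_water(built_up_pixel, band_thresholds))
```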

  9. ExaCT: automatic extraction of clinical trial characteristics from journal publications

    PubMed Central

    2010-01-01

    Background: Clinical trials are one of the most important sources of evidence for guiding evidence-based practice and the design of new trials. However, most of this information is available only in free text - e.g., in journal publications - which is labour intensive to process for systematic reviews, meta-analyses, and other evidence synthesis studies. This paper presents an automatic information extraction system, called ExaCT, that assists users with locating and extracting key trial characteristics (e.g., eligibility criteria, sample size, drug dosage, primary outcomes) from full-text journal articles reporting on randomized controlled trials (RCTs). Methods: ExaCT consists of two parts: an information extraction (IE) engine that searches the article for text fragments that best describe the trial characteristics, and a web browser-based user interface that allows human reviewers to assess and modify the suggested selections. The IE engine uses a statistical text classifier to locate those sentences that have the highest probability of describing a trial characteristic. Then, the IE engine's second stage applies simple rules to these sentences to extract text fragments containing the target answer. The same approach is used for all 21 trial characteristics selected for this study. Results: We evaluated ExaCT using 50 previously unseen articles describing RCTs. The text classifier (first stage) was able to recover 88% of relevant sentences among its top five candidates (top5 recall) with the topmost candidate being relevant in 80% of cases (top1 precision). Precision and recall of the extraction rules (second stage) were 93% and 91%, respectively. Together, the two stages of the extraction engine were able to provide (partially) correct solutions in 992 out of 1050 test tasks (94%), with a majority of these (696) representing fully correct and complete answers. Conclusions: Our experiments confirmed the applicability and efficacy of ExaCT. Furthermore, they demonstrated that combining a statistical method with 'weak' extraction rules can identify a variety of study characteristics. The system is flexible and can be extended to handle other characteristics and document types (e.g., study protocols). PMID:20920176
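
    The two-stage design (rank sentences first, then apply weak rules to the best candidates) can be sketched as below. This is not the ExaCT implementation: a keyword count stands in for the statistical text classifier, a single regular expression stands in for the extraction rules, and the cue words, pattern and example sentences are hypothetical. Only the top-five cutoff follows the abstract.

```python
import re

CUE_WORDS = {"randomized", "patients", "participants", "enrolled"}  # hypothetical

def score(sentence):
    """Stage 1 stand-in: count cue words instead of a trained classifier."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return sum(word in CUE_WORDS for word in words)

def top_candidates(sentences, k=5):
    """Keep the k highest-scoring sentences, best first."""
    return sorted(sentences, key=score, reverse=True)[:k]

# Stage 2 stand-in: one weak rule for the 'sample size' characteristic.
SAMPLE_SIZE_RULE = re.compile(r"(\d+)\s+(?:patients|participants)", re.IGNORECASE)

def extract_sample_size(sentences, k=5):
    """Apply the rule to the ranked candidates and return the first hit."""
    for sentence in top_candidates(sentences, k):
        match = SAMPLE_SIZE_RULE.search(sentence)
        if match:
            return int(match.group(1))
    return None

article = [
    "The trial was conducted in two centres.",
    "A total of 120 patients were enrolled and randomized.",
    "Outcomes were assessed at 12 weeks.",
]
print(extract_sample_size(article))
```

    Restricting the rules to a handful of ranked sentences is what allows them to stay weak: a loose pattern is less likely to misfire when it only sees sentences already judged relevant.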

  10. FamPlex: a resource for entity recognition and relationship resolution of human protein families and complexes in biomedical text mining.

    PubMed

    Bachman, John A; Gyori, Benjamin M; Sorger, Peter K

    2018-06-28

    For automated reading of scientific publications to extract useful information about molecular mechanisms it is critical that genes, proteins and other entities be correctly associated with uniform identifiers, a process known as named entity linking or "grounding." Correct grounding is essential for resolving relationships among mined information, curated interaction databases, and biological datasets. The accuracy of this process is largely dependent on the availability of machine-readable resources associating synonyms and abbreviations commonly found in biomedical literature with uniform identifiers. In a task involving automated reading of ∼215,000 articles using the REACH event extraction software we found that grounding was disproportionately inaccurate for multi-protein families (e.g., "AKT") and complexes with multiple subunits (e.g., "NF-κB"). To address this problem we constructed FamPlex, a manually curated resource defining protein families and complexes as they are commonly encountered in biomedical text. In FamPlex the gene-level constituents of families and complexes are defined in a flexible format allowing for multi-level, hierarchical membership. To create FamPlex, text strings corresponding to entities were identified empirically from literature and linked manually to uniform identifiers; these identifiers were also mapped to equivalent entries in multiple related databases. FamPlex also includes curated prefix and suffix patterns that improve named entity recognition and event extraction. Evaluation of REACH extractions on a test corpus of ∼54,000 articles showed that FamPlex significantly increased grounding accuracy for families and complexes (from 15 to 71%). The hierarchical organization of entities in FamPlex also made it possible to integrate otherwise unconnected mechanistic information across families, subfamilies, and individual proteins. Applications of FamPlex to the TRIPS/DRUM reading system and the Biocreative VI Bioentity Normalization Task dataset demonstrated the utility of FamPlex in other settings. FamPlex is an effective resource for improving named entity recognition, grounding, and relationship resolution in automated reading of biomedical text. The content in FamPlex is available in both tabular and Open Biomedical Ontology formats at https://github.com/sorgerlab/famplex under the Creative Commons CC0 license and has been integrated into the TRIPS/DRUM and REACH reading systems.
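
    The multi-level, hierarchical membership described in this record can be illustrated with a minimal sketch that expands a family or complex into its gene-level constituents. The table layout, the `expand` function, and the `EXAMPLE_COMPLEX` and `GENE_X` entries are hypothetical and do not reflect FamPlex's actual file format or API; the sketch also assumes the hierarchy contains no cycles.

```python
from typing import Dict, List, Set

# Hypothetical parent -> children table (NOT FamPlex's real format).
# EXAMPLE_COMPLEX and GENE_X are made-up placeholders for illustration.
MEMBERS: Dict[str, List[str]] = {
    "AKT": ["AKT1", "AKT2", "AKT3"],
    "EXAMPLE_COMPLEX": ["AKT", "GENE_X"],
}


def expand(entity: str, members: Dict[str, List[str]]) -> Set[str]:
    """Return the gene-level constituents of an entity.

    An entity with no entry in `members` is treated as a gene and
    returned as-is. Assumes the hierarchy is acyclic.
    """
    children = members.get(entity)
    if children is None:
        return {entity}
    genes: Set[str] = set()
    for child in children:
        # Recurse so that families nested inside complexes are expanded too.
        genes |= expand(child, members)
    return genes
```

    Calling `expand("EXAMPLE_COMPLEX", MEMBERS)` walks through the nested `AKT` family, which is the kind of lookup that lets information attached to a family be connected to its individual proteins.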

  11. Heavy metal extractable forms in sludge from wastewater treatment plants.

    PubMed

    Alvarez, E Alonso; Mochón, M Callejón; Jiménez Sánchez, J C; Ternero Rodríguez, M

    2002-05-01

    The analysis of heavy metals is a very important task to assess the potential environmental and health risk associated with the sludge coming from wastewater treatment plants (WWTPs). However, it is widely accepted that the determination of total elements does not give an accurate estimation of the potential environmental impact. So, it is necessary to apply sequential extraction techniques to obtain suitable information about their bioavailability or toxicity. In this paper, a sequential extraction scheme according to the BCR's guidelines was applied to sludge samples collected from each sludge treatment step of five municipal activated sludge plants. Al, Cd, Co, Cu, Cr, Fe, Mn, Hg, Mo, Ni, Pb, Ti and Zn were determined in the sludge extracts by inductively coupled plasma atomic emission spectrometry. In relation to current international legislation for the use of sludge for agricultural purposes, none of the metal concentrations exceeded maximum permitted levels. For most of the metal elements under consideration, results showed a clear rise along the sludge treatment in the proportion of the two less-available fractions (oxidizable metal and residual metal).

  12. Semantic segmentation of forest stands of pure species combining airborne lidar data and very high resolution multispectral imagery

    NASA Astrophysics Data System (ADS)

    Dechesne, Clément; Mallet, Clément; Le Bris, Arnaud; Gouet-Brunet, Valérie

    2017-04-01

    Forest stands are the basic units for forest inventory and mapping. Stands are defined as large forested areas (e.g., ⩾ 2 ha) of homogeneous tree species composition and age. Their accurate delineation is usually performed by human operators through visual analysis of very high resolution (VHR) infra-red images. This task is tedious, highly time consuming, and should be automated for scalability and efficient updating purposes. In this paper, a method based on the fusion of airborne lidar data and VHR multispectral images is proposed for the automatic delineation of forest stands containing one dominant species (purity superior to 75%). This is the key preliminary task for forest land-cover database update. The multispectral images give information about the tree species whereas 3D lidar point clouds provide geometric information on the trees and allow their individual extraction. Multi-modal features are computed, both at pixel and object levels: the objects are individual trees extracted from lidar data. A supervised classification is then performed at the object level in order to coarsely discriminate the existing tree species in each area of interest. The classification results are further processed to obtain homogeneous areas with smooth borders by employing an energy minimum framework, where additional constraints are joined to form the energy function. The experimental results show that the proposed method provides very satisfactory results both in terms of stand labeling and delineation (overall accuracy ranges between 84 % and 99 %).

  13. Automated rule-base creation via CLIPS-Induce

    NASA Technical Reports Server (NTRS)

    Murphy, Patrick M.

    1994-01-01

    Many CLIPS rule-bases contain one or more rule groups that perform classification. In this paper we describe CLIPS-Induce, an automated system for the creation of a CLIPS classification rule-base from a set of test cases. CLIPS-Induce consists of two components, a decision tree induction component and a CLIPS production extraction component. ID3, a popular decision tree induction algorithm, is used to induce a decision tree from the test cases. CLIPS production extraction is accomplished through a top-down traversal of the decision tree. Nodes of the tree are used to construct query rules, and branches of the tree are used to construct classification rules. The learned CLIPS productions may easily be incorporated into a large CLIPS system that performs tasks such as accessing a database or displaying information.
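
    The top-down traversal of a decision tree into classification rules can be sketched as follows. This is an illustration only: the tree encoding, the `extract_rules` function, the toy tree, and the plain-text "if ... then ..." output are assumptions, not CLIPS-Induce's actual data structures or CLIPS production syntax, and the separate query rules built from tree nodes are not modelled.

```python
from typing import Dict, List, Tuple, Union

# A tree is either a class label (leaf) or (attribute, {value: subtree}).
Tree = Union[str, Tuple[str, Dict[str, "Tree"]]]


def extract_rules(tree: Tree,
                  conditions: Tuple[Tuple[str, str], ...] = ()) -> List[str]:
    """Walk the tree top-down and emit one rule per root-to-leaf path."""
    if isinstance(tree, str):
        # Leaf: the tests collected along the path form the rule's condition.
        lhs = " and ".join(f"{a} = {v}" for a, v in conditions) or "true"
        return [f"if {lhs} then class = {tree}"]
    attribute, branches = tree
    rules: List[str] = []
    for value, subtree in branches.items():
        # Each branch adds one attribute test to the path condition.
        rules.extend(extract_rules(subtree, conditions + ((attribute, value),)))
    return rules


# Hypothetical toy tree, standing in for one induced by ID3.
toy_tree: Tree = ("outlook", {
    "sunny": ("humidity", {"high": "no", "normal": "yes"}),
    "overcast": "yes",
})

for rule in extract_rules(toy_tree):
    print(rule)
```

    Each printed rule corresponds to one path from the root to a leaf, so the number of rules equals the number of leaves in the tree.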

  14. All-paths graph kernel for protein-protein interaction extraction with evaluation of cross-corpus learning.

    PubMed

    Airola, Antti; Pyysalo, Sampo; Björne, Jari; Pahikkala, Tapio; Ginter, Filip; Salakoski, Tapio

    2008-11-19

    Automated extraction of protein-protein interactions (PPI) is an important and widely studied task in biomedical text mining. We propose a graph kernel based approach for this task. In contrast to earlier approaches to PPI extraction, the introduced all-paths graph kernel has the capability to make use of full, general dependency graphs representing the sentence structure. We evaluate the proposed method on five publicly available PPI corpora, providing the most comprehensive evaluation done for a machine learning based PPI-extraction system. We additionally perform a detailed evaluation of the effects of training and testing on different resources, providing insight into the challenges involved in applying a system beyond the data it was trained on. Our method is shown to achieve state-of-the-art performance with respect to comparable evaluations, with 56.4 F-score and 84.8 AUC on the AImed corpus. We show that the graph kernel approach performs on state-of-the-art level in PPI extraction, and note the possible extension to the task of extracting complex interactions. Cross-corpus results provide further insight into how the learning generalizes beyond individual corpora. Further, we identify several pitfalls that can make evaluations of PPI-extraction systems incomparable, or even invalid. These include incorrect cross-validation strategies and problems related to comparing F-score results achieved on different evaluation resources. Recommendations for avoiding these pitfalls are provided.

  15. Motor learning.

    PubMed

    Wolpert, Daniel M; Flanagan, J Randall

    2010-06-08

    Although learning a motor skill, such as a tennis stroke, feels like a unitary experience, researchers who study motor control and learning break the processes involved into a number of interacting components. These components can be organized into four main groups. First, skilled performance requires the effective and efficient gathering of sensory information, such as deciding where and when to direct one's gaze around the court, and thus an important component of skill acquisition involves learning how best to extract task-relevant information. Second, the performer must learn key features of the task such as the geometry and mechanics of the tennis racket and ball, the properties of the court surface, and how the wind affects the ball's flight. Third, the player needs to set up different classes of control that include predictive and reactive control mechanisms that generate appropriate motor commands to achieve the task goals, as well as compliance control that specifies, for example, the stiffness with which the arm holds the racket. Finally, the successful performer can learn higher-level skills such as anticipating and countering the opponent's strategy and making effective decisions about shot selection. In this Primer we shall consider these components of motor learning using as an example how we learn to play tennis.

  16. Extracting motor synergies from random movements for low-dimensional task-space control of musculoskeletal robots.

    PubMed

    Fu, Kin Chung Denny; Dalla Libera, Fabio; Ishiguro, Hiroshi

    2015-10-08

    In the field of human motor control, the motor synergy hypothesis explains how humans simplify body control dimensionality by coordinating groups of muscles, called motor synergies, instead of controlling muscles independently. In most applications of motor synergies to low-dimensional control in robotics, motor synergies are extracted from given optimal control signals. In this paper, we address the problems of how to extract motor synergies without optimal data given, and how to apply motor synergies to achieve low-dimensional task-space tracking control of a human-like robotic arm actuated by redundant muscles, without prior knowledge of the robot. We propose to extract motor synergies from a subset of randomly generated reaching-like movement data. The essence is to first approximate the corresponding optimal control signals, using estimations of the robot's forward dynamics, and to extract the motor synergies subsequently. In order to avoid modeling difficulties, a learning-based control approach is adopted such that control is accomplished via estimations of the robot's inverse dynamics. We present a kernel-based regression formulation to estimate the forward and the inverse dynamics, and a sliding controller in order to cope with estimation error. Numerical evaluations show that the proposed method enables extraction of motor synergies for low-dimensional task-space control.

  17. Robot Evolutionary Localization Based on Attentive Visual Short-Term Memory

    PubMed Central

    Vega, Julio; Perdices, Eduardo; Cañas, José M.

    2013-01-01

    Cameras are one of the most relevant sensors in autonomous robots. However, two of their challenges are to extract useful information from captured images, and to manage the small field of view of regular cameras. This paper proposes implementing a dynamic visual memory to store the information gathered from a moving camera on board a robot, followed by an attention system to choose where to look with this mobile camera, and a visual localization algorithm that incorporates this visual memory. The visual memory is a collection of relevant task-oriented objects and 3D segments, and its scope is wider than the current camera field of view. The attention module takes into account the need to reobserve objects in the visual memory and the need to explore new areas. The visual memory is useful also in localization tasks, as it provides more information about robot surroundings than the current instantaneous image. This visual system is intended as underlying technology for service robot applications in real people's homes. Several experiments have been carried out, both with simulated and real Pioneer and Nao robots, to validate the system and each of its components in office scenarios. PMID:23337333

  18. A ganglion-cell-based primary image representation method and its contribution to object recognition

    NASA Astrophysics Data System (ADS)

    Wei, Hui; Dai, Zhi-Long; Zuo, Qing-Song

    2016-10-01

    A visual stimulus is represented by the biological visual system at several levels: in the order from low to high levels, they are: photoreceptor cells, ganglion cells (GCs), lateral geniculate nucleus cells and visual cortical neurons. Retinal GCs at the early level need to represent raw data only once, but meet a wide number of diverse requests from different vision-based tasks. This means the information representation at this level is general and not task-specific. Neurobiological findings have attributed this universal adaptation to GCs' receptive field (RF) mechanisms. For the purposes of developing a highly efficient image representation method that can facilitate information processing and interpretation at later stages, here we design a computational model to simulate the GC's non-classical RF. This new image representation method can extract major structural features from raw data, and is consistent with other statistical measures of the image. Based on the new representation, the performances of other state-of-the-art algorithms in contour detection and segmentation can be upgraded remarkably. This work concludes that applying a sophisticated representation schema at an early stage is an efficient and promising strategy in visual information processing.

  19. Classification of visual and linguistic tasks using eye-movement features.

    PubMed

    Coco, Moreno I; Keller, Frank

    2014-03-07

    The role of the task has received special attention in visual-cognition research because it can provide causal explanations of goal-directed eye-movement responses. The dependency between visual attention and task suggests that eye movements can be used to classify the task being performed. A recent study by Greene, Liu, and Wolfe (2012), however, fails to achieve accurate classification of visual tasks based on eye-movement features. In the present study, we hypothesize that tasks can be successfully classified when they differ with respect to the involvement of other cognitive domains, such as language processing. We extract the eye-movement features used by Greene et al. as well as additional features from the data of three different tasks: visual search, object naming, and scene description. First, we demonstrated that eye-movement responses make it possible to characterize the goals of these tasks. Then, we trained three different types of classifiers and predicted the task participants performed with an accuracy well above chance (a maximum of 88% for visual search). An analysis of the relative importance of features for classification accuracy reveals that just one feature, i.e., initiation time, is sufficient for above-chance performance (a maximum of 79% accuracy in object naming). Crucially, this feature is independent of task duration, which differs systematically across the three tasks we investigated. Overall, the best task classification performance was obtained with a set of seven features that included both spatial information (e.g., entropy of attention allocation) and temporal components (e.g., total fixation on objects) of the eye-movement record. This result confirms the task-dependent allocation of visual attention and extends previous work by showing that task classification is possible when tasks differ in the cognitive processes involved (purely visual tasks such as search vs. communicative tasks such as scene description).

  20. An Efficient Method for Automatic Road Extraction Based on Multiple Features from LiDAR Data

    NASA Astrophysics Data System (ADS)

    Li, Y.; Hu, X.; Guan, H.; Liu, P.

    2016-06-01

    Road extraction in urban areas is a difficult task due to the complicated patterns and many contextual objects. LiDAR data directly provides three dimensional (3D) points with fewer occlusions and smaller shadows. The elevation information and surface roughness are distinguishing features to separate roads. However, LiDAR data has some disadvantages that are not beneficial to object extraction, such as the irregular distribution of point clouds and lack of clear edges of roads. For these problems, this paper proposes an automatic road centerline extraction method which has three major steps: (1) road center point detection based on multiple feature spatial clustering for separating road points from ground points, (2) local principal component analysis with least squares fitting for extracting the primitives of road centerlines, and (3) hierarchical grouping for connecting primitives into a complete road network. Compared with MTH (consisting of Mean shift algorithm, Tensor voting, and Hough transform) proposed in our previous article, this method greatly reduced the computational cost. To evaluate the proposed method, the Vaihingen data set, a benchmark testing data set provided by ISPRS for the "Urban Classification and 3D Building Reconstruction" project, was selected. The experimental results show that our method achieves the same performance in less time in road extraction using LiDAR data.

  1. A globally convergent MC algorithm with an adaptive learning rate.

    PubMed

    Peng, Dezhong; Yi, Zhang; Xiang, Yong; Zhang, Haixian

    2012-02-01

    This brief deals with the problem of minor component analysis (MCA). Artificial neural networks can be exploited to achieve the task of MCA. Recent research works show that convergence of neural networks based MCA algorithms can be guaranteed if the learning rates are less than certain thresholds. However, the computation of these thresholds needs information about the eigenvalues of the autocorrelation matrix of data set, which is unavailable in online extraction of minor component from input data stream. In this correspondence, we introduce an adaptive learning rate into the OJAn MCA algorithm, such that its convergence condition does not depend on any unobtainable information, and can be easily satisfied in practical applications.

  2. Multi-source feature extraction and target recognition in wireless sensor networks based on adaptive distributed wavelet compression algorithms

    NASA Astrophysics Data System (ADS)

    Hortos, William S.

    2008-04-01

    Proposed distributed wavelet-based algorithms are a means to compress sensor data received at the nodes forming a wireless sensor network (WSN) by exchanging information between neighboring sensor nodes. Local collaboration among nodes compacts the measurements, yielding a reduced fused set with equivalent information at far fewer nodes. Nodes may be equipped with multiple sensor types, each capable of sensing distinct phenomena: thermal, humidity, chemical, voltage, or image signals with low or no frequency content as well as audio, seismic or video signals within defined frequency ranges. Compression of the multi-source data through wavelet-based methods, distributed at active nodes, reduces downstream processing and storage requirements along the paths to sink nodes; it also enables noise suppression and more energy-efficient query routing within the WSN. Targets are first detected by the multiple sensors; then wavelet compression and data fusion are applied to the target returns, followed by feature extraction from the reduced data; feature data are input to target recognition/classification routines; targets are tracked during their sojourns through the area monitored by the WSN. Algorithms to perform these tasks are implemented in a distributed manner, based on a partition of the WSN into clusters of nodes. In this work, a scheme of collaborative processing is applied for hierarchical data aggregation and decorrelation, based on the sensor data itself and any redundant information, enabled by a distributed, in-cluster wavelet transform with lifting that allows multiple levels of resolution. The wavelet-based compression algorithm significantly decreases RF bandwidth and other resource use in target processing tasks. Following wavelet compression, features are extracted. The objective of feature extraction is to maximize the probabilities of correct target classification based on multi-source sensor measurements, while minimizing the resource expenditures at participating nodes. Therefore, the feature-extraction method based on the Haar DWT is presented that employs a maximum-entropy measure to determine significant wavelet coefficients. Features are formed by calculating the energy of coefficients grouped around the competing clusters. A DWT-based feature extraction algorithm used for vehicle classification in WSNs can be enhanced by an added rule for selecting the optimal number of resolution levels to improve the correct classification rate and reduce energy consumption expended in local algorithm computations. Published field trial data for vehicular ground targets, measured with multiple sensor types, are used to evaluate the wavelet-assisted algorithms. Extracted features are used in established target recognition routines, e.g., the Bayesian minimum-error-rate classifier, to compare the effects on the classification performance of the wavelet compression. Simulations of feature sets and recognition routines at different resolution levels in target scenarios indicate the impact on classification rates, while formulas are provided to estimate reduction in resource use due to distributed compression.
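
    As a rough illustration of energy features computed from a Haar DWT, the sketch below decomposes a one-dimensional signal over several resolution levels and returns the energy of the coefficients at each level. It is a simplification under stated assumptions: the function names and the sample signal are hypothetical, and the maximum-entropy selection of significant coefficients and the grouping around clusters described in this record are not reproduced.

```python
import math
from typing import List, Sequence, Tuple


def haar_step(values: Sequence[float]) -> Tuple[List[float], List[float]]:
    """One level of the orthonormal Haar transform on an even-length input."""
    s = math.sqrt(2.0)
    approx = [(values[i] + values[i + 1]) / s for i in range(0, len(values), 2)]
    detail = [(values[i] - values[i + 1]) / s for i in range(0, len(values), 2)]
    return approx, detail


def haar_energy_features(signal: Sequence[float], levels: int) -> List[float]:
    """Return [detail energy per level ..., final approximation energy]."""
    if levels < 1 or len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be a multiple of 2**levels")
    approx = list(signal)
    features: List[float] = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        # Energy of a coefficient group is the sum of squared coefficients.
        features.append(sum(d * d for d in detail))
    features.append(sum(a * a for a in approx))
    return features


# Hypothetical sample signal, for illustration only.
features = haar_energy_features([4, 6, 10, 12, 8, 6, 5, 5], levels=3)
print(features)
```

    Because this Haar transform is orthonormal, the returned energies sum to the energy of the input signal, which gives a quick sanity check when varying the number of resolution levels.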

  3. Task-induced frequency modulation features for brain-computer interfacing.

    PubMed

    Jayaram, Vinay; Hohmann, Matthias; Just, Jennifer; Schölkopf, Bernhard; Grosse-Wentrup, Moritz

    2017-10-01

    Task-induced amplitude modulation of neural oscillations is routinely used in brain-computer interfaces (BCIs) for decoding subjects' intents, and underlies some of the most robust and common methods in the field, such as common spatial patterns and Riemannian geometry. While there has been some interest in phase-related features for classification, both techniques usually presuppose that the frequencies of neural oscillations remain stable across various tasks. We investigate here whether features based on task-induced modulation of the frequency of neural oscillations enable decoding of subjects' intents with an accuracy comparable to task-induced amplitude modulation. We compare cross-validated classification accuracies using the amplitude and frequency modulated features, as well as a joint feature space, across subjects in various paradigms and pre-processing conditions. We show results with a motor imagery task, a cognitive task, and also preliminary results in patients with amyotrophic lateral sclerosis (ALS), as well as using common spatial patterns and Laplacian filtering. The frequency features alone do not significantly out-perform traditional amplitude modulation features, and in some cases perform significantly worse. However, across both tasks and pre-processing in healthy subjects the joint space significantly out-performs either the frequency or amplitude features alone. The only exception is the ALS patients, for whom the dataset is of insufficient size to draw any statistically significant conclusions. Task-induced frequency modulation is robust and straightforward to compute, and increases performance when added to standard amplitude modulation features across paradigms. This allows more information to be extracted from the EEG signal cheaply and can be used throughout the field of BCIs.

  4. Visual search in scenes involves selective and non-selective pathways

    PubMed Central

    Wolfe, Jeremy M; Vo, Melissa L-H; Evans, Karla K; Greene, Michelle R

    2010-01-01

    How do we find objects in scenes? For decades, visual search models have been built on experiments in which observers search for targets, presented among distractor items, isolated and randomly arranged on blank backgrounds. Are these models relevant to search in continuous scenes? This paper argues that the mechanisms that govern artificial, laboratory search tasks do play a role in visual search in scenes. However, scene-based information is used to guide search in ways that had no place in earlier models. Search in scenes may be best explained by a dual-path model: A “selective” path in which candidate objects must be individually selected for recognition and a “non-selective” path in which information can be extracted from global / statistical information. PMID:21227734

  5. The Cadmio XML healthcare record.

    PubMed

    Barbera, Francesco; Ferri, Fernando; Ricci, Fabrizio L; Sottile, Pier Angelo

    2002-01-01

    The management of clinical data is a complex task. Patient related information reported in patient folders is a set of heterogeneous and structured data accessed by different users having different goals (in local or geographical networks). XML language provides a mechanism for describing, manipulating, and visualising structured data in web-based applications. XML ensures that the structured data is managed in a uniform and transparent manner independently from the applications and their providers guaranteeing some interoperability. Extracting data from the healthcare record and structuring them according to XML makes the data available through browsers. The MIC/MIE model (Medical Information Category/Medical Information Elements), which allows the definition and management of healthcare records and used in CADMIO, a HISA based project, is described in this paper, using XML for allowing the data to be visualised through web browsers.

  6. Automatic Authorship Detection Using Textual Patterns Extracted from Integrated Syntactic Graphs

    PubMed Central

    Gómez-Adorno, Helena; Sidorov, Grigori; Pinto, David; Vilariño, Darnes; Gelbukh, Alexander

    2016-01-01

    We apply the integrated syntactic graph feature extraction methodology to the task of automatic authorship detection. This graph-based representation allows integrating different levels of language description into a single structure. We extract textual patterns based on features obtained from shortest path walks over integrated syntactic graphs and apply them to determine the authors of documents. On average, our method outperforms the state of the art approaches and gives consistently high results across different corpora, unlike existing methods. Our results show that our textual patterns are useful for the task of authorship attribution. PMID:27589740

  7. HIGH-PRECISION BIOLOGICAL EVENT EXTRACTION: EFFECTS OF SYSTEM AND OF DATA

    PubMed Central

    Cohen, K. Bretonnel; Verspoor, Karin; Johnson, Helen L.; Roeder, Chris; Ogren, Philip V.; Baumgartner, William A.; White, Elizabeth; Tipney, Hannah; Hunter, Lawrence

    2013-01-01

    We approached the problems of event detection, argument identification, and negation and speculation detection in the BioNLP’09 information extraction challenge through concept recognition and analysis. Our methodology involved using the OpenDMAP semantic parser with manually written rules. The original OpenDMAP system was updated for this challenge with a broad ontology defined for the events of interest, new linguistic patterns for those events, and specialized coordination handling. We achieved state-of-the-art precision for two of the three tasks, scoring the highest of 24 teams at precision of 71.81 on Task 1 and the highest of 6 teams at precision of 70.97 on Task 2. We provide a detailed analysis of the training data and show that a number of trigger words were ambiguous as to event type, even when their arguments are constrained by semantic class. The data is also shown to have a number of missing annotations. Analysis of a sampling of the comparatively small number of false positives returned by our system shows that major causes of this type of error were failing to recognize second themes in two-theme events, failing to recognize events when they were the arguments to other events, failure to recognize nontheme arguments, and sentence segmentation errors. We show that specifically handling coordination had a small but important impact on the overall performance of the system. The OpenDMAP system and the rule set are available at http://bionlp.sourceforge.net. PMID:25937701

  8. Spectral feature extraction of EEG signals and pattern recognition during mental tasks of 2-D cursor movements for BCI using SVM and ANN.

    PubMed

    Bascil, M Serdar; Tesneli, Ahmet Y; Temurtas, Feyzullah

    2016-09-01

    Brain computer interface (BCI) is a new means of communication between man and machine. It identifies mental task patterns stored in the electroencephalogram (EEG). That is, it extracts brain electrical activities recorded by EEG and transforms them into machine control commands. The main goal of BCI is to make assistive environmental devices, such as computers, available to paralyzed people and to make their lives easier. This study deals with feature extraction and mental task pattern recognition on 2-D cursor control from EEG as an offline analysis approach. The hemispherical power density changes are computed and compared on alpha-beta frequency bands with only mental imagination of cursor movements. First, power spectral density (PSD) features of EEG signals are extracted and the high dimensional data are reduced by principal component analysis (PCA) and independent component analysis (ICA), which are statistical algorithms. In the last stage, all features are classified with two types of support vector machine (SVM), linear and least squares (LS-SVM), and three different artificial neural network (ANN) structures, learning vector quantization (LVQ), multilayer neural network (MLNN) and probabilistic neural network (PNN), and mental task patterns are successfully identified via the k-fold cross validation technique.

  9. Correction of Visual Perception Based on Neuro-Fuzzy Learning for the Humanoid Robot TEO.

    PubMed

    Hernandez-Vicen, Juan; Martinez, Santiago; Garcia-Haro, Juan Miguel; Balaguer, Carlos

    2018-03-25

    New applications related to robotic manipulation or transportation tasks, with or without physical grasping, are continuously being developed. To perform these activities, the robot takes advantage of different kinds of perceptions. One of the key perceptions in robotics is vision. However, some problems related to image processing make the application of visual information within robot control algorithms difficult. Camera-based systems have inherent errors that affect the quality and reliability of the information obtained. The need to correct image distortion slows down image parameter computing, which decreases performance of control algorithms. In this paper, a new approach to correcting several sources of visual distortions on images in only one computing step is proposed. The goal of this system/algorithm is the computation of the tilt angle of an object transported by a robot, minimizing image inherent errors and increasing computing speed. After capturing the image, the computer system extracts the angle using a Fuzzy filter that corrects at the same time all possible distortions, obtaining the real angle in only one processing step. This filter has been developed by means of Neuro-Fuzzy learning techniques, using datasets with information obtained from real experiments. In this way, the computing time has been decreased and the performance of the application has been improved. The resulting algorithm has been tried out experimentally in robot transportation tasks in the humanoid robot TEO (Task Environment Operator) from the University Carlos III of Madrid.

  10. Correction of Visual Perception Based on Neuro-Fuzzy Learning for the Humanoid Robot TEO

    PubMed Central

    2018-01-01

    New applications related to robotic manipulation or transportation tasks, with or without physical grasping, are continuously being developed. To perform these activities, the robot takes advantage of different kinds of perceptions. One of the key perceptions in robotics is vision. However, some problems related to image processing make the application of visual information within robot control algorithms difficult. Camera-based systems have inherent errors that affect the quality and reliability of the information obtained. The need to correct image distortion slows down image parameter computing, which decreases performance of control algorithms. In this paper, a new approach to correcting several sources of visual distortions on images in only one computing step is proposed. The goal of this system/algorithm is the computation of the tilt angle of an object transported by a robot, minimizing image inherent errors and increasing computing speed. After capturing the image, the computer system extracts the angle using a Fuzzy filter that corrects at the same time all possible distortions, obtaining the real angle in only one processing step. This filter has been developed by means of Neuro-Fuzzy learning techniques, using datasets with information obtained from real experiments. In this way, the computing time has been decreased and the performance of the application has been improved. The resulting algorithm has been tried out experimentally in robot transportation tasks in the humanoid robot TEO (Task Environment Operator) from the University Carlos III of Madrid. PMID:29587392

  11. Lung lobe segmentation based on statistical atlas and graph cuts

    NASA Astrophysics Data System (ADS)

    Nimura, Yukitaka; Kitasaka, Takayuki; Honma, Hirotoshi; Takabatake, Hirotsugu; Mori, Masaki; Natori, Hiroshi; Mori, Kensaku

    2012-03-01

    This paper presents a novel method that can extract lung lobes by utilizing a probability atlas and multilabel graph cuts. Information about pulmonary structures plays a very important role in deciding the treatment strategy and in surgical planning. The human lungs are divided into five anatomical regions, the lung lobes. Precise segmentation and recognition of lung lobes are indispensable tasks in computer aided diagnosis systems and computer aided surgery systems. Many methods for lung lobe segmentation have been proposed. However, these methods only target normal cases and therefore cannot extract the lung lobes in abnormal cases, such as COPD cases. To extract lung lobes in abnormal cases, this paper proposes a lung lobe segmentation method based on a probability atlas of lobe location and multilabel graph cuts. The process consists of three components: normalization based on the patient's physique, probability atlas generation, and segmentation based on graph cuts. We apply this method to six cases of chest CT images including COPD cases. The Jaccard index was 79.1%.
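
    The Jaccard index reported above is the ratio of the overlap to the union of a predicted region and a reference region. The following is a minimal sketch of that definition only; the voxel sets and the name `jaccard_index` are hypothetical, and the abstract does not say how the score was aggregated over lobes or cases.

```python
def jaccard_index(predicted, reference):
    """Jaccard index |A & B| / |A | B| of two collections of voxel coordinates."""
    predicted, reference = set(predicted), set(reference)
    union = predicted | reference
    if not union:
        # Assumed convention: two empty regions count as full agreement.
        return 1.0
    return len(predicted & reference) / len(union)


# Hypothetical toy regions given as (z, y, x) voxel coordinates.
predicted_lobe = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)}
reference_lobe = {(0, 0, 1), (0, 1, 0), (0, 1, 1), (1, 1, 1)}

score = jaccard_index(predicted_lobe, reference_lobe)  # 3 shared voxels out of 5
```

    Representing a region as a set of coordinates keeps the sketch free of array libraries; a dense binary mask would give the same value.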

  12. Joint Feature Extraction and Classifier Design for ECG-Based Biometric Recognition.

    PubMed

    Gutta, Sandeep; Cheng, Qi

    2016-03-01

    Traditional biometric recognition systems often utilize physiological traits such as fingerprint, face, iris, etc. Recent years have seen a growing interest in electrocardiogram (ECG)-based biometric recognition techniques, especially in the field of clinical medicine. In existing ECG-based biometric recognition methods, feature extraction and classifier design are usually performed separately. In this paper, a multitask learning approach is proposed, in which feature extraction and classifier design are carried out simultaneously. Weights are assigned to the features within the kernel of each task. We decompose the matrix consisting of all the feature weights into sparse and low-rank components. The sparse component determines the features that are relevant to identify each individual, and the low-rank component determines the common feature subspace that is relevant to identify all the subjects. A fast optimization algorithm is developed, which requires only the first-order information. The performance of the proposed approach is demonstrated through experiments using the MIT-BIH Normal Sinus Rhythm database.

  13. Motion generation of robotic surgical tasks: learning from expert demonstrations.

    PubMed

    Reiley, Carol E; Plaku, Erion; Hager, Gregory D

    2010-01-01

    Robotic surgical assistants offer the possibility of automating portions of a task that are time consuming and tedious in order to reduce the cognitive workload of a surgeon. This paper proposes using programming by demonstration to build generative models and generate smooth trajectories that capture the underlying structure of the motion data recorded from expert demonstrations. Specifically, motion data from Intuitive Surgical's da Vinci Surgical System of a panel of expert surgeons performing three surgical tasks are recorded. The trials are decomposed into subtasks or surgemes, which are then temporally aligned through dynamic time warping. Next, a Gaussian Mixture Model (GMM) encodes the experts' underlying motion structure. Gaussian Mixture Regression (GMR) is then used to extract a smooth reference trajectory to reproduce a trajectory of the task. The approach is evaluated through an automated skill assessment measurement. Results suggest that this paper presents a means to (i) extract important features of the task, (ii) create a metric to evaluate robot imitative performance, and (iii) generate smoother trajectories for reproduction of three common medical tasks.
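
    The temporal alignment step mentioned above, dynamic time warping, can be illustrated with the classic dynamic-programming recurrence. This is a generic textbook sketch on one-dimensional sequences, not the authors' implementation; the function name `dtw_distance` and the absolute-difference local cost are assumptions.

```python
def dtw_distance(a, b):
    """Cumulative cost of the best monotonic alignment between sequences a and b."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of the first i samples of a with the first j of b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])  # assumed local cost: absolute difference
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances while b stays on the same sample
                cost[i][j - 1],      # b advances while a stays on the same sample
                cost[i - 1][j - 1],  # both advance together
            )
    return cost[n][m]


# Two hypothetical recordings of the same motion performed at different speeds.
fast_trial = [1.0, 2.0, 3.0]
slow_trial = [1.0, 2.0, 2.0, 3.0]
distance = dtw_distance(fast_trial, slow_trial)
```

    Because the slower trial merely repeats a sample, the warped distance is zero, whereas a sample-by-sample comparison would not even be defined for sequences of different length.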

  14. Mapping detailed 3D information onto high resolution SAR signatures

    NASA Astrophysics Data System (ADS)

    Anglberger, H.; Speck, R.

    2017-05-01

    Due to challenges in the visual interpretation of radar signatures or in the subsequent information extraction, a fusion with other data sources can be beneficial. The most accurate basis for a fusion of any kind of remote sensing data is the mapping of the acquired 2D image space onto the true 3D geometry of the scenery. In the case of radar images this is a challenging task because the coordinate system is based on the measured range, which causes ambiguous regions due to layover effects. This paper describes a method that accurately maps the detailed 3D information of a scene to the slant-range-based coordinate system of imaging radars. Due to this mapping, all the contributing geometrical parts of one resolution cell can be determined in 3D space. The proposed method is highly efficient, because computationally expensive operations can be directly performed on graphics card hardware. The described approach builds a perfect basis for sophisticated methods to extract data from multiple complementary sensors, such as radar and optical images, especially because true 3D information from whole cities will be available in the near future. The performance of the developed methods will be demonstrated with high resolution radar data acquired by the space-borne SAR-sensor TerraSAR-X.

  15. Enhancing interpretability of automatically extracted machine learning features: application to a RBM-Random Forest system on brain lesion segmentation.

    PubMed

    Pereira, Sérgio; Meier, Raphael; McKinley, Richard; Wiest, Roland; Alves, Victor; Silva, Carlos A; Reyes, Mauricio

    2018-02-01

    Machine learning systems are achieving better performances at the cost of becoming increasingly complex. However, because of that, they become less interpretable, which may cause some distrust by the end-user of the system. This is especially important as these systems are pervasively being introduced to critical domains, such as the medical field. Representation Learning techniques are general methods for automatic feature computation. Nevertheless, these techniques are regarded as uninterpretable "black boxes". In this paper, we propose a methodology to enhance the interpretability of automatically extracted machine learning features. The proposed system is composed of a Restricted Boltzmann Machine for unsupervised feature learning, and a Random Forest classifier, which are combined to jointly consider existing correlations between imaging data, features, and target variables. We define two levels of interpretation: global and local. The former is devoted to understanding if the system learned the relevant relations in the data correctly, while the latter is focused on predictions performed on a voxel- and patient-level. In addition, we propose a novel feature importance strategy that considers both imaging data and target variables, and we demonstrate the ability of the approach to leverage the interpretability of the obtained representation for the task at hand. We evaluated the proposed methodology in brain tumor segmentation and penumbra estimation in ischemic stroke lesions. We show the ability of the proposed methodology to unveil information regarding relationships between imaging modalities and extracted features and their usefulness for the task at hand. In both clinical scenarios, we demonstrate that the proposed methodology enhances the interpretability of automatically learned features, highlighting specific learning patterns that resemble how an expert extracts relevant data from medical images. Copyright © 2017 Elsevier B.V. All rights reserved.

  16. Dependency-based long short term memory network for drug-drug interaction extraction.

    PubMed

    Wang, Wei; Yang, Xi; Yang, Canqun; Guo, Xiaowei; Zhang, Xiang; Wu, Chengkun

    2017-12-28

    Drug-drug interaction extraction (DDI) needs assistance from automated methods to address the explosively increasing biomedical texts. In recent years, deep neural network based models have been developed to address such needs and they have made significant progress in relation identification. We propose a dependency-based deep neural network model for DDI extraction. By introducing the dependency-based technique to a bi-directional long short term memory network (Bi-LSTM), we build three channels, namely, Linear channel, DFS channel and BFS channel. All of these channels are constructed with three network layers, including embedding layer, LSTM layer and max pooling layer from bottom up. In the embedding layer, we extract two types of features: a distance-based feature and a dependency-based feature. In the LSTM layer, a Bi-LSTM is instituted in each channel to better capture relation information. Then max pooling is used to get optimal features from the entire encoding sequential data. Finally, we concatenate the outputs of all channels and then link them to the softmax layer for relation identification. To the best of our knowledge, our model achieves new state-of-the-art performance with an F-score of 72.0% on the DDIExtraction 2013 corpus. Moreover, our approach obtains a much higher Recall value compared to the existing methods. The dependency-based Bi-LSTM model can learn effective relation information with less feature engineering in the task of DDI extraction. Besides, the experimental results show that our model excels at balancing the Precision and Recall values.
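
    The F-score, Precision and Recall values discussed above are related by a simple formula. The sketch below assumes the balanced F1 variant (the harmonic mean of precision and recall); the counts are invented for illustration and are not taken from the DDIExtraction 2013 evaluation.

```python
def precision_recall_f1(true_positives, false_positives, false_negatives):
    """Return (precision, recall, F1) computed from raw extraction counts."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2.0 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1


# Hypothetical counts: 8 correct relations, 2 spurious ones, 4 missed ones.
precision, recall, f1 = precision_recall_f1(8, 2, 4)
```

    Because the harmonic mean is pulled toward the smaller of its two inputs, a system can only reach a high F-score by keeping precision and recall balanced, which is the property the abstract highlights.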

  17. Diagnosis of multiple sclerosis from EEG signals using nonlinear methods.

    PubMed

    Torabi, Ali; Daliri, Mohammad Reza; Sabzposhan, Seyyed Hojjat

    2017-12-01

    EEG signals carry essential information about the brain and neural diseases. The main purpose of this study is classifying two groups of healthy volunteers and Multiple Sclerosis (MS) patients using nonlinear features of EEG signals while performing cognitive tasks. EEG signals were recorded when users were doing two different attentional tasks. One of the tasks was based on detecting a desired change in color luminance and the other task was based on detecting a desired change in direction of motion. EEG signals were analyzed in two ways: EEG signals analysis without rhythms decomposition and EEG sub-bands analysis. After recording and preprocessing, the time delay embedding method was used for state space reconstruction; embedding parameters were determined for original signals and their sub-bands. Afterwards, nonlinear methods were used in the feature extraction phase. To reduce the feature dimension, scalar feature selections were done by using T-test and Bhattacharyya criteria. Then, the data were classified using linear support vector machines (SVM) and the k-nearest neighbor (KNN) method. The best combination of the criteria and classifiers was determined for each task by comparing performances. For both tasks, the best results were achieved by using the T-test criterion and SVM classifier. For the direction-based and the color-luminance-based tasks, maximum classification performances were 93.08 and 79.79% respectively, which were reached by using an optimal set of features. Our results show that the nonlinear dynamic features of EEG signals seem to be useful and effective in MS disease diagnosis.
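
    The time delay embedding used above for state space reconstruction turns a scalar signal into a sequence of vectors built from delayed samples. A minimal sketch follows; the function name `delay_embed` and the parameter values are hypothetical, since the abstract does not report the embedding dimension or delay that were selected.

```python
def delay_embed(signal, dimension, delay):
    """Return the vectors (x[i], x[i + delay], ..., x[i + (dimension - 1) * delay])."""
    n_vectors = len(signal) - (dimension - 1) * delay
    if n_vectors <= 0:
        raise ValueError("signal too short for this dimension and delay")
    return [
        tuple(signal[i + k * delay] for k in range(dimension))
        for i in range(n_vectors)
    ]


# Hypothetical six-sample signal embedded in 3 dimensions with a delay of 2 samples.
samples = [0, 1, 2, 3, 4, 5]
state_vectors = delay_embed(samples, dimension=3, delay=2)
```

    Nonlinear features are then computed on the reconstructed vectors rather than on the raw samples.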

  18. Mapping and monitoring renewable resources with space SAR

    NASA Technical Reports Server (NTRS)

    Ulaby, F. T.; Brisco, B.; Dobson, M. C.; Moezzi, S.

    1983-01-01

    The SEASAT-A SAR and SIR-A imagery was examined to evaluate the quality and type of information that can be extracted and used to monitor renewable resources on Earth. Two tasks were carried out: (1) a land cover classification study which utilized two sets of imagery acquired by the SEASAT-A SAR, one set by SIR-A, and one LANDSAT set (4 bands); and (2) a change detection study to examine differences between pairs of SEASAT-A SAR images and relate them to hydrologic and/or agronomic variations in the scene.

  19. Automated extraction and semantic analysis of mutation impacts from the biomedical literature

    PubMed Central

    2012-01-01

    Background Mutations as sources of evolution have long been the focus of attention in the biomedical literature. Accessing the mutational information and their impacts on protein properties facilitates research in various domains, such as enzymology and pharmacology. However, manually curating the rich and fast growing repository of biomedical literature is expensive and time-consuming. As a solution, text mining approaches have increasingly been deployed in the biomedical domain. While the detection of single-point mutations is well covered by existing systems, challenges still exist in grounding impacts to their respective mutations and recognizing the affected protein properties, in particular kinetic and stability properties together with physical quantities. Results We present an ontology model for mutation impacts, together with a comprehensive text mining system for extracting and analysing mutation impact information from full-text articles. Organisms, as sources of proteins, are extracted to help disambiguation of genes and proteins. Our system then detects mutation series to correctly ground detected impacts using novel heuristics. It also extracts the affected protein properties, in particular kinetic and stability properties, as well as the magnitude of the effects and validates these relations against the domain ontology. The output of our system can be provided in various formats, in particular by populating an OWL-DL ontology, which can then be queried to provide structured information. The performance of the system is evaluated on our manually annotated corpora. In the impact detection task, our system achieves a precision of 70.4%-71.1%, a recall of 71.3%-71.5%, and grounds the detected impacts with an accuracy of 76.5%-77%. The developed system, including resources, evaluation data and end-user and developer documentation is freely available under an open source license at http://www.semanticsoftware.info/open-mutation-miner. 
Conclusion We present Open Mutation Miner (OMM), the first comprehensive, fully open-source approach to automatically extract impacts and related relevant information from the biomedical literature. We assessed the performance of our work on manually annotated corpora and the results show the reliability of our approach. The representation of the extracted information into a structured format facilitates knowledge management and aids in database curation and correction. Furthermore, access to the analysis results is provided through multiple interfaces, including web services for automated data integration and desktop-based solutions for end user interactions. PMID:22759648

  20. Automated ancillary cancer history classification for mesothelioma patients from free-text clinical reports

    PubMed Central

    Wilson, Richard A.; Chapman, Wendy W.; DeFries, Shawn J.; Becich, Michael J.; Chapman, Brian E.

    2010-01-01

    Background: Clinical records are often unstructured, free-text documents that create information extraction challenges and costs. Healthcare delivery and research organizations, such as the National Mesothelioma Virtual Bank, require the aggregation of both structured and unstructured data types. Natural language processing offers techniques for automatically extracting information from unstructured, free-text documents. Methods: Five hundred and eight history and physical reports from mesothelioma patients were split into development (208) and test sets (300). A reference standard was developed and each report was annotated by experts with regard to the patient’s personal history of ancillary cancer and family history of any cancer. The Hx application was developed to process reports, extract relevant features, perform reference resolution and classify them with regard to cancer history. Two methods, Dynamic-Window and ConText, for extracting information were evaluated. Hx’s classification responses using each of the two methods were measured against the reference standard. The average Cohen’s weighted kappa served as the human benchmark in evaluating the system. Results: Hx had a high overall accuracy, with each method, scoring 96.2%. F-measures using the Dynamic-Window and ConText methods were 91.8% and 91.6%, which were comparable to the human benchmark of 92.8%. For the personal history classification, Dynamic-Window scored highest with 89.2% and for the family history classification, ConText scored highest with 97.6%, in which both methods were comparable to the human benchmark of 88.3% and 97.2%, respectively. Conclusion: We evaluated an automated application’s performance in classifying a mesothelioma patient’s personal and family history of cancer from clinical reports. 
To do so, the Hx application must process reports, identify cancer concepts, distinguish the known mesothelioma from ancillary cancers, recognize negation, perform reference resolution and determine the experiencer. Results indicated that both information extraction methods tested were dependent on the domain-specific lexicon and negation extraction. We showed that the more general method, ConText, performed as well as our task-specific method. Although Dynamic-Window could be modified to retrieve other concepts, ConText is more robust and performs better on inconclusive concepts. Hx could greatly improve and expedite the process of extracting data from free-text, clinical records for a variety of research or healthcare delivery organizations. PMID:21031012
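
    The human benchmark in this record is an average Cohen's weighted kappa. The sketch below shows one common form of that statistic, using linear disagreement weights |i - j| over ordered integer categories; the abstract does not state which weighting scheme was used, and the ratings are invented for illustration.

```python
def weighted_kappa(ratings_a, ratings_b, n_categories):
    """Cohen's weighted kappa with linear weights for labels 0 .. n_categories - 1."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two equally long, non-empty rating lists")
    n = len(ratings_a)
    # Marginal label frequencies of each annotator.
    freq_a = [ratings_a.count(c) / n for c in range(n_categories)]
    freq_b = [ratings_b.count(c) / n for c in range(n_categories)]
    # Observed weighted disagreement, averaged over the rated items.
    observed = sum(abs(a - b) for a, b in zip(ratings_a, ratings_b)) / n
    # Weighted disagreement expected if the annotators rated independently.
    expected = sum(
        abs(i - j) * freq_a[i] * freq_b[j]
        for i in range(n_categories)
        for j in range(n_categories)
    )
    if expected == 0.0:
        return 1.0  # assumed convention: both annotators always use the same single label
    return 1.0 - observed / expected


# Hypothetical annotations of four reports on a three-level ordinal scale.
annotator_1 = [0, 1, 2, 2]
annotator_2 = [0, 1, 2, 1]
kappa = weighted_kappa(annotator_1, annotator_2, n_categories=3)
```

    With linear weights, a disagreement between adjacent categories is penalized less than one between distant categories, which is why the single adjacent mismatch here still yields a fairly high kappa.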

  1. Benchmarking infrastructure for mutation text mining

    PubMed Central

    2014-01-01

    Background Experimental research on the automatic extraction of information about mutations from texts is greatly hindered by the lack of consensus evaluation infrastructure for the testing and benchmarking of mutation text mining systems. Results We propose a community-oriented annotation and benchmarking infrastructure to support development, testing, benchmarking, and comparison of mutation text mining systems. The design is based on semantic standards, where RDF is used to represent annotations, an OWL ontology provides an extensible schema for the data and SPARQL is used to compute various performance metrics, so that in many cases no programming is needed to analyze results from a text mining system. While large benchmark corpora for biological entity and relation extraction are focused mostly on genes, proteins, diseases, and species, our benchmarking infrastructure fills the gap for mutation information. The core infrastructure comprises (1) an ontology for modelling annotations, (2) SPARQL queries for computing performance metrics, and (3) a sizeable collection of manually curated documents, that can support mutation grounding and mutation impact extraction experiments. Conclusion We have developed the principal infrastructure for the benchmarking of mutation text mining tasks. The use of RDF and OWL as the representation for corpora ensures extensibility. The infrastructure is suitable for out-of-the-box use in several important scenarios and is ready, in its current state, for initial community adoption. PMID:24568600

  2. Benchmarking infrastructure for mutation text mining.

    PubMed

    Klein, Artjom; Riazanov, Alexandre; Hindle, Matthew M; Baker, Christopher Jo

    2014-02-25

    Experimental research on the automatic extraction of information about mutations from texts is greatly hindered by the lack of consensus evaluation infrastructure for the testing and benchmarking of mutation text mining systems. We propose a community-oriented annotation and benchmarking infrastructure to support development, testing, benchmarking, and comparison of mutation text mining systems. The design is based on semantic standards, where RDF is used to represent annotations, an OWL ontology provides an extensible schema for the data and SPARQL is used to compute various performance metrics, so that in many cases no programming is needed to analyze results from a text mining system. While large benchmark corpora for biological entity and relation extraction are focused mostly on genes, proteins, diseases, and species, our benchmarking infrastructure fills the gap for mutation information. The core infrastructure comprises (1) an ontology for modelling annotations, (2) SPARQL queries for computing performance metrics, and (3) a sizeable collection of manually curated documents, that can support mutation grounding and mutation impact extraction experiments. We have developed the principal infrastructure for the benchmarking of mutation text mining tasks. The use of RDF and OWL as the representation for corpora ensures extensibility. The infrastructure is suitable for out-of-the-box use in several important scenarios and is ready, in its current state, for initial community adoption.

  3. Earth resources data analysis program, phase 3

    NASA Technical Reports Server (NTRS)

    1975-01-01

    Tasks were performed in two areas: (1) systems analysis and (2) algorithmic development. The major effort in the systems analysis task was the development of a recommended approach to the monitoring of resource utilization data for the Large Area Crop Inventory Experiment (LACIE). Other efforts included participation in various studies concerning the LACIE Project Plan, the utility of the GE Image 100, and the specifications for a special purpose processor to be used in the LACIE. In the second task, the major effort was the development of improved algorithms for estimating proportions of unclassified remotely sensed data. Also, work was performed on optimal feature extraction and optimal feature extraction for proportion estimation.

  4. Qualitative modeling of the decision-making process using electrooculography.

    PubMed

    Zargari Marandi, Ramtin; Sabzpoushan, S H

    2015-12-01

    A novel method based on electrooculography (EOG) has been introduced in this work to study the decision-making process. An experiment was designed and implemented wherein subjects were asked to choose between two items from the same category that were presented within a limited time. The EOG and voice signals of the subjects were recorded during the experiment. A calibration task was performed to map the EOG signals to their corresponding gaze positions on the screen by using an artificial neural network. To analyze the data, 16 parameters were extracted from the response time and EOG signals of the subjects. Evaluation and comparison of the parameters, together with subjects' choices, revealed functional information. On the basis of this information, subjects switched their eye gazes between items about three times on average. We also found, according to statistical hypothesis testing (a t test, t(10) = 71.62, SE = 1.25, p < .0001), that the correspondence rate of a subject's gaze at the moment of selection with the selected item was significant. Ultimately, on the basis of these results, we propose a qualitative choice model for the decision-making task.

  5. Taking Word Clouds Apart: An Empirical Investigation of the Design Space for Keyword Summaries.

    PubMed

    Felix, Cristian; Franconeri, Steven; Bertini, Enrico

    2018-01-01

    In this paper we present a set of four user studies aimed at exploring the visual design space of what we call keyword summaries: lists of words with associated quantitative values used to help people derive an intuition of what information a given document collection (or part of it) may contain. We seek to systematically study how different visual representations may affect people's performance in extracting information out of keyword summaries. To this end, we first create a design space of possible visual representations and compare the possible solutions in this design space through a variety of representative tasks and performance metrics. Other researchers have, in the past, studied some aspects of the effectiveness of word clouds; however, the existing literature is somewhat scattered and does not seem to address the problem in a sufficiently systematic and holistic manner. The results of our studies showed a strong dependency on the tasks users are performing. In this paper we present details of our methodology and the results, as well as guidelines on how to design effective keyword summaries based on our discoveries.

  6. HEALTH GeoJunction: place-time-concept browsing of health publications.

    PubMed

    MacEachren, Alan M; Stryker, Michael S; Turton, Ian J; Pezanowski, Scott

    2010-05-18

    The volume of health science publications is escalating rapidly. Thus, keeping up with developments is becoming harder as is the task of finding important cross-domain connections. When geographic location is a relevant component of research reported in publications, these tasks are more difficult because standard search and indexing facilities have limited or no ability to identify geographic foci in documents. This paper introduces HEALTH GeoJunction, a web application that supports researchers in the task of quickly finding scientific publications that are relevant geographically and temporally as well as thematically. HEALTH GeoJunction is a geovisual analytics-enabled web application providing: (a) web services using computational reasoning methods to extract place-time-concept information from bibliographic data for documents and (b) visually-enabled place-time-concept query, filtering, and contextualizing tools that apply to both the documents and their extracted content. This paper focuses specifically on strategies for visually-enabled, iterative, facet-like, place-time-concept filtering that allows analysts to quickly drill down to scientific findings of interest in PubMed abstracts and to explore relations among abstracts and extracted concepts in place and time. The approach enables analysts to: find publications without knowing all relevant query parameters, recognize unanticipated geographic relations within and among documents in multiple health domains, identify the thematic emphasis of research targeting particular places, notice changes in concepts over time, and notice changes in places where concepts are emphasized. PubMed is a database of over 19 million biomedical abstracts and citations maintained by the National Center for Biotechnology Information; achieving quick filtering is an important contribution due to the database size. 
Including geography in filters is important due to rapidly escalating attention to geographic factors in public health. The implementation of mechanisms for iterative place-time-concept filtering makes it possible to narrow searches efficiently and quickly from thousands of documents to a small subset that meet place-time-concept constraints. Support for a more-like-this query creates the potential to identify unexpected connections across diverse areas of research. Multi-view visualization methods support understanding of the place, time, and concept components of document collections and enable comparison of filtered query results to the full set of publications.

  7. Skin Lesion Analysis towards Melanoma Detection Using Deep Learning Network.

    PubMed

    Li, Yuexiang; Shen, Linlin

    2018-02-11

    Skin lesions are a severe disease globally. Early detection of melanoma in dermoscopy images significantly increases the survival rate. However, the accurate recognition of melanoma is extremely challenging due to the following reasons: low contrast between lesions and skin, visual similarity between melanoma and non-melanoma lesions, etc. Hence, reliable automatic detection of skin tumors is very useful to increase the accuracy and efficiency of pathologists. In this paper, we proposed two deep learning methods to address three main tasks emerging in the area of skin lesion image processing, i.e., lesion segmentation (task 1), lesion dermoscopic feature extraction (task 2) and lesion classification (task 3). A deep learning framework consisting of two fully convolutional residual networks (FCRN) is proposed to simultaneously produce the segmentation result and the coarse classification result. A lesion index calculation unit (LICU) is developed to refine the coarse classification results by calculating the distance heat-map. A straight-forward CNN is proposed for the dermoscopic feature extraction task. The proposed deep learning frameworks were evaluated on the ISIC 2017 dataset. Experimental results show the promising accuracies of our frameworks, i.e., 0.753 for task 1, 0.848 for task 2 and 0.912 for task 3 were achieved.

  8. Filtering large-scale event collections using a combination of supervised and unsupervised learning for event trigger classification.

    PubMed

    Mehryary, Farrokh; Kaewphan, Suwisa; Hakala, Kai; Ginter, Filip

    2016-01-01

    Biomedical event extraction is one of the key tasks in biomedical text mining, supporting various applications such as database curation and hypothesis generation. Several systems, some of which have been applied at a large scale, have been introduced to solve this task. Past studies have shown that the identification of the phrases describing biological processes, also known as trigger detection, is a crucial part of event extraction, and notable overall performance gains can be obtained by solely focusing on this sub-task. In this paper we propose a novel approach for filtering falsely identified triggers from large-scale event databases, thus improving the quality of knowledge extraction. Our method relies on state-of-the-art word embeddings, event statistics gathered from the whole biomedical literature, and both supervised and unsupervised machine learning techniques. We focus on EVEX, an event database covering the whole PubMed and PubMed Central Open Access literature containing more than 40 million extracted events. The top most frequent EVEX trigger words are hierarchically clustered, and the resulting cluster tree is pruned to identify words that can never act as triggers regardless of their context. For rarely occurring trigger words we introduce a supervised approach trained on the combination of trigger word classification produced by the unsupervised clustering method and manual annotation. The method is evaluated on the official test set of BioNLP Shared Task on Event Extraction. The evaluation shows that the method can be used to improve the performance of the state-of-the-art event extraction systems. This successful effort also translates into removing 1,338,075 potentially incorrect events from EVEX, thus greatly improving the quality of the data. The method is not solely bound to the EVEX resource and can thus be used to improve the quality of any event extraction system or database. 
The data and source code for this work are available at: http://bionlp-www.utu.fi/trigger-clustering/.

  9. Conversation Thread Extraction and Topic Detection in Text-Based Chat

    DTIC Science & Technology

    2008-09-01

    conversation extraction task. Multiple conversations in a session are interleaved. The goal in extraction is to select only those posts that belong...others. Our first-phase experiments quite clearly show the value of using time-distance as a feature in conversation thread extraction . In this set of... EXTRACTION AND TOPIC DETECTION IN TEXT-BASED CHAT by Paige Holland Adams September 2008 Thesis Advisor

  10. Task-Based Information Searching.

    ERIC Educational Resources Information Center

    Vakkari, Pertti

    2003-01-01

    Reviews studies on the relationship between task performance and information searching by end-users, focusing on information searching in electronic environments and information retrieval systems. Topics include task analysis; task characteristics; search goals; modeling information searching; modeling search goals; information seeking behavior;…

  11. Gender Identification Using High-Frequency Speech Energy: Effects of Increasing the Low-Frequency Limit.

    PubMed

    Donai, Jeremy J; Halbritter, Rachel M

    The purpose of this study was to investigate the ability of normal-hearing listeners to use high-frequency energy for gender identification from naturally produced speech signals. Two experiments were conducted using a repeated-measures design. Experiment 1 investigated the effects of increasing high-pass filter cutoff (i.e., increasing the low-frequency spectral limit) on gender identification from naturally produced vowel segments. Experiment 2 studied the effects of increasing high-pass filter cutoff on gender identification from naturally produced sentences. Confidence ratings for the gender identification task were also obtained for both experiments. Listeners in experiment 1 were capable of extracting talker gender information at levels significantly above chance from vowel segments high-pass filtered up to 8.5 kHz. Listeners in experiment 2 also performed above chance on the gender identification task from sentences high-pass filtered up to 12 kHz. Cumulatively, the results of both experiments provide evidence that normal-hearing listeners can utilize information from the very high-frequency region (above 4 to 5 kHz) of the speech signal for talker gender identification. These findings are at variance with current assumptions regarding the perceptual information regarding talker gender within this frequency region. The current results also corroborate and extend previous studies of the use of high-frequency speech energy for perceptual tasks. These findings have potential implications for the study of information contained within the high-frequency region of the speech spectrum and the role this region may play in navigating the auditory scene, particularly when the low-frequency portion of the spectrum is masked by environmental noise sources or for listeners with substantial hearing loss in the low-frequency region and better hearing sensitivity in the high-frequency region (i.e., reverse slope hearing loss).

  12. Effects of a standardized Bacopa monnieri extract on cognitive performance, anxiety, and depression in the elderly: a randomized, double-blind, placebo-controlled trial.

    PubMed

    Calabrese, Carlo; Gregory, William L; Leo, Michael; Kraemer, Dale; Bone, Kerry; Oken, Barry

    2008-07-01

    Study aims were to evaluate effects of Bacopa monnieri whole plant standardized dry extract on cognitive function and affect and its safety and tolerability in healthy elderly study participants. The study was a randomized, double-blind, placebo-controlled clinical trial with a placebo run-in of 6 weeks and a treatment period of 12 weeks. Volunteers were recruited from the community to a clinic in Portland, Oregon by public notification. Fifty-four (54) participants, 65 or older (mean 73.5 years), without clinical signs of dementia, were recruited and randomized to Bacopa or placebo. Forty-eight (48) completed the study with 24 in each group. Standardized B. monnieri extract 300 mg/day or a similar placebo tablet orally for 12 weeks. The primary outcome variable was the delayed recall score from the Rey Auditory Verbal Learning Test (AVLT). Other cognitive measures were the Stroop Task assessing the ability to ignore irrelevant information, the Divided Attention Task (DAT), and the Wechsler Adult Intelligence Scale (WAIS) letter-digit test of immediate working memory. Affective measures were the State-Trait Anxiety Inventory, Center for Epidemiologic Studies Depression scale (CESD)-10 depression scale, and the Profile of Mood States. Vital signs were also monitored. Controlling for baseline cognitive deficit using the Blessed Orientation-Memory-Concentration test, Bacopa participants had enhanced AVLT delayed word recall memory scores relative to placebo. Stroop results were similarly significant, with the Bacopa group improving and the placebo group unchanged. CESD-10 depression scores, combined state plus trait anxiety scores, and heart rate decreased over time for the Bacopa group but increased for the placebo group. No effects were found on the DAT, WAIS digit task, mood, or blood pressure. The dose was well tolerated with few adverse events (Bacopa n = 9, placebo n = 10), primarily stomach upset. This study provides further evidence that B. monnieri has potential for safely enhancing cognitive performance in the aging.

  13. Task-induced frequency modulation features for brain-computer interfacing

    NASA Astrophysics Data System (ADS)

    Jayaram, Vinay; Hohmann, Matthias; Just, Jennifer; Schölkopf, Bernhard; Grosse-Wentrup, Moritz

    2017-10-01

    Objective. Task-induced amplitude modulation of neural oscillations is routinely used in brain-computer interfaces (BCIs) for decoding subjects’ intents, and underlies some of the most robust and common methods in the field, such as common spatial patterns and Riemannian geometry. While there has been some interest in phase-related features for classification, both techniques usually presuppose that the frequencies of neural oscillations remain stable across various tasks. We investigate here whether features based on task-induced modulation of the frequency of neural oscillations enable decoding of subjects’ intents with an accuracy comparable to task-induced amplitude modulation. Approach. We compare cross-validated classification accuracies using the amplitude and frequency modulated features, as well as a joint feature space, across subjects in various paradigms and pre-processing conditions. We show results with a motor imagery task, a cognitive task, and also preliminary results in patients with amyotrophic lateral sclerosis (ALS), as well as using common spatial patterns and Laplacian filtering. Main results. The frequency features alone do not significantly out-perform traditional amplitude modulation features, and in some cases perform significantly worse. However, across both tasks and pre-processing in healthy subjects the joint space significantly out-performs either the frequency or amplitude features alone. The only exception is ALS patients, for whom the dataset is of insufficient size to draw any statistically significant conclusions. Significance. Task-induced frequency modulation is robust and straightforward to compute, and increases performance when added to standard amplitude modulation features across paradigms. This allows more information to be extracted from the EEG signal cheaply and can be used throughout the field of BCIs.

  14. On the creation of a clinical gold standard corpus in Spanish: Mining adverse drug reactions.

    PubMed

    Oronoz, Maite; Gojenola, Koldo; Pérez, Alicia; de Ilarraza, Arantza Díaz; Casillas, Arantza

    2015-08-01

    The advances achieved in Natural Language Processing make it possible to automatically mine information from electronically created documents. Many Natural Language Processing methods that extract information from texts make use of annotated corpora, but these are scarce in the clinical domain due to legal and ethical issues. In this paper we present the creation of the IxaMed-GS gold standard composed of real electronic health records written in Spanish and manually annotated by experts in pharmacology and pharmacovigilance. The experts mainly annotated entities related to diseases and drugs, but also relationships between entities indicating adverse drug reaction events. To help the experts in the annotation task, we adapted a general corpus linguistic analyzer to the medical domain. The quality of the annotation process in the IxaMed-GS corpus has been assessed by measuring the inter-annotator agreement, which was 90.53% for entities and 82.86% for events. In addition, the corpus has been used for the automatic extraction of adverse drug reaction events using machine learning. Copyright © 2015 Elsevier Inc. All rights reserved.

  15. Identification of pests and diseases of Dalbergia hainanensis based on EVI time series and classification of decision tree

    NASA Astrophysics Data System (ADS)

    Luo, Qiu; Xin, Wu; Qiming, Xiong

    2017-06-01

    Existing approaches to extracting vegetation information from remote sensing data do not account for phenological features or for the low performance of remote sensing analysis algorithms. To solve this problem, a method for extracting remote sensing vegetation information based on EVI time series and decision-tree classification with multi-source branch similarity is proposed. Firstly, to improve the time-series stability of recognition accuracy, the seasonal feature of vegetation is extracted based on the fitting span range of the time series. Secondly, the decision-tree similarity is distinguished by adaptive selection of the path or of the probability parameter of component prediction. This similarity serves as an index to evaluate the degree of task association, to decide whether to perform migration of the multi-source decision tree, and to ensure the speed of migration. Finally, the classification and recognition accuracy for pests and diseases in commercial Dalbergia hainanensis forest reaches 87%-98%, which is significantly better than the MODIS coverage accuracy of 80%-96% in this area. These results support the validity of the proposed method.

  16. Extracting Semantic Building Models from Aerial Stereo Images and Conversion to CityGML

    NASA Astrophysics Data System (ADS)

    Sengul, A.

    2012-07-01

    The collection of geographic data is of primary importance for the creation and maintenance of a GIS. Traditionally the acquisition of 3D information has been the task of photogrammetry using aerial stereo images. Digital photogrammetric systems employ sophisticated software to extract digital terrain models or to plot 3D objects. The demand for 3D city models leads to new applications and new standards. City Geography Markup Language (CityGML), a concept for modelling and exchange of 3D city and landscape models, defines the classes and relations for the most relevant topographic objects in cities and regional models with respect to their geometrical, topological, semantic and appearance properties. It is now increasingly accepted, since it fulfils the prerequisites required e.g. for risk analysis, urban planning, and simulations. There is a need to include existing 3D information derived from photogrammetric processes in CityGML databases. To fill this gap, this paper reports on a framework transferring data plotted by Erdas LPS and Stereo Analyst for ArcGIS software to CityGML using Safe Software's Feature Manipulation Engine (FME).

  17. EEG source space analysis of the supervised factor analytic approach for the classification of multi-directional arm movement

    NASA Astrophysics Data System (ADS)

    Shenoy Handiru, Vikram; Vinod, A. P.; Guan, Cuntai

    2017-08-01

    Objective. In electroencephalography (EEG)-based brain-computer interface (BCI) systems for motor control tasks the conventional practice is to decode motor intentions by using scalp EEG. However, scalp EEG only reveals certain limited information about the complex tasks of movement with a higher degree of freedom. Therefore, our objective is to investigate the effectiveness of source-space EEG in extracting relevant features that discriminate arm movement in multiple directions. Approach. We have proposed a novel feature extraction algorithm based on supervised factor analysis that models the data from source-space EEG. To this end, we computed the features from the source dipoles confined to Brodmann areas of interest (BA4a, BA4p and BA6). Further, we embedded class-wise labels of multi-direction (multi-class) source-space EEG to an unsupervised factor analysis to make it into a supervised learning method. Main Results. Our approach provided an average decoding accuracy of 71% for the classification of hand movement in four orthogonal directions, that is significantly higher (>10%) than the classification accuracy obtained using state-of-the-art spatial pattern features in sensor space. Also, the group analysis on the spectral characteristics of source-space EEG indicates that the slow cortical potentials from a set of cortical source dipoles reveal discriminative information regarding the movement parameter, direction. Significance. This study presents evidence that low-frequency components in the source space play an important role in movement kinematics, and thus it may lead to new strategies for BCI-based neurorehabilitation.

  18. Versatile and efficient pore network extraction method using marker-based watershed segmentation

    NASA Astrophysics Data System (ADS)

    Gostick, Jeff T.

    2017-08-01

    Obtaining structural information from tomographic images of porous materials is a critical component of porous media research. Extracting pore networks is particularly valuable since it enables pore network modeling simulations which can be useful for a host of tasks from predicting transport properties to simulating performance of entire devices. This work reports an efficient algorithm for extracting networks using only standard image analysis techniques. The algorithm was applied to several standard porous materials ranging from sandstone to fibrous mats, and in all cases agreed very well with established or known values for pore and throat sizes, capillary pressure curves, and permeability. In the case of sandstone, the present algorithm was compared to the network obtained using the current state-of-the-art algorithm, and very good agreement was achieved. Most importantly, the network extracted from an image of fibrous media correctly predicted the anisotropic permeability tensor, demonstrating the critical ability to detect key structural features. The highly efficient algorithm allows extraction on fairly large images of 500³ voxels in just over 200 s. The ability for one algorithm to match materials as varied as sandstone with 20% porosity and fibrous media with 75% porosity is a significant advancement. The source code for this algorithm is provided.

  19. Associating Human-Centered Concepts with Social Networks Using Fuzzy Sets

    NASA Astrophysics Data System (ADS)

    Yager, Ronald R.

    The rapidly growing global interconnectivity, brought about to a large extent by the Internet, has dramatically increased the importance and diversity of social networks. Modern social networks cut across a spectrum from benign recreational focused websites such as Facebook to occupationally oriented websites such as LinkedIn to criminally focused groups such as drug cartels to devastation and terror focused groups such as Al-Qaeda. Many organizations are interested in analyzing and extracting information related to these social networks. Among these are governmental police and security agencies as well as marketing and sales organizations. To aid these organizations there is a need for technologies to model social networks and intelligently extract information from these models. While established technologies exist for the modeling of relational networks [1-7], few technologies exist to extract information from these in a way that is compatible with human perception and understanding. Databases are an example of a technology in which we have tools for representing our information as well as tools for querying and extracting the information contained. Our goal is in some sense analogous. We want to use the relational network model to represent information, in this case about relationships and interconnections, and then be able to query the social network using intelligent human-centered concepts. To extend our capabilities to interact with social relational networks we need to associate human concepts and ideas with these networks. Since human beings predominantly use linguistic terms in which to reason and understand, we need to build bridges between human conceptualization and the formal mathematical representation of the social network. Consider for example a concept such as "leader". An analyst may be able to express, in linguistic terms, using a network relevant vocabulary, properties of a leader.
Our task is to translate this linguistic description into a mathematical formalism that allows us to determine how true it is that a particular node is a leader. In this work we look at the use of fuzzy set methodologies [8-10] to provide a bridge between the human analyst and the formal model of the network.
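
    As a rough illustration of the idea in this abstract, the sketch below scores "how true it is that a particular node is a leader" with a fuzzy membership function. The definition of a leader used here (a node directly connected to a high proportion of the other nodes), the piecewise-linear shape of `high`, its breakpoints, and the toy `network` are all assumptions made for illustration; the abstract does not specify them.

```python
from typing import Dict, Set

Graph = Dict[str, Set[str]]  # undirected network stored as adjacency sets


def high(x: float, lo: float = 0.25, hi: float = 0.75) -> float:
    """Fuzzy membership of x in the linguistic term "high" (piecewise linear).

    The breakpoints lo and hi are illustrative assumptions.
    """
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)


def leader_degree(graph: Graph, node: str) -> float:
    """Degree of truth, in [0, 1], that `node` is a leader.

    Assumed definition: a leader is directly connected to a high
    proportion of the other members of the network.
    """
    others = len(graph) - 1
    if others <= 0:
        return 0.0
    return high(len(graph[node]) / others)


# Hypothetical five-member network used only to exercise the functions.
network: Graph = {
    "a": {"b", "c", "d", "e"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
    "e": {"a"},
}

for name in sorted(network):
    print(name, leader_degree(network, name))
```

    A graded score of this kind lets an analyst rank nodes by how well they fit a linguistic description instead of forcing a yes/no decision.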

  20. Image processing and recognition for biological images

    PubMed Central

    Uchida, Seiichi

    2013-01-01

    This paper reviews image processing and pattern recognition techniques, which will be useful to analyze bioimages. Although this paper does not provide their technical details, it will be possible to grasp their main tasks and typical tools to handle the tasks. Image processing is a large research area to improve the visibility of an input image and acquire some valuable information from it. As the main tasks of image processing, this paper introduces gray-level transformation, binarization, image filtering, image segmentation, visual object tracking, optical flow and image registration. Image pattern recognition is the technique to classify an input image into one of the predefined classes and also has a large research area. This paper overviews its two main modules, that is, feature extraction module and classification module. Throughout the paper, it will be emphasized that bioimage is a very difficult target for even state-of-the-art image processing and pattern recognition techniques due to noise, deformations, etc. This paper is expected to be a tutorial guide to bridge biology and image processing researchers for their further collaboration to tackle such a difficult target. PMID:23560739

  1. Entity recognition in the biomedical domain using a hybrid approach.

    PubMed

    Basaldella, Marco; Furrer, Lenz; Tasso, Carlo; Rinaldi, Fabio

    2017-11-09

    This article describes a high-recall, high-precision approach for the extraction of biomedical entities from scientific articles. The approach uses a two-stage pipeline, combining a dictionary-based entity recognizer with a machine-learning classifier. First, the OGER entity recognizer, which has a bias towards high recall, annotates the terms that appear in selected domain ontologies. Subsequently, the Distiller framework uses this information as a feature for a machine learning algorithm to select the relevant entities only. For this step, we compare two different supervised machine-learning algorithms: Conditional Random Fields and Neural Networks. In an in-domain evaluation using the CRAFT corpus, we test the performance of the combined systems when recognizing chemicals, cell types, cellular components, biological processes, molecular functions, organisms, proteins, and biological sequences. Our best system combines dictionary-based candidate generation with Neural-Network-based filtering. It achieves an overall precision of 86% at a recall of 60% on the named entity recognition task, and a precision of 51% at a recall of 49% on the concept recognition task. These results are to our knowledge the best reported so far in this particular task.
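
    The abstract reports precision and recall separately. When comparing such systems, the two are often combined into an F1 score (their harmonic mean). The sketch below applies the standard F1 formula to the figures quoted above; the resulting F1 values are derived here for illustration and are not reported in the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as quoted in the abstract.
ner_f1 = f1_score(0.86, 0.60)      # named entity recognition task
concept_f1 = f1_score(0.51, 0.49)  # concept recognition task

print(f"NER F1: {ner_f1:.3f}")
print(f"Concept recognition F1: {concept_f1:.3f}")
```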

  2. Improving the automated detection of refugee/IDP dwellings using the multispectral bands of the WorldView-2 satellite

    NASA Astrophysics Data System (ADS)

    Kemper, Thomas; Gueguen, Lionel; Soille, Pierre

    2012-06-01

    The enumeration of the population remains a critical task in the management of refugee/IDP camps. Analysis of very high spatial resolution satellite data has proved to be an efficient and secure approach for the estimation of dwellings and the monitoring of the camp over time. In this paper we propose a new methodology for the automated extraction of features based on differential morphological decomposition segmentation for feature extraction and interactive training sample selection from the max-tree and min-tree structures. This feature extraction methodology is tested on a WorldView-2 scene of an IDP camp in Darfur, Sudan. Special emphasis is given to the additional available bands of the WorldView-2 sensor. The results obtained show that the interactive image information tool performs very well by tuning the feature extraction to the local conditions. The analysis of different spectral subsets shows that it is possible to obtain good results already with an RGB combination, but by increasing the number of spectral bands the detection of dwellings becomes more accurate. The best results were obtained using all eight bands of the WorldView-2 satellite.

  3. What's statistical about learning? Insights from modelling statistical learning as a set of memory processes

    PubMed Central

    2017-01-01

    Statistical learning has been studied in a variety of different tasks, including word segmentation, object identification, category learning, artificial grammar learning and serial reaction time tasks (e.g. Saffran et al. 1996 Science 274, 1926–1928; Orban et al. 2008 Proceedings of the National Academy of Sciences 105, 2745–2750; Thiessen & Yee 2010 Child Development 81, 1287–1303; Saffran 2002 Journal of Memory and Language 47, 172–196; Misyak & Christiansen 2012 Language Learning 62, 302–331). The difference among these tasks raises questions about whether they all depend on the same kinds of underlying processes and computations, or whether they are tapping into different underlying mechanisms. Prior theoretical approaches to statistical learning have often tried to explain or model learning in a single task. However, in many cases these approaches appear inadequate to explain performance in multiple tasks. For example, explaining word segmentation via the computation of sequential statistics (such as transitional probability) provides little insight into the nature of sensitivity to regularities among simultaneously presented features. In this article, we will present a formal computational approach that we believe is a good candidate to provide a unifying framework to explore and explain learning in a wide variety of statistical learning tasks. This framework suggests that statistical learning arises from a set of processes that are inherent in memory systems, including activation, interference, integration of information and forgetting (e.g. Perruchet & Vinter 1998 Journal of Memory and Language 39, 246–263; Thiessen et al. 2013 Psychological Bulletin 139, 792–814). 
From this perspective, statistical learning does not involve explicit computation of statistics, but rather the extraction of elements of the input into memory traces, and subsequent integration across those memory traces that emphasize consistent information (Thiessen and Pavlik 2013 Cognitive Science 37, 310–343). This article is part of the themed issue ‘New frontiers for statistical learning in the cognitive sciences'. PMID:27872374

  4. What's statistical about learning? Insights from modelling statistical learning as a set of memory processes.

    PubMed

    Thiessen, Erik D

    2017-01-05

    Statistical learning has been studied in a variety of different tasks, including word segmentation, object identification, category learning, artificial grammar learning and serial reaction time tasks (e.g. Saffran et al. 1996 Science 274, 1926-1928; Orban et al. 2008 Proceedings of the National Academy of Sciences 105, 2745-2750; Thiessen & Yee 2010 Child Development 81, 1287-1303; Saffran 2002 Journal of Memory and Language 47, 172-196; Misyak & Christiansen 2012 Language Learning 62, 302-331). The difference among these tasks raises questions about whether they all depend on the same kinds of underlying processes and computations, or whether they are tapping into different underlying mechanisms. Prior theoretical approaches to statistical learning have often tried to explain or model learning in a single task. However, in many cases these approaches appear inadequate to explain performance in multiple tasks. For example, explaining word segmentation via the computation of sequential statistics (such as transitional probability) provides little insight into the nature of sensitivity to regularities among simultaneously presented features. In this article, we will present a formal computational approach that we believe is a good candidate to provide a unifying framework to explore and explain learning in a wide variety of statistical learning tasks. This framework suggests that statistical learning arises from a set of processes that are inherent in memory systems, including activation, interference, integration of information and forgetting (e.g. Perruchet & Vinter 1998 Journal of Memory and Language 39, 246-263; Thiessen et al. 2013 Psychological Bulletin 139, 792-814).
From this perspective, statistical learning does not involve explicit computation of statistics, but rather the extraction of elements of the input into memory traces, and subsequent integration across those memory traces that emphasize consistent information (Thiessen and Pavlik 2013 Cognitive Science 37, 310-343). This article is part of the themed issue 'New frontiers for statistical learning in the cognitive sciences'. © 2016 The Author(s).

  5. Classification images reveal decision variables and strategies in forced choice tasks

    PubMed Central

    Pritchett, Lisa M.; Murray, Richard F.

    2015-01-01

    Despite decades of research, there is still uncertainty about how people make simple decisions about perceptual stimuli. Most theories assume that perceptual decisions are based on decision variables, which are internal variables that encode task-relevant information. However, decision variables are usually considered to be theoretical constructs that cannot be measured directly, and this often makes it difficult to test theories of perceptual decision making. Here we show how to measure decision variables on individual trials, and we use these measurements to test theories of perceptual decision making more directly than has previously been possible. We measure classification images, which are estimates of templates that observers use to extract information from stimuli. We then calculate the dot product of these classification images with the stimuli to estimate observers' decision variables. Finally, we reconstruct each observer's “decision space,” a map that shows the probability of the observer’s responses for all values of the decision variables. We use this method to examine decision strategies in two-alternative forced choice (2AFC) tasks, for which there are several competing models. In one experiment, the resulting decision spaces support the difference model, a classic theory of 2AFC decisions. In a second experiment, we find unexpected decision spaces that are not predicted by standard models of 2AFC decisions, and that suggest intrinsic uncertainty or soft thresholding. These experiments give new evidence regarding observers’ strategies in 2AFC tasks, and they show how measuring decision variables can answer long-standing questions about perceptual decision making. PMID:26015584
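
    The central computation described in this abstract, estimating a decision variable as the dot product of a classification image with a stimulus, can be sketched as follows. The one-dimensional "images", the function names, and the tie-breaking rule are assumptions made for illustration; the choice rule follows the difference model mentioned in the abstract (pick the interval with the larger decision variable), not necessarily the authors' exact procedure.

```python
from typing import Sequence


def decision_variable(template: Sequence[float], stimulus: Sequence[float]) -> float:
    """Dot product of a classification image (template) with one stimulus."""
    if len(template) != len(stimulus):
        raise ValueError("template and stimulus must have the same length")
    return sum(t * s for t, s in zip(template, stimulus))


def difference_model_choice(
    template: Sequence[float],
    first: Sequence[float],
    second: Sequence[float],
) -> int:
    """Return 1 or 2: the interval a difference-model observer would choose.

    The observer compares the two decision variables and picks the larger;
    ties go to interval 2 here, which is an arbitrary assumption.
    """
    diff = decision_variable(template, first) - decision_variable(template, second)
    return 1 if diff > 0 else 2


# Toy pixel values, flattened to one dimension for brevity.
template = [1.0, 0.0, -1.0]
interval_1 = [2.0, 5.0, 1.0]
interval_2 = [0.0, 3.0, 2.0]

choice = difference_model_choice(template, interval_1, interval_2)
print(choice)
```

    Plotting response probabilities against the pair of decision variables across many trials would give the kind of "decision space" the abstract describes.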

  6. Brain Correlates of Mathematical Competence in Processing Mathematical Representations

    PubMed Central

    Grabner, Roland H.; Reishofer, Gernot; Koschutnig, Karl; Ebner, Franz

    2011-01-01

    The ability to extract numerical information from different representation formats (e.g., equations, tables, or diagrams) is a key component of mathematical competence but little is known about its neural correlate. Previous studies comparing mathematically less and more competent adults have focused on mental arithmetic and reported differences in left angular gyrus (AG) activity which were interpreted to reflect differential reliance on arithmetic fact retrieval during problem solving. The aim of the present functional magnetic resonance imaging study was to investigate the brain correlates of mathematical competence in a task requiring the processing of typical mathematical representations. Twenty-eight adults of lower and higher mathematical competence worked on a representation matching task in which they had to evaluate whether the numerical information of a symbolic equation matches that of a bar chart. Two task conditions without and one condition with arithmetic demands were administered. Both competence groups performed equally well in the non-arithmetic conditions and only differed in accuracy in the condition requiring calculation. Activation contrasts between the groups revealed consistently stronger left AG activation in the more competent individuals across all three task conditions. The finding of competence-related activation differences independently of arithmetic demands suggests that more and less competent individuals differ in a cognitive process other than arithmetic fact retrieval. Specifically, it is argued that the stronger left AG activity in the more competent adults may reflect their higher proficiency in processing mathematical symbols. Moreover, the study demonstrates competence-related parietal activation differences that were not accompanied by differential experimental performance. PMID:22069387

  7. The effect of two different electronic health record user interfaces on intensive care provider task load, errors of cognition, and performance.

    PubMed

    Ahmed, Adil; Chandra, Subhash; Herasevich, Vitaly; Gajic, Ognjen; Pickering, Brian W

    2011-07-01

    The care of critically ill patients generates large quantities of data. Increasingly, these data are presented to the provider within an electronic medical record. The manner in which data are organized and presented can affect the ability of users to synthesize those data into meaningful information. The objective of this study was to test the hypothesis that novel user interfaces, which prioritize the display of high-value data to providers within system-based packages, reduce task load, and result in fewer errors of cognition compared with established user interfaces that do not. Randomized crossover study. Academic tertiary referral center. Attending, resident and fellow critical care physicians. Novel health care record user interface. Subjects randomly assigned to either a standard electronic medical record or a novel user interface, were asked to perform a structured task. The task required the subjects to use the assigned electronic environment to review the medical record of an intensive care unit patient said to be actively bleeding for data that formed the basis of answers to clinical questions posed in the form of a structured questionnaire. The primary outcome was task load, measured using the paper version of the NASA-task load index. Secondary outcome measures included time to task completion, number of errors of cognition measured by comparison of subject to post hoc gold standard questionnaire responses, and the quantity of information presented to subjects by each environment. Twenty subjects completed the task on eight patients, resulting in 160 patient-provider encounters (80 in each group). The standard electronic medical record contained a much larger data volume with a median (interquartile range) number of data points per patient of 1008 (895-1183) compared with 102 (77-112) contained within the novel user interface.
The median (interquartile range) NASA-task load index values were 38.8 (32-45) for the novel user interface compared with 58 (45-65) for the standard electronic medical record (p < .001). The median (interquartile range) times in seconds taken to complete the task for four consecutive patients were 93 (57-132), 60 (48-71), 68 (48-80), and 54 (42-64) for the novel user interface compared with 145 (109-201), 125 (113-162), 129 (100-145), and 112 (92-123) for the standard interface (p < .0001), respectively. The median (interquartile range) number of errors per provider was 0.5 (0-1) and 2 (0.25-3) for the novel user interface and standard electronic medical record interface, respectively (p = .007). A novel user interface was designed based on the information needs of intensive care unit providers with a specific goal of development being the reduction of task load and errors of cognition associated with filtering, extracting, and using medical data contained within a comprehensive electronic medical record. The results of this simulated clinical experiment suggest that the configuration of the intensive care unit user interface contributes significantly to the task load, time to task completion, and number of errors of cognition associated with the identification, and subsequent use, of relevant patient data. Task-specific user interfaces, developed from an understanding of provider information requirements, offer advantages over interfaces currently available within a standard electronic medical record.

  8. Facilitation of memory encoding in primate hippocampus by a neuroprosthesis that promotes task-specific neural firing

    NASA Astrophysics Data System (ADS)

    Hampson, Robert E.; Song, Dong; Opris, Ioan; Santos, Lucas M.; Shin, Dae C.; Gerhardt, Greg A.; Marmarelis, Vasilis Z.; Berger, Theodore W.; Deadwyler, Sam A.

    2013-12-01

    Objective. Memory accuracy is a major problem in human disease and is the primary factor that defines Alzheimer’s, ageing and dementia resulting from impaired hippocampal function in the medial temporal lobe. Development of a hippocampal memory neuroprosthesis that facilitates normal memory encoding in nonhuman primates (NHPs) could provide the basis for improving memory in human disease states. Approach. NHPs trained to perform a short-term delayed match-to-sample (DMS) memory task were examined with multi-neuron recordings from synaptically connected hippocampal cell fields, CA1 and CA3. Recordings were analyzed utilizing a previously developed nonlinear multi-input multi-output (MIMO) neuroprosthetic model, capable of extracting CA3-to-CA1 spatiotemporal firing patterns during DMS performance. Main results. The MIMO model verified that specific CA3-to-CA1 firing patterns were critical for the successful encoding of sample phase information on more difficult DMS trials. This was validated by the delivery of successful MIMO-derived encoding patterns via electrical stimulation to the same CA1 recording locations during the sample phase which facilitated task performance in the subsequent, delayed match phase, on difficult trials that required more precise encoding of sample information. Significance. These findings provide the first successful application of a neuroprosthesis designed to enhance and/or repair memory encoding in primate brain.

  9. A Nonlinear Model for Hippocampal Cognitive Prosthesis: Memory Facilitation by Hippocampal Ensemble Stimulation

    PubMed Central

    Hampson, Robert E.; Song, Dong; Chan, Rosa H.M.; Sweatt, Andrew J.; Riley, Mitchell R.; Gerhardt, Gregory A.; Shin, Dae C.; Marmarelis, Vasilis Z.; Berger, Theodore W.; Deadwyler, Samuel A.

    2012-01-01

    Collaborative investigations have characterized how multineuron hippocampal ensembles encode memory necessary for subsequent successful performance by rodents in a delayed nonmatch to sample (DNMS) task and utilized that information to provide the basis for a memory prosthesis to enhance performance. By employing a unique nonlinear dynamic multi-input/multi-output (MIMO) model, developed and adapted to hippocampal neural ensemble firing patterns derived from simultaneously recorded CA1 and CA3 activity, it was possible to extract information encoded in the sample phase necessary for successful performance in the nonmatch phase of the task. Extending this MIMO model to online delivery of electrical stimulation to the same recording loci, in patterns that mimicked successful CA1 firing, provided the means to increase levels of performance on a trial-by-trial basis. Inclusion of several control procedures provides evidence for the specificity of effective MIMO model generated patterns of electrical stimulation. Increased utility of the MIMO model as a prosthesis device was exhibited by the demonstration of cumulative increases in DNMS task performance with repeated MIMO stimulation over many sessions on both stimulation and nonstimulation trials, suggesting overall system modification with continued exposure. Results reported here are compatible with and extend prior demonstrations and further support the candidacy of the MIMO model as an effective cortical prosthesis. PMID:22438334

  10. Facilitation of Memory Encoding in Primate Hippocampus by a Neuroprosthesis that Promotes Task Specific Neural Firing

    PubMed Central

    Hampson, Robert E.; Song, Dong; Opris, Ioan; Santos, Lucas M.; Shin, Dae C.; Gerhardt, Greg A.; Marmarelis, Vasilis Z.; Berger, Theodore W.; Deadwyler, Sam A.

    2014-01-01

    Objective. Memory accuracy is a major problem in human disease and is the primary factor that defines Alzheimer’s, aging and dementia resulting from impaired hippocampal function in medial temporal lobe. Development of a hippocampal memory neuroprosthesis that facilitates normal memory encoding in nonhuman primates (NHPs) could provide the basis for improving memory in human disease states. Approach. NHPs trained to perform a short-term delayed match to sample (DMS) memory task were examined with multi-neuron recordings from synaptically connected hippocampal cell fields, CA1 and CA3. Recordings were analyzed utilizing a previously developed nonlinear multi-input multi-output (MIMO) neuroprosthetic model, capable of extracting CA3-to-CA1 spatiotemporal firing patterns during DMS performance. Main Results. The MIMO model verified that specific CA3-to-CA1 firing patterns were critical for successful encoding of Sample phase information on more difficult DMS trials. This was validated by delivery of successful MIMO-derived encoding patterns via electrical stimulation to the same CA1 recording locations during the Sample phase which facilitated task performance in the subsequent delayed Match phase on difficult trials that required more precise encoding of Sample information. Significance. These findings provide the first successful application of a neuroprosthesis designed to enhance and/or repair memory encoding in primate brain. PMID:24216292

  11. Intonation processing deficits of emotional words among Mandarin Chinese speakers with congenital amusia: an ERP study.

    PubMed

    Lu, Xuejing; Ho, Hao Tam; Liu, Fang; Wu, Daxing; Thompson, William F

    2015-01-01

    Congenital amusia is a disorder that is known to affect the processing of musical pitch. Although individuals with amusia rarely show language deficits in daily life, a number of findings point to possible impairments in speech prosody that amusic individuals may compensate for by drawing on linguistic information. Using EEG, we investigated (1) whether the processing of speech prosody is impaired in amusia and (2) whether emotional linguistic information can compensate for this impairment. Twenty Chinese amusics and 22 matched controls were presented pairs of emotional words spoken with either statement or question intonation while their EEG was recorded. Their task was to judge whether the intonations were the same. Amusics exhibited impaired performance on the intonation-matching task for emotional linguistic information, as their performance was significantly worse than that of controls. EEG results showed a reduced N2 response to incongruent intonation pairs in amusics compared with controls, which likely reflects impaired conflict processing in amusia. However, our EEG results also indicated that amusics were intact in early sensory auditory processing, as revealed by a comparable N1 modulation in both groups. We propose that the impairment in discriminating speech intonation observed among amusic individuals may arise from an inability to access information extracted at early processing stages. This, in turn, could reflect a disconnection between low-level and high-level processing.

  12. Classification of Hand Grasp Kinetics and Types Using Movement-Related Cortical Potentials and EEG Rhythms.

    PubMed

    Jochumsen, Mads; Rovsing, Cecilie; Rovsing, Helene; Niazi, Imran Khan; Dremstrup, Kim; Kamavuako, Ernest Nlandu

    2017-01-01

    Detection of single-trial movement intentions from EEG is paramount for brain-computer interfacing in neurorehabilitation. These movement intentions contain task-related information and if this is decoded, the neurorehabilitation could potentially be optimized. The aim of this study was to classify single-trial movement intentions associated with two levels of force and speed and three different grasp types using EEG rhythms and components of the movement-related cortical potential (MRCP) as features. The feature importance was used to estimate encoding of discriminative information. Two data sets were used. 29 healthy subjects executed and imagined different hand movements, while EEG was recorded over the contralateral sensorimotor cortex. The following features were extracted: delta, theta, mu/alpha, beta, and gamma rhythms, readiness potential, negative slope, and motor potential of the MRCP. Sequential forward selection was performed, and classification was performed using linear discriminant analysis and support vector machines. Limited classification accuracies were obtained from the EEG rhythms and MRCP-components: 0.48 ± 0.05 (grasp types), 0.41 ± 0.07 (kinetic profiles, motor execution), and 0.39 ± 0.08 (kinetic profiles, motor imagination). Delta activity contributed the most but all features provided discriminative information. These findings suggest that information from the entire EEG spectrum is needed to discriminate between task-related parameters from single-trial movement intentions.
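
    The sequential forward selection step named in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' pipeline: the `score` callable stands in for a cross-validated classifier accuracy (for example from linear discriminant analysis), and the per-feature weights in `TOY_WEIGHTS` are made-up numbers chosen purely so the sketch runs.

```python
from typing import Callable, List, Sequence


def sequential_forward_selection(
    features: Sequence[str],
    score: Callable[[List[str]], float],
    n_select: int,
) -> List[str]:
    """Greedily add, one at a time, the feature that maximizes score(subset)."""
    selected: List[str] = []
    remaining = list(features)
    while remaining and len(selected) < n_select:
        # Evaluate every candidate when appended to the current subset.
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical, illustrative weights (NOT values reported in the study).
TOY_WEIGHTS = {"delta": 0.5, "theta": 0.1, "mu/alpha": 0.2, "beta": 0.3, "gamma": 0.05}


def toy_score(subset: List[str]) -> float:
    """Stand-in for classifier accuracy: the sum of the toy weights."""
    return sum(TOY_WEIGHTS[f] for f in subset)


chosen = sequential_forward_selection(list(TOY_WEIGHTS), toy_score, n_select=3)
print(chosen)
```

    In a real analysis `score` would retrain and cross-validate the classifier for each candidate subset, which is why the procedure is greedy rather than exhaustive.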

  13. Intonation processing deficits of emotional words among Mandarin Chinese speakers with congenital amusia: an ERP study

    PubMed Central

    Lu, Xuejing; Ho, Hao Tam; Liu, Fang; Wu, Daxing; Thompson, William F.

    2015-01-01

    Background: Congenital amusia is a disorder that is known to affect the processing of musical pitch. Although individuals with amusia rarely show language deficits in daily life, a number of findings point to possible impairments in speech prosody that amusic individuals may compensate for by drawing on linguistic information. Using EEG, we investigated (1) whether the processing of speech prosody is impaired in amusia and (2) whether emotional linguistic information can compensate for this impairment. Method: Twenty Chinese amusics and 22 matched controls were presented pairs of emotional words spoken with either statement or question intonation while their EEG was recorded. Their task was to judge whether the intonations were the same. Results: Amusics exhibited impaired performance on the intonation-matching task for emotional linguistic information, as their performance was significantly worse than that of controls. EEG results showed a reduced N2 response to incongruent intonation pairs in amusics compared with controls, which likely reflects impaired conflict processing in amusia. However, our EEG results also indicated that amusics were intact in early sensory auditory processing, as revealed by a comparable N1 modulation in both groups. Conclusion: We propose that the impairment in discriminating speech intonation observed among amusic individuals may arise from an inability to access information extracted at early processing stages. This, in turn, could reflect a disconnection between low-level and high-level processing. PMID:25914659

  14. Hybrid Semantic Analysis for Mapping Adverse Drug Reaction Mentions in Tweets to Medical Terminology.

    PubMed

    Emadzadeh, Ehsan; Sarker, Abeed; Nikfarjam, Azadeh; Gonzalez, Graciela

    2017-01-01

    Social networks, such as Twitter, have become important sources for active monitoring of user-reported adverse drug reactions (ADRs). Automatic extraction of ADR information can be crucial for healthcare providers, drug manufacturers, and consumers. However, because of the non-standard nature of social media language, automatically extracted ADR mentions need to be mapped to standard forms before they can be used by operational pharmacovigilance systems. We propose a modular natural language processing pipeline for mapping (normalizing) colloquial mentions of ADRs to their corresponding standardized identifiers. We seek to accomplish this task and enable customization of the pipeline so that distinct unlabeled free text resources can be incorporated to use the system for other normalization tasks. Our approach, which we call Hybrid Semantic Analysis (HSA), sequentially employs rule-based and semantic matching algorithms for mapping user-generated mentions to concept IDs in the Unified Medical Language System vocabulary. The semantic matching component of HSA is adaptive in nature and uses a regression model to combine various measures of semantic relatedness and resources to optimize normalization performance on the selected data source. On a publicly available corpus, our normalization method achieves 0.502 recall and 0.823 precision (F-measure: 0.624). Our proposed method outperforms a baseline based on latent semantic analysis and another that uses MetaMap.
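
    The reported F-measure follows from the stated precision and recall as their harmonic mean. The short check below assumes the balanced (beta = 1) form of the measure, which is consistent with the three numbers given; the function name `f_measure` is ours.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-beta score; beta = 1 gives the harmonic mean of precision and recall."""
    if precision <= 0.0 or recall <= 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Precision and recall as stated in the abstract.
reported = f_measure(precision=0.823, recall=0.502)
print(f"{reported:.3f}")  # 0.624
```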

  15. Management Information Task Group

    DTIC Science & Technology

    2002-12-18

    Defense Business Practice Implementation Board Management Information Task Group Report FY02-2

  16. PolarHub: A Global Hub for Polar Data Discovery

    NASA Astrophysics Data System (ADS)

    Li, W.

    2014-12-01

    This paper reports the outcome of an NSF project to develop PolarHub, a large-scale web crawler that automatically discovers distributed polar datasets published as OGC web services (OWS) in cyberspace. PolarHub is a machine robot; its goal is to visit as many webpages as possible to find those containing information about polar OWS, extract this information, and store it in the backend data repository. This is a very challenging task given the huge volume of webpages on the Web. Four unique features were introduced in PolarHub to distinguish it from earlier crawler solutions: (1) multi-task, multi-user, multi-thread support for the crawling tasks; (2) extensive use of the thread pool and Data Access Object (DAO) design patterns to separate persistent data storage from business logic and achieve high extensibility of the crawler tool; (3) a customizable crawling algorithm based on pattern matching to support discovery of multiple types of geospatial web services; and (4) a universal and portable client-server communication mechanism combining server-push and client-pull strategies for enhanced asynchronous processing. A series of experiments was conducted to identify the impact of crawling parameters on overall system performance. The geographical distribution pattern of all services identified by PolarHub is also demonstrated. We expect this work to make a major contribution to the field of geospatial information retrieval and geospatial interoperability, to bridge the gap between data provider and data consumer, and to accelerate polar science by enhancing the accessibility and reusability of adequate polar data.
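
    A thread-pool crawler with pattern-based service detection, in the spirit of the design described above, might look like the following sketch. It is not PolarHub's code: the in-memory `PAGES` dictionary replaces real HTTP fetching, the example URLs are invented, and `OWS_PATTERN` is an assumed heuristic that looks for a `service=WMS|WFS|WCS` query parameter as used in OGC key-value requests.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List

# Hypothetical in-memory "web": URL -> page text (no network access).
PAGES: Dict[str, str] = {
    "http://example.org/a": "Sea ice map: http://example.org/wms?SERVICE=WMS&REQUEST=GetCapabilities",
    "http://example.org/b": "No services here.",
    "http://example.org/c": "Features: http://example.org/ows?service=WFS&request=GetCapabilities",
}

# Assumed pattern: a URL whose query string names an OGC service type.
OWS_PATTERN = re.compile(r"https?://\S+?[?&]service=(WMS|WFS|WCS)\b\S*", re.IGNORECASE)


def scan_page(url: str) -> List[Dict[str, str]]:
    """Return one record per OGC-style endpoint found in the page text."""
    text = PAGES[url]
    return [
        {"source": url, "endpoint": m.group(0), "type": m.group(1).upper()}
        for m in OWS_PATTERN.finditer(text)
    ]


def crawl(urls: List[str], workers: int = 4) -> List[Dict[str, str]]:
    """Scan pages concurrently; pool.map keeps results in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(scan_page, urls))
    return [hit for hits in results for hit in hits]


found = crawl(sorted(PAGES))
for hit in found:
    print(hit["type"], hit["endpoint"])
```

    Keeping `scan_page` free of storage concerns mirrors the separation of crawling logic from persistence that the abstract attributes to the DAO pattern; a real crawler would hand each record to a storage layer instead of printing it.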

  17. The evaluation of display symbology - A chronometric study of visual search. [on cathode ray tubes]

    NASA Technical Reports Server (NTRS)

    Remington, R.; Williams, D.

    1984-01-01

    Three single-target visual search tasks were used to evaluate a set of CRT symbols for a helicopter traffic display. The search tasks were representative of the kinds of information extraction required in practice, and reaction time was used to measure the efficiency with which symbols could be located and identified. The results show that familiar numeric symbols were responded to more quickly than graphic symbols. The addition of modifier symbols such as a nearby flashing dot or surrounding square had a greater disruptive effect on the graphic symbols than the alphanumeric characters. The results suggest that a symbol set is like a list that must be learned. Factors that affect the time to respond to items in a list, such as familiarity and visual discriminability, and the division of list items into categories, also affect the time to identify symbols.

  18. ParaBTM: A Parallel Processing Framework for Biomedical Text Mining on Supercomputers.

    PubMed

    Xing, Yuting; Wu, Chengkun; Yang, Xi; Wang, Wei; Zhu, En; Yin, Jianping

    2018-04-27

    A prevailing way of extracting valuable information from biomedical literature is to apply text mining methods on unstructured texts. However, the massive amount of literature that needs to be analyzed poses a big data challenge to the processing efficiency of text mining. In this paper, we address this challenge by introducing parallel processing on a supercomputer. We developed paraBTM, a runnable framework that enables parallel text mining on the Tianhe-2 supercomputer. It employs a low-cost yet effective load balancing strategy to maximize the efficiency of parallel processing. We evaluated the performance of paraBTM on several datasets, utilizing three types of named entity recognition tasks as demonstration. Results show that, in most cases, the processing efficiency can be greatly improved with parallel processing, and the proposed load balancing strategy is simple and effective. In addition, our framework can be readily applied to other tasks of biomedical text mining besides NER.
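
    The abstract does not describe paraBTM's load balancing strategy in detail, so the sketch below shows one generic low-cost approach rather than the framework's own method: longest-processing-time-first assignment, with document size used as an assumed proxy for processing cost. The function name and the document sizes are hypothetical.

```python
import heapq
from typing import Dict, List


def balance_by_size(doc_sizes: Dict[str, int], n_workers: int) -> List[List[str]]:
    """Give each document, largest first, to the currently least loaded worker."""
    heap = [(0, w) for w in range(n_workers)]  # (current load, worker index)
    heapq.heapify(heap)
    assignment: List[List[str]] = [[] for _ in range(n_workers)]
    # Sort by descending size; break ties by name for a deterministic plan.
    for doc, size in sorted(doc_sizes.items(), key=lambda kv: (-kv[1], kv[0])):
        load, worker = heapq.heappop(heap)
        assignment[worker].append(doc)
        heapq.heappush(heap, (load + size, worker))
    return assignment


# Hypothetical document sizes (e.g. in kilobytes).
sizes = {"doc1": 9, "doc2": 7, "doc3": 6, "doc4": 5, "doc5": 3}
plan = balance_by_size(sizes, n_workers=2)
loads = [sum(sizes[d] for d in docs) for docs in plan]
print(plan, loads)
```

    The heap keeps each assignment step logarithmic in the number of workers, so the planning cost stays small next to the text mining itself.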

  19. Global dynamics of selective attention and its lapses in primary auditory cortex.

    PubMed

    Lakatos, Peter; Barczak, Annamaria; Neymotin, Samuel A; McGinnis, Tammy; Ross, Deborah; Javitt, Daniel C; O'Connell, Monica Noelle

    2016-12-01

    Previous research demonstrated that while selectively attending to relevant aspects of the external world, the brain extracts pertinent information by aligning its neuronal oscillations to key time points of stimuli or their sampling by sensory organs. This alignment mechanism is termed oscillatory entrainment. We investigated the global, long-timescale dynamics of this mechanism in the primary auditory cortex of nonhuman primates, and hypothesized that lapses of entrainment would correspond to lapses of attention. By examining electrophysiological and behavioral measures, we observed that besides the lack of entrainment by external stimuli, attentional lapses were also characterized by high-amplitude alpha oscillations, with alpha frequency structuring of neuronal ensemble and single-unit operations. Entrainment and alpha-oscillation-dominated periods were strongly anticorrelated and fluctuated rhythmically at an ultra-slow rate. Our results indicate that these two distinct brain states represent externally versus internally oriented computational resources engaged by large-scale task-positive and task-negative functional networks.

  20. Does silent reading speed in normal adult readers depend on early visual processes? evidence from event-related brain potentials.

    PubMed

    Korinth, Sebastian Peter; Sommer, Werner; Breznitz, Zvia

    2012-01-01

    Little is known about the relationship of reading speed and early visual processes in normal readers. Here we examined the association of the early P1, N170 and late N1 component in visual event-related potentials (ERPs) with silent reading speed and a number of additional cognitive skills in a sample of 52 adult German readers utilizing a Lexical Decision Task (LDT) and a Face Decision Task (FDT). Amplitudes of the N170 component in the LDT but, interestingly, also in the FDT correlated with behavioral tests measuring silent reading speed. We suggest that reading speed performance can be at least partially accounted for by the extraction of essential structural information from visual stimuli, consisting of a domain-general and a domain-specific expertise-based portion. © 2011 Elsevier Inc. All rights reserved.

  1. Effects of VR system fidelity on analyzing isosurface visualization of volume datasets.

    PubMed

    Laha, Bireswar; Bowman, Doug A; Socha, John J

    2014-04-01

    Volume visualization is an important technique for analyzing datasets from a variety of different scientific domains. Volume data analysis is inherently difficult because volumes are three-dimensional, dense, and unfamiliar, requiring scientists to precisely control the viewpoint and to make precise spatial judgments. Researchers have proposed that more immersive (higher fidelity) VR systems might improve task performance with volume datasets, and significant results tied to different components of display fidelity have been reported. However, more information is needed to generalize these results to different task types, domains, and rendering styles. We visualized isosurfaces extracted from synchrotron microscopic computed tomography (SR-μCT) scans of beetles, in a CAVE-like display. We ran a controlled experiment evaluating the effects of three components of system fidelity (field of regard, stereoscopy, and head tracking) on a variety of abstract task categories that are applicable to various scientific domains, and also compared our results with those from our prior experiment using 3D texture-based rendering. We report many significant findings. For example, for search and spatial judgment tasks with isosurface visualization, a stereoscopic display provides better performance, but for tasks with 3D texture-based rendering, displays with higher field of regard were more effective, independent of the levels of the other display components. We also found that systems with high field of regard and head tracking improve performance in spatial judgment tasks. Our results extend existing knowledge and produce new guidelines for designing VR systems to improve the effectiveness of volume data analysis.

  2. Using event-related fMRI to examine sustained attention processes and effects of APOE ε4 in young adults.

    PubMed

    Evans, Simon; Clarke, Devin; Dowell, Nicholas G; Tabet, Naji; King, Sarah L; Hutton, Samuel B; Rusted, Jennifer M

    2018-01-01

    In this study we investigated effects of the APOE ε4 allele (which confers an enhanced risk of poorer cognitive ageing, and Alzheimer's Disease) on sustained attention (vigilance) performance in young adults using the Rapid Visual Information Processing (RVIP) task and event-related fMRI. Previous fMRI work with this task has used block designs: this study is the first to image an extended (6-minute) RVIP task. Participants were 26 carriers of the APOE ε4 allele, and 26 non-carriers (aged 18-28). Pupil diameter was measured throughout, as an index of cognitive effort. We compared activity for RVIP task hits with activity for hits on a control task (with similar visual parameters and response requirements but no working memory load): this contrast showed activity in medial frontal, inferior and superior parietal, temporal and visual cortices, consistent with previous work, demonstrating that meaningful neural data can be extracted from the RVIP task over an extended interval and using an event-related design. Behavioural performance was not affected by genotype; however, a genotype by condition (experimental task/control task) interaction on pupil diameter suggested that ε4 carriers deployed more effort to the experimental compared to the control task. fMRI results showed a condition by genotype interaction in the right hippocampal formation: only ε4 carriers showed downregulation of this region to experimental task hits versus control task hits. Experimental task beta values were correlated against hit rate: parietal correlations were seen in ε4 carriers only, frontal correlations in non-carriers only. The data indicate that, in the absence of behavioural differences, young adult ε4 carriers already show a different linkage between functional brain activity and behaviour, as well as aberrant hippocampal recruitment patterns. This may have relevance for genotype differences in cognitive ageing trajectories.

  3. ICA model order selection of task co-activation networks.

    PubMed

    Ray, Kimberly L; McKay, D Reese; Fox, Peter M; Riedel, Michael C; Uecker, Angela M; Beckmann, Christian F; Smith, Stephen M; Fox, Peter T; Laird, Angela R

    2013-01-01

    Independent component analysis (ICA) has become a widely used method for extracting functional networks in the brain during rest and task. Historically, preferred ICA dimensionality has widely varied within the neuroimaging community, but typically varies between 20 and 100 components. This can be problematic when comparing results across multiple studies because of the impact ICA dimensionality has on the topology of its resultant components. Recent studies have demonstrated that ICA can be applied to peak activation coordinates archived in a large neuroimaging database (i.e., BrainMap Database) to yield whole-brain task-based co-activation networks. A strength of applying ICA to BrainMap data is that the vast amount of metadata in BrainMap can be used to quantitatively assess tasks and cognitive processes contributing to each component. In this study, we investigated the effect of model order on the distribution of functional properties across networks as a method for identifying the most informative decompositions of BrainMap-based ICA components. Our findings suggest dimensionality of 20 for low model order ICA to examine large-scale brain networks, and dimensionality of 70 to provide insight into how large-scale networks fractionate into sub-networks. We also provide a functional and organizational assessment of visual, motor, emotion, and interoceptive task co-activation networks as they fractionate from low to high model-orders.

  4. ICA model order selection of task co-activation networks

    PubMed Central

    Ray, Kimberly L.; McKay, D. Reese; Fox, Peter M.; Riedel, Michael C.; Uecker, Angela M.; Beckmann, Christian F.; Smith, Stephen M.; Fox, Peter T.; Laird, Angela R.

    2013-01-01

    Independent component analysis (ICA) has become a widely used method for extracting functional networks in the brain during rest and task. Historically, preferred ICA dimensionality has widely varied within the neuroimaging community, but typically varies between 20 and 100 components. This can be problematic when comparing results across multiple studies because of the impact ICA dimensionality has on the topology of its resultant components. Recent studies have demonstrated that ICA can be applied to peak activation coordinates archived in a large neuroimaging database (i.e., BrainMap Database) to yield whole-brain task-based co-activation networks. A strength of applying ICA to BrainMap data is that the vast amount of metadata in BrainMap can be used to quantitatively assess tasks and cognitive processes contributing to each component. In this study, we investigated the effect of model order on the distribution of functional properties across networks as a method for identifying the most informative decompositions of BrainMap-based ICA components. Our findings suggest dimensionality of 20 for low model order ICA to examine large-scale brain networks, and dimensionality of 70 to provide insight into how large-scale networks fractionate into sub-networks. We also provide a functional and organizational assessment of visual, motor, emotion, and interoceptive task co-activation networks as they fractionate from low to high model-orders. PMID:24339802

  5. Summary statistics in the attentional blink.

    PubMed

    McNair, Nicolas A; Goodbourn, Patrick T; Shone, Lauren T; Harris, Irina M

    2017-01-01

    We used the attentional blink (AB) paradigm to investigate the processing stage at which extraction of summary statistics from visual stimuli ("ensemble coding") occurs. Experiment 1 examined whether ensemble coding requires attentional engagement with the items in the ensemble. Participants performed two sequential tasks on each trial: gender discrimination of a single face (T1) and estimating the average emotional expression of an ensemble of four faces (or of a single face, as a control condition) as T2. Ensemble coding was affected by the AB when the tasks were separated by a short temporal lag. In Experiment 2, the order of the tasks was reversed to test whether ensemble coding requires more working-memory resources, and therefore induces a larger AB, than estimating the expression of a single face. Each condition produced a similar magnitude AB in the subsequent gender-discrimination T2 task. Experiment 3 additionally investigated whether the previous results were due to participants adopting a subsampling strategy during the ensemble-coding task. Contrary to this explanation, we found different patterns of performance in the ensemble-coding condition and a condition in which participants were instructed to focus on only a single face within an ensemble. Taken together, these findings suggest that ensemble coding emerges automatically as a result of the deployment of attentional resources across the ensemble of stimuli, prior to information being consolidated in working memory.

  6. Is There a Common Summary Statistical Process for Representing the Mean and Variance? A Study Using Illustrations of Familiar Items.

    PubMed

    Yang, Yi; Tokita, Midori; Ishiguchi, Akira

    2018-01-01

    A number of studies revealed that our visual system can extract different types of summary statistics, such as the mean and variance, from sets of items. Although the extraction of such summary statistics has been studied well in isolation, the relationship between these statistics remains unclear. In this study, we explored this issue using an individual differences approach. Observers viewed illustrations of strawberries and lollypops varying in size or orientation and performed four tasks in a within-subject design, namely mean and variance discrimination tasks with size and orientation domains. We found that the performances in the mean and variance discrimination tasks were not correlated with each other and demonstrated that extractions of the mean and variance are mediated by different representation mechanisms. In addition, we tested the relationship between performances in size and orientation domains for each summary statistic (i.e. mean and variance) and examined whether each summary statistic has distinct processes across perceptual domains. The results illustrated that statistical summary representations of size and orientation may share a common mechanism for representing the mean and possibly for representing variance. Introspections for each observer performing the tasks were also examined and discussed.
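
    The two summary statistics studied here, and the individual-differences logic of correlating observers' performance across tasks, can be illustrated with a short sketch. All numbers are invented for illustration and are not data from the study; `summary_statistics` and `pearson_r` are our own helper names.

```python
from math import sqrt
from statistics import mean, pvariance
from typing import Sequence, Tuple


def summary_statistics(values: Sequence[float]) -> Tuple[float, float]:
    """Mean and (population) variance of a set of item sizes."""
    return mean(values), pvariance(values)


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; assumes neither input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical item sizes in one display.
item_sizes = [2.0, 4.0, 4.0, 6.0]
set_mean, set_variance = summary_statistics(item_sizes)

# Hypothetical per-observer thresholds in the two discrimination tasks.
mean_task = [1.0, 2.0, 3.0, 4.0]
variance_task = [2.0, 4.0, 1.0, 3.0]
r = pearson_r(mean_task, variance_task)
print(set_mean, set_variance, r)
```

    A correlation near zero across observers, as in this toy example, is the kind of pattern the authors interpret as evidence for separate mechanisms.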

  7. A harmonic linear dynamical system for prominent ECG feature extraction.

    PubMed

    Thi, Ngoc Anh Nguyen; Yang, Hyung-Jeong; Kim, SunHee; Do, Luu Ngoc

    2014-01-01

    Unsupervised mining of electrocardiography (ECG) time series is a crucial task in biomedical applications. To obtain efficient clustering results, the prominent features extracted by preprocessing multiple ECG time series need to be investigated. In this paper, a Harmonic Linear Dynamical System is applied to discover vital prominent features by mining the evolving hidden dynamics and correlations in ECG time series. The comprehensible and interpretable features discovered by the proposed feature extraction methodology support the accuracy and reliability of the clustering results. In particular, the empirical evaluation of the proposed method demonstrates improved clustering performance compared with previous mainstream feature extraction approaches for ECG time series clustering tasks. Furthermore, the experimental results on real-world datasets show scalability, with computation time linear in the duration of the time series.

  8. Task-dependent and task-independent neurovascular responses to syntactic processing

    PubMed Central

    Caplan, David; Chen, Evan; Waters, Gloria

    2008-01-01

    The neural basis for syntactic processing was studied using event-related fMRI to determine the locations of BOLD signal increases in the contrast of syntactically complex sentences with center-embedded, object-extracted relative clauses and syntactically simple sentences with right-branching, subject-extracted relative clauses in a group of 15 participants in three tasks. In a sentence verification task, participants saw a target sentence in one of these two syntactic forms, followed by a probe in a simple active form, and determined whether the probe expressed a proposition in the target. In a plausibility judgment task, participants determined whether a sentence in one of these two syntactic forms was plausible or implausible. Finally, in a non-word detection task, participants determined whether a sentence in one of these two syntactic forms contained only real words or a non-word. BOLD signal associated with the syntactic contrast increased in the left posterior inferior frontal gyrus in non-word detection and in a widespread set of areas in the other two tasks. We conclude that the BOLD activity in the left posterior inferior frontal gyrus reflects syntactic processing independent of concurrent cognitive operations and the more widespread areas of activation reflect the use of strategies and the use of the products of syntactic processing to accomplish tasks. PMID:18387556

  9. Skin Lesion Analysis towards Melanoma Detection Using Deep Learning Network

    PubMed Central

    2018-01-01

    Skin lesions are a severe disease globally. Early detection of melanoma in dermoscopy images significantly increases the survival rate. However, the accurate recognition of melanoma is extremely challenging due to the following reasons: low contrast between lesions and skin, visual similarity between melanoma and non-melanoma lesions, etc. Hence, reliable automatic detection of skin tumors is very useful to increase the accuracy and efficiency of pathologists. In this paper, we proposed two deep learning methods to address three main tasks emerging in the area of skin lesion image processing, i.e., lesion segmentation (task 1), lesion dermoscopic feature extraction (task 2) and lesion classification (task 3). A deep learning framework consisting of two fully convolutional residual networks (FCRN) is proposed to simultaneously produce the segmentation result and the coarse classification result. A lesion index calculation unit (LICU) is developed to refine the coarse classification results by calculating the distance heat-map. A straight-forward CNN is proposed for the dermoscopic feature extraction task. The proposed deep learning frameworks were evaluated on the ISIC 2017 dataset. Experimental results show the promising accuracies of our frameworks, i.e., 0.753 for task 1, 0.848 for task 2 and 0.912 for task 3 were achieved. PMID:29439500

  10. Loose-Lipped Mobile Device Intelligent Personal Assistants: A Discussion of Information Gleaned from Siri on Locked iOS Devices.

    PubMed

    Horsman, Graeme

    2018-04-23

    The forensic analysis of mobile handsets is becoming a more prominent factor in many criminal investigations. Despite such devices frequently storing relevant evidential content to support an investigation, accessing this information is becoming an increasingly difficult task due to enhanced effective security features. Where access to a device's resident data is not possible via traditional mobile forensic methods, in some cases it may still be possible to extract user information via queries made to an installed intelligent personal assistant. This article presents an evaluation of the information which is retrievable from Apple's Siri when interacted with on a locked iOS device running iOS 11.2.5 (the latest at the time of testing). The testing of verbal commands designed to elicit a response from Siri demonstrates the ability to recover call log, SMS, Contacts, Apple Maps, Calendar, and device information which may support any further investigation. © 2018 American Academy of Forensic Sciences.

  11. Biometric recognition via texture features of eye movement trajectories in a visual searching task.

    PubMed

    Li, Chunyong; Xue, Jiguo; Quan, Cheng; Yue, Jingwei; Zhang, Chenggang

    2018-01-01

    Biometric recognition technology based on eye-movement dynamics has been in development for more than ten years. Different visual tasks, feature extraction and feature recognition methods have been proposed to improve the performance of eye movement biometric systems. However, the correct identification and verification rates, especially in long-term experiments, as well as the effects of visual tasks and eye trackers' temporal and spatial resolution are still the foremost considerations in eye movement biometrics. With a focus on these issues, we proposed a new visual searching task for eye movement data collection and a new class of eye movement features for biometric recognition. In order to demonstrate the improvement of this visual searching task being used in eye movement biometrics, three other eye movement feature extraction methods were also tested on our eye movement datasets. Compared with the original results, all three methods yielded better results as expected. In addition, the biometric performance of these four feature extraction methods was also compared using the equal error rate (EER) and Rank-1 identification rate (Rank-1 IR), and the texture features introduced in this paper were ultimately shown to offer some advantages with regard to long-term stability and robustness over time and spatial precision. Finally, the results of different combinations of these methods with a score-level fusion method indicated that multi-biometric methods perform better in most cases.
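
    The equal error rate (EER) named in the abstract above is the operating point at which the false accept rate and the false reject rate coincide. The following is only a minimal sketch of that definition, not the authors' evaluation code: the function name, the score lists, and the "accept when score >= threshold" convention are all assumptions made for illustration.

```python
def equal_error_rate(genuine, impostor):
    """Return (eer, threshold) at the threshold where FAR and FRR are closest.

    genuine  : similarity scores for same-subject comparisons (hypothetical data)
    impostor : similarity scores for different-subject comparisons
    A comparison is accepted when its score is >= the threshold.
    """
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far = sum(s >= t for s in impostor) / len(impostor)  # false accepts
        frr = sum(s < t for s in genuine) / len(genuine)     # false rejects
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Toy scores, invented for the example.
eer, threshold = equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.1, 0.2, 0.3, 0.6])
print(eer, threshold)
```

    With so few scores the two error curves may never cross exactly, so the sketch reports the midpoint of FAR and FRR at the closest threshold.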

  12. Biometric recognition via texture features of eye movement trajectories in a visual searching task

    PubMed Central

    Li, Chunyong; Xue, Jiguo; Quan, Cheng; Yue, Jingwei

    2018-01-01

    Biometric recognition technology based on eye-movement dynamics has been in development for more than ten years. Different visual tasks, feature extraction and feature recognition methods have been proposed to improve the performance of eye movement biometric systems. However, the correct identification and verification rates, especially in long-term experiments, as well as the effects of visual tasks and eye trackers’ temporal and spatial resolution are still the foremost considerations in eye movement biometrics. With a focus on these issues, we proposed a new visual searching task for eye movement data collection and a new class of eye movement features for biometric recognition. In order to demonstrate the improvement of this visual searching task being used in eye movement biometrics, three other eye movement feature extraction methods were also tested on our eye movement datasets. Compared with the original results, all three methods yielded better results as expected. In addition, the biometric performance of these four feature extraction methods was also compared using the equal error rate (EER) and Rank-1 identification rate (Rank-1 IR), and the texture features introduced in this paper were ultimately shown to offer some advantages with regard to long-term stability and robustness over time and spatial precision. Finally, the results of different combinations of these methods with a score-level fusion method indicated that multi-biometric methods perform better in most cases. PMID:29617383

  13. Automatically identifying, counting, and describing wild animals in camera-trap images with deep learning.

    PubMed

    Norouzzadeh, Mohammad Sadegh; Nguyen, Anh; Kosmala, Margaret; Swanson, Alexandra; Palmer, Meredith S; Packer, Craig; Clune, Jeff

    2018-06-19

    Having accurate, detailed, and up-to-date information about the location and behavior of animals in the wild would improve our ability to study and conserve ecosystems. We investigate the ability to automatically, accurately, and inexpensively collect such data, which could help catalyze the transformation of many fields of ecology, wildlife biology, zoology, conservation biology, and animal behavior into "big data" sciences. Motion-sensor "camera traps" enable collecting wildlife pictures inexpensively, unobtrusively, and frequently. However, extracting information from these pictures remains an expensive, time-consuming, manual task. We demonstrate that such information can be automatically extracted by deep learning, a cutting-edge type of artificial intelligence. We train deep convolutional neural networks to identify, count, and describe the behaviors of 48 species in the 3.2 million-image Snapshot Serengeti dataset. Our deep neural networks automatically identify animals with >93.8% accuracy, and we expect that number to improve rapidly in years to come. More importantly, if our system classifies only images it is confident about, our system can automate animal identification for 99.3% of the data while still performing at the same 96.6% accuracy as that of crowdsourced teams of human volunteers, saving >8.4 y (i.e., >17,000 h at 40 h/wk) of human labeling effort on this 3.2 million-image dataset. Those efficiency gains highlight the importance of using deep neural networks to automate data extraction from camera-trap images, reducing a roadblock for this widely used technology. Our results suggest that deep learning could enable the inexpensive, unobtrusive, high-volume, and even real-time collection of a wealth of information about vast numbers of animals in the wild. Copyright © 2018 the Author(s). Published by PNAS.
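
    The trade-off described above, where a system labels only the images it is confident about, can be sketched as a coverage/accuracy calculation. This is a hedged illustration rather than the authors' pipeline: the function name, the threshold, and the toy predictions are hypothetical.

```python
def selective_accuracy(predictions, threshold):
    """Return (coverage, accuracy) when only confident predictions are kept.

    predictions : list of (confidence, is_correct) pairs (hypothetical data)
    threshold   : minimum confidence for the system to label an image itself
    """
    kept = [ok for conf, ok in predictions if conf >= threshold]
    coverage = len(kept) / len(predictions)            # share of images automated
    accuracy = sum(kept) / len(kept) if kept else 0.0  # accuracy on that share
    return coverage, accuracy


# Toy predictions, invented for the example.
preds = [(0.99, True), (0.95, True), (0.90, True), (0.60, False), (0.55, True)]
print(selective_accuracy(preds, 0.9))
print(selective_accuracy(preds, 0.0))
```

    Raising the threshold usually lowers coverage and raises accuracy on the retained images; the images that fall below it would be left to human labelers.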

  14. Preparation and use of varied natural tools for extractive foraging by bonobos (Pan paniscus).

    PubMed

    Roffman, Itai; Savage-Rumbaugh, Sue; Rubert-Pugh, Elizabeth; Stadler, André; Ronen, Avraham; Nevo, Eviatar

    2015-09-01

    The tool-assisted extractive foraging capabilities of captive (zoo) and semi-captive (sanctuary) bonobo (Pan paniscus) groups were compared to each other and to those known in wild chimpanzee (Pan troglodytes) cultures. The bonobos were provided with natural raw materials and challenged with tasks not previously encountered, in experimental settings simulating natural contexts where resources requiring special retrieval efforts were hidden. They were shown that food was buried underground or inserted into long bone cavities, and left to tackle the tasks without further intervention. The bonobos used modified branches and unmodified antlers or stones to dig under rocks and in the ground or to break bones to retrieve the food. Antlers, short sticks, long sticks, and rocks were effectively used as mattocks, daggers, levers, and shovels, respectively. One bonobo successively struck a long bone with an angular hammer stone, completely bisecting it longitudinally. Another bonobo modified long branches into spears and used them as attack weapons and barriers. Bonobos in the sanctuary, unlike those in the zoo, used tool sets to perform sequential actions. The competent and diverse tool-assisted extractive foraging by the bonobos corroborates and complements the extensive information on similar tool use by chimpanzees, suggesting that such competence is a shared trait. Better performance by the sanctuary bonobos than the zoo group was probably due to differences in their cultural exposure and housing conditions. The bonobos' foraging techniques resembled some of those attributed to Oldowan hominins, implying that they can serve as referential models. © 2015 Wiley Periodicals, Inc.

  15. Affective video retrieval: violence detection in Hollywood movies by large-scale segmental feature extraction.

    PubMed

    Eyben, Florian; Weninger, Felix; Lehment, Nicolas; Schuller, Björn; Rigoll, Gerhard

    2013-01-01

    Without doubt general video and sound, as found in large multimedia archives, carry emotional information. Thus, audio and video retrieval by certain emotional categories or dimensions could play a central role for tomorrow's intelligent systems, enabling search for movies with a particular mood, computer aided scene and sound design in order to elicit certain emotions in the audience, etc. Yet, the lion's share of research in affective computing is exclusively focusing on signals conveyed by humans, such as affective speech. Uniting the fields of multimedia retrieval and affective computing is believed to lend to a multiplicity of interesting retrieval applications, and at the same time to benefit affective computing research, by moving its methodology "out of the lab" to real-world, diverse data. In this contribution, we address the problem of finding "disturbing" scenes in movies, a scenario that is highly relevant for computer-aided parental guidance. We apply large-scale segmental feature extraction combined with audio-visual classification to the particular task of detecting violence. Our system performs fully data-driven analysis including automatic segmentation. We evaluate the system in terms of mean average precision (MAP) on the official data set of the MediaEval 2012 evaluation campaign's Affect Task, which consists of 18 original Hollywood movies, achieving up to .398 MAP on unseen test data in full realism. An in-depth analysis of the worth of individual features with respect to the target class and the system errors is carried out and reveals the importance of peak-related audio feature extraction and low-level histogram-based video analysis.
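
    Mean average precision (MAP), the metric reported above, averages the precision measured at each relevant item in a ranked list and then averages that value over lists. The sketch below shows the textbook calculation only; it assumes every relevant item appears somewhere in each ranked list, and the function names and toy labels are hypothetical rather than taken from the MediaEval tooling.

```python
def average_precision(ranked_labels):
    """Average precision for one ranked list of 0/1 relevance labels."""
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(ranked_labels, start=1):
        if relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant item
    return precision_sum / hits if hits else 0.0


def mean_average_precision(runs):
    """Mean of the per-list average precision values."""
    return sum(average_precision(r) for r in runs) / len(runs)


# Two toy rankings (1 = relevant segment, 0 = not relevant), invented here.
print(mean_average_precision([[1, 0, 1, 0], [0, 1]]))
```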

  16. Affective Video Retrieval: Violence Detection in Hollywood Movies by Large-Scale Segmental Feature Extraction

    PubMed Central

    Eyben, Florian; Weninger, Felix; Lehment, Nicolas; Schuller, Björn; Rigoll, Gerhard

    2013-01-01

    Without doubt general video and sound, as found in large multimedia archives, carry emotional information. Thus, audio and video retrieval by certain emotional categories or dimensions could play a central role for tomorrow's intelligent systems, enabling search for movies with a particular mood, computer aided scene and sound design in order to elicit certain emotions in the audience, etc. Yet, the lion's share of research in affective computing is exclusively focusing on signals conveyed by humans, such as affective speech. Uniting the fields of multimedia retrieval and affective computing is believed to lend to a multiplicity of interesting retrieval applications, and at the same time to benefit affective computing research, by moving its methodology “out of the lab” to real-world, diverse data. In this contribution, we address the problem of finding “disturbing” scenes in movies, a scenario that is highly relevant for computer-aided parental guidance. We apply large-scale segmental feature extraction combined with audio-visual classification to the particular task of detecting violence. Our system performs fully data-driven analysis including automatic segmentation. We evaluate the system in terms of mean average precision (MAP) on the official data set of the MediaEval 2012 evaluation campaign's Affect Task, which consists of 18 original Hollywood movies, achieving up to .398 MAP on unseen test data in full realism. An in-depth analysis of the worth of individual features with respect to the target class and the system errors is carried out and reveals the importance of peak-related audio feature extraction and low-level histogram-based video analysis. PMID:24391704

  17. Apriori Versions Based on MapReduce for Mining Frequent Patterns on Big Data.

    PubMed

    Luna, Jose Maria; Padillo, Francisco; Pechenizkiy, Mykola; Ventura, Sebastian

    2017-09-27

    Pattern mining is one of the most important tasks to extract meaningful and useful information from raw data. This task aims to extract item-sets that represent any type of homogeneity and regularity in data. Although many efficient algorithms have been developed in this regard, the growing interest in data has caused the performance of existing pattern mining techniques to drop. The goal of this paper is to propose new efficient pattern mining algorithms to work in big data. To this aim, a series of algorithms based on the MapReduce framework and the Hadoop open-source implementation have been proposed. The proposed algorithms can be divided into three main groups. First, two algorithms [Apriori MapReduce (AprioriMR) and iterative AprioriMR] with no pruning strategy are proposed, which extract any existing item-set in data. Second, two algorithms (space pruning AprioriMR and top AprioriMR) that prune the search space by means of the well-known anti-monotone property are proposed. Finally, a last algorithm (maximal AprioriMR) is also proposed for mining condensed representations of frequent patterns. To test the performance of the proposed algorithms, a varied collection of big data datasets has been considered, comprising up to 3 · 10¹⁸ transactions and more than 5 million distinct single items. The experimental stage includes comparisons against highly efficient and well-known pattern mining algorithms. Results reveal the interest of applying MapReduce versions when complex problems are considered, and also the unsuitability of this paradigm when dealing with small data.
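
    The map/reduce split and the anti-monotone pruning mentioned above can be illustrated in a single process. The sketch below is not the AprioriMR code and does not use Hadoop; it only mimics the idea that mappers emit (item-set, 1) pairs, a reducer sums them, and a candidate is skipped when any of its subsets was infrequent at the previous level. All function names and the toy transactions are hypothetical.

```python
from collections import Counter
from itertools import combinations


def map_phase(transaction, size, previous=None):
    """Emit (itemset, 1) for each candidate itemset of `size` in one transaction.

    Anti-monotone pruning: a candidate is skipped when any of its
    (size - 1)-subsets was not frequent at the previous level.
    """
    for itemset in combinations(sorted(transaction), size):
        if previous is not None and any(
            sub not in previous for sub in combinations(itemset, size - 1)
        ):
            continue
        yield itemset, 1


def reduce_phase(pairs):
    """Sum the counts emitted for each itemset key."""
    counts = Counter()
    for itemset, n in pairs:
        counts[itemset] += n
    return counts


def frequent_itemsets(transactions, min_support, max_size):
    """Level-wise search: one simulated map/reduce round per itemset size."""
    frequent, previous = {}, None
    for size in range(1, max_size + 1):
        pairs = (p for t in transactions for p in map_phase(t, size, previous))
        counts = reduce_phase(pairs)
        level = {k: n for k, n in counts.items() if n >= min_support}
        if not level:
            break
        frequent.update(level)
        previous = level
    return frequent


# Toy transactions, invented for the example.
data = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
print(frequent_itemsets(data, min_support=3, max_size=3))
```

    In a real MapReduce job the map calls would run on separate data partitions and the reducer would receive pairs grouped by key; here both phases run sequentially for clarity.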

  18. Mining Tasks from the Web Anchor Text Graph: MSR Notebook Paper for the TREC 2015 Tasks Track

    DTIC Science & Technology

    2015-11-20

    Mining Tasks from the Web Anchor Text Graph: MSR Notebook Paper for the TREC 2015 Tasks Track Paul N. Bennett Microsoft Research Redmond, USA pauben...anchor text graph has proven useful in the general realm of query reformulation [2], we sought to quantify the value of extracting key phrases from...anchor text in the broader setting of the task understanding track. Given a query, our approach considers a simple method for identifying a relevant

  19. Radar analysis of free oscillations of rail for diagnostics defects

    NASA Astrophysics Data System (ADS)

    Shaydurov, G. Y.; Kudinov, D. S.; Kokhonkova, E. A.; Potylitsyn, V. S.

    2018-05-01

    One of the goals in developing and deploying defectoscopy devices is to minimize the influence of the human factor in their operation. At present, rail inspection systems do not examine the rail to a sufficient depth, and ultrasonic diagnostic systems require the sensor to be in contact with the surface being studied, which leads to low productivity. The article gives a comparative analysis of existing noncontact methods of flaw detection and proposes a contactless diagnostic method that excites acoustic waves and extracts information about defects from the frequency of free rail oscillations using the radar method.

  20. A logical model of cooperating rule-based systems

    NASA Technical Reports Server (NTRS)

    Bailin, Sidney C.; Moore, John M.; Hilberg, Robert H.; Murphy, Elizabeth D.; Bahder, Shari A.

    1989-01-01

    A model is developed to assist in the planning, specification, development, and verification of space information systems involving distributed rule-based systems. The model is based on an analysis of possible uses of rule-based systems in control centers. This analysis is summarized as a data-flow model for a hypothetical intelligent control center. From this data-flow model, the logical model of cooperating rule-based systems is extracted. This model consists of four layers of increasing capability: (1) communicating agents, (2) belief-sharing knowledge sources, (3) goal-sharing interest areas, and (4) task-sharing job roles.

  1. A framework for biomedical figure segmentation towards image-based document retrieval

    PubMed Central

    2013-01-01

    The figures included in many biomedical publications play an important role in understanding the biological experiments and facts described within. Recent studies have shown that it is possible to integrate the information that is extracted from figures in classical document classification and retrieval tasks in order to improve their accuracy. One important observation about the figures included in biomedical publications is that they are often composed of multiple subfigures or panels, each describing different methodologies or results. The use of these multimodal figures is a common practice in bioscience, as experimental results are graphically validated via multiple methodologies or procedures. Thus, for a better use of multimodal figures in document classification or retrieval tasks, as well as for providing the evidence source for derived assertions, it is important to automatically segment multimodal figures into subfigures and panels. This is a challenging task, however, as different panels can contain similar objects (e.g., bar charts and line charts) with multiple layouts. Also, certain types of biomedical figures are text-heavy (e.g., DNA sequence and protein sequence images) and they differ from traditional images. As a result, classical image segmentation techniques based on low-level image features, such as edges or color, are not directly applicable to robustly partition multimodal figures into single modal panels. In this paper, we describe a robust solution for automatically identifying and segmenting unimodal panels from a multimodal figure. Our framework starts by robustly harvesting figure-caption pairs from biomedical articles. We base our approach on the observation that the document layout can be used to identify encoded figures and figure boundaries within PDF files. Taking into consideration the document layout allows us to correctly extract figures from the PDF document and associate their corresponding caption. 
We combine pixel-level representations of the extracted images with information gathered from their corresponding captions to estimate the number of panels in the figure. Thus, our approach simultaneously identifies the number of panels and the layout of figures. In order to evaluate the approach described here, we applied our system to documents containing protein-protein interactions (PPIs) and compared the results against a gold standard that was annotated by biologists. Experimental results showed that our automatic figure segmentation approach surpasses pure caption-based and image-based approaches, achieving 96.64% accuracy. To allow for efficient retrieval of information, as well as to provide the basis for integration into document classification and retrieval systems among others, we further developed a web-based interface that lets users easily retrieve panels containing the terms specified in the user queries. PMID:24565394

  2. Classification of EEG Signals Based on Pattern Recognition Approach.

    PubMed

    Amin, Hafeez Ullah; Mumtaz, Wajid; Subhani, Ahmad Rauf; Saad, Mohamad Naufal Mohamad; Malik, Aamir Saeed

    2017-01-01

    Feature extraction is an important step in the process of electroencephalogram (EEG) signal classification. The authors propose a "pattern recognition" approach that discriminates EEG signals recorded during different cognitive conditions. Wavelet-based features, such as multi-resolution decompositions into detailed and approximate coefficients as well as relative wavelet energy, were computed. Extracted relative wavelet energy features were normalized to zero mean and unit variance and then optimized using Fisher's discriminant ratio (FDR) and principal component analysis (PCA). A high-density (128-channel) EEG dataset validated the proposed method by identifying two classifications: (1) EEG signals recorded during complex cognitive tasks using Raven's Advanced Progressive Matrices (RAPM) test; (2) EEG signals recorded during a baseline task (eyes open). Classifiers such as K-nearest neighbors (KNN), Support Vector Machine (SVM), Multi-layer Perceptron (MLP), and Naïve Bayes (NB) were then employed. Outcomes yielded 99.11% accuracy via SVM classifier for coefficient approximations (A5) of low frequencies ranging from 0 to 3.90 Hz. For detailed coefficients (D5) deriving from the sub-band range (3.90-7.81 Hz), accuracy rates were 98.57 and 98.39% for SVM and KNN, respectively. Accuracy rates for MLP and NB classifiers were comparable at 97.11-89.63% and 91.60-81.07% for A5 and D5 coefficients, respectively. In addition, the proposed approach was also applied to a public dataset for classification of two cognitive tasks and achieved comparable classification results, i.e., 93.33% accuracy with KNN. The proposed scheme yielded significantly higher classification performances using machine learning classifiers compared to extant quantitative feature extraction. These results suggest the proposed feature extraction method reliably classifies EEG signals recorded during cognitive tasks with a higher degree of accuracy.
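
    Three of the quantities named above have compact standard definitions: relative wavelet energy (the share of total squared-coefficient energy in each sub-band), normalization to zero mean and unit variance, and the two-class Fisher's discriminant ratio. The sketch below assumes the wavelet coefficients have already been computed by some decomposition step that is not shown; the function names and toy values are hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev, pvariance


def relative_wavelet_energy(bands):
    """Energy of each sub-band divided by the total energy over all sub-bands.

    bands : list of coefficient lists, one per decomposition level (assumed given)
    """
    energies = [sum(c * c for c in coeffs) for coeffs in bands]
    total = sum(energies)
    return [e / total for e in energies]


def zscore(values):
    """Normalize one feature to zero mean and unit (population) variance."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def fisher_discriminant_ratio(class_a, class_b):
    """Two-class FDR for one feature: (mu_a - mu_b)^2 / (var_a + var_b)."""
    diff = mean(class_a) - mean(class_b)
    return diff ** 2 / (pvariance(class_a) + pvariance(class_b))


# Toy coefficients and feature values, invented for the example.
print(relative_wavelet_energy([[1, 1], [2], [1, 1]]))
print(zscore([1, 2, 3]))
print(fisher_discriminant_ratio([1, 3], [5, 7]))
```

    A larger ratio means the feature separates the two classes better relative to its spread, which is why it can be used to rank features before classification.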

  3. Classification of EEG Signals Based on Pattern Recognition Approach

    PubMed Central

    Amin, Hafeez Ullah; Mumtaz, Wajid; Subhani, Ahmad Rauf; Saad, Mohamad Naufal Mohamad; Malik, Aamir Saeed

    2017-01-01

    Feature extraction is an important step in the process of electroencephalogram (EEG) signal classification. The authors propose a “pattern recognition” approach that discriminates EEG signals recorded during different cognitive conditions. Wavelet-based features, such as multi-resolution decompositions into detailed and approximate coefficients as well as relative wavelet energy, were computed. Extracted relative wavelet energy features were normalized to zero mean and unit variance and then optimized using Fisher's discriminant ratio (FDR) and principal component analysis (PCA). A high-density (128-channel) EEG dataset validated the proposed method by identifying two classifications: (1) EEG signals recorded during complex cognitive tasks using Raven's Advanced Progressive Matrices (RAPM) test; (2) EEG signals recorded during a baseline task (eyes open). Classifiers such as K-nearest neighbors (KNN), Support Vector Machine (SVM), Multi-layer Perceptron (MLP), and Naïve Bayes (NB) were then employed. Outcomes yielded 99.11% accuracy via SVM classifier for coefficient approximations (A5) of low frequencies ranging from 0 to 3.90 Hz. For detailed coefficients (D5) deriving from the sub-band range (3.90–7.81 Hz), accuracy rates were 98.57 and 98.39% for SVM and KNN, respectively. Accuracy rates for MLP and NB classifiers were comparable at 97.11–89.63% and 91.60–81.07% for A5 and D5 coefficients, respectively. In addition, the proposed approach was also applied to a public dataset for classification of two cognitive tasks and achieved comparable classification results, i.e., 93.33% accuracy with KNN. The proposed scheme yielded significantly higher classification performances using machine learning classifiers compared to extant quantitative feature extraction. These results suggest the proposed feature extraction method reliably classifies EEG signals recorded during cognitive tasks with a higher degree of accuracy. 
PMID:29209190

  4. Analysis of simulated angiographic procedures. Part 2: extracting efficiency data from audio and video recordings.

    PubMed

    Duncan, James R; Kline, Benjamin; Glaiberman, Craig B

    2007-04-01

    To create and test methods of extracting efficiency data from recordings of simulated renal stent procedures. Task analysis was performed and used to design a standardized testing protocol. Five experienced angiographers then performed 16 renal stent simulations using the Simbionix AngioMentor angiographic simulator. Audio and video recordings of these simulations were captured from multiple vantage points. The recordings were synchronized and compiled. A series of efficiency metrics (procedure time, contrast volume, and tool use) were then extracted from the recordings. The intraobserver and interobserver variability of these individual metrics was also assessed. The metrics were converted to costs and aggregated to determine the fixed and variable costs of a procedure segment or the entire procedure. Task analysis and pilot testing led to a standardized testing protocol suitable for performance assessment. Task analysis also identified seven checkpoints that divided the renal stent simulations into six segments. Efficiency metrics for these different segments were extracted from the recordings and showed excellent intra- and interobserver correlations. Analysis of the individual and aggregated efficiency metrics demonstrated large differences between segments as well as between different angiographers. These differences persisted when efficiency was expressed as either total or variable costs. Task analysis facilitated both protocol development and data analysis. Efficiency metrics were readily extracted from recordings of simulated procedures. Aggregating the metrics and dividing the procedure into segments revealed potential insights that could be easily overlooked because the simulator currently does not attempt to aggregate the metrics and only provides data derived from the entire procedure. The data indicate that analysis of simulated angiographic procedures will be a powerful method of assessing performance in interventional radiology.

  5. Retrieval of radiology reports citing critical findings with disease-specific customization.

    PubMed

    Lacson, Ronilda; Sugarbaker, Nathanael; Prevedello, Luciano M; Ivan, Ip; Mar, Wendy; Andriole, Katherine P; Khorasani, Ramin

    2012-01-01

    Communication of critical results from diagnostic procedures between caregivers is a Joint Commission national patient safety goal. Evaluating critical result communication often requires manual analysis of voluminous data, especially when reviewing unstructured textual results of radiologic findings. Information retrieval (IR) tools can facilitate this process by enabling automated retrieval of radiology reports that cite critical imaging findings. However, IR tools that have been developed for one disease or imaging modality often need substantial reconfiguration before they can be utilized for another disease entity. This paper 1) describes the process of customizing two Natural Language Processing (NLP) and Information Retrieval/Extraction applications - an open-source toolkit, A Nearly New Information Extraction system (ANNIE); and an application developed in-house, Information for Searching Content with an Ontology-Utilizing Toolkit (iSCOUT) - to illustrate the varying levels of customization required for different disease entities; and 2) evaluates each application's performance in identifying and retrieving radiology reports citing critical imaging findings for three distinct diseases: pulmonary nodule, pneumothorax, and pulmonary embolus. Both applications can be utilized for retrieval. iSCOUT and ANNIE had precision values between 0.90 and 0.98 and recall values between 0.79 and 0.94. ANNIE had consistently higher precision but required more customization. Understanding the customizations involved in utilizing NLP applications for various diseases will enable users to select the most suitable tool for specific tasks.
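
    The precision and recall figures quoted above follow the usual retrieval definitions: precision is the share of retrieved reports that are truly relevant, and recall is the share of relevant reports that were retrieved. A minimal sketch, with a hypothetical function name and made-up report identifiers (nothing here comes from iSCOUT or ANNIE):

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one retrieval run.

    retrieved : identifiers of the reports the tool returned
    relevant  : identifiers of the reports that truly cite the finding
    """
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical report identifiers, invented for the example.
print(precision_recall({"r1", "r2", "r3", "r4"}, {"r1", "r2", "r3", "r5", "r6"}))
```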

  6. Retrieval of Radiology Reports Citing Critical Findings with Disease-Specific Customization

    PubMed Central

    Lacson, Ronilda; Sugarbaker, Nathanael; Prevedello, Luciano M; Ivan, IP; Mar, Wendy; Andriole, Katherine P; Khorasani, Ramin

    2012-01-01

    Background: Communication of critical results from diagnostic procedures between caregivers is a Joint Commission national patient safety goal. Evaluating critical result communication often requires manual analysis of voluminous data, especially when reviewing unstructured textual results of radiologic findings. Information retrieval (IR) tools can facilitate this process by enabling automated retrieval of radiology reports that cite critical imaging findings. However, IR tools that have been developed for one disease or imaging modality often need substantial reconfiguration before they can be utilized for another disease entity. Purpose: This paper 1) describes the process of customizing two Natural Language Processing (NLP) and Information Retrieval/Extraction applications – an open-source toolkit, A Nearly New Information Extraction system (ANNIE); and an application developed in-house, Information for Searching Content with an Ontology-Utilizing Toolkit (iSCOUT) – to illustrate the varying levels of customization required for different disease entities; and 2) evaluates each application’s performance in identifying and retrieving radiology reports citing critical imaging findings for three distinct diseases: pulmonary nodule, pneumothorax, and pulmonary embolus. Results: Both applications can be utilized for retrieval. iSCOUT and ANNIE had precision values between 0.90 and 0.98 and recall values between 0.79 and 0.94. ANNIE had consistently higher precision but required more customization. Conclusion: Understanding the customizations involved in utilizing NLP applications for various diseases will enable users to select the most suitable tool for specific tasks. PMID:22934127

  7. What Drives Bird Vision? Bill Control and Predator Detection Overshadow Flight.

    PubMed

    Martin, Graham R

    2017-01-01

    Although flight is regarded as a key behavior of birds this review argues that the perceptual demands for its control are met within constraints set by the perceptual demands of two other key tasks: the control of bill (or feet) position, and the detection of food items/predators. Control of bill position, or of the feet when used in foraging, and timing of their arrival at a target, are based upon information derived from the optic flow-field in the binocular region that encompasses the bill. Flow-fields use information extracted from close to the bird using vision of relatively low spatial resolution. The detection of food items and predators is based upon information detected at a greater distance and depends upon regions in the retina with relatively high spatial resolution. The tasks of detecting predators and of placing the bill (or feet) accurately, make contradictory demands upon vision and these have resulted in trade-offs in the form of visual fields and in the topography of retinal regions in which spatial resolution is enhanced, indicated by foveas, areas, and high ganglion cell densities. The informational function of binocular vision in birds does not lie in binocularity per se (i.e., two eyes receiving slightly different information simultaneously about the same objects) but in the contralateral projection of the visual field of each eye. This ensures that each eye receives information from a symmetrically expanding optic flow-field centered close to the direction of the bill, and from this the crucial information of direction of travel and time-to-contact can be extracted, almost instantaneously. Interspecific comparisons of visual fields between closely related species have shown that small differences in foraging techniques can give rise to different perceptual challenges and these have resulted in differences in visual fields even within the same genus. 
    This suggests that vision is subject to continuing and relatively rapid natural selection based upon individual differences in the structure of the optical system, retinal topography, and eye position in the skull. From a sensory ecology perspective a bird is best characterized as "a bill guided by an eye" and that control of flight is achieved within constraints on visual capacity dictated primarily by the demands of foraging and bill control.

  8. What Drives Bird Vision? Bill Control and Predator Detection Overshadow Flight

    PubMed Central

    Martin, Graham R.

    2017-01-01

    Although flight is regarded as a key behavior of birds this review argues that the perceptual demands for its control are met within constraints set by the perceptual demands of two other key tasks: the control of bill (or feet) position, and the detection of food items/predators. Control of bill position, or of the feet when used in foraging, and timing of their arrival at a target, are based upon information derived from the optic flow-field in the binocular region that encompasses the bill. Flow-fields use information extracted from close to the bird using vision of relatively low spatial resolution. The detection of food items and predators is based upon information detected at a greater distance and depends upon regions in the retina with relatively high spatial resolution. The tasks of detecting predators and of placing the bill (or feet) accurately, make contradictory demands upon vision and these have resulted in trade-offs in the form of visual fields and in the topography of retinal regions in which spatial resolution is enhanced, indicated by foveas, areas, and high ganglion cell densities. The informational function of binocular vision in birds does not lie in binocularity per se (i.e., two eyes receiving slightly different information simultaneously about the same objects) but in the contralateral projection of the visual field of each eye. This ensures that each eye receives information from a symmetrically expanding optic flow-field centered close to the direction of the bill, and from this the crucial information of direction of travel and time-to-contact can be extracted, almost instantaneously. Interspecific comparisons of visual fields between closely related species have shown that small differences in foraging techniques can give rise to different perceptual challenges and these have resulted in differences in visual fields even within the same genus. 
    This suggests that vision is subject to continuing and relatively rapid natural selection based upon individual differences in the structure of the optical system, retinal topography, and eye position in the skull. From a sensory ecology perspective a bird is best characterized as “a bill guided by an eye” and that control of flight is achieved within constraints on visual capacity dictated primarily by the demands of foraging and bill control. PMID:29163020

  9. Combined non-parametric and parametric approach for identification of time-variant systems

    NASA Astrophysics Data System (ADS)

    Dziedziech, Kajetan; Czop, Piotr; Staszewski, Wieslaw J.; Uhl, Tadeusz

    2018-03-01

    Identification of systems, structures and machines with variable physical parameters is a challenging task especially when time-varying vibration modes are involved. The paper proposes a new combined, two-step - i.e. non-parametric and parametric - modelling approach in order to determine time-varying vibration modes based on input-output measurements. Single-degree-of-freedom (SDOF) vibration modes from multi-degree-of-freedom (MDOF) non-parametric system representation are extracted in the first step with the use of time-frequency wavelet-based filters. The second step involves time-varying parametric representation of extracted modes with the use of recursive linear autoregressive-moving-average with exogenous inputs (ARMAX) models. The combined approach is demonstrated using system identification analysis based on the experimental mass-varying MDOF frame-like structure subjected to random excitation. The results show that the proposed combined method correctly captures the dynamics of the analysed structure, using minimum a priori information on the model.

  10. Incidental Learning of S-R Contingencies in the Masked Prime Task

    ERIC Educational Resources Information Center

    Schlaghecken, Friederike; Blagrove, Elisabeth; Maylor, Elizabeth A.

    2007-01-01

    Subliminal motor priming effects in the masked prime paradigm can only be obtained when primes are part of the task set. In 2 experiments, the authors investigated whether the relevant task set feature needs to be explicitly instructed or could be extracted automatically in an incidental learning paradigm. Primes and targets were symmetrical…

  11. Classifying human operator functional state based on electrophysiological and performance measures and fuzzy clustering method.

    PubMed

    Zhang, Jian-Hua; Peng, Xiao-Di; Liu, Hua; Raisch, Jörg; Wang, Ru-Bin

    2013-12-01

    The human operator's ability to perform their tasks can fluctuate over time. Because the cognitive demands of the task can also vary it is possible that the capabilities of the operator are not sufficient to satisfy the job demands. This can lead to serious errors when the operator is overwhelmed by the task demands. Psychophysiological measures, such as heart rate and brain activity, can be used to monitor operator cognitive workload. In this paper, the most influential psychophysiological measures are extracted to characterize Operator Functional State (OFS) in automated tasks under a complex form of human-automation interaction. The fuzzy c-mean (FCM) algorithm is used and tested for its OFS classification performance. The results obtained have shown the feasibility and effectiveness of the FCM algorithm as well as the utility of the selected input features for OFS classification. Besides being able to cope with nonlinearity and fuzzy uncertainty in the psychophysiological data it can provide information about the relative importance of the input features as well as the confidence estimate of the classification results. The OFS pattern classification method developed can be incorporated into an adaptive aiding system in order to enhance the overall performance of a large class of safety-critical human-machine cooperative systems.
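
    The fuzzy c-means step named in the abstract can be illustrated with a minimal sketch. It assumes the textbook FCM update rules with fuzzifier m = 2 and Euclidean distance; the function name `fuzzy_c_means`, the toy 2-D points, and all parameter defaults are hypothetical and are not taken from the paper, whose actual features and settings are not given here.

```python
import math
import random


def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Textbook fuzzy c-means. Returns (centers, memberships).

    points: list of equal-length coordinate tuples; c: number of clusters;
    m: fuzzifier (> 1). All defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(iters):
        # Centers are membership-weighted means (weights u^m).
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            tot = sum(w)
            centers.append(
                [sum(w[i] * points[i][k] for i in range(n)) / tot for k in range(dim)]
            )
        # Membership update from relative distances to every center.
        shift = 0.0
        for i in range(n):
            d = [max(math.dist(points[i], centers[j]), 1e-12) for j in range(c)]
            new = [
                1.0 / sum((d[j] / d[k]) ** (2.0 / (m - 1.0)) for k in range(c))
                for j in range(c)
            ]
            shift = max(shift, max(abs(a - b) for a, b in zip(new, u[i])))
            u[i] = new
        if shift < tol:
            break
    return centers, u


# Toy data: two well-separated groups (made up for illustration).
toy = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
centers, memberships = fuzzy_c_means(toy, c=2)
labels = [row.index(max(row)) for row in memberships]
```

    Each membership row sums to 1, so the largest entry can be read as a hard label and its size as a rough confidence, which loosely mirrors the confidence estimate the abstract mentions.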

  12. Fully Automatic Speech-Based Analysis of the Semantic Verbal Fluency Task.

    PubMed

    König, Alexandra; Linz, Nicklas; Tröger, Johannes; Wolters, Maria; Alexandersson, Jan; Robert, Phillipe

    2018-06-08

    Semantic verbal fluency (SVF) tests are routinely used in screening for mild cognitive impairment (MCI). In this task, participants name as many items as possible of a semantic category under a time constraint. Clinicians measure task performance manually by summing the number of correct words and errors. More fine-grained variables add valuable information to clinical assessment, but are time-consuming. Therefore, the aim of this study is to investigate whether automatic analysis of the SVF could provide these as accurate as manual and thus, support qualitative screening of neurocognitive impairment. SVF data were collected from 95 older people with MCI (n = 47), Alzheimer's or related dementias (ADRD; n = 24), and healthy controls (HC; n = 24). All data were annotated manually and automatically with clusters and switches. The obtained metrics were validated using a classifier to distinguish HC, MCI, and ADRD. Automatically extracted clusters and switches were highly correlated (r = 0.9) with manually established values, and performed as well on the classification task separating HC from persons with ADRD (area under curve [AUC] = 0.939) and MCI (AUC = 0.758). The results show that it is possible to automate fine-grained analyses of SVF data for the assessment of cognitive decline. © 2018 S. Karger AG, Basel.

  13. Non-linguistic learning in aphasia: Effects of training method and stimulus characteristics

    PubMed Central

    Vallila-Rohter, Sofia; Kiran, Swathi

    2013-01-01

    Purpose The purpose of the current study was to explore non-linguistic learning ability in patients with aphasia, examining the impact of stimulus typicality and feedback on success with learning. Method Eighteen patients with aphasia and eight healthy controls participated in this study. All participants completed four computerized, non-linguistic category-learning tasks. We probed learning ability under two methods of instruction: feedback-based (FB) and paired-associate (PA). We also examined the impact of task complexity on learning ability, comparing two stimulus conditions: typical (Typ) and atypical (Atyp). Performance was compared between groups and across conditions. Results Results demonstrated that healthy controls were able to successfully learn categories under all conditions. For our patients with aphasia, two patterns of performance arose. One subgroup of patients was able to maintain learning across task manipulations and conditions. The other subgroup of patients demonstrated a sensitivity to task complexity, learning successfully only in the typical training conditions. Conclusions Results support the hypothesis that impairments of general learning are present in aphasia. Some patients demonstrated the ability to extract category information under complex training conditions, while others learned only under conditions that were simplified and emphasized salient category features. Overall, the typical training condition facilitated learning for all participants. Findings have implications for therapy, which are discussed. PMID:23695914

  14. Real-Time Lane Region Detection Using a Combination of Geometrical and Image Features

    PubMed Central

    Cáceres Hernández, Danilo; Kurnianggoro, Laksono; Filonenko, Alexander; Jo, Kang Hyun

    2016-01-01

    Over the past few decades, pavement markings have played a key role in intelligent vehicle applications such as guidance, navigation, and control. However, there are still serious issues facing the problem of lane marking detection. For example, problems include excessive processing time and false detection due to similarities in color and edges between traffic signs (channeling lines, stop lines, crosswalk, arrows, etc.). This paper proposes a strategy to extract the lane marking information taking into consideration its features such as color, edge, and width, as well as the vehicle speed. Firstly, defining the region of interest is a critical task to achieve real-time performance. In this sense, the region of interest is dependent on vehicle speed. Secondly, the lane markings are detected by using a hybrid color-edge feature method along with a probabilistic method, based on distance-color dependence and a hierarchical fitting model. Thirdly, the following lane marking information is extracted: the number of lane markings to both sides of the vehicle, the respective fitting model, and the centroid information of the lane. Using these parameters, the region is computed by using a road geometric model. To evaluate the proposed method, a set of consecutive frames was used in order to validate the performance. PMID:27869657

  15. Training the max-margin sequence model with the relaxed slack variables.

    PubMed

    Niu, Lingfeng; Wu, Jianmin; Shi, Yong

    2012-09-01

    Sequence models are widely used in many applications such as natural language processing, information extraction and optical character recognition, etc. We propose a new approach to train the max-margin based sequence model by relaxing the slack variables in this paper. With the canonical feature mapping definition, the relaxed problem is solved by training a multiclass Support Vector Machine (SVM). Compared with the state-of-the-art solutions for the sequence learning, the new method has the following advantages: firstly, the sequence training problem is transformed into a multiclassification problem, which is more widely studied and already has quite a few off-the-shelf training packages; secondly, this new approach reduces the complexity of training significantly and achieves comparable prediction performance compared with the existing sequence models; thirdly, when the size of training data is limited, by assigning different slack variables to different microlabel pairs, the new method can use the discriminative information more frugally and produces more reliable model; last but not least, by employing kernels in the intermediate multiclass SVM, nonlinear feature space can be easily explored. Experimental results on the task of named entity recognition, information extraction and handwritten letter recognition with the public datasets illustrate the efficiency and effectiveness of our method. Copyright © 2012 Elsevier Ltd. All rights reserved.

  16. Real-Time Digital Signal Processing Based on FPGAs for Electronic Skin Implementation

    PubMed Central

    Ibrahim, Ali; Gastaldo, Paolo; Chible, Hussein; Valle, Maurizio

    2017-01-01

    Enabling touch-sensing capability would help appliances understand interaction behaviors with their surroundings. Many recent studies are focusing on the development of electronic skin because of its necessity in various application domains, namely autonomous artificial intelligence (e.g., robots), biomedical instrumentation, and replacement prosthetic devices. An essential task of the electronic skin system is to locally process the tactile data and send structured information either to mimic human skin or to respond to the application demands. The electronic skin must be fabricated together with an embedded electronic system which has the role of acquiring the tactile data, processing, and extracting structured information. On the other hand, processing tactile data requires efficient methods to extract meaningful information from raw sensor data. Machine learning represents an effective method for data analysis in many domains: it has recently demonstrated its effectiveness in processing tactile sensor data. In this framework, this paper presents the implementation of digital signal processing based on FPGAs for tactile data processing. It provides the implementation of a tensorial kernel function for a machine learning approach. Implementation results are assessed by highlighting the FPGA resource utilization and power consumption. Results demonstrate the feasibility of the proposed implementation when real-time classification of input touch modalities are targeted. PMID:28287448

  17. Real-Time Digital Signal Processing Based on FPGAs for Electronic Skin Implementation.

    PubMed

    Ibrahim, Ali; Gastaldo, Paolo; Chible, Hussein; Valle, Maurizio

    2017-03-10

    Enabling touch-sensing capability would help appliances understand interaction behaviors with their surroundings. Many recent studies are focusing on the development of electronic skin because of its necessity in various application domains, namely autonomous artificial intelligence (e.g., robots), biomedical instrumentation, and replacement prosthetic devices. An essential task of the electronic skin system is to locally process the tactile data and send structured information either to mimic human skin or to respond to the application demands. The electronic skin must be fabricated together with an embedded electronic system which has the role of acquiring the tactile data, processing, and extracting structured information. On the other hand, processing tactile data requires efficient methods to extract meaningful information from raw sensor data. Machine learning represents an effective method for data analysis in many domains: it has recently demonstrated its effectiveness in processing tactile sensor data. In this framework, this paper presents the implementation of digital signal processing based on FPGAs for tactile data processing. It provides the implementation of a tensorial kernel function for a machine learning approach. Implementation results are assessed by highlighting the FPGA resource utilization and power consumption. Results demonstrate the feasibility of the proposed implementation when real-time classification of input touch modalities are targeted.

  18. Solvent Extraction of Copper: An Extractive Metallurgy Exercise for Undergraduate Teaching Laboratories

    ERIC Educational Resources Information Center

    Smellie, Iain A.; Forgan, Ross S.; Brodie, Claire; Gavine, Jack S.; Harris, Leanne; Houston, Daniel; Hoyland, Andrew D.; McCaughan, Rory P.; Miller, Andrew J.; Wilson, Liam; Woodhall, Fiona M.

    2016-01-01

    A multidisciplinary experiment for advanced undergraduate students has been developed in the context of extractive metallurgy. The experiment serves as a model of an important modern industrial process that combines aspects of organic/inorganic synthesis and analysis. Students are tasked to prepare a salicylaldoxime ligand and samples of the…

  19. The Temporal Dynamics of Regularity Extraction in Non-Human Primates

    ERIC Educational Resources Information Center

    Minier, Laure; Fagot, Joël; Rey, Arnaud

    2016-01-01

    Extracting the regularities of our environment is one of our core cognitive abilities. To study the fine-grained dynamics of the extraction of embedded regularities, a method combining the advantages of the artificial language paradigm (Saffran, Aslin, & Newport, [Saffran, J. R., 1996]) and the serial response time task (Nissen & Bullemer,…

  20. Adverse Event extraction from Structured Product Labels using the Event-based Text-mining of Health Electronic Records (ETHER) system.

    PubMed

    Pandey, Abhishek; Kreimeyer, Kory; Foster, Matthew; Botsis, Taxiarchis; Dang, Oanh; Ly, Thomas; Wang, Wei; Forshee, Richard

    2018-01-01

    Structured Product Labels follow an XML-based document markup standard approved by the Health Level Seven organization and adopted by the US Food and Drug Administration as a mechanism for exchanging medical products information. Their current organization makes their secondary use rather challenging. We used the Side Effect Resource database and DailyMed to generate a comparison dataset of 1159 Structured Product Labels. We processed the Adverse Reaction section of these Structured Product Labels with the Event-based Text-mining of Health Electronic Records system and evaluated its ability to extract and encode Adverse Event terms to Medical Dictionary for Regulatory Activities Preferred Terms. A small sample of 100 labels was then selected for further analysis. Of the 100 labels, Event-based Text-mining of Health Electronic Records achieved a precision and recall of 81 percent and 92 percent, respectively. This study demonstrated Event-based Text-mining of Health Electronic Record's ability to extract and encode Adverse Event terms from Structured Product Labels which may potentially support multiple pharmacoepidemiological tasks.
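
    For readers unfamiliar with the two figures reported above, precision and recall can be computed as in the following minimal sketch. The function name `precision_recall` and the example terms are hypothetical; they are not drawn from the study's data, and the study's own matching rules are not described here.

```python
def precision_recall(extracted, reference):
    """Precision and recall of an extracted term set against a reference set."""
    extracted, reference = set(extracted), set(reference)
    true_positives = len(extracted & reference)
    # Precision: share of extracted terms that are correct.
    precision = true_positives / len(extracted) if extracted else 0.0
    # Recall: share of reference terms that were found.
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Made-up example: 3 of 4 extracted terms are correct, 3 of 5 reference terms found.
system_terms = {"nausea", "headache", "rash", "dizziness"}
gold_terms = {"nausea", "headache", "rash", "fatigue", "vomiting"}
p, r = precision_recall(system_terms, gold_terms)
```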

  1. Brillouin Frequency Shift of Fiber Distributed Sensors Extracted from Noisy Signals by Quadratic Fitting.

    PubMed

    Zheng, Hanrong; Fang, Zujie; Wang, Zhaoyong; Lu, Bin; Cao, Yulong; Ye, Qing; Qu, Ronghui; Cai, Haiwen

    2018-01-31

    It is a basic task in Brillouin distributed fiber sensors to extract the peak frequency of the scattering spectrum, since the peak frequency shift gives information on the fiber temperature and strain changes. Because of high-level noise, quadratic fitting is often used in the data processing. Formulas of the dependence of the minimum detectable Brillouin frequency shift (BFS) on the signal-to-noise ratio (SNR) and frequency step have been presented in publications, but in different expressions. A detailed deduction of new formulas of BFS variance and its average is given in this paper, showing especially their dependences on the data range used in fitting, including its length and its center respective to the real spectral peak. The theoretical analyses are experimentally verified. It is shown that the center of the data range has a direct impact on the accuracy of the extracted BFS. We propose and demonstrate an iterative fitting method to mitigate such effects and improve the accuracy of BFS measurement. The different expressions of BFS variances presented in previous papers are explained and discussed.
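
    The basic quadratic-fitting step described above can be sketched as follows: fit a parabola to the gain samples by least squares and take its vertex as the peak frequency. This is a generic illustration only; the function names `det3` and `quadratic_peak`, the frequency grid, and the synthetic noiseless spectrum are assumptions, and the paper's variance formulas and iterative fitting method are not reproduced.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def quadratic_peak(freqs, gains):
    """Least-squares fit gain = a*u^2 + b*u + c (u = freq - mean); return the vertex."""
    n = len(freqs)
    if n < 3:
        raise ValueError("need at least three samples")
    f0 = sum(freqs) / n
    u = [f - f0 for f in freqs]  # centring improves numerical conditioning
    s1 = sum(u)
    s2 = sum(x ** 2 for x in u)
    s3 = sum(x ** 3 for x in u)
    s4 = sum(x ** 4 for x in u)
    t0 = sum(gains)
    t1 = sum(x * y for x, y in zip(u, gains))
    t2 = sum(x * x * y for x, y in zip(u, gains))
    # Normal equations, solved with Cramer's rule.
    d = det3([[s4, s3, s2], [s3, s2, s1], [s2, s1, n]])
    a = det3([[t2, s3, s2], [t1, s2, s1], [t0, s1, n]]) / d
    b = det3([[s4, t2, s2], [s3, t1, s1], [s2, t0, n]]) / d
    if a >= 0:
        raise ValueError("fitted parabola has no maximum")
    return f0 - b / (2 * a)


# Synthetic noiseless spectrum in MHz with its peak at 10853 (made-up values).
freqs = [10800.0 + 5.0 * i for i in range(21)]
gains = [1.0 - 1e-4 * (f - 10853.0) ** 2 for f in freqs]
peak = quadratic_peak(freqs, gains)
```

    With real, noisy and non-parabolic spectra the estimate depends on which samples are passed in, which is the data-range effect the abstract analyses.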

  2. Novel texture-based descriptors for tool wear condition monitoring

    NASA Astrophysics Data System (ADS)

    Antić, Aco; Popović, Branislav; Krstanović, Lidija; Obradović, Ratko; Milošević, Mijodrag

    2018-01-01

    All state-of-the-art tool condition monitoring systems (TCM) in the tool wear recognition task, especially those that use vibration sensors, heavily depend on the choice of descriptors containing information about the tool wear state which are extracted from the particular sensor signals. All other post-processing techniques do not manage to increase the recognition precision if those descriptors are not discriminative enough. In this work, we propose a tool wear monitoring strategy which relies on the novel texture based descriptors. We consider the module of the Short Term Discrete Fourier Transform (STDFT) spectra obtained from the particular vibration sensors signal utterance as the 2D textured image. This is done by identifying the time scale of STDFT as the first dimension, and the frequency scale as the second dimension of the particular textured image. The obtained textured image is then divided into particular 2D texture patches, covering a part of the frequency range of interest. After applying the appropriate filter bank, 2D textons are extracted for each predefined frequency band. By averaging in time, we extract from the textons for each band of interest the information regarding the Probability Density Function (PDF) in the form of lower order moments, thus obtaining robust tool wear state descriptors. We validate the proposed features by the experiments conducted on the real TCM system, obtaining the high recognition accuracy.
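
    The first step of the pipeline above, treating the modulus of the short-term DFT as a 2D image with time as the first dimension and frequency as the second, might look like the sketch below. It assumes a rectangular window and a naive DFT for brevity; the function name `stdft_magnitude`, the frame length, and the hop size are hypothetical, and the filter-bank, texton, and moment stages are omitted.

```python
import cmath
import math


def stdft_magnitude(signal, frame_len, hop):
    """Return a list of rows (time frames), each a list of DFT magnitudes per bin."""
    image = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        row = []
        # Keep only the non-negative frequency bins of a real-valued signal.
        for k in range(frame_len // 2 + 1):
            acc = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n, x in enumerate(frame)
            )
            row.append(abs(acc))
        image.append(row)
    return image


# Toy vibration signal: a sine that completes 4 cycles per 16-sample frame.
toy_signal = [math.sin(2 * math.pi * 4 * n / 16) for n in range(64)]
image = stdft_magnitude(toy_signal, frame_len=16, hop=8)
```

    A frequency band of interest then corresponds to a range of columns in `image`, which is where the patch extraction described in the abstract would start.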

  3. Structural health monitoring feature design by genetic programming

    NASA Astrophysics Data System (ADS)

    Harvey, Dustin Y.; Todd, Michael D.

    2014-09-01

    Structural health monitoring (SHM) systems provide real-time damage and performance information for civil, aerospace, and other high-capital or life-safety critical structures. Conventional data processing involves pre-processing and extraction of low-dimensional features from in situ time series measurements. The features are then input to a statistical pattern recognition algorithm to perform the relevant classification or regression task necessary to facilitate decisions by the SHM system. Traditional design of signal processing and feature extraction algorithms can be an expensive and time-consuming process requiring extensive system knowledge and domain expertise. Genetic programming, a heuristic program search method from evolutionary computation, was recently adapted by the authors to perform automated, data-driven design of signal processing and feature extraction algorithms for statistical pattern recognition applications. The proposed method, called Autofead, is particularly suitable to handle the challenges inherent in algorithm design for SHM problems where the manifestation of damage in structural response measurements is often unclear or unknown. Autofead mines a training database of response measurements to discover information-rich features specific to the problem at hand. This study provides experimental validation on three SHM applications including ultrasonic damage detection, bearing damage classification for rotating machinery, and vibration-based structural health monitoring. Performance comparisons with common feature choices for each problem area are provided demonstrating the versatility of Autofead to produce significant algorithm improvements on a wide range of problems.

  4. Classification of antecedents towards safety use of health information technology: A systematic review.

    PubMed

    Salahuddin, Lizawati; Ismail, Zuraini

    2015-11-01

    This paper provides a systematic review of safety use of health information technology (IT). The first objective is to identify the antecedents towards safety use of health IT by conducting systematic literature review (SLR). The second objective is to classify the identified antecedents based on the work system in Systems Engineering Initiative for Patient Safety (SEIPS) model and an extension of DeLone and McLean (D&M) information system (IS) success model. A systematic literature review (SLR) was conducted from peer-reviewed scholarly publications between January 2000 and July 2014. SLR was carried out and reported based on the preferred reporting items for systematic reviews and meta-analyses (PRISMA) statement. The related articles were identified by searching the articles published in Science Direct, Medline, EMBASE, and CINAHL databases. Data extracted from the resultant studies included are to be analysed based on the work system in Systems Engineering Initiative for Patient Safety (SEIPS) model, and also from the extended DeLone and McLean (D&M) information system (IS) success model. 55 articles delineated to be antecedents that influenced the safety use of health IT were included for review. Antecedents were identified and then classified into five key categories. The categories are (1) person, (2) technology, (3) tasks, (4) organization, and (5) environment. Specifically, person is attributed by competence while technology is associated to system quality, information quality, and service quality. Tasks are attributed by task-related stressor. Organisation is related to training, organisation resources, and teamwork. Lastly, environment is attributed by physical layout, and noise. This review provides evidence that the antecedents for safety use of health IT originated from both social and technical aspects. However, inappropriate health IT usage potentially increases the incidence of errors and produces new safety risks. 
    The review cautions future implementation and adoption of health IT to carefully consider the complex interactions between social and technical elements propound in healthcare settings. Copyright © 2015. Published by Elsevier Ireland Ltd.

  5. An Evaluation of Detect and Avoid (DAA) Displays for Unmanned Aircraft Systems: The Effect of Information Level and Display Location on Pilot Performance

    NASA Technical Reports Server (NTRS)

    Fern, Lisa; Rorie, R. Conrad; Pack, Jessica S.; Shively, R. Jay; Draper, Mark H.

    2015-01-01

    A consortium of government, industry and academia is currently working to establish minimum operational performance standards for Detect and Avoid (DAA) and Control and Communications (C2) systems in order to enable broader integration of Unmanned Aircraft Systems (UAS) into the National Airspace System (NAS). One subset of these performance standards will need to address the DAA display requirements that support an acceptable level of pilot performance. From a pilot's perspective, the DAA task is the maintenance of self separation and collision avoidance from other aircraft, utilizing the available information and controls within the Ground Control Station (GCS), including the DAA display. The pilot-in-the-loop DAA task requires the pilot to carry out three major functions: 1) detect a potential threat, 2) determine an appropriate resolution maneuver, and 3) execute that resolution maneuver via the GCS control and navigation interface(s). The purpose of the present study was to examine two main questions with respect to DAA display considerations that could impact pilots' ability to maintain well clear from other aircraft. First, what is the effect of a minimum (or basic) information display compared to an advanced information display on pilot performance? Second, what is the effect of display location on UAS pilot performance? Two levels of information level (basic, advanced) were compared across two levels of display location (standalone, integrated), for a total of four displays. The authors propose an eight-stage pilot-DAA interaction timeline from which several pilot response time metrics can be extracted. These metrics were compared across the four display conditions. The results indicate that the advanced displays had faster overall response times compared to the basic displays, however, there were no significant differences between the standalone and integrated displays. 
    Implications of the findings on understanding pilot performance on the DAA task, the development of DAA display performance standards, as well as the need for future research are discussed.

  6. Effects of Panax ginseng, consumed with and without glucose, on blood glucose levels and cognitive performance during sustained 'mentally demanding' tasks.

    PubMed

    Reay, Jonathon L; Kennedy, David O; Scholey, Andrew B

    2006-11-01

    Single doses of the traditional herbal treatment Panax ginseng have recently been shown to lower blood glucose levels and elicit cognitive improvements in healthy, overnight-fasted volunteers. The specific mechanisms responsible for these effects are not known. However, cognitive improvements may be related to the glycaemic properties of Panax ginseng. Using a double-blind, placebo-controlled, balanced-crossover design, 27 healthy young adults completed a 10 minute "cognitive demand" test battery at baseline. They then consumed capsules containing either ginseng (extract G115) or a placebo and 30 minutes later a drink containing glucose or placebo. A further 30 minutes later (i.e. 60 minutes post-baseline/capsules) they completed the "cognitive demand" battery six times in immediate succession. Depending on the condition to which the participant was allocated on that particular day, the combination of capsules/drink treatments corresponded to a dose of: 0 mg G115/0 mg glucose (placebo); 200 mg G115/0 mg glucose (ginseng); 0 mg G115/25 g glucose (glucose) or 200 mg G115/25 g glucose (ginseng/glucose combination). The 10 minute "cognitive demand" battery comprised a Serial Threes subtraction task (2 min); a Serial Sevens subtraction task (2 min); a Rapid Visual Information Processing task (5 min); and a "mental fatigue" visual analogue scale. Blood glucose levels were measured prior to the day's treatment, and before and after the post-dose completions of the battery. The results showed that both Panax ginseng and glucose enhanced performance of a mental arithmetic task and ameliorated the increase in subjective feelings of mental fatigue experienced by participants during the later stages of the sustained, cognitively demanding task performance. Accuracy of performing the Rapid Visual Information Processing task (RVIP) was also improved following the glucose load.
    There was no evidence of a synergistic relationship between Panax ginseng and exogenous glucose ingestion on any cognitive outcome measure. Panax ginseng caused a reduction in blood glucose levels 1 hour following consumption when ingested without glucose. These results confirm that Panax ginseng may possess glucoregulatory properties and can enhance cognitive performance.

  7. Eyes Matched to the Prize: The State of Matched Filters in Insect Visual Circuits.

    PubMed

    Kohn, Jessica R; Heath, Sarah L; Behnia, Rudy

    2018-01-01

    Confronted with an ever-changing visual landscape, animals must be able to detect relevant stimuli and translate this information into behavioral output. A visual scene contains an abundance of information: to interpret the entirety of it would be uneconomical. To optimally perform this task, neural mechanisms exist to enhance the detection of important features of the sensory environment while simultaneously filtering out irrelevant information. This can be accomplished by using a circuit design that implements specific "matched filters" that are tuned to relevant stimuli. Following this rule, the well-characterized visual systems of insects have evolved to streamline feature extraction on both a structural and functional level. Here, we review examples of specialized visual microcircuits for vital behaviors across insect species, including feature detection, escape, and estimation of self-motion. Additionally, we discuss how these microcircuits are modulated to weigh relevant input with respect to different internal and behavioral states.

  8. Deterministic realization of collective measurements via photonic quantum walks.

    PubMed

    Hou, Zhibo; Tang, Jun-Feng; Shang, Jiangwei; Zhu, Huangjun; Li, Jian; Yuan, Yuan; Wu, Kang-Da; Xiang, Guo-Yong; Li, Chuan-Feng; Guo, Guang-Can

    2018-04-12

    Collective measurements on identically prepared quantum systems can extract more information than local measurements, thereby enhancing information-processing efficiency. Although this nonclassical phenomenon has been known for two decades, it has remained a challenging task to demonstrate the advantage of collective measurements in experiments. Here, we introduce a general recipe for performing deterministic collective measurements on two identically prepared qubits based on quantum walks. Using photonic quantum walks, we realize experimentally an optimized collective measurement with fidelity 0.9946 without post selection. As an application, we achieve the highest tomographic efficiency in qubit state tomography to date. Our work offers an effective recipe for beating the precision limit of local measurements in quantum state tomography and metrology. In addition, our study opens an avenue for harvesting the power of collective measurements in quantum information-processing and for exploring the intriguing physics behind this power.

  9. Behavioral and functional strategies during tool use tasks in bonobos.

    PubMed

    Bardo, Ameline; Borel, Antony; Meunier, Hélène; Guéry, Jean-Pascal; Pouydebat, Emmanuelle

    2016-09-01

    Different primate species have developed extensive capacities for grasping and manipulating objects. However, the manual abilities of primates remain poorly known from a dynamic point of view. The aim of the present study was to quantify the functional and behavioral strategies used by captive bonobos (Pan paniscus) during tool use tasks. The study was conducted on eight captive bonobos which we observed during two tool use tasks: food extraction from a large piece of wood and food recovery from a maze. We focused on grasping postures, in-hand movements, the sequences of grasp postures used that have not been studied in bonobos, and the kind of tools selected. Bonobos used a great variety of grasping postures during both tool use tasks. They were capable of in-hand movement, demonstrated complex sequences of contacts, and showed more dynamic manipulation during the maze task than during the extraction task. They arrived on the location of the task with the tool already modified and used different kinds of tools according to the task. We also observed individual manual strategies. Bonobos were thus able to develop in-hand movements similar to humans and chimpanzees, demonstrated dynamic manipulation, and they responded to task constraints by selecting and modifying tools appropriately, usually before they started the tasks. These results show the necessity to quantify object manipulation in different species to better understand their real manual specificities, which is essential to reconstruct the evolution of primate manual abilities. © 2016 Wiley Periodicals, Inc.

  10. Interactional Features of Repair Negotiation in NS-NNS Interaction on Two Task Types: Information Gap and Personal Information Exchange

    ERIC Educational Resources Information Center

    Kitajima, Ryu

    2013-01-01

    Studies of task-based approaches in second language acquisition claim that controlled, goal-convergent tasks such as information gap tasks surpass open-ended conversations such as personal information exchange tasks for the development of the learner's interlanguage, in that the former promote more repair negotiation. And yet, few studies…

  11. Extracting Drug-Drug Interactions with Word and Character-Level Recurrent Neural Networks

    PubMed Central

    Kavuluru, Ramakanth; Rios, Anthony; Tran, Tung

    2017-01-01

    Drug-drug interactions (DDIs) are known to be responsible for nearly a third of all adverse drug reactions. Hence several current efforts focus on extracting signal from EMRs to prioritize DDIs that need further exploration. To this end, being able to extract explicit mentions of DDIs in free text narratives is an important task. In this paper, we explore recurrent neural network (RNN) architectures to detect and classify DDIs from unstructured text using the DDIExtraction dataset from the SemEval 2013 (task 9) shared task. Our methods are in line with those used in other recent deep learning efforts for relation extraction including DDI extraction. However, to our knowledge, we are the first to investigate the potential of character-level RNNs (Char-RNNs) for DDI extraction (and relation extraction in general). Furthermore, we explore a simple but effective model bootstrapping method to (a) build model averaging ensembles, (b) derive confidence intervals around mean micro-F scores (MMF), and (c) assess the average behavior of our methods. Without any rule-based filtering of negative examples, a popular heuristic used by most earlier efforts, we achieve an MMF of 69.13. By adding simple replicable heuristics to filter negative instances we are able to achieve an MMF of 70.38. Furthermore, our best ensembles produce micro F-scores of 70.81 (without filtering) and 72.13 (with filtering), which are superior to metrics reported in published results. Although Char-RNNs turn out to be inferior to regular word-based RNN models in overall comparisons, we find that ensembling models from both architectures results in nontrivial gains over simply using either alone, indicating that they complement each other. PMID:29034375
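
    As an illustration of the micro-F and confidence-interval ideas mentioned in this abstract, the following minimal Python sketch computes a micro-averaged F-score over interaction labels and a percentile bootstrap interval around the mean of several such scores. It is not the authors' procedure: the function names, the "none" negative label, and the choice of a percentile bootstrap over per-model scores are assumptions made here for illustration only.

```python
import random
from statistics import mean


def micro_f1(gold, pred, negative="none"):
    """Micro-averaged F1 over all classes except the negative class.

    Assumption: `negative` marks "no interaction"; the label name is hypothetical.
    """
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        if p != negative and p == g:
            tp += 1
        else:
            if p != negative:  # predicted an interaction that is wrong
                fp += 1
            if g != negative:  # missed or mislabeled a true interaction
                fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def bootstrap_mean_ci(scores, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval around the mean of `scores`.

    Assumption: `scores` holds one micro-F value per independently trained model.
    """
    rng = random.Random(seed)
    means = sorted(
        mean(rng.choices(scores, k=len(scores))) for _ in range(n_resamples)
    )
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return mean(scores), lower, upper
```

    Resampling whole per-model scores keeps the sketch short; a fuller treatment might instead resample test instances or the models entering each ensemble.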

  12. Task Analytic Models to Guide Analysis and Design: Use of the Operator Function Model to Represent Pilot-Autoflight System Mode Problems

    NASA Technical Reports Server (NTRS)

    Degani, Asaf; Mitchell, Christine M.; Chappell, Alan R.; Shafto, Mike (Technical Monitor)

    1995-01-01

    Task-analytic models structure essential information about operator interaction with complex systems, in this case pilot interaction with the autoflight system. Such models serve two purposes: (1) they allow researchers and practitioners to understand pilots' actions; and (2) they provide a compact, computational representation needed to design 'intelligent' aids, e.g., displays, assistants, and training systems. This paper demonstrates the use of the operator function model to trace the process of mode engagements while a pilot is controlling an aircraft via the autoflight system. The operator function model is a normative and nondeterministic model of how a well-trained, well-motivated operator manages multiple concurrent activities for effective real-time control. For each function, the model links the pilot's actions with the required information. Using the operator function model, this paper describes several mode engagement scenarios. These scenarios were observed and documented during a field study that focused on mode engagements and mode transitions during normal line operations. Data including time, ATC clearances, altitude, system states, active modes and sub-modes, and engagement of modes were recorded during sixty-six flights. Using these data, seven prototypical mode engagement scenarios were extracted. One scenario details the decision of the crew to disengage a fully automatic mode in favor of a semi-automatic mode, and the consequences of this action. Another describes a mode error involving updating aircraft speed following the engagement of a speed submode. Other scenarios detail mode confusion at various phases of the flight. This analysis uses the operator function model to identify three aspects of mode engagement: (1) the progress of pilot-aircraft-autoflight system interaction; (2) control/display information required to perform mode management activities; and (3) the potential cause(s) of mode confusion. The goal of this paper is twofold: (1) to demonstrate the use of the operator function model methodology to describe pilot-system interaction while engaging modes and monitoring the system, and (2) to initiate a discussion of how task-analytic models might inform design processes. While the operator function model is only one type of task-analytic representation, the hypothesis of this paper is that some type of task-analytic structure is a prerequisite for the design of effective human-automation interaction.

  13. Mutual information-based facial expression recognition

    NASA Astrophysics Data System (ADS)

    Hazar, Mliki; Hammami, Mohamed; Hanêne, Ben-Abdallah

    2013-12-01

    This paper introduces a novel low-computation discriminative-region representation for the expression analysis task. The proposed approach relies on studies in psychology which show that most of the descriptive regions responsible for facial expression are located around certain face parts. The contribution of this work is a new approach that supports automatic facial expression recognition based on automatic region selection. The region selection step aims to select the descriptive regions responsible for facial expression and was performed using the Mutual Information (MI) technique. For facial feature extraction, we applied Local Binary Patterns (LBP) on the gradient image to encode salient micro-patterns of facial expressions. Experimental studies have shown that using discriminative regions provides better results than using the whole face region whilst reducing the feature vector dimension.
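
    The two building blocks named in this abstract can be sketched in a few lines of Python. The first function estimates the mutual information between a discretized region feature and the expression label, which is one way regions could be ranked; the second computes a basic 8-neighbour LBP code for one pixel. The function names, the discretization, and the neighbour ordering are assumptions made for illustration, and the gradient-image step described in the abstract is omitted.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def lbp_code(img, r, c):
    """Basic 8-neighbour LBP code of the interior pixel (r, c).

    `img` is a list of rows of grey levels. Assumption: neighbours are visited
    clockwise from the top-left corner; other orderings are equally valid.
    """
    center = img[r][c]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit  # set the bit when the neighbour is not darker
    return code
```

    A region whose discretized feature shares more information with the expression label would be ranked higher; a histogram of LBP codes within each selected region could then serve as its descriptor.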

  14. Validating the usability of an interactive Earth Observation based web service for landslide investigation

    NASA Astrophysics Data System (ADS)

    Albrecht, Florian; Weinke, Elisabeth; Eisank, Clemens; Vecchiotti, Filippo; Hölbling, Daniel; Friedl, Barbara; Kociu, Arben

    2017-04-01

    Regional authorities and infrastructure maintainers in almost all mountainous regions of the Earth need detailed and up-to-date landslide inventories for hazard and risk management. Landslide inventories usually are compiled through ground surveys and manual image interpretation following landslide triggering events. We developed a web service that uses Earth Observation (EO) data to support the mapping and monitoring tasks for improving the collection of landslide information. The planned validation of the EO-based web service does not only cover the analysis of the achievable landslide information quality but also the usability and user friendliness of the user interface. The underlying validation criteria are based on the user requirements and the defined tasks and aims in the work description of the FFG project Land@Slide (EO-based landslide mapping: from methodological developments to automated web-based information delivery). The service will be validated in collaboration with stakeholders, decision makers and experts. Users are requested to test the web service functionality and give feedback with a web-based questionnaire by following the subsequently described workflow. The users will operate the web-service via the responsive user interface and can extract landslide information from EO data. They compare it to reference data for quality assessment, for monitoring changes and for assessing landslide-affected infrastructure. An overview page lets the user explore a list of example projects with resulting landslide maps and mapping workflow descriptions. The example projects include mapped landslides in several test areas in Austria and Northern Italy. Landslides were extracted from high resolution (HR) and very high resolution (VHR) satellite imagery, such as Landsat, Sentinel-2, SPOT-5, WorldView-2/3 or Pléiades. The user can create his/her own project by selecting available satellite imagery or by uploading new data. 
Subsequently, a new landslide extraction workflow can be initiated through the functionality that the web service provides: (1) a segmentation of the image into spectrally homogeneous objects, (2) a classification of the objects into landslide and non-landslide areas and (3) an editing tool for the manual refinement of extracted landslide boundaries. In addition, the user interface of the web service provides tools that enable the user (4) to perform a monitoring that identifies changes between landslide maps of different points in time, (5) to perform a validation of the landslide maps by comparing them to reference data, and (6) to perform an assessment of affected infrastructure by comparing the landslide maps to respective infrastructure data. After exploring the web service functionality, the users are asked to fill in the online validation protocol in form of a questionnaire in order to provide their feedback. Concerning usability, we evaluate how intuitive the web service functionality can be operated, how well the integrated help information guides the users, and what kind of background information, e.g. remote sensing concepts and theory, is necessary for a practitioner to fully exploit the value of EO data. The feedback will be used for improving the user interface and for the implementation of additional functionality.

  15. A continuous time-resolved measure decoded from EEG oscillatory activity predicts working memory task performance.

    PubMed

    Astrand, Elaine

    2018-06-01

    Working memory (WM), crucial for successful behavioral performance in most of our everyday activities, holds a central role in goal-directed behavior. As task demands increase, inducing higher WM load, maintaining successful behavioral performance requires the brain to work at the higher end of its capacity. Because it depends on both external and internal factors, individual WM load likely varies in a continuous fashion. The feasibility of extracting such a continuous measure in time that correlates with behavioral performance during a working memory task remains an open question. Multivariate pattern decoding was used to test whether a decoder constructed from two discrete levels of WM load can generalize to produce a continuous measure that predicts task performance. Specifically, a linear regression with L2-regularization was chosen with input features from EEG oscillatory activity recorded from healthy participants while performing the n-back task, [Formula: see text]. The feasibility of extracting a continuous time-resolved measure that correlates positively with trial-by-trial working memory task performance is demonstrated (r = 0.47, p < 0.05). It is furthermore shown that this measure allows task performance to be predicted before action (r = 0.49, p < 0.05). We show that the extracted continuous measure enables the study of the temporal dynamics of the complex activation pattern of WM encoding during the n-back task. Specifically, temporally precise contributions of different spectral features are observed, which extends previous findings of traditional univariate approaches. These results constitute an important contribution towards a wide range of applications in the field of cognitive brain-machine interfaces. Monitoring mental processes related to attention and WM load to reduce the risk of committing errors in high-risk environments could potentially prevent many devastating consequences, and using the continuous measure as neurofeedback opens up new possibilities to develop novel rehabilitation techniques for individuals with degraded WM capacity.
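
    The decoding idea, training a regularized linear model on two discrete load levels and reading its output as a continuous value, can be pictured with a small Python sketch of L2-regularized (ridge) regression. This is only a sketch under stated assumptions: the function names, the regularization strength `lam`, and the toy feature vectors are hypothetical, no intercept is fitted, and nothing here reproduces the study's EEG features or preprocessing.

```python
def ridge_fit(X, y, lam):
    """Solve (X^T X + lam * I) w = X^T y by Gaussian elimination.

    X is a list of feature rows, y a list of targets (e.g. load levels 1 and 2).
    Assumption: lam > 0 or X has full column rank, so the system is solvable.
    """
    d = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
          for j in range(d)] for i in range(d)]
    b = [sum(row[i] * t for row, t in zip(X, y)) for i in range(d)]
    # Forward elimination with partial pivoting.
    for col in range(d):
        pivot = max(range(col, d), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, d):
            f = A[r][col] / A[col][col]
            for k in range(col, d):
                A[r][k] -= f * A[col][k]
            b[r] -= f * b[col]
    # Back substitution.
    w = [0.0] * d
    for i in reversed(range(d)):
        w[i] = (b[i] - sum(A[i][k] * w[k] for k in range(i + 1, d))) / A[i][i]
    return w


def decode(w, x):
    """Continuous decoder output for one feature vector x."""
    return sum(wi * xi for wi, xi in zip(w, x))
```

    Trained on trials labeled 1 and 2, such a decoder returns intermediate values for feature vectors that lie between the two classes, which is the sense in which a two-level decoder can yield a continuous measure.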

  16. A continuous time-resolved measure decoded from EEG oscillatory activity predicts working memory task performance

    NASA Astrophysics Data System (ADS)

    Astrand, Elaine

    2018-06-01

    Objective. Working memory (WM), crucial for successful behavioral performance in most of our everyday activities, holds a central role in goal-directed behavior. As task demands increase, inducing higher WM load, maintaining successful behavioral performance requires the brain to work at the higher end of its capacity. Because it depends on both external and internal factors, individual WM load likely varies in a continuous fashion. The feasibility of extracting such a continuous measure in time that correlates with behavioral performance during a working memory task remains an open question. Approach. Multivariate pattern decoding was used to test whether a decoder constructed from two discrete levels of WM load can generalize to produce a continuous measure that predicts task performance. Specifically, a linear regression with L2-regularization was chosen with input features from EEG oscillatory activity recorded from healthy participants while performing the n-back task, n ∈ [1,2]. Main results. The feasibility of extracting a continuous time-resolved measure that correlates positively with trial-by-trial working memory task performance is demonstrated (r = 0.47, p < 0.05). It is furthermore shown that this measure allows task performance to be predicted before action (r = 0.49, p < 0.05). We show that the extracted continuous measure enables the study of the temporal dynamics of the complex activation pattern of WM encoding during the n-back task. Specifically, temporally precise contributions of different spectral features are observed, which extends previous findings of traditional univariate approaches. Significance. These results constitute an important contribution towards a wide range of applications in the field of cognitive brain–machine interfaces. Monitoring mental processes related to attention and WM load to reduce the risk of committing errors in high-risk environments could potentially prevent many devastating consequences, and using the continuous measure as neurofeedback opens up new possibilities to develop novel rehabilitation techniques for individuals with degraded WM capacity.

  17. An attention-based BiLSTM-CRF approach to document-level chemical named entity recognition.

    PubMed

    Luo, Ling; Yang, Zhihao; Yang, Pei; Zhang, Yin; Wang, Lei; Lin, Hongfei; Wang, Jian

    2018-04-15

    In biomedical research, chemicals are an important class of entities, and chemical named entity recognition (NER) is an important task in the field of biomedical information extraction. However, most popular chemical NER methods are based on traditional machine learning, and their performance depends heavily on feature engineering. Moreover, these methods operate at the sentence level and therefore suffer from the tagging inconsistency problem. In this paper, we propose a neural network approach, i.e. attention-based bidirectional Long Short-Term Memory with a conditional random field layer (Att-BiLSTM-CRF), to document-level chemical NER. The approach leverages document-level global information obtained by an attention mechanism to enforce tagging consistency across multiple instances of the same token in a document. With little feature engineering, it achieves better performance than other state-of-the-art methods on the BioCreative IV chemical compound and drug name recognition (CHEMDNER) corpus and the BioCreative V chemical-disease relation (CDR) task corpus (F-scores of 91.14% and 92.57%, respectively). Data and code are available at https://github.com/lingluodlut/Att-ChemdNER. Contact: yangzh@dlut.edu.cn or wangleibihami@gmail.com. Supplementary data are available at Bioinformatics online.

  18. Software for Managing Parametric Studies

    NASA Technical Reports Server (NTRS)

    Yarrow, Maurice; McCann, Karen M.; DeVivo, Adrian

    2003-01-01

    The Information Power Grid Virtual Laboratory (ILab) is a Practical Extraction and Reporting Language (PERL) graphical-user-interface computer program that generates shell scripts to facilitate parametric studies performed on the Grid. (The Grid denotes a worldwide network of supercomputers used for scientific and engineering computations involving data sets too large to fit on desktop computers.) Heretofore, parametric studies on the Grid have been impeded by the need to create control language scripts and edit input data files, painstaking tasks that are necessary for managing multiple jobs on multiple computers. ILab reflects an object-oriented approach to automation of these tasks: all data and operations are organized into packages in order to accelerate development and debugging. A container or document object in ILab, called an experiment, contains all the information (data and file paths) necessary to define a complex series of repeated, sequenced, and/or branching processes. For convenience and to enable reuse, this object is serialized to and from disk storage. At run time, the current ILab experiment is used to generate required input files and shell scripts, create directories, copy data files, and then both initiate and monitor the execution of all computational processes.

  19. Process evaluations of task sharing interventions for perinatal depression in low and middle income countries (LMIC): a systematic review and qualitative meta-synthesis.

    PubMed

    Munodawafa, Memory; Mall, Sumaya; Lund, Crick; Schneider, Marguerite

    2018-03-23

    Perinatal depression is common in low and middle income countries (LAMICs). Task sharing interventions have been implemented to treat perinatal depression in these settings, as a way of dealing with staff shortages. Task sharing allows lay health workers to provide services for less complex cases while being trained and supervised by specialists. Randomized controlled trials suggest that these interventions can be effective but there is limited qualitative information exploring barriers and facilitators to their implementation. This systematic review examines current qualitative evidence from process evaluations of task sharing interventions for perinatal depression in LAMICs in relation to the United Kingdom (UK) Medical Research Council (MRC) framework for conducting process evaluations. We searched Medline/PubMed, PsycINFO, Scopus, Cochrane Library and Web of Science for studies from LAMICs using search terms under the broad categories of: (a) "maternal depression", (b) "intervention", (c) "lay counsellor" OR "community health worker" OR "non-specialist" and (d) "LAMICs". Abstracts were independently reviewed for inclusion by two authors. Full text articles were screened and data for included articles were extracted using a standard data extraction sheet. A qualitative synthesis of the evidence was conducted. A total of 8420 articles were identified from initial searches. Of these, 26 full text articles were screened for eligibility, with only three studies meeting the inclusion criteria. Main findings revealed that participants identified the following crucial factors: contextual factors included physical location, accessibility and cultural norms. Implementation factors included acceptability of the intervention and characteristics of the personnel.
Mechanisms included counsellor factors such as motivating and facilitating trust; intervention factors such as use of stories and visual aids, and understandability of the content; and participant factors such as shared experience, meeting learning needs, and meeting expectations. While task sharing has been suggested as an effective way of filling the treatment gap for perinatal depression, there is a paucity of qualitative research exploring barriers and facilitators to implementing these interventions. Qualitative process evaluations are crucial for the development of culturally relevant interventions.

  20. Large-scale extraction of accurate drug-disease treatment pairs from biomedical literature for drug repurposing

    PubMed Central

    2013-01-01

    Background. A large-scale, highly accurate, machine-understandable drug-disease treatment relationship knowledge base is important for computational approaches to drug repurposing. The large body of published biomedical research articles and clinical case reports available on MEDLINE is a rich source of FDA-approved drug-disease indications as well as drug-repurposing knowledge that is crucial for applying FDA-approved drugs to new diseases. However, much of this information is buried in free text and not captured in any existing databases. The goal of this study is to extract a large number of accurate drug-disease treatment pairs from published literature. Results. In this study, we developed a simple but highly accurate pattern-learning approach to extract treatment-specific drug-disease pairs from 20 million biomedical abstracts available on MEDLINE. We extracted a total of 34,305 unique drug-disease treatment pairs, the majority of which are not included in existing structured databases. Our algorithm achieved a precision of 0.904 and a recall of 0.131 in extracting all pairs, and a precision of 0.904 and a recall of 0.842 in extracting frequent pairs. In addition, we have shown that the extracted pairs strongly correlate with both drug target genes and therapeutic classes, and therefore may have high potential in drug discovery. Conclusions. We demonstrated that our simple pattern-learning relationship extraction algorithm is able to accurately extract many drug-disease pairs from the free text of biomedical literature that are not captured in structured databases. The large-scale, accurate, machine-understandable drug-disease treatment knowledge base resulting from our study, in combination with pairs from structured databases, will have high potential in computational drug repurposing tasks. PMID:23742147
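
    To make the pattern-based extraction and the reported precision and recall figures concrete, here is a minimal Python sketch. It applies a single hand-written textual pattern, rather than a learned one, to pull candidate drug-disease pairs out of free text and then scores them against a reference set. The pattern, the function names, and the toy sentences are all assumptions for illustration and are not taken from the study.

```python
import re

# Assumption: one hand-written pattern stands in for the learned patterns.
TREATMENT_PATTERN = re.compile(r"(\w+) (?:in|for) the treatment of (\w+)")


def extract_pairs(text):
    """Return the set of lower-cased (drug, disease) pairs matched in `text`."""
    return {
        (drug.lower(), disease.lower())
        for drug, disease in TREATMENT_PATTERN.findall(text)
    }


def precision_recall(extracted, gold):
    """Precision and recall of extracted pairs against a reference set."""
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall
```

    High precision with low recall over all pairs, as reported above, is what one would expect from a small set of strict patterns: what they match is usually right, but many pairs are phrased in ways the patterns do not cover.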

  1. Preferential processing of task-irrelevant beloved-related information and task performance: Two event-related potential studies.

    PubMed

    Langeslag, Sandra J E; van Strien, Jan W

    2017-09-18

    People who are in love have better attention for beloved-related information, but report having trouble focusing on other tasks, such as (home)work. So, romantic love can both improve and hurt cognition. Emotional information is preferentially processed, which improves task performance when the information is task-relevant, but hurts task performance when it is task-irrelevant. Because beloved-related information is highly emotional, the effects of romantic love on cognition may resemble these effects of emotion on cognition. We examined whether beloved-related information is preferentially processed even when it is task-irrelevant and whether this hurts task performance. In two event-related potential studies, participants who had recently fallen in love performed a visuospatial short-term memory task. Task-irrelevant beloved, friend, and stranger faces were presented during maintenance (Study 1), or encoding (Study 2). The Early Posterior Negativity (EPN) reflecting early automatic attentional capturing and the Late Positive Potential (LPP) reflecting sustained motivated attention were largest for beloved pictures. Thus, beloved pictures are preferentially processed even when they are task-irrelevant. Task performance and reaction times did not differ between beloved, friend, and stranger conditions. Nevertheless, self-reported obsessive thinking about the beloved tended to correlate negatively with task performance, and positively with reaction times, across conditions. So, although task-irrelevant beloved-related information does not impact task performance, more obsessive thinking about the beloved might relate to poorer and slower overall task performance. More research is needed to clarify why people experience trouble focusing on beloved-unrelated tasks and how this negative effect of love on cognition could be reduced. Copyright © 2017 Elsevier Ltd. All rights reserved.

  2. The CHEMDNER corpus of chemicals and drugs and its annotation principles.

    PubMed

    Krallinger, Martin; Rabal, Obdulia; Leitner, Florian; Vazquez, Miguel; Salgado, David; Lu, Zhiyong; Leaman, Robert; Lu, Yanan; Ji, Donghong; Lowe, Daniel M; Sayle, Roger A; Batista-Navarro, Riza Theresa; Rak, Rafal; Huber, Torsten; Rocktäschel, Tim; Matos, Sérgio; Campos, David; Tang, Buzhou; Xu, Hua; Munkhdalai, Tsendsuren; Ryu, Keun Ho; Ramanan, S V; Nathan, Senthil; Žitnik, Slavko; Bajec, Marko; Weber, Lutz; Irmer, Matthias; Akhondi, Saber A; Kors, Jan A; Xu, Shuo; An, Xin; Sikdar, Utpal Kumar; Ekbal, Asif; Yoshioka, Masaharu; Dieb, Thaer M; Choi, Miji; Verspoor, Karin; Khabsa, Madian; Giles, C Lee; Liu, Hongfang; Ravikumar, Komandur Elayavilli; Lamurias, Andre; Couto, Francisco M; Dai, Hong-Jie; Tsai, Richard Tzong-Han; Ata, Caglar; Can, Tolga; Usié, Anabel; Alves, Rui; Segura-Bedmar, Isabel; Martínez, Paloma; Oyarzabal, Julen; Valencia, Alfonso

    2015-01-01

    The automatic extraction of chemical information from text requires the recognition of chemical entity mentions as one of its key steps. When developing supervised named entity recognition (NER) systems, the availability of a large, manually annotated text corpus is desirable. Furthermore, large corpora permit the robust evaluation and comparison of different approaches that detect chemicals in documents. We present the CHEMDNER corpus, a collection of 10,000 PubMed abstracts that contain a total of 84,355 chemical entity mentions labeled manually by expert chemistry literature curators, following annotation guidelines specifically defined for this task. The abstracts of the CHEMDNER corpus were selected to be representative for all major chemical disciplines. Each of the chemical entity mentions was manually labeled according to its structure-associated chemical entity mention (SACEM) class: abbreviation, family, formula, identifier, multiple, systematic and trivial. The difficulty and consistency of tagging chemicals in text was measured using an agreement study between annotators, obtaining a percentage agreement of 91. For a subset of the CHEMDNER corpus (the test set of 3,000 abstracts) we provide not only the Gold Standard manual annotations, but also mentions automatically detected by the 26 teams that participated in the BioCreative IV CHEMDNER chemical mention recognition task. In addition, we release the CHEMDNER silver standard corpus of automatically extracted mentions from 17,000 randomly selected PubMed abstracts. A version of the CHEMDNER corpus in the BioC format has been generated as well. We propose a standard for required minimum information about entity annotations for the construction of domain specific corpora on chemical and drug entities. The CHEMDNER corpus and annotation guidelines are available at: http://www.biocreative.org/resources/biocreative-iv/chemdner-corpus/.

  3. The CHEMDNER corpus of chemicals and drugs and its annotation principles

    PubMed Central

    2015-01-01

    The automatic extraction of chemical information from text requires the recognition of chemical entity mentions as one of its key steps. When developing supervised named entity recognition (NER) systems, the availability of a large, manually annotated text corpus is desirable. Furthermore, large corpora permit the robust evaluation and comparison of different approaches that detect chemicals in documents. We present the CHEMDNER corpus, a collection of 10,000 PubMed abstracts that contain a total of 84,355 chemical entity mentions labeled manually by expert chemistry literature curators, following annotation guidelines specifically defined for this task. The abstracts of the CHEMDNER corpus were selected to be representative for all major chemical disciplines. Each of the chemical entity mentions was manually labeled according to its structure-associated chemical entity mention (SACEM) class: abbreviation, family, formula, identifier, multiple, systematic and trivial. The difficulty and consistency of tagging chemicals in text was measured using an agreement study between annotators, obtaining a percentage agreement of 91. For a subset of the CHEMDNER corpus (the test set of 3,000 abstracts) we provide not only the Gold Standard manual annotations, but also mentions automatically detected by the 26 teams that participated in the BioCreative IV CHEMDNER chemical mention recognition task. In addition, we release the CHEMDNER silver standard corpus of automatically extracted mentions from 17,000 randomly selected PubMed abstracts. A version of the CHEMDNER corpus in the BioC format has been generated as well. We propose a standard for required minimum information about entity annotations for the construction of domain specific corpora on chemical and drug entities. The CHEMDNER corpus and annotation guidelines are available at: http://www.biocreative.org/resources/biocreative-iv/chemdner-corpus/ PMID:25810773

  4. Automatic extraction of property norm-like data from large text corpora.

    PubMed

    Kelly, Colin; Devereux, Barry; Korhonen, Anna

    2014-01-01

    Traditional methods for deriving property-based representations of concepts from text have focused on either extracting only a subset of possible relation types, such as hyponymy/hypernymy (e.g., car is-a vehicle) or meronymy/metonymy (e.g., car has wheels), or unspecified relations (e.g., car--petrol). We propose a system for the challenging task of automatic, large-scale acquisition of unconstrained, human-like property norms from large text corpora, and discuss the theoretical implications of such a system. We employ syntactic, semantic, and encyclopedic information to guide our extraction, yielding concept-relation-feature triples (e.g., car be fast, car require petrol, car cause pollution), which approximate property-based conceptual representations. Our novel method extracts candidate triples from parsed corpora (Wikipedia and the British National Corpus) using syntactically and grammatically motivated rules, then reweights triples with a linear combination of their frequency and four statistical metrics. We assess our system output in three ways: lexical comparison with norms derived from human-generated property norm data, direct evaluation by four human judges, and a semantic distance comparison with both WordNet similarity data and human-judged concept similarity ratings. Our system offers a viable and performant method of plausible triple extraction: Our lexical comparison shows comparable performance to the current state-of-the-art, while subsequent evaluations exhibit the human-like character of our generated properties.

  5. The extraction of motion-onset VEP BCI features based on deep learning and compressed sensing.

    PubMed

    Ma, Teng; Li, Hui; Yang, Hao; Lv, Xulin; Li, Peiyang; Liu, Tiejun; Yao, Dezhong; Xu, Peng

    2017-01-01

    Motion-onset visual evoked potentials (mVEP) can provide a softer stimulus with reduced fatigue, and it has potential applications for brain computer interface (BCI) systems. However, the mVEP waveform is seriously masked in the strong background EEG activities, and an effective approach is needed to extract the corresponding mVEP features to perform task recognition for BCI control. In the current study, we combine deep learning with compressed sensing to mine discriminative mVEP information to improve the mVEP BCI performance. The deep learning and compressed sensing approach can generate the multi-modality features which can effectively improve the BCI performance with approximately 3.5% accuracy increment over all 11 subjects and is more effective for those subjects with relatively poor performance when using the conventional features. Compared with the conventional amplitude-based mVEP feature extraction approach, the deep learning and compressed sensing approach has a higher classification accuracy and is more effective for subjects with relatively poor performance. According to the results, the deep learning and compressed sensing approach is more effective for extracting the mVEP feature to construct the corresponding BCI system, and the proposed feature extraction framework is easy to extend to other types of BCIs, such as motor imagery (MI), steady-state visual evoked potential (SSVEP) and P300. Copyright © 2016 Elsevier B.V. All rights reserved.

  6. Metabolomic analysis-Addressing NMR and LC-MS related problems in human feces sample preparation.

    PubMed

    Moosmang, Simon; Pitscheider, Maria; Sturm, Sonja; Seger, Christoph; Tilg, Herbert; Halabalaki, Maria; Stuppner, Hermann

    2017-10-31

    Metabolomics is a well-established field in fundamental clinical research with applications in different human body fluids. However, metabolomic investigations in feces are currently an emerging field. Fecal sample preparation is a demanding task due to high complexity and heterogeneity of the matrix. To gain access to the information enclosed in human feces it is necessary to extract the metabolites and make them accessible to analytical platforms like NMR or LC-MS. In this study different pre-analytical parameters and factors were investigated i.e. water content, different extraction solvents, influence of freeze-drying and homogenization, ratios of sample weight to extraction solvent, and their respective impact on metabolite profiles acquired by NMR and LC-MS. The results indicate that profiles are strongly biased by selection of extraction solvent or drying of samples, which causes different metabolites to be lost, under- or overstated. Additionally signal intensity and reproducibility of the measurement were found to be strongly dependent on sample pre-treatment steps: freeze-drying and homogenization lead to improved release of metabolites and thus increased signals, but at the same time induced variations and thus deteriorated reproducibility. We established the first protocol for extraction of human fecal samples and subsequent measurement with both complementary techniques NMR and LC-MS. Copyright © 2017 Elsevier B.V. All rights reserved.

  7. Increasing value and reducing waste in data extraction for systematic reviews: tracking data in data extraction forms.

    PubMed

    Shokraneh, Farhad; Adams, Clive E

    2017-08-04

    Data extraction is one of the most time-consuming tasks in performing a systematic review. Extraction is often onto some sort of form. Sharing completed forms can be used to check quality and accuracy of extraction or for re-cycling data to other researchers for updating. However, validating each piece of extracted data is time-consuming and linking to source problematic. In this methodology paper, we summarize three methods for reporting the location of data in original full-text reports, comparing their advantages and disadvantages.

  8. Practical limits on muscle synergy identification by non-negative matrix factorization in systems with mechanical constraints.

    PubMed

    Burkholder, Thomas J; van Antwerp, Keith W

    2013-02-01

    Statistical decomposition, including non-negative matrix factorization (NMF), is a convenient tool for identifying patterns of structured variability within behavioral motor programs, but it is unclear how the resolved factors relate to actual neural structures. Factors can be extracted from a uniformly sampled, low-dimension command space. In practical application, the command space is limited, either to those activations that perform some task(s) successfully or to activations induced in response to specific perturbations. NMF was applied to muscle activation patterns synthesized from low dimensional, synergy-like control modules mimicking simple task performance or feedback activation from proprioceptive signals. In the task-constrained paradigm, the accuracy of control module recovery was highly dependent on the sampled volume of control space, such that sampling even 50% of control space produced a substantial degradation in factor accuracy. In the feedback paradigm, NMF was not capable of extracting more than four control modules, even in a mechanical model with seven internal degrees of freedom. Reduced access to the low-dimensional control space imposed by physical constraints may result in substantial distortion of an existing low dimensional controller, such that neither the dimensionality nor the composition of the recovered/extracted factors match the original controller.

  9. Is There a Common Summary Statistical Process for Representing the Mean and Variance? A Study Using Illustrations of Familiar Items

    PubMed Central

    Yang, Yi; Tokita, Midori; Ishiguchi, Akira

    2018-01-01

    A number of studies revealed that our visual system can extract different types of summary statistics, such as the mean and variance, from sets of items. Although the extraction of such summary statistics has been studied well in isolation, the relationship between these statistics remains unclear. In this study, we explored this issue using an individual differences approach. Observers viewed illustrations of strawberries and lollypops varying in size or orientation and performed four tasks in a within-subject design, namely mean and variance discrimination tasks with size and orientation domains. We found that the performances in the mean and variance discrimination tasks were not correlated with each other and demonstrated that extractions of the mean and variance are mediated by different representation mechanisms. In addition, we tested the relationship between performances in size and orientation domains for each summary statistic (i.e. mean and variance) and examined whether each summary statistic has distinct processes across perceptual domains. The results illustrated that statistical summary representations of size and orientation may share a common mechanism for representing the mean and possibly for representing variance. Introspections for each observer performing the tasks were also examined and discussed. PMID:29399318

  10. Reverse Engineering Cellular Networks with Information Theoretic Methods

    PubMed Central

    Villaverde, Alejandro F.; Ross, John; Banga, Julio R.

    2013-01-01

    Building mathematical models of cellular networks lies at the core of systems biology. It involves, among other tasks, the reconstruction of the structure of interactions between molecular components, which is known as network inference or reverse engineering. Information theory can help in the goal of extracting as much information as possible from the available data. A large number of methods founded on these concepts have been proposed in the literature, not only in biology journals, but in a wide range of areas. Their critical comparison is difficult due to the different focuses and the adoption of different terminologies. Here we attempt to review some of the existing information theoretic methodologies for network inference, and clarify their differences. While some of these methods have achieved notable success, many challenges remain, among which we can mention dealing with incomplete measurements, noisy data, counterintuitive behaviour emerging from nonlinear relations or feedback loops, and computational burden of dealing with large data sets. PMID:24709703

  11. Task-relevant perceptual features can define categories in visual memory too.

    PubMed

    Antonelli, Karla B; Williams, Carrick C

    2017-11-01

    Although Konkle, Brady, Alvarez, and Oliva (2010, Journal of Experimental Psychology: General, 139(3), 558) claim that visual long-term memory (VLTM) is organized on underlying conceptual, not perceptual, information, visual memory results from visual search tasks are not well explained by this theory. We hypothesized that when viewing an object, any task-relevant visual information is critical to the organizational structure of VLTM. In two experiments, we examined the organization of VLTM by measuring the amount of retroactive interference created by objects possessing different combinations of task-relevant features. Based on task instructions, only the conceptual category was task relevant or both the conceptual category and a perceptual object feature were task relevant. Findings indicated that when made task relevant, perceptual object feature information, along with conceptual category information, could affect memory organization for objects in VLTM. However, when perceptual object feature information was task irrelevant, it did not contribute to memory organization; instead, memory defaulted to being organized around conceptual category information. These findings support the theory that a task-defined organizational structure is created in VLTM based on the relevance of particular object features and information.

  12. The use of cognitive task analysis to reveal the instructional limitations of experts in the teaching of procedural skills.

    PubMed

    Sullivan, Maura E; Yates, Kenneth A; Inaba, Kenji; Lam, Lydia; Clark, Richard E

    2014-05-01

    Because of the automated nature of knowledge, experts tend to omit information when describing a task. A potential solution is cognitive task analysis (CTA). The authors investigated the percentage of knowledge experts omitted when teaching a cricothyrotomy to determine the percentage of additional knowledge gained during a CTA interview. Three experts were videotaped teaching a cricothyrotomy in 2010 at the University of Southern California. After transcription, they participated in CTA interviews for the same procedure. Three additional surgeons were recruited to perform a CTA for the procedure, and a "gold standard" task list was created. Transcriptions from the teaching sessions were compared with the task list to identify omitted steps (both "what" and "how" to do). Transcripts from the CTA interviews were compared against the task list to determine the percentage of knowledge articulated by each expert during the initial "free recall" (unprompted) phase of the CTA interview versus the amount of knowledge gained by using CTA elicitation techniques (prompted). Experts omitted an average of 71% (10/14) of clinical knowledge steps, 51% (14/27) of action steps, and 73% (3.6/5) of decision steps. For action steps, experts described "how to do it" only 13% (3.6/27) of the time. The average number of steps that were described increased from 44% (20/46) when unprompted to 66% (31/46) when prompted. This study supports previous research that experts unintentionally omit knowledge when describing a procedure. CTA is a useful method to extract automated knowledge and augment expert knowledge recall during teaching.

  13. Implicit learning in cotton-top tamarins (Saguinus oedipus) and pigeons (Columba livia).

    PubMed

    Locurto, Charles; Fox, Maura; Mazzella, Andrea

    2015-06-01

    There is considerable interest in the conditions under which human subjects learn patterned information without explicit instructions to learn that information. This form of learning, termed implicit or incidental learning, can be approximated in nonhumans by exposing subjects to patterned information but delivering reinforcement randomly, thereby not requiring the subjects to learn the information in order to be reinforced. Following acquisition, nonhuman subjects are queried as to what they have learned about the patterned information. In the present experiment, we extended the study of implicit learning in nonhumans by comparing two species, cotton-top tamarins (Saguinus oedipus) and pigeons (Columba livia), on an implicit learning task that used an artificial grammar to generate the patterned elements for training. We equated the conditions of training and testing as much as possible between the two species. The results indicated that both species demonstrated approximately the same magnitude of implicit learning, judged both by a random test and by choice tests between pairs of training elements. This finding suggests that the ability to extract patterned information from situations in which such learning is not demanded is of longstanding origin.

  14. Fault detection and diagnosis for gas turbines based on a kernelized information entropy model.

    PubMed

    Wang, Weiying; Xu, Zhiqiang; Tang, Rui; Li, Shuying; Wu, Wei

    2014-01-01

    Gas turbines are considered as one kind of the most important devices in power engineering and have been widely used in power generation, airplanes, and naval ships and also in oil drilling platforms. However, they are monitored without man on duty in the most cases. It is highly desirable to develop techniques and systems to remotely monitor their conditions and analyze their faults. In this work, we introduce a remote system for online condition monitoring and fault diagnosis of gas turbine on offshore oil well drilling platforms based on a kernelized information entropy model. Shannon information entropy is generalized for measuring the uniformity of exhaust temperatures, which reflect the overall states of the gas paths of gas turbine. In addition, we also extend the entropy to compute the information quantity of features in kernel spaces, which help to select the informative features for a certain recognition task. Finally, we introduce the information entropy based decision tree algorithm to extract rules from fault samples. The experiments on some real-world data show the effectiveness of the proposed algorithms.
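
    As a rough illustration of the idea of using Shannon entropy to score the uniformity of exhaust temperatures, the sketch below normalizes a set of readings into a distribution and reports its entropy relative to the maximum. This is only an assumption about how such a score could be formed; it is not the generalized or kernelized entropy defined by the authors, and the function names and sample readings are hypothetical.

```python
import math


def temperature_entropy(temps):
    """Shannon entropy (in nats) of temperatures normalized to sum to 1.

    Assumption: readings are strictly positive (e.g. kelvin), so that
    each reading's share of the total can be treated as a probability.
    """
    if len(temps) < 2 or any(t <= 0 for t in temps):
        raise ValueError("need at least two strictly positive readings")
    total = sum(temps)
    probs = [t / total for t in temps]
    return -sum(p * math.log(p) for p in probs)


def uniformity(temps):
    """Entropy divided by its maximum, log(n); equals 1.0 for identical readings."""
    return temperature_entropy(temps) / math.log(len(temps))


# Hypothetical thermocouple readings: an even gas path and one with a cold spot.
even = [800.0, 800.0, 800.0, 800.0]
cold_spot = [800.0, 800.0, 800.0, 500.0]
print(uniformity(even), uniformity(cold_spot))
```

    A score that falls below 1.0 indicates that the readings have become less uniform; any alarm threshold would have to be chosen from real operating data.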

  15. Fault Detection and Diagnosis for Gas Turbines Based on a Kernelized Information Entropy Model

    PubMed Central

    Wang, Weiying; Xu, Zhiqiang; Tang, Rui; Li, Shuying; Wu, Wei

    2014-01-01

    Gas turbines are considered as one kind of the most important devices in power engineering and have been widely used in power generation, airplanes, and naval ships and also in oil drilling platforms. However, they are monitored without man on duty in the most cases. It is highly desirable to develop techniques and systems to remotely monitor their conditions and analyze their faults. In this work, we introduce a remote system for online condition monitoring and fault diagnosis of gas turbine on offshore oil well drilling platforms based on a kernelized information entropy model. Shannon information entropy is generalized for measuring the uniformity of exhaust temperatures, which reflect the overall states of the gas paths of gas turbine. In addition, we also extend the entropy to compute the information quantity of features in kernel spaces, which help to select the informative features for a certain recognition task. Finally, we introduce the information entropy based decision tree algorithm to extract rules from fault samples. The experiments on some real-world data show the effectiveness of the proposed algorithms. PMID:25258726

  16. Distraction during learning with hypermedia: difficult tasks help to keep task goals on track

    PubMed Central

    Scheiter, Katharina; Gerjets, Peter; Heise, Elke

    2014-01-01

    In educational hypermedia environments, students are often confronted with potential sources of distraction arising from additional information that, albeit interesting, is unrelated to their current task goal. The paper investigates the conditions under which distraction occurs and hampers performance. Based on theories of volitional action control it was hypothesized that interesting information, especially if related to a pending goal, would interfere with task performance only when working on easy, but not on difficult tasks. In Experiment 1, 66 students learned about probability theory using worked examples and solved corresponding test problems, whose task difficulty was manipulated. As a second factor, the presence of interesting information unrelated to the primary task was varied. Results showed that students solved more easy than difficult probability problems correctly. However, the presence of interesting, but task-irrelevant information did not interfere with performance. In Experiment 2, 68 students again engaged in example-based learning and problem solving in the presence of task-irrelevant information. Problem-solving difficulty was varied as a first factor. Additionally, the presence of a pending goal related to the task-irrelevant information was manipulated. As expected, problem-solving performance declined when a pending goal was present during working on easy problems, whereas no interference was observed for difficult problems. Moreover, the presence of a pending goal reduced the time on task-relevant information and increased the time on task-irrelevant information while working on easy tasks. However, as revealed by mediation analyses these changes in overt information processing behavior did not explain the decline in problem-solving performance. As an alternative explanation it is suggested that goal conflicts resulting from pending goals claim cognitive resources, which are then no longer available for learning and problem solving. PMID:24723907

  17. Temporal Discounting and Inter-Temporal Choice in Rhesus Monkeys

    PubMed Central

    Hwang, Jaewon; Kim, Soyoun; Lee, Daeyeol

    2009-01-01

    Humans and animals are more likely to take an action leading to an immediate reward than actions with delayed rewards of similar magnitudes. Although such devaluation of delayed rewards has been almost universally described by hyperbolic discount functions, the rate of this temporal discounting varies substantially among different animal species. This might be in part due to the differences in how the information about reward is presented to decision makers. In previous animal studies, reward delays or magnitudes were gradually adjusted across trials, so the animals learned the properties of future rewards from the rewards they waited for and consumed previously. In contrast, verbal cues have been used commonly in human studies. In the present study, rhesus monkeys were trained in a novel inter-temporal choice task in which the magnitude and delay of reward were indicated symbolically using visual cues and varied randomly across trials. We found that monkeys could extract the information about reward delays from visual symbols regardless of the number of symbols used to indicate the delay. The rate of temporal discounting observed in the present study was comparable to the previous estimates in other mammals, and the animal's choice behavior was largely consistent with hyperbolic discounting. Our results also suggest that the rate of temporal discounting might be influenced by contextual factors, such as the novelty of the task. The flexibility furnished by this new inter-temporal choice task might be useful for future neurobiological investigations on inter-temporal choice in non-human primates. PMID:19562091
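
    For readers unfamiliar with hyperbolic discounting, the standard form is V = A / (1 + kD), where A is the reward magnitude, D the delay, and k the discount rate. The sketch below is a minimal illustration of that formula only; the function names, the units, and the value of k are assumptions and are not taken from the study.

```python
def hyperbolic_value(amount: float, delay: float, k: float) -> float:
    """Discounted value V = A / (1 + k * D) of a reward of size `amount`
    delivered after `delay` (k has units of 1/delay)."""
    if delay < 0 or k < 0:
        raise ValueError("delay and k must be non-negative")
    return amount / (1.0 + k * delay)


def prefers_delayed(small_now: float, large_later: float,
                    delay: float, k: float) -> bool:
    """True if the larger, delayed reward has the higher discounted value."""
    return hyperbolic_value(large_later, delay, k) > hyperbolic_value(small_now, 0.0, k)


# Hypothetical choice: 4 units now versus 10 units after a delay of 2, with k = 0.5.
print(hyperbolic_value(10.0, 2.0, 0.5))       # discounted value of the delayed option
print(prefers_delayed(4.0, 10.0, 2.0, 0.5))   # whether the delayed option wins
```

    A larger k means steeper discounting, so the same delayed reward loses to smaller immediate rewards more often.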

  18. High-frequency energy in singing and speech

    NASA Astrophysics Data System (ADS)

    Monson, Brian Bruce

    While human speech and the human voice generate acoustical energy up to (and beyond) 20 kHz, the energy above approximately 5 kHz has been largely neglected. Evidence is accruing that this high-frequency energy contains perceptual information relevant to speech and voice, including percepts of quality, localization, and intelligibility. The present research was an initial step in the long-range goal of characterizing high-frequency energy in singing voice and speech, with particular regard for its perceptual role and its potential for modification during voice and speech production. In this study, a database of high-fidelity recordings of talkers was created and used for a broad acoustical analysis and general characterization of high-frequency energy, as well as specific characterization of phoneme category, voice and speech intensity level, and mode of production (speech versus singing) by high-frequency energy content. Directionality of radiation of high-frequency energy from the mouth was also examined. The recordings were used for perceptual experiments wherein listeners were asked to discriminate between speech and voice samples that differed only in high-frequency energy content. Listeners were also subjected to gender discrimination tasks, mode-of-production discrimination tasks, and transcription tasks with samples of speech and singing that contained only high-frequency content. The combination of these experiments has revealed that (1) human listeners are able to detect very subtle level changes in high-frequency energy, and (2) human listeners are able to extract significant perceptual information from high-frequency energy.

  19. Multi-object segmentation framework using deformable models for medical imaging analysis.

    PubMed

    Namías, Rafael; D'Amato, Juan Pablo; Del Fresno, Mariana; Vénere, Marcelo; Pirró, Nicola; Bellemare, Marc-Emmanuel

    2016-08-01

    Segmenting structures of interest in medical images is an important step in different tasks such as visualization, quantitative analysis, simulation, and image-guided surgery, among several other clinical applications. Numerous segmentation methods have been developed in the past three decades for extraction of anatomical or functional structures on medical imaging. Deformable models, which include the active contour models or snakes, are among the most popular methods for image segmentation combining several desirable features such as inherent connectivity and smoothness. Even though different approaches have been proposed and significant work has been dedicated to the improvement of such algorithms, there are still challenging research directions, such as the simultaneous extraction of multiple objects and the integration of individual techniques. This paper presents a novel open-source framework called deformable model array (DMA) for the segmentation of multiple and complex structures of interest in different imaging modalities. While most active contour algorithms can extract one region at a time, DMA allows integrating several deformable models to deal with multiple segmentation scenarios. Moreover, it is possible to consider any existing explicit deformable model formulation and even to incorporate new active contour methods, allowing the selection of a suitable combination in different conditions. The framework also introduces a control module that coordinates the cooperative evolution of the snakes and is able to solve interaction issues toward the segmentation goal. Thus, DMA can implement complex object and multi-object segmentations in both 2D and 3D using the contextual information derived from the model interaction. These are important features for several medical image analysis tasks in which different but related objects need to be simultaneously extracted. Experimental results on both computed tomography and magnetic resonance imaging show that the proposed framework has a wide range of applications, especially in the presence of adjacent structures of interest or under intra-structure inhomogeneities, giving excellent quantitative results.

  20. BMP effectiveness/efficiency monitoring evaluation of regional information and data : response to Tasks 2 & 3 : information analysis and needs assessment.

    DOT National Transportation Integrated Search

    2002-10-04

    The second task, Task 2, includes analyzing the information obtained in Task 1, expanding it if necessary, and identifying the extent of regional information and data with respect to the types of water quality facilities typically designed for the tr...
