Sample records for image annotation applications

  1. Annotation of UAV surveillance video

    NASA Astrophysics Data System (ADS)

    Howlett, Todd; Robertson, Mark A.; Manthey, Dan; Krol, John

    2004-08-01

    Significant progress toward the development of a video annotation capability is presented in this paper. Research and development of an object tracking algorithm applicable to UAV video is described. Object tracking is necessary for attaching annotations to the objects of interest. A methodology and format are defined for encoding video annotations using the SMPTE Key-Length-Value (KLV) encoding standard. This provides the following benefits: non-destructive annotation, compliance with existing standards, video playback in systems that are not annotation enabled, and support for a real-time implementation. A model real-time video annotation system is also presented, at a high level, using the MPEG-2 Transport Stream as the transmission medium. This work was accomplished to meet the Department of Defense's (DoD's) need for a video annotation capability. Current practice for creating annotated products is to capture a still image frame, annotate it using an Electric Light Table application, and then pass the annotated image on as a product. That is not adequate for reporting or downstream cueing: it is too slow, and there is a severe loss of information. This paper describes a capability for annotating directly on the video.
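
The KLV wrapping described above can be made concrete with a short sketch. This is a minimal illustration of SMPTE 336M-style Key-Length-Value packing (a 16-byte universal label, a BER-encoded length, then the value), not the authors' implementation; the 16-byte key below is a placeholder, not a registered SMPTE label.

```python
def ber_encode_length(n: int) -> bytes:
    """BER length field: short form for n < 128, else 0x80 | number of length bytes."""
    if n < 128:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body

def klv_encode(key: bytes, value: bytes) -> bytes:
    """Wrap a value in a Key-Length-Value triplet (16-byte universal label)."""
    assert len(key) == 16, "SMPTE universal labels are 16 bytes"
    return key + ber_encode_length(len(value)) + value

def klv_decode(packet: bytes):
    """Return (key, value, remaining bytes) for the first KLV triplet in packet."""
    key, rest = packet[:16], packet[16:]
    first = rest[0]
    if first < 0x80:
        length, offset = first, 1          # short-form length
    else:
        n = first & 0x7F                   # long form: next n bytes hold the length
        length, offset = int.from_bytes(rest[1:1 + n], "big"), 1 + n
    value = rest[offset:offset + length]
    return key, value, rest[offset + length:]

# Placeholder 16-byte label (NOT a registered SMPTE key) carrying one annotation item.
KEY = bytes(range(16))
packet = klv_encode(KEY, b"track-id=42")
```

Because the length is self-describing, a player that does not understand the annotation key can skip the value and continue, which is what makes the encoding non-destructive for annotation-unaware systems.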

  2. Application of whole slide image markup and annotation for pathologist knowledge capture.

    PubMed

    Campbell, Walter S; Foster, Kirk W; Hinrichs, Steven H

    2013-01-01

    The ability to transfer image markup and annotation data from one scanned image of a slide to a newly acquired image of the same slide within a single vendor platform was investigated. The goal was to study the ability to use image markup and annotation data files as a mechanism to capture and retain pathologist knowledge without retaining the entire whole slide image (WSI) file. Accepted mathematical principles were investigated as a method to overcome variations in scans of the same glass slide and to accurately associate image markup and annotation data across different WSI of the same glass slide. Trilateration was used to link fixed points within the image and slide to the placement of markups and annotations of the image in a metadata file. Variation in markup and annotation placement between WSI of the same glass slide was reduced from over 80 μm to less than 4 μm in the x-axis and from 17 μm to 6 μm in the y-axis (P < 0.025). This methodology allows for the creation of a highly reproducible image library of histopathology images and interpretations for educational and research use.
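
The trilateration idea can be illustrated with a small 2D sketch: given three fixed reference points and a markup's distances to them, subtracting the circle equations pairwise linearizes the problem into a 2x2 system. This is a generic textbook formulation, not the authors' code; the anchor coordinates below are invented.

```python
import math

def trilaterate(p1, p2, p3, r1, r2, r3):
    """Locate a 2D point from its distances to three non-collinear anchor points.

    Subtracting the circle equations pairwise removes the quadratic terms,
    leaving a 2x2 linear system solved here by Cramer's rule.
    """
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    a1, b1 = 2 * (x2 - x1), 2 * (y2 - y1)
    c1 = r1**2 - r2**2 + x2**2 - x1**2 + y2**2 - y1**2
    a2, b2 = 2 * (x3 - x1), 2 * (y3 - y1)
    c2 = r1**2 - r3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a1 * b2 - a2 * b1   # zero if the anchors are collinear
    return (c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det

# Anchors at (0,0), (10,0), (0,10); a markup at (3,4) has these three distances:
x, y = trilaterate((0, 0), (10, 0), (0, 10), 5.0, math.sqrt(65), math.sqrt(45))
```

Storing each markup as distances to fixed points, rather than raw pixel coordinates, is what lets the annotation survive re-scanning: the anchors can be re-detected in the new scan and the markup position re-solved.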

  3. Application of whole slide image markup and annotation for pathologist knowledge capture

    PubMed Central

    Campbell, Walter S.; Foster, Kirk W.; Hinrichs, Steven H.

    2013-01-01

    Objective: The ability to transfer image markup and annotation data from one scanned image of a slide to a newly acquired image of the same slide within a single vendor platform was investigated. The goal was to study the ability to use image markup and annotation data files as a mechanism to capture and retain pathologist knowledge without retaining the entire whole slide image (WSI) file. Methods: Accepted mathematical principles were investigated as a method to overcome variations in scans of the same glass slide and to accurately associate image markup and annotation data across different WSI of the same glass slide. Trilateration was used to link fixed points within the image and slide to the placement of markups and annotations of the image in a metadata file. Results: Variation in markup and annotation placement between WSI of the same glass slide was reduced from over 80 μm to less than 4 μm in the x-axis and from 17 μm to 6 μm in the y-axis (P < 0.025). Conclusion: This methodology allows for the creation of a highly reproducible image library of histopathology images and interpretations for educational and research use. PMID:23599902

  4. Current and future trends in marine image annotation software

    NASA Astrophysics Data System (ADS)

    Gomes-Pereira, Jose Nuno; Auger, Vincent; Beisiegel, Kolja; Benjamin, Robert; Bergmann, Melanie; Bowden, David; Buhl-Mortensen, Pal; De Leo, Fabio C.; Dionísio, Gisela; Durden, Jennifer M.; Edwards, Luke; Friedman, Ariell; Greinert, Jens; Jacobsen-Stout, Nancy; Lerner, Steve; Leslie, Murray; Nattkemper, Tim W.; Sameoto, Jessica A.; Schoening, Timm; Schouten, Ronald; Seager, James; Singh, Hanumant; Soubigou, Olivier; Tojeira, Inês; van den Beld, Inge; Dias, Frederico; Tempera, Fernando; Santos, Ricardo S.

    2016-12-01

    Given the need to describe, analyze and index large quantities of marine imagery data for exploration and monitoring activities, a range of specialized image annotation tools have been developed worldwide. Image annotation, the process of transposing objects or events represented in a video or still image to the semantic level, may involve human interaction and computer-assisted solutions. Marine image annotation software (MIAS) has enabled over 500 publications to date. We review the functioning, application trends and developments by comparing general and advanced features of 23 different tools utilized in underwater image analysis. MIAS requiring human input consist essentially of a graphical user interface with a video player or image browser that recognizes a specific time code or image code, allowing users to log events in a time-stamped (and/or geo-referenced) manner. MIAS differ from similar software in their capability to integrate data associated with video collection, the simplest being the position coordinates of the video recording platform. MIAS have three main capabilities: annotating events in real time, annotating after acquisition, and interacting with a database. These range from simple annotation interfaces to full onboard data management systems with a variety of toolboxes. Advanced packages allow input and display of data from multiple sensors or multiple annotators via intranet or internet. Post-acquisition human-mediated annotation often includes tools for data display and image analysis, e.g., length, area, image segmentation and point count, and in a few cases the possibility of browsing and editing previous dive logs or of analyzing the annotations. The interaction with a database allows automatic integration of annotations from different surveys, repeated and collaborative annotation of shared datasets, and browsing and querying of data. Progress in the field of automated annotation is mostly in post-processing, for stable platforms or still images. Integration into available MIAS is currently limited to semi-automated processes of pixel recognition through computer-vision modules that compile expert-based knowledge. Important topics aiding the choice of a specific software package are outlined, the ideal software is discussed, and future trends are presented.
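
The time-stamped, optionally geo-referenced event logging that these tools share can be sketched in a few lines. This is an illustrative toy, not any of the 23 reviewed packages; the class and field names are invented.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Annotation:
    timecode: float               # seconds into the video (or a frame time code)
    label: str                    # e.g. a taxon, object, or event name
    lat: Optional[float] = None   # platform position, if a navigation feed is available
    lon: Optional[float] = None

class DiveLog:
    """Minimal time-stamped, optionally geo-referenced annotation log."""

    def __init__(self) -> None:
        self.events: List[Annotation] = []

    def log(self, timecode: float, label: str,
            lat: Optional[float] = None, lon: Optional[float] = None) -> None:
        self.events.append(Annotation(timecode, label, lat, lon))

    def between(self, t0: float, t1: float) -> List[Annotation]:
        """Query events by time window, e.g. to align them with a video segment."""
        return [e for e in self.events if t0 <= e.timecode <= t1]
```

A database-backed MIAS would persist such records and merge them across surveys and annotators; the in-memory list here only shows the shape of the data.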

  5. Developing a knowledge base to support the annotation of ultrasound images of ectopic pregnancy.

    PubMed

    Dhombres, Ferdinand; Maurice, Paul; Friszer, Stéphanie; Guilbaud, Lucie; Lelong, Nathalie; Khoshnood, Babak; Charlet, Jean; Perrot, Nicolas; Jauniaux, Eric; Jurkovic, Davor; Jouannic, Jean-Marie

    2017-01-31

    Ectopic pregnancy is a frequent early complication of pregnancy associated with significant rates of morbidity and mortality. The positive diagnosis of this condition is established through transvaginal ultrasound scanning. The timing of diagnosis depends on the operator's expertise in identifying the signs of ectopic pregnancy, which varies dramatically among medical staff with heterogeneous training. Developing decision support systems in this context is expected to improve the identification of these signs and subsequently improve the quality of care. In this article, we present a new knowledge base for ectopic pregnancy, and we demonstrate its use for the annotation of clinical images. The knowledge base is supported by an application ontology, which provides the taxonomy, vocabulary and definitions for 24 types and 81 signs of ectopic pregnancy, 484 anatomical structures and 32 technical elements for image acquisition. The knowledge base provides a sign-centric model of the domain, with relations of signs to ectopic pregnancy types, anatomical structures and technical elements. The evaluation of the ontology and knowledge base received positive feedback from a panel of 17 medical users. Leveraging these semantic resources, we developed an application for the annotation of ultrasound images. Using this application, 6 operators achieved a precision of 0.83 for the identification of signs in 208 ultrasound images corresponding to 35 clinical cases of ectopic pregnancy. We developed a new ectopic pregnancy knowledge base for the annotation of ultrasound images. Its use for the annotation of ultrasound images of ectopic pregnancy showed promising results from the perspective of clinical decision support system development. Other gynecological disorders and fetal anomalies may benefit from our approach.

  6. AggNet: Deep Learning From Crowds for Mitosis Detection in Breast Cancer Histology Images.

    PubMed

    Albarqouni, Shadi; Baur, Christoph; Achilles, Felix; Belagiannis, Vasileios; Demirci, Stefanie; Navab, Nassir

    2016-05-01

    The lack of publicly available ground-truth data has been identified as the major challenge for transferring recent developments in deep learning to the biomedical imaging domain. Though crowdsourcing has enabled annotation of large-scale databases for real-world images, its application for biomedical purposes requires a deeper understanding and, hence, a more precise definition of the actual annotation task. The fact that expert tasks are being outsourced to non-expert users may lead to noisy annotations introducing disagreement between users. Despite crowdsourcing being a valuable resource for learning annotation models, conventional machine-learning methods may have difficulties dealing with noisy annotations during training. In this manuscript, we present a new concept for learning from crowds that handles data aggregation directly as part of the learning process of the convolutional neural network (CNN) via an additional crowdsourcing layer (AggNet). In addition, we present an experimental study on learning from crowds designed to answer the following questions. 1) Can a deep CNN be trained with data collected from crowdsourcing? 2) How should the CNN be adapted to train on multiple types of annotation datasets (ground truth and crowd-based)? 3) How does the choice of annotation and aggregation affect the accuracy? Our experimental setup involved Annot8, a self-implemented web platform based on the Crowdflower API realizing image annotation tasks for a publicly available biomedical image database. Our results give valuable insights into the functionality of deep CNN learning from crowd annotations and confirm the necessity of integrating data aggregation.
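
AggNet folds aggregation into CNN training itself; as a much simpler point of comparison, crowd labels for a single item are often aggregated by a (possibly reliability-weighted) vote. A hedged sketch, with invented annotator names and weights:

```python
def aggregate(crowd_labels, weights=None):
    """Pick one label per item from several annotators.

    crowd_labels: {annotator: label}; weights: optional {annotator: reliability}.
    Ties resolve to the smallest label for determinism.
    """
    weights = weights or {a: 1.0 for a in crowd_labels}
    scores = {}
    for annotator, label in crowd_labels.items():
        scores[label] = scores.get(label, 0.0) + weights.get(annotator, 1.0)
    # sorted() fixes the tie-break order; max() then takes the highest score
    return max(sorted(scores), key=scores.get)
```

The paper's point is precisely that such fixed post-hoc aggregation discards information that a network can exploit when aggregation is learned jointly with the model.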

  7. CellCognition: time-resolved phenotype annotation in high-throughput live cell imaging.

    PubMed

    Held, Michael; Schmitz, Michael H A; Fischer, Bernd; Walter, Thomas; Neumann, Beate; Olma, Michael H; Peter, Matthias; Ellenberg, Jan; Gerlich, Daniel W

    2010-09-01

    Fluorescence time-lapse imaging has become a powerful tool to investigate complex dynamic processes such as cell division or intracellular trafficking. Automated microscopes generate time-resolved imaging data at high throughput, yet tools for quantification of large-scale movie data are largely missing. Here we present CellCognition, a computational framework to annotate complex cellular dynamics. We developed a machine-learning method that combines state-of-the-art classification with hidden Markov modeling for annotation of the progression through morphologically distinct biological states. Incorporation of time information into the annotation scheme was essential to suppress classification noise at state transitions and confusion between different functional states with similar morphology. We demonstrate generic applicability in different assays and perturbation conditions, including a candidate-based RNA interference screen for regulators of mitotic exit in human cells. CellCognition is published as open source software, enabling live-cell imaging-based screening with assays that directly score cellular dynamics.

  8. Secure annotation for medical images based on reversible watermarking in the Integer Fibonacci-Haar transform domain

    NASA Astrophysics Data System (ADS)

    Battisti, F.; Carli, M.; Neri, A.

    2011-03-01

    The increasing use of digital image-based applications is resulting in huge databases that are often difficult to use and prone to misuse and privacy concerns. These issues are especially crucial in medical applications. The most commonly adopted solution is the encryption of both the image and the patient data in separate files that are then linked. This practice is inefficient since, in order to retrieve patient data or analysis details, both files must be decrypted. In this contribution, an alternative solution for secure medical image annotation is presented. The proposed framework is based on the joint use of a key-dependent wavelet transform (the Integer Fibonacci-Haar transform), a secure cryptographic scheme, and a reversible watermarking scheme. The system allows: (i) insertion of the patient data into the encrypted image without requiring knowledge of the original image; (ii) encryption of annotated images without loss of the embedded information; and (iii) recovery of the original image after mark removal, owing to the complete reversibility of the process. Experimental results show the effectiveness of the proposed scheme.

  9. Crowdtruth validation: a new paradigm for validating algorithms that rely on image correspondences.

    PubMed

    Maier-Hein, Lena; Kondermann, Daniel; Roß, Tobias; Mersmann, Sven; Heim, Eric; Bodenstedt, Sebastian; Kenngott, Hannes Götz; Sanchez, Alexandro; Wagner, Martin; Preukschas, Anas; Wekerle, Anna-Laura; Helfert, Stefanie; März, Keno; Mehrabi, Arianeb; Speidel, Stefanie; Stock, Christian

    2015-08-01

    Feature tracking and 3D surface reconstruction are key enabling techniques for computer-assisted minimally invasive surgery. One of the major bottlenecks in training and validation of new algorithms is the lack of large amounts of annotated images that fully capture the wide range of anatomical/scene variance in clinical practice. To address this issue, we propose a novel approach to obtaining large numbers of high-quality reference image annotations at low cost in an extremely short period of time. The concept is based on outsourcing the correspondence search to a crowd of anonymous users from an online community (crowdsourcing) and comprises four stages: (1) feature detection, (2) correspondence search via crowdsourcing, (3) merging multiple annotations per feature by fitting Gaussian finite mixture models, and (4) outlier removal using the result of the clustering as input for a second annotation task. On average, 10,000 annotations were obtained within 24 h at a cost of $100. The annotation of the crowd after clustering and before outlier removal was of expert quality, with a median distance of about 1 pixel to a publicly available reference annotation. The threshold for the outlier removal task directly determines the maximum annotation error, but also the number of points removed. Our concept is a novel and effective method for fast, low-cost and highly accurate correspondence generation that could be adapted to various other applications related to large-scale data annotation in medical image computing and computer-assisted interventions.
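
The paper merges repeated crowd clicks per feature by fitting Gaussian finite mixture models; a deliberately simplified stand-in is to center on the coordinate-wise median, discard clicks beyond a distance threshold, and average the rest. A sketch under those simplifying assumptions (not the authors' method):

```python
def merge_annotations(points, threshold=10.0):
    """Merge repeated crowd clicks for one feature.

    Center on the coordinate-wise median, drop clicks farther than `threshold`
    pixels from it, and return (mean of the kept clicks, kept clicks).
    Assumes at least one click survives the cut.
    """
    xs = sorted(p[0] for p in points)
    ys = sorted(p[1] for p in points)
    mid = len(points) // 2
    cx, cy = xs[mid], ys[mid]           # robust center, insensitive to outliers
    kept = [p for p in points
            if (p[0] - cx) ** 2 + (p[1] - cy) ** 2 <= threshold ** 2]
    n = len(kept)
    return (sum(p[0] for p in kept) / n, sum(p[1] for p in kept) / n), kept
```

As in the paper's pipeline, the threshold trades off the maximum residual annotation error against the number of clicks discarded.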

  10. High-fidelity data embedding for image annotation.

    PubMed

    He, Shan; Kirovski, Darko; Wu, Min

    2009-02-01

    High fidelity is a demanding requirement for data hiding, especially for images with artistic or medical value. This correspondence proposes a high-fidelity image watermarking method for annotation with robustness to moderate distortion. To achieve high fidelity of the embedded image, we introduce a visual perception model that aims at quantifying the local tolerance to noise for arbitrary imagery. Based on this model, we embed two kinds of watermarks: a pilot watermark that indicates the existence of the watermark and an information watermark that conveys a payload of several dozen bits. The objective is to embed 32 bits of metadata into a single image in such a way that it is robust to JPEG compression and cropping. We demonstrate the effectiveness of the visual model and the application of the proposed annotation technology using a database of challenging photographic and medical images that contain a large amount of smooth regions.
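
The proposed scheme uses a perceptual model and is robust to JPEG compression and cropping; neither property holds for the fragile least-significant-bit sketch below, which only illustrates the bare idea of carrying a 32-bit payload in pixel values.

```python
def embed_bits(pixels, payload: int, nbits: int = 32):
    """Write `nbits` of payload, least-significant bit first, into pixel LSBs."""
    out = list(pixels)
    for i in range(nbits):
        bit = (payload >> i) & 1
        out[i] = (out[i] & ~1) | bit    # clear the LSB, then set it to the payload bit
    return out

def extract_bits(pixels, nbits: int = 32) -> int:
    """Recover the payload from the pixel LSBs written by embed_bits."""
    payload = 0
    for i in range(nbits):
        payload |= (pixels[i] & 1) << i
    return payload
```

Each carrier pixel changes by at most one intensity level; the paper's contribution is to shape a much larger, robust payload under a perceptual distortion budget instead.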

  11. AutoBD: Automated Bi-Level Description for Scalable Fine-Grained Visual Categorization.

    PubMed

    Yao, Hantao; Zhang, Shiliang; Yan, Chenggang; Zhang, Yongdong; Li, Jintao; Tian, Qi

    Compared with traditional image classification, fine-grained visual categorization is a more challenging task because it aims to classify objects belonging to the same species, e.g., hundreds of kinds of birds or cars. In the past several years, researchers have made many achievements on this topic. However, most approaches depend heavily on artificial annotations, e.g., bounding boxes, part annotations, and so on. The requirement for artificial annotations largely hinders scalability and application. Motivated to remove such dependence, this paper proposes a robust and discriminative visual description named Automated Bi-level Description (AutoBD). "Bi-level" denotes two complementary visual descriptions, at the part level and the object level. AutoBD is "automated" because it requires only the image-level labels of training images and does not need any annotations for testing images. Compared with part annotations labeled by humans, image-level labels can be easily acquired, which makes AutoBD suitable for large-scale visual categorization. Specifically, the part-level description is extracted by identifying the local region that saliently represents the visual distinctiveness. The object-level description is extracted from object bounding boxes generated with a co-localization algorithm. Although using only image-level labels, AutoBD outperforms recent studies on two public benchmarks: classification accuracy reaches 81.6% on CUB-200-2011 and 88.9% on Car-196. On the large-scale Birdsnap data set, AutoBD achieves an accuracy of 68%, which is the best performance to our knowledge.

  12. Learning to rank image tags with limited training examples.

    PubMed

    Feng, Songhe; Feng, Zheyun; Jin, Rong

    2015-04-01

    With the increasing number of images available on social media, image annotation has emerged as an important research topic due to its applications in image matching and retrieval. Most studies cast image annotation into a multilabel classification problem. The main shortcoming of this approach is that it requires a large number of training images with clean and complete annotations in order to learn a reliable model for tag prediction. We address this limitation by developing a novel approach that combines the strength of tag ranking with the power of matrix recovery. Instead of having to make a binary decision for each tag, our approach ranks tags in the descending order of their relevance to the given image, significantly simplifying the problem. In addition, the proposed method aggregates the prediction models for different tags into a matrix, and casts tag ranking into a matrix recovery problem. It introduces the matrix trace norm to explicitly control the model complexity, so that a reliable prediction model can be learned for tag ranking even when the tag space is large and the number of training images is limited. Experiments on multiple well-known image data sets demonstrate the effectiveness of the proposed framework for tag ranking compared with the state-of-the-art approaches for image annotation and tag ranking.

  13. On combining image-based and ontological semantic dissimilarities for medical image retrieval applications

    PubMed Central

    Kurtz, Camille; Depeursinge, Adrien; Napel, Sandy; Beaulieu, Christopher F.; Rubin, Daniel L.

    2014-01-01

    Computer-assisted image retrieval applications can assist radiologists by identifying similar images in archives as a means to providing decision support. In the classical case, images are described using low-level features extracted from their contents, and an appropriate distance is used to find the best matches in the feature space. However, using low-level image features to fully capture the visual appearance of diseases is challenging and the semantic gap between these features and the high-level visual concepts in radiology may impair the system performance. To deal with this issue, the use of semantic terms to provide high-level descriptions of radiological image contents has recently been advocated. Nevertheless, most of the existing semantic image retrieval strategies are limited by two factors: they require manual annotation of the images using semantic terms and they ignore the intrinsic visual and semantic relationships between these annotations during the comparison of the images. Based on these considerations, we propose an image retrieval framework based on semantic features that relies on two main strategies: (1) automatic “soft” prediction of ontological terms that describe the image contents from multi-scale Riesz wavelets and (2) retrieval of similar images by evaluating the similarity between their annotations using a new term dissimilarity measure, which takes into account both image-based and ontological term relations. The combination of these strategies provides a means of accurately retrieving similar images in databases based on image annotations and can be considered as a potential solution to the semantic gap problem. We validated this approach in the context of the retrieval of liver lesions from computed tomographic (CT) images and annotated with semantic terms of the RadLex ontology. 
    The relevance of the retrieval results was assessed using two protocols: evaluation relative to a dissimilarity reference standard defined for pairs of images on a 25-image dataset, and evaluation relative to the diagnoses of the retrieved images on a 72-image dataset. A normalized discounted cumulative gain (NDCG) score of more than 0.92 was obtained with the first protocol, while AUC scores of more than 0.77 were obtained with the second protocol. This automated approach could provide real-time decision support to radiologists by showing them similar images with associated diagnoses and, where available, responses to therapies. PMID:25036769
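
The NDCG score used in the first evaluation protocol discounts the relevance of each retrieved image by its rank position and normalizes by the best achievable ordering. A standard implementation sketch (generic, not the authors' evaluation code):

```python
import math

def ndcg(relevances, k=None):
    """Normalized discounted cumulative gain for a ranked list of relevance grades."""
    k = k or len(relevances)

    def dcg(rels):
        # graded relevance discounted by log2 of the 1-indexed rank position
        return sum(r / math.log2(i + 2) for i, r in enumerate(rels[:k]))

    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0
```

A score of 1.0 means the retrieved images are already ordered by decreasing relevance; misplacing a relevant image lower in the ranking reduces the score.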

  14. Deformable image registration using convolutional neural networks

    NASA Astrophysics Data System (ADS)

    Eppenhof, Koen A. J.; Lafarge, Maxime W.; Moeskops, Pim; Veta, Mitko; Pluim, Josien P. W.

    2018-03-01

    Deformable image registration can be time-consuming and often needs extensive parameterization to perform well on a specific application. We present a step towards a registration framework based on a three-dimensional convolutional neural network. The network directly learns transformations between pairs of three-dimensional images. The outputs of the network are three maps for the x, y, and z components of a thin plate spline transformation grid. The network is trained on synthetic random transformations, which are applied to a small set of representative images for the desired application. Training therefore does not require manually annotated ground truth deformation information. The methodology is demonstrated on public data sets of inspiration-expiration lung CT image pairs, which come with annotated corresponding landmarks for evaluation of the registration accuracy. Advantages of this methodology are its fast registration times and its minimal parameterization.

  15. Linking DICOM pixel data with radiology reports using automatic semantic annotation

    NASA Astrophysics Data System (ADS)

    Pathak, Sayan D.; Kim, Woojin; Munasinghe, Indeera; Criminisi, Antonio; White, Steve; Siddiqui, Khan

    2012-02-01

    Improved access to DICOM studies for both physicians and patients is changing the ways medical imaging studies are visualized and interpreted beyond the confines of radiologists' PACS workstations. While radiologists are trained in image viewing and interpretation, a non-radiologist physician relies on the radiologists' reports. Consequently, patients have historically been informed about their imaging findings via oral communication with their physicians, even though clinical studies have shown that patients respond to a physician's advice significantly better when shown their own actual data. Our previous work on automated semantic annotation of DICOM Computed Tomography (CT) images allows us to link the radiology report with the corresponding images, bridging the gap between image data and the human-interpreted textual description of the corresponding imaging studies. The mapping of radiology text is facilitated by a natural language processing (NLP)-based search application. When combined with our automated semantic annotation of images, it enables navigation in large DICOM studies by clicking hyperlinked text in the radiology reports. An added advantage of using semantic annotation is the ability to render organs at their default window-level settings, eliminating another barrier to image sharing and distribution. We believe such approaches would enable consumers to access their imaging data and navigate it in an informed manner.

  16. Augmented Reality Technology Using Microsoft HoloLens in Anatomic Pathology.

    PubMed

    Hanna, Matthew G; Ahmed, Ishtiaque; Nine, Jeffrey; Prajapati, Shyam; Pantanowitz, Liron

    2018-05-01

    Context: Augmented reality (AR) devices such as the Microsoft HoloLens have not been widely used in the medical field. Objective: To test the HoloLens for clinical and nonclinical applications in pathology. Design: A Microsoft HoloLens was tested for virtual annotation during autopsy, viewing 3D gross and microscopic pathology specimens, navigating whole slide images, telepathology, as well as real-time pathology-radiology correlation. Results: Pathology residents performing an autopsy while wearing the HoloLens were remotely instructed with real-time diagrams, annotations, and voice instruction. 3D-scanned gross pathology specimens could be viewed as holograms and easily manipulated. Telepathology was supported during gross examination and at the time of intraoperative consultation, allowing users to remotely access a pathologist for guidance and to virtually annotate areas of interest on specimens in real time. The HoloLens permitted radiographs to be coregistered on gross specimens, thereby making it easier to locate important pathologic findings. The HoloLens also allowed easy viewing and navigation of whole slide images using an AR workstation, including multiple coregistered tissue sections, facilitating volumetric pathology evaluation. Conclusions: The HoloLens is a novel AR tool with multiple clinical and nonclinical applications in pathology. The device was comfortable to wear, easy to use, provided sufficient computing power, and supported high-resolution imaging. It was useful for autopsy, gross and microscopic examination, and ideally suited for digital pathology. Unique applications include remote supervision and annotation, 3D image viewing and manipulation, telepathology in a mixed-reality environment, and real-time pathology-radiology correlation.

  17. Modeling loosely annotated images using both given and imagined annotations

    NASA Astrophysics Data System (ADS)

    Tang, Hong; Boujemaa, Nozha; Chen, Yunhao; Deng, Lei

    2011-12-01

    In this paper, we present an approach to learning latent semantic analysis models from loosely annotated images for automatic image annotation and indexing. The given annotation in training images is loose for two reasons: (1) ambiguous correspondences between visual features and annotated keywords; (2) incomplete lists of annotated keywords. The second reason motivates us to enrich the incomplete annotation in a simple way before learning a topic model. In particular, some "imagined" keywords are poured into the incomplete annotation by measuring similarity between keywords in terms of their co-occurrence. Then, both given and imagined annotations are employed to learn probabilistic topic models for automatically annotating new images. We conduct experiments on two image databases (Corel and ESP) coupled with their loose annotations, and compare the proposed method with state-of-the-art discrete annotation methods. The proposed method improves word-driven probabilistic latent semantic analysis (PLSA-words) to a performance comparable with the best discrete annotation method, while retaining a merit of PLSA-words: a wider semantic range.
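
The enrichment step, pouring "imagined" keywords into an incomplete annotation based on keyword co-occurrence, can be sketched with a Dice-style similarity over co-occurrence counts. This is an illustrative simplification, not the paper's exact measure; the toy vocabulary is invented.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_similarity(annotations):
    """Build a Dice-style keyword similarity from per-image keyword lists."""
    single, pair = Counter(), Counter()
    for keywords in annotations:
        keywords = set(keywords)
        single.update(keywords)
        pair.update(frozenset(p) for p in combinations(sorted(keywords), 2))

    def sim(a, b):
        co = pair[frozenset((a, b))]    # images annotated with both keywords
        return 2 * co / (single[a] + single[b]) if co else 0.0

    return sim

def enrich(keywords, sim, vocabulary, threshold=0.4):
    """Add 'imagined' keywords that co-occur strongly with the given ones."""
    given = set(keywords)
    imagined = {w for w in vocabulary if w not in given
                and any(sim(k, w) >= threshold for k in given)}
    return given | imagined
```

The enriched keyword set, rather than the incomplete original, is then what feeds the topic model.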

  18. Informatics in radiology: An open-source and open-access cancer biomedical informatics grid annotation and image markup template builder.

    PubMed

    Mongkolwat, Pattanasak; Channin, David S; Kleper, Vladimir; Rubin, Daniel L

    2012-01-01

    In a routine clinical environment or clinical trial, a case report form or structured reporting template can be used to quickly generate uniform and consistent reports. Annotation and image markup (AIM), a project supported by the National Cancer Institute's cancer biomedical informatics grid, can be used to collect information for a case report form or structured reporting template. AIM is designed to store, in a single information source, (a) the description of pixel data with use of markups or graphical drawings placed on the image, (b) calculation results (which may or may not be directly related to the markups), and (c) supplemental information. To facilitate the creation of AIM annotations with data entry templates, an AIM template schema and an open-source template creation application were developed to assist clinicians, image researchers, and designers of clinical trials to quickly create a set of data collection items, thereby ultimately making image information more readily accessible.

  19. Informatics in Radiology: An Open-Source and Open-Access Cancer Biomedical Informatics Grid Annotation and Image Markup Template Builder

    PubMed Central

    Mongkolwat, Pattanasak; Channin, David S.; Kleper, Vladimir; Rubin, Daniel L.

    2012-01-01

    In a routine clinical environment or clinical trial, a case report form or structured reporting template can be used to quickly generate uniform and consistent reports. Annotation and Image Markup (AIM), a project supported by the National Cancer Institute’s cancer Biomedical Informatics Grid, can be used to collect information for a case report form or structured reporting template. AIM is designed to store, in a single information source, (a) the description of pixel data with use of markups or graphical drawings placed on the image, (b) calculation results (which may or may not be directly related to the markups), and (c) supplemental information. To facilitate the creation of AIM annotations with data entry templates, an AIM template schema and an open-source template creation application were developed to assist clinicians, image researchers, and designers of clinical trials to quickly create a set of data collection items, thereby ultimately making image information more readily accessible. © RSNA, 2012 PMID:22556315

  20. Evaluation of web-based annotation of ophthalmic images for multicentric clinical trials.

    PubMed

    Chalam, K V; Jain, P; Shah, V A; Shah, Gaurav Y

    2006-06-01

    An Internet browser-based annotation system can be used to identify and describe features in digitized retinal images for multicentric clinical trials in real time. In this web-based annotation system, the user employs a mouse to draw and create annotations on a transparent layer that encapsulates the observations and interpretations of a specific image. Multiple annotation layers may be overlaid on a single image. These layers may correspond to annotations by different users of the same image, or to annotations of a temporal sequence of images of a disease process over a period of time. In addition, geometric properties of annotated figures may be computed and measured. The annotations are stored in a central repository database on a server, from which they can be retrieved by multiple users in real time. This system facilitates objective evaluation of digital images and comparison of double-blind readings of digital photographs, with an identifiable audit trail. Annotation of ophthalmic images allowed clinically feasible and useful interpretation to track properties of an area of fundus pathology. This provided an objective method to monitor properties of pathologies over time, an essential component of multicentric clinical trials. The annotation system also allowed users to view stereoscopic image pairs. This web-based annotation system is valuable for monitoring patient care in multicentric clinical trials, telemedicine, teaching, and routine clinical settings.

  1. Collection of sequential imaging events for research in breast cancer screening

    NASA Astrophysics Data System (ADS)

    Patel, M. N.; Young, K.; Halling-Brown, M. D.

    2016-03-01

    Due to the huge amount of research involving medical images, there is a widely accepted need for comprehensive collections of medical images to be made available for research. This demand led to the design and implementation of a flexible image repository, which retrospectively collects images and data from multiple sites throughout the UK. The OPTIMAM Medical Image Database (OMI-DB) was created to provide a centralized, fully annotated dataset for research. The database contains both processed and unprocessed images, associated data, annotations, and expert-determined ground truths. Collection has been ongoing for over three years, providing the opportunity to collect sequential imaging events. Extensive alterations to the identification, collection, processing, and storage arms of the system have been undertaken to support the introduction of sequential events, including interval cancers. These updates to the collection systems allow the acquisition of many more images and, more importantly, allow one to build on the existing high-dimensional data stored in the OMI-DB. A research dataset of this scale, which includes original normal and subsequent malignant cases along with expert-derived and clinical annotations, is currently unique. These data provide a powerful resource for future research and have initiated new research projects, among which is the quantification of normal cases by applying a large number of quantitative imaging features, with a priori knowledge that these cases eventually develop a malignancy. This paper describes extensions to the OMI-DB collection systems and tools and discusses prospective applications of such a rich dataset for future research.

  2. Automatic medical image annotation and keyword-based image retrieval using relevance feedback.

    PubMed

    Ko, Byoung Chul; Lee, JiHyeon; Nam, Jae-Yeal

    2012-08-01

    This paper presents a novel multiple-keyword annotation method for medical images, keyword-based medical image retrieval, and a relevance feedback method for enhancing image retrieval performance. For semantic keyword annotation, this study proposes a novel medical image classification method combining local wavelet-based center-symmetric local binary patterns with random forests. For keyword-based image retrieval, our retrieval system uses a confidence score that is assigned to each annotated keyword by combining the probabilities of the random forests with a predefined body-relation graph. To overcome the limitations of keyword-based image retrieval, we combine our image retrieval system with a relevance feedback mechanism based on visual features and a pattern classifier. Compared with other annotation and relevance feedback algorithms, the proposed method shows both improved annotation performance and accurate retrieval results.
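
A minimal sketch of such a confidence score, assuming a simple product rule for combining classifier probabilities with a body-relation prior; the combination rule, keyword names, and values below are illustrative, not from the source.

```python
def keyword_confidence(rf_probs, relation_prior):
    """Combine random-forest keyword probabilities with a prior derived
    from a body-relation graph (anatomically related keywords reinforce
    each other, implausible ones are suppressed). Product rule assumed."""
    scores = {}
    for keyword, p in rf_probs.items():
        prior = relation_prior.get(keyword, 1.0)
        scores[keyword] = p * prior
    total = sum(scores.values())
    return {k: v / total for k, v in scores.items()}

rf_probs = {"skull": 0.4, "hand": 0.35, "brain": 0.25}
# 'hand' is anatomically implausible next to the other candidates.
relation_prior = {"skull": 1.0, "hand": 0.2, "brain": 1.0}
conf = keyword_confidence(rf_probs, relation_prior)
print(max(conf, key=conf.get))  # 'skull' wins; 'hand' is suppressed
```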

  3. OntoVIP: an ontology for the annotation of object models used for medical image simulation.

    PubMed

    Gibaud, Bernard; Forestier, Germain; Benoit-Cattin, Hugues; Cervenansky, Frédéric; Clarysse, Patrick; Friboulet, Denis; Gaignard, Alban; Hugonnard, Patrick; Lartizien, Carole; Liebgott, Hervé; Montagnat, Johan; Tabary, Joachim; Glatard, Tristan

    2014-12-01

    This paper describes the creation of a comprehensive conceptualization of object models used in medical image simulation, suitable for major imaging modalities and simulators. The goal is to create an application ontology that can be used to annotate the models in a repository integrated in the Virtual Imaging Platform (VIP), to facilitate their sharing and reuse. Annotations make the anatomical, physiological and pathophysiological content of the object models explicit. In such an interdisciplinary context, we chose to rely on a common integration framework provided by a foundational ontology that facilitates the consistent integration of the various modules extracted from several existing ontologies, i.e., FMA, PATO, MPATH, RadLex and ChEBI. Emphasis is put on the methodology for achieving this extraction and integration. The most salient aspects of the ontology are presented, especially the organization in model layers, as well as its use to browse and query the model repository. Copyright © 2014 Elsevier Inc. All rights reserved.

  4. Automatic annotation of histopathological images using a latent topic model based on non-negative matrix factorization

    PubMed Central

    Cruz-Roa, Angel; Díaz, Gloria; Romero, Eduardo; González, Fabio A.

    2011-01-01

    Histopathological images are an important resource for clinical diagnosis and biomedical research. From an image understanding point of view, the automatic annotation of these images is a challenging problem. This paper presents a new method for automatic histopathological image annotation based on three complementary strategies: first, a part-based image representation, called the bag of features, which takes advantage of the natural redundancy of histopathological images for capturing the fundamental patterns of biological structures; second, a latent topic model, based on non-negative matrix factorization, which captures the high-level visual patterns hidden in the image; and third, a probabilistic annotation model that links the visual appearance of morphological and architectural features to 10 histopathological image annotations. The method was evaluated using 1,604 annotated images of skin tissues, which included normal and pathological architectural and morphological features, obtaining a recall of 74% and a precision of 50%, improving on a baseline annotation method based on support vector machines by 64% and 24%, respectively. PMID:22811960
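
Once a non-negative factorization has been learned, the annotation step reduces to mixing per-topic label probabilities with an image's topic weights, p(label|image) = Σ_t p(label|t) p(t|image). A minimal sketch, with invented topic and label values rather than the paper's trained model:

```python
def normalize(row):
    s = sum(row)
    return [x / s for x in row] if s else row

def annotate(topic_weights, label_given_topic, top_k=2):
    """Rank labels for one image from its non-negative topic weights
    (e.g. one row of the NMF coefficient matrix)."""
    p_topic = normalize(topic_weights)
    labels = {}
    for t, p_t in enumerate(p_topic):
        for label, p_l in label_given_topic[t].items():
            labels[label] = labels.get(label, 0.0) + p_t * p_l
    return sorted(labels, key=labels.get, reverse=True)[:top_k]

# Two invented latent topics: "epidermis-like" and "inflammation-like".
label_given_topic = [
    {"epidermis": 0.7, "keratin": 0.3},
    {"inflammation": 0.8, "lymphocyte": 0.2},
]
print(annotate([3.0, 1.0], label_given_topic))
```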

  5. The caBIG annotation and image Markup project.

    PubMed

    Channin, David S; Mongkolwat, Pattanasak; Kleper, Vladimir; Sepukar, Kastubh; Rubin, Daniel L

    2010-04-01

    Image annotation and markup are at the core of medical interpretation in both the clinical and the research setting. Digital medical images are managed with the DICOM standard format. While DICOM contains a large amount of metadata about who acquired the image, where, and how, DICOM says little about the content or meaning of the pixel data. An image annotation is the explanatory or descriptive information about the pixel data of an image that is generated by a human or machine observer. An image markup is the set of graphical symbols placed over the image to depict an annotation. While DICOM is the standard for medical image acquisition, manipulation, transmission, storage, and display, there are no standards for image annotation and markup. Many systems expect annotations to be reported verbally, while markups are stored in graphical overlays or proprietary formats. This makes it difficult to extract and compute with either of them. The goal of the Annotation and Image Markup (AIM) project is to develop a mechanism for modeling, capturing, and serializing image annotation and markup data that can be adopted as a standard by the medical imaging community. The AIM project produces both human- and machine-readable artifacts. This paper describes the AIM information model, schemas, software libraries, and tools so as to prepare researchers and developers for their use of AIM.

  6. SurfaceSlide: a multitouch digital pathology platform.

    PubMed

    Wang, Yinhai; Williamson, Kate E; Kelly, Paul J; James, Jacqueline A; Hamilton, Peter W

    2012-01-01

    Digital pathology provides a digital environment for the management and interpretation of pathological images and associated data. It is becoming increasingly popular to use modern computer-based tools and applications in pathology education, tissue-based research, and clinical diagnosis. Uptake of this new technology is stymied by its single-user orientation and its prerequisite, cumbersome combination of mouse and keyboard for navigation and annotation. In this study we developed SurfaceSlide, a dedicated viewing platform which enables the navigation and annotation of gigapixel digitised pathological images using fingertip touch. SurfaceSlide was developed using the Microsoft Surface, a 30-inch multitouch tabletop computing platform. SurfaceSlide users can perform direct panning and zooming operations on digitised slide images, which are downloaded onto the Microsoft Surface platform from a remote server on demand. Users can also draw annotations and key in text using an on-screen virtual keyboard. We also developed a smart caching protocol which caches the regions surrounding a field of view at multiple resolutions, providing a smooth and vivid user experience and reducing the delay for image downloading from the internet. We compared the usability of SurfaceSlide against the Aperio ImageScope and PathXL online viewers. SurfaceSlide is intuitive, fast and easy to use, and represents the most direct, effective and intimate human-digital slide interaction experience. It is expected that SurfaceSlide will significantly enhance digital pathology tools and applications in education and clinical practice.
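
The idea behind such a smart caching protocol can be sketched as an LRU tile cache that prefetches the neighbouring field of view plus a coarser zoom level; this is an illustrative sketch of the general technique, not the SurfaceSlide implementation.

```python
from collections import OrderedDict

class TileCache:
    """LRU tile cache that, on each request, also warms the eight
    neighbouring tiles and the same region one zoom level out, so that
    panning and zooming stay smooth."""

    def __init__(self, fetch, capacity=256):
        self.fetch = fetch          # fetch(level, x, y) -> tile data
        self.capacity = capacity
        self.cache = OrderedDict()

    def _load(self, key):
        if key in self.cache:
            self.cache.move_to_end(key)
        else:
            self.cache[key] = self.fetch(*key)
            if len(self.cache) > self.capacity:
                self.cache.popitem(last=False)  # evict least recent
        return self.cache[key]

    def get(self, level, x, y):
        tile = self._load((level, x, y))
        # Prefetch the surrounding field of view and a coarser level.
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                if (dx, dy) != (0, 0) and x + dx >= 0 and y + dy >= 0:
                    self._load((level, x + dx, y + dy))
        if level > 0:
            self._load((level - 1, x // 2, y // 2))
        return tile

fetched = []
cache = TileCache(lambda l, x, y: fetched.append((l, x, y)) or (l, x, y))
cache.get(1, 4, 4)
print(len(fetched))  # one request warmed 10 tiles
```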

  7. SurfaceSlide: A Multitouch Digital Pathology Platform

    PubMed Central

    Wang, Yinhai; Williamson, Kate E.; Kelly, Paul J.; James, Jacqueline A.; Hamilton, Peter W.

    2012-01-01

    Background Digital pathology provides a digital environment for the management and interpretation of pathological images and associated data. It is becoming increasingly popular to use modern computer-based tools and applications in pathology education, tissue-based research, and clinical diagnosis. Uptake of this new technology is stymied by its single-user orientation and its prerequisite, cumbersome combination of mouse and keyboard for navigation and annotation. Methodology In this study we developed SurfaceSlide, a dedicated viewing platform which enables the navigation and annotation of gigapixel digitised pathological images using fingertip touch. SurfaceSlide was developed using the Microsoft Surface, a 30-inch multitouch tabletop computing platform. SurfaceSlide users can perform direct panning and zooming operations on digitised slide images, which are downloaded onto the Microsoft Surface platform from a remote server on demand. Users can also draw annotations and key in text using an on-screen virtual keyboard. We also developed a smart caching protocol which caches the regions surrounding a field of view at multiple resolutions, providing a smooth and vivid user experience and reducing the delay for image downloading from the internet. We compared the usability of SurfaceSlide against the Aperio ImageScope and PathXL online viewers. Conclusion SurfaceSlide is intuitive, fast and easy to use, and represents the most direct, effective and intimate human–digital slide interaction experience. It is expected that SurfaceSlide will significantly enhance digital pathology tools and applications in education and clinical practice. PMID:22292040

  8. Fuzzy Emotional Semantic Analysis and Automated Annotation of Scene Images

    PubMed Central

    Cao, Jianfang; Chen, Lichao

    2015-01-01

    With the advances in electronic and imaging techniques, the production of digital images has rapidly increased, and the extraction and automated annotation of emotional semantics implied by images have become issues that must be urgently addressed. To better simulate human subjectivity and ambiguity for understanding scene images, the current study proposes an emotional semantic annotation method for scene images based on fuzzy set theory. A fuzzy membership degree was calculated to describe the emotional degree of a scene image and was implemented using the Adaboost algorithm and a back-propagation (BP) neural network. The automated annotation method was trained and tested using scene images from the SUN Database. The annotation results were then compared with those based on artificial annotation. Our method showed an annotation accuracy rate of 91.2% for basic emotional values and 82.4% after extended emotional values were added, which correspond to increases of 5.5% and 8.9%, respectively, compared with the results from using a single BP neural network algorithm. Furthermore, the retrieval accuracy rate based on our method reached approximately 89%. This study attempts to lay a solid foundation for the automated emotional semantic annotation of more types of images and therefore is of practical significance. PMID:25838818
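
A fuzzy membership degree of the kind described above can be illustrated with a trapezoidal membership function; the emotion categories, scale values, and "warmth" score below are invented for the example, not the paper's trained parameters.

```python
def trapezoid_membership(x, a, b, c, d):
    """Trapezoidal fuzzy membership: 0 outside (a, d), 1 on [b, c],
    linear on the slopes in between."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    return (x - a) / (b - a) if x < b else (d - x) / (d - c)

# Map a hypothetical "warmth" score in [0, 1] to fuzzy emotion degrees.
emotions = {
    "gloomy":   (0.0, 0.0, 0.2, 0.4),
    "peaceful": (0.2, 0.4, 0.6, 0.8),
    "joyful":   (0.6, 0.8, 1.0, 1.0),
}
score = 0.3  # an ambiguous image: partly gloomy, partly peaceful
degrees = {e: trapezoid_membership(score, *p) for e, p in emotions.items()}
print(degrees)
```

Unlike a hard classifier, the fuzzy degrees let one image belong to two emotional categories at once, which is the ambiguity the method aims to capture.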

  9. DEVA: An extensible ontology-based annotation model for visual document collections

    NASA Astrophysics Data System (ADS)

    Jelmini, Carlo; Marchand-Maillet, Stephane

    2003-01-01

    The description of visual documents is a fundamental aspect of any efficient information management system, but the process of manually annotating large collections of documents is tedious and far from perfect. The need for a generic and extensible annotation model therefore arises. In this paper, we present DEVA, an open, generic and expressive multimedia annotation framework. DEVA is an extension of the Dublin Core specification. The model can represent the semantic content of any visual document. It is described in the ontology language DAML+OIL and can easily be extended with external specialized ontologies, adapting the vocabulary to the given application domain. In parallel, we present the Magritte annotation tool, an early prototype that validates the DEVA features. Magritte allows users to manually annotate image collections. It is designed with a modular and extensible architecture, which enables the user to dynamically adapt the user interface to specialized ontologies merged into DEVA.

  10. Computer systems for annotation of single molecule fragments

    DOEpatents

    Schwartz, David Charles; Severin, Jessica

    2016-07-19

    There are provided computer systems for visualizing and annotating single molecule images. Annotation systems in accordance with this disclosure allow a user to mark and annotate single molecules of interest and their restriction enzyme cut sites, thereby determining the restriction fragments of single nucleic acid molecules. The markings and annotations may be automatically generated by the system in certain embodiments, and they may be overlaid translucently onto the single molecule images. An image caching system may be implemented in the computer annotation systems to reduce image processing time. The annotation systems include one or more connectors connecting to one or more databases capable of storing single molecule data as well as other biomedical data. Such a diverse array of data can be retrieved and used to validate the markings and annotations. The annotation systems may be implemented and deployed over a computer network. They may be ergonomically optimized to facilitate user interactions.

  11. Multivendor Spectral-Domain Optical Coherence Tomography Dataset, Observer Annotation Performance Evaluation, and Standardized Evaluation Framework for Intraretinal Cystoid Fluid Segmentation.

    PubMed

    Wu, Jing; Philip, Ana-Maria; Podkowinski, Dominika; Gerendas, Bianca S; Langs, Georg; Simader, Christian; Waldstein, Sebastian M; Schmidt-Erfurth, Ursula M

    2016-01-01

    Development of image analysis and machine learning methods for segmentation of clinically significant pathology in retinal spectral-domain optical coherence tomography (SD-OCT), used in disease detection and prediction, is limited by the availability of expertly annotated reference data. Retinal segmentation methods use datasets that either are not publicly available, come from only one device, or use different evaluation methodologies, making them difficult to compare. We therefore present and evaluate a reference dataset annotated by multiple experts for the problem of intraretinal cystoid fluid (IRF) segmentation, a key indicator in exudative macular disease. In addition, a standardized framework for segmentation accuracy evaluation, applicable to other pathological structures, is presented. Integral to this work is the dataset used, which must be fit for purpose for IRF segmentation algorithm training and testing. We describe here a multivendor dataset comprising 30 scans. Each OCT scan for system training has been annotated by multiple graders using a proprietary system. Evaluation of the intergrader annotations shows good correlation, making the reproducibly annotated scans suitable for the training and validation of image processing and machine learning based segmentation methods. The dataset will be made publicly available in the form of a segmentation Grand Challenge.
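
A standardized accuracy evaluation of this kind typically rests on an overlap measure such as the Dice coefficient; the sketch below, with hypothetical grader masks, shows how intergrader agreement might be scored under that assumption (it is not the paper's exact framework).

```python
from itertools import combinations

def dice(mask_a, mask_b):
    """Dice similarity between two binary masks given as sets of
    (row, col) pixel coordinates; 1.0 means perfect overlap."""
    if not mask_a and not mask_b:
        return 1.0
    return 2 * len(mask_a & mask_b) / (len(mask_a) + len(mask_b))

def pairwise_agreement(masks):
    """Mean Dice over all grader pairs: a simple intergrader score."""
    scores = [dice(a, b) for a, b in combinations(masks, 2)]
    return sum(scores) / len(scores)

# Three hypothetical graders outlining the same cyst.
g1 = {(0, 0), (0, 1), (1, 0), (1, 1)}
g2 = {(0, 0), (0, 1), (1, 0), (1, 1)}
g3 = {(0, 1), (1, 0), (1, 1), (2, 1)}
print(round(pairwise_agreement([g1, g2, g3]), 3))
```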

  12. Multivendor Spectral-Domain Optical Coherence Tomography Dataset, Observer Annotation Performance Evaluation, and Standardized Evaluation Framework for Intraretinal Cystoid Fluid Segmentation

    PubMed Central

    Wu, Jing; Philip, Ana-Maria; Podkowinski, Dominika; Gerendas, Bianca S.; Langs, Georg; Simader, Christian

    2016-01-01

    Development of image analysis and machine learning methods for segmentation of clinically significant pathology in retinal spectral-domain optical coherence tomography (SD-OCT), used in disease detection and prediction, is limited by the availability of expertly annotated reference data. Retinal segmentation methods use datasets that either are not publicly available, come from only one device, or use different evaluation methodologies, making them difficult to compare. We therefore present and evaluate a reference dataset annotated by multiple experts for the problem of intraretinal cystoid fluid (IRF) segmentation, a key indicator in exudative macular disease. In addition, a standardized framework for segmentation accuracy evaluation, applicable to other pathological structures, is presented. Integral to this work is the dataset used, which must be fit for purpose for IRF segmentation algorithm training and testing. We describe here a multivendor dataset comprising 30 scans. Each OCT scan for system training has been annotated by multiple graders using a proprietary system. Evaluation of the intergrader annotations shows good correlation, making the reproducibly annotated scans suitable for the training and validation of image processing and machine learning based segmentation methods. The dataset will be made publicly available in the form of a segmentation Grand Challenge. PMID:27579177

  13. Ontology-based image navigation: exploring 3.0-T MR neurography of the brachial plexus using AIM and RadLex.

    PubMed

    Wang, Kenneth C; Salunkhe, Aditya R; Morrison, James J; Lee, Pearlene P; Mejino, José L V; Detwiler, Landon T; Brinkley, James F; Siegel, Eliot L; Rubin, Daniel L; Carrino, John A

    2015-01-01

    Disorders of the peripheral nervous system have traditionally been evaluated using clinical history, physical examination, and electrodiagnostic testing. In selected cases, imaging modalities such as magnetic resonance (MR) neurography may help further localize or characterize abnormalities associated with peripheral neuropathies, and the clinical importance of such techniques is increasing. However, MR image interpretation with respect to peripheral nerve anatomy and disease often presents a diagnostic challenge because the relevant knowledge base remains relatively specialized. Using the radiology knowledge resource RadLex®, a series of RadLex queries, the Annotation and Image Markup standard for image annotation, and a Web services-based software architecture, the authors developed an application that allows ontology-assisted image navigation. The application provides an image browsing interface, allowing users to visually inspect the imaging appearance of anatomic structures. By interacting directly with the images, users can access additional structure-related information that is derived from RadLex (eg, muscle innervation, muscle attachment sites). These data also serve as conceptual links to navigate from one portion of the imaging atlas to another. With 3.0-T MR neurography of the brachial plexus as the initial area of interest, the resulting application provides support to radiologists in the image interpretation process by allowing efficient exploration of the MR imaging appearance of relevant nerve segments, muscles, bone structures, vascular landmarks, anatomic spaces, and entrapment sites, and the investigation of neuromuscular relationships. RSNA, 2015

  14. Ontology-based Image Navigation: Exploring 3.0-T MR Neurography of the Brachial Plexus Using AIM and RadLex

    PubMed Central

    Salunkhe, Aditya R.; Morrison, James J.; Lee, Pearlene P.; Mejino, José L. V.; Detwiler, Landon T.; Brinkley, James F.; Siegel, Eliot L.; Rubin, Daniel L.; Carrino, John A.

    2015-01-01

    Disorders of the peripheral nervous system have traditionally been evaluated using clinical history, physical examination, and electrodiagnostic testing. In selected cases, imaging modalities such as magnetic resonance (MR) neurography may help further localize or characterize abnormalities associated with peripheral neuropathies, and the clinical importance of such techniques is increasing. However, MR image interpretation with respect to peripheral nerve anatomy and disease often presents a diagnostic challenge because the relevant knowledge base remains relatively specialized. Using the radiology knowledge resource RadLex®, a series of RadLex queries, the Annotation and Image Markup standard for image annotation, and a Web services–based software architecture, the authors developed an application that allows ontology-assisted image navigation. The application provides an image browsing interface, allowing users to visually inspect the imaging appearance of anatomic structures. By interacting directly with the images, users can access additional structure-related information that is derived from RadLex (eg, muscle innervation, muscle attachment sites). These data also serve as conceptual links to navigate from one portion of the imaging atlas to another. With 3.0-T MR neurography of the brachial plexus as the initial area of interest, the resulting application provides support to radiologists in the image interpretation process by allowing efficient exploration of the MR imaging appearance of relevant nerve segments, muscles, bone structures, vascular landmarks, anatomic spaces, and entrapment sites, and the investigation of neuromuscular relationships. ©RSNA, 2015 PMID:25590394

  15. Towards Automated Annotation of Benthic Survey Images: Variability of Human Experts and Operational Modes of Automation

    PubMed Central

    Beijbom, Oscar; Edmunds, Peter J.; Roelfsema, Chris; Smith, Jennifer; Kline, David I.; Neal, Benjamin P.; Dunlap, Matthew J.; Moriarty, Vincent; Fan, Tung-Yung; Tan, Chih-Jui; Chan, Stephen; Treibitz, Tali; Gamst, Anthony; Mitchell, B. Greg; Kriegman, David

    2015-01-01

    Global climate change and other anthropogenic stressors have heightened the need to rapidly characterize ecological changes in marine benthic communities across large scales. Digital photography enables rapid collection of survey images to meet this need, but the subsequent image annotation is typically a time-consuming, manual task. We investigated the feasibility of using automated point annotation to expedite cover estimation of the 17 dominant benthic categories from survey images captured at four Pacific coral reefs. Inter- and intra-annotator variability among six human experts was quantified and compared to semi- and fully automated annotation methods, which are made available at coralnet.ucsd.edu. Our results indicate high expert agreement for identification of coral genera, but lower agreement for algal functional groups, in particular between turf algae and crustose coralline algae. This indicates the need for unequivocal definitions of algal groups, careful training of multiple annotators, and enhanced imaging technology. Semi-automated annotation, where 50% of the annotation decisions were performed automatically, yielded cover estimate errors comparable to those of the human experts. Furthermore, fully automated annotation yielded rapid, unbiased cover estimates but with increased variance. These results show that automated annotation can increase spatial coverage and decrease time and financial outlay for image-based reef surveys. PMID:26154157
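
The semi-automated operational mode, where confident machine decisions are accepted and uncertain points are deferred to a human, can be sketched as follows; the classifier, threshold, and labels are illustrative assumptions, not the CoralNet implementation.

```python
def semi_automated(points, classifier, threshold=0.9):
    """Split point annotations: confident machine labels are accepted,
    uncertain points are deferred to a human annotator."""
    auto, deferred = {}, []
    for point in points:
        label, confidence = classifier(point)
        if confidence >= threshold:
            auto[point] = label
        else:
            deferred.append(point)
    return auto, deferred

# Toy classifier: even point ids are confident corals, odd ones are not.
clf = lambda p: ("coral", 0.95) if p % 2 == 0 else ("turf", 0.60)
auto, deferred = semi_automated(range(10), clf)
print(len(auto), len(deferred))  # 5 points automated, 5 for the expert
```

Raising the threshold shifts work back to the expert, trading annotation speed for reliability.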

  16. Can masses of non-experts train highly accurate image classifiers? A crowdsourcing approach to instrument segmentation in laparoscopic images.

    PubMed

    Maier-Hein, Lena; Mersmann, Sven; Kondermann, Daniel; Bodenstedt, Sebastian; Sanchez, Alexandro; Stock, Christian; Kenngott, Hannes Gotz; Eisenmann, Mathias; Speidel, Stefanie

    2014-01-01

    Machine learning algorithms are gaining increasing interest in the context of computer-assisted interventions. One of the bottlenecks so far, however, has been the availability of training data, typically generated by medical experts with very limited resources. Crowdsourcing is a new trend that is based on outsourcing cognitive tasks to many anonymous untrained individuals from an online community. In this work, we investigate the potential of crowdsourcing for segmenting medical instruments in endoscopic image data. Our study suggests that (1) segmentations computed from annotations of multiple anonymous non-experts are comparable to those made by medical experts and (2) training data generated by the crowd is of the same quality as that annotated by medical experts. Given the speed of annotation, scalability and low costs, this implies that the scientific community might no longer need to rely on experts to generate reference or training data for certain applications. To trigger further research in endoscopic image processing, the data used in this study will be made publicly available.
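
A common way to fuse several non-expert segmentations of the same image, consistent with finding (1) above, is per-pixel majority voting; this sketch uses invented worker masks and is not necessarily the study's exact fusion method.

```python
from collections import Counter

def majority_vote(masks):
    """Fuse crowd segmentations (sets of instrument pixels) by keeping
    pixels marked by more than half of the annotators."""
    votes = Counter()
    for mask in masks:
        votes.update(mask)
    quorum = len(masks) / 2
    return {pixel for pixel, count in votes.items() if count > quorum}

# Five hypothetical crowd workers; two are noisy.
crowd = [
    {(1, 1), (1, 2), (2, 1)},
    {(1, 1), (1, 2), (2, 1)},
    {(1, 1), (1, 2)},
    {(1, 1), (9, 9)},          # stray pixel, outvoted below
    {(2, 1), (1, 2)},
]
print(sorted(majority_vote(crowd)))
```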

  17. GelScape: a web-based server for interactively annotating, manipulating, comparing and archiving 1D and 2D gel images.

    PubMed

    Young, Nelson; Chang, Zhan; Wishart, David S

    2004-04-12

    GelScape is a web-based tool that permits facile, interactive annotation, comparison, manipulation and storage of protein gel images. It uses Java applet-servlet technology to allow rapid, remote image handling and image processing in a platform-independent manner. It supports many of the features found in commercial, stand-alone gel analysis software including spot annotation, spot integration, gel warping, image resizing, HTML image mapping, image overlaying as well as the storage of gel image and gel annotation data in compliance with Federated Gel Database requirements.

  18. Multi-Atlas Segmentation using Partially Annotated Data: Methods and Annotation Strategies.

    PubMed

    Koch, Lisa M; Rajchl, Martin; Bai, Wenjia; Baumgartner, Christian F; Tong, Tong; Passerat-Palmbach, Jonathan; Aljabar, Paul; Rueckert, Daniel

    2017-08-22

    Multi-atlas segmentation is a widely used tool in medical image analysis, providing robust and accurate results by learning from annotated atlas datasets. However, the availability of fully annotated atlas images for training is limited due to the time required for the labelling task. Segmentation methods requiring only a proportion of each atlas image to be labelled could therefore reduce the workload on expert raters tasked with annotating atlas images. To address this issue, we first re-examine the labelling problem common in many existing approaches and formulate its solution in terms of a Markov Random Field energy minimisation problem on a graph connecting atlases and the target image. This provides a unifying framework for multi-atlas segmentation. We then show how modifications in the graph configuration of the proposed framework enable the use of partially annotated atlas images, and we investigate different partial annotation strategies. The proposed method was evaluated on two Magnetic Resonance Imaging (MRI) datasets for hippocampal and cardiac segmentation. Experiments were performed to (1) recreate existing segmentation techniques with the proposed framework and (2) demonstrate the potential of employing sparsely annotated atlas data for multi-atlas segmentation.
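
Leaving aside the MRF formulation, the core idea of fusing labels from partially annotated atlases can be sketched as weighted voting in which unlabelled voxels simply cast no vote; the label names, weights, and `None`-for-unlabelled convention below are illustrative assumptions.

```python
def weighted_label_fusion(atlas_labels, weights):
    """Fuse per-voxel labels from several (possibly partially annotated)
    atlases; a None label means the atlas voxel is unlabelled and
    casts no vote."""
    votes = {}
    for labels, w in zip(atlas_labels, weights):
        for voxel, label in labels.items():
            if label is None:
                continue
            votes.setdefault(voxel, {})
            votes[voxel][label] = votes[voxel].get(label, 0.0) + w
    return {v: max(tally, key=tally.get) for v, tally in votes.items()}

# Three atlases voting on two voxels; atlas 2 is partially annotated.
atlases = [
    {0: "hippocampus", 1: "background"},
    {0: "hippocampus", 1: None},
    {0: "background", 1: "background"},
]
weights = [0.5, 0.3, 0.2]  # e.g. atlas-to-target similarity weights
print(weighted_label_fusion(atlases, weights))
```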

  19. Quantitative imaging biomarker ontology (QIBO) for knowledge representation of biomedical imaging biomarkers.

    PubMed

    Buckler, Andrew J; Liu, Tiffany Ting; Savig, Erica; Suzek, Baris E; Ouellette, M; Danagoulian, J; Wernsing, G; Rubin, Daniel L; Paik, David

    2013-08-01

    A widening array of novel imaging biomarkers is being developed using ever more powerful clinical and preclinical imaging modalities. These biomarkers have demonstrated effectiveness in quantifying biological processes as they occur in vivo and in the early prediction of therapeutic outcomes. However, quantitative imaging biomarker data and knowledge are not standardized, representing a critical barrier to accumulating medical knowledge based on quantitative imaging data. We use an ontology to represent, integrate, and harmonize heterogeneous knowledge across the domain of imaging biomarkers. This advances the goal of developing applications to (1) improve precision and recall of storage and retrieval of quantitative imaging-related data using standardized terminology; (2) streamline the discovery and development of novel imaging biomarkers by normalizing knowledge across heterogeneous resources; (3) effectively annotate imaging experiments thus aiding comprehension, re-use, and reproducibility; and (4) provide validation frameworks through rigorous specification as a basis for testable hypotheses and compliance tests. We have developed the Quantitative Imaging Biomarker Ontology (QIBO), which currently consists of 488 terms spanning the following upper classes: experimental subject, biological intervention, imaging agent, imaging instrument, image post-processing algorithm, biological target, indicated biology, and biomarker application. We have demonstrated that QIBO can be used to annotate imaging experiments with standardized terms in the ontology and to generate hypotheses for novel imaging biomarker-disease associations. Our results established the utility of QIBO in enabling integrated analysis of quantitative imaging data.

  20. Managing and Querying Image Annotation and Markup in XML.

    PubMed

    Wang, Fusheng; Pan, Tony; Sharma, Ashish; Saltz, Joel

    2010-01-01

    Proprietary approaches for representing annotations and image markup are serious barriers for researchers to share image data and knowledge. The Annotation and Image Markup (AIM) project is developing a standards-based information model for image annotation and markup in health care and clinical trial environments. The complex hierarchical structures of the AIM data model pose new challenges for managing such data in terms of performance and support of complex queries. In this paper, we present our work on managing AIM data through a native XML approach, and on supporting complex image and annotation queries through a native extension of the XQuery language. Through integration with xService, AIM databases can now be conveniently shared through caGrid.

  2. A Methodology and Implementation for Annotating Digital Images for Context-appropriate Use in an Academic Health Care Environment

    PubMed Central

    Goede, Patricia A.; Lauman, Jason R.; Cochella, Christopher; Katzman, Gregory L.; Morton, David A.; Albertine, Kurt H.

    2004-01-01

    Use of digital medical images has become common over the last several years, coincident with the release of inexpensive, mega-pixel quality digital cameras and the transition to digital radiology operation by hospitals. One problem that clinicians, medical educators, and basic scientists encounter when handling images is the difficulty of using business and graphic arts commercial-off-the-shelf (COTS) software in multicontext authoring and interactive teaching environments. The authors investigated and developed software-supported methodologies to help clinicians, medical educators, and basic scientists become more efficient and effective in their digital imaging environments. The software that the authors developed provides the ability to annotate images based on a multispecialty methodology for annotation and visual knowledge representation. This annotation methodology is designed by consensus, with contributions from the authors and physicians, medical educators, and basic scientists in the Departments of Radiology, Neurobiology and Anatomy, Dermatology, and Ophthalmology at the University of Utah. The annotation methodology functions as a foundation for creating, using, reusing, and extending dynamic annotations in a context-appropriate, interactive digital environment. The annotation methodology supports the authoring process as well as output and presentation mechanisms. The annotation methodology is the foundation for a Windows implementation that allows annotated elements to be represented as structured eXtensible Markup Language and stored separate from the image(s). PMID:14527971

  3. Automatic segmentation of MR brain images of preterm infants using supervised classification.

    PubMed

    Moeskops, Pim; Benders, Manon J N L; Chiţ, Sabina M; Kersbergen, Karina J; Groenendaal, Floris; de Vries, Linda S; Viergever, Max A; Išgum, Ivana

    2015-09-01

    Preterm birth is often associated with impaired brain development. The state and expected progression of preterm brain development can be evaluated using quantitative assessment of MR images. Such measurements require accurate segmentation of different tissue types in those images. This paper presents an algorithm for the automatic segmentation of unmyelinated white matter (WM), cortical grey matter (GM), and cerebrospinal fluid in the extracerebral space (CSF). The algorithm uses supervised voxel classification in three subsequent stages. In the first stage, voxels that can easily be assigned to one of the three tissue types are labelled. In the second stage, dedicated analysis of the remaining voxels is performed. The first and the second stages both use two-class classification for each tissue type separately. Possible inconsistencies that could result from these tissue-specific segmentation stages are resolved in the third stage, which performs multi-class classification. A set of T1- and T2-weighted images was analysed, but the optimised system performs automatic segmentation using a T2-weighted image only. We have investigated the performance of the algorithm when using training data randomly selected from completely annotated images as well as when using training data from only partially annotated images. The method was evaluated on images of preterm infants acquired at 30 and 40 weeks postmenstrual age (PMA). When the method was trained using random selection from the completely annotated images, the average Dice coefficients were 0.95 for WM, 0.81 for GM, and 0.89 for CSF on an independent set of images acquired at 30 weeks PMA. When the method was trained using only the partially annotated images, the average Dice coefficients were 0.95 for WM, 0.78 for GM and 0.87 for CSF for the images acquired at 30 weeks PMA, and 0.92 for WM, 0.80 for GM and 0.85 for CSF for the images acquired at 40 weeks PMA. Even though the segmentations obtained using training data from the partially annotated images resulted in slightly lower Dice coefficients, the performance in all experiments was close to that of a second human expert (0.93 for WM, 0.79 for GM and 0.86 for CSF for the images acquired at 30 weeks, and 0.94 for WM, 0.76 for GM and 0.87 for CSF for the images acquired at 40 weeks). These results show that the presented method is robust to age and acquisition protocol and that it performs accurate segmentation of WM, GM, and CSF when the training data is extracted from complete annotations as well as when the training data is extracted from partial annotations only. This extends the applicability of the method by reducing the time and effort necessary to create training data in a population with different characteristics. Copyright © 2015 Elsevier Inc. All rights reserved.
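    The Dice coefficients reported in this record have a direct computational form, 2|A∩B| / (|A| + |B|); a minimal sketch in Python with toy voxel sets (not the paper's data or code):

```python
def dice(seg_a, seg_b):
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two sets of voxel indices."""
    a, b = set(seg_a), set(seg_b)
    if not a and not b:
        return 1.0  # both segmentations empty: perfect agreement by convention
    return 2 * len(a & b) / (len(a) + len(b))

# Toy example: automatic vs. manual "white matter" voxel sets
auto   = {(0, 0), (0, 1), (1, 0), (1, 1)}
manual = {(0, 1), (1, 0), (1, 1), (2, 1)}
print(round(dice(auto, manual), 2))  # 0.75
```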

  4. Leveraging the crowd for annotation of retinal images.

    PubMed

    Leifman, George; Swedish, Tristan; Roesch, Karin; Raskar, Ramesh

    2015-01-01

    Medical data presents a number of challenges. It tends to be unstructured, noisy and protected. To train algorithms to understand medical images, doctors can label the condition associated with a particular image, but obtaining enough labels can be difficult. We propose an annotation approach which starts with a small pool of expertly annotated images and uses this expert knowledge to rate the performance of crowd-sourced annotations. In this paper we demonstrate how to apply our approach to the annotation of large-scale datasets of retinal images. We introduce a novel data validation procedure designed to cope with noisy ground-truth data and with inconsistent input from both experts and crowd-workers.
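    The expert-pool idea above can be sketched as follows: worker reliability is estimated on the expertly annotated images and then used to weight crowd votes. The function names and the weighting rule are illustrative assumptions, not the authors' published method:

```python
def worker_accuracy(worker_labels, expert_labels):
    """Fraction of the expert-labelled images this worker got right."""
    shared = set(worker_labels) & set(expert_labels)
    if not shared:
        return 0.0
    correct = sum(worker_labels[i] == expert_labels[i] for i in shared)
    return correct / len(shared)

def weighted_vote(votes, weights):
    """Aggregate binary votes, weighting each worker by estimated accuracy."""
    score = sum(weights[w] * v for w, v in votes.items())
    total = sum(weights[w] for w in votes)
    return score / total > 0.5  # accuracy-weighted majority

# Small expert pool; workers also label images the experts never saw
expert = {"img1": 1, "img2": 0}
workers = {
    "w1": {"img1": 1, "img2": 0, "img3": 1},  # agrees with both expert labels
    "w2": {"img1": 0, "img2": 0, "img3": 0},  # right on half the expert pool
}
weights = {w: worker_accuracy(lbls, expert) for w, lbls in workers.items()}
print(weighted_vote({"w1": 1, "w2": 0}, weights))  # True
```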

  5. Ontology modularization to improve semantic medical image annotation.

    PubMed

    Wennerberg, Pinar; Schulz, Klaus; Buitelaar, Paul

    2011-02-01

    Searching for medical images and patient reports is a significant challenge in a clinical setting. The contents of such documents are often not described in sufficient detail, thus making it difficult to utilize the inherent wealth of information contained within them. Semantic image annotation addresses this problem by describing the contents of images and reports using medical ontologies. Medical images and patient reports are then linked to each other through common annotations. Subsequently, search algorithms can more effectively find related sets of documents on the basis of these semantic descriptions. A prerequisite to realizing such a semantic search engine is that the data contained within should have been previously annotated with concepts from medical ontologies. One major challenge in this regard is the size and complexity of medical ontologies as annotation sources. Manual annotation is particularly time consuming and labor intensive in a clinical environment. In this article we propose an approach to reducing the size of clinical ontologies for more efficient manual image and text annotation. More precisely, our goal is to identify smaller fragments of a large anatomy ontology that are relevant for annotating medical images from patients suffering from lymphoma. Our work is in the area of ontology modularization, which is a recent and active field of research. We describe our approach, methods and data set in detail, and we discuss our results. Copyright © 2010 Elsevier Inc. All rights reserved.

  6. K-Nearest Neighbors Relevance Annotation Model for Distance Education

    ERIC Educational Resources Information Center

    Ke, Xiao; Li, Shaozi; Cao, Donglin

    2011-01-01

    With the rapid development of Internet technologies, distance education has become a popular educational mode. In this paper, the authors propose an online image automatic annotation distance education system, which could effectively help children learn interrelations between image content and corresponding keywords. Image automatic annotation is…

  7. CAMEL: concept annotated image libraries

    NASA Astrophysics Data System (ADS)

    Natsev, Apostol; Chadha, Atul; Soetarman, Basuki; Vitter, Jeffrey S.

    2001-01-01

    The problem of content-based image searching has received considerable attention in the last few years. Thousands of images are now available on the Internet, and many important applications require searching of images in domains such as E-commerce, medical imaging, weather prediction, satellite imagery, and so on. Yet content-based image querying is still largely unestablished as a mainstream field, and it is not widely used by search engines. We believe that two of the major hurdles behind this poor acceptance are poor retrieval quality and poor usability.

  9. Quantitative imaging features: extension of the oncology medical image database

    NASA Astrophysics Data System (ADS)

    Patel, M. N.; Looney, P. T.; Young, K. C.; Halling-Brown, M. D.

    2015-03-01

    Radiological imaging is fundamental within the healthcare industry and has become routinely adopted for diagnosis, disease monitoring and treatment planning. With the advent of digital imaging modalities and the rapid growth in both diagnostic and therapeutic imaging, the ability to harness this large influx of data is of paramount importance. The Oncology Medical Image Database (OMI-DB) was created to provide a centralized, fully annotated dataset for research. The database contains both processed and unprocessed images, associated data, and annotations and, where applicable, expert-determined ground truths describing features of interest. Medical imaging provides the ability to detect and localize many changes that are important to determine whether a disease is present or a therapy is effective by depicting alterations in anatomic, physiologic, biochemical or molecular processes. Quantitative imaging features are sensitive, specific, accurate and reproducible imaging measures of these changes. Here, we describe an extension to the OMI-DB whereby a range of imaging features and descriptors are pre-calculated using a high throughput approach. The ability to calculate multiple imaging features and data from the acquired images would be valuable and facilitate further research applications investigating detection, prognosis, and classification. The resultant data store contains more than 10 million quantitative features as well as features derived from CAD predictions. These data can be used to build predictive models to aid image classification and treatment response assessment, as well as to identify prognostic imaging biomarkers.

  10. WebMedSA: a web-based framework for segmenting and annotating medical images using biomedical ontologies

    NASA Astrophysics Data System (ADS)

    Vega, Francisco; Pérez, Wilson; Tello, Andrés.; Saquicela, Victor; Espinoza, Mauricio; Solano-Quinde, Lizandro; Vidal, Maria-Esther; La Cruz, Alexandra

    2015-12-01

    Advances in medical imaging have fostered medical diagnosis based on digital images. Consequently, the number of studies based on medical image diagnosis is increasing; thus, collaborative work and tele-radiology systems are required to scale up effectively to this diagnosis trend. We tackle the problem of collaborative access to medical images, and present WebMedSA, a framework to manage large datasets of medical images. WebMedSA relies on a PACS and supports ontological annotation, as well as segmentation and visualization of the images based on their semantic description. Ontological annotations can be performed directly on the volumetric image or at different image planes (e.g., axial, coronal, or sagittal); furthermore, annotations can be complemented after applying a segmentation technique. WebMedSA is based on three main steps: (1) an RDF-ization process for extracting, anonymizing, and serializing metadata comprised in DICOM medical images into RDF/XML; (2) integration of different biomedical ontologies (using the L-MOM library), making this approach ontology independent; and (3) segmentation and visualization of annotated data, which are further used to generate new annotations according to expert knowledge, and for validation. Initial user evaluations suggest that WebMedSA facilitates the exchange of knowledge between radiologists, and provides the basis for collaborative work among them.
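    Step (1), RDF-ization of DICOM metadata, can be illustrated with the Python standard library alone. The namespace URI, the resource identifier, and the list of tags to anonymize are assumptions for illustration, not WebMedSA's actual schema:

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DCM = "http://example.org/dicom#"  # hypothetical DICOM-tag namespace
ET.register_namespace("rdf", RDF)
ET.register_namespace("dcm", DCM)

def rdfize(metadata, drop=("PatientName", "PatientID")):
    """Anonymize identifying DICOM tags, serialize the rest as RDF/XML."""
    root = ET.Element(f"{{{RDF}}}RDF")
    desc = ET.SubElement(root, f"{{{RDF}}}Description",
                         {f"{{{RDF}}}about": "urn:study:1"})
    for tag, value in metadata.items():
        if tag in drop:  # anonymization step: skip identifying tags
            continue
        ET.SubElement(desc, f"{{{DCM}}}{tag}").text = str(value)
    return ET.tostring(root, encoding="unicode")

meta = {"PatientName": "DOE^JOHN", "Modality": "MR", "SliceThickness": 1.5}
xml = rdfize(meta)
print(xml)  # rdf:RDF document with Modality and SliceThickness only
```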

  11. Automated analysis and reannotation of subcellular locations in confocal images from the Human Protein Atlas.

    PubMed

    Li, Jieyue; Newberg, Justin Y; Uhlén, Mathias; Lundberg, Emma; Murphy, Robert F

    2012-01-01

    The Human Protein Atlas contains immunofluorescence images showing subcellular locations for thousands of proteins. These are currently annotated by visual inspection. In this paper, we describe automated approaches to analyze the images and their use to improve annotation. We began by training classifiers to recognize the annotated patterns. By ranking proteins according to the confidence of the classifier, we generated a list of proteins that were strong candidates for reexamination. In parallel, we applied hierarchical clustering to group proteins and identified proteins whose annotations were inconsistent with the remainder of the proteins in their cluster. These proteins were reexamined by the original annotators, and a significant fraction had their annotations changed. The results demonstrate that automated approaches can provide an important complement to visual annotation.

  12. Learning pathology using collaborative vs. individual annotation of whole slide images: a mixed methods trial.

    PubMed

    Sahota, Michael; Leung, Betty; Dowdell, Stephanie; Velan, Gary M

    2016-12-12

    Students in biomedical disciplines require understanding of normal and abnormal microscopic appearances of human tissues (histology and histopathology). For this purpose, practical classes in these disciplines typically use virtual microscopy, viewing digitised whole slide images in web browsers. To enhance engagement, tools have been developed to enable individual or collaborative annotation of whole slide images within web browsers. To date, there have been no studies that have critically compared the impact on learning of individual and collaborative annotations on whole slide images. Junior and senior students engaged in Pathology practical classes within Medical Science and Medicine programs participated in cross-over trials of individual and collaborative annotation activities. Students' understanding of microscopic morphology was compared using timed online quizzes, while students' perceptions of learning were evaluated using an online questionnaire. For senior medical students, collaborative annotation of whole slide images was superior for understanding key microscopic features when compared to individual annotation; whilst being at least equivalent to individual annotation for junior medical science students. Across cohorts, students agreed that the annotation activities provided a user-friendly learning environment that met their flexible learning needs, improved efficiency, provided useful feedback, and helped them to set learning priorities. Importantly, these activities were also perceived to enhance motivation and improve understanding. Collaborative annotation improves understanding of microscopic morphology for students with sufficient background understanding of the discipline. These findings have implications for the deployment of annotation activities in biomedical curricula, and potentially for postgraduate training in Anatomical Pathology.

  13. The Accuracy and Reliability of Crowdsource Annotations of Digital Retinal Images

    PubMed Central

    Mitry, Danny; Zutis, Kris; Dhillon, Baljean; Peto, Tunde; Hayat, Shabina; Khaw, Kay-Tee; Morgan, James E.; Moncur, Wendy; Trucco, Emanuele; Foster, Paul J.

    2016-01-01

    Purpose Crowdsourcing is based on outsourcing computationally intensive tasks to numerous individuals in the online community who have no formal training. Our aim was to develop a novel online tool designed to facilitate large-scale annotation of digital retinal images, and to assess the accuracy of crowdsource grading using this tool, comparing it to expert classification. Methods We used 100 retinal fundus photograph images with predetermined disease criteria selected by two experts from a large cohort study. The Amazon Mechanical Turk Web platform was used to drive traffic to our site so anonymous workers could perform a classification and annotation task of the fundus photographs in our dataset after a short training exercise. Three groups were assessed: masters only, nonmasters only and nonmasters with compulsory training. We calculated the sensitivity, specificity, and area under the curve (AUC) of receiver operating characteristic (ROC) plots for all classifications compared to expert grading, and used the Dice coefficient and consensus threshold to assess annotation accuracy. Results In total, we received 5389 annotations for 84 images (excluding 16 training images) in 2 weeks. A specificity and sensitivity of 71% (95% confidence interval [CI], 69%–74%) and 87% (95% CI, 86%–88%) was achieved for all classifications. The AUC in this study for all classifications combined was 0.93 (95% CI, 0.91–0.96). For image annotation, a maximal Dice coefficient (∼0.6) was achieved with a consensus threshold of 0.25. Conclusions This study supports the hypothesis that annotation of abnormalities in retinal images by ophthalmologically naive individuals is comparable to expert annotation. The highest AUC and agreement with expert annotation was achieved in the nonmasters with compulsory training group. 
Translational Relevance The use of crowdsourcing as a technique for retinal image analysis may be comparable to expert graders and has the potential to deliver timely, accurate, and cost-effective image analysis. PMID:27668130
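    The consensus-threshold aggregation used above to score annotation accuracy can be sketched as: keep a pixel in the combined annotation when at least a given fraction of workers marked it. A toy Python illustration (not the study's code):

```python
def consensus_mask(worker_masks, threshold):
    """Pixels marked by at least `threshold` fraction of the workers."""
    counts = {}
    for mask in worker_masks:
        for px in mask:
            counts[px] = counts.get(px, 0) + 1
    n = len(worker_masks)
    return {px for px, c in counts.items() if c / n >= threshold}

# Four workers annotate a lesion; pixel (1, 1) is marked by all of them
masks = [{(1, 1), (1, 2)}, {(1, 1)}, {(1, 1), (2, 2)}, {(1, 1), (1, 2)}]
print(sorted(consensus_mask(masks, 0.25)))  # any pixel marked by >= 1 of 4
print(sorted(consensus_mask(masks, 0.5)))   # only (1, 1) and (1, 2) survive
```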

  15. Effectiveness of image features and similarity measures in cluster-based approaches for content-based image retrieval

    NASA Astrophysics Data System (ADS)

    Du, Hongbo; Al-Jubouri, Hanan; Sellahewa, Harin

    2014-05-01

    Content-based image retrieval is an automatic process of retrieving images according to their visual contents instead of textual annotations. It has many areas of application, from automatic image annotation and archiving, image classification and categorization, to homeland security and law enforcement. The key issues affecting the performance of such retrieval systems include sensible image features that can effectively capture the right amount of visual content, and suitable similarity measures to find similar and relevant images ranked in a meaningful order. Many different approaches, methods and techniques have been developed as a result of very intensive research in the past two decades. Among the many existing approaches is a cluster-based approach, where clustering methods are used to group local feature descriptors into homogeneous regions, and search is conducted by comparing the regions of the query image against those of the stored images. This paper serves as a review of work in this area. The paper first summarizes the existing work reported in the literature and then presents the authors' own investigations in this field. The paper intends to highlight not only achievements made by recent research but also challenges and difficulties still remaining in this area.

  16. Automatic multi-label annotation of abdominal CT images using CBIR

    NASA Astrophysics Data System (ADS)

    Xue, Zhiyun; Antani, Sameer; Long, L. Rodney; Thoma, George R.

    2017-03-01

    We present a technique to annotate multiple organs shown in 2-D abdominal/pelvic CT images using CBIR. This annotation task is motivated by our research interests in visual question-answering (VQA). We aim to apply results from this effort in Open-i, a multimodal biomedical search engine developed by the National Library of Medicine (NLM). Understanding visual content of biomedical images is a necessary step for VQA. Though sufficient annotational information about an image may be available in related textual metadata, not all may be useful as descriptive tags, particularly for anatomy on the image. In this paper, we develop and evaluate a multi-label image annotation method using CBIR. We evaluate our method on two 2-D CT image datasets we generated from 3-D volumetric data obtained from a multi-organ segmentation challenge hosted in MICCAI 2015. Shape and spatial layout information is used to encode visual characteristics of the anatomy. We adapt a weighted voting scheme to assign multiple labels to the query image by combining the labels of the images identified as similar by the method. Key parameters that may affect the annotation performance, such as the number of images used in the label voting and the threshold for excluding labels that have low weights, are studied. The method proposes a coarse-to-fine retrieval strategy which integrates the classification with the nearest-neighbor search. Results from our evaluation (using the MICCAI CT image datasets as well as figures from Open-i) are presented.
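    The weighted voting scheme described above can be sketched as follows: each retrieved image contributes its labels weighted by its similarity score, and labels whose accumulated weight falls below a threshold are excluded. The threshold value and the data are illustrative, not the paper's parameters:

```python
def annotate(similar_images, weight_threshold=0.5):
    """similar_images: list of (similarity, labels) for retrieved images.
    Returns labels whose normalized accumulated weight clears the threshold."""
    scores = {}
    for sim, labels in similar_images:
        for label in labels:
            scores[label] = scores.get(label, 0.0) + sim
    total = sum(sim for sim, _ in similar_images)
    return {lb for lb, s in scores.items() if s / total >= weight_threshold}

# Query image matched three stored CT slices with these similarities/labels
retrieved = [
    (0.9, {"liver", "kidney"}),
    (0.8, {"liver", "spleen"}),
    (0.3, {"spleen"}),
]
print(sorted(annotate(retrieved)))  # labels supported by the most similar images
```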

  17. Enhancing Comparative Effectiveness Research With Automated Pediatric Pneumonia Detection in a Multi-Institutional Clinical Repository: A PHIS+ Pilot Study.

    PubMed

    Meystre, Stephane; Gouripeddi, Ramkiran; Tieder, Joel; Simmons, Jeffrey; Srivastava, Rajendu; Shah, Samir

    2017-05-15

    Community-acquired pneumonia is a leading cause of pediatric morbidity. Administrative data are often used to conduct comparative effectiveness research (CER) with sufficient sample sizes to enhance detection of important outcomes. However, such studies are prone to misclassification errors because of the variable accuracy of discharge diagnosis codes. The aim of this study was to develop an automated, scalable, and accurate method to determine the presence or absence of pneumonia in children using chest imaging reports. The multi-institutional PHIS+ clinical repository was developed to support pediatric CER by expanding an administrative database of children's hospitals with detailed clinical data. To develop a scalable approach to find patients with bacterial pneumonia more accurately, we developed a Natural Language Processing (NLP) application to extract relevant information from chest diagnostic imaging reports. Domain experts established a reference standard by manually annotating 282 reports to train and then test the NLP application. Findings of pleural effusion, pulmonary infiltrate, and pneumonia were automatically extracted from the reports and then used to automatically classify whether a report was consistent with bacterial pneumonia. Compared with the annotated diagnostic imaging reports reference standard, the most accurate implementation of machine learning algorithms in our NLP application allowed extracting relevant findings with a sensitivity of .939 and a positive predictive value of .925. It allowed classifying reports with a sensitivity of .71, a positive predictive value of .86, and a specificity of .962. When compared with each of the domain experts manually annotating these reports, the NLP application allowed for significantly higher sensitivity (.71 vs .527) and similar positive predictive value and specificity.
NLP-based pneumonia information extraction of pediatric diagnostic imaging reports performed better than domain experts in this pilot study. NLP is an efficient method to extract information from a large collection of imaging reports to facilitate CER. ©Stephane Meystre, Ramkiran Gouripeddi, Joel Tieder, Jeffrey Simmons, Rajendu Srivastava, Samir Shah. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 15.05.2017.
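    The reported sensitivity, positive predictive value, and specificity follow directly from confusion counts; a small Python reference (the counts below are toy values, not the study's data):

```python
def metrics(tp, fp, fn, tn):
    """Standard classification measures from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),   # a.k.a. recall
        "ppv": tp / (tp + fp),           # positive predictive value (precision)
        "specificity": tn / (tn + fp),
    }

m = metrics(tp=71, fp=12, fn=29, tn=250)  # illustrative counts only
print({k: round(v, 3) for k, v in m.items()})
```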

  18. Annotating images by mining image search results.

    PubMed

    Wang, Xin-Jing; Zhang, Lei; Li, Xirong; Ma, Wei-Ying

    2008-11-01

    Although it has been studied for years by the computer vision and machine learning communities, image annotation is still far from practical. In this paper, we propose a novel attempt at model-free image annotation, which is a data-driven approach that annotates images by mining their search results. Some 2.4 million images with their surrounding text are collected from a few photo forums to support this approach. The entire process is formulated in a divide-and-conquer framework where a query keyword is provided along with the uncaptioned image to improve both the effectiveness and efficiency. This is helpful when the collected data set is not dense everywhere. In this sense, our approach contains three steps: 1) the search process to discover visually and semantically similar search results, 2) the mining process to identify salient terms from textual descriptions of the search results, and 3) the annotation rejection process to filter out noisy terms yielded by Step 2. To ensure real-time annotation, two key techniques are leveraged: one is to map the high-dimensional image visual features into hash codes; the other is to implement it as a distributed system, of which the search and mining processes are provided as Web services. As a typical result, the entire process finishes in less than 1 second. Since no training data set is required, our approach enables annotating with an unlimited vocabulary and is highly scalable and robust to outliers. Experimental results on both real Web images and a benchmark image data set show the effectiveness and efficiency of the proposed algorithm. It is also worth noting that, although the entire approach is illustrated within the divide-and-conquer framework, a query keyword is not crucial to our current implementation. We provide experimental results to prove this.
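    The abstract does not specify how visual features are mapped to hash codes; random-hyperplane locality-sensitive hashing is one standard choice, sketched here as an assumption rather than the authors' method:

```python
import random

def make_hasher(dim, bits, seed=0):
    """Random-hyperplane LSH: one sign bit per random projection."""
    rng = random.Random(seed)
    planes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(bits)]
    def hash_code(vec):
        # each bit records which side of a random hyperplane the vector lies on
        return "".join(
            "1" if sum(p * v for p, v in zip(plane, vec)) >= 0 else "0"
            for plane in planes)
    return hash_code

h = make_hasher(dim=4, bits=16)
v = [0.9, 0.1, 0.3, 0.5]
print(h(v))                           # 16-bit binary code for the feature vector
print(h(v) == h([2 * x for x in v]))  # True: positive scaling preserves signs
```

Similar feature vectors tend to agree on most bits, so Hamming distance on the codes approximates angular similarity at a fraction of the cost.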

  19. Real-time image annotation by manifold-based biased Fisher discriminant analysis

    NASA Astrophysics Data System (ADS)

    Ji, Rongrong; Yao, Hongxun; Wang, Jicheng; Sun, Xiaoshuai; Liu, Xianming

    2008-01-01

    Automatic linguistic annotation is a promising solution to bridge the semantic gap in content-based image retrieval. However, two crucial issues are not well addressed in state-of-the-art annotation algorithms: 1. the Small Sample Size (3S) problem in keyword classifier/model learning; 2. most annotation algorithms cannot extend to real-time online use due to their low computational efficiency. This paper presents a novel Manifold-based Biased Fisher Discriminant Analysis (MBFDA) algorithm to address these two issues by transductive semantic learning and keyword filtering. To address the 3S problem, Co-Training based manifold learning is adopted for keyword model construction. To achieve real-time annotation, a Biased Fisher Discriminant Analysis (BFDA) based semantic feature reduction algorithm is presented for keyword confidence discrimination and semantic feature reduction. Different from all existing annotation methods, MBFDA views image annotation from a novel Eigen semantic feature (which corresponds to keywords) selection aspect. As demonstrated in experiments, our manifold-based biased Fisher discriminant analysis annotation algorithm outperforms classical and state-of-the-art annotation methods (1. K-NN Expansion; 2. One-to-All SVM; 3. PWC-SVM) in both computational time and annotation accuracy by a large margin.

  20. ProstateAnalyzer: Web-based medical application for the management of prostate cancer using multiparametric MR imaging.

    PubMed

    Mata, Christian; Walker, Paul M; Oliver, Arnau; Brunotte, François; Martí, Joan; Lalande, Alain

    2016-01-01

    In this paper, we present ProstateAnalyzer, a new web-based medical tool for prostate cancer diagnosis. ProstateAnalyzer allows the visualization and analysis of magnetic resonance imaging (MRI) data in a single framework. It recovers the data from a PACS server and displays all the associated MR images in the same framework, typically consisting of 3D T2-weighted imaging for anatomy, dynamic contrast-enhanced MRI for perfusion, diffusion-weighted imaging in the form of an apparent diffusion coefficient (ADC) map, and MR spectroscopy. ProstateAnalyzer allows annotating regions of interest in one sequence and propagates them to the others. From a representative case, the results using the four visualization platforms are fully detailed, showing the interaction among them. The tool has been implemented as a Java-based applet to facilitate portability across different computer architectures and software, and to allow remote use via the web. ProstateAnalyzer enables experts to manage prostate cancer patient data sets more efficiently. The tool allows experts to delineate annotations and displays all the information required for diagnosis. In accordance with the current European Society of Urogenital Radiology guidelines, it also includes the PI-RADS structured reporting scheme.

  1. BisQue: cloud-based system for management, annotation, visualization, analysis and data mining of underwater and remote sensing imagery

    NASA Astrophysics Data System (ADS)

    Fedorov, D.; Miller, R. J.; Kvilekval, K. G.; Doheny, B.; Sampson, S.; Manjunath, B. S.

    2016-02-01

    Logistical and financial limitations of underwater operations are inherent in marine science, including biodiversity observation. Imagery is a promising way to address these challenges, but the diversity of organisms thwarts simple automated analysis. Recent developments in computer vision methods, such as convolutional neural networks (CNNs), are promising for automated classification and detection tasks but are typically very computationally expensive and require extensive training on large datasets. Therefore, managing and connecting distributed computation, large storage and human annotations of diverse marine datasets is crucial for effective application of these methods. BisQue is a cloud-based system for management, annotation, visualization, analysis and data mining of underwater and remote sensing imagery and associated data. Designed to hide the complexity of distributed storage, large computational clusters, diversity of data formats and inhomogeneous computational environments behind a user-friendly web-based interface, BisQue is built around the idea of flexible and hierarchical annotations defined by the user. Such textual and graphical annotations can describe captured attributes and the relationships between data elements. Annotations are powerful enough to describe cells in fluorescent 4D images, fish species in underwater videos and kelp beds in aerial imagery. Presently we are developing BisQue-based analysis modules for automated identification of benthic marine organisms. Recent experiments with dropout and CNN-based classification of several thousand annotated underwater images demonstrated an overall accuracy above 70% for the 15 best performing species and above 85% for the top 5 species. Based on these promising results, we have extended BisQue with a CNN-based classification system allowing continuous training on user-provided data.

  2. Automated tumor analysis for molecular profiling in lung cancer

    PubMed Central

    Boyd, Clinton; James, Jacqueline A.; Loughrey, Maurice B.; Hougton, Joseph P.; Boyle, David P.; Kelly, Paul; Maxwell, Perry; McCleary, David; Diamond, James; McArt, Darragh G.; Tunstall, Jonathon; Bankhead, Peter; Salto-Tellez, Manuel

    2015-01-01

    The discovery and clinical application of molecular biomarkers in solid tumors increasingly relies on nucleic acid extraction from FFPE tissue sections and subsequent molecular profiling. This in turn requires the pathological review of haematoxylin & eosin (H&E) stained slides to ensure sample quality and tumor DNA sufficiency, by visually estimating the percentage of tumor nuclei, and tumor annotation for manual macrodissection. In this study on NSCLC, we demonstrate considerable variation in tumor nuclei percentage between pathologists, potentially undermining the precision of NSCLC molecular evaluation and emphasising the need for quantitative tumor evaluation. We subsequently describe the development and validation of a system called TissueMark for automated tumor annotation and percentage tumor nuclei measurement in NSCLC using computerized image analysis. Evaluation of 245 NSCLC slides showed precise automated tumor annotation of cases using TissueMark, strong concordance with manually drawn boundaries, and identical EGFR mutational status following manual macrodissection from the image-analysis-generated tumor boundaries. Automated analysis of cell counts for percentage tumor measurements by TissueMark showed reduced variability and significant correlation (p < 0.001) with benchmark tumor cell counts. This study demonstrates a robust image analysis technology that can facilitate the automated quantitative analysis of tissue samples for molecular profiling in discovery and diagnostics. PMID:26317646

  3. Semantic attributes for people's appearance description: an appearance modality for video surveillance applications

    NASA Astrophysics Data System (ADS)

    Frikha, Mayssa; Fendri, Emna; Hammami, Mohamed

    2017-09-01

    Using semantic attributes such as gender, clothes, and accessories to describe people's appearance is an appealing modeling method for video surveillance applications. We propose a midlevel appearance signature based on extracting a list of nameable semantic attributes describing the body under uncontrolled acquisition conditions. Conventional approaches extract the same set of low-level features to learn all the semantic classifiers uniformly; their critical limitation is the inability to capture the dominant visual characteristics of each trait separately. The proposed approach instead extracts low-level features in an attribute-adaptive way, automatically selecting the most relevant features for each attribute separately. Furthermore, relying on a small training dataset would easily lead to poor performance due to large intraclass and interclass variations. We therefore annotated large-scale people images collected from different person re-identification benchmarks, covering a large attribute sample and reflecting the challenges of uncontrolled acquisition conditions. These annotations were gathered into an appearance semantic attribute dataset containing 3590 images annotated with 14 attributes. Various experiments show that features carefully designed to learn the visual characteristics of an attribute improve correct classification accuracy and reduce both spatial and temporal complexity relative to state-of-the-art approaches.
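The attribute-adaptive idea, selecting a different feature subset per attribute, can be sketched with a simple between-class mean-gap score; the criterion, features and labels here are invented stand-ins for the paper's method:

```python
# Per-attribute feature selection: rank features by the gap between the
# class means (a stand-in relevance criterion; all data are made up).
def top_features(samples, labels, k):
    n_feat = len(samples[0])
    pos = [s for s, y in zip(samples, labels) if y == 1]
    neg = [s for s, y in zip(samples, labels) if y == 0]
    scores = []
    for f in range(n_feat):
        mp = sum(s[f] for s in pos) / len(pos)
        mn = sum(s[f] for s in neg) / len(neg)
        scores.append((abs(mp - mn), f))
    return [f for _, f in sorted(scores, reverse=True)[:k]]

# Feature 0 is informative for attribute A, feature 2 for attribute B,
# feature 1 is pure noise.
X = [[1.0, 0.5, 1.0], [1.0, 0.5, 0.0], [0.0, 0.5, 1.0], [0.0, 0.5, 0.0]]
attr_a = [1, 1, 0, 0]
attr_b = [1, 0, 1, 0]
print(top_features(X, attr_a, 1), top_features(X, attr_b, 1))
```

Each attribute ends up with its own feature subset, which is the point: the features that discriminate gender need not be the ones that discriminate carrying a bag.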

  4. The National Cancer Informatics Program (NCIP) Annotation and Image Markup (AIM) Foundation model.

    PubMed

    Mongkolwat, Pattanasak; Kleper, Vladimir; Talbot, Skip; Rubin, Daniel

    2014-12-01

    Knowledge contained within in vivo imaging annotated by human experts or computer programs is typically stored as unstructured text and separated from other associated information. The National Cancer Informatics Program (NCIP) Annotation and Image Markup (AIM) Foundation information model is an evolution of the National Institutes of Health's (NIH) National Cancer Institute's (NCI) Cancer Bioinformatics Grid (caBIG®) AIM model. The model applies to various image types created by various techniques and disciplines. It has evolved in response to feedback and changing demands from the imaging community at NCI. The Foundation model serves as a base for other imaging disciplines that want to extend the type of information the model collects. The model captures physical entities and their characteristics, imaging observation entities and their characteristics, markups (two- and three-dimensional), AIM statements, calculations, image source, inferences, annotation role, task context or workflow, audit trail, AIM creator details, equipment used to create AIM instances, subject demographics, and adjudication observations. An AIM instance can be stored as a Digital Imaging and Communications in Medicine (DICOM) structured reporting (SR) object or an Extensible Markup Language (XML) document for further processing and analysis. An AIM instance consists of one or more annotations and associated markups of a single finding, along with other ancillary information in the AIM model. An annotation describes the meaning of pixel data in an image. A markup is a graphical drawing placed on the image that depicts a region of interest. This paper describes fundamental AIM concepts and how to use and extend AIM for various imaging disciplines.

  5. Document image retrieval through word shape coding.

    PubMed

    Lu, Shijian; Li, Linlin; Tan, Chew Lim

    2008-11-01

    This paper presents a document retrieval technique that is capable of searching document images without OCR (optical character recognition). The proposed technique retrieves document images by a new word shape coding scheme, which captures the document content through annotating each word image by a word shape code. In particular, we annotate word images by using a set of topological shape features including character ascenders/descenders, character holes, and character water reservoirs. With the annotated word shape codes, document images can be retrieved by either query keywords or a query document image. Experimental results show that the proposed document image retrieval technique is fast, efficient, and tolerant to various types of document degradation.

  6. Neural network control of focal position during time-lapse microscopy of cells.

    PubMed

    Wei, Ling; Roberts, Elijah

    2018-05-09

    Live-cell microscopy is quickly becoming an indispensable technique for studying the dynamics of cellular processes. Maintaining the specimen in focus during image acquisition is crucial for high-throughput applications, especially for long experiments or when a large sample is being continuously scanned. Automated focus control methods are often expensive, imperfect, or ill-adapted to a specific application, and are a bottleneck for widespread adoption of high-throughput, live-cell imaging. Here, we demonstrate a neural network approach for automatically maintaining focus during bright-field microscopy. Z-stacks of yeast cells growing in a microfluidic device were collected and used to train a convolutional neural network to classify images according to their z-position. We studied the effect on prediction accuracy of the various hyperparameters of the neural network, including downsampling, batch size, and z-bin resolution. The network was able to predict the z-position of an image with ±1 μm accuracy, outperforming human annotators. Finally, we used our neural network to control microscope focus in real time during a 24-hour growth experiment. The method robustly maintained the correct focal position, compensating for 40 μm of focal drift, and was insensitive to changes in the field of view. About 100 annotated z-stacks were required to train the network, making our method quite practical for custom autofocus applications.
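The z-bin resolution hyperparameter amounts to discretizing continuous focal offsets into class labels, and the classifier's predicted class maps back to a focal correction; a minimal sketch (the range and bin size are hypothetical, not the paper's values):

```python
# Discretizing continuous z-offsets (in micrometres) into class labels
# for the focus classifier, and mapping a predicted class back to a
# focal offset via the bin centre.
def z_to_class(z_um, z_min=-5.0, bin_um=1.0):
    return int((z_um - z_min) // bin_um)

def class_to_z(c, z_min=-5.0, bin_um=1.0):
    return z_min + (c + 0.5) * bin_um  # bin centre

stack = [-5.0, -2.4, 0.0, 0.3, 3.7]          # slice offsets in a z-stack
labels = [z_to_class(z) for z in stack]       # training labels per slice
print(labels)
print(class_to_z(labels[2]))                  # recovered offset for slice 3
```

A finer bin size gives finer focus corrections but more classes to learn from the same annotated stacks, which is exactly the trade-off the hyperparameter study probes.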

  7. Developing national on-line services to annotate and analyse underwater imagery in a research cloud

    NASA Astrophysics Data System (ADS)

    Proctor, R.; Langlois, T.; Friedman, A.; Davey, B.

    2017-12-01

    Fish image annotation data are currently collected by various research, management and academic institutions globally (over 100,000 hours of deployments), with varying degrees of standardisation and limited formal collaboration or data synthesis. We present a case study of how national on-line services, developed within a domain-oriented research cloud, have been used to annotate habitat images and synthesise fish annotation data sets collected using Autonomous Underwater Vehicles (AUVs) and baited remote underwater stereo-video (stereo-BRUV). Two software tools under development have been brought together in the marine science cloud to provide marine biologists with a powerful service for image annotation. SQUIDLE+ is an online platform designed for exploration, management and annotation of georeferenced image and video data. It provides a flexible annotation framework allowing users to work with their preferred annotation schemes. We have used SQUIDLE+ to sample the habitat composition and complexity of images of the benthos collected using stereo-BRUV. GlobalArchive is designed to be a centralised repository of aquatic ecological survey data, with design principles including ease of use, secure user access, flexible data import, and the collection of any sampling and image analysis information. To easily share and synthesise data we have implemented data sharing protocols, including Open Data and synthesis Collaborations, and a spatial map to explore global datasets and filter them to create a synthesis. These tools in the science cloud, together with a virtual desktop analysis suite offering Python and R environments, offer an unprecedented capability to deliver marine biodiversity information of value to marine managers and scientists alike.

  8. Image Annotation and Topic Extraction Using Super-Word Latent Dirichlet Allocation

    DTIC Science & Technology

    2013-09-01

    an image can be used to improve automated image annotation performance over existing generalized annotators. Second, image annotations can be used...the other variables. The first ratio in the sampling Equation 2.18 uses word frequency by total words, φ̂_j^(w). The second ratio divides word...topics by total words in that document, θ̂_j^(d). Both leave out the current assignment of z_i, and the results are used to randomly choose a new topic
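The two smoothed count ratios described above are the heart of collapsed Gibbs sampling for LDA: the probability of assigning a word to topic j is proportional to their product, computed with the current word's own assignment removed. A toy sketch with made-up counts and hyperparameters:

```python
# One collapsed Gibbs sampling step for LDA: P(z_i = j | rest) is
# proportional to (word|topic ratio) * (topic|doc ratio), both smoothed.
def topic_probs(n_wt, n_t, n_dt, n_d, w, d, beta=0.01, alpha=0.1):
    K = len(n_t)        # number of topics
    V = len(n_wt)       # vocabulary size
    probs = []
    for j in range(K):
        phi   = (n_wt[w][j] + beta) / (n_t[j] + V * beta)    # word given topic
        theta = (n_dt[d][j] + alpha) / (n_d[d] + K * alpha)  # topic given doc
        probs.append(phi * theta)
    s = sum(probs)
    return [p / s for p in probs]  # normalised; sample the new z_i from this

# Toy counts with the sampled word's current assignment already removed.
n_wt = [[5, 0], [0, 4], [1, 1]]   # word-topic counts
n_t  = [6, 5]                     # total words per topic
n_dt = [[3, 1]]                   # doc-topic counts
n_d  = [4]                        # total words in the doc
p = topic_probs(n_wt, n_t, n_dt, n_d, w=0, d=0)
print(p)
```

Here word 0 has been seen almost exclusively under topic 0, and the document also leans to topic 0, so the product strongly favours reassigning to topic 0.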

  9. Integrating shape into an interactive segmentation framework

    NASA Astrophysics Data System (ADS)

    Kamalakannan, S.; Bryant, B.; Sari-Sarraf, H.; Long, R.; Antani, S.; Thoma, G.

    2013-02-01

    This paper presents a novel interactive annotation toolbox which extends a well-known user-steered segmentation framework, namely Intelligent Scissors (IS). IS, posed as a shortest-path problem, is essentially driven by lower-level image-based features; all higher-level knowledge about the problem domain is obtained from the user through mouse clicks. The proposed work integrates one higher-level feature, namely shape up to a rigid transform, into the IS framework, thus reducing the burden on the user and the subjectivity involved in the annotation procedure, especially in instances of occlusions, broken edges, noise and spurious boundaries. Such scenarios are commonplace in medical image annotation applications, and hence such a tool will be of immense help to the medical community. As a first step, an offline training procedure is performed in which a mean shape and the corresponding shape variance are computed by registering training shapes up to a rigid transform in a level-set framework. The user starts the interactive segmentation procedure by providing a training segment, which is a part of the target boundary. A partial shape matching scheme based on a scale-invariant curvature signature is employed to extract shape correspondences and subsequently predict the shape of the unsegmented target boundary. A 'zone of confidence' is generated for the predicted boundary to accommodate shape variations. The method is evaluated on segmentation of digital chest X-ray images for lung annotation, a crucial step in developing algorithms for tuberculosis screening.
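At its core, Intelligent Scissors is Dijkstra's shortest path between two user clicks over a per-pixel edge-cost grid; a minimal sketch with a made-up cost grid (low cost marks a strong edge):

```python
# Intelligent Scissors core: Dijkstra shortest path between two user
# clicks over a per-pixel edge-cost grid (low cost = strong edge).
import heapq

def live_wire(cost, start, goal):
    rows, cols = len(cost), len(cost[0])
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == goal:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        r, c = node
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                nd = d + cost[nr][nc]
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    prev[(nr, nc)] = node
                    heapq.heappush(heap, (nd, (nr, nc)))
    path, node = [goal], goal
    while node != start:
        node = prev[node]
        path.append(node)
    return path[::-1]

# The low-cost middle row acts like an image edge and attracts the wire.
grid = [[9, 9, 9, 9],
        [1, 1, 1, 1],
        [9, 9, 9, 9]]
print(live_wire(grid, (1, 0), (1, 3)))
```

The paper's contribution can be read as biasing exactly these path costs with a shape prior, so the wire stays plausible even where the edge evidence (the low-cost row here) is broken.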

  10. Image annotation based on positive-negative instances learning

    NASA Astrophysics Data System (ADS)

    Zhang, Kai; Hu, Jiwei; Liu, Quan; Lou, Ping

    2017-07-01

    Automatic image annotation is a challenging task in computer vision, aimed at managing the massive number of images on the Internet and assisting intelligent retrieval. This paper designs a new image annotation model based on a visual bag of words, using low-level features such as color and texture information as well as mid-level features such as SIFT, and combines pic2pic, label2pic and label2label correlations to measure the degree of correlation between labels and images. We aim to prune the specific features for each single label and formalize the annotation task as a learning process based on positive-negative instances learning. Experiments are performed on the Corel5K dataset and provide quite promising results when compared with other existing methods.

  11. iPad: Semantic annotation and markup of radiological images.

    PubMed

    Rubin, Daniel L; Rodriguez, Cesar; Shah, Priyanka; Beaulieu, Chris

    2008-11-06

    Radiological images contain a wealth of information, such as anatomy and pathology, which is often not explicit and computationally accessible. Information schemes are being developed to describe the semantic content of images, but such schemes can be unwieldy to operationalize because there are few tools to enable users to capture structured information easily as part of the routine research workflow. We have created iPad, an open source tool enabling researchers and clinicians to create semantic annotations on radiological images. iPad hides the complexity of the underlying image annotation information model from users, permitting them to describe images and image regions using a graphical interface that maps their descriptions to structured ontologies semi-automatically. Image annotations are saved in a variety of formats, enabling interoperability among medical records systems, image archives in hospitals, and the Semantic Web. Tools such as iPad can help reduce the burden of collecting structured information from images, and could ultimately enable researchers and physicians to exploit images on a very large scale and glean the biological and physiological significance of image content.

  12. Intervertebral disc detection in X-ray images using faster R-CNN.

    PubMed

    Ruhan Sa; Owens, William; Wiegand, Raymond; Studin, Mark; Capoferri, Donald; Barooha, Kenneth; Greaux, Alexander; Rattray, Robert; Hutton, Adam; Cintineo, John; Chaudhary, Vipin

    2017-07-01

    Automatic identification of specific osseous landmarks on the spinal radiograph can be used to automate calculations for correcting ligament instability and injury, which affect 75% of patients injured in motor vehicle accidents. In this work, we propose to use a deep-learning-based object detection method as the first step towards identifying landmark points in lateral lumbar X-ray images. The significant breakthrough of deep learning technology has made it a prevailing choice for perception-based applications; however, the lack of large annotated training datasets has brought challenges to utilizing the technology in the medical image processing field. In this work, we propose to fine-tune a deep network, Faster R-CNN, a state-of-the-art detection network in the natural image domain, using small annotated clinical datasets. In the experiments we show that, by using only 81 lateral lumbar X-ray training images, one can achieve much better performance compared to a traditional sliding-window detection method on hand-crafted features. Furthermore, we fine-tuned the network using 974 training images and tested on 108 images, achieving an average precision of 0.905 with an average computation time of 3 seconds per image, greatly outperforming traditional methods in terms of accuracy and efficiency.

  13. Generative Adversarial Networks: An Overview

    NASA Astrophysics Data System (ADS)

    Creswell, Antonia; White, Tom; Dumoulin, Vincent; Arulkumaran, Kai; Sengupta, Biswa; Bharath, Anil A.

    2018-01-01

    Generative adversarial networks (GANs) provide a way to learn deep representations without extensively annotated training data. They achieve this by deriving backpropagation signals from a competitive process between a pair of networks. The representations that can be learned by GANs may be used in a variety of applications, including image synthesis, semantic image editing, style transfer, image super-resolution and classification. The aim of this review paper is to provide an overview of GANs for the signal processing community, drawing on familiar analogies and concepts where possible. In addition to identifying different methods for training and constructing GANs, we also point to remaining challenges in their theory and application.
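The competitive objective can be made concrete with plain arithmetic: the discriminator minimises binary cross-entropy on real-vs-fake scores, while the generator minimises the common non-saturating loss −log D(G(z)). The scores below are toy numbers, with no networks involved:

```python
# The adversarial objective as arithmetic on discriminator outputs.
import math

def d_loss(d_real, d_fake):
    # Discriminator: binary cross-entropy, real labelled 1, fake labelled 0.
    return -math.log(d_real) - math.log(1.0 - d_fake)

def g_loss(d_fake):
    # Generator: non-saturating loss, maximise D's score on fakes.
    return -math.log(d_fake)

# Early in training: D confidently rejects fakes, so G's loss is large.
early_d, early_g = d_loss(0.9, 0.1), g_loss(0.1)
# At the ideal equilibrium D(x) = 0.5 everywhere: D is reduced to guessing.
eq_d, eq_g = d_loss(0.5, 0.5), g_loss(0.5)
print(early_g > eq_g, round(eq_d, 4))
```

At equilibrium the discriminator's loss settles at 2·log 2, the value predicted by the original GAN analysis when the generator matches the data distribution.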

  14. Transfer Learning with Convolutional Neural Networks for SAR Ship Recognition

    NASA Astrophysics Data System (ADS)

    Zhang, Di; Liu, Jia; Heng, Wang; Ren, Kaijun; Song, Junqiang

    2018-03-01

    Ship recognition is the backbone of marine surveillance systems. Recent deep learning methods, e.g. Convolutional Neural Networks (CNNs), have shown high performance on optical images. Learning CNNs, however, requires a large number of annotated samples to estimate numerous model parameters, which prevents their application to Synthetic Aperture Radar (SAR) images due to the limited annotated training samples. Transfer learning has been a promising technique for applications with limited data. To this end, a novel SAR ship recognition method based on CNNs with transfer learning has been developed. In this work, we first start with a CNN model that has been trained in advance on the Moving and Stationary Target Acquisition and Recognition (MSTAR) database. Next, based on the knowledge gained from this image recognition task, we fine-tune the CNN on a new task to recognize three types of ships in the OpenSARShip database. The experimental results show that our proposed approach clearly increases the recognition rate compared with applying CNNs alone. In addition, compared to existing methods, the proposed method proves to be very competitive and can learn discriminative features directly from training data instead of requiring manual pre-specification or pre-selection.

  15. The effectiveness of annotated (vs. non-annotated) digital pathology slides as a teaching tool during dermatology and pathology residencies.

    PubMed

    Marsch, Amanda F; Espiritu, Baltazar; Groth, John; Hutchens, Kelli A

    2014-06-01

    With today's technology, paraffin-embedded, hematoxylin & eosin-stained pathology slides can be scanned to generate high quality virtual slides. Using proprietary software, digital images can also be annotated with arrows, circles and boxes to highlight certain diagnostic features. Previous studies assessing digital microscopy as a teaching tool did not involve the annotation of digital images. The objective of this study was to compare the effectiveness of annotated digital pathology slides versus non-annotated digital pathology slides as a teaching tool during dermatology and pathology residencies. A study group composed of 31 dermatology and pathology residents was asked to complete an online pre-quiz consisting of 20 multiple choice style questions, each associated with a static digital pathology image. After completion, participants were given access to an online tutorial composed of digitally annotated pathology slides and subsequently asked to complete a post-quiz. A control group of 12 residents completed a non-annotated version of the tutorial. Nearly all participants in the study group improved their quiz score, with an average improvement of 17%, versus only 3% (P = 0.005) in the control group. These results support the notion that annotated digital pathology slides are superior to non-annotated slides for the purpose of resident education. © 2014 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  16. PAPARA(ZZ)I: An open-source software interface for annotating photographs of the deep-sea

    NASA Astrophysics Data System (ADS)

    Marcon, Yann; Purser, Autun

    PAPARA(ZZ)I is a lightweight and intuitive image annotation program developed for the study of benthic megafauna. It offers functionalities such as free, grid and random point annotation. Annotations may be made following existing classification schemes for marine biota and substrata or with the use of user defined, customised lists of keywords, which broadens the range of potential application of the software to other types of studies (e.g. marine litter distribution assessment). If Internet access is available, PAPARA(ZZ)I can also query and use standardised taxa names directly from the World Register of Marine Species (WoRMS). Program outputs include abundances, densities and size calculations per keyword (e.g. per taxon). These results are written into text files that can be imported into spreadsheet programs for further analyses. PAPARA(ZZ)I is open-source and is available at http://papara-zz-i.github.io. Compiled versions exist for most 64-bit operating systems: Windows, Mac OS X and Linux.

  17. Breast mass detection in mammography and tomosynthesis via fully convolutional network-based heatmap regression

    NASA Astrophysics Data System (ADS)

    Zhang, Jun; Cain, Elizabeth Hope; Saha, Ashirbani; Zhu, Zhe; Mazurowski, Maciej A.

    2018-02-01

    Breast mass detection in mammography and digital breast tomosynthesis (DBT) is an essential step in computerized breast cancer analysis. Deep learning-based methods incorporate feature extraction and model learning into a unified framework and have achieved impressive performance in various medical applications (e.g., disease diagnosis, tumor detection, and landmark detection). However, these methods require large-scale accurately annotated data. Unfortunately, it is challenging to get precise annotations of breast masses. To address this issue, we propose a fully convolutional network (FCN) based heatmap regression method for breast mass detection, using only weakly annotated mass regions in mammography images. Specifically, we first generate heat maps of masses based on human-annotated rough regions for breast masses. We then develop an FCN model for end-to-end heatmap regression with an F-score loss function, where the mammography images are regarded as the input and heatmaps for breast masses are used as the output. Finally, the probability map of mass locations can be estimated with the trained model. Experimental results on a mammography dataset with 439 subjects demonstrate the effectiveness of our method. Furthermore, we evaluate whether we can use mammography data to improve detection models for DBT, since mammography shares similar structure with tomosynthesis. We propose a transfer learning strategy by fine-tuning the learned FCN model from mammography images. We test this approach on a small tomosynthesis dataset with only 40 subjects, and we show an improvement in the detection performance as compared to training the model from scratch.
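Turning a weak (rough-region) annotation into a regression target can be sketched as a Gaussian heatmap centred on the annotated box; the grid size and sigma below are hypothetical choices, not the paper's settings:

```python
# Generating a training heatmap from a weak bounding-box annotation:
# a 2-D Gaussian centred on the box, peaking at 1.0 at the centre.
import math

def box_heatmap(h, w, box, sigma=2.0):
    r0, c0, r1, c1 = box
    cr, cc = (r0 + r1) / 2.0, (c0 + c1) / 2.0
    return [[math.exp(-((r - cr) ** 2 + (c - cc) ** 2) / (2 * sigma ** 2))
             for c in range(w)] for r in range(h)]

# A rough mass region in a 9x9 toy image.
hm = box_heatmap(9, 9, (3, 3, 5, 5))
print(hm[4][4], round(hm[0][0], 4))
```

Because the target decays smoothly away from the rough centre instead of demanding a pixel-exact mask, the FCN can be trained even though the annotations are imprecise.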

  18. DeepScope: Nonintrusive Whole Slide Saliency Annotation and Prediction from Pathologists at the Microscope

    PubMed Central

    Schaumberg, Andrew J.; Sirintrapun, S. Joseph; Al-Ahmadie, Hikmat A.; Schüffler, Peter J.; Fuchs, Thomas J.

    2018-01-01

    Modern digital pathology departments have grown to produce whole-slide image data at petabyte scale, an unprecedented treasure chest for medical machine learning tasks. Unfortunately, most digital slides are not annotated at the image level, hindering large-scale application of supervised learning. Manual labeling is prohibitive, requiring pathologists with decades of training and outstanding clinical service responsibilities. This problem is further aggravated by the United States Food and Drug Administration’s ruling that primary diagnosis must come from a glass slide rather than a digital image. We present the first end-to-end framework to overcome this problem, gathering annotations in a nonintrusive manner during a pathologist’s routine clinical work: (i) microscope-specific 3D-printed commodity camera mounts are used to video record the glass-slide-based clinical diagnosis process; (ii) after routine scanning of the whole slide, the video frames are registered to the digital slide; (iii) motion and observation time are estimated to generate a spatial and temporal saliency map of the whole slide. Demonstrating the utility of these annotations, we train a convolutional neural network that detects diagnosis-relevant salient regions, then report accuracy of 85.15% in bladder and 91.40% in prostate, with 75.00% accuracy when training on prostate but predicting in bladder, despite different pathologists examining the different tissues. When training on one patient but testing on another, AUROC in bladder is 0.79±0.11 and in prostate is 0.96±0.04. Our tool is available at https://bitbucket.org/aschaumberg/deepscope PMID:29601065

  19. Semantically Interoperable XML Data

    PubMed Central

    Vergara-Niedermayr, Cristobal; Wang, Fusheng; Pan, Tony; Kurc, Tahsin; Saltz, Joel

    2013-01-01

    XML is ubiquitously used as an information exchange platform for web-based applications in healthcare, life sciences, and many other domains. Proliferating XML data are now managed through the latest native XML database technologies. XML data sources conforming to common XML schemas can be shared and integrated with syntactic interoperability. Semantic interoperability can be achieved through semantic annotations of data models using common data elements linked to concepts from ontologies. In this paper, we present a framework and software system to support the development of semantically interoperable XML-based data sources that can be shared through a Grid infrastructure. We also present our work on supporting semantically validated XML data through semantic annotations for XML Schema, semantic validation and semantic authoring of XML data. We demonstrate the use of the system for a biomedical database of medical image annotations and markups. PMID:25298789
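Attaching an ontology concept code to an XML data element, in the spirit described above, can be sketched with the standard library; the element names and the concept code are invented for illustration, not the paper's actual schema:

```python
# A data element carrying a semantic annotation: a concept code plus the
# coding scheme that gives it meaning (names and code are illustrative).
import xml.etree.ElementTree as ET

elem = ET.Element("imageAnnotation")
roi = ET.SubElement(elem, "region", {"shape": "circle"})
ET.SubElement(roi, "semanticConcept",
              {"codeValue": "C0006142", "codingScheme": "exampleOntology"})
xml_text = ET.tostring(elem, encoding="unicode")

# Any consumer that knows the coding scheme can resolve the concept,
# independently of the element names used by this particular source.
parsed = ET.fromstring(xml_text)
print(parsed.find("./region/semanticConcept").get("codeValue"))
```

Syntactic interoperability comes from the shared schema; semantic interoperability comes from the concept code, which survives even if two sources structure their XML differently.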

  20. Automated microscopy for high-content RNAi screening

    PubMed Central

    2010-01-01

    Fluorescence microscopy is one of the most powerful tools to investigate complex cellular processes such as cell division, cell motility, or intracellular trafficking. The availability of RNA interference (RNAi) technology and automated microscopy has opened the possibility to perform cellular imaging in functional genomics and other large-scale applications. Although imaging often dramatically increases the content of a screening assay, it poses new challenges to achieve accurate quantitative annotation and therefore needs to be carefully adjusted to the specific needs of individual screening applications. In this review, we discuss principles of assay design, large-scale RNAi, microscope automation, and computational data analysis. We highlight strategies for imaging-based RNAi screening adapted to different library and assay designs. PMID:20176920

  1. Interpretation and mapping of geological features using mobile devices for 3D outcrop modelling

    NASA Astrophysics Data System (ADS)

    Buckley, Simon J.; Kehl, Christian; Mullins, James R.; Howell, John A.

    2016-04-01

    Advances in 3D digital geometric characterisation have resulted in widespread adoption in recent years, with photorealistic models utilised for interpretation, quantitative and qualitative analysis, as well as education, in an increasingly diverse range of geoscience applications. Topographic models created using lidar and photogrammetry, optionally combined with imagery from sensors such as hyperspectral and thermal cameras, are now becoming commonplace in geoscientific research. Mobile devices (tablets and smartphones) are maturing rapidly to become powerful field computers capable of displaying and interpreting 3D models directly in the field. With increasingly high-quality digital image capture, combined with on-board sensor pose estimation, mobile devices are, in addition, a source of primary data, which can be employed to enhance existing geological models. Adding supplementary image textures and 2D annotations to photorealistic models is therefore a desirable next step to complement conventional field geoscience. This contribution reports on research into field-based interpretation and conceptual sketching on images and photorealistic models on mobile devices, motivated by the desire to utilise digital outcrop models to generate high quality training images (TIs) for multipoint statistics (MPS) property modelling. Representative training images define sedimentological concepts and spatial relationships between elements in the system, which are subsequently modelled using artificial learning to populate geocellular models. Photorealistic outcrop models are underused sources of quantitative and qualitative information for generating TIs, explored further in this research by linking field and office workflows through the mobile device. Existing textured models are loaded to the mobile device, allowing rendering in a 3D environment. 
Because interpretation in 2D is more familiar and comfortable for users, the developed application allows new images to be captured with the device's digital camera, and an interface is available for annotating (interpreting) the image using lines and polygons. Image-to-geometry registration is then performed using a developed algorithm, initialised using the coarse pose from the on-board orientation and positioning sensors. The annotations made on the captured images are then available in the 3D model coordinate system for overlay and export. This workflow allows geologists to make interpretations and conceptual models in the field, which can then be linked to and refined in office workflows for later MPS property modelling.
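
    The image-to-geometry registration described above ultimately lifts 2D annotations into the 3D model coordinate system. The abstract does not give the algorithm, but the core operation can be sketched with a standard pinhole camera model; the function name and the assumption of known per-vertex depths (e.g. obtained by ray-casting against the textured mesh) are illustrative, not the authors' implementation:

```python
import numpy as np

def backproject_annotation(pixels_uv, depths, K, R, t):
    """Lift 2D annotation vertices into the 3D model frame.

    pixels_uv : (N, 2) annotated image points
    depths    : (N,) depth along each viewing ray (assumed known,
                e.g. from ray-casting against the outcrop mesh)
    K         : (3, 3) camera intrinsics
    R, t      : camera-to-world rotation (3, 3) and translation (3,)
    """
    uv1 = np.hstack([pixels_uv, np.ones((len(pixels_uv), 1))])
    rays_cam = (np.linalg.inv(K) @ uv1.T).T   # viewing rays in camera frame
    pts_cam = rays_cam * depths[:, None]      # scale rays by depth
    return pts_cam @ R.T + t                  # transform into world frame
```

    With the coarse pose from the on-board sensors as initialisation, each annotated polygon vertex can then be overlaid on the 3D model and exported.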

  2. Prokaryotic Contig Annotation Pipeline Server: Web Application for a Prokaryotic Genome Annotation Pipeline Based on the Shiny App Package.

    PubMed

    Park, Byeonghyeok; Baek, Min-Jeong; Min, Byoungnam; Choi, In-Geol

    2017-09-01

    Genome annotation is a primary step in genomic research. To establish a light and portable prokaryotic genome annotation pipeline for use in individual laboratories, we developed a Shiny app package designated as "P-CAPS" (Prokaryotic Contig Annotation Pipeline Server). The package is composed of R and Python scripts that integrate publicly available annotation programs into a server application. P-CAPS is not only a browser-based interactive application but also a distributable Shiny app package that can be installed on any personal computer. The final annotation is provided in various standard formats and is summarized in an R markdown document. Annotation can be visualized and examined with a public genome browser. A benchmark test showed that the annotation quality and completeness of P-CAPS were reliable and compatible with those of currently available public pipelines.

  3. SureChEMBL: a large-scale, chemically annotated patent document database.

    PubMed

    Papadatos, George; Davies, Mark; Dedman, Nathan; Chambers, Jon; Gaulton, Anna; Siddle, James; Koks, Richard; Irvine, Sean A; Pettersson, Joe; Goncharoff, Nicko; Hersey, Anne; Overington, John P

    2016-01-04

    SureChEMBL is a publicly available large-scale resource containing compounds extracted from the full text, images and attachments of patent documents. The data are extracted from the patent literature according to an automated text and image-mining pipeline on a daily basis. SureChEMBL provides access to a previously unavailable, open and timely set of annotated compound-patent associations, complemented with sophisticated combined structure and keyword-based search capabilities against the compound repository and patent document corpus; given the wealth of knowledge hidden in patent documents, analysis of SureChEMBL data has immediate applications in drug discovery, medicinal chemistry and other commercial areas of chemical science. Currently, the database contains 17 million compounds extracted from 14 million patent documents. Access is available through a dedicated web-based interface and data downloads at: https://www.surechembl.org/. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  4. SureChEMBL: a large-scale, chemically annotated patent document database

    PubMed Central

    Papadatos, George; Davies, Mark; Dedman, Nathan; Chambers, Jon; Gaulton, Anna; Siddle, James; Koks, Richard; Irvine, Sean A.; Pettersson, Joe; Goncharoff, Nicko; Hersey, Anne; Overington, John P.

    2016-01-01

    SureChEMBL is a publicly available large-scale resource containing compounds extracted from the full text, images and attachments of patent documents. The data are extracted from the patent literature according to an automated text and image-mining pipeline on a daily basis. SureChEMBL provides access to a previously unavailable, open and timely set of annotated compound-patent associations, complemented with sophisticated combined structure and keyword-based search capabilities against the compound repository and patent document corpus; given the wealth of knowledge hidden in patent documents, analysis of SureChEMBL data has immediate applications in drug discovery, medicinal chemistry and other commercial areas of chemical science. Currently, the database contains 17 million compounds extracted from 14 million patent documents. Access is available through a dedicated web-based interface and data downloads at: https://www.surechembl.org/. PMID:26582922

  5. Content-Based Management of Image Databases in the Internet Age

    ERIC Educational Resources Information Center

    Kleban, James Theodore

    2010-01-01

    The Internet Age has seen the emergence of richly annotated image data collections numbering in the billions of items. This work makes contributions in three primary areas which aid the management of this data: image representation, efficient retrieval, and annotation based on content and metadata. The contributions are as follows. First,…

  6. Effective user guidance in online interactive semantic segmentation

    NASA Astrophysics Data System (ADS)

    Petersen, Jens; Bendszus, Martin; Debus, Jürgen; Heiland, Sabine; Maier-Hein, Klaus H.

    2017-03-01

    With the recent success of machine-learning-based solutions for automatic image parsing, the availability of reference image annotations for algorithm training is one of the major bottlenecks in medical image segmentation. We are interested in interactive semantic segmentation methods that can be used in an online fashion to generate expert segmentations. These can be used to train automated segmentation techniques or, from an application perspective, for quick and accurate tumor progression monitoring. Using simulated user interactions in an MRI glioblastoma segmentation task, we show that if the user possesses knowledge of the correct segmentation it is significantly (p <= 0.009) better to present data and current segmentation to the user in such a manner that they can easily identify falsely classified regions, compared to guiding the user to regions where the classifier exhibits high uncertainty, resulting in differences of mean Dice scores between +0.070 (Whole tumor) and +0.136 (Tumor Core) after 20 iterations. The annotation process should cover all classes equally, which results in a significant (p <= 0.002) improvement compared to completely random annotations anywhere in falsely classified regions for small tumor regions such as the necrotic tumor core (mean Dice +0.151 after 20 it.) and non-enhancing abnormalities (mean Dice +0.069 after 20 it.). These findings provide important insights for the development of efficient interactive segmentation systems and user interfaces.
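
    The comparison hinges on how the next region for user annotation is chosen. A toy sketch of the two guidance strategies (error-guided vs. uncertainty-guided); the names and the scoring are illustrative, not the simulation code used in the study:

```python
import numpy as np

def next_annotation_region(truth, prediction, uncertainty, guide_by_error=True):
    """Return the flat index of the pixel the user is steered to.

    Error-based guidance targets the most confidently *wrong* pixel
    (possible only when the correct segmentation is known); uncertainty-based
    guidance targets the pixel where the classifier is least certain.
    """
    wrong = (truth != prediction).astype(float)
    if guide_by_error:
        # steer the user to falsely classified regions
        score = wrong * (1.0 - uncertainty)
    else:
        # steer the user to regions of high classifier uncertainty
        score = uncertainty
    return int(np.argmax(score))
```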

  7. The 'Soil Cover App' - a new tool for fast determination of dead and living biomass on soil

    NASA Astrophysics Data System (ADS)

    Bauer, Thomas; Strauss, Peter; Riegler-Nurscher, Peter; Prankl, Johann; Prankl, Heinrich

    2017-04-01

    Worldwide, many agricultural practices aim at soil protection strategies using living or dead biomass as soil cover. Especially when management practices focus on soil erosion mitigation, their effectiveness is directly driven by the amount of soil cover left on the soil surface. Hence there is a need for quick and reliable methods of soil cover estimation, not only for living biomass but particularly for dead biomass (mulch). Available methods for soil cover measurement are either subjective, depending on an educated guess, or time-consuming, e.g., if the image is analysed manually at grid points. We therefore developed a mobile application using an algorithm based on entangled forest classification. The final output of the algorithm gives classified labels for each pixel of the input image as well as the percentage of each class: living biomass, dead biomass, stones and soil. Our training dataset consisted of more than 250 different images and their annotated class information. Images were taken under a range of environmental conditions, with varying light, soil coverage from 0% to 100%, and different materials such as living plants, residues, straw material and stones. We compared the results provided by our mobile application with a data set of 180 images that had been manually annotated. A comparison between both methods revealed a regression slope of 0.964 with a coefficient of determination R2 = 0.92, corresponding to an average error of about 4%. While the average error of living plant classification was about 3%, dead residue classification resulted in an 8% error. Thus the new mobile application tool offers a fast and easy way to obtain information on the protective potential of a particular agricultural management site.
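
    The app's final output, per-pixel class labels reduced to per-class cover percentages, is straightforward to sketch. The label codes below are hypothetical, not the app's actual encoding:

```python
import numpy as np

# Hypothetical label codes mirroring the app's four classes.
CLASSES = {0: "soil", 1: "living biomass", 2: "dead biomass", 3: "stones"}

def cover_percentages(label_image):
    """Per-class soil-cover percentages from a per-pixel label image."""
    labels = np.asarray(label_image)
    total = labels.size
    return {name: 100.0 * np.count_nonzero(labels == code) / total
            for code, name in CLASSES.items()}
```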

  8. AnnotateGenomicRegions: a web application.

    PubMed

    Zammataro, Luca; DeMolfetta, Rita; Bucci, Gabriele; Ceol, Arnaud; Muller, Heiko

    2014-01-01

    Modern genomic technologies produce large amounts of data that can be mapped to specific regions in the genome. Among the first steps in interpreting the results is annotation of genomic regions with known features such as genes, promoters, CpG islands etc. Several tools have been published to perform this task. However, using these tools often requires a significant amount of bioinformatics skills and/or downloading and installing dedicated software. Here we present AnnotateGenomicRegions, a web application that accepts genomic regions as input and outputs a selection of overlapping and/or neighboring genome annotations. Supported organisms include human (hg18, hg19), mouse (mm8, mm9, mm10), zebrafish (danRer7), and Saccharomyces cerevisiae (sacCer2, sacCer3). AnnotateGenomicRegions is accessible online on a public server or can be installed locally. Some frequently used annotations and genomes are embedded in the application while custom annotations may be added by the user. The increasing spread of genomic technologies generates the need for a simple-to-use annotation tool for genomic regions that can be used by biologists and bioinformaticians alike. AnnotateGenomicRegions meets this demand. AnnotateGenomicRegions is an open-source web application that can be installed on any personal computer or institute server. AnnotateGenomicRegions is available at: http://cru.genomics.iit.it/AnnotateGenomicRegions.
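
    The core operation AnnotateGenomicRegions performs, overlapping query regions with known features such as genes or CpG islands, can be sketched in a few lines. The function and the half-open coordinate convention are illustrative assumptions, not the tool's actual implementation:

```python
def annotate_regions(queries, features):
    """Return, for each query region, the names of overlapping features.

    queries  : list of (chrom, start, end)
    features : list of (chrom, start, end, name), e.g. genes or CpG islands
    Coordinates are treated as half-open [start, end), as in BED files.
    """
    hits = {}
    for chrom, qs, qe in queries:
        hits[(chrom, qs, qe)] = [name for fc, fs, fe, name in features
                                 if fc == chrom and qs < fe and fs < qe]
    return hits
```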

  9. AnnotateGenomicRegions: a web application

    PubMed Central

    2014-01-01

    Background Modern genomic technologies produce large amounts of data that can be mapped to specific regions in the genome. Among the first steps in interpreting the results is annotation of genomic regions with known features such as genes, promoters, CpG islands etc. Several tools have been published to perform this task. However, using these tools often requires a significant amount of bioinformatics skills and/or downloading and installing dedicated software. Results Here we present AnnotateGenomicRegions, a web application that accepts genomic regions as input and outputs a selection of overlapping and/or neighboring genome annotations. Supported organisms include human (hg18, hg19), mouse (mm8, mm9, mm10), zebrafish (danRer7), and Saccharomyces cerevisiae (sacCer2, sacCer3). AnnotateGenomicRegions is accessible online on a public server or can be installed locally. Some frequently used annotations and genomes are embedded in the application while custom annotations may be added by the user. Conclusions The increasing spread of genomic technologies generates the need for a simple-to-use annotation tool for genomic regions that can be used by biologists and bioinformaticians alike. AnnotateGenomicRegions meets this demand. AnnotateGenomicRegions is an open-source web application that can be installed on any personal computer or institute server. AnnotateGenomicRegions is available at: http://cru.genomics.iit.it/AnnotateGenomicRegions. PMID:24564446

  10. Enabling Histopathological Annotations on Immunofluorescent Images through Virtualization of Hematoxylin and Eosin

    PubMed Central

    Lahiani, Amal; Klaiman, Eldad; Grimm, Oliver

    2018-01-01

    Context: Medical diagnosis and clinical decisions rely heavily on the histopathological evaluation of tissue samples, especially in oncology. Historically, classical histopathology has been the gold standard for tissue evaluation and assessment by pathologists. The most widely and commonly used dyes in histopathology are hematoxylin and eosin (H&E), as the diagnosis of most malignancies is largely based on this protocol. H&E staining has been used for more than a century to identify the tissue characteristics and structure morphologies that are needed for tumor diagnosis. In many cases, as tissue is scarce in clinical studies, fluorescence imaging is necessary to allow staining of the same specimen with multiple biomarkers simultaneously. Since fluorescence imaging is a relatively new technology in the pathology landscape, histopathologists are not used to or trained in annotating or interpreting these images. Aims, Settings and Design: To allow pathologists to annotate these images without the need for additional training, we designed an algorithm for the conversion of fluorescence images to brightfield H&E images. Subjects and Methods: In this algorithm, we use fluorescent nuclei staining to reproduce the hematoxylin information and natural tissue autofluorescence to reproduce the eosin information, avoiding the need to specifically stain the proteins or intracellular structures with an additional fluorescence stain. Statistical Analysis Used: Our method is based on optimizing a transform function from fluorescence to H&E images using least mean square optimization. Results: It results in high-quality virtual H&E digital images that can easily and efficiently be analyzed by pathologists. We validated our results with pathologists by having them annotate tumor in real and virtual H&E whole slide images, and we obtained promising results. 
Conclusions: Hence, we provide a solution that enables pathologists to assess tissue and annotate specific structures based on multiplexed fluorescence images. PMID:29531846
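
    The least-mean-square transform described under Statistical Analysis can be sketched as an ordinary linear least-squares fit from fluorescence channels (plus a bias term) to H&E RGB values. The paper's actual transform is more elaborate, so treat this as an illustrative minimal version:

```python
import numpy as np

def fit_virtual_he(fluor_pixels, he_pixels):
    """Fit a linear map from fluorescence channels to H&E RGB by least squares.

    fluor_pixels : (N, C) fluorescence intensities (e.g. nuclei stain plus
                   autofluorescence channels)
    he_pixels    : (N, 3) matching real H&E RGB values
    Returns the (C+1, 3) transform, including a bias row.
    """
    X = np.hstack([fluor_pixels, np.ones((len(fluor_pixels), 1))])
    T, *_ = np.linalg.lstsq(X, he_pixels, rcond=None)
    return T

def apply_virtual_he(fluor_pixels, T):
    """Render virtual H&E RGB from fluorescence channels with a fitted T."""
    X = np.hstack([fluor_pixels, np.ones((len(fluor_pixels), 1))])
    return X @ T
```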

  11. Enabling Histopathological Annotations on Immunofluorescent Images through Virtualization of Hematoxylin and Eosin.

    PubMed

    Lahiani, Amal; Klaiman, Eldad; Grimm, Oliver

    2018-01-01

    Medical diagnosis and clinical decisions rely heavily on the histopathological evaluation of tissue samples, especially in oncology. Historically, classical histopathology has been the gold standard for tissue evaluation and assessment by pathologists. The most widely and commonly used dyes in histopathology are hematoxylin and eosin (H&E) as most malignancies diagnosis is largely based on this protocol. H&E staining has been used for more than a century to identify tissue characteristics and structures morphologies that are needed for tumor diagnosis. In many cases, as tissue is scarce in clinical studies, fluorescence imaging is necessary to allow staining of the same specimen with multiple biomarkers simultaneously. Since fluorescence imaging is a relatively new technology in the pathology landscape, histopathologists are not used to or trained in annotating or interpreting these images. To allow pathologists to annotate these images without the need for additional training, we designed an algorithm for the conversion of fluorescence images to brightfield H&E images. In this algorithm, we use fluorescent nuclei staining to reproduce the hematoxylin information and natural tissue autofluorescence to reproduce the eosin information avoiding the necessity to specifically stain the proteins or intracellular structures with an additional fluorescence stain. Our method is based on optimizing a transform function from fluorescence to H&E images using least mean square optimization. It results in high quality virtual H&E digital images that can easily and efficiently be analyzed by pathologists. We validated our results with pathologists by making them annotate tumor in real and virtual H&E whole slide images and we obtained promising results. Hence, we provide a solution that enables pathologists to assess tissue and annotate specific structures based on multiplexed fluorescence images.

  12. Portrayed emotions in the movie "Forrest Gump"

    PubMed Central

    Boennen, Manuel; Gehrke, Mareike; Golz, Madleen; Hartigs, Benita; Hoffmann, Nico; Keil, Sebastian; Perlow, Malú; Peukmann, Anne Katrin; Rabe, Lea Noell; von Sobbe, Franca-Rosa; Hanke, Michael

    2015-01-01

    Here we present a dataset with a description of portrayed emotions in the movie "Forrest Gump". A total of 12 observers independently annotated emotional episodes regarding their temporal location and duration. The nature of an emotion was characterized with basic attributes, such as arousal and valence, as well as explicit emotion category labels. In addition, annotations include a record of the perceptual evidence for the presence of an emotion. Two variants of the movie were annotated separately: 1) an audio-movie version of Forrest Gump that has been used as a stimulus for the acquisition of a large public functional brain imaging dataset, and 2) the original audio-visual movie. We present reliability and consistency estimates that suggest that both stimuli can be used to study visual and auditory emotion cue processing in real-life-like situations. Raw annotations from all observers are publicly released in full in order to maximize their utility for a wide range of applications and possible future extensions. In addition, aggregate time series of inter-observer agreement with respect to particular attributes of portrayed emotions are provided to facilitate adoption of these data. PMID:25977755
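
    One aggregate the dataset provides is inter-observer agreement over time. A minimal sketch of how such a time series can be computed from per-observer episode annotations; the function name and sampling scheme are illustrative, not the dataset's actual aggregation procedure:

```python
def agreement_series(observer_spans, t_start, t_end, step=1.0):
    """Fraction of observers marking an attribute at each sampled time point.

    observer_spans : one list per observer of (onset, offset) episode tuples
    Returns agreement values sampled every `step` seconds on [t_start, t_end).
    """
    n_obs = len(observer_spans)
    series = []
    t = t_start
    while t < t_end:
        # count observers with an episode covering time t
        marking = sum(any(s <= t < e for s, e in spans)
                      for spans in observer_spans)
        series.append(marking / n_obs)
        t += step
    return series
```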

  13. Towards the VWO Annotation Service: a Success Story of the IMAGE RPI Expert Rating System

    NASA Astrophysics Data System (ADS)

    Reinisch, B. W.; Galkin, I. A.; Fung, S. F.; Benson, R. F.; Kozlov, A. V.; Khmyrov, G. M.; Garcia, L. N.

    2010-12-01

    Interpretation of Heliophysics wave data requires specialized knowledge of wave phenomena. Users of the virtual wave observatory (VWO) will greatly benefit from a data annotation service that will allow querying of data by phenomenon type, thus helping accomplish the VWO goal to make Heliophysics wave data searchable, understandable, and usable by the scientific community. Individual annotations can be sorted by phenomenon type and reduced into event lists (catalogs). However, in contrast to the event lists, annotation records allow a greater flexibility of collaborative management by more easily admitting operations of addition, revision, or deletion. They can therefore become the building blocks for an interactive Annotation Service with a suitable graphic user interface to the VWO middleware. The VWO Annotation Service vision is an interactive, collaborative sharing of domain expert knowledge with fellow scientists and students alike. An effective prototype of the VWO Annotation Service has been in operation at the University of Massachusetts Lowell since 2001. An expert rating system (ERS) was developed for annotating the IMAGE radio plasma imager (RPI) active sounding data containing 1.2 million plasmagrams. The RPI data analysts can use ERS to submit expert ratings of plasmagram features, such as the presence of echo traces resulting from RPI signals reflected by distant plasma structures. Since its inception in 2001, the RPI ERS has accumulated 7351 expert plasmagram ratings in 16 phenomenon categories, together with free-text descriptions and other metadata. In addition to human expert ratings, the system holds 225,125 ratings submitted by the CORPRAL data prospecting software that employs a model of the human pre-attentive vision to select images potentially containing interesting features. The annotation records proved to be instrumental in a number of investigations where manual data exploration would have been prohibitively tedious and expensive. 
Especially useful are queries of the annotation database for successive plasmagrams containing echo traces. Several success stories of the RPI ERS using this capability will be discussed, particularly in terms of how they may be extended to develop the VWO Annotation Service.

  14. A fully automatic end-to-end method for content-based image retrieval of CT scans with similar liver lesion annotations.

    PubMed

    Spanier, A B; Caplan, N; Sosna, J; Acar, B; Joskowicz, L

    2018-01-01

    The goal of medical content-based image retrieval (M-CBIR) is to assist radiologists in the decision-making process by retrieving medical cases similar to a given image. One of the key interests of radiologists is lesions and their annotations, since the patient treatment depends on the lesion diagnosis. Therefore, a key feature of M-CBIR systems is the retrieval of scans with the most similar lesion annotations. To be of value, M-CBIR systems should be fully automatic to handle large case databases. We present a fully automatic end-to-end method for the retrieval of CT scans with similar liver lesion annotations. The input is a database of abdominal CT scans labeled with liver lesions, a query CT scan, and optionally one radiologist-specified lesion annotation of interest. The output is an ordered list of the database CT scans with the most similar liver lesion annotations. The method starts by automatically segmenting the liver in the scan. It then extracts a histogram-based feature vector from the segmented region, learns the features' relative importance, and ranks the database scans according to the relative importance measure. The main advantages of our method are that it fully automates the end-to-end querying process, that it uses simple and efficient techniques that are scalable to large datasets, and that it produces quality retrieval results using an unannotated CT scan. Our experimental results on 9 CT queries on a dataset of 41 volumetric CT scans from the 2014 Image CLEF Liver Annotation Task yield an average retrieval accuracy (Normalized Discounted Cumulative Gain index) of 0.77 and 0.84 without/with annotation, respectively. Fully automatic end-to-end retrieval of similar cases based on image information alone, rather than on disease diagnosis, may help radiologists to better diagnose liver lesions.
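
    The ranking step, comparing histogram-based feature vectors under a learned relative-importance measure, can be sketched as a weighted distance. The per-bin weighting here is an illustrative stand-in for the learned measure described in the abstract:

```python
import numpy as np

def rank_by_lesion_similarity(query_hist, db_hists, weights=None):
    """Rank database scans by weighted distance between histogram features.

    query_hist : (B,) histogram-based feature vector of the query liver
    db_hists   : (M, B) feature vectors of the database scans
    weights    : (B,) per-bin relative importance (uniform if None)
    Returns database indices ordered from most to least similar.
    """
    db = np.asarray(db_hists, dtype=float)
    w = np.ones(db.shape[1]) if weights is None else np.asarray(weights)
    dists = np.sqrt(((db - np.asarray(query_hist)) ** 2 * w).sum(axis=1))
    return list(np.argsort(dists))
```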

  15. Deformably registering and annotating whole CLARITY brains to an atlas via masked LDDMM

    NASA Astrophysics Data System (ADS)

    Kutten, Kwame S.; Vogelstein, Joshua T.; Charon, Nicolas; Ye, Li; Deisseroth, Karl; Miller, Michael I.

    2016-04-01

    The CLARITY method renders brains optically transparent to enable high-resolution imaging in the structurally intact brain. Anatomically annotating CLARITY brains is necessary for discovering which regions contain signals of interest. Manually annotating whole-brain, terabyte CLARITY images is difficult, time-consuming, subjective, and error-prone. Automatically registering CLARITY images to a pre-annotated brain atlas offers a solution, but is difficult for several reasons. Removal of the brain from the skull and subsequent storage and processing cause variable non-rigid deformations, thus compounding inter-subject anatomical variability. Additionally, the signal in CLARITY images arises from various biochemical contrast agents which only sparsely label brain structures. This sparse labeling challenges the most commonly used registration algorithms that need to match image histogram statistics to the more densely labeled histological brain atlases. The standard method is a multiscale Mutual Information B-spline algorithm that dynamically generates an average template as an intermediate registration target. We determined that this method performs poorly when registering CLARITY brains to the Allen Institute's Mouse Reference Atlas (ARA), because the image histogram statistics are poorly matched. Therefore, we developed a method (Mask-LDDMM) for registering CLARITY images, that automatically finds the brain boundary and learns the optimal deformation between the brain and atlas masks. Using Mask-LDDMM without an average template provided better results than the standard approach when registering CLARITY brains to the ARA. The LDDMM pipelines developed here provide a fast automated way to anatomically annotate CLARITY images; our code is available as open source software at http://NeuroData.io.
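
    A common way to quantify how well the extracted brain mask aligns with the atlas mask, before and after a registration such as Mask-LDDMM, is the Dice overlap. This snippet is a generic sketch of that evaluation metric, not part of the NeuroData pipeline:

```python
import numpy as np

def mask_dice(mask_a, mask_b):
    """Dice overlap between two binary masks (1.0 = perfect agreement)."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    inter = np.logical_and(a, b).sum()
    return 2.0 * inter / (a.sum() + b.sum())
```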

  16. Smartphone applications: A contemporary resource for dermatopathology

    PubMed Central

    Hanna, Matthew G.; Parwani, Anil V.; Pantanowitz, Liron; Punjabi, Vinod; Singh, Rajendra

    2015-01-01

    Introduction: Smartphone applications in medicine are becoming increasingly prevalent. Given that most pathologists and pathology trainees today use smartphones, an obvious modality for pathology education is through smartphone applications. “MyDermPath” is a novel smartphone application that was developed as an interactive reference tool for dermatology and dermatopathology, available for iOS and Android. Materials and Methods: “MyDermPath” was developed using Apple Xcode and Google Android SDK. Dermatology images (static and virtual slides) were annotated and configured into an algorithmic format. Each image comprised educational data (diagnosis, clinical information, histopathology, special stains, differential diagnosis, clinical management, linked PubMed references). Added functionality included personal note taking, pop quiz, and image upload capabilities. A website was created (http://mydermpath.com) to mirror the app. Results: The application was released in August 2011 and updated in November 2013. More than 1,100 reference diagnoses, with over 2,000 images are available via the application and website. The application has been downloaded approximately 14,000 times. The application is available for use on iOS and Android platforms. Conclusions: Smartphone applications have tremendous potential for advancing pathology education. “MyDermPath” represents an interactive reference tool for dermatology and dermatopathologists. PMID:26284155

  17. Smartphone applications: A contemporary resource for dermatopathology.

    PubMed

    Hanna, Matthew G; Parwani, Anil V; Pantanowitz, Liron; Punjabi, Vinod; Singh, Rajendra

    2015-01-01

    Smartphone applications in medicine are becoming increasingly prevalent. Given that most pathologists and pathology trainees today use smartphones, an obvious modality for pathology education is through smartphone applications. "MyDermPath" is a novel smartphone application that was developed as an interactive reference tool for dermatology and dermatopathology, available for iOS and Android. "MyDermPath" was developed using Apple Xcode and Google Android SDK. Dermatology images (static and virtual slides) were annotated and configured into an algorithmic format. Each image comprised educational data (diagnosis, clinical information, histopathology, special stains, differential diagnosis, clinical management, linked PubMed references). Added functionality included personal note taking, pop quiz, and image upload capabilities. A website was created (http://mydermpath.com) to mirror the app. The application was released in August 2011 and updated in November 2013. More than 1,100 reference diagnoses, with over 2,000 images are available via the application and website. The application has been downloaded approximately 14,000 times. The application is available for use on iOS and Android platforms. Smartphone applications have tremendous potential for advancing pathology education. "MyDermPath" represents an interactive reference tool for dermatology and dermatopathologists.

  18. Computer Applications in Marketing. An Annotated Bibliography of Computer Software.

    ERIC Educational Resources Information Center

    Burrow, Jim; Schwamman, Faye

    This bibliography contains annotations of 95 items of educational and business software with applications in seven marketing and business functions. The annotations, which appear in alphabetical order by title, provide this information: category (related application), title, date, source and price, equipment, supplementary materials, description…

  19. A boosting framework for visuality-preserving distance metric learning and its application to medical image retrieval.

    PubMed

    Yang, Liu; Jin, Rong; Mummert, Lily; Sukthankar, Rahul; Goode, Adam; Zheng, Bin; Hoi, Steven C H; Satyanarayanan, Mahadev

    2010-01-01

    Similarity measurement is a critical component in content-based image retrieval systems, and learning a good distance metric can significantly improve retrieval performance. However, despite extensive study, there are several major shortcomings with the existing approaches for distance metric learning that can significantly affect their application to medical image retrieval. In particular, "similarity" can mean very different things in image retrieval: resemblance in visual appearance (e.g., two images that look like one another) or similarity in semantic annotation (e.g., two images of tumors that look quite different yet are both malignant). Current approaches for distance metric learning typically address only one goal without consideration of the other. This is problematic for medical image retrieval where the goal is to assist doctors in decision making. In these applications, given a query image, the goal is to retrieve similar images from a reference library whose semantic annotations could provide the medical professional with greater insight into the possible interpretations of the query image. If the system were to retrieve images that did not look like the query, then users would be less likely to trust the system; on the other hand, retrieving images that appear superficially similar to the query but are semantically unrelated is undesirable because that could lead users toward an incorrect diagnosis. Hence, learning a distance metric that preserves both visual resemblance and semantic similarity is important. We emphasize that, although our study is focused on medical image retrieval, the problem addressed in this work is critical to many image retrieval systems. We present a boosting framework for distance metric learning that aims to preserve both visual and semantic similarities. 
The boosting framework first learns a binary representation using side information, in the form of labeled pairs, and then computes the distance as a weighted Hamming distance using the learned binary representation. A boosting algorithm is presented to efficiently learn the distance function. We evaluate the proposed algorithm on a mammographic image reference library with an Interactive Search-Assisted Decision Support (ISADS) system and on the medical image data set from ImageCLEF. Our results show that the boosting framework compares favorably to state-of-the-art approaches for distance metric learning in retrieval accuracy, with much lower computational cost. Additional evaluation with the COREL collection shows that our algorithm works well for regular image data sets.
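
    The distance the framework computes, a weighted Hamming distance over learned binary codes, is simple to sketch once the codes and per-bit weights are given; both are assumed here, since learning them is the paper's contribution:

```python
import numpy as np

def weighted_hamming(query_bits, db_bits, weights):
    """Weighted Hamming distance between a query code and a code library.

    query_bits : (B,) binary code of the query image
    db_bits    : (M, B) binary codes of the reference library
    weights    : (B,) per-bit weights (learned by boosting in the paper)
    Returns an (M,) array of distances.
    """
    db = np.asarray(db_bits)
    mismatches = db != np.asarray(query_bits)   # (M, B) boolean mask
    return mismatches @ np.asarray(weights, dtype=float)
```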

  20. A way toward analyzing high-content bioimage data by means of semantic annotation and visual data mining

    NASA Astrophysics Data System (ADS)

    Herold, Julia; Abouna, Sylvie; Zhou, Luxian; Pelengaris, Stella; Epstein, David B. A.; Khan, Michael; Nattkemper, Tim W.

    2009-02-01

    In recent years, bioimaging has turned from qualitative measurements towards a high-throughput and high-content modality, providing multiple variables for each biological sample analyzed. We present a system which combines machine-learning-based semantic image annotation and visual data mining to analyze such new multivariate bioimage data. Machine learning is employed for automatic semantic annotation of regions of interest. The annotation is the prerequisite for a biological object-oriented exploration of the feature space derived from the image variables. With the aid of visual data mining, the obtained data can be explored simultaneously in the image as well as in the feature domain. Especially when little is known of the underlying data, for example in the case of exploring the effects of a drug treatment, visual data mining can greatly aid the process of data evaluation. We demonstrate how our system is used for image evaluation to obtain information relevant to diabetes studies and the screening of new anti-diabetes treatments. Cells of the Islet of Langerhans and whole pancreas in pancreas tissue samples are annotated, and object-specific molecular features are extracted from aligned multichannel fluorescence images. These are interactively evaluated for cell type classification in order to determine the cell number and mass. Only a few parameters need to be specified, which makes the system usable also for non-computer experts and allows for high-throughput analysis.

  1. Facilitating Analysis of Multiple Partial Data Streams

    NASA Technical Reports Server (NTRS)

    Maimone, Mark W.; Liebersbach, Robert R.

    2008-01-01

Robotic Operations Automation: Mechanisms, Imaging, Navigation report Generation (ROAMING) is a set of computer programs that facilitates and accelerates both tactical and strategic analysis of time-sampled data, especially the disparate and often incomplete streams of Mars Exploration Rover (MER) telemetry data described in the immediately preceding article. As used here, tactical refers to activities over a relatively short time (one Martian day in the original MER application) and strategic refers to a longer time (the entire multi-year MER missions in the original application). Prior to installation, ROAMING must be configured with the types of data of interest, and parsers must be modified to understand the format of the input data (many example parsers are provided, including ones for general CSV files). Thereafter, new data from multiple disparate sources are automatically resampled into a single common annotated spreadsheet stored in a readable space-separated format, and these data can be processed or plotted at any time scale. Such processing or plotting makes it possible to study not only the details of a particular activity spanning only a few seconds, but also longer-term trends. ROAMING makes it possible to generate mission-wide plots of multiple engineering quantities [e.g., vehicle tilt as in Figure 1(a), motor current, numbers of images] that heretofore could be found only in thousands of separate files. ROAMING also supports automatic annotation of both images and graphs. In the MER application, labels given to terrain features by rover scientists and engineers are automatically plotted in all received images based on their associated camera models (see Figure 2), times measured in seconds are mapped to Mars local time, and command names or arbitrary time-labeled events can be used to label engineering plots, as in Figure 1(b).

  2. An efficient annotation and gene-expression derivation tool for Illumina Solexa datasets.

    PubMed

    Hosseini, Parsa; Tremblay, Arianne; Matthews, Benjamin F; Alkharouf, Nadim W

    2010-07-02

An Illumina flow cell with all eight lanes occupied produces well over a terabyte worth of images, with gigabytes of reads following sequence alignment. The ability to translate such reads into meaningful annotation is therefore of great concern and importance. One can very easily be flooded with such a great volume of textual, unannotated data, irrespective of read quality or size. CASAVA, an optional analysis tool for Illumina sequencing experiments, provides INDEL detection, SNP information, and allele calling. Extracting from such analysis not only a measure of gene expression in the form of tag counts, but furthermore an annotation of the reads, is therefore of significant value. We developed TASE (Tag counting and Analysis of Solexa Experiments), a rapid tag-counting and annotation software tool specifically designed for Illumina CASAVA sequencing datasets. Developed in Java and deployed using the jTDS JDBC driver and a SQL Server backend, TASE provides an extremely fast means of calculating gene expression through tag counts while annotating sequenced reads with each gene's presumed function, from any given CASAVA build. Such a build is generated for both DNA and RNA sequencing. Analysis is broken into two distinct components: DNA sequence or read concatenation, followed by tag counting and annotation. The end result is output containing the homology-based functional annotation and a respective gene expression measure signifying how many times sequenced reads were found within the genomic ranges of functional annotations. TASE is a powerful tool that facilitates the process of annotating a given Illumina Solexa sequencing dataset. Our results indicate that both homology-based annotation and tag-count analysis are achieved in very efficient times, allowing researchers to delve deep into a given CASAVA build and maximize information extraction from a sequencing dataset.
TASE is specially designed to translate sequence data in a CASAVA build into functional annotations while producing corresponding gene expression measurements. Such analysis is executed in an ultrafast and highly efficient manner, whether the experiment is single-read or paired-end. TASE is a user-friendly and freely available application, allowing rapid analysis and annotation of any given Illumina Solexa sequencing dataset with ease.
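At its core, the tag-counting step amounts to counting how many aligned reads fall inside the genomic range of each functional annotation. A minimal sketch of that idea (gene names and coordinates below are invented for illustration, and TASE itself is a Java/SQL Server application):

```python
# Minimal sketch of tag counting: count reads whose alignment
# position falls within an annotated gene's genomic range.
# Gene names and coordinates are hypothetical.

annotations = {
    "geneA": (100, 500),    # (start, end) on the reference
    "geneB": (800, 1200),
}

read_starts = [120, 450, 900, 950, 1500]  # alignment positions of reads

def tag_counts(annotations, read_starts):
    counts = {gene: 0 for gene in annotations}
    for pos in read_starts:
        for gene, (start, end) in annotations.items():
            if start <= pos <= end:
                counts[gene] += 1
    return counts

print(tag_counts(annotations, read_starts))  # {'geneA': 2, 'geneB': 2}
```

A production tool would replace the inner loop with an indexed range query (as TASE does against its database backend), but the counting semantics are the same.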

  3. The application of color display techniques for the analysis of Nimbus infrared radiation data

    NASA Technical Reports Server (NTRS)

    Allison, L. J.; Cherrix, G. T.; Ausfresser, H.

    1972-01-01

    A color enhancement system designed for the Applications Technology Satellite (ATS) spin scan experiment has been adapted for the analysis of Nimbus infrared radiation measurements. For a given scene recorded on magnetic tape by the Nimbus scanning radiometers, a virtually unlimited number of color images can be produced at the ATS Operations Control Center from a color selector paper tape input. Linear image interpolation has produced radiation analyses in which each brightness-color interval has a smooth boundary without any mosaic effects. An annotated latitude-longitude gridding program makes it possible to precisely locate geophysical parameters, which permits accurate interpretation of pertinent meteorological, geological, hydrological, and oceanographic features.

  4. A high-performance spatial database based approach for pathology imaging algorithm evaluation

    PubMed Central

    Wang, Fusheng; Kong, Jun; Gao, Jingjing; Cooper, Lee A.D.; Kurc, Tahsin; Zhou, Zhengwen; Adler, David; Vergara-Niedermayr, Cristobal; Katigbak, Bryan; Brat, Daniel J.; Saltz, Joel H.

    2013-01-01

    Background: Algorithm evaluation provides a means to characterize variability across image analysis algorithms, validate algorithms by comparison with human annotations, combine results from multiple algorithms for performance improvement, and facilitate algorithm sensitivity studies. The sizes of images and image analysis results in pathology image analysis pose significant challenges in algorithm evaluation. We present an efficient parallel spatial database approach to model, normalize, manage, and query large volumes of analytical image result data. This provides an efficient platform for algorithm evaluation. Our experiments with a set of brain tumor images demonstrate the application, scalability, and effectiveness of the platform. Context: The paper describes an approach and platform for evaluation of pathology image analysis algorithms. The platform facilitates algorithm evaluation through a high-performance database built on the Pathology Analytic Imaging Standards (PAIS) data model. Aims: (1) Develop a framework to support algorithm evaluation by modeling and managing analytical results and human annotations from pathology images; (2) Create a robust data normalization tool for converting, validating, and fixing spatial data from algorithm or human annotations; (3) Develop a set of queries to support data sampling and result comparisons; (4) Achieve high performance computation capacity via a parallel data management infrastructure, parallel data loading and spatial indexing optimizations in this infrastructure. Materials and Methods: We have considered two scenarios for algorithm evaluation: (1) algorithm comparison where multiple result sets from different methods are compared and consolidated; and (2) algorithm validation where algorithm results are compared with human annotations. We have developed a spatial normalization toolkit to validate and normalize spatial boundaries produced by image analysis algorithms or human annotations. 
The validated data were formatted based on the PAIS data model and loaded into a spatial database. To support efficient data loading, we have implemented a parallel data loading tool that takes advantage of multi-core CPUs to accelerate data injection. The spatial database manages both geometric shapes and image features or classifications, and enables spatial sampling, result comparison, and result aggregation through expressive structured query language (SQL) queries with spatial extensions. To provide scalable and efficient query support, we have employed a shared-nothing parallel database architecture, which distributes data homogeneously across multiple database partitions to take advantage of parallel computation power and implements spatial indexing to achieve high I/O throughput. Results: Our work proposes a high-performance parallel spatial database platform for algorithm validation and comparison. This platform was evaluated by storing, managing, and comparing analysis results from a set of brain tumor whole slide images. The tools we developed are open source and available for download. Conclusions: Pathology image algorithm validation and comparison are essential to iterative algorithm development and refinement. One critical component is the support for queries involving spatial predicates and comparisons. In our work, we developed an efficient data model and parallel database approach to model, normalize, manage, and query large volumes of analytical image result data. Our experiments demonstrate that the data partitioning strategy and the grid-based indexing result in good data distribution across database nodes and reduce I/O overhead in spatial join queries through parallel retrieval of relevant data and quick subsetting of datasets. The set of tools in the framework provides a full pipeline to normalize, load, manage, and query analytical results for algorithm evaluation. PMID:23599905
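The result-comparison queries described above reduce to spatial overlap measures between an algorithm's boundary and a human annotation. A self-contained sketch of one such measure, the Jaccard (intersection-over-union) index, using axis-aligned rectangles in place of the full polygons the platform's SQL spatial extensions handle:

```python
# Sketch of one comparison supported by the platform's spatial queries:
# overlap (Jaccard index) between an algorithm's region and a human
# annotation. Full polygons and SQL spatial joins are reduced here to
# axis-aligned rectangles (x1, y1, x2, y2) for illustration.

def area(r):
    return max(0, r[2] - r[0]) * max(0, r[3] - r[1])

def intersection(a, b):
    return (max(a[0], b[0]), max(a[1], b[1]),
            min(a[2], b[2]), min(a[3], b[3]))

def jaccard(a, b):
    inter = area(intersection(a, b))
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

algorithm_result = (0, 0, 10, 10)
human_annotation = (5, 0, 15, 10)
print(jaccard(algorithm_result, human_annotation))  # 50/150 = 0.333...
```

In the database setting the same computation is pushed down into a spatial join so that only geometrically overlapping pairs are ever compared.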

  5. VAT: a computational framework to functionally annotate variants in personal genomes within a cloud-computing environment.

    PubMed

    Habegger, Lukas; Balasubramanian, Suganthi; Chen, David Z; Khurana, Ekta; Sboner, Andrea; Harmanci, Arif; Rozowsky, Joel; Clarke, Declan; Snyder, Michael; Gerstein, Mark

    2012-09-01

    The functional annotation of variants obtained through sequencing projects is generally assumed to be a simple intersection of genomic coordinates with genomic features. However, complexities arise for several reasons, including the differential effects of a variant on alternatively spliced transcripts, as well as the difficulty in assessing the impact of small insertions/deletions and large structural variants. Taking these factors into consideration, we developed the Variant Annotation Tool (VAT) to functionally annotate variants from multiple personal genomes at the transcript level as well as obtain summary statistics across genes and individuals. VAT also allows visualization of the effects of different variants, integrates allele frequencies and genotype data from the underlying individuals and facilitates comparative analysis between different groups of individuals. VAT can either be run through a command-line interface or as a web application. Finally, in order to enable on-demand access and to minimize unnecessary transfers of large data files, VAT can be run as a virtual machine in a cloud-computing environment. VAT is implemented in C and PHP. The VAT web service, Amazon Machine Image, source code and detailed documentation are available at vat.gersteinlab.org.

  6. Image-based diagnostic aid for interstitial lung disease with secondary data integration

    NASA Astrophysics Data System (ADS)

    Depeursinge, Adrien; Müller, Henning; Hidki, Asmâa; Poletti, Pierre-Alexandre; Platon, Alexandra; Geissbuhler, Antoine

    2007-03-01

Interstitial lung diseases (ILDs) are a relatively heterogeneous group of around 150 illnesses with often very unspecific symptoms. The most complete imaging method for the characterisation of ILDs is high-resolution computed tomography (HRCT) of the chest, but a correct interpretation of these images is difficult even for specialists, as many diseases are rare and thus little experience exists. Moreover, interpreting HRCT images requires knowledge of the context defined by the clinical data of the studied case. A computerised diagnostic aid tool based on HRCT images with associated medical data, able to retrieve similar cases of ILDs from a dedicated database, can bring quick and precious information, for example for emergency radiologists. The experience from a pilot project highlighted the need for a detailed database containing high-quality annotations in addition to clinical data. The state of the art is studied to identify requirements for image-based diagnostic aid for interstitial lung disease with secondary data integration. The data acquisition steps are detailed. The selection of the most relevant clinical parameters is done in collaboration with lung specialists, drawing on current literature along with the knowledge bases of computer-based diagnostic decision support systems. In order to perform high-quality annotations of the interstitial lung tissue in the HRCT images, annotation software with its own file format was implemented for DICOM images. A multimedia database is implemented to store ILD cases with clinical data and annotated image series. Cases from the University & University Hospitals of Geneva (HUG) are retrospectively and prospectively collected to populate the database. Currently, 59 cases with certified diagnoses and their clinical parameters are stored in the database, as well as 254 image series, of which 26 have their regions of interest annotated.
The available data was used to test primary visual features for the classification of lung tissue patterns. These features show good discriminative properties for the separation of five classes of visual observations.

  7. Annotating image ROIs with text descriptions for multimodal biomedical document retrieval

    NASA Astrophysics Data System (ADS)

    You, Daekeun; Simpson, Matthew; Antani, Sameer; Demner-Fushman, Dina; Thoma, George R.

    2013-01-01

Regions of interest (ROIs) that are pointed to by overlaid markers (arrows, asterisks, etc.) in biomedical images are expected to contain more important and relevant information than other regions for biomedical article indexing and retrieval. We have developed several algorithms that localize and extract the ROIs by recognizing markers on images. Cropped ROIs then need to be annotated with the contents that describe them best. In most cases accurate textual descriptions of the ROIs can be found in figure captions, and these need to be combined with image ROIs for annotation. The annotated ROIs can then be used to, for example, train classifiers that separate ROIs into known categories (medical concepts), or to build visual ontologies, for indexing and retrieval of biomedical articles. We propose an algorithm that pairs visual and textual ROIs extracted from images and figure captions, respectively. This algorithm, based on dynamic time warping (DTW), clusters recognized pointers into groups, each of which contains pointers with identical visual properties (shape, size, color, etc.). Then a rule-based matching algorithm finds the best matching group for each textual ROI mention. Our method yields a precision and recall of 96% and 79%, respectively, when ground-truth textual ROI data is used.
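The DTW similarity at the heart of the pointer clustering can be sketched as follows. The contour profiles below are invented toy sequences, not the features used in the paper; the point is only that DTW tolerates small differences in sequence length and timing while still separating dissimilar shapes.

```python
# Minimal dynamic time warping (DTW) distance between two 1-D
# sequences, the kind of elastic similarity measure used to group
# pointers with matching shape profiles. Sequences are illustrative.

def dtw(seq_a, seq_b):
    n, m = len(seq_a), len(seq_b)
    INF = float("inf")
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(seq_a[i - 1] - seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # insertion
                                 cost[i][j - 1],      # deletion
                                 cost[i - 1][j - 1])  # match
    return cost[n][m]

arrow_a = [0, 1, 2, 3, 2, 1, 0]     # contour profile of one pointer
arrow_b = [0, 1, 2, 2, 3, 2, 1, 0]  # similar pointer, slightly longer
asterisk = [3, 0, 3, 0, 3, 0]       # dissimilar marker

print(dtw(arrow_a, arrow_b))   # small distance: same shape family
print(dtw(arrow_a, asterisk))  # larger distance: different shape
```

Clustering then groups together pointers whose pairwise DTW distances fall below a threshold, after which the rule-based matcher assigns each textual ROI mention to its best group.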

  8. Landslides on Charon

    NASA Image and Video Library

    2016-10-18

Scientists from NASA's New Horizons mission have spotted signs of long run-out landslides on Pluto's largest moon, Charon. This image of Charon's informally named "Serenity Chasma" was taken by New Horizons' Long Range Reconnaissance Imager (LORRI) on July 14, 2015, from a distance of 48,912 miles (78,717 kilometers). In an annotated version of the figure, arrows mark indications of landslide activity: http://photojournal.jpl.nasa.gov/catalog/PIA21128

  9. Nonlinear Deep Kernel Learning for Image Annotation.

    PubMed

    Jiu, Mingyuan; Sahbi, Hichem

    2017-02-08

Multiple kernel learning (MKL) is a widely used technique for kernel design. Its principle consists in learning, for a given support vector classifier, the most suitable convex (or sparse) linear combination of standard elementary kernels. However, these combinations are shallow and often powerless to capture the actual similarity between highly semantic data, especially for challenging classification tasks such as image annotation. In this paper, we redefine multiple kernels using deep multi-layer networks. In this new contribution, a deep multiple kernel is recursively defined as a multi-layered combination of nonlinear activation functions, each of which involves a combination of several elementary or intermediate kernels, and results in a positive semi-definite deep kernel. We propose four different frameworks in order to learn the weights of these networks: supervised, unsupervised, kernel-based semi-supervised, and Laplacian-based semi-supervised. When plugged into support vector machines (SVMs), the resulting deep kernel networks show clear gains compared to several shallow kernels for the task of image annotation. Extensive experiments and analysis on the challenging ImageCLEF photo annotation benchmark, the COREL5k database, and the Banana dataset validate the effectiveness of the proposed method.
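The layered construction can be sketched as follows. The bandwidths, weights, and the choice of tanh as the activation are illustrative only; in the paper the weights are learned and the construction is arranged so that the resulting deep kernel remains positive semi-definite, which an arbitrary elementwise activation does not guarantee.

```python
import numpy as np

# Illustrative two-layer "deep kernel": elementary Gaussian kernels
# are linearly combined and passed through a nonlinear activation at
# each layer. Weights and bandwidths are fixed here; in the paper
# they are learned (supervised, unsupervised, or semi-supervised),
# and the activation is chosen to preserve positive semi-definiteness.

def rbf_kernel(X, gamma):
    """Gram matrix of the Gaussian kernel exp(-gamma * ||x - y||^2)."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def deep_kernel(X, gammas, w1, w2):
    # Layer 1: weighted combination of elementary kernels + activation.
    elementary = [rbf_kernel(X, g) for g in gammas]
    layer1 = np.tanh(sum(w * K for w, K in zip(w1, elementary)))
    # Layer 2: combine the layer-1 kernel with an elementary kernel.
    return np.tanh(w2[0] * layer1 + w2[1] * elementary[0])

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
K = deep_kernel(X, gammas=[0.5, 2.0], w1=[0.7, 0.3], w2=[0.8, 0.2])
print(K.shape)  # (3, 3) symmetric kernel matrix
```

The resulting Gram matrix K can then be handed to any kernelized classifier, e.g. an SVM.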

  10. Approaches to Fungal Genome Annotation

    PubMed Central

    Haas, Brian J.; Zeng, Qiandong; Pearson, Matthew D.; Cuomo, Christina A.; Wortman, Jennifer R.

    2011-01-01

    Fungal genome annotation is the starting point for analysis of genome content. This generally involves the application of diverse methods to identify features on a genome assembly such as protein-coding and non-coding genes, repeats and transposable elements, and pseudogenes. Here we describe tools and methods leveraged for eukaryotic genome annotation with a focus on the annotation of fungal nuclear and mitochondrial genomes. We highlight the application of the latest technologies and tools to improve the quality of predicted gene sets. The Broad Institute eukaryotic genome annotation pipeline is described as one example of how such methods and tools are integrated into a sequencing center’s production genome annotation environment. PMID:22059117

  11. A new image representation for compact and secure communication

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Prasad, Lakshman; Skourikhine, A. N.

In many areas of nuclear materials management there is a need for communication, archival, and retrieval of annotated image data between heterogeneous platforms and devices to effectively implement safety, security, and safeguards of nuclear materials. Current image formats such as JPEG are not ideally suited to such scenarios, as they are not scalable to different viewing formats and do not provide a high-level representation of images that facilitates automatic object/change detection or annotation. The new Scalable Vector Graphics (SVG) open standard for representing graphical information, recommended by the World Wide Web Consortium (W3C), is designed to address issues of image scalability, portability, and annotation. However, until now there has been no viable technology to efficiently field images of high visual quality under this standard. Recently, LANL has developed a vectorized image representation that is compatible with the SVG standard and preserves visual quality. This is based on a new geometric framework for characterizing complex features in real-world imagery that incorporates perceptual principles of processing visual information known from cognitive psychology and vision science, to obtain a polygonal image representation of high fidelity. This representation can take advantage of all the textual compression and encryption routines unavailable to other image formats. Moreover, this vectorized image representation can be exploited to facilitate automated object recognition that can reduce the time required for data review. The objects/features of interest in these vectorized images can be annotated via animated graphics to facilitate quick and easy display and comprehension of processed image content.

  12. Metadata requirements for results of diagnostic imaging procedures: a BIIF profile to support user applications

    NASA Astrophysics Data System (ADS)

    Brown, Nicholas J.; Lloyd, David S.; Reynolds, Melvin I.; Plummer, David L.

    2002-05-01

A visible digital image is rendered from a set of digital image data. Medical digital image data can be stored in either (a) a pre-rendered format, corresponding to a photographic print, or (b) an un-rendered format, corresponding to a photographic negative. The appropriate image data storage format and associated header data (metadata) required by a user of the results of a diagnostic procedure recorded electronically depend on the task(s) to be performed. The DICOM standard provides a rich set of metadata that supports the needs of complex applications. Many end-user applications, such as simple report text viewing and display of a selected image, are not so demanding, and generic image formats such as JPEG are sometimes used. However, these lack some basic identification requirements. In this paper we make specific proposals for minimal extensions to generic image metadata, of value in various domains, which enable safe use in two simple healthcare end-user scenarios: (a) viewing of text and a selected JPEG image activated by a hyperlink and (b) viewing of one or more JPEG images together with superimposed text and graphics annotation using a file specified by a profile of the ISO/IEC Basic Image Interchange Format (BIIF).

  13. Automatic detection of regions of interest in mammographic images

    NASA Astrophysics Data System (ADS)

    Cheng, Erkang; Ling, Haibin; Bakic, Predrag R.; Maidment, Andrew D. A.; Megalooikonomou, Vasileios

    2011-03-01

This work is part of our ongoing study aimed at comparing the topology of anatomical branching structures with the underlying image texture. Detection of regions of interest (ROIs) in clinical breast images serves as the first step in the development of an automated system for image analysis and breast cancer diagnosis. In this paper, we have investigated machine learning approaches for the task of identifying ROIs with visible breast ductal trees in a given galactographic image. Specifically, we have developed a boosting-based framework using the AdaBoost algorithm in combination with Haar wavelet features for ROI detection. Twenty-eight clinical galactograms with expert-annotated ROIs were used for training. Positive samples were generated by resampling near the annotated ROIs, and negative samples were generated randomly by image decomposition. Each detected ROI candidate was given a confidence score. Candidate ROIs with spatial overlap were merged and their confidence scores combined. We have compared three strategies for the elimination of false positives. The strategies differed in their approach to combining confidence scores: by summation, averaging, or selecting the maximum score. The strategies were compared based upon the spatial overlap with annotated ROIs. Using a 4-fold cross-validation with the annotated clinical galactographic images, the summation strategy showed the best performance, with a 75% detection rate. When combining the top two candidates, selection of the maximum score showed the best performance, with a 96% detection rate.
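The Haar-style rectangle features that feed such a boosted detector are cheap because of the integral-image trick: once the integral image is built, any rectangle sum costs four lookups. A minimal sketch (the toy image and window below are illustrative, not the paper's wavelet basis):

```python
# Sketch of Haar-style rectangle features over an integral image,
# the kind of cheap feature an AdaBoost detector thresholds.
# The toy image and window placement are illustrative.

def integral_image(img):
    """ii[y][x] = sum of img over the rectangle [0, y) x [0, x)."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        for x in range(w):
            ii[y + 1][x + 1] = (img[y][x] + ii[y][x + 1]
                                + ii[y + 1][x] - ii[y][x])
    return ii

def rect_sum(ii, x, y, w, h):
    """Sum of the w-by-h rectangle with top-left corner (x, y): 4 lookups."""
    return (ii[y + h][x + w] - ii[y][x + w]
            - ii[y + h][x] + ii[y][x])

def haar_two_rect(ii, x, y, w, h):
    """Left half minus right half of a (2w x h) window."""
    return rect_sum(ii, x, y, w, h) - rect_sum(ii, x + w, y, w, h)

img = [[1, 1, 0, 0],
       [1, 1, 0, 0],
       [1, 1, 0, 0]]
ii = integral_image(img)
print(haar_two_rect(ii, 0, 0, 2, 3))  # 6 - 0 = 6: strong vertical edge
```

AdaBoost then selects, round by round, the single feature-and-threshold pair (a decision stump) that best separates positive from negative samples, and weights it into the final confidence score.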

  14. Comparative analysis of semantic localization accuracies between adult and pediatric DICOM CT images

    NASA Astrophysics Data System (ADS)

    Robertson, Duncan; Pathak, Sayan D.; Criminisi, Antonio; White, Steve; Haynor, David; Chen, Oliver; Siddiqui, Khan

    2012-02-01

    Existing literature describes a variety of techniques for semantic annotation of DICOM CT images, i.e. the automatic detection and localization of anatomical structures. Semantic annotation facilitates enhanced image navigation, linkage of DICOM image content and non-image clinical data, content-based image retrieval, and image registration. A key challenge for semantic annotation algorithms is inter-patient variability. However, while the algorithms described in published literature have been shown to cope adequately with the variability in test sets comprising adult CT scans, the problem presented by the even greater variability in pediatric anatomy has received very little attention. Most existing semantic annotation algorithms can only be extended to work on scans of both adult and pediatric patients by adapting parameters heuristically in light of patient size. In contrast, our approach, which uses random regression forests ('RRF'), learns an implicit model of scale variation automatically using training data. In consequence, anatomical structures can be localized accurately in both adult and pediatric CT studies without the need for parameter adaptation or additional information about patient scale. We show how the RRF algorithm is able to learn scale invariance from a combined training set containing a mixture of pediatric and adult scans. Resulting localization accuracy for both adult and pediatric data remains comparable with that obtained using RRFs trained and tested using only adult data.

  15. Communication spaces

    PubMed Central

    Coiera, Enrico

    2014-01-01

Background and objective Annotations to physical workspaces such as signs and notes are ubiquitous. When densely annotated, work areas become communication spaces. This study aims to characterize the types and purpose of such annotations. Methods A qualitative observational study was undertaken in two wards and the radiology department of a 440-bed metropolitan teaching hospital. Images were purposefully sampled; 39 were analyzed after excluding inferior images. Results Annotation functions included signaling identity, location, capability, status, availability, and operation. They encoded data, rules or procedural descriptions. Most aggregated into groups that either created a workflow by referencing each other, supported a common workflow without reference to each other, or were heterogeneous, referring to many workflows. Higher-level assemblies of such groupings were also observed. Discussion Annotations make visible the gap between work done and the capability of a space to support work. Annotations are repairs of an environment, improving fitness for purpose, fixing inadequacy in design, or meeting emergent needs. Annotations thus record the missing information needed to undertake tasks, typically added post-implementation. Measuring annotation levels post-implementation could help assess the fit of technology to task. Physical and digital spaces could meet broader user needs by formally supporting user customization, ‘programming through annotation’. Augmented reality systems could also directly support annotation, addressing existing information gaps and enhancing work with context-sensitive annotation. Conclusions Communication spaces offer a model of how work unfolds. Annotations make visible the local adaptation that makes technology fit for purpose post-implementation, and suggest an important role for annotatable information systems and digital augmentation of the physical environment. PMID:24005797

  16. Assessing Strength of Evidence of Appropriate Use Criteria for Diagnostic Imaging Examinations.

    PubMed

    Lacson, Ronilda; Raja, Ali S; Osterbur, David; Ip, Ivan; Schneider, Louise; Bain, Paul; Mita, Carol; Whelan, Julia; Silveira, Patricia; Dement, David; Khorasani, Ramin

    2016-05-01

    For health information technology tools to fully inform evidence-based decisions, recommendations must be reliably assessed for quality and strength of evidence. We aimed to create an annotation framework for grading recommendations regarding appropriate use of diagnostic imaging examinations. The annotation framework was created by an expert panel (clinicians in three medical specialties, medical librarians, and biomedical scientists) who developed a process for achieving consensus in assessing recommendations, and evaluated by measuring agreement in grading the strength of evidence for 120 empirically selected recommendations using the Oxford Levels of Evidence. Eighty-two percent of recommendations were assigned to Level 5 (expert opinion). Inter-annotator agreement was 0.70 on initial grading (κ = 0.35, 95% CI, 0.23-0.48). After systematic discussion utilizing the annotation framework, agreement increased significantly to 0.97 (κ = 0.88, 95% CI, 0.77-0.99). A novel annotation framework was effective for grading the strength of evidence supporting appropriate use criteria for diagnostic imaging exams. © The Author 2016. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
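The agreement statistic reported above, Cohen's kappa, corrects raw percent agreement for the agreement two annotators would reach by chance. A self-contained sketch (the toy labels below use Oxford-style level names for flavor and are not the study's data):

```python
# Cohen's kappa for two annotators, the agreement statistic cited
# above. Labels are a toy example, not the study's data.

from collections import Counter

def cohens_kappa(labels_a, labels_b):
    n = len(labels_a)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b.get(c, 0) for c in freq_a) / n ** 2
    return (observed - expected) / (1 - expected)

a = ["L5", "L5", "L2", "L5", "L1", "L5"]
b = ["L5", "L5", "L2", "L2", "L1", "L5"]
print(round(cohens_kappa(a, b), 3))  # 0.714
```

This is why the study can report high raw agreement (0.97) alongside a lower kappa (0.88): with 82% of recommendations falling in one level, a large share of raw agreement is expected by chance.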

  17. SemVisM: semantic visualizer for medical image

    NASA Astrophysics Data System (ADS)

    Landaeta, Luis; La Cruz, Alexandra; Baranya, Alexander; Vidal, María.-Esther

    2015-01-01

SemVisM is a toolbox that combines medical informatics and computer graphics tools to reduce the semantic gap between low-level features and high-level semantic concepts/terms in images. This paper presents a novel strategy for visualizing semantically annotated medical data, combining rendering techniques and segmentation algorithms. SemVisM comprises two main components: i) AMORE (A Modest vOlume REgister) to handle input data (RAW, DAT or DICOM) and to initially annotate the images using terms defined in medical ontologies (e.g., MeSH, FMA or RadLex), and ii) VOLPROB (VOlume PRObability Builder) for generating the annotated volumetric data containing the classified voxels that belong to a particular tissue. SemVisM is built on top of the semantic visualizer ANISE.

  18. C-ME: A 3D Community-Based, Real-Time Collaboration Tool for Scientific Research and Training

    PubMed Central

    Kolatkar, Anand; Kennedy, Kevin; Halabuk, Dan; Kunken, Josh; Marrinucci, Dena; Bethel, Kelly; Guzman, Rodney; Huckaby, Tim; Kuhn, Peter

    2008-01-01

    The need for effective collaboration tools is growing as multidisciplinary proteome-wide projects and distributed research teams become more common. The resulting data is often quite disparate, stored in separate locations, and not contextually related. Collaborative Molecular Modeling Environment (C-ME) is an interactive community-based collaboration system that allows researchers to organize information, visualize data on a two-dimensional (2-D) or three-dimensional (3-D) basis, and share and manage that information with collaborators in real time. C-ME stores the information in industry-standard databases that are immediately accessible by appropriate permission within the computer network directory service or anonymously across the internet through the C-ME application or through a web browser. The system addresses two important aspects of collaboration: context and information management. C-ME allows a researcher to use a 3-D atomic structure model or a 2-D image as a contextual basis on which to attach and share annotations to specific atoms or molecules or to specific regions of a 2-D image. These annotations provide additional information about the atomic structure or image data that can then be evaluated, amended or added to by other project members. PMID:18286178

  19. Learning multiple relative attributes with humans in the loop.

    PubMed

    Qian, Buyue; Wang, Xiang; Cao, Nan; Jiang, Yu-Gang; Davidson, Ian

    2014-12-01

Semantic attributes have been recognized as a more spontaneous manner to describe and annotate image content. It is widely accepted that image annotation using semantic attributes is a significant improvement over traditional binary or multiclass annotation due to its naturally continuous and relative properties. Though useful, existing approaches rely on abundant supervision and high-quality training data, which limits their applicability. Two standard methods to overcome small amounts of guidance and low-quality training data are transfer and active learning. In the context of relative attributes, this would entail learning multiple relative attributes simultaneously and actively querying a human for additional information. This paper addresses the two main limitations in existing work: 1) it actively adds humans to the learning loop so that minimal additional guidance can be given and 2) it learns multiple relative attributes simultaneously and thereby leverages dependence amongst them. In this paper, we formulate a joint active learning-to-rank framework with pairwise supervision to achieve these two aims, which also has other benefits such as the ability to be kernelized. The proposed framework optimizes over a set of ranking functions (measuring the strength of the presence of attributes) simultaneously and dependently on each other. The proposed pairwise queries take the form "which one of these two pictures is more natural?" Such queries can be easily answered by humans. An extensive empirical study on real image data sets shows that our proposed method, compared with several state-of-the-art methods, achieves superior retrieval performance while requiring significantly less human input.

  20. Semantics-Based Intelligent Indexing and Retrieval of Digital Images - A Case Study

    NASA Astrophysics Data System (ADS)

    Osman, Taha; Thakker, Dhavalkumar; Schaefer, Gerald

    The proliferation of digital media has led to a huge interest in classifying and indexing media objects for generic search and usage. In particular, we are witnessing colossal growth in digital image repositories that are difficult to navigate using free-text search mechanisms, which often return inaccurate matches as they typically rely on statistical analysis of query keyword recurrence in the image annotation or surrounding text. In this chapter we present a semantically enabled image annotation and retrieval engine that is designed to satisfy the requirements of the commercial image collection market in terms of both accuracy and efficiency of the retrieval process. Our search engine relies on methodically structured ontologies for image annotation, thus allowing for more intelligent reasoning about the image content and subsequently obtaining a more accurate set of results and a richer set of alternatives matching the original query. We also show how our well-analysed and designed domain ontology contributes to the implicit expansion of user queries, and present our initial thoughts on exploiting lexical databases for explicit semantics-based query expansion.

  1. Marky: a tool supporting annotation consistency in multi-user and iterative document annotation projects.

    PubMed

    Pérez-Pérez, Martín; Glez-Peña, Daniel; Fdez-Riverola, Florentino; Lourenço, Anália

    2015-02-01

    Document annotation is a key task in the development of Text Mining methods and applications. High-quality annotated corpora are invaluable, but their preparation requires a considerable amount of resources and time. Although existing annotation tools offer domain experts good user interfaces, their project management and quality control abilities are still limited. Therefore, the current work introduces Marky, a new Web-based document annotation tool equipped to manage multi-user and iterative projects and to evaluate annotation quality throughout the project life cycle. At its core, Marky is a Web application based on the open-source CakePHP framework. The user interface relies on HTML5 and CSS3 technologies. The Rangy library assists in browser-independent implementation of common DOM range and selection tasks, and Ajax and jQuery are used to enhance user-system interaction. Marky provides solid management of inter- and intra-annotator work. Most notably, its annotation tracking system supports systematic and on-demand agreement analysis and annotation amendment. Each annotator may work over documents as usual, but all annotations made are saved by the tracking system and may be further compared. Thus, the project administrator is able to evaluate annotation consistency among annotators and across rounds of annotation, while annotators are able to reject or amend subsets of annotations made in previous rounds. As a side effect, the tracking system minimises resource and time consumption. Marky is a novel environment for managing multi-user and iterative document annotation projects. Compared to other tools, Marky offers a similarly intuitive annotation experience while providing unique means to minimise annotation effort and enforce annotation quality, and therefore corpus consistency. Marky is freely available for non-commercial use at http://sing.ei.uvigo.es/marky. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
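
    Marky's implementation is not shown in the abstract; the sketch below merely illustrates one kind of inter-annotator agreement measure such a tracking system can support, here the Jaccard overlap of annotation spans (data structures invented for illustration):

```python
# Illustrative sketch (not Marky's actual code): inter-annotator or
# inter-round consistency measured as the Jaccard overlap of annotated
# spans, where each annotation is a (start, end, label) tuple.
def jaccard_agreement(spans_a, spans_b):
    """spans_*: iterables of (start, end, label) annotation tuples."""
    a, b = set(spans_a), set(spans_b)
    return len(a & b) / len(a | b) if a | b else 1.0

round1 = {(0, 4, "Gene"), (10, 18, "Protein")}
round2 = {(0, 4, "Gene"), (10, 18, "Drug")}
print(jaccard_agreement(round1, round2))  # 0.333...
```

    A real tracking system would compute such scores per annotator pair and per round, flagging spans whose labels changed for review.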

  2. ERTS data user investigation to develop a multistage forest sampling inventory system

    NASA Technical Reports Server (NTRS)

    Langley, P. G.; Vanroessel, J. W. (Principal Investigator); Wert, S. L.

    1973-01-01

    The author has identified the following significant results. A system to provide precision annotation of predetermined forest inventory sampling units on the ERTS-1 MSS images was developed. In addition, an annotation system for high-altitude U2 photographs was completed. MSS bulk image accuracy is good enough to allow the use of one-square-mile sampling units. IMANCO image analyzer interpretation work for small-scale images demonstrated the need for much additional analysis. Continuing image interpretation work for the next reporting period will concentrate on manual image interpretation as well as digital interpretation system development using the computer-compatible tapes.

  3. Meta4: a web application for sharing and annotating metagenomic gene predictions using web services.

    PubMed

    Richardson, Emily J; Escalettes, Franck; Fotheringham, Ian; Wallace, Robert J; Watson, Mick

    2013-01-01

    Whole-genome shotgun metagenomics experiments produce DNA sequence data from entire ecosystems, and provide a huge amount of novel information. Gene discovery projects require up-to-date information about sequence homology and domain structure for millions of predicted proteins to be presented in a simple, easy-to-use system. There is a lack of simple, open, flexible tools that allow the rapid sharing of metagenomics datasets with collaborators in a format they can easily interrogate. We present Meta4, a flexible and extensible web application that can be used to share and annotate metagenomic gene predictions. Proteins and predicted domains are stored in a simple relational database, with a dynamic front-end that displays the results in an internet browser. Web services are used to provide up-to-date information about the proteins from homology searches against public databases. Information about Meta4 can be found on the project website, code is available on GitHub, a cloud image is available, and an example implementation can be seen at.

  4. Social Image Tag Ranking by Two-View Learning

    NASA Astrophysics Data System (ADS)

    Zhuang, Jinfeng; Hoi, Steven C. H.

    Tags play a central role in text-based social image retrieval and browsing. However, the tags annotated by web users can be noisy, irrelevant, and often incomplete for describing the image contents, which may severely deteriorate the performance of text-based image retrieval models. In order to solve this problem, researchers have proposed techniques to rank the annotated tags of a social image according to their relevance to the visual content of the image. In this paper, we aim to overcome the challenge of social image tag ranking for a corpus of social images with rich user-generated tags by proposing a novel two-view learning approach. It can effectively exploit both textual and visual contents of social images to discover the complicated relationship between tags and images. Unlike conventional learning approaches that usually assume parametric models, our method is completely data-driven and makes no assumptions about the underlying models, making the proposed solution practically more effective. We formulate our method as an optimization task and present an efficient algorithm to solve it. To evaluate the efficacy of our method, we conducted an extensive set of experiments by applying our technique to both text-based social image retrieval and automatic image annotation tasks. Our empirical results showed that the proposed method can be more effective than conventional approaches.
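
    The paper's optimization is not reproduced here; purely as a toy illustration of the two-view idea (all scores invented), each tag can receive a relevance score from a textual view and a visual view and be re-ranked by a combined score:

```python
# Toy two-view tag re-ranking sketch: each candidate tag gets a
# relevance score from a text view (e.g. tag co-occurrence) and a
# visual view (e.g. votes from visually similar images), and tags are
# sorted by a convex combination of the two. All values are invented.
def rank_tags(text_scores, visual_scores, alpha=0.5):
    """Both inputs: dict tag -> score in [0, 1]. Returns tags sorted
    by the combined score, most relevant first."""
    tags = set(text_scores) | set(visual_scores)
    combined = {t: alpha * text_scores.get(t, 0.0)
                   + (1 - alpha) * visual_scores.get(t, 0.0) for t in tags}
    return sorted(tags, key=combined.get, reverse=True)

text = {"beach": 0.9, "sunset": 0.7, "party": 0.4}
visual = {"beach": 0.8, "sunset": 0.9, "party": 0.1}
print(rank_tags(text, visual))  # ['beach', 'sunset', 'party']
```

    The actual method learns the relationship between the two views from data rather than fixing a combination weight as this sketch does.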

  5. Image annotation by deep neural networks with attention shaping

    NASA Astrophysics Data System (ADS)

    Zheng, Kexin; Lv, Shaohe; Ma, Fang; Chen, Fei; Jin, Chi; Dou, Yong

    2017-07-01

    Image annotation is the task of assigning semantic labels to an image. Recently, deep neural networks with visual attention have been utilized successfully in many computer vision tasks. In this paper, we show that the conventional attention mechanism is easily misled by the salient class, i.e., the attended region always contains part of the image area describing the content of the salient class at different attention iterations. To this end, we propose a novel attention shaping mechanism, which aims to maximize the non-overlapping area between consecutive attention processes by taking into account the history of previous attention vectors. Several weighting policies are studied to utilize the history information in different manners. On two benchmark datasets, i.e., PASCAL VOC2012 and MIRFlickr-25k, the average precision is improved by up to 10% in comparison with state-of-the-art annotation methods.
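
    The paper's exact weighting policies are not given in the abstract; the following hedged sketch illustrates only the general idea of shaping attention away from regions already covered by the attention history (penalty strength and data are invented):

```python
# Sketch of the attention-shaping idea: before re-normalizing the
# attention weights at each iteration, penalize regions that already
# received attention mass in earlier iterations, so that consecutive
# attention maps overlap as little as possible.
import math

def shaped_attention(logits, history, penalty=5.0):
    """logits: raw attention scores per region; history: accumulated
    attention mass per region from earlier iterations."""
    adjusted = [l - penalty * h for l, h in zip(logits, history)]
    exps = [math.exp(a) for a in adjusted]
    z = sum(exps)
    return [e / z for e in exps]

logits = [2.0, 1.0, 0.5]      # salient region scores highest
history = [0.9, 0.05, 0.05]   # it dominated earlier iterations
att = shaped_attention(logits, history)
print(att)  # attention mass shifts away from the salient region
```

    In the paper this history term is built into the network's attention module and the weighting of past iterations is itself a studied design choice.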

  6. An efficient annotation and gene-expression derivation tool for Illumina Solexa datasets

    PubMed Central

    2010-01-01

    Background An Illumina flow cell with all eight lanes occupied produces well over a terabyte's worth of images, with gigabytes of reads following sequence alignment. The ability to translate such reads into meaningful annotation is therefore of great importance; one can easily be flooded with a great volume of textual, unannotated data irrespective of read quality or size. CASAVA, an optional analysis tool for Illumina sequencing experiments, provides INDEL detection, SNP information, and allele calling. Extracting from such analysis a measure of gene expression in the form of tag counts, and furthermore annotating the reads, is therefore of significant value. Findings We developed TASE (Tag counting and Analysis of Solexa Experiments), a rapid tag-counting and annotation software tool specifically designed for Illumina CASAVA sequencing datasets. Developed in Java and deployed using the jTDS JDBC driver and a SQL Server backend, TASE provides an extremely fast means of calculating gene expression through tag counts while annotating sequenced reads with each gene's presumed function, from any given CASAVA build. Such a build is generated for both DNA and RNA sequencing. Analysis is broken into two distinct components: DNA sequence or read concatenation, followed by tag counting and annotation. The end result is output containing the homology-based functional annotation and the respective gene expression measure, signifying how many times sequenced reads were found within the genomic ranges of functional annotations. Conclusions TASE is a powerful tool that facilitates the process of annotating a given Illumina Solexa sequencing dataset. Our results indicate that both homology-based annotation and tag-count analysis are completed in very efficient times, enabling researchers to delve deeply into a given CASAVA build and maximize information extraction from a sequencing dataset. 
TASE is specially designed to translate sequence data in a CASAVA build into functional annotations while producing corresponding gene expression measurements. Such analysis is executed in an ultrafast and highly efficient manner, whether the experiment is single-read or paired-end. TASE is a user-friendly and freely available application, allowing rapid analysis and annotation of any given Illumina Solexa sequencing dataset with ease. PMID:20598141
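
    TASE itself is a Java/SQL Server application; the Python sketch below (gene names and coordinates hypothetical) only illustrates the core tag-counting idea of attributing aligned reads to the genomic ranges of functional annotations:

```python
# Rough sketch of the tag-counting step: count how many aligned reads
# fall inside each gene's annotated genomic range, yielding a per-gene
# expression measure. Assumes non-overlapping, start-sorted ranges.
from bisect import bisect_right
from collections import Counter

def tag_counts(genes, read_positions):
    """genes: list of (start, end, name) tuples, sorted by start;
    read_positions: alignment start coordinate of each read."""
    starts = [g[0] for g in genes]
    counts = Counter()
    for pos in read_positions:
        i = bisect_right(starts, pos) - 1
        if i >= 0 and genes[i][0] <= pos <= genes[i][1]:
            counts[genes[i][2]] += 1
    return counts

genes = [(100, 200, "geneA"), (300, 450, "geneB")]
reads = [110, 150, 199, 250, 320, 400]
print(tag_counts(genes, reads))  # geneA: 3, geneB: 2
```

    A production tool would additionally handle strandedness, overlapping features, and paired-end read spans, which this sketch ignores.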

  7. MimoSA: a system for minimotif annotation

    PubMed Central

    2010-01-01

    Background Minimotifs are short peptide sequences within one protein that are recognized by other proteins or molecules. While there are now several minimotif databases, they are incomplete. Many minimotifs reported in the primary literature have yet to be annotated, while entirely novel minimotifs continue to be published on a weekly basis. Our recently proposed function and sequence syntax for minimotifs enables us to build a general tool that will facilitate structured annotation and management of minimotif data from the biomedical literature. Results We have built the MimoSA application for minimotif annotation. The application supports management of the Minimotif Miner database, literature tracking, and annotation of new minimotifs. MimoSA provides visualization, organization, selection, and editing of minimotifs and their attributes in the MnM database. For the literature components, MimoSA provides paper status tracking and scoring of papers for annotation through a freely available machine learning approach based on word correlation. The paper scoring algorithm is also available as a separate program, TextMine. Form-driven annotation of minimotif attributes enables entry of new minimotifs into the MnM database. Several supporting features increase the efficiency of annotation. The layered architecture of MimoSA allows for extensibility by separating the functions of paper scoring, minimotif visualization, and database management. MimoSA is readily adaptable to other annotation efforts that manually curate literature into a MySQL database. Conclusions MimoSA is an extensible application that facilitates minimotif annotation and integrates with the Minimotif Miner database. We have built MimoSA as an application that integrates dynamic abstract scoring with a high-performance relational model of minimotif syntax. 
MimoSA's TextMine, an efficient paper-scoring algorithm, can be used to dynamically rank papers with respect to context. PMID:20565705

  8. Deep learning-based fine-grained car make/model classification for visual surveillance

    NASA Astrophysics Data System (ADS)

    Gundogdu, Erhan; Parıldı, Enes Sinan; Solmaz, Berkan; Yücesoy, Veysel; Koç, Aykut

    2017-10-01

    Fine-grained object recognition is a challenging computer vision problem that has recently been addressed by utilizing deep Convolutional Neural Networks (CNNs). Nevertheless, the main disadvantage of classification methods relying on deep CNN models is the need for a considerably large amount of data. In addition, relatively little annotated data exists for a real-world application such as the recognition of car models in a traffic surveillance system. To this end, we concentrate on the classification of fine-grained car makes and/or models for visual surveillance scenarios with the help of two different domains. First, a large-scale dataset including approximately 900K images is constructed from a website that catalogues fine-grained car models, and a state-of-the-art CNN model is trained on the constructed dataset according to its labels. The second domain is the set of images collected from a camera integrated into a traffic surveillance system. These images, numbering over 260K, are gathered by a special license plate detection method on top of a motion detection algorithm. An appropriately sized image region is cropped around the region of interest provided by the detected license plate location. These images and their labels for more than 30 classes are employed to fine-tune the CNN model already trained on the large-scale dataset described above. To fine-tune the network, the last two fully-connected layers are randomly initialized and the remaining layers are fine-tuned on the second dataset. In this work, the transfer of a model learned on a large dataset to a smaller one has been successfully performed by utilizing both the limited annotated data of the traffic field and a large-scale dataset with available annotations. Our experimental results on both the validation dataset and the real field show that the proposed methodology performs favorably against training the CNN model from scratch.

  9. TissueWikiMobile: an Integrative Protein Expression Image Browser for Pathological Knowledge Sharing and Annotation on a Mobile Device

    PubMed Central

    Cheng, Chihwen; Stokes, Todd H.; Hang, Sovandy; Wang, May D.

    2016-01-01

    Doctors need fast and convenient access to medical data. This motivates the use of mobile devices for knowledge retrieval and sharing. We have developed TissueWikiMobile on the Apple iPhone and iPad to seamlessly access TissueWiki, an enormous repository of medical histology images. TissueWiki is a three-terabyte database of antibody information and histology images from the Human Protein Atlas (HPA). Using TissueWikiMobile, users can extract knowledge from protein expression, add annotations to highlight regions of interest on images, and share their professional insight. By providing an intuitive human-computer interface, TissueWikiMobile lets users efficiently access important biomedical data without losing mobility. TissueWikiMobile offers the health community a ubiquitous way to collaborate and share expert opinions not only on the performance of various antibody stains but also on histology image annotation. PMID:27532057

  10. Estimating False Positive Contamination in Crater Annotations from Citizen Science Data

    NASA Astrophysics Data System (ADS)

    Tar, P. D.; Bugiolacchi, R.; Thacker, N. A.; Gilmour, J. D.

    2017-01-01

    Web-based citizen science often involves the classification of image features by large numbers of minimally trained volunteers, such as the identification of lunar impact craters under the Moon Zoo project. Whilst such approaches facilitate the analysis of large image data sets, the inexperience of users and ambiguity in image content can lead to contamination from false positive identifications. We present an approach, using Linear Poisson Models and image template matching, that can quantify levels of false positive contamination in citizen-science Moon Zoo crater annotations. Linear Poisson Models are a form of machine learning that, unlike most alternative machine learning methods, supports predictive error modelling and goodness-of-fit testing. The proposed supervised learning system can reduce the variability in crater counts whilst providing predictive error assessments of the estimated quantities of remaining true versus false annotations. In an area of research influenced by human subjectivity, the proposed method provides a level of objectivity through the utilisation of image evidence, guided by candidate crater identifications.
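
    As a much-simplified, hedged illustration of the contamination-estimation idea (the paper uses Linear Poisson Models with proper error propagation; the templates and counts below are invented toy data), the observed histogram of a template-matching score can be modelled as a linear mixture of a "true crater" distribution and a "false positive" distribution:

```python
# Toy mixture fit: estimate the false-positive fraction by finding the
# mixing weight that best reproduces the observed score histogram as a
# combination of a "true crater" template and a "false positive"
# template (least squares over a 1-D grid of mixing weights).
def estimate_fp_fraction(observed, true_tpl, fp_tpl, steps=1000):
    total = sum(observed)
    obs = [o / total for o in observed]  # normalize observed counts
    best = min(
        (sum((f * fp + (1 - f) * tr - o) ** 2
             for fp, tr, o in zip(fp_tpl, true_tpl, obs)), f)
        for f in (i / steps for i in range(steps + 1)))
    return best[1]

true_tpl = [0.05, 0.15, 0.30, 0.50]  # scores of verified craters
fp_tpl   = [0.60, 0.25, 0.10, 0.05]  # scores of known false hits
observed = [230, 180, 220, 370]      # citizen-science annotations
print(estimate_fp_fraction(observed, true_tpl, fp_tpl))
```

    The real method works with Poisson counting statistics, so it also reports an uncertainty on the estimated fraction rather than a point value alone.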

  11. DICOM for quantitative imaging biomarker development: a standards based approach to sharing clinical data and structured PET/CT analysis results in head and neck cancer research.

    PubMed

    Fedorov, Andriy; Clunie, David; Ulrich, Ethan; Bauer, Christian; Wahle, Andreas; Brown, Bartley; Onken, Michael; Riesmeier, Jörg; Pieper, Steve; Kikinis, Ron; Buatti, John; Beichel, Reinhard R

    2016-01-01

    Background. Imaging biomarkers hold tremendous promise for precision medicine clinical applications. Development of such biomarkers relies heavily on image post-processing tools for automated image quantitation. Their deployment in the context of clinical research necessitates interoperability with the clinical systems. Comparison with the established outcomes and evaluation tasks motivate integration of the clinical and imaging data, and the use of standardized approaches to support annotation and sharing of the analysis results and semantics. We developed the methodology and tools to support these tasks in Positron Emission Tomography and Computed Tomography (PET/CT) quantitative imaging (QI) biomarker development applied to head and neck cancer (HNC) treatment response assessment, using the Digital Imaging and Communications in Medicine (DICOM(®)) international standard and free open-source software. Methods. Quantitative analysis of PET/CT imaging data collected on patients undergoing treatment for HNC was conducted. Processing steps included Standardized Uptake Value (SUV) normalization of the images, segmentation of the tumor using manual and semi-automatic approaches, automatic segmentation of the reference regions, and extraction of the volumetric segmentation-based measurements. Suitable components of the DICOM standard were identified to model the various types of data produced by the analysis. A developer toolkit of conversion routines and an Application Programming Interface (API) were contributed and applied to create a standards-based representation of the data. Results. DICOM Real World Value Mapping, Segmentation and Structured Reporting objects were utilized for standards-compliant representation of the PET/CT QI analysis results and relevant clinical data. A number of correction proposals to the standard were developed. 
The open-source DICOM toolkit (DCMTK) was improved to simplify the task of DICOM encoding by introducing new API abstractions. Conversion and visualization tools utilizing this toolkit were developed. The encoded objects were validated for consistency and interoperability. The resulting dataset was deposited in the QIN-HEADNECK collection of The Cancer Imaging Archive (TCIA). Supporting tools for data analysis and DICOM conversion were made available as free open-source software. Discussion. We presented a detailed investigation of the development and application of the DICOM model, as well as the supporting open-source tools and toolkits, to accommodate representation of the research data in QI biomarker development. We demonstrated that the DICOM standard can be used to represent the types of data relevant in HNC QI biomarker development, and encode their complex relationships. The resulting annotated objects are amenable to data mining applications, and are interoperable with a variety of systems that support the DICOM standard.

  12. VAT: a computational framework to functionally annotate variants in personal genomes within a cloud-computing environment

    PubMed Central

    Habegger, Lukas; Balasubramanian, Suganthi; Chen, David Z.; Khurana, Ekta; Sboner, Andrea; Harmanci, Arif; Rozowsky, Joel; Clarke, Declan; Snyder, Michael; Gerstein, Mark

    2012-01-01

    Summary: The functional annotation of variants obtained through sequencing projects is generally assumed to be a simple intersection of genomic coordinates with genomic features. However, complexities arise for several reasons, including the differential effects of a variant on alternatively spliced transcripts, as well as the difficulty in assessing the impact of small insertions/deletions and large structural variants. Taking these factors into consideration, we developed the Variant Annotation Tool (VAT) to functionally annotate variants from multiple personal genomes at the transcript level as well as obtain summary statistics across genes and individuals. VAT also allows visualization of the effects of different variants, integrates allele frequencies and genotype data from the underlying individuals and facilitates comparative analysis between different groups of individuals. VAT can either be run through a command-line interface or as a web application. Finally, in order to enable on-demand access and to minimize unnecessary transfers of large data files, VAT can be run as a virtual machine in a cloud-computing environment. Availability and Implementation: VAT is implemented in C and PHP. The VAT web service, Amazon Machine Image, source code and detailed documentation are available at vat.gersteinlab.org. Contact: lukas.habegger@yale.edu or mark.gerstein@yale.edu Supplementary Information: Supplementary data are available at Bioinformatics online. PMID:22743228
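
    VAT is implemented in C and PHP; the toy Python sketch below (data structures invented) only illustrates why transcript-level annotation matters: the same variant can be exonic in one transcript yet intronic in an alternatively spliced transcript of the same gene:

```python
# Toy transcript-level variant annotation: intersect a variant's
# coordinate with each transcript's exon ranges and report a
# per-transcript effect, rather than a single per-gene call.
def annotate(variant_pos, transcripts):
    """transcripts: dict name -> list of (exon_start, exon_end)."""
    effects = {}
    for name, exons in transcripts.items():
        in_exon = any(s <= variant_pos <= e for s, e in exons)
        effects[name] = "exonic" if in_exon else "intronic"
    return effects

transcripts = {
    "geneX-tr1": [(100, 180), (300, 360)],  # includes exon 2
    "geneX-tr2": [(100, 180), (500, 560)],  # exon 2 spliced out
}
print(annotate(320, transcripts))
# {'geneX-tr1': 'exonic', 'geneX-tr2': 'intronic'}
```

    The full tool goes much further, classifying coding consequences (synonymous, missense, frameshift, etc.) and handling indels and structural variants.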

  13. Smartphones as multimodal communication devices to facilitate clinical knowledge processes: randomized controlled trial.

    PubMed

    Pimmer, Christoph; Mateescu, Magdalena; Zahn, Carmen; Genewein, Urs

    2013-11-27

    Despite the widespread use and advancements of mobile technology that facilitate rich communication modes, there is little evidence demonstrating the value of smartphones for effective interclinician communication and knowledge processes. The objective of this study was to determine the effects of different synchronous smartphone-based modes of communication, such as (1) speech only, (2) speech and images, and (3) speech, images, and image annotation (guided noticing) on the recall and transfer of visually and verbally represented medical knowledge. The experiment was conducted from November 2011 to May 2012 at the University Hospital Basel (Switzerland) with 42 medical students in a master's program. All participants analyzed a standardized case (a patient with a subcapital fracture of the fifth metacarpal bone) based on a radiological image, photographs of the hand, and textual descriptions, and were asked to consult a remote surgical specialist via a smartphone. Participants were randomly assigned to 3 experimental conditions/groups. In group 1, the specialist provided verbal explanations (speech only). In group 2, the specialist provided verbal explanations and displayed the radiological image and the photographs to the participants (speech and images). In group 3, the specialist provided verbal explanations, displayed the radiological image and the photographs, and annotated the radiological image by drawing structures/angle elements (speech, images, and image annotation). To assess knowledge recall, participants were asked to write brief summaries of the case (verbally represented knowledge) after the consultation and to re-analyze the diagnostic images (visually represented knowledge). To assess knowledge transfer, participants analyzed a similar case without specialist support. 
    Data analysis by ANOVA found that participants in groups 2 and 3 (images used) evaluated the support provided by the specialist as significantly more positive than group 1, the speech-only group (group 1: mean 4.08, SD 0.90; group 2: mean 4.73, SD 0.59; group 3: mean 4.93, SD 0.25; F(2,39)=6.76, P=.003; partial η²=0.26, 1-β=.90). However, significant positive effects on the recall and transfer of visually represented medical knowledge were only observed when the smartphone-based communication involved the combination of speech, images, and image annotation (group 3). There were no significant positive effects on the recall and transfer of visually represented knowledge between group 1 (speech only) and group 2 (speech and images). No significant differences were observed between the groups regarding verbally represented medical knowledge. The results show (1) the value of annotation functions for digital and mobile technology for interclinician communication and medical informatics, and (2) the use of guided noticing (the integration of speech, images, and image annotation) leads to significantly improved knowledge gains for visually represented knowledge. This is particularly valuable in situations involving complex visual subject matters, typical in clinical practice.
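
    As a quick consistency check, the reported F value can be recomputed from the published group means and standard deviations with a standard one-way ANOVA on three groups of n = 14 (42/3) participants; small differences arise only from rounding of the published statistics:

```python
# One-way ANOVA F statistic from summary statistics (equal group
# sizes): between-group and within-group mean squares computed from
# the published group means and SDs.
def anova_f(means, sds, n):
    k = len(means)
    grand = sum(means) / k
    ss_between = n * sum((m - grand) ** 2 for m in means)
    ss_within = sum((n - 1) * s ** 2 for s in sds)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (k * (n - 1))
    return ms_between / ms_within

f = anova_f([4.08, 4.73, 4.93], [0.90, 0.59, 0.25], 14)
print(round(f, 2))  # close to the reported F(2,39) = 6.76
```

    The degrees of freedom also check out: k - 1 = 2 between groups and k(n - 1) = 39 within groups.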

  14. Smartphones as Multimodal Communication Devices to Facilitate Clinical Knowledge Processes: Randomized Controlled Trial

    PubMed Central

    Mateescu, Magdalena; Zahn, Carmen; Genewein, Urs

    2013-01-01

    Background Despite the widespread use and advancements of mobile technology that facilitate rich communication modes, there is little evidence demonstrating the value of smartphones for effective interclinician communication and knowledge processes. Objective The objective of this study was to determine the effects of different synchronous smartphone-based modes of communication, such as (1) speech only, (2) speech and images, and (3) speech, images, and image annotation (guided noticing) on the recall and transfer of visually and verbally represented medical knowledge. Methods The experiment was conducted from November 2011 to May 2012 at the University Hospital Basel (Switzerland) with 42 medical students in a master’s program. All participants analyzed a standardized case (a patient with a subcapital fracture of the fifth metacarpal bone) based on a radiological image, photographs of the hand, and textual descriptions, and were asked to consult a remote surgical specialist via a smartphone. Participants were randomly assigned to 3 experimental conditions/groups. In group 1, the specialist provided verbal explanations (speech only). In group 2, the specialist provided verbal explanations and displayed the radiological image and the photographs to the participants (speech and images). In group 3, the specialist provided verbal explanations, displayed the radiological image and the photographs, and annotated the radiological image by drawing structures/angle elements (speech, images, and image annotation). To assess knowledge recall, participants were asked to write brief summaries of the case (verbally represented knowledge) after the consultation and to re-analyze the diagnostic images (visually represented knowledge). To assess knowledge transfer, participants analyzed a similar case without specialist support. 
Results Data analysis by ANOVA found that participants in groups 2 and 3 (images used) evaluated the support provided by the specialist as significantly more positive than group 1, the speech-only group (group 1: mean 4.08, SD 0.90; group 2: mean 4.73, SD 0.59; group 3: mean 4.93, SD 0.25; F 2,39=6.76, P=.003; partial η2=0.26, 1–β=.90). However, significant positive effects on the recall and transfer of visually represented medical knowledge were only observed when the smartphone-based communication involved the combination of speech, images, and image annotation (group 3). There were no significant positive effects on the recall and transfer of visually represented knowledge between group 1 (speech only) and group 2 (speech and images). No significant differences were observed between the groups regarding verbally represented medical knowledge. Conclusions The results show (1) the value of annotation functions for digital and mobile technology for interclinician communication and medical informatics, and (2) the use of guided noticing (the integration of speech, images, and image annotation) leads to significantly improved knowledge gains for visually represented knowledge. This is particularly valuable in situations involving complex visual subject matters, typical in clinical practice. PMID:24284080

  15. A metadata-aware application for remote scoring and exchange of tissue microarray images

    PubMed Central

    2013-01-01

    Background The use of tissue microarrays (TMA) and advances in digital scanning microscopy have enabled the collection of thousands of tissue images. There is a need for software tools to annotate, query and share this data amongst researchers in different physical locations. Results We have developed an open-source web-based application for remote scoring of TMA images, which exploits the value of Microsoft Silverlight Deep Zoom to provide an intuitive interface for zooming and panning around digital images. We use and extend existing XML-based standards to ensure that the data collected can be archived and that our system is interoperable with other standards-compliant systems. Conclusion The application has been used for multi-centre scoring of TMA slides composed of tissues from several Phase III breast cancer trials and ten different studies participating in the International Breast Cancer Association Consortium (BCAC). The system has enabled researchers to simultaneously score large collections of TMA and export the standardised data to integrate with pathological and clinical outcome data, thereby facilitating biomarker discovery. PMID:23635078

  16. Annotated Bibliography on Religious Development.

    ERIC Educational Resources Information Center

    Bucher, Anton A.; Reich, K. Helmut

    1991-01-01

    Presents an annotated bibliography on religious development that covers the areas of psychology and religion, measurement of religiousness, religious development during the life cycle, religious experiences, conversion, religion and morality, and images of God. (Author/BB)

  17. Effectiveness of Global Features for Automatic Medical Image Classification and Retrieval – the experiences of OHSU at ImageCLEFmed

    PubMed Central

    Kalpathy-Cramer, Jayashree; Hersh, William

    2008-01-01

    In 2006 and 2007, Oregon Health & Science University (OHSU) participated in the automatic image annotation task for medical images at ImageCLEF, an annual international benchmarking event that is part of the Cross Language Evaluation Forum (CLEF). The goal of the automatic annotation task was to classify 1000 test images based on the Image Retrieval in Medical Applications (IRMA) code, given a set of 10,000 training images. There were 116 distinct classes in 2006 and 2007. We evaluated the efficacy of a variety of primarily global features for this classification task, including features based on histograms, gray level correlation matrices, and the gist technique. A multitude of classifiers including k-nearest neighbors, two-level neural networks, support vector machines, and maximum likelihood classifiers were evaluated. Our official error rate for the 1000 test images was 26% in 2006 using the flat classification structure, and our error count in 2007 was 67.8 using the hierarchical classification error computation based on the IRMA code. Confusion matrices as well as clustering experiments were used to identify visually similar classes. The use of the IRMA code did not help us in the classification task, as the semantic hierarchy of the IRMA classes did not correspond well with the hierarchy based on clustering of image features that we used. Our most frequent misclassification errors were along the view axis. Subsequent experiments based on a two-stage classification system decreased our error rate to 19.8% for the 2006 dataset and our error count to 55.4 for the 2007 data. PMID:19884953
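
    As a minimal sketch of one of the evaluated pipelines, a k-nearest-neighbour classifier over a global grey-level histogram feature might look as follows (feature choice, data, and labels are toy stand-ins, not the IRMA data):

```python
# Global-feature classification sketch: a normalized grey-level
# histogram as the feature vector, and a k-NN classifier with
# majority vote over Euclidean distance.
from collections import Counter

def histogram_feature(pixels, bins=4, max_val=256):
    h = [0] * bins
    for p in pixels:
        h[p * bins // max_val] += 1
    return [c / len(pixels) for c in h]

def knn_classify(query, training, k=3):
    """training: list of (feature_vector, label) pairs."""
    nearest = sorted(training, key=lambda t: sum((a - b) ** 2
                     for a, b in zip(query, t[0])))
    return Counter(label for _, label in nearest[:k]).most_common(1)[0][0]

train = [(histogram_feature([10, 20, 30, 40]), "xray-hand"),
         (histogram_feature([15, 25, 20, 35]), "xray-hand"),
         (histogram_feature([200, 220, 240, 250]), "xray-chest")]
query = histogram_feature([12, 22, 28, 38])
print(knn_classify(query, train))  # xray-hand
```

    Real images would of course use far more pixels and bins, and the study combined several such global features before classification.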

  18. Effects of Multimedia Annotations on Thai EFL Readers' Words and Text Recall

    ERIC Educational Resources Information Center

    Gasigijtamrong, Jenjit

    2013-01-01

    This study aimed to investigate the effects of using multimedia annotations on EFL readers' word recall and text recall and to explore which type of multimedia annotations--L1 meaning, L2 meaning, sound, and image--would have a better effect on their recall of new words and text comprehension. The participants were 78 students who enrolled in an…

  19. Solar Tutorial and Annotation Resource (STAR)

    NASA Astrophysics Data System (ADS)

    Showalter, C.; Rex, R.; Hurlburt, N. E.; Zita, E. J.

    2009-12-01

    We have written a software suite designed to facilitate solar data analysis by scientists, students, and the public, anticipating enormous datasets from future instruments. Our “STAR” suite includes an interactive learning section explaining 15 classes of solar events. Users learn software tools that exploit humans’ superior ability (over computers) to identify many events. Annotation tools include time slice generation to quantify loop oscillations, the interpolation of event shapes using natural cubic splines (for loops, sigmoids, and filaments) and closed cubic splines (for coronal holes). Learning these tools in an environment where examples are provided prepares new users to comfortably utilize annotation software with new data. Upon completion of our tutorial, users are presented with media of various solar events and asked to identify and annotate the images, to test their mastery of the system. Goals of the project include public input into the data analysis of very large datasets from future solar satellites, and increased public interest and knowledge about the Sun. In 2010, the Solar Dynamics Observatory (SDO) will be launched into orbit. SDO’s advancements in solar telescope technology will generate a terabyte per day of high-quality data, requiring innovation in data management. While major projects develop automated feature recognition software, so that computers can complete much of the initial event tagging and analysis, that software still cannot annotate features such as sigmoids, coronal magnetic loops, coronal dimming, etc., due to large amounts of data concentrated in relatively small areas. Previously, solar physicists manually annotated these features, but with the imminent influx of data it is unrealistic to expect specialized researchers to examine every image that computers cannot fully process. A new approach is needed to efficiently process these data.
Providing analysis tools and data access to students and the public has proven effective in similar astrophysical projects (e.g., the “Galaxy Zoo”). For “crowdsourcing” to be effective for solar research, the public needs knowledge and skills to recognize and annotate key events on the Sun. Our tutorial can provide this training, with over 200 images and 18 movies showing examples of active regions, coronal dimmings, coronal holes, coronal jets, coronal waves, emerging flux, sigmoids, coronal magnetic loops, filaments, filament eruption, flares, loop oscillation, plage, surges, and sunspots. Annotation tools are provided for many of these events. Many features of the tutorial, such as mouse-over definitions and interactive annotation examples, are designed to assist people without previous experience in solar physics. After completing the tutorial, the user is presented with an interactive quiz: a series of movies and images to identify and annotate. The tutorial teaches the user, with feedback on correct and incorrect answers, until the user develops appropriate confidence and skill. This prepares users to annotate new data, based on their experience with event recognition and annotation tools. Trained users can contribute significantly to our data analysis tasks, even as our training tool contributes to public science literacy and interest in solar physics.
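
    The spline-based annotation tools described above (natural cubic splines for open features such as loops, sigmoids, and filaments; closed splines for coronal holes) can be sketched with SciPy. The function names here are illustrative, not the STAR suite's API:

```python
import numpy as np
from scipy.interpolate import CubicSpline

def open_curve_spline(points):
    """Natural cubic spline through ordered (x, y) annotation points,
    e.g. clicks placed along a loop, sigmoid, or filament."""
    pts = np.asarray(points, dtype=float)
    t = np.arange(len(pts))             # simple index parameterization
    return CubicSpline(t, pts, bc_type='natural')

def closed_curve_spline(points):
    """Periodic ("closed") cubic spline, e.g. around a coronal-hole boundary."""
    pts = np.asarray(points, dtype=float)
    pts = np.vstack([pts, pts[:1]])     # repeat the first point to close the loop
    t = np.arange(len(pts))
    return CubicSpline(t, pts, bc_type='periodic')
```

    A "natural" boundary condition sets the curvature to zero at the endpoints, while the periodic condition matches value and derivatives at the join, which is why the two variants suit open and closed solar features respectively.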

  20. Towards ontology-based decision support systems for complex ultrasound diagnosis in obstetrics and gynecology.

    PubMed

    Maurice, P; Dhombres, F; Blondiaux, E; Friszer, S; Guilbaud, L; Lelong, N; Khoshnood, B; Charlet, J; Perrot, N; Jauniaux, E; Jurkovic, D; Jouannic, J-M

    2017-05-01

    We have developed a new knowledge-based intelligent system for obstetrics and gynecology ultrasound imaging, based on an ontology and a reference image collection. This study evaluates the new system's support for accurate annotation of ultrasound images. We used the early ultrasound diagnosis of ectopic pregnancies as a model clinical issue. The ectopic pregnancy ontology was derived from medical texts (4260 ultrasound reports of ectopic pregnancy from a specialist center in the UK and 2795 PubMed abstracts indexed with the MeSH term "Pregnancy, Ectopic") and the reference image collection was built on a selection from 106 publications. We conducted a retrospective analysis of the signs in 35 scans of ectopic pregnancy by six observers using the new system. The resulting ectopic pregnancy ontology consisted of 1395 terms, and 80 images were collected for the reference collection. The observers used the system to provide a total of 1486 sign annotations. The precision, recall and F-measure for the annotations were 0.83, 0.62 and 0.71, respectively. The global proportion of agreement was 40.35% (95% CI: 38.64-42.05). The ontology-based intelligent system provides accurate annotations of ultrasound images, and the results suggest that it may benefit non-expert operators. The precision rate is appropriate for accurate input to a computer-based clinical decision support system and could be used to support medical imaging diagnosis of complex conditions in obstetrics and gynecology. Copyright © 2017. Published by Elsevier Masson SAS.
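
    The reported precision, recall, and F-measure are related by the standard harmonic-mean formula. The annotation counts used below are hypothetical, chosen only to reproduce rates of the same order as those reported:

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and F-measure from true-positive, false-positive,
    and false-negative annotation counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

    With the reported precision 0.83 and recall 0.62, F = 2(0.83)(0.62)/(0.83 + 0.62) ≈ 0.71, consistent with the abstract.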

  1. An information gathering system for medical image inspection

    NASA Astrophysics Data System (ADS)

    Lee, Young-Jin; Bajcsy, Peter

    2005-04-01

    We present an information gathering system for medical image inspection that consists of software tools for capturing computer-centric and human-centric information. Computer-centric information includes (1) static annotations, such as (a) image drawings enclosing any selected area, a set of areas with similar colors, a set of salient points, and (b) textual descriptions associated with either image drawings or links between pairs of image drawings, and (2) dynamic (or temporal) information, such as mouse movements, zoom level changes, image panning and frame selections from an image stack. Human-centric information is represented by video and audio signals that are acquired by computer-mounted cameras and microphones. The short-term goal of the presented system is to facilitate learning of medical novices from medical experts, while the long-term goal is to data mine all information about image inspection for assisting in making diagnoses. In this work, we built basic software functionality for gathering computer-centric and human-centric information of the aforementioned variables. Next, we developed the information playback capabilities of all gathered information for educational purposes. Finally, we prototyped text-based and image template-based search engines to retrieve information from recorded annotations, for example, (a) find all annotations containing the word "blood vessels", or (b) search for similar areas to a selected image area. The information gathering system for medical image inspection reported here has been tested with images from the Histology Atlas database.

  2. Enhancing Web applications in radiology with Java: estimating MR imaging relaxation times.

    PubMed

    Dagher, A P; Fitzpatrick, M; Flanders, A E; Eng, J

    1998-01-01

    Java is a relatively new programming language that has been used to develop a World Wide Web-based tool for estimating magnetic resonance (MR) imaging relaxation times, thereby demonstrating how Java may be used for Web-based radiology applications beyond improving the user interface of teaching files. A standard processing algorithm coded with Java is downloaded along with the hypertext markup language (HTML) document. The user (client) selects the desired pulse sequence and inputs data obtained from a region of interest on the MR images. The algorithm is used to modify selected MR imaging parameters in an equation that models the phenomenon being evaluated. MR imaging relaxation times are estimated, and confidence intervals and a P value expressing the accuracy of the final results are calculated. Design features such as simplicity, object-oriented programming, and security restrictions allow Java to expand the capabilities of HTML by offering a more versatile user interface that includes dynamic annotations and graphics. Java also allows the client to perform more sophisticated information processing and computation than is usually associated with Web applications. Java is likely to become a standard programming option, and the development of stand-alone Java applications may become more common as Java is integrated into future versions of computer operating systems.
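
    The abstract does not give the specific signal model used, so as an illustration (in Python rather than the applet's Java), here is a least-squares relaxation-time estimate assuming a saturation-recovery T1 model, one common choice:

```python
import numpy as np
from scipy.optimize import curve_fit

def t1_model(tr, s0, t1):
    """Saturation-recovery signal model: S(TR) = S0 * (1 - exp(-TR / T1))."""
    return s0 * (1.0 - np.exp(-tr / t1))

def estimate_t1(tr_values, signals):
    """Fit S0 and T1 to ROI signal measurements at several repetition times (TR)."""
    popt, pcov = curve_fit(t1_model, tr_values, signals,
                           p0=(max(signals), 500.0))
    s0, t1 = popt
    return s0, t1, np.sqrt(np.diag(pcov))   # estimates and their standard errors
```

    The covariance matrix returned by the fit is what permits confidence intervals of the kind the applet reports.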

  3. Automatic pelvis segmentation from x-ray images of a mouse model

    NASA Astrophysics Data System (ADS)

    Al Okashi, Omar M.; Du, Hongbo; Al-Assam, Hisham

    2017-05-01

    The automatic detection and quantification of skeletal structures has a variety of applications for biological research. Accurate segmentation of the pelvis from X-ray images of mice in a high-throughput project such as the Mouse Genomes Project not only saves time and cost but also helps achieve an unbiased quantitative analysis within the phenotyping pipeline. This paper proposes an automatic solution for pelvis segmentation based on structural and orientation properties of the pelvis in X-ray images. The solution consists of three stages: pre-processing the image to extract the pelvis area, initial pelvis mask preparation, and final pelvis segmentation. Experimental results on a set of 100 X-ray images showed consistent performance of the algorithm. The automated solution overcomes the weaknesses of a manual annotation procedure, where intra- and inter-observer variations cannot be avoided.

  4. Ten steps to get started in Genome Assembly and Annotation

    PubMed Central

    Dominguez Del Angel, Victoria; Hjerde, Erik; Sterck, Lieven; Capella-Gutierrez, Salvadors; Notredame, Cederic; Vinnere Pettersson, Olga; Amselem, Joelle; Bouri, Laurent; Bocs, Stephanie; Klopp, Christophe; Gibrat, Jean-Francois; Vlasova, Anna; Leskosek, Brane L.; Soler, Lucile; Binzer-Panchal, Mahesh; Lantz, Henrik

    2018-01-01

    As a part of the ELIXIR-EXCELERATE efforts in capacity building, we present here 10 steps to facilitate researchers getting started in genome assembly and genome annotation. The guidelines given are broadly applicable, intended to be stable over time, and cover all aspects from start to finish of a general assembly and annotation project. Intrinsic properties of genomes are discussed, as is the importance of using high quality DNA. Different sequencing technologies and generally applicable workflows for genome assembly are also detailed. We cover structural and functional annotation and encourage readers to also annotate transposable elements, something that is often omitted from annotation workflows. The importance of data management is stressed, and we give advice on where to submit data and how to make your results Findable, Accessible, Interoperable, and Reusable (FAIR). PMID:29568489

  5. Performance and Architecture Lab Modeling Tool

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    2014-06-19

    Analytical application performance models are critical for diagnosing performance-limiting resources, optimizing systems, and designing machines. Creating models, however, is difficult. Furthermore, models are frequently expressed in forms that are hard to distribute and validate. The Performance and Architecture Lab Modeling tool, or Palm, is a modeling tool designed to make application modeling easier. Palm provides a source code modeling annotation language. Not only does the modeling language divide the modeling task into subproblems, it formally links an application's source code with its model. This link is important because a model's purpose is to capture application behavior. Furthermore, this link makes it possible to define rules for generating models according to source code organization. Palm generates hierarchical models according to well-defined rules. Given an application, a set of annotations, and a representative execution environment, Palm will generate the same model. A generated model is an executable program whose constituent parts directly correspond to the modeled application. Palm generates models by combining top-down (human-provided) semantic insight with bottom-up static and dynamic analysis. A model's hierarchy is defined by static and dynamic source code structure. Because Palm coordinates models and source code, Palm's models are 'first-class' and reproducible. Palm automates common modeling tasks. For instance, Palm incorporates measurements to focus attention, represent constant behavior, and validate models. Palm's workflow is as follows. The workflow's input is source code annotated with Palm modeling annotations. The most important annotation models an instance of a block of code. Given annotated source code, the Palm Compiler produces executables and the Palm Monitor collects a representative performance profile.
The Palm Generator synthesizes a model based on the static and dynamic mapping of annotations to program behavior. The model -- an executable program -- is a hierarchical composition of annotation functions, synthesized functions, statistics for runtime values, and performance measurements.

  6. FDR-controlled metabolite annotation for high-resolution imaging mass spectrometry.

    PubMed

    Palmer, Andrew; Phapale, Prasad; Chernyavsky, Ilya; Lavigne, Regis; Fay, Dominik; Tarasov, Artem; Kovalev, Vitaly; Fuchser, Jens; Nikolenko, Sergey; Pineau, Charles; Becker, Michael; Alexandrov, Theodore

    2017-01-01

    High-mass-resolution imaging mass spectrometry promises to localize hundreds of metabolites in tissues, cell cultures, and agar plates with cellular resolution, but it is hampered by the lack of bioinformatics tools for automated metabolite identification. We report pySM, a framework for false discovery rate (FDR)-controlled metabolite annotation at the level of the molecular sum formula, for high-mass-resolution imaging mass spectrometry (https://github.com/alexandrovteam/pySM). We introduce a metabolite-signal match score and a target-decoy FDR estimate for spatial metabolomics.
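
    The target-decoy FDR idea is that the rate of decoy annotations passing a score threshold estimates the false-positive rate among target annotations. A generic sketch of that estimator; pySM's actual metabolite-signal match score and decoy construction are described in the paper and differ from this simplification:

```python
def estimate_fdr(target_scores, decoy_scores, threshold):
    """Target-decoy FDR estimate at a given score threshold."""
    targets_above = sum(s >= threshold for s in target_scores)
    decoys_above = sum(s >= threshold for s in decoy_scores)
    if targets_above == 0:
        return 0.0
    # rescale if the decoy set is a different size from the target set
    ratio = len(target_scores) / len(decoy_scores)
    return min(1.0, decoys_above * ratio / targets_above)
```

    Sweeping the threshold and keeping the annotations at a desired FDR level (e.g. 10%) turns the raw scores into an FDR-controlled annotation list.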

  7. Rotation-invariant convolutional neural networks for galaxy morphology prediction

    NASA Astrophysics Data System (ADS)

    Dieleman, Sander; Willett, Kyle W.; Dambre, Joni

    2015-06-01

    Measuring the morphological parameters of galaxies is a key requirement for studying their formation and evolution. Surveys such as the Sloan Digital Sky Survey have resulted in the availability of very large collections of images, which have permitted population-wide analyses of galaxy morphology. Morphological analysis has traditionally been carried out mostly via visual inspection by trained experts, which is time consuming and does not scale to large (≳10⁴) numbers of images. Although attempts have been made to build automated classification systems, these have not been able to achieve the desired level of accuracy. The Galaxy Zoo project successfully applied a crowdsourcing strategy, inviting online users to classify images by answering a series of questions. Unfortunately, even this approach does not scale well enough to keep up with the increasing availability of galaxy images. We present a deep neural network model for galaxy morphology classification which exploits translational and rotational symmetry. It was developed in the context of the Galaxy Challenge, an international competition to build the best model for morphology classification based on annotated images from the Galaxy Zoo project. For images with high agreement among the Galaxy Zoo participants, our model is able to reproduce their consensus with near-perfect accuracy (>99 per cent) for most questions. Confident model predictions are highly accurate, which makes the model suitable for filtering large collections of images and forwarding challenging images to experts for manual annotation. This approach greatly reduces the experts' workload without affecting accuracy. The application of these algorithms to larger sets of training data will be critical for analysing results from future surveys such as the Large Synoptic Survey Telescope.
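
    Exploiting rotational symmetry can be illustrated in its simplest form: averaging a classifier's outputs over rotated copies of the input makes the combined prediction invariant to 90-degree rotations. (The actual Galaxy Challenge model goes further, sharing rotated feature maps inside the network; this sketch shows only the test-time-averaging variant.)

```python
import numpy as np

def rotated_views(image):
    """The four 90-degree rotations of an input image."""
    return [np.rot90(image, k) for k in range(4)]

def rotation_averaged_predict(model, image):
    """Average class probabilities over rotated copies of the input, so the
    combined prediction does not depend on the image's orientation."""
    probs = np.stack([model(view) for view in rotated_views(image)])
    return probs.mean(axis=0)
```

    Because the four rotations of an image and the four rotations of its rotated copy are the same set, the averaged prediction is identical for both.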

  8. Application of neuroanatomical ontologies for neuroimaging data annotation.

    PubMed

    Turner, Jessica A; Mejino, Jose L V; Brinkley, James F; Detwiler, Landon T; Lee, Hyo Jong; Martone, Maryann E; Rubin, Daniel L

    2010-01-01

    The annotation of functional neuroimaging results for data sharing and re-use is particularly challenging, due to the diversity of terminologies of neuroanatomical structures and cortical parcellation schemes. To address this challenge, we extended the Foundational Model of Anatomy Ontology (FMA) to include cytoarchitectural labels (Brodmann areas) and a morphological cortical labeling scheme (e.g., the part of Brodmann area 6 in the left precentral gyrus). This representation was also used to augment the neuroanatomical axis of RadLex, the ontology for clinical imaging. The resulting neuroanatomical ontology contains explicit relationships indicating which brain regions are "part of" which other regions, across cytoarchitectural and morphological labeling schemas. We annotated a large functional neuroimaging dataset with terms from the ontology and applied a reasoning engine to analyze this dataset in conjunction with the ontology, achieving successful inferences from the most specific level (e.g., how many subjects showed activation in a subpart of the middle frontal gyrus) to the more general (e.g., how many activations were found in areas connected via a known white matter tract). In summary, we have produced a neuroanatomical ontology that harmonizes several different terminologies of neuroanatomical structures and cortical parcellation schemes. This neuroanatomical ontology is publicly available as a view of FMA at the BioPortal website. The ontological encoding of anatomic knowledge can be exploited by computer reasoning engines to make inferences about neuroanatomical relationships described in imaging datasets using different terminologies. This approach could ultimately enable knowledge discovery from large, distributed fMRI studies or medical record mining.
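
    The "part of" reasoning described above amounts to computing a transitive closure over the ontology's direct part-of relations, so that a query at a general level (e.g. the frontal lobe) also matches annotations made at more specific levels. A minimal sketch with hypothetical region names, not the FMA's actual representation:

```python
def part_of_closure(direct_part_of):
    """Transitive closure of direct 'part of' relations.

    direct_part_of maps each region to the regions it is directly part of;
    the result maps each region to the set of all of its ancestors."""
    closure = {}

    def ancestors(region):
        if region in closure:
            return closure[region]
        result = set()
        for parent in direct_part_of.get(region, ()):
            result.add(parent)
            result |= ancestors(parent)   # assumes the hierarchy is acyclic
        closure[region] = result
        return result

    for region in list(direct_part_of):
        ancestors(region)
    return closure
```

    A reasoning engine over such a closure can answer queries like "how many activations fall anywhere within the frontal lobe?" even when each activation is annotated with a much more specific term.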

  9. Semantic annotation in biomedicine: the current landscape.

    PubMed

    Jovanović, Jelena; Bagheri, Ebrahim

    2017-09-22

    The abundance and unstructured nature of biomedical texts, be it clinical or research content, impose significant challenges for the effective and efficient use of information and knowledge stored in such texts. Annotation of biomedical documents with machine-intelligible semantics facilitates advanced, semantics-based text management, curation, indexing, and search. This paper focuses on annotation of biomedical entity mentions with concepts from relevant biomedical knowledge bases such as UMLS. As a result, the meaning of those mentions is unambiguously and explicitly defined, and thus made readily available for automated processing. This process is widely known as semantic annotation, and the tools that perform it are known as semantic annotators. Over the last dozen years, the biomedical research community has invested significant effort in the development of biomedical semantic annotation technology. Aiming to establish grounds for further developments in this area, we review a selected set of state-of-the-art biomedical semantic annotators, focusing particularly on general-purpose annotators, that is, semantic annotation tools that can be customized to work with texts from any area of biomedicine. We also examine potential directions for further improvements of today's annotators which could make them even more capable of meeting the needs of real-world applications. To motivate and encourage further developments in this area, along the suggested and/or related directions, we review existing and potential practical applications and benefits of semantic annotators.

  10. Geometry Processing of Conventionally Produced Mouse Brain Slice Images.

    PubMed

    Agarwal, Nitin; Xu, Xiangmin; Gopi, M

    2018-04-21

    Brain mapping research in most neuroanatomical laboratories relies on conventional processing techniques, which often introduce histological artifacts such as tissue tears and tissue loss. In this paper we present techniques and algorithms for automatic registration and 3D reconstruction of conventionally produced mouse brain slices in a standardized atlas space. This is achieved first by constructing a virtual 3D mouse brain model from annotated slices of the Allen Reference Atlas (ARA). Virtual re-slicing of the reconstructed model generates ARA-based slice images corresponding to the microscopic images of histological brain sections. These image pairs are aligned using a geometric approach through contour images. Histological artifacts in the microscopic images are detected and removed using Constrained Delaunay Triangulation before performing global alignment. Finally, non-linear registration is performed by solving Laplace's equation with Dirichlet boundary conditions. Our methods provide significant improvements over previously reported registration techniques for the tested slices in 3D space, especially on slices with significant histological artifacts. Further, as one application, we count the number of neurons in various anatomical regions using a dataset of 51 microscopic slices from a single mouse brain. To the best of our knowledge, the presented work is the first that automatically registers both clean and highly damaged high-resolution histological slices of mouse brain to a 3D annotated reference atlas space. This work represents a significant contribution to this subfield of neuroscience, as it provides tools to neuroanatomists for analyzing and processing histological data. Copyright © 2018 Elsevier B.V. All rights reserved.
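
    Solving Laplace's equation with Dirichlet (fixed) boundary conditions, the final registration step above, can be illustrated on a grid with simple iterative relaxation. A production registration would solve for a deformation field, but the boundary-value structure is the same; this is a minimal sketch, not the paper's implementation:

```python
import numpy as np

def solve_laplace(initial, mask, iterations=2000):
    """Relax interior values toward the mean of their four neighbors while
    holding values fixed (Dirichlet conditions) wherever mask is True."""
    u = initial.astype(float).copy()
    for _ in range(iterations):
        neighbor_mean = 0.25 * (np.roll(u, 1, 0) + np.roll(u, -1, 0) +
                                np.roll(u, 1, 1) + np.roll(u, -1, 1))
        u = np.where(mask, initial, neighbor_mean)
    return u
```

    With the boundary held at a linear ramp, the interior converges to the same linear field, the unique harmonic function matching those boundary values.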

  11. Common data model for natural language processing based on two existing standard information models: CDA+GrAF.

    PubMed

    Meystre, Stéphane M; Lee, Sanghoon; Jung, Chai Young; Chevrier, Raphaël D

    2012-08-01

    An increasing need for collaboration and resources sharing in the Natural Language Processing (NLP) research and development community motivates efforts to create and share a common data model and a common terminology for all information annotated and extracted from clinical text. We have combined two existing standards: the HL7 Clinical Document Architecture (CDA), and the ISO Graph Annotation Format (GrAF; in development), to develop such a data model entitled "CDA+GrAF". We experimented with several methods to combine these existing standards, and eventually selected a method wrapping separate CDA and GrAF parts in a common standoff annotation (i.e., separate from the annotated text) XML document. Two use cases, clinical document sections, and the 2010 i2b2/VA NLP Challenge (i.e., problems, tests, and treatments, with their assertions and relations), were used to create examples of such standoff annotation documents, and were successfully validated with the XML schemata provided with both standards. We developed a tool to automatically translate annotation documents from the 2010 i2b2/VA NLP Challenge format to GrAF, and automatically generated 50 annotation documents using this tool, all successfully validated. Finally, we adapted the XSL stylesheet provided with HL7 CDA to allow viewing annotation XML documents in a web browser, and plan to adapt existing tools for translating annotation documents between CDA+GrAF and the UIMA and GATE frameworks. This common data model may ease directly comparing NLP tools and applications, combining their output, transforming and "translating" annotations between different NLP applications, and eventually "plug-and-play" of different modules in NLP applications. Copyright © 2011 Elsevier Inc. All rights reserved.
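
    The key design idea above, standoff annotation, keeps the annotations in a separate document that points into the clinical text by character offsets, rather than embedding markup in the text itself. A minimal illustration with a made-up schema, far simpler than the actual CDA+GrAF XML:

```python
import xml.etree.ElementTree as ET

def standoff_annotation(doc_id, text_length, spans):
    """Build a standoff annotation document: each annotation references the
    source text by (start, end) character offsets instead of wrapping it."""
    root = ET.Element('annotations',
                      {'document': doc_id, 'textLength': str(text_length)})
    for start, end, label in spans:
        ET.SubElement(root, 'annotation',
                      {'start': str(start), 'end': str(end), 'type': label})
    return ET.tostring(root, encoding='unicode')
```

    Because the text itself is never modified, several annotation layers (sections, problems, tests, treatments) can coexist and be validated independently, which is what makes combining two standards in one standoff document workable.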

  12. Multimodal MSI in Conjunction with Broad Coverage Spatially Resolved MS2 Increases Confidence in Both Molecular Identification and Localization

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Veličković, Dušan; Chu, Rosalie K.; Carrell, Alyssa A.

    One critical aspect of mass spectrometry imaging (MSI) is the need to confidently identify detected analytes. While orthogonal tandem MS (e.g., LC-MS2) experiments from sample extracts can assist in annotating ions, the spatial information about these molecules is lost. Accordingly, this could lead to misleading conclusions, especially in cases where isobaric species exhibit different distributions within a sample. In this Technical Note, we employed a multimodal imaging approach, using matrix-assisted laser desorption/ionization (MALDI)-MSI and liquid extraction surface analysis (LESA)-MS2, to confidently annotate and localize a broad range of metabolites involved in a tripartite symbiosis system of moss, cyanobacteria, and fungus. We found that the combination of these two imaging modalities generated very congruent ion images, providing the link between the highly accurate structural information offered by LESA and the high spatial resolution attainable by MALDI. These results demonstrate how this combined methodology could be very useful in differentiating metabolite routes in complex systems.

  13. Smart Annotation of Cyclic Data Using Hierarchical Hidden Markov Models.

    PubMed

    Martindale, Christine F; Hoenig, Florian; Strohrmann, Christina; Eskofier, Bjoern M

    2017-10-13

    Cyclic signals are an intrinsic part of daily life, such as human motion and heart activity. The detailed analysis of them is important for clinical applications such as pathological gait analysis and for sports applications such as performance analysis. Labeled training data for algorithms that analyze these cyclic data come at a high annotation cost, because annotations are available only under laboratory conditions or require manual segmentation of the data under less restricted conditions. This paper presents a smart annotation method that reduces this cost of labeling for sensor-based data, which is applicable to data collected outside of strict laboratory conditions. The method uses semi-supervised learning of sections of cyclic data with a known cycle number. A hierarchical hidden Markov model (hHMM) is used, achieving a mean absolute error of 0.041 ± 0.020 s relative to a manually-annotated reference. The resulting model was also used to simultaneously segment and classify continuous, 'in the wild' data, demonstrating the applicability of using an hHMM, trained on limited data sections, to label a complete dataset. This technique achieved comparable results to its fully-supervised equivalent. Our semi-supervised method has the significant advantage of reduced annotation cost. Furthermore, it reduces the opportunity for human error in the labeling process normally required for training of segmentation algorithms. It also lowers the annotation cost of training a model capable of continuous monitoring of cycle characteristics, such as those employed to analyze the progress of movement disorders or running technique.

  14. DICOM for quantitative imaging biomarker development: a standards based approach to sharing clinical data and structured PET/CT analysis results in head and neck cancer research

    PubMed Central

    Clunie, David; Ulrich, Ethan; Bauer, Christian; Wahle, Andreas; Brown, Bartley; Onken, Michael; Riesmeier, Jörg; Pieper, Steve; Kikinis, Ron; Buatti, John; Beichel, Reinhard R.

    2016-01-01

    Background. Imaging biomarkers hold tremendous promise for precision medicine clinical applications. Development of such biomarkers relies heavily on image post-processing tools for automated image quantitation. Their deployment in the context of clinical research necessitates interoperability with the clinical systems. Comparison with the established outcomes and evaluation tasks motivate integration of the clinical and imaging data, and the use of standardized approaches to support annotation and sharing of the analysis results and semantics. We developed the methodology and tools to support these tasks in Positron Emission Tomography and Computed Tomography (PET/CT) quantitative imaging (QI) biomarker development applied to head and neck cancer (HNC) treatment response assessment, using the Digital Imaging and Communications in Medicine (DICOM®) international standard and free open-source software. Methods. Quantitative analysis of PET/CT imaging data collected on patients undergoing treatment for HNC was conducted. Processing steps included Standardized Uptake Value (SUV) normalization of the images, segmentation of the tumor using manual and semi-automatic approaches, automatic segmentation of the reference regions, and extraction of the volumetric segmentation-based measurements. Suitable components of the DICOM standard were identified to model the various types of data produced by the analysis. A developer toolkit of conversion routines and an Application Programming Interface (API) were contributed and applied to create a standards-based representation of the data. Results. DICOM Real World Value Mapping, Segmentation and Structured Reporting objects were utilized for standards-compliant representation of the PET/CT QI analysis results and relevant clinical data. A number of correction proposals to the standard were developed. The open-source DICOM toolkit (DCMTK) was improved to simplify the task of DICOM encoding by introducing new API abstractions. 
Conversion and visualization tools utilizing this toolkit were developed. The encoded objects were validated for consistency and interoperability. The resulting dataset was deposited in the QIN-HEADNECK collection of The Cancer Imaging Archive (TCIA). Supporting tools for data analysis and DICOM conversion were made available as free open-source software. Discussion. We presented a detailed investigation of the development and application of the DICOM model, as well as the supporting open-source tools and toolkits, to accommodate representation of the research data in QI biomarker development. We demonstrated that the DICOM standard can be used to represent the types of data relevant in HNC QI biomarker development, and encode their complex relationships. The resulting annotated objects are amenable to data mining applications, and are interoperable with a variety of systems that support the DICOM standard. PMID:27257542
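
    Of the processing steps listed above, SUV normalization is the most self-contained. As an illustration, here is the basic body-weight SUV definition; a real DICOM-based computation also decay-corrects the injected dose to scan time, which is omitted here:

```python
def suv_bw(activity_conc_bq_per_ml, injected_dose_bq, body_weight_g):
    """Body-weight Standardized Uptake Value: tissue activity concentration
    divided by injected dose per gram of body weight; with tissue density
    taken as ~1 g/ml, the result is dimensionless."""
    return activity_conc_bq_per_ml / (injected_dose_bq / body_weight_g)
```

    For example, a voxel at 5000 Bq/ml in a 70 kg patient injected with 370 MBq has an SUV just under 1, i.e. roughly the whole-body average uptake.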

  15. Integrating UIMA annotators in a web-based text processing framework.

    PubMed

    Chen, Xiang; Arnold, Corey W

    2013-01-01

    The Unstructured Information Management Architecture (UIMA) [1] framework is a growing platform for natural language processing (NLP) applications. However, such applications may be difficult for non-technical users to deploy. This project presents a web-based framework that wraps UIMA-based annotator systems into a graphical user interface for researchers and clinicians, and a web service for developers. An annotator that extracts data elements from lung cancer radiology reports is presented to illustrate the use of the system. Annotation results from the web system can be exported to multiple formats for users to utilize in other aspects of their research and workflow. This project demonstrates the benefits of a lay-user interface for complex NLP applications. Efforts such as this can lead to increased interest and support for NLP work in the clinical domain.

  16. The use of surface geophysical techniques to detect fractures in bedrock; an annotated bibliography

    USGS Publications Warehouse

    Lewis, Mark R.; Haeni, F.P.

    1987-01-01

    This annotated bibliography compiles references about the theory and application of surface geophysical techniques to locate fractures or fracture zones within bedrock units. Forty-three publications are referenced, including journal articles, theses, conference proceedings, abstracts, translations, and reports prepared by private contractors and U.S. Government agencies. Thirty-one of the publications are annotated. The remainder are untranslated foreign language articles, which are listed only as bibliographic references. Most annotations summarize the location, geologic setting, surface geophysical technique used, and results of a study. A few highly relevant theoretical studies are annotated also. Publications that discuss only the use of borehole geophysical techniques to locate fractures are excluded from this bibliography. Also excluded are highly theoretical works that may have little or no known practical application.

  17. Comparison of Natural Language Processing Rules-based and Machine-learning Systems to Identify Lumbar Spine Imaging Findings Related to Low Back Pain.

    PubMed

    Tan, W Katherine; Hassanpour, Saeed; Heagerty, Patrick J; Rundell, Sean D; Suri, Pradeep; Huhdanpaa, Hannu T; James, Kathryn; Carrell, David S; Langlotz, Curtis P; Organ, Nancy L; Meier, Eric N; Sherman, Karen J; Kallmes, David F; Luetmer, Patrick H; Griffith, Brent; Nerenz, David R; Jarvik, Jeffrey G

    2018-03-28

    To evaluate a natural language processing (NLP) system built with open-source tools for identification of lumbar spine imaging findings related to low back pain on magnetic resonance and x-ray radiology reports from four health systems. We used a limited data set (de-identified except for dates) sampled from lumbar spine imaging reports of a prospectively assembled cohort of adults. From N = 178,333 reports, we randomly selected N = 871 to form a reference-standard dataset, consisting of N = 413 x-ray reports and N = 458 MR reports. Using standardized criteria, four spine experts annotated the presence of 26 findings, where 71 reports were annotated by all four experts and 800 were each annotated by two experts. We calculated inter-rater agreement and finding prevalence from annotated data. We randomly split the annotated data into development (80%) and testing (20%) sets. We developed an NLP system from both rule-based and machine-learned models. We validated the system using accuracy metrics such as sensitivity, specificity, and area under the receiver operating characteristic curve (AUC). The multirater annotated dataset achieved inter-rater agreement of Cohen's kappa > 0.60 (substantial agreement) for 25 of 26 findings, with finding prevalence ranging from 3% to 89%. In the testing sample, rule-based and machine-learned predictions both had comparable average specificity (0.97 and 0.95, respectively). The machine-learned approach had a higher average sensitivity (0.94, compared to 0.83 for rules-based), and a higher overall AUC (0.98, compared to 0.90 for rules-based). Our NLP system performed well in identifying the 26 lumbar spine findings, as benchmarked by reference-standard annotation by medical experts. Machine-learned models provided substantial gains in model sensitivity with slight loss of specificity, and overall higher AUC. Copyright © 2018 The Association of University Radiologists. All rights reserved.
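The inter-rater agreement reported above (Cohen's kappa) corrects observed agreement for the agreement expected by chance. A stdlib-only sketch, using toy labels rather than the study's annotations:

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    # Chance agreement: probability both raters pick the same label at random
    expected = sum(freq_a[k] / n * freq_b[k] / n for k in freq_a)
    return (observed - expected) / (1 - expected)

# Two raters marking a finding present (1) or absent (0) in 8 reports:
kappa = cohens_kappa([1, 1, 1, 1, 0, 0, 0, 0],
                     [1, 1, 1, 0, 0, 0, 0, 0])  # 0.875 observed, 0.5 chance -> 0.75
```

A kappa above 0.60, the threshold used in the study, is conventionally read as substantial agreement.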

  18. Contour-Driven Atlas-Based Segmentation

    PubMed Central

    Wachinger, Christian; Fritscher, Karl; Sharp, Greg; Golland, Polina

    2016-01-01

    We propose new methods for automatic segmentation of images based on an atlas of manually labeled scans and contours in the image. First, we introduce a Bayesian framework for creating initial label maps from manually annotated training images. Within this framework, we model various registration- and patch-based segmentation techniques by changing the deformation field prior. Second, we perform contour-driven regression on the created label maps to refine the segmentation. Image contours and image parcellations give rise to non-stationary kernel functions that model the relationship between image locations. Setting the kernel to the covariance function in a Gaussian process establishes a distribution over label maps supported by image structures. Maximum a posteriori estimation of the distribution over label maps conditioned on the outcome of the atlas-based segmentation yields the refined segmentation. We evaluate the segmentation in two clinical applications: the segmentation of parotid glands in head and neck CT scans and the segmentation of the left atrium in cardiac MR angiography images. PMID:26068202

  19. Facilitating medical information search using Google Glass connected to a content-based medical image retrieval system.

    PubMed

    Widmer, Antoine; Schaer, Roger; Markonis, Dimitrios; Muller, Henning

    2014-01-01

    Wearable computing devices are starting to change the way users interact with computers and the Internet. Among them, Google Glass includes a small screen located in front of the right eye, a camera filming in front of the user and a small computing unit. Google Glass has the advantage of providing online services while leaving the user's hands free to perform tasks. These augmented glasses open up many useful applications, including in the medical domain. For example, Google Glass can easily provide video conferencing between medical doctors to discuss a live case. Using these glasses can also facilitate medical information search by allowing access to a large number of annotated medical cases during a consultation, in a fashion that is non-disruptive for medical staff. In this paper, we present a Google Glass application able to take a photo and send it to a medical image retrieval system along with keywords in order to retrieve similar cases. As a preliminary assessment of the usability of the application, we tested it under three conditions (images of the skin; printed CT scans and MRI images; and CT and MRI images acquired directly from an LCD screen) to explore whether using Google Glass affects the accuracy of the results returned by the medical image retrieval system. The preliminary results show that despite minor problems due to the relative instability of the Google Glass, images can be sent to and processed by the medical image retrieval system and similar images are returned to the user, potentially helping in the decision-making process.

  20. High-performance web viewer for cardiac images

    NASA Astrophysics Data System (ADS)

    dos Santos, Marcelo; Furuie, Sergio S.

    2004-04-01

    With the advent of digital devices for medical diagnosis, the use of regular film in radiology has decreased. Thus, the management and handling of medical images in digital format has become an important and critical task. In Cardiology, for example, the main difficulty is displaying dynamic images with the appropriate color palette and frame rate used in the acquisition process by Cath, Angio and Echo systems. Another difficulty is handling large images in memory on any existing personal computer, including thin clients. In this work we present a web-based application that carries out these tasks with robustness and excellent performance, without burdening the server and network. This application provides near-diagnostic quality display of cardiac images stored as DICOM 3.0 files via a web browser and provides a set of resources for viewing still and dynamic images. It can access image files from local disks or a network connection. Its features include real-time playback, dynamic thumbnail viewing during loading, access to patient database information, image processing tools, linear and angular measurements, on-screen annotations, image printing, and export of DICOM images to other image formats, among many others, all offered through a pleasant, user-friendly interface inside a Web browser by means of a Java application. This approach offers several advantages over most medical image viewers, such as ease of installation, integration with other systems by means of public and standardized interfaces, platform independence, and efficient manipulation and display of medical images, all with high performance.

  1. Recent Literature Shows Accelerated Growth in Hypermedia Tools: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Gabbard, Ralph

    1994-01-01

    Presents an annotated bibliography of materials on hypertext/hypermedia. Information available on the World Wide Web is described; journals that cover hypermedia are listed; and the main bibliography is divided into 3 sections on general hypertext applications (17 titles), DOS/Windows applications (17 titles), and HyperCard applications (18…

  2. VRML and Collaborative Environments: New Tools for Networked Visualization

    NASA Astrophysics Data System (ADS)

    Crutcher, R. M.; Plante, R. L.; Rajlich, P.

    We present two new applications that engage the network as a tool for astronomical research and/or education. The first is a VRML server which allows users over the Web to interactively create three-dimensional visualizations of FITS images contained in the NCSA Astronomy Digital Image Library (ADIL). The server's Web interface allows users to select images from the ADIL, fill in processing parameters, and create renderings featuring isosurfaces, slices, contours, and annotations; the often extensive computations are carried out on an NCSA SGI supercomputer server without the user having an individual account on the system. The user can then download the 3D visualizations as VRML files, which may be rotated and manipulated locally on virtually any class of computer. The second application is the ADILBrowser, a part of the NCSA Horizon Image Data Browser Java package. ADILBrowser allows a group of participants to browse images from the ADIL within a collaborative session. The collaborative environment is provided by the NCSA Habanero package which includes text and audio chat tools and a white board. The ADILBrowser is just an example of a collaborative tool that can be built with the Horizon and Habanero packages. The classes provided by these packages can be assembled to create custom collaborative applications that visualize data either from local disk or from anywhere on the network.

  3. Insights from Classifying Visual Concepts with Multiple Kernel Learning

    PubMed Central

    Binder, Alexander; Nakajima, Shinichi; Kloft, Marius; Müller, Christina; Samek, Wojciech; Brefeld, Ulf; Müller, Klaus-Robert; Kawanabe, Motoaki

    2012-01-01

    Combining information from various image features has become a standard technique in concept recognition tasks. However, the optimal way of fusing the resulting kernel functions is usually unknown in practical applications. Multiple kernel learning (MKL) techniques make it possible to determine an optimal linear combination of such similarity matrices. Classical approaches to MKL promote sparse mixtures. Unfortunately, 1-norm regularized MKL variants are often observed to be outperformed by an unweighted sum kernel. The main contributions of this paper are the following: we apply a recently developed non-sparse MKL variant to state-of-the-art concept recognition tasks from the application domain of computer vision. We provide insights on the benefits and limits of non-sparse MKL and compare it against its direct competitors, the sum-kernel SVM and sparse MKL. We report empirical results for the PASCAL VOC 2009 Classification and ImageCLEF2010 Photo Annotation challenge data sets. Data sets (kernel matrices) as well as further information are available at http://doc.ml.tu-berlin.de/image_mkl/ (Accessed 2012 Jun 25). PMID:22936970
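The linear kernel combination at the heart of MKL can be illustrated with a few lines of numpy. This sketch only evaluates a given weight vector (uniform weights recover the sum-kernel baseline); it does not implement the paper's MKL solver, which learns the weights jointly with the SVM.

```python
import numpy as np

def combine_kernels(kernels, weights):
    """Linear combination K = sum_m beta_m * K_m of base kernel matrices.

    In MKL the weights beta are learned jointly with the classifier;
    here they are simply given and normalized onto the simplex.
    """
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    return sum(b * K for b, K in zip(weights, kernels))

# Two toy base kernels (e.g., a color and a texture similarity matrix):
K_color = np.array([[1.0, 0.8], [0.8, 1.0]])
K_texture = np.array([[1.0, 0.2], [0.2, 1.0]])
K = combine_kernels([K_color, K_texture], [0.5, 0.5])  # average of the two
```

A 1-norm penalty on the weights drives many beta_m to zero (sparse mixtures); the non-sparse variants studied in the paper relax this, which is why they can behave more like the unweighted sum kernel.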

  4. A memory learning framework for effective image retrieval.

    PubMed

    Han, Junwei; Ngan, King N; Li, Mingjing; Zhang, Hong-Jiang

    2005-04-01

    Most current content-based image retrieval systems are still incapable of providing users with their desired results. The major difficulty lies in the gap between low-level image features and high-level image semantics. To address the problem, this study reports a framework for effective image retrieval by employing a novel idea of memory learning. It forms a knowledge memory model to store the semantic information by simply accumulating user-provided interactions. A learning strategy is then applied to predict the semantic relationships among images according to the memorized knowledge. Image queries are finally performed based on a seamless combination of low-level features and learned semantics. One important advantage of our framework is its ability to efficiently annotate images and also propagate the keyword annotation from the labeled images to unlabeled images. The presented algorithm has been integrated into a practical image retrieval system. Experiments on a collection of 10,000 general-purpose images demonstrate the effectiveness of the proposed framework.

  5. Automated synovium segmentation in doppler ultrasound images for rheumatoid arthritis assessment

    NASA Astrophysics Data System (ADS)

    Yeung, Pak-Hei; Tan, York-Kiat; Xu, Shuoyu

    2018-02-01

    We need better clinical tools to improve monitoring of synovitis, synovial inflammation in the joints, in rheumatoid arthritis (RA) assessment. Given its economical, safe and fast characteristics, ultrasound (US) especially Doppler ultrasound is frequently used. However, manual scoring of synovitis in US images is subjective and prone to observer variations. In this study, we propose a new and robust method for automated synovium segmentation in the commonly affected joints, i.e. metacarpophalangeal (MCP) and metatarsophalangeal (MTP) joints, which would facilitate automation in quantitative RA assessment. The bone contour in the US image is firstly detected based on a modified dynamic programming method, incorporating angular information for detecting curved bone surface and using image fuzzification to identify missing bone structure. K-means clustering is then performed to initialize potential synovium areas by utilizing the identified bone contour as boundary reference. After excluding invalid candidate regions, the final segmented synovium is identified by reconnecting remaining candidate regions using level set evolution. 15 MCP and 15 MTP US images were analyzed in this study. For each image, segmentations by our proposed method as well as two sets of annotations performed by an experienced clinician at different time-points were acquired. Dice's coefficient is 0.77+/-0.12 between the two sets of annotations. Similar Dice's coefficients are achieved between automated segmentation and either the first set of annotations (0.76+/-0.12) or the second set of annotations (0.75+/-0.11), with no significant difference (P = 0.77). These results verify that the accuracy of segmentation by our proposed method and by clinician is comparable. Therefore, reliable synovium identification can be made by our proposed method.
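Dice's coefficient, used above to compare the automated and clinician segmentations, is straightforward to compute from binary masks. A generic numpy sketch, not the authors' pipeline:

```python
import numpy as np

def dice(mask_a, mask_b):
    """Dice's coefficient 2|A∩B| / (|A| + |B|) between two binary masks."""
    mask_a = np.asarray(mask_a, dtype=bool)
    mask_b = np.asarray(mask_b, dtype=bool)
    intersection = np.logical_and(mask_a, mask_b).sum()
    return 2.0 * intersection / (mask_a.sum() + mask_b.sum())

# Toy 1-D "segmentations": 3 pixels each, overlapping on 2 pixels
d = dice([1, 1, 1, 0, 0], [0, 1, 1, 1, 0])  # 2*2 / (3+3) ≈ 0.667
```

The same function applied to two expert annotations of one image gives the inter-observer agreement (0.77 in the study) against which automated results are judged.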

  6. Faster, efficient and secure collection of research images: the utilization of cloud technology to expand the OMI-DB

    NASA Astrophysics Data System (ADS)

    Patel, M. N.; Young, K.; Halling-Brown, M. D.

    2018-03-01

    The demand for medical images for research is ever increasing owing to the rapid rise in novel machine learning approaches for early detection and diagnosis. The OPTIMAM Medical Image Database (OMI-DB)1,2 was created to provide a centralized, fully annotated dataset for research. The database contains both processed and unprocessed images, associated data, annotations and expert-determined ground truths. Since the inception of the database in early 2011, the volume of images and associated data collected has dramatically increased owing to automation of the collection pipeline and inclusion of new sites. Currently, these data are stored at each respective collection site and synced periodically to a central store. This leads to a large data footprint at each site, requiring large physical onsite storage, which is expensive. Here, we propose an update to the OMI-DB collection system, whereby all data are automatically transferred to the cloud on collection. This change in the data collection paradigm reduces the reliance on physical servers at each site, allows greater scope for future expansion, removes the need for dedicated backups, and improves security. Moreover, with the number of applications requesting access to the data increasing rapidly as the dataset matures, cloud technology facilitates faster sharing of data and better auditing of data access. Such updates, although they may sound trivial, require substantial modification to the existing pipeline to ensure data integrity and security compliance. Here, we describe the extensions to the OMI-DB collection pipeline and discuss the relative merits of the new system.

  7. Real-Time Ultrasound Segmentation, Analysis and Visualisation of Deep Cervical Muscle Structure.

    PubMed

    Cunningham, Ryan J; Harding, Peter J; Loram, Ian D

    2017-02-01

    Despite widespread availability of ultrasound and a need for personalised muscle diagnosis (neck/back pain-injury, work related disorder, myopathies, neuropathies), robust, online segmentation of muscles within complex groups remains unsolved by existing methods. For example, Cervical Dystonia (CD) is a prevalent neurological condition causing painful spasticity in one or multiple muscles in the cervical muscle system. Clinicians currently have no method for targeting/monitoring treatment of deep muscles. Automated methods of muscle segmentation would enable clinicians to study, target, and monitor the deep cervical muscles via ultrasound. We have developed a method for segmenting five bilateral cervical muscles and the spine via ultrasound alone, in real-time. Magnetic Resonance Imaging (MRI) and ultrasound data were collected from 22 participants (age: 29.0±6.6, male: 12). To acquire ultrasound muscle segment labels, a novel multimodal registration method was developed, involving MRI image annotation, and shape registration to MRI-matched ultrasound images, via approximation of the tissue deformation. We then applied polynomial regression to transform our annotations and textures into a mean space, before using shape statistics to generate a texture-to-shape dictionary. For segmentation, test images were compared to dictionary textures giving an initial segmentation, and then we used a customized Active Shape Model to refine the fit. Using ultrasound alone, on unseen participants, our technique currently segments a single image in [Formula: see text] to over 86% accuracy (Jaccard index). We propose this approach is applicable generally to segment, extrapolate and visualise deep muscle structure, and analyse statistical features online.
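The Jaccard index used as the accuracy measure above is the ratio of overlap to union of two segmentations. A generic illustration, not the authors' implementation:

```python
import numpy as np

def jaccard(mask_a, mask_b):
    """Jaccard index |A∩B| / |A∪B| between two binary masks."""
    mask_a = np.asarray(mask_a, dtype=bool)
    mask_b = np.asarray(mask_b, dtype=bool)
    intersection = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return intersection / union

# Toy 1-D masks overlapping on 2 of 4 occupied pixels:
j = jaccard([1, 1, 1, 0], [0, 1, 1, 1])  # 2/4 = 0.5
```

The Jaccard index is stricter than Dice's coefficient on the same pair of masks, so the reported "over 86% accuracy" corresponds to an even higher Dice score.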

  8. Automated Detection of Synapses in Serial Section Transmission Electron Microscopy Image Stacks

    PubMed Central

    Kreshuk, Anna; Koethe, Ullrich; Pax, Elizabeth; Bock, Davi D.; Hamprecht, Fred A.

    2014-01-01

    We describe a method for fully automated detection of chemical synapses in serial electron microscopy images with highly anisotropic axial and lateral resolution, such as images taken on transmission electron microscopes. Our pipeline starts from classification of the pixels based on 3D pixel features, which is followed by segmentation with an Ising model MRF and another classification step, based on object-level features. Classifiers are learned on sparse user labels; a fully annotated data subvolume is not required for training. The algorithm was validated on a set of 238 synapses in 20 serial 7197×7351 pixel images (4.5×4.5×45 nm resolution) of mouse visual cortex, manually labeled by three independent human annotators and additionally re-verified by an expert neuroscientist. The error rate of the algorithm (12% false negative, 7% false positive detections) is better than state-of-the-art, even though, unlike the state-of-the-art method, our algorithm does not require a prior segmentation of the image volume into cells. The software is based on the ilastik learning and segmentation toolkit and the vigra image processing library and is freely available on our website, along with the test data and gold standard annotations (http://www.ilastik.org/synapse-detection/sstem). PMID:24516550

  9. Medical Image Data and Datasets in the Era of Machine Learning-Whitepaper from the 2016 C-MIMI Meeting Dataset Session.

    PubMed

    Kohli, Marc D; Summers, Ronald M; Geis, J Raymond

    2017-08-01

    At the first annual Conference on Machine Intelligence in Medical Imaging (C-MIMI), held in September 2016, a conference session on medical image data and datasets for machine learning identified multiple issues. The common theme from attendees was that everyone participating in medical image evaluation with machine learning is data starved. There is an urgent need to find better ways to collect, annotate, and reuse medical imaging data. Unique domain issues with medical image datasets require further study, development, and dissemination of best practices and standards, and a coordinated effort among medical imaging domain experts, medical imaging informaticists, government and industry data scientists, and interested commercial, academic, and government entities. High-level attributes of reusable medical image datasets suitable to train, test, validate, verify, and regulate ML products should be better described. NIH and other government agencies should promote and, where applicable, enforce access to medical image datasets. We should improve communication among medical imaging domain experts, medical imaging informaticists, academic clinical and basic science researchers, government and industry data scientists, and interested commercial entities.

  10. Modeling semantic aspects for cross-media image indexing.

    PubMed

    Monay, Florent; Gatica-Perez, Daniel

    2007-10-01

    To go beyond the query-by-example paradigm in image retrieval, there is a need for semantic indexing of large image collections for intuitive text-based image search. Different models have been proposed to learn the dependencies between the visual content of an image set and the associated text captions, then allowing for the automatic creation of semantic indices for unannotated images. The task, however, remains unsolved. In this paper, we present three alternatives to learn a Probabilistic Latent Semantic Analysis model (PLSA) for annotated images, and evaluate their respective performance for automatic image indexing. Under the PLSA assumptions, an image is modeled as a mixture of latent aspects that generates both image features and text captions, and we investigate three ways to learn the mixture of aspects. We also propose a more discriminative image representation than the traditional Blob histogram, concatenating quantized local color information and quantized local texture descriptors. The first learning procedure of a PLSA model for annotated images is a standard EM algorithm, which implicitly assumes that the visual and the textual modalities can be treated equivalently. The other two models are based on an asymmetric PLSA learning, which makes it possible to constrain the definition of the latent space to the visual or the textual modality. We demonstrate that the textual modality is more appropriate to learn a semantically meaningful latent space, which translates into improved annotation performance. A comparison of our learning algorithms with respect to recent methods on a standard dataset is presented, and a detailed evaluation of the performance shows the validity of our framework.
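The standard (symmetric) EM procedure for PLSA described above can be sketched in numpy. This is an illustrative implementation on a toy count matrix, not the authors' code; it treats the visual and textual tokens of each image as one flat vocabulary.

```python
import numpy as np

def plsa_em(counts, n_aspects, n_iter=50, seed=0):
    """Symmetric PLSA fit by EM.

    counts[d, w]: co-occurrence count of token w (visual or textual)
    in image/document d.  Returns P(z|d) and P(w|z).
    """
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    p_z_d = rng.random((n_docs, n_aspects))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_aspects, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # E-step: posterior P(z | d, w) ∝ P(z|d) P(w|z)
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]   # (docs, aspects, words)
        post = joint / joint.sum(axis=1, keepdims=True)
        # M-step: re-estimate both conditionals from expected counts
        expected = counts[:, None, :] * post            # (docs, aspects, words)
        p_w_z = expected.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True)
        p_z_d = expected.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    return p_z_d, p_w_z

# Toy image-by-token counts (2 "images", vocabulary of 3 tokens):
counts = np.array([[5.0, 1.0, 0.0],
                   [0.0, 1.0, 5.0]])
p_aspect_given_doc, p_word_given_aspect = plsa_em(counts, n_aspects=2)
```

An asymmetric variant in the spirit of the paper would constrain the latent space to one modality, e.g. by keeping the aspect distributions learned on the textual tokens fixed while re-estimating only the other conditional.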

  11. A data model and database for high-resolution pathology analytical image informatics.

    PubMed

    Wang, Fusheng; Kong, Jun; Cooper, Lee; Pan, Tony; Kurc, Tahsin; Chen, Wenjin; Sharma, Ashish; Niedermayr, Cristobal; Oh, Tae W; Brat, Daniel; Farris, Alton B; Foran, David J; Saltz, Joel

    2011-01-01

    The systematic analysis of imaged pathology specimens often results in a vast amount of morphological information at both the cellular and sub-cellular scales. While microscopy scanners and computerized analysis are capable of capturing and analyzing data rapidly, microscopy image data remain underutilized in research and clinical settings. One major obstacle which tends to reduce wider adoption of these new technologies throughout the clinical and scientific communities is the challenge of managing, querying, and integrating the vast amounts of data resulting from the analysis of large digital pathology datasets. This paper presents a data model, which addresses these challenges, and demonstrates its implementation in a relational database system. This paper describes a data model, referred to as Pathology Analytic Imaging Standards (PAIS), and a database implementation, which are designed to support the data management and query requirements of detailed characterization of micro-anatomic morphology through many interrelated analysis pipelines on whole-slide images and tissue microarrays (TMAs). (1) Development of a data model capable of efficiently representing and storing virtual slide related image, annotation, markup, and feature information. (2) Development of a database, based on the data model, capable of supporting queries for data retrieval based on analysis and image metadata, queries for comparison of results from different analyses, and spatial queries on segmented regions, features, and classified objects. The work described in this paper is motivated by the challenges associated with characterization of micro-scale features for comparative and correlative analyses involving whole-slide tissue images and TMAs. Technologies for digitizing tissues have advanced significantly in the past decade. Slide scanners are capable of producing high-magnification, high-resolution images from whole slides and TMAs within several minutes. 
Hence, it is becoming increasingly feasible for basic, clinical, and translational research studies to produce thousands of whole-slide images. Systematic analysis of these large datasets requires efficient data management support for representing and indexing results from hundreds of interrelated analyses generating very large volumes of quantifications such as shape and texture and of classifications of the quantified features. We have designed a data model and a database to address the data management requirements of detailed characterization of micro-anatomic morphology through many interrelated analysis pipelines. The data model represents virtual slide related image, annotation, markup and feature information. The database supports a wide range of metadata and spatial queries on images, annotations, markups, and features. We currently have three databases running on a Dell PowerEdge T410 server with CentOS 5.5 Linux operating system. The database server is IBM DB2 Enterprise Edition 9.7.2. The set of databases consists of 1) a TMA database containing image analysis results from 4740 cases of breast cancer, with 641 MB storage size; 2) an algorithm validation database, which stores markups and annotations from two segmentation algorithms and two parameter sets on 18 selected slides, with 66 GB storage size; and 3) an in silico brain tumor study database comprising results from 307 TCGA slides, with 365 GB storage size. The latter two databases also contain human-generated annotations and markups for regions and nuclei. Modeling and managing pathology image analysis results in a database provide immediate benefits on the value and usability of data in a research study. The database provides powerful query capabilities, which are otherwise difficult or cumbersome to support by other approaches such as programming languages. 
Standardized, semantic annotated data representation and interfaces also make it possible to more efficiently share image data and analysis results.

  12. Webcams, Crowdsourcing, and Enhanced Crosswalks: Developing a Novel Method to Analyze Active Transportation.

    PubMed

    Hipp, J Aaron; Manteiga, Alicia; Burgess, Amanda; Stylianou, Abby; Pless, Robert

    2016-01-01

    Active transportation opportunities and infrastructure are an important component of a community's design, livability, and health. Features of the built environment influence active transportation, but objective study of the natural experiment effects of built environment improvements on active transportation is challenging. The purpose of this study was to develop and present a novel method of active transportation research using webcams and crowdsourcing, and to determine if crosswalk enhancement was associated with changes in active transportation rates, including across a variety of weather conditions. The 20,529 publicly available webcam images from two street intersections in Washington, DC, USA were used to examine the impact of an improved crosswalk on active transportation. A crowdsourcing service, Amazon Mechanical Turk, was used to annotate the image data. Temperature data were collected from the National Oceanic and Atmospheric Administration, and precipitation data were annotated from images by trained research assistants. Summary analyses demonstrated slight, bi-directional differences in the percent of images with pedestrians and bicyclists captured before and after the enhancement of the crosswalks. Chi-square analyses revealed these changes were not significant. In general, pedestrian presence increased in images captured during moderate temperatures compared to images captured during hot or cold temperatures. Chi-square analyses indicated the crosswalk improvement may have encouraged walking and biking in uncomfortable outdoor conditions (P < 0.5). The methods employed provide an objective, cost-effective alternative to traditional means of examining the effects of built environment changes on active transportation. The use of webcams to collect active transportation data has applications for community policymakers, planners, and health professionals. 
Future research will work to validate this method in a variety of settings as well as across different built environment and community policy initiatives.
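The chi-square analyses mentioned above can be reproduced in outline with a stdlib-only sketch of Pearson's chi-square statistic for a contingency table; the counts below are invented for illustration, not the study's data.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic sum((O - E)^2 / E) for a contingency table.

    table[i][j]: observed count in row i (e.g., before/after the crosswalk
    enhancement), column j (e.g., pedestrians present/absent in an image).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    return chi2

# Toy before/after counts of images with vs. without pedestrians:
stat = chi_square_statistic([[10, 20], [20, 10]])  # ≈ 6.67
```

For a 2x2 table (one degree of freedom), the statistic is compared against the chi-square critical value, 3.84 at the conventional 0.05 level.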

  13. Game-powered machine learning

    PubMed Central

    Barrington, Luke; Turnbull, Douglas; Lanckriet, Gert

    2012-01-01

    Searching for relevant content in a massive amount of multimedia information is facilitated by accurately annotating each image, video, or song with a large number of relevant semantic keywords, or tags. We introduce game-powered machine learning, an integrated approach to annotating multimedia content that combines the effectiveness of human computation, through online games, with the scalability of machine learning. We investigate this framework for labeling music. First, a socially-oriented music annotation game called Herd It collects reliable music annotations based on the “wisdom of the crowds.” Second, these annotated examples are used to train a supervised machine learning system. Third, the machine learning system actively directs the annotation games to collect new data that will most benefit future model iterations. Once trained, the system can automatically annotate a corpus of music much larger than what could be labeled using human computation alone. Automatically annotated songs can be retrieved based on their semantic relevance to text-based queries (e.g., “funky jazz with saxophone,” “spooky electronica,” etc.). Based on the results presented in this paper, we find that actively coupling annotation games with machine learning provides a reliable and scalable approach to making searchable massive amounts of multimedia data. PMID:22460786

  14. Game-powered machine learning.

    PubMed

    Barrington, Luke; Turnbull, Douglas; Lanckriet, Gert

    2012-04-24

    Searching for relevant content in a massive amount of multimedia information is facilitated by accurately annotating each image, video, or song with a large number of relevant semantic keywords, or tags. We introduce game-powered machine learning, an integrated approach to annotating multimedia content that combines the effectiveness of human computation, through online games, with the scalability of machine learning. We investigate this framework for labeling music. First, a socially-oriented music annotation game called Herd It collects reliable music annotations based on the "wisdom of the crowds." Second, these annotated examples are used to train a supervised machine learning system. Third, the machine learning system actively directs the annotation games to collect new data that will most benefit future model iterations. Once trained, the system can automatically annotate a corpus of music much larger than what could be labeled using human computation alone. Automatically annotated songs can be retrieved based on their semantic relevance to text-based queries (e.g., "funky jazz with saxophone," "spooky electronica," etc.). Based on the results presented in this paper, we find that actively coupling annotation games with machine learning provides a reliable and scalable approach to making searchable massive amounts of multimedia data.

  15. Crowdsourcing for error detection in cortical surface delineations.

    PubMed

    Ganz, Melanie; Kondermann, Daniel; Andrulis, Jonas; Knudsen, Gitte Moos; Maier-Hein, Lena

    2017-01-01

    With the recent trend toward big data analysis, neuroimaging datasets have grown substantially in the past years. While larger datasets potentially offer important insights for medical research, one major bottleneck is the amount of medical-expert resources required to validate automatic processing results. To address this issue, the goal of this paper was to assess whether anonymous nonexperts from an online community can perform quality control of MR-based cortical surface delineations derived by an automatic algorithm. So-called knowledge workers from an online crowdsourcing platform were asked to annotate errors in automatic cortical surface delineations on 100 central coronal slices of MR images. On average, annotations for 100 images were obtained in less than an hour. When using expert annotations as reference, the crowd on average achieves a sensitivity of 82 % and a precision of 42 %. Merging multiple annotations per image significantly improves the sensitivity of the crowd (up to 95 %), but leads to a decrease in precision (as low as 22 %). Our experiments show that the detection of errors in automatic cortical surface delineations generated by anonymous untrained workers is feasible. Future work will focus on increasing the sensitivity of our method further, such that the error detection tasks can be handled exclusively by the crowd and expert resources can be focused on error correction.
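
    The sensitivity/precision trade-off from merging multiple crowd annotations can be illustrated on toy data. The error locations below are invented, and the paper's exact merging scheme may differ; union vs. majority voting is one simple way the trade-off arises.

```python
def merge_union(annos):
    """Union merge: flag a location if any worker flagged it
    (raises sensitivity, lowers precision)."""
    out = set()
    for a in annos:
        out |= a
    return out

def merge_majority(annos):
    """Majority vote: flag only locations more than half the workers marked."""
    votes = {}
    for a in annos:
        for x in a:
            votes[x] = votes.get(x, 0) + 1
    return {x for x, v in votes.items() if v > len(annos) / 2}

def sensitivity_precision(pred, truth):
    """Sensitivity and precision of predicted error locations vs. reference."""
    tp = len(pred & truth)
    sens = tp / len(truth) if truth else 1.0
    prec = tp / len(pred) if pred else 1.0
    return sens, prec

# Made-up error locations: expert reference and three crowd workers
expert = {1, 2, 3, 4}
workers = [{1, 2, 9}, {2, 3}, {1, 8}]
union_sens, union_prec = sensitivity_precision(merge_union(workers), expert)
maj_sens, maj_prec = sensitivity_precision(merge_majority(workers), expert)
# Union: sensitivity 0.75, precision 0.6; majority: sensitivity 0.5, precision 1.0
```

    The union merge catches more true errors at the cost of false alarms, which is the same direction of effect the study reports when merging annotations.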

  16. Ontology design patterns to disambiguate relations between genes and gene products in GENIA

    PubMed Central

    2011-01-01

    Motivation: Annotated reference corpora play an important role in biomedical information extraction. A semantic annotation of the natural language texts in these reference corpora using formal ontologies is challenging due to the inherent ambiguity of natural language. The provision of formal definitions and axioms for semantic annotations offers the means for ensuring consistency and enables the development of verifiable annotation guidelines. Consistent semantic annotations facilitate the automatic discovery of new information through deductive inferences. Results: We provide a formal characterization of the relations used in the recent GENIA corpus annotations. For this purpose, we both select existing axiom systems based on the desired properties of the relations within the domain and develop new axioms for several relations. To apply this ontology of relations to the semantic annotation of text corpora, we implement two ontology design patterns. In addition, we provide a software application to convert annotated GENIA abstracts into OWL ontologies by combining both the ontology of relations and the design patterns. As a result, the GENIA abstracts become available as OWL ontologies and are amenable to automated verification, deductive inferences and other knowledge-based applications. Availability: Documentation, implementation and examples are available from http://www-tsujii.is.s.u-tokyo.ac.jp/GENIA/. PMID:22166341

  17. Automated Detection of Leakage in Fluorescein Angiography Images with Application to Malarial Retinopathy

    PubMed Central

    Zhao, Yitian; MacCormick, Ian J. C.; Parry, David G.; Leach, Sophie; Beare, Nicholas A. V.; Harding, Simon P.; Zheng, Yalin

    2015-01-01

    The detection and assessment of leakage in retinal fluorescein angiogram images is important for the management of a wide range of retinal diseases. We have developed a framework that can automatically detect three types of leakage (large focal, punctate focal, and vessel segment leakage) and validated it on images from patients with malarial retinopathy. This framework comprises three steps: vessel segmentation, saliency feature generation and leakage detection. We tested the effectiveness of this framework by applying it to images from 20 patients with large focal leak, 10 patients with punctate focal leak, and 5,846 vessel segments from 10 patients with vessel leakage. The sensitivities in detecting large focal, punctate focal, and vessel segment leakage are 95%, 82%, and 81%, respectively, when compared to manual annotation by expert human observers. Our framework has the potential to become a powerful new tool for studying malarial retinopathy, and other conditions involving retinal leakage. PMID:26030010

  18. Automated detection of leakage in fluorescein angiography images with application to malarial retinopathy.

    PubMed

    Zhao, Yitian; MacCormick, Ian J C; Parry, David G; Leach, Sophie; Beare, Nicholas A V; Harding, Simon P; Zheng, Yalin

    2015-06-01

    The detection and assessment of leakage in retinal fluorescein angiogram images is important for the management of a wide range of retinal diseases. We have developed a framework that can automatically detect three types of leakage (large focal, punctate focal, and vessel segment leakage) and validated it on images from patients with malarial retinopathy. This framework comprises three steps: vessel segmentation, saliency feature generation and leakage detection. We tested the effectiveness of this framework by applying it to images from 20 patients with large focal leak, 10 patients with punctate focal leak, and 5,846 vessel segments from 10 patients with vessel leakage. The sensitivities in detecting large focal, punctate focal, and vessel segment leakage are 95%, 82%, and 81%, respectively, when compared to manual annotation by expert human observers. Our framework has the potential to become a powerful new tool for studying malarial retinopathy, and other conditions involving retinal leakage.

  19. Time required for navigated macular laser photocoagulation treatment with the Navilas.

    PubMed

    Ober, Michael D; Kernt, Marcus; Cortes, Marco A; Kozak, Igor

    2013-04-01

    Navilas laser is a novel technology combining photocoagulation with imaging, including fluorescein angiography (FA) images, which are annotated and aligned to a live fundus view. We determine the time necessary for planning and treatment of macular edema utilizing the Navilas. The screen recordings during treatments were retrospectively analyzed for treatment type, number of laser shots, the duration of planning (measured from the time the planning image was selected to the time of marking the last planned treatment spot), and total time of laser application. A total of 93 treatments (30 grid, 30 focal and 33 combined treatments) by four physicians from three sites were included. An average of 125 spots were applied to each eye. The total time spent for each focal treatment, including planning, was 7 min 47 s (±3 min 32 s). Navilas is a novel device providing a time-efficient platform for evaluating FA images and performing threshold macular laser photocoagulation.

  20. Optimal graph based segmentation using flow lines with application to airway wall segmentation.

    PubMed

    Petersen, Jens; Nielsen, Mads; Lo, Pechin; Saghir, Zaigham; Dirksen, Asger; de Bruijne, Marleen

    2011-01-01

    This paper introduces a novel optimal graph construction method that is applicable to multi-dimensional, multi-surface segmentation problems. Such problems are often solved by refining an initial coarse surface within the space given by graph columns. Conventional columns are not well suited for surfaces with high curvature or complex shapes, but the proposed columns, based on properly generated, non-intersecting flow lines, guarantee solutions that do not self-intersect and are better suited to such surfaces. The method is applied to segment human airway walls in computed tomography images. Comparison with manual annotations on 649 cross-sectional images from 15 different subjects shows significantly smaller contour distances and larger area of overlap than are obtained with recently published graph based methods. Airway abnormality measurements obtained with the method on 480 scan pairs from a lung cancer screening trial are reproducible and correlate significantly with lung function.

  1. KID Project: an internet-based digital video atlas of capsule endoscopy for research purposes.

    PubMed

    Koulaouzidis, Anastasios; Iakovidis, Dimitris K; Yung, Diana E; Rondonotti, Emanuele; Kopylov, Uri; Plevris, John N; Toth, Ervin; Eliakim, Abraham; Wurm Johansson, Gabrielle; Marlicz, Wojciech; Mavrogenis, Georgios; Nemeth, Artur; Thorlacius, Henrik; Tontini, Gian Eugenio

    2017-06-01

    Capsule endoscopy (CE) has revolutionized small-bowel (SB) investigation. Computational methods can enhance diagnostic yield (DY); however, incorporating machine learning algorithms (MLAs) into CE reading is difficult as large amounts of image annotations are required for training. Current databases lack graphic annotations of pathologies and cannot be used. A novel database, KID, aims to provide a reference for research and development of medical decision support systems (MDSS) for CE. Open-source software was used for the KID database. Clinicians contribute anonymized, annotated CE images and videos. Graphic annotations are supported by an open-access annotation tool (Ratsnake). We detail an experiment based on the KID database, examining differences in SB lesion measurement between human readers and an MLA. The Jaccard Index (JI) was used to evaluate similarity between annotations by the MLA and human readers. The MLA performed best in measuring lymphangiectasias with a JI of 81 ± 6 %. The other lesion types were: angioectasias (JI 64 ± 11 %), aphthae (JI 64 ± 8 %), chylous cysts (JI 70 ± 14 %), polypoid lesions (JI 75 ± 21 %), and ulcers (JI 56 ± 9 %). The MLA can perform as well as human readers in the measurement of SB angioectasias in white light (WL). Automated lesion measurement is therefore feasible. KID is currently the only open-source CE database developed specifically to aid development of MDSS. Our experiment demonstrates this potential.
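
    The Jaccard Index used above to score agreement between the MLA's and the readers' lesion annotations is simply intersection over union. A minimal version on pixel-coordinate sets, with invented lesion outlines for illustration:

```python
def jaccard(a, b):
    """Jaccard Index |A & B| / |A | B| of two annotation pixel sets
    (defined as 1.0 when both sets are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0

# Two hypothetical lesion annotations as (row, col) pixel sets
reader = {(r, c) for r in range(4) for c in range(4)}           # 4x4 block
algorithm = {(r, c) for r in range(1, 5) for c in range(1, 5)}  # shifted 4x4 block
ji = jaccard(reader, algorithm)  # overlap 9 px, union 23 px -> 9/23
```

    A JI of 81 % as reported for lymphangiectasias means the overlapping area of the two annotations is 81 % of their combined area.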

  2. BDVC (Bimodal Database of Violent Content): A database of violent audio and video

    NASA Astrophysics Data System (ADS)

    Rivera Martínez, Jose Luis; Mijes Cruz, Mario Humberto; Rodríguez Vázquez, Manuel Antonio; Rodríguez Espejo, Luis; Montoya Obeso, Abraham; García Vázquez, Mireya Saraí; Ramírez Acosta, Alejandro Álvaro

    2017-09-01

    Nowadays there is a trend towards the use of unimodal databases for multimedia content description, organization, and retrieval applications that handle a single type of content, such as text, voice, or images. Bimodal databases, in contrast, allow two different types of content, such as audio-video or image-text, to be associated semantically. Generating a bimodal audio-video database implies creating a connection between the multimedia content through the semantic relation that associates the actions in both types of information. This paper describes in detail the characteristics and methodology used to create the bimodal database of violent content; the semantic relationship is established by the proposed concepts that describe the audiovisual information. Using bimodal databases in applications related to audiovisual content processing increases semantic performance if and only if those applications process both types of content. The bimodal database contains 580 annotated audiovisual segments, with a total duration of 28 minutes, divided into 41 classes. Bimodal databases are a tool for generating applications for the semantic web.

  3. Automated processing of zebrafish imaging data: a survey.

    PubMed

    Mikut, Ralf; Dickmeis, Thomas; Driever, Wolfgang; Geurts, Pierre; Hamprecht, Fred A; Kausler, Bernhard X; Ledesma-Carbayo, María J; Marée, Raphaël; Mikula, Karol; Pantazis, Periklis; Ronneberger, Olaf; Santos, Andres; Stotzka, Rainer; Strähle, Uwe; Peyriéras, Nadine

    2013-09-01

    Due to the relative transparency of its embryos and larvae, the zebrafish is an ideal model organism for bioimaging approaches in vertebrates. Novel microscope technologies allow the imaging of developmental processes in unprecedented detail, and they enable the use of complex image-based read-outs for high-throughput/high-content screening. Such applications can easily generate Terabytes of image data, the handling and analysis of which becomes a major bottleneck in extracting the targeted information. Here, we describe the current state of the art in computational image analysis in the zebrafish system. We discuss the challenges encountered when handling high-content image data, especially with regard to data quality, annotation, and storage. We survey methods for preprocessing image data for further analysis, and describe selected examples of automated image analysis, including the tracking of cells during embryogenesis, heartbeat detection, identification of dead embryos, recognition of tissues and anatomical landmarks, and quantification of behavioral patterns of adult fish. We review recent examples for applications using such methods, such as the comprehensive analysis of cell lineages during early development, the generation of a three-dimensional brain atlas of zebrafish larvae, and high-throughput drug screens based on movement patterns. Finally, we identify future challenges for the zebrafish image analysis community, notably those concerning the compatibility of algorithms and data formats for the assembly of modular analysis pipelines.

  4. Automated Processing of Zebrafish Imaging Data: A Survey

    PubMed Central

    Dickmeis, Thomas; Driever, Wolfgang; Geurts, Pierre; Hamprecht, Fred A.; Kausler, Bernhard X.; Ledesma-Carbayo, María J.; Marée, Raphaël; Mikula, Karol; Pantazis, Periklis; Ronneberger, Olaf; Santos, Andres; Stotzka, Rainer; Strähle, Uwe; Peyriéras, Nadine

    2013-01-01

    Due to the relative transparency of its embryos and larvae, the zebrafish is an ideal model organism for bioimaging approaches in vertebrates. Novel microscope technologies allow the imaging of developmental processes in unprecedented detail, and they enable the use of complex image-based read-outs for high-throughput/high-content screening. Such applications can easily generate Terabytes of image data, the handling and analysis of which becomes a major bottleneck in extracting the targeted information. Here, we describe the current state of the art in computational image analysis in the zebrafish system. We discuss the challenges encountered when handling high-content image data, especially with regard to data quality, annotation, and storage. We survey methods for preprocessing image data for further analysis, and describe selected examples of automated image analysis, including the tracking of cells during embryogenesis, heartbeat detection, identification of dead embryos, recognition of tissues and anatomical landmarks, and quantification of behavioral patterns of adult fish. We review recent examples for applications using such methods, such as the comprehensive analysis of cell lineages during early development, the generation of a three-dimensional brain atlas of zebrafish larvae, and high-throughput drug screens based on movement patterns. Finally, we identify future challenges for the zebrafish image analysis community, notably those concerning the compatibility of algorithms and data formats for the assembly of modular analysis pipelines. PMID:23758125

  5. Protein Annotators' Assistant: A Novel Application of Information Retrieval Techniques.

    ERIC Educational Resources Information Center

    Wise, Michael J.

    2000-01-01

    Protein Annotators' Assistant (PAA) is a software system which assists protein annotators in assigning functions to newly sequenced proteins. PAA employs a number of information retrieval techniques in a novel setting and is thus related to text categorization, where multiple categories may be suggested, except that in this case none of the…

  6. Artistic image analysis using graph-based learning approaches.

    PubMed

    Carneiro, Gustavo

    2013-08-01

    We introduce a new methodology for the problem of artistic image analysis, which among other tasks, involves the automatic identification of visual classes present in an artwork. In this paper, we advocate the idea that artistic image analysis must explore a graph that captures the network of artistic influences by computing the similarities in terms of appearance and manual annotation. One of the novelties of our methodology is the proposed formulation that is a principled way of combining these two similarities in a single graph. Using this graph, we show that an efficient random walk algorithm based on an inverted label propagation formulation produces more accurate annotation and retrieval results compared with the following baseline algorithms: bag of visual words, label propagation, matrix completion, and structural learning. We also show that the proposed approach leads to more efficient inference and training procedures. This experiment is run on a database containing 988 artistic images (with 49 visual classification problems divided into a multiclass problem with 27 classes and 48 binary problems), where we show the inference and training running times, and quantitative comparisons with respect to several retrieval and annotation performance measures.
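
    The inverted label propagation formulation itself is not spelled out in this abstract, but the basic label-diffusion idea it builds on can be sketched on a toy similarity graph. The graph, the damping factor alpha, and the iteration count below are all illustrative choices, not the authors' settings.

```python
def label_propagation(adj, seeds, n_labels, alpha=0.85, iters=50):
    """Diffuse label scores over a weighted similarity graph.
    adj: {node: {neighbor: weight}}; seeds: {node: label index}."""
    nodes = list(adj)
    scores = {v: [0.0] * n_labels for v in nodes}
    for v, lab in seeds.items():
        scores[v][lab] = 1.0
    for _ in range(iters):
        new = {}
        for v in nodes:
            total = sum(adj[v].values()) or 1.0
            agg = [0.0] * n_labels
            for u, w in adj[v].items():
                for k in range(n_labels):
                    agg[k] += (w / total) * scores[u][k]
            base = [0.0] * n_labels
            if v in seeds:
                base[seeds[v]] = 1.0
            # Mix diffused neighbor scores with the clamped seed labels
            new[v] = [alpha * agg[k] + (1 - alpha) * base[k]
                      for k in range(n_labels)]
        scores = new
    return {v: max(range(n_labels), key=lambda k: s[k])
            for v, s in scores.items()}

# Toy chain of four artworks with manual labels on the two endpoints
adj = {"a": {"b": 1.0}, "b": {"a": 1.0, "c": 1.0},
       "c": {"b": 1.0, "d": 1.0}, "d": {"c": 1.0}}
labels = label_propagation(adj, {"a": 0, "d": 1}, n_labels=2)
```

    Unlabeled nodes take the label that diffuses most strongly to them from annotated neighbors; here "b" follows seed "a" and "c" follows seed "d".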

  7. An annotation of cuts, depicted locations, and temporal progression in the motion picture "Forrest Gump"

    PubMed Central

    Häusler, Christian O.; Hanke, Michael

    2016-01-01

    Here we present an annotation of locations and temporal progression depicted in the movie “Forrest Gump”, as an addition to a large public functional brain imaging dataset ( http://studyforrest.org). The annotation provides information about the exact timing of each of the 870 shots, and the depicted location after every cut with a high, medium, and low level of abstraction. Additionally, four classes are used to distinguish the differences of the depicted time between shots. Each shot is also annotated regarding the type of location (interior/exterior) and time of day. This annotation enables further studies of visual perception, memory of locations, and the perception of time under conditions of real-life complexity using the studyforrest dataset. PMID:27781092

  8. Combining rules, background knowledge and change patterns to maintain semantic annotations.

    PubMed

    Cardoso, Silvio Domingos; Chantal, Reynaud-Delaître; Da Silveira, Marcos; Pruski, Cédric

    2017-01-01

    Knowledge Organization Systems (KOS) play a key role in enriching biomedical information in order to make it machine-understandable and shareable. This is done by annotating medical documents, or more specifically, associating concept labels from KOS with pieces of digital information, e.g., images or texts. However, the dynamic nature of KOS may impact the annotations, thus creating a mismatch between the evolved concept and the associated information. To solve this problem, methods to maintain the quality of the annotations are required. In this paper, we define a framework based on rules, background knowledge and change patterns to drive the annotation adaptation process. We experimentally evaluate the proposed approach in realistic case studies and demonstrate the overall performance of our approach in different KOS considering the precision, recall, F1-score and AUC value of the system.

  9. Combining rules, background knowledge and change patterns to maintain semantic annotations

    PubMed Central

    Cardoso, Silvio Domingos; Chantal, Reynaud-Delaître; Da Silveira, Marcos; Pruski, Cédric

    2017-01-01

    Knowledge Organization Systems (KOS) play a key role in enriching biomedical information in order to make it machine-understandable and shareable. This is done by annotating medical documents, or more specifically, associating concept labels from KOS with pieces of digital information, e.g., images or texts. However, the dynamic nature of KOS may impact the annotations, thus creating a mismatch between the evolved concept and the associated information. To solve this problem, methods to maintain the quality of the annotations are required. In this paper, we define a framework based on rules, background knowledge and change patterns to drive the annotation adaptation process. We experimentally evaluate the proposed approach in realistic case studies and demonstrate the overall performance of our approach in different KOS considering the precision, recall, F1-score and AUC value of the system. PMID:29854115

  10. Learning to merge: a new tool for interactive mapping

    NASA Astrophysics Data System (ADS)

    Porter, Reid B.; Lundquist, Sheng; Ruggiero, Christy

    2013-05-01

    The task of turning raw imagery into semantically meaningful maps and overlays is a key area of remote sensing activity. Image analysts, in applications ranging from environmental monitoring to intelligence, use imagery to generate and update maps of terrain, vegetation, road networks, buildings and other relevant features. Often these tasks can be cast as a pixel labeling problem, and several interactive pixel labeling tools have been developed. These tools exploit training data, which is generated by analysts using simple and intuitive paint-program annotation tools, in order to tailor the labeling algorithm for the particular dataset and task. In other cases, the task is best cast as a pixel segmentation problem. Interactive pixel segmentation tools have also been developed, but these tools typically do not learn from training data like the pixel labeling tools do. In this paper we investigate tools for interactive pixel segmentation that also learn from user input. The input has the form of segment merging (or grouping). Merging examples are 1) easily obtained from analysts using vector annotation tools, and 2) more challenging to exploit than traditional labels. We outline the key issues in developing these interactive merging tools, and describe their application to remote sensing.
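
    Exploiting merge examples starts with something like a disjoint-set structure over the initial segmentation. The sketch below is not the authors' learning method (which trains on such examples); it only shows how analyst merge annotations group segment ids into regions.

```python
class SegmentMerger:
    """Union-find over initial segment ids; each analyst merge
    example joins two segments into one region."""

    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, x):
        # Path halving keeps the trees shallow
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def merge(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra

    def regions(self):
        groups = {}
        for i in range(len(self.parent)):
            groups.setdefault(self.find(i), []).append(i)
        return sorted(groups.values())

# Five initial segments; the analyst merges 0-1 and 3-4
merger = SegmentMerger(5)
for a, b in [(0, 1), (3, 4)]:
    merger.merge(a, b)
regions = merger.regions()  # [[0, 1], [2], [3, 4]]
```

    A learning variant would score each candidate segment pair and use the analyst's merges as positive training examples, which is what makes merge input "more challenging to exploit than traditional labels."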

  11. Evaluation of a novel tablet application for improvement in colonoscopy training and mentoring (with video).

    PubMed

    Laborde, Cregan J; Bell, Charreau S; Slaughter, James Chris; Valdastri, Pietro; Obstein, Keith L

    2017-03-01

    Endoscopic training can be challenging for the trainee and preceptor. Frustration can result from ineffective communication regarding areas of interest. Our team developed a novel tablet application for real-time mirroring of the colonoscopy examination that allows preceptors to make annotations directly on the viewing monitor. The potential for improvement in team proficiency and satisfaction is unknown. The on-screen endoscopic image is mirrored to an Android tablet and permits real-time annotation directly on the in-room endoscopic image display. Preceptors can also "freeze-frame" an image and provide visual on-screen instruction (telestration). Trainees, precepted by a GI attending, were 1:1 randomized to perform colonoscopy on a training phantom using the application with traditional precepting or traditional precepting alone. Magnetized polyps (size < 5 mm) were placed in 1 of 5 preset location scenarios. Each trainee performed a total of 10 colonoscopies and completed each location scenario twice. During withdrawal, the trainee and the attending identified polyps. Outcome measures included number of polyps missed and participant satisfaction after each trial. Fifteen trainees (6 novice and 9 GI fellows) performed a total of 150 colonoscopies where 330 polyps in total were placed. Fellows missed fewer polyps using the tablet versus traditional precepting alone (4.2% vs 12.5%; P = .04). There was no significant difference in missed polyps for novices (12.5% vs 18.8%; P = .66). Overall, fellows missed fewer polyps when compared with novices regardless of the precepting method (P = .01). The attending and all trainees reported reduced stress with improved communication using the tablet. Fellows missed fewer polyps using the tablet when compared with traditional endoscopy precepting. All trainees reported reduced stress, quicker identification of polyps, and improved educational satisfaction using the tablet. 
Our application has the potential to improve lesion detection by the trainee and attending team and to enhance the endoscopy training experience for both the trainee and the attending preceptor.

  12. Annotated chemical patent corpus: a gold standard for text mining.

    PubMed

    Akhondi, Saber A; Klenner, Alexander G; Tyrchan, Christian; Manchala, Anil K; Boppana, Kiran; Lowe, Daniel; Zimmermann, Marc; Jagarlapudi, Sarma A R P; Sayle, Roger; Kors, Jan A; Muresan, Sorel

    2014-01-01

    Exploring the chemical and biological space covered by patent applications is crucial in early-stage medicinal chemistry activities. Patent analysis can provide understanding of compound prior art, novelty checking, validation of biological assays, and identification of new starting points for chemical exploration. Extracting chemical and biological entities from patents through manual extraction by expert curators can take a substantial amount of time and resources. Text mining methods can help to ease this process. To validate the performance of such methods, a manually annotated patent corpus is essential. In this study we have produced a large gold standard chemical patent corpus. We developed annotation guidelines and selected 200 full patents from the World Intellectual Property Organization, United States Patent and Trademark Office, and European Patent Office. The patents were pre-annotated automatically and made available to four independent annotator groups each consisting of two to ten annotators. The annotators marked chemicals in different subclasses, diseases, targets, and modes of action. Spelling mistakes and spurious line breaks due to optical character recognition errors were also annotated. A subset of 47 patents was annotated by at least three annotator groups, from which harmonized annotations and inter-annotator agreement scores were derived. One group annotated the full set. The patent corpus includes 400,125 annotations for the full set and 36,537 annotations for the harmonized set. All patents and annotated entities are publicly available at www.biosemantics.org.
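
    Inter-annotator agreement scores like those derived here can be computed several ways; one common chance-corrected measure for a pair of annotators is Cohen's kappa. The abstract does not state which measure the corpus used, so the choice of kappa and the token labels below are illustrative.

```python
def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa between two annotators' label sequences:
    (observed agreement - chance agreement) / (1 - chance agreement)."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    cats = set(labels_a) | set(labels_b)
    expected = sum(
        (labels_a.count(c) / n) * (labels_b.count(c) / n) for c in cats
    )
    return (observed - expected) / (1 - expected) if expected < 1 else 1.0

# Hypothetical token-level entity tags from two annotator groups
group_a = ["CHEM", "O", "O", "DIS"]
group_b = ["CHEM", "O", "DIS", "DIS"]
kappa = cohens_kappa(group_a, group_b)  # 7/11, about 0.64
```

    Pairwise kappas over the 47 doubly-annotated patents would give the kind of agreement matrix from which harmonized annotations are justified.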

  13. Annotated Chemical Patent Corpus: A Gold Standard for Text Mining

    PubMed Central

    Akhondi, Saber A.; Klenner, Alexander G.; Tyrchan, Christian; Manchala, Anil K.; Boppana, Kiran; Lowe, Daniel; Zimmermann, Marc; Jagarlapudi, Sarma A. R. P.; Sayle, Roger; Kors, Jan A.; Muresan, Sorel

    2014-01-01

    Exploring the chemical and biological space covered by patent applications is crucial in early-stage medicinal chemistry activities. Patent analysis can provide understanding of compound prior art, novelty checking, validation of biological assays, and identification of new starting points for chemical exploration. Extracting chemical and biological entities from patents through manual extraction by expert curators can take a substantial amount of time and resources. Text mining methods can help to ease this process. To validate the performance of such methods, a manually annotated patent corpus is essential. In this study we have produced a large gold standard chemical patent corpus. We developed annotation guidelines and selected 200 full patents from the World Intellectual Property Organization, United States Patent and Trademark Office, and European Patent Office. The patents were pre-annotated automatically and made available to four independent annotator groups each consisting of two to ten annotators. The annotators marked chemicals in different subclasses, diseases, targets, and modes of action. Spelling mistakes and spurious line breaks due to optical character recognition errors were also annotated. A subset of 47 patents was annotated by at least three annotator groups, from which harmonized annotations and inter-annotator agreement scores were derived. One group annotated the full set. The patent corpus includes 400,125 annotations for the full set and 36,537 annotations for the harmonized set. All patents and annotated entities are publicly available at www.biosemantics.org. PMID:25268232

  14. The Cerefy Neuroradiology Atlas: a Talairach-Tournoux atlas-based tool for analysis of neuroimages available over the internet.

    PubMed

    Nowinski, Wieslaw L; Belov, Dmitry

    2003-09-01

    The article introduces an atlas-assisted method and a tool called the Cerefy Neuroradiology Atlas (CNA), available over the Internet for neuroradiology and human brain mapping. The CNA contains an enhanced, extended, and fully segmented and labeled electronic version of the Talairach-Tournoux brain atlas, including parcellated gyri and Brodmann's areas. To the best of our knowledge, this is the first online, publicly available application with the Talairach-Tournoux atlas. The process of atlas-assisted neuroimage analysis is done in five steps: image data loading, Talairach landmark setting, atlas normalization, image data exploration and analysis, and result saving. Neuroimage analysis is supported by a near-real-time, atlas-to-data warping based on the Talairach transformation. The CNA runs on multiple platforms; is able to process simultaneously multiple anatomical and functional data sets; and provides functions for a rapid atlas-to-data registration, interactive structure labeling and annotating, and mensuration. It is also empowered with several unique features, including interactive atlas warping facilitating fine tuning of atlas-to-data fit, navigation on the triplanar formed by the image data and the atlas, multiple-images-in-one display with interactive atlas-anatomy-function blending, multiple label display, and saving of labeled and annotated image data. The CNA is useful for fast atlas-assisted analysis of neuroimage data sets. It increases accuracy and reduces time in localization analysis of activation regions; makes it easier for the neuroradiologist to communicate information on interpreted scans to other clinicians and medical students; increases the neuroradiologist's confidence in terms of anatomy and spatial relationships; and serves as a user-friendly, public domain tool for neuroeducation. At present, more than 700 users from five continents have subscribed to the CNA.

  15. Chemical Geology: An Annotated Bibliography. CEGS Programs Publication Number 11.

    ERIC Educational Resources Information Center

    Billings, Gale K.

    The annotated bibliography is intended to aid geologists whose primary background is not in geochemistry. The references thus range from chemistry texts to papers on complex geochemical applications. The emphasis has been on those books and papers concerned with the application of chemical concepts to geology. Citations are arranged topically to…

  16. The Human Side of Knowledge Management: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Mayer, Pamela S.

    This annotated bibliography lists books and articles that have direct application to managing the human side of knowledge acquisition, transfer, and application. What is meant by the human dimension of knowledge is how motivation and learning affect the acquisition and transfer of knowledge and how group dynamics mediate the role of knowledge in…

  17. Managing biomedical image metadata for search and retrieval of similar images.

    PubMed

    Korenblum, Daniel; Rubin, Daniel; Napel, Sandy; Rodriguez, Cesar; Beaulieu, Chris

    2011-08-01

    Radiology images are generally disconnected from the metadata describing their contents, such as imaging observations ("semantic" metadata), which are usually described in text reports that are not directly linked to the images. We developed a system, the Biomedical Image Metadata Manager (BIMM), to (1) address the problem of managing biomedical image metadata and (2) facilitate the retrieval of similar images using semantic feature metadata. Our approach allows radiologists, researchers, and students to take advantage of the vast and growing repositories of medical image data by explicitly linking images to their associated metadata in a relational database that is globally accessible through a Web application. BIMM receives input in the form of standards-based metadata files via a Web service, and parses and stores the metadata in a relational database, enabling efficient data query and maintenance. Upon querying BIMM for images, 2D regions of interest (ROIs) stored as metadata are automatically rendered onto preview images included in search results. The system's "match observations" function retrieves images with similar ROIs based on specific semantic features describing imaging observation characteristics (IOCs). We demonstrate that the system, using IOCs alone, can accurately retrieve images with diagnoses matching the query images, and we evaluate its performance on a set of annotated liver lesion images. BIMM has several potential applications, e.g., computer-aided detection and diagnosis, content-based image retrieval, automating medical analysis protocols, and gathering population statistics like disease prevalence. The system provides a framework for decision support systems, potentially improving their diagnostic accuracy and selection of appropriate therapies.
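
    A "match observations" style lookup can be approximated by ranking stored images by their overlap in IOC terms with the query. This is a generic sketch, not BIMM's actual scoring algorithm; the IOC terms and image IDs below are invented:

```python
def match_observations(query_iocs, database):
    """Return image IDs sorted by descending IOC overlap with the query."""
    scored = []
    for image_id, iocs in database.items():
        overlap = len(set(query_iocs) & set(iocs))
        scored.append((overlap, image_id))
    # Rank by overlap (descending), breaking ties by image ID
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [image_id for overlap, image_id in scored if overlap > 0]

# Hypothetical annotated images with their semantic IOC terms
database = {
    "img-001": {"hypodense", "smooth margin", "homogeneous"},
    "img-002": {"hyperdense", "irregular margin"},
    "img-003": {"hypodense", "irregular margin", "heterogeneous"},
}

print(match_observations({"hypodense", "irregular margin"}, database))
# → ['img-003', 'img-001', 'img-002']
```

    A real system would weight terms by informativeness rather than counting raw overlap, but the ranking idea is the same.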

  18. Electronic Still Camera Project on STS-48

    NASA Technical Reports Server (NTRS)

    1991-01-01

    On behalf of NASA, the Office of Commercial Programs (OCP) has signed a Technical Exchange Agreement (TEA) with Autometric, Inc. (Autometric) of Alexandria, Virginia. The purpose of this agreement is to evaluate and analyze a high-resolution Electronic Still Camera (ESC) for potential commercial applications. During the mission, Autometric will provide unique photo analysis and hard-copy production. Once the mission is complete, Autometric will furnish NASA with an analysis of the ESC's capabilities. Electronic still photography is a developing technology providing the means by which a hand-held camera electronically captures and produces a digital image with resolution approaching film quality. The digital image, stored on removable hard disks or small optical disks, can be converted to a format suitable for downlink transmission, or it can be enhanced using image processing software. The on-orbit ability to enhance or annotate high-resolution images and then downlink these images in real-time will greatly improve Space Shuttle and Space Station capabilities in Earth observations and on-board photo documentation.

  19. BioSAVE: display of scored annotation within a sequence context.

    PubMed

    Pollock, Richard F; Adryan, Boris

    2008-03-20

    Visualization of sequence annotation is a common feature in many bioinformatics tools. For many applications it is desirable to restrict the display of such annotation according to a score cutoff, as biological interpretation can be difficult in the presence of the entire data. Unfortunately, many visualisation solutions are somewhat static in the way they handle such score cutoffs. We present BioSAVE, a sequence annotation viewer with on-the-fly selection of visualisation thresholds for each feature. BioSAVE is a versatile OS X program for visual display of scored features (annotation) within a sequence context. The program reads sequence and additional supplementary annotation data (e.g., position weight matrix matches, conservation scores, structural domains) from a variety of commonly used file formats and displays them graphically. Onscreen controls then allow for live customisation of these graphics, including on-the-fly selection of visualisation thresholds for each feature. Possible applications of the program include display of transcription factor binding sites in a genomic context or the visualisation of structural domain assignments in protein sequences and many more. The dynamic visualisation of these annotations is useful, e.g., for the determination of cutoff values of predicted features to match experimental data. Program, source code and exemplary files are freely available at the BioSAVE homepage.

  20. BioSAVE: Display of scored annotation within a sequence context

    PubMed Central

    Pollock, Richard F; Adryan, Boris

    2008-01-01

    Background Visualization of sequence annotation is a common feature in many bioinformatics tools. For many applications it is desirable to restrict the display of such annotation according to a score cutoff, as biological interpretation can be difficult in the presence of the entire data. Unfortunately, many visualisation solutions are somewhat static in the way they handle such score cutoffs. Results We present BioSAVE, a sequence annotation viewer with on-the-fly selection of visualisation thresholds for each feature. BioSAVE is a versatile OS X program for visual display of scored features (annotation) within a sequence context. The program reads sequence and additional supplementary annotation data (e.g., position weight matrix matches, conservation scores, structural domains) from a variety of commonly used file formats and displays them graphically. Onscreen controls then allow for live customisation of these graphics, including on-the-fly selection of visualisation thresholds for each feature. Conclusion Possible applications of the program include display of transcription factor binding sites in a genomic context or the visualisation of structural domain assignments in protein sequences and many more. The dynamic visualisation of these annotations is useful, e.g., for the determination of cutoff values of predicted features to match experimental data. Program, source code and exemplary files are freely available at the BioSAVE homepage. PMID:18366701

  1. Type-Separated Bytecode - Its Construction and Evaluation

    NASA Astrophysics Data System (ADS)

    Adler, Philipp; Amme, Wolfram

    Many constrained systems still use interpreters to run mobile applications written in Java, since interpreters require only minimal resources. On the other hand, it is difficult to apply optimizations during the runtime of the application. Annotations could be used to achieve simpler and faster code analysis, which would allow optimizations even for interpreters on constrained devices. Unfortunately, there is no viable way of transporting annotations to, and verifying them at, the code consumer. In this paper we present type-separated bytecode as an intermediate representation that allows annotations to be safely transported as type extensions. We have implemented several versions of this system and show that it is possible to obtain performance comparable to Java Bytecode, even though we use a type-separated system with annotations.

  2. Classification of yeast cells from image features to evaluate pathogen conditions

    NASA Astrophysics Data System (ADS)

    van der Putten, Peter; Bertens, Laura; Liu, Jinshuo; Hagen, Ferry; Boekhout, Teun; Verbeek, Fons J.

    2007-01-01

    Morphometrics from images, i.e. image analysis, may reveal differences between classes of objects present in the images. We have performed an image-features-based classification for the pathogenic yeast Cryptococcus neoformans. Building and analyzing image collections from the yeast under different environmental or genetic conditions may help to diagnose a new "unseen" situation. Diagnosis here means that the relevant information in the image collection is readily retrieved each time a new "sample" is presented. The basidiomycetous yeast Cryptococcus neoformans can cause infections such as meningitis or pneumonia. The presence of an extra-cellular capsule is known to be related to virulence. This paper reports on the approach towards developing classifiers for detecting potentially more or less virulent cells in a sample, i.e. an image, by using a range of features derived from the shape or density distribution. The classifier can subsequently be used to automate screening and to annotate existing image collections. In addition, we present our methods for creating samples, collecting images, image preprocessing, identifying "yeast cells", and extracting features from the images. We compare various expertise-based and fully automated methods of feature selection, benchmark a range of classification algorithms, and illustrate a successful application to this particular domain.

  3. Using simulated fluorescence cell micrographs for the evaluation of cell image segmentation algorithms.

    PubMed

    Wiesmann, Veit; Bergler, Matthias; Palmisano, Ralf; Prinzen, Martin; Franz, Daniela; Wittenberg, Thomas

    2017-03-18

    Manual assessment and evaluation of fluorescent micrograph cell experiments is time-consuming and tedious. Automated segmentation pipelines can ensure efficient and reproducible evaluation and analysis with constant high quality for all images of an experiment. Such cell segmentation approaches are usually validated and rated in comparison to manually annotated micrographs. Nevertheless, manual annotations are prone to errors and display inter- and intra-observer variability, which influences the validation results of automated cell segmentation pipelines. We present a new approach to simulate fluorescent cell micrographs that provides an objective ground truth for the validation of cell segmentation methods. The cell simulation was evaluated twofold: (1) an expert observer study shows that the proposed approach generates realistic fluorescent cell micrograph simulations; (2) an automated segmentation pipeline on the simulated fluorescent cell micrographs reproduces the segmentation performance of that pipeline on real fluorescent cell micrographs. The proposed simulation approach produces realistic fluorescent cell micrographs with corresponding ground truth. The simulated data is suited to evaluating image segmentation pipelines more efficiently and reproducibly than is possible with manually annotated real micrographs.

  4. Minimization of annotation work: diagnosis of mammographic masses via active learning

    NASA Astrophysics Data System (ADS)

    Zhao, Yu; Zhang, Jingyang; Xie, Hongzhi; Zhang, Shuyang; Gu, Lixu

    2018-06-01

    The prerequisite for establishing an effective prediction system for mammographic diagnosis is the annotation of each mammographic image. The manual annotation work is time-consuming and laborious, which is a great hindrance for researchers. In this article, we propose a novel active learning algorithm that adequately addresses this problem, minimizing labeling costs while guaranteeing performance. Our proposed method differs from existing active learning methods designed for the general problem in that it is specifically designed for mammographic images. Through its modified discriminant functions and improved sample query criteria, the proposed method can fully utilize the pairing of mammographic images and select the most valuable images from both the mediolateral and craniocaudal views. Moreover, in order to extend active learning to the ordinal regression problem, which has no precedent in existing studies but is essential for mammographic diagnosis (mammographic diagnosis is not only a classification task, but also an ordinal regression task for predicting an ordinal variable, viz. the malignancy risk of lesions), multiple sample query criteria need to be considered simultaneously. We formulate this as a criteria integration problem and present an algorithm based on self-adaptive weighted rank aggregation to achieve a good solution. The efficacy of the proposed method was demonstrated on thousands of mammographic images from the Digital Database for Screening Mammography. The labeling costs of obtaining optimal performance in the classification and ordinal regression tasks fell to 33.8 and 19.8 percent of their original costs, respectively. The proposed method also generated 1228 wins, 369 ties and 47 losses for the classification task, and 1933 wins, 258 ties and 185 losses for the ordinal regression task compared to other state-of-the-art active learning algorithms. By taking into account the particularities of mammographic images, the proposed AL method can indeed reduce the manual annotation work to a great extent without sacrificing the performance of the prediction system for mammographic diagnosis.
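
    The abstract does not detail the self-adaptive weighted rank aggregation scheme; a generic weighted Borda-count aggregation of per-criterion rankings gives the flavor. The criteria, sample IDs, and weights below are invented for illustration:

```python
def weighted_borda(rankings, weights):
    """Aggregate several rankings of the same sample IDs.

    rankings: list of lists, each ordering sample IDs from most to least valuable.
    weights:  one weight per ranking.
    Returns sample IDs sorted by descending aggregated Borda score."""
    scores = {}
    for ranking, w in zip(rankings, weights):
        n = len(ranking)
        for position, sample in enumerate(ranking):
            # Top position earns n points, bottom earns 1, scaled by the weight
            scores[sample] = scores.get(sample, 0.0) + w * (n - position)
    return sorted(scores, key=lambda s: (-scores[s], s))

uncertainty = ["s3", "s1", "s2", "s4"]   # most uncertain sample first
diversity   = ["s1", "s4", "s3", "s2"]   # most diverse sample first
print(weighted_borda([uncertainty, diversity], [0.7, 0.3]))
# → ['s3', 's1', 's2', 's4']
```

    In a self-adaptive scheme, the weights would themselves be tuned during learning rather than fixed as here.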

  5. Minimization of annotation work: diagnosis of mammographic masses via active learning.

    PubMed

    Zhao, Yu; Zhang, Jingyang; Xie, Hongzhi; Zhang, Shuyang; Gu, Lixu

    2018-05-22

    The prerequisite for establishing an effective prediction system for mammographic diagnosis is the annotation of each mammographic image. The manual annotation work is time-consuming and laborious, which is a great hindrance for researchers. In this article, we propose a novel active learning algorithm that adequately addresses this problem, minimizing labeling costs while guaranteeing performance. Our proposed method differs from existing active learning methods designed for the general problem in that it is specifically designed for mammographic images. Through its modified discriminant functions and improved sample query criteria, the proposed method can fully utilize the pairing of mammographic images and select the most valuable images from both the mediolateral and craniocaudal views. Moreover, in order to extend active learning to the ordinal regression problem, which has no precedent in existing studies but is essential for mammographic diagnosis (mammographic diagnosis is not only a classification task, but also an ordinal regression task for predicting an ordinal variable, viz. the malignancy risk of lesions), multiple sample query criteria need to be considered simultaneously. We formulate this as a criteria integration problem and present an algorithm based on self-adaptive weighted rank aggregation to achieve a good solution. The efficacy of the proposed method was demonstrated on thousands of mammographic images from the Digital Database for Screening Mammography. The labeling costs of obtaining optimal performance in the classification and ordinal regression tasks fell to 33.8 and 19.8 percent of their original costs, respectively. The proposed method also generated 1228 wins, 369 ties and 47 losses for the classification task, and 1933 wins, 258 ties and 185 losses for the ordinal regression task compared to other state-of-the-art active learning algorithms. By taking into account the particularities of mammographic images, the proposed AL method can indeed reduce the manual annotation work to a great extent without sacrificing the performance of the prediction system for mammographic diagnosis.

  6. Basic level scene understanding: categories, attributes and structures

    PubMed Central

    Xiao, Jianxiong; Hays, James; Russell, Bryan C.; Patterson, Genevieve; Ehinger, Krista A.; Torralba, Antonio; Oliva, Aude

    2013-01-01

    A longstanding goal of computer vision is to build a system that can automatically understand a 3D scene from a single image. This requires extracting semantic concepts and 3D information from 2D images which can depict an enormous variety of environments that comprise our visual world. This paper summarizes our recent efforts toward these goals. First, we describe the richly annotated SUN database which is a collection of annotated images spanning 908 different scene categories with object, attribute, and geometric labels for many scenes. This database allows us to systematically study the space of scenes and to establish a benchmark for scene and object recognition. We augment the categorical SUN database with 102 scene attributes for every image and explore attribute recognition. Finally, we present an integrated system to extract the 3D structure of the scene and objects depicted in an image. PMID:24009590

  7. Comparing algorithms for automated vessel segmentation in computed tomography scans of the lung: the VESSEL12 study

    PubMed Central

    Rudyanto, Rina D.; Kerkstra, Sjoerd; van Rikxoort, Eva M.; Fetita, Catalin; Brillet, Pierre-Yves; Lefevre, Christophe; Xue, Wenzhe; Zhu, Xiangjun; Liang, Jianming; Öksüz, İlkay; Ünay, Devrim; Kadipaşaoğlu, Kamuran; Estépar, Raúl San José; Ross, James C.; Washko, George R.; Prieto, Juan-Carlos; Hoyos, Marcela Hernández; Orkisz, Maciej; Meine, Hans; Hüllebrand, Markus; Stöcker, Christina; Mir, Fernando Lopez; Naranjo, Valery; Villanueva, Eliseo; Staring, Marius; Xiao, Changyan; Stoel, Berend C.; Fabijanska, Anna; Smistad, Erik; Elster, Anne C.; Lindseth, Frank; Foruzan, Amir Hossein; Kiros, Ryan; Popuri, Karteek; Cobzas, Dana; Jimenez-Carretero, Daniel; Santos, Andres; Ledesma-Carbayo, Maria J.; Helmberger, Michael; Urschler, Martin; Pienn, Michael; Bosboom, Dennis G.H.; Campo, Arantza; Prokop, Mathias; de Jong, Pim A.; Ortiz-de-Solorzano, Carlos; Muñoz-Barrutia, Arrate; van Ginneken, Bram

    2016-01-01

    The VESSEL12 (VESsel SEgmentation in the Lung) challenge objectively compares the performance of different algorithms to identify vessels in thoracic computed tomography (CT) scans. Vessel segmentation is fundamental in computer aided processing of data generated by 3D imaging modalities. As manual vessel segmentation is prohibitively time consuming, any real world application requires some form of automation. Several approaches exist for automated vessel segmentation, but judging their relative merits is difficult due to a lack of standardized evaluation. We present an annotated reference dataset containing 20 CT scans and propose nine categories to perform a comprehensive evaluation of vessel segmentation algorithms from both academia and industry. Twenty algorithms participated in the VESSEL12 challenge, held at International Symposium on Biomedical Imaging (ISBI) 2012. All results have been published at the VESSEL12 website http://vessel12.grand-challenge.org. The challenge remains ongoing and open to new participants. Our three contributions are: (1) an annotated reference dataset available online for evaluation of new algorithms; (2) a quantitative scoring system for objective comparison of algorithms; and (3) performance analysis of the strengths and weaknesses of the various vessel segmentation methods in the presence of various lung diseases. PMID:25113321

  8. The Cancer Imaging Archive (TCIA) | Informatics Technology for Cancer Research (ITCR)

    Cancer.gov

    TCIA is NCI’s repository for publicly shared cancer imaging data. TCIA collections include radiology and pathology images, clinical and clinical trial data, image-derived annotations and quantitative features, and a growing collection of related ‘omics data from both clinical and pre-clinical studies.

  9. KID Project: an internet-based digital video atlas of capsule endoscopy for research purposes

    PubMed Central

    Koulaouzidis, Anastasios; Iakovidis, Dimitris K.; Yung, Diana E.; Rondonotti, Emanuele; Kopylov, Uri; Plevris, John N.; Toth, Ervin; Eliakim, Abraham; Wurm Johansson, Gabrielle; Marlicz, Wojciech; Mavrogenis, Georgios; Nemeth, Artur; Thorlacius, Henrik; Tontini, Gian Eugenio

    2017-01-01

    Background and aims: Capsule endoscopy (CE) has revolutionized small-bowel (SB) investigation. Computational methods can enhance diagnostic yield (DY); however, incorporating machine learning algorithms (MLAs) into CE reading is difficult as large amounts of image annotations are required for training. Current databases lack graphic annotations of pathologies and cannot be used. A novel database, KID, aims to provide a reference for research and development of medical decision support systems (MDSS) for CE. Methods: Open-source software was used for the KID database. Clinicians contribute anonymized, annotated CE images and videos. Graphic annotations are supported by an open-access annotation tool (Ratsnake). We detail an experiment based on the KID database, examining differences in SB lesion measurement between human readers and an MLA. The Jaccard Index (JI) was used to evaluate similarity between annotations by the MLA and human readers. Results: The MLA performed best in measuring lymphangiectasias, with a JI of 81 ± 6 %. The other lesion types were: angioectasias (JI 64 ± 11 %), aphthae (JI 64 ± 8 %), chylous cysts (JI 70 ± 14 %), polypoid lesions (JI 75 ± 21 %), and ulcers (JI 56 ± 9 %). Conclusion: MLAs can perform as well as human readers in the measurement of SB angioectasias in white light (WL). Automated lesion measurement is therefore feasible. KID is currently the only open-source CE database developed specifically to aid development of MDSS. Our experiment demonstrates this potential. PMID:28580415
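
    The Jaccard Index used above to compare MLA and reader annotations is straightforward to compute. A minimal sketch, treating each lesion annotation as a set of pixel coordinates (the regions below are synthetic, not KID data):

```python
def jaccard_index(region_a, region_b):
    """Jaccard index between two annotations given as sets of pixel coordinates."""
    a, b = set(region_a), set(region_b)
    if not a and not b:
        return 1.0  # two empty annotations agree perfectly by convention
    return len(a & b) / len(a | b)

# Synthetic example: the MLA's region lies entirely inside the reader's region
reader = {(x, y) for x in range(0, 10) for y in range(0, 10)}   # 100 pixels
mla    = {(x, y) for x in range(2, 10) for y in range(0, 10)}   #  80 pixels

print(round(100 * jaccard_index(reader, mla)))  # → 80 (percent)
```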

  10. Digital Imaging and Communications in Medicine Whole Slide Imaging Connectathon at Digital Pathology Association Pathology Visions 2017.

    PubMed

    Clunie, David; Hosseinzadeh, Dan; Wintell, Mikael; De Mena, David; Lajara, Nieves; Garcia-Rojo, Marcial; Bueno, Gloria; Saligrama, Kiran; Stearrett, Aaron; Toomey, David; Abels, Esther; Apeldoorn, Frank Van; Langevin, Stephane; Nichols, Sean; Schmid, Joachim; Horchner, Uwe; Beckwith, Bruce; Parwani, Anil; Pantanowitz, Liron

    2018-01-01

    As digital pathology systems for clinical diagnostic applications become mainstream, interoperability between systems from different vendors becomes critical. For the first time, multiple digital pathology vendors have publicly demonstrated the use of the Digital Imaging and Communications in Medicine (DICOM) standard file format and network protocol to communicate between separate whole slide acquisition, storage, and viewing components. Note that the use of DICOM for clinical diagnostic applications has yet to be validated in the United States. The successful demonstration shows that the DICOM standard is fundamentally sound, though many lessons were learned. These lessons will be incorporated into the standard as incremental improvements, more detailed profiles that constrain variation for specific use cases, and educational material for implementers. Future Connectathon events will expand the scope to include more devices and vendors, more ambitious use cases including laboratory information system integration and annotation for image analysis, and more geographic diversity. Users should request DICOM features in all purchases and contracts. The growth of DICOM-compliant manufacturers is anticipated to ease the recognition of DICOM for pathology as a standard and, in turn, the regulatory pathway for digital pathology products.

  11. Transcription and Annotation of a Japanese Accented Spoken Corpus of L2 Spanish for the Development of CAPT Applications

    ERIC Educational Resources Information Center

    Carranza, Mario

    2016-01-01

    This paper addresses the process of transcribing and annotating spontaneous non-native speech with the aim of compiling a training corpus for the development of Computer Assisted Pronunciation Training (CAPT) applications, enhanced with Automatic Speech Recognition (ASR) technology. To better adapt ASR technology to CAPT tools, the recognition…

  12. A guide to best practices for Gene Ontology (GO) manual annotation

    PubMed Central

    Balakrishnan, Rama; Harris, Midori A.; Huntley, Rachael; Van Auken, Kimberly; Cherry, J. Michael

    2013-01-01

    The Gene Ontology Consortium (GOC) is a community-based bioinformatics project that classifies gene product function through the use of structured controlled vocabularies. A fundamental application of the Gene Ontology (GO) is in the creation of gene product annotations, evidence-based associations between GO definitions and experimental or sequence-based analysis. Currently, the GOC disseminates 126 million annotations covering >374 000 species including all the kingdoms of life. This number includes two classes of GO annotations: those created manually by experienced biocurators reviewing the literature or by examination of biological data (1.1 million annotations covering 2226 species) and those generated computationally via automated methods. As manual annotations are often used to propagate functional predictions between related proteins within and between genomes, it is critical to provide accurate, consistent manual annotations. Toward this goal, we present here the conventions defined by the GOC for the creation of manual annotation. This guide represents the best practices for manual annotation as established by the GOC project over the past 12 years. We hope this guide will encourage research communities to annotate gene products of their interest to enhance the corpus of GO annotations available to all. Database URL: http://www.geneontology.org PMID:23842463

  13. Qcorp: an annotated classification corpus of Chinese health questions.

    PubMed

    Guo, Haihong; Na, Xu; Li, Jiao

    2018-03-22

    Health question-answering (QA) systems have become a typical application scenario of Artificial Intelligence (AI). An annotated question corpus is a prerequisite for training machines to understand the health information needs of users. We therefore aimed to develop an annotated classification corpus of Chinese health questions (Qcorp) and make it openly accessible. We developed a two-layered classification schema and corresponding annotation rules on the basis of our previous work. Using the schema, we annotated 5000 questions that were randomly selected from 5 Chinese health websites within 6 broad sections. Eight annotators participated in the annotation task, and inter-annotator agreement was evaluated to ensure corpus quality. Furthermore, the distribution and relationships of the annotated tags were measured by descriptive statistics and a social network map. The questions were annotated using 7101 tags covering 29 topic categories in the two-layered schema. In our released corpus, the distribution of questions over the top-layer categories was: treatment (64.22%), diagnosis (37.14%), epidemiology (14.96%), healthy lifestyle (10.38%), and health provider choice (4.54%). Both the annotated health questions and the annotation schema are openly accessible on the Qcorp website. Users can download the annotated Chinese questions in CSV, XML, and HTML formats. We developed a Chinese health question corpus comprising 5000 manually annotated questions. It is openly accessible and should contribute to intelligent health QA system development.
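
    The abstract does not state which agreement statistic was used; for a pair of annotators labelling the same questions, Cohen's kappa is a common choice. A minimal sketch with invented category labels:

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    # Observed agreement: fraction of items labelled identically
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement from each annotator's label frequencies
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

a = ["treatment", "diagnosis", "treatment", "epidemiology", "treatment", "diagnosis"]
b = ["treatment", "diagnosis", "diagnosis", "epidemiology", "treatment", "diagnosis"]
print(round(cohens_kappa(a, b), 3))  # → 0.739
```

    With eight annotators, a multi-rater generalization such as Fleiss' kappa would be the natural extension of the same idea.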

  14. A hierarchical knowledge-based approach for retrieving similar medical images described with semantic annotations

    PubMed Central

    Kurtz, Camille; Beaulieu, Christopher F.; Napel, Sandy; Rubin, Daniel L.

    2014-01-01

    Computer-assisted image retrieval applications could assist radiologist interpretations by identifying similar images in large archives as a means of providing decision support. However, the semantic gap between low-level image features and their high-level semantics may impair system performance. Indeed, it can be challenging to comprehensively characterize images using low-level imaging features that fully capture the visual appearance of diseases, and the use of semantic terms has recently been advocated to provide semantic descriptions of the visual contents of images. However, most existing image retrieval strategies do not consider the intrinsic properties of these terms during image comparison beyond treating them as simple binary (presence/absence) features. We propose a new framework that includes semantic features in images and enables retrieval of similar images in large databases based on their semantic relations. It is based on two main steps: (1) annotation of the images with semantic terms extracted from an ontology, and (2) evaluation of the similarity of image pairs by computing the similarity between the terms using the Hierarchical Semantic-Based Distance (HSBD) coupled to an ontological measure. The combination of these two steps captures the semantic correlations among the terms used to characterize the images, offering a potential solution to the semantic gap problem. We validate this approach in the context of the retrieval and classification of 2D regions of interest (ROIs) extracted from computed tomographic (CT) images of the liver. Under this framework, retrieval accuracy of more than 0.96 was obtained on a 30-image dataset using the Normalized Discounted Cumulative Gain (NDCG) index, a standard technique for measuring the effectiveness of information retrieval algorithms when a separate reference standard is available. Classification results of more than 95% were obtained on a 77-image dataset. For comparison purposes, the use of the Earth Mover's Distance (EMD), an alternative distance metric that considers all the existing relations among the terms, led to a retrieval accuracy of 0.95 and classification results of 93%, at a higher computational cost. The results provided by the presented framework are competitive with the state of the art and emphasize the usefulness of the proposed methodology for radiology image retrieval and classification. PMID:24632078
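
    The NDCG index used for the retrieval evaluation can be sketched in a few lines; the graded relevance scores in the example are invented, not taken from the liver ROI dataset:

```python
import math

def ndcg(relevances):
    """NDCG for a ranked result list, given each result's graded relevance
    to the query according to the reference standard."""
    def dcg(rels):
        # Discounted cumulative gain: relevance discounted by log2 of rank
        return sum(r / math.log2(i + 2) for i, r in enumerate(rels))
    ideal = dcg(sorted(relevances, reverse=True))  # best possible ordering
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# Hypothetical retrieval run: graded relevance of each returned ROI, in rank order
print(round(ndcg([3, 2, 3, 0, 1]), 3))  # → 0.972
```

    An NDCG of 1.0 means the system ranked the results exactly as the reference standard would.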

  15. An integrated one-step system to extract, analyze and annotate all relevant information from image-based cell screening of chemical libraries.

    PubMed

    Rabal, Obdulia; Link, Wolfgang; Serelde, Beatriz G; Bischoff, James R; Oyarzabal, Julen

    2010-04-01

    Here we report the development and validation of a complete solution to manage and analyze the data produced by image-based phenotypic screening campaigns of small-molecule libraries. In one step, initial crude images are analyzed for multiple cytological features, statistical analysis is performed, and molecules that produce the desired phenotypic profile are identified. A naïve Bayes classifier, integrating chemical and phenotypic spaces, is built and used during the process to assess those images initially classified as "fuzzy", providing automated iterative feedback tuning. Simultaneously, all this information is annotated directly in a relational database containing the chemical data. This novel, fully automated method was validated by re-analyzing results from a high-content screening campaign involving 33 992 molecules used to identify inhibitors of the PI3K/Akt signaling pathway. Ninety-two percent of the confirmed hits identified by the conventional multistep analysis method were also identified using this integrated one-step system, along with 40 new hits (14.9% of the total) that were originally false negatives. Ninety-six percent of true negatives were also properly recognized. Web-based access to the database, with customizable data retrieval and visualization tools, facilitates subsequent analysis of the annotated cytological features, allowing identification of additional phenotypic profiles; thus, further analysis of the original crude images is not required.
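
    A naïve Bayes classifier over binary chemical and phenotypic descriptors can be sketched compactly; the feature names and training data below are invented for illustration and are not taken from the screening campaign:

```python
import math

def train(samples, labels):
    """Fit a Bernoulli naive Bayes model on binary feature vectors."""
    model = {}
    for c in set(labels):
        rows = [s for s, l in zip(samples, labels) if l == c]
        prior = len(rows) / len(samples)
        # Laplace-smoothed per-feature probability of observing a 1
        probs = [(sum(r[j] for r in rows) + 1) / (len(rows) + 2)
                 for j in range(len(samples[0]))]
        model[c] = (math.log(prior), probs)
    return model

def predict(model, x):
    """Return the class with the highest posterior log-likelihood."""
    def loglik(entry):
        logprior, probs = entry
        return logprior + sum(math.log(p if v else 1 - p)
                              for v, p in zip(x, probs))
    return max(model, key=lambda c: loglik(model[c]))

# Features: [ring_system, lipophilic, nuclear_signal, cytoplasmic_signal]
# (two chemical descriptors followed by two phenotypic ones; all hypothetical)
X = [[1, 1, 1, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 0, 1]]
y = ["hit", "hit", "inactive", "inactive"]
model = train(X, y)
print(predict(model, [1, 1, 1, 0]))  # → hit
```

    Concatenating chemical and phenotypic descriptors into one feature vector, as here, is one simple way to "integrate" the two spaces under the naive independence assumption.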

  16. Validating automatic semantic annotation of anatomy in DICOM CT images

    NASA Astrophysics Data System (ADS)

    Pathak, Sayan D.; Criminisi, Antonio; Shotton, Jamie; White, Steve; Robertson, Duncan; Sparks, Bobbi; Munasinghe, Indeera; Siddiqui, Khan

    2011-03-01

    In the current health-care environment, the time available for physicians to browse patients' scans is shrinking due to the rapid increase in the sheer number of images. This is further aggravated by mounting pressure to become more productive in the face of decreasing reimbursement. Hence, there is an urgent need to deliver technology which enables faster and effortless navigation through sub-volume image visualizations. Annotating image regions with semantic labels such as those derived from the RADLEX ontology can vastly enhance image navigation and sub-volume visualization. This paper uses random regression forests for efficient, automatic detection and localization of anatomical structures within DICOM 3D CT scans. A regression forest is a collection of decision trees which are trained to achieve direct mapping from voxels to organ location and size in a single pass. This paper focuses on comparing automated labeling with expert-annotated ground-truth results on a database of 50 highly variable CT scans. Initial investigations show that regression-forest-derived localization errors are smaller and more robust than those achieved by state-of-the-art global registration approaches. The simplicity of the algorithm's context-rich visual features yields typical runtimes of less than 10 seconds for a 512³-voxel DICOM CT series on a single-threaded, single-core machine running multiple trees, with each tree taking less than a second. Furthermore, qualitative evaluation demonstrates that using the detected organs' locations as an index into the image volume improves the efficiency of the navigational workflow in all the CT studies.
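
The voxel-to-offset regression idea can be caricatured with a bagged ensemble of one-split regression trees (real regression forests are much deeper and use context-rich visual features; everything below, including the single scalar feature and the 2-D offsets, is synthetic):

```python
import numpy as np

rng = np.random.default_rng(1)

def fit_stump(X, y, n_thresholds=8):
    """One-split regression tree: pick the (feature, threshold) pair
    minimizing the summed squared error of the two leaf means."""
    best = None
    for j in range(X.shape[1]):
        for t in np.quantile(X[:, j], np.linspace(0.1, 0.9, n_thresholds)):
            left = X[:, j] <= t
            if left.all() or not left.any():
                continue
            pl, pr = y[left].mean(axis=0), y[~left].mean(axis=0)
            err = ((y[left] - pl) ** 2).sum() + ((y[~left] - pr) ** 2).sum()
            if best is None or err < best[0]:
                best = (err, j, t, pl, pr)
    return best[1:]

def forest_fit(X, y, n_trees=20):
    trees = []
    for _ in range(n_trees):
        idx = rng.integers(0, len(X), len(X))  # bootstrap resample
        trees.append(fit_stump(X[idx], y[idx]))
    return trees

def forest_predict(trees, X):
    # Average the per-tree offset predictions.
    votes = [np.where(X[:, [j]] <= t, pl, pr) for j, t, pl, pr in trees]
    return np.mean(votes, axis=0)

# Synthetic training "voxels": one context feature; the target is the
# 2-D offset from the voxel to the organ centre.
X = rng.normal(0, 1, (200, 1))
y = np.where(X < 0, [-10.0, 0.0], [10.0, 0.0]) + rng.normal(0, 0.5, (200, 2))

trees = forest_fit(X, y)
pred = forest_predict(trees, np.array([[-2.0], [2.0]]))
print(pred.round(1))  # close to [[-10, 0], [10, 0]]
```

A real forest aggregates such per-voxel votes over the whole volume into organ bounding-box estimates, which is what makes a single pass sufficient.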

  17. Teaching and Learning Communities through Online Annotation

    NASA Astrophysics Data System (ADS)

    van der Pluijm, B.

    2016-12-01

    What do colleagues do with your assigned textbook? What do they say or think about the material? Want students to be more engaged in their learning experience? If so, online materials that complement the standard lecture format provide new opportunities through managed, online group annotation that leverages the ubiquity of internet access while personalizing learning. The concept is illustrated with the new online textbook "Processes in Structural Geology and Tectonics", by Ben van der Pluijm and Stephen Marshak, which offers a platform for sharing of experiences, supplementary materials and approaches, including readings, mathematical applications, exercises, challenge questions, quizzes, alternative explanations, and more. The annotation framework used is Hypothes.is, which offers a free, open-platform markup environment for annotation of websites and PDF postings. The annotations can be public, grouped or individualized, as desired, including export access and download of annotations. A teacher group, hosted by a moderator/owner, limits access to members of a user group of teachers, so that its members can use, copy or transcribe annotations for their own lesson material. Likewise, an instructor can host a student group that encourages sharing of observations, questions and answers among students and instructor. The instructor can also create one or more closed groups that offer study help and hints to students. All of these options aim to engage students and to promote greater responsibility for their learning experience. Beyond new capacity, the ability to analyze student annotation supports individual learners and their needs. For example, student notes can be analyzed for key phrases and concepts to identify misunderstandings, omissions and problems. Example annotations can also be shared to enhance notetaking skills and to help with studying. Lastly, online annotation allows active application to posted lecture slides, supporting real-time notetaking during lecture presentation. Sharing of experiences and practices of annotation could benefit teachers and learners alike, and does not require complicated software, coding skills or special hardware environments.

  18. Representing annotation compositionality and provenance for the Semantic Web

    PubMed Central

    2013-01-01

    Background Though the annotation of digital artifacts with metadata has a long history, the bulk of that work focuses on the association of single terms or concepts to single targets. As annotation efforts expand to capture more complex information, annotations will need to be able to refer to knowledge structures formally defined in terms of more atomic knowledge structures. Existing provenance efforts in the Semantic Web domain primarily focus on tracking provenance at the level of whole triples and do not provide enough detail to track how individual triple elements of annotations were derived from triple elements of other annotations. Results We present a task- and domain-independent ontological model for capturing annotations and their linkage to their denoted knowledge representations, which can be singular concepts or more complex sets of assertions. We have implemented this model as an extension of the Information Artifact Ontology in OWL and made it freely available, and we show how it can be integrated with several prominent annotation and provenance models. We present several application areas for the model, ranging from linguistic annotation of text to the annotation of disease-associations in genome sequences. Conclusions With this model, progressively more complex annotations can be composed from other annotations, and the provenance of compositional annotations can be represented at the annotation level or at the level of individual elements of the RDF triples composing the annotations. This in turn allows for progressively richer annotations to be constructed from previous annotation efforts, the precise provenance recording of which facilitates evidence-based inference and error tracking. PMID:24268021
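
The element-level provenance described in the Results can be sketched with plain data structures: triples are (subject, predicate, object) tuples, and a provenance map records which element of which earlier annotation each element was derived from. The identifiers and the helper below are invented for illustration; the actual model is an OWL extension of the Information Artifact Ontology:

```python
# An earlier annotation, and a composed one that reuses its object as a
# subject; element_provenance keys are (annotation id, triple index,
# position) and values point at the source element.
ann1 = {"id": "ann1",
        "triples": [("gene:BRCA1", "associatedWith", "disease:BreastCancer")]}

ann2 = {"id": "ann2",
        "triples": [("disease:BreastCancer", "hasEvidence", "pmid:12345")],
        "element_provenance": {
            ("ann2", 0, "subject"): ("ann1", 0, "object"),
        }}

registry = {"ann1": ann1, "ann2": ann2}
POS = {"subject": 0, "predicate": 1, "object": 2}

def element_source(ann, triple_idx, position):
    """Return the value of the source element a triple element was
    derived from, or None if it has no recorded provenance."""
    src = ann.get("element_provenance", {}).get((ann["id"], triple_idx, position))
    if src is None:
        return None
    src_id, src_idx, src_pos = src
    return registry[src_id]["triples"][src_idx][POS[src_pos]]

print(element_source(ann2, 0, "subject"))    # disease:BreastCancer
print(element_source(ann2, 0, "predicate"))  # None
```

Tracking provenance at this granularity, rather than per whole triple, is what enables the evidence-based inference and error tracking the authors describe.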

  19. Innovative uses of GigaPan Technology for Onsite and Distance Education

    NASA Astrophysics Data System (ADS)

    Bentley, C.; Schott, R. C.; Piatek, J. L.; Richards, B.

    2013-12-01

    GigaPans are gigapixel panoramic images that can be viewed at a wide range of magnifications, allowing users to explore them in various degrees of detail from the smallest scale to the full image extent. In addition to panoramic images captured with the GigaPan camera mount ('Dry Falls' - http://www.gigapan.com/gigapans/89093), users can also upload annotated images (For example, 'Massanutten sandstone slab with trace fossils (annotated)', http://www.gigapan.com/gigapans/124295) and satellite images (For example, 'Geology vs. Topography - State of Connecticut', http://www.gigapan.com/gigapans/111265). Panoramas with similar topics have been gathered together on the site in galleries, both user-generated and site-curated (For example, http://www.gigapan.com/galleries?categories=geology&page=1). Further innovations in display technology have also led to the development of improved viewers (for example, the annotations in the image linked above can be explored via paired viewers at http://coursecontent.nic.edu/bdrichards/gigapixelimages/callanview) GigaPan panoramas can be created through use of the GigaPan robotic camera mount and a digital camera (different models of the camera mount are available and work with a wide range of cameras). The camera mount can be used to create high-resolution pans ranging in scale from hand sample to outcrop up to landscape via the stitching software included with the robotic mount. The software can also be used to generate GigaPan images from other sources, such as thin section or satellite images, so these images can also be viewed with the online viewer. GigaPan images are typically viewed via a web-based interface that allows the user to interact with the image from the limits of the image detail up to the full panorama. After uploading, information can be added to panoramas with both text captions and geo-referencing (geo-located panoramas can then be viewed in Google Earth). 
Users can record specific locations and zoom levels in these images via "snapshots": these snapshots can direct others to the same location in the image as well as generate conversations with attached text comments. Users can also group related GigaPans by creating "galleries" of thematically related images (similar to photo albums). Gigapixel images can also be formatted for processing and viewing in an increasing number of platforms/modes as software vendors and internet browsers begin to provide 'add-in' support. This opens up opportunities for innovative adaptations for geoscience education. (For example, http://coursecontent.nic.edu/bdrichards/gigapixelimages/dryfalls) Specific applications of these images for geoscience education include classroom activities and independent exercises that encourage students to take an active inquiry-based approach to understanding geoscience concepts at multiple skill levels. GigaPans in field research serve as both records of field locations and additional datasets for detailed analyses, such as observing color changes or variations in grain size. Related GigaPans can also be presented together when embedded in webpages, useful for generating exercises for education purposes or for analyses of outcrops from the macro (landscape, outcrop) down to the micro scale (hand sample, thin section).

  20. Identification of widespread adenosine nucleotide binding in Mycobacterium tuberculosis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ansong, Charles; Ortega, Corrie; Payne, Samuel H.

    The annotation of protein function is almost completely performed by in silico approaches. However, computational prediction of protein function is frequently incomplete and error prone. In Mycobacterium tuberculosis (Mtb), ~25% of all genes have no predicted function and are annotated as hypothetical proteins. This lack of functional information severely limits our understanding of Mtb pathogenicity. Current tools for experimental functional annotation are limited and often do not scale to entire protein families. Here, we report a generally applicable chemical biology platform to functionally annotate bacterial proteins by combining activity-based protein profiling (ABPP) and quantitative LC-MS-based proteomics. As an example of this approach for high-throughput protein functional validation and discovery, we experimentally annotate the families of ATP-binding proteins in Mtb. Our data experimentally validate prior in silico predictions of >250 ATPases and adenosine nucleotide-binding proteins, and reveal 73 hypothetical proteins as novel ATP-binding proteins. We identify adenosine cofactor interactions with many hypothetical proteins containing a diversity of unrelated sequences, providing a new and expanded view of adenosine nucleotide binding in Mtb. Furthermore, many of these hypothetical proteins are both unique to Mycobacteria and essential for infection, suggesting specialized functions in mycobacterial physiology and pathogenicity. Thus, we provide a generally applicable approach for high throughput protein function discovery and validation, and highlight several ways in which application of activity-based proteomics data can improve the quality of functional annotations to facilitate novel biological insights.

  1. Joint Probability Models of Radiology Images and Clinical Annotations

    ERIC Educational Resources Information Center

    Arnold, Corey Wells

    2009-01-01

    Radiology data, in the form of images and reports, is growing at a high rate due to the introduction of new imaging modalities, new uses of existing modalities, and the growing importance of objective image information in the diagnosis and treatment of patients. This increase has resulted in an enormous set of image data that is richly annotated…

  2. Optimized Graph Learning Using Partial Tags and Multiple Features for Image and Video Annotation.

    PubMed

    Song, Jingkuan; Gao, Lianli; Nie, Feiping; Shen, Heng Tao; Yan, Yan; Sebe, Nicu

    2016-11-01

    In multimedia annotation, due to the time constraints and the tediousness of manual tagging, it is quite common to utilize both tagged and untagged data to improve the performance of supervised learning when only limited tagged training data are available. This is often done by adding a geometry-based regularization term in the objective function of a supervised learning model. In this case, a similarity graph is indispensable to exploit the geometrical relationships among the training data points, and the graph construction scheme essentially determines the performance of these graph-based learning algorithms. However, most of the existing works construct the graph empirically and are usually based on a single feature without using the label information. In this paper, we propose a semi-supervised annotation approach by learning an optimized graph (OGL) from multi-cues (i.e., partial tags and multiple features), which can more accurately embed the relationships among the data points. Since OGL is a transductive method and cannot deal with novel data points, we further extend our model to address the out-of-sample issue. Extensive experiments on image and video annotation show the consistent superiority of OGL over the state-of-the-art methods.
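
The role of the similarity graph can be illustrated with plain label propagation over a fixed graph (OGL goes further and learns the graph itself from partial tags and multiple features; the toy graph and tags below are invented):

```python
import numpy as np

# Symmetric similarity graph over 5 items: {0, 1} and {3, 4} are tight
# clusters, weakly bridged through item 2.
W = np.array([[0., 1., .1, 0., 0.],
              [1., 0., .1, 0., 0.],
              [.1, .1, 0., .1, .1],
              [0., 0., .1, 0., 1.],
              [0., 0., .1, 1., 0.]])

# One-hot tags for the two labeled items; the others start untagged.
Y = np.zeros((5, 2))
Y[0, 0] = 1.0            # item 0 tagged with class 0
Y[4, 1] = 1.0            # item 4 tagged with class 1
labeled = np.array([True, False, False, False, True])

# Propagate: each item's class scores become the weighted average of
# its neighbors'; labeled items are clamped back to their tags.
F = Y.copy()
D = W.sum(axis=1, keepdims=True)
for _ in range(100):
    F = (W @ F) / D
    F[labeled] = Y[labeled]

print(F.argmax(axis=1)[[1, 3]])  # [0 1]: untagged items follow their cluster
```

How W is built is exactly what the abstract flags as decisive: empirical single-feature graphs ignore label information, which is the gap OGL addresses.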

  3. Jenkins-CI, an Open-Source Continuous Integration System, as a Scientific Data and Image-Processing Platform.

    PubMed

    Moutsatsos, Ioannis K; Hossain, Imtiaz; Agarinis, Claudia; Harbinski, Fred; Abraham, Yann; Dobler, Luc; Zhang, Xian; Wilson, Christopher J; Jenkins, Jeremy L; Holway, Nicholas; Tallarico, John; Parker, Christian N

    2017-03-01

    High-throughput screening generates large volumes of heterogeneous data that require a diverse set of computational tools for management, processing, and analysis. Building integrated, scalable, and robust computational workflows for such applications is challenging but highly valuable. Scientific data integration and pipelining facilitate standardized data processing, collaboration, and reuse of best practices. We describe how Jenkins-CI, an "off-the-shelf," open-source, continuous integration system, is used to build pipelines for processing images and associated data from high-content screening (HCS). Jenkins-CI provides numerous plugins for standard compute tasks, and its design allows the quick integration of external scientific applications. Using Jenkins-CI, we integrated CellProfiler, an open-source image-processing platform, with various HCS utilities and a high-performance Linux cluster. The platform is web-accessible, facilitates access and sharing of high-performance compute resources, and automates previously cumbersome data and image-processing tasks. Imaging pipelines developed using the desktop CellProfiler client can be managed and shared through a centralized Jenkins-CI repository. Pipelines and managed data are annotated to facilitate collaboration and reuse. Limitations with Jenkins-CI (primarily around the user interface) were addressed through the selection of helper plugins from the Jenkins-CI community.

  4. Jenkins-CI, an Open-Source Continuous Integration System, as a Scientific Data and Image-Processing Platform

    PubMed Central

    Moutsatsos, Ioannis K.; Hossain, Imtiaz; Agarinis, Claudia; Harbinski, Fred; Abraham, Yann; Dobler, Luc; Zhang, Xian; Wilson, Christopher J.; Jenkins, Jeremy L.; Holway, Nicholas; Tallarico, John; Parker, Christian N.

    2016-01-01

    High-throughput screening generates large volumes of heterogeneous data that require a diverse set of computational tools for management, processing, and analysis. Building integrated, scalable, and robust computational workflows for such applications is challenging but highly valuable. Scientific data integration and pipelining facilitate standardized data processing, collaboration, and reuse of best practices. We describe how Jenkins-CI, an “off-the-shelf,” open-source, continuous integration system, is used to build pipelines for processing images and associated data from high-content screening (HCS). Jenkins-CI provides numerous plugins for standard compute tasks, and its design allows the quick integration of external scientific applications. Using Jenkins-CI, we integrated CellProfiler, an open-source image-processing platform, with various HCS utilities and a high-performance Linux cluster. The platform is web-accessible, facilitates access and sharing of high-performance compute resources, and automates previously cumbersome data and image-processing tasks. Imaging pipelines developed using the desktop CellProfiler client can be managed and shared through a centralized Jenkins-CI repository. Pipelines and managed data are annotated to facilitate collaboration and reuse. Limitations with Jenkins-CI (primarily around the user interface) were addressed through the selection of helper plugins from the Jenkins-CI community. PMID:27899692

  5. A Dataset and a Technique for Generalized Nuclear Segmentation for Computational Pathology.

    PubMed

    Kumar, Neeraj; Verma, Ruchika; Sharma, Sanuj; Bhargava, Surabhi; Vahadane, Abhishek; Sethi, Amit

    2017-07-01

    Nuclear segmentation in digital microscopic tissue images can enable extraction of high-quality features for nuclear morphometrics and other analysis in computational pathology. Conventional image processing techniques, such as Otsu thresholding and watershed segmentation, do not work effectively on challenging cases, such as chromatin-sparse and crowded nuclei. In contrast, machine learning-based segmentation can generalize across various nuclear appearances. However, training machine learning algorithms requires data sets of images, in which a vast number of nuclei have been annotated. Publicly accessible and annotated data sets, along with widely agreed upon metrics to compare techniques, have catalyzed tremendous innovation and progress on other image classification problems, particularly in object recognition. Inspired by their success, we introduce a large publicly accessible data set of hematoxylin and eosin (H&E)-stained tissue images with more than 21000 painstakingly annotated nuclear boundaries, whose quality was validated by a medical doctor. Because our data set is taken from multiple hospitals and includes a diversity of nuclear appearances from several patients, disease states, and organs, techniques trained on it are likely to generalize well and work right out-of-the-box on other H&E-stained images. We also propose a new metric to evaluate nuclear segmentation results that penalizes object- and pixel-level errors in a unified manner, unlike previous metrics that penalize only one type of error. We also propose a segmentation technique based on deep learning that lays a special emphasis on identifying the nuclear boundaries, including those between the touching or overlapping nuclei, and works well on a diverse set of test images.
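
Otsu thresholding, cited above as a conventional baseline, fits in a few lines of NumPy: the threshold is the histogram cut point maximizing between-class variance. The bimodal intensities below are synthetic stand-ins for an H&E image:

```python
import numpy as np

def otsu_threshold(img, nbins=256):
    """Return the intensity threshold maximizing between-class
    variance of the histogram (Otsu's method)."""
    hist, edges = np.histogram(img, bins=nbins)
    p = hist.astype(float) / hist.sum()
    centers = (edges[:-1] + edges[1:]) / 2
    w0 = np.cumsum(p)              # probability mass of the "dark" class
    m = np.cumsum(p * centers)     # cumulative mean
    mt = m[-1]                     # global mean
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mt * w0 - m) ** 2 / (w0 * (1 - w0))
    return float(centers[np.argmax(np.nan_to_num(sigma_b))])

# Synthetic bimodal intensities: dark background vs bright nuclei.
rng = np.random.default_rng(0)
img = np.concatenate([rng.normal(50, 5, 500), rng.normal(200, 5, 500)])
t = otsu_threshold(img)
print(55 < t < 195)  # True: the threshold lands between the two modes
```

On cleanly bimodal data this works well; the abstract's point is that chromatin-sparse and crowded nuclei break exactly this assumption, motivating learned segmentation.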

  6. The Biological Reference Repository (BioR): a rapid and flexible system for genomics annotation.

    PubMed

    Kocher, Jean-Pierre A; Quest, Daniel J; Duffy, Patrick; Meiners, Michael A; Moore, Raymond M; Rider, David; Hossain, Asif; Hart, Steven N; Dinu, Valentin

    2014-07-01

    The Biological Reference Repository (BioR) is a toolkit for annotating variants. BioR stores public and user-specific annotation sources in indexed JSON-encoded flat files (catalogs). The BioR toolkit provides the functionality to combine and retrieve annotation from these catalogs via the command-line interface. Several catalogs from commonly used annotation sources and instructions for creating user-specific catalogs are provided. Commands from the toolkit can be combined with other UNIX commands for advanced annotation processing. We also provide instructions for the development of custom annotation pipelines. The package is implemented in Java and makes use of external tools written in Java and Perl. The toolkit can be executed on Mac OS X 10.5 and above or any Linux distribution. The BioR application, quickstart, and user guide documents and many biological examples are available at http://bioinformaticstools.mayo.edu. © The Author 2014. Published by Oxford University Press.

  7. A virtual microscope for academic medical education: the pate project.

    PubMed

    Brochhausen, Christoph; Winther, Hinrich B; Hundt, Christian; Schmitt, Volker H; Schömer, Elmar; Kirkpatrick, C James

    2015-05-11

    Whole-slide imaging (WSI) has become more prominent and continues to gain in importance in student teaching. Applications with different scope have been developed. Many of these applications have either technical or design shortcomings. To design a survey to determine student expectations of WSI applications for teaching histological and pathological diagnosis. To develop a new WSI application based on the findings of the survey. A total of 216 students were questioned about their experiences and expectations of WSI applications, as well as favorable and undesired features. The survey included 14 multiple choice and two essay questions. Based on the survey, we developed a new WSI application called Pate utilizing open source technologies. The survey sample included 216 students: 62.0% (134) women and 36.1% (78) men. Out of 216 students, 4 (1.9%) did not disclose their gender. The best-known preexisting WSI applications included Mainzer Histo Maps (199/216, 92.1%), Histoweb Tübingen (16/216, 7.4%), and Histonet Ulm (8/216, 3.7%). Desired features for the students were latitude in the slides (190/216, 88.0%), histological (191/216, 88.4%) and pathological (186/216, 86.1%) annotations, points of interest (181/216, 83.8%), background information (146/216, 67.6%), and auxiliary informational texts (113/216, 52.3%). By contrast, a discussion forum was far less important (9/216, 4.2%) for the students. The survey revealed that the students appreciate a rich feature set, including WSI functionality, points of interest, auxiliary informational texts, and annotations. The development of Pate was significantly influenced by the findings of the survey. Although Pate currently has some issues with the Zoomify file format, it could be shown that Web technologies are capable of providing a high-performance WSI experience, as well as a rich feature set.

  8. Detecting objects in radiographs for homeland security

    NASA Astrophysics Data System (ADS)

    Prasad, Lakshman; Snyder, Hans

    2005-05-01

    We present a general scheme for segmenting a radiographic image into polygons that correspond to visual features. This decomposition provides a vectorized representation that is a high-level description of the image. The polygons correspond to objects or object parts present in the image. This characterization of radiographs allows the direct application of several shape recognition algorithms to identify objects. In this paper we describe the use of constrained Delaunay triangulations as a uniform foundational tool to achieve multiple visual tasks, namely image segmentation, shape decomposition, and parts-based shape matching. Shape decomposition yields parts that serve as tokens representing local shape characteristics. Parts-based shape matching enables the recognition of objects in the presence of occlusions, which commonly occur in radiographs. The polygonal representation of image features affords the efficient design and application of sophisticated geometric filtering methods to detect large-scale structural properties of objects in images. Finally, the representation of radiographs via polygons results in significant reduction of image file sizes and permits the scalable graphical representation of images, along with annotations of detected objects, in the SVG (scalable vector graphics) format that is proposed by the world wide web consortium (W3C). This is a textual representation that can be compressed and encrypted for efficient and secure transmission of information over wireless channels and on the Internet. In particular, our methods described here provide an algorithmic framework for developing image analysis tools for screening cargo at ports of entry for homeland security.
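
The SVG packaging step is straightforward to sketch with stdlib string templating (the polygons and labels below are invented; a real pipeline would serialize the polygons produced by the segmentation):

```python
# Detected-object polygons with annotation labels (invented values).
polygons = [
    {"points": [(10, 10), (60, 12), (55, 50), (12, 48)], "label": "container"},
    {"points": [(25, 20), (40, 22), (38, 35), (24, 33)], "label": "suspect object"},
]

def to_svg(polys, width=100, height=60):
    """Serialize polygons to a minimal SVG document; each polygon
    carries its annotation as a <title> child element."""
    parts = [f'<svg xmlns="http://www.w3.org/2000/svg" '
             f'width="{width}" height="{height}">']
    for p in polys:
        pts = " ".join(f"{x},{y}" for x, y in p["points"])
        parts.append(f'  <polygon points="{pts}" fill="none" stroke="black">'
                     f'<title>{p["label"]}</title></polygon>')
    parts.append("</svg>")
    return "\n".join(parts)

svg = to_svg(polygons)
print(svg.count("<polygon"), svg.count("<title>"))  # 2 2
```

Because the output is plain text, it compresses and encrypts like any other document, which is the transmission advantage the abstract highlights.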

  9. Semantic photo synthesis

    NASA Astrophysics Data System (ADS)

    Johnson, Matthew; Brostow, G. J.; Shotton, J.; Kwatra, V.; Cipolla, R.

    2007-02-01

    Composite images are synthesized from existing photographs by artists who make concept art, e.g. storyboards for movies or architectural planning. Current techniques allow an artist to fabricate such an image by digitally splicing parts of stock photographs. While these images serve mainly to "quickly" convey how a scene should look, their production is laborious. We propose a technique that allows a person to design a new photograph with substantially less effort. This paper presents a method that generates a composite image when a user types in nouns, such as "boat" and "sand." The artist can optionally design an intended image by specifying other constraints. Our algorithm formulates the constraints as queries to search an automatically annotated image database. The desired photograph, not a collage, is then synthesized using graph-cut optimization, optionally allowing for further user interaction to edit or choose among alternative generated photos. Our results demonstrate our contributions of (1) a method of creating specific images with minimal human effort, and (2) a combined algorithm for automatically building an image library with semantic annotations from any photo collection.

  10. Linking Disparate Datasets of the Earth Sciences with the SemantEco Annotator

    NASA Astrophysics Data System (ADS)

    Seyed, P.; Chastain, K.; McGuinness, D. L.

    2013-12-01

    Use of Semantic Web technologies for data management in the Earth sciences (and beyond) has great potential but is still in its early stages, since the challenges of translating data into a more explicit or semantic form for immediate use within applications has not been fully addressed. In this abstract we help address this challenge by introducing the SemantEco Annotator, which enables anyone, regardless of expertise, to semantically annotate tabular Earth Science data and translate it into linked data format, while applying the logic inherent in community-standard vocabularies to guide the process. The Annotator was conceived under a desire to unify dataset content from a variety of sources under common vocabularies, for use in semantically-enabled web applications. Our current use case employs linked data generated by the Annotator for use in the SemantEco environment, which utilizes semantics to help users explore, search, and visualize water or air quality measurement and species occurrence data through a map-based interface. The generated data can also be used immediately to facilitate discovery and search capabilities within 'big data' environments. The Annotator provides a method for taking information about a dataset, that may only be known to its maintainers, and making it explicit, in a uniform and machine-readable fashion, such that a person or information system can more easily interpret the underlying structure and meaning. Its primary mechanism is to enable a user to formally describe how columns of a tabular dataset relate and/or describe entities. For example, if a user identifies columns for latitude and longitude coordinates, we can infer the data refers to a point that can be plotted on a map. Further, it can be made explicit that measurements of 'nitrate' and 'NO3-' are of the same entity through vocabulary assignments, thus more easily utilizing data sets that use different nomenclatures. 
The Annotator provides an extensive and searchable library of vocabularies to assist the user in locating terms to describe observed entities, their properties, and relationships. The Annotator leverages vocabulary definitions of these concepts to guide the user in describing data in a logically consistent manner. The vocabularies made available through the Annotator are open, as is the Annotator itself. We have taken a step towards making semantic annotation/translation of data more accessible. Our vision for the Annotator is as a tool that can be integrated into a semantic data 'workbench' environment, which would allow semantic annotation of a variety of data formats using standard vocabularies. These vocabularies enable search for similar datasets and integration with any semantically-enabled applications for analysis and visualization.
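
At its core, the column-annotation idea reduces to a small mapping exercise: recognize column headers against a vocabulary, unify synonyms such as 'nitrate' and 'NO3-', and emit records keyed by vocabulary terms. The vocabulary URIs and helper below are invented, not the Annotator's actual terms:

```python
# Invented header-to-vocabulary mapping; real vocabularies are far
# larger and community-maintained.
VOCAB = {"nitrate": "chem:Nitrate", "no3-": "chem:Nitrate",
         "lat": "geo:lat", "lon": "geo:long"}

def annotate_rows(header, rows):
    """Rewrite each row as a dict keyed by vocabulary terms where the
    header matches; unmatched headers pass through unchanged."""
    cols = [VOCAB.get(h.strip().lower(), h) for h in header]
    return [dict(zip(cols, row)) for row in rows]

header = ["lat", "lon", "NO3-"]
rows = [[42.7, -73.7, 1.9]]
print(annotate_rows(header, rows)[0])
# {'geo:lat': 42.7, 'geo:long': -73.7, 'chem:Nitrate': 1.9}
```

Once columns are bound to shared terms, datasets using 'nitrate' and 'NO3-' become directly comparable, and recognized lat/lon pairs can be inferred to denote plottable points.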

  11. 36 CFR 1206.22 - What type of proposal is eligible for a publications grant?

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... records; (2) Microfilm editions consisting of organized collections of images of original sources, usually... images of original editions. Electronic editions may include transcriptions and/or annotations and other...

  12. High-Content Microscopy Analysis of Subcellular Structures: Assay Development and Application to Focal Adhesion Quantification.

    PubMed

    Kroll, Torsten; Schmidt, David; Schwanitz, Georg; Ahmad, Mubashir; Hamann, Jana; Schlosser, Corinne; Lin, Yu-Chieh; Böhm, Konrad J; Tuckermann, Jan; Ploubidou, Aspasia

    2016-07-01

    High-content analysis (HCA) converts raw light microscopy images to quantitative data through the automated extraction, multiparametric analysis, and classification of the relevant information content. Combined with automated high-throughput image acquisition, HCA applied to the screening of chemicals or RNAi-reagents is termed high-content screening (HCS). Its power in quantifying cell phenotypes makes HCA applicable also to routine microscopy. However, developing effective HCA and bioinformatic analysis pipelines for acquisition of biologically meaningful data in HCS is challenging. Here, the step-by-step development of an HCA assay protocol and an HCS bioinformatics analysis pipeline are described. The protocol's power is demonstrated by application to focal adhesion (FA) detection, quantitative analysis of multiple FA features, and functional annotation of signaling pathways regulating FA size, using primary data of a published RNAi screen. The assay and the underlying strategy are aimed at researchers performing microscopy-based quantitative analysis of subcellular features, on a small scale or in large HCS experiments. © 2016 by John Wiley & Sons, Inc. Copyright © 2016 John Wiley & Sons, Inc.

  13. Teaching Students To Annotate and Underline Text Effectively--Guidelines and Procedures. College Reading and Learning Assistance Technical Report No. 87-02.

    ERIC Educational Resources Information Center

    Nist, Sherrie L.

    Of all the effective strategies available to college developmental reading students, annotating (noting important ideas or examples in text margins) and underlining have the widest appeal among students and the most practical application in any course. Annotating/underlining serves a dual function: students can isolate key ideas at the time of the…

  14. Large-Scale medical image analytics: Recent methodologies, applications and Future directions.

    PubMed

    Zhang, Shaoting; Metaxas, Dimitris

    2016-10-01

    Despite the ever-increasing amount and complexity of annotated medical image data, the development of large-scale medical image analysis algorithms has not kept pace with the need for methods that bridge the semantic gap between images and diagnoses. The goal of this position paper is to discuss and explore innovative and large-scale data science techniques in medical image analytics, which will benefit clinical decision-making and facilitate efficient medical data management. In particular, we advocate that image retrieval systems should be scaled up significantly, to the point at which interactive systems become effective for knowledge discovery in potentially large databases of medical images. For clinical relevance, such systems should return results in real-time, incorporate expert feedback, and be able to cope with the size, quality, and variety of the medical images and their associated metadata for a particular domain. The design, development, and testing of such a framework can significantly impact interactive mining in medical image databases that are growing rapidly in size and complexity, and enable novel methods of analysis at much larger scales in an efficient, integrated fashion. Copyright © 2016. Published by Elsevier B.V.

  15. Saint: a lightweight integration environment for model annotation.

    PubMed

    Lister, Allyson L; Pocock, Matthew; Taschuk, Morgan; Wipat, Anil

    2009-11-15

    Saint is a web application which provides a lightweight annotation integration environment for quantitative biological models. The system enables modellers to rapidly mark up models with biological information derived from a range of data sources. Saint is freely available for use on the web at http://www.cisban.ac.uk/saint. The web application is implemented in Google Web Toolkit and Tomcat, with all major browsers supported. The Java source code is freely available for download at http://saint-annotate.sourceforge.net. The Saint web server requires an installation of libSBML and has been tested on Linux (32-bit Ubuntu 8.10 and 9.04).

  16. Pfarao: a web application for protein family analysis customized for cytoskeletal and motor proteins (CyMoBase).

    PubMed

    Odronitz, Florian; Kollmar, Martin

    2006-11-29

    Annotation of protein sequences of eukaryotic organisms is crucial for the understanding of their function in the cell. Manual annotation is still by far the most accurate way to correctly predict genes. The classification of protein sequences, their phylogenetic relation and the assignment of function involves information from various sources. This often leads to a collection of heterogeneous data, which is hard to track. Cytoskeletal and motor proteins consist of large and diverse superfamilies comprising up to several dozen members per organism. To date, there is no integrated tool available to assist in the manual large-scale comparative genomic analysis of protein families. Pfarao (Protein Family Application for Retrieval, Analysis and Organisation) is a database-driven online working environment for the analysis of manually annotated protein sequences and their relationship. Currently, the system can store and interrelate a wide range of information about protein sequences, species, phylogenetic relations and sequencing projects as well as links to literature and domain predictions. Sequences can be imported from multiple sequence alignments that are generated during the annotation process. A web interface allows users to conveniently browse the database and to compile tabular and graphical summaries of its content. We implemented a protein sequence-centric web application to store, organize, interrelate, and present heterogeneous data that is generated in manual genome annotation and comparative genomics. The application has been developed for the analysis of cytoskeletal and motor proteins (CyMoBase) but can easily be adapted for any protein.

  17. Prospective study of automated versus manual annotation of early time-lapse markers in the human preimplantation embryo.

    PubMed

    Kaser, Daniel J; Farland, Leslie V; Missmer, Stacey A; Racowsky, Catherine

    2017-08-01

    How does automated time-lapse annotation (Eeva™) compare to manual annotation of the same video images performed by embryologists certified in measuring durations of the 2-cell (P2; time to the 3-cell minus time to the 2-cell, or t3-t2) and 3-cell (P3; time to 4-cell minus time to the 3-cell, or t4-t3) stages? Manual annotation was superior to the automated annotation provided by Eeva™ version 2.2, because manual annotation assigned a rating to a higher proportion of embryos and yielded a greater sensitivity for blastocyst prediction than automated annotation. While use of the Eeva™ test has been shown to improve an embryologist's ability to predict blastocyst formation compared to Day 3 morphology alone, the accuracy of the automated image analysis employed by the Eeva™ system has never been compared to manual annotation of the same time-lapse markers by a trained embryologist. We conducted a prospective cohort study of embryos (n = 1477) cultured in the Eeva™ system (n = 8 microscopes) at our institution from August 2014 to February 2016. Embryos were assigned a blastocyst prediction rating of High (H), Medium (M), Low (L), or Not Rated (NR) by Eeva™ version 2.2 according to P2 and P3. An embryologist from a team of 10 then manually annotated each embryo, and if the automated and manual ratings differed, a second embryologist independently annotated the embryo. If both embryologists disagreed with the automated Eeva™ rating, then the rating was classified as discordant. If the second embryologist agreed with the automated Eeva™ score, the rating was not considered discordant. Spearman's correlation (ρ), weighted kappa statistics and the intra-class correlation (ICC) coefficients with 95% confidence intervals (CI) between Eeva™ and manual annotation were calculated, as were the proportions of discordant embryos, and the sensitivity, specificity, positive predictive value (PPV) and NPV of each method for blastocyst prediction.
The distribution of H, M and L ratings differed by annotation method (P < 0.0001). The correlation between Eeva™ and manual annotation was higher for P2 (ρ = 0.75; ICC = 0.82; 95% CI 0.82-0.83) than for P3 (ρ = 0.39; ICC = 0.20; 95% CI 0.16-0.26). Eeva™ was more likely than an embryologist to rate an embryo as NR (11.1% vs. 3.0%, P < 0.0001). Discordance occurred in 30.0% (443/1477) of all embryos and was not associated with factors such as Day 3 cell number, fragmentation, symmetry or presence of abnormal cleavage. Rather, discordance was associated with direct cleavage (P2 ≤ 5 h) and short P3 (≤0.25 h), and also factors intrinsic to the Eeva™ system, such as the automated rating (proportion of discordant embryos by rating: H: 9.3%; M: 18.1%; L: 41.3%; NR: 31.4%; P < 0.0001), microwell location (peripheral: 31.2%; central: 23.8%; P = 0.02) and Eeva™ microscope (n = 8; range 22.9-42.6%; P < 0.0001). Manual annotation upgraded 82.6% of all discordant embryos from a lower to a higher rating, and improved the sensitivity for predicting blastocyst formation. One team of embryologists performed the manual annotations; however, the study staff was trained and certified by the company sponsor. Only two time-lapse markers were evaluated, so the results are not generalizable to other parameters; likewise, the results are not generalizable to future versions of Eeva™ or other automated image analysis systems. Based on the proportion of discordance and the improved performance of manual annotation, clinics using the Eeva™ system should consider manual annotation of P2 and P3 to confirm the automated ratings generated by Eeva™. These data were acquired in a study funded by Progyny, Inc. There are no competing interests. N/A. © The Author 2017. Published by Oxford University Press on behalf of the European Society of Human Reproduction and Embryology. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
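The two time-lapse markers compared in this study are simple differences of cleavage timestamps. Below is a minimal sketch, assuming annotation times in hours; the 5 h cutoff for direct cleavage and the 0.25 h cutoff for short P3 are taken from the abstract, while the function names are ours and not part of the Eeva™ software:

```python
def cleavage_params(t2, t3, t4):
    """Return the duration of the 2-cell stage (P2 = t3 - t2) and the
    3-cell stage (P3 = t4 - t3), with all times in hours."""
    return t3 - t2, t4 - t3

def discordance_risk_flags(t2, t3, t4):
    """Flag the timing patterns the study found to be associated with
    discordance between automated and manual ratings."""
    p2, p3 = cleavage_params(t2, t3, t4)
    return {
        "direct_cleavage": p2 <= 5.0,   # P2 <= 5 h
        "short_P3": p3 <= 0.25,         # P3 <= 0.25 h
    }
```

For example, an embryo annotated at t2 = 24.0, t3 = 28.0 and t4 = 28.1 h would be flagged on both counts.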

  18. Automated Detection of Microaneurysms Using Scale-Adapted Blob Analysis and Semi-Supervised Learning

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Adal, Kedir M.; Sidebe, Desire; Ali, Sharib

    2014-01-07

    Despite several attempts, automated detection of microaneurysms (MAs) from digital fundus images remains an open issue, owing to the subtle appearance of MAs against the surrounding tissue. In this paper, the microaneurysm detection problem is modeled as finding interest regions or blobs in an image, and an automatic local-scale selection technique is presented. Several scale-adapted region descriptors are then introduced to characterize these blob regions. A semi-supervised learning approach, which requires few manually annotated learning examples, is also proposed to train a classifier to detect true MAs. The developed system is built using only a few manually labeled and a large number of unlabeled retinal color fundus images. The performance of the overall system is evaluated on the Retinopathy Online Challenge (ROC) competition database. A competition performance measure (CPM) of 0.364 shows the competitiveness of the proposed system against state-of-the-art techniques as well as the applicability of the proposed features to analyze fundus images.
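The local-scale selection step can be illustrated with the classic scale-normalized Laplacian-of-Gaussian rule (Lindeberg): the characteristic scale of a blob is the σ that maximizes |σ²·LoG| at its center. This is a hedged sketch of that generic rule, not the authors' actual descriptor pipeline:

```python
import numpy as np
from scipy.ndimage import gaussian_laplace

def best_blob_scale(image, y, x, sigmas):
    """Pick the scale maximizing the scale-normalized Laplacian-of-Gaussian
    response |sigma^2 * LoG| at pixel (y, x): the standard automatic
    scale-selection rule for blob-like structures."""
    responses = [abs((s ** 2) * gaussian_laplace(image, s)[y, x]) for s in sigmas]
    return sigmas[int(np.argmax(responses))]
```

For a Gaussian blob of standard deviation t, the selected scale is σ ≈ t, so the rule recovers the blob's intrinsic size.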

  19. Automated geo/ortho registered aerial imagery product generation using the mapping system interface card (MSIC)

    NASA Astrophysics Data System (ADS)

    Bratcher, Tim; Kroutil, Robert; Lanouette, André; Lewis, Paul E.; Miller, David; Shen, Sylvia; Thomas, Mark

    2013-05-01

    The development concept paper for the MSIC system was first introduced in August 2012 by these authors. This paper describes the final assembly, testing, and commercial availability of the Mapping System Interface Card (MSIC). The 2.3 kg MSIC is a self-contained, compact, variable-configuration, low-cost, real-time precision metadata annotator with embedded INS/GPS, designed specifically for use in small aircraft. The MSIC was specifically designed to convert commercial-off-the-shelf (COTS) digital cameras and imaging/non-imaging spectrometers with Camera Link standard data streams into mapping systems for airborne emergency response and scientific remote sensing applications. COTS digital cameras and imaging/non-imaging spectrometers covering the ultraviolet through long-wave infrared wavelengths are important tools now readily available and affordable for use by emergency responders and scientists. The MSIC will significantly enhance the capability of emergency responders and scientists by providing a direct transformation of these important COTS sensor tools into low-cost real-time aerial mapping systems.

  20. A Next-generation Tissue Microarray (ngTMA) Protocol for Biomarker Studies

    PubMed Central

    Zlobec, Inti; Suter, Guido; Perren, Aurel; Lugli, Alessandro

    2014-01-01

    Biomarker research relies on tissue microarrays (TMA). TMAs are produced by repeated transfer of small tissue cores from a ‘donor’ block into a ‘recipient’ block and then used for a variety of biomarker applications. The construction of conventional TMAs is labor intensive, imprecise, and time-consuming. Here, a protocol using next-generation Tissue Microarrays (ngTMA) is outlined. ngTMA is based on TMA planning and design, digital pathology, and automated tissue microarraying. The protocol is illustrated using an example of 134 metastatic colorectal cancer patients. Histological, statistical and logistical aspects are considered, such as the tissue type, specific histological regions, and cell types for inclusion in the TMA, the number of tissue spots, sample size, statistical analysis, and number of TMA copies. Histological slides for each patient are scanned and uploaded onto a web-based digital platform. There, they are viewed and annotated (marked) using a 0.6-2.0 mm diameter tool, multiple times using various colors to distinguish tissue areas. Donor blocks and 12 ‘recipient’ blocks are loaded into the instrument. Digital slides are retrieved and matched to donor block images. Repeated arraying of annotated regions is automatically performed resulting in an ngTMA. In this example, six ngTMAs are planned containing six different tissue types/histological zones. Two copies of the ngTMAs are desired. Three to four slides for each patient are scanned; 3 scan runs are necessary and performed overnight. All slides are annotated; different colors are used to represent the different tissues/zones, namely tumor center, invasion front, tumor/stroma, lymph node metastases, liver metastases, and normal tissue. 17 annotations/case are made; time for annotation is 2-3 min/case. 12 ngTMAs are produced containing 4,556 spots. Arraying time is 15-20 hr. 
Due to its precision, flexibility and speed, ngTMA is a powerful tool to further improve the quality of TMAs used in clinical and translational research. PMID:25285857

  1. Neuronal Morphology goes Digital: A Research Hub for Cellular and System Neuroscience

    PubMed Central

    Parekh, Ruchi; Ascoli, Giorgio A.

    2013-01-01

    Summary The importance of neuronal morphology in brain function has been recognized for over a century. The broad applicability of “digital reconstructions” of neuron morphology across neuroscience sub-disciplines has stimulated the rapid development of numerous synergistic tools for data acquisition, anatomical analysis, three-dimensional rendering, electrophysiological simulation, growth models, and data sharing. Here we discuss the processes of histological labeling, microscopic imaging, and semi-automated tracing. Moreover, we provide an annotated compilation of currently available resources in this rich research “ecosystem” as a central reference for experimental and computational neuroscience. PMID:23522039

  2. Bibliocable.

    ERIC Educational Resources Information Center

    Cable Television Information Center, Washington, DC.

    This selective, annotated bibliography covers 67 items published on cable television from 1968 to 1972. The books, articles, and report literature included here deal with these topics: introduction, background, access, applications, economic aspects, franchising, regulation, and technology. Each annotation includes sources and ordering…

  3. Enhancing colposcopy with polarized light.

    PubMed

    Ferris, Daron G; Li, Wenjing; Gustafsson, Ulf; Lieberman, Richard W; Galdos, Oscar; Santos, Carlos

    2010-07-01

    To determine the potential utility of polarized light used during colposcopic examinations. Matched polarized and unpolarized colposcopic images and diagnostic annotations from 31 subjects receiving excisional treatment of cervical neoplasia were compared. Sensitivity, specificity, and mean Euclidean distances between the centroids of the Gaussian ellipsoids for the different epithelial types were calculated for unpolarized and polarized images. The sensitivities of polarized colposcopic annotations for discriminating cervical intraepithelial neoplasia (CIN) 2 or higher were greater for all 3 acetowhite categories when compared with unpolarized annotations (58% [44/76] vs 45% [34/76], 68% [50/74] vs 59% [45/76], and 68% [49/72] vs 66% [50/76], respectively). The average percent differences in Euclidean distances between the epithelial types for unpolarized and polarized cervical images were as follows: CIN 2/3 versus CIN 1 = 33% (10/30, p = .03), CIN 2/3 versus columnar epithelium = 22% (p = .004), CIN 2/3 versus immature metaplasia = 29% (14/47, p = .11), and CIN 1 versus immature metaplasia = 27% (4.4/16, p = .16). Because of its ability to interrogate at a deeper plane and eliminate obscuring glare, polarized light colposcopy may enhance the evaluation and detection of cervical neoplasia.

  4. NoGOA: predicting noisy GO annotations using evidences and sparse representation.

    PubMed

    Yu, Guoxian; Lu, Chang; Wang, Jun

    2017-07-21

    Gene Ontology (GO) is a community effort to represent functional features of gene products. GO annotations (GOA) provide functional associations between GO terms and gene products. Due to resource limitations, only a small portion of annotations are manually checked by curators; the others are electronically inferred. Although quality control techniques have been applied to ensure the quality of annotations, the community consistently reports that there are still considerable noisy (or incorrect) annotations. Given the wide application of annotations, however, how to identify noisy annotations is an important yet seldom-studied open problem. We introduce a novel approach called NoGOA to predict noisy annotations. NoGOA applies sparse representation on the gene-term association matrix to reduce the impact of noisy annotations, and takes advantage of sparse representation coefficients to measure the semantic similarity between genes. It then preliminarily predicts noisy annotations of a gene based on aggregated votes from semantic neighborhood genes of that gene. Next, NoGOA estimates the ratio of noisy annotations for each evidence code based on direct annotations in GOA files archived over different periods, and then weights entries of the association matrix via the estimated ratios and propagates weights to ancestors of direct annotations using the GO hierarchy. Finally, it integrates the evidence-weighted association matrix and aggregated votes to predict noisy annotations. Experiments on archived GOA files of six model species (H. sapiens, A. thaliana, S. cerevisiae, G. gallus, B. taurus and M. musculus) demonstrate that NoGOA achieves significantly better results than other related methods and that removing noisy annotations improves the performance of gene function prediction. The comparative study justifies the effectiveness of integrating evidence codes with sparse representation for predicting noisy GO annotations.
Codes and datasets are available at http://mlda.swu.edu.cn/codes.php?name=NoGOA .
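The sparse-representation step described in the abstract can be sketched as follows: a gene's annotation vector is reconstructed as a sparse non-negative combination of the other genes' vectors, and the resulting coefficients act as semantic similarities. This is an illustrative reading, not the published NoGOA code; the L1 penalty `alpha` and the toy matrix are our choices:

```python
import numpy as np
from sklearn.linear_model import Lasso

def sparse_similarity(assoc, gene_idx, alpha=0.01):
    """Reconstruct one gene's GO-term association vector (row of `assoc`)
    as a sparse non-negative combination of all other genes; the fitted
    coefficients serve as semantic-similarity scores (larger = closer)."""
    y = assoc[gene_idx]
    X = np.delete(assoc, gene_idx, axis=0).T  # columns = the other genes
    model = Lasso(alpha=alpha, positive=True, max_iter=10000)
    model.fit(X, y)
    return np.insert(model.coef_, gene_idx, 0.0)  # self-similarity set to 0
```

In a toy association matrix, a gene sharing all its GO terms with the target receives a much larger coefficient than an unrelated gene.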

  5. An annotation system for 3D fluid flow visualization

    NASA Technical Reports Server (NTRS)

    Loughlin, Maria M.; Hughes, John F.

    1995-01-01

    Annotation is a key activity of data analysis. However, current systems for data analysis focus almost exclusively on visualization. We propose a system which integrates annotations into a visualization system. Annotations are embedded in 3D data space, using the Post-it metaphor. This embedding allows context-based information storage and retrieval, and facilitates information sharing in collaborative environments. We provide a traditional database filter and a Magic Lens filter to create specialized views of the data. The system has been customized for fluid flow applications, with features which allow users to store parameters of visualization tools and sketch 3D volumes.

  6. A neotropical Miocene pollen database employing image-based search and semantic modeling.

    PubMed

    Han, Jing Ginger; Cao, Hongfei; Barb, Adrian; Punyasena, Surangi W; Jaramillo, Carlos; Shyu, Chi-Ren

    2014-08-01

    Digital microscopic pollen images are being generated with increasing speed and volume, producing opportunities to develop new computational methods that increase the consistency and efficiency of pollen analysis and provide the palynological community a computational framework for information sharing and knowledge transfer. • Mathematical methods were used to assign trait semantics (abstract morphological representations) of the images of neotropical Miocene pollen and spores. Advanced database-indexing structures were built to compare and retrieve similar images based on their visual content. A Web-based system was developed to provide novel tools for automatic trait semantic annotation and image retrieval by trait semantics and visual content. • Mathematical models that map visual features to trait semantics can be used to annotate images with morphology semantics and to search image databases with improved reliability and productivity. Images can also be searched by visual content, providing users with customized emphases on traits such as color, shape, and texture. • Content- and semantic-based image searches provide a powerful computational platform for pollen and spore identification. The infrastructure outlined provides a framework for building a community-wide palynological resource, streamlining the process of manual identification, analysis, and species discovery.

  7. PANDA: pathway and annotation explorer for visualizing and interpreting gene-centric data.

    PubMed

    Hart, Steven N; Moore, Raymond M; Zimmermann, Michael T; Oliver, Gavin R; Egan, Jan B; Bryce, Alan H; Kocher, Jean-Pierre A

    2015-01-01

    Objective. Bringing together genomics, transcriptomics, proteomics, and other -omics technologies is an important step towards developing highly personalized medicine. However, instrumentation has advanced far beyond expectations and we are now able to generate data faster than they can be interpreted. Materials and Methods. We have developed PANDA (Pathway AND Annotation) Explorer, a visualization tool that integrates gene-level annotation in the context of biological pathways to help interpret complex data from disparate sources. PANDA is a web-based application that displays data in the context of well-studied pathways like KEGG, BioCarta, and PharmGKB. PANDA represents data/annotations as icons in the graph while maintaining the other data elements (i.e., other columns for the table of annotations). Custom pathways from underrepresented diseases can be imported when existing data sources are inadequate. PANDA also allows sharing annotations among collaborators. Results. In our first use case, we show how easy it is to view supplemental data from a manuscript in the context of a user's own data. Another use case describes how PANDA was leveraged to design a treatment strategy from the somatic variants found in the tumor of a patient with metastatic sarcomatoid renal cell carcinoma. Conclusion. PANDA facilitates the interpretation of gene-centric annotations by visually integrating this information with the context of biological pathways. The application can be downloaded or used directly from our website: http://bioinformaticstools.mayo.edu/research/panda-viewer/.

  8. Recognition of Protein-coding Genes Based on Z-curve Algorithms

    PubMed Central

    Guo, Feng-Biao; Lin, Yan; Chen, Ling-Ling

    2014-01-01

    Recognition of protein-coding genes, a classical bioinformatics issue, is an essential step in annotating newly sequenced genomes. The Z-curve algorithm, one of the most effective methods for this task, has been successfully applied in annotating or re-annotating many genomes, including those of bacteria, archaea and viruses. Two Z-curve based ab initio gene-finding programs have been developed: ZCURVE (for bacteria and archaea) and ZCURVE_V (for viruses and phages). ZCURVE_C (for 57 bacteria) and Zfisher (for any bacterium) are web servers for the re-annotation of bacterial and archaeal genomes. The above four tools can be used for genome annotation or re-annotation, either independently or combined with other gene-finding programs. In addition to recognizing protein-coding genes and exons, Z-curve algorithms are also effective in recognizing promoters and translation start sites. Here, we summarize the applications of Z-curve algorithms in gene finding and genome annotation. PMID:24822027
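The Z-curve itself is a simple, well-defined transform: a DNA sequence is mapped to three cumulative coordinate series that separate the purine/pyrimidine, amino/keto, and weak/strong hydrogen-bond distributions of its bases. A minimal sketch of the transform only (the gene-finding statistics that ZCURVE builds on top of it are not reproduced here):

```python
def z_curve(seq):
    """Map a DNA sequence onto its three Z-curve components after each base:

    x: purine (A, G) vs. pyrimidine (C, T) distribution
    y: amino (A, C) vs. keto (G, T) distribution
    z: weak (A, T) vs. strong (G, C) hydrogen-bond distribution
    """
    counts = {"A": 0, "C": 0, "G": 0, "T": 0}
    xs, ys, zs = [], [], []
    for base in seq.upper():
        if base in counts:  # non-ACGT symbols leave the counts unchanged
            counts[base] += 1
        a, c, g, t = counts["A"], counts["C"], counts["G"], counts["T"]
        xs.append((a + g) - (c + t))
        ys.append((a + c) - (g + t))
        zs.append((a + t) - (g + c))
    return xs, ys, zs
```

For example, the sequence "AG" (two purines) ends at x = 2 with y = z = 0, while "AT" (two weak-bond bases) ends at z = 2 with x = y = 0.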

  9. Paraconsistent Annotated Logic in Viability Analysis: an Approach to Product Launching

    NASA Astrophysics Data System (ADS)

    Romeu de Carvalho, Fábio; Brunstein, Israel; Abe, Jair Minoro

    2004-08-01

    In this paper we present an application of the Para-analyzer, a logical analyzer based on the Paraconsistent Annotated Logic Pτ introduced by Da Silva Filho and Abe, to decision-making systems. An example is analyzed in detail, showing how uncertainty, inconsistency and paracompleteness can be elegantly handled with this logical system. As an application of the Para-analyzer in decision-making, we developed the BAM — Baricenter Analysis Method. To make the presentation easier, we present the BAM applied to the viability analysis of product launching. Some of the techniques of Paraconsistent Annotated Logic have been applied in Artificial Intelligence, Robotics, Information Technology (Computer Science), etc.

  10. Cloud-Based Evaluation of Anatomical Structure Segmentation and Landmark Detection Algorithms: VISCERAL Anatomy Benchmarks.

    PubMed

    Jimenez-Del-Toro, Oscar; Muller, Henning; Krenn, Markus; Gruenberg, Katharina; Taha, Abdel Aziz; Winterstein, Marianne; Eggel, Ivan; Foncubierta-Rodriguez, Antonio; Goksel, Orcun; Jakab, Andras; Kontokotsios, Georgios; Langs, Georg; Menze, Bjoern H; Salas Fernandez, Tomas; Schaer, Roger; Walleyo, Anna; Weber, Marc-Andre; Dicente Cid, Yashin; Gass, Tobias; Heinrich, Mattias; Jia, Fucang; Kahl, Fredrik; Kechichian, Razmig; Mai, Dominic; Spanier, Assaf B; Vincent, Graham; Wang, Chunliang; Wyeth, Daniel; Hanbury, Allan

    2016-11-01

    Variations in the shape and appearance of anatomical structures in medical images are often relevant radiological signs of disease. Automated tools can take over parts of this otherwise manual assessment process. A cloud-based evaluation framework is presented in this paper, including results of benchmarking current state-of-the-art medical imaging algorithms for anatomical structure segmentation and landmark detection: the VISCERAL Anatomy benchmarks. The algorithms are implemented in virtual machines in the cloud, where participants can access only the training data, and are run privately by the benchmark administrators to objectively compare their performance on an unseen common test set. Overall, 120 computed tomography and magnetic resonance patient volumes were manually annotated to create a standard Gold Corpus containing a total of 1295 structures and 1760 landmarks. Ten participants contributed automatic algorithms for the organ segmentation task, and three for the landmark localization task. Different algorithms obtained the best scores in the four available imaging modalities and for subsets of anatomical structures. The annotation framework, resulting data set, evaluation setup, results and performance analysis from the three VISCERAL Anatomy benchmarks are presented in this article. Both the VISCERAL data set and the Silver Corpus generated with the fusion of the participant algorithms on a larger set of non-manually-annotated medical images are available to the research community.
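Segmentation benchmarks of this kind score the overlap between a participant's mask and the gold-standard annotation; the Dice coefficient is the standard overlap measure and is shown here as an illustrative metric, since the article's full evaluation setup uses several measures:

```python
import numpy as np

def dice_coefficient(seg, gold):
    """Dice overlap between a binary segmentation and the gold-standard
    mask: 2|A∩B| / (|A| + |B|); 1.0 = perfect agreement, 0.0 = no overlap."""
    seg = np.asarray(seg, dtype=bool)
    gold = np.asarray(gold, dtype=bool)
    denom = seg.sum() + gold.sum()
    if denom == 0:
        return 1.0  # both masks empty: conventionally perfect agreement
    return 2.0 * np.logical_and(seg, gold).sum() / denom
```

A mask covering half of a two-voxel gold structure while adding no false positives scores 2/3, reflecting both the missed voxel and the penalty-free background.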

  11. The application of an optical Fourier spectrum analyzer on detecting defects in mass-produced satellite photographs

    NASA Technical Reports Server (NTRS)

    Athale, R.; Lee, S. H.

    1976-01-01

    Various defects in mass-produced pictures transmitted to earth from a satellite are investigated. It is found that the following defects are readily detectable via Fourier spectrum analysis: (1) bit slip, (2) breakup causing loss of image, and (3) a disabled track at the top of the imagery. The scratches made on the film during mass production, which are difficult to detect by visual observation, also show up readily in Fourier spectrum analysis. A relation is established between the number of scratches, their width and depth, and the intensity of their Fourier spectra. Other defects that are found to be equally suitable for Fourier spectrum analysis or visual (image analysis) detection are synchronous loss without blurring of the image, and density variation in the gray scale. However, Fourier spectrum analysis is found to be unsuitable for detecting such defects as pin holes, annotation errors, synchronous loss with blurring of images, and missing image at the beginning of the work order. The design of an automated, real-time system which will reject defective films is treated.
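The principle behind the optical analyzer can be mimicked digitally: periodic defects such as scratches concentrate spectral energy at off-center peaks of the 2D Fourier magnitude spectrum, whereas a defect-free frame keeps its energy near DC. A hedged numpy sketch of that idea (the original system was an optical, not a digital, Fourier analyzer):

```python
import numpy as np

def dominant_spatial_frequency(image):
    """Locate the strongest non-DC peak of the 2D Fourier magnitude
    spectrum; periodic defects such as scratches produce such peaks,
    while clean frames concentrate their energy at/near DC."""
    spectrum = np.abs(np.fft.fft2(image - image.mean()))
    spectrum[0, 0] = 0.0  # suppress any residual DC component
    v, u = np.unravel_index(np.argmax(spectrum), spectrum.shape)
    return v, u, spectrum[v, u]
```

A frame of vertical stripes with 8 cycles across its width, for instance, yields a peak at horizontal frequency index 8 (or its conjugate mirror).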

  12. a Clustering-Based Approach for Evaluation of EO Image Indexing

    NASA Astrophysics Data System (ADS)

    Bahmanyar, R.; Rigoll, G.; Datcu, M.

    2013-09-01

    The volume of Earth Observation data is increasing immensely, on the order of several terabytes a day. Therefore, to explore and investigate the content of this huge amount of data, developing more sophisticated Content-Based Information Retrieval (CBIR) systems is in high demand. These systems should be able not only to discover unknown structures behind the data, but also to provide relevant results for users' queries. Since in any retrieval system the images are processed based on a discrete set of their features (i.e., feature descriptors), study and assessment of the structure of the feature space built by different feature descriptors is of high importance. In this paper, we introduce a clustering-based approach to study the content of image collections. In our approach, we claim that using both internal and external evaluation of clusters for different feature descriptors helps in understanding the structure of the feature space. Moreover, the semantic understanding of users about the images can also be assessed. To validate the performance of our approach, we used an annotated Synthetic Aperture Radar (SAR) image collection. Quantitative results, together with a visualization of the feature space, demonstrate the applicability of our approach.
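The internal/external evaluation idea can be sketched with two standard scores: silhouette (internal, needs no labels) and the adjusted Rand index (external, compared against human annotations). This is an illustrative pairing with scikit-learn, not the authors' exact protocol; the toy 2D features stand in for real image descriptors:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score, adjusted_rand_score

def evaluate_descriptor(features, annotations, n_clusters):
    """Cluster images in a descriptor's feature space and score the result
    internally (silhouette: compactness/separation of the clusters) and
    externally (adjusted Rand index against human annotations)."""
    labels = KMeans(n_clusters=n_clusters, n_init=10,
                    random_state=0).fit_predict(features)
    return {
        "silhouette": silhouette_score(features, labels),
        "ari": adjusted_rand_score(annotations, labels),
    }
```

Comparing these two scores across descriptors indicates both how well each descriptor structures the feature space and how well that structure matches the users' semantic labels.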

  13. Core French: A Selected Annotated Resource List.

    ERIC Educational Resources Information Center

    Boyd, J. A.; Mollica, Anthony

    1985-01-01

    This is an annotated bibliography of: readers, workbooks, conversation books, cultural sources and readings, flash cards, duplicating or line masters, and media kits submitted by publishers as applicable to French second language instruction from kindergarten through senior high school levels. (MSE)

  14. Pfarao: a web application for protein family analysis customized for cytoskeletal and motor proteins (CyMoBase)

    PubMed Central

    Odronitz, Florian; Kollmar, Martin

    2006-01-01

    Background Annotation of protein sequences of eukaryotic organisms is crucial for the understanding of their function in the cell. Manual annotation is still by far the most accurate way to correctly predict genes. The classification of protein sequences, their phylogenetic relation and the assignment of function involves information from various sources. This often leads to a collection of heterogeneous data, which is hard to track. Cytoskeletal and motor proteins consist of large and diverse superfamilies comprising up to several dozen members per organism. To date, there is no integrated tool available to assist in the manual large-scale comparative genomic analysis of protein families. Description Pfarao (Protein Family Application for Retrieval, Analysis and Organisation) is a database-driven online working environment for the analysis of manually annotated protein sequences and their relationship. Currently, the system can store and interrelate a wide range of information about protein sequences, species, phylogenetic relations and sequencing projects as well as links to literature and domain predictions. Sequences can be imported from multiple sequence alignments that are generated during the annotation process. A web interface allows users to conveniently browse the database and to compile tabular and graphical summaries of its content. Conclusion We implemented a protein sequence-centric web application to store, organize, interrelate, and present heterogeneous data that is generated in manual genome annotation and comparative genomics. The application has been developed for the analysis of cytoskeletal and motor proteins (CyMoBase) but can easily be adapted for any protein. PMID:17134497

  15. Enhanced Isotopic Ratio Outlier Analysis (IROA) Peak Detection and Identification with Ultra-High Resolution GC-Orbitrap/MS: Potential Application for Investigation of Model Organism Metabolomes.

    PubMed

    Qiu, Yunping; Moir, Robyn D; Willis, Ian M; Seethapathy, Suresh; Biniakewitz, Robert C; Kurland, Irwin J

    2018-01-18

    Identifying non-annotated peaks may have a significant impact on the understanding of biological systems. In silico methodologies have focused on ESI LC/MS/MS for identifying non-annotated MS peaks. In this study, we employed in silico methodology to develop an Isotopic Ratio Outlier Analysis (IROA) workflow using enhanced mass spectrometric data acquired with the ultra-high resolution GC-Orbitrap/MS to determine the identity of non-annotated metabolites. The higher resolution of the GC-Orbitrap/MS, together with its wide dynamic range, resulted in more IROA peak pairs detected and increased reliability of chemical formulae generation (CFG). IROA uses two different 13C-enriched carbon sources (randomized 95% 12C and 95% 13C) to produce mirror-image isotopologue pairs, whose mass difference reveals the carbon chain length (n), which aids in the identification of endogenous metabolites. Accurate m/z, n, and derivatization information are obtained from our GC/MS workflow for unknown metabolite identification, and aid in silico methodologies for identifying isomeric and non-annotated metabolites. We were able to mine more mass spectral information using the same Saccharomyces cerevisiae growth protocol (Qiu et al. Anal. Chem 2016) with the ultra-high resolution GC-Orbitrap/MS, using 10% ammonia in methane as the CI reagent gas. We identified 244 IROA peak pairs, which significantly increased IROA detection capability compared with our previous report (126 IROA peak pairs using a GC-TOF/MS machine). For 55 selected metabolites identified from matched IROA CI and EI spectra, using the GC-Orbitrap/MS vs. GC-TOF/MS, the average mass deviation for the GC-Orbitrap/MS was 1.48 ppm, whereas the average mass deviation was 32.2 ppm for the GC-TOF/MS machine.
In summary, the higher resolution and wider dynamic range of the GC-Orbitrap/MS enabled more accurate CFG, and the coupling of accurate mass GC/MS IROA methodology with in silico fragmentation has great potential in unknown metabolite identification, with applications for characterizing model organism networks.
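The carbon-count inference from an IROA peak pair follows directly from the 12C/13C mass spacing described above. A minimal sketch of that arithmetic (the function name and pairing tolerance are illustrative assumptions, not the authors' workflow):

```python
# Sketch: inferring carbon chain length n from an IROA peak pair. The 95%
# 12C and 95% 13C isotopologues of the same metabolite differ in mass by
# n * (mass(13C) - mass(12C)).
DELTA_13C = 13.00335 - 12.00000  # exact mass difference per carbon, Da

def carbon_count(mz_12c, mz_13c, tolerance=0.01):
    """Estimate the number of carbons from a mirror-image peak pair."""
    n = (mz_13c - mz_12c) / DELTA_13C
    n_rounded = round(n)
    # Reject pairs whose mass difference is not close to an integer
    # multiple of the 12C/13C spacing.
    if abs(n - n_rounded) * DELTA_13C > tolerance:
        raise ValueError("peaks do not form a plausible IROA pair")
    return n_rounded

# Glucose (C6H12O6): the fully 13C-labelled form is ~6.02 Da heavier.
print(carbon_count(180.0634, 180.0634 + 6 * DELTA_13C))  # 6
```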

  16. Helioviewer.org: Browsing Very Large Image Archives Online Using JPEG 2000

    NASA Astrophysics Data System (ADS)

    Hughitt, V. K.; Ireland, J.; Mueller, D.; Dimitoglou, G.; Garcia Ortiz, J.; Schmidt, L.; Wamsler, B.; Beck, J.; Alexanderian, A.; Fleck, B.

    2009-12-01

As the amount of solar data available to scientists continues to increase at faster and faster rates, it is important that there exist simple tools for navigating this data quickly with a minimal amount of effort. By combining heterogeneous solar physics data types such as full-disk and coronagraph images, along with feature and event information, Helioviewer offers a simple and intuitive way to browse multiple datasets simultaneously. Images are stored in a repository using the JPEG 2000 format and tiled dynamically upon a client's request. By tiling images and serving only the portions of the image requested, it is possible for the client to work with very large images without having to fetch all of the data at once. In addition to a focus on intercommunication with other virtual observatories and browsers (VSO, HEK, etc.), Helioviewer will offer a number of externally-available application programming interfaces (APIs) to enable easy third-party use, adoption, and extension. Recent efforts have resulted in increased performance, dynamic movie generation, and improved support for mobile web browsers. Future functionality will include: support for additional data sources including RHESSI, SDO, STEREO, and TRACE, a navigable timeline of recorded solar events, social annotation, and basic client-side image processing.
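The serve-only-the-requested-tiles idea reduces to simple index arithmetic on the client or server side. A sketch under an assumed tiling scheme (the tile size and function are illustrative, not Helioviewer's actual API):

```python
# Sketch: given a viewport in image pixels, list the tile indices that
# must be decoded, so the client never fetches the full image.
def tiles_for_viewport(x, y, width, height, tile_size=512):
    """Return (col, row) indices of all tiles intersecting the viewport."""
    first_col, first_row = x // tile_size, y // tile_size
    last_col = (x + width - 1) // tile_size
    last_row = (y + height - 1) // tile_size
    return [(c, r)
            for r in range(first_row, last_row + 1)
            for c in range(first_col, last_col + 1)]

# A 600x600 viewport at (1024, 1024) touches a 2x2 block of 512-px tiles.
print(tiles_for_viewport(1024, 1024, 600, 600))
```

Only those four tiles need to be cut from the JPEG 2000 file and sent, however large the full-disk image is.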

  17. Digital Imaging and Communications in Medicine Whole Slide Imaging Connectathon at Digital Pathology Association Pathology Visions 2017

    PubMed Central

    Clunie, David; Hosseinzadeh, Dan; Wintell, Mikael; De Mena, David; Lajara, Nieves; Garcia-Rojo, Marcial; Bueno, Gloria; Saligrama, Kiran; Stearrett, Aaron; Toomey, David; Abels, Esther; Apeldoorn, Frank Van; Langevin, Stephane; Nichols, Sean; Schmid, Joachim; Horchner, Uwe; Beckwith, Bruce; Parwani, Anil; Pantanowitz, Liron

    2018-01-01

As digital pathology systems for clinical diagnostic applications become mainstream, interoperability between systems from different vendors becomes critical. For the first time, multiple digital pathology vendors have publicly demonstrated the use of the Digital Imaging and Communications in Medicine (DICOM) standard file format and network protocol to communicate between separate whole slide acquisition, storage, and viewing components. Note that the use of DICOM for clinical diagnostic applications has yet to be validated in the United States. The successful demonstration shows that the DICOM standard is fundamentally sound, though many lessons were learned. These lessons will be incorporated as incremental improvements to the standard, used to develop more detailed profiles that constrain variation for specific use cases, and turned into educational material for implementers. Future Connectathon events will expand in scope to include more devices and vendors, more ambitious use cases such as laboratory information system integration and annotation for image analysis, and greater geographic diversity. Users should request DICOM features in all purchases and contracts. It is anticipated that the growth of DICOM-compliant manufacturers will also ease the path for DICOM for pathology to become a recognized standard, and with it the regulatory pathway for digital pathology products. PMID:29619278

  18. Fostering Student Engagement with Digital Microscopic Images Using ThingLink, an Image Annotation Program

    ERIC Educational Resources Information Center

    Appasamy, Pierette

    2018-01-01

    The teaching of histology has changed dramatically with virtual microscopy. Fewer students of histology spend significant time viewing slides on a microscope and instead study images available in digital slide sets, generally accessible via the internet. However, students must still be able to correctly identify the defining characteristics of…

  19. Virus Particle Detection by Convolutional Neural Network in Transmission Electron Microscopy Images.

    PubMed

    Ito, Eisuke; Sato, Takaaki; Sano, Daisuke; Utagawa, Etsuko; Kato, Tsuyoshi

    2018-06-01

A new computational method for the detection of virus particles in transmission electron microscopy (TEM) images is presented. Our approach is to use a convolutional neural network that transforms a TEM image to a probabilistic map that indicates where virus particles exist in the image. Our proposed approach automatically and simultaneously learns both the discriminative features and the classifier for virus particle detection, in contrast to existing methods based on handcrafted features, which yield many false positives and require several postprocessing steps. The detection performance of the proposed method was assessed on a dataset of TEM images containing feline calicivirus particles and compared with several existing detection methods, demonstrating state-of-the-art virus detection performance. Since our method is based on supervised learning, which requires both the input images and their corresponding annotations, it is primarily suited to the detection of already-known viruses. However, the method is highly flexible, and the convolutional network can adapt to any virus particle by learning automatically from an annotated dataset.
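Turning a probabilistic map into discrete particle detections typically means thresholding and keeping local maxima. A minimal sketch of that post-processing step (an illustrative stand-in, not the paper's pipeline):

```python
# Sketch: keep local maxima above a confidence threshold in a 2-D
# probability map, returning them as particle detections.
def detect_particles(prob_map, threshold=0.5):
    """Return (row, col) positions of local maxima in a 2-D probability map."""
    h, w = len(prob_map), len(prob_map[0])
    detections = []
    for r in range(h):
        for c in range(w):
            p = prob_map[r][c]
            if p < threshold:
                continue
            # Compare against the 8-neighbourhood (clipped at borders).
            neighbours = [prob_map[rr][cc]
                          for rr in range(max(0, r - 1), min(h, r + 2))
                          for cc in range(max(0, c - 1), min(w, c + 2))
                          if (rr, cc) != (r, c)]
            if all(p > n for n in neighbours):
                detections.append((r, c))
    return detections

prob = [[0.1, 0.2, 0.1, 0.0, 0.0],
        [0.2, 0.9, 0.2, 0.0, 0.0],
        [0.1, 0.2, 0.1, 0.2, 0.0],
        [0.0, 0.0, 0.2, 0.7, 0.2],
        [0.0, 0.0, 0.0, 0.2, 0.1]]
print(detect_particles(prob))  # [(1, 1), (3, 3)]
```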

  20. Hierarchy-associated semantic-rule inference framework for classifying indoor scenes

    NASA Astrophysics Data System (ADS)

    Yu, Dan; Liu, Peng; Ye, Zhipeng; Tang, Xianglong; Zhao, Wei

    2016-03-01

    Typically, the initial task of classifying indoor scenes is challenging, because the spatial layout and decoration of a scene can vary considerably. Recent efforts at classifying object relationships commonly depend on the results of scene annotation and predefined rules, making classification inflexible. Furthermore, annotation results are easily affected by external factors. Inspired by human cognition, a scene-classification framework was proposed using the empirically based annotation (EBA) and a match-over rule-based (MRB) inference system. The semantic hierarchy of images is exploited by EBA to construct rules empirically for MRB classification. The problem of scene classification is divided into low-level annotation and high-level inference from a macro perspective. Low-level annotation involves detecting the semantic hierarchy and annotating the scene with a deformable-parts model and a bag-of-visual-words model. In high-level inference, hierarchical rules are extracted to train the decision tree for classification. The categories of testing samples are generated from the parts to the whole. Compared with traditional classification strategies, the proposed semantic hierarchy and corresponding rules reduce the effect of a variable background and improve the classification performance. The proposed framework was evaluated on a popular indoor scene dataset, and the experimental results demonstrate its effectiveness.

  1. The LISS--a public database of common imaging signs of lung diseases for computer-aided detection and diagnosis research and medical education.

    PubMed

    Han, Guanghui; Liu, Xiabi; Han, Feifei; Santika, I Nyoman Tenaya; Zhao, Yanfeng; Zhao, Xinming; Zhou, Chunwu

    2015-02-01

Lung computed tomography (CT) imaging signs play important roles in the diagnosis of lung diseases. In this paper, we review the significance of CT imaging signs in disease diagnosis and determine the inclusion criteria for CT scans and CT imaging signs in our database. We develop software for annotating abnormal regions and design the storage scheme for CT images and annotation data. Then, we present a publicly available database of lung CT imaging signs, called LISS for short, which contains 271 CT scans with 677 abnormal regions. The 677 abnormal regions are divided into nine categories of common CT imaging signs of lung disease (CISLs). The ground truth for these CISL regions and the corresponding categories is provided. Furthermore, to make the database publicly available, all private data in the CT scans are eliminated or replaced with provisioned values. The main characteristic of our LISS database is that it is developed from the new perspective of CT imaging signs of lung diseases rather than the commonly considered lung nodules. Thus, it is promising for computer-aided detection and diagnosis research and for medical education.
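A storage scheme linking scans to categorized abnormal regions can be sketched in SQLite. The table layout below is an assumption for illustration only, not LISS's actual schema:

```python
# Sketch (assumed schema): one table per scan, one row per annotated
# abnormal region, each carrying its CISL category and ground-truth contour.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE scan   (scan_id INTEGER PRIMARY KEY, series_uid TEXT);
    CREATE TABLE region (region_id INTEGER PRIMARY KEY,
                         scan_id   INTEGER REFERENCES scan,
                         cisl      TEXT,   -- one of the nine sign categories
                         contour   TEXT);  -- ground-truth boundary as JSON
""")
conn.execute("INSERT INTO scan VALUES (1, 'anonymized-uid-001')")
conn.execute("INSERT INTO region VALUES (1, 1, 'cavity', '[[10,20],[11,21]]')")
count = conn.execute(
    "SELECT COUNT(*) FROM region WHERE cisl = 'cavity'").fetchone()[0]
print(count)  # 1
```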

  2. Polymeric endovascular strut and lumen detection algorithm for intracoronary optical coherence tomography images

    NASA Astrophysics Data System (ADS)

    Amrute, Junedh M.; Athanasiou, Lambros S.; Rikhtegar, Farhad; de la Torre Hernández, José M.; Camarero, Tamara García; Edelman, Elazer R.

    2018-03-01

Polymeric endovascular implants are the next step in minimally invasive vascular interventions. As an alternative to traditional metallic drug-eluting stents, these often-erodible scaffolds present opportunities and challenges for patients and clinicians. Theoretically, as they resorb and are absorbed over time, they obviate the long-term complications of permanent implants, but in the short term, visualization, and therefore positioning, is problematic. Polymeric scaffolds can only be fully imaged using optical coherence tomography (OCT), as they are relatively invisible via angiography, and segmentation of polymeric struts in OCT images is performed manually, a laborious and intractable procedure for large datasets. Traditional lumen detection methods using implant struts as boundary limits fail in images with polymeric implants. Therefore, it is necessary to develop an automated method to detect polymeric struts and luminal borders in OCT images; we present such a fully automated algorithm. Accuracy was validated using expert annotations on 1140 OCT images with a positive predictive value of 0.93 for strut detection and an R2 correlation coefficient of 0.94 between detected and expert-annotated lumen areas. The proposed algorithm allows for rapid, accurate, and automated detection of polymeric struts and the luminal border in OCT images.
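The two validation metrics reported above follow standard definitions, sketched here for concreteness (this is not the authors' code):

```python
# Positive predictive value for strut detection, and the R^2 coefficient
# between detected and expert-annotated lumen areas.
def ppv(true_positives, false_positives):
    return true_positives / (true_positives + false_positives)

def r_squared(expert, detected):
    mean = sum(expert) / len(expert)
    ss_res = sum((e - d) ** 2 for e, d in zip(expert, detected))
    ss_tot = sum((e - mean) ** 2 for e in expert)
    return 1 - ss_res / ss_tot

print(ppv(93, 7))  # 0.93
print(round(r_squared([5.0, 6.0, 7.0, 8.0], [5.1, 5.9, 7.2, 7.9]), 3))
```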

  3. Selective Convolutional Descriptor Aggregation for Fine-Grained Image Retrieval.

    PubMed

    Wei, Xiu-Shen; Luo, Jian-Hao; Wu, Jianxin; Zhou, Zhi-Hua

    2017-06-01

Deep convolutional neural network models pre-trained for the ImageNet classification task have been successfully adapted to tasks in other domains, such as texture description and object proposal generation, but these tasks require annotations for images in the new domain. In this paper, we focus on a novel and challenging task in the pure unsupervised setting: fine-grained image retrieval. Even with image labels, fine-grained images are difficult to classify, let alone the unsupervised retrieval task. We propose the selective convolutional descriptor aggregation (SCDA) method. The SCDA first localizes the main object in fine-grained images, a step that discards the noisy background and keeps useful deep descriptors. The selected descriptors are then aggregated and the dimensionality is reduced into a short feature vector using the best practices we found. The SCDA is unsupervised, using no image label or bounding box annotation. Experiments on six fine-grained data sets confirm the effectiveness of the SCDA for fine-grained image retrieval. In addition, visualization of the SCDA features shows that they correspond to visual attributes (even subtle ones), which might explain SCDA's high mean average precision in fine-grained retrieval. Moreover, on general image retrieval data sets, the SCDA achieves retrieval results comparable with state-of-the-art general image retrieval approaches.
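The select-then-aggregate idea can be sketched on toy descriptors. This simplified version (a mean-energy threshold followed by average and max pooling) is illustrative only, not the authors' implementation:

```python
# Sketch of the SCDA idea: keep only the deep descriptors at spatial
# positions whose aggregate activation exceeds the mean, then pool the
# survivors into one short vector.
def scda_aggregate(descriptors):
    """descriptors: list of equal-length feature vectors, one per position."""
    energies = [sum(d) for d in descriptors]
    mean_energy = sum(energies) / len(energies)
    # Positions with above-average activation are assumed to lie on the
    # main object; the rest are treated as background and discarded.
    kept = [d for d, e in zip(descriptors, energies) if e > mean_energy]
    dim = len(descriptors[0])
    avg = [sum(d[i] for d in kept) / len(kept) for i in range(dim)]
    mx = [max(d[i] for d in kept) for i in range(dim)]
    return avg + mx  # concatenated avg- and max-pooled descriptor

feats = [[0.1, 0.0], [0.9, 0.8], [0.2, 0.1], [0.7, 0.9]]
print(scda_aggregate(feats))
```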

  4. Integrating concept ontology and multitask learning to achieve more effective classifier training for multilevel image annotation.

    PubMed

    Fan, Jianping; Gao, Yuli; Luo, Hangzai

    2008-03-01

    In this paper, we have developed a new scheme for achieving multilevel annotations of large-scale images automatically. To achieve more sufficient representation of various visual properties of the images, both the global visual features and the local visual features are extracted for image content representation. To tackle the problem of huge intraconcept visual diversity, multiple types of kernels are integrated to characterize the diverse visual similarity relationships between the images more precisely, and a multiple kernel learning algorithm is developed for SVM image classifier training. To address the problem of huge interconcept visual similarity, a novel multitask learning algorithm is developed to learn the correlated classifiers for the sibling image concepts under the same parent concept and enhance their discrimination and adaptation power significantly. To tackle the problem of huge intraconcept visual diversity for the image concepts at the higher levels of the concept ontology, a novel hierarchical boosting algorithm is developed to learn their ensemble classifiers hierarchically. In order to assist users on selecting more effective hypotheses for image classifier training, we have developed a novel hyperbolic framework for large-scale image visualization and interactive hypotheses assessment. Our experiments on large-scale image collections have also obtained very positive results.
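A multiple-kernel similarity of the kind described above has a simple generic form: a convex combination of base kernels. The base kernels and weights below are illustrative assumptions, not the learned ones:

```python
# Sketch: combine several base kernels into one similarity function as a
# convex combination, the generic form used in multiple kernel learning.
import math

def rbf(gamma):
    return lambda x, y: math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def linear(x, y):
    return sum(a * b for a, b in zip(x, y))

def combined_kernel(kernels, weights):
    assert abs(sum(weights) - 1.0) < 1e-9  # convex combination
    return lambda x, y: sum(w * k(x, y) for k, w in zip(kernels, weights))

# In MKL the weights themselves are learned during SVM training; here
# they are fixed by hand for illustration.
k = combined_kernel([linear, rbf(0.5)], [0.6, 0.4])
print(round(k([1.0, 0.0], [1.0, 0.0]), 3))  # 1.0
```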

  5. EuCAP, a Eukaryotic Community Annotation Package, and its application to the rice genome

    PubMed Central

    Thibaud-Nissen, Françoise; Campbell, Matthew; Hamilton, John P; Zhu, Wei; Buell, C Robin

    2007-01-01

Background Despite improvements in tools for automated annotation of genome sequences, manual curation at the structural and functional level can provide an increased level of refinement to genome annotation. The Institute for Genomic Research Rice Genome Annotation (hereafter named the Osa1 Genome Annotation) is the product of an automated pipeline and, for this reason, will benefit from the input of biologists with expertise in rice and/or particular gene families. Leveraging knowledge from a dispersed community of scientists is a demonstrated way of improving a genome annotation. This requires tools that facilitate 1) the submission of gene annotation to an annotation project, 2) the review of the submitted models by project annotators, and 3) the incorporation of the submitted models in the ongoing annotation effort. Results We have developed the Eukaryotic Community Annotation Package (EuCAP), an annotation tool, and have applied it to the rice genome. The primary level of curation by community annotators (CA) has been the annotation of gene families. Annotation can be submitted by email or through the EuCAP Web Tool. The CA models are aligned to the rice pseudomolecules and the coordinates of these alignments, along with functional annotation, are stored in the MySQL EuCAP Gene Model database. Web pages displaying the alignments of the CA models to the Osa1 Genome models are automatically generated from the EuCAP Gene Model database. The alignments are reviewed by the project annotators (PAs) in the context of experimental evidence. Upon approval by the PAs, the CA models, along with the corresponding functional annotations, are integrated into the Osa1 Genome Annotation. The CA annotations, grouped by family, are displayed on the Community Annotation pages of the project website, as well as in the Community Annotation track of the Genome Browser. Conclusion We have applied EuCAP to rice. 
As of July 2007, the structural and/or functional annotation of 1,094 genes representing 57 families have been deposited and integrated into the current gene set. All of the EuCAP components are open-source, thereby allowing the implementation of EuCAP for the annotation of other genomes. EuCAP is available at . PMID:17961238

  6. Using textons to rank crystallization droplets by the likely presence of crystals

    PubMed Central

    Ng, Jia Tsing; Dekker, Carien; Kroemer, Markus; Osborne, Michael; von Delft, Frank

    2014-01-01

The visual inspection of crystallization experiments is an important yet time-consuming and subjective step in X-ray crystallography. Previously published studies have focused on automatically classifying crystallization droplets into distinct but ultimately arbitrary experiment outcomes; here, a method is described that instead ranks droplets by their likelihood of containing crystals or microcrystals, thereby prioritizing for visual inspection those images that are most likely to contain useful information. The use of textons is introduced to describe crystallization droplets objectively, allowing them to be scored with the posterior probability of a random forest classifier trained against droplets manually annotated for the presence or absence of crystals or microcrystals. Unlike multi-class classification, this two-class system lends itself naturally to unidirectional ranking, which is most useful for assisting sequential viewing because images can be arranged simply by using these scores: this places droplets with probable crystalline behaviour early in the viewing order. Using this approach, the top ten wells included at least one human-annotated crystal or microcrystal for 94% of the plates in a data set of 196 plates imaged with a Minstrel HT system. The algorithm is robustly transferable to at least one other imaging system: when the parameters trained from Minstrel HT images are applied to a data set imaged by the Rock Imager system, human-annotated crystals ranked in the top ten wells for 90% of the plates. Because rearranging images is fundamental to the approach, a custom viewer was written to seamlessly support such ranked viewing, along with another important output of the algorithm, namely the shape of the curve of scores, which is itself a useful overview of the behaviour of the plate; additional features with known usefulness were adopted from existing viewers. 
Evidence is presented that such ranked viewing of images allows faster but more accurate evaluation of drops, in particular for the identification of microcrystals. PMID:25286854
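The top-ten evaluation used above (did a human-annotated crystal land in the first ten viewing positions?) can be sketched directly; the function and data are illustrative, not the published pipeline:

```python
# Sketch: rank the wells of a plate by their crystal-probability score and
# check whether any human-annotated crystal appears in the top ten
# positions of the resulting viewing order.
def crystal_in_top10(scores, has_crystal):
    """scores / has_crystal: parallel per-well lists for one plate."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return any(has_crystal[i] for i in order[:10])

scores = [0.1, 0.95, 0.3, 0.2, 0.05, 0.8, 0.4, 0.15, 0.6, 0.25, 0.35, 0.9]
labels = [False] * 12
labels[1] = True  # the only annotated crystal has the highest score
print(crystal_in_top10(scores, labels))  # True
```

The fraction of plates for which this returns True is the 94% (Minstrel HT) and 90% (Rock Imager) figure reported in the abstract.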

  7. Large-scale inference of gene function through phylogenetic annotation of Gene Ontology terms: case study of the apoptosis and autophagy cellular processes.

    PubMed

    Feuermann, Marc; Gaudet, Pascale; Mi, Huaiyu; Lewis, Suzanna E; Thomas, Paul D

    2016-01-01

    We previously reported a paradigm for large-scale phylogenomic analysis of gene families that takes advantage of the large corpus of experimentally supported Gene Ontology (GO) annotations. This 'GO Phylogenetic Annotation' approach integrates GO annotations from evolutionarily related genes across ∼100 different organisms in the context of a gene family tree, in which curators build an explicit model of the evolution of gene functions. GO Phylogenetic Annotation models the gain and loss of functions in a gene family tree, which is used to infer the functions of uncharacterized (or incompletely characterized) gene products, even for human proteins that are relatively well studied. Here, we report our results from applying this paradigm to two well-characterized cellular processes, apoptosis and autophagy. This revealed several important observations with respect to GO annotations and how they can be used for function inference. Notably, we applied only a small fraction of the experimentally supported GO annotations to infer function in other family members. The majority of other annotations describe indirect effects, phenotypes or results from high throughput experiments. In addition, we show here how feedback from phylogenetic annotation leads to significant improvements in the PANTHER trees, the GO annotations and GO itself. Thus GO phylogenetic annotation both increases the quantity and improves the accuracy of the GO annotations provided to the research community. We expect these phylogenetically based annotations to be of broad use in gene enrichment analysis as well as other applications of GO annotations.Database URL: http://amigo.geneontology.org/amigo. © The Author(s) 2016. Published by Oxford University Press.

  8. SAS- Semantic Annotation Service for Geoscience resources on the web

    NASA Astrophysics Data System (ADS)

    Elag, M.; Kumar, P.; Marini, L.; Li, R.; Jiang, P.

    2015-12-01

There is a growing need for increased integration across the data and model resources that are disseminated on the web to advance their reuse across different earth science applications. Meaningful reuse of resources requires semantic metadata to realize the semantic web vision of pragmatic linkage and integration among resources. Semantic metadata associates standard metadata with resources to turn them into semantically-enabled resources on the web. However, the lack of a common standardized metadata framework, as well as the uncoordinated use of metadata fields across different geo-information systems, has led to a situation in which standards and related Standard Names abound. To address this need, we have designed SAS to provide a bridge between the core ontologies required to annotate resources and information systems in order to enable queries and analysis over annotation from a single environment (the web). SAS is one of the services provided by the Geosemantic framework, a decentralized semantic framework to support the integration between models and data and allow semantically heterogeneous resources to interact with minimal human intervention. Here we present the design of SAS and demonstrate its application for annotating data and models. First, we describe how predicates and their attributes are extracted from standards and ingested into the knowledge base of the Geosemantic framework. Then, we illustrate the application of SAS in annotating data managed by SEAD and annotating simulation models that have a web interface. SAS is a step in a broader approach to raise the quality of geoscience data and models published on the web and to allow users to better search, access, and use existing resources based on standard vocabularies that are encoded and published using semantic technologies.

  9. Aggregating and Predicting Sequence Labels from Crowd Annotations

    PubMed Central

    Nguyen, An T.; Wallace, Byron C.; Li, Junyi Jessy; Nenkova, Ani; Lease, Matthew

    2017-01-01

Despite sequences being core to NLP, scant work has considered how to handle noisy sequence labels from multiple annotators for the same text. Given such annotations, we consider two complementary tasks: (1) aggregating sequential crowd labels to infer a best single set of consensus annotations; and (2) using crowd annotations as training data for a model that can predict sequences in unannotated text. For aggregation, we propose a novel Hidden Markov Model variant. To predict sequences in unannotated text, we propose a neural approach using long short-term memory (LSTM) networks. We evaluate a suite of methods across two different applications and text genres: Named-Entity Recognition in news articles and Information Extraction from biomedical abstracts. Results show improvement over strong baselines. Our source code and data are available online. PMID:29093611
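As a baseline for the aggregation task (1), token-level majority voting over annotators is easy to sketch. This is a simple stand-in, not the paper's Hidden Markov Model variant:

```python
# Sketch: aggregate noisy sequence labels from several annotators by
# majority vote at each token position.
from collections import Counter

def aggregate(annotations):
    """annotations: list of label sequences, one per annotator."""
    return [Counter(labels).most_common(1)[0][0]
            for labels in zip(*annotations)]

crowd = [["B-PER", "I-PER", "O", "O"],
         ["B-PER", "O",     "O", "B-LOC"],
         ["B-PER", "I-PER", "O", "B-LOC"]]
print(aggregate(crowd))  # ['B-PER', 'I-PER', 'O', 'B-LOC']
```

Unlike the HMM variant, this vote ignores label transitions, so it can emit invalid tag sequences (e.g. an I- tag without a preceding B- tag), which is part of what the paper's model addresses.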

  10. Mental Health Manpower, Volume I: An Annotated Bibliography and Commentary, and Volume II: Recruitment, Training and Utilization - A Compilation of Articles, Surveys, and a Review of Applicable Literature.

    ERIC Educational Resources Information Center

    Klutch, Murray

    The study was designed to provide a base for mental health manpower planning. The first and principal section of Volume I is an annotated bibliography of applicable articles and books. An index lists items included in the bibliography according to subject and profession. A discussion of two conceptual approaches to alleviating the manpower…

  11. Collaborative Workspaces within Distributed Virtual Environments.

    DTIC Science & Technology

    1996-12-01

such as a text document, a 3D model, or a captured image using a collaborative workspace called the InPerson Whiteboard. The Whiteboard contains a...commands for editing objects drawn on the screen. Finally, when the call is completed, the Whiteboard can be saved to a file for future use. IRIS Annotator... use, and a shared whiteboard that includes a number of multimedia annotation tools. Both systems are also mindful of bandwidth limitations and can

  12. Global View of Mars Topography

    NASA Technical Reports Server (NTRS)

    2007-01-01

    [figure removed for brevity, see original site] Annotated Version

    This global map of Mars is based on topographical information collected by the Mars Orbiter Laser Altimeter instrument on NASA's Mars Global Surveyor orbiter. Illumination is from the upper right. The image width is approximately 18,000 kilometers (11,185 miles). Candor Chasma forms part of the large Martian canyon system named Valles Marineris. The location of Southwest Candor Chasma is indicated in the annotated version.

  13. Enhancement of the Shared Graphics Workspace.

    DTIC Science & Technology

    1987-12-31

participants to share videodisc images and computer graphics displayed in color and text and facsimile information displayed in black on amber. They...could annotate the information in up to five colors and print the annotated version at both sites, using a standard fax machine. The SGWS also used a fax...system to display a document, whether text or photo, the camera scans the document, digitizes the data, and sends it via direct memory access (DMA) to

  14. Categorizing biomedicine images using novel image features and sparse coding representation

    PubMed Central

    2013-01-01

Background Images embedded in biomedical publications carry rich information that often concisely summarizes key hypotheses adopted, methods employed, or results obtained in a published study. Therefore, they offer valuable clues for understanding the main content of a biomedical publication. Prior studies have pointed out the potential of mining images embedded in biomedical publications for automatically understanding and retrieving such images' associated source documents. Within the broad area of biomedical image processing, categorizing biomedical images is a fundamental step for building many advanced image analysis, retrieval, and mining applications. Similar to any automatic categorization effort, discriminative image features can provide the most crucial aid in the process. Method We observe that many images embedded in biomedical publications carry versatile annotation text. Based on the locations of and the spatial relationships between these text elements in an image, we thus propose some novel image features for image categorization purposes, which quantitatively characterize the spatial positions and distributions of text elements inside a biomedical image. We further adopt a sparse coding representation (SCR) based technique to categorize images embedded in biomedical publications by leveraging our newly proposed image features. Results We randomly selected 990 JPG-format images for use in our experiments, where 310 images were used as training samples and the rest were used as testing cases. We first segmented the 310 sample images following our proposed procedure. This step produced a total of 1035 sub-images. We then manually labeled all these sub-images according to the two-level hierarchical image taxonomy proposed by [1]. 
Among our annotation results, 316 are microscopy images, 126 are gel electrophoresis images, 135 are line charts, 156 are bar charts, 52 are spot charts, 25 are tables, 70 are flow charts, and the remaining 155 images are of the type "others". A series of experimental results is reported. First, the categorization results for each image class are presented, along with performance indexes such as precision, recall, and F-score. Second, we show that conventional image features and our proposed novel features yield different categorization performance. Third, we compare the accuracy of the support vector machine classification method with our proposed sparse representation classification method. Finally, our approach is compared with three peer classification methods, and the experimental results confirm its markedly improved performance. Conclusions Compared with conventional image features that do not exploit characteristics regarding text positions and distributions inside images embedded in biomedical publications, our proposed image features coupled with the SCR-based representation model exhibit superior performance for classifying biomedical images, as demonstrated in our comparative benchmark study. PMID:24565470
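Features summarizing where annotation text sits inside an image can be sketched as simple statistics over text bounding boxes. The exact features below are illustrative assumptions, not the paper's definitions:

```python
# Sketch: characterize the spatial positions and distributions of text
# elements by the mean and spread of their box centres, normalized by the
# image size, plus the fraction of the image covered by text.
def text_layout_features(boxes, img_w, img_h):
    """boxes: list of (x, y, w, h) text bounding boxes in pixels."""
    centres = [((x + w / 2) / img_w, (y + h / 2) / img_h)
               for x, y, w, h in boxes]
    mean_x = sum(c[0] for c in centres) / len(centres)
    mean_y = sum(c[1] for c in centres) / len(centres)
    spread_x = max(c[0] for c in centres) - min(c[0] for c in centres)
    spread_y = max(c[1] for c in centres) - min(c[1] for c in centres)
    coverage = sum(w * h for _, _, w, h in boxes) / (img_w * img_h)
    return [mean_x, mean_y, spread_x, spread_y, coverage]

# Two labels along the bottom edge, as in a typical bar chart: centres sit
# low (mean_y near 1), spread horizontally, with small coverage.
print(text_layout_features([(10, 180, 40, 15), (150, 180, 40, 15)], 200, 200))
```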

  15. Semantator: semantic annotator for converting biomedical text to linked data.

    PubMed

    Tao, Cui; Song, Dezhao; Sharma, Deepak; Chute, Christopher G

    2013-10-01

More than 80% of biomedical data is embedded in plain text. The unstructured nature of these text-based documents makes it challenging to easily browse and query the data of interest in them. One approach to facilitate browsing and querying biomedical text is to convert the plain text to a linked web of data, i.e., converting data originally in free text to structured formats with defined meta-level semantics. In this paper, we introduce Semantator (Semantic Annotator), a semantic-web-based environment for annotating data of interest in biomedical documents, browsing and querying the annotated data, and interactively refining annotation results if needed. Through Semantator, information of interest can be annotated either manually or semi-automatically using plug-in information extraction tools. The annotated results are stored in RDF and can be queried using the SPARQL query language. In addition, semantic reasoners can be directly applied to the annotated data for consistency checking and knowledge inference. Semantator has been released online and was used by the biomedical ontology community, which provided positive feedback. Our evaluation results indicated that (1) Semantator can perform the annotation functionalities as designed; (2) Semantator can be adopted in real applications in clinical and translational research; and (3) the annotated results using Semantator can be easily used in semantic-web-based reasoning tools for further inference. Copyright © 2013 Elsevier Inc. All rights reserved.
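The annotate-then-query workflow can be mimicked with a plain-Python triple store. This stand-in (with a wildcard pattern match in place of SPARQL) is illustrative only, not Semantator itself:

```python
# Sketch: annotations stored as RDF-style (subject, predicate, object)
# triples and retrieved by pattern matching, where None plays the role of
# a SPARQL variable.
triples = set()

def annotate(subject, predicate, obj):
    triples.add((subject, predicate, obj))

def query(s=None, p=None, o=None):
    """None acts as a wildcard, like a SPARQL variable."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

# Annotate one text span in a (hypothetical) document with a type and its
# surface text, then fetch everything known about that span.
annotate("doc1#span42", "rdf:type", "Disease")
annotate("doc1#span42", "ann:text", "diabetes mellitus")
print(sorted(query(s="doc1#span42")))
```

A real deployment would serialize such triples as RDF and hand them to a SPARQL engine and reasoner, as the abstract describes.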

  16. Open semantic annotation of scientific publications using DOMEO.

    PubMed

    Ciccarese, Paolo; Ocana, Marco; Clark, Tim

    2012-04-24

Our group has developed a useful shared software framework for performing, versioning, sharing and viewing Web annotations of a number of kinds, using an open representation model. The Domeo Annotation Tool was developed in tandem with this open model, the Annotation Ontology (AO). Development of both the Annotation Framework and the open model was driven by requirements of several different types of alpha users, including bench scientists and biomedical curators from university research labs, online scientific communities, publishing and pharmaceutical companies. Several use cases were incrementally implemented by the toolkit. These use cases in biomedical communications include personal note-taking, group document annotation, semantic tagging, claim-evidence-context extraction, reagent tagging, and curation of textmining results from entity extraction algorithms. We report on the Domeo user interface here. Domeo has been deployed in beta release as part of the NIH Neuroscience Information Framework (NIF, http://www.neuinfo.org) and is scheduled for production deployment in the NIF's next full release. Future papers will describe other aspects of this work in detail, including Annotation Framework Services and components for integrating with external textmining services, such as the NCBO Annotator web service, and with other textmining applications using the Apache UIMA framework.

  17. Open semantic annotation of scientific publications using DOMEO

    PubMed Central

    2012-01-01

    Background Our group has developed a useful shared software framework for performing, versioning, sharing and viewing Web annotations of a number of kinds, using an open representation model. Methods The Domeo Annotation Tool was developed in tandem with this open model, the Annotation Ontology (AO). Development of both the Annotation Framework and the open model was driven by requirements of several different types of alpha users, including bench scientists and biomedical curators from university research labs, online scientific communities, publishing and pharmaceutical companies. Several use cases were incrementally implemented by the toolkit. These use cases in biomedical communications include personal note-taking, group document annotation, semantic tagging, claim-evidence-context extraction, reagent tagging, and curation of textmining results from entity extraction algorithms. Results We report on the Domeo user interface here. Domeo has been deployed in beta release as part of the NIH Neuroscience Information Framework (NIF, http://www.neuinfo.org) and is scheduled for production deployment in the NIF’s next full release. Future papers will describe other aspects of this work in detail, including Annotation Framework Services and components for integrating with external textmining services, such as the NCBO Annotator web service, and with other textmining applications using the Apache UIMA framework. PMID:22541592

  18. A Deep and Autoregressive Approach for Topic Modeling of Multimodal Data.

    PubMed

    Zheng, Yin; Zhang, Yu-Jin; Larochelle, Hugo

    2016-06-01

    Topic modeling based on latent Dirichlet allocation (LDA) has been a framework of choice to deal with multimodal data, such as in image annotation tasks. Another popular approach to model the multimodal data is through deep neural networks, such as the deep Boltzmann machine (DBM). Recently, a new type of topic model called the Document Neural Autoregressive Distribution Estimator (DocNADE) was proposed and demonstrated state-of-the-art performance for text document modeling. In this work, we show how to successfully apply and extend this model to multimodal data, such as simultaneous image classification and annotation. First, we propose SupDocNADE, a supervised extension of DocNADE, that increases the discriminative power of the learned hidden topic features and show how to employ it to learn a joint representation from image visual words, annotation words and class label information. We test our model on the LabelMe and UIUC-Sports data sets and show that it compares favorably to other topic models. Second, we propose a deep extension of our model and provide an efficient way of training the deep model. Experimental results show that our deep model outperforms its shallow version and reaches state-of-the-art performance on the Multimedia Information Retrieval (MIR) Flickr data set.
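
    DocNADE's autoregressive factorization, p(document) = Π_i p(v_i | v_<i), can be sketched in pure Python. The weights below are random and untrained, so this only illustrates the computation of the hidden representation and the per-word softmax, not a learned model:

    ```python
    # Toy sketch of DocNADE's autoregressive document likelihood.
    # V = vocabulary size, H = hidden units; weights are random (untrained).
    import math
    import random

    random.seed(0)
    V, H = 5, 3
    W = [[random.uniform(-0.1, 0.1) for _ in range(H)] for _ in range(V)]  # word -> hidden
    U = [[random.uniform(-0.1, 0.1) for _ in range(V)] for _ in range(H)]  # hidden -> word
    c = [0.0] * H   # hidden bias
    b = [0.0] * V   # output bias

    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    def softmax(xs):
        m = max(xs)
        exps = [math.exp(x - m) for x in xs]
        z = sum(exps)
        return [e / z for e in exps]

    def doc_log_likelihood(doc):
        """log p(doc) = sum_i log p(v_i | v_<i); the hidden state h_i is
        computed from the running sum of embeddings of preceding words."""
        acc = list(c)
        ll = 0.0
        for v in doc:
            h = [sigmoid(a) for a in acc]
            logits = [b[w] + sum(U[j][w] * h[j] for j in range(H)) for w in range(V)]
            ll += math.log(softmax(logits)[v])
            for j in range(H):          # fold word v into the running sum
                acc[j] += W[v][j]
        return ll

    ll = doc_log_likelihood([0, 3, 3, 1])   # log p(document), always negative
    ```

    The supervised extension (SupDocNADE) adds a class-label term to the training objective; the autoregressive core stays the same.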

  19. Comprehensive cellular‐resolution atlas of the adult human brain

    PubMed Central

    Royall, Joshua J.; Sunkin, Susan M.; Ng, Lydia; Facer, Benjamin A.C.; Lesnar, Phil; Guillozet‐Bongaarts, Angie; McMurray, Bergen; Szafer, Aaron; Dolbeare, Tim A.; Stevens, Allison; Tirrell, Lee; Benner, Thomas; Caldejon, Shiella; Dalley, Rachel A.; Dee, Nick; Lau, Christopher; Nyhus, Julie; Reding, Melissa; Riley, Zackery L.; Sandman, David; Shen, Elaine; van der Kouwe, Andre; Varjabedian, Ani; Write, Michelle; Zollei, Lilla; Dang, Chinh; Knowles, James A.; Koch, Christof; Phillips, John W.; Sestan, Nenad; Wohnoutka, Paul; Zielke, H. Ronald; Hohmann, John G.; Jones, Allan R.; Bernard, Amy; Hawrylycz, Michael J.; Hof, Patrick R.; Fischl, Bruce

    2016-01-01

    ABSTRACT Detailed anatomical understanding of the human brain is essential for unraveling its functional architecture, yet current reference atlases have major limitations such as lack of whole‐brain coverage, relatively low image resolution, and sparse structural annotation. We present the first digital human brain atlas to incorporate neuroimaging, high‐resolution histology, and chemoarchitecture across a complete adult female brain, consisting of magnetic resonance imaging (MRI), diffusion‐weighted imaging (DWI), and 1,356 large‐format cellular resolution (1 µm/pixel) Nissl and immunohistochemistry anatomical plates. The atlas is comprehensively annotated for 862 structures, including 117 white matter tracts and several novel cyto‐ and chemoarchitecturally defined structures, and these annotations were transferred onto the matching MRI dataset. Neocortical delineations were done for sulci, gyri, and modified Brodmann areas to link macroscopic anatomical and microscopic cytoarchitectural parcellations. Correlated neuroimaging and histological structural delineation allowed fine feature identification in MRI data and subsequent structural identification in MRI data from other brains. This interactive online digital atlas is integrated with existing Allen Institute for Brain Science gene expression atlases and is publicly accessible as a resource for the neuroscience community. J. Comp. Neurol. 524:3127–3481, 2016. © 2016 The Authors The Journal of Comparative Neurology Published by Wiley Periodicals, Inc. PMID:27418273

  20. Time series modeling of live-cell shape dynamics for image-based phenotypic profiling.

    PubMed

    Gordonov, Simon; Hwang, Mun Kyung; Wells, Alan; Gertler, Frank B; Lauffenburger, Douglas A; Bathe, Mark

    2016-01-01

    Live-cell imaging can be used to capture spatio-temporal aspects of cellular responses that are not accessible to fixed-cell imaging. As the use of live-cell imaging continues to increase, new computational procedures are needed to characterize and classify the temporal dynamics of individual cells. For this purpose, here we present the general experimental-computational framework SAPHIRE (Stochastic Annotation of Phenotypic Individual-cell Responses) to characterize phenotypic cellular responses from time series imaging datasets. Hidden Markov modeling is used to infer and annotate morphological state and state-switching properties from image-derived cell shape measurements. Time series modeling is performed on each cell individually, making the approach broadly useful for analyzing asynchronous cell populations. Two-color fluorescent cells simultaneously expressing actin and nuclear reporters enabled us to profile temporal changes in cell shape following pharmacological inhibition of cytoskeleton-regulatory signaling pathways. Results are compared with existing approaches conventionally applied to fixed-cell imaging datasets, and indicate that time series modeling captures heterogeneous dynamic cellular responses that can improve drug classification and offer additional important insight into mechanisms of drug action. The software is available at http://saphire-hcs.org.
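
    The hidden Markov modeling step can be illustrated with a minimal Viterbi decoder over discretized shape observations. The two morphological states and all probabilities below are invented for illustration and are not taken from the paper:

    ```python
    # Minimal Viterbi decoding for a two-state HMM over cell-shape
    # observations (states and probabilities are illustrative only).
    def viterbi(obs, states, start_p, trans_p, emit_p):
        """Return the most likely hidden state sequence for `obs`."""
        path = {s: [s] for s in states}
        prob = {s: start_p[s] * emit_p[s][obs[0]] for s in states}
        for o in obs[1:]:
            new_prob, new_path = {}, {}
            for s in states:
                best_prev = max(states, key=lambda p: prob[p] * trans_p[p][s])
                new_prob[s] = prob[best_prev] * trans_p[best_prev][s] * emit_p[s][o]
                new_path[s] = path[best_prev] + [s]
            prob, path = new_prob, new_path
        best = max(states, key=lambda s: prob[s])
        return path[best]

    states = ("spread", "rounded")
    start_p = {"spread": 0.6, "rounded": 0.4}
    trans_p = {"spread": {"spread": 0.8, "rounded": 0.2},
               "rounded": {"spread": 0.3, "rounded": 0.7}}
    emit_p = {"spread": {"large": 0.7, "small": 0.3},
              "rounded": {"large": 0.2, "small": 0.8}}
    decoded = viterbi(["large", "large", "small", "small"], states,
                      start_p, trans_p, emit_p)
    # -> ["spread", "spread", "rounded", "rounded"]
    ```

    Fitting one such model per cell, as described above, is what allows asynchronous populations to be profiled individually.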

  1. A semi-automatic annotation tool for cooking video

    NASA Astrophysics Data System (ADS)

    Bianco, Simone; Ciocca, Gianluigi; Napoletano, Paolo; Schettini, Raimondo; Margherita, Roberto; Marini, Gianluca; Gianforme, Giorgio; Pantaleo, Giuseppe

    2013-03-01

    In order to create a cooking assistant application to guide users in the preparation of dishes relevant to their profile diets and food preferences, it is necessary to accurately annotate the video recipes, identifying and tracking the foods handled by the cook. These videos present particular annotation challenges, such as frequent occlusions and changes in food appearance. Manually annotating the videos is a time-consuming, tedious and error-prone task. Fully automatic tools that integrate computer vision algorithms to extract and identify the elements of interest are not error free, and false positive and false negative detections need to be corrected in a post-processing stage. We present an interactive, semi-automatic tool for the annotation of cooking videos that integrates computer vision techniques under the supervision of the user. The annotation accuracy is increased with respect to completely automatic tools and the human effort is reduced with respect to completely manual ones. The performance and usability of the proposed tool are evaluated on the basis of the time and effort required to annotate the same video sequences.

  2. Inductive creation of an annotation schema for manually indexing clinical conditions from emergency department reports

    PubMed Central

    Chapman, Wendy W.; Dowling, John N.

    2006-01-01

    Evaluating automated indexing applications requires comparing automatically indexed terms against manual reference standard annotations. However, there are no standard guidelines for determining which words from a textual document to include in manual annotations, and the vague task can result in substantial variation among manual indexers. We applied grounded theory to emergency department reports to create an annotation schema representing syntactic and semantic variables that could be annotated when indexing clinical conditions. We describe the annotation schema, which includes variables representing medical concepts (e.g., symptom, demographics), linguistic form (e.g., noun, adjective), and modifier types (e.g., anatomic location, severity). We measured the schema’s quality and found: (1) the schema was comprehensive enough to be applied to 20 unseen reports without changes to the schema; (2) agreement between author annotators applying the schema was high, with an F measure of 93%; and (3) an error analysis showed that the authors made complementary errors when applying the schema, demonstrating that the schema incorporates both linguistic and medical expertise. PMID:16230050

  3. CAMERA: An integrated strategy for compound spectra extraction and annotation of LC/MS data sets

    PubMed Central

    Kuhl, Carsten; Tautenhahn, Ralf; Böttcher, Christoph; Larson, Tony R.; Neumann, Steffen

    2013-01-01

    Liquid chromatography coupled to mass spectrometry is routinely used for metabolomics experiments. In contrast to the fairly routine and automated data acquisition steps, subsequent compound annotation and identification require extensive manual analysis and thus form a major bottleneck in data interpretation. Here we present CAMERA, a Bioconductor package integrating algorithms to extract compound spectra, annotate isotope and adduct peaks, and propose the accurate compound mass even in highly complex data. To evaluate the algorithms, we compared the annotation of CAMERA against a manually defined annotation for a mixture of known compounds spiked into a complex matrix at different concentrations. CAMERA successfully extracted accurate masses for 89.7% and 90.3% of the annotatable compounds in positive and negative ion mode, respectively. Furthermore, we present a novel annotation approach that combines spectral information of data acquired in opposite ion modes to further improve the annotation rate. We demonstrate the utility of CAMERA in two different, easily adoptable plant metabolomics experiments, where the application of CAMERA drastically reduced the amount of manual analysis. PMID:22111785
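
    Adduct annotation in the spirit of CAMERA can be sketched as follows: each observed m/z implies a neutral mass for each candidate adduct, and peaks whose implied neutral masses agree within a tolerance are grouped as one compound. The adduct table, peak list and tolerance below are illustrative:

    ```python
    # Sketch of adduct-based peak grouping (illustrative values; real
    # peak lists come from LC/MS feature extraction, e.g. via XCMS).
    PROTON = 1.007276

    ADDUCTS = {                 # mass shifts for singly charged positive ions
        "[M+H]+": PROTON,
        "[M+Na]+": 22.989218,
        "[M+K]+": 38.963158,
    }

    def annotate(peaks, tol=0.005):
        """Group peaks whose implied neutral masses agree within `tol` Da."""
        candidates = []
        for mz in peaks:
            for name, shift in ADDUCTS.items():
                candidates.append((mz - shift, name, mz))   # implied neutral mass
        candidates.sort()
        groups, group = [], [candidates[0]]
        for cand in candidates[1:]:
            if cand[0] - group[-1][0] <= tol:
                group.append(cand)
            else:
                if len(group) > 1:
                    groups.append(group)
                group = [cand]
        if len(group) > 1:
            groups.append(group)
        return groups

    # Glucose (M = 180.06339): its [M+H]+ and [M+Na]+ peaks group together,
    # while the unrelated peak at 365.105 stays unannotated.
    peaks = [181.070666, 203.052608, 365.105]
    groups = annotate(peaks)
    ```

    CAMERA additionally exploits chromatographic peak shape and isotope patterns before grouping; the mass-agreement test above is only the final consistency check.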

  4. Enhanced functionalities for annotating and indexing clinical text with the NCBO Annotator.

    PubMed

    Tchechmedjiev, Andon; Abdaoui, Amine; Emonet, Vincent; Melzi, Soumia; Jonnagaddala, Jitendra; Jonquet, Clement

    2018-06-01

    Second use of clinical data commonly involves annotating biomedical text with terminologies and ontologies. The National Center for Biomedical Ontology (NCBO) Annotator is a frequently used annotation service, originally designed for biomedical data but not very suitable for clinical text annotation. In order to add new functionalities to the NCBO Annotator without hosting or modifying the original Web service, we have designed a proxy architecture that enables seamless extensions by pre-processing the input text and parameters and post-processing the annotations. We have then implemented enhanced functionalities for annotating and indexing free text, such as scoring, detection of context (negation, experiencer, temporality), new output formats and coarse-grained concept recognition (with UMLS Semantic Groups). In this paper, we present the NCBO Annotator+, a Web service which incorporates these new functionalities, as well as a small set of evaluation results for concept recognition and clinical context detection on two standard evaluation tasks (CLEF eHealth 2017, SemEval 2014). The Annotator+ has been successfully integrated into the SIFR BioPortal platform, an implementation of NCBO BioPortal for French biomedical terminologies and ontologies, to annotate English text. A Web user interface is available for testing and ontology selection (http://bioportal.lirmm.fr/ncbo_annotatorplus); however, the Annotator+ is meant to be used through the Web service application programming interface (http://services.bioportal.lirmm.fr/ncbo_annotatorplus). The code is openly available, and we also provide a Docker packaging to enable easy local deployment for processing sensitive (e.g., clinical) data in-house (https://github.com/sifrproject). Contact: andon.tchechmedjiev@lirmm.fr. Supplementary data are available at Bioinformatics online.

  5. MitoFish and MitoAnnotator: A Mitochondrial Genome Database of Fish with an Accurate and Automatic Annotation Pipeline

    PubMed Central

    Iwasaki, Wataru; Fukunaga, Tsukasa; Isagozawa, Ryota; Yamada, Koichiro; Maeda, Yasunobu; Satoh, Takashi P.; Sado, Tetsuya; Mabuchi, Kohji; Takeshima, Hirohiko; Miya, Masaki; Nishida, Mutsumi

    2013-01-01

    MitoFish is a database of fish mitochondrial genomes (mitogenomes) that includes powerful and precise de novo annotations for mitogenome sequences. Fish occupy an important position in the evolution of vertebrates and the ecology of the hydrosphere, and mitogenomic sequence data have served as a rich source of information for resolving fish phylogenies and identifying new fish species. The importance of a mitogenomic database continues to grow at a rapid pace as massive amounts of mitogenomic data are generated with the advent of new sequencing technologies. A severe bottleneck seems likely to occur with regard to mitogenome annotation because of the overwhelming pace of data accumulation and the intrinsic difficulties in annotating sequences with degenerating transfer RNA structures, divergent start/stop codons of the coding elements, and the overlapping of adjacent elements. To ease this data backlog, we developed an annotation pipeline named MitoAnnotator. MitoAnnotator automatically annotates a fish mitogenome with a high degree of accuracy in approximately 5 min; thus, it is readily applicable to data sets of dozens of sequences. MitoFish also contains re-annotations of previously sequenced fish mitogenomes, enabling researchers to refer to them when they find annotations that are likely to be erroneous or while conducting comparative mitogenomic analyses. For users who need more information on the taxonomy, habitats, phenotypes, or life cycles of fish, MitoFish provides links to related databases. MitoFish and MitoAnnotator are freely available at http://mitofish.aori.u-tokyo.ac.jp/ (last accessed August 28, 2013); all of the data can be batch downloaded, and the annotation pipeline can be used via a web interface. PMID:23955518

  6. Multimodal MSI in Conjunction with Broad Coverage Spatially Resolved MS2 Increases Confidence in Both Molecular Identification and Localization.

    PubMed

    Veličković, Dušan; Chu, Rosalie K; Carrell, Alyssa A; Thomas, Mathew; Paša-Tolić, Ljiljana; Weston, David J; Anderton, Christopher R

    2018-01-02

    One critical aspect of mass spectrometry imaging (MSI) is the need to confidently identify detected analytes. While orthogonal tandem MS (e.g., LC-MS2) experiments from sample extracts can assist in annotating ions, the spatial information about these molecules is lost. Accordingly, this could lead to misleading conclusions, especially in cases where isobaric species exhibit different distributions within a sample. In this Technical Note, we employed a multimodal imaging approach, using matrix-assisted laser desorption/ionization (MALDI)-MSI and liquid extraction surface analysis (LESA)-MS2I, to confidently annotate and localize a broad range of metabolites involved in a tripartite symbiosis system of moss, cyanobacteria, and fungus. We found that the combination of these two imaging modalities generated very congruent ion images, providing the link between the highly accurate structural information offered by LESA and the high spatial resolution attainable by MALDI. These results demonstrate how this combined methodology could be very useful in differentiating metabolite routes in complex systems.

  7. v3NLP Framework: Tools to Build Applications for Extracting Concepts from Clinical Text

    PubMed Central

    Divita, Guy; Carter, Marjorie E.; Tran, Le-Thuy; Redd, Doug; Zeng, Qing T; Duvall, Scott; Samore, Matthew H.; Gundlapalli, Adi V.

    2016-01-01

    Introduction: Substantial amounts of clinically significant information are contained only within the narrative of the clinical notes in electronic medical records. The v3NLP Framework is a set of “best-of-breed” functionalities developed to transform this information into structured data for use in quality improvement, research, population health surveillance, and decision support. Background: MetaMap, cTAKES and similar well-known natural language processing (NLP) tools do not have sufficient scalability out of the box. The v3NLP Framework evolved out of the necessity to scale these tools up and to provide a framework to customize and tune techniques that fit a variety of tasks, including document classification, tuned concept extraction for specific conditions, patient classification, and information retrieval. Innovation: Beyond scalability, several v3NLP Framework-developed projects have been efficacy tested and benchmarked. While v3NLP Framework includes annotators, pipelines and applications, its functionalities enable developers to create novel annotators and to place annotators into pipelines and scaled applications. Discussion: The v3NLP Framework has been successfully utilized in many projects including general concept extraction, risk factors for homelessness among veterans, and identification of mentions of the presence of an indwelling urinary catheter. Projects as diverse as predicting colonization with methicillin-resistant Staphylococcus aureus and extracting references to military sexual trauma are being built using v3NLP Framework components. Conclusion: The v3NLP Framework is a set of functionalities and components that provide Java developers with the ability to create novel annotators and to place those annotators into pipelines and applications to extract concepts from clinical text. There are scale-up and scale-out functionalities to process large numbers of records. PMID:27683667
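
    The annotators-in-a-pipeline pattern this record describes can be sketched as below. The class names and the regex-based annotators are hypothetical illustrations of the pattern, not the v3NLP Framework API (which is Java-based):

    ```python
    # Hypothetical sketch of the annotator/pipeline pattern: each annotator
    # reads a shared document object and appends its own annotations.
    import re

    class RegexAnnotator:
        def __init__(self, label, pattern):
            self.label = label
            self.pattern = re.compile(pattern, re.IGNORECASE)

        def annotate(self, doc):
            for m in self.pattern.finditer(doc["text"]):
                doc["annotations"].append((self.label, m.start(), m.group()))
            return doc

    class Pipeline:
        def __init__(self, annotators):
            self.annotators = annotators

        def run(self, text):
            doc = {"text": text, "annotations": []}
            for annotator in self.annotators:
                doc = annotator.annotate(doc)
            return doc

    pipe = Pipeline([
        RegexAnnotator("DEVICE", r"urinary catheter"),
        RegexAnnotator("ORGANISM", r"staphylococcus aureus"),
    ])
    doc = pipe.run("Indwelling urinary catheter; culture grew Staphylococcus aureus.")
    ```

    Real concept extractors replace the regexes with dictionary lookup, tokenization and context detection, but compose the same way, which is what makes pipelines easy to scale out over many records.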

  8. A neotropical Miocene pollen database employing image-based search and semantic modeling

    PubMed Central

    Han, Jing Ginger; Cao, Hongfei; Barb, Adrian; Punyasena, Surangi W.; Jaramillo, Carlos; Shyu, Chi-Ren

    2014-01-01

    • Premise of the study: Digital microscopic pollen images are being generated with increasing speed and volume, producing opportunities to develop new computational methods that increase the consistency and efficiency of pollen analysis and provide the palynological community a computational framework for information sharing and knowledge transfer. • Methods: Mathematical methods were used to assign trait semantics (abstract morphological representations) of the images of neotropical Miocene pollen and spores. Advanced database-indexing structures were built to compare and retrieve similar images based on their visual content. A Web-based system was developed to provide novel tools for automatic trait semantic annotation and image retrieval by trait semantics and visual content. • Results: Mathematical models that map visual features to trait semantics can be used to annotate images with morphology semantics and to search image databases with improved reliability and productivity. Images can also be searched by visual content, providing users with customized emphases on traits such as color, shape, and texture. • Discussion: Content- and semantic-based image searches provide a powerful computational platform for pollen and spore identification. The infrastructure outlined provides a framework for building a community-wide palynological resource, streamlining the process of manual identification, analysis, and species discovery. PMID:25202648
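
    Content-based retrieval of the kind described can be sketched as ranking images by similarity of their feature vectors. The three-dimensional features and the choice of cosine similarity below are illustrative; the actual system indexes much richer visual descriptors:

    ```python
    # Illustrative content-based image retrieval: rank database images by
    # cosine similarity of feature vectors (made-up feature values).
    import math

    def cosine(a, b):
        """Cosine similarity of two equal-length feature vectors."""
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return dot / (na * nb) if na and nb else 0.0

    database = {
        "pollen_A": [0.90, 0.10, 0.40],
        "pollen_B": [0.20, 0.80, 0.50],
        "spore_C":  [0.88, 0.15, 0.42],
    }
    queryv = [0.90, 0.12, 0.41]
    ranked = sorted(database,
                    key=lambda k: cosine(queryv, database[k]),
                    reverse=True)
    ```

    The indexing structures mentioned in the record exist to avoid this brute-force scan over every image while returning the same top-ranked results.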

  9. Nearest neighbor 3D segmentation with context features

    NASA Astrophysics Data System (ADS)

    Hristova, Evelin; Schulz, Heinrich; Brosch, Tom; Heinrich, Mattias P.; Nickisch, Hannes

    2018-03-01

    Automated and fast multi-label segmentation of medical images is challenging and clinically important. This paper builds upon a supervised machine learning framework that uses training data sets with dense organ annotations and vantage point trees to classify voxels in unseen images based on similarity of binary feature vectors extracted from the data. Without explicit model knowledge, the algorithm is applicable to different modalities and organs, and achieves high accuracy. The method is successfully tested on 70 abdominal CT and 42 pelvic MR images. With respect to ground truth, an average Dice overlap score of 0.76 for the CT segmentation of liver, spleen and kidneys is achieved. The mean score for the MR delineation of bladder, bones, prostate and rectum is 0.65. Additionally, we benchmark several variations of the main components of the method and reduce the computation time by up to 47% without significant loss of accuracy. For a nearest neighbor method, the segmentation results are surprisingly accurate, robust, and both data and time efficient.
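
    The Dice overlap score used for evaluation has a standard definition, 2|A∩B| / (|A| + |B|); the binary masks below are made up for illustration:

    ```python
    # Dice overlap between a predicted and a ground-truth binary mask
    # (masks flattened to flat lists of 0/1 voxel labels).
    def dice(a, b):
        """Dice coefficient 2|A∩B| / (|A| + |B|); 1.0 for two empty masks."""
        inter = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
        size = sum(a) + sum(b)
        return 2 * inter / size if size else 1.0

    pred  = [1, 1, 1, 0, 0, 1]
    truth = [1, 1, 0, 0, 1, 1]
    score = dice(pred, truth)   # 2*3 / (4 + 4) = 0.75
    ```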

  10. Atmosphere-based image classification through luminance and hue

    NASA Astrophysics Data System (ADS)

    Xu, Feng; Zhang, Yujin

    2005-07-01

    In this paper a novel image classification system is proposed. Atmosphere plays an important role in establishing a scene's topic and in conveying the message behind the scene's story; it belongs to the abstract attribute level among semantic levels. First, five atmosphere semantic categories are defined according to the rules of photography and film grammar, together with global luminance and hue features. Then hierarchical SVM classifiers are applied: in each classification stage, the corresponding features are extracted and a trained linear SVM separates the data into two classes. After three stages of classification, the five atmosphere categories are obtained. Finally, the text annotation of the atmosphere semantics and the corresponding features is defined in Extensible Markup Language (XML) following MPEG-7, so that it can be integrated into further multimedia applications (such as searching, indexing and accessing of multimedia content). Experiments are performed on Corel images and film frames. The classification results confirm the effectiveness of the defined atmosphere semantic classes and the corresponding features.

  11. Australian sea-floor survey data, with images and expert annotations.

    PubMed

    Bewley, Michael; Friedman, Ariell; Ferrari, Renata; Hill, Nicole; Hovey, Renae; Barrett, Neville; Marzinelli, Ezequiel M; Pizarro, Oscar; Figueira, Will; Meyer, Lisa; Babcock, Russ; Bellchambers, Lynda; Byrne, Maria; Williams, Stefan B

    2015-01-01

    This Australian benthic data set (BENTHOZ-2015) consists of an expert-annotated set of georeferenced benthic images and associated sensor data, captured by an autonomous underwater vehicle (AUV) around Australia. This type of data is of interest to marine scientists studying benthic habitats and organisms. AUVs collect georeferenced images over an area with consistent illumination and altitude, and make it possible to generate broad scale, photo-realistic 3D maps. Marine scientists then typically spend several minutes on each of thousands of images, labeling substratum type and biota at a subset of points. Labels from four Australian research groups were combined using the CATAMI classification scheme, a hierarchical classification scheme based on taxonomy and morphology for scoring marine imagery. This data set consists of 407,968 expert labeled points from around the Australian coast, with associated images, geolocation and other sensor data. The robotic surveys that collected this data form part of Australia's Integrated Marine Observing System (IMOS) ongoing benthic monitoring program. There is reuse potential in marine science, robotics, and computer vision research.

  12. Australian sea-floor survey data, with images and expert annotations

    PubMed Central

    Bewley, Michael; Friedman, Ariell; Ferrari, Renata; Hill, Nicole; Hovey, Renae; Barrett, Neville; Pizarro, Oscar; Figueira, Will; Meyer, Lisa; Babcock, Russ; Bellchambers, Lynda; Byrne, Maria; Williams, Stefan B.

    2015-01-01

    This Australian benthic data set (BENTHOZ-2015) consists of an expert-annotated set of georeferenced benthic images and associated sensor data, captured by an autonomous underwater vehicle (AUV) around Australia. This type of data is of interest to marine scientists studying benthic habitats and organisms. AUVs collect georeferenced images over an area with consistent illumination and altitude, and make it possible to generate broad scale, photo-realistic 3D maps. Marine scientists then typically spend several minutes on each of thousands of images, labeling substratum type and biota at a subset of points. Labels from four Australian research groups were combined using the CATAMI classification scheme, a hierarchical classification scheme based on taxonomy and morphology for scoring marine imagery. This data set consists of 407,968 expert labeled points from around the Australian coast, with associated images, geolocation and other sensor data. The robotic surveys that collected this data form part of Australia's Integrated Marine Observing System (IMOS) ongoing benthic monitoring program. There is reuse potential in marine science, robotics, and computer vision research. PMID:26528396

  13. Australian sea-floor survey data, with images and expert annotations

    NASA Astrophysics Data System (ADS)

    Bewley, Michael; Friedman, Ariell; Ferrari, Renata; Hill, Nicole; Hovey, Renae; Barrett, Neville; Pizarro, Oscar; Figueira, Will; Meyer, Lisa; Babcock, Russ; Bellchambers, Lynda; Byrne, Maria; Williams, Stefan B.

    2015-10-01

    This Australian benthic data set (BENTHOZ-2015) consists of an expert-annotated set of georeferenced benthic images and associated sensor data, captured by an autonomous underwater vehicle (AUV) around Australia. This type of data is of interest to marine scientists studying benthic habitats and organisms. AUVs collect georeferenced images over an area with consistent illumination and altitude, and make it possible to generate broad scale, photo-realistic 3D maps. Marine scientists then typically spend several minutes on each of thousands of images, labeling substratum type and biota at a subset of points. Labels from four Australian research groups were combined using the CATAMI classification scheme, a hierarchical classification scheme based on taxonomy and morphology for scoring marine imagery. This data set consists of 407,968 expert labeled points from around the Australian coast, with associated images, geolocation and other sensor data. The robotic surveys that collected this data form part of Australia's Integrated Marine Observing System (IMOS) ongoing benthic monitoring program. There is reuse potential in marine science, robotics, and computer vision research.

  14. Automatic cerebrospinal fluid segmentation in non-contrast CT images using a 3D convolutional network

    NASA Astrophysics Data System (ADS)

    Patel, Ajay; van de Leemput, Sil C.; Prokop, Mathias; van Ginneken, Bram; Manniesing, Rashindra

    2017-03-01

    Segmentation of anatomical structures is fundamental in the development of computer aided diagnosis systems for cerebral pathologies. Manual annotations are laborious, time consuming and subject to human error and observer variability. Accurate quantification of cerebrospinal fluid (CSF) can be employed as a morphometric measure for diagnosis and patient outcome prediction. However, segmenting CSF in non-contrast CT images is complicated by low soft tissue contrast and image noise. In this paper we propose a state-of-the-art method using a multi-scale three-dimensional (3D) fully convolutional neural network (CNN) to automatically segment all CSF within the cranial cavity. The method is trained on a small dataset comprised of four manually annotated cerebral CT images. Quantitative evaluation of a separate test dataset of four images shows a mean Dice similarity coefficient of 0.87 ± 0.01 and mean absolute volume difference of 4.77 ± 2.70%. The average prediction time was 68 seconds. Our method allows for fast and fully automated 3D segmentation of cerebral CSF in non-contrast CT, and shows promising results despite a limited amount of training data.

  15. Rapid storage and retrieval of genomic intervals from a relational database system using nested containment lists

    PubMed Central

    Wiley, Laura K.; Sivley, R. Michael; Bush, William S.

    2013-01-01

    Efficient storage and retrieval of genomic annotations based on range intervals is necessary, given the amount of data produced by next-generation sequencing studies. The indexing strategies of relational database systems (such as MySQL) greatly inhibit their use in genomic annotation tasks. This has led to the development of stand-alone applications that are dependent on flat-file libraries. In this work, we introduce MyNCList, an implementation of the NCList data structure within a MySQL database. MyNCList enables the storage, update and rapid retrieval of genomic annotations from the convenience of a relational database system. Range-based annotations of 1 million variants are retrieved in under a minute, making this approach feasible for whole-genome annotation tasks. Database URL: https://github.com/bushlab/mynclist PMID:23894185

  16. Rapid storage and retrieval of genomic intervals from a relational database system using nested containment lists.

    PubMed

    Wiley, Laura K; Sivley, R Michael; Bush, William S

    2013-01-01

    Efficient storage and retrieval of genomic annotations based on range intervals is necessary, given the amount of data produced by next-generation sequencing studies. The indexing strategies of relational database systems (such as MySQL) greatly inhibit their use in genomic annotation tasks. This has led to the development of stand-alone applications that are dependent on flat-file libraries. In this work, we introduce MyNCList, an implementation of the NCList data structure within a MySQL database. MyNCList enables the storage, update and rapid retrieval of genomic annotations from the convenience of a relational database system. Range-based annotations of 1 million variants are retrieved in under a minute, making this approach feasible for whole-genome annotation tasks. Database URL: https://github.com/bushlab/mynclist.
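
    The nested containment list structure behind MyNCList can be sketched in pure Python. MyNCList persists the equivalent structure in MySQL tables; this in-memory sketch only illustrates the build and query logic:

    ```python
    # Sketch of a nested containment list (NCList): intervals sorted by
    # (start asc, end desc); an interval contained in the previous one
    # becomes its child, so whole subtrees can be skipped during queries.
    def build_nclist(intervals):
        """Build the nesting tree from (start, end) interval tuples."""
        ivs = sorted(intervals, key=lambda iv: (iv[0], -iv[1]))
        top, stack = [], []
        for iv in ivs:
            node = {"iv": iv, "children": []}
            while stack and stack[-1]["iv"][1] < iv[1]:   # not containing: pop
                stack.pop()
            (stack[-1]["children"] if stack else top).append(node)
            stack.append(node)
        return top

    def query(nodes, start, end, hits=None):
        """Collect intervals overlapping the half-open range [start, end).
        Children are contained in their parent, so if a node misses the
        query range its entire subtree is skipped."""
        if hits is None:
            hits = []
        for node in nodes:
            s, e = node["iv"]
            if s < end and start < e:
                hits.append(node["iv"])
                query(node["children"], start, end, hits)
        return hits

    nc = build_nclist([(0, 10), (2, 5), (3, 4), (6, 9), (12, 15)])
    hits = query(nc, 4, 7)   # overlaps of [4, 7)
    ```

    The subtree-skipping property is what the MySQL implementation recovers with its indexing scheme, which is why range retrieval stays fast at whole-genome scale.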

  17. Do we need annotation experts? A case study in celiac disease classification.

    PubMed

    Kwitt, Roland; Hegenbart, Sebastian; Rasiwasia, Nikhil; Vécsei, Andreas; Uhl, Andreas

    2014-01-01

    Inference of clinically-relevant findings from the visual appearance of images has become an essential part of processing pipelines for many problems in medical imaging. Typically, a sufficient amount of labeled training data is assumed to be available, provided by domain experts. However, acquisition of this data is usually a time-consuming and expensive endeavor. In this work, we ask the question if, for certain problems, expert knowledge is actually required. In fact, we investigate the impact of letting non-expert volunteers annotate a database of endoscopy images which are then used to assess the absence/presence of celiac disease. Contrary to previous approaches, we are not interested in algorithms that can handle the label noise. Instead, we present compelling empirical evidence that label noise can be compensated by a sufficiently large corpus of training data, labeled by the non-experts.

  18. MIND: modality independent neighbourhood descriptor for multi-modal deformable registration.

    PubMed

    Heinrich, Mattias P; Jenkinson, Mark; Bhushan, Manav; Matin, Tahreema; Gleeson, Fergus V; Brady, Sir Michael; Schnabel, Julia A

    2012-10-01

    Deformable registration of images obtained from different modalities remains a challenging task in medical image analysis. This paper addresses this important problem and proposes a modality independent neighbourhood descriptor (MIND) for both linear and deformable multi-modal registration. Based on the similarity of small image patches within one image, it aims to extract the distinctive structure in a local neighbourhood, which is preserved across modalities. The descriptor is based on the concept of image self-similarity, which has been introduced for non-local means filtering for image denoising. It is able to distinguish between different types of features such as corners, edges and homogeneously textured regions. MIND is robust to the most considerable differences between modalities: non-functional intensity relations, image noise and non-uniform bias fields. The multi-dimensional descriptor can be efficiently computed in a dense fashion across the whole image and provides point-wise local similarity across modalities based on the absolute or squared difference between descriptors, making it applicable for a wide range of transformation models and optimisation algorithms. We use the sum of squared differences of the MIND representations of the images as a similarity metric within a symmetric non-parametric Gauss-Newton registration framework. In principle, MIND would be applicable to the registration of arbitrary modalities. In this work, we apply and validate it for the registration of clinical 3D thoracic CT scans between inhale and exhale as well as the alignment of 3D CT and MRI scans. Experimental results show the advantages of MIND over state-of-the-art techniques such as conditional mutual information and entropy images, with respect to clinically annotated landmark locations. Copyright © 2012 Elsevier B.V. All rights reserved.
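A drastically simplified 2D version of the descriptor can be written down directly from the abstract: patch-wise self-similarity distances to a few neighbours, mapped through exp(-D/V), and compared between images by sum of squared differences. The patch size, offset set, and normalisation below are assumptions for illustration (and numpy is assumed available), not the published parameterisation.

```python
import numpy as np

def patch_ssd(img, dy, dx, patch=1):
    """Sum of squared differences between the patch at each pixel and the
    patch at the pixel offset by (dy, dx), with edge padding."""
    H, W = img.shape
    P = patch + max(abs(dy), abs(dx))
    pad = np.pad(img, P, mode='edge')
    out = np.zeros((H, W))
    for py in range(-patch, patch + 1):
        for px in range(-patch, patch + 1):
            a = pad[P + py:P + py + H, P + px:P + px + W]
            b = pad[P + dy + py:P + dy + py + H, P + dx + px:P + dx + px + W]
            out += (a - b) ** 2
    return out

def mind(img, offsets=((0, 1), (1, 0), (0, -1), (-1, 0)), patch=1):
    """Per-pixel self-similarity responses exp(-D/V), normalised to max 1."""
    img = img.astype(float)
    d = np.stack([patch_ssd(img, dy, dx, patch) for dy, dx in offsets])
    v = d.mean(axis=0) + 1e-9           # crude local variance estimate
    m = np.exp(-d / v)
    return m / m.max(axis=0)

def mind_distance(a, b):
    """Modality-independent dissimilarity: mean SSD between descriptors."""
    return float(((mind(a) - mind(b)) ** 2).mean())
```

Even this toy version exhibits the key property the paper exploits: inverting an image's intensities leaves all patch distances, and hence the descriptor, unchanged, so the distance is robust to non-functional intensity relations between modalities.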

  19. Sky Detection in Hazy Image.

    PubMed

    Song, Yingchao; Luo, Haibo; Ma, Junkai; Hui, Bin; Chang, Zheng

    2018-04-01

Sky detection plays an essential role in various computer vision applications. Most existing sky detection approaches, trained on ideal datasets, may lose efficacy under unfavorable conditions such as adverse weather and lighting. In this paper, a novel algorithm for sky detection in hazy images is proposed from the perspective of probing the density of haze. We address the problem with an image segmentation step followed by region-level classification. To characterize the sky of hazy scenes, we introduce several haze-relevant features that reflect perceptual haze density and scene depth. Based on these features, the sky is separated by two imbalanced SVM classifiers and a similarity measurement. Moreover, a sky dataset (named HazySky) with 500 annotated hazy images is built for model training and performance evaluation. To evaluate the performance of our method, we conducted extensive experiments on both our HazySky dataset and the SkyFinder dataset. The results demonstrate that our method achieves higher detection accuracy than previous methods, not only in hazy scenes but also under other weather conditions.

  20. Clinical applications of textural analysis in non-small cell lung cancer.

    PubMed

    Phillips, Iain; Ajaz, Mazhar; Ezhil, Veni; Prakash, Vineet; Alobaidli, Sheaka; McQuaid, Sarah J; South, Christopher; Scuffham, James; Nisbet, Andrew; Evans, Philip

    2018-01-01

    Lung cancer is the leading cause of cancer mortality worldwide. Treatment pathways include regular cross-sectional imaging, generating large data sets which present intriguing possibilities for exploitation beyond standard visual interpretation. This additional data mining has been termed "radiomics" and includes semantic and agnostic approaches. Textural analysis (TA) is an example of the latter, and uses a range of mathematically derived features to describe an image or region of an image. Often TA is used to describe a suspected or known tumour. TA is an attractive tool as large existing image sets can be submitted to diverse techniques for data processing, presentation, interpretation and hypothesis testing with annotated clinical outcomes. There is a growing anthology of published data using different TA techniques to differentiate between benign and malignant lung nodules, differentiate tissue subtypes of lung cancer, prognosticate and predict outcome and treatment response, as well as predict treatment side effects and potentially aid radiotherapy planning. The aim of this systematic review is to summarize the current published data and understand the potential future role of TA in managing lung cancer.
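As a concrete example of an agnostic textural feature, grey-level co-occurrence matrix (GLCM) statistics such as contrast, energy, and homogeneity are among the classic features used in TA of lung images. The quantisation level and feature choice below are illustrative assumptions, not a specific study's protocol:

```python
def glcm(img, dy, dx, levels):
    """Grey-level co-occurrence matrix for one pixel offset (dy, dx),
    normalised to a joint probability table."""
    H, W = len(img), len(img[0])
    m = [[0.0] * levels for _ in range(levels)]
    n = 0
    for y in range(H):
        for x in range(W):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < H and 0 <= x2 < W:
                m[img[y][x]][img[y2][x2]] += 1
                n += 1
    return [[v / n for v in row] for row in m]

def texture_features(img, dy=0, dx=1, levels=4):
    """A few classic Haralick-style features derived from the GLCM."""
    p = glcm(img, dy, dx, levels)
    pairs = [(i, j) for i in range(levels) for j in range(levels)]
    return {
        "contrast":    sum(p[i][j] * (i - j) ** 2 for i, j in pairs),
        "energy":      sum(p[i][j] ** 2 for i, j in pairs),
        "homogeneity": sum(p[i][j] / (1 + abs(i - j)) for i, j in pairs),
    }
```

A homogeneous region yields zero contrast and maximal energy, while a rapidly varying region does the opposite — exactly the kind of mathematically derived description of a tumour region that TA pipelines feed into outcome models.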

  1. A curated collection of tissue microarray images and clinical outcome data of prostate cancer patients

    PubMed Central

    Zhong, Qing; Guo, Tiannan; Rechsteiner, Markus; Rüschoff, Jan H.; Rupp, Niels; Fankhauser, Christian; Saba, Karim; Mortezavi, Ashkan; Poyet, Cédric; Hermanns, Thomas; Zhu, Yi; Moch, Holger; Aebersold, Ruedi; Wild, Peter J.

    2017-01-01

    Microscopy image data of human cancers provide detailed phenotypes of spatially and morphologically intact tissues at single-cell resolution, thus complementing large-scale molecular analyses, e.g., next generation sequencing or proteomic profiling. Here we describe a high-resolution tissue microarray (TMA) image dataset from a cohort of 71 prostate tissue samples, which was hybridized with bright-field dual colour chromogenic and silver in situ hybridization probes for the tumour suppressor gene PTEN. These tissue samples were digitized and supplemented with expert annotations, clinical information, statistical models of PTEN genetic status, and computer source codes. For validation, we constructed an additional TMA dataset for 424 prostate tissues, hybridized with FISH probes for PTEN, and performed survival analysis on a subset of 339 radical prostatectomy specimens with overall, disease-specific and recurrence-free survival (maximum 167 months). For application, we further produced 6,036 image patches derived from two whole slides. Our curated collection of prostate cancer data sets provides reuse potential for both biomedical and computational studies. PMID:28291248

  2. Automated detection of microaneurysms using scale-adapted blob analysis and semi-supervised learning.

    PubMed

    Adal, Kedir M; Sidibé, Désiré; Ali, Sharib; Chaum, Edward; Karnowski, Thomas P; Mériaudeau, Fabrice

    2014-04-01

Despite several attempts, automated detection of microaneurysms (MAs) from digital fundus images remains an open issue, owing to the subtle appearance of MAs against the surrounding tissues. In this paper, the microaneurysm detection problem is modeled as finding interest regions or blobs in an image, and an automatic local-scale selection technique is presented. Several scale-adapted region descriptors are introduced to characterize these blob regions. A semi-supervised learning approach, which requires few manually annotated examples, is also proposed to train a classifier that can detect true MAs. The developed system is built using only a few manually labeled and a large number of unlabeled retinal color fundus images. The performance of the overall system is evaluated on the Retinopathy Online Challenge (ROC) competition database. A competition performance measure (CPM) of 0.364 shows the competitiveness of the proposed system against state-of-the-art techniques, as well as the applicability of the proposed features to fundus image analysis. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  3. Sky Detection in Hazy Image

    PubMed Central

    Song, Yingchao; Luo, Haibo; Ma, Junkai; Hui, Bin; Chang, Zheng

    2018-01-01

Sky detection plays an essential role in various computer vision applications. Most existing sky detection approaches, trained on ideal datasets, may lose efficacy under unfavorable conditions such as adverse weather and lighting. In this paper, a novel algorithm for sky detection in hazy images is proposed from the perspective of probing the density of haze. We address the problem with an image segmentation step followed by region-level classification. To characterize the sky of hazy scenes, we introduce several haze-relevant features that reflect perceptual haze density and scene depth. Based on these features, the sky is separated by two imbalanced SVM classifiers and a similarity measurement. Moreover, a sky dataset (named HazySky) with 500 annotated hazy images is built for model training and performance evaluation. To evaluate the performance of our method, we conducted extensive experiments on both our HazySky dataset and the SkyFinder dataset. The results demonstrate that our method achieves higher detection accuracy than previous methods, not only in hazy scenes but also under other weather conditions. PMID:29614778

  4. Teacher-to-Teacher: An Annotated Bibliography on DNA and Genetic Engineering.

    ERIC Educational Resources Information Center

    Mertens, Thomas R., Comp.

    1984-01-01

    Presented is an annotated bibliography of 24 books on DNA and genetic engineering. Areas considered in these books include: basic biological concepts to help understand advances in genetic engineering; applications of genetic engineering; social, legal, and moral issues of genetic engineering; and historical aspects leading to advances in…

  5. The Chicago American Indian Community, 1893-1988. Annotated Bibliography and Guide to Sources in Chicago.

    ERIC Educational Resources Information Center

    Beck, David

    This annotated bibliography identifies and describes documentary evidence of Chicago's American Indian population since the 1893 World's Columbian Exposition. Sources include studies and reports generated by Indian community organizations and agencies, community newsletters, newspapers, oral histories, grant applications, personal papers, and…

  6. Resources for Improving Principal Effectiveness. Annotated Bibliography of Packaged Training Programs.

    ERIC Educational Resources Information Center

    Gaddy, C. Stephen

    This annotated bibliography of commercially prepared training materials for management and leadership development programs offers 10 topical sections of references applicable to school principal training. Entries were selected by using the following criteria: (1) programs dealing too specifically with management in sales, manufacturing, finance,…

  7. CommWalker: correctly evaluating modules in molecular networks in light of annotation bias.

    PubMed

    Luecken, M D; Page, M J T; Crosby, A J; Mason, S; Reinert, G; Deane, C M

    2018-03-15

    Detecting novel functional modules in molecular networks is an important step in biological research. In the absence of gold standard functional modules, functional annotations are often used to verify whether detected modules/communities have biological meaning. However, as we show, the uneven distribution of functional annotations means that such evaluation methods favor communities of well-studied proteins. We propose a novel framework for the evaluation of communities as functional modules. Our proposed framework, CommWalker, takes communities as inputs and evaluates them in their local network environment by performing short random walks. We test CommWalker's ability to overcome annotation bias using input communities from four community detection methods on two protein interaction networks. We find that modules accepted by CommWalker are similarly co-expressed as those accepted by current methods. Crucially, CommWalker performs well not only in well-annotated regions, but also in regions otherwise obscured by poor annotation. CommWalker community prioritization both faithfully captures well-validated communities and identifies functional modules that may correspond to more novel biology. The CommWalker algorithm is freely available at opig.stats.ox.ac.uk/resources or as a docker image on the Docker Hub at hub.docker.com/r/lueckenmd/commwalker/. deane@stats.ox.ac.uk. Supplementary data are available at Bioinformatics online.
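The random-walk evaluation can be pictured with a small sketch: score a community by how much annotation its local network environment carries, sampled with short random walks started from its members. This is a loose illustration of the idea only, not the CommWalker implementation (which is available at the URLs above); the scoring function is my assumption.

```python
import random

def walk_coverage(graph, annotated, community, walk_len=3, n_walks=200, seed=1):
    """Fraction of nodes met on short random walks from community members
    that carry a functional annotation — a toy 'local environment' score.

    graph:     dict mapping node -> set of neighbour nodes
    annotated: set of nodes that have functional annotations
    community: set of nodes forming the candidate module
    """
    rng = random.Random(seed)
    hits = total = 0
    for _ in range(n_walks):
        node = rng.choice(sorted(community))
        for _ in range(walk_len):
            node = rng.choice(sorted(graph[node]))  # step to a random neighbour
            hits += node in annotated
            total += 1
    return hits / total
```

Evaluating the neighbourhood rather than the community alone is what mitigates the bias the abstract describes: a module of poorly annotated proteins can still score well if its surrounding network region is functionally coherent.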

  8. Opacity annotation of diffuse lung diseases using deep convolutional neural network with multi-channel information

    NASA Astrophysics Data System (ADS)

    Mabu, Shingo; Kido, Shoji; Hashimoto, Noriaki; Hirano, Yasushi; Kuremoto, Takashi

    2018-02-01

This research proposes a multi-channel deep convolutional neural network (DCNN) for computer-aided diagnosis (CAD) that classifies normal and abnormal opacities of diffuse lung diseases in computed tomography (CT) images. Because CT images are grayscale, a DCNN usually takes a single input channel. This research instead uses a multi-channel DCNN in which each channel corresponds to either the original raw image or an image transformed by a preprocessing technique. The information obtainable from raw images alone is limited, and previous work has suggested that image preprocessing contributes to improved classification accuracy; the combination of original and preprocessed images is therefore expected to yield higher accuracy. The proposed method realizes region of interest (ROI)-based opacity annotation. We used lung CT images taken at Yamaguchi University Hospital, Japan, divided into 32 × 32 ROI images. The ROIs contain six kinds of opacities: consolidation, ground-glass opacity (GGO), emphysema, honeycombing, nodular, and normal. The aim of the proposed method is to classify each ROI into one of the six opacities (classes). The DCNN structure is based on the VGG network, which secured the first and second places in ImageNet ILSVRC-2014. Experimental results show that the classification accuracy of the proposed method was better than that of the conventional single-channel method, with a significant difference between them.
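Assembling such a multi-channel input is straightforward. Below is a hedged numpy sketch in which the raw ROI is stacked with two plausible preprocessed views, histogram equalisation and gradient magnitude; the paper does not specify its channels here, so these two transforms are assumptions for illustration.

```python
import numpy as np

def multi_channel_input(roi):
    """Stack a grayscale ROI with two preprocessed views into a (3, H, W)
    array suitable as a multi-channel DCNN input. The choice of the two
    preprocessing channels is a hypothetical example."""
    roi = roi.astype(float)
    # Channel 2: histogram equalisation via the empirical CDF.
    flat = np.sort(roi.ravel())
    cdf = np.searchsorted(flat, roi, side='right') / roi.size
    # Channel 3: gradient magnitude from finite differences.
    gy, gx = np.gradient(roi)
    grad = np.hypot(gy, gx)

    def norm(c):
        span = c.max() - c.min()
        return (c - c.min()) / span if span else np.zeros_like(c)

    return np.stack([norm(roi), cdf, norm(grad)])
```

Each channel is rescaled to [0, 1] so that no single transform dominates the first convolutional layer, mirroring the motivation that raw intensities alone carry limited information.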

  9. Information Hiding: an Annotated Bibliography

    DTIC Science & Technology

    1999-04-13

parameters needed for reconstruction are enciphered using DES. The encrypted image is hidden in a cover image. [153] 074115, 'Watermarking algorithm ... authors present a block-based watermarking algorithm for digital images. The D.C.T. of the block is increased by a certain value. Quality control is ... includes evaluation of the watermark robustness and the subjective visual image quality. Two algorithms use the frequency domain while the two others use

  10. A Virtual Microscope for Academic Medical Education: The Pate Project

    PubMed Central

    Hundt, Christian; Schmitt, Volker H; Schömer, Elmar; Kirkpatrick, C James

    2015-01-01

    Background Whole-slide imaging (WSI) has become more prominent and continues to gain in importance in student teaching. Applications with different scope have been developed. Many of these applications have either technical or design shortcomings. Objective To design a survey to determine student expectations of WSI applications for teaching histological and pathological diagnosis. To develop a new WSI application based on the findings of the survey. Methods A total of 216 students were questioned about their experiences and expectations of WSI applications, as well as favorable and undesired features. The survey included 14 multiple choice and two essay questions. Based on the survey, we developed a new WSI application called Pate utilizing open source technologies. Results The survey sample included 216 students—62.0% (134) women and 36.1% (78) men. Out of 216 students, 4 (1.9%) did not disclose their gender. The best-known preexisting WSI applications included Mainzer Histo Maps (199/216, 92.1%), Histoweb Tübingen (16/216, 7.4%), and Histonet Ulm (8/216, 3.7%). Desired features for the students were latitude in the slides (190/216, 88.0%), histological (191/216, 88.4%) and pathological (186/216, 86.1%) annotations, points of interest (181/216, 83.8%), background information (146/216, 67.6%), and auxiliary informational texts (113/216, 52.3%). By contrast, a discussion forum was far less important (9/216, 4.2%) for the students. Conclusions The survey revealed that the students appreciate a rich feature set, including WSI functionality, points of interest, auxiliary informational texts, and annotations. The development of Pate was significantly influenced by the findings of the survey. Although Pate currently has some issues with the Zoomify file format, it could be shown that Web technologies are capable of providing a high-performance WSI experience, as well as a rich feature set. PMID:25963527

  11. A 3D Image Filter for Parameter-Free Segmentation of Macromolecular Structures from Electron Tomograms

    PubMed Central

    Ali, Rubbiya A.; Landsberg, Michael J.; Knauth, Emily; Morgan, Garry P.; Marsh, Brad J.; Hankamer, Ben

    2012-01-01

    3D image reconstruction of large cellular volumes by electron tomography (ET) at high (≤5 nm) resolution can now routinely resolve organellar and compartmental membrane structures, protein coats, cytoskeletal filaments, and macromolecules. However, current image analysis methods for identifying in situ macromolecular structures within the crowded 3D ultrastructural landscape of a cell remain labor-intensive, time-consuming, and prone to user-bias and/or error. This paper demonstrates the development and application of a parameter-free, 3D implementation of the bilateral edge-detection (BLE) algorithm for the rapid and accurate segmentation of cellular tomograms. The performance of the 3D BLE filter has been tested on a range of synthetic and real biological data sets and validated against current leading filters—the pseudo 3D recursive and Canny filters. The performance of the 3D BLE filter was found to be comparable to or better than that of both the 3D recursive and Canny filters while offering the significant advantage that it requires no parameter input or optimisation. Edge widths as little as 2 pixels are reproducibly detected with signal intensity and grey scale values as low as 0.72% above the mean of the background noise. The 3D BLE thus provides an efficient method for the automated segmentation of complex cellular structures across multiple scales for further downstream processing, such as cellular annotation and sub-tomogram averaging, and provides a valuable tool for the accurate and high-throughput identification and annotation of 3D structural complexity at the subcellular level, as well as for mapping the spatial and temporal rearrangement of macromolecular assemblies in situ within cellular tomograms. PMID:22479430

  12. A novel method for efficient archiving and retrieval of biomedical images using MPEG-7

    NASA Astrophysics Data System (ADS)

    Meyer, Joerg; Pahwa, Ash

    2004-10-01

    Digital archiving and efficient retrieval of radiological scans have become critical steps in contemporary medical diagnostics. Since more and more images and image sequences (single scans or video) from various modalities (CT/MRI/PET/digital X-ray) are now available in digital formats (e.g., DICOM-3), hospitals and radiology clinics need to implement efficient protocols capable of managing the enormous amounts of data generated daily in a typical clinical routine. We present a method that appears to be a viable way to eliminate the tedious step of manually annotating image and video material for database indexing. MPEG-7 is a new framework that standardizes the way images are characterized in terms of color, shape, and other abstract, content-related criteria. A set of standardized descriptors that are automatically generated from an image is used to compare an image to other images in a database, and to compute the distance between two images for a given application domain. Text-based database queries can be replaced with image-based queries using MPEG-7. Consequently, image queries can be conducted without any prior knowledge of the keys that were used as indices in the database. Since the decoding and matching steps are not part of the MPEG-7 standard, this method also enables searches that were not planned by the time the keys were generated.
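The descriptor-matching step can be illustrated with a coarse colour histogram, in the spirit of MPEG-7's colour descriptors. The normative MPEG-7 extraction and matching procedures are considerably more involved; this is a toy stand-in to show how image-based queries replace text keys:

```python
def color_histogram(pixels, bins=4):
    """Coarse RGB histogram over an image's pixels, normalised so the
    descriptor is independent of image size."""
    hist = [0.0] * (bins ** 3)
    step = 256 // bins
    for r, g, b in pixels:
        hist[(r // step) * bins * bins + (g // step) * bins + (b // step)] += 1
    n = len(pixels)
    return [h / n for h in hist]

def descriptor_distance(h1, h2):
    """L1 distance between two descriptors; smaller means more similar
    content, so a query image can be ranked against a database without
    any textual index keys."""
    return sum(abs(a - b) for a, b in zip(h1, h2))
```

A query then amounts to computing the descriptor of the query image once and returning the database entries with the smallest distance — no prior knowledge of the database's indexing keys is needed.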

  13. AIRS Maps from Space Processing Software

    NASA Technical Reports Server (NTRS)

    Thompson, Charles K.; Licata, Stephen J.

    2012-01-01

    This software package processes Atmospheric Infrared Sounder (AIRS) Level 2 swath standard product geophysical parameters, and generates global, colorized, annotated maps. It automatically generates daily and multi-day averaged colorized and annotated maps of various AIRS Level 2 swath geophysical parameters. It also generates AIRS input data sets for Eyes on Earth, Puffer-sphere, and Magic Planet. This program is tailored to AIRS Level 2 data products. It re-projects data into 1/4-degree grids that can be combined and averaged for any number of days. The software scales and colorizes global grids utilizing AIRS-specific color tables, and annotates images with title and color bar. This software can be tailored for use with other swath data products for the purposes of visualization.
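The re-projection step described above amounts to binning swath samples into 1/4-degree cells and averaging the samples (possibly spanning several days) that fall in each cell. A minimal sketch, with the cell-indexing convention (rows from -90° latitude, columns from -180° longitude) assumed for illustration:

```python
def grid_quarter_degree(obs):
    """Bin (lat, lon, value) swath samples onto a 1/4-degree global grid
    and average all samples falling in each cell."""
    acc = {}   # (row, col) -> [running sum, sample count]
    for lat, lon, val in obs:
        cell = (int((lat + 90.0) // 0.25), int((lon + 180.0) // 0.25))
        slot = acc.setdefault(cell, [0.0, 0])
        slot[0] += val
        slot[1] += 1
    return {cell: s / n for cell, (s, n) in acc.items()}
```

Because the accumulator keeps sums and counts rather than means, grids from any number of days can be combined before the final division, which is how multi-day averaged maps are produced.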

  14. AGORA : Organellar genome annotation from the amino acid and nucleotide references.

    PubMed

    Jung, Jaehee; Kim, Jong Im; Jeong, Young-Sik; Yi, Gangman

    2018-03-29

Next-generation sequencing (NGS) technologies have led to the accumulation of high-throughput sequence data from various organisms in biology. To apply gene annotation of organellar genomes across organisms, more optimized tools for functional gene annotation are required. Almost all existing gene annotation tools focus mainly on the chloroplast genome of land plants or the mitochondrial genome of animals. We have developed a web application, AGORA, for fast, user-friendly, and improved annotation of organellar genomes. AGORA annotates genes based on a BLAST-based homology search and clustering with selected reference sequences from the NCBI database or user-defined uploaded data. AGORA can annotate the functional genes in almost all mitochondrial and plastid genomes of eukaryotes. Gene annotation of a genome with an exon-intron structure within a gene or an inverted repeat region is also available. It provides the start and end positions of each gene, BLAST results compared with the reference sequence, and visualization of the gene map by OGDRAW. Users can freely use the software, and the accessible URL is https://bigdata.dongguk.edu/gene_project/AGORA/. The main module of the tool is implemented in Python and PHP, and the web page is built with HTML and CSS to support all browsers. gangman@dongguk.edu.

  15. Cross-organism learning method to discover new gene functionalities.

    PubMed

    Domeniconi, Giacomo; Masseroli, Marco; Moro, Gianluca; Pinoli, Pietro

    2016-04-01

Knowledge of gene and protein functions is paramount for the understanding of physiological and pathological biological processes, as well as in the development of new drugs and therapies. Analyses for biomedical knowledge discovery greatly benefit from the availability of gene and protein functional feature descriptions expressed through controlled terminologies and ontologies, i.e., of gene and protein biomedical controlled annotations. In the last years, several databases of such annotations have become available; yet, these valuable annotations are incomplete, include errors and only some of them represent highly reliable human curated information. Computational techniques able to reliably predict new gene or protein annotations with an associated likelihood value are thus paramount. Here, we propose a novel cross-organisms learning approach to reliably predict new functionalities for the genes of an organism based on the known controlled annotations of the genes of another, evolutionarily related and better studied, organism. We leverage a new representation of the annotation discovery problem and a random perturbation of the available controlled annotations to allow the application of supervised algorithms to predict with good accuracy unknown gene annotations. Taking advantage of the numerous gene annotations available for a well-studied organism, our cross-organisms learning method creates and trains better prediction models, which can then be applied to predict new gene annotations of a target organism. We tested and compared our method with the equivalent single organism approach on different gene annotation datasets of five evolutionarily related organisms (Homo sapiens, Mus musculus, Bos taurus, Gallus gallus and Dictyostelium discoideum). Results show both the usefulness of the perturbation method of available annotations for better prediction model training and a great improvement of the cross-organism models with respect to the single-organism ones, without influence of the evolutionary distance between the considered organisms. The generated ranked lists of reliably predicted annotations, which describe novel gene functionalities and have an associated likelihood value, are very valuable both to complement available annotations, for better coverage in biomedical knowledge discovery analyses, and to quicken the annotation curation process, by focusing it on the prioritized novel annotations predicted. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  16. Reliability of Using Retinal Vascular Fractal Dimension as a Biomarker in the Diabetic Retinopathy Detection

    PubMed Central

    Zhang, Jiong; Bekkers, Erik; Abbasi-Sureshjani, Samaneh

    2016-01-01

The retinal fractal dimension (FD) is a measure of vasculature branching pattern complexity. FD has been considered a potential biomarker for the detection of several diseases like diabetes and hypertension. However, conflicting findings have been reported in the literature regarding the association between this biomarker and diseases. In this paper, we examine the stability of the FD measurement with respect to (1) different vessel annotations obtained from human observers, (2) automatic segmentation methods, (3) various regions of interest, (4) accuracy of vessel segmentation methods, and (5) different imaging modalities. Our results demonstrate that the relative errors for the measurement of FD are significant and FD varies considerably according to the image quality, modality, and the technique used for measuring it. Automated and semiautomated methods for the measurement of FD are not stable enough, which makes FD a deceptive biomarker in quantitative clinical applications. PMID:27703803

  17. Building Structured Personal Health Records from Photographs of Printed Medical Records.

    PubMed

    Li, Xiang; Hu, Gang; Teng, Xiaofei; Xie, Guotong

    2015-01-01

Personal health records (PHRs) provide patient-centric healthcare by making health records accessible to patients. In China, it is very difficult for individuals to access electronic health records. Instead, individuals can easily obtain the printed copies of their own medical records, such as prescriptions and lab test reports, from hospitals. In this paper, we propose a practical approach to extract structured data from printed medical records photographed by mobile phones. An optical character recognition (OCR) pipeline is performed to recognize text in a document photo, which addresses the problems of low image quality and content complexity by image pre-processing and multiple OCR engine synthesis. A series of annotation algorithms that support flexible layouts are then used to identify the document type, entities of interest, and entity correlations, from which a structured PHR document is built. The proposed approach was applied to real-world medical records to demonstrate its effectiveness and applicability.
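The annotation stage after OCR can be sketched as pattern-based entity extraction. The line patterns, unit list, and record schema below are hypothetical stand-ins for illustration; the paper's actual layout-aware annotation algorithms are not reproduced here.

```python
import re

# Hypothetical lab-report line pattern; a real system would need many
# more rules and layout cues to cover flexible document formats.
TEST_LINE = re.compile(
    r"(?P<name>[A-Za-z ]+?)\s+(?P<value>\d+(?:\.\d+)?)\s*(?P<unit>mg/dL|mmol/L|%)"
)

def annotate_lab_report(ocr_text):
    """Pull (test name, value, unit) entities out of OCR'd report text and
    assemble a structured record, tolerating stray spacing from recognition."""
    entities = []
    for line in ocr_text.splitlines():
        m = TEST_LINE.search(line)
        if m:
            entities.append({
                "test": m.group("name").strip(),
                "value": float(m.group("value")),
                "unit": m.group("unit"),
            })
    return {"type": "lab_report", "entities": entities}
```

Lines the patterns cannot interpret are simply skipped, which is one pragmatic way to cope with residual OCR noise after pre-processing and engine synthesis.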

  18. Building Structured Personal Health Records from Photographs of Printed Medical Records

    PubMed Central

    Li, Xiang; Hu, Gang; Teng, Xiaofei; Xie, Guotong

    2015-01-01

Personal health records (PHRs) provide patient-centric healthcare by making health records accessible to patients. In China, it is very difficult for individuals to access electronic health records. Instead, individuals can easily obtain the printed copies of their own medical records, such as prescriptions and lab test reports, from hospitals. In this paper, we propose a practical approach to extract structured data from printed medical records photographed by mobile phones. An optical character recognition (OCR) pipeline is performed to recognize text in a document photo, which addresses the problems of low image quality and content complexity by image pre-processing and multiple OCR engine synthesis. A series of annotation algorithms that support flexible layouts are then used to identify the document type, entities of interest, and entity correlations, from which a structured PHR document is built. The proposed approach was applied to real-world medical records to demonstrate its effectiveness and applicability. PMID:26958219

  19. Reliability of Using Retinal Vascular Fractal Dimension as a Biomarker in the Diabetic Retinopathy Detection.

    PubMed

    Huang, Fan; Dashtbozorg, Behdad; Zhang, Jiong; Bekkers, Erik; Abbasi-Sureshjani, Samaneh; Berendschot, Tos T J M; Ter Haar Romeny, Bart M

    2016-01-01

The retinal fractal dimension (FD) is a measure of vasculature branching pattern complexity. FD has been considered a potential biomarker for the detection of several diseases like diabetes and hypertension. However, conflicting findings have been reported in the literature regarding the association between this biomarker and diseases. In this paper, we examine the stability of the FD measurement with respect to (1) different vessel annotations obtained from human observers, (2) automatic segmentation methods, (3) various regions of interest, (4) accuracy of vessel segmentation methods, and (5) different imaging modalities. Our results demonstrate that the relative errors for the measurement of FD are significant and FD varies considerably according to the image quality, modality, and the technique used for measuring it. Automated and semiautomated methods for the measurement of FD are not stable enough, which makes FD a deceptive biomarker in quantitative clinical applications.
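The FD in question is typically estimated by box counting on a segmented vessel map, which also hints at why the measurement is unstable: the result is a regression slope that shifts with segmentation quality and the region of interest. A minimal box-counting estimator (the scale set is chosen for illustration):

```python
from math import log

def box_count_dimension(points, sizes=(1, 2, 4, 8, 16)):
    """Estimate the fractal dimension of a set of (x, y) pixels by box
    counting: the least-squares slope of log N(s) against log(1/s),
    where N(s) is the number of s-by-s boxes the set touches."""
    xs, ys = [], []
    for s in sizes:
        boxes = {(x // s, y // s) for x, y in points}
        xs.append(log(1.0 / s))
        ys.append(log(len(boxes)))
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return (sum((a - mx) * (b - my) for a, b in zip(xs, ys))
            / sum((a - mx) ** 2 for a in xs))
```

A filled region scores near 2 and a thin curve near 1; a vessel tree falls in between, and small changes to the point set — exactly what different annotators or segmenters produce — move the fitted slope, consistent with the instability the abstract reports.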

  20. UWGSP4: an imaging and graphics superworkstation and its medical applications

    NASA Astrophysics Data System (ADS)

    Jong, Jing-Ming; Park, Hyun Wook; Eo, Kilsu; Kim, Min-Hwan; Zhang, Peng; Kim, Yongmin

    1992-05-01

    UWGSP4 is configured with a parallel architecture for image processing and a pipelined architecture for computer graphics. The system's peak performance is 1,280 MFLOPS for image processing and over 200,000 Gouraud shaded 3-D polygons per second for graphics. The simulated sustained performance is about 50% of the peak performance in general image processing. Most of the 2-D image processing functions are efficiently vectorized and parallelized in UWGSP4. A performance of 770 MFLOPS in convolution and 440 MFLOPS in FFT is achieved. The real-time cine display, up to 32 frames of 1280 X 1024 pixels per second, is supported. In 3-D imaging, the update rate for the surface rendering is 10 frames of 20,000 polygons per second; the update rate for the volume rendering is 6 frames of 128 X 128 X 128 voxels per second. The system provides 1280 X 1024 X 32-bit double frame buffers and one 1280 X 1024 X 8-bit overlay buffer for supporting realistic animation, 24-bit true color, and text annotation. A 1280 X 1024- pixel, 66-Hz noninterlaced display screen with 1:1 aspect ratio can be windowed into the frame buffer for the display of any portion of the processed image or graphics.

  1. Science Activity Planner for the MER Mission

    NASA Technical Reports Server (NTRS)

    Norris, Jeffrey S.; Crockett, Thomas M.; Fox, Jason M.; Joswig, Joseph C.; Powell, Mark W.; Shams, Khawaja S.; Torres, Recaredo J.; Wallick, Michael N.; Mittman, David S.

    2008-01-01

    The Maestro Science Activity Planner is a computer program that assists human users in planning operations of the Mars Exploration Rover (MER) mission and visualizing scientific data returned from the MER rovers. Relative to its predecessors, this program is more powerful and easier to use. This program is built on the Java Eclipse open-source platform around a Web-browser-based user-interface paradigm to provide an intuitive user interface to Mars rovers and landers. This program affords a combination of advanced display and simulation capabilities. For example, a map view of terrain can be generated from images acquired by the High Resolution Imaging Science Experiment (HiRISE) instrument aboard the Mars Reconnaissance Orbiter spacecraft and overlaid with images from a navigation camera (more precisely, a stereoscopic pair of cameras) aboard a rover, and an interactive, annotated rover traverse path can be incorporated into the overlay. It is also possible to construct an overhead perspective mosaic image of terrain from navigation-camera images. This program can be adapted to similar use on other outer-space missions and is potentially adaptable to numerous terrestrial applications involving analysis of data, operations of robots, and planning of such operations for acquisition of scientific data.

  2. Ontology-guided organ detection to retrieve web images of disease manifestation: towards the construction of a consumer-based health image library.

    PubMed

    Chen, Yang; Ren, Xiaofeng; Zhang, Guo-Qiang; Xu, Rong

    2013-01-01

    Visual information is a crucial aspect of medical knowledge. Building a comprehensive medical image base, in the spirit of the Unified Medical Language System (UMLS), would greatly benefit patient education and self-care. However, collection and annotation of such a large-scale image base is challenging. Our objective was to combine visual object detection techniques with medical ontology to automatically mine web photos and retrieve a large number of disease manifestation images with minimal manual labeling effort. As a proof of concept, we first learnt five organ detectors on three detection scales for eyes, ears, lips, hands, and feet. Given a disease, we used information from the UMLS to select affected body parts, ran the pretrained organ detectors on web images, and combined the detection outputs to retrieve disease images. Compared with a supervised image retrieval approach that requires training images for every disease, our ontology-guided approach exploits shared visual information of body parts across diseases. In retrieving 2220 web images of 32 diseases, we reduced manual labeling effort to 15.6% while improving the average precision by 3.9%, from 77.7% to 81.6%. For 40.6% of the diseases, we improved the precision by 10%. The results confirm the concept that the web is a feasible source for automatic disease image retrieval for health image database construction. Our approach requires only a small amount of manual effort to collect complex disease images and to annotate them with standard medical ontology terms.

  3. Using GO-WAR for mining cross-ontology weighted association rules.

    PubMed

    Agapito, Giuseppe; Cannataro, Mario; Guzzi, Pietro Hiram; Milano, Marianna

    2015-07-01

    The Gene Ontology (GO) is a structured repository of concepts (GO terms) that are associated with one or more gene products. The process of association is referred to as annotation. The relevance and the specificity of both GO terms and annotations are evaluated by a measure defined as information content (IC). The analysis of annotated data is thus an important challenge for bioinformatics, and different approaches of analysis exist. Among those, the use of association rules (AR) may provide useful knowledge, and it has been used in some applications, e.g. improving the quality of annotations. Nevertheless, classical association rule algorithms take into account neither the source of the annotation nor its importance, yielding candidate rules with low IC. This paper presents GO-WAR (Gene Ontology-based Weighted Association Rules), a methodology for extracting weighted association rules. GO-WAR can extract association rules with a high level of IC without loss of support and confidence from a dataset of annotated data. A case study applying GO-WAR to publicly available GO annotation datasets demonstrates that our method outperforms current state-of-the-art approaches. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  4. A Selected Annotated Bibliography on the Analysis of Water Resources System, Volume 2.

    ERIC Educational Resources Information Center

    Kriss, Carol; And Others

    Presented is an annotated bibliography of some recent selected publications pertaining to the application of systems analysis techniques for defining and evaluating alternative solutions to water resource problems. Both subject and author indices are provided. Keywords are listed at the end of each abstract. The abstracted material emphasizes the…

  5. A Selected Annotated Bibliography on the Analysis of Water Resource Systems.

    ERIC Educational Resources Information Center

    Gysi, Marshall; And Others

    Presented is an annotated bibliography of some selected publications pertaining to the application of systems analysis techniques to water resource problems. The majority of the references included in this bibliography have been published within the last five years. About half of the entries have informative abstracts and keywords following the…

  6. Patome: a database server for biological sequence annotation and analysis in issued patents and published patent applications.

    PubMed

    Lee, Byungwook; Kim, Taehyung; Kim, Seon-Kyu; Lee, Kwang H; Lee, Doheon

    2007-01-01

    With the advent of automated and high-throughput techniques, the number of patent applications containing biological sequences has been increasing rapidly. However, they have attracted relatively little attention compared to other sequence resources. We have built a database server called Patome, which contains biological sequence data disclosed in patents and published applications, as well as their analysis information. The analysis is divided into two steps. The first is an annotation step in which the disclosed sequences were annotated with RefSeq database. The second is an association step where the sequences were linked to Entrez Gene, OMIM and GO databases, and their results were saved as a gene-patent table. From the analysis, we found that 55% of human genes were associated with patenting. The gene-patent table can be used to identify whether a particular gene or disease is related to patenting. Patome is available at http://www.patome.org/; the information is updated bimonthly.

  7. Patome: a database server for biological sequence annotation and analysis in issued patents and published patent applications

    PubMed Central

    Lee, Byungwook; Kim, Taehyung; Kim, Seon-Kyu; Lee, Kwang H.; Lee, Doheon

    2007-01-01

    With the advent of automated and high-throughput techniques, the number of patent applications containing biological sequences has been increasing rapidly. However, they have attracted relatively little attention compared to other sequence resources. We have built a database server called Patome, which contains biological sequence data disclosed in patents and published applications, as well as their analysis information. The analysis is divided into two steps. The first is an annotation step in which the disclosed sequences were annotated with RefSeq database. The second is an association step where the sequences were linked to Entrez Gene, OMIM and GO databases, and their results were saved as a gene–patent table. From the analysis, we found that 55% of human genes were associated with patenting. The gene–patent table can be used to identify whether a particular gene or disease is related to patenting. Patome is available at http://www.patome.org/; the information is updated bimonthly. PMID:17085479

  8. Proposal for a common nomenclature for fragment ions in mass spectra of lipids

    PubMed Central

    Hartler, Jürgen; Christiansen, Klaus; Gallego, Sandra F.; Peng, Bing; Ahrends, Robert

    2017-01-01

    Advances in mass spectrometry-based lipidomics have in recent years prompted efforts to standardize the annotation of the vast number of lipid molecules that can be detected in biological systems. These efforts have focused on cataloguing, naming and drawing chemical structures of intact lipid molecules, but have provided no guidelines for annotation of lipid fragment ions detected using tandem and multi-stage mass spectrometry, albeit these fragment ions are mandatory for structural elucidation and high confidence lipid identification, especially in high throughput lipidomics workflows. Here we propose a nomenclature for the annotation of lipid fragment ions, describe its implementation and present a freely available web application, termed ALEX123 lipid calculator, that can be used to query a comprehensive database featuring curated lipid fragmentation information for more than 430,000 potential lipid molecules from 47 lipid classes covering five lipid categories. We note that the nomenclature is generic, extendable to stable isotope-labeled lipid molecules and applicable to automated annotation of fragment ions detected by most contemporary lipidomics platforms, including LC-MS/MS-based routines. PMID:29161304

  9. Proposal for a common nomenclature for fragment ions in mass spectra of lipids.

    PubMed

    Pauling, Josch K; Hermansson, Martin; Hartler, Jürgen; Christiansen, Klaus; Gallego, Sandra F; Peng, Bing; Ahrends, Robert; Ejsing, Christer S

    2017-01-01

    Advances in mass spectrometry-based lipidomics have in recent years prompted efforts to standardize the annotation of the vast number of lipid molecules that can be detected in biological systems. These efforts have focused on cataloguing, naming and drawing chemical structures of intact lipid molecules, but have provided no guidelines for annotation of lipid fragment ions detected using tandem and multi-stage mass spectrometry, albeit these fragment ions are mandatory for structural elucidation and high confidence lipid identification, especially in high throughput lipidomics workflows. Here we propose a nomenclature for the annotation of lipid fragment ions, describe its implementation and present a freely available web application, termed ALEX123 lipid calculator, that can be used to query a comprehensive database featuring curated lipid fragmentation information for more than 430,000 potential lipid molecules from 47 lipid classes covering five lipid categories. We note that the nomenclature is generic, extendable to stable isotope-labeled lipid molecules and applicable to automated annotation of fragment ions detected by most contemporary lipidomics platforms, including LC-MS/MS-based routines.

  10. Jannovar: a java library for exome annotation.

    PubMed

    Jäger, Marten; Wang, Kai; Bauer, Sebastian; Smedley, Damian; Krawitz, Peter; Robinson, Peter N

    2014-05-01

    Transcript-based annotation and pedigree analysis are two basic steps in the computational analysis of whole-exome sequencing experiments in genetic diagnostics and disease-gene discovery projects. Here, we present Jannovar, a stand-alone Java application as well as a Java library designed to be used in larger software frameworks for exome and genome analysis. Jannovar uses an interval tree to identify all transcripts affected by a given variant, and provides Human Genome Variation Society-compliant annotations both for variants affecting coding sequences and splice junctions as well as untranslated regions and noncoding RNA transcripts. Jannovar can also perform family-based pedigree analysis with Variant Call Format (VCF) files with data from members of a family segregating a Mendelian disorder. Using a desktop computer, Jannovar requires a few seconds to annotate a typical VCF file with exome data. Jannovar is freely available under the BSD2 license. Source code as well as the Java application and library file can be downloaded from http://compbio.charite.de (with tutorial) and https://github.com/charite/jannovar. © 2014 WILEY PERIODICALS, INC.
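
    As a rough illustration of the transcript-lookup step, the toy code below finds all transcripts overlapping a variant position by brute force. Jannovar itself uses an interval tree for fast queries over thousands of transcripts; the class and transcript names here are invented for the example:

```python
class TranscriptIndex:
    """Toy stand-in for Jannovar's transcript lookup (illustrative only)."""

    def __init__(self, transcripts):
        # transcripts: (start, end, name) tuples, 1-based inclusive coordinates
        self.transcripts = sorted(transcripts)

    def overlapping(self, pos):
        """Return the names of all transcripts covering a genomic position."""
        return [name for start, end, name in self.transcripts
                if start <= pos <= end]

idx = TranscriptIndex([(100, 500, "TX1"), (300, 900, "TX2"), (1000, 1200, "TX3")])
print(idx.overlapping(400))   # variant at 400 hits TX1 and TX2
print(idx.overlapping(950))   # intergenic position: no transcript
```

    An interval tree answers the same query in O(log n + k) instead of O(n), which matters when annotating whole-exome VCF files in seconds.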

  11. GFFview: A Web Server for Parsing and Visualizing Annotation Information of Eukaryotic Genome.

    PubMed

    Deng, Feilong; Chen, Shi-Yi; Wu, Zhou-Lin; Hu, Yongsong; Jia, Xianbo; Lai, Song-Jia

    2017-10-01

    Owing to the wide application of RNA sequencing (RNA-seq) technology, more and more eukaryotic genomes have been extensively annotated, including gene structure, alternative splicing, and noncoding loci. Genome annotation information is commonly stored as plain text in General Feature Format (GFF), which can be hundreds or thousands of Mb in size. Manipulating GFF files is therefore a challenge for biologists who have no bioinformatics skills. In this study, we provide a web server (GFFview) for parsing the annotation information of eukaryotic genomes and generating statistical descriptions of six indices for visualization. GFFview is very useful for investigating the quality of, and differences between, de novo assembled transcriptomes in RNA-seq studies.
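
    The kind of per-feature statistics such a server reports can be derived from column 3 of a GFF file. A minimal parser (our sketch, not GFFview's code) looks like this:

```python
def parse_gff_types(lines):
    """Count feature types (GFF column 3), skipping comments and blanks."""
    counts = {}
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:          # malformed row: GFF has 9 columns
            continue
        counts[fields[2]] = counts.get(fields[2], 0) + 1
    return counts

gff = [
    "##gff-version 3",
    "chr1\tsrc\tgene\t100\t900\t.\t+\t.\tID=gene1",
    "chr1\tsrc\tmRNA\t100\t900\t.\t+\t.\tID=tx1;Parent=gene1",
    "chr1\tsrc\texon\t100\t300\t.\t+\t.\tParent=tx1",
    "chr1\tsrc\texon\t500\t900\t.\t+\t.\tParent=tx1",
]
print(parse_gff_types(gff))  # → {'gene': 1, 'mRNA': 1, 'exon': 2}
```

    For files of hundreds of Mb, the same loop would stream over an open file handle rather than an in-memory list.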

  12. Informatics in radiology: automated structured reporting of imaging findings using the AIM standard and XML.

    PubMed

    Zimmerman, Stefan L; Kim, Woojin; Boonn, William W

    2011-01-01

    Quantitative and descriptive imaging data are a vital component of the radiology report and are frequently of paramount importance to the ordering physician. Unfortunately, current methods of recording these data in the report are both inefficient and error prone. In addition, the free-text, unstructured format of a radiology report makes aggregate analysis of data from multiple reports difficult or even impossible without manual intervention. A structured reporting work flow has been developed that allows quantitative data created at an advanced imaging workstation to be seamlessly integrated into the radiology report with minimal radiologist intervention. As an intermediary step between the workstation and the reporting software, quantitative and descriptive data are converted into an extensible markup language (XML) file in a standardized format specified by the Annotation and Image Markup (AIM) project of the National Institutes of Health Cancer Biomedical Informatics Grid. The AIM standard was created to allow image annotation data to be stored in a uniform machine-readable format. These XML files containing imaging data can also be stored on a local database for data mining and analysis. This structured work flow solution has the potential to improve radiologist efficiency, reduce errors, and facilitate storage of quantitative and descriptive imaging data for research. Copyright © RSNA, 2011.
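
    The core idea of serializing a measurement as machine-readable XML can be sketched as follows. The element and attribute names below are illustrative only; the real AIM schema defines its own, much richer structure:

```python
import xml.etree.ElementTree as ET

def measurement_to_xml(label, value, unit):
    """Serialize one quantitative imaging measurement to an XML string.

    Names like ImageAnnotation/Measurement are placeholders, not the
    actual AIM element names.
    """
    root = ET.Element("ImageAnnotation")
    m = ET.SubElement(root, "Measurement", name=label, unit=unit)
    m.text = str(value)
    return ET.tostring(root, encoding="unicode")

xml_out = measurement_to_xml("lesion_diameter", 12.3, "mm")
print(xml_out)
```

    Because the result is structured rather than free text, downstream tools can aggregate measurements across reports without manual parsing, which is the work flow's main benefit.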

  13. Toward knowledge-enhanced viewing using encyclopedias and model-based segmentation

    NASA Astrophysics Data System (ADS)

    Kneser, Reinhard; Lehmann, Helko; Geller, Dieter; Qian, Yue-Chen; Weese, Jürgen

    2009-02-01

    To make accurate decisions based on imaging data, radiologists must associate the viewed imaging data with the corresponding anatomical structures. Furthermore, given a disease hypothesis, they must consider which image findings would verify the hypothesis, and where and how those findings are expressed in the viewed images. If rare anatomical variants, rare pathologies, unfamiliar protocols, or ambiguous findings are present, external knowledge sources such as medical encyclopedias are consulted. These sources are accessed using keywords, typically describing anatomical structures, image findings, or pathologies. In this paper we present our vision of how a patient's imaging data can be automatically enhanced with anatomical knowledge as well as knowledge about image findings. On the one hand, we propose the automatic annotation of the images with labels from a standard anatomical ontology. These labels are used as keywords for a medical encyclopedia such as STATdx to access anatomical descriptions and information about pathologies and image findings. On the other hand, we envision encyclopedias that contain links to region- and finding-specific image processing algorithms; a finding is then evaluated on an image by applying the respective algorithm in the associated anatomical region. Towards the realization of our vision, we present our method and results for the automatic annotation of anatomical structures in 3D MRI brain images. To this end, we develop a complex surface mesh model incorporating the major structures of the brain and a model-based segmentation method. We demonstrate its validity by analyzing the results of several training and segmentation experiments with clinical data, focusing particularly on the visual pathway.

  14. SharedCanvas: A Collaborative Model for Medieval Manuscript Layout Dissemination

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sanderson, Robert D.; Albritton, Benjamin; Schwemmer, Rafael

    2011-01-01

    In this paper we present a model based on the principles of Linked Data that can be used to describe the interrelationships of images, texts and other resources to facilitate the interoperability of repositories of medieval manuscripts or other culturally important handwritten documents. The model is designed from a set of requirements derived from the real-world use cases of some of the largest digitized medieval content holders, and instantiations of the model are intended as the input to collection-independent page-turning and scholarly presentation interfaces. A canvas painting paradigm, as in PDF and SVG, was selected based on the lack of a one-to-one correlation between image and page, and to fulfill complex requirements such as when the full text of a page is known but only fragments of the physical object remain. The model is implemented using technologies such as OAI-ORE Aggregations and OAC Annotations, the fundamental building blocks of emerging Linked Digital Libraries. The model and implementation are evaluated through prototypes of both content-providing and content-consuming applications. Although the system was designed from requirements drawn from the medieval manuscript domain, it is applicable to any layout-oriented presentation of images of text.

  15. Phaedra, a protocol-driven system for analysis and validation of high-content imaging and flow cytometry.

    PubMed

    Cornelissen, Frans; Cik, Miroslav; Gustin, Emmanuel

    2012-04-01

    High-content screening has brought new dimensions to cellular assays by generating rich data sets that characterize cell populations in great detail and detect subtle phenotypes. To derive relevant, reliable conclusions from these complex data, it is crucial to have informatics tools supporting quality control, data reduction, and data mining. These tools must reconcile the complexity of advanced analysis methods with the user-friendliness demanded by the user community. After reviewing existing applications, we saw an opportunity to add innovative new analysis options. Phaedra was developed to support workflows for drug screening and target discovery, interact with several laboratory information management systems, and process data generated by a range of techniques including high-content imaging, multicolor flow cytometry, and traditional high-throughput screening assays. The application is modular and flexible, with an interface that can be tuned to specific user roles. It offers user-friendly data visualization and reduction tools for HCS but also integrates Matlab for custom image analysis and the Konstanz Information Miner (KNIME) framework for data mining. Phaedra features efficient JPEG2000 compression and full drill-down functionality from dose-response curves down to individual cells, with exclusion and annotation options, cell classification, statistical quality controls, and reporting.

  16. Accessing the SEED genome databases via Web services API: tools for programmers.

    PubMed

    Disz, Terry; Akhter, Sajia; Cuevas, Daniel; Olson, Robert; Overbeek, Ross; Vonstein, Veronika; Stevens, Rick; Edwards, Robert A

    2010-06-14

    The SEED integrates many publicly available genome sequences into a single resource. The database contains accurate and up-to-date annotations based on the subsystems concept, which leverages clustering between genomes and other clues to accurately and efficiently annotate microbial genomes. The backend is used as the foundation for many genome annotation tools, such as the Rapid Annotation using Subsystems Technology (RAST) server for whole-genome annotation, the metagenomics RAST server for random community genome annotation, and the annotation clearinghouse for exchanging annotations from different resources. In addition to a web user interface, the SEED also provides a Web-services-based API for programmatic access to the data in the SEED, allowing the development of third-party tools and mash-ups. The currently exposed Web services encompass over forty different methods for accessing data related to microbial genome annotations. The Web services provide comprehensive access to the database back end, allowing any programmer access to the most consistent and accurate genome annotations available. The Web services are deployed using a platform-independent, service-oriented approach that allows the user to choose the most suitable programming platform for their application. Example code demonstrates that the Web services can be used to access the SEED using common bioinformatics programming languages such as Perl, Python, and Java. We present a novel approach to accessing the SEED database. Using Web services, a robust API for access to genomics data is provided without requiring large-volume downloads all at once. The API ensures timely access to the most current datasets available, including new genomes as soon as they come online.
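
    Programmatic access of this kind typically amounts to building a query against the service and parsing the structured response. The Python sketch below is illustrative only: the base URL, method name, and JSON response format are invented for the example, not the SEED API's actual ones:

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint and method name, for illustration only.
BASE = "https://servers.example.org/seed/api"

def build_annotation_query(genome_id, method="genome_annotations"):
    """Construct a query URL for a (hypothetical) annotation method."""
    return f"{BASE}/{method}?" + urlencode({"genome": genome_id})

def parse_features(payload):
    """Extract (feature_id, function) pairs from a JSON response body."""
    records = json.loads(payload)
    return [(r["id"], r["function"]) for r in records]

url = build_annotation_query("83333.1")  # SEED-style genome identifier
canned = '[{"id": "fig|83333.1.peg.1", "function": "thr operon leader"}]'
print(url)
print(parse_features(canned))
```

    The point of the service-oriented design described above is exactly this decoupling: the client only needs an HTTP stack and a JSON or SOAP parser, in whatever language suits the application.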

  17. Mindcontrol: A web application for brain segmentation quality control.

    PubMed

    Keshavan, Anisha; Datta, Esha; M McDonough, Ian; Madan, Christopher R; Jordan, Kesshi; Henry, Roland G

    2018-04-15

    Tissue classification plays a crucial role in the investigation of normal neural development, brain-behavior relationships, and the disease mechanisms of many psychiatric and neurological illnesses. Ensuring the accuracy of tissue classification is important for quality research and, in particular, the translation of imaging biomarkers to clinical practice. Assessment with the human eye is vital to correct various errors inherent to all currently available segmentation algorithms. Manual quality assurance becomes methodologically difficult at a large scale - a problem of increasing importance as the number of data sets is on the rise. To make this process more efficient, we have developed Mindcontrol, an open-source web application for the collaborative quality control of neuroimaging processing outputs. The Mindcontrol platform consists of a dashboard to organize data, descriptive visualizations to explore the data, an imaging viewer, and an in-browser annotation and editing toolbox for data curation and quality control. Mindcontrol is flexible and can be configured for the outputs of any software package in any data organization structure. Example configurations for three large, open-source datasets are presented: the 1000 Functional Connectomes Project (FCP), the Consortium for Reliability and Reproducibility (CoRR), and the Autism Brain Imaging Data Exchange (ABIDE) Collection. These demo applications link descriptive quality control metrics, regional brain volumes, and thickness scalars to a 3D imaging viewer and editing module, resulting in an easy-to-implement quality control protocol that can be scaled for any size and complexity of study. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.

  18. Incorporating Feature-Based Annotations into Automatically Generated Knowledge Representations

    NASA Astrophysics Data System (ADS)

    Lumb, L. I.; Lederman, J. I.; Aldridge, K. D.

    2006-12-01

    Earth Science Markup Language (ESML) is efficient and effective in representing scientific data in an XML-based formalism. However, features of the data being represented are not accounted for in ESML. Such features might derive from events (e.g., a gap in data collection due to instrument servicing), identifications (e.g., a scientifically interesting area/volume in an image), or some other source. In order to account for features in an ESML context, we consider them from the perspective of annotation, i.e., the addition of information to existing documents without changing the originals. Although it is possible to extend ESML to incorporate feature-based annotations internally (e.g., by extending the XML schema for ESML), there are a number of complicating factors that we identify. Rather than pursuing the ESML-extension approach, we focus on an external representation for feature-based annotations via the XML Pointer Language (XPointer). In previous work (Lumb & Aldridge, HPCS 2006, IEEE, doi:10.1109/HPCS.2006.26), we showed that it is possible to extract relationships from ESML-based representations and capture the results in the Resource Description Framework (RDF). Thus we explore and report on this same requirement for XPointer-based annotations of ESML representations. As in our past efforts, the Global Geodynamics Project (GGP) allows us to illustrate this approach for introducing annotations into automatically generated knowledge representations with a real-world example.

  19. ConsPred: a rule-based (re-)annotation framework for prokaryotic genomes.

    PubMed

    Weinmaier, Thomas; Platzer, Alexander; Frank, Jeroen; Hellinger, Hans-Jörg; Tischler, Patrick; Rattei, Thomas

    2016-11-01

    The rapidly growing number of available prokaryotic genome sequences requires fully automated and high-quality software solutions for their initial annotation and re-annotation. Here we present ConsPred, a prokaryotic genome annotation framework that performs intrinsic gene predictions, homology searches, and predictions of non-coding genes as well as CRISPR repeats, and integrates all evidence into a consensus annotation. ConsPred achieves comprehensive, high-quality annotations based on rules and priorities, similar to decision-making in manual curation, and avoids conflicting predictions. Parameters controlling the annotation process are configurable by the user. ConsPred has been used in the institutions of the authors for more than 5 years and can easily be extended and adapted to specific needs. The ConsPred algorithm for producing a consensus from the varying scores of multiple gene prediction programs approaches manual curation in accuracy. Its rule-based approach for choosing final predictions avoids overriding previous manual curations. ConsPred is implemented in Java, Perl and Shell and is freely available under the Creative Commons license as a stand-alone in-house pipeline or as an Amazon Machine Image for cloud computing, see https://sourceforge.net/projects/conspred/. Contact: thomas.rattei@univie.ac.at. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  20. Reusable Social Networking Capabilities for an Earth Science Collaboratory

    NASA Astrophysics Data System (ADS)

    Lynnes, C.; Da Silva, D.; Leptoukh, G. G.; Ramachandran, R.

    2011-12-01

    A vast untapped resource of data, tools, information and knowledge lies within the Earth science community, because it is difficult to share the full spectrum of these entities, particularly their full context. As a result, most knowledge exchange is through person-to-person contact at meetings, by email and in journal articles, each of which can support only a limited level of detail. We propose the creation of an Earth Science Collaboratory (ESC): a framework that would enable sharing of data, tools, workflows, results and the contextual knowledge about these information entities. The Drupal platform is well positioned to provide the key social networking capabilities for the ESC. As a proof of concept of a rich collaboration mechanism, we have developed a Drupal-based mechanism for graphically annotating and commenting on result images from analysis workflows in the online Giovanni analysis system for remote sensing data. The annotations can be tagged and shared with others in the community. These capabilities are further supplemented by a Research Notebook capability reused from another online analysis system named Talkoot. The goal is a reusable set of modules that can integrate with a variety of other applications, either within Drupal web frameworks or at a machine level.

  1. Transformation of an uncertain video search pipeline to a sketch-based visual analytics loop.

    PubMed

    Legg, Philip A; Chung, David H S; Parry, Matthew L; Bown, Rhodri; Jones, Mark W; Griffiths, Iwan W; Chen, Min

    2013-12-01

    Traditional sketch-based image or video search systems rely on machine learning concepts as their core technology. However, in many applications, machine learning alone is impractical: videos may not be sufficiently annotated semantically, suitable training data may be lacking, and the search requirements of the user may frequently change for different tasks. In this work, we develop a visual analytics system that overcomes the shortcomings of the traditional approach. We make use of a sketch-based interface to enable users to specify search requirements in a flexible manner without depending on semantic annotation. We employ active machine learning to train different analytical models for different types of search requirements. We use visualization to facilitate knowledge discovery at the different stages of visual analytics. This includes visualizing the parameter space of the trained model, visualizing the search space to support interactive browsing, visualizing candidate search results to support rapid interaction for active learning while minimizing the need to watch videos, and visualizing aggregated information about the search results. We demonstrate the system by searching spatiotemporal attributes in sports video to identify key instances of team and player performance.

  2. The Multimodal Brain Tumor Image Segmentation Benchmark (BRATS).

    PubMed

    Menze, Bjoern H; Jakab, Andras; Bauer, Stefan; Kalpathy-Cramer, Jayashree; Farahani, Keyvan; Kirby, Justin; Burren, Yuliya; Porz, Nicole; Slotboom, Johannes; Wiest, Roland; Lanczi, Levente; Gerstner, Elizabeth; Weber, Marc-André; Arbel, Tal; Avants, Brian B; Ayache, Nicholas; Buendia, Patricia; Collins, D Louis; Cordier, Nicolas; Corso, Jason J; Criminisi, Antonio; Das, Tilak; Delingette, Hervé; Demiralp, Çağatay; Durst, Christopher R; Dojat, Michel; Doyle, Senan; Festa, Joana; Forbes, Florence; Geremia, Ezequiel; Glocker, Ben; Golland, Polina; Guo, Xiaotao; Hamamci, Andac; Iftekharuddin, Khan M; Jena, Raj; John, Nigel M; Konukoglu, Ender; Lashkari, Danial; Mariz, José Antonió; Meier, Raphael; Pereira, Sérgio; Precup, Doina; Price, Stephen J; Raviv, Tammy Riklin; Reza, Syed M S; Ryan, Michael; Sarikaya, Duygu; Schwartz, Lawrence; Shin, Hoo-Chang; Shotton, Jamie; Silva, Carlos A; Sousa, Nuno; Subbanna, Nagesh K; Szekely, Gabor; Taylor, Thomas J; Thomas, Owen M; Tustison, Nicholas J; Unal, Gozde; Vasseur, Flor; Wintermark, Max; Ye, Dong Hye; Zhao, Liang; Zhao, Binsheng; Zikic, Darko; Prastawa, Marcel; Reyes, Mauricio; Van Leemput, Koen

    2015-10-01

    In this paper we report the set-up and results of the Multimodal Brain Tumor Image Segmentation Benchmark (BRATS) organized in conjunction with the MICCAI 2012 and 2013 conferences. Twenty state-of-the-art tumor segmentation algorithms were applied to a set of 65 multi-contrast MR scans of low- and high-grade glioma patients, manually annotated by up to four raters, and to 65 comparable scans generated using tumor image simulation software. Quantitative evaluations revealed considerable disagreement between the human raters in segmenting various tumor sub-regions (Dice scores in the range 74%-85%), illustrating the difficulty of this task. We found that different algorithms worked best for different sub-regions (reaching performance comparable to human inter-rater variability), but that no single algorithm ranked in the top for all sub-regions simultaneously. Fusing several good algorithms using a hierarchical majority vote yielded segmentations that consistently ranked above all individual algorithms, indicating remaining opportunities for further methodological improvements. The BRATS image data and manual annotations continue to be publicly available through an online evaluation system as an ongoing benchmarking resource.
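    The label-fusion step can be illustrated with a minimal sketch: a plain (non-hierarchical) majority vote over toy one-dimensional binary masks, scored with the Dice overlap used in the benchmark. The masks below are invented for illustration and are not BRATS data.

```python
import numpy as np

def dice(a, b):
    """Dice overlap between two binary masks (1.0 = identical)."""
    inter = np.logical_and(a, b).sum()
    denom = a.sum() + b.sum()
    return 2.0 * inter / denom if denom else 1.0

def majority_vote(masks):
    """A voxel is foreground if more than half of the inputs label it so."""
    votes = np.stack(masks).sum(axis=0)
    return (votes * 2 > len(masks)).astype(np.uint8)

# Toy 1-D "scans": three segmenters that mostly agree on the tumor extent.
truth = np.array([0, 1, 1, 1, 0, 0])
seg_a = np.array([0, 1, 1, 1, 1, 0])  # over-segments by one voxel
seg_b = np.array([0, 0, 1, 1, 0, 0])  # under-segments by one voxel
seg_c = np.array([0, 1, 1, 1, 0, 0])  # exact
fused = majority_vote([seg_a, seg_b, seg_c])
```

    Here the fused mask matches the reference even though two of the three inputs err, which is the intuition behind fusing several good algorithms.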

  3. The Multimodal Brain Tumor Image Segmentation Benchmark (BRATS)

    PubMed Central

    Jakab, Andras; Bauer, Stefan; Kalpathy-Cramer, Jayashree; Farahani, Keyvan; Kirby, Justin; Burren, Yuliya; Porz, Nicole; Slotboom, Johannes; Wiest, Roland; Lanczi, Levente; Gerstner, Elizabeth; Weber, Marc-André; Arbel, Tal; Avants, Brian B.; Ayache, Nicholas; Buendia, Patricia; Collins, D. Louis; Cordier, Nicolas; Corso, Jason J.; Criminisi, Antonio; Das, Tilak; Delingette, Hervé; Demiralp, Çağatay; Durst, Christopher R.; Dojat, Michel; Doyle, Senan; Festa, Joana; Forbes, Florence; Geremia, Ezequiel; Glocker, Ben; Golland, Polina; Guo, Xiaotao; Hamamci, Andac; Iftekharuddin, Khan M.; Jena, Raj; John, Nigel M.; Konukoglu, Ender; Lashkari, Danial; Mariz, José António; Meier, Raphael; Pereira, Sérgio; Precup, Doina; Price, Stephen J.; Raviv, Tammy Riklin; Reza, Syed M. S.; Ryan, Michael; Sarikaya, Duygu; Schwartz, Lawrence; Shin, Hoo-Chang; Shotton, Jamie; Silva, Carlos A.; Sousa, Nuno; Subbanna, Nagesh K.; Szekely, Gabor; Taylor, Thomas J.; Thomas, Owen M.; Tustison, Nicholas J.; Unal, Gozde; Vasseur, Flor; Wintermark, Max; Ye, Dong Hye; Zhao, Liang; Zhao, Binsheng; Zikic, Darko; Prastawa, Marcel; Reyes, Mauricio; Van Leemput, Koen

    2016-01-01

    In this paper we report the set-up and results of the Multimodal Brain Tumor Image Segmentation Benchmark (BRATS) organized in conjunction with the MICCAI 2012 and 2013 conferences. Twenty state-of-the-art tumor segmentation algorithms were applied to a set of 65 multi-contrast MR scans of low- and high-grade glioma patients—manually annotated by up to four raters—and to 65 comparable scans generated using tumor image simulation software. Quantitative evaluations revealed considerable disagreement between the human raters in segmenting various tumor sub-regions (Dice scores in the range 74%–85%), illustrating the difficulty of this task. We found that different algorithms worked best for different sub-regions (reaching performance comparable to human inter-rater variability), but that no single algorithm ranked in the top for all sub-regions simultaneously. Fusing several good algorithms using a hierarchical majority vote yielded segmentations that consistently ranked above all individual algorithms, indicating remaining opportunities for further methodological improvements. The BRATS image data and manual annotations continue to be publicly available through an online evaluation system as an ongoing benchmarking resource. PMID:25494501

  4. Spatial Statistics for Segmenting Histological Structures in H&E Stained Tissue Images.

    PubMed

    Nguyen, Luong; Tosun, Akif Burak; Fine, Jeffrey L; Lee, Adrian V; Taylor, D Lansing; Chennubhotla, S Chakra

    2017-07-01

    Segmenting a broad class of histological structures in transmitted light and/or fluorescence-based images is a prerequisite for determining the pathological basis of cancer, elucidating spatial interactions between histological structures in tumor microenvironments (e.g., tumor infiltrating lymphocytes), facilitating precision medicine studies with deep molecular profiling, and providing an exploratory tool for pathologists. This paper focuses on segmenting histological structures in hematoxylin- and eosin-stained images of breast tissues, e.g., invasive carcinoma, carcinoma in situ, atypical and normal ducts, adipose tissue, and lymphocytes. We propose two graph-theoretic segmentation methods based on local spatial color and nuclei neighborhood statistics. For benchmarking, we curated a data set of 232 high-power field breast tissue images together with expertly annotated ground truth. To accurately model the preference for histological structures (ducts, vessels, tumor nets, adipose, etc.) over the remaining connective tissue and non-tissue areas in ground truth annotations, we propose a new region-based score for evaluating segmentation algorithms. We demonstrate the improvement of our proposed methods over the state-of-the-art algorithms in both region- and boundary-based performance measures.

  5. High-throughput dual-colour precision imaging for brain-wide connectome with cytoarchitectonic landmarks at the cellular level

    PubMed Central

    Gong, Hui; Xu, Dongli; Yuan, Jing; Li, Xiangning; Guo, Congdi; Peng, Jie; Li, Yuxin; Schwarz, Lindsay A.; Li, Anan; Hu, Bihe; Xiong, Benyi; Sun, Qingtao; Zhang, Yalun; Liu, Jiepeng; Zhong, Qiuyuan; Xu, Tonghui; Zeng, Shaoqun; Luo, Qingming

    2016-01-01

    The precise annotation and accurate identification of neural structures are prerequisites for studying mammalian brain function. The orientation of neurons and neural circuits is usually determined by mapping brain images to coarse axial-sampling planar reference atlases. However, individual differences at the cellular level likely lead to position errors and an inability to orient neural projections at single-cell resolution. Here, we present a high-throughput precision imaging method that can acquire a co-localized brain-wide data set of both fluorescent-labelled neurons and counterstained cell bodies at a voxel size of 0.32 × 0.32 × 2.0 μm in 3 days for a single mouse brain. We acquire mouse whole-brain imaging data sets of multiple types of neurons and projections with anatomical annotation at single-neuron resolution. The results show that the simultaneous acquisition of labelled neural structures and cytoarchitecture reference in the same brain greatly facilitates precise tracing of long-range projections and accurate locating of nuclei. PMID:27374071

  6. Multimodal MSI in Conjunction with Broad Coverage Spatially Resolved MS2 Increases Confidence in Both Molecular Identification and Localization

    DOE PAGES

    Veličković, Dušan; Chu, Rosalie K.; Carrell, Alyssa A.; ...

    2017-12-06

    One critical aspect of mass spectrometry imaging (MSI) is the need to confidently identify detected analytes. While orthogonal tandem MS (e.g., LC–MS2) experiments from sample extracts can assist in annotating ions, the spatial information about these molecules is lost. Accordingly, this could lead to misleading conclusions, especially in cases where isobaric species exhibit different distributions within a sample. In this Technical Note, we employed a multimodal imaging approach, using matrix-assisted laser desorption/ionization (MALDI)-MSI and liquid extraction surface analysis (LESA)-MS2, to confidently annotate and localize a broad range of metabolites involved in a tripartite symbiosis system of moss, cyanobacteria, and fungus. We found that the combination of these two imaging modalities generated very congruent ion images, providing a link between the highly accurate structural information offered by LESA and the high spatial resolution attainable by MALDI. These results demonstrate how this combined methodology could be very useful in differentiating metabolite routes in complex systems.

  7. Multimodal MSI in Conjunction with Broad Coverage Spatially Resolved MS2 Increases Confidence in Both Molecular Identification and Localization

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Veličković, Dušan; Chu, Rosalie K.; Carrell, Alyssa A.

    One critical aspect of mass spectrometry imaging (MSI) is the need to confidently identify detected analytes. While orthogonal tandem MS (e.g., LC–MS2) experiments from sample extracts can assist in annotating ions, the spatial information about these molecules is lost. Accordingly, this could lead to misleading conclusions, especially in cases where isobaric species exhibit different distributions within a sample. In this Technical Note, we employed a multimodal imaging approach, using matrix-assisted laser desorption/ionization (MALDI)-MSI and liquid extraction surface analysis (LESA)-MS2, to confidently annotate and localize a broad range of metabolites involved in a tripartite symbiosis system of moss, cyanobacteria, and fungus. We found that the combination of these two imaging modalities generated very congruent ion images, providing a link between the highly accurate structural information offered by LESA and the high spatial resolution attainable by MALDI. These results demonstrate how this combined methodology could be very useful in differentiating metabolite routes in complex systems.

  8. The UNO Aviation Monograph Series: Aviation Security: An Annotated Bibliography of Responses to the Gore Commission

    NASA Technical Reports Server (NTRS)

    Carrico, John S.; Schaaf, Michaela M.

    1998-01-01

    This monograph is a companion to UNOAI Monograph 96-2, "The Image of Airport Security: An Annotated Bibliography," compiled in June 1996. The White House Commission on Aviation Safety and Security, headed by Vice President Al Gore, was formed as a result of the TWA Flight 800 crash in August 1996. The Commission's final report included 31 recommendations addressed toward aviation security. The recommendations caused security issues to be revisited in the media and by the aviation industry. These developments created the need for an updated bibliography to review the resulting literature. Many of the articles were written in response to the recommendations made by the Gore Commission. "Aviation Security: An Annotated Bibliography of Responses to the Gore Commission" is the result of this need.

  9. Annotation of phenotypic diversity: decoupling data curation and ontology curation using Phenex.

    PubMed

    Balhoff, James P; Dahdul, Wasila M; Dececchi, T Alexander; Lapp, Hilmar; Mabee, Paula M; Vision, Todd J

    2014-01-01

    Phenex (http://phenex.phenoscape.org/) is a desktop application for semantically annotating the phenotypic character matrix datasets common in evolutionary biology. Since its initial publication, we have added new features that address several major bottlenecks in the efficiency of the phenotype curation process: allowing curators during the data curation phase to provisionally request terms that are not yet available from a relevant ontology; supporting quality control against annotation guidelines to reduce later manual review and revision; and enabling the sharing of files for collaboration among curators. We decoupled data annotation from ontology development by creating an Ontology Request Broker (ORB) within Phenex. Curators can use the ORB to request a provisional term for use in data annotation; the provisional term can be automatically replaced with a permanent identifier once the term is added to an ontology. We added a set of annotation consistency checks to prevent common curation errors, reducing the need for later correction. We facilitated collaborative editing by improving the reliability of Phenex when used with online folder sharing services, via file change monitoring and continual autosave. With the addition of these new features, and in particular the Ontology Request Broker, Phenex users have been able to focus more effectively on data annotation. Phenoscape curators using Phenex have reported a smoother annotation workflow, with much reduced interruptions from ontology maintenance and file management issues.
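    The Ontology Request Broker idea, provisional identifiers later swapped for permanent ones, can be sketched as follows. This is an illustrative reimplementation, not Phenex code; the "ORB:" prefix and the example term IDs are invented.

```python
PROVISIONAL_PREFIX = "ORB:"  # invented prefix marking provisional term IDs

def resolve_annotations(annotations, resolved):
    """Swap provisional term IDs for permanent ontology IDs where known;
    terms whose ontology requests are still pending stay untouched."""
    return [resolved.get(term, term) if term.startswith(PROVISIONAL_PREFIX) else term
            for term in annotations]

# A curator annotated with one provisional term that has since been adopted
# by the ontology, and one that is still pending.
annotations = ["UBERON:0002101", "ORB:000042", "ORB:000099"]
resolved = {"ORB:000042": "UBERON:0000982"}  # invented mapping for illustration
updated = resolve_annotations(annotations, resolved)
```

    The replacement is purely mechanical, which is what lets curators keep annotating without waiting on ontology maintenance.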

  10. xGDBvm: A Web GUI-Driven Workflow for Annotating Eukaryotic Genomes in the Cloud

    PubMed Central

    Merchant, Nirav

    2016-01-01

    Genome-wide annotation of gene structure requires the integration of numerous computational steps. Currently, annotation is arguably best accomplished through collaboration of bioinformatics and domain experts, with broad community involvement. However, such a collaborative approach is not scalable at today’s pace of sequence generation. To address this problem, we developed the xGDBvm software, which uses an intuitive graphical user interface to access a number of common genome analysis and gene structure tools, preconfigured in a self-contained virtual machine image. Once their virtual machine instance is deployed through iPlant’s Atmosphere cloud services, users access the xGDBvm workflow via a unified Web interface to manage inputs, set program parameters, configure links to high-performance computing (HPC) resources, view and manage output, apply analysis and editing tools, or access contextual help. The xGDBvm workflow will mask the genome, compute spliced alignments from transcript and/or protein inputs (locally or on a remote HPC cluster), predict gene structures and gene structure quality, and display output in a public or private genome browser complete with accessory tools. Problematic gene predictions are flagged and can be reannotated using the integrated yrGATE annotation tool. xGDBvm can also be configured to append or replace existing data or load precomputed data. Multiple genomes can be annotated and displayed, and outputs can be archived for sharing or backup. xGDBvm can be adapted to a variety of use cases including de novo genome annotation, reannotation, comparison of different annotations, and training or teaching. PMID:27020957

  11. xGDBvm: A Web GUI-Driven Workflow for Annotating Eukaryotic Genomes in the Cloud.

    PubMed

    Duvick, Jon; Standage, Daniel S; Merchant, Nirav; Brendel, Volker P

    2016-04-01

    Genome-wide annotation of gene structure requires the integration of numerous computational steps. Currently, annotation is arguably best accomplished through collaboration of bioinformatics and domain experts, with broad community involvement. However, such a collaborative approach is not scalable at today's pace of sequence generation. To address this problem, we developed the xGDBvm software, which uses an intuitive graphical user interface to access a number of common genome analysis and gene structure tools, preconfigured in a self-contained virtual machine image. Once their virtual machine instance is deployed through iPlant's Atmosphere cloud services, users access the xGDBvm workflow via a unified Web interface to manage inputs, set program parameters, configure links to high-performance computing (HPC) resources, view and manage output, apply analysis and editing tools, or access contextual help. The xGDBvm workflow will mask the genome, compute spliced alignments from transcript and/or protein inputs (locally or on a remote HPC cluster), predict gene structures and gene structure quality, and display output in a public or private genome browser complete with accessory tools. Problematic gene predictions are flagged and can be reannotated using the integrated yrGATE annotation tool. xGDBvm can also be configured to append or replace existing data or load precomputed data. Multiple genomes can be annotated and displayed, and outputs can be archived for sharing or backup. xGDBvm can be adapted to a variety of use cases including de novo genome annotation, reannotation, comparison of different annotations, and training or teaching. © 2016 American Society of Plant Biologists. All rights reserved.

  12. Visually impaired researchers get their hands on quantum chemistry: application to a computational study on the isomerization of a sterol.

    PubMed

    Lounnas, Valère; Wedler, Henry B; Newman, Timothy; Schaftenaar, Gijs; Harrison, Jason G; Nepomuceno, Gabriella; Pemberton, Ryan; Tantillo, Dean J; Vriend, Gert

    2014-11-01

    In molecular sciences, articles tend to revolve around 2D representations of 3D molecules, and sighted scientists often resort to 3D virtual reality software to study these molecules in detail. Blind and visually impaired (BVI) molecular scientists have access to a series of audio devices that can help them read the text in articles and work with computers. Reading articles published in this journal, though, is nearly impossible for them because they need to generate mental 3D images of molecules, but the article-reading software cannot do that for them. We have previously designed AsteriX, a web server that fully automatically decomposes articles, detects 2D plots of low molecular weight molecules, removes meta data and annotations from these plots, and converts them into 3D atomic coordinates. AsteriX-BVI goes one step further and converts the 3D representation into a 3D printable, haptic-enhanced format that includes Braille annotations. These Braille-annotated physical 3D models allow BVI scientists to generate a complete mental model of the molecule. AsteriX-BVI uses Molden to convert the meta data of quantum chemistry experiments into BVI friendly formats so that the entire line of scientific information that sighted people take for granted (from published articles, via printed results of computational chemistry experiments, to 3D models) is now available to BVI scientists too. The possibilities offered by AsteriX-BVI are illustrated by a project on the isomerization of a sterol, executed by the blind co-author of this article (HBW).

  13. Visually impaired researchers get their hands on quantum chemistry: application to a computational study on the isomerization of a sterol

    NASA Astrophysics Data System (ADS)

    Lounnas, Valère; Wedler, Henry B.; Newman, Timothy; Schaftenaar, Gijs; Harrison, Jason G.; Nepomuceno, Gabriella; Pemberton, Ryan; Tantillo, Dean J.; Vriend, Gert

    2014-11-01

    In molecular sciences, articles tend to revolve around 2D representations of 3D molecules, and sighted scientists often resort to 3D virtual reality software to study these molecules in detail. Blind and visually impaired (BVI) molecular scientists have access to a series of audio devices that can help them read the text in articles and work with computers. Reading articles published in this journal, though, is nearly impossible for them because they need to generate mental 3D images of molecules, but the article-reading software cannot do that for them. We have previously designed AsteriX, a web server that fully automatically decomposes articles, detects 2D plots of low molecular weight molecules, removes meta data and annotations from these plots, and converts them into 3D atomic coordinates. AsteriX-BVI goes one step further and converts the 3D representation into a 3D printable, haptic-enhanced format that includes Braille annotations. These Braille-annotated physical 3D models allow BVI scientists to generate a complete mental model of the molecule. AsteriX-BVI uses Molden to convert the meta data of quantum chemistry experiments into BVI friendly formats so that the entire line of scientific information that sighted people take for granted—from published articles, via printed results of computational chemistry experiments, to 3D models—is now available to BVI scientists too. The possibilities offered by AsteriX-BVI are illustrated by a project on the isomerization of a sterol, executed by the blind co-author of this article (HBW).

  14. DaGO-Fun: tool for Gene Ontology-based functional analysis using term information content measures.

    PubMed

    Mazandu, Gaston K; Mulder, Nicola J

    2013-09-25

    The use of Gene Ontology (GO) data in protein analyses has largely contributed to the improved outcomes of these analyses. Several GO semantic similarity measures have been proposed in recent years and provide tools that allow the integration of biological knowledge embedded in the GO structure into different biological analyses. There is a need for a unified tool that provides the scientific community with the opportunity to explore these different GO similarity measure approaches and their biological applications. We have developed DaGO-Fun, an online tool available at http://web.cbio.uct.ac.za/ITGOM, which incorporates many different GO similarity measures for exploring, analyzing and comparing GO terms and proteins within the context of GO. It uses GO data and UniProt proteins with their GO annotations as provided by the Gene Ontology Annotation (GOA) project to precompute GO term information content (IC), enabling rapid response to user queries. The DaGO-Fun online tool presents the advantage of integrating all the relevant IC-based GO similarity measures, including topology- and annotation-based approaches, to facilitate effective exploration of these measures, thus enabling users to choose the most relevant approach for their application. Furthermore, this tool includes several biological applications related to GO semantic similarity scores, including the retrieval of genes based on their GO annotations, the clustering of functionally related genes within a set, and term enrichment analysis.
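    One representative of the IC-based family DaGO-Fun integrates, Resnik's measure, can be sketched on a toy ontology: a term's information content is the negative log of its annotation frequency, and the similarity of two terms is the IC of their most informative common ancestor. The four-term ontology and annotation probabilities below are invented for illustration.

```python
import math

# Invented mini-ontology: two leaves under one mid-level term under the root.
parents = {"leaf1": {"mid"}, "leaf2": {"mid"}, "mid": {"root"}, "root": set()}

def ancestors(term):
    """A term together with all of its ancestors."""
    seen, stack = {term}, [term]
    while stack:
        for p in parents[stack.pop()]:
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

# p(t) = fraction of annotations at or below t; IC(t) = -log p(t).
prob = {"root": 1.0, "mid": 0.5, "leaf1": 0.25, "leaf2": 0.25}
ic = {t: -math.log(p) for t, p in prob.items()}

def resnik(t1, t2):
    """IC of the most informative common ancestor of t1 and t2."""
    return max(ic[t] for t in ancestors(t1) & ancestors(t2))
```

    The two sibling leaves score the IC of their shared parent, while any term compared against the root scores zero, matching the intuition that sharing only a very general ancestor conveys no information.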

  15. LightWAVE: Waveform and Annotation Viewing and Editing in a Web Browser.

    PubMed

    Moody, George B

    2013-09-01

    This paper describes LightWAVE, recently developed open-source software for viewing ECGs and other physiologic waveforms and associated annotations (event markers). It supports efficient interactive creation and modification of annotations, capabilities that are essential for building new collections of physiologic signals and time series for research. LightWAVE is constructed of components that interact in simple ways, making it straightforward to enhance or replace any of them. The back end (server) is a common gateway interface (CGI) application written in C for speed and efficiency. It retrieves data from its data repository (PhysioNet's open-access PhysioBank archives by default, or any set of files or web pages structured as in PhysioBank) and delivers them in response to requests generated by the front end. The front end (client) is a web application written in JavaScript. It runs within any modern web browser and does not require installation on the user's computer, tablet, or phone. Finally, LightWAVE's scribe is a tiny CGI application written in Perl, which records the user's edits in annotation files. LightWAVE's data repository, back end, and front end can be located on the same computer or on separate computers. The data repository may be split across multiple computers. For compatibility with the standard browser security model, the front end and the scribe must be loaded from the same domain.

  16. CuGene as a tool to view and explore genomic data

    NASA Astrophysics Data System (ADS)

    Haponiuk, Michał; Pawełkowicz, Magdalena; Przybecki, Zbigniew; Nowak, Robert M.

    2017-08-01

    Integrated CuGene is an easy-to-use, open-source, online tool that can be used to browse, analyze, and query genomic data and annotations. It places annotation tracks beneath genome coordinate positions, allowing rapid visual correlation of different types of information. It also allows users to upload and display their own experimental results or annotation sets. An important feature of the application is the ability to find similarities between sequences by applying four algorithms of differing accuracy. The presented tool was tested on real genomic data and is extensively used by the Polish Consortium of Cucumber Genome Sequencing.
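    The abstract does not name the four similarity algorithms, but a representative member of this family is the dynamic-programming edit distance sketched below; this is illustrative only and is not CuGene's implementation.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences via dynamic programming,
    keeping only one row of the DP table at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

# One deleted base separates the two toy sequences.
d = edit_distance("ACGTTG", "ACTTG")
```

    More accurate (and slower) alternatives in the same family, such as local alignment with biologically motivated scoring, refine exactly this recurrence.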

  17. Semantic-gap-oriented active learning for multilabel image annotation.

    PubMed

    Tang, Jinhui; Zha, Zheng-Jun; Tao, Dacheng; Chua, Tat-Seng

    2012-04-01

    User interaction is an effective way to handle the semantic gap problem in image annotation. To minimize user effort in these interactions, many active learning methods have been proposed. These methods treat the semantic concepts individually or correlatively. However, they still neglect the key motivation of user feedback: to tackle the semantic gap. The size of each concept's semantic gap is an important factor that affects the performance of user feedback. Users should devote more effort to concepts with large semantic gaps, and vice versa. In this paper, we propose a semantic-gap-oriented active learning method, which incorporates the semantic gap measure into the information-minimization-based sample selection strategy. The basic learning model used in the active learning framework is an extended multilabel version of the sparse-graph-based semisupervised learning method that incorporates semantic correlation. Extensive experiments conducted on two benchmark image data sets demonstrated the importance of bringing the semantic gap measure into the active learning process.
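    A minimal sketch of the idea, not the paper's exact criterion: rank candidate images for feedback by prediction uncertainty (binary entropy) weighted by a per-concept semantic-gap score, so that hard concepts attract more user effort. All concept names, gap scores, and probabilities below are invented.

```python
import math

def entropy(p):
    """Binary prediction entropy; maximal at p = 0.5."""
    return -sum(q * math.log(q) for q in (p, 1 - p) if q > 0)

def select(candidates, gap, k):
    """Pick the k images whose gap-weighted uncertainty is largest.
    candidates: {image_id: {concept: P(concept present | image)}}."""
    def score(probs):
        return sum(gap[c] * entropy(p) for c, p in probs.items())
    return sorted(candidates, key=lambda i: score(candidates[i]), reverse=True)[:k]

gap = {"sky": 0.1, "protest": 0.9}          # "protest" has the larger semantic gap
candidates = {
    "img1": {"sky": 0.5, "protest": 0.5},   # uncertain on both concepts
    "img2": {"sky": 0.5, "protest": 0.99},  # uncertain only on the easy concept
    "img3": {"sky": 0.99, "protest": 0.5},  # uncertain on the hard concept
}
picked = select(candidates, gap, k=2)
```

    An image that is uncertain only on an easy concept loses to one uncertain on a hard concept, which is the gap-oriented behavior the abstract argues for.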

  18. Device and methods for "gold standard" registration of clinical 3D and 2D cerebral angiograms

    NASA Astrophysics Data System (ADS)

    Madan, Hennadii; Likar, Boštjan; Pernuš, Franjo; Špiclin, Žiga

    2015-03-01

    Translation of novel and existing 3D-2D image registration methods into clinical image-guidance systems is limited by the lack of objective validation on clinical image datasets. The main reason is that, besides the calibration of the 2D imaging system, a reference or "gold standard" registration is very difficult to obtain on clinical image datasets. In the context of cerebral endovascular image-guided interventions (EIGIs), we present a calibration device in the form of a headband with integrated fiducial markers and, secondly, propose an automated pipeline comprising 3D and 2D image processing, analysis and annotation steps, the result of which is a retrospective calibration of the 2D imaging system and an optimal, i.e., "gold standard", registration of 3D and 2D images. The device and methods were used to create the "gold standard" on 15 datasets of 3D and 2D cerebral angiograms, each acquired from a patient undergoing EIGI for either aneurysm coiling or embolization of an arteriovenous malformation. Use of the device integrated seamlessly into the clinical workflow of EIGI, while the automated pipeline eliminated all manual input and interactive image processing, analysis and annotation. In this way, the time to obtain the "gold standard" was reduced from 30 minutes to less than one, and the "gold standard" 3D-2D registration on all 15 datasets of cerebral angiograms was obtained with sub-0.1 mm accuracy.
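    How such a "gold standard" is typically scored can be sketched as a fiducial reprojection error: project the known 3D marker positions through the calibrated 2D system and average the distance to the marker centers detected in the 2D image. The toy projection matrix and marker positions below are invented (a standard pinhole-model sketch, not the authors' calibration).

```python
import numpy as np

def project(P, pts3d):
    """Apply a 3x4 projection matrix to Nx3 points, returning Nx2 pixels."""
    homog = np.hstack([pts3d, np.ones((len(pts3d), 1))])
    q = homog @ P.T
    return q[:, :2] / q[:, 2:3]  # divide out the homogeneous coordinate

def mean_reprojection_error(P, pts3d, pts2d):
    """Mean Euclidean distance between projected and detected markers."""
    return float(np.mean(np.linalg.norm(project(P, pts3d) - pts2d, axis=1)))

P = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])  # toy pinhole camera, unit focal length
markers3d = np.array([[0.0, 0.0, 2.0], [2.0, 2.0, 2.0]])
detected2d = np.array([[0.0, 0.0], [1.0, 1.1]])  # second marker 0.1 px off
err = mean_reprojection_error(P, markers3d, detected2d)
```

    Minimizing this error over the registration parameters is the usual way an "optimal" 3D-2D registration is defined once the fiducials are annotated.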

  19. Compiled MPI: Cost-Effective Exascale Applications Development

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bronevetsky, G; Quinlan, D; Lumsdaine, A

    2012-04-10

    The complexity of petascale and exascale machines makes it increasingly difficult to develop applications that can take advantage of them. Future systems are expected to feature billion-way parallelism, complex heterogeneous compute nodes and poor availability of memory (Peter Kogge, 2008). This new challenge for application development is motivating a significant amount of research and development on new programming models and runtime systems designed to simplify large-scale application development. Unfortunately, DoE has significant multi-decadal investment in a large family of mission-critical scientific applications. Scaling these applications to exascale machines will require a significant investment that will dwarf the costs of hardware procurement. A key reason for the difficulty in transitioning today's applications to exascale hardware is their reliance on explicit programming techniques, such as the Message Passing Interface (MPI) programming model to enable parallelism. MPI provides a portable and high performance message-passing system that enables scalable performance on a wide variety of platforms. However, it also forces developers to lock the details of parallelization together with application logic, making it very difficult to adapt the application to significant changes in the underlying system. Further, MPI's explicit interface makes it difficult to separate the application's synchronization and communication structure, reducing the amount of support that can be provided by compiler and run-time tools. This is in contrast to the recent research on more implicit parallel programming models such as Chapel, OpenMP and OpenCL, which promise to provide significantly more flexibility at the cost of reimplementing significant portions of the application.
We are developing CoMPI, a novel compiler-driven approach to enable existing MPI applications to scale to exascale systems with minimal modifications that can be made incrementally over the application's lifetime. It includes: (1) New set of source code annotations, inserted either manually or automatically, that will clarify the application's use of MPI to the compiler infrastructure, enabling greater accuracy where needed; (2) A compiler transformation framework that leverages these annotations to transform the original MPI source code to improve its performance and scalability; (3) Novel MPI runtime implementation techniques that will provide a rich set of functionality extensions to be used by applications that have been transformed by our compiler; and (4) A novel compiler analysis that leverages simple user annotations to automatically extract the application's communication structure and synthesize most complex code annotations.
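The flavor of item (1), source annotations that expose communication structure to tooling, can be sketched with a toy extractor. The pragma syntax "//@ mpi(name, pattern=...)" is invented for illustration and is not CoMPI's actual notation.

```python
import re

# Invented pragma-style annotation: //@ mpi(<name>, pattern=<kind>)
ANNOT = re.compile(r"//@\s*mpi\((\w+),\s*pattern=(\w+)\)")

source = """
//@ mpi(halo_exchange, pattern=neighbor)
MPI_Sendrecv(buf_l, n, MPI_DOUBLE, left, 0,
             buf_r, n, MPI_DOUBLE, right, 0, comm, &status);
//@ mpi(reduce_norm, pattern=collective)
MPI_Allreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, comm);
"""

# Recover the communication structure without interpreting the MPI calls
# themselves, which is the point of making the structure declarative.
structure = [(m.group(1), m.group(2)) for m in ANNOT.finditer(source)]
```

A real compiler framework would attach such facts to the AST rather than scrape comments, but the division of labor is the same: annotations state intent, tools act on it.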

  20. A multimedia comprehensive informatics system with decision support tools for a multi-site collaboration research of stroke rehabilitation

    NASA Astrophysics Data System (ADS)

    Wang, Ximing; Documet, Jorge; Garrison, Kathleen A.; Winstein, Carolee J.; Liu, Brent

    2012-02-01

    Stroke is a major cause of adult disability. The Interdisciplinary Comprehensive Arm Rehabilitation Evaluation (I-CARE) clinical trial aims to evaluate a therapy for arm rehabilitation after stroke. A primary outcome measure is correlative analysis between stroke lesion characteristics and standard measures of rehabilitation progress, from data collected at seven research facilities across the country. Sharing and communication of brain imaging and behavioral data is thus a challenge for collaboration. A solution is proposed as a web-based system with tools supporting imaging and informatics related data. In this system, users may upload anonymized brain images through a secure internet connection and the system will sort the imaging data for storage in a centralized database. Users may utilize an annotation tool to mark up images. In addition to imaging informatics, electronic data forms, for example, clinical data forms, are also integrated. Clinical information is processed and stored in the database to enable future data mining related development. Tele-consultation is facilitated through the development of a thin-client image viewing application. For convenience, the system supports access through desktop PCs, laptops, and iPads. Thus, clinicians may enter data directly into the system via iPad while working with participants in the study. Overall, this comprehensive imaging informatics system enables users to collect, organize and analyze stroke cases efficiently.

  1. Use of Annotations for Component and Framework Interoperability

    NASA Astrophysics Data System (ADS)

    David, O.; Lloyd, W.; Carlson, J.; Leavesley, G. H.; Geter, F.

    2009-12-01

    The popular programming languages Java and C# provide annotations, a form of meta-data construct. Software frameworks for web integration, web services, database access, and unit testing now take advantage of annotations to reduce the complexity of APIs and the quantity of integration code between the application and framework infrastructure. Adopting annotation features in frameworks has been observed to lead to cleaner and leaner application code. The USDA Object Modeling System (OMS) version 3.0 fully embraces the annotation approach and additionally defines a meta-data standard for components and models. In version 3.0, framework/model integration previously accomplished using API calls is now achieved using descriptive annotations. This enables the framework to provide additional functionality non-invasively, such as implicit multithreading and auto-documentation, while achieving a significant reduction in the size of the model source code. Using a non-invasive methodology leads to models and modeling components with only minimal dependencies on the modeling framework. Since models and modeling components are not directly bound to the framework by specific APIs and/or data types, they can more easily be reused both within the framework and outside of it. To study the effectiveness of an annotation-based framework approach relative to other modeling frameworks, a framework-invasiveness study was conducted to evaluate the effects of framework design on model code quality. A monthly water balance model was implemented across several modeling frameworks and several software metrics were collected. The metrics selected were measures of non-invasive design methods for modeling frameworks from a software engineering perspective. The use of annotations appears to positively impact several software quality measures.
As a next step, the PRMS model was implemented in OMS 3.0 and is currently being implemented for water supply forecasting in the western United States at the USDA NRCS National Water and Climate Center. PRMS is a component-based modular precipitation-runoff model developed to evaluate the impacts of various combinations of precipitation, climate, and land use on streamflow and general basin hydrology. The new OMS 3.0 PRMS model source code is more concise and flexible as a result of using the new framework's annotation-based approach. The fully annotated components now provide information directly for (i) model assembly and building, (ii) dataflow analysis for implicit multithreading, (iii) automated and comprehensive model documentation of component dependencies and physical data properties, (iv) automated model and component testing, and (v) automated audit-traceability to account for all model resources leading to a particular simulation result. Experience to date has demonstrated the multi-purpose value of using annotations. Annotations are also a feasible and practical method to enable interoperability among models and modeling frameworks. As a prototype example, model code annotations were used to generate binding and mediation code to allow the use of OMS 3.0 model components within the OpenMI context.
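    OMS 3.0 is a Java framework, but the annotation-driven style it describes can be mimicked in a short Python sketch: component fields carry declarative metadata, and a tiny "framework" pass auto-documents a component from its annotations instead of requiring API calls. The component and field names below are invented for illustration.

```python
# Illustrative stand-in for annotation-based component metadata (OMS itself
# uses Java annotations); the decorators and the Snow component are toys.
def In(fn):
    fn._role = "in"     # marks a component input
    return fn

def Out(fn):
    fn._role = "out"    # marks a component output
    return fn

class Snow:
    """A model component: metadata lives in annotations, not API calls."""
    @In
    def precip(self): ...
    @Out
    def melt(self): ...

def describe(component):
    """Framework pass: auto-document a component from its annotations."""
    roles = {}
    for name in dir(component):
        role = getattr(getattr(component, name), "_role", None)
        if role:
            roles.setdefault(role, []).append(name)
    return roles
```

Because the metadata is discovered non-invasively, the same pass could drive wiring, documentation, or dataflow analysis without the component depending on any framework API.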

  2. Iterative Tensor Voting for Perceptual Grouping of Ill-Defined Curvilinear Structures: Application to Adherens Junctions

    PubMed Central

    Loss, Leandro A.; Bebis, George; Parvin, Bahram

    2012-01-01

    In this paper, a novel approach is proposed for perceptual grouping and localization of ill-defined curvilinear structures. Our approach builds upon the tensor voting and iterative voting frameworks. Its efficacy lies in iterative refinements of curvilinear structures by gradually shifting from an exploratory to an exploitative mode. Such mode shifting is achieved by reducing the aperture of the tensor voting fields, which is shown to improve curve grouping and inference by enhancing the concentration of the votes over promising, salient structures. The proposed technique is applied to delineation of adherens junctions imaged through fluorescence microscopy. This class of membrane-bound macromolecules maintains tissue structural integrity and cell-cell interactions. Visually, it exhibits fibrous patterns that may be diffused, punctate, and frequently perceptual. Besides the application to real data, the proposed method is compared to prior methods on synthetic and annotated real data, showing high precision rates. PMID:21421432
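    A heavily simplified, single-pass sketch of the voting idea: each point casts a second-order tensor vote on every other point with Gaussian distance decay, and stick saliency is the eigenvalue gap of the accumulated tensor, so points lying on a curve score higher than isolated outliers. The paper's method additionally iterates while shrinking the voting-field aperture, which this sketch omits.

```python
import numpy as np

def saliency(points, sigma=3.0):
    """Single 'ball voting' pass: accumulate v v^T votes with Gaussian
    distance decay; stick saliency is the eigenvalue gap l1 - l2."""
    pts = np.asarray(points, dtype=float)
    sal = np.zeros(len(pts))
    for i, p in enumerate(pts):
        T = np.zeros((2, 2))
        for j, q in enumerate(pts):
            if i == j:
                continue
            d = q - p
            r = np.linalg.norm(d)
            v = d / r                                   # unit vote direction
            T += np.exp(-(r / sigma) ** 2) * np.outer(v, v)
        lo, hi = np.linalg.eigvalsh(T)                  # ascending eigenvalues
        sal[i] = hi - lo                                # stick saliency
    return sal
```

On a toy set of collinear points plus one outlier, the collinear points receive aligned votes and score much higher, which is the signal the iterative aperture reduction then sharpens.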

  3. PathFinder: reconstruction and dynamic visualization of metabolic pathways.

    PubMed

    Goesmann, Alexander; Haubrock, Martin; Meyer, Folker; Kalinowski, Jörn; Giegerich, Robert

    2002-01-01

    Beyond methods for gene-wise annotation and analysis of sequenced genomes, new automated methods for functional analysis on a higher level are needed. The identification of realized metabolic pathways provides valuable information on gene expression and regulation. Detection of incomplete pathways helps to improve a constantly evolving genome annotation or to discover alternative biochemical pathways. To utilize automated genome analysis on the level of metabolic pathways, new methods for the dynamic representation and visualization of pathways are needed. PathFinder is a tool for the dynamic visualization of metabolic pathways based on annotation data. Pathways are represented as directed acyclic graphs; graph layout algorithms accomplish the dynamic drawing and visualization of the metabolic maps. A more detailed analysis of the input data on the level of biochemical pathways helps to identify genes and detect improper parts of annotations. As a Relational Database Management System (RDBMS)-based internet application, PathFinder reads a list of EC numbers or a given annotation in EMBL or GenBank format and dynamically generates pathway graphs.
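    The core data structure is simple enough to sketch: a pathway as a directed graph of reactions keyed by EC number, plus a gap check that reports pathway steps missing from a genome annotation (the "incomplete pathway" detection mentioned above). The two-step pathway fragment below is a toy subset, not PathFinder's actual data model.

```python
# Toy pathway fragment: (EC number, substrate, product).
GLYCOLYSIS_TAIL = [
    ("2.7.1.40", "PEP", "pyruvate"),        # pyruvate kinase
    ("1.2.7.1",  "pyruvate", "acetyl-CoA"), # pyruvate synthase
]

def pathway_gaps(pathway, annotated_ecs):
    """ECs required by the pathway but absent from the genome annotation."""
    return [ec for ec, _, _ in pathway if ec not in annotated_ecs]

def as_dag(pathway):
    """Edges metabolite -> metabolite, labelled by the catalysing EC."""
    return {(s, p): ec for ec, s, p in pathway}
```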

  4. Object recognition using deep convolutional neural networks with complete transfer and partial frozen layers

    NASA Astrophysics Data System (ADS)

    Kruithof, Maarten C.; Bouma, Henri; Fischer, Noëlle M.; Schutte, Klamer

    2016-10-01

    Object recognition is important to understand the content of video and allow flexible querying in a large number of cameras, especially for security applications. Recent benchmarks show that deep convolutional neural networks are excellent approaches for object recognition. This paper describes an approach of domain transfer, where features learned from a large annotated dataset are transferred to a target domain where fewer annotated examples are available, as is typical for the security and defense domain. Many of these networks trained on natural images appear to learn features similar to Gabor filters and color blobs in the first layer. These first-layer features appear to be generic for many datasets and tasks, while the last layer is specific. In this paper, we study the effect of copying all layers and fine-tuning a variable number. We performed an experiment with a Caffe-based network on 1000 ImageNet classes that are randomly divided into two equal subgroups for the transfer from one to the other. We copy all layers and vary the number of layers that are fine-tuned and the size of the target dataset. We performed additional experiments with the Keras platform on the CIFAR-10 dataset to validate general applicability. We show with both platforms and both datasets that the accuracy on the target dataset improves when more target data is used. When the target dataset is large, it is beneficial to freeze only a few layers. For a large target dataset, the network without transfer learning performs better than the transfer network, especially if many layers are frozen. When the target dataset is small, it is beneficial to transfer (and freeze) many layers. For a small target dataset, the transfer network boosts generalization and performs much better than the network without transfer learning. Learning time can be reduced by freezing many layers in a network.
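    The "copy all layers, fine-tune a variable number" mechanism can be shown on a toy linear network in numpy: all layers are copied from the source model, and gradient updates skip the first n frozen layers. The paper's experiments use real CNNs in Caffe and Keras; this is only a mechanical sketch of the freezing logic.

```python
import numpy as np

rng = np.random.default_rng(0)

def transfer(source_layers, n_frozen):
    target = [W.copy() for W in source_layers]        # copy every layer
    trainable = set(range(n_frozen, len(target)))     # freeze the first n_frozen
    return target, trainable

def sgd_step(layers, trainable, x, y, lr=0.1):
    """One SGD step on mean squared error for a linear multi-layer net,
    updating only the trainable (non-frozen) layers."""
    activations = [x]
    for W in layers:                                  # forward pass
        activations.append(activations[-1] @ W)
    g = 2 * (activations[-1] - y) / len(x)            # d(MSE)/d(output)
    for i in reversed(range(len(layers))):            # backward pass
        grad_W = activations[i].T @ g
        g = g @ layers[i].T                           # propagate before update
        if i in trainable:
            layers[i] -= lr * grad_W                  # frozen layers skip this
```

After one step, a frozen layer is bit-identical to its source copy while fine-tuned layers have moved, which is exactly the state the experiments vary.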

  5. VESPA: software to facilitate genomic annotation of prokaryotic organisms through integration of proteomic and transcriptomic data.

    PubMed

    Peterson, Elena S; McCue, Lee Ann; Schrimpe-Rutledge, Alexandra C; Jensen, Jeffrey L; Walker, Hyunjoo; Kobold, Markus A; Webb, Samantha R; Payne, Samuel H; Ansong, Charles; Adkins, Joshua N; Cannon, William R; Webb-Robertson, Bobbie-Jo M

    2012-04-05

    The procedural aspects of genome sequencing and assembly have become relatively inexpensive, yet the full, accurate structural annotation of these genomes remains a challenge. Next-generation sequencing transcriptomics (RNA-Seq), global microarrays, and tandem mass spectrometry (MS/MS)-based proteomics have demonstrated immense value to genome curators as individual sources of information; however, integrating these data types to validate and improve structural annotation remains a major challenge. Current visual and statistical analytic tools are focused on a single data type, or existing software tools are retrofitted to analyze new data forms. We present Visual Exploration and Statistics to Promote Annotation (VESPA), a new interactive visual analysis software tool focused on assisting scientists with the annotation of prokaryotic genomes through the integration of proteomics and transcriptomics data with current genome location coordinates. VESPA is a desktop Java™ application that integrates high-throughput proteomics data (peptide-centric) and transcriptomics (probe or RNA-Seq) data into a genomic context, all of which can be visualized at three levels of genomic resolution. Data is interrogated via searches linked to the genome visualizations to find regions with high likelihood of mis-annotation. Search results are linked to exports for further validation outside of VESPA, or potential coding regions can be analyzed concurrently with the software through interaction with BLAST. Two use cases (Yersinia pestis Pestoides F and Synechococcus sp. PCC 7002) demonstrate the rapid manner in which mis-annotations can be found and explored in VESPA using either proteomics data alone or in combination with transcriptomic data. VESPA is an interactive visual analytics tool that integrates high-throughput data into a genomic context to facilitate the discovery of structural mis-annotations in prokaryotic genomes.
Data is evaluated via visual analysis across multiple levels of genomic resolution, linked searches and interaction with existing bioinformatics tools. We highlight the novel functionality of VESPA and core programming requirements for visualization of these large heterogeneous datasets for a client-side application. The software is freely available at https://www.biopilot.org/docs/Software/Vespa.php.
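    One kind of search VESPA supports can be sketched as an interval query: flag expression evidence (peptides or transcripts mapped to genome coordinates) that overlaps no annotated coding region, which marks a candidate mis-annotation such as a missed gene. The coordinates below are hypothetical.

```python
# Hypothetical coordinates: evidence intervals with no overlap to any
# annotated CDS are candidate mis-annotations (e.g. a missed gene).
def uncovered_evidence(evidence, cds_regions):
    def overlaps(a, b):
        return a[0] < b[1] and b[0] < a[1]   # half-open interval overlap
    return [ev for ev in evidence
            if not any(overlaps(ev, cds) for cds in cds_regions)]

cds = [(1000, 1900), (2500, 3200)]        # annotated gene coordinates
peptides = [(1100, 1160), (2100, 2160)]   # observed peptide mappings
```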

  6. Understanding Depressive Symptoms and Psychosocial Stressors on Twitter: A Corpus-Based Study.

    PubMed

    Mowery, Danielle; Smith, Hilary; Cheney, Tyler; Stoddard, Greg; Coppersmith, Glen; Bryan, Craig; Conway, Mike

    2017-02-28

    With a lifetime prevalence of 16.2%, major depressive disorder is the fifth biggest contributor to the disease burden in the United States. The aim of this study, building on previous work qualitatively analyzing depression-related Twitter data, was to describe the development of a comprehensive annotation scheme (ie, coding scheme) for manually annotating Twitter data with Diagnostic and Statistical Manual of Mental Disorders, Edition 5 (DSM-5) major depressive symptoms (eg, depressed mood, weight change, psychomotor agitation or retardation) and Diagnostic and Statistical Manual of Mental Disorders, Edition IV (DSM-IV) psychosocial stressors (eg, educational problems, problems with primary support group, housing problems). Using this annotation scheme, we developed an annotated corpus, Depressive Symptom and Psychosocial Stressors Acquired Depression, the SAD corpus, consisting of 9300 tweets randomly sampled from the Twitter application programming interface (API) using depression-related keywords (eg, depressed, gloomy, grief). An analysis of our annotated corpus yielded several key results. First, 72.09% (6829/9473) of tweets containing relevant keywords were nonindicative of depressive symptoms (eg, "we're in for a new economic depression"). Second, the most prevalent symptoms in our dataset were depressed mood and fatigue or loss of energy. Third, less than 2% of tweets contained more than one depression-related category (eg, diminished ability to think or concentrate, depressed mood). Finally, we found very high positive correlations between some depression-related symptoms in our annotated dataset (eg, fatigue or loss of energy and educational problems; educational problems and diminished ability to think). We successfully developed an annotation scheme and an annotated corpus, the SAD corpus, consisting of 9300 tweets randomly selected from the Twitter application programming interface using depression-related keywords.
Our analyses suggest that keyword queries alone might not be suitable for public health monitoring because context can change the meaning of a keyword in a statement. However, postprocessing approaches could be useful for reducing the noise and improving the signal needed to detect depression symptoms using social media. ©Danielle Mowery, Hilary Smith, Tyler Cheney, Greg Stoddard, Glen Coppersmith, Craig Bryan, Mike Conway. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 28.02.2017.
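    The corpus-construction step and the noise problem it surfaces can be sketched in a few lines: a keyword filter retrieves candidate tweets, but some hits use the keyword in a non-symptomatic sense, which is why manual annotation is still needed. The keywords come from the abstract; the example tweets are invented.

```python
import re

# Toy keyword sampler: the keyword match is necessary but not sufficient,
# since context can change the meaning of a keyword.
KEYWORDS = {"depressed", "gloomy", "grief"}

def keyword_hits(tweets):
    pat = re.compile(r"\b(" + "|".join(KEYWORDS) + r")\b", re.I)
    return [t for t in tweets if pat.search(t)]

tweets = [
    "feeling so depressed today",
    "this market is depressed right now",   # keyword hit, not a symptom
    "great day at the beach",
]
```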

  7. Understanding Depressive Symptoms and Psychosocial Stressors on Twitter: A Corpus-Based Study

    PubMed Central

    Smith, Hilary; Cheney, Tyler; Stoddard, Greg; Coppersmith, Glen; Bryan, Craig; Conway, Mike

    2017-01-01

    Background With a lifetime prevalence of 16.2%, major depressive disorder is the fifth biggest contributor to the disease burden in the United States. Objective The aim of this study, building on previous work qualitatively analyzing depression-related Twitter data, was to describe the development of a comprehensive annotation scheme (ie, coding scheme) for manually annotating Twitter data with Diagnostic and Statistical Manual of Mental Disorders, Edition 5 (DSM 5) major depressive symptoms (eg, depressed mood, weight change, psychomotor agitation, or retardation) and Diagnostic and Statistical Manual of Mental Disorders, Edition IV (DSM-IV) psychosocial stressors (eg, educational problems, problems with primary support group, housing problems). Methods Using this annotation scheme, we developed an annotated corpus, Depressive Symptom and Psychosocial Stressors Acquired Depression, the SAD corpus, consisting of 9300 tweets randomly sampled from the Twitter application programming interface (API) using depression-related keywords (eg, depressed, gloomy, grief). An analysis of our annotated corpus yielded several key results. Results First, 72.09% (6829/9473) of tweets containing relevant keywords were nonindicative of depressive symptoms (eg, “we’re in for a new economic depression”). Second, the most prevalent symptoms in our dataset were depressed mood and fatigue or loss of energy. Third, less than 2% of tweets contained more than one depression-related category (eg, diminished ability to think or concentrate, depressed mood). Finally, we found very high positive correlations between some depression-related symptoms in our annotated dataset (eg, fatigue or loss of energy and educational problems; educational problems and diminished ability to think). 
Conclusions We successfully developed an annotation scheme and an annotated corpus, the SAD corpus, consisting of 9300 tweets randomly selected from the Twitter application programming interface using depression-related keywords. Our analyses suggest that keyword queries alone might not be suitable for public health monitoring because context can change the meaning of a keyword in a statement. However, postprocessing approaches could be useful for reducing the noise and improving the signal needed to detect depression symptoms using social media. PMID:28246066

  8. VESPA: software to facilitate genomic annotation of prokaryotic organisms through integration of proteomic and transcriptomic data

    PubMed Central

    2012-01-01

    Background The procedural aspects of genome sequencing and assembly have become relatively inexpensive, yet the full, accurate structural annotation of these genomes remains a challenge. Next-generation sequencing transcriptomics (RNA-Seq), global microarrays, and tandem mass spectrometry (MS/MS)-based proteomics have demonstrated immense value to genome curators as individual sources of information; however, integrating these data types to validate and improve structural annotation remains a major challenge. Current visual and statistical analytic tools are focused on a single data type, or existing software tools are retrofitted to analyze new data forms. We present Visual Exploration and Statistics to Promote Annotation (VESPA), a new interactive visual analysis software tool focused on assisting scientists with the annotation of prokaryotic genomes through the integration of proteomics and transcriptomics data with current genome location coordinates. Results VESPA is a desktop Java™ application that integrates high-throughput proteomics data (peptide-centric) and transcriptomics (probe or RNA-Seq) data into a genomic context, all of which can be visualized at three levels of genomic resolution. Data is interrogated via searches linked to the genome visualizations to find regions with high likelihood of mis-annotation. Search results are linked to exports for further validation outside of VESPA, or potential coding regions can be analyzed concurrently with the software through interaction with BLAST. Two use cases (Yersinia pestis Pestoides F and Synechococcus sp. PCC 7002) demonstrate the rapid manner in which mis-annotations can be found and explored in VESPA using either proteomics data alone or in combination with transcriptomic data. 
Conclusions VESPA is an interactive visual analytics tool that integrates high-throughput data into a genomic context to facilitate the discovery of structural mis-annotations in prokaryotic genomes. Data is evaluated via visual analysis across multiple levels of genomic resolution, linked searches and interaction with existing bioinformatics tools. We highlight the novel functionality of VESPA and core programming requirements for visualization of these large heterogeneous datasets for a client-side application. The software is freely available at https://www.biopilot.org/docs/Software/Vespa.php. PMID:22480257

  9. Challenges in Whole-Genome Annotation of Pyrosequenced Eukaryotic Genomes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kuo, Alan; Grigoriev, Igor

    2009-04-17

    Pyrosequencing technologies such as 454/Roche and Solexa/Illumina vastly lower the cost of nucleotide sequencing compared to the traditional Sanger method, and thus promise to greatly expand the number of sequenced eukaryotic genomes. However, the new technologies also bring new challenges, such as shorter reads and new kinds and higher rates of sequencing errors, which complicate genome assembly and gene prediction. At JGI we are deploying 454 technology for the sequencing and assembly of ever-larger eukaryotic genomes. Here we describe our first whole-genome annotation of a purely 454-sequenced fungal genome that is larger than a yeast (>30 Mbp). The pezizomycotine (filamentous ascomycete) Aspergillus carbonarius belongs to the Aspergillus section Nigri species complex, members of which are significant as platforms for bioenergy and bioindustrial technology, as members of soil microbial communities and players in the global carbon cycle, and as agricultural toxigens. Application of a modified version of the standard JGI Annotation Pipeline has so far predicted ~10k genes. ~12% of these preliminary annotations suffer a potential frameshift error, which is somewhat higher than the ~9% rate in the Sanger-sequenced and conventionally assembled and annotated genome of fellow Aspergillus section Nigri member A. niger. Also, >90% of A. niger genes have potential homologs in the A. carbonarius preliminary annotation. We conclude, and with further annotation and comparative analysis expect to confirm, that 454 sequencing strategies provide a promising substrate for annotation of modestly sized eukaryotic genomes. We will also present results of annotation of a number of other pyrosequenced fungal genomes of bioenergy interest.
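    One cheap symptom of the frameshift errors discussed above is an in-frame stop codon inside an annotated coding sequence. The toy check below tests only that symptom; real pipelines also compare predicted proteins against homologs, and the sequences here are hand-crafted examples.

```python
# Toy frameshift symptom check: a stop codon before the final codon of an
# annotated CDS suggests the reading frame is broken.
STOPS = {"TAA", "TAG", "TGA"}

def internal_stop(cds):
    """True if any codon before the last is a stop codon."""
    codons = [cds[i:i + 3] for i in range(0, len(cds) - 3, 3)]
    return any(c in STOPS for c in codons)
```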

  10. Expansion of DSSTox: Leveraging public data to create a semantic cheminformatics resource with quality annotations for support of U.S. EPA applications. (American Chemical Society)

    EPA Science Inventory

    The expansion of chemical-bioassay data in the public domain is a boon to science; however, the difficulty in establishing accurate linkages from CAS registry number (CASRN) to structure, or for properly annotating names and synonyms for a particular structure is well known. DSS...

  11. Individualization and Modularization of Vocational Education Instructional Materials. An Annotated Bibliography of Publications and Projects. Bibliography Series No. 32.

    ERIC Educational Resources Information Center

    Magisos, Joel H., Comp.; Stakelon, Anne E., Comp.

    This annotated bibliography is designed to assist applicants for research grants under part C (section 131a) of the amendments to the Vocational Education Act of 1963 by providing access to documents, journal articles, and current projects related to the individualization and modularization of vocational education instructional materials. The…

  12. @Note: a workbench for biomedical text mining.

    PubMed

    Lourenço, Anália; Carreira, Rafael; Carneiro, Sónia; Maia, Paulo; Glez-Peña, Daniel; Fdez-Riverola, Florentino; Ferreira, Eugénio C; Rocha, Isabel; Rocha, Miguel

    2009-08-01

    Biomedical Text Mining (BioTM) is providing valuable approaches to the automated curation of scientific literature. However, most efforts have addressed the benchmarking of new algorithms rather than user operational needs. Bridging the gap between BioTM researchers and biologists' needs is crucial to solve real-world problems and promote further research. We present @Note, a platform for BioTM that aims at the effective translation of the advances between three distinct classes of users: biologists, text miners and software developers. Its main functional contributions are the ability to process abstracts and full texts; an information retrieval module enabling PubMed search and journal crawling; a pre-processing module with PDF-to-text conversion, tokenisation and stopword removal; a semantic annotation schema; a lexicon-based annotator; a user-friendly annotation view that allows users to correct annotations; and a Text Mining Module supporting dataset preparation and algorithm evaluation. @Note improves interoperability, modularity and flexibility when integrating in-home and open-source third-party components. Its component-based architecture allows the rapid development of new applications, emphasizing the principles of transparency and simplicity of use. Although it is still ongoing, it has already allowed the development of applications that are currently being used.
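    The pre-processing and lexicon-based annotation stages mentioned above (tokenisation, stopword removal, dictionary lookup) can be sketched in a few lines; the stopword list and lexicon below are toy stand-ins, not @Note's actual resources.

```python
import re

# Toy pre-processing + lexicon-based annotation pipeline.
STOPWORDS = {"the", "of", "in", "is", "a"}
LEXICON = {"glucose": "Metabolite", "lacz": "Gene"}   # toy dictionary

def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())

def annotate(text):
    tokens = [t for t in tokenize(text) if t not in STOPWORDS]
    return [(t, LEXICON[t]) for t in tokens if t in LEXICON]
```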

  13. Application of MPEG-7 descriptors for content-based indexing of sports videos

    NASA Astrophysics Data System (ADS)

    Hoeynck, Michael; Auweiler, Thorsten; Ohm, Jens-Rainer

    2003-06-01

    The amount of multimedia data available worldwide is increasing every day. There is a vital need to annotate multimedia data in order to allow universal content access and to provide content-based search-and-retrieval functionalities. Since supervised video annotation can be time consuming, an automatic solution is appreciated. We review recent approaches to content-based indexing and annotation of videos for different kinds of sports, and present our application for the automatic annotation of equestrian sports videos. In doing so, we especially concentrate on MPEG-7 based feature extraction and content description. We apply different visual descriptors for cut detection. Further, we extract the temporal positions of single obstacles on the course by analyzing MPEG-7 edge information and taking specific domain knowledge into account. Once single shot positions as well as the visual highlights have been determined, the information is stored jointly with additional textual information in an MPEG-7 description scheme. Using this information, we generate content summaries which can be utilized in a user front-end in order to provide content-based access to the video stream, as well as further content-based queries and navigation on a video-on-demand streaming server.
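    Cut detection via visual descriptors can be reduced to a minimal sketch: compare intensity histograms of consecutive frames and declare a cut where the distance spikes. MPEG-7 defines much richer descriptors (color, edge, motion); this shows only the underlying idea, with frames as flat lists of pixel intensities.

```python
# Minimal shot-cut detector: L1 distance between consecutive frame histograms.
def histogram(frame, bins=4, lo=0, hi=256):
    h = [0] * bins
    for px in frame:
        h[min(bins - 1, (px - lo) * bins // (hi - lo))] += 1
    return h

def detect_cuts(frames, threshold):
    cuts = []
    for i in range(1, len(frames)):
        h1, h2 = histogram(frames[i - 1]), histogram(frames[i])
        if sum(abs(a - b) for a, b in zip(h1, h2)) > threshold:
            cuts.append(i)   # cut between frame i-1 and frame i
    return cuts
```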

  14. High-Throughput Classification of Radiographs Using Deep Convolutional Neural Networks.

    PubMed

    Rajkomar, Alvin; Lingam, Sneha; Taylor, Andrew G; Blum, Michael; Mongan, John

    2017-02-01

    The study aimed to determine if computer vision techniques rooted in deep learning can use a small set of radiographs to perform clinically relevant image classification with high fidelity. One thousand eight hundred eighty-five chest radiographs of 909 patients obtained between January 2013 and July 2015 at our institution were retrieved and anonymized. The source images were manually annotated as frontal or lateral and randomly divided into training, validation, and test sets. Training and validation sets were augmented to over 150,000 images using standard image manipulations. We then pre-trained a series of deep convolutional networks based on the open-source GoogLeNet with various transformations of the open-source ImageNet (non-radiology) images. These trained networks were then fine-tuned using the original and augmented radiology images. The model with the highest validation accuracy was applied to our institutional test set and a publicly available set. Accuracy was assessed by using the Youden index to set a binary cutoff for frontal or lateral classification. This retrospective study was IRB approved prior to initiation. A network pre-trained on 1.2 million greyscale ImageNet images and fine-tuned on augmented radiographs was chosen. The binary classification method correctly classified 100% (95% CI 99.73-100%) of both our test set and the publicly available images. Classification was rapid, at 38 images per second. A deep convolutional neural network created using non-radiological images and an augmented set of radiographs is effective in highly accurate classification of chest radiograph view type and is a feasible, rapid method for high-throughput annotation.
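    Choosing a binary cutoff with the Youden index, as described above, amounts to maximizing J = sensitivity + specificity − 1 over candidate thresholds. A minimal sketch:

```python
# Youden index cutoff: maximize J = sensitivity + specificity - 1
# over the observed scores as candidate thresholds.
def youden_cutoff(scores, labels):
    best = (-2.0, None)   # J is bounded below by -1, so -2 is a safe sentinel
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y)
        tn = sum(1 for s, y in zip(scores, labels) if s < t and not y)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and not y)
        j = tp / (tp + fn) + tn / (tn + fp) - 1
        if j > best[0]:
            best = (j, t)
    return best   # (J, threshold)
```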

  15. An application to pulmonary emphysema classification based on model of texton learning by sparse representation

    NASA Astrophysics Data System (ADS)

    Zhang, Min; Zhou, Xiangrong; Goshima, Satoshi; Chen, Huayue; Muramatsu, Chisako; Hara, Takeshi; Yokoyama, Ryojiro; Kanematsu, Masayuki; Fujita, Hiroshi

    2012-03-01

    We aim to use a new texton-based texture classification method for the classification of pulmonary emphysema in computed tomography (CT) images of the lungs. Unlike conventional computer-aided diagnosis (CAD) pulmonary emphysema classification methods, in this paper, first, the texton dictionary is learned by applying sparse representation (SR) to image patches in the training dataset. Then the SR coefficients of the test images over the dictionary are used to construct histograms for texture representation. Finally, classification is performed using a nearest neighbor classifier with a histogram dissimilarity measure as distance. The proposed approach is tested on 3840 annotated regions of interest consisting of normal tissue and mild, moderate, and severe pulmonary emphysema of three subtypes. The performance of the proposed system, with an accuracy of about 88%, is higher than that of the state-of-the-art method based on basic rotation-invariant local binary pattern histograms and of the texture classification method based on texton learning by k-means, which performs among the best of the approaches in the literature.
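    The histogram-plus-nearest-neighbor stage can be sketched with a simplification: the paper derives histograms from sparse-representation coefficients, while the sketch below just assigns each patch to its nearest dictionary atom and classifies by L1 histogram dissimilarity. Dictionary, patches, and class references are toy data.

```python
import numpy as np

# Simplified texton pipeline: nearest-atom assignment (the paper uses
# sparse-coding coefficients instead), histogram, then 1-NN on histograms.
def texton_histogram(patches, dictionary):
    d = ((patches[:, None, :] - dictionary[None, :, :]) ** 2).sum(-1)
    idx = d.argmin(1)                       # nearest atom per patch
    return np.bincount(idx, minlength=len(dictionary)) / len(patches)

def nearest_class(hist, references):
    """references: {label: histogram}; L1 histogram dissimilarity."""
    return min(references, key=lambda c: np.abs(references[c] - hist).sum())
```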

  16. Multi-dimensional classification of biomedical text: Toward automated, practical provision of high-utility text to diverse users

    PubMed Central

    Shatkay, Hagit; Pan, Fengxia; Rzhetsky, Andrey; Wilbur, W. John

    2008-01-01

    Motivation: Much current research in biomedical text mining is concerned with serving biologists by extracting certain information from scientific text. We note that there is no ‘average biologist’ client; different users have distinct needs. For instance, as noted in past evaluation efforts (BioCreative, TREC, KDD) database curators are often interested in sentences showing experimental evidence and methods. Conversely, lab scientists searching for known information about a protein may seek facts, typically stated with high confidence. Text-mining systems can target specific end-users and become more effective, if the system can first identify text regions rich in the type of scientific content that is of interest to the user, retrieve documents that have many such regions, and focus on fact extraction from these regions. Here, we study the ability to characterize and classify such text automatically. We have recently introduced a multi-dimensional categorization and annotation scheme, developed to be applicable to a wide variety of biomedical documents and scientific statements, while intended to support specific biomedical retrieval and extraction tasks. Results: The annotation scheme was applied to a large corpus in a controlled effort by eight independent annotators, where three individual annotators independently tagged each sentence. We then trained and tested machine learning classifiers to automatically categorize sentence fragments based on the annotation. We discuss here the issues involved in this task, and present an overview of the results. The latter strongly suggest that automatic annotation along most of the dimensions is highly feasible, and that this new framework for scientific sentence categorization is applicable in practice. Contact: shatkay@cs.queensu.ca PMID:18718948
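    With three independent annotators tagging each sentence, one natural consolidation rule is a per-dimension majority vote, leaving dimensions without a majority unresolved. The dimension names below are illustrative, not the scheme's actual axes.

```python
from collections import Counter

# Majority vote over three annotators, per annotation dimension.
def majority(labels):
    (label, votes), = Counter(labels).most_common(1)
    return label if votes >= 2 else None   # no majority -> unresolved

def consolidate(sentence_annotations):
    """sentence_annotations: {dimension: [label from each annotator]}"""
    return {dim: majority(labs) for dim, labs in sentence_annotations.items()}
```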

  17. Exploiting the potential of unlabeled endoscopic video data with self-supervised learning.

    PubMed

    Ross, Tobias; Zimmerer, David; Vemuri, Anant; Isensee, Fabian; Wiesenfarth, Manuel; Bodenstedt, Sebastian; Both, Fabian; Kessler, Philip; Wagner, Martin; Müller, Beat; Kenngott, Hannes; Speidel, Stefanie; Kopp-Schneider, Annette; Maier-Hein, Klaus; Maier-Hein, Lena

    2018-06-01

    Surgical data science is a new research field that aims to observe all aspects of the patient treatment process in order to provide the right assistance at the right time. Due to the breakthrough successes of deep learning-based solutions for automatic image annotation, the availability of reference annotations for algorithm training is becoming a major bottleneck in the field. The purpose of this paper was to investigate the concept of self-supervised learning to address this issue. Our approach is guided by the hypothesis that unlabeled video data can be used to learn a representation of the target domain that boosts the performance of state-of-the-art machine learning algorithms when used for pre-training. The core of the method is an auxiliary task, based on raw endoscopic video data of the target domain, that is used to initialize the convolutional neural network (CNN) for the target task. In this paper, we propose the re-colorization of medical images with a conditional generative adversarial network (cGAN)-based architecture as the auxiliary task. A variant of the method involves a second pre-training step based on labeled data for the target task from a related domain. We validate both variants using medical instrument segmentation as the target task. The proposed approach can be used to radically reduce the manual annotation effort involved in training CNNs. Compared to the baseline approach of generating annotated data from scratch, our method decreases the number of labeled images required by up to 75% without sacrificing performance. Our method also outperforms alternative methods for CNN pre-training, such as pre-training on publicly available non-medical (COCO) or medical data (MICCAI EndoVis2017 challenge) using the target task (in this instance: segmentation). As it makes efficient use of available (non-)public and (un-)labeled data, the approach has the potential to become a valuable tool for CNN (pre-)training.
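    A linear toy of the pre-training pipeline: fit a map on an auxiliary task constructed purely from unlabeled data (here, "re-colorization" shrunk to predicting all channels from their grayscale mean), then reuse the learned weights as the initialization for the target model. The paper does this with a cGAN and CNNs; everything below is an illustrative stand-in.

```python
import numpy as np

def pretrain(unlabeled):
    """Auxiliary task needing no labels: predict channels from grayscale."""
    gray = unlabeled.mean(axis=1, keepdims=True)            # auxiliary input
    W, *_ = np.linalg.lstsq(gray, unlabeled, rcond=None)    # least-squares fit
    return W                                                # learned map

def init_target_model(pretrained_W):
    return pretrained_W.copy()   # initialization for target-task fine-tuning
```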

  18. Chest wall segmentation in automated 3D breast ultrasound scans.

    PubMed

    Tan, Tao; Platel, Bram; Mann, Ritse M; Huisman, Henkjan; Karssemeijer, Nico

    2013-12-01

    In this paper, we present an automatic method to segment the chest wall in automated 3D breast ultrasound images. Determining the location of the chest wall in automated 3D breast ultrasound images is necessary in computer-aided detection systems to remove automatically detected cancer candidates beyond the chest wall, and it can be of great help for inter- and intra-modal image registration. We show that the visible part of the chest wall in an automated 3D breast ultrasound image can be accurately modeled by a cylinder. We fit the surface of our cylinder model to a set of automatically detected rib-surface points. The detection of the rib-surface points is done by a classifier using features representing local image intensity patterns and the presence of rib shadows. Due to attenuation of the ultrasound signal, a clear shadow is visible behind the ribs. Evaluation of our segmentation method is done by computing the distance of manually annotated rib points to the surface of the automatically detected chest wall. We examined the performance on images obtained with the two most common 3D breast ultrasound devices on the market. In a dataset of 142 images, the average mean distance of the annotated points to the segmented chest wall was 5.59 ± 3.08 mm. Copyright © 2012 Elsevier B.V. All rights reserved.
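If the cylinder axis is assumed parallel to one image axis, fitting a cylinder to detected rib-surface points reduces to a least-squares circle fit in the orthogonal plane. A hedged sketch of that reduced problem using the algebraic (Kasa) circle fit; the axis-alignment assumption and the function name are ours, not the paper's:

```python
import numpy as np

def fit_cylinder_yz(points):
    """Fit a cylinder whose axis is assumed parallel to the x-axis to 3D
    surface points, via an algebraic circle fit in the y-z plane.
    Returns the circle center (yc, zc) and radius r."""
    y, z = points[:, 1], points[:, 2]
    # Solve y^2 + z^2 + D*y + E*z + F = 0 in the least-squares sense.
    A = np.column_stack([y, z, np.ones_like(y)])
    b = -(y**2 + z**2)
    (D, E, F), *_ = np.linalg.lstsq(A, b, rcond=None)
    yc, zc = -D / 2.0, -E / 2.0
    r = np.sqrt(yc**2 + zc**2 - F)
    return yc, zc, r

# Synthetic "rib-surface" points lying on a cylinder of radius 60 (mm),
# axis along x, cross-section centered at (y, z) = (10, 20).
theta = np.linspace(0.2, 1.2, 50)
pts = np.column_stack([np.linspace(0, 40, 50),   # position along the axis
                       10 + 60 * np.cos(theta),  # y
                       20 + 60 * np.sin(theta)]) # z
yc, zc, r = fit_cylinder_yz(pts)
```

On noise-free points the algebraic fit recovers the cylinder exactly; with noisy classifier detections, a robust variant (e.g. RANSAC around this fit) would be the natural extension.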

  19. Comparative Omics-Driven Genome Annotation Refinement: Application across Yersiniae

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Rutledge, Alexandra C.; Jones, Marcus B.; Chauhan, Sadhana

    2012-03-27

    Genome sequencing continues to be a rapidly evolving technology, yet most downstream aspects of genome annotation pipelines remain relatively stable or are even being abandoned. To date, the perceived value of manual curation for genome annotations is not offset by the real cost and time associated with the process. In order to balance the large number of sequences generated, the annotation process is now performed almost exclusively in an automated fashion for most genome sequencing projects. One possible way to reduce errors inherent to automated computational annotations is to apply data from 'omics' measurements (i.e. transcriptional and proteomic) to the un-annotated genome with a proteogenomic-based approach. This approach does require additional experimental and bioinformatics methods to include omics technologies; however, the approach is readily automatable and can benefit from rapid developments occurring in those research domains as well. The annotation process can be improved by experimental validation of transcription and translation and aid in the discovery of annotation errors. Here the concept of annotation refinement has been extended to include a comparative assessment of genomes across closely related species, as is becoming common in sequencing efforts. Transcriptomic and proteomic data derived from three highly similar pathogenic Yersiniae (Y. pestis CO92, Y. pestis pestoides F, and Y. pseudotuberculosis PB1/+) was used to demonstrate a comprehensive comparative omic-based annotation methodology. Peptide and oligo measurements experimentally validated the expression of nearly 40% of each strain's predicted proteome and revealed the identification of 28 novel and 68 previously incorrect protein-coding sequences (e.g., observed frameshifts, extended start sites, and translated pseudogenes) within the three current Yersinia genome annotations. Gene loss is presumed to play a major role in Y. pestis acquiring its niche as a virulent pathogen, thus the discovery of many translated pseudogenes underscores a need for functional analyses to investigate hypotheses related to divergence. Refinements included the discovery of a seemingly essential ribosomal protein, several virulence-associated factors, and a transcriptional regulator, among other proteins, most of which are annotated as hypothetical, that were missed during annotation.

  20. What do we do with all this video? Better understanding public engagement for image and video annotation

    NASA Astrophysics Data System (ADS)

    Wiener, C.; Miller, A.; Zykov, V.

    2016-12-01

    Advanced robotic vehicles are increasingly being used by oceanographic research vessels to enable more efficient and widespread exploration of the ocean, particularly the deep ocean. With cutting-edge capabilities mounted onto robotic vehicles, data is being generated at higher resolutions and in greater volumes than ever before, enabling enhanced data collection and broader participation. For example, high resolution camera technology not only improves visualization of the ocean environment, but also expands the capacity to engage participants remotely through increased use of telepresence and virtual reality techniques. Schmidt Ocean Institute is a private, non-profit operating foundation established to advance the understanding of the world's oceans through technological advancement, intelligent observation and analysis, and open sharing of information. Telepresence-enabled research is an important component of Schmidt Ocean Institute's science research cruises, which this presentation will highlight. Schmidt Ocean Institute is one of the only research programs that makes its entire underwater vehicle dive series available online, creating a collection of video that enables anyone to follow deep sea research in real time. We encourage students, educators and the general public to take advantage of freely available dive videos. Additionally, other SOI-supported internet platforms have engaged the public in image and video annotation activities. Examples of these new online platforms, which utilize citizen scientists to annotate scientific image and video data, will be provided. This presentation will include an introduction to SOI-supported video and image tagging citizen science projects, real-time robot tracking, live ship-to-shore communications, and an array of outreach activities that enable scientists to interact with the public and explore the ocean in fascinating detail.

  1. Automated extraction of chemical structure information from digital raster images

    PubMed Central

    Park, Jungkap; Rosania, Gus R; Shedden, Kerby A; Nguyen, Mandee; Lyu, Naesung; Saitou, Kazuhiro

    2009-01-01

    Background To search for chemical structures in research articles, diagrams or text representing molecules need to be translated to a standard chemical file format compatible with cheminformatic search engines. Nevertheless, chemical information contained in research articles is often referenced as analog diagrams of chemical structures embedded in digital raster images. To automate analog-to-digital conversion of chemical structure diagrams in scientific research articles, several software systems have been developed. But their algorithmic performance and utility in cheminformatic research have not been investigated. Results This paper aims to provide critical reviews for these systems and also report our recent development of ChemReader, a fully automated tool for extracting chemical structure diagrams from research articles and converting them into standard, searchable chemical file formats. Basic algorithms for recognizing lines and letters representing bonds and atoms in chemical structure diagrams can be run independently in sequence from a graphical user interface, and the algorithm parameters can be readily changed, to facilitate additional development specifically tailored to a chemical database annotation scheme. Compared with existing software programs such as OSRA, Kekule, and CLiDE, our results indicate that ChemReader outperforms other software systems on several sets of sample images from diverse sources in terms of the rate of correct outputs and the accuracy on extracting molecular substructure patterns. Conclusion The availability of ChemReader as a cheminformatic tool for extracting chemical structure information from digital raster images allows research and development groups to enrich their chemical structure databases by annotating the entries with published research articles. 
Based on its stable performance and high accuracy, ChemReader may be sufficiently accurate for annotating the chemical database with links to scientific research articles. PMID:19196483

  2. Semantic labeling of digital photos by classification

    NASA Astrophysics Data System (ADS)

    Ciocca, Gianluigi; Cusano, Claudio; Schettini, Raimondo; Brambilla, Carla

    2003-01-01

    The paper addresses the problem of annotating photographs with broad semantic labels. To cope with the great variety of photos available on the Web, we have designed a hierarchical classification strategy which first classifies images as pornographic or not-pornographic. Not-pornographic images are then classified as indoor, outdoor, or close-up. On a database of over 9000 images, mostly downloaded from the web, our method achieves an average accuracy close to 90%.
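The two-stage decision described above amounts to a short dispatch: a binary filter, then a three-way classifier applied only to images that pass it. A minimal sketch; both classifier arguments are hypothetical stand-ins for the paper's trained models:

```python
def classify_photo(image, is_pornographic, indoor_outdoor_closeup):
    """Hierarchical labeling: apply the binary filter first, and only
    run the three-way classifier on images that pass it."""
    if is_pornographic(image):
        return "pornographic"
    return indoor_outdoor_closeup(image)  # 'indoor' | 'outdoor' | 'close-up'

# Dummy stand-ins for trained classifiers, for illustration only.
label = classify_photo("photo.jpg",
                       lambda img: False,
                       lambda img: "outdoor")
```

One practical benefit of the hierarchy is that each stage can be trained and tuned on its own, simpler subproblem.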

  3. Image standards in tissue-based diagnosis (diagnostic surgical pathology).

    PubMed

    Kayser, Klaus; Görtler, Jürgen; Goldmann, Torsten; Vollmer, Ekkehard; Hufnagl, Peter; Kayser, Gian

    2008-04-18

    Progress in automated image analysis, virtual microscopy, hospital information systems, and interdisciplinary data exchange requires image standards to be applied in tissue-based diagnosis. AIMS: To describe the theoretical background, practical experiences and comparable solutions in other medical fields to promote image standards applicable for diagnostic pathology. THEORY AND EXPERIENCES: Images used in tissue-based diagnosis present with pathology-specific characteristics. It seems appropriate to discuss their characteristics and potential standardization in relation to the levels of hierarchy in which they appear. All levels can be divided into legal, medical, and technological properties. Standards applied to the first level include regulations or aims to be fulfilled. In legal properties, they have to regulate features of privacy, image documentation, transmission, and presentation; in medical properties, features of disease-image combination, human diagnostics, automated information extraction, and archive retrieval and access; and in technological properties, features of image acquisition, display, formats, transfer speed, safety, and system dynamics. The next lower, second level has to implement the prescriptions of the upper one, i.e. describe how they are implemented. Legal aspects should demand secure encryption for privacy of all patient-related data; image archives that include all images used for diagnostics for a period of 10 years at minimum; accurate annotations of dates and viewing; and precise hardware and software information. Medical aspects should demand standardized patients' files such as DICOM 3 or HL7, including history and previous examinations; information on image display hardware and software, on image resolution and fields of view, on the relation between sizes of biological objects and image sizes, and on access to archives and retrieval. 
Technological aspects should deal with image acquisition systems (resolution, colour temperature, focus, brightness, and quality evaluation procedures), display resolution data, implemented image formats, storage, cycle frequency, backup procedures, operation system, and external system accessibility. The lowest third level describes the permitted limits and threshold in detail. At present, an applicable standard including all mentioned features does not exist to our knowledge; some aspects can be taken from radiological standards (PACS, DICOM 3); others require specific solutions or are not covered yet. The progress in virtual microscopy and application of artificial intelligence (AI) in tissue-based diagnosis demands fast preparation and implementation of an internationally acceptable standard. The described hierarchic order as well as analytic investigation in all potentially necessary aspects and details offers an appropriate tool to specifically determine standardized requirements.

  4. The web server of IBM's Bioinformatics and Pattern Discovery group.

    PubMed

    Huynh, Tien; Rigoutsos, Isidore; Parida, Laxmi; Platt, Daniel; Shibuya, Tetsuo

    2003-07-01

    We herein present and discuss the services and content which are available on the web server of IBM's Bioinformatics and Pattern Discovery group. The server is operational around the clock and provides access to a variety of methods that have been published by the group's members and collaborators. The available tools correspond to applications ranging from the discovery of patterns in streams of events and the computation of multiple sequence alignments, to the discovery of genes in nucleic acid sequences and the interactive annotation of amino acid sequences. Additionally, annotations for more than 70 archaeal, bacterial, eukaryotic and viral genomes are available on-line and can be searched interactively. The tools and code bundles can be accessed beginning at http://cbcsrv.watson.ibm.com/Tspd.html whereas the genomics annotations are available at http://cbcsrv.watson.ibm.com/Annotations/.

  5. The web server of IBM's Bioinformatics and Pattern Discovery group

    PubMed Central

    Huynh, Tien; Rigoutsos, Isidore; Parida, Laxmi; Platt, Daniel; Shibuya, Tetsuo

    2003-01-01

    We herein present and discuss the services and content which are available on the web server of IBM's Bioinformatics and Pattern Discovery group. The server is operational around the clock and provides access to a variety of methods that have been published by the group's members and collaborators. The available tools correspond to applications ranging from the discovery of patterns in streams of events and the computation of multiple sequence alignments, to the discovery of genes in nucleic acid sequences and the interactive annotation of amino acid sequences. Additionally, annotations for more than 70 archaeal, bacterial, eukaryotic and viral genomes are available on-line and can be searched interactively. The tools and code bundles can be accessed beginning at http://cbcsrv.watson.ibm.com/Tspd.html whereas the genomics annotations are available at http://cbcsrv.watson.ibm.com/Annotations/. PMID:12824385

  6. Iodine and freeze-drying enhanced high-resolution MicroCT imaging for reconstructing 3D intraneural topography of human peripheral nerve fascicles.

    PubMed

    Yan, Liwei; Guo, Yongze; Qi, Jian; Zhu, Qingtang; Gu, Liqiang; Zheng, Canbin; Lin, Tao; Lu, Yutong; Zeng, Zitao; Yu, Sha; Zhu, Shuang; Zhou, Xiang; Zhang, Xi; Du, Yunfei; Yao, Zhi; Lu, Yao; Liu, Xiaolin

    2017-08-01

    The precise annotation and accurate identification of the topography of fascicles to the end organs are prerequisites for studying human peripheral nerves. In this study, we present a feasible imaging method that acquires 3D high-resolution (HR) topography of peripheral nerve fascicles using an iodine and freeze-drying (IFD) micro-computed tomography (microCT) method to greatly increase the contrast of fascicle images. The enhanced microCT imaging method can facilitate the reconstruction of high-contrast HR fascicle images, fascicle segmentation and extraction, feature analysis, and the tracing of fascicle topography to end organs, which define fascicle functions. The complex intraneural aggregation and distribution of fascicles is typically assessed using histological techniques or MR imaging to acquire coarse axial three-dimensional (3D) maps. However, the disadvantages of histological techniques (static, axial manual registration, and data instability) and MR imaging (low-resolution) limit these applications in reconstructing the topography of nerve fascicles. Thus, enhanced microCT is a new technique for acquiring 3D intraneural topography of the human peripheral nerve fascicles both to improve our understanding of neurobiological principles and to guide accurate repair in the clinic. Additionally, 3D microstructure data can be used as a biofabrication model, which in turn can be used to fabricate scaffolds to repair long nerve gaps. Copyright © 2017 Elsevier B.V. All rights reserved.

  7. An efficient visualization method for analyzing biometric data

    NASA Astrophysics Data System (ADS)

    Rahmes, Mark; McGonagle, Mike; Yates, J. Harlan; Henning, Ronda; Hackett, Jay

    2013-05-01

    We introduce a novel application for biometric data analysis. This technology can be used as part of a unique and systematic approach designed to augment existing processing chains. Our system provides image quality control and analysis capabilities. We show how analysis and efficient visualization are used as part of an automated process. The goal of this system is to provide a unified platform for the analysis of biometric images that reduces manual effort and increases the likelihood of a match being brought to an examiner's attention from either a manual or lights-out application. We discuss the functionality of FeatureSCOPE™, which provides an efficient tool for feature analysis and quality control of biometric extracted features. Biometric databases must be checked for accuracy for a large volume of data attributes. Our solution accelerates review of features by a factor of up to 100 times. Qualitative results and cost reduction are shown by using efficient parallel visual review for quality control. Our process automatically sorts and filters features for examination, and packs these into a condensed view. An analyst can then rapidly page through screens of features and flag and annotate outliers as necessary.

  8. Patients' Use and Evaluation of an Online System to Annotate Radiology Reports with Lay Language Definitions.

    PubMed

    Cook, Tessa S; Oh, Seong Cheol; Kahn, Charles E

    2017-09-01

    The increasing availability of personal health portals has made it easier for patients to obtain their imaging results online. However, the radiology report typically is designed to communicate findings and recommendations to the referring clinician, and may contain many terms unfamiliar to lay readers. We sought to evaluate a web-based interface that presented reports of knee MRI (magnetic resonance imaging) examinations with annotations that included patient-oriented definitions, anatomic illustrations, and hyperlinks to additional information. During a 7-month observational trial, a statement added to all knee MRI reports invited patients to view their annotated report online. We tracked the number of patients who opened their reports, the terms they hovered over to view definitions, and the time hovering over each term. Patients who accessed their annotated reports were invited to complete a survey. Of 1138 knee MRI examinations during the trial period, 185 patients (16.3%) opened their report in the viewing portal. Of those, 141 (76%) hovered over at least one term to view its definition, and 121 patients (65%) viewed a mean of 27.5 terms per examination and spent an average of 3.5 minutes viewing those terms. Of the 22 patients who completed the survey, 77% agreed that the definitions helped them understand the report and 91% stated that the illustrations were helpful. A system that provided definitions and illustrations of the medical and technical terms in radiology reports has potential to improve patients' understanding of their reports and their diagnoses. Copyright © 2017 The Association of University Radiologists. Published by Elsevier Inc. All rights reserved.

  9. Workflow and web application for annotating NCBI BioProject transcriptome data

    PubMed Central

    Vera Alvarez, Roberto; Medeiros Vidal, Newton; Garzón-Martínez, Gina A.; Barrero, Luz S.; Landsman, David

    2017-01-01

    The volume of transcriptome data is growing exponentially due to rapid improvement of experimental technologies. In response, large central resources such as those of the National Center for Biotechnology Information (NCBI) are continually adapting their computational infrastructure to accommodate this large influx of data. New and specialized databases, such as Transcriptome Shotgun Assembly Sequence Database (TSA) and Sequence Read Archive (SRA), have been created to aid the development and expansion of centralized repositories. Although the central resource databases are under continual development, they do not include automatic pipelines to increase annotation of newly deposited data. Therefore, third-party applications are required to achieve that aim. Here, we present an automatic workflow and web application for the annotation of transcriptome data. The workflow creates secondary data such as sequencing reads and BLAST alignments, which are available through the web application. They are based on freely available bioinformatics tools and scripts developed in-house. The interactive web application provides a search engine and several browser utilities. Graphical views of transcript alignments are available through SeqViewer, an embedded tool developed by NCBI for viewing biological sequence data. The web application is tightly integrated with other NCBI web applications and tools to extend the functionality of data processing and interconnectivity. We present a case study for the species Physalis peruviana with data generated from BioProject ID 67621. Database URL: http://www.ncbi.nlm.nih.gov/projects/physalis/ PMID:28605765

  10. 3D annotation and manipulation of medical anatomical structures

    NASA Astrophysics Data System (ADS)

    Vitanovski, Dime; Schaller, Christian; Hahn, Dieter; Daum, Volker; Hornegger, Joachim

    2009-02-01

    Although medical scanners are rapidly moving towards a three-dimensional paradigm, the manipulation and annotation/labeling of the acquired data is still performed in a standard 2D environment. Editing and annotation of three-dimensional medical structures is currently a complex and rather time-consuming task, as it is carried out in 2D projections of the original object. A major problem in 2D annotation is depth ambiguity, which requires 3D landmarks to be identified and localized in at least two of the cutting planes. Operating directly in three-dimensional space enables the implicit consideration of the full 3D local context, which significantly increases accuracy and speed. A three-dimensional environment is also more natural, optimizing the user's comfort and acceptance. The 3D annotation environment requires a three-dimensional manipulation device and display. By means of two novel technologies, the Nintendo Wii controller and the Philips 3D WoWvx display, we define an appropriate 3D annotation tool and a suitable 3D visualization monitor. We define a non-coplanar arrangement of four infrared LEDs with known, exact positions, which are tracked by the Wii and from which we compute the pose of the device by applying a standard pose estimation algorithm. The novel 3D renderer developed by Philips uses either the Z-value of a 3D volume, or computes the depth information from a 2D image, to provide a real 3D experience without requiring special glasses. Within this paper we present a new framework for manipulation and annotation of medical landmarks directly in a three-dimensional volume.

  11. The what, where, how and why of gene ontology—a primer for bioinformaticians

    PubMed Central

    du Plessis, Louis; Škunca, Nives

    2011-01-01

    With high-throughput technologies providing vast amounts of data, it has become more important to provide systematic, quality annotations. The Gene Ontology (GO) project is the largest resource for cataloguing gene function. Nonetheless, its use is not yet ubiquitous and is still fraught with pitfalls. In this review, we provide a short primer to the GO for bioinformaticians. We summarize important aspects of the structure of the ontology, describe sources and types of functional annotations, survey measures of GO annotation similarity, review typical uses of GO and discuss other important considerations pertaining to the use of GO in bioinformatics applications. PMID:21330331

  12. Textile Art as Illustration.

    ERIC Educational Resources Information Center

    Dickman, Floyd C.

    1998-01-01

    Used in picture-book illustration, such techniques as embroidery, applique, wood-block printing, batik, and quilting reflect cultural heritages and add richly textured images to this annotated list of titles for children from preschool through junior high. (Author)

  13. Video Annotation Software Application for Thorough Collaborative Assessment of and Feedback on Microteaching Lessons in Geography Education

    ERIC Educational Resources Information Center

    van der Westhuizen, Christo P.; Golightly, Aubrey

    2015-01-01

    This article discusses the process and findings of a study in which video annotation (VideoANT) and a learning management system (LMS) were implemented together in the microteaching lessons of fourth-year geography student teachers. The aim was to ensure adequate assessment of and feedback for each student, since these aspects are, in general, a…

  14. iDEAS: A web-based system for dry eye assessment.

    PubMed

    Remeseiro, Beatriz; Barreira, Noelia; García-Resúa, Carlos; Lira, Madalena; Giráldez, María J; Yebra-Pimentel, Eva; Penedo, Manuel G

    2016-07-01

    Dry eye disease is a public health problem whose multifactorial etiology challenges clinicians and researchers, making collaboration between different experts and centers necessary. The evaluation of the interference patterns observed in the tear film lipid layer is a common clinical test used for dry eye diagnosis. However, it is a time-consuming task with a high degree of intra- as well as inter-observer variability, which makes the use of a computer-based analysis system highly desirable. This work introduces iDEAS (Dry Eye Assessment System), a web-based application to support dry eye diagnosis. iDEAS provides a framework for eye care experts to work collaboratively using image-based services in a distributed environment. It is composed of three main components: the web client for user interaction, the web application server for request processing, and the service module for image analysis. Specifically, this manuscript presents two automatic services: tear film classification, which classifies an image into one interference pattern; and tear film map, which illustrates the distribution of the patterns over the entire tear film. iDEAS has been evaluated by specialists from different institutions to test its performance. Both services have been evaluated in terms of a set of performance metrics using the annotations of different experts. Note that the processing time of both services has also been measured for efficiency purposes. iDEAS is a web-based application which provides a fast, reliable environment for dry eye assessment. The system allows practitioners to share images, clinical information and automatic assessments between remote computers. Additionally, it saves time for experts, diminishes inter-expert variability, and can be used in both clinical and research settings. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  15. A sentence sliding window approach to extract protein annotations from biomedical articles

    PubMed Central

    Krallinger, Martin; Padron, Maria; Valencia, Alfonso

    2005-01-01

    Background Within the emerging field of text mining and statistical natural language processing (NLP) applied to biomedical articles, a broad variety of techniques have been developed during the past years. Nevertheless, there is still a great need for comparative assessment of the performance of the proposed methods and for the development of common evaluation criteria. This issue was addressed by the Critical Assessment of Text Mining Methods in Molecular Biology (BioCreative) contest. The aim of this contest was to assess the performance of text mining systems applied to biomedical texts, including tools which recognize named entities such as genes and proteins, and tools which automatically extract protein annotations. Results The "sentence sliding window" approach proposed here was found to efficiently extract text fragments from full text articles containing annotations on proteins, providing the highest number of correctly predicted annotations. Moreover, the number of correct extractions of individual entities (i.e. proteins and GO terms) involved in the relationships used for the annotations was significantly higher than the correct extractions of the complete annotations (protein-function relations). Conclusion We explored the use of averaging sentence sliding windows for information extraction, especially in a context where conventional training data is unavailable. The combination of our approach with more refined statistical estimators and machine learning techniques might be a way to improve annotation extraction for future biomedical text mining applications. PMID:15960831
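The core idea, scoring overlapping windows of consecutive sentences for co-mentions of a protein and a functional term, can be sketched as follows; the substring-based scoring function is a simplified stand-in for the paper's statistical estimators:

```python
def sliding_windows(sentences, size):
    """Yield every window of `size` consecutive sentences."""
    for i in range(len(sentences) - size + 1):
        yield sentences[i:i + size]

def best_fragment(sentences, protein, term, size=2):
    """Return the window most likely to carry an annotation, scored by
    co-occurrence of the protein name and the functional term."""
    def score(window):
        text = " ".join(window).lower()
        return (protein.lower() in text) + (term.lower() in text)
    return max(sliding_windows(sentences, size), key=score)

sentences = [
    "We studied several yeast kinases.",
    "Cdc28 phosphorylates substrates at the G1/S transition.",
    "This activity requires cyclin binding.",
    "Unrelated background is described elsewhere.",
]
fragment = best_fragment(sentences, "Cdc28", "phosphorylates")
```

Windowing over sentences, rather than scoring single sentences, lets an entity mention in one sentence support a relation stated in its neighbor.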

  16. Automated and Accurate Estimation of Gene Family Abundance from Shotgun Metagenomes

    PubMed Central

    Nayfach, Stephen; Bradley, Patrick H.; Wyman, Stacia K.; Laurent, Timothy J.; Williams, Alex; Eisen, Jonathan A.; Pollard, Katherine S.; Sharpton, Thomas J.

    2015-01-01

    Shotgun metagenomic DNA sequencing is a widely applicable tool for characterizing the functions that are encoded by microbial communities. Several bioinformatic tools can be used to functionally annotate metagenomes, allowing researchers to draw inferences about the functional potential of the community and to identify putative functional biomarkers. However, little is known about how decisions made during annotation affect the reliability of the results. Here, we use statistical simulations to rigorously assess how to optimize annotation accuracy and speed, given parameters of the input data like read length and library size. We identify best practices in metagenome annotation and use them to guide the development of the Shotgun Metagenome Annotation Pipeline (ShotMAP). ShotMAP is an analytically flexible, end-to-end annotation pipeline that can be implemented either on a local computer or a cloud compute cluster. We use ShotMAP to assess how different annotation databases impact the interpretation of how marine metagenome and metatranscriptome functional capacity changes across seasons. We also apply ShotMAP to data obtained from a clinical microbiome investigation of inflammatory bowel disease. This analysis finds that gut microbiota collected from Crohn’s disease patients are functionally distinct from gut microbiota collected from either ulcerative colitis patients or healthy controls, with differential abundance of metabolic pathways related to host-microbiome interactions that may serve as putative biomarkers of disease. PMID:26565399

  17. DaGO-Fun: tool for Gene Ontology-based functional analysis using term information content measures

    PubMed Central

    2013-01-01

    Background The use of Gene Ontology (GO) data in protein analyses has largely contributed to the improved outcomes of these analyses. Several GO semantic similarity measures have been proposed in recent years and provide tools that allow the integration of biological knowledge embedded in the GO structure into different biological analyses. There is a need for a unified tool that provides the scientific community with the opportunity to explore these different GO similarity measure approaches and their biological applications. Results We have developed DaGO-Fun, an online tool available at http://web.cbio.uct.ac.za/ITGOM, which incorporates many different GO similarity measures for exploring, analyzing and comparing GO terms and proteins within the context of GO. It uses GO data and UniProt proteins with their GO annotations as provided by the Gene Ontology Annotation (GOA) project to precompute GO term information content (IC), enabling rapid response to user queries. Conclusions The DaGO-Fun online tool presents the advantage of integrating all the relevant IC-based GO similarity measures, including topology- and annotation-based approaches, to facilitate effective exploration of these measures, thus enabling users to choose the most relevant approach for their application. Furthermore, this tool includes several biological applications related to GO semantic similarity scores, including the retrieval of genes based on their GO annotations, the clustering of functionally related genes within a set, and term enrichment analysis. PMID:24067102
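The information content underlying these measures is simply the negative log of a term's annotation frequency, and the Resnik similarity of two terms is the IC of their most informative common ancestor. A minimal sketch on a toy counts table; the term names and the `common_ancestors` argument are illustrative assumptions, not DaGO-Fun's API:

```python
import math

def information_content(term, counts, total):
    """IC(t) = -log p(t), where p(t) is the fraction of all annotations
    falling under term t (counts assumed pre-aggregated over descendants)."""
    return -math.log(counts[term] / total)

def resnik_similarity(t1, t2, common_ancestors, counts, total):
    """Resnik: the IC of the most informative common ancestor of t1, t2."""
    return max(information_content(a, counts, total)
               for a in common_ancestors(t1, t2))

# Toy annotation corpus: the root subsumes everything.
counts = {"root": 1000, "binding": 100, "kinase": 10, "phosphatase": 20}
ic_kinase = information_content("kinase", counts, 1000)   # -log(0.01)
sim = resnik_similarity("kinase", "phosphatase",
                        lambda a, b: ["root", "binding"],  # toy ancestor set
                        counts, 1000)
```

Precomputing IC for every term, as the tool does, turns each similarity query into a cheap lookup plus a max over ancestors.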

  18. An open annotation ontology for science on web 3.0

    PubMed Central

    2011-01-01

    Background There is currently a gap between the rich and expressive collection of published biomedical ontologies, and the natural language expression of biomedical papers consumed on a daily basis by scientific researchers. The purpose of this paper is to provide an open, shareable structure for dynamic integration of biomedical domain ontologies with the scientific document, in the form of an Annotation Ontology (AO), thus closing this gap and enabling application of formal biomedical ontologies directly to the literature as it emerges. Methods Initial requirements for AO were elicited by analysis of integration needs between biomedical web communities, and of needs for representing and integrating results of biomedical text mining. Analysis of strengths and weaknesses of previous efforts in this area was also performed. A series of increasingly refined annotation tools was then developed along with a metadata model in OWL, and deployed to users at a major pharmaceutical company and a major academic center to gather feedback and additional requirements for the ontology. Further requirements and critiques of the model were also elicited through discussions with many colleagues and incorporated into the work. Results This paper presents the Annotation Ontology (AO), an open ontology in OWL-DL for annotating scientific documents on the web. AO supports both human and algorithmic content annotation. It enables “stand-off” or independent metadata anchored to specific positions in a web document by any one of several methods. In AO, the document may be annotated but is not required to be under update control of the annotator. AO contains a provenance model to support versioning, and a set model for specifying groups and containers of annotation. AO is freely available under an open source license at http://purl.org/ao/, and extensive documentation including screencasts is available on AO’s Google Code page: http://code.google.com/p/annotation-ontology/.
Conclusions The Annotation Ontology meets critical requirements for an open, freely shareable OWL model of annotation metadata created against scientific documents on the Web. We believe AO can become a very useful common model for annotation metadata on Web documents, and will enable biomedical domain ontologies to be used quite widely to annotate the scientific literature. Potential collaborators and those with new relevant use cases are invited to contact the authors. PMID:21624159

  19. An open annotation ontology for science on web 3.0.

    PubMed

    Ciccarese, Paolo; Ocana, Marco; Garcia Castro, Leyla Jael; Das, Sudeshna; Clark, Tim

    2011-05-17

    There is currently a gap between the rich and expressive collection of published biomedical ontologies, and the natural language expression of biomedical papers consumed on a daily basis by scientific researchers. The purpose of this paper is to provide an open, shareable structure for dynamic integration of biomedical domain ontologies with the scientific document, in the form of an Annotation Ontology (AO), thus closing this gap and enabling application of formal biomedical ontologies directly to the literature as it emerges. Initial requirements for AO were elicited by analysis of integration needs between biomedical web communities, and of needs for representing and integrating results of biomedical text mining. Analysis of strengths and weaknesses of previous efforts in this area was also performed. A series of increasingly refined annotation tools was then developed along with a metadata model in OWL, and deployed to users at a major pharmaceutical company and a major academic center to gather feedback and additional requirements for the ontology. Further requirements and critiques of the model were also elicited through discussions with many colleagues and incorporated into the work. This paper presents the Annotation Ontology (AO), an open ontology in OWL-DL for annotating scientific documents on the web. AO supports both human and algorithmic content annotation. It enables "stand-off" or independent metadata anchored to specific positions in a web document by any one of several methods. In AO, the document may be annotated but is not required to be under update control of the annotator. AO contains a provenance model to support versioning, and a set model for specifying groups and containers of annotation. AO is freely available under an open source license at http://purl.org/ao/, and extensive documentation including screencasts is available on AO's Google Code page: http://code.google.com/p/annotation-ontology/.
The Annotation Ontology meets critical requirements for an open, freely shareable OWL model of annotation metadata created against scientific documents on the Web. We believe AO can become a very useful common model for annotation metadata on Web documents, and will enable biomedical domain ontologies to be used quite widely to annotate the scientific literature. Potential collaborators and those with new relevant use cases are invited to contact the authors.

  20. Supervised learning of tools for content-based search of image databases

    NASA Astrophysics Data System (ADS)

    Delanoy, Richard L.

    1996-03-01

    A computer environment, called the Toolkit for Image Mining (TIM), is being developed with the goal of enabling users with diverse interests and varied computer skills to create search tools for content-based image retrieval and other pattern matching tasks. Search tools are generated using a simple paradigm of supervised learning that is based on the user pointing at mistakes of classification made by the current search tool. As mistakes are identified, a learning algorithm uses the identified mistakes to build up a model of the user's intentions, construct a new search tool, apply the search tool to a test image, display the match results as feedback to the user, and accept new inputs from the user. Search tools are constructed in the form of functional templates, which are generalized matched filters capable of knowledge-based image processing. The ability of this system to learn the user's intentions from experience contrasts with other existing approaches to content-based image retrieval that base searches on the characteristics of a single input example or on a predefined and semantically-constrained textual query. Currently, TIM is capable of learning spectral and textural patterns, but should be adaptable to the learning of shapes, as well. Possible applications of TIM include not only content-based image retrieval, but also quantitative image analysis, the generation of metadata for annotating images, data prioritization or data reduction in bandwidth-limited situations, and the construction of components for larger, more complex computer vision algorithms.

  1. Attribute-based classification for zero-shot visual object categorization.

    PubMed

    Lampert, Christoph H; Nickisch, Hannes; Harmeling, Stefan

    2014-03-01

    We study the problem of object recognition for categories for which we have no training examples, a task also called zero-data or zero-shot learning. This situation has hardly been studied in computer vision research, even though it occurs frequently; the world contains tens of thousands of different object classes, and image collections have been formed and suitably annotated for only a few of them. To tackle the problem, we introduce attribute-based classification: Objects are identified based on a high-level description that is phrased in terms of semantic attributes, such as the object's color or shape. Because the identification of each such property transcends the specific learning task at hand, the attribute classifiers can be prelearned independently, for example, from existing image data sets unrelated to the current task. Afterward, new classes can be detected based on their attribute representation, without the need for a new training phase. In this paper, we also introduce a new data set, Animals with Attributes, of over 30,000 images of 50 animal classes, annotated with 85 semantic attributes. Extensive experiments on this and two more data sets show that attribute-based classification indeed is able to categorize images without access to any training images of the target classes.
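    The attribute-based prediction step can be sketched as follows: prelearned attribute classifiers emit per-attribute probabilities for an input image, and an unseen class is chosen by how well its attribute signature matches them. The class names, attribute matrix, and probabilistic combination rule below are illustrative, loosely following a direct-attribute-prediction scheme rather than reproducing the paper's exact formulation:

```python
import numpy as np

# Hypothetical class-attribute matrix: rows = unseen classes, columns = binary
# attributes (say "striped", "aquatic", "has hooves"); values are illustrative,
# not taken from the Animals with Attributes dataset.
classes = ["zebra", "dolphin"]
class_attributes = np.array([
    [1, 0, 1],   # zebra: striped, not aquatic, has hooves
    [0, 1, 0],   # dolphin: not striped, aquatic, no hooves
], dtype=float)

def zero_shot_predict(attribute_probs):
    """Pick the unseen class whose attribute signature best matches the
    per-attribute probabilities produced by prelearned attribute classifiers."""
    probs = np.asarray(attribute_probs, dtype=float)
    # Likelihood of each class: product of p(a) over attributes it has,
    # and (1 - p(a)) over attributes it lacks.
    scores = np.prod(np.where(class_attributes == 1, probs, 1 - probs), axis=1)
    return classes[int(np.argmax(scores))]

# Attribute classifiers fired strongly on "striped" and "has hooves":
prediction = zero_shot_predict([0.9, 0.2, 0.8])
```

    The key point the abstract makes is visible here: no training images of "zebra" or "dolphin" are needed, only their attribute descriptions plus attribute classifiers trained on other data.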

  2. Deconstructing Barbie: Using Creative Drama as a Tool for Image Making in Pre-Adolescent Girls.

    ERIC Educational Resources Information Center

    O'Hara, Elizabeth; Lanoux, Carol

    1999-01-01

    Discusses the dilemma of self-concept in pre-adolescent girls, as they revise their self-images based on information that the culture dictates as the norm. Argues that drama education can offer creative activities to help girls find their voice and bring them into their power. Includes two group drama activities and a short annotated bibliography…

  3. Advanced two-layer level set with a soft distance constraint for dual surfaces segmentation in medical images

    NASA Astrophysics Data System (ADS)

    Ji, Yuanbo; van der Geest, Rob J.; Nazarian, Saman; Lelieveldt, Boudewijn P. F.; Tao, Qian

    2018-03-01

    Anatomical objects in medical images very often have dual contours or surfaces that are highly correlated. Manually segmenting both of them by following local image details is tedious and subjective. In this study, we proposed a two-layer region-based level set method with a soft distance constraint, which not only regularizes the level set evolution at two levels, but also imposes prior information on wall thickness in an effective manner. By updating the level set function and the distance constraint functions alternately, the method simultaneously optimizes both contours while regularizing their distance. The method was applied to segment the inner and outer wall of both the left atrium (LA) and left ventricle (LV) from MR images, using a rough initialization from inside the blood pool. Compared to manual annotation from experienced observers, the proposed method achieved an average perpendicular distance (APD) of less than 1 mm for the LA segmentation, and less than 1.5 mm for the LV segmentation, at both inner and outer contours. The method can be used as a practical tool for fast and accurate dual wall annotations given proper initialization.

  4. Online Graph Completion: Multivariate Signal Recovery in Computer Vision.

    PubMed

    Kim, Won Hwa; Jalal, Mona; Hwang, Seongjae; Johnson, Sterling C; Singh, Vikas

    2017-07-01

    The adoption of "human-in-the-loop" paradigms in computer vision and machine learning is leading to various applications where the actual data acquisition (e.g., human supervision) and the underlying inference algorithms are closely intertwined. While classical work in active learning provides effective solutions when the learning module involves classification and regression tasks, many practical issues such as partially observed measurements, financial constraints and even additional distributional or structural aspects of the data typically fall outside the scope of this treatment. For instance, with sequential acquisition of partial measurements of data that manifest as a matrix (or tensor), novel strategies for completion (or collaborative filtering) of the remaining entries have only been studied recently. Motivated by vision problems where we seek to annotate a large dataset of images via a crowdsourced platform or alternatively, complement results from a state-of-the-art object detector using human feedback, we study the "completion" problem defined on graphs, where requests for additional measurements must be made sequentially. We design the optimization model in the Fourier domain of the graph describing how ideas based on adaptive submodularity provide algorithms that work well in practice. On a large set of images collected from Imgur, we see promising results on images that are otherwise difficult to categorize. We also show applications to an experimental design problem in neuroimaging.

  5. Annotated bibliography of structural equation modelling: technical work.

    PubMed

    Austin, J T; Wolfle, L M

    1991-05-01

    Researchers must be familiar with a variety of source literature to facilitate the informed use of structural equation modelling. Knowledge can be acquired through the study of an expanding literature found in a diverse set of publishing forums. We propose that structural equation modelling publications can be roughly classified into two groups: (a) technical and (b) substantive applications. Technical materials focus on the procedures rather than substantive conclusions derived from applications. The focus of this article is the former category; included are foundational/major contributions, minor contributions, critical and evaluative reviews, integrations, simulations and computer applications, precursor and historical material, and pedagogical textbooks. After a brief introduction, we annotate 294 articles in the technical category dating back to Sewall Wright (1921).

  6. Retinal fundus images for glaucoma analysis: the RIGA dataset

    NASA Astrophysics Data System (ADS)

    Almazroa, Ahmed; Alodhayb, Sami; Osman, Essameldin; Ramadan, Eslam; Hummadi, Mohammed; Dlaim, Mohammed; Alkatee, Muhannad; Raahemifar, Kaamran; Lakshminarayanan, Vasudevan

    2018-03-01

    Glaucoma neuropathy is a major cause of irreversible blindness worldwide. Current models of chronic care will not be able to close the gap between the growing prevalence of glaucoma and the challenges of access to healthcare services. Teleophthalmology is being developed to close this gap. In order to develop automated techniques for glaucoma detection that can be used in tele-ophthalmology, we have developed a large retinal fundus dataset. A de-identified dataset of retinal fundus images for glaucoma analysis (RIGA) was derived from three sources for a total of 750 images. The optic cup and disc boundaries for each image were marked and annotated manually by six experienced ophthalmologists, and cup-to-disc ratio (CDR) estimates were included. Six parameters were extracted and assessed among the ophthalmologists (the disc area and centroid, cup area and centroid, and horizontal and vertical cup-to-disc ratios). The inter-observer annotations were compared by calculating the standard deviation (SD) for every image between the six ophthalmologists, in order to identify outliers among the six and filter out the corresponding images. The dataset will be made available to the research community so that other research groups can develop, validate and implement analysis algorithms appropriate for tele-glaucoma assessment. The RIGA dataset can be freely accessed online through the University of Michigan Deep Blue website (doi:10.7302/Z23R0R29).
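    The per-image inter-observer filtering described above can be sketched in a few lines of NumPy; the annotation values and the SD threshold here are illustrative, not taken from the RIGA dataset:

```python
import numpy as np

# Hypothetical vertical cup-to-disc ratio annotations: rows = images,
# columns = six ophthalmologists (values are illustrative).
cdr = np.array([
    [0.30, 0.32, 0.31, 0.29, 0.33, 0.30],   # good agreement
    [0.30, 0.70, 0.31, 0.29, 0.65, 0.30],   # poor agreement -> filter out
])

# Per-image sample standard deviation across the six observers.
sd = cdr.std(axis=1, ddof=1)

# Keep only images where the six observers agree within a chosen tolerance
# (the threshold is an assumption for illustration).
threshold = 0.05
keep = sd < threshold
```

    The same per-image SD computation applies to any of the six extracted parameters (areas, centroids, ratios); an image flagged on any parameter can then be excluded.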

  7. Impact of ontology evolution on functional analyses.

    PubMed

    Groß, Anika; Hartung, Michael; Prüfer, Kay; Kelso, Janet; Rahm, Erhard

    2012-10-15

    Ontologies are used in the annotation and analysis of biological data. As knowledge accumulates, ontologies and annotation undergo constant modifications to reflect this new knowledge. These modifications may influence the results of statistical applications such as functional enrichment analyses that describe experimental data in terms of ontological groupings. Here, we investigate to what degree modifications of the Gene Ontology (GO) impact these statistical analyses for both experimental and simulated data. The analysis is based on new measures for the stability of result sets and considers different ontology and annotation changes. Our results show that past changes in the GO are non-uniformly distributed over different branches of the ontology. Considering the semantic relatedness of significant categories in analysis results allows a more realistic stability assessment for functional enrichment studies. We observe that the results of term-enrichment analyses tend to be surprisingly stable despite changes in ontology and annotation.
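    Term-enrichment analyses of the kind whose stability is studied here typically rest on a hypergeometric tail probability: how surprising is it to see k annotated genes in a sample of n, given the background? A self-contained sketch (gene counts are illustrative):

```python
from math import comb

def hypergeom_enrichment_p(k, n, K, N):
    """P(X >= k): probability of seeing at least k annotated genes in a
    sample of n, when K of the N background genes carry the annotation."""
    return sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, min(n, K) + 1)) / comb(N, n)

# Illustrative numbers: 400-gene background, 40 genes carry a GO category,
# and 8 of the 20 experimentally selected genes carry it (expected ~2 by chance).
p = hypergeom_enrichment_p(8, 20, 40, 400)
```

    A GO or annotation update changes K (and possibly N) for each category, which is exactly why such p-values, and hence the set of significant categories, can shift between ontology releases.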

  8. Automatic textual annotation of video news based on semantic visual object extraction

    NASA Astrophysics Data System (ADS)

    Boujemaa, Nozha; Fleuret, Francois; Gouet, Valerie; Sahbi, Hichem

    2003-12-01

    In this paper, we present our work on the automatic generation of textual metadata based on visual content analysis of video news. We present two methods for semantic object detection and recognition from a cross-modal image-text thesaurus. These thesauri represent a supervised association between models and semantic labels. This paper is concerned with two semantic objects: faces and TV logos. In the first part, we present our work on efficient face detection and recognition with automatic name generation. This method also allows us to suggest textual annotations of shots based on close-up estimation. In the second part, we were interested in automatically detecting and recognizing the different TV logos present in incoming news from different TV channels. This work was done jointly with the French TV channel TF1 within the "MediaWorks" project, which consists of a hybrid text-image indexing and retrieval platform for video news.

  9. A graph-based semantic similarity measure for the gene ontology.

    PubMed

    Alvarez, Marco A; Yan, Changhui

    2011-12-01

    Existing methods for calculating semantic similarities between pairs of Gene Ontology (GO) terms and gene products often rely on external databases like the Gene Ontology Annotation (GOA) database that annotate gene products using the GO terms. This dependency leads to some limitations in real applications. Here, we present a semantic similarity algorithm (SSA) that relies exclusively on the GO. When calculating the semantic similarity between a pair of input GO terms, SSA takes into account the shortest path between them, the depth of their nearest common ancestor, and a novel similarity score calculated between the definitions of the involved GO terms. In our work, we use SSA to calculate semantic similarities between pairs of proteins by combining pairwise semantic similarities between the GO terms that annotate the involved proteins. The reliability of SSA was evaluated by comparing the resulting semantic similarities between proteins with the functional similarities between proteins derived from expert annotations or sequence similarity. Comparisons with existing state-of-the-art methods showed that SSA is highly competitive with the other methods. SSA provides a reliable measure of semantic similarity independent of external databases of functional-annotation observations.
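    One of SSA's graph-only ingredients, the shortest path between two terms through a common ancestor, can be illustrated on a toy is-a DAG. This sketch implements only a shortest-path-based similarity, not SSA's full score (it omits the ancestor-depth and definition-similarity components), and the term IDs are hypothetical:

```python
from collections import deque

# Toy is-a hierarchy (edges child -> parents); term IDs are illustrative.
parents = {
    "A": [],            # root
    "B": ["A"],
    "C": ["A"],
    "D": ["B"],
    "E": ["B", "C"],
}

def ancestors_with_depth(term):
    """Map each ancestor (including the term itself) to its hop distance
    from the term, via breadth-first search up the DAG."""
    dist = {term: 0}
    queue = deque([term])
    while queue:
        t = queue.popleft()
        for p in parents[t]:
            if p not in dist:
                dist[p] = dist[t] + 1
                queue.append(p)
    return dist

def shortest_path_similarity(t1, t2):
    """1 / (1 + length of the shortest path joining t1 and t2 through
    a common ancestor); identical terms score 1.0."""
    d1, d2 = ancestors_with_depth(t1), ancestors_with_depth(t2)
    path = min(d1[a] + d2[a] for a in d1 if a in d2)
    return 1.0 / (1.0 + path)
```

    For terms "D" and "E" above, the nearest common ancestor is "B", giving a path of length 2 and a similarity of 1/3; a full SSA-style score would additionally weight this by the ancestor's depth and the terms' definition similarity.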

  10. Enabling comparative modeling of closely related genomes: Example genus Brucella

    DOE PAGES

    Faria, José P.; Edirisinghe, Janaka N.; Davis, James J.; ...

    2014-03-08

    For many scientific applications, it is highly desirable to be able to compare metabolic models of closely related genomes. In this study, we attempt to raise awareness of the fact that taking annotated genomes from public repositories and using them for metabolic model reconstructions is far from trivial due to annotation inconsistencies. We propose a protocol for comparative analysis of metabolic models of closely related genomes, using fifteen strains of the genus Brucella, which contains pathogens of both humans and livestock. This study led to the identification and subsequent correction of inconsistent annotations in the SEED database, as well as the identification of 31 biochemical reactions that are common to Brucella but are not originally identified by automated metabolic reconstructions. We are currently implementing this protocol for improving automated annotations within the SEED database, and these improvements have been propagated into PATRIC, Model-SEED, KBase and RAST. This method is an enabling step toward the future creation of consistent annotation systems and high-quality model reconstructions that will support the prediction of accurate phenotypes such as pathogenicity, media requirements or type of respiration.

  11. Enabling comparative modeling of closely related genomes: Example genus Brucella

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Faria, José P.; Edirisinghe, Janaka N.; Davis, James J.

    For many scientific applications, it is highly desirable to be able to compare metabolic models of closely related genomes. In this study, we attempt to raise awareness of the fact that taking annotated genomes from public repositories and using them for metabolic model reconstructions is far from trivial due to annotation inconsistencies. We propose a protocol for comparative analysis of metabolic models of closely related genomes, using fifteen strains of the genus Brucella, which contains pathogens of both humans and livestock. This study led to the identification and subsequent correction of inconsistent annotations in the SEED database, as well as the identification of 31 biochemical reactions that are common to Brucella but are not originally identified by automated metabolic reconstructions. We are currently implementing this protocol for improving automated annotations within the SEED database, and these improvements have been propagated into PATRIC, Model-SEED, KBase and RAST. This method is an enabling step toward the future creation of consistent annotation systems and high-quality model reconstructions that will support the prediction of accurate phenotypes such as pathogenicity, media requirements or type of respiration.

  12. Optimizing high performance computing workflow for protein functional annotation.

    PubMed

    Stanberry, Larissa; Rekepalli, Bhanu; Liu, Yuan; Giblock, Paul; Higdon, Roger; Montague, Elizabeth; Broomall, William; Kolker, Natali; Kolker, Eugene

    2014-09-10

    Functional annotation of newly sequenced genomes is one of the major challenges in modern biology. With modern sequencing technologies, the protein sequence universe is rapidly expanding. Newly sequenced bacterial genomes alone contain over 7.5 million proteins. The rate of data generation has far surpassed that of protein annotation. The volume of protein data makes manual curation infeasible, whereas a high compute cost limits the utility of existing automated approaches. In this work, we present an improved and optimized automated workflow to enable large-scale protein annotation. The workflow uses high performance computing architectures and a low-complexity classification algorithm to assign proteins into existing clusters of orthologous groups of proteins. Based on the Position-Specific Iterative Basic Local Alignment Search Tool (PSI-BLAST), the algorithm ensures at least 80% specificity and sensitivity of the resulting classifications. The workflow utilizes highly scalable parallel applications for classification and sequence alignment. Using Extreme Science and Engineering Discovery Environment supercomputers, the workflow processed 1,200,000 newly sequenced bacterial proteins. With the rapid expansion of the protein sequence universe, the proposed workflow will enable scientists to annotate big genome data.

  13. Optimizing high performance computing workflow for protein functional annotation

    PubMed Central

    Stanberry, Larissa; Rekepalli, Bhanu; Liu, Yuan; Giblock, Paul; Higdon, Roger; Montague, Elizabeth; Broomall, William; Kolker, Natali; Kolker, Eugene

    2014-01-01

    Functional annotation of newly sequenced genomes is one of the major challenges in modern biology. With modern sequencing technologies, the protein sequence universe is rapidly expanding. Newly sequenced bacterial genomes alone contain over 7.5 million proteins. The rate of data generation has far surpassed that of protein annotation. The volume of protein data makes manual curation infeasible, whereas a high compute cost limits the utility of existing automated approaches. In this work, we present an improved and optimized automated workflow to enable large-scale protein annotation. The workflow uses high performance computing architectures and a low-complexity classification algorithm to assign proteins into existing clusters of orthologous groups of proteins. Based on the Position-Specific Iterative Basic Local Alignment Search Tool (PSI-BLAST), the algorithm ensures at least 80% specificity and sensitivity of the resulting classifications. The workflow utilizes highly scalable parallel applications for classification and sequence alignment. Using Extreme Science and Engineering Discovery Environment supercomputers, the workflow processed 1,200,000 newly sequenced bacterial proteins. With the rapid expansion of the protein sequence universe, the proposed workflow will enable scientists to annotate big genome data. PMID:25313296

  14. Vehicle classification in WAMI imagery using deep network

    NASA Astrophysics Data System (ADS)

    Yi, Meng; Yang, Fan; Blasch, Erik; Sheaff, Carolyn; Liu, Kui; Chen, Genshe; Ling, Haibin

    2016-05-01

    Humans have always had a keen interest in understanding activities and the surrounding environment for mobility, communication, and survival. Thanks to recent progress in photography and breakthroughs in aviation, we are now able to capture tens of megapixels of ground imagery, namely Wide Area Motion Imagery (WAMI), at multiple frames per second from unmanned aerial vehicles (UAVs). WAMI serves as a great source for many applications, including security, urban planning and route planning. These applications require fast and accurate image understanding, which is time consuming for humans due to the large data volume and city-scale area coverage. Therefore, automatic processing and understanding of WAMI imagery has been gaining attention in both industry and the research community. This paper focuses on an essential step in WAMI imagery analysis, namely vehicle classification: deciding whether a certain image patch contains a vehicle or not. We collect a set of positive and negative sample image patches for training and testing the detector. Positive samples are 64 × 64 image patches centered on annotated vehicles. We generate two sets of negative images. The first set is generated from positive images with some location shift. The second set of negative patches is generated from randomly sampled patches. We also discard those patches if a vehicle happens to be located at the center. Both positive and negative samples are randomly divided into 9000 training images and 3000 testing images. We propose to train a deep convolution network for classifying these patches. The classifier is based on a pre-trained AlexNet model in the Caffe library, with an adapted loss function for vehicle classification. The performance of our classifier is compared to several traditional image classifier methods using Support Vector Machine (SVM) and Histogram of Oriented Gradient (HOG) features.
While the SVM+HOG method achieves an accuracy of 91.2%, the accuracy of our deep network-based classifier reaches 97.9%.
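    The patch-sampling scheme described above can be sketched as follows; the shift bounds and frame size are assumptions for illustration, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

PATCH = 64  # patch size used for both positive and negative samples

def extract_patch(frame, cx, cy):
    """Crop a 64x64 patch centered at (cx, cy); assumes the window is in-bounds."""
    half = PATCH // 2
    return frame[cy - half:cy + half, cx - half:cx + half]

def shifted_negative_center(cx, cy, min_shift=24, max_shift=40):
    """Offset an annotated vehicle center far enough that the vehicle is no
    longer centered (shift bounds are illustrative, not from the paper)."""
    dx, dy = rng.integers(min_shift, max_shift, size=2)
    sx, sy = rng.choice([-1, 1], size=2)
    return cx + sx * dx, cy + sy * dy

# Toy single-channel frame with an annotated vehicle center at (256, 256).
frame = rng.random((512, 512))
pos = extract_patch(frame, 256, 256)          # positive: vehicle-centered
ncx, ncy = shifted_negative_center(256, 256)
neg = extract_patch(frame, ncx, ncy)          # shifted negative
```

    The second negative set, random patches, would simply sample (cx, cy) uniformly over the frame and reject any patch whose center falls on an annotated vehicle.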

  15. Integrated annotation and analysis of in situ hybridization images using the ImAnno system: application to the ear and sensory organs of the fetal mouse.

    PubMed

    Romand, Raymond; Ripp, Raymond; Poidevin, Laetitia; Boeglin, Marcel; Geffers, Lars; Dollé, Pascal; Poch, Olivier

    2015-01-01

    An in situ hybridization (ISH) study was performed on 2000 murine genes representing around 10% of the protein-coding genes present in the mouse genome, using data generated by the EURExpress consortium. This study was carried out in 25 tissues of late gestation embryos (E14.5), with a special emphasis on the developing ear and on five distinct developing sensory organs, including the cochlea, the vestibular receptors, the sensory retina, the olfactory organ, and the vibrissae follicles. The results obtained from an analysis of more than 11,000 micrographs have been integrated in a newly developed knowledgebase, called ImAnno. In addition to managing the multilevel micrograph annotations performed by human experts, ImAnno provides public access to various integrated databases and tools. Thus, it facilitates the analysis of complex ISH gene expression patterns, as well as functional annotation and interaction of gene sets. It also provides direct links to human pathways and diseases. Hierarchical clustering of expression patterns in the 25 tissues revealed three main branches corresponding to tissues with common functions and/or embryonic origins. To illustrate the integrative power of ImAnno, we explored the expression, function and disease traits of the sensory epithelia of the five presumptive sensory organs. The study identified 623 genes (out of 2000) concomitantly expressed in the five embryonic epithelia, among which many (∼12%) were involved in human disorders. Finally, various multilevel interaction networks were characterized, highlighting differential functional enrichments of directly or indirectly interacting genes. These analyses revealed an under-representation of "sensory" functions in the sensory gene set, suggesting that E14.5 is a pivotal stage between the developmental stage and the functional phase that will be fully reached only after birth.

  16. Automated image analysis of uterine cervical images

    NASA Astrophysics Data System (ADS)

    Li, Wenjing; Gu, Jia; Ferris, Daron; Poirson, Allen

    2007-03-01

    Cervical cancer is the second most common cancer among women worldwide and the leading cause of cancer mortality of women in developing countries. If detected early and treated adequately, cervical cancer can be virtually prevented. Cervical precursor lesions and invasive cancer exhibit certain morphologic features that can be identified during a visual inspection exam. Digital imaging technologies allow us to assist the physician with a Computer-Aided Diagnosis (CAD) system. In colposcopy, epithelium that turns white after application of acetic acid is called acetowhite epithelium. Acetowhite epithelium is one of the major diagnostic features observed in detecting cancer and pre-cancerous regions. Automatic extraction of acetowhite regions from cervical images has been a challenging task due to specular reflection, various illumination conditions, and most importantly, large intra-patient variation. This paper presents a multi-step acetowhite region detection system to analyze the acetowhite lesions in cervical images automatically. First, the system calibrates the color of the cervical images to be independent of screening devices. Second, the anatomy of the uterine cervix is analyzed in terms of cervix region, external os region, columnar region, and squamous region. Third, the squamous region is further analyzed and subregions based on three levels of acetowhite are identified. The extracted acetowhite regions are accompanied by color scores to indicate the different levels of acetowhite. The system has been evaluated on data from 40 human subjects and demonstrates high correlation with experts' annotations.
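    The device-independent color calibration in the first step could, for instance, be a least-squares color-correction matrix fitted against a reference chart; the matrix form and the chart values below are assumptions for illustration, not the paper's actual procedure:

```python
import numpy as np

# Hypothetical calibration chart: measured RGB values from a screening device
# and the corresponding known reference values (all values are illustrative).
measured = np.array([[0.9, 0.1, 0.1],
                     [0.1, 0.8, 0.2],
                     [0.2, 0.1, 0.9],
                     [0.8, 0.8, 0.7]])
reference = np.array([[1.0, 0.0, 0.0],
                      [0.0, 1.0, 0.0],
                      [0.0, 0.0, 1.0],
                      [1.0, 1.0, 1.0]])

# Solve measured @ M ≈ reference in the least-squares sense,
# one column of M per output color channel.
M, *_ = np.linalg.lstsq(measured, reference, rcond=None)

def calibrate(image_rgb):
    """Apply the correction so images are comparable across screening devices."""
    return np.clip(image_rgb @ M, 0.0, 1.0)
```

    Once colors are device-independent, the later steps (anatomy segmentation and acetowhite-level scoring) can use fixed color thresholds and scores across devices.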

  17. Helioviewer.org: An Open-source Tool for Visualizing Solar Data

    NASA Astrophysics Data System (ADS)

    Hughitt, V. Keith; Ireland, J.; Schmiedel, P.; Dimitoglou, G.; Mueller, D.; Fleck, B.

    2009-05-01

    As the amount of solar data available to scientists continues to increase at faster and faster rates, it is important that there exist simple tools for navigating this data quickly with a minimal amount of effort. By combining heterogeneous solar physics datatypes such as full-disk images and coronagraphs, along with feature and event information, Helioviewer offers a simple and intuitive way to browse multiple datasets simultaneously. Images are stored in a repository using the JPEG 2000 format and tiled dynamically upon a client's request. By tiling images and serving only the portions of the image requested, it is possible for the client to work with very large images without having to fetch all of the data at once. Currently, Helioviewer enables users to browse the entire SOHO data archive, updated hourly, as well as feature/event data from eight different catalogs, including active region, flare, coronal mass ejection, and type II radio burst data. In addition to a focus on intercommunication with other virtual observatories and browsers (VSO, HEK, etc.), Helioviewer will offer a number of externally-available application programming interfaces (APIs) to enable easy third party use, adoption and extension. Future functionality will include: support for additional data sources including TRACE, SDO and STEREO, dynamic movie generation, a navigable timeline of recorded solar events, social annotation, and basic client-side image processing.

  18. Context and Domain Knowledge Enhanced Entity Spotting in Informal Text

    NASA Astrophysics Data System (ADS)

    Gruhl, Daniel; Nagarajan, Meena; Pieper, Jan; Robson, Christine; Sheth, Amit

    This paper explores the application of restricted relationship graphs (RDF) and statistical NLP techniques to improve named entity annotation in challenging Informal English domains. We validate our approach using on-line forums discussing popular music. Named entity annotation is particularly difficult in this domain because it is characterized by a large number of ambiguous entities, such as the Madonna album "Music" or Lilly Allen's pop hit "Smile".

  19. Potential use of geothermal resources in the Snake River Basin: an environmental overview. Volume II. Annotated bibliography

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Spencer, S.G.; Russell, B.F.; Sullivan, J.F.

    This volume is a partially annotated bibliography of reference materials pertaining to the seven KGRA's. The bibliography is divided into sections by program element as follows: terrestrial ecology, aquatic ecology, heritage resources, socioeconomics and demography, geology, geothermal, soils, hydrology and water quality, seismicity, and subsidence. Cross-referencing is available for those references which are applicable to specific KGRA's. (MHR)

  20. Smoking Gun or Circumstantial Evidence? Comparison of Statistical Learning Methods using Functional Annotations for Prioritizing Risk Variants.

    PubMed

    Gagliano, Sarah A; Ravji, Reena; Barnes, Michael R; Weale, Michael E; Knight, Jo

    2015-08-24

    Although technology has triumphed in facilitating routine genome sequencing, new challenges have been created for the data analyst. Genome-scale surveys of human variation generate volumes of data that far exceed capabilities for laboratory characterization. By incorporating functional annotations as predictors, statistical learning has been widely investigated for prioritizing genetic variants likely to be associated with complex disease. We compared three published prioritization procedures, which use different statistical learning algorithms and differ in the quantity, type and coding of their predictors. We also explored different combinations of algorithm and annotation set. As an application, we tested which methodology performed best for prioritizing variants using data from a large schizophrenia meta-analysis by the Psychiatric Genomics Consortium. Results suggest that all methods have considerable (and similar) predictive accuracies (AUCs 0.64-0.71) in test set data, but there is more variability in the application to the schizophrenia GWAS. In conclusion, a variety of algorithms and annotations seem to have a similar potential to effectively enrich true risk variants in genome-scale datasets; however, none offers more than an incremental improvement in prediction. We discuss how methods might be evolved for risk variant prediction to address the impending bottleneck of the new generation of genome re-sequencing studies.
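    The AUC figures quoted above (0.64-0.71) can be read through the rank-based (Mann-Whitney) formulation of AUC: the probability that a randomly chosen risk variant is scored above a randomly chosen non-risk variant. A minimal sketch, with made-up prioritization scores:

```python
def auc(pos_scores, neg_scores):
    """Rank-based AUC: P(random positive scores above random negative),
    counting ties as half a win."""
    wins = ties = 0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1
            elif p == n:
                ties += 1
    return (wins + 0.5 * ties) / (len(pos_scores) * len(neg_scores))

# Hypothetical scores for known risk vs. background variants:
print(auc([0.9, 0.8, 0.4], [0.5, 0.3, 0.2]))  # 8/9 ≈ 0.889
```

    An AUC of 0.5 corresponds to random ranking, so values near 0.7 indicate useful but far from decisive enrichment of true risk variants.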

  1. Automatic short axis orientation of the left ventricle in 3D ultrasound recordings

    NASA Astrophysics Data System (ADS)

    Pedrosa, João.; Heyde, Brecht; Heeren, Laurens; Engvall, Jan; Zamorano, Jose; Papachristidis, Alexandros; Edvardsen, Thor; Claus, Piet; D'hooge, Jan

    2016-04-01

    The recent advent of three-dimensional echocardiography has led to an increased interest from the scientific community in left ventricle segmentation frameworks for cardiac volume and function assessment. An automatic orientation of the segmented left ventricular mesh is an important step to obtain a point-to-point correspondence between the mesh and the cardiac anatomy. Furthermore, this would allow for an automatic division of the left ventricle into the standard 17 segments and, thus, fully automatic per-segment analysis, e.g. regional strain assessment. In this work, a method for fully automatic short axis orientation of the segmented left ventricle is presented. The proposed framework aims at detecting the inferior right ventricular insertion point. 211 three-dimensional echocardiographic images were used to validate this framework by comparison to manual annotation of the inferior right ventricular insertion point. A mean unsigned error of 8.05° +/- 18.50° was found, whereas the mean signed error was 1.09°. Large deviations between the manual and automatic annotations (>30°) occurred in only 3.79% of cases. The average computation time was 666 ms in a non-optimized MATLAB environment, which makes real-time application feasible. In conclusion, a successful automatic real-time method for orientation of the segmented left ventricle is proposed.
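    The signed and unsigned errors quoted above can be computed by wrapping angle differences into [-180°, 180°), so that, for example, 358° vs. 2° counts as a 4° error rather than 356°. A minimal sketch with made-up values:

```python
def angular_error(auto_deg, manual_deg):
    """Signed difference between two angles, wrapped into [-180, 180)."""
    return (auto_deg - manual_deg + 180.0) % 360.0 - 180.0

def error_summary(pairs):
    """Mean signed and mean unsigned error over (automatic, manual) pairs."""
    errs = [angular_error(a, m) for a, m in pairs]
    return sum(errs) / len(errs), sum(abs(e) for e in errs) / len(errs)

# 358° vs 2° is a 4° error, not 356°:
print(angular_error(358.0, 2.0))  # -4.0
```

    The signed mean reveals systematic bias (near zero in the paper, 1.09°), while the unsigned mean captures the typical magnitude of disagreement.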

  2. MetaPathways v2.5: quantitative functional, taxonomic and usability improvements.

    PubMed

    Konwar, Kishori M; Hanson, Niels W; Bhatia, Maya P; Kim, Dongjae; Wu, Shang-Ju; Hahn, Aria S; Morgan-Lang, Connor; Cheung, Hiu Kan; Hallam, Steven J

    2015-10-15

    Next-generation sequencing is producing vast amounts of sequence information from natural and engineered ecosystems. Although this data deluge has an enormous potential to transform our lives, knowledge creation and translation need software applications that scale with increasing data processing and analysis requirements. Here, we present improvements to MetaPathways, an annotation and analysis pipeline for environmental sequence information that expedites this transformation. We specifically address pathway prediction hazards through integration of a weighted taxonomic distance and enable quantitative comparison of assembled annotations through a normalized read-mapping measure. Additionally, we improve LAST homology searches through BLAST-equivalent E-values and output formats that are natively compatible with prevailing software applications. Finally, an updated graphical user interface allows for keyword annotation query and projection onto user-defined functional gene hierarchies, including the Carbohydrate-Active Enzyme database. MetaPathways v2.5 is available on GitHub: http://github.com/hallamlab/metapathways2. Contact: shallam@mail.ubc.ca. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.
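    The exact normalized read-mapping measure is not specified in this abstract; as a stand-in, the sketch below uses the common RPKM normalization (reads per kilobase of ORF per million mapped reads), which corrects raw read counts for both gene length and sequencing depth so that annotation abundances can be compared across samples.

```python
def rpkm(reads_mapped_to_orf, orf_length_bp, total_mapped_reads):
    """Length- and depth-normalized abundance of one annotated ORF."""
    kilobases = orf_length_bp / 1000.0
    millions = total_mapped_reads / 1_000_000.0
    return reads_mapped_to_orf / (kilobases * millions)

# 500 reads on a 2 kb ORF in a sample with 10 M mapped reads:
print(rpkm(500, 2000, 10_000_000))  # 25.0
```

    Without such normalization, a long gene in a deeply sequenced sample would look more "abundant" than a short gene in a shallow one, even at identical expression levels.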

  3. Year 2 Report: Protein Function Prediction Platform

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhou, C E

    2012-04-27

    Upon completion of our second year of development in a 3-year development cycle, we have completed a prototype protein structure-function annotation and function prediction system: Protein Function Prediction (PFP) platform (v.0.5). We have met our milestones for Years 1 and 2 and are positioned to continue development in completion of our original statement of work, or a reasonable modification thereof, in service to DTRA Programs involved in diagnostics and medical countermeasures research and development. The PFP platform is a multi-scale computational modeling system for protein structure-function annotation and function prediction. As of this writing, PFP is the only existing fully automated, high-throughput, multi-scale modeling, whole-proteome annotation platform, and represents a significant advance in the field of genome annotation (Fig. 1). PFP modules perform protein functional annotations at the sequence, systems biology, protein structure, and atomistic levels of biological complexity (Fig. 2). Because these approaches provide orthogonal means of characterizing proteins and suggesting protein function, PFP processing maximizes the protein functional information that can currently be gained by computational means. Comprehensive annotation of pathogen genomes is essential for bio-defense applications in pathogen characterization, threat assessment, and medical countermeasure design and development in that it can short-cut the time and effort required to select and characterize protein biomarkers.

  4. A benchmark for vehicle detection on wide area motion imagery

    NASA Astrophysics Data System (ADS)

    Catrambone, Joseph; Amzovski, Ismail; Liang, Pengpeng; Blasch, Erik; Sheaff, Carolyn; Wang, Zhonghai; Chen, Genshe; Ling, Haibin

    2015-05-01

    Wide area motion imagery (WAMI) has been attracting an increased amount of research attention due to its large spatial and temporal coverage. An important application includes moving target analysis, where vehicle detection is often one of the first steps before advanced activity analysis. While there exist many vehicle detection algorithms, a thorough evaluation of them on WAMI data still remains a challenge, mainly due to the lack of an appropriate benchmark data set. In this paper, we address this need by presenting a new benchmark data set for vehicle detection in WAMI. The benchmark is based on the recently available Wright-Patterson Air Force Base (WPAFB09) dataset and the associated Temple Resolved Uncertainty Target History (TRUTH) target annotation. Trajectory annotations were provided in the original release of the WPAFB09 dataset, but detailed vehicle annotations were not. In addition, static vehicles, e.g., in parking lots, are not identified in the original release. Addressing these issues, we re-annotated the whole dataset with detailed information for each vehicle, including not only a target's location, but also its pose and size. The annotated WAMI data set should be useful to the community as a common benchmark for comparing WAMI detection, tracking, and identification methods.

  5. Measuring What Latent Fingerprint Examiners Consider Sufficient Information for Individualization Determinations

    PubMed Central

    Ulery, Bradford T.; Hicklin, R. Austin; Roberts, Maria Antonia; Buscaglia, JoAnn

    2014-01-01

    Latent print examiners use their expertise to determine whether the information present in a comparison of two fingerprints (or palmprints) is sufficient to conclude that the prints were from the same source (individualization). When fingerprint evidence is presented in court, it is the examiner's determination—not an objective metric—that is presented. This study was designed to ascertain the factors that explain examiners' determinations of sufficiency for individualization. Volunteer latent print examiners (n = 170) were each assigned 22 pairs of latent and exemplar prints for examination, and annotated features, correspondence of features, and clarity. The 320 image pairs were selected specifically to control clarity and quantity of features. The predominant factor differentiating annotations associated with individualization and inconclusive determinations is the count of corresponding minutiae; other factors such as clarity provided minimal additional discriminative value. Examiners' counts of corresponding minutiae were strongly associated with their own determinations; however, due to substantial variation of both annotations and determinations among examiners, one examiner's annotation and determination on a given comparison is a relatively weak predictor of whether another examiner would individualize. The extensive variability in annotations also means that we must treat any individual examiner's minutia counts as interpretations of the (unknowable) information content of the prints: saying “the prints had N corresponding minutiae marked” is not the same as “the prints had N corresponding minutiae.” More consistency in annotations, which could be achieved through standardization and training, should lead to process improvements and provide greater transparency in casework. PMID:25372036

  6. Annotating Protein Functional Residues by Coupling High-Throughput Fitness Profile and Homologous-Structure Analysis.

    PubMed

    Du, Yushen; Wu, Nicholas C; Jiang, Lin; Zhang, Tianhao; Gong, Danyang; Shu, Sara; Wu, Ting-Ting; Sun, Ren

    2016-11-01

    Identification and annotation of functional residues are fundamental questions in protein sequence analysis. Sequence and structure conservation provides valuable information to tackle these questions. It is, however, limited by the incomplete sampling of sequence space in natural evolution. Moreover, proteins often have multiple functions, with overlapping sequences that present challenges to accurate annotation of the exact functions of individual residues by conservation-based methods. Using the influenza A virus PB1 protein as an example, we developed a method to systematically identify and annotate functional residues. We used saturation mutagenesis and high-throughput sequencing to measure the replication capacity of single nucleotide mutations across the entire PB1 protein. After predicting protein stability upon mutations, we identified functional PB1 residues that are essential for viral replication. To further annotate the functional residues important to the canonical or noncanonical functions of viral RNA-dependent RNA polymerase (vRdRp), we performed a homologous-structure analysis with 16 different vRdRp structures. We achieved high sensitivity in annotating the known canonical polymerase functional residues. Moreover, we identified a cluster of noncanonical functional residues located in the loop region of the PB1 β-ribbon. We further demonstrated that these residues were important for PB1 protein nuclear import through the interaction with Ran-binding protein 5. In summary, we developed a systematic and sensitive method to identify and annotate functional residues that are not restrained by sequence conservation. Importantly, this method is generally applicable to other proteins about which homologous-structure information is available. To fully comprehend the diverse functions of a protein, it is essential to understand the functionality of individual residues. 
Current methods are highly dependent on evolutionary sequence conservation, which is usually limited by sampling size. Sequence conservation-based methods are further confounded by structural constraints and multifunctionality of proteins. Here we present a method that can systematically identify and annotate functional residues of a given protein. We used a high-throughput functional profiling platform to identify essential residues. Coupling it with homologous-structure comparison, we were able to annotate multiple functions of proteins. We demonstrated the method with the PB1 protein of influenza A virus and identified novel functional residues in addition to its canonical function as an RNA-dependent RNA polymerase. Not limited to virology, this method is generally applicable to other proteins that can be functionally selected and about which homologous-structure information is available. Copyright © 2016 Du et al.

  7. Model and Interoperability using Meta Data Annotations

    NASA Astrophysics Data System (ADS)

    David, O.

    2011-12-01

    Software frameworks and architectures need metadata to efficiently support model integration. Modelers have to know the context of a model, often stepping into modeling semantics and auxiliary information usually not provided in a concise structure and universal format consumable by a range of (modeling) tools. XML often seems the obvious solution for capturing metadata, but its wide adoption to facilitate model interoperability is limited by XML schema fragmentation, complexity, and verbosity outside of a data-automation process. Ontologies seem to overcome those shortcomings; however, the practical significance of their use remains to be demonstrated. OMS version 3 took a different approach to metadata representation. The fundamental building block of a modular model in OMS is a software component representing a single physical process, calibration method, or data access approach. Here, programming language features known as annotations or attributes were adopted. Within other (non-modeling) frameworks it has been observed that annotations lead to cleaner and leaner application code. Framework-supported model integration, traditionally accomplished using Application Programming Interface (API) calls, is now achieved using descriptive code annotations. Fully annotated components for various hydrological and Ag-system models now provide information directly for (i) model assembly and building, (ii) data flow analysis for implicit multi-threading or visualization, (iii) automated and comprehensive model documentation of component dependencies, physical data properties, (iv) automated model and component testing, calibration, and optimization, and (v) automated audit-traceability to account for all model resources leading to a particular simulation result. Such a non-invasive methodology leads to models and modeling components with only minimal dependencies on the modeling framework but a strong reference to their originating code. 
Since models and modeling components are not directly bound to the framework through specific APIs and/or data types, they can be reused more easily both within the framework and outside of it. While providing all those capabilities, a significant reduction in the size of the model source code was achieved. To assess the benefit of annotations for a modeler, studies were conducted to evaluate the effectiveness of an annotation-based framework approach compared with other modeling frameworks and libraries; a framework-invasiveness study evaluated the effects of framework design on model code quality. A typical hydrological model was implemented across several modeling frameworks and several software metrics were collected. The metrics selected were measures of non-invasive design methods for modeling frameworks from a software engineering perspective. It appears that the use of annotations positively impacts several software quality measures. Experience to date has demonstrated the multi-purpose value of using annotations. Annotations are also a feasible and practical method to enable interoperability among models and modeling frameworks.
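    The annotation idea can be mimicked in a few lines of Python using decorators (OMS itself uses Java annotations; the `documented` decorator, the `SnowMelt` component, and its field names below are invented for illustration). The point is that the component declares its own metadata declaratively, and the framework discovers it by reflection instead of requiring API calls inside the model code.

```python
def documented(description, units=None):
    """Attach descriptive metadata to a component method, standing in for
    the Java annotations used by OMS."""
    def tag(fn):
        fn.meta = {"description": description, "units": units}
        return fn
    return tag

class SnowMelt:
    @documented("daily mean air temperature input", units="degC")
    def set_temperature(self, value):
        self.temperature = value

# The framework reads the metadata by reflection; the component itself
# never calls a framework API:
print(SnowMelt.set_temperature.meta["units"])  # degC
```

    Because the only framework dependency is the decorator itself, the component remains usable as plain Python outside the framework, which is exactly the non-invasiveness argument made above.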

  8. Crowd-assisted polyp annotation of virtual colonoscopy videos

    NASA Astrophysics Data System (ADS)

    Park, Ji Hwan; Nadeem, Saad; Marino, Joseph; Baker, Kevin; Barish, Matthew; Kaufman, Arie

    2018-03-01

    Virtual colonoscopy (VC) allows a radiologist to navigate through a 3D colon model reconstructed from a computed tomography scan of the abdomen, looking for polyps, the precursors of colon cancer. Polyps are seen as protrusions on the colon wall and haustral folds, visible in the VC fly-through videos. A complete review of the colon surface requires full navigation from the rectum to the cecum in antegrade and retrograde directions, a tedious task that takes an average of 30 minutes. Crowdsourcing is a technique in which non-expert users perform certain tasks, such as image or video annotation. In this work, we use crowdsourcing for the examination of complete VC fly-through videos for polyp annotation by non-experts. The motivation for this is to potentially help the radiologist reach a diagnosis in a shorter period of time, and to provide a stronger confirmation of the eventual diagnosis. The crowdsourcing interface includes an interactive tool for the crowd to annotate suspected polyps in the video with an enclosing box. Using our workflow, we achieve an overall polyps-per-patient sensitivity of 87.88% (95.65% for polyps >=5mm and 70% for polyps <5mm). We also demonstrate the efficacy of non-expert users in detecting and annotating polyps and discuss their potential to aid radiologists in VC examinations.
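    The size-stratified sensitivities reported above follow from a simple per-polyp tally. The sketch below shows the arithmetic on made-up annotation results; each polyp is recorded with its size and whether the crowd detected it (the 5 mm cutoff matches the abstract, everything else is illustrative).

```python
def sensitivity_pct(polyps):
    """polyps: list of (size_mm, detected) pairs; returns percent detected."""
    return 100.0 * sum(1 for _, d in polyps if d) / len(polyps)

def stratified(polyps, cutoff_mm=5.0):
    """Overall, >=cutoff, and <cutoff sensitivities, in percent."""
    large = [p for p in polyps if p[0] >= cutoff_mm]
    small = [p for p in polyps if p[0] < cutoff_mm]
    return sensitivity_pct(polyps), sensitivity_pct(large), sensitivity_pct(small)

# Made-up data: five polyps, three detected by the crowd.
overall, large, small = stratified([(6.0, True), (7.0, True), (8.0, False),
                                    (3.0, True), (4.0, False)])
print(overall, large, small)  # overall 60.0, large ≈ 66.7, small 50.0
```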

  9. Clinical performance of an objective methodology to categorize tear film lipid layer patterns

    NASA Astrophysics Data System (ADS)

    Garcia-Resua, Carlos; Pena-Verdeal, Hugo; Giraldez, Maria J.; Yebra-Pimentel, Eva

    2017-08-01

    Purpose: To validate the performance of a new objective application, designated iDEAS (Dry Eye Assessment System), for categorizing different zones of lipid layer patterns (LLPs) in one image. Material and methods: Using the Tearscopeplus and a digital camera attached to a slit-lamp, 50 images were captured and analyzed by 4 experienced optometrists. In each image the observers outlined tear film zones that they clearly identified as a specific LLP. The categorization made by the 4 optometrists (observers 1, 2, 3 and 4) was then compared with the automatic system included in iDEAS (5th observer). Results: In general, observer 3 classified worse than all other observers (observers 1, 2, 4 and the automatic application; Wilcoxon test, p < 0.05). The automatic system behaved similarly to the remaining three observers (observers 1, 2 and 4), showing differences only for the Open meshwork LLP when compared with observer 4 (Wilcoxon test, p = 0.02). For the remaining two observers (observers 1 and 2), no statistically significant differences were found (Wilcoxon test, p > 0.05). Furthermore, we obtained a set of photographs per LLP category for which all optometrists showed agreement when using the new tool. After examining them, we identified the most characteristic features of each LLP, enhancing the descriptions of the patterns proposed by Guillon. Conclusions: The automatic application included in the iDEAS framework is able to provide zones similar to the annotations made by experienced optometrists. Thus, the manual process done by experts can be automated, with the benefit of being unaffected by subjective factors.

  10. PathJam: a new service for integrating biological pathway information.

    PubMed

    Glez-Peña, Daniel; Reboiro-Jato, Miguel; Domínguez, Rubén; Gómez-López, Gonzalo; Pisano, David G; Fdez-Riverola, Florentino

    2010-10-28

    Biological pathways are crucial to much of today's scientific research, including the study of specific biological processes related to human diseases. PathJam is a new comprehensive and freely accessible web-server application that integrates scattered human pathway annotation from several public sources. The tool has been designed both (i) to be intuitive for wet-lab users, providing statistical enrichment analysis of pathway annotations, and (ii) to support the development of new integrative pathway applications. PathJam's unique features and advantages include interactive graphs linking pathways and genes of interest, downloadable results in fully compatible formats, GSEA-compatible output files and a standardized RESTful API.

  11. Lightweight genome viewer: portable software for browsing genomics data in its chromosomal context

    PubMed Central

    Faith, Jeremiah J; Olson, Andrew J; Gardner, Timothy S; Sachidanandam, Ravi

    2007-01-01

    Background Lightweight genome viewer (lwgv) is a web-based tool for visualization of sequence annotations in their chromosomal context. It performs most of the functions of larger genome browsers, while relying on standard flat-file formats and bypassing the database needs of most visualization tools. Visualization as an aid to discovery requires display of novel data in conjunction with static annotations in their chromosomal context. With database-based systems, displaying dynamic results requires temporary tables that need to be tracked for removal. Results lwgv simplifies the visualization of user-generated results on a local computer. The dynamic results of these analyses are written to transient files, which can import static content from a more permanent file. lwgv is currently used in many different applications, from whole genome browsers to single-gene RNAi design visualization, demonstrating its applicability in a large variety of contexts and scales. Conclusion lwgv provides a lightweight alternative to large genome browsers for visualizing biological annotations and dynamic analyses in their chromosomal context. It is particularly suited for applications ranging from short sequences to medium-sized genomes when the creation and maintenance of a large software and database infrastructure is not necessary or desired. PMID:17877794

  12. A Software Architecture for Intelligent Synthesis Environments

    NASA Technical Reports Server (NTRS)

    Filman, Robert E.; Norvig, Peter (Technical Monitor)

    2001-01-01

    NASA's Intelligent Synthesis Environment (ISE) program is a grand attempt to develop a system to transform the way complex artifacts are engineered. This paper discusses a "middleware" architecture for enabling the development of ISE. Desirable elements of such an Intelligent Synthesis Architecture (ISA) include remote invocation; plug-and-play applications; scripting of applications; management of design artifacts, tools, and artifact and tool attributes; common system services; system management; and systematic enforcement of policies. This paper argues that the ISA should extend conventional distributed object technology (DOT), such as CORBA and Product Data Managers, with flexible repositories of product and tool annotations and "plug-and-play" mechanisms for inserting "ility" or orthogonal concerns into the system. I describe the Object Infrastructure Framework, an Aspect Oriented Programming (AOP) environment for developing distributed systems that provides utility insertion and enables consistent annotation maintenance. This technology can be used to enforce policies such as maintaining the annotations of artifacts, particularly the provenance and access control rules of artifacts; performing automatic datatype transformations between representations; supplying alternative servers of the same service; reporting on the status of jobs and the system; conveying privileges throughout an application; supporting long-lived transactions; maintaining version consistency; and providing software redundancy and mobility.

  13. Lightweight genome viewer: portable software for browsing genomics data in its chromosomal context.

    PubMed

    Faith, Jeremiah J; Olson, Andrew J; Gardner, Timothy S; Sachidanandam, Ravi

    2007-09-18

    Lightweight genome viewer (lwgv) is a web-based tool for visualization of sequence annotations in their chromosomal context. It performs most of the functions of larger genome browsers, while relying on standard flat-file formats and bypassing the database needs of most visualization tools. Visualization as an aid to discovery requires display of novel data in conjunction with static annotations in their chromosomal context. With database-based systems, displaying dynamic results requires temporary tables that need to be tracked for removal. lwgv simplifies the visualization of user-generated results on a local computer. The dynamic results of these analyses are written to transient files, which can import static content from a more permanent file. lwgv is currently used in many different applications, from whole genome browsers to single-gene RNAi design visualization, demonstrating its applicability in a large variety of contexts and scales. lwgv provides a lightweight alternative to large genome browsers for visualizing biological annotations and dynamic analyses in their chromosomal context. It is particularly suited for applications ranging from short sequences to medium-sized genomes when the creation and maintenance of a large software and database infrastructure is not necessary or desired.

  14. Selected annotated bibliographies for adaptive filtering of digital image data

    USGS Publications Warehouse

    Mayers, Margaret; Wood, Lynnette

    1988-01-01

    Digital spatial filtering is an important tool both for enhancing the information content of satellite image data and for implementing cosmetic effects that make the imagery more interpretable and appealing to the eye. Spatial filtering is a context-dependent operation that alters the gray level of a pixel by computing a weighted average formed from the gray level values of other pixels in the immediate vicinity. Traditional spatial filtering involves passing a particular filter or set of filters over an entire image. This assumes that the filter parameter values are appropriate for the entire image, which in turn is based on the assumption that the statistics of the image are constant over the image. However, the statistics of an image may vary widely across the image, requiring an adaptive or "smart" filter whose parameters change as a function of the local statistical properties of the image. A pixel would then be averaged only with more typical members of the same population. This annotated bibliography cites some of the work done in the area of adaptive filtering. The methods usually fall into two categories: (a) those that segment the image into subregions, each assumed to have stationary statistics, and use a different filter on each subregion, and (b) those that use a two-dimensional "sliding window" to continuously estimate the filter parameters; these methods may operate in the spatial or frequency domain, or may utilize both. 
They may be used to deal with images degraded by space-variant noise, to suppress undesirable local radiometric statistics while enforcing desirable (user-defined) statistics, to treat problems where space-variant point spread functions are involved, to segment images into regions of constant value for classification, or to "tune" images in order to remove (nonstationary) variations in illumination, noise, contrast, shadows, or haze. Since adaptive filtering, like nonadaptive filtering, is used in image processing to accomplish various goals, this bibliography is organized in subsections based on application areas. Contrast enhancement, edge enhancement, noise suppression, and smoothing are typically performed to correct degradations introduced by the imaging process (for example, degradations due to the optics and electronics of the sensor, or to blurring caused by the intervening atmosphere, uniform motion, or defocused optics). Some of the papers listed may apply to more than one of the above categories; when this happens, the paper is listed under the category for which its emphasis is greatest. A list of survey articles is also supplied; these articles are general discussions of adaptive filters and reviews of work done. Finally, a short list of miscellaneous articles is included that do not fit into any of the above categories but were felt to be sufficiently important to include. This bibliography, listing items published from 1970 through 1987, is extensive but by no means complete. It is intended as a guide for scientists and image analysts, listing references for background information as well as areas of significant development in adaptive filtering.
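    A minimal sketch of a category-(b) sliding-window filter, in the spirit of a local-statistics (Lee-type) filter; the gain formula, window radius, and noise-variance input are illustrative assumptions, not taken from any paper in the bibliography. In flat regions the local variance approaches the noise variance and the output tends toward the window mean (smoothing), while near edges the variance is large and the original pixel is preserved.

```python
def adaptive_filter(img, noise_var, radius=1):
    """Local-statistics filter: out = mean + k * (x - mean), with
    k = max(0, 1 - noise_var / local_var).  k ~ 0 in flat regions
    (output ~ local mean, i.e. smoothing); k ~ 1 near edges
    (output ~ original pixel, i.e. detail preserved)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            win = [img[y][x]
                   for y in range(max(0, i - radius), min(h, i + radius + 1))
                   for x in range(max(0, j - radius), min(w, j + radius + 1))]
            mean = sum(win) / len(win)
            var = sum((v - mean) ** 2 for v in win) / len(win)
            k = max(0.0, 1.0 - noise_var / var) if var > 0 else 0.0
            out[i][j] = mean + k * (img[i][j] - mean)
    return out
```

    This is the sense in which a pixel is "averaged only with more typical members of the same population": the averaging weight adapts to how well the local statistics match the assumed noise.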

  15. A Comprehensive Approach to Sequence-oriented IsomiR annotation (CASMIR): demonstration with IsomiR profiling in colorectal neoplasia.

    PubMed

    Wu, Chung Wah; Evans, Jared M; Huang, Shengbing; Mahoney, Douglas W; Dukek, Brian A; Taylor, William R; Yab, Tracy C; Smyrk, Thomas C; Jen, Jin; Kisiel, John B; Ahlquist, David A

    2018-05-25

    MicroRNA (miRNA) profiling is an important step in studying biological associations and identifying marker candidates. miRNA exists in isoforms, called isomiRs, which may exhibit distinct properties. With conventional profiling methods, limitations in assay and analysis platforms may compromise isomiR interrogation. We introduce a comprehensive approach to sequence-oriented isomiR annotation (CASMIR) to allow unbiased identification of global isomiRs from small RNA sequencing data. In this approach, small RNA reads are maintained as independent sequences instead of being summarized under miRNA names. IsomiR features are identified through step-wise local alignment against canonical forms and precursor sequences. By customizing the reference database, CASMIR is applicable to isomiR annotation across species. To demonstrate its application, we investigated isomiR profiles in normal and neoplastic human colorectal epithelia. We also ran miRDeep2, a popular miRNA analysis algorithm, to validate isomiRs annotated by CASMIR. With CASMIR, specific and biologically relevant isomiR patterns could be identified. We note that specific isomiRs are often more abundant than their canonical forms. We identify isomiRs that are commonly up-regulated in both colorectal cancer and advanced adenoma, and illustrate the advantages of targeting isomiRs as potential biomarkers over canonical forms. Studying miRNAs at the isomiR level could reveal new insight into miRNA biology and inform assay design for specific isomiRs. CASMIR facilitates comprehensive annotation of isomiR features in small RNA sequencing data for isomiR profiling and differential expression analysis.
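    CASMIR's step-wise local alignment is more involved than a short sketch allows, but its core bookkeeping, expressing a read's 5' and 3' end shifts relative to the canonical miRNA within its precursor, can be illustrated with exact string matching. The function name and sequences below are invented for illustration.

```python
def classify_isomir(read, canonical, precursor):
    """Report the read's end shifts relative to the canonical miRNA by
    locating both within the precursor (exact match only).  Positive
    5p_shift = 5' end trimmed; positive 3p_shift = 3' end extended
    with templated precursor bases."""
    r, c = precursor.find(read), precursor.find(canonical)
    if r < 0 or c < 0:
        return None  # read or canonical not an exact substring
    return {"5p_shift": r - c,
            "3p_shift": (r + len(read)) - (c + len(canonical))}

# Invented sequences: a read missing one 5' base and gaining one 3' base.
print(classify_isomir("CCTTTG", "CCCTTT", "AAACCCTTTGGG"))
```

    A real pipeline would additionally handle non-templated 3' additions and sequencing errors, which is why CASMIR uses local alignment rather than exact matching.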

  16. Watch-and-Comment as an Approach to Collaboratively Annotate Points of Interest in Video and Interactive-TV Programs

    NASA Astrophysics Data System (ADS)

    Pimentel, Maria Da Graça C.; Cattelan, Renan G.; Melo, Erick L.; Freitas, Giliard B.; Teixeira, Cesar A.

    In earlier work we proposed the Watch-and-Comment (WaC) paradigm as the seamless capture of multimodal comments made by one or more users while watching a video, resulting in the automatic generation of multimedia documents specifying annotated interactive videos. The aim is to allow services to be offered by applying document engineering techniques to the multimedia document generated automatically. The WaC paradigm was demonstrated with a WaCTool prototype application which supports multimodal annotation over video frames and segments, producing a corresponding interactive video. In this chapter, we extend the WaC paradigm to consider contexts in which several viewers may use their own mobile devices while watching and commenting on an interactive-TV program. We first review our previous work. Next, we discuss scenarios in which mobile users can collaborate via the WaC paradigm. We then present a new prototype application which allows users to employ their mobile devices to collaboratively annotate points of interest in video and interactive-TV programs. We also detail the current software infrastructure which supports our new prototype; the infrastructure extends the Ginga middleware for the Brazilian Digital TV with an implementation of the UPnP protocol - the aim is to provide the seamless integration of the users' mobile devices into the TV environment. As a result, the work reported in this chapter defines the WaC paradigm for the mobile-user as an approach to allow the collaborative annotation of the points of interest in video and interactive-TV programs.

  17. TH-CD-206-05: Machine-Learning Based Segmentation of Organs at Risks for Head and Neck Radiotherapy Planning

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ibragimov, B; Pernus, F; Strojan, P

    Purpose: Accurate and efficient delineation of the tumor target and organs-at-risk is essential for the success of radiotherapy. In reality, despite decades of intense research effort, auto-segmentation has not yet become clinical practice. In this study, we present, for the first time, a deep learning-based classification algorithm for autonomous segmentation in head and neck (HaN) treatment planning. Methods: Fifteen HaN datasets of CT, MR and PET images with manual annotation of organs-at-risk (OARs), including the spinal cord, brainstem, optic nerves, chiasm, eyes, mandible, tongue and parotid glands, were collected and saved in a library of plans. We also have ten super-resolution MR images of the tongue area, where the genioglossus and inferior longitudinalis tongue muscles are defined as organs of interest. We applied the concepts of random forest- and deep learning-based object classification for automated image annotation, with the aim of using machine learning to facilitate the head and neck radiotherapy planning process. In this new paradigm of segmentation, random forests were used for landmark-assisted segmentation of the super-resolution MR images. As an alternative to auto-segmentation with random forest-based landmark detection, deep convolutional neural networks were developed for voxel-wise segmentation of OARs in single- and multi-modal images. The network consisted of three pairs of convolution and pooling layers, a ReLU layer and a softmax layer. Results: We present a comprehensive study on using machine learning concepts for auto-segmentation of OARs and tongue muscles in HaN radiotherapy planning. An accuracy of 81.8% in terms of the Dice coefficient was achieved for segmentation of the genioglossus and inferior longitudinalis tongue muscles. Preliminary results of OAR segmentation also indicate that deep learning affords unprecedented opportunities to improve the accuracy and robustness of radiotherapy planning.
    Conclusion: A novel machine learning framework has been developed for image annotation and structure segmentation. Our results indicate the great potential of deep learning in radiotherapy treatment planning.
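
The Dice coefficient used above to report segmentation accuracy is straightforward to compute; a minimal sketch for binary masks (assuming NumPy arrays, and not tied to the authors' pipeline):

```python
import numpy as np

def dice(seg, ref):
    """Dice similarity coefficient between two binary masks.

    Returns 2|A∩B| / (|A|+|B|); defined as 1.0 when both masks are empty.
    """
    seg, ref = np.asarray(seg, bool), np.asarray(ref, bool)
    inter = np.logical_and(seg, ref).sum()
    denom = seg.sum() + ref.sum()
    return 2.0 * inter / denom if denom else 1.0
```

A Dice coefficient of 0.818, as reported for the tongue muscles, thus means the overlap between automatic and manual contours is 81.8% of their average size.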

  18. A Digital Approach to Learning Petrology

    NASA Astrophysics Data System (ADS)

    Reid, M. R.

    2011-12-01

    In the undergraduate igneous and metamorphic petrology course at Northern Arizona University, we are employing petrographic microscopes equipped with relatively inexpensive (~$200) digital cameras that are linked to pen-tablet computers. The camera-tablet systems can assist student learning in a variety of ways. Images provided by the tablet computers can help students filter the visually complex specimens they examine. Instructors and students can simultaneously view the same petrographic features captured by the cameras and exchange information about them by pointing to salient features with the tablet pen. These images can become part of a virtual mineral/rock/texture portfolio tailored to individual students' needs. Captured digital illustrations can be annotated with digital ink or computer graphics tools; this activity emulates essential features of more traditional line drawings (visualizing an appropriate feature and selecting a representative image of it, internalizing the feature through studying and annotating it) while minimizing the frustration that many students feel about drawing. In these ways, we aim to help students progress more efficiently from novice to expert. A number of our petrology laboratory exercises involve use of the camera-tablet systems for collaborative learning. Observational responsibilities are distributed among individual members of teams in order to increase interdependence and accountability, and to encourage efficiency. Annotated digital images are used to share students' findings and arrive at an understanding of an entire rock suite. This interdependence increases the individual's sense of responsibility for their work, and reporting out encourages students to practice using technical vocabulary and to defend their observations. Pre- and post-course student interest in the camera-tablet systems has been assessed.
In a post-course survey, the majority of students reported that, if available, they would use camera-tablet systems to capture microscope images (77%) and to make notes on images (71%). An informal focus group recommended introducing the cameras as soon as possible and having them available for making personal mineralogy/petrology portfolios. Because the stakes are perceived as high, use of the camera-tablet systems for peer-peer learning has been progressively modified to bolster student confidence in their collaborative efforts.

  19. Video Salient Object Detection via Fully Convolutional Networks.

    PubMed

    Wang, Wenguan; Shen, Jianbing; Shao, Ling

    This paper proposes a deep learning model to efficiently detect salient regions in videos. It addresses two important issues: 1) deep video saliency model training with the absence of sufficiently large and pixel-wise annotated video data and 2) fast video saliency training and detection. The proposed deep video saliency network consists of two modules, for capturing the spatial and temporal saliency information, respectively. The dynamic saliency model, explicitly incorporating saliency estimates from the static saliency model, directly produces spatiotemporal saliency inference without time-consuming optical flow computation. We further propose a novel data augmentation technique that simulates video training data from existing annotated image data sets, which enables our network to learn diverse saliency information and prevents overfitting with the limited number of training videos. Leveraging our synthetic video data (150K video sequences) and real videos, our deep video saliency model successfully learns both spatial and temporal saliency cues, thus producing accurate spatiotemporal saliency estimate. We advance the state-of-the-art on the densely annotated video segmentation data set (MAE of .06) and the Freiburg-Berkeley Motion Segmentation data set (MAE of .07), and do so with much improved speed (2 fps with all steps).
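
The augmentation idea, simulating video training data from annotated still images, can be sketched as follows (a simplified stand-in for the paper's technique: here the synthetic motion is just a horizontal shift applied jointly to the image and its saliency mask):

```python
import numpy as np

def simulate_video(image, mask, n_frames=4, dx=1):
    """Generate pseudo-video frames from one annotated still image.

    Each frame shifts the image and its saliency mask by dx pixels,
    so a static saliency dataset yields temporally coherent training pairs.
    """
    frames, masks = [], []
    for t in range(n_frames):
        shift = t * dx
        frames.append(np.roll(image, shift, axis=1))
        masks.append(np.roll(mask, shift, axis=1))
    return np.stack(frames), np.stack(masks)
```

Because the ground-truth mask moves with the image, the temporal module sees consistent motion cues without any real video annotation.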

  20. GarlicESTdb: an online database and mining tool for garlic EST sequences.

    PubMed

    Kim, Dae-Won; Jung, Tae-Sung; Nam, Seong-Hyeuk; Kwon, Hyuk-Ryul; Kim, Aeri; Chae, Sung-Hwa; Choi, Sang-Haeng; Kim, Dong-Wook; Kim, Ryong Nam; Park, Hong-Seog

    2009-05-18

    Allium sativum, commonly known as garlic, is a species in the onion genus (Allium), a large and diverse genus containing over 1,250 species. Its close relatives include chives, onion, leek and shallot. Garlic has been used throughout recorded history for culinary and medicinal purposes and for its health benefits. Interest in garlic is currently increasing because of its nutritional and pharmaceutical value, including potential benefits against high blood pressure, high cholesterol, atherosclerosis and cancer. Despite this, no comprehensive databases of garlic Expressed Sequence Tags (ESTs) are available for gene discovery and future genome annotation efforts. We therefore developed a new garlic database and applications to enable comprehensive analysis of garlic gene expression. GarlicESTdb is an integrated database and mining tool for large-scale garlic (Allium sativum) EST sequencing. A total of 21,595 ESTs collected from an in-house cDNA library were used to construct the database. The analysis pipeline is an automated system written in Java and consists of the following components: automatic preprocessing of EST reads, assembly of raw sequences, annotation of the assembled sequences, storage of the analyzed information in MySQL databases, and graphic display of all processed data. A web application was implemented with the latest J2EE (Java 2 Platform, Enterprise Edition) software technology (JSP/EJB/Java Servlet) for browsing and querying the database and for mapping annotated enzymes to KEGG pathways; the AJAX framework was also used in part for creating dynamic web pages on the client side. The online resources, such as putative annotations, single nucleotide polymorphism (SNP) and tandem repeat data sets, can be searched by text, explored on the website, searched using BLAST, and downloaded. To archive more significant BLAST results, a curation system was introduced with which biologists can easily edit best-hit annotation information for others to view.
    The GarlicESTdb web application is freely available at http://garlicdb.kribb.re.kr. GarlicESTdb is the first integrated online information database of EST sequences isolated from garlic that can be freely accessed and downloaded. It has many useful features for interactive mining of EST contigs and datasets from each library, including curation of annotated information, expression profiling, information retrieval, and summary statistics of functional annotation. Consequently, the development of GarlicESTdb will provide a crucial contribution to biologists for data mining and more efficient experimental studies.
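
A naive stand-in for one step of such a pipeline, calling candidate SNPs from a gap-padded alignment of EST reads, might look like this (illustrative only; the actual GarlicESTdb pipeline is considerably more involved):

```python
def candidate_snps(aligned_reads, min_reads=2):
    """Scan columns of a gap-padded EST alignment for candidate SNPs.

    A column is a candidate when at least min_reads reads cover it
    with unambiguous bases and those bases disagree.
    """
    snps = []
    for pos, column in enumerate(zip(*aligned_reads)):
        bases = [b for b in column if b in "ACGT"]  # ignore gaps/ambiguity
        if len(bases) >= min_reads and len(set(bases)) > 1:
            snps.append((pos, sorted(set(bases))))
    return snps
```

Real EST SNP calling would additionally weigh base qualities and read depth before reporting a position.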

  1. Workflow and web application for annotating NCBI BioProject transcriptome data.

    PubMed

    Vera Alvarez, Roberto; Medeiros Vidal, Newton; Garzón-Martínez, Gina A; Barrero, Luz S; Landsman, David; Mariño-Ramírez, Leonardo

    2017-01-01

    The volume of transcriptome data is growing exponentially due to rapid improvement of experimental technologies. In response, large central resources such as those of the National Center for Biotechnology Information (NCBI) are continually adapting their computational infrastructure to accommodate this large influx of data. New and specialized databases, such as the Transcriptome Shotgun Assembly Sequence Database (TSA) and the Sequence Read Archive (SRA), have been created to aid the development and expansion of centralized repositories. Although the central resource databases are under continual development, they do not include automatic pipelines to increase the annotation of newly deposited data. Therefore, third-party applications are required to achieve that aim. Here, we present an automatic workflow and web application for the annotation of transcriptome data. The workflow creates secondary data, such as sequencing reads and BLAST alignments, which are available through the web application. Both are based on freely available bioinformatics tools and on scripts developed in-house. The interactive web application provides a search engine and several browser utilities. Graphical views of transcript alignments are available through SeqViewer, an embedded tool developed by NCBI for viewing biological sequence data. The web application is tightly integrated with other NCBI web applications and tools to extend the functionality of data processing and interconnectivity. We present a case study for the species Physalis peruviana with data generated from BioProject ID 67621. URL: http://www.ncbi.nlm.nih.gov/projects/physalis/. Published by Oxford University Press 2017. This work is written by US Government employees and is in the public domain in the US.

  2. TerraLook: Providing easy, no-cost access to satellite images for busy people and the technologically disinclined

    USGS Publications Warehouse

    Geller, G.N.; Fosnight, E.A.; Chaudhuri, Sambhudas

    2008-01-01

    Access to satellite images has been largely limited to communities with specialized tools and expertise, even though images could also benefit other communities. This situation has resulted in underutilization of the data. TerraLook, which consists of collections of georeferenced JPEG images and an open source toolkit to use them, makes satellite images available to those lacking experience with remote sensing. Users can find, roam, and zoom images, create and display vector overlays, adjust and annotate images so they can be used as a communication vehicle, compare images taken at different times, and perform other activities useful for natural resource management, sustainable development, education, and other activities. © 2007 IEEE.

  3. TerraLook: Providing easy, no-cost access to satellite images for busy people and the technologically disinclined

    USGS Publications Warehouse

    Geller, G.N.; Fosnight, E.A.; Chaudhuri, Sambhudas

    2007-01-01

    Access to satellite images has been largely limited to communities with specialized tools and expertise, even though images could also benefit other communities. This situation has resulted in underutilization of the data. TerraLook, which consists of collections of georeferenced JPEG images and an open source toolkit to use them, makes satellite images available to those lacking experience with remote sensing. Users can find, roam, and zoom images, create and display vector overlays, adjust and annotate images so they can be used as a communication vehicle, compare images taken at different times, and perform other activities useful for natural resource management, sustainable development, education, and other activities. © 2007 IEEE.

  4. Treelink: data integration, clustering and visualization of phylogenetic trees.

    PubMed

    Allende, Christian; Sohn, Erik; Little, Cedric

    2015-12-29

    Phylogenetic trees are central to a wide range of biological studies. In many of these studies, tree nodes need to be associated with a variety of attributes. For example, in studies concerned with viral relationships, tree nodes are associated with epidemiological information, such as location, age and subtype. Gene trees used in comparative genomics are usually linked with taxonomic information, such as functional annotations and events. A wide variety of tree visualization and annotation tools have been developed in the past; however, none of them is intended for integrative and comparative analysis. Treelink is platform-independent software for linking datasets and sequence files to phylogenetic trees. The application allows automated integration of datasets with trees for operations such as classifying a tree based on a field or showing the distribution of selected data attributes in branches and leaves. Genomic and proteomic sequences can also be linked to the tree and extracted from internal and external nodes. A novel clustering algorithm to simplify trees and display the most divergent clades was also developed, where validation can be achieved using the data integration and classification function. Integrated geographical information allows ancestral character reconstruction for phylogeographic plotting based on parsimony and likelihood algorithms. Our software can successfully integrate phylogenetic trees with different data sources, and perform operations to differentiate and visualize those differences within a tree. File support includes the most popular formats, such as Newick and CSV. Exporting visualizations as images, cluster outputs and genomic sequences is supported. Treelink is available as a web and desktop application at http://www.treelinkapp.com .
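
Parsimony-based ancestral character reconstruction of the kind used for phylogeographic plotting can be illustrated with the classic Fitch algorithm (a minimal sketch, not Treelink's implementation; trees are nested 2-tuples and leaves carry a single character state such as a location label):

```python
def fitch_sets(tree):
    """Bottom-up pass of Fitch parsimony.

    Leaves contribute a singleton state set; an internal node takes the
    intersection of its children's sets, or their union (at the cost of
    one extra state change) when the intersection is empty.
    Returns (candidate ancestral states at this node, parsimony cost).
    """
    if isinstance(tree, str):               # leaf: a single observed state
        return {tree}, 0
    (ls, lc), (rs, rc) = (fitch_sets(t) for t in tree)
    inter = ls & rs
    if inter:
        return inter, lc + rc
    return ls | rs, lc + rc + 1             # union: one additional change
```

A top-down pass would then pick one state per node from these candidate sets to draw the reconstruction on the tree.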

  5. A Benchmark Dataset and Saliency-guided Stacked Autoencoders for Video-based Salient Object Detection.

    PubMed

    Li, Jia; Xia, Changqun; Chen, Xiaowu

    2017-10-12

    Image-based salient object detection (SOD) has been extensively studied in past decades. However, video-based SOD is much less explored due to the lack of large-scale video datasets within which salient objects are unambiguously defined and annotated. Toward this end, this paper proposes a video-based SOD dataset that consists of 200 videos. In constructing the dataset, we manually annotate all objects and regions over 7,650 uniformly sampled keyframes and collect the eye-tracking data of 23 subjects who free-view all videos. From the user data, we find that salient objects in a video can be defined as objects that consistently pop-out throughout the video, and objects with such attributes can be unambiguously annotated by combining manually annotated object/region masks with eye-tracking data of multiple subjects. To the best of our knowledge, it is currently the largest dataset for video-based salient object detection. Based on this dataset, this paper proposes an unsupervised baseline approach for video-based SOD by using saliency-guided stacked autoencoders. In the proposed approach, multiple spatiotemporal saliency cues are first extracted at the pixel, superpixel and object levels. With these saliency cues, stacked autoencoders are constructed in an unsupervised manner that automatically infers a saliency score for each pixel by progressively encoding the high-dimensional saliency cues gathered from the pixel and its spatiotemporal neighbors. In experiments, the proposed unsupervised approach is compared with 31 state-of-the-art models on the proposed dataset and outperforms 30 of them, including 19 image-based classic (unsupervised or non-deep learning) models, six image-based deep learning models, and five video-based unsupervised models. Moreover, benchmarking results show that the proposed dataset is very challenging and has the potential to boost the development of video-based SOD.

  6. A Standardised Vocabulary for Identifying Benthic Biota and Substrata from Underwater Imagery: The CATAMI Classification Scheme

    PubMed Central

    Jordan, Alan; Rees, Tony; Gowlett-Holmes, Karen

    2015-01-01

    Imagery collected by still and video cameras is an increasingly important tool for minimal impact, repeatable observations in the marine environment. Data generated from imagery includes identification, annotation and quantification of biological subjects and environmental features within an image. To be long-lived and useful beyond their project-specific initial purpose, and to maximize their utility across studies and disciplines, marine imagery data should use a standardised vocabulary of defined terms. This would enable the compilation of regional, national and/or global data sets from multiple sources, contributing to broad-scale management studies and development of automated annotation algorithms. The classification scheme developed under the Collaborative and Automated Tools for Analysis of Marine Imagery (CATAMI) project provides such a vocabulary. The CATAMI classification scheme introduces standardised terminology, acknowledged Australia-wide, for annotating benthic substrates and biota in marine imagery. It combines coarse-level taxonomy and morphology, and is a flexible, hierarchical classification that bridges the gap between habitat/biotope characterisation and taxonomy, acknowledging limitations when describing biological taxa through imagery. It is fully described, documented, and maintained through curated online databases, and can be applied across benthic image collection methods, annotation platforms and scoring methods. Following release in 2013, the CATAMI classification scheme was taken up by a wide variety of users, including government, academia and industry. This rapid acceptance highlights the scheme’s utility and the potential to facilitate broad-scale multidisciplinary studies of marine ecosystems when applied globally. Here we present the CATAMI classification scheme, describe its conception and features, and discuss its utility and the opportunities as well as challenges arising from its use. PMID:26509918

  7. Discovering gene annotations in biomedical text databases

    PubMed Central

    Cakmak, Ali; Ozsoyoglu, Gultekin

    2008-01-01

    Background Genes and gene products are frequently annotated with Gene Ontology concepts based on the evidence provided in genomics articles. Manually locating and curating information about a genomic entity from the biomedical literature requires vast amounts of human effort. Hence, there is clearly a need for automated computational tools to annotate genes and gene products with Gene Ontology concepts by computationally capturing the related knowledge embedded in textual data. Results In this article, we present an automated genomic entity annotation system, GEANN, which extracts information about the characteristics of genes and gene products in article abstracts from PubMed, and translates the discovered knowledge into Gene Ontology (GO) concepts, a widely-used standardized vocabulary of genomic traits. GEANN utilizes textual "extraction patterns" and a semantic matching framework to locate phrases matching a pattern and produce Gene Ontology annotations for genes and gene products. In our experiments, GEANN reached a precision of 78% at a recall of 61%. On a select set of Gene Ontology concepts, GEANN either outperforms or is comparable to two other automated annotation studies. Use of WordNet for semantic pattern matching improves the precision and recall by 24% and 15%, respectively, and the improvement due to semantic pattern matching becomes more apparent as the Gene Ontology terms become more general. Conclusion GEANN is useful for two distinct purposes: (i) automating the annotation of genomic entities with Gene Ontology concepts, and (ii) providing existing annotations with additional "evidence articles" from the literature. The use of textual extraction patterns that are constructed based on the existing annotations achieves high precision.
    The semantic pattern matching framework provides a more flexible pattern matching scheme than "exact matching", with the advantage of locating approximate pattern occurrences with similar semantics. The relatively low recall of our pattern-based approach may be enhanced either by employing a probabilistic annotation framework based on annotation neighbourhoods in textual data, or by adjusting the statistical enrichment threshold to lower values for applications that place more value on achieving higher recall. PMID:18325104
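
The idea of a textual extraction pattern can be illustrated with a toy example (the pattern below is hypothetical, not one of GEANN's actual patterns, and it omits the semantic matching layer entirely):

```python
import re

# Illustrative pattern: map "<gene> is involved in <process>" phrases in
# an abstract to candidate GO biological-process annotations.
PATTERN = re.compile(r"(\w+) is involved in ([\w\s]+?)(?:[.,]|$)")

def extract_annotations(abstract):
    """Return (gene, process) pairs matched by the extraction pattern."""
    return [(gene, process.strip())
            for gene, process in PATTERN.findall(abstract)]
```

Semantic matching, as described above, would additionally accept phrase variants whose words are WordNet-related to the pattern's terms rather than requiring an exact textual match.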

  8. GeneTools--application for functional annotation and statistical hypothesis testing.

    PubMed

    Beisvag, Vidar; Jünge, Frode K R; Bergum, Hallgeir; Jølsum, Lars; Lydersen, Stian; Günther, Clara-Cecilie; Ramampiaro, Heri; Langaas, Mette; Sandvik, Arne K; Laegreid, Astrid

    2006-10-24

    Modern biology has shifted from "one gene" approaches to methods for genomic-scale analysis such as microarray technology, which allows simultaneous measurement of thousands of genes. This has created a need for tools facilitating interpretation of biological data in "batch" mode. However, such tools often leave the investigator with large volumes of apparently unorganized information. To meet this interpretation challenge, gene-set, or cluster, testing has become a popular analytical tool. Many gene-set testing methods and software packages are now available, most of which use a variety of statistical tests to assess the genes in a set for biological information. However, the field is still evolving, and there is a great need for "integrated" solutions. GeneTools is a web service providing access to a database that brings together information from a broad range of resources. The annotation data are updated weekly, ensuring that users get the most recently available data. Data submitted by the user are stored in the database, where they can easily be updated, shared between users and exported in various formats. GeneTools provides three different tools: i) the NMC Annotation Tool, which offers annotations from several databases, such as UniGene, Entrez Gene, SwissProt and Gene Ontology, in both single- and batch-search mode; ii) the GO Annotator Tool, where users can add new gene ontology (GO) annotations to genes of interest; these user-defined GO annotations can be used in further analysis or exported for public distribution; and iii) eGOn, a tool for visualization and statistical hypothesis testing of GO category representation. As the first GO tool to do so, eGOn supports hypothesis testing for three different situations (the master-target situation, the mutually exclusive target-target situation and the intersecting target-target situation). An important additional function is an evidence-code filter that allows users to select the GO annotations for the analysis.
    GeneTools is the first "all in one" annotation tool, providing users with rapid extraction of highly relevant gene annotation data for, e.g., thousands of genes or clones at once. It allows a user to define and archive new GO annotations, and it supports hypothesis testing related to GO category representations. GeneTools is freely available through www.genetools.no.
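
Hypothesis tests of GO category representation are commonly based on the hypergeometric distribution; a minimal sketch of such a test (an illustrative stand-in, not eGOn's actual statistics):

```python
from math import comb

def hypergeom_enrichment_p(N, K, n, k):
    """One-sided hypergeometric p-value for GO category enrichment.

    Probability of observing k or more genes from a GO category of size K
    when drawing a target list of n genes from a master list of N genes.
    """
    upper = min(K, n)
    return sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, upper + 1)) / comb(N, n)
```

A small p-value indicates the GO category is over-represented in the target list relative to the master list.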

  9. Discovering gene annotations in biomedical text databases.

    PubMed

    Cakmak, Ali; Ozsoyoglu, Gultekin

    2008-03-06

    Genes and gene products are frequently annotated with Gene Ontology concepts based on the evidence provided in genomics articles. Manually locating and curating information about a genomic entity from the biomedical literature requires vast amounts of human effort. Hence, there is clearly a need for automated computational tools to annotate genes and gene products with Gene Ontology concepts by computationally capturing the related knowledge embedded in textual data. In this article, we present an automated genomic entity annotation system, GEANN, which extracts information about the characteristics of genes and gene products in article abstracts from PubMed, and translates the discovered knowledge into Gene Ontology (GO) concepts, a widely-used standardized vocabulary of genomic traits. GEANN utilizes textual "extraction patterns" and a semantic matching framework to locate phrases matching a pattern and produce Gene Ontology annotations for genes and gene products. In our experiments, GEANN reached a precision of 78% at a recall of 61%. On a select set of Gene Ontology concepts, GEANN either outperforms or is comparable to two other automated annotation studies. Use of WordNet for semantic pattern matching improves the precision and recall by 24% and 15%, respectively, and the improvement due to semantic pattern matching becomes more apparent as the Gene Ontology terms become more general. GEANN is useful for two distinct purposes: (i) automating the annotation of genomic entities with Gene Ontology concepts, and (ii) providing existing annotations with additional "evidence articles" from the literature. The use of textual extraction patterns that are constructed based on the existing annotations achieves high precision. The semantic pattern matching framework provides a more flexible pattern matching scheme than "exact matching", with the advantage of locating approximate pattern occurrences with similar semantics.
    The relatively low recall of our pattern-based approach may be enhanced either by employing a probabilistic annotation framework based on annotation neighbourhoods in textual data, or by adjusting the statistical enrichment threshold to lower values for applications that place more value on achieving higher recall.

  10. Combining Human Computing and Machine Learning to Make Sense of Big (Aerial) Data for Disaster Response.

    PubMed

    Ofli, Ferda; Meier, Patrick; Imran, Muhammad; Castillo, Carlos; Tuia, Devis; Rey, Nicolas; Briant, Julien; Millet, Pauline; Reinhard, Friedrich; Parkan, Matthew; Joost, Stéphane

    2016-03-01

    Aerial imagery captured via unmanned aerial vehicles (UAVs) is playing an increasingly important role in disaster response. Unlike satellite imagery, aerial imagery can be captured and processed within hours rather than days. In addition, the spatial resolution of aerial imagery is an order of magnitude higher than the imagery produced by the most sophisticated commercial satellites today. Both the United States Federal Emergency Management Agency (FEMA) and the European Commission's Joint Research Center (JRC) have noted that aerial imagery will inevitably present a big data challenge. The purpose of this article is to get ahead of this future challenge by proposing a hybrid crowdsourcing and real-time machine learning solution to rapidly process large volumes of aerial data for disaster response in a time-sensitive manner. Crowdsourcing can be used to annotate features of interest in aerial images (such as damaged shelters and roads blocked by debris). These human-annotated features can then be used to train a supervised machine learning system to learn to recognize such features in new unseen images. In this article, we describe how this hybrid solution for image analysis can be implemented as a module (i.e., Aerial Clicker) to extend an existing platform called Artificial Intelligence for Disaster Response (AIDR), which has already been deployed to classify microblog messages during disasters using its Text Clicker module, including in response to Cyclone Pam, a category 5 cyclone that devastated Vanuatu in March 2015. The hybrid solution we present can be applied to both aerial and satellite imagery and has applications beyond disaster response such as wildlife protection, human rights, and archeological exploration. As a proof of concept, we recently piloted this solution using very high-resolution aerial photographs of a wildlife reserve in Namibia to support rangers with their wildlife conservation efforts (SAVMAP project, http://lasig.epfl.ch/savmap ).
The results suggest that the platform we have developed to combine crowdsourcing and machine learning to make sense of large volumes of aerial images can be used for disaster response.
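The crowd-to-classifier pipeline sketched above starts by reducing redundant volunteer annotations to a single training label per image tile. A minimal majority-vote aggregation sketch (tile ids and label names are illustrative, not taken from the AIDR/Aerial Clicker implementation):

```python
from collections import Counter

def majority_label(votes):
    """Return the most common label among the crowd annotations for one tile."""
    label, _count = Counter(votes).most_common(1)[0]
    return label

def aggregate(annotations):
    """Map each tile id to its majority-vote label.

    annotations: dict of tile_id -> list of labels from different volunteers.
    """
    return {tile: majority_label(votes) for tile, votes in annotations.items()}

votes = {
    "tile_01": ["damaged", "damaged", "intact"],
    "tile_02": ["intact", "intact", "intact"],
}
print(aggregate(votes))  # {'tile_01': 'damaged', 'tile_02': 'intact'}
```

The aggregated labels would then serve as the supervised training set for the feature classifier.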

  11. Medical student appraisal: applications for bedside patient education.

    PubMed

    Markman, T M; Sampognaro, P J; Mitchell, S L; Weeks, S R; Khalifian, S; Dattilo, J R

    2013-01-01

    Medical students are often afforded the privilege of counselling patients. In the past, resources were limited to pen and paper or anatomic models. The evolution of mobile applications allows for limitless access to resources that facilitate bedside patient education. To evaluate the utility of six applications in patient education and promote awareness of implementing mobile resources in clinical care, six medical students rotating on various clerkships evaluated a total of six mobile applications. Strengths, limitations, and suggested uses in clinical care were identified. Applications included Meditoons™, VisiblePatient™, DrawMD™, CardioTeach™, Visual Anatomy™, and 360° Patient Education Suite™. Data were generated from narrative responses supplied by each student during the evaluation period. Bedside teaching was enhanced by professional illustrations and animations depicting anatomy and pathophysiology. Impromptu teaching was facilitated, as resources were conveniently available on a student's smartphone or tablet. The ability to annotate and modify images and subsequently email them to patients was an extraordinary improvement in provider-patient communication. Universal limitations included small smartphone screens and the novelty of new technology. Mobile applications have the potential to greatly enhance patient education and simultaneously build rapport. Endless opportunities exist for their integration in clinical practice, particularly for new diagnoses, consent for procedures, and at the time of discharge. Providers should be encouraged to try new applications and utilize them with patients.

  12. compMS2Miner: An Automatable Metabolite Identification, Visualization, and Data-Sharing R Package for High-Resolution LC-MS Data Sets.

    PubMed

    Edmands, William M B; Petrick, Lauren; Barupal, Dinesh K; Scalbert, Augustin; Wilson, Mark J; Wickliffe, Jeffrey K; Rappaport, Stephen M

    2017-04-04

    A long-standing challenge of untargeted metabolomic profiling by ultrahigh-performance liquid chromatography-high-resolution mass spectrometry (UHPLC-HRMS) is the efficient transition from unknown mass spectral features to confident metabolite annotations. The compMS2Miner (Comprehensive MS2 Miner) package was developed in the R language to facilitate rapid, comprehensive feature annotation using a peak-picker output and MS2 data files as inputs. The number of MS2 spectra that can be collected during a metabolomic profiling experiment far exceeds what can feasibly undergo painstaking manual interpretation; therefore, a degree of software workflow autonomy is required for broad-scale metabolite annotation. compMS2Miner integrates many useful tools in a single workflow for metabolite annotation and also provides a means to overview the MS2 data with a Web application GUI, compMS2Explorer (Comprehensive MS2 Explorer), that also facilitates data sharing and transparency. The automatable compMS2Miner workflow consists of the following steps: (i) matching unknown MS1 features to precursor MS2 scans, (ii) filtration of spectral noise (dynamic noise filter), (iii) generation of composite mass spectra by multiple similar spectrum signal summation and redundant/contaminant spectra removal, (iv) interpretation of possible fragment ion substructure using an internal database, (v) annotation of unknowns with chemical and spectral databases with prediction of mammalian biotransformation metabolites, wrapper functions for in silico fragmentation software, nearest neighbor chemical similarity scoring, random forest based retention time prediction, text-mining based false positive removal/true positive ranking, chemical taxonomic prediction and differential evolution based global annotation score optimization, and (vi) network graph visualizations, data curation, and sharing via the compMS2Explorer application.
    Metabolite identities and comments can also be recorded using an interactive table within compMS2Explorer. The utility of the package is illustrated with a data set of blood serum samples from 7 diet-induced obese (DIO) and 7 nonobese (NO) C57BL/6J mice, which were also treated with an antibiotic (streptomycin) to knock down the gut microbiota. The results of fully autonomous and objective usage of compMS2Miner are presented here. All automatically annotated spectra output by the workflow are provided in the Supporting Information and can alternatively be explored as publicly available compMS2Explorer applications for both positive and negative modes ( https://wmbedmands.shinyapps.io/compMS2_mouseSera_POS and https://wmbedmands.shinyapps.io/compMS2_mouseSera_NEG ). The workflow provided rapid annotation of a diversity of endogenous and gut-microbially derived metabolites affected by both diet and antibiotic treatment, which conformed to previously published reports. Composite spectra (n = 173) were autonomously matched to entries of the MassBank of North America (MoNA) spectral repository. These experimental and virtual (lipidBlast) spectra corresponded to 29 common endogenous compound classes (e.g., 51 lysophosphatidylcholine spectra) and were then used to assess the ranking capability of 7 individual scoring metrics. An average of the 7 individual scoring metrics provided the most effective ranking, with a weighted average rank of 3 for the MoNA-matched spectra, in spite of the potential risk of false positive annotations emerging from automation. Minor structural differences, such as relative carbon-carbon double bond positions, were found in several cases to affect the correct rank of the MoNA-annotated metabolite.
The latest release and an example workflow are available in the package vignette ( https://github.com/WMBEdmands/compMS2Miner ), and a version of the published application is available on the shinyapps.io site ( https://wmbedmands.shinyapps.io/compMS2Example ).
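Step (i) of the workflow, matching unknown MS1 features to precursor MS2 scans, reduces to comparing precursor m/z within a parts-per-million tolerance and retention time within a window. A minimal sketch under assumed tolerances (function names and default values are illustrative, not taken from the compMS2Miner source):

```python
def ppm_diff(mz_a, mz_b):
    """Mass difference between two m/z values in parts per million."""
    return abs(mz_a - mz_b) / mz_b * 1e6

def match_precursors(ms1_features, ms2_scans, ppm_tol=10.0, rt_tol=20.0):
    """Pair each MS1 feature (mz, rt) with MS2 scans whose precursor m/z and
    retention time (seconds) fall within the given tolerances."""
    matches = {}
    for fid, (mz, rt) in ms1_features.items():
        matches[fid] = [
            sid for sid, (pre_mz, pre_rt) in ms2_scans.items()
            if ppm_diff(pre_mz, mz) <= ppm_tol and abs(pre_rt - rt) <= rt_tol
        ]
    return matches
```

For example, a feature at m/z 180.0634 matches a scan whose precursor is 180.0640 (about 3.3 ppm away) but not one at m/z 200.0.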

  13. A novel biomedical image indexing and retrieval system via deep preference learning.

    PubMed

    Pang, Shuchao; Orgun, Mehmet A; Yu, Zhezhou

    2018-05-01

    Traditional biomedical image retrieval methods, as well as content-based image retrieval (CBIR) methods originally designed for non-biomedical images, either rely solely on pixel-level and other low-level features to describe an image or use deep features, and still leave considerable room for improvement in both accuracy and efficiency. In this work, we propose a new approach that exploits deep learning technology to extract high-level, compact features from biomedical images. The deep feature extraction process leverages multiple hidden layers to capture substantial feature structures of high-resolution images and represent them at different levels of abstraction, leading to improved performance for indexing and retrieval of biomedical images. We exploit the currently popular multi-layered deep neural networks, namely stacked denoising autoencoders (SDAE) and convolutional neural networks (CNN), to represent the discriminative features of biomedical images by transferring the feature representations and parameters of deep neural networks pre-trained on another domain. Moreover, in order to index all the images for finding similar images, we also introduce preference learning technology to train a preference model for the query image, which can output a similarity ranking list of images from a biomedical image database. To the best of our knowledge, this paper introduces preference learning technology into biomedical image retrieval for the first time. We evaluate the performance of two powerful algorithms based on our proposed system and compare them with popular biomedical image indexing approaches and existing regular image retrieval methods in detailed experiments over several well-known public biomedical image databases.
Based on different criteria for the evaluation of retrieval performance, experimental results demonstrate that our proposed algorithms outperform the state-of-the-art techniques in indexing biomedical images. We propose a novel and automated indexing system based on deep preference learning to characterize biomedical images for developing computer aided diagnosis (CAD) systems in healthcare. Our proposed system shows an outstanding indexing ability and high efficiency for biomedical image retrieval applications and it can be used to collect and annotate the high-resolution images in a biomedical database for further biomedical image research and applications. Copyright © 2018 Elsevier B.V. All rights reserved.
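The retrieval step described above ultimately amounts to ranking database images by similarity between their deep feature vectors and the query's. A minimal sketch using cosine similarity (a simple stand-in for the paper's learned preference model; the feature vectors here are illustrative):

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def rank_database(query_vec, db):
    """Return image ids sorted by descending similarity to the query vector.

    db: dict of image_id -> feature vector (e.g., from an SDAE or CNN encoder).
    """
    return sorted(db, key=lambda img_id: cosine(query_vec, db[img_id]), reverse=True)
```

A learned preference model would replace `cosine` with a scoring function trained on pairwise preferences, but the ranking interface stays the same.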

  14. Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning.

    PubMed

    Shin, Hoo-Chang; Roth, Holger R; Gao, Mingchen; Lu, Le; Xu, Ziyue; Nogues, Isabella; Yao, Jianhua; Mollura, Daniel; Summers, Ronald M

    2016-05-01

    Remarkable progress has been made in image recognition, primarily due to the availability of large-scale annotated datasets and deep convolutional neural networks (CNNs). CNNs enable learning data-driven, highly representative, hierarchical image features from sufficient training data. However, obtaining datasets as comprehensively annotated as ImageNet in the medical imaging domain remains a challenge. There are currently three major techniques that successfully apply CNNs to medical image classification: training the CNN from scratch, using off-the-shelf pre-trained CNN features, and conducting unsupervised CNN pre-training with supervised fine-tuning. Another effective method is transfer learning, i.e., fine-tuning CNN models pre-trained on natural image datasets for medical image tasks. In this paper, we examine three important but previously understudied factors in applying deep convolutional neural networks to computer-aided detection problems. We first explore and evaluate different CNN architectures. The studied models contain 5 thousand to 160 million parameters and vary in their numbers of layers. We then evaluate the influence of dataset scale and spatial image context on performance. Finally, we examine when and why transfer learning from pre-trained ImageNet models (via fine-tuning) can be useful. We study two specific computer-aided detection (CADe) problems, namely thoraco-abdominal lymph node (LN) detection and interstitial lung disease (ILD) classification. We achieve state-of-the-art performance on mediastinal LN detection and report the first five-fold cross-validation classification results on predicting axial CT slices with ILD categories. Our extensive empirical evaluation, CNN model analysis, and valuable insights can be extended to the design of high-performance CAD systems for other medical imaging tasks.

  15. Ontorat: automatic generation of new ontology terms, annotations, and axioms based on ontology design patterns.

    PubMed

    Xiang, Zuoshuang; Zheng, Jie; Lin, Yu; He, Yongqun

    2015-01-01

    It is time-consuming to build an ontology with many terms and axioms; thus, it is desirable to automate the process of ontology development. Ontology Design Patterns (ODPs) provide a reusable solution to a recurrent modeling problem in the context of ontology engineering. Because ontology terms often follow specific ODPs, the Ontology for Biomedical Investigations (OBI) developers proposed a Quick Term Templates (QTTs) process targeted at generating new ontology classes following the same pattern, using term templates in a spreadsheet format. Inspired by the ODPs and QTTs, the Ontorat web application was developed to automatically generate new ontology terms, annotations of terms, and logical axioms based on specific ODP(s). The inputs of an Ontorat execution include axiom expression settings, an input data file, ID generation settings, and a target ontology (optional). The axiom expression settings can be saved as a predesigned Ontorat setting format text file for reuse. The input data file is generated based on a template file created from a specific ODP (text or Excel format). Ontorat is an efficient tool for ontology expansion, and different use cases are described. For example, Ontorat was applied to automatically generate over 1,000 Japan RIKEN cell line cell terms, with both logical axioms and rich annotation axioms, in the Cell Line Ontology (CLO). Approximately 800 licensed animal vaccines were represented and annotated in the Vaccine Ontology (VO) by Ontorat. The OBI team used Ontorat to add assay and device terms required by the ENCODE project. Ontorat was also used to add missing annotations to all existing Biobank-specific terms in the Biobank Ontology. A collection of ODPs and templates with examples is provided on the Ontorat website and can be reused to facilitate ontology development.
With ever-increasing ontology development and applications, Ontorat provides a timely platform for generating and annotating large numbers of ontology terms by following design patterns. Ontorat is freely available at http://ontorat.hegroup.org/.
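The template-driven expansion Ontorat performs can be illustrated by instantiating an axiom pattern once per spreadsheet row. The pattern syntax and column names below are hypothetical and do not reproduce Ontorat's actual setting format:

```python
def expand_template(pattern, rows):
    """Instantiate an axiom pattern once per data row.

    pattern: string with {column} placeholders, loosely modeled on an
             OWL Manchester-syntax class expression.
    rows: list of dicts mapping column names to cell values
          (i.e., one dict per spreadsheet row).
    """
    return [pattern.format(**row) for row in rows]

# Hypothetical pattern and data row, for illustration only.
pattern = 'Class: {id}  Annotations: rdfs:label "{label}"  SubClassOf: {parent}'
rows = [
    {"id": "CLO_0000001", "label": "RIKEN cell line A", "parent": "cell line cell"},
]
```

Each generated string corresponds to one new term; Ontorat additionally handles ID generation and merging into the target ontology.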

  16. CpGAVAS, an integrated web server for the annotation, visualization, analysis, and GenBank submission of completely sequenced chloroplast genome sequences

    PubMed Central

    2012-01-01

    Background: The complete sequences of chloroplast genomes provide a wealth of information regarding the evolutionary history of species. With the advance of next-generation sequencing technology, the number of completely sequenced chloroplast genomes is expected to increase exponentially, and powerful computational tools for annotating these genome sequences are urgently needed. Results: We have developed a web server, CpGAVAS. The server accepts a complete chloroplast genome sequence as input. First, it predicts protein-coding and rRNA genes based on the identification and mapping of the most similar, full-length protein, cDNA, and rRNA sequences by integrating results from the Blastx, Blastn, protein2genome, and est2genome programs. Second, tRNA genes and inverted repeats (IR) are identified using tRNAscan, ARAGORN, and vmatch, respectively. Third, it calculates summary statistics for the annotated genome. Fourth, it generates a circular map ready for publication. Fifth, it can create a Sequin file for GenBank submission. Last, it allows the extraction of protein and mRNA sequences for a given list of genes and species. The annotation results in GFF3 format can be edited using any compatible annotation editing tool, and the edited annotations can then be uploaded to CpGAVAS for update and re-analysis repeatedly. Using known chloroplast genome sequences as a test set, we show that CpGAVAS performs comparably to another application, DOGMA, while having several superior functionalities. Conclusions: CpGAVAS allows the semi-automatic and complete annotation of a chloroplast genome sequence, and the visualization, editing, and analysis of the annotation results. It will become an indispensable tool for researchers studying chloroplast genomes. The software is freely accessible from http://www.herbalgenomics.org/cpgavas. PMID:23256920

  17. A comparison study of atlas-based 3D cardiac MRI segmentation: global versus global and local transformations

    NASA Astrophysics Data System (ADS)

    Daryanani, Aditya; Dangi, Shusil; Ben-Zikri, Yehuda Kfir; Linte, Cristian A.

    2016-03-01

    Magnetic Resonance Imaging (MRI) is a standard-of-care imaging modality for cardiac function assessment and guidance of cardiac interventions thanks to its high image quality and lack of exposure to ionizing radiation. Cardiac health parameters such as left ventricular volume, ejection fraction, myocardial mass, thickness, and strain can be assessed by segmenting the heart from cardiac MRI images. Furthermore, the segmented pre-operative anatomical heart models can be used to precisely identify regions of interest to be treated during minimally invasive therapy. Hence, the use of accurate and computationally efficient segmentation techniques is critical, especially for intra-procedural guidance applications that rely on the peri-operative segmentation of subject-specific datasets without delaying the procedure workflow. Atlas-based segmentation incorporates prior knowledge of the anatomy of interest from expertly annotated image datasets. Typically, the ground truth atlas label is propagated to a test image using a combination of global and local registration. The high computational cost of non-rigid registration motivated us to obtain an initial segmentation using global transformations based on an atlas of the left ventricle from a population of patient MRI images and to refine it using a well-developed technique based on graph cuts. Here we quantitatively compare the segmentations obtained from the global and global-plus-local atlases, refined using graph cut-based techniques, with the expert segmentations according to several similarity metrics, including the Dice coefficient, Jaccard coefficient, Hausdorff distance, and mean absolute distance error.
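The overlap metrics named above are straightforward to compute on binary segmentation masks. A minimal sketch of the Dice and Jaccard coefficients over flattened masks (the Hausdorff and mean absolute distance metrics require the contour point sets and are omitted here):

```python
def dice(a, b):
    """Dice coefficient of two flattened binary masks (lists of 0/1)."""
    inter = sum(1 for x, y in zip(a, b) if x and y)
    return 2 * inter / (sum(a) + sum(b))

def jaccard(a, b):
    """Jaccard coefficient (intersection over union) of two binary masks."""
    inter = sum(1 for x, y in zip(a, b) if x and y)
    union = sum(1 for x, y in zip(a, b) if x or y)
    return inter / union
```

Both metrics range from 0 (no overlap) to 1 (identical masks); Dice weights the intersection more heavily, which is why it is the common choice for segmentation studies.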

  18. Entrez Neuron RDFa: a pragmatic semantic web application for data integration in neuroscience research.

    PubMed

    Samwald, Matthias; Lim, Ernest; Masiar, Peter; Marenco, Luis; Chen, Huajun; Morse, Thomas; Mutalik, Pradeep; Shepherd, Gordon; Miller, Perry; Cheung, Kei-Hoi

    2009-01-01

    The amount of biomedical data available in Semantic Web formats has been rapidly growing in recent years. While these formats are machine-friendly, user-friendly web interfaces allowing easy querying of these data are typically lacking. We present "Entrez Neuron", a pilot neuron-centric interface that allows for keyword-based queries against a coherent repository of OWL ontologies. These ontologies describe neuronal structures, physiology, mathematical models and microscopy images. The returned query results are organized hierarchically according to brain architecture. Where possible, the application makes use of entities from the Open Biomedical Ontologies (OBO) and the 'HCLS knowledgebase' developed by the W3C Interest Group for Health Care and Life Science. It makes use of the emerging RDFa standard to embed ontology fragments and semantic annotations within its HTML-based user interface. The application and underlying ontologies demonstrate how Semantic Web technologies can be used for information integration within a curated information repository and between curated information repositories. It also demonstrates how information integration can be accomplished on the client side, through simple copying and pasting of portions of documents that contain RDFa markup.

  19. Telemedicine in practice.

    PubMed

    Thrall, J H; Boland, G

    1998-04-01

    Telemedicine is defined as the "delivery of health care and sharing of medical knowledge over a distance using telecommunication systems." The concept of telemedicine is not new. Beyond the use of the telephone, there were numerous attempts to develop telemedicine programs in the 1960s, mostly based on interactive television. The early experience was conceptually encouraging but suffered from inadequate technology. With a few notable exceptions, such as the telemetry of medical data in the space program, there was very little advancement of telemedicine in the 1970s and 1980s. Interest in telemedicine exploded in the 1990s with the development of medical devices suited to capturing images and other data in digital electronic form and the development and installation of high-speed, high-bandwidth telecommunication systems around the world. Clinical applications of telemedicine are now found in virtually every specialty. Teleradiology is the most common application, followed by cardiology, dermatology, psychiatry, emergency medicine, home health care, pathology, and oncology. The technological basis and the practical issues are highly variable from one clinical application to another. Teleradiology, including telenuclear medicine, is one of the more well-defined telemedicine services. Techniques have been developed for the acquisition and digitization of images, image compression, image transmission, and image interpretation. The American College of Radiology has promulgated standards for teleradiology, including the requirement for the use of high-resolution 2000 x 2000 pixel workstations for the interpretation of plain films. Other elements of the standard address image annotation, patient confidentiality, workstation functionality, cathode ray tube brightness, and image compression. Teleradiology systems are now widely deployed in clinical practice.
Applications include providing service from larger to smaller institutions and coverage of outpatient clinics, imaging centers, and nursing homes. Teleradiology is also being used in international applications. Unresolved issues in telemedicine include licensure, the development of standards, reimbursement for services, patient confidentiality, and telecommunications infrastructure and cost. A number of states and medical boards have instituted policies and regulations to prevent physicians who are not licensed in the respective state from providing telemedicine services. This is a major impediment to the delivery of telemedicine between states. Telemedicine, including teleradiology, is here to stay and is changing the practice of medicine dramatically. National and international communications networks are being created that enable the sharing of information and knowledge at a distance. Technological barriers are being overcome, leaving organizational, legal, financial, and special interest issues as the major impediments to the further development of telemedicine and the realization of its benefits.

  20. Interobserver variability in identification of breast tumors in MRI and its implications for prognostic biomarkers and radiogenomics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Saha, Ashirbani, E-mail: as698@duke.edu; Grimm, La

    Purpose: To assess the interobserver variability of readers when outlining breast tumors in MRI, study the reasons behind the variability, and quantify the effect of the variability on algorithmic imaging features extracted from breast MRI. Methods: Four readers annotated breast tumors from the MRI examinations of 50 patients from one institution using a bounding box to indicate a tumor. All of the annotated tumors were biopsy-proven cancers. The similarity of bounding boxes was analyzed using Dice coefficients. An automatic tumor segmentation algorithm was used to segment tumors from the readers' annotations. The segmented tumors were then compared between readers using Dice coefficients as the similarity metric. Cases showing high interobserver variability (average Dice coefficient <0.8) after segmentation were analyzed by a panel of radiologists to identify the reasons causing the low level of agreement. Furthermore, an imaging feature quantifying tumor and breast tissue enhancement dynamics was extracted from each segmented tumor for a patient. Pearson's correlation coefficients were computed between the features for each pair of readers to assess the effect of the annotation on the feature values. Finally, the authors quantified the extent of variation in feature values caused by each of the individual reasons for low agreement. Results: The average agreement between readers in terms of the overlap (Dice coefficient) of the bounding box was 0.60. Automatic segmentation of the tumor improved the average Dice coefficient for 92% of the cases to the average value of 0.77. The mean agreement between readers expressed by the correlation coefficient for the imaging feature was 0.96. Conclusions: There is a moderate variability between readers when identifying the rectangular outline of breast tumors on MRI. This variability is alleviated by the automatic segmentation of the tumors.
Furthermore, the moderate interobserver variability in terms of the bounding box does not translate into a considerable variability in terms of assessment of enhancement dynamics. The authors propose some additional ways to further reduce the interobserver variability.
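The Dice coefficient used above applies directly to the readers' axis-aligned bounding boxes via their intersection area. A minimal sketch (boxes are illustrative (x0, y0, x1, y1) tuples):

```python
def box_area(box):
    """Area of an axis-aligned box given as (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = box
    return max(0.0, x1 - x0) * max(0.0, y1 - y0)

def box_dice(a, b):
    """Dice overlap of two axis-aligned bounding boxes."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = box_area((ix0, iy0, ix1, iy1))
    return 2 * inter / (box_area(a) + box_area(b))
```

Two unit boxes offset by half their width, for instance, give a Dice coefficient well below the 0.8 agreement threshold used in the study.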

  1. Computer-aided diagnosis in phase contrast imaging X-ray computed tomography for quantitative characterization of ex vivo human patellar cartilage.

    PubMed

    Nagarajan, Mahesh B; Coan, Paola; Huber, Markus B; Diemoz, Paul C; Glaser, Christian; Wismuller, Axel

    2013-10-01

    Visualization of the ex vivo human patellar cartilage matrix through phase-contrast imaging X-ray computed tomography (PCI-CT) has been previously demonstrated. Such studies revealed osteoarthritis-induced changes to chondrocyte organization in the radial zone. This study investigates the application of texture analysis to characterizing such chondrocyte patterns in the presence and absence of osteoarthritic damage. Texture features derived from Minkowski functionals (MF) and gray-level co-occurrence matrices (GLCM) were extracted from 842 regions of interest (ROI) annotated on PCI-CT images of ex vivo human patellar cartilage specimens. These texture features were subsequently used in a machine learning task with support vector regression to classify ROIs as healthy or osteoarthritic; classification performance was evaluated using the area under the receiver operating characteristic curve (AUC). The best classification performance was observed with the MF features "perimeter" (AUC: 0.94 ± 0.08) and "Euler characteristic" (AUC: 0.94 ± 0.07), and the GLCM-derived feature "Correlation" (AUC: 0.93 ± 0.07). These results suggest that such texture features can provide a detailed characterization of chondrocyte organization in the cartilage matrix, enabling classification of cartilage as healthy or osteoarthritic with high accuracy.
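The GLCM "Correlation" feature reported above is computed from a normalized gray-level co-occurrence matrix. A pure-Python sketch for a single pixel offset, following the standard Haralick definition (the offset and gray-level quantization choices here are illustrative, not those of the study):

```python
import math

def glcm(image, levels, dx=1, dy=0):
    """Normalized gray-level co-occurrence matrix for one pixel offset.

    image: 2D list of integer gray levels in [0, levels).
    """
    p = [[0.0] * levels for _ in range(levels)]
    h, w = len(image), len(image[0])
    total = 0
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                p[image[y][x]][image[ny][nx]] += 1
                total += 1
    return [[v / total for v in row] for row in p]

def glcm_correlation(p):
    """Haralick correlation: covariance of co-occurring levels over the
    product of their standard deviations."""
    n = len(p)
    mu_i = sum(i * p[i][j] for i in range(n) for j in range(n))
    mu_j = sum(j * p[i][j] for i in range(n) for j in range(n))
    sig_i = math.sqrt(sum((i - mu_i) ** 2 * p[i][j] for i in range(n) for j in range(n)))
    sig_j = math.sqrt(sum((j - mu_j) ** 2 * p[i][j] for i in range(n) for j in range(n)))
    cov = sum((i - mu_i) * (j - mu_j) * p[i][j] for i in range(n) for j in range(n))
    return cov / (sig_i * sig_j)
```

An image whose co-occurring pixel pairs always share the same gray level yields a correlation of 1.0.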

  2. Evaluation of Multimodal Imaging Biomarkers of Prostate Cancer

    DTIC Science & Technology

    2016-11-01

    Approved for Public Release; Distribution Unlimited. The views, opinions and/or findings contained in this report are those of the author(s). The goals of the proposed studies are to develop, optimize and use imaging methods to non-invasively assess the temporal relationship between prostate cancer growth, androgen receptor (AR) levels, hypoxia, and translocator protein (TSPO) levels, as described in the statement of work.

  3. BreakingNews: Article Annotation by Image and Text Processing.

    PubMed

    Ramisa, Arnau; Yan, Fei; Moreno-Noguer, Francesc; Mikolajczyk, Krystian

    2018-05-01

    Building upon recent Deep Neural Network architectures, current approaches lying in the intersection of Computer Vision and Natural Language Processing have achieved unprecedented breakthroughs in tasks like automatic captioning or image retrieval. Most of these learning methods, though, rely on large training sets of images associated with human annotations that specifically describe the visual content. In this paper we propose to go a step further and explore the more complex cases where textual descriptions are loosely related to the images. We focus on the particular domain of news articles, in which the textual content often expresses connotative and ambiguous relations that are only suggested but not directly inferred from images. We introduce an adaptive CNN architecture that shares most of the structure for multiple tasks, including source detection, article illustration, and geolocation of articles. Deep Canonical Correlation Analysis is deployed for article illustration, and a new loss function based on Great Circle Distance is proposed for geolocation. Furthermore, we present BreakingNews, a novel dataset with approximately 100K news articles including images, text and captions, and enriched with heterogeneous meta-data (such as GPS coordinates and user comments). We show this dataset to be appropriate for exploring all the aforementioned problems, for which we provide a baseline performance using various Deep Learning architectures and different representations of the textual and visual features. We report very promising results and bring to light several limitations of the current state of the art in this domain, which we hope will help spur progress in the field.
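The Great Circle Distance underlying the proposed geolocation loss is conventionally computed with the haversine formula. A minimal sketch (the Earth-radius constant is the usual mean-radius approximation; the loss in the paper wraps this distance, not reproduced here):

```python
import math

def great_circle_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Haversine great-circle distance between two (latitude, longitude)
    points, in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))
```

Ninety degrees of longitude along the equator, for example, is one quarter of the Earth's circumference, roughly 10,007 km.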

  4. Automated Tracking of Quantitative Assessments of Tumor Burden in Clinical Trials

    PubMed Central

    Rubin, Daniel L; Willrett, Debra; O'Connor, Martin J; Hage, Cleber; Kurtz, Camille; Moreira, Dilvan A

    2014-01-01

    There are two key challenges hindering effective use of quantitative assessment of imaging in cancer response assessment: 1) Radiologists usually describe the cancer lesions in imaging studies subjectively and sometimes ambiguously, and 2) it is difficult to repurpose imaging data, because lesion measurements are not recorded in a format that permits machine interpretation and interoperability. We have developed a freely available software platform on the basis of open standards, the electronic Physician Annotation Device (ePAD), to tackle these challenges in two ways. First, ePAD facilitates the radiologist in carrying out cancer lesion measurements as part of routine clinical trial image interpretation workflow. Second, ePAD records all image measurements and annotations in a data format that permits repurposing image data for analyses of alternative imaging biomarkers of treatment response. To determine the impact of ePAD on radiologist efficiency in quantitative assessment of imaging studies, a radiologist evaluated computed tomography (CT) imaging studies from 20 subjects having one baseline and three consecutive follow-up imaging studies with and without ePAD. The radiologist made measurements of target lesions in each imaging study using Response Evaluation Criteria in Solid Tumors 1.1 criteria, initially with the aid of ePAD, and then after a 30-day washout period, the exams were reread without ePAD. The mean total time required to review the images and summarize measurements of target lesions was 15% (P < .039) shorter using ePAD than without using this tool. In addition, it was possible to rapidly reanalyze the images to explore lesion cross-sectional area as an alternative imaging biomarker to linear measure. We conclude that ePAD appears promising to potentially improve reader efficiency for quantitative assessment of CT examinations, and it may enable discovery of future novel image-based biomarkers of cancer treatment response. PMID:24772204
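The Response Evaluation Criteria in Solid Tumors 1.1 assessment that ePAD supports derives a response category from the change in the sum of target-lesion longest diameters. A deliberately simplified sketch using the standard -30%/+20% thresholds with the 5 mm absolute rule (real RECIST 1.1 also tracks the nadir rather than only baseline, new lesions, and non-target disease):

```python
def percent_change(baseline_sum, followup_sum):
    """Percent change in the sum of longest diameters (mm) from baseline."""
    return (followup_sum - baseline_sum) / baseline_sum * 100.0

def response_category(baseline_sum, followup_sum):
    """Simplified RECIST 1.1-style target-lesion response call."""
    if followup_sum == 0:
        return "CR"  # complete response: all target lesions disappeared
    change = percent_change(baseline_sum, followup_sum)
    if change <= -30.0:
        return "PR"  # partial response
    if change >= 20.0 and (followup_sum - baseline_sum) >= 5.0:
        return "PD"  # progressive disease
    return "SD"      # stable disease
```

Automating exactly this kind of arithmetic over recorded lesion measurements is what allowed ePAD to shorten the radiologist's summarization time.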

  5. Genomic Sequence Variation Markup Language (GSVML).

    PubMed

    Nakaya, Jun; Kimura, Michio; Hiroi, Kaei; Ido, Keisuke; Yang, Woosung; Tanaka, Hiroshi

    2010-02-01

    With the aim of making good use of internationally accumulated genomic sequence variation data, which is increasing rapidly due to the explosive growth of genomic research, the development of an interoperable data exchange format and its international standardization are necessary. Genomic Sequence Variation Markup Language (GSVML) focuses on genomic sequence variation data and human health applications, such as gene-based medicine or pharmacogenomics. We developed GSVML through eight steps, based on case analysis and domain investigations. By focusing the design scope on human health applications and genomic sequence variation, we attempted to eliminate ambiguity and to ensure practicability. We intended to satisfy the requirements derived from the use case analysis of human-based clinical genomic applications. Based on database investigations, we attempted to minimize the redundancy of the data format while maximizing its coverage. We also attempted to ensure communication and interface ability with other markup languages for the exchange of omics data among various omics researchers and facilities. The interface ability with developing clinical standards, such as the Health Level Seven Genotype Information model, was analyzed. We developed the human health-oriented GSVML comprising variation data, direct annotation, and indirect annotation categories; the variation data category is required, while the direct and indirect annotation categories are optional. The annotation categories contain omics and clinical information and have internal relationships. For the design, we examined six cases against three criteria for human health applications and 15 data elements against three criteria for data formats for genomic sequence variation data exchange. The data formats of five international SNP databases and six markup languages, and the interface ability to the Health Level Seven Genotype Model in terms of 317 items, were investigated.
GSVML was developed as a potential data exchanging format for genomic sequence variation data exchange focusing on human health applications. The international standardization of GSVML is necessary, and is currently underway. GSVML can be applied to enhance the utilization of genomic sequence variation data worldwide by providing a communicable platform between clinical and research applications. Copyright 2009 Elsevier Ireland Ltd. All rights reserved.
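
The three-category structure described above (a required variation data category plus optional direct and indirect annotation categories) can be sketched as an illustrative XML record. The element names and values below are assumptions for illustration only; the abstract specifies the categories, not the actual GSVML schema:

```python
import xml.etree.ElementTree as ET

# Element names below are illustrative -- the abstract only specifies the
# three categories (variation data required; direct/indirect annotation optional).
record = ET.Element("gsvml")
variation = ET.SubElement(record, "variationData")       # required category
ET.SubElement(variation, "refSnpId").text = "rs123456"   # hypothetical identifier
ET.SubElement(variation, "allele").text = "A/G"
direct = ET.SubElement(record, "directAnnotation")       # optional: omics context
ET.SubElement(direct, "gene").text = "CYP2D6"
indirect = ET.SubElement(record, "indirectAnnotation")   # optional: clinical context
ET.SubElement(indirect, "phenotype").text = "poor metabolizer"

print(ET.tostring(record, encoding="unicode"))
```

Keeping the required and optional categories as separate sibling elements mirrors the stated design goal of exchanging the variation data alone while still allowing omics and clinical annotations to travel with it.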

  6. OLIVER: an online library of images for veterinary education and research.

    PubMed

    McGreevy, Paul; Shaw, Tim; Burn, Daniel; Miller, Nick

    2007-01-01

    As part of a strategic move by the University of Sydney toward increased flexibility in learning, the Faculty of Veterinary Science undertook a number of developments involving Web-based teaching and assessment. OLIVER underpins them by providing a rich, durable repository for learning objects. To integrate Web-based learning, case studies, and didactic presentations for veterinary and animal science students, we established an online library of images and other learning objects for use by academics in the Faculties of Veterinary Science and Agriculture. The objectives of OLIVER were to maximize the use of the faculty's teaching resources by providing a stable archiving facility for graphic images and other multimedia learning objects that allows flexible and precise searching, integrating indexing standards, thesauri, pull-down lists of preferred terms, and linking of objects within cases. OLIVER offers a portable and expandable Web-based shell that facilitates ongoing storage of learning objects in a range of media. Learning objects can be downloaded in common, standardized formats so that they can be easily imported for use in a range of applications, including Microsoft PowerPoint, WebCT, and Microsoft Word. OLIVER now contains more than 9,000 images relating to many facets of veterinary science; these are annotated and supported by search engines that allow rapid access to both images and relevant information. The Web site is easily updated and adapted as required.

  7. The Viking viewer for connectomics: scalable multi-user annotation and summarization of large volume data sets

    PubMed Central

    ANDERSON, JR; MOHAMMED, S; GRIMM, B; JONES, BW; KOSHEVOY, P; TASDIZEN, T; WHITAKER, R; MARC, RE

    2011-01-01

    Modern microscope automation permits the collection of vast amounts of continuous anatomical imagery in both two and three dimensions. These large data sets present significant challenges for data storage, access, viewing, annotation and analysis. The cost and overhead of collecting and storing the data can be extremely high. Large data sets quickly exceed an individual's capability for timely analysis and present challenges in efficiently applying transforms, if needed. Finally, annotated anatomical data sets can represent a significant investment of resources and should be easily accessible to the scientific community. The Viking application was our solution, created to view and annotate a 16.5 TB ultrastructural retinal connectome volume, and we demonstrate its utility in reconstructing neural networks for a distinctive retinal amacrine cell class. Viking has several key features. (1) It works over the internet using HTTP and supports many concurrent users limited only by hardware. (2) It supports a multi-user, collaborative annotation strategy. (3) It cleanly demarcates viewing and analysis from data collection and hosting. (4) It is capable of applying transformations in real-time. (5) It has an easily extensible user interface, allowing addition of specialized modules without rewriting the viewer. PMID:21118201

  8. Sockeye: A 3D Environment for Comparative Genomics

    PubMed Central

    Montgomery, Stephen B.; Astakhova, Tamara; Bilenky, Mikhail; Birney, Ewan; Fu, Tony; Hassel, Maik; Melsopp, Craig; Rak, Marcin; Robertson, A. Gordon; Sleumer, Monica; Siddiqui, Asim S.; Jones, Steven J.M.

    2004-01-01

    Comparative genomics techniques are used in bioinformatics analyses to identify the structural and functional properties of DNA sequences. As the amount of available sequence data steadily increases, the ability to perform large-scale comparative analyses has become increasingly relevant. In addition, the growing complexity of genomic feature annotation means that new approaches to genomic visualization need to be explored. We have developed a Java-based application called Sockeye that uses three-dimensional (3D) graphics technology to facilitate the visualization of annotation and conservation across multiple sequences. This software uses the Ensembl database project to import sequence and annotation information from several eukaryotic species. A user can additionally import their own custom sequence and annotation data. Individual annotation objects are displayed in Sockeye by using custom 3D models. Ensembl-derived and imported sequences can be analyzed by using a suite of multiple and pair-wise alignment algorithms. The results of these comparative analyses are also displayed in the 3D environment of Sockeye. By using the Java3D API to visualize genomic data in a 3D environment, we are able to compactly display cross-sequence comparisons. This provides the user with a novel platform for visualizing and comparing genomic feature organization. PMID:15123592

  9. ExpTreeDB: web-based query and visualization of manually annotated gene expression profiling experiments of human and mouse from GEO.

    PubMed

    Ni, Ming; Ye, Fuqiang; Zhu, Juanjuan; Li, Zongwei; Yang, Shuai; Yang, Bite; Han, Lu; Wu, Yongge; Chen, Ying; Li, Fei; Wang, Shengqi; Bo, Xiaochen

    2014-12-01

    Numerous public microarray datasets are valuable resources for the scientific community. Several online tools have made great strides in using these data by querying related datasets with users' own gene signatures or expression profiles. However, dataset annotation and result exhibition still need to be improved. ExpTreeDB is a database that allows queries on human and mouse microarray experiments from the Gene Expression Omnibus with gene signatures or profiles. Compared with similar applications, ExpTreeDB pays more attention to dataset annotation and result visualization. We introduced a multiple-level annotation system to depict and organize the original experiments. For example, a tamoxifen-treated cell line experiment is hierarchically annotated as 'agent→drug→estrogen receptor antagonist→tamoxifen'. Consequently, retrieved results are exhibited as interactive tree-structured graphics, which provide an overview of related experiments and may direct users to key items of interest. The database is freely available at http://biotech.bmi.ac.cn/ExpTreeDB. The Web site is implemented in Perl, PHP, R, MySQL and Apache. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  10. City model enrichment

    NASA Astrophysics Data System (ADS)

    Smart, Philip D.; Quinn, Jonathan A.; Jones, Christopher B.

    The combination of mobile communication technology with location- and orientation-aware digital cameras has introduced increasing interest in the exploitation of 3D city models for applications such as augmented reality and automated image captioning. The effectiveness of such applications is, at present, severely limited by the often poor quality of semantic annotation of the 3D models. In this paper, we show how freely available sources of georeferenced Web 2.0 information can be used for automated enrichment of 3D city models. Point-referenced names of prominent buildings and landmarks mined from Wikipedia articles, from the OpenStreetMap digital map and from the GeoNames gazetteer have been matched to the 2D ground plan geometry of a 3D city model. In order to address the ambiguities that arise in the associations between these sources and the city model, we present procedures to merge potentially related buildings and implement fuzzy matching between reference points and building polygons. An experimental evaluation demonstrates the effectiveness of the presented methods.

  11. Iterative tensor voting for perceptual grouping of ill-defined curvilinear structures.

    PubMed

    Loss, Leandro A; Bebis, George; Parvin, Bahram

    2011-08-01

    In this paper, a novel approach is proposed for perceptual grouping and localization of ill-defined curvilinear structures. Our approach builds upon the tensor voting and iterative voting frameworks. Its efficacy lies in the iterative refinement of curvilinear structures by gradually shifting from an exploratory to an exploitative mode. This mode shift is achieved by reducing the aperture of the tensor voting fields, which is shown to improve curve grouping and inference by enhancing the concentration of votes over promising, salient structures. The proposed technique is validated on delineating adherens junctions imaged through fluorescence microscopy. However, the method is also applicable to screening other organisms based on characteristics of their cell wall structures. Adherens junctions maintain tissue structural integrity and cell-cell interactions. Visually, they exhibit fibrous patterns that may be diffused, heterogeneous in fluorescence intensity, or punctate and frequently perceptual. Besides the application to real data, the proposed method is compared to prior methods on synthetic and annotated real data, showing high precision rates.

  12. Enriching text with images and colored light

    NASA Astrophysics Data System (ADS)

    Sekulovski, Dragan; Geleijnse, Gijs; Kater, Bram; Korst, Jan; Pauws, Steffen; Clout, Ramon

    2008-01-01

    We present an unsupervised method to enrich textual applications with relevant images and colors. The images are collected by querying large image repositories and subsequently the colors are computed using image processing. A prototype system based on this method is presented where the method is applied to song lyrics. In combination with a lyrics synchronization algorithm the system produces a rich multimedia experience. In order to identify terms within the text that may be associated with images and colors, we select noun phrases using a part of speech tagger. Large image repositories are queried with these terms. Per term representative colors are extracted using the collected images. Hereto, we either use a histogram-based or a mean shift-based algorithm. The representative color extraction uses the non-uniform distribution of the colors found in the large repositories. The images that are ranked best by the search engine are displayed on a screen, while the extracted representative colors are rendered on controllable lighting devices in the living room. We evaluate our method by comparing the computed colors to standard color representations of a set of English color terms. A second evaluation focuses on the distance in color between a queried term in English and its translation in a foreign language. Based on results from three sets of terms, a measure of suitability of a term for color extraction based on KL Divergence is proposed. Finally, we compare the performance of the algorithm using either the automatically indexed repository of Google Images and the manually annotated Flickr.com. Based on the results of these experiments, we conclude that using the presented method we can compute the relevant color for a term using a large image repository and image processing.
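
The KL-divergence-based suitability measure mentioned above can be sketched as follows: a term whose image colors diverge strongly from the repository-wide color distribution is a better candidate for color extraction. This is a minimal sketch under assumed discrete histograms; the paper's exact histogram binning and scoring are not specified here:

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL divergence D(P || Q) between two discrete color histograms."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def color_suitability(term_hist, background_hist):
    """Score how color-specific a term is: higher divergence from the
    repository-wide distribution suggests a more color-associated term
    (histograms and bins here are illustrative)."""
    return kl_divergence(term_hist, background_hist)

# Toy 4-bin histograms (e.g., red/green/blue/other color mass):
grass = [0.05, 0.80, 0.05, 0.10]          # mass concentrated in one bin
background = [0.25, 0.25, 0.25, 0.25]     # uniform repository-wide colors
print(color_suitability(grass, background) > 0)  # color-specific term scores high
```

A term like "grass" with a peaked color histogram yields a large divergence from the background, whereas a term with near-uniform colors scores close to zero and would be flagged as unsuitable for color extraction.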

  13. Data annotation, recording and mapping system for the US open skies aircraft

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Brown, B.W.; Goede, W.F.; Farmer, R.G.

    1996-11-01

    This paper discusses the system developed by Northrop Grumman for the Defense Nuclear Agency (DNA), US Air Force, and the On-Site Inspection Agency (OSIA) to comply with the data annotation and reporting provisions of the Open Skies Treaty. This system, called the Data Annotation, Recording and Mapping System (DARMS), has been installed on the US OC-135 and meets or exceeds all annotation requirements for the Open Skies Treaty. The Open Skies Treaty, which will enter into force in the near future, allows any of the 26 signatory countries to fly fixed wing aircraft with imaging sensors over any of the other treaty participants, upon very short notice, and with no restricted flight areas. Sensor types presently allowed by the treaty are: optical framing and panoramic film cameras; video cameras ranging from analog PAL color television cameras to the more sophisticated digital monochrome and color line scanning or framing cameras; infrared line scanners; and synthetic aperture radars. Each sensor type has specific performance parameters which are limited by the treaty, as well as specific annotation requirements which must be achieved upon full entry into force. DARMS supports U.S. compliance with the Open Skies Treaty by means of three subsystems: the Data Annotation Subsystem (DAS), which annotates sensor media with data obtained from sensors and the aircraft's avionics system; the Data Recording System (DRS), which records all sensor and flight events on magnetic media for later use in generating Treaty-mandated mission reports; and the Dynamic Sensor Mapping Subsystem (DSMS), which provides observers and sensor operators with real-time moving map displays of the progress of the mission, complete with instantaneous and cumulative sensor coverages. This paper describes DARMS and its subsystems in greater detail, along with the supporting avionics subsystems. 7 figs.

  14. Enhanced annotations and features for comparing thousands of Pseudomonas genomes in the Pseudomonas genome database.

    PubMed

    Winsor, Geoffrey L; Griffiths, Emma J; Lo, Raymond; Dhillon, Bhavjinder K; Shay, Julie A; Brinkman, Fiona S L

    2016-01-04

    The Pseudomonas Genome Database (http://www.pseudomonas.com) is well known for the application of community-based annotation approaches for producing a high-quality Pseudomonas aeruginosa PAO1 genome annotation, and facilitating whole-genome comparative analyses with other Pseudomonas strains. To aid analysis of potentially thousands of complete and draft genome assemblies, this database and analysis platform was upgraded to integrate curated genome annotations and isolate metadata with enhanced tools for larger scale comparative analysis and visualization. Manually curated gene annotations are supplemented with improved computational analyses that help identify putative drug targets and vaccine candidates or assist with evolutionary studies by identifying orthologs, pathogen-associated genes and genomic islands. The database schema has been updated to integrate isolate metadata that will facilitate more powerful analysis of genomes across datasets in the future. We continue to place an emphasis on providing high-quality updates to gene annotations through regular review of the scientific literature and using community-based approaches including a major new Pseudomonas community initiative for the assignment of high-quality gene ontology terms to genes. As we further expand from thousands of genomes, we plan to provide enhancements that will aid data visualization and analysis arising from whole-genome comparative studies including more pan-genome and population-based approaches. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  15. GAPP: A Proteogenomic Software for Genome Annotation and Global Profiling of Post-translational Modifications in Prokaryotes.

    PubMed

    Zhang, Jia; Yang, Ming-Kun; Zeng, Honghui; Ge, Feng

    2016-11-01

    Although the number of sequenced prokaryotic genomes is growing rapidly, experimentally verified annotation of prokaryotic genome remains patchy and challenging. To facilitate genome annotation efforts for prokaryotes, we developed an open source software called GAPP for genome annotation and global profiling of post-translational modifications (PTMs) in prokaryotes. With a single command, it provides a standard workflow to validate and refine predicted genetic models and discover diverse PTM events. We demonstrated the utility of GAPP using proteomic data from Helicobacter pylori, one of the major human pathogens that is responsible for many gastric diseases. Our results confirmed 84.9% of the existing predicted H. pylori proteins, identified 20 novel protein coding genes, and corrected four existing gene models with regard to translation initiation sites. In particular, GAPP revealed a large repertoire of PTMs using the same proteomic data and provided a rich resource that can be used to examine the functions of reversible modifications in this human pathogen. This software is a powerful tool for genome annotation and global discovery of PTMs and is applicable to any sequenced prokaryotic organism; we expect that it will become an integral part of ongoing genome annotation efforts for prokaryotes. GAPP is freely available at https://sourceforge.net/projects/gappproteogenomic/. © 2016 by The American Society for Biochemistry and Molecular Biology, Inc.

  16. 3D facial landmarks: Inter-operator variability of manual annotation

    PubMed Central

    2014-01-01

    Background Manual annotation of landmarks is a known source of variance, which exists in all fields of medical imaging and influences the accuracy and interpretation of results. However, the variability of human facial landmarks is only sparsely addressed in the current literature, as opposed to, e.g., the research fields of orthodontics and cephalometrics. We present a full facial 3D annotation procedure and a sparse set of manually annotated landmarks, in an effort to reduce operator time and minimize variance. Method Facial scans from 36 voluntary unrelated blood donors from the Danish Blood Donor Study were randomly chosen. Six operators twice manually annotated 73 anatomical and pseudo-landmarks, using a three-step scheme producing a dense point correspondence map. We analyzed both the intra- and inter-operator variability using mixed-model ANOVA. We then compared four sparse sets of landmarks in order to construct a dense correspondence map of the 3D scans with minimum point variance. Results The anatomical landmarks of the eye were associated with the lowest variance, particularly the centers of the pupils, whereas points on the jaw and eyebrows had the highest variation. We observed only marginal variability attributable to intra-operator effects and to the portraits. Using a sparse set of landmarks (n=14) that captures the whole face, the mean dense point variance was reduced from 1.92 to 0.54 mm. Conclusion The inter-operator variability was primarily associated with particular landmarks, where more loosely defined landmarks had the highest variability. The variables embedded in the portrait and the reliability of a trained operator had only marginal influence on the variability. Furthermore, using 14 of the annotated landmarks we were able to reduce the variability and create a dense correspondence mesh that captures all facial features. PMID:25306436
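
The per-landmark variance measure underlying these results can be sketched as follows: pool the repeated placements of each landmark across operators, compute the variance per coordinate axis, and average over landmarks. The function name, the variance definition (population variance summed over axes), and the toy coordinates are illustrative assumptions, not the paper's exact formulation:

```python
import statistics

def mean_point_variance(annotations):
    """annotations: dict mapping landmark name -> list of (x, y, z) placements
    from different operators/sessions. Returns the mean per-landmark variance,
    summed over coordinate axes (a simple stand-in for the paper's measure)."""
    variances = []
    for placements in annotations.values():
        per_axis = zip(*placements)  # regroup into x-, y-, z-coordinate lists
        variances.append(sum(statistics.pvariance(axis) for axis in per_axis))
    return statistics.mean(variances)

# Two toy landmarks annotated by three operators (coordinates in mm):
annotations = {
    "pupil_center": [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (-0.1, 0.0, 0.0)],
    "jaw_point":    [(0.0, 0.0, 0.0), (1.5, 0.5, 0.0), (-1.5, -0.5, 0.0)],
}
print(mean_point_variance(annotations))
```

In this toy example the tightly clustered pupil placements contribute far less variance than the scattered jaw placements, mirroring the paper's finding that eye landmarks are the most reproducible.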

  17. Development Issues on Linked Data Weblog Enrichment

    NASA Astrophysics Data System (ADS)

    Ruiz-Rube, Iván; Cornejo, Carlos M.; Dodero, Juan Manuel; García, Vicente M.

    In this paper, we describe the issues found during the development of LinkedBlog, a Linked Data extension for WordPress blogs. This extension enables text-based and video information contained in blog entries to be enriched with RDF triples that are suitable to be stored, managed and exploited by other web-based applications. The issues concern the generality, usability, tracking, depth, security, trustworthiness and performance of the linked data enrichment process. The presented annotation approach aims at keeping web-based contents independent from the underlying ontological model by providing a loosely coupled RDFa-based approach in the linked data application. Finally, we detail how the performance of annotations can be improved through a semantic reasoner.

  18. Images of the Universe, Part II: The Decade in Astronomical Photographs.

    ERIC Educational Resources Information Center

    Mercury, 1982

    1982-01-01

    Provides an annotated list of technical and nontechnical astronomy books (reviewer's remarks, cost, publisher's name/address). Topics include general astronomy, general astronomy textbooks, solar system, amateur astronomy, astronomy history, archeoastronomy, space exploration, related physics books, pseudoscience, and others. (JN)

  19. ERAIZDA: a model for holistic annotation of animal infectious and zoonotic diseases

    PubMed Central

    Buza, Teresia M.; Jack, Sherman W.; Kirunda, Halid; Khaitsa, Margaret L.; Lawrence, Mark L.; Pruett, Stephen; Peterson, Daniel G.

    2015-01-01

    There is an urgent need for a unified resource that integrates trans-disciplinary annotations of emerging and reemerging animal infectious and zoonotic diseases. Such data integration will provide a valuable opportunity for epidemiologists, researchers and health policy makers to make data-driven decisions designed to improve animal health. Integrating emerging and reemerging animal infectious and zoonotic disease data from a large variety of sources into a unified open-access resource supports a better understanding of infectious and zoonotic diseases. We have developed a model for interlinking annotations of these diseases, which are of particular interest because of the threats they pose to animal health, human health and global health security. We demonstrated the application of this model using brucellosis, an infectious and zoonotic disease. Preliminary annotations were deposited into the VetBioBase database (http://vetbiobase.igbb.msstate.edu), which is associated with user-friendly tools to facilitate searching, retrieving and downloading of disease-related information. Database URL: http://vetbiobase.igbb.msstate.edu PMID:26581408

  20. Annotation analysis for testing drug safety signals using unstructured clinical notes

    PubMed Central

    2012-01-01

    Background The electronic surveillance for adverse drug events is largely based upon the analysis of coded data from reporting systems. Yet, the vast majority of electronic health data lies embedded within the free text of clinical notes and is not gathered into centralized repositories. With the increasing access to large volumes of electronic medical data—in particular the clinical notes—it may be possible to computationally encode and test drug safety signals in an active manner. Results We describe the application of simple annotation tools on clinical text and the mining of the resulting annotations to compute the risk of myocardial infarction for patients with rheumatoid arthritis who take Vioxx. Our analysis clearly reveals elevated risks for myocardial infarction in rheumatoid arthritis patients taking Vioxx (odds ratio 2.06) before 2005. Conclusions Our results show that it is possible to apply annotation analysis methods for testing hypotheses about drug safety using electronic medical records. PMID:22541596
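
The reported odds ratio is the standard cross-product ratio of a 2x2 contingency table built from the mined annotations (exposure: taking Vioxx; outcome: myocardial infarction). A minimal sketch with hypothetical counts, not the study's actual numbers:

```python
def odds_ratio(exposed_cases, exposed_controls, unexposed_cases, unexposed_controls):
    """Odds ratio from a 2x2 contingency table of annotated patient counts:
    (a/b) / (c/d) = ad / bc. Counts below are illustrative only."""
    return (exposed_cases * unexposed_controls) / (exposed_controls * unexposed_cases)

# Hypothetical counts: RA patients on Vioxx vs. not, with/without an MI annotation.
print(odds_ratio(30, 970, 15, 985))  # roughly 2.03 with these made-up counts
```

An odds ratio near 2, as in the study, means the odds of a myocardial infarction annotation are about twice as high among the exposed patients.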

  1. Functional Annotation of Ion Channel Structures by Molecular Simulation.

    PubMed

    Trick, Jemma L; Chelvaniththilan, Sivapalan; Klesse, Gianni; Aryal, Prafulla; Wallace, E Jayne; Tucker, Stephen J; Sansom, Mark S P

    2016-12-06

    Ion channels play key roles in cell membranes, and recent advances are yielding an increasing number of structures. However, their functional relevance is often unclear and better tools are required for their functional annotation. In sub-nanometer pores such as ion channels, hydrophobic gating has been shown to promote dewetting to produce a functionally closed (i.e., non-conductive) state. Using the serotonin receptor (5-HT3R) structure as an example, we demonstrate the use of molecular dynamics to aid the functional annotation of channel structures via simulation of the behavior of water within the pore. Three increasingly complex simulation analyses are described: water equilibrium densities; single-ion free-energy profiles; and computational electrophysiology. All three approaches correctly predict the 5-HT3R crystal structure to represent a functionally closed (i.e., non-conductive) state. We also illustrate the application of water equilibrium density simulations to annotate different conformational states of a glycine receptor. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.
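
As a rough illustration of the first analysis (water equilibrium densities), a dewetted region of the pore can be read off a density profile as a free-energy barrier via Boltzmann inversion, G(z) = -kT ln(rho(z)/rho_bulk). The density values and the kT constant below are illustrative, not taken from the paper:

```python
import math

KT = 2.494  # kJ/mol at roughly 300 K (approximate)

def free_energy_profile(density, bulk_density):
    """Water free-energy profile along the pore axis from equilibrium
    densities: G(z) = -kT * ln(rho(z)/rho_bulk). A dewetted (low-density)
    stretch appears as a high free-energy barrier, i.e. a closed gate."""
    return [-KT * math.log(max(d, 1e-9) / bulk_density) for d in density]

# Toy normalized densities along the pore axis; the dip marks a hydrophobic gate.
profile = free_energy_profile([1.0, 0.9, 0.05, 0.8, 1.0], bulk_density=1.0)
print(max(profile) == profile[2])  # the barrier sits at the dewetted region
```

This is the sense in which water behavior annotates function: a pore whose equilibrium water density collapses at a hydrophobic girdle is predicted to be non-conductive.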

  2. GAMES identifies and annotates mutations in next-generation sequencing projects.

    PubMed

    Sana, Maria Elena; Iascone, Maria; Marchetti, Daniela; Palatini, Jeff; Galasso, Marco; Volinia, Stefano

    2011-01-01

    Next-generation sequencing (NGS) methods have the potential to change the landscape of biomedical science, but at the same time pose several problems in analysis and interpretation. Currently, there are many commercial and public software packages that analyze NGS data. However, these applications are limited by output that is insufficiently annotated and difficult for end users to interpret functionally. We developed GAMES (Genomic Analysis of Mutations Extracted by Sequencing), a pipeline aiming to serve as an efficient middleman between the data deluge and investigators. GAMES performs multiple levels of filtering and annotation, such as aligning the reads to a reference genome, performing quality control and mutational analysis, integrating results with genome annotations and sorting each mismatch/deletion according to a range of parameters. Variations are matched to known polymorphisms. The prediction of functional mutations is achieved using different approaches. Overall, GAMES enables an effective complexity reduction in large-scale DNA-sequencing projects. GAMES is available free of charge to academic users and may be obtained from http://aqua.unife.it/GAMES.

  3. Generating region proposals for histopathological whole slide image retrieval.

    PubMed

    Ma, Yibing; Jiang, Zhiguo; Zhang, Haopeng; Xie, Fengying; Zheng, Yushan; Shi, Huaqiang; Zhao, Yu; Shi, Jun

    2018-06-01

    Content-based image retrieval is an effective method for histopathological image analysis. However, given a database of huge whole slide images (WSIs), acquiring appropriate regions of interest (ROIs) for training is significant and difficult. Moreover, histopathological images can only be annotated by pathologists, resulting in a lack of labeling information. Therefore, it is an important and challenging task to generate ROIs from WSIs and retrieve images with few labels. This paper presents a novel unsupervised region proposing method for histopathological WSIs based on Selective Search. Specifically, the WSI is over-segmented into regions, which are hierarchically merged until the WSI becomes a single region. Nucleus-oriented similarity measures for region mergence and a Nucleus-Cytoplasm color space for histopathological images are specially defined to generate accurate region proposals. Additionally, we propose a new semi-supervised hashing method for image retrieval. The semantic features of images are extracted with Latent Dirichlet Allocation and transformed into binary hashing codes with Supervised Hashing. The methods are tested on a large-scale multi-class database of breast histopathological WSIs. The results demonstrate that for one WSI, our region proposing method can generate 7.3 thousand contoured regions which fit well with 95.8% of the ROIs annotated by pathologists. The proposed hashing method can retrieve a query image among 136 thousand images in 0.29 s and reach a precision of 91% with only 10% of images labeled. The unsupervised region proposing method can generate regions as predictions of lesions in histopathological WSIs. The region proposals can also serve as training samples for machine-learning models for image retrieval. The proposed hashing method can achieve fast and precise image retrieval with a small amount of labels. Furthermore, the proposed methods can potentially be applied in online computer-aided-diagnosis systems.
    Copyright © 2018 Elsevier B.V. All rights reserved.
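
The hashing-based retrieval step can be sketched as follows: real-valued semantic features (e.g., LDA topic weights) are binarized into short codes, and retrieval ranks the database by Hamming distance to the query code. The thresholding scheme below is a deliberately simplified stand-in for the paper's Supervised Hashing, and all names and feature vectors are illustrative:

```python
def binarize(features, thresholds):
    """Turn a real-valued feature vector (e.g., LDA topic weights) into a
    binary hash code by per-dimension thresholding -- a simplified stand-in
    for the supervised hashing step described in the abstract."""
    return tuple(int(f > t) for f, t in zip(features, thresholds))

def hamming(a, b):
    """Number of differing bits between two equal-length codes."""
    return sum(x != y for x, y in zip(a, b))

def retrieve(query_code, database):
    """Rank database items (name -> code) by Hamming distance to the query."""
    return sorted(database, key=lambda name: hamming(query_code, database[name]))

thresholds = (0.25, 0.25, 0.25, 0.25)
db = {
    "slide_a": binarize((0.9, 0.1, 0.0, 0.0), thresholds),
    "slide_b": binarize((0.1, 0.8, 0.1, 0.0), thresholds),
    "slide_c": binarize((0.8, 0.2, 0.0, 0.0), thresholds),
}
query = binarize((0.7, 0.2, 0.1, 0.0), thresholds)
print(retrieve(query, db))  # slides with similar topic profiles rank first
```

Binary codes make retrieval over hundreds of thousands of images fast because Hamming distance is cheap to compute and to index, which is what enables the sub-second query times reported in the abstract.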

  4. Homology to peptide pattern for annotation of carbohydrate-active enzymes and prediction of function.

    PubMed

    Busk, P K; Pilgaard, B; Lezyk, M J; Meyer, A S; Lange, L

    2017-04-12

    Carbohydrate-active enzymes are found in all organisms and participate in key biological processes. These enzymes are classified into 274 families in the CAZy database, but the sequence diversity within each family makes it a major task to identify new family members and to provide a basis for prediction of enzyme function. A fast and reliable method for de novo annotation of genes encoding carbohydrate-active enzymes is to identify conserved peptides in the curated enzyme families, followed by matching of the conserved peptides to the sequence of interest, as demonstrated for the glycosyl hydrolase and lytic polysaccharide monooxygenase families. This approach not only assigns the enzymes to families but also provides functional prediction of the enzymes with high accuracy. We identified conserved peptides for all enzyme families in the CAZy database with Peptide Pattern Recognition. The conserved peptides were matched to protein sequences for de novo annotation and functional prediction of carbohydrate-active enzymes with the Hotpep method. Annotation of protein sequences from 12 bacterial and 16 fungal genomes to families with Hotpep had an accuracy of 0.84 (measured as F1-score) compared to semiautomatic annotation by the CAZy database, whereas the dbCAN HMM-based method had an accuracy of 0.77 with optimized parameters. Furthermore, Hotpep provided a functional prediction with 86% accuracy for the annotated genes. Hotpep is available as a stand-alone application for MS Windows. Hotpep is a state-of-the-art method for automatic annotation and functional prediction of carbohydrate-active enzymes.
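
The accuracy figures above are F1-scores, the harmonic mean of precision and recall over family assignments. A minimal sketch of the metric, with made-up counts chosen only to land near the reported 0.84:

```python
def f1_score(true_positives, false_positives, false_negatives):
    """F1 = harmonic mean of precision and recall, as used to score
    automatic family annotations against a curated reference."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)

# Illustrative counts only: 84 correct family calls, 10 spurious, 22 missed.
print(round(f1_score(84, 10, 22), 2))
```

Because F1 penalizes both spurious assignments (low precision) and missed family members (low recall), it is a stricter summary than raw agreement when comparing annotation pipelines.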

  5. Annotating Protein Functional Residues by Coupling High-Throughput Fitness Profile and Homologous-Structure Analysis

    PubMed Central

    Du, Yushen; Wu, Nicholas C.; Jiang, Lin; Zhang, Tianhao; Gong, Danyang; Shu, Sara; Wu, Ting-Ting

    2016-01-01

    Identification and annotation of functional residues are fundamental questions in protein sequence analysis. Sequence and structure conservation provides valuable information to tackle these questions. It is, however, limited by the incomplete sampling of sequence space in natural evolution. Moreover, proteins often have multiple functions, with overlapping sequences that present challenges to accurate annotation of the exact functions of individual residues by conservation-based methods. Using the influenza A virus PB1 protein as an example, we developed a method to systematically identify and annotate functional residues. We used saturation mutagenesis and high-throughput sequencing to measure the replication capacity of single nucleotide mutations across the entire PB1 protein. After predicting protein stability upon mutations, we identified functional PB1 residues that are essential for viral replication. To further annotate the functional residues important to the canonical or noncanonical functions of viral RNA-dependent RNA polymerase (vRdRp), we performed a homologous-structure analysis with 16 different vRdRp structures. We achieved high sensitivity in annotating the known canonical polymerase functional residues. Moreover, we identified a cluster of noncanonical functional residues located in the loop region of the PB1 β-ribbon. We further demonstrated that these residues were important for PB1 protein nuclear import through the interaction with Ran-binding protein 5. In summary, we developed a systematic and sensitive method to identify and annotate functional residues that are not restrained by sequence conservation. Importantly, this method is generally applicable to other proteins about which homologous-structure information is available. PMID:27803181
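
The fitness-profiling step described above boils down to comparing mutant read counts before and after selection, normalized to wild type. A common form of this measure is the log2 enrichment ratio; the function, counts, and mutant names below are illustrative assumptions, and the paper's exact formula may differ:

```python
import math

def relative_fitness(pre_counts, post_counts, wt_pre, wt_post):
    """Per-mutant relative fitness from deep-sequencing read counts:
    log2 enrichment of each mutant normalized to the wild-type enrichment
    (a common fitness-profiling measure; counts here are made up)."""
    wt_ratio = wt_post / wt_pre
    return {m: math.log2((post_counts[m] / pre_counts[m]) / wt_ratio)
            for m in pre_counts}

# Hypothetical mutants: one depleted during replication, one enriched.
pre  = {"K229A": 1000, "D445N": 1000}
post = {"K229A": 100,  "D445N": 2000}
fit = relative_fitness(pre, post, wt_pre=1000, wt_post=1000)
print(fit["K229A"] < 0 < fit["D445N"])  # deleterious vs. beneficial
```

Mutations at functionally essential residues are depleted after selection (negative fitness), which, combined with the stability predictions, is how the method flags residues required for viral replication.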

  6. Pairwise domain adaptation module for CNN-based 2-D/3-D registration.

    PubMed

    Zheng, Jiannan; Miao, Shun; Jane Wang, Z; Liao, Rui

    2018-04-01

    Accurate two-dimensional to three-dimensional (2-D/3-D) registration of preoperative 3-D data and intraoperative 2-D x-ray images is a key enabler for image-guided therapy. Recent advances in 2-D/3-D registration formulate the problem as a learning-based approach and exploit the modeling power of convolutional neural networks (CNN) to significantly improve the accuracy and efficiency of 2-D/3-D registration. However, for surgery-related applications, collecting a large clinical dataset with accurate annotations for training can be very challenging or impractical. Therefore, deep learning-based 2-D/3-D registration methods are often trained with synthetically generated data, and a performance gap is often observed when testing the trained model on clinical data. We propose a pairwise domain adaptation (PDA) module to adapt the model trained on source domain (i.e., synthetic data) to target domain (i.e., clinical data) by learning domain invariant features with only a few paired real and synthetic data. The PDA module is designed to be flexible for different deep learning-based 2-D/3-D registration frameworks, and it can be plugged into any pretrained CNN model in the same way as a simple Batch-Norm layer. The proposed PDA module has been quantitatively evaluated on two clinical applications using different frameworks of deep networks, demonstrating its significant advantages of generalizability and flexibility for 2-D/3-D medical image registration when a small number of paired real-synthetic data can be obtained.
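
    The underlying idea of re-normalizing features per domain, so that source and target activations land in a common range before reaching downstream layers, can be caricatured in plain Python. The feature values and the use of simple standardization are assumptions for illustration only; the actual PDA module learns domain-invariant features inside a CNN.

```python
from statistics import mean, pstdev

def domain_norm(features, ref_mean, ref_std, eps=1e-6):
    """Normalize a batch of 1-D feature values with externally supplied
    statistics, mimicking a Batch-Norm-style layer whose running mean/std
    have been re-estimated on a handful of target-domain samples."""
    return [(f - ref_mean) / (ref_std + eps) for f in features]

# Hypothetical feature activations from synthetic (source) and clinical
# (target) images: same shape of response, but shifted and scaled.
synthetic = [0.2, 0.4, 0.6, 0.8]
clinical = [1.2, 1.6, 2.0, 2.4]  # target domain: offset and amplified

# Estimate per-domain statistics, then normalize each batch with its own.
syn_norm = domain_norm(synthetic, mean(synthetic), pstdev(synthetic))
cli_norm = domain_norm(clinical, mean(clinical), pstdev(clinical))

# After per-domain normalization the two batches coincide, i.e. the
# downstream layers see (approximately) domain-invariant inputs.
print(syn_norm)
print(cli_norm)
```

    The point of the sketch is the mechanism, not the math: swapping in target-domain statistics is cheap, needs only a few target samples, and leaves the pretrained weights untouched.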

  7. WikiGenomes: an open web application for community consumption and curation of gene annotation data in Wikidata

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Putman, Tim E.; Lelong, Sebastien; Burgstaller-Muehlbacher, Sebastian

    With the advancement of genome-sequencing technologies, new genomes are being sequenced daily. Although these sequences are deposited in publicly available data warehouses, their functional and genomic annotations (beyond genes which are predicted automatically) mostly reside in the text of primary publications. Professional curators are hard at work extracting those annotations from the literature for the most studied organisms and depositing them in structured databases. However, the resources don’t exist to fund the comprehensive curation of the thousands of newly sequenced organisms in this manner. Here, we describe WikiGenomes (wikigenomes.org), a web application that facilitates the consumption and curation of genomic data by the entire scientific community. WikiGenomes is based on Wikidata, an openly editable knowledge graph with the goal of aggregating published knowledge into a free and open database. WikiGenomes empowers the individual genomic researcher to contribute their expertise to the curation effort and integrates the knowledge into Wikidata, enabling it to be accessed by anyone without restriction.

  8. Internet Databases of the Properties, Enzymatic Reactions, and Metabolism of Small Molecules—Search Options and Applications in Food Science

    PubMed Central

    Minkiewicz, Piotr; Darewicz, Małgorzata; Iwaniak, Anna; Bucholska, Justyna; Starowicz, Piotr; Czyrko, Emilia

    2016-01-01

    Internet databases of small molecules, their enzymatic reactions, and metabolism have emerged as useful tools in food science. Database searching is also introduced as part of chemistry or enzymology courses for food technology students. Such resources support the search for information about single compounds and facilitate the introduction of secondary analyses of large datasets. Information can be retrieved from databases by searching for the compound name or structure, annotating with the help of chemical codes or drawn using molecule editing software. Data mining options may be enhanced by navigating through a network of links and cross-links between databases. Exemplary databases reviewed in this article belong to two classes: tools concerning small molecules (including general and specialized databases annotating food components) and tools annotating enzymes and metabolism. Some problems associated with database application are also discussed. Data summarized in computer databases may be used for calculation of daily intake of bioactive compounds, prediction of metabolism of food components, and their biological activity as well as for prediction of interactions between food component and drugs. PMID:27929431

  9. Internet Databases of the Properties, Enzymatic Reactions, and Metabolism of Small Molecules-Search Options and Applications in Food Science.

    PubMed

    Minkiewicz, Piotr; Darewicz, Małgorzata; Iwaniak, Anna; Bucholska, Justyna; Starowicz, Piotr; Czyrko, Emilia

    2016-12-06

    Internet databases of small molecules, their enzymatic reactions, and metabolism have emerged as useful tools in food science. Database searching is also introduced as part of chemistry or enzymology courses for food technology students. Such resources support the search for information about single compounds and facilitate the introduction of secondary analyses of large datasets. Information can be retrieved from databases by searching for the compound name or structure, annotating with the help of chemical codes or drawn using molecule editing software. Data mining options may be enhanced by navigating through a network of links and cross-links between databases. Exemplary databases reviewed in this article belong to two classes: tools concerning small molecules (including general and specialized databases annotating food components) and tools annotating enzymes and metabolism. Some problems associated with database application are also discussed. Data summarized in computer databases may be used for calculation of daily intake of bioactive compounds, prediction of metabolism of food components, and their biological activity as well as for prediction of interactions between food component and drugs.

  10. WikiGenomes: an open web application for community consumption and curation of gene annotation data in Wikidata

    DOE PAGES

    Putman, Tim E.; Lelong, Sebastien; Burgstaller-Muehlbacher, Sebastian; ...

    2017-03-06

    With the advancement of genome-sequencing technologies, new genomes are being sequenced daily. Although these sequences are deposited in publicly available data warehouses, their functional and genomic annotations (beyond genes which are predicted automatically) mostly reside in the text of primary publications. Professional curators are hard at work extracting those annotations from the literature for the most studied organisms and depositing them in structured databases. However, the resources don’t exist to fund the comprehensive curation of the thousands of newly sequenced organisms in this manner. Here, we describe WikiGenomes (wikigenomes.org), a web application that facilitates the consumption and curation of genomic data by the entire scientific community. WikiGenomes is based on Wikidata, an openly editable knowledge graph with the goal of aggregating published knowledge into a free and open database. WikiGenomes empowers the individual genomic researcher to contribute their expertise to the curation effort and integrates the knowledge into Wikidata, enabling it to be accessed by anyone without restriction.

  11. Semantic Similarity in Biomedical Ontologies

    PubMed Central

    Pesquita, Catia; Faria, Daniel; Falcão, André O.; Lord, Phillip; Couto, Francisco M.

    2009-01-01

    In recent years, ontologies have become a mainstream topic in biomedical research. When biological entities are described using a common schema, such as an ontology, they can be compared by means of their annotations. This type of comparison is called semantic similarity, since it assesses the degree of relatedness between two entities by the similarity in meaning of their annotations. The application of semantic similarity to biomedical ontologies is recent; nevertheless, several studies have been published in the last few years describing and evaluating diverse approaches. Semantic similarity has become a valuable tool for validating the results drawn from biomedical studies such as gene clustering, gene expression data analysis, prediction and validation of molecular interactions, and disease gene prioritization. We review semantic similarity measures applied to biomedical ontologies and propose their classification according to the strategies they employ: node-based versus edge-based and pairwise versus groupwise. We also present comparative assessment studies and discuss the implications of their results. We survey the existing implementations of semantic similarity measures, and we describe examples of applications to biomedical research. This will clarify how biomedical researchers can benefit from semantic similarity measures and help them choose the approach most suitable for their studies. Biomedical ontologies are evolving toward increased coverage, formality, and integration, and their use for annotation is increasingly becoming a focus of both effort by biomedical experts and application of automated annotation procedures to create corpora of higher quality and completeness than are currently available. Given that semantic similarity measures are directly dependent on these evolutions, we can expect to see them gaining more relevance and even becoming as essential as sequence similarity is today in biomedical research. PMID:19649320
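
    As a concrete instance of a node-based pairwise measure from the survey's classification, Resnik similarity scores two ontology terms by the information content (IC) of their most informative common ancestor. The toy ontology and annotation probabilities below are invented for illustration; real implementations work over the Gene Ontology or similar biomedical ontologies.

```python
import math

# Toy ontology as child -> parents edges (a small DAG), with invented
# annotation probabilities used to estimate each term's information content.
PARENTS = {
    "root": [],
    "binding": ["root"],
    "catalysis": ["root"],
    "dna_binding": ["binding"],
    "rna_binding": ["binding"],
}
# p(term) = fraction of all annotations at or below the term (made-up numbers).
PROB = {"root": 1.0, "binding": 0.5, "catalysis": 0.4,
        "dna_binding": 0.2, "rna_binding": 0.25}

def ancestors(term):
    """All ancestors of a term in the DAG, including the term itself."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def ic(term):
    """Information content: rarer terms carry more information."""
    return -math.log(PROB[term])

def resnik(t1, t2):
    """Node-based pairwise similarity: IC of the most informative
    common ancestor of the two terms (Resnik's measure)."""
    common = ancestors(t1) & ancestors(t2)
    return max(ic(t) for t in common)

print(resnik("dna_binding", "rna_binding"))  # share "binding": -ln(0.5)
print(resnik("dna_binding", "catalysis"))    # only "root" in common: 0.0
```

    Edge-based measures would instead count path lengths in the graph, and groupwise measures would compare the full annotation sets of two gene products rather than term pairs.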

  12. Ahuna Mons

    NASA Image and Video Library

    2018-03-14

    This view from NASA's Dawn mission shows Ceres' tallest mountain, Ahuna Mons, 2.5 miles (4 kilometers) high and 11 miles (17 kilometers) wide. This is one of the few sites on Ceres at which a significant amount of sodium carbonate has been found, shown in green and red colors in the lower right image. The top and lower left images were collected by Dawn's framing camera. The top image is a 3D view reconstructed with the help of topography data. A non-annotated version is available at https://photojournal.jpl.nasa.gov/catalog/PIA21919

  13. 36 CFR 1206.22 - What type of proposal is eligible for a publications grant?

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... projects include the production of: (1) Documentary editions that involve collecting, compiling... records; (2) Microfilm editions consisting of organized collections of images of original sources, usually without transcription and annotations; (3) Electronic editions consisting of organized collections of...

  14. 36 CFR 1206.22 - What type of proposal is eligible for a publications grant?

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... projects include the production of: (1) Documentary editions that involve collecting, compiling... records; (2) Microfilm editions consisting of organized collections of images of original sources, usually without transcription and annotations; (3) Electronic editions consisting of organized collections of...

  15. 36 CFR 1206.22 - What type of proposal is eligible for a publications grant?

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... projects include the production of: (1) Documentary editions that involve collecting, compiling... records; (2) Microfilm editions consisting of organized collections of images of original sources, usually without transcription and annotations; (3) Electronic editions consisting of organized collections of...

  16. Picture Books Peek behind Cultural Curtains.

    ERIC Educational Resources Information Center

    Marantz, Sylvia; Marantz, Kenneth

    2000-01-01

    Discusses culture in picture books in three general categories: legends and histories; current life in particular areas; and the immigrant experience. Considers the translation of visual images, discusses authentic interpretations, and presents an annotated bibliography of picture books showing cultural diversity including African, Asian, Mexican,…

  17. PR Bibliography, 2001.

    ERIC Educational Resources Information Center

    Ramsey, Shirley, Ed.

    2001-01-01

    This annotated bibliography presents an overview of journal articles and books published in 2000 on public relations that can be helpful to teachers and students as well as to practitioners and managers. This bibliography is subdivided into 29 categories including campaigns; community relations; corporate image; education; employee relations;…

  18. Early esophageal cancer detection using RF classifiers

    NASA Astrophysics Data System (ADS)

    Janse, Markus H. A.; van der Sommen, Fons; Zinger, Svitlana; Schoon, Erik J.; de With, Peter H. N.

    2016-03-01

    Esophageal cancer is one of the fastest rising forms of cancer in the Western world. Using High-Definition (HD) endoscopy, gastroenterology experts can identify esophageal cancer at an early stage. Recent research shows that early cancer can be found using a state-of-the-art computer-aided detection (CADe) system based on analyzing static HD endoscopic images. Our research aims at extending this system by applying Random Forest (RF) classification, which introduces a confidence measure for detected cancer regions. To visualize this data, we propose a novel automated annotation system, employing the unique characteristics of the previous confidence measure. This approach allows reliable modeling of multi-expert knowledge and provides essential data for real-time video processing, to enable future use of the system in a clinical setting. The performance of the CADe system is evaluated on a 39-patient dataset, containing 100 images annotated by 5 expert gastroenterologists. The proposed system reaches a precision of 75% and recall of 90%, thereby improving the state-of-the-art results by 11 and 6 percentage points, respectively.
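
    The confidence measure a Random Forest provides is essentially the fraction of trees voting for a class. A deliberately tiny caricature with one-feature bootstrap "stumps" (the data, feature, and names are all hypothetical) shows how such a vote fraction could drive an annotation overlay:

```python
import random

def train_stump(data):
    """Fit a one-feature threshold 'tree' on a bootstrap sample:
    predict lesion (1) when the feature exceeds the learned threshold."""
    sample = [random.choice(data) for _ in data]
    # Fall back to the full data if the bootstrap sample misses a class.
    pos = [x for x, y in sample if y == 1] or [x for x, y in data if y == 1]
    neg = [x for x, y in sample if y == 0] or [x for x, y in data if y == 0]
    # Crude split point: midway between the two class means.
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def forest_confidence(stumps, x):
    """Confidence = fraction of trees voting 'lesion'; this is the kind of
    measure a CADe overlay could use to shade suspicious regions."""
    return sum(1 for t in stumps if x > t) / len(stumps)

random.seed(0)
# Hypothetical 1-D texture feature: lesions score high, normal tissue low.
data = [(0.1, 0), (0.2, 0), (0.3, 0), (0.7, 1), (0.8, 1), (0.9, 1)]
stumps = [train_stump(data) for _ in range(101)]

print(forest_confidence(stumps, 0.85))  # clearly suspicious: near 1.0
print(forest_confidence(stumps, 0.15))  # clearly normal: near 0.0
```

    A real CADe system would use full decision trees over rich texture and color features, but the confidence semantics, vote fraction over the ensemble, are the same.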

  19. Automated tissue segmentation of MR brain images in the presence of white matter lesions.

    PubMed

    Valverde, Sergi; Oliver, Arnau; Roura, Eloy; González-Villà, Sandra; Pareto, Deborah; Vilanova, Joan C; Ramió-Torrentà, Lluís; Rovira, Àlex; Lladó, Xavier

    2017-01-01

    Over the last few years, the increasing interest in brain tissue volume measurements in clinical settings has led to the development of a large number of automated tissue segmentation methods. However, white matter (WM) lesions are known to reduce the performance of automated tissue segmentation methods, so the lesions must be manually annotated and refilled before segmentation, a tedious and time-consuming process. Here, we propose a new, fully automated T1-w/FLAIR tissue segmentation approach designed to deal with images in the presence of WM lesions. This approach integrates a robust partial volume tissue segmentation with WM outlier rejection and filling, combining intensity and probabilistic and morphological prior maps. We evaluate the performance of this method on the MRBrainS13 tissue segmentation challenge database, which contains images with vascular WM lesions, and also on a set of Multiple Sclerosis (MS) patient images. On both databases, we compare the performance of our method with that of other state-of-the-art techniques. On the MRBrainS13 data, the presented approach was at the time of submission the best ranked unsupervised intensity model method of the challenge (7th position) and clearly outperformed the other unsupervised pipelines such as FAST and SPM12. On MS data, the differences in tissue segmentation between the images segmented with our method and the same images where manual expert annotations were used to refill lesions on T1-w images before segmentation were lower than or similar to those of the best state-of-the-art pipeline incorporating automated lesion segmentation and filling. Our results show that the proposed pipeline achieved very competitive results on both vascular and MS lesions. A public version of this approach is available to download for the neuro-imaging community. Copyright © 2016 Elsevier B.V. All rights reserved.
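
    The lesion-filling step such pipelines automate can be sketched in one function: replace lesion voxels with the mean intensity of normal-appearing white matter so the tissue model is not skewed by hypo- or hyperintense lesions. The 1-D voxel lists and masks below are hypothetical stand-ins for real 3-D image volumes.

```python
from statistics import mean

def refill_lesions(intensities, lesion_mask, wm_mask):
    """Replace lesion voxels with the mean intensity of normal-appearing
    white matter, the classic 'lesion filling' preprocessing step, so a
    standard tissue segmentation is not skewed by lesion intensities."""
    wm_mean = mean(v for v, is_wm in zip(intensities, wm_mask) if is_wm)
    return [wm_mean if is_lesion else v
            for v, is_lesion in zip(intensities, lesion_mask)]

# Hypothetical 1-D slice: WM around 100, grey matter around 60,
# and one dark lesion (40) sitting inside the white matter.
voxels = [100, 98, 40, 102, 60, 62]
lesion = [0, 0, 1, 0, 0, 0]
wm = [1, 1, 0, 1, 0, 0]

print(refill_lesions(voxels, lesion, wm))  # lesion voxel -> WM mean
```

    The contribution of the paper above is doing this without the manual lesion annotation: outlier rejection finds the lesion voxels automatically before filling.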

  20. Automatic recognition of conceptualization zones in scientific articles and two life science applications.

    PubMed

    Liakata, Maria; Saha, Shyamasree; Dobnik, Simon; Batchelor, Colin; Rebholz-Schuhmann, Dietrich

    2012-04-01

    Scholarly biomedical publications report on the findings of a research investigation. Scientists use a well-established discourse structure to relate their work to the state of the art, express their own motivation and hypotheses and report on their methods, results and conclusions. In previous work, we have proposed ways to explicitly annotate the structure of scientific investigations in scholarly publications. Here we present the means to facilitate automatic access to the scientific discourse of articles by automating the recognition of 11 categories at the sentence level, which we call Core Scientific Concepts (CoreSCs). These include: Hypothesis, Motivation, Goal, Object, Background, Method, Experiment, Model, Observation, Result and Conclusion. CoreSCs provide the structure and context to all statements and relations within an article and their automatic recognition can greatly facilitate biomedical information extraction by characterizing the different types of facts, hypotheses and evidence available in a scientific publication. We have trained and compared machine learning classifiers (support vector machines and conditional random fields) on a corpus of 265 full articles in biochemistry and chemistry to automatically recognize CoreSCs. We have evaluated our automatic classifications against a manually annotated gold standard, and have achieved promising accuracies with 'Experiment', 'Background' and 'Model' being the categories with the highest F1-scores (76%, 62% and 53%, respectively). We have analysed the task of CoreSC annotation both from a sentence classification as well as sequence labelling perspective and we present a detailed feature evaluation. The most discriminative features are local sentence features such as unigrams, bigrams and grammatical dependencies while features encoding the document structure, such as section headings, also play an important role for some of the categories. 
We discuss the usefulness of automatically generated CoreSCs in two biomedical applications as well as work in progress. A web-based tool for the automatic annotation of articles with CoreSCs and corresponding documentation is available online at http://www.sapientaproject.com/software. The site http://www.sapientaproject.com also contains detailed information pertaining to CoreSC annotation and links to annotation guidelines, as well as a corpus of manually annotated articles, which served as our training data. Contact: liakata@ebi.ac.uk. Supplementary data are available at Bioinformatics online.
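
    The per-category F1-scores used to evaluate CoreSC recognition combine precision and recall for each label separately. A minimal stdlib computation over hypothetical gold and predicted sentence labels (the label lists below are invented for illustration):

```python
from collections import Counter

def per_category_f1(gold, predicted):
    """Per-category F1 from parallel lists of sentence labels, the metric
    used to score a sentence classifier against a gold-standard annotation."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label p, but it was wrong
            fn[g] += 1  # gold label g was missed
    scores = {}
    for cat in set(gold) | set(predicted):
        precision = tp[cat] / (tp[cat] + fp[cat]) if tp[cat] + fp[cat] else 0.0
        recall = tp[cat] / (tp[cat] + fn[cat]) if tp[cat] + fn[cat] else 0.0
        scores[cat] = (2 * precision * recall / (precision + recall)
                       if precision + recall else 0.0)
    return scores

gold = ["Method", "Result", "Result", "Background", "Method"]
pred = ["Method", "Result", "Background", "Background", "Result"]
scores = per_category_f1(gold, pred)
print(scores)
```

    Scoring per category, rather than overall accuracy, is what exposes the large spread the authors report between easy classes like 'Experiment' and harder ones like 'Model'.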
