Science.gov

Sample records for open biomedical annotator

  1. Comparison of concept recognizers for building the Open Biomedical Annotator

    PubMed Central

    Shah, Nigam H; Bhatia, Nipun; Jonquet, Clement; Rubin, Daniel; Chiang, Annie P; Musen, Mark A

    2009-01-01

    The National Center for Biomedical Ontology (NCBO) is developing a system for automated, ontology-based access to online biomedical resources (Shah NH, et al.: Ontology-driven indexing of public datasets for translational bioinformatics. BMC Bioinformatics 2009, 10(Suppl 2):S1). The system's indexing workflow processes the text metadata of diverse resources such as datasets from GEO and ArrayExpress to annotate and index them with concepts from appropriate ontologies. This indexing requires the use of a concept-recognition tool to identify ontology concepts in the resource's textual metadata. In this paper, we present a comparison of two concept recognizers – NLM's MetaMap and the University of Michigan's Mgrep. We utilize a number of data sources and dictionaries to evaluate the concept recognizers in terms of precision, recall, speed of execution, scalability and customizability. Our evaluations demonstrate that Mgrep has a clear edge over MetaMap for large-scale service-oriented applications. Based on our analysis we also suggest areas of potential improvement for Mgrep. We have subsequently used Mgrep to build the Open Biomedical Annotator service. The Annotator service has access to a large dictionary of biomedical terms derived from the Unified Medical Language System (UMLS) and NCBO ontologies. The Annotator also leverages the hierarchical structure of the ontologies and their mappings to expand annotations. The Annotator service is available to the community as a REST Web service for creating ontology-based annotations of their data. PMID:19761568
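
    The record above describes the Annotator as a REST Web service. The sketch below is a minimal illustration of calling the current BioPortal/NCBO Annotator endpoint from Python; the URL, parameter names and response fields reflect BioPortal's published interface as far as known, the API key is a placeholder, and the live documentation should be checked before use.

    ```python
    # Minimal sketch of calling the NCBO/BioPortal Annotator REST service.
    # NCBO_API_KEY is a placeholder; obtain a key from bioportal.bioontology.org.
    import requests

    NCBO_API_KEY = "your-api-key-here"
    ANNOTATOR_URL = "https://data.bioontology.org/annotator"

    def annotate(text, ontologies=("SNOMEDCT", "GO")):
        params = {
            "text": text,
            "ontologies": ",".join(ontologies),  # restrict the matching dictionaries
            "longest_only": "true",              # keep only the longest span per match
            "apikey": NCBO_API_KEY,
        }
        resp = requests.get(ANNOTATOR_URL, params=params, timeout=30)
        resp.raise_for_status()
        for ann in resp.json():
            concept = ann["annotatedClass"]["@id"]
            for hit in ann["annotations"]:
                print(f'{hit["text"]} [{hit["from"]}-{hit["to"]}] -> {concept}')

    annotate("Melanoma is a malignant tumor of melanocytes.")
    ```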

  2. Comparison of concept recognizers for building the Open Biomedical Annotator.

    PubMed

    Shah, Nigam H; Bhatia, Nipun; Jonquet, Clement; Rubin, Daniel; Chiang, Annie P; Musen, Mark A

    2009-09-17

    The National Center for Biomedical Ontology (NCBO) is developing a system for automated, ontology-based access to online biomedical resources (Shah NH, et al.: Ontology-driven indexing of public datasets for translational bioinformatics. BMC Bioinformatics 2009, 10(Suppl 2):S1). The system's indexing workflow processes the text metadata of diverse resources such as datasets from GEO and ArrayExpress to annotate and index them with concepts from appropriate ontologies. This indexing requires the use of a concept-recognition tool to identify ontology concepts in the resource's textual metadata. In this paper, we present a comparison of two concept recognizers - NLM's MetaMap and the University of Michigan's Mgrep. We utilize a number of data sources and dictionaries to evaluate the concept recognizers in terms of precision, recall, speed of execution, scalability and customizability. Our evaluations demonstrate that Mgrep has a clear edge over MetaMap for large-scale service-oriented applications. Based on our analysis we also suggest areas of potential improvement for Mgrep. We have subsequently used Mgrep to build the Open Biomedical Annotator service. The Annotator service has access to a large dictionary of biomedical terms derived from the Unified Medical Language System (UMLS) and NCBO ontologies. The Annotator also leverages the hierarchical structure of the ontologies and their mappings to expand annotations. The Annotator service is available to the community as a REST Web service for creating ontology-based annotations of their data.

  3. Ranking Biomedical Annotations with Annotator's Semantic Relevancy

    PubMed Central

    2014-01-01

    Biomedical annotation is a common and effective artifact for researchers to discuss, show opinions, and share discoveries. It has become increasingly popular in many online research communities and carries much useful information. Ranking biomedical annotations is a critical problem for data users to efficiently get information. As the annotator's knowledge about the annotated entity normally determines the quality of the annotations, we evaluate this knowledge, that is, the semantic relationship between them, in two ways. The first is extracting relational information from credible websites by mining association rules between an annotator and a biomedical entity. The second is frequent pattern mining from historical annotations, which reveals common features of biomedical entities that an annotator can annotate with high quality. We propose a weighted and concept-extended RDF model to represent an annotator, a biomedical entity, and their background attributes, and we merge information from the two ways as the context of an annotator. Based on that, we present a method to rank the annotations by evaluating their correctness according to users' votes and the semantic relevancy between the annotator and the annotated entity. The experimental results show that the approach is applicable and efficient even when the data set is large. PMID:24899918

  4. Ranking biomedical annotations with annotator's semantic relevancy.

    PubMed

    Wu, Aihua

    2014-01-01

    Biomedical annotation is a common and effective artifact for researchers to discuss, show opinions, and share discoveries. It has become increasingly popular in many online research communities and carries much useful information. Ranking biomedical annotations is a critical problem for data users to efficiently get information. As the annotator's knowledge about the annotated entity normally determines the quality of the annotations, we evaluate this knowledge, that is, the semantic relationship between them, in two ways. The first is extracting relational information from credible websites by mining association rules between an annotator and a biomedical entity. The second is frequent pattern mining from historical annotations, which reveals common features of biomedical entities that an annotator can annotate with high quality. We propose a weighted and concept-extended RDF model to represent an annotator, a biomedical entity, and their background attributes, and we merge information from the two ways as the context of an annotator. Based on that, we present a method to rank the annotations by evaluating their correctness according to users' votes and the semantic relevancy between the annotator and the annotated entity. The experimental results show that the approach is applicable and efficient even when the data set is large.
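
    The approach above ranks annotations by combining users' votes with an annotator-entity semantic relevancy score. Purely as an illustrative sketch, one simple way to combine such signals is a weighted sum; the weighting scheme, field names and score ranges below are assumptions for illustration, not the paper's model.

    ```python
    # Illustrative sketch: rank annotations by a weighted combination of
    # normalized user votes and a precomputed semantic relevancy score (0..1).
    # The weights and the scoring formula are assumptions, not the paper's method.
    def rank_annotations(annotations, vote_weight=0.5, relevancy_weight=0.5):
        """annotations: list of dicts with 'id', 'votes' (int), 'relevancy' (0..1)."""
        max_votes = max((a["votes"] for a in annotations), default=1) or 1
        def score(a):
            return (vote_weight * a["votes"] / max_votes
                    + relevancy_weight * a["relevancy"])
        return sorted(annotations, key=score, reverse=True)

    ranked = rank_annotations([
        {"id": "a1", "votes": 12, "relevancy": 0.4},
        {"id": "a2", "votes": 3,  "relevancy": 0.9},
    ])
    print([a["id"] for a in ranked])
    ```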

  5. Informatics in radiology: An open-source and open-access cancer biomedical informatics grid annotation and image markup template builder.

    PubMed

    Mongkolwat, Pattanasak; Channin, David S; Kleper, Vladimir; Rubin, Daniel L

    2012-01-01

    In a routine clinical environment or clinical trial, a case report form or structured reporting template can be used to quickly generate uniform and consistent reports. Annotation and image markup (AIM), a project supported by the National Cancer Institute's cancer biomedical informatics grid, can be used to collect information for a case report form or structured reporting template. AIM is designed to store, in a single information source, (a) the description of pixel data with use of markups or graphical drawings placed on the image, (b) calculation results (which may or may not be directly related to the markups), and (c) supplemental information. To facilitate the creation of AIM annotations with data entry templates, an AIM template schema and an open-source template creation application were developed to assist clinicians, image researchers, and designers of clinical trials to quickly create a set of data collection items, thereby ultimately making image information more readily accessible.

  6. Informatics in Radiology: An Open-Source and Open-Access Cancer Biomedical Informatics Grid Annotation and Image Markup Template Builder

    PubMed Central

    Mongkolwat, Pattanasak; Channin, David S.; Kleper, Vladimir; Rubin, Daniel L.

    2012-01-01

    In a routine clinical environment or clinical trial, a case report form or structured reporting template can be used to quickly generate uniform and consistent reports. Annotation and Image Markup (AIM), a project supported by the National Cancer Institute’s cancer Biomedical Informatics Grid, can be used to collect information for a case report form or structured reporting template. AIM is designed to store, in a single information source, (a) the description of pixel data with use of markups or graphical drawings placed on the image, (b) calculation results (which may or may not be directly related to the markups), and (c) supplemental information. To facilitate the creation of AIM annotations with data entry templates, an AIM template schema and an open-source template creation application were developed to assist clinicians, image researchers, and designers of clinical trials to quickly create a set of data collection items, thereby ultimately making image information more readily accessible. © RSNA, 2012 PMID:22556315

  7. Corpus annotation for mining biomedical events from literature.

    PubMed

    Kim, Jin-Dong; Ohta, Tomoko; Tsujii, Jun'ichi

    2008-01-08

    Advanced Text Mining (TM) such as semantic enrichment of papers, event or relation extraction, and intelligent Question Answering have increasingly attracted attention in the bio-medical domain. For such attempts to succeed, text annotation from the biological point of view is indispensable. However, due to the complexity of the task, semantic annotation has never been tried on a large scale, apart from relatively simple term annotation. We have completed a new type of semantic annotation, event annotation, which is an addition to the existing annotations in the GENIA corpus. The corpus has already been annotated with POS (Parts of Speech), syntactic trees, terms, etc. The new annotation was made on half of the GENIA corpus, consisting of 1,000 Medline abstracts. It contains 9,372 sentences in which 36,114 events are identified. The major challenges during event annotation were (1) to design a scheme of annotation which meets specific requirements of text annotation, (2) to achieve biology-oriented annotation which reflects biologists' interpretation of text, and (3) to ensure the homogeneity of annotation quality across annotators. To meet these challenges, we introduced new concepts such as Single-facet Annotation and Semantic Typing, which have collectively contributed to the successful completion of a large-scale annotation. The resulting event-annotated corpus is the largest and one of the best in quality among similar annotation efforts. We expect it to become a valuable resource for NLP (Natural Language Processing)-based TM in the bio-medical domain.
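
    Event annotations of this kind are commonly distributed as standoff files in the BioNLP shared-task style later derived from the GENIA event corpus (the GENIA release itself is XML). The sketch below parses that standoff layout purely for illustration; the file layout and example lines are assumptions, not the corpus distribution format.

    ```python
    # Sketch of parsing standoff event annotations in the BioNLP shared-task style:
    # T* lines are text-bound terms/triggers, E* lines are events with typed arguments.
    def parse_standoff(lines):
        terms, events = {}, {}
        for line in lines:
            fields = line.rstrip("\n").split("\t")
            if fields[0].startswith("T"):      # term/trigger: id, "Type start end", text
                t_type, start, end = fields[1].split(" ")
                terms[fields[0]] = {"type": t_type,
                                    "span": (int(start), int(end)),
                                    "text": fields[2]}
            elif fields[0].startswith("E"):    # event: id, "Type:Trigger Role:Arg ..."
                parts = fields[1].split(" ")
                e_type, trigger = parts[0].split(":")
                args = dict(p.split(":") for p in parts[1:])
                events[fields[0]] = {"type": e_type, "trigger": trigger, "args": args}
        return terms, events

    terms, events = parse_standoff([
        "T1\tProtein 0 5\tTRAF2",
        "T2\tPositive_regulation 15 24\tactivates",
        "E1\tPositive_regulation:T2 Theme:T1",
    ])
    print(events["E1"])
    ```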

  8. Corpus annotation for mining biomedical events from literature

    PubMed Central

    Kim, Jin-Dong; Ohta, Tomoko; Tsujii, Jun'ichi

    2008-01-01

    Background Advanced Text Mining (TM) such as semantic enrichment of papers, event or relation extraction, and intelligent Question Answering have increasingly attracted attention in the bio-medical domain. For such attempts to succeed, text annotation from the biological point of view is indispensable. However, due to the complexity of the task, semantic annotation has never been tried on a large scale, apart from relatively simple term annotation. Results We have completed a new type of semantic annotation, event annotation, which is an addition to the existing annotations in the GENIA corpus. The corpus has already been annotated with POS (Parts of Speech), syntactic trees, terms, etc. The new annotation was made on half of the GENIA corpus, consisting of 1,000 Medline abstracts. It contains 9,372 sentences in which 36,114 events are identified. The major challenges during event annotation were (1) to design a scheme of annotation which meets specific requirements of text annotation, (2) to achieve biology-oriented annotation which reflects biologists' interpretation of text, and (3) to ensure the homogeneity of annotation quality across annotators. To meet these challenges, we introduced new concepts such as Single-facet Annotation and Semantic Typing, which have collectively contributed to the successful completion of a large-scale annotation. Conclusion The resulting event-annotated corpus is the largest and one of the best in quality among similar annotation efforts. We expect it to become a valuable resource for NLP (Natural Language Processing)-based TM in the bio-medical domain. PMID:18182099

  9. Semantator: semantic annotator for converting biomedical text to linked data.

    PubMed

    Tao, Cui; Song, Dezhao; Sharma, Deepak; Chute, Christopher G

    2013-10-01

    More than 80% of biomedical data is embedded in plain text. The unstructured nature of these text-based documents makes it challenging to easily browse and query the data of interest in them. One approach to facilitate browsing and querying biomedical text is to convert the plain text to a linked web of data, i.e., converting data originally in free text to structured formats with defined meta-level semantics. In this paper, we introduce Semantator (Semantic Annotator), a semantic-web-based environment for annotating data of interest in biomedical documents, browsing and querying the annotated data, and interactively refining annotation results if needed. Through Semantator, information of interest can be either annotated manually or semi-automatically using plug-in information extraction tools. The annotated results will be stored in RDF and can be queried using the SPARQL query language. In addition, semantic reasoners can be directly applied to the annotated data for consistency checking and knowledge inference. Semantator has been released online and was used by the biomedical ontology community, which provided positive feedback. Our evaluation results indicated that (1) Semantator can perform the annotation functionalities as designed; (2) Semantator can be adopted in real applications in clinical and translational research; and (3) the annotated results using Semantator can be easily used in Semantic-web-based reasoning tools for further inference.
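
    As a minimal sketch of the store-as-RDF-and-query-with-SPARQL workflow described above: the example uses the rdflib library, and the property and class names under the EX namespace are invented for illustration, not Semantator's actual schema.

    ```python
    # Store a text annotation as RDF triples and retrieve it with SPARQL.
    # The EX namespace and its terms are hypothetical, for illustration only.
    from rdflib import Graph, Literal, Namespace, RDF, URIRef

    EX = Namespace("http://example.org/annotation#")
    g = Graph()

    ann = URIRef("http://example.org/annotation/1")
    g.add((ann, RDF.type, EX.Annotation))
    g.add((ann, EX.annotatedText, Literal("type 2 diabetes mellitus")))
    g.add((ann, EX.concept, URIRef("http://purl.bioontology.org/ontology/SNOMEDCT/44054006")))

    query = """
    PREFIX ex: <http://example.org/annotation#>
    SELECT ?text ?concept WHERE {
        ?a a ex:Annotation ; ex:annotatedText ?text ; ex:concept ?concept .
    }
    """
    for text, concept in g.query(query):
        print(text, concept)
    ```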

  10. Open semantic annotation of scientific publications using DOMEO

    PubMed Central

    2012-01-01

    Background Our group has developed a useful shared software framework for performing, versioning, sharing and viewing Web annotations of a number of kinds, using an open representation model. Methods The Domeo Annotation Tool was developed in tandem with this open model, the Annotation Ontology (AO). Development of both the Annotation Framework and the open model was driven by requirements of several different types of alpha users, including bench scientists and biomedical curators from university research labs, online scientific communities, publishing and pharmaceutical companies. Several use cases were incrementally implemented by the toolkit. These use cases in biomedical communications include personal note-taking, group document annotation, semantic tagging, claim-evidence-context extraction, reagent tagging, and curation of textmining results from entity extraction algorithms. Results We report on the Domeo user interface here. Domeo has been deployed in beta release as part of the NIH Neuroscience Information Framework (NIF, http://www.neuinfo.org) and is scheduled for production deployment in the NIF’s next full release. Future papers will describe other aspects of this work in detail, including Annotation Framework Services and components for integrating with external textmining services, such as the NCBO Annotator web service, and with other textmining applications using the Apache UIMA framework. PMID:22541592

  11. A survey on annotation tools for the biomedical literature.

    PubMed

    Neves, Mariana; Leser, Ulf

    2014-03-01

    New approaches to biomedical text mining crucially depend on the existence of comprehensive annotated corpora. Such corpora, commonly called gold standards, are important for learning patterns or models during the training phase, for evaluating and comparing the performance of algorithms and also for better understanding the information sought for by means of examples. Gold standards depend on human understanding and manual annotation of natural language text. This process is very time-consuming and expensive because it requires high intellectual effort from domain experts. Accordingly, the lack of gold standards is considered as one of the main bottlenecks for developing novel text mining methods. This situation led to the development of tools that support humans in annotating texts. Such tools should be intuitive to use, should support a range of different input formats, should include visualization of annotated texts and should generate an easy-to-parse output format. Today, a range of tools which implement some of these functionalities are available. Here, we present a comprehensive survey of tools for supporting annotation of biomedical texts. Altogether, we considered almost 30 tools, 13 of which were selected for an in-depth comparison. The comparison was performed using predefined criteria and was accompanied by hands-on experiences whenever possible. Our survey shows that current tools can support many of the tasks in biomedical text annotation in a satisfying manner, but also that no tool can be considered as a true comprehensive solution.

  12. Desiderata for ontologies to be used in semantic annotation of biomedical documents.

    PubMed

    Bada, Michael; Hunter, Lawrence

    2011-02-01

    A wealth of knowledge valuable to the translational research scientist is contained within the vast biomedical literature, but this knowledge is typically in the form of natural language. Sophisticated natural-language-processing systems are needed to translate text into unambiguous formal representations grounded in high-quality consensus ontologies, and these systems in turn rely on gold-standard corpora of annotated documents for training and testing. To this end, we are constructing the Colorado Richly Annotated Full-Text (CRAFT) Corpus, a collection of 97 full-text biomedical journal articles that are being manually annotated with the entire sets of terms from select vocabularies, predominantly from the Open Biomedical Ontologies (OBO) library. Our efforts in building this corpus have illuminated infelicities of these ontologies with respect to the semantic annotation of biomedical documents, and we propose desiderata whose implementation could substantially improve their utility in this task; these include the integration of overlapping terms across OBOs, the resolution of OBO-specific ambiguities, the integration of the BFO with the OBOs and the use of mid-level ontologies, the inclusion of noncanonical instances, and the expansion of relations and realizable entities.

  13. Enriching a biomedical event corpus with meta-knowledge annotation.

    PubMed

    Thompson, Paul; Nawaz, Raheel; McNaught, John; Ananiadou, Sophia

    2011-10-10

    Biomedical papers contain rich information about entities, facts and events of biological relevance. To discover these automatically, we use text mining techniques, which rely on annotated corpora for training. In order to extract protein-protein interactions, genotype-phenotype/gene-disease associations, etc., we rely on event corpora that are annotated with classified, structured representations of important facts and findings contained within text. These provide an important resource for the training of domain-specific information extraction (IE) systems, to facilitate semantic-based searching of documents. Correct interpretation of these events is not possible without additional information, e.g., does an event describe a fact, a hypothesis, an experimental result or an analysis of results? How confident is the author about the validity of her analyses? These and other types of information, which we collectively term meta-knowledge, can be derived from the context of the event. We have designed an annotation scheme for meta-knowledge enrichment of biomedical event corpora. The scheme is multi-dimensional, in that each event is annotated for 5 different aspects of meta-knowledge that can be derived from the textual context of the event. Textual clues used to determine the values are also annotated. The scheme is intended to be general enough to allow integration with different types of bio-event annotation, whilst being detailed enough to capture important subtleties in the nature of the meta-knowledge expressed in the text. We report here on both the main features of the annotation scheme, as well as its application to the GENIA event corpus (1000 abstracts with 36,858 events). High levels of inter-annotator agreement have been achieved, falling in the range of 0.84-0.93 Kappa. By augmenting event annotations with meta-knowledge, more sophisticated IE systems can be trained, which allow interpretative information to be specified as part of the search criteria.
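
    The inter-annotator agreement of 0.84-0.93 quoted above is Cohen's kappa, which corrects raw agreement for chance. The worked sketch below computes it from two annotators' label sequences; the labels are toy data, not drawn from the GENIA meta-knowledge corpus.

    ```python
    # Worked sketch of Cohen's kappa: (observed agreement - chance agreement)
    # divided by (1 - chance agreement). Toy labels, for illustration only.
    from collections import Counter

    def cohens_kappa(labels_a, labels_b):
        assert len(labels_a) == len(labels_b)
        n = len(labels_a)
        observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
        freq_a, freq_b = Counter(labels_a), Counter(labels_b)
        expected = sum(freq_a[c] * freq_b[c] for c in set(labels_a) | set(labels_b)) / (n * n)
        return (observed - expected) / (1 - expected)

    a = ["Fact", "Fact", "Analysis", "Hypothesis", "Fact"]
    b = ["Fact", "Analysis", "Analysis", "Hypothesis", "Fact"]
    print(round(cohens_kappa(a, b), 3))
    ```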

  14. Enriching a biomedical event corpus with meta-knowledge annotation

    PubMed Central

    2011-01-01

    Background Biomedical papers contain rich information about entities, facts and events of biological relevance. To discover these automatically, we use text mining techniques, which rely on annotated corpora for training. In order to extract protein-protein interactions, genotype-phenotype/gene-disease associations, etc., we rely on event corpora that are annotated with classified, structured representations of important facts and findings contained within text. These provide an important resource for the training of domain-specific information extraction (IE) systems, to facilitate semantic-based searching of documents. Correct interpretation of these events is not possible without additional information, e.g., does an event describe a fact, a hypothesis, an experimental result or an analysis of results? How confident is the author about the validity of her analyses? These and other types of information, which we collectively term meta-knowledge, can be derived from the context of the event. Results We have designed an annotation scheme for meta-knowledge enrichment of biomedical event corpora. The scheme is multi-dimensional, in that each event is annotated for 5 different aspects of meta-knowledge that can be derived from the textual context of the event. Textual clues used to determine the values are also annotated. The scheme is intended to be general enough to allow integration with different types of bio-event annotation, whilst being detailed enough to capture important subtleties in the nature of the meta-knowledge expressed in the text. We report here on both the main features of the annotation scheme, as well as its application to the GENIA event corpus (1000 abstracts with 36,858 events). High levels of inter-annotator agreement have been achieved, falling in the range of 0.84-0.93 Kappa. Conclusion By augmenting event annotations with meta-knowledge, more sophisticated IE systems can be trained, which allow interpretative information to be specified as part of the search criteria.

  15. An open annotation ontology for science on web 3.0.

    PubMed

    Ciccarese, Paolo; Ocana, Marco; Garcia Castro, Leyla Jael; Das, Sudeshna; Clark, Tim

    2011-05-17

    There is currently a gap between the rich and expressive collection of published biomedical ontologies, and the natural language expression of biomedical papers consumed on a daily basis by scientific researchers. The purpose of this paper is to provide an open, shareable structure for dynamic integration of biomedical domain ontologies with the scientific document, in the form of an Annotation Ontology (AO), thus closing this gap and enabling application of formal biomedical ontologies directly to the literature as it emerges. Initial requirements for AO were elicited by analysis of integration needs between biomedical web communities, and of needs for representing and integrating results of biomedical text mining. Analysis of strengths and weaknesses of previous efforts in this area was also performed. A series of increasingly refined annotation tools were then developed along with a metadata model in OWL, and the ontology was deployed to users at a major pharmaceutical company and a major academic center for feedback and additional requirements. Further requirements and critiques of the model were also elicited through discussions with many colleagues and incorporated into the work. This paper presents Annotation Ontology (AO), an open ontology in OWL-DL for annotating scientific documents on the web. AO supports both human and algorithmic content annotation. It enables "stand-off" or independent metadata anchored to specific positions in a web document by any one of several methods. In AO, the document may be annotated but is not required to be under update control of the annotator. AO contains a provenance model to support versioning, and a set model for specifying groups and containers of annotation. AO is freely available under open source license at http://purl.org/ao/, and extensive documentation including screencasts is available on AO's Google Code page: http://code.google.com/p/annotation-ontology/ . The Annotation Ontology meets critical requirements for

  16. An open annotation ontology for science on web 3.0

    PubMed Central

    2011-01-01

    Background There is currently a gap between the rich and expressive collection of published biomedical ontologies, and the natural language expression of biomedical papers consumed on a daily basis by scientific researchers. The purpose of this paper is to provide an open, shareable structure for dynamic integration of biomedical domain ontologies with the scientific document, in the form of an Annotation Ontology (AO), thus closing this gap and enabling application of formal biomedical ontologies directly to the literature as it emerges. Methods Initial requirements for AO were elicited by analysis of integration needs between biomedical web communities, and of needs for representing and integrating results of biomedical text mining. Analysis of strengths and weaknesses of previous efforts in this area was also performed. A series of increasingly refined annotation tools were then developed along with a metadata model in OWL, and the ontology was deployed to users at a major pharmaceutical company and a major academic center for feedback and additional requirements. Further requirements and critiques of the model were also elicited through discussions with many colleagues and incorporated into the work. Results This paper presents Annotation Ontology (AO), an open ontology in OWL-DL for annotating scientific documents on the web. AO supports both human and algorithmic content annotation. It enables “stand-off” or independent metadata anchored to specific positions in a web document by any one of several methods. In AO, the document may be annotated but is not required to be under update control of the annotator. AO contains a provenance model to support versioning, and a set model for specifying groups and containers of annotation. AO is freely available under open source license at http://purl.org/ao/, and extensive documentation including screencasts is available on AO’s Google Code page: http://code.google.com/p/annotation-ontology/ . Conclusions The

  17. Large-scale biomedical concept recognition: an evaluation of current automatic annotators and their parameters

    PubMed Central

    2014-01-01

    Background Ontological concepts are useful for many different biomedical tasks. Concepts are difficult to recognize in text due to a disconnect between what is captured in an ontology and how the concepts are expressed in text. There are many recognizers for specific ontologies, but a general approach for concept recognition is an open problem. Results Three dictionary-based systems (MetaMap, NCBO Annotator, and ConceptMapper) are evaluated on eight biomedical ontologies in the Colorado Richly Annotated Full-Text (CRAFT) Corpus. Over 1,000 parameter combinations are examined, and best-performing parameters for each system-ontology pair are presented. Conclusions Baselines for concept recognition by three systems on eight biomedical ontologies are established (F-measures range from 0.14 to 0.83). Of the three systems we tested, ConceptMapper is generally the best-performing system; it produces the highest F-measure for seven of the eight ontologies. Default parameters are not ideal for most systems on most ontologies; by changing parameters, F-measure can be increased by up to 0.4. In addition to the best-performing parameters, suggestions for choosing parameters based on ontology characteristics are presented. PMID:24571547
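
    The F-measures quoted above come from comparing system annotations against gold-standard annotations. The sketch below shows the underlying set-based precision/recall/F1 computation over (start, end, concept) tuples; the spans and identifiers are toy data, not the CRAFT corpus.

    ```python
    # Set-based evaluation of concept recognition: an annotation counts as
    # correct only if span and concept identifier both match the gold standard.
    def prf(gold, predicted):
        gold, predicted = set(gold), set(predicted)
        tp = len(gold & predicted)
        precision = tp / len(predicted) if predicted else 0.0
        recall = tp / len(gold) if gold else 0.0
        f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
        return precision, recall, f1

    gold = [(0, 7, "GO:0006915"), (20, 28, "CHEBI:15377")]
    pred = [(0, 7, "GO:0006915"), (35, 41, "PR:000000001")]
    print(prf(gold, pred))   # (0.5, 0.5, 0.5)
    ```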

  18. RysannMD: A biomedical semantic annotator balancing speed and accuracy.

    PubMed

    Cuzzola, John; Jovanović, Jelena; Bagheri, Ebrahim

    2017-07-01

    Recently, both researchers and practitioners have explored the possibility of semantically annotating large and continuously evolving collections of biomedical texts such as research papers, medical reports, and physician notes in order to enable their efficient and effective management and use in clinical practice or research laboratories. Such annotations can be automatically generated by biomedical semantic annotators - tools that are specifically designed for detecting and disambiguating biomedical concepts mentioned in text. The biomedical community has already presented several solid automated semantic annotators. However, the existing tools are either strong in their disambiguation capacity, i.e., the ability to identify the correct biomedical concept for a given piece of text among several candidate concepts, or they excel in their processing time, i.e., work very efficiently, but none of the semantic annotation tools reported in the literature has both of these qualities. In this paper, we present RysannMD (Ryerson Semantic Annotator for Medical Domain), a biomedical semantic annotation tool that strikes a balance between processing time and performance while disambiguating biomedical terms. In other words, RysannMD provides reasonable disambiguation performance when choosing the right sense for a biomedical term in a given context, and does that in a reasonable time. To examine how RysannMD stands with respect to state-of-the-art biomedical semantic annotators, we have conducted a series of experiments using standard benchmarking corpora, including both gold and silver standards, and four modern biomedical semantic annotators, namely cTAKES, MetaMap, NOBLE Coder, and Neji. The annotators were compared with respect to the quality of the produced annotations measured against gold and silver standards using precision, recall, and F1 measure and speed, i.e., processing time. In the experiments, RysannMD achieved the best median F1 measure across the

  19. [Open access: an opportunity for biomedical research].

    PubMed

    Duchange, Nathalie; Autard, Delphine; Pinhas, Nicole

    2008-01-01

    Open access within the scientific community depends on the scientific context and the practices of the field. In the biomedical domain, the communication of research results is characterised by the importance of the peer reviewing process, the existence of a hierarchy among journals and the transfer of copyright to the editor. Biomedical publishing has become a lucrative market and the growth of electronic journals has not helped lower the costs. Indeed, it is difficult for today's public institutions to gain access to all the scientific literature. Open access is thus imperative, as demonstrated through the positions taken by a growing number of research funding bodies, the development of open access journals and efforts made in promoting open archives. This article describes the setting up of an Inserm portal for publication in the context of the French national protocol for open-access self-archiving and in an international context.

  20. The BioScope corpus: biomedical texts annotated for uncertainty, negation and their scopes.

    PubMed

    Vincze, Veronika; Szarvas, György; Farkas, Richárd; Móra, György; Csirik, János

    2008-11-19

    Detecting uncertain and negative assertions is essential in most BioMedical Text Mining tasks where, in general, the aim is to derive factual knowledge from textual data. This article reports on a corpus annotation project that has produced a freely available resource for research on handling negation and uncertainty in biomedical texts (we call this corpus the BioScope corpus). The corpus consists of three parts, namely medical free texts, biological full papers and biological scientific abstracts. The dataset contains annotations at the token level for negative and speculative keywords and at the sentence level for their linguistic scope. The annotation process was carried out by two independent linguist annotators and a chief linguist, also responsible for setting up the annotation guidelines, who resolved cases where the annotators disagreed. The resulting corpus consists of more than 20,000 sentences that were considered for annotation, and over 10% of them actually contain one or more linguistic annotations suggesting negation or uncertainty. Statistics are reported on corpus size, ambiguity levels and the consistency of annotations. The corpus is accessible for academic purposes and is free of charge. Apart from the intended goal of serving as a common resource for the training, testing and comparing of biomedical Natural Language Processing systems, the corpus is also a good resource for the linguistic analysis of scientific and clinical texts.
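
    To make the token-level cue / sentence-level scope idea concrete, the toy sketch below marks negation and speculation cues and attaches a crude scope. The cue lists and the "scope runs to the end of the sentence" heuristic are simplifying assumptions for illustration, not the BioScope guidelines or annotation tool.

    ```python
    # Toy illustration of cue/scope annotations of the kind BioScope provides.
    # Cue lists and the scope heuristic are assumptions, not the corpus guidelines.
    NEGATION_CUES = {"no", "not", "without", "absence"}
    SPECULATION_CUES = {"may", "might", "suggest", "possible"}

    def annotate_sentence(sentence):
        tokens = sentence.rstrip(".").split()
        annotations = []
        for i, tok in enumerate(tokens):
            low = tok.lower()
            if low in NEGATION_CUES or low in SPECULATION_CUES:
                kind = "negation" if low in NEGATION_CUES else "speculation"
                annotations.append({"cue": tok, "type": kind,
                                    "scope": " ".join(tokens[i:])})  # cue to sentence end
        return annotations

    print(annotate_sentence("These findings suggest that PCR is not required."))
    ```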

  21. The BioScope corpus: biomedical texts annotated for uncertainty, negation and their scopes

    PubMed Central

    Vincze, Veronika; Szarvas, György; Farkas, Richárd; Móra, György; Csirik, János

    2008-01-01

    Background Detecting uncertain and negative assertions is essential in most BioMedical Text Mining tasks where, in general, the aim is to derive factual knowledge from textual data. This article reports on a corpus annotation project that has produced a freely available resource for research on handling negation and uncertainty in biomedical texts (we call this corpus the BioScope corpus). Results The corpus consists of three parts, namely medical free texts, biological full papers and biological scientific abstracts. The dataset contains annotations at the token level for negative and speculative keywords and at the sentence level for their linguistic scope. The annotation process was carried out by two independent linguist annotators and a chief linguist – also responsible for setting up the annotation guidelines – who resolved cases where the annotators disagreed. The resulting corpus consists of more than 20,000 sentences that were considered for annotation, and over 10% of them actually contain one or more linguistic annotations suggesting negation or uncertainty. Conclusion Statistics are reported on corpus size, ambiguity levels and the consistency of annotations. The corpus is accessible for academic purposes and is free of charge. Apart from the intended goal of serving as a common resource for the training, testing and comparing of biomedical Natural Language Processing systems, the corpus is also a good resource for the linguistic analysis of scientific and clinical texts. PMID:19025695

  22. Composite annotations: requirements for mapping multiscale data and models to biomedical ontologies

    PubMed Central

    Cook, Daniel L.; Mejino, Jose L. V.; Neal, Maxwell L.; Gennari, John H.

    2009-01-01

    Current methods for annotating biomedical data resources rely on simple mappings between data elements and the contents of a variety of biomedical ontologies and controlled vocabularies. Here we point out that such simple mappings are inadequate for large-scale multiscale, multidomain integrative “virtual human” projects. For such integrative challenges, we describe a “composite annotation” schema that is simple yet sufficiently extensible for mapping the biomedical content of a variety of data sources and biosimulation models to available biomedical ontologies. PMID:19964601

  23. Biomedical article retrieval using multimodal features and image annotations in region-based CBIR

    NASA Astrophysics Data System (ADS)

    You, Daekeun; Antani, Sameer; Demner-Fushman, Dina; Rahman, Md Mahmudur; Govindaraju, Venu; Thoma, George R.

    2010-01-01

    Biomedical images are invaluable in establishing diagnosis, acquiring technical skills, and implementing best practices in many areas of medicine. At present, images needed for instructional purposes or in support of clinical decisions appear in specialized databases and in biomedical articles, and are often not easily accessible to retrieval tools. Our goal is to automatically annotate images extracted from scientific publications with respect to their usefulness for clinical decision support and instructional purposes, and project the annotations onto images stored in databases by linking images through content-based image similarity. Authors often use text labels and pointers overlaid on figures and illustrations in the articles to highlight regions of interest (ROI). These annotations are then referenced in the caption text or figure citations in the article text. In previous research we have developed two methods (a heuristic method and a dynamic time warping-based method) for localizing and recognizing such pointers on biomedical images. In this work, we add robustness to our previous efforts by using a machine learning based approach to localizing and recognizing the pointers. Identifying these can assist in extracting relevant image content at regions within the image that are likely to be highly relevant to the discussion in the article text. Image regions can then be annotated using biomedical concepts from extracted snippets of text pertaining to images in scientific biomedical articles that are identified using National Library of Medicine's Unified Medical Language System® (UMLS) Metathesaurus. The resulting regional annotation and extracted image content are then used as indices for biomedical article retrieval using the multimodal features and region-based content-based image retrieval (CBIR) techniques. The hypothesis that such an approach would improve biomedical document retrieval is validated through experiments on an expert-marked biomedical article

  24. Annotating genes and genomes with DNA sequences extracted from biomedical articles

    PubMed Central

    Haeussler, Maximilian; Gerner, Martin; Bergman, Casey M.

    2011-01-01

    Motivation: Increasing rates of publication and DNA sequencing make the problem of finding relevant articles for a particular gene or genomic region more challenging than ever. Existing text-mining approaches focus on finding gene names or identifiers in English text. These are often not unique and do not identify the exact genomic location of a study. Results: Here, we report the results of a novel text-mining approach that extracts DNA sequences from biomedical articles and automatically maps them to genomic databases. We find that ∼20% of open access articles in PubMed Central (PMC) have extractable DNA sequences that can be accurately mapped to the correct gene (91%) and genome (96%). We illustrate the utility of data extracted by text2genome from more than 150 000 PMC articles for the interpretation of ChIP-seq data and the design of quantitative reverse transcriptase (RT)-PCR experiments. Conclusion: Our approach links articles to genes and organisms without relying on gene names or identifiers. It also produces genome annotation tracks of the biomedical literature, thereby allowing researchers to use the power of modern genome browsers to access and analyze publications in the context of genomic data. Availability and implementation: Source code is available under a BSD license from http://sourceforge.net/projects/text2genome/ and results can be browsed and downloaded at http://text2genome.org. Contact: maximilianh@gmail.com Supplementary information: Supplementary data are available at Bioinformatics online. PMID:21325301
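
    The first step of the text2genome idea above is pulling DNA-like strings out of article text; the subsequent alignment of those strings to a genome (e.g., with BLAT or BLAST) is omitted here. The minimum-length threshold in the sketch is an assumption for illustration, not the authors' pipeline setting.

    ```python
    # Sketch: extract candidate DNA sequences (contiguous runs of A/C/G/T) from
    # article text. The length threshold of 18 nt is an illustrative assumption.
    import re

    DNA_PATTERN = re.compile(r"\b[ACGTacgt]{18,}\b")

    def extract_dna(text):
        return [m.group(0).upper() for m in DNA_PATTERN.finditer(text)]

    paragraph = ("The forward primer ACGTGCTAGCTAGGCTAACGT was used for "
                 "amplification, whereas CAT and TAG are ordinary words.")
    print(extract_dna(paragraph))
    ```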

  25. Quantitative biomedical annotation using medical subject heading over-representation profiles (MeSHOPs).

    PubMed

    Cheung, Warren A; Ouellette, B F Francis; Wasserman, Wyeth W

    2012-09-27

    MEDLINE®/PubMed® indexes over 20 million biomedical articles, providing curated annotation of its contents using a controlled vocabulary known as Medical Subject Headings (MeSH). The MeSH vocabulary, developed over 50+ years, provides a broad coverage of topics across biomedical research. Distilling the essential biomedical themes for a topic of interest from the relevant literature is important to both understand the importance of related concepts and discover new relationships. We introduce a novel method for determining enriched curator-assigned MeSH annotations in a set of papers associated with a topic, such as a gene, an author or a disease. We generate MeSH Over-representation Profiles (MeSHOPs) to quantitatively summarize the annotations in a form convenient for further computational analysis and visualization. Based on a hypergeometric distribution of assigned terms, MeSHOPs statistically account for the prevalence of the associated biomedical annotation while highlighting unusually prevalent terms based on a specified background. MeSHOPs can be visualized using word clouds, providing a succinct quantitative graphical representation of the relative importance of terms. Using the publication dates of articles, MeSHOPs track changing patterns of annotation over time. Since MeSHOPs are quantitative vectors, MeSHOPs can be compared using standard techniques such as hierarchical clustering. The reliability of MeSHOP annotations is assessed based on the capacity to re-derive the subset of the Gene Ontology annotations with equivalent MeSH terms. MeSHOPs allow quantitative measurement of the degree of association between any entity and the annotated medical concepts, based directly on relevant primary literature. Comparison of MeSHOPs allows entities to be related based on shared medical themes in their literature. A web interface is provided for generating and visualizing MeSHOPs.
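
    The over-representation test behind a MeSHOP asks, for one MeSH term, how surprising its frequency in the topic's article set is relative to a background set, using the hypergeometric distribution. The sketch below shows that single-term calculation with SciPy; the counts are invented for illustration.

    ```python
    # One-tailed hypergeometric enrichment p-value for a single MeSH term.
    from scipy.stats import hypergeom

    def mesh_term_pvalue(k, n, K, N):
        """k: topic-set articles tagged with the term
           n: articles in the topic set
           K: background articles tagged with the term
           N: total background articles"""
        # P(X >= k) for X ~ Hypergeom(population N, successes K, draws n)
        return hypergeom.sf(k - 1, N, K, n)

    print(mesh_term_pvalue(k=40, n=200, K=2000, N=100000))
    ```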

  26. Quantitative biomedical annotation using medical subject heading over-representation profiles (MeSHOPs)

    PubMed Central

    2012-01-01

    Background MEDLINE®/PubMed® indexes over 20 million biomedical articles, providing curated annotation of its contents using a controlled vocabulary known as Medical Subject Headings (MeSH). The MeSH vocabulary, developed over 50+ years, provides a broad coverage of topics across biomedical research. Distilling the essential biomedical themes for a topic of interest from the relevant literature is important to both understand the importance of related concepts and discover new relationships. Results We introduce a novel method for determining enriched curator-assigned MeSH annotations in a set of papers associated with a topic, such as a gene, an author or a disease. We generate MeSH Over-representation Profiles (MeSHOPs) to quantitatively summarize the annotations in a form convenient for further computational analysis and visualization. Based on a hypergeometric distribution of assigned terms, MeSHOPs statistically account for the prevalence of the associated biomedical annotation while highlighting unusually prevalent terms based on a specified background. MeSHOPs can be visualized using word clouds, providing a succinct quantitative graphical representation of the relative importance of terms. Using the publication dates of articles, MeSHOPs track changing patterns of annotation over time. Since MeSHOPs are quantitative vectors, MeSHOPs can be compared using standard techniques such as hierarchical clustering. The reliability of MeSHOP annotations is assessed based on the capacity to re-derive the subset of the Gene Ontology annotations with equivalent MeSH terms. Conclusions MeSHOPs allow quantitative measurement of the degree of association between any entity and the annotated medical concepts, based directly on relevant primary literature. Comparison of MeSHOPs allows entities to be related based on shared medical themes in their literature. A web interface is provided for generating and visualizing MeSHOPs. PMID:23017167

  27. Open Biomedical Engineering education in Africa.

    PubMed

    Ahluwalia, Arti; Atwine, Daniel; De Maria, Carmelo; Ibingira, Charles; Kipkorir, Emmauel; Kiros, Fasil; Madete, June; Mazzei, Daniele; Molyneux, Elisabeth; Moonga, Kando; Moshi, Mainen; Nzomo, Martin; Oduol, Vitalice; Okuonzi, John

    2015-08-01

    Despite the virtual revolution, the mainstream academic community in most countries remains largely ignorant of the potential of web-based teaching resources and of the expansion of open source software, hardware and rapid prototyping. In the context of Biomedical Engineering (BME), where human safety and wellbeing are paramount, a high level of supervision and quality control is required before open source concepts can be embraced by universities and integrated into the curriculum. In the meantime, students, more than their teachers, have become attuned to continuous streams of digital information, and teaching methods need to adapt rapidly by giving them the skills to filter meaningful information and by supporting collaboration and co-construction of knowledge using open, cloud-based and crowd-based technology. In this paper we present our experience in bringing these concepts to university education in Africa, as a way of enabling rapid development and self-sufficiency in health care. We describe the three summer schools held in sub-Saharan Africa where both students and teachers embraced the philosophy of open BME education with enthusiasm, and discuss the advantages and disadvantages of opening education in this way in the developing and developed world.

  28. Coreference annotation and resolution in the Colorado Richly Annotated Full Text (CRAFT) corpus of biomedical journal articles.

    PubMed

    Cohen, K Bretonnel; Lanfranchi, Arrick; Choi, Miji Joo-Young; Bada, Michael; Baumgartner, William A; Panteleyeva, Natalya; Verspoor, Karin; Palmer, Martha; Hunter, Lawrence E

    2017-08-17

    Coreference resolution is the task of finding strings in text that have the same referent as other strings. Failures of coreference resolution are a common cause of false negatives in information extraction from the scientific literature. In order to better understand the nature of the phenomenon of coreference in biomedical publications and to increase performance on the task, we annotated the Colorado Richly Annotated Full Text (CRAFT) corpus with coreference relations. The corpus was manually annotated with coreference relations, including identity and appositives for all coreferring base noun phrases. The OntoNotes annotation guidelines, with minor adaptations, were used. Interannotator agreement ranges from 0.480 (entity-based CEAF) to 0.858 (Class-B3), depending on the metric that is used to assess it. The resulting corpus adds nearly 30,000 annotations to the previous release of the CRAFT corpus. Differences from related projects include a much broader definition of markables, connection to extensive annotation of several domain-relevant semantic classes, and connection to complete syntactic annotation. Tool performance was benchmarked on the data. A publicly available out-of-the-box, general-domain coreference resolution system achieved an F-measure of 0.14 (B3), while a simple domain-adapted rule-based system achieved an F-measure of 0.42. An ensemble of the two reached F of 0.46. Following the IDENTITY chains in the data would add 106,263 additional named entities in the full 97-paper corpus, for an increase of 76% in the semantic classes of the eight ontologies that have been annotated in earlier versions of the CRAFT corpus. The project produced a large data set for further investigation of coreference and coreference resolution in the scientific literature. The work raised issues in the phenomenon of reference in this domain and genre, and the paper proposes that many mentions that would be considered generic in the general domain are not
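
    The B3 (B-cubed) scores quoted above average, over mentions, the precision and recall of the overlap between the key chain and the response chain containing each mention. The sketch below computes that metric on toy chains, which are invented data, not CRAFT annotations.

    ```python
    # Sketch of B-cubed precision/recall for coreference chains (sets of mention ids).
    # Mentions absent from a chain set are treated as singletons.
    def b_cubed(key_chains, response_chains):
        def chain_of(mention, chains):
            return next((c for c in chains if mention in c), {mention})
        mentions = set().union(*key_chains)
        precision = recall = 0.0
        for m in mentions:
            k, r = chain_of(m, key_chains), chain_of(m, response_chains)
            overlap = len(k & r)
            precision += overlap / len(r)
            recall += overlap / len(k)
        n = len(mentions)
        return precision / n, recall / n

    key = [{"m1", "m2", "m3"}, {"m4"}]
    resp = [{"m1", "m2"}, {"m3", "m4"}]
    print(b_cubed(key, resp))   # (0.75, 0.667)
    ```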

  29. Annotating the biomedical literature for the human variome

    PubMed Central

    Verspoor, Karin; Jimeno Yepes, Antonio; Cavedon, Lawrence; McIntosh, Tara; Herten-Crabb, Asha; Thomas, Zoë; Plazzer, John-Paul

    2013-01-01

    This article introduces the Variome Annotation Schema, a schema that aims to capture the core concepts and relations relevant to cataloguing and interpreting human genetic variation and its relationship to disease, as described in the published literature. The schema was inspired by the needs of the database curators of the International Society for Gastrointestinal Hereditary Tumours (InSiGHT) database, but is intended to have application to genetic variation information in a range of diseases. The schema has been applied to a small corpus of full text journal publications on the subject of inherited colorectal cancer. We show that the inter-annotator agreement on annotation of this corpus ranges from 0.78 to 0.95 F-score across different entity types when exact matching is measured, and improves to a minimum F-score of 0.87 when boundary matching is relaxed. Relations show more variability in agreement, but several are reliable, with the highest, cohort-has-size, reaching 0.90 F-score. We also explore the relevance of the schema to the InSiGHT database curation process. The schema and the corpus represent an important new resource for the development of text mining solutions that address relationships among patient cohorts, disease and genetic variation, and therefore, we also discuss the role text mining might play in the curation of information related to the human variome. The corpus is available at http://opennicta.com/home/health/variome. PMID:23584833

  30. Open Biomedical Ontology-based Medline exploration

    PubMed Central

    Xuan, Weijian; Dai, Manhong; Mirel, Barbara; Song, Jean; Athey, Brian; Watson, Stanley J; Meng, Fan

    2009-01-01

    Background Effective Medline database exploration is critical for the understanding of high throughput experimental results and the development of novel hypotheses about the mechanisms underlying the targeted biological processes. While existing solutions enhance Medline exploration through different approaches such as document clustering, network presentations of underlying conceptual relationships and the mapping of search results to MeSH and Gene Ontology trees, we believe the use of multiple ontologies from the Open Biomedical Ontology can greatly help researchers to explore literature from different perspectives as well as to quickly locate the most relevant Medline records for further investigation. Results We developed an ontology-based interactive Medline exploration solution called PubOnto to enable the interactive exploration and filtering of search results through the use of multiple ontologies from the OBO foundry. The PubOnto program is a rich internet application based on the FLEX platform. It contains a number of interactive tools, visualization capabilities, an open service architecture, and a customizable user interface. It is freely accessible at: . PMID:19426463

  31. Open access to biomedical engineering publications.

    PubMed

    Flexman, Jennifer A

    2008-01-01

    Scientific research is disseminated within the community and to the public in part through journals. Most scientific journals, in turn, protect the manuscript through copyright and recover their costs by charging subscription fees to individuals and institutions. This revenue stream is used to support the management of the journal and, in some cases, professional activities of the sponsoring society such as the Institute of Electrical and Electronics Engineers (IEEE). For example, the IEEE Engineering in Medicine and Biology Society (EMBS) manages seven academic publications representing the various areas of biomedical engineering. New business models have been proposed to distribute journal articles free of charge, either immediately or after a delay, to enable a greater dissemination of knowledge to both the public and the scientific community. However, publication costs must be recovered, likely at a higher cost to the manuscript authors. While there is little doubt that the foundations of scientific publication will change, the specifics and implications of an open source framework must be discussed.

  32. Generation of Silver Standard Concept Annotations from Biomedical Texts with Special Relevance to Phenotypes

    PubMed Central

    Oellrich, Anika; Collier, Nigel; Smedley, Damian; Groza, Tudor

    2015-01-01

    Electronic health records and scientific articles possess differing linguistic characteristics that may impact the performance of natural language processing tools developed for one or the other. In this paper, we investigate the performance of four extant concept recognition tools: the clinical Text Analysis and Knowledge Extraction System (cTAKES), the National Center for Biomedical Ontology (NCBO) Annotator, the Biomedical Concept Annotation System (BeCAS) and MetaMap. Each of the four concept recognition systems is applied to four different corpora: the i2b2 corpus of clinical documents, a PubMed corpus of Medline abstracts, a clinical trials corpus and the ShARe/CLEF corpus. In addition, we assess the individual system performances with respect to one gold standard annotation set, available for the ShARe/CLEF corpus. Furthermore, we built a silver standard annotation set from the individual systems’ output and assess the quality as well as the contribution of individual systems to the quality of the silver standard. Our results demonstrate that mainly the NCBO annotator and cTAKES contribute to the silver standard corpora (F1-measures in the range of 21% to 74%) and their quality (best F1-measure of 33%), independent from the type of text investigated. While BeCAS and MetaMap can contribute to the precision of silver standard annotations (precision of up to 42%), the F1-measure drops when combined with NCBO Annotator and cTAKES due to a low recall. In conclusion, the performances of individual systems need to be improved independently from the text types, and the leveraging strategies to best take advantage of individual systems’ annotations need to be revised. The textual content of the PubMed corpus, accession numbers for the clinical trials corpus, and assigned annotations of the four concept recognition systems as well as the generated silver standard annotation sets are available from http://purl.org/phenotype/resources. The textual content of the Sh
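
    One common way to build a silver standard from several systems' outputs is agreement voting: keep an annotation that at least a minimum number of systems produced. The sketch below shows that idea on toy annotation sets; the agreement threshold and exact-span matching are assumptions for illustration, not necessarily the harmonisation strategy used in the paper.

    ```python
    # Sketch: majority-vote silver standard over (start, end, concept) annotations.
    from collections import Counter

    def silver_standard(system_outputs, min_votes=2):
        """system_outputs: list of sets of (start, end, concept_id), one per system."""
        votes = Counter(ann for output in system_outputs for ann in set(output))
        return {ann for ann, count in votes.items() if count >= min_votes}

    ctakes_out  = {(0, 8, "C0011849"), (15, 23, "C0020538")}
    ncbo_out    = {(0, 8, "C0011849"), (30, 36, "C0027051")}
    metamap_out = {(0, 8, "C0011849"), (15, 23, "C0020538")}
    print(silver_standard([ctakes_out, ncbo_out, metamap_out]))
    ```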

  33. Generation of silver standard concept annotations from biomedical texts with special relevance to phenotypes.

    PubMed

    Oellrich, Anika; Collier, Nigel; Smedley, Damian; Groza, Tudor

    2015-01-01

    Electronic health records and scientific articles possess differing linguistic characteristics that may impact the performance of natural language processing tools developed for one or the other. In this paper, we investigate the performance of four extant concept recognition tools: the clinical Text Analysis and Knowledge Extraction System (cTAKES), the National Center for Biomedical Ontology (NCBO) Annotator, the Biomedical Concept Annotation System (BeCAS) and MetaMap. Each of the four concept recognition systems is applied to four different corpora: the i2b2 corpus of clinical documents, a PubMed corpus of Medline abstracts, a clinical trials corpus and the ShARe/CLEF corpus. In addition, we assess the individual system performances with respect to one gold standard annotation set, available for the ShARe/CLEF corpus. Furthermore, we built a silver standard annotation set from the individual systems' output and assess the quality as well as the contribution of individual systems to the quality of the silver standard. Our results demonstrate that mainly the NCBO annotator and cTAKES contribute to the silver standard corpora (F1-measures in the range of 21% to 74%) and their quality (best F1-measure of 33%), independent from the type of text investigated. While BeCAS and MetaMap can contribute to the precision of silver standard annotations (precision of up to 42%), the F1-measure drops when combined with NCBO Annotator and cTAKES due to a low recall. In conclusion, the performances of individual systems need to be improved independently from the text types, and the leveraging strategies to best take advantage of individual systems' annotations need to be revised. The textual content of the PubMed corpus, accession numbers for the clinical trials corpus, and assigned annotations of the four concept recognition systems as well as the generated silver standard annotation sets are available from http://purl.org/phenotype/resources. The textual content of the Sh

  14. Recommending MeSH terms for annotating biomedical articles.

    PubMed

    Huang, Minlie; Névéol, Aurélie; Lu, Zhiyong

    2011-01-01

    Due to the high cost of manual curation of key aspects from the scientific literature, automated methods for assisting this process are greatly desired. Here, we report a novel approach to facilitate MeSH indexing, a challenging task of assigning MeSH terms to MEDLINE citations for their archiving and retrieval. Unlike previous methods for automatic MeSH term assignment, we reformulate the indexing task as a ranking problem such that relevant MeSH headings are ranked higher than irrelevant ones. Specifically, for each document we retrieve 20 neighbor documents, obtain a list of MeSH main headings from neighbors, and rank the MeSH main headings using ListNet, a learning-to-rank algorithm. We trained our algorithm on 200 documents and tested on a previously used benchmark set of 200 documents and a larger dataset of 1000 documents. Tested on the benchmark dataset, our method achieved a precision of 0.390, recall of 0.712, and mean average precision (MAP) of 0.626. In comparison to the state of the art, we observe statistically significant improvements as large as 39% in MAP (p-value <0.001). Similar significant improvements were also obtained on the larger document set. Experimental results show that our approach makes the most accurate MeSH predictions to date, which suggests its great potential in making a practical impact on MeSH indexing. Furthermore, as discussed, the proposed learning framework is robust and can be adapted to many other similar tasks beyond MeSH indexing in the biomedical domain. All data sets are available at: http://www.ncbi.nlm.nih.gov/CBBresearch/Lu/indexing.

  15. Recommending MeSH terms for annotating biomedical articles

    PubMed Central

    Huang, Minlie; Névéol, Aurélie

    2011-01-01

    Background Due to the high cost of manual curation of key aspects from the scientific literature, automated methods for assisting this process are greatly desired. Here, we report a novel approach to facilitate MeSH indexing, a challenging task of assigning MeSH terms to MEDLINE citations for their archiving and retrieval. Methods Unlike previous methods for automatic MeSH term assignment, we reformulate the indexing task as a ranking problem such that relevant MeSH headings are ranked higher than irrelevant ones. Specifically, for each document we retrieve 20 neighbor documents, obtain a list of MeSH main headings from neighbors, and rank the MeSH main headings using ListNet, a learning-to-rank algorithm. We trained our algorithm on 200 documents and tested on a previously used benchmark set of 200 documents and a larger dataset of 1000 documents. Results Tested on the benchmark dataset, our method achieved a precision of 0.390, recall of 0.712, and mean average precision (MAP) of 0.626. In comparison to the state of the art, we observe statistically significant improvements as large as 39% in MAP (p-value <0.001). Similar significant improvements were also obtained on the larger document set. Conclusion Experimental results show that our approach makes the most accurate MeSH predictions to date, which suggests its great potential in making a practical impact on MeSH indexing. Furthermore, as discussed, the proposed learning framework is robust and can be adapted to many other similar tasks beyond MeSH indexing in the biomedical domain. All data sets are available at: http://www.ncbi.nlm.nih.gov/CBBresearch/Lu/indexing. PMID:21613640
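
    As a rough illustration of the neighbor-based idea shared by the two records above, the sketch below scores candidate MeSH headings by the summed retrieval similarity of the neighbor documents that carry them. This similarity-weighted vote is only a baseline stand-in; the paper instead learns the ranking with ListNet, and all neighbor data shown are invented.

        from collections import Counter

        def rank_mesh_candidates(neighbor_headings, neighbor_scores):
            """Score each candidate MeSH heading by the summed similarity of the
            neighbor documents that carry it. This weighted vote is a simple baseline;
            the paper learns the ranking with ListNet instead."""
            scores = Counter()
            for headings, sim in zip(neighbor_headings, neighbor_scores):
                for heading in headings:
                    scores[heading] += sim
            return [heading for heading, _ in scores.most_common()]

        # Hypothetical neighbors: their MeSH headings and retrieval similarities.
        neighbors = [
            (["Humans", "Neoplasms", "Gene Expression Profiling"], 0.82),
            (["Humans", "Neoplasms", "Prognosis"], 0.75),
            (["Mice", "Gene Expression Profiling"], 0.40),
        ]
        headings = [h for h, _ in neighbors]
        sims = [s for _, s in neighbors]
        print(rank_mesh_candidates(headings, sims)[:3])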

  16. Annotating image ROIs with text descriptions for multimodal biomedical document retrieval

    NASA Astrophysics Data System (ADS)

    You, Daekeun; Simpson, Matthew; Antani, Sameer; Demner-Fushman, Dina; Thoma, George R.

    2013-01-01

    Regions of interest (ROIs) that are pointed to by overlaid markers (arrows, asterisks, etc.) in biomedical images are expected to contain more important and relevant information than other regions for biomedical article indexing and retrieval. We have developed several algorithms that localize and extract the ROIs by recognizing markers on images. Cropped ROIs then need to be annotated with contents describing them best. In most cases accurate textual descriptions of the ROIs can be found from figure captions, and these need to be combined with image ROIs for annotation. The annotated ROIs can then be used to, for example, train classifiers that separate ROIs into known categories (medical concepts), or to build visual ontologies, for indexing and retrieval of biomedical articles. We propose an algorithm that pairs visual and textual ROIs that are extracted from images and figure captions, respectively. This algorithm based on dynamic time warping (DTW) clusters recognized pointers into groups, each of which contains pointers with identical visual properties (shape, size, color, etc.). Then a rule-based matching algorithm finds the best matching group for each textual ROI mention. Our method yields a precision and recall of 96% and 79%, respectively, when ground truth textual ROI data is used.
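
    A deliberately simplified sketch of the pairing step: recognized pointers are grouped by identical visual properties and each textual ROI mention from the caption is matched to a group by simple rules. The real system clusters shape profiles with dynamic time warping; the exact-attribute grouping and the toy pointers and mentions below are assumptions made purely for illustration.

        from collections import defaultdict

        def group_pointers(pointers):
            """Group recognized pointers by identical visual properties.
            The paper clusters shape profiles with dynamic time warping; grouping by
            exact (shape, color) attributes is a deliberately simplified stand-in."""
            groups = defaultdict(list)
            for p in pointers:
                groups[(p["shape"], p["color"])].append(p)
            return groups

        def match_caption_mentions(mentions, groups):
            """Rule-based matching: a mention such as 'white arrow' is paired with the
            pointer group whose shape and color both appear in the mention text."""
            pairs = {}
            for mention in mentions:
                for (shape, color), members in groups.items():
                    if shape in mention and color in mention:
                        pairs[mention] = members
                        break
            return pairs

        pointers = [
            {"shape": "arrow", "color": "white", "bbox": (40, 60, 55, 80)},
            {"shape": "arrow", "color": "white", "bbox": (120, 30, 135, 50)},
            {"shape": "asterisk", "color": "black", "bbox": (200, 90, 210, 100)},
        ]
        mentions = ["white arrow indicates the lesion", "black asterisk marks the cyst"]
        print(match_caption_mentions(mentions, group_pointers(pointers)))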

  17. A Bayesian network coding scheme for annotating biomedical information presented to genetic counseling clients.

    PubMed

    Green, Nancy

    2005-04-01

    We developed a Bayesian network coding scheme for annotating biomedical content in layperson-oriented clinical genetics documents. The coding scheme supports the representation of probabilistic and causal relationships among concepts in this domain, at a high enough level of abstraction to capture commonalities among genetic processes and their relationship to health. We are using the coding scheme to annotate a corpus of genetic counseling patient letters as part of the requirements analysis and knowledge acquisition phase of a natural language generation project. This paper describes the coding scheme and presents an evaluation of intercoder reliability for its tag set. In addition to giving examples of use of the coding scheme for analysis of discourse and linguistic features in this genre, we suggest other uses for it in analysis of layperson-oriented text and dialogue in medical communication.

  18. A Maximum-Entropy approach for accurate document annotation in the biomedical domain

    PubMed Central

    2012-01-01

    The increasing amount of scientific literature on the Web and the absence of efficient tools for classifying and searching the documents are the two most important factors that influence the speed of the search and the quality of the results. Previous studies have shown that the usage of ontologies makes it possible to process document and query information at the semantic level, which greatly improves the search for relevant information and moves one step further towards the Semantic Web. A fundamental step in these approaches is the annotation of documents with ontology concepts, which can also be seen as a classification task. In this paper we address this issue for the biomedical domain and present a new automated and robust method, based on a Maximum Entropy approach, for annotating biomedical literature documents with terms from the Medical Subject Headings (MeSH). The experimental evaluation shows that the suggested Maximum Entropy approach for annotating biomedical documents with MeSH terms is highly accurate, robust to the ambiguity of terms, and can provide very good performance even when a very small number of training documents is used. More precisely, we show that the proposed algorithm obtained an average F-measure of 92.4% (precision 99.41%, recall 86.77%) for the full range of the explored terms (4,078 MeSH terms), and that the algorithm’s performance is resilient to terms’ ambiguity, achieving an average F-measure of 92.42% (precision 99.32%, recall 86.87%) in the explored MeSH terms which were found to be ambiguous according to the Unified Medical Language System (UMLS) thesaurus. Finally, we compared the results of the suggested methodology with a Naive Bayes and a Decision Trees classification approach, and we show that the Maximum Entropy based approach performed with higher F-Measure in both ambiguous and monosemous MeSH terms. PMID:22541593

  19. A Maximum-Entropy approach for accurate document annotation in the biomedical domain.

    PubMed

    Tsatsaronis, George; Macari, Natalia; Torge, Sunna; Dietze, Heiko; Schroeder, Michael

    2012-04-24

    The increasing amount of scientific literature on the Web and the absence of efficient tools for classifying and searching the documents are the two most important factors that influence the speed of the search and the quality of the results. Previous studies have shown that the usage of ontologies makes it possible to process document and query information at the semantic level, which greatly improves the search for relevant information and moves one step further towards the Semantic Web. A fundamental step in these approaches is the annotation of documents with ontology concepts, which can also be seen as a classification task. In this paper we address this issue for the biomedical domain and present a new automated and robust method, based on a Maximum Entropy approach, for annotating biomedical literature documents with terms from the Medical Subject Headings (MeSH). The experimental evaluation shows that the suggested Maximum Entropy approach for annotating biomedical documents with MeSH terms is highly accurate, robust to the ambiguity of terms, and can provide very good performance even when a very small number of training documents is used. More precisely, we show that the proposed algorithm obtained an average F-measure of 92.4% (precision 99.41%, recall 86.77%) for the full range of the explored terms (4,078 MeSH terms), and that the algorithm's performance is resilient to terms' ambiguity, achieving an average F-measure of 92.42% (precision 99.32%, recall 86.87%) in the explored MeSH terms which were found to be ambiguous according to the Unified Medical Language System (UMLS) thesaurus. Finally, we compared the results of the suggested methodology with a Naive Bayes and a Decision Trees classification approach, and we show that the Maximum Entropy based approach performed with higher F-Measure in both ambiguous and monosemous MeSH terms.
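
    Since a Maximum Entropy text classifier corresponds to (multinomial) logistic regression over document features, the approach can be sketched as one binary logistic-regression model per MeSH term, as below. The feature setup, training documents and labels are toy placeholders, and scikit-learn is assumed; this is not the authors' implementation.

        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.linear_model import LogisticRegression
        from sklearn.pipeline import make_pipeline

        # One binary maximum-entropy classifier per MeSH term (logistic regression is
        # the standard MaxEnt formulation for this kind of document classification).
        # The documents and labels below are toy placeholders.
        docs = [
            "glucose metabolism and insulin resistance in type 2 diabetes",
            "randomized trial of statin therapy after myocardial infarction",
            "insulin signalling pathways in hepatic tissue",
            "coronary artery stenting outcomes in elderly patients",
        ]
        has_term = [1, 0, 1, 0]   # gold labels for a single MeSH term, e.g. "Insulin"

        model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                              LogisticRegression(max_iter=1000))
        model.fit(docs, has_term)
        print(model.predict(["insulin receptor expression in muscle"]))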

  20. ProteoAnnotator--open source proteogenomics annotation software supporting PSI standards.

    PubMed

    Ghali, Fawaz; Krishna, Ritesh; Perkins, Simon; Collins, Andrew; Xia, Dong; Wastling, Jonathan; Jones, Andrew R

    2014-12-01

    The recent massive increase in capability for sequencing genomes is producing enormous advances in our understanding of biological systems. However, there is a bottleneck in genome annotation--determining the structure of all transcribed genes. Experimental data from MS studies can play a major role in confirming and correcting gene structure--proteogenomics. However, there are some technical and practical challenges to overcome, since proteogenomics requires pipelines comprising a complex set of interconnected modules as well as bespoke routines, for example in protein inference and statistics. We are introducing a complete, open source pipeline for proteogenomics, called ProteoAnnotator, which incorporates a graphical user interface and implements the Proteomics Standards Initiative mzIdentML standard for each analysis stage. All steps are included as standalone modules with the mzIdentML library, allowing other groups to re-use the whole pipeline or constituent parts within other tools. We have developed new modules for pre-processing and combining multiple search databases, for performing peptide-level statistics on mzIdentML files, for scoring grouped protein identifications matched to a given genomic locus to validate that updates to the official gene models are statistically sound and for mapping end results back onto the genome. ProteoAnnotator is available from http://www.proteoannotator.org/. All MS data have been deposited in the ProteomeXchange with identifiers PXD001042 and PXD001390 (http://proteomecentral.proteomexchange.org/dataset/PXD001042; http://proteomecentral.proteomexchange.org/dataset/PXD001390). © 2014 The Authors. PROTEOMICS Published by WILEY-VCH Verlag GmbH & Co. KGaA.

  1. Integration and Querying of Genomic and Proteomic Semantic Annotations for Biomedical Knowledge Extraction.

    PubMed

    Masseroli, Marco; Canakoglu, Arif; Ceri, Stefano

    2016-01-01

    Understanding complex biological phenomena involves answering complex biomedical questions about multiple kinds of biomolecular information simultaneously, information that is expressed through multiple genomic and proteomic semantic annotations scattered across many distributed and heterogeneous data sources; such heterogeneity and dispersion hamper biologists' ability to ask global queries and perform global evaluations. To overcome this problem, we developed a software architecture to create and maintain a Genomic and Proteomic Knowledge Base (GPKB), which integrates several of the most relevant sources of such dispersed information (including Entrez Gene, UniProt, IntAct, Expasy Enzyme, GO, GOA, BioCyc, KEGG, Reactome, and OMIM). Our solution is general, as it uses a flexible, modular, and multilevel global data schema based on abstraction and generalization of integrated data features, and a set of automatic procedures for easing data integration and maintenance, even when the integrated data sources evolve in data content, structure, and number. These procedures also assure consistency, quality, and provenance tracking of all integrated data, and perform the semantic closure of the hierarchical relationships of the integrated biomedical ontologies. At http://www.bioinformatics.deib.polimi.it/GPKB/, a Web interface allows easy graphical composition of queries, even complex ones, on the knowledge base, and also supports semantic query expansion and comprehensive explorative search of the integrated data to better sustain biomedical knowledge extraction.

  2. Bi-convex Optimization to Learn Classifiers from Multiple Biomedical Annotations

    PubMed Central

    Wang, Xin; Bi, Jinbo

    2016-01-01

    The problem of constructing classifiers from multiple annotators who provide inconsistent training labels is important and occurs in many application domains. Many existing methods focus on the understanding and learning of the crowd behaviors. Several probabilistic algorithms consider the construction of classifiers for specific tasks using a consensus of multiple labelers’ annotations. These methods impose a prior on the consensus and develop an expectation-maximization algorithm based on logistic regression loss. We extend the discussion to the hinge loss commonly used by support vector machines. Our formulations form bi-convex programs that construct classifiers and estimate the reliability of each labeler simultaneously. Each labeler is associated with a reliability parameter, which can be constant, class-dependent, or vary across examples. The hinge loss is modified by replacing the true labels by the weighted combination of labelers’ labels with reliabilities as weights. Statistical justification is discussed to motivate the use of a linear combination of labels. In parallel to the expectation-maximization algorithm for logistic based methods, efficient alternating algorithms are developed to solve the proposed bi-convex programs. Experimental results on benchmark datasets and three real-world biomedical problems demonstrate that the proposed methods either outperform or are competitive to the state of the art. PMID:27295686

  3. Bi-convex Optimization to Learn Classifiers from Multiple Biomedical Annotations.

    PubMed

    Wang, Xin; Bi, Jinbo

    2016-06-07

    The problem of constructing classifiers from multiple annotators who provide inconsistent training labels is important and occurs in many application domains. Many existing methods focus on the understanding and learning of the crowd behaviors. Several probabilistic algorithms consider the construction of classifiers for specific tasks using a consensus of multiple labelers' annotations. These methods impose a prior on the consensus and develop an expectation-maximization algorithm based on logistic regression loss. We extend the discussion to the hinge loss commonly used by support vector machines. Our formulations form bi-convex programs that construct classifiers and estimate the reliability of each labeler simultaneously. Each labeler is associated with a reliability parameter, which can be constant, class-dependent, or vary across examples. The hinge loss is modified by replacing the true labels by the weighted combination of labelers' labels with reliabilities as weights. Statistical justification is discussed to motivate the use of a linear combination of labels. In parallel to the expectation-maximization algorithm for logistic based methods, efficient alternating algorithms are developed to solve the proposed bi-convex programs. Experimental results on benchmark datasets and three real-world biomedical problems demonstrate that the proposed methods either outperform or are competitive to the state of the art.
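
    The alternating structure of such a bi-convex formulation can be sketched as follows: fix the labeler reliabilities to form a weighted consensus label and fit a hinge-loss (linear SVM) classifier, then re-estimate each labeler's reliability from agreement with the current classifier, and repeat. The update rules and toy data below are simplifications in the spirit of the paper, not its exact program; numpy and scikit-learn are assumed.

        import numpy as np
        from sklearn.svm import LinearSVC

        def fit_with_labeler_reliability(X, labels_by_annotator, n_iter=5):
            """Alternating sketch: (1) form a reliability-weighted consensus label and fit a
            hinge-loss (linear SVM) classifier; (2) re-estimate each annotator's reliability
            from agreement with the current classifier. This mirrors the spirit, not the
            exact bi-convex program, of the paper."""
            n_annotators, n_examples = labels_by_annotator.shape
            reliability = np.full(n_annotators, 1.0 / n_annotators)
            clf = LinearSVC()
            for _ in range(n_iter):
                consensus = np.sign(reliability @ labels_by_annotator)   # weighted vote in {-1, +1}
                consensus[consensus == 0] = 1
                clf.fit(X, consensus)
                pred = clf.predict(X)
                agreement = (labels_by_annotator == pred).mean(axis=1)   # per-annotator accuracy proxy
                reliability = agreement / agreement.sum()
            return clf, reliability

        # Toy data: 2 features, 3 annotators with labels in {-1, +1}; the third is noisy.
        rng = np.random.default_rng(0)
        X = rng.normal(size=(40, 2))
        truth = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)
        labels = np.vstack([truth, truth, rng.choice([-1, 1], size=40)])
        clf, rel = fit_with_labeler_reliability(X, labels)
        print(rel)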

  4. Open architecture software platform for biomedical signal analysis.

    PubMed

    Duque, Juliano J; Silva, Luiz E V; Murta, Luiz O

    2013-01-01

    Biomedical signals are very important reporters of physiological status in the human body. Therefore, great attention is devoted to the study of analysis methods that help extract the greatest amount of relevant information from these signals. There are several free-of-charge software packages that can process biomedical data, but they usually have closed architectures that do not allow users to add new functionality. This paper presents a proposal for a free, open-architecture software platform for biomedical signal analysis, named JBioS. Implemented in Java, the platform offers some basic functionalities to load and display signals, and allows the integration of new software components through plugins. JBioS facilitates validation of new analysis methods and provides an environment for multi-method analysis. Plugins can be developed for preprocessing, analyzing and simulating signals. Some applications have already been built using this platform, suggesting that, with these features, JBioS has potential applications in both research and clinical settings.

  5. [Self-archiving of biomedical papers in open access repositories].

    PubMed

    Abad-García, M Francisca; Melero, Remedios; Abadal, Ernest; González-Teruel, Aurora

    2010-04-01

    Open-access literature is digital, online, free of charge, and free of most copyright and licensing restrictions. Self-archiving, i.e. depositing scholarly outputs in institutional repositories (the open-access green route), is increasingly present in the activities of the scientific community. Beyond the benefits of open access for the visibility and dissemination of science, funding agencies increasingly require papers and other types of documents to be deposited in repositories. In the biomedical environment this is even more relevant given the impact scientific literature can have on public health. However, to make self-archiving feasible, authors should be aware of its meaning and of the terms under which they are allowed to archive their works. To that end, tools like Sherpa/RoMEO or DULCINEA (both directories of the copyright licences of scientific journals at different levels) help authors find out which rights they retain when they publish a paper and whether self-archiving is permitted. PubMed Central and its British and Canadian counterparts are the main thematic repositories for the biomedical fields. In our country there is no repository of a similar nature, but most universities and the CSIC have already created their own institutional repositories. Increased visibility of research results and their consequently greater and earlier citation is one of the most frequently cited advantages of open access, but removing economic barriers to information access is also a benefit that helps break down borders between groups.

  6. OLS Client and OLS Dialog: Open Source Tools to Annotate Public Omics Datasets.

    PubMed

    Perez-Riverol, Yasset; Ternent, Tobias; Koch, Maximilian; Barsnes, Harald; Vrousgou, Olga; Jupp, Simon; Vizcaíno, Juan Antonio

    2017-10-01

    The availability of user-friendly software to annotate biological datasets and experimental details is becoming essential in data management practices, both in local storage systems and in public databases. The Ontology Lookup Service (OLS, http://www.ebi.ac.uk/ols) is a popular centralized service to query, browse and navigate biomedical ontologies and controlled vocabularies. Recently, the OLS framework has been completely redeveloped (version 3.0), including enhancements in the data model, like the added support for Web Ontology Language based ontologies, among many other improvements. However, the new OLS is not backwards compatible and new software tools are needed to enable access to this widely used framework now that the previous version is no longer available. We here present the OLS Client as a free, open-source Java library to retrieve information from the new version of the OLS. It enables rapid tool creation by providing a robust, pluggable programming interface and common data model to programmatically access the OLS. The library has already been integrated and is routinely used by several bioinformatics resources and related data annotation tools. Secondly, we also introduce an updated version of the OLS Dialog (version 2.0), a Java graphical user interface that can be easily plugged into Java desktop applications to access the OLS. The software and related documentation are freely available at https://github.com/PRIDE-Utilities/ols-client and https://github.com/PRIDE-Toolsuite/ols-dialog. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  7. WebMedSA: a web-based framework for segmenting and annotating medical images using biomedical ontologies

    NASA Astrophysics Data System (ADS)

    Vega, Francisco; Pérez, Wilson; Tello, Andrés.; Saquicela, Victor; Espinoza, Mauricio; Solano-Quinde, Lizandro; Vidal, Maria-Esther; La Cruz, Alexandra

    2015-12-01

    Advances in medical imaging have fostered medical diagnosis based on digital images. Consequently, the number of studies based on diagnostic medical images is increasing; collaborative work and tele-radiology systems are therefore required to scale up effectively to this diagnostic trend. We tackle the problem of collaborative access to medical images, and present WebMedSA, a framework to manage large datasets of medical images. WebMedSA relies on a PACS and supports ontological annotation, as well as segmentation and visualization of the images based on their semantic description. Ontological annotations can be performed directly on the volumetric image or at different image planes (e.g., axial, coronal, or sagittal); furthermore, annotations can be complemented after applying a segmentation technique. WebMedSA is based on three main steps: (1) an RDF-ization process for extracting, anonymizing, and serializing metadata comprised in DICOM medical images into RDF/XML; (2) integration of different biomedical ontologies (using the L-MOM library), making this approach ontology independent; and (3) segmentation and visualization of annotated data, which are further used to generate new annotations according to expert knowledge, and validation. Initial user evaluations suggest that WebMedSA facilitates the exchange of knowledge between radiologists, and provides the basis for collaborative work among them.
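
    Step (1), the RDF-ization of DICOM metadata, might look roughly like the sketch below, which reads a DICOM file with pydicom, applies naive anonymization, and serializes a few selected attributes to RDF/XML with rdflib. The namespace, predicate names and file path are hypothetical; the actual WebMedSA vocabulary and pipeline are not reproduced here.

        import pydicom
        from rdflib import Graph, Literal, Namespace, RDF, URIRef

        # Hypothetical namespace and predicate names chosen for the sketch only.
        EX = Namespace("http://example.org/webmedsa#")

        def dicom_to_rdf(path):
            """Read one DICOM file, drop direct identifiers, and serialize selected
            metadata as RDF/XML, roughly mirroring the RDF-ization step described above."""
            ds = pydicom.dcmread(path)
            ds.PatientName = "ANONYMOUS"               # naive anonymization for the sketch
            study = URIRef(EX[str(ds.SOPInstanceUID)])
            g = Graph()
            g.bind("ex", EX)
            g.add((study, RDF.type, EX.ImagingStudy))
            g.add((study, EX.modality, Literal(str(ds.Modality))))
            g.add((study, EX.studyDate, Literal(str(ds.get("StudyDate", "")))))
            return g.serialize(format="xml")

        # print(dicom_to_rdf("image.dcm"))   # the path is a placeholder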

  8. OpenCL based machine learning labeling of biomedical datasets

    NASA Astrophysics Data System (ADS)

    Amoros, Oscar; Escalera, Sergio; Puig, Anna

    2011-03-01

    In this paper, we propose a two-stage labeling method for large biomedical datasets based on a parallel approach on a single GPU. Diagnostic methods, structure volume measurements, and visualization systems are of major importance for surgery planning, intra-operative imaging and image-guided surgery. In all cases, providing an automatic and interactive method to label or tag the different structures contained in the input data becomes imperative. Several approaches to label or segment biomedical datasets have been proposed to discriminate different anatomical structures in an output tagged dataset. Among existing methods, supervised learning methods for segmentation have been devised so that non-expert users can easily analyze biomedical datasets. However, they still have some problems concerning practical application, such as slow learning and testing speeds. In addition, recent technological developments have led to widespread availability of multi-core CPUs and GPUs, as well as new software languages, such as NVIDIA's CUDA and OpenCL, allowing parallel programming paradigms to be applied on conventional personal computers. The Adaboost classifier is one of the most widely applied methods for labeling in the machine learning community. In a first stage, Adaboost trains a binary classifier from a set of pre-labeled samples described by a set of features. This binary classifier is defined as a weighted combination of weak classifiers. Each weak classifier is a simple decision function estimated on a single feature value. Then, at the testing stage, each weak classifier is independently applied to the features of a set of unlabeled samples. In this work, we propose an alternative representation of the Adaboost binary classifier. We use this proposed representation to define a new GPU-based parallelized Adaboost testing stage using OpenCL. We provide numerical experiments based on large available data sets and we compare our results to CPU-based strategies in terms of time and

  9. Crowd Control: Effectively Utilizing Unscreened Crowd Workers for Biomedical Data Annotation.

    PubMed

    Cocos, Anne; Qian, Ting; Callison-Burch, Chris; Masino, Aaron J

    2017-04-04

    Annotating unstructured texts in Electronic Health Records data is usually a necessary step for conducting machine learning research on such datasets. Manual annotation by domain experts provides data of the best quality, but has become increasingly impractical given the rapid increase in the volume of EHR data. In this article, we examine the effectiveness of crowdsourcing with unscreened online workers as an alternative for transforming unstructured texts in EHRs into annotated data that are directly usable in supervised learning models. We find the crowdsourced annotation data to be just as effective as expert data in training a sentence classification model to detect the mentioning of abnormal ear anatomy in radiology reports of audiology. Furthermore, we have discovered that enabling workers to self-report a confidence level associated with each annotation can help researchers pinpoint less-accurate annotations requiring expert scrutiny. Our findings suggest that even crowd workers without specific domain knowledge can contribute effectively to the task of annotating unstructured EHR datasets.
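
    A minimal sketch of how self-reported confidence could be used in practice: crowd labels are combined by confidence-weighted voting, and items whose winning label carries only a small share of the total confidence are flagged for expert review. The threshold, labels and data structure are assumptions for illustration, not the authors' protocol.

        from collections import defaultdict

        def aggregate_crowd_labels(annotations, review_threshold=0.6):
            """Combine crowd annotations per sentence by confidence-weighted voting.
            Items whose winning label carries a low share of the total confidence are
            flagged for expert review, in the spirit of the self-reported confidence idea.
            `annotations` maps sentence id -> list of (label, confidence in [0, 1])."""
            decisions, needs_review = {}, []
            for sent_id, votes in annotations.items():
                weights = defaultdict(float)
                for label, conf in votes:
                    weights[label] += conf
                label, weight = max(weights.items(), key=lambda kv: kv[1])
                decisions[sent_id] = label
                if weight / sum(weights.values()) < review_threshold:
                    needs_review.append(sent_id)
            return decisions, needs_review

        crowd = {
            "s1": [("abnormal", 0.9), ("abnormal", 0.7), ("normal", 0.4)],
            "s2": [("abnormal", 0.5), ("normal", 0.6), ("normal", 0.3)],
        }
        print(aggregate_crowd_labels(crowd))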

  10. Data annotation, recording and mapping system for the US open skies aircraft

    SciTech Connect

    Brown, B.W.; Goede, W.F.; Farmer, R.G.

    1996-11-01

    This paper discusses the system developed by Northrop Grumman for the Defense Nuclear Agency (DNA), US Air Force, and the On-Site Inspection Agency (OSIA) to comply with the data annotation and reporting provisions of the Open Skies Treaty. This system, called the Data Annotation, Recording and Mapping System (DARMS), has been installed on the US OC-135 and meets or exceeds all annotation requirements for the Open Skies Treaty. The Open Skies Treaty, which will enter into force in the near future, allows any of the 26 signatory countries to fly fixed-wing aircraft with imaging sensors over any of the other treaty participants, upon very short notice, and with no restricted flight areas. Sensor types presently allowed by the treaty are: optical framing and panoramic film cameras; video cameras ranging from analog PAL color television cameras to the more sophisticated digital monochrome and color line scanning or framing cameras; infrared line scanners; and synthetic aperture radars. Each sensor type has specific performance parameters which are limited by the treaty, as well as specific annotation requirements which must be achieved upon full entry into force. DARMS supports U.S. compliance with the Open Skies Treaty by means of three subsystems: the Data Annotation Subsystem (DAS), which annotates sensor media with data obtained from sensors and the aircraft's avionics system; the Data Recording System (DRS), which records all sensor and flight events on magnetic media for later use in generating Treaty-mandated mission reports; and the Dynamic Sensor Mapping Subsystem (DSMS), which provides observers and sensor operators with real-time moving-map displays of the progress of the mission, complete with instantaneous and cumulative sensor coverages. This paper will describe DARMS and its subsystems in greater detail, along with the supporting avionics sub-systems. 7 figs.

  11. For 481 biomedical open access journals, articles are not searchable in the Directory of Open Access Journals nor in conventional biomedical databases

    PubMed Central

    Andresen, Kristoffer; Pommergaard, Hans-Christian; Rosenberg, Jacob

    2015-01-01

    Background. Open access (OA) journals allow access to research papers free of charge to the reader. Traditionally, biomedical researchers use databases like MEDLINE and EMBASE to discover new advances. However, biomedical OA journals might not fulfill such databases’ criteria, hindering dissemination. The Directory of Open Access Journals (DOAJ) is a database exclusively listing OA journals. The aim of this study was to investigate DOAJ’s coverage of biomedical OA journals compared with the conventional biomedical databases. Methods. Information on all journals listed in four conventional biomedical databases (MEDLINE, PubMed Central, EMBASE and SCOPUS) and DOAJ was gathered. Journals were included if they were (1) actively publishing, (2) full OA, (3) prospectively indexed in one or more databases, and (4) of biomedical subject. Impact factor and journal language were also collected. DOAJ was compared with conventional databases regarding the proportion of journals covered, along with their impact factor and publishing language. The proportion of journals with articles indexed by DOAJ was determined. Results. In total, 3,236 biomedical OA journals were included in the study. Of the included journals, 86.7% were listed in DOAJ. Combined, the conventional biomedical databases listed 75.0% of the journals; 18.7% in MEDLINE; 36.5% in PubMed Central; 51.5% in SCOPUS and 50.6% in EMBASE. Of the journals in DOAJ, 88.7% published in English and 20.6% had received an impact factor for 2012 compared with 93.5% and 26.0%, respectively, for journals in the conventional biomedical databases. A subset of 51.1% and 48.5% of the journals in DOAJ had articles indexed from 2012 and 2013, respectively. Of journals exclusively listed in DOAJ, one journal had received an impact factor for 2012, and 59.6% of the journals had no content from 2013 indexed in DOAJ. Conclusions. DOAJ is the most complete registry of biomedical OA journals compared with five conventional biomedical

  12. For 481 biomedical open access journals, articles are not searchable in the Directory of Open Access Journals nor in conventional biomedical databases.

    PubMed

    Liljekvist, Mads Svane; Andresen, Kristoffer; Pommergaard, Hans-Christian; Rosenberg, Jacob

    2015-01-01

    Background. Open access (OA) journals allow access to research papers free of charge to the reader. Traditionally, biomedical researchers use databases like MEDLINE and EMBASE to discover new advances. However, biomedical OA journals might not fulfill such databases' criteria, hindering dissemination. The Directory of Open Access Journals (DOAJ) is a database exclusively listing OA journals. The aim of this study was to investigate DOAJ's coverage of biomedical OA journals compared with the conventional biomedical databases. Methods. Information on all journals listed in four conventional biomedical databases (MEDLINE, PubMed Central, EMBASE and SCOPUS) and DOAJ was gathered. Journals were included if they were (1) actively publishing, (2) full OA, (3) prospectively indexed in one or more databases, and (4) of biomedical subject. Impact factor and journal language were also collected. DOAJ was compared with conventional databases regarding the proportion of journals covered, along with their impact factor and publishing language. The proportion of journals with articles indexed by DOAJ was determined. Results. In total, 3,236 biomedical OA journals were included in the study. Of the included journals, 86.7% were listed in DOAJ. Combined, the conventional biomedical databases listed 75.0% of the journals; 18.7% in MEDLINE; 36.5% in PubMed Central; 51.5% in SCOPUS and 50.6% in EMBASE. Of the journals in DOAJ, 88.7% published in English and 20.6% had received an impact factor for 2012 compared with 93.5% and 26.0%, respectively, for journals in the conventional biomedical databases. A subset of 51.1% and 48.5% of the journals in DOAJ had articles indexed from 2012 and 2013, respectively. Of journals exclusively listed in DOAJ, one journal had received an impact factor for 2012, and 59.6% of the journals had no content from 2013 indexed in DOAJ. Conclusions. DOAJ is the most complete registry of biomedical OA journals compared with five conventional biomedical databases

  13. A selected annotated bibliography of the core biomedical literature pertaining to stroke, cervical spine, manipulation and head/neck movement

    PubMed Central

    Gotlib, Allan C.; Thiel, Haymo

    1985-01-01

    This manuscript’s purpose was to establish a knowledge base of information related to stroke and the cervical spine vascular structures, from both historical and current perspectives. Both indexed (i.e., Index Medicus, CRAC) and non-indexed scientific biomedical literature systems were scanned and the pertinent manuscripts were annotated. Citations are ordered by occurrence in the literature so that historical trends may be viewed more easily. No analysis of the reference material is offered. It is suggested, however, that: 1. complications of cervical spine manipulation are being recognized and reported with increasing frequency, 2. a cause-and-effect relationship between stroke and cervical spine manipulation has not been established, 3. a screening mechanism that is valid, reliable and reasonable needs to be established.

  14. Construction and accessibility of a cross-species phenotype ontology along with gene annotations for biomedical research.

    PubMed

    Köhler, Sebastian; Doelken, Sandra C; Ruef, Barbara J; Bauer, Sebastian; Washington, Nicole; Westerfield, Monte; Gkoutos, George; Schofield, Paul; Smedley, Damian; Lewis, Suzanna E; Robinson, Peter N; Mungall, Christopher J

    2013-01-01

    Phenotype analyses, e.g. investigating metabolic processes, tissue formation, or organism behavior, are an important element of most biological and medical research activities. Biomedical researchers are making increased use of ontological standards and methods to capture the results of such analyses, with one focus being the comparison and analysis of phenotype information between species. We have generated a cross-species phenotype ontology for human, mouse and zebrafish that contains classes from the Human Phenotype Ontology, Mammalian Phenotype Ontology, and generated classes for zebrafish phenotypes. We also provide up-to-date annotation data connecting human genes to phenotype classes from the generated ontology. We have included the data generation pipeline into our continuous integration system ensuring stable and up-to-date releases. This article describes the data generation process and is intended to help interested researchers access both the phenotype annotation data and the associated cross-species phenotype ontology. The resource described here can be used in sophisticated semantic similarity and gene set enrichment analyses for phenotype data across species. The stable releases of this resource can be obtained from http://purl.obolibrary.org/obo/hp/uberpheno/.

  15. BIOSMILE web search: a web application for annotating biomedical entities and relations.

    PubMed

    Dai, Hong-Jie; Huang, Chi-Hsin; Lin, Ryan T K; Tsai, Richard Tzong-Han; Hsu, Wen-Lian

    2008-07-01

    BIOSMILE web search (BWS) is a web-based NCBI-PubMed search application that can analyze articles for selected biomedical verbs and give users relational information, such as subject, object, location, manner, time, etc. After receiving keyword query input, BWS retrieves matching PubMed abstracts and lists them along with snippets in order of relevance to protein-protein interaction. Users can then select articles for further analysis, and BWS will find and mark up biomedical relations in the text. The analysis results can be viewed in the abstract text or in table form. To date, BWS has been field tested by over 30 biologists and questionnaires have shown that subjects are highly satisfied with its capabilities and usability. BWS is accessible free of charge at http://bioservices.cse.yzu.edu.tw/BWS.

  16. Concept annotation in the CRAFT corpus

    PubMed Central

    2012-01-01

    Background Manually annotated corpora are critical for the training and evaluation of automated methods to identify concepts in biomedical text. Results This paper presents the concept annotations of the Colorado Richly Annotated Full-Text (CRAFT) Corpus, a collection of 97 full-length, open-access biomedical journal articles that have been annotated both semantically and syntactically to serve as a research resource for the biomedical natural-language-processing (NLP) community. CRAFT identifies all mentions of nearly all concepts from nine prominent biomedical ontologies and terminologies: the Cell Type Ontology, the Chemical Entities of Biological Interest ontology, the NCBI Taxonomy, the Protein Ontology, the Sequence Ontology, the entries of the Entrez Gene database, and the three subontologies of the Gene Ontology. The first public release includes the annotations for 67 of the 97 articles, reserving two sets of 15 articles for future text-mining competitions (after which these too will be released). Concept annotations were created based on a single set of guidelines, which has enabled us to achieve consistently high interannotator agreement. Conclusions As the initial 67-article release contains more than 560,000 tokens (and the full set more than 790,000 tokens), our corpus is among the largest gold-standard annotated biomedical corpora. Unlike most others, the journal articles that comprise the corpus are drawn from diverse biomedical disciplines and are marked up in their entirety. Additionally, with a concept-annotation count of nearly 100,000 in the 67-article subset (and more than 140,000 in the full collection), the scale of conceptual markup is also among the largest of comparable corpora. The concept annotations of the CRAFT Corpus have the potential to significantly advance biomedical text mining by providing a high-quality gold standard for NLP systems. The corpus, annotation guidelines, and other associated resources are freely available at http

  17. The impact of open access on biomedical research

    PubMed Central

    Varmus, Harold; Lipman, David; Ginsparg, Paul; Markovitz, Barry P

    2000-01-01

    A series of reports - and extracts of reports - from the Freedom of Information Conference, 6-7 July, 2000, New York Academy of Medicine. The conference was sponsored by BioMed Central, to promote debate about the communication and validation of biomedical research published on the internet. Details of the meeting and all presentations are available in full online at

  18. Text Detective: a rule-based system for gene annotation in biomedical texts

    PubMed Central

    Tamames, Javier

    2005-01-01

    Background The identification of mentions of genes or gene products in biomedical texts is a critical step in the development of text mining applications in biosciences. The complexity and ambiguity of gene nomenclature make this a very difficult task. Methods Here we present a novel approach based on a combination of carefully designed rules and several lexicons of biological concepts, implemented in the Text Detective system. Text Detective is able to normalize the gene mentions it finds by providing the appropriate database reference. Results In the BioCreAtIvE evaluation, Text Detective achieved results of 84% precision, 71% recall for task 1A, and 79% precision, 71% recall for mouse genes in task 1B. PMID:15960822

  19. Text detective: a rule-based system for gene annotation in biomedical texts.

    PubMed

    Tamames, Javier

    2005-01-01

    The identification of mentions of genes or gene products in biomedical texts is a critical step in the development of text mining applications in biosciences. The complexity and ambiguity of gene nomenclature make this a very difficult task. Here we present a novel approach based on a combination of carefully designed rules and several lexicons of biological concepts, implemented in the Text Detective system. Text Detective is able to normalize the gene mentions it finds by providing the appropriate database reference. In the BioCreAtIvE evaluation, Text Detective achieved results of 84% precision, 71% recall for task 1A, and 79% precision, 71% recall for mouse genes in task 1B.
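
    The rule-plus-lexicon idea can be sketched as below: exact lexicon matches are normalized to database identifiers, and a crude symbol pattern catches additional candidate mentions. The lexicon entries, identifiers and regular expression are illustrative assumptions only and bear no relation to Text Detective's actual curated resources.

        import re

        # A toy rule-plus-lexicon tagger in the spirit of the approach: exact lexicon hits
        # plus a simple symbol pattern, with trivial normalization to database identifiers.
        # The lexicon entries and identifiers below are illustrative, not real curated data.
        GENE_LEXICON = {
            "breast cancer 1": "GENE:0001",
            "brca1": "GENE:0001",
            "tumor protein p53": "GENE:0002",
            "tp53": "GENE:0002",
        }
        SYMBOL_PATTERN = re.compile(r"\b[A-Z][A-Z0-9]{1,5}\b")   # crude gene-symbol rule

        def tag_gene_mentions(text):
            mentions = []
            lowered = text.lower()
            for name, gene_id in GENE_LEXICON.items():
                for m in re.finditer(re.escape(name), lowered):
                    mentions.append((m.start(), m.end(), gene_id))
            for m in SYMBOL_PATTERN.finditer(text):
                if m.group().lower() in GENE_LEXICON:
                    continue   # already captured and normalized by the lexicon pass
                mentions.append((m.start(), m.end(), "UNNORMALIZED:" + m.group()))
            return sorted(set(mentions))

        print(tag_gene_mentions("Mutations in BRCA1 and TP53 drive tumorigenesis."))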

  20. Generation of open biomedical datasets through ontology-driven transformation and integration processes.

    PubMed

    Carmen Legaz-García, María Del; Miñarro-Giménez, José Antonio; Menárguez-Tortosa, Marcos; Fernández-Breis, Jesualdo Tomás

    2016-06-03

    Biomedical research usually requires combining large volumes of data from multiple heterogeneous sources, which makes the integrated exploitation of such data difficult. The Semantic Web paradigm offers a natural technological space for data integration and exploitation by generating content readable by machines. Linked Open Data is a Semantic Web initiative that promotes the publication and sharing of data in machine-readable semantic formats. We present an approach for the transformation and integration of heterogeneous biomedical data with the objective of generating open biomedical datasets in Semantic Web formats. The transformation of the data is based on the mappings between the entities of the data schema and the ontological infrastructure that provides meaning to the content. Our approach permits different types of mappings and includes the possibility of defining complex transformation patterns. Once the mappings are defined, they can be automatically applied to datasets to generate logically consistent content, and the mappings can be reused in further transformation processes. The results of our research are (1) a common transformation and integration process for heterogeneous biomedical data; (2) the application of Linked Open Data principles to generate interoperable, open, biomedical datasets; (3) a software tool, called SWIT, that implements the approach. In this paper we also describe how we have applied SWIT in different biomedical scenarios and some lessons learned. We have presented an approach that is able to generate open biomedical repositories in Semantic Web formats. SWIT is able to apply the Linked Open Data principles in the generation of the datasets, thus allowing their content to be linked to external repositories and linked open datasets to be created. SWIT datasets may contain data from multiple sources and schemas, thus becoming integrated datasets.

  1. Open Data in Biomedical Science: Policy Drivers and Recent Progress

    EPA Science Inventory

    EPA's progress in implementing the open data initiatives first outlined in the 2009 Presidential memorandum on open government and more specifically regarding publications and data from publications in the 2013 Holdren memorandum. The presentation outlines the major points in bo...

  2. Open Data in Biomedical Science: Policy Drivers and Recent Progress

    EPA Science Inventory

    EPA's progress in implementing the open data initiatives first outlined in the 2009 Presidential memorandum on open government and more specifically regarding publications and data from publications in the 2013 Holdren memorandum. The presentation outlines the major points in bo...

  3. A methodology to annotate systems biology markup language models with the synthetic biology open language.

    PubMed

    Roehner, Nicholas; Myers, Chris J

    2014-02-21

    Recently, we have begun to witness the potential of synthetic biology, noted here in the form of bacteria and yeast that have been genetically engineered to produce biofuels, manufacture drug precursors, and even invade tumor cells. The success of these projects, however, has often failed in translation and application to new projects, a problem exacerbated by a lack of engineering standards that combine descriptions of the structure and function of DNA. To address this need, this paper describes a methodology to connect the systems biology markup language (SBML) to the synthetic biology open language (SBOL), existing standards that describe biochemical models and DNA components, respectively. Our methodology involves first annotating SBML model elements such as species and reactions with SBOL DNA components. A graph is then constructed from the model, with vertices corresponding to elements within the model and edges corresponding to the cause-and-effect relationships between these elements. Lastly, the graph is traversed to assemble the annotating DNA components into a composite DNA component, which is used to annotate the model itself and can be referenced by other composite models and DNA components. In this way, our methodology can be used to build up a hierarchical library of models annotated with DNA components. Such a library is a useful input to any future genetic technology mapping algorithm that would automate the process of composing DNA components to satisfy a behavioral specification. Our methodology for SBML-to-SBOL annotation is implemented in the latest version of our genetic design automation (GDA) software tool, iBioSim.
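
    The graph construction and traversal described above can be sketched schematically with networkx: model elements become vertices carrying optional SBOL annotations, cause-and-effect relationships become edges, and a topological traversal collects the annotating DNA components into a composite. Element names and SBOL URIs are invented, and this stand-in does not use libsbml, SBOL libraries or iBioSim.

        import networkx as nx

        # Schematic stand-in for the annotation workflow: model elements are vertices,
        # cause-and-effect relationships are edges, and a composite DNA component is
        # assembled by traversing the graph. Element names and SBOL URIs are made up.
        model = nx.DiGraph()
        model.add_node("promoter_species", sbol="http://example.org/sbol/pTet")
        model.add_node("cds_species", sbol="http://example.org/sbol/lacI_cds")
        model.add_node("production_reaction", sbol=None)
        model.add_edge("promoter_species", "production_reaction")   # promoter drives the reaction
        model.add_edge("production_reaction", "cds_species")        # reaction produces the protein

        def assemble_composite(graph):
            """Collect the annotating DNA components in a causal (topological) order to
            form one composite component that can then annotate the model itself."""
            parts = [graph.nodes[n]["sbol"] for n in nx.topological_sort(graph)
                     if graph.nodes[n].get("sbol")]
            return {"composite": "http://example.org/sbol/composite_1", "subcomponents": parts}

        print(assemble_composite(model))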

  4. Open Data in Biomedical Science: Policy Drivers and Recent ...

    EPA Pesticide Factsheets

    EPA's progress in implementing the open data initiatives first outlined in the 2009 Presidential memorandum on open government and more specifically regarding publications and data from publications in the 2013 Holdren memorandum. The presentation outlines the major points in both memorandums regarding open data, presents several (though not an exhaustive list of) EPA initiatives on open data, some of which occurred well before both policy memorandums. The presentation concludes by outlining the initiatives to ensure public access to all EPA publications through PubMed Central and all publication-associated data through the Environmental Data Gateway and Data.gov. The purpose of this presentation is to present EPA's progress in implementing the open data initiatives first outlined in the 2009 Presidential memorandum on open government and more specifically regarding publications and data from publications in the 2013 Holdren memorandum.

  5. Scientific Reproducibility in Biomedical Research: Provenance Metadata Ontology for Semantic Annotation of Study Description

    PubMed Central

    Sahoo, Satya S.; Valdez, Joshua; Rueschman, Michael

    2016-01-01

    Scientific reproducibility is key to scientific progress as it allows the research community to build on validated results, protect patients from potentially harmful trial drugs derived from incorrect results, and reduce wastage of valuable resources. The National Institutes of Health (NIH) recently published a systematic guideline titled “Rigor and Reproducibility “ for supporting reproducible research studies, which has also been accepted by several scientific journals. These journals will require published articles to conform to these new guidelines. Provenance metadata describes the history or origin of data and it has been long used in computer science to capture metadata information for ensuring data quality and supporting scientific reproducibility. In this paper, we describe the development of Provenance for Clinical and healthcare Research (ProvCaRe) framework together with a provenance ontology to support scientific reproducibility by formally modeling a core set of data elements representing details of research study. We extend the PROV Ontology (PROV-O), which has been recommended as the provenance representation model by World Wide Web Consortium (W3C), to represent both: (a) data provenance, and (b) process provenance. We use 124 study variables from 6 clinical research studies from the National Sleep Research Resource (NSRR) to evaluate the coverage of the provenance ontology. NSRR is the largest repository of NIH-funded sleep datasets with 50,000 studies from 36,000 participants. The provenance ontology reuses ontology concepts from existing biomedical ontologies, for example the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT), to model the provenance information of research studies. The ProvCaRe framework is being developed as part of the Big Data to Knowledge (BD2K) data provenance project. PMID:28269904

  6. Scientific Reproducibility in Biomedical Research: Provenance Metadata Ontology for Semantic Annotation of Study Description.

    PubMed

    Sahoo, Satya S; Valdez, Joshua; Rueschman, Michael

    2016-01-01

    Scientific reproducibility is key to scientific progress as it allows the research community to build on validated results, protect patients from potentially harmful trial drugs derived from incorrect results, and reduce wastage of valuable resources. The National Institutes of Health (NIH) recently published a systematic guideline titled "Rigor and Reproducibility " for supporting reproducible research studies, which has also been accepted by several scientific journals. These journals will require published articles to conform to these new guidelines. Provenance metadata describes the history or origin of data and it has been long used in computer science to capture metadata information for ensuring data quality and supporting scientific reproducibility. In this paper, we describe the development of Provenance for Clinical and healthcare Research (ProvCaRe) framework together with a provenance ontology to support scientific reproducibility by formally modeling a core set of data elements representing details of research study. We extend the PROV Ontology (PROV-O), which has been recommended as the provenance representation model by World Wide Web Consortium (W3C), to represent both: (a) data provenance, and (b) process provenance. We use 124 study variables from 6 clinical research studies from the National Sleep Research Resource (NSRR) to evaluate the coverage of the provenance ontology. NSRR is the largest repository of NIH-funded sleep datasets with 50,000 studies from 36,000 participants. The provenance ontology reuses ontology concepts from existing biomedical ontologies, for example the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT), to model the provenance information of research studies. The ProvCaRe framework is being developed as part of the Big Data to Knowledge (BD2K) data provenance project.

  7. The biomedical discourse relation bank

    PubMed Central

    2011-01-01

    Background Identification of discourse relations, such as causal and contrastive relations, between situations mentioned in text is an important task for biomedical text-mining. A biomedical text corpus annotated with discourse relations would be very useful for developing and evaluating methods for biomedical discourse processing. However, little effort has been made to develop such an annotated resource. Results We have developed the Biomedical Discourse Relation Bank (BioDRB), in which we have annotated explicit and implicit discourse relations in 24 open-access full-text biomedical articles from the GENIA corpus. Guidelines for the annotation were adapted from the Penn Discourse TreeBank (PDTB), which has discourse relations annotated over open-domain news articles. We introduced new conventions and modifications to the sense classification. We report reliable inter-annotator agreement of over 80% for all sub-tasks. Experiments for identifying the sense of explicit discourse connectives show the connective itself as a highly reliable indicator for coarse sense classification (accuracy 90.9% and F1 score 0.89). These results are comparable to results obtained with the same classifier on the PDTB data. With more refined sense classification, there is degradation in performance (accuracy 69.2% and F1 score 0.28), mainly due to sparsity in the data. The size of the corpus was found to be sufficient for identifying the sense of explicit connectives, with classifier performance stabilizing at about 1900 training instances. Finally, the classifier performs poorly when trained on PDTB and tested on BioDRB (accuracy 54.5% and F1 score 0.57). Conclusion Our work shows that discourse relations can be reliably annotated in biomedical text. Coarse sense disambiguation of explicit connectives can be done with high reliability by using just the connective as a feature, but more refined sense classification requires either richer features or more annotated data. The poor
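
    The connective-as-sole-feature result can be illustrated with a small sketch: a classifier trained on nothing but the connective string already separates coarse senses. The training pairs below are toy examples and scikit-learn is assumed; the sense inventory and data of BioDRB/PDTB are not reproduced.

        from sklearn.feature_extraction import DictVectorizer
        from sklearn.linear_model import LogisticRegression
        from sklearn.pipeline import make_pipeline

        # Connective-as-only-feature baseline for coarse sense classification, echoing the
        # finding above that the connective alone is a strong indicator. Toy training pairs.
        train = [
            ({"connective": "because"}, "Contingency"),
            ({"connective": "since"}, "Contingency"),
            ({"connective": "however"}, "Comparison"),
            ({"connective": "but"}, "Comparison"),
            ({"connective": "then"}, "Temporal"),
            ({"connective": "and"}, "Expansion"),
        ]
        X = [features for features, _ in train]
        y = [sense for _, sense in train]

        clf = make_pipeline(DictVectorizer(), LogisticRegression(max_iter=1000))
        clf.fit(X, y)
        print(clf.predict([{"connective": "because"}, {"connective": "however"}]))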

  8. Micropublications: a semantic model for claims, evidence, arguments and annotations in biomedical communications

    PubMed Central

    2014-01-01

    where simpler, formalized and purely statement-based models, such as the nanopublications model, will not be sufficient. At the same time they will add significant value to, and are intentionally compatible with, statement-based formalizations. We suggest that micropublications, generated by useful software tools supporting such activities as writing, editing, reviewing, and discussion, will be of great value in improving the quality and tractability of biomedical communications. PMID:26261718

  9. A top-level ontology of functions and its application in the Open Biomedical Ontologies.

    PubMed

    Burek, Patryk; Hoehndorf, Robert; Loebe, Frank; Visagie, Johann; Herre, Heinrich; Kelso, Janet

    2006-07-15

    A clear understanding of functions in biology is a key component in accurate modelling of molecular, cellular and organismal biology. Using the existing biomedical ontologies it has been impossible to capture the complexity of the community's knowledge about biological functions. We present here a top-level ontological framework for representing knowledge about biological functions. This framework lends greater accuracy, power and expressiveness to biomedical ontologies by providing a means to capture existing functional knowledge in a more formal manner. An initial major application of the ontology of functions is the provision of a principled way in which to curate functional knowledge and annotations in biomedical ontologies. Further potential applications include the facilitation of ontology interoperability and automated reasoning. A major advantage of the proposed implementation is that it is an extension to existing biomedical ontologies, and can be applied without substantial changes to these domain ontologies. The Ontology of Functions (OF) can be downloaded in OWL format from http://onto.eva.mpg.de/. Additionally, a UML profile and supplementary information and guides for using the OF can be accessed from the same website.

  10. BioSig: The Free and Open Source Software Library for Biomedical Signal Processing

    PubMed Central

    Vidaurre, Carmen; Sander, Tilmann H.; Schlögl, Alois

    2011-01-01

    BioSig is an open source software library for biomedical signal processing. The aim of the BioSig project is to foster research in biomedical signal processing by providing free and open source software tools for many different application areas. Some of the areas where BioSig can be employed are neuroinformatics, brain-computer interfaces, neurophysiology, psychology, cardiovascular systems, and sleep research. Moreover, the analysis of biosignals such as the electroencephalogram (EEG), electrocorticogram (ECoG), electrocardiogram (ECG), electrooculogram (EOG), electromyogram (EMG), or respiration signals is a very relevant element of the BioSig project. Specifically, BioSig provides solutions for data acquisition, artifact processing, quality control, feature extraction, classification, modeling, and data visualization, to name a few. In this paper, we highlight several methods to help students and researchers to work more efficiently with biomedical signals. PMID:21437227

  11. BioSig: the free and open source software library for biomedical signal processing.

    PubMed

    Vidaurre, Carmen; Sander, Tilmann H; Schlögl, Alois

    2011-01-01

    BioSig is an open source software library for biomedical signal processing. The aim of the BioSig project is to foster research in biomedical signal processing by providing free and open source software tools for many different application areas. Some of the areas where BioSig can be employed are neuroinformatics, brain-computer interfaces, neurophysiology, psychology, cardiovascular systems, and sleep research. Moreover, the analysis of biosignals such as the electroencephalogram (EEG), electrocorticogram (ECoG), electrocardiogram (ECG), electrooculogram (EOG), electromyogram (EMG), or respiration signals is a very relevant element of the BioSig project. Specifically, BioSig provides solutions for data acquisition, artifact processing, quality control, feature extraction, classification, modeling, and data visualization, to name a few. In this paper, we highlight several methods to help students and researchers to work more efficiently with biomedical signals.

  12. The Use of Annotations in Examination Marking: Opening a Window into Markers' Minds

    ERIC Educational Resources Information Center

    Crisp, Victoria; Johnson, Martin

    2007-01-01

    This study investigated the functions of annotations, the role of annotations in markers' decision-making processes, whether annotations conform to conventions, and whether these vary according to subject area. Across subjects a number of scripts were analysed to survey which annotations are subject specific and which are more general. Twelve…

  13. Facilitating Full-text Access to Biomedical Literature Using Open Access Resources.

    PubMed

    Kang, Hongyu; Hou, Zhen; Li, Jiao

    2015-01-01

    Open access (OA) resources and local libraries often have their own literature databases, especially in the field of biomedicine. We have developed a method of linking a local library to a biomedical OA resource facilitating researchers' full-text article access. The method uses a model based on vector space to measure similarities between two articles in local library and OA resources. The method achieved an F-score of 99.61%. This method of article linkage and mapping between local library and OA resources is available for use. Through this work, we have improved the full-text access of the biomedical OA resources.

  14. SORTA: a system for ontology-based re-coding and technical annotation of biomedical phenotype data

    PubMed Central

    Pang, Chao; Sollie, Annet; Sijtsma, Anna; Hendriksen, Dennis; Charbon, Bart; de Haan, Mark; de Boer, Tommy; Kelpin, Fleur; Jetten, Jonathan; van der Velde, Joeri K.; Smidt, Nynke; Sijmons, Rolf; Hillege, Hans; Swertz, Morris A.

    2015-01-01

    There is an urgent need to standardize the semantics of biomedical data values, such as phenotypes, to enable comparative and integrative analyses. However, it is unlikely that all studies will use the same data collection protocols. As a result, retrospective standardization is often required, which involves matching of original (unstructured or locally coded) data to widely used coding or ontology systems such as SNOMED CT (clinical terms), ICD-10 (International Classification of Disease) and HPO (Human Phenotype Ontology). This data curation process is usually a time-consuming process performed by a human expert. To help mechanize this process, we have developed SORTA, a computer-aided system for rapidly encoding free text or locally coded values to a formal coding system or ontology. SORTA matches original data values (uploaded in semicolon delimited format) to a target coding system (uploaded in Excel spreadsheet, OWL ontology web language or OBO open biomedical ontologies format). It then semi-automatically shortlists candidate codes for each data value using Lucene and n-gram based matching algorithms, and can also learn from matches chosen by human experts. We evaluated SORTA’s applicability in two use cases. For the LifeLines biobank, we used SORTA to recode 90 000 free text values (including 5211 unique values) about physical exercise to MET (Metabolic Equivalent of Task) codes. For the CINEAS clinical symptom coding system, we used SORTA to map to HPO, enriching HPO when necessary (315 terms matched so far). Out of the shortlists at rank 1, we found a precision/recall of 0.97/0.98 in LifeLines and of 0.58/0.45 in CINEAS. More importantly, users found the tool both a major time saver and a quality improvement because SORTA reduced the chances of human mistakes. Thus, SORTA can dramatically ease data (re)coding tasks and we believe it will prove useful for many more projects. Database URL: http://molgenis.org/sorta or as an open source download from
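
    The shortlisting step described above can be approximated with simple character n-gram overlap. The sketch below is not SORTA's Lucene-based implementation; the MET-style target labels and the Dice-style score are illustrative assumptions.

```python
# A minimal sketch of n-gram matching to shortlist candidate codes for a
# free-text value; the target terms and the Dice-style score are illustrative
# assumptions, not SORTA's actual implementation.
def bigrams(s):
    s = " " + s.lower().strip() + " "
    return {s[i:i + 2] for i in range(len(s) - 1)}

def similarity(a, b):
    ga, gb = bigrams(a), bigrams(b)
    return 2 * len(ga & gb) / (len(ga) + len(gb))

target_codes = {                       # hypothetical coding system labels
    "MET:walking": "walking for pleasure",
    "MET:cycling": "bicycling, leisure",
    "MET:running": "running, general",
}

def shortlist(value, codes, k=3):
    scored = [(similarity(value, label), code) for code, label in codes.items()]
    return sorted(scored, reverse=True)[:k]

print(shortlist("leisurely bicycling", target_codes))
```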

  15. SORTA: a system for ontology-based re-coding and technical annotation of biomedical phenotype data.

    PubMed

    Pang, Chao; Sollie, Annet; Sijtsma, Anna; Hendriksen, Dennis; Charbon, Bart; de Haan, Mark; de Boer, Tommy; Kelpin, Fleur; Jetten, Jonathan; van der Velde, Joeri K; Smidt, Nynke; Sijmons, Rolf; Hillege, Hans; Swertz, Morris A

    2015-01-01

    There is an urgent need to standardize the semantics of biomedical data values, such as phenotypes, to enable comparative and integrative analyses. However, it is unlikely that all studies will use the same data collection protocols. As a result, retrospective standardization is often required, which involves matching of original (unstructured or locally coded) data to widely used coding or ontology systems such as SNOMED CT (clinical terms), ICD-10 (International Classification of Disease) and HPO (Human Phenotype Ontology). This data curation process is usually a time-consuming process performed by a human expert. To help mechanize this process, we have developed SORTA, a computer-aided system for rapidly encoding free text or locally coded values to a formal coding system or ontology. SORTA matches original data values (uploaded in semicolon delimited format) to a target coding system (uploaded in Excel spreadsheet, OWL ontology web language or OBO open biomedical ontologies format). It then semi-automatically shortlists candidate codes for each data value using Lucene and n-gram based matching algorithms, and can also learn from matches chosen by human experts. We evaluated SORTA's applicability in two use cases. For the LifeLines biobank, we used SORTA to recode 90 000 free text values (including 5211 unique values) about physical exercise to MET (Metabolic Equivalent of Task) codes. For the CINEAS clinical symptom coding system, we used SORTA to map to HPO, enriching HPO when necessary (315 terms matched so far). Out of the shortlists at rank 1, we found a precision/recall of 0.97/0.98 in LifeLines and of 0.58/0.45 in CINEAS. More importantly, users found the tool both a major time saver and a quality improvement because SORTA reduced the chances of human mistakes. Thus, SORTA can dramatically ease data (re)coding tasks and we believe it will prove useful for many more projects. Database URL: http://molgenis.org/sorta or as an open source download from

  16. Do open access biomedical journals benefit smaller countries? The Slovenian experience.

    PubMed

    Turk, Nana

    2011-06-01

    Scientists from smaller countries have problems gaining visibility for their research. Does open access publishing provide a solution? Slovenia is a small country with around 5000 medical doctors, 1300 dentists and 1000 pharmacists. A search of Slovenia's Bibliographic database was carried out to identify all biomedical journals and those which are open access. Slovenia has 18 medical open access journals, but none has an impact factor and only 10 are indexed by Slovenian and international bibliographic databases. The visibility and quality of medical papers are poor. The solution might be to reduce the number of journals and encourage Slovenian scientists to publish their best articles in them.

  17. Harvest: an open platform for developing web-based biomedical data discovery and reporting applications

    PubMed Central

    Pennington, Jeffrey W; Ruth, Byron; Italia, Michael J; Miller, Jeffrey; Wrazien, Stacey; Loutrel, Jennifer G; Crenshaw, E Bryan; White, Peter S

    2014-01-01

    Biomedical researchers share a common challenge of making complex data understandable and accessible as they seek inherent relationships between attributes in disparate data types. Data discovery in this context is limited by a lack of query systems that efficiently show relationships between individual variables, but without the need to navigate underlying data models. We have addressed this need by developing Harvest, an open-source framework of modular components, and using it for the rapid development and deployment of custom data discovery software applications. Harvest incorporates visualizations of highly dimensional data in a web-based interface that promotes rapid exploration and export of any type of biomedical information, without exposing researchers to underlying data models. We evaluated Harvest with two cases: clinical data from pediatric cardiology and demonstration data from the OpenMRS project. Harvest's architecture and public open-source code offer a set of rapid application development tools to build data discovery applications for domain-specific biomedical data repositories. All resources, including the OpenMRS demonstration, can be found at http://harvest.research.chop.edu PMID:24131510

  18. Harvest: an open platform for developing web-based biomedical data discovery and reporting applications.

    PubMed

    Pennington, Jeffrey W; Ruth, Byron; Italia, Michael J; Miller, Jeffrey; Wrazien, Stacey; Loutrel, Jennifer G; Crenshaw, E Bryan; White, Peter S

    2014-01-01

    Biomedical researchers share a common challenge of making complex data understandable and accessible as they seek inherent relationships between attributes in disparate data types. Data discovery in this context is limited by a lack of query systems that efficiently show relationships between individual variables, but without the need to navigate underlying data models. We have addressed this need by developing Harvest, an open-source framework of modular components, and using it for the rapid development and deployment of custom data discovery software applications. Harvest incorporates visualizations of highly dimensional data in a web-based interface that promotes rapid exploration and export of any type of biomedical information, without exposing researchers to underlying data models. We evaluated Harvest with two cases: clinical data from pediatric cardiology and demonstration data from the OpenMRS project. Harvest's architecture and public open-source code offer a set of rapid application development tools to build data discovery applications for domain-specific biomedical data repositories. All resources, including the OpenMRS demonstration, can be found at http://harvest.research.chop.edu.

  19. Combining Open-domain and Biomedical Knowledge for Topic Recognition in Consumer Health Questions.

    PubMed

    Mrabet, Yassine; Kilicoglu, Halil; Roberts, Kirk; Demner-Fushman, Dina

    2016-01-01

    Determining the main topics in consumer health questions is a crucial step in their processing as it allows narrowing the search space to a specific semantic context. In this paper we propose a topic recognition approach based on biomedical and open-domain knowledge bases. In the first step of our method, we recognize named entities in consumer health questions using an unsupervised method that relies on a biomedical knowledge base, UMLS, and an open-domain knowledge base, DBpedia. In the next step, we cast topic recognition as a binary classification problem of deciding whether a named entity is the question topic or not. We evaluated our approach on a dataset from the National Library of Medicine (NLM), introduced in this paper, and another from the Genetic and Rare Disease Information Center (GARD). The combination of knowledge bases outperformed the results obtained by individual knowledge bases by up to 16.5% F1 and achieved state-of-the-art performance. Our results demonstrate that combining open-domain knowledge bases with biomedical knowledge bases can lead to a substantial improvement in understanding user-generated health content.
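
    Casting topic recognition as a binary decision over recognized entities, as described above, can be sketched with a linear classifier. The feature names and the tiny training set below are invented stand-ins for the UMLS/DBpedia-derived signals used in the paper.

```python
# A sketch of topic recognition as binary classification over candidate entities;
# the features and labels are hypothetical, not the paper's feature set.
from sklearn.feature_extraction import DictVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

# Each candidate entity in a question gets a feature dict and a label:
# 1 = entity is the question topic, 0 = it is not.
X = [
    {"in_first_sentence": 1, "is_disease": 1, "mention_count": 2},
    {"in_first_sentence": 0, "is_disease": 0, "mention_count": 1},
    {"in_first_sentence": 1, "is_disease": 0, "mention_count": 1},
    {"in_first_sentence": 0, "is_disease": 1, "mention_count": 3},
]
y = [1, 0, 0, 1]

model = make_pipeline(DictVectorizer(), LinearSVC())
model.fit(X, y)
print(model.predict([{"in_first_sentence": 1, "is_disease": 1, "mention_count": 1}]))
```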

  20. Combining Open-domain and Biomedical Knowledge for Topic Recognition in Consumer Health Questions

    PubMed Central

    Mrabet, Yassine; Kilicoglu, Halil; Roberts, Kirk; Demner-Fushman, Dina

    2016-01-01

    Determining the main topics in consumer health questions is a crucial step in their processing as it allows narrowing the search space to a specific semantic context. In this paper we propose a topic recognition approach based on biomedical and open-domain knowledge bases. In the first step of our method, we recognize named entities in consumer health questions using an unsupervised method that relies on a biomedical knowledge base, UMLS, and an open-domain knowledge base, DBpedia. In the next step, we cast topic recognition as a binary classification problem of deciding whether a named entity is the question topic or not. We evaluated our approach on a dataset from the National Library of Medicine (NLM), introduced in this paper, and another from the Genetic and Rare Disease Information Center (GARD). The combination of knowledge bases outperformed the results obtained by individual knowledge bases by up to 16.5% F1 and achieved state-of-the-art performance. Our results demonstrate that combining open-domain knowledge bases with biomedical knowledge bases can lead to a substantial improvement in understanding user-generated health content. PMID:28269888

  1. A new visual navigation system for exploring biomedical Open Educational Resource (OER) videos

    PubMed Central

    Zhao, Baoquan; Xu, Songhua; Lin, Shujin; Luo, Xiaonan; Duan, Lian

    2016-01-01

    Objective Biomedical videos as open educational resources (OERs) are increasingly proliferating on the Internet. Unfortunately, seeking personally valuable content from among the vast corpus of quality yet diverse OER videos is nontrivial due to limitations of today’s keyword- and content-based video retrieval techniques. To address this need, this study introduces a novel visual navigation system that facilitates users’ information seeking from biomedical OER videos in mass quantity by interactively offering visual and textual navigational clues that are both semantically revealing and user-friendly. Materials and Methods The authors collected and processed around 25 000 YouTube videos, which collectively last for a total length of about 4000 h, in the broad field of biomedical sciences for our experiment. For each video, its semantic clues are first extracted automatically through computationally analyzing audio and visual signals, as well as text either accompanying or embedded in the video. These extracted clues are subsequently stored in a metadata database and indexed by a high-performance text search engine. During the online retrieval stage, the system renders video search results as dynamic web pages using a JavaScript library that allows users to interactively and intuitively explore video content both efficiently and effectively. Results The authors produced a prototype implementation of the proposed system, which is publicly accessible at https://patentq.njit.edu/oer. To examine the overall advantage of the proposed system for exploring biomedical OER videos, the authors further conducted a user study of a modest scale. The study results encouragingly demonstrate the functional effectiveness and user-friendliness of the new system for facilitating information seeking from and content exploration among massive biomedical OER videos. Conclusion Using the proposed tool, users can efficiently and effectively find videos of interest, precisely locate
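
    The metadata indexing and retrieval idea behind the system can be illustrated with a toy inverted index; the real system extracts much richer audio, visual and textual clues and indexes them with a high-performance search engine.

```python
# A toy inverted index over video metadata, sketching the indexing/retrieval idea;
# the video titles are made-up examples.
from collections import defaultdict

videos = {
    "v1": "introduction to cell biology and mitosis",
    "v2": "cardiac physiology lecture: ecg basics",
    "v3": "mitosis and meiosis compared",
}

index = defaultdict(set)
for vid, text in videos.items():
    for token in text.lower().split():
        index[token.strip(":,")].add(vid)

def search(query):
    sets = [index.get(tok, set()) for tok in query.lower().split()]
    return set.intersection(*sets) if sets else set()

print(search("mitosis"))          # -> {'v1', 'v3'} (order may vary)
print(search("ecg physiology"))   # -> {'v2'}
```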

  2. A Survey of Quality Assurance Practices in Biomedical Open Source Software Projects

    PubMed Central

    Koru, Günes; Neisa, Angelica; Umarji, Medha

    2007-01-01

    Background Open source (OS) software is continuously gaining recognition and use in the biomedical domain, for example, in health informatics and bioinformatics. Objectives Given the mission critical nature of applications in this domain and their potential impact on patient safety, it is important to understand to what degree and how effectively biomedical OS developers perform standard quality assurance (QA) activities such as peer reviews and testing. This would allow the users of biomedical OS software to better understand the quality risks, if any, and the developers to identify process improvement opportunities to produce higher quality software. Methods A survey of developers working on biomedical OS projects was conducted to examine the QA activities that are performed. We took a descriptive approach to summarize the implementation of QA activities and then examined some of the factors that may be related to the implementation of such practices. Results Our descriptive results show that 63% (95% CI, 54-72) of projects did not include peer reviews in their development process, while 82% (95% CI, 75-89) did include testing. Approximately 74% (95% CI, 67-81) of developers did not have a background in computing, 80% (95% CI, 74-87) were paid for their contributions to the project, and 52% (95% CI, 43-60) had PhDs. A multivariate logistic regression model to predict the implementation of peer reviews was not significant (likelihood ratio test = 16.86, 9 df, P = .051) and neither was a model to predict the implementation of testing (likelihood ratio test = 3.34, 9 df, P = .95). Conclusions Less attention is paid to peer review than testing. However, the former is a complementary, and necessary, QA practice rather than an alternative. Therefore, one can argue that there are quality risks, at least at this point in time, in transitioning biomedical OS software into any critical settings that may have operational, financial, or safety implications. Developers of
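
    The survey reports proportions with 95% confidence intervals, e.g. 63% (95% CI, 54-72) for projects without peer review. The normal-approximation arithmetic behind such an interval is sketched below; the sample size is an assumption chosen only to reproduce an interval of that width, not the study's actual n.

```python
# A worked sketch of a 95% confidence interval for a survey proportion using the
# normal approximation; n = 110 is an assumed sample size for illustration.
import math

def proportion_ci(p_hat, n, z=1.96):
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z * se, p_hat + z * se

lo, hi = proportion_ci(0.63, 110)   # e.g. 63% of an assumed 110 projects
print(f"63% (95% CI, {lo:.0%}-{hi:.0%})")
```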

  3. Lessons learned in the generation of biomedical research datasets using Semantic Open Data technologies.

    PubMed

    Legaz-García, María del Carmen; Miñarro-Giménez, José Antonio; Menárguez-Tortosa, Marcos; Fernández-Breis, Jesualdo Tomás

    2015-01-01

    Biomedical research usually requires combining large volumes of data from multiple heterogeneous sources. Such heterogeneity makes not only the generation of research-oriented datasets difficult but also their exploitation. In recent years, the Open Data paradigm has proposed new ways of making data available so that sharing and integration are facilitated. Open Data approaches may produce content readable only by humans or by both humans and machines; the latter is the kind of interest in our work. The Semantic Web provides a natural technological space for data integration and exploitation and offers a range of technologies for generating not only Open Datasets but also Linked Datasets, that is, open datasets linked to other open datasets. According to Berners-Lee's classification, each open dataset can be given a rating of between one and five stars. In recent years, we have developed and applied our SWIT tool, which automates the generation of semantic datasets from heterogeneous data sources. SWIT produces four-star datasets; the fifth star can only be obtained once the dataset is linked to from external datasets. In this paper, we describe how we have applied the tool in two projects related to health care records and orthology data, as well as the major lessons learned from such efforts.
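
    Berners-Lee's star ratings mentioned above form a cumulative ladder: data on the web, machine-readable, in a non-proprietary format, using URIs, and finally linked to other data. A rough scoring sketch follows; the boolean flags are simplified assumptions and are not part of the SWIT tool.

```python
# A rough sketch of Berners-Lee's 5-star open-data ladder as a scoring function;
# the flags and their interpretation are simplified for illustration.
def open_data_stars(on_web, machine_readable, non_proprietary, uses_uris, linked):
    stars = 0
    for condition in (on_web, machine_readable, non_proprietary, uses_uris, linked):
        if not condition:
            break
        stars += 1
    return stars

# A dataset published as RDF with URIs but not yet linked to external datasets,
# as SWIT produces, would reach four stars:
print(open_data_stars(True, True, True, True, False))  # -> 4
```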

  4. ASGARD: an open-access database of annotated transcriptomes for emerging model arthropod species

    PubMed Central

    Zeng, Victor; Extavour, Cassandra G.

    2012-01-01

    The increased throughput and decreased cost of next-generation sequencing (NGS) have shifted the bottleneck in genomic research from sequencing to annotation, analysis and accessibility. This is particularly challenging for research communities working on organisms that lack the basic infrastructure of a sequenced genome, or an efficient way to utilize whatever sequence data may be available. Here we present a new database, the Assembled Searchable Giant Arthropod Read Database (ASGARD). This database is a repository and search engine for transcriptomic data from arthropods that are of high interest to multiple research communities but currently lack sequenced genomes. We demonstrate the functionality and utility of ASGARD using de novo assembled transcriptomes from the milkweed bug Oncopeltus fasciatus, the cricket Gryllus bimaculatus and the amphipod crustacean Parhyale hawaiensis. We have annotated these transcriptomes to assign putative orthology, coding region determination, protein domain identification and Gene Ontology (GO) term annotation to all possible assembly products. ASGARD allows users to search all assemblies by orthology annotation, GO term annotation or Basic Local Alignment Search Tool. User-friendly features of ASGARD include search term auto-completion suggestions based on database content, the ability to download assembly product sequences in FASTA format, direct links to NCBI data for predicted orthologs and graphical representation of the location of protein domains and matches to similar sequences from the NCBI non-redundant database. ASGARD will be a useful repository for transcriptome data from future NGS studies on these and other emerging model arthropods, regardless of sequencing platform, assembly or annotation status. This database thus provides easy, one-stop access to multi-species annotated transcriptome information. We anticipate that this database will be useful for members of multiple research communities, including developmental
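
    Searching annotated assembly products by GO term, one of the query options described above, can be sketched with a small relational table; the schema and example rows below are invented for illustration and do not reflect ASGARD's actual database design.

```python
# A minimal sketch of querying annotated assembly products by GO term; the
# schema and rows are hypothetical.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE annotation (product TEXT, species TEXT, go_term TEXT)")
conn.executemany(
    "INSERT INTO annotation VALUES (?, ?, ?)",
    [("contig_001", "Oncopeltus fasciatus", "GO:0007049"),
     ("contig_002", "Gryllus bimaculatus", "GO:0006355"),
     ("contig_003", "Parhyale hawaiensis", "GO:0007049")],
)

rows = conn.execute(
    "SELECT product, species FROM annotation WHERE go_term = ?", ("GO:0007049",)
).fetchall()
print(rows)
```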

  5. ASGARD: an open-access database of annotated transcriptomes for emerging model arthropod species.

    PubMed

    Zeng, Victor; Extavour, Cassandra G

    2012-01-01

    The increased throughput and decreased cost of next-generation sequencing (NGS) have shifted the bottleneck in genomic research from sequencing to annotation, analysis and accessibility. This is particularly challenging for research communities working on organisms that lack the basic infrastructure of a sequenced genome, or an efficient way to utilize whatever sequence data may be available. Here we present a new database, the Assembled Searchable Giant Arthropod Read Database (ASGARD). This database is a repository and search engine for transcriptomic data from arthropods that are of high interest to multiple research communities but currently lack sequenced genomes. We demonstrate the functionality and utility of ASGARD using de novo assembled transcriptomes from the milkweed bug Oncopeltus fasciatus, the cricket Gryllus bimaculatus and the amphipod crustacean Parhyale hawaiensis. We have annotated these transcriptomes to assign putative orthology, coding region determination, protein domain identification and Gene Ontology (GO) term annotation to all possible assembly products. ASGARD allows users to search all assemblies by orthology annotation, GO term annotation or Basic Local Alignment Search Tool. User-friendly features of ASGARD include search term auto-completion suggestions based on database content, the ability to download assembly product sequences in FASTA format, direct links to NCBI data for predicted orthologs and graphical representation of the location of protein domains and matches to similar sequences from the NCBI non-redundant database. ASGARD will be a useful repository for transcriptome data from future NGS studies on these and other emerging model arthropods, regardless of sequencing platform, assembly or annotation status. This database thus provides easy, one-stop access to multi-species annotated transcriptome information. We anticipate that this database will be useful for members of multiple research communities, including developmental

  6. A new visual navigation system for exploring biomedical Open Educational Resource (OER) videos.

    PubMed

    Zhao, Baoquan; Xu, Songhua; Lin, Shujin; Luo, Xiaonan; Duan, Lian

    2016-04-01

    Biomedical videos as open educational resources (OERs) are increasingly proliferating on the Internet. Unfortunately, seeking personally valuable content from among the vast corpus of quality yet diverse OER videos is nontrivial due to limitations of today's keyword- and content-based video retrieval techniques. To address this need, this study introduces a novel visual navigation system that facilitates users' information seeking from biomedical OER videos in mass quantity by interactively offering visual and textual navigational clues that are both semantically revealing and user-friendly. The authors collected and processed around 25 000 YouTube videos, which collectively last for a total length of about 4000 h, in the broad field of biomedical sciences for our experiment. For each video, its semantic clues are first extracted automatically through computationally analyzing audio and visual signals, as well as text either accompanying or embedded in the video. These extracted clues are subsequently stored in a metadata database and indexed by a high-performance text search engine. During the online retrieval stage, the system renders video search results as dynamic web pages using a JavaScript library that allows users to interactively and intuitively explore video content both efficiently and effectively. The authors produced a prototype implementation of the proposed system, which is publicly accessible at https://patentq.njit.edu/oer. To examine the overall advantage of the proposed system for exploring biomedical OER videos, the authors further conducted a user study of a modest scale. The study results encouragingly demonstrate the functional effectiveness and user-friendliness of the new system for facilitating information seeking from and content exploration among massive biomedical OER videos. Using the proposed tool, users can efficiently and effectively find videos of interest, precisely locate video segments delivering personally valuable

  7. Status of open access in the biomedical field in 2005

    PubMed Central

    Matsubayashi, Mamiko; Kurata, Keiko; Sakai, Yukiko; Morioka, Tomoko; Kato, Shinya; Mine, Shinji; Ueda, Shuichi

    2009-01-01

    Objectives: This study was designed to document the state of open access (OA) in the biomedical field in 2005. Methods: PubMed was used to collect bibliographic data on target articles published in 2005. PubMed, Google Scholar, Google, and OAIster were then used to establish the availability of free full text online for these publications. Articles were analyzed by type of OA, country, type of article, impact factor, publisher, and publishing model to provide insight into the current state of OA. Results: Twenty-seven percent of all the articles were accessible as OA articles. More than 70% of the OA articles were provided through journal websites. Mid-rank commercial publishers often provided OA articles in OA journals, while society publishers tended to provide OA articles in the context of a traditional subscription model. The rate of OA articles available from the websites of individual authors or in institutional repositories was quite low. Discussion/Conclusions: In 2005, OA in the biomedical field was achieved under an umbrella of existing scholarly communication systems. Typically, OA articles were published as part of subscription journals published by scholarly societies. OA journals published by BioMed Central contributed to a small portion of all OA articles. PMID:19159007

  8. 3D visualization of biomedical CT images based on OpenGL and VRML techniques

    NASA Astrophysics Data System (ADS)

    Yin, Meng; Luo, Qingming; Xia, Fuhua

    2002-04-01

    Current high-performance computers and advanced image-processing capabilities have made three-dimensional visualization of biomedical computed tomography (CT) images a great aid to research in biomedical engineering. To keep pace with Internet-based technology, in which 3D data are typically stored and processed on powerful servers accessed over TCP/IP, the isosurface results should be broadly applicable to medical visualization. Furthermore, this project is intended to become part of the PACS system our lab is working on. The system therefore uses the 3D file format VRML 2.0, which is accessed through a Web interface for manipulating 3D models. The program generates and modifies triangular isosurface meshes using the marching cubes algorithm, and then uses OpenGL and MFC techniques to render the isosurface and manipulate the voxel data. This software is well suited to the visualization of volumetric data. The drawbacks are that 3D image processing on personal computers is rather slow and that the set of tools for 3D visualization is limited. However, these limitations have not affected the applicability of this platform for the tasks needed in elementary laboratory experiments or data preprocessing.
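
    A present-day sketch of the isosurface-extraction step is shown below using scikit-image's marching cubes on a synthetic volume, with the mesh written out as a Wavefront OBJ file; the original system used its own marching cubes implementation rendered through OpenGL and exported to VRML.

```python
# Isosurface extraction with scikit-image's marching cubes on a synthetic volume;
# this is a modern stand-in for the paper's own implementation, not its code.
import numpy as np
from skimage import measure

# Synthetic "CT" volume: a solid sphere of radius 20 voxels.
z, y, x = np.mgrid[-32:32, -32:32, -32:32]
volume = (np.sqrt(x**2 + y**2 + z**2) <= 20).astype(np.float32)

verts, faces, normals, values = measure.marching_cubes(volume, level=0.5)

with open("isosurface.obj", "w") as f:
    for vx, vy, vz in verts:
        f.write(f"v {vx} {vy} {vz}\n")
    for a, b, c in faces:                 # OBJ face indices are 1-based
        f.write(f"f {a + 1} {b + 1} {c + 1}\n")

print(len(verts), "vertices,", len(faces), "faces")
```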

  9. Annotated bibliography of the biomedical literature pertaining to chiropractic, pediatrics and manipulation in relation to the treatment of health conditions

    PubMed Central

    Gotlib, Allan C; Beingessner, Melanie

    1995-01-01

    Biomedical literature retrieval, both indexed and non-indexed, with respect to the application of manipulative therapy with therapeutic intent and pediatric health conditions (ages 0 to 17 years) yielded 66 discrete documents which met specified inclusion and exclusion criteria. There was one experimental study (RCT), 3 observational (cohort, case-control) studies and 62 descriptive studies (case series, case reports, surveys, literature reviews). An independent rating panel determined consistency with a modified quality of evidence scale adopted from procedure ratings system 1 of Clinical Guidelines for Chiropractic Practice in Canada. Results indicate minimal Class 1 and Class 2 and some Class 3 evidence for a variety of pediatric conditions utilizing the application of manipulation with therapeutic intent.

  10. PAPARA(ZZ)I: An open-source software interface for annotating photographs of the deep-sea

    NASA Astrophysics Data System (ADS)

    Marcon, Yann; Purser, Autun

    PAPARA(ZZ)I is a lightweight and intuitive image annotation program developed for the study of benthic megafauna. It offers functionalities such as free, grid and random point annotation. Annotations may be made following existing classification schemes for marine biota and substrata or with the use of user defined, customised lists of keywords, which broadens the range of potential application of the software to other types of studies (e.g. marine litter distribution assessment). If Internet access is available, PAPARA(ZZ)I can also query and use standardised taxa names directly from the World Register of Marine Species (WoRMS). Program outputs include abundances, densities and size calculations per keyword (e.g. per taxon). These results are written into text files that can be imported into spreadsheet programs for further analyses. PAPARA(ZZ)I is open-source and is available at http://papara-zz-i.github.io. Compiled versions exist for most 64-bit operating systems: Windows, Mac OS X and Linux.
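
    The per-keyword abundance and density outputs described above reduce to simple counting over point annotations; the annotations and imaged area in the sketch below are made-up example values.

```python
# A small sketch of per-keyword abundance and density from point annotations;
# the annotations and the imaged area are invented example values.
from collections import Counter

# Point annotations for one photograph: (x, y, keyword).
annotations = [
    (120, 340, "Ophiuroidea"), (480, 200, "Ophiuroidea"),
    (300, 610, "Porifera"), (750, 90, "marine litter"),
]
imaged_area_m2 = 4.2  # assumed area covered by the photo, in square metres

abundance = Counter(keyword for _, _, keyword in annotations)
density = {kw: n / imaged_area_m2 for kw, n in abundance.items()}

for kw in abundance:
    print(f"{kw}: n={abundance[kw]}, density={density[kw]:.2f} per m^2")
```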

  11. WikiGenomes: an open web application for community consumption and curation of gene annotation data in Wikidata

    PubMed Central

    Lelong, Sebastien; Burgstaller-Muehlbacher, Sebastian; Waagmeester, Andra; Diesh, Colin; Dunn, Nathan; Munoz-Torres, Monica; Stupp, Gregory S.; Wu, Chunlei

    2017-01-01

    Abstract With the advancement of genome-sequencing technologies, new genomes are being sequenced daily. Although these sequences are deposited in publicly available data warehouses, their functional and genomic annotations (beyond genes which are predicted automatically) mostly reside in the text of primary publications. Professional curators are hard at work extracting those annotations from the literature for the most studied organisms and depositing them in structured databases. However, the resources don’t exist to fund the comprehensive curation of the thousands of newly sequenced organisms in this manner. Here, we describe WikiGenomes (wikigenomes.org), a web application that facilitates the consumption and curation of genomic data by the entire scientific community. WikiGenomes is based on Wikidata, an openly editable knowledge graph with the goal of aggregating published knowledge into a free and open database. WikiGenomes empowers the individual genomic researcher to contribute their expertise to the curation effort and integrates the knowledge into Wikidata, enabling it to be accessed by anyone without restriction. Database URL: www.wikigenomes.org PMID:28365742

  12. WikiGenomes: an open web application for community consumption and curation of gene annotation data in Wikidata.

    PubMed

    Putman, Tim E; Lelong, Sebastien; Burgstaller-Muehlbacher, Sebastian; Waagmeester, Andra; Diesh, Colin; Dunn, Nathan; Munoz-Torres, Monica; Stupp, Gregory S; Wu, Chunlei; Su, Andrew I; Good, Benjamin M

    2017-01-01

    With the advancement of genome-sequencing technologies, new genomes are being sequenced daily. Although these sequences are deposited in publicly available data warehouses, their functional and genomic annotations (beyond genes which are predicted automatically) mostly reside in the text of primary publications. Professional curators are hard at work extracting those annotations from the literature for the most studied organisms and depositing them in structured databases. However, the resources don't exist to fund the comprehensive curation of the thousands of newly sequenced organisms in this manner. Here, we describe WikiGenomes (wikigenomes.org), a web application that facilitates the consumption and curation of genomic data by the entire scientific community. WikiGenomes is based on Wikidata, an openly editable knowledge graph with the goal of aggregating published knowledge into a free and open database. WikiGenomes empowers the individual genomic researcher to contribute their expertise to the curation effort and integrates the knowledge into Wikidata, enabling it to be accessed by anyone without restriction. www.wikigenomes.org.
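
    Because WikiGenomes writes its curated annotations into Wikidata, they can be read back with a SPARQL query against the public endpoint. The sketch below assumes that Wikidata property P351 holds Entrez Gene IDs and uses Entrez ID 7157 as a hypothetical example; treat the query details as assumptions rather than documented WikiGenomes behaviour.

```python
# A hedged sketch of querying gene items from Wikidata over SPARQL; P351 is
# assumed to be the Entrez Gene ID property, and 7157 is an example identifier.
import requests

query = """
SELECT ?gene ?geneLabel WHERE {
  ?gene wdt:P351 "7157" .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
"""

resp = requests.get(
    "https://query.wikidata.org/sparql",
    params={"query": query, "format": "json"},
    headers={"User-Agent": "wikigenomes-sketch/0.1 (example)"},
    timeout=30,
)
for row in resp.json()["results"]["bindings"]:
    print(row["gene"]["value"], row["geneLabel"]["value"])
```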

  13. The ImageJ ecosystem: an open platform for biomedical image analysis

    PubMed Central

    Schindelin, Johannes; Rueden, Curtis T.; Hiner, Mark C.; Eliceiri, Kevin W.

    2015-01-01

    Technology in microscopy advances rapidly, enabling increasingly affordable, faster, and more precise quantitative biomedical imaging, which necessitates correspondingly more-advanced image processing and analysis techniques. A wide range of software is available – from commercial to academic, special-purpose to Swiss army knife, small to large–but a key characteristic of software that is suitable for scientific inquiry is its accessibility. Open-source software is ideal for scientific endeavors because it can be freely inspected, modified, and redistributed; in particular, the open-software platform ImageJ has had a huge impact on life sciences, and continues to do so. From its inception, ImageJ has grown significantly due largely to being freely available and its vibrant and helpful user community. Scientists as diverse as interested hobbyists, technical assistants, students, scientific staff, and advanced biology researchers use ImageJ on a daily basis, and exchange knowledge via its dedicated mailing list. Uses of ImageJ range from data visualization and teaching to advanced image processing and statistical analysis. The software's extensibility continues to attract biologists at all career stages as well as computer scientists who wish to effectively implement specific image-processing algorithms. In this review, we use the ImageJ project as a case study of how open-source software fosters its suites of software tools, making multitudes of image-analysis technology easily accessible to the scientific community. We specifically explore what makes ImageJ so popular, how it impacts life science, how it inspires other projects, and how it is self-influenced by coevolving projects within the ImageJ ecosystem. PMID:26153368

  14. The ImageJ ecosystem: An open platform for biomedical image analysis.

    PubMed

    Schindelin, Johannes; Rueden, Curtis T; Hiner, Mark C; Eliceiri, Kevin W

    2015-01-01

    Technology in microscopy advances rapidly, enabling increasingly affordable, faster, and more precise quantitative biomedical imaging, which necessitates correspondingly more-advanced image processing and analysis techniques. A wide range of software is available-from commercial to academic, special-purpose to Swiss army knife, small to large-but a key characteristic of software that is suitable for scientific inquiry is its accessibility. Open-source software is ideal for scientific endeavors because it can be freely inspected, modified, and redistributed; in particular, the open-software platform ImageJ has had a huge impact on the life sciences, and continues to do so. From its inception, ImageJ has grown significantly due largely to being freely available and its vibrant and helpful user community. Scientists as diverse as interested hobbyists, technical assistants, students, scientific staff, and advanced biology researchers use ImageJ on a daily basis, and exchange knowledge via its dedicated mailing list. Uses of ImageJ range from data visualization and teaching to advanced image processing and statistical analysis. The software's extensibility continues to attract biologists at all career stages as well as computer scientists who wish to effectively implement specific image-processing algorithms. In this review, we use the ImageJ project as a case study of how open-source software fosters its suites of software tools, making multitudes of image-analysis technology easily accessible to the scientific community. We specifically explore what makes ImageJ so popular, how it impacts the life sciences, how it inspires other projects, and how it is self-influenced by coevolving projects within the ImageJ ecosystem.

  15. Schools of Choice: An Annotated Catalog of Key Choice Elements: Open Enrollment, Diversity and Empowerment.

    ERIC Educational Resources Information Center

    Oakland Univ., Rochester, MI. School of Human and Educational Services.

    This catalog, an outcome of the Project To Access Choice in Education (PACE), lists examples of public schools throughout the nation offering choices in education. "Schools of Choice" are defined as those offering one or more of the following three dynamics: (1) open enrollment, the freedom for families to choose the elementary or secondary…

  16. Semantic annotation in biomedicine: the current landscape.

    PubMed

    Jovanović, Jelena; Bagheri, Ebrahim

    2017-09-22

    The abundance and unstructured nature of biomedical texts, be it clinical or research content, impose significant challenges for the effective and efficient use of information and knowledge stored in such texts. Annotation of biomedical documents with machine intelligible semantics facilitates advanced, semantics-based text management, curation, indexing, and search. This paper focuses on annotation of biomedical entity mentions with concepts from relevant biomedical knowledge bases such as UMLS. As a result, the meaning of those mentions is unambiguously and explicitly defined, and thus made readily available for automated processing. This process is widely known as semantic annotation, and the tools that perform it are known as semantic annotators. Over the last dozen years, the biomedical research community has invested significant efforts in the development of biomedical semantic annotation technology. Aiming to establish grounds for further developments in this area, we review a selected set of state of the art biomedical semantic annotators, focusing particularly on general purpose annotators, that is, semantic annotation tools that can be customized to work with texts from any area of biomedicine. We also examine potential directions for further improvements of today's annotators which could make them even more capable of meeting the needs of real-world applications. To motivate and encourage further developments in this area, along the suggested and/or related directions, we review existing and potential practical applications and benefits of semantic annotators.
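
    At its simplest, a semantic annotator of the kind surveyed above maps longest-matching text mentions to concept identifiers from a knowledge base. The toy lexicon and CUI-style codes below are illustrative stand-ins for a real UMLS-backed dictionary.

```python
# A toy dictionary-based annotator: longest-match lookup of mentions against a
# small lexicon; the entries and CUI-style codes are illustrative only.
lexicon = {
    "myocardial infarction": "C0027051",
    "infarction": "C0021308",
    "aspirin": "C0004057",
}

def annotate(text):
    tokens = text.lower().split()
    spans = []
    i = 0
    while i < len(tokens):
        match = None
        for j in range(len(tokens), i, -1):          # prefer the longest match
            candidate = " ".join(tokens[i:j])
            if candidate in lexicon:
                match = (i, j, candidate, lexicon[candidate])
                break
        if match:
            spans.append(match)
            i = match[1]
        else:
            i += 1
    return spans

print(annotate("Patient with myocardial infarction was given aspirin"))
```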

  17. Computing human image annotation.

    PubMed

    Channin, David S; Mongkolwat, Pattanasak; Kleper, Vladimir; Rubin, Daniel L

    2009-01-01

    An image annotation is the explanatory or descriptive information about the pixel data of an image that is generated by a human (or machine) observer. An image markup is the graphical symbols placed over the image to depict an annotation. In the majority of current, clinical and research imaging practice, markup is captured in proprietary formats and annotations are referenced only in free text radiology reports. This makes these annotations difficult to query, retrieve and compute upon, hampering their integration into other data mining and analysis efforts. This paper describes the National Cancer Institute's Cancer Biomedical Informatics Grid's (caBIG) Annotation and Image Markup (AIM) project, focusing on how to use AIM to query for annotations. The AIM project delivers an information model for image annotation and markup. The model uses controlled terminologies for important concepts. All of the classes and attributes of the model have been harmonized with the other models and common data elements in use at the National Cancer Institute. The project also delivers XML schemata necessary to instantiate AIMs in XML as well as a software application for translating AIM XML into DICOM S/R and HL7 CDA. Large collections of AIM annotations can be built and then queried as Grid or Web services. Using the tools of the AIM project, image annotations and their markup can be captured and stored in human and machine readable formats. This enables the inclusion of human image observation and inference as part of larger data mining and analysis activities.
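
    The value of a structured annotation model is that markup becomes machine-readable rather than free text. The sketch below serializes a minimal, made-up annotation-plus-markup structure as XML; it is not the actual AIM schema, whose classes and attributes are defined by the caBIG project.

```python
# A minimal, made-up XML serialization of an image annotation with geometric
# markup; element and attribute names are hypothetical, not the AIM schema.
import xml.etree.ElementTree as ET

annotation = ET.Element("ImageAnnotation", name="Lesion measurement",
                        dateTime="2009-01-01T10:00:00")
ET.SubElement(annotation, "ImagingObservation",
              codeMeaning="Mass", codingScheme="RadLex")
markup = ET.SubElement(annotation, "GeometricShape", shapeType="Circle")
ET.SubElement(markup, "Coordinate", x="128.5", y="64.0")
ET.SubElement(markup, "Coordinate", x="140.0", y="64.0")

print(ET.tostring(annotation, encoding="unicode"))
```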

  18. Discovery and Functional Annotation of SIX6 Variants in Primary Open-Angle Glaucoma

    PubMed Central

    Allingham, R. Rand; Whigham, Benjamin T.; Havens, Shane; Garrett, Melanie E.; Qiao, Chunyan; Katsanis, Nicholas; Wiggs, Janey L.; Pasquale, Louis R.; Ashley-Koch, Allison; Oh, Edwin C.; Hauser, Michael A.

    2014-01-01

    Glaucoma is a leading cause of blindness worldwide. Primary open-angle glaucoma (POAG) is the most common subtype and is a complex trait with multigenic inheritance. Genome-wide association studies have previously identified a significant association between POAG and the SIX6 locus (rs10483727, odds ratio (OR) = 1.32, p = 3.87×10^-11). SIX6 plays a role in ocular development and has been associated with the morphology of the optic nerve. We sequenced the SIX6 coding and regulatory regions in 262 POAG cases and 256 controls and identified six nonsynonymous coding variants, including five rare and one common variant, Asn141His (rs33912345), which was associated significantly with POAG (OR = 1.27, p = 4.2×10^-10) in the NEIGHBOR/GLAUGEN datasets. These variants were tested in an in vivo Danio rerio (zebrafish) complementation assay to evaluate ocular metrics such as eye size and optic nerve structure. Five variants, found primarily in POAG cases, were hypomorphic or null, while the sixth variant, found only in controls, was benign. One variant in the SIX6 enhancer increased expression of SIX6 and disrupted its regulation. Finally, to our knowledge for the first time, we have identified a clinical feature in POAG patients that appears to be dependent upon SIX6 genotype: patients who are homozygous for the SIX6 risk allele (His141) have a statistically thinner retinal nerve fiber layer than patients homozygous for the SIX6 non-risk allele (Asn141). Our results, in combination with previous SIX6 work, lead us to hypothesize that SIX6 risk variants disrupt the development of the neural retina, leading to a reduced number of retinal ganglion cells, thereby increasing the risk of glaucoma-associated vision loss. PMID:24875647
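
    For readers unfamiliar with the reported statistics, the sketch below shows how an allelic odds ratio and its 95% confidence interval are computed from a 2x2 table; the counts are invented purely to illustrate the arithmetic and are not the NEIGHBOR/GLAUGEN data.

```python
# A worked sketch of an allelic odds ratio and 95% CI from a 2x2 table;
# the counts are hypothetical.
import math

# Risk allele vs. non-risk allele counts in cases and controls (hypothetical).
a, b = 320, 210   # cases:    risk, non-risk
c, d = 260, 230   # controls: risk, non-risk

or_ = (a * d) / (b * c)
se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
ci_low = math.exp(math.log(or_) - 1.96 * se_log_or)
ci_high = math.exp(math.log(or_) + 1.96 * se_log_or)

print(f"OR = {or_:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```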

  19. KEGG orthology-based annotation of the predicted proteome of Acropora digitifera: ZoophyteBase - an open access and searchable database of a coral genome

    PubMed Central

    2013-01-01

    Background Contemporary coral reef research has firmly established that a genomic approach is urgently needed to better understand the effects of anthropogenic environmental stress and global climate change on coral holobiont interactions. Here we present KEGG orthology-based annotation of the complete genome sequence of the scleractinian coral Acropora digitifera and provide the first comprehensive view of the genome of a reef-building coral by applying advanced bioinformatics. Description Sequences from the KEGG database of protein function were used to construct hidden Markov models. These models were used to search the predicted proteome of A. digitifera to establish complete genomic annotation. The annotated dataset is published in ZoophyteBase, an open access format with different options for searching the data. A particularly useful feature is the ability to use a Google-like search engine that links query words to protein attributes. We present features of the annotation that underpin the molecular structure of key processes of coral physiology that include (1) regulatory proteins of symbiosis, (2) planula and early developmental proteins, (3) neural messengers, receptors and sensory proteins, (4) calcification and Ca2+-signalling proteins, (5) plant-derived proteins, (6) proteins of nitrogen metabolism, (7) DNA repair proteins, (8) stress response proteins, (9) antioxidant and redox-protective proteins, (10) proteins of cellular apoptosis, (11) microbial symbioses and pathogenicity proteins, (12) proteins of viral pathogenicity, (13) toxins and venom, (14) proteins of the chemical defensome and (15) coral epigenetics. Conclusions We advocate that providing annotation in an open-access searchable database available to the public domain will give an unprecedented foundation to interrogate the fundamental molecular structure and interactions of coral symbiosis and allow critical questions to be addressed at the genomic level based on combined aspects of
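
    The HMM-search step described above can be scripted around HMMER3. The sketch below assumes hmmsearch is installed and that profile HMMs built from KEGG sequences and the predicted proteome exist at the hypothetical paths shown; it is not the authors' pipeline.

```python
# Running hmmsearch (HMMER3) against a predicted proteome and reading the
# tabular hits; file paths are hypothetical and hmmsearch must be installed.
import subprocess

hmm_profiles = "kegg_ko_profiles.hmm"      # assumed: concatenated profile HMMs
proteome = "adigitifera_proteins.fasta"    # assumed: predicted proteome FASTA

subprocess.run(
    ["hmmsearch", "--tblout", "ko_hits.tbl", "--cpu", "4", "-E", "1e-5",
     hmm_profiles, proteome],
    check=True,
)

# Keep only the target/query columns from the tabular output (comment lines start with '#').
with open("ko_hits.tbl") as fh:
    hits = [line.split()[:4] for line in fh if not line.startswith("#")]
print(f"{len(hits)} hits")
```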

  20. DeTEXT: A Database for Evaluating Text Extraction from Biomedical Literature Figures

    PubMed Central

    Yin, Xu-Cheng; Yang, Chun; Pei, Wei-Yi; Man, Haixia; Zhang, Jun; Learned-Miller, Erik; Yu, Hong

    2015-01-01

    Hundreds of millions of figures are available in biomedical literature, representing important biomedical experimental evidence. Since text is a rich source of information in figures, automatically extracting such text may assist in the task of mining figure information. A high-quality ground truth standard can greatly facilitate the development of an automated system. This article describes DeTEXT: A database for evaluating text extraction from biomedical literature figures. It is the first publicly available, human-annotated, high quality, and large-scale figure-text dataset with 288 full-text articles, 500 biomedical figures, and 9308 text regions. This article describes how figures were selected from open-access full-text biomedical articles and how annotation guidelines and annotation tools were developed. We also discuss the inter-annotator agreement and the reliability of the annotations. We summarize the statistics of the DeTEXT data and make available evaluation protocols for DeTEXT. Finally we lay out challenges we observed in the automated detection and recognition of figure text and discuss research directions in this area. DeTEXT is publicly available for downloading at http://prir.ustb.edu.cn/DeTEXT/. PMID:25951377
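
    Evaluation of text-region detection against a ground truth such as DeTEXT conventionally matches boxes by intersection-over-union. The sketch below uses a 0.5 IoU threshold and toy boxes as illustrative choices, not the paper's exact protocol.

```python
# IoU-based matching of detected text regions against ground-truth boxes;
# boxes are (x1, y1, x2, y2) and the values are toy examples.
def iou(box_a, box_b):
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union else 0.0

ground_truth = [(10, 10, 100, 40), (150, 60, 220, 90)]
detections = [(12, 8, 98, 42), (300, 300, 340, 320)]

matched = sum(any(iou(d, g) >= 0.5 for g in ground_truth) for d in detections)
precision = matched / len(detections)
recall = matched / len(ground_truth)
print(f"precision={precision:.2f} recall={recall:.2f}")
```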

  1. xGDBvm: A Web GUI-Driven Workflow for Annotating Eukaryotic Genomes in the Cloud

    PubMed Central

    Merchant, Nirav

    2016-01-01

    Genome-wide annotation of gene structure requires the integration of numerous computational steps. Currently, annotation is arguably best accomplished through collaboration of bioinformatics and domain experts, with broad community involvement. However, such a collaborative approach is not scalable at today’s pace of sequence generation. To address this problem, we developed the xGDBvm software, which uses an intuitive graphical user interface to access a number of common genome analysis and gene structure tools, preconfigured in a self-contained virtual machine image. Once their virtual machine instance is deployed through iPlant’s Atmosphere cloud services, users access the xGDBvm workflow via a unified Web interface to manage inputs, set program parameters, configure links to high-performance computing (HPC) resources, view and manage output, apply analysis and editing tools, or access contextual help. The xGDBvm workflow will mask the genome, compute spliced alignments from transcript and/or protein inputs (locally or on a remote HPC cluster), predict gene structures and gene structure quality, and display output in a public or private genome browser complete with accessory tools. Problematic gene predictions are flagged and can be reannotated using the integrated yrGATE annotation tool. xGDBvm can also be configured to append or replace existing data or load precomputed data. Multiple genomes can be annotated and displayed, and outputs can be archived for sharing or backup. xGDBvm can be adapted to a variety of use cases including de novo genome annotation, reannotation, comparison of different annotations, and training or teaching. PMID:27020957

  2. BioSimplify: an open source sentence simplification engine to improve recall in automatic biomedical information extraction.

    PubMed

    Jonnalagadda, Siddhartha; Gonzalez, Graciela

    2010-11-13

    BioSimplify is an open source tool written in Java that introduces and facilitates the use of a novel model for sentence simplification tuned for automatic discourse analysis and information extraction (as opposed to sentence simplification for improving human readability). The model is based on a "shot-gun" approach that produces many different (simpler) versions of the original sentence by combining variants of its constituent elements. This tool is optimized for processing biomedical scientific literature such as the abstracts indexed in PubMed. We tested our tool's impact on the task of protein-protein interaction (PPI) extraction: it improved the F-score of the PPI tool by around 7%, with an improvement in recall of around 20%. The BioSimplify tool and test corpus can be downloaded from https://biosimplify.sourceforge.net.

  3. Open access in the biomedical field: a unique opportunity for researchers (and research itself).

    PubMed

    Giglia, E

    2007-06-01

    The aim of this article is to offer an overview of the Open Access strategy and its innovative idea of free scholarly communication. Following the worldwide debate on the crisis of scholarly communication and the new opportunities of a networked environment, definitions, purposes and real advantages of the Open Access pathway are presented from a researcher's point of view. To maximize impact and dissemination by providing free access to research results, two complementary roads are pointed out and explained: self-archiving in open archives and publishing in Open Access journals. To help authors make their choice, the most useful tools for finding one's way in this new reality are shown: directories, search engines, and citation-tracking projects. With this survey done, the article concludes with the Open Access challenges and most debated themes: impact and dissemination, new assessment measures alternative to the Impact Factor, new mandatory policies of the funding agencies, and questions related to the copyright issue.

  4. Opening Pathways for Underrepresented High School Students to Biomedical Research Careers: The Emory University RISE Program

    PubMed Central

    Rohrbaugh, Margaret C.; Corces, Victor G.

    2011-01-01

    Increasing the college graduation rates of underrepresented minority students in science disciplines is essential to attain a diverse workforce for the 21st century. The Research Internship and Science Education (RISE) program attempts to motivate and prepare students from the Atlanta Public School system, where underrepresented minority (URM) students comprise a majority of the population, for biomedical science careers by offering the opportunity to participate in an original research project. Students work in a research laboratory from the summer of their sophomore year until graduation, mentored by undergraduate and graduate students and postdoctoral fellows (postdocs). In addition, they receive instruction in college-level biology, scholastic assessment test (SAT) preparation classes, and help with the college application process. During the last 4 yr, RISE students have succeeded in the identification and characterization of a series of proteins involved in the regulation of nuclear organization and transcription. All but 1 of 39 RISE students have continued on to 4-year college undergraduate studies and 61% of those students are currently enrolled in science-related majors. These results suggest that the use of research-based experiences at the high school level may contribute to the increased recruitment of underrepresented students into science-related careers. PMID:21926301

  5. Opening pathways for underrepresented high school students to biomedical research careers: the Emory University RISE program.

    PubMed

    Rohrbaugh, Margaret C; Corces, Victor G

    2011-12-01

    Increasing the college graduation rates of underrepresented minority students in science disciplines is essential to attain a diverse workforce for the 21st century. The Research Internship and Science Education (RISE) program attempts to motivate and prepare students from the Atlanta Public School system, where underrepresented minority (URM) students comprise a majority of the population, for biomedical science careers by offering the opportunity to participate in an original research project. Students work in a research laboratory from the summer of their sophomore year until graduation, mentored by undergraduate and graduate students and postdoctoral fellows (postdocs). In addition, they receive instruction in college-level biology, scholastic assessment test (SAT) preparation classes, and help with the college application process. During the last 4 yr, RISE students have succeeded in the identification and characterization of a series of proteins involved in the regulation of nuclear organization and transcription. All but 1 of 39 RISE students have continued on to 4-year college undergraduate studies and 61% of those students are currently enrolled in science-related majors. These results suggest that the use of research-based experiences at the high school level may contribute to the increased recruitment of underrepresented students into science-related careers.

  6. Ethics of open access to biomedical research: Just a special case of ethics of open access to research

    PubMed Central

    Harnad, Stevan

    2007-01-01

    The ethical case for Open Access (OA) (free online access) to research findings is especially salient when it is public health that is being compromised by needless access restrictions. But the ethical imperative for OA is far more general: It applies to all scientific and scholarly research findings published in peer-reviewed journals. And peer-to-peer access is far more important than direct public access. Most research is funded so as to be conducted and published, by researchers, in order to be taken up, used, and built upon in further research and applications, again by researchers (pure and applied, including practitioners), for the benefit of the public that funded it – not in order to generate revenue for the peer-reviewed journal publishing industry (nor even because there is a burning public desire to read much of it). Hence OA needs to be mandated, by researchers' institutions and funders, for all research. PMID:18067660

  7. Ethics of open access to biomedical research: just a special case of ethics of open access to research.

    PubMed

    Harnad, Stevan

    2007-12-07

    The ethical case for Open Access (OA) (free online access) to research findings is especially salient when it is public health that is being compromised by needless access restrictions. But the ethical imperative for OA is far more general: It applies to all scientific and scholarly research findings published in peer-reviewed journals. And peer-to-peer access is far more important than direct public access. Most research is funded so as to be conducted and published, by researchers, in order to be taken up, used, and built upon in further research and applications, again by researchers (pure and applied, including practitioners), for the benefit of the public that funded it - not in order to generate revenue for the peer-reviewed journal publishing industry (nor even because there is a burning public desire to read much of it). Hence OA needs to be mandated, by researchers' institutions and funders, for all research.

  8. Automatic discourse connective detection in biomedical text.

    PubMed

    Ramesh, Balaji Polepalli; Prasad, Rashmi; Miller, Tim; Harrington, Brian; Yu, Hong

    2012-01-01

    Relation extraction in biomedical text mining systems has largely focused on identifying clause-level relations, but increasing sophistication demands the recognition of relations at discourse level. A first step in identifying discourse relations involves the detection of discourse connectives: words or phrases used in text to express discourse relations. In this study supervised machine-learning approaches were developed and evaluated for automatically identifying discourse connectives in biomedical text. Two supervised machine-learning models (support vector machines and conditional random fields) were explored for identifying discourse connectives in biomedical literature. In-domain supervised machine-learning classifiers were trained on the Biomedical Discourse Relation Bank, an annotated corpus of discourse relations over 24 full-text biomedical articles (~112,000 word tokens), a subset of the GENIA corpus. Novel domain adaptation techniques were also explored to leverage the larger open-domain Penn Discourse Treebank (~1 million word tokens). The models were evaluated using the standard evaluation metrics of precision, recall and F1 scores. Supervised machine-learning approaches can automatically identify discourse connectives in biomedical text, and the novel domain adaptation techniques yielded the best performance: 0.761 F1 score. A demonstration version of the fully implemented classifier BioConn is available at: http://bioconn.askhermes.org.

  9. Automatic discourse connective detection in biomedical text

    PubMed Central

    Polepalli Ramesh, Balaji; Prasad, Rashmi; Miller, Tim; Harrington, Brian

    2012-01-01

    Objective Relation extraction in biomedical text mining systems has largely focused on identifying clause-level relations, but increasing sophistication demands the recognition of relations at discourse level. A first step in identifying discourse relations involves the detection of discourse connectives: words or phrases used in text to express discourse relations. In this study supervised machine-learning approaches were developed and evaluated for automatically identifying discourse connectives in biomedical text. Materials and Methods Two supervised machine-learning models (support vector machines and conditional random fields) were explored for identifying discourse connectives in biomedical literature. In-domain supervised machine-learning classifiers were trained on the Biomedical Discourse Relation Bank, an annotated corpus of discourse relations over 24 full-text biomedical articles (∼112 000 word tokens), a subset of the GENIA corpus. Novel domain adaptation techniques were also explored to leverage the larger open-domain Penn Discourse Treebank (∼1 million word tokens). The models were evaluated using the standard evaluation metrics of precision, recall and F1 scores. Results and Conclusion Supervised machine-learning approaches can automatically identify discourse connectives in biomedical text, and the novel domain adaptation techniques yielded the best performance: 0.761 F1 score. A demonstration version of the fully implemented classifier BioConn is available at: http://bioconn.askhermes.org. PMID:22744958
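
    The records above describe supervised token classification for connective detection. As a rough illustration of that general setup (not the BioConn classifier itself), the sketch below trains a linear SVM over simple lexical context features; the toy sentences, labels and feature set are invented for demonstration.

```python
# Minimal sketch of token-level discourse-connective detection with a linear SVM.
# Illustrative stand-in only, not the BioConn classifier described above;
# the toy training data and feature set are invented for demonstration.
from sklearn.feature_extraction import DictVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

def token_features(tokens, i):
    """Simple lexical/contextual features for token i."""
    return {
        "word": tokens[i].lower(),
        "prev": tokens[i - 1].lower() if i > 0 else "<s>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "</s>",
        "is_sentence_initial": i == 0,
    }

# Toy annotated sentences: 1 = token is (part of) a discourse connective.
sentences = [
    (["However", ",", "the", "protein", "was", "not", "expressed"], [1, 0, 0, 0, 0, 0, 0]),
    (["The", "cells", "died", "because", "apoptosis", "was", "induced"], [0, 0, 0, 1, 0, 0, 0]),
    (["We", "measured", "expression", "levels"], [0, 0, 0, 0]),
]

X = [token_features(toks, i) for toks, _ in sentences for i in range(len(toks))]
y = [lab for _, labs in sentences for lab in labs]

clf = make_pipeline(DictVectorizer(), LinearSVC())
clf.fit(X, y)

test = ["Therefore", ",", "binding", "was", "reduced"]
pred = clf.predict([token_features(test, i) for i in range(len(test))])
print(list(zip(test, pred)))
```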

  10. Diametral compression behavior of biomedical titanium scaffolds with open, interconnected pores prepared with the space holder method.

    PubMed

    Arifvianto, B; Leeflang, M A; Zhou, J

    2017-04-01

    Scaffolds with open, interconnected pores and appropriate mechanical properties are required to provide mechanical support and to guide the formation and development of new tissue in bone tissue engineering. Since the mechanical properties of the scaffold tend to decrease with increasing porosity, a balance must be sought in order to meet these two conflicting requirements. In this research, the open, interconnected pores and mechanical properties of biomedical titanium scaffolds prepared by using the space holder method were characterized. Micro-computed tomography (micro-CT) and permeability analysis were carried out to quantify the porous structures and ascertain the presence of open, interconnected pores in the fabricated scaffolds. Diametral compression (DC) tests were performed to generate stress-strain diagrams that could be used to determine the elastic moduli and yield strengths of the scaffolds. Deformation and failure mechanisms involved in the DC tests of the titanium scaffolds were examined. The results of micro-CT and permeability analyses confirmed the presence of open, interconnected pores in the titanium scaffolds with porosity over a range of 31-61%. Among these scaffolds, a maximum specific surface area could be achieved in the scaffold with a total porosity of 50-55%. DC tests showed that titanium scaffolds with elastic moduli of 0.64-3.47 GPa and yield strengths of 28.67-80 MPa could be achieved. By comprehensive consideration of specific surface area, permeability and mechanical properties, the titanium scaffolds with porosities in a range of 50-55% were recommended for use in cancellous bone tissue engineering. Copyright © 2017 Elsevier Ltd. All rights reserved.
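
    For readers unfamiliar with the diametral compression (Brazilian) test mentioned above, the nominal tensile stress at the centre of a disc loaded across its diameter is commonly taken as sigma = 2P/(pi*D*t) for load P, disc diameter D and thickness t. The short sketch below applies that standard relation together with a simple density-based porosity estimate; the specimen dimensions, load and densities are illustrative and are not taken from the study.

```python
import math

def diametral_compression_stress(load_N, diameter_m, thickness_m):
    """Nominal tensile stress at the centre of a disc specimen under
    diametral (Brazilian) compression: sigma = 2P / (pi * D * t)."""
    return 2.0 * load_N / (math.pi * diameter_m * thickness_m)

def total_porosity(apparent_density, solid_density):
    """Porosity from the apparent density of the scaffold and the density
    of fully dense titanium (approximately 4.51 g/cm^3)."""
    return 1.0 - apparent_density / solid_density

# Illustrative numbers only (not taken from the study above):
# a 10 mm diameter, 5 mm thick scaffold failing at 2.5 kN.
sigma = diametral_compression_stress(2500.0, 0.010, 0.005)
print(f"Nominal DC stress: {sigma / 1e6:.1f} MPa")
print(f"Porosity: {total_porosity(2.1, 4.51):.2%}")
```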

  11. The Virtual Skeleton Database: An Open Access Repository for Biomedical Research and Collaboration

    PubMed Central

    Bonaretti, Serena; Pfahrer, Marcel; Niklaus, Roman; Büchler, Philippe

    2013-01-01

    Background Statistical shape models are widely used in biomedical research. They are routinely implemented for automatic image segmentation or object identification in medical images. In these fields, however, the acquisition of the large training datasets, required to develop these models, is usually a time-consuming process. Even after this effort, the collections of datasets are often lost or mishandled resulting in replication of work. Objective To solve these problems, the Virtual Skeleton Database (VSD) is proposed as a centralized storage system where the data necessary to build statistical shape models can be stored and shared. Methods The VSD provides an online repository system tailored to the needs of the medical research community. The processing of the most common image file types, a statistical shape model framework, and an ontology-based search provide the generic tools to store, exchange, and retrieve digital medical datasets. The hosted data are accessible to the community, and collaborative research catalyzes their productivity. Results To illustrate the need for an online repository for medical research, three exemplary projects of the VSD are presented: (1) an international collaboration to achieve improvement in cochlear surgery and implant optimization, (2) a population-based analysis of femoral fracture risk between genders, and (3) an online application developed for the evaluation and comparison of the segmentation of brain tumors. Conclusions The VSD is a novel system for scientific collaboration for the medical image community with a data-centric concept and semantically driven search option for anatomical structures. The repository has been proven to be a useful tool for collaborative model building, as a resource for biomechanical population studies, or to enhance segmentation algorithms. PMID:24220210

  12. Remarkable Growth of Open Access in the Biomedical Field: Analysis of PubMed Articles from 2006 to 2010

    PubMed Central

    Kurata, Keiko; Morioka, Tomoko; Yokoi, Keiko; Matsubayashi, Mamiko

    2013-01-01

    Introduction This study clarifies the trends observed in open access (OA) in the biomedical field between 2006 and 2010, and explores the possible explanations for the differences in OA rates revealed in recent surveys. Methods The study consists of a main survey and two supplementary surveys. In the main survey, a manual Google search was performed to investigate whether full-text versions of articles from PubMed were freely available. Target samples were articles published in 2005, 2007, and 2009; the searches were performed a year after publication in 2006, 2008, and 2010, respectively. Using the search results, we classified the OA provision methods into seven categories. The supplementary surveys calculated the OA rate using two search functions on PubMed: “LinkOut” and “Limits.” Results The main survey concluded that the OA rate increased significantly between 2006 and 2010: the OA rate in 2010 (50.2%) was twice that in 2006 (26.3%). Furthermore, the majority of OA articles were available from OA journal (OAJ) websites, indicating that OAJs have consistently been a significant contributor to OA throughout the period. OA availability through the PubMed Central (PMC) repository also increased significantly. OA rates obtained from the two supplementary surveys were lower than those found in the main survey. “LinkOut” could find only 40% of the OA articles identified in the main survey. Discussion OA articles in the biomedical field have more than a 50% share. OA has been achieved through OAJs. The reason why the OA rates in our surveys differ from those in recent surveys seems to be the difference in sampling methods and verification procedures. PMID:23658683

  13. Remarkable growth of open access in the biomedical field: analysis of PubMed articles from 2006 to 2010.

    PubMed

    Kurata, Keiko; Morioka, Tomoko; Yokoi, Keiko; Matsubayashi, Mamiko

    2013-01-01

    This study clarifies the trends observed in open access (OA) in the biomedical field between 2006 and 2010, and explores the possible explanations for the differences in OA rates revealed in recent surveys. The study consists of a main survey and two supplementary surveys. In the main survey, a manual Google search was performed to investigate whether full-text versions of articles from PubMed were freely available. Target samples were articles published in 2005, 2007, and 2009; the searches were performed a year after publication in 2006, 2008, and 2010, respectively. Using the search results, we classified the OA provision methods into seven categories. The supplementary surveys calculated the OA rate using two search functions on PubMed: "LinkOut" and "Limits." The main survey concluded that the OA rate increased significantly between 2006 and 2010: the OA rate in 2010 (50.2%) was twice that in 2006 (26.3%). Furthermore, the majority of OA articles were available from OA journal (OAJ) websites, indicating that OAJs have consistently been a significant contributor to OA throughout the period. OA availability through the PubMed Central (PMC) repository also increased significantly. OA rates obtained from the two supplementary surveys were lower than those found in the main survey. "LinkOut" could find only 40% of the OA articles identified in the main survey. OA articles in the biomedical field have more than a 50% share. OA has been achieved through OAJs. The reason why the OA rates in our surveys differ from those in recent surveys seems to be the difference in sampling methods and verification procedures.

  14. Getting More Out of Biomedical Documents with GATE's Full Lifecycle Open Source Text Analytics

    PubMed Central

    Cunningham, Hamish; Tablan, Valentin; Roberts, Angus; Bontcheva, Kalina

    2013-01-01

    This software article describes the GATE family of open source text analysis tools and processes. GATE is one of the most widely used systems of its type with yearly download rates of tens of thousands and many active users in both academic and industrial contexts. In this paper we report three examples of GATE-based systems operating in the life sciences and in medicine. First, in genome-wide association studies which have contributed to discovery of a head and neck cancer mutation association. Second, medical records analysis which has significantly increased the statistical power of treatment/outcome models in the UK's largest psychiatric patient cohort. Third, richer constructs in drug-related searching. We also explore the ways in which the GATE family supports the various stages of the lifecycle present in our examples. We conclude that the deployment of text mining for document abstraction or rich search and navigation is best thought of as a process, and that with the right computational tools and data collection strategies this process can be made defined and repeatable. The GATE research programme is now 20 years old and has grown from its roots as a specialist development tool for text processing to become a rather comprehensive ecosystem, bringing together software developers, language engineers and research staff from diverse fields. GATE now has a strong claim to cover a uniquely wide range of the lifecycle of text analysis systems. It forms a focal point for the integration and reuse of advances that have been made by many people (the majority outside of the authors' own group) who work in text processing for biomedicine and other areas. GATE is available online <1> under GNU open source licences and runs on all major operating systems. Support is available from an active user and developer community and also on a commercial basis. PMID:23408875

  15. Getting more out of biomedical documents with GATE's full lifecycle open source text analytics.

    PubMed

    Cunningham, Hamish; Tablan, Valentin; Roberts, Angus; Bontcheva, Kalina

    2013-01-01

    This software article describes the GATE family of open source text analysis tools and processes. GATE is one of the most widely used systems of its type with yearly download rates of tens of thousands and many active users in both academic and industrial contexts. In this paper we report three examples of GATE-based systems operating in the life sciences and in medicine. First, in genome-wide association studies which have contributed to discovery of a head and neck cancer mutation association. Second, medical records analysis which has significantly increased the statistical power of treatment/outcome models in the UK's largest psychiatric patient cohort. Third, richer constructs in drug-related searching. We also explore the ways in which the GATE family supports the various stages of the lifecycle present in our examples. We conclude that the deployment of text mining for document abstraction or rich search and navigation is best thought of as a process, and that with the right computational tools and data collection strategies this process can be made defined and repeatable. The GATE research programme is now 20 years old and has grown from its roots as a specialist development tool for text processing to become a rather comprehensive ecosystem, bringing together software developers, language engineers and research staff from diverse fields. GATE now has a strong claim to cover a uniquely wide range of the lifecycle of text analysis systems. It forms a focal point for the integration and reuse of advances that have been made by many people (the majority outside of the authors' own group) who work in text processing for biomedicine and other areas. GATE is available online <1> under GNU open source licences and runs on all major operating systems. Support is available from an active user and developer community and also on a commercial basis.

  16. Logical Gene Ontology Annotations (GOAL): exploring gene ontology annotations with OWL.

    PubMed

    Jupp, Simon; Stevens, Robert; Hoehndorf, Robert

    2012-04-24

    Ontologies such as the Gene Ontology (GO) and their use in annotations make cross-species comparisons of genes possible, along with a wide range of other analytical activities. The bio-ontologies community, in particular the Open Biomedical Ontologies (OBO) community, have provided many other ontologies and an increasingly large volume of annotations of gene products that can be exploited in query and analysis. As many annotations with different ontologies centre upon gene products, it becomes possible to explore gene products through multiple ontological perspectives at the same time. Questions could be asked that link a gene product's function, process, cellular location, phenotype and disease. Current tools, such as AmiGO, allow exploration of genes based on their GO annotations, but not through multiple ontological perspectives. In addition, the semantics of these ontologies' representations should, through automated reasoning, afford richer query opportunities over the gene product annotations than is currently possible. To enable this multi-perspective, richer querying of gene product annotations, we have created the Logical Gene Ontology, or GOAL ontology, in OWL, which combines the Gene Ontology, Human Disease Ontology and the Mammalian Phenotype Ontology, together with classes that represent the annotations with these ontologies for mouse gene products. Each mouse gene product is represented as a class, with the appropriate relationships to the GO aspects, phenotype and disease with which it has been annotated. We then use defined classes to query these protein classes through automated reasoning, and to build a complex hierarchy of gene products. We have presented this through a Web interface that allows arbitrary queries to be constructed and the results displayed. This standard use of OWL affords a rich interaction with Gene Ontology, Human Disease Ontology and Mammalian Phenotype Ontology annotations for the mouse, to give a fine partitioning of

  17. BIOMedical Search Engine Framework: Lightweight and customized implementation of domain-specific biomedical search engines.

    PubMed

    Jácome, Alberto G; Fdez-Riverola, Florentino; Lourenço, Anália

    2016-07-01

    Text mining and semantic analysis approaches can be applied to the construction of biomedical domain-specific search engines and provide an attractive alternative to create personalized and enhanced search experiences. Therefore, this work introduces the new open-source BIOMedical Search Engine Framework for the fast and lightweight development of domain-specific search engines. The rationale behind this framework is to incorporate core features typically available in search engine frameworks with flexible and extensible technologies to retrieve biomedical documents, annotate meaningful domain concepts, and develop highly customized Web search interfaces. The BIOMedical Search Engine Framework integrates taggers for major biomedical concepts, such as diseases, drugs, genes, proteins, compounds and organisms, and enables the use of domain-specific controlled vocabulary. Technologies from the Typesafe Reactive Platform, the AngularJS JavaScript framework and the Bootstrap HTML/CSS framework support the customization of the domain-oriented search application. Moreover, the RESTful API of the BIOMedical Search Engine Framework allows the integration of the search engine into existing systems or a complete web interface personalization. The construction of the Smart Drug Search is described as a proof of concept of the BIOMedical Search Engine Framework. This public search engine catalogs scientific literature about antimicrobial resistance, microbial virulence and similar topics. The keyword-based queries of the users are transformed into concepts, and search results are presented and ranked accordingly. The semantic graph view portrays all the concepts found in the results, and the researcher may look into the relevance of different concepts, the strength of direct relations, and non-trivial, indirect relations. The number of occurrences of the concept shows its importance to the query, and the frequency of concept co-occurrence is indicative of biological relations

  18. LINNAEUS: A species name identification system for biomedical literature

    PubMed Central

    2010-01-01

    Background The task of recognizing and identifying species names in biomedical literature has recently been regarded as critical for a number of applications in text and data mining, including gene name recognition, species-specific document retrieval, and semantic enrichment of biomedical articles. Results In this paper we describe an open-source species name recognition and normalization software system, LINNAEUS, and evaluate its performance relative to several automatically generated biomedical corpora, as well as a novel corpus of full-text documents manually annotated for species mentions. LINNAEUS uses a dictionary-based approach (implemented as an efficient deterministic finite-state automaton) to identify species names and a set of heuristics to resolve ambiguous mentions. When compared against our manually annotated corpus, LINNAEUS performs with 94% recall and 97% precision at the mention level, and 98% recall and 90% precision at the document level. Our system successfully solves the problem of disambiguating uncertain species mentions, with 97% of all mentions in PubMed Central full-text documents resolved to unambiguous NCBI taxonomy identifiers. Conclusions LINNAEUS is an open source, stand-alone software system capable of recognizing and normalizing species name mentions with speed and accuracy, and can therefore be integrated into a range of bioinformatics and text-mining applications. The software and manually annotated corpus can be downloaded freely at http://linnaeus.sourceforge.net/. PMID:20149233
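
    The dictionary-matching idea behind LINNAEUS can be illustrated in a few lines of Python. The sketch below is not LINNAEUS itself (which compiles a large species lexicon into a deterministic finite-state automaton and adds disambiguation heuristics); it is a minimal regex-based matcher over a toy dictionary, with surface forms mapped to their NCBI Taxonomy identifiers.

```python
import re

# Toy species dictionary mapping surface forms to NCBI Taxonomy IDs; a tiny
# illustration of the lexicon LINNAEUS compiles into a finite-state automaton.
SPECIES = {
    "homo sapiens": "9606",
    "human": "9606",
    "mus musculus": "10090",
    "mouse": "10090",
    "escherichia coli": "562",
    "e. coli": "562",
}

# One alternation, longest names first, so multi-word names win over substrings.
_pattern = re.compile(
    r"\b(" + "|".join(re.escape(k) for k in sorted(SPECIES, key=len, reverse=True)) + r")\b",
    re.IGNORECASE,
)

def tag_species(text):
    """Return (start, end, surface form, taxon_id) tuples for each mention."""
    return [
        (m.start(), m.end(), m.group(0), SPECIES[m.group(0).lower()])
        for m in _pattern.finditer(text)
    ]

print(tag_species("Expression was compared in human and Mus musculus liver, "
                  "with E. coli used as a control."))
```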

  19. Evolution of biomedical ontologies and mappings: Overview of recent approaches.

    PubMed

    Groß, Anika; Pruski, Cédric; Rahm, Erhard

    2016-01-01

    Biomedical ontologies are heavily used to annotate data, and different ontologies are often interlinked by ontology mappings. These ontology-based mappings and annotations are used in many applications and analysis tasks. Since biomedical ontologies are continuously updated dependent artifacts can become outdated and need to undergo evolution as well. Hence there is a need for largely automated approaches to keep ontology-based mappings up-to-date in the presence of evolving ontologies. In this article, we survey current approaches and novel directions in the context of ontology and mapping evolution. We will discuss requirements for mapping adaptation and provide a comprehensive overview on existing approaches. We will further identify open challenges and outline ideas for future developments.

  20. A Commentary on the Biomedical Information System

    ERIC Educational Resources Information Center

    Stokes, Joseph, III; Hayes, Robert M.

    1970-01-01

    The Biomedical Information System is described as one which includes closed intermediate and open data, mobilizing all biomedical information for physicians, teachers, students and administrators. (Editor/IE)

  1. BiOSS: A system for biomedical ontology selection.

    PubMed

    Martínez-Romero, Marcos; Vázquez-Naya, José M; Pereira, Javier; Pazos, Alejandro

    2014-04-01

    In biomedical informatics, ontologies are considered a key technology for annotating, retrieving and sharing the huge volume of publicly available data. Due to the increasing amount, complexity and variety of existing biomedical ontologies, choosing the ones to be used in a semantic annotation problem or to design a specific application is a difficult task. As a consequence, the design of approaches and tools addressed to facilitate the selection of biomedical ontologies is becoming a priority. In this paper we present BiOSS, a novel system for the selection of biomedical ontologies. BiOSS evaluates the adequacy of an ontology to a given domain according to three different criteria: (1) the extent to which the ontology covers the domain; (2) the semantic richness of the ontology in the domain; (3) the popularity of the ontology in the biomedical community. BiOSS has been applied to 5 representative problems of ontology selection. It also has been compared to existing methods and tools. Results are promising and show the usefulness of BiOSS to solve real-world ontology selection problems. BiOSS is openly available both as a web tool and a web service. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
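
    The three criteria named in the abstract (domain coverage, semantic richness, popularity) lend themselves to a simple weighted aggregation. The sketch below illustrates that general idea only; the candidate scores, weights and the linear scoring function are invented and do not reproduce BiOSS's actual evaluation measures.

```python
# Illustrative weighted aggregation of the three criteria BiOSS evaluates
# (domain coverage, semantic richness, popularity). The per-ontology values
# and weights below are invented; BiOSS's own scoring functions differ.
CANDIDATES = {
    # ontology:   (coverage, richness, popularity), all normalised to [0, 1]
    "SNOMED CT": (0.82, 0.75, 0.90),
    "NCIt":      (0.78, 0.80, 0.70),
    "MeSH":      (0.70, 0.55, 0.95),
}

WEIGHTS = (0.5, 0.3, 0.2)  # relative importance of each criterion

def score(criteria, weights=WEIGHTS):
    """Weighted sum of the criterion values for one candidate ontology."""
    return sum(c * w for c, w in zip(criteria, weights))

ranking = sorted(CANDIDATES.items(), key=lambda kv: score(kv[1]), reverse=True)
for name, crit in ranking:
    print(f"{name:10s} score = {score(crit):.3f}")
```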

  2. In the pursuit of a semantic similarity metric based on UMLS annotations for articles in PubMed Central Open Access.

    PubMed

    Garcia Castro, Leyla Jael; Berlanga, Rafael; Garcia, Alexander

    2015-10-01

    Although full-text articles are provided by the publishers in electronic formats, it remains a challenge to find related work beyond the title and abstract context. Identifying related articles based on their abstract is indeed a good starting point; this process is straightforward and does not consume as many resources as full-text based similarity would require. However, further analyses may require in-depth understanding of the full content. Two articles with highly related abstracts can be substantially different regarding the full content. How similarity differs when considering title-and-abstract versus full-text and which semantic similarity metric provides better results when dealing with full-text articles are the main issues addressed in this manuscript. We have benchmarked three similarity metrics - BM25, PMRA, and Cosine, in order to determine which one performs best when using concept-based annotations on full-text documents. We also evaluated variations in similarity values based on title-and-abstract against those relying on full-text. Our test dataset comprises the Genomics track article collection from the 2005 Text Retrieval Conference. Initially, we used an entity recognition software to semantically annotate titles and abstracts as well as full-text with concepts defined in the Unified Medical Language System (UMLS®). For each article, we created a document profile, i.e., a set of identified concepts, term frequency, and inverse document frequency; we then applied various similarity metrics to those document profiles. We considered correlation, precision, recall, and F1 in order to determine which similarity metric performs best with concept-based annotations. For those full-text articles available in PubMed Central Open Access (PMC-OA), we also performed dispersion analyses in order to understand how similarity varies when considering full-text articles. We have found that the PubMed Related Articles similarity metric is the most suitable for
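
    As an illustration of concept-based similarity of the kind benchmarked above, the sketch below builds tf-idf weighted UMLS concept profiles for two documents and compares them with the cosine metric. The concept counts, document frequencies and corpus size are toy values, the CUI strings are placeholders, and the sketch does not reproduce the BM25 or PMRA scoring used in the study.

```python
import math
from collections import Counter

def tf_idf_profile(concept_counts, doc_freq, n_docs):
    """Build a tf-idf weighted concept profile for one document."""
    return {
        cui: tf * math.log(n_docs / doc_freq[cui])
        for cui, tf in concept_counts.items()
        if doc_freq.get(cui, 0) > 0
    }

def cosine(p, q):
    """Cosine similarity between two sparse concept profiles."""
    common = set(p) & set(q)
    num = sum(p[c] * q[c] for c in common)
    den = math.sqrt(sum(v * v for v in p.values())) * math.sqrt(sum(v * v for v in q.values()))
    return num / den if den else 0.0

# Toy concept counts for two articles; CUIs stand in for UMLS identifiers.
doc_a = Counter({"C0000001": 5, "C0000002": 2, "C0000003": 1})
doc_b = Counter({"C0000001": 3, "C0000003": 4, "C0000004": 2})
doc_freq = {"C0000001": 120, "C0000002": 300, "C0000003": 250, "C0000004": 180}

a = tf_idf_profile(doc_a, doc_freq, n_docs=1000)
b = tf_idf_profile(doc_b, doc_freq, n_docs=1000)
print(f"cosine similarity of concept profiles = {cosine(a, b):.3f}")
```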

  3. Annotating the Function of the Human Genome with Gene Ontology and Disease Ontology

    PubMed Central

    Hu, Yang; Zhou, Wenyang; Ren, Jun; Dong, Lixiang

    2016-01-01

    Increasing evidence indicates that functional annotation of the human genome at the molecular and phenotype levels is very important for the systematic analysis of genes. In this study, we presented a framework named Gene2Function to annotate Gene References into Function (GeneRIFs), in which each functional description of a GeneRIF could be annotated with the text-mining tool Open Biomedical Annotator (OBA), and each Entrez gene could be mapped to a Human Genome Organisation Gene Nomenclature Committee (HGNC) gene symbol. After annotating all GeneRIF records for human genes, 288,869 associations between 13,148 mRNAs and 7,182 terms, 9,496 associations between 948 microRNAs and 533 terms, and 901 associations between 139 long noncoding RNAs (lncRNAs) and 297 terms were obtained as a comprehensive annotation resource for the human genome. The high consistency of term frequency per gene (Pearson correlation = 0.6401, p = 2.2e−16) and gene frequency per term (Pearson correlation = 0.1298, p = 3.686e−14) between GeneRIFs and GOA shows that our annotation resource is reliable. PMID:27635398

  4. Annotating the Function of the Human Genome with Gene Ontology and Disease Ontology.

    PubMed

    Hu, Yang; Zhou, Wenyang; Ren, Jun; Dong, Lixiang; Wang, Yadong; Jin, Shuilin; Cheng, Liang

    2016-01-01

    Increasing evidence indicates that functional annotation of the human genome at the molecular and phenotype levels is very important for the systematic analysis of genes. In this study, we presented a framework named Gene2Function to annotate Gene References into Function (GeneRIFs), in which each functional description of a GeneRIF could be annotated with the text-mining tool Open Biomedical Annotator (OBA), and each Entrez gene could be mapped to a Human Genome Organisation Gene Nomenclature Committee (HGNC) gene symbol. After annotating all GeneRIF records for human genes, 288,869 associations between 13,148 mRNAs and 7,182 terms, 9,496 associations between 948 microRNAs and 533 terms, and 901 associations between 139 long noncoding RNAs (lncRNAs) and 297 terms were obtained as a comprehensive annotation resource for the human genome. The high consistency of term frequency per gene (Pearson correlation = 0.6401, p = 2.2e-16) and gene frequency per term (Pearson correlation = 0.1298, p = 3.686e-14) between GeneRIFs and GOA shows that our annotation resource is reliable.
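
    The Open Biomedical Annotator step described above is exposed through NCBO BioPortal's REST API. The sketch below shows the general shape of such a call, assuming the data.bioontology.org/annotator endpoint, a valid BioPortal API key, and parameter and field names as given in the public BioPortal documentation (all of which may change); it is not the Gene2Function pipeline itself.

```python
# Sketch of annotating a GeneRIF description with the NCBO/BioPortal Annotator
# REST service. Requires a real API key; with the placeholder below the request
# will fail. Parameter and response field names follow the public BioPortal
# documentation at the time of writing and may change.
import requests

API_KEY = "YOUR_BIOPORTAL_API_KEY"   # placeholder
ANNOTATOR_URL = "https://data.bioontology.org/annotator"

def annotate(text, ontologies=("GO", "DOID")):
    params = {
        "apikey": API_KEY,
        "text": text,
        "ontologies": ",".join(ontologies),
        "longest_only": "true",
    }
    resp = requests.get(ANNOTATOR_URL, params=params, timeout=30)
    resp.raise_for_status()
    return resp.json()

generif = "BRCA1 is required for DNA double-strand break repair by homologous recombination."
for hit in annotate(generif):
    concept_iri = hit["annotatedClass"]["@id"]
    matched_text = [a["text"] for a in hit["annotations"]]
    print(concept_iri, matched_text)
```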

  5. Identifying discourse connectives in biomedical text.

    PubMed

    Ramesh, Balaji Polepalli; Yu, Hong

    2010-11-13

    Discourse connectives are words or phrases that connect or relate two coherent sentences or phrases and indicate the presence of discourse relations. Automatic recognition of discourse connectives may benefit many natural language processing applications. In this pilot study, we report the development of the supervised machine-learning classifiers with conditional random fields (CRFs) for automatically identifying discourse connectives in full-text biomedical articles. Our first classifier was trained on the open-domain 1 million token Penn Discourse Tree Bank (PDTB). We performed cross validation on biomedical articles (approximately 100K word tokens) that we annotated. The results show that the classifier trained on PDTB data attained a 0.55 F1-score for identifying discourse connectives in biomedical text, while the cross-validation results in the biomedical text attained a 0.69 F1-score, a much better performance despite a much smaller training size. Our preliminary analysis suggests the existence of domain-specific features, and we speculate that domain-adaption approaches may further improve performance.

  6. MAKER-P: A Tool Kit for the Rapid Creation, Management, and Quality Control of Plant Genome Annotations

    PubMed Central

    Campbell, Michael S.; Law, MeiYee; Holt, Carson; Stein, Joshua C.; Moghe, Gaurav D.; Hufnagel, David E.; Lei, Jikai; Achawanantakun, Rujira; Jiao, Dian; Lawrence, Carolyn J.; Ware, Doreen; Shiu, Shin-Han; Childs, Kevin L.; Sun, Yanni; Jiang, Ning; Yandell, Mark

    2014-01-01

    We have optimized and extended the widely used annotation engine MAKER in order to better support plant genome annotation efforts. New features include better parallelization for large repeat-rich plant genomes, noncoding RNA annotation capabilities, and support for pseudogene identification. We have benchmarked the resulting software tool kit, MAKER-P, using the Arabidopsis (Arabidopsis thaliana) and maize (Zea mays) genomes. Here, we demonstrate the ability of the MAKER-P tool kit to automatically update, extend, and revise the Arabidopsis annotations in light of newly available data and to annotate pseudogenes and noncoding RNAs absent from The Arabidopsis Informatics Resource 10 build. Our results demonstrate that MAKER-P can be used to manage and improve the annotations of even Arabidopsis, perhaps the best-annotated plant genome. We have also installed and benchmarked MAKER-P on the Texas Advanced Computing Center. We show that this public resource can de novo annotate the entire Arabidopsis and maize genomes in less than 3 h and produce annotations of comparable quality to those of the current The Arabidopsis Information Resource 10 and maize V2 annotation builds. PMID:24306534

  7. The National Center for Biomedical Ontology: Advancing Biomedicine through Structured Organization of Scientific Knowledge

    SciTech Connect

    Rubin, Daniel L.; Lewis, Suzanna E.; Mungall, Chris J.; Misra, Sima; Westerfield, Monte; Ashburner, Michael; Sim, Ida; Chute, Christopher G.; Solbrig, Harold; Storey, Margaret-Anne; Smith, Barry; Day-Richter, John; Noy, Natalya F.; Musen, Mark A.

    2006-01-23

    The National Center for Biomedical Ontology (http://bioontology.org) is a consortium that comprises leading informaticians, biologists, clinicians, and ontologists funded by the NIH Roadmap to develop innovative technology and methods that allow scientists to record, manage, and disseminate biomedical information and knowledge in machine-processable form. The goals of the Center are: (1) to help unify the divergent and isolated efforts in ontology development by promoting high-quality, open-source, standards-based tools to create, manage, and use ontologies, (2) to create new software tools so that scientists can use ontologies to annotate and analyze biomedical data, (3) to provide a national resource for the ongoing evaluation, integration, and evolution of biomedical ontologies and associated tools and theories in the context of driving biomedical projects (DBPs), and (4) to disseminate the tools and resources of the Center and to identify, evaluate, and communicate best practices of ontology development to the biomedical community. The Center is working toward these objectives by providing tools to develop ontologies and to annotate experimental data, and by developing resources to integrate and relate existing ontologies, as well as by creating repositories of biomedical data that are annotated using those ontologies. The Center is providing training workshops in ontology design, development, and usage, and is also pursuing research in ontology evaluation, quality, and the use of ontologies to promote scientific discovery. Through the research activities within the Center, collaborations with the DBPs, and interactions with the biomedical community, our goal is to help scientists work more effectively in the e-science paradigm, enhancing experiment design, experiment execution, data analysis, information synthesis, hypothesis generation and testing, and the understanding of human disease.

  8. Precision annotation of digital samples in NCBI's gene expression omnibus.

    PubMed

    Hadley, Dexter; Pan, James; El-Sayed, Osama; Aljabban, Jihad; Aljabban, Imad; Azad, Tej D; Hadied, Mohamad O; Raza, Shuaib; Rayikanti, Benjamin Abhishek; Chen, Bin; Paik, Hyojung; Aran, Dvir; Spatz, Jordan; Himmelstein, Daniel; Panahiazar, Maryam; Bhattacharya, Sanchita; Sirota, Marina; Musen, Mark A; Butte, Atul J

    2017-09-19

    The Gene Expression Omnibus (GEO) contains more than two million digital samples from functional genomics experiments amassed over almost two decades. However, individual sample meta-data remains poorly described by unstructured free-text attributes, preventing its large-scale reanalysis. We introduce the Search Tag Analyze Resource for GEO as a web application (http://STARGEO.org) to curate better annotations of sample phenotypes uniformly across different studies, and to use these sample annotations to define robust genomic signatures of disease pathology by meta-analysis. In this paper, we target a small group of biomedical graduate students to show rapid crowd-curation of precise sample annotations across all phenotypes, and we demonstrate the biological validity of these crowd-curated annotations for breast cancer. STARGEO.org makes GEO data findable, accessible, interoperable and reusable (i.e., FAIR) to ultimately facilitate knowledge discovery. Our work demonstrates the utility of crowd-curation and interpretation of open 'big data' under FAIR principles as a first step towards realizing an ideal paradigm of precision medicine.
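
    STARGEO's meta-analysis itself is not reproduced here, but the general step of combining evidence across independently annotated case/control studies can be sketched with Fisher's method. The per-study p-values below are invented, and a real analysis would use effect-size models rather than this simple combination.

```python
# Toy illustration of combining evidence across independently annotated GEO
# studies (not STARGEO's actual pipeline). Per-study case/control p-values for
# one gene are combined with Fisher's method; the numbers are invented.
import math
from scipy.stats import chi2

def fisher_combine(pvalues):
    """Fisher's method: X^2 = -2 * sum(ln p_i), with 2k degrees of freedom."""
    stat = -2.0 * sum(math.log(p) for p in pvalues)
    return stat, chi2.sf(stat, df=2 * len(pvalues))

study_pvalues = [0.04, 0.20, 0.01, 0.08]   # one gene, four annotated studies
stat, p_meta = fisher_combine(study_pvalues)
print(f"Fisher chi-square = {stat:.2f}, combined p = {p_meta:.4f}")
```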

  9. Developments in the use of rare earth metal complexes as efficient catalysts for ring-opening polymerization of cyclic esters used in biomedical applications

    NASA Astrophysics Data System (ADS)

    Cota, Iuliana

    2017-04-01

    Biodegradable polymers represent a class of particularly useful materials for many biomedical and pharmaceutical applications. Among these types of polyesters, poly(ɛ-caprolactone) and polylactides are considered very promising for controlled drug delivery devices. These polymers are mainly produced by ring-opening polymerization of their respective cyclic esters, since this method allows a strict control of the molecular parameters (molecular weight and distribution) of the obtained polymers. The most widely used catalysts for ring-opening polymerization of cyclic esters are tin- and aluminium-based organometallic complexes; however since the contamination of the aliphatic polyesters by potentially toxic metallic residues is particularly of concern for biomedical applications, the possibility of replacing organometallic initiators by novel less toxic or more efficient organometallic complexes has been intensively studied. Thus, in the recent years, the use of highly reactive rare earth initiators/catalysts leading to lower polymer contamination has been developed. The use of rare earth complexes is considered a valuable strategy to decrease the polyester contamination by metallic residues and represents an attractive alternative to traditional organometallic complexes.

  10. Annotated Videography.

    ERIC Educational Resources Information Center

    United States Holocaust Memorial Museum, Washington, DC.

    This annotated list of 43 videotapes recommended for classroom use addresses various themes for teaching about the Holocaust, including: (1) overviews of the Holocaust; (2) life before the Holocaust; (3) propaganda; (4) racism, anti-Semitism; (5) "enemies of the state"; (6) ghettos; (7) camps; (8) genocide; (9) rescue; (10) resistance;…

  11. Automatic multi-label annotation of abdominal CT images using CBIR

    NASA Astrophysics Data System (ADS)

    Xue, Zhiyun; Antani, Sameer; Long, L. Rodney; Thoma, George R.

    2017-03-01

    We present a technique to annotate multiple organs shown in 2-D abdominal/pelvic CT images using CBIR. This annotation task is motivated by our research interests in visual question-answering (VQA). We aim to apply results from this effort in Open-i, a multimodal biomedical search engine developed by the National Library of Medicine (NLM). Understanding the visual content of biomedical images is a necessary step for VQA. Though sufficient annotational information about an image may be available in related textual metadata, not all may be useful as descriptive tags, particularly for anatomy on the image. In this paper, we develop and evaluate a multi-label image annotation method using CBIR. We evaluate our method on two 2-D CT image datasets we generated from 3-D volumetric data obtained from a multi-organ segmentation challenge hosted in MICCAI 2015. Shape and spatial layout information is used to encode visual characteristics of the anatomy. We adapt a weighted voting scheme to assign multiple labels to the query image by combining the labels of the images identified as similar by the method. Key parameters that may affect the annotation performance, such as the number of images used in the label voting and the threshold for excluding labels that have low weights, are studied. The method adopts a coarse-to-fine retrieval strategy that integrates classification with nearest-neighbor search. Results from our evaluation (using the MICCAI CT image datasets as well as figures from Open-i) are presented.
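
    The weighted voting step described above can be sketched compactly: labels of the retrieved neighbours are accumulated with similarity weights and thresholded. The retrieval scores, organ label sets and threshold below are invented, and the shape and spatial-layout features used for retrieval in the paper are not modelled here.

```python
from collections import defaultdict

def vote_labels(neighbors, threshold=0.35):
    """Assign multiple organ labels to a query CT slice by similarity-weighted
    voting over retrieved neighbours. `neighbors` is a list of
    (similarity, labels) pairs; labels whose normalised weight falls below
    `threshold` are dropped."""
    weights = defaultdict(float)
    total = sum(sim for sim, _ in neighbors) or 1.0
    for sim, labels in neighbors:
        for label in labels:
            weights[label] += sim / total
    return {label: round(w, 3) for label, w in weights.items() if w >= threshold}

# Invented retrieval results for one query slice.
retrieved = [
    (0.92, {"liver", "right kidney"}),
    (0.88, {"liver", "spleen"}),
    (0.80, {"liver", "right kidney", "gallbladder"}),
    (0.55, {"spleen"}),
]
print(vote_labels(retrieved))   # gallbladder falls below the threshold and is dropped
```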

  12. Evaluation of training with an annotation schema for manual annotation of clinical conditions from emergency department reports.

    PubMed

    Chapman, Wendy W; Dowling, John N; Hripcsak, George

    2008-02-01

    Determine whether agreement among annotators improves after being trained to use an annotation schema that specifies: what types of clinical conditions to annotate, the linguistic form of the annotations, and which modifiers to include. Three physicians and 3 lay people individually annotated all clinical conditions in 23 emergency department reports. For annotations made using a Baseline Schema and annotations made after training on a detailed annotation schema, we compared: (1) variability of annotation length and number and (2) annotator agreement, using the F-measure. Physicians showed higher agreement and lower variability after training on the detailed annotation schema than when applying the Baseline Schema. Lay people agreed with physicians almost as well as other physicians did but showed a slower learning curve. Training annotators on the annotation schema we developed increased agreement among annotators and should be useful in generating reference standard sets for natural language processing studies. The methodology we used to evaluate the schema could be applied to other types of annotation or classification tasks in biomedical informatics.
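
    The agreement statistic used in the study above, the F-measure between annotators, can be computed from two sets of annotated spans as sketched below. The spans are invented, exact matching is assumed, and the study's own handling of modifiers and partial overlaps is not reproduced.

```python
def pairwise_f_measure(annotations_a, annotations_b):
    """Exact-match F-measure between two annotators, treating annotator A as
    the reference. Each annotation is a (start, end, label) tuple."""
    a, b = set(annotations_a), set(annotations_b)
    if not a or not b:
        return 0.0
    tp = len(a & b)
    precision = tp / len(b)
    recall = tp / len(a)
    return 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0

# Invented condition spans from one emergency department report.
annotator_1 = {(10, 25, "cough"), (40, 62, "shortness of breath"), (80, 90, "fever")}
annotator_2 = {(10, 25, "cough"), (40, 62, "shortness of breath"), (100, 110, "nausea")}
print(f"F-measure = {pairwise_f_measure(annotator_1, annotator_2):.2f}")
```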

  13. Enabling Ontology Based Semantic Queries in Biomedical Database Systems

    PubMed Central

    Zheng, Shuai; Lu, James

    2014-01-01

    There is a lack of tools to ease integration and ontology-based semantic queries in biomedical databases, which are often annotated with ontology concepts. We aim to provide a middle layer between ontology repositories and semantically annotated databases to support semantic queries directly in the databases with expressive standard database query languages. We have developed a semantic query engine that provides semantic reasoning and query processing, and translates the queries into ontology repository operations on NCBO BioPortal. Semantic operators are implemented in the database as user-defined functions extended to the database engine, thus semantic queries can be directly specified in standard database query languages such as SQL and XQuery. The system provides caching management to boost query performance. The system is highly adaptable to support different ontologies through easy customizations. We have implemented the system DBOntoLink as open-source software, which supports major ontologies hosted at BioPortal. DBOntoLink supports a set of common ontology-based semantic operations and has them fully integrated with the IBM DB2 database management system. The system has been deployed and evaluated with an existing biomedical database for managing and querying image annotations and markups (AIM). Our performance study demonstrates the high expressiveness of semantic queries and the high efficiency of the queries. PMID:25541585

  14. Enabling Ontology Based Semantic Queries in Biomedical Database Systems.

    PubMed

    Zheng, Shuai; Wang, Fusheng; Lu, James

    2014-03-01

    There is a lack of tools to ease integration and ontology-based semantic queries in biomedical databases, which are often annotated with ontology concepts. We aim to provide a middle layer between ontology repositories and semantically annotated databases to support semantic queries directly in the databases with expressive standard database query languages. We have developed a semantic query engine that provides semantic reasoning and query processing, and translates the queries into ontology repository operations on NCBO BioPortal. Semantic operators are implemented in the database as user-defined functions extended to the database engine, thus semantic queries can be directly specified in standard database query languages such as SQL and XQuery. The system provides caching management to boost query performance. The system is highly adaptable to support different ontologies through easy customizations. We have implemented the system DBOntoLink as open-source software, which supports major ontologies hosted at BioPortal. DBOntoLink supports a set of common ontology-based semantic operations and has them fully integrated with the IBM DB2 database management system. The system has been deployed and evaluated with an existing biomedical database for managing and querying image annotations and markups (AIM). Our performance study demonstrates the high expressiveness of semantic queries and the high efficiency of the queries.
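
    The core pattern described above, a semantic operator exposed to SQL as a user-defined function, can be illustrated without DB2 or BioPortal. The sketch below uses SQLite and a hard-coded toy hierarchy; the function name, table and concept labels are hypothetical, and DBOntoLink's actual operators resolve subsumption against NCBO BioPortal.

```python
# Illustration of the general pattern only: a semantic "is-a descendant of"
# operator exposed as a SQL user-defined function. DBOntoLink itself extends
# IBM DB2 and resolves subsumption against NCBO BioPortal; here SQLite and a
# hard-coded toy hierarchy stand in, and all names are hypothetical.
import sqlite3

PARENT = {  # toy ontology: child concept -> parent concept
    "adenocarcinoma": "carcinoma",
    "carcinoma": "neoplasm",
    "glioma": "neoplasm",
}

def is_descendant_of(concept, ancestor):
    """Return 1 if `concept` equals `ancestor` or lies below it in the toy hierarchy."""
    while concept is not None:
        if concept == ancestor:
            return 1
        concept = PARENT.get(concept)
    return 0

conn = sqlite3.connect(":memory:")
conn.create_function("IS_DESCENDANT_OF", 2, is_descendant_of)
conn.executescript("""
    CREATE TABLE image_annotation (image_id TEXT, concept TEXT);
    INSERT INTO image_annotation VALUES ('img1', 'adenocarcinoma'),
                                        ('img2', 'glioma'),
                                        ('img3', 'hemorrhage');
""")
rows = conn.execute(
    "SELECT image_id, concept FROM image_annotation "
    "WHERE IS_DESCENDANT_OF(concept, 'neoplasm') = 1"
).fetchall()
print(rows)   # img1 and img2 are returned through the semantic operator
```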

  15. Management of Dynamic Biomedical Terminologies: Current Status and Future Challenges

    PubMed Central

    Dos Reis, J. C.; Pruski, C.

    2015-01-01

    Summary Objectives Controlled terminologies and their dependent artefacts provide a consensual understanding of a domain while reducing ambiguities and enabling reasoning. However, the evolution of a domain’s knowledge directly impacts these terminologies and generates inconsistencies in the underlying biomedical information systems. In this article, we review existing work addressing the dynamic aspect of terminologies as well as their effects on mappings and semantic annotations. Methods We investigate approaches related to the identification, characterization and propagation of changes in terminologies, mappings and semantic annotations including techniques to update their content. Results and conclusion Based on the explored issues and existing methods, we outline open research challenges requiring investigation in the near future. PMID:26293859

  16. Semi-automatic semantic annotation of PubMed queries: a study on quality, efficiency, satisfaction.

    PubMed

    Névéol, Aurélie; Islamaj Doğan, Rezarta; Lu, Zhiyong

    2011-04-01

    Information processing algorithms require significant amounts of annotated data for training and testing. The availability of such data is often hindered by the complexity and high cost of production. In this paper, we investigate the benefits of a state-of-the-art tool to help with the semantic annotation of a large set of biomedical queries. Seven annotators were recruited to annotate a set of 10,000 PubMed® queries with 16 biomedical and bibliographic categories. About half of the queries were annotated from scratch, while the other half were automatically pre-annotated and manually corrected. The impact of the automatic pre-annotations was assessed on several aspects of the task: time, number of actions, annotator satisfaction, inter-annotator agreement, quality and number of the resulting annotations. The analysis of annotation results showed that the number of required hand annotations is 28.9% less when using pre-annotated results from automatic tools. As a result, the overall annotation time was substantially lower when pre-annotations were used, while inter-annotator agreement was significantly higher. In addition, there was no statistically significant difference in the semantic distribution or number of annotations produced when pre-annotations were used. The annotated query corpus is freely available to the research community. This study shows that automatic pre-annotations are found helpful by most annotators. Our experience suggests using an automatic tool to assist large-scale manual annotation projects. This helps speed-up the annotation time and improve annotation consistency while maintaining high quality of the final annotations.

  17. Semi-automatic semantic annotation of PubMed Queries: a study on quality, efficiency, satisfaction

    PubMed Central

    Névéol, Aurélie; Islamaj-Doğan, Rezarta; Lu, Zhiyong

    2010-01-01

    Information processing algorithms require significant amounts of annotated data for training and testing. The availability of such data is often hindered by the complexity and high cost of production. In this paper, we investigate the benefits of a state-of-the-art tool to help with the semantic annotation of a large set of biomedical information queries. Seven annotators were recruited to annotate a set of 10,000 PubMed® queries with 16 biomedical and bibliographic categories. About half of the queries were annotated from scratch, while the other half were automatically pre-annotated and manually corrected. The impact of the automatic pre-annotations was assessed on several aspects of the task: time, number of actions, annotator satisfaction, inter-annotator agreement, quality and number of the resulting annotations. The analysis of annotation results showed that the number of required hand annotations is 28.9% less when using pre-annotated results from automatic tools. As a result, the overall annotation time was substantially lower when pre-annotations were used, while inter-annotator agreement was significantly higher. In addition, there was no statistically significant difference in the semantic distribution or number of annotations produced when pre-annotations were used. The annotated query corpus is freely available to the research community. This study shows that automatic pre-annotations are found helpful by most annotators. Our experience suggests using an automatic tool to assist large-scale manual annotation projects. This helps speed-up the annotation time and improve annotation consistency while maintaining high quality of the final annotations. PMID:21094696

  18. Marky: a tool supporting annotation consistency in multi-user and iterative document annotation projects.

    PubMed

    Pérez-Pérez, Martín; Glez-Peña, Daniel; Fdez-Riverola, Florentino; Lourenço, Anália

    2015-02-01

    Document annotation is a key task in the development of Text Mining methods and applications. High quality annotated corpora are invaluable, but their preparation requires a considerable amount of resources and time. Although the existing annotation tools offer good user interaction interfaces to domain experts, project management and quality control abilities are still limited. Therefore, the current work introduces Marky, a new Web-based document annotation tool equipped to manage multi-user and iterative projects, and to evaluate annotation quality throughout the project life cycle. At the core, Marky is a Web application based on the open source CakePHP framework. User interface relies on HTML5 and CSS3 technologies. Rangy library assists in browser-independent implementation of common DOM range and selection tasks, and Ajax and JQuery technologies are used to enhance user-system interaction. Marky grants solid management of inter- and intra-annotator work. Most notably, its annotation tracking system supports systematic and on-demand agreement analysis and annotation amendment. Each annotator may work over documents as usual, but all the annotations made are saved by the tracking system and may be further compared. So, the project administrator is able to evaluate annotation consistency among annotators and across rounds of annotation, while annotators are able to reject or amend subsets of annotations made in previous rounds. As a side effect, the tracking system minimises resource and time consumption. Marky is a novel environment for managing multi-user and iterative document annotation projects. Compared to other tools, Marky offers a similar visually intuitive annotation experience while providing unique means to minimise annotation effort and enforce annotation quality, and therefore corpus consistency. Marky is freely available for non-commercial use at http://sing.ei.uvigo.es/marky. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.

  19. Food environment, walkability, and public open spaces are associated with incident development of cardio-metabolic risk factors in a biomedical cohort.

    PubMed

    Paquet, Catherine; Coffee, Neil T; Haren, Matthew T; Howard, Natasha J; Adams, Robert J; Taylor, Anne W; Daniel, Mark

    2014-07-01

    We investigated whether residential environment characteristics related to food (unhealthful/healthful food sources ratio), walkability and public open spaces (POS; number, median size, greenness and type) were associated with incidence of four cardio-metabolic risk factors (pre-diabetes/diabetes, hypertension, dyslipidaemia, abdominal obesity) in a biomedical cohort (n=3205). Results revealed that the risk of developing pre-diabetes/diabetes was lower for participants in areas with larger POS and greater walkability. Incident abdominal obesity was positively associated with the unhealthful food environment index. No associations were found with hypertension or dyslipidaemia. Results provide new evidence for specific, prospective associations between the built environment and cardio-metabolic risk factors. Copyright © 2014 Elsevier Ltd. All rights reserved.

  20. Computer systems for annotation of single molecule fragments

    DOEpatents

    Schwartz, David Charles; Severin, Jessica

    2016-07-19

    There are provided computer systems for visualizing and annotating single molecule images. Annotation systems in accordance with this disclosure allow a user to mark and annotate single molecules of interest and their restriction enzyme cut sites thereby determining the restriction fragments of single nucleic acid molecules. The markings and annotations may be automatically generated by the system in certain embodiments and they may be overlaid translucently onto the single molecule images. An image caching system may be implemented in the computer annotation systems to reduce image processing time. The annotation systems include one or more connectors connecting to one or more databases capable of storing single molecule data as well as other biomedical data. Such diverse array of data can be retrieved and used to validate the markings and annotations. The annotation systems may be implemented and deployed over a computer network. They may be ergonomically optimized to facilitate user interactions.

  1. National Center for Biomedical Ontology: advancing biomedicine through structured organization of scientific knowledge.

    PubMed

    Rubin, Daniel L; Lewis, Suzanna E; Mungall, Chris J; Misra, Sima; Westerfield, Monte; Ashburner, Michael; Sim, Ida; Chute, Christopher G; Solbrig, Harold; Storey, Margaret-Anne; Smith, Barry; Day-Richter, John; Noy, Natalya F; Musen, Mark A

    2006-01-01

    The National Center for Biomedical Ontology is a consortium that comprises leading informaticians, biologists, clinicians, and ontologists, funded by the National Institutes of Health (NIH) Roadmap, to develop innovative technology and methods that allow scientists to record, manage, and disseminate biomedical information and knowledge in machine-processable form. The goals of the Center are (1) to help unify the divergent and isolated efforts in ontology development by promoting high-quality, open-source, standards-based tools to create, manage, and use ontologies, (2) to create new software tools so that scientists can use ontologies to annotate and analyze biomedical data, (3) to provide a national resource for the ongoing evaluation, integration, and evolution of biomedical ontologies and associated tools and theories in the context of driving biomedical projects (DBPs), and (4) to disseminate the tools and resources of the Center and to identify, evaluate, and communicate best practices of ontology development to the biomedical community. Through the research activities within the Center, collaborations with the DBPs, and interactions with the biomedical community, our goal is to help scientists work more effectively in the e-science paradigm, enhancing experiment design, experiment execution, data analysis, information synthesis, hypothesis generation and testing, and the understanding of human disease.

  2. A modular framework for biomedical concept recognition.

    PubMed

    Campos, David; Matos, Sérgio; Oliveira, José Luís

    2013-09-24

    Concept recognition is an essential task in biomedical information extraction, presenting several complex and unsolved challenges. The development of such solutions is typically performed in an ad-hoc manner or using general information extraction frameworks, which are not optimized for the biomedical domain and normally require the integration of complex external libraries and/or the development of custom tools. This article presents Neji, an open source framework optimized for biomedical concept recognition built around four key characteristics: modularity, scalability, speed, and usability. It integrates modules for biomedical natural language processing, such as sentence splitting, tokenization, lemmatization, part-of-speech tagging, chunking and dependency parsing. Concept recognition is provided through dictionary matching and machine learning with normalization methods. Neji also integrates an innovative concept tree implementation, supporting overlapped concept names and respective disambiguation techniques. The most popular input and output formats, namely Pubmed XML, IeXML, CoNLL and A1, are also supported. On top of the built-in functionalities, developers and researchers can implement new processing modules or pipelines, or use the provided command-line interface tool to build their own solutions, applying the most appropriate techniques to identify heterogeneous biomedical concepts. Neji was evaluated against three gold standard corpora with heterogeneous biomedical concepts (CRAFT, AnEM and NCBI disease corpus), achieving high performance results on named entity recognition (F1-measure for overlap matching: species 95%, cell 92%, cellular components 83%, gene and proteins 76%, chemicals 65%, biological processes and molecular functions 63%, disorders 85%, and anatomical entities 82%) and on entity normalization (F1-measure for overlap name matching and correct identifier included in the returned list of identifiers: species 88%, cell 71%, cellular

  3. A modular framework for biomedical concept recognition

    PubMed Central

    2013-01-01

    Background Concept recognition is an essential task in biomedical information extraction, presenting several complex and unsolved challenges. The development of such solutions is typically performed in an ad-hoc manner or using general information extraction frameworks, which are not optimized for the biomedical domain and normally require the integration of complex external libraries and/or the development of custom tools. Results This article presents Neji, an open source framework optimized for biomedical concept recognition built around four key characteristics: modularity, scalability, speed, and usability. It integrates modules for biomedical natural language processing, such as sentence splitting, tokenization, lemmatization, part-of-speech tagging, chunking and dependency parsing. Concept recognition is provided through dictionary matching and machine learning with normalization methods. Neji also integrates an innovative concept tree implementation, supporting overlapped concept names and respective disambiguation techniques. The most popular input and output formats, namely Pubmed XML, IeXML, CoNLL and A1, are also supported. On top of the built-in functionalities, developers and researchers can implement new processing modules or pipelines, or use the provided command-line interface tool to build their own solutions, applying the most appropriate techniques to identify heterogeneous biomedical concepts. Neji was evaluated against three gold standard corpora with heterogeneous biomedical concepts (CRAFT, AnEM and NCBI disease corpus), achieving high performance results on named entity recognition (F1-measure for overlap matching: species 95%, cell 92%, cellular components 83%, gene and proteins 76%, chemicals 65%, biological processes and molecular functions 63%, disorders 85%, and anatomical entities 82%) and on entity normalization (F1-measure for overlap name matching and correct identifier included in the returned list of identifiers: species 88

  4. Multicriteria analysis using open-source data and software for the implementation of a centralized biomedical waste management system in a developing country (Guinea, Conakry).

    NASA Astrophysics Data System (ADS)

    Pérez Peña, José Vicente; Baldó, Mane; Acosta, Yarci; Verschueren, Laurent; Thibaud, Kenmognie; Bilivogui, Pépé; Jean-Paul Ngandu, Alain; Beavogui, Maoro

    2017-04-01

    In the last decade, increasing interest in public health has promoted specific regulations for the transport, storage, transformation and/or elimination of potentially toxic waste. A special concern should focus on the effective management of biomedical waste, due to the environmental and health risks associated with it. The first stage in the effective management of this waste is the selection of the best sites for facilities for its storage and/or elimination. Best-site selection is accomplished by means of multi-criteria decision analyses (MCDA) that aim to minimize the social and environmental impact, and to maximize management efficiency. In this work we present a methodology that uses open-source software and data to determine the best location for the implementation of a centralized waste management system in a developing country (Guinea, Conakry). We applied an analytical hierarchy process (AHP) using different thematic layers such as land use, soil type, distance to and type of roads, hydrography, distance to densely populated areas, etc. Land-use data were derived from up-to-date Sentinel-2 remote sensing images, whereas roads and hydrography were obtained from the OpenStreetMap database and later validated against administrative data. We performed the AHP analysis with the aid of the open-source QGIS Geographic Information System. This methodology is very effective for developing countries as it uses open-source software and data for the MCDA analysis, thus reducing costs in these first stages of the integrated analysis.
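
    For illustration only: a minimal sketch of the AHP weighting step, assuming an invented pairwise-comparison matrix over four example criteria. The study's actual criteria, comparison values, and GIS tooling are not reproduced here.

```python
# Sketch of the AHP weighting step for site-selection criteria (illustrative only).
# The pairwise comparison matrix below is a made-up example, not the study's values.
import numpy as np

criteria = ["land use", "distance to roads", "distance to settlements", "hydrography"]

# Saaty-style pairwise comparisons: A[i, j] = importance of criterion i over j.
A = np.array([
    [1,   3,   5,   7],
    [1/3, 1,   3,   5],
    [1/5, 1/3, 1,   3],
    [1/7, 1/5, 1/3, 1],
])

# Priority weights come from the principal eigenvector of A.
eigvals, eigvecs = np.linalg.eig(A)
principal = np.argmax(eigvals.real)
weights = np.abs(eigvecs[:, principal].real)
weights /= weights.sum()

# Consistency ratio check (random index RI = 0.90 for a 4x4 matrix, per Saaty).
ci = (eigvals.real[principal] - len(A)) / (len(A) - 1)
cr = ci / 0.90
print(dict(zip(criteria, weights.round(3))), "CR =", round(cr, 3))
```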

  5. Constructing a semantically enriched biomedical service space: a paradigm with bioinformatics resources.

    PubMed

    Koutkias, Vassilis; Malousi, Andigoni; Chouvarda, Ioanna; Maglaveras, Nicos

    2006-01-01

    Biomedical applications are becoming increasingly reliant on resource integration and information exchange within global solution frameworks that offer seamless connectivity and data sharing in distributed environments. Resource autonomy and data heterogeneity are the most important impediments towards this potential. Aiming to overcome these limitations, we propose an implementation of the service-oriented model towards the construction of an open, semantically enriched biomedical service space that enables advanced service registration, selection and access capabilities, as well as service interoperability. The proposed system is realised by defining service annotation ontologies and applying software agent technology as the means for service registration, matchmaking and interfacing in a Grid environment. The applicability of the envisioned biomedical service space is illustrated on a set of bioinformatics resources, addressing computational identification of protein-coding genes.

  6. NCBI prokaryotic genome annotation pipeline.

    PubMed

    Tatusova, Tatiana; DiCuccio, Michael; Badretdin, Azat; Chetvernin, Vyacheslav; Nawrocki, Eric P; Zaslavsky, Leonid; Lomsadze, Alexandre; Pruitt, Kim D; Borodovsky, Mark; Ostell, James

    2016-08-19

    Recent technological advances have opened unprecedented opportunities for large-scale sequencing and analysis of populations of pathogenic species in disease outbreaks, as well as for large-scale diversity studies aimed at expanding our knowledge across the whole domain of prokaryotes. To meet the challenge of timely interpretation of the structure, function and meaning of this vast genetic information, a comprehensive approach to automatic genome annotation is critically needed. In collaboration with Georgia Tech, NCBI has developed a new approach to genome annotation that combines alignment-based methods with methods that predict protein-coding and RNA genes and other functional elements directly from sequence. A new gene-finding tool, GeneMarkS+, uses the combined evidence of protein and RNA placement by homology as an initial map of annotation to generate and modify ab initio gene predictions across the whole genome. Thus, NCBI's new Prokaryotic Genome Annotation Pipeline (PGAP) relies more on sequence similarity when confident comparative data are available, while it relies more on statistical predictions in the absence of external evidence. The pipeline provides a framework for generation and analysis of annotation on the full breadth of prokaryotic taxonomy. For additional information on PGAP see https://www.ncbi.nlm.nih.gov/genome/annotation_prok/ and the NCBI Handbook, https://www.ncbi.nlm.nih.gov/books/NBK174280/.

  7. [Biomedical informatics].

    PubMed

    Capurro, Daniel; Soto, Mauricio; Vivent, Macarena; Lopetegui, Marcelo; Herskovic, Jorge R

    2011-12-01

    Biomedical Informatics is a new discipline that arose from the need to incorporate information technologies into the generation, storage, distribution and analysis of information in the biomedical sciences. This discipline comprises basic biomedical informatics and public health informatics. The development of the discipline in Chile has been modest, and most projects have originated from the interest of individual people or institutions, without systematic and coordinated national development. Considering the unique features of our country's health care system, research in the area of biomedical informatics is becoming an imperative.

  8. Biomedical Imaging,

    DTIC Science & Technology

    precision required from the task. This report details the technologies in surface and subsurface imaging systems for research and commercial applications. Biomedical imaging, Anthropometry, Computer imaging.

  9. Developing an Open-Source Bibliometric Ranking Website Using Google Scholar Citation Profiles for Researchers in the Field of Biomedical Informatics.

    PubMed

    Sittig, Dean F; McCoy, Allison B; Wright, Adam; Lin, Jimmy

    2015-01-01

    We developed the Biomedical Informatics Researchers ranking website (rank.informatics-review.com) to overcome many of the limitations of previous scientific productivity ranking strategies. The website is composed of four key components that work together to create an automatically updating ranking website: (1) a list of biomedical informatics researchers, (2) a Google Scholar scraper, (3) a display page, and (4) an updater. The site has been useful to other groups evaluating researchers, such as tenure and promotion committees interpreting the various citation statistics reported by candidates. Creation of the Biomedical Informatics Researchers ranking website highlights the vast differences in scholarly productivity among members of the biomedical informatics research community.
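
    For illustration only: the kind of ranking arithmetic such a site relies on can be shown with a standard h-index computation over hypothetical per-paper citation counts. This is not the website's scraper or production code, and the profile data are invented.

```python
# Illustrative ranking by h-index over (hypothetical) per-paper citation counts.
# This is not the website's scraper; it only shows the ranking arithmetic.

def h_index(citations):
    """Largest h such that the researcher has h papers with >= h citations each."""
    counts = sorted(citations, reverse=True)
    h = 0
    for i, c in enumerate(counts, start=1):
        if c >= i:
            h = i
        else:
            break
    return h

profiles = {
    "Researcher A": [120, 98, 45, 30, 12, 9, 3],
    "Researcher B": [60, 55, 50, 48, 40, 2],
}

ranking = sorted(profiles, key=lambda name: h_index(profiles[name]), reverse=True)
for name in ranking:
    print(name, h_index(profiles[name]))
```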

  10. Biomedical Imaging

    NASA Astrophysics Data System (ADS)

    MacPherson, Emma

    This chapter builds on the basic principles of THz spectroscopy and explains how they can be applied to biomedical systems as well as the motivation for doing so. Sample preparation techniques and measurement methods for biomedical samples are described in detail. Examples of medical applications investigated hitherto including breast cancer and skin cancer are also presented.

  11. The environment ontology: contextualising biological and biomedical entities

    PubMed Central

    2013-01-01

    As biological and biomedical research increasingly reference the environmental context of the biological entities under study, the need for formalisation and standardisation of environment descriptors is growing. The Environment Ontology (ENVO; http://www.environmentontology.org) is a community-led, open project which seeks to provide an ontology for specifying a wide range of environments relevant to multiple life science disciplines and, through an open participation model, to accommodate the terminological requirements of all those needing to annotate data using ontology classes. This paper summarises ENVO’s motivation, content, structure, adoption, and governance approach. The ontology is available from http://purl.obolibrary.org/obo/envo.owl - an OBO format version is also available by switching the file suffix to “obo”. PMID:24330602

  12. The environment ontology: contextualising biological and biomedical entities.

    PubMed

    Buttigieg, Pier Luigi; Morrison, Norman; Smith, Barry; Mungall, Christopher J; Lewis, Suzanna E

    2013-12-11

    As biological and biomedical research increasingly reference the environmental context of the biological entities under study, the need for formalisation and standardisation of environment descriptors is growing. The Environment Ontology (ENVO; http://www.environmentontology.org) is a community-led, open project which seeks to provide an ontology for specifying a wide range of environments relevant to multiple life science disciplines and, through an open participation model, to accommodate the terminological requirements of all those needing to annotate data using ontology classes. This paper summarises ENVO's motivation, content, structure, adoption, and governance approach. The ontology is available from http://purl.obolibrary.org/obo/envo.owl - an OBO format version is also available by switching the file suffix to "obo".

  13. Biomedical ontologies: toward scientific debate.

    PubMed

    Maojo, V; Crespo, J; García-Remesal, M; de la Iglesia, D; Perez-Rey, D; Kulikowski, C

    2011-01-01

    Biomedical ontologies have been very successful in structuring knowledge for many different applications, receiving widespread praise for their utility and potential. Yet, the role of computational ontologies in scientific research, as opposed to knowledge management applications, has not been extensively discussed. We aim to stimulate further discussion on the advantages and challenges presented by biomedical ontologies from a scientific perspective. We review various aspects of biomedical ontologies going beyond their practical successes, and focus on some key scientific questions in two ways. First, we analyze and discuss current approaches to improve biomedical ontologies that are based largely on classical, Aristotelian ontological models of reality. Second, we raise various open questions about biomedical ontologies that require further research, analyzing in more detail those related to visual reasoning and spatial ontologies. We outline significant scientific issues that biomedical ontologies should consider, beyond current efforts of building practical consensus between them. For spatial ontologies, we suggest an approach for building "morphospatial" taxonomies, as an example that could stimulate research on fundamental open issues for biomedical ontologies. Analysis of a large number of problems with biomedical ontologies suggests that the field is very much open to alternative interpretations of current work, and in need of scientific debate and discussion that can lead to new ideas and research directions.

  14. The National Center for Biomedical Ontology: Advancing Biomedicine through Structured Organization of Scientific Knowledge

    SciTech Connect

    Rubin, Daniel L.; Lewis, Suzanna E.; Mungall, Chris J.; Misra, Sima; Westerfield, Monte; Ashburner, Michael; Sim, Ida; Chute, Christopher G.; Solbrig, Harold; Storey, Margaret-Anne; Smith, Barry; Day-Richter, John; Noy, Natalya F.; Musen, Mark A.

    2006-01-23

    The National Center for Biomedical Ontology (http://bioontology.org) is a consortium that comprises leading informaticians, biologists, clinicians, and ontologists funded by the NIH Roadmap to develop innovative technology and methods that allow scientists to record, manage, and disseminate biomedical information and knowledge in machine-processable form. The goals of the Center are: (1) to help unify the divergent and isolated efforts in ontology development by promoting high quality open-source, standards-based tools to create, manage, and use ontologies, (2) to create new software tools so that scientists can use ontologies to annotate and analyze biomedical data, (3) to provide a national resource for the ongoing evaluation, integration, and evolution of biomedical ontologies and associated tools and theories in the context of driving biomedical projects (DBPs), and (4) to disseminate the tools and resources of the Center and to identify, evaluate, and communicate best practices of ontology development to the biomedical community. The Center is working toward these objectives by providing tools to develop ontologies and to annotate experimental data, and by developing resources to integrate and relate existing ontologies as well as by creating repositories of biomedical data that are annotated using those ontologies. The Center is providing training workshops in ontology design, development, and usage, and is also pursuing research in ontology evaluation, quality, and use of ontologies to promote scientific discovery. Through the research activities within the Center, collaborations with the DBPs, and interactions with the biomedical community, our goal is to help scientists to work more effectively in the e-science paradigm, enhancing experiment design, experiment execution, data analysis, information synthesis, hypothesis generation and testing, and the understanding of human disease.

  15. Mining GO annotations for improving annotation consistency.

    PubMed

    Faria, Daniel; Schlicker, Andreas; Pesquita, Catia; Bastos, Hugo; Ferreira, António E N; Albrecht, Mario; Falcão, André O

    2012-01-01

    Despite the structure and objectivity provided by the Gene Ontology (GO), the annotation of proteins is a complex task that is subject to errors and inconsistencies. Electronically inferred annotations in particular are widely considered unreliable. However, given that manual curation of all GO annotations is unfeasible, it is imperative to improve the quality of electronically inferred annotations. In this work, we analyze the full GO molecular function annotation of UniProtKB proteins, and discuss some of the issues that affect their quality, focusing particularly on the lack of annotation consistency. Based on our analysis, we estimate that 64% of the UniProtKB proteins are incompletely annotated, and that inconsistent annotations affect 83% of the protein functions and at least 23% of the proteins. Additionally, we present and evaluate a data mining algorithm, based on the association rule learning methodology, for identifying implicit relationships between molecular function terms. The goal of this algorithm is to assist GO curators in updating GO and correcting and preventing inconsistent annotations. Our algorithm predicted 501 relationships with an estimated precision of 94%, whereas the basic association rule learning methodology predicted 12,352 relationships with a precision below 9%.
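
    For illustration only: the association-rule idea can be sketched with a tiny support/confidence miner over per-protein GO term sets. The GO identifiers, annotation sets, and thresholds below are fabricated and do not reproduce the authors' algorithm or parameters.

```python
# Toy association-rule mining between GO molecular function terms (illustrative).
# Annotation sets and GO identifiers below are fabricated examples.
from itertools import combinations
from collections import Counter

annotations = [
    {"GO:0003824", "GO:0016787"},                 # protein 1
    {"GO:0003824", "GO:0016787", "GO:0004252"},   # protein 2
    {"GO:0003824", "GO:0016787"},                 # protein 3
    {"GO:0005215"},                               # protein 4
]

MIN_SUPPORT, MIN_CONFIDENCE = 2, 0.9

term_counts = Counter(t for ann in annotations for t in ann)
pair_counts = Counter(
    pair for ann in annotations for pair in combinations(sorted(ann), 2)
)

# Rule A -> B holds if the pair is frequent and P(B | A) is high enough.
for (a, b), n_ab in pair_counts.items():
    if n_ab < MIN_SUPPORT:
        continue
    for head, tail in ((a, b), (b, a)):
        confidence = n_ab / term_counts[head]
        if confidence >= MIN_CONFIDENCE:
            print(f"{head} -> {tail}  support={n_ab}  confidence={confidence:.2f}")
```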

  16. PREFACE: 17th International School on Condensed Matter Physics (ISCMP): Open Problems in Condensed Matter Physics, Biomedical Physics and their Applications

    NASA Astrophysics Data System (ADS)

    Dimova-Malinovska, Doriana; Nesheva, Diana; Pecheva, Emilia; Petrov, Alexander G.; Primatarowa, Marina T.

    2012-12-01

    We are pleased to introduce the Proceedings of the 17th International School on Condensed Matter Physics: Open Problems in Condensed Matter Physics, Biomedical Physics and their Applications, organized by the Institute of Solid State Physics of the Bulgarian Academy of Sciences. The Chairman of the School was Professor Alexander G Petrov. Like prior events, the School took place in the beautiful Black Sea resort of Saints Constantine and Helena near Varna, returning to the refurbished facilities of the Panorama hotel. Participants from 17 different countries delivered 31 invited lectures and presented 78 posters in three poster sessions. Papers submitted to the Proceedings were refereed according to the high standards of the Journal of Physics: Conference Series, and the accepted papers illustrate the diversity and the high level of the contributions. Not the least significant factor in the success of the 17th ISCMP was the social program, both the organized events (Welcome and Farewell Parties) and the variety of pleasant local restaurants and beaches. Visits to the Archaeological Museum (rich in valuable gold treasures of the ancient Thracian culture) and to the famous rock monastery Aladja were organized for the participants from the Varna Municipality. These Proceedings are published for the second time by the Journal of Physics: Conference Series. We are grateful to the Journal's staff for supporting this idea. The Committee decided that the next event will take place again in Saints Constantine and Helena, 1-5 September 2014. It will be entitled: Challenges of the Nanoscale Science: Theory, Materials and Applications. Doriana Dimova-Malinovska, Diana Nesheva, Emilia Pecheva, Alexander G Petrov and Marina T Primatarowa, Editors

  17. Building a biomedical ontology recommender web service

    PubMed Central

    2010-01-01

    Background Researchers in biomedical informatics use ontologies and terminologies to annotate their data in order to facilitate data integration and translational discoveries. As the use of ontologies for annotation of biomedical datasets has risen, a common challenge is to identify ontologies that are best suited to annotating specific datasets. The number and variety of biomedical ontologies is large, and it is cumbersome for a researcher to figure out which ontology to use. Methods We present the Biomedical Ontology Recommender web service. The system uses textual metadata or a set of keywords describing a domain of interest and suggests appropriate ontologies for annotating or representing the data. The service makes a decision based on three criteria. The first one is coverage, or the ontologies that provide the most terms covering the input text. The second is connectivity, or the ontologies that are most often mapped to by other ontologies. The final criterion is size, or the number of concepts in the ontologies. The service scores the ontologies as a function of scores of the annotations created using the National Center for Biomedical Ontology (NCBO) Annotator web service. We used all the ontologies from the UMLS Metathesaurus and the NCBO BioPortal. Results We compare and contrast our Recommender through an exhaustive functional comparison with previously published efforts. We evaluate and discuss the results of several recommendation heuristics in the context of three real-world use cases. The best recommendation heuristics, rated ‘very relevant’ by expert evaluators, are the ones based on the coverage and connectivity criteria. The Recommender service (alpha version) is available to the community and is embedded into BioPortal. PMID:20626921

  18. Building a biomedical ontology recommender web service.

    PubMed

    Jonquet, Clement; Musen, Mark A; Shah, Nigam H

    2010-06-22

    Researchers in biomedical informatics use ontologies and terminologies to annotate their data in order to facilitate data integration and translational discoveries. As the use of ontologies for annotation of biomedical datasets has risen, a common challenge is to identify ontologies that are best suited to annotating specific datasets. The number and variety of biomedical ontologies is large, and it is cumbersome for a researcher to figure out which ontology to use. We present the Biomedical Ontology Recommender web service. The system uses textual metadata or a set of keywords describing a domain of interest and suggests appropriate ontologies for annotating or representing the data. The service makes a decision based on three criteria. The first one is coverage, or the ontologies that provide the most terms covering the input text. The second is connectivity, or the ontologies that are most often mapped to by other ontologies. The final criterion is size, or the number of concepts in the ontologies. The service scores the ontologies as a function of scores of the annotations created using the National Center for Biomedical Ontology (NCBO) Annotator web service. We used all the ontologies from the UMLS Metathesaurus and the NCBO BioPortal. We compare and contrast our Recommender through an exhaustive functional comparison with previously published efforts. We evaluate and discuss the results of several recommendation heuristics in the context of three real-world use cases. The best recommendation heuristics, rated 'very relevant' by expert evaluators, are the ones based on the coverage and connectivity criteria. The Recommender service (alpha version) is available to the community and is embedded into BioPortal.
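
    For illustration only: a toy combination of the coverage, connectivity and size criteria described above might look like the following. The weights, normalizations, ontology names and statistics are invented for illustration and do not reproduce the Recommender's actual scoring function.

```python
# Illustrative combination of coverage, connectivity and size scores for
# ontology recommendation. All numbers and weights are made-up placeholders.

def recommend(candidates, weights=(0.6, 0.3, 0.1)):
    """Rank ontologies by a weighted sum of normalized criterion scores."""
    w_cov, w_conn, w_size = weights
    max_conn = max(c["mappings_to"] for c in candidates.values()) or 1
    max_size = max(c["num_classes"] for c in candidates.values()) or 1
    scores = {}
    for name, c in candidates.items():
        coverage = c["terms_matched"] / c["terms_in_input"]
        connectivity = c["mappings_to"] / max_conn
        size = c["num_classes"] / max_size
        scores[name] = w_cov * coverage + w_conn * connectivity + w_size * size
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

candidates = {
    "Ontology A": {"terms_matched": 40, "terms_in_input": 50,
                   "mappings_to": 900, "num_classes": 300_000},
    "Ontology B": {"terms_matched": 30, "terms_in_input": 50,
                   "mappings_to": 700, "num_classes": 100_000},
}
print(recommend(candidates))
```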

  19. Opening up Academic Biomedical Research

    NASA Image and Video Library

    Eva Guinan, MD, Associate Professor of Pediatrics, Associate Direction, Center for Clinical and Translational Research at Harvard Medical School, was featured during the September 7, 2011 Innovatio...

  20. Biomedical Telectrodes

    NASA Technical Reports Server (NTRS)

    Shepherd, C. K.

    1989-01-01

    Compact transmitters eliminate need for wires to monitors. Biomedical telectrode is small electronic package that attaches to patient in manner similar to small adhesive bandage. Patient wearing biomedical telectrodes moves freely, without risk of breaking or entangling wire connections. Especially beneficial to patients undergoing electrocardiographic monitoring in intensive-care units in hospitals. Eliminates nuisance of coping with wire connections while dressing and going to toilet.

  1. Bacterial genome annotation.

    PubMed

    Beckloff, Nicholas; Starkenburg, Shawn; Freitas, Tracey; Chain, Patrick

    2012-01-01

    Annotation of prokaryotic sequences can be separated into structural and functional annotation. Structural annotation is dependent on algorithmic interrogation of experimental evidence to discover the physical characteristics of a gene. This is done in an effort to construct accurate gene models, so understanding function or evolution of genes among organisms is not impeded. Functional annotation is dependent on sequence similarity to other known genes or proteins in an effort to assess the function of the gene. Combining structural and functional annotation across genomes in a comparative manner promotes higher levels of accurate annotation as well as an advanced understanding of genome evolution. As the availability of bacterial sequences increases and annotation methods improve, the value of comparative annotation will increase.

  2. Question Analysis for Biomedical Question Answering

    PubMed Central

    Sable, Carl; Lee, Minsuk; Zhu, Hai Ran; Yu, Hong

    2005-01-01

    We are developing a biomedical question answering system. This paper describes our system’s architecture and our question analysis component. Specifically, we have explored the use of various supervised machine learning approaches to filter out unanswerable questions based on physicians’ annotations. PMID:16779389
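
    For illustration only: a minimal version of such a supervised filter, assuming a small set of physician-labeled questions, could use a TF-IDF representation and a linear classifier. The questions and labels below are fabricated; this is not the authors' system.

```python
# Minimal sketch of filtering unanswerable questions with a supervised classifier.
# Training examples and labels are fabricated; real systems use physician annotations.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

questions = [
    "What is the recommended dose of metformin for type 2 diabetes?",
    "Which antibiotic covers community-acquired pneumonia in adults?",
    "Why did this particular patient feel better on Tuesday?",
    "What will my patient's family think about the diagnosis?",
]
labels = ["answerable", "answerable", "unanswerable", "unanswerable"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(questions, labels)

print(clf.predict(["What is the first-line treatment for hypertension?"]))
```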

  3. Integrating systems biology models and biomedical ontologies

    PubMed Central

    2011-01-01

    Background Systems biology is an approach to biology that emphasizes the structure and dynamic behavior of biological systems and the interactions that occur within them. To succeed, systems biology crucially depends on the accessibility and integration of data across domains and levels of granularity. Biomedical ontologies were developed to facilitate such an integration of data and are often used to annotate biosimulation models in systems biology. Results We provide a framework to integrate representations of in silico systems biology with those of in vivo biology as described by biomedical ontologies and demonstrate this framework using the Systems Biology Markup Language. We developed the SBML Harvester software that automatically converts annotated SBML models into OWL and we apply our software to those biosimulation models that are contained in the BioModels Database. We utilize the resulting knowledge base for complex biological queries that can bridge levels of granularity, verify models based on the biological phenomenon they represent and provide a means to establish a basic qualitative layer on which to express the semantics of biosimulation models. Conclusions We establish an information flow between biomedical ontologies and biosimulation models and we demonstrate that the integration of annotated biosimulation models and biomedical ontologies enables the verification of models as well as expressive queries. Establishing a bi-directional information flow between systems biology and biomedical ontologies has the potential to enable large-scale analyses of biological systems that span levels of granularity from molecules to organisms. PMID:21835028

  4. Enabling Ontology Based Semantic Queries in Biomedical Database Systems

    PubMed Central

    Zheng, Shuai; Wang, Fusheng; Lu, James; Saltz, Joel

    2013-01-01

    While current biomedical ontology repositories offer primitive query capabilities, it is difficult or cumbersome to support ontology based semantic queries directly in semantically annotated biomedical databases. The problem may be largely attributed to the mismatch between the models of the ontologies and the databases, and the mismatch between the query interfaces of the two systems. To fully realize semantic query capabilities based on ontologies, we develop a system DBOntoLink to provide unified semantic query interfaces by extending database query languages. With DBOntoLink, semantic queries can be directly and naturally specified as extended functions of the database query languages without any programming needed. DBOntoLink is adaptable to different ontologies through customizations and supports major biomedical ontologies hosted at the NCBO BioPortal. We demonstrate the use of DBOntoLink in a real world biomedical database with semantically annotated medical image annotations. PMID:23404054

  5. Enabling Ontology Based Semantic Queries in Biomedical Database Systems.

    PubMed

    Zheng, Shuai; Wang, Fusheng; Lu, James; Saltz, Joel

    2012-01-01

    While current biomedical ontology repositories offer primitive query capabilities, it is difficult or cumbersome to support ontology based semantic queries directly in semantically annotated biomedical databases. The problem may be largely attributed to the mismatch between the models of the ontologies and the databases, and the mismatch between the query interfaces of the two systems. To fully realize semantic query capabilities based on ontologies, we develop a system DBOntoLink to provide unified semantic query interfaces by extending database query languages. With DBOntoLink, semantic queries can be directly and naturally specified as extended functions of the database query languages without any programming needed. DBOntoLink is adaptable to different ontologies through customizations and supports major biomedical ontologies hosted at the NCBO BioPortal. We demonstrate the use of DBOntoLink in a real world biomedical database with semantically annotated medical image annotations.
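
    For illustration only: the general idea of answering an ontology-aware query from a relational database can be sketched by expanding a query concept to its subclass closure and substituting the expanded ID set into an ordinary SQL IN clause. The ontology fragment, schema, and identifiers below are invented and do not reflect DBOntoLink's actual interface.

```python
# Sketch: ontology-aware querying by subclass expansion (not DBOntoLink itself).
# The ontology fragment, schema and identifiers are hypothetical.
import sqlite3

# parent -> children: a tiny is-a hierarchy of anatomical concepts.
subclasses = {
    "ONT:organ": ["ONT:liver", "ONT:kidney"],
    "ONT:liver": ["ONT:liver_lobe"],
}

def closure(concept):
    """Return the concept plus all of its transitive subclasses."""
    result, stack = set(), [concept]
    while stack:
        c = stack.pop()
        if c not in result:
            result.add(c)
            stack.extend(subclasses.get(c, []))
    return result

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE image_annotation (image_id TEXT, concept_id TEXT)")
conn.executemany("INSERT INTO image_annotation VALUES (?, ?)",
                 [("img1", "ONT:liver_lobe"), ("img2", "ONT:kidney"),
                  ("img3", "ONT:heart")])

ids = sorted(closure("ONT:organ"))
placeholders = ",".join("?" for _ in ids)
rows = conn.execute(
    f"SELECT image_id FROM image_annotation WHERE concept_id IN ({placeholders})", ids
).fetchall()
print(rows)   # images annotated with 'organ' or any of its subclasses
```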

  6. Biomedical research

    NASA Technical Reports Server (NTRS)

    1981-01-01

    Biomedical problems encountered by man in space which have been identified as a result of previous experience in simulated or actual spaceflight include cardiovascular deconditioning, motion sickness, bone loss, muscle atrophy, red cell alterations, fluid and electrolyte loss, radiation effects, radiation protection, behavior, and performance. The investigations and the findings in each of these areas were reviewed. A description of how biomedical research is organized within NASA, how it is funded, and how it is being reoriented to meet the needs of future manned space missions is also provided.

  7. Biomedical nanotechnology.

    PubMed

    Hurst, Sarah J

    2011-01-01

    This chapter summarizes the roles of nanomaterials in biomedical applications, focusing on those highlighted in this volume. A brief history of nanoscience and technology and a general introduction to the field are presented. Then, the chemical and physical properties of nanostructures that make them ideal for use in biomedical applications are highlighted. Examples of common applications, including sensing, imaging, and therapeutics, are given. Finally, the challenges associated with translating this field from the research laboratory to the clinic setting, in terms of the larger societal implications, are discussed.

  8. Gene ontology annotation by density and gravitation models.

    PubMed

    Hou, Wen-Juan; Lin, Kevin Hsin-Yih; Chen, Hsin-Hsi

    2006-01-01

    Gene Ontology (GO) is developed to provide standard vocabularies of gene products in different databases. The process of annotating GO terms to genes requires curators to read through lengthy articles. Methods for speeding up or automating the annotation process are thus of great importance. We propose a GO annotation approach using full-text biomedical documents for directing more relevant papers to curators. This system explores word density and gravitation relationships between genes and GO terms. Different density and gravitation models are built and several evaluation criteria are employed to assess the effects of the proposed methods.
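
    For illustration only: a gravitation-style association between a gene mention and a GO term mention can be caricatured as the product of their mention frequencies divided by the squared token distance between their nearest occurrences. This formula and the example text are illustrative guesses, not the authors' exact model.

```python
# Caricature of a gravitation-style score between a gene and a GO term mention:
# score = (freq_gene * freq_term) / (nearest token distance)^2.
# The formula's details are an assumption for illustration, not the paper's model.

def gravitation_score(tokens, gene, go_term):
    gene_pos = [i for i, t in enumerate(tokens) if t == gene]
    term_pos = [i for i, t in enumerate(tokens) if t == go_term]
    if not gene_pos or not term_pos:
        return 0.0
    d = min(abs(i - j) for i in gene_pos for j in term_pos)
    return (len(gene_pos) * len(term_pos)) / max(d, 1) ** 2

text = "tp53 regulates apoptosis and tp53 loss impairs apoptosis in tumors"
print(gravitation_score(text.split(), "tp53", "apoptosis"))
```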

  9. Teaching and Learning Communities through Online Annotation

    NASA Astrophysics Data System (ADS)

    van der Pluijm, B.

    2016-12-01

    What do colleagues do with your assigned textbook? What do they say or think about the material? Do you want students to be more engaged in their learning experience? If so, online materials that complement the standard lecture format provide new opportunities through managed, online group annotation that leverages the ubiquity of internet access while personalizing learning. The concept is illustrated with the new online textbook "Processes in Structural Geology and Tectonics", by Ben van der Pluijm and Stephen Marshak, which offers a platform for sharing experiences, supplementary materials and approaches, including readings, mathematical applications, exercises, challenge questions, quizzes, alternative explanations, and more. The annotation framework used is Hypothes.is, which offers a free, open-platform markup environment for annotating websites and PDF postings. The annotations can be public, grouped or individualized, as desired, including export access and download of annotations. A teacher group, hosted by a moderator/owner, limits access to members of a user group of teachers, so that its members can use, copy or transcribe annotations for their own lesson material. Likewise, an instructor can host a student group that encourages sharing of observations, questions and answers among students and instructor. The instructor can also create one or more closed groups that offer study help and hints to students. All of these options aim to engage students and to promote greater responsibility for their learning experience. Beyond new capacity, the ability to analyze student annotations supports individual learners and their needs. For example, student notes can be analyzed for key phrases and concepts to identify misunderstandings, omissions and problems. Example annotations can also be shared to enhance notetaking skills and to help with studying. Lastly, online annotation allows active application to posted lecture slides, supporting real-time notetaking

  10. Openings

    PubMed Central

    Selwyn, Peter A.

    2015-01-01

    Reviewing his clinic patient schedule for the day, a physician reflects on the history of a young woman he has been caring for over the past 9 years. What starts out as a routine visit then turns into a unique opening for communication and connection. A chance glimpse out the window of the exam room leads to a deeper meditation on parenthood, survival, and healing, not only for the patient but also for the physician. How many missed opportunities have we all had, without even realizing it, to allow this kind of fleeting but profound opening? PMID:26195687

  11. Dynamic multimedia annotation tool

    NASA Astrophysics Data System (ADS)

    Pfund, Thomas; Marchand-Maillet, Stephane

    2001-12-01

    Annotating image collections is crucial for different multimedia applications. Not only does this provide alternative access to visual information, but it is also a critical step in evaluating content-based image retrieval systems. Annotation is a tedious task, so there is a real need for tools that lighten the work of annotators. Such a tool should be flexible and offer customization so as to make the annotator as comfortable as possible, and it should automate as many tasks as possible. In this paper, we present a still-image annotation tool that has been developed with the aim of being flexible and adaptive. The principle is to create a set of dynamic web pages that act as an interface to an SQL database. The keyword set is fixed, and every image receives from concurrent annotators a set of keywords along with time stamps and annotator IDs. Each annotator can go back and forth within the collection and their previous annotations, helped by a number of search services and customization options. An administrative section allows the supervisor to control the parameters of the annotation, including the keyword set, given via an XML structure. The architecture of the tool is made flexible so as to accommodate further options as it develops.
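
    For illustration only: the storage described here, keywords with time stamps and annotator IDs per image, maps naturally onto a small relational schema. The sketch below uses SQLite with invented table and column names rather than the tool's actual database.

```python
# Sketch of an image-annotation store with concurrent annotators (illustrative).
# Table and column names are invented; the original tool used its own SQL schema.
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE annotation (
        image_id     TEXT NOT NULL,
        keyword      TEXT NOT NULL,
        annotator_id TEXT NOT NULL,
        created_at   TEXT NOT NULL
    )
""")

def annotate(image_id, keyword, annotator_id):
    conn.execute("INSERT INTO annotation VALUES (?, ?, ?, ?)",
                 (image_id, keyword, annotator_id,
                  datetime.now(timezone.utc).isoformat()))

annotate("img_0042", "beach", "annotator_1")
annotate("img_0042", "sunset", "annotator_2")

# Review an annotator's previous work, mirroring the tool's back-and-forth navigation.
for row in conn.execute(
        "SELECT image_id, keyword, created_at FROM annotation "
        "WHERE annotator_id = ? ORDER BY created_at", ("annotator_1",)):
    print(row)
```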

  12. Automated Update, Revision, and Quality Control of the Maize Genome Annotations Using MAKER-P Improves the B73 RefGen_v3 Gene Models and Identifies New Genes

    PubMed Central

    Law, MeiYee; Childs, Kevin L.; Campbell, Michael S.; Stein, Joshua C.; Olson, Andrew J.; Holt, Carson; Panchy, Nicholas; Lei, Jikai; Jiao, Dian; Andorf, Carson M.; Lawrence, Carolyn J.; Ware, Doreen; Shiu, Shin-Han; Sun, Yanni; Jiang, Ning; Yandell, Mark

    2015-01-01

    The large size and relative complexity of many plant genomes make creation, quality control, and dissemination of high-quality gene structure annotations challenging. In response, we have developed MAKER-P, a fast and easy-to-use genome annotation engine for plants. Here, we report the use of MAKER-P to update and revise the maize (Zea mays) B73 RefGen_v3 annotation build (5b+) in less than 3 h using the iPlant Cyberinfrastructure. MAKER-P identified and annotated 4,466 additional, well-supported protein-coding genes not present in the 5b+ annotation build, added additional untranslated regions to 1,393 5b+ gene models, identified 2,647 5b+ gene models that lack any supporting evidence (despite the use of large and diverse evidence data sets), identified 104,215 pseudogene fragments, and created an additional 2,522 noncoding gene annotations. We also describe a method for de novo training of MAKER-P for the annotation of newly sequenced grass genomes. Collectively, these results lead to the 6a maize genome annotation and demonstrate the utility of MAKER-P for rapid annotation, management, and quality control of grasses and other difficult-to-annotate plant genomes. PMID:25384563

  13. Publishing priorities of biomedical research funders

    PubMed Central

    Collins, Ellen

    2013-01-01

    Objectives To understand the publishing priorities, especially in relation to open access, of 10 UK biomedical research funders. Design Semistructured interviews. Setting 10 UK biomedical research funders. Participants 12 employees with responsibility for research management at 10 UK biomedical research funders; a purposive sample to represent a range of backgrounds and organisation types. Conclusions Publicly funded and large biomedical research funders are committed to open access publishing and are pleased with recent developments which have stimulated growth in this area. Smaller charitable funders are supportive of the aims of open access, but are concerned about the practical implications for their budgets and their funded researchers. Across the board, biomedical research funders are turning their attention to other priorities for sharing research outputs, including data, protocols and negative results. Further work is required to understand how smaller funders, including charitable funders, can support open access. PMID:24154520

  14. Making web annotations persistent over time

    SciTech Connect

    Sanderson, Robert; Van De Sompel, Herbert

    2010-01-01

    As Digital Libraries (DL) become more aligned with the web architecture, their functional components need to be fundamentally rethought in terms of URIs and HTTP. Annotation, a core scholarly activity enabled by many DL solutions, exhibits a clearly unacceptable characteristic when existing models are applied to the web: due to the representations of web resources changing over time, an annotation made about a web resource today may no longer be relevant to the representation that is served from that same resource tomorrow. We assume the existence of archived versions of resources, and combine the temporal features of the emerging Open Annotation data model with the capability offered by the Memento framework that allows seamless navigation from the URI of a resource to archived versions of that resource, and arrive at a solution that provides guarantees regarding the persistence of web annotations over time. More specifically, we provide theoretical solutions and proof-of-concept experimental evaluations for two problems: reconstructing an existing annotation so that the correct archived version is displayed for all resources involved in the annotation, and retrieving all annotations that involve a given archived version of a web resource.
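
    For illustration only: the Memento side of this approach boils down to datetime content negotiation. A client sends an Accept-Datetime header to a TimeGate and is redirected to the archived version (memento) closest to the annotation's creation time. The sketch below assumes the public Time Travel TimeGate URL pattern as an example endpoint; it is not the authors' implementation.

```python
# Sketch: resolving the archived version of an annotated resource via Memento
# datetime negotiation. The TimeGate URL pattern is an assumed public example.
from email.utils import format_datetime
from datetime import datetime, timezone
import requests

def resolve_memento(resource_uri, annotated_at):
    """Ask a TimeGate for the memento closest to the annotation's creation time."""
    timegate = "http://timetravel.mementoweb.org/timegate/" + resource_uri
    headers = {"Accept-Datetime": format_datetime(annotated_at, usegmt=True)}
    resp = requests.get(timegate, headers=headers, allow_redirects=True, timeout=30)
    return resp.url, resp.headers.get("Memento-Datetime")

memento_uri, memento_dt = resolve_memento(
    "http://example.org/article", datetime(2010, 1, 15, tzinfo=timezone.utc))
print(memento_uri, memento_dt)
```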

  15. NCBO Ontology Recommender 2.0: an enhanced approach for biomedical ontology recommendation.

    PubMed

    Martínez-Romero, Marcos; Jonquet, Clement; O'Connor, Martin J; Graybeal, John; Pazos, Alejandro; Musen, Mark A

    2017-06-07

    Ontologies and controlled terminologies have become increasingly important in biomedical research. Researchers use ontologies to annotate their data with ontology terms, enabling better data integration and interoperability across disparate datasets. However, the number, variety and complexity of current biomedical ontologies make it cumbersome for researchers to determine which ones to reuse for their specific needs. To overcome this problem, in 2010 the National Center for Biomedical Ontology (NCBO) released the Ontology Recommender, which is a service that receives a biomedical text corpus or a list of keywords and suggests ontologies appropriate for referencing the indicated terms. We developed a new version of the NCBO Ontology Recommender. Called Ontology Recommender 2.0, it uses a novel recommendation approach that evaluates the relevance of an ontology to biomedical text data according to four different criteria: (1) the extent to which the ontology covers the input data; (2) the acceptance of the ontology in the biomedical community; (3) the level of detail of the ontology classes that cover the input data; and (4) the specialization of the ontology to the domain of the input data. Our evaluation shows that the enhanced recommender provides higher quality suggestions than the original approach, providing better coverage of the input data, more detailed information about their concepts, increased specialization for the domain of the input data, and greater acceptance and use in the community. In addition, it provides users with more explanatory information, along with suggestions of not only individual ontologies but also groups of ontologies to use together. It also can be customized to fit the needs of different ontology recommendation scenarios. Ontology Recommender 2.0 suggests relevant ontologies for annotating biomedical text data. It combines the strengths of its predecessor with a range of adjustments and new features that improve its reliability

  16. Galileo Reader and Annotator

    NASA Astrophysics Data System (ADS)

    Besomi, O.

    2011-06-01

    In his readings, Galileo made frequent use of annotations. Here, I will offer a general glance at them by discussing the case of the annotations to the Libra astronomica published in 1619 by Orazio Grassi, a Jesuit mathematician of the Collegio Romano. The annotations directly reflect Galileo's reaction to Grassi's book in a heated debate between the two astronomers. Galileo and Grassi had opposite ideas about the nature of the comets, which resulted in different scientific and theological implications. The annotations represent the starting point for Galileo's reply to the Libra, namely Il Saggiatore, which was published four years later and dedicated to the new pope Urban VIII.

  17. Biomedical Conferences

    NASA Technical Reports Server (NTRS)

    1976-01-01

    As a result of Biomedical Conferences, Vivo Metric Systems Co. has produced cardiac electrodes based on NASA technology. Frequently in science, one highly specialized discipline is unaware of relevant advances made in other areas. In an attempt to familiarize researchers in a variety of disciplines with medical problems and needs, NASA has sponsored conferences that bring together university scientists, practicing physicians and manufacturers of medical instruments.

  18. Exploring and linking biomedical resources through multidimensional semantic spaces.

    PubMed

    Berlanga, Rafael; Jiménez-Ruiz, Ernesto; Nebot, Victoria

    2012-01-25

    The semantic integration of biomedical resources is still a challenging issue which is required for effective information processing and data analysis. The availability of comprehensive knowledge resources such as biomedical ontologies and integrated thesauri greatly facilitates this integration effort by means of semantic annotation, which allows disparate data formats and contents to be expressed under a common semantic space. In this paper, we propose a multidimensional representation for such a semantic space, where dimensions regard the different perspectives in biomedical research (e.g., population, disease, anatomy and protein/genes). This paper presents a novel method for building multidimensional semantic spaces from semantically annotated biomedical data collections. This method consists of two main processes: knowledge and data normalization. The former one arranges the concepts provided by a reference knowledge resource (e.g., biomedical ontologies and thesauri) into a set of hierarchical dimensions for analysis purposes. The latter one reduces the annotation set associated to each collection item into a set of points of the multidimensional space. Additionally, we have developed a visual tool, called 3D-Browser, which implements OLAP-like operators over the generated multidimensional space. The method and the tool have been tested and evaluated in the context of the Health-e-Child (HeC) project. Automatic semantic annotation was applied to tag three collections of abstracts taken from PubMed, one for each target disease of the project, the Uniprot database, and the HeC patient record database. We adopted the UMLS Meta-thesaurus 2010AA as the reference knowledge resource. Current knowledge resources and semantic-aware technology make possible the integration of biomedical resources. Such an integration is performed through semantic annotation of the intended biomedical data resources. This paper shows how these annotations can be exploited for

  19. Biomedical technology in Franconia.

    PubMed

    Efferth, T

    2000-01-01

    The medical instrumentation and biotechnology business is developing rapidly in Franconia. The universities of Bayreuth, Erlangen-Nürnberg, and Würzburg hold top ranks in extramurally funded biomedical research and have high competence in biomedical research, medical instrumentation, and biotechnology. The association "BioMedTec Franken e.V." was founded at the beginning of 1999 both to foster information exchange between universities, industry and politics and to facilitate the establishment of biomedical companies by means of science parks. In the IGZ (Innovation and Foundation Center Nürnberg-Fürth-Erlangen), 4,500 square meters of space are currently shared by 19 new companies. Since 1985, 60 companies in the IGZ have had a total turnover of about 74 million euros. The TGZ (Technologie- und Gründerzentrum) in Würzburg provides space for 11 companies. Further science parks tailored to the specific needs of biomedical technology companies will be set up in the near future. A science park for medical instrumentation will be founded in Erlangen (IZMP, Innovations- und Gründerzentrum für Medizintechnik und Pharma in der Region Nürnberg, Fürth, Erlangen). Furthermore, a Biomedical Technology Center and a Research Center for Biocompatible Materials are to be founded in Würzburg and Bayreuth, respectively. Several communication platforms (Bayern Innovativ, FORWISS, FTT, KIM, N-TEC-VISIT, TBU, WETTI, etc.) allow the transfer of local academic research to industrial use and open new cooperation possibilities. International pharmaceutical companies (Novartis, Nürnberg; Pharmacia Upjohn, Erlangen) are located in Franconia. Central Franconia represents a national focus for medical instrumentation. The Erlangen site of Siemens' Medical Engineering Section employs 4,500 people, including approximately 1,000 employees in the Siemens research center.

  20. Constructing a semantic predication gold standard from the biomedical literature

    PubMed Central

    2011-01-01

    Background Semantic relations increasingly underpin biomedical text mining and knowledge discovery applications. The success of such practical applications crucially depends on the quality of extracted relations, which can be assessed against a gold standard reference. Most such references in biomedical text mining focus on narrow subdomains and adopt different semantic representations, rendering them difficult to use for benchmarking independently developed relation extraction systems. In this article, we present a multi-phase gold standard annotation study, in which we annotated 500 sentences randomly selected from MEDLINE abstracts on a wide range of biomedical topics with 1371 semantic predications. The UMLS Metathesaurus served as the main source for conceptual information and the UMLS Semantic Network for relational information. We measured interannotator agreement and analyzed the annotations closely to identify some of the challenges in annotating biomedical text with relations based on an ontology or a terminology. Results We obtain fair to moderate interannotator agreement in the practice phase (0.378-0.475). With improved guidelines and additional semantic equivalence criteria, the agreement increases by 12% (0.415 to 0.536) in the main annotation phase. In addition, we find that agreement increases to 0.688 when the agreement calculation is limited to those predications that are based only on the explicitly provided UMLS concepts and relations. Conclusions While interannotator agreement in the practice phase confirms that conceptual annotation is a challenging task, the increasing agreement in the main annotation phase points out that an acceptable level of agreement can be achieved in multiple iterations, by setting stricter guidelines and establishing semantic equivalence criteria. Mapping text to ontological concepts emerges as the main challenge in conceptual annotation. Annotating predications involving biomolecular entities and processes is
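
    For illustration only: interannotator agreement figures like those above are typically chance-corrected. A plain Cohen's kappa for two annotators can be computed as below; the label sequences are invented, and the study's own agreement measure may differ.

```python
# Cohen's kappa for two annotators over the same items (illustrative labels only).
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in set(labels_a) | set(labels_b)) / n**2
    return (observed - expected) / (1 - expected)

a = ["TREATS", "CAUSES", "TREATS", "NONE", "CAUSES", "TREATS"]
b = ["TREATS", "CAUSES", "NONE",   "NONE", "CAUSES", "CAUSES"]
print(round(cohens_kappa(a, b), 3))
```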

  1. Semi-automatic conversion of BioProp semantic annotation to PASBio annotation.

    PubMed

    Tsai, Richard Tzong-Han; Dai, Hong-Jie; Huang, Chi-Hsin; Hsu, Wen-Lian

    2008-12-12

    Semantic role labeling (SRL) is an important text analysis technique. In SRL, sentences are represented by one or more predicate-argument structures (PAS). Each PAS is composed of a predicate (verb) and several arguments (noun phrases, adverbial phrases, etc.) with different semantic roles, including main arguments (agent or patient) as well as adjunct arguments (time, manner, or location). PropBank is the most widely used PAS corpus and annotation format in the newswire domain. In the biomedical field, however, more detailed and restrictive PAS annotation formats such as PASBio are popular. Unfortunately, due to the lack of an annotated PASBio corpus, no publicly available machine-learning (ML) based SRL systems based on PASBio have been developed. In previous work, we constructed a biomedical corpus based on the PropBank standard called BioProp, on which we developed an ML-based SRL system, BIOSMILE. In this paper, we aim to build a system to convert BIOSMILE's BioProp annotation output to PASBio annotation. Our system consists of BIOSMILE in combination with a BioProp-PASBio rule-based converter, and an additional semi-automatic rule generator. Our first experiment evaluated our rule-based converter's performance independently of BIOSMILE's performance. The converter achieved an F-score of 85.29%. The second experiment evaluated the combined system (BIOSMILE + rule-based converter). The system achieved an F-score of 69.08% for PASBio's 29 verbs. Our approach allows PAS conversion between BioProp and PASBio annotation using BIOSMILE alongside our newly developed semi-automatic rule generator and rule-based converter. Our system can match the performance of other state-of-the-art domain-specific ML-based SRL systems and can be easily customized for PASBio application development.

  2. Semi-automatic conversion of BioProp semantic annotation to PASBio annotation

    PubMed Central

    Tsai, Richard Tzong-Han; Dai, Hong-Jie; Huang, Chi-Hsin; Hsu, Wen-Lian

    2008-01-01

    Background Semantic role labeling (SRL) is an important text analysis technique. In SRL, sentences are represented by one or more predicate-argument structures (PAS). Each PAS is composed of a predicate (verb) and several arguments (noun phrases, adverbial phrases, etc.) with different semantic roles, including main arguments (agent or patient) as well as adjunct arguments (time, manner, or location). PropBank is the most widely used PAS corpus and annotation format in the newswire domain. In the biomedical field, however, more detailed and restrictive PAS annotation formats such as PASBio are popular. Unfortunately, due to the lack of an annotated PASBio corpus, no publicly available machine-learning (ML) based SRL systems based on PASBio have been developed. In previous work, we constructed a biomedical corpus based on the PropBank standard called BioProp, on which we developed an ML-based SRL system, BIOSMILE. In this paper, we aim to build a system to convert BIOSMILE's BioProp annotation output to PASBio annotation. Our system consists of BIOSMILE in combination with a BioProp-PASBio rule-based converter, and an additional semi-automatic rule generator. Results Our first experiment evaluated our rule-based converter's performance independently of BIOSMILE's performance. The converter achieved an F-score of 85.29%. The second experiment evaluated the combined system (BIOSMILE + rule-based converter). The system achieved an F-score of 69.08% for PASBio's 29 verbs. Conclusion Our approach allows PAS conversion between BioProp and PASBio annotation using BIOSMILE alongside our newly developed semi-automatic rule generator and rule-based converter. Our system can match the performance of other state-of-the-art domain-specific ML-based SRL systems and can be easily customized for PASBio application development. PMID:19091017

  3. Towards automated biomedical ontology harmonization.

    PubMed

    Uribe, Gustavo A; Lopez, Diego M; Blobel, Bernd

    2014-01-01

    The use of biomedical ontologies is increasing, especially in the context of health systems interoperability. Ontologies are key to understanding the semantics of the information exchanged. However, given the diversity of biomedical ontologies, it is essential to develop tools that support harmonization processes among them. Several algorithms and tools have been proposed by computer scientists to partially support ontology harmonization. However, these tools face several problems, especially in the biomedical domain, where ontologies are large and complex. In the harmonization process, matching is a basic task. This paper explains the different ontology harmonization processes, analyzes existing matching tools, and proposes a prototype of an ontology harmonization service. The results demonstrate that there are many open issues in the field of biomedical ontology harmonization, such as: overcoming structural discrepancies between ontologies; the lack of semantic algorithms to automate the process; the low matching efficiency of existing algorithms; and the use of domain and top-level ontologies in the matching process.
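
    For illustration only: a basic lexical matcher, one small component of the harmonization process discussed here, can be sketched as token-set Jaccard similarity between class labels. The labels, identifiers, and threshold below are invented; real matchers also exploit structure and semantics.

```python
# Toy lexical matcher between two ontologies' class labels (illustrative only).
# Class labels and identifiers are invented; real matchers add structure and semantics.

def jaccard(label_a, label_b):
    a, b = set(label_a.lower().split()), set(label_b.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0

onto_a = {"A:001": "myocardial infarction", "A:002": "heart failure"}
onto_b = {"B:101": "infarction of myocardium", "B:102": "cardiac failure"}

THRESHOLD = 0.2
for id_a, label_a in onto_a.items():
    best_id, best_label = max(onto_b.items(), key=lambda kv: jaccard(label_a, kv[1]))
    score = jaccard(label_a, best_label)
    if score >= THRESHOLD:
        print(f"{id_a} ({label_a})  <->  {best_id} ({best_label})  score={score:.2f}")
```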

  4. Web-based Video Annotation and its Applications

    NASA Astrophysics Data System (ADS)

    Yamamoto, Daisuke; Nagao, Katashi

    In this paper, we present a Web-based video annotation system named iVAS (intelligent Video Annotation Server). Audiences can associate any video content on the Internet with annotations. The system analyzes video content to acquire cut/shot information and color histograms, and it automatically generates a Web page for editing annotations. Audiences can then create annotation data by two methods. The first helps users create text data such as person/object names, scene descriptions, and comments interactively. The second allows users to associate any video fragment with their subjective impressions by just clicking a mouse button. The generated annotation data are accumulated and managed by an XML database connected to iVAS. We also developed several application systems based on annotations, such as video retrieval, video simplification, and video-content-based community support. One of the major advantages of our approach is the easy integration of hand-coded and automatically generated annotations (such as color histograms and cut/shot information). Additionally, since our annotation system is open to the public, we must consider the reliability and correctness of annotation data. We therefore also developed an automatic method for evaluating annotation reliability using users' feedback. In the future, these fundamental technologies will contribute to the formation of new communities centered around video content.
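
    For illustration only: automatic cut detection of the kind mentioned above commonly compares color or intensity histograms of consecutive frames. The sketch below does this over synthetic frames with NumPy and a made-up threshold; it is not iVAS's analysis code.

```python
# Sketch of histogram-based cut detection between consecutive video frames.
# Frames are synthetic arrays; the threshold is a made-up illustrative value.
import numpy as np

def gray_histogram(frame, bins=32):
    hist, _ = np.histogram(frame, bins=bins, range=(0, 256))
    return hist / hist.sum()

def detect_cuts(frames, threshold=0.5):
    """Flag a cut where the L1 distance between consecutive histograms is large."""
    hists = [gray_histogram(f) for f in frames]
    return [i for i in range(1, len(hists))
            if np.abs(hists[i] - hists[i - 1]).sum() > threshold]

rng = np.random.default_rng(0)
dark = [rng.integers(0, 60, (120, 160)) for _ in range(3)]       # one "shot"
bright = [rng.integers(180, 255, (120, 160)) for _ in range(3)]  # another "shot"
print(detect_cuts(dark + bright))   # expected: a cut at the boundary frame
```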

  5. Annotated Humanities Programs.

    ERIC Educational Resources Information Center

    Adler, Richard R.; Applebee, Arthur

    The humanities programs offered in 1968 by 227 United States secondary schools are listed alphabetically by state, including almost 100 new programs not annotated in the 1967 listing (see TE 000 224). Each annotation presents a brief description of the approach to study used in the particular humanities course (e.g., American Studies, Culture…

  6. SEED Software Annotations.

    ERIC Educational Resources Information Center

    Bethke, Dee; And Others

    This document provides a composite index of the first five sets of software annotations produced by Project SEED. The software has been indexed by title, subject area, and grade level, and it covers sets of annotations distributed in September 1986, April 1987, September 1987, November 1987, and February 1988. The date column in the index…

  7. Sharing Annotated Audio Recordings of Clinic Visits With Patients—Development of the Open Recording Automated Logging System (ORALS): Study Protocol

    PubMed Central

    Dannenberg, Michelle D; Ganoe, Craig H; Haslett, William; Faill, Rebecca; Hassanpour, Saeed; Das, Amar; Arend, Roger; Masel, Meredith C; Piper, Sheryl; Reicher, Haley; Ryan, James; Elwyn, Glyn

    2017-01-01

    Background Providing patients with recordings of their clinic visits enhances patient and family engagement, yet few organizations routinely offer recordings. Challenges exist for organizations and patients, including data safety and navigating lengthy recordings. A secure system that allows patients to easily navigate recordings may be a solution. Objective The aim of this project is to develop and test an interoperable system to facilitate routine recording, the Open Recording Automated Logging System (ORALS), with the aim of increasing patient and family engagement. ORALS will consist of (1) technically proficient software using automated machine learning technology to enable accurate and automatic tagging of in-clinic audio recordings (tagging involves identifying elements of the clinic visit most important to patients [eg, treatment plan] on the recording) and (2) a secure, easy-to-use Web interface enabling the upload and accurate linkage of recordings to patients, which can be accessed at home. Methods We will use a mixed methods approach to develop and formatively test ORALS in 4 iterative stages: case study of pioneer clinics where recordings are currently offered to patients, ORALS design and user experience testing, ORALS software and user interface development, and rapid cycle testing of ORALS in a primary care clinic, assessing impact on patient and family engagement. Dartmouth’s Informatics Collaboratory for Design, Development and Dissemination team, patients, patient partners, caregivers, and clinicians will assist in developing ORALS. Results We will implement a publication plan that includes a final project report and articles for peer-reviewed journals. In addition to this work, we will regularly report on our progress using popular relevant Tweet chats and online using our website, www.openrecordings.org. We will disseminate our work at relevant conferences (eg, Academy Health, Health Datapalooza, and the Institute for Healthcare Improvement

  8. Text-mining assisted regulatory annotation

    PubMed Central

    Aerts, Stein; Haeussler, Maximilian; van Vooren, Steven; Griffith, Obi L; Hulpiau, Paco; Jones, Steven JM; Montgomery, Stephen B; Bergman, Casey M

    2008-01-01

    Background Decoding transcriptional regulatory networks and the genomic cis-regulatory logic implemented in their control nodes is a fundamental challenge in genome biology. High-throughput computational and experimental analyses of regulatory networks and sequences rely heavily on positive control data from prior small-scale experiments, but the vast majority of previously discovered regulatory data remains locked in the biomedical literature. Results We develop text-mining strategies to identify relevant publications and extract sequence information to assist the regulatory annotation process. Using a vector space model to identify Medline abstracts from papers likely to have high cis-regulatory content, we demonstrate that document relevance ranking can assist the curation of transcriptional regulatory networks and estimate that, minimally, 30,000 papers harbor unannotated cis-regulatory data. In addition, we show that DNA sequences can be extracted from primary text with high cis-regulatory content and mapped to genome sequences as a means of identifying the location, organism and target gene information that is critical to the cis-regulatory annotation process. Conclusion Our results demonstrate that text-mining technologies can be successfully integrated with genome annotation systems, thereby increasing the availability of annotated cis-regulatory data needed to catalyze advances in the field of gene regulation. PMID:18271954
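
    A sketch of the kind of vector-space relevance ranking described above, assuming scikit-learn is available; the example abstracts and the cis-regulatory seed query are invented:

        # Sketch of vector-space relevance ranking for abstracts, in the spirit of the
        # document-ranking step described above. Uses scikit-learn's TF-IDF vectorizer;
        # the example abstracts and the "cis-regulatory" seed query are illustrative only.
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.metrics.pairwise import cosine_similarity

        abstracts = [
            "We characterized an enhancer upstream of the even-skipped gene.",
            "A transcription factor binding site was mapped in the promoter region.",
            "We report the crystal structure of a membrane transporter.",
        ]
        query = "enhancer promoter transcription factor binding site cis-regulatory"

        vectorizer = TfidfVectorizer(stop_words="english")
        doc_vectors = vectorizer.fit_transform(abstracts)
        query_vector = vectorizer.transform([query])

        scores = cosine_similarity(query_vector, doc_vectors).ravel()
        for score, text in sorted(zip(scores, abstracts), reverse=True):
            print(f"{score:.3f}  {text}")   # highest-scoring abstracts first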

  9. Automated annotation removal in agar plates.

    PubMed

    Vera, Sergio; Perez, Frederic; Lara, Laura; Ceresa, Mario; Carranza, Noemi; Herrero Jover, Javier; Gonzalez Ballester, Miguel A

    2013-01-01

    Agar plates are widely used in the biomedical field as a medium in which to artificially grow bacteria, algae or fungi. Agar plates (Petri dishes) are used routinely in microbiology laboratories to identify the type of micro-organism responsible for infections. Such diagnoses are based on counting the number and type of bacterial colonies growing in the Petri dish. Counting bacterial colonies is a time-consuming task prone to human error, so interest in automated counting systems has increased in recent years. One of the difficulties of automating the counting process is the presence of markers and annotations made on the lower part of the agar plate. Efficient removal of such markers can increase the accuracy of the bacterial counting system. This article introduces a fast method for the detection, segmentation and removal of annotations in agar plates that improves the results of existing bacterial colony counting algorithms.

  10. The National Center for Biomedical Ontology

    PubMed Central

    Noy, Natalya F; Shah, Nigam H; Whetzel, Patricia L; Chute, Christopher G; Story, Margaret-Anne; Smith, Barry

    2011-01-01

    The National Center for Biomedical Ontology is now in its seventh year. The goals of this National Center for Biomedical Computing are to: create and maintain a repository of biomedical ontologies and terminologies; build tools and web services to enable the use of ontologies and terminologies in clinical and translational research; educate their trainees and the scientific community broadly about biomedical ontology and ontology-based technology and best practices; and collaborate with a variety of groups who develop and use ontologies and terminologies in biomedicine. The centerpiece of the National Center for Biomedical Ontology is a web-based resource known as BioPortal. BioPortal makes available for research in computationally useful forms more than 270 of the world's biomedical ontologies and terminologies, and supports a wide range of web services that enable investigators to use the ontologies to annotate and retrieve data, to generate value sets and special-purpose lexicons, and to perform advanced analytics on a wide range of biomedical data. PMID:22081220
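
    A sketch of calling an annotation web service of the kind BioPortal exposes; the endpoint URL, parameter names and response fields below are assumptions that should be checked against the current BioPortal API documentation, and a real API key is required:

        # Illustrative call to a BioPortal-style annotation web service. The endpoint,
        # parameter names, and response fields below are assumptions; consult the
        # current BioPortal API documentation and supply a real API key.
        import requests

        API_URL = "https://data.bioontology.org/annotator"  # assumed endpoint
        API_KEY = "YOUR_API_KEY"                             # placeholder

        def annotate(text):
            response = requests.get(
                API_URL,
                params={"text": text, "apikey": API_KEY},
                timeout=30,
            )
            response.raise_for_status()
            return response.json()

        if __name__ == "__main__":
            for hit in annotate("Melanoma is a malignant tumor of melanocytes."):
                # Field names are assumptions about the JSON structure.
                print(hit.get("annotatedClass", {}).get("@id"), hit.get("annotations"))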

  11. The National Center for Biomedical Ontology.

    PubMed

    Musen, Mark A; Noy, Natalya F; Shah, Nigam H; Whetzel, Patricia L; Chute, Christopher G; Story, Margaret-Anne; Smith, Barry

    2012-01-01

    The National Center for Biomedical Ontology is now in its seventh year. The goals of this National Center for Biomedical Computing are to: create and maintain a repository of biomedical ontologies and terminologies; build tools and web services to enable the use of ontologies and terminologies in clinical and translational research; educate their trainees and the scientific community broadly about biomedical ontology and ontology-based technology and best practices; and collaborate with a variety of groups who develop and use ontologies and terminologies in biomedicine. The centerpiece of the National Center for Biomedical Ontology is a web-based resource known as BioPortal. BioPortal makes available for research in computationally useful forms more than 270 of the world's biomedical ontologies and terminologies, and supports a wide range of web services that enable investigators to use the ontologies to annotate and retrieve data, to generate value sets and special-purpose lexicons, and to perform advanced analytics on a wide range of biomedical data.

  12. Case Studies in Reading: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Trela, Thaddeus M., Comp.; Becker, George J., Comp.

    Descriptions of individual diagnosis and remediation of reading problems experienced by students at all levels are included in this annotated bibliography. Included are books, texts having case study sections, and journal reports which together comprise useful sources of case studies of reading disabilities. An opening section lists nine "first…

  13. Genome Annotation Transfer Utility (GATU): rapid annotation of viral genomes using a closely related reference genome.

    PubMed

    Tcherepanov, Vasily; Ehlers, Angelika; Upton, Chris

    2006-06-13

    Since DNA sequencing has become easier and cheaper, an increasing number of closely related viral genomes have been sequenced. However, many of these have been deposited in GenBank without annotations, severely limiting their value to researchers. While maintaining comprehensive genomic databases for a set of virus families at the Viral Bioinformatics Resource Center http://www.biovirus.org and Viral Bioinformatics - Canada http://www.virology.ca, we found that researchers were unnecessarily spending time annotating viral genomes that were close relatives of already annotated viruses. We have therefore designed and implemented a novel tool, Genome Annotation Transfer Utility (GATU), to transfer annotations from a previously annotated reference genome to a new target genome, thereby greatly reducing this laborious task. GATU transfers annotations from a reference genome to a closely related target genome, while still giving the user final control over which annotations should be included. GATU also detects open reading frames present in the target but not the reference genome and provides the user with a variety of bioinformatics tools to quickly determine if these ORFs should also be included in the annotation. After this process is complete, GATU saves the newly annotated genome as a GenBank, EMBL or XML-format file. The software is coded in Java and runs on a variety of computer platforms. Its user-friendly Graphical User Interface is specifically designed for users trained in the biological sciences. GATU greatly simplifies the initial stages of genome annotation by using a closely related genome as a reference. It is not intended to be a gene prediction tool or a "complete" annotation system, but we have found that it significantly reduces the time required for annotation of genes and mature peptides as well as helping to standardize gene names between related organisms by transferring reference genome annotations to the target genome. The program is freely

  14. An analysis on the entity annotations in biological corpora.

    PubMed

    Neves, Mariana

    2014-01-01

    Collections of documents annotated with semantic entities and relationships are crucial resources to support the development and evaluation of text mining solutions for the biomedical domain. Here I present an overview of 36 corpora and an analysis of the semantic annotations they contain. Annotations for entity types were classified into six semantic groups, and an overview of the semantic entities that can be found in each corpus is shown. Results show that while some semantic entities, such as genes, proteins and chemicals, are consistently annotated in many collections, corpora available for diseases, variations and mutations are still few, in spite of their importance in the biological domain.

  15. Improving genome annotations using phylogenetic profile anomaly detection.

    PubMed

    Mikkelsen, Tarjei S; Galagan, James E; Mesirov, Jill P

    2005-02-15

    A promising strategy for refining genome annotations is to detect features that conflict with known functional or evolutionary relationships between groups of genes. Previous work in this area has been focused on investigating the absence of 'housekeeping' genes or components of well-studied pathways. We have sought to develop a method for improving new annotations that can automatically synthesize and use the information available in a database of other annotated genomes. We show that a probabilistic model of phylogenetic profiles, trained from a database of curated genome annotations, can be used to reliably detect errors in new annotations. We use our method to identify 22 genes that were missed in previously published annotations of prokaryotic genomes. The method was evaluated using MATLAB and open source software referenced in this work. Scripts and datasets are available from the authors upon request. tarjei@broad.mit.edu.
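
    A toy illustration of the underlying idea, assuming simple binary presence/absence profiles and a naive disagreement score; the paper's actual probabilistic model is not reproduced here and all profiles are invented:

        # Toy illustration of phylogenetic-profile anomaly detection: each gene is a
        # binary presence/absence vector across reference genomes, and a candidate
        # annotation is flagged when its profile disagrees strongly with the average
        # profile of the functional group it was assigned to.
        import numpy as np

        # rows = genes already assigned to one functional group, columns = genomes
        group_profiles = np.array([
            [1, 1, 1, 0, 1, 1],
            [1, 1, 1, 0, 1, 0],
            [1, 1, 1, 0, 1, 1],
        ])
        candidate = np.array([0, 1, 0, 1, 0, 0])  # newly annotated gene

        consensus = group_profiles.mean(axis=0)
        disagreement = np.abs(candidate - consensus).mean()
        print(f"disagreement score: {disagreement:.2f}")
        if disagreement > 0.5:  # arbitrary threshold for the toy example
            print("profile is anomalous for this group -> annotation worth re-checking")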

  16. National Space Biomedical Research Institute

    NASA Technical Reports Server (NTRS)

    1998-01-01

    The National Space Biomedical Research Institute (NSBRI) sponsors and performs fundamental and applied space biomedical research with the mission of leading a world-class, national effort in integrated, critical path space biomedical research that supports NASA's Human Exploration and Development of Space (HEDS) Strategic Plan. It focuses on the enabling of long-term human presence in, development of, and exploration of space. This will be accomplished by: designing, implementing, and validating effective countermeasures to address the biological and environmental impediments to long-term human space flight; defining the molecular, cellular, organ-level, integrated responses and mechanistic relationships that ultimately determine these impediments, where such activity fosters the development of novel countermeasures; establishing biomedical support technologies to maximize human performance in space, reduce biomedical hazards to an acceptable level, and deliver quality medical care; transferring and disseminating the biomedical advances in knowledge and technology acquired through living and working in space to the benefit of mankind in space and on Earth, including the treatment of patients suffering from gravity- and radiation-related conditions on Earth; and ensuring open involvement of the scientific community, industry, and the public at large in the Institute's activities and fostering a robust collaboration with NASA, particularly through Johnson Space Center.

  17. An annotated energy bibliography

    NASA Technical Reports Server (NTRS)

    Blow, S. J.

    1979-01-01

    A comprehensive annotated compilation of books, journals, periodicals, and reports on energy and energy-related topics, containing approximately 10,000 technical and nontechnical references from bibliographic and other sources dated January 1975 through May 1977.

  18. An annotated energy bibliography

    NASA Technical Reports Server (NTRS)

    Blow, S. J.

    1979-01-01

    A comprehensive annotated compilation of books, journals, periodicals, and reports on energy and energy-related topics, containing approximately 10,000 technical and nontechnical references from bibliographic and other sources dated January 1975 through May 1977.

  19. Annotation and visualization of endogenous retroviral sequences using the Distributed Annotation System (DAS) and eBioX

    PubMed Central

    Martínez Barrio, Álvaro; Lagercrantz, Erik; Sperber, Göran O; Blomberg, Jonas; Bongcam-Rudloff, Erik

    2009-01-01

    Background The Distributed Annotation System (DAS) is a widely used network protocol for sharing biological information. The distributed aspects of the protocol enable the use of various reference and annotation servers for connecting biological sequence data to pertinent annotations in order to depict an integrated view of the data for the final user. Results An annotation server has been devised to provide information about the endogenous retroviruses detected and annotated by a specialized in silico tool called RetroTector. We describe the procedure for implementing the DAS 1.5 protocol commands necessary for constructing the DAS annotation server, using our server to exemplify those steps. Data distribution is kept separate from visualization, which is carried out by eBioX, an easy-to-use open source program incorporating multiple bioinformatics utilities. Some well-characterized endogenous retroviruses are shown in two different DAS clients. A rapid analysis of areas free from retroviral insertions could be facilitated by our annotations. Conclusion The DAS protocol has been shown to be advantageous in the distribution of endogenous retrovirus data. The distributed nature of the protocol also aids in combining annotation and visualization along a genome in order to enhance the understanding of the ERV contribution to its evolution. Reference and annotation servers are used conjointly by eBioX to provide visualization of ERV annotations as well as other data sources. Our DAS data source can be found in the central public DAS service repository. PMID:19534743
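
    A sketch of querying a DAS annotation server for features on a sequence segment; the server URL and data-source name are placeholders, and the command and element names follow the DAS 1.5 "features" command and DASGFF response format but should be verified against the server actually used:

        # Sketch of querying a DAS 1.5 annotation server for features on a sequence
        # segment and reading the XML response. The server URL and data-source name
        # are placeholders for illustration.
        import requests
        import xml.etree.ElementTree as ET

        SERVER = "http://example.org/das"   # placeholder DAS server
        SOURCE = "hg_erv_annotations"       # placeholder data source name

        def fetch_features(segment):
            url = f"{SERVER}/{SOURCE}/features"
            response = requests.get(url, params={"segment": segment}, timeout=30)
            response.raise_for_status()
            root = ET.fromstring(response.content)
            for feature in root.iter("FEATURE"):
                ftype = feature.find("TYPE")
                start = feature.findtext("START")
                end = feature.findtext("END")
                yield feature.get("id"), ftype.text if ftype is not None else None, start, end

        if __name__ == "__main__":
            for fid, ftype, start, end in fetch_features("1:1000000,1100000"):
                print(fid, ftype, start, end)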

  20. An Introduction to Genome Annotation.

    PubMed

    Campbell, Michael S; Yandell, Mark

    2015-12-17

    Genome projects have evolved from large international undertakings to tractable endeavors for a single lab. Accurate genome annotation is critical for successful genomic, genetic, and molecular biology experiments. These annotations can be generated using a number of approaches and available software tools. This unit describes methods for genome annotation and a number of software tools commonly used in gene annotation.

  1. VariantAnnotation: a Bioconductor package for exploration and annotation of genetic variants

    PubMed Central

    Obenchain, Valerie; Lawrence, Michael; Carey, Vincent; Gogarten, Stephanie; Shannon, Paul; Morgan, Martin

    2014-01-01

    Summary: VariantAnnotation is an R / Bioconductor package for the exploration and annotation of genetic variants. Capabilities exist for reading, writing and filtering variant call format (VCF) files. VariantAnnotation allows ready access to additional R / Bioconductor facilities for advanced statistical analysis, data transformation, visualization and integration with diverse genomic resources. Availability and implementation: This package is implemented in R and available for download at the Bioconductor Web site (http://bioconductor.org/packages/2.13/bioc/html/VariantAnnotation.html). The package contains extensive help pages for individual functions and a ‘vignette’ outlining typical work flows; it is made available under the open source ‘Artistic-2.0’ license. Version 1.9.38 was used in this article. Contact: vobencha@fhcrc.org PMID:24681907

  2. Power-law-like distributions in biomedical publications and research funding.

    PubMed

    Su, Andrew I; Hogenesch, John B

    2007-01-01

    Gene annotation, as measured by links to the biomedical literature and funded grants, is governed by a power law, indicating that researchers favor the extensive study of relatively few genes. This emphasizes the need for data-driven science to accomplish genome-wide gene annotation.

  3. Dizeez: an online game for human gene-disease annotation.

    PubMed

    Loguercio, Salvatore; Good, Benjamin M; Su, Andrew I

    2013-01-01

    Structured gene annotations are a foundation upon which many bioinformatics and statistical analyses are built. However, the structured annotations available in public databases are a sparse representation of biological knowledge as a whole. The rate of biomedical data generation is such that centralized biocuration efforts struggle to keep up. New models for gene annotation need to be explored that expand the pace at which we are able to structure biomedical knowledge. Recently, online games have emerged as an effective way to recruit, engage and organize large numbers of volunteers to help address difficult biological challenges. For example, games have been successfully developed for protein folding (Foldit), multiple sequence alignment (Phylo) and RNA structure design (EteRNA). Here we present Dizeez, a simple online game built with the purpose of structuring knowledge of gene-disease associations. Preliminary results from game play online and at scientific conferences suggest that Dizeez is producing valid gene-disease annotations not yet present in any public database. These early results provide a basic proof of principle that online games can be successfully applied to the challenge of gene annotation. Dizeez is available at http://genegames.org.

  4. Learning pathology using collaborative vs. individual annotation of whole slide images: a mixed methods trial.

    PubMed

    Sahota, Michael; Leung, Betty; Dowdell, Stephanie; Velan, Gary M

    2016-12-12

    Students in biomedical disciplines require understanding of normal and abnormal microscopic appearances of human tissues (histology and histopathology). For this purpose, practical classes in these disciplines typically use virtual microscopy, viewing digitised whole slide images in web browsers. To enhance engagement, tools have been developed to enable individual or collaborative annotation of whole slide images within web browsers. To date, there have been no studies that have critically compared the impact on learning of individual and collaborative annotations on whole slide images. Junior and senior students engaged in Pathology practical classes within Medical Science and Medicine programs participated in cross-over trials of individual and collaborative annotation activities. Students' understanding of microscopic morphology was compared using timed online quizzes, while students' perceptions of learning were evaluated using an online questionnaire. For senior medical students, collaborative annotation of whole slide images was superior for understanding key microscopic features when compared to individual annotation; whilst being at least equivalent to individual annotation for junior medical science students. Across cohorts, students agreed that the annotation activities provided a user-friendly learning environment that met their flexible learning needs, improved efficiency, provided useful feedback, and helped them to set learning priorities. Importantly, these activities were also perceived to enhance motivation and improve understanding. Collaborative annotation improves understanding of microscopic morphology for students with sufficient background understanding of the discipline. These findings have implications for the deployment of annotation activities in biomedical curricula, and potentially for postgraduate training in Anatomical Pathology.

  5. Semantic Annotation of Mutable Data

    PubMed Central

    Morris, Robert A.; Dou, Lei; Hanken, James; Kelly, Maureen; Lowery, David B.; Ludäscher, Bertram; Macklin, James A.; Morris, Paul J.

    2013-01-01

    Electronic annotation of scientific data is very similar to annotation of documents. Both types of annotation amplify the original object, add related knowledge to it, and dispute or support assertions in it. In each case, annotation is a framework for discourse about the original object, and, in each case, an annotation needs to clearly identify its scope and its own terminology. However, electronic annotation of data differs from annotation of documents: the content of the annotations, including expectations and supporting evidence, is more often shared among members of networks. Any consequent actions taken by the holders of the annotated data could be shared as well. But even those current annotation systems that admit data as their subject often make it difficult or impossible to annotate at fine-enough granularity to use the results in this way for data quality control. We address these kinds of issues by offering simple extensions to an existing annotation ontology and describe how the results support an interest-based distribution of annotations. We are using the result to design and deploy a platform that supports annotation services overlaid on networks of distributed data, with particular application to data quality control. Our initial instance supports a set of natural science collection metadata services. An important application is the support for data quality control and provision of missing data. A previous proof of concept demonstrated such use based on data annotations modeled with XML-Schema. PMID:24223697

  6. Integrating image data into biomedical text categorization.

    PubMed

    Shatkay, Hagit; Chen, Nawei; Blostein, Dorothea

    2006-07-15

    Categorization of biomedical articles is a central task for supporting various curation efforts. It can also form the basis for effective biomedical text mining. Automatic text classification in the biomedical domain is thus an active research area. Contests organized by the KDD Cup (2002) and the TREC Genomics track (since 2003) defined several annotation tasks that involved document classification, and provided training and test data sets. So far, these efforts focused on analyzing only the text content of documents. However, as was noted in the KDD'02 text mining contest, where figure captions proved to be an invaluable feature for identifying documents of interest, images often provide curators with critical information. We examine the possibility of using information derived directly from image data, and of integrating it with text-based classification, for biomedical document categorization. We present a method for obtaining features from images and for using them, both alone and in combination with text, to perform the triage task introduced in the TREC Genomics track 2004. The task was to determine which documents are relevant to a given annotation task performed by the Mouse Genome Database curators. We show preliminary results, demonstrating that the method has a strong potential to enhance and complement traditional text-based categorization methods.

  7. Automated extraction and semantic analysis of mutation impacts from the biomedical literature.

    PubMed

    Naderi, Nona; Witte, René

    2012-06-18

    Mutations as sources of evolution have long been the focus of attention in the biomedical literature. Accessing the mutational information and their impacts on protein properties facilitates research in various domains, such as enzymology and pharmacology. However, manually curating the rich and fast growing repository of biomedical literature is expensive and time-consuming. As a solution, text mining approaches have increasingly been deployed in the biomedical domain. While the detection of single-point mutations is well covered by existing systems, challenges still exist in grounding impacts to their respective mutations and recognizing the affected protein properties, in particular kinetic and stability properties together with physical quantities. We present an ontology model for mutation impacts, together with a comprehensive text mining system for extracting and analysing mutation impact information from full-text articles. Organisms, as sources of proteins, are extracted to help disambiguation of genes and proteins. Our system then detects mutation series to correctly ground detected impacts using novel heuristics. It also extracts the affected protein properties, in particular kinetic and stability properties, as well as the magnitude of the effects and validates these relations against the domain ontology. The output of our system can be provided in various formats, in particular by populating an OWL-DL ontology, which can then be queried to provide structured information. The performance of the system is evaluated on our manually annotated corpora. In the impact detection task, our system achieves a precision of 70.4%-71.1%, a recall of 71.3%-71.5%, and grounds the detected impacts with an accuracy of 76.5%-77%. The developed system, including resources, evaluation data and end-user and developer documentation is freely available under an open source license at http://www.semanticsoftware.info/open-mutation-miner. We present Open Mutation Miner (OMM

  8. An integrated computational pipeline and database to support whole-genome sequence annotation.

    PubMed

    Mungall, C J; Misra, S; Berman, B P; Carlson, J; Frise, E; Harris, N; Marshall, B; Shu, S; Kaminker, J S; Prochnik, S E; Smith, C D; Smith, E; Tupy, J L; Wiel, C; Rubin, G M; Lewis, S E

    2002-01-01

    We describe here our experience in annotating the Drosophila melanogaster genome sequence, in the course of which we developed several new open-source software tools and a database schema to support large-scale genome annotation. We have developed these into an integrated and reusable software system for whole-genome annotation. The key contributions to overall annotation quality are the marshalling of high-quality sequences for alignments and the design of a system with an adaptable and expandable flexible architecture.

  9. Biomedical ultrasonoscope

    NASA Technical Reports Server (NTRS)

    Lee, R. D. (Inventor)

    1979-01-01

    The combination of a "C" mode scan electronics in a portable, battery powered biomedical ultrasonoscope having "A" and "M" mode scan electronics, the latter including a clock generator for generating clock pulses, a cathode ray tube having X, Y and Z axis inputs, a sweep generator connected between the clock generator and the X axis input of the cathode ray tube for generating a cathode ray sweep signal synchronized by the clock pulses, and a receiver adapted to be connected to the Z axis input of the cathode ray tube. The "C" mode scan electronics comprises a plurality of transducer elements arranged in a row and adapted to be positioned on the skin of the patient's body for converting a pulsed electrical signal to a pulsed ultrasonic signal, radiating the ultrasonic signal into the patient's body, picking up the echoes reflected from interfaces in the patient's body and converting the echoes to electrical signals; a plurality of transmitters, each transmitter being coupled to a respective transducer for transmitting a pulsed electrical signal thereto and for transmitting the converted electrical echo signals directly to the receiver, a sequencer connected between the clock generator and the plurality of transmitters and responsive to the clock pulses for firing the transmitters in cyclic order; and a staircase voltage generator connected between the clock generator and the Y axis input of the cathode ray tube for generating a staircase voltage having steps synchronized by the clock pulses.

  10. Wide coverage biomedical event extraction using multiple partially overlapping corpora

    PubMed Central

    2013-01-01

    Background Biomedical events are key to understanding physiological processes and disease, and wide coverage extraction is required for comprehensive automatic analysis of statements describing biomedical systems in the literature. In turn, the training and evaluation of extraction methods requires manually annotated corpora. However, as manual annotation is time-consuming and expensive, any single event-annotated corpus can only cover a limited number of semantic types. Although combined use of several such corpora could potentially allow an extraction system to achieve broad semantic coverage, there has been little research into learning from multiple corpora with partially overlapping semantic annotation scopes. Results We propose a method for learning from multiple corpora with partial semantic annotation overlap, and implement this method to improve our existing event extraction system, EventMine. An evaluation using seven event annotated corpora, including 65 event types in total, shows that learning from overlapping corpora can produce a single, corpus-independent, wide coverage extraction system that outperforms systems trained on single corpora and exceeds previously reported results on two established event extraction tasks from the BioNLP Shared Task 2011. Conclusions The proposed method allows the training of a wide-coverage, state-of-the-art event extraction system from multiple corpora with partial semantic annotation overlap. The resulting single model makes broad-coverage extraction straightforward in practice by removing the need to either select a subset of compatible corpora or semantic types, or to merge results from several models trained on different individual corpora. Multi-corpus learning also allows annotation efforts to focus on covering additional semantic types, rather than aiming for exhaustive coverage in any single annotation effort, or extending the coverage of semantic types annotated in existing corpora. PMID:23731785
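
    A minimal sketch of the core idea of learning from corpora with partially overlapping annotation scopes: event types outside a corpus's scope are masked out of the loss rather than treated as negatives. This illustrates the general strategy only, not EventMine's actual learning procedure, and all numbers are invented:

        # Minimal illustration of training on corpora with partially overlapping
        # annotation scopes: event types outside a corpus's scope contribute nothing
        # to the loss instead of being treated as negative examples.
        import numpy as np

        def masked_log_loss(predictions, labels, in_scope):
            """predictions, labels, in_scope: arrays of shape (n_examples, n_types).
            in_scope[i, t] = 1 if the corpus of example i annotates event type t."""
            eps = 1e-9
            per_cell = -(labels * np.log(predictions + eps)
                         + (1 - labels) * np.log(1 - predictions + eps))
            return (per_cell * in_scope).sum() / in_scope.sum()

        # Two examples from a corpus that only annotates the first two event types.
        predictions = np.array([[0.9, 0.2, 0.7], [0.1, 0.8, 0.4]])
        labels      = np.array([[1,   0,   0  ], [0,   1,   0  ]])
        in_scope    = np.array([[1,   1,   0  ], [1,   1,   0  ]])  # third type masked
        print(f"masked loss: {masked_log_loss(predictions, labels, in_scope):.3f}")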

  11. Algal functional annotation tool

    SciTech Connect

    2012-07-12

    Abstract BACKGROUND: Progress in genome sequencing is proceeding at an exponential pace, and several new algal genomes are becoming available every year. One of the challenges facing the community is the association of protein sequences encoded in the genomes with biological function. While most genome assembly projects generate annotations for predicted protein sequences, they are usually limited and integrate functional terms from a limited number of databases. Another challenge is the use of annotations to interpret large lists of 'interesting' genes generated by genome-scale datasets. Previously, these gene lists had to be analyzed across several independent biological databases, often on a gene-by-gene basis. In contrast, several annotation databases, such as DAVID, integrate data from multiple functional databases and reveal underlying biological themes of large gene lists. While several such databases have been constructed for animals, none is currently available for the study of algae. Due to renewed interest in algae as potential sources of biofuels and the emergence of multiple algal genome sequences, a significant need has arisen for such a database to process the growing compendiums of algal genomic data. DESCRIPTION: The Algal Functional Annotation Tool is a web-based comprehensive analysis suite integrating annotation data from several pathway, ontology, and protein family databases. The current version provides annotation for the model alga Chlamydomonas reinhardtii, and in the future will include additional genomes. The site allows users to interpret large gene lists by identifying associated functional terms, and their enrichment. Additionally, expression data for several experimental conditions were compiled and analyzed to provide an expression-based enrichment search. A tool to search for functionally-related genes based on gene expression across these conditions is also provided. Other features include dynamic visualization of genes on KEGG pathway maps and batch gene identifier conversion.
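
    A generic sketch of the functional-term enrichment calculation that suites of this kind perform on user gene lists, using a one-sided hypergeometric test from SciPy; the gene counts are invented and this is not the tool's own implementation:

        # Generic functional-term enrichment test: a one-sided hypergeometric
        # p-value for over-representation of a term in a user gene list.
        from scipy.stats import hypergeom

        def enrichment_p_value(genome_size, genes_with_term, list_size, list_hits):
            """P(X >= list_hits) when drawing list_size genes from a genome in which
            genes_with_term genes carry the functional term."""
            return hypergeom.sf(list_hits - 1, genome_size, genes_with_term, list_size)

        # 17,000 annotated genes, 300 carry the term, a 150-gene list contains 12 of them.
        p = enrichment_p_value(genome_size=17000, genes_with_term=300, list_size=150, list_hits=12)
        print(f"enrichment p-value: {p:.2e}")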

  12. Human Genome Annotation

    NASA Astrophysics Data System (ADS)

    Gerstein, Mark

    A central problem for 21st century science is annotating the human genome and making this annotation useful for the interpretation of personal genomes. My talk will focus on annotating the 99% of the genome that does not code for canonical genes, concentrating on intergenic features such as structural variants (SVs), pseudogenes (protein fossils), binding sites, and novel transcribed RNAs (ncRNAs). In particular, I will describe how we identify regulatory sites and variable blocks (SVs) based on processing next-generation sequencing experiments. I will further explain how we cluster together groups of sites to create larger annotations. Next, I will discuss a comprehensive pseudogene identification pipeline, which has enabled us to identify >10K pseudogenes in the genome and analyze their distribution with respect to age, protein family, and chromosomal location. Throughout, I will try to introduce some of the computational algorithms and approaches that are required for genome annotation. Much of this work has been carried out in the framework of the ENCODE, modENCODE, and 1000 genomes projects.

  13. Evaluating Computational Gene Ontology Annotations.

    PubMed

    Škunca, Nives; Roberts, Richard J; Steffen, Martin

    2017-01-01

    Two avenues to understanding gene function are complementary and often overlapping: experimental work and computational prediction. While experimental annotation generally produces high-quality annotations, it is low throughput. Conversely, computational annotations have broad coverage, but the quality of annotations may be variable, and therefore evaluating the quality of computational annotations is a critical concern. In this chapter, we provide an overview of strategies to evaluate the quality of computational annotations. First, we discuss why evaluating quality in this setting is not trivial. We highlight the various issues that threaten to bias the evaluation of computational annotations, most of which stem from the incompleteness of biological databases. Second, we discuss solutions that address these issues, for example, targeted selection of new experimental annotations and leveraging the existing experimental annotations.

  14. Algal functional annotation tool

    SciTech Connect

    Lopez, D.; Casero, D.; Cokus, S. J.; Merchant, S. S.; Pellegrini, M.

    2012-07-01

    The Algal Functional Annotation Tool is a web-based comprehensive analysis suite integrating annotation data from several pathway, ontology, and protein family databases. The current version provides annotation for the model alga Chlamydomonas reinhardtii, and in the future will include additional genomes. The site allows users to interpret large gene lists by identifying associated functional terms, and their enrichment. Additionally, expression data for several experimental conditions were compiled and analyzed to provide an expression-based enrichment search. A tool to search for functionally-related genes based on gene expression across these conditions is also provided. Other features include dynamic visualization of genes on KEGG pathway maps and batch gene identifier conversion.

  15. Re-Annotator: Annotation Pipeline for Microarray Probe Sequences.

    PubMed

    Arloth, Janine; Bader, Daniel M; Röh, Simone; Altmann, Andre

    2015-01-01

    Microarray technologies are established approaches for high throughput gene expression, methylation and genotyping analysis. An accurate mapping of the array probes is essential to generate reliable biological findings. However, manufacturers of the microarray platforms typically provide incomplete and outdated annotation tables, which often rely on older genome and transcriptome versions that differ substantially from up-to-date sequence databases. Here, we present the Re-Annotator, a re-annotation pipeline for microarray probe sequences. It is primarily designed for gene expression microarrays but can also be adapted to other types of microarrays. The Re-Annotator uses a custom-built mRNA reference database to identify the positions of gene expression array probe sequences. We applied Re-Annotator to the Illumina Human-HT12 v4 microarray platform and found that about one quarter (25%) of the probes differed from the manufacturer's annotation. In further computational experiments on experimental gene expression data, we compared Re-Annotator to another probe re-annotation tool, ReMOAT, and found that Re-Annotator provided an improved re-annotation of microarray probes. A thorough re-annotation of probe information is crucial to any microarray analysis. The Re-Annotator pipeline is freely available at http://sourceforge.net/projects/reannotator along with re-annotated files for Illumina microarrays HumanHT-12 v3/v4 and MouseRef-8 v2.

  16. Injectors and Annotations

    NASA Technical Reports Server (NTRS)

    Filman, Robert E.

    2004-01-01

    In a previous paper, we presented the Object Infrastructure Framework. The goal of that system is to simplify the creation of distributed applications. The primary claim of that work is that non-functional 'ilities' can be achieved by controlling and manipulating the communications between components, thereby simplifying the development of distributed systems. A secondary element of that paper is to argue for extending the conventional distributed objects model in two important ways: 1) the ability to insert injectors (filters, wrappers) into the communication path between components; 2) the ability to annotate communications with additional information, and to propagate these annotations through an application. Here we expand upon the descriptions in that paper.

  17. Effective use of latent semantic indexing and computational linguistics in biological and biomedical applications.

    PubMed

    Chen, Hongyu; Martin, Bronwen; Daimon, Caitlin M; Maudsley, Stuart

    2013-01-01

    Text mining is rapidly becoming an essential technique for the annotation and analysis of large biological data sets. Biomedical literature currently increases at a rate of several thousand papers per week, making automated information retrieval methods the only feasible method of managing this expanding corpus. With the increasing prevalence of open-access journals and constant growth of publicly-available repositories of biomedical literature, literature mining has become much more effective with respect to the extraction of biomedically-relevant data. In recent years, text mining of popular databases such as MEDLINE has evolved from basic term-searches to more sophisticated natural language processing techniques, indexing and retrieval methods, structural analysis and integration of literature with associated metadata. In this review, we will focus on Latent Semantic Indexing (LSI), a computational linguistics technique increasingly used for a variety of biological purposes. It is noted for its ability to consistently outperform benchmark Boolean text searches and co-occurrence models at information retrieval and its power to extract indirect relationships within a data set. LSI has been used successfully to formulate new hypotheses, generate novel connections from existing data, and validate empirical data.
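
    A minimal LSI sketch, computing a truncated SVD of a TF-IDF term-document matrix with scikit-learn and comparing a query to documents in the reduced space; the toy corpus is invented:

        # Minimal latent semantic indexing (LSI) sketch: truncated SVD of a TF-IDF
        # term-document matrix, then query/document comparison in the reduced space.
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.decomposition import TruncatedSVD
        from sklearn.metrics.pairwise import cosine_similarity

        docs = [
            "insulin receptor signaling in hepatocytes",
            "glucose uptake and insulin resistance",
            "ribosome assembly and rRNA processing",
            "translation initiation factors and the ribosome",
        ]

        vectorizer = TfidfVectorizer(stop_words="english")
        tfidf = vectorizer.fit_transform(docs)

        lsi = TruncatedSVD(n_components=2, random_state=0)   # 2 latent dimensions
        doc_topics = lsi.fit_transform(tfidf)

        query_topics = lsi.transform(vectorizer.transform(["insulin and glucose metabolism"]))
        for doc, score in zip(docs, cosine_similarity(query_topics, doc_topics).ravel()):
            print(f"{score:+.3f}  {doc}")   # metabolism documents score higher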

  18. Effective use of latent semantic indexing and computational linguistics in biological and biomedical applications

    PubMed Central

    Chen, Hongyu; Martin, Bronwen; Daimon, Caitlin M.; Maudsley, Stuart

    2012-01-01

    Text mining is rapidly becoming an essential technique for the annotation and analysis of large biological data sets. Biomedical literature currently increases at a rate of several thousand papers per week, making automated information retrieval methods the only feasible method of managing this expanding corpus. With the increasing prevalence of open-access journals and constant growth of publicly-available repositories of biomedical literature, literature mining has become much more effective with respect to the extraction of biomedically-relevant data. In recent years, text mining of popular databases such as MEDLINE has evolved from basic term-searches to more sophisticated natural language processing techniques, indexing and retrieval methods, structural analysis and integration of literature with associated metadata. In this review, we will focus on Latent Semantic Indexing (LSI), a computational linguistics technique increasingly used for a variety of biological purposes. It is noted for its ability to consistently outperform benchmark Boolean text searches and co-occurrence models at information retrieval and its power to extract indirect relationships within a data set. LSI has been used successfully to formulate new hypotheses, generate novel connections from existing data, and validate empirical data. PMID:23386833

  19. Automated annotation of chemical names in the literature with tunable accuracy.

    PubMed

    Zhang, Jun D; Geer, Lewis Y; Bolton, Evan E; Bryant, Stephen H

    2011-11-22

    A significant portion of the biomedical and chemical literature refers to small molecules. The accurate identification and annotation of compound names that are relevant to the topic of a given publication can establish links between scientific publications and various chemical and life science databases. Manual annotation is the preferred method for this work because well-trained indexers can understand the paper's topic as well as recognize key terms. However, considering the hundreds of thousands of new papers published annually, an automatic annotation system with high precision and relevance can be a useful complement to manual annotation. An automated chemical name annotation system, MeSH Automated Annotations (MAA), was developed to annotate small molecule names in scientific abstracts with tunable accuracy. This system aims to reproduce the MeSH term annotations on biomedical and chemical literature that would be created by indexers. When automated free-text matching was compared with the manual indexing of 26 thousand MEDLINE abstracts, more than 40% of the automated annotations were false-positive (FP) cases. To reduce the FP rate, MAA incorporated several filters to remove "incorrect" annotations caused by nonspecific, partial, and low-relevance chemical names. In part, relevance was measured by the position of the chemical name in the text. Tunable accuracy was obtained by adding or restricting the sections of the text scanned for chemical names. The best precision obtained was 96% with a 28% recall rate. The best performance of MAA, as measured with the F statistic, was 66%, which compares favorably to other chemical name annotation systems. Accurate chemical name annotation can help researchers not only identify important chemical names in abstracts, but also match unindexed and unstructured abstracts to chemical records. The current work is tested against MEDLINE, but the algorithm is not specific to this corpus and it is possible that the algorithm can be
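
    The F statistic quoted above is the standard harmonic mean of precision and recall; a small helper makes the precision/recall trade-off concrete (the counts in the example are invented and only roughly echo the reported operating point):

        # Precision, recall, and F1 from raw annotation counts.
        def precision_recall_f1(true_positives, false_positives, false_negatives):
            precision = true_positives / (true_positives + false_positives)
            recall = true_positives / (true_positives + false_negatives)
            f1 = 2 * precision * recall / (precision + recall)
            return precision, recall, f1

        # e.g. a strict filter: very few false positives, but many missed names
        p, r, f = precision_recall_f1(true_positives=280, false_positives=12, false_negatives=720)
        print(f"precision={p:.2f} recall={r:.2f} F1={f:.2f}")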

  20. BeCAS: biomedical concept recognition services and visualization.

    PubMed

    Nunes, Tiago; Campos, David; Matos, Sérgio; Oliveira, José Luís

    2013-08-01

    The continuous growth of the biomedical scientific literature has been motivating the development of text-mining tools able to efficiently process all this information. Although numerous domain-specific solutions are available, there is no web-based concept-recognition system that combines the ability to select multiple concept types to annotate, to reference external databases and to automatically annotate nested and intersected concepts. BeCAS, the Biomedical Concept Annotation System, is an API for biomedical concept identification and a web-based tool that addresses these limitations. MEDLINE abstracts or free text can be annotated directly in the web interface, where identified concepts are enriched with links to reference databases. Using its customizable widget, it can also be used to augment external web pages with concept highlighting features. Furthermore, all text-processing and annotation features are made available through an HTTP REST API, allowing integration in any text-processing pipeline. BeCAS is freely available for non-commercial use at http://bioinformatics.ua.pt/becas. tiago.nunes@ua.pt or jlo@ua.pt.
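
    An illustrative client for a BeCAS-style annotation REST API; the endpoint path, request payload and response structure are assumptions (the abstract only states that an HTTP REST API exists) and should be checked against the BeCAS documentation:

        # Illustrative call to a BeCAS-style annotation REST API. The endpoint path,
        # payload fields, and response structure are assumptions for this sketch.
        import requests

        BECAS_URL = "http://bioinformatics.ua.pt/becas/api/text/annotate"  # assumed path

        def annotate_text(text):
            payload = {"text": text}          # assumed request format
            response = requests.post(BECAS_URL, json=payload, timeout=30)
            response.raise_for_status()
            return response.json()            # assumed JSON response

        if __name__ == "__main__":
            result = annotate_text("BRCA1 mutations increase breast cancer risk.")
            print(result)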

  1. Modeling loosely annotated images using both given and imagined annotations

    NASA Astrophysics Data System (ADS)

    Tang, Hong; Boujemaa, Nozha; Chen, Yunhao; Deng, Lei

    2011-12-01

    In this paper, we present an approach to learning latent semantic analysis models from loosely annotated images for automatic image annotation and indexing. The given annotation in training images is loose for two reasons: 1. ambiguous correspondences between visual features and annotated keywords; 2. incomplete lists of annotated keywords. The second reason motivates us to enrich the incomplete annotation in a simple way before learning a topic model. In particular, some "imagined" keywords are added to the incomplete annotation by measuring similarity between keywords in terms of their co-occurrence. Then, both given and imagined annotations are employed to learn probabilistic topic models for automatically annotating new images. We conduct experiments on two image databases (i.e., Corel and ESP) coupled with their loose annotations, and compare the proposed method with state-of-the-art discrete annotation methods. The proposed method improves word-driven probabilistic latent semantic analysis (PLSA-words) up to a performance comparable with the best discrete annotation method, while a merit of PLSA-words is still kept, i.e., a wider semantic range.
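
    A toy version of the annotation-enrichment step described above: keywords that frequently co-occur with the given keywords in the training annotations are added as "imagined" keywords; the keyword sets are invented:

        # Toy version of co-occurrence-based annotation enrichment: add the keywords
        # that most often co-occur with the given keywords in training annotations.
        from collections import Counter
        from itertools import combinations

        training_annotations = [
            {"beach", "sand", "sea"},
            {"beach", "sea", "sky"},
            {"mountain", "sky", "snow"},
            {"beach", "sand"},
        ]

        cooccurrence = Counter()
        for keywords in training_annotations:
            for a, b in combinations(sorted(keywords), 2):
                cooccurrence[(a, b)] += 1
                cooccurrence[(b, a)] += 1

        def enrich(annotation, top_k=2):
            scores = Counter()
            for given in annotation:
                for (a, b), count in cooccurrence.items():
                    if a == given and b not in annotation:
                        scores[b] += count
            imagined = [kw for kw, _ in scores.most_common(top_k)]
            return annotation | set(imagined)

        print(enrich({"beach"}))   # adds the frequently co-occurring 'sand' and 'sea'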

  2. Cheating. An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Wildemuth, Barbara M., Comp.

    This 89-item, annotated bibliography was compiled to provide access to research and discussions of cheating and, specifically, cheating on tests. It is not limited to any educational level, nor is it confined to any specific curriculum area. Two data bases were searched by computer, and a library search was conducted. A computer search of the…

  3. Automated Microbial Genome Annotation

    SciTech Connect

    Land, Miriam

    2009-05-29

    Miriam Land of the DOE Joint Genome Institute at Oak Ridge National Laboratory gives a talk on the current state and future challenges of moving toward automated microbial genome annotation at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM

  4. Annotation: The Savant Syndrome

    ERIC Educational Resources Information Center

    Heaton, Pamela; Wallace, Gregory L.

    2004-01-01

    Background: Whilst interest has focused on the origin and nature of the savant syndrome for over a century, it is only within the past two decades that empirical group studies have been carried out. Methods: The following annotation briefly reviews relevant research and also attempts to address outstanding issues in this research area.…

  5. Ghostwriting: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Simmons, Donald B.

    Drawn from communication journals, historical and news magazines, business and industrial magazines, political science and world affairs journals, general interest periodicals, and literary and political review magazines, the approximately 90 entries in this annotated bibliography discuss ghostwriting as practiced through the ages and reveal the…

  6. Annotated Bibliography. First Edition.

    ERIC Educational Resources Information Center

    Haring, Norris G.

    An annotated bibliography which presents approximately 300 references from 1951 to 1973 on the education of severely/profoundly handicapped persons. Citations are grouped alphabetically by author's name within the following categories: characteristics and treatment, gross motor development, sensory and motor development, physical therapy for the…

  7. Ghostwriting: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Simmons, Donald B.

    Drawn from communication journals, historical and news magazines, business and industrial magazines, political science and world affairs journals, general interest periodicals, and literary and political review magazines, the approximately 90 entries in this annotated bibliography discuss ghostwriting as practiced through the ages and reveal the…

  8. ANEXdb: an integrated animal ANnotation and microarray EXpression database.

    PubMed

    Couture, Oliver; Callenberg, Keith; Koul, Neeraj; Pandit, Sushain; Younes, Remy; Hu, Zhi-Liang; Dekkers, Jack; Reecy, James; Honavar, Vasant; Tuggle, Christopher

    2009-01-01

    To determine annotations of the sequence elements on microarrays used for transcriptional profiling experiments in livestock species, researchers currently must either use the sparse direct annotations available for these species or create their own annotations. ANEXdb ( http://www.anexdb.org ) is an open-source web application that supports integrated access to two databases that house microarray expression (ExpressDB) and EST annotation (AnnotDB) data. The expression database currently supports storage and querying of Affymetrix-based expression data as well as retrieval of experiments in a form ready for NCBI-GEO submission; these services are available online. AnnotDB currently houses a novel assembly of approximately 1.6 million unique porcine-expressed sequence reads called the Iowa Porcine Assembly (IPA), which consists of 140,087 consensus sequences, the Iowa Tentative Consensus (ITC) sequences, and 103,888 singletons. The IPA has been annotated via transfer of information from homologs identified through sequence alignment to NCBI RefSeq. These annotated sequences have been mapped to the Affymetrix porcine array elements, providing annotation for 22,569 of the 23,937 (94%) porcine-specific probe sets, of which 19,253 (80%) are linked to an NCBI RefSeq entry. The ITC has also been mined for sequence variation, providing evidence for up to 202,383 SNPs, 62,048 deletions, and 958 insertions in porcine-expressed sequence. These results create a single location to obtain porcine annotation of, and sequence variation in, differentially expressed genes in expression experiments, thus permitting possible identification of causal variants in such genes of interest. The ANEXdb application is open source and available from SourceForge.net.

  9. The OBO Foundry: coordinated evolution of ontologies to support biomedical data integration

    PubMed Central

    Smith, Barry; Ashburner, Michael; Rosse, Cornelius; Bard, Jonathan; Bug, William; Ceusters, Werner; Goldberg, Louis J; Eilbeck, Karen; Ireland, Amelia; Mungall, Christopher J; Leontis, Neocles; Rocca-Serra, Philippe; Ruttenberg, Alan; Sansone, Susanna-Assunta; Scheuermann, Richard H; Shah, Nigam; Whetzel, Patricia L; Lewis, Suzanna

    2010-01-01

    The value of any kind of data is greatly enhanced when it exists in a form that allows it to be integrated with other data. One approach to integration is through the annotation of multiple bodies of data using common controlled vocabularies or ‘ontologies’. Unfortunately, the very success of this approach has led to a proliferation of ontologies, which itself creates obstacles to integration. The Open Biomedical Ontologies (OBO) consortium is pursuing a strategy to overcome this problem. Existing OBO ontologies, including the Gene Ontology, are undergoing coordinated reform, and new ontologies are being created on the basis of an evolving set of shared principles governing ontology development. The result is an expanding family of ontologies designed to be interoperable and logically well formed and to incorporate accurate representations of biological reality. We describe this OBO Foundry initiative and provide guidelines for those who might wish to become involved. PMID:17989687

  10. Gene calling and bacterial genome annotation with BG7.

    PubMed

    Tobes, Raquel; Pareja-Tobes, Pablo; Manrique, Marina; Pareja-Tobes, Eduardo; Kovach, Evdokim; Alekhin, Alexey; Pareja, Eduardo

    2015-01-01

    New massive sequencing technologies are providing many bacterial genome sequences from diverse taxa, but a refined annotation of these genomes is crucial for obtaining scientific findings and new knowledge. Thus, bacterial genome annotation has emerged as a key research focus. Any efficient tool designed specifically to annotate bacterial genomes sequenced with massively parallel technologies has to consider the specific features of bacterial genomes (absence of introns and scarcity of nonprotein-coding sequence) and of next-generation sequencing (NGS) technologies (presence of errors and not perfectly assembled genomes). These features make it convenient to focus on coding regions and, hence, on protein sequences, which are the elements directly related to biological functions. In this chapter we describe how to annotate bacterial genomes with BG7, an open-source tool based on a protein-centered gene calling/annotation paradigm. BG7 is specifically designed for the annotation of bacterial genomes sequenced with NGS. The tool is tolerant of sequencing errors, maintaining its ability to annotate highly fragmented genomes or mixed sequences coming from several genomes (such as those obtained from metagenomics samples). BG7 has been designed with scalability as a requirement, with a computing infrastructure completely based on cloud computing (Amazon Web Services).

  11. The annotation and the usage of scientific databases could be improved with public issue tracker software

    PubMed Central

    Dall'Olio, Giovanni Marco; Bertranpetit, Jaume; Laayouni, Hafid

    2010-01-01

    Since the publication of their long-time predecessor, The Atlas of Protein Sequences and Structures, by Margaret Dayhoff in 1965, scientific databases have become a key factor in the organization of modern science. The information and knowledge described in the scientific literature is translated into entries in many different scientific databases, making it possible to obtain very accurate information on a biological entity such as a gene or protein without having to manually review the literature about it. However, even for the databases with the finest annotation procedures, errors or unclear parts sometimes appear in the publicly released version and influence the research of unaware scientists using them. A researcher who finds an error in a database is often left in an uncertain state, and often abandons the effort of reporting it because there is no standard procedure for doing so. In the present work, we propose that the simple adoption of a public issue-tracker application, as in many open software projects, could improve the quality of the annotations in many databases and encourage feedback from the scientific community on the publicly annotated data. To illustrate the situation, we describe a series of errors that we found, and helped resolve, in the genes of a very well-known pathway across several biomedically relevant databases. We show that, even though a majority of the most important scientific databases have procedures for reporting errors, these are usually not publicly visible, making the process of reporting errors time consuming and of limited use. Moreover, the effort made by the user who reports an error often goes unacknowledged, which puts the reporter in a discouraging position. PMID:21186182

  12. Perceptions regarding biomedical engineering

    NASA Astrophysics Data System (ADS)

    Pearson, James E.

    1995-10-01

    Perceptions of biomedical engineering are important because they can influence private and public decisions on R&D funding and public policy. A survey was conducted of a group of persons active in biomedical engineering research in an attempt to determine the perceptions of the general public and of the biomedical community regarding biomedical engineering. The public is believed to have 'a little' knowledge of biomedical engineering, and to have a wide range of opinions on what biomedical engineers do. The survey respondents believe they are in general agreement with the public on several questions regarding biomedical engineering. However, the public is believed to be more inclined than workers in the field to think that biomedical engineering increases the cost of health care, and to be less supportive of increased R&D funding for health care technology.

  13. AmiGO: online access to ontology and annotation data

    SciTech Connect

    Carbon, Seth; Ireland, Amelia; Mungall, Christopher J.; Shu, ShengQiang; Marshall, Brad; Lewis, Suzanna

    2009-01-15

    AmiGO is a web application that allows users to query, browse, and visualize ontologies and related gene product annotation (association) data. AmiGO can be used online at the Gene Ontology (GO) website to access the data provided by the GO Consortium; it can also be downloaded and installed to browse local ontologies and annotations. AmiGO is free open source software developed and maintained by the GO Consortium.

  14. Apollo: a sequence annotation editor.

    PubMed

    Lewis, S E; Searle, S M J; Harris, N; Gibson, M; Lyer, V; Richter, J; Wiel, C; Bayraktaroglu, L; Birney, E; Crosby, M A; Kaminker, J S; Matthews, B B; Prochnik, S E; Smithy, C D; Tupy, J L; Rubin, G M; Misra, S; Mungall, C J; Clamp, M E

    2002-01-01

    The well-established inaccuracy of purely computational methods for annotating genome sequences necessitates an interactive tool to allow biological experts to refine these approximations by viewing and independently evaluating the data supporting each annotation. Apollo was developed to meet this need, enabling curators to inspect genome annotations closely and edit them. FlyBase biologists successfully used Apollo to annotate the Drosophila melanogaster genome and it is increasingly being used as a starting point for the development of customized annotation editing tools for other genome projects.

  15. Exploring and linking biomedical resources through multidimensional semantic spaces

    PubMed Central

    2012-01-01

    Background The semantic integration of biomedical resources is still a challenging issue which is required for effective information processing and data analysis. The availability of comprehensive knowledge resources such as biomedical ontologies and integrated thesauri greatly facilitates this integration effort by means of semantic annotation, which allows disparate data formats and contents to be expressed under a common semantic space. In this paper, we propose a multidimensional representation for such a semantic space, where dimensions regard the different perspectives in biomedical research (e.g., population, disease, anatomy and protein/genes). Results This paper presents a novel method for building multidimensional semantic spaces from semantically annotated biomedical data collections. This method consists of two main processes: knowledge and data normalization. The former one arranges the concepts provided by a reference knowledge resource (e.g., biomedical ontologies and thesauri) into a set of hierarchical dimensions for analysis purposes. The latter one reduces the annotation set associated to each collection item into a set of points of the multidimensional space. Additionally, we have developed a visual tool, called 3D-Browser, which implements OLAP-like operators over the generated multidimensional space. The method and the tool have been tested and evaluated in the context of the Health-e-Child (HeC) project. Automatic semantic annotation was applied to tag three collections of abstracts taken from PubMed, one for each target disease of the project, the Uniprot database, and the HeC patient record database. We adopted the UMLS Meta-thesaurus 2010AA as the reference knowledge resource. Conclusions Current knowledge resources and semantic-aware technology make possible the integration of biomedical resources. Such an integration is performed through semantic annotation of the intended biomedical data resources. This paper shows how these annotations

  16. GSV Annotated Bibliography

    SciTech Connect

    Roberts, Randy S.; Pope, Paul A.; Jiang, Ming; Trucano, Timothy G.; Aragon, Cecilia R.; Ni, Kevin; Wei, Thomas; Chilton, Lawrence K.; Bakel, Alan

    2011-06-14

    The following annotated bibliography was developed as part of the Geospatial Algorithm Verification and Validation (GSV) project for the Simulation, Algorithms and Modeling program of NA-22. Verification and validation of geospatial image analysis algorithms covers a wide range of technologies. Papers in the bibliography are thus organized into the following five topic areas: Image processing and analysis, usability and validation of geospatial image analysis algorithms, image distance measures, scene modeling and image rendering, and transportation simulation models.

  17. Figure content analysis for improved biomedical article retrieval

    NASA Astrophysics Data System (ADS)

    You, Daekeun; Apostolova, Emilia; Antani, Sameer; Demner-Fushman, Dina; Thoma, George R.

    2009-01-01

    Biomedical images are invaluable in medical education and in establishing clinical diagnoses. Clinical decision support (CDS) can be improved by combining biomedical text with automatically annotated images extracted from relevant biomedical publications. In a previous study on the feasibility of automatically classifying images by their usefulness in finding clinical evidence, we reported 76.6% accuracy using supervised machine learning that combines figure captions and image content. Image content extraction is traditionally applied to entire images or to pre-determined image regions. However, figure images in articles vary greatly, which limits the benefit of whole-image extraction for CDS beyond gross categorization. Text annotations and pointers overlaid on figures indicate regions of interest (ROIs) that are then referenced in the caption or in the discussion in the article text. We previously reported 72.02% accuracy in localizing text and symbols in figures, but did not take advantage of the image regions they reference. In this work we combine article text analysis and figure image analysis to localize pointers (arrows, symbols) and extract the ROIs they indicate; these regions can then be used to measure meaningful image content and associate it with the identified biomedical concepts for improved (text and image) content-based retrieval of biomedical articles. Biomedical concepts are identified using the National Library of Medicine's Unified Medical Language System (UMLS) Metathesaurus. Our methods achieve an average precision and recall of 92.3% and 75.3%, respectively, in identifying pointing symbols in images from a randomly selected image subset made available through the ImageCLEF 2008 campaign.

  18. Porting a lexicalized-grammar parser to the biomedical domain.

    PubMed

    Rimell, Laura; Clark, Stephen

    2009-10-01

    This paper introduces a state-of-the-art, linguistically motivated statistical parser to the biomedical text mining community, and proposes a method of adapting it to the biomedical domain requiring only limited resources for data annotation. The parser was originally developed using the Penn Treebank and is therefore tuned to newspaper text. Our approach takes advantage of a lexicalized grammar formalism, Combinatory Categorial Grammar (ccg), to train the parser at a lower level of representation than full syntactic derivations. The ccg parser uses three levels of representation: a first level consisting of part-of-speech (pos) tags; a second level consisting of more fine-grained ccg lexical categories; and a third, hierarchical level consisting of ccg derivations. We find that simply retraining the pos tagger on biomedical data leads to a large improvement in parsing performance, and that using annotated data at the intermediate lexical category level of representation improves parsing accuracy further. We describe the procedure involved in evaluating the parser, and obtain accuracies for biomedical data in the same range as those reported for newspaper text, and higher than those previously reported for the biomedical resource on which we evaluate. Our conclusion is that porting newspaper parsers to the biomedical domain, at least for parsers which use lexicalized grammars, may not be as difficult as first thought.

  19. Biocompatibility of implantable biomedical devices

    NASA Astrophysics Data System (ADS)

    Lyu, Suping

    2008-03-01

    Biomedical devices have been broadly used to treat human disease, especially chronic diseases where pharmaceuticals are less effective; heart valves and artificial joints are examples. Biomedical devices perform by delivering therapies such as electrical stimulation, mechanical support and biological actions. While the uses of biomedical devices are highly successful, they can also trigger adverse biological reactions. The property that a medical device performs its intended function without causing unacceptable adverse effects was originally called biocompatibility. As our understanding of biomaterial-biological interactions has broadened, biocompatibility has taken on a broader meaning. In this talk, I will present some adverse biological reactions observed with implantable biomedical devices. Among them are surface fouling of implantable sensors, calcification with vascular devices, restenosis with stents, foreign particle migration and mechanical fractures of devices due to inflammation reactions. While these effects are repeatable, there are very few quantitative data and theories to define them. The purpose of this presentation is to introduce the biocompatibility concept to biophysicists to stimulate research interest from different angles. An open question is how to quantitatively understand biocompatibility, which, like many other biological processes, has not been quantified experimentally.

  20. Unsupervised discovery of information structure in biomedical documents.

    PubMed

    Kiela, Douwe; Guo, Yufan; Stenius, Ulla; Korhonen, Anna

    2015-04-01

    Information structure (IS) analysis is a text mining technique, which classifies text in biomedical articles into categories that capture different types of information, such as objectives, methods, results and conclusions of research. It is a highly useful technique that can support a range of Biomedical Text Mining tasks and can help readers of biomedical literature find information of interest faster, accelerating the highly time-consuming process of literature review. Several approaches to IS analysis have been presented in the past, with promising results in real-world biomedical tasks. However, all existing approaches, even weakly supervised ones, require several hundreds of hand-annotated training sentences specific to the domain in question. Because biomedicine is subject to considerable domain variation, such annotations are expensive to obtain. This makes the application of IS analysis across biomedical domains difficult. In this article, we investigate an unsupervised approach to IS analysis and evaluate the performance of several unsupervised methods on a large corpus of biomedical abstracts collected from PubMed. Our best unsupervised algorithm (multilevel-weighted graph clustering algorithm) performs very well on the task, obtaining over 0.70 F scores for most IS categories when applied to well-known IS schemes. This level of performance is close to that of lightly supervised IS methods and has proven sufficient to aid a range of practical tasks. Thus, using an unsupervised approach, IS could be applied to support a wide range of tasks across sub-domains of biomedicine. We also demonstrate that unsupervised learning brings novel insights into IS of biomedical literature and discovers information categories that are not present in any of the existing IS schemes. The annotated corpus and software are available at http://www.cl.cam.ac.uk/∼dk427/bio14info.html. © The Author 2014. Published by Oxford University Press. All rights reserved. For

  1. Code generation through annotation of macromolecular structure data.

    PubMed

    Biggs, J; Pu, C; Bourne, P

    1997-01-01

    The maintenance of software which uses a rapidly evolving data annotation scheme is time consuming and expensive. At the same time without current software the annotation scheme itself becomes limited and is less likely to be widely adopted. A solution to this problem has been developed for the macromolecular Crystallographic Information File (mmCIF) annotation scheme. The approach could be generalized for a variety of annotation schemes used or proposed for molecular biology data. mmCIF provides a highly structured and complete annotation for describing NMR and X-ray crystallographic data and the resulting macromolecular structures. This annotation is maintained in the mmCIF dictionary which currently contains over 3,200 terms. A major challenge is to maintain code for converting between mmCIF and Protein Data Bank (PDB) annotations while both continue to evolve. The solution has been to define a simple domain specific language (DSL) which is added to the extensive annotation already found in the mmCIF dictionary. The DSL calls specific mapping modules for each category of data item in the mmCIF dictionary. Adding or changing the mapping between PDB and mmCIF items of data is straightforward since data categories (and hence mapping modules) correspond to elements of macromolecular structure familiar to the experimentalist. Each time a change is made to the macromolecular annotation the appropriate change is made to the easily located and modifiable mapping modules. A code generator is then called which reads the mapping modules and creates a new executable for performing the data conversion. In this way code is easily kept current by individuals with limited programming skill, but who have an understanding of macromolecular structure and details of the annotation scheme. Most important, the conversion process becomes part of the global dictionary and is not open to a variety of interpretations by different research groups writing code based on dictionary contents
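
    As a rough illustration of the dictionary-driven code-generation idea described above, the sketch below reads a small table of per-category mapping rules (a stand-in for the DSL entries embedded in the dictionary) and emits a Python module with one conversion function per category. The category names, rule format and generated functions are hypothetical; this is not the actual mmCIF dictionary language or the published converter.

      # Minimal sketch of dictionary-driven code generation for annotation
      # conversion. Mapping rules (hypothetical, not the real mmCIF DSL)
      # associate a data category with a per-record conversion expression;
      # the generator emits a Python module with one function per category.

      MAPPING_RULES = {
          # category -> expression converting a source record (dict) to a target dict
          "atom_site": "{'record': 'ATOM', 'name': src['label_atom_id'], 'x': float(src['Cartn_x'])}",
          "entity":    "{'record': 'COMPND', 'description': src['pdbx_description']}",
      }

      def generate_module(rules):
          """Emit Python source with one convert_<category>() function per rule."""
          lines = ["# auto-generated conversion module"]
          for category, expression in rules.items():
              lines.append(f"def convert_{category}(src):")
              lines.append(f"    return {expression}")
              lines.append("")
          return "\n".join(lines)

      if __name__ == "__main__":
          source = generate_module(MAPPING_RULES)
          namespace = {}
          exec(compile(source, "<generated>", "exec"), namespace)  # build the executable module
          row = {"label_atom_id": "CA", "Cartn_x": "12.5"}
          print(namespace["convert_atom_site"](row))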

  2. Elucidating high-dimensional cancer hallmark annotation via enriched ontology.

    PubMed

    Yan, Shankai; Wong, Ka-Chun

    2017-09-01

    Cancer hallmark annotation is a promising technique that could discover novel knowledge about cancer from the biomedical literature. The automated annotation of cancer hallmarks could reveal relevant cancer transformation processes in the literature or extract the articles that correspond to the cancer hallmark of interest. It acts as a complementary approach that can retrieve knowledge from massive text information, advancing numerous focused studies in cancer research. Nonetheless, the high-dimensional nature of cancer hallmark annotation imposes a unique challenge. To address the curse of dimensionality, we compared multiple cancer hallmark annotation methods on 1580 PubMed abstracts. Based on the insights, a novel approach, UDT-RF, which makes use of ontological features is proposed. It expands the feature space via the Medical Subject Headings (MeSH) ontology graph and utilizes novel feature selections for elucidating the high-dimensional cancer hallmark annotation space. To demonstrate its effectiveness, state-of-the-art methods are compared and evaluated by a multitude of performance metrics, revealing the full performance spectrum on the full set of cancer hallmarks. Several case studies are conducted, demonstrating how the proposed approach could reveal novel insights into cancers. https://github.com/cskyan/chmannot. Copyright © 2017 Elsevier Inc. All rights reserved.
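
    The ontology-based feature expansion at the heart of the approach above can be sketched as follows: each term assigned to a document is expanded with its ancestors in a toy MeSH-like hierarchy before any classifier is trained. The hierarchy and terms are invented for illustration; this is not the UDT-RF implementation.

      # Sketch: expand document term sets with ontology ancestors before
      # vectorizing them for a classifier. Toy hierarchy, not real MeSH.

      PARENTS = {                      # child -> parent (single inheritance for brevity)
          "angiogenesis": "vascular process",
          "vascular process": "physiological process",
          "apoptosis": "cell death",
          "cell death": "physiological process",
      }

      def ancestors(term):
          """Return the term plus all of its ancestors in the toy hierarchy."""
          out = {term}
          while term in PARENTS:
              term = PARENTS[term]
              out.add(term)
          return out

      def expand(term_sets):
          """Replace each document's term set with its ancestor-closed version."""
          return [set().union(*(ancestors(t) for t in terms)) for terms in term_sets]

      if __name__ == "__main__":
          docs = [{"angiogenesis"}, {"apoptosis"}, {"angiogenesis", "apoptosis"}]
          for original, expanded in zip(docs, expand(docs)):
              print(sorted(original), "->", sorted(expanded))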

  3. Semantic biomedical resource discovery: a Natural Language Processing framework.

    PubMed

    Sfakianaki, Pepi; Koumakis, Lefteris; Sfakianakis, Stelios; Iatraki, Galatia; Zacharioudakis, Giorgos; Graf, Norbert; Marias, Kostas; Tsiknakis, Manolis

    2015-09-30

    A plethora of publicly available biomedical resources currently exist, and their number is increasing at a fast rate. In parallel, specialized repositories are being developed that index numerous clinical and biomedical tools. The main drawback of such repositories is the difficulty of locating appropriate resources for a clinical or biomedical decision task, especially for users who are not Information Technology experts. Moreover, although NLP research in the clinical domain has been active since the 1960s, progress in the development of NLP applications has been slow and lags behind progress in the general NLP domain. The aim of the present study is to investigate the use of semantics for annotating biomedical resources with domain-specific ontologies, and to exploit Natural Language Processing methods to enable non-expert users to search for biomedical resources efficiently using natural language. A Natural Language Processing engine that can "translate" free text into targeted queries, automatically transforming a clinical research question into a request description that contains only ontology terms, has been implemented. The implementation is based on information extraction techniques for natural-language text, guided by integrated ontologies. Furthermore, knowledge from robust text mining methods has been incorporated to map descriptions onto suitable domain ontologies, ensuring that the biomedical resource descriptions are domain oriented and enhancing the accuracy of service discovery. The framework is freely available as a web application at ( http://calchas.ics.forth.gr/ ). For our experiments, a range of clinical questions was established based on descriptions of clinical trials from the ClinicalTrials.gov registry as well as recommendations from clinicians. Domain experts manually identified the tools in a tools repository that are suitable for addressing the clinical questions at hand, either
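
    The "free text to ontology-term query" step can be illustrated with a minimal dictionary-based concept lookup over the question text; the tiny lexicon and the concept identifiers below are placeholders standing in for the integrated ontologies used by the framework, not its actual implementation.

      # Sketch: turn a free-text clinical question into a set of ontology
      # concept identifiers via longest-match dictionary lookup (toy lexicon).

      LEXICON = {   # surface form -> placeholder ontology concept id
          "breast cancer": "DOID:1612",
          "gene expression": "GO:0010467",
          "survival analysis": "OP:0001",
      }

      def annotate(question, lexicon=LEXICON):
          text = question.lower()
          hits = []
          # try longer surface forms first so multi-word concepts win
          for surface, concept in sorted(lexicon.items(), key=lambda kv: -len(kv[0])):
              if surface in text:
                  hits.append((surface, concept))
                  text = text.replace(surface, " ")    # prevent nested re-matches
          return hits

      if __name__ == "__main__":
          q = "Which tools perform survival analysis on gene expression data in breast cancer?"
          for surface, concept in annotate(q):
              print(f"{surface!r} -> {concept}")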

  4. Biomedical Terminology Mapper for UML projects

    PubMed Central

    Thibault, Julien C.; Frey, Lewis

    As the biomedical community collects and generates more and more data, the need to describe these datasets for exchange and interoperability becomes crucial. This paper presents a mapping algorithm that can help developers expose local implementations described with UML through standard terminologies. The input UML class or attribute name is first normalized and tokenized, then lookups in a UMLS-based dictionary are performed. For the evaluation of the algorithm 142 UML projects were extracted from caGrid and automatically mapped to National Cancer Institute (NCI) terminology concepts. Resulting mappings at the UML class and attribute levels were compared to the manually curated annotations provided in caGrid. Results are promising and show that this type of algorithm could speed-up the tedious process of mapping local implementations to standard biomedical terminologies. PMID:24303278

  5. Biomedical Terminology Mapper for UML projects.

    PubMed

    Thibault, Julien C; Frey, Lewis

    2013-01-01

    As the biomedical community collects and generates more and more data, the need to describe these datasets for exchange and interoperability becomes crucial. This paper presents a mapping algorithm that can help developers expose local implementations described with UML through standard terminologies. The input UML class or attribute name is first normalized and tokenized, then lookups in a UMLS-based dictionary are performed. For the evaluation of the algorithm 142 UML projects were extracted from caGrid and automatically mapped to National Cancer Institute (NCI) terminology concepts. Resulting mappings at the UML class and attribute levels were compared to the manually curated annotations provided in caGrid. Results are promising and show that this type of algorithm could speed-up the tedious process of mapping local implementations to standard biomedical terminologies.
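
    The normalize-tokenize-lookup pipeline described in the two records above can be sketched as follows; the camel-case splitting rule and the tiny UMLS-style dictionary (with placeholder concept codes) are illustrative assumptions, not the published algorithm.

      import re

      # Sketch: map a UML class/attribute name to candidate terminology concepts
      # by normalizing, tokenizing, and looking phrases/tokens up in a small
      # UMLS-style dictionary (toy entries, placeholder concept codes).

      DICTIONARY = {
          "patient": "C0030705",
          "diagnosis": "C0011900",
          "specimen collection": "C0200345",
      }

      def tokenize(name):
          """Split camelCase / snake_case identifiers into lowercase tokens."""
          parts = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", name).replace("_", " ")
          return [t.lower() for t in parts.split() if t]

      def lookup(name, dictionary=DICTIONARY):
          tokens = tokenize(name)
          candidates = []
          phrase = " ".join(tokens)          # try the full phrase first
          if phrase in dictionary:
              candidates.append((phrase, dictionary[phrase]))
          for token in tokens:               # then individual tokens
              if token in dictionary:
                  candidates.append((token, dictionary[token]))
          return candidates

      if __name__ == "__main__":
          for uml_name in ("PatientDiagnosis", "specimen_collection"):
              print(uml_name, "->", lookup(uml_name))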

  6. Annotation and visualization of endogenous retroviral sequences using the Distributed Annotation System (DAS) and eBioX.

    PubMed

    Barrio, Alvaro Martínez; Lagercrantz, Erik; Sperber, Göran O; Blomberg, Jonas; Bongcam-Rudloff, Erik

    2009-06-16

    The Distributed Annotation System (DAS) is a widely used network protocol for sharing biological information. The distributed aspects of the protocol enable the use of various reference and annotation servers for connecting biological sequence data to pertinent annotations in order to depict an integrated view of the data for the final user. An annotation server has been devised to provide information about the endogenous retroviruses detected and annotated by a specialized in silico tool called RetroTector. We describe the procedure to implement the DAS 1.5 protocol commands necessary for constructing the DAS annotation server. We use our server to exemplify those steps. Data distribution is kept separated from visualization which is carried out by eBioX, an easy to use open source program incorporating multiple bioinformatics utilities. Some well characterized endogenous retroviruses are shown in two different DAS clients. A rapid analysis of areas free from retroviral insertions could be facilitated by our annotations. The DAS protocol has shown to be advantageous in the distribution of endogenous retrovirus data. The distributed nature of the protocol is also found to aid in combining annotation and visualization along a genome in order to enhance the understanding of ERV contribution to its evolution. Reference and annotation servers are conjointly used by eBioX to provide visualization of ERV annotations as well as other data sources. Our DAS data source can be found in the central public DAS service repository, http://www.dasregistry.org, or at http://loka.bmc.uu.se/das/sources.
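
    A minimal client-side sketch of a DAS 1.5-style features request, assuming a server that exposes the standard /das/<source>/features command and returns DASGFF XML; the server URL and segment below are placeholders rather than the actual service described above.

      import urllib.request
      import xml.etree.ElementTree as ET

      # Sketch: fetch feature annotations for a genomic segment from a DAS-style
      # annotation server and print basic fields. URL and segment are placeholders.

      DAS_SOURCE_URL = "http://example.org/das/ervsource"   # hypothetical DAS source

      def fetch_features(segment, source_url=DAS_SOURCE_URL):
          """Call the DAS 'features' command for a segment like '1:1,200000'."""
          url = f"{source_url}/features?segment={segment}"
          with urllib.request.urlopen(url) as response:
              tree = ET.parse(response)
          for feature in tree.iter("FEATURE"):
              start = feature.findtext("START")
              end = feature.findtext("END")
              ftype = feature.find("TYPE")
              label = ftype.text if ftype is not None else feature.get("label", "")
              print(feature.get("id"), label, start, end)

      if __name__ == "__main__":
          fetch_features("1:1,200000")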

  7. Openness as infrastructure

    PubMed Central

    2011-01-01

    The advent of open access to peer reviewed scholarly literature in the biomedical sciences creates the opening to examine scholarship in general, and chemistry in particular, to see where and how novel forms of network technology can accelerate the scientific method. This paper examines broad trends in information access and openness with an eye towards their applications in chemistry. PMID:21999327

  8. BioInfer: a corpus for information extraction in the biomedical domain

    PubMed Central

    Pyysalo, Sampo; Ginter, Filip; Heimonen, Juho; Björne, Jari; Boberg, Jorma; Järvinen, Jouni; Salakoski, Tapio

    2007-01-01

    Background Lately, there has been great interest in the application of information extraction methods to the biomedical domain, in particular to the extraction of relationships among genes, proteins, and RNA from scientific publications. The development and evaluation of such methods requires annotated domain corpora. Results We present BioInfer (Bio Information Extraction Resource), a new public resource providing an annotated corpus of biomedical English. We describe an annotation scheme capturing named entities and their relationships along with a dependency analysis of sentence syntax. We further present ontologies defining the types of entities and relationships annotated in the corpus. Currently, the corpus contains 1100 sentences from abstracts of biomedical research articles annotated for relationships, named entities, as well as syntactic dependencies. Supporting software is provided with the corpus. The corpus is unique in the domain in combining these annotation types for a single set of sentences, and in the level of detail of the relationship annotation. Conclusion We introduce a corpus targeted at protein, gene, and RNA relationships which serves as a resource for the development of information extraction systems and their components such as parsers and domain analyzers. The corpus will be maintained and further developed, with the current version available online. PMID:17291334

  9. Evaluation of research in biomedical ontologies.

    PubMed

    Hoehndorf, Robert; Dumontier, Michel; Gkoutos, Georgios V

    2013-11-01

    Ontologies are now pervasive in biomedicine, where they serve as a means to standardize terminology, to enable access to domain knowledge, to verify data consistency and to facilitate integrative analyses over heterogeneous biomedical data. For this purpose, research on biomedical ontologies applies theories and methods from diverse disciplines such as information management, knowledge representation, cognitive science, linguistics and philosophy. Depending on the desired applications in which ontologies are being applied, the evaluation of research in biomedical ontologies must follow different strategies. Here, we provide a classification of research problems in which ontologies are being applied, focusing on the use of ontologies in basic and translational research, and we demonstrate how research results in biomedical ontologies can be evaluated. The evaluation strategies depend on the desired application and measure the success of using an ontology for a particular biomedical problem. For many applications, the success can be quantified, thereby facilitating the objective evaluation and comparison of research in biomedical ontology. The objective, quantifiable comparison of research results based on scientific applications opens up the possibility for systematically improving the utility of ontologies in biomedical research.

  10. Evaluation of research in biomedical ontologies

    PubMed Central

    Dumontier, Michel; Gkoutos, Georgios V.

    2013-01-01

    Ontologies are now pervasive in biomedicine, where they serve as a means to standardize terminology, to enable access to domain knowledge, to verify data consistency and to facilitate integrative analyses over heterogeneous biomedical data. For this purpose, research on biomedical ontologies applies theories and methods from diverse disciplines such as information management, knowledge representation, cognitive science, linguistics and philosophy. Depending on the desired applications in which ontologies are being applied, the evaluation of research in biomedical ontologies must follow different strategies. Here, we provide a classification of research problems in which ontologies are being applied, focusing on the use of ontologies in basic and translational research, and we demonstrate how research results in biomedical ontologies can be evaluated. The evaluation strategies depend on the desired application and measure the success of using an ontology for a particular biomedical problem. For many applications, the success can be quantified, thereby facilitating the objective evaluation and comparison of research in biomedical ontology. The objective, quantifiable comparison of research results based on scientific applications opens up the possibility for systematically improving the utility of ontologies in biomedical research. PMID:22962340

  11. RATT: Rapid Annotation Transfer Tool

    PubMed Central

    Otto, Thomas D.; Dillon, Gary P.; Degrave, Wim S.; Berriman, Matthew

    2011-01-01

    Second-generation sequencing technologies have made large-scale sequencing projects commonplace. However, making use of these datasets often requires gene function to be ascribed genome wide. Although tool development has kept pace with the changes in sequence production, for tasks such as mapping, de novo assembly or visualization, genome annotation remains a challenge. We have developed a method to rapidly provide accurate annotation for new genomes using previously annotated genomes as a reference. The method, implemented in a tool called RATT (Rapid Annotation Transfer Tool), transfers annotations from a high-quality reference to a new genome on the basis of conserved synteny. We demonstrate that a Mycobacterium tuberculosis genome or a single 2.5 Mb chromosome from a malaria parasite can be annotated in less than five minutes with only modest computational resources. RATT is available at http://ratt.sourceforge.net. PMID:21306991
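
    Synteny-based annotation transfer can be illustrated with a small sketch: features whose reference coordinates fall inside a conserved block are shifted by that block's offset onto the new assembly. The block table and feature records are toy data, and this simplification is not RATT's algorithm in detail.

      # Sketch: transfer feature coordinates from a reference to a new assembly
      # using conserved (colinear, same-strand) blocks. Blocks and features are toy data.

      # (ref_contig, ref_start, ref_end, new_contig, new_start)
      BLOCKS = [
          ("ref_chr1", 1000, 5000, "new_ctg7", 200),
          ("ref_chr1", 8000, 12000, "new_ctg7", 4300),
      ]

      FEATURES = [  # (name, ref_contig, start, end)
          ("geneA", "ref_chr1", 1500, 2300),
          ("geneB", "ref_chr1", 9000, 9500),
          ("geneC", "ref_chr1", 6000, 6500),   # falls between blocks: not transferable
      ]

      def transfer(feature, blocks=BLOCKS):
          name, contig, start, end = feature
          for ref_contig, ref_start, ref_end, new_contig, new_start in blocks:
              if contig == ref_contig and ref_start <= start and end <= ref_end:
                  offset = new_start - ref_start
                  return (name, new_contig, start + offset, end + offset)
          return None   # feature not covered by any conserved block

      if __name__ == "__main__":
          for feature in FEATURES:
              print(feature[0], "->", transfer(feature))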

  12. The Ontology for Biomedical Investigations.

    PubMed

    Bandrowski, Anita; Brinkman, Ryan; Brochhausen, Mathias; Brush, Matthew H; Bug, Bill; Chibucos, Marcus C; Clancy, Kevin; Courtot, Mélanie; Derom, Dirk; Dumontier, Michel; Fan, Liju; Fostel, Jennifer; Fragoso, Gilberto; Gibson, Frank; Gonzalez-Beltran, Alejandra; Haendel, Melissa A; He, Yongqun; Heiskanen, Mervi; Hernandez-Boussard, Tina; Jensen, Mark; Lin, Yu; Lister, Allyson L; Lord, Phillip; Malone, James; Manduchi, Elisabetta; McGee, Monnie; Morrison, Norman; Overton, James A; Parkinson, Helen; Peters, Bjoern; Rocca-Serra, Philippe; Ruttenberg, Alan; Sansone, Susanna-Assunta; Scheuermann, Richard H; Schober, Daniel; Smith, Barry; Soldatova, Larisa N; Stoeckert, Christian J; Taylor, Chris F; Torniai, Carlo; Turner, Jessica A; Vita, Randi; Whetzel, Patricia L; Zheng, Jie

    2016-01-01

    The Ontology for Biomedical Investigations (OBI) is an ontology that provides terms with precisely defined meanings to describe all aspects of how investigations in the biological and medical domains are conducted. OBI re-uses ontologies that provide a representation of biomedical knowledge from the Open Biological and Biomedical Ontologies (OBO) project and adds the ability to describe how this knowledge was derived. We here describe the state of OBI and several applications that are using it, such as adding semantic expressivity to existing databases, building data entry forms, and enabling interoperability between knowledge resources. OBI covers all phases of the investigation process, such as planning, execution and reporting. It represents information and material entities that participate in these processes, as well as roles and functions. Prior to OBI, it was not possible to use a single internally consistent resource that could be applied to multiple types of experiments for these applications. OBI has made this possible by creating terms for entities involved in biological and medical investigations and by importing parts of other biomedical ontologies such as GO, Chemical Entities of Biological Interest (ChEBI) and Phenotype Attribute and Trait Ontology (PATO) without altering their meaning. OBI is being used in a wide range of projects covering genomics, multi-omics, immunology, and catalogs of services. OBI has also spawned other ontologies (Information Artifact Ontology) and methods for importing parts of ontologies (Minimum information to reference an external ontology term (MIREOT)). The OBI project is an open cross-disciplinary collaborative effort, encompassing multiple research communities from around the globe. To date, OBI has created 2366 classes and 40 relations along with textual and formal definitions. The OBI Consortium maintains a web resource (http://obi-ontology.org) providing details on the people, policies, and issues being addressed

  13. The Ontology for Biomedical Investigations

    PubMed Central

    Bandrowski, Anita; Brinkman, Ryan; Brochhausen, Mathias; Brush, Matthew H.; Chibucos, Marcus C.; Clancy, Kevin; Courtot, Mélanie; Derom, Dirk; Dumontier, Michel; Fan, Liju; Fostel, Jennifer; Fragoso, Gilberto; Gibson, Frank; Gonzalez-Beltran, Alejandra; Haendel, Melissa A.; He, Yongqun; Heiskanen, Mervi; Hernandez-Boussard, Tina; Jensen, Mark; Lin, Yu; Lister, Allyson L.; Lord, Phillip; Malone, James; Manduchi, Elisabetta; McGee, Monnie; Morrison, Norman; Overton, James A.; Parkinson, Helen; Peters, Bjoern; Rocca-Serra, Philippe; Ruttenberg, Alan; Sansone, Susanna-Assunta; Scheuermann, Richard H.; Schober, Daniel; Smith, Barry; Soldatova, Larisa N.; Stoeckert, Christian J.; Taylor, Chris F.; Torniai, Carlo; Turner, Jessica A.; Vita, Randi; Whetzel, Patricia L.; Zheng, Jie

    2016-01-01

    The Ontology for Biomedical Investigations (OBI) is an ontology that provides terms with precisely defined meanings to describe all aspects of how investigations in the biological and medical domains are conducted. OBI re-uses ontologies that provide a representation of biomedical knowledge from the Open Biological and Biomedical Ontologies (OBO) project and adds the ability to describe how this knowledge was derived. We here describe the state of OBI and several applications that are using it, such as adding semantic expressivity to existing databases, building data entry forms, and enabling interoperability between knowledge resources. OBI covers all phases of the investigation process, such as planning, execution and reporting. It represents information and material entities that participate in these processes, as well as roles and functions. Prior to OBI, it was not possible to use a single internally consistent resource that could be applied to multiple types of experiments for these applications. OBI has made this possible by creating terms for entities involved in biological and medical investigations and by importing parts of other biomedical ontologies such as GO, Chemical Entities of Biological Interest (ChEBI) and Phenotype Attribute and Trait Ontology (PATO) without altering their meaning. OBI is being used in a wide range of projects covering genomics, multi-omics, immunology, and catalogs of services. OBI has also spawned other ontologies (Information Artifact Ontology) and methods for importing parts of ontologies (Minimum information to reference an external ontology term (MIREOT)). The OBI project is an open cross-disciplinary collaborative effort, encompassing multiple research communities from around the globe. To date, OBI has created 2366 classes and 40 relations along with textual and formal definitions. The OBI Consortium maintains a web resource (http://obi-ontology.org) providing details on the people, policies, and issues being addressed

  14. Toward an automatic method for extracting cancer- and other disease-related point mutations from the biomedical literature.

    PubMed

    Doughty, Emily; Kertesz-Farkas, Attila; Bodenreider, Olivier; Thompson, Gary; Adadey, Asa; Peterson, Thomas; Kann, Maricel G

    2011-02-01

    A major goal of biomedical research in personalized medicine is to find relationships between mutations and their corresponding disease phenotypes. However, most of the disease-related mutational data are currently buried in the biomedical literature in textual form and lack the necessary structure to allow easy retrieval and visualization. We introduce a high-throughput computational method for the identification of relevant disease mutations in PubMed abstracts applied to prostate (PCa) and breast cancer (BCa) mutations. We developed the extractor of mutations (EMU) tool to identify mutations and their associated genes. We benchmarked EMU against MutationFinder--a tool to extract point mutations from text. Our results show that both methods achieve comparable performance on two manually curated datasets. We also benchmarked EMU's performance for extracting the complete mutational information and phenotype. Remarkably, we show that one of the steps in our approach, a filter based on sequence analysis, increases the precision for that task from 0.34 to 0.59 (PCa) and from 0.39 to 0.61 (BCa). We also show that this high-throughput approach can be extended to other diseases. Our method improves the current status of disease-mutation databases by significantly increasing the number of annotated mutations. We found 51 and 128 mutations manually verified to be related to PCa and Bca, respectively, that are not currently annotated for these cancer types in the OMIM or Swiss-Prot databases. EMU's retrieval performance represents a 2-fold improvement in the number of annotated mutations for PCa and BCa. We further show that our method can benefit from full-text analysis once there is an increase in Open Access availability of full-text articles. Freely available at: http://bioinf.umbc.edu/EMU/ftp.

  15. Toward an automatic method for extracting cancer- and other disease-related point mutations from the biomedical literature

    PubMed Central

    Doughty, Emily; Kertesz-Farkas, Attila; Bodenreider, Olivier; Thompson, Gary; Adadey, Asa; Peterson, Thomas; Kann, Maricel G.

    2011-01-01

    Motivation: A major goal of biomedical research in personalized medicine is to find relationships between mutations and their corresponding disease phenotypes. However, most of the disease-related mutational data are currently buried in the biomedical literature in textual form and lack the necessary structure to allow easy retrieval and visualization. We introduce a high-throughput computational method for the identification of relevant disease mutations in PubMed abstracts applied to prostate (PCa) and breast cancer (BCa) mutations. Results: We developed the extractor of mutations (EMU) tool to identify mutations and their associated genes. We benchmarked EMU against MutationFinder—a tool to extract point mutations from text. Our results show that both methods achieve comparable performance on two manually curated datasets. We also benchmarked EMU's performance for extracting the complete mutational information and phenotype. Remarkably, we show that one of the steps in our approach, a filter based on sequence analysis, increases the precision for that task from 0.34 to 0.59 (PCa) and from 0.39 to 0.61 (BCa). We also show that this high-throughput approach can be extended to other diseases. Discussion: Our method improves the current status of disease-mutation databases by significantly increasing the number of annotated mutations. We found 51 and 128 mutations manually verified to be related to PCa and Bca, respectively, that are not currently annotated for these cancer types in the OMIM or Swiss-Prot databases. EMU's retrieval performance represents a 2-fold improvement in the number of annotated mutations for PCa and BCa. We further show that our method can benefit from full-text analysis once there is an increase in Open Access availability of full-text articles. Availability: Freely available at: http://bioinf.umbc.edu/EMU/ftp. Contact: mkann@umbc.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:21138947
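
    The kind of pattern-based point-mutation extraction performed by tools in this space can be sketched with simple regular expressions for one-letter (A123T) and three-letter (Ala123Thr) substitution mentions; the patterns below are simplified illustrations, not the published EMU or MutationFinder rules.

      import re

      # Sketch: find protein point-mutation mentions in text using simple
      # regular expressions (one-letter 'A123T' and three-letter 'Ala123Thr'
      # forms). Simplified patterns for illustration only.

      AA1 = "ACDEFGHIKLMNPQRSTVWY"
      AA3 = ("Ala|Arg|Asn|Asp|Cys|Gln|Glu|Gly|His|Ile|Leu|Lys|Met|Phe|Pro|"
             "Ser|Thr|Trp|Tyr|Val")

      ONE_LETTER = re.compile(rf"\b([{AA1}])(\d+)([{AA1}])\b")
      THREE_LETTER = re.compile(rf"\b({AA3})(\d+)({AA3})\b")

      def find_mutations(text):
          hits = [m.groups() for m in ONE_LETTER.finditer(text)]
          hits += [m.groups() for m in THREE_LETTER.finditer(text)]
          return hits

      if __name__ == "__main__":
          sentence = ("The T877A substitution in the androgen receptor and the "
                      "Arg72Pro variant of TP53 were associated with prostate cancer.")
          for wild_type, position, mutant in find_mutations(sentence):
              print(f"{wild_type}{position}{mutant}")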

  16. The Ensembl gene annotation system

    PubMed Central

    Aken, Bronwen L.; Ayling, Sarah; Barrell, Daniel; Clarke, Laura; Curwen, Valery; Fairley, Susan; Fernandez Banet, Julio; Billis, Konstantinos; García Girón, Carlos; Hourlier, Thibaut; Howe, Kevin; Kähäri, Andreas; Kokocinski, Felix; Martin, Fergal J.; Murphy, Daniel N.; Nag, Rishi; Ruffier, Magali; Schuster, Michael; Tang, Y. Amy; Vogel, Jan-Hinnerk; White, Simon; Zadissa, Amonida; Flicek, Paul

    2016-01-01

    The Ensembl gene annotation system has been used to annotate over 70 different vertebrate species across a wide range of genome projects. Furthermore, it generates the automatic alignment-based annotation for the human and mouse GENCODE gene sets. The system is based on the alignment of biological sequences, including cDNAs, proteins and RNA-seq reads, to the target genome in order to construct candidate transcript models. Careful assessment and filtering of these candidate transcripts ultimately leads to the final gene set, which is made available on the Ensembl website. Here, we describe the annotation process in detail. Database URL: http://www.ensembl.org/index.html PMID:27337980

  17. Phylogenetic molecular function annotation

    NASA Astrophysics Data System (ADS)

    Engelhardt, Barbara E.; Jordan, Michael I.; Repo, Susanna T.; Brenner, Steven E.

    2009-07-01

    It is now easier to discover thousands of protein sequences in a new microbial genome than it is to biochemically characterize the specific activity of a single protein of unknown function. The molecular functions of protein sequences have typically been predicted using homology-based computational methods, which rely on the principle that homologous proteins share a similar function. However, some protein families include groups of proteins with different molecular functions. A phylogenetic approach to predicting molecular function (sometimes called "phylogenomics") is an effective alternative. These methods incorporate functional evidence from all characterized members of a family and use the evolutionary history of the protein family to make robust predictions for the uncharacterized proteins. However, they are often difficult to apply on a genome-wide scale because of the time-consuming step of reconstructing the phylogeny of each protein family to be annotated. Our automated approach to function annotation using phylogeny, the SIFTER (Statistical Inference of Function Through Evolutionary Relationships) methodology, uses a statistical graphical model to compute the probabilities of molecular functions for unannotated proteins. Our benchmark tests showed that SIFTER provides accurate functional predictions on various protein families, outperforming other available methods.
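
    A toy sketch of the underlying idea of propagating functional evidence over a phylogeny: annotated leaves vote for the function of an unannotated leaf, with votes down-weighted by path distance in the tree. This simplification is for illustration only and is not SIFTER's statistical graphical model.

      import math

      # Sketch: distance-weighted vote of annotated leaves to predict the function
      # of an unannotated leaf on a toy phylogeny. Not SIFTER's actual model.

      PARENT = {            # node -> (parent, branch_length)
          "A": ("n1", 0.1), "B": ("n1", 0.2),
          "C": ("n2", 0.3), "D": ("n2", 0.1),
          "n1": ("root", 0.4), "n2": ("root", 0.5),
      }

      ANNOTATIONS = {"A": "kinase", "B": "kinase", "C": "phosphatase"}   # D is unannotated

      def path_to_root(node):
          """Map each ancestor of 'node' (and the node itself) to its distance from it."""
          path, dist = {node: 0.0}, 0.0
          while node in PARENT:
              node, length = PARENT[node]
              dist += length
              path[node] = dist
          return path

      def distance(a, b):
          pa, pb = path_to_root(a), path_to_root(b)
          return min(pa[n] + pb[n] for n in pa if n in pb)   # via the lowest common ancestor

      def predict(leaf):
          scores = {}
          for other, function in ANNOTATIONS.items():
              weight = math.exp(-distance(leaf, other))      # closer relatives count more
              scores[function] = scores.get(function, 0.0) + weight
          return max(scores, key=scores.get), scores

      if __name__ == "__main__":
          print(predict("D"))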

  18. Annotation: the savant syndrome.

    PubMed

    Heaton, Pamela; Wallace, Gregory L

    2004-07-01

    Whilst interest has focused on the origin and nature of the savant syndrome for over a century, it is only within the past two decades that empirical group studies have been carried out. The following annotation briefly reviews relevant research and also attempts to address outstanding issues in this research area. Traditionally, savants have been defined as intellectually impaired individuals who nevertheless display exceptional skills within specific domains. However, within the extant literature, cases of savants with developmental and other clinical disorders, but with average intellectual functioning, are increasingly reported. We thus propose that focus should diverge away from IQ scores to encompass discrepancies between functional impairments and unexpected skills. It has long been observed that savant skills are more prevalent in individuals with autism than in those with other disorders. Therefore, in this annotation we seek to explore the parameters of the savant syndrome by considering these skills within the context of neuropsychological accounts of autism. A striking finding amongst those with savant skills, but without the diagnosis of autism, is the presence of cognitive features and behavioural traits associated with the disorder. We thus conclude that autism (or autistic traits) and savant skills are inextricably linked and we should therefore look to autism in our quest to solve the puzzle of the savant syndrome. Copyright 2004 Association for Child Psychology and Psychiatry

  19. Visualizing GO Annotations.

    PubMed

    Supek, Fran; Škunca, Nives

    2017-01-01

    Contemporary techniques in biology produce readouts for large numbers of genes simultaneously, the typical example being differential gene expression measurements. Moreover, those genes are often richly annotated with GO terms that describe gene function and can be used to summarize the results of genome-scale experiments. However, making sense of such GO enrichment analyses may be challenging. For instance, overrepresented GO functions in a set of differentially expressed genes are typically output as a flat list, a format not adequate to capture the complexities of the hierarchical structure of the GO annotation labels. In this chapter, we survey various methods to visualize large, difficult-to-interpret lists of GO terms. We catalog their availability (Web-based or standalone), the main principles they employ in summarizing large lists of GO terms, and the visualization styles they support. These brief commentaries on each tool are intended as a helpful inventory rather than as comprehensive descriptions of the underlying algorithms. Instead, we show examples of their use and suggest that the choice of an appropriate visualization tool may be crucial to the utility of GO in biological discovery.
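
    One summarization principle common to several of the surveyed tools, collapsing a flat list of enriched terms so that only the most specific ones are reported, can be sketched on a toy GO-like hierarchy; the hierarchy and term list are invented and do not correspond to any particular tool.

      # Sketch: compact a flat list of enriched GO-like terms by reporting only
      # terms that have no enriched descendant in the list (toy hierarchy).

      PARENTS = {                          # child -> parents (GO is a DAG)
          "GO:inflammatory response": ["GO:immune response"],
          "GO:immune response": ["GO:biological_process"],
          "GO:T cell activation": ["GO:immune response"],
      }

      def ancestors(term):
          out = set()
          stack = list(PARENTS.get(term, []))
          while stack:
              parent = stack.pop()
              if parent not in out:
                  out.add(parent)
                  stack.extend(PARENTS.get(parent, []))
          return out

      def summarize(enriched):
          """Drop any enriched term that is an ancestor of another enriched term."""
          redundant = set()
          for term in enriched:
              redundant |= ancestors(term) & set(enriched)
          return [t for t in enriched if t not in redundant]

      if __name__ == "__main__":
          hits = ["GO:immune response", "GO:inflammatory response", "GO:T cell activation"]
          print(summarize(hits))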

  20. Managing Development Projects: A Selected, Annotated Bibliography. Annotated Bibliography #5.

    ERIC Educational Resources Information Center

    Chuenyane, Zachariah; And Others

    A selected annotated bibliography on managing development projects, intended for rural development practitioners, highlights items that outline some pressing issues and concerns confronting those involved in rural development in general and rural project management in particular. A section of annotated entries lists 21 publications on project…

  1. Widowed Persons Service: Selected Annotated Bibliography.

    ERIC Educational Resources Information Center

    Bressler, Dawn, Comp.; And Others

    This document presents an annotated bibliography of books and articles on topics relevant to widowhood. These annotations are included: (1) 21 annotations on the grief process; (2) 11 annotations on personal observations about widowhood; (3) 16 annotations on practical problems surrounding widowhood, including legal and financial problems and job…

  2. PAZAR: a framework for collection and dissemination of cis-regulatory sequence annotation

    PubMed Central

    Portales-Casamar, Elodie; Kirov, Stefan; Lim, Jonathan; Lithwick, Stuart; Swanson, Magdalena I; Ticoll, Amy; Snoddy, Jay; Wasserman, Wyeth W

    2007-01-01

    PAZAR is an open-access and open-source database of transcription factor and regulatory sequence annotation, with an associated web interface and programming tools for data submission and extraction. Curated boutique data collections can be maintained and disseminated through the unified schema of the mall-like PAZAR repository. The Pleiades Promoter Project collection of brain-linked regulatory sequences is introduced to demonstrate the depth of annotation possible within PAZAR. PAZAR is open for business. PMID:17916232

  3. The RICORDO approach to semantic interoperability for biomedical data and models: strategy, standards and solutions

    PubMed Central

    2011-01-01

    Background The practice and research of medicine generates considerable quantities of data and model resources (DMRs). Although in principle biomedical resources are re-usable, in practice few can currently be shared. In particular, the clinical communities in physiology and pharmacology research, as well as medical education, (i.e. PPME communities) are facing considerable operational and technical obstacles in sharing data and models. Findings We outline the efforts of the PPME communities to achieve automated semantic interoperability for clinical resource documentation in collaboration with the RICORDO project. Current community practices in resource documentation and knowledge management are overviewed. Furthermore, requirements and improvements sought by the PPME communities to current documentation practices are discussed. The RICORDO plan and effort in creating a representational framework and associated open software toolkit for the automated management of PPME metadata resources is also described. Conclusions RICORDO is providing the PPME community with tools to effect, share and reason over clinical resource annotations. This work is contributing to the semantic interoperability of DMRs through ontology-based annotation by (i) supporting more effective navigation and re-use of clinical DMRs, as well as (ii) sustaining interoperability operations based on the criterion of biological similarity. Operations facilitated by RICORDO will range from automated dataset matching to model merging and managing complex simulation workflows. In effect, RICORDO is contributing to community standards for resource sharing and interoperability. PMID:21878109

  4. NoGOA: predicting noisy GO annotations using evidences and sparse representation.

    PubMed

    Yu, Guoxian; Lu, Chang; Wang, Jun

    2017-07-21

    Gene Ontology (GO) is a community effort to represent functional features of gene products. GO annotations (GOA) provide functional associations between GO terms and gene products. Due to resource limitations, only a small portion of annotations are manually checked by curators, and the others are electronically inferred. Although quality control techniques have been applied to ensure the quality of annotations, the community consistently reports that a considerable number of noisy (or incorrect) annotations remain. Given the wide application of annotations, how to identify noisy annotations is an important yet seldom studied open problem. We introduce a novel approach called NoGOA to predict noisy annotations. NoGOA applies sparse representation on the gene-term association matrix to reduce the impact of noisy annotations, and takes advantage of the sparse representation coefficients to measure the semantic similarity between genes. It then makes a preliminary prediction of the noisy annotations of a gene based on aggregated votes from that gene's semantic neighborhood. Next, NoGOA estimates the ratio of noisy annotations for each evidence code based on direct annotations in GOA files archived at different times, weights entries of the association matrix by the estimated ratios, and propagates weights to ancestors of direct annotations using the GO hierarchy. Finally, it integrates the evidence-weighted association matrix and the aggregated votes to predict noisy annotations. Experiments on archived GOA files of six model species (H. sapiens, A. thaliana, S. cerevisiae, G. gallus, B. taurus and M. musculus) demonstrate that NoGOA achieves significantly better results than other related methods and that removing noisy annotations improves the performance of gene function prediction. The comparative study justifies the effectiveness of integrating evidence codes with sparse representation for predicting noisy GO annotations. Codes and datasets are available at http://mlda.swu.edu.cn/codes.php?name=NoGOA .
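
    A much-simplified sketch of the neighborhood-voting idea: annotation vectors of genes are compared by cosine similarity (standing in for the sparse-representation coefficients used by NoGOA), and an annotation is flagged as suspect when the gene's most similar neighbors rarely carry it. The toy data, similarity measure and threshold are illustrative assumptions.

      import math

      # Sketch: flag potentially noisy annotations by checking whether a gene's
      # most similar genes (cosine similarity over binary annotation vectors)
      # also carry the annotation. Toy data; the threshold is arbitrary.

      GOA = {                                   # gene -> set of GO-like terms
          "g1": {"GO:1", "GO:2", "GO:3"},
          "g2": {"GO:1", "GO:2", "GO:3"},
          "g3": {"GO:1", "GO:2"},
          "g4": {"GO:1", "GO:2", "GO:9"},       # GO:9 is suspicious in this neighborhood
      }

      def cosine(a, b):
          """Cosine similarity of two term sets treated as binary vectors."""
          return len(a & b) / math.sqrt(len(a) * len(b)) if a and b else 0.0

      def suspect_annotations(gene, k=2, threshold=0.5):
          neighbours = sorted((g for g in GOA if g != gene),
                              key=lambda g: cosine(GOA[gene], GOA[g]), reverse=True)[:k]
          flagged = []
          for term in GOA[gene]:
              support = sum(term in GOA[n] for n in neighbours) / k
              if support < threshold:
                  flagged.append((term, support))
          return flagged

      if __name__ == "__main__":
          for gene in GOA:
              print(gene, suspect_annotations(gene))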

  5. Morphosyntactic Annotation of CHILDES Transcripts

    ERIC Educational Resources Information Center

    Sagae, Kenji; Davis, Eric; Lavie, Alon; MacWhinney, Brian; Wintner, Shuly

    2010-01-01

    Corpora of child language are essential for research in child language acquisition and psycholinguistics. Linguistic annotation of the corpora provides researchers with better means for exploring the development of grammatical constructions and their usage. We describe a project whose goal is to annotate the English section of the CHILDES database…

  6. An Annotated Bibliography on Children.

    ERIC Educational Resources Information Center

    Bureau of Libraries and Educational Technology (DHEW/OE), Washington, DC.

    This annotated bibliography is a highly selective list of materials published in the last five years on the major problems, trends, methodologies and achievements in the field of child development. It contains annotated references to approximately 500 books, periodicals, technical reports, government documents, legislative materials, professional…

  7. Drug Education: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Mathieson, Moira B.

    This bibliography consists of a total of 215 entries dealing with drug education, including curriculum guides, and drawn from documents in the ERIC system. There are two sections, the first containing 130 annotated citations of documents and journal articles, and the second containing 85 citations of journal articles without annotations, but with…

  8. Biomedical Compounds from Marine organisms

    PubMed Central

    Jha, Rajeev Kumar; Zi-rong, Xu

    2004-01-01

    The Ocean, which is called the ‘mother of origin of life’, is also the source of structurally unique natural products that are mainly accumulated in living organisms. Several of these compounds show pharmacological activities and are helpful for the invention and discovery of bioactive compounds, primarily for deadly diseases like cancer, acquired immuno-deficiency syndrome (AIDS), arthritis, etc., while other compounds have been developed as analgesics or to treat inflammation, etc. The life-saving drugs are mainly found abundantly in microorganisms, algae and invertebrates, while they are scarce in vertebrates. Modern technologies have opened vast areas of research for the extraction of biomedical compounds from oceans and seas.

  9. National Space Biomedical Research Institute

    NASA Technical Reports Server (NTRS)

    2005-01-01

    NSBRI partners with NASA to develop countermeasures against the deleterious effects of long-duration space flight. NSBRI's science and technology projects are directed toward this goal, which is accomplished by: 1. Designing, testing and validating effective countermeasures to address the biological and environmental impediments to long-term human space flight. 2. Defining the molecular, cellular, organ-level and integrated responses and the mechanistic relationships that ultimately determine these impediments, where such activity fosters the development of novel countermeasures. 3. Establishing biomedical support technologies to maximize human performance in space, reduce biomedical hazards to an acceptable level and deliver quality medical care. 4. Transferring and disseminating the biomedical advances in knowledge and technology acquired through living and working in space to the general benefit of humankind, including the treatment of patients suffering from gravity- and radiation-related conditions on Earth. 5. Ensuring open involvement of the scientific community, industry and the public in the Institute's activities and fostering a robust collaboration with NASA, particularly through JSC.

  11. Morphosyntactic annotation of CHILDES transcripts*

    PubMed Central

    SAGAE, KENJI; DAVIS, ERIC; LAVIE, ALON; MACWHINNEY, BRIAN; WINTNER, SHULY

    2014-01-01

    Corpora of child language are essential for research in child language acquisition and psycholinguistics. Linguistic annotation of the corpora provides researchers with better means for exploring the development of grammatical constructions and their usage. We describe a project whose goal is to annotate the English section of the CHILDES database with grammatical relations in the form of labeled dependency structures. We have produced a corpus of over 18,800 utterances (approximately 65,000 words) with manually curated gold-standard grammatical relation annotations. Using this corpus, we have developed a highly accurate data-driven parser for the English CHILDES data, which we used to automatically annotate the remainder of the English section of CHILDES. We have also extended the parser to Spanish, and are currently working on supporting more languages. The parser and the manually and automatically annotated data are freely available for research purposes. PMID:20334720
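
    For readers unfamiliar with labelled dependency structures, the sketch below shows one minimal way such an annotation could be represented and queried in Python. The utterance, the relation labels (DET, SUBJ, ROOT) and the helper functions are illustrative assumptions, not the CHILDES annotation scheme or the project's parser output format.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Token:
    index: int      # 1-based position in the utterance
    form: str       # surface word
    head: int       # index of the governing token (0 = root)
    relation: str   # grammatical-relation label on the dependency edge

# Hypothetical labelled dependency structure for the utterance "the doggy runs".
# The relation names are illustrative, not the CHILDES label set.
utterance: List[Token] = [
    Token(1, "the",   2, "DET"),
    Token(2, "doggy", 3, "SUBJ"),
    Token(3, "runs",  0, "ROOT"),
]

def head_of(tokens: List[Token], index: int) -> Optional[Token]:
    """Return the governing token of the token at `index`, or None for the root."""
    head = tokens[index - 1].head
    return tokens[head - 1] if head > 0 else None

if __name__ == "__main__":
    for tok in utterance:
        gov = head_of(utterance, tok.index)
        print(f"{tok.form:6s} --{tok.relation}--> {gov.form if gov else 'ROOT'}")
    # Developmental studies might, for example, count overtly realised subjects:
    print(len([t for t in utterance if t.relation == "SUBJ"]), "overt subject(s)")
```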

  12. A semi-supervised learning framework for biomedical event extraction based on hidden topics.

    PubMed

    Zhou, Deyu; Zhong, Dayou

    2015-05-01

    Scientists have devoted decades of effort to understanding the interactions between proteins or RNA production. This information could strengthen current knowledge of drug reactions and of the development of certain diseases. However, the lack of explicit structure in life-science literature, one of the most important sources of this information, prevents computer-based systems from accessing it. Biomedical event extraction, which automatically acquires knowledge of molecular events from research articles, has therefore attracted community-wide efforts recently. Most approaches are based on statistical models and require large-scale annotated corpora to estimate model parameters precisely, but such corpora are usually difficult to obtain in practice. Employing un-annotated data through semi-supervised learning is therefore a feasible solution for biomedical event extraction and is attracting growing interest. In this paper, a semi-supervised learning framework based on hidden topics for biomedical event extraction is presented. In this framework, sentences in the un-annotated corpus are automatically assigned event annotations based on their distances to sentences in the annotated corpus; not only the structures of the sentences but also the hidden topics embedded in them are used to define this distance. The sentences and their newly assigned event annotations, together with the annotated corpus, are then used for training. Experiments were conducted on the multi-level event extraction corpus, a gold-standard corpus. Experimental results show that the proposed framework improves the F-score of biomedical event extraction by more than 2.2% over the state-of-the-art approach. The results suggest that by incorporating un-annotated data, the proposed framework indeed improves the performance of the state-of-the-art event extraction system, and the similarity between sentences might be precisely…
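
    The following sketch illustrates the general idea of projecting event annotations onto un-annotated sentences by topic-based distance. It uses scikit-learn's LDA and cosine distance as stand-ins for the paper's hidden-topic component, omits the structural features the framework also uses, and all sentences and labels are invented.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.metrics.pairwise import cosine_distances

# Toy annotated corpus: sentence text paired with an event label.
annotated = [
    ("IL-2 expression is up-regulated by NF-kappaB binding", "Positive_regulation"),
    ("TNF-alpha blocks phosphorylation of STAT3",            "Negative_regulation"),
]
# Un-annotated sentences to which labels will be projected.
unannotated = [
    "NF-kappaB binding strongly up-regulates IL-6 expression",
    "The inhibitor blocks STAT1 phosphorylation in monocytes",
]

texts = [s for s, _ in annotated] + unannotated

# Hidden-topic representation (LDA over bag-of-words); a stand-in for the
# topic component described in the paper, not its actual model.
bow = CountVectorizer().fit_transform(texts)
topics = LatentDirichletAllocation(n_components=2, random_state=0).fit_transform(bow)

dist = cosine_distances(topics[len(annotated):], topics[:len(annotated)])

# Project the label of the nearest annotated sentence onto each un-annotated one;
# the real framework also uses sentence structure, which is omitted here.
for i, sent in enumerate(unannotated):
    nearest = int(np.argmin(dist[i]))
    print(f"{sent!r} -> {annotated[nearest][1]} (distance {dist[i, nearest]:.3f})")
```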

  13. COGNATE: comparative gene annotation characterizer.

    PubMed

    Wilbrandt, Jeanne; Misof, Bernhard; Niehuis, Oliver

    2017-07-17

    The comparison of gene and genome structures across species has the potential to reveal major trends of genome evolution. However, such a comparative approach is currently hampered by a lack of standardization (e.g., Elliott TA, Gregory TR, Philos Trans Royal Soc B: Biol Sci 370:20140331, 2015). For example, testing the hypothesis that the total amount of coding sequences is a reliable measure of potential proteome diversity (Wang M, Kurland CG, Caetano-Anollés G, PNAS 108:11954, 2011) requires the application of standardized definitions of coding sequence and genes to create both comparable and comprehensive data sets and corresponding summary statistics. However, such standard definitions either do not exist or are not consistently applied. These circumstances call for a standard at the descriptive level using a minimum of parameters as well as an undeviating use of standardized terms, and for software that infers the required data under these strict definitions. The acquisition of a comprehensive, descriptive, and standardized set of parameters and summary statistics for genome publications and further analyses can thus greatly benefit from the availability of an easy-to-use standard tool. We developed a new open-source command-line tool, COGNATE (Comparative Gene Annotation Characterizer), which uses a given genome assembly and its annotation of protein-coding genes for a detailed description of the respective gene and genome structure parameters. Additionally, we revised the standard definitions of gene and genome structures and provide the definitions used by COGNATE as a working draft suggestion for further reference. Complete parameter lists and summary statistics are inferred using this set of definitions to allow downstream analyses and to provide an overview of the genome and gene repertoire characteristics. COGNATE is written in Perl and freely available at the ZFMK homepage ( https://www.zfmk.de/en/COGNATE ) and on github ( https
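
    COGNATE itself is a Perl tool that applies strictly standardized definitions; the Python sketch below only illustrates the kind of per-gene structure parameters (exon count, summed exon and CDS length) that can be derived from a GFF3-style annotation. The records, coordinate conventions, and attribute handling are simplified assumptions.

```python
from collections import defaultdict

# Simplified GFF3-like records: (feature_type, start, end, parent_gene).
# Real GFF3 parsing (attributes, strands, multiple mRNAs per gene) is omitted.
features = [
    ("exon", 100, 250, "gene1"),
    ("exon", 400, 550, "gene1"),
    ("CDS",  130, 250, "gene1"),
    ("CDS",  400, 500, "gene1"),
    ("exon", 900, 1200, "gene2"),
    ("CDS",  950, 1100, "gene2"),
]

def gene_parameters(feats):
    """Per-gene counts and lengths (1-based, end-inclusive coordinates assumed)."""
    stats = defaultdict(lambda: {"exon_count": 0, "exon_bp": 0, "cds_bp": 0})
    for ftype, start, end, gene in feats:
        length = end - start + 1
        if ftype == "exon":
            stats[gene]["exon_count"] += 1
            stats[gene]["exon_bp"] += length
        elif ftype == "CDS":
            stats[gene]["cds_bp"] += length
    return dict(stats)

if __name__ == "__main__":
    for gene, p in gene_parameters(features).items():
        print(gene, p)   # e.g. gene1 {'exon_count': 2, 'exon_bp': 302, 'cds_bp': 222}
```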

  14. Managing, analysing, and integrating big data in medical bioinformatics: open problems and future perspectives.

    PubMed

    Merelli, Ivan; Pérez-Sánchez, Horacio; Gesing, Sandra; D'Agostino, Daniele

    2014-01-01

    The explosion of data in both biomedical research and healthcare systems demands urgent solutions. In particular, research in the omics sciences is moving from a hypothesis-driven to a data-driven approach. Healthcare, in addition, increasingly demands tighter integration with biomedical data in order to promote personalized medicine and to provide better treatments. Efficient analysis and interpretation of Big Data opens new avenues to explore molecular biology, new questions to ask about physiological and pathological states, and new ways to answer these open issues. Such analyses lead to a better understanding of diseases and to the development of better, personalized diagnostics and therapeutics. However, such progress depends directly on the availability of new solutions to deal with this huge amount of information. New paradigms are needed to store and access data, to annotate and integrate it, and finally to infer knowledge and make it available to researchers. Bioinformatics can be viewed as the "glue" for all these processes. A clear awareness of present high-performance computing (HPC) solutions in bioinformatics, of Big Data analysis paradigms for computational biology, and of the issues that remain open in the biomedical and healthcare fields is the starting point for winning this challenge.

  15. Managing, Analysing, and Integrating Big Data in Medical Bioinformatics: Open Problems and Future Perspectives

    PubMed Central

    Merelli, Ivan; Pérez-Sánchez, Horacio; Gesing, Sandra; D'Agostino, Daniele

    2014-01-01

    The explosion of data in both biomedical research and healthcare systems demands urgent solutions. In particular, research in the omics sciences is moving from a hypothesis-driven to a data-driven approach. Healthcare, in addition, increasingly demands tighter integration with biomedical data in order to promote personalized medicine and to provide better treatments. Efficient analysis and interpretation of Big Data opens new avenues to explore molecular biology, new questions to ask about physiological and pathological states, and new ways to answer these open issues. Such analyses lead to a better understanding of diseases and to the development of better, personalized diagnostics and therapeutics. However, such progress depends directly on the availability of new solutions to deal with this huge amount of information. New paradigms are needed to store and access data, to annotate and integrate it, and finally to infer knowledge and make it available to researchers. Bioinformatics can be viewed as the “glue” for all these processes. A clear awareness of present high-performance computing (HPC) solutions in bioinformatics, of Big Data analysis paradigms for computational biology, and of the issues that remain open in the biomedical and healthcare fields is the starting point for winning this challenge. PMID:25254202

  16. Annotations in Refseq (GSC8 Meeting)

    SciTech Connect

    Tatusova, Tatiana

    2009-09-10

    The Genomic Standards Consortium was formed in September 2005. It is an international, open-membership working body which promotes standardization in the description of genomes and the exchange and integration of genomic data. The 2009 meeting was an activity of a five-year "Research Coordination Network" grant funded by the National Science Foundation and was organized and held at the DOE Joint Genome Institute, with organizational support provided by the JGI and by the University of California - San Diego. Tatiana Tatusova of NCBI discusses "Annotations in Refseq" at the Genomic Standards Consortium's 8th meeting at the DOE JGI in Walnut Creek, Calif. on Sept. 10, 2009.

  17. Annotations in Refseq (GSC8 Meeting)

    ScienceCinema

    Tatusova, Tatiana

    2016-07-12

    The Genomic Standards Consortium was formed in September 2005. It is an international, open-membership working body which promotes standardization in the description of genomes and the exchange and integration of genomic data. The 2009 meeting was an activity of a five-year "Research Coordination Network" grant funded by the National Science Foundation and was organized and held at the DOE Joint Genome Institute, with organizational support provided by the JGI and by the University of California - San Diego. Tatiana Tatusova of NCBI discusses "Annotations in Refseq" at the Genomic Standards Consortium's 8th meeting at the DOE JGI in Walnut Creek, Calif. on Sept. 10, 2009.

  18. Trends in Biomedical Education.

    ERIC Educational Resources Information Center

    Peppas, Nicholas A.; Mallinson, Richard G.

    1982-01-01

    An analysis of trends in biomedical education within chemical education is presented. Data used for the analysis included: type/level of course, subjects taught, and textbook preferences. Among other results, the 1980 survey indicates that 28 of the 79 responding schools offer at least one course in biomedical engineering. (JN)

  20. Biomedical ground lead system

    NASA Technical Reports Server (NTRS)

    1972-01-01

    The design and verification tests for the biomedical ground lead system of Apollo biomedical monitors are presented. Major efforts were made to provide a low-impedance path to ground, reduce noise and artifacts in ECG signals, and limit the current flowing in the ground electrode of the system.

  1. Biomedical applications engineering tasks

    NASA Technical Reports Server (NTRS)

    Laenger, C. J., Sr.

    1976-01-01

    The engineering tasks performed in response to needs articulated by clinicians are described. Initial contacts were made with these clinician-technology requestors by the Southwest Research Institute NASA Biomedical Applications Team. The basic purpose of the program was to effectively transfer aerospace technology into functional hardware to solve real biomedical problems.

  2. Exploring subdomain variation in biomedical language

    PubMed Central

    2011-01-01

    Background Applications of Natural Language Processing (NLP) technology to biomedical texts have generated significant interest in recent years. In this paper we identify and investigate the phenomenon of linguistic subdomain variation within the biomedical domain, i.e., the extent to which different subject areas of biomedicine are characterised by different linguistic behaviour. While variation at a coarser domain level such as between newswire and biomedical text is well-studied and known to affect the portability of NLP systems, we are the first to conduct an extensive investigation into more fine-grained levels of variation. Results Using the large OpenPMC text corpus, which spans the many subdomains of biomedicine, we investigate variation across a number of lexical, syntactic, semantic and discourse-related dimensions. These dimensions are chosen for their relevance to the performance of NLP systems. We use clustering techniques to analyse commonalities and distinctions among the subdomains. Conclusions We find that while patterns of inter-subdomain variation differ somewhat from one feature set to another, robust clusters can be identified that correspond to intuitive distinctions such as that between clinical and laboratory subjects. In particular, subdomains relating to genetics and molecular biology, which are the most common sources of material for training and evaluating biomedical NLP tools, are not representative of all biomedical subdomains. We conclude that an awareness of subdomain variation is important when considering the practical use of language processing applications by biomedical researchers. PMID:21619603
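
    A minimal sketch of the clustering step described above: represent each subdomain by a feature vector and group subdomains by similarity. The subdomain names, the three features, and the use of average-linkage hierarchical clustering are placeholders, not the authors' actual feature sets or method.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Invented feature vectors per subdomain, e.g. [mean sentence length,
# passive-voice rate, novel-token rate]; real values would be measured per corpus.
subdomains = ["genetics", "molecular_biology", "clinical_medicine", "surgery"]
X = np.array([
    [24.1, 0.32, 0.11],
    [23.8, 0.32, 0.10],
    [19.5, 0.21, 0.07],
    [18.9, 0.19, 0.08],
])

# Standardise each feature, then cluster with average-linkage agglomeration.
Z = linkage((X - X.mean(0)) / X.std(0), method="average")
labels = fcluster(Z, t=2, criterion="maxclust")

for name, lab in zip(subdomains, labels):
    print(f"{name:18s} cluster {lab}")
# Expected grouping: the two laboratory subdomains vs. the two clinical ones.
```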

  3. Simbody: multibody dynamics for biomedical research

    PubMed Central

    Sherman, Michael A.; Seth, Ajay; Delp, Scott L.

    2015-01-01

    Multibody software designed for mechanical engineering has been successfully employed in biomedical research for many years. For real time operation some biomedical researchers have also adapted game physics engines. However, these tools were built for other purposes and do not fully address the needs of biomedical researchers using them to analyze the dynamics of biological structures and make clinically meaningful recommendations. We are addressing this problem through the development of an open source, extensible, high performance toolkit including a multibody mechanics library aimed at the needs of biomedical researchers. The resulting code, Simbody, supports research in a variety of fields including neuromuscular, prosthetic, and biomolecular simulation, and related research such as biologically-inspired design and control of humanoid robots and avatars. Simbody is the dynamics engine behind OpenSim, a widely used biomechanics simulation application. This article reviews issues that arise uniquely in biomedical research, and reports on the architecture, theory, and computational methods Simbody uses to address them. By addressing these needs explicitly Simbody provides a better match to the needs of researchers than can be obtained by adaptation of mechanical engineering or gaming codes. Simbody is a community resource, free for any purpose. We encourage wide adoption and invite contributions to the code base at https://simtk.org/home/simbody. PMID:25866705

  4. Simbody: multibody dynamics for biomedical research.

    PubMed

    Sherman, Michael A; Seth, Ajay; Delp, Scott L

    Multibody software designed for mechanical engineering has been successfully employed in biomedical research for many years. For real time operation some biomedical researchers have also adapted game physics engines. However, these tools were built for other purposes and do not fully address the needs of biomedical researchers using them to analyze the dynamics of biological structures and make clinically meaningful recommendations. We are addressing this problem through the development of an open source, extensible, high performance toolkit including a multibody mechanics library aimed at the needs of biomedical researchers. The resulting code, Simbody, supports research in a variety of fields including neuromuscular, prosthetic, and biomolecular simulation, and related research such as biologically-inspired design and control of humanoid robots and avatars. Simbody is the dynamics engine behind OpenSim, a widely used biomechanics simulation application. This article reviews issues that arise uniquely in biomedical research, and reports on the architecture, theory, and computational methods Simbody uses to address them. By addressing these needs explicitly Simbody provides a better match to the needs of researchers than can be obtained by adaptation of mechanical engineering or gaming codes. Simbody is a community resource, free for any purpose. We encourage wide adoption and invite contributions to the code base at https://simtk.org/home/simbody.

  5. 76 FR 1212 - Joint Biomedical Laboratory Research and Development and Clinical Science Research and...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-01-07

    ... AFFAIRS Joint Biomedical Laboratory Research and Development and Clinical Science Research and Development... Eligibility of the Joint Biomedical Laboratory Research and Development and Clinical Science Research and... areas of biomedical, behavioral and clinical science research. The panel meeting will be open to the...

  6. 77 FR 23810 - Joint Biomedical Laboratory Research and Development and Clinical Science Research and...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-04-20

    ... AFFAIRS Joint Biomedical Laboratory Research and Development and Clinical Science Research and Development... Biomedical Laboratory Research and Development and Clinical Science Research and Development Services... areas of biomedical, behavioral and clinical science research. The panel meetings will be open to the...

  7. 76 FR 79273 - Joint Biomedical Laboratory Research and Development and Clinical Science Research and...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-12-21

    ... AFFAIRS Joint Biomedical Laboratory Research and Development and Clinical Science Research and Development... Eligibility of the Joint Biomedical Laboratory Research and Development and Clinical Science Research and... biomedical, behavioral, and clinical science research. The panel meeting will be open to the public for...

  8. GFam: a platform for automatic annotation of gene families

    PubMed Central

    Sasidharan, Rajkumar; Nepusz, Tamás; Swarbreck, David; Huala, Eva; Paccanaro, Alberto

    2012-01-01

    We have developed GFam, a platform for automatic annotation of gene/protein families. GFam provides a framework for genome initiatives and model organism resources to build domain-based families, derive meaningful functional labels, and propagate functional annotation seamlessly across periodic genome updates. GFam is a hybrid approach that uses a greedy algorithm to chain component domains from InterPro annotation provided by its 12 member resources, followed by a sequence-based connected component analysis of un-annotated sequence regions, to derive a consensus domain architecture for each sequence and subsequently generate families based on common architectures. Our integrated approach increases sequence coverage by 7.2 percentage points and residue coverage by 14.6 percentage points relative to the best single-constituent database within InterPro for the Arabidopsis proteome. The true power of GFam lies in maximizing annotation provided by the different InterPro data sources that offer resource-specific coverage for different regions of a sequence. GFam’s capability to capture higher sequence and residue coverage can be useful for genome annotation, comparative genomics and functional studies. GFam is general-purpose software and can be used for any collection of protein sequences. The software is open source and can be obtained from http://www.paccanarolab.org/software/gfam/. PMID:22790981
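
    The final grouping step described above, assigning sequences that share an identical consensus domain architecture to the same family, can be sketched as follows. The accessions are placeholders, and the greedy domain chaining and connected-component analysis that produce the architectures are omitted.

```python
from collections import defaultdict

# Consensus domain architecture per sequence: an ordered tuple of domain
# accessions (placeholders here); deriving these from InterPro hits via the
# greedy chaining and connected-component steps is the part GFam implements.
architectures = {
    "AT1G01010.1": ("PF00319", "PF01486"),
    "AT1G01020.1": ("PF00319", "PF01486"),
    "AT1G01030.1": ("PF02362",),
    "AT1G01040.1": ("PF00319", "PF01486"),
}

def families_by_architecture(arch_map):
    """Group sequence identifiers that share an identical ordered architecture."""
    families = defaultdict(list)
    for seq_id, arch in arch_map.items():
        families[arch].append(seq_id)
    return families

if __name__ == "__main__":
    for arch, members in families_by_architecture(architectures).items():
        print("-".join(arch), "->", sorted(members))
```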

  9. Gene Ontology annotations and resources.

    PubMed

    Blake, J A; Dolan, M; Drabkin, H; Hill, D P; Li, Ni; Sitnikov, D; Bridges, S; Burgess, S; Buza, T; McCarthy, F; Peddinti, D; Pillai, L; Carbon, S; Dietze, H; Ireland, A; Lewis, S E; Mungall, C J; Gaudet, P; Chrisholm, R L; Fey, P; Kibbe, W A; Basu, S; Siegele, D A; McIntosh, B K; Renfro, D P; Zweifel, A E; Hu, J C; Brown, N H; Tweedie, S; Alam-Faruque, Y; Apweiler, R; Auchinchloss, A; Axelsen, K; Bely, B; Blatter, M -C; Bonilla, C; Bouguerleret, L; Boutet, E; Breuza, L; Bridge, A; Chan, W M; Chavali, G; Coudert, E; Dimmer, E; Estreicher, A; Famiglietti, L; Feuermann, M; Gos, A; Gruaz-Gumowski, N; Hieta, R; Hinz, C; Hulo, C; Huntley, R; James, J; Jungo, F; Keller, G; Laiho, K; Legge, D; Lemercier, P; Lieberherr, D; Magrane, M; Martin, M J; Masson, P; Mutowo-Muellenet, P; O'Donovan, C; Pedruzzi, I; Pichler, K; Poggioli, D; Porras Millán, P; Poux, S; Rivoire, C; Roechert, B; Sawford, T; Schneider, M; Stutz, A; Sundaram, S; Tognolli, M; Xenarios, I; Foulgar, R; Lomax, J; Roncaglia, P; Khodiyar, V K; Lovering, R C; Talmud, P J; Chibucos, M; Giglio, M Gwinn; Chang, H -Y; Hunter, S; McAnulla, C; Mitchell, A; Sangrador, A; Stephan, R; Harris, M A; Oliver, S G; Rutherford, K; Wood, V; Bahler, J; Lock, A; Kersey, P J; McDowall, D M; Staines, D M; Dwinell, M; Shimoyama, M; Laulederkind, S; Hayman, T; Wang, S -J; Petri, V; Lowry, T; D'Eustachio, P; Matthews, L; Balakrishnan, R; Binkley, G; Cherry, J M; Costanzo, M C; Dwight, S S; Engel, S R; Fisk, D G; Hitz, B C; Hong, E L; Karra, K; Miyasato, S R; Nash, R S; Park, J; Skrzypek, M S; Weng, S; Wong, E D; Berardini, T Z; Huala, E; Mi, H; Thomas, P D; Chan, J; Kishore, R; Sternberg, P; Van Auken, K; Howe, D; Westerfield, M

    2013-01-01

    The Gene Ontology (GO) Consortium (GOC, http://www.geneontology.org) is a community-based bioinformatics resource that classifies gene product function through the use of structured, controlled vocabularies. Over the past year, the GOC has implemented several processes to increase the quantity, quality and specificity of GO annotations. First, the number of manual, literature-based annotations has grown at an increasing rate. Second, as a result of a new 'phylogenetic annotation' process, manually reviewed, homology-based annotations are becoming available for a broad range of species. Third, the quality of GO annotations has been improved through a streamlined process for, and automated quality checks of, GO annotations deposited by different annotation groups. Fourth, the consistency and correctness of the ontology itself has increased by using automated reasoning tools. Finally, the GO has been expanded not only to cover new areas of biology through focused interaction with experts, but also to capture greater specificity in all areas of the ontology using tools for adding new combinatorial terms. The GOC works closely with other ontology developers to support integrated use of terminologies. The GOC supports its user community through the use of e-mail lists, social media and web-based resources.

  10. An integrated computational pipeline and database to support whole-genome sequence annotation

    PubMed Central

    Mungall, CJ; Misra, S; Berman, BP; Carlson, J; Frise, E; Harris, N; Marshall, B; Shu, S; Kaminker, JS; Prochnik, SE; Smith, CD; Smith, E; Tupy, JL; Wiel, C; Rubin, GM; Lewis, SE

    2002-01-01

    We describe here our experience in annotating the Drosophila melanogaster genome sequence, in the course of which we developed several new open-source software tools and a database schema to support large-scale genome annotation. We have developed these into an integrated and reusable software system for whole-genome annotation. The key contributions to overall annotation quality are the marshalling of high-quality sequences for alignments and the design of a system with an adaptable and expandable architecture. PMID:12537570

  11. Empirical data on corpus design and usage in biomedical natural language processing

    PubMed Central

    Cohen, K. Bretonnel; Ogren, Philip V.; Fox, Lynne; Hunter, Lawrence

    2005-01-01

    This paper describes the designs of six publicly available biomedical corpora. We then present usage data for the six corpora. We show that corpora that are carefully annotated with respect to structural and linguistic characteristics and that are distributed in standard formats are more widely used than corpora that are not. These findings have implications for the design of the next generation of biomedical corpora. PMID:16779021

  12. Empirical data on corpus design and usage in biomedical natural language processing.

    PubMed

    Cohen, K Bretonnel; Fox, Lynne; Ogren, Philip V; Hunter, Lawrence

    2005-01-01

    This paper describes the design of six publicly available biomedical corpora. We then present usage data for the six corpora. We show that corpora that are carefully annotated with respect to structural and linguistic characteristics and that are distributed in standard formats are more widely used than corpora that are not. These findings have implications for the design of the next generation of biomedical corpora.

  13. Annotated Bibliography; Freedom of Information Center Reports and Summary Papers.

    ERIC Educational Resources Information Center

    Freedom of Information Center, Columbia, MO.

    This bibliography lists and annotates almost 400 information reports, opinion papers, and summary papers dealing with freedom of information. Topics covered include the nature of press freedom and increased press efforts toward more open access to information; the press situation in many foreign countries, including France, Sweden, Communist…

  14. Convolutional Neural Networks for Biomedical Text Classification: Application in Indexing Biomedical Articles.

    PubMed

    Rios, Anthony; Kavuluru, Ramakanth

    2015-09-01

    Building high-accuracy text classifiers is an important task in biomedicine given the wealth of information hidden in unstructured narratives such as research articles and clinical documents. Due to large feature spaces, traditionally, discriminative approaches such as logistic regression and support vector machines with n-gram and semantic features (e.g., named entities) have been used for text classification, where additional performance gains are typically made through feature selection and ensemble approaches. In this paper, we demonstrate that a more direct approach using convolutional neural networks (CNNs) outperforms several traditional approaches in biomedical text classification with the specific use-case of assigning medical subject headings (or MeSH terms) to biomedical articles. Trained annotators at the National Library of Medicine (NLM) assign on average 13 codes to each biomedical article, thus semantically indexing scientific literature to support NLM's PubMed search system. Recent evidence suggests that effective automated efforts for MeSH term assignment start with binary classifiers for each term. In this paper, we use CNNs to build binary text classifiers and achieve an absolute improvement of over 3% in macro F-score over a set of selected hard-to-classify MeSH terms when compared with the best prior results on a public dataset. Additional experiments on 50 high-frequency terms in the dataset also show improvements with CNNs. Our results indicate the strong potential of CNNs in biomedical text classification tasks.
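
    A minimal sketch of one such binary classifier, using a Keras embedding layer, a single 1D convolution, and global max pooling. The vocabulary size, sequence length, and other hyperparameters are placeholder assumptions, and training data handling is not shown; the paper's actual architecture and tuning may differ.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Placeholder hyperparameters; the paper tunes these per task.
VOCAB_SIZE = 20000    # word-index vocabulary size
MAX_LEN = 300         # abstracts truncated/padded to this many tokens
EMBED_DIM = 100

def build_mesh_term_classifier() -> tf.keras.Model:
    """One-vs-rest CNN: does this abstract warrant a given MeSH term (yes/no)?"""
    model = models.Sequential([
        layers.Input(shape=(MAX_LEN,)),
        layers.Embedding(VOCAB_SIZE, EMBED_DIM),
        layers.Conv1D(filters=128, kernel_size=5, activation="relu"),
        layers.GlobalMaxPooling1D(),
        layers.Dense(64, activation="relu"),
        layers.Dense(1, activation="sigmoid"),   # probability the term applies
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model

if __name__ == "__main__":
    build_mesh_term_classifier().summary()
    # Training would call model.fit(padded_token_ids, binary_labels, ...)
    # with one such classifier per MeSH term, as described in the abstract.
```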

  15. Nominalization and Alternations in Biomedical Language

    PubMed Central

    Cohen, K. Bretonnel; Palmer, Martha; Hunter, Lawrence

    2008-01-01

    Background This paper presents data on alternations in the argument structure of common domain-specific verbs and their associated verbal nominalizations in the PennBioIE corpus. Alternation is the term in theoretical linguistics for variations in the surface syntactic form of verbs, e.g. the different forms of stimulate in FSH stimulates follicular development and follicular development is stimulated by FSH. The data is used to assess the implications of alternations for biomedical text mining systems and to test the fit of the sublanguage model to biomedical texts. Methodology/Principal Findings We examined 1,872 tokens of the ten most common domain-specific verbs or their zero-related nouns in the PennBioIE corpus and labelled them for the presence or absence of three alternations. We then annotated the arguments of 746 tokens of the nominalizations related to these verbs and counted alternations related to the presence or absence of arguments and to the syntactic position of non-absent arguments. We found that alternations are quite common both for verbs and for nominalizations. We also found a previously undescribed alternation involving an adjectival present participle. Conclusions/Significance We found that even in this semantically restricted domain, alternations are quite common, and alternations involving nominalizations are exceptionally diverse. Nonetheless, the sublanguage model applies to biomedical language. We also report on a previously undescribed alternation involving an adjectival present participle. PMID:18779866

  16. Microtask crowdsourcing for disease mention annotation in PubMed abstracts.

    PubMed

    Good, Benjamin M; Nanis, Max; Wu, Chunlei; Su, Andrew I

    2015-01-01

    Identifying concepts and relationships in biomedical text enables knowledge to be applied in computational analyses. Many biological natural language processing (BioNLP) projects attempt to address this challenge, but the state of the art still leaves much room for improvement. Progress in BioNLP research depends on large, annotated corpora for evaluating information extraction systems and training machine learning models. Traditionally, such corpora are created by small numbers of expert annotators often working over extended periods of time. Recent studies have shown that workers on microtask crowdsourcing platforms such as Amazon's Mechanical Turk (AMT) can, in aggregate, generate high-quality annotations of biomedical text. Here, we investigated the use of the AMT in capturing disease mentions in PubMed abstracts. We used the NCBI Disease corpus as a gold standard for refining and benchmarking our crowdsourcing protocol. After several iterations, we arrived at a protocol that reproduced the annotations of the 593 documents in the 'training set' of this gold standard with an overall F measure of 0.872 (precision 0.862, recall 0.883). The output can also be tuned to optimize for precision (max = 0.984 when recall = 0.269) or recall (max = 0.980 when precision = 0.436). Each document was completed by 15 workers, and their annotations were merged based on a simple voting method. In total 145 workers combined to complete all 593 documents in the span of 9 days at a cost of $.066 per abstract per worker. The quality of the annotations, as judged with the F measure, increases with the number of workers assigned to each task; however minimal performance gains were observed beyond 8 workers per task. These results add further evidence that microtask crowdsourcing can be a valuable tool for generating well-annotated corpora in BioNLP. Data produced for this analysis are available at http://figshare.com/articles/Disease_Mention_Annotation_with_Mechanical_Turk/1126402.
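
    The merging step can be sketched as a simple vote over identical spans, in the spirit of the abstract's description. The span representation and the acceptance threshold are assumptions; the study's exact merging rule and tuning may differ.

```python
from collections import Counter

# Worker responses for one abstract: each worker submits a set of character
# spans (start, end) they consider disease mentions. Spans here are invented.
worker_spans = {
    "worker_01": {(10, 28), (102, 115)},
    "worker_02": {(10, 28)},
    "worker_03": {(10, 28), (102, 115), (200, 212)},
    "worker_04": {(10, 28), (102, 115)},
    "worker_05": {(12, 28)},
}

def merge_by_vote(responses, min_votes):
    """Keep a span if at least min_votes workers marked exactly that span."""
    votes = Counter(span for spans in responses.values() for span in spans)
    return sorted(span for span, n in votes.items() if n >= min_votes)

if __name__ == "__main__":
    # Requiring 3 of 5 votes keeps (10, 28) and (102, 115) but drops the
    # singleton spans; raising the threshold trades recall for precision,
    # mirroring the precision/recall tuning reported in the abstract.
    print(merge_by_vote(worker_spans, min_votes=3))
```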

  17. A multi-ontology approach to annotate scientific documents based on a modularization technique.

    PubMed

    Gomes, Priscilla Corrêa E Castro; Moura, Ana Maria de Carvalho; Cavalcanti, Maria Cláudia

    2015-12-01

    Scientific text annotation has become an important task for biomedical scientists. Nowadays, there is an increasing need for the development of intelligent systems to support new scientific findings. Public databases available on the Web provide useful data, but much more useful information is only accessible in scientific texts. Text annotation may help, as it relies on the use of ontologies to maintain annotations based on a uniform vocabulary. However, it is difficult to use an ontology, especially one that covers a large domain. In addition, since scientific texts explore multiple domains, which are covered by distinct ontologies, the task becomes even more difficult. Moreover, there are dozens of ontologies in the biomedical area, and they are usually large in terms of the number of concepts. It is in this context that ontology modularization can be useful. This work presents an approach to annotate scientific documents using modules of different ontologies, which are built according to a module extraction technique. The main idea is to analyze a set of single-ontology annotations on a text to identify the user's interests. Based on these annotations, a set of modules is extracted from distinct ontologies and made available to the user for complementary annotation. The reduced size and focus of the extracted modules tend to facilitate the annotation task. An experiment was conducted to evaluate this approach, with the participation of a bioinformatics specialist from the Laboratory of Peptides and Proteins of the IOC/Fiocruz, who was interested in discovering new drug targets for combating tropical diseases.
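
    As a rough illustration only, the sketch below extracts a "module" as a bounded neighborhood of seed concepts in an is-a graph using networkx. This locality-based shortcut is not the modularization technique used in the paper, and the concept identifiers are invented.

```python
import networkx as nx

# Toy is-a hierarchy for one ontology; edges point from child to parent.
# Concept identifiers are invented placeholders.
onto = nx.DiGraph()
onto.add_edges_from([
    ("tropical_disease", "disease"),
    ("malaria", "tropical_disease"),
    ("dengue", "tropical_disease"),
    ("drug_target", "molecular_entity"),
    ("protease", "drug_target"),
])

def extract_module(graph, seeds, radius=1):
    """Concepts within `radius` is-a steps (either direction) of any seed."""
    module = set()
    undirected = graph.to_undirected(as_view=True)
    for seed in seeds:
        module |= set(nx.ego_graph(undirected, seed, radius=radius).nodes)
    return graph.subgraph(module)

if __name__ == "__main__":
    # Seeds come from the user's existing single-ontology annotations.
    module = extract_module(onto, seeds={"malaria", "protease"}, radius=1)
    print(sorted(module.nodes))
    # A smaller, focused module is then offered for complementary annotation.
```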

  18. [Biomedical investigation in Mexico].

    PubMed

    Pérez-Tamayo, Ruy

    2004-01-01

    Biomedical research as a professional specialty developed in the Western World in the following four stages: 1) primitive medicine based on magico-religious concepts; 2) Hippocratic medicine (500 AD), which renounced supernatural ideas on disease; 3) scientific medicine (1543), which eschewed tradition and authority; and 4) finally, in 1813, the first full-time professional biomedical investigator, Claude Bernard, was appointed in France. Nonetheless, the first full-time professional biomedical investigator in Mexico did not appear until 1939, and the number is still growing despite present restrictions on investigator growth and development.

  19. Annotated Bibliography on Religious Development.

    ERIC Educational Resources Information Center

    Bucher, Anton A.; Reich, K. Helmut

    1991-01-01

    Presents an annotated bibliography on religious development that covers the areas of psychology and religion, measurement of religiousness, religious development during the life cycle, religious experiences, conversion, religion and morality, and images of God. (Author/BB)

  20. Patient Education: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Simmons, Jeannette

    Topics included in this annotated bibliography on patient education are (1) background on development of patient education programs, (2) patient education interventions, (3) references for health professionals, and (4) research and evaluation in patient education. (TA)

  1. Hopi Linguistics: An Annotated Bibliography

    ERIC Educational Resources Information Center

    Seaman, P. David

    1977-01-01

    This is a preliminary research-oriented bibliography on the Hopi language. All known items, through mid-1976, are included, with an annotation for each item sketching its nature and/or possible value. (Author/RM)

  2. Butternut (Juglans cinerea) annotated bibliography.

    Treesearch

    M.E. Ostry; M.J. Moore; S.A.N. Worrall

    2003-01-01

    An annotated bibliography of the major literature related to butternut (Juglans cinerea) from 1890 to 2002. Includes 230 citations and a topical index. Topics include diseases, conservation, genetics, insect pests, silvics, nut production, propagation, silviculture, and utilization.

  3. Publication Production: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Firman, Anthony H.

    1994-01-01

    Offers brief annotations of 52 articles and papers on document production (from the Society for Technical Communication's journal and proceedings) on 9 topics: information processing, document design, using color, typography, tables, illustrations, photography, printing and binding, and production management. (SR)

  5. Quantifying the Impact and Extent of Undocumented Biomedical Synonymy

    PubMed Central

    Blair, David R.; Wang, Kanix; Nestorov, Svetlozar; Evans, James A.; Rzhetsky, Andrey

    2014-01-01

    Synonymous relationships among biomedical terms are extensively annotated within specialized terminologies, implying that synonymy is important for practical computational applications within this field. It remains unclear, however, whether text mining actually benefits from documented synonymy and whether existing biomedical thesauri provide adequate coverage of these linguistic relationships. In this study, we examine the impact and extent of undocumented synonymy within a very large compendium of biomedical thesauri. First, we demonstrate that missing synonymy has a significant negative impact on named entity normalization, an important problem within the field of biomedical text mining. To estimate the amount of synonymy currently missing from thesauri, we develop a probabilistic model for the construction of synonym terminologies that is capable of handling a wide range of potential biases, and we evaluate its performance using the broader domain of near-synonymy among general English words. Our model predicts that over 90% of these relationships are currently undocumented, a result that we support experimentally through “crowd-sourcing.” Finally, we apply our model to biomedical terminologies and predict that they are missing the vast majority (>90%) of the synonymous relationships they intend to document. Overall, our results expose the dramatic incompleteness of current biomedical thesauri and suggest the need for “next-generation,” high-coverage lexical terminologies. PMID:25255227

  6. Quantifying the impact and extent of undocumented biomedical synonymy.

    PubMed

    Blair, David R; Wang, Kanix; Nestorov, Svetlozar; Evans, James A; Rzhetsky, Andrey

    2014-09-01

    Synonymous relationships among biomedical terms are extensively annotated within specialized terminologies, implying that synonymy is important for practical computational applications within this field. It remains unclear, however, whether text mining actually benefits from documented synonymy and whether existing biomedical thesauri provide adequate coverage of these linguistic relationships. In this study, we examine the impact and extent of undocumented synonymy within a very large compendium of biomedical thesauri. First, we demonstrate that missing synonymy has a significant negative impact on named entity normalization, an important problem within the field of biomedical text mining. To estimate the amount of synonymy currently missing from thesauri, we develop a probabilistic model for the construction of synonym terminologies that is capable of handling a wide range of potential biases, and we evaluate its performance using the broader domain of near-synonymy among general English words. Our model predicts that over 90% of these relationships are currently undocumented, a result that we support experimentally through "crowd-sourcing." Finally, we apply our model to biomedical terminologies and predict that they are missing the vast majority (>90%) of the synonymous relationships they intend to document. Overall, our results expose the dramatic incompleteness of current biomedical thesauri and suggest the need for "next-generation," high-coverage lexical terminologies.

  7. Automated annotation of chemical names in the literature with tunable accuracy

    PubMed Central

    2011-01-01

    Background A significant portion of the biomedical and chemical literature refers to small molecules. The accurate identification and annotation of compound names that are relevant to the topic of a given publication can establish links between scientific publications and various chemical and life science databases. Manual annotation is the preferred method for this task because well-trained indexers can understand the paper topics as well as recognize key terms. However, considering the hundreds of thousands of new papers published annually, an automatic annotation system with high precision and relevance can be a useful complement to manual annotation. Results An automated chemical name annotation system, MeSH Automated Annotations (MAA), was developed to annotate small molecule names in scientific abstracts with tunable accuracy. This system aims to reproduce the MeSH term annotations on biomedical and chemical literature that would be created by indexers. When automated free-text matching was compared with manual indexing of 26 thousand MEDLINE abstracts, more than 40% of the automated annotations were false-positive (FP) cases. To reduce the FP rate, MAA incorporated several filters to remove "incorrect" annotations caused by nonspecific, partial, and low-relevance chemical names. In part, relevance was measured by the position of the chemical name in the text. Tunable accuracy was obtained by adding or restricting the sections of the text scanned for chemical names. The best precision obtained was 96% with a 28% recall rate. The best performance of MAA, as measured with the F statistic, was 66%, which compares favorably to other chemical name annotation systems. Conclusions Accurate chemical name annotation can help researchers not only identify important chemical names in abstracts, but also match unindexed and unstructured abstracts to chemical records. The current work is tested against MEDLINE, but the algorithm is not specific to this corpus and it is possible…
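
    A minimal sketch of the tunable filtering idea: dictionary matches are kept or dropped depending on the section scanned, a stoplist of nonspecific names, and a length threshold. The stoplist, section names, and thresholds are invented placeholders rather than MAA's actual rules.

```python
import re

# Invented filters; MAA's actual stoplists and relevance rules are more elaborate.
NONSPECIFIC = {"water", "lead", "gold", "acid"}   # chemically ambiguous names
MIN_LENGTH = 3                                     # drop very short matches

def find_chemicals(text, dictionary):
    """Naive dictionary matching of chemical names in a block of text."""
    hits = []
    for name in dictionary:
        for m in re.finditer(r"\b" + re.escape(name) + r"\b", text, re.IGNORECASE):
            hits.append((name, m.start()))
    return hits

def annotate(sections, dictionary, scan=("title", "abstract")):
    """Scan only the requested sections; tightening `scan` raises precision."""
    annotations = []
    for section in scan:
        for name, pos in find_chemicals(sections.get(section, ""), dictionary):
            if name.lower() in NONSPECIFIC or len(name) < MIN_LENGTH:
                continue                       # filter likely false positives
            annotations.append({"name": name, "section": section, "offset": pos})
    return annotations

if __name__ == "__main__":
    record = {
        "title": "Aspirin and caffeine modulate platelet aggregation",
        "abstract": "We compare aspirin with gold standards in water-based assays.",
    }
    print(annotate(record, dictionary=["aspirin", "caffeine", "gold", "water"]))
```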

  8. Gene Ontology Annotations and Resources

    PubMed Central

    2013-01-01

    The Gene Ontology (GO) Consortium (GOC, http://www.geneontology.org) is a community-based bioinformatics resource that classifies gene product function through the use of structured, controlled vocabularies. Over the past year, the GOC has implemented several processes to increase the quantity, quality and specificity of GO annotations. First, the number of manual, literature-based annotations has grown at an increasing rate. Second, as a result of a new ‘phylogenetic annotation’ process, manually reviewed, homology-based annotations are becoming available for a broad range of species. Third, the quality of GO annotations has been improved through a streamlined process for, and automated quality checks of, GO annotations deposited by different annotation groups. Fourth, the consistency and correctness of the ontology itself has increased by using automated reasoning tools. Finally, the GO has been expanded not only to cover new areas of biology through focused interaction with experts, but also to capture greater specificity in all areas of the ontology using tools for adding new combinatorial terms. The GOC works closely with other ontology developers to support integrated use of terminologies. The GOC supports its user community through the use of e-mail lists, social media and web-based resources. PMID:23161678

  9. caTissue Suite to OpenSpecimen: Developing an extensible, open source, web-based biobanking management system.

    PubMed

    McIntosh, Leslie D; Sharma, Mukesh K; Mulvihill, David; Gupta, Snehil; Juehne, Anthony; George, Bijoy; Khot, Suhas B; Kaushal, Atul; Watson, Mark A; Nagarajan, Rakesh

    2015-10-01

    The National Cancer Institute (NCI) Cancer Biomedical Informatics Grid® (caBIG®) program established standards and best practices for biorepository data management by creating an infrastructure to propagate biospecimen resource sharing while maintaining data integrity and security. caTissue Suite, a biospecimen data management software tool, evolved from this effort. More recently, caTissue Suite has continued to evolve as an open-source initiative known as OpenSpecimen. The essential functionality of OpenSpecimen includes the capture and representation of highly granular, hierarchically structured data for biospecimen processing, quality assurance, tracking, and annotation. Ideal for multi-user and multi-site biorepository environments, OpenSpecimen permits role-based access to specific sets of data operations through a user interface designed to accommodate varying workflows and unique user needs. The software is interoperable, both syntactically and semantically, with an array of other bioinformatics tools given its integration of standard vocabularies, thus enabling research involving biospecimens. End-users are encouraged to share their day-to-day experiences in working with the application, thus providing the community board with insight into the needs and limitations that need to be addressed. Users are also requested to review and validate new features through group testing environments and mock screens. Through this user interaction, application flexibility and interoperability have been recognized as necessary developmental focuses essential for accommodating diverse adoption scenarios and biobanking workflows to catalyze advances in biomedical research and operations. Given the diversity of biobanking practices and workforce roles, efforts have been made consistently to maintain robust data granularity while aiding user accessibility, data discoverability, and security within and across applications by providing a lower learning curve in using Open…

  10. Towards a Consensus Annotation System (GSC8 Meeting)

    SciTech Connect

    White, Owen

    2009-09-10

    The Genomic Standards Consortium was formed in September 2005. It is an international, open-membership working body which promotes standardization in the description of genomes and the exchange and integration of genomic data. The 2009 meeting was an activity of a five-year "Research Coordination Network" grant funded by the National Science Foundation and was organized and held at the DOE Joint Genome Institute, with organizational support provided by the JGI and by the University of California - San Diego. Owen White discusses "Comparing Annotations: Towards Consensus Annotation" at the Genomic Standards Consortium's 8th meeting at the DOE JGI in Walnut Creek, Calif. on Sept. 10, 2009.

  11. Towards a Consensus Annotation System (GSC8 Meeting)

    ScienceCinema

    White, Owen [University of Maryland

    2016-07-12

    The Genomic Standards Consortium was formed in September 2005. It is an international, open-membership working body which promotes standardization in the description of genomes and the exchange and integration of genomic data. The 2009 meeting was an activity of a five-year "Research Coordination Network" grant funded by the National Science Foundation and was organized and held at the DOE Joint Genome Institute, with organizational support provided by the JGI and by the University of California - San Diego. Owen White discusses "Comparing Annotations: Towards Consensus Annotation" at the Genomic Standards Consortium's 8th meeting at the DOE JGI in Walnut Creek, Calif. on Sept. 10, 2009.

  12. Topics in Biomedical Optics: Introduction

    NASA Astrophysics Data System (ADS)

    Hebden, Jeremy C.; Boas, David A.; George, John S.; Durkin, Anthony J.

    2003-06-01

    The field of biomedical optics is experiencing tremendous growth. Biomedical technologies contribute to the creation of devices used in various healthcare specialties (ophthalmology, cardiology, anesthesiology, immunology, etc.). Recent research in biomedical optics is discussed, and overviews of sessions held at the 2002 Optical Society of America Biomedical Topical Meetings are presented.

  13. [The future of biomedical research at universities].

    PubMed

    Jimenez García, Rodrigo; Gil Miguel, Angel

    2003-01-01

    The present article reviews the historical background of research in the Spanish university, and particularly of biomedical research in our country. We analyze the latest set of data provided by the University Council and the Consejo Superior de Investigaciones Científicas. We also review the impact that the National Plan of Quality has had on university research, clearly stimulating and improving the system, as well as the transformations entailed by the Organic Law of Universities, more specifically the National System of Qualification for access to university teaching positions, in which the research record of candidates is the essential criterion. Finally, we review biomedical scientific production over recent years by topic and by university, reflecting an improvement over the last decade not only in quantity but, more importantly, in quality. In conclusion, the review reflects a notable change in biomedical research in our universities, opening an encouraging path for the future of research.

  14. Quality of computationally inferred gene ontology annotations.

    PubMed

    Skunca, Nives; Altenhoff, Adrian; Dessimoz, Christophe

    2012-05-01

    Gene Ontology (GO) has established itself as the undisputed standard for protein function annotation. Most annotations are inferred electronically, i.e. without individual curator supervision, but they are widely considered unreliable. At the same time, we crucially depend on those automated annotations, as most newly sequenced genomes are non-model organisms. Here, we introduce a methodology to systematically and quantitatively evaluate electronic annotations. By exploiting changes in successive releases of the UniProt Gene Ontology Annotation database, we assessed the quality of electronic annotations in terms of specificity, reliability, and coverage. Overall, we not only found that electronic annotations have significantly improved in recent years, but also that their reliability now rivals that of annotations inferred by curators when they use evidence other than experiments from primary literature. This work provides the means to identify the subset of electronic annotations that can be relied upon, an important outcome given that >98% of all annotations are inferred without direct curation.
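
    The release-comparison idea can be sketched as follows: take electronic (IEA) annotations from an earlier GOA release and measure how many reappear later with experimental evidence. The pared-down (gene product, GO term, evidence) triples and the evidence-code grouping are simplifying assumptions; real releases are full GAF files and the paper's measures are more refined.

```python
# Experimental evidence codes (a commonly used subset of GO codes).
EXPERIMENTAL = {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"}

# Pared-down annotation records: (gene_product, go_term, evidence_code).
# Real GOA releases are GAF files with many more columns.
release_t1 = {
    ("P12345", "GO:0005524", "IEA"),
    ("P12345", "GO:0006468", "IEA"),
    ("Q99999", "GO:0005737", "IEA"),
}
release_t2 = {
    ("P12345", "GO:0005524", "IDA"),   # electronic call later confirmed
    ("P12345", "GO:0006468", "IEA"),   # still electronic only
    ("Q99999", "GO:0003677", "IMP"),   # different term; original call not confirmed
}

def confirmation_rate(earlier, later):
    """Fraction of earlier IEA annotations later seen with experimental evidence."""
    iea = {(g, t) for g, t, ev in earlier if ev == "IEA"}
    confirmed = {(g, t) for g, t, ev in later if ev in EXPERIMENTAL}
    return len(iea & confirmed) / len(iea) if iea else 0.0

if __name__ == "__main__":
    print(f"confirmed: {confirmation_rate(release_t1, release_t2):.2f}")  # 0.33
```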

  15. Manpower development for the biomedical industry space.

    PubMed

    Goh, James C H

    2013-01-01

    The Biomedical Sciences (BMS) Cluster is one of four key pillars of the Singapore economy. The Singapore Government has injected research funding for basic and translational research to attract companies to carry out their commercial R&D activities. To further intensify the R&D efforts, the National Research Foundation (NRF) was set up to coordinate the research activities of different agencies within the larger national framework and to fund strategic R&D initiatives. In recent years, funding agencies began to focus on supporting translational and clinical research, particularly projects with potential for commercialization. Translational research is beginning to gain traction, in particular research funding for the development of innovative medical devices. The Biomedical Sciences sector is therefore projected to grow, which means that there is a need to invest in human capital development to achieve sustainable growth. In support of this, education and training programs to strengthen the manpower capabilities of the Biomedical Sciences industry have been developed. In recent years, undergraduate and graduate degree courses in biomedical engineering/bioengineering have been developing at a rapid rate. The goal is to train students with the skills to understand complex issues of biomedicine and to develop and implement advanced technological solutions to these problems. There are a variety of career opportunities open to graduates in biomedical engineering; however, regardless of career choice, students must not focus only on achieving good grades. They have to develop their marketability to employers through internships, overseas exchange programs, and involvement in leadership-type activities. Furthermore, the curriculum has to be developed with biomedical innovation in mind and must remain relevant to the industry. The objective of this paper is to present the NUS Bioengineering undergraduate program in relation to manpower development for the biomedical

  16. Annotation of the Protein Coding Regions of the Equine Genome.

    PubMed

    Hestand, Matthew S; Kalbfleisch, Theodore S; Coleman, Stephen J; Zeng, Zheng; Liu, Jinze; Orlando, Ludovic; MacLeod, James N

    2015-01-01

    Current gene annotation of the horse genome is largely derived from in silico predictions and cross-species alignments. Only a small number of genes are annotated based on equine EST and mRNA sequences. To expand the number of equine genes annotated from equine experimental evidence, we sequenced mRNA from a pool of forty-three different tissues. From these, we derived the structures of 68,594 transcripts. In addition, we identified 301,829 positions with SNPs or small indels within these transcripts relative to EquCab2. Interestingly, 780 variants extend the open reading frame of the transcript and appear to be small errors in the equine reference genome, since they are also identified as homozygous variants by genomic DNA resequencing of the reference horse. Taken together, we provide a resource of equine mRNA structures and protein coding variants that will enhance equine and cross-species transcriptional and genomic comparisons.
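
    A minimal sketch of how one might check whether a single-base substitution extends a transcript's open reading frame, for example by removing a premature stop codon. The sequences are toy examples, coordinates are 0-based, and this is not the pipeline used in the study.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def orf_length(seq):
    """Number of codons before the first in-frame stop (frame starts at 0)."""
    for i in range(0, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i // 3
    return len(seq) // 3

def apply_snp(seq, pos, alt):
    """Return the sequence with a single-base substitution at 0-based pos."""
    return seq[:pos] + alt + seq[pos + 1:]

if __name__ == "__main__":
    # Toy transcript: a premature TAA stop sits in the second codon.
    ref = "ATGTAAGGCACCGGTTGA"
    var = apply_snp(ref, 5, "C")      # TAA -> TAC removes the premature stop
    print(orf_length(ref), "->", orf_length(var))   # 1 -> 5
    # Variants like this, which lengthen the ORF and match the reference
    # individual's own resequencing data, suggest reference-assembly errors.
```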

  17. Annotation of the Protein Coding Regions of the Equine Genome

    PubMed Central

    Hestand, Matthew S.; Kalbfleisch, Theodore S.; Coleman, Stephen J.; Zeng, Zheng; Liu, Jinze; Orlando, Ludovic; MacLeod, James N.

    2015-01-01

    Current gene annotation of the horse genome is largely derived from in silico predictions and cross-species alignments. Only a small number of genes are annotated based on equine EST and mRNA sequences. To expand the number of equine genes annotated from equine experimental evidence, we sequenced mRNA from a pool of forty-three different tissues. From these, we derived the structures of 68,594 transcripts. In addition, we identified 301,829 positions with SNPs or small indels within these transcripts relative to EquCab2. Interestingly, 780 variants extend the open reading frame of the transcript and appear to be small errors in the equine reference genome, since they are also identified as homozygous variants by genomic DNA resequencing of the reference horse. Taken together, we provide a resource of equine mRNA structures and protein coding variants that will enhance equine and cross-species transcriptional and genomic comparisons. PMID:26107351

  18. A tool for sharing annotated research data: the "Category 0" UMLS (Unified Medical Language System) vocabularies.

    PubMed

    Berman, Jules J

    2003-06-16

    Large biomedical data sets have become increasingly important resources for medical researchers. Modern biomedical data sets are annotated with standard terms to describe the data and to support data linking between databases. The largest curated listing of biomedical terms is the National Library of Medicine's Unified Medical Language System (UMLS). The UMLS contains more than 2 million biomedical terms collected from nearly 100 medical vocabularies. Many of the vocabularies contained in the UMLS carry restrictions on their use, making it impossible to share or distribute UMLS-annotated research data. However, a subset of the UMLS vocabularies, designated Category 0 by UMLS, can be used to annotate and share data sets without violating the UMLS License Agreement. The UMLS Category 0 vocabularies can be extracted from the parent UMLS metathesaurus using a Perl script supplied with this article. There are 43 Category 0 vocabularies that can be used freely for research purposes without violating the UMLS License Agreement. Among the Category 0 vocabularies are: MESH (Medical Subject Headings), NCBI (National Center for Biotechnology Information) Taxonomy and ICD-9-CM (International Classification of Diseases-9-Clinical Modifiers). The extraction file containing all Category 0 terms and concepts is 72,581,138 bytes in length and contains 1,029,161 terms. The UMLS Metathesaurus MRCON file (January, 2003) is 151,048,493 bytes in length and contains 2,146,899 terms. Therefore the Category 0 vocabularies, in aggregate, are about half the size of the UMLS metathesaurus. A large publicly available listing of 567,921 different medical phrases was automatically coded using the full UMLS metathesaurus and the Category 0 vocabularies. There were 545,321 phrases with one or more matches against UMLS terms, while 468,785 phrases had one or more matches against the Category 0 terms. This indicates that when the two vocabularies are evaluated by their fitness to find at least one term…
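
    The article's extraction is done with a supplied Perl script over the older MRCON file; the sketch below shows the same idea in Python against the modern pipe-delimited MRCONSO.RRF layout, assuming the source abbreviation (SAB) is the 12th field. The Category 0 source list shown is a tiny illustrative subset, not the full list of 43 vocabularies.

```python
import csv

# Illustrative subset of Category 0 source abbreviations (SABs); the full list
# of 43 restriction-free vocabularies comes from the UMLS documentation.
CATEGORY_0_SABS = {"MSH", "NCBI", "ICD9CM"}

def extract_category0(mrconso_path, out_path, sab_field=11):
    """Copy MRCONSO.RRF rows whose source vocabulary is in the Category 0 set.

    Assumes the modern pipe-delimited MRCONSO.RRF layout with SAB as the
    12th field (index 11); the original article worked on the older MRCON file.
    """
    kept = 0
    with open(mrconso_path, newline="", encoding="utf-8") as src, \
         open(out_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.reader(src, delimiter="|")
        writer = csv.writer(dst, delimiter="|")
        for row in reader:
            if len(row) > sab_field and row[sab_field] in CATEGORY_0_SABS:
                writer.writerow(row)
                kept += 1
    return kept

if __name__ == "__main__":
    n = extract_category0("MRCONSO.RRF", "category0_terms.psv")
    print(f"kept {n} rows from Category 0 vocabularies")
```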

  19. The center for expanded data annotation and retrieval.

    PubMed

    Musen, Mark A; Bean, Carol A; Cheung, Kei-Hoi; Dumontier, Michel; Durante, Kim A; Gevaert, Olivier; Gonzalez-Beltran, Alejandra; Khatri, Purvesh; Kleinstein, Steven H; O'Connor, Martin J; Pouliot, Yannick; Rocca-Serra, Philippe; Sansone, Susanna-Assunta; Wiser, Jeffrey A

    2015-11-01

    The Center for Expanded Data Annotation and Retrieval is studying the creation of comprehensive and expressive metadata for biomedical datasets to facilitate data discovery, data interpretation, and data reuse. We take advantage of emerging community-based standard templates for describing different kinds of biomedical datasets, and we investigate the use of computational techniques to help investigators to assemble templates and to fill in their values. We are creating a repository of metadata from which we plan to identify metadata patterns that will drive predictive data entry when filling in metadata templates. The metadata repository not only will capture annotations specified when experimental datasets are initially created, but also will incorporate links to the published literature, including secondary analyses and possible refinements or retractions of experimental interpretations. By working initially with the Human Immunology Project Consortium and the developers of the ImmPort data repository, we are developing and evaluating an end-to-end solution to the problems of metadata authoring and management that will generalize to other data-management environments. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  20. The center for expanded data annotation and retrieval

    PubMed Central

    Bean, Carol A; Cheung, Kei-Hoi; Dumontier, Michel; Durante, Kim A; Gevaert, Olivier; Gonzalez-Beltran, Alejandra; Khatri, Purvesh; Kleinstein, Steven H; O’Connor, Martin J; Pouliot, Yannick; Rocca-Serra, Philippe; Sansone, Susanna-Assunta; Wiser, Jeffrey A

    2015-01-01

    The Center for Expanded Data Annotation and Retrieval is studying the creation of comprehensive and expressive metadata for biomedical datasets to facilitate data discovery, data interpretation, and data reuse. We take advantage of emerging community-based standard templates for describing different kinds of biomedical datasets, and we investigate the use of computational techniques to help investigators to assemble templates and to fill in their values. We are creating a repository of metadata from which we plan to identify metadata patterns that will drive predictive data entry when filling in metadata templates. The metadata repository not only will capture annotations specified when experimental datasets are initially created, but also will incorporate links to the published literature, including secondary analyses and possible refinements or retractions of experimental interpretations. By working initially with the Human Immunology Project Consortium and the developers of the ImmPort data repository, we are developing and evaluating an end-to-end solution to the problems of metadata authoring and management that will generalize to other data-management environments. PMID:26112029
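
    The "predictive data entry" idea mentioned above can be illustrated with a toy example: suggest a value for an empty template field from the most common value seen in previously filled-in records. The template fields, records, and helper function below are invented for illustration and do not represent CEDAR's actual template model or algorithms.

```python
# Illustrative sketch of predictive data entry over a metadata template.
# Fields and records are made up; real templates are community standards.

from collections import Counter

TEMPLATE_FIELDS = ["organism", "tissue", "assay_type", "platform"]

previous_records = [
    {"organism": "Homo sapiens", "tissue": "PBMC", "assay_type": "flow cytometry", "platform": "LSR II"},
    {"organism": "Homo sapiens", "tissue": "PBMC", "assay_type": "ELISA", "platform": ""},
    {"organism": "Homo sapiens", "tissue": "whole blood", "assay_type": "flow cytometry", "platform": "LSR II"},
]

def suggest(field, records):
    """Most common non-empty value previously entered for this field, if any."""
    values = [r[field] for r in records if r.get(field)]
    return Counter(values).most_common(1)[0][0] if values else None

new_record = {f: "" for f in TEMPLATE_FIELDS}
filled = {f: (v or suggest(f, previous_records)) for f, v in new_record.items()}
print(filled)
```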

  1. Real Time Metagenomics: Using k-mers to annotate metagenomes

    PubMed Central

    Edwards, Robert A.; Olson, Robert; Disz, Terry; Pusch, Gordon D.; Vonstein, Veronika; Stevens, Rick; Overbeek, Ross

    2012-01-01

    Summary: Annotation of metagenomes involves comparing the individual sequence reads with a database of known sequences and assigning a unique function to each read. This is a time-consuming task that is computationally intensive (though not computationally complex). Here we present a novel approach to annotate metagenomes using unique k-mer oligopeptide sequences 7 to 12 amino acids long. We demonstrate that k-mer-based annotations are faster than blastx-based annotations and approach their sensitivity and precision without losing accuracy. A last common ancestor approach was also developed to describe the members of the community. Availability and implementation: This open-source application was implemented in Perl and can be accessed via a user-friendly website at http://edwards.sdsu.edu/rtmg. In addition, code to access the annotation servers is available for download from http://www.theseed.org/. FIGfams and k-mers are available for download from ftp://ftp.theseed.org/FIGfams/. Contact: redwards@mail.sdsu.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:23047562
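
    A toy sketch of the k-mer idea described above: signature amino-acid k-mers vote for a functional role, and a read (here, an already-translated peptide) receives the role with the most votes. The k-mer table and peptide are made up; the real system uses large FIGfam-derived k-mer sets and also handles translation, ties, and the last common ancestor step.

```python
# Toy sketch of k-mer-based functional annotation: exact-match k-mer lookups
# vote for a function. Table entries and the peptide are invented.

from collections import Counter

K = 8  # k-mer length in amino acids (the paper uses 7 to 12)

SIGNATURE_KMERS = {          # assumed mapping: k-mer -> functional role
    "MKTAYIAK": "DNA polymerase",
    "GLSDGEWQ": "globin",
    "VLSPADKT": "globin",
}

def annotate_peptide(peptide, kmer_table=SIGNATURE_KMERS, k=K):
    """Return the function receiving the most k-mer votes, or None if no hit."""
    votes = Counter()
    for i in range(len(peptide) - k + 1):
        role = kmer_table.get(peptide[i:i + k])
        if role:
            votes[role] += 1
    return votes.most_common(1)[0][0] if votes else None

print(annotate_peptide("XXGLSDGEWQLVLNVWGK"))  # -> 'globin'
```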

  2. CoMAGC: a corpus with multi-faceted annotations of gene-cancer relations.

    PubMed

    Lee, Hee-Jin; Shim, Sang-Hyung; Song, Mi-Ryoung; Lee, Hyunju; Park, Jong C

    2013-11-14

    In order to access the large amount of information in biomedical literature about genes implicated in various cancers both efficiently and accurately, the aid of text mining (TM) systems is invaluable. Current TM systems target either gene-cancer relations or biological processes involving genes and cancers, but the former type produces information not comprehensive enough to explain how a gene affects a cancer, and the latter does not provide a concise summary of gene-cancer relations. In this paper, we present a corpus for the development of TM systems that specifically target gene-cancer relations but are still able to capture complex information in biomedical sentences. We describe CoMAGC, a corpus with multi-faceted annotations of gene-cancer relations. In CoMAGC, a piece of annotation is composed of four semantically orthogonal concepts that together express 1) how a gene changes, 2) how a cancer changes and 3) the causality between the gene and the cancer. The multi-faceted annotations are shown to have high inter-annotator agreement. In addition, we show that the annotations in CoMAGC allow us to infer the prospective roles of genes in cancers and to classify the genes into three classes according to the inferred roles. We encode the mapping between multi-faceted annotations and gene classes into 10 inference rules. The inference rules produce results with high accuracy as measured against human annotations. CoMAGC consists of 821 sentences on prostate, breast and ovarian cancers. Currently, we deal with changes in gene expression levels among other types of gene changes. The corpus is available at http://biopathway.org/CoMAGC under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0). The corpus will be an important resource for the development of advanced TM systems on gene-cancer relations.
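
    To make the multi-faceted annotation scheme concrete, the sketch below represents one annotation as a small record with four facets and applies a single toy inference rule to assign a gene class. The facet names, values, and rule are illustrative assumptions, not the actual CoMAGC schema or its 10 inference rules.

```python
# Illustrative sketch: one multi-faceted gene-cancer annotation plus a toy
# inference rule mapping it to a coarse gene class. Not the CoMAGC schema.

from dataclasses import dataclass

@dataclass
class GeneCancerAnnotation:
    gene_change: str      # e.g. "increased expression"
    cancer_change: str    # e.g. "progression"
    causality: str        # e.g. "gene_causes_cancer_change"
    proposition: str      # e.g. "observation" (assumed fourth facet)

def infer_gene_class(a: GeneCancerAnnotation) -> str:
    """Map one annotation to a coarse gene class (toy rule only)."""
    if a.causality == "gene_causes_cancer_change":
        if a.gene_change == "increased expression" and a.cancer_change == "progression":
            return "oncogene-like"
        if a.gene_change == "decreased expression" and a.cancer_change == "progression":
            return "tumor-suppressor-like"
    return "unclassified"

ann = GeneCancerAnnotation("increased expression", "progression",
                           "gene_causes_cancer_change", "observation")
print(infer_gene_class(ann))  # -> 'oncogene-like'
```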

  3. MEGANTE: a web-based system for integrated plant genome annotation.

    PubMed

    Numa, Hisataka; Itoh, Takeshi

    2014-01-01

    The recent advancement of high-throughput genome sequencing technologies has resulted in a considerable increase in demands for large-scale genome annotation. While annotation is a crucial step for downstream data analyses and experimental studies, this process requires substantial expertise and knowledge of bioinformatics. Here we present MEGANTE, a web-based annotation system that makes plant genome annotation easy for researchers unfamiliar with bioinformatics. Without any complicated configuration, users can perform genomic sequence annotations simply by uploading a sequence and selecting the species to query. MEGANTE automatically runs several analysis programs and integrates the results to select the appropriate consensus exon-intron structures and to predict open reading frames (ORFs) at each locus. Functional annotation, including a similarity search against known proteins and a functional domain search, is also performed for the predicted ORFs. The resultant annotation information is visualized with a widely used genome browser, GBrowse. For ease of analysis, the results can be downloaded in Microsoft Excel format. All of the query sequences and annotation results are stored on the server side so that users can access their own data from virtually anywhere on the web. The current release of MEGANTE targets 24 plant species from the Brassicaceae, Fabaceae, Musaceae, Poaceae, Salicaceae, Solanaceae, Rosaceae and Vitaceae families, and it allows users to submit a sequence up to 10 Mb in length and to save up to 100 sequences with the annotation information on the server. The MEGANTE web service is available at https://megante.dna.affrc.go.jp/.

  4. USI: a fast and accurate approach for conceptual document annotation.

    PubMed

    Fiorini, Nicolas; Ranwez, Sylvie; Montmain, Jacky; Ranwez, Vincent

    2015-03-14

    Semantic approaches such as concept-based information retrieval rely on a corpus in which resources are indexed by concepts belonging to a domain ontology. In order to keep such applications up-to-date, new entities need to be frequently annotated to enrich the corpus. However, this task is time-consuming and requires a high level of expertise in both the domain and the related ontology. Different strategies have thus been proposed to ease this indexing process, each one taking advantage of the features of the document. In this paper we present USI (User-oriented Semantic Indexer), a fast and intuitive method for indexing tasks. We introduce a solution to suggest a conceptual annotation for new entities based on related, already indexed documents. Our results, compared to those obtained by previous authors using the MeSH thesaurus and a dataset of biomedical papers, show that the method surpasses text-specific methods in terms of both quality and speed. Evaluations are done via usual metrics and semantic similarity. By relying only on neighbor documents, the User-oriented Semantic Indexer does not need a representative learning set. Yet, it provides better results than the other approaches by giving a consistent annotation scored with a global criterion instead of one score per concept.
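
    The neighbor-based intuition above can be sketched very simply: collect the concepts used to index related documents and rank candidates by how many neighbors carry them. This captures only the general idea; USI itself scores whole annotation sets with a global criterion rather than ranking concepts independently. The neighbor sets below are invented.

```python
# Minimal sketch of neighbor-based concept suggestion for a new document.
# Neighbor annotation sets are made up; this is not the USI algorithm itself.

from collections import Counter

def suggest_concepts(neighbor_annotations, top_n=3):
    """Rank candidate concepts by how many neighbor documents carry them."""
    counts = Counter()
    for concepts in neighbor_annotations:
        counts.update(set(concepts))
    return [c for c, _ in counts.most_common(top_n)]

neighbors = [
    {"Neoplasms", "Gene Expression", "Humans"},
    {"Neoplasms", "Prostatic Neoplasms", "Humans"},
    {"Gene Expression", "Neoplasms"},
]
print(suggest_concepts(neighbors))  # e.g. ['Neoplasms', 'Humans', 'Gene Expression'] (tie order may vary)
```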

  5. Bioinformatics for spermatogenesis: annotation of male reproduction based on proteomics

    PubMed Central

    Zhou, Tao; Zhou, Zuo-Min; Guo, Xue-Jiang

    2013-01-01

    Proteomics strategies have been widely used in the field of male reproduction, in both basic and clinical research. Bioinformatics methods are indispensable in proteomics-based studies and are used for data presentation, database construction and functional annotation. In the present review, we focus on the functional annotation of gene lists obtained through qualitative or quantitative methods, summarizing both general and male-reproduction-specific proteomics databases. We introduce several integrated tools used to uncover the hidden biological significance of the data obtained. We further describe in detail the information on male reproduction derived from Gene Ontology analyses, pathway analyses and biomedical analyses. We provide an overview of bioinformatics annotations in spermatogenesis, from gene function to biological function and from biological function to clinical application. On the basis of recently published proteomics studies and associated data, we show that bioinformatics methods help us to discover drug targets for sperm motility and to scan for cancer-testis genes. In addition, we summarize the online resources relevant to male reproduction research for the exploration of the regulation of spermatogenesis. PMID:23852026

  6. Zwitterionic ceramics for biomedical applications.

    PubMed

    Izquierdo-Barba, Isabel; Colilla, Montserrat; Vallet-Regí, María

    2016-08-01

    Bioceramics for bone tissue regeneration, local drug delivery and nanomedicine are receiving growing attention from the biomaterials scientific community. The design of bioceramics with improved surface properties able to overcome clinical issues is a great scientific challenge. Zwitterionization of surfaces has arisen as a powerful alternative in the design of biocompatible bioceramics capable of inhibiting bacterial and non-specific protein adsorption, which opens up new insights into the biomedical applications of these materials. This manuscript reviews the different approaches reported to date for the synthesis and characterization of zwitterionic bioceramics with potential clinical applications. Zwitterionic bioceramics are receiving growing attention from the biomaterials scientific community due to their great potential in bone tissue regeneration, local drug delivery and nanomedicine. Herein, the different strategies developed so far to synthesize and characterize zwitterionic bioceramics with potential clinical applications are summarized. Copyright © 2016. Published by Elsevier Ltd.

  7. Updating annotations with the distributed annotation system and the automated sequence annotation pipeline

    PubMed Central

    Speier, William; Ochs, Michael F.

    2012-01-01

    Summary: The integration between BioDAS ProServer and Automated Sequence Annotation Pipeline (ASAP) provides an interface for querying diverse annotation sources, chaining and linking results, and standardizing the output using the Distributed Annotation System (DAS) protocol. This interface allows pipeline plans in ASAP to be integrated into any system using HTTP and also allows the information returned by ASAP to be included in the DAS registry for use in any DAS-aware system. Three example implementations have been developed: the first accesses TRANSFAC information to automatically create gene sets for the Coordinated Gene Activity in Pattern Sets (CoGAPS) algorithm; the second integrates annotations from multiple array platforms and provides unified annotations in an R environment; and the third wraps the UniProt database for integration with the SPICE DAS client. Availability: Source code for ASAP 2.7 and the DAS 1.6 interface is available under the GNU public license. Proserver 2.20 is free software available from SourceForge. Scripts for installation and configuration on Linux are provided at our website: http://www.rits.onc.jhmi.edu/dbb/custom/A6/ Contact: Speier@mii.ucla.edu or mfo@jhu.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:22945787
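
    For readers unfamiliar with DAS, the sketch below shows a generic DAS 1.6-style "features" request over HTTP and a minimal parse of the XML response, the kind of interface the ASAP/ProServer integration exposes. The server URL and source name are placeholders rather than actual ASAP endpoints, and the element and attribute names are assumed from the DAS specification.

```python
# Hedged sketch of a generic DAS "features" query; not the ASAP pipeline API.
# URL, source name and segment below are placeholders.

from urllib.request import urlopen
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

def das_features(server, source, segment):
    """Fetch DAS features for a segment like 'chr1:10000,20000' and return
    (feature id, label) pairs parsed from the DASGFF-style XML response."""
    url = f"{server}/das/{source}/features?{urlencode({'segment': segment})}"
    with urlopen(url, timeout=30) as resp:
        tree = ET.parse(resp)
    return [(f.get("id"), f.get("label")) for f in tree.iter("FEATURE")]

# Example (placeholder server):
# print(das_features("http://example.org", "my_annotations", "chr1:10000,20000"))
```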

  8. Automatic annotation of outdoor photographs

    NASA Astrophysics Data System (ADS)

    Cusano, Claudio; Schettini, Raimondo

    2011-01-01

    We propose here a strategy for the automatic annotation of outdoor photographs. Images are segmented into homogeneous regions, which may then be assigned to seven different classes: sky, vegetation, snow, water, ground, street, and sand. These categories allow for content-aware image processing strategies. Our annotation strategy uses a normalized cut segmentation to identify the regions to be classified by a multi-class Support Vector Machine. The strategy has been evaluated on a set of images taken from the LabelMe dataset.
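
    The region-classification step can be sketched with an off-the-shelf multi-class SVM over per-region feature vectors (for example, color and texture statistics). The features and labels below are random stand-ins, and segmentation (for example, normalized cuts) is assumed to have been done elsewhere; this is not the authors' trained model.

```python
# Sketch of the region-classification step only: a multi-class SVM over
# per-region feature vectors. Training data here is synthetic.

import numpy as np
from sklearn.svm import SVC

CLASSES = ["sky", "vegetation", "snow", "water", "ground", "street", "sand"]

rng = np.random.default_rng(0)
X_train = rng.normal(size=(700, 16))            # 700 regions, 16 features each
y_train = rng.integers(0, len(CLASSES), 700)    # fake class indices

clf = SVC(kernel="rbf", C=10.0, gamma="scale")  # one-vs-one multi-class SVM
clf.fit(X_train, y_train)

region_features = rng.normal(size=(1, 16))      # one new region to label
print(CLASSES[int(clf.predict(region_features)[0])])
```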

  9. Alignment-Annotator web server: rendering and annotating sequence alignments.

    PubMed

    Gille, Christoph; Fähling, Michael; Weyand, Birgit; Wieland, Thomas; Gille, Andreas

    2014-07-01

    Alignment-Annotator is a novel web service designed to generate interactive views of annotated nucleotide and amino acid sequence alignments (i) de novo and (ii) embedded in other software. All computations are performed at server side. Interactivity is implemented in HTML5, a language native to web browsers. The alignment is initially displayed using default settings and can be modified with the graphical user interfaces. For example, individual sequences can be reordered or deleted using drag and drop, amino acid color code schemes can be applied and annotations can be added. Annotations can be made manually or imported (from BioDAS servers, UniProt, the Catalytic Site Atlas and the PDB). Some edits take immediate effect while others require server interaction and may take a few seconds to execute. The final alignment document can be downloaded as a zip archive containing the HTML files. Because of the use of HTML, the resulting interactive alignment can be viewed on any platform, including Windows, Mac OS X, Linux, Android and iOS, in any standard web browser. Importantly, neither plugins nor Java are required, and therefore Alignment-Annotator represents the first interactive browser-based alignment visualization. http://www.bioinformatics.org/strap/aa/ and http://strap.charite.de/aa/. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  10. Alignment-Annotator web server: rendering and annotating sequence alignments

    PubMed Central

    Gille, Christoph; Fähling, Michael; Weyand, Birgit; Wieland, Thomas; Gille, Andreas

    2014-01-01

    Alignment-Annotator is a novel web service designed to generate interactive views of annotated nucleotide and amino acid sequence alignments (i) de novo and (ii) embedded in other software. All computations are performed at server side. Interactivity is implemented in HTML5, a language native to web browsers. The alignment is initially displayed using default settings and can be modified with the graphical user interfaces. For example, individual sequences can be reordered or deleted using drag and drop, amino acid color code schemes can be applied and annotations can be added. Annotations can be made manually or imported (from BioDAS servers, UniProt, the Catalytic Site Atlas and the PDB). Some edits take immediate effect while others require server interaction and may take a few seconds to execute. The final alignment document can be downloaded as a zip archive containing the HTML files. Because of the use of HTML, the resulting interactive alignment can be viewed on any platform, including Windows, Mac OS X, Linux, Android and iOS, in any standard web browser. Importantly, neither plugins nor Java are required, and therefore Alignment-Annotator represents the first interactive browser-based alignment visualization. Availability: http://www.bioinformatics.org/strap/aa/ and http://strap.charite.de/aa/. PMID:24813445

  11. Ethics in biomedical engineering.

    PubMed

    Morsy, Ahmed; Flexman, Jennifer

    2008-01-01

    This session focuses on a number of aspects of the subject of ethics in biomedical engineering. The session starts by providing a case study of a company that manufactures artificial heart valves, where the valves were failing at an unexpected rate. The case study focuses on the biomedical engineers working at the company and how their education and training did not prepare them to deal properly with such a situation. The second part of the session highlights the need to learn about various ethics rules and policies regulating research involving human or animal subjects.

  12. Supporting undergraduate biomedical entrepreneurship.

    PubMed

    Patterson, P E

    2004-01-01

    As biomedical innovations become more sophisticated and expensive to bring to market, an approach is needed to ensure the survival of the best ideas. The tactic used by Iowa State University to provide entrepreneurship opportunities for undergraduate students in biomedical areas is a model that has proven to be both distinctive and effective. Iowa State supports and fosters undergraduate student entrepreneurship efforts through the Pappajohn Center for Entrepreneurship. This unique partnership encourages ISU faculty, researchers, and students to become involved in the world of entrepreneurship, while allowing Iowa's business communities to gain access to a wide array of available resources, skills, and information from Iowa State University.

  13. Commercial Biomedical Experiments

    NASA Technical Reports Server (NTRS)

    2003-01-01

    Experiments to seek solutions for a range of biomedical issues are at the heart of several investigations that will be hosted by the Commercial Instrumentation Technology Associates (ITA), Inc. Biomedical Experiments (CIBX-2) payload. CIBX-2 is unique, encompassing more than 20 separate experiments including cancer research, commercial experiments, and student hands-on experiments from 10 schools as part of ITA's ongoing University Among the Stars program. Valerie Cassanto of ITA checks the Canadian Protein Crystallization Experiment (CAPE) carried by STS-86 to Mir in 1997. The experiments are sponsored by NASA's Space Product Development Program (SPD).

  14. Commercial Biomedical Experiments Payload

    NASA Technical Reports Server (NTRS)

    2003-01-01

    Experiments to seek solutions for a range of biomedical issues are at the heart of several investigations that will be hosted by the Commercial Instrumentation Technology Associates (ITA), Inc. The biomedical experiments CIBX-2 payload is unique, encompassing more than 20 separate experiments including cancer research, commercial experiments, and student hands-on experiments from 10 schools as part of ITA's ongoing University Among the Stars program. Here, Astronaut Story Musgrave activates the CMIX-5 (Commercial MDA ITA experiment) payload in the Space Shuttle mid deck during the STS-80 mission in 1996, which is similar to CIBX-2. The experiments are sponsored by NASA's Space Product Development Program (SPD).

  15. Commercial Biomedical Experiments

    NASA Technical Reports Server (NTRS)

    2003-01-01

    Experiments to seek solutions for a range of biomedical issues are at the heart of several investigations that will be hosted by the Commercial Instrumentation Technology Associates (ITA), Inc. Biomedical Experiments (CIBX-2) payload. CIBX-2 is unique, encompassing more than 20 separate experiments including cancer research, commercial experiments, and student hands-on experiments from 10 schools as part of ITA's ongoing University Among the Stars program. Valerie Cassanto of ITA checks the Canadian Protein Crystallization Experiment (CAPE) carried by STS-86 to Mir in 1997. The experiments are sponsored by NASA's Space Product Development Program (SPD).

  16. Commercial Biomedical Experiments Payload

    NASA Technical Reports Server (NTRS)

    2003-01-01

    Experiments to seek solutions for a range of biomedical issues are at the heart of several investigations that will be hosted by the Commercial Instrumentation Technology Associates (ITA), Inc. The biomedical experiments CIBX-2 payload is unique, encompassing more than 20 separate experiments including cancer research, commercial experiments, and student hands-on experiments from 10 schools as part of ITA's ongoing University Among the Stars program. Here, Astronaut Story Musgrave activates the CMIX-5 (Commercial MDA ITA experiment) payload in the Space Shuttle mid deck during the STS-80 mission in 1996, which is similar to CIBX-2. The experiments are sponsored by NASA's Space Product Development Program (SPD).

  17. A biomedical engineer's library.

    PubMed

    Webster, J G

    1982-01-01

    A survey resulted in a list of the 101 textbooks used by 62 biomedical engineering educational programs. A second list shows the textbooks used by each school. A third list shows the 27 textbooks used at two or more schools and the number of times each is used. This selected compilation should be useful to (a) biomedical engineering curriculum committees considering program revision, (b) teachers considering course revision, (c) university and industrial librarians updating their collections, (d) individuals building a personal library, and (e) students desiring information about the emphasis of various educational programs.

  18. Biomedical materials and devices

    SciTech Connect

    Hanker, J. S.; Giammara, B. L.

    1989-01-01

    This conference reports on how biomedical materials and devices are undergoing important changes that require interdisciplinary approaches, innovation expertise, and access to sophisticated preparative and analytical equipment and methodologies. The interaction of materials scientists with biomedical, biotechnological, bioengineering and clinical scientists in the last decade has resulted in major advances in therapy. New therapeutic modalities and bioengineering methods and devices for the continuous removal of toxins or pathologic products present in arthritis, atherosclerosis and malignancy are presented. Novel monitoring and controlled drug delivery systems and discussions of materials such as blood or plasma substitutes, artificial organs, and bone graft substitutes are discussed.

  19. Biomedical implantable microelectronics.

    PubMed

    Meindl, J D

    1980-10-17

    Innovative applications of microelectronics in new biomedical implantable instruments offer a singular opportunity for advances in medical research and practice because of two salient factors: (i) beyond all other types of biomedical instruments, implants exploit fully the inherent technical advantages of microelectronics (complex functional capability, high reliability, low power drain, and small size and weight), and (ii) implants bring microelectronics into intimate association with biological systems. The combination of these two factors enables otherwise impossible new experiments to be conducted and new prostheses to be developed that will improve the quality of human life.

  20. Biomedical enhancements as justice.

    PubMed

    Nam, Jeesoo

    2015-02-01

    Biomedical enhancements, the applications of medical technology to make better those who are neither ill nor deficient, have made great strides in the past few decades. Using Amartya Sen's capability approach as my framework, I argue in this article that far from being simply permissible, we have a prima facie moral obligation to use these new developments for the end goal of promoting social justice. In terms of both range and magnitude, the use of biomedical enhancements will mark a radical advance in how we compensate the most disadvantaged members of society.

  1. National Space Biomedical Research Institute Annual Report

    NASA Technical Reports Server (NTRS)

    2000-01-01

    This report summarizes the activities of the National Space Biomedical Research Institute (NSBRI) during FY 2000. The NSBRI is responsible for the development of countermeasures against the deleterious effects of long-duration space flight and performs fundamental and applied space biomedical research directed towards this specific goal. Its mission is to lead a world-class, national effort in integrated, critical path space biomedical research that supports NASA's Human Exploration and Development of Space (HEDS) Strategic Plan by focusing on the enabling of long-term human presence in, development of, and exploration of space. This is accomplished by: designing, testing and validating effective countermeasures to address the biological and environmental impediments to long-term human space flight; defining the molecular, cellular, organ-level, integrated responses and mechanistic relationships that ultimately determine these impediments, where such activity fosters the development of novel countermeasures; establishing biomedical support technologies to maximize human performance in space, reduce biomedical hazards to an acceptable level, and deliver quality medical care; transferring and disseminating the biomedical advances in knowledge and technology acquired through living and working in space to the general benefit of mankind, including the treatment of patients suffering from gravity- and radiation-related conditions on Earth; and ensuring open involvement of the scientific community, industry and the public at large in the Institute's activities and fostering a robust collaboration with NASA, particularly through NASA's Lyndon B. Johnson Space Center. Attachment: Appendices (A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, and P).

  2. Preserving sequence annotations across reference sequences.

    PubMed

    Tatum, Zuotian; Roos, Marco; Gibson, Andrew P; Taschner, Peter Em; Thompson, Mark; Schultes, Erik A; Laros, Jeroen Fj

    2014-01-01

    Matching and comparing sequence annotations of different reference sequences is vital to genomics research, yet many annotation formats do not specify the reference sequence types or versions used. This makes the integration of annotations from different sources difficult and error prone. As part of our effort to create linked data for interoperable sequence annotations, we present an RDF data model for sequence annotation using the ontological framework established by the OBO Foundry ontologies and the Basic Formal Ontology (BFO). We defined reference sequences as the common domain of integration for sequence annotations, and identified three semantic relationships between sequence annotations. In doing so, we created the Reference Sequence Annotation to compensate for gaps in the SO and in its mapping to BFO, particularly for annotations that refer to versions of consensus reference sequences. Moreover, we present three integration models for sequence annotations using different reference assemblies. We demonstrated a working example of a sequence annotation instance, and how this instance can be linked to other annotations on different reference sequences. Sequence annotations in this format are semantically rich and can be integrated easily with different assemblies. We also identify other challenges of modeling reference sequences with the BFO.

  3. Preserving sequence annotations across reference sequences

    PubMed Central

    2014-01-01

    Background Matching and comparing sequence annotations of different reference sequences is vital to genomics research, yet many annotation formats do not specify the reference sequence types or versions used. This makes the integration of annotations from different sources difficult and error prone. Results As part of our effort to create linked data for interoperable sequence annotations, we present an RDF data model for sequence annotation using the ontological framework established by the OBO Foundry ontologies and the Basic Formal Ontology (BFO). We defined reference sequences as the common domain of integration for sequence annotations, and identified three semantic relationships between sequence annotations. In doing so, we created the Reference Sequence Annotation to compensate for gaps in the SO and in its mapping to BFO, particularly for annotations that refer to versions of consensus reference sequences. Moreover, we present three integration models for sequence annotations using different reference assemblies. Conclusions We demonstrated a working example of a sequence annotation instance, and how this instance can be linked to other annotations on different reference sequences. Sequence annotations in this format are semantically rich and can be integrated easily with different assemblies. We also identify other challenges of modeling reference sequences with the BFO. PMID:25093075
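
    A minimal sketch of the general approach, using rdflib to state that an annotation is about a region of a specific versioned reference sequence. The namespace, class, and property names are placeholders chosen for this example and are not the RDF model defined in the paper.

```python
# Hedged sketch: sequence annotation as RDF triples with rdflib.
# All URIs, classes and properties below are placeholders, not the paper's model.

from rdflib import Graph, Namespace, Literal, RDF
from rdflib.namespace import XSD

EX = Namespace("http://example.org/annotation/")

g = Graph()
ann = EX["ann1"]
region = EX["region1"]
refseq = EX["GRCh38.chr17"]  # placeholder identifier for a versioned reference sequence

g.add((ann, RDF.type, EX.SequenceAnnotation))
g.add((ann, EX.isAbout, region))
g.add((region, EX.onReferenceSequence, refseq))
g.add((region, EX.start, Literal(43044295, datatype=XSD.integer)))
g.add((region, EX.end, Literal(43125364, datatype=XSD.integer)))

print(g.serialize(format="turtle"))
```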

  4. Cloud Based Metalearning System for Predictive Modeling of Biomedical Data

    PubMed Central

    Vukićević, Milan

    2014-01-01

    Rapid growth and storage of biomedical data have enabled many opportunities for predictive modeling and improvement of healthcare processes. On the other hand, analysis of such large amounts of data is a difficult and computationally intensive task for most existing data mining algorithms. This problem is addressed by proposing a cloud-based system that integrates a metalearning framework for ranking and selecting the best predictive algorithms for the data at hand, together with open-source big data technologies for the analysis of biomedical data. PMID:24892101

  5. Annotated Bibliography of Professional Socialization.

    ERIC Educational Resources Information Center

    Rogers, John M.

    This bibliography contains annotations of 49 articles on the topic of professional socialization. The articles were identified using the Educational Resources Information Center (ERIC), Sociological Abstracts, Medline, and Cumulative Index of Nursing and Allied Health Literature data bases. A bias exists in the selection process towards items…

  6. MSDAC Resource Library Annotated Bibliography.

    ERIC Educational Resources Information Center

    Schlee, Phillip F., Comp.; And Others

    The Midwest Sex Discrimination Assistance Center presents an annotated bibliography of 56 monographs and 11 other media materials relating to women and sex discrimination for use in public schools. Media materials include slides, films, filmstrips, audio recordings, and posters. The bibliography is organized by subject and each annotation…

  7. Workforce Reductions. An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Hickok, Thomas A.

    This report, which is based on a review of practitioner-oriented sources and scholarly journals, uses a three-part framework to organize annotated bibliographies that, together, list a total of 104 sources that provide the following three perspectives on work force reduction issues: organizational, organizational-individual relationship, and…

  8. Meaningful Assessment: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Thrond, Mary A.

    The annotated bibliography contains citations of nine references on alternative student assessment methods in second language programs, particularly at the secondary school level. The references include a critique of conventional reading comprehension assessment, a discussion of performance assessment, a proposal for a multi-trait, multi-method…

  9. Annotated Videography. Part 3. [Revised].

    ERIC Educational Resources Information Center

    United States Holocaust Memorial Museum, Washington, DC.

    This annotated videography has been designed to identify videotapes addressing Holocaust history that have been used effectively in classrooms and are available readily to most communities. The guide is divided into 15 topical categories, including: life before the Holocaust; perpetrators; propaganda; racism; antisemitism; mosaic of victims;…

  10. Hispanic Heritage. An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Denver Univ., CO. School of Education.

    This annotated bibliography of a wide range of materials for the social studies teacher is concerned with the Hispano heritage. The sections are introduced by a brief description. The sections are: 1) general materials, 2) the land and the people, 3) the European background, 4) Spain's colonial system, 5) the Spanish borderlands, 6) the Anglo…

  11. Annotated Bibliography on Humanistic Education

    ERIC Educational Resources Information Center

    Ganung, Cynthia

    1975-01-01

    Part I of this annotated bibliography deals with books and articles on such topics as achievement motivation, process education, transactional analysis, discipline without punishment, role-playing, interpersonal skills, self-acceptance, moral education, self-awareness, values clarification, and non-verbal communication. Part II focuses on…

  12. English Language Learners: Annotated Bibliography

    ERIC Educational Resources Information Center

    Hector-Mason, Anestine; Bardack, Sarah

    2010-01-01

    This annotated bibliography represents a first step toward compiling a comprehensive overview of current research on issues related to English language learners (ELLs). It is intended to be a resource for researchers, policymakers, administrators, and educators who are engaged in efforts to bridge the divide between research, policy, and practice…

  13. Migrant Education: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Palmer, Barbara C., Comp.

    Materials selected for inclusion in the annotated bibliography of 139 publications from 1970 to 1980 give a general understanding of the lives of migrant children, their educational needs and problems, and various attempts made to meet those needs. The bibliography, a valuable tool for researchers and teachers in migrant education, includes books,…

  14. Nikos Kazantzakis: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Qiu, Kui

    This research paper consists of an annotated bibliography about Nikos Kazantzakis, one of the major modern Greek writers and author of "The Last Temptation of Christ,""Zorba the Greek," and many other works. Because of Kazantzakis' position in world literature there are many critical works about him; however, bibliographical…

  15. Radiocarbon Dating: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Fortine, Suellen

    This selective annotated bibliography covers various sources of information on the radiocarbon dating method, including journal articles, conference proceedings, and reports, reflecting the most important and useful sources of the last 25 years. The bibliography is divided into five parts--general background on radiocarbon, radiocarbon dating,…

  16. MSDAC Resource Library Annotated Bibliography.

    ERIC Educational Resources Information Center

    Watson, Cristel; And Others

    This annotated bibliography lists books, films, filmstrips, recordings, and booklets on sex equity. Entries are arranged according to the following topics: career resources, curriculum resources, management, sex equity, sex roles, women's studies, student activities, and sex-fair fiction. Included in each entry are name of author, editor or…

  17. Radiocarbon Dating: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Fortine, Suellen

    This selective annotated bibliography covers various sources of information on the radiocarbon dating method, including journal articles, conference proceedings, and reports, reflecting the most important and useful sources of the last 25 years. The bibliography is divided into five parts--general background on radiocarbon, radiocarbon dating,…

  18. Peaceful Peoples: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Bonta, Bruce D.

    This annotated bibliography includes 438 selected references to books, journal articles, essays within edited volumes, and dissertations that provide significant information about peaceful societies. Peaceful societies are groups that have developed harmonious social structures that allow them to get along with each other, and with outsiders,…

  19. Oral History: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Friedman, Paul G.

    Defining oral history as a method of inquiry by which the memories of individuals are elicited, preserved in interview transcripts or on tape recordings, and then used to enrich understanding of individuals' lives and the events in which they participated, this annotated bibliography provides a broad overview and a sampling of the resources…

  20. Music Analysis: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Fink, Michael

    One hundred and forty citations comprise this annotated bibliography of books, articles, and selected dissertations that encompass trends in music theory and K-16 music education since the late 19th century. Special emphasis is upon writings since the 1950's. During earlier development, music analysts concentrated upon the elements of music (i.e.,…

  1. Teacher Aides; An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Marin County Public Schools, Corte Madera, CA.

    This annotated bibliography lists 40 items, published between 1966 and 1971, that have to do with teacher aides. The listing is arranged alphabetically by author. In addition to the abstract and standard bibliographic information, addresses where the material can be purchased are often included. The items cited include handbooks, research studies,…

  2. Staff Differentiation. An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Marin County Superintendent of Schools, Corte Madera, CA.

    This annotated bibliography reviews selected literature focusing on the concept of staff differentiation. Included are 62 items (dated 1966-1970), along with a list of mailing addresses where copies of individual items can be obtained. Also a list of 31 staff differentiation projects receiving financial assistance from the U.S. Office of Education…

  3. Rural Education: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Massey, Sara

    The 120-item annotated bibliography was compiled to facilitate the development of a recently approved course entitled "Topics in Rural Education" at the University of Maine at Machias. Although the dates range from 1964 to 1982, most of the materials were prepared in the 1970s and 1980s. The interrelatedness of the issues makes categorization…

  4. Annotated Selected Puerto Rican Bibliography.

    ERIC Educational Resources Information Center

    Bravo, Enrique R., Comp.

    This work represents an effort on the part of The Urban Center to come one step closer to the realization of its goal to further the growth of ethnic studies. After extensive consultation with educationists from within and without the Puerto Rican community, it was decided that an annotated bilingual bibliography should be published to assist and…

  5. Vietnamese Amerasians: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Johnson, Mark C.; And Others

    This annotated bibliography on Vietnamese Amerasians includes primary and secondary sources as well as reviews of three documentary films. Sources were selected in order to provide an overview of the historical and political context of Amerasian resettlement and a review of the scant available research on coping and adaptation with this…

  6. Vietnamese Amerasians: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Johnson, Mark C.; And Others

    This annotated bibliography on Vietnamese Amerasians includes primary and secondary sources as well as reviews of three documentary films. Sources were selected in order to provide an overview of the historical and political context of Amerasian resettlement and a review of the scant available research on coping and adaptation with this…

  7. Workforce Reductions. An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Hickok, Thomas A.

    This report, which is based on a review of practitioner-oriented sources and scholarly journals, uses a three-part framework to organize annotated bibliographies that, together, list a total of 104 sources that provide the following three perspectives on work force reduction issues: organizational, organizational-individual relationship, and…

  8. Aging Awareness: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Grant, Rugh; And Others

    This annotated bibliography cites books and articles on aging. The bibliography was compiled by a resource team who are helping teachers and elderly volunteers create classroom environments in which the strengths and uniqueness of these volunteers are recognized. The books in the first section "Aging in Society" describe the problems, aspirations,…

  9. Annotated Selected Puerto Rican Bibliography.

    ERIC Educational Resources Information Center

    Bravo, Enrique R., Comp.

    This work represents an effort on the part of The Urban Center to come one step closer to the realization of its goal to further the growth of ethnic studies. After extensive consultation with educationists from within and without the Puerto Rican community, it was decided that an annotated bilingual bibliography should be published to assist and…

  10. Infant Feeding: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Crowhurst, Christine Marie, Comp.; Kumer, Bonnie Lee, Comp.

    Intended for parents, health professionals and allied health workers, and others involved in caring for infants and young children, this annotated bibliography brings together in one selective listing a review of over 700 current publications related to infant feeding. Reflecting current knowledge in infant feeding, the bibliography has as its…

  11. Appalachian Women. An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Hamm, Mary Margo

    This bibliography compiles annotations of 178 books, journal articles, ERIC documents, and dissertations on Appalachian women and their social, cultural, and economic environment. Entries were published 1966-93 and are listed in the following categories: (1) authors and literary criticism; (2) bibliographies and resource guides; (3) economics,…

  12. Teacher Evaluation: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    McKenna, Bernard H.; And Others

    In his introduction to the 86-item annotated bibliography by Mueller and Poliakoff, McKenna discusses his views on teacher evaluation and his impressions of the documents cited. He observes, in part, that the current concern is with the process of evaluation and that most researchers continue to believe that student achievement is the most…

  13. Annotated Bibliography, Grades K-6.

    ERIC Educational Resources Information Center

    Massachusetts Dept. of Education, Boston. Bureau of Nutrition Education and School Food Services.

    This annotated bibliography on nutrition is for the use of teachers at the elementary grade level. It contains a list of books suitable for reading about nutrition and foods for pupils from kindergarten through the sixth grade. Films and audiovisual presentations for classroom use are also listed. The names and addresses from which these materials…

  14. Annotated Bibliography of Geological Education.

    ERIC Educational Resources Information Center

    Berg, J. Robert; And Others

    Articles about geological education written during the period 1919-62 are included in this annotated bibliography. Recommendations of individual educators and professional groups for the undergraduate and graduate preparation of geologists are contained in most of the items. The articles were originally published in professional journals or…

  15. Biomedical Engineering in Modern Society

    ERIC Educational Resources Information Center

    Attinger, E. O.

    1971-01-01

    Considers definition of biomedical engineering (BME) and how biomedical engineers should be trained. State of the art descriptions of BME and BME education are followed by a brief look at the future of BME. (TS)

  16. Biomedical Engineering in Modern Society

    ERIC Educational Resources Information Center

    Attinger, E. O.

    1971-01-01

    Considers definition of biomedical engineering (BME) and how biomedical engineers should be trained. State of the art descriptions of BME and BME education are followed by a brief look at the future of BME. (TS)

  17. Anatomy for Biomedical Engineers

    ERIC Educational Resources Information Center

    Carmichael, Stephen W.; Robb, Richard A.

    2008-01-01

    There is a perceived need for anatomy instruction for graduate students enrolled in a biomedical engineering program. This appeared especially important for students interested in and using medical images. These students typically did not have a strong background in biology. The authors arranged for students to dissect regions of the body that…

  18. What is biomedical informatics?

    PubMed

    Bernstam, Elmer V; Smith, Jack W; Johnson, Todd R

    2010-02-01

    Biomedical informatics lacks a clear and theoretically-grounded definition. Many proposed definitions focus on data, information, and knowledge, but do not provide an adequate definition of these terms. Leveraging insights from the philosophy of information, we define informatics as the science of information, where information is data plus meaning. Biomedical informatics is the science of information as applied to or studied in the context of biomedicine. Defining the object of study of informatics as data plus meaning clearly distinguishes the field from related fields, such as computer science, statistics and biomedicine, which have different objects of study. The emphasis on data plus meaning also suggests that biomedical informatics problems tend to be difficult when they deal with concepts that are hard to capture using formal, computational definitions. In other words, problems where meaning must be considered are more difficult than problems where manipulating data without regard for meaning is sufficient. Furthermore, the definition implies that informatics research, teaching, and service should focus on biomedical information as data plus meaning rather than only computer applications in biomedicine.

  19. Principles of Biomedical Ethics

    PubMed Central

    Athar, Shahid

    2012-01-01

    In this presentation, I will discuss the principles of biomedical and Islamic medical ethics and an interfaith perspective on end-of-life issues. I will also discuss three cases to exemplify some of the conflicts in ethical decision-making. PMID:23610498

  20. Implantable CMOS Biomedical Devices

    PubMed Central

    Ohta, Jun; Tokuda, Takashi; Sasagawa, Kiyotaka; Noda, Toshihiko

    2009-01-01

    The results of recent research on our implantable CMOS biomedical devices are reviewed. Topics include retinal prosthesis devices and deep-brain implantation devices for small animals. Fundamental device structures and characteristics as well as in vivo experiments are presented. PMID:22291554

  1. Biomedical Results of Apollo

    NASA Technical Reports Server (NTRS)

    Johnston, R. S. (Editor); Dietlein, L. F. (Editor); Berry, C. A. (Editor); Parker, James F. (Compiler); West, Vita (Compiler)

    1975-01-01

    The biomedical program developed for Apollo is described in detail. The findings are listed of those investigations which are conducted to assess the effects of space flight on man's physiological and functional capacities, and significant medical events in Apollo are documented. Topics discussed include crew health and inflight monitoring, preflight and postflight medical testing, inflight experiments, quarantine, and life support systems.

  2. Anatomy for Biomedical Engineers

    ERIC Educational Resources Information Center

    Carmichael, Stephen W.; Robb, Richard A.

    2008-01-01

    There is a perceived need for anatomy instruction for graduate students enrolled in a biomedical engineering program. This appeared especially important for students interested in and using medical images. These students typically did not have a strong background in biology. The authors arranged for students to dissect regions of the body that…

  3. Texture in Biomedical Images

    NASA Astrophysics Data System (ADS)

    Petrou, Maria

    An overview of texture analysis methods is given and the merits of each method for biomedical applications are discussed. Methods discussed include Markov random fields, Gibbs distributions, co-occurrence matrices, Gabor functions and wavelets, Karhunen-Loève basis images, and local symmetry and orientation from the monogenic signal. Some example applications of texture to medical image processing are reviewed.
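
    As a small worked example of one of the texture descriptors listed above, the following computes a gray-level co-occurrence matrix for horizontal neighbors on a toy image and derives a contrast value from it. The image and parameters are arbitrary; real biomedical use would add more distances, angles, and gray levels.

```python
# Tiny worked example of a gray-level co-occurrence matrix (GLCM) for
# horizontal neighbors, computed directly in NumPy on a toy 4x4 image.

import numpy as np

image = np.array([
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
])
levels = 4

glcm = np.zeros((levels, levels), dtype=int)
for i, j in zip(image[:, :-1].ravel(), image[:, 1:].ravel()):
    glcm[i, j] += 1          # count (pixel, right-neighbor) gray-level pairs

p = glcm / glcm.sum()        # normalize to joint probabilities
contrast = sum(p[i, j] * (i - j) ** 2 for i in range(levels) for j in range(levels))
print(glcm)
print(round(contrast, 3))    # -> 0.583 for this toy image
```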

  4. Careers in biomedical engineering.

    PubMed

    Madrid, R E; Rotger, V I; Herrera, M C

    2010-01-01

    Although biomedical engineering was started in Argentina about 35 years ago, it has seen sustained growth over the last 25 years in human resources, with the emergence of new undergraduate and postgraduate programs, as well as in research, knowledge, technological development, and health care.

  5. [Ethics and biomedical research].

    PubMed

    Goussard, Christophe

    2007-01-01

    Ethics in biomedical research took off with the 1947 Nuremberg Code and came into its own in the wake of the Declaration of Helsinki in 1964. Since then, (inter)national regulations and guidelines providing a framework for clinical studies and protection for study participants have been drafted and implemented, while ethics committees and drug evaluation agencies have sprung up throughout the world. These two developments were crucial in bringing about the protection of rights and safety of the participants and harmonization of the conduct of biomedical research. Ethics committees and drug evaluation agencies deliver ethical and scientific assessments on the quality and safety of the projects submitted to them and issue, respectively, approvals and authorizations to carry out clinical trials, while ensuring that they comply with regulatory requirements, ethical principles, and scientific guidelines. The advent of biomedical ethics, together with the responsible commitment of clinical investigators and of the pharmaceutical industry, has guaranteed respect for the patient, for whom and with whom research is conducted. Just as importantly, it has also ensured that patients reap the benefit of what is the primary objective of biomedical research: greater life expectancy, well-being, and quality of life.

  6. Digital biomedical photojournalism.

    PubMed

    Saine, Patrick J

    2002-01-01

    This article describes the strategies used to successfully complete a digitally based biomedical photojournalism assignment. A multi-step approach is suggested which includes project and funding identification, photographic planning, on-site photography and post-project follow-up. Practical suggestions for utilizing digital imaging are included.

  7. Biomedical applications in EELA.

    PubMed

    Cardenas, Miguel; Hernández, Vicente; Mayo, Rafael; Blanquer, Ignacio; Perez-Griffo, Javier; Isea, Raul; Nuñez, Luis; Mora, Henry Ricardo; Fernández, Manuel

    2006-01-01

    The current demand for Grid infrastructures to bring together collaborating groups from Latin America and Europe has created the EELA project. This e-infrastructure is used by biomedical groups in Latin America and Europe for studies of oncological analysis, neglected diseases, sequence alignments and computational phylogenetics.

  8. Systems Theory and Communication. Annotated Bibliography.

    ERIC Educational Resources Information Center

    Covington, William G., Jr.

    This annotated bibliography presents annotations of 31 books and journal articles dealing with systems theory and its relation to organizational communication, marketing, information theory, and cybernetics. Materials were published between 1963 and 1992 and are listed alphabetically by author. (RS)

  9. Determining similarity of scientific entities in annotation datasets.

    PubMed

    Palma, Guillermo; Vidal, Maria-Esther; Haag, Eric; Raschid, Louiqa; Thor, Andreas

    2015-01-01

    Linked Open Data initiatives have made available a diversity of scientific collections where scientists have annotated entities in the datasets with controlled vocabulary terms from ontologies. Annotations encode scientific knowledge, which is captured in annotation datasets. Determining relatedness between annotated entities becomes a building block for pattern mining, e.g. identifying drug-drug relationships may depend on the similarity of the targets that interact with each drug. A diversity of similarity measures has been proposed in the literature to compute relatedness between a pair of entities. Each measure exploits some knowledge including the name, function, relationships with other entities, taxonomic neighborhood and semantic knowledge. We propose a novel general-purpose annotation similarity measure called 'AnnSim' that measures the relatedness between two entities based on the similarity of their annotations. We model AnnSim as a 1-1 maximum weight bipartite match and exploit properties of existing solvers to provide an efficient solution. We empirically study the performance of AnnSim on real-world datasets of drugs and disease associations from clinical trials and relationships between drugs and (genomic) targets. Using baselines that include a variety of measures, we identify where AnnSim can provide a deeper understanding of the semantics underlying the relatedness of a pair of entities or where it could lead to predicting new links or identifying potential novel patterns. Although AnnSim does not exploit knowledge or properties of a particular domain, its performance compares well with a variety of state-of-the-art domain-specific measures. Database URL: http://www.yeastgenome.org/

  10. Determining similarity of scientific entities in annotation datasets

    PubMed Central

    Palma, Guillermo; Vidal, Maria-Esther; Haag, Eric; Raschid, Louiqa; Thor, Andreas

    2015-01-01

    Linked Open Data initiatives have made available a diversity of scientific collections where scientists have annotated entities in the datasets with controlled vocabulary terms from ontologies. Annotations encode scientific knowledge, which is captured in annotation datasets. Determining relatedness between annotated entities becomes a building block for pattern mining, e.g. identifying drug–drug relationships may depend on the similarity of the targets that interact with each drug. A diversity of similarity measures has been proposed in the literature to compute relatedness between a pair of entities. Each measure exploits some knowledge including the name, function, relationships with other entities, taxonomic neighborhood and semantic knowledge. We propose a novel general-purpose annotation similarity measure called ‘AnnSim’ that measures the relatedness between two entities based on the similarity of their annotations. We model AnnSim as a 1–1 maximum weight bipartite match and exploit properties of existing solvers to provide an efficient solution. We empirically study the performance of AnnSim on real-world datasets of drugs and disease associations from clinical trials and relationships between drugs and (genomic) targets. Using baselines that include a variety of measures, we identify where AnnSim can provide a deeper understanding of the semantics underlying the relatedness of a pair of entities or where it could lead to predicting new links or identifying potential novel patterns. Although AnnSim does not exploit knowledge or properties of a particular domain, its performance compares well with a variety of state-of-the-art domain-specific measures. Database URL: http://www.yeastgenome.org/ PMID:25725057
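
    The core of the measure described above, a 1-1 maximum-weight bipartite matching over two annotation sets, can be sketched with SciPy's assignment solver. The pairwise similarity matrix below is invented, and the final normalization is one simple convention rather than necessarily the paper's exact formula.

```python
# Sketch of the AnnSim core idea: score two entities via a 1-1 maximum-weight
# bipartite matching over their annotation sets. The similarity matrix is
# invented; a real use would fill it with an ontology-based term-term similarity.

import numpy as np
from scipy.optimize import linear_sum_assignment

# rows: annotations of entity A, cols: annotations of entity B
sim = np.array([
    [0.9, 0.2, 0.1],
    [0.3, 0.8, 0.4],
    [0.1, 0.5, 0.7],
    [0.2, 0.1, 0.6],
])

rows, cols = linear_sum_assignment(sim, maximize=True)   # best 1-1 matching
matched = sim[rows, cols].sum()
# normalize by the sizes of both annotation sets (one simple convention)
ann_sim = 2 * matched / (sim.shape[0] + sim.shape[1])
print(list(zip(rows.tolist(), cols.tolist())), round(ann_sim, 3))
```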

  11. Annotation and Classification of Argumentative Writing Revisions

    ERIC Educational Resources Information Center

    Zhang, Fan; Litman, Diane

    2015-01-01

    This paper explores the annotation and classification of students' revision behaviors in argumentative writing. A sentence-level revision schema is proposed to capture why and how students make revisions. Based on the proposed schema, a small corpus of student essays and revisions was annotated. Studies show that manual annotation is reliable with…

  13. Genome re-annotation: a wiki solution?

    PubMed Central

    Salzberg, Steven L

    2007-01-01

    The annotation of most genomes becomes outdated over time, owing in part to our ever-improving knowledge of genomes and in part to improvements in bioinformatics software. Unfortunately, annotation is rarely if ever updated and resources to support routine reannotation are scarce. Wiki software, which would allow many scientists to edit each genome's annotation, offers one possible solution. PMID:17274839

  14. Next Generation Models for Storage and Representation of Microbial Biological Annotation

    SciTech Connect

    Quest, Daniel J; Land, Miriam L; Brettin, Thomas S; Cottingham, Robert W

    2010-01-01

    Background Traditional genome annotation systems were developed in a very different computing era, one where the World Wide Web was just emerging. Consequently, these systems are built as centralized black boxes focused on generating high quality annotation submissions to GenBank/EMBL supported by expert manual curation. The exponential growth of sequence data drives a growing need for increasingly higher quality and automatically generated annotation. Typical annotation pipelines utilize traditional database technologies, clustered computing resources, Perl, C, and UNIX file systems to process raw sequence data, identify genes, and predict and categorize gene function. These technologies tightly couple the annotation software system to hardware and third party software (e.g. relational database systems and schemas). This makes annotation systems hard to reproduce, inflexible to modification over time, difficult to assess, difficult to partition across multiple geographic sites, and difficult to understand for those who are not domain experts. These systems are not readily open to scrutiny and therefore not scientifically tractable. The advent of Semantic Web standards such as Resource Description Framework (RDF) and OWL Web Ontology Language (OWL) enables us to construct systems that address these challenges in a new comprehensive way. Results Here, we develop a framework for linking traditional data to OWL-based ontologies in genome annotation. We show how data standards can decouple hardware and third party software tools from annotation pipelines, thereby making annotation pipelines easier to reproduce and assess. An illustrative example shows how TURTLE (Terse RDF Triple Language) can be used as a human readable, but also semantically-aware, equivalent to GenBank/EMBL files. Conclusions The power of this approach lies in its ability to assemble annotation data from multiple databases across multiple locations into a representation that is understandable to
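
    To make the Turtle idea above concrete, the following sketch uses the Python rdflib library to express a single gene feature as RDF triples and serialize it as Turtle. The ex: namespace and property names are invented for the illustration and are not the ontology used in the paper.

```python
# Hedged sketch: express one gene feature as RDF and serialize it to Turtle.
# The EX namespace and property names are placeholders, not a real ontology.
from rdflib import Graph, Literal, Namespace, RDF

EX = Namespace("http://example.org/genome/")

g = Graph()
g.bind("ex", EX)

gene = EX.thrA
g.add((gene, RDF.type, EX.Gene))
g.add((gene, EX.locusTag, Literal("b0002")))
g.add((gene, EX.product, Literal("bifunctional aspartokinase/homoserine dehydrogenase")))
g.add((gene, EX.startPosition, Literal(337)))
g.add((gene, EX.endPosition, Literal(2799)))

# rdflib >= 6 returns a string; older versions return bytes.
print(g.serialize(format="turtle"))
```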

  15. BIOSSES: a semantic sentence similarity estimation system for the biomedical domain.

    PubMed

    Sogancioglu, Gizem; Öztürk, Hakime; Özgür, Arzucan

    2017-07-15

    The amount of information available in textual format is rapidly increasing in the biomedical domain. Therefore, natural language processing (NLP) applications are becoming increasingly important to facilitate the retrieval and analysis of these data. Computing the semantic similarity between sentences is an important component in many NLP tasks including text retrieval and summarization. A number of approaches have been proposed for semantic sentence similarity estimation for generic English. However, our experiments showed that such approaches do not effectively cover biomedical knowledge and produce poor results for biomedical text. We propose several approaches for sentence-level semantic similarity computation in the biomedical domain, including string similarity measures and measures based on the distributed vector representations of sentences learned in an unsupervised manner from a large biomedical corpus. In addition, ontology-based approaches are presented that utilize general and domain-specific ontologies. Finally, a supervised regression based model is developed that effectively combines the different similarity computation metrics. A benchmark data set consisting of 100 sentence pairs from the biomedical literature is manually annotated by five human experts and used for evaluating the proposed methods. The experiments showed that the supervised semantic sentence similarity computation approach obtained the best performance (0.836 correlation with gold standard human annotations) and improved over the state-of-the-art domain-independent systems up to 42.6% in terms of the Pearson correlation metric. A web-based system for biomedical semantic sentence similarity computation, the source code, and the annotated benchmark data set are available at: http://tabilab.cmpe.boun.edu.tr/BIOSSES/ . gizemsogancioglu@gmail.com or arzucan.ozgur@boun.edu.tr.
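
    The supervised combination step described above can be sketched with scikit-learn. The example below is a minimal, hypothetical illustration that combines two cheap features (token Jaccard overlap and TF-IDF cosine) with a linear regressor fitted to made-up gold scores; it is not the BIOSSES system and omits its ontology- and embedding-based measures.

```python
# Hedged sketch: combine simple similarity features with a linear regressor,
# in the spirit of supervised sentence-similarity systems. Toy data only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LinearRegression
from sklearn.metrics.pairwise import cosine_similarity

def features(s1, s2, vectorizer):
    t1, t2 = set(s1.lower().split()), set(s2.lower().split())
    jaccard = len(t1 & t2) / len(t1 | t2)
    tfidf = vectorizer.transform([s1, s2])
    cosine = cosine_similarity(tfidf[0], tfidf[1])[0, 0]
    return [jaccard, cosine]

pairs = [
    ("the drug inhibits kinase activity", "kinase activity is inhibited by the drug"),
    ("the protein binds dna", "the enzyme cleaves rna"),
    ("gene expression was measured", "expression of the gene was quantified"),
]
gold = [0.9, 0.2, 0.8]  # made-up human similarity scores in [0, 1]

vectorizer = TfidfVectorizer().fit([s for pair in pairs for s in pair])
X = [features(a, b, vectorizer) for a, b in pairs]

model = LinearRegression().fit(X, gold)
print(model.predict([features("the drug blocks kinase activity",
                              "kinase activity is blocked by the drug",
                              vectorizer)]))
```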

  16. New in protein structure and function annotation: hotspots, single nucleotide polymorphisms and the 'Deep Web'.

    PubMed

    Bromberg, Yana; Yachdav, Guy; Ofran, Yanay; Schneider, Reinhard; Rost, Burkhard

    2009-05-01

    The rapidly increasing quantity of protein sequence data continues to widen the gap between available sequences and annotations. Comparative modeling suggests some aspects of the 3D structures of approximately half of all known proteins; homology- and network-based inferences annotate some aspect of function for a similar fraction of the proteome. For most known protein sequences, however, there is detailed knowledge about neither their function nor their structure. Comprehensive efforts towards the expert curation of sequence annotations have failed to meet the demand of the rapidly increasing number of available sequences. Only the automated prediction of protein function in the absence of homology can close the gap between available sequences and annotations in the foreseeable future. This review focuses on two novel methods for automated annotation, and briefly presents an outlook on how modern web software may revolutionize the field of protein sequence annotation. First, predictions of protein binding sites and functional hotspots, and the evolution of these into the most successful type of prediction of protein function from sequence will be discussed. Second, a new tool, comprehensive in silico mutagenesis, which contributes important novel predictions of function and at the same time prepares for the onset of the next sequencing revolution, will be described. While these two new sub-fields of protein prediction represent the breakthroughs that have been achieved methodologically, it will then be argued that a different development might further change the way biomedical researchers benefit from annotations: modern web software can connect the worldwide web in any browser with the 'Deep Web' (ie, proprietary data resources). The availability of this direct connection, and the resulting access to a wealth of data, may impact drug discovery and development more than any existing method that contributes to protein annotation.

  17. Adapting content-based image retrieval techniques for the semantic annotation of medical images.

    PubMed

    Kumar, Ashnil; Dyer, Shane; Kim, Jinman; Li, Changyang; Leong, Philip H W; Fulham, Michael; Feng, Dagan

    2016-04-01

    The automatic annotation of medical images is a prerequisite for building comprehensive semantic archives that can be used to enhance evidence-based diagnosis, physician education, and biomedical research. Annotation also has important applications in the automatic generation of structured radiology reports. Much of the prior research work has focused on annotating images with properties such as the modality of the image, or the biological system or body region being imaged. However, many challenges remain for the annotation of high-level semantic content in medical images (e.g., presence of calcification, vessel obstruction, etc.) due to the difficulty in discovering relationships and associations between low-level image features and high-level semantic concepts. This difficulty is further compounded by the lack of labelled training data. In this paper, we present a method for the automatic semantic annotation of medical images that leverages techniques from content-based image retrieval (CBIR). CBIR is a well-established image search technology that uses quantifiable low-level image features to represent the high-level semantic content depicted in those images. Our method extends CBIR techniques to identify or retrieve a collection of labelled images that have similar low-level features and then uses this collection to determine the best high-level semantic annotations. We demonstrate our annotation method using retrieval via weighted nearest-neighbour retrieval and multi-class classification to show that our approach is viable regardless of the underlying retrieval strategy. We experimentally compared our method with several well-established baseline techniques (classification and regression) and showed that our method achieved the highest accuracy in the annotation of liver computed tomography (CT) images.
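
    The weighted nearest-neighbour labelling strategy described above can be sketched in a few lines. The example uses random vectors in place of real low-level image features and is only an illustration of the retrieve-then-vote idea, not the authors' system.

```python
# Hedged sketch: annotate a query by retrieving its nearest labelled neighbours
# and voting with weights inversely proportional to distance. Random features
# stand in for real low-level image descriptors.
from collections import defaultdict
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
features = rng.random((50, 16))                  # 50 labelled images, 16-D features
labels = rng.choice(["calcification", "vessel obstruction", "normal"], size=50)

index = NearestNeighbors(n_neighbors=5).fit(features)

query = rng.random((1, 16))
distances, neighbours = index.kneighbors(query)

votes = defaultdict(float)
for dist, idx in zip(distances[0], neighbours[0]):
    votes[labels[idx]] += 1.0 / (dist + 1e-9)    # closer neighbours count more

print(max(votes, key=votes.get))                 # best-supported annotation
```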

  18. An agent-based system for re-annotation of genomes.

    PubMed

    Nascimento, Leonardo Vianna do; Bazzan, Ana L C

    2005-09-30

    Genome annotation projects can produce incorrect results if they are based on obsolete data or inappropriate models. We have developed an automatic re-annotation system that uses agents to perform repetitive tasks and reports the results to the user. These tasks involve BLAST searches on biological databases (GenBank) and the use of detection tools (Genemark and Glimmer) to identify new open reading frames. Several agents execute these tools and combine their results to produce a list of open reading frames that is sent back to the user. Our goal was to reduce manual work by executing most tasks automatically with computational tools. A prototype was implemented and validated using the original annotated genomes of Mycoplasma pneumoniae and Haemophilus influenzae. The results reported by the system identify most of the new features present in the re-annotated versions of these genomes.
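
    As a sketch of the kind of repetitive task such agents automate, the snippet below scans a toy sequence for candidate open reading frames on the forward strand before they would be handed to BLAST or a gene finder. It assumes standard start/stop codons only; a real pipeline would rely on Glimmer or GeneMark as the abstract describes.

```python
# Hedged sketch: naive ORF scan on the forward strand, the sort of step an
# annotation agent would run before submitting candidates to BLAST.
STOP = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons=10):
    """Yield (start, end) of ATG..stop ORFs in the three forward frames."""
    seq = seq.upper()
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if codon == "ATG" and start is None:
                start = i
            elif codon in STOP and start is not None:
                if (i + 3 - start) // 3 >= min_codons:
                    yield start, i + 3
                start = None

toy = "CCATGAAACCCGGGTTTAAACCCGGGTTTAAATTTCCCGGGAAATGACC"
for start, end in find_orfs(toy, min_codons=5):
    print(start, end, toy[start:end])
```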

  19. EST-PAC a web package for EST annotation and protein sequence prediction.

    PubMed

    Strahm, Yvan; Powell, David; Lefèvre, Christophe

    2006-10-12

    With the decreasing cost of DNA sequencing technology and the vast diversity of biological resources, researchers increasingly face the basic challenge of annotating a larger number of expressed sequence tags (ESTs) from a variety of species. This typically consists of a series of repetitive tasks, which should be automated and easy to use. The results of these annotation tasks need to be stored and organized in a consistent way. All these operations should be self-installing, platform-independent, easy to customize and amenable to using distributed bioinformatics resources available on the Internet. In order to address these issues, we present EST-PAC, a web-oriented multi-platform software package for expressed sequence tag (EST) annotation. EST-PAC provides a solution for the administration of EST and protein sequence annotations accessible through a web interface. Three aspects of EST annotation are automated: 1) searching local or remote biological databases for sequence similarities using Blast services, 2) predicting protein-coding sequences from EST data and 3) annotating predicted protein sequences with functional domain predictions. In practice, EST-PAC integrates the BLASTALL suite, EST-Scan2 and HMMER in a relational database system accessible through a simple web interface. EST-PAC also takes advantage of the relational database to allow consistent storage, powerful queries of results and management of the annotation process. The system allows users to customize annotation strategies and provides an open-source data-management environment for research and education in bioinformatics.

  20. VariOtator, a Software Tool for Variation Annotation with the Variation Ontology.

    PubMed

    Schaafsma, Gerard C P; Vihinen, Mauno

    2016-04-01

    The Variation Ontology (VariO) is used for describing and annotating types, effects, consequences, and mechanisms of variations. To facilitate easy and consistent annotations, the online application VariOtator was developed. For variation type annotations, VariOtator is fully automated, accepting variant descriptions in Human Genome Variation Society (HGVS) format, and generating VariO terms, either with or without full lineage, that is, all parent terms. When a coding DNA variant description with a reference sequence is provided, VariOtator checks the description first with Mutalyzer and then generates the predicted RNA and protein descriptions with their respective VariO annotations. For the other sublevels, function, structure, and property, annotations cannot be automated, and VariOtator generates annotation based on provided details. For VariO terms relating to structure and property, one can use attribute terms as modifiers and evidence code terms for annotating experimental evidence. There is an online batch version, and stand-alone batch versions to be used with a Leiden Open Variation Database (LOVD) download file. A SOAP Web service allows client programs to access VariOtator programmatically. Thus, systematic variation effect and type annotations can be efficiently generated to allow easy use and integration of variations and their consequences.
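
    A toy illustration of the automated type-annotation step: guessing a coarse variation type directly from an HGVS-style description. The regular expressions and labels below are simplified placeholders; the actual tool validates descriptions with Mutalyzer and emits proper VariO terms.

```python
# Hedged sketch: guess a coarse variation type from an HGVS-like description.
# The patterns and labels are simplified stand-ins, not VariO terms.
import re

RULES = [
    (re.compile(r"del.*ins"), "deletion-insertion (indel)"),
    (re.compile(r">"),        "substitution"),
    (re.compile(r"del"),      "deletion"),
    (re.compile(r"dup"),      "duplication"),
    (re.compile(r"ins"),      "insertion"),
]

def variation_type(hgvs):
    for pattern, label in RULES:
        if pattern.search(hgvs):
            return label
    return "unknown"

for desc in ["NM_003002.2:c.274G>T", "c.4072-1234_5155-246del", "c.76_77insT"]:
    print(desc, "->", variation_type(desc))
```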

  1. Biomedical image processing.

    PubMed

    Huang, H K

    1981-01-01

    Biomedical image processing is a very broad field; it covers everything from biomedical signal gathering, image forming, picture processing, and image display to medical diagnosis based on features extracted from images. This article reviews this topic in both its fundamentals and applications. In its fundamentals, some basic image processing techniques, including outlining, deblurring, noise cleaning, filtering, search, classical analysis and texture analysis, have been reviewed together with examples. The state-of-the-art image processing systems have been introduced and discussed in two categories: general-purpose image processing systems and image analyzers. In order for these systems to be effective for biomedical applications, special biomedical image processing languages have to be developed. The combination of both hardware and software leads to clinical imaging devices. Two different types of clinical imaging devices have been discussed. These include radiological imaging, which covers radiography, thermography, ultrasound, nuclear medicine and CT. Among these, thermography is the most noninvasive but is limited in application due to the low energy of its source. X-ray CT is excellent for static anatomical images and is moving toward the measurement of dynamic function, whereas nuclear imaging is moving toward organ metabolism and ultrasound toward tissue physical characteristics. Heart imaging is one of the most interesting and challenging research topics in biomedical image processing; current methods, including invasive cineangiography and noninvasive ultrasound, nuclear medicine, transmission, and emission CT methodologies, have been reviewed. Two current federally funded research projects in heart imaging, the dynamic spatial reconstructor and the dynamic cardiac three-dimensional densitometer, should bring some fruitful results in the near future. Microscopic imaging technique is very different from the radiological imaging technique in the sense that

  2. Dictionary-driven protein annotation

    PubMed Central

    Rigoutsos, Isidore; Huynh, Tien; Floratos, Aris; Parida, Laxmi; Platt, Daniel

    2002-01-01

    Computational methods seeking to automatically determine the properties (functional, structural, physicochemical, etc.) of a protein directly from the sequence have long been the focus of numerous research groups. With the advent of advanced sequencing methods and systems, the number of amino acid sequences that are being deposited in the public databases has been increasing steadily. This has in turn generated a renewed demand for automated approaches that can annotate individual sequences and complete genomes quickly, exhaustively and objectively. In this paper, we present one such approach that is centered around and exploits the Bio-Dictionary, a collection of amino acid patterns that completely covers the natural sequence space and can capture functional and structural signals that have been reused during evolution, within and across protein families. Our annotation approach also makes use of a weighted, position-specific scoring scheme that is unaffected by the over-representation of well-conserved proteins and protein fragments in the databases used. For a given query sequence, the method permits one to determine, in a single pass, the following: local and global similarities between the query and any protein already present in a public database; the likeness of the query to all available archaeal/bacterial/eukaryotic/viral sequences in the database as a function of amino acid position within the query; the character of secondary structure of the query as a function of amino acid position within the query; the cytoplasmic, transmembrane or extracellular behavior of the query; the nature and position of binding domains, active sites, post-translationally modified sites, signal peptides, etc. In terms of performance, the proposed method is exhaustive, objective and allows for the rapid annotation of individual sequences and full genomes. Annotation examples are presented and discussed in Results, including individual queries and complete genomes that were

  3. Dictionary-driven protein annotation.

    PubMed

    Rigoutsos, Isidore; Huynh, Tien; Floratos, Aris; Parida, Laxmi; Platt, Daniel

    2002-09-01

    Computational methods seeking to automatically determine the properties (functional, structural, physicochemical, etc.) of a protein directly from the sequence have long been the focus of numerous research groups. With the advent of advanced sequencing methods and systems, the number of amino acid sequences that are being deposited in the public databases has been increasing steadily. This has in turn generated a renewed demand for automated approaches that can annotate individual sequences and complete genomes quickly, exhaustively and objectively. In this paper, we present one such approach that is centered around and exploits the Bio-Dictionary, a collection of amino acid patterns that completely covers the natural sequence space and can capture functional and structural signals that have been reused during evolution, within and across protein families. Our annotation approach also makes use of a weighted, position-specific scoring scheme that is unaffected by the over-representation of well-conserved proteins and protein fragments in the databases used. For a given query sequence, the method permits one to determine, in a single pass, the following: local and global similarities between the query and any protein already present in a public database; the likeness of the query to all available archaeal/bacterial/eukaryotic/viral sequences in the database as a function of amino acid position within the query; the character of secondary structure of the query as a function of amino acid position within the query; the cytoplasmic, transmembrane or extracellular behavior of the query; the nature and position of binding domains, active sites, post-translationally modified sites, signal peptides, etc. In terms of performance, the proposed method is exhaustive, objective and allows for the rapid annotation of individual sequences and full genomes. Annotation examples are presented and discussed in Results, including individual queries and complete genomes that were

  4. The automatic annotation of bacterial genomes

    PubMed Central

    Richardson, Emily J.

    2013-01-01

    With the development of ultra-high-throughput technologies, the cost of sequencing bacterial genomes has been vastly reduced. As more genomes are sequenced, less time can be spent manually annotating those genomes, resulting in an increased reliance on automatic annotation pipelines. However, automatic pipelines can produce inaccurate genome annotations and their results often require manual curation. Here, we discuss the automatic and manual annotation of bacterial genomes, identify common problems introduced by the current genome annotation process and suggest potential solutions. PMID:22408191

  5. Automatic annotation of organellar genomes with DOGMA

    SciTech Connect

    Wyman, Stacia; Jansen, Robert K.; Boore, Jeffrey L.

    2004-06-01

    Dual Organellar GenoMe Annotator (DOGMA) automates the annotation of extra-nuclear organellar (chloroplast and animal mitochondrial) genomes. It is a web-based package that allows the use of comparative BLAST searches to identify and annotate genes in a genome. DOGMA presents a list of putative genes to the user in a graphical format for viewing and editing. Annotations are stored on our password-protected server. Complete annotations can be extracted for direct submission to GenBank. Furthermore, intergenic regions of specified length can be extracted, as well as the nucleotide and amino acid sequences of the genes.

  6. FunctionAnnotator, a versatile and efficient web tool for non-model organism annotation.

    PubMed

    Chen, Ting-Wen; Gan, Ruei-Chi; Fang, Yi-Kai; Chien, Kun-Yi; Liao, Wei-Chao; Chen, Chia-Chun; Wu, Timothy H; Chang, Ian Yi-Feng; Yang, Chi; Huang, Po-Jung; Yeh, Yuan-Ming; Chiu, Cheng-Hsun; Huang, Tzu-Wen; Tang, Petrus

    2017-09-05

    Along with the constant improvement in high-throughput sequencing technology, an increasing number of transcriptome sequencing projects are carried out in organisms without decoded genome information and even on environmental biological samples. To study the biological functions of novel transcripts, the very first task is to identify their potential functions. We present a web-based annotation tool, FunctionAnnotator, which offers comprehensive annotations, including GO term assignment, enzyme annotation, domain/motif identification and predictions for subcellular localization. To accelerate the annotation process, we have optimized the computation processes and used parallel computing for all annotation steps. Moreover, FunctionAnnotator is designed to be versatile, and it generates a variety of useful outputs for facilitating other analyses. Here, we demonstrate how FunctionAnnotator can be helpful in annotating non-model organisms. We further illustrate that FunctionAnnotator can estimate the taxonomic composition of environmental samples and assist in the identification of novel proteins by combining RNA-Seq data with proteomics technology. In summary, FunctionAnnotator can efficiently annotate transcriptomes and greatly benefits studies focusing on non-model organisms or metatranscriptomes. FunctionAnnotator, a comprehensive annotation web-service tool, is freely available online at: http://fa.cgu.edu.tw/ . This new web-based annotator will shed light on field studies involving organisms without a reference genome.

  7. Deep Question Answering for protein annotation.

    PubMed

    Gobeill, Julien; Gaudinat, Arnaud; Pasche, Emilie; Vishnyakova, Dina; Gaudet, Pascale; Bairoch, Amos; Ruch, Patrick

    2015-01-01

    Biomedical professionals have access to a huge amount of literature, but when they use a search engine, they often have to deal with too many documents to efficiently find the appropriate information in a reasonable time. In this perspective, question-answering (QA) engines are designed to display answers, which were automatically extracted from the retrieved documents. Standard QA engines in literature process a user question, then retrieve relevant documents and finally extract some possible answers out of these documents using various named-entity recognition processes. In our study, we try to answer complex genomics questions, which can be adequately answered only using Gene Ontology (GO) concepts. Such complex answers cannot be found using state-of-the-art dictionary- and redundancy-based QA engines. We compare the effectiveness of two dictionary-based classifiers for extracting correct GO answers from a large set of 100 retrieved abstracts per question. In the same way, we also investigate the power of GOCat, a GO supervised classifier. GOCat exploits the GOA database to propose GO concepts that were annotated by curators for similar abstracts. This approach is called deep QA, as it adds an original classification step, and exploits curated biological data to infer answers, which are not explicitly mentioned in the retrieved documents. We show that for complex answers such as protein functional descriptions, the redundancy phenomenon has a limited effect. Similarly usual dictionary-based approaches are relatively ineffective. In contrast, we demonstrate how existing curated data, beyond information extraction, can be exploited by a supervised classifier, such as GOCat, to massively improve both the quantity and the quality of the answers with a +100% improvement for both recall and precision. Database URL: http://eagl.unige.ch/DeepQA4PA/. © The Author(s) 2015. Published by Oxford University Press.

  8. Deep Question Answering for protein annotation

    PubMed Central

    Gobeill, Julien; Gaudinat, Arnaud; Pasche, Emilie; Vishnyakova, Dina; Gaudet, Pascale; Bairoch, Amos; Ruch, Patrick

    2015-01-01

    Biomedical professionals have access to a huge amount of literature, but when they use a search engine, they often have to deal with too many documents to efficiently find the appropriate information in a reasonable time. In this perspective, question-answering (QA) engines are designed to display answers, which were automatically extracted from the retrieved documents. Standard QA engines in literature process a user question, then retrieve relevant documents and finally extract some possible answers out of these documents using various named-entity recognition processes. In our study, we try to answer complex genomics questions, which can be adequately answered only using Gene Ontology (GO) concepts. Such complex answers cannot be found using state-of-the-art dictionary- and redundancy-based QA engines. We compare the effectiveness of two dictionary-based classifiers for extracting correct GO answers from a large set of 100 retrieved abstracts per question. In the same way, we also investigate the power of GOCat, a GO supervised classifier. GOCat exploits the GOA database to propose GO concepts that were annotated by curators for similar abstracts. This approach is called deep QA, as it adds an original classification step, and exploits curated biological data to infer answers, which are not explicitly mentioned in the retrieved documents. We show that for complex answers such as protein functional descriptions, the redundancy phenomenon has a limited effect. Similarly usual dictionary-based approaches are relatively ineffective. In contrast, we demonstrate how existing curated data, beyond information extraction, can be exploited by a supervised classifier, such as GOCat, to massively improve both the quantity and the quality of the answers with a +100% improvement for both recall and precision. Database URL: http://eagl.unige.ch/DeepQA4PA/ PMID:26384372
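
    A minimal sketch of the GOCat idea described above: propose GO terms for a new abstract by pooling the terms of the most similar already-curated abstracts. The TF-IDF retrieval and the toy abstracts and annotations below are illustrative stand-ins, not the actual GOCat implementation or GOA data.

```python
# Hedged sketch of a GOCat-like step: rank GO terms for a query abstract by
# pooling the annotations of its most similar curated abstracts. Toy data.
from collections import Counter
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

curated = [
    ("kinase phosphorylates target proteins in the cytoplasm", ["GO:0016301"]),
    ("transcription factor binds promoter dna and activates transcription", ["GO:0003700", "GO:0006355"]),
    ("membrane transporter moves glucose across the plasma membrane", ["GO:0022857"]),
]

texts = [text for text, _ in curated]
vectorizer = TfidfVectorizer().fit(texts)
matrix = vectorizer.transform(texts)

def propose_terms(abstract, k=2):
    sims = cosine_similarity(vectorizer.transform([abstract]), matrix)[0]
    ranked = sims.argsort()[::-1][:k]
    scores = Counter()
    for idx in ranked:
        for term in curated[idx][1]:
            scores[term] += sims[idx]          # weight terms by abstract similarity
    return scores.most_common()

print(propose_terms("this factor binds dna and regulates transcription of target genes"))
```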

  9. Graphene for Biomedical Implants

    NASA Astrophysics Data System (ADS)

    Moore, Thomas; Podila, Ramakrishna; Alexis, Frank; Rao, Apparao; Clemson Bioengineering Team; Clemson Physics Team

    2013-03-01

    In this study, we used graphene, a one-atom-thick sheet of carbon atoms, to modify the surfaces of existing implant materials to enhance both bio- and hemo-compatibility. This novel effort meets all functional criteria for a biomedical implant coating, as it is chemically inert, atomically smooth and highly durable, with the potential for greatly enhancing the effectiveness of such implants. Specifically, graphene coatings on nitinol, a widely used implant and stent material, showed that graphene-coated nitinol (Gr-NiTi) supports excellent smooth muscle and endothelial cell growth, leading to better cell proliferation. We further determined that the serum albumin adsorption on Gr-NiTi is greater than that of fibrinogen, an important and well-understood criterion for promoting a lower thrombosis rate. These hemo- and biocompatible properties and associated charge-transfer mechanisms, along with high strength, chemical inertness and durability, give graphene an edge over most antithrombogenic coatings for biomedical implants and devices.

  10. Sharing big biomedical data.

    PubMed

    Toga, Arthur W; Dinov, Ivo D

    The promise of Big Biomedical Data may be offset by the enormous challenges in handling, analyzing, and sharing it. In this paper, we provide a framework for developing practical and reasonable data sharing policies that incorporate the sociological, financial, technical and scientific requirements of a sustainable Big Data dependent scientific community. Many biomedical and healthcare studies may be significantly impacted by using large, heterogeneous and incongruent datasets; however there are significant technical, social, regulatory, and institutional barriers that need to be overcome to ensure the power of Big Data overcomes these detrimental factors. Pragmatic policies that demand extensive sharing of data, promotion of data fusion, provenance, interoperability and balance security and protection of personal information are critical for the long term impact of translational Big Data analytics.

  11. Biochemiluminescence and biomedical applications.

    PubMed

    Champiat, D; Roux, A; Lhomme, O; Nosenzo, G

    1994-12-01

    Although used for analytical purposes for more than 40 years it is only recently that biochemiluminescence (BCL) has found widespread acceptance. Methods employing BCL reactions now play an important role in biomedical research and laboratory medicine. The main attractions for the assay technology include exquisite sensitivity (attomole-zeptomole), high selectivity, speed and simplicity. In biomedical research, the most important applications of BCL are: (1) to estimate microbial numbers and to assess cellular states (e.g., after exposure to antibiotic or cytotoxic agents) and in reporter gene studies (firefly luciferase gene); (2) NAD(P)H involved in redox/dehydrogenase studies using Vibrio luciferase complex; (3) BCL labels and CL detection of enzyme labels in immunoassays are the most widespread routine application for this technology. BCL enzyme immunoassays represent the most active area of development, e.g., enhanced BCL method for peroxidase and BCL assays for alkaline phosphatase labels using adamantyl 1,2-dioxetane.

  12. A high-quality annotated transcriptome of swine peripheral blood.

    PubMed

    Liu, Haibo; Smith, Timothy P L; Nonneman, Dan J; Dekkers, Jack C M; Tuggle, Christopher K

    2017-06-24

    High throughput gene expression profiling assays of peripheral blood are widely used in biomedicine, as well as in animal genetics and physiology research. Accurate, comprehensive, and precise interpretation of such high throughput assays relies on well-characterized reference genomes and/or transcriptomes. However, neither the reference genome nor the peripheral blood transcriptome of the pig have been sufficiently assembled and annotated to support such profiling assays in this emerging biomedical model organism. We aimed to assemble published and novel RNA-seq data to provide a comprehensive, well-annotated blood transcriptome for pigs by integrating a de novo assembly with a genome-guided assembly. A de novo and a genome-guided transcriptome of porcine whole peripheral blood was assembled with ~162 million pairs of paired-end and ~183 million single-end, trimmed and normalized Illumina RNA-seq reads (~6 billion initial reads from 146 RNA-seq libraries) from five independent studies by using the Trinity and Cufflinks software, respectively. We then removed putative transcripts (PTs) of low confidence from both assemblies and merged the remaining PTs into an integrated transcriptome consisting of 132,928 PTs, with 126,225 (~95%) PTs from the de novo assembly and more than 91% of PTs spliced. In the integrated transcriptome, ~90% and 63% of PTs had significant sequence similarity to sequences in the NCBI NT and NR databases, respectively; 68,754 (~52%) PTs were annotated with 15,965 unique gene ontology (GO) terms; and 7618 PTs annotated with Enzyme Commission codes were assigned to 134 pathways curated by the Kyoto Encyclopedia of Genes and Genomes (KEGG). Full exon-intron junctions of 17,528 PTs were validated by PacBio IsoSeq full-length cDNA reads from 3 other porcine tissues, NCBI pig RefSeq mRNAs and transcripts from Ensembl Sscrofa10.2 annotation. Completeness of the 5' termini of 37,569 PTs was validated by public cap analysis of gene expression (CAGE

  13. Oncotator: cancer variant annotation tool.

    PubMed

    Ramos, Alex H; Lichtenstein, Lee; Gupta, Manaswi; Lawrence, Michael S; Pugh, Trevor J; Saksena, Gordon; Meyerson, Matthew; Getz, Gad

    2015-04-01

    Oncotator is a tool for annotating genomic point mutations and short nucleotide insertions/deletions (indels) with variant- and gene-centric information relevant to cancer researchers. This information is drawn from 14 different publicly available resources that have been pooled and indexed, and we provide an extensible framework to add additional data sources. Annotations linked to variants range from basic information, such as gene names and functional classification (e.g. missense), to cancer-specific data from resources such as the Catalogue of Somatic Mutations in Cancer (COSMIC), the Cancer Gene Census, and The Cancer Genome Atlas (TCGA). For local use, Oncotator is freely available as a python module hosted on Github (https://github.com/broadinstitute/oncotator). Furthermore, Oncotator is also available as a web service and web application at http://www.broadinstitute.org/oncotator/.

  14. Use of Annotations for Component and Framework Interoperability

    NASA Astrophysics Data System (ADS)

    David, O.; Lloyd, W.; Carlson, J.; Leavesley, G. H.; Geter, F.

    2009-12-01

    western United States at the USDA NRCS National Water and Climate Center. PRMS is a component based modular precipitation-runoff model developed to evaluate the impacts of various combinations of precipitation, climate, and land use on streamflow and general basin hydrology. The new OMS 3.0 PRMS model source code is more concise and flexible as a result of using the new framework’s annotation based approach. The fully annotated components are now providing information directly for (i) model assembly and building, (ii) dataflow analysis for implicit multithreading, (iii) automated and comprehensive model documentation of component dependencies, physical data properties, (iv) automated model and component testing, and (v) automated audit-traceability to account for all model resources leading to a particular simulation result. Experience to date has demonstrated the multi-purpose value of using annotations. Annotations are also a feasible and practical method to enable interoperability among models and modeling frameworks. As a prototype example, model code annotations were used to generate binding and mediation code to allow the use of OMS 3.0 model components within the OpenMI context.

  15. Evaluating Hierarchical Structure in Music Annotations

    PubMed Central

    McFee, Brian; Nieto, Oriol; Farbood, Morwaread M.; Bello, Juan Pablo

    2017-01-01

    Music exhibits structure at multiple scales, ranging from motifs to large-scale functional components. When inferring the structure of a piece, different listeners may attend to different temporal scales, which can result in disagreements when they describe the same piece. In the field of music informatics research (MIR), it is common to use corpora annotated with structural boundaries at different levels. By quantifying disagreements between multiple annotators, previous research has yielded several insights relevant to the study of music cognition. First, annotators tend to agree when structural boundaries are ambiguous. Second, this ambiguity seems to depend on musical features, time scale, and genre. Furthermore, it is possible to tune current annotation evaluation metrics to better align with these perceptual differences. However, previous work has not directly analyzed the effects of hierarchical structure because the existing methods for comparing structural annotations are designed for “flat” descriptions, and do not readily generalize to hierarchical annotations. In this paper, we extend and generalize previous work on the evaluation of hierarchical descriptions of musical structure. We derive an evaluation metric which can compare hierarchical annotations holistically across multiple levels. Using this metric, we investigate inter-annotator agreement on the multilevel annotations of two different music corpora, investigate the influence of acoustic properties on hierarchical annotations, and evaluate existing hierarchical segmentation algorithms against the distribution of inter-annotator agreement. PMID:28824514

  16. Evaluating Hierarchical Structure in Music Annotations.

    PubMed

    McFee, Brian; Nieto, Oriol; Farbood, Morwaread M; Bello, Juan Pablo

    2017-01-01

    Music exhibits structure at multiple scales, ranging from motifs to large-scale functional components. When inferring the structure of a piece, different listeners may attend to different temporal scales, which can result in disagreements when they describe the same piece. In the field of music informatics research (MIR), it is common to use corpora annotated with structural boundaries at different levels. By quantifying disagreements between multiple annotators, previous research has yielded several insights relevant to the study of music cognition. First, annotators tend to agree when structural boundaries are ambiguous. Second, this ambiguity seems to depend on musical features, time scale, and genre. Furthermore, it is possible to tune current annotation evaluation metrics to better align with these perceptual differences. However, previous work has not directly analyzed the effects of hierarchical structure because the existing methods for comparing structural annotations are designed for "flat" descriptions, and do not readily generalize to hierarchical annotations. In this paper, we extend and generalize previous work on the evaluation of hierarchical descriptions of musical structure. We derive an evaluation metric which can compare hierarchical annotations holistically across multiple levels. Using this metric, we investigate inter-annotator agreement on the multilevel annotations of two different music corpora, investigate the influence of acoustic properties on hierarchical annotations, and evaluate existing hierarchical segmentation algorithms against the distribution of inter-annotator agreement.

  17. [Biomedical activity of biosurfactants].

    PubMed

    Krasowska, Anna

    2010-07-23

    Biosurfactants, amphiphilic compounds synthesized by microorganisms, have surface, antimicrobial and antitumor properties. Biosurfactants prevent adhesion and biofilm formation by bacteria and fungi on various surfaces. For many years, microbial surfactants have been used as antibiotics with a broad spectrum of activity against microorganisms. Biosurfactants act as antiviral compounds, and their antitumor activities are mediated through the induction of apoptosis. This work presents the current state of knowledge on the biomedical activity of biosurfactants.

  18. Biomedical applications of photochemistry.

    PubMed

    Chan, Barbara Pui

    2010-10-01

    Photochemistry is the study of photochemical reactions between light and molecules. Recently, there have been increasing interests in using photochemical reactions in the fields of biomaterials and tissue engineering. This work revisits the components and mechanisms of photochemistry and reviews biomedical applications of photochemistry in various disciplines, including oncology, molecular biology, and biosurgery, with particular emphasis on tissue engineering. Finally, potential toxicities and research opportunities in this field are discussed.

  19. Biomedical Applications of Graphene

    PubMed Central

    Shen, He; Zhang, Liming; Liu, Min; Zhang, Zhijun

    2012-01-01

    Graphene exhibits a unique 2-D structure and exceptional physical and chemical properties that lead to many potential applications. Among these, biomedical applications of graphene have attracted ever-increasing interest over the last three years. In this review, we present an overview of current advances in the applications of graphene in biomedicine, with a focus on drug delivery, cancer therapy and biological imaging, together with a brief discussion of the challenges and perspectives for future research in this field. PMID:22448195

  20. Adaptive Biomedical Innovation.

    PubMed

    Honig, P K; Hirsch, G

    2016-12-01

    Adaptive Biomedical Innovation (ABI) is a multistakeholder approach to product and process innovation aimed at accelerating the delivery of clinical value to patients and society. ABI offers the opportunity to transcend the fragmentation and linearity of decision-making in our current model and create a common collaborative framework that optimizes the benefit and access of new medicines for patients as well as creating a more sustainable innovation ecosystem.

  1. Glyconanoparticles for biomedical applications.

    PubMed

    Dong, Chang-Ming

    2011-03-01

    Over the past two decades, glycosylated nanoparticles (i.e., glyconanoparticles having sugar residues on the surface) have received much attention for biomedical applications such as bioassays and targeted drug delivery. This minireview focuses on three aspects: (1) glycosylated gold nanoparticles, (2) glycosylated quantum dots, and (3) glyconanoparticles self-assembled from amphiphilic glycopolymers. The synthetic methods and the multivalent interactions between glyconanoparticles and lectins are briefly illustrated.

  2. Saccharomyces cerevisiae S288C genome annotation: a working hypothesis

    PubMed Central

    Fisk, Dianna G.; Ball, Catherine A.; Dolinski, Kara; Engel, Stacia R.; Hong, Eurie L.; Issel-Tarver, Laurie; Schwartz, Katja; Sethuraman, Anand; Botstein, David; Cherry, J. Michael

    2011-01-01

    The S. cerevisiae genome is the most well-characterized eukaryotic genome and one of the simplest in terms of identifying open reading frames (ORFs), yet its primary annotation has been updated continually in the decade since its initial release in 1996 (Goffeau et al., 1996). The Saccharomyces Genome Database (SGD; www.yeastgenome.org) (Hirschman et al., 2006), the community-designated repository for this reference genome, strives to ensure that the S. cerevisiae annotation is as accurate and useful as possible. At SGD, the S. cerevisiae genome sequence and annotation are treated as a working hypothesis, which must be repeatedly tested and refined. In this paper, in celebration of the tenth anniversary of the completion of the S. cerevisiae genome sequence, we discuss the ways in which the S. cerevisiae sequence and annotation have changed, consider the multiple sources of experimental and comparative data on which these changes are based, and describe our methods for evaluating, incorporating and documenting these new data. PMID:17001629

  3. Annotation of protein residues based on a literature analysis: cross-validation against UniProtKb

    PubMed Central

    Nagel, Kevin; Jimeno-Yepes, Antonio; Rebholz-Schuhmann, Dietrich

    2009-01-01

    Background A protein annotation database, such as the Universal Protein Resource knowledge base (UniProtKb), is a valuable resource for the validation and interpretation of predicted 3D structure patterns in proteins. Existing studies have focussed on point mutation extraction methods from biomedical literature which can be used to support the time consuming work of manual database curation. However, these methods were limited to point mutation extraction and do not extract features for the annotation of proteins at the residue level. Results This work introduces a system that identifies protein residues in MEDLINE abstracts and annotates them with features extracted from the context written in the surrounding text. MEDLINE abstract texts have been processed to identify protein mentions in combination with taxonomic species and protein residues (F1-measure 0.52). The identified protein-species-residue triplets have been validated and benchmarked against reference data resources (UniProtKb, average F1-measure of 0.54). Then, contextual features were extracted through shallow and deep parsing and the features have been classified into predefined categories (F1-measure ranges from 0.15 to 0.67). Furthermore, the feature sets have been aligned with annotation types in UniProtKb to assess the relevance of the annotations for ongoing curation projects. Altogether, the annotations have been assessed automatically and manually against reference data resources. Conclusion This work proposes a solution for the automatic extraction of functional annotation for protein residues from biomedical articles. The presented approach is an extension to other existing systems in that a wider range of residue entities are considered and that features of residues are extracted as annotations. PMID:19758468

  4. Leveraging the national cyberinfrastructure for biomedical research.

    PubMed

    LeDuc, Richard; Vaughn, Matthew; Fonner, John M; Sullivan, Michael; Williams, James G; Blood, Philip D; Taylor, James; Barnett, William

    2014-01-01

    In the USA, the national cyberinfrastructure refers to a system of research supercomputer and other IT facilities and the high speed networks that connect them. These resources have been heavily leveraged by scientists in disciplines such as high energy physics, astronomy, and climatology, but until recently they have been little used by biomedical researchers. We suggest that many of the 'Big Data' challenges facing the medical informatics community can be efficiently handled using national-scale cyberinfrastructure. Resources such as the Extreme Science and Discovery Environment, the Open Science Grid, and Internet2 provide economical and proven infrastructures for Big Data challenges, but these resources can be difficult to approach. Specialized web portals, support centers, and virtual organizations can be constructed on these resources to meet defined computational challenges, specifically for genomics. We provide examples of how this has been done in basic biology as an illustration for the biomedical informatics community.

  5. Leveraging the national cyberinfrastructure for biomedical research

    PubMed Central

    LeDuc, Richard; Vaughn, Matthew; Fonner, John M; Sullivan, Michael; Williams, James G; Blood, Philip D; Taylor, James; Barnett, William

    2014-01-01

    In the USA, the national cyberinfrastructure refers to a system of research supercomputer and other IT facilities and the high speed networks that connect them. These resources have been heavily leveraged by scientists in disciplines such as high energy physics, astronomy, and climatology, but until recently they have been little used by biomedical researchers. We suggest that many of the ‘Big Data’ challenges facing the medical informatics community can be efficiently handled using national-scale cyberinfrastructure. Resources such as the Extreme Science and Discovery Environment, the Open Science Grid, and Internet2 provide economical and proven infrastructures for Big Data challenges, but these resources can be difficult to approach. Specialized web portals, support centers, and virtual organizations can be constructed on these resources to meet defined computational challenges, specifically for genomics. We provide examples of how this has been done in basic biology as an illustration for the biomedical informatics community. PMID:23964072

  6. The Biomedical Resource Ontology (BRO) to Enable Resource Discovery in Clinical and Translational Research

    PubMed Central

    Tenenbaum, Jessica D.; Whetzel, Patricia L.; Anderson, Kent; Borromeo, Charles D.; Dinov, Ivo D.; Gabriel, Davera; Kirschner, Beth; Mirel, Barbara; Morris, Tim; Noy, Natasha; Nyulas, Csongor; Rubenson, David; Saxman, Paul R.; Singh, Harpreet; Whelan, Nancy; Wright, Zach; Athey, Brian D.; Becich, Michael J.; Ginsburg, Geoffrey S.; Musen, Mark A.; Smith, Kevin A.; Tarantal, Alice F.; Rubin, Daniel L; Lyster, Peter

    2010-01-01

    The biomedical research community relies on a diverse set of resources, both within their own institutions and at other research centers. In addition, an increasing number of shared electronic resources have been developed. Without effective means to locate and query these resources, it is challenging, if not impossible, for investigators to be aware of the myriad resources available, or to effectively perform resource discovery when the need arises. In this paper, we describe the development and use of the Biomedical Resource Ontology (BRO) to enable semantic annotation and discovery of biomedical resources. We also describe the Resource Discovery System (RDS) which is a federated, inter-institutional pilot project that uses the BRO to facilitate resource discovery on the Internet. Through the RDS framework and its associated Biositemaps infrastructure, the BRO facilitates semantic search and discovery of biomedical resources, breaking down barriers and streamlining scientific research that will improve human health. PMID:20955817

  7. The Biomedical Resource Ontology (BRO) to enable resource discovery in clinical and translational research.

    PubMed

    Tenenbaum, Jessica D; Whetzel, Patricia L; Anderson, Kent; Borromeo, Charles D; Dinov, Ivo D; Gabriel, Davera; Kirschner, Beth; Mirel, Barbara; Morris, Tim; Noy, Natasha; Nyulas, Csongor; Rubenson, David; Saxman, Paul R; Singh, Harpreet; Whelan, Nancy; Wright, Zach; Athey, Brian D; Becich, Michael J; Ginsburg, Geoffrey S; Musen, Mark A; Smith, Kevin A; Tarantal, Alice F; Rubin, Daniel L; Lyster, Peter

    2011-02-01

    The biomedical research community relies on a diverse set of resources, both within their own institutions and at other research centers. In addition, an increasing number of shared electronic resources have been developed. Without effective means to locate and query these resources, it is challenging, if not impossible, for investigators to be aware of the myriad resources available, or to effectively perform resource discovery when the need arises. In this paper, we describe the development and use of the Biomedical Resource Ontology (BRO) to enable semantic annotation and discovery of biomedical resources. We also describe the Resource Discovery System (RDS) which is a federated, inter-institutional pilot project that uses the BRO to facilitate resource discovery on the Internet. Through the RDS framework and its associated Biositemaps infrastructure, the BRO facilitates semantic search and discovery of biomedical resources, breaking down barriers and streamlining scientific research that will improve human health. Copyright © 2010 Elsevier Inc. All rights reserved.

  8. [Big data, medical language and biomedical terminology systems].

    PubMed

    Schulz, Stefan; López-García, Pablo

    2015-08-01

    A variety of rich terminology systems, such as thesauri, classifications, nomenclatures and ontologies support information and knowledge processing in health care and biomedical research. Nevertheless, human language, manifested as individually written texts, persists as the primary carrier of information, in the description of disease courses or treatment episodes in electronic medical records, and in the description of biomedical research in scientific publications. In the context of the discussion about big data in biomedicine, we hypothesize that the abstraction of the individuality of natural language utterances into structured and semantically normalized information facilitates the use of statistical data analytics to distil new knowledge out of textual data from biomedical research and clinical routine. Computerized human language technologies are constantly evolving and are increasingly ready to annotate narratives with codes from biomedical terminology. However, this depends heavily on linguistic and terminological resources. The creation and maintenance of such resources is labor-intensive. Nevertheless, it is sensible to assume that big data methods can be used to support this process. Examples include the learning of hierarchical relationships, the grouping of synonymous terms into concepts and the disambiguation of homonyms. Although clear evidence is still lacking, the combination of natural language technologies, semantic resources, and big data analytics is promising.

  9. Bio-Inspired Extreme Wetting Surfaces for Biomedical Applications

    PubMed Central

    Shin, Sera; Seo, Jungmok; Han, Heetak; Kang, Subin; Kim, Hyunchul; Lee, Taeyoon

    2016-01-01

    Biological creatures with unique surface wettability have long served as a source of inspiration for scientists and engineers. More specifically, materials exhibiting extreme wetting properties, such as superhydrophilic and superhydrophobic surfaces, have attracted considerable attention because of their potential use in various applications, such as self-cleaning fabrics, anti-fog windows, anti-corrosive coatings, drag-reduction systems, and efficient water transportation. In particular, the engineering of surface wettability by manipulating chemical properties and structure opens emerging biomedical applications ranging from high-throughput cell culture platforms to biomedical devices. This review describes design and fabrication methods for artificial extreme wetting surfaces. Next, we introduce some of the newer and emerging biomedical applications using extreme wetting surfaces. Current challenges and future prospects of the surfaces for potential biomedical applications are also addressed. PMID:28787916

  10. [3D visualization and information interaction in biomedical applications].

    PubMed

    Pu, F; Fan, Y; Jiang, W; Zhang, M; Mak, A F; Chen, J

    2001-06-01

    3D visualization and virtual reality are important trends in the development of modern science and technology, as well as in biomedical engineering research. This paper presents a computer procedure developed for 3D visualization in biomedical applications. The biomedical models are constructed from slice sequences based on polygon cells, and information interaction is realized using the OpenGL selection mode, with particular consideration of the specialties of this field, such as irregular geometry and complex material properties. The software developed provides 3D model construction and visualization, real-time modeling transformation, information interaction and related functions. It could serve as a useful platform for 3D visualization in biomedical engineering research.

  11. Ontology-based annotations and semantic relations in large-scale (epi)genomics data.

    PubMed

    Galeota, Eugenia; Pelizzola, Mattia

    2016-05-03

    Public repositories of large-scale biological data currently contain hundreds of thousands of experiments, including high-throughput sequencing and microarray data. The potential of using these resources to assemble data sets combining samples previously not associated is vastly unexplored. This requires the ability to associate samples with clear annotations and to relate experiments matched with different annotation terms. In this study, we illustrate the semantic annotation of Gene Expression Omnibus samples metadata using concepts from biomedical ontologies, focusing on the association of thousands of chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) samples with a given target, tissue and disease state. Next, we demonstrate the feasibility of quantitatively measuring the semantic similarity between different samples, with the aim of combining experiments associated with the same or similar semantic annotations, thus allowing the generation of large data sets without the need of additional experiments. We compared tools based on Unified Medical Language System with tools that use topic-specific ontologies, showing that the second approach outperforms the first both in the annotation process and in the computation of semantic similarity measures. Finally, we demonstrated the potential of this approach by identifying semantically homogeneous groups of ChIP-seq samples targeting the Myc transcription factor, and expanding this data set with semantically coherent epigenetic samples. The semantic information of these data sets proved to be coherent with the ChIP-seq signal and with the current knowledge about this transcription factor.
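
    One simple way to make the sample-to-sample semantic similarity idea concrete is to compare annotation sets after expanding each term with its ontology ancestors. The toy is-a hierarchy below is invented for the illustration; the study evaluates much richer UMLS- and ontology-specific tools.

```python
# Hedged sketch: Jaccard similarity between two samples' annotation sets after
# expanding each term with its ancestors in a toy is-a hierarchy.
TOY_ONTOLOGY = {                 # child -> parents (invented for illustration)
    "myc_chipseq": ["chipseq"],
    "chipseq": ["sequencing_assay"],
    "rnaseq": ["sequencing_assay"],
    "sequencing_assay": [],
}

def ancestors(term, ontology):
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(ontology.get(current, []))
    return seen

def semantic_jaccard(terms_a, terms_b, ontology):
    closure_a = set().union(*(ancestors(t, ontology) for t in terms_a))
    closure_b = set().union(*(ancestors(t, ontology) for t in terms_b))
    return len(closure_a & closure_b) / len(closure_a | closure_b)

sample_1 = ["myc_chipseq"]
sample_2 = ["rnaseq"]
print(semantic_jaccard(sample_1, sample_2, TOY_ONTOLOGY))  # shared 'sequencing_assay' ancestor
```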

  12. Quality of Computationally Inferred Gene Ontology Annotations

    PubMed Central

    Škunca, Nives; Altenhoff, Adrian; Dessimoz, Christophe

    2012-01-01

    Gene Ontology (GO) has established itself as the undisputed standard for protein function annotation. Most annotations are inferred electronically, i.e. without individual curator supervision, but they are widely considered unreliable. At the same time, we crucially depend on those automated annotations, as most newly sequenced genomes are non-model organisms. Here, we introduce a methodology to systematically and quantitatively evaluate electronic annotations. By exploiting changes in successive releases of the UniProt Gene Ontology Annotation database, we assessed the quality of electronic annotations in terms of specificity, reliability, and coverage. Overall, we not only found that electronic annotations have significantly improved in recent years, but also that their reliability now rivals that of annotations inferred by curators when they use evidence other than experiments from primary literature. This work provides the means to identify the subset of electronic annotations that can be relied upon—an important outcome given that >98% of all annotations are inferred without direct curation. PMID:22693439
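
    A simplified sketch of the release-comparison idea, assuming annotations are available as (protein, GO term, evidence code) triples; the evidence-code split and the toy release contents are illustrative only, not the paper's exact pipeline:

```python
# Score electronic annotations against a later release: an electronic
# (protein, GO term) pair from release N counts as "confirmed" if release N+1
# contains the same pair with an experimental evidence code, and as "rejected"
# if the pair disappears. The release data below are hypothetical placeholders.

EXPERIMENTAL = {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"}

def reliability(old_release, new_release):
    """old_release / new_release: sets of (protein, go_term, evidence_code)."""
    electronic = {(p, t) for p, t, ev in old_release if ev == "IEA"}
    experimental_later = {(p, t) for p, t, ev in new_release if ev in EXPERIMENTAL}
    still_present = {(p, t) for p, t, ev in new_release}
    confirmed = electronic & experimental_later
    rejected = electronic - still_present
    assessed = confirmed | rejected
    return len(confirmed) / len(assessed) if assessed else float("nan")

old = {("P1", "GO:0003677", "IEA"), ("P2", "GO:0005524", "IEA")}
new = {("P1", "GO:0003677", "IDA"), ("P2", "GO:0016301", "IEA")}
print(reliability(old, new))  # P1 confirmed, P2's annotation dropped -> 0.5
```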

  13. Collaborative Design of an Image Annotation Tool for Oceanographic Imaging Systems

    NASA Astrophysics Data System (ADS)

    Futrelle, J.; York, A.

    2012-12-01

    Focusing on interoperability and web-accessibility means the tool can be used to annotate any collection of web-accessible images, opening up possibilities for cross-institutional collaboration and citizen science. A prototype implementation is already in use for scallop and groundfish surveys and is being extended to support phytoplankton imagery. (Figure: high-level information model of the general-purpose image annotation tool.)

  14. Computational algorithms to predict Gene Ontology annotations

    PubMed Central

    2015-01-01

    Background Gene function annotations, which are associations between a gene and a term of a controlled vocabulary describing gene functional features, are of paramount importance in modern biology. Datasets of these annotations, such as the ones provided by the Gene Ontology Consortium, are used to design novel biological experiments and interpret their results. Despite their importance, these sources of information have some known issues. They are incomplete, since biological knowledge is far from being definitive and rapidly evolves, and some erroneous annotations may be present. Since the curation process for novel annotations is a costly procedure, in both economic and time terms, computational tools that can reliably predict likely annotations, and thus quicken the discovery of new gene annotations, are very useful. Methods We used a set of computational algorithms and weighting schemes to infer novel gene annotations from a set of known ones. We used the latent semantic analysis approach, implementing two popular algorithms (Latent Semantic Indexing and Probabilistic Latent Semantic Analysis), and propose a novel method, the Semantic IMproved Latent Semantic Analysis, which adds a clustering step on the set of considered genes. Furthermore, we propose improving these algorithms by weighting the annotations in the input set. Results We tested our methods and their weighted variants on the Gene Ontology annotation sets of the genes of three model organisms (Bos taurus, Danio rerio and Drosophila melanogaster). The methods showed their ability to predict novel gene annotations, and the weighting procedures led to a valuable improvement, although the obtained results vary according to the dimension of the input annotation set and the considered algorithm. Conclusions Of the three considered methods, the Semantic IMproved Latent Semantic Analysis is the one that provides better results. In particular, when coupled with a proper
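
    A minimal Latent Semantic Indexing sketch in the spirit of the methods above (not the authors' SIM-LSA implementation): a toy binary gene-by-term matrix is factorized with a truncated SVD, and the low-rank reconstruction is used to rank currently absent gene-term pairs as candidate annotations.

```python
# Latent-semantic-indexing style annotation prediction on a toy matrix.
# Rows: genes g1..g4; columns: annotation terms t1..t4 (all hypothetical).

import numpy as np

A = np.array([
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 1],
    [0, 0, 1, 1],
], dtype=float)

k = 2  # number of latent factors kept
U, s, Vt = np.linalg.svd(A, full_matrices=False)
A_hat = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]  # rank-k reconstruction

# Score only the currently unannotated (gene, term) pairs and rank them.
candidates = [(A_hat[i, j], f"g{i+1}", f"t{j+1}")
              for i in range(A.shape[0]) for j in range(A.shape[1]) if A[i, j] == 0]
for score, gene, term in sorted(candidates, reverse=True)[:3]:
    print(f"{gene} -> {term}: predicted score {score:.2f}")
```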

  15. [Main characteristics of current biomedical research, in Chile].

    PubMed

    Valdés S, Gloria; Armas M, Rodolfo; Reyes B, Humberto

    2012-04-01

    Biomedical research is a fundamental tool for the development of a country, requiring human and financial resources. To define some current characteristics of biomedical research in Chile. Data on entities funding biomedical research, participant institutions, and the number of active investigators for the period 2007-2009 were obtained from institutional sources; publications indexed in PubMed for 2008-2009 were analysed. Most financial resources invested in biomedical research projects (approximately US$ 19 million per year) came from the "Comisión Nacional de Investigación Científica y Tecnológica" (CONICYT), a state institution with 3 independent Funds administering competitive grant applications open annually to institutional or independent investigators in Chile. Other sources and universities raised the total amount to US$ 26 million. From 2007 to 2009, 408 investigators participated in projects funded by CONICYT. The main participant institutions were Universidad de Chile and Pontificia Universidad Católica de Chile, together accounting for 84% of all funded projects. Independently, in 2009, 160 research projects -mainly multicentric clinical trials- received approximately US$ 24 million from foreign pharmaceutical companies. Publications listed in PubMed were classified as "clinical research" (n = 879, including public health) or "basic biomedical research" (n = 312). Biomedical research in Chile is mainly supported by state funds and university resources, but clinical trials also obtained an almost equivalent amount from foreign sources. Investigators are predominantly located in two universities. A small number of MD-PhD programs aim to train and incorporate new scientists. Only a few new medical schools participate in biomedical research. A National Registry of biomedical research projects, including clinical trials, is required, among other initiatives, to stimulate research in biomedical sciences in Chile.

  16. The PRIME Lab biomedical program

    NASA Astrophysics Data System (ADS)

    Jackson, George S.; Elmore, David; Rickey, Frank A.; Musameh, Sharif M.; Sharma, Pankaj; Hillegonds, Darren; Coury, Louis; Kissinger, Peter

    2000-10-01

    The biomedical accelerator mass spectrometry (AMS) initiative at PRIME Lab including the status of equipment and sample preparation is described. Several biomedical projects are underway involving one or more of the nuclides: 14C, 26Al and 41Ca. Routine production of CaF2 and graphite is taking place. Finally, the future direction and plans for improvement of the biomedical program at PRIME Lab are discussed.

  17. NIH Funding for Biomedical Imaging

    NASA Astrophysics Data System (ADS)

    Conroy, Richard

    Biomedical imaging, and in particular MRI and CT, is often identified as among the top 10 most significant advances in healthcare in the 20th century. This presentation will describe some of the recent advances in medical physics and imaging being funded by NIH in this century and current funding opportunities. The presentation will also highlight the role of multidisciplinary research in bringing concepts from the physical sciences and applying them to challenges in biological and biomedical research.

  18. China's growing biomedical industry.

    PubMed

    Han, Pei

    2009-06-01

    The biomedical industry in China is developing rapidly, and new biological drugs are increasing their share of the pharmaceutical market based on people's needs. China is the largest producer and user of vaccines in the world, but the existing production of vaccines is far from enough to meet the needs of the market. The entire market of biological drugs in China is still smaller than that for traditional medicines and chemicals. Therefore, the biopharmaceutical industry has the potential to be the rising star in the pharmaceutical market in the future.

  19. Anatomy for biomedical engineers.

    PubMed

    Carmichael, Stephen W; Robb, Richard A

    2008-01-01

    There is a perceived need for anatomy instruction for graduate students enrolled in a biomedical engineering program. This appeared especially important for students interested in and using medical images. These students typically did not have a strong background in biology. The authors arranged for students to dissect regions of the body that were of particular interest to them. Following completion of all the dissections, the students presented what they had learned to the entire class in the anatomy laboratory. This course has fulfilled an important need for our students.

  20. Biomedical systems analysis program

    NASA Technical Reports Server (NTRS)

    1979-01-01

    Biomedical monitoring programs which were developed to provide a system analysis context for a unified hypothesis for adaptation to space flight are presented and discussed. A real-time system of data analysis and decision making to assure the greatest possible crew safety and mission success is described. Information about man's abilities, limitations, and characteristic reactions to weightless space flight was analyzed and simulation models were developed. The predictive capabilities of simulation models for fluid-electrolyte regulation, erythropoiesis regulation, and calcium regulation are discussed.

  1. Caffeine analogs: biomedical impact.

    PubMed

    Daly, J W

    2007-08-01

    Caffeine, widely consumed in beverages, and many xanthine analogs have had a major impact on biomedical research. Caffeine and various analogs, the latter designed to enhance potency and selectivity toward specific biological targets, have played key roles in defining the nature and role of adenosine receptors, phosphodiesterases, and calcium release channels in physiological processes. Such xanthines and other caffeine-inspired heterocycles now provide important research tools and potential therapeutic agents for intervention in Alzheimer's disease, asthma, cancer, diabetes, and Parkinson's disease. Such compounds also have activity as analgesics, antiinflammatories, antitussives, behavioral stimulants, diuretics/natriuretics, and lipolytics. Adverse effects can include anxiety, hypertension, certain drug interactions, and withdrawal symptoms.

  2. TriAnnot: A Versatile and High Performance Pipeline for the Automated Annotation of Plant Genomes

    PubMed Central

    Leroy, Philippe; Guilhot, Nicolas; Sakai, Hiroaki; Bernard, Aurélien; Choulet, Frédéric; Theil, Sébastien; Reboux, Sébastien; Amano, Naoki; Flutre, Timothée; Pelegrin, Céline; Ohyanagi, Hajime; Seidel, Michael; Giacomoni, Franck; Reichstadt, Mathieu; Alaux, Michael; Gicquello, Emmanuelle; Legeai, Fabrice; Cerutti, Lorenzo; Numa, Hisataka; Tanaka, Tsuyoshi; Mayer, Klaus; Itoh, Takeshi; Quesneville, Hadi; Feuillet, Catherine

    2012-01-01

    In support of the international effort to obtain a reference sequence of the bread wheat genome and to provide plant communities dealing with large and complex genomes with a versatile, easy-to-use online automated tool for annotation, we have developed the TriAnnot pipeline. Its modular architecture allows for the annotation and masking of transposable elements, the structural and functional annotation of protein-coding genes with an evidence-based quality indexing, and the identification of conserved non-coding sequences and molecular markers. The TriAnnot pipeline is parallelized on a 712 CPU computing cluster that can run a 1-Gb sequence annotation in less than 5 days. It is accessible through a web interface for small scale analyses or through a server for large scale annotations. The performance of TriAnnot was evaluated in terms of sensitivity, specificity, and general fitness using curated reference sequence sets from rice and wheat. In less than 8 h, TriAnnot was able to predict more than 83% of the 3,748 CDS from rice chromosome 1 with a fitness of 67.4%. On a set of 12 reference Mb-sized contigs from wheat chromosome 3B, TriAnnot predicted and annotated 93.3% of the genes, among which 54% were perfectly identified in accordance with the reference annotation. It also allowed the curation of 12 genes based on new biological evidence, increasing the percentage of perfect gene predictions to 63%. TriAnnot systematically showed a higher fitness than other annotation pipelines that are not tuned for wheat. As it is easily adaptable to the annotation of other plant genomes, TriAnnot should become a useful resource for the annotation of large and complex genomes in the future. PMID:22645565

  3. Omics data management and annotation.

    PubMed

    Harel, Arye; Dalah, Irina; Pietrokovski, Shmuel; Safran, Marilyn; Lancet, Doron

    2011-01-01

    Technological Omics breakthroughs, including next generation sequencing, bring avalanches of data which need to undergo effective data management to ensure integrity, security, and maximal knowledge-gleaning. Data management system requirements include flexible input formats, diverse data entry mechanisms and views, user friendliness, attention to standards, hardware and software platform definition, as well as robustness. Relevant solutions elaborated by the scientific community include Laboratory Information Management Systems (LIMS) and standardization protocols facilitating data sharing and managing. In project planning, special consideration has to be made when choosing relevant Omics annotation sources, since many of them overlap and require sophisticated integration heuristics. The data modeling step defines and categorizes the data into objects (e.g., genes, articles, disorders) and creates an application flow. A data storage/warehouse mechanism must be selected, such as file-based systems and relational databases, the latter typically used for larger projects. Omics project life cycle considerations must include the definition and deployment of new versions, incorporating either full or partial updates. Finally, quality assurance (QA) procedures must validate data and feature integrity, as well as system performance expectations. We illustrate these data management principles with examples from the life cycle of the GeneCards Omics project (http://www.genecards.org), a comprehensive, widely used compendium of annotative information about human genes. For example, the GeneCards infrastructure has recently been changed from text files to a relational database, enabling better organization and views of the growing data. Omics data handling benefits from the wealth of Web-based information, the vast amount of public domain software, increasingly affordable hardware, and effective use of data management and annotation principles as outlined in this chapter.

  4. Biomedical applications of aerospace technology

    NASA Technical Reports Server (NTRS)

    Castles, T. R.

    1971-01-01

    Aerospace technology transfer to biomedical research problems is discussed, including transfer innovations and potential applications. Statistical analysis of the transfer activities and impact is also presented.

  5. The Twin Cities biomedical consortium.

    PubMed

    Bailey, A S

    1975-07-01

    Twenty-eight health science libraries in the St. Paul-Minneapolis area formed the Twin Cities Biomedical Consortium with the intention of developing a strong network of biomedical libraries in the Twin Cities area. Toward this end, programs were designed to strengthen lines of communication and increase cooperation among local health science libraries; improve access to biomedical information at the local level; and enable the Consortium, as a group, to meet an increasing proportion of its members' needs for biomedical information. Presently, the TCBC comprises libraries in twenty-two hospitals, two county medical societies, one school of nursing, one junior college, and two private corporations.

  6. 77 FR 20489 - Joint Biomedical Laboratory Research and Development and Clinical Science Research and...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-04-04

    ... AFFAIRS Joint Biomedical Laboratory Research and Development and Clinical Science Research and Development... Biomedical Laboratory Research and Development and Clinical Science Research and Development Services... science research. The panel meetings will be open to the public for approximately one-half hour at the...

  7. 76 FR 24974 - Joint Biomedical Laboratory Research and Development and Clinical Science Research and...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-05-03

    ... AFFAIRS Joint Biomedical Laboratory Research and Development and Clinical Science Research and Development... following four panels of the Joint Biomedical Laboratory Research and Development and Clinical Science... clinical science research. The panel meetings will be open to the public for approximately one hour at the...

  8. 78 FR 28292 - Joint Biomedical Laboratory Research and Development and Clinical Science Research and...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-05-14

    ... AFFAIRS Joint Biomedical Laboratory Research and Development and Clinical Science Research and Development... Research and Development and Clinical Science Research and Development Services Scientific Merit Review... areas of biomedical, behavioral and clinical science research. The panel meetings will be open to the...

  9. snpGeneSets: An R Package for Genome-Wide Study Annotation.

    PubMed

    Mei, Hao; Li, Lianna; Jiang, Fan; Simino, Jeannette; Griswold, Michael; Mosley, Thomas; Liu, Shijian

    2016-12-07

    Genome-wide studies (GWS) of SNP associations and differential gene expressions have generated abundant results; next-generation sequencing technology has further boosted the number of variants and genes identified. Effective interpretation requires massive annotation and downstream analysis of these genome-wide results, a computationally challenging task. We developed the snpGeneSets package to simplify annotation and analysis of GWS results. Our package integrates local copies of knowledge bases for SNPs, genes, and gene sets, and implements wrapper functions in the R language to enable transparent access to low-level databases for efficient annotation of large genomic data. The package contains functions that execute three types of annotations: (1) genomic mapping annotation for SNPs and genes and functional annotation for gene sets; (2) bidirectional mapping between SNPs and genes, and genes and gene sets; and (3) calculation of gene effect measures from SNP associations and performance of gene set enrichment analyses to identify functional pathways. We applied snpGeneSets to type 2 diabetes (T2D) results from the NHGRI genome-wide association study (GWAS) catalog, a Finnish GWAS, and a genome-wide expression study (GWES). These studies demonstrate the usefulness of snpGeneSets for annotating and performing enrichment analysis of GWS results. The package is open-source, free, and can be downloaded at: https://www.umc.edu/biostats_software/.

  10. snpGeneSets: An R Package for Genome-Wide Study Annotation

    PubMed Central

    Mei, Hao; Li, Lianna; Jiang, Fan; Simino, Jeannette; Griswold, Michael; Mosley, Thomas; Liu, Shijian

    2016-01-01

    Genome-wide studies (GWS) of SNP associations and differential gene expressions have generated abundant results; next-generation sequencing technology has further boosted the number of variants and genes identified. Effective interpretation requires massive annotation and downstream analysis of these genome-wide results, a computationally challenging task. We developed the snpGeneSets package to simplify annotation and analysis of GWS results. Our package integrates local copies of knowledge bases for SNPs, genes, and gene sets, and implements wrapper functions in the R language to enable transparent access to low-level databases for efficient annotation of large genomic data. The package contains functions that execute three types of annotations: (1) genomic mapping annotation for SNPs and genes and functional annotation for gene sets; (2) bidirectional mapping between SNPs and genes, and genes and gene sets; and (3) calculation of gene effect measures from SNP associations and performance of gene set enrichment analyses to identify functional pathways. We applied snpGeneSets to type 2 diabetes (T2D) results from the NHGRI genome-wide association study (GWAS) catalog, a Finnish GWAS, and a genome-wide expression study (GWES). These studies demonstrate the usefulness of snpGeneSets for annotating and performing enrichment analysis of GWS results. The package is open-source, free, and can be downloaded at: https://www.umc.edu/biostats_software/. PMID:27807048
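
    snpGeneSets itself is an R package; purely as an illustration of the gene set enrichment step it describes, the sketch below runs a hypergeometric (one-sided Fisher) test in Python on made-up gene counts.

```python
# Generic gene-set enrichment calculation: how surprising is the overlap
# between a study gene list and a gene set, given random draws from the
# gene universe? All counts below are invented placeholders.

from scipy.stats import hypergeom

def enrichment_pvalue(n_universe, n_in_set, n_study, n_overlap):
    """P(overlap >= n_overlap) when n_study genes are drawn at random from a
    universe of n_universe genes that contains n_in_set gene-set members."""
    return hypergeom.sf(n_overlap - 1, n_universe, n_in_set, n_study)

# Example: 20,000 genes in the universe, a pathway of 150 genes, a study list
# of 300 genes, 12 of which fall in the pathway.
p = enrichment_pvalue(n_universe=20000, n_in_set=150, n_study=300, n_overlap=12)
print(f"enrichment p-value: {p:.3g}")
```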

  11. AGMIAL: implementing an annotation strategy for prokaryote genomes as a distributed system

    PubMed Central

    Bryson, K.; Loux, V.; Bossy, R.; Nicolas, P.; Chaillou, S.; van de Guchte, M.; Penaud, S.; Maguin, E.; Hoebeke, M.; Bessières, P.; Gibrat, J-F

    2006-01-01

    We have implemented a genome annotation system for prokaryotes called AGMIAL. Our approach embodies a number of key principles. First, expert manual annotators are seen as a critical component of the overall system; user interfaces were cyclically refined to satisfy their needs. Second, the overall process should be orchestrated in terms of a global annotation strategy; this facilitates coordination between a team of annotators and automatic data analysis. Third, the annotation strategy should allow progressive and incremental annotation from a time when only a few draft contigs are available, to when a final finished assembly is produced. The overall architecture employed is modular and extensible, being based on the W3C standard Web services framework. Specialized modules interact with two independent core modules that are used to annotate, respectively, genomic and protein sequences. AGMIAL is currently being used by several INRA laboratories to analyze genomes of bacteria relevant to the food-processing industry, and is distributed under an open source license. PMID:16855290

  12. An Annotated Bibliography of the Open Literature on Deception.

    DTIC Science & Technology

    1985-12-01

    covert preparations and covert employment of weapons based upon advances in life sciences, including recombinant DNA, immunology, toxicology ... Concealment and Revelation. New York: Pantheon Books, 1982, 332 pp. Bok, Sissela, Lying: Moral Choices in Public and Private Life. New York, NY: Pantheon ... Commentary, Vol. 62, No. 6, December 1976, pp. 27-33. A detailed account of life in China today: the central phenomenon of its unique, almost pure

  13. Annotating Socio-Cultural Structures in Text

    DTIC Science & Technology

    2012-10-31

    from the traditional k-Nearest Neighbor (kNN) algorithm. Using experiments on three different multi-label learning problems, i.e. Yeast gene ... Annotated NP/VP Pane: shows the sentence parsed using the parts-of-speech tagger. Document View Pane: specifies the document (being annotated) in three ... used to annotate the document. In the current application we use the Level 1, Level 2 taxonomy. New concepts may be added to or deleted from the

  14. Regulation of biomedical products.

    PubMed

    Gillett, Grant; Saville-Cook, Donald

    2010-05-01

    Two recent decisions, one from Australia and one from Canada, should cause us to examine the ethical issues surrounding the regulation of biomedical products. The protection of vulnerable consumers from variable quality and poorly prepared drugs with uncertain parameters of safety and efficacy is a priority for any community and should not have to be weighed against possible costs based on restrictions of trade. However, the possibility of an environment in which the multinational biomedical industry edges out any other players in the treatment of various illnesses has its own dangers. Not least is the apparent collusion between regulators and industry that ramps up the costs and intensity of licensing and risk management so that only an industry-type budget can sustain the costs of compliance. This has the untoward effect of delivering contemporary health care into the hands of those who make immense fortunes out of it. An approach to regulation that tempers bureaucratic mechanisms with a dose of common sense and realistic evidence-based risk assessment could go a long way in avoiding the Scylla and Charybdis awaiting the clinical world in these troubled waters.

  15. Nuclear microscopy: biomedical applications

    NASA Astrophysics Data System (ADS)

    Watt, Frank; Landsberg, Judith P.

    1993-05-01

    Recent developments in high energy ion beam techniques and technology have enabled the scanning proton microprobe (SPM) to make advances in biomedical research. In particular the combination of proton induced X-ray emission (PIXE) to measure the elemental concentrations of inorganic elements, Rutherford backscattering spectrometry (RBS) to characterise the organic matrix, and scanning transmission ion microscopy (STIM) to provide information on the density and structure of the sample, represents a powerful set of techniques which can be applied simultaneously to the specimen under investigation. This paper reviews briefly the biomedical work using the proton microprobe that has been carried out since the 2nd Int. Conf. on Nuclear Microprobe Technology and Applications held in Melbourne, 1990. Three recent and diverse examples of medical research are also presented from work carried out using the Oxford SPM. The first is a preliminary experiment carried out using human hair as a monitor for potential toxicity, using PIXE elemental mapping across the hair cross section to differentiate between elements contained within the hair and contamination from external sources. The second example is in the use of STIM to map individual cells in freeze-dried tissue, showing the possibility of the in situ microanalysis of cells and their extracellular environment. The third is the use of PIXE, RBS and STIM to identify and analyse the elemental constituents of neuritic plaque cores in untreated freeze-dried Alzheimer's tissue. This work resolves a current controversy by revealing an absence of aluminium levels in plaque cores at the 15 ppm level.

  16. Biomedical Interdisciplinary Curriculum Project: BIP (Biomedical Instrumentation Package) User's Manual.

    ERIC Educational Resources Information Center

    Biomedical Interdisciplinary Curriculum Project, Berkeley, CA.

    Described is the Biomedical Instrument Package (BIP) and its use. The BIP was developed for use in understanding colorimetry, sound, electricity, and bioelectric phenomena. It can also be used in a wide range of measurements such as current, voltage, resistance, temperature, and pH. Though it was developed primarily for use in biomedical science…

  17. A token centric part-of-speech tagger for biomedical text.

    PubMed

    Barrett, Neil; Weber-Jahnke, Jens

    2014-05-01

    Difficulties with part-of-speech (POS) tagging of biomedical text include accessing and annotating appropriate training corpora. These difficulties may result in POS taggers trained on corpora that differ from the tagger's target biomedical text (cross-domain tagging). In such cases, where training and target corpora differ, tagging accuracy decreases. This paper presents a POS tagger for cross-domain tagging called TcT. TcT estimates a tag's likelihood for a given token by combining token collocation probabilities with the token's tag probabilities calculated using a Naive Bayes classifier. We compared TcT to three POS taggers used in the biomedical domain (mxpost, Brill and TnT). We trained each tagger on a non-biomedical corpus and evaluated it on biomedical corpora. TcT was more accurate in cross-domain tagging than mxpost, Brill and TnT (respective averages 83.9, 81.0, 79.5 and 78.8). Our analysis of tagger performance suggests that lexical differences between corpora have more effect on tagging accuracy than previously considered in prior research. Biomedical POS tagging algorithms may be modified to improve their cross-domain tagging accuracy without requiring extra training or large training data sets. Future work should reexamine POS tagging methods for biomedical text. This differs from the work to date, which has focused on retraining existing POS taggers. Copyright © 2014 Elsevier B.V. All rights reserved.
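
    A toy sketch of the combination step described above, i.e. multiplying a tag-transition probability by a per-token tag probability and keeping the best-scoring tag; the probability tables are hand-made placeholders and this is not the actual TcT implementation.

```python
# Combine two probability estimates when choosing a tag: a transition
# probability conditioned on the previous tag, and a per-token tag probability
# (e.g. from a Naive Bayes classifier). Tables below are tiny placeholders.

TRANSITION = {            # P(tag | previous tag)
    ("DT", "NN"): 0.6, ("DT", "JJ"): 0.3, ("DT", "VB"): 0.1,
}
TOKEN_TAG = {             # P(tag | token)
    "culture": {"NN": 0.8, "VB": 0.2},
}

def best_tag(prev_tag, token, tagset=("NN", "JJ", "VB")):
    def score(tag):
        trans = TRANSITION.get((prev_tag, tag), 1e-6)   # back off to a tiny constant
        emit = TOKEN_TAG.get(token, {}).get(tag, 1e-6)
        return trans * emit
    return max(tagset, key=score)

print(best_tag("DT", "culture"))  # -> "NN"
```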

  18. A beginner's guide to eukaryotic genome annotation.

    PubMed

    Yandell, Mark; Ence, Daniel

    2012-04-18

    The falling cost of genome sequencing is having a marked impact on the research community with respect to which genomes are sequenced and how and where they are annotated. Genome annotation projects have generally become small-scale affairs that are often carried out by an individual laboratory. Although annotating a eukaryotic genome assembly is now within the reach of non-experts, it remains a challenging task. Here we provide an overview of the genome annotation process and the available tools and describe some best-practice approaches.

  19. Assessment of disease named entity recognition on a corpus of annotated sentences

    PubMed Central

    Jimeno, Antonio; Jimenez-Ruiz, Ernesto; Lee, Vivian; Gaudan, Sylvain; Berlanga, Rafael; Rebholz-Schuhmann, Dietrich

    2008-01-01

    Background In recent years, the recognition of semantic types from the biomedical scientific literature has been focused on named entities like protein and gene names (PGNs) and gene ontology terms (GO terms). Other semantic types like diseases have not received the same level of attention. Different solutions have been proposed to identify disease named entities in the scientific literature. While matching the terminology with language patterns suffers from low recall (e.g., Whatizit) other solutions make use of morpho-syntactic features to better cover the full scope of terminological variability (e.g., MetaMap). Currently, MetaMap that is provided from the National Library of Medicine (NLM) is the state of the art solution for the annotation of concepts from UMLS (Unified Medical Language System) in the literature. Nonetheless, its performance has not yet been assessed on an annotated corpus. In addition, little effort has been invested so far to generate an annotated dataset that links disease entities in text to disease entries in a database, thesaurus or ontology and that could serve as a gold standard to benchmark text mining solutions. Results As part of our research work, we have taken a corpus that has been delivered in the past for the identification of associations of genes to diseases based on the UMLS Metathesaurus and we have reprocessed and re-annotated the corpus. We have gathered annotations for disease entities from two curators, analyzed their disagreement (0.51 in the kappa-statistic) and composed a single annotated corpus for public use. Thereafter, three solutions for disease named entity recognition including MetaMap have been applied to the corpus to automatically annotate it with UMLS Metathesaurus concepts. The resulting annotations have been benchmarked to compare their performance. Conclusions The annotated corpus is publicly available at and can serve as a benchmark to other systems. In addition, we found that dictionary look
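
    For reference, the inter-annotator agreement figure quoted above is a kappa statistic; a small, generic Cohen's kappa computation (with invented label sequences) looks like this:

```python
# Cohen's kappa: chance-corrected agreement between two annotators
# assigning one label per mention. Label sequences are invented.

from collections import Counter

def cohens_kappa(labels_a, labels_b):
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[l] * freq_b[l] for l in set(labels_a) | set(labels_b)) / n**2
    return (observed - expected) / (1 - expected)

a = ["disease", "disease", "other", "disease", "other", "other"]
b = ["disease", "other",   "other", "disease", "other", "disease"]
print(round(cohens_kappa(a, b), 2))  # moderate agreement on this toy sample
```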

  20. High performance flexible electronics for biomedical devices.

    PubMed

    Salvatore, Giovanni A; Munzenrieder, Niko; Zysset, Christoph; Kinkeldei, Thomas; Petti, Luisa; Troster, Gerhard

    2014-01-01

    Plastic electronics is soft, deformable and lightweight and it is suitable for the realization of devices which can form an intimate interface with the body, be implanted or integrated into textile for wearable and biomedical applications. Here, we present flexible electronics based on amorphous oxide semiconductors (a-IGZO) whose performance can achieve MHz frequency even when bent around hair. We developed an assembly technique to integrate complex electronic functionalities into textile while preserving the softness of the garment. All this and further developments can open up new opportunities in health monitoring, biotechnology and telemedicine.

  1. Biomedical ontology improves biomedical literature clustering performance: a comparison study.

    PubMed

    Yoo, Illhoi; Hu, Xiaohua; Song, Il-Yeol

    2007-01-01

    Document clustering has been used for better document retrieval and text mining. In this paper, we investigate whether a biomedical ontology improves biomedical literature clustering performance in terms of effectiveness and scalability. For this investigation, we perform a comprehensive comparison study of various document clustering approaches, such as hierarchical clustering methods, Bisecting K-means, K-means and Suffix Tree Clustering (STC). According to our experimental results, a biomedical ontology significantly enhances clustering quality on biomedical documents. In addition, our results show that decent document clustering approaches, such as Bisecting K-means, K-means and STC, gain some benefit from the ontology, while hierarchical algorithms, which show the poorest clustering quality, do not reap the benefit of the biomedical ontology.
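
    A compact sketch of the bisecting K-means procedure mentioned above, in its generic form (repeatedly splitting the largest cluster with 2-means); the random document vectors stand in for ontology-enriched document representations, and this is not the paper's exact setup.

```python
# Bisecting K-means: start with one cluster and repeatedly split the largest
# cluster with 2-means until the desired number of clusters is reached.

import numpy as np
from sklearn.cluster import KMeans

def bisecting_kmeans(X, n_clusters, seed=0):
    clusters = [np.arange(len(X))]          # start with all documents in one cluster
    while len(clusters) < n_clusters:
        largest = max(range(len(clusters)), key=lambda i: len(clusters[i]))
        idx = clusters.pop(largest)
        labels = KMeans(n_clusters=2, n_init=10, random_state=seed).fit_predict(X[idx])
        clusters.append(idx[labels == 0])
        clusters.append(idx[labels == 1])
    return clusters

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 5))                # 30 "documents", 5 features (placeholder data)
for i, members in enumerate(bisecting_kmeans(X, n_clusters=4)):
    print(f"cluster {i}: {len(members)} documents")
```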

  2. IDPredictor: predict database links in biomedical database.

    PubMed

    Mehlhorn, Hendrik; Lange, Matthias; Scholz, Uwe; Schreiber, Falk

    2012-06-26

    Knowledge found in biomedical databases, in particular in Web information systems, is a major bioinformatics resource. In general, this biological knowledge is represented worldwide in a network of databases. These data are spread among thousands of databases, which overlap in content but differ substantially with respect to content detail, interface, formats and data structure. To support the functional annotation of lab data, such as protein sequences, metabolites or DNA sequences, as well as semi-automated data exploration in information retrieval environments, an integrated view of databases is essential. Search engines have the potential to assist in data retrieval from these structured sources, but fall short of providing comprehensive knowledge from the interlinked databases. A prerequisite of supporting the concept of an integrated data view is to acquire insights into cross-references among database entities. This issue is hampered by the fact that only a fraction of all possible cross-references are explicitly tagged in the particular biomedical information systems. In this work, we investigate to what extent an automated construction of an integrated data network is possible. We propose a method that predicts and extracts cross-references from multiple life science databases and possible referenced data targets. We study the retrieval quality of our method and report on first, promising results. The method is implemented as the tool IDPredictor, which is published under the DOI 10.5447/IPK/2012/4 and is freely available using the URL: http://dx.doi.org/10.5447/IPK/2012/4.

  3. Ordinal symbolic analysis and its application to biomedical recordings

    PubMed Central

    Amigó, José M.; Keller, Karsten; Unakafova, Valentina A.

    2015-01-01

    Ordinal symbolic analysis opens an interesting and powerful perspective on time-series analysis. Here, we review this relatively new approach and highlight its relation to symbolic dynamics and representations. Our exposition reaches from the general ideas up to recent developments, with special emphasis on its applications to biomedical recordings. The latter will be illustrated with epilepsy data. PMID:25548264
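
    A short sketch of the ordinal-symbolic idea being reviewed: sliding windows of a series are mapped to ordinal patterns, and the normalized entropy of the pattern distribution (permutation entropy) summarizes the signal. The sine-plus-noise series below is only a stand-in for a biomedical recording.

```python
# Ordinal patterns and permutation entropy for a 1-D time series.

import math
from collections import Counter

def ordinal_patterns(x, order=3):
    """Map each window of length `order` to the permutation that sorts it."""
    return [tuple(sorted(range(order), key=lambda k: x[i + k]))
            for i in range(len(x) - order + 1)]

def permutation_entropy(x, order=3):
    counts = Counter(ordinal_patterns(x, order))
    total = sum(counts.values())
    h = -sum((c / total) * math.log(c / total) for c in counts.values())
    return h / math.log(math.factorial(order))   # normalized to [0, 1]

# Deterministic sine-plus-pseudo-noise series as placeholder data.
series = [math.sin(0.3 * i) + 0.1 * ((i * 7919) % 13 - 6) / 6 for i in range(200)]
print(round(permutation_entropy(series, order=3), 3))
```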

  4. Data Mining Algorithms for Classification of Complex Biomedical Data

    ERIC Educational Resources Information Center

    Lan, Liang

    2012-01-01

    In my dissertation, I will present my research which contributes to solve the following three open problems from biomedical informatics: (1) Multi-task approaches for microarray classification; (2) Multi-label classification of gene and protein prediction from multi-source biological data; (3) Spatial scan for movement data. In microarray…

  5. Data Mining Algorithms for Classification of Complex Biomedical Data

    ERIC Educational Resources Information Center

    Lan, Liang

    2012-01-01

    In my dissertation, I will present my research which contributes to solve the following three open problems from biomedical informatics: (1) Multi-task approaches for microarray classification; (2) Multi-label classification of gene and protein prediction from multi-source biological data; (3) Spatial scan for movement data. In microarray…

  6. Professional Identification for Biomedical Engineers

    ERIC Educational Resources Information Center

    Long, Francis M.

    1973-01-01

    Discusses four methods of professional identification in biomedical engineering including registration, certification, accreditation, and possible membership qualification of the societies. Indicates that the destiny of the biomedical engineer may be under the control of a new profession, neither the medical nor the engineering. (CC)

  7. Biomedical Knowledge and Clinical Expertise.

    ERIC Educational Resources Information Center

    Boshuizen, Henny P. A.; Schmidt, Henk G.

    A study examined the application and availability of clinical and biomedical knowledge in the clinical reasoning of physicians as well as possible mechanisms responsible for changes in the organization of clinical and biomedical knowledge in the development from novice to expert. Subjects were 28 students (10 second year, 8 fourth year, and 10…

  8. Space Biomedical Research in JAXA

    NASA Astrophysics Data System (ADS)

    Izumi, Ryutaro; Ogawa, Megumi; Kawashima, Shino; Inoue, Natsuhiko; Ohshima, Hiroshi; Tanaka, Kazunari; Mukai, Chiaki; Tachibana, Shoichi

    This paper introduces the activity of the newly launched JAXA Space Biomedical Research Office, including ongoing space clinical medicine research. It also explains the new office's goals, policy, criteria for prioritizing research themes, and process for conducting research, as well as some topics of space biomedical research.

  9. Professional Identification for Biomedical Engineers

    ERIC Educational Resources Information Center

    Long, Francis M.

    1973-01-01

    Discusses four methods of professional identification in biomedical engineering including registration, certification, accreditation, and possible membership qualification of the societies. Indicates that the destiny of the biomedical engineer may be under the control of a new profession, neither the medical nor the engineering. (CC)

  10. Pooling annotated corpora for clinical concept extraction

    PubMed Central

    2013-01-01

    Background The availability of annotated corpora has facilitated the application of machine learning algorithms to concept extraction from clinical notes. However, high expenditure and labor are required for creating the annotations. A potential alternative is to reuse existing corpora from other institutions by pooling with local corpora for training machine taggers. In this paper we have investigated the latter approach by pooling corpora from the 2010 i2b2/VA NLP challenge and Mayo Clinic Rochester, to evaluate taggers for recognition of medical problems. The corpora were annotated for medical problems, but with different guidelines. The taggers were constructed using an existing tagging system MedTagger that consisted of dictionary lookup, part of speech (POS) tagging and machine learning for named entity prediction and concept extraction. We hope that our current work will be a useful case study for facilitating reuse of annotated corpora across institutions. Results We found that pooling was effective when the size of the local corpus was small and after some of the guideline differences were reconciled. The benefits of pooling, however, diminished as more locally annotated documents were included in the training data. We examined the annotation guidelines to identify factors that determine the effect of pooling. Conclusions The effectiveness of pooling corpora is dependent on several factors, which include compatibility of annotation guidelines, distribution of report types and size of local and foreign corpora. Simple methods to rectify some of the guideline differences can facilitate pooling. Our findings need to be confirmed with further studies on different corpora. To facilitate the pooling and reuse of annotated corpora, we suggest that – i) the NLP community should develop a standard annotation guideline that addresses the potential areas of guideline differences that are partly identified in this paper; ii) corpora should be annotated with a two

  11. The Use of Slides in Biomedical Speeches.

    ERIC Educational Resources Information Center

    Dubois, Betty Lou

    1980-01-01

    Describes study of biomedical papers read at 63rd Annual Meeting of the Federation of American Societies for Experimental Biology. Concludes slides play broader role in biomedical speeches than nonlinguistic visual devices do in biomedical journal articles. (Author/BK)

  12. Accessing biomedical literature in the current information landscape.

    PubMed

    Khare, Ritu; Leaman, Robert; Lu, Zhiyong

    2014-01-01

    Biomedical and life sciences literature is unique because of its exponentially increasing volume and interdisciplinary nature. Biomedical literature access is essential for several types of users including biomedical researchers, clinicians, database curators, and bibliometricians. In the past few decades, several online search tools and literature archives, generic as well as biomedicine specific, have been developed. We present this chapter in the light of three consecutive steps of literature access: searching for citations, retrieving full text, and viewing the article. The first section presents the current state of practice of biomedical literature access, including an analysis of the search tools most frequently used by the users, including PubMed, Google Scholar, Web of Science, Scopus, and Embase, and a study on biomedical literature archives such as PubMed Central. The next section describes current research and the state-of-the-art systems motivated by the challenges a user faces during query formulation and interpretation of search results. The research solutions are classified into five key areas related to text and data mining, text similarity search, semantic search, query support, relevance ranking, and clustering results. Finally, the last section describes some predicted future trends for improving biomedical literature access, such as searching and reading articles on portable devices, and adoption of the open access policy.

  13. Accessing Biomedical Literature in the Current Information Landscape

    PubMed Central

    Khare, Ritu; Leaman, Robert; Lu, Zhiyong

    2015-01-01

    Biomedical and life sciences literature is unique because of its exponentially increasing volume and interdisciplinary nature. Biomedical literature access is essential for several types of users including biomedical researchers, clinicians, database curators, and bibliometricians. In the past few decades, several online search tools and literature archives, generic as well as biomedicine-specific, have been developed. We present this chapter in the light of three consecutive steps of literature access: searching for citations, retrieving full-text, and viewing the article. The first section presents the current state of practice of biomedical literature access, including an analysis of the search tools most frequently used by the users, including PubMed, Google Scholar, Web of Science, Scopus, and Embase, and a study on biomedical literature archives such as PubMed Central. The next section describes current research and the state-of-the-art systems motivated by the challenges a user faces during query formulation and interpretation of search results. The research solutions are classified into five key areas related to text and data mining, text similarity search, semantic search, query support, relevance ranking, and clustering results. Finally, the last section describes some predicted future trends for improving biomedical literature access, such as searching and reading articles on portable devices, and adoption of the open access policy. PMID:24788259

  14. Biomedical informatics and translational medicine

    PubMed Central

    2010-01-01

    Biomedical informatics involves a core set of methodologies that can provide a foundation for crossing the "translational barriers" associated with translational medicine. To this end, the fundamental aspects of biomedical informatics (e.g., bioinformatics, imaging informatics, clinical informatics, and public health informatics) may be essential in helping improve the ability to bring basic research findings to the bedside, evaluate the efficacy of interventions across communities, and enable the assessment of the eventual impact of translational medicine innovations on health policies. Here, a brief description is provided for a selection of key biomedical informatics topics (Decision Support, Natural Language Processing, Standards, Information Retrieval, and Electronic Health Records) and their relevance to translational medicine. Based on contributions and advancements in each of these topic areas, the article proposes that biomedical informatics practitioners ("biomedical informaticians") can be essential members of translational medicine teams. PMID:20187952

  15. Biomedical informatics and translational medicine.

    PubMed

    Sarkar, Indra Neil

    2010-02-26

    Biomedical informatics involves a core set of methodologies that can provide a foundation for crossing the "translational barriers" associated with translational medicine. To this end, the fundamental aspects of biomedical informatics (e.g., bioinformatics, imaging informatics, clinical informatics, and public health informatics) may be essential in helping improve the ability to bring basic research findings to the bedside, evaluate the efficacy of interventions across communities, and enable the assessment of the eventual impact of translational medicine innovations on health policies. Here, a brief description is provided for a selection of key biomedical informatics topics (Decision Support, Natural Language Processing, Standards, Information Retrieval, and Electronic Health Records) and their relevance to translational medicine. Based on contributions and advancements in each of these topic areas, the article proposes that biomedical informatics practitioners ("biomedical informaticians") can be essential members of translational medicine teams.

  16. Proteogenomics: the needs and roles to be filled by proteomics in genome annotation

    SciTech Connect

    Ansong, Charles; Purvine, Samuel O.; Adkins, Joshua N.; Lipton, Mary S.; Smith, Richard D.

    2008-01-01

    While genome sequencing efforts reveal the basic building blocks of life, a genome sequence alone is insufficient for elucidating biological function. Genome annotation – the process of identifying genes and assigning function to each gene in a genome sequence – provides the means to elucidate biological function from sequence. Current state-of-the-art high throughput genome annotation uses a combination of comparative (sequence similarity data) and non-comparative (ab initio gene prediction algorithms) methods to identify open reading frames in genome sequences. Because approaches used to validate the presence of these open reading frames are typically based on the information derived from the annotated genomes, they cannot independently and unequivocally determine whether a predicted open reading frame is translated into a protein. With the ability to directly measure peptides arising from expressed proteins, high throughput liquid chromatography-tandem mass spectrometry-based proteomics approaches can be used to verify coding regions of a genomic sequence. Here, we highlight several ways in which high throughput tandem mass spectrometry-based proteomics can improve the quality of genome annotations and suggest that it could be efficiently applied during the initial gene calling process so that the improvements are propagated through the subsequent functional annotation process.
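
    A bare-bones sketch of the verification idea described above, assuming ORF translations and observed peptide sequences are already in hand: an ORF counts as supported if a peptide is an exact substring of its protein sequence (a real pipeline would also track genomic coordinates, missed cleavages and I/L ambiguity). All sequences below are invented placeholders.

```python
# Check which predicted ORFs are supported by observed peptides.

predicted_orfs = {
    "orf_0001": "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ",
    "orf_0002": "MSDKIIHLTDDSFDTDVLKADGAILVDFWAEWCGPCK",
}
observed_peptides = ["QISFVKSHFSR", "AILVDFWAEWCGPCK", "NOTPRESENTPEPTIDE"]

def supported_orfs(orfs, peptides):
    """Return, for each ORF, the peptides that map onto its protein sequence."""
    return {name: [p for p in peptides if p in seq] for name, seq in orfs.items()}

for name, hits in supported_orfs(predicted_orfs, observed_peptides).items():
    status = "supported" if hits else "no peptide evidence"
    print(f"{name}: {status} ({len(hits)} peptide(s))")
```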

  17. Customization of biomedical terminologies.

    PubMed

    Homo, Julien; Dupuch, Laëtitia; Benbrahim, Allel; Grabar, Natalia; Dupuch, Marie

    2012-01-01

    Within the biomedical area over one hundred terminologies exist and are merged in the Unified Medical Language System Metathesaurus, which gives over 1 million concepts. When such huge terminological resources are available, the users must deal with them, and in particular with irrelevant parts of these terminologies. We propose to exploit seed terms and semantic distance algorithms in order to customize the terminologies and to limit within them a semantically homogeneous space. An evaluation performed by a medical expert indicates that the proposed approach is relevant for the customization of terminologies and that the extracted terms are mostly relevant to the seeds. It also indicates that different algorithms provide similar or identical results within a given terminology. The difference is due to the terminologies exploited. Special attention must be paid to the definition of an optimal association between the semantic similarity algorithms and the thresholds specific to a given terminology.
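
    A simplified sketch of the customization idea: keep only the terminology terms that lie within a distance threshold of the seed terms. The paper compares several semantic similarity algorithms; plain breadth-first path length over a toy is-a graph is used here purely for illustration, and all term names are hypothetical.

```python
# Limit a terminology to the neighborhood of seed terms via BFS distance.

from collections import deque

def within_distance(graph, seeds, max_dist):
    """Breadth-first search from the seed terms; return terms within max_dist."""
    dist = {s: 0 for s in seeds}
    queue = deque(seeds)
    while queue:
        term = queue.popleft()
        if dist[term] == max_dist:
            continue
        for neighbor in graph.get(term, []):
            if neighbor not in dist:
                dist[neighbor] = dist[term] + 1
                queue.append(neighbor)
    return set(dist)

# Undirected adjacency over is-a links (toy terminology).
graph = {
    "heart disease": ["cardiovascular disease", "myocardial infarction"],
    "myocardial infarction": ["heart disease"],
    "cardiovascular disease": ["heart disease", "disease"],
    "disease": ["cardiovascular disease", "infectious disease"],
    "infectious disease": ["disease"],
}
print(within_distance(graph, seeds=["myocardial infarction"], max_dist=2))
```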

  18. Biomedical studies by PIXE

    NASA Astrophysics Data System (ADS)

    Afarideh, H.; Amirabadi, A.; Hadji-Saeid, S. M.; Mansourian, N.; Kaviani, K.; Zibafar, E.

    1996-04-01

    In the present biomedical research, PIXE, a powerful technique for elemental analysis, was employed to illustrate the importance of multi-elemental determination of serum trace elements in two cases of great medical interest. These are the evaluation of the desferroxamine drug (DPO), a widely used therapy for patients with β-thalassemia-Major (β-thal-M), and the investigation of elemental variations in blood serum in hyperbilirubinemia newborns before and after blood transfusion (BT). The purpose of the work is to demonstrate the various aspects of PIXE analysis with some practical examples, as well as to draw some general conclusions regarding the cure of patients with the above mentioned disorders or diseases. To present each case in detail, we divide the paper into two parts, part 1 and part 2, which consider the experimental procedure and the results individually.

  19. Biomedical applications of collagens.

    PubMed

    Ramshaw, John A M

    2016-05-01

    Collagen-based biomedical materials have developed into important, clinically effective materials used in a range of devices that have gained wide acceptance. These devices come with collagen in various formats, including those based on stabilized natural tissues, those that are based on extracted and purified collagens, and designed composite, biosynthetic materials. Further knowledge on the structure and function of collagens has led to on-going developments and improvements. Among these developments has been the production of recombinant collagen materials that are well defined and are disease free. Most recently, a group of bacterial, non-animal collagens has emerged that may provide an excellent, novel source of collagen for use in biomaterials and other applications. These newer collagens are discussed in detail. They can be modified to direct their function, and they can be fabricated into various formats, including films and sponges, while solutions can also be adapted for use in surface coating technologies.

  20. Skylab biomedical hardware development

    NASA Technical Reports Server (NTRS)

    Huffstetler, W. J., Jr.; Lem, J. D.

    1974-01-01

    The development of hardware to support biomedical experimentation and operations in the Skylab vehicle presented unique technical problems. Designs were required to enable the accurate measurement of many varied physiological parameters and to compensate for zero g such that uninhibited equipment operation would be possible. Because of problems that occurred during the orbital workshop launch, special tests were run and new equipment was designed and built for use by the first Skylab crew. Design concepts used in the development of hardware to support cardiovascular, pulmonary, vestibular, body, and specimen mass measuring experiments are discussed. Additionally, major problem areas and the corresponding design solutions, as well as knowledge gained that will be pertinent for future life sciences hardware development, are presented.

  1. Community-based Ontology Development, Annotation and Discussion with MediaWiki extension Ontokiwi and Ontokiwi-based Ontobedia.

    PubMed

    Ong, Edison; He, Yongqun

    2016-01-01

    Hundreds of biological and biomedical ontologies have been developed to support data standardization, integration and analysis. Although ontologies are typically developed for community usage, community efforts in ontology development are limited. To support ontology visualization, distribution, and community-based annotation and development, we have developed Ontokiwi, an ontology extension to the MediaWiki software. Ontokiwi displays hierarchical classes and ontological axioms. Ontology classes and axioms can be edited and added using the Ontokiwi form or the MediaWiki source editor. Ontokiwi also inherits MediaWiki features such as Wikitext editing and version control. Based on the Ontokiwi/MediaWiki software package, we have developed Ontobedia, which aims to support community-based development and annotation of biological and biomedical ontologies. As demonstrations, we have loaded the Ontology of Adverse Events (OAE) and the Cell Line Ontology (CLO) into Ontobedia. Our studies showed that Ontobedia was able to achieve the expected Ontokiwi features.

  2. Community-based Ontology Development, Annotation and Discussion with MediaWiki extension Ontokiwi and Ontokiwi-based Ontobedia

    PubMed Central

    Ong, Edison; He, Yongqun

    2016-01-01

    Hundreds of biological and biomedical ontologies have been developed to support data standardization, integration and analysis. Although ontologies are typically developed for community usage, community efforts in ontology development are limited. To support ontology visualization, distribution, and community-based annotation and development, we have developed Ontokiwi, an ontology extension to the MediaWiki software. Ontokiwi displays hierarchical classes and ontological axioms. Ontology classes and axioms can be edited and added using the Ontokiwi form or the MediaWiki source editor. Ontokiwi also inherits MediaWiki features such as Wikitext editing and version control. Based on the Ontokiwi/MediaWiki software package, we have developed Ontobedia, which aims to support community-based development and annotation of biological and biomedical ontologies. As demonstrations, we have loaded the Ontology of Adverse Events (OAE) and the Cell Line Ontology (CLO) into Ontobedia. Our studies showed that Ontobedia was able to achieve the expected Ontokiwi features. PMID:27570653

  3. Crowdsourcing image annotation for nucleus detection and segmentation in computational pathology: evaluating experts, automated methods, and the crowd.

    PubMed

    Irshad, H; Montaser-Kouhsari, L; Waltz, G; Bucur, O; Nowak, J A; Dong, F; Knoblauch, N W; Beck, A H

    2015-01-01

    The development of tools in computational pathology to assist physicians and biomedical scientists in the diagnosis of disease requires access to high-quality annotated images for algorithm learning and evaluation. Generating high-quality expert-derived annotations is time-consuming and expensive. We explore the use of crowdsourcing for rapidly obtaining annotations for two core tasks in computational pathology: nucleus detection and nucleus segmentation. We designed and implemented crowdsourcing experiments using the CrowdFlower platform, which provides access to a large set of labor channel partners that accesses and manages millions of contributors worldwide. We obtained annotations from four types of annotators and compared concordance across these groups. We obtained: crowdsourced annotations for nucleus detection and segmentation on a total of 810 images; annotations using automated methods on 810 images; annotations from research fellows for detection and segmentation on 477 and 455 images, respectively; and expert pathologist-derived annotations for detection and segmentation on 80 and 63 images, respectively. For the crowdsourced annotations, we evaluated performance across a range of contributor skill levels (1, 2, or 3). The crowdsourced annotations (4,860 images in total) were completed in only a fraction of the time and cost required for obtaining annotations using traditional methods. For the nucleus detection task, the research fellow-derived annotations showed the strongest concordance with the expert pathologist-derived annotations (F-M = 93.68%), followed by the crowd-sourced contributor levels 1, 2, and 3 and the automated method, which showed relatively similar performance (F-M = 87.84%, 88.49%, 87.26%, and 86.99%, respectively). For the nucleus segmentation task, the crowdsourced contributor level 3-derived annotations, research fellow-derived annotations, and automated method showed the strongest concordance with the expert pathologist

  4. CROWDSOURCING IMAGE ANNOTATION FOR NUCLEUS DETECTION AND SEGMENTATION IN COMPUTATIONAL PATHOLOGY: EVALUATING EXPERTS, AUTOMATED METHODS, AND THE CROWD

    PubMed Central

    Irshad, H.; Montaser-Kouhsari, L.; Waltz, G.; Bucur, O.; Nowak, J.A.; Dong, F.; Knoblauch, N.W.; Beck, A. H.

    2014-01-01

    The development of tools in computational pathology to assist physicians and biomedical scientists in the diagnosis of disease requires access to high-quality annotated images for algorithm learning and evaluation. Generating high-quality expert-derived annotations is time-consuming and expensive. We explore the use of crowdsourcing for rapidly obtaining annotations for two core tasks in computational pathology: nucleus detection and nucleus segmentation. We designed and implemented crowdsourcing experiments using the CrowdFlower platform, which provides access to a large set of labor channel partners that accesses and manages millions of contributors worldwide. We obtained annotations from four types of annotators and compared concordance across these groups. We obtained: crowdsourced annotations for nucleus detection and segmentation on a total of 810 images; annotations using automated methods on 810 images; annotations from research fellows for detection and segmentation on 477 and 455 images, respectively; and expert pathologist-derived annotations for detection and segmentation on 80 and 63 images, respectively. For the crowdsourced annotations, we evaluated performance across a range of contributor skill levels (1, 2, or 3). The crowdsourced annotations (4,860 images in total) were completed in only a fraction of the time and cost required for obtaining annotations using traditional methods. For the nucleus detection task, the research fellow-derived annotations showed the strongest concordance with the expert pathologist-derived annotations (F-M = 93.68%), followed by the crowd-sourced contributor levels 1, 2, and 3 and the automated method, which showed relatively similar performance (F-M = 87.84%, 88.49%, 87.26%, and 86.99%, respectively). For the nucleus segmentation task, the crowdsourced contributor level 3-derived annotations, research fellow-derived annotations, and automated method showed the strongest concordance with the expert pathologist
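
    The F-M values quoted in the two records above are F-measures, i.e. the harmonic mean of precision and recall over matched nucleus detections. The sketch below shows only that final computation; the counts are invented (chosen so the result reproduces the quoted 93.68%), and the real work of matching candidate detections to reference nuclei is assumed to have happened upstream.

    ```python
    # Minimal sketch of the F-measure used to compare each annotator group's
    # nucleus detections against the expert reference. The counts are invented
    # for illustration; in practice they come from a detection-matching step
    # (e.g., pairing detections and reference nuclei by centroid distance).
    def f_measure(true_pos: int, false_pos: int, false_neg: int) -> float:
        precision = true_pos / (true_pos + false_pos)
        recall = true_pos / (true_pos + false_neg)
        return 2 * precision * recall / (precision + recall)

    # Hypothetical counts yielding an F-measure close to the quoted 93.68%.
    print(f"F-M = {f_measure(true_pos=890, false_pos=60, false_neg=60):.2%}")
    ```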

  5. Harnessing Collaborative Annotations on Online Formative Assessments

    ERIC Educational Resources Information Center

    Lin, Jian-Wei; Lai, Yuan-Cheng

    2013-01-01

    This paper harnesses collaborative annotations by students as learning feedback on online formative assessments to improve the learning achievements of students. Through the developed Web platform, students can conduct formative assessments, collaboratively annotate, and review historical records in a convenient way, while teachers can generate…

  6. Annotated Catalog of Bilingual Vocational Training Materials.

    ERIC Educational Resources Information Center

    Miranda (L.) and Associates, Bethesda, MD.

    This catalog contains annotations for 170 bilingual vocational training materials. Most of the materials are written in English, but materials written in 13 source languages and directed toward speakers of 17 target languages are provided. Annotations are provided for the following different types of documents: administrative, assessment and…

  7. Interactive Electronic Technical Manuals (IETMs) Annotated Bibliography

    DTIC Science & Technology

    2002-10-22

    This annotated bibliography addresses Interactive Electronic Technical Manuals (IETMs). It focuses especially on natural language dialog and speech recognition for use in tutoring, training...

  8. Elementary Social Studies. Authorized Resources Annotated List.

    ERIC Educational Resources Information Center

    Alberta Dept. of Education, Edmonton. Curriculum Standards Branch.

    This comprehensive, annotated resource list is designed to assist in selecting resources authorized by the Alberta (Canada) Education Department for the elementary social studies classroom. Within each grade and topic, annotated entries for basic learning resources are listed, followed by support learning resources and authorized teaching…

  9. Elementary Health: Authorized Resources Annotated List.

    ERIC Educational Resources Information Center

    Alberta Dept. of Education, Edmonton. Curriculum Standards Branch.

    This comprehensive, annotated resource list is designed to assist in selecting resources authorized by the Alberta (Canada) Education Department for the elementary health classroom (Grades 1-6). Within each grade and topic, annotated entries for basic learning resources are listed, followed by support learning resources and authorized teaching…

  10. Annotation as an Index to Critical Writing

    ERIC Educational Resources Information Center

    Liu, Keming

    2006-01-01

    The differences in the ability to write critical and analytical essays among students with individual annotation styles were investigated. Critical and analytical writing was determined by the writer's ability to respond to a text with logical and critical analysis and attention to its thematic argument. Annotation styles were determined by ways…

  11. Language Intensity: A Comprehensive, Annotated Bibliography.

    ERIC Educational Resources Information Center

    Preiss, Raymond W.

    Noting that message variables offer communication scholars a conceptually rich body of information, this 30-item annotated bibliography reflects the diversity of research conducted in the area of language intensity. The journal articles, conference papers, and chapters of books in the annotated bibliography are divided into sections on general…

  12. Black English Annotations for Elementary Reading Programs.

    ERIC Educational Resources Information Center

    Prasad, Sandre

    This report describes a program that uses annotations in the teacher's editions of existing reading programs to indicate the characteristics of black English that may interfere with the reading process of black children. The first part of the report provides a rationale for the annotation approach, explaining that the discrepancy between written…

  14. Assisted annotation of medical free text using RapTAT.

    PubMed

    Gobbel, Glenn T; Garvin, Jennifer; Reeves, Ruth; Cronin, Robert M; Heavirland, Julia; Williams, Jenifer; Weaver, Allison; Jayaramaraja, Shrimalini; Giuse, Dario; Speroff, Theodore; Brown, Steven H; Xu, Hua; Matheny, Michael E

    2014-01-01

    To determine whether assisted annotation using interactive training can reduce the time required to annotate a clinical document corpus without introducing bias. A tool, RapTAT, was designed to assist annotation by iteratively pre-annotating probable phrases of interest within a document, presenting the annotations to a reviewer for correction, and then using the corrected annotations for further machine learning-based training before pre-annotating subsequent documents. Annotators reviewed 404 clinical notes either manually or using RapTAT assistance for concepts related to quality of care during heart failure treatment. Notes were divided into 20 batches of 19-21 documents for iterative annotation and training. The number of correct RapTAT pre-annotations increased significantly and annotation time per batch decreased by ~50% over the course of annotation. Annotation rate increased from batch to batch for assisted but not manual reviewers. Pre-annotation F-measure increased from 0.5 to 0.6 to >0.80 (relative to both assisted reviewer and reference annotations) over the first three batches and more slowly thereafter. Overall inter-annotator agreement was significantly higher between RapTAT-assisted reviewers (0.89) than between manual reviewers (0.85). The tool reduced workload by decreasing the number of annotations needing to be added and helping reviewers to annotate at an increased rate. Agreement between the pre-annotations and reference standard, and agreement between the pre-annotations and assisted annotations, were similar throughout the annotation process, which suggests that pre-annotation did not introduce bias. Pre-annotations generated by a tool capable of interactive training can reduce the time required to create an annotated document corpus by up to 50%.

  15. Assisted annotation of medical free text using RapTAT

    PubMed Central

    Gobbel, Glenn T; Garvin, Jennifer; Reeves, Ruth; Cronin, Robert M; Heavirland, Julia; Williams, Jenifer; Weaver, Allison; Jayaramaraja, Shrimalini; Giuse, Dario; Speroff, Theodore; Brown, Steven H; Xu, Hua; Matheny, Michael E

    2014-01-01

    Objective To determine whether assisted annotation using interactive training can reduce the time required to annotate a clinical document corpus without introducing bias. Materials and methods A tool, RapTAT, was designed to assist annotation by iteratively pre-annotating probable phrases of interest within a document, presenting the annotations to a reviewer for correction, and then using the corrected annotations for further machine learning-based training before pre-annotating subsequent documents. Annotators reviewed 404 clinical notes either manually or using RapTAT assistance for concepts related to quality of care during heart failure treatment. Notes were divided into 20 batches of 19–21 documents for iterative annotation and training. Results The number of correct RapTAT pre-annotations increased significantly and annotation time per batch decreased by ∼50% over the course of annotation. Annotation rate increased from batch to batch for assisted but not manual reviewers. Pre-annotation F-measure increased from 0.5 to 0.6 to >0.80 (relative to both assisted reviewer and reference annotations) over the first three batches and more slowly thereafter. Overall inter-annotator agreement was significantly higher between RapTAT-assisted reviewers (0.89) than between manual reviewers (0.85). Discussion The tool reduced workload by decreasing the number of annotations needing to be added and helping reviewers to annotate at an increased rate. Agreement between the pre-annotations and reference standard, and agreement between the pre-annotations and assisted annotations, were similar throughout the annotation process, which suggests that pre-annotation did not introduce bias. Conclusions Pre-annotations generated by a tool capable of interactive training can reduce the time required to create an annotated document corpus by up to 50%. PMID:24431336
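
    Both RapTAT records describe the same interactive loop: pre-annotate a batch of notes, have a reviewer correct the proposals, then retrain on the corrections before pre-annotating subsequent documents. The outline below captures only that control flow; the SimpleTagger phrase-memorising model and the review callback are placeholders invented for this sketch and are not the RapTAT implementation.

    ```python
    # Outline of an assisted-annotation loop in the spirit of RapTAT:
    # pre-annotate, let a human correct, retrain, and move on.
    # SimpleTagger and the review callback are stand-ins for this sketch only.
    from typing import Callable, Dict, List, Tuple

    Annotation = Tuple[int, int, str]          # (start offset, end offset, concept)

    class SimpleTagger:
        """Placeholder incremental model that memorises corrected phrases."""

        def __init__(self) -> None:
            self.lexicon: Dict[str, str] = {}

        def pre_annotate(self, note: str) -> List[Annotation]:
            hits = []
            lowered = note.lower()
            for phrase, concept in self.lexicon.items():
                start = lowered.find(phrase)
                if start >= 0:
                    hits.append((start, start + len(phrase), concept))
            return hits

        def train(self, note: str, gold: List[Annotation]) -> None:
            for start, end, concept in gold:
                self.lexicon[note[start:end].lower()] = concept

    def annotate_corpus(batches: List[List[str]],
                        review: Callable[[str, List[Annotation]], List[Annotation]]):
        """Iterate over batches: pre-annotate, correct, retrain, repeat."""
        tagger = SimpleTagger()
        corpus = []
        for batch in batches:                  # e.g., 20 batches of ~20 notes
            for note in batch:
                proposed = tagger.pre_annotate(note)
                corrected = review(note, proposed)   # human-in-the-loop correction
                tagger.train(note, corrected)        # learn before later notes
                corpus.append((note, corrected))
        return corpus
    ```

    A real assisted-annotation study would additionally track per-batch annotation time and inter-annotator agreement, which is how the ~50% time saving and the 0.89 agreement reported above were measured.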

  16. Extracting semantically enriched events from biomedical literature

    PubMed Central

    2012-01-01

    Background Research into event-based text mining from the biomedical literature has been growing in popularity to facilitate the development of advanced biomedical text mining systems. Such technology permits advanced search, which goes beyond document or sentence-based retrieval. However, existing event-based systems typically ignore additional information within the textual context of events that can determine, amongst other things, whether an event represents a fact, hypothesis, experimental result or analysis of results, whether it describes new or previously reported knowledge, and whether it is speculated or negated. We refer to such contextual information as meta-knowledge. The automatic recognition of such information can permit the training of systems allowing finer-grained searching of events according to the meta-knowledge that is associated with them. Results Based on a corpus of 1,000 MEDLINE abstracts, fully manually annotated with both events and associated meta-knowledge, we have constructed a machine learning-based system that automatically assigns meta-knowledge information to events. This system has been integrated into EventMine, a state-of-the-art event extraction system, in order to create a more advanced system (EventMine-MK) that not only extracts events from text automatically, but also assigns five different types of meta-knowledge to these events. The meta-knowledge assignment module of EventMine-MK performs with macro-averaged F-scores in the range of 57-87% on the BioNLP’09 Shared Task corpus. EventMine-MK has been evaluated on the BioNLP’09 Shared Task subtask of detecting negated and speculated events. Our results show that EventMine-MK can outperform other state-of-the-art systems that participated in this task. Conclusions We have constructed the first practical system that extracts both events and associated, detailed meta-knowledge information from biomedical literature. The automatically assigned meta-knowledge information
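
    Assigning a meta-knowledge dimension such as certainty is, at its simplest, a text-classification problem over the clause surrounding each extracted event. The toy sketch below illustrates that framing with scikit-learn; the training clauses, labels, and bag-of-words features are invented and far simpler than the event-centred features EventMine-MK uses.

    ```python
    # Toy sketch: treating one meta-knowledge dimension (speculated vs. asserted)
    # as clause-level text classification. The examples and features are invented;
    # EventMine-MK assigns five meta-knowledge types with richer features.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    train_clauses = [
        "These results suggest that the kinase may activate the receptor.",
        "The ligand might be required for receptor phosphorylation.",
        "The kinase binds the receptor and inhibits downstream signalling.",
        "We show that the ligand induces receptor expression.",
    ]
    train_labels = ["speculated", "speculated", "asserted", "asserted"]

    clf = make_pipeline(CountVectorizer(ngram_range=(1, 2)), LogisticRegression())
    clf.fit(train_clauses, train_labels)

    # Expected: ['speculated'] -- "may" is the speculation cue in this toy data.
    print(clf.predict(["The kinase may regulate transcription."]))
    ```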

  17. Towards Viral Genome Annotation Standards, Report from the 2010 NCBI Annotation Workshop.

    PubMed

    Brister, James Rodney; Bao, Yiming; Kuiken, Carla; Lefkowitz, Elliot J; Le Mercier, Philippe; Leplae, Raphael; Madupu, Ramana; Scheuermann, Richard H; Schobel, Seth; Seto, Donald; Shrivastava, Susmita; Sterk, Peter; Zeng, Qiandong; Klimke, William; Tatusova, Tatiana

    2010-10-01

    Improvements in DNA sequencing technologies portend a new era in virology and could possibly lead to a giant leap in our understanding of viral evolution and ecology. Yet, as viral genome sequences begin to fill the world's biological databases, it is critically important to recognize that the scientific promise of this era is dependent on consistent and comprehensive genome annotation. With this in mind, the NCBI Genome Annotation Workshop recently hosted a study group tasked with developing sequence, function, and metadata annotation standards for viral genomes. This report describes the issues involved in viral genome annotation and reviews policy recommendations presented at the NCBI Annotation Workshop.

  18. Towards Viral Genome Annotation Standards, Report from the 2010 NCBI Annotation Workshop

    PubMed Central

    Brister, James Rodney; Bao, Yiming; Kuiken, Carla; Lefkowitz, Elliot J.; Le Mercier, Philippe; Leplae, Raphael; Madupu, Ramana; Scheuermann, Richard H.; Schobel, Seth; Seto, Donald; Shrivastava, Susmita; Sterk, Peter; Zeng, Qiandong; Klimke, William; Tatusova, Tatiana

    2010-01-01

    Improvements in DNA sequencing technologies portend a new era in virology and could possibly lead to a giant leap in our understanding of viral evolution and ecology. Yet, as viral genome sequences begin to fill the world’s biological databases, it is critically important to recognize that the scientific promise of this era is dependent on consistent and comprehensive genome annotation. With this in mind, the NCBI Genome Annotation Workshop recently hosted a study group tasked with developing sequence, function, and metadata annotation standards for viral genomes. This report describes the issues involved in viral genome annotation and reviews policy recommendations presented at the NCBI Annotation Workshop. PMID:21994619

  19. Automatically classifying sentences in full-text biomedical articles into Introduction, Methods, Results and Discussion.

    PubMed

    Agarwal, Shashank; Yu, Hong

    2009-12-01

    Biomedical texts can typically be represented by four rhetorical categories: Introduction, Methods, Results and Discussion (IMRAD). Classifying sentences into these categories can benefit many other text-mining tasks. Although many studies have applied different approaches for automatically classifying sentences in MEDLINE abstracts into the IMRAD categories, few have explored the classification of sentences that appear in full-text biomedical articles. We first evaluated whether sentences in full-text biomedical articles could be reliably annotated into the IMRAD format and then explored different approaches for automatically classifying these sentences into the IMRAD categories. Our results show an overall annotation agreement of 82.14% with a Kappa score of 0.756. The best classification system is a multinomial naïve Bayes classifier trained on manually annotated data that achieved 91.95% accuracy and an average F-score of 91.55%, which is significantly higher than baseline systems. A web version of this system is available online at http://wood.ims.uwm.edu/full_text_classifier/.

  20. Automatically classifying sentences in full-text biomedical articles into Introduction, Methods, Results and Discussion

    PubMed Central

    Agarwal, Shashank; Yu, Hong

    2009-01-01

    Biomedical texts can typically be represented by four rhetorical categories: Introduction, Methods, Results and Discussion (IMRAD). Classifying sentences into these categories can benefit many other text-mining tasks. Although many studies have applied different approaches for automatically classifying sentences in MEDLINE abstracts into the IMRAD categories, few have explored the classification of sentences that appear in full-text biomedical articles. We first evaluated whether sentences in full-text biomedical articles could be reliably annotated into the IMRAD format and then explored different approaches for automatically classifying these sentences into the IMRAD categories. Our results show an overall annotation agreement of 82.14% with a Kappa score of 0.756. The best classification system is a multinomial naïve Bayes classifier trained on manually annotated data that achieved 91.95% accuracy and an average F-score of 91.55%, which is significantly higher than baseline systems. A web version of this system is available online at http://wood.ims.uwm.edu/full_text_classifier/. Contact: hongyu@uwm.edu PMID:19783830
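
    The best-performing system in the two records above is a multinomial naïve Bayes classifier trained on manually labelled sentences. The sketch below shows the general shape of such a classifier with scikit-learn; the eight training sentences and the plain bag-of-words features are toy stand-ins, not the authors' data or feature set.

    ```python
    # Toy sketch of IMRAD sentence classification with a multinomial naive Bayes
    # model, in the spirit of the system described above. The training sentences
    # are invented; the real system was trained on manually annotated full-text
    # articles with a richer feature set.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import make_pipeline

    train_sentences = [
        "Protein phosphorylation plays a key role in signal transduction.",
        "Little is known about the regulation of this pathway.",
        "Cells were cultured in DMEM and lysed after 24 hours.",
        "Sentences were classified with a multinomial naive Bayes model.",
        "The classifier achieved high accuracy on the held-out test set.",
        "Expression of the target gene increased two-fold.",
        "These findings suggest a broader role for the kinase.",
        "Further work is needed to confirm the mechanism.",
    ]
    train_labels = ["INTRODUCTION", "INTRODUCTION", "METHODS", "METHODS",
                    "RESULTS", "RESULTS", "DISCUSSION", "DISCUSSION"]

    clf = make_pipeline(CountVectorizer(), MultinomialNB())
    clf.fit(train_sentences, train_labels)

    # Expected: ['METHODS'] -- "were", "cultured", "lysed" are the cues here.
    print(clf.predict(["Tissue samples were cultured and lysed after fixation."]))
    ```

    A complete evaluation would report accuracy and per-class F-scores on held-out sentences, alongside the annotation agreement (82.14%, Kappa 0.756) that establishes how reliably the IMRAD labels can be assigned in the first place.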